Category theory, Monads, and Duality in the world of (BIG) Data
1. Category theory, Monads, and Duality in the world of (BIG) Data Bart J.F. De Smet bartde@microsoft.com Cloud Programmability Team
2. What’s in a name? Cloud Programmability Team
- Logical role: research-oriented team; collaboration with MSR
- Physical placement: oasis within the product team; close to the SQL Server business
- (Dual) 80/20 rule of success
- Portfolio: Language Integrated Query (LINQ); XML literals in Visual Basic 9; Reactive Extensions (Rx); various undisclosed projects
- Democratizing the cloud
9. Maybe baby!
Billion
Null-propagating dot:

    string s = name?.ToUpper();

Syntactic sugar:

    from _ in name
    from s in _.ToUpper()
    select s

Compiler:

    name.SelectMany(_ => _.ToUpper(), s => s)

Can use extension method
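A minimal sketch of the extension method the slide alludes to, assuming null stands for "no value" and only reference types are involved. MaybeExtensions and ToUpperOrNull are hypothetical names, not part of the talk or of any library. The result selector takes two arguments here because that is the shape the C# compiler asks for when it translates a query with two from clauses; the range variable is called n rather than _.

```csharp
using System;

// Hypothetical helper: treats a reference-type value as "maybe a value",
// with null meaning "nothing".
public static class MaybeExtensions
{
    // The method shape the C# compiler binds to when translating
    // "from x in source from y in f(x) select g(x, y)".
    public static TResult SelectMany<TSource, TCollection, TResult>(
        this TSource source,
        Func<TSource, TCollection> collectionSelector,
        Func<TSource, TCollection, TResult> resultSelector)
        where TSource : class
        where TCollection : class
        where TResult : class
    {
        if (source == null) return null;              // nothing in, nothing out
        TCollection inner = collectionSelector(source);
        if (inner == null) return null;               // the inner step produced nothing
        return resultSelector(source, inner);
    }

    // Behaves like name?.ToUpper(), written as a query expression.
    public static string ToUpperOrNull(string name)
    {
        return from n in name
               from s in n.ToUpper()
               select s;
    }
}
```

Calling MaybeExtensions.ToUpperOrNull(null) returns null instead of throwing, which is the null-propagating behaviour the slide describes.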
14. Pull-based data access

    interface IEnumerable<out T>
    {
        IEnumerator<T> GetEnumerator();
    }

    interface IEnumerator<out T> : IDisposable
    {
        bool MoveNext();
        T Current { get; }
        void Reset();
    }

You could get stuck
15. Duality in the world around us (Or… the Dutch are cheap)
- Electricity: inductor and capacitor
- Logic: De Morgan’s Law

    ¬(A ∨ B) ≡ ¬A ∧ ¬B
    ¬(A ∧ B) ≡ ¬A ∨ ¬B

- Programming?
16. Duality as the secret sauce? Give me a recipe
http://en.wikipedia.org/wiki/Dual_(category_theory)
- Reversing arrows… Input becomes output and vice versa
- Making a U-turn in synchrony
17. Distilling the essence: Properties and unchecked exceptions

    interface IEnumerable<out T>
    {
        IEnumerator<T> GetEnumerator();
    }

    interface IEnumerator<out T> : IDisposable
    {
        bool MoveNext();
        T Current { get; }
    }
18. Distilling the essence: Properties and unchecked exceptions

    interface IEnumerable<out T>
    {
        IEnumerator<T> GetEnumerator();
    }

    interface IEnumerator<out T> : IDisposable
    {
        bool MoveNext() throws Exception;
        T GetCurrent();
    }
19. Distilling the essence: Embracing a (more) functional style

    interface IEnumerable<out T>
    {
        IEnumerator<T> GetEnumerator();
    }

    interface IEnumerator<out T> : IDisposable
    {
        bool MoveNext() throws Exception;
        T GetCurrent();
    }
20. Distilling the essence: Embracing a (more) functional style

    interface IEnumerable<out T>
    {
        IEnumerator<T> GetEnumerator();
    }

    interface IEnumerator<out T> : IDisposable
    {
        (void | T | Exception) MoveNext();
    }

    () -> (() -> (void | T | Exception))
21. Flipping the arrows: Purely mechanical transformation

    () -> (() -> (void | T | Exception))

    ((void | T | Exception) -> ()) -> ()
22. Harvesting the result: So far for abstract nonsense

    interface IBar<out T>
    {
        void Qux(IFoo<T> foo);
    }

    interface IFoo<in T>
    {
        void Wibble();
        void Wobble(T value);
        void Wubble(Exception error);
    }
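With friendlier names, the two interfaces above line up with System.IObservable<T> and System.IObserver<T> from the .NET base class library: Qux corresponds to Subscribe, Wibble to OnCompleted, Wobble to OnNext and Wubble to OnError. One difference from the derived shape is that Subscribe returns an IDisposable rather than void. The sketch below implements both interfaces by hand; RecordingObserver and SingleValue are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Plays the role of IFoo<T>: receives the three kinds of notification.
public sealed class RecordingObserver<T> : IObserver<T>
{
    public readonly List<string> Log = new List<string>();

    public void OnCompleted() { Log.Add("Done"); }                               // Wibble()
    public void OnNext(T value) { Log.Add("Next: " + value); }                   // Wobble(T value)
    public void OnError(Exception error) { Log.Add("Oops: " + error.Message); }  // Wubble(Exception error)
}

// Plays the role of IBar<T>: pushes one value to whoever subscribes, then completes.
public sealed class SingleValue<T> : IObservable<T>
{
    private readonly T _value;
    public SingleValue(T value) { _value = value; }

    public IDisposable Subscribe(IObserver<T> observer)                          // Qux(IFoo<T> foo)
    {
        observer.OnNext(_value);
        observer.OnCompleted();
        return new NoOpDisposable();   // nothing left to unsubscribe from
    }

    private sealed class NoOpDisposable : IDisposable
    {
        public void Dispose() { }
    }
}
```

Subscribing a RecordingObserver<int> to new SingleValue<int>(42) leaves two entries in its Log, a value notification followed by a completion notification.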
27. Observable.Create<T> operator

    IObservable<int> o = Observable.Create<int>(observer =>
    {
        // Assume we introduce concurrency (see later)…
        observer.OnNext(42);
        observer.OnCompleted();
        return () => { /* unsubscribe action */ };
    });

    IDisposable subscription = o.Subscribe(
        onNext: x => { Console.WriteLine("Next: " + x); },
        onError: ex => { Console.WriteLine("Oops: " + ex); },
        onCompleted: () => { Console.WriteLine("Done"); }
    );

- C# doesn’t have anonymous interface implementation, so we provide various extension methods that take lambdas.
- C# 4.0 named parameter syntax
28. Observable.Create<T> operator

    IObservable<int> o = Observable.Create<int>(observer =>
    {
        // Assume we introduce concurrency (see later)…
        observer.OnNext(42);
        observer.OnCompleted();
        return () => { /* unsubscribe action */ };
    });

    IDisposable subscription = o.Subscribe(
        onNext: x => { Console.WriteLine("Next: " + x); },
        onError: ex => { Console.WriteLine("Oops: " + ex); },
        onCompleted: () => { Console.WriteLine("Done"); }
    );

    Thread.Sleep(30000); // Main thread is blocked…

F10
29. Observable.Create<T> operator (same code as slide 28; debugger step: F10)
30. Observable.Create<T> operator (same code as slide 28; debugger step: F5)
31. Observable.Create<T> operator (same code as slide 28; debugger step: Breakpoint got hit)
32. Iterators dualized

Synchronous:

    IEnumerable<int> GetXs()
    {
        for (int i = 0; i < 10; i++)
            yield return i * i;
        yield break;
    }

    foreach (var x in GetXs())
    {
        Console.WriteLine(x);
    }

Asynchronous:

    IObservable<int> GetXs()
    {
        return Observable.Create<int>(o =>
        {
            for (int i = 0; i < 10; i++)
                o.OnNext(i * i);
            o.OnCompleted();
            return () => { };
        });
    }

    GetXs().Subscribe(x => { Console.WriteLine(x); });
33. Compositionality matters

    static IObservable<T> Merge<T>(this IObservable<T> left, IObservable<T> right)
    {
        return Create<T>(observer =>
        {
            // Ignoring a few details for OnCompleted
            var d1 = left.Subscribe(observer);
            var d2 = right.Subscribe(observer);
            return new CompositeDisposable(d1, d2);
        });
    }

Lazy evaluation
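One of the OnCompleted details the slide skips is that the merged sequence should complete only once both sources have completed. The following is a sketch under that assumption, written against the BCL interfaces only so that it needs no Rx reference. MergeSketch, FromValues and Recorder are hypothetical names, and unlike a production operator this version does not dispose the other subscription when one source fails.

```csharp
using System;
using System.Collections.Generic;

public static class MergeSketch
{
    // Merges two sequences; completes only after both sources have completed.
    public static IObservable<T> Merge<T>(this IObservable<T> left, IObservable<T> right)
    {
        return new Merged<T>(left, right);
    }

    // Hypothetical test source: pushes the values synchronously, then completes.
    public static IObservable<T> FromValues<T>(params T[] values)
    {
        return new Values<T>(values);
    }

    private sealed class Merged<T> : IObservable<T>
    {
        private readonly IObservable<T> _left, _right;
        public Merged(IObservable<T> left, IObservable<T> right) { _left = left; _right = right; }

        // Nothing happens until someone subscribes (lazy evaluation).
        public IDisposable Subscribe(IObserver<T> observer)
        {
            var sink = new Sink<T>(observer);
            IDisposable d1 = _left.Subscribe(sink);
            IDisposable d2 = _right.Subscribe(sink);
            return new Both(d1, d2);
        }
    }

    // Shared observer that serializes notifications and counts completions.
    private sealed class Sink<T> : IObserver<T>
    {
        private readonly object _gate = new object();
        private readonly IObserver<T> _target;
        private int _running = 2;
        private bool _stopped;
        public Sink(IObserver<T> target) { _target = target; }

        public void OnNext(T value)
        {
            lock (_gate) { if (!_stopped) _target.OnNext(value); }
        }
        public void OnError(Exception error)
        {
            lock (_gate) { if (_stopped) return; _stopped = true; _target.OnError(error); }
        }
        public void OnCompleted()
        {
            lock (_gate)
            {
                if (_stopped || --_running > 0) return;   // wait for the other source
                _stopped = true;
                _target.OnCompleted();
            }
        }
    }

    // Minimal stand-in for a composite disposable over two handles.
    private sealed class Both : IDisposable
    {
        private readonly IDisposable _d1, _d2;
        public Both(IDisposable d1, IDisposable d2) { _d1 = d1; _d2 = d2; }
        public void Dispose() { _d1.Dispose(); _d2.Dispose(); }
    }

    private sealed class Values<T> : IObservable<T>
    {
        private readonly T[] _values;
        public Values(T[] values) { _values = values; }
        public IDisposable Subscribe(IObserver<T> observer)
        {
            foreach (T v in _values) observer.OnNext(v);
            observer.OnCompleted();
            return new Both(new Nop(), new Nop());
        }
        private sealed class Nop : IDisposable { public void Dispose() { } }
    }
}

// Hypothetical observer that records what it receives.
public sealed class Recorder<T> : IObserver<T>
{
    public readonly List<string> Log = new List<string>();
    public void OnNext(T value) { Log.Add(value.ToString()); }
    public void OnError(Exception error) { Log.Add("error"); }
    public void OnCompleted() { Log.Add("done"); }
}
```

With the naive version from the slide, the first source's OnCompleted would reach the observer before the second source had produced anything; the completion counter in Sink is what prevents that.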
34. Bridging Rx with the World: Why .NET events aren’t first-class…

    form1.MouseMove += (sender, args) =>
    {
        if (args.Location.X == args.Location.Y)
        {
            // I’d like to raise another event
        }
    };

    form1.MouseMove -= /* what goes here? */

- Hidden data source
- How to pass around?
- Lack of composition
- Resource maintenance?
36. Bridging Rx with the World: …but observable sequences are first-class

    IObservable<Point> mouseMoves = Observable.FromEvent(frm, "MouseMove");

    var filtered = mouseMoves
        .Where(pos => pos.X == pos.Y);

    var subscription = filtered.Subscribe(…);
    subscription.Dispose();

- Source of Point values
- Objects can be passed
- Can define operators
- Resource maintenance!
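To make the bridging idea concrete without depending on Rx or Windows Forms, here is a sketch that wraps an event in an IObservable<T> using only BCL types. Pointer, EventBridge and the add/remove-handler form of FromEvent are assumptions made for this illustration; they are not the signatures of the Rx FromEvent overloads used on the slide.

```csharp
using System;

// Hypothetical event source standing in for form1.MouseMove.
public sealed class Pointer
{
    public event Action<(int X, int Y)> Moved;
    public void Raise(int x, int y) { var h = Moved; if (h != null) h((x, y)); }
}

public static class EventBridge
{
    // Turns "how to attach" and "how to detach" into a first-class sequence.
    public static IObservable<T> FromEvent<T>(Action<Action<T>> addHandler, Action<Action<T>> removeHandler)
    {
        return new Anonymous<T>(observer =>
        {
            Action<T> handler = observer.OnNext;
            addHandler(handler);
            return () => removeHandler(handler);   // the handle remembers the delegate for us
        });
    }

    // A generic filter that works for any sequence and yields a sequence again.
    public static IObservable<T> Where<T>(this IObservable<T> source, Func<T, bool> predicate)
    {
        return new Anonymous<T>(observer =>
        {
            IDisposable inner = source.Subscribe(new Relay<T>(
                x => { if (predicate(x)) observer.OnNext(x); },
                observer.OnError,
                observer.OnCompleted));
            return inner.Dispose;                  // unsubscribing cascades to the source
        });
    }

    // Lambda-based Subscribe, mimicking an anonymous interface implementation.
    public static IDisposable Subscribe<T>(this IObservable<T> source, Action<T> onNext)
    {
        return source.Subscribe(new Relay<T>(onNext, ex => { }, () => { }));
    }

    private sealed class Anonymous<T> : IObservable<T>
    {
        private readonly Func<IObserver<T>, Action> _subscribe;
        public Anonymous(Func<IObserver<T>, Action> subscribe) { _subscribe = subscribe; }
        public IDisposable Subscribe(IObserver<T> observer) { return new Handle(_subscribe(observer)); }
    }

    private sealed class Handle : IDisposable
    {
        private Action _dispose;
        public Handle(Action dispose) { _dispose = dispose; }
        public void Dispose() { Action d = _dispose; _dispose = null; if (d != null) d(); }
    }

    private sealed class Relay<T> : IObserver<T>
    {
        private readonly Action<T> _onNext;
        private readonly Action<Exception> _onError;
        private readonly Action _onCompleted;
        public Relay(Action<T> onNext, Action<Exception> onError, Action onCompleted)
        {
            _onNext = onNext; _onError = onError; _onCompleted = onCompleted;
        }
        public void OnNext(T value) { _onNext(value); }
        public void OnError(Exception error) { _onError(error); }
        public void OnCompleted() { _onCompleted(); }
    }
}
```

A caller would write something like EventBridge.FromEvent<(int X, int Y)>(h => pointer.Moved += h, h => pointer.Moved -= h), apply Where(p => p.X == p.Y), subscribe, and later call Dispose on the returned handle. Disposing the handle of the filtered sequence also detaches the handler from the underlying event, so the caller never has to remember which delegate was passed to +=.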
37. Composition and Querying: It’s the continuation monad!

    // IObservable<string> from TextChanged events
    var changed = Observable.FromEvent(txt, "TextChanged");
    var input = (from e in changed
                 let text = ((TextBox)e.Sender).Text
                 where text.Length >= 3
                 select text)
                .DistinctUntilChanged()
                .Throttle(TimeSpan.FromSeconds(1));

    // Bridge with the dictionary web service
    var svc = new DictServiceSoapClient();
    var lookup = Observable.FromAsyncPattern<string, DictionaryWord[]>
                     (svc.BeginLookup, svc.EndLookup);

    // Compose both sources using SelectMany
    var res = from term in input
              from words in lookup(term)
              select words;

    input.SelectMany(term => lookup(term))
38. Introducing schedulers
- How to be asynchronous?
- Different ways to introduce concurrency
- Parameterization by schedulers

    interface IScheduler
    {
        DateTimeOffset Now { get; }
        IDisposable Schedule<T>(T state, Func<IScheduler, T, IDisposable> f);
        // Overloads for time-based scheduling
    }
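As a rough illustration of parameterization by schedulers, the sketch below declares a local, simplified copy of the interface on the slide (time-based overloads left out) and two implementations of it. ISimpleScheduler, ImmediateSketchScheduler, ThreadPoolSketchScheduler and SchedulerDemo are hypothetical names for this sketch and are not the Rx types. The producer is written once; whether it runs inline or in the background depends only on the scheduler passed in.

```csharp
using System;
using System.Threading;

// Local, simplified copy of the slide's interface (an assumption of this sketch).
public interface ISimpleScheduler
{
    DateTimeOffset Now { get; }
    IDisposable Schedule<T>(T state, Func<ISimpleScheduler, T, IDisposable> f);
}

public sealed class NoOpDisposable : IDisposable
{
    public void Dispose() { }
}

// Runs the work right away on the calling thread: no concurrency at all.
public sealed class ImmediateSketchScheduler : ISimpleScheduler
{
    public DateTimeOffset Now { get { return DateTimeOffset.Now; } }

    public IDisposable Schedule<T>(T state, Func<ISimpleScheduler, T, IDisposable> f)
    {
        return f(this, state);
    }
}

// Queues the work on the thread pool; disposing the handle before the work
// starts prevents it from running.
public sealed class ThreadPoolSketchScheduler : ISimpleScheduler
{
    public DateTimeOffset Now { get { return DateTimeOffset.Now; } }

    public IDisposable Schedule<T>(T state, Func<ISimpleScheduler, T, IDisposable> f)
    {
        var handle = new CancelFlag();
        ThreadPool.QueueUserWorkItem(_ => { if (!handle.Cancelled) f(this, state); });
        return handle;
    }

    private sealed class CancelFlag : IDisposable
    {
        public volatile bool Cancelled;
        public void Dispose() { Cancelled = true; }
    }
}

public static class SchedulerDemo
{
    // Produces the squares 0, 1, 4, … on whatever scheduler the caller chooses.
    public static IDisposable ProduceSquares(
        ISimpleScheduler scheduler, int count, Action<int> onNext, Action onCompleted)
    {
        return scheduler.Schedule(count, (self, n) =>
        {
            for (int i = 0; i < n; i++) onNext(i * i);
            onCompleted();
            return new NoOpDisposable();
        });
    }
}
```

With the immediate scheduler, ProduceSquares returns only after every callback has run; with the thread-pool one it returns at once and the callbacks arrive on a background thread, which is the situation the debugger walkthrough on the Observable.Create slides assumes.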
45. Object graphs

    var _1579124585 = new Product
    {
        Title = "The Right Stuff",
        Author = "Tom Wolfe",
        Year = 1979,
        Pages = 304,
        Keywords = new[] { "Book", "Hardcover", "American" },
        Ratings = new[] { "****", "4 stars" },
    };

    var Products = new[] { _1579124585 };
46. Queries over object graphs

    var q = from product in Products
            where product.Ratings.Any(rating => rating == "****")
            select new { product.Title, product.Keywords };
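A self-contained version of the object-graph query, for readers who want to run it. The Product class shape is inferred from the object initializer on the previous slide; the second product and the ObjectGraphQuery and FourStarSummaries names are hypothetical additions so that the filter has something to reject and the result can be inspected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Title { get; set; }
    public string Author { get; set; }
    public int Year { get; set; }
    public int Pages { get; set; }
    public string[] Keywords { get; set; }
    public string[] Ratings { get; set; }
}

public static class ObjectGraphQuery
{
    public static readonly Product[] Products =
    {
        new Product
        {
            Title = "The Right Stuff", Author = "Tom Wolfe", Year = 1979, Pages = 304,
            Keywords = new[] { "Book", "Hardcover", "American" },
            Ratings = new[] { "****", "4 stars" },
        },
        // Hypothetical entry without a four-star rating.
        new Product
        {
            Title = "Untitled Draft", Author = "Anonymous", Year = 2000, Pages = 10,
            Keywords = new[] { "Book" },
            Ratings = new[] { "*" },
        },
    };

    public static List<string> FourStarSummaries()
    {
        // Nested collections are reachable directly from each product.
        var q = from product in Products
                where product.Ratings.Any(rating => rating == "****")
                select new { product.Title, product.Keywords };

        return q.Select(r => r.Title + ": " + string.Join(", ", r.Keywords)).ToList();
    }
}
```

Because keywords and ratings hang off the product itself, the query needs no joins; each result carries the whole keyword collection as a nested value.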
47. The O/R paradox
- Objects: fully compositional

    value ::= scalar | new { …, name = value, … }

- Tables: non-compositional

    value ::= new { …, name = scalar, … }
49. Queries over tables

    var q = from product in Products
            from rating in Ratings
            where product.ID == rating.ProductId && rating == "****"
            from keyword in Keywords
            where product.ID == keyword.ProductID
            select new { product.Title, keyword.Keyword };

    var q = from product in Products
            join rating in Ratings on product.ID equals rating.ProductId
            where rating == "****"
            select product into fourStarProduct
            join keyword in Keywords on fourStarProduct.ID equals keyword.ProductID
            select new { fourStarProduct.Title, keyword.Keyword };
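For comparison, here is a runnable sketch of the same data flattened into three tables and queried with joins. The row classes and their column names (ID, ProductID, Rating, Keyword) are assumptions made for this example; the slides do not spell out the table schemas.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat rows: every column holds a scalar, never a nested collection.
public class ProductRow { public int ID; public string Title; }
public class RatingRow { public int ProductID; public string Rating; }
public class KeywordRow { public int ProductID; public string Keyword; }

public static class TableQuery
{
    public static readonly ProductRow[] Products =
    {
        new ProductRow { ID = 1579124585, Title = "The Right Stuff" },
    };

    public static readonly RatingRow[] Ratings =
    {
        new RatingRow { ProductID = 1579124585, Rating = "****" },
        new RatingRow { ProductID = 1579124585, Rating = "4 stars" },
    };

    public static readonly KeywordRow[] Keywords =
    {
        new KeywordRow { ProductID = 1579124585, Keyword = "Book" },
        new KeywordRow { ProductID = 1579124585, Keyword = "Hardcover" },
        new KeywordRow { ProductID = 1579124585, Keyword = "American" },
    };

    public static List<string> FourStarKeywords()
    {
        // Relationships are rebuilt at query time by matching keys.
        var q = from product in Products
                join rating in Ratings on product.ID equals rating.ProductID
                where rating.Rating == "****"
                join keyword in Keywords on product.ID equals keyword.ProductID
                select new { product.Title, keyword.Keyword };

        return q.Select(r => r.Title + " / " + r.Keyword).ToList();
    }
}
```

The result is one flat row per (title, keyword) pair, where the object-graph query returned a single product carrying a nested keyword collection.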
50. Welcome to O/R voodoo

    var q = from product in Products
            where product.Ratings.Any(rating => rating == "****")
            select new { product.Title, product.Keywords };
51. What did we gain?
- Ad-hoc queries?
- But what about scale…
- The relational Gods invented indexes
- Going against the PK-FK flow…

    from p1 in WWW
    from p2 in WWW
    where p2.Contains(p1.URL)
    select new { p1, p2 };
57. Thank you! Bart J.F. De Smet bartde@microsoft.com Cloud Programmability Team
Editor’s Notes
Speaker tips:
- So far, we’ve seen specific operators to create new observable sequences
- Like preprogrammed (parameterized) implementations of the IObservable<T> interface
- Sometimes we just want to implement the interface
- Can be simplified using Create (general pattern in Rx)
- Is really the Subscribe method as a lambda
- Slide omits what the lambda should return, an Action delegate that’s used to create the IDisposable that’s returned…
- There’s also CreateWithDisposable
- We chose to omit this from the slide (and be imprecise) to focus on the flow of data here
- Typically, an observable creates concurrency upon subscription in order to send out the messages
- Mention this but defer it until later, where we mention IScheduler
- Also notice the use of a Subscribe extension method
- Again… this shows how to mimic anonymous interface implementations that C# lacks
Speaker tips:
- Assume we’re in the debugger
- Set a breakpoint on the onNext lambda body
- Start the program using F10
- Now let’s see what Subscribe will do…
Speaker tips:
- Slide is just here for animation of F10
- Press F10 again
- Subscribe will new up an IObserver<int> object
- This will get passed to the Create method’s lambda parameter as the “observer” parameter
Speaker tips:
- Emphasize the asynchronous nature of Subscribe
- Main thread has moved on beyond the asynchronous Subscribe call
- Still assume the body of Create has introduced concurrency, i.e. calls to OnNext and OnCompleted are scheduled to happen in the background
- We’ll let the debugger go using F5 to see our breakpoint getting hit
Speaker tips:
- Bang – the breakpoint got hit!
- Notice where the main thread sits, indicated in gray
- Though we’re blocking that thread in the sample, it could be doing other useful work while the observable notifies us about data being available
- This shouldn’t be new to the audience, e.g. when using += (o, e) => {…} to set up an event handler
- Syntactical location of a breakpoint can belong to a whole different thread compared to code close-by!
Speaker tips:
- Big message:
  - Primitive constructor operators are great, but we’d like to do something of more interest…
  - Rx doesn’t aim at replacing existing sources of asynchrony in the framework
  - Instead we can bridge with those worlds
- The first common source of asynchrony is .NET events
- They suffer from some problems:
  - Nobody thinks of a mouse as a database of points
    - Mouse database is not “preprogrammed” (like: “give me a mouse that can move once across the screen”) but is an infinite source
    - Have to dot into the EventArgs object to obtain the data (sometimes it isn’t even there!)
  - Events cannot be grabbed
    - No objects that can be passed to a method, stored in a field, put in an array, etc.
    - How’d you write a GPS visualizer that expects to get passed an event producing points? Can’t pass a .NET event along!
  - Composition suffers
    - Everyone has to write logic in event handlers, e.g. an if to filter
    - Can’t hire a mathematician to write a generic filter that works with all events
    - Also, we’d like a filtered event still to be an event (stay in the same “world”); we have to settle for procedural code today
  - Resource maintenance requires state
    - Have to remember what you gave to += in order to get rid of it using -=; use a field?
    - Same C# code passed to -= won’t work (there is no value equality between delegates based on what code they contain)
  - Notice resource management gets even harder in the face of composition
    - Say that applying a hypothetical generic filter to an event gives you a new event “object”
    - Now if you unhook a handler from the filtered event, you want it to unhook from the original event
    - Cascading effect with lots of state maintenance!
Speaker tips:
- How does Rx improve on this?
- FromEvent methods here: reflective overload being used
- Notice: omits a few things…
  - Generic parameter for EventArgs
  - Fact it returns an IObservable<IEvent<…>>; correct this in the demo
  - Rationale: focus on the essence here
- Comparison to before:
  - Look at the type to see the (no longer hidden) data source: source of Point values (thanks to generics)
  - Objects a la IObservable<Point> can be passed, e.g. to our GPS visualizer
  - Just like LINQ to Objects does, we can define operators on objects; compositionality enters the picture
  - Resource maintenance can be done using a “subscription handle”
    - Yes, you still need to store it somewhere, but you don’t need to remember what you gave it
    - The old world is like subscribing to a magazine but keeping your hands on the check so you can take it back!
    - In the new world you get an unsubscription card (the Dispose method) you can send in to unsubscribe…
- Notice state maintenance for unsubscription can be encapsulated now too
  - E.g. Merge operator that merges n number of observable sequences into one
  - Subscription causes a subscribe on all of the sequences
  - Unsubscribe should unsubscribe from all of the underlying sequences
  - Needs a list of IDisposable objects; we have an algebra over IDisposable in System.Disposables
  - Can hide those from the outside world!