Constructive Criticism of the Reactive Extensions API

posted by Craig Gidney on September 6, 2012

I love the Reactive Extensions (Rx) library for .Net. You may also have heard it referred to as “linq to events”, which is a perfect name because Rx simplifies event handling in a linq-y functional style (also, the creators have a strange obsession with calling IObservable the dual of IEnumerable).

You can use Rx to write event-driven systems that, written in the normal style, would be absolute messes. For example, consider this code that uses Rx plus some custom extension methods to implement ‘dragging the volume up and down’:

private async void OnPressed(object sender, PointerRoutedEventArgs e) {
    var preVolume = GetVolume();

    var releasedNormally = await this.CaptureObserveDrag(e).Select(pt => {
        var f = (pt.GetCurrentPoint(this).Position.X/this.ActualWidth).Between(0, 1);
        SetVolume(f);
        return f;
    }).WillCompleteNormally();

    if (releasedNormally) {
        PlayBeep();
    } else {
        SetVolume(preVolume);
    }
}

The equivalent code, written without Rx or something similar, is awful. You have to register for multiple events (PointerReleased, PointerMoved, PointerCancelled, CaptureLost), sprinkle state throughout the control’s code, and then the next time you want drag-like functionality you have to do it all over again from scratch.

Although the rest of this post is about issues I have with the Rx API, I absolutely recommend that you use it or at least read more about it.

The Rx API has a few gaping flaws in it that I really hope are fixed before it is set in concrete by being included in the .Net framework. I suppose the easiest way to cover them is awkwardly segueing into a list.

1) Extension methods have return types that are too general. Return tasks instead of observables where possible.

A method’s return type should closely match its set of possible results. Almost all of the observable extension methods return IObservable<T>, but many of them (Aggregate, First, All, Max, ToList, …) could return the more specific Task<T> instead. Returning a Task<T> ensures the user knows there is exactly one result (or an exception), and makes the result easier to use. This is especially significant now that C#5 includes async features that make working with tasks a breeze.

As an analogy, imagine if the enumerable extension method ToList returned an IEnumerable<List<T>> instead of a List<T>. You would be forced to call GetEnumerator() then MoveNext() then Current just to get what you actually wanted, and any static analysis tools would complain unless you made accessing Current dependent on MoveNext’s result. You know there’s always exactly one result, and the return type should reflect that fact.

To drive this point home, here is a concrete example. Suppose we want to print the average of the sums of integers in observables in a list, and consider Observable.Sum/ToArray returning either an IObservable or a Task.

First, returning a Task:

static async void PrintAverageSum(IEnumerable<IObservable<int>> observables) {
    try {
        var asyncSums = observables.Select(e => e.Sum()); // enumerable task sums
        var sums = await Task.WhenAll(asyncSums); // enumerable of task --> task of enumerable
        var avg = sums.Average(); // compute average in the normal way
        Print(avg); // print result
    } catch (Exception ex) {
        Print(ex); // print exception (if any)
    }
}

Second, returning an IObservable:

static void PrintAverageSum(IEnumerable<IObservable<int>> observables) {
    var sums = observables.Select(e => e.Sum()) // enumerable observable sums
                          .ToObservable() // convert top-level enumerable to observable
                          .SelectMany(e => e) // flatten two-level observable
                          .ToArray(); // group all items into observable single int[]
    var avg = sums.Select(e => e.Average()); // single observable average of all sums
    avg.Subscribe(Observer.Create<double>( // print result or exception
        v => Print(v),
        ex => Print(ex)));
}

The task variant has two big advantages here: it looks more like “normal” C# and it is easier to analyze. It uses standard control flow (try-catch) and it is trivially clear that exactly one thing is printed. The observable variant is forced to use custom control flow built out of lambdas, which is easier to get wrong. Also, proving that it prints exactly one thing requires knowing that the observable returned from Observable.ToArray contains exactly one item (or an exception). This fact is not present in the method’s signature, meaning static analysis tools will have difficulty with it.

These advantages compound as methods become more complicated, especially in the presence of asynchronous loops and branches.

Note that, when Rx was first released, .Net didn’t have the Task type yet (it was introduced in .Net 4.0). Using task types would make Rx incompatible with .Net 3.5, but I think it’s worth it.
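To make the shape of a task-returning aggregate concrete, here is a minimal sketch that uses only base class library types (IObservable<T>, IObserver<T>, TaskCompletionSource<T>). The names SumAsync, SumObserver, RangeObservable and NoopDisposable are hypothetical helpers invented for this illustration; none of them are part of Rx, and a real implementation would also need to think about overflow and about disposing the subscription.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical extension (not part of Rx): exposes the single result of a sum as a Task<int>.
static class ObservableTaskExtensions {
    public static Task<int> SumAsync(this IObservable<int> source) {
        var result = new TaskCompletionSource<int>();
        // The returned IDisposable is ignored: the subscription lives as long as the source.
        source.Subscribe(new SumObserver(result));
        return result.Task;
    }

    private sealed class SumObserver : IObserver<int> {
        private readonly TaskCompletionSource<int> _result;
        private int _total; // unchecked addition, to keep the sketch short

        public SumObserver(TaskCompletionSource<int> result) { _result = result; }

        public void OnNext(int value) { _total += value; }
        public void OnError(Exception error) { _result.TrySetException(error); } // failure surfaces through the task
        public void OnCompleted() { _result.TrySetResult(_total); }              // exactly one result
    }
}

// Hypothetical stand-in for Observable.Range, so the sketch does not depend on Rx.
sealed class RangeObservable : IObservable<int> {
    private readonly int _start;
    private readonly int _count;

    public RangeObservable(int start, int count) { _start = start; _count = count; }

    public IDisposable Subscribe(IObserver<int> observer) {
        for (var i = 0; i < _count; i++) observer.OnNext(_start + i);
        observer.OnCompleted();
        return NoopDisposable.Instance;
    }
}

sealed class NoopDisposable : IDisposable {
    public static readonly NoopDisposable Instance = new NoopDisposable();
    public void Dispose() { }
}

static class Program {
    static void Main() {
        // The source above is synchronous, so the task is already complete here.
        var sum = new RangeObservable(1, 4).SumAsync().Result;
        Console.WriteLine(sum); // prints 10
    }
}
```

The signature alone now tells the caller, and any analysis tool, that there is one value or one exception, and inside an async method the result can simply be awaited.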

2) The caller should manage the “lifetime data” of a subscription. Use cancellation tokens, not returned disposables.

Many asynchronous methods in the .Net base class library have a CancellationToken parameter, used to allow disposal/cancellation of the result. In contrast, the Rx Subscribe methods return an IDisposable that allows disposal/cancellation of the subscription. The cancellation token approach is fundamentally better for a few reasons:

  1. Matching lifetimes: The lifetime of a subscription often matches the lifetime of other subscriptions or of the calling method. If you are using cancellation tokens then the same token can be used for all subscriptions with the same lifetime, or the token given to the caller can be passed along. If you are using returned disposables, then you must write additional wiring code each time.
  2. Dependent lifetimes: The lifetime of a subscription often matches the lifetime of the observable to which it is subscribed. If you are using cancellation tokens this is achieved simply by passing CancellationToken.None. To do this with returned disposables you must choose between two evils to keep the subscriptions from being collected: either force users to store the disposable subscription tokens (and draw their ire), or break the convention that finalization implies disposal (and draw the ire of code analysis tools like FxCop, which warn when a disposable goes out of scope without being disposed).
  3. Single results: The return value of a method is often used for some other purpose. If that method must also allow management of the lifetime of a subscription then returning a disposable requires returning an awkward combination type or using an out parameter. Using a cancellation token is unaffected by the return type: just add the token parameter like usual.

Cancellation tokens just work better. They’re easier to add to existing methods, they handle edge cases nicely, they have the exact semantics you want, and they even play nice with optional parameters because default(CancellationToken) is CancellationToken.None.
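A token-based subscribe can be sketched on top of the existing disposable-returning one. The following is only an illustration under stated assumptions: SubscribeUntilCancelled, SimpleSubject and RecordingObserver are hypothetical names made up for this example (they are not Rx types), the code is not thread-safe, and it uses nothing beyond the base class library.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical extension: the caller supplies the lifetime, nothing is returned.
static class TokenSubscriptions {
    public static void SubscribeUntilCancelled<T>(
            this IObservable<T> source, IObserver<T> observer, CancellationToken lifetime) {
        if (lifetime.IsCancellationRequested) return; // already over: never subscribe
        var subscription = source.Subscribe(observer);
        // Dispose the subscription when the token is cancelled.
        // With CancellationToken.None the callback is never invoked.
        lifetime.Register(subscription.Dispose);
    }
}

// Hypothetical minimal "hot" source, standing in for a real observable (not thread-safe).
sealed class SimpleSubject<T> : IObservable<T> {
    private readonly List<IObserver<T>> _observers = new List<IObserver<T>>();

    public IDisposable Subscribe(IObserver<T> observer) {
        _observers.Add(observer);
        return new Unsubscriber(_observers, observer);
    }

    public void Publish(T value) {
        // Iterate over a copy so observers may unsubscribe while being notified.
        foreach (var observer in _observers.ToArray()) observer.OnNext(value);
    }

    private sealed class Unsubscriber : IDisposable {
        private readonly List<IObserver<T>> _observers;
        private readonly IObserver<T> _observer;

        public Unsubscriber(List<IObserver<T>> observers, IObserver<T> observer) {
            _observers = observers;
            _observer = observer;
        }

        public void Dispose() { _observers.Remove(_observer); }
    }
}

// Hypothetical observer that just remembers what it was given.
sealed class RecordingObserver<T> : IObserver<T> {
    public readonly List<T> Values = new List<T>();
    public void OnNext(T value) { Values.Add(value); }
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

static class Program {
    static void Main() {
        var source = new SimpleSubject<int>();
        var first = new RecordingObserver<int>();
        var second = new RecordingObserver<int>();
        var lifetime = new CancellationTokenSource();

        // Matching lifetimes: one token covers both subscriptions, no extra wiring.
        source.SubscribeUntilCancelled(first, lifetime.Token);
        source.SubscribeUntilCancelled(second, lifetime.Token);

        source.Publish(1);
        source.Publish(2);
        lifetime.Cancel();  // ends both subscriptions
        source.Publish(3);  // not delivered to either observer

        Console.WriteLine(string.Join(",", first.Values));  // prints 1,2
        Console.WriteLine(string.Join(",", second.Values)); // prints 1,2
    }
}
```

Because the method returns nothing, its return value stays free for other purposes, and a caller that wants the subscription to live as long as the source just passes CancellationToken.None.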

3) When in doubt, leave it out. The exposed API is too large.

The Rx framework exposes a lot of classes that don’t necessarily need to be exposed. For example, System.Reactive.Disposables.RefCountDisposable is a public class that

Represents a disposable resource that only disposes its underlying disposable resource when all System.Reactive.Disposables.RefCountDisposable.GetDisposable() have been disposed.

This class is potentially useful, but it’s also something that must be learned and considered when exploring the API. The Rx framework is full of bits like this that almost no one will ever use. Browsing it all in the object explorer is frankly a little daunting.

The Rx framework also exposes types like System.Reactive.Unit (the ‘void’ value) and System.Reactive.Disposables.ICancelable (disposable with IsDisposed getter). These are great types to have, but they’re so great that the user probably already has one implemented. Giving them another “not a value” value just creates conflict between their code and your code.

In spite of the flaws I’ve listed (reducible return types, awkward lifetimes, large API), Rx is still a very useful framework. It’s just not as good as it could be. The longer we wait, the harder it will be to change it.

Honestly, it’s probably already too late.

Update: Bart De Smet, one of the Rx devs, has posted an excellent response to this critique.

  • http://rehansaeed.co.uk Muhammad Rehan Saeed

    Totally agree. The framework is massive in size. I did a series of posts
    at http://rehansaeed.co.uk to recommend using Rx as a replacement for
    .NET events.

  • Niall

    Firstly, your Rx approach is combining both the composition buildup of the Observable, and the consuming of its values. A more common approach would be to separate the construction from the consumption. So you’d have GetAverageSum that returns the IObservable that gives you your Average value. Then the caller could subscribe to this and print out the value or whatever it wants. Then it doesn’t feel so weird as it would returning an IDisposable from your PrintAverageSum method.

    Also, your code to get the Average Sum in Rx is a lot more complex than it needs to be. If you’re using this to evaluate the cleanness of the API, you need to look at how to implement the solution in the simplest way before you evaluate it.

    I knocked this up in LinqPad, it seems to fit the bill:

    void Main()
    {
        var o1 = Observable.Range(0, 5);
        var o2 = Observable.Range(1, 5);
        var o3 = Observable.Range(2, 5);
        GetAverageSum(new[] { o1, o2, o3 }).Dump();
    }

    IObservable<double> GetAverageSum(IEnumerable<IObservable<int>> observables)
    {
        return observables.Select(e => e.Sum()).Merge().Average();
    }

