The Case Against Synchronous Worker APIs

May 15, 2013

Several times in the past few months I've been presented with this question: is it good or useful to provide synchronous APIs to web workers?

Having considered the question at some length, it seems to me the answer must now be "no".

Consider IndexedDB. It's not implemented in any browser yet, but some hard-working soul did the arduous work to specify a second synchronous version of its API which is meant to be available only in Workers where it can't lock up the UI thread. That spec work was probably done because it was thought that a synchronous API would be nicer to use than the async version. As a result, the API is now double the size, but only in some contexts. I came across this while attempting to rework the IDB API to use Futures in order to improve usability in a backwards-compatible way.

So why not the sync version? At least 2 reasons:

The async version doesn't have to be thorny as the current IDB API is. The current IDB API has its issues: it uses events for one-time operations (not 0-through-N times operations). To do so it has to create non-obvious control flow contracts to give developers a place to set up listeners. Another legacy of using events is that delivered values are always wrapped in event objects and must be dug out with boilerplate. Some API conveniences can be added, and some things in the API are implicit that perhaps shouldn't be (e.g. transactions). All of these things can be repaired in a backwards-compatible way that improves async IDB to the point of being nearly as nice to use as sync IDB. Futures can help, as can some reworking of optional arguments.
There's very little a worker will do that is really long running and truly synchronous. Even things that look like they're a good fit at first blush break down quickly. Just think of the user experience: who wants their laptop fan spinning up, their lap getting warm, and their precious battery indicator slipping towards a powerless coma without being able to either find what is causing it or, better, some UI that communicates what work is being done on their behalf and a way to cancel it if necessary. Getting work off the main thread is what allows apps to do work for users without locking up those UI indicators, but to provide any ability to interrupt the task or broadcast progress, it must yield to the main event loop.

It's this second concern that I think it truly fatal to the cause of sync worker APIs: assuming they work and are popular, they will create a world in which it's necessary to put limits on their overall running time...limits that will be circumvented by breaking up the work into smaller chunks and dealing with it asynchronously inside the worker. Likewise, anyone building a serious app that's trying to do the right thing by the user will factor their worker's tasks into small enough chunks that they can both service "stop" messages and distribute progress notifications to the UI. There might be scenarios where such messages aren't necessary and where users aren't coveting CPUs and batteries...where sending SIGHUP doesn't matter. But the intersection of those scenarios and the client-side web seems mostly to be a happy accident: your code might not have encountered enough data to create the problems. Yet.

This is particularly clear in the IDB cases: upgrading, iterating over, and updating hundreds of thousands of items of data is the sort of thing that will take a while, and is likely in response to some application semantic the user cares about: synchronizing mail, migrating to a faster schema layout in the process of some upgrade, etc. A blind for loop is asking for trouble. All of this might work fine in a dev environment with a (small) staging set of data...but it's recipe for disaster when power-users with tons of data encounter it. What then? If the APIs an app depends on are all synchronous, it's a huge boulder to roll up a hill to provide notifications, chunk up work, and refactor around async-ish patterns that chunk work up. If the work was async in the first place, the burden is much lower. So even apps that aren't Doing It Right (TM) are likely to reap some benefit down the line from thinking in terms of async first.

There are other arguments that you can field against these sorts of APIs, particularly ones that double-up API surface area, but it doesn't seem to me that they're necessary. The person attempting to justify synchronous worker APIs who provides a good argument for ergonomics and learnability still has all their work ahead of them: they must show that these APIs are not harmful to the user experience. After all, Workers were added to the platform as a way of improving UX (by moving work off the main thread). And I fear they cannot do so without violating core JS semantics.

So let's pour one out for our sync API dreams: we're gonna miss you, control flow integration. But not for too long. Generators, iterators, and yield will see you avenged.