with functional programming idioms
to control concurrency
without recourse to 3rd-party libraries.
This post contrasts two such patterns
that enable you to process data
either concurrently or serially,
with back-off and retry logic
in the event of errors occurring.
About a month ago,
we had an outage
caused by a slow-running query in MySQL.
This particular slow query
wasn't spotted when it was deployed
because it depended on data
inserted by client browsers
and the related preference in Firefox
was not enabled at that point.
A few weeks after it shipped,
the client pref was flipped on
and as the table grew,
it slowed down MySQL increasingly
until the whole of Firefox Accounts
And of course,
because Sod's Law
is one of the fundamental forces of nature,
this happened late on a Friday night.
There were some complicating factors
that slowed down diagnosis,
but it's also fair to say
we could have caught it at source
with an EXPLAIN of the offending query
during code review.
Because of that,
I decided to try and automate
EXPLAIN checks for our MySQL queries.
Let's say you have a huge amount of JSON data
and you want to parse values from it in Node.js.
Perhaps it's stored in a file on disk
or, more trickily,
it's on a remote machine
and you don't want to download the entire thing
just to get some data from it.
And even if it is on the local file system,
the thing is so huge that reading it in to memory
and calling JSON.parse
will crash the process
with an out-of-memory exception.
Today I implemented a new method
for my async JSON-parsing lib, BFJ,
which has exactly this type of scenario in mind.