This talk was given at JSSummit 2013. Entitled "Avoiding Callback Hell with Async.js", my talk focused on common pitfalls with asynchronous functions and callbacks in JavaScript, and using the async.js library and its advanced control flows to create cleaner, more manageable code.
17. Some specific challenges
• When branching, we can’t know the
order of these function calls
• If we want parallel execution, we have
to do some real gymnastics to get
return data back together
• Also, scoping becomes a challenge
18. A more ‘real’ example
var db = require('somedatabaseprovider');
// get recent posts
http.get('/recentposts', function(req, res) {
  // open database connection
  db.openConnection('host', creds, function(err, conn) {
    // look up the author of each post
    res.param['posts'].forEach(function(post) {
      conn.query('select * from users where id=' + post['user'], function(err, users) {
        conn.close();
        res.send(users[0]);
      });
    });
  });
});
20. Solutions
• You can make this easier to read by
separating anonymous functions
• Passing function references instead of
anonymous functions helps even more
22. Separate Callback
var fs = require('fs');
// named callback, defined separately from the call that uses it
var callback = function(err, data) {
  if (err) {
    return console.log(err);
  }
  console.log(data);
};
fs.readFile('f1.txt', 'utf8', callback);
23. Can turn this:
var db = require('somedatabaseprovider');
http.get('/recentposts', function(req, res) {
  db.openConnection('host', creds, function(err, conn) {
    res.param['posts'].forEach(function(post) {
      conn.query('select * from users where id=' + post['user'], function(err, results) {
        conn.close();
        res.send(results[0]);
      });
    });
  });
});
24. …into this
var db = require('somedatabaseprovider');
http.get('/recentposts', afterRecentPosts);

function afterRecentPosts(req, res) {
  db.openConnection('host', creds, function(err, conn) {
    afterDBConnected(res, conn);
  });
}

// res and conn are passed along explicitly, because the named
// functions no longer share the enclosing scope
function afterDBConnected(res, conn) {
  res.param['posts'].forEach(function(post) {
    conn.query('select * from users where id=' + post['user'], function(err, results) {
      afterQuery(res, conn, results);
    });
  });
}

function afterQuery(res, conn, results) {
  conn.close();
  res.send(results[0]);
}
25. Good start!
• Callback function separation is a nice
aesthetic fix
• The code is more readable, and thus
more maintainable
• But it doesn’t improve your control flow
– Branching and parallel execution are still
problems
26. Enter Async.js
Async.js provides common patterns for
asynchronous code control flow
https://github.com/caolan/async
BONUS: Also provides some common
functional programming paradigms
27. Client or Server -side
My examples will mostly be Node.js
code, but Async.js can be used in both
client and server side code
31. Waterfall Execution
Async also provides a flow for serial
execution, passing results to successive
functions
[Diagram: Function 1 → args → Function 2 → args → Function 3 → args → Function 4]
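The waterfall idea can be sketched without the library. The `waterfall` helper below is a hand-rolled stand-in written for illustration, not Async.js's own implementation, and the three tasks are made up. It assumes Node-style callbacks (error first). With Async.js itself, the same task list would be passed to async.waterfall.

```javascript
// Minimal stand-in for a waterfall helper (illustrative only).
// Each task receives the previous task's results followed by a callback;
// the first error skips the remaining tasks and goes straight to `done`.
function waterfall(tasks, done) {
  var index = 0;
  function next(err) {
    var args = Array.prototype.slice.call(arguments, 1);
    if (err || index === tasks.length) {
      return done.apply(null, [err || null].concat(args));
    }
    var task = tasks[index];
    index += 1;
    task.apply(null, args.concat([next]));
  }
  next(null);
}

// Hypothetical tasks: each one builds on the value passed down by the last.
waterfall([
  function (callback) {
    setTimeout(function () { callback(null, 2); }, 5);
  },
  function (n, callback) {
    callback(null, n * 10);
  },
  function (n, callback) {
    callback(null, n + 1);
  }
], function (err, result) {
  if (err) { return console.log(err); }
  console.log('final result:', result);
});
```

The point of the pattern is that no task needs to know which task comes next; it only hands its results to the callback.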
33. Times
Times() offers a shortcut to iterating over
a function multiple times in parallel
async.times(5, function(n, next) {
  createUser(n, function(err, user) {
    next(err, user);
  });
}, function(err, users) {
  // 'users' now contains 5 users
});
35. Collection Management
Functional programming provides some
useful tools that are becoming
mainstream
Specifically, map, reduce, and filter
operations are now common in many
languages
40. Filter
To minimize computational overhead,
it’s often helpful to filter data sets to only
operate on acceptable values
Async.js provides a Filter function to do
just this
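As a rough sketch of what an asynchronous filter does: the `filterAsync` helper and the `isEven` check below are hypothetical names invented for this example, not part of Async.js. The sketch assumes error-first callbacks; the callback signature of the library's own filter function has differed between Async.js versions, so check the documentation for the version you use.

```javascript
// Hypothetical async check; in real code this might hit the disk or a database.
function isEven(n, callback) {
  setTimeout(function () { callback(null, n % 2 === 0); }, 5);
}

// Minimal stand-in for an async filter (illustrative only).
// Runs every check in parallel and keeps the items whose check passed,
// preserving the original order of the collection.
function filterAsync(items, check, done) {
  var keep = [];
  var pending = items.length;
  var failed = false;
  if (pending === 0) { return done(null, []); }

  items.forEach(function (item, index) {
    check(item, function (err, ok) {
      if (failed) { return; }
      if (err) { failed = true; return done(err); }
      keep[index] = ok;
      pending -= 1;
      if (pending === 0) {
        done(null, items.filter(function (_, i) { return keep[i]; }));
      }
    });
  });
}

filterAsync([1, 2, 3, 4], isEven, function (err, evens) {
  if (err) { return console.log(err); }
  console.log('kept:', evens);
});
```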
Welcome. I’m going to talk today about callback hell, a place well-traveled by experienced JavaScript developers, as well as some other common control flows and patterns that make JavaScript less than comfortable sometimes. I’m going to focus on Async.js, a useful JavaScript library for managing complex control flows. Async also has a number of other handy tricks up its sleeve to make your life easier, and I’ll show you some of them.
So hi. I’m Aaron. Here are a few places you can contact me, should you want to ask any questions later, or troll my posts.
So, JavaScript? You guys know it, right? A language we all know and love. Once relegated to small-scale browser scripting, but now exploding into use everywhere. Remember back when Java was supposed to be in everything from servers to toasters? I think JavaScript has taken up that mantle.
So why do I like JavaScript so much? Well, it’s a really nice language. C-ish syntax, efficiency and flexibility in the object model. And once people started really working with JavaScript, they discovered that it actually has an excellent event model, which gave it some robust asynchronous programming capabilities. Now, asynchronous-focused programming languages are nothing new, as anyone who’s worked with ActionScript can attest, but JavaScript has a particularly nice model. Also, JavaScript is now both a client and server-side language, thanks to…
…node.js. In fact, the pre-existing event model was one of the reasons JavaScript was such a perfect fit for an asynchronous server-side framework like Node.
So JavaScript’s great, we all know it, but let’s talk about the skeletons in the closet. Every language has things that can cause you pain, and JavaScript is no exception.
JavaScript handles its asynchronous control flow using callbacks. Callbacks, while useful, allow your code to jump to any other spot in the code instantly, upon completion of a function. Because of this, some have described callbacks as the modern-day GOTO statement, and hopefully we all remember why GOTO is evil. Impossible to trace code, difficult to read and understand: it’s a maintainability issue. This is actually a really big deal, because low readability and maintainability are what keep certain languages, frameworks, and syntax paradigms from enabling the creation of large, long-lasting code bases. We are starting to build huge, full-stack stuff with JavaScript, which means we have to start being very cognizant of the fact that potentially large numbers of developers with varying levels of expertise and understanding will need to look at our code, modify and extend our code, and debug our code.
So let’s take a look at a callback. Point of note, most of my examples are going to be written in node.js syntax, because I’m a node guy and it’s easier to demo, but async.js itself, though originally written for node.js, now works in both client and server side JavaScript. So here’s a standard callback. We call the readFile method, pass it some parameters (a file and a format), and then pass it a callback function.
Standard JavaScript convention is to use inline, anonymous functions as callbacks, which is what we see here.
Because this conventional syntax often causes some confusion, let’s pick it apart to make sure it’s completely clear. Here’s equivalent, if slightly less elegant, syntax, separating out the entire anonymous function declaration. So there it is.
The problem comes in when you start nesting callbacks. As in, my callback from an asynchronous function calls another asynchronous function, which has its own callback. And on, and on…
We in the biz call this callback hell. Because it’s scary. And painful.
And has a devil.
Now let’s visualize what’s going on here. When a function is called, it goes off into the asynchronous, background ether. And the rest of the file continues to execute. When the function finishes its asynchronous task, it calls the callback. This ends up looking something like this, except the functions aren’t calling each other directly. Regardless, in the best case scenario, we have a linear callback path. Function 1 finishes, calls function 2. function 2 finishes, calls function 3. etc. Now without the asynchronous component, we’d be all set. These functions would call each other in a cascading fashion. But alas, these callbacks are asynchronous.
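A small sketch makes the ordering concrete. `fakeAsyncTask` is a hypothetical stand-in for any asynchronous function (it just uses setTimeout), and the `order` array only exists to record what happens when.

```javascript
// Records the order in which things happen.
var order = [];

// Hypothetical async task: does its "work" on a later tick via setTimeout,
// then calls the callback.
function fakeAsyncTask(name, callback) {
  order.push('start ' + name);
  setTimeout(function () {
    order.push('finish ' + name);
    callback(null, name);
  }, 10);
}

fakeAsyncTask('function 1', function (err, result) {
  order.push('callback for ' + result);
});

// This line runs before the callback above, even though it comes later
// in the file: the rest of the file does not wait for the async task.
order.push('rest of file');
```

Once the timer has fired, `order` reads: start, rest of file, finish, callback.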
Which means they can _branch_
If I loop over a collection in function 1, and call an asynchronous function for each of 2 elements, I get 2 calls to its callback, function 2. If in one instance of function 2, we iterate over another collection, we get three instances of function 3. And each of these branched instances calls the full chain of future callbacks. Starting to see how this can get really out of hand?
So other than being able to draw a non-confusing diagram, what’s the problem with branching? First, we lose any control over the order of calls within our callback chain. It’s unlikely, but theoretically a single branch could play out all the way down before any other branch starts. We are getting parallel execution, for some definition of parallel, but each branch will end with its own piece of data as a result of the full execution chain. Since we don’t know the order of execution, it’s tricky to aggregate all of these final return data values without global variables. Further, understanding and managing scope within so many nested functions can become extremely difficult.
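Here is roughly what those gymnastics look like when done by hand. Everything in this sketch is hypothetical: `lookupUser` fakes a database query with setTimeout (for ids 0 to 2, later ids finish first), and `lookupAll` is one common way to collect the results without a global variable.

```javascript
// Hypothetical async lookup: the delay shrinks as the id grows, so for
// ids 0..2 the callbacks arrive in the reverse of the call order.
function lookupUser(id, callback) {
  setTimeout(function () {
    callback(null, { id: id, name: 'user' + id });
  }, 30 - id * 10);
}

// Collect every branch's result without a global: keep a counter and a
// results array in a closure, and call `done` exactly once.
function lookupAll(ids, done) {
  var results = [];
  var pending = ids.length;
  var failed = false;
  if (pending === 0) { return done(null, results); }

  ids.forEach(function (id, index) {
    lookupUser(id, function (err, user) {
      if (failed) { return; }
      if (err) { failed = true; return done(err); }
      results[index] = user;   // store by position, not by arrival order
      pending -= 1;
      if (pending === 0) { done(null, results); }
    });
  });
}

lookupAll([0, 1, 2], function (err, users) {
  if (err) { return console.log(err); }
  console.log(users.map(function (u) { return u.name; }));
});
```

That is a fair amount of bookkeeping for one loop, and it has to be repeated at every level of branching.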
Here’s a more realistic piece of pseudocode. This is a very simple piece of code. It makes an http request, gets a response containing a number of recent posts (let’s assume this query is going to a blogging app with an API), opens a connection to a database, extracts the id of the user who wrote the post, and queries information about that user from the database. Even in this basic functionality, we have accumulated three nested callbacks, three anonymous inline functions.
Here’s our friend. We can already see the nesting begin, without too much complexity to the functionality of our code. What happens when we add error handling? More nesting, harder to read code. Code that’s becoming harder to maintain.
The first step in improving understandability and readability is to separate out those anonymous functions. Instead of passing an entire anonymous function into our asynchronous method, I’d rather pass a reference to a function formally defined elsewhere.
So let’s remove the inline callback…
… and define it as an object. We can then pass it into the async method. This is much easier to read, which is good.
Sure, a few more lines of code. But look how much better the organization is. Look how much easier it would be to come in and immediately understand this code.
So let’s talk about a solution to our control flow problems. Async.js is a fantastic library created by Caolan McMahon. Async gives you easy to use tools for a number of common control flows to address the problems we’ve been talking about, and also gives you some advanced collection management tools in the form of common functional programming paradigms. Let’s take a look so you can see what I mean.
The first set of tools async provides are control flows for serial and parallel execution of functions. In some cases, you will want to run a number of asynchronous functions in series (while still allowing other code to execute during async operations). In many other cases, you want to run all of your functions in parallel, but to aggregate results when all computations are complete, as we discussed earlier. Async gives you nice, lean syntax to implement both of these control flows:
Like so. Just pass a list of functions into an async.parallel or async.series function, and async will execute them as you have requested. You can declare these functions as inline anonymous functions, or separate them into independent function objects as I showed you earlier, to make the control flow code very small and readable. But wait, there’s more!
You don’t have to manage the callback data pipeline directly anymore! For parallel execution, I can pass in as many functions as I want, and a single callback for all of them. Async will aggregate all return values from each function together into a single collection, and call the universal control flow callback when all parallel executions have completed. All of your data, in one place, delivered to you when computation is complete, with no work on your part. If you have ever had to deal with a lot of parallel execution, this is seriously cool. Refactoring control flows is now much easier as well: you can see the small change to go from series to parallel. Just change the async method, and remove or add a callback. Two lines of code to change your entire execution strategy. This can be very powerful when you have complex systems and interactions.
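To make the series-versus-parallel swap concrete, here is a sketch with hand-rolled `series` and `parallel` helpers. They are simplified stand-ins written for this example, not Async.js's implementation, and `makeTask` is a hypothetical task factory. With the library, the same `tasks` array would go to async.series or async.parallel.

```javascript
// Hypothetical task factory: each task "computes" a value after a delay.
function makeTask(value, delay) {
  return function (callback) {
    setTimeout(function () { callback(null, value); }, delay);
  };
}

// Stand-in for a parallel flow: start every task at once, collect results
// in task order, and call `done` once when all have finished.
function parallel(tasks, done) {
  var results = [];
  var pending = tasks.length;
  var failed = false;
  if (pending === 0) { return done(null, results); }
  tasks.forEach(function (task, index) {
    task(function (err, value) {
      if (failed) { return; }
      if (err) { failed = true; return done(err); }
      results[index] = value;
      pending -= 1;
      if (pending === 0) { done(null, results); }
    });
  });
}

// Stand-in for a series flow: start each task only after the previous ends.
function series(tasks, done) {
  var results = [];
  function runNext(index) {
    if (index === tasks.length) { return done(null, results); }
    tasks[index](function (err, value) {
      if (err) { return done(err); }
      results.push(value);
      runNext(index + 1);
    });
  }
  runNext(0);
}

var tasks = [makeTask('a', 20), makeTask('b', 5), makeTask('c', 10)];

function report(err, results) {
  if (err) { return console.log(err); }
  console.log(results);
}

// Same tasks, same callback; only the control flow function changes.
series(tasks, report);
parallel(tasks, report);
```

Both calls hand `report` the results in task order; the difference is only in when each task is started.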
The one drawback to serial execution is that while async executes each function, one after the other, the functions have no way of talking to each other.