There are few things in life that I enjoy more than good, healthy, broken code. It’s inevitable that things are going to break, it’s inevitable that I'm going to need to debug those things, and it’s inevitable that I'll need do whatever is necessary to fix them. No one ever ships 100% perfect code.
This happens all the time: we ship code to production, all our tests pass, things seem fine, we celebrate… and then users start complaining. Sometimes they complain right away. Sometimes days later. We usually have no idea what happened to make them complain in the first place. No one has ever intended their code to unexpectedly break whatever it broke.
So we end up scrambling to do post-mortem debugging. An exception has happened, we don’t have the info as to why, but we need to figure it all out real quick by debugging after the fact.
But we want to catch these issues before users have the chance to complain to us. Ideally, this would be solved easily with tests. Why not just cover every scenario with a test? Then life would be perfect and fine and great.
Because here in reality, humans are pretty bad at writing tests. Not just because we’re all kinda lazy and maybe a little dumb, but also because:
– Who wants to write tests for everything?
If your product has a customer base and one of those customers emails you to complain about something being broken, they probably send you a message like: “It doesn’t work for me. Please fix! I’m losing revenue.” People are always losing revenue, even if your app just hosts cool pictures of dogs.
If you’re fortunate enough to hear from a tech savvy user, then you can maybe somehow convince them to open up their browser console, reproduce the bug, and send you a screenshot. If an exception happens in production and nobody sees the Chrome debug console, did it really happen?
You probably won't be so fortunate. Which may lead you to throw up your hands. “It works for me and I’m not going to fix this cause I don’t know how to debug your problem. And I’m not getting enough info to fix.”
Obviously we don’t want to rely on users sending angry messages and faulty screenshots to tell us what happened. We want to see what happened in real time. We want to see the stack trace and the problem in context. How can we be more proactive about this?
Well, there’s always the most obvious way to go about doing this, my absolute old time favorite:
If you’re not familiar, it’s essentially a global callback that throws a net over your entire application. I call it
window.onNOPE. It’s literally the worst way to capture an exception other than mind melding with your code so that it makes you faint every time it breaks.
The callback signature used to look something like this:
function(message, url, lineNumber)
It took three arguments, the first one being a message, the url of the file that caused the exception, and the lineNumber
The problem here is that none of these are an actual Error object. They’re just a message, url, and a number. In practice these three arguments end up looking like this:
message: "TypeError: Cannot read property 'foo' of undefined" url: "http://example.com/foo.js" lineNumber: 10
At a glance this looks really useful. We can drill into our code, look at line 10, and figure out what’s going on. The issue is that your code is probably minified. Line 1 is your entire application. You have a million characters on this one really (really) really long line.
Also, your JS Is likely hosted on another domain. A CDN or a subdomain or a cookie-less domain, something like that. If a script is on another domain, it’s subject to CORS (Cross Origin Resource Sharing). And if the page it’s on doesn’t have the proper CORS headers — and it probably doesn’t — then all
window.onerror is gonna tell you is "Script Error". So you have literally no idea what happened, just that something happened. It’s like if someone called the fire department to alert them there was a fire somewhere in a thousand mile radius and that they really need to get on that.
Over time browsers have expanded on
window.onerror to include more information:
function(message, url, lineNumber, columnNumber, error)
Now we have our Error object. This is nice, cause we can (try to) get a stack trace from this. For those unaware (though you are almost certainly already very much aware), a stack trace is just a record of the function calls up to this point within the current call stack. A series of function calls: you call a, a calls b, b calls c, and on down the call stack.
When you’re debugging something, if you don’t have a stack trace it’s going to be very difficult to fix. You can maybe see that the exception happened at this line, but you don’t know how you actually got there: you don’t know which function called that function; you don’t know the order of events that preceded it.
Let’s explore by opening up our favorite executable to throw a new error:
$ node > throw new Error('lol')
I’m sure you’ve seen something like this:
Error: lol at repl:1:7 at REPLServer.self.eval (repl.js.110:21) at repl.js:249:20 at REPLServer.self.eval (repl.js:122:7) at Interface.<anonymous> (repl.js:239:12) at Interface.EventEmitter.emit (events.js:95:17) at Interface._onLine (readline.js:202:10) at Interface._line (readline.js:531:8) at Interface._ttyWrite (readline.js:760:14) at ReadStream.onkeypress (readline.js:99:10)
This is what comes back when you arbitrarily throw and do not catch an exception. What do each of these things mean?
It gives us the caller, aka the function of the frame (Interface.EventEmitter.emit). The source the function was called in (events.js). The line number (95). And the column number (17).
We also now have this property off the error prototype “.stack”. A string. Probably the most useless thing you can get back from this.
If you want to start breaking this apart and extract the pieces out of it, you need to use a regular expression, something like:
Now get out there and break some code.