Note: you can skip ahead to the node.js troubleshooting tl;dr.
One of the dangers of building a site around web scraping rather than, say, some sort of official API is that the site will change and break things in unexpected ways.
Well, I managed to build a web scraper around the weather.gov site a few months too early.
See, if archive.org is to be believed, the site hasn't seen a major update since 2002. I know that sure seems possible; weather.gov has looked the same ever since I've known about it.
I've actually wanted to build this severe weather site for a really long time, but was hesitant to build something that would (at least initially) be so reliant on screen scraping.
Finally, I guess I decided I was comfortable with spending time on something that I might have to rewrite in the future.
Well, if only I had waited a few months...
My Severe Weather is one of the things I built while I was on my node.js kick that started sometime in late March (I think anyway). I got the site to MVP status sometime in late April and here it is just 2 months and a few days later, and the site breaks because they launched the weather.gov redesign today (July 2nd).
But first, my ADD.
See, I have this problem with shiny things. It affects me in every way. I get bored with stuff. There's a reason why I never have a todo list in one system for very long. I change everything about it (from software to software, from computer to handwritten). That desire for change affects so much more than my choice of todo list.
In reality, I kinda hate it. Instead of just building cool sh*t, I can get hung up on the technology itself. I want to find THAT thing that makes me super excited about programming and that makes me super productive.
And of course, I can't go back to that OLD thing I was using. I can't be distracted. I only want to use my new favorite tool. So I end up abandoning projects.
And it works in reverse too. If I abandon that new tool to come back to an old favorite, then I abandon those projects. Nobody wins.
And My Severe Weather had become one of those abandoned projects. Which was okay though, right? Cause it still worked.
Until today.
I had already decided to rebuild it in ruby at some point, so initially I told the friends who knew about it that it would be down until the rebuild. A rebuild which, knowing me, would be a couple of months in the future.
But I decided to say eff it and see if I could fix it without too much hassle.
But this story isn't about fixing the scraper errors. No, it's about trying to track down an error that hid itself quite well.
My Severe Weather is currently hosted on nodejitsu. So after spending what was probably only an hour fixing the scraping errors, I deployed the app with a simple jitsu deploy.
Well, it should've been simple. First were errors related to (apparently) some functionality I had tried to integrate from another project, so it was relying on code that was on my hard drive but not part of the official project directory.
And the fun started after I fixed that.
Going to the website gave me:
Internal Server Error
That's it.
jitsu logs produced nothing of interest. develop.nodejitsu.com had no relevant log information.
WTF am I supposed to do with "Internal Server Error"? Well, google it of course. Good luck trying to find something relevant with that one.
So my first troubleshooting step ultimately became to try my luck with heroku. I figured maybe something had happened in the nodejitsu to joyent transition, so trying another provider couldn't hurt.
Effing finally, on heroku I got a traceable error. Or so I thought anyway.
(Express 500 error)
Error: /app/views/zipcode.jade:104
unexpected token "newline"
The confusing bit was that it was erroring out on a line of commented code. What the hell.
I still hadn't seen the error on my development machine though. At some point, I realized that the problem could be that something was a more recent version on nodejitsu or heroku than was on my own machine. So I did an npm update.
Boom. I've now got the error.
And thus began trying to figure out what was going on. First there was a weird-looking character (×) in the error's output, so I thought maybe there was some unicode (?) character or some invisible character wreaking havoc somewhere. I never found it (probably because it didn't exist).
And every time I'd try to remove offending code, I'd get the unexpected token "newline" on an earlier and earlier line.
So I figured it could be one of those fun bits where you forget some punctuation or something else, and it errors out way later than where your error actually occurred.
And thus began the fun, manual process of deleting large chunks of code and isolating the problem.
In the end, what initially showed as an error on line 104 was a freaking error on line 128.
#{riskName}:
That little bastard was sitting on a line by itself, and apparently in a previous version of jade, that was just fine. But now, it generates this unexpected token "newline", um, many many lines earlier than the actual problem. Makes it quite fun to find.
Here was the fix:
| #{riskName}: 
Suddenly my local machine worked, heroku worked, and nodejitsu worked.
So what could've been an experience in "remember how fun node.js was" only served to further cement my decision to start over and rewrite the site in ruby.
But not today.
Node.js troubleshooting tl;dr
Sadly, the only concrete advice I can give here is to try deploying your app to heroku, because at least you'll get some real errors there.
Also, nodejitsu may have installed newer versions of node libraries, so try an npm update on your local machine and see if you then get the same error locally.
Look for lines that start with an interpolated variable (like #{riskName}). That apparently used to be legal, but now you need to prefix such a line with a pipe (so | #{riskName}).
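If deleting chunks of a template by hand sounds as fun to you as it was for me, the hunt for those lines can be roughly automated. Here's a minimal sketch, not anything jade itself provides: findBareInterpolations is a made-up helper name, and it's just a line-by-line regex check, so it will also flag lines inside plain-text blocks where a leading #{...} might be perfectly legal.

```javascript
// Hypothetical helper (not part of jade): report template lines whose first
// non-whitespace characters are "#{", i.e. an interpolation with no leading pipe.
function findBareInterpolations(source) {
  const hits = [];
  source.split(/\r?\n/).forEach((text, index) => {
    if (/^\s*#\{/.test(text)) {
      // Line numbers are 1-based so they match what an editor shows.
      hits.push({ line: index + 1, text: text.trim() });
    }
  });
  return hits;
}

// Made-up three-line template: line 2 is the problem form, line 3 is the fixed form.
const sample = [
  'p Forecast',
  '  #{riskName}:',
  '  | #{riskName}:',
].join('\n');

console.log(findBareInterpolations(sample));
```

On the sample, only line 2 gets flagged. To check a real template, read it in with something like require('fs').readFileSync('views/zipcode.jade', 'utf8') and pass the result to the function (that path is a guess based on the heroku error above, so adjust it for your project).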