bounding brokenness

New version of ntfsutils

I’ve released a new version of the ntfsutils library for Python.

The small spin-off project from some work I did at Mozilla became far more popular than I expected, netting over 600 downloads last year. Pretty neat!

Pymake has been enabled on Win32 Firefox builders

[crossposting from; please continue the discussion there. Google Groups link.]

As of yesterday, Pymake has been enabled on all Firefox Win32 builders. This includes mozilla-central, mozilla-inbound, try, and all project branches (except elm). Clobber build times (this includes Try builds!) should go down by 45 minutes to an hour, resulting in faster turnaround for developers and lower load on the infrastructure.

Pymake has now become the official way to build Firefox 18 and above on Windows. If you’re on Windows and you haven’t moved to Pymake yet, you should: it’s really simple. I recommend setting up an alias to the in-tree Pymake in your MSYS profile.

GNU Make will still work for now, but we won’t have tier 1 continuous integration for it (at least for Firefox — Thunderbird’s still on GNU Make, but I’m hoping to switch those builds to Pymake soon). Win64 builds are also on GNU Make right now, but those aren’t tier 1.

A few notes:

  • If your project branch is seeing build failures on Windows:

    1. make sure you’re tracking an up-to-date mozilla-central: the last couple of fixes landed earlier this week and more fixes will be coming down the line
    2. try setting a clobber for Win32 builds on your branch.
    3. ask in #developers or #pymake — perhaps someone might be able to help you out
    4. if all else fails, file a bug with releng similar to this one.
  • While try builds are much faster now, we also lose the ability to build Firefox 15-17 on them. I spent a bit of time investigating workarounds but didn’t get anywhere, unfortunately. Update 3/9: Simply add this patch to your queue before pushing to try.

  • This also means that you need to push an up-to-date mozilla-central to try; otherwise you will see failures on Windows.

  • One out of every 20 builds or so is failing with a strange error in dom/bindings/test — this seems to be corruption caused by a race condition that isn’t being properly handled somewhere but honestly shouldn’t be happening in the first place. If you see an error in dom/bindings similar to the one in this log, please retrigger that build. The error’s being tracked in this bug.

I’d like to say thanks to everyone who helped push it over the finish line: ted, khuey, coop, gps, glandium, catlee, bsmedberg, bhearsum, and anyone else if I missed them (sorry!).

Civilising global state, take 2

A few months ago, I wrote a post about sane global state via parameters. Now Racket has parameters, but most other languages don’t. How important parameters are for a dynamic language was driven home to me while working on getting the Pymake build system working on our tinderboxes.

Pymake is a Python reimplementation of the venerable GNU Make, with two big advantages over it:

  1. Parallel GNU Make (at least the MSYS variant) is extremely buggy and prone to deadlocks on Windows, which means the only option is to run serially (-j1). Pymake doesn’t have any issues, so build times are much faster on machines with enough cores.
  2. The Mozilla build system uses recursive Make. Each recursive call spawns another Make process, and process spawning is rather expensive on Windows. Pymake performs all its recursion within the same process, which speeds build times on Windows even further.

Pymake also provides a few bonus features, one of which is the ability to run Python files “natively”, meaning within the same process. This helps speed up builds even more by avoiding spawning extra Python processes. Of course, such scripts have to be written with care to avoid trampling over Pymake itself.

This feature was enabled for, among others, a script that extends Python’s module search path. By default, Python only loads modules from the current directory and its system directories; the script lets us specify additional directories to load modules from. (Yes, virtualenv is a much better solution, and we’re going to switch to it soon.)

To do its work, the script must modify important global variables like sys.path and sys.argv. Since it doesn’t undo these modifications, it causes hard-to-debug problems down the line. To fix such issues, one would need to consider every eventuality: returns, exceptions, etc. Not only that, one might need to consider modifications by others as well.

As a result, we had to disable this feature for it, slowing down builds by a bit.

If those variables were parameters, one could simply wrap the modifications up in a parameterize and expect things to Just Work. Since everyone would be doing the same thing, things continue to Just Work no matter what is run.
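Python has no parameterize for sys.path and sys.argv, but a context manager gets part of the way there. Here is a minimal sketch of the idea. The names preserved_interpreter_state and run_script_in_process are made up for illustration, and this is not what Pymake actually does.

```python
import sys
from contextlib import contextmanager


@contextmanager
def preserved_interpreter_state():
    """Snapshot sys.argv and sys.path, then restore them on exit."""
    argv_obj, path_obj = sys.argv, sys.path
    saved_argv, saved_path = list(argv_obj), list(path_obj)
    try:
        yield
    finally:
        # Runs on normal return and on exceptions alike.
        # Restore the contents in place, then rebind the names in case
        # the wrapped code replaced the list objects themselves.
        argv_obj[:] = saved_argv
        path_obj[:] = saved_path
        sys.argv, sys.path = argv_obj, path_obj


def run_script_in_process(extra_dirs, argv):
    """Hypothetical stand-in for a script that edits global state."""
    sys.path[0:0] = extra_dirs   # prepend extra module directories
    sys.argv[:] = argv           # pretend to be launched with these args
    return list(sys.argv)
```

Unlike real parameters, this only helps if every caller remembers to wrap the call, which is exactly the discipline a language-level feature would make unnecessary.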

Build System on BugsAhoy

Josh Matthews’ amazing BugsAhoy now has support for the build system. All open bugs labelled [good first bug] and with a listed mentor in the Core::Build Config and MailNews Core::Build Config components are listed. If you’re interested in one of the bugs, post a comment there or find us in #pymake on

Assuming makes an ass out of everyone, or how Ruby’s build system sucks

At Mozilla, our build system has a firm rule we only grudgingly violate: explicit is better than implicit. What that means is that if we depend on a library foo and we don’t find it on the machine we’re building on, we fail instead of silently assuming the user doesn’t want to build in support for foo. If the user really wants that, she would need to pass in a --disable-foo configure flag saying so. This means we know exactly what we’re shipping as binaries, and users know exactly what to expect.
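The rule boils down to a small decision table. Here is a rough sketch in Python rather than actual configure code; resolve_feature and ConfigureError are hypothetical names, and the found flag stands in for whatever probe detects the library.

```python
class ConfigureError(Exception):
    """Raised when a dependency is missing and was not explicitly disabled."""


def resolve_feature(name, found, disabled):
    """Return True to build with the feature, False to build without it."""
    if disabled:
        # The user explicitly passed --disable-<name>: respect that.
        return False
    if not found:
        # Missing and not disabled: fail loudly instead of guessing.
        raise ConfigureError(
            f"{name} not found; install it or pass --disable-{name}"
        )
    return True
```

The only way to end up with a build that lacks foo is to ask for one.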

Once you spend a lot of time working with Mozilla code, you sometimes forget other projects don’t follow such obviously important rules. Case in point: Ruby. A default Ubuntu install builds Ruby out of the box. Of course, when you then try to do anything remotely useful:

% gem install heroku

Ruby fails with a cryptic no such file to load -- zlib (LoadError).

Turns out Ubuntu doesn’t come with the zlib1g-dev library, which means the Ruby build system assumes you don’t care about zlib support and happily builds without it.

Great, so you installed the library and built Ruby again, and gem actually worked. Now, you try to log in to Heroku:

% heroku login

… and Ruby fails with yet another no such file to load -- net/https error. At least the error message is slightly less cryptic this time, since it tells you to apt-get install libopenssl-ruby. Which means you need to install the library and rebuild Ruby a third time.

God knows how many more libraries the build system’s assumed I don’t care about.

no-www considered harmful

I remember reading about the no-www movement years ago. At that time it struck me as a cool thing to do, but I now know that no-www is a really bad idea in general. That’s because DNS lets you assign CNAME records to subdomains, including the www subdomain, but only A or AAAA records to root domains. In other words, you can say that a www subdomain is the same as some other hostname, but you can only say that a root domain is the same as a particular IP address.

What’s the benefit of this? Well, if you use a CNAME you can let the owner of the target hostname deal with anycast and routing issues, but if you use A records you have to deal with them yourself. If you do decide to deal with them yourself, you’ll be frustrated when your DNS updates take a while to propagate.

More generally, CNAME is an example of the most powerful law of computer science in action, and A isn’t.

“All problems in computer science can be solved by another level of indirection.” (David Wheeler)

See Heroku’s documentation on the matter.
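To make the indirection concrete, here is a toy resolver over an in-memory record table. Everything in it is made up for illustration: the hostnames are example domains, the addresses come from the documentation ranges, and real DNS resolution involves a great deal more.

```python
# Hypothetical zone data: www points at a hosting provider via CNAME,
# while the root domain can only carry a hard-coded A record.
RECORDS = {
    "www.example.com": ("CNAME", "app.hosting.example"),
    "app.hosting.example": ("A", "192.0.2.10"),
    "example.com": ("A", "192.0.2.10"),
}


def resolve(name, records, max_hops=8):
    """Follow CNAME records until an A record is reached."""
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return value
        name = value  # CNAME: look up the target name instead
    raise RuntimeError("CNAME chain too long")


# The hosting provider moves to a new address and updates its own record.
after_move = dict(RECORDS)
after_move["app.hosting.example"] = ("A", "203.0.113.7")
```

After the move, the www name follows the provider automatically, while the root domain keeps pointing at the old address until its owner updates the A record by hand.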

Switching to Octopress

I finally made the switch to Octopress today. I’ve managed to migrate all my Blogger posts to it, and set up HTTP 301 redirects so that people hitting my old posts and feeds aren’t lost.

I’m hosting my blog on Heroku, and the theme I’m using is the default Octopress theme (which is actually really nice) with a few tweaks. I’ve also played around with OpenType features, which only work in IE 10 on Windows 8, Firefox 15 and above on all platforms, and Chrome 16 and above on Windows and Linux. (Firefox 14 and below do support OpenType features, but with a different syntax I’m too lazy to support.)

Web fonts and Windows

Windows’ GDI ClearType has a really bad time dealing with most web fonts, including the ones I’m using for this site. DirectWrite does much better, but only works in IE 9 and above and in Firefox with hardware acceleration. Thus I made the call to disable web fonts for body text on Windows and use the similar-looking system font Constantia instead. I’ve left web fonts enabled for titles though.

Civilising global state

It seems to be universally agreed that global mutable state is considered harmful. It makes dependencies less clear, means that functions are harder to test, introduces races, has to be handled separately when exceptions occur, and so on.

Yet anyone who has programmed for a while knows that there are always situations where global state is useful and convenient. Perhaps adding it to every single function call’s signature is a pain; perhaps you need to set up a callback and need some data there but can’t create a closure for it; perhaps you’d like to look up the stack for security reasons; perhaps you simply don’t want to refactor 90% of your code. (You might think you’re avoiding the pitfalls of global state by using singletons or static variables, but you’d be wrong.)

Funnily enough, we’ve had a solution to these use cases since the dawn of programming languages: dynamic scope. Dynamically looking up the stack for variable bindings is thread-safe, exception-safe and remarkably less error-prone than global state. It has also quite rightfully been mostly abandoned in favour of lexical scope.

But what if you could get all the great properties of dynamic scope with lexically scoped variables? That is what Racket achieves with parameters. Racket parameters:
  • are explicitly declared as such, so you wouldn’t confuse them with regular variables
  • hold their value within blocks where they’re defined, called parameterize blocks
  • can be nested safely and work as you would expect them to
  • importantly for a Scheme dialect, don’t interfere with tail call optimization
  • are, by their very structure, thread-safe and exception-safe — it is always beautiful to see the semantics you want be a natural, obvious consequence of the syntax you’ve created
So the next time you’re tempted to use a global variable to solve a problem, don’t blame yourself; instead, blame the language you’re using for providing insufficient abstractions.
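
For a feel of what this looks like outside Racket, here is a rough Python analogue built on the standard contextvars module. The Parameter class, its parameterize method and the indent example are all hypothetical names for this sketch, and it does not reproduce every property of Racket parameters (tail calls, for one, are not a concern in Python).

```python
import contextvars
from contextlib import contextmanager


class Parameter:
    """An illustrative, simplified analogue of a Racket parameter."""

    def __init__(self, name, default):
        self._var = contextvars.ContextVar(name, default=default)

    def __call__(self):
        # Calling the parameter reads its current value, as in Racket.
        return self._var.get()

    @contextmanager
    def parameterize(self, value):
        token = self._var.set(value)
        try:
            yield
        finally:
            # reset() restores whatever value was in effect before, so
            # nested blocks unwind correctly even when exceptions occur.
            self._var.reset(token)


indent = Parameter("indent", 0)


def describe():
    """Reads the parameter without taking it as an argument."""
    return " " * indent() + "item"
```

A caller writes `with indent.parameterize(2): describe()` and the old value comes back when the block exits, however it exits. Each thread has its own context, so a value set in one thread does not leak into another.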

Transferring apps and data from one Android device to another

My Nexus S’s screen has been having some issues with touch lately, so I’ve decided to send it in for repairs. I’ve really come to depend on a smartphone, though, so I went out and bought another Android phone to tide me over until my Nexus S is fixed: an HTC One V.

Now my phone had a lot of data, none of which I wanted to lose. Google’s cloud-based backup service unfortunately doesn’t do much in my experience, and the total amount of data I had (over 2.5 GB) was pretty large anyway, so I decided it’d be best to use my computer to do a full backup and restore.

Full backups, restores and data transfers have always been a big issue with Android. Android 2.x requires you to root your phone and then use an application like Titanium Backup, which is honestly ridiculous. Android 4.0 is actually a lot better if you’re willing to use the command line a bit. adb for Android 4.0 has working “backup” and “restore” commands, no rooting required.

Here’s my setup: I wanted to transfer all my applications, settings and data from an encrypted Nexus S running Android 4.0 to an unencrypted HTC One V, also running Android 4.0. The instructions should work for any pair of Android 4.0 devices.

  1. I first made sure I had all the prerequisites. You essentially need most of an Android development environment on your computer.

    • Download the latest version of the Android SDK for your platform, then run the SDK Manager and make sure you have the latest version of the Android SDK Tools and Android SDK Platform Tools.
    • If you’re on Windows, you’ll also need the ADB USB drivers for both devices. The Nexus S drivers can be downloaded via the SDK Manager, while the HTC One V drivers are part of HTC Sync. (The HTC drivers are a pain to install: you need to force-install the “My HTC” drivers from C:\Program Files (x86)\HTC\HTC Driver\Driver Files\Win7_x64. Ridiculous.) For other OSes, see these instructions.
    • If you’re on Windows, let the HTC Sync installer install the prerequisites (including drivers), but do not install HTC Sync itself. It comes with its own, older adb that interacts badly with the SDK’s adb.
  2. I connected the Nexus S to my computer and turned on USB debugging on the phone. Then, from the platform-tools directory inside the SDK folder, I issued the following command:

    adb backup -all -apk -noshared -nosystem -f nexus-s-backup.ab

    See this guide for a complete list of options for the backup command.

  3. On my Nexus S, a prompt showed up asking me to enter my encryption password. I did so and moved ahead.

  4. I now connected the HTC phone to my computer and turned on USB debugging. There’s one more thing I needed to do: in Settings -> Developer Options, set the Desktop Backup Password to the encryption password from step 3.

  5. I then issued this command from the platform-tools directory:

    adb restore nexus-s-backup.ab

    Another prompt showed up asking me to type in my encryption password, even though this phone wasn’t actually encrypted. (Very confusing!) I found I had to type the encryption password from my Nexus S into both fields; this was the only way the restore worked.

  6. I finally copied the SD card’s contents manually. That’s about it. Most apps were restored properly, and whichever apps I downloaded from the Google Play Store showed up as installed there. A few didn’t make the cut: most notably Google’s 2-step Authenticator. I’m not sure why.
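
If you do this more than once, the two adb commands from steps 2 and 5 can be wrapped in a small script. This is only a sketch: it assumes adb is on your PATH, the transfer helper and its run and wait parameters are names I made up, and you still have to answer the on-device prompts yourself.

```python
import subprocess


def backup_command(backup_file):
    """The backup invocation from step 2, as an argument list."""
    return ["adb", "backup", "-all", "-apk", "-noshared", "-nosystem",
            "-f", backup_file]


def restore_command(backup_file):
    """The restore invocation from step 5, as an argument list."""
    return ["adb", "restore", backup_file]


def transfer(backup_file, run=subprocess.check_call, wait=input):
    """Back up the first device, pause for a device swap, then restore."""
    run(backup_command(backup_file))
    wait("Connect the second device, enable USB debugging, press Enter: ")
    run(restore_command(backup_file))
```

Calling `transfer("nexus-s-backup.ab")` runs the backup, waits for you to swap phones, then runs the restore. The run and wait arguments exist only so the flow can be exercised without a phone attached.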

More complex than necessary, but I’m happy I didn’t have to redownload much and didn’t lose any settings or game progress. I hope for everyone’s sake that future versions of Android are better at it.

OS X mouse strangeness

So I decided to upgrade my Apple notebook’s OS X install to Lion earlier this week. Due to an unfortunate series of events, partly my fault, mostly Apple’s, my MBR was corrupted and my Windows install became unbootable. So I was stuck with OS X for a few days until I had the time to fix my Windows install.

I’ve been running Windows and Linux with mouse acceleration turned off ever since I started using a high-precision mouse (it’s ugly, but really precise and comfortable to use), and I’ve come to appreciate it a lot. OS X’s acceleration curve is very different from Windows, which is fine, but the real problem is that there’s no built-in way to turn it off. Luckily, there’s a $0.99 app on the Mac App Store which lets you disable acceleration. That, and bumping up the resolution of the mouse a notch, made things better.

Things still felt strange, though, and once I fixed my Windows install and booted into it I realized why. OS X has severe input lag with the mouse. According to this blog post, the lag (which has been confirmed to exist by an Apple engineer) exists on all OS X installs, and on all mice and touchpads, and is 32 milliseconds. Now, 32 milliseconds might not sound like much, but I can easily tell the difference between 30 frames per second (33 ms/frame) and 60 FPS (17 ms/frame) while playing a game.
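
The frame-time numbers above are just 1000 divided by the frame rate. Here is a quick sketch of the arithmetic, with hypothetical helper names:

```python
def frame_time_ms(fps):
    """Duration of one frame in milliseconds at the given frame rate."""
    return 1000.0 / fps


def lag_in_frames(lag_ms, fps):
    """How many frames a given input lag spans at the given frame rate."""
    return lag_ms / frame_time_ms(fps)
```

By this measure, a 32 ms lag spans almost two full frames on a 60 Hz display.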

Windows and Linux do not have this problem: the input lag is shorter than one screen refresh interval, so the pointer is locked on to the mouse. Both feel significantly more responsive as a result.

If you’re using OS X now and are sensitive to such things, I suggest switching away until it’s fixed. You’ll be delighted you did.