bounding brokenness

Long-lived JS objects without references considered harmful

I was having trouble getting the unit test for bug 466227 (make gloda index the headers of IMAP messages that aren’t offline) to work. The test involved the IMAP fake server which we have in the tree, and was failing consistently after fetching the headers of exactly six messages.

IMAP logging revealed that the connection was getting dropped right after the sixth message was completed, and that we were trying to reconnect to the fake server to execute the next command. Now, the fake server currently supports only one connection in its lifetime, so after getting the second connection attempt, it responded with an error message.

Clearly there was something unexpected that was causing the connection to drop, but we had no clue why. After a few false starts, I finally enabled logging on the socket transport service, and that revealed that nsSocketTransport::OnMsgInputClosed was getting called.

Setting a breakpoint there revealed nothing particularly significant (other than that an nsPipeInputStream was getting destructed) in the socket thread’s call stack, but something very interesting in the main thread’s stack:

We were in the middle of a JS garbage collection run – a forced one, no less. A DumpJSStack() on the main thread showed that this line in the gloda indexer was responsible. Sure enough, commenting out that line was enough to make the test pass.

bienvenu had the idea that one of the fake server’s internal objects was the one getting GCed, as we hold strong refs to any input or output streams that are given to the IMAP code.

After that, since I didn’t know of a way to uniquely identify the object getting GCed, it was a matter of brute force. Inside an nsPipe, the mStatus determines the status of a connection. So I set a breakpoint at nsPipe’s constructor, and then every time we hit the breakpoint, I ran a DumpJSStack(), noted down the JS caller along with the memory address on a sheet of paper, and set a data breakpoint for that pipe’s mStatus.* The third pipe was the one giving trouble, and that corresponded to this innocuous-looking line.

The fix was inordinately simple – just push the input stream into a long-lived array, thus making sure that we always hold a ref to it.

Huge thanks to bienvenu and asuth for all the help.

* Surely there’s a better way to do this (maybe using gdb instead)? i.e. run DumpJSStack() automatically, set a data breakpoint at mStatus, and log the DumpJSStack() output and the memory address somewhere?

(minor edit for clarity)