Solve Memory Leaks by Not Solving Memory Leaks

Memory leaks have been a fact of life in programming for as long as anyone can remember. C is the mac daddy of memory leaks – any non-trivial program has a memory leak – but even in garbage collected languages you’re not safe, as there are some types of resource that you have to remember to close (OS-native objects, like files and processes).

Or at least, that’s the theory.

I was trying to track down the cause of a memory leak in some Scala code I’d written, using the scala.sys.process APIs. I couldn’t understand it. I was fastidiously calling Process.destroy() on every process I’d created, and yet the monitor threads for the processes weren’t getting destroyed.

This was particularly insidious, as the names of these monitor threads are produced by a by-name variable, which were accidentally capturing a large amount of state, and leaking a huge amount of memory in the process.

I spent ages staring at the scala.sys.process source before I realised the problem. The problem is that I didn’t actually need to call Process.destroy()!

The scala.sys.process package hasn’t been ported to use Scala 2.10’s Future yet. I was using an async framework, so to work around this I created a custom java.io.OutputStream that calls a function on close(), and piped my process output into there, so my code looked like this:

import java.io.ByteArrayOutputStream
import scala.sys.process._

def onCloseStream(action: => Unit) = new ByteArrayOutputStream {
  override def close() = {
    super.close()
    action
  }
}

def spawnProcess(command: String) = {
  // lazy val, to allow us to refer to the val from within its own definition
  lazy val myProcess = (command #> onCloseStream {
    myProcess.destroy() // This line is the problem
  }).run()
}

To be fair, this wasn’t the entire code – I’ve removed the code that actually did useful stuff.

Now, the problem here is the line myProcess.destroy(). It’s not necessary, as myProcess is already in the process of being destroyed. Moreover, I think there’s a bug in the destruction code in scala.sys.process.PipedProcess (my best guess is that it’s not re-entrant, but I’ll need to dig further), so this leaves myProcess in an invalid state, and the thread never terminates.

Now, if there is a bug in PipedProcess, then it should probably be fixed. But more importantly, my code is incorrectly destroying a process that’s already on the way out.

And that’s the way it goes these days. Modern languages are good enough that you almost never need to think about resource allocation and de-allocation. This is doubly true in Scala, where good APIs are very hard to mis-use.

So, before you do something clever to prevent a common mistake, ask yourself “Has the language already solved this for me?”

Updated: The problem is indeed a bug in PipedProcess. The problem wasn’t quite what I expected. I thought the problem was re-entrancy – so calling destroy() on a dying process caused the issue. It turns out the same thing happens when you call destroy() on any process. I’ve submitted a bug and a patch against the Scala standard library.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s