I was watching sbt test chew through a backend suite the other day, htop open in another pane, and something didn’t add up. The machine has eight cores. One core was pinned at 100%. The other seven sat at 2–3% for the entire five-minute run.
That discrepancy is the whole post. SBT’s defaults are conservative for good reasons, and on a small suite you’d never notice. On a big one, they leave most of your hardware on the table.
What SBT does by default
Out of the box, SBT runs your tests in the same JVM that runs SBT. Within a subproject, test suites execute sequentially. Across subprojects, there’s a little parallelism — SBT can build a few things at once, subject to its internal task scheduler — but it won’t fork a separate JVM for tests, and it won’t run two suites from the same subproject concurrently.
If you’ve only got a few hundred tests, this is a perfectly reasonable shape. Isolation is cheap when everything runs in one process. No classloader weirdness, no shared state across JVMs, predictable logs. You pay for it in wall time and nobody cares.
Once the suite grows — lots of subprojects, tens of thousands of tests, meaningful setup per suite — the single-JVM model becomes the long pole. Compile finishes, tests start, one core goes red, the rest go to sleep. You’re paying for an eight-core machine and using one.
The settings that change it
Six settings in build.sbt flip the default from “run tests in-process, serially” to “fork separate JVMs and run them concurrently, with a ceiling.”
Test / fork := true
Test / parallelExecution := true
Test / testForkedParallel := true
Test / logBuffered := false
javaOptions ++= Seq("-Xmx2G", "-Xss4M")
import scala.util.Try

Global / concurrentRestrictions := {
  val cores = java.lang.Runtime.getRuntime.availableProcessors
  val configuredMax = sys.props.get("test.maxForks").flatMap(s => Try(s.toInt).toOption).getOrElse(4)
  val maxForkedTestJVMs = math.max(1, math.min(configuredMax, cores / 2))
  Seq(
    Tags.limit(Tags.ForkedTestGroup, maxForkedTestJVMs),
    Tags.limitAll(cores)
  )
}

A walk through each piece, because every line is there for a reason.
Test / fork := true turns forking on in the first place — without it, tests run inside the SBT JVM and the per-fork javaOptions never apply. Test / parallelExecution := true then lets SBT run suites within a subproject concurrently. Without it, even with forking on, you’re still doing one suite at a time per subproject — just in a separate JVM.
Test / testForkedParallel := true parallelizes test execution inside each forked JVM. By default a forked JVM runs its tests sequentially even when parallelExecution is on, so without this you fork and then serialize anyway. How many forked JVMs run at once is a separate question, governed by test grouping and the ForkedTestGroup limit below.
Test / logBuffered := false matters because once you fork, output from N test JVMs starts to interleave. Buffering gives you clean logs in exchange for holding everything until the suite finishes, which is terrible when you’re staring at a CI run trying to figure out which fork hung. I’d rather have the interleave.
javaOptions ++= Seq("-Xmx2G", "-Xss4M") sets per-fork JVM sizing. This is the line people miss. Forked JVMs don’t inherit the parent SBT heap automatically. If you’ve given the parent SBT 8 GB and then fork four test JVMs without setting per-fork options, each fork starts with whatever the default is and may OOM on a suite that used to pass. Size the forks for what a single suite actually needs.
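If different CI runners need different fork sizes, the heap can be made overridable the same way the fork count is — a sketch, where TEST_FORK_HEAP is a hypothetical environment variable, not something the build above defines:

```scala
// build.sbt — per-fork JVM sizing, overridable per environment.
// TEST_FORK_HEAP is a hypothetical env var; the default matches -Xmx2G above.
javaOptions ++= Seq(
  "-Xmx" + sys.env.getOrElse("TEST_FORK_HEAP", "2G"),
  "-Xss4M"
)
```

Same escape-hatch idea as test.maxForks: the build carries a sane default, and a small runner can shrink it without an edit.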
Global / concurrentRestrictions is the ceiling. Tags.limit(Tags.ForkedTestGroup, maxForkedTestJVMs) caps how many forked test JVMs can be in flight at once. Tags.limitAll(cores) caps total concurrent SBT tasks at the core count, so compile and test don’t oversubscribe when running together. Without both, a fast machine will happily spawn more JVMs than it has cores for, and everything starts contending.
The cores / 2 heuristic is deliberate. The parent SBT JVM plus zinc are already consuming meaningful CPU. Running cores forks on top of that tends to thrash. Half gives the forks room without starving the parent.
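For concreteness, here is the ceiling arithmetic on its own — a standalone sketch of the same min/max logic, runnable outside the build:

```scala
object ForkCeiling {
  // min(configured max, half the cores), floored at one fork —
  // the same arithmetic the concurrentRestrictions block computes.
  def maxForks(cores: Int, configuredMax: Int = 4): Int =
    math.max(1, math.min(configuredMax, cores / 2))
}
```

On the eight-core machine from the intro this yields four forks; a single-core runner still gets one; a 32-core monster is capped by the configured maximum rather than spawning sixteen JVMs.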
The test.maxForks system property is the escape hatch. Start SBT with -Dtest.maxForks=2 and you get two forks instead of four, without editing build.sbt. Useful when a specific CI runner is small, or when you’re chasing a test that only flakes under higher parallelism.
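The property parsing is deliberately forgiving — the same Try-with-fallback pattern the restrictions block uses, extracted here as a sketch so the failure mode is visible:

```scala
import scala.util.Try

object MaxForksProp {
  // Same parse-with-fallback as the build's -Dtest.maxForks handling:
  // an absent or malformed value silently falls back to the default.
  def parse(raw: Option[String], default: Int = 4): Int =
    raw.flatMap(s => Try(s.toInt).toOption).getOrElse(default)
}
```

The quiet fallback is a tradeoff: -Dtest.maxForks=two won’t crash the build, but it also won’t do what you typed — it just gives you the default.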
Tuning
The per-fork heap is the setting most likely to bite you later.
-Xmx2G works for a suite that mostly touches in-memory structures. A suite that loads large fixtures, spins up test containers, or instruments bytecode for coverage will need more. If you start seeing OutOfMemoryError in forks that used to pass unforked, raise the fork heap before you start blaming the tests.
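One size rarely fits every subproject. A heavier subproject can carry its own fork sizing instead of forcing the whole build up — a sketch, where integrationTests is a hypothetical subproject name, not one from this build:

```scala
// build.sbt — hypothetical heavy subproject with a larger fork heap.
lazy val integrationTests = (project in file("integration-tests"))
  .settings(
    Test / fork := true,
    // Override the suite-wide -Xmx2G for this project's forks only.
    Test / javaOptions := Seq("-Xmx4G", "-Xss4M")
  )
```

Note the := rather than ++= — appending a second -Xmx would leave you at the mercy of which one the JVM honors.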
The other knob worth knowing about is what happens when one suite is much heavier than the others. SBT’s scheduler will happily pack small suites into forks while the big suite occupies its own. If the big suite is also the slowest, you’re gated on it — forking can only speed up your run as much as its longest single fork. Splitting the heaviest file or moving expensive setup into shared fixtures pays back at that point.
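By default a subproject’s tests all land in one forked group. If you want the scheduler to pack individual suites into forks — so the heavy suite occupies one JVM while small suites flow around it — test grouping is the knob. A sketch of the standard one-group-per-suite idiom, not something the build above does:

```scala
// build.sbt — one forked JVM per suite, so suites schedule independently.
Test / testGrouping := (Test / definedTests).value.map { suite =>
  Tests.Group(
    name = suite.name,
    tests = Seq(suite),
    runPolicy = Tests.SubProcess(ForkOptions())
  )
}
```

More forks means more startup overhead, so this only pays off when suite sizes are genuinely lopsided.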
logBuffered := false means you’ll occasionally see two suites’ output interleaved in a way that’s briefly confusing. The fix isn’t to buffer again — it’s to make sure your test names are specific enough that interleaved output still tells you which suite said what. Generic names like "works correctly" are a problem well before this point, but forking surfaces them harder.
What it actually did
On an eight-core machine, a suite of tens of thousands of tests across roughly 120 subprojects went from about 5m 19s to about 2m 29s. Roughly 2.1× faster, clock-on-the-wall, from changes that are all in build.sbt.
I reran the largest subproject four times in a row to check for flakes introduced by the new concurrency. All four passed clean. That’s not a guarantee, but it’s enough signal to ship the change and watch CI for a few days.
Tradeoffs
Nothing’s free.
Forked JVMs take time to start. If your suite is small enough that the in-process run finishes before the forks would even finish booting, forking will be slower, not faster. There’s a crossover point and it’s worth knowing where yours is.
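The crossover is just arithmetic: forking wins only when the time saved by splitting exceeds the startup tax. A toy model — my simplification, assuming suites split evenly across forks that launch concurrently with a fixed per-JVM startup cost:

```scala
object Crossover {
  // Toy model of total wall time, unforked vs split across n forks.
  // Forks launch concurrently, so startup is paid once, not n times.
  def unforked(suiteSeconds: Double): Double = suiteSeconds

  def forked(suiteSeconds: Double, forks: Int, startupSeconds: Double): Double =
    suiteSeconds / forks + startupSeconds

  def forkingWins(suiteSeconds: Double, forks: Int, startupSeconds: Double): Boolean =
    forked(suiteSeconds, forks, startupSeconds) < unforked(suiteSeconds)
}
```

A five-minute suite split four ways easily absorbs a few seconds of JVM startup; a six-second suite does not.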
Forked parallelism surfaces latent concurrency bugs. Tests that share global state — static mutable singletons, filesystem paths, fixed ports — will start flaking. That’s not forking’s fault; it’s the suite telling you the truth about itself. Fix the sharing, don’t turn parallelism back off.
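The usual fix for shared-resource flakes is to make the resource per-fork instead of fixed — ask the OS for it rather than hard-coding it. A sketch of the two most common cases, ports and scratch paths:

```scala
import java.net.ServerSocket
import java.nio.file.{Files, Path}

object PerForkResources {
  // Bind to port 0 and let the OS pick a free ephemeral port,
  // so two concurrent forks can never collide on the same port.
  def freePort(): Int = {
    val socket = new ServerSocket(0)
    try socket.getLocalPort
    finally socket.close()
  }

  // A unique scratch directory per use, instead of a fixed /tmp path
  // that every fork would fight over.
  def scratchDir(): Path =
    Files.createTempDirectory("test-fork-")
}
```

There’s a small race between closing the probe socket and the test binding the port, but in practice it’s a vast improvement over a hard-coded 8080.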
OOMs show up differently when forked. The parent SBT JVM stays fine while a single fork dies with a nonzero exit. The signal in the log is less obvious than a classic OutOfMemoryError in the main thread. If a test disappears and you can’t figure out why, check whether the fork exited nonzero before assuming the test code is wrong.
The point
SBT’s defaults are the right defaults for a small project. They’re not the right defaults for a large one, and they won’t change on their own. The machine you’re running on is already capable; you just have to opt in.
A dozen-odd lines of configuration. A five-minute test run cut to less than half. The best kind of change — no new dependencies, no new abstractions, just the tool doing what it could do all along.