Type-safe SQL in Scala with Doobie
We had a Scala service backed by MySQL where every model went through Doobie. Doobie itself is fine — it’s a thin functional wrapper over JDBC. The problem isn’t Doobie. The problem is what you build on top of it.
The raw pattern is sql"select id, user_id, amount from transactions" glued together with string interpolation. That works until someone runs SELECT * somewhere, or someone does an ALTER TABLE that shifts column positions, or someone forgets to add the team_id clause on a multi-tenant query. None of those break the compile. They break in prod, quietly, with the wrong data going into the wrong fields.
The abstraction layer in this post is what we landed on after enough of those near-misses. It’s not an ORM. It’s a thin set of traits and a query builder that makes the dangerous things harder to write than the safe things.
The migration that ruins your week
The pattern that started this whole exercise is the SELECT * problem. You have a table:
CREATE TABLE transactions (
id BIGINT PRIMARY KEY,
created_at TIMESTAMP,
updated_at TIMESTAMP,
amount DECIMAL(10,2)
);And a case class whose field order matches:
case class Transaction(
id: Long,
createdAt: DateTime,
updatedAt: DateTime,
amount: BigDecimal
)Now imagine the migration adds a column in the middle — ADD COLUMN user_id BIGINT AFTER id. SELECT * happily returns five columns in DB order; the case class still has four fields in its old order; Doobie’s Read[Transaction] maps positionally. The query compiles. The query runs. created_at ends up reading user_id, and your amount column is now whatever was in updated_at. Nothing throws. The data is just wrong.
You can argue that no one should ever ADD COLUMN ... AFTER. They will. You can argue no one should SELECT *. They will. The cheapest defense is making the column list explicit and making the explicit version the easy default.
The model trait
Every persisted entity extends a base trait. The trait does two useful things: it gives every model a uniform partitionKey for caching/logging, and it forces the timestamp fields to live at the type level so the framework can manage them.
abstract class BaseMySqlModel extends Product {
def partitionKey: String
def entityType: EntityType
}
abstract class IdBasedMySqlModel extends BaseMySqlModel {
def mySqlId: Long
def createdAt: DateTime
def updatedAt: DateTime
override def partitionKey: String = s"${entityType.modelStr}-$mySqlId"
}Extending Product matters — the diff machinery further down uses productElementNames and productIterator to compare old and new objects field-by-field. Case classes give us that for free.
Companion objects carry the schema
Each model has a companion object that owns the table name, the JSON codecs, and — crucially — columnOrdering. That sequence is the source of truth for which columns the SELECT clause emits and in which order:
abstract class BaseMySqlModelCompanionObject[T <: BaseMySqlModel, IdType](
val TableName: String
) {
implicit val reads: Reads[T]
implicit val writes: Writes[T]
def columnOrdering: Seq[F[_, _]]
def extractId(element: T): IdType
def setIdAndReturn(id: IdType, model: T): T
def preCreateInMemoryChanges(t: T): T
def preCheckForUpdates(oldObj: T, updatedObj: T): Unit
def fieldsToAddForUpdates: Seq[UpdatableField]
}F[A, B] is our typed column reference — A is the Scala type, B is the type as it appears in queries (often the same, but Option[T] columns differ). Each F knows the column name, the table it belongs to, and how to render itself in select vs. non-select positions.
A concrete model wires it together:
case class Transaction(
mySqlId: Long,
userId: Long,
createdAt: DateTime,
updatedAt: DateTime,
amount: BigDecimal,
entityType: EntityType = EntityType.Transaction
)
object Transaction extends IdBasedMySqlModelCompanionObject[Transaction]("transactions") {
val UserId: F[Long, Long] = F("user_id", TableName)
val Amount: F[BigDecimal, BigDecimal] = F("amount", TableName)
override val columnOrdering: Seq[F[_, _]] = Seq(
Id, UserId, CreatedAt, UpdatedAt, Amount
)
implicit val reads: Reads[Transaction] = Json.reads[Transaction]
implicit val writes: Writes[Transaction] = Json.writes[Transaction]
override def extractId(t: Transaction): Long = t.mySqlId
override def setIdAndReturn(id: Long, model: Transaction): Transaction =
model.copy(mySqlId = id)
}columnOrdering has to match the case class field order. That’s a constraint we test for explicitly (see below) — it’s the one piece of redundancy in the design and the test is what keeps it honest.
Safe and unsafe selection
The query builder exposes two select methods. One emits SELECT *, one emits an explicit column list derived from columnOrdering:
// Breaks when columns shift. Kept around for ad-hoc debugging.
queryBuilder
.where(Transaction.Id, Operator.Equal, transactionId)
.selectAllUnsafe[Transaction]
.runSingle
// What everyone uses.
queryBuilder
.where(Transaction.Id, Operator.Equal, transactionId)
.selectAllExplicit[Transaction](Transaction)
.runSingleThe implementation of selectAllExplicit is the whole point of the exercise — it’s a few lines:
def selectAllExplicit[A](
companionObject: BaseMySqlModelCompanionObject[_, _]
)(implicit a: Read[A]): SelectQueryExecutor[A] = {
val columnFragments = companionObject.columnOrdering.map(_.fNameForSelectClauses)
val fragment = fragments.comma(NonEmptyList.fromListUnsafe(columnFragments.toList))
queryExecutorCreator.createSelectQueryExecutor(getCompleteFragment(fragment).query[A])
}That’s it. It generates SELECT id, user_id, created_at, updated_at, amount FROM transactions instead of SELECT *, and the positional Read[Transaction] decoder lines up with both the case class and the column list because we control both.
The Unsafe name on the other method is deliberate. People will avoid the word.
Type-aware fragment interpolation
Doobie’s fr interpolator wants types with Put instances. When you’re building queries dynamically — a generic UPDATE generator, for instance — you don’t have the type at compile time, you have an Any. The escape hatch is a runtime dispatch that hands off to the right Put:
def interpolate(any: Any): Fragment = any match {
case str: String => fr"$str"
case long: Long => fr"$long"
case int: Int => fr"$int"
case bool: Boolean => fr"$bool"
case bigDecimal: BigDecimal => fr"$bigDecimal"
case Some(value) => interpolate(value)
case None => Fragment.const("NULL")
case value if DoobieGetsAndPuts.putMap.contains(value.getClass) =>
implicit val put: Meta[value.type] =
DoobieGetsAndPuts.putMap(value.getClass).asInstanceOf[Meta[value.type]]
toPutOps[value.type](value).fr
case _ =>
throw new IllegalArgumentException(s"Unsupported type: ${any.getClass}")
}This isn’t beautiful — runtime type dispatch on Any never is — but it lives in exactly one place. New domain types register their Meta in DoobieGetsAndPuts.putMap and become usable everywhere:
object DoobieGetsAndPuts {
implicit val dateTimeMeta: Meta[DateTime] =
Meta[Timestamp].timap(ts => new DateTime(ts.getTime))(dt => new Timestamp(dt.getMillis))
implicit val entityTypeMeta: Meta[EntityType] =
Meta[String].timap(EntityType.fromString)(_.modelStr)
implicit val jsonMeta: Meta[JsValue] =
Meta[String].timap(Json.parse)(Json.stringify)
val putMap: Map[Class[_], Meta[_]] = Map(
classOf[DateTime] -> dateTimeMeta,
classOf[EntityType] -> entityTypeMeta,
classOf[JsValue] -> jsonMeta
)
}Diff-based UPDATE
The other half of the framework is diffFragment. Given the old object and the new object, it walks the case class fields, finds the ones that changed, and emits column = newValue clauses for each. The fieldsToAddForUpdates set covers managed columns like updated_at = NOW() that should always be updated regardless of user changes:
def diffFragment(
oldObj: T,
newObj: T,
shouldUpdateDefaultFieldsForEmptyUserDefinedUpdates: Boolean
): Option[Fragment] = {
preCheckForUpdates(oldObj, newObj)
val defaultUpdateFragments = fieldsToAddForUpdates
val fieldsToIgnore = defaultUpdateFragments.map(_.columnName).toSet
val userDefinedUpdateFragments = newObj
.productElementNames
.zip(oldObj.productIterator)
.zip(newObj.productIterator)
.flatMap { case ((name, oldVal), newVal) =>
if (oldVal != newVal) {
val dbName = convertCamelToSnake(name)
if (fieldsToIgnore.contains(dbName)) None
else Some(Fragment.const(s"$dbName =") ++ interpolate(newVal))
} else None
}
.toList
if (userDefinedUpdateFragments.isEmpty &&
!shouldUpdateDefaultFieldsForEmptyUserDefinedUpdates) {
None
} else {
val allFragments = userDefinedUpdateFragments ++
defaultUpdateFragments.map(_.getFragment)
Some(fragments.set(NonEmptyList.fromListUnsafe(allFragments)))
}
}Returning Option[Fragment] is the small detail that paid off most often — when nothing changed, callers get None and can skip the round-trip entirely instead of issuing a no-op UPDATE.
The camelCase-to-snake_case translation is mechanical, with a small override map for the cases where the database disagrees with the convention — mySqlId maps to id, etc. The Opt suffix gets dropped so userIdOpt reads as user_id:
def convertCamelToSnake(name: String): String = {
val caseClassName = if (name.endsWith("Opt")) name.dropRight(3) else name
CaseFormat.LOWER_CAMEL.to(CaseFormat.LOWER_UNDERSCORE, caseClassName)
}
override def overriddenMap: Map[String, String] = Map(
"mySqlId" -> "id",
"jsonData" -> "data"
)Operators carry their arity
The where method is overloaded by operator kind. Each operator type knows how many arguments it takes, so IsNull can’t accidentally be called with a value and Equal can’t be called without one:
sealed trait BinaryOperator extends Operator {
def getFragment[T: Put](f: F[T, _], value: T): Fragment
}
sealed trait UnaryOperator extends Operator {
def getFragment[T](f: F[T, _]): Fragment
}
sealed trait ManyOperator extends Operator {
def getFragment[T: Put](f: F[T, _], value1: T, remaining: Iterable[T]): Fragment
}
sealed trait ThreeOperator extends Operator {
def getFragment[T: Put](f: F[T, _], low: T, high: T): Fragment
}Concrete operators go in their own case objects:
case object Equal extends BinaryOperator {
override def getFragment[T: Put](f: F[T, _], value: T): Fragment =
f.fNameForNonSelectClauses ++ fr"= $value"
}
case object IsNull extends UnaryOperator {
override def getFragment[T](f: F[T, _]): Fragment =
f.fNameForNonSelectClauses ++ Fragment.const("IS NULL")
}
case object In extends ManyOperator {
override def getFragment[T: Put](f: F[T, _], value1: T, remaining: Iterable[T]): Fragment =
f.fNameForNonSelectClauses ++ fr"IN (" ++
fragments.comma(NonEmptyList(fr"$value1", remaining.map(v => fr"$v").toList)) ++
Fragment.const(")")
}The ManyOperator signature insisting on value1 plus remaining: Iterable[T] is deliberate — IN () is a SQL error, so we make it impossible to construct an empty list at the type level.
In use, queries read close to the SQL they emit:
Transaction
.getDefaultQuery(queryContext, queryExecutor)
.leftJoin(
User.getBaseQuery(queryExecutor),
Transaction.UserId,
User.Id,
shouldSearchForeignTable = false
)
.where(Transaction.Amount, Operator.GreaterThan, BigDecimal(100))
.where(Transaction.CreatedAt, Operator.Between, startDate, endDate)
.orderBy(Transaction.CreatedAt, SortOrder.Desc)
.withPageNoAndPageSize(pageNo = 1, pageSize = 50)
.selectAllExplicit(Transaction, User)
.runListTenancy by default
The other class of bug we wanted out of the way is forgetting the team_id clause on a multi-tenant query. The fix is a query context that gets threaded through every call, plus a model trait that auto-injects the filter:
sealed trait MySqlQueryContext {
def teamIdOpt: Option[Long]
def shouldIncludeDeletedRecords: Boolean
}
case class TeamScopedQueryContext(
teamId: Long,
shouldIncludeDeletedRecords: Boolean = false
) extends MySqlQueryContext {
override def teamIdOpt: Option[Long] = Some(teamId)
}
abstract class TeamScopedMySqlModelCompanionObject[T <: TeamScopedMySqlModel](
override val TableName: String
) extends IdBasedMySqlModelCompanionObject[T](TableName) {
val TeamId: F[Long, Long] = F("team_id", TableName)
override def getDefaultQuery(
qc: MySqlQueryContext,
queryExecutorCreator: QueryExecutorCreator
): QueryBuilder = {
val queryBuilder = super.getDefaultQuery(qc, queryExecutorCreator)
qc.teamIdOpt match {
case Some(teamId) => queryBuilder.where(TeamId, Operator.Equal, teamId)
case None => queryBuilder
}
}
}The AllScopedQueryContext exists for admin tooling that genuinely needs to read across tenants — and grepping for it in code review is how we caught a couple of unintentional cross-tenant queries before they shipped. If teamIdOpt is always wrapped in a context type, the choice to skip the filter has to be visible.
Soft deletes get the same treatment — a separate trait that adds deleted_at IS NULL to the default query unless the context opts in:
override def getDefaultQuery(qc: MySqlQueryContext, queryExecutorCreator: QueryExecutorCreator): QueryBuilder = {
val queryBuilder = super.getDefaultQuery(qc, queryExecutorCreator)
if (qc.shouldIncludeDeletedRecords) queryBuilder
else queryBuilder.where(DeletedAtOpt, Operator.IsNull)
}The test that holds it together
The whole design hinges on columnOrdering matching the case class. We pin that with one test per model:
"Transaction companion object" should {
"have columnOrdering matching case class field order" in {
val expectedFields = Seq("mySqlId", "userId", "createdAt", "updatedAt", "amount")
val actualFields = Transaction.columnOrdering.map(_.columnName)
actualFields should contain theSameElementsInOrderAs expectedFields.map(
CaseFormat.LOWER_CAMEL.to(CaseFormat.LOWER_UNDERSCORE, _)
)
}
}It’s the kind of test that exists purely to hold up an invariant the compiler can’t see. When someone adds a field and forgets to update columnOrdering, the test fails before CI ever touches a database.
Migrations are still annoying
The migration discipline that this design encourages is: add columns at the end of the table, add the F[_, _] reference, append it to columnOrdering. If you absolutely must ADD COLUMN ... AFTER, update columnOrdering in the same commit. Case class field order doesn’t matter for correctness — only columnOrdering does — but it matters for readability, so we keep them aligned anyway.
The framework doesn’t make migrations safe. It makes them visible. The columnOrdering line shows up as a diff in code review next to the Flyway script, and that’s enough.
What it cost
The downside is the boilerplate. Every model has a companion object that’s mostly mechanical. The F references repeat the column name. The columnOrdering list repeats the field list. We considered a macro, decided it wasn’t worth the debugging tax, and lived with the repetition. The test catches the one thing that goes wrong.
The other tradeoff is that interpolate(any: Any) is a runtime dispatch — it’s the seam where the type-safety guarantee breaks down. We kept it small and central; new domain types get added to one map; everything else routes through Put instances at compile time. Not a perfect story, but a confined one.
If you’re starting a Scala service today, Doobie plus a thin wrapper of this shape is still the move I’d make. ORMs do too much. Raw sql"" interpolation does too little. The middle is small and lives mostly in your repo, where you can read it.