derisk-sql: Remove unexpected risks from your SQL migrations

BY ROBERT RYSZEWSKI
Senior Software Engineer

Why did we build this?

We recently developed open-source derisk-sql to solve a common problem we have seen based on client requests over the years: preventing a class of easy-to-miss mistakes from entering SQL migration files. Along the way, we’ve found ways that a tool like this can be used to improve performance, optimize storage usage, and provide overall linting capabilities. derisk-sql provides all the expertise of a seasoned infrastructure engineering team – like the one you can partner with here at APrime – for free and as an open source tool.

Most notably, derisk-sql has been designed from the ground up with extensibility in mind. Allowing end-users to extend the tool’s core functionality was a key objective, and we treat it as a first-class concept. Whether you want to add functionality that needs to be kept private, you want to contribute directly in the derisk-sql repository, or you just want to experiment, we’ve made adding your own linting rules (aka analyzers) as easy as possible.

Quickstart

Getting started is as easy as running the following commands:

$ go install github.com/aprimetechnology/derisk-sql/...

# --migrations-dir can be set explicitly, and defaults to ‘./migrations’
$ derisk-sql check run

Examples

What does some sample output look like? Like so:

$ derisk-sql check run
[WARNING]: 20240726161159_drop-index-no-concurrently.sql:6:1: (IND-001) CREATE INDEX statement missing CONCURRENTLY option
[WARNING]: 20240726161159_drop-index-no-concurrently.sql:9:1: (IND-002) DROP INDEX statement missing CONCURRENTLY option
[FATAL]: 20240726161159_drop-index-no-concurrently.sql:4:1: (IND-003) CREATE INDEX CONCURRENTLY statement is happening within a transaction block! This is prohibited
[FATAL]: 20240726161159_drop-index-no-concurrently.sql:11:1: (IND-003) DROP INDEX CONCURRENTLY statement is happening within a transaction block! This is prohibited
[FATAL]: 20240801151448_hr_employee_table.sql:2:1: (NMC-001) New schema name 'human_resources1' does not meet naming requirement regex '^[a-zA-Z_]+$'
[FATAL]: 20240801151448_hr_employee_table.sql:14:1: (NMC-002) New table name 'employee_old_2' does not meet naming requirement regex '^[a-zA-Z_]+$'

Extending with your own functionality

To show you how to easily integrate derisk-sql with your own custom functionality, we provide an example of doing so with a simple bash script (no Golang needed, no complex libraries to add, no messiness: period).

A more detailed step-by-step guide on how to provide your own functionality can be found in the derisk-sql README.

How does it work?

The high level overview is that every single analyzer (one you write, derisk-sql’s, anyone else’s) is treated the exact same way: spawned as a subprocess, passed in a JSON schema to its standard input, and expected to produce a JSON schema to its standard output.

This freeing design is what powers the ease of extension by end users like yourself.

Where are things now?

derisk-sql currently primarily focuses on integrating with dbmate for migration management, postgresql as a database system, and GitHub as a version control system.

All currently implemented analyzers can be found here, and are generally around:

  • Creating and dropping indexes concurrently
    • This prevents acquiring write exclusive locks that would devastate performance on critical production tables
  • Ensuring transaction blocks are not being mixed with SQL statements that don’t support them
  • Enforcing naming conventions around schemas, tables, columns, etc

The results of running these analyzers can be configured to end up in a few different places: simply printed out, written to JSON report files, commented on a pull request, and requesting specific reviewers on GitHub for sufficiently serious SQL issues.
For convenience, we’ve also provided a GitHub workflow for continuous integration on pull requests.

What’s coming next?

Some exciting features we may consider in the future include:

  • Optimizing the ordering of columns when defining tables to minimize storage space
    • In some cases, inefficient ordering can unnecessarily lead to doubling the storage space used for a table
  • Configurable enforcement of style requirements and readability
  • Broader support for MySQL and other database types
  • Broader support for flyway and other migration management systems
  • Broader support for CircleCI and other Continuous Integration (CI) systems 

We are open to feature requests

We are actively soliciting feature requests and feedback.

If you want specific features, or to just share your thoughts, star the repo, and feel free to contact us at derisk-sql@aprime.io.

For closer collaboration

Do you have feature requests that you’d like to develop and maintain privately?

Or are you seeking to work even more closely with us and our team, on this project or any others?

For any and all of your needs, we are here for you: send us a message or schedule a call — we look forward to learning about your work and exploring how we can help!

Let Aprime help you overcome your challenges

and build your core technology

Are you ready to accelerate?