July 31, 2017

Pursuing Postgres DDL Replication

by Jeremy Finzel
in Advanced

Data at Enova

At Enova, we have a lot of data. I am part of our Database Administration team, and our primary job is to keep the data flowing to all the various groups at Enova that utilize it. Enova uses Postgres primarily for our transaction systems — our client-facing and internal applications, reporting, and analytics systems. We use Postgres for the following reasons:

It has a robust backend to support high production loads
It has an incredible development and support community
It is open source, completely free, and highly extensible

Logical Replication Challenges

We have many databases that require replicating data to other databases for various purposes. One of the most useful database technologies that is used to move data from point A to point B is called “logical replication”. Here are two example use cases for it:

Having a full and real-time copy of our production database that can be used for reporting or analytics purposes without impact to production
Pruning data on a production database when it is not needed by the application, while continually propagating the full history to a separate archive system

Postgres has several technologies that do this, but all of them have the same limitation because of the following distinction. In database jargon, there are two categories of SQL statements:

DDL – Data Definition Language – this defines database object structures. When we create a table or add a column to a table, we are doing DDL
DML – Data Manipulation Language – this modifies data within tables. When we INSERT,UPDATE, or DELETE, we are doing DML

For a number of reasons, DDL has to be handled separately. It is a common solution for migration procedures to require both of the following if you use logical replication:

Come up with some framework to distinguish DDL and DML language in migrations, and train developers to write SQL in this custom way
Have the DBA manually deploy the SQL in the correct order for all involved database clusters, manage locking contention, and add new tables to replication if necessary

There is no technology available for Postgres that allows us to do transparent DDL replication along with DML without addressing these two issues (there is BDR but it’s not an option for our environment).

Because of this, we decided to come up with our own solution to this challenge.

Our Solution

Our inspiration for this solution was very modest: normally the DBA has to execute the same SQL on all database clusters in replication. Can we capture that and propagate it in a reliable way? Thanks to some great recent Postgres features, we have some great potential from the following 2 things:

Event triggers can be created which fire on any DDL statement, giving us access to what type of command is involved and what table(s) are being modified
In the same transaction, we are able to access the SQL statement responsible for the schema change

Building on top of Pglogical as the replication technology, this is just what we have done with the project pgl_ddl_deploy. Here are some big advantages to the method we are using:

Any migration framework and app dev language can be used
No migration overhaul (separation of DDL and DML) is required
It is transactionally consistent and atomic. This framework sends the DDL command to the subscriber at the exact point as it occurred transactionally (even if mixed with DML) on the application side

Finally, this concept could be generalized to work with other replication technologies (including the new one coming in 10.0). That is because the heart of the framework is basically the same regardless of the replication technology.

We have open sourced this tool to give back to the Postgres community. We would love for folks to get involved with this project, so that others may benefit from this solution as well.

About

Data at Enova

Logical Replication Challenges

Our Solution