Using Database Procedures

Searching for Opportunity:

One of the things the database team at Enova is always looking for is opportunities to optimize queries and reduce unnecessary redundancy in the way queries are written to fetch data from the database.  One of the tools we use to get a nice high-level picture of this data is PgBadger.  Using this tool, we found a query that seemed very suspicious.  We were able to determine that this query was:

  • Running some ~30 million times per day
  • Running for a combined 2-3 days of execution time when all of its runs were added together
  • Responsible for some 30% of the total run time of all queries in the entire application

What was especially suspicious was that we don’t have anywhere near 30 million page hits per day.  As database developers, we know that when the number of queries vastly outpaces the number of requests, there is a multiplying factor somewhere, and that usually means bad design.

Problem Identification:

When we looked more closely at this query in our legacy code, here are some things we discovered:

  • By tailing and sorting the production log output, we found hundreds of database calls for the same item_id next to each other, with nearly identical SQL
  • We found multiple queries that were identical, except for a different due_date
  • We found multiple queries that were 100% identical, meaning we were running the exact same query many times over

Legacy Application Logic:

One of the most common mistakes people make when writing SQL queries, and one that is easily masked by an ORM like Ruby on Rails’ ActiveRecord, is to loop through database data the way you would write a loop in most programming languages, fetching one record at a time with different conditions.  This throws away one of the great benefits of a database: the ability to get an entire set of data in a single call.

This was exactly the problem we saw here.  Instead of grabbing all due dates for the item at once and then sending the entire data set to the client, this code queried each due date separately.  Here was the logic being used:

Query #1: Get the list of due_dates from db, executed one time per item:

SELECT DISTINCT due_date
FROM payments
WHERE item_id = 100;

Query #2: Get account sums for a given due_date, executed 100s of times per item:

SELECT account
, SUM(amount) AS total_amount
FROM payments
WHERE item_id = 100
AND due_date = '2017-01-01'
GROUP BY account;

Query #3: Get account sums for payments with no due_date, executed one time per item:

SELECT account
, SUM(amount) AS total_amount
FROM payments
WHERE item_id = 100
AND due_date IS NULL
GROUP BY account;

Query #4: Get total account sums, executed one time per item:

SELECT account
, SUM(amount) AS total_amount
FROM payments
WHERE item_id = 100
GROUP BY account;

Optimization:

We replaced all of the above queries with one call to a PostgreSQL database function that returns all of the required data in a convenient format, so the application can still perform its accounting logic by iterating over the records the database returns.

CREATE OR REPLACE FUNCTION items.sum_accounts_due_on(p_item_id INTEGER, p_as_of_eff DATE)
RETURNS TABLE (sum_type TEXT, due_date DATE, accounts JSON) LANGUAGE SQL STABLE
AS
$function$
 -- All payments for the item, optionally limited to an effective date
 WITH item_txn AS (
   SELECT due_date
   , account
   , amount
   FROM payments
   WHERE item_id = p_item_id
   AND (p_as_of_eff IS NULL OR eff_date <= p_as_of_eff)
 )
 -- Account sums for payments that have no due_date
 , account_sum_without_due_date AS (
   SELECT NULL AS due_date
   , account
   , SUM(amount) AS sum
   FROM item_txn
   WHERE due_date IS NULL
   GROUP BY 1, 2
 )
 -- Account sums per due_date
 , account_sum_by_due_date AS (
   SELECT due_date
   , account
   , SUM(amount) AS sum
   FROM item_txn
   WHERE due_date IS NOT NULL
   GROUP BY 1, 2
 )
 -- Account sums across the entire item
 , account_sum_totals AS (
   SELECT NULL AS due_date
   , account
   , SUM(amount) AS sum
   FROM item_txn
   GROUP BY 1, 2
 )
 -- One row per due_date, with that due_date's account sums rolled up into a JSON object
 SELECT 'with_due_date' AS sum_type
 , due_date
 , FORMAT('{%s}'
   , string_agg(FORMAT('%s:%s', to_json(account), to_json(sum)), ',')
   )::json AS accounts
 FROM account_sum_by_due_date
 GROUP BY due_date
 UNION ALL
 -- One row for payments with no due_date
 SELECT 'without_due_date' AS sum_type
 , NULL
 , FORMAT('{%s}'
   , string_agg(FORMAT('%s:%s', to_json(account), to_json(sum)), ',')
   )::json AS accounts
 FROM account_sum_without_due_date
 GROUP BY due_date
 UNION ALL
 -- One row with the totals for the entire item
 SELECT 'item_total' AS sum_type
 , NULL
 , FORMAT('{%s}'
   , string_agg(FORMAT('%s:%s', to_json(account), to_json(sum)), ',')
   )::json AS accounts
 FROM account_sum_totals
 GROUP BY due_date
 ORDER BY due_date NULLS FIRST, sum_type;
$function$;
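
With the function in place, the application fetches everything it needs for an item in a single round trip.  The call below uses the same illustrative item_id of 100 as the legacy queries above, and passes NULL for p_as_of_eff to skip the effective-date filter:

-- One call replaces the hundreds of per-due_date queries
SELECT sum_type
, due_date
, accounts
FROM items.sum_accounts_due_on(100, NULL);

Each returned row carries its account sums as a JSON object in the accounts column (something of the shape {"cash":150.00,"fees":25.00}, where the account names here are purely illustrative), so the application no longer has to assemble that structure itself.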

By doing this we reduced the number of database calls by a factor of roughly 100.  The data also came back already structured, with the account sums built as JSON inside the database; previously the application built that JSON structure with loops in application code.  PgBadger showed that total query processing time across the database dropped by over a day.  This demonstrates how powerful database procedures can be and how much more performant an application can become by using them.  We also wrote the function in plain SQL rather than PL/pgSQL, which allows PostgreSQL to inline it into the calling query and makes it faster than a plpgsql version would be.
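
Because the function body is a single SELECT written in LANGUAGE SQL and marked STABLE, the planner can usually expand (inline) it into the calling query instead of treating it as a black box.  A quick way to sanity-check this, as a general PostgreSQL technique rather than anything specific to our code, is to look at the query plan: when the function has been inlined, the plan shows the nodes of the underlying SELECT rather than a single opaque Function Scan over the function.

EXPLAIN
SELECT *
FROM items.sum_accounts_due_on(100, NULL);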

Account sums method response times improved from ~200ms to ~20ms, as shown in the graph below, where the green line represents the old code’s response time and the yellow line represents the new code’s response time:

Testing before release:

When we made this change and asked our application team to release this code, they were worried that it might break our accounting logic and was too risky to release without specialized testing.  We wrote a test helper to call the old and new application logic and compare the results.  Since an item’s accounting information changes frequently, we wanted to test on live item data.  To do that, we got help from our Database Administration Team to test the new code safely on a replicated instance of our production database that was receiving the same traffic.
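
The helper itself lived in the application, but the same kind of comparison can also be sketched directly in SQL.  The query below is an illustrative spot check rather than the helper we actually used, and it assumes account is a text-like column: it unpacks the function’s item_total row for one item and diffs it against the legacy total-sums query (Query #4), returning any accounts whose totals disagree.

-- Any rows returned indicate a mismatch between the old and new logic for item 100
WITH new_totals AS (
  SELECT j.key AS account
  , j.value::numeric AS total_amount
  FROM items.sum_accounts_due_on(100, NULL) f
  , json_each_text(f.accounts) AS j
  WHERE f.sum_type = 'item_total'
)
, old_totals AS (
  SELECT account::text AS account
  , SUM(amount) AS total_amount
  FROM payments
  WHERE item_id = 100
  GROUP BY account
)
SELECT COALESCE(n.account, o.account) AS account
, n.total_amount AS new_total
, o.total_amount AS old_total
FROM new_totals n
FULL OUTER JOIN old_totals o ON o.account = n.account
WHERE n.total_amount IS DISTINCT FROM o.total_amount;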