Advanced SQL in Rails - Part 2
Welcome back, dear readers! In part 1 of this post, we did a quick overview of SQL’s window functions and views. Now, we’re going to see how we can use those features from right within Rails.
Putting it all together
To demonstrate how we can spice up our Rails app with windows and views, let’s build a small Rails application called GifVotr. Think of it like Reddit for GIFs: you can post a GIF, and people can vote it up and down. The most popular GIFs will rise to the top.
We can start with some simple models and corresponding tables for GIFs and votes:
If we wanted to calculate the rank of each GIF, we could sum up the votes for each GIF and then sort by that:
score method will issue a
sum query every time it’s called, which will translate into a lot of additional queries as your dataset grows. Plus, we don’t just need to calculate the score - we also want to display the rank of each GIF - what’s the #1 most popular GIF, the #2 most popular, etc. Doing that in Ruby land will get expensive quickly.
Put a view on it
Instead, let’s make a view! We’ll start by loading up our database console using
rails db and figure out how we can construct an SQL query that gives us the data we need. Remember, we’re looking for the score of each GIF, as well as its rank relative to all the GIFs in the system.
This is a pretty complex query, so let’s break it down.
First, we need to calculate the score for each GIF. We can get this from the votes table with a basic aggregate function:
Next, we want to take this result set and calculate a rank for each row, based on the score. The highest score will get the highest ranking. To do this, we can take the query above and use it in the
FROM clause of our larger query, so that our calculations can operate on those results. We have to name this result set so that we can reference it in the rest of the query, so we’ll call it ‘total_votes’.
We can use the
rank() function to calculate the ranking for each row. Fun fact: you can’t use
rank() without a window function, which makes it a pretty great example!
We need to give
rank() a sorting by which it should rank our results, so we use a window function and instruct it to order the rows by score, descending. Unsurprisingly, we name this column ‘rank’.
Along with the ‘rank’ column, we also want to return the other columns from ‘total_votes’. While we’re at it, we should add an ‘id’ field so that our view feels like a real table:
Waiter, there’s a View in my Migration
Okay, we’ve got our query! Now, we can hook it up to Rails. Let’s make a migration that will create this view in the database for us. We wrap our handcrafted query in a
CREATE VIEW statement and call it ‘rankings’. We can use Active Record’s
execute function to pass our SQL right into the database.
Here’s where we diverge a bit from the Rails Way of things, so it’s important to point out a few things:
Since we’re passing raw SQL in this migration, we shouldn’t use the
changemethod, since Rails won’t know how to reverse the migration. Instead, we should use
downso that we can correctly drop the view if we need to roll it back.
When new migrations are run, Rails updates
db/schema.rbto reflect the current state of the database. Since we’re doing something custom now, the result of this migration will not be visible in the schema file. As a result, running
rails db:schema:loadwill no longer put our database in the right state. This means that anywhere you rely on initializing the database from the schema - continuous integration, for example - you should simply use
Anyway, now we’ve got a view in our database. Following the Rails convention of table/model naming, we can hook up a Ranking model to this ‘table’ of ours:
This looks like any other table-backed model, except for one small difference: we need to tell our model what its primary key is, since Rails can’t infer it from the schema.
Now that we’ve hooked up our view to a model, we’ve completely abstracted the fact that this is a complex SQL query, and we can proceed as if ‘rankings’ was actually a table.
Let’s fast-forward a bit and see how all three of these models could work together to produce a complex result set while hitting the database just once:
(One thing to note about our
order_by_rank scope: we’re explicitly ordering by
id in order to guarantee a consistent sort order, since our
rank() function will assign the same rank to GIFs with equal scores. Sorting by
id ensures that features like pagination will continue to work predictably. If you want to nerd out more on rank functions, check out this article.)
Back in ActiveRecord land, we can harness the power of scopes and associations to construct an efficient database query, with all the data we need, up front:
That’s quite the query, isn’t it? Bet you’re glad you didn’t have to worry about constructing that by hand!
When the size and complexity of your data begins to outgrow ActiveRecord, you don’t need to throw away all your Rails magic just to get the results you need. With the right workflow, higher-level SQL features can integrate nicely with any Rails app. Your data will thank you!
Are you going to try out views and window functions on your next project? Are you working with them in your Rails app already? Drop me a line in the comments!
Related: Advanced SQL in Rails - Part 1