Skip to main content

SQL tips & tricks

A true SQL delicacy – max_by

This section describes handy SQL features and functions in BigQuery. This article covers the MAX_BY() function to aggregate data more easily.

Editor's note: this post is written by Tamas Ujhelyi—follow him for handy SQL tips on BigQuery and Snowflake. 

If you love SQL delicacies, you'll be a fan of this.

I've used it a lot recently in my job, and am I grateful it exists!

It's called: 𝑚𝑎𝑥_𝑏𝑦, a convenient function in BigQuery.

It helps with a common task: getting a column's value based on the maximum value of another column.

Think of getting users' most recent order IDs.

Or the last comments under a webshop's products.

Or the last event_name of an app user.

Normally, you could do that with a row_number() window function (or some other way).

But max_by isn't called a convenient function for nothing.

You can achieve the very same result by just inputting two column names:

𝑚𝑎𝑥_𝑏𝑦(𝑐𝑜𝑙𝑢𝑚𝑛_𝑦𝑜𝑢_𝑛𝑒𝑒𝑑, 𝑏𝑎𝑠𝑒𝑑_𝑜𝑛_𝑜𝑡ℎ𝑒𝑟_𝑐𝑜𝑙𝑢𝑚𝑛'𝑠_𝑚𝑎𝑥_𝑣𝑎𝑙𝑢𝑒)

So getting users' last orders would be:

𝑚𝑎𝑥_𝑏𝑦(𝑜𝑟𝑑𝑒𝑟_𝑖𝑑, 𝑜𝑟𝑑𝑒𝑟𝑒𝑑_𝑎𝑡)

(Of course, combine it with group by user_id.)

It's cases like this where I find the more I learn about SQL, the more it fascinates me.

Anyways, I hope you'll find it as useful as I have.

And have an awesome day, okay? ✌