Select top N records in each group with PrestoDB
1 min readMay 26, 2017
Close your eyes, imagine plenty of data, about daily country stats such as visits:
France | 10/10/2017 | events=visits
Italy | 10/10/2017 | events=visits
…..
And now, you want to display on a graph, a timeline with N top countries. Looks easy, isnit?
Now open your eyes, solution (should) be here ;)
So, what do we have here, 3 nested query, woaaaa:
- First query, simple, get all the data, grouped by day, month and country. Here you can add a HAVING clause if they are really lot of countries
- Then, let’s add a RANK column, here we use windowed function (https://prestodb.io/docs/current/functions/window.html). Windowed function are your best friends for SQL analytics. This will create a new column, that rank rows (1,2,3,4 etc…) according partition and order.
- The latest add the condition, we want only the top 10 data.