When working with grouped data, you often need either the top N rows per group (e.g., highest scores per category) or a random sample per group. Here’s how to do both efficiently, especially in BigQuery.

You can check also how to get Random Sample per group in Pandas.

1. Selecting Top N Rows Per Group

To get the top N records for each group (based on a sort order), use window functions with ROW_NUMBER():

SELECT *
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY group_col ORDER BY sort_col DESC) AS rn
  FROM my_table
)
WHERE rn <= 3;

Explanation:

  • PARTITION BY groups the data by group_col.
  • ORDER BY defines “top” within each group.
  • rn <= 3 keeps the top 3 rows per group.

2. Taking a Random Sample Per Group

To randomly sample N rows per group in BigQuery:

SELECT *
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY group_col ORDER BY RAND()) AS rn
  FROM my_table
)
WHERE rn <= 5;

Explanation:

  • ORDER BY RAND() randomizes rows within each group.
  • The outer filter keeps the first N (here 5) random rows per group.

3. Example & Demo

You can test the SQL on this link: Get Random N results per group in SQL:

CREATE TABLE yourtable (
  year int,
  id varchar(20),
  rate decimal(10,2)
);

INSERT INTO yourtable VALUES
(2006,    'p01', 8),
(2003,    'p01', 7.4),
(2008,    'p01', 6.8),
(2001,    'p01', 5.9),
(2007,    'p01', 5.3),
(2009,    'p01', 4.4),
(2002,    'p01', 3.9),
(2004,    'p01', 3.5),
(2005,    'p01', 2.1),
(2000,    'p01', 0.8),
(2001,    'p02', 12.5),
(2004,    'p02', 12.4),
(2002,    'p02', 12.2),
(2003,    'p02', 10.3),
(2000,    'p02', 8.7),
(2006,    'p02', 4.6),
(2007,    'p02', 3.3);

SELECT yourtable.*
FROM   yourtable INNER JOIN (
  SELECT   id, GROUP_CONCAT(year ORDER BY rate DESC) grouped_year
  FROM     yourtable
  GROUP BY id) group_max
  ON yourtable.id = group_max.id
     AND FIND_IN_SET(year, grouped_year) <=3
ORDER BY
  yourtable.id, yourtable.year DESC;

The result will be 3 samples for each group:

  • p01
  • p02
year id rate
2008 p01 6.80
2006 p01 8.00
2003 p01 7.40
2004 p02 12.40
2002 p02 12.20
2001 p02 12.50

Resources