Tuesday, July 19, 2011

SQL pagination with complex query

It is well known that SQL has an OFFSET parameter, so if you need pages, you could do something like the following

SELECT * from TABLE1 LIMIT 100 OFFSET 100

The problem is, this query is very inefficient. Since many databases today are big (believe it or not, I saw a 2008 blog post saying "since many databases today are small"), inefficiency of OFFSET, which results in fetching all rows before it, is unacceptable.

The next step is to go specific and use a unique index in the TABLE1 (hope you have it), so you can sort and offset on this index, which will be efficient. You can then do something like the following

SELECT * from TABLE1 where TABLE1.id > 2500 order by TABLE1.id LIMIT 100,

assuming that the last id value you saw was 2500.

What do you do when you do not have an index that you can use, but instead are doing some join with multiple indexes? For example,

SELECT * from TABLE1 JOIN TABLE2 ORDER BY TABLE1.ID1, TABLE2.ID2 LIMIT 100

The query above is actually already half the solution. You only need to add this condition

WHERE
TABLE1.ID1 > last_value_id1 OR
(TABLE1.ID1 = last_value_id1 AND TABLE1.ID2 > last_value_id2).

You can continue and do this with as many indexes as you would like. This question and the solution came up in the  case of 4 indexes, and it worked great and very efficiently.

Art: Leonardo Da Vinci - Crossbow Machine

No comments: