Today we had an interesting situation where the same query was executed significantly slower when it was written with GROUP BY instead of DISTINCT and I saw many people still had the assumption that these two types of queries are actually equivalent which is simply not true. Although DISTINCT queries can be implemented using GROUP BY but not every GROUP BY query can be translated to DISTINCT. Depending on the brand and the optimizer the database server may actually use group by internally for the execution of distinct but that won’t make them equivalent. Let’s see why…
GROUP BY as the name suggest groups the result by some set of parameters and evaluate the whole result set. In most databases group by is implemented based on sorting and the same rules applies to it as well.
DISTINCT will make sure that the same row won’t be returned in the result set twice. Distinct doesn’t necessary …
[Read more]