Introduction # In this blog post, we will discuss an example of a change to the Vitess query planner and how it enhances the optimization process. The new model focuses on making every step in the optimization pipeline a runnable plan. This approach offers several benefits, including simpler understanding and reasoning, ease of testing, and the ability to use arbitrary expressions in ordering, grouping, and aggregations. Vitess distributed query planner # VTGate is the proxy component of Vitess.
Friend and former colleague Roland Bouwman has written an excellent update on the GROUP BY implementation in MySQL.
MySQL’s implementation of GROUP BY has historically been quirky. Sometimes that quirkiness has been useful, but often it causes grief as SQL authors can make mistakes that are executed but don’t produce the results they want (or expect).
Simple example:
SELECT cat, COUNT(val) as cnt, othercol FROM tbl GROUP BY cat
The ‘cat‘ column is in the GROUP BY clause, the COUNT(val) is an aggregate, but the ‘othercol‘ column is … well… neither. What used to effectively happen is that the server would pick one othercol value from within each group. As I noted before, sometimes useful but often a pest as the server wouldn’t know if …
[Read more]