In my opinion, one of the best things that happened to Planet MySQL
lately, is Explain Extended, a blog by Alex Bolenok (also
known as Quassnoi on Stackoverflow).
I never had the pleasure of meeting Alex in person, but his
articles are always interesting and of high quality, and the SQL
wizardry he pulls off is downright inspiring. I really feel
humbled by the creativity of some of his solutions and his
apparent experience with multiple RDBMS products.
Alex' most recent post is about aggregation, and
finding a top 3 based on the aggregate:
In …
After stepping off of the GROUP_CONCAT()
solution
for calculating quantiles I figured it would be
nice to find a better way to calculate the median too.
Solution
I previously wrote on how to calculate the median using
GROUP_CONCAT()
, but I think that this is a better
way:
[Read more]
SELECT AVG(length) AS median -- take the average of left and right median
, MIN(length) AS left_median --
, MAX(length) AS right_median --
, @l AS left_median_position --
, @r AS right_median_position --
FROM (
SELECT @n, length -- @n is just here to …
Yesterday, I was on the freenode ##pentaho irc channel when
Andres
Chaves asked me how to calculate the Nth percentile in MySQL. He saw a
solution somewhere using subqueries, but wasn't too happy about
it.
A while ago I wrote about calulating the median in MySQL, and it turns
out the Nth percentile can be calculated using a
similar, single-pass approach, not relying on subqueries, UDFs,
or user-defined variables.
The percentile....
So, what is a percentile exactly? Here's what the wikipedia
says:
A percentile is the value of a variable below which a certain
percent of observations fall. So the 20th percentile is the value
(or score) below which 20 …