In my previous post I wrote about how I achived a >40% speedup on sysbench read-only using profile-guided optimisation (PGO). While this is a preliminary result, I though it was so interesting that it deserved early mention. The fact that any benchmark can be improved that much shows clearly that PGO is something worth looking into. Even if we will probably not improve all workloads by 40%, it seems highly likely that we can obtain significant gains also for many real workloads.
I had one or two interesting comments on the post that raise valid concerns, so I wanted to write a follow-up here, explaining some of the points in more details and going deeper into the performance counter measurements. As I wrote before, actual observations and measurements are crucial to fully understand performance of complex code on modern CPUs. Intuition and …
[Read more]