One of the most common patterns I see in my consulting work is identifiers that are generated by MD5() or UUID(). Many times this is done in an application framework or something similar, not in software the client has written. From the application programmer’s point of view, it’s just an incredibly handy idiom: generate a unique value, use it, and you’re done.
Those values tend to appear in session identifiers, but that’s not the only place; I especially notice them in apps that use Java’s Hibernate interfaces, whether session IDs are involved or not. They propagate themselves all around the other tables, where they become secondary indexes and get combined with other columns to make even bigger keys.
What’s wrong with this? There are two major things that hurt performance in such cases: larger data and indexes, and non-sequential values. I’ll ignore the latter in this article, since whether an identifier is …
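To make the "larger data" point concrete, here is a rough sketch in Python rather than SQL. It compares the text form of an MD5 hash and a UUID with the raw 16 bytes each one actually encodes. The `storage_sizes` helper and the sample input are hypothetical names for illustration, and the real on-disk size will also depend on the column type and character set you choose, so treat the numbers as a lower bound on the difference rather than a measurement.

```python
import hashlib
import uuid


def storage_sizes(value: bytes) -> dict:
    """Compare text and raw-byte lengths of an MD5 hash and a random UUID.

    Hypothetical helper for illustration only; it measures string and
    byte lengths, not actual database storage.
    """
    digest = hashlib.md5(value)
    identifier = uuid.uuid4()
    return {
        # Hex text form, as you would get from a hex-encoded MD5 value
        "md5_hex_chars": len(digest.hexdigest()),
        # The same hash as raw bytes
        "md5_raw_bytes": len(digest.digest()),
        # Canonical text form with hyphens
        "uuid_text_chars": len(str(identifier)),
        # The same UUID as raw bytes
        "uuid_raw_bytes": len(identifier.bytes),
    }


sizes = storage_sizes(b"session-example")
for name, size in sizes.items():
    print(f"{name}: {size}")
```

The text forms are at least twice as long as the underlying 128-bit values, and that overhead is repeated in every secondary index and composite key that includes the identifier.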