Please make it descriptive, graphic, and if anything burnt or
exploded I'd love to have pictures.
Include an approximate timeline of when things happened and when
it was all working again (if ever).
Thanks!
This somewhat relates to the earlier post "A SAN is a single
point-of-failure, too". Somehow people end up in scenarios where
highly virtualised environments with SANs get replication and
the like, but it all runs on the same hardware and SAN backend.
So if this admittedly very nice hardware fails (and it will!),
the degree of "we're stuffed" is particularly high. How heavily
the business processes rely on it is possibly a key factor
there, rather than purely technical issues.
Anyway, if you have good stories of (distributed?) SAN and VM
infra failure, please step up and tell all. It'll help prevent
similar issues for …
Showing entries 1 to 2
Mar 13, 2009
Sep 22, 2008
In reply to Arjen's post about Single points of failure:
Arjen, you are absolutely right. It doesn't matter how
over-engineered a storage solution is (I'm thinking of a giant
dual-headed NetApp with redundant everything). After you've
paid a few hundred K for that, you still have a single point of
failure. Is it a highly unlikely point of failure?
Sure, but it's still a point of failure.
Let's take it a step further. At Yahoo we're beyond thinking
about how to make a single node redundant (be it for storage,
networking, or even a simple webserver); we consider entire
datacenters to be single points of failure. What does that
mean?