So, after months (years?) of development and testing, your application is ready for production.
The time is ripe for a new game: "break and seek".
This game requires 2 participants: Atilla and Sherlock.
In a test environment, Atilla will choose one component of the system to break (it can be anything: a DB instance to bring down, especially if you are using RAC, a JEE module to undeploy, an LDAP server to shut down, a file system to fill to 100%, an Application Server to shut down, a password to change, a network cable to unplug...).
Atilla will now tell Sherlock to fix it, without telling him what went down.
Sherlock will then verify that:
a) the system fault is detected by some monitoring tool (e.g. Nagios)
b) the error message returned to the end user is not some horrible garbage, but something meaningful and reassuring
c) the automated tests are able to capture the fault
d) everybody is aware of the criticality of the fault and its consequences on other systems and on use cases... this can lead to rethinking the fault tolerance/redundancy of the system
e) transactions are properly rolled back and the system is not left in an inconsistent state
f) the logs contain meaningful messages, not loads of repeated stack traces without any context information
g) the Operations manual contains instructions on how to fix the problem (location of start/stop scripts and other troubleshooting issues)
h) if the fault is not fixed within a given time, the system doesn't degrade further (e.g. some file system gets filled with error reports, or something along those lines, which would lead to a domino effect)
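Points (a) and (h) are the easiest ones to automate, so here is a minimal sketch in Python of the kind of probe Sherlock could run while Atilla is at work. It is only an illustration, not a replacement for a real monitoring tool: the function names, the 90% threshold and the assumption that each component can be reached over a plain TCP port are all mine.

```python
import shutil
import socket


def tcp_service_is_up(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def disk_usage_percent(path):
    """Return the percentage of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def run_checks(services, paths, max_disk_percent=90.0):
    """Return a list of readable problems; an empty list means all checks passed.

    `services` maps a (hypothetical) component name to a (host, port) pair,
    `paths` lists directories whose file systems should not fill up.
    """
    problems = []
    for name, (host, port) in services.items():
        if not tcp_service_is_up(host, port):
            problems.append(f"{name} is not reachable at {host}:{port}")
    for path in paths:
        used = disk_usage_percent(path)
        if used >= max_disk_percent:
            problems.append(f"{path} is {used:.0f}% full")
    return problems
```

You would call something like `run_checks({"ldap": ("ldap.test.local", 389)}, ["/var/log"])`, where the host name and the path are placeholders for your own test environment. If Atilla has struck and the list comes back empty, your monitoring has a hole.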
It's a beautiful game, and the day you go to production you will sleep better, knowing that you are ready to tackle all these accidents.
For us Italians, Atilla is synonymous with barbaric devastation. I was very surprised to learn, in Budapest, from my biking instructor, that in Hungary he is considered a national hero, and streets are dedicated to him... the same story as seen by Spartacus and by Caesar...
Wednesday, September 29, 2010