Tuesday, October 11, 2011

FAMOZ: Let's treat project failures like airplane crashes

Whenever an airliner crashes, two questions demand answers:
  • Who is responsible? -- so we can punish them, sue them or put them in jail (or gain advantage for ourselves)
  • How did this happen? -- so we can prevent accidents like this in the future. 
Oddly enough, if the investigation seeks to answer the first question, it becomes very difficult to answer the second. If people are afraid of punishment, they are reluctant to provide information which can and will be used against them. Investigations of airline incidents always focus on the second question, and aviation has enjoyed an excellent and improving safety record because of it.

The city of Zurich has "pulled the plug" on "ELUSA" (or FAMOZ, as it was originally known). This system to integrate the operations of four departments of the city's social services office was originally budgeted at 11 million CHF, but after several rounds of additional financing was expected to cost 29 million. Stopping the project will limit the costs to 26 million. Politicians are speaking of a disaster, citizens are expressing disgust, and the suppliers are saying 'we're being made into scapegoats.'

Is this case exceptional? According to the Tages-Anzeiger, 75% of the functionality is completed and operational, the costs are 235% of the original budget, and the project was restarted at least once. According to the criteria of the Standish Group's CHAOS report, this large project would be considered challenged: "completed and operational but over-budget, over the time estimate, and offers fewer features and functions than originally specified."

According to the CHAOS report, 61.5% of all large projects are challenged, only 9% are successful and the rest fail. 94% of all challenged or failed projects restart at least once. The average cost overrun is about 180% of the original budget. So this project is in many ways typical and, based on the numbers, falls into the "top 25%" of project failures.

Why did this project have to end in a disaster? 75% of the functionality is available. According to other work of the Standish Group, 64% of the functionality developed in such projects is seldom or never used. So if the project had implemented the 'right' 36% first, i.e. the functionality that is needed frequently or every day, it could have been stopped long ago, as a success, at a figure close to the original budget.
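The arithmetic behind this argument can be sketched in a few lines of Python. This is only a rough back-of-the-envelope check: the figures are the rounded ones quoted in this post, and the helper names (`cost_ratio`, `cost_of_share`) are made up for illustration. The big assumption is that cost scales linearly with the share of functionality built, which no real project guarantees.

```python
# Rounded figures quoted in the post, in millions of CHF.
ORIGINAL_BUDGET = 11.0
PROJECTED_COST = 29.0   # expected cost had the project continued to full scope
COST_AT_STOP = 26.0     # expected cost after stopping the project

# Share of functionality used frequently: 100% minus the 64% seldom or never used.
FREQUENTLY_USED_SHARE = 1.0 - 0.64


def cost_ratio(actual: float, budget: float) -> float:
    """Return the actual cost as a fraction of the original budget."""
    return actual / budget


def cost_of_share(total_cost: float, share: float) -> float:
    """Estimate the cost of building only `share` of the functionality.

    ASSUMPTION: cost is proportional to the share of functionality built.
    """
    return total_cost * share


if __name__ == "__main__":
    ratio = cost_ratio(COST_AT_STOP, ORIGINAL_BUDGET)
    core_cost = cost_of_share(PROJECTED_COST, FREQUENTLY_USED_SHARE)
    print(f"Cost at stop: {ratio:.0%} of the original budget")
    print(f"Estimated cost of the frequently used 36%: {core_cost:.1f} million CHF")
```

With the rounded inputs, the cost at stop comes out at about 236% of the budget, in line with the 235% reported, and the estimate for the frequently used 36% is about 10.4 million CHF, just under the original 11 million budget.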

Why did this crash have to happen? The first CHAOS report was published in 1995. This is not the first or even the biggest 'plane crash' of a Swiss government IT project. If this had been an airplane crash, the safety authorities would be crawling over the scene of the accident; politicians and the flying public would demand that the causes be identified and the

Why can't an IT failure be treated like a plane crash? Why are failures used for political gain rather than as a basis for learning? Why can't we have an IT Project Safety Board, which investigates "accidents" and makes recommendations to prevent similar incidents in the future?


srinivas C said...

Because no one is killed. A typical s/w project is just something that brings about some benefit, sometimes hard to discern. A plane crash and this cannot be treated with the same or even similar seriousness. However, I take your point: we do indeed need to be more serious about project failures.

Peter said...

So true, no one is killed! Although you might get a different impression when you listen to the politicians attacking each other, the civil servants involved, the IT supplier and the whole IT sector!

Of course this kind of bickering can be more about the politicians than the project itself. In this case, insiders report that the project was a success and the software has been in service for years. Apparently there were issues around price negotiations for the next release which were the primary source of heat. A small detail: again, according to my source, 75% of the "project costs" are internal costs of the city administration, not funds paid to the supplier. "Figures don't lie, but liars can figure" -- Mark Twain.

If I understand the history correctly, Norm Kerth created the prime directive after a fatal accident at a regatta where he was a race official. The concept of creating an environment of safety is now a core concept of the retrospective community.

So for me plane crashes are an extreme case - we look at the principles that work in the extreme case and see how they apply in the more day-to-day cases.