Where IT “goes wrong”

This is the second in a short series of three commentaries that look at when, where and why IT fails in its support for business operations and objectives.

In the last commentary we covered a short list of thirteen unfortunate incidents (the number was unplanned, but perhaps apposite). If you took nothing else from that depressing list, you should at least have understood that these incidents can affect organisations of all sizes and types, and can inflict untold tangible and intangible costs on customers, staff, and owners. All this happens despite the earnest efforts of technologists, risk managers, and business managers.

Where do these problems occur?

Consider a high-level model of the end-to-end business/IT processes:

• Business sets directions, expectations for business and IT performance levels, and the risk/reward criteria that are acceptable;
• Compliance managers, risk managers and risk controllers (auditors) apply external and internal constraints to the directions, plans and processes;
• IT works with business to translate the business directions and models into a codified architecture that can be used by IT for technical planning, then builds a prioritized programme of work – the IT project portfolio;
• Project managers augment and improve business and IT capabilities using structured project management methods;
• Business and IT managers monitor the health, performance, and relevance of the capabilities;
• IT operates infrastructure and applications in support of business – safely, securely, reliably, effectively and efficiently – with margins for inevitable errors and with well-practised recovery and continuity processes.

In this simplified model, all processes are continuous, and each interacts with all the others to create dynamic feedback loops. For example, the health of a technical capability may influence decisions about business direction, and should probably inform both risk management and IT planning decisions.

Now, think back to the list of unfortunate incidents. While on the surface it may appear that the overwhelming majority were project blunders, further inspection reveals that in fact the errors occurred throughout the business/IT cycle:

• The pharmaceutical company’s accounting and IT auditors should have been able to pick up IT discrepancies – whether they were caused by a project or “evolved” during day-to-day operations. Where? Risk management.
• The drug company’s business strategists and CIO were responsible for taking high risks in a fragile business environment. Where? Business/IT strategy.
• The imports system evolved into an IT implementation for the government agency – the agency’s clients (the importers) appear to have been consulted insincerely and their very real concerns ignored. Where? Business/IT planning and project management.
• Telco billing – perhaps poor requirements management but certainly poor vendor management. Where? Project management.
• The bank’s ATMs were taken out of service by early-release software from a very assertive vendor. Other banks chose to test the software more thoroughly and detected the bug. Where? IT architecture, project management, capability management, and operations.
• The government agency’s regulatory changes were applied at short notice to applications that were known to be old, poorly maintained, and fragile. Senior business executives had blocked funding requests for major upgrades over the previous ten years, but still insisted on very short lead-time changes. Where? Business strategy, business/IT planning, capability management.
• Stock exchange system 11 years late. Where? Probably in all areas, but the responsibility must lie with the Board of Directors for allowing this debacle to drag on for so long.
• Emergency services system lock-outs. Where? IT architecture, project management, capability management, IT operations, risk management.
• IT infrastructure upgrade. Where? This was an extended comedy of errors, with apparently little business leadership, little risk management, little process control, technology-led IT planning, unfettered, demand-driven requirements, unskilled negotiators, uncontrolled vendors, and no escape clauses.
• The airport software was revolutionary, but posed a high business risk in the circumstances. Where? The initial business direction does not appear to have been constrained by intelligent risk management.
• The welfare agency’s desktop computer failure was caused by an unintended release of unready infrastructure software. Where? This occurred in the “escrow” capability management zone that exists between the project and IT operations. It was a defect in the quality process (in an organisation that had purportedly achieved some level of quality certification).
• The computer crash that prevented benefit payments was caused by distribution of software onto a platform that had not been updated to the minimum platform requirements. Where? Project management, capability (configuration) management, and IT operations.
• The slow call centres were caused by the same problem – updated software that had not been fully tested on all the configurations that were in use across an agency with hundreds of branches and call centres.
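
For readers who like to see the spread in numbers, the "Where?" attributions above can be tallied. The sketch below is only an illustration: it takes a subset of the incidents whose areas are listed explicitly, the short incident keys are hypothetical labels invented for the sketch, and "operations" in the bank ATM item is assumed to mean IT operations.

```python
from collections import Counter

# "Where?" attributions copied from a subset of the incidents listed above.
# The incident keys are hypothetical short labels chosen for this sketch.
# Assumption: "operations" in the bank ATM item is read as "IT operations".
ATTRIBUTIONS = {
    "pharma_audit": ["risk management"],
    "telco_billing": ["project management"],
    "bank_atms": ["IT architecture", "project management",
                  "capability management", "IT operations"],
    "regulatory_changes": ["business strategy", "business/IT planning",
                           "capability management"],
    "emergency_services": ["IT architecture", "project management",
                           "capability management", "IT operations",
                           "risk management"],
    "benefit_payments": ["project management", "capability management",
                         "IT operations"],
}


def tally_areas(attributions):
    """Count how many incidents name each area."""
    counts = Counter()
    for areas in attributions.values():
        counts.update(set(areas))  # count each area once per incident
    return counts


if __name__ == "__main__":
    # Print the areas from most to least frequently named.
    for area, n in tally_areas(ATTRIBUTIONS).most_common():
        print(f"{area}: {n}")
```

Even on this partial sample, project management is only one of several areas that recur; capability management, IT operations, IT architecture and risk management are each named more than once.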

You should be getting the picture by now: the unfortunate incidents that are at face value IT problems may in fact have their origins in any of the six business/IT processes.

Next: Why IT goes wrong