This month’s rollout of the health insurance exchanges was a textbook example of why government has a reputation for being unable to deliver an IT project correctly. At this point, however, it’s not even clear what the problem is, or whom you could trust to tell you.

Registration on the health insurance exchange websites was scheduled to go live on October 1, enabling people to get health insurance under the Affordable Care Act effective January 1. However, users ran into problems with most of the websites: 15 were created and operated by individual states and the District of Columbia, and the federal site, healthcare.gov, served the other 36 states.

“The federally run website was unresponsive for much of Tuesday morning, and only four of 15 exchange websites being run by states and the District of Columbia were working from 9:30 a.m. to 10:30 a.m.,” reported Bloomberg BusinessWeek.

BusinessWeek reminded readers that the health insurance exchange websites were hardly the first IT launch to crash and burn, pointing out that iPhone users couldn’t even copy and paste text for over a year after the phone went on sale, and that a similarly hotly awaited product launched the same week, Grand Theft Auto V, crashed its maker’s servers.

But pinning down what went wrong wasn’t so easy, though people had a lot of fun trying.

One school holds that the problem was primarily due to the massive number of (potentially inexperienced) users trying to log in at once. “In New York, officials said their exchange had 2.5 million visitors in its first half hour yesterday. California reported as many as 16,000 hits a second,” BusinessWeek reported. “And U.S. officials recorded 2.8 million visitors to the federal website, healthcare.gov, even as it fought technical problems much of the day.” By the second day, the federal government reported that 7 million people had used the system, and by the third day, 8.6 million — with as many as 250,000 on the system at the same time.

Detractors, however, point out that companies such as Amazon also handle big crowds on Black Friday and Cyber Monday (130 million Americans altogether on Cyber Monday in 2012, for example), though, to be fair, some of those companies end up having downtime as well.

The other school holds that the problem was due to methodology or design flaws from the get-go, though opinions vary as to the specifics. David Auerbach, a writer and software engineer, wrote in Slate that the user interface was fine, and that the problem was on the back end. Reuters quoted developers saying that the issue was due to the use of JavaScript, which was said to spawn a total of 92 processes for each new user. Developer Luke Chung wrote on his blog that the user interface was the problem, requiring people to enter security questions and so on just to register for an account. Insurers also reported receiving incorrect data.

People have also criticized the system for being inadequately tested, in its functionality, its scalability, or both, saying it was designed to support only 50,000 people at once. Others have suggested that the problem is the federal procurement process, contrasting the clunky exchange software with the software built for Barack Obama’s presidential campaign, though the exchange’s development, which used open-source software, had been praised earlier. Still others, like Avik Roy of Forbes, muttered darkly of sabotage.

Unfortunately, due to the politics behind the whole situation, it’s hard to ascertain exactly what the problem was. Certainly people invested in the success of the ACA are going to be quicker to cite evidence that the stunning popularity of the program was to blame, while people invested in its failure are going to look for any indication of mismanagement or errors with the basic program itself. And the fact that the state exchanges, as well as the federal one, had problems indicates that any cause goes beyond simply the way the federal system was designed — though after the first day, most of the state exchanges did appear to report fewer problems (other than Hawaii, which was scheduled to relaunch its system on October 15).

As we discussed previously, there are two schools of thought on IT failure. One is that, with enough testing and diligence, you can always produce working software. The other, “normal accident theory,” contends that as systems get more complex, their very complexity makes them more likely to fail (a theory also advanced in The Watchman’s Rattle). And it’s hard to imagine a situation more complex than developing software that could theoretically be used by everyone in the U.S., simultaneously.

The takeaway from this? Chances are, in any major IT project failure, it’s going to be difficult to pin down a single cause — and politics, corporate or otherwise, can make it even more difficult to determine the biggest contributing factors.
