The system was delivered in July 2013 at a cost of $46.6MM, plus additional maintenance contracts for $4.5MM. Over 1,000 bugs (no severity assigned) were reported shortly after delivery. As of February 2014, there were reports of 100 bugs in the system (no severity assigned). Project cost, as compiled from the SOW and its addendums, is shown in Table 1.
This is one of two Case Studies on recent government project failures that we were asked to review. I am providing this assessment as a public service. The other assessment is on Oregon’s Cover Oregon Health Insurance Exchange.
Table 1: BearingPoint/Deloitte SOW and Addendums

| Description | Date | Functionality Cost | Functionality Chg. Req. Cost | Maintenance Cost |
|---|---|---|---|---|
| Original Fixed Price | May-07 | $40,000,000 | | |
| Addendum 1 - delay only | Sep-07 | $40,000,000 | | |
| Addendum 2 - delay only | May-08 | $40,000,000 | | |
| Addendum 3 | May-09 | $40,594,880 | $594,880 | |
| Addendum 4 | Missing | | | |
| Addendum 5 | Aug-10 | $41,815,560 | $1,220,680 | |
| Addendum 6 | Dec-10 | $42,236,920 | $421,360 | $1,626,240 |
| | | | | $1,126,782 |
| Addendum 7 | Apr-11 | $42,425,710 | $188,790 | |
| Addendum 8 | Nov-11 | $46,647,939 | $256,375 | $1,656,900 |
| | | | $3,965,854 | |
| Addendum 9 | Feb-13 | $46,647,939 | | |
| Functionality Total: | | $46,647,939 | Maintenance Total: | $5,630,602 |
| Contract Total: | | $52,278,541 | | |
Analysis
As with many failures, a number of factors contributed. Some were out of everyone’s control, while others were due to a lack of proper controls or process. From our analysis of the project, three major factors were instrumental in the failure:
- The 2008 recession affecting the number of unemployment claims.
- Poor quality processes used by the vendor.
- Lack of large project experience by the State.
External Factors
The project started about seven months before the onset of the 2008 recession. Between January 2008 and December 2009, the unemployment rate in Massachusetts soared from about 4.5% to 8.4%, doubling the number of unemployed people from 150,000 to 300,000.1 This severely taxed the DUA and surely limited the time its staff, especially subject matter experts, could be assigned to the project.
Quality Processes
At delivery in July 2013, the DUA claimed quality was an issue. It appears that attention to and tracking of quality was not a focus for BearingPoint or, later, Deloitte, and this was endemic from the start of the project. The original May 2007 contract shows a clear sign of it: Appendix A, titled “Deliverable Acceptance Criteria (Example),” is a blank page. Quality did appear to be a major focus of Addendum 8 (signed November 2011), in which Deloitte and the DUA settled on a sampling plan for testing defects.
By definition, testing to a sampling plan accepts a given, determinable level of defects. For instance, one test called for 200 claims to be tested, 100% of which would need to pass. There is no mention of the test cases that were built, so it is assumed that the sampling was random and did not target edge cases. At approximately 250,000 claims per month (the number processed in February 2014), testing 200 claims with an acceptance rate of 100% implies a statistical failure rate of about 0.89%, or roughly 2,250 failed claims per month. Expectations outside the DUA and Deloitte were that the system would process all claims correctly, while the sampling plan was clearly designed to tolerate less.
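To make the arithmetic concrete, the sketch below (an illustration added here, not taken from the report or the contract) uses a simple binomial model to show the failure rates that a 200-claim, zero-defect sample cannot rule out. The confidence level behind the 0.89% figure above is not stated, so several plausible levels are shown.

```python
# Illustrative sketch (assumption: a simple binomial model of claim failures).
# If all n sampled claims pass, the largest true failure rate p still consistent
# with that result at confidence level c satisfies (1 - p)**n = 1 - c.

def implied_failure_rate(n_sampled: int, confidence: float) -> float:
    """Upper-bound failure rate consistent with zero failures in n_sampled tests."""
    return 1 - (1 - confidence) ** (1 / n_sampled)

MONTHLY_CLAIMS = 250_000   # approximate February 2014 claim volume
N_SAMPLED = 200            # claims tested under the sampling plan

for confidence in (0.80, 0.90, 0.95):
    p = implied_failure_rate(N_SAMPLED, confidence)
    print(f"{confidence:.0%} confidence: failure rate up to {p:.2%} "
          f"(~{p * MONTHLY_CLAIMS:,.0f} claims/month)")
```

At these confidence levels the bound ranges from roughly 0.8% to 1.5%, or on the order of 2,000 to 3,700 claims per month; the 0.89% figure cited above falls within that range.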
Although I am not in a position to validate the claim, it appears that many of the bugs stem from the system letting through issues it cannot reasonably handle. The implication is that exception processing in the system was poorly designed. In other words, some processes are too difficult to handle programmatically and need human intervention; these should drop out of the system in a controlled and logical manner for people to assess and process.
Inexperience With Large-Scale Integration
In conversations with the Senate Committee on Post-Audit and Oversight, it was divulged that the State has little experience with large technology projects. A general lack of understanding of how the deliverables were structured, of basic governance to monitor progress and identify issues in the statements of work, and the use of a waterfall approach all added to the failure on the State’s side.
As mentioned above, traces of quality issues were present from the beginning of the project and did not seem to raise concern. These should have been addressed early. Quality processes should have used industry-standard acceptance test procedures that exercise edge cases; sample testing should not have been used. Even as recently as February 2014, bugs were being reported without a severity classification. There is no way to tell whether the 100 bugs reported in February were a mix of Severity 1 through 4 or all Severity 1. The lack of classification indicates a poor understanding of testing and lax communication.
Figure 1: BearingPoint SOW Appendix A
Governance of the project was also deficient. Early in 2008, when the unemployment rate was on the rise, the DUA (or even BearingPoint) should have stopped the project instead of delaying it. It was evident that the DUA was going to be too busy to dedicate time to the project. Stopping may have caused funding issues, as money might not have carried forward, but under the circumstances the legislature could have made an exception.
Numerous other factors are also evident, but unconfirmed. For instance, the overview of the original SOW states that BearingPoint will “Re-engineer the UI [Unemployment Insurance] business model.” This is a major undertaking and, in my opinion, would require far more time and effort than originally bid.
Other Factors
There were unconfirmed statements that the use of uFACTS (Deloitte’s Unemployment Framework for Automated Claim and Tax Services product) was to mimic the Minnesota implementation. If this were the case and the State had agreed to it, it contradicts BearingPoint’s previously mentioned commitment to re-engineer the system, and one would have to ask why the cost would be $40MM.
There were also claims that test procedures were to be developed by the DUA. This again points to a lack of understanding of how test procedures are built. The tests must match the requirements as documented by the integrator (Deloitte) and be approved and run by the client (the State). Test procedures developed by an inexperienced client will surely not match the requirements, causing numerous false failures or leaving edge cases improperly tested.
Recommendations
The State should implement a project governance group outside of the legislative and executive branches. It should not be run by the state information technology group (i.e., not under the commonwealth chief information officers (CCIOs)), but rather by a group of technically knowledgeable agency experts who are aware of the intricacies of their departments. They should be matched in authority with outside experts experienced in managing large technical projects. Strict conflict-of-interest rules should be in place to ensure that no one benefits from decisions made during the governance process.
Alternate methodologies should be used to reduce the size of the project. Iterative approaches, like agile, that allow interim deliverables would be best. However, phasing projects to produce smaller deliverables that expose the solution’s failings may be adequate.
Smaller projects with interim deliverables will open the bidding process to other very capable vendors that cannot bid on huge monolithic projects. This would most likely allow more in-state companies to bid on and win contracts, keeping the money within the State. Using local resources could also decrease costs by reducing travel expenses.
Quality and testing procedures should be run by an independent third party that is not part of the system integrator’s team (in this case, Deloitte). This can be an outside party that works closely with the system integrator and the State, with its incentive being cost-effective testing that properly handles exceptions.
The challenge will be implementing these changes in a non-bureaucratic manner.
Constraints on Analysis
The analysis was completed by reviewing the initial contract and subsequent change orders. Two conversations were held with Senate Subcommittee members. No one from the project (State or Deloitte) was consulted prior to making this assessment and these recommendations.
Conflict of Interest Disclaimer
This was a non-paid engagement. Neither eCameron nor I have received any form of remuneration, although I have asked for a thank-you letter from Senator Creem’s office. The Senate Committee on Post-Audit and Oversight approached me asking for an interpretation of the material provided.