RCA: Springer Title Deletions, August 2021
Introduction
This document serves as a Root Cause Analysis (RCA) for the incident reported on August 3, 2021. Its goal is to share our findings, specify the root cause, describe how the event was handled, and outline the mitigation actions and preventive measures Ex Libris is taking to avoid similar cases in the future.
Event Timeline
Date | Activity
August 3, 2021 | Unintentional deletion of 2 Springer collections: 'SpringerNature Complete Journals' (3282 portfolios deleted) and 'SpringerLink Open Access eBooks' (840 portfolios deleted)
August 3, 2021 | The incident was reported on the Alma Listserv as well as via support cases
August 4, 2021 | Ex Libris communicated an update to the Alma Listserv, acknowledging the issue and its impact and suggesting initial instructions to minimize the impact
August 6, 2021 | Analysis was completed and the fix was ready for release
August 8, 2021 | The release was applied to the Community Zone, triggering the "Synchronize Changes from CZ" process
August 8, 2021 | Ex Libris communicated an update to the Alma Listserv, informing customers that the analysis was complete and the solution was in place, along with specific follow-up instructions per activation profile
Root Cause Analysis
Ex Libris investigated this event to determine its impact and root cause, with the following results: during the automated process that ingests providers' updated content, the Springer file was sent with a change in its structure. As a result, the mechanism that matches the file against CZ collection portfolios failed, causing the deletion of valid portfolios.
Findings
Ex Libris investigated this event and determined the following:
- The Springer update file was sent with a change in its structure, causing the match mechanism against the Alma CZ to fail. As a result, new corrupted Bib records were created, and the valid ones were deleted.
- Automated QA validation did not detect the issue because of a bug; therefore, the alert notification expected in such a scenario was not sent.
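To make the failure mode in the first finding concrete, the sketch below shows how a naive key-based sync can turn a renamed column into mass deletions. It is an illustration only and not the actual Alma ingest code: `plan_sync`, the `issn` and `print_issn` field names, and the sample identifiers are all hypothetical.

```python
def plan_sync(existing_keys, incoming_rows, key_field):
    """Naive sync plan: anything unmatched is created or deleted.

    Hypothetical helper for illustration; not the actual Alma ingest logic.
    """
    # Rows that lack the key field yield None, which matches nothing.
    incoming_keys = {row.get(key_field) for row in incoming_rows}
    to_create = incoming_keys - existing_keys
    to_delete = existing_keys - incoming_keys
    return to_create, to_delete


existing = {"1111-1111", "2222-2222", "3333-3333"}

# Assumed scenario: the provider renames the key column
# from "issn" to "print_issn" without notice.
changed_file = [
    {"print_issn": "1111-1111", "title": "Journal A"},
    {"print_issn": "2222-2222", "title": "Journal B"},
    {"print_issn": "3333-3333", "title": "Journal C"},
]

to_create, to_delete = plan_sync(existing, changed_file, key_field="issn")
print(to_create)              # {None}: a single unusable key
print(to_delete == existing)  # True: every valid entry is scheduled for deletion
```

Because no incoming row matches, the plan treats every existing entry as obsolete, which mirrors the pattern described above: corrupted new records created, valid ones deleted.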
Technical Action Items and Preventive Measures
Ex Libris has taken, or is taking, the following actions and preventive measures to avoid such an occurrence in the future:
- Enhance the safeguards that prevent actions from being performed outside of the routine procedure, to significantly reduce the risk of manual error
- The bug detected in the automated QA routine was fixed, tested, and applied (Done)
- A new automated validation routine that allows structure changes to be detected at an early stage will be added to the process (October)
- Overall review of all automatic QA routines to ensure they perform as expected (September)
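As a rough idea of what early detection of structure changes could look like, here is a minimal sketch with two independent checks: a header comparison before any matching runs, and a cap on the share of existing entries a single update may delete. This is not Ex Libris's validation routine. The column layout, the 10% threshold, and the function names `check_structure` and `deletions_allowed` are assumptions made for illustration.

```python
import csv
import io

# Assumed file layout and threshold, for illustration only.
EXPECTED_COLUMNS = ("title", "issn", "url")
MAX_DELETE_RATIO = 0.10


def check_structure(file_text, expected_columns=EXPECTED_COLUMNS):
    """Return a list of header problems (empty if the layout is as expected)."""
    reader = csv.reader(io.StringIO(file_text))
    header = next(reader, [])
    problems = []
    missing = [c for c in expected_columns if c not in header]
    unexpected = [c for c in header if c not in expected_columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    if unexpected:
        problems.append(f"unexpected columns: {unexpected}")
    return problems


def deletions_allowed(existing_count, delete_count, max_ratio=MAX_DELETE_RATIO):
    """Return False when an update would delete too large a share of entries."""
    if existing_count == 0:
        return delete_count == 0
    return delete_count / existing_count <= max_ratio


good = "title,issn,url\nJournal A,1111-1111,https://example.org/a\n"
changed = "title,print_issn,url\nJournal A,1111-1111,https://example.org/a\n"

print(check_structure(good))          # []
print(check_structure(changed))       # one "missing" and one "unexpected" problem
print(deletions_allowed(1000, 20))    # True
print(deletions_allowed(1000, 1000))  # False
```

The two checks are deliberately independent: if a structure change slips past the header comparison, an unusually large deletion count can still halt the update and raise an alert.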
Conclusion
Ex Libris treats this incident with high priority, applying evaluation, assessment, and mitigation processes and incorporating the lessons learned. We are determined to improve the level of CZ quality and the value it provides to our customers.