Research Professional – RCA – March 3rd 2020
This document serves as a Root Cause Analysis for the Research Professional service interruption experienced by Ex Libris customers on March 3rd and March 4th, 2020.
The goal of this document is to share our findings regarding the event, identify the root cause, and outline the actions taken to resolve the downtime, as well as the preventive measures Ex Libris is taking to avoid similar cases in the future.
Service interruption was experienced by Research Professional users during the following hours:
From March 3rd 2020 10.15PM GMT until March 4th 2020 3.02PM GMT.
During the event, Shibboleth authentication for researchprofessional.com was unavailable (researchprofessional.com remained live and accessible via IP guest access or personal username and password).
Root Cause Analysis
Ex Libris Engineers investigated this event to determine the root cause, with the following results:
When attempting the scheduled update of Shibboleth metadata, metadata verification timed out, causing the Shibboleth process to shut down – this occurred on both our live and backup servers. The timeout was caused by a large logo file embedded in one Identity Provider’s metadata.
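To illustrate the kind of condition described above, the following is a minimal sketch of a pre-check that flags unusually large logos in a SAML metadata document before it is passed to verification. It is not the check Ex Libris implemented. It assumes the logo is carried as text inside an mdui:Logo element (for example as a data: URI), and the function name find_large_logos and the 50,000-character threshold are hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET

# Standard SAML metadata and MDUI extension namespaces.
MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"
MDUI_NS = "urn:oasis:names:tc:SAML:metadata:ui"


def find_large_logos(metadata_xml, max_chars=50_000):
    """Return (entityID, logo_length) pairs for logos longer than max_chars.

    max_chars is an illustrative threshold, not a Shibboleth setting.
    """
    root = ET.fromstring(metadata_xml)
    oversized = []
    # iter() also matches the root itself, so this handles both a single
    # EntityDescriptor and an EntitiesDescriptor aggregate.
    for entity in root.iter(f"{{{MD_NS}}}EntityDescriptor"):
        for logo in entity.iter(f"{{{MDUI_NS}}}Logo"):
            length = len((logo.text or "").strip())
            if length > max_chars:
                oversized.append((entity.get("entityID"), length))
    return oversized
```

A check along these lines could run against a downloaded metadata file ahead of the scheduled update, so that an oversized entry is reported rather than left to slow down verification.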
Technical Action Items and Preventive Measures
Ex Libris has taken the following actions and preventive measures to avoid such an occurrence in the future:
- Provisioned additional resources for Shibboleth processes.
- Changed the metadata update and verification configuration to prevent a recurrence on the Shibboleth version currently in use.
- Preparing to upgrade to the latest version of Shibboleth, which allows for more efficient consumption and verification of metadata.
Article last edited: 20-March-2020