    campusM AP01 - RCA - August 13, 2019

    Introduction

    This document serves as a Root Cause Analysis for the campusM service interruption experienced by Ex Libris customers on August 13th, 2019.

    The goal of this document is to share our findings regarding the event, identify the root cause, outline the actions taken to resolve the downtime, and describe the preventive measures Ex Libris is taking to avoid similar cases in the future.

    Event Timeline

    Service interruption was experienced by Ex Libris customers served by the campusM AP01 instance at the APAC Data Centre during the following hours:

    August 13th, 2019 from 3:36 PM until 4:43 PM Singapore time
    August 13th, 2019 from 7:36 PM until 8:43 PM AET

    During the event, the service was unavailable in this environment.

    Root Cause Analysis

    Ex Libris engineers investigated this event to determine the root cause, with the following results:

    • Following a successful run on the Development and QA environments, a database script was executed on the production environment, where it failed. This failure caused the deletion of database files and required a full database restore.
    • The script execution did not follow the change management procedure and took place outside the permitted time slot.
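
    The two conditions named above (an approved change and a permitted time slot) can be checked before a script is allowed to touch production. The following is a minimal sketch under stated assumptions: the window hours, the function names, and the approval flag are hypothetical, since this document does not describe the actual Ex Libris change management procedure.

```python
from datetime import datetime, time, timedelta, timezone

# Singapore time is UTC+8 all year (no daylight saving).
SGT = timezone(timedelta(hours=8))

# Hypothetical permitted window, 01:00-05:00 Singapore time.
# This is an illustrative assumption, not Ex Libris policy.
WINDOW_START = time(1, 0)
WINDOW_END = time(5, 0)


def in_permitted_window(now: datetime) -> bool:
    """Return True if the timezone-aware `now` falls inside the window."""
    local = now.astimezone(SGT).time()
    return WINDOW_START <= local < WINDOW_END


def check_production_run(now: datetime, change_approved: bool) -> str:
    """Decide whether a production script may run (hypothetical guard)."""
    if not change_approved:
        return "blocked: no approved change request"
    if not in_permitted_window(now):
        return "blocked: outside permitted window"
    return "allowed"


# A run at the time the interruption began would be refused by this guard.
event_start = datetime(2019, 8, 13, 15, 36, tzinfo=SGT)
print(check_production_run(event_start, change_approved=True))
```

    Under the assumed window, a run at 3:36 PM Singapore time is rejected even when the change itself is approved, so the two checks are deliberately independent.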

     

    Technical Action Items and Preventive Measures

    • To provide the most immediate and stable resolution path, Ex Libris engineers restored the database from the most recent backup, taken 24 hours before the outage.
    • Database backups will be taken more frequently (every 4 hours).
    • The Ex Libris Cloud Teams completed refresher training on the change management procedures and on the permitted times for working on production environments.
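
    The effect of the shorter backup interval can be illustrated with simple arithmetic: the data at risk is whatever changed between the most recent backup and the failure, so it is bounded by the backup interval. The sketch below assumes a fixed schedule and a hypothetical anchor time of 3:40 PM on August 12, chosen only to be consistent with a backup roughly 24 hours old; the real backup schedule is not given in this document, and the failure time is assumed to equal the start of the interruption.

```python
from datetime import datetime, timedelta


def data_loss_window(failure: datetime, anchor: datetime, interval: timedelta) -> timedelta:
    """Time elapsed between the most recent scheduled backup and the failure.

    Backups are assumed to run at anchor, anchor + interval, anchor + 2*interval, ...
    """
    return (failure - anchor) % interval


# Assumed failure time: start of the interruption, Singapore local time.
failure = datetime(2019, 8, 13, 15, 36)
# Hypothetical schedule anchor (an assumption for illustration only).
anchor = datetime(2019, 8, 12, 15, 40)

daily = data_loss_window(failure, anchor, timedelta(hours=24))
four_hourly = data_loss_window(failure, anchor, timedelta(hours=4))

print(daily)        # 23:56:00
print(four_hourly)  # 3:56:00
```

    With the same assumed anchor, moving from a 24-hour to a 4-hour interval reduces the window of changes that a restore cannot recover from almost a full day to under four hours.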

    Customer Communication 

    In addition to the ongoing, real-time communications on the Ex Libris Status page, an initial summary of the event was sent to all affected customers on August 13, 2019.

    Ex Libris is committed to providing customers with prompt and ongoing updates during Cloud events. Updates on service interruptions appear in the system status portal at this address: http://status.exlibrisgroup.com/

    These updates are automatically sent as emails to registered customers.

    Publication History

    Date               Publication History
    August 19, 2019    Initial Publication