Skip to main content
  • Subscribe by RSS
  • Ex Libris Knowledge Center

    North America Data Center RCA March 5, 2018

     

    Confidential Information, Disclaimer and Trade Marks

    Introduction

    This document serves as a Root Cause Analysis for the North America  performance issues experienced by Ex Libris customers on  March 5, 2018.

    The goal of this document is to share our findings regarding the event, specify the root cause analysis, outline actions to be taken to solve the downtime event, as well as preventive measures Ex Libris is taking to avoid similar cases in future.

    Event Timeline

    Periodic service disruption were experienced by Ex Libris customers on the North America Data center during the following hours: 

    March 5th, 2018 from 12:25 PM until 12:43 PM US Central Time.

    During the event, all services were unavailable for the environment.

    Root Cause Analysis

    Ex Libris Engineers investigated this event to determine the root cause analysis with the following results:

    An incorrect network connection between the primary and redundant backbone switches, performed by mistake by our internet service provider (ISP), caused a network loop and service interruption for all inbound sessions. There was no data impact.

     

    Technical Action Items and Preventive Measures

    The Ex Libris ISP provider reported on the following corrective actions that were taken

    • Data Center managers discussed with field technicians the importance of carefully reading tickets to identify wiring diagrams and follow them.
    • The ISP provider’s written process was updated with a step to validate server ports before loading configurations on the backbone

    Customer Communication

    ExLibris is committed to providing customers with prompt and ongoing updates during Cloud events. Ongoing and prompt updates on service interruptions appear in the system status portal at this address: http://status.exlibrisgroup.com/

    • Was this article helpful?