    Alma APAC API Gateway RCA April 4, 2016

    1. Introduction
    2. Event Timeline
    3. Root Cause Analysis
    4. Technical Action Items and Preventive Measures
    5. Customer Communication

    Confidential Information, Disclaimer and Trade Marks

    Introduction

    This document serves as a Root Cause Analysis for the Alma API Gateway service interruption experienced by Ex Libris customers on April 4, 2016. Its goal is to share our findings regarding the event, specify the root cause, outline the actions taken to resolve the downtime, and describe the preventive measures Ex Libris is taking to avoid similar cases in the future.

    Event Timeline

    Ex Libris customers using the Alma API gateway experienced service interruptions during two windows (Singapore time):

    • April 4, 2016, from 07:30 AM until 10:30 AM
    • April 4, 2016, from 2:40 PM until 3:00 PM

    During the event, the Alma API gateway behaved inconsistently and was intermittently unavailable.
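
    As a rough illustration of the overall impact, the two windows above can be converted to UTC and summed. The sketch below assumes Singapore time is a fixed UTC+8 offset; the names SGT, OUTAGES, total_downtime and to_utc are illustrative and not part of any Ex Libris tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Singapore time modeled as a fixed UTC+8 offset.
SGT = timezone(timedelta(hours=8), "SGT")

# The two interruption windows reported in the timeline (start, end).
OUTAGES = [
    (datetime(2016, 4, 4, 7, 30, tzinfo=SGT), datetime(2016, 4, 4, 10, 30, tzinfo=SGT)),
    (datetime(2016, 4, 4, 14, 40, tzinfo=SGT), datetime(2016, 4, 4, 15, 0, tzinfo=SGT)),
]


def total_downtime(windows):
    """Sum the length of every (start, end) window."""
    return sum((end - start for start, end in windows), timedelta())


def to_utc(windows):
    """Return the same windows expressed in UTC."""
    return [(s.astimezone(timezone.utc), e.astimezone(timezone.utc)) for s, e in windows]


if __name__ == "__main__":
    for start, end in to_utc(OUTAGES):
        print(start.isoformat(), "->", end.isoformat())
    print("total:", total_downtime(OUTAGES))
```

    Under these assumptions the two windows add up to 3 hours 20 minutes, and the first window begins at 23:30 UTC on April 3.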

    Root Cause Analysis

    Ex Libris engineers investigated this event and determined the following root cause:

    During a planned activity in the Sunday maintenance window to improve the storage infrastructure of the API gateway in the APAC data center, a human error introduced a configuration error in the database synchronization process between the two redundant databases servicing the API gateway. This configuration error caused temporary database synchronization issues, which made the application unavailable at times.

    Technical Action Items and Preventive Measures

    Ex Libris has taken the following actions and preventive measures to avoid such an occurrence in the future:

    • To resolve the issue immediately, the identified problematic service was shut off, allowing the redundant application service to take effect and the application to become available.
    • Following the event, a manual synchronization was triggered, which caused the second, shorter outage. After the synchronization completed, the secondary application service became available again and full regular service was restored.
    • The configuration that caused the original event is being investigated to identify the cause of the fault.
    • Monitoring improvements are being investigated to allow faster and more accurate identification of problematic situations such as the one in this case.
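
    The report does not describe the monitoring under consideration. Purely as a hypothetical sketch of the kind of check that could flag a synchronization problem between two redundant databases, the example below compares the last applied change on each replica and reports when they drift apart. The ReplicaStatus type, the sequence-number field, and the max_lag threshold are all assumptions made for illustration, not Ex Libris's implementation.

```python
from dataclasses import dataclass


@dataclass
class ReplicaStatus:
    """Hypothetical health snapshot of one of the two redundant databases."""
    name: str
    last_applied_seq: int  # assumed: sequence number of the last applied change
    reachable: bool = True


def check_sync(primary, secondary, max_lag=100):
    """Return a list of human-readable problems; an empty list means healthy."""
    problems = []
    for replica in (primary, secondary):
        if not replica.reachable:
            problems.append(f"{replica.name} is unreachable")
    lag = abs(primary.last_applied_seq - secondary.last_applied_seq)
    if lag > max_lag:
        problems.append(f"replicas differ by {lag} entries (limit {max_lag})")
    return problems


if __name__ == "__main__":
    healthy = check_sync(ReplicaStatus("db-a", 1000), ReplicaStatus("db-b", 990))
    drifting = check_sync(ReplicaStatus("db-a", 1000), ReplicaStatus("db-b", 500))
    print(healthy)
    print(drifting)
```

    A check of this shape would run on a schedule and raise an alert whenever the returned list is non-empty; the threshold would need tuning to the real synchronization mechanism.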

    Customer Communication

    Ex Libris is committed to providing customers with prompt and ongoing updates during Cloud events. Updates on service interruptions are posted on the system status portal at: http://status.exlibrisgroup.com/
