    System extremely slow; "top" shows %wa in 10-25% range; high-CPU rpciod process


    • Product: Aleph
    • Product Version: 20, 21, 22, 23
    • Relevant for Installation Type: Dedicated-Direct, Direct, Local, Total Care

     

    Description    
    The system is extremely slow. "top" shows user CPU (%us) at only 2-5%, but I/O wait (%wa) is in the 10-25% range; it is normally in the 1-3% range. An "rpciod/16" process is consistently the highest-CPU process, at 27%. rpciod is a kernel thread related to NFS (Network File System).
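The %wa figure that "top" reports can be cross-checked from the kernel counters in /proc/stat. The following is a minimal sketch, assuming a Linux server where the first line of /proc/stat is the aggregate "cpu" line with fields in the usual order (user, nice, system, idle, iowait, irq, softirq, steal). The function names are illustrative and are not part of Aleph or of any Ex Libris tool.

```python
import time


def parse_cpu_line(line):
    """Return the numeric counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    if len(values) < 5:
        raise ValueError("line has no iowait field")
    return values


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' line samples."""
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    # Use only the first 8 counters; the guest fields are already included in user/nice.
    deltas = [y - x for x, y in zip(a[:8], b[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample_iowait(interval_seconds=5.0, stat_path="/proc/stat"):
    """Read /proc/stat twice, interval_seconds apart, and return the iowait percentage."""
    def read_cpu_line():
        with open(stat_path) as f:
            return f.readline()

    before = read_cpu_line()
    time.sleep(interval_seconds)
    after = read_cpu_line()
    return iowait_percent(before, after)
```

Calling sample_iowait() a few times during a slowdown and comparing the results with the normal 1-3% range gives a figure that does not depend on how the "top" display is configured.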

     

    Resolution: 

    A long-running SQL query triggered the problem (see Additional Info below). Restarting Aleph and Oracle corrected it.

    (Note: After the restart, the rpciod/16 process continued to exist and run, but its %CPU fell from 27% to 1%.)

     

    Additional Info
    A SQL query triggered the problem. Even a plain SELECT can put load on the UNDO tablespace: when a nested query runs, Oracle uses undo data to keep the data the query sees stable. The query was reading from the Z00P table, which is the largest table in the database and is continually updated by ue_21. The higher iowait percentages were therefore caused by Oracle tracking these changes to the Z00P table in the UNDO tablespace.

    The documentation also states that killing the SQL does not help much once the system is in this kind of trouble, because Oracle has to return everything to a stable point in time. According to the documentation, it takes Oracle longer to roll back the UNDO tablespace than it took to get to that point.

    The same query on the Test server ran for approximately 11 hours before the VPN connection was lost and the SQL was killed. The system then took 17 hours to get back to normal. During this query, iowait was around 5%, and this was on Test, where not much else is happening.

    So, while it is easy to say "don't run that query," this still points to an I/O problem on the system: I/O throughput to the disks that contain the Oracle .dbf files appears to be poor.
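One rough way to get a first impression of read throughput on the filesystem that holds the .dbf files is to time a sequential read of a large scratch file stored there. The following is a minimal sketch and not an Ex Libris or Oracle procedure: the function name, chunk size, and byte limit are assumptions chosen for illustration, and the result is inflated if the file is already in the operating system's page cache. Running it adds I/O load, so it is better pointed at a scratch file than at a live datafile.

```python
import time


def read_throughput_mb_s(path, chunk_size=1024 * 1024, max_bytes=256 * 1024 * 1024):
    """Sequentially read up to max_bytes from path; return (bytes_read, MB per second)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while total < max_bytes:
            chunk = f.read(min(chunk_size, max_bytes - total))
            if not chunk:  # end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return total, float("inf")
    return total, (total / (1024 * 1024)) / elapsed
```

Comparing the figure for the database filesystem with the same measurement on a local disk, using files of the same size, shows only whether the two differ greatly; a proper I/O benchmark is still needed for a real diagnosis.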

     

     


    • Article last edited: 28-Feb-2016
    © Copyright 2025 Ex Libris. All rights reserved.