ExLibris

    Upgrade Express "Install Customer Data" process hangs; hundreds of z39 processes generated; extreme server slowness

     

    • Product: Aleph
    • Product Version: 20, 21, 22, 23
    • Relevant for Installation Type: Dedicated-Direct, Direct, Local, Total Care

     

    Description

    Every upgrade_express_2201_2301 "Install Customer Data" run, as seen in the /exlibris/aleph/upgrade_express_2201_2301/logs/../install_utree.log files, hangs at the same place:

    Importing to xxx01 ...
     ...
     Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
     Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
     Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
     Processing object type SCHEMA_EXPORT/TABLE/TRIGGER
     Processing object type SCHEMA_EXPORT/TABLE/INDEX/DOMAIN_INDEX/INDEX
     Processing object type SCHEMA_EXPORT/POST_SCHEMA/PROCACT_SCHEMA
     Job "ALEPH_ADMIN"."SYS_IMPORT_SCHEMA_01" successfully completed at ...

    Comparing this to a successful run of install_utree for the a23_1 instance on this server, we see that the next step should be:
     Import (dp) ope01 ended

    The following statements immediately precede the "Import (dp) $lib ended" line in the ./util/install_utree_a proc:
     echo Importing to $touser ...
     csh -f util/oracle_impdp_current_lib_a_ue $lib
     echo "Import (dp) $lib ended"

    And in the ./util/oracle_impdp_current_lib_a_ue we see this:

    impdp $ALEPH_ADMIN SCHEMAS=$lib_uc DIRECTORY=${lib_uc}_DIR_UE DUMPFILE=${lib_lc}%U.dmp NOLOGFILE=Y PARALLEL=4

    foreach file(`ls $data_files/dpdir/$lib_lc*.dmp`)
     /bin/gzip $file
     end

    And in /exlibris/aleph/upgrade_express_2201_2301/data/a20_2/xxx01/files/dpdir we do, in fact, see this:

    -rwxrwxr-x 1 aleph exlibris 146296832 Sep 25 11:21 xxx0102.dmp*
    -rwxrwxr-x 1 aleph exlibris 352509952 Sep 25 11:21 xxx0101.dmp*
    -rwxrwxr-x 1 aleph exlibris 4527 Sep 25 11:21 expxxx01.log*
    -rw------- 1 aleph exlibris 45957120 Oct 3 03:21 xxx0101.dmp.gz

    Looking at the files in $LOGDIR, we see that 03:21 was when the server was rebooted.

    Thus, it seems that after the line reporting that job "ALEPH_ADMIN"."SYS_IMPORT_SCHEMA_01" had successfully completed was issued, the "/bin/gzip $file" started executing, and the first file it worked on was ./xxx0101.dmp.

    As a test to confirm that there is no problem with doing this gzip, I saved the existing partial file as xxx0101.dmp.gz.0321.partial and then ran "gzip xxx0101.dmp" to compress the dump file. Afterwards we see this:

    -rwxrwxr-x 1 aleph exlibris 74659783 Sep 25 11:21 xxx0101.dmp.gz*
    -rw------- 1 aleph exlibris 45957120 Oct 3 03:21 xxx0101.dmp.gz.0321.partial

    This took about five seconds. By contrast, at the time the server was rebooted, UE was still running this first "/bin/gzip $file"; it had been running for several hours and was only about 60% complete.
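The comparison above relies on file sizes to tell the interrupted archive from the complete one. A minimal sketch of a more direct check uses "gzip -t", which tests an archive's integrity without extracting it. The function name gz_is_complete and the sample file names are hypothetical, and the demonstration runs on throwaway files rather than the real dump files:

```shell
# Return 0 if the gzip archive passes its integrity test, non-zero otherwise.
# (gz_is_complete is a hypothetical helper name, not part of Upgrade Express.)
gz_is_complete() {
  gzip -t "$1" 2>/dev/null
}

# Demonstration on throwaway files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'sample dump data %s\n' 1 2 3 4 5 > "$tmpdir/sample.dmp"
gzip -c "$tmpdir/sample.dmp" > "$tmpdir/sample.dmp.gz"

# Simulate a gzip that was interrupted: keep only the first 20 bytes.
head -c 20 "$tmpdir/sample.dmp.gz" > "$tmpdir/sample.dmp.gz.partial"

for f in "$tmpdir/sample.dmp.gz" "$tmpdir/sample.dmp.gz.partial"; do
  if gz_is_complete "$f"; then
    echo "complete:  ${f##*/}"
  else
    echo "truncated: ${f##*/}"
  fi
done
```

Run against a file such as xxx0101.dmp.gz.0321.partial, a check of this kind would be expected to report the archive as truncated, since the gzip that wrote it never finished.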

    Also, hundreds of z39_server and z39_gate processes are generated when there should be just one or two. This causes the server to become *extremely* slow.
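One way to confirm a runaway like this is to count the matching processes. The following is a minimal sketch, assuming the processes show up in "ps" output under the names z39_server and z39_gate. The function name count_z39 is hypothetical, and the input lines are made-up stand-ins for real "ps" output:

```shell
# Count lines on stdin that mention z39_server or z39_gate.
# (count_z39 is a hypothetical helper name.)
count_z39() {
  awk '/z39_(server|gate)/ { n++ } END { print n + 0 }'
}

# Made-up sample standing in for the output of `ps -e -o args=`.
printf '%s\n' 'z39_server' 'z39_gate' 'z39_server' 'oracle' | count_z39
```

On a live server the same function could be fed with "ps -e -o args= | count_z39". A result far above the one or two processes expected would match the symptom described here.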

    Thus, it seems that when "ALEPH_ADMIN"."SYS_IMPORT_SCHEMA_01" successfully completed and the "/bin/gzip $file" was started, something else happened on the system, such as a loop in which z39_server and z39_gate processes were repeatedly started. (This makes no sense, but that does seem to be what is happening.)

    This Upgrade Express kit is upgrade_express_2101_2201.tar.1.07, exactly the same kit that we ran successfully at another dc04 site. Thus, I think the problem lies in the environment rather than in any problematic change to the Upgrade Express kit.

    I would note that this problem on aio0102 is occurring despite the increase in its memory. It seems that this loop which is generating the multiple z39_server and z39_gate processes just continues until it uses up whatever memory is available.


     

    Resolution

    It seems that this may have been caused by a disk-space problem. (At one point in the install_utree run, a message appeared saying that /exlibris was 100% full.)

        Reboot the server and make certain that the server is properly sized and that any large, unnecessary files are removed.
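Before rerunning the install, it may help to check how full /exlibris is. The following is a minimal sketch that parses the POSIX-format output of "df -P". The helper names fs_use_percent and check_space, the 95% threshold, and the sample "df" lines are all illustrative assumptions, not values from the Upgrade Express kit:

```shell
# Print the use percentage (digits only) from `df -P` output on stdin.
# (fs_use_percent and check_space are hypothetical helper names.)
fs_use_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when usage meets or exceeds a limit; 95 is an arbitrary example value.
check_space() {
  pct=$1
  limit=${2:-95}
  if [ "$pct" -ge "$limit" ]; then
    echo "WARNING: ${pct}% used"
  else
    echo "OK: ${pct}% used"
  fi
}

# Made-up sample standing in for the output of `df -P /exlibris`.
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sdb1 103081248 103081248 0 100% /exlibris'

check_space "$(printf '%s\n' "$sample" | fs_use_percent)"
```

On the server itself, the live value could be obtained with "df -P /exlibris | fs_use_percent".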

        (Ex Libris staff may view additional comments in the Internal Notes.) 

     

     


    • Article last edited: 04-Oct-2016