
    Multiple p_cir_nn jobs (for different sublibraries) execute simultaneously

    • Article Type: General
    • Product: Aleph
    • Product Version: 20, 21, 22, 23

    Description:
    All six of our abc50 p_cir_12 jobs (each for a different sublibrary) have contained incorrect data (from other sublibraries) since Sunday, 10/25. For example, books requested for AA appeared in the BB output: the first two books are XE; the others are BB.

    Resolution:
    This problem occurred because two lib_batch processes were running in the abc50 library (as seen in util c/1). The first lib_batch process started the p_cir_12 job for sublibrary AA; the second started p_cir_12 for sublibrary BB. The two jobs were overwriting each other's work files in the abc50 $data_scratch directory.
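
    As a rough cross-check outside util c/1, a process listing can be filtered for lib_batch entries belonging to one library. The sketch below is only an illustration: count_lib_batch is a hypothetical helper name, and it assumes that each lib_batch line in the ps output contains both the string "lib_batch" and the library code. Adjust the patterns to what ps actually shows on your server.

```shell
#!/bin/sh
# Count lines in a process listing (read from stdin) that mention both
# "lib_batch" and the given library code (matched case-insensitively).
# The line format is an assumption, not something taken from Aleph itself.
count_lib_batch() {
    lib="$1"
    grep 'lib_batch' | grep -i -c -- "$lib"
}

# Example use against the live process table:
#   ps -ef | grep -v grep | count_lib_batch abc50
```

    A count greater than 1 for the same library would suggest the situation described above.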

    The multiple lib_batch processes arose because the que_batch_lock file had been deleted by util x/3. (que_batch_lock prevents a second lib_batch process from being started.)
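
    The general lock-file pattern can be sketched as follows. This is not the Aleph code: start_if_unlocked is a hypothetical function, and writing the process ID into the lock is an assumption made only for the example. It shows why deleting the lock lets a second start succeed.

```shell
#!/bin/sh
# Illustrative lock-file guard, NOT the Aleph implementation.
# start_if_unlocked DIR:
#   - if DIR/que_batch_lock exists, prints "already running" and returns 1
#   - otherwise creates the lock and prints "started"
start_if_unlocked() {
    lock="$1/que_batch_lock"
    if [ -e "$lock" ]; then
        echo "already running"
        return 1
    fi
    echo "$$" > "$lock"   # lock contents (a process ID) are an assumption
    echo "started"
}
```

    With the lock in place, a second call is refused. Once the lock file is removed, the same call reports "started" again even though the first process may still be running, which is the failure mode described in this article.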

    We see this in the abc50 $data_files:

      -rw-r--r-- 1 aleph aleph 1322748 Oct 26 13:26 que_batch.old
      -rw-r--r-- 1 aleph aleph       9 Oct 26 13:27 que_batch_lock


    So the que_batch.old file and the que_batch_lock file were created within a minute of each other.

    The $aleph_proc/org_que proc has this line:

    mv $data_files/que_batch $data_files/que_batch.old

    and org_que is called by start_library_batch:

    source $aleph_proc/org_que

    We see that the util_c_02 procedure ("Start Library Batch Queue") executes start_library_batch, but that the unlock_library proc *also* calls start_library_batch.

    The chronology:

    Aug. 18 abc50 lib_batch started (with que_batch_lock file dated Aug. 18)
    Oct. 22 abc50 que_batch_lock deleted (by util x/3)
    Oct. 23 abc50_util_a_13_b.56001 does "unlock_library" which (since it finds no que_batch_lock):
      (a) writes que_batch to que_batch.old and creates a que_batch_lock file
      (b) starts a 2nd abc50 lib_batch process
    Oct. 24 00:46 the first "file locked" error occurs in abc50: abc50_p_cir_10.56029.court_bb
    Oct. 26 you kill both lib_batch processes and restart que_batch (lib_batch)

    We recommend not running util x/3 at all: not much is written to the libraries' $data_files directories, and you can clean up individually whatever is written there.
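
    One cautious way to approach an individual cleanup is to list candidate files first and never select the lock file. The sketch below only prints file names; list_old_files is a hypothetical helper, the age threshold is an arbitrary example, and which files are actually safe to remove depends on your site.

```shell
#!/bin/sh
# List regular files under DIR last modified more than DAYS days ago,
# always skipping que_batch_lock so the lock is never a removal candidate.
# Nothing is deleted; the output is meant to be reviewed by hand.
list_old_files() {
    dir="$1"
    days="$2"
    find "$dir" -type f -mtime +"$days" ! -name 'que_batch_lock' -print
}

# Example use (the 30-day threshold is an assumption):
#   list_old_files "$data_files" 30
```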

    See also the article: "'File locked' error; job being started before preceding job has finished".

     


    • Article last edited: 10/8/2013