Multiple z07a records for single bib update
- Article Type: General
- Product: Aleph
- Product Version: 20, 21, 22, 23
Problem Symptoms:
We are finding cases where a single update to a bib record (loaded, for instance, by manage-18) generates four, or even eight, z07a records (for word and z0102 indexing). These records carry the highest-priority year ("1990"), the same priority as GUI-cataloged records, so they delay the word-indexing of GUI-cataloged records.
Case 1:
After a set of 120,000 bib records was loaded five days ago ...
1. ue_01 has been processing the set continuously -- and filling up our archive logs;
2. a grep on the record number shows that the same records are being processed over and over: grep XXX01.003808388 retrieves four entries from run_e_01_word.nnnnn and four from run_e_z0102.nnnnn.
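The grep check in item 2 can be repeated for any record with a small shell function. This is a minimal sketch, not an Aleph utility: the function name count_record_hits is hypothetical, and the log directory is passed as an argument because the article does not say where the run_e_* logs live at a given site.

```shell
# count_record_hits DIR RECORD
# For each ue_01 log family named in this article, print how many log
# lines mention RECORD. DIR is an assumption: pass whichever directory
# holds the run_e_* logs at your site.
count_record_hits() {
    dir=$1
    record=$2
    for prefix in run_e_01_word run_e_z0102; do
        # -F matches the record literally, so the "." in XXX01.003808388
        # is not treated as a regex wildcard; -c prints the match count.
        # "|| true" keeps a zero-match result from aborting under set -e.
        hits=$(cat "$dir"/"$prefix".* 2>/dev/null | grep -cF -- "$record" || true)
        printf '%s %s\n' "$prefix" "$hits"
    done
}

# Example call (the directory is a placeholder):
# count_record_hits /path/to/logs XXX01.003808388
```

A count well above one per log family suggests the record was queued several times, which can then be compared against the number of updates actually made to it.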
Case 2:
With these entries in aleph_start:
setenv z07_p_manage_21 1999
setenv z07_p_manage_18 2000
setenv z07_p_manage_40 2019
setenv z07_p_manage_33 2021
setenv z07_p_manage_37 2021
setenv z07_p_manage_50 2021
setenv z07_p_manage_55 2021
the site ran this sequence of jobs:
manage-18 (for approximately 5,000 bib records)
manage-50 (xxx01), 4 times, to create the hol and item for 4 xxx50 ADMs.
manage-55 (xxx01) + manage-18 (xxx60), to load the bib 856 into each hol.
manage-37 (xxx60), to add the proxy to each 856
manage-21 (xxx01), 2 times, to delete 852 in 856 in the bib.
For those approximately 5,000 input records, 18,664 z07a records were generated, most with the year 1990.
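As a rough check on the scale of Case 2, the z07a count can be divided by the number of input records. The helper name z07a_per_record is hypothetical, and the result is only approximate because the input count itself is approximate.

```shell
# z07a_per_record TOTAL_Z07A INPUT_RECORDS
# Print the average number of z07a records per input record, to two
# decimals. Hypothetical helper for illustration; not part of Aleph.
z07a_per_record() {
    awk -v total="$1" -v inputs="$2" 'BEGIN { printf "%.2f\n", total / inputs }'
}

# Figures from Case 2 (the input count is approximate).
z07a_per_record 18664 5000
```

This works out to between three and four z07a records per input record, rather than the single z07a that one update would be expected to produce.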
Cause:
Updates to the record after the original load cause additional z07 and z07a records to be created.
If there is an existing, unprocessed z07 for a particular record, a second z07 will not be written for that record. See the article "Overwriting of existing z07's by new ones" in this regard.
But, while there can't be more than one z07 record at a time for a particular doc record, there can be any number of z07a records.
Resolution:
Stop ue_01 before running the first job, and start it only after the last job has finished. While ue_01 is stopped, each record's z07 stays unprocessed, so later updates to the same record do not write a further z07. Thus, only one z07 and one z07a will be written for each input record.
As described in the article "Order of Z07A processing", the z07a's priority is generally the same as that of the z07 it is derived from, but see the last paragraph of that article for an exception.
Additional Information
This SQL showed 134,944 z07a records (waiting to be processed by ue_01_word and/or ue_01_z0102):
xxx01@LIBP> select count(*) from z07a;
COUNT(*)
----------
134944
util f/4 for XXX01.003808388 showed there have been various updates to XXX01.003808388:
CAT L $$aCBUEBSCO$$b$$c20130217 $$lXXX01$$h1154
CAT L $$aBATCH$$b00$$c20130217 $$lXXX01$$h1249
CAT L $$aBATCH$$b00$$c20130219 $$lXXX01$$h0555
CAT L $$aBATCH$$b00$$c20130219 $$lXXX01$$h0957
CAT L $$a035OCLC$$b$$c20130223 $$lXXX01$$h0753
The following SQL shows that there are currently no z07a's waiting to be processed for XXX01.003808388:
xxx01@LIBP> select z07a_rec_key from z07a where z07a_rec_key like '%003808388';
no rows selected
XXX01.003808388 will appear in the log again *only* if another update is made to it. There is no evidence that records will be processed multiple times by ue_01 without multiple updates occurring.
Former title: "ue_01 continuously processing same records - huge archive redo logs"
- Article last edited: 15-Dec-2016