ue_01 (ue_01_a) running very slowly; "More than 10000 entries"
- Article Type: General
- Product: Aleph
- Product Version: 20, 21, 22, 23
Description:
Our $data_scratch/run_e_01 logs show that ue_01 processes most records in 1-3 seconds but takes 4-12 *minutes* for about a quarter of them.
For some (but not all) of these records that take a long time, we see messages such as the following:
Case 1: Note: More than 10000 entries for 0
Case 2: Note: More than 10000 entries for http://db14.linccweb.org/login?url=http://site.ebr
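To see which headings trigger the note most often, the log lines can be tallied with a short script. The following is a minimal Python sketch, not part of Aleph. It assumes the note appears in the log with the wording quoted above, and the function name is illustrative only.

```python
import re
from collections import Counter

# Assumption: the note appears in the log as quoted in this article;
# everything after "entries for " is treated as the heading text.
NOTE_RE = re.compile(r"More than 10000 entries for (.*)$")


def count_long_heading_notes(lines):
    """Return a Counter mapping heading text to the number of notes seen."""
    counts = Counter()
    for line in lines:
        match = NOTE_RE.search(line.rstrip("\n"))
        if match:
            counts[match.group(1).strip()] += 1
    return counts


def count_notes_in_file(path):
    """Same tally, reading the lines from a log file chosen by the caller."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return count_long_heading_notes(log)
```

Calling count_notes_in_file() on a run_e_01 log and printing the result of .most_common() lists the most frequent headings first, which points to the tab11_acc lines worth reviewing.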
Resolution:
The message "More than 10000 entries for" is issued by the ./com/update_long_heading program.
This program reads all the Z01s that have the same Z01-ACC-CODE/Z01-ALPHA/Z01-FILING-TEXT as the ACC heading being indexed. **It does this regardless of how short the heading is.** That is, even if the heading is only three characters long, it reads all the Z01s with the same first 75 bytes in the z01_rec_key.
If there are more than 10,000, it issues this message; in either case, it sorts the Z01s it has read, up to 10,000. (*It is this sorting that takes all the time.*)
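To find which headings are affected, it can help to count how many Z01 entries share the same heading. The following is a rough Python sketch, not the logic of update_long_heading itself. It assumes the headings have already been exported as (ACC code, filing text) pairs by some site-specific means; that export and the function name are hypothetical.

```python
from collections import Counter

LIMIT = 10000  # threshold named in the "More than 10000 entries" note


def headings_over_limit(headings, limit=LIMIT):
    """Return (heading, count) pairs whose count exceeds limit, largest first.

    headings: iterable of hashable keys, e.g. (acc_code, filing_text) tuples
    taken from a hypothetical export of the Z01 table.
    """
    counts = Counter(headings)
    over = [(key, n) for key, n in counts.items() if n > limit]
    return sorted(over, key=lambda item: item[1], reverse=True)
```

Any heading returned by this function is one for which ue_01 would read and sort the maximum number of Z01s each time a record with that heading is indexed.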
Case 1:
This is a problem for ACC indexes such as "LOC" and "SH" (from tab11_acc):
PST## LOC bc
PST## SH bchijk
when the 852 has no call number and only the sublibrary code is sent to the index.
Example (from util f/1/29 for bib# 006314381):
PST $$0Z30$$1006314381000010$$bSMU$$oBOOK$$eOR$$fN$$rSYS60-000000000
LOC SMU $$bSMU
SH SMU $$bSMU
This results in thousands of LOC and SH headings with just "SMU".
This problem was resolved by two changes:
1. Commenting out the LOC index in tab11_acc -- since the WSB and WCL Word indexes serve the same purpose.
2. Changing the SH line to include a "subfield filter", with "h" in column 3 (so that the field is indexed only if it actually contains a call number):
PST## h SH bchijk
The thousands of existing LOC and SH headings that contain just the sublibrary are not a problem: the new rules are used, and these old Z01s are simply ignored (and, of course, cleaned up by the next run of p_manage_02).
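The effect of the subfield filter in the SH line can be illustrated with a small Python sketch. This is a simplified model, not Aleph's indexing code: it assumes subfields are delimited by "$$" followed by a one-character code, as in the util f/1/29 example above, and the function names are illustrative.

```python
def parse_subfields(field_text):
    """Split '$$bSMU$$oBOOK' into [('b', 'SMU'), ('o', 'BOOK')]."""
    parts = field_text.split("$$")[1:]  # drop any text before the first "$$"
    return [(part[0], part[1:]) for part in parts if part]


def heading_text(field_text, index_subfields, required_subfield=None):
    """Return the text sent to the index, or None if the field is skipped.

    index_subfields: subfield codes to include, e.g. "bchijk".
    required_subfield: models the subfield filter; when set, the field is
    indexed only if it contains that subfield.
    """
    subfields = parse_subfields(field_text)
    if required_subfield is not None:
        if not any(code == required_subfield for code, _ in subfields):
            return None
    values = [value for code, value in subfields if code in index_subfields]
    return " ".join(values) if values else None


# PST field from the example above: no call number subfields are present.
pst = "$$0Z30$$1006314381000010$$bSMU$$oBOOK$$eOR$$fN$$rSYS60-000000000"
without_filter = heading_text(pst, "bchijk")    # only the sublibrary code
with_filter = heading_text(pst, "bchijk", "h")  # field is skipped
```

Without the filter the field yields just "SMU"; with "h" required it yields nothing, so no sublibrary-only SH heading is created for this item.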
Case 2:
The problem was that ./xxx01/tab/tab11_acc contained this line:
856## u -*webscri* URI u
This means that ACC Browse entries are created for all the 856 fields, like this:
85640 L $$uhttp://db14.linccweb.org/login?url=...m/lib/lscc/Top....
It was determined that this "URI" Browse index was not actually being used. Commenting out the 856 ... URI line in tab11_acc and restarting ue_01 corrected the problem.
If neither of the above appears to be the cause, check the Oracle indexes on the tables that are read or updated by ue_01. These are listed in SKB 8192-2329.
- Article last edited: 18-Jan-2017