- Article Type: General
- Product: Aleph
- Product Version: 20, 21, 22
We have found a number of records in which a word that was originally in the record has since been removed, yet the record still appears in a keyword search for that word.
In this specific case, the indexed text is a fixed-field position. We have an index, WJN, on position 21 of the 008 field. Some music records we are bringing in from WorldCat have this position set to n when it should be blank. After the record is corrected, a search for WJN=n still retrieves it. Aleph is not removing the record from the index that contains this word.
We have tried UTIL-F-1-17 to delete all words and headings for a record. That has not helped.
We have tried putting the n back, saving the record, and removing it again. That has not helped.
We tried changing n to some other invalid character. That also has not helped.
Is there any way to clear this word from the indexes for this record, short of running p_manage_01 on the entire database?
Two example records are 001491885 and 001344572. They should be on both the production and test servers, though all our attempts to correct the problem have been in production.
We have seen something like this happen before, for example with changed holdings making the PST fields different, and the record remaining in a search with the old location. We're not sure if this might be related, or if the current problem is only with fixed fields.
Prior to version 22, the maximum number of words that could be included in the Word indexes for a single record was 4,950. We have seen cases where a record is at or near this limit and deleting certain words from it does not properly delete the text from the Word index.
The words being indexed for a particular document can be seen in UTIL-F-1-28 ("Display Word Indexing for a Single Record"). Checking there, you will see that the "double-words" generated for adjacency purposes are also included in this word count. So the number of words being indexed is, in fact, roughly twice the number of words you see in the record.
In version 22 (rep_ver 19160), the limit for word indexing has been increased to 20,000 words (from 4,950).
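As a rough illustration of the arithmetic above, the following Python sketch estimates how many index entries a record might generate once adjacency double-words are counted, and compares the result with the two limits quoted in this article. It is only an approximation: the whitespace tokenization, the rule of one double-word per adjacent pair within a field, and the function names are assumptions made for illustration, not Aleph's actual word-breaking logic. UTIL-F-1-28 remains the authoritative view of what is indexed.

```python
# Rough estimate only. Aleph's real word-breaking rules are not reproduced here.
OLD_LIMIT = 4950    # per-record word limit before version 22 (from this article)
NEW_LIMIT = 20000   # per-record word limit in version 22 (from this article)


def estimate_indexed_words(field_texts):
    """Approximate single words plus adjacency double-words across all fields."""
    total = 0
    for text in field_texts:
        words = text.split()             # assumption: whitespace tokenization
        singles = len(words)
        doubles = max(singles - 1, 0)    # assumption: one double-word per adjacent pair
        total += singles + doubles
    return total


def limit_status(count):
    """Classify an estimated count against the pre-22 and version 22 limits."""
    if count >= NEW_LIMIT:
        return "at or over the version 22 limit"
    if count >= OLD_LIMIT:
        return "at or over the pre-22 limit"
    return "under both limits"


if __name__ == "__main__":
    # Hypothetical record: one long field of 2,500 words.
    sample_fields = ["word " * 2500]
    count = estimate_indexed_words(sample_fields)
    print(count, limit_status(count))
```

In this hypothetical case, a field of about 2,500 visible words is already enough to reach the pre-22 limit, which is why records that look well under 4,950 words can still be affected.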
Note: As described in the rep_ver, manage-01 must be run under Aleph 22.1 or higher. Simply sending the records that have this problem through ue_01 will not work.
delete index text word
- Article last edited: 3/23/2015