- Product: Aleph
- Product Version: 20, 21, 22, 23
- Relevant for Installation Type: Dedicated-Direct, Direct, Local, Total Care
This issue is occurring in our MUS-01 CDLC OPAC: http://libweb.lib.xxx.edu/F?&local_base=mus01
If you search the quoted string "Tradition of the new" using the default parameters (all fields/all libraries), you get two results. The MARC record of the first result, "1946-1968 : the birth of contemporary art", does not contain this string. Why is it being retrieved?
Aleph Word searching doesn't actually look for the complete exact phrase. It uses the heuristic described in Appendix D of the "How To Run Index Jobs" document:
The new mechanism creates a word from each pair of words. Thus, in addition to the words "great", "britain", and "army", the words "greatbritain" and "britainarmy" will be indexed.
When you search on "great britain army" with adjacent = Yes, it will find the records containing the words "greatbritain" and "britainarmy".
But there's nothing to require that all three words "great britain army" be next to each other in the record.
We have found that this heuristic is very fast and gives good results in the majority of cases. If, in a particular case, there are too many false drops, one could do a zero proximity search ("wrd= great !0 britain !0 army") which will do the "old adjacency" and, though taking longer, will give you only those records where all three words are actually adjacent.
<end Appendix D excerpt>
"1946-1968 : the birth of contemporary art" appears in the results because its record contains each of the word pairs "tradition of", "of the", and "the new", even though they never occur together as the complete phrase "Tradition of the new".
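The behavior can be illustrated with a small Python sketch. This is not Aleph code: the function names, the tokenization rule, and the sample record text are all assumptions made up for illustration, and the real indexing may differ in details such as which fields are indexed. It only models the idea from the excerpt, that a quoted search is satisfied when every fused word pair of the query occurs somewhere in the record, and contrasts it with a true adjacency check similar in spirit to the zero proximity search.

```python
import re


def words(text):
    """Lowercase a string and split it into alphanumeric words (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def pair_words(text):
    """Return the set of fused adjacent-word pairs, e.g. 'greatbritain'."""
    w = words(text)
    return {a + b for a, b in zip(w, w[1:])}


def pair_match(query, record):
    """Heuristic match: every query word and every query word pair occurs in the record."""
    return (set(words(query)) <= set(words(record))
            and pair_words(query) <= pair_words(record))


def strict_match(query, record):
    """True adjacency: the query words occur consecutively and in order."""
    q, r = words(query), words(record)
    return any(r[i:i + len(q)] == q for i in range(len(r) - len(q) + 1))


# Hypothetical record text, NOT the actual MARC record from the OPAC.
record = "The tradition of painting. Critics of the period. Rise of the new realism."
query = "Tradition of the new"

print(pair_match(query, record))    # pair heuristic retrieves the record
print(strict_match(query, record))  # strict adjacency does not
```

In this made-up record the pairs "traditionof", "ofthe", and "thenew" each occur, but in different places, so the pair heuristic reports a match while the strict check does not. This is the same kind of false drop seen in the OPAC search above.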
Though it doesn't apply to this situation, another, more common cause of an OPAC search retrieving records which lack the word(s) being searched is described in the article "Web OPAC search retrieves records which lack the word(s) being searched" (KB 16384-30385).
- Article last edited: 12-Mar-2016