    Problem with newly-created word indexing code for SuDoc call number

    • Article Type: General
    • Product: Aleph
    • Product Version: 18.01

    Description:
    In our v.18 BIB test environment, we recently created a word indexing code for the item SuDoc number (code WIS) to index subfield "$$h" of the "PST3#" field. Here are the configuration lines for the index:

    tab11_word:
    PST3# 0 z30 h 03 WIS WSD WRD
    PST3# 0 hol h 03 WIS WSD WRD

    tab00.eng:
    H WIS W-069 00 00 W-Item SuDoc

    tab_word_breaking:
    03 # del_subfield
    03 # abbreviation
    03 # numbers
    03 # 2_hyphen
    03 # compress_h_quote
    03 # to_blank !@#$%^()_={}[]:";<>,.?|\/
    03 # split_cjk
    03 # to_lower

    After creating the code, we indexed several records in the database. The results were mixed. The records that work mostly contain SuDoc numbers beginning with the single letter "Y". Here are a few examples that can be found with the indexing code in a "CCL" query:
    000599634
    000598309
    000598757
    The records that fail predominantly have SuDoc numbers beginning with two or more letters, such as "LC", "NAS", and "PREX". Here are a few examples:
    000269469 NAS 1.2:Ae 8/8
    000571963 PREX 23.14/2:
    000598343 LC 1.2:N 49
    Also, not all SuDoc numbers that begin with a single letter work with this indexing code. For example, the following record cannot be found in a "CCL" query:
    000338250 A 1.2:F 84
    Upon further investigation using util/f/1/28, we found that for all the records listed above, the "PST3#" subfield "$$h" has been indexed. In addition, all of these records work in a "CCL" query using the direct index "GVD".

    Resolution:
    Given that all of the above records work in a "CCL" query using the direct index "GVD", why are you creating this WIS index? Why not just use the GVD direct index?

    We have found Word indexes to be problematic for call numbers. A call number has multiple segments separated by spaces, and each segment is indexed as a separate word. You might use "compress" and "compress_blank" to create a single word with no spaces, but I think even this is likely to cause problems.
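
    To make the segment problem concrete, here is a minimal Python sketch. It is not Aleph's word-breaking code: it only imitates the "to_blank" and "to_lower" lines of the tab_word_breaking excerpt above and ignores the other routines (such as "numbers" and "abbreviation"), so the real index may split these strings differently. The function names word_tokens and compressed_key are hypothetical, and compressed_key merely imitates the idea of compressing a call number into one word.

```python
# Characters listed on the "to_blank" line of the article's tab_word_breaking excerpt.
TO_BLANK = '!@#$%^()_={}[]:";<>,.?|\\/'


def word_tokens(call_number: str) -> list[str]:
    """Simplified stand-in for word breaking (an assumption, not Aleph's routine):
    replace the listed punctuation with blanks, lowercase, split on whitespace."""
    table = str.maketrans({ch: " " for ch in TO_BLANK})
    return call_number.translate(table).lower().split()


def compressed_key(call_number: str) -> str:
    """Hypothetical single-word form: the same tokens joined with no spaces."""
    return "".join(word_tokens(call_number))


if __name__ == "__main__":
    # The first three values are the failing examples quoted in the article.
    # "A 12:F 84" is a made-up value used only to show a collision.
    for sudoc in ["NAS 1.2:Ae 8/8", "LC 1.2:N 49", "A 1.2:F 84", "A 12:F 84"]:
        print(f"{sudoc!r}: words={word_tokens(sudoc)} compressed={compressed_key(sudoc)!r}")
```

    Under these assumptions, "NAS 1.2:Ae 8/8" breaks into six short words, none of which identifies the call number by itself, and the compressed form can no longer tell "A 1.2:F 84" from the made-up "A 12:F 84".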

    If you really need another index, I suggest creating it as a direct index.


    • Article last edited: 10/8/2013