ExLibris
  • Ex Libris Knowledge Center

    tab_word_breaking: routine 90 and the forward slash ("/")

    • Article Type: General
    • Product: Aleph
    • Product Version: 19.01

    Description:
    Last month we added Routine 90 to tab_word_breaking:

    90 # to_blank @$^_={}[]:;<,.\/

    We have BIB records with a 500 field that contains the phrase "on order" and a date, such as: "ON ORDER - 6/04".

    With Routine 90 in place, I would expect these records to be retrieved by a Find (word anywhere) search such as "6/04", or "6 04" with words adjacent. However, neither of these searches works.

    500 is indexed as WRD. I did util F/1/2 using 90, for the text "6/04" and got these results:
    WORD 1: -6-
    WORD 2: -04-
    WORD 3: -604-

    According to this, then, shouldn't these searches work:

    Word anywhere = 6 04 (adjacent)
    Word anywhere = 6/04
    Word anywhere = 604

    None of these searches retrieves the record with system number 1515132.
    System number 1526422 has "2/05" in a 500 field. Searches for this don't work either.
    Both of these BIB records are retrievable by other keyword searches on other terms.

    Other search terms which include the forward slash (such as "ob/gyn") also demonstrate this problem.

    Resolution:
    util f/1/28 shows that "6" and "04" are not indexed as separate words. The text is indexed only as the single word "6/04".

    The abc01/tab/tab11_word shows that the 500 field is being indexed using the "03" tab_word_breaking routine.

    The "03" has this line:

    03 # to_blank !@#$%^()_={}[]:";<>,.?|\

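    That is, it contains no forward slash, so the slash is kept as part of the word when the field is indexed.

    In searching, tab_word_breaking routine 90 is used. The routine 90 to_blank line *does* contain the slash. Thus, at search time the slash in "6/04" is changed to a blank and the search looks for the separate words "6" and "04". Because the index holds only the single word "6/04", nothing is found.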
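    The mismatch can be illustrated with a small sketch. It models only the to_blank step (replace each listed character with a blank, then split on blanks), which is a simplification of what Aleph actually does; the function name break_words and the lowercasing are assumptions made for illustration. The two character lists are copied from the routine 03 and routine 90 lines quoted above.

```python
# to_blank character lists as quoted in this article.
TO_BLANK_03 = '!@#$%^()_={}[]:";<>,.?|\\'   # used when the 500 field is indexed
TO_BLANK_90 = '@$^_={}[]:;<,.\\/'           # used when the search term is parsed

def break_words(text, to_blank):
    """Hypothetical model of to_blank: blank out listed characters, then split."""
    table = str.maketrans({ch: " " for ch in to_blank})
    return text.lower().translate(table).split()

# Index side: routine 03 has no slash, so "6/04" stays one word.
indexed = break_words("ON ORDER - 6/04", TO_BLANK_03)

# Search side: routine 90 has the slash, so the term becomes two words.
searched = break_words("6/04", TO_BLANK_90)

# The search succeeds only if every search word exists in the index.
found = all(word in indexed for word in searched)

print(indexed)   # the slash survives indexing
print(searched)  # the slash is blanked at search time
print(found)
```

    In this simplified model the index contains "6/04" but neither "6" nor "04", so the search words produced by routine 90 have nothing to match.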

    In the as-delivered USM01 tab11_word, tab_word_breaking routines 01 and 03 are used for all the fields. In a case like this, the best thing is to temporarily change routine 90 to match the indexing routines. (This presumes that both routine 01 and routine 03 contain the character -- which *is* the case with the slash.)

    When you remove the slash from the routine 90 to_blank line and restart the www_server (or pc_server), the search "6/04" will retrieve the record 1515132 given in your sample case.
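    A quick way to see which characters make routine 90 disagree with an indexing routine is to compare the two to_blank lists as sets. The sketch below is illustrative only: the helper name extra_in_search is hypothetical, and only routine 03 is compared because the routine 01 line is not quoted in this article.

```python
# to_blank character lists as quoted in this article.
TO_BLANK_03 = '!@#$%^()_={}[]:";<>,.?|\\'   # indexing routine
TO_BLANK_90 = '@$^_={}[]:;<,.\\/'           # search routine

def extra_in_search(search_chars, index_chars):
    """Characters blanked at search time but kept at index time."""
    return sorted(set(search_chars) - set(index_chars))

extra = extra_in_search(TO_BLANK_90, TO_BLANK_03)

# Routine 90 with the mismatched characters removed, as the temporary fix suggests.
adjusted_90 = "".join(ch for ch in TO_BLANK_90 if ch not in extra)

print(extra)
print(adjusted_90)
```

    For the two lines quoted here, the only character that routine 90 blanks and routine 03 does not is the forward slash, which matches the change described above.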

    The next time you run p_manage_01, you can decide whether to add the slash to the to_blank line of routine 03 (and 01). With the slash in the indexing routines, and restored in routine 90 so that searching matches indexing again, the text becomes searchable as either "6 04" or "6/04".

    We are requesting (Sept. 2009) that the distributed usm01 tab_word_breaking be updated to correct this problem.

    Note: this is especially a problem with the WUR (URL) word index.

    (keywords: p-manage-01 manage_01)

    Additional Information

    faq


    • Article last edited: 10/8/2013