  • Ex Libris Knowledge Center

    Interpreting util f/4 z97 word display -- why is punctuation included?

    • Article Type: General
    • Product: Aleph
    • Product Version: 18.01

    What I see when I do util f/4 for the z97 with "the" as the key is:

    -the -000796835-000796835
    -the'50's -004603227-004603227
    -the'other' -003420642-003420642
    -the-- -003420643-003420643
    -the--oh -004716590-004716590
    -the-freeman -004603228-004603228
    -the-made-for-tv -004603229-004603229

    I am curious as to why "-the" and "-the--" would be two different instances of z97.

    Every line in this z97 display begins with a hyphen ( - ). That hyphen is not stored in the record; the display adds it.
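
To compare the stored words rather than the displayed ones, the leading hyphen can be stripped from each line. The following is only a sketch: `parse_display_line` is a hypothetical helper, not part of Aleph, and it assumes the word and the trailing number field are separated by whitespace, as they are in the listing pasted above.

```python
def parse_display_line(line: str) -> tuple[str, str]:
    """Split one pasted display line into (word, trailing number field)."""
    # Assumption: the word contains no spaces and is followed by whitespace
    # and then the number field, as in the listing in this article.
    shown_word, number_field = line.strip().rsplit(None, 1)
    # Drop only the single leading hyphen that the display adds.
    if shown_word.startswith("-"):
        shown_word = shown_word[1:]
    return shown_word, number_field


for line in ["-the -000796835-000796835", "-the-- -003420643-003420643"]:
    print(parse_display_line(line))
```

With the display hyphen removed, the two entries in the question are the distinct words "the" and "the--".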

    The index contains "doublewords": in addition to indexing each word individually, the system creates an index entry in which adjacent words are compressed into a single index word.
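
As a rough illustration of the doubleword idea, the sketch below joins each pair of adjacent words into one compressed word. It is not Aleph's implementation: the function name `doublewords`, the pairwise joining, and the sample title are all assumptions made for the example.

```python
def doublewords(words: list[str]) -> list[str]:
    """Join each pair of adjacent words into one compressed index word."""
    # Assumption: only directly adjacent pairs are joined, with nothing
    # inserted between them.
    return [first + second for first, second in zip(words, words[1:])]


title_words = ["gone", "with", "the", "wind"]
# Individual words plus the compressed adjacent pairs.
index_entries = title_words + doublewords(title_words)
print(index_entries)
```

Under these assumptions the four words yield three extra entries: "gonewith", "withthe" and "thewind".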

    Certain words may contain punctuation marks (or may be *just* punctuation marks). Punctuation marks that tab_word_breaking changes to blanks (via the "to_blank" line) or deletes (via the "compress" line) will not appear in the words, but marks that are neither blanked nor compressed are preserved. Also note that, as described in the tab_word_breaking header, the apostrophe ( ' ) and the hyphen ( - ) are "triple-posted", that is, indexed:

    (1) as separate words;
    (2) as is (with hyphen/apostrophe);
    (3) with hyphen/apostrophe compressed.

    For example, twenty-five is indexed as:
    ! twentyfive
    ! twenty
    ! five
    ! twenty-five

    ! Both the hyphen and the apostrophe must not be included in any of the word
    ! breaking procedures defined in this table.
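
The three postings described in the header can be sketched as follows. This is a minimal illustration of the rule as quoted above, not Aleph's own code; the function name `triple_post` is hypothetical.

```python
import re


def triple_post(word: str) -> set[str]:
    """Return the index forms for a word containing hyphens or apostrophes."""
    forms = {word}  # (2) as is, with hyphen/apostrophe
    # (1) the separate words on either side of each hyphen/apostrophe;
    # empty pieces (e.g. from trailing hyphens) are ignored.
    parts = [part for part in re.split(r"[-']", word) if part]
    forms.update(parts)
    # (3) the word with hyphens/apostrophes compressed out.
    compressed = "".join(parts)
    if compressed:
        forms.add(compressed)
    return forms


print(sorted(triple_post("twenty-five")))
print(sorted(triple_post("the--")))
```

For "twenty-five" this gives the same four forms as the header example. For "the--" the sketch gives "the" and "the--", which matches the two entries in the question.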

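    So what you are seeing is normal. Once the hyphen added by the display is set aside, "-the" is the word "the" and "-the--" is the word "the--", which keeps its hyphens because they are not blanked or compressed. The two are therefore separate z97 entries.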

    • Article last edited: 10/8/2013