  • Ex Libris Knowledge Center

    German diacritics: searching 'ue' for 'u' with umlaut

    • Article Type: General
    • Product: Aleph
    • Product Version: 15, 16, 18, 20, 21, 22, 23

    We are seeing inconsistent search results for terms that contain diacritics, particularly German diacritics. We need the system adjusted so that a user can search with a diacritic, without a diacritic, or with a substitute for a diacritic (such as 'ue' for u with umlaut) and get the same results.

    For Word searching:

    The relevant table is the one named in the WORD-FIX line of $alephe_unicode/tab_character_conversion_line, namely $alephe_unicode/unicode_to_word_gen.

    The header of unicode_to_word_gen says:

    ! Another example, in order to set an umlauted "u" as "ue",
    ! set the equivalency of u-umlaut (00FC) to "u" + "e"
    ! (0075 + 0065).

    So, you need to change this line:

    00FC 0075                     #LATIN SMALL LETTER U WITH DIAERESIS

    to this:


    00FC 0075 0065                #LATIN SMALL LETTER U WITH DIAERESIS


    After saving the change:

    a. Stop and restart ue_01.
    b. Restart the www_server and pc_server.
    c. Resend a record containing the umlaut to the server (with GUI Cataloging or util f/13).
    d. Check whether searching is now satisfactory.
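
    The effect of the Word table change can be illustrated with a minimal Python sketch. This is not Aleph code: parse_mapping_line and normalize are hypothetical helpers, and the sketch assumes only the line layout shown above (a source code point, one or more target code points, then a "#" comment). Lowercasing stands in for whatever else the real word normalization does.

```python
def parse_mapping_line(line):
    """Parse 'SRC DST [DST ...] #COMMENT' into (source_char, replacement_string)."""
    fields = line.split("#", 1)[0].split()
    source = chr(int(fields[0], 16))
    replacement = "".join(chr(int(cp, 16)) for cp in fields[1:])
    return source, replacement


def normalize(text, mapping):
    """Lowercase the text, then replace each mapped character (assumed behaviour)."""
    return "".join(mapping.get(ch, ch) for ch in text.lower())


# The original and the changed table line from the article.
before = dict([parse_mapping_line(
    "00FC 0075                     #LATIN SMALL LETTER U WITH DIAERESIS")])
after = dict([parse_mapping_line(
    "00FC 0075 0065                #LATIN SMALL LETTER U WITH DIAERESIS")])

print(normalize("Müller", before))   # muller
print(normalize("Müller", after))    # mueller
print(normalize("Mueller", after))   # mueller
```

    Under these assumptions, the original line makes "Müller" and "Mueller" normalize to different words, while the changed line makes both reduce to "mueller".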

    For Browse:

    When the relevant $data_tab/tab_filing routine contains an entry for:

    .. char_conv FILING-KEY-01

    the $alephe_unicode/unicode_to_filing_01 file can be used to normalize combinations of characters.

    $alephe_unicode/unicode_to_filing_01 has this:

    00FC 0055                     #LATIN SMALL LETTER U WITH DIAERESIS

    You would need to change this line to:

    00FC 0055 0045                #LATIN SMALL LETTER U WITH DIAERESIS *

    and then perform steps a-d, as shown above for Word searching.

    * This table normalizes to upper case rather than lower case, so the replacement is "0055 0045" (U + E) instead of "0075 0065" (u + e).
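
    If the same edit has to be made for several characters, the line change can be scripted. The following is a minimal sketch, not an Ex Libris tool: replace_targets is a hypothetical helper, and it assumes the table line layout shown above (code points separated by spaces, followed by a "#" comment). It keeps the comment in its original column where the new code points fit.

```python
def replace_targets(line, source_hex, new_targets):
    """Return `line` with its target code points replaced when it maps `source_hex`.

    Lines for other code points (and "!" comment lines) are returned unchanged.
    """
    body, sep, comment = line.partition("#")
    fields = body.split()
    if not fields or fields[0].upper() != source_hex.upper():
        return line
    new_body = " ".join([fields[0], *new_targets])
    if not sep:
        return new_body  # no trailing comment on this line
    # Pad so the "#" stays in its original column (at least one space before it).
    width = max(len(body), len(new_body) + 1)
    return new_body.ljust(width) + sep + comment


original = "00FC 0055                     #LATIN SMALL LETTER U WITH DIAERESIS"
updated = replace_targets(original, "00FC", ["0055", "0045"])
print(updated)

# Decode the new targets to confirm they spell the upper-case substitute.
targets = updated.split("#")[0].split()[1:]
print("".join(chr(int(cp, 16)) for cp in targets))  # UE
```

    Working on a copy of the table and comparing it with the original before steps a-d is a sensible precaution.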


    Article last edited: 16-Feb-2016
