    How to detect invisible characters in bibliographic records

    • Product: Aleph
    • Product Version: 21, 22, 23
    • Relevant for Installation Type: Multi-Tenant Direct, Dedicated-Direct, Local, TotalCare

     

    Description

    Certain Unicode characters are invisible when saved in a bibliographic record. Examples include:

    • left-to-right mark: Unicode U+200E, HTML &#8206; or &lrm;, UTF-8 E2 80 8E
    • right-to-left mark: Unicode U+200F, HTML &#8207; or &rlm;, UTF-8 E2 80 8F

     

    These characters are not shown in the GUI display (except in the Unicode Bubble Hint), in the Web display, or in the util F-4 display.

    They may, however, appear in percent-encoded form when the data element that contains them is used in a URL. This makes them hard to detect, and it is easy to overlook that they are part of your data.
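
    As an illustration of why the marks surface only in URLs, the following minimal Python sketch prints the code point, the UTF-8 bytes, and the percent-encoded form of each of the two characters. It is not part of Aleph; the names INVISIBLE_MARKS and describe are hypothetical and chosen only for this example.

```python
from urllib.parse import quote

# The two invisible marks named in this article (dictionary name is hypothetical).
INVISIBLE_MARKS = {
    "\u200e": "left-to-right mark",
    "\u200f": "right-to-left mark",
}


def describe(ch):
    """Return the code point, UTF-8 bytes and percent-encoded form of one character."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "utf8": ch.encode("utf-8").hex(" ").upper(),  # bytes.hex(sep) needs Python 3.8+
        "url": quote(ch),  # percent-encodes the UTF-8 bytes, as a URL would carry them
    }


for ch, name in INVISIBLE_MARKS.items():
    print(name, describe(ch))
```

    The "url" value for the left-to-right mark is %E2%80%8E, which is the kind of string that can show up in a linkout URL even though nothing is visible in the record itself.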

    Resolution

    When you see unexpected data (e.g. %E2%80%8E) in your linkout URLs, use the batch procedure 'Download Machine-Readable Records (print-03)' to export the suspect bibliographic records.

    Open the exported file on the server in the vi editor and check the relevant field. The invisible characters listed above show up as unexpected symbols, like this:

    010878606 856   L $$uhttp://your.server/sfx_locater?sid=...issn=1934-3566▒~@~N$$zOnline via SFX 
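
    If the export contains many records, checking each line by eye in vi can be slow. The following is a minimal Python sketch, not an Aleph utility, that scans an exported file for the UTF-8 byte sequences of the two marks and reports the line number together with the first 13 bytes of the line (system number and field tag in the sample above). It assumes the export is UTF-8 encoded and has one field per line as in the sample; the names MARKS and find_marks are hypothetical.

```python
import sys

# UTF-8 byte sequences of the two marks named in this article.
MARKS = {
    b"\xe2\x80\x8e": "U+200E left-to-right mark",
    b"\xe2\x80\x8f": "U+200F right-to-left mark",
}


def find_marks(lines):
    """Yield (line_number, mark_name, line_prefix) for every mark found.

    `lines` is any iterable of bytes objects, for example a file opened
    in binary mode.
    """
    for number, line in enumerate(lines, start=1):
        for seq, name in MARKS.items():
            if seq in line:
                # Assumption: the first 13 bytes hold the system number and tag.
                yield number, name, line[:13].decode("ascii", "replace")


if __name__ == "__main__":
    # Each command-line argument is treated as the path of an exported file.
    for path in sys.argv[1:]:
        with open(path, "rb") as export:
            for number, name, prefix in find_marks(export):
                print(f"{path}: line {number}: {name} in {prefix!r}")
```

    Reading the file in binary mode keeps the check independent of the terminal settings that determine how vi renders the bytes.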

     

    Open the same records in the GUI Cataloging Editor and manually remove the unwanted characters. They are not directly visible in the editor, but the Unicode Bubble Hint shows them.


    • Article last edited: 17-May-2017