    p_print_03: output contains U+nnnn characters Aleph fails to convert to MARC8

    • Article Type: General
    • Product: Aleph
    • Product Version: 18.01

    Description:
    When we export records from our database and convert them to MARC8, the output sometimes contains U+nnnn escapes for Unicode characters that Aleph could not convert to MARC8. Example:

    =670 \\$aGlossaire de l'?conomie de l'OCDE angl.-fran?., 2006 [en ligne via Ariane].$bEconomist\U+2026\ Business economist = ?conomiste d'entreprise.

    The U+2026 here is the Unicode code point for HORIZONTAL ELLIPSIS, which does not exist in MARC8.
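To see which characters are affected before deciding on any transliteration, the exported text can be scanned for these escapes. The following is a minimal sketch, not an Aleph utility: it assumes the escape always has the form `\U+nnnn\` seen in the example above, and the function name `find_unconverted` is hypothetical.

```python
import re
import unicodedata

# Assumption: unconverted characters always appear as \U+nnnn\ (backslash-delimited),
# as in the single example shown in this article.
ESCAPE = re.compile(r"\\U\+([0-9A-Fa-f]{4,6})\\")

def find_unconverted(line):
    # Return a (code point, Unicode name) pair for every escape found in the line.
    found = []
    for match in ESCAPE.finditer(line):
        cp = int(match.group(1), 16)
        found.append((f"U+{cp:04X}", unicodedata.name(chr(cp), "UNKNOWN")))
    return found

sample = r"$bEconomist\U+2026\ Business economist"
print(find_unconverted(sample))  # [('U+2026', 'HORIZONTAL ELLIPSIS')]
```

Running this over every line of an export file gives the list of code points that would need a fallback mapping.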

    To eliminate those unreadable codes from our MARC8 exports, I want to add a home-made transliteration using the following lines in the marc8_lat_to_unicode table:

    00AB 22 LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
    00BB 22 RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
    2013 2D EN DASH
    2019 27 RIGHT SINGLE QUOTATION MARK
    2026 2E2E2E HORIZONTAL ELLIPSIS

    So:

    Unicode LEFT-POINTING DOUBLE ANGLE QUOTATION MARK would be converted to a MARC8 QUOTATION MARK
    Unicode RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK would be converted to a MARC8 QUOTATION MARK
    Unicode EN DASH would be converted to a MARC8 HYPHEN-MINUS
    Unicode RIGHT SINGLE QUOTATION MARK would be converted to a MARC8 APOSTROPHE
    Unicode HORIZONTAL ELLIPSIS would be converted to three MARC8 PERIODs

    This would solve our export problem, as it would produce MARC8 files without U+ codes.
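The effect of the five proposed lines can be illustrated outside Aleph. The sketch below only models the substitution itself (Unicode code point to the listed hex bytes); it does not reproduce Aleph's conversion routine or the way Aleph reads the table, and `FALLBACKS` and `apply_fallbacks` are hypothetical names.

```python
# Unicode code point -> replacement bytes (hex), copied from the five proposed table lines.
FALLBACKS = {
    0x00AB: "22",      # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK -> QUOTATION MARK
    0x00BB: "22",      # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK -> QUOTATION MARK
    0x2013: "2D",      # EN DASH -> HYPHEN-MINUS
    0x2019: "27",      # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
    0x2026: "2E2E2E",  # HORIZONTAL ELLIPSIS -> three periods
}

def apply_fallbacks(text):
    # Replace each mapped character with its ASCII fallback; leave everything else as-is.
    out = []
    for ch in text:
        hex_bytes = FALLBACKS.get(ord(ch))
        out.append(bytes.fromhex(hex_bytes).decode("ascii") if hex_bytes else ch)
    return "".join(out)

print(apply_fallbacks("\u00abEconomist\u2026 \u2013 l\u2019OCDE\u00bb"))
# prints: "Economist... - l'OCDE"
```

All replacement bytes fall in the 0x00-0x7F range, so the output of the substitution is plain ASCII for these five characters.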

    Now, my question: could this cause a problem when we import records from a MARC8-encoded file into Aleph?

    My guess is no: the MARC8 codes I used for the transliteration are all BASIC LATIN, identical to ASCII in the range 0x00-0x7F, and the conversion algorithm should leave those untouched. I just want confirmation of that guess.

    Resolution:
    The marc8_lat_to_unicode file that Ex Libris distributes has single-Unicode-character ("precomposed") representations for characters that appear in the Arial Unicode MS font, which is the de facto standard for the display of diacritics.

    [From site:]

    My concern was not with the display of this character in Aleph because, as I said, there is no problem with that. It was with the modification I made to the marc8_lat_to_unicode table. The added line corrected the problem on export: the exported file no longer contains U+2026 and contains the three dots instead. But I was worried that it would also do the reverse: convert three dots into a single horizontal ellipsis when MARC8 data is imported into Aleph.

    Yesterday afternoon, I gave it a try using p_manage_22. I tested conversion of a MARC8 file containing three dots to Unicode. The result (thankfully!) is what I expected: the three dots remained three dots, which is exactly what I wanted.
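A test like this can be checked mechanically once the input and the converted output are available as text. The following is a rough sketch under that assumption: it does not call p_manage_22 or read Aleph files, and both function names are hypothetical.

```python
def summarize_import(before, after):
    # before: text decoded from the MARC8 input file
    # after:  text of the converted (Unicode) output
    return {
        "dot_runs_before": before.count("..."),
        "dot_runs_after": after.count("..."),
        "ellipsis_after": after.count("\u2026"),
    }

def import_left_dots_alone(before, after):
    # True when no ellipsis character appeared and the count of "..." runs is unchanged.
    s = summarize_import(before, after)
    return s["ellipsis_after"] == 0 and s["dot_runs_before"] == s["dot_runs_after"]

print(import_left_dots_alone("Economist... Business", "Economist... Business"))      # True
print(import_left_dots_alone("Economist... Business", "Economist\u2026 Business"))  # False
```

The second call shows what the unwanted reverse mapping would look like: the three dots disappear and a single U+2026 takes their place.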


    • Article last edited: 10/8/2013