
    CAT:Bulk Import: 'O' as character set does not load underscore (5f)

    • Article Type: General
    • Product: Voyager
    • Product Version: 7.0.1

    Problem Symptoms

    • When loading OCLC records with an underscore _ (hex value 5f) using the expected character set OCLC (non-Unicode), the records error out and are not loaded.
    • Messages like "ERROR: Unparseable record written to error file: [33]-856 : c->8 undefined char in page 0 at 86 '5f766572 _ver'" are written to /m1/voyager/xxxdb/rpt/log.imp.yyyymmdd.hhmm.
    • The error messages mention the '5f' character (underscore).
    • In SysAdmin -> Cataloging -> Bulk Import Rules -> Rules -> Expected Character Set Mapping of Imported records is OCLC (non-Unicode).
    • The same records can be loaded via the Cataloging client using the same expected character set ("OCLC (non-Unicode)").
    • The same records can be loaded via Bulkimport using the expected character set "MARC21 MARC8 (non-Unicode)".
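
The hex string quoted in the error message can be decoded to confirm which characters the loader rejected. This is a minimal Python sketch, assuming the eight hex digits are four consecutive ASCII bytes (which matches the text printed next to them in the log). The helper name decode_hex_snippet is hypothetical and is not part of Voyager.

```python
def decode_hex_snippet(hex_digits):
    """Turn a hex string such as '5f766572' into the ASCII text it encodes."""
    return bytes.fromhex(hex_digits).decode("ascii")


# The first byte, 0x5f, is the underscore reported in the error message.
print(decode_hex_snippet("5f766572"))  # prints "_ver"
```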

    Defect Status

    Issue 18748 is fixed in Voyager 8.1.1.

    Replication steps

    1. Get a file of records with underscores from OCLC.
    2. Create a Bulkimport Rule in SysAdmin with "OCLC (non-Unicode)" as the Expected Character Set.
    3. Import the records using Bulkimport.
    4. Look at the log.imp file. It will report that the records were not loaded and were sent to the error file.
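
Before running the import, it can help to check that the test file really contains underscores. This is a rough Python sketch, not a Voyager tool: it assumes the file is a plain concatenation of MARC records, each ending with the record terminator byte 0x1D, and it only looks for the raw byte 0x5f. A 0x5f byte that is part of a multi-byte CJK sequence would also be counted. The function name records_with_underscore is hypothetical.

```python
RECORD_TERMINATOR = b"\x1d"  # end-of-record byte in MARC 21 / ISO 2709
UNDERSCORE = b"\x5f"         # the byte reported as '5f' in log.imp


def records_with_underscore(data):
    """Return the 1-based positions of records whose bytes include 0x5f."""
    records = [r for r in data.split(RECORD_TERMINATOR) if r.strip()]
    return [i for i, rec in enumerate(records, start=1) if UNDERSCORE in rec]


# Made-up sample bytes standing in for a file of two records.
sample = b"abc_def\x1dghi\x1d"
print(records_with_underscore(sample))  # prints [1]
```

For a real file, read it in binary mode (for example open(path, "rb").read()) and pass the bytes to the function.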

    ----------------------------------------

    Excerpt from the log file:
    I am 2085. I will be doing all of '/m1/incoming/MER/errimp200608230001.txt' for you.
    The import code is "REP" for this run.
    The bib dup profile is "OCLCReplace" for this run.
    The auth dup profile is "AuthReplace" for this run.
    This import is using a rule that does not allow creation of MFHDs or Items.
    Wed Aug 30 12:39:06 2006
    Expecting 'O' as character set
    1: ERROR: Unparseable record written to error file: [34]-530 : c->8 undefined char in page 0 at 126 '5f68746d _htm'
    2: ERROR: Unparseable record written to error file: [22]-530 : c->8 undefined char in page 0 at 133 '5f73756d _sum'
    3: ERROR: Unparseable record written to error file: [16]-530 : c->8 undefined char in page 0 at 143 '5f313939 _199'
    4: ERROR: Unparseable record written to error file: [17]-538 : c->8 undefined char in page 0 at 115 '5f53505f _SP_'
    5: ERROR: Unparseable record written to error file: [20]-530 : c->8 undefined char in page 0 at 125 '5f353133 _513'
    6: ERROR: Unparseable record written to error file: [22]-538 : c->8 undefined char in page 0 at 123 '5f68746d _htm'
    7: ERROR: Unparseable record written to error file: [18]-530 : c->8 undefined char in page 0 at 128 '5f636174 _cat'
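
Error lines like those in the excerpt can be picked apart with a regular expression when a long log needs to be summarized. This is a sketch written only against the lines shown here, so other Voyager versions may format the message differently. The field names (seq, bracket, tag, offset) are hypothetical labels. In particular, the meaning of the number in square brackets is not stated in this article, and reading the three digits after the hyphen as a MARC tag is an assumption.

```python
import re

# Pattern derived from the log lines in this article (assumption: the
# message layout is stable). The leading sequence number is optional.
ERROR_RE = re.compile(
    r"^\s*(?:(?P<seq>\d+): )?ERROR: Unparseable record written to error file: "
    r"\[(?P<bracket>\d+)\]-(?P<tag>\d{3}) : .*? at (?P<offset>\d+) "
    r"'(?P<hex>(?:[0-9a-fA-F]{2})+) (?P<text>[^']*)'"
)


def parse_error_line(line):
    """Return a dict of fields from one error line, or None if it does not match."""
    match = ERROR_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Check byte by byte, so a '5f' straddling two bytes is not miscounted.
    fields["has_underscore"] = b"\x5f" in bytes.fromhex(fields["hex"])
    return fields


line = ("1: ERROR: Unparseable record written to error file: [34]-530 : "
        "c->8 undefined char in page 0 at 126 '5f68746d _htm'")
print(parse_error_line(line)["tag"])  # prints "530"
```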

     

    Workaround

    Load the records via the Cataloging client using the same expected character set ("OCLC (non-Unicode)"), or

    Load the records via BulkImport using the expected character set "MARC21 MARC8 (non-Unicode)".

    Additional Information

    OCLC now accepts the spacing underscore entered as "_" rather than "%5F". See OCLC TB 252: http://www.oclc.org/support/documentation/worldcat/tb/252/


    • Article last edited: 08-Oct-2013