BULK: Non-Unicode records containing Greek characters fail to load.
- Article Type: General
- Product: Voyager
- Product Version: 8.2.0
Description:
Module: Bulk Import
Server platform affected: Linux, Sun OS
PC OS: Windows XP
Browser & version: IE 7
Releases replicated in: 7.2.5 - 8.2.0
Last version without bug: N/A
Expected results: When bulk importing MARC21 MARC-8 (non-Unicode) records, Greek characters are processed correctly.
Actual results: When bulk importing MARC21 MARC-8 (non-Unicode) records, Greek characters trigger an error message and records are not loaded.
Workflow implications: Records containing Greek characters cannot be loaded as MARC-8; the workaround slows staff productivity.
Replication steps:
1. SysAdmin > Cataloging > Bulk Import Rules > Set the expected character set of the rule to be used to MARC21 MARC-8.
2. Run Pbulkimport.
3. View log.imp file and note error messages. Example:
ERROR: Unparseable record written to error file: [13]-400 : c->8 no char to combine to in page 0 at 17 '1b285361
: ERROR: Unparseable record written to error file: [30]-400 : c->8 no char to combine to in page 4 at 8 '1b2842e2 .(B.'
: ERROR: Unparseable record written to error file: [44]-400 : c->8 no char to combine to in page 0 at 5 '1b285341 .(SA'
[57]-670 : c->8 no char to combine to in page 0 at 47 '1b285341 .(SA'
: ERROR: Unparseable record written to error file: [56]-451 : c->8 no char to combine to in page 0 at 15 '1b28536c .(Sl'
[58]-451 : c->8 no char to combine to in page 0 at 17 '1b28536a .(Sj'
[82]-670 : c->8 no char to combine to in page 0 at 54 '1b28536c .(Sl'
: ERROR: Unparseable record written to error file: [40]-451 : c->8 no char to combine to in page 0 at 12 '1b285372 .(Sr'
[42]-451 : c->8 no char to combine to in page 0 at 17 '1b28536c .(Sl'
[77]-670 : c->8 no char to combine to in page 0 at 1033 '1b285372 .(Sr'
4. Convert file to Unicode.
5. SysAdmin > Cataloging > Bulk Import Rules > Set the expected character set of the rule to be used to MARC21 UTF-8.
6. Run Pbulkimport with Unicode file.
7. View log.imp file and note lack of error messages.
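The hex dump at the end of each log entry in step 3 can be decoded to see which character set the record had switched to when the error occurred. The following is a minimal sketch, not part of Voyager: the function name `describe` and the regular expression are illustrative, the set names come from the MARC-8 escape sequence conventions (ESC, 0x28, then a final byte naming the G0 set), and reading the three-digit number after the bracket as a MARC field tag is an assumption based on the sample lines.

```python
import re

# Final byte of a MARC-8 "ESC (" sequence mapped to the set it designates as G0.
# Only a few common sets are listed; anything else is reported as unknown.
G0_SETS = {
    0x42: "Basic Latin (ASCII)",
    0x4E: "Basic Cyrillic",
    0x53: "Basic Greek",
}

# Matches entries such as:
# [57]-670 : c->8 no char to combine to in page 0 at 47 '1b285341 .(SA'
LOG_PATTERN = re.compile(r"\[(\d+)\]-(\d{3}) : .* at (\d+) '([0-9a-fA-F]+)")


def describe(line):
    """Return details of one log.imp error entry, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    hex_text = match.group(4)
    # Drop a dangling half-byte in case the dump was cut short.
    raw = bytes.fromhex(hex_text[: len(hex_text) // 2 * 2])
    charset = "no escape sequence"
    trailing = raw
    if raw[:2] == b"\x1b(" and len(raw) >= 3:
        charset = G0_SETS.get(raw[2], "unknown set 0x%02x" % raw[2])
        trailing = raw[3:]
    return {
        "tag": match.group(2),           # assumed to be the MARC field tag
        "offset": int(match.group(3)),   # the number after "at" in the message
        "charset": charset,
        "trailing_bytes": trailing.hex(),
    }
```

For the sample entries above, the sequences beginning `1b2853` decode as a switch to Basic Greek, while `1b2842e2` decodes as a switch back to Basic Latin followed by byte `e2`.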
Workaround: Convert the input file to UTF-8 (Unicode) using an external utility (e.g., MarcEdit or similar), then import it as UTF-8.
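To decide whether a file needs the workaround before running Pbulkimport, the input can be scanned for MARC-8 records that switch to the Basic Greek set. The sketch below is an illustration under stated assumptions rather than a supported tool: the function name `find_marc8_greek_records` is hypothetical, records are assumed to be in standard MARC 21 transmission format (terminated by byte 0x1D, with Leader position 9 set to `a` for Unicode and blank for MARC-8), and only escape sequences that designate Basic Greek (final byte `S`) are checked.

```python
RECORD_TERMINATOR = b"\x1d"

# MARC-8 escape sequences that designate Basic Greek (final byte "S"),
# as the G0 set (ESC ( and ESC ,) or as the G1 set (ESC ) and ESC -).
GREEK_ESCAPES = (b"\x1b(S", b"\x1b,S", b"\x1b)S", b"\x1b-S")


def find_marc8_greek_records(data):
    """Return zero-based positions of MARC-8 records that switch to Basic Greek."""
    hits = []
    for index, record in enumerate(data.split(RECORD_TERMINATOR)):
        record = record.lstrip(b"\r\n")  # tolerate line breaks between records
        if len(record) < 24:
            continue  # too short to hold a leader (e.g. trailing empty piece)
        is_unicode = record[9:10] == b"a"  # Leader/09: "a" means UCS/Unicode
        if not is_unicode and any(esc in record for esc in GREEK_ESCAPES):
            hits.append(index)
    return hits
```

Reading the import file in binary mode and passing its bytes to the function returns the positions of the records likely to trigger the error; an empty list suggests the file contains no MARC-8 Greek escape sequences.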
Resolution:
- Article last edited: 3/2/2015