What character encoding is preferred for bulk importing records into Voyager?

Product: Voyager

Question

Is MARC21 UTF-8 or MARC21 MARC-8 character encoding preferred for bulk importing records into Voyager?

Answer

Voyager stores records in MARC21 UTF-8, and converts MARC21 MARC-8 records to MARC21 UTF-8 on import when MARC21 MARC-8 is selected as the Expected Character Set.

Voyager can handle either encoding, provided that encoding is selected as the Expected Character Set in the Bulk Import rule; Voyager converts the records as necessary¹. UTF-8 may be simpler because no conversion is involved, but either can be selected as long as the character set matches that of the records in the import file and the encoding is consistent (that is, all records in the file are encoded the same way).

You can determine the character coding scheme by checking byte 9 of the MARC Leader (Leader position 09, counting from 0): a blank in byte 9 indicates the record is MARC-8, and an "a" indicates the record is UCS/Unicode.
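
The Leader check described above can be scripted to confirm that an import file is consistently encoded before running Bulk Import. The following is a minimal sketch, not a Voyager utility: it assumes the file is in binary MARC (ISO 2709) format, where each record starts with a 24-byte Leader whose first five bytes hold the record length. The function names (iter_leaders, coding_scheme, summarize_encodings) are illustrative only.

```python
from collections import Counter
from typing import BinaryIO, Iterator

LEADER_LENGTH = 24  # a MARC Leader is always 24 bytes


def iter_leaders(stream: BinaryIO) -> Iterator[bytes]:
    """Yield the Leader of each record in a binary MARC (ISO 2709) stream."""
    while True:
        leader = stream.read(LEADER_LENGTH)
        if not leader:
            return  # clean end of file
        if len(leader) < LEADER_LENGTH:
            raise ValueError("truncated Leader at end of file")
        # Leader positions 00-04 hold the total record length as ASCII digits.
        record_length = int(leader[0:5])
        if record_length < LEADER_LENGTH:
            raise ValueError("record length is shorter than the Leader")
        # Skip the rest of the record (directory, fields, record terminator).
        rest = stream.read(record_length - LEADER_LENGTH)
        if len(rest) < record_length - LEADER_LENGTH:
            raise ValueError("truncated record")
        yield leader


def coding_scheme(leader: bytes) -> str:
    """Map Leader position 09 to the character coding scheme it declares."""
    flag = leader[9:10]
    if flag == b" ":
        return "MARC-8"
    if flag == b"a":
        return "UCS/Unicode"
    return "unknown"


def summarize_encodings(stream: BinaryIO) -> Counter:
    """Count how many records declare each coding scheme."""
    return Counter(coding_scheme(leader) for leader in iter_leaders(stream))
```

To use it, open the import file in binary mode, for example open("records.mrc", "rb") (a hypothetical file name), and pass it to summarize_encodings. If the result contains more than one key, the file mixes encodings and would need to be split or corrected before import. Note that this only reports what each Leader declares; it does not verify that the record's bytes actually match that declaration.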

Additional Information

Only bibliographic data are stored as Unicode (MARC21 UTF-8) in Voyager; the rest is stored as Latin1. See the Voyager Technical User's Guide (various sections) for further details.

¹Characters that can't be converted will produce an "Invalid MARC21 Character" error. See: What does "Invalid MARC21 Character" error mean in cataloging client?

Article last edited: 09-May-2020