How to put Unicode characters in a fix
- Article Type: General
- Product: Aleph
- Product Version: 20
Desired Outcome Goal:
How can complex Unicode characters, such as U+FE52, be used in a fix routine?
Procedure:
1. For "simple" Unicode characters, use the decimal value preceded by a backslash, as follows:
1 949## REPLACE-STRING \101,\102
Note that the system reads only the first three characters after the backslash.
2. For "complex" Unicode characters, such as U+FE52, the decimal value is "65106". If you put it in the generic fix table with a backslash (\65106), the system will translate only "651" and will append "06" after it.
3. In such a case, put "CHARACTER_CONVERSION=8859_8_TO_UTF" (where '8859_8_TO_UTF' is the value from Col.1 of $alephe_unicode/tab_character_conversion_line) at the beginning of the line that contains the special character.
4. For Unicode characters, use the line_utf2line_utf program with a table based on the templates "line_utf2line_utf.template" and "line_utf2line_utf.extended.template" (also under the $alephe_unicode directory).
Note: An example of such usage, using OCLC_UTF_TO_UTF, is explained in the article "Converting OCLC's de-composed form of Unicode to ALEPH composed form".
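The arithmetic behind steps 1 and 2 can be checked with a short Python sketch. It is not part of Aleph and does not reproduce the system's actual parsing code; it only models the behavior described above (three characters read after the backslash) to show why \65106 is cut short. The function names are hypothetical and exist only for this illustration.

```python
def decimal_escape(codepoint_hex):
    """Return the backslash-decimal form of a code point given in hex.

    Example: 'FE52' -> '\\65106' (a backslash followed by the decimal value).
    """
    return "\\" + str(int(codepoint_hex, 16))


def simulate_three_digit_read(escape):
    """Model the limit described in the article: only the first three
    characters after the backslash are read as the value, and the rest
    is left over as literal text. This is an assumption-based model,
    not Aleph's real implementation.
    """
    digits = escape.lstrip("\\")
    return int(digits[:3]), digits[3:]


def fits_simple_escape(codepoint_hex):
    """True when the decimal value has at most three digits, so the
    backslash form is not truncated under the three-character limit."""
    return int(codepoint_hex, 16) <= 999


if __name__ == "__main__":
    escape = decimal_escape("FE52")
    value_read, leftover = simulate_three_digit_read(escape)
    print(escape)                      # the full escape for U+FE52
    print(value_read, leftover)        # what a three-character read keeps and drops
    print(fits_simple_escape("FE52"))  # whether the simple form is safe
```

For U+FE52 the check returns False, which is the signal to use one of the conversion-table approaches in steps 3 and 4 instead of the backslash form.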
Additional Information
See the attachment "Homemade Fix Procedures - Examples". Example #7 shows how to call a section of tab_character_conversion_line, such as 8859_8_TO_UTF.
Attachment
Homemade Fix Procedures - Examples
Homemade Fix Procedures - Parameters
- Article last edited: 10/8/2013