
    Apostrophes pasted from Microsoft Word

    • Article Type: General
    • Product: Aleph
    • Product Version: 20

    Description:
    Catalogers started editing authority records by copying text from Microsoft Word. In doing so, they pasted text containing Unicode apostrophes (which are common in French). As a result, our records now contain a mix of the Unicode curly apostrophe (code point U+2019) and the usual ASCII apostrophe (code point 27 hex, or U+0027).

    I need help finding a way to convert all U+2019 apostrophes back to U+0027 apostrophes.

    Resolution:
    Another site reported the same issue: text pasted from Microsoft Word into the Aleph Search fields returned incorrect results.

    (At the end of this article you can find instructions from Microsoft on how to turn off the Unicode apostrophe in Word.)


    In both cases, we suggest changing the character conversion tables as follows:

    In case of Searches:

    table: unicode_to_word_gen

    Change from:

    2018 2018 #LEFT SINGLE QUOTATION MARK
    2019 2019 #RIGHT SINGLE QUOTATION MARK

    To:
    2018 0027 #LEFT SINGLE QUOTATION MARK
    2019 0027 #RIGHT SINGLE QUOTATION MARK
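
    The effect of the two changed rows can be illustrated outside Aleph. The following is a minimal Python sketch, not Aleph code: it mirrors the same two mappings with a translation table. The helper name normalize_apostrophes and the sample string are assumptions made for illustration only.

```python
# Illustration only: mirrors the two table rows above in Python.
# This is not Aleph code; normalize_apostrophes is a hypothetical helper name.
APOSTROPHE_MAP = {
    0x2018: 0x0027,  # LEFT SINGLE QUOTATION MARK  -> APOSTROPHE
    0x2019: 0x0027,  # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
}


def normalize_apostrophes(text: str) -> str:
    """Return text with U+2018 and U+2019 replaced by U+0027."""
    return text.translate(APOSTROPHE_MAP)


if __name__ == "__main__":
    # Sample string as it might arrive when pasted from Word.
    pasted = "l\u2019histoire de l\u2018art"
    print(normalize_apostrophes(pasted))
```

    All other characters, including accented letters, pass through unchanged, which is also what the two-row table change is meant to achieve for search terms.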



    In order to fix the records:

    1) Use line_utf2line_utf.template under $alephe_unicode to create a new conversion table.
    The contents of the table should be:

    2018 0027 #LEFT SINGLE QUOTATION MARK
    2019 0027 #RIGHT SINGLE QUOTATION MARK

    2) Add the new table to the tab_character_conversion_line (under the same directory), with a new entry, for example:

    APOSTROPHE ##### L line_utf2line_utf <your new table from line_utf2line_utf.template>


    3) Run p-manage-22 with Conversion routine APOSTROPHE.

    4) Another option is to add the new table from step 1 to entry "Z" in tab_character_conversion_line, as follows:

    Z ##### # line_utf2line_utf <your new table from line_utf2line_utf.template>


    and then set up a new fix routine in tab_fix using the "fix_doc_char_conv_z" program.

    The advantage is that this fix routine can be used when editing or saving the record in Cataloging.
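
    Before and after running the conversion, it can be useful to check which lines of a record export still contain curly apostrophes. The following Python sketch assumes the records have been exported to a UTF-8 plain-text file; the export step, the file name passed on the command line, and the helper name find_curly_apostrophes are assumptions and are not part of Aleph.

```python
import sys
from typing import Iterable, List, Tuple

# The two characters targeted by the conversion table in step 1.
CURLY = ("\u2018", "\u2019")


def find_curly_apostrophes(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, line text) for each line containing U+2018 or U+2019."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(ch in line for ch in CURLY):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical usage: python find_curly.py exported_records.txt
    if len(sys.argv) == 2:
        with open(sys.argv[1], encoding="utf-8") as handle:
            for number, text in find_curly_apostrophes(handle):
                print(f"{number}: {text}")
```

    An empty result after the conversion suggests that no U+2018 or U+2019 characters remain in the exported records.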


    **********************************************************
    See also the instructions from Microsoft Word on how to turn the Unicode/ASCII character conversion on and off:

    Change curly quotes to straight quotes and vice versa
    Microsoft Word automatically changes straight quotation marks ( ' or " ) to curly (smart or typographer's) quotes ( ‘ ’ or “ ” ) as you type.

    To turn this feature on or off:

    On the Tools menu, click AutoCorrect Options, and then click the AutoFormat As You Type tab.
    Under Replace as you type, select or clear the "Straight quotes" with "smart quotes" check box.
    Note: You can find and replace all instances of single or double curly quotes with straight quotes in your document. To do this, clear the "Straight quotes" with "smart quotes" check box on the AutoFormat As You Type tab. On the Edit menu, click Replace. In both the Find what and Replace with boxes, type ' or ", and then click Find Next or Replace All.

    To replace all straight quotes with curly quotes, select the "Straight quotes" with "smart quotes" check box, and repeat the find and replace procedure.

    ---------------------------

    The straight apostrophe in MS Word is U+0027 (APOSTROPHE).
    The curly/smart/typographer's apostrophe in MS Word is U+2019 (RIGHT SINGLE QUOTATION MARK).

    ---------------------------
    Outlook 2003

    In previous versions of Microsoft Outlook, the default e-mail editor was the Outlook editor. You could change the editor to Microsoft Word if you wanted. In this version, the default e-mail editor is Word, so you can take advantage of features such as:
    [...]
    AutoFormat: Format your message automatically as you type, and add formatting to plain text messages that you receive.

    --------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Additional Information

    apostrophe, unicode


    • Article last edited: 10/8/2013