Transformation Routines
Routine Name | Description | Example |
---|---|---|
Add period at the end
|
Adds a period to the end of the field if the field does not end with one of the following symbols: '!' or '?'.
|
|||||||||
Add to beginning of string
|
Adds the string defined in the parameter before the content of the field.
|
Parameter: ISBN:
Input: 123-45-678-90
Output: ISBN: 123-45-678-90
|
||||||||
Add to end of string
|
Adds the string defined in the parameter after the content of the field.
|
Parameter: (ISBN)
Input: 123-45-678-90
Output: 123-45-678-90 (ISBN)
|
||||||||
Assign to AZ list |
Used by Journal Search and Database Search to transform titles and database names to the following categories:
|
Input 1: Journal of Chemistry
Output 1: J
Input 2:
Output 2: others
Input 3: 1040 Instructions
Output 3: 0-9
|
||||||||
Character Conversion
|
Converts the characters in the field using a character conversion table.
|
|||||||||
Complete End Date |
Converts an end date (YYYY or YYYYMM) to a complete end date (YYYYMMDD) based on the input format. If the input already contains a complete date, the date is copied without transformation. |
Input 1: 1990
Output 1: 19901231
Input 2: 199003
Output 2: 19900331
Input 3: 899
Output 3: 08991231
|
||||||||
Complete Start Date |
Converts a start date (YYYY or YYYYMM) to a complete start date (YYYYMMDD) based on the input format. If the input already contains a complete date, the date is copied without transformation. |
Input 1: 1990
Output 1: 19900101
Input 2: 199003
Output 2: 19900301
Input 3: 899
Output 3: 08990101
|
||||||||
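The two date-completion routines above can be sketched as follows. This is a minimal illustration rather than the product's implementation: it assumes that inputs shorter than four digits are left-padded with zeros (as the `899` example suggests) and that the last day of a month follows the standard calendar. The function names are hypothetical.

```python
import calendar


def _pad_year(value: str) -> str:
    """Left-pad inputs shorter than four digits (assumption: '899' -> '0899')."""
    return value.strip().zfill(4)


def complete_start_date(value: str) -> str:
    """Expand YYYY or YYYYMM to the first day of the period (YYYYMMDD)."""
    value = _pad_year(value)
    if len(value) == 4:      # YYYY -> January 1
        return value + "0101"
    if len(value) == 6:      # YYYYMM -> first day of that month
        return value + "01"
    return value             # already complete: copied without transformation


def complete_end_date(value: str) -> str:
    """Expand YYYY or YYYYMM to the last day of the period (YYYYMMDD)."""
    value = _pad_year(value)
    if len(value) == 4:      # YYYY -> December 31
        return value + "1231"
    if len(value) == 6:      # YYYYMM -> last day of that month
        year, month = int(value[:4]), int(value[4:6])
        last_day = calendar.monthrange(year, month)[1]
        return f"{value}{last_day:02d}"
    return value             # already complete: copied without transformation
```

Using `calendar.monthrange` keeps February correct in leap years without a hand-written table of month lengths.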
ConvertISBN13toISBN10 |
Converts a 13-digit ISBN to a 10-digit ISBN if possible. The input should be a 10- or 13-digit ISBN with or without hyphens.
Parameters: none
Output: A 10-digit ISBN without hyphens.
|
Input: 9780747599609
Output: 0747599602
|
||||||||
ConvertToISBN13
|
Converts an ISBN to a 13-digit ISBN. The input should be a 10- or 13-digit ISBN with or without hyphens.
Parameters: none
Output: A 13-digit ISBN without hyphens.
|
Input: 0747599602
Output: 9780747599609
|
||||||||
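The ISBN conversions above rest on the standard check-digit rules for ISBN-10 (weights 10 down to 2, modulo 11) and ISBN-13 (alternating weights 1 and 3, modulo 10). The sketch below shows that arithmetic; the function names are hypothetical, and raising `ValueError` when a conversion is not possible is an assumption, since the table does not say what the routines do in that case.

```python
def _clean(isbn: str) -> str:
    """Remove hyphens and spaces; keep digits and a possible 'X' check digit."""
    return isbn.replace("-", "").replace(" ", "").upper()


def to_isbn13(isbn: str) -> str:
    """Return the 13-digit form of a 10- or 13-digit ISBN, without hyphens."""
    s = _clean(isbn)
    if len(s) == 13:
        return s
    if len(s) != 10:
        raise ValueError("expected a 10- or 13-digit ISBN")
    core = "978" + s[:9]  # drop the old check digit, add the 978 prefix
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


def to_isbn10(isbn: str) -> str:
    """Return the 10-digit form of a 10- or 13-digit ISBN, without hyphens."""
    s = _clean(isbn)
    if len(s) == 10:
        return s
    if len(s) != 13 or not s.startswith("978"):
        # Assumption: only 978-prefixed ISBN-13 values have an ISBN-10 form.
        raise ValueError("no 10-digit equivalent for this ISBN")
    core = s[3:12]  # drop the prefix and the old check digit
    total = sum(int(d) * w for d, w in zip(core, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))
```

With the table's examples, `to_isbn10("9780747599609")` yields `0747599602` and `to_isbn13("0747599602")` yields `9780747599609`.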
Copy as is |
This is the default transformation, which copies the source data without making any changes. |
Input: 0747599602
Output: 0747599602
|
||||||||
Define subfield delimiter
|
Defines a delimiter between subfields. This routine removes the need for a separate rule per subfield. The same delimiter is used for all subfields.
|
Parameter: ^--^
Input: $$aUniversities and colleges $$xChildren$$xRepublicans
Output: Universities and colleges -- Children -- Republicans
|
||||||||
Delete Characters
|
Deletes the characters specified in the parameter.
|
Parameter: '
Input: O'brien
Output: Obrien
|
||||||||
Delete spaces
|
Deletes spaces.
|
|||||||||
Drop non-Filing Text
|
For MARC, drops non-filing text based on indicator 1 or 2. The parameter is the indicator:
"@@ind1@@" or "@@ind2@@"
|
Parameter: @@ind1@@
Input: The journal of the AAA
Output: journal of the AAA
|
||||||||
Extract and arrange XML elements
|
Extracts child nodes from an XML element in a specific order when there are multiple occurrences of the XML element. The transformation handles every element separately.
The input to the transformation should be a simple XML structure, such as the following:
<parent>
<child1>data1</child1>
<child2>data2</child2>
<child3>data3</child3>
</parent>
Parameter Notes:
|
Input: child2;child1@@\s
Output: <child2>data2</child2> <child1>data1</child1> <child3>data3</child3>
Input: child2;child1@@\s@@D
Output: <child2>data2</child2> <child1>data1</child1>
Input: child2;child1@@\s@@D@@D
Output: data2 data1
Input: child2;child1@@@@D@@D
Output: data2data1
|
||||||||
Format Date
|
Formats dates to be in the structure:
YYYY-MM-DD hh:mm:ss
|
Input: 20020418155342
Output: 2002-04-18 15:53:42
Input: 20020418
Output: 2002-04-18
|
||||||||
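A minimal sketch of the Format Date behavior, assuming the input is either an 8-digit date (YYYYMMDD) or a 14-digit timestamp (YYYYMMDDhhmmss), as in the two examples above. The function name is hypothetical and no validation of the digits is attempted.

```python
def format_date(value: str) -> str:
    """Format YYYYMMDD[hhmmss] as 'YYYY-MM-DD[ hh:mm:ss]'."""
    v = value.strip()
    date = f"{v[0:4]}-{v[4:6]}-{v[6:8]}"
    if len(v) >= 14:
        # A time part is present: append it after a single space.
        return f"{date} {v[8:10]}:{v[10:12]}:{v[12:14]}"
    return date
```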
Format number
|
Adds leading zeros to create a seven-digit number.
Use another transformation to remove commas or periods within the number.
|
Input: 10000
Output: 0010000
|
||||||||
Format End Date |
Transforms the date or date range specified in the input to a formatted end date. This routine handles the input as follows:
|
Input 1: 1995-1999
Output 1: 1999
Input 2: [1995-1999]
Output 2: 1999
Input 3: 1995-
Output 3: 9999
Input 4: 19uu
Output 4: 1999
|
||||||||
Format Start Date |
Transforms the date or date range specified in the input to a formatted start date. This routine handles the input as follows:
|
Input 1: 1995-1999
Output 1: 1995
Input 2: [1995-1999]
Output 2: 1995
Input 3: 1995-
Output 3: 1995
Input 4: 19uu
Output 4: 1901
|
||||||||
Format URL
|
Formats special characters for URLs:
|
"%" -> "%25"
"\\$" -> "%24"
"&" -> "%26"
"\\+" -> "%2B"
"," -> "%2C"
"/" -> "%2F"
":" -> "%3A"
";" -> "%3B"
"=" -> "%3D"
"\\?" -> "%3F"
"@" -> "%40"
"\\s" -> "%20"
"\"" -> "%22"
"<" -> "%3C"
">" -> "%3E"
"#" -> "%23"
"\\{" -> "%7B"
"\\}" -> "%7D"
"\\|" -> "%7C"
"\\\\" -> "%5C"
"\\^" -> "%5E"
"~" -> "%7E"
"\\[" -> "%5B"
"\\]" -> "%5D"
"`" -> "%60"
This transformation is not currently required.
|
||||||||
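The character mapping listed for Format URL can be expressed as a lookup table applied one character at a time. The sketch below is an illustration under two assumptions: the function name is hypothetical, and the `\\s` entry is read as "any whitespace character". Substituting per character, rather than running one replacement after another, avoids re-encoding the `%` signs that earlier replacements introduce.

```python
URL_ESCAPES = {
    "%": "%25", "$": "%24", "&": "%26", "+": "%2B", ",": "%2C",
    "/": "%2F", ":": "%3A", ";": "%3B", "=": "%3D", "?": "%3F",
    "@": "%40", '"': "%22", "<": "%3C", ">": "%3E", "#": "%23",
    "{": "%7B", "}": "%7D", "|": "%7C", "\\": "%5C", "^": "%5E",
    "~": "%7E", "[": "%5B", "]": "%5D", "`": "%60",
}


def format_url(value: str) -> str:
    """Percent-encode the special characters listed in the table."""
    encoded = []
    for ch in value:
        if ch.isspace():
            encoded.append("%20")  # assumption: every whitespace char maps to %20
        else:
            encoded.append(URL_ESCAPES.get(ch, ch))
    return "".join(encoded)
```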
Format Year
|
Gets the first four characters of the given string and replaces all characters that are not digits with the specified parameter.
|
Parameter: ?
Input: 194u
Output: 194?
|
||||||||
Get author first name
|
Retrieves the author's first name for Latin languages. It will take all characters following the first comma.
|
Input: Lippe, Ole von der
Output: Ole von der
|
||||||||
Get author first name non latin languages |
Retrieves the author's first name for non-Latin languages (such as Hebrew). It will take all characters following the first comma. For Arabic customers, refer to the following article: How to use the Get Author First and Last Name Non Latin Transformations for Arabic. |
Input: דיקמן, עמינדב
Output: עמינדב
|
||||||||
Get author last name
|
Retrieves the author's last name for Latin languages. It will take all characters up to the first comma.
|
Input: Lippe, Ole von der
Output: Lippe
|
||||||||
Get author last name non latin languages |
Retrieves the author's last name for non-Latin languages (such as Hebrew). It will take all characters up to the first comma. For Arabic customers, refer to the following article: How to use the Get Author First and Last Name Non Latin Transformations for Arabic. |
Input: דיקמן, עמינדב
Output: דיקמן
|
||||||||
Get author first last name
|
Returns the author’s first and last name.
|
Input: Marshall, John B
Output: John B Marshall
|
||||||||
Get author last first name
|
Returns the author’s last and first name.
|
Input: John B Marshall
Output: Marshall, John B
|
||||||||
GetHeadTail
|
Returns the specified number of characters from the beginning and end of the input string. The following format is used for the Parameter field, where <param1> is the number of characters taken from the beginning and <param2> is the number of characters taken from the end:
<param1>@@<param2>
If you do not specify a value for <param2>, only characters from the beginning will be taken.
If the input string has fewer characters than specified in either parameter, the system returns the entire string.
|
Parameter: 20@@5
Input: “England and France during the hundred years war”
Output: “England and France ds war”
|
||||||||
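The GetHeadTail parameter format can be illustrated with a short sketch. The function name is hypothetical, and the handling of short inputs follows a literal reading of the description (the whole string is returned when it has fewer characters than either parameter).

```python
def get_head_tail(value: str, parameter: str) -> str:
    """Return the first <param1> and last <param2> characters of value."""
    head_str, _, tail_str = parameter.partition("@@")
    head = int(head_str)
    tail = int(tail_str) if tail_str else 0  # no <param2>: head only
    if len(value) < head or len(value) < tail:
        return value  # input shorter than a parameter: return it unchanged
    return value[:head] + (value[-tail:] if tail else "")
```

For the example above, `get_head_tail("England and France during the hundred years war", "20@@5")` returns `England and France ds war`.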
Get highest number |
Returns the highest number from the input field. |
Input: 112 pages, 2 ill
Output: 112
|
||||||||
Get highest number and normalize last digit |
Returns the highest number from the input field and changes the last digit to 0. |
Input: 112 pages, 2 ill
Output: 110
|
||||||||
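Both "highest number" routines can be sketched with a regular expression that collects every run of digits. The function names are hypothetical, and returning an empty string when the field contains no digits is an assumption that the table does not confirm.

```python
import re


def get_highest_number(value: str) -> str:
    """Return the largest integer found in the field, as a string."""
    numbers = [int(n) for n in re.findall(r"\d+", value)]
    return str(max(numbers)) if numbers else ""


def get_highest_number_normalized(value: str) -> str:
    """Return the largest integer with its last digit changed to 0."""
    highest = get_highest_number(value)
    return highest[:-1] + "0" if highest else ""
```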
Include/Exclude Subfields (starts with)
|
This transformation allows the following options:
|
The following example also includes the Split Field BBBBB transformation, which is used in case the subfield includes multiple URIs delimited by BBBBB.
Parameter: include@@(Uri)@@BBBBB
Input:
Output:
(Uri)http://viaf.org/viaf/sourceID/LC|no2008011383
Parameter: exclude@@(Uri)
Input:
Output:
1021-01BC_INST-98137743588801021
|
||||||||
Lower case
|
Changes case to lower. There is no parameter.
|
Input: History of books
Output: history of books
|
||||||||
Normalize author
|
Keeps the last name of the author and the first character of the first name.
This routine normalizes authors for the Author facet.
|
Input: Lippe, Ole von der
Output: Lippe, O
|
||||||||
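A minimal sketch of the author normalization described above, assuming names arrive as "Last, First" and that a name without a comma is returned unchanged (the table does not cover that case). The function name is hypothetical.

```python
def normalize_author(value: str) -> str:
    """Keep the last name and the first character of the first name."""
    last, sep, first = value.partition(",")
    first = first.strip()
    if not sep or not first:
        return last.strip()  # assumption: nothing to abbreviate
    return f"{last.strip()}, {first[0]}"
```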
NormalizeDiacritics |
Normalizes the input string based on source and target codes defined in the DiacriticsConversion mapping table.
Parameter: none
Output: The normalized string.
The DiacriticsConversion mapping table contains the following columns:
|
Source UniCode: 00D8
Target UniCode: 004F
Output: Converts a Latin O with stroke to upper case O
|
||||||||
Put subfields in separate fields
|
Creates separate PNX fields for every occurrence of a subfield within a field.
|
Input: $$aeng$$afre$$agre
Output:
eng
fre
gre
|
||||||||
RemoveLeadingStringFromList |
Removes a leading string from the input. The leading strings (such as articles) that you want removed must be defined in a normalization mapping table.
Parameter: The code for the normalization mapping table that lists the strings to be removed from the beginning of the input.
Output: The input without the leading string.
For example, you can create a normalization mapping table called LeadingArticles and include a list of articles to remove:
Each string must be entered in both columns as shown above.
|
Parameter: LeadingArticles
Input: a report to congress
Output: report to congress
|
||||||||
RemoveStringFromList |
Removes all occurrences of a string from the input. The strings that you want removed must be defined in a normalization mapping table.
Parameter: The name of the normalization mapping table that lists the strings to be removed from the input.
Output: The input with the specified strings removed.
For example, you can create a normalization mapping table called RemoveStrings and include a list of strings to remove:
Each string must be entered in both columns as shown above.
|
Parameter: RemoveStrings
Input: War and Peace
Output: War Peace
|
||||||||
Remove characters from the end
|
Removes the last character from the input if it matches any character specified in the Parameter field.
This routine removes a single character only. If you want to remove a string of characters, you can either repeat this routine as many times as necessary or utilize the Remove string from the end routine.
|
Parameter: :,=;/]
Input:
New york: Blackwell,
Output:
New york: Blackwell
|
||||||||
Remove HTML tags
|
Removes HTML tags from XML content.
|
Input:
Bats Adjust Their 'Field-of-View': Use of <test1>Biosonar</test1> Is More Advanced <test2>Than</test2> Thought,
Output:
Bats Adjust Their 'Field-of-View': Use of Biosonar Is More Advanced Than Thought
|
||||||||
Remove Leading Characters
|
Removes the first character from the input if it matches any character specified in the Parameter field.
This routine removes a single character only. If you want to remove a string of characters, you can either repeat this routine as many times as necessary or utilize the Remove Leading String routine.
|
Parameter: [({
Input:
[1948]
Output:
1948]
|
||||||||
Remove string from the end
|
Removes the string specified in the parameter from the end of the field.
|
Parameter: (ISBN)
Input: 675484451 (ISBN)
Output: 675484451
|
||||||||
Remove Leading String
|
Removes the string specified in the parameter from the beginning of the field.
|
Parameter: (ISBN)
Input: (ISBN) 675484451
Output: 675484451
|
||||||||
Remove Punctuation
|
Removes the following punctuation characters from the field, replacing each with a blank:
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
Punctuation defined in the parameter will not be deleted.
|
Parameter: $
Input: Cost of item: 121$
Output: Cost of item 121$
|
||||||||
Replace Characters
|
Replaces the characters specified in the parameter with characters specified in the second part of the parameter:
<characters to replace>@@<replacement string>
If there is no <replacement string>, the characters will just be removed.
|
Parameter: .",@@^
Input: History of the U.S.A.
Output: History of the USA
|
||||||||
Replace Spaces by String
|
Replaces all spaces by the string defined in the parameter.
A parameter is required.
|
Parameter: ^;^
Input: eng fre ger
Output: eng; fre; ger
|
||||||||
Replace nonnumeric chars in range |
Replaces each non-numeric character between the <start> and <end> character positions with a question mark.
Parameter: <start>@@<end>
|
Parameter: 0@@3
Input: Ab56
Output: ??56
|
||||||||
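The range-based replacement can be sketched as below. The table does not state whether positions are zero-based or whether the end position is included; this sketch assumes zero-based positions with an inclusive end, which is consistent with the `0@@3` example. The function name is hypothetical.

```python
def replace_nonnumeric_in_range(value: str, parameter: str) -> str:
    """Replace non-digits between <start> and <end> with '?'."""
    start_str, _, end_str = parameter.partition("@@")
    start, end = int(start_str), int(end_str)
    return "".join(
        # Assumption: zero-based positions, end position included.
        "?" if start <= i <= end and not ch.isdigit() else ch
        for i, ch in enumerate(value)
    )
```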
Replace start and end angle brackets by parentheses
|
Replaces < and > with ( and ).
|
|||||||||
Replace string by string
|
Replaces the string specified in the parameter by the second string specified in the parameter. Use @@ to separate the parameters. Note that this transformation is case sensitive.
|
Parameter: Test@@TEST
Input: Test
Output: TEST
|
||||||||
Split Data of Fixed Length
|
Splits fixed length data to substrings of the length specified in the parameter. The resulting substrings are delimited by spaces.
|
Parameter: 3
Input: engfreger
Output: eng fre ger
|
||||||||
Split by pattern
|
Splits a string into substrings that match the defined pattern. The substrings are delimited by spaces.
|
Parameter: (.{3})
Input: engdutheb
Output: eng dut heb
|
||||||||
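The two splitting routines above differ only in how the substrings are found: by a fixed length or by a regular expression. A minimal sketch of both, with hypothetical function names and Python's `re` module standing in for whatever regular expression engine the product uses:

```python
import re


def split_fixed_length(value: str, length: int) -> str:
    """Cut value into chunks of `length` characters, joined by spaces."""
    return " ".join(value[i:i + length] for i in range(0, len(value), length))


def split_by_pattern(value: str, pattern: str) -> str:
    """Join every non-overlapping match of `pattern` with spaces."""
    return " ".join(m.group(0) for m in re.finditer(pattern, value))
```

For example, `split_fixed_length("engfreger", 3)` and `split_by_pattern("engdutheb", r"(.{3})")` both produce three space-separated language codes.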
Split Field
|
Splits a field into separate PNX fields based on the delimiter defined in the parameter.
|
Parameter: ;
Input: eng;spa;ger
Output:
eng
spa
ger
Every language code will be a separate field.
|
||||||||
Upper case every first letter
|
Changes the first letter of every word to uppercase, leaving all others in their original case. A word is defined as text that is separated by whitespace or punctuation.
This transformation accepts a normalization mapping table name as a parameter, which can be used to define words that should be ignored by the transformation. The Source column contains the word that you want to exclude, and the Target column contains the transformation. The same word can be added to both the Source and Target columns to ensure that they are not transformed at all. For example:
Source: and
Target: and
Input: barnes and noBle
Output: Barnes and NoBle
|
Input: A loNg and winding road
Output: A LoNg And Winding Road
|
||||||||
Upper case every first letter (lower case others)
|
Changes the first letter of every word to uppercase and makes sure that all other characters are lowercase. A word is defined as text that is separated by whitespace or punctuation.
This transformation accepts a normalization mapping table name as a parameter, which can be used to define words that should be ignored by the transformation. The Source column contains the word that you want to exclude, and the Target column contains the transformation. The same word can be added to both the Source and Target columns to ensure that they are not transformed at all. For example:
Source: and
Target: and
Input: barnes and noBle
Output: Barnes and Noble
|
Input: A loNg and winding road
Output: A Long And Winding Road
|
||||||||
Upper case every first letter - whitespace only (lower case others)
|
Changes the first letter of every word to uppercase and makes sure that all other characters are lowercase. A word is defined as text that is separated by whitespace only.
This transformation accepts a normalization mapping table name as a parameter, which can be used to define words that should be ignored by the transformation. The Source column contains the word that you want to exclude, and the Target column contains the transformation. The same word can be added to both the Source and Target columns to ensure that they are not transformed at all. For example:
Source: and
Target: and
Input: barnes and noBle
Output: Barnes and Noble
|
Input: A loNg and winding road
Output: A Long And Winding Road
|
||||||||
Upper case every first letter - whitespace only
|
Changes the first letter of every word to uppercase, leaving all others in their original case. A word is defined as text that is separated by whitespace only.
This transformation accepts a normalization mapping table name as a parameter, which can be used to define words that should be ignored by the transformation. The Source column contains the word that you want to exclude, and the Target column contains the transformation. The same word can be added to both the Source and Target columns to ensure that they are not transformed at all. For example:
Source: and
Target: and
Input: barnes and noBle
Output: Barnes and NoBle
|
Input: A loNg and winding road
Output: A LoNg And Winding Road
|
||||||||
Take substring
|
Retrieves a substring that starts at the defined character position and contains the defined number of characters. Use @@ to separate the parameters. The count starts from 0.
This routine is useful for extracting characters from specific positions in MARC control fields. For example, to extract the year from the 008 field, enter 7@@4 (that is, take 4 characters from position 7).
|
Parameter: 7@@4
Input: 831024s1984 mau b 00110 eng
Output: 1984
|
||||||||
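In slicing terms, the `<position>@@<length>` parameter works as sketched below. The function name is hypothetical; positions are zero-based, as the description states.

```python
def take_substring(value: str, parameter: str) -> str:
    """Return <length> characters starting at zero-based <position>."""
    start_str, _, length_str = parameter.partition("@@")
    start, length = int(start_str), int(length_str)
    return value[start:start + length]
```

With the 008 example, `take_substring("831024s1984 mau b 00110 eng", "7@@4")` returns `1984`.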
Take first words
|
Retrieves the defined number of words from the beginning of the field. A word is any string between blanks.
|
Parameter: 5
Input: A history of the middle ages in the 13th and 14th centuries
Output: A history of the middle
|
||||||||
Take first subfields
|
This transformation can be used when a single field can have multiple occurrences of a specific subfield and only a limited number of occurrences should be taken.
|
Parameter: 1
Input: $$z1458998797$$z8976439871
Output: 1458998797
|
||||||||
Take characters from the end
|
Retrieves the defined number of characters from the end of the field.
|
|||||||||
Take from first occurrence
|
Retrieves the rest of the string from the first occurrence of the substring or character specified in the first parameter. Using the second parameter it is possible to indicate whether the string defined in the first parameter should be taken or not. Use 0 to exclude the string and 1 to include it. The default is 0.
|
Parameter: ,@@0
Input: Blackstone, John
Output: John
Parameter: ,@@1
Input: Blackstone, John
Output: , John
|
||||||||
Take from last occurrence
|
Retrieves the rest of the string from the last occurrence of the specified substring or character. Using the second parameter it is possible to indicate whether the string defined in the first parameter should be taken or not. Use 0 to exclude the string and 1 to include it. The default is 0.
|
|||||||||
Take until first occurrence
|
Retrieves the string until the first occurrence of the specified substring or character. Using the second parameter it is possible to indicate whether the string defined in the first parameter should be taken or not. Use 0 to exclude the string and 1 to include it. The default is 0.
Do not use this transformation with a space as the parameter. Instead, use the Take first words routine.
|
Parameter: ,@@1
Input: Blackstone, John
Output: Blackstone,
|
||||||||
Take until last occurrence
|
Retrieves the string until the last occurrence of the specified substring or character. Using the second parameter it is possible to indicate whether the string defined in the first parameter should be taken or not. Use 0 to exclude the string and 1 to include it. The default is 0.
|
|||||||||
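The four "take from/until first/last occurrence" routines share one parameter format, `<string>@@<0|1>`, and differ only in which occurrence is located and which side of it is kept. The sketch below makes that explicit. The function names are hypothetical, and two behaviors are assumptions: the result is trimmed of surrounding whitespace (suggested by the `John` output above), and the input is returned unchanged when the string is not found.

```python
from typing import Tuple


def _parse(parameter: str) -> Tuple[str, bool]:
    """Split '<string>@@<flag>'; the flag defaults to 0 (exclude)."""
    needle, _, flag = parameter.partition("@@")
    return needle, flag == "1"


def _take(value: str, parameter: str, last: bool, until: bool) -> str:
    needle, include = _parse(parameter)
    pos = value.rfind(needle) if last else value.find(needle)
    if pos < 0:
        return value  # assumption: no occurrence leaves the input as is
    if until:
        end = pos + len(needle) if include else pos
        return value[:end].strip()
    start = pos if include else pos + len(needle)
    return value[start:].strip()


def take_from_first(value: str, parameter: str) -> str:
    return _take(value, parameter, last=False, until=False)


def take_from_last(value: str, parameter: str) -> str:
    return _take(value, parameter, last=True, until=False)


def take_until_first(value: str, parameter: str) -> str:
    return _take(value, parameter, last=False, until=True)


def take_until_last(value: str, parameter: str) -> str:
    return _take(value, parameter, last=True, until=True)
```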
Upper Case
|
Changes case to upper. There is no parameter.
|
Input: History of books
Output: HISTORY OF BOOKS
|
||||||||
Use mapping table
|
Uses a mapping table to convert input values.
To prevent unexpected results, do not add more than one mapping row for each source code when defining a Normalization mapping table.
|
|||||||||
Write constant
|
Adds a constant. The parameter is the constant to add. Use this routine instead of the 'constant' source type if you want it written only if a field exists.
|
Routine Name | Description | Example |
---|---|---|
Drop String (use reg. exp)
|
Removes the string that matches the pattern defined in the parameter.
|
Parameter: \.$
Input: Cheever, Daniel Sargent.
Output: Cheever, Daniel Sargent
|
ReplaceLastRegexpByString |
Replaces the last occurrence of the specified regular expression with the specified string.
The Parameter field uses the following format, where <reg_exp> is a regular expression to replace, @@ is the parameter delimiter, and <str> is the replacement string:
<reg_exp>@@<str>
If the replacement string is omitted, the system will use an empty string.
Output: The original string with the last instance of the regular expression changed to the replacement string.
|
Parameter: (\([^()]+\))$
Input: History of Germany (online)
Output: History of Germany
|
Substitute string (use reg. exp.)
|
Substitutes all occurrences of a string found using a regular expression with the specified string.
The Parameter field uses the following format, where <reg_exp> is a regular expression to replace, @@ is the parameter delimiter, and <str> is the replacement string:
<reg_exp>@@<str>
If the substitute string is omitted, the system will remove all occurrences of the string found by the regular expression.
Output: The original string with all instances of a string replaced by the substitute string.
|
The following example replaces every second occurrence of a colon with a semicolon.
Parameter: ([^\:]*\:[^\:]*)\:@@$1\;
Input: History of Germany: 1800s: 1900s
Output: History of Germany: 1800s; 1900s
|
Take string (use reg. exp.)
|
Takes the string that matches the parameter. Once the first string is found, it will be returned.
|
Parameter: .{7}(.{4}).*
Input: 831024s1984 mau b 00110 eng
Output: 1984
|
Take all matching strings (use reg. exp)
|
Creates a string that consists of substrings that match the defined pattern. The string will be delimited by the delimiter defined as the second parameter.
Unlike the Take string routine, this transformation takes all occurrences.
|
Parameter: \"([^\"]+)\"@@::
Input: LABEL="Cover Page" and LABEL="Table of Content"
Output: Cover Page::Table of Content
|
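The regular expression routines in this table can be approximated with Python's `re` module, as sketched below. This is an illustration only: the function names are hypothetical, the product's regular expression engine may differ from Python's, and group references in a replacement are written `\1` in Python where the Substitute string example writes `$1`. Removing every match in `drop_string` is also an assumption.

```python
import re


def drop_string(value: str, pattern: str) -> str:
    """Remove the text matching the pattern (assumption: every match)."""
    return re.sub(pattern, "", value)


def replace_last_regexp(value: str, pattern: str, replacement: str = "") -> str:
    """Replace only the last match of the pattern with a literal string."""
    matches = list(re.finditer(pattern, value))
    if not matches:
        return value
    last = matches[-1]
    return value[:last.start()] + replacement + value[last.end():]


def substitute_string(value: str, pattern: str, replacement: str = "") -> str:
    """Replace every match; `replacement` uses Python syntax (\\1, not $1)."""
    return re.sub(pattern, replacement, value)


def take_string(value: str, pattern: str) -> str:
    """Return the first match: group 1 if the pattern has a group."""
    m = re.search(pattern, value)
    if m is None:
        return ""
    return m.group(1) if m.groups() else m.group(0)


def take_all_matching(value: str, pattern: str, delimiter: str) -> str:
    """Join every match (group 1 if present) with the delimiter."""
    found = [
        m.group(1) if m.groups() else m.group(0)
        for m in re.finditer(pattern, value)
    ]
    return delimiter.join(found)
```

Note that `replace_last_regexp("History of Germany (online)", r"(\([^()]+\))$")` leaves the space that preceded the parenthesis, so a later trimming step would be needed to reproduce the output shown in the table exactly.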