Transformation Routines

Spaces are indicated by a ^.

Transformation routines allow you to change the source value into another value so that it can be used for searching and display in Primo. Validation routines are also considered transformation routines, but they are used within conditions for comparisons and do not transform or copy values (see Validation Routines).

The following tables list the basic and advanced transformation routines:

Routine Name

Description

Example

Add period at the end

Adds a period to the end of the field if the field does not end with one of the following symbols: '!' or '?'.

Add to beginning of string

Adds the string defined in the parameter before the content of the field.

Parameter: ISBN:

Input: 123-45-678-90

Output: ISBN: 123-45-678-90

Add to end of string

Adds the string defined in the parameter after the content of the field.

Parameter: (ISBN)

Input: 123-45-678-90

Output: 123-45-678-90 (ISBN)

Assign to AZ list

Used by Journal Search and Database Search to transform titles and database names to the following categories:

0-9 – The value returned if the first character in the source field is a number (0-9).
<normalized letter> – The normalized letter (A-Z) if the first character in the source field is mapped to a letter (A-Z) with the A-Z Characters Transformations mapping table. For example, if the first character is A or Á, the routine returns A.
others – The value returned if the first character in the source field is not a letter or number.

Input 1: Journal of Chemistry

Output 1: J

Input 2: 中国药理学报

Output 2: others

Input 3: 1040 Instructions

Output 3: 0-9
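
The three categories can be illustrated with a minimal Python sketch. It is not Primo's implementation: the function name `az_bucket` is hypothetical, and Unicode decomposition is used here as a stand-in for the A-Z Characters Transformations mapping table, which is an assumption.

```python
import unicodedata


def az_bucket(title: str) -> str:
    """Return '0-9', a letter A-Z, or 'others' based on the first character."""
    if not title:
        return "others"
    first = title[0]
    if first in "0123456789":
        return "0-9"
    # Assumption: stripping diacritics by decomposition approximates the
    # A-Z Characters Transformations mapping table (for example, Á -> A).
    base = unicodedata.normalize("NFD", first)[0].upper()
    if "A" <= base <= "Z":
        return base
    return "others"


print(az_bucket("Journal of Chemistry"))
print(az_bucket("1040 Instructions"))
```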

Character Conversion

Converts the characters in the field using a character conversion table.

Complete End Date

Converts an end date (YYYY or YYYYMM) to a complete end date (YYYYMMDD) based on the input format. If the input already contains a complete date, the date is copied without transformation.

Input 1: 1990

Output 1: 19901231

Input 2: 199003

Output 2: 19900331

Input 3: 899

Output 3: 08991231

Complete Start Date

Converts a start date (YYYY or YYYYMM) to a complete start date (YYYYMMDD) based on the input format. If the input already contains a complete date, the date is copied without transformation.

Input 1: 1990

Output 1: 19900101

Input 2: 199003

Output 2: 19900301

Input 3: 899

Output 3: 08990101
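
The completion logic of Complete End Date and Complete Start Date could be sketched in Python as follows. The function name `complete_date` and the handling of lengths other than those shown in the examples are assumptions, not documented behavior.

```python
import calendar


def complete_date(value: str, end: bool) -> str:
    """Expand YYYY or YYYYMM to YYYYMMDD; longer input is copied unchanged."""
    digits = value.strip()
    if len(digits) <= 4:
        # Short years such as 899 are left-padded with zeros to four digits.
        year = digits.zfill(4)
        return year + ("1231" if end else "0101")
    if len(digits) <= 6:
        yyyymm = digits.zfill(6)
        year, month = int(yyyymm[:4]), int(yyyymm[4:6])
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        return yyyymm + (f"{last_day:02d}" if end else "01")
    # Assumption: anything longer is already a complete date.
    return digits


print(complete_date("199003", end=True))
print(complete_date("899", end=False))
```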

ConvertISBN13toISBN10

Converts a 13-digit ISBN to a 10-digit ISBN if possible. The input should be a 10- or 13-digit ISBN with or without hyphens.

Parameters: none

Output: A 10-digit ISBN without hyphens.

Input: 9780747599609

Output: 0747599602

ConvertToISBN13

Converts an ISBN to a 13-digit ISBN. The input should be a 10- or 13-digit ISBN with or without hyphens.

Parameters: none

Output: A 13-digit ISBN without hyphens.

Input: 0747599602

Output: 9780747599609
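
Both ISBN conversions depend on recomputing the check digit. The following Python sketch uses the standard ISBN-10 and ISBN-13 check digit formulas; the function names are hypothetical, and returning `None` when a 13-digit ISBN cannot be converted (for example, a 979 prefix) is an assumption about what "if possible" means.

```python
from typing import Optional


def _clean(isbn: str) -> str:
    """Remove hyphens and spaces."""
    return isbn.replace("-", "").replace(" ", "").upper()


def to_isbn13(isbn: str) -> str:
    s = _clean(isbn)
    if len(s) == 13:
        return s
    if len(s) != 10:
        raise ValueError("expected a 10- or 13-digit ISBN")
    core = "978" + s[:9]
    # ISBN-13 weights alternate 1, 3, 1, 3, ...
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


def to_isbn10(isbn: str) -> Optional[str]:
    s = _clean(isbn)
    if len(s) == 10:
        return s
    if len(s) != 13 or not s.startswith("978"):
        return None  # assumption: no conversion is possible
    core = s[3:12]
    # ISBN-10 weights run from 10 down to 2; a check value of 10 is written X.
    total = sum(int(d) * (10 - i) for i, d in enumerate(core))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))


print(to_isbn13("0747599602"))
print(to_isbn10("978-0-7475-9960-9"))
```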

Copy as is

This is the default transformation, which copies the source data without making any changes.

Input: 0747599602

Output: 0747599602

Define subfield delimiter

Defines a delimiter between subfields. This routine prevents the need to have a separate rule per subfield. The same delimiter will be used for all subfields.

Parameter: ^--^

Input: $$aUniversities and colleges $$xChildren$$xRepublicans

Output: Universities and colleges -- Children -- Republicans
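
As a rough illustration, joining subfields with a single delimiter might look like the Python sketch below. The function name `join_subfields` is hypothetical, and trimming each subfield value before joining is an assumption.

```python
import re


def join_subfields(field: str, parameter: str) -> str:
    """Join the values of all $$x subfields using one delimiter."""
    # This page writes spaces in parameters as ^.
    delimiter = parameter.replace("^", " ")
    # Split on subfield markers such as $$a or $$x; the piece before the
    # first marker is empty and is filtered out below.
    values = [v.strip() for v in re.split(r"\$\$.", field)]
    return delimiter.join(v for v in values if v)


print(join_subfields("$$aUniversities and colleges $$xChildren$$xRepublicans", "^--^"))
```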

Delete Characters

Deletes the characters specified in the parameter.

Parameter: '

Input: O'brien

Output: Obrien

Delete spaces

Deletes spaces.

Drop non-Filing Text

For MARC, drops non-filing text based on indicator 1 or 2. The parameter is the indicator: "@@ind1@@" or "@@ind2@@".

Parameter: @@ind1@@

Input: The journal of the AAA

Output: journal of the AAA

Extract and arrange XML elements

Extracts child nodes from an XML element in a specific order when there are multiple occurrences of the XML element. The transformation handles every element separately.

The input to the transformation should be a simple XML structure, such as the following:

<parent>
<child1>data1</child1>
<child2>data2</child2>
<child3>data3</child3>
</parent>

Parameter Notes:

At least one parameter is required.
The order of the parameters must be retained; if any parameter is not used, the '@@' delimiter must still be used. For example, if the output order and delete tag names parameters are the only parameters needed, enter: param1@@@@@@param4.
To add space as a delimiter, specify '\s' in the delimiter string parameter.
Default values for the Delete others and Delete tag names parameters are false. If you want to use one of them, use D (Delete) as the parameter value. For example: param1@@param2@@D@@D.
If the output order parameter is not specified, and the Delete others parameter is not set to true, all data elements will be used in a random order and the XML order will not be preserved.

Parameter: child2;child1@@\s

Output: <child2>data2</child2> <child1>data1</child1> <child3>data3</child3>

Parameter: child2;child1@@\s@@D

Output: <child2>data2</child2> <child1>data1</child1>

Parameter: child2;child1@@\s@@D@@D

Output: data2 data1

Parameter: child2;child1@@@@D@@D

Output: data2data1

Format Date

Formats dates to be in the structure:

YYYY-MM-DD hh:mm:ss

Input: 20020418155342

Output: 2002-04-18 15:53:42

Input: 20020418

Output: 2002-04-18
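
A minimal Python sketch of this formatting, assuming the input is either 8 digits (date only) or 14 digits (date and time), is shown below. The function name `format_date` is hypothetical, and no validation of the date is attempted.

```python
def format_date(value: str) -> str:
    """Format YYYYMMDD or YYYYMMDDhhmmss as YYYY-MM-DD [hh:mm:ss]."""
    date = f"{value[0:4]}-{value[4:6]}-{value[6:8]}"
    if len(value) >= 14:
        return f"{date} {value[8:10]}:{value[10:12]}:{value[12:14]}"
    return date


print(format_date("20020418155342"))
print(format_date("20020418"))
```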

Format number

Adds leading zeros to create a seven-digit number.

Use another transformation to remove commas or periods within the number.

Input: 10000

Output: 0010000

Format End Date

Transforms the date or date range specified in the input to a formatted end date. This routine handles the input as follows:

Removes all non-digits except for the following:
- Question mark - ? (denotes unknown dates)
- u (denotes unknown dates)
- Slash (/ or \)
- Hyphen or minus sign (U+002D, U+2010, U+2011, U+2012, U+2013, or U+2212)
Replaces unknown dates with a 0 (for start date or BCE end date) or a 9 (end date or BCE start date).
For a date range, the first date is taken as the start date and the second as the end date.
For an open date, the end date becomes 9999.

Input 1: 1995-1999

Output 1: 1999

Input 2: [1995-1999]

Output 2: 1999

Input 3: 1995-

Output 3: 9999

Input 4: 19uu

Output 4: 1999

Format Start Date

Transforms the date or date range specified in the input to a formatted start date. This routine handles the input as follows:

Removes all non-digits except for the following:
- Question mark - ? (denotes unknown dates)
- u (denotes unknown dates)
- Slash (/ or \)
- Hyphen or minus sign (U+002D, U+2010, U+2011, U+2012, U+2013, or U+2212)
Replaces unknown dates with a 0 (for start date or BCE end date) or a 9 (end date or BCE start date).
For a date range, the first date is taken as the start date and the second as the end date.
For an open date, the end date becomes 9999.

Input 1: 1995-1999

Output 1: 1995

Input 2: [1995-1999]

Output 2: 1995

Input 3: 1995-

Output 3: 1995

Input 4: 19uu

Output 4: 1900
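
The handling rules shared by Format End Date and Format Start Date can be sketched in Python as follows. This is an illustration under simplifying assumptions, not Primo's implementation: the function names are hypothetical, BCE dates are not handled, and any slash or hyphen is treated as the range separator.

```python
DIGITS = set("0123456789")
UNKNOWN = {"?", "u"}
HYPHENS = set("\u002d\u2010\u2011\u2012\u2013\u2212")
SEPARATORS = HYPHENS | {"/", "\\"}


def _split_range(raw: str):
    """Drop disallowed characters, then split into (start, end, is_range)."""
    parts, current = [], ""
    for c in raw:
        if c in SEPARATORS:
            parts.append(current)
            current = ""
        elif c in DIGITS or c in UNKNOWN:
            current += c
    parts.append(current)
    is_range = len(parts) > 1
    return parts[0], (parts[1] if is_range else parts[0]), is_range


def format_start_date(raw: str) -> str:
    start, _, _ = _split_range(raw)
    # Unknown positions in a start date become 0.
    return "".join("0" if c in UNKNOWN else c for c in start)


def format_end_date(raw: str) -> str:
    _, end, is_range = _split_range(raw)
    if is_range and end == "":
        return "9999"  # open date range
    # Unknown positions in an end date become 9.
    return "".join("9" if c in UNKNOWN else c for c in end)


print(format_start_date("[1995-1999]"), format_end_date("[1995-1999]"))
print(format_end_date("1995-"))
```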

Format URL

Formats special characters for URLs:

"%" -> "%25"

"\\$" -> "%24"

"&" -> "%26"

"\\+" -> "%2B"

"," -> "%2C"

"/" -> "%2F"

":" -> "%3A"

";" -> "%3B"

"=" -> "%3D"

"\\?" -> "%3F"

"@" -> "%40"

"\\s" -> "%20"

"\"" -> "%22"

"<" -> "%3C"

">" -> "%3E"

"#" -> "%23"

"\\{" -> "%7B"

"\\}" -> "%7D"

"\\|" -> "%7C"

"\\\\" -> "%5C"

"\\^" -> "%5E"

"~" -> "%7E"

"\\[" -> "%5B"

"\\]" -> "%5D"

"`" -> "%60"

This transformation is not currently required.
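
For illustration, the same character-to-escape mapping can be expressed in Python as a translation table. The names `format_url` and `_SPECIAL` are hypothetical, and the whitespace entry is assumed to mean a plain space.

```python
# Characters listed in the mapping; each is replaced by % plus its
# two-digit hexadecimal code. The whitespace entry is treated as a space.
_SPECIAL = '%$&+,/:;=?@ "<>#{}|\\^~[]`'
_TABLE = {ord(c): f"%{ord(c):02X}" for c in _SPECIAL}


def format_url(value: str) -> str:
    """Percent-encode the listed special characters in a single pass."""
    return value.translate(_TABLE)


print(format_url("title=War & Peace"))
```

Because `str.translate` processes each input character once, a percent sign produced by one replacement is never encoded a second time.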

Format Year

Gets the first four characters of the given string and replaces all characters that are not digits with the specified parameter.

Parameter: ?

Input: 194u

Output: 194?

Get author first name

Retrieves the author's first name for Latin languages. It will take all characters following the first comma.

Input: Lippe, Ole von der

Output: Ole von der

Get author first name non latin languages

Retrieves the author's first name for non-Latin languages (such as Hebrew). It will take all characters following the first comma.

For Arabic customers, refer to the following article: How to use the Get Author First and Last Name Non Latin Transformations for Arabic.

Input: דיקמן, עמינדב

Output: עמינדב

Get author last name

Retrieves the author's last name for Latin languages. It will take all characters up to the first comma.

Input: Lippe, Ole von der

Output: Lippe

Get author last name non latin languages

Retrieves the author's last name for non-Latin languages (such as Hebrew). It will take all characters up to the first comma.

For Arabic customers, refer to the following article: How to use the Get Author First and Last Name Non Latin Transformations for Arabic.

Input: דיקמן, עמינדב

Output: דיקמן

Get author first last name

Returns the author’s first and last name.

Input: Marshall, John B

Output: John B Marshall

Get author last first name

Returns the author’s last and first name.

Input: John B Marshall

Output: Marshall, John B
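
The comma-based author routines can be sketched in Python as shown below. The function names are hypothetical. Treating the final word as the last name in `last_first` is an assumption based only on the example, and trimming surrounding spaces is also assumed.

```python
def last_name(author: str) -> str:
    """All characters up to the first comma."""
    return author.partition(",")[0].strip()


def first_name(author: str) -> str:
    """All characters following the first comma."""
    return author.partition(",")[2].strip()


def first_last(author: str) -> str:
    """'Last, First' -> 'First Last'."""
    last, comma, first = author.partition(",")
    if not comma:
        return author.strip()
    return f"{first.strip()} {last.strip()}"


def last_first(author: str) -> str:
    """'First Last' -> 'Last, First' (assumes the last word is the last name)."""
    parts = author.strip().rsplit(" ", 1)
    if len(parts) < 2:
        return author.strip()
    return f"{parts[1]}, {parts[0]}"


print(first_last("Marshall, John B"))
print(last_first("John B Marshall"))
```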

GetHeadTail

Returns the specified number of characters from the beginning and end of the input string. The Parameter field uses the following format, where <param1> is the number of characters taken from the beginning and <param2> is the number of characters taken from the end:

<param1>@@<param2>

If you do not specify a value for <param2>, only characters from the beginning will be taken.

If the input string has fewer characters than specified in either parameter, the system returns the entire string.

Parameter: 20@@5

Input: “England and France during the hundred years war”

Output: “England and France ds war”
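
A minimal Python sketch of this behavior follows. The function name `get_head_tail` is hypothetical, and returning the whole string when the head and tail would overlap is an assumption that goes slightly beyond the rule stated above.

```python
def get_head_tail(value: str, parameter: str) -> str:
    """Keep the first <param1> and last <param2> characters of value."""
    head_text, _, tail_text = parameter.partition("@@")
    head = int(head_text)
    tail = int(tail_text) if tail_text else 0
    # Covers inputs shorter than either parameter, plus (by assumption)
    # inputs where head and tail would overlap.
    if head + tail >= len(value):
        return value
    return value[:head] + value[len(value) - tail:]


print(get_head_tail("England and France during the hundred years war", "20@@5"))
```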

Get highest number

Returns the highest number from the input field.

Input: 112 pages, 2 ill

Output: 112

Get highest number and normalize last digit

Returns the highest number from the input field and changes the last digit to 0.

Input: 112 pages, 2 ill

Output: 110
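
Both routines can be approximated in a few lines of Python. The function names are hypothetical, and returning an empty string when the field contains no number is an assumption.

```python
import re


def highest_number(value: str) -> str:
    """Return the largest run of digits in the field, compared numerically."""
    numbers = re.findall(r"\d+", value)
    return max(numbers, key=int) if numbers else ""


def highest_number_normalized(value: str) -> str:
    """Same as highest_number, with the last digit changed to 0."""
    highest = highest_number(value)
    return highest[:-1] + "0" if highest else ""


print(highest_number("112 pages, 2 ill"))
print(highest_number_normalized("112 pages, 2 ill"))
```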

Include/Exclude Subfields (starts with)

This transformation allows the following options:

Include - This option can be used with the Split Field transformation to add each subfield value to a separate PNX field.

include@@<starts with value to match - not case sensitive>@@<separator to include>
Exclude - This option can be used to exclude specific subfields (starting with a particular value) from the PNX record.

exclude@@<starts with value to match - not case sensitive>

For example, Alma includes a subfield 0 with the authority control ID (for Browse) in the author (1XX, 7XX) and subject (6XX) fields, and additional subfields 0 with linked data are planned for these fields. The transformation can be used to exclude subfields 0 that start with (URI) from Browse and, in the future, to include subfields 0 that start with (URI) in a linked data field.

The following example also includes the Split Field BBBBB transformation, which is used in case the subfield includes multiple URIs delimited by BBBBB.

Parameter: include@@(Uri)@@BBBBB

Input:

exclude0-
(Uri)http://viaf.org/viaf/sourceID/LC|no2008011383

Output:

(Uri)http://viaf.org/viaf/sourceID/LC|no2008011383

Parameter: exclude@@(Uri)

Input:

1021-01BC_INST-98137743588801021
(Uri)http://www.exlibrisgroup.com/yonatan

Output:

1021-01BC_INST-98137743588801021

Lower case

Changes case to lower. There is no parameter.

Input: History of books

Output: history of books

Normalize author

Keeps the last name of the author and the first character of the first name.

This routine normalizes authors for the Author facet.

Input: Lippe, Ole von der

Output: Lippe, O

NormalizeDiacritics

Normalizes the input string based on source and target codes defined in the DiacriticsConversion mapping table.

Parameter: none

Output: The normalized string.

The DiacriticsConversion mapping table contains the following columns:

Source UniCode – the Unicode character from which to normalize.
Target UniCode – the Unicode character to which to normalize.

Source UniCode: 00D8

Target UniCode: 004F

Output: Converts a Latin O with stroke to upper case O

Put subfields in separate fields

Creates separate PNX fields for every occurrence of a subfield within a field.

Input: $$aeng$$afre$$agre

Output:

eng

fre

gre

RemoveLeadingStringFromList

Removes a leading string from the input. The leading strings (such as articles) that you want removed must be defined in a normalization mapping table.

Parameter: The code for the normalization mapping table that lists the strings to be removed from the beginning of the input.

Output: The input without the leading string.

For example, you can create a normalization mapping table called LeadingArticles and include a list of articles to remove:

sourceCode1	targetCode
a	a
an	an
the	the

Each string must be entered in both columns as shown above.

Parameter: LeadingArticles

Input: a report to congress

Output: report to congress
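
The lookup against the mapping table might be sketched in Python as follows. A plain list stands in for the normalization mapping table. Matching only whole leading words and ignoring case are assumptions, and the function name is hypothetical.

```python
def remove_leading_string(value: str, leading_strings: list) -> str:
    """Remove one leading string from the list if the value starts with it."""
    lowered = value.lower()
    # Try longer strings first so that "an" is tested before "a".
    for candidate in sorted(leading_strings, key=len, reverse=True):
        prefix = candidate.lower() + " "
        if lowered.startswith(prefix):
            return value[len(prefix):]
    return value


leading_articles = ["a", "an", "the"]
print(remove_leading_string("a report to congress", leading_articles))
```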

RemoveStringFromList

Removes all occurrences of a string from the input. The strings that you want removed must be defined in a normalization mapping table.

Parameter: The name of the normalization mapping table that lists the strings to be removed from the input.

Output: The input with the specified strings removed.

For example, you can create a normalization mapping table called RemoveStrings and include a list of strings to remove:

sourceCode1	targetCode
&	&
and	and

Each string must be entered in both columns as shown above.

Parameter: RemoveStrings

Input: War and Peace

Output: War Peace

Remove characters from the end

Removes the last character from the input if it matches any character specified in the Parameter field.

This routine removes a single character only. If you want to remove a string of characters, you can either repeat this routine as many times as necessary or utilize the Remove string from the end routine.

Parameter: :,=;/]

Input:

New york: Blackwell,

Output:

New york: Blackwell

Remove HTML tags

Removes HTML tags from XML content.

Input:

Bats Adjust Their 'Field-of-View': Use of <test1>Biosonar</test1> Is More Advanced <test2>Than</test2> Thought

Output:

Bats Adjust Their 'Field-of-View': Use of Biosonar Is More Advanced Than Thought

Remove Leading Characters

Removes the first character from the input if it matches any character specified in the Parameter field.

This routine removes a single character only. If you want to remove a string of characters, you can either repeat this routine as many times as necessary or utilize the Remove Leading String routine.

Parameter: [({

Input:

[1948]

Output:

1948]

Remove string from the end

Removes the string specified in the parameter from the end of the field.

Parameter: (ISBN)

Input: 675484451 (ISBN)

Output: 675484451

Remove Leading String

Removes the string specified in the parameter from the beginning of the field.

Parameter: (ISBN)

Input: (ISBN) 675484451

Output: 675484451

Remove Punctuation

Removes the following punctuation from the field and changes them to blank:

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

Punctuation defined in the parameter will not be deleted.

Parameter: $

Input: Cost of item: 121$

Output: Cost of item 121$

Replace Characters

Replaces the characters specified in the first part of the parameter with the replacement string specified in the second part of the parameter:

<characters>@@<replacement string>

If there is no <replacement string>, the characters will just be removed.

Parameter: .",@@^

Input: History of the U.S.A.

Output: History of the USA

Replace Spaces by String

Replaces all spaces by the string defined in the parameter.

A parameter is required.

Parameter: ^;^

Input: eng fre ger

Output: eng; fre; ger

Replace nonnumeric chars in range

Replaces each non-numeric character between the <start> and <end> character positions with a question mark.

Parameter: <start>@@<end>

Parameter: 0@@3

Input: Ab56

Output: ??56
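
A short Python sketch of this replacement is shown below. The function name is hypothetical, and treating both positions as zero-based and inclusive is an assumption; the example above does not distinguish inclusive from exclusive end positions.

```python
def replace_nonnumeric_in_range(value: str, parameter: str) -> str:
    """Replace non-digits between <start> and <end> with question marks."""
    start_text, _, end_text = parameter.partition("@@")
    start, end = int(start_text), int(end_text)
    return "".join(
        "?" if start <= i <= end and c not in "0123456789" else c
        for i, c in enumerate(value)
    )


print(replace_nonnumeric_in_range("Ab56", "0@@3"))
```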

Replace start and end angle brackets by parentheses

Replaces < and > by ( ).

Replace string by string

Replaces the string specified in the parameter by the second string specified in the parameter. Use @@ to separate the parameters. Note that this transformation is case sensitive.

Parameter: Test@@TEST

Input: Test

Output: TEST

Split Data of Fixed Length

Splits fixed length data to substrings of the length specified in the parameter. The resulting substrings are delimited by spaces.

Parameter: 3

Input: engfreger

Output: eng fre ger

Split by pattern

Splits a string into substrings that match the defined pattern. The substrings are delimited by spaces.

Parameter: (.{3})

Input: engdutheb

Output: eng dut heb
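
Split Data of Fixed Length and Split by pattern produce the same kind of output, which the following Python sketch illustrates. The function names are hypothetical, and Python's `re` module stands in for whatever regular expression engine Primo uses.

```python
import re


def split_fixed_length(value: str, length: int) -> str:
    """Cut the value into chunks of the given length, joined by spaces."""
    return " ".join(value[i:i + length] for i in range(0, len(value), length))


def split_by_pattern(value: str, pattern: str) -> str:
    """Join every substring that matches the pattern with spaces."""
    return " ".join(m.group(0) for m in re.finditer(pattern, value))


print(split_fixed_length("engfreger", 3))
print(split_by_pattern("engdutheb", r"(.{3})"))
```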

Split Field

Splits a field into separate PNX fields based on the delimiter defined in the parameter.

Parameter: ;

Input: eng;spa;ger

Output:

eng

spa

ger

Every language code will be a separate field.

Upper case every first letter

Changes the first letter of every word to uppercase, leaving all others in their original case. A word is defined as text that is separated by whitespace or punctuation.

This transformation accepts a normalization mapping table name as a parameter, which can be used to define words that should be ignored by the transformation. The Source column contains the word that you want to exclude, and the Target column contains the transformation. The same word can be added to both the Source and Target columns to ensure that they are not transformed at all. For example:

Source: and

Target: and

Input: barnes and noBle

Output: Barnes and NoBle

Input: A loNg and winding road

Output: A LoNg And Winding Road

Upper case every first letter (lower case others)

Changes the first letter of every word to uppercase and makes sure that all other characters are lowercase. A word is defined as text that is separated by whitespace or punctuation.

Source: and

Target: and

Input: barnes and noBle

Output: Barnes and Noble

Input: A loNg and winding road

Output: A Long And Winding Road

Upper case every first letter - whitespace only (lower case others)

Changes the first letter of every word to uppercase and makes sure that all other characters are lowercase. A word is defined as text that is separated by whitespace only.

Source: and

Target: and

Input: barnes and noBle

Output: Barnes and Noble

Input: A loNg and winding road

Output: A Long And Winding Road

Upper case every first letter - whitespace only

Changes the first letter of every word to uppercase, leaving all others in their original case. A word is defined as text that is separated by whitespace only.

Source: and

Target: and

Input: barnes and noBle

Output: Barnes and NoBle

Input: A loNg and winding road

Output: A LoNg And Winding Road
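
The four upper case variants differ only in how a word is delimited and in whether the remaining letters are lowercased. A Python sketch under these assumptions follows: a dictionary stands in for the normalization mapping table, exclusions are matched exactly, and the function and parameter names are hypothetical.

```python
import re


def capitalize_words(value, ignore=None, whitespace_only=False, lower_others=False):
    """Upper-case the first letter of each word, skipping excluded words."""
    ignore = ignore or {}
    # \S+ treats only whitespace as a separator; \w+ also splits on punctuation.
    pattern = r"\S+" if whitespace_only else r"\w+"

    def convert(match):
        word = match.group(0)
        if word in ignore:
            return ignore[word]  # Source word replaced by its Target value
        rest = word[1:].lower() if lower_others else word[1:]
        return word[0].upper() + rest

    return re.sub(pattern, convert, value)


print(capitalize_words("barnes and noBle", {"and": "and"}))
print(capitalize_words("barnes and noBle", {"and": "and"}, lower_others=True))
```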

Take substring

Retrieves a substring that starts at the defined character position and contains the defined number of characters. Use @@ to separate the parameters. The count starts from 0.

This routine is useful for extracting characters from specific positions in MARC control fields. For example, to extract the year from 008, enter 7@@4 (that is, take 4 characters from position 7).

Parameter: 7@@4

Input: 831024s1984 mau b 00110 eng

Output: 1984

Take first words

Retrieves the defined number of words from the beginning of the field. A word is any string between blanks.

Parameter: 5

Input: A history of the middle ages in the 13th and 14th centuries

Output: A history of the middle

Take first subfields

This transformation can be used when a single field can have multiple occurrences of a specific subfield and only a limited number of occurrences should be taken.

Parameter: 1

Input: $$z1458998797$$z8976439871

Output: 1458998797

Take characters from the end

Retrieves the defined number of characters from the end of the field.

Take from first occurrence

Retrieves the rest of the string from the first occurrence of the substring or character specified in the first parameter. Using the second parameter it is possible to indicate whether the string defined in the first parameter should be taken or not. Use 0 to exclude the string and 1 to include it. The default is 0.

Parameter: ,@@0

Input: Blackstone, John

Output: John

Parameter: ,@@1

Input: Blackstone, John

Output: , John

Take from last occurrence

Retrieves the rest of the string from the last occurrence of the specified substring or character. Using the second parameter it is possible to indicate whether the string defined in the first parameter should be taken or not. Use 0 to exclude the string and 1 to include it. The default is 0.

Take until first occurrence

Retrieves the string until the first occurrence of the specified substring or character. Using the second parameter it is possible to indicate whether the string defined in the first parameter should be taken or not. Use 0 to exclude the string and 1 to include it. The default is 0.

Do not use this transformation with a space as the parameter. Instead, use the Take first words routine.

Parameter: ,@@1

Input: Blackstone, John

Output: Blackstone,

Take until last occurrence

Retrieves the string until the last occurrence of the specified substring or character. Using the second parameter it is possible to indicate whether the string defined in the first parameter should be taken or not. Use 0 to exclude the string and 1 to include it. The default is 0.
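
The four occurrence-based routines (take from first, take from last, take until first, take until last) share one pattern, sketched here in Python. The function names are hypothetical. Trimming surrounding spaces from the result and returning the input unchanged when the substring is not found are assumptions.

```python
def _parse(parameter: str):
    """Split '<string>@@<0 or 1>' into the search string and the include flag."""
    text, _, flag = parameter.partition("@@")
    return text, flag == "1"  # the default is 0 (exclude)


def take_from(value: str, parameter: str, last: bool = False) -> str:
    text, include = _parse(parameter)
    pos = value.rfind(text) if last else value.find(text)
    if pos < 0:
        return value
    start = pos if include else pos + len(text)
    return value[start:].strip()


def take_until(value: str, parameter: str, last: bool = False) -> str:
    text, include = _parse(parameter)
    pos = value.rfind(text) if last else value.find(text)
    if pos < 0:
        return value
    end = pos + len(text) if include else pos
    return value[:end].strip()


print(take_from("Blackstone, John", ",@@0"))
print(take_until("Blackstone, John", ",@@1"))
```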

Upper Case

Changes case to upper. There is no parameter.

Input: History of books

Output: HISTORY OF BOOKS

Use mapping table

Uses a mapping table to convert input values.

To prevent unexpected results, do not add more than one mapping row for each source code when defining a Normalization mapping table.

Write constant

Adds a constant. The parameter is the constant to add. Use this routine instead of the 'constant' source type if you want it written only if a field exists.

In addition, the following table contains more advanced routines in which regular expressions are used:

Advanced Routines
Routine Name	Description	Example
Drop String (use reg. exp)	Removes the string that matches the pattern defined in the parameter.	Parameter: \.$ Input: Cheever, Daniel Sargent. Output: Cheever, Daniel Sargent
ReplaceLastRegexpByString	Replaces the last occurrence of the specified regular expression with the specified string. The Parameter field uses the following format, where <reg_exp> is a regular expression to replace, @@ is the parameter delimiter, and <str> is the replacement string: <reg_exp>@@<str> If the replacement string is omitted, the system will use an empty string. Output: The original string with the last instance of the regular expression changed to the replacement string.	Parameter: ($[^()]+$)$ Input: History of Germany (online) Output: History of Germany
Substitute string (use reg. exp.)	Substitutes all occurrences of a string found using a regular expression with the specified string. The Parameter field uses the following format, where <reg_exp> is a regular expression to replace, @@ is the parameter delimiter, and <str> is the replacement string: <reg_exp>@@<str> If the substitute string is omitted, the system will remove all occurrences of the string found by the regular expression. Output: The original string with all instances of a string replaced by the substitute string.	The following example replaces every second occurrence of a colon with a semicolon. Parameter: ([^\:]\:[^\:])\:@@$1\; Input: History of Germany: 1800s: 1900s Output: History of Germany: 1800s; 1900s
Take string (use reg. exp.)	Takes the string that matches the parameter. Once the first string is found, it will be returned.	Parameter: .{7}(.{4}).* Input: 831024s1984 mau b 00110 eng Output: 1984
Take all matching strings (use reg. exp)	Creates a string that consists of substrings that match the defined pattern. The string will be delimited by the delimiter defined as the second parameter. Unlike the Take string routine, this transformation takes all occurrences.	Parameter: \"([^\"]+)\"@@:: Input: LABEL="Cover Page" and LABEL="Table of Content" Output: Cover Page::Table of Content
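
Three of the advanced routines can be approximated with Python's `re` module, as in the sketch below. Python's regular expression syntax is assumed to be close enough to Primo's for these patterns. The function names are hypothetical, and returning the first capture group when the pattern defines one is an assumption based on the examples in the table.

```python
import re


def drop_string(value: str, pattern: str) -> str:
    """Remove the text that matches the pattern."""
    return re.sub(pattern, "", value)


def take_string(value: str, pattern: str) -> str:
    """Return the first match (its first capture group, if the pattern has one)."""
    match = re.search(pattern, value)
    if match is None:
        return ""
    return match.group(1) if match.groups() else match.group(0)


def take_all_matching(value: str, parameter: str) -> str:
    """Join all matches using the delimiter given after @@."""
    pattern, _, delimiter = parameter.partition("@@")
    return delimiter.join(
        m.group(1) if m.groups() else m.group(0)
        for m in re.finditer(pattern, value)
    )


print(drop_string("Cheever, Daniel Sargent.", r"\.$"))
print(take_string("831024s1984 mau b 00110 eng", r".{7}(.{4}).*"))
print(take_all_matching('LABEL="Cover Page" and LABEL="Table of Content"', r'\"([^\"]+)\"@@::'))
```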