Generic MARC 21
The following sections describe the mappings used for Generic MARC normalizations.
Control Section
Normalized Record Field | Source/Content | Note |
---|---|---|
Source ID
|
From data source definitions
|
|
Original Source ID
|
From data source definitions
|
|
Source Record-ID
|
From header of source file
|
|
Record ID
|
Source ID + Source Record-ID
|
|
Additional Record-ID
|
||
Source Type
|
Not in use.
|
|
Source Format
|
From data source definitions
|
|
SourceSystem
|
Aleph
|
Display Section
General Notes
-
String multiple occurrences with a semicolon unless indicated otherwise. If the source data has a period at the end and it is not the final occurrence, remove the period.
-
Remove the following end punctuation: : , = ; /
Notes Regarding Subfields and Indicators
-
If no subfields are listed explicitly, data from all non-numeric subfields will be displayed.
-
If a field or a subfield is repeated, all instances should be displayed.
-
Subfields are listed in alphabetical order for the sake of clarity, but should be displayed in the order they are recorded in the source record.
-
If all the subfields or specified non-numeric subfields are taken, numeric subfields are not considered.
-
If a numeric subfield is specifically included, no other numeric subfield will be included.
-
If a numeric subfield is excluded, the mapping will take other numeric subfields.
-
If no indicators are defined, all indicators will be taken.
880 tags are mapped together with the standard tag (the 880 tags are added first) in the following fields: Contributor, Publisher, Creator, Description, Edition, Subject (for 600, 610, 611, and 630), Relation, and Is part of.
Display Element | Source | Note |
---|---|---|
Source
|
Source from the data source definition
|
|
Resource Type
|
See mapping below
|
|
Title
|
If FMT=SE, then use 130 OR 245; otherwise use 245 with the following subfields:
130 ##adfklmnoprs
245 ## $$abfgknp
|
The 130 was added for serials, because for serials additional information will typically be in 130.
|
Uniform title
|
130 admnprs
OR
240 admnprs
|
|
Vernacular title
|
880 where $$6=245
subfields:
abfgknp
|
|
Creator
|
100 abcdejqu
110 abcde
111 abcdn
|
If the creator is derived from 100 and the first indicator is 1 or 2, the text after the comma (a comma must be present) is moved in front of the text that precedes the comma, and the comma itself is deleted.
For example:
Lippe, Ole von der --> Ole von der Lippe
Van Der Wise, Fred --> Fred Van Der Wise
Disabled rules do not reverse the author name.
|
Contributor
|
700, 710, and 711 except for second indicator=2
With the following subfields:
700 abcdejqu
710 abcde
711 abcdn
|
For 700, if the first indicator is 1 or 2, the text after the comma (a comma must be present) is moved in front of the text that precedes the comma, and the comma itself is deleted.
For example:
Lippe, Ole von der --> Ole von der Lippe
Van Der Wise, Fred --> Fred Van Der Wise
Analytic 7XX fields are excluded. They are added to the description.
Disabled rules do not reverse the author name.
|
Description
|
505, 520 $a
700, 710, and 711 with the second indicator=2 using the following subfields:
700 abcdemnopst
710 abcdemnopst
711 acdenpqst
|
Every field is a separate occurrence.
|
Edition
|
250 $a $b
|
|
Publisher
|
502 a or 260 a,b or 264 a,b
And equivalent fields from 880.
|
|
Subject
|
All 6XX fields
|
Strip all numeric subfields.
|
Language
|
008/35-37;
041 subfields $a, $d, $e (all occurrences should be taken)
|
Validate code against list of ISO 639-2 codes. If the code cannot be translated, leave it as is.
|
Physical Format
|
300 and 340 fields
|
If the 300 field does not end with a period, add it.
|
Identifier
|
020 $$a – prefix the value with ISBN:
022 $$a – prefix the value with ISSN:
024 2# $$a – prefix the value with ISMN:
|
This mapping is disabled in the out-of-the-box template since the identifiers by default do not display in the Front End.
|
Relation
|
Prefix the value with Series: 400, 410, 411, 440, 490, 800, 810, 811, 830, 840
780 (first indicator -1): Prefix the value with Earlier Title:
785 (first indicator -1): Prefix the value with Later Title:
Strip subfield $w, $x, $y
|
Every field should be a separate occurrence.
The prefix should be added to $$C and the value to $$V.
Display constant codes are used:
series
earlier_title
later_title
|
Is Part Of
|
773
Strip subfield $w, $x, $y
|
|
Creation Date
|
260 $c
OR
008/07-10
|
For a date created from 008, create a date only if it starts with a digit that is not zero and replace missing digits with a question mark. For example:
19-- > 19??
19uu > 19??
|
Library Level Availability
|
The Library Level Availability field subfields include:
$$I Primo Institution
$$L Primo Library
$$1 Sublocation
$$2 Call number
$$S Availability status
$$3 No. of items
$$4 No. of unavailable items
$$5 multi-volume flag
$$6 number of loans
|
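Two of the display rules above, the name inversion for Creator/Contributor and the date built from 008/07-10, can be sketched as follows. This is a minimal illustration, not the product's implementation: the function names and the way the indicator and the 008 string are passed in are assumptions made for this example.

```python
from typing import Optional


def invert_name(name: str, ind1: str) -> str:
    """Move the text after the first comma in front of the text before it."""
    if ind1 not in ("1", "2"):
        return name
    before, comma, after = name.partition(",")
    if not comma:
        # The rule requires a comma; without one the name is left unchanged.
        return name
    return f"{after.strip()} {before.strip()}".strip()


def date_from_008(field_008: str) -> Optional[str]:
    """Build a display date from 008/07-10, or return None if no date is created."""
    raw = field_008[7:11]
    # A date is created only if it starts with a digit that is not zero.
    if len(raw) < 4 or not raw[0].isdigit() or raw[0] == "0":
        return None
    # Any position that is not a digit is treated as missing and becomes "?".
    return "".join(ch if ch.isdigit() else "?" for ch in raw)


print(invert_name("Lippe, Ole von der", "1"))
print(date_from_008("850101s19uu    "))
```

Treating every non-digit position as a missing digit is an assumption of this sketch; the source only gives `19--` and `19uu` as examples.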
Mapping to Resource Type
The mapping is based on the format type derived either from LDR positions 6 and 7 or tag and position. Use the following tables to determine the mapping.
Leader pos. 6/7 | Record type | Format |
---|---|---|
a Language material + pos.7= a,c,d,m
|
Books
|
BK
|
a Language material + pos.7= b, i, s
|
Continuing Resources
|
SE
|
c Notated music
|
Music
|
MU
|
d Manuscript notated music
|
Music
|
MU
|
e Cartographic material
|
Maps
|
MP
|
f Manuscript cartographic material
|
Maps
|
MP
|
g Projected medium
|
Visual materials
|
VM
|
i Nonmusical sound recording
|
Audio materials
|
AM
|
j Musical sound recording
|
Audio materials
|
AM
|
k Two-dimensional non-projectable graphic
|
Visual materials
|
VM
|
m Computer file
|
Computer files
|
CF
|
o Kit
|
Visual materials
|
VM
|
p Mixed material
|
Mixed materials
|
MX
|
r Three-dimensional artifact or naturally occurring object
|
Visual materials
|
VM
|
t Manuscript language material
|
Books
|
BK
|
w Rare books
Used by KORMARC.
|
Rare Books
|
RB
|
Default
|
BK
|
Format | Based on (tag and position) | TYPE | Note |
---|---|---|---|
BK
|
book
|
The catch-all for BK if no further information is available is Book
|
|
CF
|
008-26
h
|
audio
|
|
CF
|
008-26
j
|
database
|
|
CF
|
008-26
d, e
|
text_resource
|
|
CF
|
other
|
||
MP
|
map
|
DC defines a map as a type of image.
|
|
AM
|
audio
|
||
MU
|
score
|
||
SE
|
008 21
d,w
|
other
|
|
SE
|
008 21
l
|
text_resource
|
|
SE
|
008 21
m
|
book
|
|
SE
|
journal
|
||
VM
|
008 33
i,k,l,n,s,t
|
image
|
|
VM
|
008 33
f,m,v
|
video
|
|
VM
|
other
|
||
MX
|
other
|
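The two tables above can be read as a two-step lookup: leader positions 6 and 7 give the format, then the format plus one 008 position gives the resource type. The sketch below is one way to express that in Python. The function and table names are hypothetical, and the fallback to `other` for formats that the second table does not list (such as RB) is an assumption.

```python
# Leader/06 codes whose format does not depend on leader/07.
FORMAT_BY_RECORD_TYPE = {
    "c": "MU", "d": "MU",
    "e": "MP", "f": "MP",
    "g": "VM", "k": "VM", "o": "VM", "r": "VM",
    "i": "AM", "j": "AM",
    "m": "CF", "p": "MX", "t": "BK", "w": "RB",
}

# Format -> (008 position, {code: type}, type used when no code matches).
TYPE_RULES = {
    "CF": (26, {"h": "audio", "j": "database",
                "d": "text_resource", "e": "text_resource"}, "other"),
    "SE": (21, {"d": "other", "w": "other",
                "l": "text_resource", "m": "book"}, "journal"),
    "VM": (33, {"i": "image", "k": "image", "l": "image", "n": "image",
                "s": "image", "t": "image",
                "f": "video", "m": "video", "v": "video"}, "other"),
}

# Formats that map to a single type without looking at the 008.
FIXED_TYPES = {"BK": "book", "MP": "map", "AM": "audio", "MU": "score", "MX": "other"}


def format_from_leader(leader: str) -> str:
    """Derive the format (FMT) from leader positions 6 and 7."""
    leader = leader.ljust(8)
    record_type, bib_level = leader[6], leader[7]
    if record_type == "a":
        # b, i, s are continuing resources; a, c, d, m and anything else are books.
        return "SE" if bib_level in ("b", "i", "s") else "BK"
    return FORMAT_BY_RECORD_TYPE.get(record_type, "BK")


def resource_type(fmt: str, field_008: str = "") -> str:
    """Derive the resource type from the format and the relevant 008 position."""
    if fmt in TYPE_RULES:
        position, by_code, fallback = TYPE_RULES[fmt]
        code = field_008.ljust(40)[position]
        return by_code.get(code, fallback)
    # Assumption: formats missing from the table fall back to "other".
    return FIXED_TYPES.get(fmt, "other")


fmt = format_from_leader("00000nas a2200000 a 4500")
print(fmt, resource_type(fmt, " " * 21 + "p"))
```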
Links
Type of Link | Source | Note |
---|---|---|
OpenURL
|
Based on resource type from display:
If type=article then:
$$Topenurl_article
Otherwise:
$$Topenurl_journal
|
SFX has two sources for Primo: one for articles, in which case the date is used, and one for journals, in which case the date is ignored. There is a different template per source.
|
OpenURL_fulltext
|
Based on resource type from display:
If type=article then:
$$Topenurlfull_article
Otherwise:
$$Topenurlfull_journal
|
SFX has two sources for Primo: one for articles, in which case the date is used, and one for journals, in which case the date is ignored. There is a different template per source.
|
OpenURL_servicetext
|
||
Backlink
|
||
LinktoHoldings
|
||
LinktoHoldings_available
|
||
LinktoHoldings_unavailable
|
||
LinktoHoldings_doesnotexist
|
||
LinktoRequest
|
||
LinktoResource
|
856 40 $u and 856 41 $u
Add display text ($$D) from $y + $3 + $z. If not available, then use code: "Online version"
856 1#, 856 10, and 856 11
Add display text ($$D) from $y + $3 + $z. If not available then use code: "Online version"
|
Validate that the link is to the resource by checking the content of subfield 3.
|
Additional links
|
856 42 $u.
Add display text ($$D) from $y + $3 + $z. If not available then use code: "Related online content"
506 $u $$Dlink to restrictions on access
538 $u $$Dlink to system details
540 $u $$D Link to terms governing use and reproduction
545 $u $$D Link to biographical or historical information
856 41 $u if $3 is "Sample Text" or "Publisher description"
|
|
Thumbnail
|
$$Tsyndetics_thumb (disabled)
$$Tgoogle_thumb
|
For Syndetics, this field requires an ISBN.
For Google, this field requires an OCLC and LCCN.
|
linktotoc
|
505 $u
$$Tamazon_toc
$$Tsyndetics_toc (disabled)
856 4# $u if $3=Table of Contents
|
Create Amazon and Syndetics links only if there is an ISBN.
|
linktoabstract
|
$$Tsyndetics_abstract
|
Add if there is an ISBN (020 $a).
|
linktoreview
|
520 1# $u
|
|
linktofa
|
555 0# $u
Add subfields abcd to $$D
|
|
linktouc
|
$$Tamazon_uc – add if there is ISBN
$$Tworldcat_isbn – add if there is ISBN ELSE add
$$Tworldcat_oclc – if there is OCLC number
|
|
linktoexcerpt
|
$$Tsyndetics_excerpt
|
Add if there is an ISBN.
|
Search
880 tags are mapped together with the standard tag in the following fields: Creator/Contributor, Title, Additional title, Description, Subject (for 600, 610, 611, and 630), TOC, and General (except for identifiers).
Index | Source tag | Notes |
---|---|---|
Creator/contributor
|
100 abcdejqu
110 abcde
111 abcdn
245 c
505 r
508 a
511 a
700 abcdejqu
710 abcde
711 abcdn
720 a
800 abcdejqu
810 abcde
811 abcdn
|
|
100 a
700 a
800 a
|
For the 100, 700, and 800 fields, if the first indicator is 1 or 2, then take only the second uppercase character in the string, following a comma.
|
|
Title
|
If type = Journal:
245 a
245 a,b,f,g,n,p
130 a
Else:
245 a,b,f,g,n,p
|
For journals up to three exact titles are indexed.
|
Additional title
|
100 fgklnpt
110 fgklnpt
111 fgklnpt
247 abnp
400 fklnptv
410 fklnptv
411 fklnpstv
440 anpv
490 av
700 fklmnoprst
710 fklmnoprst
711 fklnpst
730 adfklmnoprs
740 anp
800 fklmnoprstv
810 fklmnoprstv
811 fklnpstv
830 adfklmnoprstv
840 adfklmnoprstv
760,762,765,767,770,772,
773,774,775,776,777,780,
785,786,787 subfields st
|
|
Alternative Title
|
130
210
240
243
246 abnp
|
|
Description
|
520 $a
|
|
Subject
|
6XX fields – Strip all numeric subfields
Translation of LCC by enrichment
|
|
ISBN
|
020 az
|
|
ISSN
|
022 ayz
|
|
Resource type
|
Resource type from display
|
|
Creation date
|
008/07-10 and 008/11-14 are digits and not 9999
260 $c
|
|
Full Text
|
||
TOC
|
505 $a
|
|
RecordSource
|
Source ID from the control section
|
Required to filter out certain sources.
|
RecordID
|
Record ID from the control section
|
Required to retrieve record based on system number.
|
General
|
260 $b
502
511
508
518
521
534
586
0242 az
0243 az
027 az
028 a
|
|
Search scope
|
From PNX:
delivery/institution
control/sourceid (for example the data source is added as a scope)
|
|
Restricted search scope
|
||
Scope
|
Copies from the Search scope and Restricted search scope from the sections above
|
Sort
Sort type | DC field |
---|---|
Creation Date
|
008/07-10 OR 260 $c
|
Author
|
A single author sort key is created from one of the following tags. Subfields are the same as in the display section:
880/100
100
880/110
110
880/111
111
880/700
700
880/710
710
880/711
711
|
Title
|
A single title sort key is created from one of the following:
880/245
130 if FMT=SE
245
|
Popularity
|
Facets
Facet | Source | Note |
---|---|---|
Resource type
|
Create this based on the Resource type field from display section as follows.
Book -> books
Journal -> journals
Article -> articles
Text Resource -> books
Image -> images
Audio -> media
Video -> media
Score -> scores
Map -> maps
Other -> other
|
In some cases, two values should be created, each as a separate field.
|
Language
|
008/35-37 and 041 subfields a, d, e.
|
If the language is not a valid ISO 639 code it should not be created.
|
Creator/Contributor_
|
100/700 $a
110/710 $a
111/711 $a
|
The normalized format.
For 100 and 700, if the first indicator is 1 or 2 then take second upper case character in the string, following a comma.
7XX except for second indicator 1.
|
Topic
|
6XX except for 655
First facet level is all data up to the first occurrence of subfield $$v, x, y or z. Each subfield division (v, x, y or z) constitutes the next level.
The first facet level might have multiple occurrences in one record; these multiple occurrences should be "de-duplicated."
|
Punctuation that is in the field should be retained, except for periods at the end.
For example:
<datafield ind1="0" ind2="0" tag="630">
<subfield code="a">Bible.</subfield>
<subfield code="p">O.T.</subfield>
<subfield code="p">Pentateuch</subfield>
<subfield code="x">Sermons.</subfield>
</datafield>
Should become:
Bible - O.T. - Pentateuch-Sermons (the hyphen between Pentateuch and Sermons is for the levels).
|
Genre
|
655 $a
6XX $v
|
|
classification.lcc
|
Added by enrichment
|
|
Creation Date
|
008/07-10 OR 260 $c
|
Truncate 260 $c so that it has only 4 digits. If the date cannot be normalized to 4 digits, do not create the facet.
|
File size
|
Not in use
|
|
Collection
|
||
Physical format
|
Not in use
|
Not in use.
|
Top-level
|
online_resources -- assign if the delivery category is Online Resource, SFX Resource, or MetaLib Resource.
new – as tagged before load.
Available in Library map based on availability information in the source record.
|
|
Pre-filter
|
Based on Resource Type from the display section:
Book -> books
Journal -> journals
Article -> articles
Text Resource -> books
Image -> images
Video -> audio_video
Audio -> audio_video
Maps -> maps
Score -> scores
|
|
Related record
|
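The Creation Date facet rule above (use 008/07-10 or 260 $c, and create the facet only when a 4-digit date can be obtained) could be sketched like this. The function names are hypothetical. Trying the 008 before 260 $c, and taking the first run of four digits as the normalized date, are assumptions about how the rule is meant to be read.

```python
import re
from typing import Optional


def four_digit_year(value: str) -> Optional[str]:
    """Return the first run of four digits in the value, or None if there is none."""
    match = re.search(r"\d{4}", value)
    return match.group(0) if match else None


def creation_date_facet(field_008: str = "", field_260c: str = "") -> Optional[str]:
    """Return the facet value, or None when no 4-digit date can be derived."""
    # Assumption: 008/07-10 is tried first, then 260 $c.
    return four_digit_year(field_008[7:11]) or four_digit_year(field_260c)


print(creation_date_facet("850101s19uu    ", "c1987."))
```

Returning `None` here stands for "do not create the facet".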
Duplicate Record Detection Vector
Currently two types of record matching vectors exist:
-
T1 – for non-serials
-
T2 – for serials
The mapping of record to T1 or T2 is based on the format type. The format type is based on the extraction procedure that creates the format (FMT) field from pos. 6 and 7 in the leader.
-
T1 – All formats except for SE
-
T2 – SE
Vector for T1 - "non-serials"
Field ID | Nature of field | Content of Field/Source Tag + Subfield | Note |
---|---|---|---|
T
|
Type
|
1
|
Created if the format is not SE.
|
The following fields are for the candidate selection:
|
|||
C1
|
UnivID, UnivID_invalid
|
010 $a $z
|
Take prefix and number and remove any suffixes.
Multiple occurrences are delimited by a semicolon.
|
C2
|
ISBN, Invalid_ISBN
|
020 $a $z
|
Use data until a blank character or the end of subfield.
Multiple occurrences are delimited by a semicolon.
|
C3
|
Short title
|
245 $abnp
Use normalization routine #1
Exact match on first 20 and last 10 char.
|
The result is a single string of 30 characters.
|
C4
|
Year
|
008 7-10
|
|
The following fields are for the matching program:
|
|||
F1
|
UnivID
|
010 $a
|
Take prefix and number and remove any suffixes
|
F2
|
UnivID_Invalid
|
010 $z
|
Take prefix and number and remove any suffixes
Multiple occurrences are delimited by a semicolon.
|
F3
|
ISBN
|
020 $a
|
Use data until a blank character or the end of subfield.
Multiple occurrences are delimited by a semicolon.
|
F4
|
ISBN_Invalid
|
020 $z
|
Use data until a blank character or the end of subfield.
Multiple occurrences are delimited by a semicolon.
|
F5
|
Short title
|
245 $abnp
|
Same as C3.
|
F6
|
Year
|
008 7-10
|
|
F7
|
Full title
|
245 $abnp
Use routine #2 from
|
|
F8
|
Country of publication
|
008 15-17
|
|
F9
|
Pagination
|
300 $$a
|
|
F10
|
Publisher
|
260 $$b
Use filing routine #3 to normalize
|
Take only first occurrence of 260 tag and first occurrence of subfield b.
|
F11
|
Main entry (author, corporate body, meeting)
|
100 $abcdq
OR
110 $abcdn
OR
111 $abcdenq
Use normalization routine #3 to normalize
|
Vector for T2 - "serials"
Field ID | Nature of field | Content of Field/Source Tag + Subfield | Note |
---|---|---|---|
T
|
Type
|
2
|
Created if the format is SE.
|
The following fields are for the candidate selection:
|
|||
C1
|
UnivID, UnivID_invalid
|
010 $a $z
|
Use data until a blank character or the end of subfield.
Multiple occurrences are delimited by a semicolon.
|
C2
|
ISSN, Invalid_ISSN, cancelled_ISSN
|
022 $a $y $z
|
Use data until a blank character or the end of subfield.
Multiple occurrences are delimited by a semicolon.
|
C3
|
Short title
|
245 $abnp
Use filing procedure #1
Exact match on first 25 char.
|
The result is a single string of 25 characters.
|
C4
|
Place of publication
|
260 $$a normalized using routine 75
After applying routine #3 then take only the first string (up to first blank).
|
Take only first occurrence of 260 and first occurrence of subfield a.
|
The following fields are for the matching program:
|
|||
F1
|
UnivID
|
010 $a
|
Use data until a blank character or the end of subfield.
|
F2
|
UnivID_Invalid
|
010 $z
|
Use data until a blank character or the end of subfield.
Multiple occurrences are delimited by a semicolon.
|
F3
|
ISSN
|
022 $a
|
Use data until a blank character or the end of subfield.
Multiple occurrences are delimited by a semicolon.
|
F4
|
ISSN_Invalid
|
022 $y
|
Use data until a blank character or the end of subfield.
Multiple occurrences are delimited by a semicolon.
|
F5
|
ISSN_Cancelled
|
022 $z
|
Use data until a blank character or the end of subfield.
Multiple occurrences are delimited by a semicolon.
|
F6
|
Year
|
008 7-10
|
|
F7
|
Full title
|
245 $abnp
Use filing routine #2
|
|
F8
|
Truncated title
|
245 $a
Use normalization routine #2
|
|
F9
|
Country of publication
|
008 15-17
|
|
F10
|
Place of publication
|
260 $$a normalized using routine #3
After applying routine, take only the first string (up to first blank).
|
Take only first occurrence of 260 and first occurrence of subfield a.
|
F11
|
Main entry (author, corporate body, meeting)
|
110 $abcdn
OR
111 $abcdenq
OR
130 $a adlmnoprst
Use filing routine #3
|
FRBRization
Refer to Normalization Routines for Duplicate Record Detection, for the normalization routines for the author and title parts.
The key field has two subfields:
-
$$K key part
-
$$A key part type that determines the algorithm
Field ID | Source (value of $$K for K fields) | Key part type (value of $$A for K fields) | Note |
---|---|---|---|
T
|
Always 1
|
MARC 21 algorithm
|
|
K1-Kn
For every record a different number can be created
|
100 OR, 110 OR, 111 OR, 700 ADD, 710 ADD, 711 ADD
|
A
|
Single occurrence of 100, 110, and 111;
Multiple occurrences of 700, 710, 711, 100, 110.
Take subfields a, b, c, d, q
111, 711 - a, b, c, d, n, q
Do not generate key from 700 or 710 if subfield e = "former owner"
|
Kn
|
130
|
TO
|
Subfield a, d, m, n, p, r, s
Do not generate a key if subfield a or k contains "selections" or "census."
|
Kn
|
If format is not SE:
240 ADD
245 OR
242 OR
246 OR
247 OR
740 OR
245 subfield k
If format is SE:
240 ADD
245 OR
242 OR
246 OR
247 OR
740 OR
245 subfield k
|
T
|
240 – Subfields a, d, m, n,p,r, s
245 – a, b, e, f, g, n, p
242 – a, b, f, g, n, p
246 – a, b, f, g, n, p
247 – a, b, f, g, n, p
740 – Subfields anp
Do not generate a part key from 240 if it starts with any of the following: selections, laws, treaties, bills, statutes, Acts, public general acts, acts, rules, works, or census.
Note: If the format is not a serial (FMT is not SE), then the title part keys will be generated from both 240 and 245.
|
Delivery and Scoping
Delivery Field | Source | Additional normalization notes |
---|---|---|
Institution
|
Using ILS Institution Codes mapping table.
|
|
Delivery category
|
Based on algorithm in Defining the Delivery Category Algorithm.
|
|
Restricted delivery scope
|
Ranking
Local mapping required as relevant.
Booster Field | Source | Additional normalization notes |
---|---|---|
booster1
|
1 or as added by enrichment program
|
|
booster2
|
Not in use.
|
Enrichment
Local mapping required as relevant.
Enrichment Field | Source | Additional normalization notes |
---|---|---|
classification.lcc
|
050 $a, 090 $a
|
All occurrences added to separate fields.
|
fulltext
|
||
TOC
|
||
Abstract
|
||
Review
|
||
Rank-parent-child
|
||
Rank-Number of copies
|
||
Rank-Date first copy
|
||
Rank-Number of loans
|
Additional Data
This includes multiple occurrences in separate fields.
Additional data field | Source | Additional normalization notes |
---|---|---|
Author Last
|
100 1# OR 100 2# OR 700 1# OR 700 2# $a
|
Takes text until first comma.
Only one occurrence should be created.
|
Author First
|
100 1# OR 100 2# OR 700 1# OR 700 2# $a
|
Takes text after first comma and until first space.
Only one occurrence should be created.
|
Author initials
|
||
Author first initial
|
||
Author middle initial
|
||
Author suffix
|
||
Author
|
100 abcdejqu
|
|
Corporate Author
|
110 abcde
111 abcdn
|
|
Additional author
|
700 abcdejqu
710 abcde
711 abcdn
|
|
Series author
|
800 abcde
|
|
Book Title
|
If resource type is not an article or a journal:
245 abfgknp
|
Because the PNX cannot be used in conditions, this is based on LDR and 008.
|
Article title
|
||
Journal title
|
If resource type is Journal:
245 abfgknp
|
Since the PNX cannot be used in conditions, this is based on LDR and 008.
|
Short title
|
210 a
|
|
Additional title
|
246 abnp
|
|
Series title
|
400, 410, 411, 440, 490, 800, 810, 811, 830, 840
|
Strip subfield x.
|
Date
|
008/07-10 or 260 $c
|
Normalize to 4 characters.
|
RISDate
|
260 $c or 008/07-10
|
|
Additional Date
|
||
Volume
|
||
Issue
|
||
Part
|
||
Season
|
||
Quarter
|
||
Start page
|
||
End page
|
||
Pages
|
||
Article number
|
||
ISSN
|
022 a
|
Use data up to a blank character or end of subfield.
|
eISSN
|
776 x
|
Use data up to a blank character or end of subfield.
|
ISBN
|
020 a
|
Use data up to a blank character or end of subfield.
|
CODEN
|
030 a
|
Use data up to a blank character or end of subfield.
|
SICI
|
||
Metadata Format
|
If there is a 502 -> dissertation
Else based on Resource type from display:
Else -> book
|
|
Genre
|
The Genre mapping table maps the resource type from the display section of the PNX to the genre that is required by the OpenURL.
|
Use Genre mapping table.
|
RISType
|
Based on Resource type from display:
If there is a 502 then -> THES
book -> BOOK
journal -> JOUR
map -> MAP
video -> VIDEO
audio -> SOUND
music -> MUSIC
article -> JOUR
Else -> GEN
|
|
City of Publication
|
260 a
|
|
Publisher
|
260 b
|
|
Abstract
|
520 ab
|
|
Miscellaneous1
|
||
Miscellaneous2
|
||
Miscellaneous3
|
||
OCLC ID
|
035 $$a – if text (OCoLC) is present in 035.
|
Take all digits following the text (OCoLC) up to the first space.
Example:
035 $$a(OCoLC)814782
|
LCCN
|
010 $$a
|
Take prefix and number.
|
DOI
|
||
URL
|
||
Local fields 1-5
|
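The Author Last, Author First, and OCLC ID rules in the table above are simple string operations. The sketch below shows one possible reading. The function names are hypothetical, skipping the blank that follows the comma before taking the first name is an assumption, and the OCLC pattern only handles digits that directly follow `(OCoLC)`.

```python
import re
from typing import Optional, Tuple


def split_author(name: str) -> Tuple[str, str]:
    """Return (last, first): text up to the first comma, then the first word after it."""
    last, comma, rest = name.partition(",")
    if not comma:
        return last.strip(), ""
    # Assumption: leading blanks after the comma are skipped before taking
    # the text up to the first space.
    words = rest.split()
    return last.strip(), words[0] if words else ""


def oclc_id(field_035a: str) -> Optional[str]:
    """Return the digits that follow the text (OCoLC), or None if it is absent."""
    match = re.search(r"\(OCoLC\)(\d+)", field_035a)
    return match.group(1) if match else None


print(split_author("Lippe, Ole von der"))
print(oclc_id("(OCoLC)814782"))
```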
Browse
The system can create multiple occurrences in separate fields.
Browse field | Source | Additional normalization notes |
---|---|---|
Institution
|
PNX: delivery/institution
|
|
Author
|
All of the following:
100,110,111,700, 710, 711, 720, 800, 810, 811, and equivalent 880 fields
|
$$D (display form) and $$E (normalized form) are created.
|
Title
|
All of the following:
130, 210, 240, 243, 245, 246, 247, 440, 490, 730, 740, 830
And the following using $$t:
100, 110, 111, 700, 710, 711, 800, 810, 811
And equivalent fields from 880.
|
$$D (display form) and $$E (normalized form) are created.
|
Subject
|
600, 610, 611, 630, 648, 650, 651, 654, 655
|
$$D (display form) and $$E (normalized form) are created.
|
Call number
|
Rules not added.
|
Normalization Routines for Duplicate Record Detection
Certain characters are translated in XML:
Special character | Special meaning | Entity encoding |
---|---|---|
<
|
Begins a tag.
|
&lt;
|
>
|
Ends a tag.
|
&gt;
|
"
|
Quotation mark.
|
&quot;
|
'
|
Apostrophe.
|
&apos;
|
&
|
Ampersand.
|
&amp;
|
The publishing platform removes all leading and trailing spaces and packs double spaces.
Normalization Routine #1
-
Remove non-filing characters. Drop initial text using the non-filing indicator. The non-filing indicator is the second indicator in the following MARC tags: 222, 240, 242, 243, 245, 440, and 830. The second indicator contains a number from 0-9 indicating how many characters to drop. (There are some fields where the non-filing indicator is in the first position: 130, 630, 730, and 740.) Remove all text that appears within <<>> or within the Unicode characters 0088 and 0089.
-
Delete the following characters: '
-
Change the following characters to blank: !@#$%^&*()_+-={}}[]:";<>?,./~`
-
Convert characters using the "FILING-KEY-01" character conversion table.
-
Change characters to lower case.
-
Remove all spaces.
-
Take first 10 and last 10 characters.
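The steps of Normalization Routine #1 can be sketched as a small pipeline. This is an illustration under stated assumptions, not the shipped routine: the FILING-KEY-01 conversion table is not reproduced in this document, so that step is skipped, the function name and the `nonfiling` parameter are hypothetical, and strings of 20 characters or fewer are returned whole.

```python
import re

# Characters that the routine changes to a blank.
PUNCTUATION_TO_BLANK = "!@#$%^&*()_+-={}[]:\";<>?,./~`"
BLANK_TABLE = str.maketrans(PUNCTUATION_TO_BLANK, " " * len(PUNCTUATION_TO_BLANK))


def routine_1(text: str, nonfiling: int = 0) -> str:
    """Build a short match key from a title string."""
    # Drop the number of initial characters given by the non-filing indicator.
    text = text[nonfiling:]
    # Remove text enclosed in <<>> or between the characters U+0088 and U+0089.
    text = re.sub(r"<<.*?>>", "", text)
    text = re.sub("\u0088.*?\u0089", "", text)
    # Delete apostrophes, then change the listed punctuation to blanks.
    text = text.replace("'", "").translate(BLANK_TABLE)
    # The FILING-KEY-01 character conversion would go here; it is omitted
    # because the table is not part of this document.
    text = text.lower()
    # Remove all spaces.
    text = "".join(text.split())
    # Assumption: short strings are kept whole instead of overlapping.
    if len(text) <= 20:
        return text
    return text[:10] + text[-10:]


print(routine_1("The book : its history in England in the middle ages!", nonfiling=4))
```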
Normalization Routine #2
-
Remove non-filing characters. Drop initial text using the non-filing indicator. The non-filing indicator is the second indicator in the following MARC tags: 222, 240, 242, 243, 245, 440, and 830. The second indicator contains a number from 0-9 indicating how many characters to drop. (There are some fields where the non-filing indicator is in the first position: 130, 630, 730, and 740.)
-
Remove all text that appears within <<>> or within the Unicode characters 0088 and 0089.
For example:
<datafield ind1="1" ind2="0" tag="245">
<subfield code="a"><<the>> book : its history in England in the middle ages!</subfield>
</datafield>
Should become:
"book: its history in England in the middle ages"
-
Delete the following characters: '
-
Change the following characters to blank: !@#$%^&*()_+-={}}[]:";<>?,./~`
-
Convert characters using the "FILING-KEY-01" character conversion table.
-
Change characters to lower case.
Normalization Routine #3
-
Delete the following characters: '
-
Change the following characters to blank: !@#$%^&*()_+-={}}[]:";<>?,./~`
-
Convert characters using the "FILING-KEY-01" character conversion table.
-
Change characters to lower case.
Normalization Routines for FRBR
The publishing platform will delete leading and trailing blanks and remove double spaces.
Author Part Normalization
-
Delete characters: | [ ] '
-
Change characters to space: $~'^%*/\?@.:;<>{}}-()"!¿¡,
-
Convert characters using the NACO_diacritics character conversion table.
-
Change characters to lower case.
Title Part Normalization
-
Remove non-filing characters. Drop initial text using the non-filing indicator. The non-filing indicator is the second indicator in the following MARC tags: 222, 240, 242, 243, 245, 440, and 830. The second indicator contains a number from 0-9 indicating how many characters to drop. (There are some fields where the non-filing indicator is in the first position: 130, 630, 730, and 740.)
-
Delete characters: | [ ] '
-
Change characters to space: $~'^%*/\?@.:;<>{}}-()"!¿¡
-
Convert characters using the NACO_diacritics character conversion table.
-
Change characters to lower case.
Defining the Delivery Category Algorithm
The following out-of-the-box algorithm is used for MARC 21. It should be possible to distinguish between the following resource types:
-
Physical items (except for microform)
-
Microform
-
SFX resources
-
Online resources
The algorithm is read from top to bottom. Once a record is assigned a category, the algorithm stops.
When there are several definitions for the same category the priority is given to the "safest" option.
In the algorithm, priority has been given to online resources based on the assumption that users most often prefer this option. Primo will include a display of the location and availability status of physical items.
The format is based on the definitions used for each resource type. For more information on these definitions, see LDR Positions.
Condition | Delivery Category | Note |
---|---|---|
If 035=SFX
|
SFX Resources
|
|
007/00=c and 007/01=r
|
Online Resource
|
|
If there is an 856 4#, 856 40, or 856 41
|
Online Resource
|
Add conditions based on $$3 to prevent this category from being assigned if the link is not to the resource (e.g. $$3 is Table of Contents, or Abstract).
|
If 007/00=h
|
Microform
|
|
If FMT=BK or MU or SE or MX and 008/23=a or b or c
|
Microform
|
|
If FMT=MP or VM and 008/29=a or b or c
|
Microform
|
|
If 245 $$h includes the string micro
|
Microform
|
|
If not any of the above
|
Physical Item
|
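Because the algorithm is read from top to bottom and stops at the first match, it maps naturally onto a chain of early returns. The sketch below assumes a simplified, hypothetical `Record` structure that holds only the values the conditions look at. Reading "035=SFX" as "an 035 value contains the string SFX" is an assumption, and the optional $$3 check on 856 links is left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    """Hypothetical, simplified view of a MARC record for this example."""
    fmt: str = "BK"                                           # format derived from the leader
    f007: List[str] = field(default_factory=list)             # 007 field values
    f008: str = ""                                            # 008 field value
    f035: List[str] = field(default_factory=list)             # 035 values
    f856_indicators: List[str] = field(default_factory=list)  # e.g. "40", "41", "4 "
    f245_h: str = ""                                          # 245 $$h value


def delivery_category(rec: Record) -> str:
    """Return the first delivery category whose condition matches."""
    f008 = rec.f008.ljust(40)
    # Assumption: "035=SFX" means an 035 value that contains the string SFX.
    if any("SFX" in value for value in rec.f035):
        return "SFX Resources"
    if any(value[:2] == "cr" for value in rec.f007):
        return "Online Resource"
    if any(ind in ("4 ", "40", "41") for ind in rec.f856_indicators):
        return "Online Resource"
    if any(value[:1] == "h" for value in rec.f007):
        return "Microform"
    if rec.fmt in {"BK", "MU", "SE", "MX"} and f008[23] in {"a", "b", "c"}:
        return "Microform"
    if rec.fmt in {"MP", "VM"} and f008[29] in {"a", "b", "c"}:
        return "Microform"
    if "micro" in rec.f245_h.lower():
        return "Microform"
    return "Physical Item"


print(delivery_category(Record(f007=["hd afb"], f856_indicators=["40"])))
```

The last call shows the stated priority: a record with both a microform 007 and an 856 40 link is categorized as an online resource, because that condition is checked first.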