Migrating Content from a Digital Commons Repository to Alma Digital
Several hundred files had to be migrated from a Digital Commons repository to Alma Digital. The collections covered a variety of file types and document types.
Depending on the type of resource, and on whether MARC or Dublin Core records suited it best, we chose different methods to migrate the content into Alma.
Books
In each of about 10 cases we already had a MARC record for the print version in Alma. As there were only a few entire books to migrate, we handled these on a title-by-title basis. The steps for these were:
- Export the bib record from Alma and open it in MARCedit. This allowed me to set up tasks for batch changes that could be applied easily to each of the book records; some of those tasks were then also available for other types of migrated content.
- Edit the record in MARCedit so that it suits an online version of the book.
- Ensure the new record has appropriate metadata so that the "Open Access" marker appears in Alma and Primo.
- Import the record to Alma as a new record (MARCXML file format).
- Use a browser extension to batch download the chapter PDFs from a landing page in the Digital Commons repository to the desktop. This made assembling the files into a folder very quick.
- We provide the entire work as a PDF in one representation, and the individual chapter/part PDFs in another representation.
- Rename files to assist with labels for display in the Alma viewer.
- In Alma, search for the record and add the representations manually.
- Scan book covers for thumbnails.
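The renaming step can be scripted when a book has many chapters. The following is only a minimal sketch: the download naming pattern (`ch_3_methods.pdf`), the target pattern (`Chapter 03 Methods.pdf`) and the function names are assumptions for illustration, not the names we actually used. Zero-padding the chapter number keeps the files in the expected order when they are sorted by name.

```python
import re
from pathlib import Path


def label_friendly_name(filename: str) -> str:
    """Turn a download name such as 'ch_3_methods.pdf' into 'Chapter 03 Methods.pdf'.

    The input pattern is an assumption; names that do not match are returned unchanged.
    """
    path = Path(filename)
    match = re.match(r"ch(?:apter)?[_\- ]*(\d+)[_\- ]*(.*)", path.stem, re.IGNORECASE)
    if not match:
        return filename  # leave unrecognised names untouched
    number, rest = match.groups()
    # Replace separators with spaces and capitalise each word for display.
    title = re.sub(r"[_\-]+", " ", rest).strip().title()
    label = f"Chapter {int(number):02d} {title}".strip()
    return f"{label}{path.suffix.lower()}"


def rename_chapters(folder: Path) -> None:
    """Rename every PDF in the folder that matches the assumed chapter pattern."""
    for pdf in sorted(folder.glob("*.pdf")):
        new_name = label_friendly_name(pdf.name)
        if new_name != pdf.name:
            pdf.rename(pdf.with_name(new_name))
```

Calling `rename_chapters(Path("book_downloads"))` on a hypothetical download folder would rename the matching files in place, so it is worth trying it on a copy of the folder first.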
Other MARC record collections
Local journals (ceased publication). We had several journals in this category, each with a PDF for every article in the repository to migrate. The largest set was about 500 articles, the smallest about 10. For each journal these steps were followed:
- Using MARCedit, harvest from the Digital Commons repository via OAI-PMH, specifying the set (each journal had its own set), and map the results to MARC records. The URL for the PDF version in the repository was automatically mapped to an 856 tag.
- Edit the records using MARCedit
- Batches of tasks were defined that could be applied to each journal with some minor variations, e.g. the task had to be updated to change the ISSN for each new set
- Added 583 tag to describe the migration, e.g.
  <marc:datafield tag="583" ind2=" " ind1=" ">
    <marc:subfield code="a">Transformed digitally</marc:subfield>
    <marc:subfield code="z">Migrated from epublications@bond repository</marc:subfield>
    <marc:subfield code="2">pda</marc:subfield>
    <marc:subfield code="c">20181127</marc:subfield>
    <marc:subfield code="h">Bond University Library</marc:subfield>
  </marc:datafield>
- Added 786 linking tag to a journal level record
- Added 506 tag for Open Access indicators
- Modified LDR and 008 (it was tricky to change only the encoding that applied to the whole set and not to the individual articles), and manually changed the date positions
- Uploaded the set of records to Alma using a "new record with no inventory" type of profile, and created a set based on these records
- Ran the "Generate Representations based on 856 tag" job on the set, specifying the collection, access rights, and which fields to extract the label and any notes from
- After closure of the Digital Commons repository, these sets need to have the 856 tag that points to the old repository removed. This could be done with a normalization rule and process in Alma or, for large sets, by exporting all the records, editing them and reloading.
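For the export, edit and reload option, the 856 clean-up could be scripted rather than done by hand. This is a rough sketch only, assuming the records are exported as MARCXML in the standard `http://www.loc.gov/MARC21/slim` namespace. The host name `old-repository.example.org` used below is a placeholder, and `strip_old_856` is a hypothetical helper, not part of Alma or MARCedit.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
# Keep the familiar 'marc:' prefix when the tree is written back out.
ET.register_namespace("marc", MARC_NS)


def strip_old_856(marcxml: str, old_host: str) -> str:
    """Remove 856 fields whose $u URL contains old_host; keep all other fields."""
    root = ET.fromstring(marcxml)
    for record in root.iter(f"{{{MARC_NS}}}record"):
        # Copy the list first so removing fields does not disturb the loop.
        for field in list(record.findall(f"{{{MARC_NS}}}datafield[@tag='856']")):
            urls = [
                sf.text or ""
                for sf in field.findall(f"{{{MARC_NS}}}subfield[@code='u']")
            ]
            if any(old_host in url for url in urls):
                record.remove(field)
    return ET.tostring(root, encoding="unicode")
```

Matching on the host name rather than deleting every 856 leaves any links that point elsewhere untouched. The output would still need to be checked before reloading it into Alma.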
Sets to have Dublin Core records
Various other smaller sets, including images, audio, etc., were also migrated using the following method.
- Download the files into a folder (to be used by the Alma Digital Uploader) for an ingest
- Rename if necessary so that file names make nice labels when they are processed in Alma, and give some consideration to file naming when you have multiple files in a representation to get the preferred sort order.
- Harvest using MARCedit OAI-PMH specifying the set
- Make any batch edits required
- Export using MARC to Dublin Core mapping
- Edit the XML using Notepad++ to insert file details for each record, e.g.
  <dc:identifier>file://filename.jpg</dc:identifier>
- Create or modify a digital inventory import profile to work with your ingest files. Don't use any special characters in your filenames, and make sure the profile is looking for XML if that is the type of metadata file.
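For larger sets, inserting the file details by hand in Notepad++ becomes tedious, and the step could be scripted instead. The sketch below is illustrative only. It assumes each record sits in a `record` element containing a `dc:title`, and that files are matched to records by title. Both are assumptions about the exported XML, and `add_file_identifiers` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Keep the 'dc:' prefix when the tree is serialised again.
ET.register_namespace("dc", DC_NS)


def add_file_identifiers(xml_text: str, files_by_title: dict, record_tag: str = "record") -> str:
    """Append <dc:identifier>file://NAME</dc:identifier> to each record with a known file.

    files_by_title maps a record's dc:title to its file name (an assumed
    matching rule); records without a match are left unchanged.
    """
    root = ET.fromstring(xml_text)
    for record in root.iter(record_tag):
        title = record.findtext(f"{{{DC_NS}}}title", default="").strip()
        filename = files_by_title.get(title)
        if filename is None:
            continue  # no file known for this record
        identifier = ET.SubElement(record, f"{{{DC_NS}}}identifier")
        identifier.text = f"file://{filename}"
    return ET.tostring(root, encoding="unicode")
```

If titles are not unique in a set, another field (or simply the record order) would be a safer key for the match.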
I found the training videos very helpful in working through these processes.