Configuring Content Structures
A content structure defines the layout of the package that must be delivered to Rosetta so that Rosetta can convert it to a Rosetta-compatible METS.
Deposit Managers work with content structures using the List of Content Structures page (see Accessing the List of Content Structures Page). The following activities can be performed using this page:
- Adding a Content Structure
- Viewing Content Structure Details
- Duplicating a Content Structure
- Viewing Material Flows Associated with the Content Structure
- Updating a Content Structure
- Deleting a Content Structure
Accessing the List of Content Structures Page
The List of Content Structures page enables Deposit Managers to view, activate, duplicate, and delete existing content structures. In addition, Deposit Managers can use this page to add new content structures.
To access the List of Content Structures page:
From the Rosetta drop-down menu, select Deposits > Deposit Arrangements > Content Structure.
The List of Content Structures page opens.
List of Content Structures Page
To narrow your view to a subset of what is shown, click the Filter drop-down arrow and select the type, class, or group you want to see. To find an existing structure, enter its name or type in the Find field, select an in: option, and click the Go button.
The table displays the following information:
Column | Description |
---|---|
Name | The name of the content structure. |
Type | The type of content structure. |
Material Flow | The material flow or flows associated with the content structure. |
Created On | The date on which the content structure was created. |
Adding a Content Structure
Deposit Managers can add a new content structure to define how metadata must be converted from its original format to one supported by Rosetta.
To add a content structure:
- Access the List of Content Structures page (see Accessing the List of Content Structures Page).
- In the Add Content Structure drop-down list, select the content structure to which the original format must be converted.
- Click Add. The Content Structure Details page for that format converter opens.
- Enter information in the fields for the format converter you selected. See Set of Files Converter, Dublin Core Converter, XSL Converter, BagIt Converter, METS Converter, or CSV Content Structure for examples of each format.
- Click Save. The content structure is saved in the Rosetta system.
To enable Producer Agents to use the content structure for uploading content, Deposit Managers must associate the content structure with a material flow (see Associating Material Flow Components with Material Flows).
Set of Files Converter
The Set of Files Converter is the simplest and quickest method to load content into Rosetta. Use this content structure for UI-based deposits and other non-structured content.
Content Structure Form: Set of Files
All fields except Name are populated with default values. Enter a name for the converter and review and change, if necessary, the default values, using the descriptions below.
- Name: The name of the content structure.
- Status: The status of the content structure, either Active or Inactive. If the content structure is Inactive, it cannot be used in a Material Flow.
- Create Complex: If set to true, Rosetta creates only one IE. Its one representation contains all of the files in the streams subfolder. If set to false, Rosetta creates separate IEs with one representation and one file for each file in the streams subfolder.
When Create Complex is set to false, Rosetta copies each file name to dc:title, because the file name is unique, and moves the original content of dc:title to dc:alternative, because the original title is the same for every IE.
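The effect of the Create Complex setting can be sketched as follows. This is only an illustrative model of the behavior described above, not Rosetta code: the `IE` class and the `plan_ies` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IE:
    """Hypothetical stand-in for a Rosetta intellectual entity (IE)."""
    title: str                          # models dc:title
    alternative: Optional[str] = None   # models dc:alternative
    files: list[str] = field(default_factory=list)


def plan_ies(sip_title: str, stream_files: list[str], create_complex: bool) -> list[IE]:
    """Model which IEs result from the files in the streams subfolder."""
    if create_complex:
        # One IE whose single representation holds every file.
        return [IE(title=sip_title, files=list(stream_files))]
    # One IE per file: the file name becomes the title and the
    # shared SIP title moves to the alternative title.
    return [IE(title=name, alternative=sip_title, files=[name]) for name in stream_files]


for ie in plan_ies("Harbour photographs", ["img001.tif", "img002.tif"], create_complex=False):
    print(ie.title, "|", ie.alternative, "|", ie.files)
```

With `create_complex=True`, the same call returns a single IE titled "Harbour photographs" that contains both files.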
Dublin Core Converter
The Dublin Core Converter is similar to the Set of Files Converter, but it allows users to maintain a relation between specific metadata and file streams. This is useful when you want to create multiple IEs with different metadata in one SIP.
A dc.xml file may contain multiple DC records. <record> elements can be nested in any root element (for example: <records>, <collection>, etc.).
Content Structure Form: Dublin Core
- Name: The name of the content structure.
- Status: Should be set to Active.
- Stream Source: The DC field that references the files (one or more) to be ingested. The file location is relative to the SIP's streams subfolder. Absolute NFS paths and HTTP references are also supported (the URL must be a direct link to the binary file).
The list of available stream source fields can be edited from the Content Structure Stream Source code table in the administration UI.
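As an illustration of a dc.xml file that holds several DC records, the following sketch builds one with Python's standard library. The `<records>` root and the `<record>` elements follow the description above. The use of dc:identifier as the Stream Source field, the file names, and the `build_dc_xml` helper are assumptions made for this example only; use whichever stream source field is configured in your content structure.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core elements namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_xml(items):
    """Build a dc.xml document with one <record> per (title, stream) pair."""
    root = ET.Element("records")  # any root element may wrap the records
    for title, stream in items:
        record = ET.SubElement(root, "record")
        ET.SubElement(record, f"{{{DC_NS}}}title").text = title
        # Assumption: dc:identifier is the configured Stream Source field.
        # The path is relative to the SIP's streams subfolder.
        ET.SubElement(record, f"{{{DC_NS}}}identifier").text = stream
    return ET.tostring(root, encoding="unicode")


dc_xml = build_dc_xml([
    ("Annual report 2019", "report2019.pdf"),
    ("Annual report 2020", "reports/report2020.pdf"),
])
print(dc_xml)
```

Each record carries its own metadata and points to its own file, which is what allows one SIP to produce multiple IEs with different metadata.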
XSL Converter
The XSL converter allows users to upload SIPs of any source format in an automated material flow. This is done by preparing deposits in XML format and creating an XSL file that can convert the input XML files to DC files. This format allows customers uploading files to Rosetta to enrich the IEs with metadata information without the need to create a full, valid Rosetta METS (only DC information can be provided along with the streams to be uploaded).
XSL Content Structure Details
The XSL converter definitions are similar to the DC converter with the addition of the following fields:
- Create Complex: If this field is set to False, Rosetta creates one IE with one representation and one file for each file in the streams subfolder. If this field is set to True, Rosetta creates only one IE. Its one representation contains all of the files in the streams subfolder.
- Upload XSL File: The XSL file name including full path.
The file extension (.xsl) is validated by Rosetta at the time of creating the content structure’s instance.
The system performs the following steps:
- Using the XSL file, the system converts the input XML to DC format.
- The system uploads the DC and stream file(s) based on the information in the converted DC.
BagIt Converter
The BagIt converter allows you to upload SIPs in BagIt format. This format consists of the following sections:
- Data – the digital files
- Manifest – contains a checksum and the relative path for each file, which enables Rosetta to validate the files
- txt file – contains the metadata tags of the BagIt data
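The three sections listed above can be sketched with a small script that writes a minimal bag and then checks the manifest. The file names used here (data/, manifest-sha256.txt, bagit.txt, bag-info.txt) follow the general BagIt specification. Which checksum algorithms and tag files a given Rosetta installation accepts is not stated here, so treat those choices, along with the `write_bag` and `verify_manifest` helper names, as assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def write_bag(bag_dir: Path, payload: dict[str, bytes], tags: dict[str, str]) -> None:
    """Write payload files, a checksum manifest, and a tag file into bag_dir."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for name, content in payload.items():  # flat file names only in this sketch
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # Each manifest line: checksum, then the path relative to the bag root.
        manifest_lines.append(f"{digest}  data/{name}")
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8")
    tag_lines = [f"{label}: {value}" for label, value in tags.items()]
    (bag_dir / "bag-info.txt").write_text("\n".join(tag_lines) + "\n", encoding="utf-8")


def verify_manifest(bag_dir: Path) -> bool:
    """Recompute each checksum and compare it with the manifest entry."""
    manifest = bag_dir / "manifest-sha256.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp) / "example_bag"
    write_bag(bag, {"page1.txt": b"first page"}, {"Source-Organization": "Example Library"})
    print(verify_manifest(bag))
```

The check in `verify_manifest` is the same kind of validation the manifest makes possible on the receiving side: a file that was altered or truncated in transit no longer matches its recorded checksum.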
The BagIt content structure form contains the following fields:
BagIt Converter
- Store Tags as Source Metadata check box – select to convert tag files to source metadata (whether or not they are mapped)
- Tag File – the name of the txt file that contains the BagIt metadata tags
- Tag – the BagIt metadata tag to which you want to map the METS field
- Property – the METS field to which you want to map the BagIt metadata tag
METS Converter
The METS content structure form contains the following fields:
METS Converter Form
- Name: The name of the content structure.
- Status: Should be set to Active.
CSV Content Structure
The CSV Content Structure allows users to submit metadata in CSV format, along with file streams. Rosetta transforms each CSV row into an object (Collection, IE, Representation, File—depending on the Object type field). The CSV file should hold all the relevant information for creating the objects. This can include metadata about the SIP and can also be used to create new collections.
The CSV Content Structure can be used only in a material flow with the system-defined Detailed CSV or NFS submission format. To use NFS, place the files under the streams directory and the CSV file under the content directory. If you deposit a zip file, its contents are extracted automatically (subject to the Submission Format validations).
The CSV Content Structure UI requires users to specify a CSV template, which determines the metadata fields depositors are required to fill. These templates are managed in the CSV Template UI, in the Producers section. For more information, see CSV Templates.
CSV Converter Update Page
The Generate CSV Option field enables staff users to allow Producer Agents to auto-generate a full CSV file that represents the structure of the uploaded file. According to the selected value, Producer Agents will be prompted to download an auto-generated CSV file once a file has been uploaded. They will then be able to save their deposit activity as a draft, conveniently edit the CSV file, add metadata for each object, and upload and submit at a later stage. The following options are available:
- None - no CSV auto-generation will be available.
- Simple - each file will become one IE.
- Collections - like Simple, but with each node in the file becoming a collection.
- Complex - the entire deposit will become one IE.
- A physical structure map will be created based on the order of the files in the CSV file. When using Complex, it may be advisable to use the Generate Logical Structmap enrichment task to maintain the file structure hierarchy.
- When using Collections, a dcterms:isPartOf field must be added to the CSV mapping table so that sub-collections are created properly. Three additional fields may be added with the collection prefix:
- collection.externalSystem
- collection.externalId
- collection.description
The externalSystem and externalId fields must be unique as a pair.
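Because externalSystem and externalId must be unique as a pair, it can help to check a CSV file for duplicates before depositing it. The following is a minimal sketch of such a check. It assumes that the CSV header uses the literal column names collection.externalSystem and collection.externalId; the actual header labels depend on your mapping table, and the `duplicate_collection_ids` function is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def duplicate_collection_ids(csv_text: str) -> list[tuple[str, str]]:
    """Return (externalSystem, externalId) pairs that occur on more than one row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    pairs = Counter()
    for row in reader:
        system = (row.get("collection.externalSystem") or "").strip()
        ext_id = (row.get("collection.externalId") or "").strip()
        if system or ext_id:  # skip rows that carry no collection identifiers
            pairs[(system, ext_id)] += 1
    return sorted(pair for pair, count in pairs.items() if count > 1)


sample = (
    "collection.externalSystem,collection.externalId,collection.description\n"
    "ALMA,1001,Maps\n"
    "ALMA,1002,Letters\n"
    "ALMA,1001,Maps again\n"
)
print(duplicate_collection_ids(sample))
```

Note that the same externalId may legitimately appear under two different external systems; only a repeated pair is reported.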
The Download CSV Template option downloads a CSV file with all the mandatory fields defined in the selected mapping table. This file can be provided to Producer Agents when auto-generation is set to None.
Download CSV Template
CSV Templates
For information specific to CSV templates for structural IEs, see CSV Format for Structural IEs of the Rosetta Producer’s Guide.
CSV templates include a list of mandatory metadata fields that must be part of a deposited CSV file. This list populates the CSV Template drop-down list in the CSV Content Structure UI. To access the templates, from the Rosetta drop-down menu, select Deposits > Deposit Arrangements > CSV Templates.
CSV Template – View
CSV Template – Edit
The mandatory metadata fields include both system-level mandatory fields and user-level mandatory fields. System-level mandatory fields include:
- File Original Name - File name (should be left empty when using HTTP/FTP)
- File Original Path - Absolute path to file (should be left empty when files are under SIP streams directory) or URL
- Collection dc:title and dcterms:isPartOf (when generating collections)
User‐level mandatory fields can include additional fields the library may require (such as an IE dc:creator field). System mandatory fields are visible on the right and cannot be changed. To add or remove a user‐level mandatory field, simply drag and drop the requested field in the multi‐select widget.
- Collection-level mandatory fields are relevant only if the selected CSV content structure is configured to use collections (Generate CSV Option=Collections).
- The default value of the Preservation Type field is Preservation Master.
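Before a deposit, a producer can check that a CSV file contains every mandatory field of the selected template. The sketch below compares the header row of a CSV file with a list of mandatory field names. The header labels shown (for example "File Original Name") are taken from the list above, but the exact spelling Rosetta expects in a CSV header is an assumption, and `missing_mandatory_fields` is a hypothetical helper.

```python
import csv
import io


def missing_mandatory_fields(csv_text: str, mandatory: list[str]) -> list[str]:
    """Return the mandatory field names that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in mandatory if name not in present]


# Hypothetical template: two system-level fields plus one user-level field.
template_fields = ["File Original Name", "File Original Path", "dc:creator"]

deposit_csv = (
    "Object Type,File Original Name,File Original Path\n"
    "File,scan01.tif,\n"
)
print(missing_mandatory_fields(deposit_csv, template_fields))
```

In this example the user-level dc:creator column is missing, so the deposit file would need that column added before submission.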
Viewing Content Structure Details
Deposit Managers can view the content structure details, such as the content structure format, original format, and mapping table.
Deposit Managers cannot update the details while viewing them.
To view the content structure details:
On the List of Content Structures page (see Accessing the List of Content Structures Page), locate the content structure you want to view and click View. The View Content Structure Details page opens.
Read-Only View of Content Structure Details
For a description of the information displayed on this page, see the table in Accessing the List of Content Structures Page.
Duplicating a Content Structure
Deposit Managers can duplicate a content structure. This is especially helpful when creating a new content structure, because it is often faster to duplicate an existing content structure and then modify it than to create a new one from scratch.
To duplicate a content structure:
On the List of Content Structures page (see Accessing the List of Content Structures Page), locate the content structure you want to duplicate and click Duplicate. The Rosetta system creates a copy of the structure.
An exact copy of the content structure is added to the List of Content Structures page. The Rosetta system automatically labels the new content structure with the name Copy of followed by the name of the original content structure.
Viewing Material Flows Associated with the Content Structure
Deposit Managers can view the material flows that are associated with the content structure.
To view the material flows:
On the List of Content Structures page (see Accessing the List of Content Structures Page), locate the content structure for which you want to view the material flows and click one of the following:
- The Associated Material Flows link, when multiple material flows are associated with the content structure.
- The name of the material flow, when a single material flow is associated with the content structure.
The List of Material Flows page opens. The page displays columns containing the information described in Adding a Material Flow. You cannot update the material flow details.
To return to the List of Content Structures page, click Save.
Updating a Content Structure
Deposit Managers can update content structure details at any time. For example, a Deposit Manager can specify another content structure or change the mapping table.
To update a content structure:
- On the List of Content Structures page (see Accessing the List of Content Structures Page), locate the content structure that you want to update and click Update. The Update Content Structure Details page opens.
- Modify the fields as needed.
- To save your changes and return to the List of Content Structures page, click Save.
The system updates the content structure details.
Deleting a Content Structure
A Deposit Manager can delete a content structure when it is not being used by any Producers and the Deposit Manager does not want to maintain the content structure.
Deposit Managers cannot delete a content structure when a Producer Agent is using it to deposit content. Deposit Managers can delete the content structure only after the deposit process is complete and no other Producer Agent is using the content structure.
To delete a content structure:
- On the List of Content Structures page (see Accessing the List of Content Structures Page), locate the content structure you want to delete and click More. Additional options are displayed.
- Click Delete. The confirmation page opens.
- Click OK. The content structure is removed from the list.
The content structure is removed from the Rosetta system. Producer Agents can no longer use this content structure when depositing content.