Glossary
Access copy |
When an Intellectual Entity (IE) contains files that cannot be viewed due to constraints in a library's technical environment, staff users can create a viewable copy called an Access Copy. Access Copies allow users to add representations of a derivative type to complete an IE. |
Access rights |
The rights of a user to view a particular object, and any related restrictions. |
Access rights checker |
A plug-in component that determines the access restrictions that apply to a user, based on the user’s particular context and the particular object. |
Access rights exceptions |
Exceptions to the object’s access rights that may allow a certain user special access rights based on the user’s or the object’s characteristics. |
Settings that define who can access the content deposited by Producer Agents, and when this content can be accessed. Access rights options are defined by staff users. | |
Access SDK |
A code component that exposes all the relevant APIs needed to create viewers and viewer pre-processors. |
Agent |
A user or system that performs an action. |
Application Library |
The Application Library contains all of the data regarding applications: name, ID, license end-date, and so forth. Each application can be related to one or more formats. |
Approval group |
A group of Assessors, Arrangers, or Approvers to which the same group of SIPs is assigned by a back office Administrator. The group is based on business rules, and the parameters for calculating these values are the material type and classification. An approval group helps manage SIPs as they move through system processes. |
Staff users responsible for reviewing SIPs and deciding whether a SIP must be approved, returned to the Producer Agent, or declined. | |
Audit event |
An event that is stored as a discrete entry in an audit event table. |
Back Office Administrator |
An administration user responsible for configuring the overall framework that defines how Producers, Producer Agents, and staff users interact with the Rosetta system, and how content is processed. |
BIRT |
BIRT is an open source Eclipse-based reporting system that integrates with your Java/J2EE application to produce reports. |
Bitstream |
An object embedded within a bytestream that cannot be transformed into a standalone file without the addition of file structure (for example, headers). |
Boiler Plate Statement |
The copyright statement presented to a user when the user deposits items. |
Bytestream |
A compound file containing filestream(s) and/or embedded bitstreams. |
Casual Producers |
Casual Producers are associated with a single non-authenticated Producer Agent who submits material on a one-time basis through a special unrestricted URL. This URL links to a deposit interface limited to a set of submission wizards. Casual Producers are not managed actively. The option to allow creation of Casual Producers is configurable. If the parameter (in the General Parameters table) is set to false, there will be no option to create Casual Producers. |
Classification group |
Since significant properties are usually shared between multiple formats, the classification group is a way to aggregate these common properties so that they will be connected to all the relevant formats. |
Consortia |
The second level of the consortium hierarchy: a group of institutions that can share certain entities and support cross-institutional roles. The consortia level may not be used actively in any given installation, but it is always available and is integral to the structure of the system. |
Consortium hierarchy |
The four-level hierarchy at which all roles and entities “live.” The levels, from highest to lowest, are: Installation, Consortia, Institutions, and Department (Admin Unit). |
Content structure |
A specific type or structure of the submitted content that is supported by the Rosetta system. These are: |
Core repository |
A two-layer Rosetta module. The upper layer, the services layer, is responsible for the management and execution of processes. The lower layer, the data layer, is responsible for storing and retrieving DE/IE. In the context of SIP processing, the core repository’s upper layer is used (via API) for executing and monitoring task chains. The ability to execute and monitor task chains is important as some of the stage routines use task chains to implement the stage’s processing instructions. |
CSV |
Comma-separated values. A file format used for storing database information in ASCII format (each entry or field is separated by a comma and each new row is represented by a new line). |
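As a small illustration of the format, the sketch below parses CSV text with Python's standard `csv` module. The column names and values are invented for the example and are not a Rosetta file layout.

```python
import csv
import io

# Hypothetical sample data; the column names are illustrative only.
# The last row quotes a field because it contains a comma.
SAMPLE = (
    "title,creator,year\n"
    "Annual report,Example Library,2009\n"
    '"Maps, volume 2",Example Library,2010\n'
)

def read_rows(text):
    """Return each CSV row as a dict keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))

rows = read_rows(SAMPLE)
for row in rows:
    print(row["title"], row["year"])
```

Each new line becomes one row and each comma separates fields, except inside a quoted field.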
CSV Loader |
A Rosetta component that loads SIP data and metadata and creates a deposit activity from them. |
Delivery |
The component of the Rosetta system that enables content consumers to view content and content metadata that are stored in the permanent repository. Content consumers view the content and metadata using a Web browser and through different viewers that are suitable for the file's format. New viewers can be added to support different formats. |
Delivery Manager |
A component that accepts a delivery request, checks access rights, and directs the user to the appropriate viewer based on the delivery rules. |
Delivery Rules |
The rules that determine how content is delivered to end users and external systems. Delivery rules are configurable by the customer. The Delivery Rules Manager analyzes them in order and stops searching as soon as it finds a matching rule. |
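The ordered, first-match evaluation described for delivery rules can be sketched as follows. This is only an illustration of the idea: the `DeliveryRule` class, the request fields, and the viewer names are hypothetical and do not come from the Rosetta API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class DeliveryRule:
    """A hypothetical rule: a condition on the request plus a viewer name."""
    name: str
    matches: Callable[[dict], bool]
    viewer: str

def select_viewer(rules, request) -> Optional[str]:
    """Walk the rules in order and stop at the first one that matches."""
    for rule in rules:
        if rule.matches(request):
            return rule.viewer
    return None  # no rule matched

# Order matters: the catch-all rule is placed last.
RULES = [
    DeliveryRule("pdf", lambda r: r.get("format") == "pdf", "pdf_viewer"),
    DeliveryRule("images", lambda r: r.get("format") in {"jpg", "tif"}, "image_viewer"),
    DeliveryRule("fallback", lambda r: True, "generic_viewer"),
]

print(select_viewer(RULES, {"format": "tif"}))
```

Because the search stops at the first match, a broad rule placed early would hide every rule after it.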
Delivery Rules Manager |
A component that determines the appropriate viewer for a particular delivery request. |
Department (a.k.a. Admin Unit) |
The fourth and lowest level of the consortium hierarchy, which allows user roles and database collections to be managed directly. |
Deposit activity |
A transaction record stored in the Deposit Area for each deposit initiated through the Deposit Application. A Deposit Activity can have the following statuses:
|
Deposit application |
The combined package of the deposit UI and the deposit server. |
Deposit control settings |
Settings that define the amount of content that Producer Agents can deposit and the amount of Producer Agent content that must be reviewed by the staff. |
Deposit Manager |
Staff users responsible for configuring generic settings for Producers (who provide content to the Rosetta system). |
Deposit server |
The server that receives the submitted deposit, acquires the submission content, transforms it, and wraps it into standard METS format before forwarding the matching SIP XML record to the staging server. |
Deposit UI |
The Rosetta Web interface for submitting digital material to the repository. The client can replace the deposit Web interface component with a substitute interface, or bypass the component altogether and deposit directly into the deposit server using the Delivery APIs. |
Deposit View |
A deposit application UI instance with its particular look and branding, accessed through its own URL. A deposit interface is owned by one institution. |
Descriptor file |
A comma-separated values (CSV) file that holds information on the files, representations, and IEs used to create the new representation(s). |
DNX |
DNX metadata is implemented as an XML record containing an unlimited list of sections (or “records”), where each record contains an unlimited number of attributes (or “properties”). Neither the record names nor the attribute names are limited by definition, so in theory they can hold any information. In practice, each DNX profile limits this flexibility and enables configurable validation, read-only sections, and default values. The motivation for defining another metadata type called DNX comes from the need to collapse and aggregate administrative and technical metadata under one roof, where management, development, and viewing/editing are much simpler. |
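The sections/records/properties shape described above can be sketched with Python's standard `xml.etree.ElementTree`. The element names (`dnx`, `section`, `record`, `key`) and the sample section and property names are assumptions made for illustration; this is not the DNX schema.

```python
import xml.etree.ElementTree as ET

def build_record(sections):
    """Build an XML tree from {section_id: [ {property: value}, ... ]}.

    Element and attribute names here are assumed for illustration only.
    """
    root = ET.Element("dnx")
    for section_id, records in sections.items():
        section = ET.SubElement(root, "section", id=section_id)
        for properties in records:
            record = ET.SubElement(section, "record")
            for key, value in properties.items():
                ET.SubElement(record, "key", id=key).text = value
    return root

# Hypothetical section and property names.
doc = build_record({"fileInfo": [{"label": "page 1", "sizeBytes": "2048"}]})
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Nothing in this structure restricts which section or property names may appear, which is the flexibility that a profile would then constrain.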
DPS SDK |
A set of programming tools (Software Development Kit) that allows programmers to develop specialized computer applications and adapt them to the Digital Preservation System (DPS). |
DROID |
A tool used by Rosetta for format identification. For more information, see http://droid.sourceforge.net/. |
Editor |
The pre-defined role that grants a staff user privileges to interact with content in the repository. Three versions of the Editor role (View, Standard, and Full) determine the actions that a staff user may take with regard to sets and set members. Editors with sufficient privileges, for instance, can add new thumbnails for the images deposited by Producer Agents, edit the descriptive metadata provided by Producer Agents, and add new metadata. |
Exchange package |
The set of data used by the system for exporting and importing the files in the Add Representation environment. The exchange package contains the files themselves and a descriptor file with information about the IEs, representations, and files needed to create the new representation(s). |
External user management |
A model in which user information is managed through a third-party identity and access management system (IAM) or directory server. This data is used by the DPS. |
File |
A named and ordered sequence of bytes that an operating system can recognize. One or more files comprise a Representation. |
Format Library |
The Format Library contains all of the data regarding formats: name, description, related applications, related risks, and sustainability factors. Some of the information is managed globally (exposed to all installations of Rosetta) and some of it can be managed locally (stored within the local DB of the institution). In the local library, data can be added but not removed. |
Gallery View |
A SIP content view available to Assessors and Arrangers in which IE thumbnails display in addition to standard information. |
Generic Producer profile |
A Producer profile that defines material flows and deposit control settings that are assigned to all Producers (unless personalized settings were configured). Generic profiles are created by a Deposit Manager, and are assigned automatically by the Rosetta system when a Producer registers. |
Group Producer |
A Producer that is represented by an organization. |
Human stage |
A stage performed by a human agent (for example, Technical Analyst, Assessor, Approver) who uses the application’s UI to manipulate the processed SIP. On entering a human stage, the application will forward the SIP’s information to the appropriate agents and will mark the SIP as waiting for a response. The automatic processing continues once the appropriate human agent is finished working on the SIP. |
Individual Producer |
A Producer that is represented by an individual. |
Installation |
The first level of the consortium hierarchy, which encompasses all other levels. |
Institutions |
Independent work environments consisting of one or more administrative environments. More than one institution can comprise a system or consortium hierarchy. |
Intellectual Entity (IE) |
A set of files that is considered a unit (for example, scanned pages of a book or a photograph). An IE is stored in a METS XML in the permanent repository. Alternatively, IEs can be structural IEs, which represent complex hierarchical objects with relationships to other IEs. Structural IEs do not have any representations or files. |
Internal Producer |
Internal Producers are Trusted Producers (for use by staff) who can submit filestreams and/or descriptive metadata directly through the institution’s Network File System (NFS). |
Internal user management |
A model in which user information is managed wholly within the DPS. |
Itemized set |
A group of objects whose set members are determined at the time the set is saved. There is no stored relation between the set members or the query from which they were derived. |
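The difference between a set whose members are fixed when it is saved and one that is derived from a query can be sketched as below. The sketch assumes that a logical set keeps its query and re-evaluates it on demand; the class names, the `repository` list, and the query are all hypothetical.

```python
# Hypothetical in-memory stand-in for repository objects.
repository = [
    {"id": "IE1", "format": "tiff"},
    {"id": "IE2", "format": "pdf"},
]

def query_tiff(items):
    """Example query: IDs of all objects in TIFF format."""
    return [item["id"] for item in items if item["format"] == "tiff"]

class LogicalSet:
    """Assumed behavior: stores the query and evaluates it when asked."""
    def __init__(self, query):
        self.query = query

    def members(self, items):
        return self.query(items)

class ItemizedSet:
    """Stores only the member IDs found at save time, not the query."""
    def __init__(self, member_ids):
        self._members = tuple(member_ids)

    def members(self):
        return list(self._members)

logical = LogicalSet(query_tiff)
itemized = ItemizedSet(logical.members(repository))  # snapshot taken now

# A new matching object arrives after the itemized set was saved.
repository.append({"id": "IE3", "format": "tiff"})

print(logical.members(repository))  # reflects the new object
print(itemized.members())           # unchanged snapshot
```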
JBPM |
Java Business Processes Manager: a workflow and BPM engine that enables the creation and management of business processes that execute the stage routines and keep track of the SIP during its processing. |
Material flow |
Settings that define how Producer Agents can deposit content. Material flows are configured by Deposit Managers and Negotiators. A material flow can be associated with multiple Producer profiles. Similarly, multiple material flows can be associated with the same Producer profile. A material flow consists of: |
Metadata |
Information about the content that Producer Agents deposit. Metadata can contain both descriptive (such as author, title, and creation date) and technical (such as file size and location) information. Descriptive metadata is provided by Producer Agents. Technical metadata is automatically extracted from the content by the Rosetta system. |
Metadata form |
Contains fields that Producer Agents must complete in order to describe the content that they deposit. Metadata forms are configured by Deposit Managers and Negotiators. A metadata form can be associated with multiple Producer profiles. Similarly, multiple metadata forms can be associated with the same Producer profile. |
Metadata management |
A component of the work area that allows users to search, view, and edit metadata records as discrete objects, that is, independent of their association with any intellectual entity, representation, or file in the repository. |
METS |
Metadata Encoding and Transmission Standard (METS) is a common and widely used format. See http://www.loc.gov/standards/mets/. The DPS staging server receives the data encoded in METS format. The deposit application is a DPS application used by Producer Agents to deposit digital materials. It is responsible for acquiring the file(s) and for converting various content structures to METS format. The deposit application is constructed from two main layers: the Deposit API layer and the Deposit Web interface. |
Migration types |
The only type of preservation alternative currently supported in Rosetta. The migration can be internal or external:
|
Negotiator |
Staff users responsible for working with Producers and tailoring the generic deposit configuration of the Rosetta system to the needs of specific Producers. |
OAI-PMH |
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH): A low-barrier mechanism for repository interoperability. Data providers are repositories that expose structured metadata via OAI-PMH. Service providers then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of services that are invoked within HTTP. |
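Because OAI-PMH requests are ordinary HTTP requests with a `verb` argument, a harvesting request can be sketched as a URL builder. `ListRecords` and `metadataPrefix=oai_dc` are part of the OAI-PMH protocol; the base URL below is a placeholder, not a real Rosetta endpoint.

```python
from urllib.parse import urlencode

def build_oai_request(base_url, verb, **arguments):
    """Return the URL for an OAI-PMH request with the given verb and arguments."""
    params = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(params)}"

# Placeholder base URL; a data provider publishes its own.
url = build_oai_request(
    "https://repository.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
)
print(url)
```

A service provider would issue an HTTP GET on such a URL and parse the XML response returned by the data provider.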
OAIS |
An Open Archival Information System (or OAIS) is an archive consisting of an organization of people and systems that has accepted the responsibility of preserving information and making it available for a designated community. OAIS is the ISO reference model for Open Archival Information System. |
OTB |
Out of the box: the configurations that come with installation and determine the default behavior of the system without alterations. |
PDS |
Patron Directory Service: A DPS Web component that facilitates user authentication and login to the DPS. The PDS is part of the standard calling application package, but it is a distinct and separate component. It does not have a user database of its own. Rather, it can be configured to work with an institution’s authentication server(s) and user database(s), such as an LDAP directory service. |
Permanent repository |
The component of the Rosetta system that stores content that was deposited by Producer Agents and approved by staff users. From the permanent repository, the content can be delivered to content consumers through the Web and other channels. |
Personalized Material Flow |
A customized material flow prepared by a Negotiator for a specific Producer. |
Personalized Producer profile |
A Producer profile that includes customized generic material flows, additional material flows, and/or customized deposit control settings. Personalized profiles are created and assigned to Producers by a Negotiator. |
Plug-in |
Additional software that can be integrated into Rosetta and run by different modules. An example is a new type of technical metadata tool that can become part of the validation stack process. |
PREMIS |
Preservation Metadata: Implementation Strategies. An international working group charged with the following:
|
Preservation plan |
The preservation plan is a structured workflow used by the Preservation Analyst (PA) to handle objects that are at risk. The workflow takes the PA through the stages of gathering documentation and general information, creating the preservation set, defining the suggested alternatives, running tests, and summarizing the results. |
Preservation plan alternative |
Each preservation plan should have one or more alternatives to ensure success. For example, the same plan may have two migration utilities that convert the source format to the target format. Each one of these is saved as an alternative, and the workflow allows the user to evaluate each utility and add information that is relevant for the utility being evaluated. |
Preservation plan execution |
After the institution signs off on a plan, the plan can be executed with no need to go through testing and defining the exact material. The plan's execution can be scheduled in advance or launched immediately. |
Preservation set |
A preservation set is a set of intellectual entities (IEs) that is defined in the first stage of preservation planning. It starts as a logical set and becomes an itemized set every time a preservation plan is executed. |
Pre-transformer |
A routine that converts non-standard content structure into standard content structure, such that it can be transformed using the system's standard transformers. It is activated prior to the transformer as part of the SIP submission process. |
Privileges |
The discrete permissions that make up a user role. Privileges correspond to the interfaces and interface functions in the DPS. |
Producers |
An entity (person or organization) in whose name the digital material is submitted to the digital archive. A Producer is associated with one or many Producer Agents. Material flows and deposit parameters such as disk space quota and target group are assigned on the Producer level and apply to all associated Agents. |
Producer Agent |
A user who deposits digital material for the repository. A user may be associated with more than one Producer Agent role (typically staff depositing for the library as well as on behalf of a Producer). |
Producer group |
Every Producer is assigned to a Producer group by a staff user or by the system during the deposit registration process. These groups help manage Producers by allowing members to share aspects of their deposits, metadata, access rights, and other common characteristics. |
Producer Manager |
A user who has access to the Producer Management function (updating Producer public fields and activating/deactivating associated Agents) in the Deposit Module. |
Producer profile |
Governs how the associated Producer Agents can deposit content, and how this content is processed by the Rosetta system. Producer profiles also define the amount of content that Producer Agents can deposit. Producer profiles are configured by staff users. |
Producer type |
Defines how Producers are registered in the Rosetta system and how they deposit content. |
PRONOM |
An online registry of technical information about file formats, created and managed by the UK National Archives. |
Provenance |
The documentation of the chain of events and actions (as well as related agents) that a specific object has undergone in the repository. |
Provenance event |
An action that involves at least one object or agent related to the repository. |
Publishing |
An extensible process that extracts and formats metadata for external uses. |
Recycle Bin |
A UI dedicated to the process of IE deletion from the repository. In this UI, a user with the necessary privileges can permanently delete or restore IEs that have been deleted by other staff users. |
Registered Producers |
Producers who are associated with authenticated Producer Agents, who have access to the full Deposit Module functionality (that is, they can review and track their deposits at any time). Registered Producers are assigned a generic set of material flows and therefore can start depositing material immediately upon self-registration, without any staff intervention. |
Representation |
The set of files, including structural metadata, needed for a complete rendition of an Intellectual Entity (IE). Each IE in a METS XML can contain multiple representations. |
Retention Period |
A period of time or an absolute date after which records stored in Rosetta can be deleted. Retention policies are defined for records that do not need to be stored indefinitely, such as papers required for legal purposes. |
Risk analysis |
A process that runs on the repository and outputs a list of files according to their Risk Identifiers. |
Risk identifiers |
Derived from either a query of the file attributes that put the files at risk (existing technical metadata) or a tool that extracts the technical metadata that describes the problem. Each file format can be related to one or more risk identifiers. |
Role parameter |
A role modifier or limiter that constrains the terms by which role privileges may be executed. For example, a Negotiator may be permitted to manipulate only those Producers in the Trusted Producer group. |
Rosetta system |
The Web-based software application designed to enable effective preservation of, and access to, digital heritage collections. With the Rosetta system, large amounts of digital data, including audio, video, and text content, can be stored and managed. |
Sampling rate |
Defines the amount of Producer Agent content that must be reviewed by staff users. The default sampling rate is defined by Deposit Managers, and the personalized sampling rate is defined by Negotiators. |
Set |
A physical list of objects. Sets can be created in two ways:
|
Set management |
A component of the Rosetta UI that allows authorized staff users to create and administer logical sets and itemized sets. |
Set member |
An IE, representation, or file that belongs to a logical set or itemized set. |
SIP |
A Submission Information Package (SIP) is generated automatically by the Rosetta system when moving deposit activities from the Deposit Server to the Staging Server. SIPs contain information about provenance, location of the submission content, and the content structure. The DPS system defines an XML representation of a SIP (based on METS). The Staging Server receives an XML representation of a SIP from the deposit application and processes it. A SIP consists of at least one of each of the following: |
SIP Items Tracking Table |
A physical table in the Staging Server that contains information about the files that are associated with a single SIP. Each entry in the table points to an entry in the SIP Tracking Table. The Items Tracking Table is used, among other things, for storing information about the processing of the SIP's files. For example, when a human agent processes the SIP, the agent's decision regarding each file (such as Reject, Decline, or Accept) is stored in the appropriate Items Tracking Table entry. |
SIP Processing |
The process undergone by the SIP from the time it is received by the Staging Server until it is moved to the permanent repository. SIP processing resembles an assembly line: the SIP is automatically moved between processing stages according to its processing configuration. During each stage, the SIP is processed according to the stage's predefined processing instructions. The final stage, which indicates that the process has completed successfully, is the move to the permanent repository. |
SIP Processing Configuration |
A set of stages, rules of flow between stages, and processing instructions managed in the JBPM. The processing configuration may vary between different types of SIPs, according to the predefined routing rules. When a SIP enters the Staging Server, the SIP routing officer decides, based on the predefined SIP Routing Rules, which SIP Processing Configuration applies to the SIP. Once the relevant SIP processing configuration has been identified, the SIP is processed according to the instructions found in the relevant configuration. |
SIP Processing Stage |
A logical step in the processing workflow. Each stage is composed of predefined logical processing instructions that are implemented by the stage routine and are executed by the application upon entering the stage. A workflow can be composed of a varying number of stages. |
SIP processing state machine |
The component responsible for determining the SIP's next processing stage and for planting the appropriate processing instructions in the SIP tracking table. Knowing the SIP's current stage and the stage's result, the processing state machine looks at the relevant processing configuration to determine the SIP's next stage and which stage routine should be activated next. Once the next stage has been identified, the processing state machine asks the SIP tracking table manager to update the SIP's entry in the tracking table accordingly. The processing state machine can be thought of as a 'brain' that, when told the SIP's current situation, reads the appropriate processing configuration manual, determines what should be done next with the SIP, and stores the processing instructions for the SIP's next stage in the appropriate place. |
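The lookup described for the processing state machine (current stage plus stage result gives the next stage, which is then recorded in the tracking table) can be sketched as a transition table. The stage names, result values, and the dictionary standing in for the SIP tracking table are hypothetical; Rosetta manages the real configuration in JBPM.

```python
# Hypothetical processing configuration: (current stage, result) -> next stage.
PROCESSING_CONFIGURATION = {
    ("validation", "success"): "assessment",
    ("validation", "failure"): "technical_analysis",
    ("technical_analysis", "success"): "assessment",
    ("assessment", "success"): "move_to_permanent",
}

# Hypothetical stand-in for the SIP tracking table.
tracking_table = {"SIP-1": {"stage": "validation"}}

def advance(sip_id, result):
    """Look up the next stage for a SIP and record it in the tracking table.

    Returns the next stage, or None when the configuration has no
    transition for the current stage and result.
    """
    current = tracking_table[sip_id]["stage"]
    next_stage = PROCESSING_CONFIGURATION.get((current, result))
    if next_stage is not None:
        tracking_table[sip_id]["stage"] = next_stage
    return next_stage
```

Calling `advance("SIP-1", "failure")` on the initial table would move the SIP from `validation` to `technical_analysis` in this sketch.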
SIP processing workflow |
The process that handles the SIP from the point of submission to the staging server to the point that it moves to the permanent repository. The SIP processing workflow includes the following phases:
|
SIP processing workflow configuration |
A specific set of stages, rules of flow between the stages, and processing instructions. Processing configuration may vary between different types of SIPs, according to the predefined routing rules. When a SIP enters the Staging Server, the SIP routing officer decides, based on the predefined SIP routing rules, which SIP processing configuration applies to the SIP. Once the relevant SIP processing configuration has been identified (and linked to the SIP's entry in the SIP tracking table), the SIP is processed according to the instructions found in the relevant configuration. The SIP processing state machine uses the relevant SIP processing configuration as a manual describing how the specific SIP should be handled throughout its processing. |
SIP routing officer |
A component that identifies, based on the predefined SIP routing rules, the SIP processing configuration that applies to a specific SIP. |
SIP Routing Rules |
A set of parameter-based rules that point out the appropriate SIP processing configuration for a combination of SIP attribute values. |
SIP worker |
An application program (thread) that executes the processing instructions associated with the assigned SIP's current stage. |
Skin |
A definition of the look of Rosetta pages. In this document, skin refers to colors, logos, and icons presented in the different Delivery viewers. |
Staff users |
Users who are responsible for managing Producers, Producer Agents, and the content that Producer Agents deposit. The following staff users exist: |
Stage routine |
Implements the processing instructions (internal or task chain) of a stage within a processing configuration. When a SIP enters a processing stage and is selected for processing by a SIP worker, the SIP worker executes the appropriate stage routine according to the instructions it receives from the SIP tracking table manager. |
Operational module |
The component of the Rosetta system that stores content submitted by Producer Agents. In the Operational module, staff users review the content and decide whether to approve it for permanent storage, return it to the Producer Agent, or decline it. |
Statistics event |
An event which is a calculated aggregate of events over a period of time, used for determining event measures (such as average or number). |
Structural map |
Deposited METS can contain structural maps for the entire IE, a group of representations, or a single representation. For the entire IE: a structural map in the deposited METS that does not reference the ID of any of the METS file groups. It is linked to the IE, preserved, and can be exported. However, delivery and other operations may not be available for this type of structural map. For a single representation: a structural map in the deposited METS that shares an ID with a single fileGrp. For multiple representations (that is, a shared structural map): a structural map in the deposited METS that references the IDs of several file groups within the METS. The syntax for multiple references is: ID=”fileGrp_1 Id; fileGrp_2 Id; …” In the case of a shared structural map, the Staging Server creates a copy of the structural map for each of the relevant representations when the SIP is loaded to the Staging Server (since in the repository, every representation is self-contained and contains its own structural map). |
Submission content |
The files and metadata prepared by a Producer for submission. Submission content should follow valid DPS content structure. |
Submission format |
Settings that govern how Producer Agents upload files and what limitations are applied to these files. Submission formats are configured by Deposit Managers (when a generic submission format must be created) and Negotiators (when a personalized submission format must be created). A submission format can be associated with multiple Producer profiles. Similarly, multiple submission formats can be associated with the same Producer profile. |
Submission Information Package |
See SIP. |
System Administrator |
An administration user responsible for configuring both the server on which the Rosetta system is installed and the Oracle database that Rosetta uses to store Rosetta-related data. |
System stage |
A stage performed automatically by the system, without receiving human input. The Stage routine implementing a system stage can either contain the code instructions of the stage or point to the task chain that contains the relevant code instructions. |
Task |
A program that performs an operation on an object within the Core Repository. |
Task chain |
An ordered list of tasks. |
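A task chain can be pictured as a list of functions applied to an object in order, each task receiving the result of the previous one. This is a minimal sketch under that assumption; the two tasks and the dictionary used as the object are hypothetical and are not Rosetta tasks.

```python
def identify_format(obj):
    """Hypothetical task: record the lower-cased file extension as the format."""
    name = obj["name"]
    extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    return {**obj, "format": extension}

def measure_size(obj):
    """Hypothetical task: record the content length in bytes."""
    return {**obj, "size": len(obj["content"])}

def run_task_chain(tasks, obj):
    """Run each task in list order, passing the object along the chain."""
    for task in tasks:
        obj = task(obj)
    return obj

result = run_task_chain(
    [identify_format, measure_size],
    {"name": "page1.TIF", "content": b"abc"},
)
print(result)
```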
Technical Analyst |
Staff users responsible for handling technical problems (such as corrupted files or files infected by a virus) that may occur with files that Producer Agents deposit. |
Transformer |
A program that converts standard content structure into a SIP METS format. |
Tree view |
A SIP Contents view available to Arrangers, in which the original tree structure of the files in the SIP displays in addition to the regular information. |
Trusted Producers |
Trusted Producers are Registered Producers who have negotiated access to personalized material flows. The level of personalization can go from the definition of default values for selected form fields to the automation of the deposit process by allowing submission of both file streams and associated descriptive metadata as files on an FTP server. |
User |
An individual or organization that interacts in some way with the system. A user may be a staff member or a patron who logs in to a module and uses the system, or a user may be an organization in a more general sense. A user of the system can be assigned various roles such as Negotiator, Approver, and/or Technical Analyst. Some roles are limited to staff users, whereas others, such as Producer Agent, can be assigned to both staff users and patrons. |
User role |
A named group of privileges that a user is authorized to perform. Roles are based on expected workflows and job responsibilities in the DPS. They are fixed and not editable by the library. |
User role profile |
Comprises both a role and the role's associated parameters. Although role privileges are fixed, parameters will vary depending on the user. One user linked to the Negotiator - Full role may be assigned the parameter of Trusted Producer Group only, while a second Negotiator - Full may be assigned the parameter of All Producer Groups. |
Viewer |
An extensible component that handles the viewing of content. |
Viewer pre-processor |
An extensible component that is activated based on the delivery rules to prepare an object for viewing and redirect the user to the relevant viewer. |