Validation Stack Rules
Automating Corrections
Auto-correction rules are created, edited, and implemented by Technical Analysts (TAs) during the validation of SIPs. When the validation process encounters errors, the TA can manually correct the error and assign the same correction to subsequent matching errors. For instance, when a file fails validation because Rosetta cannot match the file’s extension to a known format, the TA can decide that, for example, a .j123 will always be read as a .jpg. This rule can be automated to apply to all occurrences of .j123.
For more information on handling errors and creating rules, see Adding a Rule.
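The extension example above can be pictured as a simple lookup. The following is a minimal sketch of that idea, not Rosetta's implementation. The `EXTENSION_CORRECTIONS` table and the `corrected_extension` function are hypothetical names introduced only for illustration.

```python
from pathlib import PurePath

# Hypothetical correction table: unknown extension -> extension to read it as.
EXTENSION_CORRECTIONS = {".j123": ".jpg"}


def corrected_extension(filename: str) -> str:
    """Return the extension the file should be read as.

    Extensions that have no correction are returned unchanged.
    """
    ext = PurePath(filename).suffix.lower()
    return EXTENSION_CORRECTIONS.get(ext, ext)


print(corrected_extension("scan_0001.j123"))  # .jpg
print(corrected_extension("report.pdf"))      # .pdf
```

Once the mapping exists, every later file with the same extension receives the same correction without further manual work, which is the point of automating the rule.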
The types of validation stack errors that a TA may configure are:
- Format identification auto-correction rules: If the format identification task cannot unambiguously associate a file with a format, and the Technical Analyst can determine the correct format, a rule can be created to auto-assign the file to a specific format.
- Metadata extraction error handling rules: TAs can define rules that ignore errors returned by the metadata extraction task that are deemed irrelevant or non-critical for preservation.
- Format validation error handling rules: TAs can define rules that ignore errors returned by the format validation task that are deemed irrelevant or non-critical for preservation.
- Virus check error handling rules: If the virus check plug-in returns an error message that the TA determines does not indicate a threat, a rule can be created to ignore the error.
Rosetta adds an event with rule parameters to the file when a rule is applied. These events can be searched under the following fields:
- Format Identification Auto Correction Criteria
- Metadata Extraction Error Ignore Criteria
- Format Validation Error Ignore Criteria
- Virus Check Error Ignore Criteria
When a rule is triggered, a provenance event is generated with the rule parameter details.
Configuring Validation Stack Rules
Technical Analysts can configure validation rules from the Management Home page, Submissions menu, under the Rules heading. The types of validation stack rules display in the list. (For information on the types, see the bulleted list in the above section, Automating Corrections.) To add, edit, or delete a rule, click the link that describes the type of error on which the rule is based.
Clicking Format Identification Correction, for example, opens the list of rules related to format identification errors:
List of Rules Page
The following actions can be performed on the List of Auto-Correction Rules page:
Adding a Rule
To add a rule to one of the validation stack error types, Technical Analysts define one or more parameters on the Rule Details page. (To omit one or more of the parameters, leave the operator as Any.)
- Input parameters:
  - Producer name, if matching by Producer
  - Preservation type, to which the preservation type of a problematic file is compared
  - Format name, to which the format of a problematic file is compared
  - File extensions, to which the file extension of a problematic file is compared (add the extension manually if it does not appear in the Format Library list)
  - MIME type, if comparing a file on the basis of its MIME type
  - File size and creation date, for comparison on those values
  - Error ID (for Metadata Extraction errors), for comparison on the identifiers of JHOVE error messages
  - Additional fields specific to the type of rule being added or edited
- Output parameters (one of the following):
  - The file format to apply to the file if its file format and extension match the input parameters
  - The reason for ignoring errors that match the input parameters
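The relationship between input parameters, operators, and output parameters can be sketched as a small data model. This is an illustration under stated assumptions, not Rosetta's internal representation: the class names, the field names, and the `Equals` operator are hypothetical. `Any` and `List Equals` are operator names that appear in this guide, but their semantics here are assumed.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional

# Assumed operator semantics. "Any" means the parameter is not checked.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "Any": lambda actual, expected: True,
    "Equals": lambda actual, expected: actual == expected,      # hypothetical
    "List Equals": lambda actual, expected: actual in expected,
}


@dataclass
class Condition:
    operator: str = "Any"
    value: Any = None

    def matches(self, actual: Any) -> bool:
        return OPERATORS[self.operator](actual, self.value)


@dataclass
class ValidationRule:
    name: str
    conditions: Dict[str, Condition] = field(default_factory=dict)
    output_format: Optional[str] = None  # for format identification rules
    ignore_reason: Optional[str] = None  # for error handling rules

    def matches(self, file_info: Dict[str, Any]) -> bool:
        # Every configured input parameter must match the file.
        return all(
            cond.matches(file_info.get(param))
            for param, cond in self.conditions.items()
        )


rule = ValidationRule(
    name="Treat .j123 as JPEG",
    conditions={
        "producer": Condition("List Equals", ["Producer A"]),
        "extension": Condition("Equals", ".j123"),
        "mime_type": Condition(),  # operator left as Any
    },
    output_format="JPEG",
)
print(rule.matches({"producer": "Producer A", "extension": ".j123"}))  # True
print(rule.matches({"producer": "Producer B", "extension": ".j123"}))  # False
```

Leaving a condition at `Any` has the same effect as omitting it, which mirrors how a parameter is skipped on the Rule Details page.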
To add a validation stack error rule:
- From the Submissions rollover menu, Rules column, click the type of error to which you want your new rule to apply:
  - Format Identification Correction
  - Metadata Extraction Error
  - Virus Check Error
The Rule List page for that error opens.
- Click the Add Rule button.
The Rule Details page opens. Parameters vary slightly based on the rule type.
Rule Details for Format Identification Correction
- Enter a name and description for the rule in the corresponding fields.
- Select an operator and one or more value(s) for the input parameters you want the rule to use. If, for example, you want to narrow the rule to apply only to work deposited on behalf of a particular Producer, then for the Producer parameter, select List Equals and the particular Producer(s) from the right-side list box.
- The format value list uses the format identifiers taken from the format library.
- For detailed information about operators and parameters, see Operators Used in Rule Parameters.
- Click Save.
The Rosetta system saves the new validation stack rule and can use it to identify a specific action to be taken.
Updating a Rule
Technical Analysts can update a rule to modify its input or output parameters.
To update an auto-correction rule:
- On the List of Rules page, locate the auto-correction rule that you want to update and click Update. The Rule Details page opens.
- Modify the fields that you want to update, and then click Save.
The Rosetta system now uses the updated parameters.
Re-Ordering Rules
To determine the rule that must be used for a specific file, the Rosetta system compares the input parameters defined in a rule with the parameters of the file.
Rules are analyzed in the same order as they are displayed on the List of Rules page. The Rosetta system uses the first auto-correction rule that matches the parameters of the file.
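The first-match behavior described above can be sketched as follows. This is a minimal illustration, not Rosetta's code: `OrderedRule`, its `matches` predicate, and the sample rule names are hypothetical.

```python
from typing import Callable, List, NamedTuple, Optional


class OrderedRule(NamedTuple):
    name: str
    matches: Callable[[dict], bool]  # hypothetical predicate over file parameters


def first_matching_rule(rules: List[OrderedRule], file_info: dict) -> Optional[OrderedRule]:
    """Return the first rule, in list order, that matches the file."""
    for rule in rules:
        if rule.matches(file_info):
            return rule
    return None


rules = [
    OrderedRule(
        "Producer A .j123 -> TIFF",
        lambda f: f.get("producer") == "Producer A" and f.get("extension") == ".j123",
    ),
    OrderedRule("Any .j123 -> JPEG", lambda f: f.get("extension") == ".j123"),
]

match = first_matching_rule(rules, {"producer": "Producer A", "extension": ".j123"})
print(match.name)  # Producer A .j123 -> TIFF
```

Both sample rules match the file, so the order decides the outcome. Placing the narrower rule first keeps the broader one from shadowing it.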
Technical Analysts can change the order of rules.
To re-order auto-correction rules:
- On the List of Rules page, in the Set Order column, use the up and down arrows to change the order of the rules.
- Click Save.
The Rosetta system now processes the rules in the newly defined order.
Duplicating a Rule
Technical Analysts can duplicate an existing rule. This is especially helpful when creating a new auto-correction rule, because it is often faster to duplicate an existing rule and modify it than to create a new rule from scratch.
To duplicate a rule:
- On the List of Rules page, locate the auto-correction rule that you want to duplicate, and click Duplicate in its row.
An exact copy of the rule is added to the List of Rules page. The Rosetta system automatically labels the new rule with the name Copy of followed by the name of the original rule.
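The naming convention for duplicated rules can be sketched as follows. The `RuleRecord` class and `duplicate_rule` function are hypothetical and only illustrate the "Copy of" labeling described above.

```python
import copy
from dataclasses import dataclass, field, replace


@dataclass
class RuleRecord:
    name: str
    parameters: dict = field(default_factory=dict)


def duplicate_rule(rule: RuleRecord) -> RuleRecord:
    """Return an independent copy labeled 'Copy of <original name>'."""
    return replace(
        rule,
        name=f"Copy of {rule.name}",
        parameters=copy.deepcopy(rule.parameters),  # edits to the copy leave the original intact
    )


original = RuleRecord("Treat .j123 as JPEG", {"extension": ".j123"})
duplicate = duplicate_rule(original)
print(duplicate.name)  # Copy of Treat .j123 as JPEG
```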
Activating and Deactivating a Rule
Technical Analysts can activate or deactivate a rule. After an auto-correction rule is deactivated, it is no longer available to the Rosetta system for matching.
On the List of Rules page, the status of the rule is indicated by the color of the check mark in the Active column:
- Yellow – The auto-correction rule is active.
- Grey – The auto-correction rule is inactive.
To activate or deactivate an auto-correction rule:
- On the List of Rules page, locate the auto-correction rule you want to activate or deactivate.
- In the Active column, click the check mark.
The page refreshes, and the check mark in the Active column indicates the new status. (The rule is changed from active to inactive, or from inactive to active.)
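The effect of the Active flag on matching can be sketched as a filter applied before rules are evaluated. The names `RuleEntry`, `toggle_active`, and `rules_available_for_matching` are hypothetical, and the code is an illustration rather than Rosetta's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RuleEntry:
    name: str
    active: bool = True


def toggle_active(rule: RuleEntry) -> None:
    """Flip the rule between active and inactive, like clicking the check mark."""
    rule.active = not rule.active


def rules_available_for_matching(rules: List[RuleEntry]) -> List[RuleEntry]:
    """Keep only active rules, preserving their order."""
    return [rule for rule in rules if rule.active]


rules = [RuleEntry("Rule 1"), RuleEntry("Rule 2")]
toggle_active(rules[0])  # deactivate Rule 1
print([rule.name for rule in rules_available_for_matching(rules)])  # ['Rule 2']
```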
Exporting a Rule
Technical Analysts can export rules to share them with other institutions.
To export rules, on the List of Rules page, click Export Rules. Rosetta generates a CSV file that contains the details of all the rules.
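An institution that receives an exported file might inspect it with a short script such as the sketch below. The column layout of the export is not described in this guide, so the code reads whatever header row is present. The `Name` and `Active` columns in the sample text are hypothetical.

```python
import csv
import io
from typing import Dict, List


def load_exported_rules(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into one dictionary per rule, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample; a real export may use different columns.
sample = "Name,Active\nTreat .j123 as JPEG,true\nIgnore JHOVE warning,false\n"
exported = load_exported_rules(sample)
print(len(exported))        # 2
print(exported[0]["Name"])  # Treat .j123 as JPEG
```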
Deleting a Rule
Technical Analysts can delete an existing rule. After a rule is deleted, it is no longer available to the Rosetta system for matching.
To delete a rule:
- On the List of Rules page, locate the rule you want to delete and click Delete. The confirmation page opens.
- Click OK.
The rule is deleted from the Rosetta system.