Archive Decomposer
The Decomposer is the plug-in family of classes and programs that decompose packed or compressed container files of various formats into their inner files.
The decomposition of containers is invoked from two areas of the system:
- Web Deposit – according to the material flow definition, the container (e.g., a ZIP file) is decomposed into its inner files using the plug-in. In this case, the container itself is not ingested; only the inner files are ingested into Rosetta, as though they were the original streams.
- Validation Stack – according to the decomposition rules, a container (e.g., a multi-page TIFF) is decomposed into bytestreams, which are retained only in the operational DB. These inner files, the bytestreams, also pass through the Validation Stack (that is, identification, technical metadata extraction, and risk extraction) and participate in the risk report.
Plug-in Parameters
The Archive Decomposer requires two parameters:
- Full file name (including full path) – full name of the container input file
- Directory name – the directory into which the inner files are extracted
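The two-parameter contract above can be illustrated with a minimal sketch. This is not Rosetta's own plug-in code; the function name `decompose` and the choice of Python are assumptions made only for illustration, and the sketch handles ZIP containers using the standard `zipfile` module.

```python
import zipfile
from pathlib import Path
from typing import List


def decompose(container_path: str, target_dir: str) -> List[str]:
    """Extract the inner files of a ZIP container into target_dir.

    container_path: full file name (including full path) of the container.
    target_dir:     directory into which the inner files are extracted.
    Returns the names of the extracted inner files (hypothetical helper,
    not part of Rosetta).
    """
    target = Path(target_dir)
    # Create the output directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(container_path) as archive:
        archive.extractall(target)
        # Directory entries end with "/"; report only the inner files.
        return [name for name in archive.namelist() if not name.endswith("/")]
```

A wrapper script would pass its first argument as `container_path` and its second as `target_dir`, mirroring the order of the parameters listed above.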
Usage
Once installed, the Archive Decomposer can be used as the decomposing tool in the following decomposition rule setups.
Decompose at the Time of Loading
Rosetta can decompose a compound file while loading it onto the deposit server, so that only the inner files are loaded and the original compound file is not. The tool used for decomposing the compound file is one of the installed Archive Decomposer plug-ins, which can be selected in the Automatic Decomposition Rules rule editor (Administration > Deposit > Automatic Decomposition Rules).
Decomposition Rule
ByteStream Extraction
The Archive Decomposer plug-in is also used for bytestream extraction. This mechanism allows Rosetta to extract and store the technical metadata (needed for preservation) of each inner file. The tool used for decomposing the compound file is one of the installed Archive Decomposer plug-ins, which can be selected in the Rules for Bytestream MD Extraction rule editor.
Bytestream Metadata Extraction Rule - Details
Implementations
Rosetta includes three implementations of the Archive Decomposer plug-in:
- decomposerArcFile – A script plug-in to decompose ARC files
- Unzip – A script plug-in to decompose ZIP files
- Unzip With Encoding – A script plug-in to decompose ZIP files that require special encoding for the inner files
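To show why a ZIP file may need a special encoding for its inner files, here is a hedged sketch; it is not the implementation of the Unzip With Encoding plug-in. It assumes Python's standard `zipfile` module, which decodes inner file names as CP437 unless the entry carries the UTF-8 flag. The names `inner_name` and `unzip_with_encoding` are hypothetical, and the encoding is supplied by the caller.

```python
import zipfile
from pathlib import Path
from typing import List

UTF8_FLAG = 0x800  # general-purpose bit 11: the name is stored as UTF-8


def inner_name(info: zipfile.ZipInfo, encoding: str) -> str:
    """Return the inner file name decoded with the caller's encoding."""
    if info.flag_bits & UTF8_FLAG:
        return info.filename  # already decoded correctly as UTF-8
    # zipfile decoded the raw bytes as CP437; undo that and decode again.
    return info.filename.encode("cp437").decode(encoding)


def unzip_with_encoding(container_path: str, target_dir: str,
                        encoding: str) -> List[str]:
    """Extract a ZIP container, decoding inner names with `encoding`."""
    target = Path(target_dir).resolve()
    extracted = []
    with zipfile.ZipFile(container_path) as archive:
        for info in archive.infolist():
            name = inner_name(info, encoding)
            destination = (target / name).resolve()
            # Refuse entries that would land outside the target directory.
            if destination != target and target not in destination.parents:
                raise ValueError("unsafe path in archive: " + name)
            if info.is_dir():
                destination.mkdir(parents=True, exist_ok=True)
                continue
            destination.parent.mkdir(parents=True, exist_ok=True)
            destination.write_bytes(archive.read(info))
            extracted.append(name)
    return extracted
```

For example, an archive created on a system that stored names in CP866 could be read with `unzip_with_encoding(path, out_dir, "cp866")`. On Python 3.11 and later, `zipfile.ZipFile` also accepts a `metadata_encoding` argument for reading, which serves a similar purpose.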
Decomposer Plug-in Management