Technical MD Extractor
The Technical MD Extractor plug-in represents a family of utilities that extract the technical properties of a file (such as size, encoding, and compression). Each extractor handles a specific file format, although generic extractors can be shared.
The extractor plug-in exposes all the properties it can extract, and Rosetta uses this information to extract the property values of a given file. The system saves the property values to the DNX and updates the significantProperties section, which holds standard technical properties (such as those from EXIF and XDP). It also updates the fileTechnicalMetadataExtraction section, which holds information on the metadata extraction run itself (for example, agent and plug-in names and any errors that occurred during the run).
- The extractor must be associated with a format in the format library in order to run.
- The plug-in is executed by the TechMD extraction and Add Representation TechMD extraction tasks when the task parameter Technical MD extraction only is disabled (cleared).
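The flow described above can be sketched in a few lines of Python. This is an illustrative model only, not the Rosetta SDK: the class `SizeAndEncodingExtractor`, the methods `supported_properties` and `extract`, the agent name, and the `ExtractionRecord` structure standing in for the DNX sections are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractionRecord:
    """Hypothetical stand-in for the two DNX sections named in the text."""
    significant_properties: dict = field(default_factory=dict)
    file_technical_metadata_extraction: dict = field(default_factory=dict)


class SizeAndEncodingExtractor:
    """Hypothetical extractor; a real plug-in would target one file format."""
    agent_name = "example-extractor"  # assumed name, for illustration only

    def supported_properties(self):
        # The extractor advertises every property it is able to extract.
        return ["size", "encoding"]

    def extract(self, data: bytes) -> dict:
        # Report "ascii" only if every byte decodes as ASCII.
        try:
            data.decode("ascii")
            encoding = "ascii"
        except UnicodeDecodeError:
            encoding = "unknown"
        return {"size": len(data), "encoding": encoding}


def run_extraction(extractor, data: bytes) -> ExtractionRecord:
    """Run one extractor and record both the values and the run itself."""
    record = ExtractionRecord()
    errors = []
    try:
        values = extractor.extract(data)
    except Exception as exc:  # errors are recorded rather than raised
        values = {}
        errors.append(str(exc))
    # Store only the properties the extractor declared up front.
    for name in extractor.supported_properties():
        if name in values:
            record.significant_properties[name] = values[name]
    record.file_technical_metadata_extraction = {
        "agent": extractor.agent_name,
        "errors": errors,
    }
    return record
```

Calling `run_extraction(SizeAndEncodingExtractor(), b"hello")` fills the property section with the size and encoding of the input and records the agent name with an empty error list in the run section.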
Plug-in Parameters
Depending on the specific implementation, a plug-in may or may not require parameters. If the plug-in does require parameters, they are populated during installation of a new instance of the plug-in.
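As a rough illustration of parameters being supplied when a new instance is installed, the sketch below checks installation-time values against the parameters a plug-in declares. The function `install_instance` and the parameter name `timeout_seconds` are hypothetical and do not come from Rosetta.

```python
def install_instance(required_params, supplied):
    """Return the instance configuration, or raise if a required value is missing.

    required_params: parameter names the plug-in declares (may be empty).
    supplied: values entered at installation time.
    """
    missing = [name for name in required_params if name not in supplied]
    if missing:
        raise ValueError("missing plug-in parameters: " + ", ".join(missing))
    # Keep only the parameters the plug-in actually declares.
    return {name: supplied[name] for name in required_params}


# A plug-in without parameters installs with an empty configuration.
no_params = install_instance([], {})

# A plug-in with one (hypothetical) parameter receives its value at installation.
with_params = install_instance(["timeout_seconds"], {"timeout_seconds": 30})
```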
Usage
An MD extractor is associated with a format at the format level (in the format library). For an MD extractor to be available at the format level, it must be assigned to the same classification group as the format.
Figure: MD Extractor Plug-In Management
After the MD extractor is assigned to a classification group, it is listed in the Related MD extractors folder of that classification group.
Figure: Metadata Extractor Tools
The same list of MD extractors is also available in the MD extractor drop-down at the format level for all formats that belong to the same classification group.
Figure: Metadata Extractor Drop-down Menu
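The availability rule in this section (an extractor is offered for a format only when both share a classification group) can be modeled as a simple lookup. The sketch below is a minimal illustration, assuming each format and each extractor is assigned to exactly one group; the function name and the group labels "Image" and "Audio" are invented for the example.

```python
def available_extractors(format_name, format_groups, extractor_groups):
    """List the extractors assigned to the same classification group as the format."""
    group = format_groups[format_name]
    return sorted(
        name for name, assigned in extractor_groups.items() if assigned == group
    )


# Hypothetical assignments; the group labels are not taken from Rosetta.
format_groups = {"TIFF": "Image", "WAVE": "Audio"}
extractor_groups = {
    "TIFF-hul": "Image",
    "JPEG-hul": "Image",
    "WAVE-hul": "Audio",
}

# Extractors that would appear in the drop-down for the TIFF format.
tiff_choices = available_extractors("TIFF", format_groups, extractor_groups)
```

In this model the drop-down for a format is simply the content of its group's Related MD extractors list, so every format in a group sees the same choices.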
Implementations
Rosetta includes the following implementations of the Technical MD Extractor plug-in:
- JHOVE
  - BYTESTREAM-hul
  - ASCII-hul
  - AIFF-hul
  - HTML-hul
  - JPEG2000-hul
  - PDF-hul
  - JPEG-hul
  - GIF-hul
  - TIFF-hul
  - UTF8-hul
  - WAVE-hul
  - XML-hul
- NLNZ extraction tool
  - nz.govt.natlib.adapter.flac.FlacAdapter
  - nz.govt.natlib.adapter.bmp.BitmapAdapter
  - nz.govt.natlib.adapter.mp3.MP3Adapter
  - nz.govt.natlib.adapter.arc.ArcAdapter
  - nz.govt.natlib.adapter.wav.WaveAdapter
  - nz.govt.natlib.adapter.pdf.PDFAdapter
  - nz.govt.natlib.adapter.jpg.JpgAdapter
  - nz.govt.natlib.adapter.openoffice.OpenOfficeAdapter
  - nz.govt.natlib.adapter.pdfbox.PDFBoxAdapter
  - nz.govt.natlib.adapter.any.DefaultAdapter
  - nz.govt.natlib.adapter.works.DocAdapter
  - nz.govt.natlib.adapter.excel.ExcelAdapter
  - nz.govt.natlib.adapter.gif.GIFAdapter
  - nz.govt.natlib.adapter.html.HTMLAdapter
  - nz.govt.natlib.adapter.powerpoint.PowerPointAdapter
  - nz.govt.natlib.adapter.tiff.TIFFAdapter
  - nz.govt.natlib.adapter.word.WordAdapter
  - nz.govt.natlib.adapter.wordperfect.WPAdapter
  - nz.govt.natlib.adapter.xml.XMLAdapter3
  - nz.govt.natlib.adapter.xml.XMLAdapter