JHOVE's PDF-hul Module

  • Product: Rosetta
  • Product Version: 5.3
  • Relevant for Installation Type: Local

Question

How is JHOVE's PDF-hul Module utilized by Rosetta?

Answer

PDF files tested directly in JHOVE (v1.9, v1.11, and v1.16) often present no errors.
Rosetta v5.3 includes JHOVE v1.10 with the v1.7 PDF-hul Module.
Ex Libris Development confirms that JHOVE 1.16 can be used with the metadata extraction plugins in Rosetta.
However, technical metadata still cannot be extracted from some PDF files.
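
To compare Rosetta's result with a standalone JHOVE run, a file can be checked from the command line and the report summarized. The following is a minimal sketch, assuming a `jhove` launcher is on the PATH and that the text handler reports `Status:` and `ErrorMessage:` lines. The helper names (`jhove_command`, `summarize_report`, `check_pdf`) and the sample report are illustrative only and are not part of Rosetta or JHOVE.

```python
import subprocess


def jhove_command(pdf_path, launcher="jhove"):
    """Build a JHOVE command line that forces the PDF-hul module."""
    # -m selects the module, -h the output handler (text is assumed here).
    return [launcher, "-m", "PDF-hul", "-h", "text", pdf_path]


def summarize_report(report_text):
    """Collect the Status value and all ErrorMessage values from a text report."""
    status = None
    errors = []
    for raw_line in report_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Status:"):
            status = line[len("Status:"):].strip()
        elif line.startswith("ErrorMessage:"):
            errors.append(line[len("ErrorMessage:"):].strip())
    return status, errors


def check_pdf(pdf_path):
    """Run JHOVE on one file and summarize its report (needs JHOVE installed)."""
    result = subprocess.run(
        jhove_command(pdf_path), capture_output=True, text=True, check=False
    )
    return summarize_report(result.stdout)


# Illustrative report fragment (assumed layout, not captured output).
SAMPLE_REPORT = """\
 RepresentationInformation: example.pdf
  ReportingModule: PDF-hul
  Status: Not well-formed
  ErrorMessage: Improperly constructed page tree
"""
```

Running `check_pdf` against the same file with different JHOVE versions may help show whether an error comes from the older PDF-hul module bundled with Rosetta.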

You may see error messages such as the following in TA Workbench Validation:

1. Invalid object number in cross-reference stream,Failed to retrieve extractor properties.
2. Expected dictionary for font entry in page resource.
3. Improperly constructed page tree,Annotation object is not a dictionary.
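
When reviewing many validation results, it can help to flag the ones that contain any of the messages listed above. This is a rough sketch, assuming the validation text has been exported as plain strings; `KNOWN_MESSAGES` and `matching_messages` are hypothetical names, and the list covers only the messages quoted in this article.

```python
# Messages quoted in this article; extend the tuple as further cases are found.
KNOWN_MESSAGES = (
    "Invalid object number in cross-reference stream",
    "Failed to retrieve extractor properties",
    "Expected dictionary for font entry in page resource",
    "Improperly constructed page tree",
    "Annotation object is not a dictionary",
)


def matching_messages(validation_text):
    """Return the known messages that occur in a validation string.

    Matching is by case-insensitive substring, so combined strings
    (several messages joined by commas) are handled without splitting.
    """
    lowered = validation_text.lower()
    return [msg for msg in KNOWN_MESSAGES if msg.lower() in lowered]
```

For example, `matching_messages("Improperly constructed page tree,Annotation object is not a dictionary.")` returns both of the messages contained in that string, while unrelated text returns an empty list.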

To address these errors in the short term, create a "Format Identification Correction" rule to ignore them.
In the Management module, navigate to Home > Submissions > Rules > Format Identification Correction to establish the rule.

Additional Information

With the exception of the 1.6, 1.9 and 1.11 framework releases, every JHOVE release has seen updates to the PDF-hul Module.
JHOVE v1.16, the most recent release, included PDF-hul v1.8, which fixed two major bugs.
These bugs led to false validation errors relating to invalid page dictionary objects and improperly constructed page trees.
While a number of fixes have improved PDF/A validation, JHOVE has been proven unsuitable for PDF/A validation.
The coverage of PDF versions hasn’t changed since PDF-hul 1.0; for “plain” PDF, JHOVE supports PDF 1.0-1.6.
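
Because coverage stops at PDF 1.6, it can be useful to check a file's declared version before interpreting PDF-hul results. The sketch below reads the `%PDF-M.m` header and tests it against the 1.0-1.6 range stated above. The function names are illustrative, and the check assumes the header version is authoritative (a later `/Version` entry in the document catalog, which can override it, is not examined).

```python
import re

# The PDF header has the form %PDF-M.m near the start of the file.
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(data):
    """Return (major, minor) from the %PDF-M.m header, or None if absent."""
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def within_pdf_hul_coverage(version):
    """True if the version lies in the PDF 1.0-1.6 range covered by PDF-hul."""
    return version is not None and (1, 0) <= version <= (1, 6)


def file_version(path):
    """Read the declared version from the first bytes of a file on disk."""
    with open(path, "rb") as handle:
        return pdf_header_version(handle.read(1024))
```

A file that declares PDF 1.7 or later falls outside this range, so validation messages for it may need extra scrutiny.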

Note that the metadata extraction plugins are part of the Format Library.
For a longer-term solution, consult the Rosetta Format Library Working Group (FLWG) directly about updating the Format Library so that it no longer reports these issues.


  • Article last edited: 15-Dec-2017