CDI

Central Discovery Index (CDI)

The Central Discovery Index (CDI) is used in Esploro as the source for Smart Harvesting and for auto-populating asset metadata during a manual deposit. CDI holds billions of records and adds more daily from multiple sources: publishers, aggregators, and repositories of various kinds. CDI is inclusive and harvests records from all subject domains. There are over 30,000 sources.

Some numbers:

  • 750+ million Journal Articles
  • 730+ million Books / eBooks / Book chapters
  • 110+ million Patents
  • 9+ million Datasets
  • 50+ million Conference Proceedings

See CDI Record Summary and Sources for an A-Z list of all the publishers, aggregators, and other content contributors that provide content to CDI.

How often is CDI updated?

CDI harvests records from its sources on an ongoing basis. The harvesting frequency varies by source, from daily to monthly. Some key sources, such as Crossref, are harvested daily. Keep in mind that CDI is indexed twice a week, so even when a source is harvested daily, its records are added to the index only after the next twice-weekly indexing run.
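
To make the gap between harvesting and indexing concrete, here is a minimal Python sketch that estimates when a harvested record could first appear in the index. The indexing weekdays (Monday and Thursday) and the function name next_indexing_date are assumptions made for illustration only; the actual indexing schedule is not stated here. The sketch also assumes that a record harvested on an indexing day is picked up by that day's run.

```python
from datetime import date, timedelta

# Hypothetical indexing weekdays (Monday=0 ... Sunday=6).
# The real CDI indexing days are not documented here; adjust as needed.
ASSUMED_INDEXING_WEEKDAYS = (0, 3)


def next_indexing_date(harvested_on, indexing_weekdays=ASSUMED_INDEXING_WEEKDAYS):
    """Return the first assumed indexing day on or after the harvest date."""
    for offset in range(7):
        candidate = harvested_on + timedelta(days=offset)
        if candidate.weekday() in indexing_weekdays:
            return candidate
    raise ValueError("indexing_weekdays must contain a weekday between 0 and 6")


# A record harvested on a Friday waits for the following Monday run.
harvested = date(2025, 1, 10)
print(next_indexing_date(harvested))
```

Under these assumptions the worst-case wait is a few days, which is why a newly published article may not be available for harvesting into Esploro immediately, even when its source is harvested daily.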

Using CDI records for Esploro Assets

As noted above, CDI gets records from many different data providers. Esploro selects a “preferred” record from all sources Esploro can legally use. The following guidelines are used to select the “preferred” record:

  1. Records from Scopus cannot be used and are filtered out.
  2. Preference is then given to records in the following order:
    1. Records with well-formatted and rich author information. This means that the first name and last name are split, and the author has affiliation, ORCID, or email information. Author information is critical to the author-matching process. Generally speaking, records with good author metadata are good in other areas as well.
    2. Records with DOIs.
    3. Records that have subjects and/or abstracts.
    4. Records from Web of Science.
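
The guidelines above can be sketched in Python as a filter followed by a ranked comparison. This is an illustration only, not Esploro's actual implementation: the CdiRecord and Author classes, their field names, and the source labels are hypothetical. The sketch also assumes the preference order is applied lexicographically (an earlier criterion always outweighs later ones) and that "rich" author information means every author has split names plus at least one of affiliation, ORCID, or email.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Author:  # hypothetical structure, for illustration only
    first_name: str = ""
    last_name: str = ""
    affiliation: str = ""
    orcid: str = ""
    email: str = ""


@dataclass
class CdiRecord:  # hypothetical structure, for illustration only
    source: str
    authors: List[Author] = field(default_factory=list)
    doi: str = ""
    subjects: List[str] = field(default_factory=list)
    abstract: str = ""


def has_rich_authors(record: CdiRecord) -> bool:
    """Assumed test: split names plus affiliation, ORCID, or email for every author."""
    return bool(record.authors) and all(
        a.first_name and a.last_name and (a.affiliation or a.orcid or a.email)
        for a in record.authors
    )


def preference_key(record: CdiRecord) -> tuple:
    """Tuple of criteria in the documented order; True sorts above False."""
    return (
        has_rich_authors(record),
        bool(record.doi),
        bool(record.subjects or record.abstract),
        record.source == "Web of Science",
    )


def select_preferred(records: List[CdiRecord]) -> Optional[CdiRecord]:
    """Drop Scopus records, then return the highest-ranked remaining record."""
    usable = [r for r in records if r.source != "Scopus"]
    if not usable:
        return None
    return max(usable, key=preference_key)
```

Comparing tuples keeps the ordering explicit: a record with rich author information outranks one without it, regardless of DOI, subject, or source, and the later criteria only break ties.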

Record Metadata

With so many sources, the quality of the records in CDI can vary; even records from a single source will vary. Most of the records in CDI have high-quality metadata, but some are missing data or contain errors. This can happen even with records from very trustworthy sources, including publishers.

Note the following known issues:

  • There can be occasional duplicates.
  • Author affiliations are very often missing or messy.
  • Author names are sometimes inverted.
  • Corporate authors are sometimes incorrect.
  • Article numbers are sometimes missing or recorded as issue numbers.

The Esploro team is working on improving the results where possible.

Additional References

  • General Overview of Smart Harvesting Framework
2025 Ex Libris. All rights reserved