
    Central Discovery Index (CDI) - Test Plan and identified issues log

    Created By: Stacey van Groll
    Created on: 4/18/2020



    Goal

    Support the community by sharing details of areas I'm testing and issues identified, while many of us are in Enablement phase for the new Central Discovery Index (CDI)

    Disclaimer: My testing is ongoing, this list is not all-inclusive, and it may become out of date as issues are resolved (hopefully soon)

    See also Lesli Moore's plan here: Testing the Central Discovery Index (CDI)

     

    Notes on reporting issues

    • Ex Libris will not prioritise issues which only one or two sites report
    • If you find the same issue in your environment, make sure to submit your own SalesForce case
    • All UQ cases are Published, and include "CDI" in both the Title and Description
    • Please do the same, so that your cases can be easily found, to support community collaboration
    • For further filtering, most UQ cases are submitted from 12.3.2020, but I am also following up earlier cases when they were stalled for resolution on PCI, but advised as being fixed on CDI (such as Initial articles for Sort and Starts with)
    • You may also wish to submit feedback on the number of issues and mandatory Switchover dates: cdi_info@exlibrisgroup.com

     

    Primo Area

    Issues

    Case?

    Facets

     

     

    Top Level - Available online

     

     

    Top Level - Peer-Reviewed

    Facet inclusion returns records which aren't PR

    Facet includes records which do not have brief results icon

    PNX data unexpected code

    Non-journal articles included eg Books

    Yes

    Top Level - Open Access

    Facet inclusion returns records which aren't OA

    Facet includes records which do not have brief results icon

    Open Access results are marked as 'No full-text' in filtered search

    OA in Alma not marked as OA in Primo

    Yes

    Top Level facets

    3 of 6 facets containing local records are returning a server error message instead of zero results

    Yes

    Author/Creator

    No entries for CDI records, even though data in PNX

    Yes

    Subjects

    Including active filter returns results which do not contain the Subject facet chosen (nor in search or display PNX)

    Entries are appearing in the Subject facet which are not subjects, nor anywhere in the PNX ie Multi-user or DDA as a facet term

    Yes

    Content Type

    Including entries does not exclude all others (documentation is wrong for types which are meant to match and merge ie based on metadata only), and results do not match type (eg limit to Articles and Videos are still in results)

    Artificially limited to only display 20 types

    Code table mapping missing eg web_resources

    When activating some facets like Articles and then ticking the box for expanded search, Zero results message displays instead of more results than in filtered search

    Yes

    Publication Date

    Using the Publication Date facet in Brief Results causes it to disappear, so that more granularity cannot be added eg selecting 2010-2020, then limiting further to 2015-2020, and instead the user has to start again

    Yes

    New records

    Facet appears but no data in PNX to facilitate verification and troubleshooting (causing Saved Search Alerts zero results?)

    Yes

    Language

     

     

    Journal Title

    Missing entries in top 2,000 results (for PCI too)

    Book titles are found commonly in the Journal Titles facet, for both Books and Book Chapters (ie data is incorrect in PNX as facet_jtitle) - case updated to add that this is occurring for more than just Books ie Web Resources information from the Is Part Of is in the Journal Titles facet as well

    Yes

    Collection

    Many missing entries, so very difficult to determine source of content, even when the data is visible in PNX

    Many multiple duplicate entries are sub-collections

    Facet entry disappears from UI as a selection option (despite still being in the facet PNX) when the collection is moved from filtered to expanded search by selecting 'Do not show as Full Text in CDI even if active in Alma' (and sometimes the scope DBID disappears as well)

    Yes

    Sort

     

     

    Initial Articles

    Entries still contain initial articles (for PCI too, but meant to be fixed on CDI)

    Yes

    Feedback Messages

     

     

    Illegal characters eg empty parentheses

    No results message and yet results are displayed

    Yes

    Too many wildcards

    No results message and yet results are displayed

    Yes

    Too many boolean

    Expanded results and excluded term message displays but results appear inconsistent for blended search

    Yes

    Too many clauses eg terror*

    Partial results message displays but results appear inconsistent for blended search

    Message sometimes does not display for CDI when it displays for PCI

    Yes

    Expanded results due to few or no local records

     

     

    Expanded results due to too long query

    Feedback message missing on both PCI and CDI, and instead displays illegal characters message

    Yes

    Zero results

     

     

    Search

     

     

    Wildcard search with just *

    Returns No results message, as well as hundreds of millions of results with no Availability status in brief results and no View It or Get It in full record display (nb Saved Search Alert with this query can be set, but sends no email - no case yet)

    Yes

    Broad topic keyword search

     

     

    Narrow topic keyword search

     

     

    Known item search

    Other entries, such as relevance ranking unexpected behaviour, cover this (eg Databases dropping from No.1 to No.2, key textbooks with multiple editions with significant shifting of position). Does not seem necessarily bad, but documentation is needed on specifically how the ranking has changed, so that local blending can be considered for adjustment eg increase it a little

    Yes

    Boolean

    Searching with boolean does not prevent expansion, for AND or NOT, so a precise search cannot be done

    Yes

    Wildcards

    Searching with wildcard and quotation marks does not return expected results eg "social network*" only returns "social network" and not "social networks", "social networking", etc

     

    Quotation Marks

    Searching with quotation marks does not prevent expansion, so a precise search cannot be done

    Yes

    Parentheses

     

     

    Characters: & ampersand

    & ampersand and 'and' are not interchangeable, and & returns completely irrelevant results (fixed in PCI in February 2018, so why not covered already in CDI?)

    Yes

    Characters: - hyphens

    In PCI ISSNs can be searched both with and without hyphens, but the data is not comprehensively indexed for both forms in CDI, so very few results are retrieved

    Yes

    Unique identifiers

    Search by DOI or PMID returns more than a dozen results which are mostly completely irrelevant, and the actual result is sometimes No.7 or No.11, as opposed to 1 or 2 results on PCI. Example of unnecessarily broad and constant expansion

    Yes

    Collection prepend

    Not working for search in Primo even though in same field of PNX as PCI eg gale* - FIXED IN MAY 2020

    Yes

    rsrctype

    Not working for search in Primo even though in search PNX eg conference_proceeding - PARTIALLY FIXED IN MAY 2020, but broken for at least three types (Newspaper Articles, Articles, and Other), and also implemented as per the pluralized facet terms, rather than the singular search terms

    Yes

    DBID

    Not consistently searchable in Primo, with some codes missing, or not conducive to search eg full stop in code

    DBID in Primo does not match DBID for unique collection in Alma

    No DBID in record at all ie empty scope field, but record id exists clearly from a collection by PNX and permalink

    Some DBIDs are words, such as DRUMS or PROOF, which could skew real user search results (no case for this yet)

    Yes

    Search PNX

    Results returned when the data does not exist in Search PNX

    Ex Libris advises that search PNX is now legacy and irrelevant, but unknown exactly what this means (and hugely concerning given we rely on this data to understand ranking and adjust local configurations as we see fit)

    Yes

    Data mapping

    Different mapping of key data to title and addtitle, impacting ranking with many articles of a journal ranked higher than journal itself by search on journal title

    Yes

    Relevance of results

    Clear variation with ranking but poor documentation on why

    Yes

    Expansion

    Significantly more expansion, returning completely irrelevant or unnecessary results eg dog returns dogs, even if there are already many millions of results

    Yes

    Creatorcontrib

    Support for creatorcontrib field unknown, with impact on blended search and search engine configuration boosting

    Yes

    Blending

    Search Engine Configuration for local search engine unexpected ie 1 local result in first page rather than 3 or 4

    Yes

    Advanced Search

     

     

    Prefilters

    Some records contain rsrctype but not prefilter (no case in yet, but appears to be an issue for Market Research and Standards especially)

     

    Title

    Title search returns results with no indication of search terms in Title

    Starts with search returns no CDI results, only local

    Yes

    Subjects

    Subjects contains query returns results where the search query is not contained in the Subject field

    Yes

    Author/Creator

    Author query returns results where the search query is not contained in the Author field

    Yes

    Search Features

     

     

    Controlled vocabulary

    Suggestions do not appear for CDI which do appear for PCI

    Suggestion block does not disappear after being clicked, and clicking the link again does nothing

    Yes

    Did you mean

     

     

    Personalisation

    No documentation of expected changes (if any), but results definitely seem significantly different, even with activation issues for enablement

    Yes

    Autosuggest

     

     

    Additional Features

     

     

    Citation Trail

    Zero results (due to dropping the CDI search element from URL)

    Yes

    Times Cited

     

     

    bX Recommender

     

     

    Services page

     

     

    Contextual Relationships

    Nb low priority as not active in Prod due to existing dealbreakers

    Tested briefly and immediately got API call failure in console

    Yes

    Linking and Availability

     

     

    Filtered search

    Results appear which are 'No full-text' in filtered search, with 70 of 130 Open Access Link in Record collection having this problem in Page 1 or 2 of results (which should obviously never be the case if OA, and the linktorsrc is visible in the PNX, but is not displayed in UI)

    Search by Record ID or entering Primo via Permalink for 'No full-text' records causes display of record without ticking the expansion checkbox, rather than showing Zero Results message (implying an incorrect availability assignment which is not the case)

    Yes

    Expanded search

    Records appear which do not exist in our Alma IZ, or CZ as far as I can tell, and appear to be from other institution local collections, such as Library Guides for other unis (which we do NOT want!)

    Yes

    Link in Record

    Link in Record preference hides all other access options, as a valuable failsafe when the top link does not work as expected (this is NOT a better user experience than 'See all'!)

    Link in Record preference has high likelihood of not being Open Access, removing support for many users

    GES Links are not displayed for Link in Record collections, which includes our local Unpaywall link to support Open Access and our help link for eBooks to assist users with different platform navigation

    Link in Record preference gives library no option to add information for users such as Authentication Notes and issues with the source platform

    Link in Record preference removes our local autonomy to rank services by Online Services Order

    Link in Record unique collections are behaving as Link Resolver and presenting 'No full text available' in View It

    Yes

    Link Resolver

    Many Available online records in filtered search, but 'No full text available' in View It of full record display

    Inconsistent assignment of availability status, with some records which failed to merge 'Available online' and others 'No full-text'

    Yes

    No holdings Get It in expanded

     

     

    Proxying

     

     

    Brief Results

     

     

    Snippets

    Snippet display is irrelevant, and has no term highlighting

    Yes

    Thumbnails

     

     

    Navigation

    Unable to access all results, with Load More Results unexpectedly appearing after No.72, with second attempt stopping at No.82, and third attempt stopping at No.92, and pagination widget attempt to move to Page 10 causing complete failure with error message

    Yes

    Send To Actions

     

     

    EndNote RIS / Web

    Addata Publisher PB is duplicated for members of group, with many entries exported to RIS

    Yes

    Citation

     

     

    BibTex

     

     

    Permalink

    Query for shortened hash key stability

    Record ID in permalink dynamically changing, with no visibility of varying Record IDs in merged record (no certainty of stability)

    Request for list of collections with known permalink redirection failures (documented as approx 8%)

    Query on behaviour for unique collections and for non-unique collections ie outcome in Primo if made not active for search

    Yes

    Print

     

     

    Email

     

     

    Full Record

     

     

    Subjects

    Massive lists in no discernible order in full record display eg not alphabetized, and also duplicate entries differing only by capitalization eg Dog and dog and DOG (this is an issue on PCI too, but much greater on CDI as all subjects are displayed in the one merged logical record, rather than only displaying the FRBR preferred record subjects as on PCI)

    Yes

    Source / Collection

    Significant missing data with much data on PCI stripped from CDI even for unique content (also stripped from search and facets, as well as display). As noted in another entry, even more of this data is stripped if a record is moved from filtered to expanded search by 'Do not show as full text' suppression, which gives the impression it is gone completely, but it is just really well hidden. This is a significant loss of autonomy for our ability to remove records from our environment which have poor metadata, especially as this is often because they fail to merge and cause swamping by duplicates. Only option appears to be to report the problem for correction, but this can take months or years, and sometimes is denied, such as if a collection is no longer maintained or if it is deemed to be 'correct'

    Yes

    Title

    Title field truncation to 500 characters

    Yes

    Term Highlighting

    Does not include stop words even with a clear phrase match exact to query

    Yes

    Links

    Links section missing for many (most?) records, even though often the Links are in the PNX (deliberately hidden - why?)

    Yes

    Lateral Links

     

     

    General Electronic Services

     

     

    Unpaywall

     

     

    Talis ISSN/ISBN

     

     

    Requesting

     

     

    Relais Partner / Broker

     

     

    My Favourites

     

     

    Saved Searches

     

     

    Email Alert

    Alerts by * wildcard not received (no specific case)

    Alerts return only Zero results message for CDI results, or only local results when the set contains them (Probably due to other entry for newrecords PNX data missing, as required to match to URL fromdate)

    Saved Search alerts not received until next day, even though new results are present

    Saved Search alerts advise xxx (low number) new records, but the number of records in Primo via the link is yyy (high number)

    Yes

    Saved Items

     

     

    Search History for session

     

     

    PCI to CDI redirection

     

     

    Updates: Release Notes / Outages

     

     

    Ongoing updates and sync

    Release Notes of relevance unknown both during Enablement and after Switchover ie follow PCI notes or follow CKB notes, or both, as these are not aligned

    Changes to PCI collections are not occurring in CDI collections

    PC to Alma mapping file in OLH is not being kept up to date, and is out of alignment with PCI Release Notes and CKB Release Notes

    Changes are ongoing which impact Enablement eg significant work on HathiTrust collections, so correct collection was not activated

    CDI Collection List is wrong for Number of Records when compared to Alma Number of Records, and records found in Primo

    Yes

    Product Materials

    No OLH page is available for 'Product Materials' which contains Root Cause Analysis (RCAs) for outages (of which there was one in April 2020) and Uptime Reports

    Yes

    Delays

    System status alerts not yet being sent for index update delays (advised in case of 70+ hours for update, so no specific case in). Significant issue for testing changes in activation, as several days required to await changes and unknown when this may occur. Testing has shown anywhere from 4 days to over a week with changes not taking effect

    Yes

    Content

     

     

    Control over content

    Records appear in Primo for unique collections which are not active in Alma IZ (how do we remove collections with poor metadata? - we can't unless unique in one collection because it is now only records by rights, not collections :( and can only report)

    Yes

    Discovery of full text

    Record fails to appear in Primo even with active portfolios

    Yes

    Missing content

    Collections active for search in Alma cannot be found in Primo

    Significant content missing eg 'The Australian' has over 400,000 records in PCI and 12,000 in CDI

    The Trove (Australian Theses) collection has over 400,000 Dissertations in PCI, but the collection in CDI is advised as Subscription (incorrect), 1,082 Number of Records (incorrect), and Resource types of magazine articles (incorrect), and I can't even find these in CDI results

    New records in PCI are not found in CDI, for example one collection not updated at all in 2020

    Yes

    Support for PCI

    Ex Libris Support is refusing to fix content, metadata and linking issues with PCI unless High priority or 'urgent', because of transition to CDI, even though this is our Production environment right now and we won't transition for several months

    Yes

    Useless content

    Many records for content which should not be ingested eg many Articles with title of "E-Mail Notification of Your Latest Issue Online". The source vendor may present this content as 'Articles' on their platform, but there should be validation routines to prevent including such content in a citation index, as it is of no value to users

    Yes

    Duplication

    Massive number of duplicate records that have not matched and merged, even though they do on PCI

    Yes

    Miscategorization of types

    7300% increase in Journals type from PCI to CDI, which seems to be because most of them should actually be Articles (by presence of an Is Part Of)

    Yes

    Validity of data

    No indication of frbrtype grouping in PNX, with all entries 5 even if not grouped (so should be 6), so cannot even use this data to identify unique content

    Logical records change by search eg data is x with DBID search and y when found by title search

    Source DBIDs and Scope DBIDs sometimes do not match

    Yes

     

    Alma - CDI for Alma Electronic Collections and PC to CDI Activation Report

    Case?

    CDI Tab Information missing Number of records

    Yes

    DBID not searchable in Repository Search, so can't track back from problematic Primo records to root cause Alma collection, and not visible in CZ unless activated first

    Yes

    Availability (Electronic Collection) and CDI Search Activation facets are missing, and/or display an incorrect count, and/or display different results to the count when selected

    Yes

    2 of 7 documented searchable fields are missing

    Yes

    No changes made to CDI settings in Alma have any effect on CDI enabled Primo, or only take effect after 70+ hours instead of documented maximum 48 hours

    Yes

    CDI Search Activation Status editable option appears for full text activated collections, with ability to change to Not Active, even though this has no functional outcome

    Yes

    No effect of changing CDI settings by CDI Tab, only by Repository Search result buttons

    Yes

    No Publishing Information for CDI in main menu option or individual records for full text of package and searchable collections (bug for electronic titles fixed by editing and saving publishing profile)

    Yes

    No title specific file available for daily automated CDI publishing job, and no granular information in Events SA Analytics either

    Yes

    No access to institutional holdings file, and no documentation of expected currency of the file - advised per SF case, and wrote up a CKC

    Yes

    Job to publish records to CDI manually is undocumented and fails - advised that this will be removed, as it was a mistake that it was made visible

    Yes

    Extended delays when using filters for Electronic Collection repository searches when searching for all collections by * wildcard (30 seconds approx every time) - advised that this is expected and an 'enhancement' to fix

    Yes

    History tab does not record changes made to CDI Tab, or records only incorrect information like a Modification Date being changed which is not even the actual current date of a CDI change

    Yes

    Several CDI fields are not included in exports, including Local Notes and DBID

    Yes

    Local Notes in CDI Tab appears to have a character limit, but unknown size, causing issues in recording setting changes given wording is so verbose eg 'Do not show as full text available in CDI even if active in Alma'. No indication of this limit in UI, and the record will Save without showing an error, but impact is outcome such as half the note will be gone when the record is checked again - cannot replicate this now with Lorem ipsum  

    Search for In CDI = No returns results with In CDI = No and also results without this (including IZ), and In CDI is empty returns several hundred results with seemingly no correlation

    Yes

    Rights of Subscription (Collection) mixed with Linking Type of Link Resolver for more than 1000 collections, which does not match documented behaviour of either, and this mix does not exist on PCI

    Yes

    Rights of Subscription (Link Resolver) mixed with Linking Type of Link in Record for 1 CZ collection, which does not match documented behaviour of either, and this mix does not exist on PCI

    Yes

    Link Resolver Linking Type does not match Database type with no portfolios for assigning availability status and no services for providing links matched to Alma bibliographic content and associated electronic services. Causing 'Available online' availability status records in filtered search, with 'No full text available' in View It when content is unique, as this is assigned at package level for Databases which have no portfolios by design. Also a very big assumption that every single record deemed non-unique will have at least one valid matching service in Alma to provide a service link, unless Link Resolver behaviour has fundamentally changed with CDI

    Yes

    CDI Tab Tooltips do not match actual options and wording is confusing (perhaps originally different settings and the tooltips forgotten when changes made?)

    Yes

    CDI Tab settings documentation is confusing, and does not match Alma settings or outcome behaviour in Primo (eg Searchable and Not Searchable in OLH and Active and Not Active in Alma) - Also Feedback sent 31.3.2020 - documentation updated in May to change to Active and Not Active

    Yes

    Full Text Rights on PCI collections do not match Full Text Rights on the mapped CDI collections in Alma, with outcome such as Subscription (Collection) on PCI swapped to Open Access on Alma, and vice versa, and unknown which is correct. Includes Hathi OA Full Text in Alma as Subscription (Collection)

    Yes

    Full Text Linking on PCI collections does not match Full Text Linking on mapped CDI collections in Alma, and unknown which is correct, especially given defect behaviour in Primo

    Yes

    Unexplained variations between PCI Interface > PC to Alma mapping file > PC to CDI Activation Report > Alma Electronic Collection CDI settings, leading to scenarios like Link Resolver collections marked as special setting of 'Active for full text in CDI only' which is documented as only meant to occur for Link in Record

    Yes

    No option to change settings in bulk, such as to remove transition setting of 'Active for Full Text in CDI only', or to make non-full text collections Active for Search

    Mapping of Subscription (Collection) aka Link in Record documented as being one to one for all of them, but instances where there are one to many

    Yes

    Description fields in Alma Electronic Collections only have a dash, instead of the Description available in PCI interface

    Yes

    DBID is not visible until a CZ collection is made Active for Search in IZ

     

    3 Electronic Collections found with all CDI information bar a Linking Type, and also one of these is an IZ collection (more of these found, and appears to occur when a CZ collection is deleted and therefore made IZ, but the CDI information block is retained even though now meaningless, and cannot be changed as there is no CDI tab anymore. May be two different issues, but still in one case)

    Yes

    Unknown expected outcome for availability status and Get It / View It when Database Link Resolver collection (aka Available online by title portfolios which don't exist for Database) is marked as ' Do not show as Full Text available in CDI even if active in Alma'

    Yes

    Unknown expected outcome for availability status and Get It / View It when Database Link in Record collection (aka Available online for whole collection) is marked as ' Do not show as Full Text available in CDI even if active in Alma'

    Yes

    Records for Database collections lose their UI display of source facet and scope DBID from PNX when moved from filtered to expanded search by 'Do not show as Full Text in CDI even if active in Alma' (initially thought to be removed from search completely, until realising it was the source facet being hidden from UI facet)

    Yes

    Unknown how a Database collection is considered 'active' for full text, as records appear in CDI even if the collection has no Electronic Collection URL and unsuppressed linked bibliographic record ie grey house icon in Alma

    Yes

    Alma Analytics: Poor documentation for extent of expected data in Alma Analytics, so unknown what will be covered or will be possible for reporting

    Yes

    Sandbox: No sign of CDI Enablement in Alma Premium Sandbox several weeks after Enablement advised as complete - documentation updated in May to cover our scenario of a single PC key for Prod and Sandbox, but ongoing queries as OLH still not clear on some aspects

    Yes

    CDI collection in Alma with no DBID

    Yes

    Some collections have two or more DBIDs, and often a whole string - SF explanation that this was due to multiple Summon collections being mapped to one Alma collection

    Yes

    Sometimes the daily record count for CDI does not match the weekly record count for PCI on the same day (and sometimes it does)

    Yes

    'Provider Coverage' field documented to be added in May, but not found in Production

    Yes

    Some CDI information block labels have been changed in May 2020 Release Update, but now they don't match the search indexes (eg 'CDI Search Rights' is still the label for the search index, but the corresponding information block label is 'Search Rights in CDI')

    Yes

    CDI Newspapers by search does not match data in CDI Tabs for Electronic Collections eg 'Yes, Newspapers Search only' returns only 2 instead of more than 250 collections, and many collections under 'Is empty' even though they are Newspapers: No in CDI Tab

    Yes

    Resource Types in CDI Tab of Alma do not match Resource Types in Primo, and appear to be the original Summon types and not the mapped type, such as drawing and painting

    Yes

     

     

    Feedback and suggestions

    • Stacey van Groll

    • Discovery and Access Coordinator

    • University of Queensland

    s.vangroll@library.uq.edu.au



