How and When Catalog Data Appears in Quicksearch
At 8 a.m. each day, an automated process extracts information from Orbis and Morris for ingest into Quicksearch. From Orbis, the process identifies unsuppressed bibliographic records that have at least one unsuppressed holdings record attached and that were created, updated, or changed from suppressed to unsuppressed in the previous 24 hours. Each of these records is processed to transfer selected data from its holdings record(s) into the bibliographic record. The modified records, along with lists of bibliographic identifiers for records that have been deleted or suppressed, are placed in a staging area.
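The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the actual extract job: the `Bib` and `Holdings` classes, their field names, and the `select_for_export` function are all hypothetical stand-ins for whatever the catalog export really uses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Holdings:
    suppressed: bool


@dataclass
class Bib:
    bib_id: str
    suppressed: bool
    # Hypothetical field: time of the most recent creation, update,
    # or change from suppressed to unsuppressed.
    last_changed: datetime
    holdings: list = field(default_factory=list)


def select_for_export(bibs, run_time, window=timedelta(hours=24)):
    """Return IDs of bibs that meet all three conditions in the text."""
    cutoff = run_time - window
    return [
        b.bib_id
        for b in bibs
        if not b.suppressed                              # bib is unsuppressed
        and b.last_changed >= cutoff                     # changed in the window
        and any(not h.suppressed for h in b.holdings)    # has a visible holding
    ]


run = datetime(2024, 1, 2, 8, 0)
recent = datetime(2024, 1, 1, 15, 0)
sample = [
    Bib("b1", False, recent, [Holdings(False)]),
    Bib("b2", False, recent, [Holdings(True)]),                      # no visible holdings
    Bib("b3", True, recent, [Holdings(False)]),                      # suppressed bib
    Bib("b4", False, datetime(2023, 12, 30, 9, 0), [Holdings(False)]),  # too old
]
selected = select_for_export(sample, run)
```

In this sample only `b1` satisfies every condition, so it is the only identifier selected.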
A second process picks up those files from the staging area at 9 a.m. and ingests them into Quicksearch. Quicksearch’s Books+ is powered by an underlying Solr index. The indexing process extracts data from the MARC records and restructures it for more efficient searching, sorting, and faceting, as well as for display. The same data may be copied to multiple indexes in different forms to support specific functions: for example, non-filing characters are removed from titles in the sort index, and information from various places in the record is interpreted to apply one or more format terms for faceting. When a record is identified as deleted or suppressed in Orbis, all data in the indexes associated with that record is removed.
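As an illustration of the sort-index example, the sketch below derives a sort form from a title. In MARC 21, the second indicator of field 245 records the number of non-filing characters to skip. The `sort_title` function and its lowercasing step are assumptions made for this example; the normalization Quicksearch actually applies may differ.

```python
def sort_title(title: str, second_indicator: str) -> str:
    """Drop leading non-filing characters and normalize for sorting.

    second_indicator is the 245 second indicator as it appears in the
    record: a digit "0" to "9", or a blank when it is not set.
    """
    skip = int(second_indicator) if second_indicator.isdigit() else 0
    return title[skip:].strip().lower()


display_form = "The Great Gatsby"
sort_form = sort_title(display_form, "4")   # skips "The "
unchanged = sort_title("Moby Dick", " ")    # blank indicator, nothing skipped
```

The display index would keep "The Great Gatsby" intact, while the sort index would hold the form that files under G.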
Invalid MARC data in a record may prevent it from being ingested into Quicksearch. The most common errors are empty subfields and empty delimiters. An automated daily audit process identifies these errors, which are remediated on a weekly basis. Monthly audit reports are published to the Quicksearch project blog.
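A check for these two errors might look something like the following sketch. It works on the raw text of a single variable field, in which each subfield begins with the MARC subfield delimiter (hex 1F) followed by a one-character code. The `audit_field` function is hypothetical, and reading "empty delimiter" as a delimiter with no subfield code after it is an assumption; the real audit may define the errors differently.

```python
SUBFIELD_DELIM = "\x1f"  # MARC 21 subfield delimiter character


def audit_field(raw: str) -> list[str]:
    """Report empty subfields and empty delimiters in one raw field."""
    problems = []
    # Text before the first delimiter holds the indicators; skip it.
    for chunk in raw.split(SUBFIELD_DELIM)[1:]:
        if chunk == "":
            # Assumed meaning: a delimiter with no subfield code after it.
            problems.append("empty delimiter")
        elif chunk[1:].strip() == "":
            # A subfield code with no data following it.
            problems.append(f"empty subfield ${chunk[0]}")
    return problems


clean = audit_field("10\x1faTitle\x1fcName")
broken = audit_field("10\x1faTitle\x1fb\x1f\x1fcName")
```

Here `clean` comes back empty, while `broken` reports an empty subfield `$b` followed by an empty delimiter.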
Quicksearch is replicated across three servers. The first server handles the ingest process, while the second and third support public use of the system. The second server copies updates from the first server, and the third does the same from the second. Each server checks for updates every hour, so changes in Orbis may be ingested into the first server at 9 a.m., transfer to the second server at 10 a.m., and appear on the third server at 11 a.m. When a user performs a search in Books+, they are randomly assigned to the second or third server. This means that updates made in Orbis may not display in the public interface until the afternoon of the day after the changes are made.
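The timing can be worked through with a small calculation. The sketch below follows the schedule described above (8 a.m. extract, 9 a.m. ingest, one hour per replication hop) and returns the earliest time a change could reach each server. The `earliest_visibility` function is hypothetical, and it assumes each hourly check happens exactly on the hour, so real propagation may be later than the times it returns.

```python
from datetime import datetime, time, timedelta


def earliest_visibility(changed_at: datetime) -> dict:
    """Best-case times for a catalog change to reach each server."""
    # The change is picked up by the first 8 a.m. extract after it is made.
    extract = datetime.combine(changed_at.date(), time(8, 0))
    if changed_at > extract:
        extract += timedelta(days=1)
    ingest = extract.replace(hour=9)
    return {
        "extract": extract,
        "server1": ingest,                       # ingest server
        "server2": ingest + timedelta(hours=1),  # copies from server 1
        "server3": ingest + timedelta(hours=2),  # copies from server 2
    }


times = earliest_visibility(datetime(2024, 3, 4, 14, 30))
```

For a change made at 2:30 p.m. on March 4, the earliest the third server could show it is 11 a.m. on March 5, which is why a user assigned to that server may not see the update until around midday or later.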
Access information in the search results and in the holdings box on item pages, however, is the result of real-time API calls to the source catalog. The availability of a resource (i.e., whether it is checked out) is always up to date. This may result in brief discrepancies between descriptive information drawn from the Solr index for the item display and the same data pulled from the API for the holdings box: for example, a change to the call number of a resource will appear in the holdings box immediately, but may not be re-indexed for search and display elsewhere in the interface until the following day.