Deduplication Update

Thank you all for your patience this weekend and in the past weeks as we prepared this deduplication. You've helped it run as smoothly as possible. We are back up and running today with a nearly completed deduplication, but the process is ongoing.

Not all records have been fully deduplicated. Any duplicate records you find will likely fall into one of three major categories:

  1. Items in transit, under serial control, etc. - The deduplication process uses the same tool we use to merge records, so it cannot merge records with items in transit, nor can it perform merges that would delete a bibliographic record under serial control or with open orders attached. SirsiDynix will continually retry these failed merges to automate as many as possible as transits are resolved. The remainder will be provided in a list for manual cleanup and future attempts. There are currently fewer than 6,000 matched pairs in this category, though it includes many popular materials.
  2. Near Matches - Records that shared standard numbers but were rejected as a match for other reasons (publisher, date, author, or title discrepancies) account for a large portion of our deduplication results. These need to be examined individually and manually merged if necessary. This cleanup process will likely take us through the end of the year. We will not have an accurate report of these records and their counts until merge retries have completed and SirsiDynix can run some analysis reports. We hope to start this process next week.
  3. Newer Records - Records added after May 17th were not included in deduplication and will not be in our cleanup lists.

SWAN is once again accepting merge requests, though please bear in mind that many of these records will soon be in our own processing queue. Feel free to submit a request when a fast-track is necessary, but know that in many cases we will get to the merge eventually.

In the coming weeks, please report any issues you may find in a ticket. We have a pre-deduped version of our catalog running on a backup server as well as a detailed report of every merge the process performed. While we don't anticipate any major problems, a large data project of this type will no doubt necessitate some cleanup.

Thank you again, and please feel free to direct any questions to our ticketing system.