[DCRM-L] OCLC's duplicate detection & resolution software: two questions for the rare and archival materials communities

Haley, Kathleen M. KHaley at mwa.org
Wed Sep 16 12:50:53 MDT 2015


I've consulted with colleagues here at AAS and we would also like to see a cut-off date of 1900 or 1901 (while acknowledging that this may not be a completely practical wish).

Unfortunately, from our perspective, the protected status of dcrb and bdrb records doesn't appear to hold up well when bulk loading is involved. I've been looking at our data in light of the impending demise of Institutional Records (IRs) and found 5,699 cases where one of our records coded dcrb or bdrb is attached to the same OCLC master record as another of our records. For example:

OCLC id 191228408. We have two IRs attached to this record for imprint variants of this almanac, with different citations to the Checklist of American imprints and Drake. Both are coded dcrb. Given that the 019 field in the master record includes the id number 191228409, I suspect that our initial data load correctly created two master records, which were subsequently merged in spite of the dcrb codes.

OCLC id 428751. Here we have 5 IRs for volumes in a children's series, all of which have been linked to a non-dcrb record for volume 1. The distinctive part of the title doesn't appear in our 245 until subfield $p.
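
A check of this kind can be sketched in a few lines of Python. This is only a minimal illustration, not the query we actually ran: it assumes each of our records has already been reduced to a dictionary with a hypothetical local id, the OCLC master record id it is attached to, and the set of description convention codes from 040 $e. The field names and the local ids in the sample are invented for the example.

```python
from collections import defaultdict

# Description convention codes (040 $e) that are supposed to be protected.
PROTECTED_CODES = {"dcrb", "bdrb"}


def find_shared_masters(records):
    """Return {master_id: [local_id, ...]} for every master record that has
    more than one of our records attached, at least one coded dcrb or bdrb."""
    by_master = defaultdict(list)
    for rec in records:
        by_master[rec["master_id"]].append(rec)

    flagged = {}
    for master_id, group in by_master.items():
        has_protected = any(r["conventions"] & PROTECTED_CODES for r in group)
        if len(group) > 1 and has_protected:
            flagged[master_id] = [r["local_id"] for r in group]
    return flagged


# Illustrative input only: the local ids and the last master id are made up.
sample = [
    {"local_id": "a1", "master_id": "191228408", "conventions": {"dcrb"}},
    {"local_id": "a2", "master_id": "191228408", "conventions": {"dcrb"}},
    {"local_id": "b1", "master_id": "208753973", "conventions": set()},
    {"local_id": "b2", "master_id": "208753973", "conventions": set()},
    {"local_id": "c1", "master_id": "000000001", "conventions": {"bdrb"}},
]

print(find_shared_masters(sample))
```

In this sample only the first master record is flagged: the second has two records attached but neither is coded dcrb or bdrb, and the third has a single record.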

Does OCLC have any intention of revisiting these IR/master record match-ups when Institutional Records go away?

We are also concerned about what has happened with roughly 1,800 brief ephemera records which we haven't coded dcrb. For instance, OCLC id 208753973 has 2 IRs attached. These are for two rewards of merit. They are from the same publisher and happen to have the same text, but the images are different. If you look at these two items they are very clearly not the same thing, but the only indication of that in the records is in the notes fields. A later merge date would have protected them.

As a side note, I find out about record mergers only when a record we have previously sent to OCLC gets re-sent and I review the list of record ids which OCLC sends back to us. This being the case, I have a feeling that the figure of 5,699 problematic records understates the extent of the problem.
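
The review described above amounts to comparing the OCLC id we have on file for each record with the id that comes back after a re-send. A minimal sketch, assuming both lists have been loaded into dictionaries keyed by a hypothetical local record id (the function name, the data layout, and the sample local ids are all assumptions made for illustration):

```python
def detect_changed_ids(previous_ids, returned_ids):
    """Return {local_id: (old_oclc_id, new_oclc_id)} for records whose
    returned OCLC id differs from the one previously on file."""
    changed = {}
    for local_id, new_id in returned_ids.items():
        old_id = previous_ids.get(local_id)
        # Records with no earlier id are new loads, not possible merges.
        if old_id is not None and old_id != new_id:
            changed[local_id] = (old_id, new_id)
    return changed


# Illustrative input only: the local ids are made up.
previous_ids = {"a1": "191228408", "a2": "191228409"}
returned_ids = {"a1": "191228408", "a2": "191228408", "z9": "428751"}

print(detect_changed_ids(previous_ids, returned_ids))
```

A changed id is only a hint that a merge may have happened. It says nothing about records that have not been re-sent, which is why the real number of affected records is likely to be higher.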


Kathleen M. Haley
Information Systems Librarian
American Antiquarian Society
185 Salisbury St.
Worcester, MA 01609

phone: (508) 471-2147
e-mail: KHaley at mwa.org
AAS website: American Antiquarian Society


