[DCRM-L] OCLC de-duping algorithms and dates of publication

Ann W. Copeland auc1 at psu.edu
Wed Nov 3 11:56:01 MDT 2010


Jackie,

We did discuss this at what used to be called MASC last winter. Here is 
the relevant passage from the minutes:

*A). OCLC issues.*

/Given the new functionality available in OCLC to improve records, are 
catalogers working differently -- for example, routinely adding 
genre/form terms to master records?/
  Some participants said they search the OCLC database for a suitable 
record to enhance using DCRM(B) cataloging rules and/or they add genre 
terms and notes to AACR2 records; others said they upgrade their 
records only in their local database. The concern that other catalogers 
could delete the information in enhanced records in OCLC was mentioned, 
as was the belief that public services librarians would prefer less 
elaborate records.

  Annie Copeland reported that on behalf of the RBMS Bibliographic 
Standards Committee she had written to OCLC to inquire about the 
possibility of OCLC allowing duplicate records for the same item, one 
record cataloged according to AACR2 and another according to DCRM. OCLC 
responded that rather than allowing permissible duplicates, they prefer 
having the DCRM record, as the one containing the most information, be 
the master record. OCLC wondered how libraries would react to this 
change.  A show of hands of MASC participants was called for and a large 
majority indicated their preference for the DCRM record being the master 
record.  Some attendees asked to have an OCLC representative at a future 
MASC meeting to discuss master records, duplicate records and 
proliferation of records in the database.

Glenn then issued this, in May I believe:

OCLC’s Duplicate Detection and Resolution software (DDR) does not merge 
records if one of the imprint dates is pre-1800, nor would OCLC staff 
merge records in this situation unless it were absolutely clear that the 
records represented the same item (but we would be willing to work with 
someone who had gone through the effort of working out which were true 
duplicates and which weren’t).

While the matching software used to load records prepared in external 
systems into WorldCat is very similar to that used in DDR, it does not 
include the pre-1800 exclusion. We could consider some more complex 
exclusions that would be based on the 040 $e coding (e.g., exclude all 
with a ‘dcrb[x]’ code and its predecessor codes) if the rare book 
community felt this would be desirable.
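
To make the two rules above concrete, here is a minimal sketch in Python. It is not OCLC's software: the Record class, the function names, the reading of "pre-1800" as a year below 1800, and the list of 040 $e codes are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed, illustrative set of 040 $e codes for rare-book cataloging rules.
# The actual list would have to be agreed with OCLC.
ASSUMED_RARE_CODES = frozenset({"dcrmb", "dcrb", "bdrb"})


@dataclass
class Record:
    """Hypothetical stand-in for a bibliographic record."""
    year: Optional[int]                # imprint date, None if unknown
    conventions: Tuple[str, ...] = ()  # values of 040 $e


def is_early(record: Record, cutoff: int = 1800) -> bool:
    # Assumption: "pre-1800" means an imprint year strictly below the cutoff.
    return record.year is not None and record.year < cutoff


def ddr_may_merge(a: Record, b: Record) -> bool:
    # DDR as described: no merge if either imprint date is pre-1800.
    return not (is_early(a) or is_early(b))


def has_rare_code(record: Record, codes=ASSUMED_RARE_CODES) -> bool:
    return any(code in codes for code in record.conventions)


def batchload_may_match(a: Record, b: Record, exclude_rare: bool = False) -> bool:
    # Batchload as described has no date exclusion. The optional flag models
    # the proposed exclusion keyed on the 040 $e coding.
    if exclude_rare and (has_rare_code(a) or has_rare_code(b)):
        return False
    return True
```

With these assumptions, a 1795 record paired with an 1850 record is never a DDR merge candidate, but it is still a batchload match candidate unless the 040 $e exclusion is switched on and one of the records carries a listed code.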

It’s certainly true that a WorldCat record can end up with holdings 
attached that represent variations of the item described in the 
bibliographic record. OCLC matching has not always been as restrictive 
as it is now, and catalogers certainly may have chosen “close” master 
records and then made adaptations in their local systems.

The issue of not recording an edition statement based on a reference 
source is a very problematic one. Having an edition statement (even a 
bracketed one) would, I believe, prevent mismatches in both DDR and 
Batchload; having that information in the “first note” (which I assume 
would be a 500, since the 503 is no longer valid) is not the sort of 
thing that is “actionable” from a machine matching perspective.

It would be useful to carry forward this discussion with the rare book 
community. Nobody wants to play “fast and loose” with record merging, 
but, on the other hand, I don’t think people really want a situation 
where there’s no attempt to match at all.

Glenn E. Patton

Director, WorldCat Quality Management

I'm not sure where we want to go with this now.

Thanks, Annie

On 11/3/2010 1:22 PM, Dooley,Jackie wrote:
>
> Big questions about which, IMHO, Bib Standards oughta have 
> discussions. -Jackie
>
> *From:* dcrm-l-bounces at lib.byu.edu [mailto:dcrm-l-bounces at lib.byu.edu] 
> *On Behalf Of *Deborah J. Leslie
> *Sent:* Wednesday, November 03, 2010 7:35 AM
> *To:* DCRM Revision Group List
> *Subject:* Re: [DCRM-L] OCLC de-duping algorithms and dates of publication
>
> Thanks for Annie’s comment. I have mixed feelings about the no 
> de-duping of pre-1801 publications. Would OCLC really give preference 
> to dcrm records if they were to de-dupe? Even over pcc records?
>
> __________________________________________
>
> Deborah J. Leslie, M.A., M.L.S.
>
> RBMS past chair 2010-2011 | Head of Cataloging, Folger Shakespeare Library
>
> 201 East Capitol St., S.E., Washington, D.C. 20003 | 202.675-0369 
> (phone)  202.675-0328 (fax) | djleslie at folger.edu  | www.folger.edu
>
> *From:* dcrm-l-bounces at lib.byu.edu [mailto:dcrm-l-bounces at lib.byu.edu] 
> *On Behalf Of *ANN W. COPELAND
> *Sent:* Tuesday, 02 November, 2010 22:45
> *To:* Erin Blake
> *Cc:* DCRM Revision Group List
> *Subject:* Re: [DCRM-L] OCLC de-duping algorithms and dates of publication
>
> Interestingly, when we asked about permissible duplicates (one DCRM, 
> one AACR2) OCLC said they did NOT want duplicate records. Instead they 
> wanted to merge records with the DCRM record surviving as the master 
> record. So, why exempt pre-1800 books from the de-duping? Why not work 
> the algorithm to favor DCRM?
>
> Thanks, Annie
>