[DCRM-L] OCLC de-duping algorithms and dates of publication

Dooley,Jackie dooleyj at oclc.org
Wed Nov 3 12:11:39 MDT 2010


Ann, thanks for the reminder of the MASC discussion. Given that MASC isn’t a committee and therefore can’t take formal actions, I still think this is fertile territory for Bib Standards. Perhaps an ad hoc subcommittee?

 

I’ll follow up with Glenn to talk about following up with RBMS.

 

Best wishes, Jackie

 

Jackie Dooley

Program Officer

OCLC Research and the RLG Partnership

 

949.492.5060 (office/home) – Pacific Time

949.295.1529 (mobile)

 

From: Ann W. Copeland [mailto:auc1 at psu.edu] 
Sent: Wednesday, November 03, 2010 10:56 AM
To: DCRM Revision Group List
Cc: Dooley,Jackie
Subject: Re: [DCRM-L] OCLC de-duping algorithms and dates of publication

 

Jackie,

We did discuss this at what used to be called MASC last winter. Here is the relevant passage from the minutes:

 A). OCLC issues. 
 
 Given the new functionality available in OCLC to improve records, are catalogers working differently -- for example, routinely adding genre/form terms to master records?
 
 Some participants said they search the OCLC database for a suitable record to enhance using DCRM(B) cataloging rules and/or they add genre terms and notes to AACR2 records; others said they upgrade their records only in their local database. The concern that other catalogers could delete the information in enhanced records in OCLC was mentioned, as was the belief that public services librarians would prefer less elaborate records.
 
 Annie Copeland reported that on behalf of the RBMS Bibliographic Standards Committee she had written to OCLC to inquire about the possibility of OCLC allowing duplicate records for the same item, one record cataloged according to AACR2 and another according to DCRM. OCLC responded that rather than allowing permissible duplicates, they prefer having the DCRM record, as the one containing the most information, be the master record. OCLC wondered how libraries would react to this change.  A show of hands of MASC participants was called for and a large majority indicated their preference for the DCRM record being the master record.  Some attendees asked to have an OCLC representative at a future MASC meeting to discuss master records, duplicate records and proliferation of records in the database.

Glenn then issued this, in May I believe:

 

OCLC’s Duplicate Detection and Resolution software (DDR) does not merge records if one of the imprint dates is pre-1800, nor would OCLC staff merge records in this situation unless it were absolutely clear that the records represented the same item (but we would be willing to work with someone who had gone through the effort of working out which were true duplicates and which weren’t).  

 

While the matching software used to load records prepared in external systems into WorldCat is very similar to that used in DDR, it does not include the pre-1800 exclusion.  We could consider some more complex exclusions that would be based on the 040 $e coding (e.g., exclude all with a ‘dcrb[x]’ code and its predecessor codes) if the rare book community felt this would be desirable.

 

It’s certainly true that a WorldCat record can end up with holdings attached that represent variations of the item described in the bibliographic record.  OCLC matching has not always been as restrictive as it is now, and catalogers certainly may have chosen “close” master records and then made adaptations in their local systems.

 

The issue of not recording an edition statement based on a reference source is a very problematic one.  Having an edition statement (even a bracketed one) would, I believe, prevent mismatches in both DDR and Batchload; having that information in the “first note” (which I assume would be a 500, since the 503 is no longer valid) is not the sort of thing that is “actionable” from a machine matching perspective.

 

It would be useful to carry forward this discussion with the rare book community.  Nobody wants to play “fast and loose” with record merging, but, on the other hand, I don’t think people really want a situation where there’s no attempt to match at all.

 

Glenn E. Patton

Director, WorldCat Quality Management

I'm not sure where we want to go with this now. 

Thanks, Annie

On 11/3/2010 1:22 PM, Dooley,Jackie wrote: 

Big questions about which, IMHO, Bib Standards oughta have discussions. -Jackie

 

From: dcrm-l-bounces at lib.byu.edu [mailto:dcrm-l-bounces at lib.byu.edu] On Behalf Of Deborah J. Leslie
Sent: Wednesday, November 03, 2010 7:35 AM
To: DCRM Revision Group List
Subject: Re: [DCRM-L] OCLC de-duping algorithms and dates of publication

 

Thanks for Annie’s comment. I have mixed feelings about the no de-duping of pre-1801 publications. Would OCLC really give preference to dcrm records if they were to de-dupe? Even over pcc records?   

__________________________________________

Deborah J. Leslie, M.A., M.L.S.

RBMS past chair 2010-2011 | Head of Cataloging, Folger Shakespeare Library

201 East Capitol St., S.E., Washington, D.C. 20003 | 202.675-0369 (phone)  202.675-0328 (fax) | djleslie at folger.edu  | www.folger.edu

 

 

From: dcrm-l-bounces at lib.byu.edu [mailto:dcrm-l-bounces at lib.byu.edu] On Behalf Of ANN W. COPELAND
Sent: Tuesday, 02 November, 2010 22:45
To: Erin Blake
Cc: DCRM Revision Group List
Subject: Re: [DCRM-L] OCLC de-duping algorithms and dates of publication

 

Interestingly, when we asked about permissible duplicates (one DCRM, one AACR2) OCLC said they did NOT want duplicate records. Instead they wanted to merge records with the DCRM record surviving as the master record. So, why exempt pre-1800 books from the de-duping? Why not work the algorithm to favor DCRM? 

Thanks, Annie



