<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:p="urn:schemas-microsoft-com:office:powerpoint" xmlns:a="urn:schemas-microsoft-com:office:access" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" xmlns:b="urn:schemas-microsoft-com:office:publisher" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet" xmlns:odc="urn:schemas-microsoft-com:office:odc" xmlns:oa="urn:schemas-microsoft-com:office:activation" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns:q="http://schemas.xmlsoap.org/soap/envelope/" xmlns:rtc="http://microsoft.com/officenet/conferencing" xmlns:D="DAV:" xmlns:Repl="http://schemas.microsoft.com/repl/" xmlns:mt="http://schemas.microsoft.com/sharepoint/soap/meetings/" xmlns:x2="http://schemas.microsoft.com/office/excel/2003/xml" xmlns:ppda="http://www.passport.com/NameSpace.xsd" xmlns:ois="http://schemas.microsoft.com/sharepoint/soap/ois/" xmlns:dir="http://schemas.microsoft.com/sharepoint/soap/directory/" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:dsp="http://schemas.microsoft.com/sharepoint/dsp" xmlns:udc="http://schemas.microsoft.com/data/udc" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sub="http://schemas.microsoft.com/sharepoint/soap/2002/1/alerts/" xmlns:ec="http://www.w3.org/2001/04/xmlenc#" xmlns:sp="http://schemas.microsoft.com/sharepoint/" xmlns:sps="http://schemas.microsoft.com/sharepoint/soap/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:udcs="http://schemas.microsoft.com/data/udc/soap" xmlns:udcxf="http://schemas.microsoft.com/data/udc/xmlfile" xmlns:udcp2p="http://schemas.microsoft.com/data/udc/parttopart" xmlns:wf="http://schemas.microsoft.com/sharepoint/soap/workflow/" 
xmlns:dsss="http://schemas.microsoft.com/office/2006/digsig-setup" xmlns:dssi="http://schemas.microsoft.com/office/2006/digsig" xmlns:mdssi="http://schemas.openxmlformats.org/package/2006/digital-signature" xmlns:mver="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns:mrels="http://schemas.openxmlformats.org/package/2006/relationships" xmlns:spwp="http://microsoft.com/sharepoint/webpartpages" xmlns:ex12t="http://schemas.microsoft.com/exchange/services/2006/types" xmlns:ex12m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:pptsl="http://schemas.microsoft.com/sharepoint/soap/SlideLibrary/" xmlns:spsl="http://microsoft.com/webservices/SharePointPortalServer/PublishedLinksService" xmlns:Z="urn:schemas-microsoft-com:" xmlns:st="" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:"Arial Unicode MS";
        panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:"Palatino Linotype";
        panose-1:2 4 5 2 5 5 5 3 3 4;}
@font-face
        {font-family:"\@Arial Unicode MS";
        panose-1:2 11 6 4 2 2 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
span.EmailStyle18
        {mso-style-type:personal;
        font-family:"Arial Unicode MS","sans-serif";
        color:#1F497D;}
span.EmailStyle19
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.EmailStyle20
        {mso-style-type:personal-reply;
        font-family:"Palatino Linotype","serif";
        color:#993366;
        font-weight:normal;
        font-style:normal;
        text-decoration:none none;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page Section1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.Section1
        {page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><span style='font-size:12.0pt;font-family:"Palatino Linotype","serif";
color:#993366'>I would welcome the exclusion of records coded dcrm, or one of its
predecessor codes, from the matching algorithms. <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:12.0pt;font-family:"Palatino Linotype","serif";
color:#993366'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:12.0pt;font-family:"Palatino Linotype","serif";
color:#993366'>The exclusion of pre-1801 imprints explains the large number of
duplicate results for early printed books. I'm not sure of its value; removing
this exclusion while adding one based on the 040 $e seems like it might be a better
way to go, but I bow to those with longer and broader experience of using WorldCat.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:12.0pt;font-family:"Palatino Linotype","serif";
color:#993366'><o:p> </o:p></span></p>
<div>
<div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'>
<p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span
style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>
dcrm-l-bounces@lib.byu.edu [mailto:dcrm-l-bounces@lib.byu.edu] <b>On Behalf Of </b>Dooley,Jackie<br>
<b>Sent:</b> Thursday, 20 May, 2010 14:12<br>
<b>To:</b> DCRM Revision Group List<br>
<b>Cc:</b> Chapman,John; Patton,Glenn<br>
<b>Subject:</b> [DCRM-L] OCLC's Glenn Patton on merging and de-duplication
of WorldCat records<o:p></o:p></span></p>
</div>
</div>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><span style='color:#1F497D'>I brought the recent
conversation among Richard Noble and others to the attention of Glenn Patton,
OCLC’s long-time expert in quality control issues (including record
de-duplication), and he provided the information below. In a nutshell:
OCLC does not de-dup records for any pre-1800 imprints, given the complexities
in determining what constitutes a “duplicate.” Further conversation
on this would be welcomed if those in the rare book cataloging community would
find it useful. Questions about when to input a new record are pertinent here.<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Best to all, Jackie<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Jackie Dooley<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Consulting Archivist<o:p></o:p></span></p>
<div style='border:none;border-bottom:solid windowtext 1.0pt;padding:0in 0in 1.0pt 0in'>
<p class=MsoNormal><span style='color:#1F497D'>OCLC Research and the RLG
Partnership<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
</div>
<p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'>OCLC</span><span style='font-size:10.0pt;color:#1F497D'>’</span><span
style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";color:#1F497D'>s
Duplicate Detection and Resolution software (DDR) does not merge records if one
of the imprint dates is pre-1800, nor would OCLC staff merge records in this
situation unless it were absolutely clear that the records represented the same
item (but we would be willing to work with someone who had gone through the
effort of working out which were true duplicates and which weren</span><span
style='font-size:10.0pt;color:#1F497D'>’</span><span style='font-size:
10.0pt;font-family:"Arial Unicode MS","sans-serif";color:#1F497D'>t).</span><span
style='font-size:10.0pt;color:#1F497D'> </span><span style='font-size:
10.0pt;font-family:"Arial Unicode MS","sans-serif";color:#1F497D'> <o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'>While the matching software used to load records prepared in
external systems into WorldCat is very similar to that used in DDR, it does not
include the pre-1800 exclusion.</span><span style='font-size:10.0pt;color:#1F497D'> </span><span
style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";color:#1F497D'>
We could consider some more complex exclusions that would be based on the 040
$e coding (e.g., exclude all with a </span><span style='font-size:10.0pt;
color:#1F497D'>‘</span><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'>dcrb[x]</span><span style='font-size:10.0pt;color:#1F497D'>’</span><span
style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";color:#1F497D'>
code and its predecessor codes) if the rare book community felt this
would be desirable.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'>It</span><span style='font-size:10.0pt;color:#1F497D'>’</span><span
style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";color:#1F497D'>s
certainly true that a WorldCat record can end up with holdings attached that
represent variations of the item described in the bibliographic record.</span><span
style='font-size:10.0pt;color:#1F497D'> </span><span style='font-size:
10.0pt;font-family:"Arial Unicode MS","sans-serif";color:#1F497D'> OCLC
matching has not always been as restrictive as it is now, and catalogers
certainly may have chosen “close” master records and then made
adaptations in their local systems.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'>The issue of not recording an edition statement based on a
reference source is a very problematic one.</span><span style='font-size:10.0pt;
color:#1F497D'> </span><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'> Having an edition statement (even a bracketed one) would, I
believe, prevent mismatches in both DDR and Batchload; having that information
in the “first note” </span><span style='font-size:10.0pt;
color:#1F497D'> </span><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'>(which I assume would be a 500, since the 503 is no longer
valid) is not the sort of thing that is “actionable” from a machine
matching perspective.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";
color:#1F497D'>It would be useful to carry forward this discussion with the
rare book community.</span><span style='font-size:10.0pt;color:#1F497D'> </span><span
style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";color:#1F497D'>
Nobody wants to play “fast and loose” with record merging, but, on
the other hand, I don</span><span style='font-size:10.0pt;color:#1F497D'>’</span><span
style='font-size:10.0pt;font-family:"Arial Unicode MS","sans-serif";color:#1F497D'>t
think people really want a situation where there</span><span style='font-size:
10.0pt;color:#1F497D'>’</span><span style='font-size:10.0pt;font-family:
"Arial Unicode MS","sans-serif";color:#1F497D'>s no attempt to match at all.<o:p></o:p></span></p>
<p class=MsoNormal><span style='font-size:10.0pt'><o:p> </o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Glenn E. Patton<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Director, WorldCat Quality
Management<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>OCLC<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>6565 Kilgour Place<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Dublin OH 43017-3395<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Phone: +1.800.828.5878, ext.
6371 or +1.614.764.6371<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Fax: +1.614.718.7187<o:p></o:p></span></p>
<p class=MsoNormal><span style='color:#1F497D'>Email: pattong@oclc.org<o:p></o:p></span></p>
</div>
</body>
</html>