<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif;">
<div>
<div>
<div>
<div>Richard and Francis,</div>
<div> </div>
<div>We are asking if the 1801 cutoff (or the 1901 cartographic exception date) needs to be adjusted, but are not suggesting that it should be earlier. We would expect that, if a change is agreed upon, the dates would be later.</div>
<div> </div>
<div>We are asking the question of the DCRM-L community to see if there is any consensus that can be reached about a change, or if the current scheme is logical and can remain. The context that Richard provided should be helpful in the discussion.</div>
<div> </div>
</div>
<div>
<div id="MAC_OUTLOOK_SIGNATURE">
<div>
<div>--</div>
<div>
<div>John Chapman</div>
<div>OCLC · Product Manager, Metadata Services</div>
<div>6565 Kilgour Place, Dublin, OH 43017 USA </div>
<div>T +1-614-761-5272</div>
</div>
</div>
<div><br>
</div>
</div>
</div>
</div>
</div>
<div><br>
</div>
<span id="OLK_SRC_BODY_SECTION">
<div style="font-family:Calibri; font-size:12pt; text-align:left; color:black; BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; PADDING-BOTTOM: 0in; PADDING-LEFT: 0in; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; BORDER-RIGHT: medium none; PADDING-TOP: 3pt">
<span style="font-weight:bold">From: </span>&lt;<a href="mailto:dcrm-l-bounces@lib.byu.edu">dcrm-l-bounces@lib.byu.edu</a>&gt; on behalf of "Noble, Richard"<br>
<span style="font-weight:bold">Reply-To: </span>DCRM Users' Group<br>
<span style="font-weight:bold">Date: </span>Friday, September 4, 2015 at 10:23 AM<br>
<span style="font-weight:bold">To: </span>DCRM Users' Group<br>
<span style="font-weight:bold">Subject: </span>Re: [DCRM-L] OCLC's duplicate detection & resolution software: two questions for the rare and archival materials communities<br>
</div>
<div><br>
</div>
<div>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-family:georgia,serif;font-size:small">Quick response: the cut-off for books should, if anything, be later, not earlier. The year 1801 is arbitrary, however well established it is in national bibliographies and the like.
It seems to be understood as the end of the "hand-press period", which is historically not the case. For English books that would be no earlier than 1820, and for some continental books even later (I see German books of the 1840s printed direct from type on
handmade laid paper, for instance).</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small"><br>
</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small">But the bibliographical significance of "hand-press" has been greatly exaggerated. While printers become more and more adept at covering their tracks as the c19 proceeds, bibliographical
analysis and description are very much applicable to post-1801 books and post "hand-press" books, for the most basic of our FRBR purposes: the identification of manifestations, and, at the most learned level, the specification of diagnostic evidence for distinction
of manifestations, as well as explicit accounting for evidence of variation within the body of items that constitute a manifestation.</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small"><br>
</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small">That said, I suppose--assuming that the exemption of dcrm records from automatic de-duping continues--the idea is to establish criteria by which to exempt a range of non-dcrm records
as well. Earlier versions of dcrm tended to emphasize 1801/"hand-press period" as a cutoff for application of the special rules (and the consequent finer-grained analysis of supporting evidence and variation), so it made sense of a kind to specify that
range. As tempting as it is, however, to limit dcrm to hand-press books because it is easier to analyze and describe them, I know from considerable experience that post-1801 books printed from plates, perhaps based on mechanical composition, are equally and
more subtly variable.</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small"><br>
</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small">The whole body of pre-1801 works forms, I presume, a relatively small percentage of the material represented in the database, though the mass of duplicate records generated by uploading
of incommensurably cataloged material is considerable. The problem is not so much the conflation of different manifestations indifferently described, as it is the loss of information that takes place when merged records are expunged, which precludes conscious
and focused comparison--by catalogers well versed in the vagaries of legacy and minimal cataloging--as a check on de-duping errors.</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small"><br>
</div>
<div class="gmail_default" style="font-family:georgia,serif;font-size:small">I would be dismayed to see an irreversible process applied to an even greater range of materials than before. IRs being a lost cause, this would be mitigated to some extent if records
represented in 019 fields could be preserved for inspection (beyond the current brief grace period) in such a way as not to impede the operations of the WorldCat as a whole. But as Francis Lapka pointed out, the regression of the date cutoff does seem to be
a retraction, not an expansion, of safeguards.</div>
</div>
<div class="gmail_extra"><br clear="all">
<div>
<div class="gmail_signature"><font face="courier new,monospace">RICHARD NOBLE :: RARE MATERIALS CATALOGUER :: JOHN HAY LIBRARY</font>
<div><font face="courier new,monospace">BROWN UNIVERSITY :: PROVIDENCE, R.I. 02912 :: 401-863-1187</font></div>
<div><span style="font-family: 'courier new', monospace;">&lt;</span><a href="mailto:RICHARD_NOBLE@BROWN.EDU" style="font-family:'courier new',monospace" target="_blank">Richard_Noble@Brown.edu</a><span style="font-family: 'courier new', monospace;">&gt;</span></div>
</div>
</div>
<br>
<div class="gmail_quote">On Fri, Sep 4, 2015 at 9:00 AM, Lapka, Francis <span dir="ltr">
&lt;<a href="mailto:francis.lapka@yale.edu" target="_blank">francis.lapka@yale.edu</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="#0563C1" vlink="#954F72">
<div>
<p><span style="font-family: Georgia, serif;">Jackie,</span><span style="font-size: 12pt; font-family: Georgia, serif;"><u></u><u></u></span></p>
<p><span style="font-family: Georgia, serif;">I'm grateful for your message, and pleased to hear that OCLC is considering changes "to expand and strengthen the safeguards we already apply to bibliographic records for unique, rare, and/or archival materials."<u></u><u></u></span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-family: Georgia, serif;">At first blush, it would seem that moving the chronological exception for de-duping to an earlier date might *weaken* the safeguards, since it would make the exception apply to a smaller set of
records. Could you tell us more about the motivation for this particular change and how it might serve to strengthen the safeguards?<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family: Georgia, serif;"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family: Georgia, serif;">Thanks<span class="HOEnZb"><font color="#888888"><u></u><u></u></font></span></span></p>
<span class="HOEnZb"><font color="#888888"></font></span></div>
<span class="HOEnZb"><font color="#888888">
<div>
<p class="MsoNormal"><span style="font-family: Georgia, serif;">Francis<u></u><u></u></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</font></span></div>
<div>
<div class="h5">
<div>
<p class="MsoNormal"><span style="color:#787878">On Fri, Sep 04, 2015 at 4:18 AM, Dooley,Jackie &lt;<a href="mailto:dooleyj@oclc.org" target="_blank">dooleyj@oclc.org</a>&gt; wrote:<u></u><u></u></span></p>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="background:white"><span style="font-size:10.5pt"> </span> Dear DCRM-L -- <u></u><u></u></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="background:white"> <span style="font-family: 'Times New Roman', serif;"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:.5in;background:white">On behalf of my colleagues on OCLC's Metadata Quality Team, I'm writing to pose two questions: 1) whether the pre-1801 cutoff for excluding records from de-duplication should be changed to an earlier
date, and 2) whether additional cataloging code symbols should be added to the 040 $e exception. <u></u><u></u></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal" style="background:white"><u></u> <u></u></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in;background:white">We're considering changes to the automated Duplicate Detection and Resolution (DDR) software and are seeking community opinion before taking action. The contemplated changes are
<b>intended to expand and strengthen the safeguards we already apply to bibliographic records for unique, rare, and/or archival materials</b>. As members of the rare and/or archival cataloging community, you are in an excellent position to provide informed
advice on these issues.<span style="font-family: 'Times New Roman', serif;"><u></u><u></u></span></p>
</div>
</div>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in;background:white"> <u></u><u></u></p>
<p class="MsoNormal" style="margin-left:.5in;background:white">First, some background. OCLC first developed the capability to merge bibliographic records manually in 1983. During the late 1980s and early 1990s, we developed automated DDR software, which dealt
with Books records only. From 2005 through 2009, OCLC developed a completely new version of DDR that worked with all bibliographic formats. From the very beginning of automated DDR back in 1991,
<b>records for resources with dates of publication/production earlier than 1801 have been set aside and not processed</b>. More recently, in consultation with the American Library Association (ALA) Map and Geospatial Information Round Table (MAGIRT) Cataloging
and Classification Committee (CCC), we have further <b>exempted records for cartographic materials with dates of publication earlier than 1901</b>.
<b>In addition, </b>we exempt from DDR processing all records for resources that can be identified as<b> photographs (Material Types “pht” for photograph and/or “pic” for picture)</b>.<u></u><u></u></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal" style="background:white"><u></u> <u></u></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in;background:white">Following discussions with representatives of the rare materials community several years ago,
<b>we also exempted from DDR processing all records that are coded in field 040 subfield $e with the rare materials description convention codes "bdrb", "dcrb", "dcrmb", or "dcrms"</b>. Please note that these DDR exemptions are
<i>not</i> intended to apply to electronic, microform, or other reproductions, only to the original resources.<span style="font-family: 'Times New Roman', serif;"><u></u><u></u></span></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal" style="background:white"><u></u> <u></u></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in;background:white">The current DDR software is incredibly complicated and continues to be fine-tuned on a regular basis. Although this is an oversimplification of a complex process, there are now at least two dozen
different points of comparison taken into consideration. Many of these comparison points draw data from multiple parts of a bibliographic record and involve manipulation of data in ways designed to distinguish both variations that should be equated and distinctions
that must be recognized.<span style="font-family: 'Times New Roman', serif;"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:.5in;background:white">As part of our ongoing efforts to improve DDR’s accuracy, we are reaching out again to members of the rare materials and archival resources communities, in particular, for feedback on the following
questions:<u></u><u></u></p>
</div>
</div>
</div>
<blockquote style="margin-left:30.0pt;margin-right:0in">
<ol start="1" type="1">
<li class="MsoNormal" style="background:white">Within the context of the materials cataloged by your community, are there dates other than pre-1801 for most resources and pre-1901 for cartographic materials that would make more sense as an exemption cutoff?<u></u><u></u></li><li class="MsoNormal" style="background:white">The current list of Description Convention Source Codes, found at
<a href="http://www.loc.gov/standards/sourcelist/descriptive-conventions.html" target="_blank">
http://www.loc.gov/standards/sourcelist/descriptive-conventions.html</a>, has grown much more extensive in recent years. Aside from the four codes already exempted ("bdrb", "dcrb", "dcrmb", "dcrms"), are there others that it would make sense to consider exempting?
Note that Description Convention Source Codes “appm”, “dacs”, “gihc”, and “dcrmg” have already been suggested for adding to the exemption list.
<u></u><u></u></li></ol>
<ol start="2" type="1">
<ol start="1" type="1">
<li class="MsoNormal" style="background:white">Are there other well-accepted rare and/or archival materials descriptive standards that don’t currently have their own code, and so are absent from the MARC Code List? If so, would the relevant community be willing
to request codes from LC?<u></u><u></u></li><li class="MsoNormal" style="background:white">How faithfully do members of the relevant community actually code such records in field 040 subfield $e?<u></u><u></u></li></ol>
</ol>
<div>
<div>
<div>
<div>
<p class="MsoNormal" style="background:white"><u></u> <u></u></p>
</div>
<p class="MsoNormal" style="background:white">Please reply either to the list or to me directly. We greatly appreciate your input.<u></u><u></u></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal" style="background:white"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal" style="background:white">Many thanks— Jackie<u></u><u></u></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="background:white"><u></u> <u></u></p>
</div>
<table border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="100%" valign="top" style="width:100.0%;padding:0in 0in 0in 0in">
<p class="MsoNormal"><span style="color:#333f48">-</span><u></u><u></u></p>
</td>
</tr>
<tr>
<td width="100%" valign="top" style="width:100.0%;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><span style="color:#333f48">Jackie Dooley</span><u></u><u></u></p>
</td>
</tr>
<tr>
<td valign="bottom" style="padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><span style="color:#333f48">Program Officer, OCLC Research</span><u></u><u></u></p>
</td>
</tr>
<tr>
<td valign="bottom" style="padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><span style="color:#333f48">647 Camino de los Mares, Suite 108-240</span><u></u><u></u></p>
<p class="MsoNormal"><span style="color:#333f48">San Clemente, CA 92673</span><u></u><u></u></p>
</td>
</tr>
<tr>
<td valign="bottom" style="padding:3.0pt 0in 0in 0in">
<p class="MsoNormal">office/home <a href="tel:949-492-5060" value="+19494925060" target="_blank">
949-492-5060</a><br>
mobile <a href="tel:949-295-1529" value="+19492951529" target="_blank">949-295-1529</a><br>
<a href="mailto:dooleyj@oclc.org" target="_blank">dooleyj@oclc.org</a><u></u><u></u></p>
</td>
</tr>
<tr>
<td valign="top" style="padding:6.0pt 0in 3.75pt 0in">
<p class="MsoNormal"><a href="http://www.oclc.org/home.en.html?cmpid=emailsig_logo" target="_blank"><span style="color:blue;text-decoration:none"><img border="0" width="118" height="42" src="http://www.oclc.org/content/dam/ext-ref/emailsignature/oclc-logo-emailsignature.png" alt="OCLC"></span></a><u></u><u></u></p>
</td>
</tr>
<tr>
<td valign="top" style="padding:0in 0in 4.5pt 0in">
<p class="MsoNormal"><span style="color:#2178b5"><a href="http://www.oclc.org/home.en.html?cmpid=emailsig_link" target="_blank"><span style="color:#2178b5;text-decoration:none">OCLC.org</span></a>/research</span><u></u><u></u></p>
</td>
</tr>
</tbody>
</table>
<p class="MsoNormal" style="background:white"><u></u> <u></u></p>
</div>
</div>
</blockquote>
<div>
<div>
<blockquote style="margin-left:30.0pt;margin-right:0in">
<div>
<p class="MsoNormal" style="background:white"><u></u> <u></u></p>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</span>
</body>
</html>