[DCRM-L] OCLC's duplicate detection & resolution software: two questions for the rare and archival materials communities

Elizabeth O'Keefe eokeefe at themorgan.org
Fri Sep 4 11:13:47 MDT 2015


Whew. Thanks for the quick clarification, Kate.

Liz

On Fri, Sep 4, 2015 at 1:08 PM, Kate Moriarty <moriarks at slu.edu> wrote:

> Liz and Lenore,
>
> You're right. I was remembering a discussion about automated RDA changes
> to OCLC records. Different issue - my apologies.
>
> -Kate
>
> On Fri, Sep 4, 2015 at 11:34 AM, Rouse, Lenore <rouse at cua.edu> wrote:
>
>> This is probably a dumb question, but even without amremm in a record,
>> under  what circumstance would OCLC ever merge a record for a
>> *manuscript*, which by definition is unique? I've operated under the
>> assumption that I would never have to worry about our ms. records being
>> merged.
>>
>> Re Jackie's question - I now catalog practically everything as DCRM but
>> this was not the case in this institution until perhaps 10 years ago or
>> whenever I wised up.  I haven't recataloged AACR2 records into dcrm either.
>> So there are indeed many post 1801 items that might easily succumb to
>> merging. I'd argue for an 1840 or 1850 cutoff date but that might be too
>> conservative for some.
>> Lenore
>>
>> --
>> Lenore M. Rouse
>> Curator, Rare Books and Special Collections
>> The Catholic University of America
>> Room 214, Mullen Library
>> 620 Michigan Avenue N.E.
>> Washington, D.C. 20064
>>
>> PHONE: 202 319-5090
>> E-MAIL: rouse at cua.edu
>> RBSC BLOG: http://ascendonica.blogspot.com/
>>
>>
>>
>> On 9/4/2015 11:37 AM, Kate Moriarty wrote:
>>
>> Thank you for this, Jackie and John.
>>
>> As others have stated, I would be in favor of moving the cut-off date to
>> a later date, though I'll leave it to those with a larger post-1801
>> collection to suggest a specific date.
>>
>> Jackie, regarding your 2nd question, I believe you mentioned last year
>> that OCLC would be adding "amremm" to the list of 040 $e DDR exemptions.
>> You said it wouldn't be easy - have you had any success with it?
>>
>> And in answer to your last question, we regularly code the 040 $e here
>> and, at least from the records I see in OCLC, it seems like others do, too.
>>
>> Thanks,
>> Kate
>>
>> On Fri, Sep 4, 2015 at 9:51 AM, Chapman,John <chapmanj at oclc.org> wrote:
>>
>>> Richard and Francis,
>>>
>>> We are asking if the 1801 cutoff (or the 1901 cartographic exception
>>> date) needs to be adjusted, but are not suggesting that it should be
>>> earlier. We would expect that, if a change is agreed upon, the dates would
>>> be later.
>>>
>>> We are asking the question of the DCRM-L community to see if there is
>>> any consensus that can be reached about a change, or if the current scheme
>>> is logical and can remain. The context that Richard provided should be
>>> helpful in the discussion.
>>>
>>> --
>>> John Chapman
>>> OCLC · Product Manager, Metadata Services
>>> 6565 Kilgour Place, Dublin, OH 43017 USA
>>> T +1-614-761-5272
>>>
>>>
>>> From: <dcrm-l-bounces at lib.byu.edu> on
>>> behalf of "Noble, Richard"
>>> Reply-To: DCRM Users' Group
>>> Date: Friday, September 4, 2015 at 10:23 AM
>>> To: DCRM Users' Group
>>> Subject: Re: [DCRM-L] OCLC's duplicate detection & resolution software:
>>> two questions for the rare and archival materials communities
>>>
>>> Quick response: the cut-off for books should, if anything, be later, not
>>> earlier. The year 1801 is arbitrary, as much established as it is in
>>> national bibliographies and the like. It seems to be understood as the end
>>> of the "hand-press period", which is historically not the case. For English
>>> books that would be no earlier than 1820, and for some continental books
>>> even later (I see German books of the 1840s printed direct from type on
>>> handmade laid paper, for instance).
>>>
>>> But the bibliographical significance of "hand-press" has been greatly
>>> exaggerated. While printers become more and more adept at covering their
>>> tracks as the c19 proceeds, bibliographical analysis and description are
>>> very much applicable to post-1801 books and post "hand-press" books, for
>>> the most basic of our FRBR purposes: the identification of manifestations,
>>> and, at the most learned level, the specification of diagnostic evidence
>>> for distinction of manifestations, as well as explicit accounting for
>>> evidence of variation within the body of items that constitute a
>>> manifestation.
>>>
>>> That said, I suppose--assuming that the exemption of dcrm records from
>>> automatic de-duping continues--the idea is to establish criteria by which
>>> to exempt a range of non-dcrm records as well. Earlier versions of dcrm
>>> tended to emphasize 1801/"hand-press period" as a cutoff for application of
>>> the special rules (and the consequent finer-grained analysis of supporting
>>> evidence and variation), so it made sense of a kind to specify that
>>> range. As tempting as it is, however, to limit dcrm to hand-press books
>>> because it is easier to analyze and describe them, I know from considerable
>>> experience that post-1801 books printed from plates, perhaps based on
>>> mechanical composition, are equally and more subtly variable.
>>>
>>> The whole body of pre-1801 works forms, I presume, a relatively small
>>> percentage of the material represented in the database, though the mass of
>>> duplicate records generated by uploading of incommensurably cataloged
>>> material is considerable. The problem is not so much the conflation of
>>> different manifestations indifferently described, as it is the loss of
>>> information that takes place when merged records are expunged, which
>>> precludes conscious and focused comparison--by catalogers well versed in
>>> the vagaries of legacy and minimal cataloging--as a check on de-duping
>>> errors.
>>>
>>> I would be dismayed to see an irreversible process applied to an even
>>> greater range of materials than before. IRs being a lost cause, this would
>>> be mitigated to some extent if records represented in 019 fields could be
>>> preserved for inspection (beyond the current brief grace period) in such a
>>> way as not to impede the operations of the WorldCat as a whole. But as
>>> Francis Lapka pointed out, the regression of the date cutoff does seem to
>>> be a retraction, not an expansion, of safeguards.
>>>
>>> RICHARD NOBLE :: RARE MATERIALS CATALOGUER :: JOHN HAY LIBRARY
>>> BROWN UNIVERSITY  ::  PROVIDENCE, R.I. 02912  ::  401-863-1187
>>> <Richard_Noble at Brown.edu>
>>>
>>> On Fri, Sep 4, 2015 at 9:00 AM, Lapka, Francis <francis.lapka at yale.edu>
>>> wrote:
>>>
>>>> Jackie,
>>>>
>>>> I'm grateful for your message, and pleased to hear that OCLC is
>>>> considering changes "to expand and strengthen the safeguards we already
>>>> apply to bibliographic records for unique, rare, and/or archival materials."
>>>>
>>>> At first blush, it would seem that moving the chronological exception
>>>> for de-duping to an earlier date might *weaken* the safeguards, since it
>>>> would make the exception apply to a smaller set of records. Could you tell
>>>> us more about the motivation for this particular change and how it might
>>>> serve to strengthen the safeguards?
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Francis
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Sep 04, 2015 at 4:18 AM, Dooley,Jackie <dooleyj at oclc.org> wrote:
>>>>
>>>>
>>>>
>>>> Dear DCRM-L --
>>>>
>>>>
>>>>
>>>> On behalf of my colleagues on OCLC's Metadata Quality Team, I'm writing
>>>> to pose two questions: 1) whether the pre-1801 cutoff for excluding records
>>>> from de-duplication should be changed to an earlier date, and 2) whether
>>>> additional cataloging code symbols should be added to the 040 $e exception.
>>>>
>>>>
>>>>
>>>> We're considering changes to the automated Duplicate Detection and
>>>> Resolution (DDR) software and are seeking community opinion before taking
>>>> action. The contemplated changes are *intended to expand and
>>>> strengthen the safeguards we already apply to bibliographic records for
>>>> unique, rare, and/or archival materials*. As members of the rare
>>>> and/or archival cataloging community, you are in an excellent position to
>>>> provide informed advice on these issues.
>>>>
>>>>
>>>>
>>>> First, some background. OCLC first developed the capability to merge
>>>> bibliographic records manually in 1983. During the late 1980s and early
>>>> 1990s, we developed automated DDR software, which dealt with Books records
>>>> only. From 2005 through 2009, OCLC developed a completely new version of
>>>> DDR that worked with all bibliographic formats. From the very beginning of
>>>> automated DDR back in 1991, *records for resources with dates of
>>>> publication/production earlier than 1801 have been set aside and not
>>>> processed*. More recently, in consultation with the American Library
>>>> Association (ALA) Map and Geospatial Information Round Table (MAGIRT)
>>>> Cataloging and Classification Committee (CCC), we have further *exempted
>>>> records for cartographic materials with dates of publication earlier than
>>>> 1901*. In addition, *we exempt from DDR processing all records for
>>>> resources that can be identified as photographs (Material Types “pht”
>>>> for photograph and/or “pic” for picture)*.
>>>>
>>>>
>>>>
>>>> Following discussions with representatives of the rare materials
>>>> community several years ago, *we also exempted from DDR processing all
>>>> records that are coded in field 040 subfield $e under description
>>>> conventions for rare materials codes "bdrb", "dcrb", "dcrmb", or "dcrms"*.
>>>> Please note that these DDR exemptions are *not* intended to apply to
>>>> electronic, microform, or other reproductions, only to the original
>>>> resources.
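
The exemption rules described above could be expressed roughly as follows. This is only a minimal sketch in Python, not OCLC's actual DDR code: the `Record` class, its field names, and the function `is_exempt_from_ddr` are hypothetical, and the sketch assumes that the exclusion of reproductions applies to every exemption rather than only to the 040 $e one.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Values taken from the message above; everything else is illustrative.
EXEMPT_040E_CODES = {"bdrb", "dcrb", "dcrmb", "dcrms"}
EXEMPT_MATERIAL_TYPES = {"pht", "pic"}
GENERAL_CUTOFF_YEAR = 1801        # dates earlier than 1801 are set aside
CARTOGRAPHIC_CUTOFF_YEAR = 1901   # cartographic dates earlier than 1901


@dataclass
class Record:
    """Hypothetical, simplified stand-in for a bibliographic record."""
    year: Optional[int] = None            # date of publication/production
    is_cartographic: bool = False
    is_reproduction: bool = False         # electronic, microform, etc.
    material_types: Set[str] = field(default_factory=set)
    description_conventions: Set[str] = field(default_factory=set)  # 040 $e


def is_exempt_from_ddr(rec: Record) -> bool:
    """Return True if the record would be set aside under the stated rules."""
    # Assumption: reproductions never qualify for any exemption.
    if rec.is_reproduction:
        return False
    if rec.year is not None:
        if rec.year < GENERAL_CUTOFF_YEAR:
            return True
        if rec.is_cartographic and rec.year < CARTOGRAPHIC_CUTOFF_YEAR:
            return True
    if rec.material_types & EXEMPT_MATERIAL_TYPES:
        return True
    if rec.description_conventions & EXEMPT_040E_CODES:
        return True
    return False
```

Under these assumptions, moving a cutoff later means raising one of the two year constants, and adding a code such as "dacs" or "dcrmg" means adding it to the 040 $e set.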
>>>>
>>>>
>>>>
>>>> The current DDR software is incredibly complicated and continues to be
>>>> fine-tuned on a regular basis. Although this is an oversimplification of a
>>>> complex process, there are now at least two dozen different points of
>>>> comparison taken into consideration. Many of these comparison points draw
>>>> data from multiple parts of a bibliographic record and involve manipulation
>>>> of data in ways designed to distinguish both variations that should be
>>>> equated and distinctions that must be recognized.
>>>>
>>>> As part of our ongoing efforts to improve DDR’s accuracy, we are
>>>> reaching out again to members of the rare materials and archival resources
>>>> communities, in particular, for feedback on the following questions:
>>>>
>>>>
>>>>    1. Within the context of the materials cataloged by your community,
>>>>    are there dates other than pre-1801 for most resources and pre-1901 for
>>>>    cartographic materials that would make more sense as an exemption cutoff?
>>>>    2. The current list of Description Convention Source Codes, found
>>>>    at
>>>>    http://www.loc.gov/standards/sourcelist/descriptive-conventions.html,
>>>>    has grown much more extensive in recent years. Aside from the four codes
>>>>    already exempted ("bdrb", "dcrb", "dcrmb", "dcrms"), are there others that
>>>>    it would make sense to consider exempting? Note that Description Convention
>>>>    Source Codes “appm”, “dacs”, “gihc”, and “dcrmg” have already been
>>>>    suggested for adding to the exemption list.
>>>>
>>>>
>>>>    1. Are there other well-accepted rare and/or archival materials
>>>>       descriptive standards that don’t currently have their own code, and so are
>>>>       absent from the MARC Code List? If so, would the relevant community be
>>>>       willing to request codes from LC?
>>>>       2. How faithfully do members of the relevant community actually
>>>>       code such records in field 040 subfield $e?
>>>>
>>>>
>>>>
>>>> Please reply either to the list or to me directly. We greatly
>>>> appreciate your input.
>>>>
>>>>
>>>>
>>>> Many thanks— Jackie
>>>>
>>>>
>>>>
>>>> -
>>>>
>>>> Jackie Dooley
>>>>
>>>> Program Officer, OCLC Research
>>>>
>>>> 647 Camino de los Mares, Suite 108-240
>>>>
>>>> San Clemente, CA 92673
>>>>
>>>> office/home 949-492-5060
>>>> mobile 949-295-1529
>>>> dooleyj at oclc.org
>>>>
>>>> OCLC.org/research
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Kate S. Moriarty, MSW, MLS  |  Rare Book Catalog Librarian  |  Associate
>> Professor  |  Pius XII Memorial Library  |  Room 320-2
>> Saint Louis University  |  3650 Lindell Blvd.  |  St. Louis, MO 63108  |  (314)
>> 977-3024 (tel)  |  (314) 977-3108 (fax)  |  moriarks at slu.edu  |
>> http://libraries.slu.edu/
>>
>>
>>
>
>
> --
> Kate S. Moriarty, MSW, MLS  |  Rare Book Catalog Librarian  |  Associate
> Professor  |  Pius XII Memorial Library  |  Room 320-2
> Saint Louis University  |  3650 Lindell Blvd.  |  St. Louis, MO 63108  |
> (314) 977-3024 (tel)  |  (314) 977-3108 (fax)  |  moriarks at slu.edu  |
> http://libraries.slu.edu/
>



-- 
Elizabeth O'Keefe
Director of Collection Information Systems
The Morgan Library & Museum
225 Madison Avenue
New York, NY  10016-3405

TEL: 212 590-0380
FAX: 2127685680
NET: eokeefe at themorgan.org

Visit CORSAIR, the Library's comprehensive collections catalog:
http://corsair.themorgan.org