[DCRM-L] 655 field -- migration

Lapka, Francis francis.lapka at yale.edu
Mon Jun 1 13:50:09 MDT 2020


Thank you Paloma for your thoughtful reply.

Full disclosure: though my questions are based in part on the anticipated headaches of a migration to BIBFRAME, they are also motivated by the desire to have records from my department (rare books & mss) play well with records from the art collections at my museum (e.g., paintings, prints, and drawings) in our shared online catalog. In the museum catalog, we have facets for genre and work type, which correspond (roughly) to Work genre/form and Instance genre/form.

I ran several reports this morning to get a clearer summary of the 27,000+ 655 occurrences in our records, especially those used most often. A review of this data (at a skim) confirms earlier hunches:

Terms from these thesauri are probably at least 90% aligned with Work:

  *   fast
  *   lcgft

Terms from these thesauri are probably at least 90% aligned with Instance:

  *   gmgpc
  *   rbpri
  *   rbpub

With consistent application of $5, terms from rbbin can be aligned with Instance or Item with a level of confidence near 100%.

With rbprov, terms should always align to Item.

Terms from these thesauri have the most ambiguous alignment, with some terms in real gray areas:

  *   aat: in my dataset, the alignment is about 60% Instance and 40% Work
  *   rbgenr: in my dataset, the alignment is about 75% Work and 25% Instance.

Then there's rbpap, which I'm inclined to map to bf:baseMaterial (or medium in my museum catalog) - rather than treat as a genre/form.
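
As a rough illustration, the alignments summarized above could be written down as a lookup keyed on the $2 source code. This is only a sketch: the function name align_655, the entity labels, and the choice to send aat and rbgenr (and any unlisted code) to manual review are assumptions made for the example, not part of any existing conversion specification.

```python
# Sketch: pick a BIBFRAME target for a 655 field from its $2 source code.
# Entity labels and the "review" fallback are illustrative assumptions.

SOURCE_DEFAULTS = {
    "fast": "Work",
    "lcgft": "Work",
    "gmgpc": "Instance",
    "rbpri": "Instance",
    "rbpub": "Instance",
    "rbprov": "Item",
}

def align_655(source, has_subfield_5=False):
    """Return 'Work', 'Instance', 'Item', 'baseMaterial', or 'review'."""
    code = source.strip().lower()
    if code == "rbbin":
        # With consistent $5 usage, $5 marks a copy-specific binding term.
        return "Item" if has_subfield_5 else "Instance"
    if code == "rbpap":
        # Treated as a material (bf:baseMaterial) rather than a genre/form.
        return "baseMaterial"
    # aat and rbgenr are too mixed to assign by source alone.
    return SOURCE_DEFAULTS.get(code, "review")
```

For example, align_655("rbbin", has_subfield_5=True) yields "Item", while align_655("aat") falls through to "review".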

A question for RBMS Controlled Vocabularies editors: when the six existing RBMS thesauri are integrated into a single cohesive thesaurus,<http://rbms.info/cv-comments/2019/01/09/rbms-controlled-vocabulary-reorganization-update/> will the six source codes ($2) be replaced with a single value? If so - and if this change transpires while we are still in a MARC environment (at least partially) - will there be something in our MARC data to indicate the facet to which a term belongs (works, objects, production, provenance, publishing)?

Francis


Francis Lapka
Senior Catalogue Librarian
Department of Rare Books and Manuscripts
Yale Center for British Art
203-432-9672  *  francis.lapka at yale.edu




From: DCRM-L <dcrm-l-bounces at lib.byu.edu> On Behalf Of Graciani Picardo, Paloma
Sent: Friday, May 29, 2020 6:29 PM
To: DCRM Users' Group <dcrm-l at lib.byu.edu>
Subject: Re: [DCRM-L] 655 field -- migration

Hi Francis,

At the Ransom Center we have started somewhat similar conversations, but from the angle of original BIBFRAME metadata creation. As members of the LD4P2 cohort (and in the context of the current shelter-at-home situation), we are spending a good chunk of our time "re-cataloging" one of our collections in the RDF-based cataloging platform developed for the project.

To your first question: we decided early on to assign each genre/form term to the proper BF entity, and that would be our expectation for a programmatic conversion. That said, while the parsing is obvious for what goes to the Item, it is sometimes less obvious (even for human eyes!) what goes to the Work or Instance, since that depends on the nature of the work/expression.

To your second question: even though we haven't discussed this yet at the Center, I would always feel more inclined towards Method B (adjust the 655 data before conversion so that the mapping is explicit). In my experience, you might lose some crucial data post-migration that could have been very helpful for data remediation decisions. That said, your question made me curious about the likelihood of approach A (leaving the 655 unchanged) working for us, so I had a quick look at our use of the 655 field over time. Of the 34,500 occurrences of field 655 in our catalog, there are around 50 different values in $2, reduced to 29 after some quick OpenRefine magic. More than half are "rbgenr" terms, with "fast", "aat", "rbbin", "gsafd" and "lcgft" being the only others with more than a thousand occurrences. Honestly, the only ones of these I would feel confident assigning programmatically based on source are "rbbin" and "gsafd". On the other hand, our use of $5 is very recent, so that would not work for our legacy data either. As for a conversion based on pattern recognition, well, we might have up to seven different variations for a given vocabulary term (typos, plural/singular, etc.), so that approach would have to be either preceded by some heavy pre-migration data cleanup or extremely complex.
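
A tally of this kind (occurrences per $2 value, after collapsing trivial variants) might be approximated outside OpenRefine along the following lines. This is a sketch only: the sample values, the helper names normalize_source and tally_sources, and the normalization rules (trim, lowercase, drop trailing punctuation) are assumptions for illustration, not the steps actually used at the Ransom Center.

```python
from collections import Counter

def normalize_source(value):
    """Collapse trivial variants of a 655 $2 code (case, spaces, trailing punctuation)."""
    return value.strip().lower().rstrip(".,;:")

def tally_sources(values):
    """Count 655 occurrences per normalized $2 code, most frequent first."""
    return Counter(normalize_source(v) for v in values).most_common()

# Hypothetical $2 values as they might appear in legacy records.
sample = ["rbgenr", "RBGENR", "rbgenr.", " fast", "aat", "rbgenr", "fast"]
print(tally_sources(sample))
```

Typos and singular/plural variants in the term itself ($a) would need clustering or an authority lookup, which this sketch does not attempt.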

The idea of implementing a new subfield to indicate the corresponding BF entity sounds very interesting, and I would like to hear more about the logistics or drawbacks of that approach. For instance, how would that new subfield impact data display in our current ILS/LSPs (assuming that those data manipulations would start well in advance of migration to BIBFRAME)? I would also be interested in further discussion of the challenge of associating any copy-specific access point (not just genre/form but former owners, annotators, inscribers, etc.) with the actual item when there are many items attached to one manifestation description.

Anyway, I better stop before I get too much into the weeds, but hope any of this helped. I am excited to see this conversation getting started and would also love to hear use cases and thoughts from other institutions!

Happy Friday!
Paloma

Paloma Graciani Picardo| she, her, hers
Metadata Librarian and Head, Printed & Published Media
Harry Ransom Center | The University of Texas at Austin
www.hrc.utexas.edu<http://www.hrc.utexas.edu/>


From: DCRM-L <dcrm-l-bounces at lib.byu.edu> On Behalf Of Lapka, Francis
Sent: Thursday, May 28, 2020 7:28 AM
To: DCRM Users' Group <dcrm-l at lib.byu.edu>
Subject: Re: [DCRM-L] 655 field -- migration

Thanks Stephen, Dorothy, and Jeff for your responses.

I think my original query has taken an inadvertent detour. It wasn't my intent to question the well-established practice of using subfield $5 in conjunction with 655 fields to indicate that a form/genre term applies to a specific copy. This is good and proper.

The main thrust of my question is this: when we reach the day when our catalogs are batch converted from MARC to BIBFRAME, how can we apply a machine-actionable mechanism (without human mediation) to ensure that the 655 headings are appropriately mapped to the pertinent items, instances, and works?

Certainly the presence of a subfield $5 is already a machine-actionable clue that a given 655 heading should be matched with an item (though it's regrettable that subfield $5 specifies an institution instead of an actual copy - esp. when there's more than one copy). But in current practice, there's no explicit indication whether a term should be matched to an instance (manifestation) or a work. What's the best way to plan for batch conversion processes that correctly identify the proper domain for all 655 headings?

Francis





From: DCRM-L <dcrm-l-bounces at lib.byu.edu> On Behalf Of Jeff Barton
Sent: Wednesday, May 27, 2020 10:05 PM
To: DCRM Users' Group <dcrm-l at lib.byu.edu>
Subject: Re: [DCRM-L] 655 field -- migration

We code as $5 (NjP) those 655 entries (and the accompanying 500 field notes) that refer to aspects at the item level - or "copy specific" in bibliographicese.  These are generally unique to the copy at hand.  The exception among the RBMS thesauri would be terms from the Genre thesaurus, which apply more broadly to the work.  I'd be interested in what others do in this regard too.

In that general connection, I'd also be interested to know how others are generally using subdivisions with Form/Genre terms.  I seem to recall from Midwinter that many were subdividing chronologically only by century.  Is that an accurate recollection?  I'd also be interested in if/how others might be subdividing 655 terms geographically, and if so at what level:  city, state/province/county, or country? If geographic aspects (inscriptions or bookseller tickets, for instance) can be reasonably identified with a location, that is.

Thanks



Jeff Barton

Cotsen Library Cataloger

Rare Books & Special Collections Department

Princeton University Library

One Washington Rd.

Princeton, NJ 08544

jpbarton at princeton.edu



________________________________
From: DCRM-L <dcrm-l-bounces at lib.byu.edu> on behalf of Young, Stephen <stephen.young at yale.edu>
Sent: Wednesday, May 27, 2020 9:31:18 PM
To: DCRM Users' Group <dcrm-l at lib.byu.edu>
Subject: Re: [DCRM-L] 655 field -- migration

At the Beinecke we do the same. I was mistaken to say "instance" and not "occurrence." Perhaps the 655s could be sorted out on that basis.

Stephen Young

________________________________
From: DCRM-L <dcrm-l-bounces at lib.byu.edu> on behalf of Auyong, Dorothy <dauyong at huntington.org>
Sent: Wednesday, May 27, 2020 7:40 PM
To: DCRM Users' Group <dcrm-l at lib.byu.edu>
Subject: Re: [DCRM-L] 655 field -- migration


At the Huntington we only apply subfield 5 CSmH to item occurrence, not instance.





Dorothy Auyong

Early Books and Codices Cataloging Manager

Acquisitions, Cataloging & Metadata Services



The Huntington

Library, Art Museum, and Botanical Gardens

1151 Oxford Road, San Marino, CA 91108

huntington.org<https://www.huntington.org/>



T 626-405-2188

E dauyong at huntington.org










From: DCRM-L <dcrm-l-bounces at lib.byu.edu> On Behalf Of Young, Stephen
Sent: Wednesday, May 27, 2020 11:31 AM
To: DCRM Users' Group <dcrm-l at lib.byu.edu>
Subject: Re: [DCRM-L] 655 field -- migration



Our practice for Beinecke records is to add subfield 5 CtY-BR to those 655s that apply to instance and not work. Do others follow a similar practice?



Stephen R. Young

Rare Book Catalog Librarian

Beinecke Rare Book and Manuscript Library

________________________________

From: DCRM-L <dcrm-l-bounces at lib.byu.edu> on behalf of Lapka, Francis <francis.lapka at yale.edu>
Sent: Wednesday, May 27, 2020 2:20 PM
To: 'dcrm-l at lib.byu.edu' <dcrm-l at lib.byu.edu>
Subject: [DCRM-L] 655 field -- migration



Hi all.



In a 2018 report on a MARC-to-BIBFRAME data conversion executed by Casalini for Yale, a Yale TF had this to say about the 655 field:



... Conversion of the 655 field clashes with the ambiguity inherent in MARC bibliographic records; this field could map to work, instance, or item properties. To overcome MARC's ambiguity in such fields, a converter would have to employ pattern recognition that goes beyond the MARC encoding: for example, treating all 655 fields with the value "rbprov" in subfield $2 as bf:genreForm with a domain of bf:Item. This might add considerable complexity to the conversion specification, but without this upfront complexity, the outcome will be one of diminished discovery. ...



As the day nears when we may need to convert our MARC to BIBFRAME for production usage, I'm curious if other institutions have started to make plans for how to migrate 655 data to the correct Work/Instance/Item (WII) entity. In BIBFRAME, the genreForm property<http://id.loc.gov/ontologies/bibframe.html#p_genreForm> can be used with work, instance, or item.  In the Library of Congress MARC-to-BIBFRAME conversion specification, the 655 field always maps to bf:Work (see this LC specification<https://www.loc.gov/bibframe/mtbf/ConvSpec-3XX-v1.5p.xlsx>).



Has your institution started to consider the issue? If so, what are your plans?



More questions:



  1.  Would it be acceptable to migrate all 655 data to the bf:Work?



  2.  If we'd like 655 data to migrate to Work, Instance, and Item, as appropriate, it seems that there are two broad options to consider:



     A.  Leave the 655 data unchanged (from current practice) but add a considerable amount of complexity to the conversion spec to make the WII distinctions; or



     B.  Adjust the 655 data in some manner before data conversion, so that WII distinctions are clearly articulated in MARC, requiring less complexity from the conversion spec.



  3.  In method A - leave the 655 data unchanged - what would be a reasonable strategy? Possibilities:



     *   Make broad mapping rules based on the thesaurus value in 655 subfield $2. For example, $2 rbprov data maps to Item (works well), $2 rbbin maps to Item (accurate more often than not), $2 gmgpc maps to Instance (mostly true), $2 rbgenr maps to Work (mostly true?), and so on.



     *   Make detailed mapping rules accounting for every expected term, e.g. Publisher's cloth bindings maps to Instance, and so on.



Both of these possibilities assume that the converter - likely the work of a vendor - is able to incorporate such complexity.
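
The two possibilities could also be combined, with term-level rules taking precedence over thesaurus-level defaults. The following is a minimal sketch under stated assumptions: the table names, the map_heading function, and the pairing of "Publisher's cloth bindings" with $2 rbbin are hypothetical, chosen only to show the precedence.

```python
# Sketch of method A: term-level rules override thesaurus-level defaults.
# Table contents echo the examples in this message; the pairing of the
# cloth-bindings term with rbbin is an assumption for illustration.

TERM_RULES = {
    # (source code, normalized term) -> entity
    ("rbbin", "publisher's cloth bindings"): "Instance",
}

THESAURUS_RULES = {
    "rbprov": "Item",
    "rbbin": "Item",
    "gmgpc": "Instance",
    "rbgenr": "Work",
}

def map_heading(source, term):
    """Return the target entity for a 655 heading, or None if no rule applies."""
    source = source.strip().lower()
    key = (source, term.strip().rstrip(".").lower())
    if key in TERM_RULES:
        return TERM_RULES[key]
    return THESAURUS_RULES.get(source)
```

A heading that matches neither table returns None, which a converter could route to manual review instead of defaulting silently to Work.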



  4.  In method B - adjust the 655 data before conversion - what's reasonable? I can think of at least one possibility:



     *   Add a new nugget of data to the 655 entry to declare a term's WII alignment. For example, how about a (newly defined) subfield $i? Such a subfield could align the data with WII/WEMI properties, e.g. 655 _7 $i gf-item $a Bookplates. $2 rbgenr
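
To show how little logic a converter would need under this option, here is a small sketch that reads the proposed $i from a display-form field string. Only the value gf-item appears in the proposal above; gf-work and gf-instance, along with the function names, are hypothetical extensions for the example, and a real converter would read structured MARC, not display text.

```python
import re

# Hypothetical $i values, extrapolated from the "gf-item" example above.
ALIGNMENT_VALUES = {"gf-work": "Work", "gf-instance": "Instance", "gf-item": "Item"}

def parse_subfields(field_text):
    """Split a display-form field such as '$i gf-item $a Bookplates. $2 rbgenr'
    into (code, value) pairs."""
    return [(code, value.strip())
            for code, value in re.findall(r"\$(\w)\s*([^$]*)", field_text)]

def declared_entity(field_text):
    """Return the entity declared in $i, or None when $i is absent or unrecognized."""
    for code, value in parse_subfields(field_text):
        if code == "i":
            return ALIGNMENT_VALUES.get(value.lower())
    return None

print(declared_entity("655 _7 $i gf-item $a Bookplates. $2 rbgenr"))
```

Fields without $i return None, so legacy data could still fall back to source-based rules.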





What are the other possibilities? What's most likely to succeed?



Francis







Francis Lapka

Senior Catalogue Librarian

Department of Rare Books and Manuscripts

Yale Center for British Art

203-432-9672  *  francis.lapka at yale.edu



