http://www.loc.gov/bibframe/
Library of Congress Thomas Jefferson Building
1
Assessment and Next Steps
June 15, 2016
Director for Acquisitions & Bibliographic Access
Library of Congress
} LC engaged in linked data for several years
} First foray was sharing its authority data
} LC created its Linked Data Service (id.loc.gov)
in 2009
} Library of Congress Subject Headings offered
as first set of authority data
} Name authorities and various vocabularies
followed
} Id.loc.gov played integral role in BIBFRAME
Pilot
4
} BIBFRAME’s beginnings were some four years
ago
} LC pressured for years to develop a
replacement for MARC
} LC Working Group on the Future of
Bibliographic Control’s On the Record was
final push for LC
} The time was never quite right for a structure
that was considered feasible
} With introduction of linked data LC saw a
viable structure
5
} LC contracted with Zepheira to develop model
that became BIBFRAME model and vocabulary 1.0
} Development of BIBFRAME 1.0 accomplished
with input from community
} Initially, LC had collaboration of early
experimenters, including Princeton, George
Washington, Cornell, the National Library of
Medicine, Deutsche Nationalbibliothek, and the
British Library
6
} This initial work and collaboration helped LC
stabilize the BIBFRAME model and vocabulary 1.0
} This work continued for several years
} By late 2014/early 2015, made determination
that LC mount a pilot to test
◦ efficacy of BIBFRAME and the
◦ ability of cataloging staff to create bibliographic
data in BIBFRAME structure
7
} Ca. 40 staff identified for the Pilot
} Mix of catalogers and technicians that catalog
◦ Materials in all languages, scripts and formats
◦ Monographs, serials, cartographic materials, music
(notated), sound recordings, moving image, and
two-dimensional art (prints and photographs)
} Process materials they regularly receive
8
} Required to catalog in both the MARC 21
format and BIBFRAME
◦ Dual data creation affected the participants’ normal
production
◦ No attempt to address the impact of BIBFRAME on
production
9
} Pilot participants were pioneers
} Working in a system still under development
} Attended 16 hours of instruction on Semantic
Web, Linked Data, and use of the BIBFRAME
Editor
} COIN—Cooperative & Instructional Programs
Division staff members provided the training
} Training materials available from the
Cataloger’s Learning Workshop website
http://www.loc.gov/catworkshop/bibframe/
10
} Module 1: Introduction to the Semantic Web
and Linked Data (four and a half hours)
} Module 2: Introduction to BIBFRAME Tools
(two and a half hours)
} Taught using PowerPoint slides, Quizzes, and
Exercises
11
} Module 3 consisted of two Units:
◦ Unit 1—recap of major concepts of the Semantic Web
and Linked Data
– considered necessary because of significant time gap since
pilot participants first exposed to these concepts, and
because some found the concepts themselves difficult to
understand
◦ Unit 2—
– primary goal to provide hands-on training on use of
BIBFRAME Editor to create BIBFRAME “records”
– secondary goals to explain Pilot ‘ground rules’ and to
prepare participants to be effective testers and provide
helpful feedback.
12
} Module 3, Unit 1:
◦ 40-slide PowerPoint presentation
} Module 3, Unit 2:
◦ 51-page manual, with plentiful screen captures to
show participants what they should see at the
various stages of working in the Editor
13
} Participants began using the BIBFRAME Editor
immediately after training in its use
} Entered data into both the LC ILS (Voyager)
and the BIBFRAME Editor
◦ Created MARC records in LC ILS first
} Weekly ‘de-briefings’ held to help the
participants, instructors, and developers
} Midway through Pilot, participants instructed
to enter data into BIBFRAME Editor and then
create MARC record in LC ILS
14
} Searching was available against primary datasets
on LC Linked Data Service Authorities and
Vocabularies web site, id.loc.gov
◦ Initially LC/NACO Authority File and Library of
Congress Subject Headings (LCSH)
◦ Later, additional datasets from id.loc.gov were
made searchable from the Editor
} More datasets were searchable via the Editor,
as well
◦ including some controlled lists from Resource
Description & Access (RDA)
15
} Partway into the Pilot, it became possible to
access previously input BIBFRAME descriptions
} Descriptions could not be edited
} Descriptions created in BIBFRAME did not
constitute a database of record
} Descriptions not distributed as part of the
Library’s cataloging distribution service
16
} No changes made in workflow
} Participants were still creating MARC records
in the LC ILS
} Not operating in production mode
} Data created will eventually be discarded
17
} Good understanding of RDA needed for
working in the BIBFRAME Editor
} Need to converse using RDA terminology
rather than MARC coding
} Participants wanted to see and analyze
BIBFRAME RDF serializations created during
Pilot
} Reinforced training objectives on the
Semantic Web and Linked Data presented in
Modules 1 and 2
18
} Phase One of BIBFRAME Pilot lasted six
months (October 1, 2015 – March 31, 2016)
} Pilot participants continuing to catalog in
BIBFRAME Editor to retain skills
} BIBFRAME data continue to be created and
analyzed
} Once the LC BIBFRAME 2.0 Pilot is underway, data
created using the BIBFRAME 1.0 model will be
discarded
19
} Network Development and MARC Standards
Office—NDMSO created technical components
that supported Pilot
} Components included
◦ most of LC’s MARC bibliographic records
transformed into BIBFRAME descriptions
◦ controlled authority and term lists with URIs
◦ input editor for the participants to use
20
} Pilot’s focus was input of data and impact on
catalogers
} Function of end user access was not studied
} System did not support
◦ recording of holdings
◦ acquisitions processes
◦ description distribution functions
} 2,000 records created in the Pilot made
available in a bulk download file
21
} Pilot participants submitted over 2,000
descriptions to the system
} Eight profiles for different resource types
established to assist with input:
◦ monographs, serials
◦ notated music
◦ Cartographic materials
◦ BluRay DVD, Audio CD
◦ 35mm Feature Film
◦ prints/photographs
22
} Modeling of Works and Instances was clear
} Participants generally just looked for the RDA
rule and viewed it or put in the value
} How it was packaged by the BIBFRAME model
was not that important to know
} Underscored the dichotomy between the
FRBR/RDA and BIBFRAME models
23
} Dropdowns and lookups were popular
features
} They
◦ improved accuracy of data strings
◦ provided the data linking URIs without keying them
◦ made input more efficient
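
To make the lookup idea concrete, here is a minimal sketch of how a look-ahead field might return a controlled label together with its URI, so the cataloger never keys the URI by hand. It is an illustration only, not the BIBFRAME Editor's implementation: the `SAMPLE_TERMS` table, the `lookup` function, and the `example.org` URIs are all hypothetical placeholders.

```python
# Hypothetical in-memory term list; labels and URIs are placeholders,
# not real id.loc.gov identifiers.
SAMPLE_TERMS = {
    "Libraries": "https://example.org/subjects/1",
    "Library catalogs": "https://example.org/subjects/2",
    "Maps": "https://example.org/subjects/3",
}


def lookup(prefix, terms=SAMPLE_TERMS):
    """Return (label, uri) pairs whose label starts with the typed prefix."""
    typed = prefix.casefold()
    return sorted(
        (label, uri)
        for label, uri in terms.items()
        if label.casefold().startswith(typed)
    )


# Typing "lib" offers two candidate terms; selecting one supplies its URI.
matches = lookup("lib")
```

Because the selected entry carries both the label and the URI, the stored data string is consistent and the link is supplied without manual keying.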
24
} BIBFRAME editor used labels
◦ closely synchronized with RDA
◦ linked to key RDA rules for an element
} Participants found the labels and RDA rule
links very helpful
} Treatment of Expressions in BIBFRAME model
required additional explanation
} BIBFRAME model considers an Expression a
Work with links between the RDA Work and
RDA Expression
25
} Searching as implemented was adequate but
could be improved
} Look ahead fields were very useful for known
item searching
} Some “what do you have like this” searching
was helpful
} Known item searching usually sufficed
26
} Decision made to simulate BIBFRAME
environment
} Required conversion of LC file of 18 million
MARC bibliographic records to provide
BIBFRAME file against which to catalog
} 13.5 million records converted
◦ split into Work and Instance records
– 13.4 million Work records
– 13.85 million Instance records
} Transformation was credible, but a work in
progress
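
As a rough illustration of the Work/Instance split described above, the sketch below divides one flat bibliographic record into a Work part and an Instance part that points back to it. This is an assumption-laden toy, not LC's conversion: the field names, the `split_record` function, and the identifier scheme are hypothetical, and the counts on the slide suggest the real transformation was not strictly one-to-one, which this sketch ignores.

```python
# Illustrative field groupings; these are not MARC tags or BIBFRAME properties.
WORK_FIELDS = ("title", "creator", "subjects")
INSTANCE_FIELDS = ("publisher", "date", "isbn")


def split_record(record):
    """Split a flat record (a dict with an 'id' key) into (work, instance)."""
    rec_id = record["id"]
    # The Work holds the conceptual content of the resource.
    work = {"id": f"work:{rec_id}"}
    work.update({k: record[k] for k in WORK_FIELDS if k in record})
    # The Instance holds the publication details and links to its Work.
    instance = {"id": f"instance:{rec_id}", "instance_of": work["id"]}
    instance.update({k: record[k] for k in INSTANCE_FIELDS if k in record})
    return work, instance


work, instance = split_record(
    {"id": "rec1", "title": "Sample title", "publisher": "Sample press"}
)
```

The `instance_of` link is what lets several Instances share one Work in a fuller model.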
27
} Good enough to illustrate Work/Instance
separation, although not thoroughly tested in the
Pilot
} MARC Authority records needed by the catalogers
already converted to RDF and loaded into the LC
Linked Data Service
} For Pilot, name authorities were changed from
weekly load to daily load to provide up-to-date
authority lookup
} Providing input of new authority descriptions into
the BIBFRAME system was desirable but could not
be met in the timeframe
28
} Pilot achieved its aim and is considered a
success
} Input from catalogers participating in testing
the system enabled those developing
BIBFRAME to make considerable strides in its
development
} BIBFRAME 2.0 model and vocabulary
◦ released
◦ will form the basis of the next phase of a pilot in
fall 2016—not before October
29
} LC will continue to refine BIBFRAME model
and vocabulary 2.0
} LC, as member of LD4P—Linked Data for
Production, will work with 5 institutions
funded by a Mellon grant to test BIBFRAME
2.0
◦ Stanford
◦ Cornell
◦ Columbia
◦ Harvard
◦ Princeton
30
} Beacher Wiggins bwig@loc.gov
} Director for Acquisitions & Bibliographic
Access
} Library of Congress
} 101 Independence Avenue, SE
} Washington, DC 20540
} (202) 707-5137 FAX--(202) 707-6269
31
