Presentation given by Edward Corrado on 11/14/11 at the University at Buffalo Libraries symposium entitled "Research Data: Management, Access, Control."
Preservation and Research Data at Binghamton University Libraries by Edward Corrado
1. Edward M. Corrado | ecorrado@binghamton.edu
Binghamton University Libraries | http://library.binghamton.edu/
Presented 14 November 2011 at Research Data: Management, Access and Control Symposium, University at Buffalo
2. Binghamton University, one of four comprehensive doctoral research universities within the State University of New York, is recognized for stellar academics, an international focus, high graduation rates and overall value
§ Undergraduates: 11,706
§ Graduate students: 3,007
§ Average SAT score for 2011 incoming freshmen: 1305
§ Top 25% of high school class: 85%
§ Students of color: 33.3%
§ International students: 10%
§ #1 in 2011 as a best value among the nation's public colleges for out-of-state students and #5 overall (Kiplinger's Personal Finance, 2011)
§ Students come from all 50 states and 100 countries
Binghamton University: "The premier public university in the northeast" and "best buy" (Fiske Guide To Colleges, 2010)
5. ¡ Backups ≠ Preservation
  § "Actions required to maintain access to digital materials beyond the limits of media failure or technological change." (Digital Preservation Coalition, 2009)
  § Backups alone are not sufficient
  § Don't protect against obsolete file formats, software, hardware, etc.
¡ Providing access ≠ Preservation
  § Digital Asset management systems offer access but not [necessarily] long term preservation
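The format-obsolescence risk named on this slide is easier to reason about once a collection's formats have been counted. The following is a minimal Python sketch, assuming that a file's extension is a good enough stand-in for its format (dedicated preservation tools identify formats by file signature instead). The function name `format_inventory` is hypothetical and not taken from the presentation.

```python
from collections import Counter
from pathlib import Path


def format_inventory(root: Path) -> Counter:
    """Count the files under `root` by lower-cased extension.

    Files without an extension are grouped under "(none)".
    Extension-based counting is an assumption made for this sketch.
    """
    return Counter(
        (path.suffix.lower() or "(none)")
        for path in root.rglob("*")
        if path.is_file()
    )
```

Calling `format_inventory(Path("some/collection")).most_common()` lists the most frequent extensions first, which gives a starting point for deciding which formats need a migration plan.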
6. ¡ While Digital preservation can support Open access and/or Open data, preservation does not and cannot always imply Openness
  ▪ Patents and other legal issues
  ▪ Confidential data such as Blood Serum Collection
  ▪ Researcher/Discipline Norms
  ▪ Discipline Specific Repositories such as arXiv.org and Inter-University Consortium for Political and Social Research (ICPSR)
7. ¡ JISC Beginner's Guide to Digital Preservation elaborates:
  § Managed: Digital preservation is a Management problem.
  § Activities: The policy needs to filter down to a list of processes: tasks that can take place at specified times and in specified ways.
  § Necessary: What needs to be done. How long do you want to preserve the objects for? Discussions about the activities needed to achieve a level of preservation are necessary.
  § Continued Access: Access is the key here. Most objects in the public sphere are preserved to support access and retrieval.
  § Digital Materials: Digital materials, digital objects, call them what you will. This is the stuff you are preserving. Different objects require different processes.
8. ¡ Local Content as the Future of [Academic] Libraries?
  § At least in regards to Physical Collections?
  § Google Books, HathiTrust
  § To a large degree the material under the "Bell Curve" (journals, gov't docs, etc.) is already being "managed" outside of libraries
¡ The University is a collection of Niche Markets (John Meador, Jr.)
"The Long Tail" by Chris Anderson in Wired (October, 2004), his book: The Long Tail: Why the Future of Business is Selling More of Less. New York: Hyperion, 2006 and its Revised and Updated Edition, 2008.
9. ¡ Why Libraries?
¡ Libraries have been preserving information for centuries
  § Furthers the role of libraries to the digital world
  § Not a new idea, a new format
  § Majority of new material is published in digital format (Scholarly Articles, Campus newsletters, Course catalogs, Web sites…)
University of Al-Karaouine, Founded 859, Fes, Morocco
http://en.wikipedia.org/wiki/University_of_Al-Karaouine
12. ¡ Adhere to International Standards
  § "Librarians can take over the world." (Dr. Barry Smith) But we need to use tools that have been proven - not building new ontologies
¡ Capture the locally born digital objects that are replacing titles formerly found in our print archives
¡ Ensure Digital Curation & Preservation
¡ Provide Cross-Collection Search
¡ Demonstrate proof of concept before soliciting faculty research
13. ¡ Experimented with various "Digital Content" systems including Content Pro, CONTENTdm, DSpace, EPrints
¡ None of these have preservation "built-in"
¡ Building our own was not practical
  § Staffing levels
  § Lack of programmers
  § Mission creep?
  § Sustainability?
¡ Rosetta by Ex Libris
14. ¡ Scalable
¡ Expandable
¡ Flexible
¡ Accessible
¡ Standards-based
  § "Based on the Open Archival Information System (OAIS) model and conforming to trusted digital repository (TDR) requirements."
http://www.exlibrisgroup.com/category/RosettaOverview
15. Complete preservation solution allowing collection, archiving and preservation of digital materials of any type. Rosetta ensures data integrity and provides access over time to digital materials.
[Diagram labels: Operational Storage; Permanent Storage; Identify Risks; Evaluate Alternatives; Execute Preservation Actions; Migration Action]
http://www.exlibrisgroup.com/category/RosettaOverview
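Data integrity of the kind claimed on this slide is commonly backed by fixity checking: a checksum is recorded for each file at ingest and verified again later. The following is a minimal Python sketch of that general idea, not a description of how Rosetta implements it. The function names (`sha256_of`, `build_manifest`, `audit`) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root with a previously stored manifest."""
    current = build_manifest(root)
    return {
        # Recorded at ingest but no longer present.
        "missing": sorted(set(manifest) - set(current)),
        # Present, but the content no longer matches the recorded checksum.
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Present now but never recorded.
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

In use, the manifest from `build_manifest` would be stored separately from the files themselves, and `audit` would be run on a schedule. A non-empty "changed" list signals silent corruption, which a plain backup would copy forward unnoticed.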
16. ¡ No preservation system is useful if there is no access (especially for a University Library)
¡ Rosetta does not have a public discovery layer
¡ Rosetta's Digital Publishing System is flexible so there are options
¡ Primo for discovery
  § First University to use Primo with Rosetta
  § Works well with other library systems such as Aleph and Primo Central
  § One stop shopping
17. [Diagram labels: Licensed e-resources; Open Access; Commercial Digital Objects; Local Print; Local Digital]
18. Data Sets
Faculty Research
Special Collections/University Archives
Course Catalogs
University Photographs
Newsletters
Also need to be opportunistic (blood serum collection)
Images cc-by-nc-2.0: http://www.flickr.com/photos/bycp/
20. ¡ Systems: 1 person (~0.5 FTE)
  § Project Management
  § Systems/Technical
¡ Metadata/Cataloging: 3 people (~1.0 FTE)
¡ User Interface: Part of Web Services Librarian's Time
¡ Special Collections: Not directly involved with implementation, but relied on heavily for collection level expertise
21. ¡ Metadata Librarians are Project Managers
  § Decide on appropriate descriptive metadata fields
  § Create the metadata forms
  § Provide training
  § Develop and/or provide specialized terminology (such as LCSH, TGM, TGN)
  § Review submissions as appropriate
  § DO NOT typically create the metadata (student workers or other staff will create metadata)
22. ¡ In the preliminary planning stages
¡ Need to demonstrate we can do what we say
¡ Scholarly output
  § Articles, proceedings, etc.
  § Research data
  § Related material including grey literature, research notes, correspondence, etc.
Photo from http://anthro.binghamton.edu/BiomedWebsite/serum.shtml
23. Please characterize your research in terms of data intensity for your analysis run (n=91)
308 individuals who either had an externally sponsored project since 2009 or who had submitted a proposal during that time period were asked to take the survey. By June 15, 2011, 91 respondents had completed the survey. (Conducted by Jim Wolf, retired Director of Academic Computing)
Chart categories:
§ Normal (working data set up to 100 Megabytes)
§ Heavy (working data set up to 1 Terabyte)
§ Very large (working data set up to 1000 Terabytes)
§ Extreme (working data set over 1000 Terabytes)
Chart values: 2.4%, 24.4%, 73.2%
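The survey figures on this slide (308 individuals invited, 91 completed responses) imply a response rate that can be checked with a short calculation. The following Python sketch does only that arithmetic. The function name `response_rate` is illustrative and not taken from the presentation.

```python
def response_rate(completed: int, invited: int) -> float:
    """Return the fraction of invited individuals who completed the survey."""
    if invited <= 0:
        raise ValueError("invited must be a positive number")
    return completed / invited


# Figures from the slide: 91 completed responses out of 308 invitations.
rate = response_rate(completed=91, invited=308)
print(f"Response rate: {rate:.1%}")  # prints "Response rate: 29.5%"
```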
24. [Bar chart] Please identify where you store data generated or gathered for your project
Categories: desktop or laptop computer in office or lab; on instrument in lab; research group server storage; departmental server storage; ITS storage; external network storage
Vertical axis: 0 to 80
25. [Bar chart]
Categories: Local research group server; ITS storage; Library archive; Disciplinary repository (e.g., ICPSR)
Legend: forever; 3-7 years; < 3 years
Vertical axis: 0 to 60
26. [Bar chart]
Categories: Local research group server; ITS storage; Library archive; Disciplinary repository (e.g., ICPSR)
Legend: private; proprietary; openly available to all; access granted to individuals
Vertical axis: 0 to 50
28. ¡ Bring everyone on board
¡ Set priorities
¡ Review metadata and digital objects often (at least in the beginning)
¡ Metadata may contain confidential and/or legally protected information
  § Will metadata librarians need human subjects/IRB approval?
  § Need for separate discovery mechanisms
29. ¡ Enlist subject librarians to help make connections
  § A few subject librarians have identified some possible data needing preservation and are going to meet with faculty for preliminary discussions
¡ Work with faculty on data management plans
  § Many granting agencies such as NSF are requiring data management plans
  § Get involved early
  § Assist with submission requirements for research
30. ¡ Provide preservation; offer dissemination
  § Don't confuse preservation with open access
  § Faculty don't always want to, or cannot, make data open
    ▪ Dark archive if desired
  § Do not need to replace or replicate current data dissemination methods (unless researchers desire)
¡ Not all research data is "Big Data"
  § Don't let the challenges of "Big Data" scare you away from all data.
Photo: http://siliconangle.com/files/2011/07/Big-Data.jpg
31. Rosetta at Binghamton University Libraries | 14 November 2011
Reference: Digital Preservation Coalition (2009). Digital Preservation Handbook. http://www.dpconline.org/advice/preservationhandbook/introduction/definitions-and-concepts