All the data in one single place

Download all the CORE data in a single package.


Matching your needs

Prototype, analyse and process your data directly on your infrastructure.


Largest full text collection

World's largest full text collection of scientific papers for machine processing.


Simple to use

Accessible and easy to understand documentation and processes.

Download millions of research outputs for text and data analysis and process them directly on your own infrastructure.

CORE Dataset


CORE data can be downloaded as a bulk dataset, allowing you to process it on your own computer or within your infrastructure. The dataset provides a harmonised and enriched data format for accessing content from across our [data providers](/data-providers). This makes it well suited to prototyping new methods, especially when intensive data processing needs to be run. It is also a good choice for data analysis and text mining.
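As a rough illustration of local processing, the sketch below streams records from an unpacked file and counts how many carry full text. It assumes, purely for illustration, that the data has been extracted into a JSON Lines file (one JSON object per line) and that full text lives in a field named `fullText`; both the file layout and the field name are hypothetical here, so check the dataset documentation for the actual structure.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_records(path: Path) -> Iterator[dict]:
    """Yield one record per non-empty line of a JSON Lines file.

    Assumption: one JSON object per line. The real dump layout may differ.
    """
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_with_full_text(path: Path, field: str = "fullText") -> tuple[int, int]:
    """Return (total records, records whose full-text field is non-empty).

    Assumption: `field` is a hypothetical name for the full-text field.
    """
    total = 0
    with_text = 0
    for record in iter_records(path):
        total += 1
        if record.get(field):
            with_text += 1
    return total, with_text
```

Reading line by line keeps memory use flat regardless of file size, which matters when the corpus is too large to load at once.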


How it works


Older dumps of the CORE Dataset are free and ODC-By licensed. Organisations that register for the CORE Dataset can purchase a licence for the more recent datasets. Sustaining Members receive access to the most recent datasets as a free member benefit.


How often is the CORE dataset updated?

Enter your email address to register for our datasets or access the download page if you have already registered. Please enter your institutional email if you are registering in an institutional capacity.

Register for the CORE Dataset

Brunel University Research Archive

ChesterRep

Computer Laboratory Technical Reports - Cambridge University

Cronfa at Swansea University

Glasgow Theses Service

Language Box

NERC Open Research Archive

Open Access Institutional Repository at Robert Gordon University

Repository@Napier

Royal Holloway Research Online

SAS-SPACE

Surrey Research Insight

University of Wales Trinity Saint David

UAL Research Online

University of Birmingham Research Archive, E-papers Repository

University of Dundee Online Publications

University of Huddersfield Repository

University of Liverpool Repository

University of Salford Institutional Repository

arXiv.org e-Print Archive

CiteSeerX

Edge Hill University Research Information Repository

ResearchOnline@GCU

BEACON eSPACE

Kansas State Publications Archival Collection

Texas ScholarWorks

Digital Library for Earth System Education

DSpace at The University of Washington

Ilithia

St George's Online Research Archive

SMU Digital Repository

Computer Science Technical Reports @Virginia Tech

DigitalCommons@Robert W. Woodruff Library

Ohio Digital Resource Commons - Marietta College

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

University of Toledo Open Institutional Archive

Access to Research at National University of Ireland, Galway

ACQUIRE

Amsterdam University Press Publications

UEF Electronic Publications

Digitale Hochschulschriften der LMU

Lumbung Pustaka UNY (UNY Repository)

AMS Acta

Concordia University Research Repository

Research Repository

Tver State University Repository

PORTO Publications Open Repository TOrino

University of Chichester EPrints Repository

UKM Journal Article Repository

Hedatuz

ResearchOnline@JCU

NUI Maynooth Eprint Archive

Constellation

Epsilon Open Archive

University of Zagreb Medical School Repository

Memorial University Research Repository

Caltech Theses and Dissertations

Universitas Ahmad Dahlan Repository

Unitn-eprints Research

UnipiEprints

Policy Documentation Center

Repository@USM

ICRISAT Open Access Repository

KFUPM ePrints

Universiti Putra Malaysia Institutional Repository

UPN Jatim Repository

Цифровий архів Острозької академії (Digital Repository of Ostroh Academy)

The Aphasiology Archive

E-Prints archive

Institut National Polytechnique de Toulouse (Theses)

Sebelas Maret Institutional Repository

Julkari

Escuela Superior Politecnica del Litoral

Bibliothèque numérique de l'enssib

ResearchSpace@Auckland

Universität Stuttgart, Fakultät 5, Germany, Computer Science Archive

Chinese Culture University Institutional Repository(中國文化大學機構典藏)

Unisa Institutional Repository

National Taiwan Normal University Repository

TU Delft Repository

CemOA

RiuNet

Helsingin yliopiston digitaalinen arkisto

University of Queensland eSpace

USU Repository

UCLA - Biblioteca de Administración y Contaduría

Repositorio Institucional Universidad de Granada

Pandektis

Tilburg University Repository

UT Repository

Repositório Institucional da Universidade de Brasília

Serveur académique lausannois

University of Johannesburg Institutional Repository

Oxford Text Archive

DELFOS Repositorio International

King's Research Portal

Directory of Open Access Journals

S O C R A T E S

Acropolis Educational Resources Repository

ANSTO Publications Online



“To build the product we have always envisioned, having a robust and comprehensive dataset of machine-readable, peer-reviewed papers is absolutely essential. We are incredibly grateful to be able to partner with an organization like CORE that not only can meet our data needs, but also shares our vision of making science more accessible and consumable. This unique combination of best-in-class data-offering and mission-alignment makes CORE an ideal partner for Consensus.”


What’s included

The Dataset provides you with:

* The entire CORE corpus of both metadata and full texts in a machine-processable format.
* Detailed [documentation](https://core.ac.uk/documentation/dataset) on how to download the CORE dataset and how the data is organised.
* Access to a very large corpus of full-text research documents, well suited to training machine learning models, NLP and text mining.
* Unique content from the network of open repositories, in addition to research papers with a registered DOI.