www.nature.com/scientificdata

OPEN DATA DESCRIPTOR

DOCU-CLIM: A global documentary climate dataset for climate reconstructions

Angela-Maria Burgdorf et al.

Documentary climate data describe evidence of past climate arising from predominantly written historical documents such as diaries, chronicles, newspapers, or logbooks. Over the past decades, historians and climatologists have generated numerous document-based time series of local and regional climates. However, a global dataset of documentary climate time series has never been compiled, and documentary data are rarely used in large-scale climate reconstructions. Here, we present the first global multi-variable collection of documentary climate records. The dataset DOCU-CLIM comprises 621 time series (both published and hitherto unpublished) providing information on historical variations in temperature, precipitation, and wind regime. The series are evaluated by formulating proxy forward models (i.e., predicting the documentary observations from climate fields) in an overlapping period. Results show strong correlations, particularly for the temperature-sensitive series. Correlations are somewhat lower for precipitation-sensitive series. Overall, we ascribe considerable potential to documentary records as climate data, especially in regions and seasons not well represented by early instrumental data and palaeoclimate proxies.

Background & Summary

Information on past climates has played an essential role in climate science1. While historically the primary research focus has been on reconstructing past annual temperature, the questions raised nowadays with the help of palaeoclimatological data are multifaceted, including changes in the water cycle, the occurrence of weather and climate extreme events, and atmospheric dynamics. This, in turn, is a challenge for producing palaeoclimatic datasets.
New approaches, such as off-line palaeodata assimilation2–5, provide past climate fields at increasing spatial and temporal resolution. However, all reconstructions essentially depend on sufficient high-quality data inputs. Current climate field reconstructions are largely based on relatively high-resolution proxies measured in natural archives such as tree rings, corals, speleothems, bivalves, sediments, or ice cores. Extensive compilations of such proxies exist6,7. In particular, tree rings are widely used, partly because of their extensive spatial distribution across the globe. Unlike most other natural proxies, tree ring proxies have an annual resolution. However, their climate signal is mainly limited to the growing season (although there are also winter reconstructions8). Documentary proxies, i.e., climate data originating from historical documents, could provide an essential contribution since they potentially cover combinations of seasons and regions (e.g., winter in East Asia) that are otherwise not well represented by natural proxies. Furthermore, documentary data are often calendar dated, and some have a very high temporal resolution. Despite these advantages, they are largely overlooked and only marginally used in large-scale climate reconstructions, since they are not readily available in digital format in the main compilations used by climate scientists, or their quality is not well known (note that classification of events is often based on effects, which requires local context information). The PAGES 2k multiproxy database9, for instance, only includes 15 documentary proxy series. However, historians have compiled documentary climate information in databases such as EURO-CLIMHIST for many years10 (note that qualitative weather descriptions are also found in databases of early instrumental meteorological data, e.g., Rodrigo11).
In recent years, a major international effort has been made to promote the use of the archives of societies in climate reconstructions. The PAGES CRIAS working group (Climate Reconstruction and Impacts from the Archives of Societies) was founded in 2018 and is working towards that goal. The Palgrave Handbook of Climate History12 provided a first global overview of documentary climate data organized in regional chapters. Based on this and many other sources, Burgdorf13 recently inventoried documentary climate series through a search of the literature and of existing databases. The inventory contains 688 entries; not all are publicly available, and some have not yet been digitized. Here, we publish a subset of the data inventoried in Burgdorf13, termed DOCU-CLIM. The dataset contributes to a global monthly palaeoclimate reanalysis that starts in 1420 and is based on assimilating monthly-to-seasonal proxies, documentary data and instrumental data into an ensemble of atmospheric model simulations using an offline Kalman filter approach similar to Valler et al.5. For that reason, here we focus on series that provide information in the window 1400–1880 CE at monthly to annual resolution.

Fig. 1 Flow chart depicting the generation of the documentary dataset. Elements shown: data rescue, data compilation, metadata from the inventory, database of raw documentary data, consolidated and formatted dataset DOCU-CLIM, gridded or station temperature, gridded precipitation, forward models, and assessment (correlation, p-value, residual variance).

A full list of authors and their affiliations appears at the end of the paper.

SCIENTIFIC DATA | (2023) 10:402 | https://doi.org/10.1038/s41597-023-02303-y
In this paper we present the dataset (see Supplementary File DOCU-CLIM_Inventory.txt for an overview of all records) and evaluate its usefulness for climate reconstruction using proxy forward models (statistical models that predict the documentary series from climate data rather than vice versa).

Methods

Compilation and data rescue. Over the past decades, climate historians and historical climatologists have produced numerous datasets in which documentary data have been translated into quantitative climate information. However, as their focus is commonly regional or local, these data are often not submitted to global data repositories such as the NOAA World Data Service for Palaeoclimatology database (https://www.ncei.noaa.gov/access/paleo-search/) but are instead published on project or personal websites, or, unfortunately still very often, not published at all. Even if documentary records are incorporated into databases, they may not always be organized in a manner suitable for climate scientists, particularly when working with time series. In this work, we focus exclusively on quantitative document-based time series data, representing a small, albeit underexploited, subset of the body of documentary climate data. Figure 1 illustrates the general workflow followed in this project. The compilation of an inventory of documentary climate series was described in a previous paper13, which lists 688 records. While the latter paper described the metadata, in this paper we compiled the actual data. As detailed in Burgdorf13, the inventory was based on a search of 14 existing databases (Table 1), contributing about 25% of the entries in the inventory, as well as extensive literature research, contributing the rest. The catalogued data followed a set of criteria, some of which were dictated by our intended use. For instance, we only inventoried material overlapping the period 1400–1880 CE, with a minimum record length of 30 years of which 20 must be before 1880.
These criteria were set as we used the data in a data assimilation project starting in 1420 CE and in which instrumental data (which become more frequent after 1880 CE) were also assimilated. The variables of interest were temperature, precipitation, and wind (e.g., onset of seasonal wind regime) and hence only records were compiled that provide information on one of these variables (note that some of the series also depend on further variables). As detailed in Burgdorf13, the focus was predominantly on English-language literature that was accessible electronically and in which the authors state that the series contains information on one of the three variables (publications about the Mediterranean area and Central Europe in other languages exist and may contain additional series). Except for phenological data, we used only secondary material to ensure the inclusion of expert source interpretation. This includes derived indices, generally accepted to quantify descriptive and qualitative documentary data14 or even reconstructed time series in physical units. The next step was to compile the actual data. Not all inventoried series are available in electronic form, and some are subject to a restrictive data policy. 
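The length criteria stated above (overlap with 1400–1880 CE, at least 30 years in total, at least 20 of them before 1880) can be expressed as a small filter. The following Python sketch is only an illustration and not the authors' code: the function name and arguments are hypothetical, and it assumes that "record length" means the number of years with a value rather than the span between first and last year.

```python
def meets_inventory_criteria(years, window=(1400, 1880), min_length=30, min_before=20):
    """Check the inventory length criteria for one record.

    `years` is an iterable of calendar years (CE) in which the record has a value.
    Assumption: record length is counted as the number of years with data.
    """
    years = set(years)
    # The record must overlap the target window 1400-1880 CE.
    overlaps_window = any(window[0] <= y <= window[1] for y in years)
    # At least `min_before` of the years must lie before the end of the window.
    n_before = sum(1 for y in years if y < window[1])
    return overlaps_window and len(years) >= min_length and n_before >= min_before


# A record covering 1850-1899 has 50 years, 30 of them before 1880.
print(meets_inventory_criteria(range(1850, 1900)))
# A record covering 1870-1899 has 30 years, but only 10 before 1880.
print(meets_inventory_criteria(range(1870, 1900)))
```

Counting years with data rather than the first-to-last span is the stricter reading for records with gaps; the other interpretation would only require changing how the length is computed.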
We downloaded the series from 14 databases (Table 1) and contacted many authors directly in cases when a dataset was not available in a repository. However, we only compiled data series that are open access and allow us to redistribute the data under a CC-BY license. In addition to compiling existing documentary data series, we also rescued a significant amount of data (this includes some series we recently presented in another study15). This concerns 137 ice phenology series and 5 precipitation series (Fig. 2). Note that some of the rescued data might be available electronically but we did not find them. The single most important source was a compilation of freezing and thawing dates of Russian rivers by Rykachev16 (see example in Fig. 3) and a follow-up compilation by Shostakovich17. A few series were measured from graphs published in the 1970s where the underlying data were unavailable electronically. Many of the datasets digitized in the 1970s and even 1980s have not made it into the era of electronic publishing and open data policies (a list of rescued series including the sources is given in Supplementary file DOCU-CLIM_Rescued.txt).

Table 1. Overview of available global and national repositories and databases containing documentary evidence. Nall is the total number of series or databases available on the platform, and Ndocu is the number of those series based on documentary evidence and available prior to 1880 CE (from Burgdorf13). Note that there may be overlap between the repositories. All URLs were last accessed on 30 May 2022.

Name of repository or database | Abbreviation | Region | Nall | Ndocu | Reference | URL
PAGES2k Global 2,000 Year Multiproxy Database | PAGES2k | Global | 692 | 15 | Emile-Geay et al.6 | https://www.ncei.noaa.gov/access/paleo-search/study/21171
NOAA/World Data Service for Paleoclimatology archives | NOAA Paleo | Global | >10000 | 61 | | https://www.ncei.noaa.gov/access/paleo-search/
Euro-Climhist | Euro-Climhist | Switzerland/central Europe | 65 | 27 | Pfister et al.10 | https://www.euroclimhist.unibe.ch/en/
Tambora.org | Tambora.org | Germany | 4 | 4 | Riemann et al.133 | http://www.tambora.org
National Snow and Ice Database: Global Lake and River Ice Phenology | NSIDC | Northern Hemisphere | 865 | 39 | Benson et al.25, updated 2020 | https://nsidc.org/data/g01377/versions/1
Japan Climate Data Project | JCDP | Japan | 14 | 3 | | https://jcdp.jp
Climatological Database for the World’s Oceans | CLIWOC | Global | 1624 | | García-Herrera et al.134 | https://www.historicalclimatology.com/cliwoc.html
Institute for Ocean Technology Ice Database | Ice Data | Canada | 4 | 4 | | http://www.icedata.ca
KNMI Climate Explorer | Climate Explorer | Global | >200 | ~10 | | https://climexp.knmi.nl/start.cgi?id=someone@somewhere
Red Española de Reconstrucción Climática a Partir de Fuentes Documentales | RECLIDO | Spain | 7 | 7 | | http://stream-ucm.es/RECLIDO/es/home-es.htm
Salvá Sinobas | Salvá Sinobas | Iberian Peninsula | 18 | 5 | | http://salva-sinobas.uvigo.es/index.php
Variabilidad y Reconstrucción del Clima | Vareclim | Global | 5 | 5 | | https://www.upo.es/vareclim/index.php
Reconstructed East Asian Climate Historical Encoded Series | REACHES | China | 1 | 1 | Wang et al.26 | https://www.ncdc.noaa.gov/paleosearch/study/23410
Tracking Extremes of Meteorological Phenomena Experienced in Space and Time | TEMPEST | United Kingdom | 5 | 5 | Veale et al.135 | https://www.nottingham.ac.uk/research/groups/weather-extremes/research/tempest-database.aspx
Fig. 2 Map of rescued series. Legend: precipitation and ice phenology series; seasons annual, Dec-Feb, Mar-May, Jun-Aug.

Fig. 3 Extract from Rykachev16 showing the dates of freezing and thawing of the Neva River in St. Petersburg/Leningrad, Russia, from 1706 to 1869 CE.

Evaluation. To identify any climate signal contained in documentary climate data, we formulated forward models18 based on instrumental global monthly climate fields. These models not only serve for evaluation in this paper but are directly relevant for climate reconstruction approaches including data assimilation. We used temperature fields from BEST19 and HadCRUT520, and precipitation fields from GPCC21 to extract the time series from the closest grid point to each documentary site. The number of overlapping years between proxy and climate series had to exceed 20 (for all African wetness/dryness indices, as an exception, we accepted 10 overlapping years as otherwise no evaluation would have been possible on the entire dataset). The starting date was usually dictated by the start of the reference dataset, the end date by the end of the documentary dataset (see exceptions below), but never later than 1950 CE in order to avoid calibrating a forward model in a period in which climate or environment are no longer comparable with earlier periods. The forward model took the form of a multiple regression model (see Fig. 4), in which a documentary series was expressed as a linear combination of monthly series of the corresponding driving variable (either temperature or precipitation). If the season or month was specified in the source (e.g., monthly, seasonal, or annual indices), these months were used. If this information was unavailable, we used annual mean values. In the case of events that were indicated as a specific date (e.g., phenological data), we also included lagged predictor variables (i.e., temperature from one or several previous calendar months). The window to be included was determined in a backward selection approach. In this case, the models initially included 6 months prior to the event in question (defined as the 90th percentile), such that an entire growing season could possibly be covered. Then a selection was carried out, retaining only months that were significant at p < 0.1. Insignificant months between two significant months were also retained. If no significant months were identified, no model was calculated. For the phenological series covered in Reichen et al.15, we made use of the more detailed information available. For instance, strongly skewed variables were transformed logarithmically, and we used the reference dataset and reference period given in the paper. The procedure is sketched in Fig. 4. We then fitted regression models with a least-squares estimator. As a measure of goodness of fit, we used the correlation between the observed and modelled documentary series, along with the p-value. The following information on the evaluation is indicated in the example data file (Fig. 5): the reference period used (1829–1879 CE in this example case), the reference dataset (BEST), the model (monthly mean temperature of March and April; any transformation of variables would be indicated here), correlation, p-value, and error variance of the residuals. For some of the ice phenological records, we have also digitized nearby temperature records as the existing global databases did not have any data in close vicinity. These new data have been published in Reichen et al.15 and Lundstad et al.22 (https://doi.org/10.1594/PANGAEA.940724), and in these cases “REFDATA” is denoted by the label “station”.
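The month-selection rule and the goodness-of-fit measures described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R implementation: the function names are hypothetical, the per-month p-values are taken as given (the regression that produces them is not shown), the selection is done in a single pass rather than by iterative refitting, and the residual variance uses an n − 1 denominator, which the paper does not specify.

```python
import math


def select_month_window(p_values, alpha=0.1):
    """Return indices of the candidate months kept in the forward model.

    `p_values[i]` is the p-value of candidate month i (chronological order).
    Months significant at p < alpha are kept, together with any insignificant
    months lying between two significant ones. An empty list means that no
    model is calculated.
    """
    significant = [i for i, p in enumerate(p_values) if p < alpha]
    if not significant:
        return []
    return list(range(significant[0], significant[-1] + 1))


def fit_diagnostics(observed, modelled):
    """Pearson correlation and residual variance of a fitted forward model."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_m = sum(modelled) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(observed, modelled))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_m = sum((m - mean_m) ** 2 for m in modelled)
    r = cov / math.sqrt(ss_o * ss_m)
    residuals = [o - m for o, m in zip(observed, modelled)]
    mean_res = sum(residuals) / n
    # Assumption: sample variance with n - 1 in the denominator.
    res_var = sum((e - mean_res) ** 2 for e in residuals) / (n - 1)
    return r, res_var


# Seven candidate months; months 1 and 3 are significant, month 2 lies between them.
print(select_month_window([0.5, 0.04, 0.3, 0.08, 0.6, 0.7, 0.9]))
```

Keeping the window contiguous means the model always covers an uninterrupted stretch of months, which matches the idea that a phenological date integrates conditions over a continuous period before the event.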
Fig. 4 Schematic figure of forward modeling approach.

Fig. 5 Data format with 27 columns using the example of the first line of series 1213 (transposed for clearer visualization).

It should be noted that, first, the residual error is not a measure of the error of the documentary proxy, but of the difference between the actual observation and the forward modelled observation (regression error). Consequently, it also contains the inherent error in the instrumental climate data and in the interpolation (this is the error required for data assimilation approaches). Second, this evaluation measures the error only in recent times when instrumental climate data are available. As a result, the quality of the documentary data in the earlier period may differ23. For all proxies that do not have a sufficiently long (see thresholds above) overlap with instrumental climate fields or nearby station records, an independent evaluation was not attempted here. This is indicated with an “NA” throughout the evaluation section of the data file. Other methods of evaluation are possible, but this requires more knowledge and hence we refer to the original publications. It would be possible to compare these cases with reconstructions such as EKF400v2 (ref. 5), a global, monthly three-dimensional climate reconstruction covering 1600–2003 CE. EKF400v2 is based on an off-line assimilation approach of proxy data (e.g., tree-ring width, maximum late wood density), documentary data, and early instrumental data into an ensemble of atmospheric model simulations. However, in many cases the documentary data were assimilated in EKF400v2 (and hence datasets are not independent), while in cases where no information is locally available, EKF400v2 basically represents a model simulation, so no strong correlation is expected. Accordingly, we use EKF400v2 only in Sect.
4 for a case study. Some documentary indices continue into the instrumental era as the authors have complemented them with degraded instrumental data or have used instrumental data in addition to documentary data. These data may then not be independent of instrumental data. These values were not removed from DOCU-CLIM. However, in the evaluation conducted in this study, the calibration period in such cases is limited to years before 1900 CE. Where available, information on whether a value was from a documentary source or from degraded observations was added to the “META” column.

Fig. 6 Map of all documentary data categorized according to (top left) the variable, (top right) the start year, (bottom left) the season covered, and (bottom right) the length.

Fig. 7 Number of values in the DOCU-CLIM dataset as a function of year and type of proxy. Axes: year (1400 to 2000) against number of values per year; categories: precipitation, temperature, dryness/wetness, ice phenology, plant phenology, wind.

Fig. 8 Time series of the freeze-up date of three Russian rivers (Neva, St. Petersburg; Ob, Barnaul; Volga, Saratov), given as day of year, with one outlier marked. Note the reverse scale on the y-axis.

Data Records

The DOCU-CLIM dataset can be downloaded from the BORIS repository (https://boris-portal.unibe.ch/handle/20.500.12422/207)24. The dataset comprises 621 files (note that a monthly index series is split into 12 files), totaling more than 100,000 values (Fig. 6). Information on all series, including links to the original holding, is given in the readme-file of the dataset24. The references of the original series are included in this paper (refs. 16,17,25–129). The files are in ASCII format with 27 columns and a variable number of lines. The files are structured in a way that allows for straightforward inclusion into data assimilation schemes (see Fig. 5). In each file, one line covers one year. As a consequence, monthly data are stored in 12 files, one for each calendar month. The first seven columns contain information about the series, version number, and geographical location. These are identical for each line in the file. Then come the year, and the month (only given if the record resolution is monthly or seasonal). Where documentary information refers to a specific time (e.g., date of freezing), the month is set to NA. The column “STATISTIC” indicates whether the observation is a state (such as a date of freezing), or a mean value (e.g., a seasonal mean index, in which case the indicated month gives the last month of the averaged interval and the column “WINDOW” the number of months averaged). The column “BOREALSEASON” indicates the closest match to a season: winter (Dec-Feb), spring (Mar-May), summer (Jun-Aug), autumn (Sep-Nov), or annual (for ice freezing and thawing series we additionally used “earlywinter” and “latespring”). The next columns indicate variable name, unit, and type (for all series in this paper, the type is “DOCU”), then follows the column “VALUE” that contains the actual time series values. The next seven columns refer to fitting metrics of the forward model, as described in Sect. 2.2. Finally, the last three columns provide metadata such as a reference, the ID of the corresponding series in Burgdorf13, and a column “META” that contains further information (for several entries, they are separated by “|”).
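As an illustration of how the month and “WINDOW” columns combine for mean values, the following Python sketch lists the calendar months covered by an averaged interval and splits a “META” entry at the “|” separator. It is a hedged example rather than part of the dataset tooling: the function names, the argument names and the sample META string are hypothetical, and only the conventions stated in the text (last month of the interval, number of months averaged, “|” as separator) are taken from the paper.

```python
def months_in_window(last_month, window):
    """Calendar months (1-12) covered by a mean value, in chronological order.

    `last_month` is the last month of the averaged interval and `window`
    the number of months averaged; intervals may wrap around the year end.
    """
    if not 1 <= last_month <= 12:
        raise ValueError("last_month must be between 1 and 12")
    if window < 1:
        raise ValueError("window must be at least 1")
    return [(last_month - k - 1) % 12 + 1 for k in range(window - 1, -1, -1)]


def split_meta(meta):
    """Split a META entry into its parts, assuming '|' separates the entries."""
    return [part.strip() for part in meta.split("|")]


# A boreal winter mean ending in February and averaged over 3 months.
print(months_in_window(2, 3))
# Hypothetical META content with two entries.
print(split_meta("original value: 1835-11-02 | source: documentary"))
```

A winter mean therefore spans two calendar years; since each file line covers one year, the year on that line would presumably be the year of the last month, which users may wish to verify against the file description.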
In all cases the META column provides the original value (which is often the same as the value itself). Figure 5 provides an example where the freezing date is given in yr-mon-day. For further metadata on the series and collections, the reader is referred to the inventory by Burgdorf13. The file structure is further explained in the supplementary file “DOCU_CLIM_File_Description.txt”. All file names start with “DOCU_CLIM” followed by the ID as six-digit integer, followed by the version number “V1.0” and the name of the variable; all of these elements are separated by an underscore character “_”. As an example, the file shown in Fig. 5 is named: “DOCU_CLIM_001213_V1.0_Ice_phenology_Slobodskoy_Vyatka__River_break-up.txt”. R-code to read the files is given in the Supplementary material.

Most of the document-based climate records are from Europe, which is partly due to our selection criteria. However, there are also records from Asia and North America. Data for Africa mostly concern precipitation27. We only have a few documentary records from South America28,29 and only two from Australia30,31. One of the advantages of documentary proxy data is that they encompass all seasons (bottom left panel in Fig. 6). For example, plant phenology reflects temperature in spring and summer (sometimes autumn). Ice phenology reflects conditions from late autumn to spring as ice-on dates are associated with autumn and winter air temperatures and ice-off dates are highly correlated to winter and spring air temperatures. Many of the indices such as temperature and precipitation indices for various regions of Europe are seasonal or even monthly. Some documentary data such as wetness/dryness in Africa indicate annual conditions (interpreted as yearly means)27. The earliest records that extend back to the 15th century or further are mainly from Europe, China and Japan (top right panel in Fig. 6).
Some records from South America begin as early as the 16th century, and those from North America date back to the 18th century, while the earliest African records start around 1800. Many of the ice phenological data date back to the 1800s, but there are also longer records such as that of Lake Suwa, Japan, beginning in 1443. The oldest records are typically the longest (right panels in Fig. 6), as many continue to the start of instrumental observations or even beyond. Many records start after 1500 CE (Fig. 7) and the maximum coverage is in the late 19th century (note that from 1880 onward, no new records were added, although many phenological series start later). Temperature and precipitation indices are the most frequent record type, and wind indices31–34 the least. However, the numbers for different types vary in time (Fig. 7). During the 19th century, when many weather stations were already measuring temperature in Europe, the number of documentary temperature series decreases and ice phenological records dominate. As an example of series in DOCU-CLIM, Fig. 8 shows three time series of freeze-up dates of Russian rivers. Ice phenology is the dominant type of documentary data in DOCU-CLIM during most of the 19th century. While the Neva series is continuous, the one from the Ob in Barnaul has a long gap. The series from the Volga in Saratov is shorter and contains an outlier (freeze-up date: 14 Feb). Note that outliers were not filtered out for the following evaluation.

Technical Validation

For many of the time series, a technical validation was performed by the original authors. These evaluations include arguably the most specific expert judgement, combining local knowledge on both the historical sources and the local-to-regional climate characteristics.
Readers are encouraged to consult the original publication for specific details (references are indicated in the inventory file as well as in the data files; a link is also given to the inventory by Burgdorf13 where further information is available).

Fig. 9 Map of Pearson correlation coefficients between documentary data and forward modelled data (top), p-values (bottom) and histogram showing the distribution of the correlation coefficients. Grey dotted circles indicate series where no evaluation was possible. Histogram axes: correlation coefficient against number of series.

Fig. 10 Standardized climatic anomalies in 1835 (with respect to the period 1841–1870) for (top) documentary data and (bottom) EKF400v2. Variables include temperature in (left) spring to summer and (middle) autumn to early winter and (right) annual precipitation.

Here we report the results of our independent validation, as described in Section 2. In Fig. 9 we show correlations for forward models that we calibrated in gridded instrument-based datasets. All records that have no overlap with observations (and thus evaluation was not possible) are denoted by grey dotted circles. We found robust and highly significant correlations across Europe, North America and Asia. Many of them are related to derived indices, plant phenology, or ice phenology. Somewhat lower but still highly significant correlations are found over South America. Spatially-varying correlations are found over Africa, where most of the series were evaluated based on only 10 years of overlap.
Moreover, the precipitation dataset21, which the documentary data were compared with, may have substantial errors in these pre-1900 years. However, several significant correlations can be found in locations including southern Africa and Australia. The distribution of the correlations (Fig. 9) shows that an overwhelming number of series exhibit correlations above 0.5 and the peak of the distribution occurs at correlations between 0.7 and 0.9, which is higher than the correlations found for forward modelling of tree rings7. The highest correlations are observed for temperature indices, which are, however, often not fully independent from instrumental data in the overlapping period. Very high correlations are, however, also found for ice phenological data and grape harvest dates. The evaluation of the 421 records with models demonstrates that many documentary series have significant potential for quantitative applications. However, the series that were not evaluated (due to the absence of instrumental series in the vicinity, or total lack of overlap) will require further investigation and consultation of the literature before incorporation into a climate reconstruction.

Usage Notes

The DOCU-CLIM dataset24 provides climate information from documentary data with the main aim of facilitating climate reconstructions. The dataset contains information on the correlation with corresponding forward models. This information should be carefully considered before using the data. Although climate reconstruction is the primary aim, the data can also be used in the form of individual time series. To demonstrate the potential of this new documentary dataset for quantitative climate analysis, we present a case study for the year 1835. In January 1835, the volcano Cosigüina in Nicaragua erupted and released massive amounts of sulfuric aerosols into the atmosphere.
It is considered one of the largest historical volcanic eruptions in the Americas and led to widespread environmental impacts130. We investigated temperature-related series, and precipitation or wetness/dryness-related time series, for the year 1835 CE. For temperature, we differentiated two extended seasons: boreal spring to summer (March to July), and autumn to early winter (August to December), as this division fits best with the material contained in the documentary data (thawing dates and spring/summer phenology, freezing dates and autumn phenology). Monthly series were averaged to these seasons. For precipitation and wetness/dryness indices, we considered annual indices or means of all seasons. As a reference we chose the period 1841–1870 CE (retaining only series with more than 20 years of data in this period) and standardized the series with respect to this reference. Finally, the sign of series was adjusted such that, for temperature, positive indicates warmer conditions (e.g., spring flowering or thawing dates were multiplied by −1 as earlier dates indicate warming; the sign of freezing dates was kept as early freezing indicates low temperatures). Series that were already assimilated in EKF400v2 were excluded from this analysis. We then compared these anomalies to the EKF400v2 reanalysis where we performed the same procedure with global monthly fields. Seasonal and annual averages were calculated, and the fields for 1835 were presented as standardized anomalies from the 1841–1870 base period. Figure 10 shows the standardized anomalies of the documentary proxies (top row) and EKF400v2 (bottom row) for the year 1835 CE. The two sets of data are entirely independent, and they are plotted on the same scale. EKF400v2 shows a general cooling that is arguably related to the volcanic eruption. However, not all regions cooled in all seasons. 
In boreal spring and summer, Europe and northern Eurasia have standardized anomalies around zero, and in some regions, temperature anomalies are positive. Although EKF400v2 assimilates no ice phenology from Siberian rivers except for the Angara in Irkutsk, these regions also showed neutral or slightly warm conditions. During autumn and early winter, the documentary data (particularly the early freezing of rivers) suggest a general cooling across the northern mid-latitudes. This coldness corresponds well with the EKF400v2 anomaly fields. Finally, precipitation in EKF400v2 indicates drying in most parts of Africa and wetting around the Mediterranean. This pattern is also observed in most of the documentary data in Africa (none of which were assimilated into EKF400v2). The large-scale cooling and the drying of areas influenced by the African monsoon agree with the expected effects of a tropical volcanic eruption131. Overall, our analysis shows that our documentary dataset (DOCU-CLIM)24 can capture spatial climate variability associated with the prominent volcanic eruption of 1835. The DOCU-CLIM dataset24 can be used for climate reconstruction, particularly for data assimilation, which can make full use of the data and the metadata provided on the forward modeling. Some documentary time series could not be validated and should be further analyzed. DOCU-CLIM is a global dataset24 and can now be combined with other multi-proxy compilations such as the PAGES 2k6 datasets, or instrumental datasets such as H-CLIM22, to generate new climate reconstructions. Care should be taken when evaluating the series for trends. We have not analyzed the suitability of the records for trend analyses and advise testing this further before using the dataset for this purpose. The answer may well depend on the proxy type considered (phenological data, thermal index, etc.). Possible future updates of the DOCU-CLIM dataset may offer the data in a range of other existing data formats132. 
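The standardization used in the 1835 case study above (reference period 1841–1870 CE, more than 20 years of data required, sign adjusted so that positive temperature anomalies mean warmer conditions) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and arguments are hypothetical, the series is assumed to be a mapping from year to value, and the sample standard deviation (n − 1 denominator) is an assumption, since the paper does not state which estimator was used.

```python
import math


def standardized_anomaly(series, target_year=1835, ref=(1841, 1870), sign=1):
    """Standardized anomaly of `target_year` relative to the reference period.

    `series` maps year -> value. Returns None if the target year is missing,
    if the reference period holds 20 or fewer values, or if it has no variance.
    Use sign=-1 for series where earlier dates indicate warmer conditions
    (e.g., spring flowering or thawing dates).
    """
    ref_values = [v for y, v in series.items() if ref[0] <= y <= ref[1]]
    if target_year not in series or len(ref_values) <= 20:
        return None
    mean = sum(ref_values) / len(ref_values)
    # Assumption: sample variance with n - 1 in the denominator.
    var = sum((v - mean) ** 2 for v in ref_values) / (len(ref_values) - 1)
    if var == 0:
        return None
    return sign * (series[target_year] - mean) / math.sqrt(var)


# Synthetic thawing dates (day of year): alternating 100 and 102 in 1841-1870,
# and an early thaw on day 95 in 1835.
thaw = {year: 100 + 2 * (year % 2) for year in range(1841, 1871)}
thaw[1835] = 95
print(standardized_anomaly(thaw, sign=-1))
```

For freezing dates the sign would be left at +1, because an early freeze already corresponds to a negative (cold) anomaly.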
Code availability
R code for generating the plots in this paper, for reading in all files and extracting desired information, and for the forward modeling is available from https://github.com/sbroennimann/DOCU-CLIM.
Received: 19 December 2022; Accepted: 12 June 2023; Published: xx xx xxxx
SCIENTIFIC DATA | (2023) 10:402 | https://doi.org/10.1038/s41597-023-02303-y
References
1. IPCC. Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. https://doi.org/10.1017/9781009157896 (Cambridge University Press, 2021).
2. Bhend, J., Franke, J., Folini, D., Wild, M. & Brönnimann, S. An ensemble-based approach to climate reconstructions. Climate of the Past 8, 963–976 (2012).
3. Hakim, G. J. et al. The last millennium climate reanalysis project: Framework and first results. Journal of Geophysical Research: Atmospheres 121, 6745–6764 (2016).
4. Tardif, R. et al. Last Millennium Reanalysis with an expanded proxy database and seasonal proxy modeling. Climate of the Past 15, 1251–1273 (2019).
5. Valler, V., Franke, J., Brugnara, Y. & Brönnimann, S. An updated global atmospheric paleo-reanalysis covering the last 400 years. Geosci Data J 00, 1–19 (2021).
6. Emile-Geay, J. et al. A global multiproxy database for temperature reconstructions of the Common Era. Sci Data 4, 170088 (2017).
7. Breitenmoser, P., Brönnimann, S. & Frank, D. Forward modelling of tree-ring width and comparison with a global network of tree-ring chronologies. Climate of the Past 10, 437–449 (2014).
8. Shah, S. K., Pandey, U., Mehrotra, N., Wiles, G. C. & Chandra, R. A winter temperature reconstruction for the Lidder Valley, Kashmir, Northwest Himalaya based on tree-rings of Pinus wallichiana. Clim Dyn 53, 4059–4075 (2019).
9. Kilbourne, H. et al. A global multiproxy database for temperature reconstructions of the Common Era. figshare. Collection. https://doi.org/10.6084/m9.figshare.c.3285353.v2 (2017).
10. Pfister, C., Rohr, C. & Jover, A. C. C. Euro-Climhist: eine Datenplattform der Universität Bern zur Witterungs-, Klima- und Katastrophengeschichte. Wasser Energie Luft 109, 45–48 (2017).
11. Rodrigo, F. S. The climate of Granada (southern Spain) during the first third of the 18th century (1706–1730) according to documentary sources. Climate of the Past 15, 647–659 (2019).
12. The Palgrave Handbook of Climate History. https://doi.org/10.1057/978-1-137-43020-5 (Palgrave Macmillan UK, 2018).
13. Burgdorf, A.-M. A global inventory of quantitative documentary evidence related to climate since the 15th century. Climate of the Past 18, 1407–1428 (2022).
14. Nash, D. J. et al. Climate indices in historical climate reconstructions: a global state of the art. Clim. Past 17, 1273–1314 (2021).
15. Reichen, L. et al. A decade of cold Eurasian winters reconstructed for the early 19th century. Nat Commun 13, 1–9 (2022).
16. Rykachev, M. Openings and freezings of rivers in the Russian Empire. (тип. Имп. Акад. наук, 1886).
17. Shostakovich, V. B. Opening and freezing of waters of Asian Russia. (паровая типо-литография П. И. Макушина и В. М. Посохина, 1909).
18. Dee, S. G., Steiger, N. J., Emile-Geay, J. & Hakim, G. J. On the utility of proxy system models for estimating climate states over the common era. J Adv Model Earth Syst 8, 1164–1179 (2016).
19. Rohde, R. et al. A New Estimate of the Average Earth Surface Land Temperature Spanning 1753 to 2011. Geoinformatics & Geostatistics: An Overview 01, 1–7 (2013).
20. Morice, C. P. et al. An Updated Assessment of Near-Surface Temperature Change From 1850: The HadCRUT5 Data Set. Journal of Geophysical Research: Atmospheres 126, e2019JD032361 (2021).
21. Schneider, U. et al. Evaluating the Hydrological Cycle over Land Using the Newly-Corrected Precipitation Climatology from the Global Precipitation Climatology Centre (GPCC). Atmosphere 8, 52 (2017).
22. Lundstad, E. et al. The global historical climate database HCLIM. Scientific Data 10, 44, https://doi.org/10.1038/s41597-022-01919-w (2022).
23. Dobrovolný, P. et al. Monthly, seasonal and annual temperature reconstructions for Central Europe derived from documentary evidence and instrumental records since AD 1500. Clim Change 101, 69–107 (2010).
24. Burgdorf, A.-M. et al. DOCU-CLIM: A global documentary climate dataset for climate reconstructions. Open Access CRIS of the University of Bern https://doi.org/10.48620/167 (2022).
25. Benson, B., Magnuson, J. & Sharma, S. Global Lake and River Ice Phenology Database, Version 1. NSIDC: National Snow and Ice Data Center. https://doi.org/10.7265/N5W66HP8 (2000).
26. Wang, P. K. et al. Construction of the REACHES climate database based on historical documents of China. Sci Data 5, 180288 (2018).
27. Nicholson, S. E., Klotter, D. & Dezfuli, A. K. Spatial reconstruction of semi-quantitative precipitation fields over Africa during the nineteenth century from documentary evidence and gauge data. Quaternary Research 78, 13–23 (2012b).
28. Prieto, M. R. & Herrera, G. R. Documentary sources from South America: Potential for climate reconstruction. Palaeogeogr Palaeoclimatol Palaeoecol 281, 196–209 (2009).
29. Neukom, R. et al. An extended network of documentary data from South America and its potential for quantitative precipitation reconstructions back to the 16th century. Geophys Res Lett 36, L12703 (2009).
30. Fenby, C. & Gergis, J. L. Rainfall variations in south-eastern Australia part 1: Consolidating evidence from pre-instrumental documentary sources, 1788–1860. International Journal of Climatology 33, 2956–2972 (2013).
31. Gallego, D., García-Herrera, R., Peña-Ortiz, C. & Ribera, P. The steady enhancement of the Australian Summer Monsoon in the last 200 years. Sci Rep 7, 1–7 (2017).
32. Gallego, D., Ordóñez, P., Ribera, P., Peña-Ortiz, C. & García-Herrera, R. An instrumental index of the West African Monsoon back to the nineteenth century. Quarterly Journal of the Royal Meteorological Society 141, 3166–3176 (2015).
33. Barriopedro, D. et al. Witnessing North Atlantic westerlies variability from ships’ logbooks (1685–2008). Clim Dyn 43, 939–955 (2014).
34. Vega, I. et al. Reconstructing the western North Pacific summer monsoon since the late nineteenth century. J Clim 31, 355–368 (2018).
35. Adamson, G. C. D. & David, J. N. Documentary Reconstruction of Monsoon Rainfall Variability over Western India, 1781–1860. Clim. Dyn. 42, 749–69 (2014).
36. Alcoforado, M.-J. et al. Temperature and precipitation reconstruction in southern Portugal during the late Maunder Minimum (AD 1675–1715). The Holocene 10, 333–340 (2000).
37. Amano, T., Smithers, R. J., Sparks, T. H. & Sutherland, W. J. A 250-year index of first flowering dates and its response to temperature changes. Proc. R. Soc. B Biol. Sci. 277, 2451–2457 (2010).
38. Aono, Y. & Kazui, K. Phenological Data Series of Cherry Tree flowering in Kyoto Japan and Its Application to Reconstruction of Springtime Temperatures since the 9th Century. Int. J. Climatol. 28, 905–914 (2008).
39. Aono, Y. & Omoto, Y. Estimation of Temperature at Kyoto since the 11th Century. Using Flowering Data of Cherry Trees in Old Documents. J. Agricult. Meteorol. 49, 263–272 (1994).
40. Aono, Y. & Saito, S. Clarifying Springtime Temperature Reconstructions of the Medieval Period by Gap-Filling the Cherry Blossom Phenological Data Series at Kyoto, Japan. Int. J. Biometeorol. 54, 211–219 (2010).
41. Aono, Y. & Tani, A. Autumn temperature deduced from historical records of autumn tints phenology of maple tree in Kyoto, Japan. Climate in Biosphere 14, 18–28 (2014).
42. Aono, Y. Cherry Blossom Phenological Data since the Seventeenth Century for Edo (Tokyo), Japan, and Their Application to Estimation of March Temperatures. Int. J. Biometeorol. 59, 427–434 (2015).
43. Ball, T. F. Historical and instrumental evidence of climate: western Hudson Bay, Canada, 1714–1850. In: Climate since A.D. 1500 (eds. Bradley, R. S. & Jones, P. D.), p. 40–73, Routledge (1995).
44. Barrat, J. M. D. First Appearance of Ice, the Closing and the time of Opening of the Connecticut River at Middletown. The American Journal of Science and Arts 39, 88–90 (1840).
45. Bartholy, J. et al. Classification and analysis of past climate information based on historical documentary sources for the Carpathian Basin. Int. J. Climatol. 24, 1759–1776 (2004).
46. Bullón, T. Winter Temperatures in the Second Half of the Sixteenth Century in the Central Area of the Iberian Peninsula. Clim. Past 4, 357–367 (2008).
47. Camenisch, C. Endless Cold: A Seasonal Reconstruction of Temperature and Precipitation in the Burgundian Low Countries during the 15th Century Based on Documentary Evidence. Clim. Past 11, 1049–1066 (2015).
48. Camuffo, D. et al. 500-Year Temperature Reconstruction in the Mediterranean Basin by Means of Documentary Data and Instrumental Observations. Clim. Change 101, 169–99 (2010).
49. Camuffo, D. et al. Climate change in the Mediterranean over the last five hundred years. In: Planet earth 2011 – global warming challenges and opportunities for policy and practice (ed. Carayannis, E. G.). InTech, Rijeka (2011).
50. Catchpole, A. J. W. & Faurer, M.-A. Summer sea ice severity in Hudson Strait, 1751–1870. Clim. Change 5, 115–139 (1983).
51. Catchpole, A. J. W. Hudson’s Bay Company Ships’ Log-Books as Sources of Sea Ice Data, 1751–1870. In: Climate since A.D. 1500 (eds. Bradley, R. S. & Jones, P. D.), p. 17–39, Routledge (1995).
52. Catchpole, A. J. W., Moodie, D. W. & Milton, D. Freeze-Up and Break-Up of Estuaries on Hudson Bay in the Eighteenth and Nineteenth Centuries. Can. Geogr. Géographe Can. 20, 279–297 (1976).
53. Daux, V. et al. An Open-Access Database of Grape Harvest Dates for Climate Research: Data Description and Quality Assessment. Clim. Past 8, 1403–1418 (2012).
54. de Vries, J. Histoire du climat et économie: des faits nouveaux, une interprétation différente. Ann. Hist. Sci. Soc. 32, 198–226 (1977).
55. Defila, C. & Clot, B. Phytophenological trends in Switzerland. Int. J. Biometeorol. 45, 203–207 (2001).
56. Dobrovolný, P. et al. Precipitation Reconstruction for the Czech Lands, AD 1501–2010. Int. J. Climatol. 35, 1–14 (2015).
57. Eklund, A. Isläggning och islossning i svenska sjöar. (Long observation series of ice freeze and break up dates in Swedish lakes), SMHI Hydrologi No. 81. Sveriges Meteorologiska och Hydrologiska Institut, Norrköping, pp. 1–24 (1999).
58. Fernández-Fernández, M. I. et al. The climate in Zafra from 1750 to 1840: precipitation. Clim. Change 129, 267–280 (2015).
59. García, R. et al. Reconstruction of the precipitation in the Canary Islands for the period 1595–1836. B. Am. Meteorol. Soc. 84, 37–40 (2003).
60. Ge, Q. S. et al. Reconstruction of Historical Climate in China, High-Resolution Precipitation Data from Qing Dynasty Archives. B. Am. Meteorol. Soc. 86, 671–769 (2005).
61. Ge, Q. S. et al. The Rainy Season in the Northwestern Part of the East Asian Summer Monsoon in the 18th and 19th Centuries. Quat. Sci. Rev. 229, 16–23 (2011).
62. Gergis, J. & Ashcroft, L. Rainfall Variations in South-Eastern Australia Part 2: A Comparison of Documentary, Early Instrumental and Palaeoclimate Records, 1788–2008. Int. J. Climatol. 33, 2973–2987 (2013).
63. Gimmi, U. et al. A Method to Reconstruct Long Precipitation Series Using Systematic Descriptive Observations in Weather Diaries: The Example of the Precipitation Series for Bern, Switzerland (1760–2003). Theor. App. Clim. 87, 185–197 (2007).
64. Gioda, A. Para una Historia Climática de La Paz en los Últimos Cinco Siglos. Revista de La Coordinadora de Historia 3, 13–33 (1999).
65. Gioda, A. & Prieto, M. R. Histoire des sécheresses andines. Potosí. El Niño et le petit age glaciaire. La Météorologie 8, 33–42 (1999).
66. Glaser, R. & Riemann, D. A thousand-year record of temperature variations for Germany and Central Europe based on documentary data. J. Quaternary Sci. 24, 437–449 (2009).
67. Grab, S. & Zumthurm, T. The land and its climate knows no transition, no middle ground, everywhere too much or too little: a documentary-based climate chronology for central Namibia, 1845–1900. Int. J. Climatol. 38, e643–e659 (2018).
68. Guevara-Murua, A. et al. 300 years of hydrological records and societal responses to droughts and floods on the Pacific coast of Central America. Clim. Past 14, 175–191 (2018).
69. Hannaford, M. J. et al. Early-nineteenth-century southern African precipitation reconstructions from ships’ logbooks. The Holocene 25, 379–390 (2015).
70. Hao, Z. X. et al. Winter Temperature Variations over the Middle and Lower Reaches of the Yangtze River since 1736 AD. Clim. Past 8, 1023–1030 (2012).
71. García-Herrera, R. & Prieto, M. R. Floods in the semiarid Argentinean Chaco during the 17th to 19th centuries. In: Proceedings of Palaeofloods, Historical Data & Climatic Variability: Applications in Flood Risk Assessment (ed. Thorndycraft, V. R. et al.), pp. 107–112, CSIC-Centro de Ciencias Medioambientales, Madrid (2003).
72. Hildebrandsson, H. H. Sur le prétendu changement du climat européen en temps historiques. Nova Acta R. Soc. Sci. Upps. 4(4), 5 (1915).
73. Hill, B. T. & Jones, S. J. The Newfoundland ice extent and the solar cycle from 1860 to 1988. J. Geophys. Res. 95, 5385 (1990).
74. Hirano, J. et al. Reconstruction of July temperature variations since the 1830s in Kawanishi based on historical weather documents. Geographical Review of Japan (Series A) 86, 451–464 (2013). (in Japanese).
75. Holopainen, J. et al. Plant phenological data and tree-rings as palaeoclimate indicators since AD 1750 in SW Finland. Int. J. Biometeorol. 51, 61–72 (2006).
76. Kajander, J. Methodological Aspects on River Cryophenology Exemplified by a Tricentennial Break-up Time Series from Tornio. Geophysica 29, 73–95 (1993).
77. Kelso, C. & Vogel, C. The Climate of Namaqualand in the Nineteenth Century. Clim. Change 83, 357–380 (2007).
78. Kiss, A. et al. An experimental 392-year documentary-based multi proxy (vine and grain) reconstruction of May-July temperatures for Koszeg, West-Hungary. Int. J. Biometeorol. 55, 595–611 (2011).
79. Koslowski, G. & Glaser, R. Reconstruction of the ice winter severity since 1701 in the Western Baltic. Clim. Change 31, 79–98 (1995).
80. Kouraev, A. et al. The ice regime of Lake Baikal from historical and satellite data: Relationship to air temperature, dynamical, and other factors. Limnology and Oceanography 52, 1268–1286 (2007).
81. Koslowski, G. & Glaser, R. Variations in reconstructed ice winter severity in the Western Baltic from 1501 to 1995, and their implications for the North Atlantic oscillation. Clim. Change 41, 175–191 (1999).
82. Leijonhufvud, L. et al. Five Centuries of Stockholm Winter/Spring Temperatures Reconstructed from Documentary Evidence and Instrumental Observations. Clim. Change 101, 109–141 (2010).
83. Livingstone, D. M. Break-up dates of Alpine lakes as proxy data for local and regional mean surface air temperatures. Clim. Change 37, 407–439 (1997).
84. Loader, N. J. et al. Spring temperature variability in northern Fennoscandia AD 1693–2011. J. Quat. Sci. 26, 566–570 (2011).
85. Magne, M. A. Two Centuries of River Ice Dates in Hudson Bay Region from Historical Sources. (University of Manitoba, 1981).
86. Magnuson, J. J. et al. Historical trends in lake and river ice cover in the Northern Hemisphere. Science 289, 1743–1746 (2000).
87. Matuszko, D. Klimat Krakowa w XX wieku (Climate of Cracow in 20th Century). IGiGP UJ, 251 pp. (2007).
88. Maurer, C., Koch, E., Hammerl, C., Hammerl, T. & Pokorny, E. BACCHUS temperature reconstruction for the period 16th to 18th centuries from Viennese and Klosterneuburg grape harvest dates. J. Geophys. Res. 114, D22106, https://doi.org/10.1029/2009JD011730 (2009).
89. Meier, N. et al. Grape harvest dates as a proxy for Swiss April to August temperature reconstructions back to AD 1480. Geophys. Res. Lett. 34, L20705 (2007).
90. Mikami, T. Long term variations of summer temperatures in Tokyo since 1721. Geographical Reports of Tokyo Metropolitan University 31, 157–165 (1996).
91. Mikami, T. Climatic Variations in Japan Reconstructed from Historical Documents. Weather 63, 190–193 (2008).
92. Mougin, M. Etudes glaciologiques en Savoie. Études Glaciologiques 3, 1–113 (1912).
93. Možný, M. et al. Cereal harvest dates as a proxy for Czech Republic March to June temperature. Clim. Change 110, 801–812 (2012).
94. Možný, M. et al. Drought reconstruction based on grape harvest dates for the Czech Lands, 1499−2012. Clim. Res. 70, 119–132 (2016).
95. Nash, D. J. & Endfield, G. H. ‘Splendid Rains Have Fallen’: Links Between El Niño and Rainfall Variability in the Kalahari, 1840–1900. Clim. Change 86, 257–290 (2008).
96. Nash, D. J. & Endfield, G. H. A 19th century climate chronology for the Kalahari region of central southern Africa derived from missionary correspondence. Int. J. Climatol. 22, 821–841 (2002).
97. Nash, D. J. & Grab, S. W. ‘A Sky of Brass and Burning Winds’: Documentary Evidence of Rainfall Variability in the Kingdom of Lesotho, Southern Africa, 1824–1900. Clim. Change 101, 617–653 (2010).
98. Nash, D. J. et al. Seasonal rainfall variability in southeast Africa during the nineteenth century reconstructed from documentary sources. Clim. Change 134, 605–619 (2016).
99. Neukom, R. et al. Multi-Proxy Summer and Winter Precipitation Reconstruction for Southern Africa Over the Last 200 Years. Clim. Dyn. 42, 2713–2726 (2014).
100. Nicholson, S. E. A Semi-Quantitative, Regional Precipitation Data Set for Studying African Climates of the Nineteenth Century, Part I. Overview of the Data Set. Clim. Change 50, 317–353 (2001).
101. Nicholson, S. E. et al. A Two-Century Precipitation Dataset for the Continent of Africa. B. Am. Meteorol. Soc. 93, 1219–1231 (2012).
102. Nordli, Ø. et al. A Late-Winter to Early-Spring Temperature Reconstruction for Southeastern Norway from 1758 to 2006. Annals of Glaciology 46, 404–408 (2007).
103. Ogilvie, A. E. J. & Jónsdóttir, I. Sea Ice, Climate, and Icelandic Fisheries in the Eighteenth and Nineteenth Centuries. Arctic 53, 383–394 (2000).
104. Ogilvie, A. E. J. The past climate and sea-ice record from Iceland, Part 1: Data to A.D. 1780. Clim. Change 6, 131–152 (1984).
105. Ouellet-Bernier, M.-M. & de Vernal, A. Winter freeze-up and summer break-up in Nunatsiavut, Canada, from 1770 to 1910. Past Glob. Chang. Mag. 28, 52–53 (2020).
106. Pfister, C. Monthly temperature and precipitation in central Europe 1525–1979: quantifying documentary evidence on weather and its effects. In: Climate since A.D. 1500 (eds. Bradley, R. S. & Jones, P. D.), p. 92–117, Routledge (1995).
107. Pfister, C. The creation of high resolution spatio-temporal reconstructions of past climate from direct meteorological observations and proxy data. Methodological considerations and results. Climate in Europe 1675–1715. In: Climatic Trends and Anomalies in Europe 1675–1715 (eds. Frenzel, B., Pfister C. & Glaeser, B.), p. 329–376, Stuttgart (1994).
108. Prieto, M. R. & Herrera, R. De sequías, hambrunas, plagas y otras varias y continuas calamidades acaecidas en la Jurisdicción de Córdoba durante el siglo XVIII, Serie Economía y Sociedad. Cuadernos de Historia 4, 131–158 (2001).
109. Prieto, M. R. ENSO Signals in South America: Rains and Floods in the Paraná River Region during Colonial Times. Clim. Change 83, 39–54 (2007).
110. Prieto, M. R. et al. Historical Evidences of the Mendoza River Streamflow Fluctuations and Their Relationship with ENSO. The Holocene 9, 472–481 (1999).
111. Prieto, M. R., Herrera, R. & Dussel, P. Archival evidence for some aspects of historical climate variability in Argentina and Bolivia during the 17th and 18th centuries. In: Southern Hemisphere Paleo and Neo-Climates: Methods and Concepts (eds. Smolka, P. P. & Volkheimer, W.), pp. 127–142, Springer, Berlin (2000).
112. Rannie, W. F. Breakup and freezeup of the Red River at Winnipeg, Manitoba Canada in the 19th century and some climatic implications. Clim. Change 5, 283–296 (1983).
113. Rodrigo, F. S. et al. A 500-year precipitation record in Southern Spain. Int. J. Climatol. 19, 1233–1253 (1999).
114. Rodrigo, F. S. et al. A reconstruction of the winter North Atlantic Oscillation Index back to AD 1501 using documentary data in Southern Spain. J. Geophys. Res. 106, 14805–14818 (2001).
115. Rutishauser, T. Cherry Tree Phenology. Interdisciplinary Analyses of Phenological Observations of the Cherry Tree in the Extended Swiss Plateau Region and Their Relation to Climate Change. Diploma thesis, University of Bern 122 pp. (2003).
116. Seinä, A. & Palosuo, E. The classification of the maximum annual extent of ice cover in the Baltic Sea 1720–1995. MERI-Report Series of the Finnish Inst. of Marine Res. 27, 79–91 (1996).
117. Seinä, A. et al. Ice seasons 1996–2000 in Finnish sea areas. MERI-Report Series of the Finnish Inst. of Marine Res 43, 132 (2001).
118. Shabalova, M. V. & van Engelen, A. F. V. Evaluation of a Reconstruction of Winter and Summer Temperatures in the Low Countries, AD 764–1998. Clim. Change 58, 219–242 (2003).
119. Shipman, T. G. Ice-conditions on the Mississippi River at Davenport, Iowa. Northwest Sci. 12, 590–594 (1938).
120. Sparks, T. H. & Carey, P. D. The Responses of Species to Climate Over Two Centuries: An Analysis of the Marsham Phenological Record, 1736–1947. J. Ecol. 83, 321 (1995).
121. Takács, K., Kern, Z. & Pásztor, L. Long-term ice phenology records from eastern–central Europe. Earth Syst. Sci. Data 10, 391–404 (2018).
122. Tan, L. et al. Precipitation variations of Longxi, northeast margin of Tibetan Plateau since AD 960 and their relationship with solar activity. Clim. Past 4, 19–28 (2008).
123. Tarand, A. & Nordli, Ø. The Tallinn temperature series reconstructed back half a millennium by use of proxy data. Clim. Change 48, 189–199 (2001).
124. Taulis, E. De la distribution des pluies au Chili. Materiaux pour l'étude des calamités 33, 3–20 (1934).
125. Teillet, J. V. A Reconstruction of Summer Sea Ice Conditions in the Labrador Sea Using Hudson’s Bay Company Ships’ Log-Books, 1751 to 1870. University of Manitoba, Winnipeg (1988).
126. Tejedor, E. et al. Rogation ceremonies: a key to understanding past drought variability in northeastern Spain since 1650. Clim. Past 15, 1647–1664 (2019).
127. Van Engelen, A. F. V. et al. A Millennium of Weather, Winds and Water in the Low Countries. In History and Climate: Memories of the Future? (ed. Jones P. D. et al.), pp. 101–123. New York, Springer (2001).
128. Vogel, C. H. A Documentary-Derived Climatic Chronology for South Africa, 1820–1900. Clim. Change 14, 291–307 (1989).
129. Wetter, O. & Pfister, C. Spring-Summer Temperatures Reconstructed for Northern Switzerland and Southwestern Germany from Winter Rye Harvest Dates, 1454–1970. Clim. Past 7, 1307–1326 (2011).
130. Longpré, M. A., Stix, J., Burkert, C., Hansteen, T. & Kutterolf, S. Sulfur budget and global climate impact of the A.D. 1835 eruption of Cosigüina volcano, Nicaragua. Geophys Res Lett 41, 6667–6675 (2014).
131. Raible, C. C. et al. Tambora 1815 as a test case for high impact volcanic eruptions: Earth system effects. WIREs Climate Change 7, 569–589 (2016).
132. McKay, N. P. & Emile-Geay, J. Technical note: The Linked Paleo Data framework – a common tongue for paleoclimatology. Clim. Past 12, 1093–1100, https://doi.org/10.5194/cp-12-1093-2016 (2016).
133. Riemann, D., Glaser, R., Kahle, M. & Vogt, S. The CRE tambora.org – new data and tools for collaborative research in climate and environmental history. Geosci Data J 2, 63–77 (2015).
134. García-Herrera, R. et al. CLIWOC: A Climatological Database for the World’s Oceans 1750–1854. Clim Change 73, 1–12 (2005).
135. Veale, L. et al. Dealing with the deluge of historical weather data: the example of the TEMPEST database. Geo 4, e00039 (2017).
Acknowledgements
This work was supported by the European Commission (ERC Grant PALAEO-RA, 787574) and by Swiss National Science Foundation project WeaR (188701). Simulations underlying EKF400v2 were performed at the Swiss National Supercomputing Centre CSCS. ET is supported by a Marie Skłodowska-Curie Action (“ITHACA-101024389”) and the Government of Aragón through the “Program of research groups” (group H09_20R, “Climate, Water, Global Change, and Natural Systems”).
Author contributions
A.M.B. compiled all datasets in this paper, S.B. formatted the data and performed the forward modeling, all authors contributed datasets, all authors commented on the manuscript.
Competing interests
The authors declare no competing interests.
Additional information
Correspondence and requests for materials should be addressed to S.B.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2023
Angela-Maria Burgdorf1,2, Stefan Brönnimann1,2 ✉, George Adamson3, Tatsuya Amano4, Yasuyuki Aono5, David Barriopedro6, Teresa Bullón7, Chantal Camenisch1,8, Dario Camuffo9, Valérie Daux10, María del Rosario Prieto11,12,37, Petr Dobrovolný13,14, David Gallego15, Ricardo García-Herrera16,17, Joelle Gergis18,19, Stefan Grab20, Matthew J. Hannaford21, Jari Holopainen22, Clare Kelso23, Zoltán Kern24,25, Andrea Kiss26, Elaine Kuan-Hui Lin27,28, Neil J. Loader29, Martin Možný30, David Nash20,31, Sharon E. Nicholson32, Christian Pfister1,8, Fernando S. Rodrigo33, This Rutishauser2, Sapna Sharma34, Katalin Takács35, Ernesto T. Vargas36 & Inmaculada Vega15
1 Institute of Geography, University of Bern, Bern, Switzerland.
2 Oeschger Centre for Climate Change Research, University of Bern, Bern, Switzerland.
3 Department of Geography, King’s College London, London, UK.
4 School of Biological Sciences, The University of Queensland, Brisbane, Australia.
5 Graduate School of Agriculture, Osaka Metropolitan University, Osaka, Japan.
6 Instituto de Geociencias (IGEO), CSIC-UCM, Madrid, Spain.
7 Department of Geography, Autonomous University of Madrid (UAM), Madrid, Spain.
8 Institute of History, Department of Economic, Social and Environmental History (WSU), University of Bern, Bern, Switzerland.
9 National Research Council-Institute of Atmospheric Sciences and Climate, Corso Stati Uniti 4, Padua, Italy.
10 Laboratoire des Sciences du Climat et de l’Environnement, CNRS, CEA, UVSQ, Université Paris-Saclay, Gif-sur-Yvette, France.
11 Argentine Institute of Nivology, Glaciology and Environmental Sciences (IANIGLA-CONICET), Mendoza, Argentina.
12 Facultad de Filosofía y Letras, Universidad Nacional de Cuyo, Mendoza, Argentina.
13 Department of Geography, Faculty of Science, Masaryk University, Brno, Czech Republic.
14 Global Change Research Institute, Czech Academy of Sciences, Brno, Czech Republic.
15 Departamento de Sistemas Físicos, Químicos y Naturales, Universidad Pablo de Olavide, Seville, Spain.
16 Departamento de Física de la Tierra y Astrofísica, Universidad Complutense, Madrid, Spain.
17 IGEO, Instituto de Geociencias (CSIC, UCM), Madrid, Spain.
18 Fenner School of Environment and Society, Australian National University, Canberra, Australia.
19 ARC Centre for Climate Extremes, The Australian National University, Canberra, Australia.
20 School of Geography, Archaeology and Environmental Studies, University of the Witwatersrand, Johannesburg, South Africa.
21 Department of Geography, College of Science, University of Lincoln, Lincoln, UK.
22 Natural Resources Institute Finland, Helsinki, Finland.
23 Department of Geography, Environmental Management and Energy Studies, University of Johannesburg, Johannesburg, South Africa.
24 Institute for Geological and Geochemical Research, Research Centre for Astronomy and Earth Sciences, Budapest, Hungary.
25 CSFK, MTA Centre of Excellence, Konkoly Thege Miklós út 15-17, Budapest, Hungary.
26 Institute of Hydraulic Engineering and Water Resources Management, Vienna University of Technology, Vienna, Austria.
27 Research Center for Environmental Changes, Academia Sinica, Taipei, Taiwan.
28 Graduate Institute of Environmental Education, National Taiwan Normal University, Taipei, Taiwan.
29 Department of Geography, Swansea University, Swansea, UK.
30 Department of Biometeorological Applications, Czech Hydrometeorological Institute, Prague, Czech Republic.
31 School of Applied Sciences, University of Brighton, Brighton, United Kingdom.
32 Department of Earth, Ocean, and Atmospheric Science, Florida State University, Tallahassee, Florida, USA.
33 Department of Chemistry and Physics, University of Almería, Almería, Spain.
34 Department of Biology, York University, Toronto, Ontario, Canada.
35 Institute for Soil Sciences (TAKI) and Centre for Agricultural Research (ATK), Budapest, Hungary.
36 National Museum of Natural Sciences-Spanish Research Council, Madrid, Spain.
37 Deceased: María del Rosario Prieto.
✉e-mail: stefan.broennimann@giub.unibe.ch