Article
DOI: 10.1145/956750.956801

Discovery of climate indices using clustering

Published: 24 August 2003

Abstract

To analyze the effect of the oceans and atmosphere on land climate, Earth scientists have developed climate indices, which are time series that summarize the behavior of selected regions of the Earth's oceans and atmosphere. In the past, Earth scientists have used observation and, more recently, eigenvalue analysis techniques, such as principal components analysis (PCA) and singular value decomposition (SVD), to discover climate indices. However, eigenvalue techniques are useful only for finding a few of the strongest signals. Furthermore, they impose the condition that all discovered signals must be orthogonal to each other, which makes it difficult to attach a physical interpretation to them. This paper presents an alternative, clustering-based methodology for the discovery of climate indices that overcomes these limitations and is based on clusters that represent regions with relatively homogeneous behavior. The centroids of these clusters are time series that summarize the behavior of the ocean or atmosphere in those regions. Some of these centroids correspond to known climate indices and provide a validation of our methodology; other centroids are variants of known indices that may provide better predictive power for some land areas; and still other indices may represent potentially new Earth science phenomena. Finally, we show that cluster-based indices generally outperform SVD-derived indices, both in terms of area-weighted correlation and direct correlation with the known indices.
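
The idea of treating cluster centroids as candidate indices can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the two hidden driving signals, the noise level, the grid of 60 points, and the use of plain k-means with farthest-point seeding as a stand-in clustering step (the abstract does not say which clustering algorithm the paper uses). The sketch also computes the two leading SVD temporal patterns so that their orthogonality, mentioned above, can be seen directly.

```python
import numpy as np

def zscore(x):
    """Standardize each row (one time series per grid point)."""
    return (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)

def kmeans(series, k, n_iter=20):
    """Lloyd's k-means with farthest-point seeding (illustrative stand-in)."""
    centroids = [series[0]]
    for _ in range(1, k):
        # Next seed: the series farthest from every seed chosen so far.
        d = np.min([((series - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(series[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Squared Euclidean distance from every series to every centroid.
        dist = ((series[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = series[labels == j].mean(axis=0)
    return labels, centroids

def corr(x, y):
    """Pearson correlation between two time series."""
    return np.corrcoef(x, y)[0, 1]

# Synthetic data (assumption): two ocean regions, each driven by its own
# hidden signal plus independent noise at every grid point.
rng = np.random.default_rng(0)
n_months = 120
signal_a = rng.standard_normal(n_months)  # plays the role of a known index
signal_b = rng.standard_normal(n_months)
region_a = signal_a + 0.5 * rng.standard_normal((30, n_months))
region_b = signal_b + 0.5 * rng.standard_normal((30, n_months))
grid = zscore(np.vstack([region_a, region_b]))

# Cluster the grid points; each centroid is a candidate index.
labels, centroids = kmeans(grid, k=2)
centroid_vs_a = corr(centroids[0], signal_a)

# SVD alternative: rows of vt are temporal patterns, orthogonal by construction.
_, _, vt = np.linalg.svd(grid, full_matrices=False)
svd_vs_a = max(abs(corr(vt[0], signal_a)), abs(corr(vt[1], signal_a)))

print("centroid vs. signal A:", round(centroid_vs_a, 3))
print("best SVD pattern vs. signal A:", round(svd_vs_a, 3))
print("dot product of first two SVD patterns:", vt[0] @ vt[1])
```

In this toy setup each centroid is an average over one region, so it can be read as "the behavior of that region". The SVD patterns are forced to be mutually orthogonal, so when the two regions carry similar variance they can come out as mixtures of both signals, which is one way to picture the interpretability concern raised in the abstract.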


Published In

KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
August 2003
736 pages
ISBN:1581137370
DOI:10.1145/956750

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustering
  2. earth science data
  3. mining scientific data
  4. singular value decomposition
  5. time series

Qualifiers

  • Article

Conference

KDD03
Acceptance Rates

KDD '03 Paper Acceptance Rate 46 of 298 submissions, 15%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Cited By

  • (2024) Randomnet: clustering time series using untrained deep neural networks. Data Mining and Knowledge Discovery. DOI: 10.1007/s10618-024-01048-5. Online publication date: 22-Jun-2024
  • (2023) MuSTC: A Multi-Stage Spatio–Temporal Clustering Method for Uncovering the Regionality of Global SST. Atmosphere 14:9 (1358). DOI: 10.3390/atmos14091358. Online publication date: 29-Aug-2023
  • (2023) Spatiotemporal Data Mining Problems and Methods. Analytics 2:2 (485-508). DOI: 10.3390/analytics2020027. Online publication date: 14-Jun-2023
  • (2022) When Is a Brain Like the Planet? Philosophy of Science 74:3 (330-346). DOI: 10.1086/521968. Online publication date: 1-Jan-2022
  • (2022) Climatic zoning of Ghana using selected meteorological variables for the period 1976–2018. Meteorological Applications 29:1. DOI: 10.1002/met.2049. Online publication date: 23-Feb-2022
  • (2021) Spatial–Temporal Patterns of Historical, Near-Term, and Projected Drought in the Conterminous United States. Hydrology 8:3 (136). DOI: 10.3390/hydrology8030136. Online publication date: 8-Sep-2021
  • (2021) Spatio-Temporal Multi-Task Learning via Tensor Decomposition. IEEE Transactions on Knowledge and Data Engineering 33:6 (2764-2775). DOI: 10.1109/TKDE.2019.2956713. Online publication date: 1-Jun-2021
  • (2021) Spatiotemporal Trends and Variability in the Centroid of the Northern Hemisphere's Circumpolar Vortex. Earth and Space Science 8:8. DOI: 10.1029/2020EA001594. Online publication date: 5-Aug-2021
  • (2021) Time series clustering in linear time complexity. Data Mining and Knowledge Discovery. DOI: 10.1007/s10618-021-00798-w. Online publication date: 18-Sep-2021
  • (2021) E-Research and GeoComputation in Public Health. GeoComputation and Public Health (37-78). DOI: 10.1007/978-3-030-71198-6_3. Online publication date: 25-Jun-2021
