DOI: 10.1111/j.1467-8659.2009.01467.x

Selecting good views of high-dimensional data using class consistency

Published: 10 June 2009

Abstract

Many visualization techniques involve mapping high-dimensional data spaces to lower-dimensional views. Unfortunately, mapping a high-dimensional data space into a scatterplot loses information or, worse, can give a misleading picture of valuable structure in higher dimensions. In this paper, we propose class consistency as a measure of the quality of the mapping. Class consistency enforces the constraint that classes of n-D data are shown clearly in 2-D scatterplots. We propose two quantitative measures of class consistency, one based on the distance to the class's center of gravity, and another based on the entropies of the spatial distributions of classes. We performed an experiment in which users chose good views, and show that class consistency has good precision and recall. We also evaluate both consistency measures over a range of data sets and show that these measures are efficient and robust.
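
As a rough illustration of the two kinds of measure named in the abstract, the sketch below scores a 2-D view of labeled points. The abstract gives no formulas, so everything here is an assumption rather than the authors' definition: `distance_consistency` is taken to be the fraction of points that lie closest to their own class's center of gravity, and `grid_entropy` is a simple stand-in for the entropy-based measure that bins points into square cells (the `cell` size is a hypothetical parameter) and averages the Shannon entropy of the class labels per cell.

```python
import math
from collections import Counter, defaultdict

def distance_consistency(points, labels):
    """Fraction of 2-D points whose nearest class centroid is their own (assumed definition)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x, y), c in zip(points, labels):
        s = sums[c]
        s[0] += x
        s[1] += y
        s[2] += 1
    # Center of gravity of each class in the 2-D view.
    centroids = {c: (sx / n, sy / n) for c, (sx, sy, n) in sums.items()}
    hits = 0
    for (x, y), c in zip(points, labels):
        nearest = min(
            centroids,
            key=lambda k: (x - centroids[k][0]) ** 2 + (y - centroids[k][1]) ** 2,
        )
        hits += nearest == c
    return hits / len(points)

def grid_entropy(points, labels, cell=1.0):
    """Count-weighted mean entropy (bits) of class labels per grid cell (assumed definition).

    0.0 means every occupied cell contains a single class.
    """
    cells = defaultdict(Counter)
    for (x, y), c in zip(points, labels):
        cells[(math.floor(x / cell), math.floor(y / cell))][c] += 1
    total = 0.0
    for counts in cells.values():
        n = sum(counts.values())
        h = -sum((k / n) * math.log2(k / n) for k in counts.values())
        total += n * h
    return total / len(points)

# Hypothetical toy views: one with well-separated classes, one with interleaved classes.
separated = [(0.1, 0.1), (0.2, 0.3), (0.3, 0.2), (5.1, 5.1), (5.2, 5.3), (5.3, 5.2)]
separated_labels = ["a", "a", "a", "b", "b", "b"]
mixed = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.3), (0.4, 0.4)]
mixed_labels = ["a", "b", "a", "b"]

print(distance_consistency(separated, separated_labels), grid_entropy(separated, separated_labels))
print(distance_consistency(mixed, mixed_labels), grid_entropy(mixed, mixed_labels))
```

Under these assumed definitions, a view that shows classes clearly has a high distance score and a low entropy score, so candidate scatterplots could be ranked by either value.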

Published In

EuroVis'09: Proceedings of the 11th Eurographics / IEEE - VGTC conference on Visualization
June 2009
1054 pages

Sponsors

  • ZIB
  • IEEE VGTC: IEEE Visualization and Graphics Technical Committee
  • DFG Research Center Matheon
  • NVIDIA
  • EUROGRAPHICS: The European Association for Computer Graphics

Publisher

The Eurographics Association & John Wiley & Sons, Ltd.

Chichester, United Kingdom

Cited By

  • (2024) A Large-Scale Sensitivity Analysis on Latent Embeddings and Dimensionality Reductions for Text Spatializations. IEEE Transactions on Visualization and Computer Graphics 31:1 (305-315). DOI: 10.1109/TVCG.2024.3456308. Online publication date: 17-Sep-2024
  • (2024) : Improving Label-Based Evaluation of Dimensionality Reduction. IEEE Transactions on Visualization and Computer Graphics 30:1 (781-791). DOI: 10.1109/TVCG.2023.3327187. Online publication date: 1-Jan-2024
  • (2024) Class-Constrained t-SNE: Combining Data Features and Class Probabilities. IEEE Transactions on Visualization and Computer Graphics 30:1 (164-174). DOI: 10.1109/TVCG.2023.3326600. Online publication date: 1-Jan-2024
  • (2024) Large-Scale Evaluation of Topic Models and Dimensionality Reduction Methods for 2D Text Spatialization. IEEE Transactions on Visualization and Computer Graphics 30:1 (902-912). DOI: 10.1109/TVCG.2023.3326569. Online publication date: 1-Jan-2024
  • (2024) Seeing is Learning in High Dimensions: The Synergy Between Dimensionality Reduction and Machine Learning. SN Computer Science 5:3. DOI: 10.1007/s42979-024-02604-y. Online publication date: 21-Feb-2024
  • (2023) Information visualisation for industrial process monitoring. Proceedings of the 27th International Database Engineered Applications Symposium (107-114). DOI: 10.1145/3589462.3595631. Online publication date: 5-May-2023
  • (2022) A Survey on ML4VIS: Applying Machine Learning Advances to Data Visualization. IEEE Transactions on Visualization and Computer Graphics 28:12 (5134-5153). DOI: 10.1109/TVCG.2021.3106142. Online publication date: 1-Dec-2022
  • (2022) Multidimensional data visualization applying a variety-oriented scatterplot selection technique. Journal of Visualization 26:1 (199-210). DOI: 10.1007/s12650-022-00871-6. Online publication date: 29-Aug-2022
  • (2021) Scatterplot Selection Applying a Graph Coloring Algorithm. Proceedings of the 14th International Symposium on Visual Information Communication and Interaction (1-6). DOI: 10.1145/3481549.3481553. Online publication date: 6-Sep-2021
  • (2021) A Taxonomy of Property Measures to Unify Active Learning and Human-centered Approaches to Data Labeling. ACM Transactions on Interactive Intelligent Systems 11:3-4 (1-42). DOI: 10.1145/3439333. Online publication date: 3-Sep-2021
