DOI: 10.1145/2523616.2525941

The wisdom of virtual crowds: mining datacenter telemetry to collaboratively debug performance

Published: 01 October 2013

Abstract

Explaining the (mis)behavior of virtual machines (VMs) in large-scale cloud environments is challenging, both because of scale and because it requires making sense of torrents of datacenter telemetry emanating from multiple levels of the stack. In this paper we leverage VM similarity to explain the behavior or performance of a VM using its cohort as a reference (or by contrasting it against groups of VMs outside of its cohort). The key insight is that VMs running the same application (components or workloads), or VMs colocated within the same (logical) tier of a complex application, exhibit similar telemetry patterns.
The power of similarity relationships stems from the additional context that similarity provides. The quantitative or qualitative "distance" between a VM and its expected cohort can be used to explain or diagnose a discrepancy. Similarly, the distance between a VM and one in another cohort can be used to explain why the two VMs are dissimilar. As an example, we apply our data-mining techniques to debugging ViewPlanner performance. ViewPlanner is a tool used to emulate and evaluate large-scale deployments of virtual desktops. Using a ViewPlanner deployment of 175 VMs, we collect ~300 metrics per VM, sampled every 20 seconds over multiple 1-hour epochs, from the PerformanceManager [4] on ESX, and automatically filter (using entropy measures [2]) and cluster them using K-means [1]. We use the median value of each metric within an epoch to summarize the VM's behavior during that epoch.
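The summarize, filter, and cluster steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which entropy measure or threshold is used, so the equal-width histogram entropy, the `min_entropy` cutoff, the deterministic K-means seeding, and the assumption that VMs are clustered on their vectors of per-epoch medians are all assumptions made here. All function names are hypothetical.

```python
import math
import statistics
from collections import Counter


def summarize_epoch(samples):
    """Collapse one epoch of raw samples {metric: [values]} to {metric: median}."""
    return {metric: statistics.median(values) for metric, values in samples.items()}


def shannon_entropy(values, bins=4):
    """Entropy in bits of an equal-width histogram; 0.0 for a constant metric."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def filter_metrics(rows, min_entropy=0.5):
    """Keep metrics whose per-VM medians carry at least min_entropy bits.

    rows is a list of {metric: median} dicts, one per VM (assumed layout).
    """
    return [m for m in rows[0] if shannon_entropy([r[m] for r in rows]) >= min_entropy]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm, seeded with the first k points for determinism."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

In practice the metric columns would likely need to be normalized before clustering, since raw telemetry mixes units; the sketch skips that step for brevity.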
We introduce spread/diffusion metrics to explain the differences between VMs. Spread metrics are those for which the expected value of an order statistic (in our case the median) of a metric m_i differs between two clusters, i.e., the expected value depends on the cluster: E[m_i | cluster A] ≠ E[m_i | cluster B]. Within a cluster of VMs, differences in the distribution of a particular metric m_i may be explained by conditioning m_i on other metrics {c_0, ..., c_n}, where E[m_i] ≠ E[m_i | c_0, ..., c_n]. We automatically find potentially interesting m_i's using Silverman's test [3] for multimodality, and we use mutual information [2] to find the associated c_i's.
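The two ideas in this paragraph can be illustrated with a short sketch under stated assumptions. The first function flags a metric as a spread metric when the mean of the per-VM medians differs between two clusters, a simple stand-in for E[m_i | cluster A] ≠ E[m_i | cluster B]. The remaining functions rank candidate conditioning metrics by mutual information with a target metric after equal-width discretization. The function names, the `min_gap` threshold, and the binning are hypothetical choices, and Silverman's test for multimodality is not implemented here.

```python
import math
import statistics
from collections import Counter


def spread_metrics(rows, labels, a, b, min_gap=0.0):
    """Return {metric: (mean_a, mean_b)} for metrics whose cluster means differ.

    rows is a list of {metric: median} dicts, one per VM (assumed layout);
    labels gives each VM's cluster id.
    """
    result = {}
    for metric in rows[0]:
        mean_a = statistics.mean(r[metric] for r, l in zip(rows, labels) if l == a)
        mean_b = statistics.mean(r[metric] for r, l in zip(rows, labels) if l == b)
        if abs(mean_a - mean_b) > min_gap:
            result[metric] = (mean_a, mean_b)
    return result


def discretize(values, bins=2):
    """Map values to equal-width bin indices; a constant metric maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length sequences of discrete symbols."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def rank_conditioning_metrics(rows, target, bins=2):
    """Rank the other metrics by mutual information with the target metric."""
    t = discretize([r[target] for r in rows], bins)
    scores = {
        m: mutual_information(t, discretize([r[m] for r in rows], bins))
        for m in rows[0]
        if m != target
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A real analysis would replace the raw mean comparison with a statistical test, because with finite samples two cluster means almost never coincide exactly.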

References

[1] T. Hastie, R. Tibshirani, and J. H. Friedman. The elements of statistical learning: data mining, inference, and prediction: with 200 full-color illustrations. New York: Springer-Verlag, 2001.
[2] A. Hyvarinen, J. Karhunen, and E. Oja. Independent Component Analysis. John Wiley & Sons, Inc., 2001.
[3] B. W. Silverman. Using kernel density estimates to investigate multimodality. J. R. Statist. Soc. B, 43(1): 97–99, 1981.
[4] VMware Inc. vsphere performance. http://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.wssdk.pg.doc_50%2FPG_Ch16_Performance.18.1.html.

Cited By

  • (2014) Crowdsourced Resource-Sizing of Virtual Appliances. Proceedings of the 2014 IEEE International Conference on Cloud Computing, pp. 801-809. DOI: 10.1109/CLOUD.2014.111. Online publication date: 27-Jun-2014

Published In

SOCC '13: Proceedings of the 4th annual Symposium on Cloud Computing
October 2013
427 pages
ISBN: 9781450324281
DOI: 10.1145/2523616
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

SOCC '13: ACM Symposium on Cloud Computing
October 1-3, 2013
Santa Clara, California

Acceptance Rates

SOCC '13 Paper Acceptance Rate 23 of 114 submissions, 20%;
Overall Acceptance Rate 169 of 722 submissions, 23%
