The widespread use of cloud computing services has made cloud
service reliability an important issue for both cloud providers
and users. To enhance cloud service reliability and reduce the
resulting losses, the status of virtual machines should be
monitored in real time and their future status predicted before
they crash. However, most existing methods ignore the following two
characteristics of real cloud environments, which leads to poor
status prediction performance: 1. the cloud environment changes
dynamically; 2. the cloud environment consists of many
heterogeneous physical and virtual machines. In this paper, we
investigate the predictive power of data collected from the cloud
environment, and propose a simple yet general machine learning
model, StaP, to predict multiple machine statuses. We introduce the
motivation, the model development, and the optimization of the
proposed StaP. The experimental results validate the effectiveness
of the proposed StaP.
Machine Status Prediction for Dynamic and Heterogenous Cloud Environment
Training the model amounts to finding the parameters Θ = {W, V}
that maximize Eq. 3. With elementary algebraic manipulations, the
training objective becomes:
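$$
\Theta^{*} = \arg\max_{W, V} \sum_{n=1}^{N} \Bigg[ \Big( \sum_{m \in S_n} r_{m,n} v_m \Big)^{T} W y_n - \sum_{m \in S_n} r_{m,n} \log \sum_{m \in S_n} \exp\big( v_m^{T} W y_n \big) \Bigg] \tag{4}
$$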
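To make the objective concrete, the bracketed term of Eq. 4 for a single machine can be evaluated as in the following minimal Python sketch. The function name `machine_objective`, the helper names, and the list-based data layout are illustrative assumptions, not part of the paper; the log-sum-exp is shifted by its maximum only for numerical stability.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * z for x, z in zip(a, b))


def matvec(W, y):
    """Product W y for a matrix stored as a list of rows."""
    return [dot(row, y) for row in W]


def machine_objective(S, r, V, W, y):
    """Bracketed term of Eq. 4 for one machine (hypothetical helper).

    S: item ids collected on the machine
    r: r[m] is the weight r_{m,n} of item m
    V: V[m] is the item vector v_m
    W: matrix mapping a status vector into the item space
    y: status vector y_n of the machine
    """
    Wy = matvec(W, y)
    scores = [dot(V[m], Wy) for m in S]            # v_m^T W y for each item
    linear = sum(r[m] * s for m, s in zip(S, scores))
    top = max(scores)                              # shift for stability
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    return linear - sum(r[m] for m in S) * log_norm
```

Summing this value over all machines gives the quantity maximized in Eq. 4. The inner sum over the items of the machine has to be recomputed for every machine, which is where the cost of direct optimization comes from.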
Because optimizing Eq. 4 directly is computationally expensive,
we resort to negative sampling [5] for efficiency. The
optimization procedure is shown in Algorithm 1, where σ(x)
denotes the logistic function.
Algorithm 1: Learning algorithm
Input: $R$, $Y = \{y_n\}$, learning rate $\eta$, maximum iterations maxIt, sampling number $k$;
Output: $W$, $V$;
Initialize $W$, $V$, $t = 0$; define $\phi_n = \sum_{m \in S_n} r_{m,n} v_m$;
while $t{+}{+} <$ maxIt do
    for $n = 1$; $n \le N$; $n{+}{+}$ do
        $W = W + \eta\, \sigma(-\phi_n^{T} W y_n)\, \phi_n y_n^{T}$;
        for $m \in S_n$ do
            $v_m = v_m + \eta\, \sigma(-\phi_n^{T} W y_n)\, r_{m,n} W y_n$;
        for $i = 1$; $i \le k$; $i{+}{+}$ do
            sample negative sample $y_i$;
            $W = W - \eta\, \sigma(\phi_n^{T} W y_n)\, \phi_n y_i^{T}$;
            for $m \in S_n$ do
                $v_m = v_m - \eta\, \sigma(\phi_n^{T} W y_n)\, r_{m,n} W y_i$;
    Update $W = W - 2\lambda\eta W$, $V = V - 2\lambda\eta V$;
return $W$, $V$;
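The update rules of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `train`, `update` and `score`, the uniform initialization, the default hyperparameters, and the uniform choice of negative statuses are all hypothetical. The sketch also evaluates the logistic function on the sampled status $y_i$ in the negative step, as in standard negative sampling [5], whereas Algorithm 1 as printed writes $y_n$ there.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def score(items, weights, V, W, y):
    """Bilinear score phi^T W y with phi = sum_m r_m * v_m."""
    dim, c = len(W), len(y)
    phi = [sum(weights[m] * V[m][a] for m in items) for a in range(dim)]
    return sum(phi[a] * W[a][b] * y[b] for a in range(dim) for b in range(c))


def train(S, r, y_of, statuses, num_items, dim,
          eta=0.1, lam=1e-4, max_it=200, k=1, seed=0):
    """Negative-sampling training loop (assumed hyperparameter defaults)."""
    rng = random.Random(seed)
    c = len(statuses[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(c)] for _ in range(dim)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_items)]

    def update(n, y, sign):
        # sign=+1: observed status (ascent); sign=-1: sampled negative (descent).
        # Both gradients use the parameter values from before this step.
        phi = [sum(r[n][m] * V[m][a] for m in S[n]) for a in range(dim)]
        Wy = [sum(W[a][b] * y[b] for b in range(c)) for a in range(dim)]
        s = sum(phi[a] * Wy[a] for a in range(dim))
        g = sign * eta * sigmoid(-sign * s)
        for a in range(dim):
            for b in range(c):
                W[a][b] += g * phi[a] * y[b]
        for m in S[n]:
            for a in range(dim):
                V[m][a] += g * r[n][m] * Wy[a]

    for _ in range(max_it):
        for n in range(len(S)):
            update(n, y_of[n], +1)
            negatives = [y for y in statuses if y != y_of[n]]
            for _ in range(k):
                update(n, rng.choice(negatives), -1)
        # L2 shrinkage: W = W - 2*lam*eta*W, and likewise for V.
        for M in (W, V):
            for row in M:
                for j in range(len(row)):
                    row[j] *= 1.0 - 2.0 * lam * eta
    return W, V
```

Each machine contributes one positive step and `k` negative steps per pass, so the per-machine cost grows with `k` rather than with the number of terms inside the logarithm of Eq. 4.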
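With the trained parameters $W$ and $V$, we can predict the status $y_z$
of a new machine $u_z$ according to

$$
\begin{aligned}
y_z^{*} &= \arg\max_{y \in Y} p(y \mid S_z) = \arg\max_{y \in Y} \tilde{p}(y)\, p(S_z \mid y) \\
&= \arg\max_{y \in Y} \Bigg[ \log \tilde{p}(y) + \Big( \sum_{m \in S_z} r_{m,z} v_m \Big)^{T} W y - \sum_{m \in S_z} r_{m,z} \log \sum_{m \in S_z} \exp\big( v_m^{T} W y \big) \Bigg],
\end{aligned} \tag{5}
$$

where $\tilde{p}(y)$ is the empirical distribution of the machine status
representation $y$ given by $R$.
As the terms $\big( \sum_{m \in S_z} r_{m,z} v_m \big)^{T} W$, $\sum_{m \in S_z} r_{m,z}$ and
$v_m^{T} W$ do not depend on $y$, they can be computed once and reused for
every candidate $y$, so obtaining $y_z^{*}$ is cheap. In Eq. 5, the
empirical distribution $\tilde{p}(y)$ can be considered as the prior
probability of $y$, and $p(S_z \mid y)$ is closely related to the
likelihood function.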
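The prediction rule of Eq. 5 can be sketched as below, with the three terms that do not depend on $y$ computed once before the loop over candidate statuses. The function name `predict_status`, the list-based layout of `V` and `W`, and the representation of the prior as a list aligned with `statuses` are assumptions made for illustration.

```python
import math


def predict_status(S_z, r_z, V, W, statuses, prior):
    """Return the index of the status maximizing Eq. 5 (hypothetical helper).

    S_z: item ids of the new machine; r_z[m]: weight of item m
    V[m]: item vector v_m; W: dim x c matrix (list of rows)
    statuses: candidate status vectors; prior[i]: empirical p(statuses[i]) > 0
    """
    dim, c = len(W), len(W[0])
    # Terms that are constant for all y, computed once.
    phi = [sum(r_z[m] * V[m][a] for m in S_z) for a in range(dim)]
    phi_W = [sum(phi[a] * W[a][b] for a in range(dim)) for b in range(c)]
    items_W = [[sum(V[m][a] * W[a][b] for a in range(dim)) for b in range(c)]
               for m in S_z]
    r_total = sum(r_z[m] for m in S_z)

    best_index, best_value = None, -math.inf
    for i, y in enumerate(statuses):
        linear = sum(p * yb for p, yb in zip(phi_W, y))
        item_scores = [sum(p * yb for p, yb in zip(row, y)) for row in items_W]
        top = max(item_scores)  # shift for a stable log-sum-exp
        log_norm = top + math.log(sum(math.exp(s - top) for s in item_scores))
        value = math.log(prior[i]) + linear - r_total * log_norm
        if value > best_value:
            best_index, best_value = i, value
    return best_index
```

For example, with two items whose vectors each align with one of two statuses, the machine is assigned the status of its more heavily weighted item unless the prior strongly favors the other status.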
III. EXPERIMENT
The experimental dataset contains 210,000 ratings given by
1,075 users to 2,000 books. Each user has individual
information, such as gender and age group, together with a rating
list. We choose this dataset because a user, the user's rating
list, and the user's individual information can be mapped to a
virtual machine, its set of collected items, and its two different
future statuses, respectively.
We employ POP and SNE as baseline models, and weighted F1
and Hamming Loss as evaluation metrics [4]. Both metrics
are commonly used in multi-task multi-class classification
problems, which are similar to status prediction tasks in the
cloud environment.
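For reference, the two metrics can be computed as in the sketch below. It assumes the standard definitions: weighted F1 averages per-class F1 scores weighted by class support (computed separately for each task), and Hamming Loss is the fraction of individual status labels predicted incorrectly across all tasks. The exact variants used in [4] may differ, and the function names are illustrative.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 for one task."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / count
        denom = precision + recall
        f1 = 2.0 * precision * recall / denom if denom else 0.0
        total += count * f1
    return total / len(y_true)


def hamming_loss(true_rows, pred_rows):
    """Fraction of wrong labels; each row holds one label per task."""
    wrong = sum(t != p
                for tr, pr in zip(true_rows, pred_rows)
                for t, p in zip(tr, pr))
    return wrong / sum(len(tr) for tr in true_rows)
```

Higher weighted F1 is better, while lower Hamming Loss is better, which matters when reading the two halves of a results table side by side.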
TABLE I
PERFORMANCE COMPARISON.

Training     weighted F1              Hamming Loss
ratio (%)    POP     SNE     StaP     POP     SNE     StaP
50           0.095   0.213   0.278    0.464   0.469   0.467
70           0.096   0.315   0.350    0.489   0.463   0.458
90           0.096   0.367   0.379    0.451   0.452   0.443
The experimental results are shown in Table I. StaP achieves
the highest weighted F1 at every training ratio, and the lowest
Hamming Loss at the two larger training ratios; at the 50% ratio,
its Hamming Loss (0.467) is slightly higher than that of POP
(0.464) but lower than that of SNE (0.469). Overall, these results
support the assumption that the proposed StaP is a more suitable
model for predicting the status of virtual or physical machines
from data collected from the cloud environment.
IV. CONCLUSION
In this paper, we address the problem of virtual machine
status prediction for dynamic and heterogeneous cloud
environments. More specifically, we investigate the predictive
power of data collected on different items from the cloud
environment and propose a simple yet general machine learning
model, StaP, to automatically learn the representations of
different items and the correlations among them, and to predict
multiple statuses in real time. The experimental results validate
the effectiveness of the proposed model.
ACKNOWLEDGMENT
This work was supported by NSFC (61272521), NSFC
(61571066, 2016.01-2019.12), and the Fundamental Research
Funds for the Central Universities (2016RC19).
REFERENCES
[1] M. Dong, H. Li, K. Ota, L. T. Yang, and H. Zhu, “Multicloud-based
evacuation services for emergency management,” Cloud Computing,
IEEE, vol. 1, no. 4, pp. 50–59, 2014.
[2] P. Gill, N. Jain, and N. Nagappan, “Understanding network failures in data
centers: measurement, analysis, and implications,” in ACM SIGCOMM
Computer Communication Review, vol. 41, pp. 350–361, ACM, 2011.
[3] J. Liu, S. Wang, A. Zhou, S. Kumar, F. Yang, and R. Buyya, “Using
proactive fault-tolerance approach to enhance cloud service reliability,”
IEEE Transactions on Cloud Computing, pp. 1–13, 2016.
[4] P. Wang, J. Guo, Y. Lan, J. Xu, and X. Cheng, “Your cart tells you:
Inferring demographic attributes from purchase data,” in Proceedings of
ACM International Conference on Web Search and Data Mining (WSDM),
pp. 251–260, ACM, 2016.
[5] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean,
“Distributed representations of words and phrases and their compositionality,”
in Proceedings of Advances in Neural Information Processing Systems
(NIPS), pp. 3111–3119, 2013.