Cost-aware multi data-center bulk transfers in the cloud from a customer-side perspective
JL García-Dorado, SG Rao - IEEE Transactions on Cloud Computing, 2015 - ieeexplore.ieee.org
Many cloud applications (e.g., data backup and replication, video distribution) require dissemination of large volumes of data from a source data-center to multiple geographically distributed data-centers. Given the high costs of wide-area bandwidth, the overall cost of inter-data-center communication is a major concern in such scenarios. While previous works have focused on optimizing the costs of bulk transfer, most of them use the charging models of Internet service providers, typically based on the 95th percentile of bandwidth consumption. However, public Cloud Service Providers (CSPs) charge their customers under very different models. First, the cost for transmission is flat and depends on the location of the source and receiver data-centers. Second, CSPs offer discounts once customer transfers exceed certain volume thresholds per data-center. We present a systematic framework, CloudMPcast, that exploits these two aspects of cloud pricing schemes. CloudMPcast constructs overlay distribution trees for bulk-data transfer that both optimize the dollar cost of distribution and ensure that end-to-end data transfer times are not affected. CloudMPcast monitors TCP throughputs between data-centers and only proposes alternative trees that respect the original transfer times. An extensive measurement study shows cost savings ranging from 10 to 60 percent for both Azure and EC2 infrastructures, which potentially translates to millions of dollars a year assuming realistic demands.
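The two pricing aspects named in the abstract can be illustrated with a minimal sketch. This is not the CloudMPcast algorithm. All names (`tiered_cost`, `tree_cost`, `EXAMPLE_TIERS`), the data-center labels, and every price are hypothetical, and the sketch makes three simplifying assumptions: each tier's price applies only to the volume inside that tier, the price depends only on the sending data-center, and the transfer-time constraint is not modeled.

```python
from typing import Dict, List, Tuple

# (upper bound of the tier in GB, price in dollars per GB).
# Hypothetical values for illustration, not real CSP prices.
Tier = Tuple[float, float]
EXAMPLE_TIERS: List[Tier] = [(10_000, 0.12), (50_000, 0.09), (float("inf"), 0.07)]


def tiered_cost(volume_gb: float, tiers: List[Tier]) -> float:
    """Cost of sending volume_gb, assuming each tier's price applies
    only to the portion of the volume that falls inside that tier."""
    cost, lower = 0.0, 0.0
    for upper, price in tiers:
        if volume_gb <= lower:
            break
        cost += (min(volume_gb, upper) - lower) * price
        lower = upper
    return cost


def tree_cost(
    edges: List[Tuple[str, str]],
    size_gb: float,
    tiers_by_dc: Dict[str, List[Tier]],
) -> float:
    """Cost of a distribution tree in which every edge (sender, receiver)
    carries one full copy of the data and the sender is charged."""
    sent: Dict[str, float] = {}
    for sender, _receiver in edges:
        sent[sender] = sent.get(sender, 0.0) + size_gb
    return sum(tiered_cost(volume, tiers_by_dc[dc]) for dc, volume in sent.items())


# Hypothetical scenario: source A is expensive, relay B is cheap.
PRICES = {"A": [(float("inf"), 0.12)], "B": [(float("inf"), 0.05)]}
STAR = [("A", "B"), ("A", "C"), ("A", "D")]     # source sends every copy
OVERLAY = [("A", "B"), ("B", "C"), ("B", "D")]  # B relays to C and D

star_dollars = tree_cost(STAR, 1_000, PRICES)
overlay_dollars = tree_cost(OVERLAY, 1_000, PRICES)
```

Under these made-up prices, distributing 1,000 GB directly from A costs about $360, while relaying through B costs about $220. A real system would also have to check, as the abstract states, that the alternative tree does not lengthen the end-to-end transfer time.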