Backup or Not: An Online Cost Optimal Algorithm for Data Analysis Jobs Using Spot Instances

Recently, large-scale public cloud providers have begun to offer spot instances. This type of instance has become popular with a growing number of cloud users because of its convenient access mode and low price, especially for big data analysis jobs with high-performance computing requirements. However, spot instances are generally unstable: they may be interrupted, and re-executing the interrupted job incurs extra cost. This cost can be greatly reduced if a backup is made at the right time before an interruption. For convenience and cost efficiency, users can store backup data files for later job recovery in the StaaS (Storage-as-a-Service) storage offered by the same cloud provider whose spot instances they use. Because backing up too often also increases cost, users need to time their backups according to when an interruption will occur. However, it is hard to know or predict precisely when such an interruption will occur. To solve this problem, in this article we propose an online algorithm that guides cloud users in making backups when using spot instances to execute big data analysis jobs, without requiring any information about future interruptions. We prove theoretically that the proposed online algorithm guarantees a bounded competitive ratio of less than 2. Finally, extensive experiments verify the effectiveness of the algorithm in reducing the additional cost caused by interruptions, and show that it still achieves stable cost optimization even when interruptions occur frequently.
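
The abstract does not spell out the algorithm itself, so the following is only an illustrative sketch of the trade-off it describes, not the authors' method. It assumes a simple break-even rule in the style of the classic ski-rental problem: make a backup once re-running the unsaved work would cost at least as much as one backup. The names `should_backup` and `run_job`, the hour-granular cost model, and all parameters are hypothetical, and no competitive-ratio guarantee is claimed for this rule.

```python
def should_backup(unsaved_hours: float, price_per_hour: float, backup_cost: float) -> bool:
    """Assumed break-even rule: back up once the unsaved work is worth a backup."""
    return unsaved_hours * price_per_hour >= backup_cost


def run_job(job_hours: int, price_per_hour: float, backup_cost: float,
            interrupt_at: set[int]) -> float:
    """Return the extra cost (backups plus lost work) of finishing a job.

    interrupt_at holds the wall-clock hours at whose end the spot instance
    is interrupted. The decision rule never looks at this set in advance,
    which is what makes the policy "online".
    """
    done = 0    # hours of progress currently held (saved + unsaved)
    saved = 0   # hours of progress protected by the latest backup
    clock = 0   # wall-clock hours elapsed
    extra = 0.0
    while done < job_hours:
        clock += 1
        done += 1
        if clock in interrupt_at:
            # Work since the last backup is lost and must be paid for again.
            extra += (done - saved) * price_per_hour
            done = saved
        elif should_backup(done - saved, price_per_hour, backup_cost):
            extra += backup_cost
            saved = done
    return extra


if __name__ == "__main__":
    # Hypothetical numbers: 10-hour job, 1.0 per instance-hour, 3.0 per backup.
    with_backups = run_job(10, 1.0, 3.0, {5})
    without_backups = run_job(10, 1.0, float("inf"), {5})
    print(with_backups, without_backups)
```

Setting `backup_cost` to infinity disables backups, which gives a baseline to compare against. Varying the interruption times in `interrupt_at` shows how the balance between backup cost and re-execution cost shifts, which is the decision problem the abstract describes.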

Keywords: using spot; spot; algorithm; cost; spot instances; data analysis

Journal Title: IEEE Access
Year Published: 2020
