Self-organized dynamic provisioning for big data

The recent rapid expansion of datasets in big data problems has produced data sizes that exceed the processing capabilities of available distributed computing power. In other words, we are producing more data than we can process. In addition, further analysis of a dataset's collective state may require duplicating, transferring, and distributing the data, increasing the scale of the problem. Orchestrating these steps in large-scale complex systems is non-trivial. One basic technique for minimizing the effects of data redistribution is to use dynamic resource provisioning environments. When the node organization and structure are dynamic and eclectic, provisioning environments require up-to-date information about resource availability. Keeping the available resource state fresh in centralized or hierarchical scheduling systems imposes a network communication overhead. Centralization also introduces administrative barriers, limiting interoperability. One effective method for improving the extent of self-organization is to take feedback from the system. Based on this feedback, nodes can alter their behavior to respond better to changing characteristics in dynamic resource provisioning environments. In this article, we present a decentralized scheduling framework that takes feedback from the system and adjusts its behavior accordingly. Our framework presents an enabling mechanism for self-organization, where each cloud node adapts its behavior based on the feedback. Compared to the centralized resource provisioning solutions in current cloud systems, this approach achieves comparable scheduling decisions with half the packet overhead. We show that by taking advantage of spatial locality with dynamic provisioning, and through the better scheduling decisions our framework makes, the data processing overhead of big data problems can be reduced by at least 30% in general, and by up to 55% in particular resource distributions. This, in turn, results in efficient scheduling decisions that provision better resources for big data tasks.
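
The abstract does not spell out the framework's mechanics, so the following is only a minimal sketch of the general idea of a node adapting its scheduling choice from feedback. It is not the authors' algorithm. The `Node` class, the `alpha` smoothing weight, and the use of an exponential moving average over reported load are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node that keeps its own local view of peer load."""
    name: str
    alpha: float = 0.5  # assumed weight given to the newest feedback
    load_estimate: dict = field(default_factory=dict)

    def record_feedback(self, peer: str, observed_load: float) -> None:
        """Blend a load report into the local estimate (moving average)."""
        old = self.load_estimate.get(peer, observed_load)
        self.load_estimate[peer] = (1 - self.alpha) * old + self.alpha * observed_load

    def pick_peer(self) -> str:
        """Choose the peer currently believed to be least loaded."""
        if not self.load_estimate:
            raise ValueError("no feedback received yet")
        return min(self.load_estimate, key=self.load_estimate.get)


node = Node("a")
node.record_feedback("b", 0.9)
node.record_feedback("c", 0.2)
node.record_feedback("b", 0.1)  # peer b later reports that it has freed up
print(node.pick_peer())  # c
```

Each node decides from its own estimates, without querying a central scheduler. In this sketch that is the only sense in which the scheduling is decentralized.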

Keywords: big data; resource provisioning; resource; dynamic provisioning; provisioning environments

Journal Title: Cluster Computing
Year Published: 2017
