A Distributed Scheduling Framework of Service Based ETL Process

The service-oriented computing paradigm and ETL (Extract-Transform-Load) technology have recently received significant attention as a means of building data warehouses and integrating data. To improve the scheduling and execution efficiency of service-based ETL processes, this paper proposes a distributed scheduling and execution framework for ETL processes, together with a corresponding method. First, different weights are assigned to ETL processes to ensure the loading efficiency of core business data. Second, the scheduler selects executors according to their performance and load, then allocates ETL process execution requests using the greedy balance (GB) algorithm so that load is balanced across the executors. Third, each executor parses an ETL process into ETL services, then selects one or more executors to deploy and execute each ETL service according to a locality-aware strategy, that is, according to the amount of data involved and the network distance between the nodes the service touches, which reduces network overhead and improves execution efficiency. Finally, the effectiveness of the proposed method is verified by experimental comparison.
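
The abstract does not spell out the greedy balance (GB) algorithm or the locality-aware strategy, so the following is only a minimal sketch of how such steps might look, not the authors' method. It assumes that each request carries a priority weight and an estimated cost, that each executor has a relative capacity score, and that locality can be scored as data volume multiplied by network distance. The names `Executor`, `greedy_balance`, and `pick_local_executor`, and all parameters, are hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Executor:
    name: str
    capacity: float            # assumed relative performance score (higher = faster)
    load: float = 0.0          # total estimated cost assigned so far
    assigned: list = field(default_factory=list)


def greedy_balance(requests, executors):
    """Assign (process_id, weight, cost) requests to executors.

    Requests with a higher weight are placed first; each one goes to the
    executor whose load/capacity ratio is currently the lowest.
    """
    # Min-heap keyed on relative load; the index breaks ties deterministically.
    heap = [(e.load / e.capacity, i) for i, e in enumerate(executors)]
    heapq.heapify(heap)
    for pid, _weight, cost in sorted(requests, key=lambda r: (-r[1], -r[2])):
        _, i = heapq.heappop(heap)
        ex = executors[i]
        ex.load += cost
        ex.assigned.append(pid)
        heapq.heappush(heap, (ex.load / ex.capacity, i))
    return {e.name: e.assigned for e in executors}


def pick_local_executor(data_on_node, distance, candidates):
    """Pick the candidate node that minimises sum(data size * network distance)."""
    def transfer_cost(candidate):
        return sum(size * distance[node][candidate]
                   for node, size in data_on_node.items())
    return min(candidates, key=transfer_cost)


if __name__ == "__main__":
    executors = [Executor("A", capacity=2.0), Executor("B", capacity=1.0)]
    requests = [
        ("core_sales", 3, 4.0),   # (process id, weight, estimated cost)
        ("logs", 1, 2.0),
        ("core_orders", 3, 2.0),
        ("audit", 2, 2.0),
    ]
    print(greedy_balance(requests, executors))

    data_on_node = {"n1": 100, "n2": 10}          # data volume held per node
    distance = {"n1": {"n1": 0, "n2": 2},         # hop counts between nodes
                "n2": {"n1": 2, "n2": 0}}
    print(pick_local_executor(data_on_node, distance, ["n1", "n2"]))
```

In this sketch the core-business processes (weight 3) are placed before lower-weight ones, and the service is deployed on `n1` because most of its input data already resides there. A real scheduler would also need to refresh load and performance measurements over time, which the sketch leaves out.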

Keywords: ETL process; service based; based ETL; service; distributed scheduling

Journal Title: Big Data
Year Published: 2019
