
Data source selection for approximate query


Exact querying over big data is challenging because of the large number of autonomous data sources. This paper proposes an efficient method for selecting sources on big data for approximate query. A gain model for source selection is presented that accounts for the information coverage and quality the sources provide. Under this model, source selection is formalized as two optimization problems. Because both problems are NP-hard, two approximation algorithms are devised, one for each, and their approximation ratios and complexities are analyzed. To further improve efficiency, a randomized method is developed for gain estimation; with it, the time complexities of the improved algorithms are sub-linear in the number of data items. Experimental results show that the proposed algorithms are highly efficient and scalable.
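
The abstract does not spell out the gain model or the two algorithms, so the following is only a minimal illustrative sketch of the general idea, not the paper's method. It assumes a coverage-only gain (the number of new data items a source contributes), a plain greedy selection of up to k sources, and a sampling-based estimate of that gain. The names greedy_select, coverage_gain, and estimate_gain, and the toy data, are hypothetical; quality, which the paper's model also considers, is left out.

```python
import random

def coverage_gain(covered, source_items):
    """Marginal gain under the assumed coverage-only model:
    how many items of the source are not covered yet."""
    return len(source_items - covered)

def greedy_select(sources, k):
    """Greedily pick up to k sources by largest marginal coverage gain.

    sources: dict mapping a source name to the set of items it holds.
    Returns the chosen names (in pick order) and the covered item set.
    """
    covered = set()
    chosen = []
    remaining = dict(sources)
    for _ in range(k):
        if not remaining:
            break
        # Iterate names in sorted order so ties are broken deterministically.
        best = max(sorted(remaining),
                   key=lambda name: coverage_gain(covered, remaining[name]))
        if coverage_gain(covered, remaining[best]) == 0:
            break  # no remaining source adds anything new
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered

def estimate_gain(covered, source_list, sample_size, rng):
    """Estimate the marginal gain from a uniform sample (with replacement)
    instead of scanning every item of the source.

    source_list: the source's items as a list, so sampling is by index.
    """
    if not source_list or sample_size <= 0:
        return 0.0
    hits = sum(1 for _ in range(sample_size)
               if rng.choice(source_list) not in covered)
    # Scale the sampled fraction of new items up to the source size.
    return hits / sample_size * len(source_list)

# Toy usage with made-up sources.
toy_sources = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}}
picked, covered_items = greedy_select(toy_sources, 2)
```

The estimator touches only sample_size items per source, which is the kind of shortcut that can make gain evaluation independent of the source's size; how the paper's randomized method actually achieves its sub-linear bound is not stated in the abstract.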

Keywords: data source; source selection; approximate query; query

Journal Title: Journal of Combinatorial Optimization
Year Published: 2021
