
Federated Data Cleaning: Collaborative and Privacy-Preserving Data Cleaning for Edge Intelligence

Machine learning algorithms are an important driving factor of emerging Internet-of-Things (IoT) applications, but they currently face the challenge of how to "clean" data noise that is introduced during the training process (e.g., by asynchronous execution and by lossy data compression and quantization). To guarantee data quality, various data cleaning approaches have been proposed that filter out abnormal data entries based on the global data distribution. However, most existing data cleaning approaches follow a centralized paradigm and thus cannot be applied to future edge-based IoT applications, where each edge node (EN) has only a limited view of the global data distribution. Moreover, the increasing demand for privacy preservation largely prevents ENs from combining their data for centralized cleaning. In this study, we propose a federated data cleaning protocol, called FedClean, for edge intelligence (EI) scenarios that is designed to achieve data cleaning without compromising data privacy. More specifically, the ENs first generate Boolean shares of their data and distribute them to two noncolluding servers. These two servers then run the FedClean protocol to privately and efficiently compute the attribute value frequency (AVF) scores of the collected data entries. The scores are sorted in ascending order via a bitonic sorting network without revealing their values, and data entries with lower AVF scores are considered abnormal and filtered out. The security, efficiency, and effectiveness of the proposed approach are demonstrated via concrete security analysis and comprehensive experiments.
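
The three building blocks named in the abstract (Boolean sharing, AVF scoring, and bitonic sorting) can be illustrated with a minimal Python sketch. It is not the FedClean protocol itself: the scoring and sorting below run in the clear on a single machine, whereas the paper describes two servers computing them over shares. The AVF score is assumed to follow the usual definition (the mean, over an entry's attributes, of how often each attribute value occurs in its column), and all function names (`share_boolean`, `reconstruct`, `avf_scores`, `bitonic_sort`, `flag_outliers`) are hypothetical.

```python
import secrets
from collections import Counter


def share_boolean(value, bits=32):
    """Split a non-negative integer into two XOR (Boolean) shares.

    Assumes value fits in `bits` bits; each share alone is uniformly random.
    """
    s0 = secrets.randbits(bits)
    s1 = s0 ^ value
    return s0, s1


def reconstruct(s0, s1):
    """Recombine two Boolean shares into the original value."""
    return s0 ^ s1


def avf_scores(entries):
    """Plaintext AVF: mean frequency of each entry's attribute values."""
    m = len(entries[0])
    counts = [Counter(row[j] for row in entries) for j in range(m)]
    return [sum(counts[j][row[j]] for j in range(m)) / m for row in entries]


def bitonic_sort(values):
    """Ascending bitonic sorting network; length must be a power of two.

    The sequence of compare-exchange positions depends only on the length,
    not on the data, which is what makes such networks suitable for
    sorting hidden values.
    """
    a = list(values)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:
        j = k // 2
        while j > 0:
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


def flag_outliers(entries, k):
    """Return the indices of the k entries with the lowest AVF scores."""
    scored = [(score, i) for i, score in enumerate(avf_scores(entries))]
    size = 1
    while size < len(scored):
        size *= 2
    # Pad to a power of two with sentinels that sort to the end.
    scored += [(float("inf"), -1)] * (size - len(scored))
    return [i for _, i in bitonic_sort(scored)[:k]]


if __name__ == "__main__":
    entries = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z")]
    print(avf_scores(entries))        # [2.5, 2.5, 2.0, 1.0]
    print(flag_outliers(entries, 1))  # [3]
    s0, s1 = share_boolean(1234)
    print(reconstruct(s0, s1))        # 1234
```

In this toy data set the entry `("b", "z")` uses values that appear nowhere else, so it receives the lowest score and is the one flagged. A privacy-preserving version would replace each comparison and count with a two-party computation over the shares, which this sketch does not attempt.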

Keywords: edge intelligence; data cleaning; federated data; privacy; edge

Journal Title: IEEE Internet of Things Journal
Year Published: 2021
