Improving MapReduce privacy by implementing multi-dimensional sensitivity-based anonymization

Big data is predominantly associated with data retrieval, storage, and analytics. Data analytics is prone to privacy violations and data disclosures, which can be partly attributed to the multi-user characteristics of big data environments. Adversaries may link data to external resources, try to access confidential data, or deduce private information from the large number of data pieces that they can obtain. Data anonymization can address some of these concerns by providing tools to mask and conceal vulnerable data. Currently available anonymization methods, however, cannot efficiently accommodate the scalability, granularity, and performance demands of big data. In this paper, we introduce a novel framework that implements SQL-like Hadoop ecosystems, incorporating Pig Latin with the additional splitting of data. The splitting reduces data masking and increases the information gained from the anonymized data. Our solution provides fine-grained masking and concealment based on the access-level privileges of the user. We also introduce a simple classification technique that can accurately measure the extent of anonymization in any anonymized data. We also discuss the results of testing this classification technique and the proposed sensitivity-based anonymization method on different samples. These results show the significant benefits of the proposed approach, particularly the reduced information loss associated with the anonymization process.
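
To make the idea of masking by access-level privilege more concrete, the following is a minimal Python sketch and not the paper's framework, which is built on Hadoop and Pig Latin. The column names, the sensitivity levels in `SENSITIVITY`, the `*` mask symbol, and the `masked_fraction` measure are all hypothetical choices made for illustration. In particular, `masked_fraction` is only a crude stand-in for an anonymization-extent measure and is not the classification technique the abstract refers to.

```python
from typing import Dict, List

# Hypothetical sensitivity level per column (higher = more sensitive).
# These names and numbers are assumptions, not taken from the paper.
SENSITIVITY: Dict[str, int] = {"name": 3, "zip": 2, "age": 1, "diagnosis": 0}

MASK = "*"  # hypothetical mask symbol


def mask_record(record: Dict[str, str], access_level: int) -> Dict[str, str]:
    """Mask every field whose sensitivity exceeds the user's access level.

    Fields without a declared sensitivity are treated as most sensitive,
    so they are always masked.
    """
    return {
        field: (value if SENSITIVITY.get(field, float("inf")) <= access_level else MASK)
        for field, value in record.items()
    }


def masked_fraction(records: List[Dict[str, str]]) -> float:
    """Share of cells that are masked; a crude proxy for information loss."""
    cells = [value for record in records for value in record.values()]
    return sum(value == MASK for value in cells) / len(cells) if cells else 0.0


rows = [
    {"name": "Alice", "zip": "90210", "age": "34", "diagnosis": "flu"},
    {"name": "Bob", "zip": "10001", "age": "51", "diagnosis": "asthma"},
]

# A low-privilege user sees less than a high-privilege user.
low = [mask_record(r, access_level=1) for r in rows]
high = [mask_record(r, access_level=3) for r in rows]

print(low[0])
print(masked_fraction(low), masked_fraction(high))
```

In this sketch the same dataset yields different views depending on the caller's privilege, so a more trusted user loses less information than a less trusted one.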

Keywords: sensitivity based; big data; anonymization; based anonymization; privacy

Journal Title: Journal of Big Data
Year Published: 2017
