TY - JOUR
AU - Raj, Helen Wilfred
AU - Balachandran, Santhi
PY - 2019
TI - A Survey of Data Anonymization Techniques for Privacy-Preserving Mining in Bigdata
JF - Journal of Computer Science
VL - 16
IS - 2
DO - 10.3844/jcssp.2020.194.201
UR - https://thescipub.com/abstract/jcssp.2020.194.201
AB - The big data era is seeing data grow along several dimensions at once, commonly expressed as the 4Vs (Volume, Velocity, Variety, Veracity). When inferring information from data, care must be taken not to reveal the identity of the data owner, since doing so breaches the owner's privacy rights. Information can leak at the point of data collection, in data storage, during distribution of the data to data users/miners, and finally through the published results. Cross-matching these leakage points with the 4Vs of big data (a list that is still growing) makes it a major challenge to extract the maximum possible information without compromising the privacy of the data owner. The original data should be anonymized at one or more of these stages before being handed over for mining. This work surveys the anonymization techniques used to transform data so that the privacy of the data owner is not compromised. In addition, any sample drawn should resemble and represent the original dataset in as many dimensions as possible. The results of the various methodologies are analyzed and the observations are presented.