Preserving Data Clustering with Expectation Maximization Algorithm

Journal Title: Journal of Information Systems and Telecommunication - Year 2016, Vol 4, Issue 3

Abstract

Data mining and knowledge discovery are important technologies for business and research. Despite their benefits in areas such as marketing, business and medical analysis, data mining techniques can also create new threats to privacy and information security. A class of data mining methods called privacy preserving data mining (PPDM) has therefore been developed. Research in this field aims to develop techniques that can be applied to databases without violating the privacy of individuals. In this work we introduce a new approach that uses fuzzy logic to preserve sensitive information in databases with both numerical and categorical attributes. We map a database into a new one that conceals private information while preserving the benefits of mining. In the proposed method, we apply fuzzy membership functions (MFs) such as Gaussian, P-shaped, Sigmoid, S-shaped and Z-shaped to private data, and then cluster the modified datasets with the Expectation Maximization (EM) algorithm. Our experimental results show that using fuzzy logic to preserve data privacy maintains valid clustering results while protecting sensitive information. The accuracy of the clustering algorithm on the fuzzy data is approximately equal to its accuracy on the original data and is better than that of state-of-the-art methods in this field.
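The pipeline outlined in the abstract (transform a private attribute with a fuzzy membership function, then cluster with EM) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the membership-function parameters, the handling of categorical attributes, or the EM configuration, so the breakpoints `a=15`, `b=70`, the toy `ages` data, and the two-component one-dimensional mixture are hypothetical and are not the authors' implementation.

```python
import math

def gaussian_mf(x, mean, sigma):
    """Gaussian membership: 1 at `mean`, decaying symmetrically."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

def sigmoid_mf(x, slope, center):
    """Sigmoid membership: 0.5 at `center`, rising with `slope`."""
    return 1.0 / (1.0 + math.exp(-slope * (x - center)))

def s_shaped_mf(x, a, b):
    """S-shaped membership: 0 at or below a, 1 at or above b, smooth between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= (a + b) / 2.0:
        return 2.0 * ((x - a) / (b - a)) ** 2
    return 1.0 - 2.0 * ((x - b) / (b - a)) ** 2

def z_shaped_mf(x, a, b):
    """Z-shaped membership: mirror image of the S-shaped function."""
    return 1.0 - s_shaped_mf(x, a, b)

def em_two_gaussians(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM; return hard labels."""
    lo, hi = min(values), max(values)
    means = [lo, hi]                                  # start at the extremes
    variances = [max((hi - lo) ** 2 / 4.0, 1e-6)] * 2
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [w * math.exp(-((v - m) ** 2) / (2.0 * s))
                    / math.sqrt(2.0 * math.pi * s)
                    for w, m, s in zip(weights, means, variances)]
            total = sum(dens) or 1e-300               # guard against underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2
                      for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)             # keep variance positive
    return [0 if r[0] >= r[1] else 1 for r in resp]

# Hypothetical private attribute: two well-separated age groups.
ages = list(range(20, 31)) + list(range(55, 66))

# Publish membership degrees instead of the raw ages (assumed breakpoints).
fuzzy_ages = [s_shaped_mf(age, 15, 70) for age in ages]

labels_original = em_two_gaussians(ages)
labels_fuzzy = em_two_gaussians(fuzzy_ages)
agreement = sum(o == f for o, f in zip(labels_original, labels_fuzzy)) / len(ages)
print("cluster agreement between original and fuzzified data:", agreement)
```

The S-shaped function is monotone, so the fuzzified values keep the ordering of the original values while hiding their magnitudes, which is one plausible reason clustering structure can survive the transformation. This toy comparison says nothing about the accuracy figures reported in the paper.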

Authors and Affiliations

Leila Jafar Tafreshi, Farzin Yaghmaee

  • EP ID EP184050
  • DOI 10.7508/jist.2016.03.004

How To Cite

Leila Jafar Tafreshi, Farzin Yaghmaee (2016). Preserving Data Clustering with Expectation Maximization Algorithm. Journal of Information Systems and Telecommunication, 4(3), 167-173. https://europub.co.uk/articles/-A-184050