Inference Control and Privacy Preservation in Data Mining

<p>Suggested reviewers: Dan Simovici, Xintao Wu</p>
<p>Motivation: Recent years have seen a staggering increase in the volume of information exchanged over the internet and, with it, in the amount of personally identifiable information that is exchanged online or stored in data repositories. This raises concerns about individual privacy rights and how to protect them through regulation and technology. This module aims to shed light on current privacy and data protection issues and on some of the methods that help address them.</p>
<p>Target Audience: Senior computer science majors. The module can also be used at the graduate level with additional assignments.</p>
<p>Prerequisites: Databases, data structures, programming.</p>
<p>Module Objectives</p>
<p> Give students an overview of privacy concepts and requirements.</p>
<p> Present the techniques used to preserve private information.</p>
<p> Present current privacy-preserving techniques in data mining.</p>
<p>Module Organization</p>
<p>1. The Concept of Privacy (2 hours)</p>
<p> a) Definition of Privacy and Data Protection</p>
<p> b) Privacy and Security</p>
<p> c) Privacy and Legislation:</p>
<p> i. Legal: Individual Rights, Human Rights, Fourth Amendment, HIPAA</p>
<p> ii. Organizational Privacy</p>
<p> iii. Informational Privacy: Digital Identities and Online Privacy Regulations</p>
<p> d) Types of Privacy Attacks</p>
<p> e) Evaluating Privacy Techniques: Utility Functions, Disclosure Factor</p>
<p>2. Data-Centered Privacy Protection Methods (2 hours)</p>
<p> a) What Data to Hide</p>
<p> b) Data Partitioning: Horizontal versus Vertical</p>
<p> c) Data Modification: Aggregating, Blocking, Perturbation, Swapping, Sampling</p>
<p> d) Data Hiding: Cryptography-Based Techniques</p>
<p>3. Privacy-Preserving Data Mining Techniques (2 hours)</p>
<p> a) Data Obfuscation</p>
<p> b) Data Summarization</p>
<p> c) Data Separation</p>
<p> d) Inference Control</p>
<p> i. Confidential Data: Legal Requirements and Societal Expectations</p>
<p> ii. Data Aggregation</p>
<p> iii. Statistical Databases: Inference Control</p>
<p> iv. Conclusion</p>
<p> e) Privacy-Preserving Association Rule Mining:</p>
<p> i. Horizontal Data Partitioning</p>
<p> ii. The ID3 Algorithm</p>
<p>Exercise: Privacy-preserving association mining</p>
<p>Assume the data is horizontally partitioned:</p>
<p>– Each site has complete information on a set of entities</p>
<p>– Same attributes at each site</p>
<p>The goal is to avoid disclosing individual entities. Develop an efficient association mining algorithm that meets this goal.</p>
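<p>One possible starting point for the exercise, offered as a minimal sketch rather than the module's intended solution: a ring-based secure sum lets the sites learn the global support count of a candidate itemset without any site revealing its local count. The function names (<code>local_support</code>, <code>secure_sum</code>, <code>globally_frequent</code>), the public bound <code>MODULUS</code>, and the in-process simulation of the ring are all assumptions made for illustration. The sketch also assumes at least three semi-honest, non-colluding sites, and it still reveals the global count and the total number of transactions.</p>

```python
import random

# Assumed public upper bound on any global total (hypothetical choice).
MODULUS = 2 ** 32


def local_support(transactions, itemset):
    """Count the transactions at one site that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def secure_sum(local_values, modulus=MODULUS, rng=None):
    """Simulate a ring-based secure sum over the sites' private values.

    Site 0 adds a random mask to its value and passes the result on. Each
    later site sees only a masked running total, adds its own value, and
    forwards it. Site 0 finally removes the mask to obtain the true sum.
    """
    rng = rng or random.SystemRandom()
    mask = rng.randrange(modulus)
    running = (mask + local_values[0]) % modulus
    for value in local_values[1:]:
        running = (running + value) % modulus
    return (running - mask) % modulus


def globally_frequent(sites, itemset, min_support):
    """Decide whether itemset meets min_support over the union of all sites."""
    counts = [local_support(site, itemset) for site in sites]
    sizes = [len(site) for site in sites]
    total_count = secure_sum(counts)   # per-site counts stay hidden
    total_size = secure_sum(sizes)     # per-site sizes stay hidden
    return total_count >= min_support * total_size


# Toy horizontally partitioned data: same attributes (items) at every site.
site_a = [{"bread", "milk"}, {"bread"}, {"milk", "eggs"}]
site_b = [{"bread", "milk", "eggs"}, {"bread", "milk"}]
site_c = [{"eggs"}, {"bread", "milk"}, {"milk"}, {"bread", "eggs"}]
sites = [site_a, site_b, site_c]
```

<p>Calling <code>globally_frequent(sites, {"bread", "milk"}, 0.4)</code> checks one candidate itemset. An Apriori-style loop could call it for each candidate at each level, and a full answer to the exercise would also need to address how candidates are generated and shared without disclosing which site contributed them.</p>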
