Ted Dunning
Data security is usually framed in terms of how well you can prevent access. The other side of the coin, however, is giving the right people sufficient access to data so they can do their jobs. Solutions are available that apply consistent masking and selectively obscure fields and records.
Advanced analytics makes this problem much harder, however. De-anonymization technologies are available that can find correlations between diverse data sources, making it extremely risky to release even masked data. Unfortunately, data sufficiently detailed to allow effective modeling will be particularly vulnerable to de-anonymization attacks. Yet there is an advantage to being able to share data publicly: it encourages a wide range of groups to experiment, find new modeling technologies, and determine which ones are worth bringing in house.
In this talk, Ted will describe some new developments that avoid these problems in a provably secure fashion. The crux is a new method for developing synthetic data generators that make it feasible for outside researchers to test new methods without any risk to sensitive data. Ted will also describe specific examples of how this method has been used with several MapR customers.
Ted Dunning is Chief Application Architect at MapR Technologies and a committer and PMC member of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects. Ted has been very active in mentoring new Apache projects and is currently serving as vice president of incubation for the Apache Software Foundation. Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems. He built fraud detection systems for ID Analytics (later purchased by LifeLock), and he has 24 patents issued to date and a dozen pending. Ted has a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.