Mining massive amounts of personal data can provide crucial insights into important questions asked by scientists, sociologists, and public policy makers. But behind each data point, there’s a real human, demanding privacy.
The Defense Advanced Research Projects Agency (DARPA) has awarded Hay, an assistant professor of computer science, nearly $500,000 to participate in Project Brandeis, a new program that challenges researchers from across the country to develop systems that facilitate data analysis while preserving privacy.
“In the era of big data, there are many examples where data mining technologies have yielded useful insights into messy, complex data,” Hay said. “However, there are also instances where these same technologies are misapplied and even abused.”
Hay’s research is part of a $2.8 million team effort led by scientists at UMass Amherst. In the months ahead, the team will attempt to build systems that achieve what cryptographers have defined as differential privacy: query results that are statistically true but not precise enough to allow hackers to link real people with otherwise anonymous data points.
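To make the idea concrete, here is a minimal sketch of one textbook way to answer a counting query with differential privacy, the Laplace mechanism. It is not the team's system, which the article does not describe. The function names, the sample ages, and the privacy parameter `epsilon` are assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential samples that each have mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that match `predicate`.

    Adding or removing one person changes a count by at most 1, so the
    noise scale is 1 / epsilon. A smaller epsilon means more noise and
    stronger privacy.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: ages of ten people (illustrative only).
ages = [23, 35, 41, 52, 29, 64, 47, 38, 55, 31]
rng = random.Random(0)
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of people aged 40 or older: {noisy:.2f}")
```

Any single answer is deliberately imprecise, which is what keeps one person's presence in the data from being inferred, while the average over many such answers stays close to the true count. A real deployment would also need to track the total privacy budget spent across queries and to guard against floating-point weaknesses, both of which this sketch ignores.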
Hay and his undergraduate research assistants will help design the system architecture, code a prototype, and collaborate with other Brandeis Project teams to integrate that prototype into a larger demonstration system.
“The DARPA program is not simply funding research,” Hay said. “Instead, each research group that receives funding is expected to work collaboratively with other research groups and develop experimental systems that bring our technologies together.”
The Brandeis Project taps Hay’s strengths. Using his advanced understanding of computer science, he has already built systems that make it possible for researchers to analyze data while protecting individual privacy.
“We are at a point where there are now many algorithms for doing privacy-preserving data analysis,” Hay said. “However, these algorithms are complex and often require specialized knowledge to apply them correctly — our goal is to simplify this process, providing users with simpler tools that are still effective, both in protecting privacy and yielding useful insights from the data.”