Proceedings of the Annual Workshop on Wireless Systems: Advanced Research and Development (WISARD)
Researchers choosing to share wireless-network traces with colleagues must first anonymize sensitive information, trading off the removal of information to protect identities against the preservation of useful data within the trace. While several metrics exist to quantify this privacy-utility tradeoff, they are often computationally expensive. Computing these metrics using a sample of the trace could potentially save precious time. In this paper, we examine several sampling methods to discover their effects on the measurement of the privacy-utility tradeoff when anonymizing network traces. We tested the relative accuracy of several packet-sampling and flow-sampling methods on existing privacy and utility metrics. We concluded that, for our test trace, no single sampling method we examined allowed us to accurately measure the tradeoff, and that some sampling methods can produce grossly inaccurate estimates of those values. We call for further research to develop sampling methods that maintain relevant privacy and utility properties.
Phillip A. Fazio, Keren Tan, and David Kotz. Effects of network trace sampling methods on privacy and utility metrics. In Proceedings of the Annual Workshop on Wireless Systems: Advanced Research and Development (WISARD), January 2012. DOI: 10.1109/COMSNETS.2012.6151387