Abstract
We propose a new method that satisfies approximate differential privacy for top-k selection with unordered output in the unknown data domain setting, without relying on full knowledge of the domain universe. Our algorithm only requires looking at the top-k̄ elements for any given k̄ ≥ k, thus enforcing the principle of minimal privilege. Unlike previous methods, our privacy parameter ε does not scale with k, which improves applicability in scenarios with very large k. Moreover, our novel construction, which efficiently combines the sparse vector technique and stability, can be applied as a general framework to any type of query and is therefore of independent interest. We extensively compare our algorithm to previous work on top-k selection over an unknown domain, and show, both analytically and in experiments, settings where we outperform the current state of the art.
Publication
In Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)
Research Scientist
My research interests include machine learning (deep learning) and statistics, with current research focus on large language models, differential privacy, and their applications to healthcare.