TY - GEN
T1 - Just Rewrite It Again
T2 - 19th International Conference on Availability, Reliability and Security, ARES 2024
AU - Meisenbacher, Stephen
AU - Matthes, Florian
N1 - Publisher Copyright: © 2024 Owner/Author.
PY - 2024/7/30
Y1 - 2024/7/30
AB - The study of Differential Privacy (DP) in Natural Language Processing often views the task of text privatization as a rewriting task, in which sensitive input texts are rewritten to hide explicit or implicit private information. In order to evaluate the privacy-preserving capabilities of a DP text rewriting mechanism, empirical privacy tests are frequently employed. In these tests, an adversary is modeled, who aims to infer sensitive information (e.g., gender) about the author behind a (privatized) text. Looking to improve the empirical protections provided by DP rewriting methods, we propose a simple post-processing method based on the goal of aligning rewritten texts with their original counterparts, where DP rewritten texts are rewritten again. Our results show that such an approach not only produces outputs that are more semantically reminiscent of the original inputs, but also texts which score on average better in empirical privacy evaluations. Therefore, our approach raises the bar for DP rewriting methods in their empirical privacy evaluations, providing an extra layer of protection against malicious adversaries.
KW - Data Privacy
KW - Differential Privacy
KW - Natural Language Processing
UR - http://www.scopus.com/inward/record.url?scp=85200405606&partnerID=8YFLogxK
U2 - 10.1145/3664476.3669926
DO - 10.1145/3664476.3669926
M3 - Conference contribution
AN - SCOPUS:85200405606
T3 - ACM International Conference Proceeding Series
BT - ARES 2024 - 19th International Conference on Availability, Reliability and Security, Proceedings
PB - Association for Computing Machinery
Y2 - 30 July 2024 through 2 August 2024
ER -