TY - GEN
T1 - Towards Non-adversarial Algorithmic Recourse
AU - Leemann, Tobias
AU - Pawelczyk, Martin
AU - Prenkaj, Bardh
AU - Kasneci, Gjergji
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - The streams of research on adversarial examples and counterfactual explanations have largely grown independently. This has led to several recent works trying to elucidate their similarities and differences. Most prominently, it has been argued that adversarial examples, as opposed to counterfactual explanations, have a unique characteristic in that they lead to a misclassification compared to the ground truth. However, the computational goals and methodologies employed in existing counterfactual explanation and adversarial example generation methods often lack alignment with this requirement. Using formal definitions of adversarial examples and counterfactual explanations, we introduce non-adversarial algorithmic recourse and outline why, in high-stakes situations, it is imperative to obtain counterfactual explanations that do not exhibit adversarial characteristics. We subsequently investigate how different components in the objective functions, e.g., the machine learning model or the cost function used to measure distance, determine whether the outcome can be considered an adversarial example or not. Our experiments on common datasets highlight that these design choices are often more critical in deciding whether recourse is non-adversarial than whether recourse or attack algorithms are used. Furthermore, we show that choosing a robust and accurate machine learning model results in less adversarial recourse, which is desired in practice.
AB - The streams of research on adversarial examples and counterfactual explanations have largely grown independently. This has led to several recent works trying to elucidate their similarities and differences. Most prominently, it has been argued that adversarial examples, as opposed to counterfactual explanations, have a unique characteristic in that they lead to a misclassification compared to the ground truth. However, the computational goals and methodologies employed in existing counterfactual explanation and adversarial example generation methods often lack alignment with this requirement. Using formal definitions of adversarial examples and counterfactual explanations, we introduce non-adversarial algorithmic recourse and outline why, in high-stakes situations, it is imperative to obtain counterfactual explanations that do not exhibit adversarial characteristics. We subsequently investigate how different components in the objective functions, e.g., the machine learning model or the cost function used to measure distance, determine whether the outcome can be considered an adversarial example or not. Our experiments on common datasets highlight that these design choices are often more critical in deciding whether recourse is non-adversarial than whether recourse or attack algorithms are used. Furthermore, we show that choosing a robust and accurate machine learning model results in less adversarial recourse, which is desired in practice.
KW - Adversarials
KW - Algorithmic Recourse
KW - Counterfactuals
UR - http://www.scopus.com/inward/record.url?scp=85200752559&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-63800-8_20
DO - 10.1007/978-3-031-63800-8_20
M3 - Conference contribution
AN - SCOPUS:85200752559
SN - 9783031637995
T3 - Communications in Computer and Information Science
SP - 395
EP - 419
BT - Explainable Artificial Intelligence - Second World Conference, xAI 2024, Proceedings
A2 - Longo, Luca
A2 - Lapuschkin, Sebastian
A2 - Seifert, Christin
PB - Springer Science and Business Media Deutschland GmbH
T2 - 2nd World Conference on Explainable Artificial Intelligence, xAI 2024
Y2 - 17 July 2024 through 19 July 2024
ER -