TY - GEN
T1 - Statistical topological data analysis - A kernel perspective
AU - Kwitt, Roland
AU - Huber, Stefan
AU - Niethammer, Marc
AU - Lin, Weili
AU - Bauer, Ulrich
N1 - Funding Information:
This work has been partially supported by the Austrian Science Fund, project no. KLI 00012. We also thank the anonymous reviewers for their valuable comments and suggestions.
PY - 2015
Y1 - 2015
N2 - We consider the problem of statistical computations with persistence diagrams, a summary representation of topological features in data. These diagrams encode persistent homology, a widely used invariant in topological data analysis. While several avenues towards a statistical treatment of the diagrams have been explored recently, we follow an alternative route that is motivated by the success of methods based on the embedding of probability measures into reproducing kernel Hilbert spaces. In fact, a positive definite kernel on persistence diagrams has recently been proposed, connecting persistent homology to popular kernel-based learning techniques such as support vector machines. However, important properties of that kernel enabling a principled use in the context of probability measure embeddings remain to be explored. Our contribution is to close this gap by proving universality of a variant of the original kernel, and to demonstrate its effective use in two-sample hypothesis testing on synthetic as well as real-world data.
AB - We consider the problem of statistical computations with persistence diagrams, a summary representation of topological features in data. These diagrams encode persistent homology, a widely used invariant in topological data analysis. While several avenues towards a statistical treatment of the diagrams have been explored recently, we follow an alternative route that is motivated by the success of methods based on the embedding of probability measures into reproducing kernel Hilbert spaces. In fact, a positive definite kernel on persistence diagrams has recently been proposed, connecting persistent homology to popular kernel-based learning techniques such as support vector machines. However, important properties of that kernel enabling a principled use in the context of probability measure embeddings remain to be explored. Our contribution is to close this gap by proving universality of a variant of the original kernel, and to demonstrate its effective use in two-sample hypothesis testing on synthetic as well as real-world data.
UR - http://www.scopus.com/inward/record.url?scp=84965095287&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84965095287
VL - 28
T3 - Advances in Neural Information Processing Systems
SP - 3070
EP - 3078
BT - Advances in Neural Information Processing Systems 28 (NIPS 2015)
T2 - 29th Annual Conference on Neural Information Processing Systems, NIPS 2015
Y2 - 7 December 2015 through 12 December 2015
ER -