One-Class Active Learning for Outlier Detection with Multiple Subspaces
Holger Trittenbach, Klemens Böhm
This is the companion website for the publication
Holger Trittenbach and Klemens Böhm. 2019. One-Class Active Learning for Outlier Detection with Multiple Subspaces. In The 28th ACM International Conference on Information and Knowledge Management (CIKM ’19), November 3–7, 2019, Beijing, China. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3357384.3357873
This website provides code and descriptions to reproduce the experiments and analyses. The description covers the full experimental pipeline, from preprocessing the raw data to generating the plots and tables shown in the paper.
Citing this work:
@inproceedings{trittenbach2019subsvdd,
title={One-Class Active Learning for Outlier Detection with Multiple Subspaces},
author={Trittenbach, Holger and B{\"o}hm, Klemens},
booktitle={Proceedings of the 28th ACM International Conference on Information and Knowledge Management},
year={2019}
}
Resources
The resources are divided into several repositories.
- subsvdd-evaluation: Contains the scripts to run the experiments and to analyze the results. The package readme is a step-by-step guide to reproduce the experiments described in the companion paper. The pre-processed input files are available for download (3 MB). Running the experiments is computationally intensive and takes many CPU hours. Therefore, we also provide the results for download (51 MB).
- SVDD.jl: SubSVDD is part of the SVDD.jl Julia package.
- OneClassActiveLearning.jl: A Julia package that implements various Query Strategies and an Active Learning Cycle.
For an overview and a benchmark on one-class active learning, visit the ocal project website.
The code is licensed under an MIT License and the result data under a Creative Commons Attribution 4.0 International License.
Overview
Active learning comprises methods that improve classification quality through user feedback. A common assumption is that users can provide such feedback, regardless of how results are presented to them. However, this assumption often does not hold in practice.
To support users in the feedback process, we introduce SubSVDD, a semi-supervised classifier for one-class active learning. SubSVDD learns decision boundaries in low-dimensional projections of the data. This is an advantage over existing methods: decision boundaries in low-dimensional projections are easier to interpret, and can be visualized if projections are two- or three-dimensional. For active learning, we provide a framework to select observations for feedback across multiple subspaces.
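The idea of one decision boundary per low-dimensional projection can be illustrated with a minimal Python sketch. This is not the SubSVDD implementation from SVDD.jl: as a stand-in for the SVDD optimization, each subspace gets a simple ball (centroid plus a quantile radius). The rule that an observation counts as an outlier if it lies outside the ball in at least one subspace, the boundary-closeness query rule, and all function names are assumptions made for illustration only.

```python
import numpy as np
from itertools import combinations


def fit_ball(X, quantile=0.9):
    """Stand-in for an SVDD fit: centroid plus a quantile radius."""
    center = X.mean(axis=0)
    dists = np.linalg.norm(X - center, axis=1)
    return center, np.quantile(dists, quantile)


def fit_subspace_balls(X, subspaces, quantile=0.9):
    """Fit one ball per subspace (a tuple of column indices)."""
    return {s: fit_ball(X[:, list(s)], quantile) for s in subspaces}


def boundary_scores(X, balls):
    """Signed distance to each boundary; positive means outside.

    Returns an array of shape (n_observations, n_subspaces).
    """
    cols = []
    for s, (center, radius) in balls.items():
        d = np.linalg.norm(X[:, list(s)] - center, axis=1)
        cols.append(d - radius)
    return np.column_stack(cols)


def predict_outlier(scores):
    """Assumed rule: outlier if outside the boundary in any subspace."""
    return (scores > 0).any(axis=1)


def select_query(scores, labeled):
    """Assumed query rule: the unlabeled observation that is closest
    to a decision boundary in any subspace."""
    closeness = np.abs(scores).min(axis=1)
    for i in labeled:
        closeness[i] = np.inf  # never ask for the same label twice
    return int(np.argmin(closeness))


# Synthetic data: 200 observations, 4 attributes, one planted outlier.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[0, 2] = 8.0  # extreme value in attribute 2

subspaces = list(combinations(range(4), 2))  # all 2-dimensional projections
balls = fit_subspace_balls(X, subspaces)
scores = boundary_scores(X, balls)
pred = predict_outlier(scores)
query = select_query(scores, labeled={0})
```

Because every boundary lives in a two-dimensional projection, each one can be plotted as a circle together with the queried observation, which is the kind of visual support for user feedback that the paragraph above describes.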
Evaluation Examples
SubSVDD outperforms its competitors on several benchmark data sets. Further, increasing the dimensionality of the projections lets users trade a higher effort for interpreting the results against additional accuracy.
For further results and details on the figures, we refer to the companion paper.
Contact
We welcome contributions to the packages and bug reports on GitHub.
For questions and comments, please contact holger.trittenbach@kit.edu.