Abstract
Tracking of multiple targets in a crowded environment using tracking by detection algorithms has been investigated thoroughly. Although these techniques are quite successful, they suffer from the loss of much detailed information about targets in detection boxes, which is highly desirable in many applications like activity recognition. To address this problem, we propose an approach that tracks superpixels instead of detection boxes in multi-view video sequences. Specifically, we first extract superpixels from detection boxes and then associate them within each detection box, over several views and time steps that lead to a combined segmentation, reconstruction, and tracking of superpixels. We construct a flow graph and incorporate both visual and geometric cues in a global optimization framework to minimize its cost. Hence, we simultaneously achieve segmentation, reconstruction and tracking of targets in video. Experimental results confirm that the proposed approach outperforms state-of-the-art techniques for tracking while achieving comparable results in segmentation.
Originalsprache | Englisch |
---|---|
Seiten (von - bis) | 166-181 |
Seitenumfang | 16 |
Fachzeitschrift | Computer Vision and Image Understanding |
Jahrgang | 154 |
DOIs | |
Publikationsstatus | Veröffentlicht - 1 Jan. 2017 |