“Box Supervised Video Segmentation Proposal Network” Paper Presented at the 24th Irish Machine Vision and Image Processing Conference

Jonathan Kobold, Head of the Machine Learning and Computer Vision team at HENSOLDT Analytics, recently co-authored a paper on video object segmentation with Tanveer Hannan, Rajat Koner, and Matthias Schubert from the Ludwig Maximilian University of Munich.

The publication, titled “Box Supervised Video Segmentation Proposal Network”, was accepted for presentation at the 24th Irish Machine Vision and Image Processing Conference, which took place from 31 August to 2 September at Queen’s University, Belfast.

The IMVIP Conference is Ireland’s primary meeting for those in the fields of machine vision and image processing, providing a platform for researchers across Ireland and beyond to showcase their novel work and share ideas. Participants in this year’s edition saw 19 oral presentations and 13 poster presentations, as well as three keynote sessions.

Box Supervised Video Segmentation Proposal Network - Paper Abstract

Bounding box supervision provides a balanced compromise between labeling effort and result quality for image segmentation. However, there exists no such work explicitly tailored for videos. Applying the image segmentation methods directly to videos produces sub-optimal solutions because they do not exploit the temporal information. In this work, we propose a box-supervised video segmentation proposal network. We take advantage of intrinsic video properties by introducing a novel box-guided motion calculation pipeline and a motion-aware affinity loss. As the motion is utilized only during training, the run-time remains fixed at inference time. We evaluate our model on the Video Object Segmentation (VOS) challenge. The method outperforms the state-of-the-art self-supervised methods by 16.4% and 6.9% J&F score, and the majority of fully supervised ones, on the DAVIS and YouTube-VOS datasets. Code is available at https://github.com/Tanveer81/BoxVOS.git.
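
For readers unfamiliar with the J&F score quoted in the abstract: in video object segmentation benchmarks, J usually denotes region similarity (the intersection over union of the predicted and ground-truth masks) and F denotes boundary accuracy, with J&F being their mean. The following is only a minimal sketch of that idea and is not taken from the authors' code. The function names `region_similarity` and `j_and_f` are hypothetical, masks are assumed to be small nested lists of 0/1 values, and the boundary score F is assumed to be computed elsewhere.

```python
def region_similarity(pred, gt):
    """Jaccard index (intersection over union) of two binary masks.

    Both masks are assumed to be nested lists of 0/1 with the same shape.
    """
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                intersection += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def j_and_f(j_score, f_score):
    """Mean of region similarity J and boundary accuracy F."""
    return (j_score + f_score) / 2.0


# Toy 2x3 masks: two pixels overlap, four pixels are covered in total.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
gt_mask = [[0, 1, 0],
           [0, 1, 1]]

j = region_similarity(pred_mask, gt_mask)
print(j)  # 0.5
```

A real evaluation would average these scores over all frames and objects of a sequence and would use the official benchmark tooling rather than a hand-written helper.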

HENSOLDT Analytics

HENSOLDT Analytics is a leading global provider of Open Source Intelligence (OSINT) systems and Natural Language Processing technologies, such as Automatic Speech Recognition, which are key elements for media monitoring and analysis.