TY - GEN
T1 - Contrastive Transformer-Based Multiple Instance Learning for Weakly Supervised Polyp Frame Detection
AU - Tian, Yu
AU - Pang, Guansong
AU - Liu, Fengbei
AU - Liu, Yuyuan
AU - Wang, Chong
AU - Chen, Yuanhong
AU - Verjans, Johan
AU - Carneiro, Gustavo
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
AB - Current polyp detection methods from colonoscopy videos use exclusively normal (i.e., healthy) training images, and thus i) ignore the importance of temporal information in consecutive video frames, and ii) lack knowledge about polyps. Consequently, they often have high detection errors, especially on challenging polyp cases (e.g., small, flat, or partially visible polyps). In this work, we formulate polyp detection as a weakly-supervised anomaly detection task that uses video-level labelled training data to detect frame-level polyps. In particular, we propose a novel convolutional transformer-based multiple instance learning method designed to identify abnormal frames (i.e., frames with polyps) from anomalous videos (i.e., videos containing at least one frame with a polyp). In our method, local and global temporal dependencies are seamlessly captured while we simultaneously optimise video and snippet-level anomaly scores. A contrastive snippet mining method is also proposed to enable effective modelling of the challenging polyp cases. The resulting method achieves a detection accuracy that is substantially better than current state-of-the-art approaches on a new large-scale colonoscopy video dataset introduced in this work. Our code and dataset are available at https://github.com/tianyu0207/weakly-polyp.
KW - Colonoscopy
KW - Polyp detection
KW - Video anomaly detection
KW - Vision transformer
KW - Weakly-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85139068001&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-16437-8_9
DO - 10.1007/978-3-031-16437-8_9
M3 - Conference contribution
AN - SCOPUS:85139068001
SN - 9783031164361
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 88
EP - 98
BT - Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 - 25th International Conference, Proceedings
A2 - Wang, Linwei
A2 - Dou, Qi
A2 - Fletcher, P. Thomas
A2 - Speidel, Stefanie
A2 - Li, Shuo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 25th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2022
Y2 - 18 September 2022 through 22 September 2022
ER -