Publication | Open Access
Learning from a tiny dataset of manual annotations: a teacher/student approach for surgical phase recognition
Year: 2018 | Citations: 44 | References: 16
Keywords: Artificial Intelligence, Data Annotation, Convolutional Neural Network, Engineering, Machine Learning, Digital Pathology, Automatic Annotation Tool, Synthetic Labels, Video Interpretation, Synthetic Annotations, Image Analysis, Data Science, Pattern Recognition, Manual Annotations, Video Transformer, Machine Vision, Feature Learning, Knowledge Discovery, Surgical Training, Computer Science, Video Understanding, Medical Image Computing, Deep Learning, Computer Vision, Real-time Video Stream, Tiny Dataset, Surgical Phase Recognition, Automatic Annotation
Vision algorithms capable of interpreting scenes from a real-time video stream are necessary for computer-assisted surgery systems to achieve context-aware behavior. In laparoscopic procedures, one task such systems must perform is the identification of surgical phases, for which the current state of the art is a model based on a CNN-LSTM. A number of previous works using models of this kind have trained them in a fully supervised manner, which requires a fully annotated dataset. Our work instead confronts the problem of learning surgical phase recognition when annotated data is scarce (under 25% of all available video recordings). We propose a teacher/student approach in which a strong predictor called the teacher, trained beforehand on a small dataset of ground truth-annotated videos, generates synthetic annotations for a larger dataset, from which another model, the student, then learns. In our case, the teacher features a novel CNN-biLSTM-CRF architecture, designed for offline inference only. The student, on the other hand, is a CNN-LSTM capable of making real-time predictions. Results for various amounts of manually annotated videos demonstrate the superiority of the new CNN-biLSTM-CRF predictor, as well as improved performance from the CNN-LSTM when it is trained using synthetic labels generated for unannotated videos. For both offline and online surgical phase recognition with very few annotated recordings available, this new teacher/student strategy provides a valuable performance improvement by efficiently leveraging the unannotated data.
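The teacher/student data flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, each frame is reduced to a single number, a nearest-mean classifier stands in for both networks, and a centered majority vote stands in for the offline-only component of the teacher. It is not the CNN-biLSTM-CRF or CNN-LSTM from the paper; it only shows how synthetic labels from an offline teacher can enlarge the training set of an online student.

```python
from collections import Counter
from statistics import mean


def fit_phase_means(videos, labels):
    """Toy 'training': store the mean feature value of each phase.

    Assumption: each frame is a single float; a real system would use CNN features.
    """
    per_phase = {}
    for frames, phases in zip(videos, labels):
        for x, p in zip(frames, phases):
            per_phase.setdefault(p, []).append(x)
    return {p: mean(xs) for p, xs in per_phase.items()}


def predict_frames(means, frames):
    """Per-frame prediction from the current frame only, so it can run online."""
    return [min(means, key=lambda p: abs(x - means[p])) for x in frames]


def smooth_offline(phases, radius=1):
    """Centered majority vote over neighboring frames.

    It reads future frames, so it is only usable offline (hypothetical
    stand-in for the teacher's whole-sequence inference).
    """
    smoothed = []
    for i in range(len(phases)):
        window = phases[max(0, i - radius): i + radius + 1]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed


def teacher_student(labeled_videos, labeled_phases, unlabeled_videos):
    """Train a teacher on the annotated videos, label the rest, train a student on all."""
    teacher = fit_phase_means(labeled_videos, labeled_phases)
    synthetic = [
        smooth_offline(predict_frames(teacher, video)) for video in unlabeled_videos
    ]
    student = fit_phase_means(
        labeled_videos + unlabeled_videos, labeled_phases + synthetic
    )
    return teacher, synthetic, student


# One annotated video and one unannotated video (made-up toy values).
labeled_videos = [[0.0, 0.1, 0.9, 1.0]]
labeled_phases = [[0, 0, 1, 1]]
unlabeled_videos = [[0.1, 0.0, 0.9, 0.2, 0.1, 0.9, 1.0, 0.8]]

teacher, synthetic, student = teacher_student(
    labeled_videos, labeled_phases, unlabeled_videos
)
```

In this toy run the third frame of the unannotated video looks like phase 1 in isolation, but the offline smoothing relabels it using its neighbors before the student sees it. This mirrors the design in the abstract only loosely: the teacher may use the whole recording, while the student must predict from what has been seen so far.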