Multi-camera operating room activity analysis for workflow analysis
Abstract
Recognizing workflow in the operating room (OR) is essential for developing context-aware systems that can improve patient outcomes, either by giving staff performance feedback after an operation or by managing OR scheduling and utilization in real time. However, manual event reporting can be unreliable and time-consuming. Our work therefore aims to develop an automatic OR activity analysis framework that recognizes OR workflow for efficiency understanding, analysis and optimization. The task is challenging because OR staff wear similar scrubs and face masks, and large machinery and equipment occlude many camera views. Using multiple pre-calibrated cameras mounted in the OR, our system localizes, tracks and recognizes the activity of each staff member in the OR. First, multi-person localization is performed in 2D using a top-down neural network. Next, an efficient and novel 3D reconstruction-by-tracking scheme matches 2D poses across different camera views with 2D poses back-projected from the previous frame. Lastly, actions such as standing, walking, sitting, crouching and lying are recognized via prior pose knowledge and a pose graph convolution network. We evaluated pose estimation accuracy using the percentage of correct keypoints (PCK) metric and achieved 78.0% on the MVOR dataset, outperforming the previous state of the art (76.2%). Based on these poses, action recognition reaches 96% accuracy. These results demonstrate the potential of our proposed framework for automatic OR activity analysis to facilitate further online or offline workflow analysis.
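For readers unfamiliar with the PCK metric mentioned above, the following is a minimal sketch of how such a score is commonly computed: a predicted keypoint counts as correct when its distance to the ground truth is within a fraction `alpha` of a per-person reference length. The function name `pck`, the threshold `alpha=0.2`, and the choice of reference length are illustrative assumptions; the abstract does not state the normalization or threshold used for the reported MVOR numbers.

```python
import numpy as np


def pck(pred, gt, ref_size, alpha=0.2, visible=None):
    """Fraction of keypoints whose error is within alpha * ref_size.

    pred, gt : (N, K, 2) arrays of predicted / ground-truth 2D keypoints
    ref_size : (N,) per-person normalization length (assumed here, e.g.
               a bounding-box or head size; not specified by the abstract)
    alpha    : tolerance as a fraction of ref_size (illustrative default)
    visible  : optional (N, K) boolean mask of annotated keypoints
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    ref_size = np.asarray(ref_size, dtype=float)

    # Euclidean error per keypoint, shape (N, K)
    dist = np.linalg.norm(pred - gt, axis=-1)
    # Compare each person's errors against that person's own threshold
    correct = dist <= alpha * ref_size[:, None]

    if visible is None:
        visible = np.ones(correct.shape, dtype=bool)
    visible = np.asarray(visible, dtype=bool)

    n_evaluated = visible.sum()
    if n_evaluated == 0:
        return float("nan")  # nothing to score
    return float(correct[visible].sum() / n_evaluated)


# Toy example: one person, four keypoints, errors of 0, 5, 8 and 30 pixels.
gt = np.zeros((1, 4, 2))
pred = np.array([[[0, 0], [3, 4], [8, 0], [0, 30]]], dtype=float)
score = pck(pred, gt, ref_size=[50.0], alpha=0.2)  # threshold is 10 pixels
print(score)  # 0.75
```

In the toy example three of the four keypoints fall within the 10-pixel tolerance, so the score is 0.75. Passing a `visible` mask restricts scoring to annotated keypoints, which matters in occluded scenes such as the OR.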