TY - JOUR
T1 - A Comparative Study of Spatio-Temporal U-Nets for Tissue Segmentation in Surgical Robotics
AU - Attanasio, Aleks
AU - Alberti, Chiara
AU - Scaglioni, Bruno
AU - Marahrens, Nils
AU - Frangi, Alejandro F.
AU - Leonetti, Matteo
AU - Biyani, Chandra Shekhar
AU - De Momi, Elena
AU - Valdastri, Pietro
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2021/2
Y1 - 2021/2
AB - In surgical robotics, the ability to achieve high levels of autonomy is often limited by the complexity of the surgical scene. Autonomous interaction with soft tissues requires machines that can examine and understand endoscopic video streams in real time and identify the features of interest. In this work, we show the first example of spatio-temporal neural networks, based on the U-Net, aimed at segmenting soft tissues in endoscopic images. The networks, equipped with Long Short-Term Memory (LSTM) and Attention Gate cells, can extract the correlation between consecutive frames in an endoscopic video stream, thus improving segmentation accuracy relative to the standard U-Net. Initially, three configurations of the spatio-temporal layers are compared to select the best architecture. Afterwards, the parameters of the network are optimised, and finally the results are compared with the standard U-Net. An accuracy of 83.77% ± 2.18% and a precision of 78.42% ± 7.38% are achieved by implementing both LSTM convolutional layers and Attention Gate blocks. The results, although obtained in the context of surgical tissue retraction, could benefit many autonomous tasks such as ablation, suturing and debridement.
KW - computer assisted interventions
KW - medical robotics
KW - minimally invasive surgery
KW - surgical vision
UR - http://www.scopus.com/inward/record.url?scp=85100920283&partnerID=8YFLogxK
U2 - 10.1109/TMRB.2021.3054326
DO - 10.1109/TMRB.2021.3054326
M3 - Article
AN - SCOPUS:85100920283
VL - 3
SP - 53
EP - 63
JO - IEEE Transactions on Medical Robotics and Bionics
JF - IEEE Transactions on Medical Robotics and Bionics
IS - 1
M1 - 9335948
ER -