Other Open Access

SARAS challenge on Multi-domain Endoscopic Surgeon Action Detection

Fabio Cuzzolin; Vivek Singh Bawa; Inna Skarga-Bandurova; Mohamed Mohamed; Jackson Ravindran Charles; Elettra Oleari; Alice Leporini; Carmela Landolfo; Armando Stabile; Francesco Setti; Riccardo Muradore


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Surgeon action detection</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">cross-domain learning</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">minimal invasive surgery</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">radical prostatectomy</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">endoscopy</subfield>
  </datafield>
  <controlfield tag="005">20210303084033.0</controlfield>
  <controlfield tag="001">4575197</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="g">MICCAI 2021</subfield>
    <subfield code="a">24th International Conference on Medical Image Computing and Computer Assisted Intervention</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Visual Artificial Intelligence Lab, Oxford Brookes University, Oxford, UK</subfield>
    <subfield code="a">Vivek Singh Bawa</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Visual Artificial Intelligence Lab, Oxford Brookes University, Oxford, UK</subfield>
    <subfield code="a">Inna Skarga-Bandurova</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Visual Artificial Intelligence Lab, Oxford Brookes University, Oxford, UK</subfield>
    <subfield code="a">Mohamed Mohamed</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Visual Artificial Intelligence Lab, Oxford Brookes University, Oxford, UK</subfield>
    <subfield code="a">Jackson Ravindran Charles</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">San Raffaele Hospital, Milan, Italy</subfield>
    <subfield code="a">Elettra Oleari</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">San Raffaele Hospital, Milan, Italy</subfield>
    <subfield code="a">Alice Leporini</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">San Raffaele Hospital, Milan, Italy</subfield>
    <subfield code="a">Carmela Landolfo</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">San Raffaele Hospital, Milan, Italy</subfield>
    <subfield code="a">Armando Stabile</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Verona, Italy</subfield>
    <subfield code="a">Francesco Setti</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Verona, Italy</subfield>
    <subfield code="a">Riccardo Muradore</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2808485</subfield>
    <subfield code="z">md5:206d242413b1e360f209e26a5251f8fd</subfield>
    <subfield code="u">https://zenodo.org/record/4575197/files/SARASchallengeonMulti-domainEndoscopicSurgeonActionDetection_02-11-2021_10-52-29.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2021-03-03</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:zenodo.org:4575197</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Director of the Visual Artificial Intelligence Lab, Oxford Brookes University, Oxford, UK</subfield>
    <subfield code="a">Fabio Cuzzolin</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">SARAS challenge on Multi-domain Endoscopic Surgeon Action Detection</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by-nd/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution No Derivatives 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Minimally Invasive Surgery (MIS) involves very sensitive procedures. Success of these procedures depends on the individual competence and degree of coordination between the surgeons. The SARAS (Smart Autonomous Robotic Assistant Surgeon) EU consortium, www.saras-project.eu, is working on methods to assist surgeons in MIS procedures by devising deep learning models able to automatically detect surgeon actions from streaming endoscopic video. This challenge proposal builds on our previous MIDL 2020 challenge on surgeon action detection (https://saras-esad.grand-challenge.org), and aims to attract attention to this research problem and mobilise the medical computer vision community around it. In particular, informed by the challenges encountered in our SARAS work, we decided to focus this year&amp;rsquo;s challenge on the issue of learning static action detection model across multiple domains (e.g. types of data, distinct surgical procedures).&lt;/p&gt;

&lt;p&gt;Despite its huge success, deep learning suffers from two major limitations. Firstly, addressing a task (e.g., action detection in radical prostatectomy, as in SARAS) requires one to collect and annotate a large, dedicated dataset to achieve an acceptable level of performance. Consequently, each new task requires us to build a new model, often from scratch, leading to a linear relationship between the number of tasks and the number of models/datasets, with significant resource implications. Collecting large annotated datasets for every single MIS-based procedure is inefficient, very time-consuming and financially expensive.&lt;/p&gt;

&lt;p&gt;In our SARAS work, we have captured endoscopic video data during radical prostatectomy under two different settings (&amp;#39;domains&amp;#39;): real procedures on real patients, and simplified procedures on artificial anatomies (&amp;#39;phantoms&amp;#39;). As shown in our MIDL 2020 challenge (on real data only), variations due to patient anatomy, surgeon style and so on dramatically reduce the performance of even state-of-the-art detectors compared to non-surgical benchmark datasets. Videos captured in an artificial setting can provide more data, but they differ significantly in appearance from real videos, and the look of the phantoms varies over time. Inspired by these all-too-real issues, this challenge&amp;#39;s goal is to test the possibility of learning more robust models across domains (e.g. across different procedures that nevertheless share some tools or surgeon actions; or, in the SARAS case, learning from both real and artificial settings whose lists of actions overlap but do not coincide).&lt;/p&gt;

&lt;p&gt;In particular, this challenge aims to explore the opportunity of utilising cross-domain knowledge to boost model performance on each individual task whenever two or more such tasks share some objectives (e.g., some action categories). This is a common scenario in real-world MIS procedures, as different surgeries often have some core actions in common, or involve variations of the same movement (e.g. &amp;#39;pulling up the bladder&amp;#39; vs &amp;#39;pulling up a gland&amp;#39;). Hence, each time a new surgical procedure is considered, only a small number of new classes needs to be added to the existing ones.&lt;/p&gt;

&lt;p&gt;The challenge provides two datasets for surgeon action detection: the first dataset (Dataset-R) consists of 4 annotated videos of real surgeries on human patients, while the second dataset (Dataset-A) contains 6 annotated videos of surgical procedures on artificial human anatomies. All videos capture instances of the same procedure, Robot-Assisted Radical Prostatectomy (RARP), but with some differences in their sets of action classes. The two datasets share a subset of 10 action classes, while they differ in the remaining classes (because of the requirements of the SARAS demonstrators). These two datasets provide a perfect opportunity to explore the possibility of exploiting multi-domain datasets designed for similar objectives to improve performance on each individual task.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.4575196</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.4575197</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">other</subfield>
  </datafield>
</record>
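
The record above is a standard MARC21 "slim" XML export, so its bibliographic fields can be extracted with any XML parser. The following sketch, which assumes the export has been saved locally under the hypothetical filename zenodo_4575197_marc21.xml, shows one way to read the title, DOI, keywords and abstract using only the Python standard library.

import xml.etree.ElementTree as ET

# Namespace declared on the <record> element in the export above.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

def first_subfield(record, tag, code):
    """Return the first subfield value for a given MARC datafield tag and subfield code."""
    for field in record.findall(f"marc:datafield[@tag='{tag}']", NS):
        sub = field.find(f"marc:subfield[@code='{code}']", NS)
        if sub is not None and sub.text:
            return sub.text
    return None

record = ET.parse("zenodo_4575197_marc21.xml").getroot()

title = first_subfield(record, "245", "a")      # record title
doi = first_subfield(record, "024", "a")        # 10.5281/zenodo.4575197
abstract = first_subfield(record, "520", "a")   # abstract, stored as HTML markup
keywords = [
    sub.text
    for field in record.findall("marc:datafield[@tag='653']", NS)
    for sub in field.findall("marc:subfield[@code='a']", NS)
]

print(title, doi, keywords, sep="\n")

As a rough illustration of the cross-domain setup described in the abstract (two RARP datasets sharing 10 action classes, each with additional domain-specific classes), the sketch below lays out a classifier head with a branch shared across Dataset-R and Dataset-A and a separate branch per domain. This is not the challenge baseline: the use of PyTorch and the domain-specific class counts are assumptions for illustration only, and any feature extractor (e.g. a detection backbone) is assumed to run upstream.

import torch
import torch.nn as nn

class MultiDomainActionClassifier(nn.Module):
    """Classifier head with a branch shared between the two domains and
    one extra branch per domain for its non-overlapping action classes."""

    def __init__(self, feat_dim, n_shared=10, n_real_only=5, n_artificial_only=5):
        # n_real_only and n_artificial_only are placeholder counts, not the
        # actual numbers of domain-specific classes in the challenge datasets.
        super().__init__()
        self.shared = nn.Linear(feat_dim, n_shared)             # 10 shared classes
        self.real_only = nn.Linear(feat_dim, n_real_only)       # Dataset-R only
        self.artificial_only = nn.Linear(feat_dim, n_artificial_only)  # Dataset-A only

    def forward(self, features, domain):
        shared_logits = self.shared(features)
        extra = self.real_only(features) if domain == "real" else self.artificial_only(features)
        return torch.cat([shared_logits, extra], dim=-1)

Because the shared branch receives gradients from samples of both domains, annotations from the artificial videos can, in principle, reinforce the 10 common classes on the real videos as well, which is the kind of cross-domain benefit the challenge sets out to measure.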
Views and downloads (all versions / this version):
Views: 387 / 387
Downloads: 269 / 269
Data volume: 755.5 MB / 755.5 MB
Unique views: 347 / 347
Unique downloads: 255 / 255
