Edinburgh Simulated Surgical Tools Dataset (RGBD)


Introduction

The dataset contains RGBD images of five simulated surgical tools (two kinds of scalpel, two kinds of clamp, and one pair of tweezers), both synthetic and real, for a total of 64,728 images (38,469 RGB, 26,259 depth). The tools are simulated because the main project was concerned with human-computer interaction rather than visual recognition, so the domain was simplified: the tools are non-specular, their parts are color-coded, and they are larger than real life so that the depth sensor can acquire multiple points across their width. The real tools were 3D printed.

Ground-truth labels for the tool properties are provided for each synthetic image (the labels differ slightly from file to file; see the detailed descriptions in the tables below).

This dataset was created as part of a visual recognition subtask of the Advanced Autonomy Project at the University of Edinburgh, funded by the Turing Institute. Supported vision tasks include surgical tool classification, 6D pose estimation, and tool attribute recognition (size, color, relative position, grasping points, etc.) for doctor-robot language interaction and robotic arm picking tasks. The synthetic tools were created in Blender from 3D meshes, and the real tools were 3D printed from the same models. The original mesh and point cloud files can also be downloaded below.

[Figure: examples of the synthetic tools (left) and the real 3D-printed tools (right)]

Dataset

Synthetic Images

See below for txt and json file formats.

File               Type                   Images (RGB/D)  Size (MB)  Ground-truth Description
single_bbox_6000   rgb, single tool       5970/0          566        2D bounding boxes only (.txt)
multi_bbox_3000    rgb, multiple tools    3154/0          252        2D bounding boxes only (.txt)
multi_fullGT_500   rgb-d, multiple tools  500/500         76.3       ground-truth description (.json)
multi_fullGT_1000  rgb-d, multiple tools  1110/1110       135        ground-truth description (.json)
multi_spoon_3500   rgb-d, multiple tools  3500/3500       415        ground-truth description (.json)
                   + new object 'spoon.1', class index 5, and one new color 'O' (orange)
multi_grasp_1000   rgb-d, multiple tools  1000/1000       114        ground-truth description (.json)
                   + grasp points for each tool
single_grasp_9000  rgb-d, single tool     9000/9000       931.5      ground-truth description (.json)
                   + grasp points for the single tool

Real Images

As well as the images, the download files include detection boxes found by YoloV5. We estimate that the boxes are 99.5% correct (a few detections are missing). No ground truth on identity, color, or location is included.

File                    Type                   Images (RGB/D)  Size (MB)  Detection Label Description
multi_real_1000         rgb-d, multiple tools  1185/1185       686        detected 2D bounding boxes (.json)
                        + black background
multi_real_1600         rgb-d, multiple tools  1685/1685       537        detected 2D bounding boxes (.json)
                        + white background
multi_real_2200         rgb-d, multiple tools  2298/2298       953        detected 2D bounding boxes (.json)
                        + normal background
single_paper_real_1000  rgb, single tool       1120/0          370        paper tools, detected 2D bounding boxes (.json)
                        + white background
single_real_2000        rgb, single tool       1966/0          805        single tools, detected 2D bounding boxes (.json)
                        + normal background
single_real_5000        rgb-d, single tool     5981/5981       3174       single tools, detected 2D bounding boxes (.json)
                        + black background

Raw 3D Files

These are the source files for creating the synthetic tools. Examples of the parts are shown below.

File                Size (MB)  Description
ptc_and_mesh_files  1.82       7 point cloud files (.pcd) for the tools (clamps split into half parts)
                               + 1 Blender file (.blend) with the tool meshes and grasp points

[Figure: example tool parts]

Acknowledgements

The data is freely available for research use; please acknowledge the University of Edinburgh and the Turing Institute. The data is the property of the University of Edinburgh. All rights reserved.

Contact

Email: Prof. Robert Fisher at rbf -a-t- inf.ed.ac.uk.
School of Informatics, Univ. of Edinburgh
1.11 Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
Tel: +44-(131)-651-3441 (direct line), +44-(131)-651-3443 (secretary)

Ground-truth and detection JSON file format

A ground-truth .json file is provided for each synthetic image; details are given below.
Detected 2D bounding boxes are provided for each tool in the real images.

# Tool Classes

nc: 5          #number of classes  #predefined

"class_label":   0 1 2 3 4      #class indexes #predefined 

"type": ['scalpel', 'scalpel', 'clamp', clamp', 'tweezers']    #type names  #predefined

# Tool Attributes

- Full scene description (read 'gtxxx.json' and take gtdata[0])
{
  # scene summary sentence: "I CAN SEE X OBJECTS ON THE TABLE."
  'object_indices': [0, 1, ...],
  'objects': ['spoon', 'clamp', ...],
  'object_size': ['small', 'small', ...],
  'object_colors': ['blue', 'red', ...],
  'which_side_on_table': ['middle', 'middle', ...],
}

- Each tool description (read 'gtxxx.json' and gtdata[1])

"real_name": ['scalpel.1', 'scalpel.2', 'clamp.1', 'clamp.2', 'tweezers.1']    #object_names  #predefined

"size": ['big', 'big', 'small', 'big', 'small'];  #predefined #object_size

"maincolor": ['R', 'G', 'B', 'P', 'C', 'Y']   (i.e., 'red', 'green', 'blue', 'purple', 'cyan', 'yellow'); #random

"2D_box_image": [x_center   y_center   width   height]   (YOLOv5 format value 0-1); #random #2D_image_coordinate

"location_world": [x, y, z] (m) ;   "rotation_world": [x, y, z] (Euler) ;  # 6d_pose #random #3D_world_coordinate

"open_angle":  0-70 degree for clamps (counterclockwise); others 0 degree;  #random #Z_axis_3D_world_coordinate

"3D_box_local": eight vertices of 3D bounding box (m); #predefined #object_3D_size #3D_local_coordinate

(*partial synthetic files only) 

"grasp3D_handle", "grasp3D_joint_blade", [x, y, z] (m) ; #grasp_points #predefined  #3D_local_coordinate

"below": [], "above": [], "near": [], "which_side_of_table": [];   # Relative_location

*duplicated tools are named ['scalpel.1x', 'scalpel.2x', 'clamp.1x', 'clamp.2x', 'tweezers.1x'] in the ground-truth file
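The ground-truth layout above (gtdata[0] for the scene, gtdata[1] for the per-tool records) can be read with a few lines of Python. This is a minimal sketch: the sample record below is fabricated for illustration using the key names described above, not taken from an actual 'gtxxx.json' file, whose exact contents may differ slightly between download files.

```python
import json

# Fabricated sample mimicking the documented 'gtxxx.json' structure:
# gtdata[0] = full scene description, gtdata[1] = list of per-tool records.
sample = json.dumps([
    {   # gtdata[0]: full scene description
        "object_indices": [0, 1],
        "objects": ["scalpel", "clamp"],
        "object_size": ["big", "small"],
        "object_colors": ["red", "blue"],
        "which_side_on_table": ["middle", "left"],
    },
    [   # gtdata[1]: one record per tool
        {
            "real_name": "scalpel.1",
            "maincolor": "R",
            "2D_box_image": [0.5, 0.5, 0.2, 0.1],  # YOLOv5 format, values 0-1
            "location_world": [0.1, -0.2, 0.0],    # metres
            "rotation_world": [0.0, 0.0, 1.57],    # Euler angles
        },
    ],
])

gtdata = json.loads(sample)          # for real data: json.load(open('gtxxx.json'))
scene, tools = gtdata[0], gtdata[1]

print(f"I can see {len(scene['objects'])} objects on the table.")
for tool in tools:
    x_c, y_c, w, h = tool["2D_box_image"]
    print(tool["real_name"], tool["maincolor"], (x_c, y_c, w, h))
```

For a real file, replace the fabricated `sample` with `json.load(...)` on the downloaded ground-truth file.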

Ground-truth and detection TXT file format

    
In the *.txt files, each row is one bounding box in YoloV5 format (values 0-1):
[class x_center y_center width height]

where class is one of these values:  0 1 2 3 4
corresponding to these class names:
( ['scalpel.1', 'scalpel.2', 'clamp.1', 'clamp.2', 'tweezers.1'] )
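A row in this format can be parsed and converted to pixel corner coordinates as sketched below. The image size (640x480) and the helper name `parse_row` are illustrative assumptions; use the actual dimensions of the image the label file belongs to.

```python
# Class indices 0-4 map to these names, as listed above.
CLASS_NAMES = ['scalpel.1', 'scalpel.2', 'clamp.1', 'clamp.2', 'tweezers.1']

def parse_row(row, img_w, img_h):
    """Parse one YoloV5-format row '[class x_center y_center width height]'
    (normalised 0-1) and return (class_name, (x_min, y_min, x_max, y_max))
    in pixel coordinates."""
    cls, x_c, y_c, w, h = row.split()
    cls = int(cls)
    x_c, y_c, w, h = float(x_c), float(y_c), float(w), float(h)
    x_min = (x_c - w / 2) * img_w
    y_min = (y_c - h / 2) * img_h
    x_max = (x_c + w / 2) * img_w
    y_max = (y_c + h / 2) * img_h
    return CLASS_NAMES[cls], (x_min, y_min, x_max, y_max)

name, box = parse_row("2 0.5 0.5 0.25 0.5", 640, 480)
print(name, box)   # clamp.1 (240.0, 120.0, 400.0, 360.0)
```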


© 2022 Robert Fisher