CORSMAL Hand-Occluded Containers (CHOC) dataset
Description
CORSMAL Hand-Occluded Containers (CHOC) is an image-based dataset for category-level 6D object pose and size estimation, although it can also be used for other tasks such as detection, segmentation, and hand+object reconstruction. It comprises 138,240 pseudo-realistic composite RGB-D images of hand-held containers on top of 30 real backgrounds (mixed-reality set) and 3,951 RGB-D images selected from the CORSMAL Container Manipulation (CCM) dataset (real set).
The images of the mixed-reality set are automatically rendered using Blender, and are split into 129,600 images of hand-held containers and 8,640 images of objects without a hand. Only one synthetic container is rendered in each image. Images are evenly split among 48 unique synthetic objects from three categories, namely 16 boxes, 16 drinking containers without stems (nonstems) and 16 drinking containers with stems (stems), selected from ShapeNetSem. For each object, 6 realistic grasps were manually annotated using GraspIt!: a bottom grasp, a natural grasp, and a top grasp for each of the left and right hands. The mixed-reality set provides RGB images, depth images, segmentation masks (hand and object), normalised object coordinates images (object only), object meshes, annotated 6D object poses (orientation and translation in 3D with respect to the camera view), and grasp meshes with their MANO parameters. Each image has a resolution of 640x480 pixels. Background images were acquired using an Intel RealSense D435i depth camera, and include 15 indoor and 15 outdoor scenes. All information necessary to re-render the dataset is provided, namely backgrounds, camera intrinsic parameters, lighting, object models, hand and forearm meshes, and poses; users can complement the existing data with additional annotations.
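As an illustration of how a 6D pose annotation (orientation and translation with respect to the camera) relates object-frame points to the 640x480 image, the following is a minimal sketch of a rigid transform followed by a pinhole projection. The intrinsic values, the pose, the units, and the function names `transform_points` and `project_points` are assumptions made for this example; they are not taken from the dataset files or its toolkit, and the real intrinsics should be read from the provided camera parameters.

```python
import math

def transform_points(points, rotation, translation):
    """Map object-frame points to the camera frame: x_cam = R * x_obj + t."""
    transformed = []
    for p in points:
        transformed.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return transformed

def project_points(points_cam, fx, fy, cx, cy):
    """Pinhole projection of camera-frame points (z > 0) to pixel coordinates."""
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in points_cam]

# Hypothetical pose: 90-degree rotation about the camera y axis,
# with the object origin 0.5 units in front of the camera (units assumed).
angle = math.radians(90.0)
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t = [0.0, 0.0, 0.5]

# Hypothetical intrinsics for a 640x480 image (not the dataset's values).
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0

# Three object-frame points: the origin and one point along y and along z.
object_points = [(0.0, 0.0, 0.0), (0.0, 0.1, 0.0), (0.0, 0.0, 0.1)]
pixels = project_points(transform_points(object_points, R, t), fx, fy, cx, cy)
```

With these assumed values the object origin projects to the principal point, which is a quick sanity check when loading real annotations.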
The images of the real set are selected from 180 representative sequences of the CCM dataset. Each image contains a person holding one of the 15 containers during a manipulation occurring in the video prior to a handover (e.g., picking up an empty container, shaking an empty or filled food box, or pouring content into a cup or drinking glass). For each object instance, sequences were chosen under four randomly sampled conditions, including background and lighting conditions, scenarios (person sitting, with the object on the table; person sitting and already holding the object; person standing while holding the container and then walking towards the table), and filling amount and type. The same sequence is selected from each of the three fixed camera views (two side views and one frontal view) of the CCM setup (60 sequences for each view). Fifteen sequences show the empty container, one for each of the fifteen objects, whereas the other sequences have the person filling the container with either pasta, rice or water at 50% or 90% of the full container capacity. The real set provides RGB images, depth images and 6D pose annotations. For each sequence, the 6D poses of the containers are manually annotated every 10 frames if the container is visible in at least two views, resulting in a total of 3,951 annotations. 6D poses for the intermediate frames are also provided, obtained by interpolation.
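The description does not state which interpolation scheme was used for the intermediate frames. As a sketch of one common choice, and only under that assumption, the code below interpolates between two keyframe poses 10 frames apart using linear interpolation for the translation and spherical linear interpolation (slerp) for the orientation. The quaternion order (w, x, y, z), the keyframe tuple layout, and the function names are hypothetical and do not reflect the dataset's annotation format.

```python
import math

def lerp(a, b, alpha):
    """Linear interpolation between two vectors."""
    return [x + alpha * (y - x) for x, y in zip(a, b)]

def slerp(q0, q1, alpha):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip to follow the shorter arc.
        q1 = [-c for c in q1]
        dot = -dot
    if dot > 0.9995:
        # Nearly identical orientations: fall back to a normalised lerp.
        q = lerp(q0, q1, alpha)
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - alpha) * theta) / math.sin(theta)
        s1 = math.sin(alpha * theta) / math.sin(theta)
        q = [s0 * a + s1 * b for a, b in zip(q0, q1)]
    norm = math.sqrt(sum(c * c for c in q))
    return [c / norm for c in q]

def interpolate_pose(key0, key1, frame):
    """Interpolate a pose at `frame` between two keyframes.

    Each keyframe is a hypothetical tuple (frame_index, quaternion, translation).
    """
    f0, q0, t0 = key0
    f1, q1, t1 = key1
    alpha = (frame - f0) / (f1 - f0)
    return slerp(q0, q1, alpha), lerp(t0, t1, alpha)

# Example keyframes 10 frames apart: identity, then 90 degrees about z.
half = math.radians(90.0) / 2.0
key_a = (0, [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0])
key_b = (10, [math.cos(half), 0.0, 0.0, math.sin(half)], [0.1, 0.0, 1.0])
quat_mid, trans_mid = interpolate_pose(key_a, key_b, 5)
```

At the midpoint frame this yields a 45-degree rotation about z and a translation halfway between the two keyframes.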
For enquiries, questions, or comments, please contact corsmal-challenge@qmul.ac.uk or a.xompero@qmul.ac.uk.
Note: The mixed-reality set was built on top of previous works for the generation of synthetic and mixed-reality datasets, such as OBMan and NOCS-CAMERA.
Files
annotations.zip
Total size: 70.2 GB
MD5 checksum | Size |
---|---|
a34b60c40b314fb8772ced1dc7c5543f | 62.7 MB |
346dbc310dbe9600877759be8c300c93 | 14.6 MB |
e9e40110a7bd71e490046ae96e39e1d6 | 938.7 MB |
75ae4d7917b45796ccbb4c4d0c5b9328 | 1.3 GB |
f8eb2708a95e07d9c693228b9de3d095 | 343 Bytes |
a713a3f63157b878774d10a6ec0dfa33 | 227.8 MB |
73a2fbc402b0c67d54fb4c6fe7da3a89 | 3.3 GB |
5f758350e43e04a7f20fa348b441b54c | 3.3 GB |
a71b0896dfd01e0fabbbcbc493ef2342 | 3.3 GB |
7708e92ef809418628b74120ba0c5d67 | 3.2 GB |
2c6226b0e57a43769514b1fc4b1845dd | 8.3 MB |
bd0c4ef4b4c259c5c01970eb70880d13 | 28.4 kB |
c12f0ab7fd75b2e4b2442aae7cc2ca0e | 3.6 GB |
991f2c5fe8d23a63959ad031031860e8 | 3.7 GB |
fdd1673ccaed7b5eba39ecfbb786605d | 3.7 GB |
89ff684a311f825142462feb608e4094 | 3.7 GB |
f901c6c928ad142fbf42e3ef81f3a866 | 3.7 GB |
01a878433f736b63aefbb708e505257d | 3.7 GB |
854dcce084d82eb7903144ace2e887da | 3.7 GB |
f6d0801d7ed940a19e3a6bd11ecd2e4f | 3.7 GB |
2ebf894e34521fbb5e5196e593f6857d | 3.7 GB |
2fbcb8fe3d899c2e58dc96abb9fa22c5 | 3.7 GB |
86583e5eac42e4b6761b9b4bbb4c76ba | 3.7 GB |
76c078821b51aff65fab6fa800122d45 | 3.7 GB |
22bb6b38675387ddd3d85d57c7169e38 | 3.7 GB |
57045a470a3f120e71bb653584e244da | 3.7 GB |
d26151fdabab8dcc427d30b026beec31 | 3.0 GB |
736dfa832d8c794fe3da496ce33d9363 | 318 Bytes |
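The listing above pairs each file with an MD5 checksum, which can be used to confirm that a multi-gigabyte download completed intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_md5` and the example path in the comment are hypothetical and are not part of the dataset.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_md5(path, expected):
    """Return True if the file's MD5 matches `expected`.

    `expected` may be given with or without a leading "md5:" prefix.
    """
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex

# Hypothetical usage (path and checksum are placeholders):
# ok = verify_md5("path/to/downloaded_file.zip", "md5:<checksum from the listing>")
```

Reading in chunks keeps memory use constant, which matters for the archives of several gigabytes in this record.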
Additional details
Related works
- Cites
- Dataset: 10.17636/101CORSMAL1 (DOI)
Funding
- UK Research and Innovation
- CORSMAL: Collaborative object recognition, shared manipulation and learning EP/S031715/1