Action Databases

  1. Actor and Action Dataset - 3782 videos, seven classes of actors performing eight different actions (Xu, Hsieh, Xiong, Corso)
  2. An analyzed collation of various labeled video datasets for action recognition (Kevin Murphy)
  3. 50 Salads - fully annotated 4.5 hour dataset of RGB-D video + accelerometer data, capturing 25 people preparing two mixed salads each (Dundee University, Sebastian Stein)
  4. ASLAN Action similarity labeling challenge database (Orit Kliper-Gross)
  5. Berkeley MHAD: A Comprehensive Multimodal Human Action Database (Ferda Ofli)
  6. BEHAVE Interacting Person Video Data with markup (Scott Blunsden, Bob Fisher, Aroosha Laghaee)
  7. Brown Breakfast Actions Dataset - 70 hours, 4 million frames of 10 different breakfast preparation activities (Kuehne, Arslan and Serre)
  8. Cornell Activity Datasets CAD 60, CAD 120 (Cornell Robot Learning Lab)
  9. CVBASE06: annotated sports videos (Janez Pers)
  10. DogCentric Activity Dataset - first-person videos taken from a camera mounted on top of a *dog* (Michael Ryoo)
  11. FCVID: Fudan-Columbia Video Dataset - 91,223 Web videos annotated manually according to 239 categories (Jiang, Wu, Wang, Xue, Chang)
  12. G3D - synchronised video, depth and skeleton data for 20 gaming actions captured with Microsoft Kinect (Victoria Bloom)
  13. Georgia Tech Egocentric Activities - Gaze(+) - videos of what people look at, with their gaze location (Fathi, Li, Rehg)
  14. HMDB: A Large Human Motion Database (Serre Lab)
  15. Hollywood 3D dataset - 650 3D video clips, across 14 action classes (Hadfield and Bowden)
  16. Human Actions and Scenes Dataset (Marcin Marszalek, Ivan Laptev, Cordelia Schmid)
  17. HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion (Brown University)
  18. i3DPost Multi-View Human Action Datasets (Hansung Kim)
  19. i-LIDS video event image dataset (Imagery library for intelligent detection systems) (Paul Hosner)
  20. INRIA Xmas Motion Acquisition Sequences (IXMAS) (INRIA)
  21. Jena Action Recognition Dataset - Aibo dog actions (Korner and Denzler)
  22. JPL First-Person Interaction dataset - 7 types of human activity videos taken from a first-person viewpoint (Michael S. Ryoo, JPL)
  23. KTH human action recognition database (KTH CVAP lab)
  24. LIRIS human activities dataset - 2 cameras, annotated, depth images (Christian Wolf, et al)
  25. ManiAc RGB-D action dataset: different manipulation actions, 15 different versions, 30 different objects manipulated, 20 long and complex chained manipulation sequences (Eren Aksoy)
  26. MEXaction2 action detection and localization dataset - to support the development and evaluation of methods for 'spotting' instances of short actions in a relatively large video database: 77 hours, 117 videos (Michel Crucianu and Jenny Benois-Pineau)
  27. MPII Cooking Activities Dataset (M. Rohrbach)
  28. MSRC-12 Kinect gesture data set - 594 sequences and 719,359 frames from people performing 12 gestures (Microsoft Research Cambridge)
  29. MuHAVi - Multicamera Human Action Video Data (Hossein Ragheb)
  30. Oxford TV based human interactions (Oxford Visual Geometry Group)
  31. Rochester Activities of Daily Living Dataset (Ross Messing)
  32. SDHA Semantic Description of Human Activities 2010 contest - aerial views (Michael S. Ryoo, J. K. Aggarwal, Amit K. Roy-Chowdhury)
  33. SDHA Semantic Description of Human Activities 2010 contest - Human Interactions (Michael S. Ryoo, J. K. Aggarwal, Amit K. Roy-Chowdhury)
  34. Stanford Sport Events dataset (Jia Li)
  35. THUMOS - Action Recognition in Temporally Untrimmed Videos! - 430 hours of video data and 45 million frames (Gorban, Idrees, Jiang, Zamir, Laptev, Shah, Sukthankar)
  36. TUM Kitchen Data Set of Everyday Manipulation Activities (Moritz Tenorth, Jan Bandouch)
  37. TV Human Interaction Dataset (Alonso Patron-Perez)
  38. Univ of Central Florida - Feature Films Action Dataset (Univ of Central Florida)
  39. Univ of Central Florida - YouTube Action Dataset (sports) (Univ of Central Florida)
  40. Univ of Central Florida - 50 Action Category Recognition in Realistic Videos (3 GB) (Kishore Reddy)
  41. UCF 101 action dataset - 101 action classes, over 13k clips and 27 hours of video data (Univ of Central Florida)
  42. Univ of Central Florida - Sports Action Dataset (Univ of Central Florida)
  43. Univ of Central Florida - ARG - Aerial camera, Rooftop camera and Ground camera (UCF Computer Vision Lab)
  44. UCR Videoweb Multi-camera Wide-Area Activities Dataset (Amit K. Roy-Chowdhury)
  45. Verona Social interaction dataset (Marco Cristani)
  46. Videoweb (multicamera) Activities Dataset (B. Bhanu, G. Denina, C. Ding, A. Ivers, A. Kamal, C. Ravishankar, A. Roy-Chowdhury, B. Varda)
  47. ViHASi: Virtual Human Action Silhouette Data (userID: VIHASI password: virtual$virtual) (Hossein Ragheb, Kingston University)
  48. VIRAT Video Dataset - event recognition from two broad categories of activities (single-object and two-objects) which involve both humans and vehicles. (Sangmin Oh et al)
  49. WorkoutSU-10 Kinect dataset for exercise actions (Ceyhun Akgul)
  50. Wrist-mounted camera video dataset - object manipulation (Ohnishi, Kanehira, Kanezaki, Harada)
  51. WVU Multi-view action recognition dataset (Univ. of West Virginia)
  52. YouCook - 88 open-source YouTube cooking videos with annotations (Jason Corso)

Biological/Medical

  1. 2008 MICCAI MS Lesion Segmentation Challenge (National Institutes of Health Blueprint for Neuroscience Research)
  2. Annotated Spine CT Database for Benchmarking of Vertebrae Localization, 125 patients, 242 scans (Ben Glocker)
  3. Cavy Action Dataset - 16 sequences with 640 x 480 resolution recorded at 7.5 frames per second (fps) with approximately 31621506 frames in total (272 GB) of interacting cavies (guinea pigs) (Al-Raziqi and Denzler)
  4. Computed Tomography Emphysema Database (Lauge Sorensen)
  5. CRCHistoPhenotypes - Labeled Cell Nuclei Data - colorectal cancer histology images consisting of nearly 30,000 dotted nuclei with over 22,000 labeled with the cell type (Rajpoot + Sirinukunwattana)
  6. Dermoscopy images (Eric Ehrsam)
  7. DIADEM: Digital Reconstruction of Axonal and Dendritic Morphology Competition (Allen Institute for Brain Science et al)
  8. DIARETDB1 - Standard Diabetic Retinopathy Database (Lappeenranta Univ of Technology)
  9. DRIVE: Digital Retinal Images for Vessel Extraction (Univ of Utrecht)
  10. Leaf Segmentation Challenge - Tobacco and arabidopsis plant images (Hanno Scharr, Massimo Minervini, Andreas Fischbach, Sotirios A. Tsaftaris)
  11. Mini Mammographic Database (Mammographic Image Analysis Society)
  12. MIT CBCL Automated Mouse Behavior Recognition datasets (Nicholas Edelman)
  13. Moth fine-grained recognition - 675 similar classes, 5344 images (Erik Rodner et al)
  14. Mouse Embryo Tracking Database - cell division event detection (Marcelo Cicconet, Kris Gunsalus)
  15. OASIS - Open Access Series of Imaging Studies - 500+ MRI data sets of the brain (Washington University, Harvard University, Biomedical Informatics Research Network)
  16. Plant Phenotyping Datasets - plant data suitable for plant and leaf detection, segmentation, tracking, and species recognition (M. Minervini, A. Fischbach, H. Scharr, S. A. Tsaftaris)
  17. Retinal fundus images - Ground truth of vascular bifurcations and crossovers (Univ of Groningen)
  18. Spine and Cardiac data (Digital Imaging Group of London Ontario, Shuo Li)
  19. Univ of Central Florida - DDSM: Digital Database for Screening Mammography (Univ of Central Florida)
  20. VascuSynth - 120 3D vascular tree like structures with ground truth (Mengliu Zhao, Ghassan Hamarneh)
  21. York Cardiac MRI dataset (Alexander Andreopoulos)

Face Databases

  1. 300 Videos in the Wild (300-VW) - 68 Facial Landmark Tracking (Chrysos, Antonakos, Zafeiriou, Snape, Shen, Kossaifi, Tzimiropoulos, Pantic)
  2. 3D Mask Attack Database (3DMAD) - 76500 frames of 17 persons using Kinect RGBD with eye positions (Sebastien Marcel)
  3. Audio-visual database for face and speaker recognition (Mobile Biometry MOBIO)
  4. BANCA face and voice database (Univ of Surrey)
  5. Binghamton Univ 3D static and dynamic facial expression database (Lijun Yin, Peter Gerhardstein and teammates)
  6. BioID face database (BioID group)
  7. Biwi 3D Audiovisual Corpus of Affective Communication - 1000 high quality, dynamic 3D scans of faces, recorded while pronouncing a set of English sentences.
  8. Bosphorus 3D/2D Database of FACS annotated facial expressions, of head poses and of face occlusions (Bogazici University)
  9. CMU Facial Expression Database (CMU/MIT)
  10. CMU/MIT Frontal Faces (CMU/MIT)
  11. Cohn-Kanade AU-Coded Expression Database - 500+ expression sequences of 100+ subjects, coded by activated Action Units (Affect Analysis Group, Univ. of Pittsburgh)
  12. CMU/MIT Frontal Faces (CMU/MIT)
  13. CMU Pose, Illumination, and Expression (PIE) Database (Simon Baker)
  14. CSSE Frontal intensity and range images of faces (Ajmal Mian)
  15. Face Recognition Grand Challenge datasets (FRVT - Face Recognition Vendor Test)
  16. FaceScrub - A Dataset With Over 100,000 Face Images of 530 People (50:50 male and female) (H.-W. Ng, S. Winkler)
  17. FaceTracer Database - 15,000 faces (Neeraj Kumar, P. N. Belhumeur, and S. K. Nayar)
  18. FDDB: Face Detection Data set and Benchmark - studying unconstrained face detection (University of Massachusetts Computer Vision Laboratory)
  19. FG-Net Aging Database of faces at different ages (Face and Gesture Recognition Research Network)
  20. Facial Recognition Technology (FERET) Database (USA National Institute of Standards and Technology)
  21. Hannah and her sisters database - a dense audio-visual person-oriented ground-truth annotation of faces, speech segments, shot boundaries (Patrick Perez, Technicolor)
  22. Hong Kong Face Sketch Database
  23. Japanese Female Facial Expression (JAFFE) Database (Michael J. Lyons)
  24. LFW: Labeled Faces in the Wild - unconstrained face recognition
  25. Manchester Annotated Talking Face Video Dataset (Timothy Cootes)
  26. MegaFace - 1 million faces in bounding boxes (Kemelmacher-Shlizerman, Seitz, Nech, Miller, Brossard)
  27. MIT CBCL Face Recognition Database (Center for Biological and Computational Learning)
  28. MIT Collation of Face Databases (Ethan Meyers)
  29. MMI Facial Expression Database - 2900 videos and high-resolution still images of 75 subjects, annotated for FACS AUs.
  30. MORPH (Craniofacial Longitudinal Morphological Face Database) (University of North Carolina Wilmington)
  31. NIST Face Recognition Grand Challenge (FRGC) (NIST)
  32. NIST mugshot identification database (USA National Institute of Standards and Technology)
  33. Notre Dame face, IR face, 3D face, expression, crowd, and eye biometric datasets (Notre Dame)
  34. ORL face database: 40 people with 10 views (ATT Cambridge Labs)
  35. Oxford: faces, flowers, multi-view, buildings, object categories, motion segmentation, affine covariant regions, misc (Oxford Visual Geometry Group)
  36. OUI-Adience Faces - unfiltered faces for gender and age classification plus 3D faces (OUI)
  37. PubFig: Public Figures Face Database (Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar)
  38. Re-labeled Faces in the Wild - original images, but aligned using "deep funneling" method. (University of Massachusetts, Amherst)
  39. SCface - Surveillance Cameras Face Database (Mislav Grgic, Kresimir Delac, Sonja Grgic, Bozidar Klimpak)
  40. Trondheim Kinect RGB-D Person Re-identification Dataset (Igor Barros Barbosa)
  41. UB KinFace Database - University of Buffalo kinship verification and recognition database
  42. WIDER FACE: A Face Detection Benchmark - 32,203 images with 393,703 labeled faces, 61 event classes (Shuo Yang, Ping Luo, Chen Change Loy, Xiaoou Tang)
  43. XM2VTS Face video sequences (295): The extended M2VTS Database (XM2VTS) (Surrey University)
  44. Yale Face Database - 11 expressions of 10 people (A. Georghiades)
  45. Yale Face Database B - 576 viewing conditions of 10 people (A. Georghiades)
  46. YouTube Faces DB - 3,425 videos of 1,595 different people. (Wolf, Hassner, Maoz)

Fingerprints

  1. FVC fingerprint verification competition 2002 dataset (University of Bologna)
  2. FVC fingerprint verification competition 2004 dataset (University of Bologna)
  3. FVC - a subset of FVC (Fingerprint Verification Competition) 2002 and 2004 fingerprint image databases, manually extracted minutiae data & associated documents (Umut Uludag)
  4. NIST fingerprint databases (USA National Institute of Standards and Technology)
  5. SPD2010 Fingerprint Singular Points Detection Competition (SPD 2010 committee)

General Images

  1. Aerial color image dataset (Swiss Federal Institute of Technology)
  2. AMOS: Archive of Many Outdoor Scenes (20+m) (Nathan Jacobs)
  3. Brown Univ Large Binary Image Database (Ben Kimia)
  4. Caltech-UCSD Birds-200-2011 (Catherine Wah)
  5. Columbia Multispectral Image Database (F. Yasuma, T. Mitsunaga, D. Iso, and S.K. Nayar)
  6. HIPR2 Image Catalogue of different types of images (Bob Fisher et al)
  7. Hyperspectral images of natural scenes - 2002 (David H. Foster)
  8. Hyperspectral images of natural scenes - 2004 (David H. Foster)
  9. ImageNet Linguistically organised (WordNet) Hierarchical Image Database - 10E7 images, 15K categories (Li Fei-Fei, Jia Deng, Hao Su, Kai Li)
  10. ImageNet Large Scale Visual Recognition Challenges - currently 200 object classes and 500+K images (Alex Berg, Jia Deng, Fei-Fei Li and others)
  11. ISPRS multi-platform photogrammetry dataset - 1) Nadir and oblique aerial images plus 2) Combined UAV and terrestrial images (Francesco Nex and Markus Gerke)
  12. LabelMeFacade Database - 945 labeled building images (Erik Rodner et al)
  13. NYU Symmetry Database - 176 single-symmetry and 63 multiple-symmetry images (Marcelo Cicconet and Davi Geiger)
  14. OTCBVS Thermal Imagery Benchmark Dataset Collection (Ohio State Team)
  15. McGill Calibrated Colour Image Database (Adriana Olmos and Fred Kingdom)
  16. Tiny Images Dataset - 79 million 32x32 color images (Fergus, Torralba, Freeman)
  17. Visual Question Answering - 254K images, 764K questions, ground truth (Agrawal, Lu, Antol, Mitchell, Zitnick, Batra, Parikh)

General RGBD and Depth Datasets

Note: there are 3D datasets elsewhere as well, e.g. in Objects, Scenes, and Actions.

  1. A Large Dataset of Object Scans - 392 objects in 9 classes, hundreds of frames each (Choi, Zhou, Miller, Koltun)
  2. BigBIRD - 100 objects with, for each object, 600 3D point clouds and 600 high-resolution color images spanning all views (Singh, Sha, Narayan, Achim, Abbeel)
  3. CAESAR Civilian American and European Surface Anthropometry Resource Project - 4000 3D human body scans (SAE International)
  4. CIN 2D+3D object classification dataset - segmented color and depth images of objects from 18 categories of common household and office objects (Björn Browatzki et al)
  5. Cornell-RGBD-Dataset - Office Scenes (Hema Koppula)
  6. IMPART multi-view/multi-modal 2D+3D film production dataset - LIDAR, video, 3D models, spherical camera, RGBD, stereo, action, facial expressions, etc. (Univ. of Surrey)
  7. NYU Depth Dataset V2 - Indoor Segmentation and Support Inference from RGBD Images
  8. Oakland 3-D Point Cloud Dataset (Nicolas Vandapel)
  9. Semantic-8: 3D point cloud classification with 8 classes (ETH Zurich)
  10. Washington RGB-D Object Dataset - 300 common household objects and 14 scenes. (University of Washington and Intel Labs Seattle)

Hand, Hand Grasp, Action and Gesture Databases

  1. 3D Articulated Hand Pose Estimation with Single Depth Images (Tang, Chang, Tejani, Kim, Yu)
  2. A Dataset of Human Manipulation Actions - RGB-D of 25 objects and 6 actions (Alessandro Pieropan)
  3. A-STAR Annotated Hand-Depth Image Dataset and its Performance Evaluation - depth data and data glove data, 29 images of 30 volunteers, Chinese number counting and American Sign Language (Xu and Cheng)
  4. General HANDS: general hand detection and pose challenge - 22 sequences with different gestures, activities and viewpoints (UC Irvine)
  5. A Hand Gesture Detection Dataset (Javier Molina et al)
  6. Hand gesture and marine silhouettes (Euripides G.M. Petrakis)
  7. HandNet: annotated depth images of articulated hands - 214971 annotated depth images of hand poses captured by a RealSense RGBD sensor. Annotations: per pixel classes, 6D fingertip pose, heatmap. Images -> Train: 202198, Test: 10000, Validation: 2773. Recorded at GIP Lab, Technion.
  8. IDIAP Hand pose/gesture datasets (Sebastien Marcel)
  9. LISA Vehicle Detection Dataset - colour first person driving video under various lighting and traffic conditions (Sivaraman, Trivedi)
  10. LISA CVRR-HANDS 3D - 19 gestures performed by 8 subjects as car driver and passengers (Ohn-Bar and Trivedi)
  11. Mobile and Webcam Hand images database - MOHI and WEHI - 200 people, 30 images each (Ahmad Hassanat)
  12. MPI Dexter 1 Dataset for Evaluation of 3D Articulated Hand Motion Tracking - Dexter 1: 7 sequences of challenging, slow and fast hand motions, RGB + depth (Sridhar, Oulasvirta, Theobalt)
  13. MSR Realtime and Robust Hand Tracking from Depth (Qian, Sun, Wei, Tang, Sun)
  14. NYU Hand Pose Dataset - 8252 test-set and 72757 training-set frames of captured RGBD data with ground-truth hand-pose, 3 views (Tompson, Stein, Lecun, Perlin)
  15. Sheffield gesture database - 2160 RGBD hand gesture sequences, 6 subjects, 10 gestures, 3 postures, 3 backgrounds, 2 illuminations (Ling Shao)
  16. UT Grasp Data Set - 4 subjects grasping a variety of objects with a variety of grasps (Cai, Kitani, Sato)
  17. Yale human grasping data set - 27 hours of video with tagged grasp, object, and task data from two housekeepers and two machinists (Bullock, Feix, Dollar)

People, human body pose

  1. Frames Labeled In Cinema (FLIC) - 20928 frames labeled with human pose (Sapp, Taskar)
  2. Leeds Sports Pose Dataset - 2000 pose annotated images of mostly sports people (Johnson, Everingham)
  3. MPII Human Pose Dataset - 25K images containing over 40K people with annotated body joints, 410 human activities (Andriluka, Pishchulin, Gehler, Schiele)
  4. Pointing'04 ICPR Workshop Head Pose Image Database
  5. VGG Human Pose Estimation datasets including the BBC Pose (20 videos with an overlaid sign language interpreter), Extended BBC Pose (72 additional training videos), Short BBC Pose (5 one hour videos with sign language signers), and ChaLearn Pose (23 hours of Kinect data of 27 persons performing 20 Italian gestures). (Charles, Everingham, Pfister, Magee, Hogg, Simonyan, Zisserman)

Image, Video and Shape Database Retrieval

  1. ANN_SIFT1M - 1M Flickr images encoded by 128D SIFT descriptors (Jegou et al)
  2. Brown Univ 25/99/216 Shape Databases (Ben Kimia)
  3. CIFAR-10 - 60K 32x32 images from 10 classes, with a 512D GIST descriptor (Alex Krizhevsky)
  4. CLEF-IP 2011 evaluation on patent images
  5. Flickr 30K - images, actions and captions (Peter Young et al)
  6. IAPR TC-12 Image Benchmark (Michael Grubinger)
  7. IAPR-TC12 Segmented and annotated image benchmark (SAIAPR TC-12) (Hugo Jair Escalante)
  8. ImageCLEF 2010 Concept Detection and Annotation Task (Stefanie Nowak)
  9. ImageCLEF 2011 Concept Detection and Annotation Task - multi-label classification challenge in Flickr photos
  10. McGill 3D Shape Benchmark (Siddiqi, Zhang, Macrini, Shokoufandeh, Bouix, Dickinson)
  11. MPI Movie Description dataset - text and video (A. Rohrbach)
  12. NIST SHREC 2010 - Shape Retrieval Contest of Non-rigid 3D Models (USA National Institute of Standards and Technology)
  13. NIST SHREC - other NIST retrieval contest databases and links (USA National Institute of Standards and Technology)
  14. NIST TREC Video Retrieval Evaluation Database (USA National Institute of Standards and Technology)
  15. NUS-WIDE - 269K Flickr images annotated with 81 concept tags, encoded as a 500D BoVW descriptor (Chua et al)
  16. Princeton Shape Benchmark (Princeton Shape Retrieval and Analysis Group)
  17. Queensland cross media dataset - millions of images and text documents for "cross-media" retrieval (Yi Yang)
  18. TOSCA 3D shape database (Bronstein, Bronstein, Kimmel)

Object Databases

  1. B3DO: Berkeley 3-D Object Dataset - household object detection (Janoch et al)
  2. Bristol Egocentric Object Interactions Dataset - egocentric object interactions with synchronised gaze (Dima Damen)
  3. 2.5D/3D Datasets of various objects and scenes (Ajmal Mian)
  4. Amsterdam Library of Object Images (ALOI): 100K views of 1K objects (University of Amsterdam/Intelligent Sensory Information Systems)
  5. Beyond PASCAL: A Benchmark for 3D Object Detection in the Wild - 12 class, 3000+ images each with 3D annotations (Yu Xiang, Roozbeh Mottaghi, Silvio Savarese)
  6. Caltech 101 (now 256) category object recognition database (Li Fei-Fei, Marco Andreetto, Marc'Aurelio Ranzato)
  7. Catania Fish Species Recognition - 15 fish species, with about 20,000 sample training images and additional test images (Concetto Spampinato)
  8. Columbia COIL-100 3D object multiple views (Columbia University)
  9. CORE image dataset - to help learn more detailed models and for exploring cross-category generalization in object recognition. (Ali Farhadi, Ian Endres, Derek Hoiem, and David A. Forsyth)
  10. Densely sampled object views: 2500 views of 2 objects, eg for view-based recognition and modeling (Gabriele Peters, Universiteit Dortmund)
  11. Ellipse finding dataset (Dilip K. Prasad et al)
  12. GTSDB: German Traffic Sign Detection Benchmark (Ruhr-Universitat Bochum)
  13. GRAZ-02 Database (Bikes, cars, people) (A. Pinz)
  14. Linkoping 3D Object Pose Estimation Database (Fredrik Viksten and Per-Erik Forssen)
  15. Linkoping Traffic Signs Dataset - 3488 traffic signs in 20K images (Larsson and Felsberg)
  16. LISA Traffic Light Dataset - 6 light classes in various lighting conditions (Jensen, Philipsen, Mogelmose, Moeslund, and Trivedi)
  17. LISA Traffic Sign Dataset - video of 47 US sign types with 7855 annotations on 6610 frames (Mogelmose, Trivedi, and Moeslund)
  18. Microsoft COCO - Common Objects in Context (Tsung-Yi Lin et al)
  19. Microsoft Object Class Recognition image databases (Antonio Criminisi, Pushmeet Kohli, Tom Minka, Carsten Rother, Toby Sharp, Jamie Shotton, John Winn)
  20. Microsoft salient object databases (labeled by bounding boxes) (Liu, Sun, Zheng, Tang, Shum)
  21. MIT CBCL Car Data (Center for Biological and Computational Learning)
  22. MIT CBCL StreetScenes Challenge Framework (Stan Bileschi)
  23. ModelNet - 127,915 CAD Models, 662 Object Categories, 10 Categories with Annotated Orientation (Wu, Song, Khosla, Yu, Zhang, Tang, Xiao)
  24. NABirds Dataset - 70,000 annotated photographs of the 400 species of birds commonly observed in North America (Grant Van Horn)
  25. NEC Toy animal object recognition or categorization database (Hossein Mobahi)
  26. NORB 50 toy image database (NYU)
  27. PASCAL Image Database (motorbikes, cars, cows) (PASCAL Consortium)
  28. PASCAL 2007 Challenge Image Database (motorbikes, cars, cows) (PASCAL Consortium)
  29. PASCAL 2008 Challenge Image Database (PASCAL Consortium)
  30. PASCAL 2009 Challenge Image Database (PASCAL Consortium)
  31. PASCAL 2010 Challenge Image Database (PASCAL Consortium)
  32. PASCAL 2011 Challenge Image Database (PASCAL Consortium)
  33. PASCAL 2012 Challenge Image Database - category classification, detection, and segmentation, and still-image action classification (PASCAL Consortium)
  34. PASCAL-Context dataset - annotations for 400+ additional categories (Alan Yuille)
  35. PASCAL Parts dataset - PASCAL VOC with segmentation annotation for semantic parts of objects (Alan Yuille)
  36. UAH Traffic Signs Dataset (Bascón, Arroyo, Jiménez, Moreno and López)
  37. UIUC Car Image Database (UIUC)
  38. UIUC Dataset of 3D object categories (S. Savarese and L. Fei-Fei)
  39. Venezia 3D object-in-clutter recognition and segmentation (Emanuele Rodola)
  40. Visual Attributes Dataset - visual attribute annotations for over 500 object classes (animate and inanimate) which are all represented in ImageNet. Each object class is annotated with visual attributes based on a taxonomy of 636 attributes (e.g., has fur, made of metal, is round).

People, Pedestrian, Eye/Iris, Template Detection/Tracking Databases

  1. 3D KINECT Gender Walking data base (L. Igual, A. Lapedriza, R. Borràs from UB, CVC and UOC, Spain)
  2. AGORASET: a dataset for crowd video analysis (Nicolas Courty et al)
  3. Bosphorus Hand Geometry Database and Hand-Vein Database (Bogazici University)
  4. Caltech Pedestrian Dataset (P. Dollar, C. Wojek, B. Schiele and P. Perona)
  5. CASIA gait database (Chinese Academy of Sciences)
  6. CASIA-IrisV3 (Chinese Academy of Sciences, T. N. Tan, Z. Sun)
  7. CAVIAR project video sequences with tracking and behavior ground truth (CAVIAR team/Edinburgh University - EC project IST-2001-37540)
  8. CUHK Crowd Dataset - 474 video clips from 215 crowded scenes (Shao, Loy, and Wang)
  9. Crime Scene Footwear Impression Database - crime scene and reference footwear impression images (Adam Kortylewski)
  10. CUHK01 Dataset: Person re-id dataset with 3,884 images of 972 pedestrians (Rui Zhao et al)
  11. CUHK02 Dataset: Person re-id dataset with five camera view settings. (Rui Zhao et al)
  12. CUHK03 Dataset: Person re-id dataset with 13,164 images of 1,360 pedestrians (Rui Zhao et al)
  13. Daimler Pedestrian Detection Benchmark - 21790 images with 56492 pedestrians plus empty scenes (D. M. Gavrila et al)
  14. Driver Monitoring Video Dataset (RobeSafe + Jesus Nuevo-Chiquero)
  15. Edinburgh overhead camera person tracking dataset (Bob Fisher, Bashia Majecka, Gurkirt Singh, Rowland Sillito)
  16. Eyetracking database summary (Stefan Winkler)
  17. GVVPerfcapEva - repository of human shape and performance capture data, including full body skeletal, hand tracking, body shape, face performance, interactions (Christian Theobalt)
  18. HAT database of 27 human attributes (Gaurav Sharma, Frederic Jurie)
  19. INRIA Person Dataset (Navneet Dalal)
  20. ISMAR09 ground truth video dataset for template-based (i.e. planar) tracking algorithms (Sebastian Lieberknecht)
  21. Izmir - omnidirectional and panoramic image dataset (with annotations) to be used for human and car detection (Yalin Bastanlar)
  22. MAHNOB: MHI-Mimicry database - A 2 person, multiple camera and microphone database for studying mimicry in human-human interaction scenarios. (Sun, Lichtenauer, Valstar, Nijholt, and Pantic)
  23. Market-1501 Dataset - 32,668 annotated bounding boxes of 1,501 identities from up to 6 cameras (Liang Zheng et al)
  24. MPI DYNA - A Model of Dynamic Human Shape in Motion (Max Planck Tubingen)
  25. Multimodal Activities of Daily Living - including video, audio, physiological, sleep, motion and plug sensors. (Alexia Briasouli)
  26. MIT CBCL Pedestrian Data (Center for Biological and Computational Learning)
  27. MIT eye tracking database (1003 images) (Judd et al)
  28. Modena and Reggio Emilia first person head motion videos (Univ of Modena and Reggio Emilia)
  29. MPI FAUST Dataset - a data set containing 300 real, high-resolution human scans, with automatically computed ground-truth correspondences (Max Planck Tubingen)
  30. MPI MOSH Motion and Shape Capture from Markers. MOCAP data, 3D shape meshes, 3D high resolution scans. (Max Planck Tubingen)
  31. Multiple Object Tracking Benchmark - A collection of datasets with ground truth, plus a performance league table (ETHZ, U. Adelaide, TU Darmstadt)
  32. Notre Dame Iris Image Dataset (Patrick J. Flynn)
  33. NYU Multiple Object Tracking Benchmark (Konrad Schindler et al)
  34. PARSE Dataset of Articulated Bodies - 300 images of humans and horses (Ramanan)
  35. PARSE Dataset Additional Data - facial expression, gaze direction, and gender (Antol, Zitnick, Parikh)
  36. PETS 2009 Crowd Challenge dataset (Reading University & James Ferryman)
  37. PETS: Performance Evaluation of Tracking and Surveillance (Reading University & James Ferryman)
  38. PETS Winter 2009 workshop data (Reading University & James Ferryman)
  39. Pixel-based change detection benchmark dataset (Goyette et al)
  40. PIROPO - People in Indoor ROoms with Perspective and Omnidirectional cameras, with more than 100,000 annotated frames (GTI-UPM, Spain)
  41. RAiD - Re-Identification Across Indoor-Outdoor Dataset: 43 people, 4 cameras, 6920 images (Abir Das et al)
  42. Stanford Structured Group Discovery dataset - Discovering Groups of People in Images (W. Choi et al)
  43. Transient Biometrics Nails Dataset V01 (Igor Barros Barbosa)
  44. UBIRIS: Noisy Visible Wavelength Iris Image Databases (University of Beira)
  45. Univ of Central Florida - Crowd Dataset (Saad Ali)
  46. Univ of Central Florida - Crowd Flow Segmentation datasets (Saad Ali)
  47. UTIRIS cross-spectral iris image databank (Mahdi Hosseini)
  48. VIPeR: Viewpoint Invariant Pedestrian Recognition - 632 pedestrian image pairs taken from arbitrary viewpoints under varying illumination conditions. (Gray, Brennan, and Tao)
  49. York Univ Eye Tracking Dataset (120 images) (Neil Bruce)

Remote Sensing

  1. ISPRS 2D semantic labeling - Height models and true ortho-images with a ground sampling distance of 5cm have been prepared over the city of Potsdam/Germany (Franz Rottensteiner, Gunho Sohn, Markus Gerke, Jan D. Wegner)
  2. ISPRS 3D semantic labeling - nine class airborne laser scanning data (Franz Rottensteiner, Gunho Sohn, Markus Gerke, Jan D. Wegner)

Scene Segmentation or Classification

  1. Barcelona - 15,150 images, urban views of Barcelona (Tighe and Lazebnik)
  2. COLD (COsy Localization Database) - place localization (Ullah, Pronobis, Caputo, Luo, and Jensfelt)
  3. Geometric Context - scene interpretation images (Derek Hoiem)
  4. Indoor Scene Recognition - 67 Indoor categories, 15620 images (Quattoni and Torralba)
  5. LM+SUN - 45,676 images, mainly urban or human related scenes (Tighe and Lazebnik)
  6. Places Scene Recognition database - 205 scene categories and 2.5 million images (Zhou, Lapedriza, Xiao, Torralba, and Oliva)
  7. RGB-NIR Scene Dataset - 477 images in 9 categories captured in RGB and Near-infrared (NIR) (Brown and Susstrunk)
  8. Sift Flow (also known as LabelMe Outdoor, LMO) - 2688 images, mainly outdoor natural and urban (Tighe and Lazebnik)
  9. Stanford Background Dataset - 715 images of outdoor scenes containing at least one foreground object (Gould et al)
  10. SUN 2012 - 16,873 fully annotated scene images for scene categorization (Xiao et al)
  11. SUN 397 - 397 scene categories for scene classification (Xiao et al)
  12. SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite - 10,000 RGB-D images, 146,617 2D polygons and 58,657 3D bounding boxes (Song, Lichtenberg, and Xiao)

Segmentation (General)

  1. Alpert et al. Segmentation evaluation database (Sharon Alpert, Meirav Galun, Ronen Basri, Achi Brandt)
  2. Berkeley Segmentation Dataset and Benchmark (David Martin and Charless Fowlkes)
  3. GrabCut Image database (C. Rother, V. Kolmogorov, A. Blake, M. Brown)
  4. LabelMe images database and online annotation tool (Bryan Russell, Antonio Torralba, Kevin Murphy, William Freeman)

Surveillance

  1. AVSS07: Advanced Video and Signal based Surveillance 2007 datasets (Andrea Cavallaro)
  2. ETISEO Video Surveillance Download Datasets (INRIA Orion Team and others)
  3. Heriot Watt Summary of datasets for human tracking and surveillance (Zsolt Husz)
  4. Openvisor - Video surveillance Online Repository (Univ of Modena and Reggio Emilia)
  5. SCOUTER - video surveillance ground truthing (shifting perspectives, different setups/lighting conditions, large variations of subject). 30 videos and approximately 36,000 manually labeled frames. (Catalin Mitrea)
  6. SPEVI: Surveillance Performance EValuation Initiative (Queen Mary University London)
  7. UCSD Anomaly Detection Dataset - a stationary camera mounted at an elevation, overlooking pedestrian walkways, with unusual pedestrian or non-pedestrian motion.
  8. UCSD trajectory clustering and analysis datasets (Morris and Trivedi)
  9. Udine Trajectory-based anomalous event detection dataset - synthetic trajectory datasets with outliers (Univ of Udine Artificial Vision and Real Time Systems Laboratory)

Textures

  1. Brodatz Texture, Normalized Brodatz Texture, Colored Brodatz Texture, Multiband Brodatz Texture - 154 new images plus 112 original images with various transformations (A. Safia, D. He)
  2. Color texture images by category (
  3. Columbia-Utrecht Reflectance and Texture Database (Columbia & Utrecht Universities)
  4. DynTex: Dynamic texture database (Renaud Péteri, Mark Huiskes and Sandor Fazekas)
  5. KTH TIPS & TIPS2 textures - pose/lighting/scale variations (Eric Hayman)
  6. Oulu Texture Database (Oulu University)
  7. Oxford Describable Textures Dataset - 5640 images in 47 categories (M.Cimpoi, S. Maji, I. Kokkinos, S. Mohamed, A. Vedaldi)
  8. Prague Texture Segmentation Data Generator and Benchmark (Mikes, Haindl)
  9. UIUC Textures: 1000+330+110 gray images of 11 classes (Kemal Kilic)
  10. Uppsala texture dataset of surfaces and materials - fabrics, grains, etc.
  11. Vision Texture (MIT Media Lab)

General Videos

  1. GoPro-Gyro Dataset - ego centric videos (Linkoping Computer Vision Laboratory)
  2. Large scale YouTube video dataset - 156,823 videos (2,907,447 keyframes) crawled from YouTube (Yi Yang)

Other Collections

  1. CALVIN research group datasets - object detection with eye tracking, imagenet bounding boxes, synchronised activities, stickman and body poses, youtube objects, faces, horses, toys, visual attributes, shape classes (CALVIN group)
  2. CANTATA Video and Image Database Index site (Multitel)
  3. Chinese University of Hong Kong datasets - Face sketch, face alignment, image search, public square observation, occlusion, central station, MIT single and multiple camera trajectories, person re-identification (Multimedia lab)
  4. Computer Vision Homepage list of test image databases (Carnegie Mellon Univ)
  5. ETHZ various, including ETH 3D head pose, BIWI audiovisual data, ETHZ shape classes, BIWI walking pedestrians, pedestrians, buildings, 4D MRI, personal events, liver ultrasound, Food 101 (ETH Zurich, Computer Vision Lab)
  6. IDIAP dataset collection - 26 different datasets - multimodal, attack, biometric, cursive characters, discourse, eye gaze, posters, maya codex, MOBIO, face spoofing, game playing, finger vein, youtube-personality traits (IDIAP team)
  7. Leibe's Collection of people/vehicle/object databases (Bastian Leibe)
  8. Lotus Hill Image Database Collection with Ground Truth (Sealeen Ren, Benjamin Yao, Michael Yang)
  9. Michael Firman's List of RGBD datasets
  10. MIT Saliency Benchmark dataset - collection (pointers to 23 datasets) (Bylinskii, Judd, Borji, Itti, Durand, Oliva, Torralba)
  11. Oxford Misc, including Buffy, Flowers, TV characters, Buildings, etc (Oxford Visual Geometry Group)
  12. PEIPA Image Database Summary (Pilot European Image Processing Archive)
  13. Univ of Bern databases on handwriting, online documents, string edit and graph matching (Univ of Bern, Computer Vision and Artificial Intelligence)
  14. USC Annotated Computer Vision Bibliography database publication summary (Keith Price)
  15. USC-SIPI image databases: texture, aerial, favorites (e.g. Lena) (USC Signal and Image Processing Institute)

Miscellaneous

  1. 3D mesh watermarking benchmark dataset (Guillaume Lavoue)
  2. Active Appearance Models datasets (Mikkel B. Stegmann)
  3. Aircraft tracking (Ajmal Mian)
  4. California-ND - 701 photos from a personal photo collection, including many challenging real-life non-identical near-duplicates (Vassilios Vonikakis)
  5. Cambridge Motion-based Segmentation and Recognition Dataset (Brostow, Shotton, Fauqueur, Cipolla)
  6. Catadioptric camera calibration images (Yalin Bastanlar)
  7. Chars74K dataset - 74 English and Kannada characters (Teo de Campos -
  8. Columbia Camera Response Functions: Database (DoRF) and Model (EMOR) (M.D. Grossberg and S.K. Nayar)
  9. Columbia Database of Contaminants' Patterns and Scattering Parameters (Jinwei Gu, Ravi Ramamoorthi, Peter Belhumeur, Shree Nayar)
  10. Dense outdoor correspondence ground truth datasets, for optical flow and local keypoint evaluation (Christoph Strecha)
  11. DTU controlled motion and lighting image dataset (135K images) (Henrik Aanaes)
  12. EISATS: .enpeda.. Image Sequence Analysis Test Site (Auckland University Multimedia Imaging Group)
  13. FlickrLogos-32 - 8240 images of 32 product logos (Stefan Romberg)
  14. Flowchart images (Allan Hanbury)
  15. Image/video quality assessment database summary (Stefan Winkler)
  16. INRIA feature detector evaluation sequences (Krystian Mikolajczyk)
  17. INRIA's PERCEPTION's database of images and videos gathered with several synchronized and calibrated cameras (INRIA Rhone-Alpes)
  18. INRIA's Synchronized and calibrated binocular/binaural data sets with head movements (INRIA Rhone-Alpes)
  19. KITTI dataset for stereo, optical flow and visual odometry (Geiger, Lenz, Urtasun)
  20. Large scale 3D point cloud data from terrestrial LiDAR scanning (Andreas Nuechter)
  21. Linkoping Rolling Shutter Rectification Dataset (Per-Erik Forssen and Erik Ringaby)
  22. Middlebury College stereo vision research datasets (Daniel Scharstein and Richard Szeliski)
  23. MPI-Sintel optical flow evaluation dataset (Michael Black)
  24. MPI Sintel Flow Dataset A data set for the evaluation of optical flow derived from the open source 3D animated short film, Sintel. It has been extended for Stereo and disparity, Depth and camera motion, and Segmentation. (Max Planck Tubingen)
  25. MSR-VTT - video to text database of 200K+ video clip/sentence pairs
  26. Multi-FoV - photo-realistic video sequences that allow benchmarking of the impact of the Field-of-View (FoV) of the camera on various vision tasks. (Zhang, Rebecq, Forster, Scaramuzza)
  27. Multiview stereo images with laser based groundtruth (ESAT-PSI/VISICS,FGAN-FOM,EPFL/IC/ISIM/CVLab)
  28. NCI Cancer Image Archive - prostate images (National Cancer Institute)
  29. NIST 3D Interest Point Detection (Helin Dutagaci, Afzal Godil)
  30. NRCS natural resource/agricultural image database (USDA Natural Resources Conservation Service)
  31. Occlusion detection test data (Andrew Stein)
  32. The Open Video Project (Gary Marchionini, Barbara M. Wildemuth, Gary Geisler, Yaxiao Song)
  33. OSIE - Object and Semantic Images and Eye-tracking - 700 images, 5551 segmented objects, eye tracking data (Xu, Jiang, Wang, Kankanhalli, Zhao)
  34. Outdoor Ground Truth Evaluation Dataset for Sensor-Aided Visual Handheld Camera Localization (Daniel Kurz, metaio)
  35. PHOS (illumination invariance dataset) - 15 scenes captured under different illumination conditions * 15 images (Vassilios Vonikakis)
  36. Pics 'n' Trails - Dataset of Continuously archived GPS and digital photos (Gamhewage Chaminda de Silva)
  37. PRINTART: Artistic images of prints of well known paintings, including detail annotations. A benchmark for automatic annotation and retrieval tasks with this database was published at ECCV. (Nuno Miguel Pinho da Silva)
  38. RAWSEEDS SLAM benchmark datasets (Rawseeds Project)
  39. Robotic 3D Scan Repository - 3D point clouds from robotic experiments of scenes (Osnabruck and Jacobs Universities)
  40. ROMA (ROad MArkings) : Image database for the evaluation of road markings extraction algorithms (Jean-Philippe Tarel, et al)
  41. SALICON - Saliency in Context eye tracking dataset c. 1000 images with eye-tracking data in 80 image classes (Jiang, Huang, Duan, Zhao)
  42. Scripps Plankton Camera System - thousands of images of c. 50 classes of plankton and other small marine objects (Jaffe et al)
  43. Shadow Removal Dataset and Online Benchmark for Variable Scene Categories (Han Gong and Darren Cosker)
  44. Stuttgart Range Image Database - 66 views of 45 objects
  45. TGIF - 100K animated GIFs from Tumblr and 120K natural language descriptions (Li, Song, Cao, Tetreault, Goldberg, Jaimes, Luo)
  46. UCL Ground Truth Optical Flow Dataset (Oisin Mac Aodha)
  47. Univ of Genoa Datasets for disparity and optic flow evaluation (Manuela Chessa)
  48. Validation and Verification of Neural Network Systems (Francesco Vivarelli)
  49. Very Long Baseline Interferometry Image Reconstruction Dataset (MIT CSAIL)
  50. Virtual KITTI - 40 high-resolution videos (17,008 frames) generated from five different virtual worlds, for object detection and multi-object tracking, scene-level and instance-level semantic segmentation, optical flow, and depth estimation (Gaidon, Wang, Cabon, Vig)
  51. VSD: Technicolor Violent Scenes Dataset - a collection of ground-truth files based on the extraction of violent events in movies
  52. WHOI-Plankton - 3.5 million images of microscopic marine plankton in 103 categories (Olson, Sosik)
  53. WILD: Weather and Illumination Database (S. Narasimhan, C. Wang, S. Nayar, D. Stolyarov, K. Garg, Y. Schechner, H. Peri)

