CVonline: Image Databases


Index by Topic

  1. Action Databases
  2. Attribute recognition
  3. Autonomous Driving
  4. Biological/Medical
  5. Camera calibration
  6. Face and Eye/Iris Databases
  7. Fingerprints
  8. General Images
  9. General RGBD and depth datasets
  10. General Videos
  11. Hand, Hand Grasp, Hand Action and Gesture Databases
  12. Image, Video and Shape Database Retrieval
  13. Object Databases
  14. People (static), human body pose
  15. People Detection and Tracking Databases (See also Surveillance)
  16. Remote Sensing
  17. Scene or Place Segmentation or Classification
  18. Segmentation
  19. Simultaneous Localization and Mapping
  20. Surveillance (See also People)
  21. Textures
  22. Urban Datasets
  23. Other Collection Pages
  24. Miscellaneous Topics

Another helpful site is the YACVID page.


Action Databases

  1. 3D online action dataset - There are seven action categories (Microsoft and Nanyang Technological University)
  2. 50 Salads - fully annotated 4.5 hour dataset of RGB-D video + accelerometer data, capturing 25 people preparing two mixed salads each (Dundee University, Sebastian Stein)
  3. ASLAN Action similarity labeling challenge database (Orit Kliper-Gross)
  4. Action Detection in Videos - MERL Shopping Dataset consists of 106 videos, each of which is a sequence about 2 minutes long (Michael Jones, Tim Marks)
  5. Actor and Action Dataset - 3782 videos, seven classes of actors performing eight different actions (Xu, Hsieh, Xiong, Corso)
  6. An analyzed collation of various labeled video datasets for action recognition (Kevin Murphy)
  7. BEHAVE Interacting Person Video Data with markup (Scott Blunsden, Bob Fisher, Aroosha Laghaee)
  8. BU-action Datasets - Three image action datasets (BU101, BU101-unfiltered, BU203-unfiltered) that have 1:1 correspondence with classes of the video datasets UCF101 and ActivityNet. (S. Ma, S. A. Bargal, J. Zhang, L. Sigal, S. Sclaroff.)
  9. Berkeley MHAD: A Comprehensive Multimodal Human Action Database (Ferda Ofli)
  10. Berkeley Multimodal Human Action Database - five different modalities to expand the fields of application (University of California at Berkeley and Johns Hopkins University)
  11. Breakfast dataset - 1712 video clips showing 10 kitchen activities, hand segmented into 48 atomic action classes. (H. Kuehne, A. B. Arslan and T. Serre)
  12. Brown Breakfast Actions Dataset - 70 hours, 4 million frames of 10 different breakfast preparation activities (Kuehne, Arslan and Serre)
  13. CAD-120 dataset - focuses on high level activities and object interactions (Cornell University)
  14. CAD-60 dataset - The CAD-60 and CAD-120 data sets comprise RGB-D video sequences of humans performing activities (Cornell University)
  15. CVBASE06: annotated sports videos (Janez Pers)
  16. Charades Dataset - 10,000 videos from 267 volunteers, each annotated with multiple activities, captions, objects, and temporal localizations. (Sigurdsson, Varol, Wang, Laptev, Farhadi, Gupta)
  17. Composable activities dataset - Different combinations of 26 atomic actions formed 16 activity classes which were performed by 14 subjects and annotations were provided (Pontificia Universidad Catolica de Chile and Universidad del Norte)
  18. Cornell Activity Datasets CAD 60, CAD 120 (Cornell Robot Learning Lab)
  19. DMLSmartActions dataset - Sixteen subjects performed 12 different actions in a natural manner (University of British Columbia)
  20. Depth-included Human Action video dataset - It contains 23 different actions (CITI in Academia Sinica)
  21. DogCentric Activity Dataset - first-person videos taken from a camera mounted on top of a *dog* (Michael Ryoo)
  22. ETS Hockey Game Event Data Set - This data set contains footage of two hockey games captured using fixed cameras. (M.-A. Carbonneau, A. J. Raymond, E. Granger, and G. Gagnon)
  23. FCVID: Fudan-Columbia Video Dataset - 91,223 Web videos annotated manually according to 239 categories (Jiang, Wu, Wang, Xue, Chang)
  24. G3D - synchronised video, depth and skeleton data for 20 gaming actions captured with Microsoft Kinect (Victoria Bloom)
  25. G3Di - This dataset contains 12 subjects split into 6 pairs (Kingston University)
  26. Gaming 3D dataset - real-time action recognition in gaming scenario (Kingston University)
  27. Georgia Tech Egocentric Activities - Gaze(+) - videos showing what people look at, together with their gaze location (Fathi, Li, Rehg)
  28. HMDB: A Large Human Motion Database (Serre Lab)
  29. Hollywood 3D dataset - 650 3D video clips, across 14 action classes (Hadfield and Bowden)
  30. Human Actions and Scenes Dataset (Marcin Marszalek, Ivan Laptev, Cordelia Schmid)
  31. HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion (Brown University)
  32. I-LIDS video event image dataset (Imagery library for intelligent detection systems) (Paul Hosner)
  33. I3DPost Multi-View Human Action Datasets (Hansung Kim)
  34. IAS-lab Action dataset - contains a sufficient variety of actions and number of people performing the actions (IAS Lab at the University of Padua)
  35. INRIA Xmas Motion Acquisition Sequences (IXMAS) (INRIA)
  36. InfAR Dataset - Infrared Action Recognition at Different Times, Neurocomputing (Chenqiang Gao, Yinhe Du, Jiang Liu, Jing Lv, Luyu Yang, Deyu Meng, Alexander G. Hauptmann)
  37. JPL First-Person Interaction dataset - 7 types of human activity videos taken from a first-person viewpoint (Michael S. Ryoo, JPL)
  38. Jena Action Recognition Dataset - Aibo dog actions (Korner and Denzler)
  39. K3Da - Kinect 3D Active dataset - K3Da (Kinect 3D active) is a realistic clinically relevant human action dataset containing skeleton, depth data and associated participant information (D. Leightley, M. H. Yap, J. Coulson, Y. Barnouin and J. S. McPhee)
  40. KIT Robo-Kitchen Activity Data Set - 540 clips of 17 people performing 12 complex kitchen activities. (L. Rybok, S. Friedberger, U. D. Hanebeck, R. Stiefelhagen)
  41. KTH human action recognition database (KTH CVAP lab)
  42. Karlsruhe Motion, Intention, and Activity Data set (MINTA) - 7 types of activities of daily living including fully motion primitive segments. (D. Gehrig, P. Krauthausen, L. Rybok, H. Kuehne, U. D. Hanebeck, T. Schultz, R. Stiefelhagen)
  43. LIRIS Human Activities Dataset - contains (gray/rgb/depth) videos showing people performing various activities (French National Center for Scientific Research)
  44. LIRIS human activities dataset - 2 cameras, annotated, depth images (Christian Wolf, et al)
  45. MEXaction2 action detection and localization dataset - to support the development and evaluation of methods for 'spotting' instances of short actions in a relatively large video database: 77 hours, 117 videos (Michel Crucianu and Jenny Benois-Pineau)
  46. MPII Cooking Activities Dataset (M. Rohrbach)
  47. MSR-Action3D - benchmark RGB-D action dataset (Microsoft Research Redmond and University of Wollongong)
  48. MSRActionPair dataset - Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences (University of Central Florida and Microsoft)
  49. MSRC-12 Kinect gesture data set - 594 sequences and 719,359 frames from people performing 12 gestures (Microsoft Research Cambridge)
  50. MSRC-12 dataset - sequences of human movements, represented as body-part locations, and the associated gesture (Microsoft Research Cambridge and University of Cambridge)
  51. MSRDailyActivity3D Dataset - There are 16 activities (Microsoft and the Northwestern University)
  52. ManiAc RGB-D action dataset: different manipulation actions, 15 different versions, 30 different objects manipulated, 20 long and complex chained manipulation sequences (Eren Aksoy)
  53. Mivia dataset - It consists of 7 high-level actions performed by 14 subjects. (Mivia Lab at the University of Salerno)
  54. MuHAVi - Multicamera Human Action Video Data (Hossein Ragheb)
  55. Multi-modal action detection (MAD) Dataset - It contains 35 sequential actions performed by 20 subjects. (Carnegie Mellon University)
  56. Multiview 3D Event dataset - This dataset includes 8 categories of events performed by 8 subjects (University of California at Los Angeles)
  57. NTU RGB+D Action Recognition Dataset - NTU RGB+D is a large scale dataset for human action recognition (Amir Shahroudy)
  58. Northwestern-UCLA Multiview Action 3D - There are 10 action categories (Northwestern University and University of California at Los Angeles)
  59. Oxford TV based human interactions (Oxford Visual Geometry Group)
  60. Parliament - The Parliament dataset is a collection of 228 video sequences, depicting political speeches in the Greek parliament (Michalis Vrigkas, Christophoros Nikou, Ioannis A. Kakadiaris)
  61. RGB-D activity dataset - Each video in the dataset contains 2-7 actions involving interaction with different objects. (Cornell University and Stanford University)
  62. RGBD-SAR Dataset (University of Electronic Science and Technology of China and Microsoft)
  63. Rochester Activities of Daily Living Dataset (Ross Messing)
  64. SBU Kinect Interaction Dataset - It contains eight types of interactions (Stony Brook University)
  65. SBU-Kinect-Interaction dataset v2.0 - It comprises RGB-D video sequences of humans performing interaction activities (Kiwon Yun et al.)
  66. SDHA Semantic Description of Human Activities 2010 contest - Human Interactions (Michael S. Ryoo, J. K. Aggarwal, Amit K. Roy-Chowdhury)
  67. SDHA Semantic Description of Human Activities 2010 contest - aerial views (Michael S. Ryoo, J. K. Aggarwal, Amit K. Roy-Chowdhury)
  68. SFU Volleyball Group Activity Recognition - dataset with two levels of annotation (9 player actions and 8 scene activities) for volleyball videos. (M. Ibrahim, S. Muralidharan, Z. Deng, A. Vahdat, and G. Mori / Simon Fraser University)
  69. SYSU 3D Human-Object Interaction Dataset - Forty subjects perform 12 distinct activities (Sun Yat-sen University)
  70. ShakeFive Dataset - contains only two actions, namely hand shake and high five. (Universiteit Utrecht)
  71. ShakeFive2 - A dyadic human interaction dataset with limb level annotations on 8 classes in 153 HD videos (Coert van Gemeren, Ronald Poppe, Remco Veltkamp)
  72. Sports Videos in the Wild (SVW) - 4200 videos captured solely with smartphones by users of the Coach Eye smartphone app, a leading app for sports training developed by TechSmith corporation. (Seyed Morteza Safdarnejad, Xiaoming Liu)
  73. Stanford Sport Events dataset (Jia Li)
  74. THUMOS - Action Recognition in Temporally Untrimmed Videos! - 430 hours of video data and 45 million frames (Gorban, Idrees, Jiang, Zamir, Laptev, Shah, Sukthankar)
  75. TUM Kitchen Data Set of Everyday Manipulation Activities (Moritz Tenorth, Jan Bandouch)
  76. TV Human Interaction Dataset (Alonso Patron-Perez)
  77. The Falling Detection dataset - Six subjects in two sceneries performed a series of actions continuously (University of Texas)
  78. The TJU dataset - contains 22 actions performed by 20 subjects in two different environments; a total of 1760 sequences. (Tianjin University)
  79. The UPCV action dataset - The dataset consists of 10 actions performed by 20 subjects twice. (University of Patras)
  80. UC-3D Motion Database - Available data types encompass high resolution Motion Capture, acquired with MVN Suit from Xsens and Microsoft Kinect RGB and depth images. (Institute of Systems and Robotics, Coimbra, Portugal)
  81. UCF 101 action dataset - 101 action classes, over 13k clips and 27 hours of video data (Univ of Central Florida)
  82. UCFKinect - The dataset is composed of 16 actions (University of Central Florida Orlando)
  83. UCR Videoweb Multi-camera Wide-Area Activities Dataset (Amit K. Roy-Chowdhury)
  84. UTD-MHAD - Eight subjects performed 27 actions four times. (University of Texas at Dallas)
  85. UTKinect dataset - Ten types of human actions were performed twice by 10 subjects (University of Texas)
  86. UWA3D Multiview Activity Dataset - Thirty activities were performed by 10 individuals (University of Western Australia)
  87. Univ of Central Florida - 50 Action Category Recognition in Realistic Videos (3 GB) (Kishore Reddy)
  88. Univ of Central Florida - ARG: Aerial camera, Rooftop camera and Ground camera (UCF Computer Vision Lab)
  89. Univ of Central Florida - Feature Films Action Dataset (Univ of Central Florida)
  90. Univ of Central Florida - Sports Action Dataset (Univ of Central Florida)
  91. Univ of Central Florida - YouTube Action Dataset (sports) (Univ of Central Florida)
  92. Utrecht Multi-Person Motion Benchmark (UMPM) - a collection of video recordings of people together with a ground truth based on motion capture data. (N.P. van der Aa, X. Luo, G.J. Giezeman, R.T. Tan, R.C. Veltkamp)
  93. VIRAT Video Dataset - event recognition from two broad categories of activities (single-object and two-objects) which involve both humans and vehicles. (Sangmin Oh et al)
  94. Verona Social interaction dataset (Marco Cristani)
  95. ViHASi: Virtual Human Action Silhouette Data (userID: VIHASI password: virtual$virtual) (Hossein Ragheb, Kingston University)
  96. Videoweb (multicamera) Activities Dataset (B. Bhanu, G. Denina, C. Ding, A. Ivers, A. Kamal, C. Ravishankar, A. Roy-Chowdhury, B. Varda)
  97. WVU Multi-view action recognition dataset (Univ. of West Virginia)
  98. WorkoutSU-10 Kinect dataset for exercise actions (Ceyhun Akgul)
  99. WorkoutSU-10 dataset - contains exercise actions selected by professional trainers for therapeutic purposes. (Sabanci University)
  100. Wrist-mounted camera video dataset - object manipulation (Ohnishi, Kanehira, Kanezaki, Harada)
  101. YouCook - 88 open-source YouTube cooking videos with annotations (Jason Corso)
  102. YouTube-8M Dataset - A Large and Diverse Labeled Video Dataset for Video Understanding Research (Google Inc.)

Attribute recognition

  1. Birds - This database contains 600 images (100 samples each) of six different classes of birds. (Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce)
  2. Butterflies - This database contains 619 images of seven different classes of butterflies. (Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce)
  3. CALVIN research group datasets - object detection with eye tracking, imagenet bounding boxes, synchronised activities, stickman and body poses, youtube objects, faces, horses, toys, visual attributes, shape classes (CALVIN group)
  4. CelebA - Large-scale CelebFaces Attributes Dataset (Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang)
  5. HAT database of 27 human attributes (Gaurav Sharma, Frederic Jurie)
  6. LFW-10 dataset for learning relative attributes - A dataset of 10,000 pairs of face images with instance-level annotations for 10 attributes. (CVIT, IIIT Hyderabad)
  7. Person Recognition in Personal Photo Collections - introduces three harder splits for evaluation, long-term attribute annotations and per-photo timestamp metadata. (Oh, Seong Joon and Benenson, Rodrigo and Fritz, Mario and Schiele, Bernt)
  8. Visual Attributes Dataset - visual attribute annotations for over 500 object classes (animate and inanimate) which are all represented in ImageNet. Each object class is annotated with visual attributes based on a taxonomy of 636 attributes (e.g., has fur, made of metal, is round).
  9. WIDER Attribute Dataset - WIDER Attribute is a large-scale human attribute dataset, with 13789 images belonging to 30 scene categories, and 57524 human bounding boxes each annotated with 14 binary attributes. (Li, Yining and Huang, Chen and Loy, Chen Change and Tang, Xiaoou)

Autonomous Driving

  1. AMUSE - The automotive multi-sensor (AMUSE) dataset taken in real traffic scenes during multiple test drives. (Philipp Koschorrek etc.)
  2. Autonomous Driving - semantic segmentation, pedestrian detection, virtual-world data, far infrared, stereo, driver monitoring (CVC research center and the UAB and UPC universities)
  3. Joint Attention in Autonomous Driving (JAAD) - The dataset includes instances of pedestrians and cars intended primarily for the purpose of behavioural studies and detection in the context of autonomous driving. (Iuliia Kotseruba, Amir Rasouli and John K. Tsotsos)
  4. LISA Vehicle Detection Dataset - colour first person driving video under various lighting and traffic conditions (Sivaraman, Trivedi)
  5. Lost and Found Dataset - The Lost and Found Dataset addresses the problem of detecting unexpected small road hazards (often caused by lost cargo) for autonomous driving applications. (Sebastian Ramos, Peter Pinggera, Stefan Gehrig, Uwe Franke, Rudolf Mester, Carsten Rother)
  6. SYNTHIA - Large set (~half million) of virtual-world images for training autonomous cars to see. (ADAS Group at Computer Vision Center)
  7. The SYNTHetic collection of Imagery and Annotations - created for the purpose of aiding semantic segmentation and related scene understanding problems in the context of driving scenarios (Computer Vision Center, UAB)

Biological/Medical

  1. 2008 MICCAI MS Lesion Segmentation Challenge (National Institutes of Health Blueprint for Neuroscience Research)
  2. ASU DR-AutoCC Data - a Multiple-Instance Learning feature space for a diabetic retinopathy classification dataset (Ragav Venkatesan, Parag Chandakkar, Baoxin Li - Arizona State University)
  3. Annotated Spine CT Database for Benchmarking of Vertebrae Localization, 125 patients, 242 scans (Ben Glocker)
  4. BRATS - the identification and segmentation of tumor structures in multiparametric magnetic resonance images of the brain (TU Munchen etc.)
  5. CRCHistoPhenotypes - Labeled Cell Nuclei Data - colorectal cancer histology images consisting of nearly 30,000 dotted nuclei with over 22,000 labeled with the cell type (Rajpoot + Sirinukunwattana)
  6. CREMI: MICCAI 2016 Challenge - 6 volumes of electron microscopy of neural tissue, neuron and synapse segmentation, synaptic partner annotation. (Jan Funke, Stephan Saalfeld, Srini Turaga, Davi Bock, Eric Perlman)
  7. Cavy Action Dataset - 16 sequences at 640 x 480 resolution recorded at 7.5 frames per second (fps) with approximately 31621506 frames in total (272 GB) of interacting cavies (guinea pigs) (Al-Raziqi and Denzler)
  8. Cell Tracking Challenge Datasets - 2D/3D time-lapse video sequences with ground truth (Ma et al., Bioinformatics 30:1609-1617, 2014)
  9. Computed Tomography Emphysema Database (Lauge Sorensen)
  10. DIADEM: Digital Reconstruction of Axonal and Dendritic Morphology Competition (Allen Institute for Brain Science et al)
  11. DIARETDB1 - Standard Diabetic Retinopathy Database (Lappeenranta Univ of Technology)
  12. DRIVE: Digital Retinal Images for Vessel Extraction (Univ of Utrecht)
  13. DeformIt 2.0 - Image Data Augmentation Tool: Simulate novel images with ground truth segmentations from a single image-segmentation pair (Brian Booth and Ghassan Hamarneh)
  14. Dermoscopy images (Eric Ehrsam)
  15. EPT29 - This database contains 4842 images of 1613 specimens of 29 taxa of EPTs (Tom etc.)
  16. FIRE Fundus Image Registration Dataset - 134 retinal image pairs and ground truth for registration. (FORTH-ICS)
  17. KID - A capsule endoscopy database for medical decision support (Anastasios Koulaouzidis and Dimitris Iakovidis)
  18. Leaf Segmentation Challenge - Tobacco and arabidopsis plant images (Hanno Scharr, Massimo Minervini, Andreas Fischbach, Sotirios A. Tsaftaris)
  19. MIT CBCL Automated Mouse Behavior Recognition datasets (Nicholas Edelman)
  20. MUCIC: Masaryk University Cell Image Collection - 2D/3D synthetic images of cells/tissues for benchmarking (Masaryk University)
  21. MiniMammographic Database (Mammographic Image Analysis Society)
  22. Moth fine-grained recognition - 675 similar classes, 5344 images (Erik Rodner et al)
  23. Mouse Embryo Tracking Database - cell division event detection (Marcelo Cicconet, Kris Gunsalus)
  24. OASIS - Open Access Series of Imaging Studies - 500+ MRI data sets of the brain (Washington University, Harvard University, Biomedical Informatics Research Network)
  25. Plant Phenotyping Datasets - plant data suitable for plant and leaf detection, segmentation, tracking, and species recognition (M. Minervini, A. Fischbach, H. Scharr, S. A. Tsaftaris)
  26. Retinal fundus images - Ground truth of vascular bifurcations and crossovers (Univ of Groningen)
  27. Spine and Cardiac data (Digital Imaging Group of London Ontario, Shuo Li)
  28. Stonefly9 - This database contains 3826 images of 773 specimens of 9 taxa of Stoneflies (Tom etc.)
  29. Univ of Central Florida - DDSM: Digital Database for Screening Mammography (Univ of Central Florida)
  30. VascuSynth - 120 3D vascular tree like structures with ground truth (Mengliu Zhao, Ghassan Hamarneh)
  31. VascuSynth - Vascular Synthesizer generates vascular trees in 3D volumes. (Ghassan Hamarneh, Preet Jassi, Mengliu Zhao)
  32. York Cardiac MRI dataset (Alexander Andreopoulos)

Camera calibration

  1. Catadioptric camera calibration images (Yalin Bastanlar)
  2. GoPro-Gyro Dataset - This dataset consists of a number of wide-angle rolling shutter video sequences with corresponding gyroscope measurements (Hannes etc.)
  3. LO-RANSAC - LO-RANSAC library for estimation of homography and epipolar geometry (K. Lebeda, J. Matas and O. Chum)

Face and Eye/Iris Databases

  1. 300 Videos in the Wild (300-VW) - 68 Facial Landmark Tracking (Chrysos, Antonakos, Zafeiriou, Snape, Shen, Kossaifi, Tzimiropoulos, Pantic)
  2. 3D Mask Attack Database (3DMAD) - 76500 frames of 17 persons using Kinect RGBD with eye positions (Sebastien Marcel)
  3. 3D facial expression - Binghamton University 3D Static and Dynamic Facial Expression Databases (Lijun Yin, Jeff Cohn, and teammates)
  4. Audio-visual database for face and speaker recognition (Mobile Biometry MOBIO http://www.mobioproject.org/)
  5. BANCA face and voice database (Univ of Surrey)
  6. Binghamton Univ 3D static and dynamic facial expression database (Lijun Yin, Peter Gerhardstein and teammates)
  7. BioID face database (BioID group)
  8. BioVid Heat Pain Database - This video (and biomedical signal) dataset contains facial and physiopsychological reactions of 87 study participants who were subjected to experimentally induced heat pain. (University of Magdeburg (Neuro-Information Technology group) and University of Ulm (Emotion Lab))
  9. Biwi 3D Audiovisual Corpus of Affective Communication - 1000 high quality, dynamic 3D scans of faces, recorded while pronouncing a set of English sentences.
  10. Bosphorus 3D/2D Database of FACS annotated facial expressions, of head poses and of face occlusions (Bogazici University)
  11. CASIA-IrisV3 (Chinese Academy of Sciences, T. N. Tan, Z. Sun)
  12. CASIR Gaze Estimation Database - RGB and depth images (from Kinect V1.0) and ground truth values of facial features corresponding to experiments for gaze estimation benchmarking (Filipe Ferreira etc.)
  13. CMU Facial Expression Database (CMU/MIT)
  14. CMU Pose, Illumination, and Expression (PIE) Database (Simon Baker)
  15. CMU/MIT Frontal Faces (CMU/MIT)
  16. CMU/MIT Frontal Faces (CMU/MIT)
  17. CSSE Frontal intensity and range images of faces (Ajmal Mian)
  18. CelebA - Large-scale CelebFaces Attributes Dataset (Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang)
  19. Cohn-Kanade AU-Coded Expression Database - 500+ expression sequences of 100+ subjects, coded by activated Action Units (Affect Analysis Group, Univ. of Pittsburgh)
  20. Columbia Gaze Data Set - 5,880 images of 56 people over 5 head poses and 21 gaze directions (Brian A. Smith, Qi Yin, Steven K. Feiner, Shree K. Nayar)
  21. Computer Vision Laboratory Face Database (CVL Face Database) - Database contains 798 images of 114 persons, with 7 images per person and is freely available for research purposes. (Peter Peer etc.)
  22. EURECOM Facial Cosmetics Database - 389 images, 50 persons with/without make-up, annotations about the amount and location of applied makeup. (Jean-Luc DUGELAY et al)
  23. EURECOM Kinect Face Database - 52 people, 2 sessions, 9 variations, 6 facial landmarks. (Jean-Luc DUGELAY et al)
  24. EYEDIAP dataset - The EYEDIAP dataset was designed to train and evaluate gaze estimation algorithms from RGB and RGB-D data. It contains a diversity of participants, head poses, gaze targets and sensing conditions. (Kenneth Funes and Jean-Marc Odobez)
  25. FDDB: Face Detection Data set and Benchmark - studying unconstrained face detection (University of Massachusetts Computer Vision Laboratory)
  26. FG-Net Aging Database of faces at different ages (Face and Gesture Recognition Research Network)
  27. Face Recognition Grand Challenge datasets (FRVT - Face Recognition Vendor Test)
  28. Face Super-Resolution Dataset - Ground truth HR-LR face images captured with a dual-camera setup (Chengchao Qu etc.)
  29. FaceScrub - A Dataset With Over 100,000 Face Images of 530 People (50:50 male and female) (H.-W. Ng, S. Winkler)
  30. FaceTracer Database - 15,000 faces (Neeraj Kumar, P. N. Belhumeur, and S. K. Nayar)
  31. Facial Recognition Technology (FERET) Database (USA National Institute of Standards and Technology)
  32. Hannah and her sisters database - a dense audio-visual person-oriented ground-truth annotation of faces, speech segments, shot boundaries (Patrick Perez, Technicolor)
  33. Hong Kong Face Sketch Database
  34. IDIAP Head Pose Database (IHPD) - The dataset contains a set of meeting videos along with the head groundtruth of individual participants (around 128min) (Sileye Ba and Jean-Marc Odobez)
  35. IMDB-WIKI - 500k+ face images with age and gender labels (Rasmus Rothe, Radu Timofte, Luc Van Gool)
  36. Iranian Face Database - IFDB is the first image database in the Middle East; it contains color facial images varying in age, pose, and expression, with subjects aged 2-85. (Mohammad Mahdi Dehshibi)
  37. Japanese Female Facial Expression (JAFFE) Database (Michael J. Lyons)
  38. LFW: Labeled Faces in the Wild - unconstrained face recognition
  39. MIT CBCL Face Recognition Database (Center for Biological and Computational Learning)
  40. MIT Collation of Face Databases (Ethan Meyers)
  41. MIT eye tracking database (1003 images) (Judd et al)
  42. MMI Facial Expression Database - 2900 videos and high-resolution still images of 75 subjects, annotated for FACS AUs.
  43. MORPH (Craniofacial Longitudinal Morphological Face Database) (University of North Carolina Wilmington)
  44. Manchester Annotated Talking Face Video Dataset (Timothy Cootes)
  45. MegaFace - 1 million faces in bounding boxes (Kemelmacher-Shlizerman, Seitz, Nech, Miller, Brossard)
  46. NIST Face Recognition Grand Challenge (FRGC) (NIST)
  47. NIST mugshot identification database (USA National Institute of Standards and Technology)
  48. Notre Dame Iris Image Dataset (Patrick J. Flynn)
  49. Notre Dame face, IR face, 3D face, expression, crowd, and eye biometric datasets (Notre Dame)
  50. ORL face database: 40 people with 10 views (ATT Cambridge Labs)
  51. OUI-Adience Faces - unfiltered faces for gender and age classification plus 3D faces (OUI)
  52. Oxford: faces, flowers, multi-view, buildings, object categories, motion segmentation, affine covariant regions, misc (Oxford Visual Geometry Group)
  53. PubFig: Public Figures Face Database (Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar)
  54. Re-labeled Faces in the Wild - original images, but aligned using "deep funneling" method. (University of Massachusetts, Amherst)
  55. SCface - Surveillance Cameras Face Database (Mislav Grgic, Kresimir Delac, Sonja Grgic, Bozidar Klimpak)
  56. Salient features in gaze-aligned recordings of human visual input - TB of human gaze-contingent data "in the wild" (Frank Schumann etc.)
  57. SiblingsDB - The SiblingsDB contains two datasets depicting images of individuals related by sibling relationships. (Politecnico di Torino/Computer Graphics & Vision Group)
  58. Trondheim Kinect RGB-D Person Re-identification Dataset (Igor Barros Barbosa)
  59. UB KinFace Database - University of Buffalo kinship verification and recognition database
  60. UBIRIS: Noisy Visible Wavelength Iris Image Databases (University of Beira)
  61. UTIRIS cross-spectral iris image databank (Mahdi Hosseini)
  62. VIPSL Database - VIPSL Database is for research on face sketch-photo synthesis and recognition, including 200 subjects (1 photo and 5 sketches per subject). (Nannan Wang)
  63. WIDER FACE: A Face Detection Benchmark - 32,203 images with 393,703 labeled faces, 61 event classes (Shuo Yang, Ping Luo, Chen Change Loy, Xiaoou Tang)
  64. XM2VTS Face video sequences (295): The extended M2VTS Database (XM2VTS) - (Surrey University)
  65. Yale Face Database - 11 expressions of 10 people (A. Georghiades)
  66. Yale Face Database B - 576 viewing conditions of 10 people (A. Georghiades)
  67. York Univ Eye Tracking Dataset (120 images) (Neil Bruce)
  68. YouTube Faces DB - 3,425 videos of 1,595 different people. (Wolf, Hassner, Maoz)
  69. Zurich Natural Image - the image material used for creating natural stimuli in a series of eye-tracking studies (Frey et al.)

Fingerprints

  1. FVC fingerprint verification competition 2002 dataset (University of Bologna)
  2. FVC fingerprint verification competition 2004 dataset (University of Bologna)
  3. NIST fingerprint databases (USA National Institute of Standards and Technology)
  4. SPD2010 Fingerprint Singular Points Detection Competition (SPD 2010 committee)

General Images

  1. A Dataset for Real Low-Light Image Noise Reduction - It contains pixel and intensity aligned pairs of images corrupted by low-light camera noise and their low-noise counterparts. (J. Anaya, A. Barbu)
  2. AMOS: Archive of Many Outdoor Scenes (20+m) (Nathan Jacobs)
  3. Aerial images - Building detection from aerial images using invariant color features and shadow information (Beril Sirmacek)
  4. BGU Hyperspectral Image Database of Natural Scenes (Ohad Ben-Shahar and Boaz Arad)
  5. Brown Univ Large Binary Image Database (Ben Kimia)
  6. CMP Facade Database - Includes 606 rectified images of facades from various places with 12 architectural classes annotated. (Radim Tylecek)
  7. Caltech-UCSD Birds-200-2011 (Catherine Wah)
  8. Columbia Multispectral Image Database (F. Yasuma, T. Mitsunaga, D. Iso, and S.K. Nayar)
  9. DAQUAR (Visual Turing Challenge) - A dataset containing questions and answers about real-world indoor scenes. (Mateusz Malinowski, Mario Fritz)
  10. Dataset of American Movie Trailers 2010-2014 - Contains links to 474 Hollywood movie trailers along with associated metadata (genre, budget, runtime, release, MPAA rating, screens released, sequel indicator) (USC Signal Analysis and Interpretation Lab)
  11. General 100 Dataset - General-100 dataset contains 100 bmp-format images (with no compression), which are well-suited for super-resolution training (Dong, Chao and Loy, Chen Change and Tang, Xiaoou)
  12. HIPR2 Image Catalogue of different types of images (Bob Fisher et al)
  13. Hyperspectral images for spatial distributions of local illumination in natural scenes - Thirty calibrated hyperspectral radiance images of natural scenes with probe spheres embedded for local illumination estimation. (Nascimento, Amano & Foster)
  14. Hyperspectral images of natural scenes - 2002 (David H. Foster)
  15. Hyperspectral images of natural scenes - 2004 (David H. Foster)
  16. ISPRS multi-platform photogrammetry dataset - 1) Nadir and oblique aerial images plus 2) Combined UAV and terrestrial images (Francesco Nex and Markus Gerke)
  17. ImageNet Large Scale Visual Recognition Challenges - currently 200 object classes and 500+K images (Alex Berg, Jia Deng, Fei-Fei Li and others)
  18. ImageNet Linguistically organised (WordNet) Hierarchical Image Database - 10E7 images, 15K categories (Li Fei-Fei, Jia Deng, Hao Su, Kai Li)
  19. Improved 3D Sparse Maps for High-performance Structure from Motion with Low-cost Omnidirectional Robots - Evaluation Dataset - Data set used in research paper doi:10.1109/ICIP.2015.7351744 (Breckon, Toby P., Cavestany, Pedro)
  20. LabelMeFacade Database - 945 labeled building images (Erik Rodner et al)
  21. Local illumination hyperspectral radiance images - Thirty hyperspectral radiance images of natural scenes with embedded probe spheres for local illumination estimates (Sérgio M. C. Nascimento, Kinjiro Amano, David H. Foster)
  22. McGill Calibrated Colour Image Database (Adriana Olmos and Fred Kingdom)
  23. Multiply Distorted Image Database - A database for evaluating the results of image quality assessment metrics on multiply distorted images. (Fei Zhou)
  24. NPRgeneral - A standardized collection of images for evaluating image stylization algorithms. (David Mould, Paul Rosin)
  25. NYU Symmetry Database - 176 single-symmetry and 63 multiple-symmetry images (Marcelo Cicconet and Davi Geiger)
  26. OTCBVS Thermal Imagery Benchmark Dataset Collection (Ohio State Team)
  27. PAnorama Sparsely STructured Areas Datasets - the PASSTA datasets used for evaluation of the image alignment (Andreas Robinson)
  28. Time-Lapse Hyperspectral Radiance Images of Natural Scenes - Four time-lapse sequences of 7-9 calibrated hyperspectral radiance images of natural scenes taken over the day. (Foster, D.H., Amano, K., & Nascimento, S.M.C.)
  29. Time-lapse hyperspectral radiance images - Four time-lapse sequences of 7-9 calibrated hyperspectral images of natural scenes, spectra at 10-nm intervals (David H. Foster, Kinjiro Amano, Sérgio M. C. Nascimento)
  30. Tiny Images Dataset 79 million 32x32 color images (Fergus, Torralba, Freeman)
  31. Visual Question Answering - 254K images, 764K questions, ground truth (Agrawal, Lu, Antol, Mitchell, Zitnick, Batra, Parikh)
  32. Visual Question Generation - 15k images (including both object-centric and event-centric images), 75k natural questions asked about the images which can evoke further conversation (Nasrin Mostafazadeh, Ishan Misra, Jacob Devlin, Margaret Mitchell, Xiaodong He, Lucy Vanderwende)
  33. YFCC100M: The New Data in Multimedia Research - This publicly available curated dataset of 100 million photos and videos is free and legal for all. (Bart Thomee, Yahoo Labs and Flickr in San Francisco, etc.)

General RGBD and Depth Datasets

Note: there are 3D datasets elsewhere as well, e.g. in Objects, Scenes, and Actions.

  1. 3D-Printed RGB-D Object Dataset - 5 objects with groundtruth CAD models and camera trajectories, recorded with various quality RGB-D sensors (Siemens & TUM)
  2. 3DCOMET - 3DCOMET is a dataset for testing 3D data compression methods. (Miguel Cazorla, Javier Navarrete, Vicente Morell, Diego Viejo, Jose Garcia-Rodriguez, Sergio Orts)
  3. A Dataset for Non-Rigid Reconstruction from RGB-D Data - Eight scenes for reconstructing non-rigid geometry from RGB-D data, each containing several hundred frames along with our results. (Matthias Innmann, Michael Zollhoefer, Matthias Niessner, Christian Theobalt, Marc Stamminger)
  4. A Large Dataset of Object Scans - 392 objects in 9 classes, hundreds of frames each (Choi, Zhou, Miller, Koltun)
  5. Articulated Object Challenge - 4 articulated objects consisting of rigid parts connected by 1D revolute and prismatic joints, 7000+ RGBD images with annotations for 6D pose estimation (Frank Michel, Alexander Krull, Eric Brachmann, Michael Y. Yang, Stefan Gumhold, Carsten Rother)
  6. BigBIRD - 100 objects, each with 600 3D point clouds and 600 high-resolution color images spanning all views (Singh, Sha, Narayan, Achim, Abbeel)
  7. CAESAR Civilian American and European Surface Anthropometry Resource Project - 4000 3D human body scans (SAE International)
  8. CIN 2D+3D object classification dataset - segmented color and depth images of objects from 18 categories of common household and office objects (Björn Browatzki et al)
  9. CTU Garment Folding Photo Dataset - Color and depth images from various stages of garment folding. (Sushkov R., Melkumov I., Smutný V. (Czech Technical University in Prague))
  10. CTU Garment Sorting Dataset - Dataset of garment images, detailed stereo images, depth images and weights. (Petrik V., Wagner L. (Czech Technical University in Prague))
  11. Clothing part dataset - The clothing part dataset consists of image and depth scans, acquired with a Kinect, of garments lying on a table, with over a thousand part annotations (collar, cuffs, hood, etc) using polygonal masks. (Arnau Ramisa, Guillem Alenyà, Francesc Moreno-Noguer and Carme Torras)
  12. Cornell-RGBD-Dataset - Office Scenes (Hema Koppula)
  13. Delft Windmill Interior and Exterior Laser Scanning Point Clouds (Beril Sirmacek)
  14. EURECOM Kinect Face Database - 52 people, 2 sessions, 9 variations, 6 facial landmarks. (Jean-Luc DUGELAY et al)
  15. Goldfinch: GOogLe image-search Dataset for FINe grained CHallenges - a large-scale dataset for fine-grained bird (11K species), butterfly (14K species), aircraft (409 types), and dog (515 breeds) recognition. (Jonathan Krause, Benjamin Sapp, Andrew Howard, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, Li Fei-Fei)
  16. IMPART multi-view/multi-modal 2D+3D film production dataset - LIDAR, video, 3D models, spherical camera, RGBD, stereo, action, facial expressions, etc. (Univ. of Surrey)
  17. Kinect v2 Dataset - Efficient Multi-Frequency Phase Unwrapping using Kernel Density Estimation (Felix etc.)
  18. Multi-sensor 3D Object Dataset for Object Recognition with Full Pose Estimation - Multi-sensor 3D Object Dataset for Object Recognition and Pose Estimation (Alberto Garcia-Garcia, Sergio Orts-Escolano, Sergiu Oprea, etc.)
  19. NTU RGB+D Action Recognition Dataset - NTU RGB+D is a large scale dataset for human action recognition (Amir Shahroudy)
  20. NYU Depth Dataset V2 - Indoor Segmentation and Support Inference from RGBD Images
  21. Oakland 3-D Point Cloud Dataset (Nicolas Vandapel)
  22. SYNTHIA - Large set (~half million) of virtual-world images for training autonomous cars to see. (ADAS Group at Computer Vision Center)
  23. Semantic-8: 3D point cloud classification with 8 classes (ETH Zurich)
  24. Stereo and ToF dataset with ground truth - The dataset contains 5 different scenes acquired with a Time-of-flight sensor and a stereo setup. Ground truth information is also provided. (Carlo Dal Mutto, Pietro Zanuttigh, Guido M. Cortelazzo)
  25. TUM RGB-D Benchmark - Dataset and benchmark for the evaluation of RGB-D visual odometry and SLAM algorithms (Jürgen Sturm, Nikolas Engelhard, Felix Endres, Wolfram Burgard and Daniel Cremers)
  26. UC-3D Motion Database - Available data types encompass high resolution Motion Capture, acquired with MVN Suit from Xsens and Microsoft Kinect RGB and depth images. (Institute of Systems and Robotics, Coimbra, Portugal)
  27. Washington RGB-D Object Dataset - 300 common household objects and 14 scenes. (University of Washington and Intel Labs Seattle)

General Videos

  1. AlignMNIST - An artificially extended version of the MNIST handwritten dataset. (Søren Hauberg)
  2. GoPro-Gyro Dataset - ego centric videos (Linkoping Computer Vision Laboratory)
  3. Large scale YouTube video dataset - 156,823 videos (2,907,447 keyframes) crawled from YouTube videos (Yi Yang)
  4. MovieQA - teach machines to understand stories by answering questions about them. 15000 multiple choice QAs, 400+ movies. (M. Tapaswi, Y. Zhu, R. Stiefelhagen, A. Torralba, R. Urtasun, and S. Fidler)
  5. Multispectral visible-NIR video sequences - Annotated multispectral video, visible + NIR (LE2I, Université de Bourgogne)
  6. Sports-1M - Dataset for sports video classification containing 487 classes and 1.2M videos. (Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar and Li Fei-Fei)
  7. Video Object Segmentation dataset DAVIS - Densely Annotated VIdeo Segmentation (F. Perazzi, J. Pont-Tuset, B. McWilliams, L. Van Gool, M. Gross, and A. Sorkine-Hornung)
  8. Video Sequences used for research on Euclidean upgrades based on minimal assumptions about the camera (Kenton McHenry)
  9. Video Stacking Dataset - A Virtual Tripod for Hand-held Video Stacking on Smartphones (Erik Ringaby etc.)
  10. YFCC100M videos - A benchmark on the video subset of YFCC100M which includes the videos, the video content features and the API to a state-of-the-art video content engine. (Lu Jiang)
  11. YFCC100M: The New Data in Multimedia Research - This publicly available curated dataset of 100 million photos and videos is free and legal for all. (Bart Thomee, Yahoo Labs and Flickr in San Francisco, etc.)
  12. YouTube-8M - Dataset for video classification in the wild, containing pre-extracted frame level features from 8M videos, and 4800 classes. (Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul Natsev, George Toderici, Balakrishnan Varadarajan, Sudheendra Vijayanarasimhan)
  13. YouTube-8M Dataset - A Large and Diverse Labeled Video Dataset for Video Understanding Research (Google Inc.)

Hand, Hand Grasp, Hand Action and Gesture Databases

  1. 3D Articulated Hand Pose Estimation with Single Depth Images (Tang, Chang, Tejani, Kim, Yu)
  2. A Dataset of Human Manipulation Actions - RGB-D of 25 objects and 6 actions (Alessandro Pieropan)
  3. A Hand Gesture Detection Dataset (Javier Molina et al)
  4. A-STAR Annotated Hand-Depth Image Dataset and its Performance Evaluation - depth data and data glove data, 29 images of 30 volunteers, Chinese number counting and American Sign Language (Xu and Cheng)
  5. Bosphorus Hand Geometry Database and Hand-Vein Database (Bogazici University)
  6. EgoHands - A large dataset with over 15,000 pixel-level-segmented hands recorded from egocentric cameras of people interacting with each other. (Sven Bambach)
  7. FORTH Hand tracking library (FORTH)
  8. General HANDS: general hand detection and pose challenge - 22 sequences with different gestures, activities and viewpoints (UC Irvine)
  9. Grasp UNderstanding (GUN-71) dataset - 12,000 first-person RGB-D images of object manipulation scenes annotated using a taxonomy of 71 fine-grained grasps. (Rogez, Supancic and Ramanan)
  10. Hand gesture and marine silhouettes (Euripides G.M. Petrakis)
  11. HandNet: annotated depth images of articulated hands - 214971 annotated depth images of hand poses captured by a RealSense RGBD sensor. Annotations: per pixel classes, 6D fingertip pose, heatmap. Images -> Train: 202198, Test: 10000, Validation: 2773. (Recorded at GIP Lab, Technion)
  12. IDIAP Hand pose/gesture datasets (Sebastien Marcel)
  13. Kinect and Leap motion gesture recognition dataset - The dataset contains 1400 different gestures acquired with both the Leap Motion and the Kinect devices (Giulio Marin, Fabio Dominio, Pietro Zanuttigh)
  14. Kinect and Leap motion gesture recognition dataset - The dataset contains several different static gestures acquired with the Creative Senz3D camera. (A. Memo, L. Minto, P. Zanuttigh)
  15. LISA CVRR-HANDS 3D - 19 gestures performed by 8 subjects as car driver and passengers (Ohn-Bar and Trivedi)
  16. LISA Vehicle Detection Dataset - colour first person driving video under various lighting and traffic conditions (Sivaraman, Trivedi)
  17. MPI Dexter 1 Dataset for Evaluation of 3D Articulated Hand Motion Tracking - Dexter 1: 7 sequences of challenging, slow and fast hand motions, RGB + depth (Sridhar, Oulasvirta, Theobalt)
  18. MSR Realtime and Robust Hand Tracking from Depth - (Qian, Sun, Wei, Tang, Sun)
  19. Mobile and Webcam Hand images database - MOHI and WEHI - 200 people, 30 images each (Ahmad Hassanat)
  20. NYU Hand Pose Dataset - 8252 test-set and 72757 training-set frames of captured RGBD data with ground-truth hand-pose, 3 views (Tompson, Stein, Lecun, Perlin)
  21. Sahand Dynamic Hand Gesture Database - This database contains 11 dynamic gestures designed to convey the functions of mouse and touch screens to computers. (Behnam Maleki, Hossein Ebrahimnezhad)
  22. Sheffield gesture database - 2160 RGBD hand gesture sequences, 6 subjects, 10 gestures, 3 postures, 3 backgrounds, 2 illuminations (Ling Shao)
  23. UT Grasp Data Set - 4 subjects grasping a variety of objects with a variety of grasps (Cai, Kitani, Sato)
  24. Yale human grasping data set - 27 hours of video with tagged grasp, object, and task data from two housekeepers and two machinists (Bullock, Feix, Dollar)

Image, Video and Shape Database Retrieval

  1. ANN_SIFT1M - 1M Flickr images encoded by 128D SIFT descriptors (Jegou et al)
  2. Brown Univ 25/99/216 Shape Databases (Ben Kimia)
  3. CIFAR-10 - 60K 32x32 images from 10 classes, with a 512D GIST descriptor (Alex Krizhevsky)
  4. CLEF-IP 2011 evaluation on patent images
  5. Dataset of Structured Queries and Spatial Relations - Dataset of structured queries about images with an emphasis on spatial relations. (Mateusz Malinowski, Mario Fritz)
  6. DeepFashion - Large-scale Fashion Database (Ziwei Liu, Ping Luo, Shi Qiu, Xiaogang Wang, Xiaoou Tang)
  7. EMODB - Thumbnails of images in the picsearch image search engine together with the picsearch emotion keywords (Reiner Lenz etc.)
  8. ETU10 Silhouette Dataset - The dataset consists of 720 silhouettes of 10 objects, with 72 views per object. (M. Akimaliev and M.F. Demirci)
  9. Flickr 30K - images, actions and captions (Peter Young et al)
  10. Flickr15k - Sketch based Image Retrieval (SBIR) Benchmark - Dataset of 330 sketches and 15,024 photos comprising 33 object categories; a benchmark dataset commonly used to evaluate Sketch based Image Retrieval (SBIR) algorithms. (Hu and Collomosse, CVIU 2013)
  11. IAPR TC-12 Image Benchmark (Michael Grubinger)
  12. IAPR-TC12 Segmented and annotated image benchmark (SAIAPR TC-12): (Hugo Jair Escalante)
  13. ImageCLEF 2010 Concept Detection and Annotation Task (Stefanie Nowak)
  14. ImageCLEF 2011 Concept Detection and Annotation Task - multi-label classification challenge in Flickr photos
  15. METU Trademark dataset - The METU Dataset is composed of more than 900K real logos belonging to companies worldwide. (Usta Bilgi Sistemleri A.S. and Grup Ofis Marka Patent A.S)
  16. MPI Movie Description dataset - text and video (A. Rohrbach)
  17. McGill 3D Shape Benchmark (Siddiqi, Zhang, Macrini, Shokoufandeh, Bouix, Dickinson)
  18. Multiview Stereo Evaluation - Each dataset is registered with a "ground-truth" 3D model acquired via a laser scanning process (Steve Seitz et al)
  19. NIST SHREC - other NIST retrieval contest databases and links (USA National Institute of Standards and Technology)
  20. NIST SHREC 2010 - Shape Retrieval Contest of Non-rigid 3D Models (USA National Institute of Standards and Technology)
  21. NIST TREC Video Retrieval Evaluation Database (USA National Institute of Standards and Technology)
  22. NUS-WIDE - 269K Flickr images annotated with 81 concept tags, encoded as a 500D BoVW descriptor (Chua et al)
  23. Princeton Shape Benchmark (Princeton Shape Retrieval and Analysis Group)
  24. Queensland cross media dataset - millions of images and text documents for "cross-media" retrieval (Yi Yang)
  25. TOSCA 3D shape database (Bronstein, Bronstein, Kimmel)
  26. YouTube-8M Dataset - A Large and Diverse Labeled Video Dataset for Video Understanding Research (Google Inc.)

Object Databases

  1. 2.5D/3D Datasets of various objects and scenes (Ajmal Mian)
  2. 3D Object Recognition Stereo Dataset - This dataset consists of 9 objects and 80 test images. (Akash Kushal and Jean Ponce)
  3. 3D Photography Dataset - a collection of ten multiview data sets captured in our lab (Yasutaka Furukawa and Jean Ponce)
  4. 3D-Printed RGB-D Object Dataset - 5 objects with groundtruth CAD models and camera trajectories, recorded with various quality RGB-D sensors (Siemens & TUM)
  5. Amsterdam Library of Object Images (ALOI): 100K views of 1K objects (University of Amsterdam/Intelligent Sensory Information Systems)
  6. B3DO: Berkeley 3-D Object Dataset - household object detection (Janoch et al)
  7. Beyond PASCAL: A Benchmark for 3D Object Detection in the Wild - 12 classes, 3000+ images each with 3D annotations (Yu Xiang, Roozbeh Mottaghi, Silvio Savarese)
  8. Bristol Egocentric Object Interactions Dataset - egocentric object interactions with synchronised gaze (Dima Damen)
  9. CORE image dataset - to help learn more detailed models and for exploring cross-category generalization in object recognition. (Ali Farhadi, Ian Endres, Derek Hoiem, and David A. Forsyth)
  10. CTU Color and Depth Image Dataset of Spread Garments - Images of spread garments with annotated corners. (Wagner L., Krejčová D., and Smutný V. (Czech Technical University in Prague))
  11. Caltech 101 (now 256) category object recognition database (Li Fei-Fei, Marco Andreeto, Marc'Aurelio Ranzato)
  12. Catania Fish Species Recognition - 15 fish species, with about 20,000 sample training images and additional test images (Concetto Spampinato)
  13. Columbia COIL-100 3D object multiple views (Columbia University)
  14. Densely sampled object views: 2500 views of 2 objects, eg for view-based recognition and modeling (Gabriele Peters, Universiteit Dortmund)
  15. EDUB-Obj - Egocentric dataset for object localization and segmentation. (Marc Bolaños and Petia Radeva)
  16. Ellipse finding dataset (Dilip K. Prasad et al)
  17. GDXray: X-ray images for X-ray testing and Computer Vision - GDXray includes five groups of images: Castings, Welds, Baggage, Nature and Settings. (Domingo Mery, Catholic University of Chile)
  18. GRAZ-02 Database (Bikes, cars, people) (A. Pinz)
  19. GTSDB: German Traffic Sign Detection Benchmark (Ruhr-Universität Bochum)
  20. ICubWorld - iCubWorld datasets are collections of images acquired by recording from the cameras of the iCub humanoid robot while it observes daily objects. (Giulia Pasquale, Carlo Ciliberto, Giorgio Metta, Lorenzo Natale, Francesca Odone and Lorenzo Rosasco)
  21. LISA Traffic Light Dataset - 6 light classes in various lighting conditions (Jensen, Philipsen, Mogelmose, Moeslund, and Trivedi)
  22. LISA Traffic Sign Dataset - video of 47 US sign types with 7855 annotations on 6610 frames (Mogelmose, Trivedi, and Moeslund)
  23. Linkoping 3D Object Pose Estimation Database (Fredrik Viksten and Per-Erik Forssen)
  24. Linkoping Traffic Signs Dataset - 3488 traffic signs in 20K images (Larsson and Felsberg)
  25. MIT CBCL Car Data (Center for Biological and Computational Learning)
  26. MIT CBCL StreetScenes Challenge Framework: (Stan Bileschi)
  27. Microsoft COCO - Common Objects in Context (Tsung-Yi Lin et al)
  28. Microsoft Object Class Recognition image databases (Antonio Criminisi, Pushmeet Kohli, Tom Minka, Carsten Rother, Toby Sharp, Jamie Shotton, John Winn)
  29. Microsoft salient object databases (labeled by bounding boxes) (Liu, Sun Zheng, Tang, Shum)
  30. ModelNet - 127,915 CAD Models, 662 Object Categories, 10 Categories with Annotated Orientation (Wu, Song, Khosla, Yu, Zhang, Tang, Xiao)
  31. NABirds Dataset - 70,000 annotated photographs of the 400 species of birds commonly observed in North America (Grant Van Horn)
  32. NEC Toy animal object recognition or categorization database (Hossein Mobahi)
  33. NORB 50 toy image database (NYU)
  34. NTU-VOI: NTU Video Object Instance Dataset - video clips with frame-level bounding box annotations of object instances for evaluating object instance search and localization in large scale videos. (Jingjing Meng, et al.)
  35. Object Pose Estimation Database - This database contains 16 objects, each sampled at 5-degree angle increments along two rotational axes (F. Viksten etc.)
  36. Object Recognition Database - This database features modeling shots of eight objects and 51 cluttered test shots containing multiple objects. (Fred Rothganger, Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce)
  37. PASCAL 2007 Challenge Image Database (motorbikes, cars, cows) (PASCAL Consortium)
  38. PASCAL 2008 Challenge Image Database (PASCAL Consortium)
  39. PASCAL 2009 Challenge Image Database (PASCAL Consortium)
  40. PASCAL 2010 Challenge Image Database (PASCAL Consortium)
  41. PASCAL 2011 Challenge Image Database (PASCAL Consortium)
  42. PASCAL 2012 Challenge Image Database - Category classification, detection, and segmentation, and still-image action classification (PASCAL Consortium)
  43. PASCAL Image Database (motorbikes, cars, cows) (PASCAL Consortium)
  44. PASCAL Parts dataset - PASCAL VOC with segmentation annotation for semantic parts of objects (Alan Yuille)
  45. PASCAL-Context dataset - annotations for 400+ additional categories (Alan Yuille)
  46. Raindrop Detection - Improved Raindrop Detection using Combined Shape and Saliency Descriptors with Scene Context Isolation - Evaluation Dataset (Breckon, Toby P., Webster, Dereck D.)
  47. Swedish Leaf Dataset - These images contain leaves from 15 tree classes (Oskar J. O. Söderkvist)
  48. UAH Traffic Signs Dataset (Arroyo etc.)
  49. UIUC Car Image Database (UIUC)
  50. UIUC Dataset of 3D object categories (S. Savarese and L. Fei-Fei)
  51. Venezia 3D object-in-clutter recognition and segmentation (Emanuele Rodola)
  52. Visual Attributes Dataset - visual attribute annotations for over 500 object classes (animate and inanimate) which are all represented in ImageNet. Each object class is annotated with visual attributes based on a taxonomy of 636 attributes (e.g., has fur, made of metal, is round).
  53. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations - Visual Genome is a dataset, a knowledge base, and an ongoing effort to connect structured image concepts to language. (Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David Ayman Shamma, Michael Bernstein, Li Fei-Fei)
  54. Visual Hull Data Sets - a collection of visual hull datasets (Svetlana Lazebnik, Yasutaka Furukawa, and Jean Ponce)

People (static), human body pose

  1. Frames Labeled In Cinema (FLIC) - 20928 frames labeled with human pose (Sapp, Taskar)
  2. Leeds Sports Pose Dataset - 2000 pose annotated images of mostly sports people (Johnson, Everingham)
  3. MPII Human Pose Dataset - 25K images containing over 40K people with annotated body joints, 410 human activities (Andriluka, Pishchulin, Gehler, Schiele)
  4. MPII Human Pose Dataset - MPII Human Pose dataset is a de-facto standard benchmark for evaluation of articulated human pose estimation. (Mykhaylo Andriluka, Leonid Pishchulin, Peter Gehler, Bernt Schiele)
  5. People In Photo Albums - Social media photo dataset with images from Flickr, and manual annotations on person heads and their identities. (Ning Zhang, Manohar Paluri, Yaniv Taigman, Rob Fergus and Lubomir Bourdev)
  6. Person Recognition in Personal Photo Collections - introduces three harder splits for evaluation, long-term attribute annotations and per-photo timestamp metadata. (Oh, Seong Joon and Benenson, Rodrigo and Fritz, Mario and Schiele, Bernt)
  7. Pointing'04 ICPR Workshop Head Pose Image Database
  8. UC-3D Motion Database - Available data types encompass high resolution Motion Capture, acquired with MVN Suit from Xsens and Microsoft Kinect RGB and depth images. (Institute of Systems and Robotics, Coimbra, Portugal)
  9. VGG Human Pose Estimation datasets including the BBC Pose (20 videos with an overlaid sign language interpreter), Extended BBC Pose (72 additional training videos), Short BBC Pose (5 one hour videos with sign language signers), and ChaLearn Pose (23 hours of Kinect data of 27 persons performing 20 Italian gestures). (Charles, Everingham, Pfister, Magee, Hogg, Simonyan, Zisserman)

People Detection and Tracking Databases

  1. 3D KINECT Gender Walking data base (L. Igual, A. Lapedriza, R. Borràs from UB, CVC and UOC, Spain)
  2. AGORASET: a dataset for crowd video analysis (Nicolas Courty et al)
  3. CASIA gait database (Chinese Academy of Sciences)
  4. CAVIAR project video sequences with tracking and behavior ground truth (CAVIAR team/Edinburgh University - EC project IST-2001-37540)
  5. CMU Panoptic Studio Dataset - Multiple people social interaction dataset captured by 500+ synchronized video cameras, with 3D full body skeletons and calibration data. (H. Joo, T. Simon, Y. Sheikh)
  6. CUHK Crowd Dataset - 474 video clips from 215 crowded scenes (Shao, Loy, and Wang)
  7. CUHK01 Dataset : Person re-id dataset with 3,884 images of 972 pedestrians (Rui Zhao et al)
  8. CUHK02 Dataset : Person re-id dataset with five camera view settings. (Rui Zhao et al)
  9. CUHK03 Dataset : Person re-id dataset with 13,164 images of 1,360 pedestrians (Rui Zhao et al)
  10. Caltech Pedestrian Dataset (P. Dollar, C. Wojek, B. Schiele and P. Perona)
  11. Daimler Pedestrian Detection Benchmark 21790 images with 56492 pedestrians plus empty scenes (D. M. Gavrila et al)
  12. Driver Monitoring Video Dataset (RobeSafe + Jesus Nuevo-Chiquero)
  13. DukeMTMC: Duke Multi-Target Multi-Camera tracking dataset - 8 cameras, 85 min of video, 2M frames, 2000 people (Ergys Ristani, Francesco Solera, Roger S. Zou, Rita Cucchiara, Carlo Tomasi)
  14. Edinburgh overhead camera person tracking dataset (Bob Fisher, Bashia Majecka, Gurkirt Singh, Rowland Sillito)
  15. GVVPerfcapEva - repository of human shape and performance capture data, including full body skeletal, hand tracking, body shape, face performance, interactions (Christian Theobalt)
  16. HAT database of 27 human attributes (Gaurav Sharma, Frederic Jurie)
  17. INRIA Person Dataset (Navneet Dalal)
  18. Inria Dressed human bodies in motion benchmark - Benchmark containing 3D motion sequences of different subjects, motions, and clothing styles that allows quantitative measurement of the accuracy of body shape estimates. (Jinlong Yang, Jean-Sébastien Franco, Franck Hétroy-Wheeler, and Stefanie Wuhrer)
  19. Izmir - omnidirectional and panoramic image dataset (with annotations) to be used for human and car detection (Yalin Bastanlar)
  20. Joint Attention in Autonomous Driving (JAAD) - The dataset includes instances of pedestrians and cars intended primarily for the purpose of behavioural studies and detection in the context of autonomous driving. (Iuliia Kotseruba, Amir Rasouli and John K. Tsotsos)
  21. MAHNOB: MHI-Mimicry database - A 2 person, multiple camera and microphone database for studying mimicry in human-human interaction scenarios. (Sun, Lichtenauer, Valstar, Nijholt, and Pantic)
  22. MIT CBCL Pedestrian Data (Center for Biological and Computational Learning)
  23. MPI DYNA - A Model of Dynamic Human Shape in Motion (Max Planck Tubingen)
  24. MPI FAUST Dataset - A data set containing 300 real, high-resolution human scans, with automatically computed ground-truth correspondences (Max Planck Tubingen)
  25. MPI MOSH - Motion and Shape Capture from Markers. MOCAP data, 3D shape meshes, 3D high resolution scans. (Max Planck Tubingen)
  26. Market-1501 Dataset - 32,668 annotated bounding boxes of 1,501 identities from up to 6 cameras (Liang Zheng et al)
  27. Modena and Reggio Emilia first person head motion videos (Univ of Modena and Reggio Emilia)
  28. Multimodal Activities of Daily Living - including video, audio, physiological, sleep, motion and plug sensors. (Alexia Briasouli)
  29. Multiple Object Tracking Benchmark - A collection of datasets with ground truth, plus a performance league table (ETHZ, U. Adelaide, TU Darmstadt)
  30. Multispectral visible-NIR video sequences - Annotated multispectral video, visible + NIR (LE2I, Université de Bourgogne)
  31. NYU Multiple Object Tracking Benchmark (Konrad Schindler et al)
  32. Occluded Articulated Human Body Dataset - Body pose extraction and tracking under occlusions, 6 RGB-D sequences in total (3500 frames) with one, two and three users, marker-based ground truth data (Markos Sigalas, Maria Pateraki, Panos Trahanias)
  33. PARSE Dataset Additional Data - facial expression, gaze direction, and gender (Antol, Zitnick, Parikh)
  34. PARSE Dataset of Articulated Bodies - 300 images of humans and horses (Ramanan)
  35. PETS 2009 Crowd Challenge dataset (Reading University & James Ferryman)
  36. PETS Winter 2009 workshop data (Reading University & James Ferryman)
  37. PETS: Performance Evaluation of Tracking and Surveillance (Reading University & James Ferryman)
  38. PIROPO - People in Indoor ROoms with Perspective and Omnidirectional cameras, with more than 100,000 annotated frames (GTI-UPM, Spain)
  39. People-Art - a database containing people labelled in photos and artwork (Qi Wu and Hongping Cai)
  40. Photo-Art-50 - a database containing 50 object classes annotated in photos and artwork (Qi Wu and Hongping Cai)
  41. Pixel-based change detection benchmark dataset (Goyette et al)
  42. RAiD - Re-Identification Across Indoor-Outdoor Dataset: 43 people, 4 cameras, 6920 images (Abir Das et al)
  43. SYNTHIA - Large set (~half million) of virtual-world images for training autonomous cars to see. (ADAS Group at Computer Vision Center)
  44. Shinpuhkan 2014 - A Person Re-identification dataset containing 22,000 images of 24 people captured by 16 cameras. (Yasutomo Kawanishi et al.)
  45. Stanford Structured Group Discovery dataset - Discovering Groups of People in Images (W. Choi et al)
  46. Temple Color 128 - Color Tracking Benchmark - Encoding Color Information for Visual Tracking (P. Liang, E. Blasch, H. Ling)
  47. Transient Biometrics Nails Dataset V01 (Igor Barros Barbosa)
  48. Univ of Central Florida - Crowd Dataset (Saad Ali)
  49. Univ of Central Florida - Crowd Flow Segmentation datasets (Saad Ali)
  50. VIPeR: Viewpoint Invariant Pedestrian Recognition - 632 pedestrian image pairs taken from arbitrary viewpoints under varying illumination conditions. (Gray, Brennan, and Tao)
  51. Visual object tracking challenge datasets - The VOT datasets are a collection of fully annotated visual object tracking datasets used in the single-target short-term visual object tracking challenges. (The VOT committee)
  52. WIDER Attribute Dataset - WIDER Attribute is a large-scale human attribute dataset, with 13789 images belonging to 30 scene categories, and 57524 human bounding boxes each annotated with 14 binary attributes. (Li, Yining and Huang, Chen and Loy, Chen Change and Tang, Xiaoou)

Remote Sensing

  1. Brazilian Cerrado-Savanna Scenes Dataset - Composition of IR-R-G scenes taken by RapidEye sensor for vegetation classification in Brazilian Cerrado-Savanna. (K. Nogueira, J. A. dos Santos, T. Fornazari, T. S. Freire, L. P. Morellato, R. da S. Torres)
  2. Brazilian Coffee Scenes Dataset - Composition of IR-R-G scenes taken by SPOT sensor for identification of coffee crops in Brazilian mountains. (O. A. B. Penatti, K. Nogueira, J. A. dos Santos)
  3. Building Detection Benchmark - 14 images acquired from IKONOS (1 m) and QuickBird (60 cm) (Ali Ozgun Ok and Caglar Senaras)
  4. CBERS-2B, Landsat 5 TM, Geoeye, Ikonos-2 MS and ALOS-PALSAR - land-cover classification using optical images (D. Osaku et al.)
  5. Furnas and Tiete - sediment yield classification (Pisani et al.)
  6. ISPRS 2D semantic labeling - Height models and true ortho-images with a ground sampling distance of 5cm have been prepared over the city of Potsdam/Germany (Franz Rottensteiner, Gunho Sohn, Markus Gerke, Jan D. Wegner)
  7. ISPRS 3D semantic labeling - nine class airborne laser scanning data (Franz Rottensteiner, Gunho Sohn, Markus Gerke, Jan D. Wegner)
  8. The Linkoping Thermal InfraRed dataset - The LTIR dataset is a thermal infrared dataset for evaluation of Short-Term Single-Object (STSO) tracking (Linkoping University)

Scene Segmentation or Classification

  1. Barcelona - 15,150 images, urban views of Barcelona (Tighe and Lazebnik)
  2. CMU Visual Localization Data Set - Dataset collected over the period of a year using the Navlab 11 equipped with IMU, GPS, INS, Lidars and cameras.(Hernan Badino, Daniel Huber and Takeo Kanade)
  3. COLD (COsy Localization Database) - place localization (Ullah, Pronobis, Caputo, Luo, and Jensfelt)
  4. EDUB-Seg - Egocentric dataset for event segmentation. (Mariella Dimiccoli, Marc Bolaños, Estefania Talavera, Maedeh Aghaei, Stavri G. Nikolov, and Petia Radeva)
  5. Fifteen Scene Categories - a dataset of fifteen natural scene categories (Fei-Fei Li and Aude Oliva)
  6. Geometric Context - scene interpretation images (Derek Hoiem)
  7. Indoor Place Recognition Dataset for localization of Mobile Robots - The dataset contains 17 different places built from 2 different robots (virtualMe and pioneer) (Raghavender Sahdev, John K. Tsotsos.)
  8. Indoor Scene Recognition - 67 Indoor categories, 15620 images (Quattoni and Torralba)
  9. LM+SUN - 45,676 images, mainly urban or human related scenes (Tighe and Lazebnik)
  10. Maritime Imagery in the Visible and Infrared Spectrums - VAIS contains simultaneously acquired unregistered thermal and visible images of ships acquired from piers (Zhang, M.M, Choi, J., Daniilidis, K., Wolf, M.T. & Kanan)
  11. NYU V2 Mixture of Manhattan Frames Dataset - We provide the Mixture of Manhattan Frames (MMF) segmentation and MF rotations on the full NYU depth dataset V2 by Silberman et al. (Straub, Julian and Rosman, Guy and Freifeld, Oren and Leonard, John J. and Fisher III, John W.)
  12. Places Scene Recognition database - 205 scene categories and 2.5 million images (Zhou, Lapedriza, Xiao, Torralba, and Oliva)
  13. RGB-NIR Scene Dataset - 477 images in 9 categories captured in RGB and Near-infrared (NIR) (Brown and Susstrunk)
  14. SUN 2012 - 16,873 fully annotated scene images for scene categorization (Xiao et al)
  15. SUN 397 - 397 scene categories for scene classification (Xiao et al)
  16. SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite - 10,000 RGB-D images, 146,617 2D polygons and 58,657 3D bounding boxes (Song, Lichtenberg, and Xiao)
  17. SYNTHIA - Large set (~half million) of virtual-world images for training autonomous cars to see. (ADAS Group at Computer Vision Center)
  18. Sift Flow (also known as LabelMe Outdoor, LMO) - 2688 images, mainly outdoor natural and urban (Tighe and Lazebnik)
  19. Stanford Background Dataset - 715 images of outdoor scenes containing at least one foreground object (Gould et al)
  20. Surface detection - Real-time traversable surface detection by colour space fusion and temporal analysis - Evaluation Dataset (Breckon, Toby P., Katramados, Ioannis)
  21. ViDRILO - ViDRILO is a dataset containing 5 sequences of annotated RGB-D images acquired with a mobile robot in two office buildings under challenging lighting conditions. (Miguel Cazorla, J. Martinez-Gomez, M. Cazorla, I. Garcia-Varea and V. Morell)
  22. Video Object Segmentation dataset DAVIS - Densely Annotated VIdeo Segmentation (F. Perazzi, J. Pont-Tuset, B. McWilliams, L. Van Gool, M. Gross, and A. Sorkine-Hornung)

Segmentation (General)

  1. Alpert et al. Segmentation evaluation database (Sharon Alpert, Meirav Galun, Ronen Basri, Achi Brandt)
  2. BMC (Background Model Challenge) - A dataset for comparing background subtraction algorithms, composed of real and synthetic videos (Antoine)
  3. Berkeley Segmentation Dataset and Benchmark (David Martin and Charless Fowlkes)
  4. CTU Color and Depth Image Dataset of Spread Garments - Images of spread garments with annotated corners. (Wagner L., Krejčová D., and Smutný V., Czech Technical University in Prague)
  5. CTU Garment Folding Photo Dataset - Color and depth images from various stages of garment folding. (Sushkov R., Melkumov I., Smutný V., Czech Technical University in Prague)
  6. DeformIt 2.0 - Image Data Augmentation Tool: Simulate novel images with ground truth segmentations from a single image-segmentation pair (Brian Booth and Ghassan Hamarneh)
  7. GrabCut Image database (C. Rother, V. Kolmogorov, A. Blake, M. Brown)
  8. LabelMe images database and online annotation tool (Bryan Russell, Antonio Torralba, Kevin Murphy, William Freeman)
  9. PetroSurf3D - 26 high resolution (sub-millimeter accuracy) 3D scans of rock art with pixelwise labeling of petroglyphs for segmentation (Poier, Seidl, Zeppelzauer, Reinbacher, Schaich, Bellandi, Marretta, Bischof)
  10. SYNTHIA - Large set (~half million) of virtual-world images for training autonomous cars to see. (ADAS Group at Computer Vision Center)
  11. Stony Brook University Shadow Dataset (SBU-Shadow5k) - Large scale shadow detection dataset from a wide variety of scenes and photo types, with human annotations (Tomas F.Y. Vicente, Le Hou, Chen-Ping Yu, Minh Hoai, Dimitris Samaras)

Simultaneous Localization and Mapping

  1. Event-based Data for Pose Estimation, Visual Odometry, and SLAM - The data also include intensity images, inertial measurements, and ground truth from a motion-capture system. (ETH)
  2. RAWSEEDS SLAM benchmark datasets (Rawseeds Project)
  3. TUM RGB-D Benchmark - Dataset and benchmark for the evaluation of RGB-D visual odometry and SLAM algorithms (Jürgen Sturm, Nikolas Engelhard, Felix Endres, Wolfram Burgard and Daniel Cremers)
  4. Visual Odometry / SLAM Evaluation - The odometry benchmark consists of 22 stereo sequences (Andreas Geiger and Philip Lenz and Raquel Urtasun)

Surveillance

  1. AVSS07: Advanced Video and Signal based Surveillance 2007 datasets (Andrea Cavallaro)
  2. Activity modeling and abnormality detection dataset - The dataset contains a 45-minute video with annotated anomalies. (Jagan Varadarajan and Jean-Marc Odobez)
  3. Background subtraction - a list of datasets about background subtraction (Thierry Bouwmans)
  4. CMUSRD: Surveillance Research Dataset - multi-camera video for indoor surveillance scenario (K. Hattori, H. Hattori, et al)
  5. DukeMTMC: Duke Multi-Target Multi-Camera tracking dataset - 8 cameras, 85 min of video, 2M frames, 2000 people (Ergys Ristani, Francesco Solera, Roger S. Zou, Rita Cucchiara, Carlo Tomasi)
  6. ETISEO Video Surveillance Download Datasets (INRIA Orion Team and others)
  7. MAHNOB Databases - including Laughter Database, HCI-tagging Database, MHI-Mimicry Database (M. Pantic et al.)
  8. Multispectral visible-NIR video sequences - Annotated multispectral video, visible + NIR (LE2I, Université de Bourgogne)
  9. Openvisor - Video surveillance Online Repository (Univ of Modena and Reggio Emilia)
  10. Parking-Lot dataset - Parking-Lot dataset is a car dataset that focuses on moderate and heavy occlusions of cars in the parking lot scenario. (B. Li, T.F. Wu and S.C. Zhu)
  11. Pornography Database - The Pornography database is a pornography detection dataset containing nearly 80 hours of 400 pornographic and 400 non-pornographic videos extracted from pornography websites and Youtube. (Sandra Avila, Eduardo Valle, Arnaldo de A.)
  12. Queen Mary Multi-Camera Distributed Traffic Scenes Dataset (QMDTS) - The QMDTS was collected from urban surveillance environments for the study of surveillance behaviours in distributed scenes. (Dr. Xun Xu, Prof. Shaogang Gong and Dr. Timothy Hospedales)
  13. SALSA: Synergetic sociAL Scene Analysis - A Novel Dataset for Multimodal Group Behavior Analysis (Xavier Alameda-Pineda et al.)
  14. SBMnet (Scene Background Modeling.NET) - A dataset for testing background estimation algorithms (Pierre-Marc Jodoin, Lucia Maddalena, and Alfredo Petrosino)
  15. SCOUTER - video surveillance ground truthing (shifting perspectives, different setups/lighting conditions, large variations of subject). 30 videos and approximately 36,000 manually labeled frames. (Catalin Mitrea)
  16. SJTU-BEST - One surveillance-specified dataset platform with a realistic, diverse set of surveillance images and videos captured by in-use cameras (Shanghai Jiao Tong University)
  17. SPEVI: Surveillance Performance EValuation Initiative (Queen Mary University London)
  18. Shinpuhkan 2014 - A Person Re-identification dataset containing 22,000 images of 24 people captured by 16 cameras. (Yasutomo Kawanishi et al.)
  19. Tracking in extremely cluttered scenes - this single object tracking dataset has 28 highly cluttered sequences with per frame annotation (Jingjing Xiao, Linbo Qiao, Rustam Stolkin, Aleš Leonardis)
  20. UCSD Anomaly Detection Dataset - a stationary camera mounted at an elevation, overlooking pedestrian walkways, with unusual pedestrian or non-pedestrian motion.
  21. UCSD trajectory clustering and analysis datasets (Morris and Trivedi)
  22. Udine Trajectory-based anomalous event detection dataset - synthetic trajectory datasets with outliers (Univ of Udine Artificial Vision and Real Time Systems Laboratory)
  23. WIDER Attribute Dataset - WIDER Attribute is a large-scale human attribute dataset, with 13789 images belonging to 30 scene categories, and 57524 human bounding boxes each annotated with 14 binary attributes. (Li, Yining and Huang, Chen and Loy, Chen Change and Tang, Xiaoou)

Textures

  1. Brodatz Texture, Normalized Brodatz Texture, Colored Brodatz Texture, Multiband Brodatz Texture - 154 new images plus 112 original images with various transformations (A. Safia, D. He)
  2. Color texture images by category (textures.forrest.cz)
  3. Columbia-Utrecht Reflectance and Texture Database (Columbia & Utrecht Universities)
  4. DynTex: Dynamic texture database (Renaud Péteri, Mark Huiskes and Sandor Fazekas)
  5. KTH TIPS & TIPS2 textures - pose/lighting/scale variations (Eric Hayman)
  6. Oulu Texture Database (Oulu University)
  7. Oxford Describable Textures Dataset - 5640 images in 47 categories (M. Cimpoi, S. Maji, I. Kokkinos, S. Mohamed, A. Vedaldi)
  8. Prague Texture Segmentation Data Generator and Benchmark (Mikes, Haindl)
  9. Texture Database - The texture database features 25 texture classes, 40 samples each (Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce)
  10. Uppsala texture dataset of surfaces and materials - fabrics, grains, etc.
  11. Vision Texture (MIT Media Lab)

Urban Datasets

  1. Barcelona - 15,150 images, urban views of Barcelona (Tighe and Lazebnik)
  2. CMP Facade Database - Includes 606 rectified images of facades from various places with 12 architectural classes annotated. (Radim Tylecek)
  3. LM+SUN - 45,676 images, mainly urban or human related scenes (Tighe and Lazebnik)
  4. MIT CBCL StreetScenes Challenge Framework (Stan Bileschi)
  5. Queen Mary Multi-Camera Distributed Traffic Scenes Dataset (QMDTS) - The QMDTS was collected from urban surveillance environments for the study of surveillance behaviours in distributed scenes. (Dr. Xun Xu, Prof. Shaogang Gong and Dr. Timothy Hospedales)
  6. Robust Global Translations with 1DSfM - the numerical data describing global structure from motion problems for each dataset (Kyle Wilson and Noah Snavely)
  7. Sift Flow (also known as LabelMe Outdoor, LMO) - 2688 images, mainly outdoor natural and urban (Tighe and Lazebnik)
  8. Street-View Change Detection with Deconvolutional Networks - Database with aligned image pairs from street-view imagery with structural, lighting, weather and seasonal changes. (Pablo F. Alcantarilla, Simon Stent, German Ros, Roberto Arroyo and Riccardo Gherardi)
  9. SydneyHouse - Streetview house images with accurate 3D house shape, facade object label, dense point correspondence, and annotation toolbox. (Hang Chu, Shenlong Wang, Raquel Urtasun, Sanja Fidler)
  10. Traffic Signs Dataset - recording sequences from over 350 km of Swedish highways and city roads (Fredrik Larsson)

Other Collections

  1. CALVIN research group datasets - object detection with eye tracking, imagenet bounding boxes, synchronised activities, stickman and body poses, youtube objects, faces, horses, toys, visual attributes, shape classes (CALVIN group)
  2. CANTATA Video and Image Database Index site (Multitel)
  3. Chinese University of Hong Kong datasets - Face sketch, face alignment, image search, public square observation, occlusion, central station, MIT single and multiple camera trajectories, person re-identification (Multimedia lab)
  4. Computer Vision Homepage list of test image databases (Carnegie Mellon Univ)
  5. ETHZ various, including ETH 3D head pose, BIWI audiovisual data, ETHZ shape classes, BIWI walking pedestrians, pedestrians, buildings, 4D MRI, personal events, liver ultrasound, Food 101 (ETH Zurich, Computer Vision Lab)
  6. IDIAP dataset collection - 26 different datasets - multimodal, attack, biometric, cursive characters, discourse, eye gaze, posters, maya codex, MOBIO, face spoofing, game playing, finger vein, youtube-personality traits (IDIAP team)
  7. Leibe's Collection of people/vehicle/object databases (Bastian Leibe)
  8. Lotus Hill Image Database Collection with Ground Truth (Sealeen Ren, Benjamin Yao, Michael Yang)
  9. Michael Firman's List of RGBD datasets
  10. MIT Saliency Benchmark dataset - collection (pointers to 23 datasets) (Bylinskii, Judd, Borji, Itti, Durand, Oliva, Torralba)
  11. Oxford Misc, including Buffy, Flowers, TV characters, Buildings, etc (Oxford Visual Geometry Group)
  12. PEIPA Image Database Summary (Pilot European Image Processing Archive)
  13. Univ of Bern databases on handwriting, online documents, string edit and graph matching (Univ of Bern, Computer Vision and Artificial Intelligence)
  14. USC Annotated Computer Vision Bibliography database publication summary (Keith Price)
  15. USC-SIPI image databases: texture, aerial, favorites (eg. Lena) (USC Signal and Image Processing Institute)
  16. The world from a cat perspective - videos recorded from the head of a freely behaving cat (Belinda Y. Betsch, Wolfgang Einhäuser)
  17. RSBA dataset - Sequences for evaluating rolling shutter bundle adjustment (Per-Erik et al.)
  18. Video Stacking Dataset - Dataset for evaluating video stacking on cell-phones (Erik Ringaby et al.)
  19. Kinect v2 dataset - Dataset for evaluating unwrapping in kinect2 depth decoding (Felix et al.)
  20. Vehicle Detection in Aerial Imagery - VEDAI is a dataset for Vehicle Detection in Aerial Imagery, provided as a tool to benchmark automatic target recognition algorithms in unconstrained environments. (Sebastien Razakarivony and Frederic Jurie)
  21. Wrist-mounted camera video dataset - Activities of Daily Living videos captured from a wrist-mounted camera and a head-mounted camera (Katsunori Ohnishi, Atsushi Kanehira, Asako Kanezaki, Tatsuya Harada)
  22. high-res 3D-Models - includes high-res renderings of these datasets. (Hubert et al.)
  23. Computer Vision Lab OCR DataBase (CVL OCR DB) - CVL OCR DB is a public annotated image dataset of 120 binary annotated images of text in natural scenes. (Andrej Ikica and Peter Peer.)
  24. Replay Mobile: 2D face spoofing - Presentation attack (spoofing) dataset with samples from both real data subjects and spoofed data subjects performed with paper, photos and videos to/from a mobile device. (Idiap research institute)
  25. Replay Attack: 2D face spoofing - Presentation attack (spoofing) dataset with samples from both real data subjects and spoofed data subjects performed with paper, photos and videos from a mobile device to a laptop. (Idiap research institute)
  26. Msspoof: 2D multi-spectral face spoofing - Presentation attack (spoofing) dataset with samples from both real data subjects and spoofed data subjects performed with paper to a NIR and VIS camera (Idiap research institute)
  27. VERA Fingervein - Fingervein dataset with data subjects recorded with an open fingervein sensor (Idiap research institute)
  28. VERA Fingervein spoofing - Presentation attack (spoofing) dataset with samples from spoofed data subjects (corresponding to VERA Fingervein) performed with paper (Idiap research institute)
  29. VERA PalmVein: PalmVein - Palmvein dataset with data subjects recorded with an open palmvein sensor (Idiap research institute)
  30. PalmVein spoofing - Presentation attack (spoofing) dataset with samples from spoofed data subjects (corresponding to VERA Palmvein) performed with paper (Idiap research institute)
  31. Biometrics Evaluation and Testing - Evaluation of identification technologies, including Biometrics (European computing e-infrastructure)
  32. Yummly-10k dataset - The goal was to understand human perception, in this case of food taste similarity. (SE(3) Computer Vision Group at Cornell Tech)
  33. HKU-IS - 4447 images with pixel labeling groundtruth for salient object detection. (Guanbin Li, Yizhou Yu)
  34. The Event-Camera Dataset - This presents the world's first collection of datasets with an event-based camera for high-speed robotics (E. Mueggler, H. Rebecq, G. Gallego, T. Delbruck, D. Scaramuzza)
  35. Annotated Web Ears Dataset (AWE Dataset) - All images were acquired by cropping ears from images from the internet of known persons. (Žiga Emeršič, Vitomir Štruc and Peter Peer)
  36. General 100 Dataset - General-100 dataset contains 100 bmp-format images (with no compression), which are well-suited for super-resolution training (Dong, Chao and Loy, Chen Change and Tang, Xiaoou)
  37. Multiview Stereo Evaluation - Each dataset is registered with a "ground-truth" 3D model acquired via a laser scanning process (Steve Seitz et al)

Miscellaneous

  1. 3D mesh watermarking benchmark dataset (Guillaume Lavoue)
  2. A Dataset for Real Low-Light Image Noise Reduction - It contains pixel and intensity aligned pairs of images corrupted by low-light camera noise and their low-noise counterparts. (J. Anaya, A. Barbu)
  3. Active Appearance Models datasets (Mikkel B. Stegmann)
  4. Aircraft tracking (Ajmal Mian)
  5. Annotated Web Ears Dataset (AWE Dataset) - All images were acquired by cropping ears from images from the internet of known persons. (Žiga Emeršič, Vitomir Štruc and Peter Peer)
  6. CITIUS Video Database - A database of 72 videos with eye-tracking data for evaluating dynamic saliency visual models. (Xose)
  7. CVSSP 3D data repository - The datasets are designed to evaluate general multi-view reconstruction algorithms. (Armin Mustafa, Hansung Kim, Jean-Yves Guillemaut and Adrian Hilton)
  8. California-ND - 701 photos from a personal photo collection, including many challenging real-life non-identical near-duplicates (Vassilios Vonikakis)
  9. Cambridge Motion-based Segmentation and Recognition Dataset (Brostow, Shotton, Fauqueur, Cipolla)
  10. Catadioptric camera calibration images (Yalin Bastanlar)
  11. Chars74K dataset - 74 English and Kannada characters (Teo de Campos - t.decampos@surrey.ac.uk)
  12. Columbia Camera Response Functions: Database (DoRF) and Model (EMOR) (M.D. Grossberg and S.K. Nayar)
  13. Columbia Database of Contaminants' Patterns and Scattering Parameters (Jinwei Gu, Ravi Ramamoorthi, Peter Belhumeur, Shree Nayar)
  14. Crime Scene Footwear Impression Database - crime scene and reference footwear impression images (Adam Kortylewski)
  15. DR(eye)VE - A driver's attention dataset (University of Modena and Reggio Emilia)
  16. DTU controlled motion and lighting image dataset (135K images) (Henrik Aanaes)
  17. Database for Visual Eye Movements (DOVES) - A set of eye movements collected from 29 human observers as they viewed 101 natural calibrated images. (van der Linde, I., Rajashekar, U., Bovik, A. C. et al.)
  18. DeformIt 2.0 - Image Data Augmentation Tool: Simulate novel images with ground truth segmentations from a single image-segmentation pair (Brian Booth and Ghassan Hamarneh)
  19. Dense outdoor correspondence ground truth datasets, for optical flow and local keypoint evaluation (Christoph Strecha)
  20. EISATS: .enpeda.. Image Sequence Analysis Test Site (Auckland University Multimedia Imaging Group)
  21. Featureless object tracking - This dataset contains several video sequences with limited texture, intended for visual tracking, including manually annotated per-frame pose. (Lebeda, Hadfield, Matas, Bowden)
  22. FlickrLogos-32 - 8240 images of 32 product logos (Stefan Romberg)
  23. General 100 Dataset - General-100 dataset contains 100 bmp-format images (with no compression), which are well-suited for super-resolution training (Dong, Chao and Loy, Chen Change and Tang, Xiaoou)
  24. Geometry2view - This dataset contains image pairs for 2-view geometry computation, including manually annotated point coordinates. (Lebeda, Matas, Chum)
  25. Hannover Region Detector Evaluation Data Set - Feature detector evaluation sequences in multiple image resolutions from 1.5 up to 8 megapixels (Kai Cordes)
  26. Hillclimb and CubicGlobe datasets - a video of a rally car, separated into several independent shots (for visual tracking and modelling). (Lebeda, Hadfield, Bowden)
  27. IISc - Dissimilarity between Isolated Objects (IISc-DIO) - The dataset has a total of 26,675 perceived dissimilarity measurements made on 269 human subjects using a Visual Search task with a diverse set of objects. (RT Pramod & SP Arun, IISc)
  28. INRIA feature detector evaluation sequences (Krystian Mikolajczyk)
  29. INRIA's PERCEPTION's database of images and videos gathered with several synchronized and calibrated cameras (INRIA Rhone-Alpes)
  30. Image/video quality assessment database summary (Stefan Winkler)
  31. KITTI dataset for stereo, optical flow and visual odometry (Geiger, Lenz, Urtasun)
  32. LFW-10 dataset for learning relative attributes - A dataset of 10,000 pairs of face images with instance-level annotations for 10 attributes. (CVIT, IIIT Hyderabad)
  33. Large scale 3D point cloud data from terrestrial LiDAR scanning (Andreas Nuechter)
  34. Light-field Material Dataset - 1.2k annotated images of 12 material classes taken with the Lytro ILLUM camera (Ting-Chun Wang, Jun-Yan Zhu, Ebi Hiroaki, Manmohan Chandraker, Alexei Efros, Ravi Ramamoorthi)
  35. Linkoping Rolling Shutter Rectification Dataset (Per-Erik Forssen and Erik Ringaby)
  36. MARIS Portofino dataset - A dataset of underwater stereo images depicting cylindrical pipe objects and collected to test object detection and pose estimation algorithms. (RIMLab (Robotics and Intelligent Machines Laboratory), University of Parma.)
  37. MPI Sintel Flow Dataset - A data set for the evaluation of optical flow derived from the open source 3D animated short film, Sintel. It has been extended for Stereo and disparity, Depth and camera motion, and Segmentation. (Max Planck Tubingen)
  38. MPI-Sintel optical flow evaluation dataset (Michael Black)
  39. MSR-VTT - video to text database of 200K+ video clip/sentence pairs
  40. Middlebury College stereo vision research datasets (Daniel Scharstein and Richard Szeliski)
  41. Modelling of 2D Shapes with Ellipses - The dataset contains 4,526 2D shapes included in standard as well as in home-built datasets. (Costas Panagiotakis and Antonis Argyros)
  42. Multi-FoV - photo-realistic video sequences that allow benchmarking of the impact of the Field-of-View (FoV) of the camera on various vision tasks. (Zhang, Rebecq, Forster, Scaramuzza)
  43. Multiview Stereo Evaluation - Each dataset is registered with a "ground-truth" 3D model acquired via a laser scanning process (Steve Seitz et al)
  44. Multiview stereo images with laser based groundtruth (ESAT-PSI/VISICS,FGAN-FOM,EPFL/IC/ISIM/CVLab)
  45. NCI Cancer Image Archive - prostate images (National Cancer Institute)
  46. NIST 3D Interest Point Detection (Helin Dutagaci, Afzal Godil)
  47. NRCS natural resource/agricultural image database (USDA Natural Resources Conservation Service)
  48. OSIE - Object and Semantic Images and Eye-tracking - 700 images, 5551 segmented objects, eye tracking data (Xu, Jiang, Wang, Kankanhalli, Zhao)
  49. Object Removal - Generalized Dynamic Object Removal for Dense Stereo Vision Based Scene Mapping using Synthesised Optical Flow - Evaluation Dataset (Hamilton, O.K., Breckon, Toby P.)
  50. Occlusion detection test data (Andrew Stein)
  51. PHOS (illumination invariance dataset) - 15 scenes captured under different illumination conditions * 15 images (Vassilios Vonikakis)
  52. PRINTART: Artistic images of prints of well known paintings, including detail annotations. A benchmark for automatic annotation and retrieval tasks with this database was published at ECCV. (Nuno Miguel Pinho da Silva)
  53. Pics 'n' Trails - Dataset of Continuously archived GPS and digital photos (Gamhewage Chaminda de Silva)
  54. RAWSEEDS SLAM benchmark datasets (Rawseeds Project)
  55. ROMA (ROad MArkings) : Image database for the evaluation of road markings extraction algorithms (Jean-Philippe Tarel, et al)
  56. Robotic 3D Scan Repository - 3D point clouds from robotic experiments of scenes (Osnabruck and Jacobs Universities)
  57. Rolling Shutter Rectification Dataset - Rectifying rolling shutter video from hand-held devices (Per-Erik et al.)
  58. SALICON - Saliency in Context eye tracking dataset c. 1000 images with eye-tracking data in 80 image classes (Jiang, Huang, Duan, Zhao)
  59. Scripps Plankton Camera System - thousands of images of c. 50 classes of plankton and other small marine objects (Jaffe et al)
  60. Stony Brook University Real-World Clutter Dataset (SBU-RwC90) - Images with different levels of clutter, ranked by humans (Chen-Ping Yu, Dimitris Samaras, Gregory Zelinsky)
  61. Street-View Change Detection with Deconvolutional Networks - Database with aligned image pairs from street-view imagery with structural, lighting, weather and seasonal changes. (Pablo F. Alcantarilla, Simon Stent, German Ros, Roberto Arroyo and Riccardo Gherardi)
  62. SydneyHouse - Streetview house images with accurate 3D house shape, facade object label, dense point correspondence, and annotation toolbox. (Hang Chu, Shenlong Wang, Raquel Urtasun, Sanja Fidler)
  63. TGIF - 100K animated GIFs from Tumblr and 120K natural language descriptions (Li, Song, Cao, Tetreault, Goldberg, Jaimes, Luo)
  64. TMAGIC dataset - Several video sequences for visual tracking, containing strong out-of-plane rotation (Lebeda, Hadfield, Bowden)
  65. TUM RGB-D Benchmark - Dataset and benchmark for the evaluation of RGB-D visual odometry and SLAM algorithms (Jürgen Sturm, Nikolas Engelhard, Felix Endres, Wolfram Burgard and Daniel Cremers)
  66. The Conflict Escalation Resolution (CONFER) Database - 120 audio-visual episodes (~142 mins) of naturalistic interactions from televised political debates, annotated frame-by-frame in terms of real-valued conflict intensity. (Christos Georgakis, Yannis Panagakis, Stefanos Zafeiriou, Maja Pantic)
  67. The Open Video Project (Gary Marchionini, Barbara M. Wildemuth, Gary Geisler, Yaxiao Song)
  68. The Toulouse Vanishing Points Dataset - a dataset of Manhattan scenes for vanishing point estimation which also provides, for each image, the IMU data of the camera orientation. (Vincent Angladon and Simone Gasparini)
  69. UCL Ground Truth Optical Flow Dataset (Oisin Mac Aodha)
  70. Univ of Genoa Datasets for disparity and optic flow evaluation (Manuela Chessa)
  71. VSD: Technicolor Violent Scenes Dataset - a collection of ground-truth files based on the extraction of violent events in movies
  72. Validation and Verification of Neural Network Systems (Francesco Vivarelli)
  73. Very Long Baseline Interferometry Image Reconstruction Dataset (MIT CSAIL)
  74. Virtual KITTI - 40 high-resolution videos (17,008 frames) generated from five different virtual worlds, for: object detection and multi-object tracking, scene-level and instance-level semantic segmentation, optical flow, and depth estimation (Gaidon, Wang, Cabon, Vig)
  75. Visual Object Tracking challenge - This challenge is held annually as an ICCV/ECCV workshop, with a new dataset and an updated evaluation kit every year. (Kristan et al.)
  76. WHOI-Plankton - 3.5 million images of microscopic marine plankton in 103 categories (Olson, Sosik)
  77. WILD: Weather and Illumination Database (S. Narasimhan, C. Wang, S. Nayar, D. Stolyarov, K. Garg, Y. Schechner, H. Peri)
  78. YACCLAB dataset - YACCLAB dataset includes both synthetic and real binary images (Grana, Costantino; Bolelli, Federico; Baraldi, Lorenzo; Vezzani, Roberto)
  79. YtLongTrack - This dataset contains two video sequences with challenges such as low quality, extreme length and full occlusions, including manually annotated per-frame pose. (Lebeda, Hadfield, Matas, Bowden)

Acknowledgements: Many thanks to all of the contributors for their suggestions of databases. Can PU was very helpful with the updating of this web page.


© 2016 Robert Fisher