See (Calli et al., 2015) for a review of 33 datasets of 3D objects as of 2015. See (Downs et al., 2022) for a review of more datasets as of 2022.
Bottou, L.; Cortes, C.; Denker, J.S.; Drucker, H.; Guyon, I.; Jackel, L.D.; LeCun, Y.; Muller, U.A.; Sackinger, E.; Simard, P.; Vapnik, V. (1994). "Comparison of classifier methods: A case study in handwritten digit recognition". Proceedings of the 12th IAPR International Conference on Pattern Recognition (Cat. No.94CH3440-5). Vol. 2. IEEE Comput. Soc. Press. pp. 77–82. doi:10.1109/ICPR.1994.576879. ISBN 978-0-8186-6270-6.
"NIST Special Database 19". NIST. 2010-08-27. https://www.nist.gov/srd/nist-special-database-19
LeCun, Yann. "NORB: Generic Object Recognition in Images". cs.nyu.edu. Retrieved 2025-04-26. https://cs.nyu.edu/~yann/research/norb/
LeCun, Y.; Fu Jie Huang; Bottou, L. (2004). "Learning methods for generic object recognition with invariance to pose and lighting". Vol. 2. IEEE. pp. 97–104. doi:10.1109/CVPR.2004.1315150. ISBN 978-0-7695-2158-9.
Torralba, A.; Fergus, R.; Freeman, W.T. (November 2008). "80 Million Tiny Images: A Large Data Set for Nonparametric Object and Scene Recognition". IEEE Transactions on Pattern Analysis and Machine Intelligence. 30 (11): 1958–1970. doi:10.1109/TPAMI.2008.128. ISSN 0162-8828. PMID 18787244. https://ieeexplore.ieee.org/document/4531741
"The Street View House Numbers (SVHN) Dataset". ufldl.stanford.edu. Retrieved 2025-02-25. http://ufldl.stanford.edu/housenumbers/
Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, Andrew Y. Ng. "Reading Digits in Natural Images with Unsupervised Feature Learning" NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011 http://ufldl.stanford.edu/housenumbers/nips2011_housenumbers.pdf
Hinton, Geoffrey; Vinyals, Oriol; Dean, Jeff (2015-03-09). "Distilling the Knowledge in a Neural Network". arXiv:1503.02531 [stat.ML].
Sun, Chen; Shrivastava, Abhinav; Singh, Saurabh; Gupta, Abhinav (2017). "Revisiting Unreasonable Effectiveness of Data in Deep Learning Era". pp. 843–852. arXiv:1707.02968 [cs.CV].
Abnar, Samira; Dehghani, Mostafa; Neyshabur, Behnam; Sedghi, Hanie (2021-10-05). "Exploring the Limits of Large Scale Pre-training". arXiv:2110.02095 [cs.LG].
Zhai, Xiaohua; Kolesnikov, Alexander; Houlsby, Neil; Beyer, Lucas (2021-06-08). "Scaling Vision Transformers". arXiv:2106.04560 [cs.CV].
Zhou, Bolei; Lapedriza, Agata; Khosla, Aditya; Oliva, Aude; Torralba, Antonio (2018-06-01). "Places: A 10 Million Image Database for Scene Recognition". IEEE Transactions on Pattern Analysis and Machine Intelligence. 40 (6): 1452–1464. doi:10.1109/TPAMI.2017.2723009. ISSN 0162-8828. PMID 28692961. https://ieeexplore.ieee.org/document/7968387
Grauman, Kristen; Westbury, Andrew; Byrne, Eugene; Chavis, Zachary; Furnari, Antonino; Girdhar, Rohit; Hamburger, Jackson; Jiang, Hao; Liu, Miao; Liu, Xingyu; Martin, Miguel; Nagarajan, Tushar; Radosavovic, Ilija; Ramakrishnan, Santhosh Kumar; Ryan, Fiona; Sharma, Jayant; Wray, Michael; Xu, Mengmeng; Xu, Eric Zhongcong; Zhao, Chen; Bansal, Siddhant; Batra, Dhruv; Cartillier, Vincent; Crane, Sean; Do, Tien; Doulaty, Morrie; Erapalli, Akshay; Feichtenhofer, Christoph; Fragomeni, Adriano; Fu, Qichen; Gebreselasie, Abrham; Gonzalez, Cristina; Hillis, James; Huang, Xuhua; Huang, Yifei; Jia, Wenqi; Khoo, Weslie; Kolar, Jachym; Kottur, Satwik; Kumar, Anurag; Landini, Federico; Li, Chao; Li, Yanghao; Li, Zhenqiang; Mangalam, Karttikeya; Modhugu, Raghava; Munro, Jonathan; Murrell, Tullie; Nishiyasu, Takumi; Price, Will; Puentes, Paola Ruiz; Ramazanova, Merey; Sari, Leda; Somasundaram, Kiran; Southerland, Audrey; Sugano, Yusuke; Tao, Ruijie; Vo, Minh; Wang, Yuchen; Wu, Xindi; Yagi, Takuma; Zhao, Ziwei; Zhu, Yunyi; Arbelaez, Pablo; Crandall, David; Damen, Dima; Farinella, Giovanni Maria; Fuegen, Christian; Ghanem, Bernard; Ithapu, Vamsi Krishna; Jawahar, C. V.; Joo, Hanbyul; Kitani, Kris; Li, Haizhou; Newcombe, Richard; Oliva, Aude; Park, Hyun Soo; Rehg, James M.; Sato, Yoichi; Shi, Jianbo; Shou, Mike Zheng; Torralba, Antonio; Torresani, Lorenzo; Yan, Mingfei; Malik, Jitendra (2022). "Ego4D: Around the World in 3,000 Hours of Egocentric Video". arXiv:2110.07058 [cs.CV].
Srinivasan, Krishna; Raman, Karthik; Chen, Jiecao; Bendersky, Michael; Najork, Marc (2021-07-11). "WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning". Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM. pp. 2443–2449. arXiv:2103.01913. doi:10.1145/3404835.3463257. ISBN 978-1-4503-8037-9.
Krishna, Ranjay; Zhu, Yuke; Groth, Oliver; Johnson, Justin; Hata, Kenji; Kravitz, Joshua; Chen, Stephanie; Kalantidis, Yannis; Li, Li-Jia; Shamma, David A; Bernstein, Michael S; Fei-Fei, Li (2017). "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations". International Journal of Computer Vision. 123: 32–73. arXiv:1602.07332. doi:10.1007/s11263-016-0981-7. S2CID 4492210.
Karayev, S., et al. "A category-level 3-D object dataset: putting the Kinect to work." Proceedings of the IEEE International Conference on Computer Vision Workshops. 2011. http://alliejanoch.com/iccvw2011.pdf
Tighe, Joseph, and Svetlana Lazebnik. "Superparsing: scalable nonparametric image parsing with superpixels." Computer Vision–ECCV 2010. Springer Berlin Heidelberg, 2010. 352–365.
Arbelaez, P.; Maire, M.; Fowlkes, C.; Malik, J. (May 2011). "Contour Detection and Hierarchical Image Segmentation" (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. 33 (5): 898–916. doi:10.1109/tpami.2010.161. PMID 20733228. S2CID 206764694. Retrieved 27 February 2016. http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/papers/amfm_pami2010.pdf
Lin, Tsung-Yi; Maire, Michael; Belongie, Serge; Bourdev, Lubomir; Girshick, Ross; Hays, James; Perona, Pietro; Ramanan, Deva; Lawrence Zitnick, C.; Dollár, Piotr (2014). "Microsoft COCO: Common Objects in Context". arXiv:1405.0312 [cs.CV].
Russakovsky, Olga; et al. (2015). "Imagenet large scale visual recognition challenge". International Journal of Computer Vision. 115 (3): 211–252. arXiv:1409.0575. doi:10.1007/s11263-015-0816-y. hdl:1721.1/104944. S2CID 2930547.
"COCO – Common Objects in Context". cocodataset.org. https://cocodataset.org/
Deng, Jia, et al. "Imagenet: A large-scale hierarchical image database." Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009. https://www.researchgate.net/profile/Li_Jia_Li/publication/221361415_ImageNet_a_Large-Scale_Hierarchical_Image_Database/links/00b495388120dbc339000000/ImageNet-a-Large-Scale-Hierarchical-Image-Database.pdf
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Russakovsky, Olga; Deng, Jia; Su, Hao; Krause, Jonathan; Satheesh, Sanjeev; et al. (11 April 2015). "ImageNet Large Scale Visual Recognition Challenge". International Journal of Computer Vision. 115 (3): 211–252. arXiv:1409.0575. doi:10.1007/s11263-015-0816-y. hdl:1721.1/104944. S2CID 2930547.
Xiao, Jianxiong; Hays, James; Ehinger, Krista A.; Oliva, Aude; Torralba, Antonio (June 2010). "SUN database: Large-scale scene recognition from abbey to zoo". 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE. pp. 3485–3492. doi:10.1109/cvpr.2010.5539970. hdl:1721.1/60690. ISBN 978-1-4244-6984-0.
Donahue, Jeff; Jia, Yangqing; Vinyals, Oriol; Hoffman, Judy; Zhang, Ning; Tzeng, Eric; Darrell, Trevor (2013). "DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition". arXiv:1310.1531 [cs.CV].
Yu, Fisher; Seff, Ari; Zhang, Yinda; Song, Shuran; Funkhouser, Thomas; Xiao, Jianxiong (2016-06-04). "LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop". arXiv:1506.03365 [cs.CV].
"Index of /lsun/". dl.yf.io. Retrieved 2024-09-19. http://dl.yf.io/lsun/
"LSUN". Complex Adaptive Systems Laboratory. Retrieved 2024-09-19. https://complexity.cecs.ucf.edu/lsun/
Gupta, Agrim; Dollar, Piotr; Girshick, Ross (2019). "LVIS: A Dataset for Large Vocabulary Instance Segmentation". pp. 5356–5364. https://openaccess.thecvf.com/content_CVPR_2019/html/Gupta_LVIS_A_Dataset_for_Large_Vocabulary_Instance_Segmentation_CVPR_2019_paper.html
Ivan Krasin, Tom Duerig, Neil Alldrin, Andreas Veit, Sami Abu-El-Haija, Serge Belongie, David Cai, Zheyun Feng, Vittorio Ferrari, Victor Gomes, Abhinav Gupta, Dhyanesh Narayanan, Chen Sun, Gal Chechik, Kevin Murphy. "OpenImages: A public dataset for large-scale multi-label and multi-class image classification." 2017. https://github.com/openimages
Vyas, Apoorv, et al. "Commercial Block Detection in Broadcast News Videos." Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing. ACM, 2014. https://dl.acm.org/citation.cfm?id=2683546
Hauptmann, Alexander G., and Michael J. Witbrock. "Story segmentation and detection of commercials in broadcast news video." Research and Technology Advances in Digital Libraries, 1998. ADL 98. Proceedings. IEEE International Forum on. IEEE, 1998. https://pdfs.semanticscholar.org/5c21/6db7892fa3f515d816f84893bfab1137f0b2.pdf
Tung, Anthony KH, Xin Xu, and Beng Chin Ooi. "Curler: finding and visualizing nonlinear correlation clusters." Proceedings of the 2005 ACM SIGMOD international conference on Management of data. ACM, 2005. https://www.researchgate.net/profile/Anthony_Tung/publication/221214229_CURLER_Finding_and_Visualizing_Nonlinear_Correlated_Clusters/links/55b8691a08aed621de05cd92.pdf
Jarrett, Kevin, et al. "What is the best multi-stage architecture for object recognition?." Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009. https://ieeexplore.ieee.org/abstract/document/5459469/
Lazebnik, Svetlana, Cordelia Schmid, and Jean Ponce. "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories." Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on. Vol. 2. IEEE, 2006.
Griffin, G., A. Holub, and P. Perona. "Caltech-256 object category dataset." California Inst. Technol., Tech. Rep. 7694, 2007. http://authors.library.caltech.edu/7694
Baeza-Yates, Ricardo, and Berthier Ribeiro-Neto. Modern information retrieval. Vol. 463. New York: ACM press, 1999.
"COYO-700M: Image-Text Pair Dataset". Kakao Brain. 2022-11-03. Retrieved 2022-11-03. https://github.com/kakaobrain/coyo-dataset
Fu, Xiping, et al. "NOKMeans: Non-Orthogonal K-means Hashing." Computer Vision–ACCV 2014. Springer International Publishing, 2014. 162–177. https://pdfs.semanticscholar.org/9da2/abae3072fd9fcff0e13b8f00fc21f22d0085.pdf
Heitz, Geremy; et al. (2009). "Shape-based object localization for descriptive classification". International Journal of Computer Vision. 84 (1): 40–62. CiteSeerX 10.1.1.142.280. doi:10.1007/s11263-009-0228-y. S2CID 646320.
Everingham, Mark; et al. (2010). "The pascal visual object classes (voc) challenge". International Journal of Computer Vision. 88 (2): 303–338. doi:10.1007/s11263-009-0275-4. hdl:20.500.11820/88a29de3-6220-442b-ab2d-284210cf72d6. S2CID 4246903. https://www.research.ed.ac.uk/portal/en/publications/the-pascal-visual-object-classes-voc-challenge(88a29de3-6220-442b-ab2d-284210cf72d6).html
Felzenszwalb, Pedro F.; et al. (2010). "Object detection with discriminatively trained part-based models". IEEE Transactions on Pattern Analysis and Machine Intelligence. 32 (9): 1627–1645. CiteSeerX 10.1.1.153.2745. doi:10.1109/tpami.2009.167. PMID 20634557. S2CID 3198903.
Gong, Yunchao, and Svetlana Lazebnik. "Iterative quantization: A procrustean approach to learning binary codes." Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011.
"CINIC-10 dataset". Luke N. Darlow, Elliot J. Crowley, Antreas Antoniou, Amos J. Storkey (2018) CINIC-10 is not ImageNet or CIFAR-10. 2018-10-09. Retrieved 2018-11-13. http://www.bayeswatch.com/2018/10/09/CINIC/
"fashion-mnist: A MNIST-like fashion product database. Benchmark". Zalando Research. 2017-10-07. Retrieved 2017-10-07. https://github.com/zalandoresearch/fashion-mnist
"notMNIST dataset". Machine Learning, etc. 2011-09-08. Retrieved 2017-10-13. http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html
Chaladze, G., Kalatozishvili, L. (2017). Linnaeus 5 dataset. Chaladze.com. Retrieved 13 November 2017, from http://chaladze.com/l5/
Afifi, Mahmoud (2017-11-12). "Gender recognition and biometric identification using a large dataset of hand images". arXiv:1711.04322 [cs.CV].
Lomonaco, Vincenzo; Maltoni, Davide (2017-10-18). "CORe50: a New Dataset and Benchmark for Continuous Object Recognition". arXiv:1705.03550 [cs.CV].
She, Qi; Feng, Fan; Hao, Xinyue; Yang, Qihan; Lan, Chuanlin; Lomonaco, Vincenzo; Shi, Xuesong; Wang, Zhengwei; Guo, Yao; Zhang, Yimin; Qiao, Fei; Chan, Rosa H.M. (2019-11-15). "OpenLORIS-Object: A Robotic Vision Dataset and Benchmark for Lifelong Deep Learning". arXiv:1911.06487v2 [cs.CV].
Morozov, Alexei; Sushkova, Olga (2019-06-13). "THz and thermal video data set". Development of the multi-agent logic programming approach to a human behaviour analysis in a multi-channel video surveillance. Moscow: IRE RAS. Retrieved 2019-07-19. http://www.fullvision.ru/monitoring/description_eng.php
Morozov, Alexei; Sushkova, Olga; Kershner, Ivan; Polupanov, Alexander (2019-07-09). "Development of a method of terahertz intelligent video surveillance based on the semantic fusion of terahertz and 3D video images" (PDF). CEUR. 2391: paper19. Retrieved 2019-07-19. http://ceur-ws.org/Vol-2391/paper19.pdf
Calli, Berk; Walsman, Aaron; Singh, Arjun; Srinivasa, Siddhartha; Abbeel, Pieter; Dollar, Aaron M. (September 2015). "Benchmarking in Manipulation Research: Using the Yale-CMU-Berkeley Object and Model Set". IEEE Robotics & Automation Magazine. 22 (3): 36–52. arXiv:1502.03143. doi:10.1109/MRA.2015.2448951. ISSN 1070-9932. https://ieeexplore.ieee.org/document/7254318
Downs, Laura; Francis, Anthony; Koenig, Nate; Kinman, Brandon; Hickman, Ryan; Reymann, Krista; McHugh, Thomas B.; Vanhoucke, Vincent (2022-05-23). "Google Scanned Objects: A High-Quality Dataset of 3D Scanned Household Items". 2022 International Conference on Robotics and Automation (ICRA). IEEE. pp. 2553–2560. arXiv:2204.11918. doi:10.1109/ICRA46639.2022.9811809. ISBN 978-1-7281-9681-7.
"Princeton Shape Benchmark". shape.cs.princeton.edu. Retrieved 2025-03-07. https://shape.cs.princeton.edu/benchmark/main.html
Shilane, P.; Min, P.; Kazhdan, M.; Funkhouser, T. (2004). "The princeton shape benchmark". Proceedings Shape Modeling Applications, 2004. IEEE. pp. 167–388. doi:10.1109/SMI.2004.1314504. ISBN 978-0-7695-2075-9.
Janoch, Allison; Karayev, Sergey; Jia, Yangqing; Barron, Jonathan T.; Fritz, Mario; Saenko, Kate; Darrell, Trevor (2013), Fossati, Andrea; Gall, Juergen; Grabner, Helmut; Ren, Xiaofeng (eds.), "A Category-Level 3D Object Dataset: Putting the Kinect to Work", Consumer Depth Cameras for Computer Vision: Research Topics and Applications, London: Springer, pp. 141–165, doi:10.1007/978-1-4471-4640-7_8, ISBN 978-1-4471-4640-7, retrieved 2025-03-07
Chang, Angel X.; Funkhouser, Thomas; Guibas, Leonidas; Hanrahan, Pat; Huang, Qixing; Li, Zimo; Savarese, Silvio; Savva, Manolis; Song, Shuran (2015-12-09). "ShapeNet: An Information-Rich 3D Model Repository". arXiv:1512.03012 [cs.GR].
"Computational Vision and Geometry Lab". cvgl.stanford.edu. Retrieved 2025-03-07. https://cvgl.stanford.edu/projects/objectnet3d/
Xiang, Yu; Kim, Wonhui; Chen, Wei; Ji, Jingwei; Choy, Christopher; Su, Hao; Mottaghi, Roozbeh; Guibas, Leonidas; Savarese, Silvio (2016). "ObjectNet3D: A Large Scale Database for 3D Object Recognition". In Leibe, Bastian; Matas, Jiri; Sebe, Nicu; Welling, Max (eds.). Computer Vision – ECCV 2016. Lecture Notes in Computer Science. Vol. 9912. Cham: Springer International Publishing. pp. 160–176. doi:10.1007/978-3-319-46484-8_10. ISBN 978-3-319-46484-8.
Reizenstein, Jeremy; Shapovalov, Roman; Henzler, Philipp; Sbordone, Luca; Labatut, Patrick; Novotny, David (2021). "Common Objects in 3D: Large-Scale Learning and Evaluation of Real-Life 3D Category Reconstruction". pp. 10901–10911. https://openaccess.thecvf.com/content/ICCV2021/html/Reizenstein_Common_Objects_in_3D_Large-Scale_Learning_and_Evaluation_of_Real-Life_ICCV_2021_paper.html
Reizenstein, Jeremy; Shapovalov, Roman; Henzler, Philipp; Sbordone, Luca; Labatut, Patrick; Novotny, David (2021-09-01). "Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction". arXiv:2109.00512 [cs.CV].
Deitke, Matt; Liu, Ruoshi; Wallingford, Matthew; Ngo, Huong; Michel, Oscar; Kusupati, Aditya; Fan, Alan; Laforte, Christian; Voleti, Vikram; Gadre, Samir Yitzhak; VanderBilt, Eli; Kembhavi, Aniruddha; Vondrick, Carl; Gkioxari, Georgia; Ehsani, Kiana (2023-12-15). "Objaverse-XL: A Universe of 10M+ 3D Objects". Advances in Neural Information Processing Systems. 36: 35799–35813. https://proceedings.neurips.cc/paper_files/paper/2023/hash/70364304877b5e767de4e9a2a511be0c-Abstract-Datasets_and_Benchmarks.html
Wu, Tong; Zhang, Jiarui; Fu, Xiao; Wang, Yuxin; Ren, Jiawei; Pan, Liang; Wu, Wayne; Yang, Lei; Wang, Jiaqi; Qian, Chen; Lin, Dahua; Liu, Ziwei (2023). "OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation". pp. 803–814. https://openaccess.thecvf.com/content/CVPR2023/html/Wu_OmniObject3D_Large-Vocabulary_3D_Object_Dataset_for_Realistic_Perception_Reconstruction_and_CVPR_2023_paper.html
"OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation". omniobject3d.github.io. Retrieved 2025-03-07. https://omniobject3d.github.io/
"UnCommon Objects in 3D". uco3d.github.io. Retrieved 2025-03-07. https://uco3d.github.io/
Liu, Xingchen; Tayal, Piyush; Wang, Jianyuan; Zarzar, Jesus; Monnier, Tom; Tertikas, Konstantinos; Duan, Jiali; Toisoul, Antoine; Zhang, Jason Y. (2025-01-13). "UnCommon Objects in 3D". arXiv:2501.07574 [cs.CV].
M. Cordts, M. Omran, S. Ramos, T. Scharwächter, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, "The Cityscapes Dataset." In CVPR Workshop on The Future of Datasets in Vision, 2015. https://www.cityscapes-dataset.com/wordpress/wp-content/papercite-data/pdf/cordts2015cvprw.pdf
Houben, Sebastian, et al. "Detection of traffic signs in real-world images: The German Traffic Sign Detection Benchmark." Neural Networks (IJCNN), The 2013 International Joint Conference on. IEEE, 2013. https://www.researchgate.net/profile/Sebastian_Houben/publication/242346625_Detection_of_Traffic_Signs_in_Real-World_Images_The_German_Traffic_Sign_Detection_Benchmark/links/0046352a03ec384e97000000/Detection-of-Traffic-Signs-in-Real-World-Images-The-German-Traffic-Sign-Detection-Benchmark.pdf
Mathias, Mayeul, et al. "Traffic sign recognition – How far are we from the solution?." Neural Networks (IJCNN), The 2013 International Joint Conference on. IEEE, 2013. http://www.varcity.eu/paper/ijcnn2013_mathias_trafficsign.pdf
Geiger, Andreas, Philip Lenz, and Raquel Urtasun. "Are we ready for autonomous driving? the kitti vision benchmark suite." Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012. https://www.cvlibs.net/publications/Geiger2012CVPR.pdf
Sturm, Jürgen, et al. "A benchmark for the evaluation of RGB-D SLAM systems." Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on. IEEE, 2012. http://jsturm.de/publications/data/sturm12iros.pdf
The KITTI Vision Benchmark Suite on YouTube https://www.youtube.com/watch?v=KXpZ6B1YB_k
Kragh, Mikkel F.; et al. (2017). "FieldSAFE – Dataset for Obstacle Detection in Agriculture". Sensors. 17 (11): 2579. arXiv:1709.03526. Bibcode:2017Senso..17.2579K. doi:10.3390/s17112579. PMC 5713196. PMID 29120383. https://vision.eng.au.dk/fieldsafe
"Papers with Code - Daimler Monocular Pedestrian Detection Dataset". paperswithcode.com. Retrieved 5 May 2023. https://paperswithcode.com/dataset/daimler-monocular-pedestrian-detection
Enzweiler, Markus; Gavrila, Dariu M. (December 2009). "Monocular Pedestrian Detection: Survey and Experiments". IEEE Transactions on Pattern Analysis and Machine Intelligence. 31 (12): 2179–2195. doi:10.1109/TPAMI.2008.260. ISSN 1939-3539. PMID 19834140. S2CID 1192198. https://ieeexplore.ieee.org/document/4657363
Yin, Guojun; Liu, Bin; Zhu, Huihui; Gong, Tao; Yu, Nenghai (28 July 2020). "A Large Scale Urban Surveillance Video Dataset for Multiple-Object Tracking and Behavior Analysis". arXiv:1904.11784 [cs.CV].
"Object Recognition in Video Dataset". mi.eng.cam.ac.uk. Retrieved 5 May 2023. https://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/
Brostow, Gabriel J.; Shotton, Jamie; Fauqueur, Julien; Cipolla, Roberto (2008). "Segmentation and Recognition Using Structure from Motion Point Clouds". Computer Vision – ECCV 2008. Lecture Notes in Computer Science. Vol. 5302. Springer. pp. 44–57. doi:10.1007/978-3-540-88682-2_5. ISBN 978-3-540-88681-5.
Brostow, Gabriel J.; Fauqueur, Julien; Cipolla, Roberto (15 January 2009). "Semantic object classes in video: A high-definition ground truth database". Pattern Recognition Letters. 30 (2): 88–97. Bibcode:2009PaReL..30...88B. doi:10.1016/j.patrec.2008.04.005. ISSN 0167-8655. https://www.sciencedirect.com/science/article/abs/pii/S0167865508001220
"WildDash 2 Benchmark". wilddash.cc. Retrieved 5 May 2023. https://wilddash.cc/railsem19
Zendel, Oliver; Murschitz, Markus; Zeilinger, Marcel; Steininger, Daniel; Abbasi, Sara; Beleznai, Csaba (June 2019). "RailSem19: A Dataset for Semantic Rail Scene Understanding". 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). pp. 1221–1229. doi:10.1109/CVPRW.2019.00161. ISBN 978-1-7281-2506-0. S2CID 198166233.
"The Boreas Dataset". www.boreas.utias.utoronto.ca. Retrieved 5 May 2023. https://www.boreas.utias.utoronto.ca/#/
Burnett, Keenan; Yoon, David J.; Wu, Yuchen; Li, Andrew Zou; Zhang, Haowei; Lu, Shichen; Qian, Jingxing; Tseng, Wei-Kang; Lambert, Andrew; Leung, Keith Y. K.; Schoellig, Angela P.; Barfoot, Timothy D. (26 January 2023). "Boreas: A Multi-Season Autonomous Driving Dataset". arXiv:2203.10168 [cs.RO].
"Bosch Small Traffic Lights Dataset". hci.iwr.uni-heidelberg.de. 1 March 2017. Retrieved 5 May 2023. https://hci.iwr.uni-heidelberg.de/content/bosch-small-traffic-lights-dataset
Behrendt, Karsten; Novak, Libor; Botros, Rami (May 2017). "A deep learning approach to traffic lights: Detection, tracking, and classification". 2017 IEEE International Conference on Robotics and Automation (ICRA). pp. 1370–1377. doi:10.1109/ICRA.2017.7989163. ISBN 978-1-5090-4633-1. S2CID 6257133.
"FRSign Dataset". frsign.irt-systemx.fr. Retrieved 5 May 2023. https://frsign.irt-systemx.fr/
Harb, Jeanine; Rébéna, Nicolas; Chosidow, Raphaël; Roblin, Grégoire; Potarusov, Roman; Hajri, Hatem (5 February 2020). "FRSign: A Large-Scale Traffic Light Dataset for Autonomous Trains". arXiv:2002.05665 [cs.CY].
"ifs-rwth-aachen/GERALD". Chair and Institute for Rail Vehicles and Transport Systems. 30 April 2023. Retrieved 5 May 2023. https://github.com/ifs-rwth-aachen/GERALD
Leibner, Philipp; Hampel, Fabian; Schindler, Christian (3 April 2023). "GERALD: A novel dataset for the detection of German mainline railway signals". Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit. 237 (10): 1332–1342. doi:10.1177/09544097231166472. ISSN 0954-4097. S2CID 257939937. https://journals.sagepub.com/doi/abs/10.1177/09544097231166472
Wojek, Christian; Walk, Stefan; Schiele, Bernt (June 2009). "Multi-cue onboard pedestrian detection". 2009 IEEE Conference on Computer Vision and Pattern Recognition. pp. 794–801. doi:10.1109/CVPR.2009.5206638. ISBN 978-1-4244-3992-8. S2CID 18000078.
Toprak, Tuğçe; Aydın, Burak; Belenlioğlu, Burak; Güzeliş, Cüneyt; Selver, M. Alper (5 April 2020). "Conditional Weighted Ensemble of Transferred Models for Camera Based Onboard Pedestrian Detection in Railway Driver Support Systems". IEEE Transactions on Vehicular Technology: 1. doi:10.1109/TVT.2020.2983825. S2CID 216510283. Retrieved 5 May 2023. https://zenodo.org/record/3741742
Toprak, Tugce; Belenlioglu, Burak; Aydın, Burak; Guzelis, Cuneyt; Selver, M. Alper (May 2020). "Conditional Weighted Ensemble of Transferred Models for Camera Based Onboard Pedestrian Detection in Railway Driver Support Systems". IEEE Transactions on Vehicular Technology. 69 (5): 5041–5054. doi:10.1109/TVT.2020.2983825. ISSN 1939-9359. S2CID 216510283. https://ieeexplore.ieee.org/document/9050835
Tilly, Roman; Neumaier, Philipp; Schwalbe, Karsten; Klasek, Pavel; Tagiew, Rustam; Denzler, Patrick; Klockau, Tobias; Boekhoff, Martin; Köppel, Martin (2023). "Open Sensor Data for Rail 2023". FID Move (in German). doi:10.57806/9mv146r0.
Tagiew, Rustam; Köppel, Martin; Schwalbe, Karsten; Denzler, Patrick; Neumaier, Philipp; Klockau, Tobias; Boekhoff, Martin; Klasek, Pavel; Tilly, Roman (4 May 2023). "OSDaR23: Open Sensor Data for Rail 2023". 2023 8th International Conference on Robotics and Automation Engineering (ICRAE). pp. 270–276. arXiv:2305.03001. doi:10.1109/ICRAE59816.2023.10458449. ISBN 979-8-3503-2765-6.
"Home". Argoverse. Retrieved 5 May 2023. https://www.argoverse.org/
Chang, Ming-Fang; Lambert, John; Sangkloy, Patsorn; Singh, Jagjeet; Bak, Slawomir; Hartnett, Andrew; Wang, De; Carr, Peter; Lucey, Simon; Ramanan, Deva; Hays, James (6 November 2019). "Argoverse: 3D Tracking and Forecasting with Rich Maps". arXiv:1911.02620 [cs.CV].
Kharroubi, Abderrazzaq; Ballouch, Zouhair; Hajji, Rafika; Yarroudh, Anass; Billen, Roland (9 April 2024). "Multi-Context Point Cloud Dataset and Machine Learning for Railway Semantic Segmentation". Infrastructures. 9 (4): 71. doi:10.3390/infrastructures9040071. https://doi.org/10.3390%2Finfrastructures9040071
Qiu, Bo; Zhou, Yuzhou; Dai, Lei; Wang, Bing; Li, Jianping; Dong, Zhen; Wen, Chenglu; Ma, Zhiliang; Yang, Bisheng (December 2024). "WHU-Railway3D: A Diverse Dataset and Benchmark for Railway Point Cloud Semantic Segmentation". IEEE Transactions on Intelligent Transportation Systems. 25 (12): 20900–20916. doi:10.1109/TITS.2024.3469546. ISSN 1558-0016. https://ieeexplore.ieee.org/document/10716569
Chen, Zhichao; Yang, Jie; Feng, Zhicheng; Zhu, Hao (16 January 2024). "RailFOD23: A dataset for foreign object detection on railroad transmission lines". Scientific Data. 11 (1): 72. Bibcode:2024NatSD..11...72C. doi:10.1038/s41597-024-02918-9. ISSN 2052-4463. PMC 10791632. PMID 38228610. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10791632
Khemmar, Redouane; Mauri, Antoine; Dulompont, Camille; Gajula, Jayadeep; Vauchey, Vincent; Haddad, Madjid; Boutteau, Rémi (22 May 2022). "Road and Railway Smart Mobility: A High-Definition Ground Truth Hybrid Dataset". Sensors. 22 (10): 3922. Bibcode:2022Senso..22.3922K. doi:10.3390/s22103922. PMC 9143394. PMID 35632331. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9143394
ICONS 2022: the seventeenth International Conference on Systems: April 24-28, 2022, Barcelona, Spain. Wilmington, DE, USA: IARIA. 2022. ISBN 978-1-61208-941-6.
Jiang, Tengping; Li, Shiwei; Zhang, Qinyu; Wang, Guangshuai; Zhang, Zequn; Zeng, Fankun; An, Peng; Jin, Xin; Liu, Shan; Wang, Yongjun (2024). "RailPC: A large-scale railway point cloud semantic segmentation dataset". CAAI Transactions on Intelligence Technology. 9 (6): 1548–1560. doi:10.1049/cit2.12349. ISSN 2468-2322. https://doi.org/10.1049%2Fcit2.12349
Abid, Mahdi; Teixeira, Mathis; Mahtani, Ankur; Laurent, Thomas (2024). "RailCloud-HdF: A Large-Scale Point Cloud Dataset for Railway Scene Semantic Segmentation". Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. pp. 159–170. doi:10.5220/0012394800003660. ISBN 978-989-758-679-8.
Tagiew, Rustam; Wunderlich, Ilkay; Zanitzer, Philipp; Sastuba, Mark; Knoll, Carsten; Göller, Kilian; Amjad, Haadia; Seitz, Steffen (2025). "Görlitz Rail Test Center CV Dataset 2024 (RailGoerl24)". German National Library of Science and Technology. https://data.fid-move.de/de/dataset/railgoerl24
"Face Recognition Homepage - Databases". www.face-rec.org. Retrieved 2025-04-26. https://www.face-rec.org/databases/
Huang, Gary B., et al. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Vol. 1. No. 2. Technical Report 07-49, University of Massachusetts, Amherst, 2007. https://hal.inria.fr/docs/00/32/19/23/PDF/Huang_long_eccv2008-lfw.pdf
"LFW Face Database : Main". web.archive.org. 2012-12-01. Archived from the original on 2012-12-01. Retrieved 2025-04-26. https://web.archive.org/web/20121201044531/http://vis-www.cs.umass.edu/lfw
Zafeiriou, S.; Kollias, D.; Nicolaou, M.A.; Papaioannou, A.; Zhao, G.; Kotsia, I. (2017). "Aff-Wild: Valence and Arousal 'In-the-Wild' Challenge" (PDF). 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). pp. 1980–1987. doi:10.1109/CVPRW.2017.248. ISBN 978-1-5386-0733-6. S2CID 3107614.
Kollias, D.; Tzirakis, P.; Nicolaou, M.A.; Papaioannou, A.; Zhao, G.; Schuller, B.; Kotsia, I.; Zafeiriou, S. (2019). "Deep Affect Prediction in-the-wild: Aff-Wild Database and Challenge, Deep Architectures, and Beyond". International Journal of Computer Vision. 127 (6–7): 907–929. arXiv:1804.10938. doi:10.1007/s11263-019-01158-4. S2CID 13679040. https://rdcu.be/bmGm2
Kollias, D.; Zafeiriou, S. (2019). "Expression, affect, action unit recognition: Aff-wild2, multi-task learning and arcface" (PDF). British Machine Vision Conference (BMVC), 2019. arXiv:1910.04855. https://bmvc2019.org/wp-content/uploads/papers/0399-paper.pdf
Kollias, D.; Schulc, A.; Hajiyev, E.; Zafeiriou, S. (2020). "Analysing Affective Behavior in the First ABAW 2020 Competition". 2020 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020). pp. 637–643. arXiv:2001.11409. doi:10.1109/FG47880.2020.00126. ISBN 978-1-7281-3079-8. S2CID 210966051.
Phillips, P. Jonathon; et al. (1998). "The FERET database and evaluation procedure for face-recognition algorithms". Image and Vision Computing. 16 (5): 295–306. doi:10.1016/s0262-8856(97)00070-x.
Wiskott, Laurenz; et al. (1997). "Face recognition by elastic bunch graph matching". IEEE Transactions on Pattern Analysis and Machine Intelligence. 19 (7): 775–779. CiteSeerX 10.1.1.44.2321. doi:10.1109/34.598235. S2CID 30523165.
Livingstone, Steven R.; Russo, Frank A. (2018). "The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English". PLOS ONE. 13 (5): e0196391. Bibcode:2018PLoSO..1396391L. doi:10.1371/journal.pone.0196391. PMC 5955500. PMID 29768426. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5955500
Livingstone, Steven R.; Russo, Frank A. (2018). "Emotion". The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). doi:10.5281/zenodo.1188976.
Grgic, Mislav; Delac, Kresimir; Grgic, Sonja (2011). "SCface – surveillance cameras face database". Multimedia Tools and Applications. 51 (3): 863–879. doi:10.1007/s11042-009-0417-2. S2CID 207218990.
Wallace, Roy, et al. "Inter-session variability modelling and joint factor analysis for face authentication." Biometrics (IJCB), 2011 International Joint Conference on. IEEE, 2011. https://repository.ubn.ru.nl/bitstream/handle/2066/94489/94489.pdf
Georghiades, A. "Yale face database". Center for Computational Vision and Control at Yale University. 2: 1997. http://CVC.yale.edu/Projects/Yalefaces/Yalefa
Nguyen, Duy; et al. (2006). "Real-time face detection and lip feature extraction using field-programmable gate arrays". IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics. 36 (4): 902–912. CiteSeerX 10.1.1.156.9848. doi:10.1109/tsmcb.2005.862728. PMID 16903373. S2CID 7334355.
Kanade, Takeo, Jeffrey F. Cohn, and Yingli Tian. "Comprehensive database for facial expression analysis." Automatic Face and Gesture Recognition, 2000. Proceedings. Fourth IEEE International Conference on. IEEE, 2000.
Zeng, Zhihong; et al. (2009). "A survey of affect recognition methods: Audio, visual, and spontaneous expressions". IEEE Transactions on Pattern Analysis and Machine Intelligence. 31 (1): 39–58. CiteSeerX 10.1.1.144.217. doi:10.1109/tpami.2008.52. PMID 19029545.
Lyons, Michael; Kamachi, Miyuki; Gyoba, Jiro (1998). "Facial expression images". The Japanese Female Facial Expression (JAFFE) Database. doi:10.5281/zenodo.3451524.
Lyons, Michael; Akamatsu, Shigeru; Kamachi, Miyuki; Gyoba, Jiro. "Coding facial expressions with Gabor wavelets." Automatic Face and Gesture Recognition, 1998. Proceedings. Third IEEE International Conference on. IEEE, 1998. https://zenodo.org/record/3430156
Ng, Hong-Wei, and Stefan Winkler. "A data-driven approach to cleaning large face datasets." Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014. Archived 6 December 2019 at the Wayback Machine. http://vintage.winklerbros.net/Publications/icip2014a.pdf
RoyChowdhury, Aruni; Lin, Tsung-Yu; Maji, Subhransu; Learned-Miller, Erik (2015). "One-to-many face recognition with bilinear CNNs". arXiv:1506.01342 [cs.CV].
Jesorsky, Oliver, Klaus J. Kirchberg, and Robert W. Frischholz. "Robust face detection using the hausdorff distance." Audio-and video-based biometric person authentication. Springer Berlin Heidelberg, 2001.
Bhatt, Rajen B., et al. "Efficient skin region segmentation using low complexity fuzzy decision tree model." India Conference (INDICON), 2009 Annual IEEE. IEEE, 2009. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.708.9158&rep=rep1&type=pdf
Lingala, Mounika; et al. (2014). "Fuzzy logic color detection: Blue areas in melanoma dermoscopy images". Computerized Medical Imaging and Graphics. 38 (5): 403–410. doi:10.1016/j.compmedimag.2014.03.007. PMC 4287461. PMID 24786720. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4287461
Maes, Chris, et al. "Feature detection on 3D face surfaces for pose normalisation and recognition." Biometrics: Theory Applications and Systems (BTAS), 2010 Fourth IEEE International Conference on. IEEE, 2010. https://lirias.kuleuven.be/retrieve/135678
Savran, Arman, et al. "Bosphorus database for 3D face analysis." Biometrics and Identity Management. Springer Berlin Heidelberg, 2008. 47–56. https://web.archive.org/web/20190222192331/http://pdfs.semanticscholar.org/4254/fbba3846008f50671edc9cf70b99d7304543.pdf
Heseltine, Thomas, Nick Pears, and Jim Austin. "Three-dimensional face recognition: An eigensurface approach." Image Processing, 2004. ICIP'04. 2004 International Conference on. Vol. 2. IEEE, 2004. http://eprints.whiterose.ac.uk/1526/01/austinj4.pdf
Ge, Yun; et al. (2011). "3D Novel Face Sample Modeling for Face Recognition". Journal of Multimedia. 6 (5): 467–475. CiteSeerX 10.1.1.461.9710. doi:10.4304/jmm.6.5.467-475.
Wang, Yueming; Liu, Jianzhuang; Tang, Xiaoou (2010). "Robust 3D face recognition by local shape difference boosting". IEEE Transactions on Pattern Analysis and Machine Intelligence. 32 (10): 1858–1870. CiteSeerX 10.1.1.471.2424. doi:10.1109/tpami.2009.200. PMID 20724762. S2CID 15263913.
Zhong, Cheng, Zhenan Sun, and Tieniu Tan. "Robust 3D face recognition using learned visual codebook." Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on. IEEE, 2007. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.580.8534&rep=rep1&type=pdf
Zhao, G.; Huang, X.; Taini, M.; Li, S. Z.; Pietikäinen, M. (2011). "Facial expression recognition from near-infrared videos" (PDF). Image and Vision Computing. 29 (9): 607–619. doi:10.1016/j.imavis.2011.07.002. http://www.academia.edu/download/42229488/Image_and_Vision_Computing20160206-29020-1auzaon.pdf
Soyel, Hamit, and Hasan Demirel. "Facial expression recognition using 3D facial feature distances." Image Analysis and Recognition. Springer Berlin Heidelberg, 2007. 831–838. https://pdfs.semanticscholar.org/cf81/4b618fcbc9a556cdce225e74a8806867ba84.pdf
Bowyer, Kevin W.; Chang, Kyong; Flynn, Patrick (2006). "A survey of approaches and challenges in 3D and multi-modal 3D+ 2D face recognition". Computer Vision and Image Understanding. 101 (1): 1–15. CiteSeerX 10.1.1.134.8784. doi:10.1016/j.cviu.2005.05.005.
Tan, Xiaoyang; Triggs, Bill (2010). "Enhanced local texture feature sets for face recognition under difficult lighting conditions". IEEE Transactions on Image Processing. 19 (6): 1635–1650. Bibcode:2010ITIP...19.1635T. CiteSeerX 10.1.1.105.3355. doi:10.1109/tip.2010.2042645. PMID 20172829. S2CID 4943234.
Mousavi, Mir Hashem; Faez, Karim; Asghari, Amin (2008). "Three Dimensional Face Recognition Using SVM Classifier". Seventh IEEE/ACIS International Conference on Computer and Information Science (Icis 2008). pp. 208–213. doi:10.1109/ICIS.2008.77. ISBN 978-0-7695-3131-1. S2CID 2710422.
Amberg, Brian; Knothe, Reinhard; Vetter, Thomas (2008). "Expression invariant 3D face recognition with a Morphable Model" (PDF). 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition. pp. 1–6. doi:10.1109/AFGR.2008.4813376. ISBN 978-1-4244-2154-1. S2CID 5651453. Archived from the original (PDF) on 28 July 2018. Retrieved 6 August 2019.
Irfanoglu, M.O.; Gokberk, B.; Akarun, L. (2004). "3D shape-based face recognition using automatically registered facial surfaces". Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004. pp. 183–186 Vol.4. doi:10.1109/ICPR.2004.1333734. ISBN 0-7695-2128-2. S2CID 10987293.
Beumier, Charles; Acheroy, Marc (2001). "Face verification from 3D and grey level clues". Pattern Recognition Letters. 22 (12): 1321–1329. Bibcode:2001PaReL..22.1321B. doi:10.1016/s0167-8655(01)00077-0.
Afifi, Mahmoud; Abdelhamed, Abdelrahman (2017-06-13). "AFIF4: Deep Gender Classification based on AdaBoost-based Fusion of Isolated Facial Features and Foggy Faces". arXiv:1706.04277 [cs.CV].
"SoF dataset". sites.google.com. Retrieved 2017-11-18. https://sites.google.com/view/sof-dataset
"IMDb-WIKI". data.vision.ee.ethz.ch. Retrieved 2018-03-13. https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/
"AVA: A Video Dataset of Atomic Visual Action". research.google.com. Retrieved 2024-10-18. https://research.google.com/ava/
Li, Ang; Thotakuri, Meghana; Ross, David A.; Carreira, João; Vostrikov, Alexander; Zisserman, Andrew (2020-05-20). "The AVA-Kinetics Localized Human Actions Video Dataset". arXiv:2005.00214 [cs.CV].
Patron-Perez, A.; Marszalek, M.; Reid, I.; Zisserman, A. (2012). "Structured learning of human interactions in TV shows". IEEE Transactions on Pattern Analysis and Machine Intelligence. 34 (12): 2441–2453. doi:10.1109/tpami.2012.24. PMID 23079467. S2CID 6060568.
Ofli, F., Chaudhry, R., Kurillo, G., Vidal, R., & Bajcsy, R. (January 2013). Berkeley MHAD: A comprehensive multimodal human action database. In Applications of Computer Vision (WACV), 2013 IEEE Workshop on (pp. 53–60). IEEE. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.432.5113&rep=rep1&type=pdf
Jiang, Y. G., et al. "THUMOS challenge: Action recognition with a large number of classes." ICCV Workshop on Action Recognition with a Large Number of Classes. 2013. http://crcv.ucf.edu/ICCV13-Action-Workshop
Simonyan, Karen, and Andrew Zisserman. "Two-stream convolutional networks for action recognition in videos." Advances in Neural Information Processing Systems. 2014. https://papers.nips.cc/paper/5353-two-stream-convolutional-networks-for-action-recognition-in-videos.pdf
Stoian, Andrei; Ferecatu, Marin; Benois-Pineau, Jenny; Crucianu, Michel (2016). "Fast Action Localization in Large-Scale Video Archives". IEEE Transactions on Circuits and Systems for Video Technology. 26 (10): 1917–1930. doi:10.1109/TCSVT.2015.2475835. S2CID 31537462.
Botta, M., A. Giordana, and L. Saitta. "Learning fuzzy concept definitions." Fuzzy Systems, 1993., Second IEEE International Conference on. IEEE, 1993.
Frey, Peter W.; Slate, David J. (1991). "Letter recognition using Holland-style adaptive classifiers". Machine Learning. 6 (2): 161–182. doi:10.1007/bf00114162. https://doi.org/10.1007%2Fbf00114162
Peltonen, Jaakko; Klami, Arto; Kaski, Samuel (2004). "Improved learning of Riemannian metrics for exploratory analysis". Neural Networks. 17 (8): 1087–1100. CiteSeerX 10.1.1.59.4865. doi:10.1016/j.neunet.2004.06.008. PMID 15555853.
Liu, Cheng-Lin; Yin, Fei; Wang, Da-Han; Wang, Qiu-Feng (January 2013). "Online and offline handwritten Chinese character recognition: Benchmarking on new databases". Pattern Recognition. 46 (1): 155–162. Bibcode:2013PatRe..46..155L. doi:10.1016/j.patcog.2012.06.021.
Wang, D.; Liu, C.; Yu, J.; Zhou, X. (2009). "CASIA-OLHWDB1: A Database of Online Handwritten Chinese Characters". 2009 10th International Conference on Document Analysis and Recognition. pp. 1206–1210. doi:10.1109/ICDAR.2009.163. ISBN 978-1-4244-4500-4. S2CID 5705532.
Williams, Ben H., Marc Toussaint, and Amos J. Storkey. Extracting motion primitives from natural handwriting data. Springer Berlin Heidelberg, 2006. https://www.era.lib.ed.ac.uk/bitstream/handle/1842/3221/BH%20Williams%20PhD%20thesis%2009.pdf?sequence=1
Meier, Franziska, et al. "Movement segmentation using a primitive library." Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on. IEEE, 2011. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.395.8598&rep=rep1&type=pdf
T. E. de Campos, B. R. Babu and M. Varma. Character recognition in natural images. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP), Lisbon, Portugal, February 2009 http://personal.ee.surrey.ac.uk/Personal/T.Decampos/papers/decampos_etal_visapp2009.pdf
Cohen, Gregory; Afshar, Saeed; Tapson, Jonathan; André van Schaik (2017). "EMNIST: An extension of MNIST to handwritten letters". arXiv:1702.05373v1 [cs.CV].
"The EMNIST Dataset". NIST. 4 April 2017. https://www.nist.gov/itl/products-and-services/emnist-dataset
Cohen, Gregory; Afshar, Saeed; Tapson, Jonathan; André van Schaik (2017). "EMNIST: An extension of MNIST to handwritten letters". arXiv:1702.05373 [cs.CV].
Llorens, David, et al. "The UJIpenchars Database: a Pen-Based Database of Isolated Handwritten Characters." LREC. 2008. https://web.archive.org/web/20190806015012/https://pdfs.semanticscholar.org/24cf/ef15094c59322560377bbf8e4185245c654f.pdf
Calderara, Simone; Prati, Andrea; Cucchiara, Rita (2011). "Mixtures of von mises distributions for people trajectory shape analysis". IEEE Transactions on Circuits and Systems for Video Technology. 21 (4): 457–471. doi:10.1109/tcsvt.2011.2125550. S2CID 1427766.
Guyon, Isabelle, et al. "Result analysis of the nips 2003 feature selection challenge." Advances in neural information processing systems. 2004. http://papers.nips.cc/paper/2728-result-analysis-of-the-nips-2003-feature-selection-challenge.pdf
Lake, B. M.; Salakhutdinov, R.; Tenenbaum, J. B. (2015-12-11). "Human-level concept learning through probabilistic program induction". Science. 350 (6266): 1332–1338. Bibcode:2015Sci...350.1332L. doi:10.1126/science.aab3050. ISSN 0036-8075. PMID 26659050. https://doi.org/10.1126%2Fscience.aab3050
Lake, Brenden (2019-11-09). "Omniglot data set for one-shot learning". GitHub. Retrieved 2019-11-10. https://github.com/brendenlake/omniglot
LeCun, Yann; et al. (1998). "Gradient-based learning applied to document recognition". Proceedings of the IEEE. 86 (11): 2278–2324. CiteSeerX 10.1.1.32.9552. doi:10.1109/5.726791. S2CID 14542261.
Kussul, Ernst; Baidyk, Tatiana (2004). "Improved method of handwritten digit recognition tested on MNIST database". Image and Vision Computing. 22 (12): 971–981. doi:10.1016/j.imavis.2004.03.008.
Xu, Lei; Krzyżak, Adam; Suen, Ching Y. (1992). "Methods of combining multiple classifiers and their applications to handwriting recognition". IEEE Transactions on Systems, Man, and Cybernetics. 22 (3): 418–435. doi:10.1109/21.155943. hdl:10338.dmlcz/135217.
Alimoglu, Fevzi, et al. "Combining multiple classifiers for pen-based handwritten digit recognition." (1996). http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.25.6299
Tang, E. Ke; et al. (2005). "Linear dimensionality reduction using relevance weighted LDA". Pattern Recognition. 38 (4): 485–493. Bibcode:2005PatRe..38..485T. doi:10.1016/j.patcog.2004.09.005. S2CID 10580110.
Hong, Yi, et al. "Learning a mixture of sparse distance metrics for classification and dimensionality reduction." Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011. https://pages.ucsd.edu/~ztu/publication/iccv11_sparsemetric.pdf
Thoma, Martin (2017). "The HASYv2 dataset". arXiv:1701.08380 [cs.CV].
Karki, Manohar; Liu, Qun; DiBiano, Robert; Basu, Saikat; Mukhopadhyay, Supratik (2018-06-20). "Pixel-level Reconstruction and Classification for Noisy Handwritten Bangla Characters". arXiv:1806.08037 [cs.CV].
Liu, Qun; Collier, Edward; Mukhopadhyay, Supratik (2019). "PCGAN-CHAR: Progressively Trained Classifier Generative Adversarial Networks for Classification of Noisy Handwritten Bangla Characters". Digital Libraries at the Crossroads of Digital Information for the Future. Lecture Notes in Computer Science. Vol. 11853. Springer International Publishing. pp. 3–15. arXiv:1908.08987. doi:10.1007/978-3-030-34058-2_1. ISBN 978-3-030-34057-5. S2CID 201665955.
"iSAID". captain-whu.github.io. Retrieved 2021-11-30. https://captain-whu.github.io/iSAID/index.html
Zamir, Syed; Arora, Aditya; Gupta, Akshita; Khan, Salman; Sun, Guolei; Khan, Fahad; Zhu, Fan; Shao, Ling; Xia, Gui-Song; Bai, Xiang (2019). "iSAID: A Large-scale Dataset for Instance Segmentation in Aerial Images". https://captain-whu.github.io/iSAID/index.html
Yuan, Jiangye; Gleason, Shaun S.; Cheriyadat, Anil M. (2013). "Systematic benchmarking of aerial image segmentation". IEEE Geoscience and Remote Sensing Letters. 10 (6): 1527–1531. Bibcode:2013IGRSL..10.1527Y. doi:10.1109/lgrs.2013.2261453. S2CID 629629.
Vatsavai, Ranga Raju. "Object based image classification: state of the art and computational challenges." Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data. ACM, 2013. https://dl.acm.org/citation.cfm?id=2534927
Butenuth, Matthias, et al. "Integrating pedestrian simulation, tracking and event detection for crowd analysis." Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on. IEEE, 2011. http://www.hartmann-alberts.de/dirk/pub/proceedings2011e.pdf
Fradi, Hajer, and Jean-Luc Dugelay. "Low level crowd analysis using frame-wise normalized feature for people counting." Information Forensics and Security (WIFS), 2012 IEEE International Workshop on. IEEE, 2012. http://www.eurecom.fr/fr/publication/3841/download/mm-publi-3841.pdf
Johnson, Brian Alan, Ryutaro Tateishi, and Nguyen Thanh Hoan. "A hybrid pansharpening approach and multiscale object-based image analysis for mapping diseased pine and oak trees." International Journal of Remote Sensing 34.20 (2013): 6969–6982. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.826.9200&rep=rep1&type=pdf
Mohd Pozi, Muhammad Syafiq; Sulaiman, Md Nasir; Mustapha, Norwati; Perumal, Thinagaran (2015). "A new classification model for a class imbalanced data set using genetic programming and support vector machines: Case study for wilt disease classification". Remote Sensing Letters. 6 (7): 568–577. Bibcode:2015RSL.....6..568M. doi:10.1080/2150704X.2015.1062159. S2CID 58788630. https://www.tandfonline.com/doi/abs/10.1080/2150704X.2015.1062159
Gallego, A.-J.; Pertusa, A.; Gil, P. "Automatic Ship Classification from Optical Aerial Images with Convolutional Neural Networks." Remote Sensing. 2018; 10(4):511. https://www.mdpi.com/2072-4292/10/4/511
Gallego, A.-J.; Pertusa, A.; Gil, P. "MAritime SATellite Imagery dataset". 2018. https://www.iuii.ua.es/datasets/masati/
Johnson, Brian; Tateishi, Ryutaro; Xie, Zhixiao (2012). "Using geographically weighted variables for image classification". Remote Sensing Letters. 3 (6): 491–499. Bibcode:2012RSL.....3..491J. doi:10.1080/01431161.2011.629637. S2CID 122543681.
Chatterjee, Sankhadeep, et al. "Forest Type Classification: A Hybrid NN-GA Model Based Approach." Information Systems Design and Intelligent Applications. Springer India, 2016. 227–236. https://www.researchgate.net/profile/Sankhadeep_Chatterjee/publication/282605325_Forest_Type_Classification_A_Hybrid_NN-GA_Model_Based_Approach/links/57493cb308ae5c51e29e6f1b/Forest-Type-Classification-A-Hybrid-NN-GA-Model-Based-Approach.pdf
Diegert, Carl. "A combinatorial method for tracing objects using semantics of their shape." Applied Imagery Pattern Recognition Workshop (AIPR), 2010 IEEE 39th. IEEE, 2010. https://www.osti.gov/servlets/purl/1278837
Razakarivony, Sebastien, and Frédéric Jurie. "Small target detection combining foreground and background manifolds." IAPR International Conference on Machine Vision Applications. 2013. https://hal.archives-ouvertes.fr/hal-00943444/file/13_mva-detection.pdf
"SpaceNet". explore.digitalglobe.com. Archived from the original on 13 March 2018. Retrieved 2018-03-13. https://web.archive.org/web/20180313092809/http://explore.digitalglobe.com/spacenet
Etten, Adam Van (2017-01-05). "Getting Started With SpaceNet Data". The DownLinQ. Retrieved 2018-03-13. https://medium.com/the-downlinq/getting-started-with-spacenet-data-827fd2ec9f53
Vakalopoulou, M.; Bus, N.; Karantzalosa, K.; Paragios, N. (July 2017). "Integrating edge/Boundary priors with classification scores for building detection in very high resolution data". 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS). pp. 3309–3312. doi:10.1109/IGARSS.2017.8127705. ISBN 978-1-5090-4951-6. S2CID 8297433.
Yang, Yi; Newsam, Shawn (2010). "Bag-of-visual-words and spatial extensions for land-use classification". Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems. New York, New York, USA: ACM Press. pp. 270–279. doi:10.1145/1869790.1869829. ISBN 9781450304283. S2CID 993769.
Basu, Saikat; Ganguly, Sangram; Mukhopadhyay, Supratik; DiBiano, Robert; Karki, Manohar; Nemani, Ramakrishna (2015-11-03). "DeepSat: A learning framework for satellite imagery". Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM. pp. 1–10. doi:10.1145/2820783.2820816. ISBN 9781450339674. S2CID 4387134.
Liu, Qun; Basu, Saikat; Ganguly, Sangram; Mukhopadhyay, Supratik; DiBiano, Robert; Karki, Manohar; Nemani, Ramakrishna (2019-11-21). "DeepSat V2: feature augmented convolutional neural nets for satellite image classification". Remote Sensing Letters. 11 (2): 156–165. arXiv:1911.07747. doi:10.1080/2150704x.2019.1693071. ISSN 2150-704X. S2CID 208138097.
Md Jahidul Islam, et al. "Semantic Segmentation of Underwater Imagery: Dataset and Benchmark." 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020. https://ieeexplore.ieee.org/abstract/document/9340821
Waszak et al. "Semantic Segmentation in Underwater Ship Inspections: Benchmark and Data Set." IEEE Journal of Oceanic Engineering. IEEE, 2022. https://ieeexplore.ieee.org/document/9998080
"True Color Kodak Images". r0k.us. Retrieved 2025-02-27. https://r0k.us/graphics/kodak/
Ebadi, Ashkan; Paul, Patrick; Auer, Sofia; Tremblay, Stéphane (2021-11-12). "NRC-GAMMA: Introducing a Novel Large Gas Meter Image Dataset". arXiv:2111.06827 [cs.CV].
Canada, Government of Canada National Research Council (2021). "The gas meter image dataset (NRC-GAMMA) - NRC Digital Repository". nrc-digital-repository.canada.ca. doi:10.4224/3c8s-z290. Retrieved 2021-12-02. https://nrc-digital-repository.canada.ca/eng/view/object/?id=ba1fc493-e65f-4c0a-ab31-ecbcdf00bfa4
Rabah, Chaima Ben; Coatrieux, Gouenou; Abdelfattah, Riadh (October 2020). "The Supatlantique Scanned Documents Database for Digital Image Forensics Purposes". 2020 IEEE International Conference on Image Processing (ICIP). IEEE. pp. 2096–2100. doi:10.1109/icip40778.2020.9190665. ISBN 978-1-7281-6395-6. S2CID 224881147.
Mills, Kyle; Tamblyn, Isaac (2018-05-16). "Big graphene dataset". National Research Council of Canada. doi:10.4224/c8sc04578j.data.
Mills, Kyle; Spanner, Michael; Tamblyn, Isaac (2018-05-16). "Quantum simulation". Quantum simulations of an electron in a two dimensional potential well. National Research Council of Canada. doi:10.4224/PhysRevA.96.042113.data.
Rohrbach, M.; Amin, S.; Andriluka, M.; Schiele, B. (2012). "A database for fine grained activity detection of cooking activities". 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE. pp. 1194–1201. doi:10.1109/cvpr.2012.6247801. ISBN 978-1-4673-1228-8.
Kuehne, Hilde, Ali Arslan, and Thomas Serre. "The language of actions: Recovering the syntax and semantics of goal-directed human activities." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014. https://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Kuehne_The_Language_of_2014_CVPR_paper.pdf
Voloshynovskiy, Sviatoslav, et al. "Towards Reproducible results in authentication based on physical non-cloneable functions: The Forensic Authentication Microstructure Optical Set (FAMOS)." Proceedings of IEEE International Workshop on Information Forensics and Security. 2012. http://vision.unige.ch/publications/postscript/2012/2012.WIFS.database.pdf
Taran, Olga; Rezaeifar, Shideh; et al. "PharmaPack: mobile fine-grained recognition of pharma packages." Proc. European Signal Processing Conference (EUSIPCO). 2017. https://archive-ouverte.unige.ch/unige:97444/ATTACHMENT01
Khosla, Aditya, et al. "Novel dataset for fine-grained image categorization: Stanford dogs." Proc. CVPR Workshop on Fine-Grained Visual Categorization (FGVC). 2011. https://people.csail.mit.edu/khosla/papers/fgvc2011.pdf
Parkhi, Omkar M., et al. "Cats and dogs." Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012. http://www.robots.ox.ac.uk:5000/~vgg/publications/2012/parkhi12a/parkhi12a.pdf
Biggs, Benjamin; Boyne, Oliver; Charles, James; Fitzgibbon, Andrew; Cipolla, Roberto (2020). Computer Vision – ECCV 2020. Lecture Notes in Computer Science. Vol. 12356. arXiv:2007.11110. doi:10.1007/978-3-030-58621-8. ISBN 978-3-030-58620-1. S2CID 227173931.
Razavian, Ali, et al. "CNN features off-the-shelf: an astounding baseline for recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2014. https://www.cv-foundation.org/openaccess/content_cvpr_workshops_2014/W15/papers/Razavian_CNN_Features_Off-the-Shelf_2014_CVPR_paper.pdf
Ortega, Michael; et al. (1998). "Supporting ranked boolean similarity queries in MARS". IEEE Transactions on Knowledge and Data Engineering. 10 (6): 905–925. CiteSeerX 10.1.1.36.6079. doi:10.1109/69.738357.
He, Xuming, Richard S. Zemel, and Miguel Á. Carreira-Perpiñán. "Multiscale conditional random fields for image labeling." Computer vision and pattern recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE computer society conference on. Vol. 2. IEEE, 2004. ftp://www-vhost.cs.toronto.edu/public_html/public_html/dist/zemel/Papers/cvpr04.pdf
Deneke, Tewodros, et al. "Video transcoding time prediction for proactive load balancing." Multimedia and Expo (ICME), 2014 IEEE International Conference on. IEEE, 2014. https://ieeexplore.ieee.org/abstract/document/6890256/
Ting-Hao (Kenneth) Huang, Francis Ferraro, Nasrin Mostafazadeh, Ishan Misra, Aishwarya Agrawal, Jacob Devlin, Ross Girshick, Xiaodong He, Pushmeet Kohli, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh, Lucy Vanderwende, Michel Galley, Margaret Mitchell (13 April 2016). "Visual Storytelling". arXiv:1604.03968 [cs.CL].
Wah, Catherine, et al. "The caltech-ucsd birds-200-2011 dataset." (2011). https://authors.library.caltech.edu/27452/1/CUB_200_2011.pdf
Duan, Kun, et al. "Discovering localized attributes for fine-grained recognition." Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012. http://vision.soic.indiana.edu/papers/attributes2012cvpr.pdf
"YouTube-8M Dataset". research.google.com. Retrieved 1 October 2016. https://research.google.com/youtube8m/
Abu-El-Haija, Sami; Kothari, Nisarg; Lee, Joonseok; Natsev, Paul; Toderici, George; Varadarajan, Balakrishnan; Vijayanarasimhan, Sudheendra (27 September 2016). "YouTube-8M: A Large-Scale Video Classification Benchmark". arXiv:1609.08675 [cs.CV].
"YFCC100M Dataset". mmcommons.org. Yahoo-ICSI-LLNL. Retrieved 1 June 2017. http://mmcommons.org
Bart Thomee; David A Shamma; Gerald Friedland; Benjamin Elizalde; Karl Ni; Douglas Poland; Damian Borth; Li-Jia Li (25 April 2016). "Yfcc100m: The new data in multimedia research". Communications of the ACM. 59 (2): 64–73. arXiv:1503.01817. doi:10.1145/2812802. S2CID 207230134.
Y. Baveye, E. Dellandrea, C. Chamaret, and L. Chen, "LIRIS-ACCEDE: A Video Database for Affective Content Analysis," in IEEE Transactions on Affective Computing, 2015. https://hal.archives-ouvertes.fr/hal-01375518/document
Y. Baveye, E. Dellandrea, C. Chamaret, and L. Chen, "Deep Learning vs. Kernel Methods: Performance for Emotion Prediction in Videos," in 2015 Humaine Association Conference on Affective Computing and Intelligent Interaction (ACII), 2015. https://hal.archives-ouvertes.fr/hal-01193144/document
M. Sjöberg, Y. Baveye, H. Wang, V. L. Quang, B. Ionescu, E. Dellandréa, M. Schedl, C.-H. Demarty, and L. Chen, "The mediaeval 2015 affective impact of movies task," in MediaEval 2015 Workshop, 2015. https://www.researchgate.net/profile/Hanli_Wang2/publication/309704559_The_MediaEval_2015_Affective_Impact_of_Movies_Task/links/581dada308ae12715af33bc8/The-MediaEval-2015-Affective-Impact-of-Movies-Task.pdf
S. Johnson and M. Everingham, "Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation", in Proceedings of the 21st British Machine Vision Conference (BMVC2010). Archived 2021-11-04 at the Wayback Machine. http://sam.johnson.io/research/publications/johnson10bmvc.pdf
S. Johnson and M. Everingham, "Learning Effective Human Pose Estimation from Inaccurate Annotation", in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR2011). Archived 2021-11-04 at the Wayback Machine. http://sam.johnson.io/research/publications/johnson11cvpr.pdf
Afifi, Mahmoud; Hussain, Khaled F. (2017-11-02). "The Achievement of Higher Flexibility in Multiple Choice-based Tests Using Image Classification Techniques". arXiv:1711.00972 [cs.CV].
"MCQ Dataset". sites.google.com. Retrieved 2017-11-18. https://sites.google.com/view/mcq-dataset/mcqe-dataset
Taj-Eddin, I. A. T. F.; Afifi, M.; Korashy, M.; Hamdy, D.; Nasser, M.; Derbaz, S. (July 2016). "A new compression technique for surveillance videos: Evaluation using new dataset". 2016 Sixth International Conference on Digital Information and Communication Technology and its Applications (DICTAP). pp. 159–164. doi:10.1109/DICTAP.2016.7544020. ISBN 978-1-4673-9609-7. S2CID 8698850.
Tabak, Michael A.; Norouzzadeh, Mohammad S.; Wolfson, David W.; Sweeney, Steven J.; Vercauteren, Kurt C.; Snow, Nathan P.; Halseth, Joseph M.; Di Salvo, Paul A.; Lewis, Jesse S.; White, Michael D.; Teton, Ben; Beasley, James C.; Schlichting, Peter E.; Boughton, Raoul K.; Wight, Bethany; Newkirk, Eric S.; Ivan, Jacob S.; Odell, Eric A.; Brook, Ryan K.; Lukacs, Paul M.; Moeller, Anna K.; Mandeville, Elizabeth G.; Clune, Jeff; Miller, Ryan S.; Photopoulou, Theoni (2018). "Machine learning to classify animal species in camera trap images: Applications in ecology". Methods in Ecology and Evolution. 10 (4): 585–590. doi:10.1111/2041-210X.13120. ISSN 2041-210X. https://doi.org/10.1111%2F2041-210X.13120
Taj-Eddin, Islam A. T. F.; Afifi, Mahmoud; Korashy, Mostafa; Ahmed, Ali H.; Ng, Yoke Cheng; Hernandez, Evelyng; Abdel-Latif, Salma M. (November 2017). "Can we see photosynthesis? Magnifying the tiny color changes of plant green leaves using Eulerian video magnification". Journal of Electronic Imaging. 26 (6): 060501. arXiv:1706.03867. Bibcode:2017JEI....26f0501T. doi:10.1117/1.jei.26.6.060501. ISSN 1017-9909. S2CID 12367169.
"Mathematical Mathematics Memes". https://www.kaggle.com/abdelghanibelgaid/mathematical-mathematics-memes
Karras, Tero; Laine, Samuli; Aila, Timo (June 2019). "A Style-Based Generator Architecture for Generative Adversarial Networks". 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. pp. 4396–4405. arXiv:1812.04948. doi:10.1109/cvpr.2019.00453. ISBN 978-1-7281-3293-8. S2CID 54482423.
Oltean, Mihai (2017). "Fruits-360 dataset". GitHub. https://www.github.com/fruits-360