
The ApolloScape Open Dataset for Autonomous Driving and Its Application

Published: 01 October 2020

Abstract

Autonomous driving has attracted tremendous attention, especially in the past few years. The key techniques for a self-driving car include solving tasks like 3D map construction, self-localization, parsing the driving road, and understanding objects, which enable vehicles to reason and act. However, large-scale datasets for training and system evaluation remain a bottleneck for developing robust perception models. In this paper, we present the ApolloScape dataset [1] and its applications for autonomous driving. Compared with existing public datasets from real scenes, e.g., KITTI [2] or Cityscapes [3], ApolloScape provides much larger and richer labelling, including holistic semantic dense point clouds for each site, stereo imagery, per-pixel semantic labelling, lanemark labelling, instance segmentation, 3D car instances, and highly accurate locations for every frame in various driving videos from multiple sites, cities, and daytimes. For each task, it contains at least 15x more images than state-of-the-art datasets. To label such a complete dataset, we develop various tools and algorithms specialized for each task to accelerate the labelling process, such as joint 3D-2D segment labelling and active labelling in videos. Building on ApolloScape, we are able to develop algorithms that jointly consider the learning and inference of multiple tasks. In this paper, we provide a sensor fusion scheme integrating camera videos, consumer-grade motion sensors (GPS/IMU), and a 3D semantic map in order to achieve robust self-localization and semantic segmentation for autonomous driving. We show that, in practice, sensor fusion and joint learning of multiple tasks are beneficial for achieving a more robust and accurate system. We expect our dataset and the proposed algorithms to support and motivate researchers in further developing multi-sensor fusion and multi-task learning in the field of computer vision.
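The fusion claim in the abstract is easy to make concrete. Below is a minimal, illustrative sketch (Python/NumPy; the function name, noise parameters, and the constant-velocity model are our own assumptions, not the paper's method, which couples deep networks with the 3D semantic map): it fuses noisy consumer-grade GPS fixes with IMU-style velocities through a linear Kalman filter [84], showing why a fused pose track is more accurate than either sensor alone.

```python
# Toy stand-in for the sensor-fusion idea in the abstract: smooth a noisy
# GPS position stream using IMU-integrated velocities with a linear Kalman
# filter. Hedged sketch only -- not the paper's actual deep fusion system.
import numpy as np

def kalman_fuse(gps_xy, imu_vel, dt=0.1, gps_var=4.0, imu_var=0.05):
    """Fuse GPS positions (N,2) with IMU velocities (N,2) into a smoothed track."""
    x = np.zeros(4)                                  # state: [px, py, vx, vy]
    x[:2] = gps_xy[0]
    P = np.eye(4)                                    # state covariance
    F = np.eye(4); F[0, 2] = F[1, 3] = dt            # constant-velocity motion model
    H = np.array([[1., 0, 0, 0], [0, 1., 0, 0]])     # GPS observes position only
    Q = imu_var * np.eye(4)                          # process noise (IMU drift)
    R = gps_var * np.eye(2)                          # GPS measurement noise
    track = []
    for z, v in zip(gps_xy, imu_vel):
        x[2:] = v                                    # inject IMU velocity estimate
        x = F @ x                                    # predict next pose
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R                          # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
        x = x + K @ (z - H @ x)                      # correct with the GPS fix
        P = (np.eye(4) - K @ H) @ P
        track.append(x[:2].copy())
    return np.asarray(track)

if __name__ == "__main__":
    t = np.arange(0, 10, 0.1)
    truth = np.stack([t, np.sin(t)], axis=1)              # ground-truth path
    gps = truth + np.random.normal(0, 2.0, truth.shape)   # noisy GPS fixes
    vel = np.gradient(truth, 0.1, axis=0)                 # IMU-style velocities
    fused = kalman_fuse(gps, vel)
    print("GPS RMSE:  ", np.sqrt(((gps - truth) ** 2).mean()))
    print("Fused RMSE:", np.sqrt(((fused - truth) ** 2).mean()))
```

Running the script prints the position error of the raw GPS fixes against the fused track; with the toy noise levels above, fusion typically cuts the error several-fold, which is the practical benefit of combining sensors that the abstract refers to.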

References

[1]
“ApolloScape Website,” 2018. [Online]. Available: http://apolloscape.auto
[2]
A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The kitti dataset,” Int. J. Robot. Res., vol. 32, pp. 1231–1237, 2013.
[3]
M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, “The cityscapes dataset for semantic urban scene understanding,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 3213–3223.
[4]
Velodyne Lidar, “HDL-64E,” 2018. [Online]. Available: http://velodynelidar.com/, Accessed on: Mar. 1, 2018.
[5]
A. Kar, C. Häne, and J. Malik, “Learning a multi-view stereo machine,” in Proc. 31st Int. Conf. Neural Inf. Process. Syst., 2017, pp. 365–376.
[6]
P.-H. Huang, K. Matzen, J. Kopf, N. Ahuja, and J.-B. Huang, “Deepmvs: Learning multi-view stereopsis,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 2821–2830.
[7]
Y. Yao, Z. Luo, S. Li, T. Fang, and L. Quan, “Mvsnet: Depth inference for unstructured multi-view stereo,” in Proc. Eur. Conf. Comput. Vis., Sep. 2018, pp. 767–783.
[8]
X. Cheng, P. Wang, and R. Yang, “Depth estimation via affinity learned with convolutional spatial propagation network,” in Proc. Eur. Conf. Comput. Vis., 2018, pp. 108–125.
[9]
A. Kendall, M. Grimes, and R. Cipolla, “Posenet: A convolutional network for real-time 6-dof camera relocalization,” in Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 2938–2946.
[10]
J. L. Schönberger, M. Pollefeys, A. Geiger, and T. Sattler, “Semantic visual localization,” ISPRS J. Photogrammetry Remote Sens., 2018.
[11]
J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 3431–3440.
[12]
L. Chen, G. Papandreou, F. Schroff, and H. Adam, “Rethinking atrous convolution for semantic image segmentation,” CoRR, vol. abs/1706.05587, 2017.
[13]
K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask r-cnn,” IEEE Trans. Pattern Anal. Mach. Intell., 2018.
[14]
L.-C. Chen, A. Hermans, G. Papandreou, F. Schroff, P. Wang, and H. Adam, “Masklab: Instance segmentation by refining object detection with semantic and direction features,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 4013–4022.
[15]
Y. Xiang, R. Mottaghi, and S. Savarese, “Beyond pascal: A benchmark for 3d object detection in the wild,” in Proc. IEEE Winter Conf. Appl. Comput. Vis., 2014, pp. 75–82.
[16]
A. Kar, S. Tulsiani, J. Carreira, and J. Malik, “Category-specific object reconstruction from a single image,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 1966–1974.
[17]
F. Güney and A. Geiger, “Displets: Resolving stereo ambiguities using object knowledge,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 4165–4175.
[18]
A. Kundu, Y. Li, and J. M. Rehg, “3d-rcnn: Instance-level 3d object reconstruction via render-and-compare,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 3559–3568.
[19]
X. Song, P. Wang, D. Zhou, R. Zhu, C. Guan, Y. Dai, H. Su, H. Li, and R. Yang, “Apollocar3d: A large 3d car instance understanding benchmark for autonomous driving,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019.
[20]
ApolloScape, “Semantic segmentation.” 2018. [Online]. Available: http://apolloscape.auto/scene.html
[21]
ApolloScape, “Instance segmentation.” 2018. [Online]. Available: https://www.kaggle.com/c/cvpr-2018-autonomous-driving
[22]
ApolloScape, “Lanemark segmentation.” 2018. [Online]. Available: http://apolloscape.auto/lane_segmentation.html
[23]
ApolloScape, “Localization.” 2018. [Online]. Available: http://apolloscape.auto/self_localization.html
[24]
P. Wang, et al., “ApolloScape API.” 2018. [Online]. Available: https://github.com/ApolloScapeAuto/dataset-api
[25]
A. Kundu, Y. Li, F. Dellaert, F. Li, and J. M. Rehg, “Joint semantic segmentation and 3d reconstruction from monocular video,” in Proc. Eur. Conf. Comput. Vis., 2014, pp. 703–718.
[26]
G. J. Brostow, J. Fauqueur, and R. Cipolla, “Semantic object classes in video: A high-definition ground truth database,” Pattern Recognit. Lett., vol. 30, no. 2, pp. 88–97, 2009.
[27]
S. Wang, M. Bai, G. Mattyus, H. Chu, W. Luo, B. Yang, J. Liang, J. Cheverie, S. Fidler, and R. Urtasun, “Torontocity: Seeing the world with a million eyes,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 3009–3017.
[28]
G. Neuhold, T. Ollmann, S. R. Bulo, and P. Kontschieder, “The mapillary vistas dataset for semantic understanding of street scenes,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 22–29.
[29]
F. Yu, W. Xian, Y. Chen, F. Liu, M. Liao, V. Madhavan, and T. Darrell, “Bdd100k: A diverse driving video database with scalable annotation tooling,” arXiv:1805.04687, 2018.
[30]
G. Ros, L. Sellart, J. Materzynska, D. Vazquez, and A. M. Lopez, “The synthia dataset: A large collection of synthetic images for semantic segmentation of urban scenes,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 3234–3243.
[31]
S. R. Richter, Z. Hayder, and V. Koltun, “Playing for benchmarks,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 2213–2222.
[32]
D. Scharstein, H. Hirschmüller, Y. Kitajima, G. Krathwohl, N. Nešić, X. Wang, and P. Westling, “High-resolution stereo datasets with subpixel-accurate ground truth,” in Proc. German Conf. Pattern Recognit., 2014, pp. 31–42.
[33]
N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, “Indoor segmentation and support inference from rgbd images,” in Proc. Eur. Conf. Comput. Vis., 2012, pp. 746–760.
[34]
T. Sattler, W. Maddern, C. Toft, A. Torii, L. Hammarstrand, E. Stenborg, D. Safari, M. Okutomi, M. Pollefeys, J. Sivic, et al., “Benchmarking 6dof outdoor visual localization in changing conditions,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, vol. 1, pp. 8601–8610.
[35]
M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, “The pascal visual object classes (voc) challenge,” Int. J. Comput. Vis., vol. 88, no. 2, pp. 303–338, 2010.
[36]
T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft coco: Common objects in context,” in Proc. Eur. Conf. Comput. Vis., 2014, pp. 740–755.
[37]
“Unity Development Platform.” 2005. [Online]. Available: https://unity3d.com/
[38]
J. Hoffman, D. Wang, F. Yu, and T. Darrell, “FCNs in the wild: Pixel-level adversarial and constraint-based adaptation,” arXiv:1612.02649, 2016.
[39]
Y. Zhang, P. David, and B. Gong, “Curriculum domain adaptation for semantic segmentation of urban scenes,” in Proc. IEEE Int. Conf. Comput. Vis., 2017.
[40]
Y. Chen, W. Li, and L. Van Gool, “Road: Reality oriented adaptation for semantic segmentation of urban scenes,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 7892–7901.
[41]
B. M. Haralick, C.-N. Lee, K. Ottenberg, and M. Nölle, “Review and analysis of solutions of the three point perspective pose estimation problem,” Int. J. Comput. Vis., vol. 13, no. 3, pp. 331–356, 1994.
[42]
L. Kneip, H. Li, and Y. Seo, “Upnp: An optimal o (n) solution to the absolute pose problem with universal applicability,” in Proc. Eur. Conf. Comput. Vis., 2014, pp. 127–142.
[43]
P. David, D. Dementhon, R. Duraiswami, and H. Samet, “Softposit: Simultaneous pose and correspondence determination,” Int. J. Comput. Vis., vol. 59, no. 3, pp. 259–284, 2004.
[44]
F. Moreno-Noguer, V. Lepetit, and P. Fua, “Pose priors for simultaneously solving alignment and correspondence,” in Proc. Eur. Conf. Comput. Vis., 2008, pp. 405–418.
[45]
D. Campbell, L. Petersson, L. Kneip, and H. Li, “Globally-optimal inlier set maximisation for simultaneous camera pose and feature correspondence,” in Proc. IEEE Int. Conf. Comput. Vis., 2017.
[46]
T. Sattler, A. Torii, J. Sivic, M. Pollefeys, H. Taira, M. Okutomi, and T. Pajdla, “Are large-scale 3d models really necessary for accurate visual localization?” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 6175–6184.
[47]
J. Engel, T. Schöps, and D. Cremers, “Lsd-slam: Large-scale direct monocular slam,” in Proc. Eur. Conf. Comput. Vis., 2014, pp. 834–849.
[48]
A. Kendall and R. Cipolla, “Geometric loss functions for camera pose regression with deep learning,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017.
[49]
F. Walch, C. Hazirbas, L. Leal-Taixe, T. Sattler, S. Hilsenbeck, and D. Cremers, “Image-based localization using lstms for structured feature correlation,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 627–637.
[50]
R. Clark, S. Wang, A. Markham, N. Trigoni, and H. Wen, “Vidloc: A deep spatio-temporal model for 6-dof video-clip relocalization,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 1–9.
[51]
H. Coskun, F. Achilles, R. DiPietro, N. Navab, and F. Tombari, “Long short-term memory kalman filters: Recurrent neural estimators for pose regularization,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 5525–5533.
[52]
K.-N. Lianos, J. L. Schönberger, M. Pollefeys, and T. Sattler, “Vso: Visual semantic odometry,” in Proc. Eur. Conf. Comput. Vis., 2018, pp. 234–250.
[53]
K. Vishal, C. Jawahar, and V. Chari, “Accurate localization by fusing images and gps signals,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, 2015, pp. 17–24.
[54]
Z. Laskar, I. Melekhov, S. Kalia, and J. Kannala, “Camera relocalization by computing pairwise relative poses using convolutional neural network,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 920–929.
[55]
B. Ummenhofer, H. Zhou, J. Uhrig, N. Mayer, E. Ilg, A. Dosovitskiy, and T. Brox, “Demon: Depth and motion network for learning monocular stereo,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017.
[56]
H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, “Pyramid scene parsing network,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 6230–6239.
[57]
A. Arnab, S. Jayasumana, S. Zheng, and P. H. Torr, “Higher order conditional random fields in deep neural networks,” in Proc. Eur. Conf. Comput. Vis., 2016, pp. 524–540.
[58]
W. Byeon, T. M. Breuel, F. Raue, and M. Liwicki, “Scene labeling with lstm recurrent neural networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 3547–3555.
[59]
K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 770–778.
[60]
A. Paszke, A. Chaurasia, S. Kim, and E. Culurciello, “Enet: A deep neural network architecture for real-time semantic segmentation,” CoRR, vol. abs/1606.02147, 2016.
[61]
H. Zhao, X. Qi, X. Shen, J. Shi, and J. Jia, “Icnet for real-time semantic segmentation on high-resolution images,” in Proc. Eur. Conf. Comput. Vis., 2018, pp. 405–420.
[62]
A. Kundu, V. Vineet, and V. Koltun, “Feature space optimization for semantic video segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 3168–3175.
[63]
A. Dosovitskiy, P. Fischer, E. Ilg, P. Hausser, C. Hazirbas, V. Golkov, P. Van Der Smagt, D. Cremers, and T. Brox, “Flownet: Learning optical flow with convolutional networks,” in Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 2758–2766.
[64]
R. Gadde, V. Jampani, and P. V. Gehler, “Semantic video cnns through representation warping,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 4453–4462.
[65]
X. Zhu, Y. Xiong, J. Dai, L. Yuan, and Y. Wei, “Deep feature flow for video recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 2349–2358.
[66]
C. Hane, C. Zach, A. Cohen, R. Angst, and M. Pollefeys, “Joint 3d scene reconstruction and class segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2013, pp. 97–104.
[67]
K. Tateno, F. Tombari, I. Laina, and N. Navab, “Cnn-slam: Real-time dense monocular slam with learned depth prediction,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, vol. 2, pp. 6565–6574.
[68]
RIEGL, “VMX-1HA.” 2000. [Online]. Available: http://www.riegl.com/
[69]
TuSimple, “Lanemark segmentation.” 2017. [Online]. Available: http://benchmark.tusimple.ai/#/
[70]
J. Revaud, P. Weinzaepfel, Z. Harchaoui, and C. Schmid, “Deepmatching: Hierarchical deformable dense matching,” Int. J. Comput. Vis., vol. 120, no. 3, pp. 300–323, 2016.
[71]
C. Luo, Z. Yang, P. Wang, Y. Wang, W. Xu, R. Nevatia, and A. Yuille, “Every pixel counts++: Joint learning of geometry and motion with 3d holistic understanding,” arXiv:1810.06125, 2018.
[72]
J. Xie, M. Kiefel, M.-T. Sun, and A. Geiger, “Semantic instance annotation of street scenes by 3d to 2d label transfer,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 3688–3697.
[73]
S. C. Stein, M. Schoeler, J. Papon, and F. Wörgötter, “Object partitioning using local convexity,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2014, pp. 304–311.
[74]
“Point Cloud Library.” 2010. [Online]. Available: http://pointclouds.org
[75]
C. R. Qi, L. Yi, H. Su, and L. J. Guibas, “Pointnet++: Deep hierarchical feature learning on point sets in a metric space,” in Proc. 31st Int. Conf. Neural Inf. Process. Syst., 2017, pp. 5105–5114.
[76]
Z. Wu, C. Shen, and A. van den Hengel, “Wider or deeper: Revisiting the resnet model for visual recognition,” arXiv:1611.10080, 2016.
[77]
S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” in Proc. Neural Inf. Process. Syst., 2015, pp. 91–99.
[78]
P. Wang, X. Shen, Z. Lin, S. Cohen, B. Price, and A. L. Yuille, “Joint object and part segmentation using deep learned potentials,” in Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 1573–1581.
[79]
H. Su, J. Deng, and L. Fei-Fei, “Crowdsourcing annotations for visual object detection,” in Proc. Workshops 26th AAAI Conf. Artif. Intell., 2012.
[80]
A. Kovashka, O. Russakovsky, L. Fei-Fei, K. Grauman, et al., “Crowdsourcing in computer vision,” Found. Trends® Comput. Graph. Vis., vol. 10, no. 3, pp. 177–243, 2016.
[81]
G. Li, J. Wang, Y. Zheng, and M. J. Franklin, “Crowdsourced data management: A survey,” IEEE Trans. Knowl. Data Eng., vol. 28, no. 9, pp. 2296–2319, Sep. 2016.
[82]
P. Wang, R. Yang, B. Cao, W. Xu, and Y. Lin, “Dels-3d: Deep localization and segmentation with a 3d semantic map,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 5860–5869.
[83]
Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, et al., “Google's neural machine translation system: Bridging the gap between human and machine translation,” arXiv:1609.08144, 2016.
[84]
R. E. Kalman, “A new approach to linear filtering and prediction problems,” J. Basic Eng., vol. 82, no. 1, pp. 35–45, 1960.
[85]
T. Pohlen, A. Hermans, M. Mathias, and B. Leibe, “Full-resolution residual networks for semantic segmentation in street scenes,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 3309–3318.
[86]
D. Eigen and R. Fergus, “Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture,” in Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 2650–2658.
[87]
B.-H. Lee, J.-H. Song, J.-H. Im, S.-H. Im, M.-B. Heo, and G.-I. Jee, “Gps/dr error estimation for autonomous vehicle localization,” Sensors, vol. 15, no. 8, pp. 20779–20798, 2015.
[88]
T. Dozat, “Incorporating Nesterov momentum into Adam,” in Proc. ICLR Workshop, 2016.
[89]
T. Chen, M. Li, Y. Li, M. Lin, N. Wang, M. Wang, T. Xiao, B. Xu, C. Zhang, and Z. Zhang, “Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems,” CoRR, vol. abs/1512.01274, 2015.
[90]
R. Mur-Artal, J. M. M. Montiel, and J. D. Tardós, “ORB-SLAM: A versatile and accurate monocular SLAM system,” IEEE Trans. Robot., vol. 31, no. 5, pp. 1147–1163, Oct. 2015.
[91]
B. Hariharan, P. Arbeláez, R. Girshick, and J. Malik, “Simultaneous detection and segmentation,” in Proc. Eur. Conf. Comput. Vis., 2014, pp. 297–312.
[92]
“Cityscapes instance segmentation benchmark.” [Online]. Available: https://www.cityscapes-dataset.com/benchmarks/#instance-level-scene-labeling-task
[93]
S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia, “Path aggregation network for instance segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 8759–8768.
[94]
Z. Yueqing, L. Zeming, and G. Yu, “Find tiny instance segmentation,” 2018. [Online]. Available: http://www.skicyyu.org/WAD/wad_final.pdf
[95]
Smart_Vision_SG, “Wad instance segmentation 2nd place,” 2018. [Online]. Available: https://github.com/Computational-Camera/Kaggle-CVPR-2018-WAD-Video-Segmentation-Challenge-Solution
[96]
SZU_N606, “Wad instance segmentation 3rd place,” 2018. [Online]. Available: https://github.com/wwoody827/cvpr-2018-autonomous-driving-autopilot-solution
[97]
L. Liu, H. Li, and Y. Dai, “Deep stochastic attraction and repulsion embedding for image based localization,” arXiv:1808.08779, 2018.
[98]
Y. Ma, X. Zhu, S. Zhang, R. Yang, W. Wang, and D. Manocha, “Trafficpredict: Trajectory prediction for heterogeneous traffic-agents,” in Proc. AAAI Conf. Artif. Intell., 2019.

Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 42, Issue 10, Oct. 2020, 406 pages.

Publisher: IEEE Computer Society, United States.


Cited By

• (2025) Monocular Depth Estimation. Applied Computer Systems, vol. 30, no. 1, pp. 21–33. DOI: 10.2478/acss-2025-0003. Online publication date: 24-Jan-2025.
• (2025) Accurate 3D Multi-Object Detection and Tracking on Vietnamese Street Scenes Based on Sparse Point Cloud Data. IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 1, pp. 92–101. DOI: 10.1109/TITS.2024.3483953. Online publication date: 1-Jan-2025.
• (2025) Understanding Decision-Making of Autonomous Driving via Semantic Attribution. IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 1, pp. 283–294. DOI: 10.1109/TITS.2024.3483810. Online publication date: 1-Jan-2025.
• (2025) RS100K: Road-Region Segmentation Dataset for Semi-supervised Autonomous Driving in the Wild. International Journal of Computer Vision, vol. 133, no. 2, pp. 910–928. DOI: 10.1007/s11263-024-02207-3. Online publication date: 1-Feb-2025.
• (2025) A review of 3D object detection based on autonomous driving. The Visual Computer, vol. 41, no. 3, pp. 1757–1775. DOI: 10.1007/s00371-024-03480-6. Online publication date: 1-Feb-2025.
• (2024) Deep Learning-based Depth Estimation Methods from Monocular Image and Videos: A Comprehensive Survey. ACM Computing Surveys, vol. 56, no. 12, pp. 1–51. DOI: 10.1145/3677327. Online publication date: 15-Jul-2024.
• (2024) EgoCentric+: A Multipurpose Data Set for Head-Mounted Wearable Computing Devices. In Proc. 2024 Int. Conf. Advanced Visual Interfaces, pp. 1–5. DOI: 10.1145/3656650.3656692. Online publication date: 3-Jun-2024.
• (2024) Efficient High-Resolution Deep Learning: A Survey. ACM Computing Surveys, vol. 56, no. 7, pp. 1–35. DOI: 10.1145/3645107. Online publication date: 9-Apr-2024.
• (2024) Supporting Safety Analysis of Image-processing DNNs through Clustering-based Approaches. ACM Transactions on Software Engineering and Methodology, vol. 33, no. 5, pp. 1–48. DOI: 10.1145/3643671. Online publication date: 3-Jun-2024.
• (2024) Performance of NSGA-III on Multi-objective Combinatorial Optimization Problems Heavily Depends on Its Implementations. In Proc. Genetic and Evolutionary Computation Conference, pp. 511–519. DOI: 10.1145/3638529.3654004. Online publication date: 14-Jul-2024.
