Journal of the Russian Universities. Radioelectronics

Method for Automatic Determination of a 3D Trajectory of Vehicles in a Video Image

https://doi.org/10.32603/1993-8985-2021-24-3-49-59

Abstract

Introduction. An important part of an automotive unmanned vehicle (UV) control system is the environment analysis module. This module is based on various types of sensors, e.g. video cameras, lidars and radars. The development of computer and video technologies makes it possible to implement an environment analysis module using a single video camera as a sensor. This approach is expected to reduce the cost of the entire module. The main task in video image processing is to analyse the environment as a 3D scene. The 3D trajectory of an object, which takes into account its dimensions, angle of view and movement vector, as well as the vehicle pose in a video image, provides sufficient information for assessing the real interaction of objects. A basis for constructing a 3D trajectory is vehicle pose estimation.
Aim. To develop an automatic method for estimating vehicle pose based on video data analysis from a single video camera.
Materials and methods. An automatic method for vehicle pose estimation from a video image was proposed based on a cascade approach. The method comprises vehicle detection, key point determination, segmentation and vehicle pose estimation. Vehicle detection and key point determination were performed by a neural network. Segmentation of the vehicle image and preparation of its mask were implemented by transforming the image into a polar coordinate system and searching for the outer contour using graph theory.
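
The segmentation step can be illustrated with a short sketch. The following Python code is only a hypothetical illustration, assuming OpenCV 4 and NumPy: it unwraps the region around a detected vehicle into polar coordinates with cv2.warpPolar and then finds the outer contour as a minimum-cost path over the polar edge map, which is one possible graph-search formulation of the contour search. The function polar_contour_mask and all parameter values are assumptions, not the authors' implementation.

```python
import numpy as np
import cv2


def polar_contour_mask(gray, center, max_radius, n_angles=360, n_radii=200):
    """Segment the object around `center` by a minimum-cost contour search
    in polar coordinates (a sketch, not the published algorithm)."""
    # 1. Unwrap the neighbourhood of the object centre: rows = angle, cols = radius.
    polar = cv2.warpPolar(gray, (n_radii, n_angles), center, max_radius,
                          cv2.INTER_LINEAR + cv2.WARP_POLAR_LINEAR)
    # 2. Radial edge strength; strong edges become cheap cells to pass through.
    grad = np.abs(np.diff(polar.astype(np.float32), axis=1, prepend=0.0))
    cost = grad.max() - grad
    # 3. Dynamic programming: one radius per angle; the radius may change by
    #    at most one cell between neighbouring angles (smoothness constraint).
    acc = np.full(cost.shape, np.inf, dtype=np.float32)
    back = np.zeros(cost.shape, dtype=np.int32)
    acc[0] = cost[0]
    radii_idx = np.arange(n_radii)
    for a in range(1, n_angles):
        for dr in (-1, 0, 1):
            cand = np.roll(acc[a - 1], dr) + cost[a]   # predecessor at radius i - dr
            better = cand < acc[a]
            acc[a] = np.where(better, cand, acc[a])
            back[a] = np.where(better, (radii_idx - dr) % n_radii, back[a])
    # 4. Back-track the cheapest path and convert it to Cartesian contour points.
    r = int(np.argmin(acc[-1]))
    path = [r]
    for a in range(n_angles - 1, 0, -1):
        r = int(back[a, r])
        path.append(r)
    radii = np.array(path[::-1], dtype=np.float32) * max_radius / n_radii
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    pts = np.stack([center[0] + radii * np.cos(angles),
                    center[1] + radii * np.sin(angles)], axis=1).astype(np.int32)
    mask = np.zeros(gray.shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask, [pts], 255)                     # filled outer contour = mask
    return mask
```

The dynamic program over (angle, radius) cells is equivalent to a shortest-path search in a graph where each node is connected to the three nearest radii of the next angle, which is one way the graph-theoretic contour search can be realized in practice.
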
Results. Vehicle pose estimation was implemented by matching the Fourier image of the vehicle mask signature against templates obtained from 3D models. Experiments based on the proposed method confirmed the correctness of the obtained vehicle pose and angle-of-view estimates. Vehicle pose estimation achieved an accuracy of 89 % on the open Carvana image dataset.
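
The matching step can be sketched in the same spirit. The code below is a hypothetical illustration, assuming a classical centroid-distance shape signature and a set of template signatures with pose labels precomputed from rendered 3D models; the names mask_signature, estimate_pose, template_signatures and template_poses are assumptions rather than the authors' API.

```python
import numpy as np
import cv2


def mask_signature(mask, n_samples=256):
    """Centroid-to-boundary distance as a function of angle (shape signature)."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea).reshape(-1, 2).astype(np.float32)
    cx, cy = contour.mean(axis=0)
    angles = np.arctan2(contour[:, 1] - cy, contour[:, 0] - cx)
    dists = np.hypot(contour[:, 0] - cx, contour[:, 1] - cy)
    order = np.argsort(angles)
    grid = np.linspace(-np.pi, np.pi, n_samples, endpoint=False)
    sig = np.interp(grid, angles[order], dists[order], period=2.0 * np.pi)
    return sig / (sig.max() + 1e-6)         # scale normalisation


def estimate_pose(mask, template_signatures, template_poses):
    """Return the pose label of the template whose Fourier-magnitude
    signature is closest (in cosine similarity) to that of the observed mask."""
    query = np.abs(np.fft.rfft(mask_signature(mask)))
    query /= np.linalg.norm(query) + 1e-6
    best_pose, best_score = None, -np.inf
    for sig, pose in zip(template_signatures, template_poses):
        ref = np.abs(np.fft.rfft(sig))
        ref /= np.linalg.norm(ref) + 1e-6
        score = float(query @ ref)          # cosine similarity of magnitude spectra
        if score > best_score:
            best_pose, best_score = pose, score
    return best_pose, best_score
```

Comparing magnitude spectra rather than raw signatures makes the score insensitive to where the contour tracing starts, so only the actual viewing angle of the template affects the match.
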
Conclusion. A new approach to vehicle pose estimation was proposed, involving a transition from end-to-end learning of neural networks that resolve several problems at once (localization, classification, segmentation and angle-of-view estimation) towards a cascade analysis of information. Reaching high accuracy with end-to-end learning requires large sets of representative data, which complicates the scalability of such solutions to road environments in Russia. The proposed method makes it possible to estimate the vehicle pose with a high level of accuracy while avoiding large costs for manual data annotation and training.

About the Authors

I. G. Zubov
Ltd "Next"
Russian Federation

Ilya G. Zubov, Master of Engineering and Technology (2016), algorithm programmer. The author of 11 scientific publications. Area of expertise: digital image processing; applied television systems.

12, Presnenskaya Nab., floor 35, room № 3, Moscow 123317



N. A. Obukhova
Saint Petersburg Electrotechnical University
Russian Federation

Natalia A. Obukhova, Dr. of Sci. (Engineering) (2009), Professor (2004), Head of the Department of Television and Video Equipment. The author of more than 130 scientific publications. Area of expertise: digital image processing; applied television systems.

5 Professor Popov St., St Petersburg 197376



References

1. Forward Collision Warning with a Single Camera / E. Dagan, O. Mano, G. P. Stein, A. Shashua // Proc. of the IEEE Intelligent Vehicles Symp., Parma, Italy, 14–17 Jun. 2004. Piscataway: IEEE, 2004. P. 37–42. doi: 10.1109/IVS.2004.1336352

2. Complex of video recording of traffic violations "ISKRAVIDEO-2" KR. Available at: http://www.simicon.ru/rus/product/gun/archive/iv2_k.html (accessed 29.08.2020)

3. MMDetection: Open MMLab Detection Toolbox and Benchmark / K. Chen, J. Wang, J. Pang, Y. Cao, Y. Xiong, X. Li, S. Sun, W. Feng, Z. Liu, J. Xu, Z. Zhang, D. Cheng, C. Zhu, T. Cheng, Q. Zhao, B. Li, X. Lu, R. Zhu, Y. Wu, J. Dai, J. Wang, J. Shi, W. Ouyang, C. C. Loy, D. Lin. URL: https://arxiv.org/pdf/1906.07155.pdf (accessed 29.08.2020)

4. SSD: Single Shot MultiBox Detector / W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, A. C. Berg // Europ. Conf. on Computer Vision, ECCV 2016, Amsterdam, The Netherlands, 8–16 Oct. 2016. P. 21–37. doi: 10.1007/978-3-319-46448-0_2.

5. Focal Loss for Dense Object Detection / T.-Y. Lin, P. Goyal, R. Girshick, K. He, P. Dollar // IEEE Trans. on Pattern Analysis and Machine Intelligence. 2018. Vol. 42, iss. 2. P. 318–327.

6. Li B., Liu Y., Wang X. Gradient Harmonized Single-stage Detector // Thirty-Third AAAI Conf. on Artificial Intelligence, AAAI-19, Jan. 27 – Feb. 1, 2019. P. 8577–8584.

7. FCOS: Fully Convolutional One-Stage Object Detection / Z. Tian, C. Shen, H. Chen, T. He. 2019. URL: https://arxiv.org/abs/1904.01355 (accessed 29.08.2020)

8. Mask R-CNN / K. He, G. Gkioxari, P. Dollar, R. Girshick // IEEE Intern. Conf. on Computer Vision, ICCV 2017, Venice, Italy, Oct. 22–29, 2017. URL: https://arxiv.org/pdf/1703.06870.pdf (accessed 29.08.2020)

9. Bochkovskiy A., Wang C.-Y., Mark Liao H.-Y. YOLOv4: Optimal Speed and Accuracy of Object Detection. URL: https://arxiv.org/pdf/2004.10934.pdf (accessed 21.02.2021)

10. Classification and Pose Estimation of Vehicles in Videos by 3D Modeling within Discrete-Continuous Optimization / M. Hoedlmoser, B. Micusik, M.-Y. Liu, M. Pollefeys, M. Kampel // 2nd Intern. Conf. on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DimPVT), Zurich, Switzerland, 13–15 Oct. 2012. Piscataway: IEEE, 2012. P. 198–205. doi: 10.1109/3DIMPVT.2012.23

11. Subcategory-Aware Convolutional Neural Networks for Object Proposals and Detection / Y. Xiang, W. Choi, Y. Lin, S. Savarese. 2017. URL: https://arxiv.org/abs/1604.04693 (accessed 02.06.2020)

12. 3D Bounding Box Estimation using Deep Learning and Geometry / A. Mousavian, D. Anguelov, J. Flynn, J. Kosecka. 2017. URL: https://arxiv.org/abs/1612.00496 (accessed 02.06.2020)

13. Monocular 3D Object Detection for Autonomous Driving / X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler, R. Urtasun // IEEE Conf. on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, USA, June 27–30, 2016. Piscataway: IEEE, 2016. P. 2147–2156.

14. 6-DoF Object Pose from Semantic Keypoints / G. Pavlakos, X. Zhou, A. Chan, K. G. Derpanis, K. Daniilidis // IEEE Intern. Conf. on Robotics and Automation (ICRA), Singapore, 29 May – 3 June, 2017. Piscataway: IEEE, 2017. P. 2011–2018. doi: 10.1109/ICRA.2017.7989233

15. Zubov I. G. Vehicle Pose Estimation Based on Object Contour // IEEE Conf. of Russian Young Researchers in Electrical and Electronic Engineering, ELCONRUS 2020, St Petersburg and Moscow, 27–30 Jan. 2020. Piscataway: IEEE, 2020. P. 1452–1454. doi: 10.1109/EIConRus49466.2020.9039472

16. The PASCAL Visual Object Classes Challenge (VOC2007). URL: http://www.pascal-network.org/challenges/VOC/voc2007/index.html (accessed 01.06.2020)

17. ImageNet: A Large-Scale Hierarchical Image Database / J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei // IEEE Conf. on Computer Vision and Pattern Recognition, CVPR 2009, Miami, USA, Jun. 20–25, 2009. P. 248–255. doi: 10.1109/CVPR.2009.5206848

18. Microsoft COCO: Common Objects in Context / T.-Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays, P. Perona, D. Ramanan, C. L. Zitnick, P. Dollar // Europ. Conf. on Computer Vision. ECCV 2014, Zurich, Switzerland, Sept. 6–12, 2014. P. 740–755. doi: 10.1007/978-3-319-10602-1_48

19. Zeiler M. D., Fergus R. Visualizing and Understanding Convolutional Networks // Europ. Conf. on Computer Vision, ECCV 2014, Zurich, Switzerland, 6–12 Sept. 2014. P. 818–833. doi: 10.1007/978-3-319-10590-1_53

20. Zubov I. G. An Automatic Method for Interest Point Detection. J. of the Russian Universities. Radioelectronics. 2020, vol. 23, no. 6, pp. 6-16. doi: https://doi.org/10.32603/1993-8985-2020-23-6-6-16 (In Russ.)

21. Deep Residual Learning for Image Recognition / K. He, X. Zhang, S. Ren, J. Sun // IEEE Conf. on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, USA, Jun. 27–30, 2016. Piscataway: IEEE, 2016. P. 770–778. doi: 10.1109/CVPR.2016.90

22. Zubov I. G. Method for Automatic Segmentation of Vehicles in Digital Images. J. of the Russian Universities. Radioelectronics. 2019, vol. 22, no. 5, pp. 6-16. doi: 10.32603/1993-8985-2019-22-5-6-16 (In Russ.)

23. Shcherba E. V. Application Analysis of Interpolation and Extrapolation Methods as Used for Image Restoration. Computer Optics. 2009, vol. 33, no. 3, pp. 336–339. Available at: http://www.computeroptics.smr.ru/KO/PDF/KO33-3/33313.pdf (accessed 20.08.2020). (In Russ.)

24. Gonzalez R., Woods R. Digital Image Processing. 3rd ed. Prentice Hall, 2012, 834 p. (In Russ.)

25. Dechter R., Pearl J. Generalized Best-First Search Strategies and the Optimality of A* // J. of the ACM (JACM). 1985. Vol. 32, no. 3. P. 505–536. URL: https://www.ics.uci.edu/~dechter/publications/r0.pdf (accessed 29.08.2020)

26. Carvana Image Masking Challenge. URL: https://www.kaggle.com/c/carvana-image-masking-challenge (accessed 29.08.2020)

27. Vehicle Key-Point & Orientation Estimation. URL: https://github.com/Pirazh/Vehicle_Key_Point_Orientation_Estimation (accessed 29.08.2020)

28. Flusser J. On the Independence of Rotation Moment Invariants // Pattern Recognition. 2000. Vol. 33, iss. 9. P. 1405–1410. doi: 10.1016/S0031-3203(99)00127-2

29. A Dual-Path Model with Adaptive Attention for Vehicle Re-Identification / P. Khorramshahi, A. Kumar, N. Peri, S. S. Rambhatla, J.-C. Chen, R. Chellappa // IEEE/CVF Intern. Conf. on Computer Vision, ICCV 2019, Seoul, South Korea, Oct. 27 – Nov. 2, 2019. P. 6132–6141. URL: https://openaccess.thecvf.com/content_ICCV_2019/papers/Khorramshahi_A_Dual-Path_Model_With_Adaptive_Attention_for_Vehicle_Re-Identification_ICCV_2019_paper.pdf (accessed 29.08.2020)


For citations:


Zubov I.G., Obukhova N.A. Method for Automatic Determination of a 3D Trajectory of Vehicles in a Video Image. Journal of the Russian Universities. Radioelectronics. 2021;24(3):49-59. (In Russ.) https://doi.org/10.32603/1993-8985-2021-24-3-49-59


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1993-8985 (Print)
ISSN 2658-4794 (Online)