Deep Learning and Machine Learning in Image Processing and Pattern Recognition

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 February 2025 | Viewed by 16865

Special Issue Editors


Prof. Dr. Haitao Zhao
Guest Editor
Automation Department, School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Interests: neural networks; machine learning; information fusion; deep learning

Dr. Meng Wang
Guest Editor
Automation Department, School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Interests: control theory; fuzzy systems; complex systems; robot control systems

Special Issue Information

Dear Colleagues,

With the rapid advance of science and technology, pattern recognition and image processing have grown in importance within the field of artificial intelligence, and the field has developed especially quickly in recent years thanks to the growing use of machine learning and deep learning. The goal of this Special Issue is to examine the most recent developments in, and promising future directions for, machine learning and deep learning in image processing and pattern recognition.

Prof. Dr. Haitao Zhao
Dr. Meng Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing
  • machine learning
  • pattern recognition
  • neural network
  • artificial intelligence

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (19 papers)


Research

19 pages, 1133 KiB  
Article
M2Tames: Interaction and Semantic Context Enhanced Pedestrian Trajectory Prediction
by Xu Gao, Yanan Wang, Yaqian Zhao, Yilong Li and Gang Wu
Appl. Sci. 2024, 14(18), 8497; https://doi.org/10.3390/app14188497 - 20 Sep 2024
Viewed by 328
Abstract
Autonomous driving pays considerable attention to pedestrian trajectory prediction as a crucial task. Constructing effective pedestrian trajectory prediction models depends heavily on utilizing the motion characteristics of pedestrians, along with their interactions among themselves and between themselves and their environment. However, traditional trajectory prediction models often fall short of capturing complex real-world scenarios. To address these challenges, this paper proposes an enhanced pedestrian trajectory prediction model, M2Tames, which incorporates comprehensive motion, interaction, and semantic context factors. M2Tames provides an interaction module (IM), which consists of an improved multi-head mask temporal attention mechanism (M2Tea) and an Interaction Inference Module (I2). M2Tea thoroughly characterizes the historical trajectories and potential interactions, while I2 determines the precise interaction types. Then, IM adaptively aggregates useful neighbor features to generate a more accurate interactive feature map and feeds it into the final layer of the U-Net encoder to fuse with the encoder’s output. Furthermore, by adopting the U-Net architecture, M2Tames can learn and interpret scene semantic information, enhancing its understanding of the spatial relationships between pedestrians and their surroundings. These innovations improve the accuracy and adaptability of the model for predicting pedestrian trajectories. Finally, M2Tames is evaluated on the ETH/UCY and SDD datasets for short- and long-term settings, respectively. The results demonstrate that M2Tames outperforms the state-of-the-art model MSRL by 2.49% (ADE) and 8.77% (FDE) in the short-term setting and surpasses the optimum Y-Net by 6.89% (ADE) and 1.12% (FDE) in the long-term prediction. Excellent performance is also shown on the ETH/UCY datasets. Full article
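The ADE and FDE percentages quoted above refer to the two standard trajectory-prediction metrics, average and final displacement error. A minimal sketch of how they are computed, on hypothetical toy trajectories rather than the paper's data:

```python
import math

def ade_fde(pred, gt):
    """Average / Final Displacement Error for one trajectory.

    pred, gt: equal-length lists of (x, y) positions per time step.
    ADE averages the Euclidean error over all steps; FDE keeps only
    the error at the final step.
    """
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists), dists[-1]

# Toy trajectories: the prediction drifts one unit off on the last two steps.
pred = [(0, 0), (1, 0), (2, 1), (3, 1)]
gt = [(0, 0), (1, 0), (2, 0), (3, 0)]
ade, fde = ade_fde(pred, gt)
```

ADE rewards staying close over the whole horizon, while FDE scores only the endpoint, which is why the two improvements are reported separately.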

19 pages, 7602 KiB  
Article
EGS-YOLO: A Fast and Reliable Safety Helmet Detection Method Modified Based on YOLOv7
by Jianfeng Han, Zhiwei Li, Guoqing Cui and Jingxuan Zhao
Appl. Sci. 2024, 14(17), 7923; https://doi.org/10.3390/app14177923 - 5 Sep 2024
Viewed by 474
Abstract
Wearing safety helmets at construction sites is a major measure to prevent safety accidents, so it is essential to supervise and ensure that workers wear safety helmets. This requires a high degree of real-time performance. We improved the network structure based on YOLOv7. To enhance real-time performance, we introduced GhostModule after comparing various modules to create a new efficient structure that generates more feature mappings with fewer linear operations. SE blocks were introduced after comparing several attention mechanisms to highlight important information in the image. The EIOU loss function was introduced to speed up the convergence of the model. Eventually, we constructed the efficient model EGS-YOLO. EGS-YOLO achieves a mAP of 91.1%, 0.2% higher than YOLOv7, and the inference time is 13.3% faster than YOLOv7 at 3.9 ms (RTX 3090). The parameters and computational complexity are reduced by 37.3% and 33.8%, respectively. The enhanced real-time performance while maintaining the original high precision can meet actual detection requirements. Full article
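The EIoU loss introduced here augments the plain IoU term with explicit center-distance and width/height penalties. A minimal pure-Python sketch of the standard EIoU formulation (not the authors' implementation):

```python
def eiou_loss(box, gt):
    """EIoU loss for axis-aligned boxes given as (x1, y1, x2, y2).

    Extends the IoU loss with penalties on center distance and on the
    width/height gaps, each normalised by the smallest enclosing box.
    """
    x1, y1, x2, y2 = box
    g1, g2, g3, g4 = gt
    # Intersection and union
    iw = max(0.0, min(x2, g3) - max(x1, g1))
    ih = max(0.0, min(y2, g4) - max(y1, g2))
    inter = iw * ih
    union = (x2 - x1) * (y2 - y1) + (g3 - g1) * (g4 - g2) - inter
    iou = inter / union
    # Smallest enclosing box
    cw = max(x2, g3) - min(x1, g1)
    ch = max(y2, g4) - min(y1, g2)
    # Center-distance penalty, normalised by the enclosing diagonal
    dx = (x1 + x2) / 2 - (g1 + g3) / 2
    dy = (y1 + y2) / 2 - (g2 + g4) / 2
    dist_pen = (dx * dx + dy * dy) / (cw * cw + ch * ch)
    # Width / height gap penalties
    w_pen = ((x2 - x1) - (g3 - g1)) ** 2 / (cw * cw)
    h_pen = ((y2 - y1) - (g4 - g2)) ** 2 / (ch * ch)
    return 1.0 - iou + dist_pen + w_pen + h_pen
```

Identical boxes give zero loss; any misalignment in position or shape increases it, which is what speeds up convergence relative to IoU alone.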

17 pages, 6911 KiB  
Article
A Deep-Learning-Based Approach to the Classification of Fire Types
by Eshrag Ali Refaee, Abdullah Sheneamer and Basem Assiri
Appl. Sci. 2024, 14(17), 7862; https://doi.org/10.3390/app14177862 - 4 Sep 2024
Viewed by 545
Abstract
The automatic detection of fires and the determination of their causes play a crucial role in mitigating the catastrophic consequences of such events. The literature reveals substantial research on automatic fire detection using machine learning models. However, once a fire is detected, there is a notable gap in the literature concerning the automatic classification of fire types, such as solid-material fires, flammable-gas fires, and electrical fires. This classification is essential for firefighters to quickly and effectively determine the most appropriate fire suppression method. This work introduces a publicly released benchmark multiclass dataset comprising over 1353 manually annotated images, classified into five categories according to the type of fire origin. This work also presents a system incorporating eight deep-learning models evaluated for fire detection and fire-type classification. In fire-type classification, this work focuses on four fire types: solid-material, chemical, electrical, and oil-based fires. Under the single-level, five-way classification setting, our system achieves its best performance with an accuracy score of 94.48%. Meanwhile, under the two-level classification setting, our system achieves its best performance with accuracy scores of 98.16% for fire detection and 97.55% for fire-type classification, using the DenseNet121 and EfficientNet-b0 models, respectively. The results also indicate that electrical and oil-based fires are the most challenging to detect. Full article
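The two-level setting described here (detect fire first, then classify its type) amounts to a gating pipeline; a minimal sketch, where `fire_detector` and `type_classifier` are hypothetical stand-ins for the trained DenseNet121 and EfficientNet-b0 models:

```python
def two_level_classify(image, fire_detector, type_classifier):
    """Two-level scheme: a binary fire/no-fire detector gates the
    fire-type classifier, so type errors cannot occur on non-fire images."""
    if not fire_detector(image):
        return "no-fire"
    return type_classifier(image)

# Toy stand-ins: any non-empty "image" counts as fire, typed by its first tag.
detector = lambda img: len(img) > 0
classifier = lambda img: img[0]
```

Under this cascade the detection stage (98.16%) and the type stage (97.55%) are evaluated separately, as in the paper's reported numbers.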

17 pages, 3620 KiB  
Article
Image Registration Algorithm for Stamping Process Monitoring Based on Improved Unsupervised Homography Estimation
by Yujie Zhang and Yinuo Du
Appl. Sci. 2024, 14(17), 7721; https://doi.org/10.3390/app14177721 - 2 Sep 2024
Viewed by 472
Abstract
Homography estimation is a crucial task in aligning template images with target images in stamping monitoring systems. To enhance the robustness and accuracy of homography estimation against random vibrations and lighting variations in stamping environments, this paper proposes an improved unsupervised homography estimation model. The model takes as input the channel-stacked template and target images and outputs the estimated homography matrix. First, a specialized deformable convolution module and Group Normalization (GN) layer are introduced to expand the receptive field and enhance the model’s ability to learn rotational invariance when processing large, high-resolution images. Next, a multi-scale, multi-stage unsupervised homography estimation network structure is constructed to improve the accuracy of homography estimation by refining the estimation through multiple stages, thereby enhancing the model’s resistance to scale variations. Finally, stamping monitoring image data is incorporated into the training through data fusion, with data augmentation techniques applied to randomly introduce various levels of perturbation, brightness, contrast, and filtering to improve the model’s robustness to complex changes in the stamping environment, making it more suitable for monitoring applications in this specific industrial context. Compared to traditional methods, this approach provides better homography matrix estimation when handling images with low texture, significant lighting variations, or large viewpoint changes. Compared to other deep-learning-based homography estimation methods, it reduces estimation errors and performs better on stamping monitoring images, while also offering broader applicability. Full article
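The homography the model regresses is a 3x3 matrix; applying it to image coordinates requires the projective division shown below. A generic sketch of the warp, not the paper's network:

```python
def warp_point(H, pt):
    """Apply a 3x3 homography H (row-major nested lists) to a 2-D point."""
    x, y = pt
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)  # projective division back to 2-D

# A pure translation homography shifts every point by (tx, ty) = (5, -2).
H = [[1, 0, 5],
     [0, 1, -2],
     [0, 0, 1]]
```

Estimating the eight free parameters of `H` between the template and target image is exactly the alignment task the unsupervised network solves.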

19 pages, 34854 KiB  
Article
A Raisin Foreign Object Target Detection Method Based on Improved YOLOv8
by Meng Ning, Hongrui Ma, Yuqian Wang, Liyang Cai and Yiliang Chen
Appl. Sci. 2024, 14(16), 7295; https://doi.org/10.3390/app14167295 - 19 Aug 2024
Viewed by 610
Abstract
During the drying and processing of raisins, the presence of foreign matter such as fruit stems, branches, stones, and plastics is a common issue. To address this, we propose an enhanced real-time detection approach leveraging an improved YOLOv8 model. This novel method integrates the multi-head self-attention mechanism (MHSA) from BoTNet into YOLOv8’s backbone. In the model’s neck layer, selected C2f modules have been strategically replaced with RFAConv modules. The model also adopts an EIoU loss function in place of the original CIoU. Our experiments reveal that the refined YOLOv8 boasts a precision of 94.5%, a recall rate of 89.9%, and an F1-score of 0.921, with a mAP reaching 96.2% at the 0.5 IoU threshold and 81.5% across the 0.5–0.95 IoU range. For this model, comprising 13,177,692 parameters, the average time required for detecting each image on a GPU is 7.8 milliseconds. In contrast to several prevalent models of today, our enhanced model excels in mAP0.5 and demonstrates superiority in F1-score, parameter economy, computational efficiency, and speed. This study conclusively validates the capability of our improved YOLOv8 model to execute real-time foreign object detection on raisin production lines with high efficacy. Full article
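The reported F1-score ties together the precision and recall quoted above; a one-line check of that relationship:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# The reported precision (94.5%) and recall (89.9%) reproduce the
# reported F1-score of 0.921 to three decimal places.
f1 = f1_score(0.945, 0.899)
```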

16 pages, 1160 KiB  
Article
BSTCA-HAR: Human Activity Recognition Model Based on Wearable Mobile Sensors
by Yan Yuan, Lidong Huang, Xuewen Tan, Fanchang Yang and Shiwei Yang
Appl. Sci. 2024, 14(16), 6981; https://doi.org/10.3390/app14166981 - 9 Aug 2024
Viewed by 595
Abstract
Sensor-based human activity recognition has been widely used in various fields; however, recognizing complex daily human activities from sensor data still poses challenges. To address the problems of timeliness and homogeneity of recognition functions in human activity recognition models, we propose a human activity recognition model called BSTCA-HAR, based on a long short-term memory (LSTM) network. The approach proposed in this paper combines an attention mechanism and a temporal convolutional network (TCN). The learning and prediction units in the model can efficiently learn important action data while capturing long time-dependent information as well as features at different time scales. Our series of experiments on three public datasets (WISDM, UCI-HAR, and ISLD) with different data features confirm the feasibility of the proposed method. This method excels in dynamically capturing action features while maintaining a low number of parameters and achieving a remarkable average accuracy of 93%, proving that the model has good recognition performance. Full article

18 pages, 16213 KiB  
Article
A Lightweight CER-YOLOv5s Algorithm for Detection of Construction Vehicles at Power Transmission Lines
by Pingping Yu, Yuting Yan, Xinliang Tang, Yan Shang and He Su
Appl. Sci. 2024, 14(15), 6662; https://doi.org/10.3390/app14156662 - 30 Jul 2024
Cited by 1 | Viewed by 687
Abstract
In power-line scenarios characterized by complex backgrounds and targets of diverse scales and shapes, and to address issues such as large model parameter sizes, insufficient feature extraction, and the tendency to miss small targets in engineering-vehicle detection tasks, a lightweight detection algorithm termed CER-YOLOv5s is proposed. First, the C3 module was restructured by embedding a lightweight Ghost bottleneck structure and a convolutional attention module, enhancing the model's ability to extract key features while reducing computational costs. Second, an E-BiFPN feature pyramid network is proposed, utilizing channel attention mechanisms to effectively suppress background noise and enhance the model's focus on important regions. Bidirectional connections were introduced to optimize the feature fusion paths, improving the efficiency of multi-scale feature fusion. In addition, an ERM (enhanced receptive module) was added in the feature fusion part to expand the receptive field of shallow feature maps through repeated convolutions, enhancing the perception of global information for small targets. Finally, a Soft-DIoU-NMS suppression algorithm is proposed to improve the candidate-box selection mechanism, addressing the issue of suboptimal detection of occluded targets. The experimental results indicated that, compared with the baseline YOLOv5s algorithm, the improved algorithm reduced parameters and computations by 27.8% and 31.9%, respectively. The mean average precision (mAP) increased by 2.9%, reaching 98.3%. This improvement surpasses recent mainstream algorithms and suggests stronger robustness across various scenarios. The algorithm meets the lightweight requirements for embedded devices in power-line scenarios. Full article
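The Soft-DIoU-NMS step replaces hard suppression with score decay, which helps with occluded targets. A minimal Gaussian Soft-NMS sketch using plain IoU for the overlap term (the paper additionally folds the DIoU center-distance term into the decay; this is the generic version):

```python
import math

def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def soft_nms(boxes, scores, sigma=0.5, thresh=0.1):
    """Gaussian Soft-NMS: decay overlapping scores instead of discarding.

    Returns surviving (box, score) pairs, highest score first.
    """
    pairs = sorted(zip(boxes, scores), key=lambda p: -p[1])
    kept = []
    while pairs:
        best = pairs.pop(0)
        kept.append(best)
        # Decay remaining scores by their overlap with the kept box.
        pairs = [(b, s * math.exp(-iou(best[0], b) ** 2 / sigma))
                 for b, s in pairs]
        pairs = [(b, s) for b, s in pairs if s >= thresh]
        pairs.sort(key=lambda p: -p[1])
    return kept
```

An occluded box that overlaps a stronger detection keeps a reduced score rather than being deleted outright, so it can still survive the final threshold.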

18 pages, 10474 KiB  
Article
Adaptive Frame Sampling and Feature Alignment for Multi-Frame Infrared Small Target Detection
by Chuanhong Yao and Haitao Zhao
Appl. Sci. 2024, 14(14), 6360; https://doi.org/10.3390/app14146360 - 22 Jul 2024
Viewed by 641
Abstract
In recent years, infrared images have attracted widespread attention, due to their extensive application in low-visibility search and rescue, forest fire monitoring, ground target monitoring, and other fields. Infrared small target detection technology plays a vital role in these applications. Although there has been significant research over the years, accurately detecting infrared small targets in complex backgrounds remains a significant challenge. Multi-frame detection methods can significantly improve detection performance in these cases. However, current multi-frame methods face difficulties in balancing the number of input frames and detection speed, and cannot effectively handle the background motion caused by movement of the infrared camera. To address these issues, we propose an adaptive frame sampling method and a detection network aligned at the feature level. Our adaptive frame sampling method uses mutual information to measure motion changes between adjacent frames, construct a motion distribution, and sample frames with uniform motion based on the averaged motion distribution. Our detection network handles background motion by predicting a homography flow matrix that aligns features at the feature level. Extensive evaluation of all components showed that the proposed method can more effectively perform multi-frame infrared small target detection. Full article
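The sampling step above scores motion between adjacent frames with mutual information. A histogram-based estimator over flattened pixel sequences, a generic sketch assuming discrete intensity values rather than the paper's exact procedure:

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in nats) between two equal-length pixel
    sequences, estimated from their joint histogram."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa = Counter(a)
    pb = Counter(b)
    mi = 0.0
    for (x, y), c in joint.items():
        pxy = c / n
        mi += pxy * math.log(pxy / ((pa[x] / n) * (pb[y] / n)))
    return mi

# Identical frames share maximal information; low MI between adjacent
# frames signals large motion, which guides where to sample.
frame = [0, 0, 1, 1, 2, 2]
```

Frames with high mutual information are nearly static, so the sampler can skip them and spend its frame budget where the background or target actually moves.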

24 pages, 10433 KiB  
Article
Glass Defect Detection with Improved Data Augmentation under Total Reflection Lighting
by Pengfei Ding and Liangen Yang
Appl. Sci. 2024, 14(13), 5658; https://doi.org/10.3390/app14135658 - 28 Jun 2024
Viewed by 694
Abstract
To address the technical challenge of identifying tiny defects, especially dust and point defects, on mobile phone flat glass, an automatic optical inspection system is established. The system investigates algorithms including imaging principles, target detection models, data augmentation, foreground segmentation, and image fusion. The system builds an automatic optical inspection platform to collect glass defect samples. It illuminates the glass samples with a combined total reflection–grazing light source, collects the defect sample data, segments the background and defects of the collected data, generates the defect mask, and extracts the complete defects of the cell phone flat glass. The system then seamlessly integrates the extracted defects with a flawless background using Poisson editing and outputs the location information of the defects and the labels, automatically generating the dataset. The deep-learning network YOLOv5 works as the core algorithm framework, into which the Convolutional Block Attention Module and a small-target detection layer are added to enhance the capability of the model to detect small defects. According to the experimental results, the combined lighting effectively improves the precision of detecting dust and bright spots. Additionally, with the adoption of novel data augmentation techniques, the enhanced YOLOv5 model is capable of effectively addressing the challenges posed by insufficient sample data and non-uniform distributions, thus mitigating network generalization issues. Furthermore, this data augmentation approach facilitates the rapid adaptation of the same detection tasks to diverse environmental scenarios, enabling the expedited and efficient deployment of the model across various industrial settings. The mean average precision (mAP) of the optimal model on the validation set reached 98.36%, 2.62% higher than that of the original YOLOv5. In addition, its false acceptance rate (FAR) was 1.27%, its false rejection rate (FRR) was 2.47%, its detection speed was 64 fps, and its correct detection rate on the validation set was 98.75%, which by and large meets current industrial detection requirements. In this way, this paper achieved the automated inspection of mobile phone flat glass with high robustness, high precision, and low false acceptance and false rejection rates, significantly reducing material losses in factories and the likelihood of errors in follow-on products. This method can be applied to the multi-scale, multi-type detection of glass defects. Full article

15 pages, 5037 KiB  
Article
Aerial Image Segmentation of Nematode-Affected Pine Trees with U-Net Convolutional Neural Network
by Jiankang Shen, Qinghua Xu, Mingyang Gao, Jicai Ning, Xiaopeng Jiang and Meng Gao
Appl. Sci. 2024, 14(12), 5087; https://doi.org/10.3390/app14125087 - 11 Jun 2024
Viewed by 722
Abstract
Pine wood nematode disease, commonly referred to as pine wilt, poses a grave threat to forest health, leading to profound ecological and economic impacts. Originating from the pine wood nematode, this disease not only causes the demise of pine trees but also casts a long shadow over the entire forest ecosystem. The accurate identification of infected trees stands as a pivotal initial step in developing effective prevention and control measures for pine wilt. Nevertheless, existing identification methods face challenges in precisely determining the disease status of individual pine trees, impeding early detection and efficient intervention. In this study, we leverage the capabilities of unmanned aerial vehicle (UAV) remote sensing technology and integrate the VGG classical small convolutional kernel network with U-Net to detect diseased pine trees. This cutting-edge approach captures the spatial and characteristic intricacies of infected trees, converting them into high-dimensional features through multiple convolutions within the VGG network. This method significantly reduces the parameter count while enhancing the sensing range. The results obtained from our validation set are remarkably promising, achieving a Mean Intersection over Union (MIoU) of 81.62%, a Mean Pixel Accuracy (MPA) of 85.13%, an Accuracy of 99.13%, and an F1 Score of 88.50%. These figures surpass those obtained using other methods such as ResNet50 and DeepLab v3+. The methodology presented in this research facilitates rapid and accurate monitoring of pine trees infected with nematodes, offering invaluable technical assistance in the prevention and management of pine wilt disease. Full article
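The MIoU figure reported above averages per-class IoU computed from a pixel-level confusion matrix. A minimal sketch on a hypothetical two-class (healthy vs. infected) case:

```python
def mean_iou(conf):
    """Mean Intersection over Union from a square confusion matrix,
    where conf[i][j] counts pixels of true class i predicted as class j."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fp = sum(conf[r][c] for r in range(n)) - tp
        fn = sum(conf[c]) - tp
        denom = tp + fp + fn
        if denom:  # skip classes absent from both prediction and truth
            ious.append(tp / denom)
    return sum(ious) / len(ious)

# Hypothetical two-class pixel counts: healthy vs. infected.
conf = [[50, 10],
        [5, 35]]
```

Unlike plain pixel accuracy (99.13% here), MIoU is not inflated by the dominant background class, which is why segmentation papers report both.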

22 pages, 5984 KiB  
Article
Markov-Embedded Affinity Learning with Connectivity Constraints for Subspace Clustering
by Wenjiang Shao and Xiaowei Zhang
Appl. Sci. 2024, 14(11), 4617; https://doi.org/10.3390/app14114617 - 27 May 2024
Viewed by 551
Abstract
Subspace clustering algorithms have demonstrated remarkable success across diverse fields, including object segmentation, gene clustering, and recommendation systems. However, they often face challenges, such as omitting cluster information and the neglect of higher-order neighbor relationships within the data. To address these issues, a novel subspace clustering method named Markov-Embedded Affinity Learning with Connectivity Constraints for Subspace Clustering is proposed. This method seamlessly embeds Markov transition probability information into the self-expression, leveraging a fine-grained neighbor matrix to uncover latent data structures. This matrix preserves crucial high-order local information and complementary details, ensuring a comprehensive understanding of the data. To effectively handle complex nonlinear relationships, the method learns the underlying manifold structure from a cross-order local neighbor graph. Additionally, connectivity constraints are applied to the affinity matrix, enhancing the group structure and further improving the clustering performance. Extensive experiments demonstrate the superiority of this novel method over baseline approaches, validating its effectiveness and practical utility. Full article
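The Markov transition probabilities embedded in the self-expression come from row-normalising an affinity matrix; repeated Markov steps then expose the higher-order neighbour relations the abstract mentions. A generic sketch with a hypothetical affinity matrix `A`:

```python
def transition_matrix(affinity):
    """Row-normalise a non-negative affinity matrix into Markov
    transition probabilities P[i][j] = A[i][j] / sum_j A[i][j]."""
    return [[v / sum(row) for v in row] for row in affinity]

def step(p, P):
    """One Markov step: propagate a distribution p through P."""
    n = len(P)
    return [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]

# Cross-order neighbour information comes from repeated steps (P, P^2, ...).
A = [[0, 2, 2],
     [1, 0, 1],
     [3, 1, 0]]
P = transition_matrix(A)
```

Starting a walk at one sample and taking several steps spreads probability mass over multi-hop neighbours, which is the "fine-grained neighbor matrix" idea in miniature.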

14 pages, 3618 KiB  
Article
DBCW-YOLO: A Modified YOLOv5 for the Detection of Steel Surface Defects
by Jianfeng Han, Guoqing Cui, Zhiwei Li and Jingxuan Zhao
Appl. Sci. 2024, 14(11), 4594; https://doi.org/10.3390/app14114594 - 27 May 2024
Viewed by 819
Abstract
In steel production, defect detection is crucial for preventing safety risks, and improving the accuracy of steel defect detection in industrial environments remains challenging due to the variable types of defects, cluttered backgrounds, low contrast, and noise interference. Therefore, this paper introduces a steel surface defect detection model, DBCW-YOLO, based on YOLOv5. Firstly, a new feature fusion strategy is proposed that uses the BiFPN method to fuse feature-map information at multiple scales, and CARAFE up-sampling is introduced to expand the receptive field of the network and make more effective use of surrounding information. Secondly, the WIoU loss function, with its dynamic non-monotonic focusing mechanism, is introduced to address the accuracy degradation caused by sample inhomogeneity. This approach improves the learning ability for small-target steel defects and accelerates network convergence. Finally, we use dynamic heads in the network prediction phase, which improves the scale-aware, spatial-aware, and task-aware performance of the algorithm. Experimental results on the NEU-DET dataset show that the average detection accuracy is 81.1%, about 6% higher than that of the original YOLOv5 model, while satisfying real-time detection requirements. Therefore, DBCW-YOLO has good overall performance in the steel surface defect detection task. Full article

16 pages, 3930 KiB  
Article
Prediction of Kiwifruit Sweetness with Vis/NIR Spectroscopy Based on Scatter Correction and Feature Selection Techniques
by Chang Wan, Rong Yue, Zhenfa Li, Kai Fan, Xiaokai Chen and Fenling Li
Appl. Sci. 2024, 14(10), 4145; https://doi.org/10.3390/app14104145 - 14 May 2024
Viewed by 836
Abstract
Sweetness is an important parameter of the quality of Cuixiang kiwifruit. A quick and accurate assessment of sweetness is necessary for farmers to carry out timely orchard management and for consumers to make purchasing choices. The objective of this study was to propose an effective physical method for determining the sweetness of fresh kiwifruit based on fruit hyperspectral reflectance in the 400–2500 nm range. In this study, the visible and near-infrared (Vis/NIR) spectral reflectance and sweetness values of kiwifruit were measured at different time periods after the fruit matured in 2021 and 2022. Multiplicative scatter correction (MSC) and the standard normal variate (SNV) transformation were used for spectral denoising. The successive projections algorithm (SPA) and competitive adaptive reweighted sampling (CARS) methods were employed to select the features most sensitive to sweetness, and these features were then used as the inputs of partial least squares (PLS), least squares support vector machine (LSSVM), back-propagation neural network (BP), and multiple linear regression (MLR) models to explore the best way of predicting sweetness. The study indicated that the most sensitive features lay in the blue and red regions and around 970, 1200, and 1400 nm. The sweetness estimation model constructed using data from the whole harvest period, from August to October, performed better than the models constructed for each harvest period. Overall, the results indicated that hyperspectral reflectance incorporated with MSC-SPA-LSSVM could explain up to 79% of the variability in kiwifruit sweetness, and could be applied as a fast, accurate, and non-destructive alternative for determining the sweetness of kiwifruit. This research could partially provide a theoretical basis for the development of non-destructive instrumentation for the detection of kiwifruit sweetness. Full article
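The MSC preprocessing step regresses each spectrum against the mean spectrum and removes the fitted multiplicative and additive scatter. A minimal pure-Python sketch on toy spectra (not the study's data or code):

```python
def msc(spectra):
    """Multiplicative scatter correction: regress each spectrum on the
    mean spectrum (x ~ a * ref + b) and return (x - b) / a."""
    n = len(spectra[0])
    ref = [sum(s[i] for s in spectra) / len(spectra) for i in range(n)]
    ref_mean = sum(ref) / n
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n
        cov = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s))
        var = sum((r - ref_mean) ** 2 for r in ref)
        a = cov / var          # multiplicative scatter (slope)
        b = s_mean - a * ref_mean  # additive scatter (offset)
        corrected.append([(x - b) / a for x in s])
    return corrected

# Toy check: scaled/shifted copies of one spectrum collapse together.
base = [1.0, 2.0, 4.0, 3.0]
spectra = [[2 * x + 1 for x in base], [0.5 * x - 1 for x in base]]
corrected = msc(spectra)
```

After correction, spectra that differ only by scatter effects become nearly identical, so the downstream regression models see chemistry rather than optics.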

12 pages, 4471 KiB  
Article
Dual Enhancement Network for Infrared Small Target Detection
by Xinyi Wu, Xudong Hu, Huaizheng Lu, Chaopeng Li, Lei Zhang and Weifang Huang
Appl. Sci. 2024, 14(10), 4132; https://doi.org/10.3390/app14104132 - 13 May 2024
Viewed by 887
Abstract
Infrared small target detection (IRSTD) is crucial for applications in security surveillance, unmanned aerial vehicle identification, military reconnaissance, and other fields. However, small targets in infrared images often suffer from resolution limitations and background complexity, which pose a great challenge to IRSTD, especially given noise interference and the presence of tiny, low-luminance targets. In this paper, we propose a novel dual enhancement network (DENet) to suppress background noise and enhance dim small targets. Specifically, to address the problem of complex backgrounds in infrared images, we designed the residual sparse enhancement (RSE) module, which sparsely propagates a number of representative pixels between adjacent feature pyramid layers instead of using a simple summation. To handle infrared targets that are extremely dim and small, we developed a spatial attention enhancement (SAE) module to adaptively enhance and highlight the features of dim small targets. In addition, we evaluated the effectiveness of the modules in the DENet model through ablation experiments. Extensive experiments on three public infrared datasets demonstrated that our approach can greatly enhance dim small targets, with average values of intersection over union (IoU), probability of detection (Pd), and false alarm rate (Fa) reaching 77.33%, 97.30%, and 9.299%, respectively, a performance superior to state-of-the-art IRSTD methods.
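As context for the metrics reported in this abstract, simplified pixel-level versions of IoU and false alarm rate for binary masks can be written as follows. This is an illustrative sketch only; the paper's exact (e.g., target-level) definitions of Pd and Fa may differ:

```python
import numpy as np

def pixel_iou(pred, gt):
    """Pixel-level intersection over union for two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def false_alarm_rate(pred, gt):
    """Fraction of image pixels predicted as target but absent from ground truth."""
    return np.logical_and(pred, ~gt).sum() / pred.size
```

IoU rewards precise segmentation of the (very few) target pixels, which is why it is a demanding metric for small, dim targets.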

19 pages, 8343 KiB  
Article
Precision-Boosted Forest Fire Target Detection via Enhanced YOLOv8 Model
by Zhaoxu Yang, Yifan Shao, Ye Wei and Jun Li
Appl. Sci. 2024, 14(6), 2413; https://doi.org/10.3390/app14062413 - 13 Mar 2024
Cited by 2 | Viewed by 1758
Abstract
Forest fires present a significant challenge to ecosystems, particularly because factors like tree cover complicate fire detection. While fire detection technologies such as YOLO are widely used in forest protection, capturing diverse and complex flame features remains challenging. We therefore propose an enhanced YOLOv8 multiscale forest fire detection method. This involves adjusting the network structure and integrating Deformable Convolution and SCConv modules to better adapt to the complexities of forest fires. Additionally, we introduce the Coordinate Attention mechanism into the Detection module to capture feature information more effectively and enhance model accuracy. We adopt the WIoU v3 loss function and implement a dynamically non-monotonic mechanism to optimize gradient allocation strategies. Our experimental results demonstrate that our model achieves a mAP of 90.02%, approximately 5.9% higher than the baseline YOLOv8 network. This method significantly improves forest fire detection accuracy, reduces false positive rates, and demonstrates excellent applicability in real forest fire scenarios.

23 pages, 4888 KiB  
Article
SFFNet: Staged Feature Fusion Network of Connecting Convolutional Neural Networks and Graph Convolutional Neural Networks for Hyperspectral Image Classification
by Hao Li, Xiaorui Xiong, Chaoxian Liu, Yong Ma, Shan Zeng and Yaqin Li
Appl. Sci. 2024, 14(6), 2327; https://doi.org/10.3390/app14062327 - 10 Mar 2024
Cited by 3 | Viewed by 1679
Abstract
The immense representation power of deep learning frameworks has kept them in the spotlight in hyperspectral image (HSI) classification. Graph Convolutional Neural Networks (GCNs) can be used to compensate for the lack of spatial information in Convolutional Neural Networks (CNNs). However, most GCNs construct graph data structures based on pixel points, which requires building neighborhood matrices over all the data. Meanwhile, the way GCNs typically construct similarity relations from spatial structure is not fully suited to HSIs. To make the network more compatible with HSIs, we propose a staged feature fusion model called SFFNet, a neural network framework connecting CNN and GCN models. The CNN performs the first stage of feature extraction, assisted by neighboring features to overcome the limitations of local convolution; the GCN then performs the second stage for classification, with the graph structure constructed from spectral similarity, optimizing the original connectivity relationships. In addition, the framework enables batch training of the GCN by using the extracted spectral features as nodes, which greatly reduces the hardware requirements. Experimental results on three publicly available benchmark hyperspectral datasets show that our proposed framework outperforms other relevant deep learning models, with an overall classification accuracy of over 97%.
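The abstract's key idea, building graph structure from spectral similarity rather than spatial position, can be illustrated with a toy cosine-similarity k-nearest-neighbor adjacency. This is a generic sketch under our own assumptions, not the authors' SFFNet implementation:

```python
import numpy as np

def spectral_knn_adjacency(features, k=2):
    """Build a symmetric 0/1 adjacency matrix from cosine similarity of
    node feature vectors, keeping the top-k most similar neighbors per node."""
    norm = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = norm @ norm.T
    np.fill_diagonal(sim, -np.inf)       # exclude self-loops from the top-k step
    adj = np.zeros_like(sim)
    for i, row in enumerate(sim):
        for j in np.argsort(row)[-k:]:   # indices of the k most similar nodes
            adj[i, j] = adj[j, i] = 1.0
    return adj
```

Connecting nodes by spectral similarity lets pixels of the same material class share information even when they are spatially far apart, which is the property the abstract argues spatial graphs miss for HSIs.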

18 pages, 7048 KiB  
Article
U-ETMVSNet: Uncertainty-Epipolar Transformer Multi-View Stereo Network for Object Stereo Reconstruction
by Ning Zhao, Heng Wang, Quanlong Cui and Lan Wu
Appl. Sci. 2024, 14(6), 2223; https://doi.org/10.3390/app14062223 - 7 Mar 2024
Viewed by 1048
Abstract
The Multi-View Stereo (MVS) model, which utilizes 2D images from multiple perspectives for 3D reconstruction, is a crucial technique in the field of 3D vision. To address the poor correlation between 2D features and 3D space in existing MVS models, as well as the high sampling rate required by static sampling, we propose U-ETMVSNet in this paper. Initially, we employ an integrated epipolar transformer (ET) module to establish 3D spatial correlations along epipolar lines, thereby enhancing the reliability of aggregated cost volumes. Subsequently, we devise a sampling module based on probability volume uncertainty to dynamically adjust the depth sampling range for the next stage. Finally, we utilize a multi-stage joint learning method based on multi-depth-value classification to evaluate and optimize the model. Experimental results demonstrate that on the DTU dataset, our method achieves relative performance improvements of 27.01% and 11.27% in completeness error and overall error, respectively, compared to CasMVSNet, even at lower depth sampling rates. Moreover, our method achieves an excellent score of 58.60 on the Tanks & Temples dataset, highlighting its robustness and generalization capability.

20 pages, 7918 KiB  
Article
Lightweight Non-Destructive Detection of Diseased Apples Based on Structural Re-Parameterization Technique
by Bo Han, Ziao Lu, Luan Dong and Jingjing Zhang
Appl. Sci. 2024, 14(5), 1907; https://doi.org/10.3390/app14051907 - 26 Feb 2024
Cited by 5 | Viewed by 1134
Abstract
This study addresses the challenges in the non-destructive detection of diseased apples, specifically the high complexity and poor real-time performance of the classification model for detecting diseased fruits in apple grading. Research is conducted on a lightweight model for apple defect recognition, and an improved VEW-YOLOv8n method is proposed. The backbone network incorporates a lightweight, re-parameterized VanillaC2f module, reducing both complexity and the number of parameters, and it employs an extended activation function to enhance the model's nonlinear expression capability. In the neck network, an Efficient-Neck lightweight structure, built from lightweight modules and augmented with a channel shuffling strategy, decreases the computational load while ensuring comprehensive feature fusion. The model's robustness and generalization ability are further enhanced by employing the WIoU bounding box loss function, evaluating anchor frame quality with outlier metrics, and incorporating a dynamically updated gradient gain assignment strategy. Experimental results indicate that the improved model surpasses the YOLOv8n model, achieving a 2.7% increase in average accuracy, a 24.3% reduction in parameters, a 28.0% decrease in computational volume, and an 8.5% improvement in inference speed. This technology offers a novel, effective method for the non-destructive detection of diseased fruits in apple grading procedures.
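The channel shuffling strategy mentioned in this abstract is, in its standard ShuffleNet-style form, a reshape-transpose-reshape over the channel dimension; a minimal NumPy sketch of that generic operation (not the authors' code) for an `(N, C, H, W)` tensor:

```python
import numpy as np

def channel_shuffle(x, groups):
    """ShuffleNet-style channel shuffle: interleave channels across groups
    so information can mix between grouped convolutions (x: N, C, H, W)."""
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by group count"
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)   # swap the group and per-group axes
             .reshape(n, c, h, w))
```

Because the operation is a pure permutation of channels, it adds feature mixing between groups at essentially zero parameter and FLOP cost, which is why it suits lightweight necks.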

20 pages, 4385 KiB  
Article
A Multi-Task Learning and Knowledge Selection Strategy for Environment-Induced Color-Distorted Image Restoration
by Yuan Ding and Kaijun Wu
Appl. Sci. 2024, 14(5), 1836; https://doi.org/10.3390/app14051836 - 23 Feb 2024
Cited by 2 | Viewed by 955
Abstract
Existing methods for restoring color-distorted images in specific environments typically focus on a single type of distortion, making it challenging to generalize them across various types of color-distorted images. If the intrinsic connections between different types of color-distorted images could be leveraged, and their interactions coordinated during model training, this would simultaneously enhance generalization, address potential overfitting and underfitting during data fitting, and consequently yield a performance boost. In this paper, our approach primarily addresses three distinct types of color-distorted images: dust-laden images, hazy images, and underwater images. By thoroughly exploiting the unique characteristics and interrelationships of these types, we achieve the objective of multitask processing. Within this endeavor, identifying appropriate correlations is pivotal. To this end, we propose a knowledge selection and allocation strategy that optimally distributes the features and correlations acquired by the network to the different tasks, enabling a more refined task differentiation. Moreover, given the difficulty of pairing datasets, we employ unsupervised learning techniques and introduce novel Transformer blocks, feedforward networks, and hybrid modules to enhance context relevance. Through extensive experimentation, we demonstrate that our proposed method significantly enhances the performance of color-distorted image restoration.
