Modeling Spatiotemporal Pedestrian-Environment Interactions for Predicting Pedestrian Crossing Intention from the Ego-View

Chen, Chen (Tina)

Files

Chen-thesis-document-final.pdf (12.49 MB)

Date

2021-08

Authors

Chen, Chen (Tina)

Language

American English

Committee Chair

Li, Lingxi

Committee Members

Lauren, Christopher
Ding, Zhengming

Degree

M.S.E.C.E.

Degree Year

2021

Department

Electrical & Computer Engineering

Grantor

Purdue University

Abstract

For pedestrians and autonomous vehicles (AVs) to co-exist harmoniously and safely in the real world, AVs will need not only to react to pedestrian actions but also to anticipate their intentions. In this thesis, we propose to use rich visual and pedestrian-environment interaction features to improve pedestrian crossing intention prediction from the ego-view. We do so by combining visual feature extraction, graph modeling of scene objects and their relationships, and feature encoding as comprehensive inputs for an LSTM encoder-decoder network. Pedestrians react and make decisions based on their surrounding environment and the behaviors of other road users around them. The human-human social relationship has already been explored for pedestrian trajectory prediction from the bird’s eye view in stationary cameras. However, context and pedestrian-environment relationships are often missing in current research into pedestrian trajectory and intention prediction from the ego-view. To map the pedestrian’s relationship to its surrounding objects, we use a star graph with the pedestrian in the center connected to all other road objects/agents in the scene. The pedestrian and road objects/agents are represented in the graph through visual features extracted using state-of-the-art deep learning algorithms. We use graph convolutional networks and graph autoencoders to encode the star graphs in a lower dimension. Using the graph encodings, pedestrian bounding boxes, and human pose estimation, we propose a novel model that predicts pedestrian crossing intention using not only the pedestrian’s action behaviors (bounding box and pose estimation) but also their relationship to their environment. Through tuning hyperparameters and experimenting with different graph convolutions for our graph autoencoder, we are able to improve on the state-of-the-art results.
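The star graph and graph convolution described above can be illustrated with a minimal sketch. This is not the thesis implementation: the symmetric normalization shown is the commonly used Kipf and Welling form, which is assumed here, and the function names, the number of scene objects, the feature sizes, and the random features are all hypothetical placeholders.

```python
import numpy as np

def star_adjacency(num_objects: int) -> np.ndarray:
    """Adjacency matrix of a star graph.

    Node 0 is the pedestrian (the hub); nodes 1..num_objects are the
    surrounding road objects/agents, each linked only to the pedestrian.
    """
    n = num_objects + 1
    adj = np.zeros((n, n))
    adj[0, 1:] = 1.0
    adj[1:, 0] = 1.0
    return adj

def gcn_layer(adj: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W) (assumed form)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)

# Hypothetical scene: one pedestrian plus four objects, 8-dim visual features,
# encoded down to 3 dimensions per node.
rng = np.random.default_rng(0)
adj = star_adjacency(num_objects=4)
features = rng.normal(size=(5, 8))
weight = rng.normal(size=(8, 3))
encoded = gcn_layer(adj, features, weight)
```

In this sketch the pedestrian node aggregates information from every object in the scene in a single layer, while each object node sees only itself and the pedestrian.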
Our context-driven method outperforms current state-of-the-art results on the benchmark dataset Pedestrian Intention Estimation (PIE). The state of the art predicts pedestrian crossing intention with a balanced accuracy (to account for dataset imbalance) score of 0.61, while our best performing model has a balanced accuracy score of 0.79. Our model especially outperforms in no-crossing-intention scenarios, with an F1 score of 0.56 compared to the state of the art’s score of 0.36. Additionally, we experiment with training the state-of-the-art model and our model to predict pedestrian crossing action and intention jointly. While jointly predicting crossing action does not help improve crossing intention prediction, it is important to distinguish between predicting crossing action and predicting crossing intention.
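To show why balanced accuracy is reported for an imbalanced dataset, the following sketch computes it as the mean of per-class recall for binary labels. The labels and predictions are made-up toy data, not results from the thesis, and the function name is illustrative.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall for binary labels (1 = will cross, 0 = will not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)

# Toy imbalanced data: 8 crossing samples, 2 non-crossing samples,
# and a trivial model that always predicts "will cross".
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_accuracy = balanced_accuracy(y_true, y_pred)
```

On this toy data the trivial model reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, because it never identifies the minority class.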

Description

Indiana University-Purdue University Indianapolis (IUPUI)

Keywords

Pedestrian, Ego-view, Autonomous vehicle, Pedestrian intention, Pedestrian crossing, LSTM, Feature extraction

Rights

Attribution-NonCommercial 4.0 International

Type

Thesis

Permanent Link

https://hdl.handle.net/1805/26393
http://dx.doi.org/10.7912/C2/49

Collections

Electrical & Computer Engineering Department Theses and Dissertations
