Modeling Spatiotemporal Pedestrian-Environment Interactions for Predicting Pedestrian Crossing Intention from the Ego-View

Date
2021-08
Language
American English
Degree
M.S.E.C.E.
Degree Year
2021
Department
Electrical & Computer Engineering
Grantor
Purdue University
Abstract

For pedestrians and autonomous vehicles (AVs) to co-exist harmoniously and safely in the real world, AVs will need not only to react to pedestrian actions, but also to anticipate their intentions. In this thesis, we propose to use rich visual and pedestrian-environment interaction features to improve pedestrian crossing intention prediction from the ego-view. We do so by combining visual feature extraction, graph modeling of scene objects and their relationships, and feature encoding as comprehensive inputs for an LSTM encoder-decoder network. Pedestrians react and make decisions based on their surrounding environment and the behaviors of other road users around them. The human-human social relationship has already been explored for pedestrian trajectory prediction from the bird’s eye view in stationary cameras. However, context and pedestrian-environment relationships are often missing in current research into pedestrian trajectory and intention prediction from the ego-view. To map the pedestrian’s relationship to its surrounding objects, we use a star graph with the pedestrian in the center connected to all other road objects/agents in the scene. The pedestrian and road objects/agents are represented in the graph through visual features extracted using state-of-the-art deep learning algorithms. We use graph convolutional networks and graph autoencoders to encode the star graphs in a lower dimension. Using the graph encodings, pedestrian bounding boxes, and human pose estimation, we propose a novel model that predicts pedestrian crossing intention using not only the pedestrian’s action behaviors (bounding box and pose estimation), but also their relationship to their environment. By tuning hyperparameters and experimenting with different graph convolutions for our graph autoencoder, we are able to improve on the state-of-the-art results.
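As an illustration of the star-graph idea described above, the following is a minimal sketch, not the thesis implementation. It assumes node 0 is the pedestrian, uses random vectors in place of the extracted visual features, and applies one graph convolution with the commonly used symmetric normalization; the thesis experiments with several graph convolutions and does not state here which one performed best. The names `star_adjacency` and `gcn_layer`, and all dimensions, are hypothetical.

```python
import numpy as np


def star_adjacency(num_objects: int) -> np.ndarray:
    """Adjacency matrix of a star graph: node 0 (pedestrian) links to every object."""
    n = num_objects + 1
    adj = np.zeros((n, n))
    adj[0, 1:] = 1.0  # pedestrian -> each road object/agent
    adj[1:, 0] = 1.0  # each road object/agent -> pedestrian
    return adj


def gcn_layer(adj: np.ndarray, features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)


# Assumed toy sizes: 1 pedestrian + 4 scene objects, 8-dim features, 3-dim encoding.
rng = np.random.default_rng(0)
adj = star_adjacency(num_objects=4)
features = rng.normal(size=(5, 8))  # stand-in for extracted visual features
weights = rng.normal(size=(8, 3))   # stand-in for learned parameters
encoded = gcn_layer(adj, features, weights)
```

In a graph autoencoder, a layer like this would act as the encoder, and the resulting lower-dimensional node encodings would be trained to reconstruct the graph.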
Our context-driven method outperforms current state-of-the-art results on the benchmark dataset Pedestrian Intention Estimation (PIE). The state of the art predicts pedestrian crossing intention with a balanced accuracy score (used to account for dataset imbalance) of 0.61, while our best-performing model has a balanced accuracy score of 0.79. Our model performs especially well in no-crossing-intention scenarios, with an F1 score of 0.56 compared to the state of the art’s score of 0.36. Additionally, we experiment with training the state-of-the-art model and our model to predict pedestrian crossing action and intention jointly. While jointly predicting crossing action does not improve crossing intention prediction, it is important to distinguish between predicting crossing action and predicting crossing intention.
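Balanced accuracy, the metric reported above, is the mean of the per-class recalls, so a majority-class predictor cannot score well on an imbalanced dataset. The sketch below shows the calculation on made-up labels; the label values and the helper name `balanced_accuracy` are illustrative assumptions and are not taken from the thesis or from PIE.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Made-up imbalanced example: 8 "will cross" (1) and 2 "will not cross" (0) samples.
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 10  # a predictor that always answers "will cross"

plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
```

Here plain accuracy is 0.8 but balanced accuracy is 0.5, because recall on the minority class is zero.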

Description
Indiana University-Purdue University Indianapolis (IUPUI)
Type
Thesis