DOI: 10.1145/3580305.3599246
Research Article
Open Access

A Study of Situational Reasoning for Traffic Understanding

Published: 04 August 2023

Abstract

Intelligent Traffic Monitoring (ITMo) technologies hold the potential for improving road safety and security and for enabling smart city infrastructure. Understanding traffic situations requires a complex fusion of perceptual information with domain-specific and causal commonsense knowledge. Although prior work has provided benchmarks and methods for traffic monitoring, it remains unclear whether models can effectively align these information sources and reason in novel scenarios. To address this assessment gap, we devise three novel text-based tasks for situational reasoning in the traffic domain: i) BDD-QA, which evaluates the ability of Language Models (LMs) to perform situational decision-making; ii) TV-QA, which assesses LMs' ability to reason about complex event causality; and iii) HDT-QA, which evaluates the ability of models to solve human driving exams. We adopt four knowledge-enhanced methods that have shown generalization capability across language reasoning tasks in prior work, based on natural language inference, commonsense knowledge-graph self-supervision, multi-QA joint training, and dense retrieval of domain information. We associate each method with a relevant knowledge source, including knowledge graphs, relevant benchmarks, and driving manuals. In extensive experiments, we benchmark these knowledge-aware methods on the three datasets under zero-shot evaluation. We provide in-depth analyses of model performance on data partitions and examine model predictions categorically, yielding useful insights on traffic understanding given different background knowledge and reasoning strategies.
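The paper benchmarks language models on the three tasks in a zero-shot setting, i.e., without task-specific fine-tuning. As a rough illustration of what zero-shot multiple-choice evaluation can look like in practice, the sketch below scores each answer candidate by its likelihood under a pre-trained sequence-to-sequence QA model and picks the highest-scoring option; the checkpoint name, input format, and example question are assumptions for illustration, not the authors' exact pipeline.

```python
# Minimal sketch of zero-shot multiple-choice scoring with a pre-trained
# seq2seq QA model (checkpoint and example question are illustrative assumptions).
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "allenai/unifiedqa-t5-base"  # assumed UnifiedQA-style checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
model.eval()


def option_score(question: str, option: str) -> float:
    """Average log-likelihood of `option` as the answer to `question`."""
    enc = tokenizer(question, return_tensors="pt")
    labels = tokenizer(option, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(**enc, labels=labels).loss  # mean cross-entropy over answer tokens
    return -loss.item()  # higher is better


def predict(question: str, options: list[str]) -> str:
    """Zero-shot prediction: return the candidate the model finds most likely."""
    return max(options, key=lambda opt: option_score(question, opt))


# Hypothetical situational decision-making question in the spirit of BDD-QA:
question = "The car ahead brakes suddenly. What should the driver do?"
options = ["accelerate", "slow down and keep a safe distance", "turn off the headlights"]
print(predict(question, options))
```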

Supplementary Material

MP4 File (1113-2min-promo.mp4)
In this video, Jiarui introduces the paper, which focuses on evaluating language models' situational reasoning in traffic scenarios. The study uses a purpose-built framework that adapts diverse knowledge sources to various language models and introduces three novel text-based tasks. The findings show that, although the models perform well above random guessing, a significant gap to human-level performance remains. The study also surfaces avenues for improvement, such as combining predictions from different models and leveraging knowledge sources to boost performance. Reach out for further discussion!


Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023
5996 pages
ISBN: 9798400701030
DOI: 10.1145/3580305
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. language models
  2. question answering
  3. traffic understanding
  4. zero-shot evaluation

Qualifiers

  • Research-article

Conference

KDD '23

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)

