Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework

Weiqin Zu, Wenbin Song, Ruiqing Chen, Ze Guo, Fanglei Sun*, Zheng Tian, Wei Pan, Jun Wang

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedingConference contributionpeer-review

Abstract

The socially-aware navigation system has evolved to adeptly avoid various obstacles while performing multiple tasks, such as point-to-point navigation, human-following, and-guiding. However, a prominent gap persists: in Human-Robot Interaction (HRI), the procedure of communicating commands to robots demands intricate mathematical formulations. Furthermore, the transition between tasks does not quite possess the intuitive control and user-centric interactivity that one would desire. In this work, we propose an LLM-driven interactive multimodal multitask robot navigation framework, termed LIM2N, to solve the above new challenge in the navigation field. We achieve this by first introducing a multimodal interaction framework where language and hand-drawn inputs can serve as navigation constraints and control objectives. Next, a reinforcement learning agent is built to handle multiple tasks with the received information. Crucially, LIM2N creates smooth cooperation among the reasoning of multimodal input, multitask planning, and adaptation and processing of the intelligent sensing modules in the complicated system. Detailed experiments are conducted in both simulation and the real world demonstrating that LIM2N has solid user needs understanding, alongside an enhanced interactive experience.

Original languageEnglish
Title of host publication2024 IEEE International Conference on Robotics and Automation, ICRA 2024
PublisherIEEE
Pages1019-1025
Number of pages7
ISBN (Electronic)9798350384574
DOIs
Publication statusPublished - 8 Aug 2024
Event2024 IEEE International Conference on Robotics and Automation, ICRA 2024 - Yokohama, Japan
Duration: 13 May 202417 May 2024

Publication series

NameProceedings - IEEE International Conference on Robotics and Automation
ISSN (Print)1050-4729

Conference

Conference2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Country/TerritoryJapan
CityYokohama
Period13/05/2417/05/24

Fingerprint

Dive into the research topics of 'Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework'. Together they form a unique fingerprint.

Cite this