A multi-modal AI approach for intuitively instructable autonomous systems

We present a multi-modal AI framework for intuitively instructing and controlling Automated Guided Vehicles (AGVs). We define a general multi-modal AI architecture that loosely couples three AI modules: spoken language understanding, visual perception, and Reinforcement Learning navigation. We apply the same architecture to two use cases implemented on two different platforms: an off-road vehicle that can pick up objects, and an indoor forklift that performs automated warehouse inventory. We show that the proposed architecture can be used for a wide range of tasks and implemented on different hardware, demonstrating a high degree of modularity.
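The loose coupling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the in-process message bus, the topic names ("intent", "goal"), the class names, the keyword-based language step, the dictionary-based scene, and the greedy grid walk standing in for a learned Reinforcement Learning policy are all assumptions made for illustration. The point is only that each module talks to the bus rather than to the other modules, so any one of them could be swapped for a different platform.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

Position = Tuple[int, int]


class MessageBus:
    """Tiny in-process publish/subscribe bus (hypothetical, illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(message)


class LanguageModule:
    """Stand-in for spoken language understanding: keyword spotting on text."""

    def __init__(self, bus: MessageBus) -> None:
        self._bus = bus

    def hear(self, utterance: str) -> None:
        words = utterance.lower().split()
        if "pick" in words:
            # Assumption: the last word of the command names the target object.
            self._bus.publish("intent", {"action": "pick", "target": words[-1]})


class PerceptionModule:
    """Stand-in for visual perception: looks the target up in a known scene."""

    def __init__(self, bus: MessageBus, scene: Dict[str, Position]) -> None:
        self._bus = bus
        self._scene = scene
        bus.subscribe("intent", self.on_intent)

    def on_intent(self, intent: dict) -> None:
        position: Optional[Position] = self._scene.get(intent["target"])
        if position is not None:
            self._bus.publish("goal", {"position": position})


class NavigationModule:
    """Stand-in for RL navigation: a greedy grid walk, not a learned policy."""

    def __init__(self, bus: MessageBus, start: Position = (0, 0)) -> None:
        self.position = start
        bus.subscribe("goal", self.on_goal)

    def on_goal(self, goal: dict) -> None:
        gx, gy = goal["position"]
        x, y = self.position
        while (x, y) != (gx, gy):
            # Move one cell toward the goal along each axis per step.
            x += (gx > x) - (gx < x)
            y += (gy > y) - (gy < y)
        self.position = (x, y)


def run_demo(command: str = "please pick up the pallet") -> Position:
    bus = MessageBus()
    language = LanguageModule(bus)
    PerceptionModule(bus, scene={"pallet": (3, 2)})
    navigator = NavigationModule(bus)
    language.hear(command)
    return navigator.position
```

In this sketch, replacing the navigation stand-in with a different controller only requires subscribing the new component to the "goal" topic; the language and perception stand-ins are unaffected.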

Language

English

Source (journal)

International journal on advances in systems and measurements

Publication

2023

Volume/pages

16:1&2 (2023), p. 1-13

Full text (open access)

https://repository.uantwerpen.be/docstore/d:irua:20525

Full text (publisher's version - intranet only)

https://repository.uantwerpen.be/docstore/d:iruaintra:10706

Faculty/Department				Faculty of Sciences. Mathematics and Computer Science; Faculty of Applied Engineering Sciences

Research group				Internet Data Lab (IDLab)
Project info				A Connected Brain-sized network – Design of a distributed connectivity layer for combining different heterogeneous deep learning systems.
Publication type				A1 Journal article

Subject				Engineering sciences. Technology; Computer. Automation

Affiliation				Publications with a UAntwerp address

Identifier

Creation

24.11.2023

Last edited

25.05.2024

To cite this reference

https://hdl.handle.net/10067/2010000151162165141