ReVoLT: Relational Reasoning and Voronoi Local graph planning for Target-driven navigation


Junjia Liu*, Jianfei Guo*, Zehui Meng,
Jingtao Xue, Zhuang Fu, Guangwu Liu

Code (coming soon) | Paper | Video

Embodied AI is an inevitable trend that emphasizes the interaction between intelligent entities and the real world, with broad applications in robotics, especially target-driven navigation. This task requires a robot to efficiently find an object of a given category in an unknown domestic environment. Recent works focus on exploiting layout relationships with graph neural networks (GNNs). However, most of them map observations directly to robot actions in an end-to-end manner via an incomplete relation graph, which is neither interpretable nor reliable. We decouple this task and propose ReVoLT, a hierarchical framework consisting of (a) an object detection visual frontend, (b) a high-level reasoner that infers object-level sub-goals, (c) an intermediate-level planner that computes spatial location sub-goals from object-level sub-goals, and (d) a low-level controller that executes actions. The framework operates on a multi-layer semantic-spatial topological graph. The reasoner uses multiform structured relations as priors, which are obtained from combinatorial relation extraction networks composed of unsupervised GraphSAGE, GCN, and GraphRNN-based Region Rollout. The reasoner applies Upper Confidence Bound for Tree (UCT) to select object-level sub-goals, trading off exploitation (depth-first searching) against exploration (regretting). The lightweight planner generates spatial location sub-goals on the fly from object-level sub-goals through an online-constructed Voronoi local graph, replacing classical SLAM. Simulation experiments demonstrate that our framework outperforms existing state-of-the-art methods on target-driven navigation tasks and generalizes well.
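To make the exploitation/exploration tradeoff in the reasoner concrete, the following is a minimal sketch of the standard UCB1 selection rule that UCT applies at each tree node. It is not the ReVoLT implementation: the `SubGoalNode` class, the function names, the reward values, and the exploration constant `c` are all assumptions made for illustration, and the abstract does not specify how the relation priors or rewards enter the score.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubGoalNode:
    """Hypothetical tree node for one object-level sub-goal (e.g. 'sofa')."""
    name: str
    visits: int = 0
    total_reward: float = 0.0
    children: List["SubGoalNode"] = field(default_factory=list)


def ucb1_score(child: SubGoalNode, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        # Unvisited children are always tried first.
        return math.inf
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_sub_goal(parent: SubGoalNode, c: float = math.sqrt(2)) -> SubGoalNode:
    """Pick the child sub-goal with the highest UCB1 score."""
    return max(parent.children, key=lambda ch: ucb1_score(ch, parent.visits, c))


def backup(path: List[SubGoalNode], reward: float) -> None:
    """Propagate an observed reward to every node on the visited path."""
    for node in path:
        node.visits += 1
        node.total_reward += reward


# Assumed example: two candidate sub-goals with made-up statistics.
root = SubGoalNode("start", visits=12)
sofa = SubGoalNode("sofa", visits=10, total_reward=6.0)
table = SubGoalNode("table", visits=2, total_reward=1.0)
root.children = [sofa, table]

chosen = select_sub_goal(root)        # exploration bonus favors the rarely visited node
greedy = select_sub_goal(root, c=0)   # c = 0 reduces to pure exploitation
```

With these assumed numbers, the rarely visited `table` node wins under the default `c`, while setting `c = 0` selects `sofa`, the node with the higher mean reward. A larger `c` makes the search revisit neglected sub-goals more readily, while a smaller `c` keeps it committed to the currently most promising branch.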