
Spatially-enhanced recurrent memory for long-range mapless navigation via end-to-end reinforcement learning

  • Fan Yang
  • Per Frivik
  • David Hoeller
  • Chen Wang
  • Cesar Cadena
  • Marco Hutter
  • Swiss Federal Institute of Technology Zurich

Research output: Contribution to journal, Article, peer-review

Abstract

Recent advancements in robot navigation, particularly with end-to-end learning approaches such as reinforcement learning (RL), have demonstrated remarkable efficiency and effectiveness. However, successful navigation still fundamentally depends on two key capabilities: mapping and planning, whether implemented explicitly or implicitly. Classical approaches rely on explicit mapping pipelines to transform and register egocentric observations into a coherent map for the planning module. In contrast, end-to-end learning often achieves this implicitly—through recurrent neural networks (RNNs) that fuse current and historical observations into a latent space for planning. While existing architectures, such as LSTM and GRU, can capture temporal dependencies, our findings reveal a critical limitation: their inability to effectively perform spatial memorization. This capability is essential for transforming and integrating sequential observations from varying perspectives to build spatial representations that support planning tasks. To address this, we propose spatially-enhanced recurrent units (SRUs)—a simple yet effective modification to existing RNNs—that enhance spatial memorization. To improve navigation performance, we introduce an attention-based network architecture integrated with SRUs, enabling long-range mapless navigation using a single forward-facing stereo camera. Additionally, we employ regularization techniques to facilitate robust end-to-end recurrent training via RL. Experimental results demonstrate that our approach improves long-range navigation performance by 23.5% overall compared to existing RNNs. Furthermore, when equipped with SRU memory, our method outperforms both RL baseline approaches—one relying on explicit mapping and the other on stacked historical observations—achieving overall improvements of 29.6% and 105.0%, respectively, in diverse environments that require long-horizon mapping and memorization capabilities. 
Finally, we address the sim-to-real gap by leveraging large-scale pretraining on synthetic depth data, enabling zero-shot transfer for deployment across diverse and complex real-world environments.
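
As a rough illustration of what the abstract means by transforming and registering egocentric observations, the sketch below re-expresses previously seen 2D points in the robot's current frame after each motion step. This is the explicit, hand-written counterpart of spatial memorization, not the SRU itself and not the authors' implementation; the function names, the planar (x forward, y left) frame convention, and the assumption that ego-motion is known exactly are all hypothetical choices made for this example.

```python
import math


def to_current_frame(point_prev, motion):
    """Re-express a 2D point from the previous robot frame in the current one.

    point_prev: (x, y) in the previous egocentric frame (x forward, y left).
    motion: (dx, dy, dtheta), the pose of the current frame expressed in the
        previous frame (assumed known exactly; a simplification).
    """
    px, py = point_prev
    dx, dy, dtheta = motion
    # Shift the origin to the new robot position ...
    tx, ty = px - dx, py - dy
    # ... then rotate by -dtheta (inverse of the robot's own rotation).
    c, s = math.cos(dtheta), math.sin(dtheta)
    return (c * tx + s * ty, -s * tx + c * ty)


def update_memory(memory, motion, new_points):
    """Carry remembered points into the current frame, then add new ones."""
    carried = [to_current_frame(p, motion) for p in memory]
    return carried + list(new_points)


# Step 1: the robot sees an obstacle 2 m straight ahead.
memory = update_memory([], (0.0, 0.0, 0.0), [(2.0, 0.0)])
# Step 2: the robot drives 1 m forward, turns 90 degrees left, sees nothing new.
memory = update_memory(memory, (1.0, 0.0, math.pi / 2), [])
# memory[0] is approximately (0.0, -1.0): the obstacle is now 1 m to the right.
```

A recurrent policy with no explicit map has to learn an equivalent of `to_current_frame` inside its hidden state, which is the capability the abstract reports standard LSTM and GRU units struggle with.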

Original language: English
Journal: International Journal of Robotics Research
State: Accepted/In press - 2025

Keywords

  • end-to-end mapless navigation
  • recurrent neural networks
  • reinforcement learning
  • spatial memory
