Paper link: https://ieeexplore.ieee.org/abstract/document/11036159
Video link: https://www.bilibili.com/video/BV1o5KtzMEN1/
In complex and changing environments, traditional robot motion planning relies on manually re-tuning parameters for different environment characteristics (such as obstacle density and spatial structure) to balance navigation efficiency and safety, while end-to-end deep reinforcement learning (DRL) methods suffer from poor interpretability and a difficult sim-to-real transfer. To address this, this paper proposes a novel environment-adaptive motion planning framework. The framework combines the stability of classical optimization methods with the environmental adaptability of DRL policies: a DRL agent adjusts the parameters of the optimization objectives online, so that the robot can intelligently switch between fast traversal and precise obstacle avoidance according to the environment, significantly improving the flexibility and efficiency of mobile-robot navigation. Experiments show that in an unknown, cluttered office scenario the proposed algorithm improves efficiency by 29% and success rate by 17% over the classical TEB method, achieving navigation that is both agile and reliable.
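To make the idea concrete, below is a minimal Python sketch (not the authors' code) of the weight-adaptation loop: a stand-in policy maps low-dimensional environment features to the weights of a classical trajectory-optimization cost, which then refines a coarse initial path. The feature set, weight ranges, and the toy gradient-based optimizer are illustrative assumptions rather than the paper's actual interface.

```python
"""Illustrative sketch of environment-adaptive trajectory optimization:
low-dimensional environment features -> stand-in policy -> cost weights ->
classical (toy) trajectory refinement of a coarse path."""
import numpy as np


def env_features(obstacles, robot_pos, sensing_radius=3.0):
    """Low-dimensional environment descriptor: local obstacle density and
    distance to the nearest obstacle (kept small to narrow the sim-to-real gap)."""
    d = np.linalg.norm(obstacles - robot_pos, axis=1)
    density = np.sum(d < sensing_radius) / (np.pi * sensing_radius ** 2)
    nearest = d.min() if len(d) else sensing_radius
    return np.array([density, nearest])


def policy(features):
    """Stand-in for the DRL policy: trade speed against safety.
    Denser / closer obstacles -> larger obstacle weight, smaller goal weight."""
    density, nearest = features
    return {
        "obstacle": float(np.clip(1.0 + 5.0 * density - 0.5 * nearest, 0.5, 10.0)),
        "smooth": 1.0,
        "goal": float(np.clip(2.0 - density, 0.5, 2.0)),
    }


def optimize_trajectory(path, obstacles, weights, iters=200, lr=0.05, safe_dist=0.6):
    """Toy gradient-based refinement of a coarse path (endpoints fixed),
    standing in for the paper's two-stage optimizer."""
    traj = path.copy()
    for _ in range(iters):
        grad = np.zeros_like(traj)
        # smoothness/shortness: pull each interior waypoint toward its neighbours
        grad[1:-1] += weights["smooth"] * (2 * traj[1:-1] - traj[:-2] - traj[2:])
        # safety: push waypoints away from obstacles closer than safe_dist
        for ob in obstacles:
            diff = traj[1:-1] - ob
            dist = np.linalg.norm(diff, axis=1, keepdims=True) + 1e-6
            mask = (dist < safe_dist).astype(float)
            grad[1:-1] += weights["obstacle"] * mask * (-(safe_dist - dist)) * diff / dist
        # efficiency: gently draw interior waypoints toward the goal
        grad[1:-1] += weights["goal"] * 0.1 * (traj[1:-1] - traj[-1]) / len(traj)
        traj[1:-1] -= lr * grad[1:-1]
    return traj


if __name__ == "__main__":
    coarse = np.linspace([0.0, 0.0], [5.0, 0.0], 12)          # straight coarse path
    obstacles = np.array([[2.5, 0.1], [2.7, -0.2], [3.0, 0.15]])
    w = policy(env_features(obstacles, robot_pos=coarse[0]))
    refined = optimize_trajectory(coarse, obstacles, w)
    print("weights:", w)
    print("min clearance:",
          np.linalg.norm(refined[:, None] - obstacles, axis=2).min())
```

In the paper this weight adjustment is performed online by the learned policy at every replanning step; the sketch only shows a single decision to keep the example self-contained.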
Z. Zhu, R. Wang*, Y. Wang, Y. Wang and X. Zhang, "Environment-Adaptive Motion Planning via Reinforcement Learning Based Trajectory Optimization," IEEE Transactions on Automation Science and Engineering, doi: 10.1109/TASE.2025.3579573.
Abstract
This paper proposes a novel environment-adaptive motion planning framework for mobile robots, which utilizes deep reinforcement learning to dynamically adjust optimization objectives according to various environmental and robot ego-state characteristics, greatly enhancing adaptability and robustness compared to existing motion planning strategies. Our approach features a two-stage trajectory optimization algorithm that optimizes for smoothness, safety, and efficiency, elements critical in practical applications. First, we propose a reinforcement learning algorithm that dynamically adjusts the optimization objectives based on the environmental context, carefully encoding the environment, the coarse initial path, and robot information into the observation space. Additionally, two techniques are designed to reduce the sim-to-real gap: 1) integrating a classical optimization framework as the motion planning backbone; 2) using low-dimensional inputs in the learning component, which minimizes discrepancies between simulated and real-world conditions. With this hybrid strategy, not only are the interpretability and stability of the classical motion planning pipeline preserved, but the adaptive capability of emerging DRL techniques is also fully utilized. The efficacy of the proposed solution is demonstrated through extensive simulations and real-world experiments, showing superior performance in terms of safety and efficiency across various testing scenarios. Especially in unknown and cluttered real office environments, our approach significantly improves safety (a 17% increase in success rate) and efficiency (a 29% reduction in time cost) compared to popular traditional methods.
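As a concrete illustration of the second sim-to-real technique, the sketch below assembles the kind of low-dimensional observation the abstract describes, compressing the environment (sector-wise laser minima), the coarse initial path (a few waypoints in the robot frame), and the robot's ego state into one compact vector. The sector count, waypoint horizon, and normalization are assumptions chosen for illustration, not the paper's specification.

```python
"""Illustrative low-dimensional observation encoding: environment sectors +
coarse-path waypoints + robot ego state, as a compact sim-to-real-friendly vector."""
import numpy as np


def build_observation(scan_ranges, scan_angles, waypoints_robot_frame,
                      robot_vel, n_sectors=12, max_range=5.0, n_waypoints=4):
    """Compress raw sensing into a compact observation:
    - environment: per-sector minimum range from a 2-D laser scan,
    - coarse path: the next few global-planner waypoints in the robot frame,
    - robot ego state: current linear and angular velocity."""
    # Sector-wise minimum distances, normalised to [0, 1].
    sector_ids = ((scan_angles + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    sectors = np.full(n_sectors, max_range)
    for sid, r in zip(sector_ids, np.clip(scan_ranges, 0.0, max_range)):
        sectors[sid] = min(sectors[sid], r)
    env_part = sectors / max_range

    # First few waypoints of the coarse initial path, padded if the path is short.
    wp = np.asarray(waypoints_robot_frame, dtype=float)[:n_waypoints]
    if len(wp) < n_waypoints:
        wp = np.vstack([wp, np.repeat(wp[-1:], n_waypoints - len(wp), axis=0)])
    path_part = wp.flatten() / max_range

    # Robot ego information (v, omega).
    robot_part = np.asarray(robot_vel, dtype=float)

    return np.concatenate([env_part, path_part, robot_part])


if __name__ == "__main__":
    angles = np.linspace(-np.pi, np.pi, 360, endpoint=False)
    ranges = np.full(360, 4.0)
    ranges[170:190] = 0.8                      # an obstacle roughly ahead
    obs = build_observation(ranges, angles,
                            waypoints_robot_frame=[[0.5, 0.0], [1.0, 0.1], [1.5, 0.2]],
                            robot_vel=[0.4, 0.0])
    print(obs.shape, obs[:12])                 # 12 sector dims + 8 path dims + 2 ego dims
```

Keeping the observation this small is what lets the learned component transfer across simulators and real sensors with little retraining, while the classical optimization backbone handles the fine-grained trajectory generation.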