Date of Award
12-17-2025
Document Type
Masters Project
Abstract
As the use of deep reinforcement learning (DRL) in mobile robotics grows, it is vital to ensure that agents can navigate their environments safely. Because these robots are often costly and fragile, a key part of this task is path planning and collision avoidance. This project examines the behavior of several DRL agents, including Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO), in dynamic environments, particularly those with intermittent obstacles, i.e., obstacles that do not always pose a threat to the agent. The agents were not trained with sensors to detect moving hazards; rather, the goal was to determine whether an agent can identify, based on past collisions, that a particular area may be hazardous, and how it should behave there. To do so, a custom MuJoCo simulation environment was developed with multiple prebuilt layouts containing a variable number of obstacles. DRL agents were trained in this environment to explore how they respond to dynamic environments, how they treat hazardous locations, and whether they make noticeable changes in their behavior and learning. Experiments were also conducted with different reward structures, such as sparse and dense rewards, to examine how the per-step reward an agent receives affects its behavior and runtime in the environment. Overall, agents performed better in environments with fewer hazards and achieved better results with the dense reward structure. However, agents tended to linger near the goal, farming rewards until the final timestep, believed in part to be due to the absence of a penalty for lingering. Future directions of research, including additional variables in the reward structures and providing the agent with more data, such as sensor input, for action selection, are also discussed.
Recommended Citation
Lewis, Rachel, "Can deep reinforcement learning help robots avoid unexpected obstacles?" (2025). Computer Science. 37.
https://ualaska.researchcommons.org/uaf_grad_compsci/37
Handle
http://hdl.handle.net/11122/16325