Abstract
Dual Control for Exploitation and Exploration (DCEE) shows promising performance by realizing an optimal trade-off between exploitation and exploration in an unknown environment. However, it is computationally intensive and lacks rigorously established properties such as stability and convergence. This paper addresses these two issues by developing a Nesterov Accelerated Gradient Descent (NAGD) based DCEE, i.e. DCEE-NAGD, in which NAGD is applied to both the source term estimation and the path planning in the DCEE framework. It is shown that DCEE-NAGD significantly reduces the search time: with the help of NAGD, it drives the search agent towards the estimated airborne source location (exploitation) while actively collecting new data to reduce the current estimation uncertainty (exploration). The convergence of both the source term estimation and the path planning of the DCEE-NAGD algorithm is rigorously established by applying the mean value theorem and mathematical transformations. More specifically, the convergence bounds and the convergence rates of the source term estimation and of the whole DCEE-NAGD algorithm are derived. Both theoretical analysis and simulations confirm that the proposed DCEE-NAGD algorithm significantly improves performance and thereby reduces the autonomous search time.
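For readers unfamiliar with NAGD, the generic update it refers to can be sketched as follows. This is a minimal illustration of textbook Nesterov momentum on a toy quadratic, not the DCEE-NAGD algorithm from the paper. The function name `nagd`, the step size, the momentum value, and the toy "source" location are all assumptions made for illustration.

```python
import numpy as np

def nagd(grad, x0, step=0.1, momentum=0.9, iters=200):
    """Generic Nesterov accelerated gradient descent (look-ahead form).

    Hypothetical helper for illustration; parameters are assumed values.
    """
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(iters):
        # Evaluate the gradient at the look-ahead point, not at x itself.
        lookahead = x + momentum * v
        v = momentum * v - step * grad(lookahead)
        x = x + v
    return x

# Toy objective (assumption): squared distance to a made-up source location.
source = np.array([3.0, -2.0])
estimate = nagd(lambda x: 2.0 * (x - source), x0=[0.0, 0.0])
```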
Original language | English |
---|---|
Article number | 129729 |
Journal | Neurocomputing |
Volume | 630 |
Early online date | 20 Feb 2025 |
DOIs | |
Publication status | E-pub ahead of print - 20 Feb 2025 |