Bayesian quadrature policy optimization for spacecraft proximity maneuvers and docking

Desong Du, Yanfang Liu, Ouyang Zhang, Naiming Qi, Weiran Yao, Wei Pan

Research output: Contribution to journal › Article › peer-review

Abstract

Advancing autonomous spacecraft proximity maneuvers and docking (PMD) is crucial for enhancing the efficiency and safety of inter-satellite services. One primary challenge in PMD is the accurate a priori definition of the system model, often complicated by inherent uncertainties in the system modeling and observational data. To address this challenge, we propose a novel Lyapunov Bayesian actor-critic reinforcement learning algorithm that guarantees the stability of the control policy under uncertainty. The PMD task is formulated as a Markov decision process that involves the relative dynamic model, the docking cone, and the cost function. By applying Lyapunov theory, we reformulate temporal difference learning as a constrained Gaussian process regression, enabling the state-value function to act as a Lyapunov function. Additionally, the proposed Bayesian quadrature policy optimization method analytically computes policy gradients, effectively addressing stability constraints while accommodating informational uncertainties in the PMD task. Experimental validation on a spacecraft air-bearing testbed demonstrates the significant and promising performance of the proposed algorithm.
Original language: English
Article number: 109474
Journal: Aerospace Science and Technology
Volume: 154
Early online date: 23 Aug 2024
Publication status: Published - 1 Nov 2024
