Robust machine learning for vision-based navigation about non-cooperative resident space objects



As the space industry grows and more platforms are launched into outer space, there is rising concern for its sustainable development and an immediate need for strategies to prevent potential crises. These strategies include upcoming missions such as on-orbit servicing (e.g., inspection, refueling) and active debris removal, which commonly require a safe and autonomous Rendezvous, Proximity Operation and Docking (RPOD) capability of a servicer spacecraft with respect to an arbitrary Resident Space Object (RSO). At the core of RPOD is the real-time estimation and tracking of the position and orientation (i.e., pose) of the target RSO. Achieving this with a passive optical sensor such as a camera is particularly attractive, as cameras are ubiquitous, low Size-Weight-Power-Cost (SWaP-C) sensors present on all satellite platforms. Machine Learning (ML) is a modern approach to complex tasks such as predicting the 6D pose of an object from a single image or a sequence of images. Unfortunately, space is a challenging environment in which to operate data-driven ML approaches. First, the inaccessibility of space prevents the collection of large-scale image datasets for both training and validation of spaceborne ML methods. Second, computational resources are limited onboard satellite avionics due to the harsh space environment. These two challenges demand a computationally efficient ML-based pose estimation and navigation algorithm and, more importantly, a mechanism by which the algorithm's performance and robustness on unavailable spaceborne flight images can be validated on the ground prior to deployment. In response, this dissertation makes numerous contributions to the state of the art in vision-based pose estimation of a non-cooperative target using ML.
The proposed algorithms and methodologies allow for training and on-ground validation of Neural Networks (NN) and ML-in-the-loop navigation filters that are also computationally efficient when tested on representative satellite avionics. The first contribution is a set of open-source benchmark datasets designed for training and evaluating NN models for spaceborne applications. All datasets employ synthetic images rendered with computer graphics for NN training. Select datasets additionally consist of Hardware-In-the-Loop (HIL) images of a mockup satellite model captured with a real optical sensor in the Testbed for Rendezvous and Optical Navigation (TRON) facility at Stanford's Space Rendezvous Laboratory. Analyses of nearly 10,000 images from TRON reveal that HIL images are viable on-ground surrogates for otherwise unavailable flight images and can be used for pre-flight performance verification of ML-in-the-loop guidance, navigation and control algorithms. The second contribution is Spacecraft Pose Network v3 (SPNv3), a computationally efficient NN model for monocular pose estimation of a known target RSO, and a robust Adaptive Unscented Kalman Filter (AUKF) that incorporates SPNv3 as its measurement module to track the relative orbital and attitude motion of the target. While HIL images from a facility such as TRON can significantly reduce the visual gap with respect to spaceborne flight images, a residual gap remains, mainly because the HIL domain is based on a mockup model that fails to capture the real satellite's material properties. Therefore, the third contribution is to close this remaining domain gap via Online Supervised Training (OST) of SPNv3 onboard the satellite avionics. OST fine-tunes the NN in a supervised manner using pose pseudo-labels obtained from the a posteriori state estimates of the onboard navigation filter (e.g., the AUKF).
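The OST idea above can be sketched schematically: the filter's a posteriori state estimate stands in for a ground-truth pose label, and the network takes ordinary supervised gradient steps against it. The sketch below is illustrative only, with a toy linear regressor in place of SPNv3 and a synthetic pseudo-label in place of the AUKF output; all names (`predict_pose`, `ost_step`) are ours, not the dissertation's.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_pose(W, features):
    """Toy stand-in for the NN's pose head: a linear map."""
    return W @ features

def ost_step(W, features, pseudo_label, lr):
    """One supervised fine-tuning step against a pose pseudo-label.

    In OST the pseudo-label comes from the navigation filter's
    a posteriori state estimate; here it is simply given.
    """
    residual = predict_pose(W, features) - pseudo_label
    grad = np.outer(residual, features)   # gradient of 0.5 * ||residual||^2
    return W - lr * grad

# Simulated onboard loop: each image yields features and a filter-derived
# pseudo-label; the model weights are refined in place during flight.
W = rng.normal(size=(6, 8))               # 6D pose from 8 toy features
features = rng.normal(size=8)
pseudo_label = rng.normal(size=6)
lr = 0.5 / (features @ features)          # step size safe for this quadratic

loss_before = 0.5 * np.sum((predict_pose(W, features) - pseudo_label) ** 2)
for _ in range(50):
    W = ost_step(W, features, pseudo_label, lr)
loss_after = 0.5 * np.sum((predict_pose(W, features) - pseudo_label) ** 2)
```

The point of the sketch is the data flow, not the model: because the pseudo-labels are produced onboard by the filter itself, no ground-truth flight labels are required to adapt the network to the flight domain.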
Experiments demonstrate that OST with a sub-optimally trained SPNv3 model can already reduce the filter's steady-state errors, fully closing the domain gap, especially in orientation estimation. The final contribution of this dissertation is a novel ML approach to pose estimation and 3D shape reconstruction of an unknown RSO. Specifically, the proposed Convolutional Neural Network (CNN) model takes a single 2D image of the target and predicts the shape of the target as an assembly of unit-size superquadric primitives, a compact representation capable of describing a wide range of simple 3D shapes (e.g., cuboid, ellipsoid) with only a few parameters. Experimental studies indicate that the proposed CNN, trained on synthetic images of 64 different satellite models, can reconstruct accurate superquadric assemblies when tested on unseen images of known models and captures the high-level structure of unknown models in most cases, despite having been trained on an extremely small dataset.
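To illustrate why superquadrics are such a compact representation, the standard inside-outside function of a single primitive can be written with just five parameters: three scales and two shape exponents. The sketch below is a generic textbook formulation, not the dissertation's code; the function and parameter names are ours.

```python
def superquadric_f(p, scales=(1.0, 1.0, 1.0), eps1=1.0, eps2=1.0):
    """Inside-outside function of a superquadric primitive.

    Returns F(p): < 1 inside the surface, > 1 outside, = 1 on it.
    eps1 = eps2 = 1 gives an ellipsoid; eps1, eps2 -> 0 approaches a
    cuboid, so one parameterization spans many simple solid shapes.
    """
    x, y, z = (abs(c) / a for c, a in zip(p, scales))
    xy = (x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1)
    return xy + z ** (2.0 / eps1)

# With defaults the primitive is a unit sphere:
inside = superquadric_f((0.5, 0.0, 0.0))    # 0.25, point inside
outside = superquadric_f((2.0, 0.0, 0.0))   # 4.0, point outside
```

A network predicting an assembly of such primitives therefore only needs to regress a handful of scalars per part (plus each part's pose), which is far more compact than a dense mesh or voxel grid.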


Type of resource: text
Form: electronic resource; remote; computer; online resource
Extent: 1 online resource.
Place: [Stanford, California]
Publisher: [Stanford University]
Copyright date: ©2024
Publication date: 2024
Issuance: monographic
Language: English


Author: Park, Tae Ha
Degree supervisor: D'Amico, Simone
Thesis advisor: D'Amico, Simone
Thesis advisor: Kochenderfer, Mykel J., 1980-
Thesis advisor: Schwager, Mac
Degree committee member: Kochenderfer, Mykel J., 1980-
Degree committee member: Schwager, Mac
Associated with: Stanford University, School of Engineering
Associated with: Stanford University, Department of Aeronautics and Astronautics


Genre: Theses
Genre: Text

Bibliographic information

Statement of responsibility: Tae Ha Park.
Note: Submitted to the Department of Aeronautics and Astronautics.
Thesis: Ph.D., Stanford University, 2024.

Access conditions

© 2024 by Tae Ha Park
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
