Quasi-Newton Differential Dynamic Programming for Robust Low-Thrust Optimization

Etienne Pellegrini and Ryan P. Russell
AIAA/AAS Astrodynamics Specialists Conference
Minneapolis, MN, 8/13/12

Summary
• Introduction
• The Hybrid Differential Dynamic Programming (HDDP) algorithm [Lantoine & Russell]
– State-Transition Matrices
• Quasi-Newton methods
– Application to HDDP
– The SR1 update
• Results
– 1D Landing
– 2D Spacecraft Problem [Bryson & Ho]
– Complete set of test problems
• Conclusions & Future work

State of the Art
Low-thrust trajectories
→ Highly nonlinear, constrained problems
→ Need for specific and efficient NLP solvers
• DDP methods were introduced in the late 1960s [Mayne, Jacobson]
• Static/Dynamic Algorithm: uses Hessian shifting [Whiffen]
• HDDP: uses a State-Transition Matrix (STM) approach
→ Motivation for this paper: all of these methods are computationally intensive.

Introduction
[Figure: Classic NLP Solvers / DDP Methods]

Introduction
[Figure: Classic NLP Solvers / HDDP Method]

The HDDP algorithm

The HDDP algorithm: STM approach
Sensitivities are obtained using the STMs
• Initialize $J^*_{x,N}(x)$ and $J^*_{xx,N}(x)$
• $J_{x,k}(x,u)$ and $J_{xx,k}(x,u)$ are obtained from backward mapping of $J^*_{x,k+1}(x)$ and $J^*_{xx,k+1}(x)$
• The control law is then used to deduce the state-only sensitivities $J^*_{x,k}(x)$ and $J^*_{xx,k}(x)$
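A minimal sketch of one backward-sweep stage may help make the mapping concrete. It assumes an unconstrained stage with zero stage cost, an augmented state X = [x; u], and hypothetical names (`backward_stage`, `phi1`, `phi2`) that do not come from the slides; the actual HDDP algorithm also handles constraints, trust regions, and multipliers, which are omitted here.

```python
import numpy as np

def backward_stage(Jx_next, Jxx_next, phi1, phi2, n):
    """Map the optimal cost-to-go expansion from stage k+1 back to stage k.

    Assumed conventions (not from the slides): X = [x; u] has size n + m,
    phi1[i, a] = d x_{k+1,i} / d X_{k,a}, phi2[i, a, b] is the matching
    second derivative, and the stage cost is taken as zero.
    """
    # Expansion with respect to the augmented state X_k = [x_k; u_k].
    JX = phi1.T @ Jx_next
    JXX = phi1.T @ Jxx_next @ phi1 + np.einsum("i,iab->ab", Jx_next, phi2)

    # Split into state and control blocks.
    Jx, Ju = JX[:n], JX[n:]
    Jxx, Jxu = JXX[:n, :n], JXX[:n, n:]
    Jux, Juu = JXX[n:, :n], JXX[n:, n:]

    # Unconstrained control law du = A + B dx (J_uu assumed invertible).
    A = -np.linalg.solve(Juu, Ju)
    B = -np.linalg.solve(Juu, Jux)

    # State-only sensitivities once the control law is substituted.
    Jx_star = Jx + Jxu @ A
    Jxx_star = Jxx + Jxu @ B
    return A, B, Jx_star, Jxx_star

# Toy stage: x_{k+1} = x + u with cost-to-go 0.5 * x_{k+1}^2 evaluated at x_{k+1} = 1.
A, B, Jx_star, Jxx_star = backward_stage(
    Jx_next=np.array([1.0]),
    Jxx_next=np.array([[1.0]]),
    phi1=np.array([[1.0, 1.0]]),
    phi2=np.zeros((1, 2, 2)),
    n=1,
)
```

In this toy case the control can cancel the state entirely, so the state-only sensitivities come out as zero.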

The HDDP algorithm: STM approach
• Decouples the optimization step from the propagation step
– Allows for parallelization of the computation
– Allows for approximations to the partial derivatives
• Forward sweep:
– n equations for the state
– n² equations for the 1st-order STM
– n³ equations for the 2nd-order STM
• Propagation of the STMs takes more than 80% of the compute time
• Requires the user to provide the second-order partial derivatives of the state dynamics
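The n + n² + n³ equation count can be illustrated with a small propagation sketch. It assumes the standard variational equations for the first- and second-order STMs and a fixed-step RK4 integrator; the function names and the toy dynamics x' = x² are illustrative choices, not taken from the slides.

```python
import numpy as np

def augmented_rhs(z, n, f, f_x, f_xx):
    """Time derivative of the packed vector [x, Phi1, Phi2] (n + n^2 + n^3 entries)."""
    x = z[:n]
    phi1 = z[n:n + n * n].reshape(n, n)
    phi2 = z[n + n * n:].reshape(n, n, n)
    A = f_x(x)       # (n, n) Jacobian of the dynamics
    H = f_xx(x)      # (n, n, n) second derivatives, H[i, a, b]
    dphi1 = A @ phi1
    dphi2 = (np.einsum("ia,ajk->ijk", A, phi2)
             + np.einsum("iab,aj,bk->ijk", H, phi1, phi1))
    return np.concatenate([f(x), dphi1.ravel(), dphi2.ravel()])

def propagate(x0, t_final, steps, f, f_x, f_xx):
    """Fixed-step RK4 over one stage, starting from Phi1 = I and Phi2 = 0."""
    n = x0.size
    z = np.concatenate([x0, np.eye(n).ravel(), np.zeros(n ** 3)])
    h = t_final / steps
    for _ in range(steps):
        k1 = augmented_rhs(z, n, f, f_x, f_xx)
        k2 = augmented_rhs(z + 0.5 * h * k1, n, f, f_x, f_xx)
        k3 = augmented_rhs(z + 0.5 * h * k2, n, f, f_x, f_xx)
        k4 = augmented_rhs(z + h * k3, n, f, f_x, f_xx)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return z[:n], z[n:n + n * n].reshape(n, n), z[n + n * n:].reshape(n, n, n)

# Toy dynamics x' = x^2, whose closed-form solution is x0 / (1 - x0 * t).
x, phi1, phi2 = propagate(
    np.array([0.5]), 1.0, 200,
    f=lambda x: x ** 2,
    f_x=lambda x: np.array([[2.0 * x[0]]]),
    f_xx=lambda x: np.array([[[2.0]]]),
)
```

For this toy problem the closed-form values at t = 1 are x = 1, Φ1 = 4 and Φ2 = 16, which the sketch reproduces to integration accuracy. The `f_xx` callback is the part that requires user-supplied second-order derivatives.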

Quasi-Newton Methods
• Introduced in 1959 [Davidon]
• Used in many optimization applications
• Aim: approximating the curvature of the problem
→ Estimating the Hessian of the objective function
• Classical approach
– Gradient and estimate of the Hessian used to define a search direction
– Step chosen with a line search or trust-region method
– Estimate of the Hessian is updated
• Estimate of the Hessian has to be positive definite
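The classical approach outlined on this slide can be sketched as follows. This is an illustrative loop only: it assumes a BFGS update with Armijo backtracking, which is one common choice among several, and the function name and test problem are hypothetical.

```python
import numpy as np

def quasi_newton_minimize(f, grad, x0, iters=50, tol=1e-8):
    """Classical quasi-Newton loop with a BFGS update and Armijo backtracking."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)              # initial positive definite Hessian estimate
    g = grad(x)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        p = -np.linalg.solve(B, g)  # search direction from gradient and estimate
        alpha = 1.0
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * (g @ p):
            alpha *= 0.5            # backtracking line search
        s = alpha * p
        g_new = grad(x + s)
        y = g_new - g
        if s @ y > 1e-12:           # curvature condition keeps B positive definite
            Bs = B @ s
            B = B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (s @ y)
        x, g = x + s, g_new
    return x, B

# Convex quadratic 0.5 * x^T Q x - b^T x, minimized where Q x = b.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_min, B_est = quasi_newton_minimize(
    lambda x: 0.5 * x @ Q @ x - b @ x,
    lambda x: Q @ x - b,
    np.zeros(2),
)
```

The update is skipped whenever the curvature condition fails, which is how this sketch keeps the estimate positive definite.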

Application to HDDP: estimating $\Phi_{2,k}$
• Different from traditional quasi-Newton:
– Not as suitable for estimating the Hessian of the cost function
– Estimates the 2nd-order STM
→ Results in changes to the traditional methods
– No enforcement of positive definiteness
– Requires a quasi-Newton update that approximates the Hessian accurately
– Step decided by the propagation of the new control law
– The 2nd-order STM is a tensor composed of n Hessians
→ n quasi-Newton updates to apply
• Computation of the STM is decoupled: the optimization steps are untouched
• The user does not need to provide 2nd-order derivatives

SR1 Update
• A variety of quasi-Newton updates have been developed
– BFGS, DFP, Powell’s damped BFGS, SR1, etc.
• Most of them enforce positive definiteness of the estimate
– In the classical quasi-Newton framework, a descent direction is needed
– In our application, the estimate does not need to be positive definite
• Symmetric Rank 1 update
– Does not enforce convexity
– Results in estimates closer to the true Hessian [Conn et al.]
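The contrast between SR1 and a positive-definite update can be seen on a small indefinite example. The sketch below uses textbook forms of the SR1 and BFGS formulas with common skipping safeguards. The matrix, steps, and tolerances are arbitrary illustrative choices.

```python
import numpy as np

def sr1_update(B, s, y, r_tol=1e-8):
    """SR1 update; skipped when the denominator is too small to be trusted."""
    r = y - B @ s
    denom = r @ s
    if abs(denom) <= r_tol * np.linalg.norm(r) * np.linalg.norm(s):
        return B
    return B + np.outer(r, r) / denom

def bfgs_update(B, s, y):
    """BFGS update; skipped when s^T y <= 0 to keep the estimate positive definite."""
    if s @ y <= 0.0:
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (s @ y)

H = np.array([[2.0, 1.0], [1.0, -3.0]])   # indefinite "true" Hessian
B_sr1 = np.eye(2)
B_bfgs = np.eye(2)
for s in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    y = H @ s                              # exact secant data for a quadratic
    B_sr1 = sr1_update(B_sr1, s, y)
    B_bfgs = bfgs_update(B_bfgs, s, y)
```

After the two steps, `B_sr1` equals the indefinite matrix `H`, while `B_bfgs` stays positive definite because its second update is skipped, so it cannot represent the negative curvature.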

Results: Framework
• Tested on a set of 6 fixed-final-time problems
• Implemented in MATLAB; similar results are expected in other programming languages
• Metric used to evaluate the accuracy of the Hessian estimates: [Khalfan et al.]
• Averaged over every stage and every state

Results: 1D Landing
       Run time  Iterations
HDDP   22.95     11
QHDDP  7.19      11
[Figure: Controls obtained with HDDP and QHDDP]
[Figure: States and controls found by QHDDP]
• 3 states: vertical position and velocity, and fuel
• 1 control: thrust

Results: 2D Spacecraft Problem
• Transfer between two coplanar circular orbits; minimize fuel
[Figure: Trajectory obtained with QHDDP]
[Figure: Controls obtained with HDDP and QHDDP]
       Run time  Iterations
HDDP   551.27    89
QHDDP  32.35     82
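As a quick arithmetic check, the total run-time ratios implied by the 1D Landing and 2D Spacecraft tables can be computed directly. The run-time units are not stated on the slides, so only the ratios are meaningful here.

```python
# Run times and iteration counts copied from the two result tables.
results = {
    "1D Landing": {"HDDP": (22.95, 11), "QHDDP": (7.19, 11)},
    "2D Spacecraft": {"HDDP": (551.27, 89), "QHDDP": (32.35, 82)},
}

speedups = {}
for problem, runs in results.items():
    t_hddp, _ = runs["HDDP"]
    t_qhddp, _ = runs["QHDDP"]
    speedups[problem] = t_hddp / t_qhddp
    print(f"{problem}: QHDDP is {speedups[problem]:.1f}x faster in total run time")
```

The ratios come out at roughly 3.2 and 17.0, which sits within the 2.8 to 17 range quoted in the conclusions.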

Results: 2D Spacecraft Problem
• Different scenarios: test of a restart strategy
→ Trade-off between confidence in the estimate and computation time
• NB: User has to provide 2nd-order derivatives again
[Figure: Metric value for 4 different strategies]
[Figure: Run time for different strategies]

Results: Multi-Rev Spacecraft Problem
• Similar problem, longer time of flight (35 TU), lower maximum thrust (0.05 MU·LU/TU²)
• Bang-bang structure as expected
[Figure: Thrust (MU·LU/TU²) and eccentricity (QHDDP)]
[Figure: Trajectory found by QHDDP]

Results: Complete Set
• Comparison of all test cases
• Metric: 2nd order STM well approximated for most cases
• Run time: the baseline case is faster in most cases
[Figure: Timings for all test cases]
[Figure: Metric for all test cases]

Conclusions
• Possibility of restarting the estimate with the real STM in order to improve confidence
• Propagation becomes 5.4 to 30 times faster
• Total computation time becomes 2.8 to 17 times faster

Future Work
• Testing on representative space trajectories
• Use of multi-step quasi-Newton methods
• Other updates
• Integration of numerical differencing or complex-step differentiation
• Parallelization of the propagation

Thank you for your attention

Backup Slides

Set of test problems

Derivation of the STMs
• Small perturbation to the state:
(1)
• Taylor series:
(2)
• Replace $\delta X$ in (1):
(3)
• Equate (2) and (3):
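A standard form of this derivation is sketched below under assumed notation: dynamics $\dot X = f(X)$, initial perturbation $\delta X_0$, commas denoting partial derivatives of $f$, and summation over repeated indices. The labels follow the steps listed on the slide, but the exact expressions shown in the original presentation may differ.

```latex
\begin{align*}
&\text{(1)}\quad \delta\dot X_i = f_{i,a}\,\delta X_a + \tfrac{1}{2}\, f_{i,ab}\,\delta X_a\,\delta X_b \\
&\text{(2)}\quad \delta X_i = (\Phi_1)_{ia}\,\delta X_{0,a} + \tfrac{1}{2}\,(\Phi_2)_{iab}\,\delta X_{0,a}\,\delta X_{0,b}
 \;\Rightarrow\;
 \delta\dot X_i = (\dot\Phi_1)_{ia}\,\delta X_{0,a} + \tfrac{1}{2}\,(\dot\Phi_2)_{iab}\,\delta X_{0,a}\,\delta X_{0,b} \\
&\text{(3)}\quad \delta\dot X_i = f_{i,c}\,(\Phi_1)_{ca}\,\delta X_{0,a}
 + \tfrac{1}{2}\left( f_{i,c}\,(\Phi_2)_{cab} + f_{i,cd}\,(\Phi_1)_{ca}\,(\Phi_1)_{db} \right)\delta X_{0,a}\,\delta X_{0,b}
 + O(\|\delta X_0\|^3) \\
&\text{(2)} = \text{(3)}:\quad
 (\dot\Phi_1)_{ia} = f_{i,c}\,(\Phi_1)_{ca}, \qquad
 (\dot\Phi_2)_{iab} = f_{i,c}\,(\Phi_2)_{cab} + f_{i,cd}\,(\Phi_1)_{ca}\,(\Phi_1)_{db}
\end{align*}
```

With initial conditions $\Phi_1(t_0) = I$ and $\Phi_2(t_0) = 0$, these are the n² and n³ equations propagated alongside the n state equations.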

Derivation of the SR1 update
• Taylor series:
• Quasi-Newton equation:
• Rank-1 update:
• Because $a u^T \Delta Y_p$ is a scalar:
• Finally:
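The textbook SR1 derivation follows the same steps and is sketched here under assumed notation: $B$ is the current Hessian estimate, $\Delta Y_p$ the step, and $\Delta g$ the corresponding change in the gradient. The symbol $\Delta g$ is introduced for illustration and may not match the original slide.

```latex
\begin{align*}
&\text{Quasi-Newton equation:} && B^{+}\,\Delta Y_p = \Delta g \\
&\text{Rank-1 update:} && B^{+} = B + a\,u\,u^{T} \\
&\text{Substituting:} && \left(a\,u^{T}\Delta Y_p\right) u = \Delta g - B\,\Delta Y_p \\
&\text{Since } a\,u^{T}\Delta Y_p \text{ is a scalar:} && u = \Delta g - B\,\Delta Y_p, \qquad a = \frac{1}{u^{T}\Delta Y_p} \\
&\text{Finally:} && B^{+} = B + \frac{\left(\Delta g - B\,\Delta Y_p\right)\left(\Delta g - B\,\Delta Y_p\right)^{T}}{\left(\Delta g - B\,\Delta Y_p\right)^{T}\Delta Y_p}
\end{align*}
```

The update is undefined when the denominator vanishes, which is why practical implementations usually skip it in that case.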

Why not apply quasi-Newton to the $J_{XX}$ computation?
• $J^{i}_{X,k}$ and $J^{i}_{XX,k}$ are functions of the downstream control law ($u_q$, $k+1 \le q \le N$)
• They are only accurate for a trajectory that follows this control law exactly
• In HDDP, the next iteration changes the downstream control law → $J^{i}_{X,k}$ and $J^{i}_{XX,k}$ do not hold information about the new performance index $J^{i}$
• The quasi-Newton equation does not hold, even with exact second-order derivatives
• Applying a quasi-Newton method, which enforces this quasi-Newton equation, cannot predict the right $J^{i+1}_{XX,k}$

Bellman’s Principle of Optimality
“An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.”
Bellman, R., Dynamic Programming, Princeton University Press, Princeton, New Jersey, 1957.
