"Forced-to-switch-initially" movies

Here we show how we pick initial configurations that yield the largest difference between \( \alpha_* \) and \( \mu_* \).

The idea is simple: we observed that \( \alpha_* \) performs better when it can reduce the number of possible tack-switches. Therefore, we:

Compute another value function (denoted \( \tilde{w} \)) that encodes the best probability of reaching the target within deadline \(s\) under the following new constraint:
- Force an initial tack-switch at any (\(r,\theta,s\)) and then behave optimally
Compute the difference in switchgrids (denoted \( D_* \)) between \( \alpha_* \) and \( \mu_* \) at any \(s\). By "difference", we mean the set of states where either:
- Risk-aware (RA) policy prescribes "switching" while the risk-neutral (RN) policy doesn't.
- RN policy prescribes "switching" while the RA policy doesn't.
Compute \( w - \tilde{w} \) on \( D_* \) for all \( 0 \le s \le \bar{s} \).
Find states where \( w - \tilde{w} \) is large on \( D_* \). (In most cases, we want states where the RN policy prescribes "switching" while the RA policy doesn't.)
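
The last three steps can be sketched in a few lines of Python for a single deadline slice \(s\). This is only an illustrative sketch, not the code used to produce the movies: the function names, the nested-list grid layout indexed by (\(r, \theta\)), and the toy numbers are all assumptions made for the example, and \( w \) and \( \tilde{w} \) are taken as already computed.

```python
def switch_difference(ra_switch, rn_switch):
    """Label each grid node: +1 if only RN switches, -1 if only RA switches, 0 if they agree.

    ra_switch / rn_switch are hypothetical boolean grids (lists of rows) that are
    True wherever the corresponding policy prescribes a tack-switch.
    """
    return [
        [int(rn) - int(ra) for ra, rn in zip(ra_row, rn_row)]
        for ra_row, rn_row in zip(ra_switch, rn_switch)
    ]


def largest_forced_switch_losses(w, w_tilde, diff, kind=1, k=3):
    """Return up to k tuples (w - w_tilde, i, j), largest loss first.

    Only nodes whose label in `diff` equals `kind` are considered, so kind=1
    restricts the search to states where RN switches while RA does not.
    """
    candidates = []
    for i, row in enumerate(diff):
        for j, label in enumerate(row):
            if label == kind:
                candidates.append((w[i][j] - w_tilde[i][j], i, j))
    candidates.sort(reverse=True)  # largest drop in success probability first
    return candidates[:k]


# Toy 2 x 3 slice; every number below is made up for illustration.
ra_switch = [[False, True, False], [False, False, True]]
rn_switch = [[True, True, False], [True, False, False]]
w = [[0.90, 0.80, 0.70], [0.60, 0.50, 0.40]]
w_tilde = [[0.70, 0.78, 0.69], [0.55, 0.45, 0.30]]

diff = switch_difference(ra_switch, rn_switch)
for loss, i, j in largest_forced_switch_losses(w, w_tilde, diff, kind=1, k=2):
    print(f"node ({i}, {j}): forced-switch loss = {loss:.2f}")
```

Repeating this for every \( 0 \le s \le \bar{s} \) and keeping the nodes with the largest loss gives candidate initial configurations. Passing `kind=-1` instead selects states where only the RA policy switches.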

Movies:

The following four movies all show contour plots (starboard tack, \( q = 1 \), only) in the relative (\(r, \theta \))-space, where within each frame:

Left: The difference in switchgrids between \( \alpha_* \) and \( \mu_* \) at a specific deadline \(s\)
Right: The corresponding reduction in success probability if forced to switch initially (behaving optimally afterward)