These examples are from the following paper:
View manuscript on arXiv »
All computations and examples use the following parameters:
Evolution of threshold-aware policies/value functions »
Sample paths visualizations »
Switchgrid difference + success reduction visualizations »