Bayesian Surveillance Evasion



We consider a task of surveillance-evading path-planning in a continuous setting. An Evader strives to escape from a 2D domain while minimizing the risk of detection (and immediate capture). The probability of detection is path-dependent and determined by the spatially inhomogeneous surveillance intensity, which is fixed but a priori unknown and gradually learned in the multi-episodic setting. We introduce a Bayesian reinforcement learning algorithm that relies on a Gaussian Process regression (to model the surveillance intensity function based on the information from prior episodes), numerical methods for Hamilton-Jacobi PDEs (to plan the best continuous trajectories based on the current model), and Confidence Bounds (to balance the exploration vs exploitation). We use numerical experiments and regret metrics to highlight the significant advantages of our approach compared to traditional graph-based algorithms of reinforcement learning.


D, Qi, D. Bindel and A. Vladimirsky, "Surveillance Evasion Through Bayesian Reinforcement Learning,"

View manuscript on Arxiv »

Source Code

View source code on Github »


The C++ implementation of Fast Marching Method is based on another work by D.Qi and A. Vladimirsky, which can be found here.

The 3rd author is supported by the NSF DMS (awards 1645643, 1738010, and 2111522).

We thank Elliot Cartree for providing a template for this website, originally used to host his work on Time-Dependent Surveillance-Evasion Games.