
How to Train Your Latent Control Barrier Function:
Smooth Safety Filtering Under Hard-to-Model Constraints

Carnegie Mellon University
Code [Coming Soon!] arXiv

Hardware Rollouts

Abstract

Latent safety filters extend Hamilton-Jacobi (HJ) reachability to operate on latent state representations and dynamics learned directly from high-dimensional observations, enabling safe visuomotor control under hard-to-model constraints. However, existing methods implement “least-restrictive” filtering that discretely switches between nominal and safety policies, potentially undermining the task performance that makes modern visuomotor policies valuable. While reachability value functions can, in principle, be adapted to be control barrier functions (CBFs) for smooth optimization-based filtering, we theoretically and empirically show that current latent-space learning methods produce fundamentally incompatible value functions. We identify two sources of incompatibility: First, in HJ reachability, failures are encoded via a “margin function” in latent space, whose sign indicates whether or not a latent is in the constraint set. However, representing the margin function as a classifier yields saturated value functions that exhibit discontinuous jumps. We prove that the value function's Lipschitz constant scales linearly with the margin function's Lipschitz constant, revealing that smooth CBFs require smooth margins. Second, reinforcement learning (RL) approximations trained solely on safety policy data yield inaccurate value estimates for nominal policy actions, precisely where CBF filtering needs them. We propose the LatentCBF, which addresses both challenges through gradient penalties that lead to smooth margin functions without additional labeling, and a value-training procedure that mixes data from both the nominal and safety policy distributions. Experiments on simulated benchmarks and hardware with a vision-based manipulation policy demonstrate that LatentCBF enables smooth safety filtering while doubling the success rate over prior switching methods.

Goal: Smooth Safety Filtering for Hard-to-Model Constraints

Previous work has proposed latent safety filters to safeguard against hard-to-model safety constraints such as spilling, burning, or toppling. These methods approximate the solution of a Hamilton-Jacobi reachability problem in a learned latent world model, specifying constraints through a binary classification in the world model's latent space. Concretely, latent safety filters can be obtained by (approximately) solving the following time-discounted Hamilton-Jacobi-Bellman fixed-point equation.
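One standard form of this discounted fixed point (cf. Fisac et al., 2019), written with latent state z, margin function \ell whose sign encodes constraint satisfaction, and latent dynamics z' = f(z, a), is

$$ Q(z, a) \;=\; (1 - \gamma)\,\ell(z) \;+\; \gamma \min\Big\{\ell(z),\; \max_{a'} Q\big(f(z, a),\, a'\big)\Big\}, $$

where \gamma \in [0, 1) is the time discount and the undiscounted HJ value is recovered as \gamma \to 1; the exact latent-space formulation may differ in its details.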


This results in a safety Q-function that can be used to monitor the safety of a task-oriented action, along with a safety fallback policy. However, existing approaches rely on “least-restrictive” filtering that discretely switches between the nominal and safety policies, degrading the robot's performance. Since recent work has adapted HJ-based value functions into control barrier functions (CBFs), it is natural to ask whether this Q-function can be used for smooth, CBF-style safety filtering that minimally alters the nominal policy while maintaining safety.
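Concretely, with the safety value V(z) = max_a Q(z, a) acting as the barrier, a representative discrete-time CBF-style filter solves at each step

$$ a^{\text{filt}} \;=\; \arg\min_{a}\; \|a - a^{\text{nom}}\|^2 \quad \text{s.t.} \quad V\big(f(z, a)\big) - V(z) \;\geq\; -\alpha\, V(z), $$

for some \alpha \in (0, 1]. This is the standard discrete-time CBF constraint, not necessarily the exact program used here, but it highlights the requirement: solving it needs informative action gradients of the learned Q-function.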


However, we identify two key issues that prevent current latent safety filters from being used for smooth, CBF-style filtering.

Issue 1: Gradient Information

Previous methods for training latent safety filters learn the margin function as a binary classifier in a world model's latent space. While this is sufficient for identifying the boundary of the unsafe set, the resulting value function has poor gradients for CBF-style safety filtering. We illustrate how these poor gradients degrade optimization-based safety filtering in a privileged state space, where the value function is computed using traditional numerical methods. Even in the absence of learning-based approximation errors, binary margin functions prevent smooth safety filtering (left). In contrast, smooth margin functions enable smooth safety filtering (right).
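To see concretely why classifier-style margins starve an optimizer of gradient signal, consider a toy 2D collision-avoidance margin (purely illustrative; not our setup or code):

import torch

# Obstacle of radius 1 at the origin; a state is safe iff its signed distance is positive.
z = torch.tensor([[3.0, 0.0]], requires_grad=True)  # a point well clear of the obstacle

def smooth_margin(z):
    # Signed distance to the obstacle boundary.
    return z.norm(dim=-1) - 1.0

def classifier_margin(z, temperature=0.1):
    # A (perfectly trained) classifier squashes the same quantity through a
    # sigmoid, which saturates away from the decision boundary.
    return torch.sigmoid(smooth_margin(z) / temperature) - 0.5

for margin in (smooth_margin, classifier_margin):
    (grad,) = torch.autograd.grad(margin(z).sum(), z)
    print(margin.__name__, grad.norm().item())
# smooth_margin: gradient norm 1.0 everywhere outside the origin.
# classifier_margin: gradient norm ~2e-8 at this point, so an
# optimization-based filter gets essentially no descent direction
# until the state is nearly at the constraint boundary.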


Issue 2: RL Approximations

In high-dimensional state spaces, such as those of a latent world model, the solution to the Hamilton-Jacobi reachability problem is typically approximated using actor-critic reinforcement learning (RL). In these methods, trajectory rollouts of the learned safety policy are stored in a replay buffer, and a Q-function is trained by regressing onto a Bellman target computed from transitions sampled from this buffer.
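A minimal sketch of this off-the-shelf critic update, using the discounted safety Bellman target from above (all names are illustrative placeholders, not our API):

import torch

def critic_loss(batch, critic, critic_target, safety_policy, gamma=0.999):
    z, a, l, z_next = batch  # latent, action, margin at z, next latent
    with torch.no_grad():
        a_next = safety_policy(z_next)          # action drawn from the fallback policy
        q_next = critic_target(z_next, a_next)  # bootstrapped safety value at z'
        # Discounted safety Bellman target (recovers the HJ value as gamma -> 1).
        target = (1 - gamma) * l + gamma * torch.minimum(l, q_next)
    return torch.nn.functional.mse_loss(critic(z, a), target)

Note that a_next always comes from the safety policy, which is exactly the mismatch discussed next.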

However, this only trains the Q-function on actions from the fallback policy. This off-the-shelf training process is at odds with how the critic will be used at deployment in CBF-style filtering, where we must evaluate the safety of task-relevant actions from a nominal policy—actions the critic has likely never encountered during training.

Method

Our key insight for learning smooth margins without dense supervision is to draw inspiration from Wasserstein GANs (WGANs), which learn a smooth discriminator that distinguishes between two classes of samples by regularizing its Lipschitz constant. We modify prior approaches to learning a safety margin function by including the following WGAN-style loss:
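A standard instantiation of this idea is the WGAN-GP gradient penalty (Gulrajani et al., 2017), which softly enforces a unit Lipschitz constant at latents interpolated between safe and unsafe samples:

$$ \mathcal{L}_{\text{GP}} \;=\; \lambda\, \mathbb{E}_{\hat{z}}\Big[\big(\|\nabla_{\hat{z}}\, \ell_\theta(\hat{z})\|_2 - 1\big)^2\Big], \qquad \hat{z} = \epsilon\, z_{\text{safe}} + (1 - \epsilon)\, z_{\text{unsafe}}, \;\; \epsilon \sim \mathcal{U}[0, 1]. $$

In code (an illustrative sketch, not the released implementation):

import torch

def gradient_penalty(margin_net, z_safe, z_unsafe, lam=10.0):
    eps = torch.rand(z_safe.size(0), 1, device=z_safe.device)
    z_hat = (eps * z_safe + (1 - eps) * z_unsafe).requires_grad_(True)
    (grads,) = torch.autograd.grad(margin_net(z_hat).sum(), z_hat, create_graph=True)
    # Penalize gradient norms away from 1, softly enforcing a 1-Lipschitz
    # (and hence smooth) margin function without any extra labels.
    return lam * ((grads.norm(dim=-1) - 1.0) ** 2).mean()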


To address the RL approximation issue, we propose to modify the replay buffer to store equal amounts of nominal-policy and safety-fallback-policy rollouts. This simple modification ensures that the Q-function is trained on actions from both the nominal and safety policies, improving its accuracy where it matters most for CBF-style filtering. In practice, we store (z, a, l, z', a') tuples in the replay buffer, where a and a' are sampled from the same policy.
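A minimal sketch of such a mixed buffer (names are illustrative):

import random

class MixedReplayBuffer:
    def __init__(self, capacity=100_000):
        self.nominal, self.safety = [], []
        self.half = capacity // 2

    def add(self, transition, from_nominal):
        # transition = (z, a, l, z_next, a_next), with a and a_next drawn
        # from the same policy that generated the rollout.
        buf = self.nominal if from_nominal else self.safety
        buf.append(transition)
        if len(buf) > self.half:
            buf.pop(0)

    def sample(self, batch_size):
        # Equal parts nominal-policy and safety-policy transitions, so the
        # critic sees the task-relevant actions it must judge at deployment.
        k = batch_size // 2
        return random.sample(self.nominal, k) + random.sample(self.safety, batch_size - k)

Because a' is stored alongside each transition, the Bellman target can bootstrap with the stored a' rather than re-sampling from the safety actor.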

Results: Simulation

We first analyze the impact of our proposed smooth margin loss on a simple Dubins car with a collision-avoidance constraint (environment pictured under Issue 1). For both our method (GP) and a baseline from prior work (NoGP), we report classification accuracy and smoothness, measured as the maximum |l(z) - l(z')| over a trajectory. Our method greatly improves the smoothness of the margin function while maintaining similar classification accuracy.


We next evaluate our learned value function's ability to perform smooth safety filtering, and find that smoothing the margin function is critical. Our method reduces the override magnitude by 45% relative to least-restrictive filtering while maintaining a 100% safety rate. In contrast, the baseline value function (NoGP) can only reduce the action override by 11%.


Results: Hardware

Moving to a 7-DoF robot arm, we evaluate the necessity of mixing task-oriented actions into the replay buffer. We test our learned value function's ability to filter the robot arm as it diagonally lifts an opened bag of candy. We qualitatively visualize the x-z trajectories of 10 unfiltered rollouts (None), a baseline without replay-buffer mixing (LatentCBF-NoMix), and our method (LatentCBF). Both CBFs use the smooth margin function for a fair comparison.


Without task-oriented actions in the replay buffer, the value function produces poor safety estimates and executes erratic actions misaligned with the task policy. In contrast, our method successfully filters the nominal policy, eliminating the unsafe vertical lifting component of the action while allowing the robot to slide the bag along the table.


Finally, we evaluate the closed-loop performance of our latent CBF when filtering a nominal diffusion policy. We adopt the Skittles-lifting task from prior work, where the robot must lift an opened bag of Skittles without spilling. We find that our method is just as safe as prior work while steering an unsafe nominal policy toward safe behavior modes.


BibTeX

@misc{nakamura2025trainlatentcontrolbarrier,
      title={How to Train Your Latent Control Barrier Function: Smooth Safety Filtering Under Hard-to-Model Constraints}, 
      author={Kensuke Nakamura and Arun L. Bishop and Steven Man and Aaron M. Johnson and Zachary Manchester and Andrea Bajcsy},
      year={2025},
      eprint={2511.18606},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2511.18606}, 
}