Braham Snyder

(rhymes with "Graham" and "Sam")

I create more efficient algorithms for sequential decision-making. I focus on reinforcement learning because (i) it is likely important for outperforming the best decisions in prior data, and yet (ii) existing algorithms are often unstable or inefficient.

Three ingredients of data- and compute-efficient RL (function approximation, bootstrapping, and off-policy training) are so unstable in combination that they are known as the deadly triad. One of my goals is to fix this instability. I think moving closer to residual minimization may be one part of the simplest solution.
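A minimal sketch of the instability, using the classic two-state textbook example (one state with feature 1 leads to a state with feature 2, reward 0, so the values are w and 2w). This toy code is illustrative, not from my papers: semi-gradient TD diverges here, while the full residual gradient converges.

```python
# Classic two-state example (Sutton & Barto, Ch. 11): state A (feature 1)
# transitions to state B (feature 2) with reward 0, so V(A) = w, V(B) = 2w.
gamma = 0.99   # a discount near 1 makes the divergence visible
alpha = 0.1    # step size

# Semi-gradient TD(0), updating only at state A: the target bootstraps
# from V(B) = 2w, but the gradient ignores that, so w grows without bound.
w_td = 1.0
for _ in range(100):
    delta = 0.0 + gamma * (2 * w_td) - w_td   # TD error: r + gamma*V(B) - V(A)
    w_td += alpha * delta * 1.0               # semi-gradient: grad of V(A) only

# Residual minimization: gradient descent on 0.5 * delta**2, differentiating
# through the bootstrapped target too, so w shrinks toward the fixed point 0.
w_res = 1.0
for _ in range(100):
    delta = 0.0 + gamma * (2 * w_res) - w_res
    w_res -= alpha * delta * (2 * gamma - 1)  # full gradient of 0.5*delta**2

print(w_td, w_res)  # w_td has blown up; w_res is near 0
```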

I'm a student researcher at UT Austin. I'm fortunate to be advised by Yuke Zhu, and to be collaborating with Amy Zhang.

Email | Google Scholar | Twitter

Selected Works:
Target Rate Optimization: Avoiding Iterative Error Exploitation
Braham Snyder, Amy Zhang, Yuke Zhu
NeurIPS Foundation Models for Decision Making Workshop, 2023
paper | (code forthcoming)

To mitigate the deadly-triad instability of two standard algorithms, we optimize the rate at which their bootstrapped targets are updated. Our best approach, TROT, uses residual minimization. TROT increases return on almost half of the domains we test, by up to ~3x.
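For context, one standard mechanism for updating bootstrapped targets is the soft (Polyak) target update, whose rate is a fixed hyperparameter. The sketch below shows only that generic mechanism; the variable names and the fixed-rate loop are illustrative assumptions, not the paper's method of optimizing the rate.

```python
def soft_target_update(target_w, online_w, tau):
    # Standard Polyak averaging: tau = 1 copies the online weights each step;
    # smaller tau lets the bootstrapped targets track the online network slowly.
    return (1.0 - tau) * target_w + tau * online_w

# Toy scalar "weights": the target drifts toward the online value at rate tau.
target, online = 0.0, 1.0
for _ in range(50):
    target = soft_target_update(target, online, tau=0.1)
# after 50 steps, target = 1 - 0.9**50, i.e. most of the way to the online value
```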

Towards Convergent Offline Reinforcement Learning
Braham Snyder
MS thesis, UT Austin, 2023

Extends Raisin with a higher-level abstract and introduction, and an updated conclusion. Includes more of my ideas for fixing residual minimization, and discusses preliminary experiments in those directions.

Raisin: Residual Algorithms for Versatile Offline Reinforcement Learning
Braham Snyder, Yuke Zhu
NeurIPS Offline Reinforcement Learning Workshop, 2022
paper | ICLR reviews (rejected, top ~30%)

We revisit residual algorithms, which average residual minimization and temporal-difference learning. We add them to a modern, high-scoring, but inefficient offline RL algorithm. Changing nothing else, tuning the residual weight hyperparameter reduces the number of neural networks required by 50x on a standard benchmark domain.
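A hedged sketch of the core idea on a toy two-state chain (one state with feature 1 leads to a state with feature 2, reward 0). The mixing weight phi follows Baird-style residual algorithms; the details here are illustrative, not Raisin's implementation.

```python
def residual_algorithm_step(w, phi, gamma=0.99, alpha=0.1):
    # Toy chain: V(A) = w (feature 1) -> V(B) = 2w (feature 2), reward 0.
    # phi = 0 recovers semi-gradient TD; phi = 1 is pure residual minimization.
    delta = gamma * (2 * w) - w                    # Bellman error
    # Mixed direction: gradient of V(A) minus phi times the differentiated
    # bootstrapped target (Baird-style residual algorithm).
    return w + alpha * delta * (1.0 - phi * gamma * 2)

# On this off-policy-style toy, pure residual minimization converges toward 0,
# whereas pure semi-gradient TD (phi = 0) would diverge.
w = 1.0
for _ in range(100):
    w = residual_algorithm_step(w, phi=1.0)
```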