Attention Residuals: A Comprehensive Understanding
This paper addresses a fundamental problem in training deep transformer models: uncontrolled hidden-state magnitude growth as model depth increases. The auth...
Browser agents are getting better fast, but the web is full of things that try to steer behavior. If that already works on humans, why would agents be immune?
How ads-ranking systems went from aggregated feature counts to retrieve-and-compress architectures that handle 10,000+ user events under millisecond latency ...
From specification gaming in classical RL to deceptive alignment and jailbreaks in LLMs—a survey of how reward hacking has become a central challenge in AI s...
Reasoning in Large Language Models: A Research-Centric Overview