Rajan Agarwal

Research Engineer, currently focused on post-training. Software Engineering Student @ University of Waterloo.

I work on RL for web agents as a Research Engineer at Amazon AGI Lab. Previously, I built multimodal video editing agents for Hollywood at Kino AI and low-level train safety systems at Hitachi Rail.

I am a deeply technical person. I'm constantly building, learning and breaking things. I'm obsessed with learning how things work and designing novel solutions to problems I can't get out of my head. Right now, I'm most curious about multimodal models and coding agents.

Reinforcement Learning @ Amazon AGI
LLMs can invent their own compression

2025

Framed as a constrained optimization problem, LLMs can use RL to invent their own compression schemes to increase their effective context window.

Shadow

2025

An open-source background coding agent with 1.4k stars on GitHub. A feature-filled agent that runs in a MicroVM with full codebase understanding.

Nova Act: SOTA browser-use model

2025

In my internship with Amazon AGI, I worked on RL for a browser-use model. I led model performance on two public benchmarks and contributed to the underlying algorithms.

Natural Deception with RL

2025

When trained on hidden-information games, language models naturally learn deceptive techniques to win by any means.

Cross Lingual Alignment

2025

Research under Cohere Labs on compute-efficient post-training that represents different languages as modalities in multilingual language models.

nanochatVL

2025

Giving vision to Karpathy's nanochat for <$10 of compute, by implementing LLaVA via SigLIP encoder injection and fine-tuning on vision Q&A.

Kino AI: Hollywood Video Editing Agent

2025

A multimodal agent with long-context video understanding to help Hollywood editors. Worked on the most powerful video retrieval and editing agent.

Local VLM on a Samsung Galaxy

2025

Tricked a Galaxy S24 into running the Moondream 3B VLM locally, with quantization and a local Linux setup on the phone. Built at TreeHacks 2025.

GPU optimized voxel grids

2025

Designed and implemented GPU-optimized voxel grids for the humanoid design team at Waterloo. Co-led the ML team.

Arceus: Distributed Training on Macbooks

2024

A decentralized cross-device model training system with model and tensor parallelism to reduce the compute needed to train large models.

Interoperable Coding Subagents

2024

One of the first implementations of coding subagents to work together to solve hard, diverse coding problems.

Multimodal Memory Architecture

2023

Long-term memory with multimodal knowledge graphs to search 7 days of video and audio within 5 seconds. Winners @ Hack the North 2023.

Shapeshift

2023

Deep learning analysis of seismic frequencies and local policy to design affordable earthquake-resistant buildings. Built under the RippleX Fellowship and RBCx.

Offline Mesh Network

2022

An offline mesh network written in Swift using Multipeer Connectivity, enabling cross-device file transfer entirely offline through a chain of encrypted nodes.