
John Schulman

John Schulman is an OpenAI cofounder and reinforcement-learning researcher who created the PPO algorithm, helped build ChatGPT and the RLHF alignment behind it, left OpenAI for Anthropic in August 2024, and then cofounded Mira Murati's Thinking Machines Lab as chief scientist.

Location: San Francisco, California
Tags: Person, JohnSchulman, OpenAI, Anthropic, ThinkingMachinesLab, ArtificialIntelligence, ReinforcementLearning

John Schulman is a machine-learning researcher who cofounded OpenAI in 2015. There he helped develop reinforcement learning from human feedback (RLHF) and led the creation of ChatGPT, and he is regarded as one of the principal architects of the alignment methods that made conversational language models usable. He left OpenAI for Anthropic in August 2024, departed Anthropic roughly six months later, and in February 2025 became a cofounder and chief scientist of Thinking Machines Lab, the company founded by former OpenAI chief technology officer Mira Murati.¹²

Reinforcement Learning and the Road to OpenAI

Schulman studied physics at Caltech, briefly pursued neuroscience, and took his doctorate in computer science at the University of California, Berkeley, where he was advised by Pieter Abbeel and worked on robotics and reinforcement learning. During and shortly after his PhD he developed two foundational reinforcement-learning algorithms. The first was Trust Region Policy Optimization (TRPO), with coauthors including Sergey Levine and Michael I. Jordan. The second was Proximal Policy Optimization (PPO), published in 2017, which simplified TRPO while preserving its stability and became one of the most widely used and most cited methods in the field.³⁴

He joined OpenAI as a member of its founding team in 2015, while completing his doctorate at Berkeley. PPO later became the workhorse algorithm of RLHF, the technique that fine-tunes a language model on human preference judgments to make its outputs more helpful and less harmful. OpenAI used RLHF to turn its raw GPT models into the aligned systems that culminated in ChatGPT. Schulman co-led OpenAI's post-training team and is widely credited as a leader of the effort that produced ChatGPT, which was released in November 2022.⁴⁵

Anthropic, Then Thinking Machines Lab

On August 5, 2024, Schulman announced that he was leaving OpenAI to join Anthropic, the rival lab founded by former OpenAI researchers with a focus on AI safety. He wrote that the move "stems from my desire to deepen my focus on AI alignment, and to start a new chapter of my career where I can return to hands-on technical work," and he joined Anthropic's alignment-science team. His departure was part of a wave of exits from OpenAI's safety and research leadership in 2024, which included the chief scientist Ilya Sutskever earlier that year and Murati the following month.¹⁶

Schulman's time at Anthropic was brief. In February 2025 he left to cofound Thinking Machines Lab with Murati, taking the role of chief scientist, in a company assembled largely from former OpenAI researchers including Barret Zoph and Bob McGrew. In July 2025 the lab raised a seed round of roughly $2 billion at a $12 billion valuation, led by Andreessen Horowitz, with Schulman among its named cofounders.²⁷

Sources

1. "OpenAI Co-Founder John Schulman Departs for AI Rival Anthropic," Bloomberg, August 6, 2024, on the August 2024 move and the "deepen my focus on AI alignment" statement. https://www.bloomberg.com/news/articles/2024-08-06/openai-co-founder-john-schulman-departs-for-ai-rival-anthropic
2. "OpenAI cofounder John Schulman is joining Mira Murati's startup after brief stint at Anthropic," Fortune, February 6, 2025, on his Anthropic departure and joining Thinking Machines Lab as chief scientist. https://www.fortune.com/2025/02/06/openai-john-schulman-mira-muratis-startup-anthropic
3. "Proximal Policy Optimization Algorithms," Schulman et al., 2017, the PPO paper. https://arxiv.org/abs/1707.06347
4. "Proximal Policy Optimization (PPO): The Key to LLM Alignment," Cameron R. Wolfe, on Schulman's Caltech and Berkeley training under Abbeel, the TRPO and PPO work, and PPO's role in RLHF. https://cameronrwolfe.substack.com/p/proximal-policy-optimization-ppo
5. "How John Schulman Created OpenAI," KITRUM, on his joining the OpenAI founding team in 2015, leading the creation of ChatGPT, and co-leading the post-training team. https://kitrum.com/blog/the-inspiring-story-john-schulman-co-founder-of-openai/
6. "OpenAI co-founder John Schulman joins rival LLM developer Anthropic," SiliconANGLE, August 6, 2024, on the move and the broader departures from OpenAI. https://siliconangle.com/2024/08/06/openai-co-founder-john-schulman-joins-rival-llm-developer-anthropic/
7. "Mira Murati's Thinking Machines Lab is worth $12B in seed round," TechCrunch, July 15, 2025, on the seed round, valuation, and the cofounders including Schulman. https://techcrunch.com/2025/07/15/mira-muratis-thinking-machines-lab-is-worth-12b-in-seed-round/
