John Schulman

15 posts

Procgen Benchmark

Retro Contest: Results

OpenAI Fellows Fall 2018

Gym Retro

Retro Contest

We're launching a transfer learning contest that measures a reinforcement learning algorithm's ability to generalize from previous experience.


4 minute read

Reptile: A Scalable Meta-Learning Algorithm

Requests for Research 2.0

Learning a Hierarchy

OpenAI Baselines: ACKTR & A2C

We're releasing two new OpenAI Baselines implementations: ACKTR and A2C. A2C is a synchronous, deterministic variant of Asynchronous Advantage Actor Critic (A3C), which we've found performs as well as A3C.


4 minute read

Proximal Policy Optimization

We’re releasing a new class of reinforcement learning algorithms, Proximal Policy Optimization (PPO), which perform comparably to or better than state-of-the-art approaches while being much simpler to implement and tune.


3 minute read

OpenAI Baselines: DQN

We're open-sourcing OpenAI Baselines, our internal effort to reproduce reinforcement learning algorithms with performance on par with published results. We'll release the algorithms over upcoming months; today's release includes DQN and three of its variants.


4 minute read

Roboschool

Evolution Strategies as a Scalable Alternative to Reinforcement Learning

We've discovered that evolution strategies (ES), an optimization technique that's been known for decades, rivals the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks, while overcoming many of RL's inconveniences.


12 minute read

Generative Models

This post describes four projects that share a common theme of enhancing or using generative models, a branch of unsupervised learning techniques in machine learning.


12 minute read

OpenAI Gym Beta