Deep Double Descent
We're releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents which respect safety constraints while training.
GPT-2: 1.5B Release
Solving Rubik’s Cube with a Robot Hand
We've trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand.
OpenAI Scholars Spring 2020
Fine-Tuning GPT-2 from Human Preferences
Emergent Tool Use from Multi-Agent Interaction
We've observed agents discovering progressively more complex tool use while playing a simple game of hide-and-seek.
Testing Robustness Against Unforeseen Adversaries
GPT-2: 6-Month Follow-Up
Microsoft Invests In and Partners with OpenAI to Support Us Building Beneficial AGI
Why Responsible AI Development Needs Cooperation on Safety
OpenAI Robotics Symposium 2019
OpenAI Scholars Spring 2019: Final Projects
OpenAI Fellows Fall 2018: Final Projects
We’ve created MuseNet, a deep neural network that can generate 4-minute musical compositions with 10 different instruments, and can combine styles from country to Mozart to the Beatles.
Generative Modeling with Sparse Transformers
OpenAI Five Defeats Dota 2 World Champions
OpenAI Five is the first AI to beat the world champions in an esports game, having won two back-to-back games versus the world champion Dota 2 team, OG, at Finals this weekend.
OpenAI Five Finals
Implicit Generation and Generalization Methods for Energy-Based Models
OpenAI Scholars Spring 2019
We've created OpenAI LP, a new "capped-profit" company that allows us to rapidly increase our investments in compute and talent while including checks and balances to actualize our mission.
Introducing Activation Atlases
We’ve created activation atlases (in collaboration with researchers from Google Brain), a new technique for visualizing interactions between neurons.
Neural MMO: A Massively Multiagent Game Environment
Spinning Up in Deep RL: Workshop Review
AI Safety Needs Social Scientists
Better Language Models and Their Implications
We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization.
OpenAI Fellows Summer 2018: Final Projects
How AI Training Scales
We've discovered that the gradient noise scale, a simple statistical metric, predicts the parallelizability of neural network training on a wide range of tasks.
Quantifying Generalization in Reinforcement Learning
Spinning Up in Deep RL
We’re releasing Spinning Up in Deep RL, an educational resource designed to let anyone learn to become a skilled practitioner in deep reinforcement learning. Spinning Up consists of crystal-clear examples of RL code, educational exercises, documentation, and tutorials.
Learning Concepts with Energy Functions
Reinforcement Learning with Prediction-Based Rewards
Learning Complex Goals with Iterated Amplification
OpenAI Scholars Winter 2019
OpenAI Fellows Winter 2019 & Interns Summer 2019
OpenAI Scholars 2018: Final Projects
The International 2018: Results
OpenAI Five Benchmark: Results
We've trained a human-like robot hand to manipulate physical objects with unprecedented dexterity.
OpenAI Scholars 2018
OpenAI Five Benchmark
Glow: Better Reversible Generative Models
Learning Montezuma's Revenge from a Single Demonstration
Our team of five neural networks, OpenAI Five, has started to defeat amateur human teams at Dota 2.
Retro Contest: Results
Improving Language Understanding with Unsupervised Learning
We've obtained state-of-the-art results on a suite of diverse language tasks with a scalable, task-agnostic system, which we're also releasing. Our approach is a combination of two existing ideas: transformers and unsupervised pre-training.
OpenAI Fellows Fall 2018
AI and Compute
AI Safety via Debate
Evolved Policy Gradients
We're releasing a charter that describes the principles we use to execute on OpenAI's mission. This document reflects the strategy we've refined over the past two years, including feedback from many people internal and external to OpenAI.
We're launching a transfer learning contest that measures a reinforcement learning algorithm's ability to generalize from previous experience.
Report from the OpenAI Hackathon
Reptile: A Scalable Meta-Learning Algorithm
Ingredients for Robotics Research
We're releasing eight simulated robotics environments and a Baselines implementation of Hindsight Experience Replay, all developed for our research over the past year. We've used these environments to train models which work on physical robots.
Preparing for Malicious Uses of AI
Interpretable Machine Learning through Teaching
Discovering Types for Entity Disambiguation
Requests for Research 2.0
Scaling Kubernetes to 2,500 Nodes
Block-Sparse GPU Kernels
We’re releasing highly-optimized GPU kernels for an underexplored class of neural network architectures: networks with block-sparse weights. Depending on the chosen sparsity, these kernels can run orders of magnitude faster than cuBLAS or cuSPARSE.
Learning a Hierarchy
Generalizing from Simulation
Meta-Learning for Wrestling
We've found that self-play allows simulated AIs to discover physical skills like tackling, ducking, faking, kicking, catching, and diving for the ball, without explicitly designing an environment with these skills in mind.
Nonlinear Computation in Deep Linear Networks
Learning to Model Other Minds
OpenAI Baselines: ACKTR & A2C
We're releasing two new OpenAI Baselines implementations: ACKTR and A2C. A2C is a synchronous, deterministic variant of Asynchronous Advantage Actor Critic (A3C) which we've found gives equal performance.
More on Dota 2
Gathering Human Feedback
Better Exploration with Parameter Noise
We've found that adding adaptive noise to the parameters of reinforcement learning algorithms frequently boosts performance. This exploration method is simple to implement and very rarely decreases performance, so it's worth trying on any problem.
Proximal Policy Optimization
We’re releasing a new class of reinforcement learning algorithms, Proximal Policy Optimization (PPO), which perform comparably to or better than state-of-the-art approaches while being much simpler to implement and tune.
Robust Adversarial Examples
Faster Physics in Python
Learning from Human Preferences
Learning to Cooperate, Compete, and Communicate
OpenAI Baselines: DQN
We're open-sourcing OpenAI Baselines, our internal effort to reproduce reinforcement learning algorithms with performance on par with published results. We'll release the algorithms over upcoming months; today's release includes DQN and three of its variants.
Robots that Learn
We've created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.
Unsupervised Sentiment Neuron
We’ve developed an unsupervised system which learns an excellent representation of sentiment, despite being trained only to predict the next character in the text of Amazon reviews.
Spam Detection in the Physical World
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
We've discovered that evolution strategies (ES), an optimization technique that's been known for decades, rivals the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks, while overcoming many of RL's inconveniences.
Learning to Communicate
Attacking Machine Learning with Adversarial Examples
Faulty Reward Functions in the Wild
We're releasing Universe, a software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications.
OpenAI and Microsoft
Report from the Self-Organizing Conference
Infrastructure for Deep Learning
Deep learning is an empirical science, and the quality of a group's infrastructure is a multiplier on progress. Fortunately, today's open-source ecosystem makes it possible for anyone to build great deep learning infrastructure.
Machine Learning Unconference
Concrete AI Safety Problems
OpenAI Technical Goals
OpenAI’s mission is to build safe AI, and ensure AI's benefits are as widely and evenly distributed as possible. We’re trying to build AI as part of a larger community, and we want to share our plans and capabilities along the way.
This post describes four projects that share a common theme of enhancing or using generative models, a branch of unsupervised learning techniques in machine learning.