Spinning Up in Deep RL: Workshop review

On February 2, we held our first Spinning Up Workshop as part of our new education initiative at OpenAI.

Illustration: Ruby Chen

February 26, 2019

We hosted ~90 people at our office and engaged nearly 300 more through our livestream. Participants came from a wide range of backgrounds, including academia, software engineering, data science, ML engineering, medicine, and education. This workshop built off our Spinning Up in Deep RL resource package and took a deeper dive into RL algorithm design, robotics, and building safe AI systems.

Person speaking into a microphone in front of a room with a live audience

Building educational tools

One of the goals for education at OpenAI is to help people develop the skills needed to participate in research and development in AI—especially in deep RL, a core area of research at OpenAI. From our experience working with Scholars and Fellows, we’ve found that the key ingredients for skill development are:

  1. a flexible curriculum that includes core material and a review of research frontiers,
  2. mentorship and discussions with experts, and
  3. having the students work on projects that are at the right level to help them grow.

The challenge for education at OpenAI is to figure out how to deliver these at scale. While sharing a curriculum at scale is relatively easy, it isn’t obvious how to scale up mentorship and guidance on projects. Our working theory is that workshops might help us do just that. Our first Spinning Up workshop has given us several positive signs that this is a useful direction, and we’re excited to share what we learned.

The crowd

A large audience listening intently while looking ahead

We hosted around 90 people at our office and involved nearly 300 more through our livestream. Our guests came from a wide range of backgrounds, including academic research, software engineering, data science, ML engineering, medicine, and education. The level of ML experience varied quite significantly across the group, from “almost none” to “built their own Dota bot!”

More than 500 people, from all around the world, applied to participate in this workshop. Although we sadly couldn’t invite everyone to this one because of space constraints, we want to continue engaging the community with future events.

The talks

The workshop kicked off with three hours of talks. To start us off, Joshua Achiam laid out the conceptual foundations of reinforcement learning and gave an overview of different kinds of RL algorithms. If you’d like to study this material, check out Spinning Up in Deep RL.
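
For readers who want to connect the talk’s framing to code, here is a minimal sketch of the agent-environment loop that underlies every algorithm covered: the agent picks actions, the environment returns observations and rewards, and the agent’s objective is to maximize the return it collects. The snippet assumes the classic OpenAI Gym API and uses a random policy purely as a stand-in for a learned one.

```python
import gym

env = gym.make("CartPole-v0")
obs = env.reset()
done = False
episode_return = 0.0
while not done:
    action = env.action_space.sample()          # stand-in for a learned policy
    obs, reward, done, info = env.step(action)  # environment transitions and emits a reward
    episode_return += reward                    # RL algorithms try to maximize this in expectation
print("episode return:", episode_return)
```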

Matthias Plappert presented on OpenAI’s recent work training a dexterous robot hand in simulation to manipulate objects in the real world. Domain randomization, recurrent neural networks, and large-scale distributed training were necessary ingredients in bridging the “sim2real” gap for this task.
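
As a rough illustration of the domain randomization idea (not the actual setup used for the robot hand), the sketch below resamples a handful of simulator parameters at the start of every episode, so a policy trained across many episodes cannot overfit to any single set of physics. The parameter names, ranges, and the `env.set_physics` hook are all hypothetical.

```python
import random

def sample_sim_params():
    # Illustrative parameters and ranges; a real simulator exposes many more.
    return {
        "object_mass": random.uniform(0.05, 0.5),       # kg
        "surface_friction": random.uniform(0.5, 1.5),
        "actuator_gain": random.uniform(0.8, 1.2),
        "obs_noise_std": random.uniform(0.0, 0.02),
    }

def run_randomized_episode(env, policy):
    # `env.set_physics` is a hypothetical hook for pushing the sampled
    # parameters into the simulator before the episode starts.
    env.set_physics(**sample_sim_params())
    obs, done = env.reset(), False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
```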

Dario Amodei, who leads the Safety Team at OpenAI, presented an overview of problems in AI safety and recent work in this space. He described the central safety problem: correctly specifying agent behavior is hard! It is easy to inadvertently give agents incentives to behave differently from what you actually wanted, and when agents are very powerful, this can be dangerous. Dario also described work that OpenAI and collaborators at DeepMind have done to address this issue, in which reward functions are learned from human preferences instead of being designed by hand.
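
To make the “reward functions learned from human preferences” idea concrete, here is a toy sketch of the underlying recipe: fit a reward model so that the segment a human prefers is predicted to have the higher return, using a logistic (Bradley-Terry style) loss on pairwise comparisons. The linear reward model, synthetic “human” labeler, and hyperparameters below are illustrative assumptions rather than the setup from the actual papers.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim = 8
theta = np.zeros(obs_dim)                       # linear reward model: r(s) = theta @ s

def comparison_loss_and_grad(seg_a, seg_b, prefer_a, params):
    """Cross-entropy loss and gradient for one labeled pairwise comparison."""
    ret_a = (seg_a @ params).sum()              # predicted return of segment A
    ret_b = (seg_b @ params).sum()
    p_a = 1.0 / (1.0 + np.exp(ret_b - ret_a))   # P(A preferred) = sigmoid(ret_a - ret_b)
    label = 1.0 if prefer_a else 0.0
    loss = -(label * np.log(p_a + 1e-8) + (1 - label) * np.log(1 - p_a + 1e-8))
    grad = (p_a - label) * (seg_a.sum(axis=0) - seg_b.sum(axis=0))
    return loss, grad

def human_prefers_a(seg_a, seg_b):
    # Synthetic "labeler" that prefers segments with a larger first feature.
    return seg_a[:, 0].sum() > seg_b[:, 0].sum()

for step in range(2000):
    seg_a = rng.normal(size=(10, obs_dim))      # two 10-step trajectory segments
    seg_b = rng.normal(size=(10, obs_dim))
    _, grad = comparison_loss_and_grad(seg_a, seg_b, human_prefers_a(seg_a, seg_b), theta)
    theta -= 0.05 * grad                        # plain SGD on the preference loss
```

After enough comparisons, the first component of theta should dominate, since that is the only feature the synthetic labeler cares about; the learned reward can then stand in for a hand-designed one when training a policy.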

The afternoon

The workshop continued into the afternoon with a semi-structured program of hacking and breakout sessions. Participants were able to seek guidance on project ideas and research tips from our slate of volunteers, which included Amanda Askell, Alex Ray, Daniel Ziegler, Dylan Hadfield-Menell, Ethan Knight, Karl Cobbe, Matthias Plappert, and Sam McCandlish.

A group of presenters standing in front of a projection, facing a live audience

The breakout sessions turned out to be the main highlight of the afternoon. Whereas the morning talks covered the conceptual foundations of RL, the breakout sessions were designed to help participants boost their implementation and research skills.

Group of people sitting together around a table, focused on their laptops

In the first session, Karl Cobbe gave an introduction to TensorFlow, a key library used in deep learning research. In the second session, “Writing DQN Together,” Daniel Ziegler led participants step-by-step through the process of implementing a deep RL algorithm. In the third session, “Advanced RL Q&A,” Joshua Achiam described recent research frontiers in RL and took audience questions about doing RL research.
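
For a flavor of what “Writing DQN Together” involved, here is a compact sketch of the core DQN update: sample a minibatch of transitions from a replay buffer, form the Bellman target with a frozen target network, and regress the online network’s Q-values toward it. This is not the session’s code; it uses PyTorch rather than the TensorFlow covered in the first breakout, and the network sizes and hyperparameters are illustrative.

```python
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

obs_dim, n_actions, gamma = 4, 2, 0.99

q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())  # target starts as a copy of the online net
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                   # stores (obs, action, reward, next_obs, done)

def dqn_update(batch_size=32):
    """One gradient step on a minibatch sampled from the replay buffer."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    obs, act, rew, next_obs, done = (
        torch.as_tensor(np.array(x), dtype=torch.float32) for x in zip(*batch)
    )
    # Q-values of the actions that were actually taken
    q = q_net(obs).gather(1, act.long().unsqueeze(1)).squeeze(1)
    # Bellman target: reward plus discounted value of the best next action,
    # computed with the frozen target network and zeroed at terminal states
    with torch.no_grad():
        target = rew + gamma * (1 - done) * target_net(next_obs).max(dim=1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

A full training loop would additionally act epsilon-greedily in the environment to fill the replay buffer and periodically copy the online weights into the target network.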

People sitting around large tables, working on their laptops, and talking in a crowded room.

Our takeaways

This was our first experiment with the workshop format, and we were generally pleased with the outcome. In particular, we found it quite gratifying to work directly with such a capable and enthusiastic group of participants. The experience, along with feedback from the group, gave us a good sense of what to keep and what to change for future workshops.

What worked: We asked our participants what their highlights were, and these responses are a fairly representative sample:

“Learning A TON in a very safe, friendly environment where everyone was mainly on the same level in terms of learning.”
“I thought the ability to get one-on-one help and to take on some ‘paired programming’-like time with folks who really know what they’re doing was incredibly helpful. The enthusiasm of the volunteers was also very high, and I felt very encouraged to ask for help.”


Responses like these gave us a sense that the workshop format shines at delivering “mentorship and discussions with experts.”

Two people working together on their laptops

What could be improved: We asked our participants what they thought we could have done differently to enhance their experience, and received responses like:

“I would’ve liked a presentation section of potential projects that we could pursue based on our experience level.”
“Extend the workshop to two days.”

Many participants either weren’t sure what to work on during the hackathon or didn’t have enough time to make significant progress on their project.

We think this kind of feedback is a good indicator that the one-day workshop format isn’t enough to “have the students work on projects that are at the right level to help them grow” in RL. In the future, we’ll consider running longer events so we can meet that goal. This feedback also suggests that we should do more to create “shovel-ready” RL projects that participants can jump right into.

Side profile of a person sitting with earbuds in their ears, looking intently at a laptop screen of a 3D checkerboard environment

What else? Aside from the technical content of the workshop, creating a supportive and inclusive environment was top-of-mind for us, and participants told us this was important for their experience. One piece of feedback read:

“This is the first non-female exclusive social event I’ve been to in Silicon Valley with ~50% women in the room. It was so shocking that I thought I was in the wrong room in the beginning. It was noticeably easier to socialize as a result of the gender balance, so thank you for that.”
Two people standing and talking while holding food and beverages

What’s next

OpenAI’s Charter gives us a mandate “to create a global community working together to address AGI’s global challenges,” and we’ll continue developing education at OpenAI to help serve that goal. This includes more work on resources like Spinning Up in Deep RL and more events like this Spinning Up Workshop. We are currently planning a second workshop with CHAI at Berkeley, which we expect to formally announce soon.

If you would like to help us do research on RL or teach people about AI, please get in touch! We’re hiring.

Thanks to Maddie Hall and Loren Kwan for co-organizing the event, to Ian Atha for livestreaming and recording the lectures, as well as helping participants with Python and TensorFlow issues, and to Blake Tucker for filming and photography!
