We've written a policy research paper identifying four strategies that can be used today to improve the likelihood of long-term industry cooperation on safety norms in AI: communicating risks and benefits, technical collaboration, increased transparency, and incentivizing standards. Our analysis shows that industry cooperation on safety will be instrumental in ensuring that AI systems are safe and beneficial, but competitive pressures could lead to a collective action problem, potentially causing AI companies to under-invest in safety. We hope these strategies will encourage greater cooperation on the safe development of AI and lead to better global outcomes from AI.
It’s important to ensure that it’s in the economic interest of companies to build and release AI systems that are safe, secure, and socially beneficial. This is true even if we think AI companies and their employees have an independent desire to do this, since AI systems are more likely to be safe and beneficial if the economic interests of AI companies are not in tension with their desire to build their systems responsibly.
This claim might seem redundant because developing and deploying products that do not pose a risk to society is generally in a company’s economic interest. People wouldn’t pay much for a car without functioning brakes, for example. But if multiple companies are trying to develop a similar product, they can feel pressure to rush it to market, resulting in less safety work prior to release.
Such problems generally arise in contexts where external regulation is weak or non-existent. Appropriate regulation of goods and services provided in the marketplace can reduce corner-cutting on safety. This can benefit the users of goods and services as well as the sector itself. The airline sector as a whole, for example, benefits commercially from the fact that governments around the world are vigilant about safety and that when incidents occur, they are always investigated in detail. Conventional regulatory mechanisms may be less effective in dealing with AI, however, due to the rate at which the technology is developing and the large information asymmetries between developers and regulators.
Our paper explores what factors might drive or dampen a rush to deployment among AI developers, and suggests strategies for improving cooperation between them. Developers “cooperate” not by ceasing to compete but by taking appropriate safety precautions, and they are more likely to do this if they are confident their competitors will do the same.
The Need for Collective Action on Safety
If companies respond to competitive pressures by rushing a technology to market before it has been deemed safe, they will find themselves in a collective action problem. Even if each company would prefer to compete to develop and release systems that are safe, many believe they can’t afford to do so because they might be beaten to market by other companies. Problems like this can be mitigated by greater industry cooperation on safety. AI companies can work to develop industry norms and standards that ensure systems are developed and released only if they are safe, and can agree to invest resources in safety during development and meet appropriate standards prior to release.
Some hypothetical scenarios:
A company develops an image recognition model with very high performance and is in a rush to deploy it at scale, but the engineers at the company have not yet adequately evaluated the system's performance in the real world. The company also knows that it lacks testing standards adequate to map the full "capability surface" of the model. Due to fears of being beaten to market by competitors in a particular niche, however, the company moves forward, gambling that its limited in-house testing will suffice to hedge against any major system failures or public blowback.
A company wishes to deploy some semi-autonomous AI software onto physical robots, such as drones. This software has a failure rate that satisfies regulatory criteria. Because the company is racing to get the technology to market, however, it ships the product even though it knows that the popular "interpretability" feature gives misleading explanations that are intended more for reassurance than clarification. Due to limited expertise among regulators, this misbehavior falls through the cracks until a catastrophic incident occurs, as does similar behavior by other companies racing to deploy similarly "interpretable" systems.
Some collective action problems are more solvable than others. In general, a collective action problem is more solvable if the expected benefits of cooperating outweigh the expected benefits of not cooperating. The following interrelated factors increase the expected benefits of cooperating:
Companies are more likely to cooperate on safety if they can trust that other companies will reciprocate by working towards a similar standard of safety. Among other things, trust that others will develop AI safely can be established by increasing transparency about resources being invested in safety, by publicly committing to meet a high standard of safety, and by engaging in joint work to find acceptable safety benchmarks.
Companies have a stronger incentive to cooperate on safety if the mutual benefits from safe development are higher. The prospect of cooperation can be improved by highlighting the benefits of establishing good safety norms early, such as preventing incidents of AI failure and misuse, and establishing safety standards that are based on a shared understanding of emerging AI systems. Collaborative efforts like Risk Salon, which hosts events for people working in fraud, risk, and compliance, are a good example of this. These events facilitate open discussions between participants from different companies, and seem to be primarily motivated by the shared gain of improved risk mitigation strategies.
Reducing the harms that companies expect to incur if another company decides not to cooperate on safety increases the likelihood that they themselves will abide by safety standards. Exposure can be reduced by discouraging violations of safety standards (e.g. by reporting them) or by providing evidence of the potential risks associated with systems that don’t meet the relevant standards. When standards must be met to enter a market, for example, companies have little to lose if others don’t meet those standards. To comply with the RoHS directive, electronics manufacturers had to switch to lead-free soldering in order to sell their products in the EU. The possibility that one manufacturer would continue to use lead soldering did little to undermine cooperation with lead-reduction efforts, since that manufacturer's failure to comply would not be costly to the others.
Reducing any advantages companies can expect to get by not cooperating on safety should increase overall compliance with safety standards. For example, companies producing USB connectors don’t expect to gain much from deviating from USB connector standards, because doing so will render their product incompatible with most devices. Advantage is low when standards have already been established and the cost of deviating from them exceeds any benefit. In the context of AI, reducing the cost and difficulty of implementing safety precautions would help minimize the temptation to ignore them. Additionally, governments can foster a regulatory environment in which violating high-stakes safety standards is prohibited.
Identifying the ways in which AI systems could fail if adequate precautions are not taken can increase the likelihood that AI companies will agree not to develop or release such systems. Shared downsides incentivize cooperation when failures are particularly harmful, especially if they are felt by the whole industry (e.g. by damaging public trust in the industry as a whole). After the Three Mile Island incident, for example, the nuclear power industry created and funded the Institute of Nuclear Power Operations (INPO), a private regulator with the ability to evaluate plants and share the results of these evaluations within the industry in order to improve operational safety.
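The comparison between the expected benefits of cooperating and of not cooperating can be made concrete with a toy two-company model. The sketch below is an illustration only and is not the formulation used in the paper: the names Payoffs, expected_cooperate, expected_defect, and cooperation_favored, the payoff structure, and all numbers are assumptions chosen for the example. Trust is treated as the probability that the other company cooperates, and the remaining four factors appear as payoffs in arbitrary units.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Payoffs:
    """Hypothetical payoffs to one company, in arbitrary units (assumed values)."""
    shared_upside: float    # gain when both companies cooperate on safety
    exposure: float         # loss when we cooperate and the other does not
    advantage: float        # gain when we defect and the other cooperates
    shared_downside: float  # loss when neither company cooperates


def expected_cooperate(trust: float, p: Payoffs) -> float:
    """Expected payoff of cooperating, given the probability the other cooperates."""
    return trust * p.shared_upside - (1.0 - trust) * p.exposure


def expected_defect(trust: float, p: Payoffs) -> float:
    """Expected payoff of not cooperating, given the same probability."""
    return trust * p.advantage - (1.0 - trust) * p.shared_downside


def cooperation_favored(trust: float, p: Payoffs) -> bool:
    """True when cooperating has the higher expected payoff in this toy model."""
    return expected_cooperate(trust, p) > expected_defect(trust, p)


# Assumed baseline: cooperation pays off only if the other company is likely to reciprocate.
baseline = Payoffs(shared_upside=10.0, exposure=8.0, advantage=6.0, shared_downside=2.0)

print(cooperation_favored(0.5, baseline))                         # False: trust too low
print(cooperation_favored(0.8, baseline))                         # True: higher trust
print(cooperation_favored(0.5, replace(baseline, exposure=3.0)))  # True: lower exposure
```

In this sketch, raising trust or the shared upside, lowering exposure or advantage, or raising the shared downside each moves the comparison toward cooperation, which mirrors the direction of the five factors described above. The model ignores many things a real analysis would need, such as more than two companies and repeated interactions.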
Collective action problems are susceptible to negative spirals where the loss of trust causes one party to stop cooperating, causing other parties to stop cooperating. At the same time, it is also possible to generate positive spirals where the development of trust causes some parties to cooperate, resulting in other parties cooperating.
We've found four strategies that can be used today to improve the likelihood of cooperation on safety norms and standards in AI. These are:
1. Promote accurate beliefs about the opportunities for cooperation
Communicate the safety and security risks associated with AI, show that concrete steps can be taken to promote cooperation on safety, and make shared concerns about safety common knowledge.
2. Collaborate on shared research and engineering challenges
Engage in joint interdisciplinary research that promotes safety and is otherwise conducive to fostering strong collaboration (e.g. work that involves combining complementary areas of expertise).
3. Open up more aspects of AI development to appropriate oversight and feedback
Publicize codes of conduct, increase transparency about publication-related decision-making, and, provided that security and IP concerns are addressed, open up individual AI systems to greater scrutiny.
4. Incentivize adherence to high standards of safety
Commend those that adhere to safety standards, reproach failures to ensure that systems are developed safely, and support economic, legal, or industry-wide incentives to adhere to safety standards.
We think collective action problems may be a principal source of policy challenges as AI systems become increasingly powerful. This analysis focuses on the roles that industry can play in preventing such problems, but we anticipate that legal and political mechanisms will also play an important role in preventing and mitigating these issues. We also anticipate that identifying similar mechanisms to improve cooperation on AI safety between states and with other non-industry actors will be of increasing importance in the years to come. There is a great deal of uncertainty about the challenges that future AI systems may pose, but we believe that encouraging greater cooperation on the safe development of AI is likely to have a positive impact on the outcomes of AI development.
While we acknowledge that such challenges exist, we advocate for a more thorough mapping of possible collaborations across organizational and national borders, with particular attention to research and engineering challenges whose solutions might be of wide utility. Areas to consider might include joint research into the formal verification of AI systems' capabilities and other aspects of AI safety and security with wide applications; various applied "AI for good" projects whose results might have wide-ranging and largely positive applications (e.g. in domains like sustainability and health); and joint development of countermeasures against global AI-related threats such as the misuse of synthetic media generation online. To achieve greater cooperation on safety, we need to make it common knowledge that such cooperation is in everyone’s interest, and that methods for achieving it can be identified, researched, and implemented today.