We would not be an AI newsletter without covering the past week’s releases from Google and Microsoft, but we will use the chance to introduce the concept of AI race dynamics and explain why researchers are getting more cynical.
Watch this week's MLAISU on YouTube or listen to it on Spotify.
Understanding Race Dynamics
This week, Microsoft debuted an updated version of Bing that relies heavily on OpenAI's GPT-4, the latest state-of-the-art language model. In response, Google announced its own "Bard" model, intended to enhance its future search capabilities. Microsoft's presentation was well received and informative, while Google's was criticized for a factual error in its demo and a lack of detail.
Microsoft CEO Satya Nadella frames this as a competition over the most profitable digital product: search. He has also reportedly discussed AI alignment with Sam Altman and his team, as suggested by his appropriate use of the term "alignment" across multiple interviews.
Nadella emphasized that before delving into AI safety and alignment, it is crucial to understand the context in which AI is used. He stated, "We should start by using these models in situations where humans are clearly in charge." Scaling oversight is a good idea, but we probably still need to think about safety from first principles.
Screenshot of the Bing search experience (TechCrunch)
Google has invested $300 million in the AI safety research organization Anthropic and now holds stakes in both DeepMind and Anthropic, while Microsoft has focused on exclusive deals with and partial ownership of OpenAI.
This competition, referred to as an "AI race," is a high-risk scenario in AI development that accelerates progress while potentially reducing the emphasis on safety considerations. According to "The Singularity Hypothesis," AI development can be viewed as a winner-takes-all game if AI rapidly improves itself through knowledge generation, creating an incentive for a small group to reach the finish line first. This could lead to dangerous consequences due to the speed at which the technology advances.
David Leslie of the Alan Turing Institute spoke on Bloomberg about this issue, noting that the rapid pace of technology releases poses a risk to ethical use and development. Luciano Floridi, covered in last week's newsletter, also pointed out the dangers of AI, including the possibility of taking the opportunities it provides to an extreme and thereby reducing human autonomy, self-realization, self-determination, and responsibility.
The risks of today's AI products are a short-term concern, but we must also be mindful of the potential for an "AI arms race." Haydn Belfield, a previous keynote speaker, highlights this in his award-winning article in the Bulletin of the Atomic Scientists, warning against carelessly extending the concept of arms races to artificial intelligence.
In his analysis, Belfield explores what drove the atomic arms race and how it accelerated the development of fission weapons. He identifies three key takeaways for preventing similar race dynamics in the future:
Verify that a race is actually taking place before racing; do not develop artificial general intelligence without due process.
Be wary of secrecy, which can create false perceptions, as seen with the supposed "missile gap" between the US and the USSR in the late 1950s.
Most importantly, scientists hold significant power and must avoid using it in ways that could harm humanity, as the Einstein–Szilard letter demonstrates.
In conclusion, race dynamics are a dangerous force in the development of world-altering technologies like atomic bombs and artificial intelligence. As a community, we must take care and consider the consequences of our actions.
Join our Scale Oversight Hackathon today to help mitigate the risks from the large models that an AI race may produce. The hackathon runs for just a few hours on Saturday, or you can attend the introductory talk in a few hours.
Other research
After focusing on the race dynamics that will be a worrying part of the coming ten years, let's briefly look at a couple of papers from this week's AI safety research.
Chughtai, Chan, and Nanda explore the universality hypothesis for circuits in neural networks: the important assumption that the algorithms learned by neural networks will generally be the same across different models of the same architecture. One minimal way to probe this empirically is to compare the internal representations of independently trained models, as in the sketch below.
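As a rough, hypothetical illustration (not the setup from the paper), one can train two small MLPs on the same toy task from different random seeds and compare their hidden-layer representations with linear centered kernel alignment (CKA): a score near 1 is weak evidence for universality, while a score near 0 suggests the networks learned different internal features. The task, architecture, and hyperparameters here are all placeholder choices.

```python
# Illustrative sketch: train two small MLPs on the same toy task with different
# seeds, then compare their hidden representations with linear CKA.
import numpy as np
import torch
import torch.nn as nn

def make_data(n=2048):
    # Toy binary classification task with a fixed nonlinear decision rule.
    x = torch.randn(n, 10)
    y = ((x[:, 0] * x[:, 1] + x[:, 2]) > 0).float().unsqueeze(1)
    return x, y

def train_mlp(seed, x, y, hidden=64, steps=500):
    torch.manual_seed(seed)
    model = nn.Sequential(nn.Linear(10, hidden), nn.ReLU(), nn.Linear(hidden, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return model

def hidden_activations(model, x):
    # Post-ReLU activations: the "features" we compare across models.
    with torch.no_grad():
        return model[1](model[0](x)).numpy()

def linear_cka(a, b):
    # Linear CKA between two activation matrices of shape (n_examples, n_features);
    # higher values mean more similar representations.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    hsic = np.linalg.norm(b.T @ a, "fro") ** 2
    return hsic / (np.linalg.norm(a.T @ a, "fro") * np.linalg.norm(b.T @ b, "fro"))

x, y = make_data()
model_a, model_b = train_mlp(0, x, y), train_mlp(1, x, y)
print("CKA between seeds 0 and 1:",
      linear_cka(hidden_activations(model_a, x), hidden_activations(model_b, x)))
```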
Yu, Gao et al. find that modeling human biases in interactive environments as hidden reward functions makes reinforcement learning agents both better performing and more helpful. In essence, the agent learns a model of the human's biases alongside a model that "understands" what the human player wants and does, as illustrated in the sketch below.
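As a simplified, hypothetical sketch of that general idea (not the paper's actual environment or method), the toy example below has a human choose among options according to their hidden preferences plus a myopic "pick what is near" bias. The agent jointly fits the hidden reward vector and the bias strength by maximum likelihood, separating what the human wants from how the human is biased.

```python
# Illustrative sketch: infer a hidden reward while explicitly modeling a bias.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
K, n_rounds = 5, 500

true_reward = rng.normal(size=K)   # hidden preferences over the K options
true_bias = 1.5                    # strength of a myopic "pick what's near" bias

# Each round, the K options appear at shuffled distances from the human.
distances = np.array([rng.permutation(K) for _ in range(n_rounds)], dtype=float)

def choice_probs(reward, bias, dist):
    logits = reward - bias * dist  # utility = preference minus effort
    logits = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)

# Simulate biased human choices, one per round.
probs = choice_probs(true_reward, true_bias, distances)
choices = np.array([rng.choice(K, p=p) for p in probs])

def neg_log_likelihood(params):
    reward, bias = params[:K], params[K]
    p = choice_probs(reward, bias, distances)
    return -np.sum(np.log(p[np.arange(n_rounds), choices] + 1e-12))

fit = minimize(neg_log_likelihood, x0=np.zeros(K + 1), method="L-BFGS-B")
est_reward, est_bias = fit.x[:K], fit.x[K]

# The reward is only identified up to an additive constant, so compare centered values.
print("true reward (centered):", np.round(true_reward - true_reward.mean(), 2))
print("est. reward (centered):", np.round(est_reward - est_reward.mean(), 2))
print("true / est. bias strength:", true_bias, round(est_bias, 2))
```

Because the bias is modeled explicitly, the recovered reward reflects the human's underlying preferences rather than their biased behavior, which is the intuition behind why such agents can end up more helpful.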
Opportunities
For this week’s opportunities, we have some unique events:
Join the Predictable AI day in Valencia with the wonderful Irina Rish.
Join EA Global London, happening in May, with applications closing a month before.
And of course, you can join our hackathon later today.
Thank you for following along with this week’s ML and AI safety update!