
210 | Breaking Analysis | David vs Goliath reimagined – OpenAI’s approach to AI supervision

Artificial general intelligence, or AGI, has people both intrigued and fearful. Last July, OpenAI, a leading researcher in the field, introduced the concept of superalignment via a team created to study the scientific and technical breakthroughs needed to guide and ultimately control AI systems that are much more capable than humans. OpenAI refers to this level of AI as superintelligence. Last week, this team unveiled the first results of an effort to supervise more powerful AI with less powerful models. While promising, the effort showed mixed results and raises several more questions about the future of AI and the ability of humans to actually control such advanced machine intelligence.

In this Breaking Analysis we share the results of OpenAI’s superalignment research and what it means for the future of AI. We further probe ongoing questions about OpenAI’s unconventional structure which we continue to believe is misaligned with its conflicting objectives of both protecting humanity and making money. We’ll also poke at a nuanced change in OpenAI’s characterization of its relationship with Microsoft. Finally we’ll share some data that shows the magnitude of OpenAI’s lead in the market and propose some possible solutions to the structural problem faced by the industry.

OpenAI’s Superalignment Team Unveils its First Public Research

With little fanfare, OpenAI unveiled the results of new research that describes a technique to supervise more powerful AI models with a less capable large language model. The paper is called Weak-to-Strong Generalization: Eliciting Strong Capabilities with Weak Supervision. The basic premise is that superintelligent AI will be so vastly superior to humans that traditional supervision techniques, such as reinforcement learning from human feedback (RLHF), won’t scale. This “super AI,” the thinking goes, will be so sophisticated that humans won’t be able to comprehend its output. So the team set out to test whether less capable GPT-2 models can supervise more capable GPT-4 models, as a proxy for a supervision approach that could keep superintelligent systems from going rogue.

The superalignment team at OpenAI is led by Ilya Sutskever and Jan Leike. Ilya’s name is highlighted in this graphic because much of the chatter on Twitter after this paper was released suggested that Ilya was not cited as a contributor to the research. Perhaps his name was left off initially given the recent OpenAI board meltdown and then added later. Or perhaps the dozen or so commenters were mistaken but that’s unlikely. At any rate he’s clearly involved.

Can David AI Control a Superintelligent Goliath?

This graphic has been circulated around the Internet so perhaps you’ve seen it. We’ve annotated in red to add some additional color. The graphic shows that traditional machine learning models involve RLHF where the outputs of a query are presented to humans to rate them. That feedback is then pumped back into the training regimen to improve model results.

Superalignment, shown in the middle, would ostensibly involve a human trying, unsuccessfully, to supervise a far more intelligent AI, which presents a failure mode. For example, the super AI could generate millions of lines of code that mere humans wouldn’t be able to understand. The problem, of course, is that superintelligence doesn’t exist, so it can’t be tested. As a proxy, in the third scenario shown here, a less capable model (GPT-2 in this case) was set up to supervise a more advanced GPT-4 model.

For reference we pin AGI at the human level of intelligence, recognizing that definitions do vary depending on who you speak with. Regardless, the team tested this concept to see whether the smarter AI would learn bad habits from the less capable AI and become “dumber,” or whether the results would close the gap between the less capable AI’s capabilities and a known ground truth set of labels that represent correct answers.

The methodology was thoughtful. The team tested several scenarios across natural language processing, chess puzzles and reward modeling, a technique that scores responses to a prompt and uses those scores as a reinforcement signal to iterate toward a desired outcome. The results were mixed, however. The team measured the degree to which the performance of a GPT-4 model supervised by GPT-2 closed the gap on known ground truth labels. They found that the more capable model supervised by the less capable AI performed 20% to 70% better than GPT-2 on the language tasks but did less well on other tests.
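To make the idea of “closing the gap” concrete, here is a minimal sketch of one way such a measurement can be expressed: the share of the distance between the weak supervisor’s score and the strong model’s ground-truth-trained ceiling that the weakly supervised strong model recovers. The function name, the parameter names and the sample accuracies are illustrative assumptions, not figures or code from the research.

```python
def performance_gap_recovered(weak: float, weak_to_strong: float, strong_ceiling: float) -> float:
    """Fraction of the weak-to-ceiling gap closed by the weakly supervised strong model.

    weak            -- score of the less capable supervisor on its own
    weak_to_strong  -- score of the strong model trained on the weak model's labels
    strong_ceiling  -- score of the strong model trained on ground truth labels
    """
    gap = strong_ceiling - weak
    if gap <= 0:
        raise ValueError("strong ceiling must exceed the weak supervisor's score")
    return (weak_to_strong - weak) / gap


# Hypothetical accuracies, chosen only to illustrate the arithmetic.
weak_acc = 0.60            # assumed weak supervisor accuracy
weak_to_strong_acc = 0.72  # assumed accuracy of the strong model under weak supervision
ceiling_acc = 0.90         # assumed accuracy of the strong model with ground truth labels

pgr = performance_gap_recovered(weak_acc, weak_to_strong_acc, ceiling_acc)
print(f"Gap recovered: {pgr:.0%}")
```

A value of 0 means the strong model did no better than its weak supervisor, and a value of 1 means weak supervision unlocked everything the strong model could do with correct labels. Anything in between is the kind of mixed result described above.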

The researchers are encouraged that GPT-4 outdid its supervisor and believe this shows promising potential. But the smarter model had greater capabilities that weren’t unlocked by the teacher, which calls into question the ability of a less capable AI to control a smarter model.

In thinking about this problem, one can’t help but recall the scene from the movie Good Will Hunting.

Is There a “Supervision Tax” in AI Safety?

There are several threads on social media, specifically on Reddit, lamenting GPT-4 getting “dumber.” A research paper by Stanford and UC Berkeley published this summer points out the drift in accuracy over time. Theories have circulated as to why, ranging from architectural challenges to memory issues, with some of the most popular holding that the need for so-called guardrails has dumbed down GPT-4 over time.

Customers of ChatGPT’s for-pay service have been particularly vocal about paying for a service that is degrading in quality over time. However, many of these claims are anecdotal. It’s unclear to what extent the quality of GPT-4 is really degrading, as it’s difficult to track such a fast-moving target. Moreover, there are many examples where GPT-4 is improving, such as remembering prompts and hallucinating less.

Regardless, the point is that this controversy further underscores the many alignment challenges: government versus private industry, for-profit versus non-profit objectives, and AI safety and regulation versus innovation and progress. Right now the market is like the Wild West, with lots of hype and diverging opinions.

OpenAI Changes the Language Regarding Microsoft’s Ownership

In a post last month covering the OpenAI governance failure, we showed this graphic from OpenAI’s Web site. As we discussed this past week with John Furrier on theCUBE Pod, the way in which OpenAI and Microsoft are characterizing their relationship has quietly changed.

To review briefly, the graphic shows the convoluted and, in our view, misaligned structure of OpenAI. It is controlled by a 501(c)(3) non-profit public charity with a mission to do good AI for humanity. That board controls an LLC which provides oversight and also controls a holding company owned by employees and investors like Khosla Ventures, Sequoia and others. This holding company owns a majority of another LLC, which is a capped profit company.

Previously on OpenAI’s Web site, Microsoft was cited as a “Minority owner.” That language has now changed to reflect Microsoft’s “Minority economic interest,” which we believe is a 49% stake in the capped profits of the LLC. Quite obviously, this change was precipitated by the UK and US governments looking into the relationship between Microsoft and OpenAI, which is fraught with misalignment, as we saw with the firing and re-hiring of CEO Sam Altman and the subsequent board observer seat that OpenAI granted Microsoft as a consolation.

The partial answer in our view is to create two separate boards and governance structures. One to govern the non-profit and a separate board to manage the for-profit business of OpenAI. But that alone won’t solve the superalignment problem, assuming superhuman intelligence is a given, which it is not necessarily.

The AI Market is Bifurcated

To underscore the wide schisms in the AI marketplace, let’s take a look at this ETR data from the Emerging Technology Survey, ETS, which measures market sentiment and mindshare amongst privately held companies. Here we’ve isolated the ML/AI sector, which comprises traditional AI plus LLM players as cited in the annotations. We’ve also added the most recent market valuation data for each of the firms. The chart shows Net Sentiment on the vertical axis, which is a measure of intent to engage, and mindshare on the horizontal axis, which measures awareness of the company.

The first point is that OpenAI’s position is literally off the charts in both dimensions. Its lead with respect to these metrics is overwhelming, as is its $86B valuation. On paper it is more valuable than Snowflake (not shown here) and Databricks, with a reported $43B valuation. Both Snowflake and Databricks are extremely successful and established firms with thousands of customers. Hugging Face is high up on the vertical axis. Think of them as the GitHub for AI model engineers. As of this summer their valuation was at $5B. Anthropic is prominent and, with its investments from AWS and Google, touts a recent $20B valuation, while Cohere this summer reportedly had a $3B valuation.

Jasper AI is a popular marketing platform that is seeing downward pressure on its valuation because ChatGPT is disruptive to its value proposition at a much lower cost. DataRobot at the peak of the tech bubble had a $6B valuation but after some controversies around selling insider shares its value has declined. You can also see here and Snorkel with unicorn-like valuations and, which is a chatbot generative AI platform and recently was reported having a $5B valuation.

So you can see the gap between OpenAI and the pack. You can also clearly see that emergent competitors to OpenAI are commanding higher valuations than the traditional ML players. Our view is that AI generally, and generative AI specifically, is a tide that will lift all boats. But some boats will be able to ride the wave more successfully than others and so far OpenAI, despite its governance challenges, and Microsoft have been in the best position.

Key Questions on Superintelligence

There are many questions around AGI and now super AI as this new parlance of superintelligence and superalignment emerges. First, is this vision aspirational or is it truly technically feasible? Experts like John Roese, the CTO of Dell, have said all the pieces are there for AGI to become a reality; there’s just not enough economically feasible compute today and the quality of data is still lacking. But from a technological standpoint he agrees with OpenAI that it’s coming.

If that’s the case, how will the objectives of superalignment – AKA control – impact innovation and what are the implications of the industry leader having a governance structure that is controlled by a non-profit board? Can their objectives truly win out over the profit motives of an entire industry? We tend to doubt it and the reinstatement of Sam Altman as CEO underscores who is going to win that battle. Sam Altman was the big winner in all that drama…not Microsoft.

So to us, the structure of OpenAI has to change. The company should be split in two with separate boards for the non-profit and commercial arms. And if the mission of OpenAI truly is to develop and direct artificial intelligence in ways that benefit humanity as a whole, then why not split the company in two and open up the governance structure of the non-profit to other players, including OpenAI competitors and governments?

On the issue of superintelligence, beyond AGI, what happens when AI becomes autodidactic, a true self-learning system? Can that really be controlled by less capable AI? The conclusion of OpenAI researchers is that humans clearly won’t be able to control it.

But before you get too scared, there are skeptics who feel that we are still far away from AGI, let alone superintelligence. Hence point #5 here: is this a case where Zeno’s paradox applies? Zeno’s paradox, which you may remember from high school math classes, states that any moving object must reach the halfway point on a course before it reaches the end; and because there are an infinite number of halfway points, a moving object never reaches the end in a finite time.

Is Superintelligence a Fantasy?

This Sidney Harris cartoon sums up the opinions of the skeptics. It shows a super complicated equation with a step in the math that says “Then a Miracle Occurs.” It’s kind of where we are with AGI and superintelligence…like waiting for Godot.

We don’t often use the phrase “time will tell” in these segments. As analysts, we like to be more precise and opinionated with data to back those opinions. But in this case we simply don’t know.

But let’s leave you with a thought experiment that Arun Subramaniyan put forth at Supercloud 4 this past October. We asked him for his thoughts on AGI, and the same applies to superintelligence. His premise: assume for a minute that AGI is here. Wouldn’t the AI know that we as humans would be wary of the AI and try to control it? And wouldn’t the smart AI act in such a way as to hide its true intentions? Ilya Sutskever has stated this is a concern.

The point being, if super AI is so much smarter than humans, then it will be able to easily outsmart us and control us versus us controlling it. And that is the best case for creating structures that allow the motives of those concerned about AI safety to pursue a mission independent of a profit-driven agenda. Because a profit motive will almost always win over an agenda that sets out to simply do the right thing.

Keep in Touch

Thanks to Alex Myerson and Ken Shifman on production, podcasts and media workflows for Breaking Analysis. Special thanks to Kristen Martin and Cheryl Knight who help us keep our community informed and get the word out. And to Rob Hof, our EiC at SiliconANGLE.

Remember we publish each week on Wikibon and SiliconANGLE. These episodes are all available as podcasts wherever you listen.

Email | DM @dvellante on Twitter | Comment on our LinkedIn posts.

Also, check out this ETR Tutorial we created, which explains the spending methodology in more detail.


Note: ETR is a separate company from Wikibon and SiliconANGLE. If you would like to cite or republish any of the company’s data, or inquire about its services, please contact ETR at

All statements made regarding companies or securities are strictly beliefs, points of view and opinions held by SiliconANGLE Media, Enterprise Technology Research, other guests on theCUBE and guest writers. Such statements are not recommendations by these individuals to buy, sell or hold any security. The content presented does not constitute investment advice and should not be used as the basis for any investment decision. You and only you are responsible for your investment decisions.

Disclosure: Many of the companies cited in Breaking Analysis are sponsors of theCUBE and/or clients of Wikibon. None of these firms or other companies have any editorial control over or advance viewing of what’s published in Breaking Analysis.
