Home

Why the Entire AI World Was Talking About "Q" This Week

In the aftermath of last week’s shocking OpenAI power struggle, there was one final revelation that acted as a kind of epilogue to the sprawling mess: a report from Reuters that revealed a supposedly startling breakthrough at the startup. That breakthrough allegedly happened via a little known program dubbed “Q-Star” or “Q*.”

According to the report, one of the things that may have kicked off the internecine conflict at the influential AI company was this Q-related “discovery.” Ahead of Altman’s ouster, several OpenAI staffers allegedly wrote to the company’s board about a “powerful artificial intelligence discovery that they said could threaten humanity.” This letter was “one factor among a longer list of grievances by the board leading to Altman’s firing,” Reuters claimed, citing anonymous sources.

Frankly, the story sounded pretty crazy. What was this weird new program and why did it, supposedly, cause all of the chaos at OpenAI? Reuters claimed that the Q* program had managed to allow an AI agent to do “grade-school-level math,” a startling technological breakthrough, if true, that could precipitate greater successes at creating artificial general intelligence, or AGI, sources said. Another report from The Information largely reiterated many of the points made by the Reuters article.

Still, details surrounding this supposed Q program haven’t been shared by the company, leaving only the anonymously sourced reports and rampant speculation online as to what the true nature of the program could be.

Some have speculated that the program might (because of its name) have something to do with Q-learning, a form of machine learning. So, yeah, what is Q-learning, and how might it apply to OpenAI’s secretive program?

In general, there are a couple different ways to teach an AI program to do something. One of these is known as “supervised learning”, and works by feeding AI agents large tranches of “labelled” data, which is then used to train the program to perform a function by itself (typically that function is more data classification). By and large, programs like ChatGPT, OpenAI’s content-generating bot, were created using some form of supervised learning.

Unsupervised learning, meanwhile, is a form of ML wherein AI algorithms are allowed to sift through large tranches of unlabeled data, in an effort to find patterns to classify. This kind of artificial intelligence can be deployed to a number of different purposes, such as creating the kind of recommendation systems that companies like Netflix and Spotify use to suggest new content to users based on their past consumer choices.

Finally, there’s reinforced learning, or RL, which is a category of ML that incentivizes an AI program to achieve a goal within a specific environment. Q-learning is a subcategory of reinforced learning. In RL, researchers treat AI agents sort of like a dog that they’re trying to train. Programs are “rewarded” if they take certain actions to affect certain outcomes and are penalized if they take others. In this way, the program is effectively “trained” to seek the most optimized outcome in a given situation. In Q-learning, the agent apparently works through trial and error to find the best way to go about achieving a goal its been programmed to pursue.

What does this all have to do with OpenAI’s supposed “math” breakthrough? One could speculate that the program that managed (allegedly) to do simple math operations may have arrived at that ability via some form of Q-related RL. All of this said, many experts are somewhat skeptical as to whether AI programs can actually do math problems yet. Others seem to think that, even if an AI could accomplish such goals, it wouldn’t necessarily translate to broader AGI breakthroughs. The MIT Technology review, speaking with :

Researchers have for years tried to get AI models to solve math problems. Language models like ChatGPT and GPT-4 can do some math, but not very well or reliably. We currently don’t have the algorithms or even the right architectures to be able to solve math problems reliably using AI, says Wenda Li, an AI lecturer at the University of Edinburgh. Deep learning and transformers (a kind of neural network), which is what language models use, are excellent at recognizing patterns, but that alone is likely not enough, Li adds.

In short: We really don’t know much about Q, though, if the experts are to be believed, the hype around it may be just that—hype.

Despite the fact that he’s back at OpenAI, it bears some consideration that we still don’t know what the fuck happened with Sam Altman last week. In an interview he did with The Verge on Wednesday, Altman gave pretty much nothing away as to what precipitated the dramatic power struggle at his company last week. Despite continual prodding from the outlet’s reporter, Altman just sorta threw up his hands and said he wouldn’t be talking about it for the foreseeable future. “I totally get why people want an answer right now. But I also think it’s totally unreasonable to expect it,” the rebounded CEO said. Instead, the most The Verge was able to get out of the OpenAI executive is that the company is in the midst of conducting an “independent review” into what happened—a process that, he said, he doesn’t want to “interfere” with. Our own coverage of last week’s shitshow interpreted it according to a narrative involving a clash between the board’s ethics and Altman’s dogged push to commercial OpenAI’s automated technology. However, this narrative is just that: a narrative. We don’t know the specific details of what led to Sam’s ousting, though we sure would like to.

Source: Gizmodo

Previous

Next