Channel: Analytics India Magazine

Achieving Mission Impossible in AI with Yejin Choi 


AI startups are not easy to build. The costs of computing alone are astronomical, with a majority of seed funding going towards training models and research. But what if you didn’t need to worry about renting GPUs?

The Allen Institute for AI’s senior research director for commonsense AI, Yejin Choi, posed this possibility at Databricks’ Data + AI Summit this year while discussing the potential of small language models (SLMs).

“Right now, the recipe is, ‘Let’s just make the models super big – the bigger, the better,’ but humans – you and I – can’t really remember all that context, you know?” she said.

In posing several ‘mission impossibles’ for building SLMs, one of the constraints she set was building a model without a GPU.

Choi’s Mission Impossibles for SLMs

The first “mission impossible” scenario Choi described was summarising sentences without reinforcement learning from human feedback (RLHF), extreme-scale pre-training, or supervised datasets at scale.

Taking it a step further, her second “mission impossible” was the same task, but summarising documents instead of sentences, with the additional constraint of not relying on human-supervised critics.

Choi reduced the capabilities her team was working with in the third “mission impossible”, stating that they tried to make older statistical n-gram language models relevant in the era of neural language models.

In doing so, she said, they made n equivalent to infinity: the model had to compute over trillions of tokens with a near-instantaneous response time, and all of this had to be done without GPUs.

This is possible because SLMs do not require the massive amounts of compute that LLMs do, thereby foregoing the need for GPUs. Choi highlighted this with the example of Infini-gram, an engine built by researchers from the University of Washington and the Allen Institute for AI, which can be run on basic CPU compute.
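The core idea behind the “n = infinity” approach is to back off from the full query context to the longest suffix that actually occurs in the corpus, then count which tokens follow it. The toy sketch below illustrates that lookup with a naive scan; the real Infini-gram engine uses suffix arrays so the same query becomes a binary search over trillions of tokens, which is what makes CPU-only serving feasible.

```python
from collections import Counter

def infinigram_next_token_counts(corpus_tokens, context):
    """Toy sketch of the infini-gram ('n = infinity') idea:
    find the longest suffix of the context that occurs in the
    corpus, then count the tokens that follow each occurrence.
    The naive scan here is for illustration only."""
    for start in range(len(context)):
        suffix = context[start:]
        counts = Counter()
        # A suffix array turns this linear scan into a binary search.
        for i in range(len(corpus_tokens) - len(suffix)):
            if corpus_tokens[i:i + len(suffix)] == suffix:
                counts[corpus_tokens[i + len(suffix)]] += 1
        if counts:
            return suffix, counts
    # No suffix matched: back off to unigram counts.
    return [], Counter(corpus_tokens)

corpus = "the cat sat on the mat and the cat sat on the rug".split()
suffix, counts = infinigram_next_token_counts(corpus, "cat sat on the".split())
```

Here the full four-token context occurs twice in the corpus, so the counts split between “mat” and “rug”; with a context the corpus has never seen, the function would progressively shorten the suffix until something matches.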

In doing so, Choi demonstrated that it was possible to compete with larger models even under constraints previously thought impossible and with limited resources.

Her biggest contention was that AI is only as good as the data it is trained on, which is why she expects synthesised data to be the way forward.

“Please do not overgeneralise to conclude that all SLMs are completely out of the league. There are numerous other counterexamples that demonstrate that task-specific symbolic knowledge distillation can work across many different tasks and domains,” she said, highlighting that SLMs were still plenty effective if certain criteria were fulfilled.
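The symbolic knowledge distillation Choi refers to follows a general pattern: a large “teacher” model generates task-specific examples, a critic filters out low-quality ones, and the surviving pairs become training data for a small student model. A minimal sketch of that pipeline is below; the teacher and critic here are hypothetical stand-in stubs, not any real model API.

```python
def teacher_generate(prompt):
    # Stand-in for sampling a large teacher LM: returns candidate
    # outputs (here, a crude "first sentence" summary plus noise).
    return [prompt.split(".")[0] + ".", "unrelated output"]

def critic_accepts(source, candidate):
    # Stand-in critic: keep candidates that mostly reuse source words.
    src = set(source.lower().split())
    cand = set(candidate.lower().split())
    return len(src & cand) / max(len(cand), 1) > 0.5

def distill_dataset(sources):
    # Generate with the teacher, filter with the critic, and collect
    # the surviving (input, output) pairs for fine-tuning a student.
    data = []
    for src in sources:
        for cand in teacher_generate(src):
            if critic_accepts(src, cand):
                data.append((src, cand))
    return data

pairs = distill_dataset(["The cat sat on the mat. It purred."])
```

In this toy run the noisy candidate is filtered out, leaving one clean pair; in practice the teacher is an LLM, the critic can itself be a trained model, and the filtered corpus is what lets a task-specific SLM punch above its size.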

High-Quality SLMs

Choi said that the commonly held belief at the moment is that we aren’t able to efficiently build the human-like ability to abstract information into AI models.

“You just abstract away everything I told you instantaneously but you still remember what I said so far. That’s really amazing human intelligence that we don’t yet know how to build efficiently through AI models. I believe that it’s possible, we’re just not trying hard enough because we’re blinded by just the magic of a scale,” she said.

This means that, in order to build an efficient SLM, the focus would need to be on perfecting abstraction within the model, rather than increasing the amount of training data that the model relies on.

Synthesised Data

Additionally, Choi focused on synthesised data, rather than what was already available, because, in her words, “We have to synthesise data because if it already exists somewhere on the internet, OpenAI has already crawled it.”

She acknowledged certain concerns about the use of synthetic data, including data quality and potential bias, and stressed that the actual synthesis of data needs to be done carefully and innovatively.

Further, she said this was already occurring, highlighting a Meta AI paper on the Segment Anything Model (SAM), which used model-assisted annotation to build its dataset. Another paper she highlighted was Microsoft’s ‘Textbooks Are All You Need’.

“When you have really high-quality data, or textbook quality data, synthesised, you can actually compete against larger counterparts across many different tasks,” she said.

By relying on synthesised data, SLMs can lower the cost of acquiring data and satisfy the need for novel data while still training on high-quality data, as demonstrated by the ‘Textbooks Are All You Need’ paper.

With synthesised data and perfected abstraction, Choi believes SLMs can substitute for LLMs, especially given the respite they offer in terms of compute.

The post Achieving Mission Impossible in AI with Yejin Choi  appeared first on Analytics India Magazine.

