Channel: Analytics India Magazine

Alibaba Releases Qwen2, Outperforms Llama 3 on Several Benchmarks 


In a significant step for open-source AI, Alibaba’s Qwen team has announced the release of Qwen2, the successor to its Qwen1.5 family of LLMs.

Qwen2 introduces five new models—Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B—each optimized for state-of-the-art performance across a variety of benchmarks.

The models are available on Hugging Face.

The models were trained on data in 27 languages beyond English and Chinese, among them Hindi, Bengali, and Urdu. This multilingual training improves Qwen2’s handling of diverse linguistic contexts, including code-switching, a common problem in which a single text mixes languages.

Qwen2 also shows significantly improved performance in coding and mathematics.

A standout feature of Qwen2 is its extended context length support, with Qwen2-7B-Instruct and Qwen2-72B-Instruct models capable of handling up to 128K tokens. This makes them particularly adept at processing and understanding long text sequences.

The release includes technical changes such as Group Query Attention (GQA, also written grouped-query attention) across all model sizes, which speeds up inference and reduces memory usage, and tied input and output embeddings for the smaller models.
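To see why GQA reduces memory usage, consider the key/value cache a model keeps during generation. The sketch below is a rough back-of-the-envelope calculation, not Qwen2's actual implementation: the layer count, head counts, and head dimension are illustrative assumptions, and the function names are hypothetical. In GQA, several query heads share one key/value head, so the cache shrinks in proportion to the number of key/value heads.

```python
# Illustrative sketch of why grouped-query attention (GQA) saves memory.
# All configuration numbers below are assumptions chosen for illustration,
# not values taken from the article.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size of the key/value cache for one sequence.

    Each layer stores one key vector and one value vector (hence the
    factor of 2) per KV head per token. bytes_per_value=2 assumes
    16-bit storage.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Index of the shared KV head that a given query head attends with."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly into KV groups")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


# Hypothetical model shape (assumed, for illustration only).
NUM_LAYERS = 28
NUM_QUERY_HEADS = 28
NUM_KV_HEADS_GQA = 4        # GQA: 7 query heads share each KV head
HEAD_DIM = 128
SEQ_LEN = 128 * 1024        # a 128K-token context

# Standard multi-head attention keeps one KV head per query head.
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS_GQA, HEAD_DIM, SEQ_LEN)

GIB = 1024 ** 3
print(f"MHA cache: {mha_bytes / GIB:.1f} GiB")
print(f"GQA cache: {gqa_bytes / GIB:.1f} GiB")
print(f"Reduction: {mha_bytes // gqa_bytes}x")
```

Under these assumed numbers, the cache for a single 128K-token sequence drops from 49 GiB to 7 GiB, a 7x reduction that matches the ratio of query heads to key/value heads. Smaller caches also mean less data read from memory per generated token, which is where the speed benefit comes from.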

According to the team’s evaluations, Qwen2-72B, the largest model in the series, outperforms leading competitors such as Llama-3-70B in natural language understanding, coding, mathematics, and multilingual tasks.

Despite having fewer parameters, Qwen2-72B surpasses its predecessor, Qwen1.5-110B, demonstrating the effectiveness of the new training methodologies.

Safety and responsibility remain a priority, with Qwen2-72B-Instruct performing comparably to GPT-4 in terms of safety across various categories of harmful queries. The model exhibits significantly lower proportions of harmful responses compared to other large models.

Most Qwen2 models are released under Apache 2.0, while Qwen2-72B and its instruction-tuned version remain under the Qianwen License. The team expects the more permissive licensing to accelerate the application and commercial use of the models worldwide. Future plans include training larger models and extending Qwen2 to multimodal capabilities, integrating vision and audio understanding.

The post Alibaba Releases Qwen2, Outperforms Llama 3 on Several Benchmarks  appeared first on AIM.

