Megatron NLG

The Largest and Most Powerful Monolithic Transformer Language Model, Triple the Size of OpenAI’s GPT-3
Product Information
Release date: 22 June 2023
Platform: Desktop

Megatron NLG Features

Microsoft and NVIDIA present the Megatron-Turing Natural Language Generation model (MT-NLG), powered by DeepSpeed and Megatron. With 530 billion parameters, it is described by the two companies as the largest and most powerful monolithic transformer language model trained to date. MT-NLG is the successor to Turing NLG 17B and Megatron-LM, and it has three times as many parameters as the previous largest model of its kind. It performs a range of natural language tasks with high accuracy, including completion prediction, reading comprehension, commonsense reasoning, natural language inference, and word sense disambiguation. The model was trained with mixed precision on the Selene supercomputer, which is built on the NVIDIA DGX SuperPOD architecture. Selene has 560 DGX A100 servers, networked with HDR InfiniBand in a full fat-tree topology. Each DGX A100 has eight NVIDIA A100 80GB Tensor Core GPUs, connected to one another via NVLink and NVSwitch.

Source: https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/
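The hardware figures in the description can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration, not a description of how the model was actually partitioned. It rests on two assumptions that do not come from this listing: that GPT-3 has 175 billion parameters (its commonly cited size), and that each parameter is stored as a 2-byte half-precision value, counting weights only and ignoring optimizer state, gradients, and activations. The function name `cluster_summary` is made up for this example.

```python
import math

# Figures quoted in the description
PARAMS = 530_000_000_000   # MT-NLG parameters
SERVERS = 560              # DGX A100 servers in Selene
GPUS_PER_SERVER = 8        # A100 GPUs per DGX A100
GPU_MEMORY_GB = 80         # memory per A100 GPU

# Assumptions that are NOT stated in the listing
GPT3_PARAMS = 175_000_000_000  # commonly cited GPT-3 size
BYTES_PER_PARAM = 2            # half-precision weights only


def cluster_summary():
    """Return rough totals derived from the figures above."""
    total_gpus = SERVERS * GPUS_PER_SERVER
    total_gpu_memory_gb = total_gpus * GPU_MEMORY_GB
    # Memory for the raw weights alone, in decimal gigabytes
    weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
    return {
        "total_gpus": total_gpus,
        "total_gpu_memory_gb": total_gpu_memory_gb,
        "weights_gb": weights_gb,
        # Lower bound on GPUs needed just to hold the weights
        "min_gpus_for_weights": math.ceil(weights_gb / GPU_MEMORY_GB),
        "size_vs_gpt3": PARAMS / GPT3_PARAMS,
    }


if __name__ == "__main__":
    for name, value in cluster_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the weights alone take roughly 1,060 GB, more than a single 80GB GPU can hold, which is one reason a model of this size has to be split across many GPUs. The parameter ratio comes out at about 3.03, in line with the "triple the size of GPT-3" tagline.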
