The year 2023 has witnessed an unprecedented surge in the development of large language models, with new models emerging at an astonishing pace. Let's take a look at these advancements, who drove them forward, and how the year unfolded.
In May, Google took center stage with the release of PaLM 2, the successor to its 540B-parameter PaLM model (Google has not disclosed PaLM 2's exact size). PaLM 2 demonstrated remarkable capabilities across a variety of NLP tasks, including text generation, language translation, and question answering.
In February, Meta AI released LLaMA, a family of foundational LLMs spanning a range of parameter sizes (7B, 13B, 33B, and 65B). LLaMA's efficiency and openly available weights, released under a research license, made it a valuable tool for researchers and developers alike.
Microsoft and NVIDIA's Megatron-Turing NLG, a 530B-parameter LLM designed specifically for natural language generation tasks, also remained a reference point. The model demonstrated an exceptional ability to produce high-quality, factually grounded, and creative text, and it achieves superior zero-, one-, and few-shot learning accuracies on several NLP benchmarks.
BLOOM, an open-access, multilingual LLM optimized for text generation and language exploration, continued to gain traction after its mid-2022 debut. The 176B-parameter model was built by Hugging Face's BigScience initiative together with the Microsoft DeepSpeed team, the NVIDIA Megatron-LM team, the IDRIS/GENCI team, the PyTorch team, and volunteers in the BigScience engineering working group. It can generate text in 46 natural languages and 13 programming languages, highlighting the growing emphasis on multilingual capabilities in LLM development.
OpenAI released its GPT-4 model in March. Given the success of ChatGPT after its launch the previous November, GPT-4 had a large following waiting for its release. Its features include an expanded context window, multimodal input (it accepts images as well as text), and enhanced reasoning and creativity, making it more versatile and practical for a wider range of applications.
Anthropic entered the fray with Claude, an LLM-based generative AI model first released in March and followed by Claude 2 in July. Claude's broad range of capabilities, spanning text generation, language translation, question answering, and creative content creation, solidified its position as a powerful tool for various AI applications.
In March, Google also introduced its answer to OpenAI's ChatGPT: Bard, initially powered by a 137B-parameter version of its LaMDA model. Bard can generate different creative text formats, including poems, code, scripts, musical pieces, emails, and letters, and its ability to answer questions informatively, even in open-ended, challenging, or strange scenarios, further demonstrated the growing sophistication of LLMs.
Finally, in early November, OpenAI announced GPT-4 Turbo. With an updated knowledge cutoff of April 2023, this newest iteration aims to push the envelope in a few ways. GPT-4 Turbo supports a context window of up to 128,000 tokens, allowing users to create extremely long and detailed prompts. As we saw this past March, when users are allowed to build off prompts of greater size, what is generated tends to be quite incredible.
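To get a feel for what a 128,000-token window means in practice, here is a minimal Python sketch of prompt budgeting. Note the assumptions: the roughly-4-characters-per-token ratio is only a rule of thumb for English prose, not a real tokenizer, and the reserved-reply size is an arbitrary illustrative value; production code should count tokens with the model's actual tokenizer.

```python
# Illustrative sketch: budgeting a prompt against a 128K-token context window.
# The chars-per-token ratio below is a rough heuristic, NOT a real tokenizer.

CONTEXT_WINDOW = 128_000  # GPT-4 Turbo's advertised context size, in tokens
CHARS_PER_TOKEN = 4       # rough rule of thumb for English text (assumption)

def estimate_tokens(text: str) -> int:
    """Crude token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(prompt: str, reserved_for_reply: int = 4_000) -> bool:
    """Check whether a prompt still leaves room for the model's reply."""
    return estimate_tokens(prompt) + reserved_for_reply <= CONTEXT_WINDOW

# A prompt of ~33,000 characters (~8,000 tokens) fits comfortably.
prompt = "Summarize the following report..." * 1_000
print(estimate_tokens(prompt), fits_in_context(prompt))
```

Even a prompt of tens of thousands of characters consumes only a small fraction of the window, which is what makes whole-document prompts feasible.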
The ChatGPT interface is also getting smarter about helping users pick the correct tool for the job they have in mind. Instead of the familiar dropdown menu where users had to pick the tools they wanted to use, the AI now selects the appropriate tool based on your input.
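The idea behind automatic tool selection can be sketched in a few lines of Python. To be clear, everything here is hypothetical and illustrative: the tool names, the keyword-based routing, and the registry are stand-ins for the model-driven classification that ChatGPT actually performs, which is not keyword matching.

```python
# Hypothetical sketch of automatic tool routing. All tool names and the
# keyword heuristics are illustrative stand-ins, not OpenAI's mechanism.
from typing import Callable

# Registry of available "tools" (illustrative behaviors only).
TOOLS: dict[str, Callable[[str], str]] = {
    "code": lambda q: f"[python] running: {q}",
    "image": lambda q: f"[dalle] drawing: {q}",
    "chat": lambda q: f"[chat] answering: {q}",
}

def route(query: str) -> str:
    """Pick a tool from simple keyword cues in the user's request."""
    lowered = query.lower()
    if any(w in lowered for w in ("plot", "calculate", "run")):
        return TOOLS["code"](query)
    if any(w in lowered for w in ("draw", "picture", "image")):
        return TOOLS["image"](query)
    return TOOLS["chat"](query)  # default: plain conversation

print(route("Draw a picture of a sunset"))
print(route("Calculate 2 ** 20"))
```

The design point is that the user states intent once and the dispatcher, rather than a dropdown, decides which capability handles it.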
The period from late 2022 through the close of 2023 has witnessed a tremendous surge of innovation in the world of large language models. Major tech companies and startups alike see the writing on the wall: the future will be shaped by generative AI. The market for these tools is expected to grow dramatically in the coming years, and combined with their anticipated economic impact, LLMs will play a major role for some time to come.
The future of LLMs has yet to be written. While many of the organizations mentioned above are making great waves in the field, there's still room for standout projects to emerge. You and your team could be the next to make an industry-changing large language model! By attending ODSC East 2024 this April 23rd-25th, and specifically checking out the track devoted to NLP and LLMs, you'll learn everything you need to know to use existing LLMs or build the next big thing. Register now while tickets are at their cheapest!
Read more data science articles on OpenDataScience.com, including tutorials and guides from beginner to advanced levels! Subscribe to our weekly newsletter here and receive the latest news every Thursday. You can also get data science training on-demand wherever you are with our Ai+ Training platform. Interested in attending an ODSC event? Learn more about our upcoming events here.