AI2 is creating an open generative language model called AI2 OLMo (Open Language Model). It will be comparable in scale to other state-of-the-art LLMs at 70 billion parameters and is expected to debut in early 2024. (AI2 Photo)

You can’t read the news today without seeing a story about the latest advances in artificial intelligence, particularly in the nascent area of generative AI and large language models (LLMs).

The rapid pace of development is turning the field into something of the Wild West, as everyone from startups to corporate giants races to get to market as quickly as possible so as not to be left behind.

Unfortunately, this raises several issues and concerns, in large part because the language models these companies work with are anything but transparent or fully understood.

Seattle’s Allen Institute for AI (AI2) is working to change that. 

AI2 on Thursday announced it is creating an open generative language model called AI2 OLMo (Open Language Model). It will be comparable in scale to other state-of-the-art LLMs at 70 billion parameters and is expected to debut in early 2024.

This initiative is unique in that it will develop an open language model “by scientists, for scientists.” 

“With the scientific community in mind, OLMo will be purpose-built to advance the science of language models,” said Hannaneh Hajishirzi, an OLMo project lead, a senior director of NLP research at AI2, and a professor at the University of Washington’s Allen School of Computer Science & Engineering. “OLMo will be the first language model specifically designed for scientific understanding and discovery.”

This initiative will benefit the research community and the public alike by providing access to, and education about, every aspect of the model, including its development, implementation and use. In addition, the open model is being developed in collaboration with AMD and CSC, using LUMI, one of the greenest supercomputers in the world.

“OLMo will be something special,” said Noah Smith, also an OLMo project lead, a senior director of NLP research at AI2, and a professor in the Allen School.

Smith added: “In a landscape where many are rushing to cash in on the business potential of generative language models, AI2 has the unique ability to bring our world-class expertise together with world-class hardware from AMD and LUMI to produce something explicitly designed for scientists and researchers to engage with, learn from, and use to create the next generation of safe, effective AI technologies.”

Hannaneh Hajishirzi (left) and Noah Smith. (AI2 Photos)

AI2’s goal is to collaboratively build the best open language model in the world. The philosophy behind OLMo is that by opening access to the millions of people who want to better understand and engage with language models, AI2 can foster an environment that leads to faster and safer progress for everyone.

This initiative will allow many people in the AI research community to work directly on LLMs for the first time. By making every element of the OLMo project accessible, not only the data but also the code used to create it, AI2 will let researchers take what it builds and work directly to improve it.

By openly sharing and discussing the ethical and educational considerations around the creation of the model, AI2 hopes to help guide the understanding and responsible development of language modeling technology.

These generative AI models are already being used for everything from building business applications that compose emails, strategic plans, and software code, to providing the foundations for a new generation of search engines. They can already distill and explain complex ideas, solve math problems, create music and write essays on any topic.

But while the ability to perform all of these tasks exists, there remain numerous functional and ethical problems, not least of which is the reliability and accuracy of what is being generated. A truly open model like OLMo could help address such issues.

AI2 is also partnering with Surge AI and MosaicML on data and training code, and it has created an ethics review committee composed of both internal and external advisors to provide feedback throughout the process.

LLMs typically draw their data from vast swaths of publicly available material, usually by crawling the web. This has raised many concerns around intellectual property rights. The OLMo team is working closely with AI2’s legal department and outside legal experts in order to better assess and address these issues.

Developing OLMo as a greener language model is also crucial given the enormous computing resources needed to operate a typical LLM. Greater transparency around the power usage and emissions of AI models will grow in importance as these tools become more widely used across business and society. For instance, OpenAI’s ChatGPT, launched six months ago, reportedly logs roughly 1.6 billion visits a month, yet the public can only guess at its true energy consumption.

“Generative AI carries the potential of being the breakthrough technology of this decade, analogous to how search engines and smartphones penetrated our society in the previous decades,” said Pekka Manninen, CSC’s Director of Science and Technology. “Open, transparent, and explainable LLMs are vital for the democratization of this technology.”

CSC, Finland’s IT Center for Science, is providing AI2 OLMo with access to LUMI’s supercomputing resources, a crucial factor in helping to better understand the role of infrastructure in training and running LLMs. Located in CSC’s data center in Kajaani, Finland, LUMI is a pan-European pre-exascale supercomputer.

Manninen continued: “We are proud to be part of this collaboration for its great societal impact and technological ambition level, and happy that we can contribute to it with the LUMI supercomputer and our expertise. Supercomputers like LUMI can accelerate LLM training by an order of magnitude, and many other features of the LUMI infrastructure position it as a leading platform for natural language processing.”

LUMI (Large Unified Modern Infrastructure) was rated the world’s number three supercomputer in the November 2022 TOP500 ranking, with a measured performance of 309.1 petaflops. It also ranks near the top of the Green500: at 51.38 gigaflops per watt, it is among the greenest supercomputers in the world. Lumi is the Finnish word for “snow.”
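Those two published figures can be combined for a rough sense of how much power the machine draws while running the benchmark. The sketch below is a back-of-envelope calculation, not a number from AI2 or CSC, and it assumes the TOP500 performance figure and the Green500 efficiency figure describe the same benchmarked configuration, which is not guaranteed since the two lists can use different runs.

```python
# Back-of-envelope estimate of LUMI's power draw during the Linpack benchmark,
# combining the two figures cited above. Assumption: both numbers describe the
# same benchmarked partition, so treat the result only as a rough
# order-of-magnitude check.

rmax_pflops = 309.1                 # TOP500 Rmax, November 2022 list
efficiency_gflops_per_watt = 51.38  # Green500 efficiency figure

rmax_gflops = rmax_pflops * 1e6     # 1 PFLOP/s = 1,000,000 GFLOP/s
implied_power_watts = rmax_gflops / efficiency_gflops_per_watt

print(f"Implied benchmark power draw: {implied_power_watts / 1e6:.1f} MW")
# Prints roughly 6.0 MW
```

Even a rough figure like this illustrates why transparency about power usage matters as models and the machines that train them continue to scale.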
