person using laptop

Empowering AI-Driven Solutions

At Diagrid, we architect innovative business outcomes by leveraging cutting-edge technology to tackle urgent challenges.

As a boutique AI-focused consultancy, we transfom ideas into impactful AI strategies and scalable solutions.

Our Services

CEO planning and prioritizing the quarter with the team.

AI Strategy & Roadmapping

  • Defining clear AI goals aligned with business strategy.

  • Evaluating and validating potential AI-driven business models.

  • Identifying high-impact areas for AI integration.

Three businesswomen

Rapid Prototyping & Validation

Through hands-on customer discovery, we validate challenges and develop proof-of-concept (POC) solutions within tight timelines. This approach provides strong ROI signals to inform your business decisions.

Businessman hand putting blank wooden cube block stack on white background with copy space for input text, icon, trend, creative idea, finance, leadership, strategy, business, online marketing concept

AI GTM Execution

  • Supporting organizations in transitioning from prototype to scalable AI products.

  • Creating GTM plans based on product and business strategy.

  • Collaborating on sales enablement to educate and support sales people.

Selected Generative AI Work

AI Developer Tools

Tools including but not limited to data preparation, resource adaptive cluster scheduling, deployment, and monitoring tools for AI developers.

LLM Application Development

Adapting language models for downstream applications via LLM finetuning (e.g. math, logic, and reasoning).

Natural Language Search Systems

Natural Language Search for a Major Public University

TxT360

A globally deduplicated 5T token dataset for LLM pretraining

Crystal-7B

Balances natural language and coding ability to extend the Llama 2 frontier.

K2-65B

The largest fully-reproducible large language model outperforming Llama 2 70B using 35% less compute.

Amber-7B

A 7B English language model with the LLaMA architecture

3D rendering capturing the double helix structure of DNA against a blue backdrop, highlighting the intricate beauty of life's genetic code.

METAGENE-1

A 7B parameter metagenomic foundation model designed for pandemic monitoring

Macro of a Intel motherboard

LLM Training and Inference Benchmarking

Hardware and software training and inference benchmarking for optimal cluster configurations depending on available machines, models, sequence lengths, and batch sizes.

Publications


METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring

We pretrain METAGENE-1, a 7-billion-parameter autoregressive transformer model, which we refer to as a metagenomic foundation model, on a novel corpus of diverse metagenomic DNA and RNA sequences comprising over 1.5 trillion base pairs.


Towards Best Practices for Open Datasets for LLM Training

Many AI companies are training their large language models (LLMs) on data without the permission of the copyright owners. The permissibility of doing so varies by jurisdiction: in countries like the EU and Japan, this is allowed under certain restrictions, while in the United States, the legal landscape is more ambiguous.


LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch

We detail the training of the LLM360 K2-65B model, scaling up our 360◦ Open Source approach to the largest and most powerful models under project LLM360.


TxT360: a 5T token globally deduplicated dataset for LLM pretraining

We introduce TxT360 (Trillion eXtracted Text), the first dataset to globally deduplicate 99 CommonCrawl snapshots and 14 high-quality data sources from diverse domains (e.g., FreeLaw, PG-19, etc.).


LLM360: Towards Fully Transparent Open-Source LLMs

The recent surge in open-source Large Language Models (LLMs), such as LLaMA, Falcon, and Mistral, provides diverse options for AI practitioners and researchers. However, most LLMs have only released partial artifacts, such as the final model weights or inference code, and technical reports increasingly limit their scope to high-level design choices and surface statistics.


Transforming technology into solutions for a better tomorrow.

Contact Us

Contact Us