$3M in research grants
Rolling acceptances open

Building the benchmarks that define frontier AI

Open Benchmarks Grants funds researchers, labs, and engineers creating the open-source evaluation infrastructure that shapes how the next generation of AI systems is built and measured.

What we fund

We're looking for benchmarks that advance the fundamental measurement axes of agentic AI. Below are the categories we're actively seeking — but we welcome independent research directions too.

Agentic task completion

Multi-step task execution, tool use, and goal-directed behavior across real-world environments and interfaces.

Long-context reasoning

Evaluation of coherence, retrieval, and synthesis across extended contexts — documents, codebases, and conversations.

Multimodal evaluation

Benchmarks that assess understanding and reasoning across text, images, structured data, and code in combination.

Planning under uncertainty

Evaluation of sequential decision-making, backtracking, and robustness to ambiguous or incomplete specifications.

Reliability & safety

Measuring consistency, refusal calibration, adversarial robustness, and instruction-following under pressure.

Programmatic data quality

Benchmarks and tooling for assessing training dataset quality, annotation consistency, and labeling artifacts.
Independent directions are welcome. If your research addresses a critical evaluation gap not listed above, we want to hear from you. The steering committee reviews all proposals on their merits.

Is this for you?

Strong candidates
  • Academic researchers at any career stage, including PhD students and postdocs
  • Independent researchers and lab teams working on open evaluation tooling
  • Engineers building infrastructure for benchmark creation and reproducibility
  • Teams with a concrete benchmark proposal and existing preliminary work
  • International applicants — geography is no barrier
  • Researchers willing to publish resulting datasets and code under open licenses
Not a fit
  • Closed-source or proprietary benchmark development
  • Work that primarily evaluates a single commercial model or product
  • Benchmark proposals without a concrete methodology or scope
  • Projects where Snorkel AI or its partners are the sole intended beneficiaries
IP & licensing. Grant recipients retain full IP ownership of their work. We require outputs to be published under an OSI-approved open license. Snorkel AI receives no exclusivity or commercial rights.

What you receive

Expert data credits

Access to Snorkel's expert annotation network — thousands of specialists across academic, professional, and domain-specific fields — to generate high-quality training and evaluation data at scale.

Research collaboration

Direct team access

Work directly with Snorkel and partner research teams. Steering committee members are available for advisory sessions. We don't just write a check — we're active collaborators.

Partner compute & platform

Prime Intellect · Hugging Face
Access to compute credits from Prime Intellect and platform credits from Hugging Face for hosting, running, and distributing your benchmarks and datasets.

Common questions

Things we've heard from researchers before they applied.

International applicants are fully eligible. Geography is not a consideration in our review process. Grants are available to researchers worldwide, subject to standard legal and compliance requirements for fund disbursement in your country.

How to apply

01 · Submit your proposal

A brief application describing your benchmark, the evaluation gap it addresses, and your methodology. Two to four pages is sufficient. No templates required.

02 · Steering committee review

Proposals are reviewed by our committee of academic and industry leaders. Shortlisted applicants are invited for a conversation with our advisory board.

03 · Grant & collaboration kickoff

Selected teams receive their grant, expert data credits, and compute allocations. We agree on a research collaboration structure and publication timeline.

04 · Publish & release

Recipients publish the resulting dataset, benchmark, or paper under an open license, with acknowledgement of Open Benchmarks Grants support. We amplify the work across our network.

Rolling acceptances — no fixed deadline. Apply when you're ready.

Steering committee

Proposals are reviewed by a committee of researchers and engineers at the frontier of AI evaluation. They bring independent academic judgment — Snorkel does not direct their decisions.

Karthik Narasimhan

Princeton University
Professor of Computer Science at Princeton. Research focuses on reinforcement learning for language and agentic systems.

Chris Ré

Stanford University
Associate Professor at Stanford and co-founder of Snorkel AI. Pioneered programmatic data development for machine learning.

Ludwig Schmidt

Stanford University · LAION
Stanford researcher and LAION collaborator. Co-creator of CIFAR-10.1, WILDS, and several foundational evaluation benchmarks.

Yu Su

Ohio State University
Professor at Ohio State. Research spans conversational AI, question answering, and evaluation methodology for language models.

Lewis Tunstall

Hugging Face
Machine learning engineer at Hugging Face and co-author of Natural Language Processing with Transformers.

Fred Sala

Univ. of Wisconsin–Madison
Assistant Professor at Wisconsin–Madison. Research focuses on data-centric AI, weak supervision, and programmable training pipelines.