Open Science Community

Cohere Labs Community Blog

Research notes, technical essays, and personal stories from the Cohere Labs Community

Research notes, stories, and ideas from people shaping AI together—transparent, collaborative, and community-led.


Latest posts

Talking to a 4-Year-Old: A Multilingual Benchmark for Children's AI Companions

A 2,312-prompt, 23-language benchmark for child–AI conversations that evaluates four production models and validates the LLM-as-judge pipeline with five independent judges (Cohen's κ up to 0.71).

June 04, 2026 Batuhan Aktas 10 min read

community research multilingual-evaluation child-safety-benchmark open-science

Mix, Fine-Tune, Break: What Happens When You Stress-Test a Multilingual Model's Safety

What happens to a multilingual model's safety guardrails when you fine-tune it on harmful data and probe it with code-mixed inputs, and why current binary benchmarks can't tell you.

May 25, 2026 Tanav Singh Bajaj 9 min read

community research multilingual-evaluation cultural-benchmark open-science

From Showing Up to Leading: Three Years Inside Cohere Labs Community

A community lead reflects on three years of learning, research, and building programs inside the Cohere Labs Open Science Community.

May 19, 2026 Ahmad Anis 6 min read

community open-science research leadership

Tongueless Lions and Hallucinating Machines

A Hungarian cultural riddle benchmark shows that fluent multilingual models can still miss local knowledge and hallucinate culturally plausible answers.

May 12, 2026 Károly Boczka 5 min read

community research multilingual-evaluation cultural-benchmark open-science

Writing for the Cohere Labs Community Blog

Contributor guidelines, research writing patterns, and feature examples for Cohere Labs Community Blog posts.

April 30, 2026 Cohere Labs Community 20 min read

community research writing reproducibility visualization
