minish

Soooooooooooo fast

New improvements to model2vec distillation

February 5, 2025

We’ve made a lot of improvements to model2vec since it came out, many of which target the baseline performance of our distillation process.

ModernBERT support and why it doesn't work

January 29, 2025

Our newest shiny release is here: 0.3.8! This is a small release ahead of a bigger one we’ll be releasing next week. See the release notes for details.

semhash: deduplication and dataset multitool

January 12, 2025

We’re super excited to announce the release of semhash, our semantic deduplication and dataset multitool (other features coming soon).

POTION: bag of tricks leads to better models

October 29, 2024

This blog post describes Tokenlearn, a method for pre-training Model2Vec models.

Model2Vec Introduction blogpost

October 14, 2024

This post was first published on the Hugging Face blog. We’re also posting it here for archival purposes.

I gotta make money off of this thing. It's so good!