Open source

Open data

Open training code

Fully reproducible and auditable

Pretty interesting stuff for embeddings. I’m going to try it in my RAG pipeline when I get a chance. I’ve not had as much success as I was hoping, so maybe this English-focused one will help.