Training Datasets

Explore and validate the Mozilla Common Voice speech data and the JW300 parallel corpus that power AfricanGPT

Data Transparency & Privacy

We believe in full transparency about the data used to train AfricanGPT. Every dataset listed here is sourced ethically, with clear licensing and provenance.