Access thousands of carefully curated datasets across every domain, all available for immediate use in research and development.
Explore datasets by category, or search the full catalog of open resources available for research and experimentation.
2,147 datasets
3,892 datasets
1,563 datasets
2,345 datasets
1,987 datasets
2.1 TB of curated text from GitHub, Wikipedia, and academic papers for training large language models.