These are handy sites with public data that are nice to script against, e.g. using Python on www.kaggle.com
Google Dataset Search
https://datasetsearch.research.google.com
Zenodo has a very good UI and hosts many useful datasets in various formats
https://zenodo.org
DeepMind is owned by Google; its website hosts a good number of datasets and research papers
https://deepmind.com/research?filters=%7B%22tags%22:%5B%22Datasets%22%5D%7D
data.world is an enterprise data catalog and another website that hosts many datasets
https://data.world/
Kaggle Datasets
https://www.kaggle.com/datasets
FiveThirtyEight shares much of the data used in its articles
https://data.fivethirtyeight.com/
The World Bank maintains a number of macroeconomic, financial, and sector databases and datasets
https://data.worldbank.org/
Data.gov hosts the open data of the US government
https://www.data.gov/
Microsoft Research Open Data (beta version) covers a wide range of topics
https://msropendata.com/
The Registry of Open Data on AWS helps users discover and share datasets that are available via AWS resources
https://registry.opendata.aws/
Awesome Public Datasets is a GitHub list that is properly divided into categories, and its datasets are well labelled
https://github.com/awesomedata/awesome-public-datasets
The UCI Machine Learning Repository is a collection of databases maintained by the University of California, Irvine
https://archive.ics.uci.edu/ml/datasets.php
YouTube-8M is created and maintained by people working at YouTube and contains datasets for computer vision
https://research.google.com/youtube8m/
The Harvard Dataverse Repository is a free data repository open to all researchers where you can
share, archive, cite, access, and explore research data.
https://dataverse.harvard.edu/
NASA Earthdata can be used to access Earth science datasets
https://earthdata.nasa.gov/
The FBI Crime Data Explorer contains datasets on crime
https://crime-data-explorer.fr.cloud.gov/pages/home
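
Many of the sites above offer downloads as CSV, so a first script often just checks which columns a file has and how many rows it holds. Below is a minimal sketch using only the Python standard library. The sample text, its column names, and the helper names summarize_csv and column_values are made up for illustration and do not come from any of the listed sites.

```python
import csv
import io

# Made-up placeholder text standing in for the contents of a downloaded CSV file.
SAMPLE_CSV = """name,value
a,1
b,2
c,3
"""


def summarize_csv(text):
    """Return the column names and the number of data rows in CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)  # reading the rows also populates reader.fieldnames
    return {"columns": reader.fieldnames, "rows": len(rows)}


def column_values(text, column):
    """Return all values of one column as strings, in file order."""
    reader = csv.DictReader(io.StringIO(text))
    return [row[column] for row in reader]


summary = summarize_csv(SAMPLE_CSV)
print(summary["columns"], summary["rows"])
print(column_values(SAMPLE_CSV, "value"))
```

For a real file, the text could be read with open(path, newline="", encoding="utf-8").read() and passed to the same functions; for large files a library such as pandas may be more convenient.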