Here I will keep a list of datasets that are publicly available for research in machine learning and related areas.
- MovieLens dataset: This is a movie ratings dataset.
- Yahoo! Music: This is a music ratings dataset.
- Heritage Health Prize: The goal of the prize is to develop a predictive algorithm that can identify patients who will be admitted to the hospital within the next year, using historical claims data.
These are only a few examples. Many more datasets are available from the website Infochimps, and Kaggle is a website dedicated to hosting competitions built around large datasets.