Datasets

On this page we collect three datasets published by our research group: ContentWise Impressions, the 30Music Dataset, and the TV Audience Dataset. If you use any of these datasets, please remember to cite the corresponding paper.

ContentWise Impressions

The dataset is the first open-source collection of interactions and impressions (previous recommendation lists) with television and cinema content, collected from users subscribed to an industrial video-on-demand platform.

The dataset is available after filling out this survey.

 

30Music Dataset

The dataset is a collection of listening and playlist data retrieved from Internet radio stations through the Last.fm API.

The dataset is available at this link.

  • Turrin, R., Quadrana, M., Condorelli, A., Pagano, R., & Cremonesi, P. “30Music listening and playlists dataset”, RecSys 2015 [PDF], [BibTex].

 

TV Audience Dataset

The dataset contains the TV viewing habits of 13k users over 217 channels during a period of 4 months in 2013. It covers both over-the-air (digital terrestrial) and satellite broadcasts, on free and pay-TV channels.

The dataset is available at this link.

  • Turrin, R., Condorelli, A., Cremonesi, P., & Pagano, R. “Time-based TV programs prediction”, 1st Workshop on Recommender Systems for Television and Online Video, RecSys 2014 [PDF]