Code: EDI-2019-17-VRT_2

Domain: Internet & Media

Summary

To develop a system that creates synthetic data, based on real data, that can be shared safely with third parties.

Proposed by

 

VRT NWS is the news service of VRT, the Flemish public broadcaster. VRT NWS is active in television, radio and online.

Description

VRT monitors the behaviour of users on its websites. These data are used for analysis, optimisation, recommendations, churn analysis and much more. The technology for these data-driven activities is constantly evolving. Typically, a third party that develops a system during a project has access to anonymised data: the user data are replaced by hashes, but the behavioural data remain unchanged. Because the behaviour itself can act as a fingerprint, it is theoretically possible to trace the behavioural data back to the user. An example is the Netflix Prize, where the datasets had to be taken offline after a lawsuit.
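The anonymisation described above can be sketched as follows. This is a minimal illustration, not VRT's actual pipeline: the record layout, the field names (`user_id`, `page`, `timestamp`) and the salted SHA-256 scheme are all assumptions made for the example.

```python
import hashlib

# Hypothetical click records; field names are illustrative only.
events = [
    {"user_id": "alice@example.com", "page": "/news/politics", "timestamp": 1000},
    {"user_id": "bob@example.com", "page": "/sport", "timestamp": 1005},
    {"user_id": "alice@example.com", "page": "/news/culture", "timestamp": 1060},
]

SALT = b"keep-this-secret"  # assumption: a secret salt held by the data owner


def pseudonymise(event):
    """Replace the user identifier by a salted hash; leave behaviour untouched."""
    digest = hashlib.sha256(SALT + event["user_id"].encode("utf-8")).hexdigest()
    return {**event, "user_id": digest}


anonymised = [pseudonymise(e) for e in events]
```

Note that both of Alice's events still carry the same hash, and the pages and timestamps are exactly those of the real data. Anyone who knows part of a user's browsing history from another source could match it against these records, which is the re-identification risk the paragraph describes.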

This kind of problem can be avoided by creating synthetic data. Synthetic data have the same statistical properties as the real data, but cannot be traced back to a user, as there are no real users behind them. Third parties can use the synthetic data to develop their systems.
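To make the idea concrete, here is a deliberately naive baseline: a first-order Markov chain fitted on page sequences, from which new sessions are sampled. The session format and page names are invented for the example, and this model is only one possible approach, not the one the challenge prescribes.

```python
import random
from collections import Counter, defaultdict

START, END = "<start>", "<end>"  # artificial markers for session boundaries


def fit_markov(sessions):
    """Count page-to-page transitions across all real sessions."""
    transitions = defaultdict(Counter)
    for pages in sessions:
        path = [START, *pages, END]
        for current, nxt in zip(path, path[1:]):
            transitions[current][nxt] += 1
    return transitions


def sample_session(transitions, rng, max_len=50):
    """Generate one synthetic session by walking the fitted chain."""
    pages, current = [], START
    while len(pages) < max_len:
        options = transitions[current]
        current = rng.choices(list(options), weights=list(options.values()))[0]
        if current == END:
            break
        pages.append(current)
    return pages


# Hypothetical real sessions (one list of visited pages per session).
real_sessions = [
    ["home", "news", "sport"],
    ["home", "sport"],
    ["home", "news", "news", "culture"],
]

model = fit_markov(real_sessions)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
synthetic_sessions = [sample_session(model, rng) for _ in range(1000)]
```

The sampled sessions follow the transition frequencies of the real ones without being tied to a user record. A model this simple can still reproduce a real session verbatim, so it offers no formal privacy guarantee; a real solution would need to address that explicitly.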

When the system is moved to production, VRT retrains it with real data, without the third party ever having had access to those data.

The challenge here is to develop a system that creates synthetic data, based on real data, that can be shared safely with third parties.

Data

The challenge has the following sample datasets available for download

Expected outcomes

The synthetic data should have the same statistical properties as the real data. The successful candidate will prove this by performing relevant tests on both the synthetic and the real data:

  • statistical analysis
  • recommendations
  • churn prediction

The tests should give comparable results for both sets: comparable means, comparable distributions, and comparable evaluation scores for the recommender and the churn prediction.
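The statistical part of such a comparison could look like the sketch below. It compares the mean of one numeric feature and computes the two-sample Kolmogorov-Smirnov statistic, that is, the largest gap between the two empirical distribution functions. The feature (session length) and the sample values are made up for illustration, and the choice of this statistic is an assumption, not a requirement of the challenge.

```python
from bisect import bisect_right
from statistics import mean


def ks_statistic(a, b):
    """Largest gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in set(a_sorted) | set(b_sorted):
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def compare(real, synthetic):
    """Summarise how closely one synthetic feature tracks the real one."""
    return {
        "mean_real": mean(real),
        "mean_synthetic": mean(synthetic),
        "ks": ks_statistic(real, synthetic),
    }


# Hypothetical session lengths (pages per session) for both datasets.
real_lengths = [3, 2, 4, 3, 5, 2, 3]
synthetic_lengths = [3, 3, 4, 2, 5, 2, 4]

report = compare(real_lengths, synthetic_lengths)
```

A small gap between the means and a statistic close to 0 suggest that the synthetic feature follows the real distribution. The recommender and churn checks would follow the same pattern: train and evaluate the same model on each dataset and compare the resulting scores.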

The system should be provided in two ways:

  • Demonstrator for evaluation
  • Source code

How do we apply?
