Domain: Internet & Media
Assigning meaningful and detailed tags to VRT’s news stories, based on the content.
VRT NWS is the news service of VRT, the Flemish public broadcaster. It is active in television, radio and online.
The VRT NWS news website publishes around one hundred stories every day, most of them in Dutch. Metadata are needed for various purposes, such as recommendations and archiving. The stories are currently labelled with categories, but these are too broad: detailed, meaningful tags are needed. The tagging should be done automatically. Note that there is no training set at this moment: none of the stories have meaningful tags.
The challenge is to develop a system capable of assigning meaningful and detailed tags to VRT’s news stories, based on the content.
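Because no training set exists, one possible starting point is an unsupervised baseline. The sketch below is only an illustration, not a prescribed approach: it ranks the words of each story by TF-IDF and proposes the top-scoring ones as candidate tags. The function names `tokenize` and `suggest_tags`, the three-letter minimum word length and the `top_k` parameter are all assumptions made for this example. A real system would likely need Dutch-aware tokenization, multi-word terms and entity recognition.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased runs of at least three letters (hypothetical, simplistic rule;
    # it drops short function words such as "de" and "en").
    return re.findall(r"[^\W\d_]{3,}", text.lower())


def suggest_tags(stories, top_k=3):
    # Term frequencies per story.
    docs = [Counter(tokenize(story)) for story in stories]
    n_docs = len(docs)

    # Document frequency: in how many stories each term occurs.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())

    suggestions = []
    for doc in docs:
        # TF-IDF: frequent in this story, rare across the collection.
        scores = {
            term: tf * math.log(n_docs / doc_freq[term])
            for term, tf in doc.items()
        }
        # Sort by descending score, then alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        suggestions.append(ranked[:top_k])
    return suggestions
```

Calling `suggest_tags(stories, top_k=5)` on a list of story texts returns one list of candidate tags per story, in the same order as the input.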
The challenge has the following sample datasets available for download
The main result will be a system capable of assigning meaningful and detailed tags to VRT news stories. Priority goes to the stories written in Dutch, as these make up the bulk of the VRT newsroom’s output.
In later stages, the stories in French, English and German should also be tagged.
A team of experts will evaluate the accuracy and completeness of the tags on a randomly drawn sample of stories, together with the tags the system generated for them.
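The evaluation procedure could be supported along the following lines. This is a sketch under stated assumptions, not the evaluation protocol itself: it assumes that "accuracy" can be approximated by precision (the share of system tags the experts accept) and "completeness" by recall (the share of expert-expected tags the system found). The helper names `draw_sample` and `precision_recall` and the fixed `seed` are hypothetical.

```python
import random


def draw_sample(story_ids, k, seed=0):
    # Reproducible random sample of stories for expert review
    # (the fixed seed is an assumption, chosen so the draw can be repeated).
    ids = list(story_ids)
    rng = random.Random(seed)
    return rng.sample(ids, min(k, len(ids)))


def precision_recall(system_tags, expert_tags):
    # Compare the tags produced by the system with the tags the experts
    # consider correct for the same story.
    system, expert = set(system_tags), set(expert_tags)
    correct = len(system & expert)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(expert) if expert else 0.0
    return precision, recall
```

For a story where the system proposes two tags and the experts expect three, one of which overlaps, `precision_recall` returns a precision of one half and a recall of one third.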
The system should be provided as both:
- a demonstrator for expert evaluation
- the source code