BBC Programmes is a new project which aims to ensure that every programme the BBC broadcasts has a permanent, findable web presence. Launched in October 2007, it publishes programmes data that covers the eight BBC TV channels, ten national radio stations and the six stations covering Scotland, Northern Ireland and Wales. The database behind BBC Programmes also powers iPlayer, the BBC on-demand service that allows UK-based users to view a selection of programmes broadcast on the BBC networks from the last seven days.
Historically the BBC website has focused on providing information and rich online experiences to support only the major BBC programmes. With well over 1000 programmes broadcast every day across radio and TV networks, web coverage to date has been neither comprehensive nor permanent. BBC Programmes seeks to solve this problem by ensuring that every programme brand, series and episode has a persistent presence on the web.
To enable the sharing of this data in a structured way, we are investigating the linked data approach, where resources on the web can be far more than just documents. They can identify anything, from a particular person to a particular programme. These resources have representations, which can be machine-processable (through the use of RDF, Microformats, RDFa, etc.), and these representations can hold links to further web resources, allowing a client to jump from one dataset to another.
To provide direct access to the actual data backing BBC Programmes, we designed a Semantic Web ontology covering programmes data, the Programmes Ontology. This ontology provides web identifiers for concepts such as brand, series, or episode. The ontology is divided into two main parts. The first captures categorical information about programmes and the relations between such categories. For example, it allows the description of a brand, a series within that brand, a sub-series, and an episode within it. The second part describes the versions of an episode and their broadcast on a particular service.
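The two parts of the ontology can be illustrated with a minimal sketch that emits N-Triples for one brand, series, episode, version and broadcast. The term names under `http://purl.org/ontology/po/` (`Brand`, `Series`, `Episode`, `Version`, `Broadcast`, `Service`, `series`, `episode`, `version`, `broadcast_of`, `broadcast_on`) are our recollection of the published ontology and should be checked against it. The `example.org` identifiers and the helper functions are hypothetical and exist only for illustration.

```python
# Sketch only: term names are assumed from the Programmes Ontology and
# should be verified; all example.org URIs are made up for illustration.
PO = "http://purl.org/ontology/po/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/programmes/"  # hypothetical base URI


def triple(subject, predicate, obj):
    """Format one N-Triples statement whose object is a URI."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def typed(subject, po_class):
    """State that a resource is an instance of a Programmes Ontology class."""
    return triple(subject, RDF_TYPE, PO + po_class)


def example_triples():
    """Describe one brand, down to a single broadcast of one episode version."""
    brand = BASE + "brand-1#programme"
    series = BASE + "series-1#programme"
    episode = BASE + "episode-1#programme"
    version = BASE + "version-1#programme"
    broadcast = BASE + "broadcast-1#broadcast"
    service = BASE + "service-1#service"
    return [
        # Part one: categorical information and how the categories relate.
        typed(brand, "Brand"),
        typed(series, "Series"),
        typed(episode, "Episode"),
        triple(brand, PO + "series", series),
        triple(series, PO + "episode", episode),
        # Part two: versions of an episode and their broadcast on a service.
        typed(version, "Version"),
        typed(broadcast, "Broadcast"),
        typed(service, "Service"),
        triple(episode, PO + "version", version),
        triple(broadcast, PO + "broadcast_of", version),
        triple(broadcast, PO + "broadcast_on", service),
    ]


if __name__ == "__main__":
    print("\n".join(example_triples()))
```

Keeping the episode separate from its versions means that a repeat, or an edited cut, becomes a new version or broadcast rather than a new episode.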
We have designed the ontology so that it can describe any broadcaster's programmes, both live and on-demand. For example, the Southampton University student radio station, Surge, is using the Programmes Ontology to expose its schedule and programme information. We hope that broadcasters will use the ontology to interchange and interlink schedule information.
Using D2R Server, a Java application that maps relational databases to RDF and makes the result accessible through SPARQL, we have published the BBC Programmes data as Linked Data. Around 5 million RDF triples are exposed this way. Multiple views are available for individual records, such as a brand, series or episode, and the SPARQL interface allows the data to be queried directly. Because D2R reads from the live database at query time, we avoid synchronization issues: the latest data is always exposed.
Through the use of SPARQL, we can query the data using a variety of constraints that cannot be easily expressed through the Programmes web interface. We are also able to semantically connect to external data sources such as DBpedia to provide extra information that is not present in our dataset, such as date and place of birth of cast members.
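As an illustration of the kind of constraint that SPARQL makes easy to express, the sketch below builds a query for the titles of episodes broadcast on a given service and encodes it as a SPARQL protocol GET request. It makes no network call. The endpoint URL and the service URI are hypothetical, the `po:` property names are assumed from the Programmes Ontology, and the use of `dc:title` for episode titles is likewise an assumption about how the data is modelled.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the address of a real SPARQL service.
ENDPOINT = "http://example.org/sparql"


def build_query(service_uri, limit=10):
    """Build a SPARQL query for episodes broadcast on one service.

    The po: property names and dc:title are assumptions, not confirmed
    details of the published dataset.
    """
    return (
        "PREFIX po: <http://purl.org/ontology/po/>\n"
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
        "SELECT DISTINCT ?episode ?title WHERE {\n"
        "  ?broadcast a po:Broadcast ;\n"
        f"             po:broadcast_on <{service_uri}> ;\n"
        "             po:broadcast_of ?version .\n"
        "  ?episode po:version ?version ;\n"
        "           dc:title ?title .\n"
        f"}} LIMIT {int(limit)}"
    )


def request_url(query, endpoint=ENDPOINT):
    """Encode the query as a SPARQL protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    query = build_query("http://example.org/services/radio1#service")
    print(query)
    print(request_url(query))
```

The query joins broadcast, version and episode in a single graph pattern, which is the sort of traversal that a page-oriented web interface does not expose directly.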
Tom Scott is the Technical Project Team Leader in BBC Audio and Music Interactive where he is the Product Manager for the BBC’s comprehensive programme support (bbc.co.uk/programmes) and its underlying technology. Prior to the BBC, he was the Head of Operations for an information architecture and web development company, Simulacra.
Yves Raimond is currently a PhD student at the Centre for Digital Music, Queen Mary, University of London. His area of research is ontology-based knowledge management for music information retrieval systems. He graduated in 2005 from the ENST (Ecole Nationale Superieure des Telecommunications), Paris, France. His interests include music, music technologies, digital signal processing, open-source and semantic-web technologies.
Patrick Sinclair is a software engineer in Audio and Music Interactive at the BBC. He was formerly a research fellow at the University of Southampton, investigating the use of Semantic Web technology in the Cultural Heritage domain.
Nicholas Humfrey is a Software Engineer working in Audio and Music Interactive at the BBC in London. He was formerly a Research Assistant at the University of Southampton investigating IPv6, multicast and semantic web technologies.