Extracting data from the web for use in our programs has always been a challenge. Many developers will be familiar with techniques such as web scraping, which means parsing a human-readable web page to extract data, and might dream of more reliable ways to query different sources for data in a standardised way. Linked Data is a proposed answer to this problem that seems to be gaining some momentum, with data being exposed in this format by organisations such as the British Government and my own employer, The Open University. So how do we query these resources and get the data into our PHP scripts?
Back in December 2008 I wrote a small Perl script to let you enjoy podcasts from the Open University in MythStream, an add-on for MythTV that lets you watch streaming video content through MythTV.
The Open University here in the UK regularly co-produces educational programming in partnership with the BBC. Some of these programmes are for a wide audience, such as Coast, and others are for more specialist audiences, such as The Story of Maths. To catch these programmes you don't necessarily have to stay in and make sure you are sat on your sofa in front of the TV at a scheduled time; instead you can catch the repeats on BBC iPlayer (sorry - UK, IoM & Channel Islands only). My colleague Tony Hirst recently created a mashup to find the OU programmes posted to iPlayer in the last seven days, using feeds from Twitter, iPlayer and a Yahoo Pipe, and presented the results in a web page. Reading his blog post, it struck me that it would be really nice to present this in a way more suitable for a media centre PC connected to a TV: nice big fonts, an attractive interactive-TV type interface and ease of use from a remote control. Then I thought it would be even better to feed this into MythTV, integrating seven days of OU programming alongside the rest of your entertainment.