1
Supported by EU projects
12/12/2013, Athens, Greece
Open Data in Agriculture
Hands-on with data infrastructures that can power your agricultural data
products
Maths, Computing and Technology Faculty, The Open University, Walton Hall, Milton Keynes, MK7 6AA
www.open.ac.uk
mct-research.open.ac.uk
Jane Bromley, David King, David Morse
5
Objectives
An introduction to the Open University’s free material
• Show available metadata
• Talk about RDF – the format used for graph databases
• How to query the material through SPARQL
9
• Open Research Online – publications originating from OU researchers
• OU Podcasts
• Course Descriptions
• Some KMi datasets
• And…
11
Resource Description Framework
• one of the basic building blocks forming the web of semantic data
• defines a graph database
• format defines statements comprising:
Subject is the T-shirt
Predicate (property) is the colour
Object is white
A subject->predicate->object relationship is called a triple.
RDF
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:feature="http://www.linkeddatatools.com/clothing-features#">
  <rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt">
    <feature:color rdf:resource="http://www.linkeddatatools.com/colors#white"/>
  </rdf:Description>
</rdf:RDF>
RDF/XML - the XML form of RDF
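To make the triple concrete, here is a minimal sketch that reads the T-shirt example with Python's standard `xml.etree.ElementTree` and lists its subject, predicate and object. It assumes the `rdf:about` attribute is properly closed, it only handles `rdf:Description` elements whose objects are given as `rdf:resource`, and the helper name `triples_from_rdfxml` is ours; it is not a general RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:feature="http://www.linkeddatatools.com/clothing-features#">
  <rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt">
    <feature:color rdf:resource="http://www.linkeddatatools.com/colors#white"/>
  </rdf:Description>
</rdf:RDF>"""


def triples_from_rdfxml(text):
    """Return (subject, predicate, object) tuples for simple rdf:Description blocks."""
    root = ET.fromstring(text.encode("utf-8"))
    triples = []
    for desc in root.findall("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        for prop in desc:
            # ElementTree stores tags as "{namespace}local"; joining the two
            # parts gives the full predicate URI.
            namespace, local = prop.tag[1:].split("}")
            obj = prop.get("{%s}resource" % RDF_NS)
            triples.append((subject, namespace + local, obj))
    return triples


for subject, predicate, obj in triples_from_rdfxml(RDF_XML):
    print("subject:  ", subject)
    print("predicate:", predicate)
    print("object:   ", obj)
```

For real datasets a dedicated RDF library would be the safer choice, since RDF/XML allows many equivalent spellings of the same graph.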
13
select distinct ?props from <http://data.open.ac.uk/context/openlearn> where { ?subj ?props ?obj }
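The query above lists every distinct property used in the OpenLearn graph. A minimal sketch of sending it from Python follows. The endpoint address `http://data.open.ac.uk/sparql` is an assumption (the slides do not state it), and the function names are ours; the `query` parameter and the `application/sparql-results+json` media type come from the W3C SPARQL specifications.

```python
import urllib.parse
import urllib.request

# ASSUMPTION: the endpoint address is not given in the slides.
ENDPOINT = "http://data.open.ac.uk/sparql"

QUERY = """select distinct ?props
from <http://data.open.ac.uk/context/openlearn>
where { ?subj ?props ?obj }"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request carrying the query in the standard 'query' parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Show the URL that would be requested without contacting the server.
    print(build_request(QUERY).full_url)
    # print(run_query(QUERY))  # uncomment to contact the (assumed) endpoint
```

Keeping request construction separate from the network call makes it possible to inspect the encoded URL before anything is sent.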
18
A three-step process:
1. Find all the subjects and choose those relevant to agriculture
2. Find all the OpenLearn units that have any of these subjects
3. Collect the metadata for each of the selected OpenLearn units
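The slides do not show the query used for step 1, so the following is only a sketch of one way it might look. It assumes subjects are attached to units with the `http://purl.org/dc/terms/subject` predicate that appears later in the talk, and that results come back in the W3C SPARQL JSON results layout. `SAMPLE_RESPONSE` is an illustrative stand-in built from topic URIs in the slides, not real endpoint output, and `topics_from_results` is a hypothetical helper.

```python
import json

# ASSUMPTION: a possible step 1 query; the original is not in the slides.
STEP1_QUERY = """select distinct ?topic
from <http://data.open.ac.uk/context/openlearn>
where { ?olu <http://purl.org/dc/terms/subject> ?topic }"""

# Illustrative response in the SPARQL JSON results layout (not real output).
SAMPLE_RESPONSE = """{
  "head": {"vars": ["topic"]},
  "results": {"bindings": [
    {"topic": {"type": "uri", "value": "http://data.open.ac.uk/topic/biology"}},
    {"topic": {"type": "uri", "value": "http://data.open.ac.uk/topic/herbicides"}}
  ]}
}"""


def topics_from_results(response_text, variable="topic"):
    """Pull the bound values of one variable out of a SPARQL JSON result."""
    data = json.loads(response_text)
    return [
        binding[variable]["value"]
        for binding in data["results"]["bindings"]
        if variable in binding
    ]


for topic in topics_from_results(SAMPLE_RESPONSE):
    print(topic)
```

Choosing which of the returned topics are relevant to agriculture remains a manual step.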
20
1130 topics as of the end of October 2013:
http://data.open.ac.uk/topic/psychology
http://data.open.ac.uk/topic/sociology
http://data.open.ac.uk/topic/social_care
http://data.open.ac.uk/topic/educational_practice
http://data.open.ac.uk/topic/biology
http://data.open.ac.uk/topic/herbicides
http://data.open.ac.uk/topic/energy
http://data.open.ac.uk/topic/units
http://data.open.ac.uk/topic/pre_course_work
http://data.open.ac.uk/topic/employment
http://data.open.ac.uk/topic/using_maths
http://data.open.ac.uk/topic/numbers
http://data.open.ac.uk/topic/nuclear
http://data.open.ac.uk/topic/environmental_science
http://data.open.ac.uk/topic/audio
http://data.open.ac.uk/topic/cctv
http://data.open.ac.uk/topic/social_work
http://data.open.ac.uk/topic/scotland
http://data.open.ac.uk/topic/personalisation
http://data.open.ac.uk/topic/religious_studies
http://data.open.ac.uk/topic/religion
…
21
40 topics chosen:
<http://data.open.ac.uk/topic/agriculture>,
<http://data.open.ac.uk/topic/environment>,
<http://data.open.ac.uk/topic/the_environment>,
<http://data.open.ac.uk/topic/nature_&_environment>,
<http://data.open.ac.uk/topic/environmental_science>,
<http://data.open.ac.uk/topic/herbicides>,
<http://data.open.ac.uk/topic/ecology>,
<http://data.open.ac.uk/topic/genetics>,
<http://data.open.ac.uk/topic/diversity>,
<http://data.open.ac.uk/topic/global_warming>,
<http://data.open.ac.uk/topic/biodiversity>,
<http://data.open.ac.uk/topic/pollution>,
<http://data.open.ac.uk/topic/conservation>,
<http://data.open.ac.uk/topic/the_environment>,
<http://data.open.ac.uk/topic/climate>,
<http://data.open.ac.uk/topic/environmental_studies>,
<http://data.open.ac.uk/topic/climate_change>,
<http://data.open.ac.uk/topic/sustainability>,
<http://data.open.ac.uk/topic/biogas>,
<http://data.open.ac.uk/topic/biofuels>,
<http://data.open.ac.uk/topic/photosynthesis>,
<http://data.open.ac.uk/topic/waste_management>,
<http://data.open.ac.uk/topic/landfill>,
<http://data.open.ac.uk/topic/economic_growth>,
<http://data.open.ac.uk/topic/waste>,
<http://data.open.ac.uk/topic/acid_rain>,
<http://data.open.ac.uk/topic/weather>,
<http://data.open.ac.uk/topic/meteorology>,
<http://data.open.ac.uk/topic/natural_resources>,
<http://data.open.ac.uk/topic/animals>,
<http://data.open.ac.uk/topic/ecological_sustainability>,
<http://data.open.ac.uk/topic/overfishing>,
<http://data.open.ac.uk/topic/ecosystem>,
<http://data.open.ac.uk/topic/the_end_of_nature>,
<http://data.open.ac.uk/topic/survival_of_the_fittest>,
<http://data.open.ac.uk/topic/barter>,
<http://data.open.ac.uk/topic/plants>,
<http://data.open.ac.uk/topic/freshwater>,
<http://data.open.ac.uk/topic/maps>,
<http://data.open.ac.uk/topic/food>..
Topics relevant to agriculture?
22
A three-step process:
1. Find all the subjects and choose those relevant to agriculture
2. Find all the OpenLearn units that have any of these subjects
3. Collect the metadata for each of the selected OpenLearn units
23
select distinct ?olu
from <http://data.open.ac.uk/context/openlearn>
where {
  ?olu <http://purl.org/dc/terms/subject> ?topic .
  filter ( ?topic in (
    <http://data.open.ac.uk/topic/agriculture>,
    <http://data.open.ac.uk/topic/environment>,
    .. .. etc.
  ) )
}
→ 85 OpenLearn units
Units are extracts from OU courses. Each has multiple pages of material and is expected to take many hours of study.
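Typing a long topic list into the filter by hand is error-prone, so the query can be assembled from a Python list instead. This is a minimal sketch under the assumption that the chosen topic URIs are already held in a list; the function name `build_units_query` is hypothetical and only two topics are shown.

```python
GRAPH = "http://data.open.ac.uk/context/openlearn"
SUBJECT = "http://purl.org/dc/terms/subject"


def build_units_query(topics, graph=GRAPH):
    """Return a SPARQL query selecting units tagged with any of the given topics."""
    if not topics:
        raise ValueError("at least one topic URI is required")
    # Wrap each URI in angle brackets and separate the entries with commas.
    topic_list = ",\n      ".join("<{}>".format(topic) for topic in topics)
    return (
        "select distinct ?olu\n"
        "from <{graph}>\n"
        "where {{\n"
        "  ?olu <{subject}> ?topic .\n"
        "  filter ( ?topic in (\n"
        "      {topics}\n"
        "  ) )\n"
        "}}"
    ).format(graph=graph, subject=SUBJECT, topics=topic_list)


chosen = [
    "http://data.open.ac.uk/topic/agriculture",
    "http://data.open.ac.uk/topic/environment",
]
print(build_units_query(chosen))
```

Because the filter uses `in`, a unit is selected when at least one of its subjects is in the list; it does not need to carry all of them.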
24
http://data.open.ac.uk/openlearn/s250_3
http://data.open.ac.uk/openlearn/sdk125_1
http://data.open.ac.uk/openlearn/t123_1
http://data.open.ac.uk/openlearn/t206_2
http://data.open.ac.uk/openlearn/t213_1
http://data.open.ac.uk/openlearn/s173_1
http://data.open.ac.uk/openlearn/u116_3
http://data.open.ac.uk/openlearn/s278_19
http://data.open.ac.uk/openlearn/t306_3
http://data.open.ac.uk/openlearn/s189_1
http://data.open.ac.uk/openlearn/s344_1
http://data.open.ac.uk/openlearn/s324_1
http://data.open.ac.uk/openlearn/s250_2
……
25
http://data.open.ac.uk/openlearn/s250_2
http://www.open.edu/openlearn/science-maths-technology/science/environmental-science/social-issues-and-gm-crops/content-section-0
This unit is an adapted extract from the course Science in context (S250)
26
A three-step process:
1. Find all the subjects and choose those relevant to agriculture
2. Find all the OpenLearn units that have any of these subjects
3. Collect the metadata for each of the selected OpenLearn units
27
import urllib.parse
import urllib.request

# To run: python get_SPARQL_from_OpenData.py
# Edit this file in two places to choose output format as json or rdf/xml


def run_SPARQL(course_id):
    ''' returns results of SPARQL query'''
    # EDIT HERE
    # place course_id in request
    # req = urllib.request.Request('http://data.open.ac.uk/openlearn/{}'.format(course_id), headers={'Accept': 'application/rdf+json'})
    req = urllib.request.Request('http://data.open.ac.uk/openlearn/{}'.format(course_id), headers={'Accept': 'application/rdf+xml'})
    # fire off the query
    f = urllib.request.urlopen(req)
    # pass back the query result having rendered it readable first
    return f.read().decode('utf-8')


if __name__ == '__main__':
    llist = ['a180_2', 'b823_1', 'd837_1', 'dd100_7', 'e500_11', 'k111_1']  # …
    for course_id in llist:
        print(course_id)
        # run query with chosen course id
        result = run_SPARQL(course_id)
        # EDIT HERE
        # with open('{}.json'.format(course_id), 'w', encoding='utf-8', newline='\n') as f:
        with open('{}.xml'.format(course_id), 'w', encoding='utf-8', newline='\n') as f:
            f.write(result)
Python script to dump the metadata
28
{
  "http://data.open.ac.uk/openlearn/s250_2" : {
    "http://purl.org/dc/terms/language" : [
      { "type" : "literal" , "value" : "en-gb" , "datatype" : "http://www.w3.org/2001/XMLSchema#string" }
    ] ,
    "http://data.open.ac.uk/openlearn/ontology/relatesToCourse" : [
      { "type" : "uri" , "value" : "http://data.open.ac.uk/course/s250" }
    ] ,
    "http://purl.org/dc/terms/title" : [
      { "type" : "literal" , "value" : "Social issues and GM crops" , "datatype" : "http://www.w3.org/2001/XMLSchema#string" }
……
JSON format
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:j.0="http://dbpedia.org/property/"
         xmlns:j.1="http://xmlns.com/foaf/0.1/"
         xmlns:j.3="http://web.resource.org/cc/"
         xmlns:j.2="http://www.w3.org/TR/2010/WD-mediaont-10-20100608/"
         xmlns:j.4="http://purl.org/dc/terms/"
         xmlns:j.5="http://data.open.ac.uk/openlearn/ontology/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <j.1:Document rdf:about="http://data.open.ac.uk/openlearn/s250_2">
    <j.2:locator rdf:resource="http://www.open.edu/openlearn/nature-environment/the-environment/environmental-science/social-issues-and-gm-crops/content-section-0"/>
    <j.5:relatesToCourse rdf:resource="http://data.open.ac.uk/course/s250"/>
    <j.4:creator rdf:resource="http://data.open.ac.uk/organization/the_open_university"/>
    <j.4:subject rdf:resource="http://data.open.ac.uk/topic/risk"/>
    <j.4:published rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-06-02T23:00:00Z</j.4:published>
……
RDF/XML format
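Once the JSON output has been saved, it can be flattened back into subject, predicate, value rows with the standard `json` module. The sketch below is illustrative only: `SAMPLE` repeats the three properties visible on the slide and closes the brackets the slide leaves open, so it is not the full record for the unit, and `flatten` is a hypothetical helper name.

```python
import json

# Only the three properties visible on the slide; the real record has more.
SAMPLE = """{
  "http://data.open.ac.uk/openlearn/s250_2": {
    "http://purl.org/dc/terms/language": [
      {"type": "literal", "value": "en-gb",
       "datatype": "http://www.w3.org/2001/XMLSchema#string"}
    ],
    "http://data.open.ac.uk/openlearn/ontology/relatesToCourse": [
      {"type": "uri", "value": "http://data.open.ac.uk/course/s250"}
    ],
    "http://purl.org/dc/terms/title": [
      {"type": "literal", "value": "Social issues and GM crops",
       "datatype": "http://www.w3.org/2001/XMLSchema#string"}
    ]
  }
}"""


def flatten(rdf_json_text):
    """Yield one (subject, predicate, value) row per statement."""
    data = json.loads(rdf_json_text)
    for subject, predicates in data.items():
        for predicate, objects in predicates.items():
            # A predicate may carry several objects, hence the inner list.
            for obj in objects:
                yield subject, predicate, obj["value"]


for subject, predicate, value in flatten(SAMPLE):
    print(subject, predicate, value, sep=" | ")
```

Each row is again a triple, which makes the link between the JSON layout and the RDF model explicit.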
29
Summary:
A three-step process:
1. Find all subjects/keywords relevant to agriculture
2. Identify OpenLearn units with these subjects
3. Collect the metadata for each OpenLearn unit
All the scripts (and more) are available