TRANSCRIPT
BUILDING CROP ONTOLOGY FOR FARMERS
Saturday, 17 December 2011
Prof. N.L. Sarda
GISE Advanced Research Lab, CSE Dept, IIT Bombay
Neha, M.Tech (IT), Banasthali University, Rajasthan (2011-2012)
Outline
1. Introduction
2. The Agro-Advisory System
3. Tools & Techniques
4. Knowledge Modeling
5. System Design
6. Summary
7. References
INTRODUCTION
This project aims to create a knowledge model for the cotton crop in the context of the Agropedia Indica project.
Farmers have many questions about soil and climate types, pests, diseases and the activity timelines of a crop.
Handling such queries requires two things:
- A knowledge-based system
- A searching mechanism
Related work
- aAQUA system by IIT Bombay
- eSagu by IIIT Hyderabad
- mKRISHI at TCS
- Agro-Advisory System by Chau at IIT Bombay
What is Ontology?
Ontology in philosophy: “An ontology is a systematic account of Existence.”
Ontology in computer science [Gruber-93]: “An ontology is a formal, explicit specification of a conceptualization” [3].
- Formal: machine understandable.
- Explicit specification: knowledge of a domain is represented in a declarative formalism, i.e. as concepts, properties, functions and axioms.
- Conceptualization: an abstract, simplified view of the world that we wish to represent for some domain.
What is Ontology? O = [C, P, R, A]
- Concept (C): a conceptual entity of the domain
- Property (P): an attribute describing a concept
- Relationship (R): a connection between concepts or properties
- Axiom (A): a coherent description relating concepts, relationships and properties via logical expressions
Figure 1. Representation of Ontology [wsmo_tutorial.ppt]. Labels shown: Person, Prof., Student, Lecture, Name, email, R.Field, Topic, Lecture No.; links shown: is-a hierarchy, attends, holds.
Related Work
aAQUA
- Online multilingual, multimedia question-and-answer community forum.
- Discussion forum where a farmer submits a problem related to agriculture.
- Agriculture experts or other farmers provide solutions.
eSagu
- A query-less system that provides agro-advice without the farmer asking a question.
- Delivers personalized expert advice to each individual farm in a timely manner and at regular intervals.
mKRISHI
- Allows farmers to use the audio-visual facilities of a mobile phone to express their queries to experts with minimal use of text.
Agro-Advisory System
- Web-based query-answering support for farmers
- An ontology-based knowledge system
- Knowledge acquisition done with the help of agro experts
Need for Agro-Advisory System
- A system is needed that understands a query exactly as farmers see their problem and ask about it.
- The data and knowledge captured in the system should be able to answer the user's query.
System Architecture
IIT Kanpur has contributed to the agricultural ontology part by constructing knowledge models for various crops.
However, a knowledge model for the cotton crop has not yet been built.
Figure 2. System Architecture by Chau [1]
Tools and Techniques
Tool: Description
CMap 5.04.02 (KM Kit): Cmap is the editor from the Institute for Human and Machine Cognition, used to build the knowledge model (KM). It is simple and free to use.
OWL (Web Ontology Language): OWL is a W3C-recommended language that brings the expressive and reasoning power of description logic to semantic-based systems.
Oracle 11g: Oracle 11g supports semantic technologies for storing RDF data, in which resources are identified by URIs. All triples are parsed and stored in the system as entries in tables under the MDSYS schema.
Jena: Jena is an open source framework developed by Hewlett-Packard. It is a Java API for accessing and manipulating RDF statements.
SPARQL: SPARQL is the query language of the Semantic Web.
Eclipse: Eclipse is an open source community whose projects focus on building an open development platform of extensible frameworks, tools and runtimes for building, deploying and managing software across the lifecycle.
Apache Tomcat: Apache Tomcat is the servlet container used in the official Reference Implementation for the Java Servlet and JSP technologies.
Proposed System for Semantic Querying
Information Available
- The information needed to answer farmers is available in the system in various forms.
- Agricultural data specifying farming details has been collected.
Agricultural Data Gathered
- Data gathered from 3 districts of Punjab
- Gives information about farmers' farming practices for the last 5 years
- Information about various factors affecting farming practices, such as:
  - location
  - type of soil
  - time of sowing of the crop
  - type of fertilizer/insecticide used, its frequency and timings
Knowledge Model
- Concept proposed by the agropedia project of IIT Kanpur
- Mainly used to navigate agricultural knowledge and to organize and search agricultural content
- Knowledge models are organized in the following ways:
  - A generic map for crops
  - Specialized maps which are crop specific
  - Maps on diseases, pests, and many more
- IIT Kanpur has contributed knowledge models for nine selected crops.
- We focus on generating a knowledge model for cotton.
CMap: The Editor
The CMap tool was used because of its advantages:
- simple interface
- ease of use
- export of data to formats such as XML, OWL and TXT, plus JPG and PDF for visualization
- Maximum expressivity with minimum complexity.
- It should be possible to get cross-map information. C-Maps can be reused, provided the concept names are the same.
- Relationships should be used consistently.
Figure 3. Snapshot of Cmap Tool 5.04.02
Knowledge Modeling Guidelines[7]
- Knowledge is represented as boxes. Each box stores a unique and well-defined element (a concept or an instance).
- Boxes are linked by arrows.
- CMap allows navigation of knowledge across maps.
- URIs can be reused in different maps, i.e. the cotton crop ontology can be linked with the general pest ontology, diseases ontology, fertilizers ontology, etc. through the same URI.
Figure 4. Cross Map Linkages [7]
Figure 5. Knowledge Representation in Cmap (labels shown: Loamy is a Soil)
Type of Relationships
- are [subclassOf]: links concepts hierarchically, i.e. a generic object hasSubclass a more specific object (example: Sucking_Pests are Insect_Pests).
- is a [instanceOf]: links every instance to its concept (example: instance “Jassids” is a “Sucking_Pests”).
- Any other type of property: should be used consistently (example: usesProcess should be used only between processes or methods).
Figure 6. ‘are’ Relationship (labels shown: Varieties, Desi_Cotton, American_Cotton)
Figure 7. ‘is-a’ Relationship (labels shown: Sandy_Loamy, Loamy, Soil)
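The two relationship types can be illustrated with a small in-memory sketch. It is not how CMap or OWL stores these links; the class RelationshipSketch and its methods are hypothetical names, and each concept is assumed to have at most one parent. It shows how an 'is a' link combined with 'are' links lets an instance be found under every more generic concept.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** In-memory sketch of the 'are' (subclassOf) and 'is a' (instanceOf) links. */
final class RelationshipSketch {
    // specific concept -> generic concept (a single parent per concept, for simplicity)
    private final Map<String, String> subclassOf = new HashMap<>();
    // instance -> the concept it is directly linked to
    private final Map<String, String> instanceOf = new HashMap<>();

    void are(String specific, String generic) {
        subclassOf.put(specific, generic);
    }

    void isA(String instance, String concept) {
        instanceOf.put(instance, concept);
    }

    /** Returns the direct concept of an instance followed by all of its ancestors. */
    Set<String> conceptsOf(String instance) {
        Set<String> result = new LinkedHashSet<>();
        String current = instanceOf.get(instance);
        // add() returns false if the concept was already visited, which stops on a cycle
        while (current != null && result.add(current)) {
            current = subclassOf.get(current);
        }
        return result;
    }

    public static void main(String[] args) {
        RelationshipSketch kb = new RelationshipSketch();
        kb.are("Sucking_Pests", "Insect_Pests");
        kb.isA("Jassids", "Sucking_Pests");
        System.out.println(kb.conceptsOf("Jassids")); // [Sucking_Pests, Insect_Pests]
    }
}
```

Walking up the 'are' chain is the simplest form of the transitivity that a reasoner would apply to subclassOf.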
Adding Info & OWL Generation
Figure 8. Adding Info (URI) in CMap
Figure 9. OWL file generated from Map
Generic Crop Ontology
Contains information that is common to every crop, such as:
- origin
- environmental information
- varieties and cropping systems
- production and post-production practices
Figure 10. A snapshot of generic Crop Ontology[7].
Knowledge Model for Cotton
- The knowledge model is a specialization of the generic crop model.
- Provides information specific to the cotton crop as per the Foundational Agricultural Crop Ontology.
- Also gives details about the diseases and pests affecting cotton.
Figure 11. A snapshot of Cotton ontology[2]
Knowledge Model for Cotton…
Pest Knowledge model
Figure 12. A snapshot of Pest ontology[2].
Knowledge Model for Cotton…
Disease Knowledge model
Figure 13. A snapshot of Disease ontology[2].
Resultant Ontology
Figure 14. Final Generic Cotton Crop Ontology[2].
Rules, Discovered Patterns & Domain Knowledge
- Rules can be defined over existing relations in the ontology (e.g. transitivity) or over any new relation.
- The ontology can be divided into an informative part and an action part; this information is stored in the domain knowledge.
- Information about the kinds of words that can be used in a query is collected and stored.
- Discovered patterns will give us facts such as the causes behind poor yields in a region.
System Design
Figure 16. System Overview (labels shown: Raw Description, Form Based Queries, User Interface, COE Interface, Storage Interface, Oracle (Jena), CMap (OWL), Application)
Different Approaches for capturing queries
The user interface allows the user to enter information. The user can provide it in the following ways:
Natural Language Query
- User query in complete natural language, in full sentences
- Convenient for the user, as parsing of the input is carried out by the system
- E.g. What treatment should be done as soon as leaves curl and turn brown?
Keyword Query
- A restrictive approach compared to a full-sentence natural language query
- The user enters just the keywords of the question
- E.g. treatment leaves curl and brown
Form-based Query
- The user gives his details/queries in a form provided by the system
- The form takes key-value pairs as input
- E.g. leaf=“curl and brown” where key=“leaf”, value=“curl and brown”
Context-Observation-Action based Query
- Context, by default, implies the details of the farmer unless it refers to some other context.
- Observation can be the state of his crop, soil, or weather conditions.
- Action is what he asks the system to advise, or an action of his own that he wants confirmed.
- E.g. Context is “ ”, Observation is “leaves curl and brown” and Action is “Treatment to be done”
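The two structured forms of input can be sketched as plain data structures. This is only an illustration under stated assumptions: the class QueryCaptureSketch, the ";" separator between form fields and the method names are hypothetical and are not taken from the system described here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of a form-based query and a Context-Observation-Action query. */
final class QueryCaptureSketch {

    /** Parses form input such as: leaf="curl and brown"; soil=loamy */
    static Map<String, String> parseForm(String input) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : input.split(";")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // skip fragments that have no key
            }
            String key = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            // drop one pair of surrounding double quotes, if present
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            fields.put(key, value);
        }
        return fields;
    }

    /** A COA query; an empty context stands for the farmer's own details. */
    static final class CoaQuery {
        final String context;
        final String observation;
        final String action;

        CoaQuery(String context, String observation, String action) {
            this.context = context;
            this.observation = observation;
            this.action = action;
        }

        boolean usesDefaultContext() {
            return context == null || context.trim().isEmpty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseForm("leaf=\"curl and brown\"")); // {leaf=curl and brown}
        CoaQuery q = new CoaQuery("", "leaves curl and brown", "Treatment to be done");
        System.out.println(q.usesDefaultContext()); // true
    }
}
```

Both forms give the system named fields to match against the ontology, which avoids the sentence parsing that a natural language query requires.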
Storage Interface [10]
Oracle 11g
- Supports semantic technologies for storing RDF data, in which resources are identified by URIs.
- All triples are parsed and stored in the system as entries in tables under the MDSYS schema. A triple {subject, property, object} is treated as one database object.
- A single document containing multiple triples results in multiple database objects.
- All the subjects (URIs/blank nodes) and objects (URIs/blank nodes/literals) of triples are mapped to nodes in a semantic data network.
- All properties (URIs) are mapped to network links whose start node and end node are the subject and object.
In my ontology there are:
- Asserted triples count: 616
- Asserted + inferred triples count: 705
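The mapping of triples to nodes and links can be pictured with a small in-memory sketch. It does not reproduce Oracle's MDSYS tables; the class TripleNetworkSketch and its methods are hypothetical, and the example resources under http://localhost/default# are assumed for illustration. Subjects and objects become deduplicated nodes, and every triple contributes one link.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Sketch of triples viewed as a network of nodes (subjects, objects) and links (properties). */
final class TripleNetworkSketch {

    static final class Triple {
        final String subject;
        final String property;
        final String object;

        Triple(String subject, String property, String object) {
            this.subject = subject;
            this.property = property;
            this.object = object;
        }
    }

    private final List<Triple> links = new ArrayList<>();    // one link per triple
    private final Set<String> nodes = new LinkedHashSet<>(); // subjects and objects, deduplicated

    void add(String subject, String property, String object) {
        links.add(new Triple(subject, property, object));
        nodes.add(subject);
        nodes.add(object);
    }

    int nodeCount() {
        return nodes.size();
    }

    int linkCount() {
        return links.size();
    }

    /** Follows every link with the given property that starts at the given subject node. */
    List<String> objectsOf(String subject, String property) {
        List<String> result = new ArrayList<>();
        for (Triple t : links) {
            if (t.subject.equals(subject) && t.property.equals(property)) {
                result.add(t.object);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String ns = "http://localhost/default#"; // assumed namespace for the example
        String subClassOf = "http://www.w3.org/2000/01/rdf-schema#subClassOf";
        String type = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

        TripleNetworkSketch net = new TripleNetworkSketch();
        net.add(ns + "Jassids", type, ns + "Sucking_Pests");
        net.add(ns + "Sucking_Pests", subClassOf, ns + "Insect_Pests");

        System.out.println(net.nodeCount()); // 3
        System.out.println(net.linkCount()); // 2
    }
}
```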
Storage Interface
Jena
The Jena Adapter for Oracle Database provides a Java-based interface to Oracle Database Semantic Technologies by implementing the well-known Jena Graph and Model APIs.
Includes:
- API for operations over RDF graphs
- Import and export from RDF/XML, N-Triples
- SPARQL query engine
- Creating persistent storage in a relational database
- Inherent OWL support and various internal reasoners
Storage Interface
SPARQL Query Language for RDF
- SPARQL can be used to express queries across diverse data sources stored natively as RDF.
- SPARQL is the query language of the Semantic Web.
SPARQL query syntax (built as a Java string):
    // Select each class that is a subclass of Growth_Stages.
    // The selected variable must be one that the pattern binds, here ?uri.
    String queryString =
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
        "SELECT ?uri " +
        "WHERE { " +
        "?uri rdfs:subClassOf <http://localhost/default#Growth_Stages> " +
        "} \n";
Conclusion
- Initial work towards system development:
  - System architecture
  - Ontology construction
  - Ontology loaded into Oracle 11g
- Question-answering subsystem
- The next task is to pose form-based queries on the ontology stored in Oracle and retrieve the appropriate results.
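As a hedged sketch of that next step, a concept chosen in a form could be turned into the kind of subclass query shown in the SPARQL section. The class SubclassQuerySketch and its method are hypothetical; the namespace http://localhost/default# is taken from that query, and restricting concept names to letters, digits and underscores is an assumption that keeps form input from altering the query. The resulting string could then be handed to a SPARQL engine such as Jena's.

```java
/** Sketch: build a SPARQL subclass query from a concept name entered in a form. */
final class SubclassQuerySketch {
    static final String RDFS = "http://www.w3.org/2000/01/rdf-schema#";
    static final String DEFAULT_NS = "http://localhost/default#"; // namespace used in the SPARQL section

    static String subclassesOf(String conceptName) {
        // Assumed naming rule: concept names look like Growth_Stages or Sucking_Pests.
        if (conceptName == null || !conceptName.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unexpected concept name: " + conceptName);
        }
        return "PREFIX rdfs: <" + RDFS + "> "
             + "SELECT ?uri "
             + "WHERE { ?uri rdfs:subClassOf <" + DEFAULT_NS + conceptName + "> }";
    }

    public static void main(String[] args) {
        System.out.println(subclassesOf("Growth_Stages"));
    }
}
```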
References
[1] Vo Thi Ngoc Chau. Agroadvisory system. Internal report, 2010.
[2] Neha and Sonali Sahni. Knowledge model for cotton. Internal report, IIT Bombay, 2011.
[3] Natalya F. Noy and Deborah L. McGuinness. Ontology Development 101: A Guide to Creating Your First Ontology.
[4] Toby Segaran, Colin Evans, and Jamie Taylor. Programming the Semantic Web.
[5] O'Reilly Publications. Practical RDF.
[6] Amit Tripathi, Vimlesh Kumar Yadav, and T.V. Prabhakar. IIT-K Agropedia: An ICT tool for extension services in Indian Agriculture. Available: http://agropedia.iitk.ac.in/
[7] Margherita Sini, Vimlesh Yadav, Jeetendra Singh, Vikas Awasthi, and Prabhakar TV. Knowledge models and guidelines in agropedia indica.
[8] E. Kaufmann, A. Bernstein, and R. Zumstein. Querix: A Natural Language Interface to Query Ontologies Based on Clarification Dialogs. In 5th International Semantic Web Conference (ISWC 2006), pages 980–981, 2006.
[9] Esther Kaufmann, Abraham Bernstein, and Lorenz Fischer. NLP-Reduce: A "naïve" but domain-independent natural language interface for querying ontologies. 4th European Semantic Web Conference (ESWC 2007), pages 1–2, 2007.
[10] Oracle. Semtech developer guide.