Artificial Neural Networks And XML
Presented by: M. Eftekhari
Advisor: Dr. S. Astaneh
Outlines
  Introduction
    From biological to Artificial Neural Nets
    Inherent capacities
  The Distributed Training Environment (DTE)
    Why distributed environment? (Motivation)
    Features
    JOONE
  XML-Based format for trained Neural Network definition
    Motivations
    Neural Network Markup Language (NNML)
    Decomposition of Neural Nets Model
    The neural model description in NNML
    Processing of NNML documents
  PMML
  ….
Introduction
From biological to Artificial Neuron (Intro.)
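A simple Artificial Neuron
[Figure: inputs x1 and x2, weighted by w1 and w2, are summed together with the weight w0 and passed through the activation function f]
$$y = f\left(\sum_{i=0}^{2} w_i x_i\right)$$
$$w_i(t+1) = w_i(t) + \Delta w_i$$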
The activation function plays the role of the events that occur in a real neuron of the brain
Weights are similar to synapses
The sum simulates the dendrites
Learning is the process of updating the weights
Output connections are similar to axons
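The weighted sum, the activation function and the weight update on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say how the weight change is computed, so the perceptron rule is used here, w0 is treated as a bias paired with a constant input of 1, and the step activation, the learning rate and the AND data set are all illustrative choices.

```python
def step(z):
    """Threshold activation: 1 if the weighted sum is non-negative, else 0."""
    return 1 if z >= 0 else 0

def neuron_output(weights, inputs, f=step):
    # weights[0] is w0; it is paired with a constant input x0 = 1 (assumption).
    xs = [1.0] + list(inputs)
    return f(sum(w * x for w, x in zip(weights, xs)))

def update_weights(weights, inputs, target, lr=1.0):
    # Perceptron rule (an assumption): delta_w_i = lr * (target - y) * x_i
    xs = [1.0] + list(inputs)
    y = neuron_output(weights, inputs)
    return [w + lr * (target - y) * x for w, x in zip(weights, xs)]

# Illustrative training data: the logical AND of two inputs.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0.0, 0.0, 0.0]
for _ in range(20):                  # a few passes over the data
    for x, t in data:
        weights = update_weights(weights, x, t)

predictions = [neuron_output(weights, x) for x, _ in data]
print(weights, predictions)
```

Each call to update_weights is one application of the update w_i(t+1) = w_i(t) + delta_w_i; after a few passes the neuron reproduces the AND targets.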
Inherent capacities (Intro.)
Neurons operate in parallel with one another because of the inherent structure of a neural network.
When a network learns, it works as an autonomous mechanism (like the speech part of the brain).
A central mechanism coordinates and schedules these self-organizing parts (an ensemble of parts may be needed).
So ANNs can be distributed. A trained ANN can be shared for use by other applications.
The Distributed Training Environment (DTE)
Why distributed environment? (Motivation)
Using a single neural net to solve a complex job is not sufficient.
For a complex problem, a net can fall into a local minimum without finding the best result.
What must be developed is a mechanism to train many neural nets in parallel on the same problem, on several machines, governing the whole process from a central point.
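The idea of training many nets on the same problem and keeping the best one can be sketched as follows. This is a minimal illustration, not JOONE's API: worker threads stand in for separate machines, the problem (XOR), the 2-2-1 network, the learning rate and the function names train_one and train_many are all assumptions made for the example.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, x):
    # w holds 9 weights of a 2-2-1 net: (bias, w1, w2) for each hidden
    # unit, then (bias, w1, w2) for the output unit.
    h = [sigmoid(w[3 * j] + w[3 * j + 1] * x[0] + w[3 * j + 2] * x[1])
         for j in range(2)]
    y = sigmoid(w[6] + w[7] * h[0] + w[8] * h[1])
    return h, y

def train_one(seed, epochs=1000, lr=0.5):
    """Train one net by plain back-propagation from a seed-dependent start."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(9)]
    for _ in range(epochs):
        for x, t in XOR:
            h, y = forward(w, x)
            d_out = (y - t) * y * (1 - y)
            # Hidden deltas use the output weights before they are updated.
            d_h = [d_out * w[7 + j] * h[j] * (1 - h[j]) for j in range(2)]
            w[6] -= lr * d_out
            for j in range(2):
                w[7 + j] -= lr * d_out * h[j]
                w[3 * j] -= lr * d_h[j]
                w[3 * j + 1] -= lr * d_h[j] * x[0]
                w[3 * j + 2] -= lr * d_h[j] * x[1]
    error = sum((forward(w, x)[1] - t) ** 2 for x, t in XOR)
    return seed, error, w

def train_many(seeds, workers=4):
    """Central point: hand out the jobs, collect the results, pick the fittest."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(train_one, seeds))
    best = min(results, key=lambda r: r[1])
    return best, results

best, results = train_many(range(6))
print("best seed:", best[0], "squared error:", best[1])
```

Some starting points may end in a poor local minimum while others do not, which is why the coordinator compares the final errors instead of trusting a single run.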
Java Object Oriented Neural Engine (JOONE)
Joone is a free neural net framework to create, train and test neural nets.
Its Distributed Training Environment trains many neural networks in parallel to find the fittest one for a given problem.
Features
Centralized control
The final results are logged to an XML file so that they can be analyzed by a custom program or script
The training process scales almost linearly as more machines are added
No manual configuration is needed to add or remove machines
Machines can be added or removed dynamically during the training process
The overall process is controlled by XML parameters
XML-Based format for trained Neural Network definition
Motivations
A unified way to define neural network models
Interchanging neural models as well as their documentation
Storing and manipulating models independently of the simulation system that produced them
The development of neural-based Web services
Neural Network Markup Language (NNML)
NNML is an XML-based language for describing neural network models.
NNML acts as an interface between the various software systems concerned with neural networks (see Fig of next slide).
NNML separates neural network generators, interpreters, and tools for visualization and knowledge extraction (see Fig of next slide).
It is applied to the distribution of neural network models.
It allows powerful simulation systems such as Matlab to be integrated with a Web interface.
Neural Network Markup Language(NNML)
Decomposition of Neural Nets Model and NNML
The neural model description in NNML
The problem and model purpose (Task)
Data dictionary
Data preprocessor
Neural network
Postprocessor
Auxiliary information about the model
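The neural model description in NNML: a simple neuron
[Figure: NNML markup for a simple neuron, with callouts "Which object" and "From which layer" pointing at the layer and object references]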
Processing of NNML documents
1. The model is generated and trained by means of the neural network simulator.
2. The interface module creates a hierarchical model on the basis of the internal representation.
3. Methods of any XML parser are called, and the object tree of the model is constructed.
4. The NNML file is generated.
To load a ready NNML file, the actions are performed in the reverse order.
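The save and load directions described above can be sketched with Python's standard XML library. This is only an illustration of the round trip: the element names model, layer, neuron and weight are hypothetical placeholders and not the actual NNML vocabulary, and the dictionary stands in for whatever internal representation a simulator uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal representation held by a simulator.
model = {"layers": [[{"bias": 0.5, "weights": [0.1, -0.2]}],
                    [{"bias": -1.0, "weights": [0.7]}]]}

def to_xml(model):
    """Internal representation -> object tree -> XML text."""
    root = ET.Element("model")
    for index, layer in enumerate(model["layers"]):
        layer_el = ET.SubElement(root, "layer", id=str(index))
        for neuron in layer:
            neuron_el = ET.SubElement(layer_el, "neuron",
                                      bias=repr(neuron["bias"]))
            for w in neuron["weights"]:
                ET.SubElement(neuron_el, "weight", value=repr(w))
    return ET.tostring(root, encoding="unicode")

def from_xml(text):
    """The reverse order: XML text -> object tree -> internal representation."""
    root = ET.fromstring(text)
    return {"layers": [
        [{"bias": float(n.get("bias")),
          "weights": [float(w.get("value")) for w in n.findall("weight")]}
         for n in layer_el.findall("neuron")]
        for layer_el in root.findall("layer")]}

document = to_xml(model)
restored = from_xml(document)
print(document)
```

Writing the numbers with repr keeps the round trip lossless, so the restored model compares equal to the original.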
Processing of NNML documents
PMML (Predictive Model Markup Language)
Introduction
PMML is a set of Document Type Definitions (DTDs) specified in XML.
The first version (1.0) was provided in July 1999 by the Data Mining Group (DMG, http://www.dmg.org).
It is a markup language for predictive modeling, but it is not restricted to this field.
Unlike the previously introduced method, it supports only back-propagation nets.
PMML (Contd.)
The PMML 1.1 definition includes DTDs for the following types of models:
1. Naïve Bayes
2. Regression models
3. Decision trees
4. Center and distribution based clusters
5. Sequence and association rules
6. Neural nets
Advantages of PMML:
It removes the issues of incompatibility between applications and proprietary formats.
The DTDs support proprietary extensions to allow enriched information storage for specialized tools.
Previous solutions to the problem of sharing data models were built into custom systems, so exchanging models with an application outside the system was virtually impossible.
Advantages of PMML
For example, it allows users to share data models:
Generate data models using one vendor's application.
Use another vendor's application to analyze them.
Use another to evaluate the models.
Use yet another vendor's application to visualize the model.
PMML (Contd.)
PMML describes models using eight modules:
1. Header
2. Data Dictionary schema
3. Data Mining schema
4. Predictive model schema
5. Definition for predictive models
6. Definition for ensemble of models
7. Rules for selecting and combining models and ensembles of models
8. Rules for exception handling
PMML (Contd.)
Using PMML to model Association Rules
<?xml version="1.0" ?>
<PMML version="1.1">
  <Header copyright="www.dmg.org" description="sample model for association rules"/>
  <DataDictionary numberOfFields="1" >
    <DataField name="item" optype="categorical" />
  </DataDictionary>
  <AssociationModel>
    <AssocInputStats numberOfTransactions="4" numberOfItems="3"
                     minimumSupport="0.6" minimumConfidence="0.5"
                     numberOfItemsets="3" numberOfRules="2"/>
    <!-- We have three items in our input data -->
    <AssocItem id="1" value="Cracker" />
    <AssocItem id="2" value="Coke" />
    <AssocItem id="3" value="Water" />
    <!-- and two frequent itemsets with a single item -->
    <AssocItemset id="1" support="1.0" numberOfItems="1">
      <AssocItemRef itemRef="1" />
    </AssocItemset>
    <AssocItemset id="2" support="1.0" numberOfItems="1">
      <AssocItemRef itemRef="3" />
    </AssocItemset>
    <!-- and one frequent itemset with two items. -->
    <AssocItemset id="3" support="1.0" numberOfItems="2">
      <AssocItemRef itemRef="1" />
      <AssocItemRef itemRef="3" />
    </AssocItemset>
    <!-- Two rules satisfy the requirements -->
    <AssocRule support="1.0" confidence="1.0" antecedent="1" consequent="2" />
    <AssocRule support="1.0" confidence="1.0" antecedent="2" consequent="1" />
  </AssociationModel>
</PMML>
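To show how an application might consume such a document, the sketch below reads the association model with Python's standard XML library and resolves each rule to item names. The antecedent and consequent attributes refer to itemset ids, and each itemset refers to item ids. The embedded string is an abbreviated copy of the sample above (header, data dictionary and statistics omitted), and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample association model.
PMML_TEXT = """<PMML version="1.1">
  <AssociationModel>
    <AssocItem id="1" value="Cracker" />
    <AssocItem id="2" value="Coke" />
    <AssocItem id="3" value="Water" />
    <AssocItemset id="1" support="1.0" numberOfItems="1">
      <AssocItemRef itemRef="1" />
    </AssocItemset>
    <AssocItemset id="2" support="1.0" numberOfItems="1">
      <AssocItemRef itemRef="3" />
    </AssocItemset>
    <AssocItemset id="3" support="1.0" numberOfItems="2">
      <AssocItemRef itemRef="1" /><AssocItemRef itemRef="3" />
    </AssocItemset>
    <AssocRule support="1.0" confidence="1.0" antecedent="1" consequent="2" />
    <AssocRule support="1.0" confidence="1.0" antecedent="2" consequent="1" />
  </AssociationModel>
</PMML>"""

model = ET.fromstring(PMML_TEXT).find("AssociationModel")

# item id -> item name
items = {i.get("id"): i.get("value") for i in model.findall("AssocItem")}

# itemset id -> list of item names
itemsets = {
    s.get("id"): [items[r.get("itemRef")] for r in s.findall("AssocItemRef")]
    for s in model.findall("AssocItemset")
}

# Each rule becomes (antecedent items, consequent items, confidence).
rules = [
    (itemsets[r.get("antecedent")], itemsets[r.get("consequent")],
     float(r.get("confidence")))
    for r in model.findall("AssocRule")
]

for antecedent, consequent, confidence in rules:
    print(antecedent, "->", consequent, "confidence", confidence)
```

Read this way, the two rules say that Cracker implies Water and Water implies Cracker; Coke appears in no frequent itemset.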
PMML The General Web Architecture
Web Warehouse: materializes and manages useful information on the web
Application interfaces
Software that facilitates the process of content extraction
PMML and ANNs(DTD)
<!ELEMENT NeuralInput (Extension*, ( NormContinuous | NormDiscrete )) >
<!ATTLIST NeuralInput id %NN-NEURON-ID; #REQUIRED >
NN-NEURON-ID is just a string which identifies a neuron
PMML and ANNs(XSD)
<xs:element name="NeuralInput">
  <xs:complexType>
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Extension" />
      <xs:element ref="DerivedField" />
    </xs:sequence>
    <xs:attribute name="id" type="NN-NEURON-ID" use="required" />
  </xs:complexType>
</xs:element>
PMML and ANNs(DTD)
<!ELEMENT Neuron (Extension*, Con+) >
<!ATTLIST Neuron
    id                 %NN-NEURON-ID;        #REQUIRED
    bias               %REAL-NUMBER;         #IMPLIED
    activationFunction %ACTIVATION-FUNCTION; #IMPLIED
    threshold          %REAL-NUMBER;         #IMPLIED
>
PMML and ANNs(XSD)
<xs:element name="Neuron">
  <xs:complexType>
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Extension" />
      <xs:element maxOccurs="unbounded" ref="Con" />
    </xs:sequence>
    <xs:attribute name="id" type="NN-NEURON-ID" use="required" />
    <xs:attribute name="bias" type="REAL-NUMBER" />
    <xs:attribute name="activationFunction" type="ACTIVATION-FUNCTION" />
    <xs:attribute name="threshold" type="REAL-NUMBER" />
    <xs:attribute name="width" type="REAL-NUMBER" />
  </xs:complexType>
</xs:element>
PMML and ANNs(DTD)
<!ELEMENT Con (Extension*) >
<!ATTLIST Con
    from   %NN-NEURON-IDREF; #REQUIRED
    weight %REAL-NUMBER;     #REQUIRED
>
PMML and ANNs(XSD)
<xs:element name="Con">
  <xs:complexType>
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Extension" />
    </xs:sequence>
    <xs:attribute name="from" type="NN-NEURON-IDREF" use="required" />
    <xs:attribute name="weight" type="REAL-NUMBER" use="required" />
  </xs:complexType>
</xs:element>
PMML and ANNs(DTD)
<!ELEMENT NeuralNetwork (Extension*, MiningSchema, ModelStats?, NeuralInputs, ( NeuralLayer+), NeuralOutputs? )>
<!ATTLIST NeuralNetwork
    modelName          CDATA                 #IMPLIED
    activationFunction %ACTIVATION-FUNCTION; #REQUIRED
    threshold          %REAL-NUMBER;         #IMPLIED
>
<!ELEMENT NeuralInputs ( NeuralInput+ ) >
<!ELEMENT NeuralLayer ( Neuron+ ) >
<!ELEMENT NeuralOutputs ( NeuralOutput+ ) >
PMML and ANNs (XSD)
<xs:element name="NeuralNetwork">
  <xs:complexType>
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Extension" />
      <xs:element ref="MiningSchema" />
      <xs:element minOccurs="0" ref="ModelStats" />
      <xs:element ref="NeuralInputs" />
      <xs:element maxOccurs="unbounded" ref="NeuralLayer" />
      <xs:element minOccurs="0" ref="NeuralOutputs" />
    </xs:sequence>
    <xs:attribute name="modelName" type="xs:string" />
    <xs:attribute name="functionName" type="MINING-FUNCTION" use="required" />
    <xs:attribute name="algorithmName" type="xs:string" />
    <xs:attribute name="activationFunction" type="ACTIVATION-FUNCTION" use="required" />
    <xs:attribute name="threshold" type="REAL-NUMBER" />
    <xs:attribute name="numberOfLayers" type="xs:nonNegativeInteger" />
  </xs:complexType>
</xs:element>
PMML and ANNs (XSD)
<xs:element name="NeuralInputs">
  <xs:complexType>
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Extension" />
      <xs:element maxOccurs="unbounded" ref="NeuralInput" />
    </xs:sequence>
    <xs:attribute name="numberOfInputs" type="xs:nonNegativeInteger" />
  </xs:complexType>
</xs:element>
PMML and ANNs (XSD)
<xs:element name="NeuralLayer">
  <xs:complexType>
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Extension" />
      <xs:element maxOccurs="unbounded" ref="Neuron" />
    </xs:sequence>
    <xs:attribute name="numberOfNeurons" type="xs:nonNegativeInteger" />
    <xs:attribute name="activationFunction" type="ACTIVATION-FUNCTION" />
    <xs:attribute name="normalizationMethod" default="none">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="none" />
          <xs:enumeration value="simplemax" />
          <xs:enumeration value="softmax" />
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>
PMML and ANNs (XSD)
<xs:element name="NeuralOutputs">
  <xs:complexType>
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" ref="Extension" />
      <xs:element maxOccurs="unbounded" ref="NeuralOutput" />
    </xs:sequence>
    <xs:attribute name="numberOfOutputs" type="xs:nonNegativeInteger" />
  </xs:complexType>
</xs:element>
PMML and ANNs
<?xml version="1.0" ?>
<PMML version="2.1">
<Header copyright="DMG.org"/>
PMML and ANNs
<DataDictionary numberOfFields="5">
  <DataField name="gender" optype="categorical">
    <Value value=" female"/>
    <Value value=" male"/>
  </DataField>
  <DataField name="no of claims" optype="categorical">
    <Value value=" 0"/>
    <Value value=" 1"/>
    <Value value=" 3"/>
    <Value value=" > 3"/>
    <Value value=" 2"/>
  </DataField>
  <DataField name="domicile" optype="categorical">
    <Value value="suburban"/>
    <Value value=" urban"/>
    <Value value=" rural"/>
  </DataField>
  <DataField name="age of car" optype="continuous"/>
  <DataField name="amount of claims" optype="continuous"/>
</DataDictionary>
PMML and ANNs
<MiningSchema>
  <MiningField name="gender"/>
  <MiningField name="no of claims"/>
  <MiningField name="domicile"/>
  <MiningField name="age of car"/>
  <MiningField name="amount of claims" usageType="predicted"/>
</MiningSchema>
PMML and ANNs
<NeuralInputs numberOfInputs="10">
  <NeuralInput id="0">
    <DerivedField>
      <NormContinuous field="age of car">
        <LinearNorm orig="0.01" norm="0"/>
        <LinearNorm orig="3.07897" norm="0.5"/>
        <LinearNorm orig="11.44" norm="1"/>
      </NormContinuous>
    </DerivedField>
  </NeuralInput>
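The LinearNorm points of a NormContinuous element define a piecewise-linear mapping from the original field value to the normalized network input. A small Python sketch of that mapping for the "age of car" field follows. The function name norm_continuous is illustrative, and the handling of values outside the listed range (extending the nearest segment) is an assumption made for this example.

```python
from bisect import bisect_right

# (orig, norm) pairs copied from the LinearNorm elements for "age of car".
POINTS = [(0.01, 0.0), (3.07897, 0.5), (11.44, 1.0)]

def norm_continuous(x, points=POINTS):
    """Piecewise-linear interpolation between consecutive (orig, norm) points."""
    origs = [orig for orig, _ in points]
    # Index of the segment that contains x. Values outside the listed range
    # reuse the first or last segment (an assumption of this sketch).
    i = min(max(bisect_right(origs, x) - 1, 0), len(points) - 2)
    (o1, n1), (o2, n2) = points[i], points[i + 1]
    return n1 + (x - o1) * (n2 - n1) / (o2 - o1)

for age in (0.01, 3.07897, 7.0, 11.44):
    print(age, "->", norm_continuous(age))
```

Because the middle point maps 3.07897 to 0.5, the two segments have different slopes, so the normalization is not a single straight line.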
PMML and ANNs
<NeuralInput id="1">
  <DerivedField>
    <NormDiscrete field="gender" value=" male"/>
  </DerivedField>
</NeuralInput>
…. (and so on, up to NeuralInput id="9")
PMML and ANNs
<NeuralLayer numberOfNeurons="3">
  <Neuron id="10">
    <Con from="0" weight="-2.08148"/>
    <Con from="1" weight="3.69657"/>
    <Con from="2" weight="-1.89986"/>
    <Con from="3" weight="5.61779"/>
    <Con from="4" weight="0.427558"/>
    <Con from="5" weight="-1.25971"/>
    <Con from="6" weight="-6.55549"/>
    <Con from="7" weight="-4.62773"/>
    <Con from="8" weight="1.97525"/>
    <Con from="9" weight="-1.0962"/>
  </Neuron>
  ……
</NeuralLayer>
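[Figure: inputs I0 to I9 feed hidden neuron N1 (Id=10); the connection from I0 has weight -2.08148 and the connection from I9 has weight -1.0962]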
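As an illustration of how the Con elements are used, the sketch below computes the activation of neuron 10 from its ten incoming weights. Two things are assumptions, because this transcript does not show them: the activation function is taken to be logistic, and the bias is taken to be 0 since the Neuron element carries no bias attribute. The input values are hypothetical.

```python
import math

# Weights of the Con elements of neuron id="10", keyed by the "from" id.
WEIGHTS = {0: -2.08148, 1: 3.69657, 2: -1.89986, 3: 5.61779, 4: 0.427558,
           5: -1.25971, 6: -6.55549, 7: -4.62773, 8: 1.97525, 9: -1.0962}

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights=WEIGHTS, bias=0.0):
    """Weighted sum of the connected inputs, passed through the activation."""
    # bias=0.0 and the logistic activation are assumptions of this sketch.
    z = bias + sum(w * inputs[source] for source, w in weights.items())
    return logistic(z)

# Hypothetical normalized inputs: only input 0 is active.
inputs = [0.0] * 10
inputs[0] = 1.0
activation = neuron_output(inputs)
print(activation)
```

With only input 0 active, the weighted sum equals the single weight -2.08148, so the activation falls well below 0.5.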
PMML and ANNs
Output Neuron
<NeuralLayer numberOfNeurons="1">
  <Neuron id="13">
    <Con from="10" weight="0.76617"/>
    <Con from="11" weight="-1.5065"/>
    <Con from="12" weight="0.999797"/>
  </Neuron>
</NeuralLayer>
[Figure: hidden neurons N1 (Id=10), N2 (Id=11) and N3 (Id=12) feed the output neuron No (Id=13), which produces the output]
PMML and ANNs
<NeuralOutputs numberOfOutputs="1">
  <NeuralOutput outputNeuron="13">
    <DerivedField>
      <NormContinuous field="amount of claims">
        <LinearNorm orig="0" norm="0.1"/>
        <LinearNorm orig="1291.68" norm="0.5"/>
        <LinearNorm orig="5327.26" norm="0.9"/>
      </NormContinuous>
    </DerivedField>
  </NeuralOutput>
</NeuralOutputs>
</NeuralNetwork>
</PMML>
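On the output side the same LinearNorm points are read in the opposite direction: the activation of output neuron 13 is a normalized value that has to be mapped back to an "amount of claims". A minimal sketch of that inverse mapping follows. The function name denormalize is illustrative, and extending the first or last segment for activations outside 0.1 to 0.9 is an assumption of this example.

```python
# (orig, norm) pairs copied from the NeuralOutput's LinearNorm elements.
POINTS = [(0.0, 0.1), (1291.68, 0.5), (5327.26, 0.9)]

def denormalize(y, points=POINTS):
    """Map a normalized network output back to the original scale."""
    segments = list(zip(points, points[1:]))
    # First segment whose upper norm bound covers y; otherwise the last one.
    # Activations below 0.1 extend the first segment (assumption).
    (o1, n1), (o2, n2) = next(
        (seg for seg in segments if y <= seg[1][1]), segments[-1])
    return o1 + (y - n1) * (o2 - o1) / (n2 - n1)

for activation in (0.1, 0.3, 0.5, 0.9):
    print(activation, "->", denormalize(activation))
```

An activation of 0.3, halfway between the first two norm values, maps to about 645.84, halfway between 0 and 1291.68.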