UIMA
SHARP 4 - NLP
May 25, 2010
Outline
• UIMA Terminology (not just TLAs)
• Parts of a UIMA pipeline
• Running a pipeline
• Viewing annotations
• Creating a new annotator
UIMA terminology
• CAS, XCAS, JCas, View
• Analysis Engine (AE) / Annotator
– Aggregate Analysis Engine
• XML output: XCAS, XMI
• Type System, JCasGen
• CAS Visual Debugger (CVD)
• CPE (Collection Processing Engine)
UIMA and Eclipse
• UIMA plugin for Eclipse requires EMF
• UIMA plugin provides visual editors for descriptors
• An “Update site” exists for installing plugin
UIMA Pipeline Flow
• Collection Reader
• (CAS Initializer - deprecated)
• Analysis Engine (AE) / Annotator
• CAS Consumer
Pipeline Example
| Example | UIMA term |
| --- | --- |
| Read files from a dir | Collection Reader |
| Sentence annotator | Analysis Engine |
| Tokenizer annotator | Analysis Engine |
| Output tokens to a DB | CAS Consumer |
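The four roles in this example can be sketched in a few lines of plain Java. This is a toy stand-in, not the UIMA API: the class `ToyPipeline`, its `Doc` holder, and all method names are hypothetical, the "directory" is a list of strings, and the "DB" is an in-memory list. It only illustrates the order in which the parts hand data to each other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for the four pipeline roles; plain Java, NOT the UIMA API. */
class ToyPipeline {

    /** Minimal stand-in for a CAS: document text plus the annotations added so far. */
    static final class Doc {
        final String text;
        final List<int[]> sentences = new ArrayList<>(); // {begin, end} character offsets
        final List<int[]> tokens = new ArrayList<>();    // {begin, end} character offsets
        Doc(String text) { this.text = text; }
    }

    /** "Analysis Engine" 1: a sentence runs from the first non-blank char to the next '.'. */
    static void annotateSentences(Doc doc) {
        String t = doc.text;
        int begin = -1; // -1 means "not currently inside a sentence"
        for (int i = 0; i < t.length(); i++) {
            char c = t.charAt(i);
            if (begin < 0 && !Character.isWhitespace(c)) begin = i;
            if (begin >= 0 && c == '.') {
                doc.sentences.add(new int[] {begin, i + 1});
                begin = -1;
            }
        }
        if (begin >= 0) doc.sentences.add(new int[] {begin, t.length()});
    }

    /** "Analysis Engine" 2: splits each sentence on whitespace; relies on sentences existing. */
    static void annotateTokens(Doc doc) {
        for (int[] s : doc.sentences) {
            int begin = -1;
            for (int i = s[0]; i <= s[1]; i++) {
                boolean boundary = i == s[1] || Character.isWhitespace(doc.text.charAt(i));
                if (!boundary && begin < 0) begin = i;
                if (boundary && begin >= 0) {
                    doc.tokens.add(new int[] {begin, i});
                    begin = -1;
                }
            }
        }
    }

    /** "CAS Consumer": copies the covered text of every token into the output sink. */
    static List<String> consumeTokens(Doc doc) {
        List<String> rows = new ArrayList<>();
        for (int[] tok : doc.tokens) rows.add(doc.text.substring(tok[0], tok[1]));
        return rows;
    }

    /** "Collection Reader" loop: one Doc per input, passed through each stage in order. */
    static List<String> run(List<String> documents) {
        List<String> sink = new ArrayList<>();
        for (String text : documents) {
            Doc doc = new Doc(text);
            annotateSentences(doc);
            annotateTokens(doc);
            sink.addAll(consumeTokens(doc));
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("Family history of hyperlipidemia.")));
    }
}
```

Because this toy tokenizer splits only on whitespace, the final period stays attached to the last token. The tokenizer also depends on the sentence annotator having run first, which is why the order of Analysis Engines matters.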
Options for running UIMA tools
• Tools:
– CPE Configurator
– CVD
• Options:
– Command line scripts/.bat files
– Run within Eclipse
Tying together a UIMA pipeline
• Type System
– Defines the data types passed along
• CAS (Common Analysis Structure)
– Container for the data
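How the two pieces relate can be illustrated with a small plain-Java model. This is a sketch under simplifying assumptions, not the UIMA API: `ToyCas`, `Annotation`, and the type names `Sentence` and `Token` are hypothetical, and the "type system" is reduced to a set of allowed type names. The container holds the document text plus stand-off annotations (type, begin, end) and rejects any type the pipeline did not declare.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of "type system + CAS"; plain Java, NOT the UIMA API. */
class ToyCas {

    /** One stand-off annotation: a declared type name plus character offsets. */
    static final class Annotation {
        final String type;
        final int begin;
        final int end;
        Annotation(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    private final Set<String> typeSystem;  // type names every component agreed on
    private final String documentText;     // the text being analyzed
    private final List<Annotation> annotations = new ArrayList<>();

    ToyCas(Set<String> typeSystem, String documentText) {
        this.typeSystem = new HashSet<>(typeSystem);
        this.documentText = documentText;
    }

    /** Adds an annotation, refusing undeclared types and offsets outside the text. */
    void add(String type, int begin, int end) {
        if (!typeSystem.contains(type)) {
            throw new IllegalArgumentException("Type not in type system: " + type);
        }
        if (begin < 0 || begin > end || end > documentText.length()) {
            throw new IllegalArgumentException("Offsets outside document text");
        }
        annotations.add(new Annotation(type, begin, end));
    }

    /** Returns the text covered by every annotation of the given type, in insertion order. */
    List<String> coveredText(String type) {
        List<String> out = new ArrayList<>();
        for (Annotation a : annotations) {
            if (a.type.equals(type)) out.add(documentText.substring(a.begin, a.end));
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> types = new HashSet<>(Arrays.asList("Sentence", "Token"));
        ToyCas cas = new ToyCas(types, "Family history of hyperlipidemia.");
        cas.add("Sentence", 0, 33);
        cas.add("Token", 0, 6);
        System.out.println(cas.coveredText("Token"));
    }
}
```

The annotations never copy or modify the text; they only point into it by offset, so several components can add their own annotations to the same container without interfering with each other.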
Tying together a UIMA pipeline
• CPE descriptor – select the parts
– Collection Reader
– Analysis Engine(s)
– CAS Consumer
• Aggregate analysis engine
– Multiple Analysis Engines and their order
Options for running a pipeline
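• CVD GUI
– Single Aggregate Analysis Engine
– No Collection Reader
• CPE GUI
• From your own Java application: parse a CpeDescription, produce a CollectionProcessingEngine from it, and invoke its process() method (see “2.3. Running a CPE from Your Own Java Application” in the UIMA documentation)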
Example: Running a pipeline
Running cTAKES within Eclipse using a CPE
• Use run configuration UIMA_CPE_GUI--clinical_documents_pipeline
• CPE: test1.xml, from clinical documents pipeline\desc\collection_processing_engine
Options for viewing annotations
• CVD
• Annotation viewer
• XML viewer
• Text editor
Example: Viewing annotations
Viewing annotations using the CVD
• Load the Type System
• Load the XCAS or XMI
Example: Running an AE in CVD
Using CVD to run an Analysis Engine
– No Collection Reader
– Single Analysis Engine (can be an aggregate)
– No CAS Consumer
– Just paste/type in text to process, e.g. “Family history of hyperlipidemia.”
Creating a New Annotator
• Create Java project
• Right click -> Add UIMA Nature
• Add UIMA jars to .classpath (Build Path)
• Create Analysis Engine (AE) descriptor
• Add types to AE descriptor, or optionally create separate Type System descriptor
• Write code!
Questions?
Supplemental slides follow
Example: Creating a PEAR file
• Right click -> Add UIMA Nature
• Right click -> Generate Pear
• Select Analysis Engine descriptor
• Select OS and JDK
• Modify Properties if needed
• Select what to include
Example: Modifying a parameter
UIMA’s descriptor editors allow you to modify most parameters without looking at the XML itself.
Links
• Getting started with UIMA: http://uima.apache.org/doc-uima-annotator.html
• UIMA Update site for use in Eclipse: http://www.apache.org/dist/incubator/uima/eclipse-update-site/