![Page 1: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/1.jpg)
UIMA Introduction
SHARPn Summit June 11, 2012
![Page 2: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/2.jpg)
Outline
UIMA Terminology (not just TLAs)
Parts of a UIMA pipeline
Running a pipeline
Viewing annotations interactively
![Page 3: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/3.jpg)
UIMA Terminology
CAS
XCAS
JCAS
View
Analysis Engine (AE) / Annotator
XML output: XCAS, XMI
Type System
JCasGen
CAS Visual Debugger (CVD)
CPE (Collection Processing Engine)
![Page 4: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/4.jpg)
UIMA
Framework
– Defining data types
– Passing data from one component to another
Tooling
– Viewing results
– Debugging
– Editing XML visually
![Page 5: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/5.jpg)
Data Through a Pipeline
Type System
– Defines the data types passed along
CAS (Common Analysis Structure)
– Container for the data passed along
– Created by UIMA from the Type System
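To make the Type System / CAS relationship concrete, here is a minimal sketch in plain Java. It is a toy model, not the UIMA API: `ToyAnnotation` and `ToyCas` are hypothetical names invented for illustration, and the "type system" is assumed to be nothing more than a list of allowed type names.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for an annotation: a type name plus a span of the document text. */
final class ToyAnnotation {
    final String type; // name of a type declared in the toy type system
    final int begin;   // offset of the first covered character
    final int end;     // offset one past the last covered character

    ToyAnnotation(String type, int begin, int end) {
        this.type = type;
        this.begin = begin;
        this.end = end;
    }
}

/** Toy stand-in for a CAS: the document text plus every annotation added so far. */
final class ToyCas {
    private final List<String> typeSystem; // the type names this container accepts
    private final String documentText;
    private final List<ToyAnnotation> annotations = new ArrayList<>();

    ToyCas(List<String> typeSystem, String documentText) {
        this.typeSystem = new ArrayList<>(typeSystem);
        this.documentText = documentText;
    }

    String getDocumentText() {
        return documentText;
    }

    /** Adds an annotation, rejecting undeclared types and out-of-range spans. */
    void add(ToyAnnotation a) {
        if (!typeSystem.contains(a.type)) {
            throw new IllegalArgumentException("Undeclared type: " + a.type);
        }
        if (a.begin < 0 || a.begin > a.end || a.end > documentText.length()) {
            throw new IllegalArgumentException("Span out of range");
        }
        annotations.add(a);
    }

    /** Returns the text covered by each annotation of the given type, in insertion order. */
    List<String> coveredText(String type) {
        List<String> out = new ArrayList<>();
        for (ToyAnnotation a : annotations) {
            if (a.type.equals(type)) {
                out.add(documentText.substring(a.begin, a.end));
            }
        }
        return out;
    }
}
```

The point of the sketch is the division of labor: the type system says what may be stored, and the container carries the text and the stored annotations from one component to the next.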
![Page 6: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/6.jpg)
Parts of a UIMA Pipeline
Collection Reader
– Read input document
Analysis Engine(s) / Annotator(s)
– Process document
CAS Consumer
– Output data
![Page 7: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/7.jpg)
Tying a Pipeline Together
CPE descriptor (Collection Processing Engine)
– Collection Reader
– Analysis Engine(s)
– CAS Consumer
Aggregate analysis engine
– Multiple Analysis Engines and their order
![Page 8: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/8.jpg)
Pipeline Example
| UIMA term | Example |
| --- | --- |
| Collection Reader | Read files from a dir |
| Analysis Engine | Sentence detector |
| Analysis Engine | Tokenizer annotator |
| Analysis Engine | Part of Speech tagger |
| CAS Consumer | Output tokens to DB |
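The example pipeline can be sketched as a chain of stages that share one container. This is a toy model in plain Java, not UIMA code: `ToyPipeline`, `Stage`, `Doc`, and `Span` are hypothetical names, the "reader" is a single in-memory string rather than a directory of files, the "consumer" returns a list rather than writing to a database, and the part-of-speech tagger is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of the example pipeline; none of these names come from UIMA. */
final class ToyPipeline {

    /** A span of the document labeled with a type name. */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Shared container handed from stage to stage (plays the role of the CAS). */
    static final class Doc {
        final String text;
        final List<Span> spans = new ArrayList<>();

        Doc(String text) {
            this.text = text;
        }
    }

    /** One processing stage (plays the role of an Analysis Engine). */
    interface Stage {
        void process(Doc doc);
    }

    /** Ends a sentence at every '.', '!' or '?'; trailing text without a terminator is ignored. */
    static final class SentenceDetector implements Stage {
        public void process(Doc doc) {
            int start = 0;
            for (int i = 0; i < doc.text.length(); i++) {
                char c = doc.text.charAt(i);
                if (c == '.' || c == '!' || c == '?') {
                    // Skip whitespace left over from the previous sentence.
                    while (start < i && Character.isWhitespace(doc.text.charAt(start))) {
                        start++;
                    }
                    doc.spans.add(new Span("Sentence", start, i + 1));
                    start = i + 1;
                }
            }
        }
    }

    /** Marks maximal runs of letters or digits inside each Sentence span as Tokens. */
    static final class Tokenizer implements Stage {
        public void process(Doc doc) {
            // Snapshot the spans: tokens are appended to doc.spans while we iterate.
            List<Span> existing = new ArrayList<>(doc.spans);
            for (Span s : existing) {
                if (!s.type.equals("Sentence")) {
                    continue;
                }
                int i = s.begin;
                while (i < s.end) {
                    if (Character.isLetterOrDigit(doc.text.charAt(i))) {
                        int j = i;
                        while (j < s.end && Character.isLetterOrDigit(doc.text.charAt(j))) {
                            j++;
                        }
                        doc.spans.add(new Span("Token", i, j));
                        i = j;
                    } else {
                        i++;
                    }
                }
            }
        }
    }

    /** Runs the stages in the given order, then collects the token texts (the consumer step). */
    static List<String> run(String text, List<Stage> stages) {
        Doc doc = new Doc(text); // "reader" step: one in-memory document
        for (Stage stage : stages) {
            stage.process(doc);
        }
        List<String> tokens = new ArrayList<>();
        for (Span s : doc.spans) {
            if (s.type.equals("Token")) {
                tokens.add(text.substring(s.begin, s.end));
            }
        }
        return tokens;
    }
}
```

Because this toy tokenizer only looks inside Sentence spans, running it before the sentence detector yields no tokens. That is the kind of dependency an aggregate's ordering exists to capture.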
![Page 9: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/9.jpg)
UIMA plugin for Eclipse
Provides visual editors for descriptors
– Mini GUI for selecting options
– Rather than editing XML directly
An “Update site” exists for installing the plugin: http://www.apache.org/dist/incubator/uima/eclipse-update-site
![Page 10: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/10.jpg)
UIMA Tooling Options
Tools:
– CPE Configurator
– CVD (CAS Visual Debugger)
Options:
– Command line scripts/.bat files
– Run within Eclipse
![Page 11: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/11.jpg)
Running a Pipeline - CPE
cTAKES provides a script and a bat file: runctakesCPE
Choose a CPE descriptor, such as test_plaintext.xml, from cTAKESdesc/cdpdesc/collection_processing_engine
![Page 12: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/12.jpg)
Viewing Annotations - CVD
Viewing annotations using the CVD
– Load the Type System
– Load the XCAS or XMI
![Page 13: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/13.jpg)
Annotation Viewers
UIMA tools
– CVD (CAS Visual Debugger)
– Annotation viewer
Viewing XML output
– Any XML viewer
– Any text editor
![Page 15: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/15.jpg)
Supplemental slides follow
![Page 16: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/16.jpg)
Options to Run a Pipeline
CPE GUI
CVD GUI
– Single Aggregate Analysis Engine
– No Collection Reader
Instantiate a CollectionProcessingEngine from a CpeDescription and invoke its process() method
uimaFIT
– removes dependency on XML
![Page 17: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/17.jpg)
Creating a New Annotator
Within Eclipse
– Create Java project
– Right click -> Add UIMA Nature
– Add UIMA jars to .classpath (Build Path)
– Create Analysis Engine (AE) descriptor
– Add types to AE descriptor, or optionally create separate Type System descriptor
– Write code!
![Page 18: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/18.jpg)
Running an AE in CVD
Using CVD to run an Analysis Engine
– No Collection Reader
– Single Analysis Engine (can be an aggregate)
– No CAS Consumer
– Load an Analysis Engine
– Paste/type in text to process
Family history of hyperlipidemia.
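To show the kind of result an annotator produces for the sample sentence, here is a small dictionary-lookup sketch in plain Java. It is an illustration only: `ToyTermAnnotator` is a hypothetical class, not a UIMA or cTAKES component, and the one-term dictionary is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical dictionary annotator: reports where known terms occur in a text. */
final class ToyTermAnnotator {
    private final List<String> terms;

    ToyTermAnnotator(List<String> terms) {
        this.terms = new ArrayList<>(terms);
    }

    /** Returns {begin, end} offset pairs for every case-insensitive occurrence of a term. */
    List<int[]> annotate(String text) {
        List<int[]> spans = new ArrayList<>();
        for (String term : terms) {
            if (term.isEmpty()) {
                continue; // an empty term would match at every offset
            }
            for (int i = 0; i + term.length() <= text.length(); i++) {
                // regionMatches with ignoreCase=true compares without copying the text
                if (text.regionMatches(true, i, term, 0, term.length())) {
                    spans.add(new int[] {i, i + term.length()});
                }
            }
        }
        return spans;
    }
}
```

For the sentence above and a dictionary containing only "hyperlipidemia", the sketch reports a single span with begin 18 and end 32. Begin and end offsets of this kind are what CVD lets you inspect for each annotation after running an Analysis Engine on pasted text.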
![Page 19: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/19.jpg)
Modifying a parameter
UIMA’s descriptor editors allow you to modify most parameters without looking at the XML itself.
![Page 20: UIMA Introduction](https://reader035.vdocuments.net/reader035/viewer/2022062222/5681614d550346895dd0cfe6/html5/thumbnails/20.jpg)
Links
Getting started with UIMA http://uima.apache.org/doc-uima-annotator.html
UIMA Update site for use in Eclipse http://www.apache.org/dist/incubator/uima/eclipse-update-site