Lichens, Bryophytes and Climate Change
![Page 1: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/1.jpg)
Lichens, Bryophytes and Climate Change
Edward Gilbert, Corinna Gries, Thomas H. Nash III, Robert Anglin
![Page 2: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/2.jpg)
Goals and Scope
16 digitization centers
More than 60 non-governmental US herbaria (95%)
Mexico, US, Canada
~ 2.3 million specimens
90% of all specimens
900,000 lichens
1.4 million bryophytes
![Page 3: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/3.jpg)
Project Information
http://lbcc.limnology.wisc.edu/
![Page 4: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/4.jpg)
Digitization Workflow
![Page 5: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/5.jpg)
National Portals
Lichen Consortium: http://lichenportal.org, started in 2009, 24 collections, ~ 797,916 records
Bryophyte Consortium: http://bryophyteportal/, started in 2010, 16 collections, 1,059,063 records
![Page 6: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/6.jpg)
Imaging Stage
Capture Image: barcode in file name
Create Skeleton File: barcode, species name, exsiccati, etc.
Upload to FTP server
Image processing: extract barcode, create web versions, map to portal DBs
Duplicate Harvesting
Existing Herbarium Database
Automated Processing (OCR / NLP / Georeferencing): augmented with raw OCR, parsed fields, coordinates, etc.
Existing Record: simply link image
Upload to FTP server
Image URLs
Manage Specimen Data in Portal
Manage / Review Records in Portal
Symbiota Editor: review, edit, keystroke, and finalize
Create New Record: barcode, image, skeletal data
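The imaging stage stores the barcode in the image file name, and image processing later extracts it. A minimal Python sketch of that extraction follows. The naming convention (a collection code followed by digits, e.g. `ASU0012345.jpg`, optionally with a suffix such as `_1`) and the function name are assumptions for illustration; the slides do not state the convention actually used.

```python
import re
from pathlib import PurePath
from typing import Optional

# Hypothetical convention: letters (collection code) followed by digits,
# optionally followed by a suffix such as "_1" for additional images.
BARCODE_PATTERN = re.compile(r"^([A-Za-z]+\d+)")


def barcode_from_filename(filename: str) -> Optional[str]:
    """Return the barcode encoded at the start of an image file name, or None."""
    stem = PurePath(filename).stem  # drop directories and the extension
    match = BARCODE_PATTERN.match(stem)
    return match.group(1).upper() if match else None
```

A file whose name does not follow the assumed convention returns `None`, which a real pipeline would presumably route to manual handling.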
![Page 7: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/7.jpg)
LBCC: Workflow Overview
1. Image all specimens / specimen labels
2. Collect and load skeletal data
   a) Barcode, scientific name, country, state
3. Upload to portal
   a) Record exists => link image to existing record
   b) Record absent => create empty “unprocessed” record
4. Automated OCR of label
   a) Block of raw text => database
5. Automated NLP (field parsing)
6. Review data
   a) Keystroke full record
   b) Collector name & number => look for dups
   c) Reparse full record => learnable parsers
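The upload step branches on whether a record already exists for the barcode. The following Python sketch shows that branch with an in-memory dictionary standing in for the portal database. The `SpecimenRecord` class, its field names, and `attach_image` are hypothetical and are not the Symbiota schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SpecimenRecord:
    # Illustrative fields only; the real portal schema is not shown in the slides.
    barcode: str
    scientific_name: str = ""
    country: str = ""
    state: str = ""
    processing_status: str = ""
    image_urls: List[str] = field(default_factory=list)


def attach_image(records: Dict[str, SpecimenRecord],
                 barcode: str, image_url: str) -> SpecimenRecord:
    """Link an image to the record for this barcode, creating the record if absent."""
    record = records.get(barcode)
    if record is None:
        # Record absent: create an empty "unprocessed" record for later OCR/NLP.
        record = SpecimenRecord(barcode=barcode, processing_status="unprocessed")
        records[barcode] = record
    # Record exists (or was just created): simply link the image.
    record.image_urls.append(image_url)
    return record
```

Existing records keep their current processing status, so only newly created records enter the automated OCR queue.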
![Page 8: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/8.jpg)
Optical Character Recognition
Tesseract V3
Dual cycle: automatic, manual review
Expected hurdles: handwritten labels, old fonts, faded labels, form labels
Adjustable image variables
¢_].L.|»‘¢ .'».f.'._..‘~,(.Jfin-x‘*\'a:"511z:1 wf .~\:'i/.onli State UniversityP.’~.r"~2= ,_. gg J:.2 " J*J*" †(=:\‘-“ax "»..'\-12�‘ “ "‘ ;T~;‘~7i?»-1_1_\f;>sf`;,' ESXZ»ie+‘-». “~'.»te;~:i_.t<» ff`t;~f3":.f.“» »4 xx, ,"""‘“â€T"’ <1;-.rs f3'a,1.z>.t;;a¢f~rus ’�V4 J 'if . r°'° M '1?nies ivain.) Sav.neutal Station - " '1 ~»r';;4-\P ` 1.T11 ./P.. ,J ..-.ELEV. ' `.fJL_\ LATL Q _‘ 1 _ Y’ DATE_ ,. W5. (> f- , -:‘; i f>i_T ~~ . A 1:». v\ .-v »~. 4. a xvala 8/27/73
PLANTS OF NEW r~1ExIcoHerbarium of Arizona State UniversityParmelia ulophyllodes (Vain.) Sav.COUNTY “°â€â€œâ€œ �Joranada Experimental Station -New Mexico State University"“““' on JuniperusELEV. ‘ 4400EEILLEETUR DATEDU T. H. Nash #7914 8/27/73T. H. N.
![Page 9: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/9.jpg)
Auto-Processing: OCR
1. Iterate through new “unprocessed” images
   a) 81439 bryophyte images
   b) 147122 lichen images
2. OCR via Tesseract (version 3)
   a) Untreated image
   b) Treated image (contrast, brightness, etc.)
3. Store raw text linked to skeletal record
4. Progress to next step
   a) Low OCR return => hand processing
   b) “Unprocessed-OCR” => NLP
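The OCR pass runs on both an untreated and a treated image and then routes the record by how much text came back. The Python sketch below illustrates one way to do that. The slides do not say how the two results are compared or what counts as a "low OCR return", so the alphanumeric character count, the threshold of 40, and the function names are all assumptions. The OCR engine and the image treatment are passed in as callables so the sketch does not depend on any particular Tesseract binding.

```python
from typing import Callable, Tuple, TypeVar

Image = TypeVar("Image")


def useful_chars(text: str) -> int:
    """Count letters and digits, a crude proxy for how much OCR came back."""
    return sum(ch.isalnum() for ch in text)


def ocr_label(image: Image,
              run_ocr: Callable[[Image], str],
              treat: Callable[[Image], Image],
              min_chars: int = 40) -> Tuple[str, str]:
    """OCR the untreated and the treated image; return (raw_text, next_status)."""
    untreated_text = run_ocr(image)
    treated_text = run_ocr(treat(image))  # e.g. adjusted contrast, brightness
    # Assumed rule: keep whichever pass returned more alphanumeric characters.
    best = max(untreated_text, treated_text, key=useful_chars)
    if useful_chars(best) < min_chars:
        return best, "hand processing"  # low OCR return
    return best, "unprocessed-OCR"      # ready for the NLP step
```

In practice `run_ocr` could wrap a Tesseract call and `treat` an image-enhancement routine; the returned text would then be stored against the skeletal record.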
![Page 10: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/10.jpg)
Auto-Processing: NLP
1. Iterate through raw OCR text blocks
   a) 147122 lichen OCR blocks
   b) 81439 bryophyte OCR blocks
2. Collector, number, and date
   a) Attempt duplicate harvesting
3. Field-by-field parsing
4. Full-parsing
5. Parsing based on NLP profiles
   a) E.g. targeted label formats
![Page 11: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/11.jpg)
NLP: Duplicate Harvesting
1. Extract collector data
   a) Last name, number, date
2. Harvest duplicates from consortium DB
   a) Exact duplicates
   b) Duplicate events
3. Compare return field-by-field
4. Compare fields with raw OCR
5. Populate fields that have high similarity indexes
6. Processing status: “pending review”
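The duplicate-harvesting steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's parser: the collector regular expression is written for labels shaped like "T. H. Nash #7914 8/27/73", the consortium database is a plain list of dictionaries, the similarity index is a sliding-window `difflib.SequenceMatcher` ratio, and the 0.8 threshold is arbitrary. Only exact duplicates are modeled; duplicate events are left out.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List, Optional

# Hypothetical pattern: last name, collector number, then a numeric date.
COLLECTOR_PATTERN = re.compile(
    r"(?P<last_name>[A-Z][a-z]+)\s+#?(?P<number>\d+)\s+"
    r"(?P<date>\d{1,2}/\d{1,2}/\d{2,4})"
)


def extract_collector(raw_ocr: str) -> Optional[Dict[str, str]]:
    """Pull last name, number, and date out of a raw OCR block, if present."""
    match = COLLECTOR_PATTERN.search(raw_ocr)
    return match.groupdict() if match else None


def similarity_to_ocr(value: str, raw_ocr: str) -> float:
    """Best ratio between a field value and any same-length window of the OCR text."""
    value, text = value.lower(), raw_ocr.lower()
    if not value:
        return 0.0
    if len(text) <= len(value):
        return SequenceMatcher(None, value, text).ratio()
    width = len(value)
    return max(
        SequenceMatcher(None, value, text[i:i + width]).ratio()
        for i in range(len(text) - width + 1)
    )


def harvest_fields(raw_ocr: str, consortium: List[Dict[str, str]],
                   threshold: float = 0.8) -> Dict[str, str]:
    """Copy fields from exact duplicates when the raw OCR text supports them."""
    key = extract_collector(raw_ocr)
    if key is None:
        return {}
    populated: Dict[str, str] = {}
    for candidate in consortium:
        if all(candidate.get(k) == v for k, v in key.items()):
            for name, value in candidate.items():
                if name not in populated and similarity_to_ocr(value, raw_ocr) >= threshold:
                    populated[name] = value
    return populated
```

Fields that the OCR text does not support are left empty rather than copied, so the record that goes to "pending review" contains only values with some evidence on the label itself.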
![Page 12: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/12.jpg)
NLP: Targeted Parsing Profiles
1. Premise: Target similar label formats
2. Use raw OCR to locate “Nash” labels
3. Need to exclude:
   a) Determined by Nash
   b) Author of scientific name
   c) Associated collector
4. Test for similarity to target label format
5. Targeted parsing algorithms
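A rough Python sketch of the first steps of a targeted profile follows: finding labels where Nash is the collector, applying the three exclusions, and testing similarity to the target format. The exclusion cues, the anchor phrases (taken from the sample label text shown earlier), the 0.6 threshold, and the function names are assumptions chosen for illustration; real labels would need more robust rules.

```python
import re

NASH = re.compile(r"\bNash\b")

# Hypothetical cues directly before the name: determiner or associated collector.
DET_OR_ASSOC = re.compile(
    r"(?:\bdet\.?|\bdetermined by|\bwith|&|\band)\s*(?:[A-Z]\.\s*)*$",
    re.IGNORECASE,
)
# Hypothetical cue: the name follows a binomial, so it is likely the name's author.
AUTHOR = re.compile(
    r"[A-Z][a-z]+\s+[a-z]{3,}\s+(?:\([^)]*\)\s*)?(?:[A-Z]\.\s*)*$"
)

# Anchor phrases assumed to characterize the target label format.
TARGET_ANCHORS = ("Herbarium of Arizona State University", "ELEV", "DATE")


def is_nash_collector_label(raw_ocr: str) -> bool:
    """True if some line names Nash without a determiner, author, or associate cue."""
    for line in raw_ocr.splitlines():
        for match in NASH.finditer(line):
            before = line[:match.start()]
            if DET_OR_ASSOC.search(before) or AUTHOR.search(before):
                continue  # excluded occurrence, keep looking
            return True
    return False


def matches_profile(raw_ocr: str, anchors=TARGET_ANCHORS,
                    min_fraction: float = 0.6) -> bool:
    """True if enough of the profile's anchor phrases occur in the OCR text."""
    text = raw_ocr.lower()
    hits = sum(anchor.lower() in text for anchor in anchors)
    return hits / len(anchors) >= min_fraction
```

A label that passes both checks would then be handed to the parsing algorithm written for that specific format.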
![Page 13: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/13.jpg)
Label Review
![Page 14: Lichens, Bryophytes and Climate Change](https://reader035.vdocuments.net/reader035/viewer/2022081507/568164cd550346895dd6f16d/html5/thumbnails/14.jpg)