Resource-Light Bantu Part-of-Speech Tagging


Upload: guy-de-pauw

Post on 21-Aug-2015


RESOURCE-LIGHT BANTU PART-OF-SPEECH TAGGING

Guy De Pauw (UA), Gilles-Maurice de Schryver (UGent), Janneke van de Loo (UA)

Motivation

There are many data-driven taggers available, but they need extensive annotated corpora.

Unsupervised part-of-speech tagging techniques for resource-scarce languages exhibit limited results on Sub-Saharan languages.

Becoming increasingly available: digital dictionaries, lexicons, word lists, ...

Research questions

• What information can we use for part-of-speech tagging?

• Can we use this information to bootstrap accurate part-of-speech taggers for the languages under investigation?

• How does this technique compare to the state-of-the-art in data-driven part-of-speech tagging?

Bag-of-Substrings

Adam/PROPNAME alionekana/V chumbani/N kwake/PRON hana/NEG fahamu/N ./FULL_STOP
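The poster does not spell out how the substrings are extracted, so the following is only a minimal sketch of one plausible reading: every substring of a word becomes a feature, marked by where it occurs. The marker scheme (P= for word-initial, S= for word-final, I= for word-internal, W= for the whole word) and the function name bag_of_substrings are assumptions made for illustration.

```python
def bag_of_substrings(word):
    """Return the set of position-marked substrings of a word.

    Assumed marker scheme (not given on the poster):
      P= substring at the start of the word
      S= substring at the end of the word
      I= substring strictly inside the word
      W= the whole word
    """
    n = len(word)
    features = {"W=" + word}
    for start in range(n):
        for end in range(start + 1, n + 1):
            if start == 0 and end == n:
                continue  # the whole word is already covered by W=
            sub = word[start:end]
            if start == 0:
                features.add("P=" + sub)
            elif end == n:
                features.add("S=" + sub)
            else:
                features.add("I=" + sub)
    return features


# Example: features for one word of the tagged sentence on the poster.
print(sorted(bag_of_substrings("hana")))
```

Because the features form a set, the order in which substrings occur is discarded; only their presence and coarse position remain, which is what makes this a "bag".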

Train a maximum entropy classifier and compare it to a memory-based tagger.
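As a rough illustration of the training step, the sketch below fits a maximum entropy model (multinomial logistic regression) over position-marked substring features using plain stochastic gradient ascent. Everything here is an assumption made for illustration: the feature markers, the function names, the learning rate and epoch count, and the toy lexicon, which simply reuses the tagged words of the example sentence on the poster and is far too small for real use. A real experiment would use a full lexicon and an established maxent toolkit.

```python
import math
from collections import defaultdict


def bag_of_substrings(word):
    """Position-marked substrings (assumed scheme: P=, S=, I=, W=)."""
    n = len(word)
    feats = {"W=" + word}
    for i in range(n):
        for j in range(i + 1, n + 1):
            if i == 0 and j == n:
                continue  # whole word is covered by W=
            marker = "P=" if i == 0 else "S=" if j == n else "I="
            feats.add(marker + word[i:j])
    return feats


def predict_proba(weights, feats):
    """Softmax over per-tag sums of feature weights."""
    scores = {t: sum(w.get(f, 0.0) for f in feats) for t, w in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}


def train_maxent(examples, epochs=300, lr=0.02):
    """Fit weights[tag][feature] by stochastic gradient ascent."""
    tags = sorted({tag for _, tag in examples})
    weights = {t: defaultdict(float) for t in tags}
    for _ in range(epochs):
        for feats, gold in examples:
            probs = predict_proba(weights, feats)
            for t in tags:
                # Log-likelihood gradient: observed minus expected count.
                delta = (1.0 if t == gold else 0.0) - probs[t]
                for f in feats:
                    weights[t][f] += lr * delta
    return weights


def tag(weights, word):
    """Return the most probable tag for a single word."""
    probs = predict_proba(weights, bag_of_substrings(word))
    return max(probs, key=probs.get)


# Toy lexicon: the tagged words of the poster's example sentence.
LEXICON = [
    ("Adam", "PROPNAME"), ("alionekana", "V"), ("chumbani", "N"),
    ("kwake", "PRON"), ("hana", "NEG"), ("fahamu", "N"), (".", "FULL_STOP"),
]

weights = train_maxent([(bag_of_substrings(w), t) for w, t in LEXICON])
for word, _ in LEXICON:
    print(word, tag(weights, word))
```

Each word is classified on its own, without sentence context, so the training data can be a word list with tags instead of an annotated corpus.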

Experimental Results

Conclusion

In the absence of large, annotated corpora, the bag-of-substrings approach established a low-resource, high-accuracy bootstrapping method for part-of-speech tagging of conjunctively written Bantu languages.

Demos
