THE HEBREW BIBLE AS DATA
Wido van Peursen
Eep Talstra Centre for Bible and Computer
@shebanq_ / @PeursenWTvan
2
THE CORPUS
Hebrew Bible
> Ca. 400,000 words
> Probably composed over a period of about 1000 years (1200-200 BC)
> Complex transmission history
> Oldest complete manuscript: Codex Leningradensis, 1008/9 AD
> Various linguistic layers (e.g. vowel signs)
> No native speakers
3
THE DATABASE
WIVU database of the Hebrew Bible
> [WIVU = Werkgroep Informatica Vrije Universiteit (Informatics Working Group, Vrije Universiteit)]
> In development since the 1970s
> Linguistic levels:
  > Morphology (encoding rather than tagging!)
  > Words
  > Phrases
  > Clauses
  > Sentences
  > Text hierarchy
4
THE DATA STRUCTURE
5
EMDROS
Central concept: objects with features
> Each object can carry unlimited features
> Objects can be aggregated arbitrarily into new objects
> Structure that can deal with overlapping hierarchies
> Query language: MQL
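The "objects with features" idea can be illustrated with a minimal Python sketch. This is not Emdros itself and not its API: the class `TextObject`, the helpers `aggregate`, `contained_in` and `overlaps`, the notion of numbered word positions ("slots"), and all feature names and values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class TextObject:
    """One annotated object: a type, the word positions it covers, and its features."""
    otype: str
    slots: frozenset                      # hypothetical: numbered word positions
    features: dict = field(default_factory=dict)


def aggregate(otype, parts, **features):
    """Build a new object covering the union of the positions of its parts."""
    slots = frozenset().union(*(p.slots for p in parts))
    return TextObject(otype, slots, dict(features))


def contained_in(inner, outer):
    """True when every position of `inner` also belongs to `outer`."""
    return inner.slots <= outer.slots


def overlaps(a, b):
    """True when two objects share positions but neither contains the other."""
    shared = a.slots & b.slots
    return bool(shared) and not (a.slots <= b.slots or b.slots <= a.slots)


# Five word positions; the part-of-speech values are illustrative only.
words = [
    TextObject("word", frozenset({i}), {"pos": pos})
    for i, pos in enumerate(["conj", "verb", "noun", "prep", "noun"], start=1)
]

# First hierarchy: words -> phrases -> clause.
phrase_a = aggregate("phrase", words[0:2], function="Pred")
phrase_b = aggregate("phrase", words[2:5], function="Subj")
clause = aggregate("clause", [phrase_a, phrase_b])

# Second hierarchy over the same words, cutting across the phrase boundaries.
line = aggregate("line", words[1:4])

print(contained_in(phrase_a, clause))  # phrase_a lies inside the clause
print(overlaps(phrase_a, line))        # phrase_a and line cross each other
```

Because every object is defined only by the positions it covers, two hierarchies can coexist over the same words without either having to nest inside the other, which is the property the slide calls "overlapping hierarchies".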
6
HOWEVER…
1. No dedicated space on the web where an authorized version of this resource is guaranteed to exist.
2. No possibility to annotate it, link to it or build (open source) tools around it.
3. Results of existing queries cannot be shown on the web.
4. EMDROS is maintained by a one-person private company.
5. Mainly used by specialists in Bible & Computer.
7
SHEBANQ
Aim: to build a bridge between the linguistically annotated Hebrew text corpus and biblical scholars.
Three steps:
(1) Make the text and annotations available to scholars.
(2) Demonstrate how queries can be used to address research questions: a repository of saved queries.
(3) Give textual scholarship a more empirical basis by making it possible to refer to saved queries through unique identifiers.
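Steps (2) and (3) can be pictured with a small Python sketch of a saved-query repository. It does not describe how SHEBANQ is actually implemented: the names `SavedQuery`, `QueryRepository`, `save`, `get` and `citation`, the integer identifiers, and the sample author, title and placeholder query text are all hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class SavedQuery:
    """An immutable record, so a cited query cannot silently change."""
    query_id: int
    author: str
    title: str
    text: str


class QueryRepository:
    """In-memory store that hands out one stable identifier per saved query."""

    def __init__(self):
        self._next_id = count(1)   # hypothetical scheme: sequential integers
        self._queries = {}

    def save(self, author, title, text):
        query = SavedQuery(next(self._next_id), author, title, text)
        self._queries[query.query_id] = query
        return query

    def get(self, query_id):
        return self._queries[query_id]

    def citation(self, query_id):
        """Format a reference that a publication could point to."""
        query = self.get(query_id)
        return f'{query.author}, "{query.title}", saved query #{query.query_id}'


repo = QueryRepository()
saved = repo.save("A. Scholar", "Verb-initial clauses", "<query text goes here>")
print(repo.citation(saved.query_id))
```

The point of the sketch is that a published argument can cite the identifier, and any reader can retrieve exactly the query that produced the evidence.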
8