
XML processing using GPGPU

Research proposal

Jordan Vincent

University of Tsukuba

February 2, 2011

KDE DataBase Lab.

Jordan Vincent XML processing using GPGPU


Jordan Vincent

Academic achievements

Engineering degree with emphasis on Software development.

Research master's degree with emphasis on Parallel algorithms.

from the University of Technology of Belfort-Montbéliard (France).

Internship

Final project assignment (6 months) at Kitagawa Data Engineering laboratory (University of Tsukuba, Japan).


Outline

1 Research project
    Background
    Master project
    Next challenge

2 Research plan
    Schedule
    Scope of research


XML/XPath

XML

Semi-structured data format for exchanging data in a textual form.

    <a>
      <b arg1="value1">
        <c />
      </b>
      <b arg1="value2">
        <c>text</c>
      </b>
    </a>

World map: 171 GB (Sept 2010)

English articles: 27 GB (Sept 2010)

XPath

Core retrieval language for XML documents. XPath is a subset of XQuery.

    /a/b[@arg1="value2"]/c
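As a rough illustration of what the query above selects, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The function name `select_c` and the constant `DOC` are made up for this example, and ElementTree only supports a limited XPath subset, so the leading `/a` step is checked by hand rather than passed to the library.

```python
import xml.etree.ElementTree as ET

# The sample document from the slide.
DOC = """
<a>
  <b arg1="value1"><c /></b>
  <b arg1="value2"><c>text</c></b>
</a>
"""

def select_c(xml_text):
    """Return the <c> elements matched by /a/b[@arg1="value2"]/c."""
    root = ET.fromstring(xml_text)  # root is the document element
    if root.tag != "a":
        return []
    # ElementTree paths are relative to an element, so the /a step is
    # checked above and the remaining steps are written as ./b[...]/c.
    return root.findall("./b[@arg1='value2']/c")

matches = select_c(DOC)
print([(m.tag, m.text) for m in matches])
```

Only the `c` element under the second `b` is selected, because the predicate filters on the `arg1` attribute before the final step.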


XML pattern matching

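[Figure: three panels. XML document: a tree with root A and two B children; the first B has arg1 "value1" and a C child, the second B has arg1 "value2" and a C child containing "text". XPath query: the pattern A / B (arg1 "value2") / C. Result: the C node containing "text".]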

TwigStack

TwigStack [1] is a well-known algorithm for XML pattern matching.
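TwigStack itself is not reproduced here. The sketch below only illustrates the kind of positional labelling that holistic twig joins work on: each element gets a (start, end, level) region, so ancestor and parent tests become integer comparisons instead of tree walks. The names `Node`, `label`, `is_ancestor` and `is_parent` are assumptions made for this example, not taken from the slides.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

# Hypothetical record type for this sketch: one labelled element.
Node = namedtuple("Node", "tag start end level")

def label(xml_text):
    """Assign (start, end, level) to every element, in document order."""
    nodes = []
    counter = 0

    def visit(elem, level):
        nonlocal counter
        counter += 1
        start = counter
        index = len(nodes)
        nodes.append(None)  # reserve the slot to keep document order
        for child in elem:
            visit(child, level + 1)
        counter += 1
        nodes[index] = Node(elem.tag, start, counter, level)

    visit(ET.fromstring(xml_text), 1)
    return nodes

def is_ancestor(a, d):
    """a is an ancestor of d when d's region nests inside a's region."""
    return a.start < d.start and d.end < a.end

def is_parent(a, d):
    """Parent-child is ancestor-descendant with adjacent levels."""
    return is_ancestor(a, d) and d.level == a.level + 1

nodes = label('<a><b arg1="value1"><c /></b><b arg1="value2"><c>text</c></b></a>')
pairs = [(p, c) for p in nodes for c in nodes
         if p.tag == "b" and c.tag == "c" and is_parent(p, c)]
print(len(pairs))
```

The nested loop at the end is a naive join kept for brevity; the point of stack-based algorithms such as TwigStack is to avoid that quadratic comparison by scanning the labelled nodes in sorted order.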


Manycore processor family and OpenCL

Manycore processor family

Heterogeneous parallel processor architectures

Nvidia (GPU)

ATI/AMD (GPU/CPU)

Intel (Larrabee project)

CUDA

Nvidia-specific toolkit for general-purpose development on GPUs.

OpenCL

"The open standard for parallel programming of heterogeneous systems" covers some GPUs and CPUs, but also some DSP chips.

Sony/IBM/Toshiba Cell

Apple iPhone


Research works about GPGPU and DB processing

Fast computation of database operations using graphics processors [Govindaraju, SIGMOD’04]

GPUQP: Query Co-Processing Using Graphics Processors [Fang, SIGMOD’07]

Relational Joins on Graphics Processors [He, SIGMOD’08]

Data Monster: Why graphics processors will transform database processing? [Di Blas, 2009]

Accelerating SQL Database Operations on a GPU with CUDA [Bakkum, GPGPU’10]

Exploring utilisation of GPU for database applications [Walkowiak, ICCS’10]

Accelerating XML Query Matching through Custom Stack Generation on FPGAs [Moussalli, HiPEAC’10]

→ No research result about XML processing using GPGPU.


Master project: TwigStackGPU

Outline

Based on Imam Machdi’s research [2] at KDE lab on the TwigStack algorithm for parallel query processing on clusters and multicore processors.

Current result

Can it be applied to Nvidia GPGPU?
6 months of work, with many technical problems encountered.
→ The project works, but execution is slow.


Illustrated example

[Figure: the XML document tree (root at the top, leaves at the bottom) is split by a partitioning algorithm into partition 1, partition 2 and partition 3. XML pattern matching runs on each partition, and the intermediate solutions are merged to produce the query matches.]
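The partition, match and merge flow of this slide can be sketched as follows. This is only an illustration under strong assumptions: the document is cut at the root's children, the query is fixed to the `b[@arg1="value2"]/c` steps of the earlier example, and Python threads stand in for the parallel workers. The partitioning algorithm actually used in the project is not described in these slides, and all function names here are hypothetical.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

def partition(root, k):
    """Split the root's children into at most k contiguous chunks."""
    children = list(root)
    size = max(1, -(-len(children) // k))  # ceiling division
    return [children[i:i + size] for i in range(0, len(children), size)]

def match_partition(subtrees, tag, attr, value, leaf):
    """Match tag[@attr=value]/leaf inside one partition."""
    hits = []
    for elem in subtrees:
        if elem.tag == tag and elem.get(attr) == value:
            hits.extend(elem.findall(leaf))
    return hits

def parallel_match(xml_text, k=3):
    """Partition the document, match each part, then merge the results."""
    root = ET.fromstring(xml_text)
    parts = partition(root, k)
    with ThreadPoolExecutor(max_workers=k) as pool:
        partials = list(pool.map(
            lambda p: match_partition(p, "b", "arg1", "value2", "c"), parts))
    # Merge step: concatenating in partition order keeps document order.
    return [hit for partial in partials for hit in partial]

SAMPLE = ('<a><b arg1="value1"><c>x</c></b><b arg1="value2"><c>y</c></b>'
          '<b arg1="value2"><c>z</c></b><b arg1="value1"/></a>')
print([c.text for c in parallel_match(SAMPLE)])
```

Cutting at the root's children keeps every match inside a single partition, so the merge is a plain concatenation. A query whose pattern spans a partition boundary would need a real merge of intermediate solutions, which this sketch does not attempt.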


Next challenge

Immediate future tasks

performance evaluation and profiling.

solve implementation issues for better performance.

Many more problems to be addressed

evaluate architectures other than Nvidia.

enhance the pattern matching algorithm to make use of more capabilities of the GPU.

explore other problems that share the same representation of XML documents.


Estimated schedule

[Figure: timeline running from the past through 2011 to 2014, divided into phases A, B, C and D.]

Research directions: non-uniform parallelism exploration; design a new query processing algorithm that better fits GPGPU; address other fields of the XML processing domain.

Implementation work: first demo in CUDA (Nvidia); TwigStack implementation on GPGPU; OpenCL framework for XML processing; algorithm for XML query processing on GPGPU; XML-OLAP implementation on GPGPU.

Theses: Master thesis; PhD thesis.

Presentations and benchmarks: manycore platform comparison of the TwigStack algorithm using the new OpenCL framework; TwigStackGPU presentation and benchmark; efficient XML query processing on GPGPU; XML-OLAP on GPU presentation and benchmark.


"Layer cake"

[Figure: layered diagram split into hardware and software. Hardware: GPU, CPU, ... Software layers: OpenCL, non-uniform parallelism, XML processing, query processing, OLAP operation. Applications: XML DB, web server, ...]


Conclusion

1 XML query processing is a challenge because of the growing amount of content stored in XML documents.

2 The current project shows that XML query processing on GPGPU is possible, but that a well-known algorithm is not efficient on it.

3 There are no research results on XML query processing with GPGPU yet, but there are promising results on relational database query processing.

4 An efficient GPU framework could serve as the basis for other research related to XML processing (e.g., XML-OLAP operations [3] using GPGPU).


References

[1] Nicolas Bruno, Nick Koudas, Divesh Srivastava. Holistic Twig Joins: Optimal XML Pattern Matching. SIGMOD, 2002.

[2] Imam Machdi, Toshiyuki Amagasa, Hiroyuki Kitagawa. Executing parallel TwigStack algorithm on a multi-core system. International Journal of Web Information Systems, 2010.

[3] Chantola Kit, Toshiyuki Amagasa, Hiroyuki Kitagawa. Algorithms for Efficient Structure-based Grouping in XML-OLAP. iiWAS, 2008.


backup slide: CPU vs GPU

thread scheduling
    GPU: hardware, massive parallelism of non-divergent threads
    CPU: software, limited parallelism of divergent threads

memory consistency
    GPU: no hardware consistency, software consistency not recommended (small independent caches, many cores)
    CPU: hardware and complex memory consistency management (big unified caches, few cores)

computing priority
    GPU: less global memory
    CPU: more global memory


backup slide: GPU powered webserver

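[Figure: a dynamic XML document request arrives at the web server and is handed over through FastCGI. The processing shown between request and answer, split across CPU and GPU, includes authentication, AES decryption, XPath queries on the XML document, and AES encryption. The web server then returns the XML document answer.]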
