The SCANCSV.LUA library Jaroslav Hajtmar

Post on 26-Dec-2015


Page 1: The SCANCSV.LUA library Jaroslav Hajtmar. Apology English-speaking participants Sorry, but this talk is only in Czech. Due to my language skills I would

The SCANCSV.LUA library

Jaroslav Hajtmar

Apology to English-speaking participants

Sorry, but this talk is only in Czech. With my language skills I would probably not manage to say everything important in English. I will at least try to provide the slides in English. Thank you for your understanding.

Abstract

Data processing often relies on data stored in CSV (Comma Separated Values) files. The presentation describes the author's library ScanCSV.lua and how it came about, and demonstrates practical examples of its use in ConTeXt MkIV. The author shows how to quickly and easily create printed reports, letters, forms, certificates, invitations, cards, business cards, double-sided cards, tables, animations etc. from external CSV text databases. Users of ConTeXt MkIV (and also of LuaLaTeX and LuaTeX) can use data from external CSV tables in their own documents through TeX macros built on the library, which makes the data available in a simple and natural way.

Introduction

• SCANCSV.LUA library: an easy way to use text database data stored in external CSV files in ConTeXt MkIV (it also works in LuaLaTeX and LuaPlain).
• Easily create LuaTeX documents that handle multiple data records (a CSV file as a simple database).
• Varied uses: printing of various forms, form letters, certificates, invitations, cards, business cards, double-sided cards, tables, animations etc.
• Main objectives: easy to use without knowledge of Lua; usable in LuaLaTeX and LuaPlainTeX too; access to CSV data through TeX macros built on the library functions (without writing Lua code); motivating other users to use LuaTeX.

CSV data format and SCANCSV.LUA

• Why CSV: data exchange, export to CSV (e.g. from a MySQL database), a simpler alternative to XML, easy handling (sorting and editing) in spreadsheets (Excel, Calc, Gnumeric, ...).
• General description of the CSV format.
• CSV format suitable for SCANCSV.LUA:
– The file must be encoded in UTF-8! (Files exported from XLS have to be recoded, which is a handicap.)
– Field separator: basically anything; the default is the semicolon ; (as in MS Excel).
– Field spacers (the delimiters enclosing a field): anything, and the left and right one may differ (most often "quotes"); the default is no spacers!
– The parsing algorithm of SCANCSV.LUA is very simple (although it can be freely adjusted), which brings a limitation: if spacers are set, they must be used around every field, whereas CSV in general does not require this.
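
To make the separator and spacer settings concrete, here is a minimal Lua sketch of a "very simple" row parser of the kind described above. It is not the library's ParseCSVdata(); the function name parse_row and its parameters are assumptions made for illustration only.

```lua
-- Hypothetical helper, NOT the library's ParseCSVdata(): split one CSV line
-- on a plain-text separator and strip optional left/right spacers.
local function parse_row(line, sep, ld, rd)
  sep = sep or ";"   -- default separator named in the slides
  ld = ld or ""      -- default: no left spacer
  rd = rd or ""      -- default: no right spacer
  assert(sep ~= "", "separator must not be empty")
  local fields = {}
  local start = 1
  while true do
    -- plain find (4th argument true), so "." or "|" are not treated as patterns
    local s, e = string.find(line, sep, start, true)
    local field = string.sub(line, start, s and s - 1 or #line)
    -- strip the spacers only when the field really is enclosed in them
    if #ld + #rd > 0 and #field >= #ld + #rd
       and string.sub(field, 1, #ld) == ld
       and string.sub(field, #field - #rd + 1) == rd then
      field = string.sub(field, #ld + 1, #field - #rd)
    end
    fields[#fields + 1] = field
    if not s then break end
    start = e + 1
  end
  return fields
end

local row = parse_row("1;Petr;Novák;19.5.1989")
print(#row, row[2])

local quoted = parse_row('"Jan","Novotný"', ",", '"', '"')
print(quoted[1], quoted[2])
```

Because the line is split on every separator before the spacers are looked at, a separator inside a quoted field still breaks the field apart, which is the limitation the slide mentions.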

SCANCSV: history, inspiration

• 2005 - Petr Olšák's macro scanbase.tex appears; it processes text files in a certain format.
• Petr Olšák modified and generalized scanbase.tex into the macro scancsv.tex, which processes text files in CSV format. I used it in plain TeX until 2008.
• 2008 - the macro is adapted for LaTeX (Jaromír Kuben) and for ConTeXt (Petr Olšák). I still use it in ConTeXt MkII today.
• 2010 - I began to use ConTeXt MkIV. The original macro does not work there: ConTeXt MkIV works with the UTF-8 character set, which the macro is unable to process.
• March 2010 - I get acquainted with LuaTeX and the Lua language and start creating the library scancsv.lua. The first version was practically useless.
• July 2010 - first really usable version.
• Today - improvements, tuning and expansion of options.

The operating principle of the library

1. Load the library scancsv.lua (a single piece of Lua code in the ConTeXt source text).
2. Optionally set the header flag, the column separator and the spacers (otherwise the default values apply).
3. Open the CSV file (in one of several ways).
4. Load a row of the CSV table (manually or in a loop).
5. Parse the row (separate the column data).
6. Store the column data in TeX macros.
7. Repeat steps 4 to 6 for all rows of the CSV table.

How the first row of the table is processed depends on whether it is a header or not. Once the column data are loaded into the macros, they are available in ConTeXt. Rows can be browsed "manually", with standard loops, or with the macros of the library.
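
Steps 3 to 7 can be pictured with the following Lua sketch. It works on an in-memory string instead of a file and stores the values in a Lua table instead of TeX macros, so it only illustrates the flow; all names (csv, has_header, split, letter) are hypothetical and not taken from scancsv.lua.

```lua
-- Sketch of steps 3-7 on an in-memory CSV text; all names are hypothetical.
local csv = "Surname;Firstname\nNovák;Jan\nPospíšilová;Hana\n"
local has_header = true          -- plays the role of the header flag (\setheader)

-- Split a line on ";" (the default separator); spacers are not handled here.
local function split(line)
  local fields = {}
  for field in (line .. ";"):gmatch("(.-);") do
    fields[#fields + 1] = field
  end
  return fields
end

-- Column letter for the n-th column, valid for n = 1..26 only (A..Z).
local function letter(n) return string.char(64 + n) end

local names, rows = nil, {}
for line in csv:gmatch("[^\n]+") do                -- step 4: load a row
  local fields = split(line)                       -- step 5: parse it
  if has_header and not names then
    names = fields                                 -- first line supplies the names
  else
    local record = {}
    for i, value in ipairs(fields) do              -- step 6: store column data
      record["c" .. letter(i)] = value             -- always available: cA, cB, ...
      if names and names[i] then record[names[i]] = value end
    end
    rows[#rows + 1] = record
  end
end

for n, record in ipairs(rows) do
  print(n, record.cA, record.Firstname)
end
```

With has_header set to false, the first line would be treated as data and only the cA, cB, ... keys would exist, which mirrors the difference between \setheader and \resetheader.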

Using the "manual" mode

• Load the library: \directlua{dofile("scancsv.lua")}
• Set the header flag if the file has a header: \setheader (or unset it: \resetheader)
• Open the CSV file: \opencsvfile{file.csv}
• Then, in the source text, use the macros \cA, \cB, ... (or \Firstname, \Lastname, ... if the first line contains the header Firstname, Lastname, ...). These macros contain the column values of the current CSV row.
• \nextrow - go to the next table row (the macros \cA, \cB, ... or \Firstname, \Lastname, ... are filled with new values)

Main TeX macros for using the library

• \setfiletoscan{CSVFile} - set the name of the CSV file
• \setheader - set the header flag
• \resetheader - unset the header flag
• \setsep{,}, \setld{*}, \setrd{!} - set the column separator and the left and right column spacers to user (non-default) values
• \resetsep, \resetld, \resetrd - reset them to the default values
• \opencsvfile{CSVFile}, \openheadercsvfile{CSVFile} - open a CSV file
• \nextrow - go to the next row of the CSV file
• \printline, \printall - print the current line / the whole CSV table
• \filelineaction, \filelineaction{CSVfile}, \filelineaction{CSVfile}{to}, \filelineaction{CSVfile}{from}{to} - process the user-defined macro \lineaction in a loop

TeX macros for accessing column data

CSV file without header (default option, \resetheader)

1;Petr;Novák;19.5.1989;m;Nymburk;U Brány 7
2;Jan;Novotný;5.7.1991;m;Praha;Uhlířská 178
3;Zuzana;Vašíčková;13.9.1984;ž;Ostrava;Jánská 14
…

There is no header; all lines are data lines. The columns are accessed as \cA \cB \cC \cD …

Roman column numbering can be set as well: \cI, \cII, \cIII, \cIV, … (default UserColumnNumbering='XLS')

CSV file with header (switched on with \setheader)

Surname;Firstname;Birthdate;Sex;City;Zipcode;Street
Novák;Jan;14.10.1997;m;Zbečno;27024;Farní 21
Pospíšilová;Hana;4.1.1996;ž;Zábřeh;78901;Studénky 420
…

The first line is the header (no data); the following lines are data lines. Here \cA = \Surname, \cB = \Firstname, \cC = \Birthdate, …
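
The column macro names (\cA, \cB, ... in spreadsheet style, or \cI, \cII, ... in Roman style) are derived from the column index. The Lua sketch below shows one way such names can be computed; to_xls and to_roman are hypothetical re-implementations, and the library's own ar2xls() and ar2rom() may differ in detail.

```lua
-- Hypothetical re-implementations; the library's ar2xls()/ar2rom() may differ.

-- Spreadsheet-style column name: 1 -> "A", 26 -> "Z", 27 -> "AA".
local function to_xls(n)
  local name = ""
  while n > 0 do
    local rem = (n - 1) % 26
    name = string.char(65 + rem) .. name
    n = math.floor((n - 1) / 26)
  end
  return name
end

-- Roman numeral for a positive integer: 4 -> "IV", 14 -> "XIV".
local function to_roman(n)
  local values  = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1}
  local symbols = {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"}
  local out = ""
  for i, v in ipairs(values) do
    while n >= v do
      out = out .. symbols[i]
      n = n - v
    end
  end
  return out
end

-- Macro names for the third column in both numbering styles.
print("c" .. to_xls(3), "c" .. to_roman(3))
```

Letters and Roman numerals are a natural choice here because TeX control sequence names normally consist of letters only, so a name such as \c3 would not work as a single macro.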

TeX macros to obtain "system" information

• \csvfilename - name of the currently open CSV file
• \numcols - number of columns of the CSV table
• \numrows - number of processed (offered) lines
• \numline - serial number of the currently loaded row
• \csvreport - report with information on the open CSV file

Hooks for data processing (default \relax)

• \blinehook, \elinehook - begin-line hook and end-line hook; executed before and after the macro \lineaction is processed for a row of the CSV table
• \bfilehook, \efilehook - executed before and after processing the entire CSV table
• \bch, \ech - begin-column hook and end-column hook; they can be set manually in the Lua code, but because the macro cannot be tested, this option is disabled

TeX conditionals for testing the end of the CSV file

• \ifEOF - true when processing has reached the end of the CSV file
• \ifnotEOF - the opposite of \ifEOF
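
The order in which the file and line hooks fire around \lineaction can be sketched in Lua as follows. This is an illustration under assumptions, not the library's code: in the library the hooks are TeX macros defaulting to \relax, whereas here they are Lua functions, and the driver process with its from/to range is hypothetical.

```lua
-- Hypothetical driver showing the hook order; not taken from scancsv.lua.
local function process(rows, hooks, from, to)
  local noop = function() end                 -- stands in for the \relax default
  from = from or 1
  to = math.min(to or #rows, #rows)
  local bfile, efile = hooks.bfilehook or noop, hooks.efilehook or noop
  local bline, eline = hooks.blinehook or noop, hooks.elinehook or noop
  bfile()                                     -- before the whole table
  for n = from, to do
    bline(n)                                  -- before each row
    hooks.lineaction(rows[n], n)              -- the user's per-row action
    eline(n)                                  -- after each row
  end
  efile()                                     -- after the whole table
end

local log = {}
process({"row1", "row2", "row3"}, {
  bfilehook  = function() log[#log + 1] = "<file>" end,
  efilehook  = function() log[#log + 1] = "</file>" end,
  blinehook  = function() log[#log + 1] = "<line>" end,
  elinehook  = function() log[#log + 1] = "</line>" end,
  lineaction = function(row) log[#log + 1] = row end,
}, 2, 3)
print(table.concat(log, " "))
```

The optional from and to arguments are meant to resemble the row range of \filelineaction{CSVfile}{from}{to}; whether the library clamps the range in the same way is an assumption.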

Using the "manual" mode

• In the source code we use the macros \cA, \cB, ... or \Firstname, \Lastname, ... (if the first line contains a header), which contain the column values of the current CSV row.
• \nextrow - go to the next table row (the macros \cA, \cB, ... are filled with new values)

Modifying the behaviour of the library

• Default settings can be changed by editing the file scancsv.lua (in the introductory section of the code).
• While a ConTeXt MkIV (or LuaLaTeX) document is being processed, the separator, spacer and header settings can be changed at any time using TeX macros.
• Different CSV files (with different column separators and spacers) can be processed in one document.
• Use of hooks; by default they are \relax.

Main Lua library functions

• ParseCSVdata() - parses individual records (rows) of the CSV table
• lineaction() - processes the user macro \lineaction over the specified range of lines of the open CSV file
• CreatePageFiles() - creates two CSV files from the open CSV file. It is used to print double-sided cards laid out on the page in an R x C block (the CSV file for the 2nd side is "repositioned" so that the fronts and backs of the cards match)
• Filelineactioncards() - prints the 1st and 2nd sides of a list of cards from the files created by the previous function
• CSVReport() - returns report information about the open CSV file
• csvfilename() - name of the currently open CSV file
• TMN(s) - TeX Macro Name; a macro name must not contain prohibited characters
• ar2rom() - converts Arabic numbers to Roman; used for "numbering" columns in macro names
• ar2xls() - converts numbers to column names (Excel format)
• ar2colnum() - returns the column designation for the TeX macro according to the setting of a global variable
• printline() - prints the current row of the CSV table
• printall() - prints the whole CSV table
• printallcontext() - prints the whole CSV table in ConTeXt syntax
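
The "repositioning" mentioned for CreatePageFiles() can be pictured with a small Lua sketch. It assumes that the cards are laid out in rows of a fixed number of columns and that the sheet is flipped around its vertical axis, so each row has to be mirrored on the back side. The function back_side_order is hypothetical; how the library actually reorders the records is not stated in the slides.

```lua
-- Hypothetical sketch: reorder records for the back sides of cards laid out
-- in rows of `cols` cards, assuming the sheet is flipped around its vertical axis.
local function back_side_order(records, cols)
  local out = {}
  for first = 1, #records, cols do
    local last = math.min(first + cols - 1, #records)
    -- walk the row from right to left; pad an incomplete final row with ""
    -- so that the mirrored cards keep their positions on the sheet
    for pos = first + cols - 1, first, -1 do
      out[#out + 1] = (pos <= last) and records[pos] or ""
    end
  end
  return out
end

-- Front side: a b c / d e (empty); back side: c b a / (empty) e d
local backs = back_side_order({"a", "b", "c", "d", "e"}, 3)
print(table.concat(backs, ","))
```

A sheet flipped around its horizontal axis would instead need the order of the rows reversed, so the flip direction is a parameter any real implementation has to fix.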

Testing and cycles

Conditions with AND and OR (see Olšák TBN)

% Condition A AND B
\doloop{\ifnum\Id>2
  \ifnum\Id<10\lineaction\fi
  \fi\ifEOF\exitloop\else\nextrow\ifEOF\exitloop\fi\fi}

% Condition A OR B
\def\AorB{\lineaction}
\doloop{\ifnum\Id=1\AorB%
  \else\ifnum\Id>3\AorB\fi\fi
  \ifEOF\exitloop\else\nextrow\ifEOF\exitloop\fi\fi}

SCANCSV.LUA and cycles

Examples of ConTeXt cycles:

\dorecurse{5}{\lineaction\nextrow} % \lineaction macro for the next 5 rows
\doloop{\lineaction\nextrow\ifnum\numline>7\exitloop\fi}
\doloop{\ifEOF\exitloop\else\lineaction\nextrow\fi}
\doloop{\lineaction\nextrow \if\Id3 \exitloop \fi}

Examples of library cycles (only in the test version of SCANCSV.LUA). These macros are based on the \doloop macro and are meant to be easier to use in source code.

\doloopwhile{\Trida}{3.A}{\tableaction} % list all rows that meet the criterion
\doloopuntil{\Trida}{3.A}{\tableaction} % list rows until the criterion is not satisfied
\doloopforall{\lineaction} % run the \lineaction macro for all lines
\doloopfromto{3}{7}{\lineaction}
\doloopaction % without parameters: run \lineaction for all rows
\doloopaction{\useraction} % run the user macro \useraction for all rows
\doloopaction{\useraction}{5} % run \useraction for the first 5 rows
\doloopaction{\useraction}{5}{7} % run \useraction for rows 5-7

Practical demonstrations of the use of the library

• Forms, form letters, etc.
• Cards, business cards, …
• Tables
• MetaPost animation
• Use of ConTeXt cycles and IF tests
• SCANCSV.LUA "drifts" (TeX macros in a CSV file, changing \lineaction while the CSV file is being processed)
• Samples of work for CTM & TE

Constraints, compatibility, flaws

• SCANCSV.LUA does not handle general CSV files. Reason: the parsing algorithm is very simple. If a field contains the column separator ",", the usual CSV output looks like this: 1, Jan, Novotny, "The Gate 4, 111 50 Prague", ... Solution: a better (more general) algorithm; it suffices to change only the function ParseCSVdata().
• Occasional problems with expansion. E.g. I failed to get SCANCSV.LUA working in the database module (\usemodule[database]) by Mojca Miklavec.
• Some things work only in ConTeXt.
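
As an illustration of what a more general parsing algorithm could look like, here is a Lua sketch of a quote-aware splitter that keeps a separator inside a quoted field. It is not part of scancsv.lua; the function parse_quoted is hypothetical, it assumes a single-byte separator and quote character, and it treats a doubled quote inside a quoted field as a literal quote (a common CSV convention that the slides do not mention).

```lua
-- Hypothetical quote-aware splitter; NOT part of scancsv.lua.
-- Assumes `sep` and `quote` are single-byte characters. UTF-8 text in the
-- fields is preserved because bytes are copied unchanged.
local function parse_quoted(line, sep, quote)
  sep, quote = sep or ",", quote or '"'
  local fields, buf, in_quotes = {}, {}, false
  local i = 1
  while i <= #line do
    local ch = line:sub(i, i)
    if in_quotes then
      if ch == quote then
        if line:sub(i + 1, i + 1) == quote then
          buf[#buf + 1] = quote        -- doubled quote: literal quote character
          i = i + 1
        else
          in_quotes = false            -- closing quote
        end
      else
        buf[#buf + 1] = ch             -- separators are ordinary text here
      end
    elseif ch == quote then
      in_quotes = true                 -- opening quote
    elseif ch == sep then
      fields[#fields + 1] = table.concat(buf)
      buf = {}
    else
      buf[#buf + 1] = ch
    end
    i = i + 1
  end
  fields[#fields + 1] = table.concat(buf)
  return fields
end

local rec = parse_quoted('1,Jan,Novotny,"The Gate 4, 111 50 Prague"')
print(#rec, rec[4])
```

Unlike the simple approach, this version lets quoted and unquoted fields be mixed within one line, so the spacers no longer have to be used around every field.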

Possibilities of improvement ...

• Improvement and generalization of the parsing algorithm
• Use for XML processing??
• Create a separate module ONLY FOR MKIV (which would remove a number of limitations due to LuaLaTeX)

Thanks…

• Members of the mailing list [email protected] for advice about ConTeXt and Lua. The library would not exist without their kind assistance. Special thanks to Taco Hoekwater, Hans Hagen and Wolfgang Schuster.
• Members of the mailing list [email protected] for advice about TeX and LaTeX, especially Mr. Zdenek Wagner, Vit Zýka, Pavel Stříž and Petr Olšák.
• Pavel Stříž for inspiration, testing, advice and for convincing me to finish the library and present it at this conference.

Discussion