
Page 1: The How and Why of LaTeX February 2013 Tim Finin finin@cs.umbc.edu

The How and Why of LaTeX

February 2013

Tim Finin
finin@cs.umbc.edu

http://ebiq.org/r/350

Page 2: Objective

• Understand the role of LaTeX in your research
• Learn how to create a simple LaTeX2e document
  – Create a LaTeX source file
  – Create and include figures
  – Reference figures and sections
  – Create lists
  – Include other tex files
  – Generate pdf output
  – Cite bibliographic references

Page 3: History: TeX and LaTeX

• Donald Knuth created TeX in the late 70s so he could typeset his famous Art of Computer Programming books
• TeX produced great output and was very powerful (and programmable) but also very obscure
• Leslie Lamport of SRI produced LaTeX in the early 80s as a macro package making TeX easy to use
• I’ve never known anyone who used TeX directly

Page 4: Other Options

• Microsoft Word is a great product
  – Track changes is a great feature and you can’t beat it for small documents
• HTML is fine if your target is a screen
  – The W3C does all of its documentation in HTML and many ebook formats (e.g., Kindle) use HTML
• Google Docs is up and coming
  – Great for real-time collaboration
• That’s about it these days
  – No one uses Tj6, Scribe, Pub, troff, WordPerfect, …

Page 5: Why LaTeX?

Page 6: Why LaTeX

• It’s good for complex documents like a thesis or dissertation
• It’s the standard for CS, Math & other STEM fields
  – Many conferences have their own LaTeX document class
  – Elsevier uses LaTeX to typeset all their journals
• LaTeX enforces typesetting best practices
• Its bibliography system, BibTeX, is a standard
• It is programmable!
• It’s open source, has a large user and developer community & a good infrastructure (CTAN)

Page 7: Getting Started

• Some good resources include:
  – Getting Started with TeX, LaTeX, and Friends
  – LaTeX wiki book
  – LaTeX cheat sheet
• Your steps will be
  – Install LaTeX on your system if it’s not there
  – Try working with a simple document
  – Customize your environment
  – Learn how to create and manage figures, tables, equations, create a bibliography, etc.
  – Write a few LaTeX macros

Page 8: Accessing and Installing LaTeX

• LaTeX and associated tools are typically preinstalled on Linux and Mac OS X
  – Sadly no longer on Mac’s Mountain Lion
  – They are also on the CSEE servers and gl
• The TeX Users Group is a good resource
• MiKTeX is a good choice for Windows
• MacTeX is a good choice for Mac OS X

Page 9: sample.tex

LaTeX commands start with a backslash; required args are in {}, optional args in []

\documentclass[12pt]{article}
% a simple example
\usepackage{times}
\begin{document}
\title{Hello World in LaTeX}
\author{My Name Goes Here}
\maketitle

Hello, world!

{\em Hello, world!}

{\bf Hello, world!}

{\Large \bf Hello, world!!!}
\end{document}

• \documentclass: start by declaring the document type (article); the 12pt option sets the font size
• \usepackage: loads required packages that define commands or set parameters
• \begin{document}: LaTeX uses \begin and \end commands for blocks. Every document must have a document block
• The \title and \author commands set document variables and the \maketitle command generates the output text
• Paragraphs are separated by blank lines
• {}s introduce blocks and control scope. \em for italics, \bf for bold, \Large ups font size
• Comments: text from a % to the EOL is ignored
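
The same document can also be written with the LaTeX2e text commands, which take the text as an argument instead of switching the font inside a {} group. This is only a sketch of that variant, not part of the original slide; \bf is a holdover from older LaTeX that the standard classes still accept, while \emph, \textbf and \bfseries are the LaTeX2e forms.

```latex
\documentclass[12pt]{article}
% Variant of sample.tex using LaTeX2e font commands instead of \em and \bf
\begin{document}
\title{Hello World in LaTeX}
\author{My Name Goes Here}
\maketitle

Hello, world!

\emph{Hello, world!}

\textbf{Hello, world!}

% \Large and \bfseries are declarations, so the braces still limit their scope
{\Large\bfseries Hello, world!!!}
\end{document}
```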

Page 10: Compiling with pdflatex

> pdflatex sample
This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009)
entering extended mode
(./sample.tex
LaTeX2e <2009/09/24> ...
(/usr/local/texlive/2009/texmf-dist/tex/latex/base/article.cls
Document Class: article 2007/10/19 v1.4h Standard LaTeX document class
(/usr/local/texlive/2009/texmf-dist/tex/latex/base/size12.clo))
...
Output written on sample.pdf (1 page, 29675 bytes).
Transcript written on sample.log.

Diagram: f.tex → pdflatex → f.pdf, f.aux

Page 11: Compiling, old school

> latex sample
This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009)
...
Output written on sample.dvi (1 page, 652 bytes).
Transcript written on sample.log.
> dvips sample -o sample.ps
This is dvips(k) 5.98 Copyright 2009 Radical Eye Software (www.radicaleye.com)
' TeX output 2011.01.31:0857' -> sample.ps
...
> ps2pdf sample.ps
>

Diagram: f.tex → latex → f.dvi → dvips → f.ps → ps2pdf → f.pdf

Page 12: Output files

> ls -l sample*
-rw-r--r-- 1 finin staff 8 Jan 31 08:57 sample.aux
-rw-r--r-- 1 finin staff 652 Jan 31 08:57 sample.dvi
-rw-r--r-- 1 finin staff 3363 Jan 31 08:57 sample.log
-rw-r--r--@ 1 finin staff 3336 Jan 31 09:00 sample.pdf
-rw-r--r-- 1 finin staff 10664 Jan 31 08:58 sample.ps
-rw-r--r-- 1 finin staff 237 Jan 31 08:33 sample.tex

Page 13: Files LaTeX Uses

• Input source file (.tex)
• Files containing structure and layout definitions (.sty)
• TeX formatted output file (.dvi)
• Others: .toc (table of contents), .lof (list of figures), .lot (list of tables), .bib (bibliography)

Page 14: Document Classes

• Documents start with a document class declaration (e.g., article, report, book, slides, letter)
    \documentclass{article}
    \documentclass[11pt,letterpaper]{article}
• The second example adds two optional args: 11pt sets the document’s base font size (the article default is 10pt), letterpaper the paper size
• Conferences & journals define their own classes
    \documentclass[10pt,journal,compsoc]{IEEEtran}
    \documentclass[runningheads,a4paper]{llncs}

Page 15: Packages

• After declaring the document class, you typically load one or more packages
    \usepackage{graphicx}
    \usepackage{algorithm}
• Packages add new commands or redefine existing ones to provide new capabilities, e.g.
  – graphicx: essential for including images
  – colortbl: add colors to table rows, columns, cells
  – floatflt: figures and tables that text flows around
• Many are pre-installed, all can be easily downloaded

Page 16: Installing Classes and Packages

• If you need a class, package or bib style that’s not installed, download the files
• Put them in a document’s directory or your common latex directory
  – export TEXINPUTS=.:~/latex/: (the trailing colon keeps the default search path)
• File extensions are .cls (class), .sty (package) and .bst (bib style)

Page 17: Including Other LaTeX Files

• Supports modularity
  – A single LaTeX document can consist of multiple LaTeX files
  – Very useful for group work, e.g., many authors using SVN
• \input{intro}
  – Used to include other LaTeX files
  – LaTeX filename is intro.tex

A typical top level file:

\documentclass[letterpaper]{article}
\usepackage{aaai}
\usepackage{times}
\usepackage{graphicx}
% comment: more here
\begin{document}
  \include{title}
  \include{intro}
  \include{motivation}
  \include{related}
  \include{approach}
  \include{evaluation}
  \include{conclusion}
  \include{bibliograph}
\end{document}
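
The slide names \input while the top level file uses \include, so a short sketch of how the two differ may help. \input pastes the file in place and can be used anywhere, including the preamble. \include starts a new page, keeps a separate .aux file per included file, and works with \includeonly so you can typeset one part while drafting. The file names macros.tex, intro.tex and approach.tex are hypothetical and assumed to sit next to the top level file.

```latex
\documentclass{report}
% Hypothetical files: macros.tex, intro.tex, approach.tex
\input{macros}          % pasted in place; allowed in the preamble
\includeonly{approach}  % while drafting, typeset only this part
\begin{document}
\include{intro}         % starts on a new page and gets its own intro.aux
\include{approach}
\end{document}
```

Deleting the \includeonly line typesets the whole document again; cross-references into skipped parts keep working because their .aux files are still read.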

Page 18: Figures and tables

• Standard LaTeX figures and tables have a caption and can appear inline or float to the top or bottom of a page
• There are many packages that extend features
• Reference like “see Figure~\ref{acc} …”

\begin{figure}[tbhp]
  \centering
  \includegraphics[width=0.8\textwidth]{accuracy.png}
  \caption{Category accuracy... linking.}
  \label{acc}
\end{figure}

Page 19: Tabular environment

\begin{figure}[tbp]
  \centering
  \begin{tabular}{|c|c|c|c|}
    \hline
    {\em City}&{\em State}&{\em Mayor}&{\em Population}\\ \hline
    Baltimore&MD&S.C.Rawlings-Blake&640,000\\ \hline
    Philadelphia&PA&M.Nutter&1,500,000\\ \hline
    New York&NY&M.Bloomberg&8,400,000\\ \hline
    Boston&MA&T.Menino&610,000\\ \hline
  \end{tabular}
  \caption{This simple table represents ...America}
  \label{table:USCities}
\end{figure}
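
The slide wraps the tabular in a figure environment, so its caption would be numbered as a figure. A possible variant, sketched here with two of the rows from the slide, uses the standard table float instead so the caption reads "Table 1". The column spec and caption text are my own choices for illustration.

```latex
\documentclass{article}
\begin{document}
Table~\ref{table:USCities} lists two of the cities.

\begin{table}[tbp]
  \centering
  \begin{tabular}{|l|l|l|r|}  % left-align text, right-align numbers
    \hline
    \emph{City} & \emph{State} & \emph{Mayor} & \emph{Population} \\ \hline
    Baltimore & MD & S.C. Rawlings-Blake & 640,000 \\ \hline
    Boston    & MA & T. Menino           & 610,000 \\ \hline
  \end{tabular}
  \caption{Two rows from the city table.}
  \label{table:USCities}  % \label goes after \caption so it picks up the table number
\end{table}
\end{document}
```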

Page 20: LaTeX Miscellanea

• LaTeX quotes and hyphens
  – `` '' , not " "
  – -- or ---, not –
• There are good packages for producing html from a latex document, e.g., latex2html
  – I use it to generate pdf and html versions of my CV from the same latex source
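
A small sketch of the quote and dash conventions in a complete document; the sentences are made up for illustration.

```latex
\documentclass{article}
\begin{document}
``Double quotes'' open with two backticks and close with two apostrophes.

`Single quotes' use one of each.

% One hyphen joins words, two make an en dash for ranges,
% three make an em dash that sets off a phrase.
A well-known example appears on pages 87--92---or so the index says.
\end{document}
```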

Page 21: Errors and warnings

• Some errors are fatal and require fixing before LaTeX can run to completion
• You can continue from some
• Some are just warnings, e.g., missing labels, overfull/underfull lines
• You can refer to them later in the .log file
• Sometimes it can be hard to find the source of an error, just as in a compiler for a programming language

Page 22: Learning the rest

• Study the source of documents that have things you want to do
  – Your lab-mates’ and colleagues’ documents
  – http://svn.cs.umbc.edu/ebiquity/papers/
• Read some documentation
  – LaTeX and common classes and packages have mostly been designed by and for CS types
  – Reading documentation can reveal powerful features
• Search for help online
  – e.g. “latex two column figure”

Page 23: LaTeX is a three-pass compiler!

• LaTeX needs three passes to get all of the labels (e.g., sections, citations, figures) right
• Typical final run is:
    latex mypaper
    bibtex mypaper
    latex mypaper
    latex mypaper
• You might want to use a makefile or latexmk
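
Why more than one pass is needed can be seen with a tiny document like the sketch below (the section names and label are made up). On the first run there is no .aux file yet, so \ref prints ?? and LaTeX warns about an undefined reference; the label is written to the .aux file during that run and resolved on the next one.

```latex
\documentclass{article}
\begin{document}
\section{Introduction}
% First pass: no .aux file exists yet, so this prints ??
Results are in Section~\ref{sec:results}.

\section{Results}
\label{sec:results}  % written to the .aux file, read back on the next pass
Nothing to report yet.
\end{document}
```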

Page 24: Bibliographies 101

• BibTeX is LaTeX’s system for bibliographies
• Start with a file (e.g., thesis.bib) with entries for papers you often cite

@Book{Torre08,
  author = "Joe Torre and Tom Verducci",
  publisher = "Doubleday",
  title = "The Yankee Years",
  year = 2008
}

• Cite papers like \cite{Torre08,Knuth72}
• Specify bibliography style: \bibliographystyle{IEEEtran}
• Link .bib file to use: \bibliography{thesis}
• Run bibtex command to create bibliography
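
Put together, the pieces above might look like the following minimal document. It assumes thesis.bib sits in the same directory and contains the Torre08 entry from the slide; the plain style is swapped in here only because it ships with BibTeX, and the sentence is made up.

```latex
\documentclass{article}
\begin{document}
Books such as~\cite{Torre08} are cited by their key.

\bibliographystyle{plain}  % assumption: plain instead of IEEEtran
\bibliography{thesis}      % reads thesis.bib; the .bib extension is left off
\end{document}
```

Running latex, bibtex, latex, latex on this file should fill in the citation number and the reference list.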

Page 25: BibTeX Ontology

• BibTeX has a set of document types
    article, book, booklet, inbook, incollection, inproceedings, manual, misc, phdthesis, proceedings, techreport, unpublished
• Each has a set of required and optional attributes, e.g.:
    author, title, booktitle, edition, editor, journal, howpublished, institution, month, note, number, volume, organization, pages, publisher, series, year
• The Wikipedia article is a quick reference for what attributes go with which entity types
• BibTeX ignores unknown entity types and attributes

Page 26: A typical entry

@InProceedings{mlm12a,
  author = "M. Lisa Mathews and Paul Halvorsen and ...",
  title = "{A Collaborative Approach ... CyberSecurity}",
  month = "October",
  year = "2012",
  booktitle = "8th {IEEE} Int. Conf. on ... Worksharing",
  publisher = "IEEE Computer Society",
}

Page 27: BibTeX cautions

• Be sure you understand the attributes you use
  – E.g., pages refers not to the number of pages in an article but its pages in a journal or proceedings
• You can capture BibTeX entries online
  – Those from ACM or IEEE are usually correct
  – Google Scholar often has the wrong type
• List each author as first name then last name, with authors separated by the word and
• Most bib styles use sentence case; wrap text that should be in caps with braces
  – booktitle = "8th {IEEE} Int. Conf. on ... Worksharing",

Page 28: Conclusion

• Knowledge of LaTeX is required in many circles
• Most journals & conferences provide LaTeX files ensuring high-quality camera-ready documents
• Mastering LaTeX is not hard for a computing person, but it takes a while
• It’s fun to be able to write your own macros
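
As a taste of what writing your own macros looks like, here is a small sketch using \newcommand. The macro names \sysname, \term and \todo are hypothetical and chosen only for illustration.

```latex
\documentclass{article}
% Hypothetical macros, not from the slides
\newcommand{\sysname}{T2LD}                      % no arguments: a reusable name
\newcommand{\term}[1]{\emph{#1}}                 % one argument: mark up a new term
\newcommand{\todo}[2][TODO]{\textbf{[#1: #2]}}   % optional first argument, default TODO
\begin{document}
\sysname{} links each \term{table cell} to an entity.

\todo{add evaluation} \todo[Tim]{check numbers}
\end{document}
```

Changing the definition of \term in one place restyles every term in the document, which is most of the appeal.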

Page 29: Real example

\documentclass[runningheads,a4paper]{llncs}

\usepackage{graphicx}
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{subfigure}
\usepackage[rflt]{floatflt} % floating figures
\usepackage{colortbl} % for colors in tables

\setcounter{tocdepth}{3}

\begin{document}
\mainmatter
\title{Using linked data to interpret tables\footnote{\scriptsize Research supported in part by a gift from Microsoft Research, a Fulbright fellowship, NSF award IIS-0326460 and the Human Language Technology Center of Excellence.}}
\titlerunning{Using linked data to interpret tables}
\author{Varish Mulwad \and Tim Finin \and Zareen Syed \and Anupam Joshi}
\authorrunning{Varish Mulwad \and Tim Finin \and Zareen Syed \and Anupam Joshi}
\institute{Department of Computer Science and Electrical Engineering\\ University of Maryland, Baltimore County, Baltimore, MD USA 21250\\ \{varish1,finin,joshi\}@cs.umbc.edu, [email protected] }
\toctitle{Using linked data to interpret tables}
\tocauthor{Mulwad \and Finin \and Syed \and Joshi}

\maketitle

Page 30: Real example

\begin{abstract}
Vast amounts of information is available in structured forms like spreadsheets, database relations, and tables found in documents and on the Web. We describe an approach that uses linked data to interpret such tables and associate their components with nodes in a reference linked data collection. Our proposed framework assigns a class (i.e. type) to table columns, links table cells to entities, and inferred relations between columns to properties. The resulting interpretation can be used to annotate tables, confirm existing facts in the linked data collection, and propose new facts to be added. Our implemented prototype uses DBpedia as the linked data collection and Wikitology for background knowledge. We evaluated its performance using a collection of tables from Google Squared, Wikipedia and the Web.
\end{abstract}

\section{Introduction}

Resources like Wikipedia and the Semantic Web's linked open data collection\cite{bizerc2009}
are now being integrated to provide experimental knowledge bases
containing both general purpose knowledge as well as a host of specific facts about
significant people, places, organizations, events and many other entities of interest.
The results are finding immediate applications in many areas, including improving
information retrieval, text mining, and information extraction. Still more structured data
is being extracted from text found on the web through several new research programs\cite{etzioni2006machine,mcnamee2009overview}.

Page 31: Real example

…of the class labels predicted were considered correct by the evaluators. The accuracy in
each of the four categories is shown in Figure \ref{columnCorrectness_EntityLinking}. We
enjoyed moderate success in assigning class labels for {\em Organizations} and
{\em Other} types of data probably because of sparseness of data in the KB about these
types of entities.

\begin{figure}[tbp]
\fbox{\includegraphics[scale = 0.65]{images/accuracy_chart}}
\caption {Category wise accuracy for ``column correctness'' is shown in (a) and for entity linking in (b) }
\label{columnCorrectness_EntityLinking}
\end{figure}

\subsection{Linking table cells to entities}

For the evaluation of linking table cells to entities, we manually hand-labeled the 611
table cells to their appropriate Wikipedia / DBpedia pages. The system generated links
were compared against the

Page 32: Real example

\subsection{Relation identification}

We did a preliminary evaluation for identification of relation between columns. We asked
human evaluators to identify pairs of columns in a table between which a relation may
exist and compared that against the pairs of columns identified by the system. For five
tables, used in this evaluation, in 25\% of the cases, the system was able to identify the
correct pairs of columns.

\section{Conclusion}

We presented an automated framework for interpreting data in a table using existing Linked
Data KBs. Using the interpretation of the table we generate linked RDF from
webtables. Evaluations show that we have been fairly successful in generating correct
interpretation of webtables. Our current work is focused on improving relationship
discovery and generating new facts and knowledge from tables that contain entities not
present in the LOD knowledge bases. To deal with web scale analytics, we plan to focus on
adapting our algorithms for parallelization using Hadoop or Azure type frameworks. We are
also exploring ways to apply this work to create an automated (or semi-automated / human
in the loop) framework for interpreting and representing public government datasets as
linked data.

\bibliographystyle{springer}
\bibliography{cold}

\end{document}

Page 33: Real example, cold.bib

@article{bizerc2009,
  author = {Bizer, Christian},
  journal = {IEEE Intelligent Systems},
  number = {5},
  pages = {87--92},
  title = {The Emerging Web of Linked Data},
  volume = {24},
  year = {2009}
}

@inproceedings{zieglerp04,
  author = {Ziegler, Patrick and Dittrich, Klaus R.},
  booktitle = {Building the Information Society},
  doi = {10.1007/978-1-4020-8157-6_1},
  pages = {3--12},
  publisher = {Springer Boston},
  title = {Three Decades of Data Integration: all Problems Solved?},
  url = {http://www.springerlink.com/content/t25x6t660v43m37k/},
  volume = {156},
  year = {2004}
}

@MastersThesis{t2ldvarishthesis10,
  author = "Varish Mulwad",
  title = {{T2LD} - An automatic framework for extracting, interpreting and representing tables as Linked Data},
  month = "August",
  year = "2010",
  publisher = "UMBC",
  school= {{U. of Maryland, Baltimore County}},
  howpublished = {http://ebiquity.umbc.edu/paper/html/id/480/T2LD-An-automatic-framework-for-extracting-interpreting-and-representing-tables-as-Linked-Data}
}