Upload: scott-nesbitt

Post on 16-Apr-2017

Adventures in Modern Publishing

By: Scott Nesbitt

Publishing has changed dramatically in the last 20 years. In fact, it's undergone something of a minor revolution in the last 10. You don't need me to tell you that, but it's a good opportunity for me to tell a story.

Once upon a time, publishing was the domain of the folks who could afford printing presses. Not just to own them, but to operate and maintain them. Printing presses were, and are, huge machines that require skilled people to work with them.

Compounding that, if you were a writer, the only way you could get your book published was to put its fate in the hands of an editor at a publishing house. Your chances then, as now, weren't that good. While you could go the vanity publishing route, that was expensive and out of reach for most.

But as time went on, alternatives emerged. Devices like the typewriter, the mimeograph machine, and the photocopier. The quality varied, depending on what people were using, but those devices put printing in the hands of ordinary people. They enabled everyday folk to put things in print, and in some level of mass quantity. Still, paper and binding could be expensive. That was, and is still (to a degree), a big barrier to entry.

But high-quality publishing did move closer into the hands of the average person and most authors. I like to think that that move started with Donald Knuth and his creation, the TeX typesetting system. For me, though, the turning point came in 1988. I was in journalism school and the newsroom of the student newspaper got a battery of Macintosh computers. The ones that we now call Classic Macs. Using Microsoft Word, a laser printer, and the venerable techniques of paste up, we were able to quickly assemble an edition of the paper and send it to the printer.

I remember one instance in particular, where my class covered an event late one Thursday afternoon. We rushed back to the newsroom, wrote up our stories, put together the paper, and sent it to the printer. All before 7:00 p.m. the same day.

What we did was primitive, but it opened my eyes. As did more powerful tools like Ventura Publisher, QuarkXPress, PageMaker and, in the world of technical communication, FrameMaker. But those tools had one thing in common. Although the work was done on computers, the goal was to put the work into print.

Going mobile

It wasn't until the mid-1990s, when truly mobile devices -- ones with screens only a few inches across -- started to hit the market en masse, that some people got the idea to create books that were meant to be read on those devices.

Admittedly, most of those books were public domain works, classics, and reference material. There was little, if any, new or mainstream content. But the seeds were sown and from those days until today a variety of devices (whether for reading ebooks or not) have come and gone. And ebook formats have bloomed like a thousand flowers. Many of those formats died on the vine. Some survive to this day.

There are a number of ebook formats available now. Most are just niche or marginal. The two that are arguably the most popular are PDF and EPUB.

PDF versus EPUB

PDF has been around since the early 1990s. At the time, it was somewhat revolutionary. Here was a format that could literally take a snapshot of the look and feel of any document, no matter how complex the layout. That, in itself, was pretty impressive. For the time and even for now.

But PDF, no matter what Adobe says, is really a format for printing. At best, it's a format for viewing on larger screens -- desktop monitors, laptops, and (at a stretch) larger tablets.

EPUB, on the other hand, is a young upstart. From day one, EPUB had the advantage of being created in the right place at the right time. EPUB was built for viewing on screen. Print wasn't even an afterthought -- I don't think it was even considered to be a necessary feature of EPUB.

While EPUB files might not be as visually pretty as PDFs, they're more than up to the task for reading on screen. Any screen. Let me give you a few reasons why.

Why EPUB?

I'd like to take a moment to look at why I think that EPUB is the best format (at the moment, anyway) for publishing ebooks.

First off, EPUB is based on open standards. I'll be talking a bit more about this in a moment. While PDF (or, at least, some variants of PDF) is an ISO standard, it's not really open. To be honest, I'd rather use an open standard than a closed one. Or a closed format.

Secondly, EPUB is widely supported. Most ebook readers can handle EPUB files, and reader software for computers and tablets and smartphones (most of it free or Open Source) can too. There are even browser-based EPUB readers, like the extension for Firefox called EPUBReader.

Third, EPUB content is designed to flow. What do I mean by that? Think of all of the devices that you'd read an ebook on: a computer or laptop, a tablet, an ebook reader, or a smartphone. All of them have different sized screens and different screen resolutions. EPUB pages aren't exactly one-size-fits-all. They're more one-size-adapts-to-all. You always (well, there are exceptions) get text on a single page, within the margins of the screen.

With a PDF file, things can be very different. I've used readers that leave widowed and orphaned lines. On top of that, one strength of PDF is a major drawback when the format is used for ebooks. And that's PDF's ability to maintain the layout, the look, and the feel of a printed document. It's always nice to admire the work of a good layout or design person. But when reading that on a small screen, you often wind up scrolling and resizing. That disrupts the flow of reading, and gets really frustrating.

Finally, EPUB is very well suited for text-heavy books. You can include vector and raster images as well. And, unlike PDF, including graphics won't overly bloat the size of the file.

Drawbacks of EPUB

I'd be remiss if I didn't mention a few of EPUB's drawbacks. The main ones are:

It's not suited to books with more complex and precise layouts -- for example, photo books or digital comics.

When it comes to scientific and technical publishing, EPUB doesn't support equations set using MathML (an XML variant for presenting the structure and content of mathematics). Instead, you need to use image files.

There's no provision for linking into or between books.

Taking a peek into EPUB

Let's put on our x-ray specs and take a peek into an EPUB file.

Remember when I said that EPUB is based on open standards? Well, those standards are XHTML, XML, and CSS.

The text of a book is in XHTML. Yes, one of the file formats used to create Web pages. So if you have existing content -- for example, articles that have been published on the Web or blog posts -- you can use them as the basis of an EPUB book. More about this shortly.

CSS, if you don't know, is short for Cascading Style Sheets. Cascading Style Sheets let you apply formatting to a Web page. Think of a CSS file as being like a template in a word processor. By changing attributes in a CSS file, you can change the look and feel of an EPUB file.

XML comes in with an EPUB file's table of contents file (named toc.ncx) and a metadata file (named content.opf). The table of contents file not only provides structure to an ebook, it also provides navigation. Yes, a true table of contents. The metadata file, obviously, contains information about the book -- like its title, author, language, the software used to publish it, and the like. This is information that readers rarely, if ever, see but which should be in an EPUB file to make it complete.
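
To make the metadata file a little more concrete, here is a minimal sketch in Python of the kind of XML that sits at the top of a content.opf. It assumes the EPUB 2 style of package file; the function name build_opf_metadata and the sample values are made up for illustration. A complete content.opf also carries a manifest and a spine, which are left out here.

```python
import xml.etree.ElementTree as ET

# Namespaces used by EPUB 2 package files and Dublin Core metadata.
OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_opf_metadata(title, author, language, identifier):
    """Return the metadata part of a package file as an XML string.

    Hypothetical helper: only the <metadata> block is produced.
    """
    ET.register_namespace("", OPF_NS)
    ET.register_namespace("dc", DC_NS)

    package = ET.Element(
        f"{{{OPF_NS}}}package",
        {"version": "2.0", "unique-identifier": "BookId"},
    )
    metadata = ET.SubElement(package, f"{{{OPF_NS}}}metadata")

    # Title, author, and language are the items the talk mentions.
    for tag, text in (("title", title), ("creator", author), ("language", language)):
        ET.SubElement(metadata, f"{{{DC_NS}}}{tag}").text = text

    # The identifier's id must match unique-identifier on <package>.
    ident = ET.SubElement(metadata, f"{{{DC_NS}}}identifier", {"id": "BookId"})
    ident.text = identifier

    return ET.tostring(package, encoding="unicode")


# Sample values are placeholders, not details of a real book.
print(build_opf_metadata("My Book", "A. Writer", "en", "urn:uuid:12345"))
```

Tools like the ones discussed later write this file for you, so you would rarely type it by hand.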

EPUB files have the extension .epub. What a surprise! But .epub isn't some esoteric and murky format like, say, .doc. It's actually a ZIP file. You can open an EPUB file using any file compression utility -- like Archive Manager in GNOME or WinZip in Windows.

If you're familiar with the structure of Java .jar files or OpenDocument files, then the structure of an EPUB file will look familiar. Here's what the root directory looks like:
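
One way to see that layout for yourself is to list the top-level entries of the archive. The sketch below uses Python's standard zipfile module. Because no real book is at hand, it builds a tiny stand-in archive in memory; the file names inside OEBPS (such as chapter1.xhtml) are placeholders, and top_level_entries is a name invented for this example.

```python
import io
import zipfile


def top_level_entries(epub_file):
    """Return the sorted top-level names inside an EPUB (ZIP) archive."""
    with zipfile.ZipFile(epub_file) as archive:
        return sorted({name.split("/", 1)[0] for name in archive.namelist()})


# Build a stand-in archive in memory; a real book would be opened by path.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("META-INF/container.xml", "<container/>")
    archive.writestr("OEBPS/content.opf", "<package/>")
    archive.writestr("OEBPS/toc.ncx", "<ncx/>")
    archive.writestr("OEBPS/chapter1.xhtml", "<html/>")  # placeholder chapter

print(top_level_entries(sample))  # ['META-INF', 'OEBPS', 'mimetype']
```

To inspect a book of your own, pass its path instead, for example top_level_entries("my_book.epub").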

The folder META-INF contains some basic metadata for the book. The folder OEBPS is where the XML, XHTML, CSS, and any graphic files reside. Ebook readers expect this directory structure, and it becomes very important when validating a book. I'll be discussing validation soon.

Let's look at some tools

A while ago, I heard someone say that creating EPUBs in 2011 is like creating Web pages circa 1997. The implication there was that a lot of manual work is involved. I don't agree. Sure, you can assemble your own EPUB books (including building your own table of contents files by hand). Why do that? Why not let the tools do the bulk of the work for you?

I'll be looking at five tools. Well, not all of them are tools -- two of them are markup languages. For the purposes of this talk, let's just pretend they're all tools. They're not the only games in town, but they're the ones I'm most familiar with.

I'm going to put these tools into three broad categories:

Conversion

Native authoring

A hybrid solution

The tools I'm going to discuss, for the most part, aren't meant for high-volume publishing. But for a lone writer wanting to produce EPUB books, or even a small firm wanting to put out content as EPUB, they're more than up to the task.

Let's get to the tools.

DocBook

DocBook is a markup language, based on XML, that's widely used in documenting hardware and software. But a few publishers, notably O'Reilly Media and XML Press, use DocBook for publishing their books. There's even a subset of DocBook aimed at publishers.

If you want to create EPUBs from DocBook source files, it's a lot easier than it used to be. That's because the DocBook stylesheets now support EPUB output. In case you're wondering, DocBook's stylesheets are simply a set of files that aid in converting XML documents to various formats like HTML, PDF, and EPUB.

The EPUB stylesheets are a relatively recent addition. When I first tried them, the EPUB stylesheets left a lot to be desired. They've gotten a lot better, though.

In addition to the stylesheets, you'll need an XSLT processor. An XSLT processor is software that does the actual work of transforming a DocBook file into another format. Most XSLT processors are command line tools, but they're easy to use. If you use Linux, many distributions come with one called xsltproc already installed. You can also download and install a couple of other popular processors called Saxon and Xalan.

Let's assume you've got everything you need -- the stylesheets and an XSLT processor installed, and a DocBook source file. What do you do? You point the processor at the right stylesheet and tell it the name of the file you want to transform. With xsltproc, you'd use this command:

xsltproc [path to stylesheets]/epub/docbook.xsl [your_file.xml]

That was easy, wasn't it? Well, things get a bit messier from here. If you have any graphics, you'll need to manually move them into the directory structure that the DocBook tools create.

And remember the .epub container I mentioned earlier? While the DocBook tools just create the files that go into that container, you'll need to create that container yourself. That's fairly easy. Just use a file archiving utility to create a .zip file, then change the extension to .epub. There are a few other things you need to do, which are explained in detail in this article.
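
One of those other things is worth spelling out: the EPUB container format expects a file named mimetype, containing the text application/epub+zip, to be the first entry in the archive and to be stored without compression. Here is a rough sketch of the packaging step in Python, offered as an alternative to a desktop archiving utility. The function name package_epub and the demo file contents are assumptions made for illustration; this is not part of the DocBook tooling.

```python
import os
import tempfile
import zipfile


def package_epub(source_dir, epub_path):
    """Zip the contents of source_dir into an .epub container at epub_path.

    Keep epub_path outside source_dir so the archive does not pick itself up.
    """
    with zipfile.ZipFile(epub_path, "w") as archive:
        # "mimetype" goes in first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        for folder, _subdirs, files in os.walk(source_dir):
            for filename in sorted(files):
                full_path = os.path.join(folder, filename)
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                archive.write(full_path, arcname,
                              compress_type=zipfile.ZIP_DEFLATED)
    return epub_path


# Demo with placeholder files standing in for the converted book.
with tempfile.TemporaryDirectory() as work:
    book_dir = os.path.join(work, "book")
    for relative in ("META-INF/container.xml", "OEBPS/content.opf"):
        path = os.path.join(book_dir, relative)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("<placeholder/>")
    epub = package_epub(book_dir, os.path.join(work, "book.epub"))
    with zipfile.ZipFile(epub) as archive:
        packaged_names = archive.namelist()
        first_entry = archive.infolist()[0]

print(packaged_names)
```

A file packaged this way should still be run through a validator, as discussed later in the talk.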

To me, what I just mentioned is the biggest drawback to using DocBook to create EPUB books. One complaint (well, actually a whine) that I constantly hear is that DocBook has too many tags. Over 400 of them, as I recall. People complain that they can't possibly learn them all. Guess what? You don't have to learn them all. You might use a dozen or two tags at the most. Focus on those, and use reference material for the rest.

AsciiDoc

AsciiDoc is one of those quintessential Open Source projects. The programmer behind it, Stuart Rackham, wanted to use DocBook to document the software he was writing. But he found that:

DocBook is a complex language, the markup is difficult to read and even more difficult to write directly -- I found I was spending more time typing markup tags, consulting reference manuals and fixing syntax errors, than I was writing the documentation.

So he came up with AsciiDoc.

AsciiDoc is a couple of things. First, it's a lightweight markup language. Unlike HTML and XML, which use tags surrounded by angle brackets to format a document, AsciiDoc uses keyboard symbols to apply formatting. For example, if you want to create a heading you put a set of dashes below the text of the heading. A numbered list consists of items with a number and a period before them. I think that you get the idea.

Second, it's a set of scripts and stylesheets that will convert a marked-up file to various formats like XHTML, PDF, and EPUB.
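
To give a feel for the markup conventions described above, here is a small Python sketch that assembles a few lines of AsciiDoc-style source. It assumes AsciiDoc's two-line title style, in which a row of equals signs under a line marks the document title and a row of dashes marks a section heading. The helper names and the sample text are invented for this example.

```python
def underlined_title(text, underline_char="-"):
    """Return a heading followed by an underline of the same length."""
    return f"{text}\n{underline_char * len(text)}"


def numbered_list(items):
    """Return items as lines of the form '1. item', '2. item', and so on."""
    return "\n".join(
        f"{number}. {item}" for number, item in enumerate(items, start=1)
    )


# Assemble a tiny sample document; blank lines separate the blocks.
sample_source = "\n\n".join([
    underlined_title("Adventures in Modern Publishing", "="),
    underlined_title("Why EPUB?"),
    numbered_list(["Open standards", "Wide support", "Content that flows"]),
])

print(sample_source)
```

In practice you would simply type this markup into a text editor; the code is only there to show what the plain-text source looks like.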

One thing that I should mention is that AsciiDoc is a command line tool. But don't worry: you don't need to remember a long string of commands and options. Rackham wrote a script named a2x which does all the heavy lifting for you. All you need to do is tell the script what format you want to output and what file you want to convert.

Here's how to use the script:

a2x -fepub -dbook [ebook_source.txt]

Overall, AsciiDoc outputs a nice looking EPUB. Of course, to do that you should follow the format for preparing the source file. If you do that, you'll run into fewer headaches.

OpenOffice.org/LibreOffice

OK, you're probably thinking: using a word processor as an ebook publishing tool? There's no reason why you can't. People have written and published ebooks using OpenOffice.org and LibreOffice. OK, those ebooks were PDFs ... What about EPUB?

Thanks to an extension for OpenOffice.org Writer called Writer2EPUB, you can do just that. In case you're wondering: the extension works with LibreOffice Writer, too.

After you've installed the extension, using it is quite easy. Just open your book file in OpenOffice.org or LibreOffice Writer. Then, just click the Writer2EPUB button on your toolbar. You can enter metadata (remember, that's information about the book) and even attach a separate cover file if you have one. Then, click OK.

I'll be honest: I've only experimented with files about 50 or 60 pages long at the most. That said, the conversion was fairly fast and quite smooth. The book looked good to boot.

When you're preparing content for conversion to EPUB with Writer (and even if you aren't), always keep this in mind: use styles. Don't apply formatting manually -- for example, don't create a heading by making text 22 point DejaVu Sans and applying bolding. Apply the Heading 1 style instead.

The reason you need to do this is simple. EPUB files are very structured. Styles, while they can help make a document look nice, are there to enforce consistency and structure. If you don't use styles, there can be problems. The biggest one is that the table of contents for your EPUB file won't generate properly. Which means you won't have proper navigation or structure.

Sigil

In some ways, I consider Sigil to be the main event. It pretty much does it all when it comes to creating and publishing EPUB files.

Sigil is a simple application, but it works. Consider it a WYSIWYG word processor for creating EPUB files. And guess what? Its native format is .epub.

All you need to do is download and install Sigil, and then just fire it up. From there, you can start typing. Just remember to start a new document for each chapter and for the cover of your ebook.

Earlier in this talk, I mentioned that if you have articles published on the Web or blog posts, you can use them as the basis of an ebook. Sigil can help you do just that. You can import HTML and XHTML files into Sigil, and they'll become chapters in your book. You can arrange the chapters, edit them, add images -- well, everything that you'd do in a word processor to tidy up or change their look.

Sigil will also automatically generate a table of contents using the headings in the chapters of your ebook. You can also add some basic metadata to the file -- title, author, and language.
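
The idea of building a table of contents from headings can be sketched in a few lines of Python. This is only an illustration of the general technique, not a description of how Sigil does it internally; the class and function names are made up, and only h1 to h3 headings are collected.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1, h2, and h3 headings."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def outline(xhtml):
    """Return the headings found in a chunk of XHTML, in document order."""
    collector = HeadingCollector()
    collector.feed(xhtml)
    collector.close()
    return collector.headings


chapter = "<h1>Going mobile</h1><p>Some text.</p><h2>PDF versus <em>EPUB</em></h2>"
for level, text in outline(chapter):
    print("  " * (level - 1) + text)
```

This also shows why headings marked up as real headings matter: text that is merely made big and bold never reaches the outline.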

Sigil is a quick and easy way to create an EPUB book. In fact, I used it to create the EPUB version of my first ebook. I was very pleased with the results.

Booki

I have a soft spot in my heart for Booki. It's the tool that the FLOSS Manuals project uses to write and publish its guides. Booki isn't a desktop application. It's a wiki. In fact, I've heard Booki described as a wiki that produces books instead of Web pages. It does produce Web pages, too, but that's beside the point.

Booki is fairly easy to use. There's no wiki markup to deal with. The editing interface is like a Web-based word processor. You can change formatting with a click or two. I've worked with a number of people who, when thrown into Booki, adapted to it within 30 minutes.

Being a wiki, Booki's backend format is (as you might have guessed) XHTML. Which, as we know, is one of the components of an EPUB file.

There are two ways you can create an EPUB with Booki. One, just go to any manual on the FLOSS Manuals site. Then, click the EPUB button in the navigation panel on the left side of any page. After a few seconds, you get a nicely-formatted EPUB file.

The other way is to go to objavi.flossmanuals.cc. Objavi is the publishing back end of Booki, and using it enables you to choose from a number of output types including EPUB. You can also modify the default Cascading Style Sheet or point to another one on the Web. That, as you know, will let you change the look and feel of the book. Why do that? While the default stylesheet is fine, you might want to change the font being used or the spacing between paragraphs or the size of headings.

In either case, Booki assembles the chapter files, creates a table of contents, and surrounds it all with the EPUB wrapper.

A quick note about PDF

DocBook, AsciiDoc, and Booki all have one crucial piece of flexibility: if you need a PDF, you can easily create it. That probably sounds strange, especially after what I said about PDF earlier in this presentation.

Even though ebooks are all the rage, you might want to print your work. EPUB isn't suited for that. But PDF is. Last year, I ran a FLOSS Manuals book sprint at Toronto Open Source Week. We used Booki to create a manual for the Thunderbird email client. To do something special for the participants, I generated a PDF and printed copies of the manual using something called the Espresso Book Machine.

But let's face it: like it or not, PDF is a de facto electronic publishing standard. Some commercial electronic publishing channels will only distribute PDFs. And there are a number of people who only know PDF.

And not everyone owns an ebook reader or a tablet. They'll read on their desktop or laptop computers.

For now, PDF is still a bit more popular than EPUB. Here's a very unscientific example. Recently, I published my first ebook. It's sold as a PDF through e-junkie.com (an electronic fulfillment service) and as an EPUB in Amazon's Kindle Store. The PDF version outsells the EPUB version by about 1.5:1.

Validation and testing

So you've got a nicely-formatted EPUB file. Now, all you have to do is let it loose into the wild. Not so fast. You can do that, but it's not the best move. Before offering your EPUB for download or for sale, you should validate and test it.

Let's take a look at both processes.

Validation

Validation is the process of making sure that your EPUB books contain all the elements that ebook readers expect. Like what? Here's a partial list:

Complete metadata

The proper directory structure in the EPUB file

Valid XHTML

Working links and references to files in the EPUB file

A table of contents

And a lot more. If you don't validate your EPUB book, chances are it will still render properly in your ebook reader. But why take that chance? And don't worry: validation isn't difficult to do. There is some good software, and there are some good services, that let you do just that.
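
To show the flavour of what a validator looks for, here is a rough Python sketch covering a small part of the list above: the container structure and the reference to the package file. It is far from a substitute for a real validator, and the function name basic_epub_problems and the demo archive are invented for illustration.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by META-INF/container.xml.
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def basic_epub_problems(epub_file):
    """Return a list of structural problems found (empty if none)."""
    problems = []
    with zipfile.ZipFile(epub_file) as archive:
        names = archive.namelist()
        if not names or names[0] != "mimetype":
            problems.append("mimetype is not the first entry")
        elif archive.read("mimetype") != b"application/epub+zip":
            problems.append("mimetype has unexpected content")

        if "META-INF/container.xml" not in names:
            problems.append("META-INF/container.xml is missing")
            return problems

        try:
            container = ET.fromstring(archive.read("META-INF/container.xml"))
        except ET.ParseError:
            problems.append("container.xml is not well-formed XML")
            return problems

        # container.xml should name the package (.opf) file of the book.
        rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
        opf_path = rootfile.get("full-path") if rootfile is not None else None
        if opf_path is None or opf_path not in names:
            problems.append("package file named in container.xml is missing")
    return problems


# Demo: an archive with no mimetype entry and a bare container.xml.
broken = io.BytesIO()
with zipfile.ZipFile(broken, "w") as archive:
    archive.writestr("META-INF/container.xml", "<container/>")
print(basic_epub_problems(broken))
```

The dedicated tools described next go much further, checking the XHTML, the metadata, and the links inside the book.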

One of the features of Sigil that I didn't mention earlier is its built-in validator. All you need to do is open your EPUB file in Sigil, click a button, and after a few seconds it points out any problems.

Another validator you might want to consider is the online validator maintained by digital publishing firm ThreePress. Just upload your ebook and the service does the rest.

If you don't want to do that, then download and install epubcheck. epubcheck is what powers the ThreePress validator. It's a command line Java application that's quite easy to use. Just run the command:

java -jar epubcheck-0.9.2.jar ebook_file.epub

That seems simple enough, doesn't it? There is one catch, though. Validators are great at finding problems. But in many cases, they're lacking when it comes to explaining what those problems are, specifically. The validators assume that you have the knowledge to understand and fix the problem. That's not always the case.

When I was validating an ebook, I got an error message telling me that there was invalid HTML syntax in a particular file. I went to the line number that the validator pointed to in the file, and I didn't see anything wrong. And I have a strong knowledge of HTML. Well, it turned out that the validator was expecting paragraph tags (<p> and </p>) around text that was surrounded by other tags. I only figured that out by running the offending HTML file through an HTML validator.

Testing

Like validation, testing is optional. But it's worthwhile doing it, if only as a final quality check. Dotting i's, crossing t's, making sure that line and paragraph breaks are accurate. That sort of thing.

In a perfect world, someone publishing an ebook would have access to one of every device on which people read electronic books -- ebook readers, tablets, and smartphones. Sadly, it's not a perfect world.

So, what do you do? Use the devices that you have. They should give you a good idea of how your ebook will look when people read it. Also, consider using Calibre. Calibre is an Open Source ebook management application for desktop and laptop computers. While it's not (as some people believe) a tool for reading ebooks, Calibre does have a solid ebook reading feature. One sneaky trick you can use is to resize Calibre's ereader window to simulate how your ebook will look on screens of various sizes.

Chances are, you won't find many (if any) problems.

Final thoughts

As with a number of other areas, Open Source tools are more than up to the task of publishing ebooks. It doesn't hurt when one of the most popular formats for distributing ebooks is an open standard, either.

Whether you're creating a short report or manual, a longer nonfiction book, or a novel, there's an Open Source tool that will help you do the job. While I don't believe that creating an EPUB in 2011 is like creating a Web page in 1997, I do have to admit that there's still a way to go. That said, those of us in the Open Source world have some solid tools at our disposal. And they're only getting better.

Remember, though, that all the tools in the world won't make your book worth reading. That will only happen if you have an interesting idea and do a good job of presenting that idea in writing. EPUB is just a delivery system. It's the content that counts.

Want to connect?

Web site: http://scottnesbitt.net
Blog: http://weblog.scottnesbitt.net
Twitter: http://www.twitter.com/ScottWNesbitt
identi.ca: http://identi.ca/scottnesbitt

2011 Scott Nesbitt