The Guile 100 Programs Project 0.6 - Apr 22, 2013 Edited by Michael Gran

© 2013 by Michael Gran
100 Guile Programs
This work is licensed under GFDL 1.3+.
A Lonely Cactus Production
Los Angeles, California


Short Contents

1 Preface
2 Acknowledgements
3 Theme 1: “/bin”
4 Theme 2: Web 1.0
A Other Examples
5 References
Index


Table of Contents

1 Preface
2 Acknowledgements
3 Theme 1: “/bin”
  3.1 Problem 1: Echo and Cat
    3.1.1 Echo
    3.1.2 Cat
  3.2 Problem 2: ‘ls’
    3.2.1 An Implementation of ‘ls’
  3.3 Problem 3: LZW Compression
  3.4 Problem 4: tar file archives
    3.4.1 ustar Script
    3.4.2 The rustar File Format
4 Theme 2: Web 1.0
  4.1 Problem 5: PHP-Style GUILE
  4.2 Problem 6: MySQL
  4.3 Problem 7: Animated GIF Badges
Appendix A Other Examples
  A.1 ustar Archives
5 References
Index


1 Preface

This book aspires to be a useful set of examples about how one might use GNU Guile.

One of the interesting things about the Scheme community is that they are perhaps too clever. The depth and complexity of their thinking about computer languages is intense and wonderful.

And yet, sometimes you just want to do something mundane. Where are the resources for how to use Scheme – and specifically Guile – for quotidian tasks?

Well, this document will be it, if all goes according to plan.


2 Acknowledgements

Thanks to the many people who have helped us develop this book.
• Chris K Jester-Young contributed the original version of the echo and cat scripts for Problem 1.
• Jez Ng contributed the original version of ls for Problem 2. He also contributed an example ustar generation script for Problem 4.
• Daniel Hartwig contributed the LZW compression routines for Problem 3.
• Mark Weaver contributed a feature-complete ustar generation script for Problem 4.


3 Theme 1: “/bin”

Every project has to start somewhere, so we may as well begin at the beginning.

Guile can be used as a scripting language. Programs can be written as plain text files, and then run from the command line by using the Guile interpreter. As such, most scripts run on Unix-like shells will begin with a sha-bang #! invocation. And most scripts must start off doing the same chores: parsing the command line, acting on the options, and finding the files whose names appeared in the command-line arguments.

To introduce these mundane concepts, our first theme is /bin, that is, re-implementing some common Unix tools. This will get us warmed up.

These examples should demonstrate
• How to set up the sha-bang invocation for Guile scripts run from Unix shells
• How to handle command-line arguments
• How to map file names given as command-line arguments to their files
• How to search for files and directories
• How to open files, both as binary data and as encoded text data
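As a rough illustration of the first two points, here is a minimal sketch of a script skeleton. It assumes that guile is installed at /usr/bin/guile, and the helper name describe-args is made up for this example. The backslash at the end of the first line is Guile's meta-switch, which lets the remaining options be given on the second line; -e main asks Guile to call main with the full command line as a list.

```scheme
#!/usr/bin/guile \
-e main -s
!#
;; Sketch only: the interpreter path above is an assumption; adjust it
;; to wherever guile is installed on your system.

;; Hypothetical helper: summarize the arguments that follow the script name.
(define (describe-args args)
  (let ((operands (cdr args)))          ; (car args) is the script name
    (string-append (number->string (length operands))
                   " argument(s): "
                   (string-join operands " "))))

;; With `-e main', Guile calls main with the command line as a list.
(define (main args)
  (display (describe-args args))
  (newline))
```

Saved as an executable file, such a script would print a one-line summary of whatever arguments it was given.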

To demonstrate some of these concepts, in the following sections you will find echo, a script that prints out its own arguments; cat, which concatenates files or standard input to the standard output; and ls, which lists the files in a directory. There are also compress and uncompress, which perform LZW compression on a file. And lastly there are scripts to generate tar-conformant archives.

And so, without further ado, here are the examples.

3.1 Problem 1: Echo and Cat

In this problem, two venerable Unix commands are re-implemented in Scheme: echo and cat. echo prints out the command-line arguments, and cat prints a file to the terminal.

In this problem, as in many of the problems, we’ll lay out the requirements for a program, and then see how our volunteer implemented the requirements. For the purpose of this exercise, the requirements for echo and cat will be drawn from the Posix standard (footnote 1), with a couple of minor modifications. Since these commands are implemented in different ways on different systems, a specification is given for the versions implemented here.

3.1.1 Echo

The echo script writes its arguments to the standard output, followed by a <newline>. If there are no arguments, it just prints a <newline>.

echo has no command-line options. Even ‘--help’ and ‘--version’ are not treated as command-line options.

If any of the arguments contain the backslash character (\), the argument is modified. Backslash introduces an escape. These escapes are parsed from logical left to right.

1 [IEEE 2004], page 56


\a Write an <alert> in place of \a.

\b Write a <backspace> in place of \b.

\c Suppress the <newline> that would otherwise be written after the command-line arguments. The \c is not written, any remaining characters in this argument are not written, and any remaining arguments are not written.

\f Write a <form-feed> in place of \f.

\n Write a <newline> in place of \n.

\r Write a <carriage-return> in place of \r.

\t Write a <tab> in place of \t.

\v Write a <vertical-tab> in place of \v.

\\ Write a single backslash character in place of the pair of backslash characters.

\0num Write an 8-bit character corresponding to num, an octal number between octal 0 and octal 377 (decimal 255) inclusive.

A backslash at the end of a command-line argument will not be escaped. The backslash will be written. However, the exit value will be 1 in this case.

A backslash followed by any other character not listed in the table will not be escaped. The backslash will be written, and the character that follows it will be written. However, in this case, the exit value will be 1.

For the octal escape \0, it is important to note that this value is not an ISO-8859-x position or a Unicode code point, but rather a raw 8-bit byte to be sent unencoded to the standard output. It is up to the operator, not echo, to ensure that a character sequence that is valid for the environment’s locale is being sent.

If a \0 escape is present, but is not followed by a number, the raw byte zero is written.

If a \0 escape is present and is followed by an octal number of more than 3 digits, only the first 3 digits will be interpreted as being part of the escape.

If a \0 escape is present and its octal value is greater than 377, print nothing. In this case, the exit value will be 1.

An octal escape may not have unnecessary initial zeros. For example
• \01 should output raw byte 1
• \001 should output raw byte zero followed by the string “01”
• \0001 should output raw byte zero followed by the string “001”

The digits 8 and 9 are not part of an octal escape. For example, the string \018 shall be output as the raw byte 1 followed by the character for the numeral 8.

Remember that command-line arguments and file names may contain any character allowed by the current locale.

In all other cases, the exit value will be zero.
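The octal rules above are easy to get subtly wrong, so here is a small sketch that restates them as a pure procedure. It is not taken from the contributed script; the names parse-octal-escape and octal-chars are hypothetical. Given the text that follows a \0, it returns the byte to emit (or #f when the value exceeds octal 377) and the text left over.

```scheme
;; The eight characters that may appear in an octal escape.
(define octal-chars (string->char-set "01234567"))

;; Hypothetical helper: REST is the text that follows a `\0' escape.
;; Returns two values: the byte to write (#f if the octal value is
;; greater than 377) and the unconsumed remainder of REST.
(define (parse-octal-escape rest)
  (define (octal-at? i)
    (and (< i (string-length rest))
         (char-set-contains? octal-chars (string-ref rest i))))
  (if (or (not (octal-at? 0))
          (char=? (string-ref rest 0) #\0))   ; no unnecessary initial zeros
      (values 0 rest)
      ;; Read at most three octal digits.
      (let loop ((i 0) (value 0))
        (if (and (< i 3) (octal-at? i))
            (loop (+ i 1)
                  (+ (* value 8)
                     (- (char->integer (string-ref rest i))
                        (char->integer #\0))))
            (values (if (< value 256) value #f)
                    (substring rest i))))))
```

With this sketch, the text "1" (from \01) gives byte 1 with nothing left over, "01" (from \001) gives byte 0 with "01" left over, and "18" (from \018) gives byte 1 with "8" left over, matching the examples in the specification.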


An implementation of ‘echo’

Chris K Jester-Young wrote the original solution to this problem.

#!/usr/bin/guile \
-e main -s
!#

(use-modules (ice-9 binary-ports))

;; The exit code for the program: #t == exit code 0, #f == exit code 1
(define status #t)

(define (main args)
  (setlocale LC_ALL "")
  ;; Recursively loop over the list of command-line arguments
  (let loop ((args (cdr args))
             (first-arg #t))
    (cond ((null? args)
           (newline)
           (quit status))
          (else
           (unless first-arg
             (write-char #\space))
           (let ((arg (car args)))
             ;; Take the current command-line argument and create a
             ;; port from that argument. Pass that port as input to
             ;; the procedure `initial'.
             (call-with-input-string arg initial)
             (loop (cdr args) #f))))))

;; `initial' and `echo' jointly form a recursive loop that reads
;; characters one-by-one from the port and writes them to stdout.
;; Backslash may introduce a string escape that needs special
;; processing.
(define (echo ch port)
  (write-char ch)
  (initial port))

(define (initial port)
  (define ch (read-char port))
  (cond ((eqv? ch #\\)
         (backslash port))
        ((not (eof-object? ch))
         (echo ch port))))

;; Special handling of backslash escape sequences


(define (backslash port)
  (define ch (read-char port))
  (case ch
    ((#\a) (echo #\alarm port))
    ((#\b) (echo #\backspace port))
    ((#\c) (quit status))
    ((#\f) (echo #\page port))
    ((#\n) (echo #\newline port))
    ((#\r) (echo #\return port))
    ((#\t) (echo #\tab port))
    ((#\v) (echo #\vtab port))
    ((#\\) (echo #\\ port))
    ((#\0) (let ((next (peek-char port)))
             (if (and (assv next octal-digits)
                      (not (char=? next #\0)))
                 (octal port)
                 (echo #\nul port))))
    (else (set! status #f)
          (write-char #\\)
          (unless (eof-object? ch)
            (unread-char ch port)
            (initial port)))))

;; Backslash 0 introduces the octal escape. Zero to three octal
;; numbers are read and output as a raw (not locale encoded) byte.
(define (octal port)
  (let loop ((value 0)
             (waiting 3))
    (cond ((zero? waiting)
           (if (< value 256)
               (put-u8 (current-output-port) value)
               (set! status #f))
           (initial port))
          (else (let ((ch (read-char port)))
                  (cond ((eof-object? ch)
                         (loop value 0))
                        ((assv ch octal-digits)
                         => (lambda (ass)
                              (loop (+ (* value 8) (cdr ass))
                                    (1- waiting))))
                        (else
                         (unread-char ch port)
                         (loop value 0))))))))

(define octal-digits
  '((#\0 . 0) (#\1 . 1) (#\2 . 2) (#\3 . 3)
    (#\4 . 4) (#\5 . 5) (#\6 . 6) (#\7 . 7)))


3.1.2 Cat

Again, since cat is implemented differently on different systems, a specification of what we were trying to accomplish is given here.

cat [OPTION]... [FILE]...

cat concatenates files or standard input and prints them to the standard output.

This version of cat supports three command-line options, each with a short and a long form.

‘-u --unbuffered’
    Do no buffering. Write bytes from the input to the standard output without delay as each character is read.

‘-h --help’
    Print out command help.

‘-v --version’
    Print out the program name and version number.

After the command-line options, a list of file names is expected. The contents of the files are printed to standard output. No character encoding or decoding of the contents of the files should be performed: they should be transmitted unmodified.

If the special file name ‘-’ (hyphen) is given, at that point the contents of the standard input will be transmitted to the standard output.

If one of the files does not exist, or if it cannot be opened, the program will print a descriptive error message to the standard error and will return the exit code 1.

Otherwise, the exit code is zero.
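One way to handle options like these is Guile's (ice-9 getopt-long) module. The sketch below is only an illustration of that module under the specification above; the names option-spec and parse-cat-args are hypothetical. The key '() in the result holds the arguments that are not options, which here are the file names.

```scheme
(use-modules (ice-9 getopt-long))

;; Option grammar for the three flags in the specification.
;; (value #f) means the option takes no argument.
(define option-spec
  '((unbuffered (single-char #\u) (value #f))
    (help       (single-char #\h) (value #f))
    (version    (single-char #\v) (value #f))))

;; Hypothetical helper: ARGS is the full command line, program name
;; first.  Returns a list of the unbuffered flag and the file operands.
(define (parse-cat-args args)
  (let ((opts (getopt-long args option-spec)))
    (list (option-ref opts 'unbuffered #f)
          (option-ref opts '() '()))))   ; '() keys the non-option arguments
```

For instance, (parse-cat-args '("cat" "-u" "a.txt" "b.txt")) separates the flag from the two file names.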


An implementation of cat

Chris K Jester-Young wrote the original solution for cat as well. One interesting thing to note in this example is the use of catch to catch system errors that may arise if files do not exist or cannot be opened.

#!/usr/bin/guile \
-e main -s
!#

(use-modules (srfi srfi-1)
             (ice-9 binary-ports)
             (ice-9 format)
             (ice-9 getopt-long))

;; The exit code of the script: #t == exit code 0, #f == 1
(define status #t)

(define (main args)
  (define opts (getopt-long args (get-getopt-options)))
  ;; Handle the unbuffered flag
  (when (assq 'unbuffered opts)
    (setvbuf (current-output-port) _IONBF))
  (let ((files (assq-ref opts '())))
    (if (null? files)
        (cat)
        (for-each (lambda (file)
                    ;; If a filename is "-" get text from stdin
                    (if (string=? file "-")
                        (cat)
                        (cat file)))
                  files))
    (catch 'system-error force-output write-error-handler)
    (quit status)))

(define cat
  (case-lambda
    ;; When called with no arguments, get data from stdin
    (()
     (catch 'system-error cat-port (read-error-handler "stdin")))
    ;; When called with one argument, read data from a file
    ((file)
     (catch 'system-error
       ;; The thunk must actually call call-with-input-file
       (lambda () (call-with-input-file file cat-port))
       (read-error-handler file)))))

(define* (cat-port #:optional (in (current-input-port))
                   (out (current-output-port)))


  (define bv (get-bytevector-some in))
  (unless (eof-object? bv)
    ;; The thunk must actually call put-bytevector
    (catch 'system-error (lambda () (put-bytevector out bv)) write-error-handler)
    (cat-port in out)))

;; An error handler that catches system errors receives a list
;; containing the errno.
(define (read-error-handler label)
  (lambda args
    (perror label (system-error-errno args))
    (set! status #f)))

(define (write-error-handler . args)
  (perror "write error" (system-error-errno args))
  ;; Don't try to flush buffers at exit, since it'd obviously fail.
  (primitive-_exit 1))

(define (perror label errno)
  (format (current-error-port) "cat: ~a: ~a~%" label (strerror errno)))

(define (help _)
  (display "Usage: cat [OPTION]... [FILE]...\n")
  (display "Concatenate FILE(s), or standard input, to standard output.\n")
  (newline)
  (for-each (lambda (option)
              (format #t " -~a, --~16a ~a~%"
                      (cadr (assq 'single-char (cdr option)))
                      (car option)
                      (cadr (assq 'description (cdr option)))))
            getopt-options)
  (quit))

(define (version _)
  (display "cat 0.1, for Guile100\n")
  (quit))

(define (get-getopt-options)
  ;; getopt-long doesn't like extraneous option properties, so filter out
  (map (lambda (option)
         (remove (lambda (prop)
                   (and (pair? prop) (eq? (car prop) 'description)))
                 option))
       getopt-options))

;; Here is a list of all the command-line options
(define getopt-options
  `((unbuffered (single-char #\u) (value #f)


                (description "do not buffer standard output"))
    (help (single-char #\h) (value #f) (predicate ,help)
          (description "display this help and exit"))
    (version (single-char #\v) (value #f) (predicate ,version)
             (description "output version information and exit"))))
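The catch pattern used in this script can also be tried in isolation. The following is a minimal sketch, with a hypothetical helper named try-open, of how a system-error handler receives the thrown arguments and turns the errno into a readable message with strerror.

```scheme
;; Hypothetical helper: try to open FILE for reading.  Returns #t on
;; success, or the system's error message as a string on failure.
(define (try-open file)
  (catch 'system-error
    (lambda ()
      (close-port (open-input-file file))
      #t)
    (lambda args
      ;; ARGS is the whole throw, starting with the key `system-error';
      ;; system-error-errno extracts the errno from it.
      (strerror (system-error-errno args)))))
```

Calling (try-open "/no/such/file") would return a locale-dependent message along the lines of "No such file or directory" rather than aborting the script.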


3.2 Problem 2: ‘ls’

In this section, we investigate the most famous Unix command of all time: ls. ls lists files or directories, and displays their properties.

However, ls has accumulated dozens of options over the past decades. A feature-complete ls would be too long to make a usable example. So, this script is constrained to the most important command-line options.

The command ls lists information about files, directories, and the contents of directories. Basically, for this challenge, the script should operate like a limited-functionality version of Posix ls (footnote 1).

The Requirements for a Limited ls

This script only recognizes a limited set of command-line options:

• ‘-a’ - display all matching files, including those whose name begins with a period

• ‘-l’ - use the long output format

• ‘-R’ - recursively descend into subdirectories

Any other command-line arguments that begin with a hyphen should cause an “invalid option” error, and the program will be terminated with a non-zero exit code.

The command-line option ‘-R’ will recursively print the contents of any subdirectory encountered.

The command-line option ‘-l’ has two effects. One, information about the files will be printed in the long format. Two, when given a symbolic link to a directory, the command will print information about the symbolic link itself and not the file or directory to which it points.

Operands

If a command-line argument does not begin with a hyphen, it is treated as an operand.

When called without operands, the contents of the current directory are printed.

Operands must be the names of files, directories, or symbolic links. When an operand that is not one of the above is encountered, the script should print a descriptive error and exit with a non-zero return code.

If an operand is a file, ls will print the name of the file. If an operand is a symbolic link to a file, the command will print the name of the link. If an operand is a directory, ls will print out the contents of that directory. If an operand is a symbolic link to a directory, ls will print the contents of that directory, unless ‘-l’ is given.

When printing the contents of a directory, files and directories that begin with <period> are usually not printed. If the command-line option ‘-a’ is given, files and directories that begin with <period> are printed.

1 The Posix spec for ls


Output

There are two output formats: the default format and the long format.

Within each directory, the files are sorted in case-insensitive alphabetical order according to the current locale.

In the default format, the filenames are output one per line. You can print them out in a columnar format if you like, though.
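The hidden-file rule and the sort order can be sketched as a single pure procedure. This is only an illustration; the name visible-entries is hypothetical, and it works on a plain list of names rather than on a real directory. It uses string-locale-ci<? from (ice-9 i18n) for the case-insensitive, locale-aware comparison.

```scheme
(use-modules (srfi srfi-1)     ; remove
             (ice-9 i18n))     ; string-locale-ci<?

;; Hypothetical helper: NAMES is a list of file names and ALL? mirrors
;; the `-a' option.  Returns the names to print, in sorted order.
(define (visible-entries names all?)
  (sort (if all?
            names
            (remove (lambda (name) (string-prefix? "." name)) names))
        string-locale-ci<?))
```

For example, without ‘-a’ the list ("cherry" ".hidden" "banana" "Apple") would come back as ("Apple" "banana" "cherry"). Where the dot-file sorts when ‘-a’ is given depends on the locale's collation rules.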

In the long format, the file information will be printed as follows

Field           Length     Description
Type            1          ‘d’ for directory
                           ‘-’ for regular file
                           ‘b’ for block special file
                           ‘l’ for symbolic link
                           ‘c’ for character special file
                           ‘p’ for fifo
User Read       1          ‘r’ if readable by the owner
                           ‘-’ otherwise
User Write      1          ‘w’ if writable by the owner
                           ‘-’ otherwise
User Execute    1          ‘S’ if the file is not executable and the set-user-ID mode is set
                           ‘s’ if the file is executable and the set-user-ID mode is set
                           ‘x’ if the file is executable or the directory is searchable by the owner
                           ‘-’ otherwise
Group Read      1          ‘r’ if readable by the group
                           ‘-’ otherwise
Group Write     1          ‘w’ if writable by the group
                           ‘-’ otherwise
Group Execute   1          ‘S’ if the file is not executable and the set-group-ID mode is set
                           ‘s’ if the file is executable and the set-group-ID mode is set
                           ‘x’ if the file is executable or the directory is searchable by members of this group
                           ‘-’ otherwise
Other Read      1          ‘r’ if readable by others
                           ‘-’ otherwise
Other Write     1          ‘w’ if writable by others
                           ‘-’ otherwise
Other Execute   1 + space  ‘T’ if the file is a directory and the search permission is not granted to others and the restricted deletion flag is set
                           ‘t’ if the file is a directory and the search permission is granted to others and the restricted deletion flag is set
                           ‘x’ if the file is executable or the directory is searchable by others
                           ‘-’ otherwise
Link Count                 For a directory, number of immediate subdirectories it has plus one for itself plus one for its parent. The link count for a file is one.
Owner Name
Group Name
File Size                  in bytes
Date & Time                “month day hour:minute” format if the file has been modified in the last six months, or “month day year” format otherwise
Pathname                   For non-links, the path
                           For links, “<link name> -> <path to linked file or directory>”

The exit code should be zero except in those error cases described above.

For more information about ls, you can consult The Open Group Base Specifications Issue 6, or the documentation of any BSD or GNU version of ls.
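The nine permission characters in the table can be derived from the mode bits with a few bit tests. The sketch below is one possible reading of the table, not the contributed implementation; the name mode->permission-string is hypothetical, and the octal constants are the conventional Unix mode bits.

```scheme
;; Hypothetical helper: MODE is the integer permission bits, such as
;; the value returned by `stat:perms'.  Returns a 9-character string.
(define (mode->permission-string mode)
  (define (set? bit) (not (zero? (logand mode bit))))
  (define (rw bit letter) (if (set? bit) letter #\-))
  ;; Execute column.  SPECIAL-BIT is the set-user-ID, set-group-ID or
  ;; restricted deletion bit.  LOWER is used when the execute bit is
  ;; also set, UPPER when it is not.
  (define (exec exec-bit special-bit lower upper)
    (cond ((and (set? special-bit) (set? exec-bit)) lower)
          ((set? special-bit) upper)
          ((set? exec-bit) #\x)
          (else #\-)))
  (string (rw #o400 #\r) (rw #o200 #\w) (exec #o100 #o4000 #\s #\S)
          (rw #o040 #\r) (rw #o020 #\w) (exec #o010 #o2000 #\s #\S)
          (rw #o004 #\r) (rw #o002 #\w) (exec #o001 #o1000 #\t #\T)))
```

Under this reading, mode #o755 gives "rwxr-xr-x", #o4755 gives "rwsr-xr-x", #o4644 gives "rwSr--r--", and #o1777 gives "rwxrwxrwt".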


3.2.1 An Implementation of ‘ls’

Jez Ng contributed a script to these specifications. It is an interesting solution.

One thing to note is how he has decided to truly minimize the scope of the procedures by declaring procedures within procedures.

Unsurprisingly, the majority of the script involves getting the format right for long output.

#! /usr/local/bin/guile -s
!#

;; A solution to Guile 100 Problem #2 `ls'
;; Contributed by Jez Ng.

(use-modules (srfi srfi-1)   ; fold, map etc
             (srfi srfi-26)  ; cut (partial application)
             (srfi srfi-37)  ; args-fold
             (ice-9 ftw)
             (ice-9 format)
             (ice-9 i18n))

(define perror (cut format (current-error-port) <...>))

(define (default-printer path st . rest)
  (format #t "~a~%" (basename path)))

(define* (long-printer path st #:optional
                       (max-nlinks 0) (max-size 0)
                       (max-uname-length 0) (max-groupname-length 0))
  (let*
      ((bits-set?
        (lambda (bits . masks)
          (let ((mask (apply logior masks)))
            (= mask (logand bits mask)))))
       (permission-string
        (lambda (perms)
          (let* ((setuid-bit #o4000)
                 (setgid-bit #o2000)
                 (sticky-bit #o1000)
                 (owner-read-bit #o400)
                 (owner-write-bit #o200)
                 (owner-exec-bit #o100)
                 (group-read-bit #o40)
                 (group-write-bit #o20)
                 (group-exec-bit #o10)
                 (other-read-bit #o4)
                 (other-write-bit #o2)
                 (other-exec-bit #o1)


                 (rwx-letter (lambda (bit letter)
                               (if (bits-set? perms bit) letter #\-)))
                 ;; LETTER is the upper-case form, shown when the special
                 ;; bit is set without execute permission.  When execute
                 ;; permission is set as well, the lower-case form is
                 ;; shown, as the requirements table specifies.
                 (setid-letter (lambda (exec-bit setid-bit letter)
                                 (cond ((bits-set? perms exec-bit setid-bit)
                                        (char-downcase letter))
                                       ((bits-set? perms setid-bit) letter)
                                       (else (rwx-letter exec-bit #\x))))))
            (string (rwx-letter owner-read-bit #\r)
                    (rwx-letter owner-write-bit #\w)
                    (setid-letter owner-exec-bit setuid-bit #\S)
                    (rwx-letter group-read-bit #\r)
                    (rwx-letter group-write-bit #\w)
                    (setid-letter group-exec-bit setgid-bit #\S)
                    (rwx-letter other-read-bit #\r)
                    (rwx-letter other-write-bit #\w)
                    (setid-letter other-exec-bit sticky-bit #\T)))))
       (format-time
        (lambda (time)
          (if (and (<= time (current-time))
                   (< (- (current-time) time) (* 3600 24 30 6)))
              (strftime "%b %e %H:%M" (localtime time))
              (strftime "%b %e %_5Y" (localtime time)))))
       (type (case (stat:type st)
               ((directory) #\d)
               ((regular) #\-)
               ((symlink) #\l)
               ((block-special) #\b)
               ((char-special) #\c)
               ((fifo) #\p)
               (else #\?)))
       (digits (lambda (n) (if (= n 0) 1 (1+ (inexact->exact (ceiling (log10 n))))))))
    (format #t "~a~a ~vd ~va ~va ~vd ~a ~a\n"
            type
            (permission-string (stat:perms st))
            (digits max-nlinks) (stat:nlink st)
            max-uname-length (passwd:name (getpwuid (stat:uid st)))
            max-groupname-length (group:name (getgrgid (stat:gid st)))
            (digits max-size) (stat:size st)
            (format-time (stat:mtime st))
            (if (char=? type #\l)
                (format #f "~a -> ~a" path (readlink path))
                (basename path)))))

(define (ls-dir dir-name dir-stat recursive? all? print-header? printer)
  (let* ((not-hidden? (lambda (name) (not (string-prefix? "." name))))
         (enter? (lambda (path st)
                   (or (and (or all? (not-hidden? (basename path))) recursive?)


                       (= (stat:ino st) (stat:ino dir-stat))))))
    (let recurse ((tree (file-system-tree dir-name enter?))
                  (parent-path `(,(dirname dir-name)))
                  (top-level? #t))
      ;; `file-system-tree' returns a structure of the form
      ;; (string basename, object stat, tree children)
      (let* ((path (cons (car tree) parent-path))
             (path-string (string-join (reverse path) file-name-separator-string))
             (children
              (filter
               (lambda (tree) (or all? (not-hidden? (car tree))))
               (sort (let ((current-dir-path (in-vicinity path-string "."))
                           (parent-dir-path (in-vicinity path-string "..")))
                       (cons (list current-dir-path (lstat current-dir-path))
                             (cons (list parent-dir-path (lstat parent-dir-path))
                                   (cddr tree))))
                     (lambda (a b) (string-locale-ci<? (car a) (car b))))))
             ;; `max' throws an error if called without arguments;
             ;; `max-above-0' just returns 0
             (max-above-0 (lambda args (apply max (cons 0 args))))
             (stats (map cadr children))
             (max-nlinks (apply max-above-0 (map stat:nlink stats)))
             (max-size (apply max-above-0 (map stat:size stats)))
             (max-uname-length
              (apply max-above-0 (map (compose string-length passwd:name
                                               getpwuid stat:uid) stats)))
             (max-groupname-length
              (apply max-above-0 (map (compose string-length group:name
                                               getgrgid stat:gid) stats))))
        (if (or (not top-level?) print-header?) (format #t "~a:~%" path-string))
        (for-each (lambda (child)
                    (printer
                     (in-vicinity path-string (car child))
                     (cadr child)
                     max-nlinks max-size max-uname-length max-groupname-length))
                  children)
        (if recursive?
            (for-each (lambda (child)
                        (if (and (eq? (stat:type (cadr child)) 'directory)
                                 (not (or (equal? (basename (car child)) ".")
                                          (equal? (basename (car child)) ".."))))
                            (recurse child path #f)))
                      children))))))

(let* ((program-name (car (program-arguments)))
       (make-bool-option
        (lambda (opt-name flag)


          (option `(,flag) #f #f (lambda (opt name arg result)
                                   (acons opt-name #t result)))))
       ;; `getopt-long' requires the long option name to be provided,
       ;; but the real `ls' does not use long names. srfi-37 does not
       ;; have this restriction, so we use it instead.
       (args (args-fold
              (cdr (program-arguments))
              (map make-bool-option '(all? recursive? long?) '(#\a #\R #\l))
              (lambda (opt name arg result)
                (perror "~a: illegal option -- ~a~%" program-name name)
                (perror "usage: ~a [-alR] [file ...]~%" program-name)
                (exit 1))
              (lambda (opt result) (assq-set! result
                                              'paths
                                              (cons opt (assq-ref result 'paths))))
              '((paths))))
       (paths (if (null? (assq-ref args 'paths)) '(".") (assq-ref args 'paths)))
       (printer (if (assq-ref args 'long?) long-printer default-printer))
       (ls-dir-cut (cut ls-dir <> <>
                        (assq-ref args 'recursive?) (assq-ref args 'all?)
                        (> (length paths) 1)
                        printer))
       (exit-code 0))
  (for-each
   (lambda (path)
     (catch 'system-error
       (lambda ()
         (let ((st (lstat path)))
           (case (stat:type st)
             ((directory) (ls-dir-cut path st))
             ((symlink) (if (assq-ref args 'long?)
                            (printer path st)
                            (ls-dir-cut
                             (let ((linked-path (readlink path)))
                               (if (absolute-file-name? linked-path)
                                   linked-path
                                   (in-vicinity (dirname path)
                                                linked-path)))
                             (stat path))))
             (else (printer path st)))))
       (lambda args
         (perror "~a: ~a: ~a~%"
                 program-name path (strerror (system-error-errno args)))
         (set! exit-code 1)))) paths)
  (exit exit-code))


3.3 Problem 3: LZW Compression

Good old LZW compression: a nice problem from every CompSci undergraduate’s classes. Lempel-Ziv-Welch compression is the basis of both the UNIX compress program and of GIF encoding.

The only problem with LZW is that it doesn’t actually do a very good job at compression, but it has an interesting logic and is familiar enough that it makes a good example.

This task has two parts.
• Write ‘compress’ and ‘uncompress’ procedures for LZW compression.
• Use them to make ‘compress’ and ‘uncompress’ scripts.

First up are the compression procedures.

lzw-compress and lzw-uncompress

[Guile Procedure] lzw-compress input-bv #:key table-size dictionary
    This procedure should take a bytevector presumed to contain 8-bit unsigned integers, and it should return a bytevector containing 16-bit unsigned integers in little-endian format.
    input-bv is the input bytevector.
    table-size is an optional parameter that indicates the maximum number of entries in the dictionary. This parameter is limited to the range 258 - 65536. The default value of table-size is 65536.
    dictionary is an optional parameter that modifies the output. When true, the procedure shall return both the output 16-bit bytevector as well as the hash table created by the compression routine that maps indices to codes.

Probably the best write-up on LZW compression is the one by Mark Nelson over at http://marknelson.us/2011/11/08/lzw-revisited/. Refer to that article for details on LZW compression.

It is possible to fill up the dictionary. In that case, one continues to use the dictionary as it is, without adding new entries.

As I’ve noted, we’re focussing on the problem of encoding 8-bit binary data. Thus, the first 256 entries in the dictionary – entries #0 to #255 – are initialized to 0 to 255. Entry #256 is not used in this example, but it is usually reserved for a special code that empties the dictionary. Entries #257 to #(table-size - 1) contain the multi-byte entries in the dictionary.
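To make the dictionary layout concrete, here is a simplified sketch of the encoding loop. It is not the contributed implementation: the names lzw-codes and codes->bytevector are hypothetical, it returns a plain list of codes before packing them, and it uses strings of characters as stand-ins for byte sequences when keying the dictionary. It does follow the rules above: codes 0 to 255 are preloaded, 256 is skipped, new entries start at 257, and nothing is added once the table is full.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical sketch: return the list of LZW codes for INPUT-BV.
(define* (lzw-codes input-bv #:key (table-size 65536))
  (let ((dict (make-hash-table))
        (next-code 257))                ; entry #256 is left unused
    ;; Entries #0 to #255 map each single byte to itself.
    (do ((i 0 (+ i 1)))
        ((= i 256))
      (hash-set! dict (string (integer->char i)) i))
    (let loop ((bytes (bytevector->u8-list input-bv))
               (w "")                   ; longest match so far
               (codes '()))
      (if (null? bytes)
          (reverse (if (string-null? w)
                       codes
                       (cons (hash-ref dict w) codes)))
          (let* ((c (string (integer->char (car bytes))))
                 (wc (string-append w c)))
            (cond ((hash-ref dict wc)
                   (loop (cdr bytes) wc codes))
                  (else
                   ;; Add a new entry only while the table has room.
                   (when (< next-code table-size)
                     (hash-set! dict wc next-code)
                     (set! next-code (+ next-code 1)))
                   (loop (cdr bytes) c (cons (hash-ref dict w) codes)))))))))

;; Pack the codes as 16-bit unsigned integers in little-endian order.
(define (codes->bytevector codes)
  (uint-list->bytevector codes (endianness little) 2))
```

As a worked example, the seven bytes of "ABABABA" (65 66 65 66 65 66 65) encode to the codes 65, 66, 257, 259: the pair "AB" becomes entry #257, "BA" becomes #258, and "ABA" becomes #259.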

[Guile Procedure] lzw-uncompress input-bv #:key table-size dictionary

Similarly, this procedure takes input-bv, the bytevector created by lzw-compress, and an optional table size, and returns the 8-bit unsigned bytevector of uncompressed data. dictionary, when true, causes the procedure to also return its dictionary or hash table.

Daniel Hartwig contributed an implementation of these compression routines.

There are a couple of interesting techniques of which to take note. First, if you C programmers have ever wondered how to create a static variable in a function, make-serial-number-generator shows the Scheme analog of that technique.
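As a stripped-down illustration of the same idea, the sketch below keeps private state in a closure. The names make-counter, next-id and next-ticket are hypothetical and exist only for this example; the variable count plays the role that a static local would play in C.

```scheme
;; `make-counter' is a hypothetical name used only for this sketch.
(define (make-counter start)
  (let ((count (- start 1)))     ; private state, like a C static
    (lambda ()
      (set! count (+ count 1))
      count)))

;; Each call to make-counter creates an independent COUNT.
(define next-id (make-counter 0))
(define next-ticket (make-counter 100))

;; let* forces the three calls to happen in order.
(define first-three
  (let* ((a (next-id))
         (b (next-id))
         (c (next-id)))
    (list a b c)))               ; (0 1 2)
```

Nothing outside the returned lambda can read or reset count, which is what makes this the analog of a function-local static.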


;; Copyright (C) 2013 Daniel Hartwig <[email protected]>
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with this program.  If not, see <http://www.gnu.org/licenses/>.

(define-module (lzw)
  #:use-module (rnrs bytevectors)
  #:use-module (rnrs io ports)
  #:use-module (srfi srfi-1)
  #:use-module (srfi srfi-26)
  #:use-module (ice-9 receive)
  #:export (lzw-compress
            lzw-uncompress
            %lzw-compress
            %lzw-uncompress))

;; This procedure adapted from an example in the Guile Reference
;; Manual.
(define (make-serial-number-generator start end)
  (let ((current-serial-number (- start 1)))
    (lambda ()
      (and (< current-serial-number end)
           (set! current-serial-number (+ current-serial-number 1))
           current-serial-number))))

(define (put-u16 port k)
  ;; Little endian.
  (put-u8 port (logand k #xFF))
  (put-u8 port (logand (ash k -8) #xFF)))

(define (get-u16 port)
  ;; Little endian.  Order of evaluation is important, use 'let*'.
  (let* ((a (get-u8 port))
         (b (get-u8 port)))
    (if (any eof-object? (list a b))
        (eof-object)
        (logior a (ash b 8)))))


(define (%lzw-compress in out done? table-size)
  (let ((codes (make-hash-table table-size))
        (next-code (make-serial-number-generator 0 table-size))
        (universe (iota 256))
        (eof-code #f))
    ;; Populate the initial dictionary with all one-element strings
    ;; from the universe.
    (for-each (lambda (obj)
                (hash-set! codes (list obj) (next-code)))
              universe)
    (set! eof-code (next-code))
    (let loop ((cs '()))
      (let ((c (in)))
        (cond ((done? c)
               (unless (null? cs)
                 (out (hash-ref codes cs)))
               (out eof-code)
               (values codes))
              ((hash-ref codes (cons c cs))
               (loop (cons c cs)))
              (else
               (and=> (next-code)
                      (cut hash-set! codes (cons c cs) <>))
               (out (hash-ref codes cs))
               (loop (cons c '()))))))))

(define (ensure-bv-input-port bv-or-port)
  (cond ((port? bv-or-port)
         bv-or-port)
        ((bytevector? bv-or-port)
         (open-bytevector-input-port bv-or-port))
        (else
         (scm-error 'wrong-type-arg "ensure-bv-input-port"
                    "Wrong type argument in position ~a: ~s"
                    (list 1 bv-or-port) (list bv-or-port)))))

(define (for-each-right proc lst)
  (let loop ((lst lst))
    (unless (null? lst)
      (loop (cdr lst))
      (proc (car lst)))))

(define (open-bit-output-port bits-per-entry)
  (let ((current 0)
        (location 0))
    (call-with-values
        (lambda ()
          (open-bytevector-output-port))
      (lambda (port get-bytevector)
        (let ((write-to-bv (lambda (val)
                             ;; (format #t "Entering write-to-bv: current ~a location ~a val ~a bpe ~a~%" current location val bits-per-entry)
                             (set! current (logior current (ash val location)))
                             (set! location (+ location bits-per-entry))
                             (while (> location 8)
                               ;; (format #t "Writing ~a~%" (logand current #xff))
                               (put-u8 port (logand current #xff))
                               (set! current (ash current -8))
                               (set! location (- location 8)))
                             ;; (format #t "Leaving write-to-bv: current ~a location ~a~%" current location)
                             ))
              (get-bv (lambda ()
                        (put-u8 port current)
                        (get-bytevector))))
          (values write-to-bv get-bv))))))

(define (open-bit-input-port bv bits-per-entry)
  (let ((current 0)
        (location 0)
        (eof #f))
    (call-with-values
        (lambda ()
          (open-bytevector-input-port bv))
      (lambda (port)
        ;; Return the read procedure, which begins here
        (lambda ()
          ;; (format #t "Entering read-from-bv: current ~x location ~a~%" current location)
          (let loop ((u8 (get-u8 port)))
            ;; (format #t "Read ~a~%" u8)
            (if (eof-object? u8)
                (if (> location 0)
                    (begin
                      (let ((output (bit-extract current 0 bits-per-entry)))
                        (set! current (ash current (- bits-per-entry)))
                        (set! location (- location bits-per-entry))
                        ;; (format #t "EOF Leaving read-from-bv: current ~x location ~a output ~x~%" current location output)
                        output))
                    (begin
                      ;; (format #t "EOF Leaving read-from-bv: <eof>~%")
                      (eof-object)))
                ;; else
                (begin
                  (set! current (logior current (ash u8 location)))
                  (set! location (+ location 8))
                  (if (< location bits-per-entry)
                      (begin
                        ;; (format #t "Looping in read-from-bv: current ~x location ~a~%" current location)
                        (loop (get-u8 port)))
                      ;; else
                      (let ((output (bit-extract current 0 bits-per-entry)))
                        (set! current (ash current (- bits-per-entry)))
                        (set! location (- location bits-per-entry))
                        ;; (format #t "Leaving read-from-bv: current ~x location ~a output ~x~%" current location output)
                        output))))))))))

#!
        (lambda ()
          (format #t "Entering read-from-bv: current ~x location ~a~%" current location)
          (if eof
              (eof-object)
              ;;else
              (begin
                (while (< location bits-per-entry)
                  (format #t "Looping in read-from-bv: current ~x location ~a~%" current location)
                  (let ((u8 (get-u8 port)))
                    (format #t "Read ~a~%" u8)
                    (if (eof-object? u8)
                        (begin
                          (set! eof #t)
                          (break))
                        ;; else
                        (begin
                          (set! current (logior current (ash u8 location)))
                          (set! location (+ location 8))))))
                (format #t "After loop in read-from-bv: current ~x location ~a~%" current location)
                (let ((output (bit-extract current 0 bits-per-entry)))
                  (set! current (ash current (- bits-per-entry)))
                  (set! location (- location bits-per-entry))
                  (format #t "Leaving read-from-bv: current ~x location ~a output ~x~%" current location output)
                  output))))))))
!#

(define (%lzw-uncompress in out done? table-size)
  (let ((strings (make-hash-table table-size))
        (next-code (make-serial-number-generator 0 table-size))
        (universe (iota 256))
        (eof-code #f))
    (for-each (lambda (obj)
                (hash-set! strings (next-code) (list obj)))
              universe)
    (set! eof-code (next-code))
    (let loop ((previous-string '()))
      (let ((code (in)))
        (unless (or (done? code)
                    (= code eof-code))
          (unless (hash-ref strings code)
            (hash-set! strings
                       code
                       (cons (last previous-string) previous-string)))
          (for-each-right out
                          (hash-ref strings code))
          (let ((cs (hash-ref strings code)))
            (and=> (and (not (null? previous-string))
                        (next-code))
                   (cut hash-set! strings <> (cons (last cs)
                                                   previous-string)))
            (loop cs)))))))

(define (lzw-compress-inner bv table-size dictionary)
  (call-with-values
      (lambda ()
        (open-bytevector-output-port))
    (lambda (output-port get-result)
      (let ((dict (%lzw-compress (cute get-u8 (ensure-bv-input-port bv))
                                 (cute put-u16 output-port <>)
                                 eof-object?
                                 table-size)))
        (if dictionary
            (values (get-result) dict)
            (get-result))))))

(define* (lzw-compress bv #:key (table-size 65536) dictionary)
  (let ((bv (lzw-compress-inner bv table-size dictionary)))
    (receive (write-to-bv get-bv)
        (open-bit-output-port (integer-length (1- table-size)))
      ;; (write (bytevector->uint-list bv (endianness little) 2)) (newline)
      (for-each write-to-bv (bytevector->uint-list bv (endianness little) 2))
      (get-bv))))

(define* (lzw-uncompress-inner bv table-size dictionary)
  (format #t "lzw-uncompress: table-size ~a~%" table-size)
  (call-with-values
      (lambda ()
        (open-bytevector-output-port))
    (lambda (output-port get-result)
      (let ((dict (%lzw-uncompress (cute get-u16 (open-bytevector-input-port bv))
                                   (cute put-u8 output-port <>)
                                   eof-object?
                                   table-size)))
        (if dictionary
            (values (get-result) dict)
            (get-result))))))

(define* (lzw-uncompress bv #:key (table-size 65536) dictionary)
  (let* ((get-val (open-bit-input-port bv (integer-length (1- table-size))))
         (u16lst (let loop ((x (get-val))
                            (lst '()))
                   (if (eof-object? x)
                       lst
                       (loop (get-val) (append lst (list x)))))))
    (lzw-uncompress-inner (uint-list->bytevector u16lst (endianness little) 2) table-size dictionary)))


The ‘compress’ and ‘uncompress’ scripts

Once the procedures are working, it is a simple task to write scripts that use them. So we’ll write scripts that are simplified versions of the Unix commands ‘compress’ and ‘uncompress’. These scripts will manipulate files with the following format.

Each file will begin with a 3-byte header.
• Byte 1: #x1F
• Byte 2: #x9D
• Byte 3: Dictionary size, given as an 8-bit unsigned number between 9 and 16 inclusive. The number indicates a dictionary size between 2^9 and 2^16.

The rest of the file is the LZW-compressed 16-bit binary data stored in little-endian format.

Note that this will not be compatible with your operating system’s version of compress. The compress file format is not consistent across platforms. Every current implementation of compress adds more functionality to squeeze more compression out of the vanilla LZW algorithm.
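The file layout can be sketched with a few bytevector helpers. The names make-lzw-header, lzw-header-bits and u16->le-bytes are hypothetical and are used only for illustration; the sketch assumes the layout exactly as stated in the text, with each code stored as two bytes, low byte first.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper names, for illustration only.

;; Build the 3-byte header for a given dictionary size in bits.
(define (make-lzw-header bits)
  (u8-list->bytevector (list #x1F #x9D bits)))

;; Return the bits value from a header, or #f if the header is invalid.
(define (lzw-header-bits bv)
  (and (>= (bytevector-length bv) 3)
       (= (bytevector-u8-ref bv 0) #x1F)
       (= (bytevector-u8-ref bv 1) #x9D)
       (let ((bits (bytevector-u8-ref bv 2)))
         (and (<= 9 bits 16) bits))))

;; Split a 16-bit code into its two bytes, low byte first.
(define (u16->le-bytes code)
  (list (logand code #xFF)
        (logand (ash code -8) #xFF)))
```

For example, (lzw-header-bits (make-lzw-header 12)) gives 12, a header whose third byte is 8 is rejected with #f, and code 257 is stored as the bytes (1 1).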

compress [-v] [-b bits] [name ...]

For each filename, compress will create an LZW-compressed version of the input file. The compressed file will have the same filename as the input file with the ".Z" extension appended to it. If the compression is successful and the output file is successfully written, the input file will be deleted.

If no filenames are given, compress will take the contents of stdin and send the compressed data to stdout.

The optional ‘-b’ bits parameter will indicate the maximum size of the dictionary. If bits is given, it must be between 9 and 16, indicating a maximum dictionary size of 2^bits.

If the optional ‘-v’ parameter is given, the script should print to stdout the compression ratio for each file processed. If no file was specified and this program is thus compressing stdin to stdout, this flag is ignored.

Compress should fail with appropriate error messages if any of the following problems occur:
• The command line has unknown options or is otherwise incorrect
• The command-line argument after a ‘-b’ is out of range, non-numeric, or missing
• The file associated with an input filename does not exist or is unreadable
• An input filename has a ".Z" suffix
• Writing the output file would overwrite a file that already exists
• Writing to disk fails for any reason
• Erasing the input file on completion fails for any reason

If an error occurs, the script should return the error code 1. Otherwise it returns the error code 0.
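The relationship between bits and the dictionary size is simple arithmetic, sketched here with two hypothetical helpers (bits->table-size and table-size->code-width are names invented for this example). The second helper shows how many bits are needed to hold the largest code in a table of a given size.

```scheme
;; Hypothetical helper names, for illustration only.

;; Validate BITS and return the maximum dictionary size 2^bits.
(define (bits->table-size bits)
  (if (and (integer? bits) (exact? bits) (<= 9 bits 16))
      (expt 2 bits)
      (error "bits must be an integer between 9 and 16:" bits)))

;; Number of bits needed for the largest code, table-size - 1.
(define (table-size->code-width table-size)
  (integer-length (- table-size 1)))
```

So ‘-b 9’ corresponds to a dictionary of 512 entries and the default of 16 bits corresponds to 65536 entries; going back the other way, a 512-entry table needs 9-bit codes.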
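The task does not spell out how the compression ratio is defined. One reasonable reading, and the one the contributed script later in this section uses, is the fraction of bytes saved. The sketch below expresses it as a percentage; the name compression-percent and the guard for empty input are assumptions made for this example.

```scheme
;; `compression-percent' is a hypothetical name for this sketch.
;; Returns the percentage of bytes saved; negative if the output grew.
(define (compression-percent nbytes-in nbytes-out)
  (if (zero? nbytes-in)
      0.0                                   ; assumed: empty input saves nothing
      (* 100.0 (/ (- nbytes-in nbytes-out) nbytes-in))))
```

A 1000-byte file that compresses to 500 bytes reports 50.0, while a 10-byte file that grows to 15 bytes reports -50.0.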

uncompress [-v] [name ...]

uncompress will create an uncompressed version of a file generated by compress. The uncompressed file will have the same filename as the input file with the ".Z" extension removed. If the uncompression is successful and the output file is successfully written, the input file will be deleted.

Also, like compress, if no filenames are given, uncompress takes the contents of stdin and uncompresses them to stdout.

If the optional ‘-v’ parameter is given, the script should print to stdout the compression ratio for each file processed. If no file was specified and this program is thus uncompressing stdin to stdout, this flag is ignored.

Uncompress should fail with appropriate error messages if any of the following problems occur:
• The command line has unknown options or is otherwise incorrect
• The file header is incorrect
• The bits parameter in the file header is out of range
• The file associated with the input filename does not exist or is unreadable
• The input compressed data is incorrect or corrupt, which can be detected by receiving an index that is not yet in the dictionary, or if an index value exceeds the number of entries in the dictionary as specified in the header, or if the last entry in the file is not a complete 16-bit integer
• The input file does not end in ".Z"
• The output file would overwrite a file that already exists
• Writing to disk fails for any reason
• Erasing the input file on completion fails for any reason

If an error occurs, the script should return the error code 1. Otherwise it returns the error code 0.

compress and uncompress

Daniel Hartwig contributed compress and uncompress scripts. As you can imagine, the majority of the scripts do unglamorous tasks such as checking options, filenames and the like.

Here’s compress

#!/usr/bin/guile \
-L . -e main -s
!#

;; Copyright (C) 2013 Daniel Hartwig <[email protected]>
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with this program.  If not, see <http://www.gnu.org/licenses/>.

(use-modules (lzw)
             (ice-9 control)
             (ice-9 format)
             (ice-9 i18n)
             (rnrs bytevectors)
             (rnrs io ports)
             (srfi srfi-37))

(define *program-name* #f)

;; This form of 'gettext' is helpful for longer messages.  A single
;; message id can be split and aligned across many lines, similar to
;; the common usage in C.
(define (_ msg . rest)
  (gettext (string-concatenate (cons msg rest)) "guile100-compress"))

(define (error* status msg . args)
  (force-output)
  (let ((port (current-error-port)))
    (when *program-name*
      (display *program-name* port)
      (display ": " port))
    (apply format port msg args)
    (newline port)
    (unless (zero? status)
      ;; This call to 'abort' causes 'main' to immediately return the
      ;; specified status value.  Similar to 'exit' but more
      ;; controlled, for example, when using the REPL to debug,
      ;; 'abort' will not cause the entire process to terminate.
      ;;
      ;; This is also handy to attempt processing every file, even
      ;; after an error has occurred.  To do this, establish another
      ;; prompt at an interesting place inside 'main'.
      (abort (lambda (k)
               status)))))

(define (make-file-error-handler filename)
  (lambda args
    (error* 1 (_ "~a: ~a")
            filename
            (strerror (system-error-errno args)))))

(define (system-error-handler key subr msg args rest)
  (apply error* 1 msg args))

(define (compression-ratio nbytes-in nbytes-out)
  (exact->inexact (/ (- nbytes-in nbytes-out) nbytes-in)))

(define (write-lzw-header port bits)
  (put-bytevector port (u8-list->bytevector (list #x1F #x9D bits))))

(define (compress-port in out bits verbose?)
  #;(begin
      (write-lzw-header out bits)
      (%lzw-compress (cute get-u8 in)
                     (cute put-u16 out <>)
                     eof-object?
                     (expt 2 bits)))
  (let* ((in-bv (get-bytevector-all in))
         (out-bv (lzw-compress in-bv #:table-size (expt 2 bits))))
    (write-lzw-header out bits)
    (put-bytevector out out-bv)))

(define (compress-file infile bits verbose?)
  (catch 'system-error
    (lambda ()
      (let ((outfile (string-append infile ".Z")))
        (when (string-suffix? ".Z" infile)
          (error* 1 (_ "~a: already has .Z suffix") infile))
        (when (file-exists? outfile)
          (error* 1 (_ "~a: already exists") outfile))
        (let ((in (open-file infile "rb"))
              (out (open-file outfile "wb")))
          ;; TODO: Keep original files ownership, modes, and access
          ;; and modification times.
          (compress-port in out bits verbose?)
          (when verbose?
            (format #; (current-error-port)
                    (current-output-port)
                    (_ "~a: compression: ~1,2h%\n") ; '~h' is localized '~f'.
                    infile
                    (* 100 (compression-ratio (port-position in)
                                              (port-position out)))))
          (for-each close-port (list in out))
          (delete-file infile))))
    system-error-handler))

(define (ensure-bits obj)
  (let ((n (or (and (integer? obj) obj)
               (and (string? obj)
                    (locale-string->integer obj))
               (error* 1 (_ "bits must be an integer -- ~a") obj))))
    (unless (<= 9 n 16)
      (error* 1 (_ "bits must be between 9 and 16 -- ~a") n))
    n))

(define (make-boolean-processor key)
  (lambda (opt name arg config . rest)
    (apply values (assq-set! config key #t)
           rest)))

(define (make-option-processor key parse)
  (lambda (opt name arg config . rest)
    (apply values (assq-set! config key (parse arg))
           rest)))

(define (usage status)
  (format (current-error-port)
          (_ "Usage: ~a [-v] [-b bits] [FILE]...\n"
             " -v, --verbose show compression ratio\n"
             " -b, --bits bits maximum number of BITS per code [16]\n")
          *program-name*)
  (abort (lambda (k)
           status)))

(define options
  (list (option '(#\h "help") #f #f
                (lambda args
                  (usage 0)))
        (option '(#\v "verbose") #f #f
                (make-boolean-processor 'verbose?))
        (option '(#\b "bits") #t #f
                (make-option-processor 'bits ensure-bits))))

(define (main args)
  ;; Establishing this prompt ensures that any call to 'abort' will at
  ;; most escape to the continuation of '%' here.  In effect, calling
  ;; 'abort' causes 'main' to stop what it was doing and continue with
  ;; the procedure passed to 'abort' instead.
  (% (call-with-values
         (lambda ()
           (args-fold (cdr args)
                      options
                      (lambda (opt name arg . rest)
                        (error* 0 (_ "invalid option -- '~a'") name)
                        (usage 1))
                      (lambda (arg config infiles)
                        (values config
                                (cons arg infiles)))
                      ;; First seed: config (with default values).
                      '((bits . 16)
                        (verbose? . #f))
                      ;; Second seed: infiles (initially empty list).
                      '()))
       (lambda (config infiles)
         (let ((bits (assq-ref config 'bits))
               (verbose? (assq-ref config 'verbose?)))
           (for-each (lambda (infile)
                       (cond ((string=? infile "-")
                              (compress-port (current-input-port)
                                             (current-output-port)
                                             bits
                                             verbose?))
                             (else
                              (compress-file infile
                                             bits
                                             verbose?))))
                     (if (null? infiles)
                         ;; No arguments, use stdin.
                         '("-")
                         ;; Process the files in the order given on
                         ;; the command line.
                         (reverse infiles)))
           ;; Exit indicating success.  If an error occurred anywhere,
           ;; the call to 'abort' will produce a different status.
           0)))))

(when (batch-mode?)
  (setlocale LC_ALL "")
  (set! *program-name* (basename (car (program-arguments))))
  (exit (main (program-arguments))))


Here’s uncompress

#!/usr/bin/guile \
-L . -e main -s
!#

;; Copyright (C) 2013 Daniel Hartwig <[email protected]>
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with this program.  If not, see <http://www.gnu.org/licenses/>.

(use-modules (lzw)
             (ice-9 control)
             (ice-9 format)
             (ice-9 i18n)
             (ice-9 match)
             (rnrs bytevectors)
             (rnrs io ports)
             (srfi srfi-37))

(define *program-name* #f)

(define (_ msg . rest)
  (gettext (string-concatenate (cons msg rest)) "guile100-compress"))

(define (error* status msg . args)
  (force-output)
  (let ((port (current-error-port)))
    (when *program-name*
      (display *program-name* port)
      (display ": " port))
    (apply format port msg args)
    (newline port)
    (unless (zero? status)
      (abort (lambda (k)
               status)))))


(define (make-file-error-handler filename)
  (lambda args
    (error* 1 (_ "~a: ~a")
            filename
            (strerror (system-error-errno args)))))

(define (system-error-handler key subr msg args rest)
  (apply error* 1 msg args))

(define (compression-ratio nbytes-in nbytes-out)
  (exact->inexact (/ (- nbytes-in nbytes-out) nbytes-in)))

(define (read-lzw-header port)
  (match (bytevector->u8-list (get-bytevector-n port 3))
    ((#x1F #x9D bits)
     (and (<= 9 bits 16)
          (values bits)))
    (x #f)))

(define (uncompress-port in out verbose?)
  (let ((bits (read-lzw-header in)))
    (unless bits
      (error* 1 (_ "incorrect header")))
    #;(%lzw-uncompress (cute get-u16 in)
                       (cute put-u8 out <>)
                       eof-object?
                       (expt 2 bits))
    (let* ((in-bv (get-bytevector-all in))
           (out-bv (lzw-uncompress in-bv #:table-size (expt 2 bits))))
      (put-bytevector out out-bv))))

(define (uncompress-file infile verbose?)
  (catch 'system-error
    (lambda ()
      (let ((outfile (string-drop-right infile 2)))
        (when (not (string-suffix? ".Z" infile))
          (error* 1 (_ "~a: does not have .Z suffix") infile))
        (when (file-exists? outfile)
          (error* 1 (_ "~a: already exists") outfile))
        (let ((in (open-file infile "rb"))
              (out (open-file outfile "wb")))
          (uncompress-port in out verbose?)
          (when verbose?
            (format #; (current-error-port)
                    (current-output-port)
                    (_ "~a: compression: ~1,2h%\n") ; '~h' is localized '~f'.
                    infile
                    (* 100 (compression-ratio (port-position out)
                                              (port-position in)))))
          (for-each close-port (list in out))
          (delete-file infile))))
    system-error-handler))

(define (usage status)
  (format (current-error-port)
          (_ "Usage: ~a [-v] [FILE]...\n"
             " -v, --verbose show compression ratio\n")
          *program-name*)
  (abort (lambda (k)
           status)))

(define (make-boolean-processor key)
  (lambda (opt name arg config . rest)
    (apply values (assq-set! config key #t)
           rest)))

(define (main args)
  (% (call-with-values
         (lambda ()
           (args-fold (cdr args)
                      (list (option '(#\h "help") #f #f
                                    (lambda args
                                      (usage 0)))
                            (option '(#\v "verbose") #f #f
                                    (make-boolean-processor 'verbose?)))
                      (lambda (opt name arg . rest)
                        (error* 0 (_ "invalid option -- '~a'") name)
                        (usage 1))
                      (lambda (arg config infiles)
                        (values config
                                (cons arg infiles)))
                      ;; First seed: config (with default values).
                      '((verbose? . #f))
                      ;; Second seed: infiles (initially empty list).
                      '()))
       (lambda (config infiles)
         (let ((verbose? (assq-ref config 'verbose?)))
           (for-each (lambda (infile)
                       (cond ((string=? infile "-")
                              (uncompress-port (current-input-port)
                                               (current-output-port)
                                               verbose?))
                             (else
                              (uncompress-file infile
                                               verbose?))))
                     (if (null? infiles)
                         ;; No arguments, use stdin.
                         '("-")
                         ;; Process the files in the order given on
                         ;; the command line.
                         (reverse infiles)))
           ;; Exit indicating success.
           0)))))

(when (batch-mode?)
  (setlocale LC_ALL "")
  (set! *program-name* (basename (car (program-arguments))))
  (exit (main (program-arguments))))


3.4 Problem 4: tar file archives

This challenge is to create a script that takes a list of filenames and generates a ustar-format archive file. This archive file format is compatible with common POSIX tools.

The ustar interchange format is one of the simpler formats used for archive files that contain multiple files along with their metadata.

To begin, we are going to create a script that creates ustar-format files. But, to keep things simple, we are only going to use a small subset of the functionality that ustar files can provide. The result should be readable by common tar and pax tools.

3.4.1 ustar Script

The ustar script will have a simple calling structure.

ustar archive file1 .. filen

It will create a new archive containing the files indicated on the command line.

The script will have to handle many error conditions, including but not limited to
• filename contains characters not in the ustar-string’s character set
• file part of filename is longer than 100 characters
• path part of filename is longer than 155 characters
• file is a symbolic link, fifo, directory or any other type of non-normal file
• file’s uname and gname contain characters not in ustar-string’s character set
• file’s uname or gname are longer than 31 characters
• file length is greater than 8,589,934,591 bytes (octal 77777777777)
• file’s UID or GID is greater than 2,097,151 (octal 7777777)
• system errors about inability to open, write, or close files.
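The numeric limits in this list follow from the header field widths given later in this section: a number field of N bytes holds N - 1 octal digits plus a trailing NULL. The sketch below derives them; max-octal-value and fits-rustar? are hypothetical names used only for illustration, and the check covers just the length and range conditions, not the character-set or file-type ones.

```scheme
;; Hypothetical helper names, for illustration only.

;; Largest value that fits a number field of FIELD-WIDTH bytes:
;; one byte is the trailing NULL, the rest are octal digits.
(define (max-octal-value field-width)
  (- (expt 8 (- field-width 1)) 1))

;; Check the length and range limits for one file.
(define (fits-rustar? name prefix size uid gid)
  (and (<= (string-length name) 100)
       (<= (string-length prefix) 155)
       (<= 0 size (max-octal-value 12))   ; Size is a 12-byte field
       (<= 0 uid (max-octal-value 8))     ; UID is an 8-byte field
       (<= 0 gid (max-octal-value 8))))   ; GID is an 8-byte field
```

Here (max-octal-value 12) is 8589934591 and (max-octal-value 8) is 2097151, matching the limits quoted above.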

3.4.2 The rustar File Format

First, I will describe our restricted ustar file format, which I’m going to dub rustar for restricted ustar, just so that we’re clear that I’m talking about something more specific than the ustar format.

File Structure

A rustar file contains a set of logical records. Each logical record represents the contents of a file plus its metadata. The logical records appear sequentially in the file, one after another, and there is no global header in the file. At the end of the file is a footer.

Logical Records

Each logical record consists of two parts: a header segment, and the contents of the file, a.k.a. the data segment. Of these, only the header requires a detailed explanation.

Header

The header segment is a 512-byte block that contains metadata for a file. The block is broken up into 17 fields of fixed length. Each field contains data in one of three types.


Header Types

Here we describe the three types that can appear in a header. Each type has the annotation [N]. The N indicates that the field has a fixed size of N bytes.

1. rustar-string[N] is a fixed-width string that contains only the codepoints listed below. It is stored in the ASCII encoding and, if necessary, is right-padded with NULL bytes to ensure it occupies the whole of its N bytes. NULL bytes can only appear at the end of the string. The string need not end with NULL bytes if it fills the whole of its fixed width.

The list of allowed codepoints is
• U+20 to U+22
• U+25 to U+3F
• U+41 to U+5A
• U+5F
• U+61 to U+7A
• and U+00, but U+00 can only be followed by more U+00.

2. rustar-0string[N] (note the ‘0’) is a fixed-width string with the same format and restrictions as a rustar-string[N] but with an additional restriction: it must end with at least one NULL byte.

3. rustar-number[N] is an unsigned integer stored as a fixed-width string. The string contains the text representation of the integer in octal format. The last byte (and only the last byte) of the string must be NULL. The string is left-padded with the ‘0’ character to ensure the number occupies the whole of its fixed-width buffer.

For example, a rustar-number[8] field for the integer 10 will be the string “0000012” followed by one byte of NULL. 12 octal equals 10 decimal.
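The character rules and the number encoding can be sketched as follows. The names rustar-char?, valid-rustar-text? and rustar-number-string are hypothetical; the check applies to the text before any NULL padding is added, and the number helper returns an ordinary string whose last character is NULL.

```scheme
;; Hypothetical helper names, for illustration only.

;; True if C is one of the allowed non-NULL codepoints.
(define (rustar-char? c)
  (let ((n (char->integer c)))
    (or (<= #x20 n #x22)
        (<= #x25 n #x3F)
        (<= #x41 n #x5A)
        (= n #x5F)
        (<= #x61 n #x7A))))

;; True if every character of STR (before padding) is allowed.
(define (valid-rustar-text? str)
  (string-every rustar-char? str))

;; Encode N as a rustar-number of WIDTH bytes: zero-padded octal
;; digits followed by a single NULL.
(define (rustar-number-string n width)
  (let* ((digits (number->string n 8))
         (padding (- width 1 (string-length digits))))
    (if (< padding 0)
        (error "number too large for field:" n width)
        (string-append (make-string padding #\0)
                       digits
                       (string (integer->char 0))))))
```

With these definitions, (rustar-number-string 10 8) is the 8-character string "0000012" plus NULL, "foo.txt" passes the character check, and "a#b" fails it because U+23 is not in the allowed set.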

Header Fields

The 17 fields in the 512 byte header block of a logical record are

Field       Format       Description
Name        string[100]  The filename by itself, with no directory information. The path separator character (U+2F) is not allowed.
Mode        number[8]    A bitfield of the permissions. See below.
UID         number[8]    The User ID of the file.
GID         number[8]    The Group ID of the file.
Size        number[12]   The length of the file in bytes.
mtime       number[12]   The 32-bit integer modification time of the file.
Checksum    number[8]    256 + the sum of all the bytes in this header except the checksum field.
Typeflag    string[1]    Always “0”.
Link name   string[100]  Always 100 bytes of NULL.
Magic       0string[6]   The string “ustar” plus a NULL.
Version     string[2]    The string “00”.
uname       0string[32]  The uname of the file.
gname       0string[32]  The gname of the file.
Dev-Major   number[8]    Always zero.
Dev-Minor   number[8]    Always zero.
Prefix      string[155]  Path information for this file. If this file has no additional path information, this is all NULL. Directory separation is represented by ‘/’ forward slash. The slash at the end is assumed, and should not be included explicitly. (See footnote 1.)
Padding     0string[12]  12 bytes of NULL.

The mode bitfield is a standard permissions bitfield:
• 0x001 execute permission for ’other’
• 0x002 write permission for ’other’
• 0x004 read permission for ’other’
• 0x008 execute permission for ’group’
• 0x010 write permission for ’group’
• 0x020 read permission for ’group’
• 0x040 execute permission for ’owner’
• 0x080 write permission for ’owner’
• 0x100 read permission for ’owner’
• 0x200 (unused)
• 0x400 if is setgid
• 0x800 if is setuid
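As a sanity check on the table, the sketch below adds up the field widths and computes the checksum the way the Checksum row describes it. The names field-widths, checksum-offset, checksum-width and header-checksum are hypothetical; the offset of 148 is derived by summing the widths of the six fields that precede Checksum.

```scheme
(use-modules (rnrs bytevectors))

;; Field widths in header order, taken from the table above.
(define field-widths '(100 8 8 8 12 12 8 1 100 6 2 32 32 8 8 155 12))

;; The checksum field follows Name, Mode, UID, GID, Size and mtime.
(define checksum-offset (+ 100 8 8 8 12 12))   ; 148
(define checksum-width 8)

;; 256 plus the sum of every header byte outside the checksum field.
(define (header-checksum header)
  (let loop ((i 0) (sum 256))
    (cond ((= i (bytevector-length header)) sum)
          ((and (>= i checksum-offset)
                (< i (+ checksum-offset checksum-width)))
           (loop (+ i 1) sum))            ; skip the checksum field itself
          (else
           (loop (+ i 1) (+ sum (bytevector-u8-ref header i)))))))
```

The 17 widths add up to exactly 512, and an all-NULL header block has a checksum of 256.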

Data

After the 512-byte header block, the binary contents of the file are stored. The data segment is NULL-padded so that it ends on a 512-byte block boundary.
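The amount of padding, and from it the total size of an archive, is a small piece of modular arithmetic. The sketch below uses hypothetical names (data-padding, archive-size) and assumes the layout described in this section: one 512-byte header per file, the padded data segment, and a 1024-byte footer.

```scheme
(define block-size 512)

;; Number of NULL bytes needed after SIZE bytes of data to reach
;; the next block boundary (0 if SIZE is already a multiple of 512).
(define (data-padding size)
  (modulo (- size) block-size))

;; Total archive size for a list of file sizes: header + data +
;; padding for each file, plus the two-block footer.
(define (archive-size file-sizes)
  (+ (apply + (map (lambda (size)
                     (+ block-size size (data-padding size)))
                   file-sizes))
     (* 2 block-size)))
```

A 1-byte file needs 511 bytes of padding and a 512-byte file needs none; an archive holding files of 0, 1 and 512 bytes comes to 3584 bytes.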

Footer

The footer is 1024 bytes of NULL that appears at the end of the file.

1 For example: prefix “foo” + name “bar” forms “foo/bar”. Prefix “foo/” + name “bar” forms “foo//bar”. Don’t do that.

The Archive Script

Jez Ng contributed a script that meets the above requirements quite nicely. One thing to note here is the use of the cut and cute macros from SRFI 26. These let you, in effect, pass a subset of the required parameters to a procedure. In a later call, you can add the remaining parameters to the procedure and then truly call it.
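A small sketch of cut and cute in isolation may help before reading the script. All the names defined here (add-ten, bracket, tagged, calls, next-call!, late, early) are made up for this example. The <> marker is a slot for one later argument and <...> is a slot for any number of remaining arguments. The difference between the two forms is when the non-slot expressions are evaluated: cut evaluates them on every call, cute evaluates them once, when the specialized procedure is created.

```scheme
(use-modules (srfi srfi-26))

(define add-ten (cut + 10 <>))             ; one slot
(define bracket (cut list 'start <> 'end)) ; slot in the middle
(define tagged  (cut list 'tag <...>))     ; rest slot

;; A counter with a visible side effect, to show evaluation time.
(define calls 0)
(define (next-call!)
  (set! calls (+ calls 1))
  calls)

;; `late' runs (next-call!) each time it is called.
(define late  (cut  list (next-call!) <>))
;; `early' ran (next-call!) once, right here, and keeps the result.
(define early (cute list (next-call!) <>))
```

After these definitions, (add-ten 5) gives 15 and (tagged 1 2 3) gives (tag 1 2 3). Every call to early returns a list starting with 1, while successive calls to late start with 2, then 3, and so on.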

#! /usr/bin/env guile \
-e main -s
!#

(use-modules (rnrs bytevectors)
             (rnrs io ports)
             (srfi srfi-1)   ; map, reduce
             (srfi srfi-26)  ; cut, cute
             (ice-9 format))

(define write-bytevector (cut put-bytevector (current-output-port) <...>))

(define block-size 512)

(define (cat)
  (define bv (make-bytevector block-size 0))
  (let ((read-count (get-bytevector-n! (current-input-port) bv 0 block-size)))
    (unless (eof-object? read-count)
      (write-bytevector bv)
      (unless (< read-count block-size) (cat)))))

(define rustar-char-set
  (char-set-union
   (ucs-range->char-set #x20 #x23)
   (ucs-range->char-set #x25 #x40)
   (ucs-range->char-set #x41 #x5B)
   (char-set #\x5F)
   (ucs-range->char-set #x61 #x7B)))

(define (valid-rustar-char? c)
  (char-set-contains? rustar-char-set c))

(define (make-fixed-string length string)(let ((bv (make-bytevector length 0)))

(string-for-each-index(lambda (i)(let ((c (string-ref string i)))

(unless (valid-rustar-char? c)(throw ’ustar-error "encountered invalid character"))

(bytevector-u8-set! bv i (char->integer c))))string)

Page 44: The Guile 100 Programs ProjectGuile can be used as a scripting language. Programs can be written as plain text files, and then run from the command line by using the Guile interpreter

40

bv))

(define (make-rustar-string length string)(if (<= (string-length string) length)

(make-fixed-string length string)(throw ’ustar-error "’~a’ is too long for tar header" string)))

(define (make-rustar-0string length string)(if (< (string-length string) length)

(make-fixed-string length string)(throw ’ustar-error "’~a’ is too long for tar header" string)))

(define (make-rustar-number length number)(let* ((num (number->string number 8))

(padding (- length (string-length num) 1)))(if (>= padding 0)

(make-fixed-string length (string-append (make-string padding #\0) num))(throw ’ustar-error "~a is too large for tar header" num))))

;; Unlike dirname, this doesn’t return "." for files in the cwd.(define (raw-dirname path)(let ((last-separator-pos (string-rindex

path(string-ref file-name-separator-string 0))))

(if last-separator-pos(string-take path last-separator-pos)"")))

(define (write-file-header filename)(define st (lstat filename))(unless (eq? (stat:type st) ’regular)

(throw ’ustar-error "Only regular files are supported"))(let* ((uid (stat:uid st))

(gid (stat:gid st)); We only really need an a-list for the purposes of modifying; checksum in-place. The other keys are not used. However, they do; serve as documentation.(header‘((filename . ,(make-rustar-string 100 (basename filename)))(mode . ,(make-rustar-number 8 (stat:perms st)))(uid . ,(make-rustar-number 8 uid))(gid . ,(make-rustar-number 8 gid))(size . ,(make-rustar-number 12 (stat:size st)))(mtime . ,(make-rustar-number 12 (stat:mtime st)))(checksum . ,(make-bytevector 8 (char->integer #\space)))(typeflag . ,(make-rustar-string 1 "0"))(link-name . ,(make-rustar-string 100 ""))

Page 45: The Guile 100 Programs ProjectGuile can be used as a scripting language. Programs can be written as plain text files, and then run from the command line by using the Guile interpreter

41

(magic . ,(make-rustar-0string 6 "ustar"))(version . ,(make-rustar-string 2 "00"))(uname . ,(make-rustar-0string 32 (passwd:name (getpwuid uid))))(gname . ,(make-rustar-0string 32 (group:name (getgrgid gid))))(dev-major . ,(make-rustar-number 8 0))(dev-minor . ,(make-rustar-number 8 0))(path . ,(make-rustar-string 155 (raw-dirname filename)))(padding . ,(make-rustar-0string 12 ""))))

(sum (cut reduce + 0 <>))(checksum (sum (map (compose sum bytevector->u8-list cdr) header))))

(set! header (assq-set! header ’checksum (make-rustar-number 8 checksum)))(for-each (compose write-bytevector cdr) header)))

(define (tar archive filenames)(with-output-to-file archive

(lambda ()(for-each (lambda (filename)

(write-file-header filename)(with-input-from-file filename cat #:binary #t))

filenames)(write-bytevector (make-bytevector (* block-size 2) 0)))

#:binary #t))

(define (main args)(define perror (cut format (current-error-port) <...>))(define (system-error-handler . args)

(perror "error: ~a~%" (strerror (system-error-errno args)))(exit 1))

(define (ustar-error-handler . args)(perror "error: ")(apply perror (cdr args))(perror "~%")(exit 1))

(catch ’ustar-error(lambda ()(catch ’system-error

(cute tar (cadr args) (cddr args))system-error-handler))

ustar-error-handler))
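The difference between cut and cute in the script above is easy to miss, so here is a small sketch of it. With cut, the non-slot expressions are evaluated again on every call; with cute, they are evaluated once, when the specialized procedure is created. That is presumably why write-bytevector uses cut: (current-output-port) has to be looked up afresh once with-output-to-file has redirected it. The counter and the procedure names below are made up for this illustration.

```scheme
(use-modules (srfi srfi-26))

(define counter 0)

;; Hypothetical helper with a visible side effect.
(define (next!)
  (set! counter (+ counter 1))
  counter)

;; cut: (next!) runs each time with-cut is called.
(define with-cut (cut list (next!) <>))

;; cute: (next!) runs once, right here, and the result is remembered.
(define with-cute (cute list (next!) <>))

(define cute-first  (with-cute 'a))   ; uses the remembered value
(define cute-second (with-cute 'b))   ; same remembered value
(define cut-first   (with-cut 'a))    ; calls (next!) again
(define cut-second  (with-cut 'b))    ; and again
```

Here cute-first and cute-second both start with 1, while cut-first and cut-second start with 2 and 3.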


Later, Mark Weaver contributed a more featureful script that handles almost all of the capabilities of the ustar archive format. It handles directories and links as well as files. He also uses a very common hack to allow longer path names: when the whole path does not fit within the 100-character filename field, the trailing part of the path goes into that field and the leading part goes into the prefix field. You can find his script in the appendix; see Section A.1 [ustar Archives].


4 Theme 2: Web 1.0

The second theme in this project is “Web 1.0”, where we’ll talk about interacting with the Internet as it existed in the 1990s.

The 1990s began with the emergence of Gopher clients and servers. The Internet Gopher protocol visualized the world as a series of folders. The folders usually contained plain-text documents or media files like GIFs or AU audio. This was before both HTML and PDF, so mixing text and graphics in a single file wasn’t as common, and, if it did occur, it was in formats such as PostScript.

The HTTP-and-HTML-based internet is linked to the appearance of the NCSA Mosaic browser and the NCSA httpd server. There were precursors, but, as a practical matter, 1993 was the beginning of the HTTP/HTML web.

But, in those days, before AJAX or Flash, most of the content was static HTML content or dynamic content created by CGI scripts. In this context, before the concept of cookies was developed in 1994, personalization of content for different users was not practical.

JavaScript appeared in Netscape Navigator 2.0 in 1995 and in Internet Explorer 3.0 in late 1996, but with incompatibilities between the two implementations. Before 1996, almost all content was static and generated on the server side. This early Web had a more strongly defined separation between client and server.

The early Web pages had stylistic quirks that are less common today. Before CSS2, Web page layouts were often created by using tables. Blinking text, animated GIFs, and embedded MIDI tunes were common.

By the end of the decade, Linux, Apache, MySQL, and PHP were all quite functional. Those programs, in conjunction with Perl, which first appeared in the 80s, became the building blocks of the famous LAMP stack. This free, open software stack allowed for some of the common types of interactivity to which we have become accustomed.

PHP used a model that allowed for rapid generation of Web pages, where code was embedded within otherwise static HTML web pages. When those pages were requested, the embedded PHP code was run, and its output became HTML content.

So, in our second theme we’ll imagine what the world would have been like if GUILE were part of the ecosystem that made up the 1990s Internet experience. Specifically, we’ll take a look at using Guile for

• on-the-fly evaluation of code embedded within HTML documents

• the Internet Gopher protocol

• CGI scripting

• the Linux Apache GUILE MySQL stack

• and the animated GIF format.

And away we go.


4.1 Problem 5: PHP-Style GUILE

This challenge is to write a CGI script that
1. receives a filename as a parameter
2. passes a file by that filename through a preprocessor called eguile
3. and returns the output to the CGI client.

But why eguile? That script helps us mix HTML and Guile.

One of the programming paradigms of Web 1.0 was the PHP programming model, where code was embedded within HTML. The code was run when a client requested the file from the server, and any output printed by the execution of the code became embedded in the HTML when it was sent to the client. The code enclosed between the <?php and ?> tags is evaluated when the file is requested. Anything printed to stdout appears in the HTML document.

<!DOCTYPE html>
<html>
<body>

<?php
echo "My first PHP script!";
?>

</body>
</html>

When it first arrived on the scene, PHP ran as a CGI executable.

The side effect of today’s challenge is to re-create the PHP programming model in Guile, making something like the following possible.

<!DOCTYPE html>
<html>
<body>
<p>
<?scm
(display "A Guile Script!")
?>
</p>
<p>
<?scm:d "A string" ?>
</p>
</body>
</html>

Mixing HTML and Scheme

So I mentioned eguile. It is an abandonware project that, when given a file that is a mix of HTML and of Guile code, can run it PHP style. The HTML text will be passed through unmodified to the output, and Scheme code will be executed, and anything that the Scheme code displays to the current output port will be passed through as well.


eguile does this by recognizing two new tags.
• ‘<?scm’ and ‘?>’ enclose Scheme code, which eguile will pass to Guile for evaluation.
• ‘<?scm:d’ and ‘?>’ also enclose Scheme code to be evaluated, just like the ‘<?scm’ tags. Additionally, eguile will display the value of the last expression using the display procedure.
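To give a feel for what handling these two tags involves, here is a minimal sketch that expands a template held in a string. It is not Eguile's implementation, which I'm not reproducing here; expand-template is a name invented for this example, and it ignores details such as ‘?>’ appearing inside a Scheme string.

```scheme
(use-modules (ice-9 eval-string))

;; Hypothetical helper: copy the text of STR through unchanged,
;; evaluate the code between <?scm and ?>, and for <?scm:d also
;; display the value of the last expression.
(define (expand-template str)
  (with-output-to-string
    (lambda ()
      (let loop ((pos 0))
        (let ((open (string-contains str "<?scm" pos)))
          (cond
           ((not open)
            ;; No more tags: emit the rest of the text.
            (display (substring str pos)))
           (else
            (display (substring str pos open))
            (let* ((display? (string-prefix? "<?scm:d" (substring str open)))
                   (code-start (+ open (if display? 7 5)))
                   (close (string-contains str "?>" code-start)))
              (unless close
                (error "unterminated <?scm tag"))
              (let ((value (eval-string (substring str code-start close))))
                (when display? (display value)))
              (loop (+ close 2))))))))))
```

Calling (expand-template "<p><?scm (display \"hi\") ?></p><?scm:d (+ 1 2) ?>") returns the string "<p>hi</p>3". Because eval-string runs arbitrary code, a sketch like this should only ever be fed templates you trust.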

Making a CGI Script

eguile by itself is not a complete solution. It can run mixed HTML and Guile code through the Guile interpreter, but it doesn’t have any hooks to connect it to the webserver.

To make this happen, we can add some framework code to have eguile run as part of a CGI script.

The quickest way to make a CGI script is to use the functions provided by the Guile-WWW project. Guile-WWW has routines that provide CGI functionality.

Thus, we’ll be creating a Scheme script that uses Guile-WWW for CGI processing and that includes an updated version of eguile.

URL Parsing

We’ll call this script ‘ghp.cgi’. ghp is short for Guile HTML Processor. For any basic webserver, you can put the ‘ghp.cgi’ in the ‘cgi-bin’ directory, and run it by pointing your browser to something like http://localhost/cgi-bin/ghp.cgi.

But wait! We have to tell ‘ghp.cgi’ what HTML-and-Scheme file it needs to process and output. One way is to have ‘ghp.cgi’ parse any extra path information at the end of its URL.

The extra path information is given after the script name, like so:

http://localhost/cgi-bin/ghp.cgi/FILENAME

Any normal webserver should put the extra path information for the CGI script in the PATH_INFO environment variable.

The ‘ghp.cgi’ script should load an HTML-and-Guile file named FILENAME from some sensible default path, process it through Eguile, and then serve it back to the client.

Like any sane CGI script that processes a URL, ‘ghp.cgi’ should strip out any ‘/../’ in the path, or maybe just fail if there are ‘/../’ in the path.
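The "just fail" variant of that check might look something like the following sketch. The procedure name is invented for this example, and it takes the stricter route of rejecting any ‘..’ path component rather than trying to strip it.

```scheme
;; Hypothetical helper: turn a PATH_INFO value such as "/index.html"
;; into a relative path, or return #f if it is empty or contains
;; a ".." component.
(define (path-info->relative-path path-info)
  (let ((parts (filter (lambda (part) (not (string-null? part)))
                       (string-split path-info #\/))))
    (and (pair? parts)
         (not (member ".." parts))
         (string-join parts "/"))))
```

In a CGI script this would be called as (path-info->relative-path (or (getenv "PATH_INFO") "")), with the result then appended to whatever default directory you settle on.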

The Task at Hand

The task is to write a CGI script that
1. inspects its PATH_INFO to see if an extra filename appears at the end of the URL used to call the script
2. passes a file by that filename through the Eguile processing procedure
3. and sends it back to the client

If a file by that filename doesn’t exist, the script should return an HTTP 404 “Not Found” error.
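A CGI script reports its status to the webserver with a Status header line ahead of the body. Here is a minimal sketch of the found / not found decision that writes the headers by hand instead of using Guile-WWW. Both procedure names are invented for this example, and render is a stand-in for whatever Eguile processing procedure you end up calling; it is assumed to take a filename and return a string.

```scheme
;; Hypothetical helper: write a complete CGI response to the
;; current output port.
(define (send-cgi-response status content-type body)
  (format #t "Status: ~a\r\n" status)
  (format #t "Content-Type: ~a\r\n\r\n" content-type)
  (display body))

;; Hypothetical helper: PATH is a filename or #f. RENDER is assumed
;; to map a filename to the processed page as a string.
(define (serve-file path render)
  (if (and path (file-exists? path))
      (send-cgi-response "200 OK" "text/html" (render path))
      (send-cgi-response "404 Not Found" "text/plain" "Not Found\n")))
```

Keeping the rendering step as a parameter makes it possible to try out the 404 path before Eguile is wired in.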

Truly, this statement of the problem will probably be much longer than the ‘ghp.cgi’ file I’m asking you to create. But, once the CGI script is in place, we can serve up mixed HTML and Scheme content just like PHP 3 did way back in 1997.

You can find Guile-WWW at http://nongnu.org/guile-www.

For the moment, you can find Eguile at https://github.com/spk121/guile100/blob/master/code/eguile.scm

The original source of Eguile is at http://woozle.org/~neale/src/eguile/. Remember that it is abandonware, so don’t bug the owner with questions. We’re going to find a new home and maintainer for Eguile in the near future.

Eguile itself was based on other predecessors, like Shiro Kawai’s ESCM. You can find ESCM at http://practical-scheme.net/vault/escm.html.

4.2 Problem 6: MySQL

This challenge is to write one static HTML form page and one CGI script that will add data to a MySQL database table.
1. Create a static HTML page that has a form with a name text field and a male/female/other gender radio button set. The form, when posted, will call a Guile CGI script as its action, posting the name and gender fields.
2. Create a CGI script that receives the form’s name and gender post data and adds it to a MySQL / MariaDB database. The script will then display the entire contents of the database as a table in HTML.
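The last step of the second item, showing rows as an HTML table, can be sketched without touching the database at all. The example below assumes the rows have already been fetched and converted to lists of strings; both procedure names are invented for this illustration. Since the names come straight from a web form, the cell text is escaped before it goes into the page.

```scheme
;; Hypothetical helper: escape the characters that are special in HTML.
(define (html-escape str)
  (string-concatenate
   (map (lambda (c)
          (case c
            ((#\<) "&lt;")
            ((#\>) "&gt;")
            ((#\&) "&amp;")
            ((#\") "&quot;")
            (else (string c))))
        (string->list str))))

;; Hypothetical helper: ROWS is a list of rows, each a list of strings.
(define (rows->html-table rows)
  (string-append
   "<table>\n"
   (string-concatenate
    (map (lambda (row)
           (string-append
            "<tr>"
            (string-concatenate
             (map (lambda (cell)
                    (string-append "<td>" (html-escape cell) "</td>"))
                  row))
            "</tr>\n"))
         rows))
   "</table>\n"))
```

For example, (rows->html-table '(("Ann" "female"))) yields a table with one row and two cells.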

You may find Guile-WWW useful when creating CGI scripts. You can find Guile-WWW at http://nongnu.org/guile-www.

Guile-DBI is probably the best way to access MySQL databases in Guile. You can find it at http://home.gna.org/guile-dbi/.

4.3 Problem 7: Animated GIF Badges

A very important part of the Web 1.0 experience was the GIF badge. These 88 by 31 pixel images were typically bright colors on a grey background with a border to give them a raised button effect. They had text announcing one’s loyalty to a brand of web browser, computer, or political philosophy, or were used as download buttons. They were usually animated.

To create our Web 1.0 experience, we need animated GIF badges. So this week’s task is to write a procedure that will create a GIF. The procedure will have to come in two versions: one for animated GIF and one for static GIF.

For the static GIF case, you should assume that your input data is the following:
• a filename for the output
• a palette of 256 24-bit RGB colors, perhaps stored as a vector of unsigned integers
• a two-dimensional array of unsigned 8-bit indices to the palette colors

For the animated GIF case, you should assume that your input data is
• a filename for the output
• a palette of 256 24-bit RGB colors, as above
• a three-dimensional array of unsigned 8-bit indices
• and a variable containing the desired milliseconds per frame

The actual specification for GIF, GIF89a, can be found at http://www.w3.org/Graphics/GIF/spec-gif89a.txt. This specification, however, contains a lot of fields and features that won’t be needed for this specific case. On the other end of the spectrum is the current Wikipedia page for GIF, https://en.wikipedia.org/wiki/Gif, which, at the time of this writeup, contains a very condensed and cryptic description of the file format and the fields contained therein. By merging information from the official specification and the condensed one, it should be possible to write a legible function that creates GIFs for the two cases described above.
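As a small taste of what the specification asks for, here is a sketch of the first 13 bytes of a GIF89a file: the six-byte signature followed by the Logical Screen Descriptor. It assumes the 256-color global palette described in the inputs above, which is where the packed byte #xF7 comes from (global color table present, 8 bits of color resolution, table size 2^(7+1) entries). The procedure names are invented for this example. The second helper reflects the fact that GIF stores frame delays in hundredths of a second, not milliseconds.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: signature plus Logical Screen Descriptor
;; for a WIDTH x HEIGHT image with a 256-entry global color table.
(define (gif-header width height)
  (let ((bv (make-bytevector 13 0)))
    (bytevector-copy! (string->utf8 "GIF89a") 0 bv 0 6)
    (bytevector-u16-set! bv 6 width (endianness little))
    (bytevector-u16-set! bv 8 height (endianness little))
    (bytevector-u8-set! bv 10 #xF7)  ; packed fields, see above
    ;; Bytes 11 and 12 (background color index, pixel aspect ratio)
    ;; are left at zero.
    bv))

;; Hypothetical helper: convert milliseconds per frame to the
;; hundredths of a second used by the Graphic Control Extension.
(define (ms->gif-delay ms)
  (round (/ ms 10)))
```

The palette, the image descriptor, the LZW-compressed pixel data, and the trailer byte would all follow this header; none of that is attempted here.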

One of the trickier parts of the implementation is the LZW compression required. Fortunately, an implementation of LZW compression is handy; see Section 3.3 [Problem 3]: LZW Compression.

These days, the giflib project is as close as we have to a canonical library for reading and writing GIFs. It can be referenced to help understand the places in the specification that are obscure. It is at http://sourceforge.net/projects/giflib.

An alternate strategy would be to wrap up giflib as a Guile extension using either Guile’s FFI or its C interface.


Appendix A Other Examples

Here are some other examples for you.

A.1 ustar Archives

Back in Section 3.4 [Problem 4], I defined a limited, reduced functionality version of the ustar archive format. The limited version had just enough functionality to create a valid TAR file. After I received Jez’s solution, Mark Weaver sent an alternate script that handles almost all of the capabilities of the ustar file format, including links and longer path names. That script is below.

#!/usr/bin/guile \
-e main -s
!#
;;; Copyright (C) 2013 Mark H Weaver <[email protected]>
;;;
;;; This program is free software: you can redistribute it and/or modify
;;; it under the terms of the GNU General Public License as published by
;;; the Free Software Foundation, either version 3 of the License, or
;;; (at your option) any later version.
;;;
;;; This program is distributed in the hope that it will be useful,
;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with this program. If not, see <http://www.gnu.org/licenses/>.

(use-modules (srfi srfi-1)
             (ice-9 match)
             (ice-9 receive)
             (rnrs bytevectors)
             (rnrs io ports))

;; 'file-name-separator-string' and 'file-name-separator?' are
;; included in Guile 2.0.9 and later.
(define file-name-separator-string "/")
(define (file-name-separator? c) (char=? c #\/))

(define (fmt-error fmt . args)
  (error (apply format #f fmt args)))

;; Like 'string-pad-right', but for bytevectors. However, unlike
;; 'string-pad-right', truncation is not allowed here.
(define* (bytevector-pad
          bv len #:optional (byte 0) (start 0) (end (bytevector-length bv)))
  (when (< len (- end start))
    (fmt-error
     "bytevector-pad: truncation would occur: len ~a, start ~a, end ~a, bv ~s"
     len start end bv))
  (let ((result (make-bytevector len byte)))
    (bytevector-copy! bv start result 0 (- end start))
    result))

(define (bytevector-append . bvs)
  (let* ((lengths (map bytevector-length bvs))
         (total (fold + 0 lengths))
         (result (make-bytevector total)))
    (fold (lambda (bv len pos)
            (bytevector-copy! bv 0 result pos len)
            (+ pos len))
          0 bvs lengths)
    result))

(define ustar-charset
  #;(char-set-union (ucs-range->char-set #x20 #x23)
                    (ucs-range->char-set #x25 #x40)
                    (ucs-range->char-set #x41 #x5B)
                    (ucs-range->char-set #x5F #x60)
                    (ucs-range->char-set #x61 #x7B))
  char-set:ascii)

(define (valid-ustar-char? c)
  (char-set-contains? ustar-charset c))

(define (ustar-string n str name)
  (unless (>= n (string-length str))
    (fmt-error "~a is too long (max ~a): ~a" name n str))
  (unless (string-every valid-ustar-char? str)
    (fmt-error "~a contains unsupported character(s): ~s in ~s"
               name
               (string-filter (negate valid-ustar-char?) str)
               str))
  (bytevector-pad (string->utf8 str) n))

(define (ustar-0string n str name)
  (bytevector-pad (ustar-string (- n 1) str name)
                  n))

(define (ustar-number n num name)
  (unless (and (integer? num)
               (exact? num)
               (not (negative? num)))
    (fmt-error "~a is not a non-negative exact integer: ~a" name num))
  (unless (< num (expt 8 (- n 1)))
    (fmt-error "~a is too large (max ~a): ~a" name (expt 8 (- n 1)) num))
  (bytevector-pad (string->utf8 (string-pad (number->string num 8)
                                            (- n 1)
                                            #\0))
                  n))

(define (checksum-bv bv)
  (let ((len (bytevector-length bv)))
    (let loop ((i 0) (sum 0))
      (if (< i len)
          (loop (+ i 1) (+ sum (bytevector-u8-ref bv i)))
          sum))))

(define (checksum . bvs)
  (fold + 0 (map checksum-bv bvs)))

(define nuls (make-bytevector 512 0))

;; write a ustar record of exactly 512 bytes, starting with the
;; segment of BV between START (inclusive) and END (exclusive), and
;; padded at the end with nuls as needed.
(define* (write-ustar-record
          port bv #:optional (start 0) (end (bytevector-length bv)))
  (when (< 512 (- end start))
    (fmt-error "write-ustar-record: record too long: start ~s, end ~s, bv ~s"
               start end bv))
  ;; We could have used 'bytevector-pad' here,
  ;; but instead use a method that avoids allocation.
  ;; Note: 'put-bytevector' takes a start index and a byte count,
  ;; not an end index, so pass (- end start) as the count.
  (put-bytevector port bv start (- end start))
  (put-bytevector port nuls 0 (- 512 (- end start))))

;; write 1024 zero bytes, which indicates the end of a ustar archive.
(define (write-ustar-footer port)
  (put-bytevector port nuls)
  (put-bytevector port nuls))

(define (compose-path-name dir name)
  (if (or (string-null? dir)
          (file-name-separator? (string-ref dir (- (string-length dir) 1))))
      (string-append dir name)
      (string-append dir "/" name)))

;; Like 'call-with-port', but also closes PORT if an error occurs.
(define (call-with-port* port proc)
  (dynamic-wind
    (lambda () #f)
    (lambda () (proc port))
    (lambda () (close port))))

(define (call-with-dirstream* dirstream proc)
  (dynamic-wind
    (lambda () #f)
    (lambda () (proc dirstream))
    (lambda () (closedir dirstream))))

(define (files-in-directory dir)
  (call-with-dirstream* (opendir dir)
    (lambda (dirstream)
      (let loop ((files '()))
        (let ((name (readdir dirstream)))
          (cond ((eof-object? name)
                 (reverse files))
                ((member name '("." ".."))
                 (loop files))
                (else
                 (loop (cons (compose-path-name dir name) files)))))))))

;; split the path into prefix and name fields for purposes of the
;; ustar header. If the entire path fits in the name field (100 chars
;; max), then leave the prefix empty. Otherwise, try to put the last
;; component into the name field and everything else into the prefix
;; field (155 chars max). If that fails, put as much as possible into
;; the prefix and the rest into the name field. This follows the
;; behavior of GNU tar when creating a ustar archive.
(define (ustar-path-name-split path orig-path)
  (define (too-long)
    (fmt-error "~a: file name too long" orig-path))
  (let ((len (string-length path)))
    (cond ((<= len 100) (values "" path))
          ((> len 256) (too-long))
          ((string-rindex path
                          file-name-separator?
                          (- len 101)
                          (min (- len 1) 156))
           => (lambda (i)
                (values (substring path 0 i)
                        (substring path (+ i 1) len))))
          (else (too-long)))))

(define (write-ustar-header port path st)
  (let* ((type (stat:type st))
         (perms (stat:perms st))
         (mtime (stat:mtime st))
         (uid (stat:uid st))
         (gid (stat:gid st))
         (uname (or (false-if-exception (passwd:name (getpwuid uid)))
                    ""))
         (gname (or (false-if-exception (group:name (getgrgid gid)))
                    ""))

         (size (case type
                 ((regular) (stat:size st))
                 (else 0)))

         (type-flag (case type
                      ((regular) "0")
                      ((symlink) "2")
                      ((char-special) "3")
                      ((block-special) "4")
                      ((directory) "5")
                      ((fifo) "6")
                      (else (fmt-error "~a: unsupported file type ~a"
                                       path type))))

         (link-name (case type
                      ((symlink) (readlink path))
                      (else "")))

         (dev-major (case type
                      ((char-special block-special)
                       (quotient (stat:rdev st) 256))
                      (else 0)))

         (dev-minor (case type
                      ((char-special block-special)
                       (remainder (stat:rdev st) 256))
                      (else 0)))

         ;; Convert file name separators to slashes.
         (slash-path (string-map (lambda (c)
                                   (if (file-name-separator? c) #\/ c))
                                 path))

         ;; Make the path name relative.
         ;; TODO: handle drive letters on windows.
         (relative-path (if (string-every #\/ slash-path)
                            "."
                            (string-trim slash-path #\/)))

         ;; If it's a directory, add a trailing slash,
         ;; otherwise remove trailing slashes.
         (full-path (case type
                      ((directory) (string-append relative-path "/"))
                      (else (string-trim-right relative-path #\/)))))

    (receive (prefix name) (ustar-path-name-split full-path path)

      (let* ((%name (ustar-string 100 name "file name"))
             (%mode (ustar-number 8 perms "file mode"))
             (%uid (ustar-number 8 uid "user id"))
             (%gid (ustar-number 8 gid "group id"))
             (%size (ustar-number 12 size "file size"))
             (%mtime (ustar-number 12 mtime "modification time"))
             (%type-flag (ustar-string 1 type-flag "type flag"))
             (%link-name (ustar-string 100 link-name "link name"))
             (%magic (ustar-0string 6 "ustar" "magic field"))
             (%version (ustar-string 2 "00" "version number"))
             (%uname (ustar-0string 32 uname "user name"))
             (%gname (ustar-0string 32 gname "group name"))
             (%dev-major (ustar-number 8 dev-major "dev major"))
             (%dev-minor (ustar-number 8 dev-minor "dev minor"))
             (%prefix (ustar-string 155 prefix "directory name"))

             ;; While the checksum is being computed, the 8-byte
             ;; checksum field counts as eight spaces.
             (%dummy-checksum (string->utf8 "        "))

             (%checksum
              (bytevector-append
               (ustar-number
                7
                (checksum %name %mode %uid %gid %size %mtime
                          %dummy-checksum
                          %type-flag %link-name %magic %version
                          %uname %gname %dev-major %dev-minor
                          %prefix)
                "checksum")
               (string->utf8 " "))))

        (write-ustar-record port
                            (bytevector-append
                             %name %mode %uid %gid %size %mtime
                             %checksum
                             %type-flag %link-name %magic %version
                             %uname %gname %dev-major %dev-minor
                             %prefix))))))

(define (write-ustar-path port path)
  (let* ((path (if (string-every file-name-separator? path)
                   file-name-separator-string
                   (string-trim-right path file-name-separator?)))
         (st (lstat path))
         (type (stat:type st))
         (size (stat:size st)))
    (write-ustar-header port path st)
    (case type
      ((regular)
       (call-with-port* (open-file path "rb")
         (lambda (in)
           (let ((buf (make-bytevector 512)))
             (let loop ((left size))
               (when (positive? left)
                 (let* ((asked (min left 512))
                        (obtained (get-bytevector-n! in buf 0 asked)))
                   (when (or (eof-object? obtained)
                             (< obtained asked))
                     (fmt-error "~a: file appears to have shrunk" path))
                   (write-ustar-record port buf 0 obtained)
                   (loop (- left obtained)))))))))
      ((directory)
       (for-each (lambda (path) (write-ustar-path port path))
                 (files-in-directory path))))))

(define (write-ustar-archive output-path paths)
  (catch #t
    (lambda ()
      (call-with-port* (open-file output-path "wb")
        (lambda (out)
          (for-each (lambda (path)
                      (write-ustar-path out path))
                    paths)
          (write-ustar-footer out))))
    (lambda (key subr message args . rest)
      (false-if-exception (delete-file output-path))
      (format (current-error-port) "ERROR: ~a\n"
              (apply format #f message args))
      (exit 1))))

(define (main args)
  (match args
    ((program output-path paths ...)
     (write-ustar-archive output-path paths))
    (_ (display "Usage: ustar <archive> <file> ...\n" (current-error-port))
       (exit 1))))

;;; Local Variables:
;;; mode: scheme
;;; eval: (put 'call-with-port* 'scheme-indent-function 1)
;;; eval: (put 'call-with-dirstream* 'scheme-indent-function 1)
;;; End:


5 References

Allen, John. 1978. Anatomy of Lisp. New York: McGraw-Hill.

ANSI X3.226-1994. American National Standard for Information Systems - Programming Language - Common Lisp.

The IEEE and The Open Group. 2001-2004. The Open Group Base Specifications Issue 6, IEEE Std 1003.1, 2004 Edition.


Index

Any inaccuracies in this index may be explained by the fact that it has been prepared with the help of a computer.
-- Donald E. Knuth, Fundamental Algorithms (Volume 1 of The Art of Computer Programming)

(Index is nonexistent)