www.wsgi.org Documentation Release 0.9 Feb 25, 2020

Contents

1 Contents

2 Contributing

3 Indices and tables

Bibliography

Index


CHAPTER 1

Contents

1.1 What is WSGI?

WSGI is the Web Server Gateway Interface. It is a specification that describes how a web server communicates with web applications, and how web applications can be chained together to process one request.

WSGI is a Python standard described in detail in PEP 3333.
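As a minimal sketch of what the interface looks like in practice (following the calling convention from PEP 3333): an application is a callable that takes environ and start_response, and middleware is a callable that wraps another application. The names hello_app and uppercase_middleware are illustrative assumptions, not part of any library.

```python
def hello_app(environ, start_response):
    """A WSGI application: receives the request environ, returns body chunks."""
    body = b'Hello, WSGI!\n'
    headers = [('Content-Type', 'text/plain'),
               ('Content-Length', str(len(body)))]
    start_response('200 OK', headers)
    return [body]


def uppercase_middleware(app):
    """Middleware (illustrative): wraps an app and upper-cases its body."""
    def wrapped(environ, start_response):
        return [chunk.upper() for chunk in app(environ, start_response)]
    return wrapped


# Chaining: the server only ever sees the outermost callable.
application = uppercase_middleware(hello_app)
```

Any WSGI server can be pointed at application; the wrapped hello_app does not need to know that middleware sits in front of it.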

For more, see Learn about WSGI.

1.2 Learn about WSGI

• WSGI Tutorial by Clodoaldo Neto

• WSGI Explorations in Python by Mike Orr

• An Introduction to the Python Web Server Gateway Interface (WSGI) by Titus Brown

• A Do-It-Yourself Framework by Ian Bicking

• URL Parsing with WSGI by Ian Bicking

• WSGI and WSGI Middleware is Easy by Ben Bangert

• WSGI - Gateway or Glue by Mark Rees (particularly good as a starting point)

• Mix and match Web components with Python WSGI by Uche Ogbuji

• ‘Hello World with WSGI’ and WSGI Middleware by Rufus Pollock

• Getting started with WSGI by Armin Ronacher

• Why so many Python web frameworks? by Joe Gregorio (outlines the creation of a web framework using several WSGI-based tools)

• Introducing WSGI: Python’s Secret Web Weapon by James Gardner [xml2006-09]

• Introducing WSGI: Python’s Secret Web Weapon, Part Two by James Gardner [xml2006-10]


• test.wsgi, a WSGI test app that shows whether your WSGI environment is working (and also outputs some interesting information such as the Python version, sys.path, and the WSGI environment). It can be used directly with mod_wsgi and easily with all other WSGI servers. When started directly from the command line, it tries to use wsgiref’s simple server to serve the application.

1.3 Frameworks that run on WSGI

This is an alphabetic list of frameworks known to support WSGI. The level and nature of their support sometimes varies, as do the APIs they provide. The descriptions here focus on that, and not the flavor of the frameworks themselves. If you want to know more, follow the links!

Note: Some frameworks really only support using pluggable WSGI servers, which means you get a number of options from HTTP, FastCGI, SCGI, threaded, forking, etc. However, not all such frameworks live well alongside other frameworks in the same process, or may require extra configuration. This is what is meant by noting when a framework supports WSGI servers, vs. a framework that supports a greater number of WSGI compositions, especially the kind of things noted in Middleware and libraries for WSGI.

Please feel free to expand on the list, the descriptions, or to make corrections.

appier Appier is an object-oriented Python web framework built for super fast app development. It’s as lightweight as possible, but not too lightweight. It gives you the power of bigger frameworks, without their complexity.

bobo Bobo is a light-weight framework. Its goal is to be easy to use and remember.

Bottle Bottle is a fast and simple micro-framework for small web-applications. It offers request dispatching (Routes) with URL parameter support, templates, key/value databases, a built-in HTTP server, and adapters for many third-party WSGI/HTTP servers and template engines. All in a single file and with no dependencies other than the Python Standard Library.

CherryPy CherryPy is a pythonic, object-oriented web development framework. Includes support for WSGI servers. CherryPy 3 includes better support for living alongside other WSGI frameworks, applications, and middleware.

Django Includes support for WSGI servers

Falcon Falcon is a high-performance Python framework for building cloud APIs. It encourages the REST architectural style, and tries to do as little as possible while remaining highly effective.

Flask Flask is a microframework for Python based on Werkzeug, Jinja 2 and good intentions.

It inherits its high WSGI usage and compliance from Werkzeug.

notmm The notmm toolkit is a fork of Django that doesn’t get in your way. Features include improved WSGI support (Paste), SQLAlchemy, and very few developers! ;-)

PoorWSGI Poor WSGI for Python is a light WSGI connector with URI routing between the WSGI server and your application. It has a mod_python-compatible request object, which is passed to all URI or HTTP state handlers.

Pycnic Pycnic is a minimalist JSON API oriented framework for Python 2.7 and 3.x. It provides routing, cookies, and JSON error handling, while maintaining a small codebase.

Pyramid A merger of the Pylons and repoze.bfg projects, Pyramid is a minimalist web framework aiming at composability and making developers pay only for what they use.

QWeb Another WSGI framework (not sure what the distinguishing features are)

repoze.zope2 A module that implements an analogue of the Zope 2 ZPublisher, with some major simplifications and cleanups. Its core mission is to allow publishing existing Zope2 applications in a WSGI environment that externalizes some of the features of “classic” Zope2 into middleware.


TurboGears Database-driven app in minutes; inherits its WSGI support from CherryPy.

web.py Makes web apps. A small RESTful library.

web2py A full-stack framework that includes its own Database Abstraction Layer (with support for SQLite, MySQL, PostgreSQL, MSSQL, DB2, Informix, Oracle, FireBase, Ingres and Google App Engine), its own template language, and a web-based IDE. web2py itself is a WSGI app. Not related to web.py.

WebCore A nanoframework (only a few hundred lines of code) offering an entry_points-based dependency graphing extension system, MVC separation, reusable namespaces, and universal URL dispatch protocol with tight WebOb integration and natural Python semantics.

weblayer weblayer is a lightweight, componentised package for writing WSGI applications.

Zope 3 The venerable Python web framework, recreated anew in Zope 3, and now a WSGI application. It seems to have some WSGI bits deep inside the publisher, but they aren’t really documented at this time.

1.3.1 Deprecated Systems

These systems still exist but got replaced by others or are unmaintained.

Clever Harold Clever Harold is an ambitious web framework. It has many features for rapid, reusable, and reliable web application construction. Clever Harold is a complete WSGI framework. To build an application, you pick and choose the servers and components that fit your needs.

Colubrid Colubrid is a WSGI publisher which simplifies Python web development. Colubrid is not a framework :-) although some people like the idea of having found a framework in Colubrid. All Colubrid does for you is parse form data, URL parameters, and cookies, and provide a URL dispatcher. Colubrid was replaced by Werkzeug.

Nettri Nettri is a newcomer to the Python world. It is under heavy development. Features include a CMS, its own template engine, modules, and more to come.

Paste WebKit An implementation of the Webware servlet API using Paste infrastructure and WSGI.

pycoon Pythonic web development framework based on XML pipelines and WSGI

Pylons Full-stack Python web development framework combining the very best from the worlds of Ruby, Python and Perl.

Pylons has been superseded by Pyramid.

repoze.bfg A Python WSGI-compliant web framework inspired by Zope, Pylons, and Django with built-in security and templating.

repoze.bfg was renamed Pyramid and moved under the Pylons project.

RhubarbTart A pure-WSGI dispatcher and simple framework, inspired by CherryPy.

simpleweb A simple Python WSGI-compliant web framework inspired by Django, TurboGears, and web.py.

skunk.web A totally WSGI-ified version of SkunkWeb.

Wareweb A rethinking of the Webware/WebKit servlet model, in a pure-WSGI framework. Not used widely.

WebStack WebStack is a package which provides a simple, common API for Python Web applications, allowing such applications to run within many different environments with virtually no changes to application code.

1.4 Servers which support WSGI

This is an alphabetic list of WSGI servers. In some cases these are WSGI-only systems, in other cases a package includes a server.


Please feel free to expand the list or descriptions. Direct links to documentation on how to use the server are especially appreciated.

ajp-wsgi

A threaded/forking WSGI server implemented in C (it embeds a Python interpreter to run the actual application). It communicates with the web server via AJP, and is known to work with mod_jk and mod_proxy_ajp. Also available in an SCGI flavor.

Aspen

A pure-Python web server (using the CherryPy module mentioned next) with three hooks to hang your WSGI on.

cherrypy.wsgiserver

CherryPy’s “high-speed, production ready, thread pooled, generic WSGI server.” Includes SSL support. Supports Transfer-Encoding: chunked. For details on running foreign (non-CherryPy) applications under the CherryPy WSGI server, see WSGI Support. See also the CherryPy wiki ModWSGI page.

chiral.web.httpd

A fast HTTP server supporting WSGI, with extensions for Coroutine-based pages with deeply-integrated COMET support.

cogen.web.wsgi

WSGI server with extensions for coroutine oriented programming.

FAPWS

Fapws is a WSGI binding between Python and libev.

See also: author’s blog, GoogleGroup.

fcgiapp

fcgiapp is a Python wrapper for the C FastCGI SDK. It’s used by PEAK’s FastCGI servers to provide WSGI-over-FastCGI.

flup

Includes threaded and forking versions of servers that support FastCGI, SCGI, and AJP protocols.

gevent-fastcgi

WSGI-over-FastCGI server implemented using the gevent coroutine-based networking library. Supports FastCGI connection multiplexing. Includes adapters for Django and other frameworks that use PasteDeploy.

Gunicorn

WSGI HTTP Server for UNIX, fast clients and nothing else. This is a port of Unicorn to Python and WSGI.

ISAPI-WSGI

An implementation of WSGI for running as an ISAPI extension under IIS.

James

James provides a very simple multi-threaded WSGI server implementation based on the HTTPServer from Python’s standard library. (unmaintained)

Julep

A WSGI Server inspired by Unicorn, written in pure Python.


m2twisted

WSGI server built with M2Crypto and twisted.web2 with some SSL-related tricks. Used with client-side smart cards, and it is also possible to run the HTTPS server with a key in an HSM (like a crypto token).

modjy

Modjy is a Java servlets to WSGI gateway that enables the running of Jython WSGI applications inside Java servlet containers.

mod_wsgi

Python WSGI adapter module for Apache

NWSGI

NWSGI is a .NET implementation of the Python WSGI specification for IronPython and IIS. This makes it easy to run Python web applications on Windows Server. This is a potential alternative to ISAPI + ISAPI_WSGI modules.

netius

Netius is a Python network library that can be used for the rapid creation of asynchronous non-blocking servers and clients. It has no dependencies, it’s cross-platform, and brings some sample netius-powered servers out of the box, namely a production-ready WSGI server.

paste.httpserver

Minimalistic threaded WSGI server built on BaseHTTPServer. Doesn’t support Transfer-Encoding: chunked.

phusion passenger

“Proof of concept” WSGI support since 2008 (1.x), upgraded to “beta” in version 3 (with limitations, e.g. requires Ruby even when unused) and first-class in Passenger 4.

python-fastcgi

python-fastcgi is a lightweight wrapper around the Open Market FastCGI C Library/SDK. It includes threaded and forking WSGI server implementations.

Spawning

twisted.web

A WSGI server based on Twisted Web’s HTTP server (requires Twisted 8.2 or later).

uWSGI

Fast, self-healing, developer-friendly WSGI server, meant for professional deployment and development of Python Web applications.

werkzeug.serving

Werkzeug’s multithreaded and multiprocessed development server. Wraps wsgiref to add a reloader, multiprocessing, static files handling and SSL.

wsgid

Wsgid is a generic WSGI handler for the mongrel2 web server. Wsgid offers a complete daemon environment (start/stop/restart) to your app workers, including automatic re-spawning of processes.

WSGIserver

WSGIserver is a high-speed, production ready, thread pooled, generic WSGI server with SSL support for both Python 2 (2.6 and above) and Python 3 (3.1 and above). WSGIserver is a one-file project with no dependencies.


WSGIUtils

Includes a threaded HTTP server.

wsgiref (Python 3)

Included as part of the standard library since Python 2.5; it includes a simple HTTP server, a CGI server (for running any WSGI application as a CGI script), and a framework for building other servers.

For versions prior to Python 2.5, see wsgiref’s original home.
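As a rough illustration of the bundled server, the following sketch serves an application with wsgiref.simple_server. Both make_server and demo_app ship with wsgiref; the serve helper and its default host and port are assumptions made for this example, and the server is intended for development rather than production use.

```python
from wsgiref.simple_server import make_server, demo_app


def serve(app=demo_app, host='127.0.0.1', port=8000):
    """Serve `app` with wsgiref's development server until interrupted.

    The helper name and the default host/port are illustrative choices.
    """
    with make_server(host, port, app) as httpd:
        print('Serving on http://%s:%d/' % (host, port))
        httpd.serve_forever()
```

Calling serve() blocks and answers requests with demo_app, which prints a greeting and the WSGI environment; pass any other WSGI callable as app to serve it instead.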

1.5 Applications that run on WSGI

Appwsgi

Illustration of ajax applications running on a modwsgi apache server.

FSCSI search

A syntax-aware web search interface for searching large source code file system trees (using the Python 2.6.1 distribution in the example, but it can be configured for any source tree). It is distributed as part of WHIFF and it uses external functionality from Nucular and Pygments.

MoinMoin

MoinMoin is a wiki engine written in Python.

PyAMF

PyAMF provides Action Message Format (AMF) support for Python that is compatible with the Flash Player.

pydap

pydap is a modular and extensible OPeNDAP server, used by the IPCC to serve model output.

Roundup

Roundup is a popular issue tracker which includes WSGI support.

RUM

Rum is a framework to develop CRUD web applications, usually used in the “admin” back-end of a website.

soaplib

A simple, easily extensible SOAP library that provides several useful tools for creating and publishing SOAP web services in Python such as on-demand WSDL generation for published services, a WSGI-compliant web application, support for complex class structures, binary attachments, a simple framework for creating additional serialization mechanisms, and a client library.

Trac

Trac is a popular issue tracker. It includes WSGI support in trac.web.wsgi.

Zine

A blog application written in Python.


1.5.1 Deprecated

BrightContent

Python weblog software built from reusable components. It offers many of the usual features of weblog engines, but its basic operation and plug-in model is based on WSGI. Many existing WSGI components can be plugged directly into Bright Content in order to enhance its functionality. Bright Content also has a set of specialized components for common weblog needs.

Webskine

Webskine is a simple weblog with an AJAX interface.

1.6 Middleware and libraries for WSGI

Barrel Flexible WSGI authentication and authorization tools.

Beaker Lightweight WSGI sessions middleware.

Beaker’s origins lie in the Perl Cache::Cache module, which was ported for use in Myghty. Beaker was then extracted from this code, and has been substantially rewritten and modernized since.

Deliverance Deliverance is a tool to theme HTML, applying a consistent style to applications and static files regardless of how they are implemented, and separating site-wide styling from application-level templating.

hatom2atom hatom2atom provides Python tools for use with hAtom2Atom.xsl. Includes a test runner that uses html/atom file pairs to test for expected output and a WSGI app that acts as a proxy to transform hAtom documents into Atom (that you are looking at now).

lib537.httpy Smooths over WSGI’s worst warts. In addition to calling start_response and returning an iterable, httpy lets you return a string, or return or raise a Response object.

Oort A WSGI-enabled toolkit for creating RDF-driven web apps.

Paste Roughly a framework, though more of a set of tools for frameworks. Provides integration layers with other frameworks like CherryPaste, DjangoPaste and zope.paste.

Paste Deploy Configuration system for WSGI applications, servers, and middleware; both to configure individual components and to compose those components into a single running system.

raptorizemw A layer of WSGI middleware that adds a velociraptor to every page served. Fact: every WSGI app is better with a raptor.

Repoze Repoze is an effort to bring Zope technologies to the larger Python web development community by breaking Zope up into pieces that fit into a WSGI deployment model. This effort also allows existing Zope users to make use of WSGI technologies for development and deployment purposes, notably including the ability to run Zope 2 and Plone applications under WSGI servers.

SchevoWsgi Provides integration between Schevo and WSGI apps.

selector This distribution provides WSGI middleware for “RESTful” mapping of URL paths to WSGI applications.Selector now also comes with components for environ based dispatch and on-the-fly middleware composition.

static This distribution provides an easy way to include static content in your WSGI applications. There is a convenience method for serving files located via pkg_resources. There are also facilities for serving mixed (static and dynamic) content using “magic” file handlers. Python 2.4 string substitution and Kid template support are provided and it is easy to roll your own handlers. Note that this distribution does not require Python 2.4 or Kid unless you want to use those types of templates.


ToscaWidgets A web widget toolkit for Python to aid in the creation, packaging, and distribution of common view elements normally used in the Web. ToscaWidgets is an almost complete rewrite of TurboGears 1.0’s widgets in the spirit of the TurboGears 2.0 philosophy of repackaging its services as independent WSGI components for easier maintenance and reuse in other Python web applications or frameworks.

urlrelay Simple RESTful URL dispatcher that passes HTTP requests to a WSGI application based on matching a URL path regex pattern and, optionally, the HTTP request method.

Werkzeug Werkzeug started as a simple collection of various utilities for WSGI applications and has become one of the most advanced WSGI utility modules. It includes a powerful debugger, full featured request and response objects, HTTP utilities to handle entity tags, cache control headers, HTTP dates, cookie handling, file uploads, a powerful URL routing system and a bunch of community contributed addon modules.

WFront Front-door dispatcher that directs HTTP requests based on “virtual host”. Includes tools to isolate WSGI apps from server deployment details.

WHIFF WSGI HTTP Integrated File System Frames. WHIFF reduces application complexity by providing an infrastructure for managing web application name spaces, a configuration template language for wiring named components into an application, and an applications programmer interface for accessing named components from Python and JavaScript modules.

wsgiakismet Validates form submissions against the Akismet service to verify that they are not comment spam.

wsgiauth WSGI authentication middleware. Supports HTTP basic, digest, IP, HTML form, and OpenID-based authentication.

WSGIFilter A simple framework for doing output-filtering of WSGI content. Works well with WSGIRemote.

wsgiform WSGI middleware for validating and parsing HTML form submissions. Supports automatic escaping of HTML and data sterilization.

WSGI Intercept Redirects Python HTTP calls to an in-process WSGI application. This can allow HTTP API calls (e.g., REST, XML-RPC, etc) without actually touching the network.

wsgilog WSGI logging and event reporting middleware. Supports logging events in WSGI applications to STDOUT, time rotated log files, email, syslog, and web servers. Also supports catching and sending HTML-formatted exception tracebacks to a web browser for debugging.

WSGIRemote Client library for doing RPC-style internal subrequests in a WSGI stack. Also works for doing HTTP RPC requests.

WSGIRewrite Middleware for URL rewriting, uses the same syntax as Apache’s mod_rewrite.

wsgiserialize Object serialization middleware for WSGI. Supported object serialization formats include: XML-RPC, JSON, YAML, marshal, and pickle.

wsgistate Session, HTTP cache control, and caching middleware for WSGI. Sessions are flup-compatible. Supports memory, filesystem, database, and memcached based backends.

wsgi-statsd WSGI middleware that provides an easy way to time all requests and report to statsd. Measurement key names are automatically generated.

WSGIUtils Includes a simple WSGI application (wsgiAdaptor) that provides basic authentication, signed cookies and persistent sessions.

wsgiview Turns any TurboGears/Buffet template plug-ins into WSGI middleware.

wsgize WSGI without the WSGI. Provides middleware for WSGI-enabling Python callables including:

• Middleware that makes non-WSGI Python functions, callable classes, or methods into WSGI applications

• Middleware that automatically handles generating WSGI-compliant HTTP response codes, headers, and compliant iterators


• An HTTP response generator

• A secondary WSGI dispatcher

yaro This distribution provides Yet Another Request Object (for WSGI) in a way that is intended to be simple and useful for web developers who don’t want to have to know a lot about WSGI to get the job done. It’s also a handy convenience for those who do like to get under the hood but would be happy to eliminate some boilerplate without the encumbrance of some all-singing-all-dancing framework.

1.6.1 Deprecated

AuthKit AuthKit is an authentication and authorization toolkit for WSGI applications and frameworks.

The authentication middleware part is essentially an extension of paste.auth, and there is an adaptor module providing support for Pylons, although it works with all WSGI apps.

memento This distribution provides code reloading middleware for use with your WSGI applications. Upon receiving each request, it forgets everything that it has imported since the last request so that it is imported all over again. The concept was inspired by the RollBackImporter used by Steve Purcell in PyUnit.

webstring webstring is a template engine for programmers whose favorite template language is Python. webstring can be used to generate any text format from a template with the additional advantage of advanced XML and HTML templating using the lxml and cElementTree libraries.

WSGIOverlay Application-neutral macro templating language. Seems to be superseded by Deliverance.

wsgixml WSGI middleware modules for XML processing

1.7 Testing tools for WSGI

Any HTTP-based testing system can be used with WSGI applications.

Obviously any HTTP testing system can test any HTTP application.

However, some testing frameworks work more intimately with WSGI, and provide the ability to call WSGI applications in a controlled environment, with tracebacks and full use of debugging tools.
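To make the idea of calling an application in a controlled environment concrete, here is a minimal sketch that invokes a WSGI callable in-process, without any server or network. It uses only wsgiref.util.setup_testing_defaults from the standard library; the helper call_app and the application sample_app are hypothetical names for this illustration and are not part of any of the tools listed here.

```python
from wsgiref.util import setup_testing_defaults


def call_app(app, path='/', method='GET'):
    """Call a WSGI app directly and return (status, headers, body)."""
    environ = {'SCRIPT_NAME': '', 'PATH_INFO': path, 'REQUEST_METHOD': method}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    result = app(environ, start_response)
    try:
        body = b''.join(result)
    finally:
        # PEP 3333: call close() on the iterable if it provides one.
        if hasattr(result, 'close'):
            result.close()
    return captured['status'], dict(captured['headers']), body


def sample_app(environ, start_response):
    """Illustrative application under test."""
    body = ('You asked for %s' % environ['PATH_INFO']).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    return [body]


status, headers, body = call_app(sample_app, '/hello')
```

Because the application runs in the calling process, an exception raised inside it surfaces as an ordinary traceback and can be inspected with a debugger.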

WSGI Intercept

Intercepts normal Python calls to httplib, and redirects them to a WSGI application running in-process. Any testing tools written in Python can be made to test WSGI applications in-process.

Twill

See Testing WSGI Apps with twill for a description of the specifics on plugging these together. WSGI Intercept was originally written for Twill.

WebTest

Extraction of paste.fixture.TestApp, rewriting portions to use WebOb.

Allows for testing WSGI applications without having to start a WSGI server.

cherrypy.test.webtest

Extensions to unittest for web frameworks.

webunit

Unit test your websites with code that acts like a web browser.

zope.testbrowser


An easy to use programmatic web browser with special focus on testing. Used in Zope 3, but not Zope specific.

1.8 Presentations about WSGI

1.8.1 Videos

ReUsable Web Components with Python and Future Python Web Development (Google TechTalk, 2006, Ben Bangert)

WSGI: Working together to solve the web’s problems (PyCon 2011, panel)

1.8.2 Slide decks

Developing Applications with the Web Server Gateway Interface (EuroPython 2006, James Gardner)

Introduction to Web Programming with WSGI (EuroPython 2007, Michele Simionato)

1.9 Specifications related to WSGI

This page holds specifications (proposed, accepted, and withdrawn) that build on WSGI.

1.9.1 About these specifications

These specifications are written up here and discussed on WEB-SIG. Once accepted, these can all use the wsgiorg. prefix for their keys. Until accepted, please use x-wsgiorg. – this is primarily so that people who implement the specification before it is accepted will not leave out-of-spec implementations around (except the obvious ones due to the x-).

To be “accepted” the proposal should have certain qualities:

1. The spec won’t change without good reason, so you can start implementing against it (once it is “approved”).

2. It’s useful in multiple contexts; if one implementation is all anyone will ever need, then just make your implementation. Feel free to discuss it, but you don’t need anyone’s approval.

3. Some eyes have been on it, and it’s been reviewed by multiple people.

There are certain advantages:

1. Having implemented either side, you can expect that maybe someone will care (either producing or consuming what you are looking for).

2. Someone won’t implement something they think is the same, but isn’t, because the document specifies the requirements sufficiently.

3. New proposals will take old proposals into account, and so they shouldn’t overlap or repeat their purposes.

There’s no particular process for a proposal to become accepted. Someone else should like your proposal (+1, not just +0), and probably no one should be opposed (no -1’s).

Unless noted otherwise, everything here can be assumed to be public domain (in keeping with the purpose of posting material here).


1.9.2 Accepted

Where to put information parsed out of the request path

Title wsgiorg.routing_args

Author Ian Bicking <[email protected]>

Discussions-To Python Web-SIG <[email protected]>

Status Accepted

Created 21-Oct-2006

Contents

• Where to put information parsed out of the request path

– Abstract

– Rationale

– Specification

– Types

– Example

Abstract

This proposes a new standard environment key environ['wsgiorg.routing_args'] to represent the results of more complicated URL parsing strategies.

Rationale

WSGI currently specifies the meaning of SCRIPT_NAME and PATH_INFO, which allows generic prefix-based dispatchers to be created. These dispatchers can work with any WSGI application that respects the meaning of these two variables. The basic meaning of SCRIPT_NAME is the portion of the path that has been consumed, and PATH_INFO is the portion of the path left to the application.

Using only these two variables, more complex dispatchers cannot represent the information they pull out of the request path. This specification simply defines a place where such dispatchers can put their information: wsgiorg.routing_args.
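The prefix-based dispatch described in this rationale can be sketched with wsgiref.util.shift_path_info, which moves one path segment from PATH_INFO to SCRIPT_NAME. This is only an illustration of the mechanism: make_prefix_dispatcher, show_paths, and site_app are hypothetical names, and the mapping of segment names to applications is an assumption of the example.

```python
from wsgiref.util import shift_path_info


def make_prefix_dispatcher(apps):
    """Return a WSGI app that dispatches on the first path segment.

    `apps` maps a segment name (e.g. 'blog') to a WSGI application.
    """
    def dispatcher(environ, start_response):
        # Moves one segment from PATH_INFO to SCRIPT_NAME and returns it.
        name = shift_path_info(environ)
        app = apps.get(name)
        if app is None:
            start_response('404 Not Found', [('Content-Type', 'text/plain')])
            return [b'Not found']
        return app(environ, start_response)
    return dispatcher


def show_paths(environ, start_response):
    """Illustrative target app: reports what the dispatcher left for it."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    text = 'SCRIPT_NAME=%(SCRIPT_NAME)s PATH_INFO=%(PATH_INFO)s' % environ
    return [text.encode('utf-8')]


environ = {'SCRIPT_NAME': '', 'PATH_INFO': '/blog/2005/10'}
site_app = make_prefix_dispatcher({'blog': show_paths})
result = site_app(environ, lambda status, headers: None)
```

The target application only sees the consumed prefix in SCRIPT_NAME and the remainder in PATH_INFO; there is no slot for values such as a parsed year or month, which is the gap wsgiorg.routing_args fills.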

Specification

This specification defines a new key that can go in the WSGI environment, wsgiorg.routing_args. This key is optional.

If a dispatcher (like routes or selector) pulls named information out of the portion of the request path it parses, it can put that information into environ['wsgiorg.routing_args']. routing_args must be a two-tuple of (positional_args, named_args), where positional_args is a sequence of arguments that were captured positionally, and named_args is a dictionary of the arguments that were given names.


Not all kinds of dispatchers will produce both positional and named arguments – some may only be capable of producing one or the other. Similarly, not all consumers will know what to do with both positional and named arguments. Implementors putting together producers and consumers of wsgiorg.routing_args will have to choose combinations that work for their combination of pieces. Dispatchers that do not produce one of these items must put in an empty tuple/list or empty dictionary for the missing item.

The values in wsgiorg.routing_args need not be strings (except for the keys of named_args). For instance, a dispatcher is allowed to parse /archive/2005/10/01 into ((), {'date': datetime.date(2005, 10, 1)}).

Portions of the path that have been parsed should still be moved to SCRIPT_NAME (and removed from PATH_INFO).
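The rules above can be illustrated with a small producer that stores a non-string value, leaves the positional part empty, and moves the parsed portion of the path to SCRIPT_NAME. This is a sketch under assumptions: ARCHIVE_RE and add_date_routing_args are hypothetical names, and an invalid date such as month 13 would raise ValueError, which a real dispatcher would need to handle.

```python
import datetime
import re

# Hypothetical pattern for paths such as /archive/2005/10/01
ARCHIVE_RE = re.compile(r'/archive/(\d{4})/(\d{2})/(\d{2})')


def add_date_routing_args(environ):
    """Parse an archive path into wsgiorg.routing_args; return True on a match."""
    path_info = environ.get('PATH_INFO', '')
    match = ARCHIVE_RE.match(path_info)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    # No positional arguments are produced, so an empty tuple is stored.
    named_args = {'date': datetime.date(year, month, day)}
    environ['wsgiorg.routing_args'] = ((), named_args)
    # The parsed portion moves from PATH_INFO to SCRIPT_NAME.
    environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + path_info[:match.end()]
    environ['PATH_INFO'] = path_info[match.end():]
    return True


environ = {'SCRIPT_NAME': '', 'PATH_INFO': '/archive/2005/10/01'}
matched = add_date_routing_args(environ)
```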

Types

The objects in (positional_args, named_args) are intended to be usable as func(*positional_args, **named_args). Therefore positional_args must be coercible to a tuple, and named_args must be a dictionary with string keys (str or unicode-ASCII). Python does not allow dictionary-like values for **named_args (except for actual dict objects).
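On the consuming side, the intended func(*positional_args, **named_args) usage might look like the following sketch. The adapter routing_args_adapter and the function show_archive are hypothetical names; the adapter assumes the wrapped function returns text, and it coerces the two items to a tuple and a dict before unpacking.

```python
def routing_args_adapter(func):
    """Wrap a plain function as a WSGI app fed from wsgiorg.routing_args."""
    def app(environ, start_response):
        positional_args, named_args = environ.get(
            'wsgiorg.routing_args', ((), {}))
        # Coerce to the concrete types that * and ** unpacking expect.
        text = func(*tuple(positional_args), **dict(named_args))
        body = text.encode('utf-8')
        start_response('200 OK', [
            ('Content-Type', 'text/plain; charset=utf-8'),
            ('Content-Length', str(len(body))),
        ])
        return [body]
    return app


def show_archive(year, month=None):
    """Illustrative view function; receives the routed arguments by name."""
    if month is None:
        return 'Archive for %s' % year
    return 'Archive for %s-%s' % (year, month)


archive_view = routing_args_adapter(show_archive)
environ = {'wsgiorg.routing_args': ((), {'year': '2005', 'month': '10'})}
body = b''.join(archive_view(environ, lambda status, headers: None))
```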

Example

This example is a dispatcher that is given regular expressions and matching applications. It checks each regular expression in turn, and when one matches it moves the named groups into wsgiorg.routing_args and dispatches to the associated application.

import re

class RegexDispatch(object):

    def __init__(self, patterns):
        self.patterns = patterns

    def __call__(self, environ, start_response):
        script_name = environ.get('SCRIPT_NAME', '')
        path_info = environ.get('PATH_INFO', '')
        for regex, application in self.patterns:
            match = regex.match(path_info)
            if not match:
                continue
            extra_path_info = path_info[match.end():]
            if extra_path_info and not extra_path_info.startswith('/'):
                # Not a very good match
                continue
            pos_args = match.groups()
            named_args = match.groupdict()
            cur_pos, cur_named = environ.get('wsgiorg.routing_args', ((), {}))
            new_pos = list(cur_pos) + list(pos_args)
            new_named = cur_named.copy()
            new_named.update(named_args)
            environ['wsgiorg.routing_args'] = (new_pos, new_named)
            environ['SCRIPT_NAME'] = script_name + path_info[:match.end()]
            environ['PATH_INFO'] = extra_path_info
            return application(environ, start_response)
        return self.not_found(environ, start_response)

    def not_found(self, environ, start_response):
        start_response('404 Not Found', [('Content-type', 'text/plain')])
        # The response body must be bytes under PEP 3333.
        return [b'Not found']

# archive_app and view_article are WSGI applications defined elsewhere.
dispatch_app = RegexDispatch([
    (re.compile(r'/archive/(?P<year>\d{4})/$'), archive_app),
    (re.compile(r'/archive/(?P<year>\d{4})/(?P<month>\d{2})/$'),
     archive_app),
    (re.compile(r'/archive/(?P<year>\d{4})/(?P<month>\d{2})/(?P<article_id>\d+)$'),
     view_article),
])

1.9.3 Proposed

Waiting for File Descriptor Events

Title Waiting for File Descriptor Events

Author Christopher Stawarz <[email protected]>

Discussions-To Python Web-SIG <[email protected]>

Status Proposed

Created 11-May-2008

Contents

• Waiting for File Descriptor Events

– Abstract

– Rationale

– Specification

* Handling of the Input Stream

– Examples

– Problems

– Other Possibilities

– Open Issues

Abstract

This specification defines a set of extensions that allow a WSGI application to suspend its execution until an event occurs on a specified file descriptor.

Rationale

The architecture of asynchronous (aka event driven) servers requires all I/O operations, including both interprocess and network communication, to be non-blocking. For a WSGI-compliant server, this requirement extends to all applications run on the server. However, the WSGI specification does not provide sufficient facilities for an application to ensure that its I/O is non-blocking. Specifically, it lacks a mechanism by which an application can suspend its execution until an arbitrary file descriptor (such as one belonging to a socket or pipe opened by the application) is ready for reading or writing. This specification defines a standard interface by which servers can provide such a mechanism to applications.

Specification

This specification introduces three new variables to the WSGI environment: x-wsgiorg.fdevent.readable, x-wsgiorg.fdevent.writable, and x-wsgiorg.fdevent.timeout.

The variables x-wsgiorg.fdevent.readable and x-wsgiorg.fdevent.writable are callable objects that accept two positional arguments, one required and one optional. In the following description, these arguments are given the names fd and timeout, but they are not required to have these names, and the application must invoke the callables using positional arguments.

The first argument, fd, is either an integer representing a file descriptor or an object with a fileno method that returns such an integer. The set of acceptable file descriptors is defined to be those accepted by select.select. (Note that this set is platform dependent: only sockets are allowed on Windows, whereas sockets, pipes, and files are acceptable on Unix-like systems.) The second, optional argument, timeout, is either None or a floating-point value in seconds. If omitted, it defaults to None.

When called, x-wsgiorg.fdevent.readable and x-wsgiorg.fdevent.writable return the empty string (''), which must be yielded by the application iterable to the server. (The result of calling x-wsgiorg.fdevent.readable or x-wsgiorg.fdevent.writable and yielding a non-empty string, or making multiple calls to x-wsgiorg.fdevent.readable and/or x-wsgiorg.fdevent.writable before yielding the empty string, is undefined.) The server then suspends execution of the application until one of the following conditions is met:

• The specified file descriptor is ready for reading (if the application called x-wsgiorg.fdevent.readable) or writing (if the application called x-wsgiorg.fdevent.writable).

• timeout seconds have elapsed without the desired file descriptor becoming readable (if the application called x-wsgiorg.fdevent.readable) or writable (if the application called x-wsgiorg.fdevent.writable), unless the value of timeout is None, in which case the wait will never time out.

• The server detects an error or “exceptional” condition (such as out-of-band data) on the file descriptor.

Put another way, if the application calls x-wsgiorg.fdevent.readable and yields the empty string, it will be suspended until select.select([fd],[],[fd],timeout) would return. If the application calls x-wsgiorg.fdevent.writable and yields the empty string, it will be suspended until select.select([],[fd],[fd],timeout) would return.

The variable x-wsgiorg.fdevent.timeout is an object whose truth value can be changed by the server. (For example, it could be a list instance, whose truth value is false when empty, true otherwise.) If timeout seconds elapse without the desired file descriptor event occurring, x-wsgiorg.fdevent.timeout will be true when the application resumes; otherwise, it will be false. The truth value of x-wsgiorg.fdevent.timeout when the application is first started or after it yields each response-body string is undefined.

The server may use any technique it desires to detect events on an application’s file descriptors. (Most likely, it will add them to the same event loop that it uses for accepting new client connections, receiving requests, and sending responses.)
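The wait conditions and the list-based timeout object described above can be sketched in a few lines. This is only an illustration, written for Python 3, and not part of the specification: the names wait_for_fd, timeout_flag, and set_timeout_flag are hypothetical, and a real asynchronous server would register the descriptor with its event loop rather than block in select.select.

```python
import select
import socket


def wait_for_fd(fd, mode, timeout=None):
    """Block until fd is ready (hypothetical helper).

    mode is 'readable' or 'writable'. Returns True if the wait timed out.
    """
    if hasattr(fd, 'fileno'):
        fd = fd.fileno()
    if mode == 'readable':
        ready = select.select([fd], [], [fd], timeout)
    else:
        ready = select.select([], [fd], [fd], timeout)
    # select.select returns three empty lists when the timeout elapses.
    return ready == ([], [], [])


# A list works as the timeout object: false when empty, true otherwise.
timeout_flag = []


def set_timeout_flag(timed_out):
    """Update the truth value of timeout_flag in place."""
    del timeout_flag[:]
    if timed_out:
        timeout_flag.append(True)


if __name__ == '__main__':
    a, b = socket.socketpair()
    set_timeout_flag(wait_for_fd(a, 'readable', 0.05))
    print('timed out:', bool(timeout_flag))
    b.send(b'x')
    set_timeout_flag(wait_for_fd(a, 'readable', 0.05))
    print('timed out:', bool(timeout_flag))
```

Mutating the list in place matters here: the application holds a reference to the same object through the environment, so rebinding the name would not change what the application sees.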

Handling of the Input Stream

While technically outside the scope of this specification, the application’s input stream (wsgi.input) is another source of potentially blocking I/O that deserves mention.

The methods provided by the input stream follow the semantics of the corresponding methods of the file class. In particular, each of these methods can invoke the underlying I/O function (in this case, recv on the socket connected to the client) more than once, without giving the application the opportunity to check whether each invocation will block. Although authors of asynchronous servers may be tempted to provide a non-standard input stream that supports on-demand, non-blocking reads, such an input stream would be incompatible with WSGI middleware.

In order to avoid these problems, it is strongly recommended that asynchronous servers pre-read the entire request body (to an in-memory buffer or temporary file) before invoking the application, either by default or as a configurable option. Doing so will ensure that the input stream is compatible with middleware and that reads from it will not block waiting for data from the client.
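The buffering half of that recommendation might look roughly like the following Python 3 sketch. The function name preread_input is hypothetical, and the sketch reads the body with an ordinary blocking read; an asynchronous server would instead collect the bytes through its own event loop before building the environment.

```python
import io


def preread_input(environ):
    """Replace wsgi.input with an in-memory copy of the request body.

    Hypothetical helper: only CONTENT_LENGTH bytes are read, and a
    missing or malformed CONTENT_LENGTH is treated as an empty body.
    """
    try:
        length = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        length = 0
    body = environ['wsgi.input'].read(length) if length > 0 else b''
    # Reads from a BytesIO never wait on the client.
    environ['wsgi.input'] = io.BytesIO(body)
    return environ
```

For large uploads a temporary file (for example tempfile.TemporaryFile) could be substituted for the in-memory buffer, as the text suggests.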

Examples

The following application acts as a proxy to python.org. It uses a pycurl.CurlMulti instance to perform the outgoing HTTP request in a non-blocking fashion. When the CurlMulti.perform() method detects that its next I/O operation would block, it returns control to the application, which then yields until the file descriptor of interest becomes readable or writable as required. If the descriptor is not ready after one second, the application sends a 504 Gateway Timeout response to the client and terminates:

import pycurl
from StringIO import StringIO

def pyorg_proxy(environ, start_response):
    result = StringIO()

    c = pycurl.Curl()
    c.setopt(pycurl.URL, 'http://python.org' + environ['PATH_INFO'])
    c.setopt(pycurl.WRITEFUNCTION, result.write)

    m = pycurl.CurlMulti()
    m.add_handle(c)

    while True:
        while True:
            ret, num_handles = m.perform()
            if ret != pycurl.E_CALL_MULTI_PERFORM:
                break
        if not num_handles:
            break

        read, write, exc = m.fdset()
        if read:
            yield environ['x-wsgiorg.fdevent.readable'](read[0], 1.0)
        else:
            yield environ['x-wsgiorg.fdevent.writable'](write[0], 1.0)

        if environ['x-wsgiorg.fdevent.timeout']:
            msg = 'The request to python.org timed out.'
            start_response('504 Gateway Timeout',
                           [('Content-Type', 'text/plain'),
                            ('Content-Length', str(len(msg)))])
            yield msg
            return

    start_response('200 OK', [('Content-Type', 'application/octet-stream'),
                              ('Content-Length', str(result.len))])
    yield result.getvalue()

The following adapter allows an application that uses the x-wsgiorg.fdevent extensions to run on a server that does not support them, without any modification to the application’s code:

import select

def with_fdevent(application):
    def wrapper(environ, start_response):
        select_args = [None]

        def readable(fd, timeout=None):
            assert (not select_args[0])
            select_args[0] = ([fd], [], [fd], timeout)
            return ''

        def writable(fd, timeout=None):
            assert (not select_args[0])
            select_args[0] = ([], [fd], [fd], timeout)
            return ''

        environ['x-wsgiorg.fdevent.readable'] = readable
        environ['x-wsgiorg.fdevent.writable'] = writable

        timeout = False

        class TimeoutWrapper(object):
            def __nonzero__(self):
                return timeout

        environ['x-wsgiorg.fdevent.timeout'] = TimeoutWrapper()

        for result in application(environ, start_response):
            assert (not (result and select_args[0]))
            if result or (not select_args[0]):
                yield result
            else:
                ready = select.select(*select_args[0])
                timeout = (ready == ([], [], []))
                select_args[0] = None

    return wrapper

Problems

• The empty string yielded by an application after calling x-wsgiorg.fdevent.readable or x-wsgiorg.fdevent.writable must pass through any intervening middleware and be detected by the server. Although WSGI explicitly requires middleware to relay such strings to the server (see Middleware Handling of Block Boundaries), some components may not, making them incompatible with this specification.

Other Possibilities

• To prevent an application that does blocking I/O from blocking the entire server, an asynchronous server could run each instance of the application in a separate thread. However, since asynchronous servers achieve high levels of concurrency by expressly avoiding multithreading, this technique will almost always be unacceptable.

• The greenlet package enables the use of cooperatively-scheduled micro-threads in Python programs, and a WSGI server could potentially use it to pause and resume applications around blocking I/O operations. However, such micro-threading is not part of the Python language or standard library, and some server authors may be unwilling or unable to make use of it.

Open Issues

• Some third-party libraries (such as PycURL) provide non-blocking interfaces that may need to monitor multiple file descriptors for events simultaneously. Since this specification allows an application to wait on only one file descriptor at a time, application authors may find it difficult or impossible to use such libraries, or they may be limited to a subset of the libraries’ capabilities.

Although this specification could be extended to include an interface for waiting on multiple file descriptors, it is unclear whether it would be easy (or even possible) for all servers to implement it. Also, the appropriate behavior for a multi-descriptor wait is not obvious. (Should the application be resumed when a single descriptor is ready? All of them? Some minimum number?)

Authentication for developer-oriented tools

Title Developer Auth

Author Ian Bicking <[email protected]>

Discussions-To Python Web-SIG <[email protected]>

Status Proposed

Created 31-Mar-2008

Contents

• Authentication for developer-oriented tools

– Abstract

– Rationale

– Specification

– Example

– Problems

– Other Possibilities

– Open Issues

– Implementations

Abstract

Many tools can be written for a WSGI stack which should only be accessible to developers. Examples include an interactive debugger in response to sessions, a template system that displays the underlying filenames that created a page, and profiling data. In some cases there are security implications to exposing this data; in other cases it is harmless but undesirable to show this information to normal users. This specification offers a single, simple way to detect if a user should be presented with this information.

Rationale

So far these tools have been controlled by configuration, e.g., debug = True, or --debug on the command line. This works but can be dangerous, as a deployer or developer can forget to turn off tools. Or, if it is controlled through Python code, it can be difficult to enable on a site that wasn’t intended to have the tool on, e.g., if you want to debug a live site because you can’t reproduce a problem in development. Also, configuration doesn’t allow some people to see these development tools while hiding them from other people. A per-request and secure authentication method is more desirable.

This could be implemented using application-specific authentication methods and permission levels. This is undesirable because often debugging is orthogonal to users – you may want to debug a problem only present when a low-permission or anonymous user is visiting the site. Also it is difficult to keep application and debugging permissions coherent, which is probably why this technique is not used by any tools.

Specification

Debugging tools should look for a key x-wsgiorg.developer_user. This will contain some kind of user name. If it is empty or not present, then debugging tools should not activate themselves, or should not expose any information in the browser.

The user name can be used in logging, but all users are considered to have the same permission level (total access). The username must be a str, but its contents are not constrained (an IP address, for example, would be acceptable, or a name and email, with an embedded space).

If a URL is protected except for developers, applications should simply return 403 Forbidden. Seamless login is not part of this specification or its goals. Some systems may be IP-controlled, for example, and no login is possible.
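The 403 Forbidden rule for developer-only URLs can be illustrated with a small wrapper. This is a sketch in Python 3 under the assumptions stated here: developer_only is a hypothetical name, and the wrapped application is assumed to be a tool that should never be reachable by ordinary users.

```python
def developer_only(app):
    """Wrap app so it is served only when a developer user is set.

    Hypothetical helper: an empty or missing x-wsgiorg.developer_user
    key yields 403 Forbidden, with no attempt to start a login.
    """
    def wrapper(environ, start_response):
        if not environ.get('x-wsgiorg.developer_user'):
            body = b'Forbidden'
            start_response('403 Forbidden',
                           [('Content-Type', 'text/plain'),
                            ('Content-Length', str(len(body)))])
            return [body]
        return app(environ, start_response)
    return wrapper
```

The wrapper never inspects the contents of the user name, which matches the rule that every developer has the same (total) level of access.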

Example

This is a simple exception catcher that uses the key:

import sys, traceback

class CatchExceptions(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if not environ.get('x-wsgiorg.developer_user'):
            return self.app(environ, start_response)
        try:
            return self.app(environ, start_response)
        except:
            start_response('500 Server Error', [('content-type', 'text/plain')],
                           sys.exc_info())
            return [traceback.format_exc()]

Here is an IP-restricted middleware that sets the key:

class IPDeveloper(object):
    def __init__(self, app, ips=('127.0.0.1',)):
        self.app = app
        self.ips = ips

    def __call__(self, environ, start_response):
        if environ.get('REMOTE_ADDR') in self.ips:
            environ['x-wsgiorg.developer_user'] = environ['REMOTE_ADDR']
        return self.app(environ, start_response)

Problems

• With security by obscurity in mind, it might be best if login methods weren’t clear. With ease of use in mind, easy logins are best.

• There are no levels of access. Everyone is assumed to have complete access. (You could add another custom key if you want to share extra information between the authentication and application layer.)

• This encourages people to do production deployments with debugging tools enabled.

Other Possibilities

• Configuration

• Conditional middleware composition

• Application login systems

• Some other generalized authentication system (AuthKit, etc).

Open Issues

• Should 401 Authorization Required be returned? Potentially with WWW-Authenticate: x-wsgiorg.developer_user. This would signal to the middleware that a login should occur, which it may or may not ignore (it could translate that to 403 Forbidden). This would make, for example, HTTP Basic authentication doable (since that authentication is per-request, and so you can’t detect if a user already has logged in). But HTTP Basic would probably be inappropriate for many systems, where a page is filtered by authentication rather than blocked.

Implementations

DevAuth implements the authentication portion of this system. Deliverance and Cabochon both use DevAuth for access to backend logging and controls.

DevAuth implements a login form (which uses a cookie) and IP restrictions. This allows developers from selected IP addresses to log in. No links are provided to the login form; instead developers must know the location, or it should be documented in applications using DevAuth. Similarly there’s no way for applications to reject a request and suggest a login; when a user accesses something they are not allowed to access, the applications simply generate 403 Forbidden. This is unlike user-oriented login forms, which are helpful; this is distinctly unhelpful.

Techniques to avoid serializing the input or output when stacking middleware

Title Avoiding Serialization When Stacking Middleware

Author Ian Bicking <[email protected]>

Discussions-To Python Web-SIG <[email protected]>

Status Proposed

Created 06-03-2007


Contents

• Techniques to avoid serializing the input or output when stacking middleware

– Abstract

– Rationale

– Specification

– Example

– Problems

– Other Possibilities

– Open Issues

Abstract

This proposal gives a strategy for avoiding unnecessary serialization and deserialization of request and response bodies. It does so by attaching attributes to wsgi.input and the app_iter, as well as a new environment key x-wsgiorg.want_parsed_response.

Rationale

Output-transforming middleware often has to parse the upstream content, transform it, then serialize it back to a string for output. The original output may have already been in the parsed form that the middleware wanted. Or there may be more middleware that does similar transformations on the same kind of objects.

The same things apply to the parsing of wsgi.input, specifically parsing form data. A similar strategy is presented to avoid unnecessarily reparsing that data.

Specification

WSGI applications (or middleware) can return an app_iter that not only serializes the output, but also has extra attributes. An attribute is given here, app_iter.x_wsgiorg_parsed_response, which is a function/method that takes one argument, the “type” of object that you want to receive. It may return that type of object, or None (meaning it cannot produce that type of object). Consumers should fall back on normal parsing of the response if the method does not exist, or returns None.

Similarly the wsgi.input object may have the same method, with the same meaning.

WSGI applications that want to lazily serialize their output have a problem: they probably cannot calculate Content-Length without doing the actual serialization. Browsers typically want to know about Content-Length, but WSGI middleware seldom cares, since it can just get the content from app_iter regardless of its length. WSGI middleware that will transform the output can set environ['x-wsgiorg.want_parsed_response'] = True to give this hint to the application. Applications are thus encouraged to only lazily serialize their output when that key is present and true. (There is no equivalent concept for wsgi.input.)

The object returned by x_wsgiorg_parsed_response() may be modified in-place by the WSGI middleware using that object. Producers should make a copy if they do not want consumers modifying the object.
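The consumer-side fallback rule (use the parsed object if offered, otherwise parse the serialized body) can be shown in isolation. The following Python 3 sketch uses JSON and plain dict objects purely for illustration; LazyJSON and get_parsed are hypothetical names, and the sketch omits the app_iter.close() handling that real middleware needs.

```python
import json


class LazyJSON(object):
    """Hypothetical app_iter that holds a dict and serializes on demand."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        yield json.dumps(self.data).encode('utf-8')

    def x_wsgiorg_parsed_response(self, wanted_type):
        # Only one "type" is supported; anything else returns None.
        if wanted_type is dict:
            return self.data
        return None


def get_parsed(app_iter, wanted_type, parse):
    """Return the parsed object, falling back to parse(serialized body)."""
    method = getattr(app_iter, 'x_wsgiorg_parsed_response', None)
    if method is not None:
        parsed = method(wanted_type)
        if parsed is not None:
            return parsed
    return parse(b''.join(app_iter))
```

A consumer would call, for example, get_parsed(app_iter, dict, lambda body: json.loads(body.decode('utf-8'))). When the producer supports the method, the very same object is returned, which is why the rule about in-place modification matters.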


Example

Two examples are provided: one for output, and one for input.

The output transformation parses the page with lxml.etree.HTML (from the lxml library) and replaces all <i> tags with <em> tags. First we show the middleware:

import lxml.etree

class EmTagMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        parent_wants_parsed = environ.get('x-wsgiorg.want_parsed_response')
        environ['x-wsgiorg.want_parsed_response'] = True
        written_output = []
        captured_headers = []
        def repl_start_response(status, headers, exc_info=None):
            if exc_info:
                raise exc_info[0], exc_info[1], exc_info[2]
            captured_headers[:] = [status, headers]
            return written_output.append
        app_iter = self.app(environ, repl_start_response)
        parsed = None
        if captured_headers and not written_output:
            method = getattr(app_iter, 'x_wsgiorg_parsed_response', None)
            if method:
                parsed = method(lxml.etree._Element)
        if parsed is None:
            # Have to manually parse, because:
            # a) start_response was called lazily
            # b) the start_response writer was used
            # c) app_iter.x_wsgiorg_parsed_response didn't exist
            # d) that method returned None
            try:
                for item in app_iter:
                    written_output.append(item)
            finally:
                if hasattr(app_iter, 'close'):
                    app_iter.close()
            parsed = self.parse_body(''.join(written_output))
        status, headers = captured_headers
        new_body = self.transform_body(parsed)
        for i in range(len(headers)):
            if headers[i][0].lower() == 'content-length':
                del headers[i]
                break
        if parent_wants_parsed:
            new_app_iter = self.make_app_iter(new_body)
        else:
            serialized_body = serialize(new_body)
            headers.append(('Content-Length', str(len(serialized_body))))
            new_app_iter = [serialized_body]
        # Forward the captured status and the adjusted headers to the server.
        start_response(status, headers)
        return new_app_iter

    def parse_body(self, body):
        return lxml.etree.HTML(body)

    def transform_body(self, root):
        for el in root.xpath('//i'):
            el.tag = 'em'
        return root

    def make_app_iter(self, body):
        return LazyLXML(body)

def serialize(element):
    return lxml.etree.tostring(element)

class LazyLXML(object):
    def __init__(self, body):
        self.body = body
        self.have_yielded = False

    def __iter__(self):
        return self

    def next(self):
        if self.have_yielded:
            raise StopIteration
        self.have_yielded = True
        return serialize(self.body)

    def x_wsgiorg_parsed_response(self, type):
        if type is lxml.etree._Element:
            return self.body
        return None

Here’s a simpler example for parsing normal form inputs in wsgi.input:

import cgi
import urllib
from cStringIO import StringIO

def parse_form(environ):
    content_type = environ.get('CONTENT_TYPE', '')
    assert content_type in ['application/x-www-form-urlencoded', 'multipart/form-data']
    wsgi_input = environ['wsgi.input']
    method = getattr(wsgi_input, 'x_wsgiorg_parsed_response', None)
    if method:
        parsed = method(cgi.FieldStorage)
        if parsed is not None:
            return parsed
    form = cgi.FieldStorage(fp=wsgi_input, environ=environ, keep_blank_values=True)
    environ['wsgi.input'] = FakeFormInput(form)
    return form

class FakeFormInput(object):
    def __init__(self, form):
        self.form = form
        self.serialized = None

    def x_wsgiorg_parsed_response(self, type):
        if type is cgi.FieldStorage:
            return self.form
        return None

    def read(self):
        if self.serialized is None:
            self._serialize()
        return self.serialized.read()

    def readline(self, *args):
        if self.serialized is None:
            self._serialize()
        return self.serialized.readline(*args)

    def readlines(self, *args):
        if self.serialized is None:
            self._serialize()
        return self.serialized.readlines(*args)

    def __iter__(self):
        if self.serialized is None:
            self._serialize()
        return iter(self.serialized)

    def _serialize(self):
        # XXX: Doesn't deal with file uploads, and multipart/form-data generally
        data = urllib.urlencode(self.form.list, True)
        self.serialized = StringIO(data)

Problems

Obviously the code is not simple, but this is the nature of WSGI output-transforming middleware. Ideally a framework of some sort would be used to construct this kind of middleware.

Something that replaces wsgi.input (like the example) may change the CONTENT_LENGTH of the request; normalization alone may change the length, even if the data is the same (e.g., there are multiple ways to urlencode a string). However, there’s no way without actually serializing to determine the proper length. Ideally requests like this should allow simply reading to the end of the object, without needing a CONTENT_LENGTH restriction (this is not true for socket objects). Ideally something like CONTENT_LENGTH="-1" would indicate this situation (simply a missing CONTENT_LENGTH generally means 0). Another option is to set it to 1 and simply return the entire serialized response all at once. cgi.FieldStorage actually protects against this. Or set it to a very very large value, and allow reading past the end (returning ""). This is likely to work with most consumers. I’m not sure what effect -1 will have on different code.

Other Possibilities

• You could simply parse everything every time.

• You could pass data through callbacks in the environment (but this can break non-aware middleware).

• You can make custom methods and keys for each case.

• You can use something other than WSGI.

I think this specification offers advantages over all these options.

Open Issues

Should “type” be the class object? A string describing the type? Things like lxml.etree._Element are a little unclean, since the actual class isn’t a public object (only the factory function lxml.etree.Element()). Also, there are occasionally times when multiple classes implement the same interface.

The boolean x-wsgiorg.want_parsed_response doesn’t really give any idea of what kind of object you want. This is actually something of a problem, because sometimes it’s impossible to give that kind of object. For instance, if you want to transform images you might want the PIL object for the image. But if the response is HTML there’s no way to give this type. Similarly if you are transforming HTML then images don’t mean anything to you, and you probably do want them to come out as normal. And potentially both an image transformer and an HTML transformer are in the stack. Should that key actually hold a list of types that are of interest?

x_wsgiorg_parsed_response() isn’t a very good name for the method on wsgi.input, as it’s not a response.

A very basic description of authentication opportunities in WSGI

Title Simple Authentication

Author Ian Bicking <[email protected]>

Discussions-To Python Web-SIG <[email protected]>

Status Proposed

Created 13-Nov-2006

Contents

• A very basic description of authentication opportunities in WSGI

– Abstract

– Rationale

– Specification

– Example

– Problems

– Other Possibilities

– Open Issues

Abstract

This describes a simple pattern for implementing authentication in WSGI middleware. This does not propose any new features or environment keys; it only describes a baseline recommended practice.

Rationale

Authentication is probably the most common detail that should be abstracted away from an application, as it is a concern most often bound to a deployment.

Specification

There are two components to authentication:

1. Indicating when a request is authenticated, and by whom

2. Responding that authentication is necessary


There are already two conventions for this:

1. Put the username in REMOTE_USER

2. Respond with 401 Unauthorized

Note: Please do not confuse 401 Unauthorized with “permission denied”. Permission denied should be indicated with 403 Forbidden.

REMOTE_USER: This should be the string username of the user, nothing more.

401 Unauthorized: Because middleware is handling the authentication, additional information is not required. You do not (and should not) include a WWW-Authenticate header. The middleware may include that header, or may change the response in some other way to handle the login.
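From the application's side, the two conventions amount to very little code. The following Python 3 sketch shows an application that relies entirely on middleware for the login mechanics; protected_app is a hypothetical name and the response bodies are placeholders.

```python
def protected_app(environ, start_response):
    """Greet the authenticated user, or ask the middleware for a login."""
    user = environ.get('REMOTE_USER')
    if not user:
        # No WWW-Authenticate header here: the authentication middleware
        # decides how to turn this 401 into an actual login challenge.
        body = b'Authentication required'
        start_response('401 Unauthorized',
                       [('Content-Type', 'text/plain'),
                        ('Content-Length', str(len(body)))])
        return [body]
    body = ('Hello, %s' % user).encode('utf-8')
    start_response('200 OK',
                   [('Content-Type', 'text/plain; charset=utf-8'),
                    ('Content-Length', str(len(body)))])
    return [body]
```

Because the application never names an authentication scheme, the same code can sit behind HTTP Basic, a cookie-based login form, or any other middleware that sets REMOTE_USER.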

Example

The first example implements simple HTTP Basic authentication:

class HTTPBasic(object):

    def __init__(self, app, user_database, realm='Website'):
        self.app = app
        self.user_database = user_database
        self.realm = realm

    def __call__(self, environ, start_response):
        def repl_start_response(status, headers, exc_info=None):
            if status.startswith('401'):
                remove_header(headers, 'WWW-Authenticate')
                headers.append(('WWW-Authenticate', 'Basic realm="%s"' % self.realm))
            # Pass exc_info through so error responses are not silently altered.
            return start_response(status, headers, exc_info)
        auth = environ.get('HTTP_AUTHORIZATION')
        if auth:
            scheme, data = auth.split(None, 1)
            assert scheme.lower() == 'basic'
            username, password = data.decode('base64').split(':', 1)
            if self.user_database.get(username) != password:
                return self.bad_auth(environ, start_response)
            environ['REMOTE_USER'] = username
            del environ['HTTP_AUTHORIZATION']
        return self.app(environ, repl_start_response)

    def bad_auth(self, environ, start_response):
        body = 'Please authenticate'
        headers = [
            ('content-type', 'text/plain'),
            ('content-length', str(len(body))),
            ('WWW-Authenticate', 'Basic realm="%s"' % self.realm)]
        start_response('401 Unauthorized', headers)
        return [body]

def remove_header(headers, name):
    for header in headers:
        if header[0].lower() == name.lower():
            headers.remove(header)
            break

Problems

• Strictly speaking, it is illegal to send a 401 Unauthorized response without the WWW-Authenticate header. If no middleware is installed, most browsers will treat it like a 200 OK. There is also no way to detect if an appropriate middleware is installed.

• This doesn’t give any other information about the user. That information can go in other keys, but that is not addressed in this specification currently.

• Some login methods will redirect the user, and any POST request data will possibly be lost. (Note that a specification like A specification for how to process POST form requests helps address this problem.)

Other Possibilities

• While you can add to this specification, I think it’s the most logical and useful way to do authentication and better efforts can build on this base.

Open Issues

See Problems.

How to disable error catching through the environment

Title x-wsgiorg.throw_errors

Author Ian Bicking <[email protected]>

Discussions-To Python Web-SIG <[email protected]>

Status Proposed

Created 13 Nov 2006

Contents

• How to disable error catching through the environment

– Abstract

– Rationale

– Specification

– Example

– Problems

– Other Possibilities

– Open Issues

– Implementations


Abstract

WSGI applications are generally not supposed to raise exceptions, instead handling their own errors (possibly returning a 500 Server Error response). But in some contexts it is desired that unexpected exceptions be allowed to bubble up. This specification defines a key to set in this circumstance.

Rationale

When in a testing context it is undesirable for an application to handle its own errors. Typically the test framework is better at handling the errors, either through error formatting or by dropping into a debugger like pdb.

Additionally when an exception catcher is installed in a stack, ideally it will be used for all exceptions. This allows for centralized configuration (for example, when emails are sent when errors occur). Dynamically disabling any other exception catchers is often ideal in this situation.

Specification

An exception catcher should check for x-wsgiorg.throw_errors. If it is true, it should not try to catch exceptions. This need only be checked as the application is being entered; it should not be checked later. Applications should not try to set this to affect middleware that wraps them, only to affect applications they may call.

Example

A simple exception catcher:

class ExceptionCatch(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ.get('x-wsgiorg.throw_errors'):
            return self.app(environ, start_response)
        try:
            return self.app(environ, start_response)
        except:
            import sys, traceback, StringIO
            exc_info = sys.exc_info()
            start_response('500 Server Error', [('content-type', 'text/plain')],
                           exc_info=exc_info)
            out = StringIO.StringIO()
            traceback.print_exc(file=out)
            return [out.getvalue()]
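The other half of the pattern is the caller that sets the key, typically a test harness. The following Python 3 sketch is an assumption-laden illustration rather than any tool's actual API: catch_errors is a compact stand-in for an exception catcher, and broken_app and run_for_test are hypothetical names.

```python
import sys
import traceback


def catch_errors(app):
    """Compact exception catcher that honors x-wsgiorg.throw_errors."""
    def wrapper(environ, start_response):
        if environ.get('x-wsgiorg.throw_errors'):
            return app(environ, start_response)
        try:
            return app(environ, start_response)
        except Exception:
            body = traceback.format_exc().encode('utf-8')
            start_response('500 Server Error',
                           [('Content-Type', 'text/plain')],
                           sys.exc_info())
            return [body]
    return wrapper


def broken_app(environ, start_response):
    """Application that always fails, for demonstration."""
    raise ValueError('boom')


def run_for_test(app, environ=None):
    """Call app with error catching disabled; exceptions reach the caller."""
    environ = dict(environ or {})
    environ['x-wsgiorg.throw_errors'] = True
    statuses = []

    def start_response(status, headers, exc_info=None):
        statuses.append(status)

    return statuses, list(app(environ, start_response))
```

With this arrangement, run_for_test(catch_errors(broken_app)) raises ValueError directly, so a test framework can report the original traceback, while an ordinary request through the same stack still gets the 500 response.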

Problems

• In theory an application may know better how to format an error response than the middleware exception catcher. Of course, an application can ignore x-wsgiorg.throw_errors if it thinks it is best (or if it has been explicitly configured to do so).

Other Possibilities

• You can just get the unwrapped application object and test it.

Open Issues

• None I know of

Implementations

WebTest sets a key (paste.throw_errors) during debugging, which allows it to do functional testing of applications that have the paste.exceptions middleware applied to them (that middleware looks for the key and disables itself per-request when it sees it).

Zope 2 has its own flag on the (non-WSGI) request to do this, showing substantial history for this technique. Zope 3 uses something like wsgi.handleErrors in the WSGI environ to the same effect (it shouldn’t be using wsgi., but it does).

1.9.4 Withdrawn

Unicode Support for WSGI

Title WSGI Unicode Handling

Author Armin Ronacher <[email protected]>

Status Rejected

Created 1-Nov-2006

Contents

• Unicode Support for WSGI

– Rejected

– Abstract

– Motivation

– Specification

– Problem

– Implementation

Rejected

This proposal is rejected mainly for these reasons:

• It’s easy enough for applications to do that on their own

• Many applications don’t use unicode objects

• There should be an easier and more flexible way to address that issue

From Ian Bicking:

I’ll add some commentary here, since I was the primary critic (of the limited audience before Armin withdrew this specification). Leaving this proposal here hopefully will be useful to later people considering this problem.

Changing the response app_iter is pretty heavy, and isn’t really an extension to WSGI, it’s a change to the core specification. Current WSGI implementors really expect str responses. When str goes away in Python 3000, they will have to expect bytes responses too, but that’s a relatively straight-forward (though not trivial) change. Dealing with backward compatibility is quite difficult.

The use case I personally see in this is avoiding the confusion and overhead of encoding and decoding responses when there are intermediaries which handle the response in its unicode form. This is not uncommon – for instance, XML processing happens on unicode data, and ideally all text responses should be handled as unicode. Deciding the encoding, and then doing the proper decoding, is not completely trivial (though not terribly hard). It is hard enough that people will and have avoided it, potentially working with str data when that was not correct. Similarly, it is important to send either properly-encoded data, or to change the encoding in the headers. Since encoding information can show up in multiple places (unfortunately) this can also be error-prone.

Despite these problems, sending unencoded data opens up a whole bunch of other problems, and realistically we get the union of all problems because we definitely cannot remove the sending of encoded text data. So everyone has to deal with both cases now, instead of just one case.

Anyway, that’s my take on this. – Ian

Abstract

This specification proposes a possible implementation of unicode support in WSGI. Currently, all WSGI applications have to output str objects instead.

Motivation

Python ships two types of strings subclassing the abstract base class basestring: str and unicode. In Python 3, unicode will replace str and a new class bytes will be introduced (PEP 3100#atomic-types, PEP 3137). Also, today many developers use unicode objects because they support a wider range of characters, and functions like len() still return the correct output, even when using multibyte encodings like utf-8.

But at the moment all WSGI applications have to yield str objects, which requires that users encode their data to a specific encoding by hand. WSGI middlewares don’t know which charset the application is using, etc.

Specification

A possible solution would be a new key in the environ called wsgi.charset. The WSGI gateway would set this to None by default, which means that yielding unicode objects results in an exception. But if the charset is correctly defined, all returned unicode objects get encoded in the defined encoding by the WSGI gateway.

Middlewares could use this value too, to convert incoming form data to unicode automatically so that the application developer doesn’t have to take care of this issue.

Problem

If this environment key is updated by the application, middlewares would still see None as the charset because it’s updated on first iteration only. So an application developer would need to wrap the whole application, including middlewares, afterwards again with a new middleware that updates this key. Another possibility would be that the WSGI gateway provides a configuration value for the charset.

When encoding the output of the WSGI application, the gateway must also get the wsgi.charset key each time a unicode object is found. Caching won’t work because the application must be able to change the charset before each iteration:

    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        environ['wsgi.charset'] = 'utf-8'
        yield u'Hällo Wörld'
        environ['wsgi.charset'] = 'iso-8859-15'
        yield u'Hällo Wörld'

Implementation

Here is a very simple CGI gateway that implements this functionality:

    # Python 2 code (uses the unicode type and the three-argument raise).
    import os
    import sys

    def run_with_cgi(app, charset=None):
        environ = dict(os.environ.items())
        environ['wsgi.charset'] = charset
        environ['wsgi.input'] = sys.stdin
        environ['wsgi.errors'] = sys.stderr
        environ['wsgi.version'] = (1,0)
        environ['wsgi.multithread'] = False
        environ['wsgi.multiprocess'] = True
        environ['wsgi.run_once'] = True

        if environ.get('HTTPS','off').lower() in ('on','1'):
            environ['wsgi.url_scheme'] = 'https'
        else:
            environ['wsgi.url_scheme'] = 'http'

        headers_set = []
        headers_sent = []

        def write(data):
            if not headers_set:
                raise AssertionError('write() before start_response()')
            elif not headers_sent:
                # Before the first output, send the stored headers
                status, response_headers = headers_sent[:] = headers_set
                sys.stdout.write('Status: %s\r\n' % status)
                for header in response_headers:
                    sys.stdout.write('%s: %s\r\n' % header)
                sys.stdout.write('\r\n')

            if isinstance(data, unicode):
                # Look up the charset for every chunk; the application
                # may have changed it since the previous iteration
                charset = environ['wsgi.charset']
                if charset is None:
                    raise AssertionError('application returned unicode without '
                                         'defined charset')
                data = data.encode(charset)
            sys.stdout.write(data)
            sys.stdout.flush()

        def start_response(status,response_headers,exc_info=None):
            if exc_info:
                try:
                    if headers_sent:
                        # Re-raise original exception if headers sent
                        raise exc_info[0], exc_info[1], exc_info[2]
                finally:
                    exc_info = None     # avoid dangling circular ref
            elif headers_set:
                raise AssertionError('Headers already set!')
            headers_set[:] = [status,response_headers]
            return write

        result = app(environ, start_response)
        try:
            for data in result:
                if data:    # don't send headers until body appears
                    write(data)
            if not headers_sent:
                write('')   # send headers now if body was empty
        finally:
            if hasattr(result,'close'):
                result.close()
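The gateway above is written for Python 2. As a rough Python 3 sketch of the same idea (not part of the rejected proposal), the encoding step can also live in a middleware: it encodes each str chunk with whatever environ['wsgi.charset'] holds at that moment and passes bytes through unchanged. The names charset_middleware and demo_app are hypothetical.

```python
def charset_middleware(app):
    """Wrap `app` so that str chunks are encoded using environ['wsgi.charset'].

    Hypothetical helper illustrating the rejected wsgi.charset proposal.
    """
    def wrapped(environ, start_response):
        # The proposal's default: no charset, so text output is an error.
        environ.setdefault('wsgi.charset', None)
        result = app(environ, start_response)
        try:
            for chunk in result:
                if isinstance(chunk, str):
                    # Look the key up for every chunk: the application may
                    # change the charset between iterations.
                    charset = environ['wsgi.charset']
                    if charset is None:
                        raise AssertionError(
                            'application returned text without a defined charset')
                    chunk = chunk.encode(charset)
                yield chunk
        finally:
            if hasattr(result, 'close'):
                result.close()
    return wrapped


def demo_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    environ['wsgi.charset'] = 'utf-8'
    yield 'Hällo'
    environ['wsgi.charset'] = 'iso-8859-15'
    yield 'Hällo'


chunks = list(charset_middleware(demo_app)({}, lambda status, headers: None))
```

The two chunks come out as different byte sequences because the key is read again for each chunk, which is exactly why the text says caching the charset won't work.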

A specification for how to process POST form requests

Title Handling POST forms in WSGI

Author Ian Bicking <[email protected]>

Discussions-To Python Web-SIG <[email protected]>

Status Withdrawn

Created 21-Oct-2006

Contents

• A specification for how to process POST form requests

– Abstract

– Reason for Withdrawal

– Rationale

– Specification

– Query String data

– Middleware

– Problems

– Other Possibilities

– Open Issues

Abstract

This suggests a way that WSGI middleware, applications, and frameworks can access POST form bodies so that there is less contention for the wsgi.input stream.

Reason for Withdrawal

I decided that there were opportunities to decorate the wsgi.input stream itself, and have been pursuing them in WSGIRemote. I may describe that strategy in a specification later.

Rationale

Currently environ['wsgi.input'] points to a stream that represents the body of the HTTP request. Once this stream has been read, it cannot necessarily be read again. It may not have a seek method (none is required by the WSGI specification, and frequently none is provided by WSGI servers).

As a result, any piece of a system that looks at the request body essentially takes ownership of that body, and no one else is able to access it. This is particularly problematic for POST form requests, as many framework pieces expect to have access to this. One notable case is when a request “enters” a traditional web framework which parses the POST form, then “exits” back to WSGI through some framework-specific WSGI gateway.

The specification covers library code that multiple frameworks can implement. This is not functionality that is intended to be added to a WSGI “stack”.
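The ownership problem is easy to reproduce. In this small sketch an io.BytesIO object stands in for a server's input stream (an assumption: real streams often cannot seek at all), and read_body is a hypothetical consumer:

```python
import io

environ = {'wsgi.input': io.BytesIO(b'name=Alice&lang=py')}


def read_body(environ):
    """Read whatever is left in the request body stream."""
    return environ['wsgi.input'].read()


first = read_body(environ)    # the first consumer gets the whole body
second = read_body(environ)   # the stream is exhausted for everyone else
```

Whichever component reads first gets the data; a later reader sees an empty body unless the first one stored its result somewhere shared.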

Specification

This applies when certain requirements of the WSGI environment are met:

    def is_post_request(environ):
        if environ['REQUEST_METHOD'].upper() != 'POST':
            return False
        content_type = environ.get('CONTENT_TYPE', 'application/x-www-form-urlencoded')
        return (content_type.startswith('application/x-www-form-urlencoded')
                or content_type.startswith('multipart/form-data'))

That is, it must be a POST request, and it must be a form request (generally application/x-www-form-urlencoded or, when there are file uploads, multipart/form-data).

When this happens, the form can be parsed by cgi.FieldStorage. The result of this parsing is put in wsgi.post_form as (new_wsgi_input, old_wsgi_input, FieldStorage_object).

The new_wsgi_input can be used to check if an intermediary has replaced the input since wsgi.post_form was calculated. If the input has been changed, the wsgi.post_form data should be discarded. The old_wsgi_input can be used if you want to get access to the original input stream (which may be seekable, and so still useful).

The replacement wsgi.input guards against routines that access the data but don’t conform to this specification. Ideally the replacement will act like the original wsgi.input (producing the same data), but if not it should raise an exception. The input should not block or produce inaccurate data.

    import cgi

    def get_post_form(environ):
        assert is_post_request(environ)
        input = environ['wsgi.input']
        post_form = environ.get('wsgi.post_form')
        if (post_form is not None
            and post_form[0] is input):
            # Already parsed, and wsgi.input has not been replaced since
            return post_form[2]
        # This must be done to avoid a bug in cgi.FieldStorage
        environ.setdefault('QUERY_STRING', '')
        fs = cgi.FieldStorage(fp=input,
                              environ=environ,
                              keep_blank_values=1)
        # InputProcessed defines no __init__, so it takes no arguments
        new_input = InputProcessed()
        post_form = (new_input, input, fs)
        environ['wsgi.post_form'] = post_form
        environ['wsgi.input'] = new_input
        return fs

    class InputProcessed(object):
        def read(self, *args):
            raise EOFError('The wsgi.input stream has already been consumed')
        readline = readlines = __iter__ = read

By using this routine, multiple consumers can parse a POST form, accessing the form data in any order (later consumers will get the already-parsed data).
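The routine above depends on cgi.FieldStorage, and the cgi module was removed from the standard library in Python 3.13. The following is a minimal Python 3 sketch of the same caching scheme that swaps in urllib.parse.parse_qs as the parser. That substitution is an assumption of this example, and it only covers application/x-www-form-urlencoded bodies, not multipart/form-data.

```python
import io
from urllib.parse import parse_qs


class InputProcessed:
    """Stand-in for wsgi.input once the form has been parsed."""

    def read(self, *args):
        raise EOFError('The wsgi.input stream has already been consumed')

    readline = readlines = __iter__ = read


def get_post_form(environ):
    """Return the parsed urlencoded form, parsing the body at most once."""
    stream = environ['wsgi.input']
    cached = environ.get('wsgi.post_form')
    if cached is not None and cached[0] is stream:
        return cached[2]          # parsed earlier and the input is untouched
    length = int(environ.get('CONTENT_LENGTH') or 0)
    body = stream.read(length).decode('latin-1')
    form = parse_qs(body, keep_blank_values=True)
    new_input = InputProcessed()
    # Same tuple layout as in the text: (new input, old input, parsed form)
    environ['wsgi.post_form'] = (new_input, stream, form)
    environ['wsgi.input'] = new_input
    return form


body = b'name=Alice&empty='
environ = {
    'REQUEST_METHOD': 'POST',
    'CONTENT_TYPE': 'application/x-www-form-urlencoded',
    'CONTENT_LENGTH': str(len(body)),
    'wsgi.input': io.BytesIO(body),
}
first = get_post_form(environ)
second = get_post_form(environ)   # served from environ['wsgi.post_form']
```

The second call returns the very same object as the first, and any code that bypasses the routine and reads wsgi.input directly gets an EOFError instead of silently empty data.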

Query String data

Note that nothing in this specification touches or applies to the query string (in environ['QUERY_STRING']). This is not parsed as part of the process, and nothing in this specification applies to GET requests, or to the query string which may be present in a POST request.

Middleware

While this proposal makes it more feasible for middleware to access POST form data, it should not be read as encouraging middleware to do so. In particular, no consumer should ever expect that wsgi.post_form is in the request environment. Also, no intermediary should parse the POST form data unless it actually is interested in that data – access should be deferred until there is a real need for the POST data.

Problems

• This specification only works for parsing with cgi.FieldStorage. This is not the only parser possible, though it is the only parser in common usage.

• The API for cgi.FieldStorage is not particularly well defined, so creating compatible parsers is difficult.

• cgi.FieldStorage doesn’t have any unicode handling (it has to be done higher up).

• Ideally middleware should just not access wsgi.input; people can (and have) read this specification as encouraging middleware to do this parsing.

• In an ideal world wsgi.input would stick around, either as a temporary file or as a file that was a lazy serialization of the parsed data.

Other Possibilities

• One of the simplest possibilities is to add this information to environ['wsgi.input'] itself as a separateattribute. E.g.:

    fs = getattr(environ['wsgi.input'], 'cgi_FieldStorage', None)
    if fs is None:
        # parse and replace wsgi.input
        ...

There’s a certain elegance to keeping wsgi.input self-describing and movable.

Open Issues

1. This doesn’t address non-form-submission POST requests. Most of the same issues apply to such requests, except that frameworks tend not to touch the request body in that case. The body may be large, so the actual contents of the request body shouldn’t go in the environment. Perhaps they could go in a temporary file, but this too might be an unnecessary indirection in many cases. Also other kinds of request (like PUT) that have a request body are not covered, for largely the same reason. In both these cases, it is much easier to construct a new wsgi.input that accesses whatever your internal representation of the request body is.

2. Is the tuple of information necessary in wsgi.post_form, or could it just be the FieldStorage instance? Should all the information go in wsgi.input directly?

3. Should wsgi.input be replaced by InputProcessed, or just left as is? Or should we look for code that serializes FieldStorage objects back to parseable strings?

4. Does QUERY_STRING actually have to be set for cgi not to mess up, or is that just an issue with GET requests?

1.9.5 Wanted

These don’t exist yet, but they could. Write one?

• A standard place to put HTTP proxy scheme and host information (e.g., when a server acts as an HTTP proxy the request looks like GET http://hostname.org/path ..., and we don’t have a place to keep http://hostname.org).

• Ben Bangert suggested a simple session standard, focused solely on the session ID (persistence handled elsewhere). This is fairly modest but still useful. This was in an email: http://mail.python.org/pipermail/web-sig/2006-January/001858.html

• Maybe a full session interface built on the session ID standard. This is an API proposed earlier: http://svn.colorstudy.com/home/ianb/proposed_session_interface.py

• Often debugging tools open security holes (for example, paste.evalexception gives you a Python prompt on every exception). Authentication isn’t really the right way to handle it, because debugging might involve logging in as various users. A specification could just define a key that indicates when these debugging tools should be allowed. This might get set by configuration, IP address, a cookie, etc.

• Debugging mode is something that can be used in all sorts of places: to increase verbosity, annotate output pages, display errors in the browser, etc. Having a single key for turning on debugging mode would allow its consumption in lots of places. Not as strict as authenticating.

• Some systems prefer that unexpected exceptions bubble up, like test frameworks. A key could define this case(modelled on paste.throw_errors) and thus disable exception catchers.

• Logging is a tricky situation. The logging module allows for statically setting up logging systems, then configuring them at startup. This often isn’t the best way to set up logging. Putting a logging.Logger instance right in the environment might be better. This requires some design and usage before settling on one spec.

• Request object wrapping the environment.

• Thread-local values are a common technique in web frameworks, allowing global objects or functions to return request-specific information. This pattern could be codified into one core system, using some feedback from existing systems (which have their advantages and flaws).

• Configuration takes fairly common forms, usually a dict of some sort. It could be put somewhere standard.

• Maybe Paste Deploy’s entry points could be standardized. (Paste Deploy itself only consumes those entry points; other consumers are possible, and packages implementing those entry points don’t introduce any dependency on Paste Deploy.)

• A way to extend wsgiref.validate to add more validation, for all these new specs. (Probably this is an implementation, not a spec.)

• A way to describe custom keys, maybe associated with the validation.

• Anchors for doing recursive calls, similar to paste.recursive. (It’s kind of an old module that is more complicated than it needs to be.)

• A place to put a database transaction manager.

• More user-based information than just REMOTE_USER; like wsgiorg.user_info? The basics of this are described in A very basic description of authentication opportunities in WSGI, but it doesn’t cover anything advanced.

These can be written based on specifications/specification_template.
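To illustrate what one of these environ-key conventions could look like in practice, here is a sketch of an exception catcher that steps aside when asked to. The key name x-wsgiorg.throw_errors is borrowed from this document's index and its exact semantics are an assumption here; catch_errors and broken_app are hypothetical names.

```python
import sys


def catch_errors(app):
    """Turn unexpected exceptions into a 500 response, unless told not to."""
    def wrapped(environ, start_response):
        if environ.get('x-wsgiorg.throw_errors'):
            # Something upstream (e.g. a test framework) wants the
            # exception to bubble up untouched.
            return app(environ, start_response)
        try:
            # Only errors raised while calling the app are caught here,
            # not errors raised later while iterating over the response.
            return app(environ, start_response)
        except Exception:
            start_response('500 Internal Server Error',
                           [('Content-Type', 'text/plain')],
                           sys.exc_info())
            return [b'Internal Server Error']
    return wrapped


def broken_app(environ, start_response):
    raise ValueError('boom')


statuses = []


def record_start_response(status, headers, exc_info=None):
    statuses.append(status)


body = catch_errors(broken_app)({}, record_start_response)
```

A single shared key like this lets every exception-catching layer in a stack honor the same switch instead of each inventing its own.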

1.10 Amendments to WSGI 1.0

This page is intended to collect any ideas related to amendments to the original WSGI 1.0 so that it can be marked as ‘Final’.

The purpose of the amendments is to address any mistakes or ambiguities in the 1.0 specification or to change any requirements that in practice could not be implemented for one reason or another. The amendments would also address any differences in how the 1.0 specification should be interpreted for Python 3. See Python 3 for details.

Note that this isn’t about changing the 1.0 specification drastically in any way; that is what Proposals related to WSGI 2.0 specification will be about. You should not, though, construe anything in here as an indication that said change will be made. This is especially the case with Python 3 support as there is a measure of disagreement as to how WSGI should work for Python 3. In other words, you would be unwise to implement any WSGI application or WSGI adapter with information in here as a basis as it could change or simply never be adopted.

The page has been created in response to a discussion on the Python WEB-SIG.

In addition, Graham Dumpleton gives details and clarifications on WSGI 1.0 amendments on his blog.

1.10.1 readline(size)

Currently the specification does not require servers to provide environ['wsgi.input'].readline(size) (the size argument in particular). But cgi.FieldStorage calls readline this way, so in effect it is required.
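For a server whose input stream only offers read(n), a wrapper can supply the missing readline(size). The class below is a hypothetical sketch, not taken from any server; note that it reads ahead in 1024-byte chunks, so on a real socket it would need to be combined with a CONTENT_LENGTH limit to avoid blocking.

```python
import io


class SizedReadline:
    """Add readline(size) to a stream that only offers read(n)."""

    def __init__(self, stream):
        self._stream = stream
        self._buffer = b''

    def read(self, size=-1):
        if size is None or size < 0:
            data, self._buffer = self._buffer + self._stream.read(), b''
            return data
        if len(self._buffer) < size:
            self._buffer += self._stream.read(size - len(self._buffer))
        data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

    def readline(self, size=-1):
        if size is None:
            size = -1
        # Pull data in until a newline is buffered, the size limit is
        # reached, or the underlying stream is exhausted.
        while b'\n' not in self._buffer and (size < 0 or len(self._buffer) < size):
            chunk = self._stream.read(1024)
            if not chunk:
                break
            self._buffer += chunk
        # End of line if there is one, otherwise everything buffered
        end = self._buffer.find(b'\n') + 1 or len(self._buffer)
        if size >= 0:
            end = min(end, size)
        line, self._buffer = self._buffer[:end], self._buffer[end:]
        return line


stream = SizedReadline(io.BytesIO(b'first line\nsecond line\nlast'))
parts = [stream.readline(5), stream.readline(100),
         stream.readline(), stream.readline(), stream.readline()]
```

Here io.BytesIO merely stands in for the limited stream; the wrapper never calls its readline method.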

1.10.2 Python 3

Python 3’s default string type is now unicode, and existing Python 2 strings correspond to bytes. This changes how terms need to be interpreted. From WSGI, Python 3 and Unicode, the following suggested amendments were proposed for Python 3.

• When running under Python 3, applications SHOULD produce bytes output, status line and headers

• When running under Python 3, servers and gateways MUST accept strings as application output, status line or headers, under the existing rules (i.e., s.encode('latin-1') must convert the string to bytes without an exception)

• When running under Python 3, servers MUST provide CGI HTTP variables as strings, decoded from the headers using HTTP standard encodings (i.e. latin-1 + RFC 2047) (Open question: are there any CGI or WSGI variables that should NOT be strings?)

• When running under Python 3, servers MUST make wsgi.input a binary (byte) stream

• When running under Python 3, servers MUST provide a text stream for wsgi.errors

See the mailing list archive for the full discussion of issues.

Note that this doesn’t address any clarifications that may be required around the optional wsgi.file_wrapper extension.

Note that current thinking is that the WSGI adaptor should not worry about RFC 2047.
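The "existing rules" in the second amendment can be sketched as a small normalization step on the server side. The helper name to_bytes is hypothetical; the only rule it encodes is the one quoted above, that s.encode('latin-1') must succeed for any string an application hands over.

```python
def to_bytes(value):
    """Normalize application output, a status line or a header value to bytes."""
    if isinstance(value, bytes):
        return value                 # the form applications SHOULD produce
    # Strings MUST be accepted, provided they fit into latin-1;
    # anything else raises UnicodeEncodeError.
    return value.encode('latin-1')


status = to_bytes('200 OK')
header = to_bytes('text/plain; charset=utf-8')
body = to_bytes('Hällo Wörld'.encode('utf-8'))   # already bytes, unchanged
```

A string such as '\u20ac' (the euro sign) has no latin-1 representation, so under these rules an application has to encode it to bytes itself.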

1.10.3 Errata 1

In the “Specification Details” chapter there is this note:

Note: the application must invoke the start_response() callable before the iterable yields its first body string, so that the server can send the headers before any body content. However, this invocation may be performed by the iterable’s first iteration, so servers must not assume that start_response() has been called before they begin iterating over the iterable.

What’s wrong is that the invocation of start_response may be performed at any iteration of the iterable, as long as the application has yielded only empty strings up to that point.

See http://mail.python.org/pipermail/web-sig/2007-December/003064.html for more info.

• I don’t really think that this is a good assumption to make. I could see how some implementations could allow for this, but strictly speaking, I wouldn’t assume that most implementations would do that. Besides that, what purpose does yielding an empty string serve? For those reasons, I think this is better off left as undefined behavior. –JasonBaker July 1, 2008
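The behavior under discussion can be made concrete with a small sketch. The application below yields an empty chunk before it calls start_response, and the hypothetical driver run only insists on headers once real body data shows up. As the comment above cautions, existing servers are not guaranteed to tolerate this.

```python
def late_app(environ, start_response):
    yield b''                       # nothing decided yet
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield b'Hello world'


def run(app, environ):
    """Tiny driver that only needs the headers at the first non-empty chunk."""
    state = {}

    def start_response(status, headers, exc_info=None):
        state['status'] = status
        state['headers'] = headers

    sent = []
    for chunk in app(environ, start_response):
        if not chunk:
            continue                # empty chunks never force the headers out
        if 'status' not in state:
            raise AssertionError('body data before start_response()')
        sent.append(chunk)
    return state.get('status'), sent


status, sent = run(late_app, {})
```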

1.10.4 When HTTP response headers can be sent

The WSGI spec explicitly states that HTTP response headers must be sent when the application yields the first non-empty string.

However, if a WSGI implementation is allowed to send headers early (not when start_response is called, but when the first string is yielded by the WSGI application, even if empty), then in the case of a HEAD request no content generation is required (assuming, of course, that the WSGI application returns a generator).

See http://mail.python.org/pipermail/web-sig/2007-October/002881.html, http://mail.python.org/pipermail/web-sig/2007-October/002799.html, http://mail.python.org/pipermail/web-sig/2007-October/002803.html and http://mail.python.org/pipermail/web-sig/2007-October/002879.html

That thread is a bit confused.

1.10.5 start_response and error checks

The WSGI spec says that the start_response callable must not actually transmit the response headers. Instead, it must store them.

The problem is that it says nothing about error checking.

See http://mail.python.org/pipermail/web-sig/2007-September/002771.html

1.10.6 Clarification about start_response

What happens if an application calls start_response with an incorrect status line or headers?

Should an implementation consider the function called, so that an application can call it a second time, without the exc_info parameter?

See http://mail.python.org/pipermail/web-sig/2007-October/002887.html

1.10.7 Specify the type of SERVER_PORT

Some implementations currently expect it to be an integer, some a string. Can we please specify one or the other or either? The “URL reconstruction” code snippet in PEP 333 presumes it’s a string, and the reference to the (defunct) CGI spec would seem to imply it should be a string, but it should be explicit.
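Until the type is pinned down, consuming code can protect itself by normalizing the value. The sketch below follows the shape of the URL reconstruction recipe from PEP 3333, with one deliberate deviation that is this example's own assumption: it passes SERVER_PORT through str() so that both integer and string values work.

```python
from urllib.parse import quote


def reconstruct_url(environ):
    """Rebuild the request URL, tolerating an int or str SERVER_PORT."""
    scheme = environ['wsgi.url_scheme']
    url = scheme + '://'
    if environ.get('HTTP_HOST'):
        url += environ['HTTP_HOST']
    else:
        url += environ['SERVER_NAME']
        port = str(environ['SERVER_PORT'])   # normalize 8080 and '8080'
        if (scheme, port) not in (('http', '80'), ('https', '443')):
            url += ':' + port
    url += quote(environ.get('SCRIPT_NAME', ''))
    url += quote(environ.get('PATH_INFO', ''))
    if environ.get('QUERY_STRING'):
        url += '?' + environ['QUERY_STRING']
    return url


base = {
    'wsgi.url_scheme': 'http',
    'SERVER_NAME': 'example.org',
    'SCRIPT_NAME': '/app',
    'PATH_INFO': '/a b',
    'QUERY_STRING': 'x=1',
}
with_int = reconstruct_url(dict(base, SERVER_PORT=8080))
with_str = reconstruct_url(dict(base, SERVER_PORT='8080'))
```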

1.11 Proposals related to WSGI 2.0

This page is intended to collect any ideas related to WSGI 2.0. In particular, any proposed changes to the specification.

Note: What is described here should not be considered a DRAFT for WSGI 2.0. It is only a list of ideas or issues that need to be considered if there ever is enough momentum towards producing an updated WSGI specification. It is quite possible that there may never be an updated specification which embodies the ideas described here. Thus, if you implement any web application interfaces based on the API described here, call it something else; do not call it WSGI 2.0, as no such thing exists.

1.11.1 start_response and write

We could remove start_response and the writer that it implies. This would lead to a signature like:

    def app(environ):
        return '200 OK', [('Content-type', 'text/plain')], ['Hello world']

That is, return a three-tuple of (status, headers, app_iter).

It’s relatively simple to provide adapters to and from this signature to the WSGI 1.0 signature.
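A sketch of such adapters, in both directions, might look like this. The function names are hypothetical, and the second adapter buffers the whole response, which is a simplification: a production adapter would want to stay lazy.

```python
def wsgi1_from_tuple_app(app):
    """Expose a three-tuple application through the WSGI 1.0 signature."""
    def wsgi1_app(environ, start_response):
        status, headers, app_iter = app(environ)
        start_response(status, headers)
        return app_iter
    return wsgi1_app


def tuple_app_from_wsgi1(app):
    """Expose a WSGI 1.0 application through the three-tuple signature."""
    def tuple_app(environ):
        captured = []
        written = []

        def start_response(status, headers, exc_info=None):
            captured[:] = [status, headers]
            return written.append          # stands in for the write() callable

        app_iter = app(environ, start_response)
        try:
            # Consuming the iterable guarantees start_response() has run,
            # even for generators that call it on their first iteration.
            chunks = list(app_iter)
        finally:
            if hasattr(app_iter, 'close'):
                app_iter.close()
        status, headers = captured
        return status, headers, written + chunks
    return tuple_app


def hello(environ):
    return '200 OK', [('Content-type', 'text/plain')], [b'Hello world']


round_trip = tuple_app_from_wsgi1(wsgi1_from_tuple_app(hello))({})
```

The body is bytes here because the sketch targets Python 3; the example in the text predates that distinction.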

1.11.2 Making some keys required

Several keys are optional in WSGI, but required in CGI, in particular SCRIPT_NAME, PATH_INFO and QUERY_STRING. Also REMOTE_ADDR and SERVER_SOFTWARE are supposed to exist, even if empty. All these keys could become required in WSGI.

1.11.3 Unknown-length wsgi.input

There’s no documented way to indicate that there is content in wsgi.input, but the content length is unknown. A value of -1 may work in many situations. A missing CONTENT_LENGTH doesn’t generally work currently (it’s assumed to mean 0 by much code).

This is an issue because chunked transfer encoding on request content can’t be supported properly unless there is a way to indicate that there is data with unknown content length. It is also an issue with a web server or WSGI middleware component that mutates the input stream (e.g. decompression), where it will not know the new content length in advance of mutating the data stream.

Any change in this area also needs to take into consideration the current link between the CGI and WSGI specifications, and whether the CGI requirements (not to read more input data than defined by CONTENT_LENGTH, and that returning an EOF indicator is optional) are really appropriate for WSGI.

For more information see thread: http://mail.python.org/pipermail/web-sig/2007-March/002630.html
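The "assumed to mean 0" behavior comes from a very common reading idiom, sketched below with a hypothetical read_body helper and io.BytesIO standing in for the server's stream:

```python
import io


def read_body(environ):
    """Read the request body the way much existing code does."""
    length = int(environ.get('CONTENT_LENGTH') or 0)   # missing means 0
    return environ['wsgi.input'].read(length)


known = read_body({'CONTENT_LENGTH': '5', 'wsgi.input': io.BytesIO(b'hello')})
unknown = read_body({'wsgi.input': io.BytesIO(b'hello')})   # data is ignored
```

With no CONTENT_LENGTH the helper reads zero bytes even though data is waiting, which is why a chunked request body cannot simply omit the key.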

1.11.4 readline(size)

Currently the specification does not require servers to provide environ['wsgi.input'].readline(size) (the size argument in particular). But cgi.FieldStorage calls readline this way, so in effect it is required.

1.11.5 app_iter and threads

It’s not clear if the app_iter must be used in the same thread as the application. Since the application is blocking,presumably it must be run all in one thread. This should be more explicitly documented.

1.11.6 long response headers

Noted here: http://mail.python.org/pipermail/web-sig/2006-September/002244.html

1.11.7 request trailers and chunked transfer encoding

When using chunked transfer encoding on request content, the RFCs allow there to be request trailers. These are like request headers but come after the final null data chunk. These trailers are only available when the chunked data stream is of finite length and has all been read in, and thus are not available at the time the application is called.

1.11.8 Decoding SCRIPT_NAME/PATH_INFO

Because SCRIPT_NAME and PATH_INFO are decoded in WSGI, there’s no way to distinguish %2F from /.
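The loss of information is easy to demonstrate with urllib.parse.unquote, which performs the same kind of percent-decoding a server applies before filling in PATH_INFO (the example paths are made up):

```python
from urllib.parse import unquote

one_segment = unquote('/files/a%2Fb')    # "a/b" was a single path segment
two_segments = unquote('/files/a/b')     # "a" and "b" were separate segments
```

Both requests end up with the identical decoded path, so the application cannot tell which one the client sent.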

1.11.9 No encoding horrors any more

For an analysis, see: http://www.mail-archive.com/[email protected]/msg02483.html

Can we have that horror removed for wsgi2 apps, please?

A quite easy approach would be to have a set of RAW_* env vars (e.g. RAW_PATH_INFO) that have /Foo%XXBar%YY content (not decoded, plain ASCII like in the HTTP protocol).

That would also solve issues with ? and / (see section above) that are encoded as %XX (and NOT meant as query/path component separators).

Any wsgi1 app can continue to use the wsgi1 env vars; any wsgi2 app can check whether the wsgi2 RAW_* env vars are there and use them (or fall back to using the wsgi1 env vars).
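The check-then-fall-back pattern described here could be sketched as follows. RAW_PATH_INFO is only the proposed key from the text, not something servers are known to provide, and raw_path_info is a hypothetical helper.

```python
from urllib.parse import quote


def raw_path_info(environ):
    """Return the still-encoded path, preferring the proposed RAW_PATH_INFO."""
    raw = environ.get('RAW_PATH_INFO')
    if raw is not None:
        return raw
    # Fallback: re-quote the decoded value. An original %2F is lost here,
    # because PATH_INFO already contains a plain "/" in its place.
    return quote(environ.get('PATH_INFO', ''))


preferred = raw_path_info({'RAW_PATH_INFO': '/Foo%2FBar', 'PATH_INFO': '/Foo/Bar'})
fallback = raw_path_info({'PATH_INFO': '/Foo/Bar baz'})
```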

1.12 Python 3

PEP 3333 aims to resolve issues with Python 3 and WSGI.

This page is intended to collect ideas and proposals about WSGI amendments for Python 3.

See also Amendments to WSGI 1.0

1.12.1 Presentation, at DjangoCon 2010 (by Armin Ronacher)

• Slides: slideshare / scribd

• Video on blip.tv

• Blog post commentary by Armin

• Reinout van Rees’s reaction and commentary

1.12.2 Latest discussions

• Main discussions occur on the WEB-SIG mailing list

• ‘WSGI on Python 3’ thread

• Graham Dumpleton’s 2009 A roadmap for the Python WSGI specification, which describes all the proposals extensively (author: Graham

1.12.3 Proposals

There is a lot of discussion about the type of data (bytes versus unicode) in various places of the specification.

The actual competitors are:

mod_wsgi [Ochtman2010]

all unicode [Ronacher2009]

web3 [McDonough2010]

flat optimized for ease of validation and low cognitive overhead (inputs are native except for the byte stream, all outputs are bytes)

Here is a summary table which outlines the bytes/unicode differences between these proposals.

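+-------------------------+----------+-------------------+--------------------+-----------+------------------+
|                         | WSGI 1.0 | mod_wsgi          | Unicode            | web3      | flat             |
+=========================+==========+===================+====================+===========+==================+
| environ keys            | bytes    | native                                                                |
+-------------------------+----------+-------------------+--------------------+-----------+------------------+
| CGI values              | bytes    | native            | unicode            | bytes     | native (PEP 383) |
+-------------------------+----------+-------------------+--------------------+-----------+------------------+
| SCRIPT_NAME, PATH_INFO, | bytes    | native            | unicode (utf-8)    | bytes     | native (PEP 383) |
| QUERY_STRING            |          |                   |                    |           |                  |
+-------------------------+----------+-------------------+--------------------+-----------+------------------+
| wsgi.url_scheme         | bytes    | native            | unicode            | bytes     | native           |
+-------------------------+----------+-------------------+--------------------+-----------+------------------+
| wsgi.input              | bytes                                                                            |
+-------------------------+----------+-------------------+--------------------+-----------+------------------+
| status line             | bytes    | bytes (or native) | unicode (or bytes) | bytes     | bytes            |
+-------------------------+----------+-------------------+--------------------+-----------+------------------+
| headers                 | bytes    | bytes (or native) | unicode or bytes   | bytes     | bytes            |
+-------------------------+----------+-------------------+--------------------+-----------+------------------+
| response iterable       | bytes    | bytes (or native) | bytes              | bytes     | bytes            |
+-------------------------+----------+-------------------+--------------------+-----------+------------------+
| write() callback        | bytes    | bytes (or native) | (deprecated)       | (removed) | (removed)        |
+-------------------------+----------+-------------------+--------------------+-----------+------------------+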

Notes:

• a native string is the primary string type for a particular Python implementation:

– for Python 2.x this is a byte string,

– for Python 3.x this is a Unicode string

• unless otherwise stated, all unicode strings are decoded using ISO-8859-1

• when SCRIPT_NAME and PATH_INFO are ‘native’ or ‘unicode’, the environment should contain 2 additional values wsgi.script_name and wsgi.path_info which contain raw-bytes values. (Except in the flat proposal, which assumes CGI variables are decoded as utf-8 using PEP 383 surrogateescape encoding, and that the raw bytes can thus be retrieved by re-encoding.)

• details about the mod_wsgi proposal:

– it is already implemented in mod_wsgi 3.0

– almost entirely compatible with current WSGI 1.0 for Python 2

– it runs the WSGI 1.0 ‘Hello World!’ unchanged

• details about the all unicode proposal:

– the SCRIPT_NAME and PATH_INFO will be decoded as UTF-8. If it fails, they are decoded as ISO-8859-1. The name of the successful codec is stored in wsgi.uri_encoding.

– the REQUEST_URI variable is optional and stores the full URI as requested by the client.

• details about the web3 proposal:

– this proposal does not try to be compatible with WSGI 1.0. It targets Python 2.6+ and Python 3.1+.

– all wsgi.* variables are intentionally renamed web3.* in the document.
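Two of the decoding strategies in these notes can be sketched in a few lines. The function names are hypothetical: decode_path mirrors the UTF-8-then-ISO-8859-1 fallback of the all unicode proposal, and decode_flat uses the PEP 383 surrogateescape error handler assumed by the flat proposal.

```python
def decode_path(raw):
    """Decode raw path bytes: UTF-8 first, ISO-8859-1 as the fallback.

    Returns (text, codec_name), mirroring the wsgi.uri_encoding idea.
    """
    try:
        return raw.decode('utf-8'), 'utf-8'
    except UnicodeDecodeError:
        return raw.decode('iso-8859-1'), 'iso-8859-1'


def decode_flat(raw):
    """Decode as in the flat proposal: UTF-8 with PEP 383 surrogateescape."""
    return raw.decode('utf-8', 'surrogateescape')


utf8_result = decode_path('/caf\u00e9'.encode('utf-8'))
latin1_result = decode_path(b'/caf\xe9')          # not valid UTF-8
# Re-encoding with the same error handler recovers the original bytes.
recovered = decode_flat(b'/caf\xe9').encode('utf-8', 'surrogateescape')
```

The surrogateescape round trip is what lets the flat proposal do without separate raw-bytes keys such as wsgi.path_info.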

1.12.4 Draft implementations

• mod_wsgi 3.0+: see the page about Python 3 support

• CherryPy 3.2: see details about CherryPy’s Python 3 WSGI implementation

• Experimental WSGI servers for Python 3

1.13 Definitions of keys and classes

1.13.1 Standard environ keys

REQUEST_METHOD
    The HTTP request method, such as GET or POST. This cannot ever be an empty string, and so is always required.

SCRIPT_NAME
    The initial portion of the request URL’s “path” that corresponds to the application object, so that the application knows its virtual “location”. This may be an empty string, if the application corresponds to the “root” of the server.

PATH_INFO
    The remainder of the request URL’s “path”, designating the virtual “location” of the request’s target within the application. This may be an empty string, if the request URL targets the application root and does not have a trailing slash.

QUERY_STRING
    The portion of the request URL that follows the “?”, if any. May be empty or absent.

CONTENT_TYPE
    The contents of any Content-Type fields in the HTTP request. May be empty or absent.

CONTENT_LENGTH
    The contents of any Content-Length fields in the HTTP request. May be empty or absent.

SERVER_NAME

SERVER_PORT
    When combined with SCRIPT_NAME and PATH_INFO, these variables can be used to complete the URL. Note, however, that HTTP_HOST, if present, should be used in preference to SERVER_NAME for reconstructing the request URL. See the URL Reconstruction section below for more detail. SERVER_NAME and SERVER_PORT can never be empty strings, and so are always required.

SERVER_PROTOCOL
    The version of the protocol the client used to send the request. Typically this will be something like “HTTP/1.0” or “HTTP/1.1” and may be used by the application to determine how to treat any HTTP request headers. (This variable should probably be called REQUEST_PROTOCOL, since it denotes the protocol used in the request, and is not necessarily the protocol that will be used in the server’s response. However, for compatibility with CGI we have to keep the existing name.)

HTTP_ Variables
    Variables corresponding to the client-supplied HTTP request headers (i.e., variables whose names begin with HTTP_). The presence or absence of these variables should correspond with the presence or absence of the appropriate HTTP header in the request.

1.13.2 WSGI environ keys

wsgi.version
    The tuple (1, 0), representing WSGI version 1.0.

wsgi.url_scheme
    A string representing the “scheme” portion of the URL at which the application is being invoked. Normally, this will have the value “http” or “https”, as appropriate.

wsgi.input
    An input stream (file-like object) from which the HTTP request body can be read. (The server or gateway may perform reads on-demand as requested by the application, or it may pre-read the client’s request body and buffer it in-memory or on disk, or use any other technique for providing such an input stream, according to its preference.)

wsgi.errors
    An output stream (file-like object) to which error output can be written, for the purpose of recording program or other errors in a standardized and possibly centralized location. This should be a “text mode” stream; i.e., applications should use “\n” as a line ending, and assume that it will be converted to the correct line ending by the server/gateway.

    For many servers, wsgi.errors will be the server’s main error log. Alternatively, this may be sys.stderr, or a log file of some sort. The server’s documentation should include an explanation of how to configure this or where to find the recorded output. A server or gateway may supply different error streams to different applications, if this is desired.

wsgi.multithread
    This value should evaluate true if the application object may be simultaneously invoked by another thread in the same process, and should evaluate false otherwise.

wsgi.multiprocess
    This value should evaluate true if an equivalent application object may be simultaneously invoked by another process, and should evaluate false otherwise.

wsgi.run_once
    This value should evaluate true if the server or gateway expects (but does not guarantee!) that the application will only be invoked this one time during the life of its containing process. Normally, this will only be true for a gateway based on CGI (or something similar).
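To see these keys with concrete values, the standard library's wsgiref.util.setup_testing_defaults can fill an environ dict with plausible test data. The small application below (describe is a hypothetical name) reports a few of the wsgi.* keys and writes one line to wsgi.errors; the values shown are wsgiref's testing defaults, not what any particular server would supply.

```python
from wsgiref.util import setup_testing_defaults

environ = {}
setup_testing_defaults(environ)   # fills in plausible values for testing


def describe(environ, start_response):
    """Report some of the WSGI-specific environ keys as plain text."""
    lines = ['%s = %r' % (key, environ[key])
             for key in ('wsgi.version', 'wsgi.url_scheme',
                         'wsgi.multithread', 'wsgi.multiprocess',
                         'wsgi.run_once')]
    # wsgi.errors is a text-mode stream, so "\n" is the line ending to use
    environ['wsgi.errors'].write('describe() was called\n')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['\n'.join(lines).encode('latin-1')]


body = describe(environ, lambda status, headers: None)
```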

CHAPTER 2

Contributing

Found a typo? Or some awkward wording? Want to add a link to a presentation, a tutorial or a new (or old and missing) WSGI-related tool? Fixing a dead link?

WSGI.org is open-source and hosted on GitHub; contributions are encouraged and appreciated.

CHAPTER 3

Indices and tables

• genindex

• search

Bibliography

[xml2006-09] xml.com, Sept 2006. Part 1: getting started

[xml2006-10] xml.com, Oct 2006. Part 2: Making Use of a Middleware

[Ochtman2010] Dirkjan Ochtman, (lost link), 2010

[Ronacher2009] Armin Ronacher, http://bitbucket.org/ianb/wsgi-peps/src/tip/pep-XXXX.txt, 2009

[McDonough2010] Chris McDonough, http://github.com/mcdonc/web3/blob/master/web3.rst, 2009
