apache architecture
DESCRIPTION
Apache Architecture. How do we measure performance?. Benchmarks Requests per Second Bandwidth Latency Concurrency (Scalability). Building a scalable web server. handling an HTTP request map the URL to a resource check whether client has permission to access the resource - PowerPoint PPT PresentationTRANSCRIPT
Apache Architecture
How do we measure performance?
• Benchmarks– Requests per Second– Bandwidth– Latency– Concurrency (Scalability)
Building a scalable web server
• handling an HTTP request – map the URL to a resource – check whether client has permission to access the
resource – choose a handler and generate a response – transmit the response to the client – log the request
• must handle many clients simultaneously • must do this as fast as possible
Resource Pools• one bottleneck to server performance is the operating
system – system calls to allocate memory, access a file, or create a
child process take significant amounts of time – as with many scaling problems in computer systems,
caching is one solution
• resource pool: application-level data structure to allocate and cache resources – allocate and free memory in the application instead of using
a system call – cache files, URL mappings, recent responses – limits critical functions to a small, well-tested part of code
Multi-Processor Architectures
• a critical factor in web server performance is how each new connection is handled – common optimization strategy: identify the most commonly-
executed code and make this run as fast as possible – common case: accept a client and return several static
objects – make this run fast: pre-allocate a process or thread, cache
commonly-used files and the HTTP message for the response
Connections
• must multiplex handling many connections simultaneously – select(), poll(): event-driven, singly-threaded – fork(): create a new process for a connection – pthread create(): create a new thread for a connection
• synchronization among processes/threads – shared memory: semaphores – message passing
Select
• select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
• Allows a process to block until data is available on any one of a set of file descriptors.
• One web server process can service hundreds of socket connections
Event Driven Architecture
• one process handles all events • must multiplex handling of many clients and
their messages – use select() or poll() to multiplex socket I/O events– provide a list of sockets waiting for I/O events – sleeps until an event occurs on one or more sockets – can provide a timeout to limit waiting time
• must use non-blocking system calls • some evidence that it can be more efficient than
process or thread architectures
Process Driven Architecture
• devote a separate process/thread to each event – master process listens for connections – master creates a separate process/thread for each
new connection
• performance considerations – creating a new process involves significant overhead – threads are less expensive, but still involve overhead
• may create too many processes/threads on a busy server
Process/Thread Pool Architecture
• master thread – creates a pool of threads – listens for incoming connections – places connections on a shared queue
• processes/threads – take connections from shared queue – handle one I/O event for the connection – return connection to the queue – live for a certain number of events (prevents long-lived
memory leaks)
• need memory synchronization
Hybrid Architectures
• each process can handle multiple requests – each process is an event-driven server – must coordinate switching among events/requests
• each process controls several threads – threads can share resources easily – requires some synchronization primitives
• event driven server that handles fast tasks but spawns helper processes for time-consuming requests
What makes a good Web Server?
• Correctness
• Reliability
• Scalability
• Stability
• Speed
Correctness
• Does it conform to the HTTP specification?
• Does it work with every browser?
• Does it handle erroneous input gracefully?
Reliability
• Can you sleep at night?
• Are you being paged during dinner?
• It is an appliance?
Scalability
• Does it handle nominal load?
• Have you been Slashdotted?– And did you survive?
• What is your peak load?
Speed (Latency)
• Does it feel fast?
• Do pages snap in quickly?
• Do users often reload pages?
Apache the General Purpose Webserver
Apache developers strive for
correctness first, and
speed second.
Apache HTTP Server
Architecture Overview
Classic “Prefork” Model
• Apache 1.3, and
• Apache 2.0 Prefork
• Many Children
• Each child handles one connection at a time. Child
Parent
ChildChild… (100s)
http://httpd.apache.org/docs/2.2/mod/prefork.html
Multithreaded “Worker” Model
• Apache 2.0 Worker
• Few Children
• Each child handles many concurrent connections.
Child
Parent
ChildChild… (10s)
10s of threads
Dynamic Content: Modules
• Extensive API
• Pluggable Interface
• Dynamic or Static Linkage
In-process Modules
• Run from inside the httpd process– CGI (mod_cgi)– mod_perl– mod_php– mod_python– mod_tcl
Out-of-process Modules
• Processing happens outside of httpd (eg. Application Server)
• Tomcat– mod_jk/jk2, mod_jserv
• mod_proxy• mod_jrun
Parent
TomcatChild
ChildChild
Performancetransactions per second
Architecture Hello Big DB
Apache-Mod_perl 324 82 98
Apache-CGI 59 25 6
Architecture: The Big Picture
Child
Parent
ChildChild… (10s)
10s of threads Tomcat
DB
100s of threads
mod_jkmod_rewritemod_phpmod_perl
“MPM”
• Multi-Processing Module
• An MPM defines how the server will receive and manage incoming requests.
• Allows OS-specific optimizations.
• Allows vastly different server models(eg. threaded vs. multiprocess).
“Child Process” aka “Server”
• Called a “Server” in
httpd.conf
• A single httpd process.
• May handle one or more
concurrent requests
(depending on the MPM).
Child
Parent
ChildChild… (100s)
Servers
“Parent Process”
• The main httpd process.
• Does not handle connections itself.
• Only creates and destroys children.
• Shared memory scoreboard to determine who handles connections
Child
Parent
Child
Child
… (100s)
Only one P
arent
“Thread”
• In multi-threaded MPMs (eg. Worker).
• Each thread handles a single connection.
• Allows Children to handle many
connections at once.
Prefork MPM
• Apache 1.3 and Apache 2.0 Prefork
• Each child handles one connection at a time
• Many children
• High memory requirements
• “You’ll run out of memory before CPU”
Prefork Directives (Apache 2.0)
• StartServers
• MinSpareServers
• MaxSpareServers
• MaxClients
• MaxRequestsPerChild
Worker MPM
• Apache 2.0 and later
• Multithreaded within each child
• Dramatically reduced memory footprint• Only a few children (fewer than prefork)
Worker Directives
• MinSpareThreads
• MaxSpareThreads
• ThreadsPerChild
• MaxClients
• MaxRequestsPerChild
Apache 1.3 and 2.0Performance Characteristics
Multi-process,
Multi-threaded,
or Both?
Prefork
• High memory usage
• Highly tolerant of faulty modules
• Highly tolerant of crashing children
• Fast
• Well-suited for 1 and 2-CPU systems
• Tried-and-tested model from Apache 1.3
• “You’ll run out of memory before CPU.”
Worker
• Low to moderate memory usage• Moderately tolerant to faulty modules• Faulty threads can affect all threads in child• Highly-scalable• Well-suited for multiple processors• Requires a mature threading library
(Solaris, AIX, Linux 2.6 and others work well)
• Memory is no longer the bottleneck.
Important Performance Considerations
• sendfile() support
• DNS considerations
sendfile() Support
• No more double-copy• Zero-copy*• Dramatic improvement for static files• Available on
– Linux 2.4.x– Solaris 8+– FreeBSD/NetBSD/OpenBSD– ...
* Zero-copy requires both OS support and NIC driver support.
aio
• Process does not block on – Socket– Disk read or write– semaphores
Sendfilebackend throughput % disk
utilUser+sys
writev 17.59MB 50% 25%
sendfile 33.13MB 70% 30%
aio 50MB 98% 60%
aio-sendfile
44.15MB 90% 40%
DNS RoundRobin