Memo about HTTP services in Python


General idea of WSGI

As specified in PEP 333, WSGI aims to let an application written in Python run on any web server, as long as both sides follow the WSGI interface.

It includes two parts:

The Application/Framework Side

The application object must be a callable (a function, an instance of a class with a __call__ method, and so on) that accepts two arguments. For example:

def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [b'Hello world!\n']  # bytes literal, so it also works on Python 3

The Server/Gateway Side

The server invokes the app once for each request. For example, werkzeug’s BaseWSGIServer ends up calling app(environ, start_response).

werkzeug

run_simple: make a simple HTTP server

The interesting part is that it supports restarting on code changes. When the script is executed, it starts a new process running the same script, but with an extra environment variable, WERKZEUG_RUN_MAIN, set. This lets a process tell whether it is the original process or the spawned one, and run different logic accordingly.

The main process runs a while(1) loop: it spawns a subprocess, waits until it terminates, and checks its exit code. If the code is 3, the code has changed, so it spawns the subprocess again and the server is restarted. If the code is not 3, some error may have happened, and the main process itself terminates.
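The parent loop can be sketched with the standard library alone. This is only an illustration of the idea, not werkzeug's actual code: the function name restart_with_reloader and the tiny child command are assumptions made for the example, while the variable name WERKZEUG_RUN_MAIN and the exit code 3 are the ones mentioned above.

```python
import os
import subprocess
import sys


def restart_with_reloader(args):
    """Run args in a child process again and again while it exits with 3."""
    while True:
        env = os.environ.copy()
        # Marks the child so it can tell it is not the original process.
        env['WERKZEUG_RUN_MAIN'] = 'true'
        exit_code = subprocess.call(args, env=env)
        if exit_code != 3:
            # Anything other than 3 ends the loop; pass the code upward.
            return exit_code


if __name__ == '__main__':
    # Hypothetical child: exits with 0 when it sees the marker variable.
    child = [sys.executable, '-c',
             "import os, sys; "
             "sys.exit(0 if os.environ.get('WERKZEUG_RUN_MAIN') else 1)"]
    print(restart_with_reloader(child))
```

A real script would pass its own command line (for example sys.executable plus sys.argv) as args, so that the child runs the same script as the parent.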

The subprocess starts a new thread that runs the server instance, and then runs the reloader. The reloader uses watchdog to watch for file additions, removals and modifications in an infinite loop; when any change happens, it breaks out of the loop and the subprocess terminates with exit code 3. Note that the server thread is started with the daemon option on, so when the subprocess's main thread exits, the server thread terminates too.
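A rough sketch of the child side follows. To stay within the standard library it swaps watchdog for plain polling of file modification times, so it shows the control flow rather than the real detection mechanism. The names snapshot_mtimes, wait_for_change, run_child and the max_checks parameter are all made up for this example.

```python
import os
import sys
import threading
import time


def snapshot_mtimes(paths):
    """Map each existing path to its last-modified time."""
    return {p: os.stat(p).st_mtime for p in paths if os.path.exists(p)}


def wait_for_change(paths, interval=1.0, max_checks=None):
    """Poll until a file is added, removed or modified; return True then.

    max_checks is a hypothetical knob so the loop can be bounded.
    """
    before = snapshot_mtimes(paths)
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval)
        if snapshot_mtimes(paths) != before:
            return True
        checks += 1
    return False


def run_child(serve_forever, paths):
    """Start the server in a daemon thread, then watch the files."""
    server_thread = threading.Thread(target=serve_forever)
    # A daemon thread does not keep the process alive once the
    # main thread exits.
    server_thread.daemon = True
    server_thread.start()
    if wait_for_change(paths):
        sys.exit(3)  # tells the parent loop to start a fresh child
```

Here serve_forever stands for whatever callable runs the server, and paths for the list of source files to watch.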

BaseWSGIServer and WSGIRequestHandler

BaseWSGIServer inherits from Python’s HTTPServer. It sets its handler class to WSGIRequestHandler, and a new WSGIRequestHandler instance is created to handle each request.

In BaseRequestHandler, a base class of WSGIRequestHandler, we can see that the constructor calls setup(), handle() and finish() in turn. Its subclasses override handle() to produce the response.
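The call order in the constructor is easy to observe with the standard library's socketserver module. RecordingHandler below is a hypothetical subclass written only for this demonstration; it is instantiated without a real socket, which is enough to trigger the three hooks.

```python
import socketserver

calls = []


class RecordingHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler that only records which hooks run."""

    def setup(self):
        calls.append('setup')

    def handle(self):
        calls.append('handle')

    def finish(self):
        calls.append('finish')


# No real socket is needed just to observe the call order; the three
# arguments are (request, client_address, server).
RecordingHandler(None, ('127.0.0.1', 0), None)
print(calls)  # ['setup', 'handle', 'finish']
```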

BaseHTTPRequestHandler, the parent class of WSGIRequestHandler, overrides handle() and calls handle_one_request() for each incoming request. WSGIRequestHandler in turn overrides handle_one_request(), which finally calls run_wsgi(). In run_wsgi, the app (the server logic written by the user) is called with app(environ, start_response). environ holds the environment variables built from the request. start_response only records the status and response_headers, and returns a write function; write is where the response data actually gets written to the socket.
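The interplay between start_response and write can be sketched as follows. This is a simplified illustration and not werkzeug's run_wsgi: the name run_wsgi_sketch is made up, a BytesIO object stands in for the client socket, and error handling is left out.

```python
import io


def run_wsgi_sketch(app, environ, out):
    """Call app and write a raw HTTP response into the file-like out."""
    state = {'status': None, 'headers': None, 'headers_sent': False}

    def write(data):
        # Headers go out lazily, right before the first chunk of body.
        if not state['headers_sent']:
            out.write(('HTTP/1.1 %s\r\n' % state['status']).encode('latin-1'))
            for key, value in state['headers']:
                out.write(('%s: %s\r\n' % (key, value)).encode('latin-1'))
            out.write(b'\r\n')
            state['headers_sent'] = True
        out.write(data)

    def start_response(status, response_headers, exc_info=None):
        # Only records status and headers; nothing is written yet.
        state['status'] = status
        state['headers'] = response_headers
        return write

    for chunk in app(environ, start_response):
        write(chunk)
    if not state['headers_sent']:
        write(b'')  # empty body: still send the headers


def simple_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello world!\n']


if __name__ == '__main__':
    buf = io.BytesIO()  # stands in for the client socket
    run_wsgi_sketch(simple_app, {}, buf)
    print(buf.getvalue())
```

Sending the headers only on the first write leaves the app free to call start_response before it knows whether it will produce any body at all.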

So where does this environ come from? WSGIRequestHandler has a make_environ method that copies the header info from the request into environ.
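As a rough idea of what such a method does, here is a small sketch that builds part of an environ dict from request data that is assumed to be parsed already. The function name and its parameters are invented for the example, and the real make_environ fills in many more keys.

```python
def make_environ_sketch(method, path, query, headers):
    """Build a partial WSGI environ from already-parsed request data."""
    environ = {
        'REQUEST_METHOD': method,
        'PATH_INFO': path,
        'QUERY_STRING': query,
        'wsgi.url_scheme': 'http',
    }
    for name, value in headers:
        key = name.upper().replace('-', '_')
        # By WSGI/CGI convention these two keep their bare names;
        # every other header gets an HTTP_ prefix.
        if key not in ('CONTENT_TYPE', 'CONTENT_LENGTH'):
            key = 'HTTP_' + key
        environ[key] = value
    return environ


if __name__ == '__main__':
    print(make_environ_sketch('GET', '/say/bob', '',
                              [('Host', 'localhost:5000')]))
```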

routing

Routing is implemented with werkzeug.routing. As a general view, routing can be used as:

from werkzeug.routing import Map, Rule, Subdomain

m = Map([Rule('/', endpoint='hello')], default_subdomain='www')
m.add(Subdomain('kb', [Rule('/', endpoint='hello2')]))
m.add(Subdomain('kb', [Rule('/test', endpoint='hello2.test')]))

Now the Map object has the internal data like:

Map([<Rule 'kb|/test' -> hello2.test>,
 <Rule 'www|/' -> hello>,
 <Rule 'kb|/' -> hello2>])

Rule represents a single mapping rule; it becomes useful after being bound to a map. The regex pattern is generated in the bind method. For example, after a rule r = Rule('/add', endpoint='doAdd') is bound to m, print(r._regex.pattern) gives ^www\|\/add$.
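The shape of that pattern suggests that subdomain and path are joined with a | and matched as one string. The toy below imitates only that idea with the re module; compile_rule and matches are hypothetical helpers, not part of werkzeug, and real rules also handle converters, methods and more.

```python
import re


def compile_rule(subdomain, path):
    """Toy version of binding: one regex over 'subdomain|path'."""
    return re.compile('^%s\\|%s$' % (re.escape(subdomain), re.escape(path)))


def matches(regex, subdomain, path):
    """Match a request by joining its subdomain and path the same way."""
    return regex.match('%s|%s' % (subdomain, path)) is not None


if __name__ == '__main__':
    regex = compile_rule('www', '/add')
    print(matches(regex, 'www', '/add'))  # True
    print(matches(regex, 'kb', '/add'))   # False
```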

A Map should be bound to a host name with ad = m.bind('example.com'), which returns a MapAdapter. The MapAdapter is in charge of URL matching. Note that the ad object above only matches URLs with the subdomain www. To match URLs with the subdomain kb, bind with m.bind('example.com', subdomain='kb').

URL dispatch is also done by MapAdapter, through its dispatch method.

flask

flask looks like a showcase of how to use werkzeug. Take a code snippet like:

from flask import Flask
app = Flask(__name__)

@app.route("/say/<name>")
def say(name):
    return "hello %s" % name


if __name__ == '__main__':
    app.run(host="localhost", port=5000, debug=True)

@app.route registers a routing entry with werkzeug’s routing module. In the example above, it routes URLs of the form '/say/xxx' to the say(name) function.
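To see how a decorator can register a routing entry, here is a small stand-alone imitation. TinyApp is a hypothetical class written for this memo; it keeps its own list of regexes instead of using werkzeug's Map and Rule, so it only mirrors the general idea behind @app.route.

```python
import re


class TinyApp:
    """Hypothetical stand-in for Flask, just to show route registration."""

    def __init__(self):
        self.rules = []  # list of (compiled regex, view function)

    def route(self, rule):
        # Turn '<name>' placeholders into named regex groups.
        pattern = re.sub(r'<(\w+)>', r'(?P<\1>[^/]+)', rule)
        regex = re.compile('^%s$' % pattern)

        def decorator(func):
            self.rules.append((regex, func))
            return func  # the function itself is left unchanged
        return decorator

    def dispatch(self, path):
        for regex, func in self.rules:
            match = regex.match(path)
            if match:
                return func(**match.groupdict())
        raise LookupError('no route for %r' % path)


app = TinyApp()


@app.route('/say/<name>')
def say(name):
    return 'hello %s' % name


if __name__ == '__main__':
    print(app.dispatch('/say/bob'))  # hello bob
```

Because the decorator returns the original function, say can still be called directly, just like a view function in flask.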