  A Better Way To Limit File Upload Size
Added by Michael Bayer, last edited by Philip Jenvey on Aug 21, 2008

The recipe "Hacking Pylons for handling large file upload" features a customized Cascade middleware that intentionally does not copy the POSTed data to a tempfile, thus allowing cgi.maxlen to raise an error. In my experiments with the Paste server, I have observed that this approach does not actually close the client connection when a too-large body is received. It also relies on the internals of Cascade and on the undocumented "cgi.maxlen" variable, and it is not compatible with middleware that reads the POST body, since the body may already have been consumed by a previously cascaded app.

After some conversations on the Paste list, the recommendation was instead to simply reject requests based on the "Content-length" header. While this at first seemed odd, as an attacker could simply omit or forge this value, some perusal of the inner workings of the Paste server revealed that the LimitedLengthFile applied to the input guarantees that at most "Content-length" bytes will be read. So you can in fact rely on the header to guard the size of incoming content.
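To illustrate why a forged header does not help an attacker, here is a minimal sketch of a length-limited input wrapper. It is not Paste's actual LimitedLengthFile code; the class name LengthLimitedInput and its attributes are made up for this example, and it only shows the general idea that reads are capped at the declared length.

```python
import io


class LengthLimitedInput(object):
    """Hypothetical stand-in for a length-limited request body wrapper."""

    def __init__(self, raw, content_length):
        self.raw = raw
        # Number of bytes we are still willing to hand out.
        self.remaining = content_length

    def read(self, size=-1):
        # Never return more bytes than the declared Content-length allows.
        if size is None or size < 0 or size > self.remaining:
            size = self.remaining
        data = self.raw.read(size)
        self.remaining -= len(data)
        return data


# The client declares 5 bytes but actually sends 11; only 5 are ever read.
body = LengthLimitedInput(io.BytesIO(b"hello world"), 5)
first = body.read()
second = body.read()
```

With a wrapper like this in front of the socket, a client that understates its Content-length simply has the rest of its data ignored, so checking the header is enough to bound what the application will read.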

So the solution is simple:

from webob import Request
from webob.exc import HTTPBadRequest

class LimitUploadSize(object):
    """WSGI middleware that rejects POSTs larger than ``size`` bytes."""

    def __init__(self, app, size):
        self.app = app
        self.size = size

    def __call__(self, environ, start_response):
        req = Request(environ)
        if req.method == 'POST':
            # Named "length" so the built-in len() is not shadowed
            length = req.headers.get('Content-length')
            if not length:
                return HTTPBadRequest("No content-length header specified")(environ, start_response)
            elif int(length) > self.size:
                return HTTPBadRequest("POST body exceeds maximum limits")(environ, start_response)
        resp = req.get_response(self.app)
        return resp(environ, start_response)

Simply place this at the very start of the middleware chain, i.e. at the bottom of make_app() in middleware.py:

app = LimitUploadSize(app, 2000000)
return app

In my tests with this approach, uploading a very large file errors out immediately; it's clear that as soon as the server sees the oversized header, the client is disconnected and no data is transmitted.
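The check can also be exercised without a running server. The following is a rough sketch, not the recipe itself: it re-implements the same test against the raw WSGI environ (where the Content-length header appears as CONTENT_LENGTH) so that it needs only the standard library. PlainLimitUploadSize, hello_app and status_for are names invented for this example, and unlike the version above it treats a missing and a non-numeric header the same way.

```python
from wsgiref.util import setup_testing_defaults


class PlainLimitUploadSize(object):
    """WebOb-free variant of the size check (hypothetical name)."""

    def __init__(self, app, size):
        self.app = app
        self.size = size

    def __call__(self, environ, start_response):
        if environ.get('REQUEST_METHOD') == 'POST':
            length = environ.get('CONTENT_LENGTH')
            try:
                too_big = int(length) > self.size
            except (TypeError, ValueError):
                # Header missing (None), empty, or not a number: reject.
                too_big = True
            if too_big:
                start_response('400 Bad Request',
                               [('Content-Type', 'text/plain')])
                return [b'Content-Length missing, invalid or too large']
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """Trivial wrapped application used only for the demonstration."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


def status_for(app, method, content_length=None):
    """Call a WSGI app with a fake environ and return its status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ['REQUEST_METHOD'] = method
    if content_length is not None:
        environ['CONTENT_LENGTH'] = str(content_length)
    seen = []
    body = app(environ, lambda status, headers: seen.append(status))
    list(body)  # consume the response iterable
    return seen[0]


limited = PlainLimitUploadSize(hello_app, 2000000)
```

Calling status_for(limited, 'POST', 5000000) should yield a 400 status, while a GET or a small POST passes through to the wrapped app.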

In my own application I've customized LimitUploadSize further to work conditionally, based on the login credentials of the user. Beaker's SessionMiddleware, at least when used with cookie-based sessions (which is what you should be using), works fine if you move it all the way to the top of the middleware chain, so that it wraps LimitUploadSize:

app = LimitUploadSize(app, 2000000)
app = SessionMiddleware(app, config)
return app
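The order matters because the wrapper applied last in make_app() is the first one to see each request, so the session is already in the environ by the time LimitUploadSize runs. A toy sketch of that ordering, using made-up tracing middleware (make_tracer and base_app are illustrative names, not part of Pylons or Beaker):

```python
calls = []


def make_tracer(name):
    """Build a toy middleware factory that records when it runs."""
    def middleware(app):
        def wrapped(environ, start_response):
            calls.append(name)
            return app(environ, start_response)
        return wrapped
    return middleware


def base_app(environ, start_response):
    calls.append('app')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


# Same shape as the end of make_app(): the last wrapper applied is outermost.
app = base_app
app = make_tracer('limit_upload_size')(app)
app = make_tracer('session')(app)

# Drive one fake request through the stack.
app({}, lambda status, headers: None)
```

After the call, calls holds the names in the order the layers ran: the session layer first, then the size limiter, then the application.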

LimitUploadSize can then check for authentication like:

class LimitUploadSize(object):

    def __init__(self, app, size):
        self.app = app
        self.size = size

    def __call__(self, environ, start_response):
        req = Request(environ)
        if req.method == 'POST':
            # Populated by SessionMiddleware, which must wrap this middleware
            session = environ['beaker.session']
            # Only requests without a login token are subject to the limit
            if "login_token" not in session:
                length = req.headers.get('Content-length')
                if not length:
                    return HTTPBadRequest("No content-length header specified")(environ, start_response)
                elif int(length) > self.size:
                    return HTTPBadRequest("POST body exceeds maximum limits")(environ, start_response)
        resp = req.get_response(self.app)
        return resp(environ, start_response)

Customizations like this are what make the WSGI-level upload size limiter a better approach than even Apache's LimitRequestBody directive.

Just remember that when using an asynchronous web server like nginx/lighttpd as a reverse proxy, it will gather the request from the client, then send it to your web app all in one go.

This means that if you want to use the limit to stop oversized requests from taking too long, or from consuming your incoming bandwidth, it has to be enforced at the nginx/lighttpd level.

Posted by Zepo Len at Aug 20, 2008 23:34 | Permalink

I use 0.9.6.2 and didn't find any "webob.exc" or "webob" in the Pylons installation, so it seems that this recipe doesn't work with this version. Does anyone have comments on this problem?

Posted by Ilya at Dec 07, 2008 21:53 | Permalink