This python module, httpheader, provides some utility functions useful for parsing and dealing with some of the HTTP 1.1 protocol headers which are not adequately covered by the standard Python libraries, such as:
- Byte range requests (multipart/byteranges)
- Content negotiation (content type, language, all the
Accept-*headers), with full priority/qvalue handling - Content/media type parameters
- Conversion to and from HTTP date and time formats
Download
You can download the source from this website, or from GitHub project dmeranda/httpheader.
- Version 1.1: 2013-02-08 (see changes)
- Download source httpheader.py
- View the documentation
- Browse the files.
- Version 1.0: 2005-12-19
- Browse the files.
License
This is Free Software, licensed under the terms of the GNU LGPL version 2.1 or later.
This basically means you are completely free to use it however you want (even in proprietary systems) for any purpose. However, if you make modifications to it and you redistribute it to others, then you must make your modifications to this software public. Read the license for all details.
Clarification: all sample code shown on this page is hereby placed in the Public Domain. Other text is covered by my general copyright. The httpheader module and its associated documentation is LGPL as stated above.
See also
- httpheaders module, part of the larger Python Paste framework.
- RFC 2616,
Hypertext Transfer Protocol -- HTTP/1.1, June 1999.
Also see the errata. - RFC 2046, (MIME) Part Two: Media Types, November 1996.
- RFC 3066, Tags for the Identification of Languages, January 2001.
Using HTTP range requests
This is a simple example of how to use this module to correctly handle HTTP range requests (on a GET or HEAD operation); i.e., those that specify the Ranges: header.
This example is presented in a way that is not bound to any particular web framework, whether that be mod_python, WSGI, etc. So some of the actions are intentionally generic. This example is also intentionally simplified by assuming that you know ahead of time how big the whole "file" is. If you don't know this, such as if the file is generated on-the-fly, then your actual use will be a bit more complex (mainly because range requests may reference file offsets relative to the end of the file rather than the beginning).
This is what you must do first, by whatever method is available to you in your framework.
# Preliminary handling of the HTTP request ranges_hdr = http_request_headers.get( 'Range' ) # This example only supports GET and HEAD requests if http_method not in ('GET', 'HEAD'): http_response_headers[ 'Allow' ] = 'GET, HEAD' set_http_return_code( 405, 'Method not allowed' ) return # stop processing # Determine the "file" being requested file_size = size of the whole file being retrieved. file_contents = the actual contents of the file file_content_type = 'application/octet-stream' # or whatever it is
Now try to parse the Range header and figure out what portion(s) of the file has been requested.
# Process the ranges header import httpheader ranges = None # None means return the whole file if ranges_hdr: try: ranges = httpheader.parse_range_header( ranges_hdr ) except httpheader.ParseError: # Ranges header is malformed; you should ignore this error # per the RFC and just serve the whole file instead. ranges = None if ranges: try: # Simplify/optimize the range(s) if possible ranges.fix_to_size( file_size ) ranges.coalesce() # You may wish to skip this except httpheader.RangeUnsatisfiableError: # Requested range can not be satisifed, return error http_repsonse_header['Content-Range'] = '*/%d' % file_size set_http_return_code( 416, 'Range not satisfiable' ) return # stop processing if ranges.is_single_range() and \ ranges.range_specs[0].first == 0 and \ ranges.range_specs[0].last == file_size - 1: # Effectively getting whole file in pieces ranges = None
At this point, if ranges is None, then you're
returning the whole file just like you would in a normal request.
Be sure to return the Accept-Ranges header to announce that you
can process range requests on future requests.
if not ranges:
# Returning the whole file
http_response_header['Accept-Ranges'] = 'bytes'
http_response_header['Content-Length'] = file_size
http_response_header['Content-Type'] = file_content_type
set_http_return_code( 200, 'OK' )
if http_method != 'HEAD':
http_response_write( file_contents )
Otherwise you are returning one or more portions of the file. First we set up the various HTTP response headers.
else: # ranges is not None if ranges.is_single_range(): # Just one part of the file, no need for a multipart http_response_header['Content-Type'] = file_content_type http_response_header['Content-Range'] = \ '%s %d-%d/%d' % ( 'bytes', ranges.range_specs[0].first, ranges.range_specs[0].last, file_size - 1) else: # Multiple parts. Pick a random MIME boundary string import randon, string boundary = '--------' + ''.join([ random.choice(string.letters) for i in range(32) ]) http_response_header['Content-Type'] = \ 'multipart/byteranges; boundary=%s' % boundary http_response_header['Accept-Ranges'] = ranges.units # 'bytes' set_http_return_code( 206, 'Partial Content' )
Now we output the actual content.
if http_method == 'GET':
if ranges.is_single_range():
# Single range output
first = ranges.range_specs[0].first )
last = ranges.range_specs[0].last + 1 # Notice the +1
http_response_write( file_contents[first:last] )
else: # not ranges.is_single_range()
# Multipart!
for R in ranges.range_specs:
http_response_write( '%s\r\n' % boundary )
http_response_write('Content-Type: %s\r\n' % file_content_type )
http_response_write('Content-Range: %s %d-%d/%d\r\n' \
% (ranges.units, R.first, R.last, file_size) )
http_response_write( '\r\n' )
http_respose_write( file_contents[ R.first : R.last+1 ] )
http_response_write( '\r\n' )
http_response_write( '%s--\r\n' % boundary )
elif http_method == 'HEAD':
# do similar to above to output correct Content-Length,
# but don't output body.
Don't forget: In a real-world use, be sure to return the ETag and
other cache-friendly headers such as Last-Modified or Digest.
Just be sure that those headers are with respect to the entire file, not just
the portions returned in the range request.

