One thing I decided very early on with FreezerBurn was that it needed a web-based monitoring tool. I didn't want to have to deploy a Python tool with a full user interface to every desktop and then deal with all the security implications of that. With a web interface, I could simply let everyone use Firefox or IE or Safari and load it however they wanted. At least, that was the initial idea.
Very quickly I came across Python's HTTPServer class, which handles a good 90% of what I needed. After some experiments, though, I found that it tended to die under heavy loads (e.g., loading my page with external JS, CSS, and images). I needed it to be fairly reliable, and so I discovered a lot more than I ever intended to about HTTP caching.
HTTP validates cached copies in two ways: ETags and "Last-Modified" dates. An ETag is an opaque version identifier, typically a checksum: the browser says "The version I have has this ETag", and the server checks it and answers "That's right" or "That's old, here's a new one". In my case, I chose to implement the "Last-Modified" version instead, where I could simply check the last-modified date on the files. The resulting code looks like this:

import os
import datetime
import mimetypes
import urlparse
from stat import ST_MTIME, ST_SIZE
# Assumption: HTTPRequestHandler is SimpleHTTPServer.SimpleHTTPRequestHandler,
# which provides copyfile(); constants is the project's own settings module.
from SimpleHTTPServer import SimpleHTTPRequestHandler as HTTPRequestHandler
import constants

class MonitorServer(HTTPRequestHandler):
    def do_GET(self):
        global jobQueue
        global servers
        global mQueue
        global initTime
        # process self.path
        (scm, netloc, path, params, query, fragment) = urlparse.urlparse(self.path, 'http')
        if scm != 'http' or fragment:
            self.send_error(400, "bad url %s" % self.path)
            return
        print 'HTTP: Serving %s' % path
        # Write to self.wfile
        if path == "/":
            # Raw Index Page
            self.wfile.write(""" blah blah blabh """)
        elif path == "/job":
            blah blah blah
        else:
            # Serve files from the system. Strip any directory part (forward
            # or back slashes) so only files inside MonitorPath can be requested.
            path = path.replace('\\', '/').split('/')[-1]
            fullpath = os.path.join(constants.MonitorPath, path)
            if os.path.isfile(fullpath):
                (ctype, enc) = mimetypes.guess_type(path)
                info = os.stat(fullpath)
                # Use UTC so the value matches the "GMT" label in the headers
                lastmod = datetime.datetime.utcfromtimestamp(info[ST_MTIME])
                if self.headers.get('If-None-Match', ''):
                    # This handler never sends an ETag, so any If-None-Match
                    # is simply treated as a cache hit
                    self.send_response(304)
                    self.end_headers()
                    return
                if self.headers.get('If-Modified-Since', ''):
                    # Drop anything after a ';' (e.g. a "; length=..." suffix)
                    dt = self.headers.get('If-Modified-Since').split(';')[0]
                    try:
                        modsince = datetime.datetime.strptime(dt, "%a, %d %b %Y %H:%M:%S %Z")
                        if modsince >= lastmod:
                            print "HTTP: No new version of %s" % path
                            self.send_response(304)
                            self.end_headers()
                            return
                    except ValueError:
                        # Unparseable date: fall through and send the file
                        pass
                self.send_response(200)
                self.send_header('Cache-Control', 'max-age=864000')
                self.send_header('Expires', "Sat, 30 Jan 2010 12:00:00 GMT")
                self.send_header('Content-Length', str(info[ST_SIZE]))
                self.send_header('Last-Modified', lastmod.strftime("%a, %d %b %Y %H:%M:%S GMT"))
                self.send_header('Content-Type', ctype or 'application/octet-stream')
                self.end_headers()
                f = open(fullpath, 'rb')
                try:
                    self.copyfile(f, self.wfile)
                finally:
                    f.close()
            else:
                print "MONITOR:Can't find %s" % fullpath
                self.send_error(404, "Unknown url %s" % self.path)

So, this piece of code defines a "MonitorServer" request handler class and its "do_GET" method. In there it parses the requested URL, and if the path is one of a few selected forms, it sends dynamically generated content. Otherwise, it sends the requested file directly to the user. As a security precaution, I don't allow the user to specify a path: the directory part is stripped and the file must exist within one specific directory. If the browser provides an "If-None-Match" header, I return a 304 code, which indicates the cache is up to date. If it provides an "If-Modified-Since" header, I parse the date, compare it against the file, and return a 304 if appropriate. One interesting thing I learned from this is that the browser doesn't actually send back the date of its cached file, but rather the "Last-Modified" value you sent it, in identical formatting.
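
The header checks described here can also be pulled out into a small helper that is easy to test on its own. What follows is only a minimal sketch under a few assumptions: it is written for Python 3 rather than the Python 2 of the listing, the names http_date, make_etag and is_fresh are hypothetical, and it assumes the server sends an ETag (a checksum of the file contents) alongside "Last-Modified", which the handler in this post does not do. It also only handles a single, strong ETag in "If-None-Match".

```python
import email.utils
import hashlib
from datetime import datetime, timezone


def http_date(mtime):
    """Format a POSIX timestamp as an HTTP date string in GMT."""
    return email.utils.formatdate(mtime, usegmt=True)


def make_etag(data):
    """Build a quoted ETag from a checksum of the file contents (bytes)."""
    return '"%s"' % hashlib.sha256(data).hexdigest()


def is_fresh(headers, mtime, etag):
    """Return True if the browser's cached copy is current (reply 304)."""
    client_etag = headers.get('If-None-Match')
    if client_etag is not None:
        # When an ETag is supplied, it decides the outcome by itself.
        return client_etag.strip() == etag
    since = headers.get('If-Modified-Since')
    if since:
        try:
            # Ignore anything after a ';', as the handler in the post does.
            client_time = email.utils.parsedate_to_datetime(since.split(';')[0])
        except (TypeError, ValueError):
            # Unparseable date: treat the cache as stale and resend the file.
            return False
        if client_time.tzinfo is None:
            client_time = client_time.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop fractions of a second.
        file_time = datetime.fromtimestamp(int(mtime), tz=timezone.utc)
        return file_time <= client_time
    return False
```

In a handler, this could be called as is_fresh(self.headers, info.st_mtime, etag) before sending the file; when it returns True, a 304 response followed by end_headers() is all that needs to go back to the browser.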
Yeraze's Domain
http://www.yeraze.com/article.php/webservers_with_python_caching