pywkher: wkhtmltopdf for Python on Heroku

On a couple of different occasions, we’ve had reason to generate PDFs from applications. For both Python and Ruby, there are a number of options for generating PDFs.

Some of those options are able to convert HTML to PDFs (sometimes very simplistically); some require directly manipulating the PDF structure; and some require use of a special markup language that’s not HTML.

But if we’re already working with HTML, generated by Django or Rails or whatever, an awesome option is passing that HTML through a command-line tool called wkhtmltopdf.

wkhtmltopdf

wkhtmltopdf is so awesome because it renders HTML using the WebKit engine, the same rendering engine that powers Safari and Chrome. That means that when we can use it, the resultant PDFs are as beautiful (or as ugly) as they would be if viewed in the browser.

wkhtmltopdf + Heroku

It’s relatively easy to install and use wkhtmltopdf if you have the power to compile the source code. On Macs, you can even install it using Homebrew. But on Heroku it’s a bit more difficult, as the nature of the platform means you can’t just get on the command line and compile something.

Instead, on Heroku, you have to “vendor in” binaries that are precompiled for the platform. That’s how it handles installing things like the various versions of programming languages or other tools that are supported: the binaries are compiled and sitting in an accessible location like S3, and the Heroku buildpacks figure out what is needed and copy it into the individual instances.
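To make the “vendor in” idea concrete, here is a minimal sketch of how an application might locate a binary that was committed alongside its code. The `vendor/bin` layout, the `WKHTMLTOPDF_BINARY` environment variable, and the `find_wkhtmltopdf` helper are all assumptions made up for illustration; they are not Heroku conventions or part of pywkher.

```python
import os
import shutil

# Hypothetical location of a precompiled binary committed to the repository.
VENDORED_BINARY = os.path.join("vendor", "bin", "wkhtmltopdf")


def find_wkhtmltopdf(env=None, vendored=VENDORED_BINARY):
    """Return a path to a wkhtmltopdf binary, or None if none is found."""
    env = os.environ if env is None else env

    # 1. An explicit override (hypothetical variable name) wins.
    override = env.get("WKHTMLTOPDF_BINARY")
    if override:
        return override

    # 2. Otherwise prefer the binary vendored into the project, if present.
    if os.path.isfile(vendored) and os.access(vendored, os.X_OK):
        return os.path.abspath(vendored)

    # 3. Fall back to whatever is on PATH (e.g. a Homebrew install on a Mac).
    return shutil.which("wkhtmltopdf")
```

The fallback order means the same code can run locally against a Homebrew install and on Heroku against the vendored copy.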

Of course, the fine folks at Heroku have provided a way for users to compile things for the platform: it’s called vulcan. Vulcan works great, actually, but not every user is going to have the experience or inclination to go through the process. Plus, it’d be nice if not everyone who needs a tool had to set up their own vulcan build process.

Also Some Python

We wanted to provide a pre-built wkhtmltopdf binary for ourselves and other users. We’re not the first to do so: Brad Phelan created a Ruby gem called wkhtmltopdf-heroku that provides the binary for Ruby projects, and we’re actually re-packaging his binary. Beyond that, we also wanted to provide some actual Python code that uses wkhtmltopdf to generate a PDF file.
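For a rough idea of what such Python code involves, here is a minimal sketch that shells out to wkhtmltopdf using its basic `wkhtmltopdf <input> <output>` form. This is not pywkher’s actual API; the `build_command` and `html_to_pdf` names are hypothetical, and the sketch assumes a wkhtmltopdf binary is reachable at the path you pass in.

```python
import os
import subprocess
import tempfile


def build_command(html_path, pdf_path, binary="wkhtmltopdf"):
    """Build the argument list for: wkhtmltopdf <input.html> <output.pdf>."""
    return [binary, html_path, pdf_path]


def html_to_pdf(html, binary="wkhtmltopdf"):
    """Render an HTML string to PDF and return the PDF as bytes."""
    with tempfile.TemporaryDirectory() as workdir:
        html_path = os.path.join(workdir, "input.html")
        pdf_path = os.path.join(workdir, "output.pdf")

        # Write the HTML (e.g. a rendered Django template) to a temp file.
        with open(html_path, "w", encoding="utf-8") as handle:
            handle.write(html)

        # Raises CalledProcessError if wkhtmltopdf exits with an error.
        subprocess.run(build_command(html_path, pdf_path, binary), check=True)

        with open(pdf_path, "rb") as handle:
            return handle.read()
```

In a Django view, for instance, the returned bytes could be sent back in a response with an `application/pdf` content type.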

Go Get It

We’re releasing pywkher as open source under the BSD license. It’s up on GitHub, and pip-installable via pywkher on PyPI.

Have fun!

Posted on August 26, 2012 by Jason Mayfield