Clean URLs in Jekyll

September 13, 2011

I've long wondered how to get Jekyll to drop the .html extension from its links between posts and pages. Paul Stamatiou's post on Wordpress-Jekyll mentions that if you enable the permalink option with extension-free permalinks, Jekyll generates every post as a subdirectory containing its own index.html. This creates a morass of extra files I don't want, and it also requires a trailing slash at the end of post names, since they are essentially directories now. Personally, I'm not a fan.

After some searching on GitHub, I found an issue on mojombo/jekyll that discussed the solution I was looking for: removing the .html extension from the URL, but not from the file.

URL: /2010/02/27/first-post
File: /2010/02/27/first-post.html

So far so good. pedrocr had forked Jekyll some time ago, so I forked a more recent version and made some alterations, namely an option called '--cleanurl' which, when set to true, strips the .html extension from the URLs but leaves the file paths intact. I later found another fork, henrick/jekyll, which had done essentially the same thing back in 2009, for use with Apache +MultiViews.
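To make the idea concrete, here is a minimal sketch in Python of what an option like '--cleanurl' does conceptually. It is not the code from my fork (Jekyll itself is written in Ruby), and the names clean_url and output_path are hypothetical, chosen only for illustration.

```python
HTML_EXT = ".html"


def clean_url(url: str, cleanurl: bool = True) -> str:
    """Return the link to emit for a post (hypothetical helper).

    With cleanurl enabled, a trailing .html is stripped from the link;
    everything else about the URL is left alone.
    """
    if cleanurl and url.endswith(HTML_EXT):
        return url[: -len(HTML_EXT)]
    return url


def output_path(url: str) -> str:
    """Return the file written to disk (hypothetical helper).

    The file always keeps its .html extension, whether or not the
    link that points at it has been cleaned.
    """
    return url if url.endswith(HTML_EXT) else url + HTML_EXT


link = clean_url("/2010/02/27/first-post.html")
print(link)               # /2010/02/27/first-post
print(output_path(link))  # /2010/02/27/first-post.html
```

The point is simply that the link and the file diverge: the web server is then responsible for mapping the extension-free URL back onto the .html file.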

So now I have clean URLs, without .html extensions, which is a much nicer solution than the automated permalink system. I plan to make my fork a little more robust, but the first iteration (which generated this post) can be found in my Jekyll fork. As for the Nginx rules to sort out the URLs, I used:

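# Redirect any request ending in .html to its extension-free URL.
# The dots are escaped so they only match a literal "." before "html".
if ($request_filename ~* ^.+\.html$) {
    rewrite ^/(.*)\.html$ /$1 permanent;
    break;
}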

try_files $uri.html 404.html;
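For anyone who finds the rules above easier to follow as ordinary code, here is a rough Python model of the intended behaviour. It is a simplification under stated assumptions: it matches on the request URI rather than $request_filename, it treats the site as a plain set of file paths, and the names resolve and site_files are hypothetical. It models the intent, not Nginx's actual request processing.

```python
import re

# Same pattern as the rewrite rule, with the dot escaped.
HTML_SUFFIX = re.compile(r"^/(.*)\.html$")


def resolve(uri, site_files):
    """Model what happens to a request for `uri` (hypothetical helper).

    Returns a (action, target) pair:
      ("redirect", url)  - permanent redirect to the extension-free URL
      ("serve", path)    - the .html file found for a clean URL
      ("fallback", path) - nothing matched, fall through to 404.html
    """
    match = HTML_SUFFIX.match(uri)
    if match:
        # Mirrors: rewrite ^/(.*)\.html$ /$1 permanent;
        return ("redirect", "/" + match.group(1))

    candidate = uri + ".html"
    if candidate in site_files:
        # Mirrors the first try_files argument: $uri.html
        return ("serve", candidate)

    # Mirrors the last try_files argument.
    return ("fallback", "404.html")


# Assumed example site containing a single post.
site_files = {"/2010/02/27/first-post.html"}

print(resolve("/2010/02/27/first-post.html", site_files))
print(resolve("/2010/02/27/first-post", site_files))
print(resolve("/no-such-post", site_files))
```

So a visitor following an old .html link is bounced to the clean URL, and the clean URL is then quietly served from the .html file on disk.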

By way of a footnote, I'd have liked to remove the if statement, as ifs are particularly despised when it comes to Nginx (see IfIsEvil on the Nginx wiki). Unfortunately, I couldn't come up with a better way of handling the extension removal. If anyone has any ideas, feel free to drop me a line: james@ this domain.