
DISCLAIMER: as demonstrated in this post, I am not a Perl, shell, or systems administration guru. There are undoubtedly much better ways of going about templating than what I did, but I figured it would be fun to share how I decided to go about it for anyone potentially curious.

Website Architecture

This site runs on OpenBSD using the default http daemon that comes with it. Originally, all of the HTML files were directly created on the server with Vim. The issue of tediously writing shared HTML like a navigation bar or header was avoided by not having those to begin with.

However, eventually I decided I wanted an actual website layout, and I definitely didn't want to have to maintain that manually. Plus, even if I stuck with my old layout, it was still annoying having to copy and paste the head each time. So, I decided to do some templating.

As soon as I started thinking of this idea, I felt like using Perl. This may come off as strange, as:

  1. I had no experience using Perl before.
  2. Perl is a language that basically nobody uses anymore unless they learned it 20 years ago.

However, I decided to use it anyway because a handful of people that I greatly respected used it. I also knew that, while it is general purpose, its greatest strength is text manipulation, which I knew I would need for some sort of templating. It also comes installed with OpenBSD, meaning the script wouldn't introduce any package dependencies. And I simply thought it would be a fun and practical case for learning a new language that I was curious about.

Still, I initially planned on making use of Perl modules available through CPAN, Perl's online repository for modules (like Node's npm or Python's pip). There was an HTML template module that seemed like it would be perfect for my use case.

However, this ended up not being the case. First, using CPAN was a real pain. For some reason, trying to install anything through it would cause the server to hang indefinitely after the tests finished, kicking me out of my ssh session and forcing me to restart the server through my provider's (Vultr) control panel. I couldn't find anything about this bug online, but I eventually found a workaround: setting a flag to skip the tests avoided the hang.

However, when I finally had the templating module installed, I quickly realized that it wasn't really fit for my purpose. Its main purpose seemed to be making it simple to generate HTML for displaying a Perl data structure. That is likely useful, but it's not what I needed: I wasn't trying to insert Perl data into an HTML page, I was trying to make a bunch of HTML pages share the same structure. The module could do this, since it of course supported basic variable replacement. But I realized that if all I needed was basically an automated find and replace, there was no reason to use a complex object-oriented module with a dozen dependencies. I could just write my own generator, so that's exactly what I did.

On each page, everything is the same except for the page's title and its actual content. So, a simple process for generating an entire site with this layout would be:

  1. Create a template HTML file with comments inserted at specific points as markers for where the page title and content would be inserted.
  2. Convert each existing HTML page to a stub format, with comments inserted at specific points as markers for where to start and stop reading the data for each variable.
  3. Write a Perl script that takes a stub as input, and replaces the placeholders in the template with the stub data.
  4. Write a shell script to parse through all the stubs and run the script on each one, outputting to the respective HTML files on the site.

(The last step could have been done in Perl too, but the way of doing this in shell seemed immediately obvious to me.)

This is all fairly simple to do, though there was one aspect of the site's design that I thought would be tricky: the latest 5 changes. At first, I thought there were only two ways to go about implementing it. Either I had to write or import an HTML parser to grab the latest 5 entries, or I had to explicitly put labels in the changelog marking where to slice out the latest 5 entries. The former would be significantly more complicated but more automated, while the latter would be simple but require a lot of manual work.

However, I realized there was a third option that was both easy to write and also automated: use a regular expression to get the HTML.

Usually, parsing HTML with regular expressions is seen as a common novice pitfall: it seems like an obvious solution, but quickly shows itself to be impossible once you actually try it. However, I don't need to handle every possible case here. The changelog is written by me in a very consistent format where each entry is a level 2 header followed by an unordered list. Consequently, I can grab the first 5 entries of the changelog with the following regular expression: /(<h2>(.*?)<\/ul>\n){5}/s. This is a great example of how problems that are very complicated to handle in general can be greatly simplified if you narrow them down to the cases you actually have to handle.
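To illustrate how that regex behaves (a standalone toy sketch; the entry text below is made up and not the site's real changelog), the quantified group matches the header-plus-list pattern exactly five times, and $& holds the entire match:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy changelog in the same shape as the real one: each entry is a
# level 2 header followed by an unordered list.
my $changelog = join "", map {
    "<h2>Entry $_</h2>\n<ul>\n<li>change $_</li>\n</ul>\n"
} 1 .. 7;

my $last_5 = "";
if ( $changelog =~ /(<h2>(.*?)<\/ul>\n){5}/s ) {
    $last_5 = $&;    # $& is the whole match: the first five entries
}

# Count how many entries were grabbed.
my $count = () = $last_5 =~ /<h2>/g;
print "$count\n";    # prints "5"
```

The lazy (.*?) keeps each repetition from swallowing past the next </ul>, so the match stops cleanly after the fifth entry.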

As for the actual implementation, as stated before, the template is just a regular HTML file with comments where the variable content will be inserted. Each stub is an HTML comment containing the title, followed by another HTML comment labelling the start of the main content, which is read as the rest of the file.

otfd.html (Template)

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head>
<meta http-equiv="content-type" content="text/html; charset=windows-1252"/>
<title>OtFD: <!-- TITLE --></title>
<link rel="stylesheet" type="text/css" href="/style.css"/>
<link rel="icon" type="image/png" href="/otfd-icon.png" sizes="16x16"/>
</head>
<body>
<div id="site-banner">
<a href="/">
<img src="/otfd-youmu-1bit.png" alt="Site banner"/>
</a>
</div>
<div id="container">
<div id="nav">
<h1>Sitemap</h1>
<ul>
<li><a href="https://overthefraildream.net/changelog.html">Changelog</a></li>
<li><a href="https://overthefraildream.net/writings">Writings</a></li>
<li><a href="https://overthefraildream.net/charts">Etterna Charts</a></li>
</ul>

<div id="changelog">
<h1>Last 5 Changes</h1>
<!-- CHANGES -->
</div>
</div>
<div id="content">
<!-- CONTENT -->
</div>
</div>
</body></html>

writings/index.html (Example Stub)

<!-- TITLE: Writings -->

<!-- CONTENT -->
<h1>Writings</h1>
<p>Might split this into categories when there's more stuff here.</p>
<p>YYYY/MM/DD - Title</p>
<ul>
<li>2024/04/28 - <a href="./jumpcancels.html">Implementing Jump Cancels in MUGEN with Variable Expansion</a></li>
<li>2024/04/05 - <a href="./feedback.html">Giving feedback on Etterna charts</a></li>
</ul>

The Perl script reads the stub from stdin, and also opens both the changelog and the template. Rather than treating the files as arrays of lines as usual, we slurp each into a scalar (a single string). The stub's fields are parsed with regular expressions and then inserted into the template with s/// substitutions. Likewise, the changelog is parsed with the regular expression discussed above and also inserted into the template. Then, the template is output to stdout.

generator.pl

#!/usr/bin/perl
use v5.35;

my $content = do { local $/; <> };
open( my $changelog_file, "<", "changelog.html" ) || die "Can't open changelog: $!";
my $changelog = do { local $/; <$changelog_file> };
open( my $template_file, "<", "../otfd.html" ) || die "Can't open template: $!";
my $template = do { local $/; <$template_file> };

# $1 used to grab only what's in the (.*?), so comment borders aren't pasted in
if(  $content =~ /<!-- TITLE: (.*?) -->/ ) {
  my $title = $1;
  $template =~ s/<!-- TITLE -->/$title/;
}

if(  $content =~ /(?<=<!-- CONTENT -->\n).*$/s) {
  my $content = $&;
  $template =~ s/<!-- CONTENT -->/$content/;
}

# $& used to grab the entire match; a quantified group only keeps its
# last repetition in $1, so the whole match is needed here
if(  $changelog =~ /(<h2>(.*?)<\/ul>\n){5}/s )
{
  my $last_5_changes = $&;
  $template =~ s/<!-- CHANGES -->/$last_5_changes/;
}

print $template;

The reason the Perl script is designed to use stdin and stdout is that I had planned to use shell for handling the actual filesystem traversal and generation of the files, so making the script conducive to shell piping seemed ideal.

As for how the shell script was implemented, I decided that in the same directory that my site's root is in I would create a stubs directory with the same directory hierarchy as the actual site. This way the process of figuring out where the generated files ought to go is trivial: just look at the absolute path of the stub, and replace "stubs" with "overthefraildream.net". I ended up handling the recursive directory traversal using find, as it seemed like the least ugly way to go about doing it in shell, even if I still found it very ugly in practice. In hindsight writing this part in Perl may have been worth it solely to avoid doing this.
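The path mapping itself can be sketched with POSIX parameter expansion (a standalone illustration; the path here is just an example, not one of the site's actual files):

```shell
# Given a stub's path, the output path is the same path with the
# "stubs" prefix swapped for the site root.
stub="stubs/writings/index.html"
out="overthefraildream.net/${stub#stubs/}"   # strip the "stubs/" prefix
echo "$out"    # prints overthefraildream.net/writings/index.html
```

The ${var#pattern} expansion is plain POSIX, so it works the same in OpenBSD's ksh.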

generate-site.sh

#!/bin/ksh
cd stubs
find . -type f -exec ksh -c "cat {} | ../generator.pl > ../overthefraildream.net/{}" \;

Another obvious aside: When doing things with shell scripts, always make backups. It is very easy to accidentally shoot yourself in the foot with an incorrectly written shell script, and this was no exception. At one point I had accidentally used > instead of | and overwrote the Perl script instead of piping to it; fortunately I had backed up the file, so no progress was ultimately lost. I also backed up all of my site's files in case the generation somehow went terribly wrong.

With all of the components written, it was time to actually use it. I did test it a bit beforehand, but I wasn't as thorough as I should've been: all of my tests used a single stub, without considering whether nested directories would work. Regardless, I created the stubs directory by copying my site, removing everything but the HTML files, and converting them to the stub format. Then, I ran the script.

Surprisingly, everything worked out on the first run, with the exception of one of the pages being blank because I made a typo in the stub markers. It was really satisfying to see the combination of my new frontend and backend all coming together.

There's still a few small things that could be done. In particular, the shell script and Perl script use relative instead of absolute paths, meaning they're very specific to where they're located and run from. If I made them use absolute paths instead, I could move the scripts to /usr/bin or something to both declutter htdocs and also make it so that I don't need to change to the htdocs directory every time I want to regenerate the site. (2024/06/21 Update: I have done this, as well as written a Perl script to handle HTML escaping.)

(A slight disclaimer: The order in which I explained things in this writing is not exactly the same as my train of thought when actually doing this. I've rearranged things for narrative clarity since at multiple points I was both working on implementing something that I knew how to do immediately while also planning ahead thinking about general approaches to upcoming parts I had to do.)


Appendix: About the frontend design

I figured it would also be worth covering the site's frontend design as well, though I don't think it's interesting enough to warrant its own writing.

The layout of the site is done using absolute positioning to put the navigation bar and the content side-by-side; see this guide for more details on how that works. I used a fixed pixel width for the nav bar and content so I don't need to worry about things looking too squished or too wide depending on the monitor. The site probably looks jank on a 4K monitor, but you can just zoom in if that's an issue. Another reason I used a fixed pixel width is so that I could align the border between the nav bar and the main content with the gap between the website title and the art in the banner.

As for the actual design, it's inspired by a lot of older websites whose designs I really liked, particularly n-gate (turn off the Comic Sans) and All in the Head's 2003 layout. The banner uses this art of Youmu with a crop, some slight edits, and a conversion to 1-bit using paint.net. The font is Silkscreen Expanded, which was a real pain to get looking right due to how paint.net renders fonts; I had to do some manual edits. The site colors were based on the banner, with adjustments made to the brightness of certain elements for distinction and readability. The favicon I made by hand: I realized the start of each of the four letters could be drawn clearly in a 4x4 tile, so a 16x16 favicon could be made by arranging the acronym into a square.