This page explains the structure of all the web files, templates, and data used to build this website. It also contains notes on hosting, validation, security, selecting libraries, rendering maths and code nicely, making the navigation breadcrumb, choosing colours, creating the favicon and app manifest, making the XML sitemap, adding a draft/wip mode, …
Hence this page provides recipes for the various features of a website. They are mostly independent of one another, allowing for easy modification, substitution, or omission (and the agnostic nature of AWG means that nothing is left behind).
File structure
Organisational considerations include:
- The purpose of each file, its relationship with other files, and especially how close each file is to related files.
- When the file is loaded by the templating mechanism ({% include %} or {% extends %}).
- When the file is loaded by the browser.
- Where browsers expect or need to find special files.
This translates into the following principles:
- Use a flat root directory for all files which browsers expect to see there, such as robots.txt, sitemap.xml, the app manifest, favicons, etc.
- Use a flat root directory for all the files loaded by the base template (_base.html) on every page. The base template uses absolute paths.
- Only extend templates from the same directory or from parent directories, so that the template hierarchy overlays the file system hierarchy.
- Otherwise, keep related files together in their own directory, and use relative paths.
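The third principle can be checked mechanically. The following is a minimal Python sketch, not part of AWG: the function name extends_is_allowed and the use of site-rooted paths are illustrative assumptions. It resolves an {% extends %} target against the template's own location and accepts it only if it lands in the same directory or in an ancestor directory.

```python
import posixpath


def extends_is_allowed(template: str, target: str) -> bool:
    """Return True if `target`, as written in {% extends %}, resolves to the
    template's own directory or to one of its ancestor directories.

    `template` is a site-rooted path such as "/a/b/_bar.html" and `target`
    is the relative path given to extends, such as "../_foo.html".
    """
    template_dir = posixpath.dirname(template)
    resolved = posixpath.normpath(posixpath.join(template_dir, target))
    target_dir = posixpath.dirname(resolved)
    if template_dir == target_dir:
        return True
    # An ancestor is a prefix of the template's directory that ends on a
    # path-component boundary (so "/a" is not an ancestor of "/ab").
    return template_dir.startswith(target_dir.rstrip("/") + "/")


print(extends_is_allowed("/a/b/_bar.html", "../_foo.html"))  # parent directory
print(extends_is_allowed("/a/_foo.html", "b/_bar.html"))     # child directory
```

The first call is accepted and the second rejected, because a template in /a/ would be extending something deeper in the tree.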
Even though it is traditional, I don’t organise files by type (all javascript in /js/, all css in /css/, etc).
The result is quite a few files in the root directory, but with the benefit that they are all visible. And then each page has its own directory, with child pages in child directories, etc.
The template inheritance hierarchy is simple:
- The base template is in /_base.html, and includes /_header.html and /_footer.html.
- The page template is in /_page.html, extends the base template, and adds blocks for the breadcrumb, title, page content, and the “page is draft” additions (see below).
- Most concrete pages then extend the page template (the welcome page being an exception, as it alters the layout to have a sidebar card).
Indentation management
All HTML files are indented for clarity during editing. On output after templating, they are all formatted properly by AWG, so there is no need to try to generate properly indented HTML within the templates (avoiding fiddly whitespace management with “-” in e.g. {%- ... %}). For example, the indentation below is entirely to aid readability at the template stage (rather than in the final HTML).
{% extends "../_page.html" %}
{% block breadcrumb %}
    {{ super() }}
    <div class="level-item">
        <span class="tag">
            <a href="index.html"><i class="fa-solid fa-arrow-left"></i> Back to: Building this website</a>
        </span>
    </div>
{% endblock breadcrumb %}
{% block title %} Content approach {% endblock title %}
{% block page %}
    {{ "_approach.md" | markdown() }}
{% endblock page %}
To make editing easy with language specific editing/formatting/colourising modes, I avoid files containing a mixture of languages. Hence there are separate files for each piece of markdown, no “frontmatter” TOML/YAML within markdown files, javascript is always included from .js files, etc.
Choice of web framework
There are many to choose from, but I selected Bulma because it is CSS only (no javascript), looks good, is well documented, very popular, and actively maintained. I like the responsive layout and support for colour management.
The two closest alternatives were:
- Bootstrap, which is older, a bit boring-looking now, and slightly heavier than I need.
- UIkit, which looks slick, but is perhaps more intended for applications. Also there is less support e.g. for colours.
Maths
For any substantial maths I create PDFs using Typst, but for immediately visible maths in the browser I use the javascript library KaTeX. Other libraries exist, but KaTeX is well maintained, popular, and fast.
The CommonMark parser used by AWG includes the dollarmath_plugin. It produces inline and block maths with the following HTML markup:
<span class="math inline"> ...latex maths... </span>
<div class="math block"> ...latex maths... </div>
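To make the mapping concrete, here is a deliberately simplistic Python sketch of the transformation. It is a regex stand-in written for illustration, not the dollarmath_plugin itself, and the function name mark_up_maths is an assumption. It ignores escaping rules and edge cases that a real parser plugin handles.

```python
import html
import re

# Block maths is matched first so that "$$" is not mistaken for two inline markers.
BLOCK = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)
INLINE = re.compile(r"\$(.+?)\$")


def mark_up_maths(text: str) -> str:
    """Wrap $...$ and $$...$$ spans in the markup shown above."""

    def block(match: re.Match) -> str:
        latex = html.escape(match.group(1).strip())
        return f'<div class="math block">{latex}</div>'

    def inline(match: re.Match) -> str:
        latex = html.escape(match.group(1).strip())
        return f'<span class="math inline">{latex}</span>'

    return INLINE.sub(inline, BLOCK.sub(block, text))


print(mark_up_maths("inline $(x+1)^2$ and block $$\n\\sum_{k=1}^n k\n$$"))
```

The LaTeX source is HTML-escaped so that characters such as < survive as text for the browser-side renderer to read back.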
A small amount of javascript makes KaTeX render on the correct elements after the DOM has loaded:
document.addEventListener("DOMContentLoaded", (event) => {
  for (const node of document.getElementsByClassName("math")) {
    katex.render(node.innerText, node, {
      displayMode: node.classList.contains("block"), // otherwise assume it is "inline"
      throwOnError: false,
    });
  }
});
I put this javascript in the file /render-maths.js and load it in the <head> tag of the base template along with the KaTeX library (both javascript and CSS) from the jsDelivr CDN:
<head>
...
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.16.22/dist/katex.min.css" integrity="sha384-5TcZemv2l/9On385z///+d7MSYlvIEw9FuZTIdZ14vJLqWphw7e7ZPuOiCHJcFCP" crossorigin="anonymous">
...
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.16.22/dist/katex.min.js" integrity="sha384-cMkvdD8LoxVzGF/RPUKAcvmm49FQ0oxwDF3BGKtDXcEc+T1b2N+teh/OJfpU0jr6" crossorigin="anonymous"></script>
...
<script defer src="/render-maths.js"></script>
...
</head>
The result is for markdown such as
For example, inline maths looks like $(x+1)^2 - (x-1)^2 = 4x$, and block maths like
$$
\sum_{k=1}^n { k! \over (1+k)^2 }
$$
to be displayed as
For example, inline maths looks like (x+1)^2 - (x-1)^2 = 4x, and block maths like
\sum_{k=1}^n { k! \over (1+k)^2 }
Code
Highlighting code is easy with highlight.js. This will colour many different programming languages in any of a number of different themes, expecting HTML markup like
<pre>
<code class="language-python">
...python code...
</code>
</pre>
The CommonMark standard used by AWG has fenced code blocks, which produce tags with CSS classes exactly like this.
Hence the <head> section of the base template pulls in the highlight.js javascript and chosen theme CSS (in this case, gruvbox-light-hard) from a CDN, and instructs the browser to render code once the page has loaded:
<head>
...
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.9.0/build/styles/base16/gruvbox-light-hard.min.css">
...
<script defer src="https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.9.0/build/highlight.min.js" crossorigin="anonymous"></script>
...
<script defer src="/render-code.js"></script>
...
</head>
and /render-code.js runs the highlighter once all the DOM content has loaded:
document.addEventListener("DOMContentLoaded", (event) => {
hljs.highlightAll();
});
The highlight colour theme was chosen to be close to my colour theme, but the background isn’t a perfect match. I fix this with some CSS in /main.css (using CSS custom properties to access Bulma’s derived colours), and also add a fine border:
/* Fix up the style of the code blocks e.g. consistent background colour. */
code.hljs {
border: 1px solid grey;
border-radius: 0px;
background: var(--bulma-primary-95);
}
Of course, the above demonstrates the end-result of formatting some HTML and Javascript code.
Navigation breadcrumb
The navigation breadcrumb works by sub-templating, calling {{ super() }} to retain the navigation from above. So the pattern is as follows (ignoring all styling).
<!-- This is the page template: /_page.html -->
<nav class="navigation">
{% block breadcrumb %}
{% endblock breadcrumb %}
</nav>
<!-- Then in a subclass template in a sub-directory, e.g. /a/_foo.html -->
{% extends "../_page.html" %}
{% block breadcrumb %}
{{ super() }}
<a href="/">Home</a>
{% endblock breadcrumb%}
<!-- And then again, e.g. in /a/b/_bar.html -->
{% extends "../_foo.html" %}
{% block breadcrumb %}
{{ super() }}
<a href="../index.html">Back to Recreational Maths</a>
{% endblock breadcrumb%}
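The same accumulation pattern can be seen in ordinary class inheritance. The following Python sketch is an analogy only (the class names Page, Foo and Bar are illustrative stand-ins for the three templates, and no Jinja is involved): each level calls super() to keep what its parent produced and then appends its own link.

```python
class Page:
    """Plays the role of /_page.html: the breadcrumb block starts empty."""

    def breadcrumb(self) -> list[str]:
        return []


class Foo(Page):
    """Plays the role of /a/_foo.html."""

    def breadcrumb(self) -> list[str]:
        return super().breadcrumb() + ['<a href="/">Home</a>']


class Bar(Foo):
    """Plays the role of /a/b/_bar.html."""

    def breadcrumb(self) -> list[str]:
        return super().breadcrumb() + [
            '<a href="../index.html">Back to Recreational Maths</a>'
        ]


def render_nav(page: Page) -> str:
    """Wrap the accumulated links in the nav element, as the page template does."""
    return '<nav class="navigation">' + "".join(page.breadcrumb()) + "</nav>"


print(render_nav(Bar()))
```

Each deeper page therefore only needs to know about its own link; the links from every level above arrive through the super() call.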
Manifest and favicons
The Web Application Manifest is a JSON file containing metadata about a web application. Although this site is not a web app as such, it improves user experience to use the manifest to document the location of all the favicons and theme colours.
Favicons appear as the icons in browser URL bars, tabs, and bookmark menus, and also in the “add to home screen” feature of touch screen devices.
Adding favicons involves:
- Creating a set of favicons, ensuring the colours are coordinated with the colour theme of the website.
- Telling browsers where to find all the favicons, noting that some are expected in “standard” locations anyway.
I created a set of favicons using an online favicon generator, using the same primary colours as configured in Bulma. These are all copied into the root (/) directory of the site according to the file structure principles described above.
The manifest then points to these favicons, and is itself put in the root directory as /manifest.json (see here).
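As an illustration of what such a manifest holds, here is a minimal Python sketch that assembles one and prints it as JSON. The member names (name, short_name, icons, theme_color, background_color, display) are standard manifest members, but the site name, the colour values, and the android-chrome-… icon file names are placeholders assumed for the example rather than this site's actual values.

```python
import json


def build_manifest(name: str, theme_colour: str, background_colour: str,
                   icon_sizes: list[int]) -> dict:
    """Assemble a web app manifest pointing at icons in the site root."""
    icons = [
        {
            "src": f"/android-chrome-{size}x{size}.png",  # hypothetical file names
            "sizes": f"{size}x{size}",
            "type": "image/png",
        }
        for size in icon_sizes
    ]
    return {
        "name": name,
        "short_name": name,
        "icons": icons,
        "theme_color": theme_colour,
        "background_color": background_colour,
        "display": "standalone",
    }


manifest = build_manifest("Example site", "#336699", "#ffffff", [192, 512])
print(json.dumps(manifest, indent=2))
```

Keeping the colours as function arguments makes it harder for the manifest to drift away from the colours chosen for the rest of the site.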
Lastly, the base template (in _base.html) indicates the principal favicons and the location of the manifest:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
...
<link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
<link rel="icon" type="image/x-icon" href="/favicon.ico" />
<link rel="manifest" href="/manifest.json">
...
</head>
...
</html>
Icons
I use Font Awesome for icons. I found that loading them from Font Awesome produced rendering lag (the icons flickered as they appeared), and also suffered from fragile SRI settings. Hence I host them directly. To keep things self-contained, the fonts and CSS are all in a top level /fontawesome/ directory.
The head section in the _base.html template loads the CSS using:
<link rel="stylesheet" type="text/css" href="/fontawesome/css/fontawesome.min.css" integrity="sha384-{{ '/fontawesome/css/fontawesome.min.css' | sha() }}">
<link rel="stylesheet" type="text/css" href="/fontawesome/css/brands.min.css" integrity="sha384-{{ '/fontawesome/css/brands.min.css' | sha() }}">
<link rel="stylesheet" type="text/css" href="/fontawesome/css/regular.min.css" integrity="sha384-{{ '/fontawesome/css/regular.min.css' | sha() }}">
<link rel="stylesheet" type="text/css" href="/fontawesome/css/solid.min.css" integrity="sha384-{{ '/fontawesome/css/solid.min.css' | sha() }}">
The main fontawesome.min.css fetches the required fonts from /fontawesome/webfonts/.
Colour, styling, and light/dark mode
Colours are both technical and personal. I found these useful to get started:
Then the following helped me experiment with different palettes:
One gotcha I encountered was that there are different variants/standards of RGB.
Bulma has a “customizer” popup on its website which allows colours (and other style aspects) to be tried out before exporting as CSS settings. Because it automatically derives shades, the main task is to decide a Primary colour, a Link colour, and colours for Info, Success, Warning, and Danger.
Bulma also automatically derives and manages the colour variations between light and dark mode. For that to work, one needs to use the “soft” and “bold” colour classes for those elements which should be a function of light/dark mode. For example, I use the has-background-primary-bold-invert and has-text-primary-bold classes for the main page section. See the Bulma docs for details.
Lastly, remember to coordinate the colour choices across the Bulma settings, the manifest, and the favicons.
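Coordinating colours often means converting between notations. The sketch below assumes that the colours exported from Bulma are expressed as hue, saturation and lightness while the manifest and favicon tools want hex values; the function names are illustrative, and only Python's standard colorsys module is used.

```python
import colorsys


def hex_to_hsl(colour: str) -> tuple[int, int, int]:
    """Convert "#rrggbb" to (hue in degrees, saturation %, lightness %)."""
    colour = colour.lstrip("#")
    r, g, b = (int(colour[i:i + 2], 16) / 255 for i in (0, 2, 4))
    h, l, s = colorsys.rgb_to_hls(r, g, b)  # note: colorsys orders these H, L, S
    return round(h * 360), round(s * 100), round(l * 100)


def hsl_to_hex(hue: float, saturation: float, lightness: float) -> str:
    """Convert hue in degrees and saturation/lightness in % to "#rrggbb"."""
    r, g, b = colorsys.hls_to_rgb(hue / 360, lightness / 100, saturation / 100)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


print(hex_to_hsl("#0000ff"))
print(hsl_to_hex(0, 100, 50))
```

Because the percentages are rounded, a round trip through both functions can differ from the starting hex value by a small amount, so it is safest to treat one notation as the source of truth.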
Draft / wip mode
Given the purpose of this website, many files are constantly being revised and refactored. New content could be excluded from the build until completely ready, but a “softer and more organic” approach is to include it with a DRAFT watermark, and to delay linking to such pages or adding them to the sitemap until they are closer to ready. Such content can then be viewed by anyone who knows it is there, but is not readily discovered otherwise; and if it is seen, it is obviously work-in-progress.
To control this watermark, the /_page.html template adds a CSS class if a draft variable is set:
<div class="content {% if draft %}draft-watermark{% endif %}">
{% block page %}{% endblock page %}
</div>
The corresponding CSS is:
.draft-watermark {
background-image: url("draft-watermark.svg");
}
and the corresponding draft-watermark.svg contains
<svg xmlns="http://www.w3.org/2000/svg" version="1.1" height="170px" width="210px">
<text transform="translate(0, 40) rotate(30)" fill="rgba(245,5,5,0.05)" font-size="60px">
DRAFT
</text>
</svg>
Hence to mark a page as in-draft/work-in-progress, set the draft variable at the top of the template as follows:
{% set draft = true %}
{% extends "../_page.html" %}
...etc
XML sitemap and the robots.txt file
To support search engine indexing and SEO, the robots.txt file and related sitemap file (in sitemap.xml) are used to hint to search engines what pages they should index. See Google’s descriptions of robots.txt and sitemaps.
For this site I just use the robots.txt file to point to the sitemap:
Sitemap: {{ SITEURL }}/sitemap.xml
(the SITEURL is set in a template data TOML file).
The sitemap.xml file will also be run through Jinja by AWG because it has an .xml extension. Hence it is a template:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{%- for p in SITEMAP_FILENAMES %}
<url>
<loc>{{ SITEURL }}/{{ p.name }}</loc>
<changefreq>{{ p.change_frequency or "monthly" }}</changefreq>
{%- if p.last_mod %}
<lastmod>{{ p.last_mod }}</lastmod>
{%- endif %}
</url>
{%- endfor %}
</urlset>
(Unlike HTML, XML is not automatically tidied by AWG, hence the few “{%-” whitespace control indicators to indent the output.)
The data describing the set of URLs which should be indexed is kept in _sitemap.toml. For example:
[[SITEMAP_FILENAMES]]
name = "index.html"
change_frequency = "monthly"
[[SITEMAP_FILENAMES]]
name = "welcome/index.html"
change_frequency = "weekly"
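To show how the template and data combine, here is a plain Python sketch that produces the same XML from a list of entries. It is an illustration only and does not use Jinja or AWG; the function name render_sitemap and the example.org URL are assumptions, while the entry keys mirror those in the TOML above.

```python
from xml.sax.saxutils import escape


def render_sitemap(site_url: str, entries: list[dict]) -> str:
    """Render sitemap XML, defaulting changefreq to "monthly" and
    omitting lastmod when an entry has no last_mod value."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for entry in entries:
        lines.append("  <url>")
        lines.append(f"    <loc>{escape(site_url + '/' + entry['name'])}</loc>")
        frequency = entry.get("change_frequency") or "monthly"
        lines.append(f"    <changefreq>{frequency}</changefreq>")
        if entry.get("last_mod"):
            lines.append(f"    <lastmod>{entry['last_mod']}</lastmod>")
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines)


example = render_sitemap(
    "https://example.org",
    [
        {"name": "index.html"},
        {"name": "welcome/index.html", "change_frequency": "weekly",
         "last_mod": "2024-01-01"},
    ],
)
print(example)
```

The first entry has no change_frequency and so falls back to "monthly", matching the `or "monthly"` expression in the template.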
This data file isn’t too cumbersome to maintain (e.g. by listing all candidates with ls -1 **/*.html). It is also possible to put the different entries in different .toml files in their respective directories.
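Listing the candidates can also be scripted. The following Python sketch walks an output directory and prints a [[SITEMAP_FILENAMES]] stanza for every HTML file it finds, ready to be pasted into the data file and pruned by hand. The function name and the default frequency are assumptions; it is not part of AWG.

```python
from pathlib import Path


def sitemap_entries(output_dir: str, default_frequency: str = "monthly") -> str:
    """Return TOML stanzas for every .html file below output_dir."""
    root = Path(output_dir)
    stanzas = []
    for path in sorted(root.rglob("*.html")):
        name = path.relative_to(root).as_posix()  # forward slashes, as in URLs
        stanzas.append(
            "[[SITEMAP_FILENAMES]]\n"
            f'name = "{name}"\n'
            f'change_frequency = "{default_frequency}"\n'
        )
    return "\n".join(stanzas)


if __name__ == "__main__":
    print(sitemap_entries("docs"))
```

Draft pages would still need to be removed from the generated list by hand, since nothing in the file name marks them as drafts.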
Validation
Iterating with various (free) validation sites makes it easy to check for correctness and best practice, and to learn about the web world. For example:
Notable issues I could not address include:
- The existence of trailing slashes on void elements such as meta and link tags. These cannot be fixed by hand because AWG checks and formats all HTML with HTML Tidy, which produces such trailing slashes. The main concern (see here and here) seems to be that because HTML5 is not XML, and since href arguments don’t have to be quoted, there is an ambiguity if the href is the last attribute and the url contains a trailing slash. Compare:
<link href=https://foo.bar.baz/>
<link href="https://foo.bar.baz"/>
<link href="https://foo.bar.baz/">
This isn’t a problem so long as hrefs are always quoted.
- …
Security
Excellent references on web security can be found on Google’s web.dev and Mozilla’s MDN.
To find security weaknesses and information about how to address them, I follow the findings from MDN’s Observatory tool.
Content Security Policy (CSP)
A Content Security Policy (CSP) instructs browsers to place restrictions on what loaded code can do. This defends against attacks such as cross-site scripting (XSS), in which an attacker finds ways to inject malicious code, and clickjacking.
A related concept is Subresource Integrity (SRI), which makes browsers only accept resources when they match the hash contained in the integrity attribute. This attribute is notably available on <script> and <link rel="stylesheet"> tags. So SRI helps to prevent security problems from source file tampering.
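An integrity value is the hash algorithm name, a hyphen, and the base64 encoding of the raw digest of the file's bytes. The following Python sketch computes and checks such a value with the standard library; it is an illustration and not AWG's own filter, and the function names are assumptions.

```python
import base64
import hashlib


def sri_hash(data: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value such as "sha384-<base64 digest>"."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def sri_matches(data: bytes, integrity: str) -> bool:
    """Mimic the browser's check: recompute the hash and compare."""
    algorithm, _, _ = integrity.partition("-")
    return sri_hash(data, algorithm) == integrity


script = b"console.log('hello');"
integrity = sri_hash(script)
print(integrity)
print(sri_matches(script, integrity))         # unchanged content passes
print(sri_matches(script + b" ", integrity))  # any tampering fails
```

Even a single added space changes the digest, which is why a hash must be regenerated whenever the underlying file changes.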
CSP is configured using the Content-Security-Policy HTTP header. Since this is a static site and I don’t have control over the server’s HTTP headers, I use the http-equiv meta tag in every HTML file:
<meta http-equiv="...name of HTTP Header..." content="...HTTP header contents...">
My approach is:
- To deny everything by default and add specific permissions as needed.
- To always use SRI, including for local (or 'self') files. Note that AWG provides a Jinja filter to make it easy to generate the hashes (such as sha384) from the source files.
- To check validity using tools such as CSP Evaluator.
In annotated outline, the CSP is as follows:
upgrade-insecure-requests; <-- Instruct browser to switch site HTTP urls to HTTPS
default-src 'none'; <-- Default fallback is deny
require-trusted-types-for 'script'; <-- See link below
base-uri 'self'; <-- Don't allow the base URL to change from self
img-src 'self'; <-- Only allow images served up from self
manifest-src 'self'; <-- Only allow manifest served up from self
script-src-elem
'strict-dynamic' <-- Trusted scripts (i.e. javascript) are trusted to use other scripts
'sha384-kri+HXDJ8qm2+...' <-- Trust scripts with the following hashes
...etc
;
connect-src
'self' <-- Allow connections to self e.g. for websockets (used by hot reloader)
;
font-src
'self' <-- Allow fonts from self, e.g. Fontawesome.
https://cdn.jsdelivr.net <-- Allow fonts (e.g. for Katex) from jsDelivr CDN
;
style-src
'self' <-- Allow loading of CSS files from self
https://cdn.jsdelivr.net <-- Allow CSS files from jsDelivr CDN
'sha384-vpayKGwduWhgY...' <-- Permit CSS with following hashes (does not seem to do anything!)
...etc
;
(Link for require-trusted-types-for.)
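Once the annotations are stripped, the policy is just a list of directives, each followed by its values and terminated by a semicolon. A minimal Python sketch of assembling that string from structured data follows; the function name build_csp is an assumption and the directive set is abbreviated from the outline above.

```python
def build_csp(directives: dict[str, list[str]]) -> str:
    """Join directives into a single Content-Security-Policy value.

    Each key is a directive name and each value is its list of sources;
    an empty list is used for directives that take no value.
    """
    parts = []
    for name, values in directives.items():
        parts.append(" ".join([name, *values]) if values else name)
    return "; ".join(parts) + ";"


policy = build_csp({
    "upgrade-insecure-requests": [],
    "default-src": ["'none'"],
    "base-uri": ["'self'"],
    "img-src": ["'self'"],
    "style-src": ["'self'", "https://cdn.jsdelivr.net"],
})
print(policy)
```

Note that keywords such as 'none' and 'self' keep their single quotes inside the policy, whereas host sources such as the CDN URL are written bare.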
I discovered a few helpful things along the way:
- Safari does not read style-src-elem, but allows it to exist. Chrome does read it. Hence I use style-src.
- The fallback from, say, style-src-elem to style-src to default-src does not mean “keep trying until one passes”, but “use the most specific directive provided”. If the most specific fails then the permission is denied.
- The style-src section does not do anything with hashes for link files. It neither checks the hashes nor complains if they are present. This could be about CSP level 2 vs level 3. See also, here. I’ve kept the hashes in because I believe it should work like this, and doing so appears harmless.
- If script hashes are provided in script-src then SRI must also be used (i.e. the integrity attribute should exist and contain the hash).
- For both CSS and javascript, if the SRI is present (using the integrity attribute), then it is checked and must pass. Hence independently of CSP, SRI seems uniformly implemented.
- Test on different browsers, because (a) they may behave differently, and (b) when things don’t work they give different diagnostic information (some more helpful than others).
In practice, using template data reduces maintenance overhead and helps document what is going on and where things are from. For example:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
...
<meta http-equiv="Content-Security-Policy" content="
...
script-src-elem
'strict-dynamic'
'{{ KATEX_JS_SHA }}'
'{{ HIGHLIGHT_JS_SHA }}'
'sha384-{{ '/hot-reloader.js' | sha() }}'
'sha384-{{ '/render-maths.js' | sha() }}'
'sha384-{{ '/render-code.js' | sha() }}'
;
...
style-src
'self'
https://cdn.jsdelivr.net
'{{ BULMA_CSS_SHA }}'
'{{ KATEX_CSS_SHA }}'
'{{ GRUVBOX_CSS_SHA }}'
'sha384-{{ '/main.css' | sha() }}'
;
">
...
<link rel="stylesheet" type="text/css" href="{{ KATEX_CSS }}" integrity="{{ KATEX_CSS_SHA }}" crossorigin="anonymous">
...
<script defer src="{{ KATEX_JS }}" integrity="{{ KATEX_JS_SHA }}" crossorigin="anonymous"></script>
...
<script defer src="/hot-reloader.js" integrity="sha384-{{ '/hot-reloader.js' | sha() }}"></script>
...
</head>
...
</html>
Note:
- The crossorigin="anonymous" attribute on the <link> and <script> tags is needed to make the browser send the appropriate CORS headers to fetch external resources without leaking user credentials - see here.
- The sha() Jinja filter provided by AWG is used to statically compute hashes of local content. It is done in two places for each file: the CSP header and the SRI integrity attribute.
- Jinja variables help show meaning and aid re-use. They are kept in a TOML file, e.g. as follows:
# Maths
KATEX_CSS = "https://cdn.jsdelivr.net/npm/katex@0.16.22/dist/katex.min.css"
KATEX_CSS_SHA = "sha384-5TcZemv2l/9On385z///+d7MSYlvIEw9FuZTIdZ14vJLqWphw7e7ZPuOiCHJcFCP"
KATEX_JS = "https://cdn.jsdelivr.net/npm/katex@0.16.22/dist/katex.min.js"
KATEX_JS_SHA = "sha384-cMkvdD8LoxVzGF/RPUKAcvmm49FQ0oxwDF3BGKtDXcEc+T1b2N+teh/OJfpU0jr6"
# Theme for highlight.js
HIGHLIGHT_JS = "https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.9.0/build/highlight.min.js"
HIGHLIGHT_JS_SHA = "sha384-F/bZzf7p3Joyp5psL90p/p89AZJsndkSoGwRpXcZhleCWhd8SnRuoYo4d0yirjJp"
GRUVBOX_CSS = "https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.9.0/build/styles/base16/gruvbox-light-hard.min.css"
GRUVBOX_CSS_SHA = "sha384-vpayKGwduWhgY00faoPtbmJwz8TjOLnnDuqvy+xWy2DWuIVxIt0dxj0mjrMVPxdd"
# Framework
BULMA_CSS = "https://cdn.jsdelivr.net/npm/bulma@1.0.2/css/bulma.min.css"
BULMA_CSS_SHA = "sha384-tl5h4XuWmVzPeVWU0x8bx0j/5iMwCBduLEgZ+2lH4Wjda+4+q3mpCww74dgAB3OX"
Strict Transport Security (HSTS)
This site can use HTTPS throughout. To help prevent manipulator-in-the-middle (MiTM) attacks, the Strict-Transport-Security HTTP Header should be set, together with the upgrade-insecure-requests directive in the CSP.
Hence the following are added to the <head> tag in the _base.html template.
<meta http-equiv="Upgrade-Insecure-Requests" content="1" />
<meta
http-equiv="Strict-Transport-Security"
content="max-age=63072000; includeSubDomains"
/>
(max-age is set to the recommended 2 years).
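The figure of 63072000 is simply two 365-day years expressed in seconds, as this small Python check shows (the variable names are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60          # 86400
max_age = 2 * 365 * SECONDS_PER_DAY     # two years, ignoring leap days

header_value = f"max-age={max_age}; includeSubDomains"
print(header_value)
```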
The upgrade-insecure-requests CSP directive is explained in the Content Security Policy section above.
I’ve also configured GitHub Pages to only serve HTTPS.
NB: The presence of these security settings means that AWG must be run in HTTPS mode.
Deny embedding
A clickjacking approach relies on embedding sites in other sites. Ideally this would be prevented by setting the frame-ancestors CSP directive and the X-Frame-Options header. See here for details. But unfortunately neither can be done using http-equiv and I don’t have control over the server HTTP headers.
Referrer policy
To stop leaking information about where outbound links are coming from (see here), I set the HTTP header as follows:
...
<meta http-equiv="Referrer-Policy" content="no-referrer">
...
MIME types
To inform browsers not to load scripts and stylesheets unless the server indicates the correct MIME type, I set the X-Content-Type-Options header using the <meta> tag to nosniff as explained here:
...
<meta http-equiv="X-Content-Type-Options" content="nosniff">
...
Deployment on GitHub Pages
It is easy and convenient to host static content on GitHub Pages.
One can publish from a chosen git branch, using either the root directory of the repository or a directory called docs/. It would be nice to be able to use a different directory name, but so be it. I just use the docs/ directory on the master branch.
A custom domain can be used by creating a CNAME file containing the full domain (in my case, www.corbettclark.com).
The default GitHub action detects code commits and deploys on their infrastructure, making the result visible within a couple of minutes (often faster).
As I’m the only person making changes, I mostly dispense with creating a branch and making a pull request to myself (GitHub flow), but instead just make a number of meaningful commits locally. Then when ready to publish, I git push to GitHub. In short, my workflow is:
- Start up AWG with ./awg.py content/ docs/ --certfile localhost.pem --keyfile localhost-key.pem
- Repeat until ready to publish:
  - Make changes and check in local browser without leaving my editor (because of hot reload).
  - Commit locally using git.
- Git push to GitHub.
- After a minute or so, check the changes have reached live ok.