Blog Architecture Redux

27 January, 2025
808 words
3 minute read time

There's a common joke out there that bloggers will write more about their blogging system than actual content.

Blog is dead, long live Blog

My first post that explains how I write my blog is hilariously outdated now. Between then and now, my process has been:

Generate Jekyll locally
Push to S3 with s3_website

The aforementioned blog post outlined self-hosting static assets on a VPS, which is a stupid architecture with modern object storage options. Static hosting on S3 alleviates any scaling or availability concerns.

What I didn't like

I found myself dissuaded from writing for a few reasons:

Authoring Markdown is annoying. I mostly use org mode these days, which is empirically a better markup format. It's much easier to write code samples, preview formatting, and work with metadata.
Syntax highlighting is hard to get right. I'd write code in my editor and whatever-the-hell Jekyll likes to use would look different. I want HTML-rendered code to look the same as when I wrote it.
I really dislike working in Ruby these days. I prefer Rust if the use case is Very Important, but a language with less obnoxious patterns would make me more inclined to improve my blog. This doesn't matter if you aren't working in the plumbing of the static generation, but I am. And if I'm giving up strong typing, I might as well have as much power as I can get.

My time in lisp land has left me with the following impressions:

Strong typing is still table stakes for any serious software engineering, but if you’re going to throw it all out the window to play fast and loose, lisp gives you everything you need to Fuck Shit Up (in a fun way)
— tyler (@leothrix) January 4, 2025

Don't Say Org Mode

It's org-mode now. The new setup works like this:

Author blog posts in org mode. One file per post; I don't write anything HTML-specific.
Use an emacs package called weblorg to turn my org-mode files into web pages. If you've worked with org mode before, you know emacs can export org-mode buffers to HTML. This works, but it lacks much of what an actual blogging web site needs (think: templating, slugs, and so on). weblorg is a veneer over org-mode's HTML export functions.
- I have… a significant number of customizations on top of weblorg. I'll include them here if you want to follow my example.
Use the newer s3_website_revived to sync the site into S3 and reconfigure CloudFront.

Don't Keep Talking About Org Mode

I wanted to write org files instead of md files, but there were extra benefits:

Through htmlize I get a one-to-one style export from my source code blocks to generated HTML and CSS. I don't pretend to understand how htmlize can translate editor fonts identically to CSS, I just smile nervously and move along.
Writing Ruby in 2025 with the mental degradation inherent to my mid-30s brain was an exercise in frustration. weblorg lacks many features I need, but emacs lisp is nothing if not a language explicitly designed to be torn apart and stitched back together. I've replaced every feature that I missed from Jekyll by shuffling up either weblorg or org-mode functions (oftentimes with a hammer like advice-add).
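
For anyone who hasn't met it, advice-add wraps an existing function without touching its source. Here is a minimal sketch of the idea. It uses two hypothetical functions (my/slugify and my/trim-dashes, neither of which exists in weblorg) and is not one of the actual customizations behind this blog:

```elisp
;; Hypothetical functions for illustration; none of these come from weblorg.
(require 'subr-x) ; provides `string-trim' on older Emacs versions

(defun my/slugify (title)
  "Turn TITLE into a lowercase, dash-separated slug."
  (downcase (replace-regexp-in-string "[^[:alnum:]]+" "-" title)))

(defun my/trim-dashes (slug)
  "Strip leading and trailing dashes from SLUG."
  (string-trim slug "-+" "-+"))

;; Patch the return value of `my/slugify' without editing its definition.
(advice-add 'my/slugify :filter-return #'my/trim-dashes)

(my/slugify "Hello, World!") ; => "hello-world"
```

Removing the patch is symmetrical: (advice-remove 'my/slugify #'my/trim-dashes) restores the original behavior, trailing dash included.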

I don't recommend this setup. It's completely reliant on emacs and I don't think anyone else would enjoy working this way except for me. But I'm here to report it works.

Other Release Notes

Welcome to my 2025 blog! I also updated:

CSS, using the latest Pico release
Several pages with either old architectures or techniques that needed revision
My own tone to be more reflective of my current attitude (less exuberance and more terse skepticism)

Happy new year!

Appendix A: `weblorg` Functions

weblorg lacks several functions you need to emulate a Jekyll-like experience. These functions may be useful to others.

Post Metadata

This snippet:

ELisp

(weblorg-route
 :name "posts"
 :input-pattern "posts/*.org"
 :template "post.html"
 :output "_site/{{ file_slug }}/index.html"
 :input-aggregate #'weblorg-input-aggregate-with-navigation
 :input-parser #'tjl/weblorg-parse-org-file-extras
 :url "/{{ file_slug }}/")

Enriches posts in two ways:

In aggregate with #'weblorg-input-aggregate-with-navigation
Per-file with #'tjl/weblorg-parse-org-file-extras

Aggregate Functions

The #'weblorg-input-aggregate-with-navigation function adds "next" and "previous" metadata to posts; pretty important for site navigation.

ELisp

(defun weblorg-input-aggregate-with-navigation (posts)
  "Add next and previous links to each post in POSTS.
POSTS should be sorted in reverse chronological order (newest first).
Returns a list in the same format as weblorg-input-aggregate-each."
  (let* ((sorted-posts (sort (copy-sequence posts) #'weblorg--compare-posts-desc))
         (len (length sorted-posts)))
    (cl-loop for i from 0 below len
             for post = (nth i sorted-posts)
             collect
             (let ((post-with-nav (copy-alist post)))
               (when (> i 0)
                 (let ((next-post (nth (1- i) sorted-posts)))
                   (push (cons "next"
                               `(("title" . ,(cdr (assoc "title" next-post)))
                                 ("file_slug" . ,(cdr (assoc "file_slug" next-post)))))
                         post-with-nav)))
               (when (< i (1- len))
                 (let ((prev-post (nth (1+ i) sorted-posts)))
                   (push (cons "prev"
                               `(("title" . ,(cdr (assoc "title" prev-post)))
                                 ("file_slug" . ,(cdr (assoc "file_slug" prev-post)))))
                         post-with-nav)))
               `(("post" . ,post-with-nav))))))

Both the links beside each post title and the navigation links at the bottom of each post are built from these titles and slugs.
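
The index arithmetic is easier to see in isolation. The sketch below is a simplified stand-in, not the function above: my/neighbors is a hypothetical helper that works on bare slugs instead of weblorg post alists, and it assumes the list is already sorted newest first.

```elisp
(require 'cl-lib)

;; Hypothetical helper: operates on bare slugs, not weblorg post alists.
(defun my/neighbors (slugs)
  "Pair each of SLUGS (newest first) with its newer and older neighbor."
  (cl-loop for i from 0 below (length slugs)
           collect (list (nth i slugs)
                         ;; A newer post sits one index earlier.
                         :next (and (> i 0) (nth (1- i) slugs))
                         ;; An older post sits one index later; `nth'
                         ;; returns nil past the end of the list.
                         :prev (nth (1+ i) slugs))))

(my/neighbors '("third" "second" "first"))
;; => (("third" :next nil :prev "second")
;;     ("second" :next "third" :prev "first")
;;     ("first" :next "second" :prev nil))
```

One difference worth noting: the sketch stores nil for a missing neighbor, whereas the aggregate function leaves the "next" or "prev" key out entirely.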

Per-Post Functions

I use #'tjl/weblorg-parse-org-file-extras as an input parser for weblorg-compatible org files to get:

Excerpts
- Jekyll's default excerpt separator is a blank line ("\n\n"), so the first paragraph of a post becomes its "excerpt", the part you'd include as a preview. My code instead uses all content leading up to the first org horizontal rule markup (-----).
- I also make this cleaner by trimming the superfluous horizontal rule when generating actual HTML so it disappears on export.
Word count
- Not hard, but made a little smarter by excluding non-prose.
Read time
- Apparently the consensus is that dividing the word count by 233 gives (roughly) the number of minutes it takes to read the content. This is probably inaccurate, but I'm okay with that.
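
Two of these rules are small enough to sketch on their own. Both helpers below are hypothetical and simplified: my/excerpt-before-rule cuts a plain string at the first line of five or more dashes (the real code works on the parsed org tree), and my/reading-minutes repeats the integer division with a floor of one minute.

```elisp
;; Hypothetical helper; a plain-string stand-in for the tree-based approach.
(defun my/excerpt-before-rule (org-text)
  "Return ORG-TEXT up to its first horizontal rule, or all of it if none.
A rule is a line holding only five or more dashes."
  (if (string-match "^[ \t]*-\\{5,\\}[ \t]*$" org-text)
      (substring org-text 0 (match-beginning 0))
    org-text))

;; Hypothetical helper; the 233 words-per-minute figure is the post's own.
(defun my/reading-minutes (words)
  "Return the whole minutes needed to read WORDS, with a floor of 1."
  (max 1 (/ words 233)))

(my/excerpt-before-rule "Intro paragraph.\n\n-----\n\nThe rest.")
;; => "Intro paragraph.\n\n"

(my/reading-minutes 808) ; => 3 (808 / 233 truncates 3.47 down to 3)
(my/reading-minutes 120) ; => 1 (0 after division, raised to the floor)
```

The 808-word case matches the header of this post: 808 words, 3 minute read time.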

ELisp

(defun tjl/weblorg-parse-org-file-extras (input-path)
  "Like weblorg--parse-org but more fields for static site gen."
  (let* ((input-data (with-temp-buffer
                       (insert-file-contents input-path)
                       (buffer-string)))
         (org-tree (with-temp-buffer
                     (insert input-data)
                     (org-element-parse-buffer)))
         ;; I desperately wish that I didn't have to parse it twice,
         ;; but the interaction between `copy-tree` and
         ;; `org-element-map` is unreliable.
         (excerpt-tree (org-remove-image-links
                        (weblorg--extract-excerpt-from-tree
                         (with-temp-buffer
                           (insert input-data)
                           (org-element-parse-buffer)))))
         ;; Get standard parse results
         (keywords (with-temp-dir-file tmpfile "tjl-weblorg"
                                       (file-name-base input-path)
                                       (org-element-interpret-data
                                        (remove-first-horizontal-rule org-tree))
                                       (weblorg--parse-org-file tmpfile)))
         (words (org-prose-word-count org-tree))
         ;; Convert excerpt tree back to org syntax
         (excerpt-org (when excerpt-tree
                        (with-temp-buffer
                          (insert (org-element-interpret-data excerpt-tree))
                          (buffer-string)))))
    ;; If we got an excerpt, parse it to HTML and add to keywords
    (when excerpt-org
      (let ((excerpt-html 
             (with-temp-buffer
               (insert excerpt-org)
               (org-export-as 'html nil nil t))))
        (push (cons "excerpt" excerpt-html) keywords)))
    (push (cons "words" (cl-format nil "~:d" words)) keywords)
    (push (cons "reading_minutes" (max 1 (/ words 233))) keywords)
    keywords))

(defun org-prose-word-count (element)
  "Count words in prose sections of an org-mode ELEMENT tree.
ELEMENT should be the result of `org-element-parse-buffer'.
Returns number of words in regular paragraphs, excluding source blocks,
drawers, and other metadata."
  (let ((word-count 0))
    (org-element-map element '(paragraph plain-text)
      (lambda (elem)
        ;; Only process paragraphs that aren't inside special blocks
        (when (and (eq (org-element-type elem) 'paragraph)
                   (not (member (org-element-type (org-element-property :parent elem))
                                '(src-block example-block verse-block
                                            quote-block center-block comment-block
                                            drawer property-drawer keyword))))
          (cl-incf word-count
                   (count-words-string
                    (org-element-interpret-data elem))))))
    word-count))

(defun count-words-string (str)
  "Count words in STR, considering only regular prose words."
  (with-temp-buffer
    (insert str)
    ;; Remove any remaining org markup
    (goto-char (point-min))
    (while (re-search-forward "\\[\\[.*?\\]\\]\\|\\[.*?\\]" nil t)
      (replace-match " "))
    ;; Count remaining words
    (count-words (point-min) (point-max))))

(defun remove-first-horizontal-rule (tree)
  "Remove the first horizontal rule from an org-element tree.
The tree should be obtained from `org-element-parse-buffer'."
  (let ((found nil))
    (org-element-map tree 'horizontal-rule
      (lambda (hr)
        (unless found
          (setq found t)
          (org-element-set-element
           hr
           (org-element-create 'paragraph))))
      nil nil t))
  tree)

(defun weblorg--find-horizontal-rule (tree)
  "Find first horizontal rule in parsed org TREE.
Returns the position of the horizontal rule, or nil if none found."
  (org-element-map tree 'horizontal-rule
    (lambda (hr)
      (org-element-property :begin hr))
    nil t)) ; use t to stop after first match

(defun weblorg--truncate-tree-at-pos (tree pos)
  "Truncate org element TREE at position POS."
  (org-element-map tree '(section paragraph plain-list table special-block)
    (lambda (element)
      (let ((elem-end (org-element-property :end element)))
        (when (<= elem-end pos)
          element)))
    nil nil nil t)) ; collect every match (nil results are dropped); t is WITH-AFFILIATED

(defun weblorg--extract-excerpt-from-tree (tree)
  "Extract excerpt from org TREE up to first horizontal rule.
Returns the excerpt as a new org element tree."
  (let ((hr-pos (weblorg--find-horizontal-rule tree)))
    (if hr-pos
        (weblorg--truncate-tree-at-pos tree hr-pos)
      tree))) ; If no horizontal rule, use entire tree

(defmacro with-temp-dir-file (file-var prefix name content &rest body)
  "Create a temporary file in a new directory, execute BODY, then clean up.
FILE-VAR is bound to the temp file path during execution.
PREFIX is passed to `make-temp-file' to create the directory.
NAME is the file name used inside that directory.
CONTENT is written to the file before BODY executes."
  (declare (indent 1))
  (let ((temp-dir-sym (make-symbol "temp-dir")))
    `(let* ((,temp-dir-sym (make-temp-file ,prefix t))
            (,file-var (expand-file-name ,name ,temp-dir-sym)))
       (unwind-protect
           (progn 
             (with-temp-file ,file-var
               (insert ,content))
             ,@body)
         (delete-directory ,temp-dir-sym t)))))

(defun org-remove-image-links (element)
  "Remove all image link elements from an org-element tree.
ELEMENT is the root of the tree, as returned by `org-element-parse-buffer'.
Returns a new tree with all image links removed."
  (org-element-map element t
    (lambda (node)
      (when (and (eq 'link (org-element-type node))
                 (string-match-p (rx bos (or "file" "http" "https"))
                                 (org-element-property :type node))
                 (string-match-p (rx "." (or "jpg" "jpeg" "png" "gif" "svg" "webp") eos)
                                 (org-element-property :path node)))
        (org-element-extract-element node)))
    nil nil t)
  element)

So much easier than relying on pre-built Jekyll functions. I'm choking up because I'm happy, not because I spent days converting Markdown posts to org-mode.
