How qsgen3 Works

Table of Contents

  1. Philosophy and Design Principles
  2. Project Structure
  3. Configuration System
  4. Content Processing Pipeline
  5. Static File Handling
  6. Template System
  7. Output Generation
  8. Command Line Interface
  9. Dependencies and Requirements
  10. Detailed Workflow
  11. Troubleshooting and Debugging

Philosophy and Design Principles

Core Philosophy

qsgen3 is designed to be 100% design-agnostic. It does not impose any specific CSS frameworks, JavaScript libraries, or HTML structures on users. The generator's role is strictly to:

  1. Process Markdown content into HTML
  2. Combine content with user-chosen templates and styling
  3. Generate a complete static site structure

Key Principles

  • Minimal Dependencies: Requires only Pandoc for content processing, plus Zsh to run the script (rsync is optional)
  • In-Memory Operations: Content manipulation occurs in memory wherever possible to improve performance and reduce storage wear; index and RSS generation use short-lived temporary metadata files
  • Flexible Theme System: Supports easy switching between themes via configuration
  • Template Agnostic: Works with any Pandoc-compatible HTML templates
  • No Forced Assets: Only automatically links the main theme CSS; all other asset inclusion is explicit

Project Structure

A typical qsgen3 project follows this structure:

project-root/
├── bin/
│   └── qsgen3                 # Main generator script
├── site.conf                 # Main configuration file
├── .qsgen3_preserve          # Optional: File preservation patterns
├── content/                  # Markdown content
│   ├── posts/               # Blog posts
│   │   └── hello-world.md
│   └── pages/               # Static pages
├── layouts/                 # Pandoc HTML templates
│   ├── index.html          # Homepage template
│   ├── post.html           # Blog post template
│   └── rss.xml             # RSS feed template
├── static/                  # Static assets (CSS, images, etc.)
│   ├── css/
│   └── images/
└── output/                 # Generated site (created by qsgen3)
    ├── index.html
    ├── rss.xml
    ├── posts/
    └── static/

Configuration System

Primary Configuration: site.conf

The site.conf file uses a simple key-value format:

# Site Metadata
site_name="My Awesome Site"
site_tagline="A brief description of my site"
site_url="http://localhost:8000"

# Directory Paths
paths_content_dir="content"
paths_output_dir="output"
paths_layouts_dir="layouts"
paths_static_dir="static"

# Build Options
build_options_generate_rss=true
build_options_generate_sitemap=true
build_options_process_drafts=false

Configuration Loading Process

  1. File Location: Defaults to $PROJECT_ROOT/site.conf
  2. Override: Can be specified with -c <file> or --config <file>
  3. Parsing: Simple line-by-line parsing of key="value" pairs
  4. Storage: Values stored in QSG_CONFIG associative array
  5. Validation: Basic validation for required keys and file existence
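
The parsing step above can be sketched as a small shell function. This is a minimal illustration, not the script's actual code: parse_config is a hypothetical name, and it prints key=value lines instead of filling QSG_CONFIG so that it runs in any POSIX-style shell.

```shell
# parse_config FILE (hypothetical helper): print one key=value line per
# setting, with one pair of surrounding double quotes stripped.
# In Zsh the pair could instead be stored as QSG_CONFIG[$key]=$value.
parse_config() {
    local conf_file="$1" line key value
    [ -f "$conf_file" ] || { echo "config not found: $conf_file" >&2; return 1; }
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*) continue ;;   # skip blank lines and comments
            *=*) ;;                # looks like key=value
            *) continue ;;         # ignore anything else
        esac
        key=${line%%=*}
        value=${line#*=}
        case $value in
            '"'*'"') value=${value#\"}; value=${value%\"} ;;
        esac
        printf '%s=%s\n' "$key" "$value"
    done < "$conf_file"
}
```

With the sample site.conf shown earlier, parse_config site.conf would print lines such as site_name=My Awesome Site and build_options_generate_rss=true.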

Key Configuration Variables

  • site_name: Name of the site
  • site_url: Base URL for the site (used in RSS and absolute links)
  • paths_*: Directory paths (can be relative or absolute)
  • build_options_*: Boolean flags for optional features

Content Processing Pipeline

Markdown Processing

  1. Discovery: Recursively scans paths_content_dir for .md files
  2. Metadata Extraction: Parses YAML front matter for post metadata
  3. Content Conversion: Uses Pandoc to convert Markdown to HTML
  4. Template Application: Applies appropriate template based on content type
  5. Output Generation: Writes processed HTML to corresponding location in output/

Content Types

  • Posts: Files in content/posts/ → output/posts/
  • Pages: Files in content/pages/ → output/
  • Index: Generated from post metadata → output/index.html
  • RSS: Generated from post metadata → output/rss.xml

Metadata Handling

Each Markdown file can include YAML front matter:

---
title: "Post Title"
date: "2023-01-01"
author: "Author Name"
description: "Post description"
draft: false
---

# Post Content

Your markdown content here...
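
As a rough illustration of metadata extraction, the sketch below reads a single key from the front matter with awk. front_matter_value is a hypothetical helper; it assumes flat key: value pairs only and is not a YAML parser, so how qsgen3 itself extracts metadata may differ.

```shell
# front_matter_value FILE KEY (hypothetical helper): print the value of KEY
# from the leading "---" block. Nested YAML is not supported.
front_matter_value() {
    awk -v key="$2" '
        NR == 1 && $0 != "---" { exit }   # file has no front matter
        NR > 1 && $0 == "---"  { exit }   # closing delimiter reached
        NR > 1 {
            sep = index($0, ":")
            if (sep && substr($0, 1, sep - 1) == key) {
                val = substr($0, sep + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim whitespace
                gsub(/^"|"$/, "", val)             # drop surrounding quotes
                print val
                exit
            }
        }
    ' "$1"
}
```

For the example file above, front_matter_value post.md title prints Post Title, and a key that is absent prints nothing.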

Static File Handling

Copy Strategy

Static files are copied in a specific order to handle theme overrides:

  1. Root Static Files: Copy from paths_static_dir to output/static/
  2. Theme Static Files: Copy from theme's static source to output/static/
  3. Override Behavior: Theme files overwrite root files with same names

Copy Implementation

  • Primary Tool: rsync with -av --delete flags
  • Fallback: cp -R if rsync is unavailable
  • Preservation: Maintains directory structure and file permissions
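
The copy order and the rsync fallback could look roughly like this. copy_static is a hypothetical name, and the sketch deliberately uses rsync -a without -v or --delete: with --delete, a second pass into the same destination would remove files that only the first pass provided. How the real script reconciles this is not stated here.

```shell
# copy_static SRC DEST (hypothetical helper): copy the contents of SRC into
# DEST, preferring rsync and falling back to cp -R.
copy_static() {
    local src="$1" dest="$2"
    mkdir -p "$dest" || return 1
    if command -v rsync >/dev/null 2>&1; then
        # Trailing slashes: copy the contents of src, not the directory itself.
        rsync -a "$src/" "$dest/"
    else
        # "src/." copies the contents, including dotfiles.
        cp -R "$src/." "$dest/"
    fi
}
```

Calling copy_static for the root static directory first and for the theme's static source second gives the override behavior described above: theme files replace root files with the same name, and other root files are left in place.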

CSS File Linking

  1. Availability: Theme CSS files are copied to output/static/
  2. Verification: Script checks for CSS file existence after copying
  3. Pandoc Integration: CSS path passed to Pandoc via --css flag
  4. Path Format: Uses site-root-relative paths (e.g., /static/css/style.css)
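
A sketch of the verification and path-construction steps, under the assumption that the stylesheet path is given relative to output/static/. css_arg_for is a hypothetical name; --css is a standard Pandoc option.

```shell
# css_arg_for OUTPUT_DIR CSS_REL (hypothetical helper): print the --css
# argument for Pandoc if the stylesheet was copied; otherwise warn.
css_arg_for() {
    local output_dir="$1" css_rel="$2"
    if [ -f "$output_dir/static/$css_rel" ]; then
        # Site-root-relative URL, independent of where output/ lives on disk.
        printf '%s\n' "--css=/static/$css_rel"
    else
        echo "warning: $output_dir/static/$css_rel not found; no CSS linked" >&2
        return 1
    fi
}
```

For example, css_arg_for output css/style.css prints --css=/static/css/style.css when output/static/css/style.css exists.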

Template System

Template Types

qsgen3 uses Pandoc templates with specific purposes:

  • index.html: Homepage template (receives post list metadata)
  • post.html: Individual post template (receives post content and metadata)
  • rss.xml: RSS feed template (receives post list for syndication)

Template Variables

Templates receive data through Pandoc's variable system:

Post Templates

  • $title$: Post title from front matter
  • $date$: Post date
  • $author$: Post author
  • $body$: Converted HTML content
  • Custom variables from YAML front matter
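
To show how these variables appear in practice, the sketch below writes a minimal post template from a shell heredoc. write_post_template and the HTML layout are illustrative assumptions; the $for(css)$ loop is the usual Pandoc way to emit stylesheets passed with --css.

```shell
# write_post_template DEST (hypothetical helper). The quoted 'EOF' stops the
# shell from expanding the $...$ placeholders, which Pandoc fills in later.
write_post_template() {
    cat > "$1" <<'EOF'
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>$title$</title>
$for(css)$
  <link rel="stylesheet" href="$css$">
$endfor$
</head>
<body>
  <article>
    <h1>$title$</h1>
    <p>$date$$if(author)$, by $author$$endif$</p>
$body$
  </article>
</body>
</html>
EOF
}
```

The $if(author)$ guard keeps the byline out of the page when a post has no author in its front matter.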

Index Template

  • $site_name$: From site.conf
  • $site_tagline$: From site.conf
  • $posts$: Array of post metadata for listing

RSS Template

  • $site_url$: Base URL for absolute links
  • $posts$: Array of post data with URLs and content

Template Resolution

  1. Theme Override: If theme provides templates, use theme's layouts/
  2. Default: Use project's layouts/ directory
  3. Fallback: Error if required template not found
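
The resolution order can be expressed as a short lookup function. This is a sketch under the assumption that both layout directories are known to the caller; resolve_template is a hypothetical name.

```shell
# resolve_template NAME THEME_LAYOUTS PROJECT_LAYOUTS (hypothetical helper):
# print the path of the template to use, preferring the theme's copy.
resolve_template() {
    local name="$1" theme_layouts="$2" project_layouts="$3"
    if [ -n "$theme_layouts" ] && [ -f "$theme_layouts/$name" ]; then
        printf '%s\n' "$theme_layouts/$name"
    elif [ -f "$project_layouts/$name" ]; then
        printf '%s\n' "$project_layouts/$name"
    else
        echo "error: required template '$name' not found" >&2
        return 1
    fi
}
```

Passing an empty string as the theme directory skips straight to the project's layouts/ directory.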

Output Generation

Directory Structure

Generated output maintains a clean, predictable structure:

output/
├── index.html           # Homepage
├── rss.xml             # RSS feed
├── posts/              # Individual post pages
│   └── post-name.html
├── static/             # All static assets
│   ├── css/           # Stylesheets
│   ├── js/            # JavaScript (if provided by theme)
│   └── images/        # Images and media
└── css/               # Legacy: Index-specific CSS location
    └── theme.css      # Copy of main theme CSS for index page

File Naming

  • Posts: content/posts/hello-world.md → output/posts/hello-world.html
  • Pages: content/pages/about.md → output/about.html
  • Index: Generated → output/index.html
  • RSS: Generated → output/rss.xml
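
The naming rules above amount to a simple path mapping. The sketch below assumes the default content/ and output/ directories from the sample configuration; output_path_for is a hypothetical name.

```shell
# output_path_for SRC (hypothetical helper): map a content path to its
# output path. Assumes the default content/ and output/ directories.
output_path_for() {
    local src="$1" name
    name=$(basename "$src" .md)
    case $src in
        content/posts/*) printf 'output/posts/%s.html\n' "$name" ;;
        content/pages/*) printf 'output/%s.html\n' "$name" ;;
        *) return 1 ;;   # not a recognised content type
    esac
}
```

For example, output_path_for content/pages/about.md prints output/about.html.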

URL Structure

  • Posts: /posts/post-name.html
  • Pages: /page-name.html
  • Static Assets: /static/path/to/asset
  • CSS: /static/css/style.css (for posts), /css/theme.css (for index)

Command Line Interface

Basic Usage

./bin/qsgen3 [options]

Available Options

  • -h, --help: Display usage information and exit
  • -V, --version: Show script name and version, then exit
  • -c <file>, --config <file>: Specify custom configuration file path

Path Resolution

  • PROJECT_ROOT: Defaults to current working directory ($PWD)
  • CONFIG_FILE: Defaults to $PROJECT_ROOT/site.conf
  • Relative Paths: Configuration file path can be relative to project root
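
A minimal sketch of how the options and path defaults could be handled. parse_args and the ACTION variable are assumptions made for illustration; PROJECT_ROOT and CONFIG_FILE are the names used above.

```shell
# parse_args ARGS... (hypothetical helper): set PROJECT_ROOT, CONFIG_FILE
# and ACTION (build, help or version) from the command line.
parse_args() {
    PROJECT_ROOT=${PROJECT_ROOT:-$PWD}
    CONFIG_FILE=$PROJECT_ROOT/site.conf
    ACTION=build
    while [ $# -gt 0 ]; do
        case $1 in
            -h|--help)    ACTION=help ;;
            -V|--version) ACTION=version ;;
            -c|--config)
                [ $# -ge 2 ] || { echo "error: $1 requires a file argument" >&2; return 1; }
                CONFIG_FILE=$2
                shift ;;
            *)  echo "error: unknown option: $1" >&2; return 1 ;;
        esac
        shift
    done
}
```

A caller would run parse_args "$@" || exit 1 and then branch on ACTION, which matches the exit codes listed in the next section.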

Exit Codes

  • 0: Successful generation
  • 1: Error (missing dependencies, configuration issues, processing failures)

Dependencies and Requirements

Required Dependencies

  • Pandoc: Core dependency for Markdown processing and HTML generation
  • Zsh: Shell interpreter (script written in Zsh)

Optional Dependencies

  • rsync: Preferred tool for efficient file copying (falls back to cp)

System Requirements

  • Operating System: Linux/Unix-like systems
  • File System: Support for standard Unix file permissions
  • Memory: Minimal requirements (all processing in memory)

Environment Setup

The script configures a consistent environment:

LC_ALL=C
LANG=C
umask 0022

Detailed Workflow

1. Initialization Phase

Start qsgen3
├── Parse command line arguments
├── Set PROJECT_ROOT (default: $PWD)
├── Determine CONFIG_FILE path
├── Set environment variables (LC_ALL, LANG, umask)
└── Initialize QSG_CONFIG array

2. Configuration Loading

Load Configuration
├── Check if CONFIG_FILE exists
├── Parse key="value" pairs line by line
├── Strip quotes from values
├── Store in QSG_CONFIG associative array
└── Validate required configuration keys

3. Dependency Checking

Check Dependencies
├── Verify Pandoc is available
├── Check Pandoc version compatibility
├── Verify other required tools
└── Exit with error if dependencies missing
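
The availability check can be sketched with command -v, which is the portable way to test whether a tool is on PATH. check_dependencies is a hypothetical name, and the Pandoc version check is left out because the required version is not stated here.

```shell
# check_dependencies TOOL... (hypothetical helper): report every missing
# tool and return non-zero if any is absent.
check_dependencies() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "error: required tool not found: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A typical call would be check_dependencies pandoc || exit 1. Reporting all missing tools before failing saves the user from fixing them one run at a time.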

4. Output Preparation

Prepare Output Directory
├── Check for .qsgen3_preserve file in project root
├── If preserve file exists:
│   ├── Read file patterns (shell glob patterns)
│   ├── Create temporary backup directory
│   ├── Find and backup matching files from output directory
│   ├── Remove entire output directory
│   ├── Recreate clean output directory
│   ├── Restore preserved files maintaining directory structure
│   └── Clean up temporary backup directory
├── If no preserve file:
│   ├── Remove entire output directory
│   └── Create fresh output directory
└── Log preservation and cleaning operations

File Preservation System

qsgen3 supports preserving specific files during the cleaning process to handle cases where content has been shared or bookmarked and should remain accessible even after title changes.

Preserve File Format (.qsgen3_preserve):

  • Located in project root directory
  • One pattern per line using shell glob patterns (*, ?, [])
  • Lines starting with # are comments
  • Empty lines are ignored
  • Patterns are relative to the output directory

Example preserve patterns:

# Preserve specific shared articles
posts/my-important-shared-article.html
posts/viral-blog-post.html

# Preserve files by pattern
posts/legacy-*.html
archive/*

# Preserve all PDFs and downloads
*.pdf
downloads/*

Benefits:

  • Maintains stable URLs for shared content
  • Prevents broken links when content is renamed
  • Flexible pattern matching for various preservation needs
  • Backward compatible (no preserve file = complete cleaning)
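
The backup, clean and restore sequence from step 4 might be sketched as follows. Both function names are hypothetical, and matching is delegated to find -path, which is one possible way to apply the patterns and not necessarily how qsgen3 does it.

```shell
# list_preserved OUTPUT_DIR PRESERVE_FILE (hypothetical helper): print every
# file under OUTPUT_DIR that matches a pattern in PRESERVE_FILE.
list_preserved() {
    local output_dir="$1" preserve_file="$2" pattern
    [ -f "$preserve_file" ] || return 0        # no preserve file: keep nothing
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        case $pattern in
            ''|'#'*) continue ;;               # skip blank lines and comments
        esac
        # Patterns are relative to the output directory.
        find "$output_dir" -type f -path "$output_dir/$pattern"
    done < "$preserve_file" | sort -u
}

# clean_output OUTPUT_DIR PRESERVE_FILE (hypothetical helper): back up the
# preserved files, wipe OUTPUT_DIR, then restore them.
clean_output() {
    local output_dir="$1" preserve_file="$2" backup file rel
    [ -n "$output_dir" ] || return 1           # never rm -rf an empty path
    backup=$(mktemp -d) || return 1
    if [ -d "$output_dir" ]; then
        list_preserved "$output_dir" "$preserve_file" |
        while IFS= read -r file; do
            rel=${file#"$output_dir"/}
            mkdir -p "$backup/$(dirname "$rel")"
            cp -p "$file" "$backup/$rel"
        done
    fi
    rm -rf "$output_dir"
    mkdir -p "$output_dir"
    cp -R "$backup/." "$output_dir/"           # restore with directory structure
    rm -rf "$backup"
}
```

One difference from shell globbing is worth knowing: in find -path, * also matches /, so a pattern such as *.pdf matches PDFs at any depth under the output directory.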

5. Static File Processing

Copy Static Files
├── Copy from paths_static_dir to output/static/
│   ├── Use rsync -av --delete if available
│   └── Fallback to cp -R
├── Copy from theme static source to output/static/
│   ├── Theme files overwrite root files
│   └── Preserve directory structure
└── Log copy operations and results

6. CSS Path Determination

Determine CSS Linking
├── Read site_theme_css_file from configuration
├── Construct expected CSS file path in output/static/
├── Verify CSS file exists after copying
├── Set QSG_CONFIG[pandoc_css_path_arg] for Pandoc
└── Log CSS path decisions and warnings

7. Content Processing

Process Markdown Content
├── Scan paths_content_dir recursively for .md files
├── For each Markdown file:
│   ├── Extract YAML front matter
│   ├── Determine output path and template
│   ├── Run Pandoc with appropriate template and CSS
│   ├── Write generated HTML to output directory
│   └── Log processing results
└── Collect metadata for index and RSS generation
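
For posts, the loop above could be sketched like this. build_posts and its arguments are assumptions; --standalone, --template, --css and --output are standard Pandoc options, and pandoc is assumed to be on PATH.

```shell
# build_posts CONTENT_DIR OUTPUT_DIR TEMPLATE [CSS_ARG] (hypothetical
# helper): convert every post under CONTENT_DIR/posts with Pandoc.
build_posts() {
    local content_dir="$1" output_dir="$2" template="$3" css_arg="$4" src name
    mkdir -p "$output_dir/posts" || return 1
    find "$content_dir/posts" -type f -name '*.md' | while IFS= read -r src; do
        name=$(basename "$src" .md)
        # ${css_arg:+...} drops the argument entirely when no CSS is set.
        pandoc "$src" --standalone --template="$template" \
            ${css_arg:+"$css_arg"} \
            --output="$output_dir/posts/$name.html" \
            || echo "error: pandoc failed for $src" >&2
    done
}
```

A call might look like build_posts content output layouts/post.html --css=/static/css/style.css. Draft handling and metadata collection are omitted to keep the sketch short.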

8. Index Generation

Generate Index Page
├── Collect all post metadata
├── Create YAML metadata file for Pandoc
├── Run Pandoc with index template
├── Apply CSS styling
├── Write output/index.html
└── Clean up temporary files
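
The YAML metadata step could look roughly like the sketch below. write_index_metadata, the title|date|url input format and the per-post field names are all assumptions; only site_name and the posts list come from the template variables described earlier.

```shell
# write_index_metadata SITE_NAME (hypothetical helper): read
# "title|date|url" lines on stdin and print a YAML metadata block.
# Double quotes inside values are not escaped in this sketch.
write_index_metadata() {
    local site_name="$1" title date url
    printf '%s\n' '---'
    printf 'site_name: "%s"\n' "$site_name"
    printf 'posts:\n'
    while IFS='|' read -r title date url; do
        printf '  - title: "%s"\n    date: "%s"\n    url: "%s"\n' "$title" "$date" "$url"
    done
    printf '%s\n' '---'
}
```

The output can be written to a temporary file and handed to Pandoc with its --metadata-file option, so that a $for(posts)$ loop in the index template can iterate over the entries.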

9. RSS Generation

Generate RSS Feed
├── Collect post metadata with URLs
├── Create YAML metadata for RSS template
├── Run Pandoc with RSS template
├── Generate absolute URLs using site_url
├── Write output/rss.xml
└── Clean up temporary files
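
Building absolute URLs from site_url mostly comes down to avoiding doubled or missing slashes. A minimal sketch, with absolute_url as a hypothetical name:

```shell
# absolute_url BASE REL (hypothetical helper): join site_url and a
# site-relative path with exactly one slash between them.
absolute_url() {
    local base="${1%/}" rel="${2#/}"
    printf '%s/%s\n' "$base" "$rel"
}
```

For example, absolute_url "http://localhost:8000/" "/posts/hello-world.html" prints http://localhost:8000/posts/hello-world.html.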

10. Finalization

Complete Generation
├── Log final directory structure
├── Report generation success
├── Clean up any remaining temporary files
└── Exit with status code 0

Troubleshooting and Debugging

Common Issues

1. CSS Not Applied

Symptoms: Generated HTML doesn't show theme styling

Causes:

  • Incorrect site_theme_css_file path in site.conf
  • CSS file doesn't exist in theme's static assets
  • Theme static directory structure mismatch

Solutions:

  • Verify CSS file path relative to theme's static source
  • Check theme directory structure
  • Enable debug logging to trace CSS path resolution

2. Template Errors

Symptoms: Pandoc errors during HTML generation

Causes:

  • Missing required templates
  • Template syntax errors
  • Incompatible template variables

Solutions:

  • Verify all required templates exist
  • Check Pandoc template syntax
  • Review template variable usage

3. Static File Copy Issues

Symptoms: Assets missing from output directory

Causes:

  • Permission issues
  • Disk space problems
  • Path resolution errors

Solutions:

  • Check file permissions
  • Verify available disk space
  • Review path configurations for absolute vs. relative paths

4. File Preservation Issues

Symptoms: Expected files not preserved during cleaning, or preservation not working

Causes:

  • Incorrect patterns in .qsgen3_preserve file
  • File paths don't match patterns
  • Permission issues with temporary backup directory
  • Malformed preserve file format

Solutions:

  • Verify patterns use shell glob syntax (*, ?, [])
  • Check that patterns are relative to output directory
  • Ensure .qsgen3_preserve file is in project root
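  • Test patterns with find output -path "output/pattern" before adding them to the preserve file (find's -name test only looks at the file name, so it never matches a pattern containing a slash)
  • Enable debug logging to see preservation process details
  • Verify file permissions allow temporary directory creation

Example debugging:

# Test if your pattern matches files (-path matches the whole path, -name only the file name)
find output -path "output/posts/legacy-*.html"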

# Enable debug logging to see preservation process
QSG_DEBUG=1 ./bin/qsgen3

Debug Logging

Enable detailed logging by setting QSG_DEBUG=1 for a run; for anything more specific, modify the _log function or add your own debug statements:

# Enable debug logging
QSG_DEBUG=1 ./bin/qsgen3
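
The actual _log function is not shown in this document. As a sketch of how a QSG_DEBUG-aware logger might behave, assuming a level-first signature:

```shell
# _log LEVEL MESSAGE... (signature is an assumption): write a tagged message
# to stderr; DEBUG messages appear only when QSG_DEBUG=1.
_log() {
    local level="$1"
    shift
    if [ "$level" = DEBUG ] && [ "${QSG_DEBUG:-0}" != 1 ]; then
        return 0
    fi
    printf '[%s] %s\n' "$level" "$*" >&2
}
```

Writing to stderr keeps log output separate from anything the script prints on stdout, so ./bin/qsgen3 2>build.log captures the log on its own.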

Path Debugging

The script includes path resolution logic to handle both relative and absolute paths. If experiencing path issues:

  1. Check that PROJECT_ROOT is correctly set
  2. Verify configuration paths are relative to project root
  3. Review log messages for path construction details

Configuration Validation

Ensure site.conf follows the correct format:

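  • Use double quotes for string values: key="value" (the sample configuration leaves boolean flags such as build_options_generate_rss=true unquoted)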
  • No spaces around the equals sign
  • One configuration per line
  • Comments start with #
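
These rules can be checked mechanically before a build. The sketch below is an assumption about what "correct format" means in practice: lint_config is a hypothetical helper that accepts key="value" as well as unquoted values without spaces, and reports every other line.

```shell
# lint_config FILE (hypothetical helper): print each line that is neither a
# comment, a blank line, nor a well-formed key=value pair.
lint_config() {
    awk '
        /^[ \t]*$/ || /^#/ { next }    # blank line or comment
        !/^[A-Za-z_][A-Za-z0-9_]*=("[^"]*"|[^" \t]+)$/ {
            printf "line %d: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

A line such as site_name = "My Site" is reported because of the spaces around the equals sign, and the function returns non-zero so that a caller can stop the build.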