# How qsgen3 Works

## Table of Contents

1. [Philosophy and Design Principles](#philosophy-and-design-principles)
2. [Project Structure](#project-structure)
3. [Configuration System](#configuration-system)
4. [Content Processing Pipeline](#content-processing-pipeline)
5. [Static File Handling](#static-file-handling)
6. [Template System](#template-system)
7. [Output Generation](#output-generation)
8. [Command Line Interface](#command-line-interface)
9. [Dependencies and Requirements](#dependencies-and-requirements)
10. [Detailed Workflow](#detailed-workflow)
11. [Troubleshooting and Debugging](#troubleshooting-and-debugging)

## Philosophy and Design Principles

### Core Philosophy

qsgen3 is designed to be **100% design-agnostic**. It does not impose any specific CSS frameworks, JavaScript libraries, or HTML structures on users. The generator's role is strictly to:

1. Process Markdown content into HTML
2. Combine content with user-chosen templates and styling
3. Generate a complete static site structure

### Key Principles

- **Minimal Dependencies**: Only requires Pandoc for content processing
- **In-Memory Operations**: All content manipulation occurs in memory to improve performance and reduce storage wear
- **Flexible Theme System**: Supports easy switching between themes via configuration
- **Template Agnostic**: Works with any Pandoc-compatible HTML templates
- **No Forced Assets**: Only automatically links the main theme CSS; all other asset inclusion is explicit

## Project Structure

A typical qsgen3 project follows this structure:

```
project-root/
├── bin/
│   └── qsgen3              # Main generator script
├── site.conf               # Main configuration file
├── .qsgen3_preserve        # Optional: File preservation patterns
├── content/                # Markdown content
│   ├── posts/              # Blog posts
│   │   └── hello-world.md
│   └── pages/              # Static pages
├── layouts/                # Pandoc HTML templates
│   ├── index.html          # Homepage template
│   ├── post.html           # Blog post template
│   └── rss.xml             # RSS feed template
├── static/                 # Static assets (CSS, images, etc.)
│   ├── css/
│   └── images/
└── output/                 # Generated site (created by qsgen3)
    ├── index.html
    ├── rss.xml
    ├── posts/
    └── static/
```

## Configuration System

### Primary Configuration: `site.conf`

The `site.conf` file uses a simple key-value format:

```bash
# Site Metadata
site_name="My Awesome Site"
site_tagline="A brief description of my site"
site_url="http://localhost:8000"

# Directory Paths
paths_content_dir="content"
paths_output_dir="output"
paths_layouts_dir="layouts"
paths_static_dir="static"

# Build Options
build_options_generate_rss=true
build_options_generate_sitemap=true
build_options_process_drafts=false
```

### Configuration Loading Process

1. **File Location**: Defaults to `$PROJECT_ROOT/site.conf`
2. **Override**: Can be specified with `-c` or `--config`, followed by the path to the configuration file
3. **Parsing**: Simple line-by-line parsing of `key="value"` pairs
4. **Storage**: Values stored in `QSG_CONFIG` associative array
5. **Validation**: Basic validation for required keys and file existence

### Key Configuration Variables

- **`site_name`**: Name of the site
- **`site_url`**: Base URL for the site (used in RSS and absolute links)
- **`paths_*`**: Directory paths (can be relative or absolute)
- **`build_options_*`**: Boolean flags for optional features

## Content Processing Pipeline

### Markdown Processing

1. **Discovery**: Recursively scans `paths_content_dir` for `.md` files
2. **Metadata Extraction**: Parses YAML front matter for post metadata
3. **Content Conversion**: Uses Pandoc to convert Markdown to HTML
4. **Template Application**: Applies appropriate template based on content type
5. **Output Generation**: Writes processed HTML to corresponding location in `output/`

### Content Types

- **Posts**: Files in `content/posts/` → `output/posts/`
- **Pages**: Files in `content/pages/` → `output/`
- **Index**: Generated from post metadata → `output/index.html`
- **RSS**: Generated from post metadata → `output/rss.xml`

### Metadata Handling

Each Markdown file can include YAML front matter:

```yaml
---
title: "Post Title"
date: "2023-01-01"
author: "Author Name"
description: "Post description"
draft: false
---

# Post Content

Your markdown content here...
```

## Static File Handling

### Copy Strategy

Static files are copied in a specific order to handle theme overrides:

1. **Root Static Files**: Copy from `paths_static_dir` to `output/static/`
2. **Theme Static Files**: Copy from theme's static source to `output/static/`
3. **Override Behavior**: Theme files overwrite root files with same names

### Copy Implementation

- **Primary Tool**: `rsync` with `-av --delete` flags
- **Fallback**: `cp -R` if rsync is unavailable
- **Preservation**: Maintains directory structure and file permissions

### CSS File Linking

1. **Availability**: Theme CSS files are copied to `output/static/`
2. **Verification**: Script checks for CSS file existence after copying
3. **Pandoc Integration**: CSS path passed to Pandoc via `--css` flag
4. **Path Format**: Uses site-root-relative paths (e.g., `/static/css/style.css`)

## Template System

### Template Types

qsgen3 uses Pandoc templates with specific purposes:

- **`index.html`**: Homepage template (receives post list metadata)
- **`post.html`**: Individual post template (receives post content and metadata)
- **`rss.xml`**: RSS feed template (receives post list for syndication)

### Template Variables

Templates receive data through Pandoc's variable system:

#### Post Templates

- `$title$`: Post title from front matter
- `$date$`: Post date
- `$author$`: Post author
- `$body$`: Converted HTML content
- Custom variables from YAML front matter

#### Index Template

- `$site_name$`: From site.conf
- `$site_tagline$`: From site.conf
- `$posts$`: Array of post metadata for listing

#### RSS Template

- `$site_url$`: Base URL for absolute links
- `$posts$`: Array of post data with URLs and content

### Template Resolution

1. **Theme Override**: If theme provides templates, use theme's `layouts/`
2. **Default**: Use project's `layouts/` directory
3. **Fallback**: Error if required template not found

## Output Generation

### Directory Structure

Generated output maintains a clean, predictable structure:

```
output/
├── index.html              # Homepage
├── rss.xml                 # RSS feed
├── posts/                  # Individual post pages
│   └── post-name.html
├── static/                 # All static assets
│   ├── css/                # Stylesheets
│   ├── js/                 # JavaScript (if provided by theme)
│   └── images/             # Images and media
└── css/                    # Legacy: Index-specific CSS location
    └── theme.css           # Copy of main theme CSS for index page
```

### File Naming

- **Posts**: `content/posts/hello-world.md` → `output/posts/hello-world.html`
- **Pages**: `content/pages/about.md` → `output/about.html`
- **Index**: Generated → `output/index.html`
- **RSS**: Generated → `output/rss.xml`

### URL Structure

- **Posts**: `/posts/post-name.html`
- **Pages**: `/page-name.html`
- **Static Assets**: `/static/path/to/asset`
- **CSS**: `/static/css/style.css` (for posts), `/css/theme.css` (for index)

## Command Line Interface

### Basic Usage

```bash
./bin/qsgen3 [options]
```

### Available Options

- **`-h, --help`**: Display usage information and exit
- **`-V, --version`**: Show script name and version, then exit
- **`-c`, `--config`**: Specify a custom configuration file path (given as the option's argument)

### Path Resolution

- **`PROJECT_ROOT`**: Defaults to current working directory (`$PWD`)
- **`CONFIG_FILE`**: Defaults to `$PROJECT_ROOT/site.conf`
- **Relative Paths**: Configuration file path can be relative to project root

### Exit Codes

- **0**: Successful generation
- **1**: Error (missing dependencies, configuration issues, processing failures)

## Dependencies and Requirements

### Required Dependencies

- **Pandoc**: Core dependency for Markdown processing and HTML generation
- **Zsh**: Shell interpreter (script written in Zsh)

### Optional Dependencies

- **rsync**: Preferred tool for efficient file copying (falls back to `cp`)

### System Requirements

- **Operating System**: Linux/Unix-like systems
- **File System**: Support for standard Unix file permissions
- **Memory**: Minimal requirements (all processing in memory)

### Environment Setup

The script configures a consistent environment:

```bash
LC_ALL=C
LANG=C
umask 0022
```

## Detailed Workflow

### 1. Initialization Phase

```
Start qsgen3
├── Parse command line arguments
├── Set PROJECT_ROOT (default: $PWD)
├── Determine CONFIG_FILE path
├── Set environment variables (LC_ALL, LANG, umask)
└── Initialize QSG_CONFIG array
```

### 2. Configuration Loading

```
Load Configuration
├── Check if CONFIG_FILE exists
├── Parse key="value" pairs line by line
├── Strip quotes from values
├── Store in QSG_CONFIG associative array
└── Validate required configuration keys
```

### 3. Dependency Checking

```
Check Dependencies
├── Verify Pandoc is available
├── Check Pandoc version compatibility
├── Verify other required tools
└── Exit with error if dependencies missing
```

### 4. Output Preparation

```
Prepare Output Directory
├── Check for .qsgen3_preserve file in project root
├── If preserve file exists:
│   ├── Read file patterns (shell glob patterns)
│   ├── Create temporary backup directory
│   ├── Find and backup matching files from output directory
│   ├── Remove entire output directory
│   ├── Recreate clean output directory
│   ├── Restore preserved files maintaining directory structure
│   └── Clean up temporary backup directory
├── If no preserve file:
│   ├── Remove entire output directory
│   └── Create fresh output directory
└── Log preservation and cleaning operations
```

#### File Preservation System

qsgen3 supports preserving specific files during the cleaning process to handle cases where content has been shared or bookmarked and should remain accessible even after title changes.
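The backup, clean, and restore sequence in the Output Preparation tree can be sketched in shell as follows. This is a minimal illustration under stated assumptions, not qsgen3's actual code: the function name `clean_output_preserving` is hypothetical, the pattern file is assumed to hold one glob per line with `#` comments, and matching is done with `find -path`, where `*` also matches `/` (the real Zsh script may match differently).

```bash
# Hypothetical sketch of the preserve/clean/restore cycle; not the actual qsgen3 code.
# Usage: clean_output_preserving OUTPUT_DIR PRESERVE_FILE
clean_output_preserving() {
    output_dir=$1
    preserve_file=$2

    # No preserve file: complete cleaning, as in the "If no preserve file" branch.
    if [ ! -f "$preserve_file" ]; then
        rm -rf -- "$output_dir"
        mkdir -p -- "$output_dir"
        return 0
    fi

    # mktemp -d returns an absolute path, so it stays valid after the cd below.
    backup_dir=$(mktemp -d) || return 1

    if [ -d "$output_dir" ]; then
        while IFS= read -r pattern || [ -n "$pattern" ]; do
            # Skip empty lines and comments.
            case $pattern in
                ''|'#'*) continue ;;
            esac
            (
                # Patterns are relative to the output directory.
                cd "$output_dir" || exit 1
                find . -type f -path "./$pattern" -exec sh -c '
                    dest=$1; shift
                    for f in "$@"; do
                        mkdir -p "$dest/$(dirname "$f")"
                        cp -p "$f" "$dest/$f"
                    done
                ' sh "$backup_dir" {} +
            )
        done < "$preserve_file"
    fi

    # Clean, then restore the backed-up files with their directory structure.
    rm -rf -- "$output_dir"
    mkdir -p -- "$output_dir"
    cp -Rp -- "$backup_dir"/. "$output_dir"/
    rm -rf -- "$backup_dir"
}
```

Calling `clean_output_preserving output .qsgen3_preserve` from the project root would leave `output/` containing only the files that matched a pattern. Backing up to a directory outside `output/` keeps the "remove entire output directory" step simple, at the cost of copying each preserved file twice.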
**Preserve File Format (`.qsgen3_preserve`):**

- Located in project root directory
- One pattern per line using shell glob patterns (`*`, `?`, `[]`)
- Lines starting with `#` are comments
- Empty lines are ignored
- Patterns are relative to the output directory

**Example preserve patterns:**

```bash
# Preserve specific shared articles
posts/my-important-shared-article.html
posts/viral-blog-post.html

# Preserve files by pattern
posts/legacy-*.html
archive/*

# Preserve all PDFs and downloads
*.pdf
downloads/*
```

**Benefits:**

- Maintains stable URLs for shared content
- Prevents broken links when content is renamed
- Flexible pattern matching for various preservation needs
- Backward compatible (no preserve file = complete cleaning)

### 5. Static File Processing

```
Copy Static Files
├── Copy from paths_static_dir to output/static/
│   ├── Use rsync -av --delete if available
│   └── Fallback to cp -R
├── Copy from theme static source to output/static/
│   ├── Theme files overwrite root files
│   └── Preserve directory structure
└── Log copy operations and results
```

### 6. CSS Path Determination

```
Determine CSS Linking
├── Read site_theme_css_file from configuration
├── Construct expected CSS file path in output/static/
├── Verify CSS file exists after copying
├── Set QSG_CONFIG[pandoc_css_path_arg] for Pandoc
└── Log CSS path decisions and warnings
```

### 7. Content Processing

```
Process Markdown Content
├── Scan paths_content_dir recursively for .md files
├── For each Markdown file:
│   ├── Extract YAML front matter
│   ├── Determine output path and template
│   ├── Run Pandoc with appropriate template and CSS
│   ├── Write generated HTML to output directory
│   └── Log processing results
└── Collect metadata for index and RSS generation
```

### 8. Index Generation

```
Generate Index Page
├── Collect all post metadata
├── Create YAML metadata file for Pandoc
├── Run Pandoc with index template
├── Apply CSS styling
├── Write output/index.html
└── Clean up temporary files
```

### 9. RSS Generation

```
Generate RSS Feed
├── Collect post metadata with URLs
├── Create YAML metadata for RSS template
├── Run Pandoc with RSS template
├── Generate absolute URLs using site_url
├── Write output/rss.xml
└── Clean up temporary files
```

### 10. Finalization

```
Complete Generation
├── Log final directory structure
├── Report generation success
├── Clean up any remaining temporary files
└── Exit with status code 0
```

## Troubleshooting and Debugging

### Common Issues

#### 1. CSS Not Applied

**Symptoms**: Generated HTML doesn't show theme styling

**Causes**:

- Incorrect `site_theme_css_file` path in site.conf
- CSS file doesn't exist in theme's static assets
- Theme static directory structure mismatch

**Solutions**:

- Verify CSS file path relative to theme's static source
- Check theme directory structure
- Enable debug logging to trace CSS path resolution

#### 2. Template Errors

**Symptoms**: Pandoc errors during HTML generation

**Causes**:

- Missing required templates
- Template syntax errors
- Incompatible template variables

**Solutions**:

- Verify all required templates exist
- Check Pandoc template syntax
- Review template variable usage

#### 3. Static File Copy Issues

**Symptoms**: Assets missing from output directory

**Causes**:

- Permission issues
- Disk space problems
- Path resolution errors

**Solutions**:

- Check file permissions
- Verify available disk space
- Review path configurations for absolute vs. relative paths

#### 4. File Preservation Issues

**Symptoms**: Expected files not preserved during cleaning, or preservation not working

**Causes**:

- Incorrect patterns in `.qsgen3_preserve` file
- File paths don't match patterns
- Permission issues with temporary backup directory
- Malformed preserve file format

**Solutions**:

- Verify patterns use shell glob syntax (`*`, `?`, `[]`)
- Check that patterns are relative to output directory
- Ensure `.qsgen3_preserve` file is in project root
- Test patterns with `find output/ -path "output/pattern"` before adding to preserve file (`-name` compares only the file name, so it never matches a pattern that contains a directory)
- Enable debug logging to see preservation process details
- Verify file permissions allow temporary directory creation

**Example debugging:**

```bash
# Test if your pattern matches files
# (-path matches against the whole path; -name would only see the file name)
find output/ -path "output/posts/legacy-*.html"

# Enable debug logging to see preservation process
QSG_DEBUG=1 ./bin/qsgen3
```

### Debug Logging

Enable detailed logging by modifying the `_log` function or adding debug statements:

```bash
# Enable debug logging
QSG_DEBUG=1 ./bin/qsgen3
```

### Path Debugging

The script includes path resolution logic to handle both relative and absolute paths. If experiencing path issues:

1. Check that `PROJECT_ROOT` is correctly set
2. Verify configuration paths are relative to project root
3. Review log messages for path construction details

### Configuration Validation

Ensure site.conf follows the correct format:

- Use double quotes for values: `key="value"`
- No spaces around the equals sign
- One configuration per line
- Comments start with `#`

---
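The format rules listed under Configuration Validation can be checked mechanically before running a build. The helper below is a rough sketch rather than part of qsgen3: `check_site_conf` is a hypothetical name, and it assumes that unquoted values without whitespace (such as `build_options_generate_rss=true` in the sample `site.conf`) are acceptable alongside `key="value"`.

```bash
# Hypothetical helper, not part of qsgen3.
# Usage: check_site_conf FILE
# Accepts blank lines, "#" comment lines, key="value", and unquoted key=value.
# Prints every other line (with its line number) to stderr and returns 1.
check_site_conf() {
    conf_file=$1

    if [ ! -f "$conf_file" ]; then
        echo "Configuration file not found: $conf_file" >&2
        return 2
    fi

    # grep -v keeps only the lines that match neither allowed form.
    bad=$(grep -nEv '^[[:space:]]*(#.*)?$|^[A-Za-z_][A-Za-z0-9_]*=("[^"]*"|[^[:space:]"]+)$' "$conf_file")

    if [ -n "$bad" ]; then
        printf 'Malformed lines in %s:\n%s\n' "$conf_file" "$bad" >&2
        return 1
    fi
    return 0
}
```

Running `check_site_conf site.conf && ./bin/qsgen3` would stop before generation if, for example, a line has spaces around the equals sign. The check is deliberately strict: a trailing comment after a value is also reported, because the documented format allows only one configuration per line.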