Document build
I write this knowledgebase and my website in orgmode markup.
Getting it into a publishable format (pdf, html, epub) involves a Makefile, some elisp, and the org publish/export features.
On my end I call make html. This runs emacs in batch mode, loads my publish.el file, and calls org-publish-project with the appropriate project name. The publish function grabs a list of files, then calls org's export function on each of them. The export evaluates any code blocks, converts the finished file to html, and writes it to the publishing directory.
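A minimal sketch of what such a batch entry point could look like. The function name sketch-publish-target is hypothetical and is not taken from the real publish.el; the sketch assumes org-publish-project-alist has already been populated by the time it is called.

```elisp
;; Sketch only: `sketch-publish-target' is a hypothetical name, not a
;; function from the real publish.el.
(require 'ob-core)
(require 'ox-publish)

(defun sketch-publish-target (project-name)
  "Publish PROJECT-NAME from `org-publish-project-alist'.
Signal an error if no project of that name is defined."
  (unless (assoc project-name org-publish-project-alist)
    (error "Unknown publishing project: %s" project-name))
  ;; Evaluate code blocks without prompting, since batch mode cannot
  ;; answer prompts.  Only sensible for files you trust.
  (let ((org-confirm-babel-evaluate nil))
    ;; A non-nil second argument forces republishing even when the
    ;; timestamp cache says nothing changed.
    (org-publish-project project-name t)))
```

Under those assumptions, a Makefile recipe could invoke it with something like emacs --batch -l publish.el --eval '(sketch-publish-target "html")'. Whether to force republishing on every run is a design choice; dropping the second argument lets org skip unchanged files.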
For the org export part of that process it is worth reading over Summary of the export process in the org manual. It gives an overview of the general order of things which is helpful when you're trying to modify the process.
Things get a little more complicated for my ebook and pdf targets, where I use secondary tools to build the final document.
publish.el
I do all my publish project and general orgmode configuration in publish.el.
It consists of:
- loading supporting libraries (mostly ./lisp/drislox*.el)
- setting publishing directories
- setting document metadata
- setting names and descriptions of sub-documents
- orgmode configuration (mostly publishing/export preferences)
- listing all the publishing projects and their associated properties
Publishing projects
A project defines a set of files or a directory to publish. It has some properties that control how it is published, and it defines the function used to publish it.
For simple formats, that generally breaks down to a couple projects. For example, I have a 'tex' target, and 'tex-static'. The former converts org documents to LaTeX, and the latter copies static assets from the base directory to the publishing directory.
For HTML I wanted to produce multiple index pages, so I broke the main publish project into one per sub-directory and then defined a meta-project. The meta-project in my case doesn't need any special setup; it just has one property, :components, that is a list of sub-projects.
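The layout described above can be sketched as an org-publish-project-alist. All project names, directories and extensions here are illustrative assumptions, not the values from the real publish.el; only the property keywords and the two publishing functions are standard org-publish features.

```elisp
;; Sketch only: project names and paths are made up for illustration.
(require 'ox-publish)
(require 'ox-html)

(setq org-publish-project-alist
      '(;; One project per sub-directory, each with its own index page.
        ("sketch-html-notes"
         :base-directory "~/org/notes/"
         :base-extension "org"
         :publishing-directory "~/public_html/notes/"
         :publishing-function org-html-publish-to-html
         :auto-sitemap t
         :sitemap-filename "index.org")
        ("sketch-html-blog"
         :base-directory "~/org/blog/"
         :base-extension "org"
         :publishing-directory "~/public_html/blog/"
         :publishing-function org-html-publish-to-html
         :auto-sitemap t
         :sitemap-filename "index.org")
        ;; Static assets are copied verbatim, not exported.
        ("sketch-html-static"
         :base-directory "~/org/"
         :base-extension "css\\|js\\|png\\|jpg\\|svg"
         :recursive t
         :publishing-directory "~/public_html/"
         :publishing-function org-publish-attachment)
        ;; The meta-project has a single property listing its parts.
        ("sketch-html"
         :components ("sketch-html-notes"
                      "sketch-html-blog"
                      "sketch-html-static"))))
```

Publishing "sketch-html" then publishes each component in turn, so the Makefile only needs to know the name of the meta-project.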
I also publish some RSS feeds, but they are a special case, discussed in RSS output.
Final document production
Orgmode goes a long way, but doesn't produce all the final formats I want. For PDF I rely on latexmk and a preamble document, pkb.tex, that includes the org-generated index, which subsequently includes all the other files as subfiles.
For epub, org produces an XHTML output that I then package using my own ebook.py script, which is based on ebooklib.
How I mangle things
This is just categorized by which source file the function is in. Some things, like library of babel functions, aren't located by functionality.
lisp/lib-babel.org
These are source blocks executed from #+CALL: function
- convert: causes the following image link to be modified and copied to the publishing directory
- machine-log-entries: collects all the machine log entries into one buffer for eventual RSS export
- blog-entries: collects all the blog entries into one buffer for eventual RSS export
lisp/drislox-util.el
Common utility functions
- get-current-func-symbol: returns the current function symbol from the stack. Helper for drislox-error. Stolen from https://emacs.stackexchange.com/a/2312
- drislox-error: my error macro. Prints a function name and a message. Optionally prints a backtrace. Optionally exits.
- drislox-unprotect-text: removes some double '/' escapes. Used for filenames in a couple of places.
- drislox-get-publish-project: fetches the current project based on a variable set during Makefile invocation.
lisp/drislox-sitemap.el
(defun drislox-sitemap-entry (entry style project)
"#+BIND: org-export-filter-final-output-functions (drislox-reformat-sitemap)
(defun drislox-base-sitemap (title list)
(defun drislox-sitemap (title list)
(defun drislox-section-names-to-lisp ()
;; and html index.org #+BIND: org-export-filter-final-output-functions
(defun drislox-reformat-sitemap (text backend info)
lisp/drislox-rss.el
(defun drislox-rss-get-entries (file id &optional level)
(defun drislox-rss-headline-contents-with-generated-ids (file id)
(defun drislox-rss-cut-subtrees-after-n (n &optional level)
(defun drislox-rss-sort-org-entries-by-pubdate-property ()
(defun drislox-rss-make-links-absolute (base-url)
lisp/drislox-links.el
(defun drislox-replace-home-with-tilde (text backend info)
(add-to-list 'org-export-filter-link-functions
(defun drislox-better-pdf-links (text backend info)
(add-to-list 'org-export-filter-final-output-functions
(defun about-export (link desc format)
(defun chrome-export (link desc format)
(defun youtube-browse-url (handle)
(defun youtube-export (link desc backend info)
lisp/drislox-htmlize.el
(defun face-spec-default (spec)
(defun face-spec-min-color (display-atts)
(defun face-spec-highest-color (spec)
(defun face-spec-t (spec)
(defun my-face-attribute (face attribute &optional frame inherit)
(advice-add 'face-attribute :override #'my-face-attribute)
lisp/drislox-html.el
(defun drislox-preamble (info)
(defun drislox-head (base-directory)
lisp/drislox-images.el
(defun drislox-scour-svg (filename new-filename)
(defun drislox-fix-svg-options (text backend info)
(add-to-list 'org-export-filter-link-functions 'drislox-fix-svg-options)
(defun drislox-imagemagick-convert
(defun drislox-push-image-file (image-filename gallery-root asynchronous
(defun drislox-relative-image-path (image-filename)
(defun drislox-html-image (args)
(advice-add 'org-html--format-image :filter-args #'drislox-html-image)
(defun drislox-tex-image-path (image-filename)
(defun drislox-latex-image (args)
(advice-add 'org-latex--inline-image :filter-args #'drislox-latex-image)
(defun drislox-html-gallery (args)
(advice-add 'org-html-special-block :filter-args #'drislox-html-gallery)
(defun drislox-latex-gallery (contents)
lisp/drislox-headlines.el
(defun drislox-filter-if (pred filter-function arg)
(defun drislox-headline-custom-id-filter (backend)
(add-to-list 'org-export-before-parsing-hook 'drislox-headline-custom-id-filter)
lisp/drislox-blocks.el
(defun drislox-note-special-block (text backend info)
(add-to-list 'org-export-filter-special-block-functions
(defun drislox-latex-special-block (special-block contents info)
(advice-add 'org-latex-special-block :override #'drislox-latex-special-block)
(defun drislox-latex-example-block (example-block contents info)
(advice-add 'org-latex-example-block :override #'drislox-latex-example-block)
(defun drislox-src-tcolorbox-env (text backend info)
(add-to-list 'org-export-filter-src-block-functions
(defun org-html-example-block (example-block _contents info)
(defun drislox-html-src-block (block)
(advice-add 'org-html-src-block :filter-return #'drislox-html-src-block)
;; (defun drislox-html-src-encode (orig-fun &rest args)
;; (advice-add 'org-html-do-format-code :around #'drislox-html-src-encode)
lisp/drislox-blog.el
(defun drislox-blog-get-preview (file)
(defun drislox-blog-get-contents (file)
(defun drislox-blog-parse-sitemap-list (l)
(defun drislox-blog-sort-article-list (l p)
(defun drislox-blog-remove-draft-files (file-list project-plist)
(let* ((info (org-export-get-environment))
(defun drislox-blog-sitemap (title list)
Where my manglings fit into the org-publish/org-export process
Here is the export process summarized:
- Process temporary copy of the source Org buffer:
  - Execute org-export-before-processing-functions
  - Expand #+include keywords in the whole buffer
  - Remove commented subtrees in the whole buffer
  - Replace macros in the whole buffer
  - Process code blocks
    This is where #+CALL: functions happen. They can modify the temporary buffer, which is handy if you want to place or modify org markup. I also use them for external side effects (image modification).
- Parse the temporary buffer, creating AST (Abstract Syntax Tree):
  - Execute org-export-before-parsing-functions. The hook functions may still modify the buffer
    I use this hook to give headlines a nice string as their ID. The downside is that files are published in turn, so inter-linking can still end up producing org-generated IDs.
  - Calculate export option values according to subtree-specific export settings, in-buffer keywords, #+BIND keywords, and buffer-local and global customization. The whole buffer is considered;
  - When org-org-with-cite-processors is non-nil (default), determine contributing bibliographies and record them into export options. The whole buffer is considered;
  - Execute org-export-filter-options-functions;
  - Parse the accessible portion of the temporary buffer to generate an AST. The AST is a nested list of lists representing Org syntax elements.
  Past this point, modifications to the temporary buffer no longer affect the export; Org export works only with the AST;
- Remove elements that are not exported from the AST:
  - Headings, Comments, Table width/alignment rows, Table recalc columns, Clocks, drawers, fixed-width environments, footnotes, LaTeX environments, and fragments, node properties, planning lines, property drawers, statistics cookies, timestamps, etc. according to #+OPTIONS keyword
- Expand environment variables in file link AST nodes
- Execute org-export-filter-parse-tree-functions. These functions can modify the AST by side effects;
- Replace citation AST nodes and #+print_bibliography keyword AST nodes
- Convert the AST to text by traversing the AST nodes, depth-first:
  - Convert the leaf nodes (without children) to text
  - Pass the converted nodes through the corresponding export filters
  - Concatenate all the converted child nodes to produce parent node contents;
  - Convert the nodes with children to text, passing the nodes themselves and their exported contents to the corresponding transcoders and then to the export filters.
- Post-process the exported text:
  - Post-process the converted AST, as prescribed by the export backend. This step usually adds generated content (like Table of Contents) to the exported text;
  - Execute org-export-filter-body-functions;
  - Add the necessary metadata to the final document, as prescribed by the export backend. Examples: Document author/title; HTML headers/footers; LaTeX preamble;
  - Add bibliography metadata, as prescribed by the citation export processor;
  - Execute org-export-filter-final-output-functions.
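To make two of the hook points in the list above concrete, here is a minimal sketch of a before-parsing hook and a final-output filter. Both functions and the sketch-slug helper are hypothetical illustrations, not the drislox-* implementations listed earlier. The hook variable is called org-export-before-parsing-functions in recent Org releases; older ones name it org-export-before-parsing-hook.

```elisp
;; Sketch only: these are hypothetical illustrations, not the real
;; drislox-* functions.
(require 'org)
(require 'ox)

(defun sketch-slug (title)
  "Turn TITLE into a lowercase, dash-separated string usable as an ID."
  (let* ((down (downcase title))
         (dashed (replace-regexp-in-string "[^a-z0-9]+" "-" down)))
    ;; Trim dashes left over at either end.
    (replace-regexp-in-string "\\`-+\\|-+\\'" "" dashed)))

(defun sketch-add-custom-ids (_backend)
  "Give each headline lacking a CUSTOM_ID one derived from its title.
Runs on the temporary export buffer before it is parsed.  Headlines
with identical titles get identical IDs; a real version would need
to disambiguate them."
  (org-map-entries
   (lambda ()
     (let ((slug (sketch-slug (org-get-heading t t t t))))
       (unless (or (string= slug "")
                   (org-entry-get (point) "CUSTOM_ID"))
         (org-entry-put (point) "CUSTOM_ID" slug))))))

(defun sketch-home-to-tilde (text _backend _info)
  "Replace the expanded home directory prefix in TEXT with ~/."
  (let ((home (file-name-as-directory (expand-file-name "~"))))
    (replace-regexp-in-string (regexp-quote home) "~/" text t t)))

;; Before-parsing hooks receive the backend symbol and may edit the buffer.
(add-hook 'org-export-before-parsing-functions #'sketch-add-custom-ids)
;; Final-output filters receive (TEXT BACKEND INFO) and return the new text.
(add-to-list 'org-export-filter-final-output-functions #'sketch-home-to-tilde)
```

The split matters: the hook runs while the buffer is still plain org text, so it can use the normal org editing API, whereas the filter only ever sees the finished output string.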
RSS output
I use ox-rss for the final RSS export.
Preparing things for RSS export is mostly just collecting all the entries into one buffer in a suitable format, then sorting them in reverse chronological order.
The first part happens in lib-babel.org functions. The rest is in drislox-rss.el or drislox-blog.el.
Blog has the extra feature of drafts. If a file has #+FILETAGS: :draft: then it is still exported, but won't be linked from the blog, nor included in the RSS.
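A minimal sketch of how draft files could be filtered out of a file list. This is an illustration under assumptions, not the real drislox-blog-remove-draft-files, which takes a project plist as well. It scans the file text with a regular expression, assumes tags are written in the colon-delimited form, and matches case-insensitively for simplicity.

```elisp
;; Sketch only: hypothetical helpers, not the real drislox-blog code.
(require 'seq)

(defun sketch-file-draft-p (file)
  "Return non-nil if FILE has a #+FILETAGS: line containing :draft:."
  (with-temp-buffer
    (insert-file-contents file)
    (let ((case-fold-search t))
      (re-search-forward "^#\\+FILETAGS:.*:draft:" nil t))))

(defun sketch-remove-drafts (files)
  "Return FILES without the ones tagged as drafts."
  (seq-remove #'sketch-file-draft-p files))
```

Filtering the list that feeds the blog sitemap and the RSS collection, rather than the list of files to export, is what keeps drafts published but unlinked.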
TODO write some more, link the file names above