Emacs Polymode for Managing BibTeX

tl;dr: Emacs polymode allows the use of more than one major mode in the same file, and is easy to set up. The motivation is that embedded content in another language may be syntax highlighted, but can only be edited in the file’s major mode. An org file may contain embedded Python, for example, but the editor does not switch to Python mode when point is in a Python block.

I have a lot of ambitions for my Emacs usage, but often have very little time to learn how to achieve those ambitions. One long-term desire has been to embed BibTeX records in org-mode files to create an annotated bibliography to which new material can be added as ‘to do’ items, signed off as done, etc - just as you would expect to do in org - and to export the BibTeX ready for import into LaTeX using org babel’s tangling functionality. I just never got round to trying it, and have persisted in using big, unwieldy BibTeX files, kept in version control with git hooks that publish updates to the local texmf folder, while keeping my notes on papers separately.

Recently, I read an article on Gregory Stein’s Caches to Caches blog about his approach to an annotated bibliography in org-mode. His example inspired me to try to create the system I had wanted for a long time. Gregory Stein illustrates an annotated bibliography that can be exported to a PDF via LaTeX using the org-mode export functionality.

My thought had been to create an org-mode file with notes for each source and the BibTeX record embedded in the notes. That only makes sense for my use if I can tangle the BibTeX and, using org-babel, export it to BibTeX files in the local texmf folder. With such an arrangement I can use LaTeX as I ordinarily would /and/ have an annotated bibliography.

Inspired by Stein’s account, and some recent experiments embedding LaTeX in org-mode files, I thought I would try. Embedding code and formats like BibTeX in org-mode is a matter of creating a structure block with the language specified, for example:

#+begin_src bibtex
#+end_src

org-mode then treats the content in the structure block as if it were in the specified language. This can be used to run or evaluate code inside an org-mode file, for example. The content can also be tangled, i.e. exported to a file, by naming the target file in the block header:

#+begin_src bibtex :tangle references.bib
#+end_src

Invoking org-babel-tangle on the file exports the reference to the file references.bib. Indeed, one org-mode file can tangle references to more than one output file, each specified in the header of its structure block, and can concatenate multiple references into the same file. In practice then, I can have org-mode notes for an academic paper or source, each with a single BibTeX reference that can then be exported to a specific file. Within an org-mode file I can have hundreds of annotated references from which BibTeX references can be exported to a number of BibTeX files – say one for new references to be triaged, one for academic references, and so on – using a single command in Emacs.
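To make the tangling step concrete, here is a small sketch that can be evaluated in a scratch buffer. It writes a throwaway org file with three BibTeX blocks and tangles it with org-babel-tangle-file. The file names other than references.bib (notes.org, triage.bib), the headings and the BibTeX entries are all invented for illustration and are not taken from my real bibliography.

```elisp
;; A sketch only: headings, entries and most file names are made up.
(require 'org)
(require 'ob-tangle)

(let* ((dir (make-temp-file "bib-demo-" t))            ; scratch directory
       (org-file (expand-file-name "notes.org" dir)))  ; hypothetical notes file
  ;; Write a tiny annotated bibliography: two blocks tangle to
  ;; references.bib and one tangles to triage.bib.
  (with-temp-file org-file
    (insert "* TODO First paper\n"
            "#+begin_src bibtex :tangle references.bib\n"
            "@article{first2020, title = {First}, year = {2020}}\n"
            "#+end_src\n"
            "* DONE Second paper\n"
            "#+begin_src bibtex :tangle references.bib\n"
            "@article{second2021, title = {Second}, year = {2021}}\n"
            "#+end_src\n"
            "* TODO Unsorted paper\n"
            "#+begin_src bibtex :tangle triage.bib\n"
            "@misc{unsorted2022, title = {Unsorted}, year = {2022}}\n"
            "#+end_src\n"))
  ;; Tangle every block in the file. Relative :tangle targets are
  ;; resolved against the directory of the org file, and the return
  ;; value is the list of files written.
  (org-babel-tangle-file org-file))
```

The two blocks that name references.bib should end up in the same output file, while the third goes to triage.bib, which is the behaviour I rely on for sorting references.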

At this point, things sound pretty good. I can embed BibTeX references in an org-mode file, and export them to BibTeX files of my choice. Well, yes, but the problem is that though org-mode ‘knows’ the structure block contains BibTeX and can syntax highlight it, the bibtex-mode tools are not available. Why? Well, bibtex-mode is, like org-mode, a major mode and, by default, Emacs allows only one major mode per buffer. There are, fortunately, ways of allowing multiple major modes to operate in different sections of the same buffer, one of which is called polymode.

Polymode (project: https://github.com/polymode/polymode, documentation: https://polymode.github.io/, mastering emacs: https://www.masteringemacs.org/article/polymode-multiple-major-modes-how-to-use-sql-python-in-one-buffer) claims to be simple to set up and in practice it leans towards being simple. The principle is straightforward: create a host mode that wraps an existing major mode, which is org-mode in the case I am describing. Then create one or more inner modes, each of which maps to an existing major mode, and define the delimiters that mark the parts of the document where the inner mode should be active. In this case the intent is to have BibTeX mode working in the structure blocks, so the delimiters can be defined as ‘#+begin_src bibtex’ and ‘#+end_src’. Finally, define the polymode itself as a single mode that combines the one host mode with any number of inner modes. I also embed LaTeX in org-mode files, so, for me, working in three major modes in one org-mode file is becoming normal. In elisp and using use-package, my polymode configuration for embedding BibTeX and LaTeX in an org-mode file is as follows:

;; polymode
;; see documentation at https://polymode.github.io/installation/
;; Following draws on Mastering Emacs chapter on Polymode
;; a polymode config for BibTeX and LaTeX in org-mode
(use-package polymode
  :ensure t
  :after org
  ;; open .org files in the polymode defined below
  :mode ("\\.org$" . sb-poly-org-latex-mode)
  :config
  ;; host mode: the major mode for the file as a whole
  (define-hostmode sb-poly-org-hostmode
    :mode 'org-mode)
  ;; inner mode: LaTeX-mode inside "#+begin_src latex" blocks
  (define-innermode sb-poly-org-latex-innermode
    :mode 'LaTeX-mode
    :head-matcher "^#\\+begin_src latex.*\n"
    :tail-matcher "^#\\+end_src"
    :head-mode 'host
    :tail-mode 'host)
  ;; inner mode: bibtex-mode inside "#+begin_src bibtex" blocks
  (define-innermode sb-poly-org-bibtex-innermode
    :mode 'bibtex-mode
    :head-matcher "^#\\+begin_src bibtex.*\n"
    :tail-matcher "^#\\+end_src"
    :head-mode 'host
    :tail-mode 'host)
  ;; the polymode itself: one host mode plus the inner modes
  (define-polymode sb-poly-org-latex-mode
    :hostmode 'sb-poly-org-hostmode
    :innermodes '(sb-poly-org-latex-innermode
                  sb-poly-org-bibtex-innermode)))

So, how does this work in practice?

Generally really well. There are a few issues or annoyances. One is that polymode seems to be a little brittle at times, and occasionally stops working until Emacs is restarted or the .emacs configuration is reloaded. This may be related to updates within Emacs. The other, which is a bigger problem when working with LaTeX embedded in org-mode - and would be similar for larger code blocks - is that the org-mode key bindings are not available while point is inside an inner-mode block. To tangle the file therefore means using ‘M-x org-babel-tangle’, or moving to an org-mode part of the file to use the ‘C-c C-v t’ sequence. However, this is a minor inconvenience considering the additional functionality polymode enables.
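One possible way round the tangling annoyance, which I offer as a sketch rather than something I have tested at length, is to bind org-babel-tangle to a key that is available wherever polymode is active. This assumes that polymode-minor-mode-map is the keymap polymode enables in all of its buffers, host and inner alike, and the key ‘C-c t’ is an arbitrary choice of mine that may clash with other bindings.

```elisp
;; A sketch under assumptions: polymode-minor-mode-map is taken to be
;; the keymap shared by every polymode buffer, and "C-c t" is an
;; arbitrary, user-chosen key.
(with-eval-after-load 'polymode
  ;; Make tangling reachable from inside BibTeX and LaTeX blocks too,
  ;; without first moving point back to an org-mode part of the file.
  (define-key polymode-minor-mode-map (kbd "C-c t") #'org-babel-tangle))
```

I chose a separate key rather than reusing ‘C-c C-v t’ because the inner modes may already use the ‘C-c C-v’ prefix for something else.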