Custom input method for Emacs
There are different input methods built into Emacs (M-x set-input-method) for different languages and scripts. The ones I commonly use are:
- latin-postfix
- devanagari-itrans
- kannada-itrans
- telugu-itrans
More information is available through M-x describe-input-method. However, there is no built-in facility to input IAST (International Alphabet of Sanskrit Transliteration), which is probably the most common transliteration scheme for Sanskrit and other Indic languages.
So, this is a short tutorial on how to define your own custom input method, with IAST as the example.
Quail
Quail is a minor mode for inputting multilingual text. It provides useful functions for abstracting input methods. The idea is that transliteration rules are stored as key-value pairs (much like a hash table), each mapping a key sequence to its translation. Quail takes care of converting this table into a lower-level form suitable for Emacs.
There are just two functions, quail-define-package and quail-define-rules, that we must call. The latter must follow the former immediately, because the rules are installed into the current Quail package. The parameters for transliteration are specified in the former, whereas the mapping table itself is specified in the latter.
quail-define-package
This function takes three mandatory parameters and about a dozen optional ones.
(quail-define-package NAME LANGUAGE TITLE
&optional GUIDANCE DOCSTRING ... SIMPLE)
The mandatory parameters NAME, LANGUAGE and TITLE are self-explanatory. GUIDANCE is used here as a boolean. If it’s set to true, Emacs will show all possible completions for the keystrokes in the echo area, right while you’re typing. This is useful for resolving ambiguities or getting quick help. The last parameter SIMPLE is also a boolean. Setting it to true means we’re promising Emacs that we’ll not mess with key bindings like C-b and C-f, which are used for navigation. We shall leave most of the other parameters as nil.
Here’s how it looks now
(quail-define-package
 "iast-postfix" "UTF-8" "InR<" t
 "Input method for Indic transliteration with postfix modifiers.
Long vowels are dealt with by doubling.
|                  | postfix | examples             |
|------------------+---------+----------------------|
| macron           |         | aa -> ā   ee -> ē    |
| diacritic below  | .       | d. -> ḍ   rr. -> ṝ   |
| diacritic above  | '       | s' -> ś   n' -> ṅ    |
| tilde            | ~       | n~ -> ñ              |
"
 nil                              ; TRANSLATION-KEYS
 t                                ; FORGET-LAST-SELECTION
 nil nil nil nil nil nil nil nil  ; DETERMINISTIC through CONVERSION-KEYS
 t)                               ; SIMPLE
We’re saying that our new input method “iast-postfix” is associated with the “UTF-8” language environment. TITLE is InR< (Indic Roman; < means postfix), which is shown in Emacs’ mode line to indicate the current input method. DOCSTRING is free text that documents the input scheme we’ve designed; M-x describe-input-method displays it. Of the optional parameters after DOCSTRING, the second (FORGET-LAST-SELECTION) and the last (SIMPLE) are set to t; the rest are nil.
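Once the form above has been evaluated, a quick check can confirm that the package was registered. This is only a minimal sketch, assuming a reasonably recent Emacs in which quail-define-package registers the method in input-method-alist and quail-package looks up a package by name. Evaluate each form with C-x C-e; a non-nil result means the lookup succeeded.

```elisp
(require 'quail)

;; Non-nil if Emacs lists "iast-postfix" among its input methods.
(assoc "iast-postfix" input-method-alist)

;; Non-nil if Quail itself knows a package of that name.
(quail-package "iast-postfix")
```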
quail-define-rules
This part is easy. Just supply the mapping table! It’s very similar to m17n’s syntax (see, for example, sa-iast.mim), which is no coincidence because Kenichi HANDA maintains both of them.
(quail-define-rules
 ;; long vowels
 ("aa" "ā")
 ("ii" "ī")
 ("uu" "ū")
 ("rr." "ṝ")
 ("ee" "ē")
 ("oo" "ō")
 ;; dot below
 ("r." "ṛ")
 ("l." "ḷ")
 ("m." "ṃ")
 ("h." "ḥ")
 ("t." "ṭ")
 ("d." "ḍ")
 ("n." "ṇ")
 ("s." "ṣ")
 ;; diacritic above
 ("n'" "ṅ")
 ("s'" "ś")
 ("n~" "ñ")
 )
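Obviously you can extend this table to have uppercase characters, etc. Note that the second element in each of the pairs above is a single pre-composed Unicode character. Quail treats a plain string of several characters as a list of alternative candidates, one per character, so if the translation consists of multiple characters, you need to wrap it in a vector (square brackets):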
("gy" ["jñ"]) ; as in, gyaana becomes jñāna
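As one way to extend the table after the fact, rules can also be added one at a time with quail-defrule, which takes the key, the translation and, optionally, the name of the package to add the rule to. The sketch below assumes the "iast-postfix" package has already been defined; the uppercase key spellings are an illustrative choice that mirrors the lowercase rules, not part of the table above.

```elisp
(require 'quail)

;; Hypothetical uppercase extensions; the key spellings are assumptions
;; that mirror the lowercase rules.
(dolist (rule '(("AA" . "Ā")
                ("II" . "Ī")
                ("UU" . "Ū")
                ("S'" . "Ś")
                ("S." . "Ṣ")
                ("N~" . "Ñ")))
  ;; The third argument names the package the rule is added to.
  (quail-defrule (car rule) (cdr rule) "iast-postfix"))

;; A multi-character translation is wrapped in a vector here too.
(quail-defrule "gy" ["jñ"] "iast-postfix")
```

Rules added this way should show up the next time the input method is activated in a buffer.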
Loading the input method
Just save the above two forms in a file, say “indic-input.el”. Make that file loadable with require by simply adding this as the last line of the file:
(provide 'indic-input)
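Put together, the file might be laid out roughly as follows. This is only a sketch: the header and footer comments and the explicit (require 'quail), which makes the Quail macros available if the file is byte-compiled, are assumptions following common Emacs Lisp file conventions. The two forms from the previous sections go where the placeholder comments are.

```elisp
;;; indic-input.el --- IAST input method -*- lexical-binding: t; coding: utf-8 -*-

;;; Code:

(require 'quail)

;; (quail-define-package "iast-postfix" ...) goes here.

;; (quail-define-rules ...) goes here, right after the package definition.

(provide 'indic-input)

;;; indic-input.el ends here
```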
You can then use the above package by putting these lines in your Emacs init file:
(add-to-list 'load-path "/folder/where/this/file/exists/")
(require 'indic-input)
You can then switch to the input method in any buffer by the usual means: M-x set-input-method, then choosing iast-postfix.
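If typing the name each time gets tedious, the method can be made the default, so that C-\ (toggle-input-method) switches it on and off, or it can be activated automatically in certain modes. A minimal sketch, assuming the init-file lines above have already run; the choice of text-mode-hook is only an example.

```elisp
;; Make C-\ (toggle-input-method) toggle IAST input.
(setq default-input-method "iast-postfix")

;; Optionally switch it on automatically in text buffers.
(add-hook 'text-mode-hook
          (lambda () (set-input-method "iast-postfix")))
```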