Ag is a fast, scriptable anagram generator. These notes describe how to use the command-line version of Ag, known as "agc". The examples given below assume you've installed the agc binary in a suitable location so you can run it by simply typing agc in a terminal window.
Usage

If you type agc by itself you'll get version and usage information:

   This is agc version 1.7 (with Lua version 5.4.6).
   Usage: agc [options] word or phrase to be anagrammed [options]
   Options:
      -a N           print at most N anagrams (default is unlimited)
      -c word        print anagrams containing the given word
      -h             print this help information
      -i             print anagrams with increasing word lengths
      -l lexicon     use the given lexicon file (default is Words.lex)
      -m pattern     only print lexicon/usable words that match pattern
      -n N           print N usable words per line (default is 10)
      -o newlexicon  save current lexicon in the given file
      -p             only print words in lexicon
      -r script      run the given Lua script
      -t textfile    use the given text file (UTF-8 encoded) as lexicon
      -u             only print usable words, by increasing length
      -ua            only print usable words, in alphabetical order
      -U             print all words in UPPERCASE
      -w MIN,MAX     minimum and maximum words in anagrams (default is 1,10)

Some not-so-obvious tips
How to find good anagrams

There are plenty of programs that can generate anagrams from a given text. Ag tries to make it easier to find interesting anagrams. It does this by splitting the process into two steps:
Pattern matching

The -m option can be used to print out only the lexicon words or usable words that match a given pattern. Let's look at some simple examples:

   agc andrew -u -m"*a*"      (print usable words containing the letter "a")
   agc andrew -ua -m"*a*"     (ditto, but print the words alphabetically)

Note that you don't need to specify -u if you use -m and supply some text to be anagrammed. In that case it's assumed you want to match usable words:

   agc andrew -m"re*"         (print usable words starting with "re")
   agc andrew -m"\!re*"       (print usable words that don't start with "re")
   agc andrew -m!"re*"        (ditto)
   agc andrew -m"*re"         (print usable words ending with "re")
   agc andrew -m"???"         (print usable words with 3 letters)

Similarly, you don't need to specify -p if you use -m without supplying any anagram text. In that case it's assumed you want to match lexicon words. These examples all match lexicon words:

   agc foo -p -m"?9"          (print lexicon words with 9 letters)
   agc -p -m"?7-9"            (print lexicon words with 7 to 9 letters)
   agc -p -m"?7-"             (print lexicon words with at least 7 letters)
   agc -p -m"?-7"             (print lexicon words with at most 7 letters)
   agc -m"*[xyz]"             (print lexicon words ending in "x" or "y" or "z")
   agc -m"*(ab|xy)*"          (print lexicon words containing "ab" or "xy")
   agc -m"[\!aeiou]-"         (print lexicon words with no vowels)
   agc -m[!"aeiou]-"          (ditto)

Note that it's usually best to enclose pattern strings in double quotes. This is necessary because most of Ag's special pattern characters also have a special meaning to the shell. The exclamation mark (!) can be tricky. If inside double quotes you might have to type "\!" rather than "!", or else put it outside the double quotes (the above examples show both methods).

Here are Ag's special pattern characters:

   *        Match zero or more letters.
   ?        Match any single letter.
   [...]    Match any letter in the given list; eg. [abc].
   [!...]   Match any letter NOT in the given list; eg. [!aeiou].
   N        Specifies a fixed repeat count (N is a non-negative integer).
            Repeat counts are only allowed after ?, ], or a letter; eg. ?9.
   M-N      Specifies a variable repeat count, where M and N are optional
            non-negative integers indicating the minimum and maximum counts.
            If M is missing then 0 is assumed, and if N is missing then
            infinity is assumed. Note that * is equivalent to ?-.
   <...>    Any repeat count can be enclosed in angle brackets; eg. <2-5>.
            This form of a repeat count is necessary when using a numeric
            lexicon (see below).
   -        Used inside [...] to indicate a letter range; eg. [a-z],
            or to separate min and max repeat counts; eg. ?2-5.
   (...)    Match a sub-pattern; eg. *(ed|ing).
   |        Means OR. For matching alternative patterns; eg. a*|b*.
   !        Means NOT. Can only be the first character in the pattern
            or the first character after [.

Lexicon files

A lexicon file is a binary file containing a list of words in a special format that allows the file to be loaded into memory very quickly. The format also allows specific words to be found very quickly. All lexicon files supplied with Ag have a .lex extension. It's not strictly required, but it's a good idea to stick to that convention when naming your own lexicon files.

All words in a lexicon consist of 1 to 30 lowercase letters, possibly including some non-ASCII letters. The full set of valid letters is:

   a to z and áàâäãåçéèêëíìîïñóòôöõúùûüßæøœÿı

Technical note: Non-ASCII letters are stored in the MacRoman encoding. This allows lexicons to support many non-English languages (French, German, Italian, Spanish, etc) but retain the simplicity of one byte per letter. Although Ag uses the MacRoman encoding internally, when it prints out words it uses the UTF-8 encoding.

Note for Mac users: Lexicon files are in the same format as the "word list" files used by my defunct Anagrams app, so there's no problem using those word lists with Ag.
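To make the one-byte-per-letter point in the technical note concrete, here is a minimal Python sketch. It is not part of Ag and says nothing about how Ag is implemented; it only compares the MacRoman and UTF-8 byte counts of a word using Python's built-in mac_roman codec. The helper name byte_lengths and the sample words are illustrative assumptions.

```python
import unicodedata

# The non-ASCII letters listed above as valid in a lexicon.
EXTRA_LETTERS = unicodedata.normalize("NFC", "áàâäãåçéèêëíìîïñóòôöõúùûüßæøœÿı")


def byte_lengths(word):
    """Return (letters, MacRoman bytes, UTF-8 bytes) for the given word."""
    # Normalize first so each accented letter is a single code point.
    word = unicodedata.normalize("NFC", word)
    return len(word), len(word.encode("mac_roman")), len(word.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical sample words, chosen only for illustration.
    for word in ["andrew", "café", "straße"]:
        letters, mac, utf8 = byte_lengths(word)
        print(f"{word}: {letters} letters, {mac} MacRoman bytes, {utf8} UTF-8 bytes")
```

In MacRoman the byte count always equals the letter count, whereas in UTF-8 each accented letter takes more than one byte. That difference is why a word's byte length in Ag's UTF-8 output can be larger than its number of letters.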
Using a text file as the lexicon

The -t option allows any text file (in UTF-8 encoding) to be used as the lexicon:

   agc -t foo.txt -p          (prints the unique words in foo.txt)
   agc -t foo.txt andrew      (generates anagrams using the words in foo.txt)

All of the unique words in the given file will be extracted, but only if they are "valid" words. A valid word is a contiguous sequence of 1 to 30 lowercase letters (see the previous section for the set of valid letters) delimited by any of these characters:

   NUL to space (this includes TAB, CR, LF) and !"(),.:;?¿¡«»… —“”

The 4th last character is a non-breaking space.

The -o option can be used to save the current lexicon in a new lexicon file. You'll get an error message if you try to overwrite an existing file. Some examples:

   agc -t foo.txt -o Foo.lex  (creates a lexicon file with the words in foo.txt)
   agc -o NewWords.lex        (copies the default lexicon file)

Numeric lexicons

A numeric lexicon contains "words" consisting entirely of digits (0 to 9) rather than letters. Creating a numeric lexicon is quite simple. Just use the -t option to load a text file that begins with a digit. Ag will then extract all the unique numbers (ie. digit strings) in the file and build a numeric lexicon. You can add the -o option to save the results in a .lex file.

There is one potential source of confusion when entering patterns to match numbers in a numeric lexicon. If you want to specify a repeat count then you must surround it with angle brackets. This avoids any ambiguity, as should be clear from the following examples:

   agc -l Primes.lex -m"?3"     (print all 2-digit primes ending with 3)
   agc -l Primes.lex -m"?<3>"   (print all 3-digit primes)

Note that angle brackets can also be used in patterns for non-numeric lexicons, so you might prefer to use them all the time.

Scripting functions

This section describes all the ag.* functions that can be used in a script run by agc.
String arrays

A number of ag.* functions return arrays containing strings. These are standard Lua arrays; that is, tables indexed by integers where the first string has index 1. All the strings use the UTF-8 encoding and might contain non-ASCII characters, so to count the characters in a string you should avoid using the standard Lua length operator "#" as it returns the number of bytes. Instead, use ag.numchars to return the number of characters:

   if ag.numchars(word) > 2 then print(word) end

And to extract each character in a UTF-8 string you can use ag.getchars:

   for _, ch in ipairs(ag.getchars(word)) do print(ch) end

Note that the Lua functions string.lower and string.upper don't convert non-ASCII characters correctly, so use ag.lower and ag.upper instead.

Well done if you managed to read this far. Now go have some fun!

Author: Andrew Trevorrow (andrew@trevorrow.com) aka Overt Word Warren.