Load Lexicon...
Loads the chosen file and makes it the current lexicon. The file can be a lexicon file (with a .lex extension) or any sort of text file (as long as any non-ASCII characters are in the UTF-8 encoding).
If you choose a text file then Ag will extract all the valid words and build a lexicon containing all the unique words. A valid word is any contiguous sequence of 1 to 30 lowercase letters from this set:
a to z and áàâäãåçéèêëíìîïñóòôöõúùûüßæøœÿı
The letter sequence can be delimited by any of these characters:
NUL to space (this includes TAB, CR, LF) and !"(),.:;?¿¡«»… —“”
The fourth-last character is a non-breaking space.
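The extraction rules above can be sketched in a few lines of Python. This is only an illustration, not Ag's actual code: the names LETTERS, DELIMITERS and extract_words are made up for this example, and it assumes that a sequence containing any other character (an uppercase letter, an apostrophe, a digit) is skipped rather than converted or split.

```python
# Letters accepted in a word, as listed above.
LETTERS = set("abcdefghijklmnopqrstuvwxyz"
              "áàâäãåçéèêëíìîïñóòôöõúùûüßæøœÿı")

# Delimiters: NUL to space, plus the punctuation listed above.
# \u00a0 is the non-breaking space and \u2014 is the long dash.
DELIMITERS = [chr(c) for c in range(0x21)] + list('!"(),.:;?¿¡«»…\u00a0\u2014“”')
TO_SPACE = {ord(c): " " for c in DELIMITERS}

def extract_words(text):
    """Return the set of unique valid words found in text.

    Assumption: a sequence containing any character outside LETTERS
    is ignored as a whole.
    """
    # Turn every delimiter into a plain space, then split on spaces.
    tokens = text.translate(TO_SPACE).split(" ")
    return {t for t in tokens
            if 1 <= len(t) <= 30 and all(ch in LETTERS for ch in t)}
```

For example, extract_words("the cat, the dog… and «ñandú»!") yields the five unique words the, cat, dog, and, ñandú.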
Note that if the very first character in the text file is a digit (0 to 9) then Ag will extract all the valid numbers in the file and create a numeric lexicon. A valid number is simply any contiguous sequence of 1 to 30 digits delimited by the same set of characters listed above.
Load Recent
This submenu lets you load a previously selected lexicon. The most recent lexicon is always at the top.
Save Lexicon...
Saves the current lexicon data in a specified .lex file. If you save it in the Lexicons folder then it will appear in the list of lexicon files at the end of the Lexicon menu (see below).
Save Lexicon as Text...
Writes all the words in the current lexicon to the specified text file, one word per line. The words will be in lowercase (regardless of the Edit menu's Uppercase setting) so the file can be loaded by "Load Lexicon".
The remaining items in the Lexicon menu are the names of lexicon files stored in the Lexicons folder (non-lexicon files are ignored). The current lexicon is ticked. If none are ticked it means you've loaded a file outside the Lexicons folder. To load a lexicon, simply select it from the menu or click on one of the following links:
Danish.lex — 310,000 Danish words
Dutch.lex — 160,000 Dutch words
English.lex — 170,000 English words
French.lex — 230,000 French words
German.lex — 200,000 German words
Integers.lex — all integers from 0 to 999,999
Italian.lex — 260,000 Italian words
Junior.lex — 10,000 common English words suitable for pre-teen children
Knuth5.lex — Donald Knuth's list of 5-letter words from The Stanford GraphBase
OSPD.lex — a very old version of the Official Scrabble Players Dictionary
Primes.lex — all prime numbers less than 1 million
Shakespeare.lex — all words used in the collected works of William Shakespeare
Spanish.lex — 250,000 Spanish words
Words.lex — 40,000 English words in a typical adult's vocabulary
Lexicon files usually have a .lex extension, but that isn't compulsory. A .lex file is a binary file containing a list of words in a special format that allows the file to be loaded into memory very quickly. The format also allows specific words to be found very quickly.
Technical note: Non-ASCII letters are stored in the MacRoman encoding. This allows lexicons to support many non-English languages (French, German, Italian, Spanish, etc.) while retaining the simplicity of one byte per letter. Although Ag uses the MacRoman encoding internally, it uses the UTF-8 encoding whenever it displays words, copies them to the clipboard, or writes them to a file.
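To see what "one byte per letter" means in practice, here is a small Python sketch that uses Python's built-in mac_roman codec. It only demonstrates the two encodings; the function names to_stored and to_display are invented for this example and say nothing about the rest of the .lex file format.

```python
# The non-ASCII letters Ag accepts in words.
EXTRA_LETTERS = "áàâäãåçéèêëíìîïñóòôöõúùûüßæøœÿı"

def to_stored(word):
    """Encode a word as MacRoman: exactly one byte per letter."""
    return word.encode("mac_roman")

def to_display(stored):
    """Convert MacRoman bytes to the UTF-8 bytes used for output."""
    return stored.decode("mac_roman").encode("utf-8")

# Every accepted non-ASCII letter fits in a single MacRoman byte.
assert all(len(to_stored(ch)) == 1 for ch in EXTRA_LETTERS)

stored = to_stored("señor")
shown = to_display(stored)
print(len(stored), len(shown))
```

The word "señor" takes 5 bytes in MacRoman but 6 bytes in UTF-8, because UTF-8 needs two bytes for the ñ.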
Note for Mac users: Lexicon files are identical to the "word list" files used by my defunct Anagrams app, so there's no problem using your old word lists with Ag.
Numeric Lexicons
A numeric lexicon contains "words" consisting entirely of digits (0 to 9) rather than letters. An example of a numeric lexicon is Primes.lex — it contains all the prime numbers less than one million. To see them all, just type "*" into the text box and click the "Find Lexicon Words" button (or hit the return/enter key).
To create a numeric lexicon you need to load a text file that begins with a digit. Ag will then extract all the unique numbers (i.e. digit strings) in the file and build a numeric lexicon. You then have the option of saving the results in a .lex file.
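The following Python sketch shows roughly how that decision and extraction could work. It is an illustration under stated assumptions, not Ag's implementation: the function names are hypothetical, numbers are treated purely as digit strings, and any sequence containing a non-digit is skipped.

```python
DIGITS = set("0123456789")

# Same delimiters as for words: NUL to space plus the listed punctuation
# (\u00a0 is the non-breaking space, \u2014 the long dash).
DELIMITERS = [chr(c) for c in range(0x21)] + list('!"(),.:;?¿¡«»…\u00a0\u2014“”')
TO_SPACE = {ord(c): " " for c in DELIMITERS}

def is_numeric_source(text):
    """True if the very first character of the text is a digit 0 to 9."""
    return text[:1] in DIGITS

def extract_numbers(text):
    """Return the set of unique digit strings of 1 to 30 digits.

    Assumption: a sequence that mixes digits with anything else is ignored.
    """
    tokens = text.translate(TO_SPACE).split(" ")
    return {t for t in tokens
            if 1 <= len(t) <= 30 and all(ch in DIGITS for ch in t)}
```

With the text "2, 3, 5, 7, 11\n13 x9 3", is_numeric_source returns True and extract_numbers returns the six unique strings 2, 3, 5, 7, 11 and 13.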
There is one potential source of confusion when entering patterns to match numbers in a numeric lexicon. If you want to specify a repeat count then you must surround it with angle brackets. For example, if you want to find all the 3-digit numbers using a repeat count then you need to type in the pattern "?<3>" rather than "?3" (the latter will find all 2-digit numbers ending with 3). Note that angle brackets can also be used in patterns for non-numeric lexicons, so you might prefer to use them all the time.
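The difference between "?<3>" and "?3" can be illustrated with a tiny Python matcher. It handles only the pieces mentioned on this page ("?" for any single digit, "*" for any run of digits, "<n>" as a repeat count, and literal digits) and is not a description of Ag's full pattern syntax; the function names are hypothetical.

```python
import re

def numeric_pattern_to_regex(pattern):
    """Translate a small subset of patterns into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            parts.append("[0-9]")          # any single digit
        elif ch == "*":
            parts.append("[0-9]*")         # any run of digits, even empty
        elif ch == "<":
            if not parts:
                raise ValueError("repeat count has nothing to repeat")
            close = pattern.index(">", i)  # raises ValueError if missing
            count = int(pattern[i + 1:close])
            parts.append("{%d}" % count)   # repeat the previous element
            i = close
        else:
            parts.append(re.escape(ch))    # a literal digit such as 3
        i += 1
    return re.compile("".join(parts))

def find(pattern, lexicon):
    """Return the lexicon entries that match the whole pattern."""
    rx = numeric_pattern_to_regex(pattern)
    return [w for w in lexicon if rx.fullmatch(w)]

lexicon = ["3", "13", "23", "113", "123", "997"]
print(find("?<3>", lexicon))
print(find("?3", lexicon))
```

In this sketch "?<3>" selects 113, 123 and 997, while "?3" selects 13 and 23, because the bare 3 is read as a literal digit rather than a repeat count.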