This article is from the Crosswords FAQ, by James A. Lundon (jlundon@xstacy.enet.dec.com) with numerous contributions by others.
There are quite a few locations where complete on-line dictionaries are
to be found on the Internet. Many thanks must go to Ross Beresford for
the following list.
File name(s) : ukacd11.zip,teadac11.zip
File size(s) : 543330,530953
Site(s) : gatekeeper.dec.com (crossword archive)
Directory : /pub/micro/msdos/misc/crossword-archive
Origin : UK Advanced Cryptics Dictionary
Entries : 185582
Inflected Forms: yes
Phrases : yes
Mixed case : yes
Comments : Word list specifically for crosswords maintained
by Ross Beresford (ross@bryson.demon.co.uk).
ukacd11.zip is in plain ASCII; teadac11.zip is
in TEA format (see The Electronic Alveary).
File name(s) : web2.Z
File size(s) : 1038775
Site(s) : many sites (for example wuarchive.wustl.edu)
Directory : (for example /mirrors4/4.3bsd-reno/share/dict)
Origin : Webster's 2nd Edition words (cf web2a.Z)
Entries : 234932
Inflected Forms: no
Phrases : no
Mixed case : yes
Comments :
File name(s) : web2a.Z
File size(s) : 434291
Site(s) : many sites (for example wuarchive.wustl.edu)
Directory : (for example /mirrors4/4.3bsd-reno/share/dict)
Origin : Webster's 2nd Edition phrases (cf web2.Z)
Entries : 76205
Inflected Forms: no
Phrases : yes
Mixed case : yes
Comments :
File name(s) : OSPD.shar.Z
File size(s) : 472885
Site(s) : ftp.cs.cornell.edu
Directory : /pub/turney
Origin : U.S. Official Scrabble Player's Dictionary
Entries : 113901
Inflected Forms: yes
Phrases : no
Mixed case : no
Comments :
File name(s) : mrc2.dct
File size(s) : 11179399
Site(s) : black.ox.ac.uk
Directory : /ota/dicts/1054
Origin : Shorter Oxford English Dictionary
Entries : 119888
Inflected Forms: yes
Phrases : no
Mixed case : yes
Comments : Maintained by the Oxford Text Archive.
File name(s) : words[1234].zip
File size(s) : 95306,74597,99024,84500
Site(s) : wuarchive.wustl.edu
Directory : /mirrors/msdos/linguistics
Origin : Uncertain (see read.me file)
Entries : 109582
Inflected Forms: yes
Phrases : no
Mixed case : no
Comments : This list has also been seen split into zip
files as evanwrd[1234].zip
File name(s) : words.english.Z
File size(s) : 288385
Site(s) : sparta.nmsu.edu,haywire.nmsu.edu
Directory : /pub/lexicals/word-lists
Origin : Unknown
Entries : 69964
Inflected Forms: yes
Phrases : no
Mixed case : yes
Comments :
File name(s) : Unabr.dict.Z
File size(s) : 951951
Site(s) : arthur.cs.purdue.edu,
ftp.denet.dk
Directory : /pub/pcert/dict/misc/black.ox.ac.uk,
/pub/wordlists/dictionaries
Origin : Unknown
Entries : 213557
Inflected Forms: no
Phrases : no
Mixed case : no
Comments :
File name(s) : unabrd.dic.Z
File size(s) : 1041512
Site(s) : world.std.com
Directory : /obi/WordLists/English
Origin : Unknown
Entries : 235544
Inflected Forms: no
Phrases : no
Mixed case : yes
Comments :
File name(s) : pocket.dic.Z
File size(s) : 85821
Site(s) : ftp.uu.net
Directory : /doc/literary/obi/WordLists/English
Origin : Unknown
Entries : 21111
Inflected Forms: no
Phrases : no
Mixed case : no
Comments :
File name(s) : w130794.Z
File size(s) : 522533
Site(s) : ftp.uu.net
Directory : /doc/literary/obi/WordLists/English
Origin : Unknown
Entries : 130794
Inflected Forms: yes
Phrases : no
Mixed case : no
Comments :
File name(s) : ispell-3.0.09.tar.z
File size(s) : 467745
Site(s) : prep.ai.mit.edu
Directory : /pub/gnu
Origin : Uncertain (see README files)
Entries : ca. 50000
Inflected Forms: yes
Phrases : no
Mixed case : yes
Comments : This is the GNU ispell package which could undergo
quite frequent releases. Hence the file name and
size could change.
File name(s) : roget13a.zip
File size(s) : 643011
Site(s) : mrcnext.cso.uiuc.edu
Directory : /gutenberg/etext91
Origin : Roget's Thesaurus, 1911
Entries :
Inflected Forms: yes
Phrases : yes
Mixed case : yes
Comments : Since this edition is out of copyright, it appears
in several different forms on the net. The one
above is maintained by Project Gutenberg.
File name(s) : dictionaries.tar.Z
File size(s) : 485521
Site(s) : guardian.cs.psu.edu
Directory : /pub
Origin : Unknown
Entries : 53091
Inflected Forms: no
Phrases : no
Mixed case : yes
Comments : A collection of specialised word lists, primarily
intended for password screening.
If you have problems FTPing any of the files mentioned above, I would
like to know about them. If you cannot find a file in the directory
specified, consult your local archie server to find an FTP site which
does have it (thanks to Antony Lewis for this suggestion).
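Once a list has been fetched and uncompressed, the "Entries", "Phrases"
and "Mixed case" fields quoted above can be checked roughly with a few
lines of Python. This is only a minimal sketch: it assumes the list is
plain text with one entry per line (not true of every file above, for
example the TEA format or mrc2.dct), it treats any entry containing
white space as a phrase, and the function name summarize_word_list is
made up for illustration. It makes no attempt to detect inflected
forms.

```python
def summarize_word_list(lines):
    """Summarise a word list given as an iterable of text lines.

    Assumption: one entry per line; blank lines are ignored.
    """
    # Strip line endings and surrounding white space, drop empty lines.
    entries = [line.strip() for line in lines if line.strip()]

    # Assumption: a "phrase" is any entry with internal white space.
    has_phrases = any(len(entry.split()) > 1 for entry in entries)

    # "Mixed case" here means the list uses both upper and lower case.
    has_upper = any(ch.isupper() for entry in entries for ch in entry)
    has_lower = any(ch.islower() for entry in entries for ch in entry)

    return {
        "entries": len(entries),
        "phrases": has_phrases,
        "mixed_case": has_upper and has_lower,
    }


# Example use on an uncompressed list (the file name is hypothetical):
#     with open("wordlist.txt", encoding="latin-1") as handle:
#         print(summarize_word_list(handle))
```

The counts obtained this way may differ slightly from the figures in the
list above, depending on how a particular file handles comments, header
lines or duplicate entries.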
There are some commercial outfits willing to sell you large word and
phrase lists. I advise you to think very carefully before buying any
such list: if you are looking for single words only, the files above
will suffice in most circumstances. Probably the best-known commercial
word list currently available is the 'Moby' dataset. If you are
interested in further details about 'Moby', please contact Grady Ward
(grady@netcom.com).
The 'Moby' dataset consists of the following pieces:
Moby Thesaurus II ($500)
Moby Pronunciator ($265)
Moby Part-of-Speech ($170)
Moby Hyphenator ($105)
Moby Words II ($100)