
DiceWords

Note: dicewords is currently available with Centrych, but we plan to shortly have portable versions for both MacOS and Windows. An announcement will be posted when they are available for download.

Dicewords is a Python 3 script that implements an extension to the Diceware(tm) method of generating passphrases developed by Arnold G. Reinhold.  If you're not familiar with his work or Diceware, you can read more about it on his website.

The idea behind Dicewords is simple: use a larger list of words as the base to choose from when creating a passphrase.  This is accomplished by using a different method for creating word lists and by adding a sixth die to the process of picking each component word of the resulting passphrase.


Using Dicewords
The script has two modes: passphrase generation and wordlist generation.


Passphrase generation
When generating passphrases, you can either enter your dice manually and the dicewords script will output the passphrase once your input is complete, or you can let the script generate a passphrase for you.

The following command will generate a four-word passphrase:

$ dicewords -g
Dicewords: 23328, 3 columns
Salt: False, Sym: False, Fence: False, Min: 3, Max: 7

Passphrase 1 (24): years trucks perch locke

The first line of output describes the dicewords file in use.  In this case it is the default file, located at /usr/share/dicewords/dicewords.txt, which has 23,328 words in three columns (more on that under Wordlist construction).

The next line indicates that there is no salt file, that neither passphrase symbols nor fencepost symbols were used, and that the minimum and maximum word lengths are 3 and 7 respectively.

The last line shows the passphrase generated.  The 24 in parentheses is the length of the passphrase (including spaces).

The '1' indicates that one passphrase was requested.  You can ask for more in one run, like this:

$ dicewords -g -c 10

When executed the script will return 10 passphrases.
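The length reported in parentheses on each passphrase line can be reproduced by hand.  A minimal Python sketch, using the words from the first sample output above:

```python
# Words taken from the sample output of "dicewords -g" shown earlier.
words = ["years", "trucks", "perch", "locke"]

# Join with single spaces, as in the sample output, and count every character.
passphrase = " ".join(words)
length = len(passphrase)

print(f"Passphrase 1 ({length}): {passphrase}")
# prints: Passphrase 1 (24): years trucks perch locke
```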

You can also add symbols to your passphrase.  The -s option generates symbols between words, while -f generates two fencepost symbols, one at each end of the passphrase:

$ dicewords -g -s
Dicewords: 23328, 3 columns
Salt: False, Sym: True, Fence: False, Min: 3, Max: 7

Passphrase 1 (30): tarter^bicycle,shoving_floater

When generating fenceposts there is the possibility of a space character being generated (if you didn't disable it with the --nospace option).  If one happens to be generated, dicewords will quote the passphrase as follows:

$ dicewords -g -f
Dicewords: 23328, 3 columns
Salt: False, Sym: False, Fence: True, Min: 3, Max: 7

Passphrase 1 (33): '}riched reacts sorbing buggers '

The default symbol table selects from the 33 possible symbols available on a US computer keyboard.  Since some people prefer not to use the SHIFT key when entering a password or passphrase, you can add the --noshift option to restrict the selection to the 22 non-SHIFT numbers and symbols.  If you also include the --nospace option, only 21 non-SHIFT keys are used.

$ dicewords -g -f --noshift --nospace
Dicewords: 23328, 3 columns
Salt: False, Sym: False, Fence: True, Min: 3, Max: 7

Passphrase 1 (32): 9academe posied infract glopped]
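The symbol counts quoted above can be checked with a short Python sketch.  It assumes that the default table is the 32 ASCII punctuation characters plus the space, and that the --noshift table is the ten digits, the eleven unshifted punctuation keys of a US layout, and the space.  The tables inside dicewords itself may be organised differently.

```python
import string

# Assumption: default table = ASCII punctuation plus the space character.
default_table = string.punctuation + " "

# Assumption: punctuation keys on a US keyboard that need no SHIFT.
unshifted_punct = "`-=[]\\;',./"
noshift_table = string.digits + unshifted_punct + " "

# --nospace drops the space from whichever table is in use.
noshift_nospace = noshift_table.replace(" ", "")

print(len(default_table), len(noshift_table), len(noshift_nospace))
# prints: 33 22 21
```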

One other symbol table is available with the --mobile option.  This table maps to the symbol keyboard on an iPhone.  When used with fencepost symbols you will only have to shift to that keyboard twice, once at the start and once at the end, when entering a passphrase.

To generate a passphrase manually, drop the -g option:

$ dicewords -f --mobile --nospace
Dicewords: 23328, 3 columns
Salt: False, Sym: False, Fence: True, Min: 3, Max: 7
Enter the values of 6 dice for each of 4 word(s).
Press ENTER by itself to terminate input.
Roll 1: 631424
Roll 2: 322655
Roll 3: 142531
Roll 4: 235411
Enter the values of 2 dice for 2 symbol(s).
Symbol roll 1: 23
Symbol roll 2: 41

Passphrase 1 (30): 8tinware ganted beatles davit)
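
Each word roll entered above consists of six digits, one per die, each between 1 and 6.  A minimal Python sketch of that check follows.  The function name parse_roll is hypothetical, and treating the first five digits as the row index and the last digit as the column die is an assumption about how dicewords reads a roll.

```python
def parse_roll(roll: str) -> tuple[str, int]:
    """Validate a roll such as '631424' and split it into (index, sixth die).

    Assumption: the first five digits form the five-dice row index and the
    last digit is the die used to pick a column.
    """
    if len(roll) != 6 or any(ch not in "123456" for ch in roll):
        raise ValueError(f"expected 6 digits between 1 and 6, got {roll!r}")
    return roll[:5], int(roll[5])


print(parse_roll("631424"))
# prints: ('63142', 4)
```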


Wordlist generation
Dicewords is able to take input wordlists to generate one or two files:

dicewords.txt
This is the default file that is used to select component words for each passphrase created. The file contains 23,328 words, but files with up to 46,656 words are possible. The next section discusses how you can build a file that large.

The command line option for the input file to generate your own dicewords.txt file is --wordlist.

saltwords.txt
This is an optional file that doesn't exist by default; you need to create it yourself.  When present, this file adds a 'salt' to each passphrase in the form of a word that does not exist in the Dicewords master dictionary, /usr/share/dicewords/master.txt.  The saltwords list will contain 36 words.

The command line option for the input file to generate a saltwords file is --saltlist.

Both input files have to be located in either the default directory, /usr/share/dicewords, or your data directory, which is normally set to $HOME/.dicewords.

For the purposes of this README you're going to create a custom dicewords file and a saltwords file in one operation.  The dicewords file will contain words of between 4 and 6 letters, and the saltwords file will be built from the propernames list that comes with the system:

$ zcat /usr/share/dict/propernames.gz > ~/.dicewords/propernames.txt
$ dicewords --build --min 4 --max 6 --saltlist propernames.txt
Master: 53665 (39394, 14271)
Wordlist: 12761 (5711, 7050)
Write dicewords: 7776, cols: 1, min: 4, max: 6
Saltwords file created successfully.

The output indicates that the master dictionary has 53,665 words between 4 and 6 letters in length.  This is further broken down into 39,394 headwords and 14,271 inflection words.  The input wordlist, in this case the default wordlist file, /usr/share/dicewords/wordlist.txt, has 12,761 words of 4 to 6 letters, broken down as 5,711 headwords and 7,050 inflection words.

The new dicewords file contains 7,776 words in one column, with 4 and 6 being the minimum and maximum word length.  The saltwords file can be viewed by opening the file $HOME/.dicewords/saltwords.txt.  You should see a list of 36 entries, each with a two-digit die index, 11-66, and a single lowercase propername.

To confirm that the salt is being added to the passphrase we can generate one and see:

$ dicewords -g -f --noshift --nospace
Dicewords: 7776, 1 columns
Salt: True, Sym: False, Fence: True, Min: 4, Max: 6

Passphrase 1 (27): `cling RAGNAR eagles draws3

Note the second word, RAGNAR, is in uppercase.  Dicewords converts the salt word to uppercase as a visual indication that a salt word was added to the passphrase.  Dicewords also randomizes where the salt word is inserted.  Run the command a few times to confirm that its position in the passphrase changes.
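
The salt handling described above can be sketched in Python as follows.  This is an illustration, not the script's actual code: the function name add_salt is hypothetical, and using the secrets module to pick the position is an assumption.

```python
import secrets


def add_salt(words: list[str], salt: str) -> list[str]:
    """Return a copy of words with the uppercased salt at a random position."""
    # randbelow(n) returns 0..n-1, so every slot, including both ends, is possible.
    position = secrets.randbelow(len(words) + 1)
    return words[:position] + [salt.upper()] + words[position:]


# The position of RAGNAR varies from run to run.
print(" ".join(add_salt(["cling", "eagles", "draws"], "ragnar")))
```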


Additional documentation
The man page for the dicewords script contains details on the options covered here as well as some additional options that were not discussed.

The next section provides some background on wordlist construction and this file closes with some additional details on other files that are included with this package.


Wordlist construction
Where Diceware and Dicewords differ is in the core list of words used.

A Diceware list uses 5 dice to select one of 7,776 words. In exponential notation this is 6^5: the 6 faces of a die raised to the power of 5 (the number of dice used).

Adding a sixth die results in a list of 46,656 words (6^6).  For most people, this number is well beyond the average vocabulary size of 20,000-35,000 words, as summarized in a blog entry at testyourvocab.com.

Dicewords therefore allows several list sizes, each of which is a whole multiple of the original 5 dice, 7,776 word list used with Diceware.  They are:

Words   #Dice  Columns
 7,776     5       1
15,552     6       2
23,328     6       3
46,656     6       6
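
The sizes in the table follow directly from the dice arithmetic, and the base-2 logarithm of each size gives the number of bits a single randomly chosen word contributes to a passphrase.  A short Python check:

```python
import math

ROWS = 6 ** 5  # 7,776 five-dice index values

for columns in (1, 2, 3, 6):
    words = ROWS * columns
    bits = math.log2(words)
    print(f"{words:>6,} words, {columns} column(s): {bits:.2f} bits per word")
```

The result ranges from roughly 12.9 bits per word for the 7,776 word list to roughly 15.5 bits per word for the 46,656 word list.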

When you view a Dicewords list it has the same 5 dice index as a Diceware list, running from 11111 to 66666.  The difference is that there can be multiple columns of words in the file, as listed in the table above.

Only these four column counts are used in order to prevent any selection bias when using the sixth die to choose a word.  A single column file does not need a sixth die, but the others do.  The column is chosen using the modulus of (the remainder when dividing) the value of the sixth die by the number of columns.  Because 2, 3 and 6 all divide 6 evenly, every column is selected by the same number of die faces.

Note: the 6 column version doesn't need this calculation since there is a 1:1 mapping of values on a die to the column number.
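
Why only these column counts avoid bias can be seen by counting how often each remainder occurs across the six faces of a die.  The Python sketch below is an illustration only: how dicewords maps a remainder to an actual column is not specified here, so the remainder itself stands in for the column.  The four column case is not a supported size and is included purely as a counterexample.

```python
from collections import Counter


def column_counts(columns: int) -> Counter:
    """Count how many of the six die faces land on each remainder."""
    return Counter(face % columns for face in range(1, 7))


for columns in (2, 3, 4, 6):
    counts = column_counts(columns)
    # Fair means every remainder is hit by the same number of faces.
    fair = len(set(counts.values())) == 1
    print(columns, dict(sorted(counts.items())), "fair" if fair else "biased")
```

With four columns, remainders 1 and 2 are each produced by two faces while 0 and 3 are produced by only one, so two of the columns would be chosen twice as often as the others.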

The second part of this is the words themselves.  To stay within the limits of an average person's vocabulary, Dicewords assembles the wordlist based on "headwords," also called "root" words, and several of the inflected forms of these head, or root, words.

For example, say you have a list that contains the word "abides."  When Dicewords assembles a list it looks up the headword for abides, which is abide. When finished, the wordlist will contain abide and the inflected forms: abided, abides, and abiding.

It's this process that allows Dicewords to take the default list of 9,915 words and expand it to a three column, 23,328 word list of 3-7 letter words for choosing passphrase words.

There is an alternate list of 40,792 words that can create a 46,656 word list of 4-7 letter words (and many other word length combinations) for those who prefer a larger list with somewhat less common words for choosing passphrase words.
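
The expansion step can be sketched in Python as below.  The sample lines and function names are hypothetical, and the sketch assumes each master dictionary line is a whitespace-separated headword followed by its inflected forms.  It keeps every inflected form on the line and skips the length limits, filtering and variant handling that the real script applies.

```python
def load_master(lines):
    """Map every word (headword or inflection) to its full word family."""
    families = {}
    for line in lines:
        family = line.split()
        for word in family:
            families[word] = family
    return families


def expand(wordlist, families):
    """Replace each accepted input word with its headword and inflections."""
    result = set()
    for word in wordlist:
        # Words that are not in the master dictionary contribute nothing.
        result.update(families.get(word, []))
    return sorted(result)


master = load_master(["abide abided abides abiding", "perch perched perches"])
print(expand(["abides", "zzz"], master))
# prints: ['abide', 'abided', 'abides', 'abiding']
```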

The command used to create this list would be:

dicewords --build --min 4 --max 7 --wordlist alternate.txt

The output file will be written to $HOME/.dicewords/dicewords.txt.


Saltwords
Another feature of Dicewords is the concept of adding a 'salt' to each passphrase.  A salt is a single word added from a separate list of words that you create.  What's special about this list is that each word has been checked to NOT exist in the Dicewords master dictionary, which contains nearly 325,000 words.

By creating a list that's guaranteed to be unique when compared against the master dictionary you're making a password cracker's job that much harder.
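
A rough Python sketch of how such a list could be assembled is shown here.  The function name and sample data are hypothetical, and simply keeping the first 36 candidates that are absent from the master dictionary is an assumption; the real --saltlist processing may select words differently.

```python
from itertools import product


def build_saltwords(candidates, master_words):
    """Pair the two-dice indexes 11..66 with words absent from the master set."""
    unique = []
    for word in candidates:
        word = word.lower()
        if word not in master_words and word not in unique:
            unique.append(word)
    if len(unique) < 36:
        raise ValueError(f"need 36 salt words, found {len(unique)}")
    # All 36 ordered pairs of die faces: 11, 12, ..., 16, 21, ..., 66.
    indexes = [a + b for a, b in product("123456", repeat=2)]
    return list(zip(indexes, unique[:36]))


# Hypothetical data: "Cling" is rejected because it is in the master set.
master_words = {"cling", "eagles", "draws"}
candidates = ["Cling"] + [f"name{n}" for n in range(36)]
salt = build_saltwords(candidates, master_words)
print(salt[0], salt[-1])
# prints: ('11', 'name0') ('66', 'name35')
```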


Included files
In addition to the dicewords script, this package includes several text files located in /usr/share/dicewords.  A brief description of each file is included here.

alternate.txt
This is the alternate wordlist mentioned in the previous section that can be used to generate larger lists than is possible with the default wordlist.

dicewords.txt
This file was created using the default wordlist with the following command:

dicewords --build --min 3 --max 7

filter.txt
This is a list of words that some people may find objectionable.  The default and alternate wordlists have been filtered using this list. A command line option is available to disable filtering when generating a dicewords file.

master.txt
This is the Dicewords master dictionary.  The first word in each line is a headword. If there are inflected forms of the headword they are included on the same line.  Dicewords uses this file to accept words from an input wordlist when building a dicewords file.

variant.txt
This file contains two words per line.  It is used for mapping between alternative spellings of a word, which in this case are the variations between American and British spelling.

wordlist.txt
This is the default wordlist that was used to build the dicewords.txt file.


Additional files
There are some additional files included in /usr/share/doc/dicewords, which are detailed here as well.

DiceTables.ods
DiceTables.pdf
This is a LibreOffice spreadsheet and PDF of several dice tables that are used for generating random character strings using dice.  Similar tables are available on the Diceware site; the key differences are:
  • Because not all of the tables are completely populated, there is the potential of rolling a dice combination that's invalid.  The tables have been designed to make it easy to identify when a re-roll is required.  In most cases when a 6 is rolled a re-roll will be necessary. Each table includes helper notes for re-rolls.
  • Tables for using the limited symbols on an iPhone and for computer keyboard symbols that don't require a SHIFT key. These tables are also used within Dicewords to generate passphrase and fencepost symbols that are easy to enter on these devices.

patch_csw12.sh
The website zyzzyva.net contains several Scrabble(tm) lexicon files, with CSW12 being the largest.  These files list words and short definitions that can be helpful if you're interested in finding additional words to create your own dicewords and/or saltwords files.

The script and the associated patch file will copy and truncate the file, clean up a lot of the editing errors that exist in it, and remove some HTML sequences that are present.

To use it, copy the script and patch file to a temporary directory, then download the CSW12.txt file from the website.  Run the script from the command line and you will have a new file, csw12.txt (note the lowercase), which has been cleaned up and trimmed down.  Words of 14 letters and longer are removed, since 13 letter words are the longest words present in the Dicewords master dictionary.


Source files
While not included in this package, you can download the source package, which includes several Perl scripts that were used to create the master, variant and wordlist files.

To download it use the following command:

apt-get source dicewords