Constructs a list of indexable phrases from the document. The list is
ordered by frequency, partially weighted by the number of words in each phrase.

a : all phrases instead of 1/30 truncation
h : filter HTML (automatic if the URL starts with 'http' or ends with 'htm?')
o : filter HTML only; do not index
s : argument is a string, not a filename
v : vertical output
w : weighted ordering instead of numeric
W : web filter
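
The ordering described above can be illustrated with a minimal sketch. It is
not the tool's actual implementation: the names phrase_counts and
rank_phrases, the max_words limit, the tokenizing pattern, and the scoring
formula (count multiplied by word count when weighted) are all assumptions
chosen for illustration. Truncation and HTML filtering are not modeled.

```python
import re
from collections import Counter


def phrase_counts(text, max_words=3):
    """Count every run of 1..max_words consecutive words in text.

    Hypothetical helper: the tokenizer and the max_words limit are
    assumptions, not taken from the original tool.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


def rank_phrases(text, max_words=3, weighted=False):
    """Return (phrase, count) pairs, highest score first.

    With weighted=False the score is the raw count (numeric ordering).
    With weighted=True the count is multiplied by the number of words,
    an assumed stand-in for "partially weighted by number of words".
    """
    counts = phrase_counts(text, max_words)

    def score(item):
        phrase, count = item
        n_words = phrase.count(" ") + 1
        return count * n_words if weighted else count

    # Ties are broken alphabetically so the output is deterministic.
    return sorted(counts.items(), key=lambda item: (-score(item), item[0]))


if __name__ == "__main__":
    sample = "red fish blue fish red fish"
    for phrase, count in rank_phrases(sample, weighted=True)[:5]:
        print(count, phrase)
```

Under this assumed scoring, a two-word phrase seen twice outranks a single
word seen three times when weighting is on, which is the kind of effect the
w option suggests; the real weighting may differ.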