Spellcheck the Shell Way

I was reading this awesome book (about which I shall soon blog) and there was this moment of, “Fark! What a brilliant line!” (I actually said that, ’cos it was so good), followed by, “Fark! Spelling mistake in the spacecraft’s name!” And I thought: wouldn’t a good way to deal with spellchecking (besides my favourite, cmd-;) be to take the entire text, do something fancy on the command line to it, and output all the words sorted by frequency? Then you could just spellcheck that file, find the weird words, go back to the original document and correct the shit out of them. So I did. Brilliant!

# take a text and output all the words sorted by frequency, most frequent first
# whitespace replaced with line breaks, lowercase everything, punctuation stripped except apostrophes (apostrophe is \047 in octal ascii)
# http://unix.stackexchange.com/questions/39039/get-text-file-word-occurrence-count-of-all-words-print-output-sorted
# http://tldp.org/LDP/abs/html/textproc.html
# http://donsnotes.com/tech/charsets/ascii.html
find . -name "foo.txt" -exec cat {} \; | tr -s '[:space:]' '\012' | tr A-Z a-z | tr -cd '\012a-z0-9\047' | grep -v '^$' | sort | uniq -c | sort -bnr
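Since the weird words are usually the rare ones, here is a rough sketch of the same idea flipped around so the one-offs float to the top. The function name wordfreq is made up for illustration, and it assumes plain ASCII text: curly apostrophes and accented letters would get stripped along with the punctuation.

```shell
# wordfreq: hypothetical helper name; reads text on stdin and prints
# "count word" lines with the rarest words first.
wordfreq() {
  tr -s '[:space:]' '\n' |      # any run of whitespace becomes one line break
    tr 'A-Z' 'a-z' |            # lowercase everything
    tr -cd "\na-z0-9'" |        # keep only line breaks, letters, digits, apostrophes
    grep -v '^$' |              # drop lines left empty after stripping
    sort | uniq -c |            # count each distinct word
    sort -k1,1n -k2,2           # ascending by count, then alphabetically
}

# Example: "enterprize" appears only once, so it is listed first.
printf 'The Enterprise, the Enterprise; the Enterprize!\n' | wordfreq
```

To run it on a real file, something like wordfreq < foo.txt | head -50 should show the fifty least common words, which is where the misspelt spacecraft will most likely be hiding.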