Back in the day, I would use antiword to extract the text from a Word .DOC file. But it only understands DOC. Over the years, more and more Word files have been using the “open” (ha ha) DOCX format, which antiword doesn’t read. So I found Pandoc, which does much, much more.
$ pandoc -i some.docx -t plain > some.txt
There’s also a convert-to-Markdown option:
$ pandoc -i some.docx -t markdown > some.md
I find, however, that the Markdown produced by docx2md is more to my liking. It’s less cluttered, as it doesn’t aim at fidelity to the Word document’s formatting to the same degree as pandoc, but only the basics.
To install pandoc you can use the Macports version, but lately, I’ve found it easier simply to install the official binary Mac OSX PKG.