
Extracting text from a Word document

Jan 21, 2013 at 6:09 PM

Is it possible to extract the text from a Word document (doc and docx) and load it into a variable?

I can only find code samples to create Word documents, but not to open them.

Also I am a little confused about the system requirements:

It says that the Microsoft Office Compatibility Pack is required for older Office versions and, one point below, that Windows is not required. But AFAIK the Microsoft Office Compatibility Pack is only available on Windows. Which is it?

My code will run on a Linux server, and I am looking for a simple solution to extract the text of a Word document and do a word count.
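
To make clear what I am after, here is a rough sketch of the docx half in Python, using only the standard library and not this project. It assumes a .docx file is a ZIP archive whose body text sits in w:t elements inside word/document.xml. The names docx_to_text and count_words are my own, not part of any library, and the sketch ignores headers, footers and footnotes. It does not handle the old binary .doc format at all.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper: it reads only word/document.xml.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more w:t elements.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def count_words(text):
    """Count whitespace-separated tokens (a simplistic word definition)."""
    return len(re.findall(r"\S+", text))
```

Usage would be something like text = docx_to_text("report.docx") followed by count_words(text), where report.docx is just a placeholder file name. Counting whitespace-separated tokens will not match the number Word itself reports in every case, but it would be close enough for my purposes.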