Oct. 16, 2009
New parsers for the OpenDocument family of documents are now available.
Aug. 31, 2007
New parsers for MS Office 2007 documents have been added to the collection.
Nov. 21, 2005
Docs2text 2.0 component released. Supported document formats are MS Word, MS Excel, MS PowerPoint, RTF, and Adobe Acrobat PDF.
TEXTOLUTION
Full Text Indexing and Retrieval library with Approximate Search.
See also pdf2text, odf2text, xls2text and ppt2text.
doc2text
doc2text is a component/library that converts MS Word documents into plain text that is easy to read and edit, which also simplifies indexing and searching your documents.
Below is a short list of the most important features of doc2text:
- doesn't require MS Word to process documents;
- fast processing, up to 200-300 times faster than using MS Word automation;
- precise output, in most cases better than what MS Word's «Save As Text» produces;
- full extraction of tables, numbered and bulleted lists, headers, footers;
- document summary extraction: author, title, keywords, etc.
docx2text
docx2text is a new library that converts MS Word 2007 documents (docx) into plain text.
It provides the same conversion features as doc2text. Please refer to the feature list above for details.
Still have questions? Use our feedback form and we'll be happy to assist you.
Proceed to the Download page to download the doc2text demo, which is part of the docs2text demo.