TextLib
Recent News

Aug. 31, 2007
New MS Office 2007 document parsers have been added to the collection.

Nov. 21, 2005
Docs2text 2.0 component released. Supported document formats are MS Word, MS Excel, MS PowerPoint, RTF, and Adobe Acrobat PDF.

Our partner

TEXTOLUTION
Full Text Indexing and Retrieval library with Approximate Search.

The MS Word document format is a proprietary binary format used by MS Word®. As the de facto standard for office document management it has become very popular; however, its undocumented structure makes it almost impossible for third-party applications to read correctly.
The docs2text component/library can read MS Word 97 - 2003 documents without MS Office/Word installed, delivering high accuracy and incredible processing speed.

NEW!!! MS Word 12 documents (docx) are also supported now.
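
For readers unfamiliar with the docx container: unlike the older binary format, a docx file is a ZIP archive whose main text lives in `word/document.xml`. The following is a minimal sketch using only the Python standard library; it is not the docs2text API or its method, and the function name `extract_docx_text` is hypothetical. It ignores headers, footnotes, and other parts, and paragraphs nested inside text boxes may appear more than once.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper for illustration only.)
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # <w:p> is a paragraph; its visible text is stored in <w:t> elements.
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Calling `extract_docx_text("report.docx")` (the file name is a placeholder) returns the paragraphs joined by newlines.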

PDF (Portable Document Format) was developed by Adobe Systems Inc. for displaying and printing documents on different systems and devices while keeping their layout unchanged. It can contain text, images, movies, sounds, forms, etc.
While the PDF format is documented, developing a reliable parser for PDF documents is not a trivial task. The vast majority of current solutions on the market are based on the open source project xPDF, with all its pros and cons. docs2text is up to 100 times faster than any other PDF text extraction solution available.


MS PowerPoint® is a popular presentation format used for creating stunning slide shows and presentations.
docs2text can extract text objects from MS PowerPoint presentations without MS PowerPoint installed.


The MS Excel document format is a popular spreadsheet storage format. It can contain text, formulas, charts, images, and complex calculations.
Like all MS Office binary formats, the MS Excel format is undocumented. Nevertheless, docs2text can easily read MS Excel spreadsheets without any applications/components installed, providing high accuracy, unbeatable performance and extreme flexibility.

NEW!!! MS Excel 12 documents (xlsx) are also supported now.
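
As background on the xlsx container: an xlsx file is also a ZIP archive, and most cell text is stored once in a shared string table at `xl/sharedStrings.xml`. The sketch below lists those strings with the Python standard library. It is an illustration under stated assumptions, not the docs2text API or its method; the function name `read_xlsx_shared_strings` is hypothetical, and numeric cells, inline strings, and formulas are not covered.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used in xl/sharedStrings.xml
S_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def read_xlsx_shared_strings(source):
    """Return the shared strings of an .xlsx file as a list.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper for illustration only.)
    """
    with zipfile.ZipFile(source) as archive:
        # Workbooks without any text cells may omit the table entirely.
        if "xl/sharedStrings.xml" not in archive.namelist():
            return []
        root = ET.fromstring(archive.read("xl/sharedStrings.xml"))

    strings = []
    # Each <si> is one string item; rich text splits it over several <t> nodes.
    for item in root.iter(f"{{{S_NS}}}si"):
        parts = [node.text or "" for node in item.iter(f"{{{S_NS}}}t")]
        strings.append("".join(parts))
    return strings
```

Mapping these strings back to individual cells additionally requires reading the worksheet parts under `xl/worksheets/`, which this sketch leaves out.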