Patent US5848191 - Automatic method of generating thematic summaries from a document image without performing character recognition
http://www.google.de/patents/US5848191?utm_source=gb-gplus-share

A method of automatically generating a thematic summary from a document image without performing character recognition to generate an ASCII representation of the document text. The method begins with decomposition of the document image into text blocks and text lines. Using the median x-height...
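
To illustrate the decomposition step, here is a minimal sketch of splitting a binarized page into text lines and taking a median over them. It is not the patent's procedure, which this excerpt does not describe. It assumes a horizontal projection profile (a standard layout-analysis technique substituted here) and uses line height as a crude stand-in for x-height. The names `find_text_lines`, `median_line_height`, and `page` are hypothetical.

```python
from statistics import median

def find_text_lines(image):
    """Return (top, bottom) row ranges, bottom exclusive, for each run of
    consecutive rows that contain at least one ink pixel (value 1)."""
    lines = []
    start = None
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a text line begins
        elif not has_ink and start is not None:
            lines.append((start, y))       # a blank row ends the line
            start = None
    if start is not None:                  # line runs to the bottom edge
        lines.append((start, len(image)))
    return lines

def median_line_height(lines):
    """Median height in pixels of the detected lines (assumed proxy only)."""
    return median(bottom - top for top, bottom in lines)

# Toy binarized page: 1 = ink, 0 = background (assumed input format).
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
]

lines = find_text_lines(page)
print(lines)
print(median_line_height(lines))
```

A median is a reasonable choice for a page-wide size estimate because a few unusually tall or short lines, such as headings or footnotes, do not shift it much.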