Document Layout Analysis and Reconstruction | |
Description | Performs a layout analysis and converts a document image into HTML and PDF. Text (Latin characters) is recognized, and page elements (such as images and separators) are cropped and included in the document. The algorithm tries to preserve the page layout. An XML file in PAGE format is also provided to describe the document layout. A program based on this tool won 2nd place at the ICDAR 2011 Historical Document Layout Analysis Competition. |
Specifications | Best results are obtained on newspaper, letter, and magazine pages in A4 format with a resolution of 300 dpi. Rendering may differ between HTML and PDF. Estimated processing time: 30-60 s |
Status | Under active development. Last update: September 2012 |
Disclaimer | We will never use, read, or save any uploaded data for personal use. For technical reasons, only the last uploaded file and its associated results are stored on our servers, for 15 minutes. |
Terms and conditions | You are free to use the generated results for any purpose. However, if they are used for research purposes, please refer to our library by citing the following paper: "The SCRIBO Module of the Olena Platform: a Free Software Framework for Document Image Analysis". |
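
As a rough guide to the recommended input mentioned under Specifications (A4 at 300 dpi), the sketch below computes the corresponding pixel dimensions and estimates the resolution of an existing scan from its pixel size. It is not part of the tool: the function names `page_size_px` and `estimate_dpi` are hypothetical, and the estimate assumes the image covers exactly one full A4 page (210 mm x 297 mm) with no cropping or margins added.

```python
MM_PER_INCH = 25.4
A4_MM = (210.0, 297.0)  # A4 width and height in millimetres (ISO 216)


def page_size_px(width_mm, height_mm, dpi):
    """Return the (width, height) in pixels of a page scanned at `dpi`."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


def estimate_dpi(width_px, height_px, page_mm=A4_MM):
    """Estimate scan resolution, assuming the image is one full page.

    Hypothetical helper: orientation is ignored by matching the short
    image side to the short page side and the long to the long.
    """
    short_px, long_px = sorted((width_px, height_px))
    short_mm, long_mm = sorted(page_mm)
    dpi_short = short_px * MM_PER_INCH / short_mm
    dpi_long = long_px * MM_PER_INCH / long_mm
    # Average the two estimates to smooth out rounding in the pixel sizes.
    return (dpi_short + dpi_long) / 2


if __name__ == "__main__":
    print(page_size_px(*A4_MM, 300))        # (2480, 3508)
    print(round(estimate_dpi(1240, 1754)))  # 150
```

Under these assumptions, an A4 page at 300 dpi is about 2480 x 3508 pixels; an image much smaller than that was probably scanned at a lower resolution than recommended.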
- Copyright (C) 2010-2012 EPITA Research and Development Laboratory (LRDE) -