Last demo update: 03/04/13. New demo available: Staff Removal in Music Scores.

 

Document Layout Analysis and Reconstruction
Description: Performs layout analysis and converts a document image into HTML and PDF.
Text (Latin characters) is recognized, and page elements (such as images and separators) are cropped and included in the output document. The algorithm tries to preserve the page layout. An XML file in PAGE format is also provided to describe the document layout. A program based on this tool won 2nd place at the ICDAR 2011 Historical Document Layout Analysis Competition.
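As a rough illustration of how the PAGE file could be inspected once downloaded, the sketch below counts the region elements it contains. It is not part of the tool. It assumes only that regions are XML elements whose tag name ends in "Region" (for example TextRegion or ImageRegion), and it ignores the XML namespace because that varies between PAGE schema versions. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}TextRegion'."""
    return tag.rsplit("}", 1)[-1]


def count_regions(page_xml_text):
    """Count elements whose tag name ends with 'Region', grouped by tag name.

    Assumption: every layout region in the file uses such a tag name.
    """
    root = ET.fromstring(page_xml_text)
    return Counter(
        local_name(el.tag)
        for el in root.iter()
        if local_name(el.tag).endswith("Region")
    )
```

Calling count_regions on the text of the downloaded file returns a mapping from region type to the number of regions of that type, which gives a quick check that text blocks, images, and separators were detected.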
Specifications: Best results are obtained on newspaper, letter, and magazine pages scanned at a resolution of 300 dpi in A4 format. Rendering may differ between HTML and PDF.
Estimated processing time: 30-60s
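To judge whether a scan is close to the recommended input, it can help to know what pixel size an A4 page has at 300 dpi. The short sketch below does that arithmetic. The A4 dimensions (210 × 297 mm) are standard; the function name is hypothetical and nothing here comes from the tool itself.

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # width, height of an A4 page in millimetres


def expected_pixels(size_mm, dpi):
    """Convert a page size in millimetres to pixels at the given resolution."""
    return tuple(round(mm / MM_PER_INCH * dpi) for mm in size_mm)


print(expected_pixels(A4_MM, 300))  # (2480, 3508)
```

An image much smaller than about 2480 × 3508 pixels for a full A4 page was probably scanned below 300 dpi.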
Status: Under active development.
Last update: September 2012
Disclaimer: We will never use, read, or save any uploaded data for personal use. Only the last uploaded file and its associated results are stored on our servers, for 15 minutes, for technical reasons.
Terms and conditions: You are free to use the generated results for any purpose. However, if they are used for research purposes, please refer to our library by citing the following paper: "The SCRIBO Module of the Olena Platform: a Free Software Framework for Document Image Analysis".
Warning: Do not open several tabs to send several images at the same time; you may not get the expected results.

Choose one of the examples, or upload an image.
It must fulfill the following conditions:
  • JPG, PNG, PNM, GIF, BMP, TIF
  • 20 MB maximum.
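The upload conditions above can be checked locally before sending a file. The following is a minimal sketch, not the server's actual validation. It assumes that the formats are identified by file extension, that the variants .jpeg and .tiff are also accepted, and that "20 MB" means 20 × 1024 × 1024 bytes. The function name check_upload is hypothetical.

```python
import os

# Extensions derived from the format list; ".jpeg" and ".tiff" are assumptions.
ACCEPTED_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".pnm", ".gif", ".bmp", ".tif", ".tiff",
}

# Assumption: "20 MB" is interpreted as 20 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 20 * 1024 * 1024


def check_upload(path):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_SIZE_BYTES:
        problems.append(f"file too large: {size} bytes")
    return problems
```

Checking only the extension is a deliberate simplification: it does not confirm that the file content really is an image in that format.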



    This algorithm has been developed in the context of the SCRIBO project of the Free Software Thematic Group, part of the "System@tic Paris-Région" cluster (France). This project is partially funded by the French Government, its economic development agencies, and by the Paris-Région institutions.

    The source code is released under the GNU GPLv2 license in our Git repository (see the Download section) and is available online in scribo/src/content_in_doc.cc.



    - Copyright (C) 2010-2012 EPITA Research and Development Laboratory (LRDE) -