
Information


INDEX

ABOUT THE SUMMER SCHOOL
VENUE
PEOPLE
SCHOOL PROGRAM
MAP OF FOURNI
USEFUL TELEPHONE NUMBERS


ABOUT THE SUMMER SCHOOL

Following the successful first Summer School on Document Image Processing (IDIPS 2013; see http://www.iapr.org/docs/newsletter-2013-03.pdf, pp. 13-14), the second Summer School on Document Image Processing (IDIPS 2014) aims to provide both an objective overview and an in-depth analysis of the state of the art in Document Analysis Systems (DAS) and their current open issues. The courses will be delivered by world-renowned researchers in the field and by relevant companies, covering both theoretical and practical aspects of real problems in Document Analysis Systems, as well as examples of successful applications.

This year, the school aims to provide an attractive opportunity for 40 young researchers, postgraduate and PhD students. The participants will benefit from direct interaction and discussion with the renowned speakers and from practice on real problems. The researchers will also have the opportunity to present the results of their scientific research and to interact with their colleagues in a friendly and constructive atmosphere.

The Summer School is hosted on the beautiful island of Fourni, which offers a quiet and attractive environment with quality food services and housing for students and lecturers to enjoy, while the participants will have the opportunity to experience Greek hospitality.


VENUE

The Summer School will take place in the City Hall of Fourni. Fourni can be reached by ship from ports including Karlovassi (Samos). The organizers will arrange transportation from Samos airport (Aristarchos of Samos) for participants who arrive there by June 2nd, before 3:00 p.m.; the return transfer will take place on June 8th at about 10:00 a.m. The airport of Samos has direct flights to and from Germany, the Netherlands, England, Belgium, the Scandinavian countries, Switzerland, etc.

Fournoi Korseon (Greek: Φούρνοι Κορσέων or Φούρνοι Ικαρίας, Fournoi Ikarias), more commonly simply Fournoi, is a complex or archipelago of small Greek islands that lie between Ikaria, Samos and Patmos in the Ikaria regional unit, North Aegean region. The two largest islands of the complex, the main isle of Fournoi (31 km²) and the isle of Thymaina (10 km²), are inhabited, as is Agios Minas Island (2.3 km²) to the east. On the main isle, Fournoi (town) is the largest settlement, and Chrysomilia in the north is the second largest (and third largest overall, after Thymaina). Fournoi (town) proper is the main ferry harbor, with ferries also landing on Thymaina.

As shown by various archaeological sites and other findings on the islands, Fourni was home to a civilization dating from the 3rd century B.C. until the 2nd to 3rd century A.D.


The islands are uniquely characterized by their extremely complicated coastline, the total length of which is 126 kilometers, while the area of the islands is merely 48 sq. kilometers. They are crossed by easy-to-follow asphalt roads, while paths and traditionally built steps lead to many beaches with transparent waters, such as Vlichada, Kassidi, Petrokopio, Elidaki, Vitsilia and more. The highest peaks are Korakas (514 m), Fanas (476 m) and Pappas of Thymaina (470 m).

Many of the inhabitants are fishermen, although during the summer season the population is also occupied in tourist activities, mostly room rentals and catering. The climate is arid and hot during summers. Winters are rather mild with average rainfall, but constant strong archipelagic winds prevail.

You will enjoy swimming in exotic waters, staying in hospitable rooms, and tasting traditional Greek dishes, fresh fish and lobsters, as well as local honey and cheese of fine quality. But most of all, we are sure you will discover the beauty of simplicity and serenity in an extremely authentic place, in a way only Greek islands can be.


PEOPLE

Program Chairs

Apostolos Antonacopoulos, PRImA, University of Salford
Ergina Kavallieratou, University of the Aegean
Josep Lladós, CVC, Universitat Autònoma de Barcelona

Local Organizing Committee

Ergina Kavallieratou, University of the Aegean
Efstathios Stamatatos, University of the Aegean
Irini Stathi, University of the Aegean

Program Committee

David Doermann, University of Maryland
Daniel Lopresti, Lehigh University
Jean-Marc Ogier, Université de La Rochelle
Ioannis Pratikakis, Democritus University of Thrace


SCHOOL PROGRAM

Every morning there will be two lectures, while in the evening either student presentations (Monday) or practice in laboratories (Tuesday, Thursday, Friday) will take place. On the last day (Friday), the participants will have the chance to attend the presentation of several case studies.

Keywords of the Lectures

System architectures

- End-to-end document analysis systems

- Common issues & implications

- Performance evaluation

Pre-processing

- Binarization

- Noise removal

- Enhancement

- Show through cancellation

- Geometric correction (page curl removal, dewarping, slant and skew correction)

- Skeletonization


Layout analysis

- Segmentation and region classification (physical layout)

- Logical layout analysis. Reading order.

Recognition

- Machine printed text recognition

- Handwriting recognition

- Graphics recognition

Evaluation

- Datasets

- Metrics

- Examples of performance evaluation methodologies and protocols

Indexing

- Inverted files

- Hashing

- Latent semantic indexing (LSI)

- Information spotting

Document classification

- Document classification

- Document retrieval


System Level Issues in Document Image Analysis

Daniel Lopresti

Document image analysis techniques are not used in isolation, but rather to solve specific tasks. In this lecture, we shall survey a variety of architectures used when assembling end-to-end document analysis systems, both traditional and novel. We will examine how the nature of the task places demands on the degree of automation that is required, the minimum acceptable accuracy level, and the kinds of errors that can be tolerated. Also discussed will be the implications for performance evaluation and, ultimately, the success of the system.

After attending this lecture, students will understand:

- Representative architectures that are used when building end-to-end document analysis systems.
- Common issues that arise in such systems.
- How to evaluate the performance of such systems.
- Implications for conducting research on a component technology used in such systems.

Daniel Lopresti received his bachelor's degree from Dartmouth in 1982 and his Ph.D. in computer science from Princeton in 1987. After completing his doctorate, he joined the Department of Computer Science at Brown and taught courses ranging from VLSI design to computational aspects of molecular biology and conducted research in parallel computing and VLSI CAD. He went on to help found the Matsushita Information Technology Laboratory in Princeton, and later also served on the research staff at Bell Labs, where his work turned to document analysis, handwriting recognition, and biometric security.

In 2003, Dr. Lopresti joined the Department of Computer Science and Engineering at Lehigh, where he leads a research group examining fundamental algorithmic and systems-related questions in pattern recognition, bioinformatics, and security. Dr. Lopresti is director of the Lehigh Pattern Recognition Research (PatRec) Lab. On July 1, 2009, he became Chair of the Department of Computer Science and Engineering.

http://www.cse.lehigh.edu/~lopresti/

Document Image Binarisation

Ioannis Pratikakis

Document image binarisation is the initial step of most document image analysis and understanding systems. It converts a grey-scale image into a binary image, aiming to distinguish text areas from background areas. Binarisation plays a key role in document processing, since its performance drastically affects the degree of success in subsequent character segmentation and recognition. Although binarisation is considered a solved problem for modern machine-printed documents, it is not an easy task when processing degraded document images. This motivates the intensive contemporary research efforts on the binarisation of historical document images.

This tutorial aims to survey the main trends in document image binarisation, combining theory with hands-on demonstrations. Last but not least, established evaluation measures for document image binarisation methods will be presented as a means of studying algorithmic behaviour through qualitative and quantitative performance indicators.
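As a small illustration of what global binarisation involves, the sketch below applies Otsu's method, a classical global thresholding technique chosen here only as an example; the abstract above does not name specific algorithms. The function names `otsu_threshold` and `binarise`, the toy image, and the convention that 1 marks text are assumptions made for this sketch. Degraded historical documents typically call for locally adaptive methods instead of a single global threshold.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with grey level <= t
    sum0 = 0  # sum of the grey levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no dark class yet
        if w1 == 0:
            break     # no bright class left
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarise(image, t):
    """Mark pixels with grey level <= t as text (1), the rest as background (0)."""
    return [[1 if p <= t else 0 for p in row] for row in image]


# Hypothetical 3x3 grey-scale patch: dark ink on a bright background.
image = [[200, 210, 30], [220, 25, 205], [35, 215, 208]]
threshold = otsu_threshold([p for row in image for p in row])
print(threshold, binarise(image, threshold))
```

The threshold is picked from the grey-level histogram alone, which is why uneven illumination or stains, common in historical documents, can defeat it.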

Ioannis Pratikakis is an Assistant Professor at the Department of Electrical and Computer Engineering of Democritus University of Thrace in Xanthi, Greece. He received the Ph.D. degree in 3D image analysis from the Electronics Engineering and Informatics Department at Vrije Universiteit Brussel, Belgium, in January 1999. From March 1999 to March 2000, he was with the IRISA/ViSTA group, Rennes, France, as an INRIA postdoctoral fellow. From January 2003 to June 2010, he worked as Adjunct Researcher at the Institute of Informatics and Telecommunications in the National Centre for Scientific Research “Demokritos”, Athens, Greece. His research interests lie in image processing, pattern recognition, vision and graphics, and more specifically in document image analysis and recognition, medical image analysis, as well as multimedia content analysis, search and retrieval with a particular focus on visual content. He served as co-chair of the Eurographics Workshop on 3D Object Retrieval (3DOR) in 2008 and 2009, as well as Guest Editor for the special issue on 3D object retrieval of the International Journal of Computer Vision. He will be the co-organiser of the International Conference on Frontiers in Handwriting Recognition (ICFHR 2014). He is a Senior Member of the IEEE, a member of the Board of the Hellenic Artificial Intelligence Society for the period 2010-2012, and a member of the European Association for Computer Graphics (Eurographics).

http://utopia.duth.gr/~ipratika/

Recognition of Textual and Graphical Patterns

Josep Lladós

Documents contain two main categories of information, namely text and graphics. The recognition of text is solved by Optical Character Recognition (OCR) when the text is machine printed. OCR is a mature research topic, and commercial software exists offering high levels of performance on documents with traditional Manhattan layouts (text blocks are strictly oriented horizontally or vertically). The recognition of textual parts remains a challenge when the text is handwritten. Graphics recognition deals with the interpretation of graphical parts (symbols, logos, lines in tables or forms, etc.).

In this lecture we will review the main techniques for recognizing both sources of information. The lecture will be structured in two blocks. First, basic OCR techniques will be reviewed, including the traditional shape descriptors and classifiers used in the literature to recognize machine-printed text. We will complete the textual analysis block by reviewing techniques for handwriting recognition. In the second part we will review the main graphics recognition techniques, such as vectorization and symbol recognition. Finally, we will address documents combining text and graphics and review the problem of text-graphics separation.

Josep Lladós received the degree in Computer Sciences in 1991 from the Universitat Politècnica de Catalunya and the PhD degree in Computer Sciences in 1997 from the Universitat Autònoma de Barcelona (Spain) and the Université Paris 8 (France). Currently he is an Associate Professor at the Computer Sciences Department of the Universitat Autònoma de Barcelona and a staff researcher of the Computer Vision Center, of which he has also been the director since January 2009. He is the head of the Pattern Recognition and Document Analysis Group (2009SGR-00418). He is chair holder of Knowledge Transfer of the UAB Research Park and Santander Bank. His current research fields are document analysis, graphics recognition, and structural and syntactic pattern recognition. He has been the head of a number of Computer Vision R&D projects and has published more than 150 papers in national and international conferences and journals.

J. Lladós is an active member of the Image Analysis and Pattern Recognition Spanish Association (AERFAI), a member society of the IAPR. He is currently the chairman of the IAPR-ILC (Industrial Liaison Committee). Formerly he served as chairman of the IAPR TC-10, the Technical Committee on Graphics Recognition, and he is also a member of the IAPR TC-11 (Reading Systems) and IAPR TC-15 (Graph-based Representations). He is editor in chief of ELCVIA (Electronic Letters on Computer Vision and Image Analysis), serves on the Editorial Board of IJDAR (International Journal on Document Analysis and Recognition), and is a PC member of a number of international conferences. He was the recipient of the IAPR-ICDAR Young Investigator Award in 2007. He was the general chair of the International Conference on Document Analysis and Recognition (ICDAR 2009), held in Barcelona in July 2009, and co-chair of the IAPR TC-10 Graphics Recognition Workshop in 2003 (Barcelona), 2005 (Hong Kong), 2007 (Curitiba) and 2009 (La Rochelle).

Josep Lladós also has experience in technology transfer: in 2002 he created the company ICAR Vision Systems, a spin-off of the Computer Vision Center working on Document Image Analysis, after winning the entrepreneurs award from the Catalonia Government for business projects on Information Society Technologies in 2000.

http://dag.cvc.uab.es/people/josep-llados


Indexing Techniques

Jean-Marc Ogier

This talk will provide an overview of the main approaches in the literature on indexing techniques. These methods will be discussed according to their context of use.

The outline of the presentation is as follows:

- Introduction: challenges related to document indexing
- Space-partitioning-based methods: KD Trees, NKD and PKD Trees, PA Trees, R Trees, R* Trees, SR Trees, X Trees
- Clustering-based methods: K-Means clustering trees, agglomerative clustering trees, K-Medoids clustering trees
- Hashing-based methods: Locality-Sensitive Hashing (LSH), Kernelized LSH, Entropy-based LSH, Multi-probe LSH
- Latent Semantic Indexing (LSI)
- Main schema related to information spotting problems: from information characterization to indexing
- Conclusion and discussion
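To give a flavour of the hashing-based family in the outline, here is a minimal sketch of Locality-Sensitive Hashing using random hyperplanes (one common LSH scheme for cosine similarity; the outline does not say which variant the talk covers). The class name `HyperplaneLSH`, its parameters, and the toy vectors are hypothetical and serve illustration only.

```python
import random
from collections import defaultdict


class HyperplaneLSH:
    """Toy LSH index: vectors on the same side of every random hyperplane share a bucket."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)  # fixed seed so the sketch is reproducible
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(n_bits)]
        self.buckets = defaultdict(list)

    def signature(self, vec):
        # One bit per hyperplane: 1 if the vector lies on its non-negative side.
        return tuple(
            1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
            for plane in self.planes
        )

    def add(self, key, vec):
        self.buckets[self.signature(vec)].append(key)

    def candidates(self, vec):
        # Only the query's own bucket is inspected, not the whole collection.
        return list(self.buckets.get(self.signature(vec), []))


index = HyperplaneLSH(dim=3)
index.add("a", [1.0, 0.2, 0.0])
index.add("b", [-1.0, -0.2, 0.0])
print(index.candidates([2.0, 0.4, 0.0]))
```

A query only inspects the bucket it hashes to, which is what makes retrieval sub-linear; practical systems use several hash tables, or probe neighbouring buckets as in Multi-probe LSH, to recover near neighbours that fall just across a hyperplane.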


Jean-Marc Ogier (born 7 April 1967) received his doctorate in computer science from the University of Rouen in 1994, in partnership with the company Matra Systèmes et Informations. Since 2001 he has been a full professor at the Université de La Rochelle. His current research interests are the recognition, interpretation and indexing of digital documents. Jean-Marc Ogier is director of the L3I laboratory of the Université de La Rochelle, which comprises 90 researchers. He has been the national or international leader of several projects funded by the French National Research Agency (MADONNE, NAVIDOMASS, ...) and by the European Community (Eureka!). He has been deputy director of the CNRS GDR I3 (a structure bringing together about 800 people) since 2005. He is the general chair of Technical Committee 10 (graphics recognition, 300 researchers worldwide) of the International Association for Pattern Recognition (IAPR), and has been a member of the Governing Board of the IAPR (an international scientific association of more than 10,000 researchers worldwide), representing France. He is also vice-president of the Université de La Rochelle, and president of the VALCONUM association, an academic-industrial consortium aimed at French competitiveness in research and innovation. He is the author of more than 200 publications.

http://pageperso.univ-lr.fr/jmogier/


Archives/Researchers’ point of view

Irini Stathi

This lecture will focus on critical issues arising from archival policies regarding collection development, access, exhibition, cataloging, preservation, and restoration of audiovisual material. It will introduce the principal models and methodologies of moving image archive practice, such as collection development of classical, national and regional materials (small gauge formats, independent and amateur productions, new media); the changing role of technology in preservation and restoration; the cultural impact of public programming; and the elaboration of possible metadata and useful information mined from the audiovisual text.

Finally, this lecture will examine and evaluate the current preservation standards for storage and duplication. Critical preservation problems and the management of the data produced by digitization practices will also be discussed.

Irini Stathi received a Laurea from DAMS, University of Bologna, in 1990 and the Ph.D. in Cinema Theory and Communication from the Department of Mass Media and Communication, Panteion University of Athens, in 1996. She is an Associate Professor in Theory of Cinema and Audiovisuals at the Department of Cultural Technology and Communication at the University of the Aegean. Her research interests include film theory and film history, audiovisuals and new technologies, the relationship between cinema and the other arts, and film preservation. She also collaborated with the Greek director Theo Angelopoulos for more than 10 years, and she is responsible for the digitization of his archive.

http://aviac.ct.aegean.gr/?page_id=172

Document Image Preprocessing

Ergina Kavallieratou

By document image preprocessing, we refer to all the procedures that precede the main task of a Document Image Analysis (DIA) system. It is a very wide area, starting from image binarization, extending to noise removal, geometric and other transformations, and possibly reaching as far as segmentation, normalization, etc.

Such preprocessing tasks are included in almost every DIA system. In this presentation, the evolution over time of several popular tasks will be presented. The combination of various tasks and their dependence on the system will be analyzed and evaluated. Our intention is to demonstrate how any given system imposes its own preprocessing requirements, by analyzing the problems and characteristics involved. Moreover, we will examine how the same preprocessing tasks can be shared between similar or related systems.

Ergina Kavallieratou was born in Kefalonia, Greece. She received her Diploma in Electrical and Computer Engineering in 1996 from the Polytechnic School of the University of Patras and her PhD in Handwritten Optical Character Recognition and Document Image Processing from the same department in 2000. She has taught at the Greek Open University since 2001. Since September 2004, she has been a member of the teaching staff of the Department of Information and Communication Systems Engineering, University of the Aegean, and an Assistant Professor since 2012. Her research interests include Optical Character Recognition, Document Image Analysis, Computer Vision and Pattern Recognition.

http://www.icsd.aegean.gr/lecturers/kavallieratou/cv.htm

Page Layout Segmentation and Analysis

David Doermann

Historically, documents in printed form have been organized into pages in a way that facilitates sequential browsing, reading and/or search. Although the structure of these documents varies by genre, there is no doubt that there is an implicit organization that helps readers navigate the content and so enables the transfer of information from the author to the reader. As authors, humans have an uncanny ability to tweak the basic rules of organization to enhance the experience of the reader, and as readers we are able to use this information implicitly to comprehend what we are reading.

While ideally we should be teaching our document analysis systems to parse these structures and use them to their advantage, traditionally page analysis has simply been a way to divide up content so that content analysis algorithms can perform more efficiently. Pages are divided into zones, and these zones are labeled and analyzed by content-specific routines. Zones of text are divided into lines, lines of text into words, etc. As part of interpretation, the logical relationships between zones or regions on the page can be analyzed for reading order or as functional units of organization.

In this session we cover two fundamental problems: Page Segmentation and Logical Layout Analysis. While traditionally these two tasks have been addressed in sequence, it is more effective to fully understand the challenges that face the community, including dealing with handwritten and degraded material.
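The zone-to-line-to-word decomposition mentioned above can be illustrated with projection profiles, a classical technique used here purely as an example and not necessarily the approach covered in the session. The sketch assumes a clean, deskewed binary zone given as rows of 0/1 values (1 = ink); the helper names `split_runs`, `segment_lines` and `segment_words` are hypothetical.

```python
def split_runs(profile):
    """Return (start, end) pairs for each run of consecutive non-zero entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                  # a run of ink begins
        elif not value and start is not None:
            runs.append((start, i))    # the run ended at the previous index
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(zone):
    """Split a zone into text lines using the horizontal projection (row sums)."""
    return split_runs([sum(row) for row in zone])


def segment_words(zone, line):
    """Split one line (top, bottom) into column runs using the vertical projection."""
    top, bottom = line
    width = len(zone[0])
    columns = [sum(zone[r][c] for r in range(top, bottom)) for c in range(width)]
    return split_runs(columns)


# Hypothetical 5x6 zone containing two text lines.
zone = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
for line in segment_lines(zone):
    print(line, segment_words(zone, line))
```

In this toy version any blank column separates two words; a practical system would apply a gap-width threshold to tell inter-word gaps from inter-character gaps, and profiles of this kind break down on skewed, handwritten or degraded material.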


Document Image Classification and Retrieval

David Doermann

Traditional approaches to document retrieval focus on conversion to electronic text followed by indexing of the text content. Recently some work in the community has focused on indexing document image content directly. In this talk, we will give an overview of work at Maryland on classification and indexing that scales to millions of documents.

First, we present a learning-based approach for computing structural similarities among document images for unsupervised exploration of large document collections. The approach is based on multiple levels of content and structure. At a local level, a bag of visual words based on SURF features provides an effective way of computing content similarity. The document is then recursively partitioned, and a histogram of codewords is computed for each partition. Structural similarity is computed using a random forest classifier trained with these histogram features. We experiment with three diverse datasets of document images varying in size, degree of structural similarity, and types of document images.

Second, we present a scalable algorithm for segmentation-free content retrieval in document images. The contributions of this work include the use of the SURF feature for image passage retrieval, a novel indexing algorithm for efficient retrieval of SURF features, and a method to filter results using the orientation of local features and geometric constraints. Results demonstrate that logo, signature block and stamp retrieval can be performed with high accuracy and scaled efficiently to large datasets.
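The idea of computing a histogram of codewords for each partition of a recursively divided page can be sketched as follows. This is only an illustration under stated assumptions: the keypoints are taken to be already quantised to codeword indices (the SURF extraction and vocabulary steps are omitted), and the page is split quadtree-style into 1, 4, 16, ... cells, which may differ from the partitioning scheme used in the work described. The function name `pyramid_histogram` and its parameters are hypothetical.

```python
def pyramid_histogram(points, width, height, vocab_size, levels=2):
    """Concatenate per-cell codeword histograms over a quadtree partition.

    points: iterable of (x, y, codeword) with 0 <= x < width, 0 <= y < height
            and 0 <= codeword < vocab_size.
    """
    features = []
    for level in range(levels):
        cells = 2 ** level  # the page is split into cells x cells regions
        hists = [[0] * vocab_size for _ in range(cells * cells)]
        for x, y, word in points:
            col = min(int(x * cells / width), cells - 1)
            row = min(int(y * cells / height), cells - 1)
            hists[row * cells + col][word] += 1
        for hist in hists:
            features.extend(hist)
    return features


# Hypothetical keypoints on a 100x100 page with a 2-word vocabulary.
keypoints = [(10, 10, 0), (90, 10, 1), (90, 90, 1)]
print(pyramid_histogram(keypoints, 100, 100, vocab_size=2))
```

The level-0 histogram captures content regardless of position, while finer levels record where on the page each codeword occurs; a classifier such as a random forest can then be trained on the concatenated vector.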

Dr. David Doermann is a senior research scientist in UMIACS. He received a B.Sc. degree in Computer Science and Mathematics from Bloomsburg University in 1987, and an M.Sc. degree in 1989 in the Department of Computer Science at the University of Maryland, College Park. He continued his studies in the Computer Vision Laboratory, where he earned a Ph.D. in 1993. Since 1993, he has served as co-director of the Laboratory for Language and Media Processing in the University of Maryland's Institute for Advanced Computer Studies and as an adjunct member of the graduate faculty. His team of researchers focuses on topics related to document image analysis and multimedia information processing.

Recent intelligent document image analysis projects include page decomposition, structural analysis and classification, page segmentation, logo recognition, document image compression, duplicate document image detection, image-based retrieval, character recognition, generation of synthetic OCR data, and signature verification. In video processing, projects have centered on the segmentation of compressed domain video sequences, structural representation and classification of video, detection of reformatted video sequences, and the performance evaluation of automated video analysis algorithms.


In 2002 he received an Honorary Doctorate of Technology Sciences from the University of Oulu for his contributions to digital media processing and document analysis research. He is a founding co-editor of the International Journal on Document Analysis and Recognition, has been the General Chair or Co-Chair of over half a dozen international conferences and workshops, and is the General Chair of the International Conference on Document Analysis and Recognition (ICDAR) to be held in Washington DC in 2013. He has over 30 journal publications and over 125 refereed conference papers.

http://www.umiacs.umd.edu/people/doermann

Document Analysis System Evaluation

Efstathios Stamatatos

Common evaluation frameworks for Document Image Analysis will be presented. Evaluation techniques used in DIA competitions and systems will be discussed, focusing on their pros and cons. Moreover, current evaluation techniques from other related research fields and evaluation campaigns will be presented. Through this presentation the students will become familiar with several evaluation techniques and methodologies and will understand their strengths and limitations.

Efstathios Stamatatos received the diploma degree in electrical engineering (1994) and the doctoral degree in electrical and computer engineering (2000), both from the University of Patras, Greece. He was a research associate of the Wire Communications Lab. of the University of Patras from 1995 to 2003. He also joined the Polytechnic University of Madrid (1998) as a visiting researcher, the Austrian Research Institute for Artificial Intelligence as a post-doc researcher (2001-2002), and the TEI of Ionian Islands as an adjunct professor (2002-2004). Since 2004 he has been a faculty member of the Dept. of Information and Communication Systems Engineering, University of the Aegean.

http://www.icsd.aegean.gr/lecturers/stamatatos/


MAP OF FOURNI

[Map of Fourni]

USEFUL TELEPHONE NUMBERS

Municipality of Fourni: 22753 50600
K.E.P. (Citizen Service Center): 22750 50601
Police Department of Fourni: 22750 51222
Port Authorities of Fourni: 22750 51207
Medical Center: 22750 51202
Ticket Agents: 22750 51481-51546
