Towards Automated Infographic Design: Deep Learning-Based Auto-Extraction of Extensible Timeline
© 2019 IEEE. This is the author's version of the article that has been published in IEEE Transactions on Visualization and Computer Graphics. The final version of this record is available at: 10.1109/TVCG.2019.2934810 (arXiv:1907.13550v2 [cs.HC] 7 Oct 2019)

Zhutian Chen, Yun Wang, Qianwen Wang, Yong Wang, and Huamin Qu

Z. Chen, Q. Wang, Yong Wang, and H. Qu are with the Hong Kong University of Science and Technology. Yun Wang is with Microsoft Research.

Fig. 1: An automated approach to extract an extensible timeline template from a bitmap image. a) Original bitmap image; b) Content understanding, including global and local information of the timeline; c) Extensible template containing editable elements and their semantic roles; d) New timeline (with mock-up colors) automatically generated with updated data.

Abstract—Designers need to consider not only perceptual effectiveness but also visual styles when creating an infographic. This process can be difficult and time-consuming for professional designers, not to mention non-expert users, leading to the demand for automated infographic design. As a first step, we focus on timeline infographics, which have been widely used for centuries. We contribute an end-to-end approach that automatically extracts an extensible timeline template from a bitmap image. Our approach adopts a deconstruction and reconstruction paradigm. At the deconstruction stage, we propose a multi-task deep neural network that simultaneously parses two kinds of information from a bitmap timeline: 1) the global information, i.e., the representation, scale, layout, and orientation of the timeline, and 2) the local information, i.e., the location, category, and pixels of each visual element on the timeline. At the reconstruction stage, we propose a pipeline with three techniques, i.e., Non-Maximum Merging, Redundancy Recover, and DL GrabCut, to extract an extensible template from the infographic by utilizing the deconstruction results. To evaluate the effectiveness of our approach, we synthesize a timeline dataset (4296 images) and collect a real-world timeline dataset (393 images) from the Internet. We first report quantitative evaluation results of our approach over the two datasets. Then, we present examples of automatically extracted templates and timelines automatically generated based on these templates to qualitatively demonstrate the performance. The results confirm that our approach can effectively extract extensible templates from real-world timeline infographics.

Index Terms—Automated Infographic Design, Deep Learning-based Approach, Timeline Infographics, Multi-task Model

1 INTRODUCTION

Graphic designers have been producing infographics in a variety of fields, such as advertisement, business presentation, and journalism, because of their effectiveness in spreading information [20, 52]. To inform data context and engage audiences, infographics are often embellished with icons, shapes, and images in various styles [26]. However, creating infographics is demanding. Designers should consider not only perceptual effectiveness but also aesthetics, memorability, and engagement [9, 19]. Researchers have introduced design tools [26, 52, 55] to alleviate the burden of infographic creation by automating some processes (e.g., visual encoding). However, these tools require users to manually initialize most of the design (e.g., drawing graphical elements). The process remains difficult and time-consuming, especially for laymen, leading to the demand for automated infographic design.

Using templates is an effective approach to enable automated infographic design and has been widely used in commercial software such as Microsoft PowerPoint and Adobe Illustrator. These systems can automatically generate infographics by plugging data into a design template. Although easy to use, these systems typically provide only limited types of templates with default styles, which leads to a lack of diversity. By contrast, many infographics "in the wild" with diverse styles can only be accessed as images in a bitmap format. If users want to follow the styles of these bitmap infographics, they have to manually create their own infographics, which is difficult and tedious.

In this work, we investigate methods for automatically extracting an extensible template from a bitmap infographic. Compared to editable templates, extensible templates contain not only the editable elements but also the semantic roles of these elements, which enable automatic extension with updated data. Previous works [38, 39] attempt to extract visual encodings and color mappings from chart images based on rules and machine learning (ML) methods by utilizing the legends, axes, plot areas, and common layouts. However, the content of infographics can be unstructured and manifold, which makes it challenging to analyze infographic images and extract extensible templates from them. As a first step towards automated infographic design, we focus on timeline infographics, which have been widely used for centuries and whose design space has been extensively studied [10].
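To make the notion concrete, an extensible template can be thought of as a set of editable elements, each annotated with a semantic role, plus the global properties needed to place new instances. The sketch below is only an illustration of this idea; the role names, fields, and placement rule are hypothetical assumptions, not the representation used in this paper.

```python
# Minimal, hypothetical sketch of an "extensible" timeline template: editable
# elements carry semantic roles, and the roles are what allow new data records
# to be mapped onto the template automatically. All names are illustrative.
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TemplateElement:
    role: str                            # e.g., "event_mark", "event_text", "icon", "main_body"
    bbox: Tuple[int, int, int, int]      # (x, y, w, h) in the source bitmap
    content: Optional[str] = None        # editable payload, e.g., the text of an event


@dataclass
class TimelineTemplate:
    representation: str                  # global information, e.g., "linear"
    orientation: str                     # e.g., "horizontal" or "vertical"
    elements: List[TemplateElement]

    def extend(self, records: List[Dict[str, str]]) -> List[TemplateElement]:
        """Instantiate the repeating elements once per new data record."""
        repeating = [e for e in self.elements if e.role in ("event_mark", "event_text")]
        new_elements: List[TemplateElement] = []
        for i, record in enumerate(records):
            for proto in repeating:
                x, y, w, h = proto.bbox
                # Naive placement: shift each new event along the timeline axis.
                dx, dy = (i * w, 0) if self.orientation == "horizontal" else (0, i * h)
                payload = record.get("text") if proto.role == "event_text" else proto.content
                new_elements.append(TemplateElement(proto.role, (x + dx, y + dy, w, h), payload))
        return new_elements
```

Under these assumptions, calling `template.extend([{"text": "Paper accepted"}, {"text": "Ph.D. started"}])` would stamp out one event mark and one text element per record, which is exactly the kind of extension an editable-only template cannot perform automatically.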
Automatically extracting an extensible template from a bitmap timeline infographic is non-trivial. In particular, two obstacles stand in the way. First, it is challenging to interpret a bitmap timeline infographic automatically. Understanding the content of the infographic is necessary for automating the extraction. However, the elements in an infographic can be placed anywhere and in any style (e.g., shapes, colors, and sizes), and it is difficult for a machine to interpret an infographic that can only be accessed as pixels. Second, it is intricate to automatically convert a bitmap infographic into an extensible one. Understanding a timeline infographic is not enough to use it as a template. Even if the machine has already obtained structural information about the timeline (e.g., its type, orientation, and the categories and locations of its elements), how to convert the timeline into an extensible template remains unclear, not to mention that the extracted information could be incorrect.

To address these challenges, we propose a novel end-to-end approach for automatically extracting an extensible template from a bitmap timeline infographic. Our approach adopts a deconstruction and reconstruction paradigm. We address the first challenge at the deconstruction stage. We propose a multi-task deep neural network (DNN) that simultaneously parses two kinds of information from a timeline image: global and local information. Global information includes the representation, scale, layout, and orientation of the timeline. Local information includes the location, category, and pixels of each visual element on the timeline. Together, these two kinds of information provide a panorama of the timeline. We tackle the second challenge at the reconstruction stage. By utilizing the deconstruction results, we propose a pipeline with three techniques, i.e., Non-Maximum Merging, Redundancy Recover, and DL GrabCut, to extract an extensible template.
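As a rough illustration of the deconstruction stage, a multi-task network can be sketched as a shared backbone with one classification head per global property and a per-element local output. The code below is not the network proposed in this paper (which also predicts the location, category, and pixel mask of every element); the backbone choice, class counts, and all names are assumptions made for the sketch.

```python
# Illustrative-only sketch of a multi-task deconstruction model: a shared
# convolutional backbone, global heads for timeline-level attributes, and a
# placeholder local head. Class counts and names are assumptions.
import torch
import torch.nn as nn
import torchvision


class TimelineDeconstructor(nn.Module):
    def __init__(self, num_element_categories: int = 7):
        super().__init__()
        resnet = torchvision.models.resnet50()                    # shared feature extractor
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        feat_dim = 2048
        # Global heads: one classifier per timeline-level attribute (class counts assumed).
        self.global_heads = nn.ModuleDict({
            "representation": nn.Linear(feat_dim, 3),
            "scale": nn.Linear(feat_dim, 2),
            "layout": nn.Linear(feat_dim, 3),
            "orientation": nn.Linear(feat_dim, 2),
        })
        # Local head (placeholder): category logits only; a real detector would also
        # regress bounding boxes and segmentation masks for each element.
        self.local_head = nn.Linear(feat_dim, num_element_categories)

    def forward(self, image: torch.Tensor):
        feat = self.backbone(image).flatten(1)                    # (B, 2048)
        global_info = {name: head(feat) for name, head in self.global_heads.items()}
        local_info = self.local_head(feat)
        return global_info, local_info
```

Training such a model would sum one loss term per head, which is the usual way a single backbone is shared across the global and local tasks.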
trains a fully-connected neural network to predict design choices based on input data. Although we also aim for automated design, these systems cannot be adapted to infographics. They focus mainly on recommending visual encodings for the input data (e.g., how to encode data using visual channels). By contrast, designing infographics requires additional attention to visual styles (e.g., how to embellish the visualization with shapes and icons), which are omitted in these systems. In this regard, our work is inherently different from them.

2.2 Computational Interpretation of Visualization

Computational interpretation of visualization seeks to enable machines to understand the content of visualization images (e.g., data, styles, and visual encodings). According to their targets, prior methods can be divided into two categories: those for charts and those for infographics.

A general pipeline for interpreting a chart is to first identify the type of the chart via classification, then detect elements (e.g., marks or text) in the chart, and finally extract the underlying information (e.g., data or visual encodings). As a pioneer, Savva et al. introduced ReVision [46], in which graphical and textual features are fed into a support vector machine (SVM) model for chart type classification. ReVision then localizes the marks and extracts data from pie and bar charts by using a carefully designed multi-step method based on image processing and heuristics. Siegel