Open-Source Linear Book Scanner

ME450 - Winter 2014 Term The University of Michigan: Mechanical Engineering Team 19 Open-Source Linear Book Scanner Garrett Cullen Kelsey Lindberg Lauren Staszel Teresa Tombelli ABSTRACT protected against physical damage, and can be Digitizing books and documents has made searchable. This idea led to a “mass become increasingly prevalent amongst digitization” initiative by Google in 2004 called libraries and university facilities as large data Google Books, which involved partnerships with storage becomes cheaper and more available five universities including the University of and physical storage space becomes scarce. Michigan [2]. The aim of this project was to There are currently many commercial scanning preserve the massive collections of these methods available, but all but have trade-offs research libraries and to allow anyone on the between speed, reliability, automatic page internet to access them for free. This sparked turning and price. The cost of even the most interest in automatic scanning and the basic scanners is quite significant and development of commercial products. prohibitively expensive to smaller libraries and Several patented configurations for book facilities. To alleviate this problem, Google scanners are currently in use and are began an open-source book scanning project categorized by page turning method. The first whose goal was to develop a prototype that is design holds a book face-up while a swinging of reasonable size, fully automatic, and can be arm uses vacuum suction to turn the pages and easily manufactured within the budget of a an overhead camera captures images [3]. A small library. Since 2012, three prototypes have second design holds a book face-up in a V- been built in an attempt to satisfy these shaped cradle while two digital cameras capture requirements. Each iteration of the design has images. To turn pages, a flat piece of plastic is implemented a new variation on the idea of inserted between pages and turns them as it pneumatic page turning. This involves the use rotates [4]. Another method involves blowing of a vacuum to pull book pages into a channel air to “fluff” the pages apart, and a vacuum is that flips them to the other side of the book. raised off the surface of the book to turn pages The purpose of this report is to present the [5]. The majority of these methods are fast but design process and final plans for the fourth not especially reliable, with error rates around generation of the linear book scanner 5%. Commercially available scanners that prototype, as developed by ME450 Winter 2014 incorporate those methods are also extremely senior design Team 19. expensive. With prices ranging from $50,000- 250,000, they are essentially inaccessible to BACKGROUND smaller libraries [6-8]. Some of the scanners Libraries have traditionally been limited by with lower cost such as BookDrive still require the number of individuals they can reach, space manual page turning, and are far less efficient for physical text storage and the challenge of than a fully automated scanner. finding the specific information requested by a Since commercially available devices are user [1]. All of these problems can be solved by either too costly, not reliable or not efficient, storing information digitally, as texts can be the goal of Google’s project has been to kept in a small space at a low cost, are manufacture an affordable, reliable book 1 Figure 1a (top): “A book moves back and forth over the machine” [9]. Figure 1b (bottom): “Each time across, a vacuum sucks a page from one side to the other” [9]. scanner that is fully automated. This project is that the scanner is as reliable and cost-effective also intended to be open source, so that any as possible. library can have access to the engineering drawings and manufacture a scanner for their METHODS facility. Dany Qumsiyeh produced the first To develop the next generation of linear prototype while he was an employee at Google book scanner, the first step was to fully define in 2012. He also built a second prototype that the problem by investigating current scanning included improvements from the first before products and interviewing the librarians in handing the project to a University of Michigan charge of digitizing the UM library. Next, senior design team in 2013. concepts for the new prototype were generated The most recent (third generation) and compared to select the most promising prototype consists of a moving “saddle” seated design. Finally, engineering analysis was on an inverted V-shaped base that forms the performed to verify that the selected designs body of the scanner. A channel is cut through would function as intended. This phase was also the top of the body to allow pages to fall used to further refine many of the physical through it. A user places a book face-down in design features. the saddle and a motor moves the saddle down the body of the machine, scans the two Stakeholder Needs and Specifications exposed pages, then a vacuum “grabs” the page An investigation of user needs was currently facing down and pulls it into the conducted through interviews of the library channel. As the saddle moves back towards its sponsors and a tour of current book-imaging starting position, the page in the channel is facilities. While the University has already funneled into a separate channel and exits the imaged 90% of its books, it quickly became clear body on the opposite side, effectively turning that current imaging methods were inefficient. the page (Fig. 1 a-b, above). This existing While the library has several systems in use, prototype of the linear book scanner is none can consistently turn pages without functional, but changes need to be made so human intervention, regardless of price. The most reliable system in use is the Zuetschel, 2 which has no automatic page turning ability at Concept Generation and Selection all and requires a person to turn every page by After speaking to the library sponsors and hand. The library also has a Treventus which fully defining their requirements for the book can scan very quickly, but the error rate for scanner, the next step was to begin generating skipped pages is still around 5% for this design concepts. While it was clear that the $100,000 system. After speaking with the sponsors intended for the new design to be employees in charge of scanning, the user similar to the previous prototypes, the first requirements for a new scanning system were round of the concept generation process also defined. included alternative page turning methods. It First, the device must turn pages reliably. was important to begin by evaluating the page This means that it should minimize the rate of turning process as a whole so that even well- skipped, duplicated and damaged pages to established ideas could be examined for quality. below 10%. While 10% is still a relatively high In this first round, ideas as diverse as error rate for a book with hundreds of pages, electrostatics and purely mechanical page this number is comparable to the error rate of turning were evaluated against the original the Treventus and is a reasonable place to start. vacuum suction method. In the end however, a Low cost is the second most important Pugh chart was created and it became clear that requirement for the book scanner, since the a pneumatic device was still the best option. purpose of the open source product is to Once the general shape of the scanner was increase accessibility. This means that the chosen, a functional decomposition was created design should be both simple and cheap enough to outline each subsystem of the scanner, and for a smaller library to build. Specifically, the an additional set of concepts were generated to goal for the project is for the material costs to address each one. These new concepts focused be below $1000, as specified by the sponsor. In on making improvements over the previous addition to reliability, it is preferred that the prototypes. After each idea had been fully scanning process require as few manual developed and presented to the team, a adjustments as possible. Ideally, several discussion was held to decide which of the new scanners could be running at once with one ideas should be included in the final design. technician working to supervise all the Ultimately, it became clear that the major new machines. Additionally, the scanner should be innovation would be to add sensors to detect able to accommodate a large range of book which pages had been turned, rather than sizes. The intended range should include all adding new fans or hardware and change the common book sizes, with the smallest having geometry of the page turning slit. While the dimensions 4”x6” and the largest 12”x12” previous prototype had incorporated a plastic [10][11]. Any restrictions on the physical design “finger” to guide pages as they were being are simply a matter of practicality - the scanner flipped, a new flap would be included in its must be able to fit on a table top (2’x4’) and place. This would allow pages to be turned in two people should be able to move it. Each of one fluid motion instead of two distinct steps. these requirements is summarized in following The idea was to prevent errors by removing the table. obstruction of the “finger” as pages moved through the body of the scanner. Table 1: User specifications numbered by priority The final concept was a simple combination of several design features, and stood out because it combined the best aspects of the old prototypes with new concepts that were intended to fix the flaws of the previous versions.

Open-Source Linear Book Scanner

COURT REJECTS $125 MILLION SETTLEMENT in GOOGLE DIGITAL BOOK SCANNING PROJECT by Jennifer J

The Why in DIY Book Scanning

Library Instruction Round Table

IFLA Guidelines for Planning the Digitization of Rare Book And

Hachette Book Group V. Internet Archive

Index of /Sites/Default/Al Direct/2011/October

Digital Imaging and Preservation Microfilm: the Future of the Hybrid

Second Circuit Rules Against Authors Guild in University Book-Scanning Case

Book Scanning Service

A Brief Survey of Online Libraries

The DIY New Standard Book Scanner

June 12, 2014 from BOOKS to BYTES: SECOND CIRCUIT