Spectral Python

Learning Geospatial Analysis with Python

Master GIS and Remote Sensing analysis using Python with these easy to follow tutorials

Joel Lawhead

BIRMINGHAM - MUMBAI

Copyright © 2013 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: October 2013

Production Reference: 1181013

Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78328-113-8

www.packtpub.com

Cover Image by Jarek Blaminsky ([email protected])

Credits

Author: Joel Lawhead
Reviewers: Jorge Samuel Mendes de Jesus, Athanasios Tom Kralidis, Alessandro Pasotti
Acquisition Editor: Joanne Fitzpatrick
Lead Technical Editor: Balaji Naidu
Technical Editors: Pooja Arondekar, Anita Nayak, Anusri Ramchandran
Project Coordinator: Angel Jathanna
Proofreader: Bernadette Watkins
Indexer: Hemangini Bari
Graphics: Abhinash Sahu
Production Coordinator: Shantanu Zagade
Cover Work: Shantanu Zagade

About the Author

Joel Lawhead is a PMI-certified Project Management Professional (PMP) and the Chief Information Officer (CIO) for NVisionSolutions.com, an award-winning firm specializing in geospatial technology integration and sensor engineering. He began using Python in 1997 and began combining it with geospatial software development in 2000. He has been published in two editions of the Python Cookbook by O'Reilly. He is also the developer of the widely used open source Python Shapefile Library (PyShp) and maintains the geospatial technical blog GeospatialPython.com and the Twitter feed @SpatialPython, which discuss the use of the Python programming language within the geospatial industry.

In 2011, he reverse engineered and published the undocumented shapefile spatial indexing format and assisted fellow geospatial Python developer, Marc Pfister, in reversing the algorithm used, allowing developers around the world to create better-integrated and more robust geospatial applications involving shapefiles.

He has served as the lead architect, project manager, and co-developer for geospatial applications used by US government agencies including NASA, FEMA, NOAA, and the US Navy, as well as many commercial and non-profit organizations. In 2002, he received the international "Esri Special Achievement in GIS" award for work on the Real-time Emergency Action Coordination Tool (REACT) for emergency management using geospatial analysis.
I would like to acknowledge my loving family including my wife Julie and four children Lauren, Will, Lillie, and Lainie who allowed me to write this book after hours. Thank you to my parents who inspired me through their actions to pursue computers, teaching, and writing; all the ingredients needed for a technical book. I would also like to acknowledge the work of the geospatial Python pioneers whose relentless and selfless contributions over the years in developing and publishing code to the geospatial Python body of knowledge made the content of this book possible, including Sean Gillies, Howard Butler, Matthew Perry, Frank Warmerdam, and Marc Pfister.

About the Reviewers

Jorge Samuel Mendes de Jesus has 15 years of programming experience in the field of Geoinformatics, with a focus on Python programming, web services, and spatial databases. He has a PhD in Geography and Sustainable Development from Ben-Gurion University, has been employed by the Joint Research Center, ISPRA, and Plymouth Marine Laboratory, and currently works at ISRIC, World Soil Information. He currently lives in Wageningen, the Netherlands, and spends his time learning combat sports and Dutch.

Athanasios Tom Kralidis is a Senior Systems Scientist for the Meteorological Service of Canada, where he provides geospatial technical and architectural leadership in support of MSC's data. His professional background includes key involvement in the development and integration of geospatial web standards, systems and services for the Canadian Geospatial Data Infrastructure (CGDI) with Natural Resources Canada (NRCan), as well as using these principles in architecting RésEau, Canada's water information portal. He is active in the Open Geospatial Consortium (OGC) community, was lead contributor to the OGC Web Map Context Documents Specification, a member of the CGDI Architecture Advisory Board, and part of the Canadian Advisory Committee to ISO Technical Committee 211 Geographic Information / Geomatics.
He is a developer on the MapServer, GeoNode and OWSLib open source software projects, and part of the MapServer Project Steering Committee. He is the founder and lead developer of pycsw, an OGC-compliant CSW reference implementation. He is also a charter member of the OGC. Tom holds a Bachelor's degree in Geography from York University, GIS certification from Algonquin College, and a Master's degree in Geography and Environmental Studies (research and dissertation in Geospatial Web Services / Infrastructure) from Carleton University. He is a Certified Geomatics Specialist (GIS/LIS) with the Canadian Institute of Geomatics.

Alessandro Pasotti is the founder of ItOpen, an Italian web development consultancy focused on web GIS development and accessible websites. He has been programming for over two decades and he is now mainly a web application developer, handling both frontend and backend development. He fell in love with Linux and free software in 1994 and never turned back. He spends most of his time developing web GIS applications in Python using GeoDjango and JavaScript mapping libraries such as OpenLayers.

www.PacktPub.com

Support files, eBooks, discount offers and more

You might want to visit www.PacktPub.com for support files and downloads related to your book.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

http://PacktLib.PacktPub.com

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.
Why Subscribe?

• Fully searchable across every book published by Packt
• Copy and paste, print and bookmark content
• On demand and accessible via web browser

Free Access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.

Table of Contents

Preface

Chapter 1: Learning Geospatial Analysis with Python
  Geospatial analysis and our world
  Beyond politics
  History of geospatial analysis
  Geographic Information Systems
  Remote sensing
  Elevation data
  Computer-aided drafting
  Geospatial analysis and computer programming
  Object-oriented programming for geospatial analysis
  Importance of geospatial analysis
  Geographic Information System concepts
  Thematic maps
  Spatial databases
  Spatial indexing
  Metadata
  Map projections
  Rendering
  Raster data concepts
  Images as data
  Remote sensing and color
  Common vector GIS concepts
  Data structures
  Buffer
  Dissolve
  Generalize
  Intersection
  Merge
  Point in polygon
  Union
  Join
  Geospatial rules about polygons
  Common raster data concepts
  Band math
  Change detection
  Histogram
  Feature extraction
  Supervised classification
  Unsupervised classification
  Creating the simplest possible Python GIS
  Getting started with Python
  Building SimpleGIS
  Summary

Chapter 2: Geospatial Data
  Data structures
  Common traits
  Geo-location
  Subject information
  Spatial indexing
  Metadata
  File structure
  Vector data
  Shapefiles
  CAD files
  Tag and markup-based formats
  GeoJSON
  Raster data
  TIFF files
  JPEG, GIF, BMP, and PNG
  Compressed formats
  ASCII GRIDS
  World files
  Point cloud data
  Summary

Chapter 3: The Geospatial Technology Landscape
  Data access
  GDAL
  OGR
  Computational geometry
  PROJ.4
  CGAL
  JTS
  GEOS
  PostGIS
  Other spatially-enabled databases
  Oracle spatial and graph
  ArcSDE
  Microsoft SQL Server
  MySQL
  SpatiaLite
  Routing
  Esri Network Analyst and Spatial Analyst
  pgRouting
  Desktop tools
  Quantum GIS
  OpenEV
  GRASS GIS
  uDig
  gvSIG
  OpenJUMP
  Google Earth
  NASA World Wind
  ArcGIS
  Metadata