Using Geomesa on Top of Apache Accumulo, Hbase, Cassandra, and Big Data file Formats for Massive Geospatial Data Apachecon 2019 James Hughes James Hughes
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The National Atlas and Google Earth in a Geodata Infrastructure
12th AGILE International Conference on Geographic Information Science 2009 page 1 of 11 Leibniz Universität Hannover, Germany Maps and Mash-ups: The National Atlas and Google Earth in a Geodata Infrastructure Barend Köbben and Marc Graham International Institute for Geo-Information Science and Earth Observation (ITC), Enschede, The Netherlands INTRODUCTION This paper is about different worlds, and how we try to unite them. The worlds we talk about are different occurrences or expressions of spatial data in different technological environments: One of them is the world of National Atlases, collections of complex, high quality maps presenting a nation to the geographically interested. The second is the world of Geodata Infrastructures (GDIs), highly organised, standardised and institutionalised large collections of spatial data and services. The third world is that of Virtual Globes, computer applications that bring the Earth as a highly interactive 3D representation to the desktops of the masses. These worlds represent also different areas of expertise. The atlas is where cartographers, with their focus on semiology, usability, graphic communication and aesthetic design, are most at home. Their digital tools are graphic design software and multimedia and visualisation toolkits. The GDIs are typically the domain of the (geo-)information scientists and GIS technicians. Their 'native' technological environment has an IT focus on multi-user databases, distributed services and high-end client software. Virtual globes are produced by hard-core informatics specialists: programmers that specialise in real-time 3D rendering of large data volumes, fast image tiling and texturing, similar to computer gaming technology. Their users however are mostly not aware of, nor interested in the technology behind the products. -
Analysis of Big Data Storage Tools for Data Lakes Based on Apache Hadoop Platform
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 8, 2021 Analysis of Big Data Storage Tools for Data Lakes based on Apache Hadoop Platform Vladimir Belov, Evgeny Nikulchev MIREA—Russian Technological University, Moscow, Russia Abstract—When developing large data processing systems, determined the emergence and use of various data storage the question of data storage arises. One of the modern tools for formats in HDFS. solving this problem is the so-called data lakes. Many implementations of data lakes use Apache Hadoop as a basic Among the most widely known formats used in the Hadoop platform. Hadoop does not have a default data storage format, system are JSON [12], CSV [13], SequenceFile [14], Apache which leads to the task of choosing a data format when designing Parquet [15], ORC [16], Apache Avro [17], PBF [18]. a data processing system. To solve this problem, it is necessary to However, this list is not exhaustive. Recently, new formats of proceed from the results of the assessment according to several data storage are gaining popularity, such as Apache Hudi [19], criteria. In turn, experimental evaluation does not always give a Apache Iceberg [20], Delta Lake [21]. complete understanding of the possibilities for working with a particular data storage format. In this case, it is necessary to Each of these file formats has own features in file structure. study the features of the format, its internal structure, In addition, differences are observed at the level of practical recommendations for use, etc. The article describes the features application. Thus, row-oriented formats ensure high writing of both widely used data storage formats and the currently speed, but column-oriented formats are better for data reading. -
An Open Source Spatial Data Infrastructure for the Cryosphere Community
OpenPolarServer (OPS) - An Open Source Spatial Data Infrastructure for the Cryosphere Community By Kyle W. Purdon Submitted to the graduate degree program in Geography and the Graduate Faculty of the University of Kansas in partial fulfillment of the requirements for the degree of Master of Science. Chairperson Dr. Xingong Li Dr. Terry Slocum Dr. David Braaten Date Defended: April 17, 2014 ii The Thesis Committee for Kyle W. Purdon – certifies that this is the approved version of the following thesis: OpenPolarServer (OPS) - An Open Source Spatial Data Infrastructure for the Cryosphere Community Chairperson Dr. Xingong Li Date approved: April 17, 2014 iii Abstract The Center for Remote Sensing of Ice Sheets (CReSIS) at The University of Kansas has collected approximately 700 TB of radar depth sounding data over the Arctic and Antarctic ice sheets since 1993 in an effort to map the thickness of the ice sheets and ultimately understand the impacts of climate change and sea level rise. In addition to data collection, the storage, management, and public distribution of the dataset are also one of the primary roles of CReSIS. The OpenPolarServer (OPS) project developed a free and open source spatial data infrastructure (SDI) to store, manage, analyze, and distribute the data collected by CReSIS in an effort to replace its current data storage and distribution approach. The OPS SDI includes a spatial database management system (DBMS), map and web server, JavaScript geoportal, and application programming interface (API) for the inclusion of data created by the cryosphere community. Open source software including GeoServer, PostgreSQL, PostGIS, OpenLayers, ExtJS, GeoEXT and others are used to build a system that modernizes the CReSIS SDI for the entire cryosphere community and creates a flexible platform for future development. -
Development of a Web Mapping Application Using Open Source
Centre National de l’énergie des sciences et techniques nucléaires (CNESTEN-Morocco) Implementation of information system to respond to a nuclear emergency affecting agriculture and food products - Case of Morocco Anis Zouagui1, A. Laissaoui1, M. Benmansour1, H. Hajji2, M. Zaryah1, H. Ghazlane1, F.Z. Cherkaoui3, M. Bounsir3, M.H. Lamarani3, T. El Khoukhi1, N. Amechmachi1, A. Benkdad1 1 Centre National de l’Énergie, des Sciences et des Techniques Nucléaires (CNESTEN), Morocco ; [email protected], 2 Institut Agronomique et Vétérinaire Hassan II (IAV), Morocco, 3 Office Régional de la Mise en Valeur Agricole du Gharb (ORMVAG), Morocco. INTERNATIONAL EXPERTS’ MEETING ON ASSESSMENT AND PROGNOSIS IN RESPONSE TO A NUCLEAR OR RADIOLOGICAL EMERGENCY (CN-256) IAEA Headquarters Vienna, Austria 20–24 April 2015 Context In nuclear disaster affecting agriculture, there is a need for rapid, reliable and practical tools and techniques to assess any release of radioactivity The research of hazards illustrates how geographic information is being integrated into solutions and the important role the Web now plays in communication and disseminating information to the public for mitigation, management, and recovery from a disaster. 2 Context Basically GIS is used to provide user with spatial information. In the case of the traditional GIS, these types of information are within the system or group of systems. Hence, this disadvantage of traditional GIS led to develop a solution of integrating GIS and Internet, which is called Web-GIS. 3 Project Goal CRP1.50.15: “ Response to Nuclear Emergency affecting Food and Agriculture” The specific objective of our contribution is to design a prototype of web based mapping application that should be able to: 1. -
Developer Tool Guide
Informatica® 10.2.1 Developer Tool Guide Informatica Developer Tool Guide 10.2.1 May 2018 © Copyright Informatica LLC 2009, 2019 This software and documentation are provided only under a separate license agreement containing restrictions on use and disclosure. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC. Informatica, the Informatica logo, PowerCenter, and PowerExchange are trademarks or registered trademarks of Informatica LLC in the United States and many jurisdictions throughout the world. A current list of Informatica trademarks is available on the web at https://www.informatica.com/trademarks.html. Other company and product names may be trade names or trademarks of their respective owners. U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, duplication, disclosure, modification, and adaptation is subject to the restrictions and license terms set forth in the applicable Government contract, and, to the extent applicable by the terms of the Government contract, the additional rights set forth in FAR 52.227-19, Commercial Computer Software License. Portions of this software and/or documentation are subject to copyright held by third parties. Required third party notices are included with the product. The information in this documentation is subject to change without notice. If you find any problems in this documentation, report them to us at [email protected]. -
Development of an Extension of Geoserver for Handling 3D Spatial Data Hyung-Gyu Ryoo Pusan National University
Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings Volume 17 Boston, USA Article 6 2017 Development of an extension of GeoServer for handling 3D spatial data Hyung-Gyu Ryoo Pusan National University Soojin Kim Pusan National University Joon-Seok Kim Pusan National University Ki-Joune Li Pusan National University Follow this and additional works at: https://scholarworks.umass.edu/foss4g Part of the Databases and Information Systems Commons Recommended Citation Ryoo, Hyung-Gyu; Kim, Soojin; Kim, Joon-Seok; and Li, Ki-Joune (2017) "Development of an extension of GeoServer for handling 3D spatial data," Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings: Vol. 17 , Article 6. DOI: https://doi.org/10.7275/R5ZK5DV5 Available at: https://scholarworks.umass.edu/foss4g/vol17/iss1/6 This Paper is brought to you for free and open access by ScholarWorks@UMass Amherst. It has been accepted for inclusion in Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings by an authorized editor of ScholarWorks@UMass Amherst. For more information, please contact [email protected]. Development of an extension of GeoServer for handling 3D spatial data Optional Cover Page Acknowledgements This research was supported by a grant (14NSIP-B080144-01) from National Land Space Information Research Program funded by Ministry of Land, Infrastructure and Transport of Korean government and BK21PLUS, Creative Human Resource Development Program for IT Convergence. This paper is available in Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings: https://scholarworks.umass.edu/foss4g/vol17/iss1/6 Development of an extension of GeoServer for handling 3D spatial data Hyung-Gyu Ryooa,∗, Soojin Kima, Joon-Seok Kima, Ki-Joune Lia aDepartment of Computer Science and Engineering, Pusan National University Abstract: Recently, several open source software tools such as CesiumJS and iTowns have been developed for dealing with 3-dimensional spatial data. -
Opengis Web Feature Services for Editing Cadastral Data
OpenGIS Web Feature Services for editing cadastral data Analysis and practical experiences Master of Science thesis T.J. Brentjens Section GIS Technology Geodetic Engineering Faculty of Civil Engineering and Geosciences Delft University of Technology OpenGIS Web Feature Services for editing cadastral data Analysis and practical experiences Master of Science thesis Thijs Brentjens Professor: prof. dr. ir. P.J.M. van Oosterom (Delft University of Technology) Supervisors: drs. M.E. de Vries (Delft University of Technology) drs. C.W. Quak (Delft University of Technology) drs. C. Vijlbrief (Kadaster) Delft, April 2004 Section GIS Technology Geodetic Engineering Faculty of Civil Engineering and Geosciences Delft University of Technology The Netherlands Het Kadaster Apeldoorn The Netherlands i ii Preface Preface This thesis is the result of the efforts I have put in my graduation research project between March 2003 and April 2004. I have performed this research part-time at the section GIS Technology of TU Delft in cooperation with the Kadaster (the Dutch Cadastre), in order to get the Master of Science degree in Geodetic Engineering. Typing the last words for this thesis, I have been realizing more than ever that this thesis marks the end of my time as a student at the TU Delft. However, I also realize that I have been working to this point with joy. Many people are “responsible” for this, but I’d like to mention the people who have contributed most. First of all, there are of course people who were directly involved in the research project. Peter van Oosterom had many critical notes and - maybe even more important - the ideas born out of his enthusiasm improved the entire research. -
Unravel Data Systems Version 4.5
UNRAVEL DATA SYSTEMS VERSION 4.5 Component name Component version name License names jQuery 1.8.2 MIT License Apache Tomcat 5.5.23 Apache License 2.0 Tachyon Project POM 0.8.2 Apache License 2.0 Apache Directory LDAP API Model 1.0.0-M20 Apache License 2.0 apache/incubator-heron 0.16.5.1 Apache License 2.0 Maven Plugin API 3.0.4 Apache License 2.0 ApacheDS Authentication Interceptor 2.0.0-M15 Apache License 2.0 Apache Directory LDAP API Extras ACI 1.0.0-M20 Apache License 2.0 Apache HttpComponents Core 4.3.3 Apache License 2.0 Spark Project Tags 2.0.0-preview Apache License 2.0 Curator Testing 3.3.0 Apache License 2.0 Apache HttpComponents Core 4.4.5 Apache License 2.0 Apache Commons Daemon 1.0.15 Apache License 2.0 classworlds 2.4 Apache License 2.0 abego TreeLayout Core 1.0.1 BSD 3-clause "New" or "Revised" License jackson-core 2.8.6 Apache License 2.0 Lucene Join 6.6.1 Apache License 2.0 Apache Commons CLI 1.3-cloudera-pre-r1439998 Apache License 2.0 hive-apache 0.5 Apache License 2.0 scala-parser-combinators 1.0.4 BSD 3-clause "New" or "Revised" License com.springsource.javax.xml.bind 2.1.7 Common Development and Distribution License 1.0 SnakeYAML 1.15 Apache License 2.0 JUnit 4.12 Common Public License 1.0 ApacheDS Protocol Kerberos 2.0.0-M12 Apache License 2.0 Apache Groovy 2.4.6 Apache License 2.0 JGraphT - Core 1.2.0 (GNU Lesser General Public License v2.1 or later AND Eclipse Public License 1.0) chill-java 0.5.0 Apache License 2.0 Apache Commons Logging 1.2 Apache License 2.0 OpenCensus 0.12.3 Apache License 2.0 ApacheDS Protocol -
Talend Open Studio for Big Data Release Notes
Talend Open Studio for Big Data Release Notes 6.0.0 Talend Open Studio for Big Data Adapted for v6.0.0. Supersedes previous releases. Publication date July 2, 2015 Copyleft This documentation is provided under the terms of the Creative Commons Public License (CCPL). For more information about what you can and cannot do with this documentation in accordance with the CCPL, please read: http://creativecommons.org/licenses/by-nc-sa/2.0/ Notices Talend is a trademark of Talend, Inc. All brands, product names, company names, trademarks and service marks are the properties of their respective owners. License Agreement The software described in this documentation is licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.html. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. This product includes software developed at AOP Alliance (Java/J2EE AOP standards), ASM, Amazon, AntlR, Apache ActiveMQ, Apache Ant, Apache Avro, Apache Axiom, Apache Axis, Apache Axis 2, Apache Batik, Apache CXF, Apache Cassandra, Apache Chemistry, Apache Common Http Client, Apache Common Http Core, Apache Commons, Apache Commons Bcel, Apache Commons JxPath, Apache -
Assessment of Multiple Ingest Strategies for Accumulo Key-Value Store
Assessment of Multiple Ingest Strategies for Accumulo Key-Value Store by Hai Pham A thesis submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Master of Science Auburn, Alabama May 7, 2016 Keywords: Accumulo, noSQL, ingest Copyright 2016 by Hai Pham Approved by Weikuan Yu, Co-Chair, Associate Professor of Computer Science, Florida State University Saad Biaz, Co-Chair, Professor of Computer Science and Software Engineering, Auburn University Sanjeev Baskiyar, Associate Professor of Computer Science and Software Engineering, Auburn University Abstract In recent years, the emergence of heterogeneous data, especially of the unstructured type, has been extremely rapid. The data growth happens concurrently in 3 dimensions: volume (size), velocity (growth rate) and variety (many types). This emerging trend has opened a new broad area of research, widely accepted as Big Data, which focuses on how to acquire, organize and manage huge amount of data effectively and efficiently. When coping with such Big Data, the traditional approach using RDBMS has been inefficient; because of this problem, a more efficient system named noSQL had to be created. This thesis will give an overview knowledge on the aforementioned noSQL systems and will then delve into a more specific instance of them which is Accumulo key-value store. Furthermore, since Accumulo is not designed with an ingest interface for users, this thesis focuses on investigating various methods for ingesting data, improving the performance and dealing with numerous parameters affecting this process. ii Acknowledgments First and foremost, I would like to express my profound gratitude to Professor Yu who with great kindness and patience has guided me through not only every aspect of computer science research but also many great directions towards my personal issues. -
The State of Open Source GIS
The State of Open Source GIS Prepared By: Paul Ramsey, Director Refractions Research Inc. Suite 300 – 1207 Douglas Street Victoria, BC, V8W-2E7 [email protected] Phone: (250) 383-3022 Fax: (250) 383-2140 Last Revised: September 15, 2007 TABLE OF CONTENTS 1 SUMMARY ...................................................................................................4 1.1 OPEN SOURCE ........................................................................................... 4 1.2 OPEN SOURCE GIS.................................................................................... 6 2 IMPLEMENTATION LANGUAGES ........................................................7 2.1 SURVEY OF ‘C’ PROJECTS ......................................................................... 8 2.1.1 Shared Libraries ............................................................................... 9 2.1.1.1 GDAL/OGR ...................................................................................9 2.1.1.2 Proj4 .............................................................................................11 2.1.1.3 GEOS ...........................................................................................13 2.1.1.4 Mapnik .........................................................................................14 2.1.1.5 FDO..............................................................................................15 2.1.2 Applications .................................................................................... 16 2.1.2.1 MapGuide Open Source...............................................................16 -
HDP 3.1.4 Release Notes Date of Publish: 2019-08-26
Release Notes 3 HDP 3.1.4 Release Notes Date of Publish: 2019-08-26 https://docs.hortonworks.com Release Notes | Contents | ii Contents HDP 3.1.4 Release Notes..........................................................................................4 Component Versions.................................................................................................4 Descriptions of New Features..................................................................................5 Deprecation Notices.................................................................................................. 6 Terminology.......................................................................................................................................................... 6 Removed Components and Product Capabilities.................................................................................................6 Testing Unsupported Features................................................................................ 6 Descriptions of the Latest Technical Preview Features.......................................................................................7 Upgrading to HDP 3.1.4...........................................................................................7 Behavioral Changes.................................................................................................. 7 Apache Patch Information.....................................................................................11 Accumulo...........................................................................................................................................................