Oracle® Big Data Connectors User's Guide Release 2 (2.1) E36961-05

July 2013 Describes installation and use of Oracle Big Data Connectors: Oracle SQL Connector for Hadoop Distributed File System, Oracle Loader for Hadoop, Oracle Data Integrator Application Adapter for Hadoop, and Oracle R Connector for Hadoop.

Oracle Big Data Connectors User's Guide, Release 2 (2.1) E36961-05

Copyright © 2011, 2013, Oracle and/or its affiliates. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:

U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.

This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.

This software or hardware and documentation may provide access to or information on content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.

Cloudera, Cloudera CDH, and Cloudera Manager are registered and unregistered trademarks of Cloudera, Inc.

Contents

Preface ...... ix
    Audience ...... ix
    Documentation Accessibility ...... ix
    Related Documents ...... ix
    Text Conventions ...... x
    Syntax Conventions ...... x

Changes in This Release for Oracle Big Data Connectors User's Guide ...... xi
    Changes in Oracle Big Data Connectors Release 2 (2.0) ...... xi

1 Getting Started with Oracle Big Data Connectors
About Oracle Big Data Connectors ...... 1-1
Big Data Concepts and Technologies ...... 1-2
What is MapReduce? ...... 1-2
What is Apache Hadoop? ...... 1-2
Downloading the Oracle Big Data Connectors Software ...... 1-3
Oracle SQL Connector for Hadoop Distributed File System Setup ...... 1-4
Software Requirements ...... 1-4
Installing and Configuring a Hadoop Client ...... 1-4
Installing Oracle SQL Connector for HDFS ...... 1-5
Providing Support for Hive Tables ...... 1-8
Granting User Privileges in Oracle Database ...... 1-10
Setting Up User Accounts on the Oracle Database System ...... 1-11
Setting Up User Accounts on the Hadoop Cluster ...... 1-11
Oracle Loader for Hadoop Setup ...... 1-12
Software Requirements ...... 1-12
Installing Oracle Loader for Hadoop ...... 1-12
Providing Support for Offline Database Mode ...... 1-13
Oracle Data Integrator Application Adapter for Hadoop Setup ...... 1-13
System Requirements and Certifications ...... 1-14
Technology-Specific Requirements ...... 1-14
Location of Oracle Data Integrator Application Adapter for Hadoop ...... 1-14
Setting Up the Topology ...... 1-14
Oracle R Connector for Hadoop Setup ...... 1-14
Installing the Software on Hadoop ...... 1-14
Installing Additional R Packages ...... 1-17
Providing Remote Client Access to R Users ...... 1-19

2 Oracle SQL Connector for Hadoop Distributed File System
About Oracle SQL Connector for HDFS ...... 2-1
Getting Started With Oracle SQL Connector for HDFS ...... 2-2
Configuring Your System for Oracle SQL Connector for HDFS ...... 2-5
Using the ExternalTable Command-Line Tool ...... 2-6
About ExternalTable ...... 2-6
ExternalTable Command-Line Tool Syntax ...... 2-6
Creating External Tables ...... 2-8
Creating External Tables with the ExternalTable Tool ...... 2-8
Creating External Tables from Data Pump Format Files ...... 2-8
Creating External Tables from Hive Tables ...... 2-10
Creating External Tables from Delimited Text Files ...... 2-12
Creating External Tables in SQL ...... 2-15
Publishing the HDFS Data Paths ...... 2-15
Listing Location File Metadata and Contents ...... 2-16
Describing External Tables ...... 2-16
More About External Tables Generated by the ExternalTable Tool ...... 2-17
What Are Location Files? ...... 2-17
Enabling Parallel Processing ...... 2-17
Location File Management ...... 2-17
Location File Names ...... 2-18
Configuring Oracle SQL Connector for HDFS ...... 2-18
Creating a Configuration File ...... 2-18
Configuration Properties ...... 2-19
Performance Tips for Querying Data in HDFS ...... 2-25

3 Oracle Loader for Hadoop
What Is Oracle Loader for Hadoop? ...... 3-1
About the Modes of Operation ...... 3-2
Online Database Mode ...... 3-2
Offline Database Mode ...... 3-3
Getting Started With Oracle Loader for Hadoop ...... 3-3
Creating the Target Table ...... 3-5
Supported Data Types for Target Tables ...... 3-5
Supported Partitioning Strategies for Target Tables ...... 3-5
Creating a Job Configuration File ...... 3-6
About the Target Table Metadata ...... 3-8
Providing the Connection Details for Online Database Mode ...... 3-8
Generating the Target Table Metadata for Offline Database Mode ...... 3-8
About Input Formats ...... 3-10
Delimited Text Input Format ...... 3-11
Complex Text Input Formats ...... 3-12
Hive Table Input Format ...... 3-12
Avro Input Format ...... 3-13
Oracle NoSQL Database Input Format ...... 3-13
Custom Input Formats ...... 3-13
Mapping Input Fields to Target Table Columns ...... 3-14
Automatic Mapping ...... 3-15
Manual Mapping ...... 3-15
About Output Formats ...... 3-16
JDBC Output Format ...... 3-16
Oracle OCI Direct Path Output Format ...... 3-17
Delimited Text Output Format ...... 3-18
Oracle Data Pump Output Format ...... 3-19
Running a Loader Job ...... 3-20
Specifying Hive Input Format JAR Files ...... 3-21
Specifying Oracle NoSQL Database Input Format JAR Files ...... 3-21
Job Reporting ...... 3-21
Handling Rejected Records ...... 3-22
Logging Rejected Records in Bad Files ...... 3-22
Setting a Job Reject Limit ...... 3-22
Balancing Loads When Loading Data into Partitioned Tables ...... 3-22
Using the Sampling Feature ...... 3-23
Tuning Load Balancing ...... 3-23
Tuning Sampling Behavior ...... 3-23
When Does Oracle Loader for Hadoop Use the Sampler's Partitioning Scheme? ...... 3-23
Resolving Memory Issues ...... 3-24
What Happens When a Sampling Feature Property Has an Invalid Value? ...... 3-24
Optimizing Communications Between Oracle Engineered Systems ...... 3-24
Oracle Loader for Hadoop Configuration Property Reference ...... 3-24
Third-Party Licenses for Bundled Software ...... 3-37
Apache Licensed Code ...... 3-38
Apache Avro 1.7.3 ...... 3-41
Apache Commons Mathematics Library 2.2 ...... 3-41
Jackson JSON 1.8.8 ...... 3-41

4 Oracle Data Integrator Application Adapter for Hadoop
Introduction ...... 4-1
Concepts ...... 4-1
Knowledge Modules ...... 4-2
Security ...... 4-2
Setting Up the Topology ...... 4-3
Setting Up File Data Sources ...... 4-3
Setting Up Hive Data Sources ...... 4-3
Setting Up the Oracle Data Integrator Agent to Execute Hadoop Jobs ...... 4-5
Configuring Oracle Data Integrator Studio for Executing Hadoop Jobs on the Local Agent ...... 4-6
Setting Up an Integration Project ...... 4-7
Creating an Oracle Data Integrator Model from a Reverse-Engineered Hive Model ...... 4-7
Creating a Model ...... 4-7
Reverse Engineering Hive Tables ...... 4-7
Designing the Interface ...... 4-8
Loading Data from Files into Hive ...... 4-8
Validating and Transforming Data Within Hive ...... 4-9
Loading Data into an Oracle Database from Hive and HDFS ...... 4-11

5 Oracle R Connector for Hadoop
About Oracle R Connector for Hadoop ...... 5-1
Access to HDFS Files ...... 5-2
Access to Hive ...... 5-3
ORE Functions for Hive ...... 5-3
Generic R Functions Supported in Hive ...... 5-4
Support for Hive Data Types ...... 5-5
Usage Notes for Hive Access ...... 5-7
Example: Loading Hive Tables into Oracle R Connector for Hadoop ...... 5-7
Access to Oracle Database ...... 5-8
Usage Notes for Oracle Database Access ...... 5-8
Scenario for Using Oracle R Connector for Hadoop with Oracle R Enterprise ...... 5-8
Analytic Functions in Oracle R Connector for Hadoop ...... 5-9
ORCH mapred.config Class ...... 5-9
Examples and Demos of Oracle R Connector for Hadoop ...... 5-11
Using the Demos ...... 5-11
Using the Examples ...... 5-19
Security Notes for Oracle R Connector for Hadoop ...... 5-30

6 ORCH Library Reference
Functions in Alphabetical Order ...... 6-1
Functions by Category ...... 6-2
Making Connections ...... 6-3
Copying Data ...... 6-3
Exploring Files ...... 6-3
Writing MapReduce Functions ...... 6-3
Debugging Scripts ...... 6-3
Using Hive Data ...... 6-4
Writing Analytical Functions ...... 6-4
hadoop.exec ...... 6-5
hadoop.run ...... 6-8
hdfs.attach ...... 6-11
hdfs.cd ...... 6-13
hdfs.cp ...... 6-14
hdfs.describe ...... 6-15
hdfs.download ...... 6-16
hdfs.exists ...... 6-17
hdfs.get ...... 6-18
hdfs.head ...... 6-19
hdfs.id ...... 6-20
hdfs.ls ...... 6-21
hdfs.mkdir ...... 6-22
hdfs.mv ...... 6-23
hdfs.parts ...... 6-24
hdfs.pull ...... 6-25
hdfs.push ...... 6-27
hdfs.put ...... 6-29
hdfs.pwd ...... 6-30
hdfs.rm ...... 6-31
hdfs.rmdir ...... 6-32
hdfs.root ...... 6-33
hdfs.sample ...... 6-34
hdfs.setroot ...... 6-35
hdfs.size ...... 6-36
hdfs.tail ...... 6-37
hdfs.upload ...... 6-38
is.hdfs.id ...... 6-40
orch.connect ...... 6-41
orch.connected ...... 6-44
orch.dbcon ...... 6-45
orch.dbg.lasterr ...... 6-47
orch.dbg.off ...... 6-48
orch.dbg.on ...... 6-49
orch.dbg.output ...... 6-50
orch.dbinfo ...... 6-51
orch.disconnect ...... 6-52
orch.dryrun ...... 6-54
orch.export ...... 6-55
orch.keyval ...... 6-56
orch.keyvals ...... 6-57
orch.pack ...... 6-59
orch.reconnect ...... 6-60
orch.temp.path ...... 6-61
orch.unpack ...... 6-62
orch.version ...... 6-63
Index

Preface

The Oracle Big Data Connectors User's Guide describes how to install and use Oracle Big Data Connectors:

■ Oracle Loader for Hadoop

■ Oracle SQL Connector for Hadoop Distributed File System

■ Oracle Data Integrator Application Adapter for Hadoop

■ Oracle R Connector for Hadoop

Audience
This document is intended for users of Oracle Big Data Connectors, including the following:

■ Application developers

■ Java programmers

■ System administrators

■ Database administrators

Documentation Accessibility
For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.

Access to Oracle Support
Oracle customers have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.

Related Documents
For more information, see the following documents:

■ Oracle Loader for Hadoop Java API Reference

■ Oracle Fusion Middleware Application Adapters Guide for Oracle Data Integrator

Text Conventions
The following text conventions are used in this document:

Convention    Meaning
boldface      Boldface type indicates graphical user interface elements associated with an action, or terms defined in text or the glossary.
italic        Italic type indicates book titles, emphasis, or placeholder variables for which you supply particular values.
monospace     Monospace type indicates commands within a paragraph, URLs, code in examples, text that appears on the screen, or text that you enter.

Syntax Conventions
The syntax in Chapter 2 is presented in a simple variation of Backus-Naur Form (BNF) that uses the following symbols and conventions:

Symbol or Convention    Description
[ ]                     Brackets enclose optional items.
{ }                     Braces enclose a choice of items, only one of which is required.
|                       A vertical bar separates alternatives within brackets or braces.
...                     Ellipses indicate that the preceding syntactic element can be repeated.
delimiters              Delimiters other than brackets, braces, and vertical bars must be entered as shown.

Changes in This Release for Oracle Big Data Connectors User's Guide

This preface lists changes in the Oracle Big Data Connectors User's Guide.

Changes in Oracle Big Data Connectors Release 2 (2.0)
The following are changes in Oracle Big Data Connectors User's Guide for Oracle Big Data Connectors Release 2 (2.0).

New Features
Oracle Big Data Connectors support Cloudera's Distribution including Apache Hadoop version 4 (CDH4). For other supported platforms, see the individual connectors in Chapter 1. The name of Oracle Direct Connector for Hadoop Distributed File System changed to Oracle SQL Connector for Hadoop Distributed File System.

■ Oracle SQL Connector for Hadoop Distributed File System
– Automatic creation of Oracle Database external tables from Hive tables, Data Pump files, or delimited text files.
– Management of location files.
See Chapter 2.

■ Oracle Loader for Hadoop
– Support for Sockets Direct Protocol (SDP) for direct path loads
– Support for secondary sort on user-specified columns
– New input formats for regular expressions and Oracle NoSQL Database. The Avro record InputFormat is supported code instead of sample code.
– Simplified date format specification
– New reject limit threshold
– Improved job reporting and diagnostics
See Chapter 3.

■ Oracle Data Integrator Application Adapter for Hadoop – Uses Oracle SQL Connector for HDFS or Oracle Loader for Hadoop to load data from Hadoop into an Oracle database.

■ Oracle R Connector for Hadoop

Several analytic algorithms are now available: linear regression, neural networks for prediction, matrix completion using low rank matrix factorization, clustering, and non-negative matrix factorization. Oracle R Connector for Hadoop supports Hive data sources in addition to HDFS files. Oracle R Connector for Hadoop can move data between HDFS and Oracle Database. Oracle R Enterprise is not required for this basic transfer of data.
The following functions are new in this release:
as.ore.*
hadoop.jobs
hdfs.head
hdfs.tail
is.ore.*
orch.connected
orch.dbg.lasterr
orch.evaluate
orch.export.fit
orch.lm
orch.lmf
orch.neural
orch.nmf
orch.nmf.NMFalgo
orch.temp.path
ore.*
predict.orch.lm
print.summary.orch.lm
summary.orch.lm

See Chapter 5.

Deprecated Features
The following features are deprecated in this release, and may be desupported in a future release:

■ Oracle SQL Connector for Hadoop Distributed File System
– Location file format (version 1): Existing external tables with content published using Oracle Direct Connector for HDFS version 1 must be republished using Oracle SQL Connector for HDFS version 2, because of incompatible changes to the location file format. When Oracle SQL Connector for HDFS creates new location files, it does not delete the old location files. See Chapter 2.
– oracle.hadoop.hdfs.exttab namespace (version 1): Oracle SQL Connector for HDFS uses the following new namespaces for all configuration properties:
* oracle.hadoop.connection: Oracle Database connection and wallet properties
* oracle.hadoop.exttab: All other properties
See Chapter 2.
– HDFS_BIN_PATH directory: The preprocessor directory name is now OSCH_BIN_PATH.

xii See "Oracle SQL Connector for Hadoop Distributed File System Setup" on page 1-4.

■ Oracle R Connector for Hadoop
– keyval: Use orch.keyval to generate key-value pairs.
– orch.reconnect: Use orch.connect to reconnect using a connection object returned by orch.dbcon.

Desupported Features
The following features are no longer supported by Oracle.

■ Oracle Loader for Hadoop
– oracle.hadoop.loader.configuredCounters
See Chapter 3.

Other Changes
The following are additional changes in the release:

■ Oracle Loader for Hadoop
The installation zip archive now contains two kits:
– oraloader-2.0.0-2.x86_64.zip for CDH4
– oraloader-2.0.0-1.x86_64.zip for Apache Hadoop 0.20.2 and CDH3
See "Oracle Loader for Hadoop Setup" on page 1-12.


1 Getting Started with Oracle Big Data Connectors

This chapter describes the Oracle Big Data Connectors, provides installation instructions, and identifies the permissions needed for users to access the connectors. This chapter contains the following sections:

■ About Oracle Big Data Connectors

■ Big Data Concepts and Technologies

■ Downloading the Oracle Big Data Connectors Software

■ Oracle SQL Connector for Hadoop Distributed File System Setup

■ Oracle Loader for Hadoop Setup

■ Oracle Data Integrator Application Adapter for Hadoop Setup

■ Oracle R Connector for Hadoop Setup

About Oracle Big Data Connectors
Oracle Big Data Connectors facilitate data access between a Hadoop cluster and Oracle Database. They can be licensed for use on either Oracle Big Data Appliance or a Hadoop cluster running on commodity hardware. These are the connectors:

■ Oracle SQL Connector for Hadoop Distributed File System (previously Oracle Direct Connector for HDFS): Enables an Oracle external table to access data stored in Hadoop Distributed File System (HDFS) files or a Hive table. The data can remain in HDFS or the Hive table, or it can be loaded into an Oracle database. Oracle SQL Connector for HDFS is a command-line utility that accepts generic command line arguments supported by the org.apache.hadoop.util.Tool interface. It also provides a preprocessor for Oracle external tables.

■ Oracle Loader for Hadoop: Provides an efficient and high-performance loader for fast movement of data from a Hadoop cluster into a table in an Oracle database. Oracle Loader for Hadoop prepartitions the data if necessary and transforms it into a database-ready format. It optionally sorts records by primary key or user-defined columns before loading the data or creating output files. Oracle Loader for Hadoop is a MapReduce application that is invoked as a command-line utility. It accepts the generic command-line options that are supported by the org.apache.hadoop.util.Tool interface.


■ Oracle Data Integrator Application Adapter for Hadoop: Extracts, transforms, and loads data from a Hadoop cluster into tables in an Oracle database, as defined using a graphical user interface.

■ Oracle R Connector for Hadoop: Provides an interface between a local R environment, Oracle Database, and Hadoop, allowing speed-of-thought, interactive analysis on all three platforms. Oracle R Connector for Hadoop is designed to work independently, but if the enterprise data for your analysis is also stored in Oracle Database, then the full power of this connector is achieved when it is used with Oracle R Enterprise. Individual connectors may require that software components be installed in Oracle Database and the Hadoop cluster. Users may also need additional access privileges in Oracle Database.

See Also: My Oracle Support Information Center: Big Data Connectors (ID 1487399.2) and its related information centers.

Big Data Concepts and Technologies Enterprises are seeing large amounts of data coming from multiple sources. Click-stream data in web logs, GPS tracking information, data from retail operations, sensor data, and multimedia streams are just a few examples of vast amounts of data that can be of tremendous value to an enterprise if analyzed. The unstructured and semi-structured information provided by raw data feeds is of little value in and of itself. The data needs to be processed to extract information of real value, which can then be stored and managed in the database. Analytics of this data along with the structured data in the database can provide new insights into the data and lead to substantial business benefits.

What is MapReduce?
MapReduce is a parallel programming model for processing data on a distributed system. It can process vast amounts of data in a timely manner and can scale linearly. It is particularly effective as a mechanism for batch processing of unstructured and semi-structured data. MapReduce abstracts lower-level operations into computations over a set of keys and values.
A simplified definition of a MapReduce job is the successive alternation of two phases, the map phase and the reduce phase. Each map phase applies a transform function over each record in the input data to produce a set of records expressed as key-value pairs. The output from the map phase is input to the reduce phase. In the reduce phase, the map output records are sorted into key-value sets so that all records in a set have the same key value. A reducer function is applied to all the records in a set, and a set of output records is produced as key-value pairs. The map phase is logically run in parallel over each record, while the reduce phase is run in parallel over all key values.

Note: Oracle Big Data Connectors do not support the Yet Another Resource Negotiator (YARN) implementation of MapReduce.
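The following Hadoop Streaming sketch shows the model in miniature by counting words with two small shell scripts as the map and reduce functions. It is illustrative only and is not part of Oracle Big Data Connectors; the streaming JAR location and the HDFS paths are placeholders that vary by distribution.

# mapper.sh (executable, first line #!/bin/bash): emit one "word<TAB>1" record per input word
awk '{for (i = 1; i <= NF; i++) print $i "\t" 1}'

# reducer.sh (executable, first line #!/bin/bash): records arrive grouped by key; sum the counts per word
awk -F'\t' '{count[$1] += $2} END {for (w in count) print w "\t" count[w]}'

# Run the job (the streaming JAR path and HDFS directories are examples):
$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
      -input /user/demo/text -output /user/demo/wordcount \
      -mapper mapper.sh -reducer reducer.sh \
      -file mapper.sh -file reducer.sh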

What is Apache Hadoop?
Apache Hadoop is the software framework for the development and deployment of data processing jobs based on the MapReduce programming model. At the core, Hadoop provides a reliable shared storage and analysis system. Analysis is provided by MapReduce. Storage is provided by the Hadoop Distributed File System (HDFS), a shared storage system designed for MapReduce jobs. The Hadoop ecosystem includes several other projects, such as Apache Avro, a data serialization system that is used by Oracle Loader for Hadoop. Cloudera's Distribution including Apache Hadoop (CDH) is installed on Oracle Big Data Appliance. You can use Oracle Big Data Connectors on a Hadoop cluster running CDH or the equivalent Apache Hadoop components, as described in the setup instructions in this chapter.

See Also:

■ For conceptual information about the Hadoop technologies, the following third-party publication: Hadoop: The Definitive Guide, Third Edition by Tom White (O'Reilly Media Inc., 2012, ISBN: 978-1449311520).

■ For information about Cloudera's Distribution including Apache Hadoop (CDH4), the Oracle Cloudera website at http://oracle.cloudera.com/

■ For information about Apache Hadoop, the website at http://hadoop.apache.org/

Downloading the Oracle Big Data Connectors Software
You can download Oracle Big Data Connectors from Oracle Technology Network or Oracle Software Delivery Cloud.
To download from Oracle Technology Network:
1. Use any browser to visit this website: http://www.oracle.com/technetwork/bdc/big-data-connectors/downloads/index.html
2. Click the name of each connector to download a zip file containing the installation files.
To download from Oracle Software Delivery Cloud:
1. Use any browser to visit this website: https://edelivery.oracle.com/
2. Accept the Terms and Restrictions to see the Media Pack Search page.
3. Select the search terms:
   Select a Product Pack: Oracle Database
   Platform: Linux x86-64
4. Click Go to display a list of product packs.
5. Select Oracle Big Data Connectors Media Pack for Linux x86-64 (B65965-0x), and then click Continue.



6. Click Download for each connector to download a zip file containing the installation files.

Oracle SQL Connector for Hadoop Distributed File System Setup You install and configure Oracle SQL Connector for Hadoop Distributed File System (HDFS) on the system where Oracle Database runs. If Hive tables are used as the data source, then you must also install and run Oracle SQL Connector for HDFS on the Hadoop cluster where Hive is installed. Oracle SQL Connector for HDFS is installed already on Oracle Big Data Appliance if it was configured for Oracle Big Data Connectors. This section contains the following topics:

■ Software Requirements

■ Installing and Configuring a Hadoop Client

■ Installing Oracle SQL Connector for HDFS

■ Providing Support for Hive Tables

■ Granting User Privileges in Oracle Database

■ Setting Up User Accounts on the Oracle Database System

■ Setting Up User Accounts on the Hadoop Cluster

Software Requirements
Oracle SQL Connector for HDFS requires the following software:
On the Hadoop cluster:

■ Cloudera's Distribution including Apache Hadoop version 3 (CDH3) or version 4 (CDH4), or Apache Hadoop 1.0 (formerly 0.20.2).

■ Java Development Kit (JDK) 1.6_08 or later. Consult the distributor of your Hadoop software (Cloudera or Apache) for the recommended version.

■ Hive 0.7.0, 0.8.1, or 0.9.0 (required for Hive table access, otherwise optional)
These components are already installed on Oracle Big Data Appliance.
On the Oracle Database system:

■ Oracle Database release 11g release 2 (11.2.0.2 or 11.2.0.3) for Linux.

■ To support the Oracle Data Pump file format in Oracle Database release 11.2.0.2, an Oracle Database one-off patch. To download this patch, go to http://support.oracle.com and search for bug 14557588. Release 11.2.0.3 and later releases do not require this patch.

■ The same version of Hadoop as your Hadoop cluster, either CDH3, CDH4, or Apache Hadoop 1.0.

■ The same version of JDK as your Hadoop cluster.

Installing and Configuring a Hadoop Client Oracle SQL Connector for HDFS works as a Hadoop client. You must install Hadoop on the Oracle Database system and minimally configure it for Hadoop client use only.


However, you do not need to perform a full configuration of Hadoop on the Oracle Database system to run MapReduce jobs for Oracle SQL Connector for HDFS. To configure the Oracle Database system as a Hadoop client: 1. Install and configure the same version of CDH or Apache Hadoop on the Oracle Database system as on the Hadoop cluster. If you are using Oracle Big Data Appliance, then complete the procedures for providing remote client access in the Oracle Big Data Appliance Software User's Guide. Otherwise, follow the installation instructions provided by the distributor (Cloudera or Apache).

Note: Do not start Hadoop on the Oracle Database system. If it is running, then Oracle SQL Connector for HDFS attempts to use it instead of the Hadoop cluster. Oracle SQL Connector for HDFS just uses the Hadoop JAR files and the configuration files from the Hadoop cluster on the Oracle Database system.

2. Ensure that Oracle Database has access to HDFS: a. Log in to the system where Oracle Database is running by using the Oracle Database account. b. Open a Bash shell and enter this command: hadoop fs -ls /user

You might need to add the directory containing the Hadoop executable file to the PATH environment variable. The default path for CDH is /usr/bin. You should see the same list of directories that you see when you run the hadoop fs command directly on the Hadoop cluster. If not, then first ensure that the Hadoop cluster is up and running. If the problem persists, then you must correct the Hadoop client configuration so that Oracle Database has access to the Hadoop cluster file system. The Oracle Database system is now ready for use as a Hadoop client. No other Hadoop configuration steps are needed.
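For example, a setting such as the following in the Oracle Database owner's shell profile makes the Hadoop client visible; the directory shown is only a typical location and may differ on your system:

# Add the directory that contains the hadoop script to PATH (example location):
$ export PATH=$PATH:/usr/lib/hadoop/bin
# Re-run the check; the listing should match what you see on the cluster:
$ hadoop fs -ls /user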

Installing Oracle SQL Connector for HDFS
Follow this procedure to install Oracle SQL Connector for HDFS.
To install Oracle SQL Connector for HDFS on the Oracle Database system:
1. Download the zip file to a directory on the system where Oracle Database runs.
2. Unzip orahdfs-version.zip into a permanent directory:
$ unzip orahdfs-2.1.0.zip
Archive: orahdfs-2.1.0.zip
   creating: orahdfs-2.1.0/
   creating: orahdfs-2.1.0/log/
   creating: orahdfs-2.1.0/doc/
  inflating: orahdfs-2.1.0/doc/README.txt
   creating: orahdfs-2.1.0/jlib/
  inflating: orahdfs-2.1.0/jlib/ojdbc6.jar
  inflating: orahdfs-2.1.0/jlib/orahdfs.jar
  inflating: orahdfs-2.1.0/jlib/oraclepki.jar
  inflating: orahdfs-2.1.0/jlib/osdt_cert.jar
  inflating: orahdfs-2.1.0/jlib/osdt_core.jar
   creating: orahdfs-2.1.0/bin/
  inflating: orahdfs-2.1.0/bin/hdfs_stream

The unzipped files have the structure shown in Example 1–1. 3. Open the orahdfs-2.1.0/bin/hdfs_stream Bash shell script in a text editor, and make the changes indicated by the comments in the script. The hdfs_stream script does not inherit any environment variable settings, and so they are set in the script if they are needed by Oracle SQL Connector for HDFS:

■ PATH: If the hadoop script is not in /usr/bin:/bin (the path initially set in hdfs_stream), then add the Hadoop bin directory, such as /usr/lib/hadoop/bin.

■ JAVA_HOME: If Hadoop does not detect Java, then set this variable to the Java installation directory. For example, /usr/bin/java.

■ OSCH_HOME: If you moved the script from the orahdfs-version/bin subdirectory, then set this variable to the full path of the orahdfs-2.1.0 directory, which was created in Step 2. Otherwise, Oracle SQL Connector for HDFS detects its location automatically. See the comments in the script for more information about these environment variables. The hdfs_stream script is the preprocessor for the Oracle Database external table created by Oracle SQL Connector for HDFS.

Note: The script does not inherit any environment variables set elsewhere, such as at the command line or in a shell initialization script (.bashrc).
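For orientation, the edited portion of hdfs_stream might end up looking like the following sketch; every value is an example and must be adjusted to match your own installation.

# Example environment settings inside hdfs_stream (values are illustrative):
export PATH=/usr/bin:/bin:/usr/lib/hadoop/bin
export JAVA_HOME=/usr/java/jdk1.6.0_37
# Needed only if hdfs_stream was moved out of the orahdfs-2.1.0/bin directory:
# export OSCH_HOME=/opt/oracle/orahdfs-2.1.0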

4. Run hdfs_stream from the Oracle SQL Connector for HDFS installation directory. You should see this usage information: $ ./hdfs_stream Usage: hdfs_stream locationFile

If you do not see the usage statement, then ensure that the operating system user that Oracle Database is running under has the following permissions:

■ Read and execute permissions on the hdfs_stream script: $ ls -l OSCH_HOME/bin/hdfs_stream -rwxr-xr-x 1 oracle oinstall Nov 27 15:51 hdfs_stream

■ Read permission on orahdfs.jar. $ ls -l OSCH_HOME/jlib/orahdfs.jar -rwxr-xr-x 1 oracle oinstall Nov 27 15:51 orahdfs.jar

If you do not see these permissions, then enter a chmod command to fix them, for example: $ chmod 755 OSCH_HOME/bin/hdfs_stream

In the previous commands, OSCH_HOME represents the Oracle SQL Connector for HDFS home directory.


5. Log in to Oracle Database and create a database directory for the orahdfs-version/bin directory where hdfs_stream resides. In this example, Oracle SQL Connector for HDFS is installed in /etc: SQL> CREATE OR REPLACE DIRECTORY osch_bin_path AS '/etc/orahdfs-2.1.0/bin';

6. If you plan to access only data stored in HDFS and Data Pump format files, then you are done. If you also plan to access Hive tables, then you must also install software on the Hadoop cluster, as described in "Providing Support for Hive Tables" on page 1-8.

Example 1–1 Structure of the orahdfs Directory

orahdfs-version
   bin/
      hdfs_stream
   jlib/
      orahdfs.jar
      osdt_core.jar
      osdt_cert.jar
      oraclepki.jar
      ojdbc6.jar
   log/
   doc/
      README.txt

Figure 1–1 shows the flow of data and the component locations.


Figure 1–1 Oracle SQL Connector for HDFS Installation for HDFS and Data Pump Files

Providing Support for Hive Tables Complete the following procedure on the Hadoop cluster to support access to Hive tables. If you only plan to access HDFS and Data Pump format files, then you can omit this procedure.


Oracle SQL Connector for HDFS should already be installed on Oracle Big Data Appliance. To verify the installation, ensure that /opt/oracle/orahdfs-2.1.0 exists. To support Hive tables using a third-party Hadoop cluster: 1. Download or copy the orahdfs-version.zip file to a directory on a node in the Hadoop cluster where Hive is installed. 2. Unzip orahdfs-version.zip into a directory. 3. Repeat these steps for all nodes in the cluster. 4. Add the Hive JAR files and the Hive conf directory to the HADOOP_CLASSPATH environment variable. To avoid JAR conflicts among the various Hadoop products, Oracle recommends that you set HADOOP_CLASSPATH in the local shell initialization script of Oracle SQL Connector for HDFS users instead of making a global change to HADOOP_CLASSPATH. See "Setting Up User Accounts on the Hadoop Cluster" on page 1-11. Figure 1–2 illustrates the flow of data and the component locations.


Figure 1–2 Oracle SQL Connector for HDFS Installation for Hive Tables

Granting User Privileges in Oracle Database Oracle Database users require these privileges when using Oracle SQL Connector for HDFS to create external tables:

■ CREATE SESSION


■ CREATE TABLE

■ EXECUTE on the UTL_FILE PL/SQL package

■ READ and EXECUTE on the OSCH_BIN_PATH directory created during the installation of Oracle SQL Connector for HDFS. Do not grant write access to anyone. Grant EXECUTE only to those who intend to use Oracle SQL Connector for HDFS.

■ READ and WRITE on a database directory for storing external tables, or the CREATE ANY DIRECTORY system privilege.

■ A tablespace and quota for copying data into the Oracle database. Optional. Example 1–2 shows the SQL commands granting these privileges to HDFSUSER.

Example 1–2 Granting Users Access to Oracle SQL Connector for HDFS

CONNECT / AS sysdba;
CREATE USER hdfsuser IDENTIFIED BY password
   DEFAULT TABLESPACE hdfsdata
   QUOTA UNLIMITED ON hdfsdata;
GRANT CREATE SESSION, CREATE TABLE TO hdfsuser;
GRANT EXECUTE ON sys.utl_file TO hdfsuser;
GRANT READ, EXECUTE ON DIRECTORY osch_bin_path TO hdfsuser;
GRANT READ, WRITE ON DIRECTORY external_table_dir TO hdfsuser;

Note: To query an external table that uses Oracle SQL Connector for HDFS, users only need the SELECT privilege on the table.

Setting Up User Accounts on the Oracle Database System
To create external tables for HDFS and Data Pump format files, users can log in to either the Oracle Database system or a node in the Hadoop cluster. For Hive access, users must log in to the cluster. You can set up an account on the Oracle Database system the same as you would for any other operating system user. HADOOP_CLASSPATH must include path/orahdfs-2.1.0/jlib/*. You can add this setting to the shell profile as part of this installation procedure, or users can set it themselves. The following example alters HADOOP_CLASSPATH in the Bash shell where Oracle SQL Connector for HDFS is installed in /usr/bin:
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/usr/bin/orahdfs-2.1.0/jlib/*"

Setting Up User Accounts on the Hadoop Cluster For Hive access, users must have a home directory and be able to log in to the nodes of a Hadoop cluster. After logging in to a node, users can run Oracle SQL Connector for HDFS to create Oracle Database external tables for all supported data sources. To set up a Hadoop user account for Oracle SQL Connector for HDFS: 1. Create home directories in HDFS and in the local file system. For Oracle Big Data Appliance, see the instructions for creating new HDFS users in the Oracle Big Data Appliance Software User's Guide. For a Hadoop cluster running on commodity hardware, you can follow the same basic steps as you would on Oracle Big Data Appliance. The only difference is that you cannot use dcli to replicate your changes across all nodes.


2. Modify the shell configuration file (such as .bashrc for the Bash shell), to add the necessary JAR files to HADOOP_CLASSPATH: path/orahdfs-2.1.0/jlib/* /usr/lib/hive/lib/* /etc/hive/conf

The following Bash command alters HADOOP_CLASSPATH on Oracle Big Data Appliance:
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/opt/oracle/orahdfs-2.1.0/jlib/*:/usr/lib/hive/lib/*:/etc/hive/conf"

3. Replicate the altered file across all nodes of the cluster. This command replicates the Bash shell configuration file across all nodes in Oracle Big Data Appliance: # dcli cp -f /home/user_name/.bashrc -d /home/user_name

See Also: Oracle Big Data Appliance Owner's Guide for the full syntax of the dcli command.

Oracle Loader for Hadoop Setup Before installing Oracle Loader for Hadoop, verify that you have the required software.

Software Requirements Oracle Loader for Hadoop requires the following software:

■ A target database system running one of the following: – Oracle Database 11g release 2 (11.2.0.2) with required patch – Oracle Database 11g release 2 (11.2.0.3)

Note: To use Oracle Loader for Hadoop with Oracle Database 11g release 2 (11.2.0.2), you must first apply a one-off patch that addresses bug number 11897896. To access this patch, go to http://support.oracle.com and search for the bug number.

■ Cloudera's Distribution including Apache Hadoop version 3 (CDH3) or version 4 (CDH4), or Apache Hadoop 1.0 (formerly 0.20.2).

■ Hive 0.7.0, 0.8.1, 0.9.0, or 0.10.0 if you are loading data from Hive tables.

Installing Oracle Loader for Hadoop Oracle Loader for Hadoop is packaged with the Oracle Database 11g release 2 client libraries and Oracle Instant Client libraries for connecting to Oracle Database 11.2.0.2 or 11.2.0.3. To install Oracle Loader for Hadoop: 1. Unpack the content of oraloader-version.zip into a directory on your Hadoop cluster. This archive contains two archives:

■ oraloader-version-1.x86_64.zip: Use with CDH3 and Apache Hadoop 1.0


■ oraloader-version-2.x86_64.zip: Use with CDH4 2. Unzip the appropriate archive into a directory on your Hadoop cluster. A directory named oraloader-version-n is created with the following subdirectories: doc jlib lib examples

3. Create a variable named OLH_HOME and set it to the installation directory. 4. Add $OLH_HOME/jlib/* to the HADOOP_CLASSPATH variable. 5. Add $KVHOME/lib/kvstore.jar to the HADOOP_CLASSPATH variable, if you are reading data from Oracle NoSQL Database Release 2.
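The following Bash sketch shows one way to carry out steps 3 through 5; the installation path is an example, and the kvstore.jar line applies only when you read from Oracle NoSQL Database Release 2:

# Example settings for Oracle Loader for Hadoop (paths are illustrative):
$ export OLH_HOME=/opt/oracle/oraloader-2.1.0-2
$ export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$OLH_HOME/jlib/*"
# Only when reading from Oracle NoSQL Database Release 2:
$ export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$KVHOME/lib/kvstore.jar"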

Providing Support for Offline Database Mode In a typical installation, Oracle Loader for Hadoop can connect to the Oracle Database system from the Hadoop cluster where the loader is installed. If this connection is impossible—for example, the two systems are located on distinct networks—then you can use Oracle Loader for Hadoop in offline database mode. You must install Oracle Loader for Hadoop on both the database system and on the Hadoop cluster. See "About the Modes of Operation" on page 3-2. To support Oracle Loader for Hadoop in offline database mode: 1. Unpack the content of oraloader-version.zip into a directory on the Oracle Database system. 2. Unzip the same version of the software as you installed on the Hadoop cluster, either for CDH3 or CDH4. 3. Create a variable named OLH_HOME and set it to the installation directory. This example uses the Bash shell syntax: $ export OLH_HOME="/usr/bin/oraloader-2.1.0-h2/"

4. Add the Oracle Loader for Hadoop JAR files to the CLASSPATH environment variable. This example uses the Bash shell syntax: $ export CLASSPATH=$CLASSPATH:$OLH_HOME/jlib/*

Oracle Data Integrator Application Adapter for Hadoop Setup Installation requirements for Oracle Data Integrator (ODI) Application Adapter for Hadoop are provided in these topics:

■ System Requirements and Certifications

■ Technology-Specific Requirements

■ Location of Oracle Data Integrator Application Adapter for Hadoop

■ Setting Up the Topology


System Requirements and Certifications
To use Oracle Data Integrator Application Adapter for Hadoop, you must first have Oracle Data Integrator, which is licensed separately from Oracle Big Data Connectors. You can download ODI from the Oracle website at
http://www.oracle.com/technetwork/middleware/data-integrator/downloads/index.html
Oracle Data Integrator Application Adapter for Hadoop requires Oracle Data Integrator 11.1.1.6.0 or later. Before performing any installation, read the system requirements and certification documentation to ensure that your environment meets the minimum installation requirements for the products that you are installing. The list of supported platforms and versions is available on Oracle Technology Network:
http://www.oracle.com/technetwork/middleware/data-integrator/overview/index.html

Technology-Specific Requirements
The list of supported technologies and versions is available on Oracle Technology Network:
http://www.oracle.com/technetwork/middleware/data-integrator/overview/index.html

Location of Oracle Data Integrator Application Adapter for Hadoop Oracle Data Integrator Application Adapter for Hadoop is available in the xml-reference directory of the Oracle Data Integrator Companion CD.

Setting Up the Topology To set up the topology, see Chapter 4, "Oracle Data Integrator Application Adapter for Hadoop."

Oracle R Connector for Hadoop Setup Oracle R Connector for Hadoop requires the installation of a software environment on the Hadoop side and on a client Linux system.

Installing the Software on Hadoop
Oracle Big Data Appliance supports Oracle R Connector for Hadoop without any additional software installation or configuration. However, you do need to verify whether certain R packages are installed. See "Installing Additional R Packages" on page 1-17. To use Oracle R Connector for Hadoop on any other Hadoop cluster, you must create the necessary environment yourself.

Software Requirements for a Third-Party Hadoop Cluster You must install several software components on a third-party Hadoop cluster to support Oracle R Connector for Hadoop.


Install these components on third-party servers:

■ Cloudera's Distribution including Apache Hadoop version 4 (CDH4) or Apache Hadoop 2.0.0, using MapReduce 1. Complete the instructions provided by the distributor.

■ Hive 0.7.1 or 0.9.0 See "Installing Hive on a Hadoop Cluster" on page 1-16.

■ Sqoop for the execution of functions that connect to Oracle Database. Sqoop is not required to install or load Oracle R Connector for Hadoop itself. See "Installing Sqoop on a Hadoop Cluster" on page 1-15.

■ Mahout for the execution of the ALS-WS algorithm (orch_lmf_mahout_als.R).

■ Java Virtual Machine (JVM), preferably Java HotSpot Virtual Machine 6. Complete the instructions provided at the download site at http://www.oracle.com/technetwork/java/javase/downloads/index.html

■ Oracle R Distribution 2.15.1 with all base libraries on all nodes in the Hadoop cluster. See "Installing R on a Hadoop Cluster" on page 1-16.

■ The ORCH package on each R engine, which must exist on every node of the Hadoop cluster. See "Installing the ORCH Package on a Hadoop Cluster" on page 1-17.

■ Oracle Loader for Hadoop to support the OLH driver (optional). See "Oracle Loader for Hadoop Setup" on page 1-12.

Note: Do not set HADOOP_HOME on the Hadoop cluster. CDH4 does not need it, and it interferes with Oracle R Connector for Hadoop when it checks the status of the JobTracker. This results in the error "Something is terribly wrong with Hadoop MapReduce." If you must set HADOOP_HOME for another application, then also set HADOOP_LIBEXEC_DIR in the /etc/bashrc file. For example: export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec

Installing Sqoop on a Hadoop Cluster Sqoop provides a SQL-like interface to Hadoop, which is a Java-based environment. Oracle R Connector for Hadoop uses Sqoop for access to Oracle Database.

Note: Sqoop is required even when using Oracle Loader for Hadoop as a driver for loading data into Oracle Database. Sqoop performs additional functions, such as copying data from a database to HDFS and sending free-form queries to a database. The OLH driver also uses Sqoop to perform operations that Oracle Loader for Hadoop does not support.

To install and configure Sqoop for use with Oracle Database: 1. Install Sqoop if it is not already installed on the server.


For Cloudera's Distribution including Apache Hadoop, see the Sqoop installation instructions in the CDH Installation Guide at http://oracle.cloudera.com/
2. Download the appropriate Java Database Connectivity (JDBC) driver for Oracle Database from Oracle Technology Network at
http://www.oracle.com/technetwork/database/features/jdbc/index-091264.html
3. Copy the driver JAR file to $SQOOP_HOME/lib, which is a directory such as /usr/lib/sqoop/lib.
4. Provide Sqoop with the connection string to Oracle Database.
$ sqoop import --connect jdbc_connection_string

For example, sqoop import --connect jdbc:oracle:thin:@myhost:1521/orcl.
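A fuller sketch follows; the host, credentials, table, and HDFS directory are all placeholders:

# Copy an Oracle table into HDFS with Sqoop (all values are placeholders):
$ sqoop import \
    --connect jdbc:oracle:thin:@//dbhost.example.com:1521/orcl \
    --username SCOTT -P \
    --table EMP \
    --target-dir /user/rquser/emp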

Installing Hive on a Hadoop Cluster
Hive provides an alternative storage and retrieval mechanism to HDFS files through a querying language called HiveQL. Oracle R Connector for Hadoop uses the data preparation and analysis features of HiveQL, while enabling you to use R language constructs.
To install Hive:
1. Follow the instructions provided by the distributor (Cloudera or Apache) for installing Hive.
2. Verify that the installation is working correctly:
$ hive -H
usage: hive
 -d,--define     Variable subsitution to apply to hive commands. e.g. -d A=B or --define A=B
 --database      Specify the database to use
 -e              SQL from command line
 -f              SQL from files
 -H,--help       Print help information
 -h              connecting to Hive Server on remote host
 --hiveconf      Use value for given property
 --hivevar       Variable subsitution to apply to hive commands. e.g. --hivevar A=B
 -i              Initialization SQL file
 -p              connecting to Hive Server on port number
 -S,--silent     Silent mode in interactive shell
 -v,--verbose    Verbose mode (echo executed SQL to the console)

3. If the command fails or you see warnings in the output, then fix the Hive installation.

Installing R on a Hadoop Cluster
You can download Oracle R Distribution 2.15.1 and get the installation instructions from the website at
http://www.oracle.com/technetwork/indexes/downloads/r-distribution-1532464.html


Alternatively, you can download R from a Comprehensive R Archive Network (CRAN) website at http://www.r-project.org

Installing the ORCH Package on a Hadoop Cluster
ORCH is the name of the Oracle R Connector for Hadoop package.
To install the ORCH package:
1. Set the environment variables for the supporting software:
$ export JAVA_HOME="/usr/lib/jdk6"
$ export R_HOME="/usr/lib64/R"
$ export SQOOP_HOME="/usr/lib/sqoop"

2. Unzip the downloaded file: $ unzip orch-version.zip Archive: orch-linux-x86_64-2.1.0.zip extracting: ORCH2.1.0/ORCH_2.1.0_R_x86_64-unknown-linux-gnu.tar.gz inflating: ORCH2.1.0/ORCHcore_2.1.0_R_x86_64-unknown-linux-gnu.tar.gz inflating: ORCH2.1.0/OREbase_1.3.1_R_x86_64-unknown-linux-gnu.tar.gz inflating: ORCH2.1.0/OREstats_1.3.1_R_x86_64-unknown-linux-gnu.tar.gz

3. Change to the new directory: $ cd ORCH2.1.0

4. Install the packages in the exact order shown here: $ R --vanilla CMD INSTALL OREbase_1.3.1_R_x86_64-unknown-linux-gnu.tar.gz $ R --vanilla CMD INSTALL OREstats_1.3.1_R_x86_64-unknown-linux-gnu.tar.gz $ R --vanilla CMD INSTALL ORCHcore_2.1.0_R_x86_64-unknown-linux-gnu.tar.gz $ R --vanilla CMD INSTALL ORCH_2.1.0_R_x86_64-unknown-linux-gnu.tar.gz $ R --vanilla CMD INSTALL ORCHstats_2.1.0_R_x86_64-unknown-linux-gnu.tar.gz

Installing Additional R Packages Your Hadoop cluster must have libpng-devel installed on every node. If you are using a cluster running on commodity hardware, then you can follow the same basic procedures. However, you cannot use the dcli utility to replicate the commands across all nodes. See the Oracle Big Data Appliance Owner's Guide for the syntax of the dcli utility. To install libpng-devel: 1. Log in as root to any node in your Hadoop cluster. 2. Check whether libpng-devel is already installed: # dcli rpm -qi libpng-devel bda1node01: package libpng-devel is not installed bda1node02: package libpng-devel is not installed . . .

If the package is already installed on all servers, then you can skip this procedure. 3. If you need a proxy server to go outside a firewall, then set the HTTP_PROXY environment variable:


# dcli export HTTP_PROXY="http://proxy.example.com"

4. Change to the yum directory: # cd /etc/yum.repos.d

5. Download and configure the appropriate configuration file for your version of Linux: For Enterprise Linux 5 (EL5): a. Download the yum configuration file: # wget http://public-yum.oracle.com/public-yum-el5.repo

b. Open public-yum-el5.repo in a text editor and make these changes: Under el5_latest, set enabled=1 Under el5_addons, set enabled=1 c. Save your changes and exit. d. Copy the file to the other Oracle Big Data Appliance servers: # dcli -d /etc/yum.repos.d -f public-yum-el5.repo

For Oracle Linux 6 (OL6): a. Download the yum configuration file: # wget http://public-yum.oracle.com/public-yum-ol6.repo

b. Open public-yum-ol6.repo in a text editor and make these changes: Under ol6_latest, set enabled=1 Under ol6_addons, set enabled=1 c. Save your changes and exit. d. Copy the file to the other Oracle Big Data Appliance servers: # dcli -d /etc/yum.repos.d -f public-yum-ol6.repo

6. Install the package on all servers: # dcli yum -y install libpng-devel bda1node01: Loaded plugins: rhnplugin, security bda1node01: Repository 'bda' is missing name in configuration, using id bda1node01: This system is not registered with ULN. bda1node01: ULN support will be disabled. bda1node01: http://bda1node01-master.us.oracle.com/bda/repodata/repomd.xml: bda1node01: [Errno 14] HTTP Error 502: notresolvable bda1node01: Trying other mirror. . . . bda1node01: Running Transaction bda1node01: Installing : libpng-devel 1/2 bda1node01: Installing : libpng-devel 2/2

bda1node01: Installed: bda1node01: libpng-devel.i386 2:1.2.10-17.el5_8 libpng-devel.x86_64 2:1.2.10-17.el5_8


bda1node01: Complete! bda1node02: Loaded plugins: rhnplugin, security . . .

7. Verify that the installation was successful on all servers: # dcli rpm -qi libpng-devel bda1node01: Name : libpng-devel Relocations: (not relocatable) bda1node01: Version : 1.2.10 Vendor: Oracle America bda1node01: Release : 17.el5_8 Build Date: Wed 25 Apr 2012 06:51:15 AM PDT bda1node01: Install Date: Tue 05 Feb 2013 11:41:14 AM PST Build Host: ca-build56.us.oracle.com bda1node01: Group : Development/Libraries Source RPM: libpng-1.2.10-17.el5_8.src.rpm bda1node01: Size : 482483 License: zlib bda1node01: Signature : DSA/SHA1, Wed 25 Apr 2012 06:51:41 AM PDT, Key ID 66ced3de1e5e0159 bda1node01: URL : http://www.libpng.org/pub/png/ bda1node01: Summary : Development tools for programs to manipulate PNG image format files. bda1node01: Description : bda1node01: The libpng-devel package contains the header files and static bda1node01: libraries necessary for developing programs using the PNG (Portable bda1node01: Network Graphics) library. . . .

Providing Remote Client Access to R Users
Although R users run their programs as MapReduce jobs on the Hadoop cluster, they typically do not have individual accounts on that platform. Instead, an external Linux server provides remote access.

Software Requirements for Remote Client Access
To provide R users with access to a Hadoop cluster, install these components on a Linux server:

■ The same version of Hadoop as your Hadoop cluster; otherwise, unexpected issues and failures can occur

■ The same version of Sqoop as your Hadoop cluster; required only to support copying data in and out of Oracle databases

■ Mahout; required only for the orch.ls function with the Mahout ALS-WS algorithm

■ The same version of the Java Development Kit (JDK) as your Hadoop cluster

■ R distribution 2.15.1 with all base libraries

■ ORCH R package To provide access to database objects, you must have the Oracle Advanced Analytics option to Oracle Database. Then you can install this additional component on the Hadoop client:


■ Oracle R Enterprise Client Packages

Configuring the Server as a Hadoop Client
You must install Hadoop on the client and minimally configure it for HDFS client use.
To install and configure Hadoop on the client system:
1. Install and configure the same version of CDH or Apache Hadoop on the client system as on the Hadoop cluster. This system can be the host for Oracle Database. If you are using Oracle Big Data Appliance, then complete the procedures for providing remote client access in the Oracle Big Data Appliance Software User's Guide. Otherwise, follow the installation instructions provided by the distributor (Cloudera or Apache).
2. Log in to the client system as an R user.
3. Open a Bash shell and enter this Hadoop file system command:
$HADOOP_HOME/bin/hadoop fs -ls /user

4. If you see a list of files, then you are done. If not, then ensure that the Hadoop cluster is up and running. If that does not fix the problem, then you must debug your client Hadoop installation.

Installing Sqoop on a Hadoop Client Complete the same procedures on the client system for installing and configuring Sqoop as those provided in "Installing Sqoop on a Hadoop Cluster" on page 1-15.

Installing R on a Hadoop Client
You can download Oracle R Distribution 2.15.1 and get the installation instructions from the Oracle R Distribution website at
http://oss.oracle.com/ORD/
When you are done, ensure that users have the necessary permissions to connect to the Linux server and run R. You may also want to install RStudio Server to facilitate access by R users. See the RStudio website at
http://rstudio.org/

Installing the ORCH Package on a Hadoop Client Complete the procedures on the client system for installing ORCH as described in "Installing the Software on Hadoop" on page 1-14.

Installing the Oracle R Enterprise Client Packages (Optional) To support full access to Oracle Database using R, install the Oracle R Enterprise Release 1.3.1 or later client packages. Without them, Oracle R Connector for Hadoop does not have access to the advanced statistical algorithms provided by Oracle R Enterprise.

See Also: Oracle R Enterprise User's Guide for information about installing R and Oracle R Enterprise


2 Oracle SQL Connector for Hadoop Distributed File System

This chapter describes how to use Oracle SQL Connector for Hadoop Distributed File System (HDFS) to facilitate data access between Hadoop and Oracle Database. This chapter contains the following sections:

■ About Oracle SQL Connector for HDFS

■ Getting Started With Oracle SQL Connector for HDFS

■ Configuring Your System for Oracle SQL Connector for HDFS

■ Using the ExternalTable Command-Line Tool

■ Creating External Tables

■ Publishing the HDFS Data Paths

■ Listing Location File Metadata and Contents

■ Describing External Tables

■ More About External Tables Generated by the ExternalTable Tool

■ Configuring Oracle SQL Connector for HDFS

■ Performance Tips for Querying Data in HDFS

About Oracle SQL Connector for HDFS Using Oracle SQL Connector for HDFS, you can use Oracle Database to access and analyze data residing in Hadoop in these formats:

■ Data Pump files in HDFS

■ Delimited text files in HDFS

■ Hive tables

For other file formats, such as JSON files, you can stage the input in Hive tables before using Oracle SQL Connector for HDFS.

Oracle SQL Connector for HDFS uses external tables to provide Oracle Database with read access to Hive tables, and to delimited text files and Data Pump files in HDFS. An external table is an Oracle Database object that identifies the location of data outside of a database. Oracle Database accesses the data by using the metadata provided when the external table was created. By querying the external tables, you can access data stored in HDFS and Hive tables as if that data were stored in tables in an Oracle database.


To create an external table for this purpose, you use the ExternalTable command-line tool provided with Oracle SQL Connector for HDFS. You provide ExternalTable with information about the data source in Hadoop and about your schema in an Oracle Database. You provide this information either as parameters to the ExternalTable command or in an XML file. When the external table is ready, you can query the data the same as any other database table. You can query and join data in HDFS or a Hive table with other database-resident data. You can also perform bulk loads of data into Oracle database tables using SQL. You may prefer that the data resides in an Oracle database—all of it or just a selection—if it is queried routinely.

Getting Started With Oracle SQL Connector for HDFS The following list identifies the basic steps that you take when using Oracle SQL Connector for HDFS.

1. The first time you use Oracle SQL Connector for HDFS, ensure that the software is installed and configured. See "Configuring Your System for Oracle SQL Connector for HDFS" on page 2-5.

2. Log in to the appropriate system, either the Oracle Database system or a node in the Hadoop cluster. See "Configuring Your System for Oracle SQL Connector for HDFS" on page 2-5.

3. Create an XML document describing the connections and the data source, unless you are providing these parameters in the ExternalTable command. See "Describing External Tables" on page 2-16.

4. Create a shell script containing an ExternalTable command. See "Using the ExternalTable Command-Line Tool" on page 2-6.

5. Run the shell script.

6. If the job fails, then use the diagnostic messages in the output to identify and correct the error. Depending on how far the job progressed before failing, you may need to delete the table definition from the Oracle database before rerunning the script.

7. After the job succeeds, connect to Oracle Database as the owner of the external table. Query the table to ensure that the data is accessible.

8. If the data will be queried frequently, then you may want to load it into a database table. External tables do not have indexes or partitions.

Example 2–1 illustrates these steps.

Example 2–1 Accessing HDFS Data Files from Oracle Database

$ cat moviefact_hdfs.sh
# Add environment variables
export OSCH_HOME="/opt/oracle/orahdfs-2.1.0"

hadoop jar $OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
-conf /home/jdoe/movie/moviefact_hdfs.xml \
-createTable


$ cat moviefact_hdfs.xml
<?xml version="1.0"?>
<configuration>
  <property><name>oracle.hadoop.exttab.tableName</name><value>MOVIE_FACT_EXT_TAB_TXT</value></property>
  <property><name>oracle.hadoop.exttab.locationFileCount</name><value>4</value></property>
  <property><name>oracle.hadoop.exttab.dataPaths</name><value>/user/jdoe/moviework/data/part*</value></property>
  <property><name>oracle.hadoop.exttab.fieldTerminator</name><value>\u0009</value></property>
  <property><name>oracle.hadoop.exttab.defaultDirectory</name><value>MOVIE_DIR</value></property>
  <property><name>oracle.hadoop.exttab.columnNames</name><value>CUST_ID,MOVIE_ID,GENRE_ID,TIME_ID,RECOMMENDED,ACTIVITY_ID,RATING,SALES</value></property>
  <property><name>oracle.hadoop.exttab.sourceType</name><value>text</value></property>
  <property><name>oracle.hadoop.connection.url</name><value>jdbc:oracle:thin:@//dbhost:1521/orcl.example.com</value></property>
  <property><name>oracle.hadoop.connection.user</name><value>MOVIEDEMO</value></property>
</configuration>

$ sh moviefact_hdfs.sh
Oracle SQL Connector for HDFS Release 2.1.0 - Production

Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.

[Enter Database Password:] password

The create table command succeeded.

CREATE TABLE "MOVIEDEMO"."MOVIE_FACT_EXT_TAB_TXT" ( "CUST_ID" VARCHAR2(4000), "MOVIE_ID" VARCHAR2(4000), "GENRE_ID" VARCHAR2(4000), "TIME_ID" VARCHAR2(4000), "RECOMMENDED" VARCHAR2(4000), "ACTIVITY_ID" VARCHAR2(4000), "RATING" VARCHAR2(4000), "SALES" VARCHAR2(4000)

Oracle SQL Connector for Hadoop Distributed File System 2-3 Getting Started With Oracle SQL Connector for HDFS

) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY "MOVIE_DIR" ACCESS PARAMETERS ( RECORDS DELIMITED BY 0X'0A' disable_directory_link_check CHARACTERSET AL32UTF8 STRING SIZES ARE IN CHARACTERS PREPROCESSOR "OSCH_BIN_PATH":'hdfs_stream' FIELDS TERMINATED BY 0X'2C' MISSING FIELD VALUES ARE NULL ( "CUST_ID" CHAR, "MOVIE_ID" CHAR, "GENRE_ID" CHAR, "TIME_ID" CHAR, "RECOMMENDED" CHAR, "ACTIVITY_ID" CHAR, "RATINGS" CHAR, "SALES" CHAR ) ) LOCATION ( 'osch-20130314092801-1513-1', 'osch-20130314092801-1513-2', 'osch-20130314092801-1513-3', 'osch-20130314092801-1513-4' ) ) PARALLEL REJECT LIMIT UNLIMITED;

The following location files were created.

osch-20130314092801-1513-1 contains 1 URI, 12754882 bytes

12754882 hdfs://bda-ns/user/jdoe/moviework/data/part-00001

osch-20130314092801-1513-2 contains 1 URI, 438 bytes

438 hdfs://bda-ns/user/jdoe/moviework/data/part-00002

osch-20130314092801-1513-3 contains 1 URI, 432 bytes

432 hdfs://bda-ns/user/jdoe/moviework/data/part-00003

osch-20130314092801-1513-4 contains 1 URI, 202 bytes

202 hdfs://bda-ns/user/jdoe/moviework/data/part-00004

SQL*Plus: Release 11.2.0.3.0 Production on Thu Mar 14 14:14:31 2013

Copyright (c) 1982, 2011, Oracle. All rights reserved.

Enter password: password

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production


With the Partitioning, OLAP and Data Mining options

SQL> SELECT cust_id, movie_id, time_id FROM movie_fact_ext_tab_txt
  2  WHERE rownum < 5;

CUST_ID      MOVIE_ID     TIME_ID
------------ ------------ ---------
1150211      585          01-JAN-11
1221463      9870         01-JAN-11
1002672      1422         01-JAN-11
1095718      544          01-JAN-11

SQL> DESCRIBE movie_fact_ext_tab_txt
 Name                        Null?    Type
 --------------------------- -------- ----------------
 CUST_ID                              VARCHAR2(4000)
 MOVIE_ID                             VARCHAR2(4000)
 GENRE_ID                             VARCHAR2(4000)
 TIME_ID                              VARCHAR2(4000)
 RECOMMENDED                          VARCHAR2(4000)
 ACTIVITY_ID                          VARCHAR2(4000)
 RATINGS                              VARCHAR2(4000)
 SALES                                VARCHAR2(4000)

SQL> CREATE TABLE movie_facts AS
  2  SELECT CAST(cust_id AS VARCHAR2(12)) cust_id,
  3     CAST(movie_id AS VARCHAR2(12)) movie_id,
  4     CAST(genre_id AS VARCHAR(3)) genre_id,
  5     TO_TIMESTAMP(time_id,'YYYY-MM-DD-HH24:MI:SS:FF') time,
  6     TO_NUMBER(recommended) recommended,
  7     TO_NUMBER(activity_id) activity_id,
  8     TO_NUMBER(ratings) ratings,
  9     TO_NUMBER(sales) sales
 10  FROM movie_fact_ext_tab_txt;

SQL> DESCRIBE movie_facts
 Name                        Null?    Type
 --------------------------- -------- ----------------
 CUST_ID                              VARCHAR2(12)
 MOVIE_ID                             VARCHAR2(12)
 GENRE_ID                             VARCHAR2(3)
 TIME                                 TIMESTAMP(9)
 RECOMMENDED                          NUMBER
 ACTIVITY                             NUMBER
 RATING                               NUMBER
 SALES                                NUMBER

Configuring Your System for Oracle SQL Connector for HDFS You can run Oracle SQL Connector for HDFS on either the Oracle Database system or the Hadoop cluster:

■ For Hive sources, you must log in to a node in the Hadoop cluster.

■ For text and Data Pump format files, you can log in to either the Oracle Database system or a node in the Hadoop cluster.

Oracle SQL Connector for HDFS requires additions to the HADOOP_CLASSPATH environment variable on the system where you log in. Your system administrator may have set them up for you when creating your account, or may have left that task for you. See "Setting Up User Accounts on the Oracle Database System" on page 1-11 and "Setting Up User Accounts on the Hadoop Cluster" on page 1-11.

Setting up the environment variables:

■ Verify that HADOOP_CLASSPATH includes the path to the JAR files for Oracle SQL Connector for HDFS: path/orahdfs-2.1.0/jlib/*

■ If you are logged in to a Hadoop cluster with Hive data sources, then verify that HADOOP_CLASSPATH also includes the Hive JAR files and conf directory. For example: /usr/lib/hive/lib/* /etc/hive/conf

■ For your convenience, you can create an OSCH_HOME environment variable. The following is the Bash command for setting it on Oracle Big Data Appliance: $ export OSCH_HOME="/opt/oracle/orahdfs-2.1.0"
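For example, on a Hadoop cluster node with Hive data sources, the two variables might be set together in a Bash shell as follows. The paths shown are the Oracle Big Data Appliance defaults used elsewhere in this chapter and may differ on your system:

$ export OSCH_HOME="/opt/oracle/orahdfs-2.1.0"
$ export HADOOP_CLASSPATH="$OSCH_HOME/jlib/*:/usr/lib/hive/lib/*:/etc/hive/conf:$HADOOP_CLASSPATH"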

See Also: "Oracle SQL Connector for Hadoop Distributed File System Setup" on page 1-4 for instructions for installing the software and setting up user accounts on both systems. OSCH_HOME/doc/README.txt for information about known problems with Oracle SQL Connector for HDFS.

Using the ExternalTable Command-Line Tool Oracle SQL Connector for HDFS provides a command-line tool named ExternalTable. This section describes the basic use of this tool. See "Creating External Tables" on page 2-8 for the command syntax that is specific to your data source format.

About ExternalTable The ExternalTable tool uses the values of several properties to do the following tasks:

■ Create an external table

■ Populate the location files

■ Publish location files to an existing external table

■ List the location files

■ Describe an external table

You can specify these property values in an XML document or individually on the command line. See "Configuring Oracle SQL Connector for HDFS" on page 2-18.

ExternalTable Command-Line Tool Syntax This is the full syntax of the ExternalTable command-line tool:

hadoop jar OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
[-conf config_file]... \
[-D property=value]... \
  -createTable [--noexecute]
| -publish [--noexecute]
| -listLocations [--details]
| -getDDL

You can either create the OSCH_HOME environment variable or replace OSCH_HOME in the command syntax with the full path to the installation directory for Oracle SQL Connector for HDFS. On Oracle Big Data Appliance, this directory is: /opt/oracle/orahdfs-version

For example, you might run the ExternalTable command-line tool with a command like this:

hadoop jar /opt/oracle/orahdfs-2.1.0/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
     .
     .
     .

Parameter Descriptions

-conf config_file Identifies the name of an XML configuration file containing properties needed by the command being executed. See "Configuring Oracle SQL Connector for HDFS" on page 2-18.

-D property=value Assigns a value to a specific property.

-createTable [--noexecute] Creates an external table definition and publishes the data URIs to the location files of the external table. The output report shows the DDL used to create the external table and lists the contents of the location files. Use the --noexecute option to see the execution plan of the command. The operation is not executed, but the report includes the details of the execution plan and any errors. Oracle recommends that you first execute a -createTable command with --noexecute.
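For example, using the configuration file from Example 2–1 purely for illustration, a dry run of the table creation looks like this:

hadoop jar $OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
-conf /home/jdoe/movie/moviefact_hdfs.xml \
-createTable --noexecute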

-publish [--noexecute] Publishes the data URIs to the location files of an existing external table. Use this option after adding new data files, so that the existing external table can access them. Use the --noexecute option to see the execution plan of the command. The operation is not executed, but the report shows the planned SQL ALTER TABLE command and location files. The report also shows any errors. Oracle recommends that you first execute a -publish command with --noexecute. See "Publishing the HDFS Data Paths" on page 2-15.

-listLocations [--details] Shows the location file content as text. With the --details option, this command provides a detailed listing. See "What Are Location Files?" on page 2-17.

-getDDL Prints the table definition of an existing external table. See "Describing External Tables" on page 2-16.

See Also: "Syntax Conventions" on page x


Creating External Tables You can create external tables automatically using the ExternalTable tool provided in Oracle SQL Connector for HDFS.

Creating External Tables with the ExternalTable Tool To create an external table using the ExternalTable tool, follow the instructions for your data source:

■ Creating External Tables from Data Pump Format Files

■ Creating External Tables from Hive Tables

■ Creating External Tables from Delimited Text Files

When the ExternalTable -createTable command finishes executing, the external table is ready for use. To create external tables manually, follow the instructions in "Creating External Tables in SQL" on page 2-15.

ExternalTable Syntax for -createTable Use the following syntax to create an external table and populate its location files:

hadoop jar OSCH_HOME/jlib/orahdfs.jar oracle.hadoop.exttab.ExternalTable \
[-conf config_file]... \
[-D property=value]... \
-createTable [--noexecute]

See Also: "ExternalTable Command-Line Tool Syntax" on page 2-6

Creating External Tables from Data Pump Format Files Oracle SQL Connector for HDFS supports only Data Pump files produced by Oracle Loader for Hadoop, and does not support generic Data Pump files produced by Oracle Utilities. Oracle SQL Connector for HDFS creates the external table definition for Data Pump files by using the metadata from the Data Pump file header. It uses the ORACLE_LOADER access driver with the preprocessor access parameter. It also uses a special access parameter named EXTERNAL VARIABLE DATA, which enables ORACLE_LOADER to read the Data Pump format files generated by Oracle Loader for Hadoop.

Note: Oracle SQL Connector for HDFS requires a patch to Oracle Database 11.2.0.2 before the connector can access Data Pump files produced by Oracle Loader for Hadoop. To download this patch, go to http://support.oracle.com and search for bug 14557588. Release 11.2.0.3 and later releases do not require this patch.

Required Properties These properties are required:

■ oracle.hadoop.exttab.tableName

■ oracle.hadoop.exttab.defaultDirectory

■ oracle.hadoop.exttab.dataPaths


■ oracle.hadoop.exttab.sourceType=datapump

■ oracle.hadoop.connection.url

■ oracle.hadoop.connection.user See "Configuring Oracle SQL Connector for HDFS" on page 2-18 for descriptions of the properties used for this data source.

Optional Properties This property is optional:

■ oracle.hadoop.exttab.logDirectory

Defining Properties in XML Files for Data Pump Format Files Example 2–2 is an XML template containing all the properties that can be used to describe a Data Pump file. To use the template, cut and paste it into a text file, enter the appropriate values to describe your Data Pump file, and delete any optional properties that you do not need. For more information about using XML templates, see "Creating a Configuration File" on page 2-18.

Example 2–2 XML Template with Properties for a Data Pump Format File

<?xml version="1.0"?>
<configuration>
  <!-- Required properties -->
  <property><name>oracle.hadoop.exttab.tableName</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.defaultDirectory</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.dataPaths</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.sourceType</name><value>datapump</value></property>
  <property><name>oracle.hadoop.connection.url</name><value>value</value></property>
  <property><name>oracle.hadoop.connection.user</name><value>value</value></property>

  <!-- Optional properties -->
  <property><name>oracle.hadoop.exttab.logDirectory</name><value>value</value></property>
</configuration>


Example Example 2–3 creates an external table named SALES_DP_XTAB to read Data Pump files.

Example 2–3 Defining an External Table for Data Pump Format Files Log in as the operating system user that Oracle Database runs under (typically the oracle user), and create a file-system directory:

$ mkdir /scratch/sales_dp_dir

Create a database directory and grant read and write access to it:

$ sqlplus / as sysdba
SQL> CREATE OR REPLACE DIRECTORY sales_dp_dir AS '/scratch/sales_dp_dir';
SQL> GRANT READ, WRITE ON DIRECTORY sales_dp_dir TO scott;

Create the external table:

hadoop jar OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
-D oracle.hadoop.exttab.tableName=SALES_DP_XTAB \
-D oracle.hadoop.exttab.sourceType=datapump \
-D oracle.hadoop.exttab.dataPaths=hdfs:///user/scott/olh_sales_dpoutput/ \
-D oracle.hadoop.exttab.defaultDirectory=SALES_DP_DIR \
-D oracle.hadoop.connection.url=jdbc:oracle:thin:@//myhost:1521/myservicename \
-D oracle.hadoop.connection.user=SCOTT \
-createTable

Creating External Tables from Hive Tables Oracle SQL Connector for HDFS creates the external table definition from a Hive table by contacting the Hive metastore client to retrieve information about the table columns and the location of the table data. In addition, the Hive table data paths are published to the location files of the Oracle external table.

To read Hive table metadata, Oracle SQL Connector for HDFS requires that the Hive JAR files are included in the HADOOP_CLASSPATH variable. This means that Oracle SQL Connector for HDFS must be installed and running on a computer with a correctly functioning Hive client, and that you add the Hive configuration directory to the HADOOP_CLASSPATH environment variable.

For Hive managed tables, the data paths come from the warehouse directory. For Hive external tables, the data paths from an external location in HDFS are published to the location files of the Oracle external table. Hive external tables can have no data, because Hive does not check if the external location is defined when the table is created. If the Hive table is empty, then one location file is published with just a header and no data URIs.

The Oracle external table is not a "live" Hive table. When changes are made to a Hive table, you must use the ExternalTable tool to either republish the data or create a new external table.

Hive Table Requirements Oracle SQL Connector for HDFS supports non-partitioned Hive tables that are defined using ROW FORMAT DELIMITED and FILE FORMAT TEXTFILE clauses. Both Hive-managed tables and Hive external tables are supported.


Hive tables can be either bucketed or not bucketed. All primitive column types from Hive 0.7.1 (CDH3), plus the TIMESTAMP type, are supported.

Required Properties These properties are required for Hive table sources:

■ oracle.hadoop.exttab.tableName

■ oracle.hadoop.exttab.defaultDirectory

■ oracle.hadoop.exttab.sourceType=hive

■ oracle.hadoop.exttab.hive.tableName

■ oracle.hadoop.exttab.hive.databaseName

■ oracle.hadoop.connection.url

■ oracle.hadoop.connection.user See "Configuring Oracle SQL Connector for HDFS" on page 2-18 for descriptions of the properties used for this data source.

Optional Properties This property is optional for Hive table sources:

■ oracle.hadoop.exttab.locationFileCount

Defining Properties in XML Files for Hive Tables Example 2–4 is an XML template containing all the properties that can be used to describe a Hive table. To use the template, cut and paste it into a text file, enter the appropriate values to describe your Hive table, and delete any optional properties that you do not need. For more information about using XML templates, see "Creating a Configuration File" on page 2-18.

Example 2–4 XML Template with Properties for a Hive Table

<?xml version="1.0"?>
<configuration>
  <!-- Required properties -->
  <property><name>oracle.hadoop.exttab.tableName</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.defaultDirectory</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.sourceType</name><value>hive</value></property>
  <property><name>oracle.hadoop.exttab.hive.tableName</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.hive.databaseName</name><value>value</value></property>
  <property><name>oracle.hadoop.connection.url</name><value>value</value></property>
  <property><name>oracle.hadoop.connection.user</name><value>value</value></property>

  <!-- Optional properties -->
  <property><name>oracle.hadoop.exttab.locationFileCount</name><value>value</value></property>
</configuration>

Example Example 2–5 creates an external table named SALES_HIVE_XTAB to read data from a Hive table. The example defines all the properties on the command line instead of in an XML file.

Example 2–5 Defining an External Table for a Hive Table Log in as the operating system user that Oracle Database runs under (typically the oracle user), and create a file-system directory:

$ mkdir /scratch/sales_hive_dir

Create a database directory and grant read and write access to it:

$ sqlplus / as sysdba
SQL> CREATE OR REPLACE DIRECTORY sales_hive_dir AS '/scratch/sales_hive_dir';
SQL> GRANT READ, WRITE ON DIRECTORY sales_hive_dir TO scott;

Create the external table:

hadoop jar OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
-D oracle.hadoop.exttab.tableName=SALES_HIVE_XTAB \
-D oracle.hadoop.exttab.sourceType=hive \
-D oracle.hadoop.exttab.locationFileCount=2 \
-D oracle.hadoop.exttab.hive.tableName=sales_country_us \
-D oracle.hadoop.exttab.hive.databaseName=salesdb \
-D oracle.hadoop.exttab.defaultDirectory=SALES_HIVE_DIR \
-D oracle.hadoop.connection.url=jdbc:oracle:thin:@//myhost:1521/myservicename \
-D oracle.hadoop.connection.user=SCOTT \
-createTable

Creating External Tables from Delimited Text Files Oracle SQL Connector for HDFS creates the external table definition for delimited text files using configuration properties that specify the number of columns, the text delimiter, and optionally, the external table column names. All text columns in the external table are VARCHAR2. If column names are not provided, they default to C1 to Cn, where n is the number of columns specified by the oracle.hadoop.exttab.columnCount property.


Required Properties These properties are required for delimited text sources:

■ oracle.hadoop.exttab.tableName

■ oracle.hadoop.exttab.defaultDirectory

■ oracle.hadoop.exttab.dataPaths

■ oracle.hadoop.exttab.columnCount or oracle.hadoop.exttab.columnNames

■ oracle.hadoop.connection.url

■ oracle.hadoop.connection.user See "Configuring Oracle SQL Connector for HDFS" on page 2-18 for descriptions of the properties used for this data source.

Optional Properties These properties are optional for delimited text sources:

■ oracle.hadoop.exttab.recordDelimiter

■ oracle.hadoop.exttab.fieldTerminator

■ oracle.hadoop.exttab.initialFieldEncloser

■ oracle.hadoop.exttab.trailingFieldEncloser

■ oracle.hadoop.exttab.locationFileCount

Defining Properties in XML Files for Delimited Text Files Example 2–6 is an XML template containing all the properties that can be used to describe a delimited text file. To use the template, cut and paste it into a text file, enter the appropriate values to describe your data files, and delete any optional properties that you do not need. For more information about using XML templates, see "Creating a Configuration File" on page 2-18.

Example 2–6 XML Template with Properties for a Delimited Text File

<?xml version="1.0"?>
<configuration>
  <!-- Required properties -->
  <property><name>oracle.hadoop.exttab.tableName</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.defaultDirectory</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.dataPaths</name><value>value</value></property>

  <!-- Use either columnCount or columnNames -->
  <property><name>oracle.hadoop.exttab.columnCount</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.columnNames</name><value>value</value></property>

  <property><name>oracle.hadoop.connection.url</name><value>value</value></property>
  <property><name>oracle.hadoop.connection.user</name><value>value</value></property>

  <!-- Optional properties -->
  <property><name>oracle.hadoop.exttab.recordDelimiter</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.fieldTerminator</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.initialFieldEncloser</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.trailingFieldEncloser</name><value>value</value></property>
  <property><name>oracle.hadoop.exttab.locationFileCount</name><value>value</value></property>
</configuration>

Example Example 2–7 creates an external table named SALES_DT_XTAB from delimited text files.

Example 2–7 Defining an External Table for Delimited Text Files Log in as the operating system user that Oracle Database runs under (typically the oracle user), and create a file-system directory:

$ mkdir /scratch/sales_dt_dir

Create a database directory and grant read and write access to it:

$ sqlplus / as sysdba
SQL> CREATE OR REPLACE DIRECTORY sales_dt_dir AS '/scratch/sales_dt_dir';
SQL> GRANT READ, WRITE ON DIRECTORY sales_dt_dir TO scott;

Create the external table:

hadoop jar OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
-D oracle.hadoop.exttab.tableName=SALES_DT_XTAB \
-D oracle.hadoop.exttab.locationFileCount=2 \
-D oracle.hadoop.exttab.dataPaths="hdfs:///user/scott/olh_sales/*.dat" \
-D oracle.hadoop.exttab.columnCount=10 \
-D oracle.hadoop.exttab.defaultDirectory=SALES_DT_DIR \
-D oracle.hadoop.connection.url=jdbc:oracle:thin:@//myhost:1521/myservicename \
-D oracle.hadoop.connection.user=SCOTT \
-createTable

Creating External Tables in SQL You can create an external table manually for Oracle SQL Connector for HDFS. For example, the following procedure enables you to use external table syntax that is not exposed by the ExternalTable -createTable command. Additional syntax might not be supported for Data Pump format files.

To create an external table manually:

1. Use the -createTable --noexecute command to generate the external table DDL.

2. Make whatever changes are needed to the DDL.

3. Run the DDL from Step 2 to create the table definition in the Oracle database.

4. Use the ExternalTable -publish command to publish the data URIs to the location files of the external table.
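The following Bash sketch outlines this workflow. The configuration file name sales_dp.xml and the report file name are placeholders, not values defined elsewhere in this guide:

# Step 1: Generate the DDL without executing it; the report includes the CREATE TABLE statement.
hadoop jar $OSCH_HOME/jlib/orahdfs.jar oracle.hadoop.exttab.ExternalTable \
-conf /home/oracle/sales_dp.xml -createTable --noexecute > ddl_report.txt

# Steps 2 and 3: Copy the DDL out of the report, edit it as needed, and run it in SQL*Plus.

# Step 4: Publish the HDFS data URIs to the location files of the new external table.
hadoop jar $OSCH_HOME/jlib/orahdfs.jar oracle.hadoop.exttab.ExternalTable \
-conf /home/oracle/sales_dp.xml -publish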

Publishing the HDFS Data Paths The -createTable command creates the metadata in Oracle Database and populates the location files with the Universal Resource Identifiers (URIs) of the data files in HDFS. However, you might publish the URIs as a separate step from creating the external table in cases like these:

■ You want to publish new data into an already existing external table.

■ You created the external table manually instead of using the ExternalTable tool.

In both cases, you can use ExternalTable with the -publish command to populate the external table location files with the URIs of the data files in HDFS. See "Location File Management" on page 2-17.

ExternalTable Syntax for Publish

hadoop jar OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
[-conf config_file]... \
[-D property=value]... \
-publish [--noexecute]

See Also: "ExternalTable Command-Line Tool Syntax" on page 2-6

ExternalTable Command-Line Tool Example Example 2–8 sets HADOOP_CLASSPATH and publishes the HDFS data paths to the external table created in Example 2–3. See "Configuring Your System for Oracle SQL Connector for HDFS" on page 2-5 for more information about setting this environment variable.


Example 2–8 Publishing HDFS Data Paths to an External Table for Data Pump Format Files This example uses the Bash shell.

$ export HADOOP_CLASSPATH="OSCH_HOME/jlib/*"
$ hadoop jar OSCH_HOME/jlib/orahdfs.jar oracle.hadoop.exttab.ExternalTable \
-D oracle.hadoop.exttab.tableName=SALES_DP_XTAB \
-D oracle.hadoop.exttab.sourceType=datapump \
-D oracle.hadoop.exttab.dataPaths=hdfs:/user/scott/data/ \
-D oracle.hadoop.connection.url=jdbc:oracle:thin:@//myhost:1521/myservicename \
-D oracle.hadoop.connection.user=scott \
-publish

In this example:

■ OSCH_HOME is the full path to the Oracle SQL Connector for HDFS installation directory.

■ SALES_DP_XTAB is the external table created in Example 2–3.

■ hdfs:/user/scott/data/ is the location of the HDFS data.

■ jdbc:oracle:thin:@//myhost:1521/myservicename is the database connection string.

Listing Location File Metadata and Contents The -listLocations command is a debugging and diagnostic utility that enables you to see the location file metadata and contents. You can use this command to verify the integrity of the location files of an Oracle external table. These properties are required to use this command:

■ oracle.hadoop.exttab.tableName

■ The JDBC connection properties; see "Connection Properties" on page 2-24.

ExternalTable Syntax for -listLocations

hadoop jar OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
[-conf config_file]... \
[-D property=value]... \
-listLocations [--details]

See Also: "ExternalTable Command-Line Tool Syntax" on page 2-6

Describing External Tables The -getDDL command is a debugging and diagnostic utility that prints the definition of an existing external table. This command follows the security model of the PL/SQL DBMS_METADATA package, which enables non-privileged users to see the metadata for their own objects. These properties are required to use this command:

■ oracle.hadoop.exttab.tableName

■ The JDBC connection properties; see "Connection Properties" on page 2-24.

ExternalTable Syntax for -getDDL

hadoop jar OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
[-conf config_file]... \
[-D property=value]... \
-getDDL

See Also: "ExternalTable Command-Line Tool Syntax" on page 2-6

More About External Tables Generated by the ExternalTable Tool Because external tables are used to access data, all of the features and limitations of external tables apply. Queries are executed in parallel with automatic load balancing. However, update, insert, and delete operations are not allowed and indexes cannot be created on external tables. When an external table is accessed, a full table scan is always performed. Oracle SQL Connector for HDFS uses the ORACLE_LOADER access driver. The hdfs_stream preprocessor script (provided with Oracle SQL Connector for HDFS) modifies the input data to a format that ORACLE_LOADER can process.

See Also:

■ Oracle Database Administrator's Guide for information about external tables

■ Oracle Database Utilities for more information about external tables, performance hints, and restrictions when you are using the ORACLE_LOADER access driver.

What Are Location Files? A location file is a file specified in the location clause of the external table. Oracle SQL Connector for HDFS creates location files that contain only the Universal Resource Identifiers (URIs) of the data files. A data file contains the data stored in HDFS.

Enabling Parallel Processing To enable parallel processing with external tables, you must specify multiple files in the location clause of the external table. The number of files, also known as the degree of parallelism, determines the number of child processes started by the external table during a table read. Ideally, the degree of parallelism is no larger than the number of data files, to avoid idle child processes.
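For delimited text and Hive sources, you influence the number of location files, and therefore the potential degree of parallelism, with the oracle.hadoop.exttab.locationFileCount property. The following fragment is an illustration only; the value 8 and the configuration file name are placeholders:

hadoop jar $OSCH_HOME/jlib/orahdfs.jar oracle.hadoop.exttab.ExternalTable \
-conf /home/oracle/example.xml \
-D oracle.hadoop.exttab.locationFileCount=8 \
-createTable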

Location File Management The Oracle SQL Connector for HDFS command-line tool, ExternalTable, manages the location files of the external table. Location file management involves the following operations:

■ Generating new location files in the database directory after checking for name conflicts

■ Deleting existing location files in the database directory as necessary

■ Publishing data URIs to new location files

■ Altering the LOCATION clause of the external table to match the new location files

Location file management for the supported data sources is described in the following topics.


Data Pump file format The ORACLE_LOADER access driver is required to access Data Pump files. The driver requires that each location file corresponds to a single Data Pump file in HDFS. Empty location files are not allowed, and so the number of location files in the external table must exactly match the number of data files in HDFS. Oracle SQL Connector for HDFS automatically takes over location file management and ensures that the number of location files in the external table equals the number of Data Pump files in HDFS.

Delimited files in HDFS and Hive tables The ORACLE_LOADER access driver has no limitation on the number of location files. Each location file can correspond to one or more data files in HDFS. The number of location files for the external table is suggested by the oracle.hadoop.exttab.locationFileCount configuration property. See "Connection Properties" on page 2-24.

Location File Names This is the format of a location file name:

osch-timestamp-number-n

In this syntax:

■ timestamp has the format yyyyMMddhhmmss, for example, 20121017103941 for October 17, 2012, at 10:39:41.

■ number is a random number used to prevent location file name conflicts among different tables.

■ n is an index used to prevent name conflicts between location files for the same table. For example, osch-20121017103941-6807-1.

Configuring Oracle SQL Connector for HDFS You can pass configuration properties to the ExternalTable tool on the command line with the -D option, or you can create a configuration file and pass it on the command line with the -conf option. These options must precede the command to be executed (-createTable, -publish, -listLocations, or -getDDL).

For example, this command uses a configuration file named example.xml:

hadoop jar OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable \
-conf /home/oracle/example.xml \
-createTable

See "ExternalTable Command-Line Tool Syntax" on page 2-6.

Creating a Configuration File A configuration file is an XML document with a very simple structure as follows:

<?xml version="1.0"?>
<configuration>
  <property>
    <name>property</name>
    <value>value</value>
  </property>
     .
     .
     .
</configuration>

Example 2–9 shows a configuration file. See "Configuration Properties" on page 2-19 for descriptions of these properties.

Example 2–9 Configuration File for Oracle SQL Connector for HDFS

<?xml version="1.0"?>
<configuration>
  <property><name>oracle.hadoop.exttab.tableName</name><value>SH.SALES_EXT_DIR</value></property>
  <property><name>oracle.hadoop.exttab.dataPaths</name><value>/data/s1/*.csv,/data/s2/*.csv</value></property>
  <property><name>oracle.hadoop.exttab.dataCompressionCodec</name><value>org.apache.hadoop.io.compress.DefaultCodec</value></property>
  <property><name>oracle.hadoop.connection.url</name><value>jdbc:oracle:thin:@//myhost:1521/myservicename</value></property>
  <property><name>oracle.hadoop.connection.user</name><value>SH</value></property>
</configuration>

Configuration Properties The following is a complete list of the configuration properties used by the ExternalTable command-line tool. The properties are organized into these categories:

■ General Properties

■ Connection Properties

General Properties

oracle.hadoop.exttab.columnCount The number of columns for the external table created from delimited text files. The column names are set to C1, C2,... Cn, where n is the value of this property. This property is ignored if oracle.hadoop.exttab.columnNames is set. The -createTable command uses this property when oracle.hadoop.exttab.sourceType=text. You must set one of these properties when creating an external table from delimited text files:

■ oracle.hadoop.exttab.columnNames

■ oracle.hadoop.exttab.columnCount


oracle.hadoop.exttab.columnNames A comma-separated list of column names for an external table created from delimited text files. If this property is not set, then the column names are set to C1, C2,... Cn, where n is the value of the oracle.hadoop.exttab.columnCount property. The column names are read as SQL identifiers: unquoted values are capitalized, and double-quoted values stay exactly as entered. The -createTable command uses this property when oracle.hadoop.exttab.sourceType=text. You must set one of these properties when creating an external table from delimited text files:

■ oracle.hadoop.exttab.columnNames

■ oracle.hadoop.exttab.columnCount

oracle.hadoop.exttab.dataCompressionCodec The name of the compression codec class used to decompress the data files. Specify this property when the data files are compressed. Optional. This property specifies the class name of the compression codec that implements the org.apache.hadoop.io.compress.CompressionCodec interface. This codec applies to all data files. Several standard codecs are available in Hadoop, including the following:

■ bzip2: org.apache.hadoop.io.compress.BZip2Codec

■ gzip: org.apache.hadoop.io.compress.GzipCodec

Default value: None

oracle.hadoop.exttab.dataPaths A comma-separated list of fully qualified HDFS paths. This property enables you to restrict the input by using special pattern-matching characters in the path specification. See Table 2–1. This property is required for the -createTable and -publish commands using Data Pump or delimited text files. The property is ignored for Hive data sources. For example, to select all files in /data/s2/, and only the CSV files in /data/s7/, /data/s8/, and /data/s9/, enter this expression: /data/s2/,/data/s[7-9]/*.csv

The external table accesses the data contained in all listed files and all files in listed directories. These files compose a single data set. The data set can contain compressed files or uncompressed files, but not both.

Table 2–1 Pattern-Matching Characters

Character    Description
?            Matches any single character
*            Matches zero or more characters
[abc]        Matches a single character from the character set {a, b, c}
[a-b]        Matches a single character from the character range {a...b}. The character a must be less than or equal to b.
[^a]         Matches a single character that is not from character set or range {a}. The carat (^) must immediately follow the left bracket.


Table 2–1 (Cont.) Pattern-Matching Characters

Character          Description
\c                 Removes any special meaning of character c. The backslash is the escape character.
{ab\,cd}           Matches a string from the string set {ab, cd}. Precede the comma with an escape character (\) to remove the meaning of the comma as a path separator.
{ab\,c{de\,fh}}    Matches a string from the string set {ab, cde, cfh}. Precede the comma with an escape character (\) to remove the meaning of the comma as a path separator.

oracle.hadoop.exttab.dataPathFilter The path filter class. This property is ignored for Hive data sources. Oracle SQL Connector for HDFS uses a default filter to exclude hidden files, which begin with a dot or an underscore. If you specify another path filter class using this property, then your filter acts in addition to the default filter. Thus, only visible files accepted by your filter are considered.

oracle.hadoop.exttab.defaultDirectory Specifies the default directory for the Oracle external table. This directory is used for all input and output files that do not explicitly name a directory object.
Valid value: The name of an existing database directory
Unquoted names are changed to upper case. Double-quoted names are not changed; use them when case-sensitivity is desired. Single-quoted names are not allowed for default directory names.
The -createTable command requires this property.

oracle.hadoop.exttab.fieldTerminator Specifies the field terminator for an external table when oracle.hadoop.exttab.sourceType=text. Optional.
Default value: , (comma)
Valid values: A string in one of the following formats:

■ One or more regular printable characters; it cannot start with \u. For example, \t represents a tab.

■ One or more encoded characters in the format \uHHHH, where HHHH is a big-endian hexadecimal representation of the character in UTF-16. For example, \u0009 represents a tab. The hexadecimal digits are case insensitive.

Do not mix the two formats.

oracle.hadoop.exttab.hive.databaseName The name of a Hive database that contains the input data table. The -createTable command requires this property when oracle.hadoop.exttab.sourceType=hive.

oracle.hadoop.exttab.hive.tableName The name of an existing Hive table. The -createTable command requires this property when oracle.hadoop.exttab.sourceType=hive.


oracle.hadoop.exttab.initialFieldEncloser Specifies the initial field encloser for an external table created from delimited text files. Optional. Default value: null; no enclosers are specified for the external table definition. The -createTable command uses this property when oracle.hadoop.exttab.sourceType=text. Valid values: A string in one of the following formats:

■ One or more regular printable characters; it cannot start with \u.

■ One or more encoded characters in the format \uHHHH, where HHHH is a big-endian hexadecimal representation of the character in UTF-16. The hexadecimal digits are case insensitive. Do not mix the two formats.

oracle.hadoop.exttab.locationFileCount Specifies the desired number of location files for the external table. Applicable only to non-Data-Pump files. Default value: 4 This property is ignored if the data files are in Data Pump format. Otherwise, the number of location files is the lesser of:

■ The number of data files

■ The value of this property

At least one location file is created. See "Enabling Parallel Processing" on page 2-17 for more information about the number of location files.

oracle.hadoop.exttab.logDirectory Specifies a database directory where log files, bad files, and discard files are stored. The file names are the default values used by external tables. For example, the name of a log file is the table name followed by _%p.log. This is an optional property for the -createTable command. These are the default file name extensions:

■ Log files: log

■ Bad files: bad

■ Discard files: dsc

Valid values: An existing Oracle directory object name. Unquoted names are uppercased. Quoted names are not changed. Table 2–2 provides examples of how values are transformed.

Table 2–2 Examples of Quoted and Unquoted Values

Specified Value                        Interpreted Value
my_log_dir:'sales_tab_%p.log'          MY_LOG_DIR/sales_tab_%p.log
'my_log_dir':'sales_tab_%p.log'        my_log_dir/sales_tab_%p.log
"my_log_dir":"sales_tab_%p.log"        my_log_dir/sales_tab_%p.log

oracle.hadoop.exttab.preprocessorDirectory Specifies the database directory for the preprocessor. The file-system directory must contain the hdfs_stream script.
Default value: OSCH_BIN_PATH
The preprocessor directory is used in the PREPROCESSOR clause of the external table.

oracle.hadoop.exttab.recordDelimiter Specifies the record delimiter for an external table created from delimited text files. Optional.
Default value: \n
The -createTable command uses this parameter when oracle.hadoop.exttab.sourceType=text.
Valid values: A string in one of the following formats:

■ One or more regular printable characters; it cannot start with \u.

■ One or more encoded characters in the format \uHHHH, where HHHH is a big-endian hexadecimal representation of the character in UTF-16. The hexadecimal digits are case insensitive.

Do not mix the two formats.

oracle.hadoop.exttab.sourceType Specifies the type of source files. The valid values are datapump, hive, and text.
Default value: text
The -createTable and -publish operations require the value of this parameter.

oracle.hadoop.exttab.tableName Schema-qualified name of the external table in this format: schemaName.tableName
If you omit schemaName, then the schema name defaults to the connection user name.
Default value: none
Required property for all operations.

oracle.hadoop.exttab.trailingFieldEncloser Specifies the trailing field encloser for an external table created from delimited text files. Optional.
Default value: null; defaults to the value of oracle.hadoop.exttab.initialFieldEncloser
The -createTable command uses this property when oracle.hadoop.exttab.sourceType=text.
Valid values: A string in one of the following formats:

■ One or more regular printable characters; it cannot start with \u.

■ One or more encoded characters in the format \uHHHH, where HHHH is a big-endian hexadecimal representation of the character in UTF-16. The hexadecimal digits are case insensitive. Do not mix the two formats.


Connection Properties

oracle.hadoop.connection.url Specifies the database connection string in the thin-style service name format: jdbc:oracle:thin:@//host_name:port/service_name

If you are unsure of the service name, then enter this SQL command as a privileged user: SQL> show parameter service

If an Oracle wallet is configured as an external password store, then the property value must start with the driver prefix jdbc:oracle:thin:@ and db_connect_string must exactly match the credentials defined in the wallet. This property takes precedence over all other connection properties. Default value: Not defined Valid values: A string

oracle.hadoop.connection.user An Oracle database log-in name. The ExternalTable tool prompts for a password. This property is required unless you are using Oracle wallet as an external password store.
Default value: Not defined
Valid values: A string

oracle.hadoop.connection.tnsEntryName Specifies a TNS entry name defined in the tnsnames.ora file. This property is used with the oracle.hadoop.connection.tns_admin property. Default value: Not defined Valid values: A string

oracle.hadoop.connection.tns_admin Specifies the directory that contains the tnsnames.ora file. Define this property to use transparent network substrate (TNS) entry names in database connection strings. When using TNSNames with the JDBC thin driver, you must set either this property or the Java oracle.net.tns_admin property. When both are set, this property takes precedence over oracle.net.tns_admin. This property must be set when using Oracle Wallet as an external password store. See oracle.hadoop.connection.wallet_location. Default value: The value of the Java oracle.net.tns_admin system property Valid values: A string

oracle.hadoop.connection.wallet_location A file path to an Oracle wallet directory where the connection credential is stored. Default value: Not defined Valid values: A string When using Oracle Wallet as an external password store, set these properties:

■ oracle.hadoop.connection.wallet_location


■ oracle.hadoop.connection.url or oracle.hadoop.connection.tnsEntryName

■ oracle.hadoop.connection.tns_admin
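As an illustration only, the wallet-related connection properties might be supplied together on the command line as follows. The wallet directory, TNS directory, and TNS entry name shown here are placeholder values:

hadoop jar $OSCH_HOME/jlib/orahdfs.jar oracle.hadoop.exttab.ExternalTable \
-D oracle.hadoop.exttab.tableName=SALES_DP_XTAB \
-D oracle.hadoop.connection.wallet_location=/home/oracle/wallets/osch \
-D oracle.hadoop.connection.tnsEntryName=orcl \
-D oracle.hadoop.connection.tns_admin=/home/oracle/network/admin \
-getDDL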

Performance Tips for Querying Data in HDFS Parallel processing is extremely important when you are working with large volumes of data. When you use external tables, always enable parallel query with this SQL command: ALTER SESSION ENABLE PARALLEL QUERY;

Before loading the data into an Oracle database from the external files created by Oracle SQL Connector for HDFS, enable parallel DDL: ALTER SESSION ENABLE PARALLEL DDL;

Before inserting data into an existing database table, enable parallel DML with this SQL command: ALTER SESSION ENABLE PARALLEL DML;

Hints such as APPEND and PQ_DISTRIBUTE also improve performance when you are inserting data.
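For example, a SQL*Plus session that copies rows from an external table into an existing, column-compatible database table might combine these settings with the APPEND hint. The names target_table and hdfs_ext_tab are placeholders:

SQL> ALTER SESSION ENABLE PARALLEL QUERY;
SQL> ALTER SESSION ENABLE PARALLEL DML;
SQL> INSERT /*+ APPEND */ INTO target_table
  2  SELECT * FROM hdfs_ext_tab;
SQL> COMMIT;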



3 Oracle Loader for Hadoop

This chapter explains how to use Oracle Loader for Hadoop to copy data from Hadoop into tables in an Oracle database. It contains the following sections:

■ What Is Oracle Loader for Hadoop?

■ About the Modes of Operation

■ Getting Started With Oracle Loader for Hadoop

■ Creating the Target Table

■ Creating a Job Configuration File

■ About the Target Table Metadata

■ About Input Formats

■ Mapping Input Fields to Target Table Columns

■ About Output Formats

■ Running a Loader Job

■ Handling Rejected Records

■ Balancing Loads When Loading Data into Partitioned Tables

■ Optimizing Communications Between Oracle Engineered Systems

■ Oracle Loader for Hadoop Configuration Property Reference

■ Third-Party Licenses for Bundled Software

What Is Oracle Loader for Hadoop? Oracle Loader for Hadoop is an efficient and high-performance loader for fast movement of data from a Hadoop cluster into a table in an Oracle database. It prepartitions the data if necessary and transforms it into a database-ready format. It can also sort records by primary key or user-specified columns before loading the data or creating output files. Oracle Loader for Hadoop uses the parallel processing framework of Hadoop to perform these preprocessing operations, which other loaders typically perform on the database server as part of the load process. Offloading these operations to Hadoop reduces the CPU requirements on the database server, thereby lessening the performance impact on other database tasks.

Oracle Loader for Hadoop is a Java MapReduce application that balances the data across reducers to help maximize performance. It works with a range of input data formats that present the data as records with fields. It can read from sources that have the data already in a record format (such as Avro files or Hive tables), or it can split the lines of a text file into fields.

You run Oracle Loader for Hadoop using the hadoop command-line utility. In the command line, you provide configuration settings with the details of the job. You typically provide these settings in a job configuration file. You can optionally identify the name of a file that maps input fields to the columns of the target table in an Oracle database.

If you have Java programming skills, you can extend the types of data that the loader can handle by defining custom input formats. Then Oracle Loader for Hadoop uses your code to extract the fields and records.
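As a sketch only, a typical hadoop command-line invocation of the loader has the following general form. The installation path and configuration file name are placeholders, and the complete syntax and options are described in "Running a Loader Job" on page 3-20:

$ export OLH_HOME="/opt/oracle/oraloader-2.1.0"
$ hadoop jar $OLH_HOME/jlib/oraloader.jar oracle.hadoop.loader.OraLoader \
-conf /home/oracle/movie_load_config.xml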

About the Modes of Operation Oracle Loader for Hadoop operates in two modes:

■ Online Database Mode

■ Offline Database Mode

Online Database Mode In online database mode, Oracle Loader for Hadoop can connect to the target database using the credentials provided in the job configuration file or in an Oracle wallet. The loader obtains the table metadata from the database. It can insert new records directly into the target table or write them to a file in the Hadoop cluster. You can load records from an output file when the data is needed in the database, or when the database system is less busy. Figure 3–1 shows the relationships among elements in online database mode.

Figure 3–1 Online Database Mode


Offline Database Mode Offline database mode enables you to use Oracle Loader for Hadoop when the Oracle Database system is on a separate network from the Hadoop cluster, or is otherwise inaccessible. In this mode, Oracle Loader for Hadoop uses the information supplied in a table metadata file, which you generate using a separate utility. The loader job stores the output data in binary or text format output files on the Hadoop cluster. Loading the data into Oracle Database is a separate procedure using another utility, such as Oracle SQL Connector for Hadoop Distributed File System (HDFS) or SQL*Loader. Figure 3–2 shows the relationships among elements in offline database mode. The figure does not show the separate procedure of loading the data into the target table.

Figure 3–2 Offline Database Mode

Getting Started With Oracle Loader for Hadoop You take the following basic steps when using Oracle Loader for Hadoop:

1. The first time you use Oracle Loader for Hadoop, ensure that the software is installed and configured. See "Oracle Loader for Hadoop Setup" on page 1-12.

2. Connect to Oracle Database and create the target table. See "Creating the Target Table" on page 3-5.


3. If you are using offline database mode, then generate the table metadata. See "Generating the Target Table Metadata for Offline Database Mode" on page 3-8.

4. Log in to either a node in the Hadoop cluster or a system set up as a Hadoop client for the cluster.

5. If you are using offline database mode, then copy the table metadata to the Hadoop system where you are logged in.

6. Create a configuration file. This file is an XML document that describes configuration information, such as access to the target table metadata, the input format of the data, and the output format. See "Creating a Job Configuration File" on page 3-6.

7. Create an XML document that maps input fields to columns in the Oracle database table. Optional. See "Mapping Input Fields to Target Table Columns" on page 3-14.

8. Create a shell script to run the Oracle Loader for Hadoop job. See "Running a Loader Job" on page 3-20.

9. Run the shell script.

10. If the job fails, then use the diagnostic messages in the output to identify and correct the error. See "Job Reporting" on page 3-21.

11. After the job succeeds, check the command output for the number of rejected records. If too many records were rejected, then you may need to modify the input format properties.

12. If you generated text files or Data Pump-format files, then load the data into Oracle Database using one of these methods:

■ Create an external table using Oracle SQL Connector for HDFS (online database mode only). See Chapter 2.

■ Copy the files to the Oracle Database system and use SQL*Loader or external tables to load the data into the target database table. Oracle Loader for Hadoop generates scripts that you can use for these methods. See "About DelimitedTextOutputFormat" on page 3-18 or "About DataPumpOutputFormat" on page 3-19.

13. Connect to Oracle Database as the owner of the target table. Query the table to ensure that the data loaded correctly. If it did not, then modify the input or output format properties as needed to correct the problem.

14. Before running the OraLoader job in a production environment, employ these optimizations:

■ Balancing Loads When Loading Data into Partitioned Tables

■ Optimizing Communications Between Oracle Engineered Systems


Creating the Target Table Oracle Loader for Hadoop loads data into one target table, which must exist in the Oracle database. The table can be empty or contain data already. Oracle Loader for Hadoop does not overwrite existing data. Create the table the same way that you would create one for any other purpose. It must comply with the following restrictions:

■ Supported Data Types for Target Tables

■ Supported Partitioning Strategies for Target Tables

Supported Data Types for Target Tables You can define the target table using any of these data types:

■ BINARY_DOUBLE

■ BINARY_FLOAT

■ CHAR

■ DATE

■ FLOAT

■ INTERVAL DAY TO SECOND

■ INTERVAL YEAR TO MONTH

■ NCHAR

■ NUMBER

■ NVARCHAR2

■ RAW

■ TIMESTAMP

■ TIMESTAMP WITH LOCAL TIME ZONE

■ TIMESTAMP WITH TIME ZONE

■ VARCHAR2 The target table can contain columns with unsupported data types, but these columns must be nullable, or otherwise set to a value.

Supported Partitioning Strategies for Target Tables Partitioning is a database feature for managing and efficiently querying very large tables. It provides a way to decompose a large table into smaller and more manageable pieces called partitions, in a manner entirely transparent to applications. You can define the target table using any of the following single-level and composite-level partitioning strategies.

■ Hash

■ Hash-Hash

■ Hash-List

■ Hash-Range

■ Interval


■ Interval-Hash

■ Interval-List

■ Interval-Range

■ List

■ List-Hash

■ List-List

■ List-Range

■ Range

■ Range-Hash

■ Range-List

■ Range-Range

Oracle Loader for Hadoop does not support reference partitioning or virtual column-based partitioning.

See Also: Oracle Database VLDB and Partitioning Guide
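As a brief illustration, a hash-partitioned target table that uses only supported data types might be created as follows. The table and column names are placeholders:

SQL> CREATE TABLE movie_sessions_tgt (
  2    cust_id     NUMBER,
  3    movie_id    NUMBER,
  4    session_ts  TIMESTAMP,
  5    sales       NUMBER
  6  )
  7  PARTITION BY HASH (cust_id) PARTITIONS 4;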

Creating a Job Configuration File A configuration file is an XML document that provides Hadoop with all the information it needs to run a MapReduce job. This file can also provide Oracle Loader for Hadoop with all the information it needs. See "Oracle Loader for Hadoop Configuration Property Reference" on page 3-24. Configuration properties provide the following information, which is required for all Oracle Loader for Hadoop jobs:

■ How to obtain the target table metadata. See "About the Target Table Metadata" on page 3-8.

■ The format of the input data. See "About Input Formats" on page 3-10.

■ The format of the output data. See "About Output Formats" on page 3-16. OraLoader implements the org.apache.hadoop.util.Tool interface and follows the standard Hadoop methods for building MapReduce applications. Thus, you can supply the configuration properties in a file (as shown here) or on the hadoop command line. See "Running a Loader Job" on page 3-20. You can use any text or XML editor to create the file. Example 3–1 provides an example of a job configuration file.

Example 3–1 Job Configuration File

<?xml version="1.0" encoding="UTF-8" ?>
<configuration>

 <property>
   <name>mapreduce.inputformat.class</name>
   <value>oracle.hadoop.loader.lib.input.DelimitedTextInputFormat</value>
 </property>

 <property>
   <name>mapred.input.dir</name>
   <value>/user/oracle/moviedemo/session/*00000</value>
 </property>

 <property>
   <name>oracle.hadoop.loader.input.fieldTerminator</name>
   <value>\u0009</value>
 </property>

 <property>
   <name>mapreduce.outputformat.class</name>
   <value>oracle.hadoop.loader.lib.output.OCIOutputFormat</value>
 </property>

 <property>
   <name>mapred.output.dir</name>
   <value>temp_out_session</value>
 </property>

 <property>
   <name>oracle.hadoop.loader.loaderMapFile</name>
   <value>file:///home/oracle/movie/moviedemo/olh/loaderMap_moviesession.xml</value>
 </property>

 <property>
   <name>oracle.hadoop.loader.connection.url</name>
   <value>jdbc:oracle:thin:@${HOST}:${TCPPORT}/${SERVICE_NAME}</value>
 </property>

 <property>
   <name>TCPPORT</name>
   <value>1521</value>
 </property>

 <property>
   <name>HOST</name>
   <value>myoraclehost.example.com</value>
 </property>

 <property>
   <name>SERVICE_NAME</name>
   <value>orcl</value>
 </property>

 <property>
   <name>oracle.hadoop.loader.connection.user</name>
   <value>MOVIEDEMO</value>
 </property>

 <property>
   <name>oracle.hadoop.loader.connection.password</name>
   <value>oracle</value>
   <description>A password in clear text is NOT RECOMMENDED. Use an Oracle wallet instead.</description>
 </property>

</configuration>

About the Target Table Metadata

You must provide Oracle Loader for Hadoop with information about the target table. The way that you provide this information depends on whether you run Oracle Loader for Hadoop in online or offline database mode. See "About the Modes of Operation" on page 3-2.

Providing the Connection Details for Online Database Mode

Oracle Loader for Hadoop uses table metadata from the Oracle database to identify the column names, data types, partitions, and so forth. The loader automatically fetches the metadata whenever a JDBC connection can be established.
Oracle recommends that you use a wallet to provide your credentials. To use an Oracle wallet, enter the following properties in the job configuration file:

■ oracle.hadoop.loader.connection.wallet_location

■ oracle.hadoop.loader.connection.tns_admin

■ oracle.hadoop.loader.connection.url or oracle.hadoop.loader.connection.tnsEntryName Oracle recommends that you do not store passwords in clear text; use an Oracle wallet instead to safeguard your credentials. However, if you are not using an Oracle wallet, then enter these properties:

■ oracle.hadoop.loader.connection.url

■ oracle.hadoop.loader.connection.user

■ oracle.hadoop.loader.connection.password
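For example, a wallet-based connection might be configured with properties like the following. This is only a sketch: the wallet directory, TNS directory, and TNS entry name shown here are hypothetical and must match your own environment.

<property>
  <name>oracle.hadoop.loader.connection.wallet_location</name>
  <value>/u01/app/oracle/wallet</value>  <!-- hypothetical wallet directory on each Hadoop node -->
</property>
<property>
  <name>oracle.hadoop.loader.connection.tns_admin</name>
  <value>/u01/app/oracle/network/admin</value>  <!-- hypothetical directory containing tnsnames.ora -->
</property>
<property>
  <name>oracle.hadoop.loader.connection.tnsEntryName</name>
  <value>ORCL_ENTRY</value>  <!-- hypothetical TNS entry name defined in tnsnames.ora -->
</property>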

Generating the Target Table Metadata for Offline Database Mode

Under some circumstances, the loader job cannot access the database, such as when the Hadoop cluster is on a different network than Oracle Database. In such cases, you can use the OraLoaderMetadata utility to extract and store the target table metadata in a file.
To provide target table metadata in offline database mode:
1. Log in to the Oracle Database system.
2. The first time you use offline database mode, ensure that the software is installed and configured on the database system. See "Providing Support for Offline Database Mode" on page 1-13.
3. Export the table metadata by running the OraLoaderMetadata utility program. See "OraLoaderMetadata Utility" on page 3-9.
4. Copy the generated XML file containing the table metadata to the Hadoop cluster.


5. Use the oracle.hadoop.loader.tableMetadataFile property in the job configuration file to specify the location of the XML metadata file on the Hadoop cluster. When the loader job runs, it accesses this XML document to discover the target table metadata.
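For example, the property might be set as follows; the path shown is hypothetical and must point to wherever you copied the metadata file:

<property>
  <name>oracle.hadoop.loader.tableMetadataFile</name>
  <value>file:///home/oracle/employee_metadata.xml</value>  <!-- hypothetical location of the copied metadata file -->
</property>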

OraLoaderMetadata Utility

Use the following syntax to run the OraLoaderMetadata utility on the Oracle Database system. You must enter the java command on a single line, although it is shown here on multiple lines for clarity:

java oracle.hadoop.loader.metadata.OraLoaderMetadata -user userName
   -connection_url connection [-schema schemaName] -table tableName
   -output fileName.xml

To see the OraLoaderMetadata Help file, use the command with no options.

Options

-user userName The Oracle Database user who owns the target table. The utility prompts you for the password.

-connection_url connection The database connection string in the thin-style service name format: jdbc:oracle:thin:@//hostName:port/serviceName

If you are unsure of the service name, then enter this SQL command as a privileged user:

SQL> show parameter service

NAME            TYPE        VALUE
--------------- ----------- -----------
service_names   string      orcl

-schema schemaName The name of the schema containing the target table. Unquoted values are capitalized, and quoted values are used exactly as entered. If you omit this option, then the utility looks for the target table in the schema specified in the -user option.

-table tableName The name of the target table. Unquoted values are capitalized, and quoted values are used exactly as entered.

-output fileName.xml The output file name used to store the metadata document. Example 3–2 shows how to store the target table metadata in an XML file.

Example 3–2 Generating Table Metadata

Run the OraLoaderMetadata utility:

$ java -cp '/tmp/oraloader-2.1.0-1/jlib/*' \
oracle.hadoop.loader.metadata.OraLoaderMetadata -user HR \
-connection_url jdbc:oracle:thin:@//localhost:1521/orcl.example.com \
-table EMPLOYEES -output employee_metadata.xml

The OraLoaderMetadata utility prompts for the database password.

Oracle Loader for Hadoop Release 2.1.0 - Production

Copyright (c) 2011, 2013, Oracle and/or its affiliates. All rights reserved.

[Enter Database Password:] password

OraLoaderMetadata creates the XML file in the same directory as the script.

$ more employee_metadata.xml

(The command displays the generated metadata document, an XML description of the HR.EMPLOYEES table; the listing is omitted here.)
.
.
.

About Input Formats

An input format reads a specific type of data stored in Hadoop. Several input formats are available, which can read the data formats most commonly found in Hadoop:

■ Delimited Text Input Format

■ Complex Text Input Formats

■ Hive Table Input Format

■ Avro Input Format

■ Oracle NoSQL Database Input Format You can also use your own custom input formats. The descriptions of the built-in formats provide information that may help you develop custom Java InputFormat classes. See "Custom Input Formats" on page 3-13. You specify a particular input format for the data that you want to load into a database table, by using the mapreduce.inputformat.class configuration property in the job configuration file.


Note: The built-in text formats do not handle header rows or newline characters (\n) embedded in quoted values.

Delimited Text Input Format

To load data from a delimited text file, set mapreduce.inputformat.class to
oracle.hadoop.loader.lib.input.DelimitedTextInputFormat

About DelimitedTextInputFormat The input file must comply with these requirements:

■ Records must be separated by newline characters.

■ Fields must be delimited using single-character markers, such as commas or tabs.

Any empty-string token, whether enclosed or unenclosed, is replaced by a null.
DelimitedTextInputFormat emulates the tokenization method of SQL*Loader: Terminated by t, and optionally enclosed by ie, or by ie and te. DelimitedTextInputFormat uses the following syntax rules, where t is the field terminator, ie is the initial field encloser, te is the trailing field encloser, and c is one character.

■ Line = Token t Line | Token\n

■ Token = EnclosedToken | UnenclosedToken

■ EnclosedToken = (white-space)* ie [(non-te)* te te]* (non-te)* te (white-space)*

■ UnenclosedToken = (white-space)* (non-t)*

■ white-space = {c | Character.isWhitespace(c) and c!=t}

White space around enclosed tokens (data values) is discarded. For unenclosed tokens, the leading white space is discarded, but not the trailing white space (if any).
This implementation enables you to define custom enclosers and terminator characters, but it hard codes the record terminator as a newline, and white space as Java Character.isWhitespace. A white-space character can be defined as the field terminator, but then that character is removed from the class of white space characters to prevent ambiguity.

Required Configuration Properties None. The default format separates fields with commas and has no field enclosures.

Optional Configuration Properties

Use one or more of the following properties to define the field delimiters for DelimitedTextInputFormat:

■ oracle.hadoop.loader.input.fieldTerminator

■ oracle.hadoop.loader.input.initialFieldEncloser

■ oracle.hadoop.loader.input.trailingFieldEncloser Use the following property to provide names for the input fields:

■ oracle.hadoop.loader.input.fieldNames
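The following snippet sketches how these properties might be combined in a job configuration file for tab-separated input whose values are optionally enclosed in double quotation marks; the field names shown are illustrative:

<property>
  <name>oracle.hadoop.loader.input.fieldTerminator</name>
  <value>\u0009</value>  <!-- tab character -->
</property>
<property>
  <name>oracle.hadoop.loader.input.initialFieldEncloser</name>
  <value>"</value>  <!-- the trailing encloser defaults to the same character -->
</property>
<property>
  <name>oracle.hadoop.loader.input.fieldNames</name>
  <value>F0,F1,F2,F3</value>  <!-- illustrative field names -->
</property>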


Complex Text Input Formats

To load data from text files that are more complex than DelimitedTextInputFormat can handle, set mapreduce.inputformat.class to
oracle.hadoop.loader.lib.input.RegexInputFormat
For example, a web log might delimit one field with quotes and another field with square brackets.

About RegexInputFormat RegexInputFormat requires that records be separated by newline characters. It identifies fields in each text line by matching a regular expression:

■ The regular expression must match the entire text line.

■ The fields are identified using the capturing groups in the regular expression. RegexInputFormat uses the java.util.regex regular expression-based pattern matching engine.

See Also: The Java Platform Standard Edition 6 Java Reference for more information about java.util.regex:
http://docs.oracle.com/javase/6/docs/api/java/util/regex/package-summary.html

Required Configuration Properties Use the following property to describe the data input file:

■ oracle.hadoop.loader.input.regexPattern

Optional Configuration Properties Use the following property to identify the names of all input fields:

■ oracle.hadoop.loader.input.fieldNames Use this property to enable case-insensitive matches:

■ oracle.hadoop.loader.input.regexCaseInsensitive
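As an illustration, the following snippet describes a three-field line in which the second field is enclosed in double quotation marks and the third in square brackets; the pattern and field names are examples only, not part of any built-in format:

<property>
  <name>oracle.hadoop.loader.input.regexPattern</name>
  <value>([^ ]+) "([^"]*)" \[([^\]]*)\]</value>  <!-- three capturing groups; must match the entire line -->
</property>
<property>
  <name>oracle.hadoop.loader.input.fieldNames</name>
  <value>CODE,MESSAGE,TIMESTAMP</value>  <!-- illustrative field names -->
</property>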

Hive Table Input Format

To load data from a Hive table, set mapreduce.inputformat.class to
oracle.hadoop.loader.lib.input.HiveToAvroInputFormat

About HiveToAvroInputFormat

HiveToAvroInputFormat imports the entire table, which includes all the files in the Hive table directory or, for partitioned tables, all files in each partition directory.
Oracle Loader for Hadoop rejects all rows with complex (non-primitive) column values. UNIONTYPE fields that resolve to primitive values are supported. See "Handling Rejected Records" on page 3-22.
HiveToAvroInputFormat transforms rows in the Hive table into Avro records, and capitalizes the Hive table column names to form the field names. This automatic capitalization improves the likelihood that the field names match the target table column names. See "Mapping Input Fields to Target Table Columns" on page 3-14.


Required Configuration Properties You must specify the Hive database and table names using the following configuration properties:

■ oracle.hadoop.loader.input.hive.databaseName

■ oracle.hadoop.loader.input.hive.tableName
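For example, to load a hypothetical Hive table named movie_sessions from a Hive database named moviedemo, the properties might look like this:

<property>
  <name>oracle.hadoop.loader.input.hive.databaseName</name>
  <value>moviedemo</value>  <!-- hypothetical Hive database -->
</property>
<property>
  <name>oracle.hadoop.loader.input.hive.tableName</name>
  <value>movie_sessions</value>  <!-- hypothetical Hive table -->
</property>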

Avro Input Format

To load data from binary Avro data files containing standard Avro-format records, set mapreduce.inputformat.class to
oracle.hadoop.loader.lib.input.AvroInputFormat
To process only files with the .avro extension, append *.avro to directories listed in the mapred.input.dir configuration property.
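A job configuration for Avro input might therefore contain entries like the following; the input path is hypothetical:

<property>
  <name>mapreduce.inputformat.class</name>
  <value>oracle.hadoop.loader.lib.input.AvroInputFormat</value>
</property>
<property>
  <name>mapred.input.dir</name>
  <value>/user/oracle/avrodata/*.avro</value>  <!-- hypothetical directory; *.avro restricts the files processed -->
</property>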

Configuration Properties None

Oracle NoSQL Database Input Format

To load data from Oracle NoSQL Database, set mapreduce.inputformat.class to
oracle.kv.hadoop.KVAvroInputFormat
This input format is defined in Oracle NoSQL Database 11g, Release 2 and later releases.

About KVAvroInputFormat

Oracle Loader for Hadoop uses KVAvroInputFormat to read data directly from Oracle NoSQL Database.
KVAvroInputFormat passes the value but not the key from the key-value pairs in Oracle NoSQL Database. If you must access the Oracle NoSQL Database keys as Avro data values, such as storing them in the target table, then you must create a Java InputFormat class that implements oracle.kv.hadoop.AvroFormatter. Then you can specify the oracle.kv.formatterClass property in the Oracle Loader for Hadoop configuration file.
The KVAvroInputFormat class is a subclass of org.apache.hadoop.mapreduce.InputFormat.
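A configuration sketch for this input format follows; the formatter class name is hypothetical and is needed only if you must expose the keys as Avro values:

<property>
  <name>mapreduce.inputformat.class</name>
  <value>oracle.kv.hadoop.KVAvroInputFormat</value>
</property>
<property>
  <name>oracle.kv.formatterClass</name>
  <value>com.example.MyKeyValueFormatter</value>  <!-- hypothetical class implementing oracle.kv.hadoop.AvroFormatter -->
</property>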

See Also: Javadoc for the KVInputFormatBase class at http://docs.oracle.com/cd/NOSQL/html/index.html

Configuration Properties None

Custom Input Formats If the built-in input formats do not meet your needs, then you can write a Java class for a custom input format. The following information describes the framework in which an input format works in Oracle Loader for Hadoop.


About Implementing a Custom Input Format

Oracle Loader for Hadoop gets its input from a class extending org.apache.hadoop.mapreduce.InputFormat. You must specify the name of that class in the mapreduce.inputformat.class configuration property.
The input format must create RecordReader instances that return an Avro IndexedRecord input object from the getCurrentValue method. Use this method signature:

public org.apache.avro.generic.IndexedRecord getCurrentValue()
throws IOException, InterruptedException;

Oracle Loader for Hadoop uses the schema of the IndexedRecord input object to discover the names of the input fields and map them to the columns of the target table.

About Error Handling If processing an IndexedRecord value results in an error, Oracle Loader for Hadoop uses the object returned by the getCurrentKey method of the RecordReader to provide feedback. It calls the toString method of the key and formats the result in an error message. InputFormat developers can assist users in identifying the rejected records by returning one of the following:

■ Data file URI

■ InputSplit information

■ Data file name and the record offset in that file Oracle recommends that you do not return the record in clear text, because it might contain sensitive information; the returned values can appear in Hadoop logs throughout the cluster. See "Logging Rejected Records in Bad Files" on page 3-22. If a record fails and the key is null, then the loader generates no identifying information.

Supporting Data Sampling Oracle Loader for Hadoop uses a sampler to improve performance of its MapReduce job. The sampler is multithreaded, and each sampler thread instantiates its own copy of the supplied InputFormat class. When implementing a new InputFormat, ensure that it is thread-safe. See "Balancing Loads When Loading Data into Partitioned Tables" on page 3-22.

InputFormat Source Code Example

Oracle Loader for Hadoop provides the source code for an InputFormat example, which is located in the examples/jsrc/ directory.
The sample format loads data from a simple, comma-separated value (CSV) file. To use this input format, specify oracle.hadoop.loader.examples.CSVInputFormat as the value of mapreduce.inputformat.class in the job configuration file.
This input format automatically assigns field names of F0, F1, F2, and so forth. It does not have configuration properties.
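A minimal configuration for the sample format might include only the input format class and the input directory; the path is hypothetical:

<property>
  <name>mapreduce.inputformat.class</name>
  <value>oracle.hadoop.loader.examples.CSVInputFormat</value>
</property>
<property>
  <name>mapred.input.dir</name>
  <value>/user/oracle/csvdata</value>  <!-- hypothetical input directory -->
</property>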

Mapping Input Fields to Target Table Columns

Mapping identifies which input fields are loaded into which columns of the target table. You may be able to use the automatic mapping facilities, or you can always manually map the input fields to the target columns.


Automatic Mapping Oracle Loader for Hadoop can automatically map the fields to the appropriate columns when the input data complies with these requirements:

■ All columns of the target table are loaded.

■ The input data field names in the IndexedRecord input object exactly match the column names.

■ All input fields that are mapped to DATE columns can be parsed using the same Java date format.

Use the oracle.hadoop.loader.targetTable configuration property to identify the target table. Use the oracle.hadoop.loader.defaultDateFormat configuration property to specify a default date format that applies to all DATE fields.
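For example, an automatically mapped load might add only these two mapping-related properties to the job configuration; the table name and date pattern are illustrative:

<property>
  <name>oracle.hadoop.loader.targetTable</name>
  <value>HR.EMPLOYEES</value>  <!-- illustrative schema-qualified table name -->
</property>
<property>
  <name>oracle.hadoop.loader.defaultDateFormat</name>
  <value>yyyy-MM-dd HH:mm:ss</value>
</property>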

Manual Mapping For loads that do not comply with the requirements for automatic mapping, you must create a loader map. The loader map enables you to:

■ Load data into a subset of the target table columns.

■ Create explicit mappings when the input field names are not exactly the same as the database column names.

■ Specify different date formats for different input fields.

Use the oracle.hadoop.loader.loaderMapFile configuration property to identify the location of the loader map. Use the oracle.hadoop.loader.defaultDateFormat configuration property to specify a default date format that applies to all DATE fields (optional).

Creating a Loader Map

To create a loader map, you must know what field names are used by your input format, because you must map them to the appropriate column names. A loader map has a very simple structure. It can have as many COLUMN elements as required by the data:

<?xml version="1.0" encoding="UTF-8"?>
<LOADER_MAP>
<SCHEMA>schemaName</SCHEMA>
<TABLE>tableName</TABLE>
<COLUMN field="inputFieldName" format="formatString">columnName</COLUMN>
<COLUMN field="inputFieldName">columnName</COLUMN>
<COLUMN field="inputFieldName">columnName</COLUMN>
.
.
.
</LOADER_MAP>

Element and Attribute Values:

■ schemaName: The database schema where the output table resides. Optional.

■ tableName: The target table.

■ inputFieldName: The field name of the input data from Hadoop. Required.


■ formatString: A format string for interpreting the input data. For example, enter a date format field for interpreting dates. Optional. You can use this attribute to specify different date formats for different input fields. To specify a default date format that applies to all date input fields, use the oracle.hadoop.loader.defaultDateFormat configuration property.

■ columnName: The name of a column in the target table where the input data will be loaded. Required. The XML schema definition for the loader map is in $OLH_HOME/doc/loaderMap.xsd.

Example Loader Map Example 3–3 maps the input data to the columns in a table named EMPLOYEES in the HR schema. Field F3 contains time data.

Example 3–3 Loader Map for HR.EMPLOYEES Target Table

<?xml version="1.0" encoding="UTF-8"?>
<LOADER_MAP>
<SCHEMA>HR</SCHEMA>
<TABLE>EMPLOYEES</TABLE>
<COLUMN field="F0">EMPLOYEE_ID</COLUMN>
<COLUMN field="F1">LAST_NAME</COLUMN>
<COLUMN field="F2">EMAIL</COLUMN>
<COLUMN field="F3" format="MM-dd-yyyy HH:mm:ss">HIRE_DATE</COLUMN>
<COLUMN field="F4">JOB_ID</COLUMN>
</LOADER_MAP>

About Output Formats

In online database mode, you can choose between loading the data directly into an Oracle database table or storing it in a file. In offline database mode, you are restricted to storing the output data in a file, which you can load into the target table as a separate procedure. You specify the output format in the job configuration file using the mapreduce.outputformat.class property.
Choose from these output formats:

■ JDBC Output Format: Loads the data directly into the target table.

■ Oracle OCI Direct Path Output Format: Loads the data directly into the target table.

■ Delimited Text Output Format: Stores the data in a local file.

■ Oracle Data Pump Output Format: Stores the data in a local file.

JDBC Output Format

You can use a JDBC connection between the Hadoop system and Oracle Database to load the data. The output records of the loader job are loaded directly into the target table by map or reduce tasks as part of the OraLoader process, in online database mode. No additional steps are required to load the data.
A JDBC connection must be open between the Hadoop cluster and the Oracle Database system for the duration of the job.
To use this output format, set mapreduce.outputformat.class to
oracle.hadoop.loader.lib.output.JDBCOutputFormat


About JDBCOutputFormat

JDBCOutputFormat uses standard JDBC batching to optimize performance and efficiency. If an error occurs during batch execution, such as a constraint violation, the JDBC driver stops execution immediately. Thus, if there are 100 rows in a batch and the tenth row causes an error, then nine rows are inserted and 91 rows are not.
The JDBC driver does not identify the row that caused the error, and so Oracle Loader for Hadoop does not know the insert status of any of the rows in the batch. It counts all rows in a batch with errors as "in question," that is, the rows may or may not be inserted in the target table. The loader then continues loading the next batch. It generates a load report at the end of the job that details the number of batch errors and the number of rows in question.
One way that you can handle this problem is by defining a unique key in the target table. For example, the HR.EMPLOYEES table has a primary key named EMPLOYEE_ID. After loading the data into HR.EMPLOYEES, you can query it by EMPLOYEE_ID to discover the missing employee IDs. Then you can locate the missing employee IDs in the input data, determine why they failed to load, and try again to load them.

Configuration Properties To control the batch size, set this property: oracle.hadoop.loader.connection.defaultExecuteBatch
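A sketch of a JDBC-based load configuration follows; the batch size shown is illustrative:

<property>
  <name>mapreduce.outputformat.class</name>
  <value>oracle.hadoop.loader.lib.output.JDBCOutputFormat</value>
</property>
<property>
  <name>oracle.hadoop.loader.connection.defaultExecuteBatch</name>
  <value>250</value>  <!-- illustrative batch size; the default is 100 -->
</property>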

Oracle OCI Direct Path Output Format

You can use the direct path interface of Oracle Call Interface (OCI) to load data into the target table. Each reducer loads into a distinct database partition in online database mode, enabling the performance gains of a parallel load. No additional steps are required to load the data.
The OCI connection must be open between the Hadoop cluster and the Oracle Database system for the duration of the job.
To use this output format, set mapreduce.outputformat.class to
oracle.hadoop.loader.lib.output.OCIOutputFormat

About OCIOutputFormat OCIOutputFormat has the following restrictions:

■ It is available only on the Linux x86-64 platform.

■ The MapReduce job must create one or more reducers.

■ The target table must be partitioned.

■ The target table cannot be a composite interval partitioned table in which the subpartition key contains a CHAR, VARCHAR2, NCHAR, or NVARCHAR2 column. The load will fail with an error. Oracle Loader for Hadoop does support composite interval partitions in which the subpartition key does not contain a character-type column.

Configuration Properties To control the size of the direct path stream buffer, set this property: oracle.hadoop.loader.output.dirpathBufsize


Additional Configuration Requirements

If your Hadoop cluster is running the Apache Hadoop distribution or Cloudera's Distribution including Apache Hadoop Version 3 (CDH3), then edit the $HADOOP_HOME/bin/hadoop shell script so that it concatenates new values to the existing JAVA_LIBRARY_PATH value. The Apache hadoop command begins with an empty JAVA_LIBRARY_PATH value and does not import a value from the environment.
If your Hadoop cluster is running CDH4, then the $HADOOP_HOME/bin/hadoop command automatically injects this variable value into the Java system property java.library.path when the job is created. You do not need to edit the hadoop shell script.

Delimited Text Output Format

You can create delimited text output files on the Hadoop cluster. The map or reduce tasks generate delimited text files, using the field delimiters and enclosers that you specify in the job configuration properties. Afterward, you can load the data into an Oracle database as a separate procedure. See "About DelimitedTextOutputFormat" on page 3-18.
This output format can use either an open connection to the Oracle Database system to retrieve the table metadata in online database mode, or a table metadata file generated by the OraLoaderMetadata utility in offline database mode.
To use this output format, set mapreduce.outputformat.class to
oracle.hadoop.loader.lib.output.DelimitedTextOutputFormat

About DelimitedTextOutputFormat

Output tasks generate delimited text format files, one or more corresponding SQL*Loader control files, and SQL scripts for loading with external tables. If the target table is not partitioned or if oracle.hadoop.loader.loadByPartition is false, then DelimitedTextOutputFormat generates the following files:

■ A data file named oraloader-taskId-csv-0.dat.

■ A SQL*Loader control file named oraloader-csv.ctl for the entire job.

■ A SQL script named oraloader-csv.sql to load the delimited text file into the target table. For partitioned tables, multiple output files are created with the following names:

■ Data files: oraloader-taskId-csv-partitionId.dat

■ SQL*Loader control files: oraloader-taskId-csv-partitionId.ctl

■ SQL script: oraloader-csv.sql In the generated file names, taskId is the mapper or reducer identifier, and partitionId is the partition identifier. If the Hadoop cluster is connected to the Oracle Database system, then you can use Oracle SQL Connector for HDFS to load the delimited text data into an Oracle database. See Chapter 2. Alternatively, you can copy the delimited text files to the database system and load the data into the target table in one of the following ways:

■ Use the generated control files to run SQL*Loader and load the data from the delimited text files.


■ Use the generated SQL scripts to perform external table loads. The files are located in the ${map.output.dir}/_olh directory.

Configuration Properties The following properties control the formatting of records and fields in the output files:

■ oracle.hadoop.loader.output.escapeEnclosers

■ oracle.hadoop.loader.output.fieldTerminator

■ oracle.hadoop.loader.output.initialFieldEncloser

■ oracle.hadoop.loader.output.trailingFieldEncloser Example 3–4 shows a sample SQL*Loader control file that might be generated by an output task.

Example 3–4 Sample SQL*Loader Control File

LOAD DATA CHARACTERSET AL32UTF8
INFILE 'oraloader-csv-1-0.dat'
BADFILE 'oraloader-csv-1-0.bad'
DISCARDFILE 'oraloader-csv-1-0.dsc'
INTO TABLE "SCOTT"."CSV_PART" PARTITION(10) APPEND
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(
  "ID" DECIMAL EXTERNAL,
  "NAME" CHAR,
  "DOB" DATE 'SYYYY-MM-DD HH24:MI:SS'
)

Oracle Data Pump Output Format

You can create Data Pump format files on the Hadoop cluster. The map or reduce tasks generate Data Pump files. Afterward, you can load the data into an Oracle database as a separate procedure. See "About DataPumpOutputFormat" on page 3-19.
This output format can use either an open connection to the Oracle Database system in online database mode, or a table metadata file generated by the OraLoaderMetadata utility in offline database mode.
To use this output format, set mapreduce.outputformat.class to
oracle.hadoop.loader.lib.output.DataPumpOutputFormat

About DataPumpOutputFormat

DataPumpOutputFormat generates data files with names in this format:
oraloader-taskId-dp-partitionId.dat
In the generated file names, taskId is the mapper or reducer identifier, and partitionId is the partition identifier.
If the Hadoop cluster is connected to the Oracle Database system, then you can use Oracle SQL Connector for HDFS to load the Data Pump files into an Oracle database. See Chapter 2.
Alternatively, you can copy the Data Pump files to the database system and load them using a SQL script generated by Oracle Loader for Hadoop. The script performs the following tasks:


1. Creates an external table definition using the ORACLE_DATAPUMP access driver. The binary format Oracle Data Pump output files are listed in the LOCATION clause of the external table.
2. Creates a directory object that is used by the external table. You must uncomment this command before running the script. To specify the directory name used in the script, set the oracle.hadoop.loader.extTabDirectoryName property in the job configuration file.
3. Inserts the rows from the external table into the target table. You must uncomment this command before running the script.
The SQL script is located in the ${map.output.dir}/_olh directory.
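For example, the output format and directory object name might be configured as follows; the directory name is illustrative:

<property>
  <name>mapreduce.outputformat.class</name>
  <value>oracle.hadoop.loader.lib.output.DataPumpOutputFormat</value>
</property>
<property>
  <name>oracle.hadoop.loader.extTabDirectoryName</name>
  <value>MY_EXTTAB_DIR</value>  <!-- illustrative directory object name; the default is OLH_EXTTAB_DIR -->
</property>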

See Also:

■ Oracle Database Administrator's Guide for more information about creating and managing external tables

■ Oracle Database Utilities for more information about the ORACLE_ DATAPUMP access driver

Running a Loader Job

To run a job using Oracle Loader for Hadoop, you use the OraLoader utility in a hadoop command. The following is the basic syntax:

bin/hadoop jar $OLH_HOME/jlib/oraloader.jar oracle.hadoop.loader.OraLoader \
-conf job_config.xml \
-libjars input_file_format1.jar[,input_file_format2.jar...]

You can include any generic hadoop command-line option. OraLoader implements the org.apache.hadoop.util.Tool interface and follows the standard Hadoop methods for building MapReduce applications.

Basic Options

-conf job_config.xml Identifies the job configuration file. See "Creating a Job Configuration File" on page 3-6.

-libjars Identifies the JAR files for the input format.

■ When using the example input format, specify $OLH_HOME/jlib/oraloader-examples.jar.

■ When using the Hive or Oracle NoSQL Database input formats, you must specify additional JAR files, as described later in this section.

■ When using a custom input format, specify its JAR file. (Also remember to add it to HADOOP_CLASSPATH.) Separate multiple file names with commas, and list each one explicitly. Wildcard characters are not allowed. Oracle Loader for Hadoop prepares internal configuration information for the MapReduce tasks. It stores table metadata information and the dependent Java libraries in the distributed cache, so that they are available to the MapReduce tasks throughout the cluster.


Example of Running OraLoader The following example uses a built-in input format and a job configuration file named MyConf.xml: HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$OLH_HOME/jlib/*"

bin/hadoop jar $OLH_HOME/jlib/oraloader.jar oracle.hadoop.loader.OraLoader \ -conf MyConf.xml -libjars $OLH_HOME/jlib/oraloader-examples.jar

See Also:

■ For the full hadoop command syntax: http://hadoop.apache.org/docs/r1.1.1/commands_ manual.html#Overview

■ For details about the generic options: http://hadoop.apache.org/docs/r1.1.1/api/org/apache/hadoo p/util/GenericOptionsParser.html

Specifying Hive Input Format JAR Files

When using HiveToAvroInputFormat, you must add the Hive configuration directory to the HADOOP_CLASSPATH environment variable:

HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$OLH_HOME/jlib/*:hive_home/lib/*:hive_conf_dir"

You must also add the following Hive JAR files, in a comma-separated list, to the -libjars option of the hadoop command. Replace the stars (*) with the complete file names on your system:

■ hive-exec-*.jar

■ hive-metastore-*.jar

■ libfb303*.jar

This example shows the full file names in Cloudera's Distribution including Apache Hadoop (CDH) 4.1.2:

# bin/hadoop jar $OLH_HOME/jlib/oraloader.jar oracle.hadoop.loader.OraLoader \
-conf MyConf.xml \
-libjars hive-exec-0.9.0-cdh4.1.2.jar,hive-metastore-0.9.0-cdh4.1.2.jar,libfb303-0.7.0.jar

Specifying Oracle NoSQL Database Input Format JAR Files When using KVAvroInputFormat from Oracle NoSQL Database 11g, Release 2, you must include $KVHOME/lib/kvstore.jar in your HADOOP_CLASSPATH and you must include the -libjars option in the hadoop command: bin/hadoop jar $OLH_HOME/jlib/oraloader.jar oracle.hadoop.loader.OraLoader \ -conf MyConf.xml \ -libjars $KVHOME/lib/kvstore.jar

Job Reporting

Oracle Loader for Hadoop consolidates reporting information from individual tasks into a file named ${map.output.dir}/_olh/oraloader-report.txt. Among other


statistics, the report shows the number of errors, broken out by type and task, for each mapper and reducer.

Handling Rejected Records

Oracle Loader for Hadoop may reject input records for a variety of reasons, such as:

■ Errors in the loader map file

■ Missing fields in the input data

■ Records mapped to invalid table partitions

■ Badly formed records, such as dates that do not match date format or records that do not match regular expression patterns

Logging Rejected Records in Bad Files By default, Oracle Loader for Hadoop does not log the rejected records into Hadoop logs; it only logs information on how to identify the rejected records. This practice prevents user-sensitive information from being stored in Hadoop logs across the cluster. You can direct Oracle Loader for Hadoop to log rejected records by setting the oracle.hadoop.loader.logBadRecords configuration property to true. Then Oracle Loader for Hadoop logs bad records into one or more "bad" files in the _olh/ directory under the job output directory.

Setting a Job Reject Limit

Problems like using the wrong loader map file can cause Oracle Loader for Hadoop to reject every record in the input. To mitigate the loss of time and resources, Oracle Loader for Hadoop aborts the job after rejecting 1000 records.
You can change the maximum number of rejected records allowed by setting the oracle.hadoop.loader.rejectLimit configuration property. A negative value turns off the reject limit and allows the job to run to completion regardless of the number of rejected records.
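For example, to log the rejected records and allow up to 10,000 of them before the job aborts, you might add properties like these; the values are illustrative:

<property>
  <name>oracle.hadoop.loader.logBadRecords</name>
  <value>true</value>
</property>
<property>
  <name>oracle.hadoop.loader.rejectLimit</name>
  <value>10000</value>  <!-- illustrative limit; a negative value disables the limit -->
</property>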

Balancing Loads When Loading Data into Partitioned Tables

The goal of load balancing is to generate a MapReduce partitioning scheme that assigns approximately the same amount of work to all reducers.
The sampling feature of Oracle Loader for Hadoop balances loads across reducers when data is loaded into a partitioned database table. It generates an efficient MapReduce partitioning scheme that assigns database partitions to the reducers.
The execution time of a reducer is usually proportional to the number of records that it processes; the more records, the longer the execution time. When sampling is disabled, all records from a given database partition are sent to one reducer. This can result in unbalanced reducer loads, because some database partitions may have more records than others. Because the execution time of a Hadoop job is usually dominated by the execution time of its slowest reducer, unbalanced reducer loads slow down the entire job.


Using the Sampling Feature You can turn the sampling feature on or off by setting the oracle.hadoop.loader.sampler.enableSampling configuration property. Sampling is turned on by default.

Tuning Load Balancing These job configuration properties control the quality of load balancing:

■ oracle.hadoop.loader.sampler.maxLoadFactor

■ oracle.hadoop.loader.sampler.loadCI

The sampler uses the expected reducer load factor to evaluate the quality of its partitioning scheme. The load factor is the relative overload for each reducer, calculated as (assigned_load - ideal_load)/ideal_load. This metric indicates how much a reducer's load deviates from a perfectly balanced reducer load. A load factor of 0 indicates a perfectly balanced load (no overload), and small load factors indicate better load balancing.
The maxLoadFactor default of 0.05 means that no reducer is ever overloaded by more than 5%. The sampler guarantees this maxLoadFactor with a statistical confidence level determined by the value of loadCI. The default value of loadCI is 0.95, which means that any reducer's load factor exceeds maxLoadFactor in only 5% of the cases.
There is a trade-off between the execution time of the sampler and the quality of load balancing. Lower values of maxLoadFactor and higher values of loadCI achieve more balanced reducer loads at the expense of longer sampling times. The default values of maxLoadFactor=0.05 and loadCI=0.95 are a good trade-off between load balancing quality and execution time.

Tuning Sampling Behavior

By default, the sampler runs until it collects just enough samples to generate a partitioning scheme that satisfies the maxLoadFactor and loadCI criteria.
However, you can limit the sampler running time by setting the oracle.hadoop.loader.sampler.maxSamplesPct property, which specifies the maximum number of sampled records.
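The following snippet sketches how the sampler properties might be tuned; the values shown are examples only, not recommendations:

<property>
  <name>oracle.hadoop.loader.sampler.enableSampling</name>
  <value>true</value>  <!-- sampling is on by default -->
</property>
<property>
  <name>oracle.hadoop.loader.sampler.maxLoadFactor</name>
  <value>0.10</value>  <!-- example value; the default is 0.05 -->
</property>
<property>
  <name>oracle.hadoop.loader.sampler.loadCI</name>
  <value>0.90</value>  <!-- example value; the default is 0.95 -->
</property>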

When Does Oracle Loader for Hadoop Use the Sampler's Partitioning Scheme?

Oracle Loader for Hadoop uses the generated partitioning scheme only if sampling is successful. A sampling is successful if it generates a partitioning scheme whose maximum reducer load does not exceed (1 + maxLoadFactor) times the ideal load, guaranteed at a statistical confidence level of loadCI.
The default values of maxLoadFactor, loadCI, and maxSamplesPct allow the sampler to successfully generate high-quality partitioning schemes for a variety of different input data distributions. However, the sampler might be unsuccessful in generating a partitioning scheme using custom property values, such as when the constraints are too rigid or the number of required samples exceeds the user-specified maximum of maxSamplesPct. In these cases, Oracle Loader for Hadoop generates a log message identifying the problem, partitions the records using the database partitioning scheme, and does not guarantee load balancing.
Alternatively, you can reset the configuration properties to less rigid values. Either increase maxSamplesPct, or decrease maxLoadFactor or loadCI, or both.


Resolving Memory Issues A custom input format may return input splits that do not fit in memory. If this happens, the sampler returns an out-of-memory error on the client node where the loader job is submitted. To resolve this problem:

■ Increase the heap size of the JVM where the job is submitted.

■ Adjust the following properties: – oracle.hadoop.loader.sampler.hintMaxSplitSize – oracle.hadoop.loader.sampler.hintNumMapTasks If you are developing a custom input format, then see "Custom Input Formats" on page 3-13.

What Happens When a Sampling Feature Property Has an Invalid Value? If any configuration properties of the sampling feature are set to values outside the accepted range, an exception is not returned. Instead, the sampler prints a warning message, resets the property to its default value, and continues executing.

Optimizing Communications Between Oracle Engineered Systems

If you are using Oracle Loader for Hadoop to load data from Oracle Big Data Appliance to Oracle Exadata Database Machine, then you can increase throughput by configuring the systems to use Sockets Direct Protocol (SDP) over the InfiniBand private network. This setup provides an additional connection attribute whose sole purpose is serving connections to Oracle Database to load data.
To specify SDP protocol:
1. Add JVM options to the HADOOP_OPTS environment variable to enable JDBC SDP export:

HADOOP_OPTS="-Doracle.net.SDP=true -Djava.net.preferIPv4Stack=true"

2. Configure the Oracle listener on Exadata with a specific port address for SDP (such as 1522). In the job configuration file, set oracle.hadoop.loader.connection.url to SDP using this syntax:

jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=SDP)
  (HOST=hostName) (PORT=portNumber))
  (CONNECT_DATA=(SERVICE_NAME=serviceName)))

Replace hostName, portNumber, and serviceName with the appropriate values to identify Oracle Database on your Oracle Exadata Database Machine.
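Assuming a hypothetical listener host, SDP port, and service name, the resulting connection property might look like this:

<property>
  <name>oracle.hadoop.loader.connection.url</name>
  <value>jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=SDP)(HOST=exadb01.example.com)(PORT=1522))(CONNECT_DATA=(SERVICE_NAME=orcl.example.com)))</value>  <!-- hypothetical host, port, and service name -->
</property>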

See Also: Oracle Big Data Appliance Software User's Guide for more information about configuring communications over InfiniBand

Oracle Loader for Hadoop Configuration Property Reference

OraLoader uses the standard methods of specifying configuration properties in the hadoop command. You can use the -conf option to identify configuration files, and the -D option to specify individual properties. See "Running a Loader Job" on page 3-20.


This section describes the OraLoader configuration properties and a few generic Hadoop MapReduce properties that you typically must set for an OraLoader job. A configuration file showing all OraLoader properties is in $OLH_HOME/doc/oraloader-conf.xml.

See Also: Hadoop documentation for job configuration files at http://wiki.apache.org/hadoop/JobConfFile

MapReduce Configuration Properties

mapred.job.name
Type: String
Default Value: OraLoader
Description: The Hadoop job name. A unique name can help you monitor the job using tools such as the Hadoop JobTracker web interface and Cloudera Manager.

mapred.input.dir
Type: String
Default Value: Not defined
Description: A comma-separated list of input directories.

mapreduce.inputformat.class
Type: String
Default Value: Not defined
Description: Identifies the format of the input data. You can enter one of the following built-in input formats, or the name of a custom InputFormat class:

■ oracle.hadoop.loader.lib.input.AvroInputFormat

■ oracle.hadoop.loader.lib.input.DelimitedTextInputFormat

■ oracle.hadoop.loader.lib.input.HiveToAvroInputFormat

■ oracle.hadoop.loader.lib.input.RegexInputFormat

■ oracle.kv.hadoop.KVAvroInputFormat

See "About Input Formats" on page 3-10 for descriptions of the built-in input formats.

mapred.output.dir
Type: String
Default Value: Not defined
Description: A comma-separated list of output directories, which cannot exist before the job runs. Required.

mapreduce.outputformat.class
Type: String
Default Value: Not defined
Description: Identifies the output type. The values can be:

■ oracle.hadoop.loader.lib.output.DelimitedTextOutputFormat Writes data records to delimited text format files such as comma-separated values (CSV) format files.


■ oracle.hadoop.loader.lib.output.JDBCOutputFormat Inserts data records into the target table using JDBC.

■ oracle.hadoop.loader.lib.output.OCIOutputFormat Inserts rows into the target table using the Oracle OCI Direct Path interface.

■ oracle.hadoop.loader.lib.output.DataPumpOutputFormat Writes rows into binary format files that can be loaded into the target table using an external table. See "About Output Formats" on page 3-16.

OraLoader Configuration Properties

oracle.hadoop.loader.badRecordFlushInterval Type: Integer Default Value: 500 Description: Sets the maximum number of records that a task attempt can log before flushing the log file. This setting limits the number of records that can be lost when the record reject limit (oracle.hadoop.loader.rejectLimit) is reached and the job stops running. The oracle.hadoop.loader.logBadRecords property must be set to true for a flush interval to take effect.

oracle.hadoop.loader.compressionFactors
Type: Decimal
Default Value: BASIC=5.0,OLTP=5.0,QUERY_LOW=10.0,QUERY_HIGH=10.0,ARCHIVE_LOW=10.0,ARCHIVE_HIGH=10.0
Description: Defines the compression factor for different types of compression. The value is a comma-delimited list of name=value pairs. The name can be one of the following keywords:
ARCHIVE_HIGH
ARCHIVE_LOW
BASIC
OLTP
QUERY_HIGH
QUERY_LOW

oracle.hadoop.loader.connection.defaultExecuteBatch Type: Integer Default Value: 100 Description: The number of records inserted in one trip to the database. It applies only to JDBCOutputFormat and OCIOutputFormat. Specify a value greater than or equal to 1. Although the maximum value is unlimited, very large batch sizes are not recommended because they result in a large memory footprint without much increase in performance. A value less than 1 sets the property to the default value.

oracle.hadoop.loader.connection.oci_url Type: String Default Value: Value of oracle.hadoop.loader.connection.url


Description: The database connection string used by OCIOutputFormat. This property enables the OCI client to connect to the database using different connection parameters than the JDBC connection URL. The following example specifies Socket Direct Protocol (SDP) for OCI connections. (DESCRIPTION=(ADDRESS_LIST= (ADDRESS=(PROTOCOL=SDP)(HOST=myhost)(PORT=1521))) (CONNECT_DATA=(SERVICE_NAME=my_db_service_name)))

This connection string does not require a "jdbc:oracle:thin:@" prefix. All characters up to and including the first at-sign (@) are removed.

oracle.hadoop.loader.connection.password
Type: String
Default Value: Not defined
Description: Password for the connecting user. Oracle recommends that you do not store your password in clear text. Use an Oracle wallet instead.

oracle.hadoop.loader.connection.sessionTimeZone
Type: String
Default Value: LOCAL
Description: Alters the session time zone for database connections. Valid values are:

■ [+|-]hh:mm: Hours and minutes before or after Coordinated Universal Time (UTC), such as -5:00 for Eastern Standard Time

■ LOCAL: The default time zone of the JVM

■ time_zone_region: A valid JVM time zone region, such as EST (for Eastern Standard Time) or America/New_York

This property also determines the default time zone for input data that is loaded into the following database column types: TIMESTAMP, TIMESTAMP WITH TIME ZONE, and TIMESTAMP WITH LOCAL TIME ZONE.

oracle.hadoop.loader.connection.tns_admin
Type: String
Default Value: Not defined
Description: File path to a directory on each node of the Hadoop cluster, which contains SQL*Net configuration files such as sqlnet.ora and tnsnames.ora. Set this property so that you can use TNS entry names in database connection strings. You must set this property when using an Oracle wallet as an external password store. See oracle.hadoop.loader.connection.wallet_location.

oracle.hadoop.loader.connection.tnsEntryName
Type: String
Default Value: Not defined
Description: A TNS entry name defined in the tnsnames.ora file. Use this property with oracle.hadoop.loader.connection.tns_admin.

oracle.hadoop.loader.connection.url
Type: String
Default Value: Not defined


Description: The URL of the database connection. This property overrides all other connection properties. If an Oracle wallet is configured as an external password store, then the property value must start with the jdbc:oracle:thin:@ driver prefix, and the database connection string must exactly match the credential in the wallet. See oracle.hadoop.loader.connection.wallet_location. The following examples show valid values of connection URLs:

■ Oracle Net Format: jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=myhost)(PORT=1521))) (CONNECT_DATA=(SERVICE_NAME=example_service_name)))

■ Oracle Net Format for InfiniBand: jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=SDP) (HOST=myhost) (PORT=1522)) (CONNECT_DATA=(SERVICE_NAME=example_service_name)))

■ TNS Entry Format: jdbc:oracle:thin:@myTNSEntryName

■ Thin Style: jdbc:oracle:thin:@//myhost:1521/my_db_service_name

jdbc:oracle:thin:user/password@//myhost:1521/my_db_service_name

oracle.hadoop.loader.connection.user Type: String Default Value: Not defined Description: A database user name. When using online database mode, you must set either this property or oracle.hadoop.loader.connection.wallet_location.

oracle.hadoop.loader.connection.wallet_location Type: String Default Value: Not defined Description: File path to an Oracle wallet directory on each node of the Hadoop cluster, where the connection credentials are stored. When using an Oracle wallet, you must also set the following properties:

■ oracle.hadoop.loader.connection.tns_admin

■ oracle.hadoop.loader.connection.url or oracle.hadoop.loader.connection.tnsEntryName

oracle.hadoop.loader.defaultDateFormat
Type: String
Default Value: yyyy-MM-dd HH:mm:ss
Description: Parses an input field into a DATE column using a java.text.SimpleDateFormat pattern and the default locale. If the input file requires different patterns for different fields, then use a loader map file. See "Manual Mapping" on page 3-15.

oracle.hadoop.loader.enableSorting
Type: Boolean
Default Value: true
Description: Controls whether output records within each reducer group are sorted. Use the oracle.hadoop.loader.sortKey property to identify the columns of the target table to sort by. Otherwise, Oracle Loader for Hadoop sorts the records by the primary key.

oracle.hadoop.loader.extTabDirectoryName
Type: String
Default Value: OLH_EXTTAB_DIR
Description: The name of the database directory object for the external table LOCATION data files. Oracle Loader for Hadoop does not copy data files into this directory; the file output formats generate a SQL file containing external table DDL, where the directory name appears. This property applies only to DelimitedTextOutputFormat and DataPumpOutputFormat.

oracle.hadoop.loader.input.fieldNames
Type: String
Default Value: F0,F1,F2,...
Description: A comma-delimited list of names for the input fields. For the built-in input formats, specify names for all fields in the data, not just the fields of interest. If an input line has more fields than this property has field names, then the extra fields are discarded. If a line has fewer fields than this property has field names, then the extra fields are set to null. See "Mapping Input Fields to Target Table Columns" on page 3-14 for loading only selected fields. The names are used to create the Avro schema for the record, so they must be valid JSON name strings.

oracle.hadoop.loader.input.fieldTerminator
Type: String
Default Value: , (comma)
Description: A character that indicates the end of an input field for DelimitedTextInputFormat. The value can be either a single character or \uHHHH, where HHHH is the character's UTF-16 encoding.

oracle.hadoop.loader.input.hive.databaseName
Type: String
Default Value: Not defined
Description: The name of the Hive database where the input table is stored.

oracle.hadoop.loader.input.hive.tableName
Type: String
Default Value: Not defined
Description: The name of the Hive table where the input data is stored.

oracle.hadoop.loader.input.initialFieldEncloser
Type: String


Default Value: Not defined Description: A character that indicates the beginning of a field. The value can be either a single character or \uHHHH, where HHHH is the character's UTF-16 encoding. To restore the default setting (no encloser), enter a zero-length value. A field encloser cannot equal the terminator or white-space character defined for the input format. When this property is set, the parser attempts to read each field as an enclosed token (value) before reading it as an unenclosed token. If the field enclosers are not set, then the parser reads each field as an unenclosed token. If you set this property but not oracle.hadoop.loader.input.trailingFieldEncloser, then the same value is used for both properties.

oracle.hadoop.loader.input.regexCaseInsensitive
Type: Boolean
Default Value: false
Description: Controls whether pattern matching is case-sensitive. Set to true to ignore case, so that "string" matches "String", "STRING", "string", "StRiNg", and so forth. By default, "string" matches only "string". This property is the same as the input.regex.case.insensitive property of org.apache.hadoop.hive.contrib.serde2.RegexSerDe.

oracle.hadoop.loader.input.regexPattern Type: Text Default Value: Not defined Description: The pattern string for a regular expression. The regular expression must match each text line in its entirety. For example, a correct regex pattern for input line "a,b,c," is "([^,]*),([^,]*),([^,]*),". However, "([^,]*)," is invalid, because the expression is not applied repeatedly to a line of input text. RegexInputFormat uses the capturing groups of regular expression matching as fields. The special group zero is ignored because it stands for the entire input line. This property is the same as the input.regex property of org.apache.hadoop.hive.contrib.serde2.RegexSerDe.

See Also: For descriptions of regular expressions and capturing groups, the entry for java.util.regex in the Java Platform Standard Edition 6 API Specification at
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html

oracle.hadoop.loader.input.trailingFieldEncloser Type: String Default Value: The value of oracle.hadoop.loader.input.initialFieldEncloser Description: Identifies a character that marks the end of a field. The value can be either a single character or \uHHHH, where HHHH is the character's UTF-16 encoding. For no trailing encloser, enter a zero-length value. A field encloser cannot be the terminator or a white-space character defined for the input format.


If the trailing field encloser character is embedded in an input field, then the character must be doubled up to be parsed as literal text. For example, an input field must have '' (two single quotes) to load ' (one single quote). If you set this property, then you must also set oracle.hadoop.loader.input.initialFieldEncloser.

oracle.hadoop.loader.libjars
Type: String
Default Value: ${oracle.hadoop.loader.olh_home}/jlib/ojdbc6.jar, ${oracle.hadoop.loader.olh_home}/jlib/orai18n.jar, ${oracle.hadoop.loader.olh_home}/jlib/orai18n-utility.jar, ${oracle.hadoop.loader.olh_home}/jlib/orai18n-mapping.jar, ${oracle.hadoop.loader.olh_home}/jlib/orai18n-collation.jar, ${oracle.hadoop.loader.olh_home}/jlib/oraclepki.jar, ${oracle.hadoop.loader.olh_home}/jlib/osdt_cert.jar, ${oracle.hadoop.loader.olh_home}/jlib/osdt_core.jar, ${oracle.hadoop.loader.olh_home}/jlib/commons-math-2.2.jar, ${oracle.hadoop.loader.olh_home}/jlib/jackson-core-asl-1.8.8.jar, ${oracle.hadoop.loader.olh_home}/jlib/jackson-mapper-asl-1.8.8.jar, ${oracle.hadoop.loader.olh_home}/jlib/avro-1.6.3.jar, ${oracle.hadoop.loader.olh_home}/jlib/avro-mapred-1.6.3.jar

Description: A comma-delimited list of JAR files, which is appended to the value of the hadoop -libjars command-line argument. You can add your application JAR files to the CLASSPATH by using this property either instead of, or with, the -libjars option to the hadoop command. A leading comma or consecutive commas are invalid.

oracle.hadoop.loader.loadByPartition
Type: Boolean
Default Value: true
Description: Specifies a partition-aware load. Oracle Loader for Hadoop organizes the output by partition for all output formats on the Hadoop cluster; this task does not impact the resources of the database system. DelimitedTextOutputFormat and DataPumpOutputFormat generate multiple files, and each file contains the records from one partition. For DelimitedTextOutputFormat, this property also controls whether the PARTITION keyword appears in the generated control files for SQL*Loader. OCIOutputFormat requires partitioned tables. If you set this property to false, then OCIOutputFormat turns it back on. For the other output formats, you can set loadByPartition to false, so that Oracle Loader for Hadoop handles a partitioned table as if it were unpartitioned.

oracle.hadoop.loader.loaderMapFile
Type: String
Default Value: Not defined
Description: Path to the loader map file. Use the file:// syntax to specify a local file, for example:
file:///home/jdoe/loadermap.xml


oracle.hadoop.loader.logBadRecords Type: Boolean Default Value: false Description: Controls whether Oracle Loader for Hadoop logs bad records to a file. This property applies only to records rejected by input formats and mappers. It does not apply to errors encountered by the output formats or by the sampling feature.

oracle.hadoop.loader.log4j.propertyPrefix
Type: String
Default Value: log4j.logger.oracle.hadoop.loader
Description: Identifies the prefix used in Apache log4j properties loaded from its configuration file. Oracle Loader for Hadoop enables you to specify log4j properties in the hadoop command using the -conf and -D options. For example:
-D log4j.logger.oracle.hadoop.loader.OraLoader=DEBUG
-D log4j.logger.oracle.hadoop.loader.metadata=INFO

All configuration properties starting with this prefix are loaded into log4j. They override the settings for the same properties that log4j loaded from ${log4j.configuration}. The overrides apply to the Oracle Loader for Hadoop job driver, and its map and reduce tasks. The configuration properties are copied to log4j with RAW values; any variable expansion is done in the context of log4j. Any configuration variables to be used in the expansion must also start with this prefix.

oracle.hadoop.loader.olh_home Type: String Default Value: Value of the OLH_HOME environment variable Description: The path of the Oracle Loader for Hadoop home directory on the node where you start the OraLoader job. This path identifies the location of the required libraries.

oracle.hadoop.loader.olhcachePath Type: String Default Value: ${mapred.output.dir}/.../olhcache Description: Identifies the full path to an HDFS directory where Oracle Loader for Hadoop can create files that are loaded into the MapReduce distributed cache. The distributed cache is a facility for caching large, application-specific files and distributing them efficiently across the nodes in a cluster.

See Also: The description of org.apache.hadoop.filecache.DistributedCache in the Java documentation at http://hadoop.apache.org/

oracle.hadoop.loader.output.dirpathBufsize Type: Integer Default Value: 131072 (128 KB)


Description: Sets the size in bytes of the direct path stream buffer for OCIOutputFormat. Values are rounded up to the next multiple of 8 KB.

oracle.hadoop.loader.output.escapeEnclosers Type: Boolean Default Value: false Description: Controls whether the embedded trailing encloser character is handled as literal text (that is, escaped). Set this property to true when a field may contain the trailing enclosure character as part of the data value. See oracle.hadoop.loader.output.trailingFieldEncloser.

oracle.hadoop.loader.output.fieldTerminator Type: String Default Value: , (comma) Description: A character that indicates the end of an output field for DelimitedTextOutputFormat. The value can be either a single character or \uHHHH, where HHHH is the character's UTF-16 encoding.

oracle.hadoop.loader.output.granuleSize Type: Integer Default Value: 10240000 Description: The granule size in bytes for generated Data Pump files. A granule determines the work load for a parallel process (PQ slave) when loading a file through the ORACLE_DATAPUMP access driver.

See Also: Oracle Database Utilities for more information about the ORACLE_DATAPUMP access driver.

oracle.hadoop.loader.output.initialFieldEncloser Type: String Default Value: Not defined Description: A character generated in the output to identify the beginning of a field. The value must be either a single character or \uHHHH, where HHHH is the character's UTF-16 encoding. A zero-length value means that no enclosers are generated in the output (default value). Use this property when a field may contain the value of oracle.hadoop.loader.output.fieldTerminator. If a field may also contain the value of oracle.hadoop.loader.output.trailingFieldEncloser, then set oracle.hadoop.loader.output.escapeEnclosers to true. If you set this property, then you must also set oracle.hadoop.loader.output.trailingFieldEncloser.

oracle.hadoop.loader.output.trailingFieldEncloser Type: String Default Value: Value of oracle.hadoop.loader.output.initialFieldEncloser Description: A character generated in the output to identify the end of a field. The value must be either a single character or \uHHHH, where HHHH is the character's UTF-16 encoding. A zero-length value means that there are no enclosers (default value).


Use this property when a field may contain the value of oracle.hadoop.loader.output.fieldTerminator. If a field may also contain the value of oracle.hadoop.loader.output.trailingFieldEncloser, then set oracle.hadoop.loader.output.escapeEnclosers to true. If you set this property, then you must also set oracle.hadoop.loader.output.initialFieldEncloser.
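For example, to generate comma-delimited output in which every field is enclosed in double quotation marks and embedded quotation marks are escaped, the output properties might be combined as follows. This is only a sketch; the values are illustrative, and \u0022 is the UTF-16 encoding of the double quotation mark:

-D oracle.hadoop.loader.output.fieldTerminator=,
-D oracle.hadoop.loader.output.initialFieldEncloser=\u0022
-D oracle.hadoop.loader.output.trailingFieldEncloser=\u0022
-D oracle.hadoop.loader.output.escapeEnclosers=true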

oracle.hadoop.loader.rejectLimit Type: Integer Default Value: 1000 Description: The maximum number of rejected or skipped records allowed before the job stops running. A negative value turns off the reject limit and allows the job to run to completion. If mapred.map.tasks.speculative.execution is true (the default), then the number of rejected records may be inflated temporarily, causing the job to stop prematurely. Input format errors do not count toward the reject limit because they are fatal and cause the map task to stop. Errors encountered by the sampling feature or the online output formats do not count toward the reject limit either.
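For example, the following settings (a sketch only) remove the reject limit and disable speculative execution of map tasks, so that rejected-record counts are not temporarily inflated:

-D oracle.hadoop.loader.rejectLimit=-1
-D mapred.map.tasks.speculative.execution=false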

oracle.hadoop.loader.sampler.enableSampling Type: Boolean Default Value: true Description: Controls whether the sampling feature is enabled. Set this property to false to disable sampling. Even when enableSampling is set to true, the loader automatically disables sampling if it is unnecessary, or if the loader determines that a good sample cannot be made. For example, the loader disables sampling if the table is not partitioned, the number of reducer tasks is less than two, or there is too little input data to compute a good load balance. In these cases, the loader returns an informational message.

oracle.hadoop.loader.sampler.hintMaxSplitSize Type: Integer Default Value: 1048576 (1 MB) Description: Sets the Hadoop mapred.max.split.size property for the sampling process; the value of mapred.max.split.size does not change for the job configuration. A value less than 1 is ignored. Some input formats (such as FileInputFormat) use this property as a hint to determine the number of splits returned by getSplits. Smaller values imply that more chunks of data are sampled at random, which results in a better sample.

Increase this value for data sets with tens of terabytes of data, or if the input format getSplits method throws an out-of-memory error. Although large splits are better for I/O performance, they are not necessarily better for sampling. Set this value small enough for good sampling performance, but no smaller. Extremely small values can cause inefficient I/O performance, and can cause getSplits to run out of memory by returning too many splits.


The getSplits method of org.apache.hadoop.mapreduce.lib.input.FileInputFormat always returns splits at least as large as the minimum split size setting, regardless of the value of this property.

oracle.hadoop.loader.sampler.hintNumMapTasks Type: Integer Default Value: 100 Description: Sets the value of the Hadoop mapred.map.tasks configuration property for the sampling process; the value of mapred.map.tasks does not change for the job configuration. A value less than 1 is ignored. Some input formats (such as DBInputFormat) use this property as a hint to determine the number of splits returned by the getSplits method. Higher values imply that more chunks of data are sampled at random, which results in a better sample. Increase this value for data sets with more than a million rows, but remember that extremely large values can cause getSplits to run out of memory by returning too many splits.

oracle.hadoop.loader.sampler.loadCI Type: Decimal Default Value: 0.95 Description: The statistical confidence indicator for the maximum reducer load factor. This property accepts values greater than or equal to 0.5 and less than 1 (0.5 <= value < 1). A value less than 0.5 resets the property to the default value. Typical values are 0.90, 0.95, and 0.99. See oracle.hadoop.loader.sampler.maxLoadFactor.

oracle.hadoop.loader.sampler.maxHeapBytes Type: Integer Default Value: -1 Description: Specifies in bytes the maximum amount of memory available to the sampler. Sampling stops when one of these conditions is true:

■ The sampler has collected the minimum number of samples required for load balancing.

■ The percent of sampled data exceeds oracle.hadoop.loader.sampler.maxSamplesPct.

■ The number of sampled bytes exceeds oracle.hadoop.loader.sampler.maxHeapBytes. This condition is not imposed when the property is set to a negative value.

oracle.hadoop.loader.sampler.maxLoadFactor Type: Float Default Value: 0.05 (5%) Description: The maximum acceptable load factor for a reducer. A value of 0.05 indicates that reducers can be assigned up to 5% more data than their ideal load. This property accepts values greater than 0. A value less than or equal to 0 resets the property to the default value. Typical values are 0.05 and 0.1.


In a perfectly balanced load, every reducer is assigned an equal amount of work (or load). The load factor is the relative overload for each reducer, calculated as (assigned_load - ideal_load)/ideal_load. If load balancing is successful, the job runs within the maximum load factor at the specified confidence. See oracle.hadoop.loader.sampler.loadCI.

oracle.hadoop.loader.sampler.maxSamplesPct Type: Float Default Value: 0.01 (1%) Description: Sets the maximum sample size as a fraction of the number of records in the input data. A value of 0.05 indicates that the sampler never samples more than 5% of the total number of records. This property accepts a range of 0 to 1 (0% to 100%). A negative value disables it. Sampling stops when one of these conditions is true:

■ The sampler has collected the minimum number of samples required for load balancing, which can be fewer than the number set by this property.

■ The percent of sampled data exceeds oracle.hadoop.loader.sampler.maxSamplesPct.

■ The number of sampled bytes exceeds oracle.hadoop.loader.sampler.maxHeapBytes. This condition is not imposed when the property is set to a negative value.

oracle.hadoop.loader.sampler.minSplits Type: Integer Default Value: 5 Description: The minimum number of input splits that the sampler reads from before it makes any evaluation of the stopping condition. If the total number of input splits is less than minSplits, then the sampler reads from all the input splits. A number less than or equal to 0 is the same as a value of 1.

oracle.hadoop.loader.sampler.numThreads Type: Integer Default Value: 5 Description: The number of sampler threads. A higher number of threads allows higher concurrency in sampling. A value of 1 disables multithreading for the sampler. Set the value based on the processor and memory resources available on the node where you start the Oracle Loader for Hadoop job.
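The sampler properties are ordinary job configuration properties, so they can be set per job on the hadoop command line or in a configuration file. The following -D settings are purely illustrative values, not recommendations:

-D oracle.hadoop.loader.sampler.enableSampling=true
-D oracle.hadoop.loader.sampler.maxSamplesPct=0.05
-D oracle.hadoop.loader.sampler.hintNumMapTasks=200
-D oracle.hadoop.loader.sampler.numThreads=8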

oracle.hadoop.loader.sharedLibs Type: String Default Value: ${oracle.hadoop.loader.olh_home}/lib/libolh11.so, ${oracle.hadoop.loader.olh_home}/lib/libclntsh.so.11.1, ${oracle.hadoop.loader.olh_home}/lib/libnnz11.so, ${oracle.hadoop.loader.olh_home}/lib/libociei.so


Description: A comma-delimited list of library files, which is appended to the value of the -files command-line argument. You can copy your application files to the Hadoop cluster by using this property either instead of, or with, the -files option. A leading comma or consecutive commas are invalid.

oracle.hadoop.loader.sortKey Type: String Default Value: Not defined Description: A comma-delimited list of column names that forms a key for sorting output records within a reducer group. The column names can be quoted or unquoted identifiers:

■ A quoted identifier begins and ends with double quotation marks (").

■ An unquoted identifier is converted to uppercase before use.

oracle.hadoop.loader.tableMetadataFile Type: String Default Value: Not defined Description: Path to the target table metadata file. Set this property when running in offline database mode. Use the file:// syntax to specify a local file, for example: file:///home/jdoe/metadata.xml

To create the table metadata file, run the OraLoaderMetadata utility. See "OraLoaderMetadata Utility" on page 3-9.

oracle.hadoop.loader.targetTable Type: String Default Value: Not defined Description: A schema-qualified name for the table to be loaded. Use this option to load all columns of the table when the names of the input fields match the column names. If you do not qualify the table name with the schema, then Oracle Loader for Hadoop uses the schema of the connection user. For each column in the target table, the loader attempts to discover an input field with the same name as the column. The values from the matching field are loaded into the column. This property takes priority over the oracle.hadoop.loader.loaderMapFile property.
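For example, a job running in offline database mode might identify the target table and the metadata file generated by the OraLoaderMetadata utility as follows. This is a sketch; the schema, table, and file names are hypothetical:

-D oracle.hadoop.loader.targetTable=HR.EMPLOYEES
-D oracle.hadoop.loader.tableMetadataFile=file:///home/jdoe/metadata.xml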

Third-Party Licenses for Bundled Software

Oracle Loader for Hadoop installs the following third-party products:

■ Apache Avro

■ Apache Commons Mathematics Library

■ Jackson JSON Processor

Oracle Loader for Hadoop includes Oracle 11g Release 2 (11.2) client libraries. For information about third-party products included with Oracle Database 11g Release 2 (11.2), refer to Oracle Database Licensing Information.


Unless otherwise specifically noted, or as required under the terms of the third party license (e.g., LGPL), the licenses and statements herein, including all statements regarding Apache-licensed code, are intended as notices only.

Apache Licensed Code

The following is included as a notice in compliance with the terms of the Apache 2.0 License, and applies to all programs licensed under the Apache 2.0 license: You may not use the identified files except in compliance with the Apache License, Version 2.0 (the "License.") You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 A copy of the license is also reproduced below. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).


"Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: a. You must give any other recipients of the Work or Derivative Works a copy of this License; and b. You must cause any modified files to carry prominent notices stating that You changed the files; and c. You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and d. 
If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the


attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. 
However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.


END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Do not include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. This product includes software developed by The Apache Software Foundation (http://www.apache.org/) (listed below):

Apache Avro 1.7.3 Licensed under the Apache License, Version 2.0 (the "License"); you may not use Apache Avro except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Apache Commons Mathematics Library 2.2 Copyright 2001-2011 The Apache Software Foundation Licensed under the Apache License, Version 2.0 (the "License"); you may not use the Apache Commons Mathematics library except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Jackson JSON 1.8.8 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this library except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0


Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


Oracle Data Integrator Application Adapter for Hadoop

This chapter describes how to use the knowledge modules in Oracle Data Integrator (ODI) Application Adapter for Hadoop. It contains the following sections:

■ Introduction

■ Setting Up the Topology

■ Setting Up an Integration Project

■ Creating an Oracle Data Integrator Model from a Reverse-Engineered Hive Model

■ Designing the Interface

See Also: Oracle Fusion Middleware Application Adapters Guide for Oracle Data Integrator

Introduction

Apache Hadoop is designed to handle and process data from sources that are typically nonrelational, and data volumes that are beyond what relational databases can handle. Oracle Data Integrator (ODI) Application Adapter for Hadoop enables data integration developers to integrate and transform data easily within Hadoop using Oracle Data Integrator. Employing familiar and easy-to-use tools and preconfigured knowledge modules (KMs), the application adapter provides the following capabilities:

■ Loading data into Hadoop from the local file system and HDFS

■ Performing validation and transformation of data within Hadoop

■ Loading processed data from Hadoop to an Oracle database for further processing and generating reports

Knowledge modules (KMs) contain the information needed by Oracle Data Integrator to perform a specific set of tasks against a specific technology. An application adapter is a group of knowledge modules. Thus, Oracle Data Integrator Application Adapter for Hadoop is a group of knowledge modules for accessing data stored in Hadoop.

Concepts

Typical processing in Hadoop includes data validation and transformations that are programmed as MapReduce jobs. Designing and implementing a MapReduce job requires expert programming knowledge.


However, when you use Oracle Data Integrator and Oracle Data Integrator Application Adapter for Hadoop, you do not need to write MapReduce jobs. Oracle Data Integrator uses Hive and the Hive Query Language (HiveQL), a SQL-like language for implementing MapReduce jobs. When you implement a big data processing scenario, the first step is to load the data into Hadoop. The data source is typically in the local file system, HDFS, Hive tables, or external Hive tables. After the data is loaded, you can validate and transform it by using HiveQL like you use SQL. You can perform data validation (such as checking for NULLS and primary keys), and transformations (such as filtering, aggregations, set operations, and derived tables). You can also include customized procedural snippets (scripts) for processing the data. When the data has been aggregated, condensed, or processed into a smaller data set, you can load it into an Oracle database for further processing and analysis. Oracle Loader for Hadoop is recommended for optimal loading into an Oracle database.
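As a small illustration of the kind of HiveQL that such validation and transformation steps rely on, the following statement filters out rows with a NULL key and aggregates the remaining rows. The table and column names are hypothetical and are not generated by the adapter:

-- A sketch only: filter and aggregate data that has already been loaded into Hive
INSERT OVERWRITE TABLE session_summary
SELECT custid, COUNT(*) AS activity_count
FROM web_activity
WHERE custid IS NOT NULL
GROUP BY custid;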

Knowledge Modules

Oracle Data Integrator provides the knowledge modules (KMs) described in Table 4–1 for use with Hadoop.

Table 4–1 Oracle Data Integrator Application Adapter for Hadoop Knowledge Modules

IKM File to Hive (Load Data)
Description: Loads data from local and HDFS files into Hive tables. It provides options for better performance through Hive partitioning and fewer data movements. This knowledge module supports wildcards (*,?).
Source: File system. Target: Hive.

IKM Hive Control Append
Description: Integrates data into a Hive target table in truncate/insert (append) mode. Data can be controlled (validated). Invalid data is isolated in an error table and can be recycled.
Source: Hive. Target: Hive.

IKM Hive Transform
Description: Integrates data into a Hive target table after the data has been transformed by a customized script such as Perl or Python.
Source: Hive. Target: Hive.

IKM File-Hive to Oracle (OLH)
Description: Integrates data from an HDFS file or Hive source into an Oracle database target using Oracle Loader for Hadoop, Oracle SQL Connector for HDFS, or both.
Source: File system or Hive. Target: Oracle Database.

CKM Hive
Description: Validates data against constraints.
Source: NA. Target: Hive.

RKM Hive
Description: Reverse engineers Hive tables.
Source: Hive metadata. Target: NA.

Security

For security information for Oracle Data Integrator, see the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.


Setting Up the Topology

To set up the topology in Oracle Data Integrator, you identify the data server and the physical and logical schemas that are used to store the file system and Hive information. This section contains the following topics:

■ Setting Up File Data Sources

■ Setting Up Hive Data Sources

■ Setting Up the Oracle Data Integrator Agent to Execute Hadoop Jobs

■ Configuring Oracle Data Integrator Studio for Executing Hadoop Jobs on the Local Agent

Note: Many of the environment variables described in the following sections are already configured for Oracle Big Data Appliance. See the configuration script at /opt/oracle/odiagent-version/agent_standalone/oracledi/agent/bin/HadoopEnvSetup.sh

Setting Up File Data Sources

In the Hadoop context, there is a distinction between files in Hadoop Distributed File System (HDFS) and local files (files outside of HDFS).

To define a data source:
1. Create a DataServer object under File technology.
2. Create a Physical Schema object for every directory to be accessed.
3. Create a Logical Schema object for every directory to be accessed.
4. Create a Model for every Logical Schema.
5. Create one or more data stores for each different type of file and wildcard name pattern.
6. For HDFS files, create a DataServer object under File technology by entering the HDFS name node in the field JDBC URL. For example: hdfs://bda1node01.example.com:8020

Note: No dedicated technology is defined for HDFS files.

Setting Up Hive Data Sources

The following steps in Oracle Data Integrator are required for connecting to a Hive system. Oracle Data Integrator connects to Hive by using JDBC.

Prerequisites

The Hive technology must be included in the standard Oracle Data Integrator technologies. If it is not, then import the technology in INSERT_UPDATE mode from the xml-reference directory. You must add all Hive-specific flex fields. For pre-11.1.1.6.0 repositories, the flex fields are added during the repository upgrade process.


To set up a Hive data source: 1. Ensure that the following environment variables are set, and note their values. The following list shows typical values, although your installation may be different:

■ $HIVE_HOME: /usr/lib/hive

■ $HADOOP_HOME: /usr/lib/hadoop (contains configuration files such as core-site.xml)

■ $OSCH_HOME: /opt/oracle/orahdfs

2. Open ~/.odi/oracledi/userlib/additional_path.txt in a text editor and add the paths listed in Table 4–2. Enter the full path obtained in Step 1 instead of the variable name. This step enables ODI Studio to access the JAR files.

Table 4–2 JAR File Paths

Hive JAR Files
CDH4 Path: $HIVE_HOME/lib/*.jar
CDH3 Path: $HIVE_HOME/*.jar

Hadoop Client JAR Files
CDH4 Path: $HADOOP_HOME/client/*.jar
CDH3 Path: $HADOOP_HOME/hadoop-*-core*.jar, $HADOOP_HOME/hadoop-*-tools*.jar (replace the stars (*) with the full file names)

Hadoop Configuration Directory
CDH4 Path: $HADOOP_HOME
CDH3 Path: $HADOOP_HOME

Oracle SQL Connector for HDFS JAR Files (optional)
CDH4 Path: $OSCH_HOME/jlib/*.jar
CDH3 Path: $OSCH_HOME/jlib/*.jar

3. Ensure that the Hadoop configuration directory is in the ODI class path. The Hadoop configuration directory contains files such as core-default.xml, core-site.xml, and hdfs-site.xml.
4. Create a DataServer object under Hive technology.
5. Set the following locations under JDBC:
JDBC Driver: org.apache.hadoop.hive.jdbc.HiveDriver
JDBC URL: for example, jdbc:hive://BDA:10000/default
6. Set the following under Flexfields:
Hive Metastore URIs: for example, thrift://BDA:10000
7. Create a Physical Default Schema. As of Hive 0.7.0, no schemas or databases are supported. Only Default is supported. Enter default in both schema fields of the physical schema definition.
8. Ensure that the Hive server is up and running.
9. Test the connection to the DataServer.
10. Create a Logical Schema object.
11. Create at least one Model for the LogicalSchema.
12. Import RKM Hive as a global knowledge module or into a project.


13. Create a new model for Hive Technology pointing to the logical schema.
14. Perform a custom reverse-engineering operation using RKM Hive.

At the end of this process, the Hive DataModel contains all Hive tables with their columns, partitioning, and clustering details stored as flex field values.

Setting Up the Oracle Data Integrator Agent to Execute Hadoop Jobs

After setting up an Oracle Data Integrator agent, configure it to work with Oracle Data Integrator Application Adapter for Hadoop.

Note: Many file names contain the version number. When you see a star (*) in a file name, check your installation and enter the full file name.

To configure the Oracle Data Integrator agent:
1. Install Hadoop on your Oracle Data Integrator agent computer. Ensure that the HADOOP_HOME environment variable is set. For Oracle Big Data Appliance, see the Oracle Big Data Appliance Software User's Guide for instructions for setting up a remote Hadoop client.
2. Install Hive on your Oracle Data Integrator agent computer. Ensure that the HIVE_HOME environment variable is set.
3. Ensure that the Hadoop configuration directory is in the ODI class path. The Hadoop configuration directory contains files such as core-default.xml, core-site.xml, and hdfs-site.xml.
4. Add paths to ODI_ADDITIONAL_CLASSPATH, so that the ODI agent can access the JAR files. If you are not using Oracle SQL Connector for HDFS, then omit $OSCH_HOME from the setting.

Note: In these commands, $HADOOP_CONF points to the directory containing the Hadoop configuration files. This directory is often the same as $HADOOP_HOME.

■ For CDH4, use a command like the following:
ODI_ADDITIONAL_CLASSPATH=$HIVE_HOME/lib/'*':$HADOOP_HOME/client/'*':$OSCH_HOME/jlib/'*':$HADOOP_CONF

■ For CDH3, use a command like the following, replacing hadoop-*-core*.jar and hadoop-*-tools*.jar with the full path names:
ODI_ADDITIONAL_CLASSPATH=$HIVE_HOME/lib/'*':$HADOOP_HOME/hadoop-*-core*.jar:$HADOOP_HOME/hadoop-*-tools*.jar:$OSCH_HOME/jlib/'*':$HADOOP_CONF

5. Set environment variable ODI_HIVE_SESSION_JARS to include Hive Regex SerDe: ODI_HIVE_SESSION_JARS=$HIVE_HOME/lib/hive-contrib-*.jar

Include other JAR files as required, such as custom SerDes JAR files. These JAR files are added to every Hive JDBC session and thus are added to every Hive MapReduce job.


6. Set environment variable HADOOP_CLASSPATH:
HADOOP_CLASSPATH=$HIVE_HOME/lib/hive-metastore-*.jar:$HIVE_HOME/lib/libthrift.jar:$HIVE_HOME/lib/libfb*.jar:$HIVE_HOME/lib/hive-common-*.jar:$HIVE_HOME/lib/hive-exec-*.jar

This setting enables the Hadoop script to start Hive MapReduce jobs.

To use Oracle Loader for Hadoop:
1. Install Oracle Loader for Hadoop on your Oracle Data Integrator agent system. See "Installing Oracle Loader for Hadoop" on page 1-12.
2. Install Oracle Database client on your Oracle Data Integrator agent system. See "Oracle Loader for Hadoop Setup" on page 1-12 for the Oracle Database client version.
3. Set environment variable OLH_HOME.
4. Set environment variable ODI_OLH_JARS. You must list all JAR files required for Oracle Loader for Hadoop. See the oracle.hadoop.loader.libjars property on page 3-31. This is a comma-separated list of JAR files, which is formatted for readability. The list includes the Hive JAR files needed by Oracle Data Integrator Application Adapter for Hadoop. Enter valid file names for your installation. For example:
$HIVE_HOME/lib/hive-metastore-*.jar,
$HIVE_HOME/lib/libthrift.jar,
$HIVE_HOME/lib/libfb*.jar,
. . .

5. Add paths to HADOOP_CLASSPATH:
HADOOP_CLASSPATH=$OLH_HOME/jlib/'*':$HADOOP_CLASSPATH

6. Set environment variable ODI_HIVE_SESSION_JARS to include Hive Regex SerDe: ODI_HIVE_SESSION_JARS=$HIVE_HOME/lib/hive-contrib-*.jar

Include other JAR files as required, such as custom SerDes JAR files. These JAR files are added to every Hive JDBC session and thus are added to every Hive MapReduce job.
7. Add any custom native libraries required for Oracle Loader for Hadoop. See the oracle.hadoop.loader.sharedLibs property on page 3-36. This is a comma-separated list.
8. To use Oracle SQL Connector for HDFS (OLH_OUTPUT_MODE=DP_OSCH or OSCH), you must first install it. See "Oracle SQL Connector for Hadoop Distributed File System Setup" on page 1-4.

Configuring Oracle Data Integrator Studio for Executing Hadoop Jobs on the Local Agent

For executing Hadoop jobs on the local agent of an Oracle Data Integrator Studio installation, follow the configuration steps in the previous section with the following change: Copy JAR files into the Oracle Data Integrator userlib directory instead of the drivers directory.

Setting Up an Integration Project

Setting up a project follows the standard procedures. See the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator. Import the following KMs into an Oracle Data Integrator project:

■ IKM File to Hive (Load Data)

■ IKM Hive Control Append

■ IKM Hive Transform

■ IKM File-Hive to Oracle (OLH)

■ CKM Hive

■ RKM Hive

Creating an Oracle Data Integrator Model from a Reverse-Engineered Hive Model

This section contains the following topics:

■ Creating a Model

■ Reverse Engineering Hive Tables

Creating a Model

To create a model that is based on the technology hosting Hive and on the logical schema created when you configured the Hive connection, follow the standard procedure described in the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.

Reverse Engineering Hive Tables

RKM Hive is used to reverse engineer Hive tables and views. To perform a customized reverse-engineering of Hive tables with RKM Hive, follow the usual procedures, as described in the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator. This topic details information specific to Hive tables. The reverse-engineering process creates the data stores for the corresponding Hive tables or views. You can use the data stores as either a source or a target in an integration interface.

RKM Hive

RKM Hive reverses these metadata elements:

■ Hive tables and views as Oracle Data Integrator data stores. Specify the reverse mask in the Mask field, and then select the tables and views to reverse. The Mask field in the Reverse tab filters reverse-engineered objects based on their names. The Mask field cannot be empty and must contain at least the percent sign (%).

■ Hive columns as Oracle Data Integrator columns with their data types.


■ Information about buckets, partitioning, clusters, and sort columns is set in the respective flex fields in the Oracle Data Integrator data store or column metadata.

Table 4–3 describes the options for RKM Hive.

Table 4–3 RKM Hive Options

USE_LOG: Log intermediate results?
LOG_FILE_NAME: Path and file name of the log file. The default path is the user home, and the default file name is reverse.log.

Table 4–4 describes the created flex fields.

Table 4–4 Flex Fields for Reverse-Engineered Hive Tables and Views

Object: DataStore; Flex Field Name: Hive Buckets; Flex Field Code: HIVE_BUCKETS; Type: String; Description: Number of buckets to be used for clustering.

Object: Column; Flex Field Name: Hive Partition Column; Flex Field Code: HIVE_PARTITION_COLUMN; Type: Numeric; Description: All partitioning columns are marked as "1". Partition information can come from the following:

■ Mapped source column

■ Constant value specified in the target column

■ File name fragment

Object: Column; Flex Field Name: Hive Cluster Column; Flex Field Code: HIVE_CLUSTER_COLUMN; Type: Numeric; Description: All cluster columns are marked as "1".

Object: Column; Flex Field Name: Hive Sort Column; Flex Field Code: HIVE_SORT_COLUMN; Type: Numeric; Description: All sort columns are marked as "1".

Designing the Interface

After reverse engineering Hive tables and configuring them, you can choose from these interface configurations:

■ Loading Data from Files into Hive

■ Validating and Transforming Data Within Hive

■ Loading Data into an Oracle Database from Hive and HDFS

Loading Data from Files into Hive

To load data from the local file system or the HDFS file system into Hive tables:
1. Create the data stores for local files and HDFS files. Refer to the Oracle Fusion Middleware Connectivity and Knowledge Modules Guide for Oracle Data Integrator for information about reverse engineering and configuring local file data sources.
2. Create an interface using the file data store as the source and the corresponding Hive table as the target. Use the IKM File to Hive (Load Data) knowledge module specified in the flow tab of the interface. This integration knowledge module loads data from flat files into Hive, replacing or appending any existing data.


IKM File to Hive

IKM File to Hive (Load Data) supports:

■ One or more input files. To load multiple source files, enter an asterisk or a question mark as a wildcard character in the resource name of the file DataStore (for example, webshop_*.log).

■ File formats:
– Fixed length
– Delimited
– Customized format

■ Loading options:
– Immediate or deferred loading
– Overwrite or append
– Hive external tables

Table 4–5 describes the options for IKM File to Hive (Load Data). See the knowledge module for additional details.

Table 4–5 IKM File to Hive Options

CREATE_TARG_TABLE: Create target table.
TRUNCATE: Truncate data in target table.
FILE_IS_LOCAL: Is the file in the local file system or in HDFS?
EXTERNAL_TABLE: Use an externally managed Hive table.
USE_STAGING_TABLE: Use a Hive staging table. Select this option if the source and target do not match or if the partition column value is part of the data file. If the partitioning value is provided by a file name fragment or a constant in target mapping, then set this value to false.
DELETE_TEMPORARY_OBJECTS: Remove temporary objects after the interface execution.
DEFER_TARGET_LOAD: Load data into the final target now or defer?
OVERRIDE_ROW_FORMAT: Provide a parsing expression for handling a custom file format to perform the mapping from source to target.
STOP_ON_FILE_NOT_FOUND: Stop if no source file is found?

Validating and Transforming Data Within Hive

After loading data into Hive, you can validate and transform the data using the following knowledge modules.

IKM Hive Control Append

This knowledge module validates and controls the data, and integrates it into a Hive target table in truncate/insert (append) mode. Invalid data is isolated in an error table and can be recycled. IKM Hive Control Append supports inline view interfaces that use either this knowledge module or IKM Hive Transform. Table 4–6 lists the options. See the knowledge module for additional details.


Table 4–6 IKM Hive Control Append Options

FLOW_CONTROL: Validate incoming data?
RECYCLE_ERRORS: Reintegrate data from error table?
STATIC_CONTROL: Validate data after load?
CREATE_TARG_TABLE: Create target table?
DELETE_TEMPORARY_OBJECTS: Remove temporary objects after execution?
TRUNCATE: Truncate data in target table?

CKM Hive

This knowledge module checks data integrity for Hive tables. It verifies the validity of the constraints of a Hive data store and diverts the invalid records to an error table. You can use CKM Hive for static control and flow control. You must also define these constraints on the stored data. Table 4–7 lists the options for this check knowledge module. See the knowledge module for additional details.

Table 4–7 CKM Hive Options

DROP_ERROR_TABLE: Drop error table before execution?

IKM Hive Transform

This knowledge module performs transformations. It uses a shell script to transform the data, and then integrates it into a Hive target table using replace mode. The knowledge module supports inline view interfaces and can be used as an inline-view for IKM Hive Control Append. The transformation script must read the input columns in the order defined by the source data store. Only mapped source columns are streamed into the transformations. The transformation script must provide the output columns in the order defined by the target data store. Table 4–8 lists the options for this integration knowledge module. See the knowledge module for additional details.

Table 4–8 IKM Hive Transform Options

CREATE_TARG_TABLE: Create target table?
DELETE_TEMPORARY_OBJECTS: Remove temporary objects after execution?
TRANSFORM_SCRIPT_NAME: Script file name.
TRANSFORM_SCRIPT: Script content.
PRE_TRANSFORM_DISTRIBUTE: Provides an optional, comma-separated list of source column names, which enables the knowledge module to distribute the data before the transformation script is applied.
PRE_TRANSFORM_SORT: Provides an optional, comma-separated list of source column names, which enables the knowledge module to sort the data before the transformation script is applied.


POST_TRANSFORM_DISTRIBUTE: Provides an optional, comma-separated list of target column names, which enables the knowledge module to distribute the data after the transformation script is applied.
POST_TRANSFORM_SORT: Provides an optional, comma-separated list of target column names, which enables the knowledge module to sort the data after the transformation script is applied.

Loading Data into an Oracle Database from Hive and HDFS

IKM File-Hive to Oracle (OLH) integrates data from an HDFS file or Hive source into an Oracle database target using Oracle Loader for Hadoop. Using the interface configuration and the selected options, the knowledge module generates an appropriate Oracle Database target instance. Hive and Hadoop versions must follow the Oracle Loader for Hadoop requirements.

See Also:

■ "Oracle Loader for Hadoop Setup" on page 1-12 for required versions of Hadoop and Hive

■ "Setting Up the Oracle Data Integrator Agent to Execute Hadoop Jobs" on page 4-5 for required environment variable settings

Table 4–9 lists the options for this integration knowledge module. See the knowledge module for additional details.

Table 4–9 IKM File-Hive to Oracle (OLH) Options

OLH_OUTPUT_MODE: Specify JDBC, OCI, or Data Pump for data transfer.
CREATE_TARG_TABLE: Create target table?
REJECT_LIMIT: Maximum number of errors for Oracle Loader for Hadoop and EXTTAB.
USE_HIVE_STAGING_TABLE: Materialize Hive source data before extract?
USE_ORACLE_STAGING_TABLE: Use an Oracle database staging table?
EXT_TAB_DIR_LOCATION: Shared file path used for Oracle Data Pump transfer.
TEMP_DIR: Local path for temporary files.
MAPRED_OUTPUT_BASE_DIR: HDFS directory for Oracle Loader for Hadoop output files.
FLOW_TABLE_OPTIONS: Options for flow (stage) table creation when you are using an Oracle database staging table.
DELETE_TEMPORARY_OBJECTS: Remove temporary objects after execution?
OVERRIDE_INPUTFORMAT: Set to handle custom file formats.
EXTRA_OLH_CONF_PROPERTIES: Optional Oracle Loader for Hadoop configuration file properties.
TRUNCATE: Truncate data in target table?
DELETE_ALL: Delete all data in target table?



Oracle R Connector for Hadoop

This chapter describes R support for big data. It contains the following sections:

■ About Oracle R Connector for Hadoop

■ Access to HDFS Files

■ Access to Hive

■ Access to Oracle Database

■ Analytic Functions in Oracle R Connector for Hadoop

■ ORCH mapred.config Class

■ Examples and Demos of Oracle R Connector for Hadoop

■ Security Notes for Oracle R Connector for Hadoop

About Oracle R Connector for Hadoop

Oracle R Connector for Hadoop is a collection of R packages that provide:

■ Interfaces to work with Hive tables, the Apache Hadoop compute infrastructure, the local R environment, and Oracle database tables

■ Predictive analytic techniques, written in R or Java as Hadoop MapReduce jobs, that can be applied to data in HDFS files

You install and load this package as you would any other R package. Using simple R functions, you can perform tasks like these:

■ Access and transform HDFS data using a Hive-enabled transparency layer

■ Use the R language for writing mappers and reducers

■ Copy data between R memory, the local file system, HDFS, Hive, and Oracle databases

■ Schedule R programs to execute as Hadoop MapReduce jobs and return the results to any of those locations

Several analytic algorithms are available in Oracle R Connector for Hadoop: linear regression, neural networks for prediction, matrix completion using low rank matrix factorization, clustering, and non-negative matrix factorization. They are written in either Java or R. To use Oracle R Connector for Hadoop, you should be familiar with MapReduce programming, R programming, and statistical methods.


Oracle R Connector for Hadoop APIs

Oracle R Connector for Hadoop provides access from a local R client to Apache Hadoop using functions with these prefixes:

■ hadoop: Identifies functions that provide an interface to Hadoop MapReduce

■ hdfs: Identifies functions that provide an interface to HDFS

■ orch: Identifies a variety of functions; orch is a general prefix for ORCH functions

■ ore: Identifies functions that provide an interface to a Hive data store

Oracle R Connector for Hadoop uses data frames as the primary object type, but it can also operate on vectors and matrices to exchange data with HDFS. The APIs support the numeric, integer, and character data types in R. All of the APIs are included in the ORCH library. The functions are listed in Chapter 6 in alphabetical order.
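The following is a minimal sketch of how these prefixes appear together in a script: an R object is copied into HDFS with an hdfs function, processed by a mapper and reducer written in R through a hadoop function, and the result is pulled back into R memory. The mapper and reducer bodies, and the exact semantics of their arguments, are illustrative assumptions; see the function reference in Chapter 6 for the authoritative signatures.

# A sketch only; assumes the built-in cars data set and data-frame-valued reducer input
library(ORCH)
input <- hdfs.put(cars)                    # copy an R data frame into HDFS
result <- hadoop.run(input,
    mapper = function(key, val) {
        orch.keyval(val$speed, val$dist)   # emit speed as key, distance as value
    },
    reducer = function(key, vals) {
        orch.keyval(key, mean(vals$dist))  # emit mean distance for each speed
    })
hdfs.get(result)                           # bring the output back into R memory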

See Also: The R Project website at http://www.r-project.org/

Access to HDFS Files

For Oracle R Connector for Hadoop to access the data stored in HDFS, the input files must comply with the following requirements:

■ All input files for a MapReduce job must be stored in one directory as the parts of one logical file. Any valid HDFS directory name and file name extensions are acceptable.

■ Any file in that directory with a name beginning with an underscore (_) is ignored.

■ The input files must be in comma-separated value (CSV) format as follows: – key := string | "" – value := string[,value] – line := [key\t]value – string := char[string] – char := regexp([^,\n])

Input File Format Examples

The following are examples of acceptable CSV input files:

■ CSV files with a key: "1\tHello,world" "2\tHi,there"

■ CSV files with no key: "Hello,world" "Hi,there"

■ CSV files with a NULL key: "\tHello,world" "\tHi,there"
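Once a directory of CSV files meets these requirements, it can be attached to the R session by its HDFS path, for example with the hdfs.attach function described in Chapter 6. The path below is hypothetical:

# A sketch only: attach an existing HDFS directory of CSV parts to this R session
dfs.id <- hdfs.attach("/user/oracle/input_csv")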


Access to Hive

Hive provides an alternative storage and retrieval mechanism to HDFS files through a querying language called HiveQL, which closely resembles SQL. Hive uses MapReduce for distributed processing. However, the data is structured and has additional metadata to support data discovery. Oracle R Connector for Hadoop uses the data preparation and analysis features of HiveQL, while enabling you to use R language constructs.

See Also: The website at http://hive.apache.org

ORE Functions for Hive

You can connect to Hive and manage objects using R functions that have an ore prefix, such as ore.connect. If you are also using Oracle R Enterprise, then you recognize these functions. The ore functions in Oracle R Enterprise create and manage objects in an Oracle database, and the ore functions in Oracle R Connector for Hadoop create and manage objects in a Hive database. You can connect to one database at a time, either Hive or Oracle Database, but not both simultaneously. The following ORE functions are supported in Oracle R Connector for Hadoop:
as.ore, as.ore.character, as.ore.frame, as.ore.integer, as.ore.logical, as.ore.numeric, as.ore.vector, is.ore, is.ore.character, is.ore.frame, is.ore.integer, is.ore.logical, is.ore.numeric, is.ore.vector, ore.create, ore.drop, ore.get, ore.pull, ore.push, ore.recode

This release does not support ore.factor, ore.list, or ore.object.

These aggregate functions from OREStats are also supported:
aggregate, fivenum, IQR, median, quantile, sd, var*

*For vectors only


Generic R Functions Supported in Hive

Oracle R Connector for Hadoop also overloads the following standard generic R functions with methods to work with Hive objects.

Character methods
casefold, chartr, gsub, nchar, substr, substring, tolower, toupper
This release does not support grepl or sub.

Frame methods
■ attach, show

■ [, $, $<-, [[, [[<-

■ Subset functions: head, tail

■ Metadata functions: dim, length, NROW, nrow, NCOL, ncol, names, names<-, colnames, colnames<-

■ Conversion functions: as.data.frame, as.env, as.list

■ Arithmetic operators: +, -, *, ^, %%, %/%, /

■ Compare, Logic, xor, !

■ Test functions: is.finite, is.infinite, is.na, is.nan

■ Mathematical transformations: abs, acos, asin, atan, ceiling, cos, exp, expm1, floor, log, log10, log1p, log2, logb, round, sign, sin, sqrt, tan, trunc

■ Basic statistics: colMeans, colSums, rowMeans, rowSums, Summary, summary, unique

■ by, merge

■ unlist, rbind, cbind, data.frame, eval

This release does not support dimnames, interaction, max.col, row.names, row.names<-, scale, split, subset, transform, with, or within.

Logical methods
ifelse, Logic, xor, !

Matrix methods
Not supported

Numeric methods
■ Arithmetic operators: +, -, *, ^, %%, %/%, /

■ Test functions: is.finite, is.infinite, is.nan

■ abs, acos, asin, atan, ceiling, cos, exp, expm1, floor, log, log1p, log2, log10, logb, mean, round, sign, sin, sqrt, Summary, summary, tan, trunc, zapsmall

This release does not support atan2, besselI, besselK, besselJ, besselY, diff, factorial, lfactorial, pmax, pmin, or tabulate.

Vector methods
■ show, length, c

■ Test functions: is.vector, is.na

■ Conversion functions: as.vector, as.character, as.numeric, as.integer, as.logical

■ [, [<-, |


■ by, Compare, head, %in%, paste, sort, table, tail, tapply, unique

This release does not support interaction, lengthb, rank, or split.

Example 5–1 shows simple data preparation and processing. For additional details, see "Support for Hive Data Types" on page 5-5.

Example 5–1 Using R to Process Data in Hive Tables

# Connect to Hive
ore.connect(type="HIVE")

# Attach the current environment to the R search path
ore.attach()

# Create a Hive table by pushing the numeric columns of the iris data set
IRIS_TABLE <- ore.push(iris[1:4])

# Create bins based on Petal Length
IRIS_TABLE$PetalBins = ifelse(IRIS_TABLE$Petal.Length < 2.0, "SMALL PETALS",
+                ifelse(IRIS_TABLE$Petal.Length < 4.0, "MEDIUM PETALS",
+                  ifelse(IRIS_TABLE$Petal.Length < 6.0,
+                    "MEDIUM LARGE PETALS", "LARGE PETALS")))

# PetalBins is now a derived column of the HIVE object
> names(IRIS_TABLE)
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "PetalBins"

# Based on the bins, generate summary statistics for each group
aggregate(IRIS_TABLE$Petal.Length, by = list(PetalBins = IRIS_TABLE$PetalBins),
+           FUN = summary)
1        LARGE PETALS 6 6.025000 6.200000 6.354545 6.612500 6.9 0
2 MEDIUM LARGE PETALS 4 4.418750 4.820000 4.888462 5.275000 5.9 0
3       MEDIUM PETALS 3 3.262500 3.550000 3.581818 3.808333 3.9 0
4        SMALL PETALS 1 1.311538 1.407692 1.462000 1.507143 1.9 0
Warning message:
ORE object has no unique key - using random order

Support for Hive Data Types

Oracle R Connector for Hadoop can access any Hive table containing columns with string and numeric data types such as tinyint, smallint, bigint, int, float, and double.

There is no support for these complex data types:

array
binary
map
struct
timestamp
union

If you attempt to access a Hive table containing an unsupported data type, then you get an error message. To access the table, you must convert the column to a supported data type.

To convert a column to a supported data type:

1. Open the Hive command interface:

   $ hive
   hive>


2. Identify the column with an unsupported data type:

   hive> describe table_name;

3. View the data in the column:

   hive> select column_name from table_name;

4. Create a table for the converted data, using only supported data types.

5. Copy the data into the new table, using an appropriate conversion tool.

Example 5–2 shows the conversion of an array. Example 5–3 and Example 5–4 show the conversion of timestamp data.

Example 5–2 Converting an Array to String Columns

R> ore.sync(table="t1")
Warning message:
table t1 contains unsupported data types
 .
 .
 .
hive> describe t1;
OK
col1  int
col2  array

hive> select * from t1;
OK
1  ["a","b","c"]
2  ["d","e","f"]
3  ["g","h","i"]

hive> create table t2 (c1 string, c2 string, c3 string);
hive> insert into table t2 select col2[0], col2[1], col2[2] from t1;
 .
 .
 .
R> ore.sync(table="t2")
R> ore.ls()
[1] "t2"
R> t2$c1
[1] "a" "d" "g"

Example 5–3 uses automatic conversion of the timestamp data type into string. The data is stored in a table named t5 with a column named tstmp.

Example 5–3 Converting a Timestamp Column

hive> select * from t5;

hive> create table t6 (timestmp string);
hive> insert into table t6 SELECT tstmp from t5;

Example 5–4 uses the Hive get_json_object function to extract the two columns of interest from the JSON table into a separate table for use by Oracle R Connector for Hadoop.


Example 5–4 Converting a Timestamp Column in a JSON File

hive> select * from t3;
OK

{"custId":1305981,"movieId":null,"genreId":null,"time":"2010-12-30:23:59:32","reco mmended":null,"activity":9}

hive> create table t4 (custid int, time string);

hive> insert into table t4 SELECT cast(get_json_object(c1, '$.custId') as int), cast(get_json_object(c1, '$.time') as string) from t3;

Usage Notes for Hive Access

The Hive command language interface (CLI) is used for executing queries and provides support for Linux clients. There is no JDBC or ODBC support.

The ore.create function creates Hive tables only as text files. However, Oracle R Connector for Hadoop can access Hive tables stored as either text files or sequence files.

You can use the ore.exec function to execute Hive commands from the R console. For a demo, run the hive_sequencefile demo.

Oracle R Connector for Hadoop can access tables and views in the default Hive database only. To allow read access to objects in other databases, you must expose them in the default database. For example, you can create views.

Oracle R Connector for Hadoop does not have a concept of ordering in Hive. An R frame persisted in Hive might not have the same ordering after it is pulled out of Hive and into memory. Oracle R Connector for Hadoop is designed primarily to support data cleanup and filtering of huge HDFS data sets, where ordering is not critical. You might see warning messages when working with unordered Hive frames:

Warning messages:
1: ORE object has no unique key - using random order
2: ORE object has no unique key - using random order

To suppress these warnings, set the ore.warn.order option in your R session:

R> options(ore.warn.order = FALSE)
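The following hedged sketch combines these usage notes. It assumes that ore.exec accepts a single HiveQL string; the database name otherdb and the view name weblog_view are hypothetical.

# Hedged sketch: expose a table from another Hive database in the default database.
# Assumes ore.exec accepts a HiveQL string; otherdb and weblog_view are hypothetical names.
ore.connect(type="HIVE")
ore.exec("CREATE VIEW weblog_view AS SELECT * FROM otherdb.weblog")
ore.sync(table="weblog_view")          # make the view visible to ORCH
options(ore.warn.order = FALSE)        # suppress the random-order warnings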

Example: Loading Hive Tables into Oracle R Connector for Hadoop

Example 5–5 provides an example of loading a Hive table into an R data frame for analysis. It uses these Oracle R Connector for Hadoop functions:

hdfs.attach
ore.attach
ore.connect
ore.create
ore.hiveOptions
ore.sync

Example 5–5 Loading a Hive Table

# Connect to the HIVE metastore and sync the HIVE input table into the R session.
ore.connect(type="HIVE")
ore.sync(table="datatab")
ore.attach()


# The "datatab" object is a Hive table with columns named custid, movieid, activity, and rating. # Perform filtering to remove missing (NA) values from custid and movieid columns # Project out three columns: custid, movieid and rating t1 <- datatab[!is.na(datatab$custid) & !is.na(datatab$movieid) & datatab$activity==1, c("custid","movieid", "rating")]

# Set HIVE field delimiters to ','. By default, it is Ctrl+a for text files, but
# ORCH 2.0 supports only ',' as a file separator.
ore.hiveOptions(delim=',')

# Create another Hive table called "datatab1" after the transformations above.
ore.create(t1, table="datatab1")

# Use the HDFS directory, where the table data for datatab1 is stored, to attach
# it to the ORCH framework. By default, this location is "/user/hive/warehouse".
dfs.id <- hdfs.attach("/user/hive/warehouse/datatab1")

# dfs.id can now be used with all hdfs.*, orch.*, and hadoop.* APIs of ORCH
# for further processing and analytics.

Access to Oracle Database

Oracle R Connector for Hadoop provides a basic level of database access. You can move the contents of a database table to HDFS, and move the results of HDFS analytics back to the database.

You can then perform additional analysis on this smaller set of data using a separate product named Oracle R Enterprise. It enables you to perform statistical analysis on database tables, views, and other data objects using the R language. You have transparent access to database objects, including support for Business Intelligence and in-database analytics.

Access to the data stored in an Oracle database is always restricted to the access rights granted by your DBA.

Oracle R Enterprise is included in the Oracle Advanced Analytics option to Oracle Database Enterprise Edition. It is not one of the Oracle Big Data Connectors.

See Also: Oracle R Enterprise User's Guide

Usage Notes for Oracle Database Access

Oracle R Connector for Hadoop uses Sqoop to move data between HDFS and Oracle Database. Sqoop imposes several limitations on Oracle R Connector for Hadoop; a hedged connection sketch follows the list.

■ You cannot import Oracle tables with BINARY_FLOAT or BINARY_DOUBLE columns. As a work-around, you can create a view that casts these columns to NUMBER.

■ All column names must be in upper case.
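Because Sqoop performs the transfers, Oracle R Connector for Hadoop must first be connected to the database. The sketch below reuses the example connection parameters shown with the hdfs_datatrans demo later in this chapter; the host, user, SID, and password are placeholders for your own values, and the no-argument call to orch.connected is an assumed usage.

# Hedged sketch: connect ORCH to Oracle Database before moving data.
# The host, user, SID, and password are placeholder values.
orch.connect("localhost", "RQUSER", "orcl", "welcome1", secure=F)
orch.connected()     # returns TRUE if the connection is active (assumed usage)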

Scenario for Using Oracle R Connector for Hadoop with Oracle R Enterprise

The following scenario may help you identify opportunities for using Oracle R Connector for Hadoop with Oracle R Enterprise.

Using Oracle R Connector for Hadoop, you can look for files that you have access to on HDFS and execute R calculations on data in one such file. You can also upload data stored in text files on your local file system into HDFS for calculations, schedule an R script for execution on the Hadoop cluster using DBMS_SCHEDULER, and download the results into a local file.

Using Oracle R Enterprise, you can open the R interface and connect to Oracle Database to work on the tables and views that are visible based on your database privileges. You can filter out rows, add derived columns, project new columns, and perform visual and statistical analysis.

Again using Oracle R Connector for Hadoop, you might deploy a MapReduce job on Hadoop for CPU-intensive calculations written in R. The calculation can use data stored in HDFS or, with Oracle R Enterprise, in an Oracle database. You can return the output of the calculation to an Oracle database and to the R console for visualization or additional processing.

Analytic Functions in Oracle R Connector for Hadoop

Table 5–1 describes the analytic functions. For more information, use R Help.

Table 5–1 Descriptions of the Analytic Functions

orch.evaluate
Evaluates a fit generated by orch.lmf. This information can be helpful when you are tuning the model parameters.

orch.export.fit
Exports a model (W and H factor matrices) to the specified destination for orch.lmf.jellyfish or orch.nmf. It is not used for orch.mahout.lmf.als.

orch.lm
Fits a linear model using tall-and-skinny QR (TSQR) factorization and parallel distribution. The function computes the same statistical parameters as the Oracle R Enterprise ore.lm function.

orch.lmf
Fits a low rank matrix factorization model using either the jellyfish algorithm or the Mahout alternating least squares with weighted regularization (ALS-WR) algorithm.

orch.neural
Provides a neural network to model complex, nonlinear relationships between inputs and outputs, or to find patterns in the data.

orch.nmf
Provides the main entry point to create a nonnegative matrix factorization model using the jellyfish algorithm. This function can work on much larger data sets than the R NMF package, because the input does not need to fit into memory.

orch.nmf.NMFalgo
Plugs in to the R NMF package framework as a custom algorithm. This function is used for benchmark testing.

orch.recommend
Computes the top n items to be recommended for each user that has predicted ratings, based on the input orch.mahout.lmf.als model.

predict.orch.lm
Predicts future results based on the fit calculated by orch.lm.

print.orch.lm
Prints a model returned by the orch.lm function.

print.summary.orch.lm
Prints a summary of the fit calculated by orch.lm.

summary.orch.lm
Prepares a summary of the fit calculated by orch.lm.

ORCH mapred.config Class

The hadoop.exec and hadoop.run functions have an optional argument, config, for configuring the resultant MapReduce job. This argument is an instance of the mapred.config class. A brief configuration sketch follows the slot descriptions below.


The mapred.config class has these slots:

hdfs.access Set to TRUE to allow access to the hdfs.* functions in the mappers, reducers, and combiners, or set to FALSE to restrict access (default).

job.name A descriptive name for the job so that you can monitor its progress more easily.

map.input A mapper input data-type keyword: data.frame, list, or vector (default).

map.output A sample R data frame object that defines the output structure of data from the mappers. It is required only if the mappers change the input data format. Then the reducers require a sample data frame to parse the input stream of data generated by the mappers correctly. If the mappers output exactly the same records as they receive, then you can omit this option.

map.split The number of rows that your mapper function receives from a mapper.

■ 0 sends all rows given by Hadoop to a specific mapper to your mapper function. Use this setting only if you are sure that the chunk of data for each mapper can fit into R memory. If it does not, then the R process will fail with a memory allocation error.

■ 1 sends one row only to the mapper at a time (Default). You can improve the performance of a mapper by increasing this value, which decreases the number of invocations of your mapper function. This is particularly important for small functions.

■ n sends a minimum of n rows to the mapper at a time. In this syntax, n is an integer greater than 1. Some algorithms require a minimum number of rows to function.

map.tasks The number of mappers to run. Specify 1 to run the mappers sequentially; specify a larger integer to run the mappers in parallel. If you do not specify a value, then the number of mappers is determined by the capacity of the cluster, the workload, and the Hadoop configuration.

map.valkey Set to TRUE to duplicate the keys as data values for the mapper, or FALSE to use the keys only as keys (default).

min.split.size Controls the lower boundary for splitting HDFS files before sending them to the mappers. This option reflects the value of the Hadoop mapred.min.split.size option and is set in bytes.

reduce.input A reducer input data type keyword: data.frame or list (default).

reduce.output A sample R data frame object that defines the output structure of data from the reducers. This is an optional parameter.


The sample data frame is used to generate metadata for the output HDFS objects. It reduces the job execution time by eliminating the sampling and parsing step. It also results in a better output data format, because you can specify column names and other metadata. If you do not specify a sample data frame, then the output HDFS files are sampled and the metadata is generated automatically.

reduce.split The number of data values given at one time to the reducer. See the values for map.split. The reducer expects to receive all values at one time, so you must handle partial data sets in the reducer if you set reduce.split to a value other than 0 (zero).

reduce.tasks The number of reducers to run. Specify 1 to run the reducers sequentially; specify a larger integer to run the reducers in parallel. If you do not specify a value, then the number of reducers is determined by the capacity of the cluster, the workload, and the Hadoop configuration.

reduce.valkey Set to TRUE to duplicate the keys as data values for the reducer, or FALSE to use the keys only as keys (default).

verbose Set to TRUE to generate diagnostic information, or FALSE otherwise.
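The following is a minimal, hedged configuration sketch. The slot values are illustrative only, and the job name is hypothetical; pass the resulting object as the config argument of hadoop.run or hadoop.exec.

# Minimal sketch: build a mapred.config instance with illustrative values.
cfg <- new("mapred.config",
           job.name     = "mean-delay",   # hypothetical job name
           map.split    = 100,            # send up to 100 rows per mapper invocation
           map.tasks    = 4,              # run four mappers in parallel
           reduce.tasks = 2,              # run two reducers in parallel
           verbose      = TRUE)           # generate diagnostic information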

Examples and Demos of Oracle R Connector for Hadoop

The ORCH package includes sample code to help you learn to adapt your R programs to run on a Hadoop cluster using Oracle R Connector for Hadoop. This topic describes these examples and demonstrations.

■ Using the Demos

■ Using the Examples

Using the Demos

Oracle R Connector for Hadoop provides an extensive set of demos. Instructions for running them are included in the following descriptions.

If an error occurs, then exit from R without saving the workspace image and start a new session. You should also delete the temporary files created in both the local file system and the HDFS file system:

# rm -r /tmp/orch*
# hadoop fs -rm -r /tmp/orch*

Demo R Programs

hdfs_cpmv.R Demonstrates copying, moving, and deleting HDFS files and directories. This program uses the following ORCH functions: hdfs.cd hdfs.cp hdfs.exists hdfs.ls hdfs.mkdir hdfs.mv


hdfs.put hdfs.pwd hdfs.rmdir hdfs.root hdfs.setroot

To run this demo, use the following command: R> demo("hdfs_cpmv", package="ORCH")

hdfs_datatrans.R Demonstrates data transfers between HDFS and the local file system, and between HDFS and an Oracle database.

Note: This demo requires that Oracle R Enterprise client is installed on Oracle Big Data Appliance, and Oracle R Enterprise server is installed on the Oracle Database host. You must connect to Oracle Database using both Oracle R Connector for Hadoop and Oracle R Enterprise to run this demo to completion. See orch.connect on page 6-41 and ore.connect help.

This program uses the following ORCH functions: hdfs.cd hdfs.describe hdfs.download hdfs.get hdfs.pull hdfs.pwd hdfs.rm hdfs.root hdfs.setroot hdfs.size hdfs.upload orch.connected ore.drop1 ore.is.connected1

To run this demo, use the following commands, replacing the parameters shown here for ore.connect and orch.connect with those appropriate for your Oracle database:

R> ore.connect("RQUSER", "orcl", "localhost", "welcome1")
Loading required package: ROracle
Loading required package: DBI
R> orch.connect("localhost", "RQUSER", "orcl", "welcome1", secure=F)
Connecting ORCH to RDBMS via [sqoop]
    Host: localhost
    Port: 1521
    SID:  orcl
    User: RQUSER
Connected.
[1] TRUE
R> demo("hdfs_datatrans", package="ORCH")

hdfs_dir.R Demonstrates the use of functions related to HDFS directories.

1 Use the R help function for usage information


This program uses the following ORCH functions: hdfs.cd hdfs.ls hdfs.mkdir hdfs.pwd hdfs.rmdir hdfs.root hdfs.setroot

To run this demo, use the following command: R> demo("hdfs_dir", package="ORCH") hdfs_putget.R Demonstrates how to transfer data between an R data frame and HDFS. This program uses the following ORCH functions: hdfs.attach hdfs.describe hdfs.exists hdfs.get hdfs.put hdfs.pwd hdfs.rm hdfs.root hdfs.sample hdfs.setroot

To run this demo, use the following command: R> demo("hdfs_putget", package="ORCH") hive_aggregate.R Moves a selection of the Iris data set into Hive and performs these aggregations for each species: summary, mean, minimum, maximum, standard deviation, median, and interquartile range (IQR). This program uses the following ORCH functions to set up a Hive table. These functions are used in all of the Hive demo programs and are documented in R Help: ore.attach ore.connect ore.push To run this demo, use the following command: R> demo("hive_aggregate", package="ORCH") hive_analysis.R Demonstrates basic analysis and data processing operations. To run this demo, use the following command: R> demo("hive_analysis", package="ORCH") hive_basic.R Using basic R functions, this demo obtains information about a Hive table, including the number of rows (nrow), the number of columns (length), the first five rows (head), the class and data type of a column (class and is.*), and the number of characters in each column value (nchar). To run this program, use the following command:


R> demo("hive_basic", package="ORCH")

hive_binning.R Creates bins in Hive based on petal size in the Iris data set. To run this program, use the following command: R> demo("hive_binning", package="ORCH")

hive_columnfns.R Uses the Iris data set to show the use of several column functions including min, max, sd, mean, fivenum, var, IQR, quantile, log, log2, log10, abs, and sqrt. To run this program, use the following command: R> demo("hive_columnfns", package="ORCH")

hive_nulls.R Demonstrates the differences between handling nulls in R and Hive, using the AIRQUALITY data set. This program uses the following ORCH functions, which are documented in R Help: ore.attach ore.connect ore.na.extract ore.push

To run this program, use the following command: R> demo("hive_nulls", package="ORCH")

hive_pushpull.R Shows processing of the Iris data set split between Hive and the client R desktop. This program uses the following ORCH functions, which are documented in R Help: ore.attach ore.connect ore.pull ore.push

To run this program, use the following command: R> demo("hive_pushpull", package="ORCH")

hive_sequencefile.R Shows how to create and use Hive tables stored as sequence files, using the cars data set. This program uses the following ORCH functions, which are documented in R Help: ore.attach ore.connect ore.create ore.exec ore.sync ore.drop

To run this program, use the following command: R> demo("hive_sequencefile", package="ORCH")

mapred_basic.R Provides a simple example of mappers and reducers written in R. It uses the cars data set. This program uses the following ORCH functions: hadoop.run hdfs.get hdfs.put hdfs.rm hdfs.setroot orch.keyval

To run this program, use the following command: R> demo("mapred_basic", package="ORCH") mapred_modelbuild.R Runs a mapper and a reducer in a parallel model build, saves plots in HDFS in parallel, and displays the graphs. It uses the iris data set. This program uses the following ORCH functions: hadoop.run hdfs.download hdfs.exists hdfs.get hdfs.id hdfs.put hdfs.rm hdfs.rmdir hdfs.root hdfs.setroot hdfs.upload orch.export orch.pack orch.unpack

To run this program, use the following command: R> demo("mapred_modelbuild", package="ORCH") orch_lm.R Fits a linear model using the orch.lm algorithm and compares it with the R lm algorithm. It uses the iris data set. This program uses the following ORCH functions: hdfs.cd hdfs.get hdfs.put hdfs.pwd hdfs.rm hdfs.root hdfs.setroot orch.lm1 predict.orch.lm1

To run this program, use the following command: R> demo("orch_lm", package="ORCH")


orch_lmf_jellyfish.R Fits a low rank matrix factorization model using the jellyfish algorithm. This program uses the following ORCH functions: hdfs.cd hdfs.get hdfs.mkdir hdfs.pwd hdfs.rmdir hdfs.root hdfs.setroot orch.evaluate1 orch.export.fit1 orch.lmf1

To run this program, use the following command: R> demo("orch_lmf_jellyfish", package="ORCH")

orch_lmf_mahout_als.R Fits a low rank matrix factorization model using the Mahout ALS-WR algorithm. This program uses the following ORCH functions: hdfs.cd hdfs.mkdir hdfs.pwd hdfs.rmdir hdfs.root orch.evaluate1 orch.lmf1 orch.recommend1

To run this program, use the following command: R> demo("orch_lmf_mahout_als", package="ORCH")

orch_neural.R Builds a model using the Oracle R Connector for Hadoop neural network algorithm. It uses the iris data set. This program uses the following ORCH functions: hdfs.cd hdfs.put hdfs.pwd hdfs.rm hdfs.root hdfs.sample hdfs.setroot orch.neural1

To run this program, use the following command: R> demo("orch_neural", package="ORCH")

orch_nmf.R Demonstrates the integration of the orch.nmf function with the framework of the R NMF package to create a nonnegative matrix factorization model. This demo uses the Golub data set provided with the NMF package. This program uses the following ORCH functions:

hdfs.cd hdfs.mkdir hdfs.pwd hdfs.rmdir hdfs.root hdfs.setroot orch.nmf.NMFalgo1

This program requires several R packages that you may not have already installed: NMF, Biobase, BiocGenerics, and RColorBrewer. Instructions for installing them are provided here.

Caution: When asked whether you want to update the dependent packages, always respond Update/None. The installed R packages are the supported versions. If you update them, then the Oracle Big Data Appliance software checks fail, which causes problems during routine maintenance. Always run the bdachecksw utility after installing new packages. See the bdachecksw utility in the Oracle Big Data Appliance Owner's Guide. If you are using a Hadoop client, then install the packages on the client instead of Oracle Big Data Appliance.

To install NMF:

1. Download NMF from the CRAN archives at http://cran.r-project.org/src/contrib/Archive/NMF/NMF_0.5.06.tar.gz

2. Install NMF using a standard installation method:

   R> install.packages("/full_path/NMF_0.5.06.tar.gz", repos=NULL)

To install Biobase and BiocGenerics from the BioConductor project:

1. Source in the biocLite package:

   R> source("http://bioconductor.org/biocLite.R")

2. Install the Biobase package: R> biocLite("Biobase")

3. Install the BiocGenerics package: R> biocLite("BiocGenerics")

To install RColorBrewer:

■ Use a standard method of installing a package from CRAN: R> install.packages("RColorBrewer")

To run the orch_nmf demo:

1. Load the required packages, if they are not already loaded in this session:

   R> library("NMF")
   R> library("Biobase")
   R> library("BiocGenerics")
   R> library("RColorBrewer")


2. Run the demo: R> demo("orch_nmf", package="ORCH")

demo-bagged.clust.R Provides an example of bagged clustering using randomly generated values. The mappers perform k-means clustering analysis on a subset of the data and generate centroids for the reducers. The reducers combine the centroids into a hierarchical cluster and store the hclust object in HDFS. This program uses the following ORCH functions: hadoop.exec hdfs.put is.hdfs.id orch.export orch.keyval orch.pack orch.unpack

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/demos/demo-bagged.clust.R")

Call: c("hclust", "d", "single")

Cluster method : single Distance : euclidean Number of objects: 6

demo-kmeans.R Performs k-means clustering. This program uses the following ORCH functions: hadoop.exec hdfs.get orch.export orch.keyval orch.pack orch.unpack

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/demos/demo-kmeans.R") centers x y 1 0.3094056 1.32792762 2 1.7374920 -0.06730981 3 -0.6271576 -1.20558920 4 0.2668979 -0.98279426 5 0.4961453 1.11632868 6 -1.7328809 -1.26335598 removing NaNs if any final centroids x y 1 0.666782828 2.36620810 2 1.658441962 0.33922811


3 -0.627157587 -1.20558920 4 -0.007995672 -0.03166983 5 1.334578589 1.41213532 6 -1.732880933 -1.26335598

Using the Examples

The examples show how you use the Oracle R Connector for Hadoop API. You can view them in a text editor after extracting them from the installation archive files. They are located in the ORCH2.1.0/ORCHcore/examples directory. You can run the examples as shown in the following descriptions.

Example R Programs

example-filter1.R Shows how to use key-value pairs. The mapper function selects cars with a distance value greater than 30 from the cars data set, and the reducer function calculates the mean distance for each speed. This program uses the following ORCH functions: hadoop.run hdfs.get hdfs.put orch.keyval

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-filter1.R") Running cars filtering and mean example #1: val1 val2 1 10 34.00000 2 13 38.00000 3 14 58.66667 4 15 54.00000 5 16 36.00000 6 17 40.66667 7 18 64.50000 8 19 50.00000 9 20 50.40000 10 22 66.00000 11 23 54.00000 12 24 93.75000 13 25 85.00000

example-filter2.R Shows how to use values only. The mapper function selects cars with a distance greater than 30 and a speed greater than 14 from the cars data set, and the reducer function calculates the mean speed and distance as one value pair. This program uses the following ORCH functions: hadoop.run hdfs.get hdfs.put orch.keyval

To run this program, ensure that root is set to /tmp, and then source the file:


R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-filter2.R") Running cars filtering and mean example #2: val1 val2 1 59.52 19.72

example-filter3.R Shows how to load a local file into HDFS. The mapper and reducer functions are the same as example-filter2.R. This program uses the following ORCH functions: hadoop.run hdfs.download hdfs.upload orch.keyval

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-filter3.R") Running cars filtering and mean example #3: [1] "\t59.52,19.72"

example-group.apply.R Shows how to build a parallel model and generate a graph. The mapper partitions the data based on the petal lengths in the iris data set, and the reducer uses basic R statistics and graphics functions to fit the data into a linear model and plot a graph. This program uses the following ORCH functions: hadoop.run hdfs.download hdfs.exists hdfs.get hdfs.id hdfs.mkdir hdfs.put hdfs.rmdir hdfs.upload orch.pack orch.export orch.unpack

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-group.apply.R") Running groupapply example. [[1]] [[1]]$predict . . . [3] [[3]]$predict 2 210 3 4 5 6 7 8 4.883662 5.130816 4.854156 5.163097 4.109921 4.915944 4.233498 4.886437 9 10 11 12 13 14 15 16


4.789593 4.545214 4.136653 4.574721 4.945451 4.359849 4.013077 4.730579 17 18 19 20 21 22 23 24 4.827423 4.480651 4.792367 4.386581 4.666016 5.007239 5.227660 4.607002 25 26 27 28 29 30 31 32 4.701072 4.236272 4.701072 5.039521 4.604228 4.171709 4.109921 4.015851 33 34 35 36 37 38 39 40 4.574721 4.201216 4.171709 4.139428 4.233498 4.542439 4.233498 5.733065 41 42 43 44 45 46 47 48 4.859705 5.851093 5.074577 5.574432 6.160034 4.115470 5.692460 5.321730 49 50 51 52 53 54 55 56 6.289160 5.386293 5.230435 5.665728 4.891986 5.330054 5.606714 5.198153 57 58 59 60 61 62 63 64 6.315892 6.409962 4.607002 5.915656 4.830198 6.127753 5.074577 5.603939 65 66 67 68 69 70 71 72 5.630671 5.012788 4.951000 5.418574 5.442532 5.848318 6.251329 5.512644 73 74 75 76 77 78 79 80 4.792367 4.574721 6.409962 5.638995 5.136365 4.889212 5.727516 5.886149 81 82 83 84 85 86 87 88 5.915656 4.859705 5.853867 5.980218 5.792079 5.168646 5.386293 5.483137 89 4.827423

[[3]]$pngfile [1] "/user/oracle/pngfiles/3"

[1] "/tmp/orch6e295a5a5da9"

This program generates three graphics files in the /tmp directory. Figure 5–1 shows the last one.

Figure 5–1 Example example-group.apply.R Output in fit-3.png


example-kmeans.R Defines a k-means clustering function and generates random points for a clustering test. The results are printed or graphed. This program uses the following ORCH functions: hadoop.exec hdfs.get hdfs.put orch.export

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-kmeans.R") Running k-means example. val1 val2 1 1.005255389 1.9247858 2 0.008390976 2.5178661 3 1.999845464 0.4918541 4 0.480725254 0.4872837 5 1.677254045 2.6600670

example-lm.R Shows how to define multiple mappers and one reducer that merges all results. The program calculates a linear regression using the iris data set. This program uses the following ORCH functions: hadoop.run hdfs.get hdfs.put orch.export orch.pack orch.unpack

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-lm.R") Running linear model example. Model rank 3, yy 3.0233000000E+02, nRows 150 Model coefficients -0.2456051 0.2040508 0.5355216

example-logreg.R Performs a one-dimensional, logistic regression on the cars data set. This program uses the following ORCH functions: hadoop.run hdfs.put orch.export

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-logreg.R") Running logistic regression. [1] 1924.1

example-map.df.R Shows how to run the mapper with an unlimited number of records at one time input as a data frame. The mapper selects cars with a distance greater than 30 from the cars data set and calculates the mean distance. The reducer merges the results. This program uses the following ORCH functions: hadoop.run hdfs.get hdfs.put

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-map.df.R")) Running example of data.frame mapper input: val1 val2 1 17.66667 50.16667 2 13.25000 47.25000 example-map.list.R Shows how to run the mapper with an unlimited number of records at one time input as a list. The mapper selects cars with a distance greater than 30 from the cars data set and calculates the mean distance. The reducer merges the results. This program uses the following ORCH functions: hadoop.run hdfs.get hdfs.put

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-map.list.R" Running example of list mapper input: val1 val2 1 15.9 49 example-model.plot.R Shows how to create models and graphs using HDFS. The mapper provides key-value pairs from the iris data set to the reducer. The reducer creates a linear model from data extracted from the data set, plots the results, and saves them in three HDFS files. This program uses the following ORCH functions: hadoop.run hdfs.download hdfs.exists hdfs.get hdfs.id hdfs.mkdir hdfs.put hdfs.rmdir hdfs.upload orch.export orch.pack

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp")


[1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-model.plot.R" Running model building and graph example. [[1]] [[1]]$predict . . . [[3]]$predict 2 210 3 4 5 6 7 8 5.423893 4.955698 5.938086 5.302979 5.517894 6.302640 4.264954 6.032087 9 10 11 12 13 14 15 16 5.594622 6.080090 5.483347 5.393163 5.719353 4.900061 5.042065 5.462257 17 18 19 20 21 22 23 24 5.448801 6.392824 6.410097 5.032427 5.826811 4.827150 6.358277 5.302979 25 26 27 28 29 30 31 32 5.646442 5.959176 5.230068 5.157157 5.427710 5.924630 6.122271 6.504099 33 34 35 36 37 38 39 40 5.444983 5.251159 5.088064 6.410097 5.406619 5.375890 5.084247 5.792264 41 42 43 44 45 46 47 48 5.698262 5.826811 4.955698 5.753900 5.715536 5.680989 5.320252 5.483347 49 50 5.316435 5.011336

[[3]]$pngfile [1] "/user/oracle/pngfiles/virginica"

[1] "/tmp/orch6e29190de160"

The program generates three graphics files. Figure 5–2 shows the last one.


Figure 5–2 Example example-model.plot.R Output in fit-virginica.png

example-model.prep.R Shows how to distribute data across several map tasks. The mapper generates a data frame from a slice of input data from the iris data set. The reducer merges the data frames into one output data set. This program uses the following ORCH functions: hadoop.exec hdfs.get hdfs.put orch.export orch.keyvals

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-model.prep.R") v1 v2 v3 v4 1 5.1 4.90 4.055200 0.33647224 2 4.9 4.20 4.055200 0.33647224 3 4.7 4.16 3.669297 0.26236426 . . . 299 6.2 18.36 221.406416 1.68639895 300 5.9 15.30 164.021907 1.62924054 example-rlm.R Shows how to convert a simple R program into one that can run as a MapReduce job on a Hadoop cluster. In this example, the program calculates and graphs a linear model on the cars data set using basic R functions.


This program uses the following ORCH functions: hadoop.run orch.keyvals orch.unpack

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-rlm.R" [1] "--- Client lm:"

Call: lm(formula = speed ~ dist, data = cars)

Residuals:
    Min      1Q  Median      3Q     Max
-7.5293 -2.1550  0.3615  2.4377  6.4179

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  8.28391    0.87438   9.474 1.44e-12 ***
dist         0.16557    0.01749   9.464 1.49e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.156 on 48 degrees of freedom
Multiple R-squared: 0.6511,     Adjusted R-squared: 0.6438
F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12

Enter Ctrl+C to get the R prompt, and then close the graphics window: R> dev.off()

Figure 5–3 shows the four graphs generated by the program.


Figure 5–3 Example-rlm.R Output

example-split.map.R Shows how to split the data in the mapper. The first job runs the mapper in list mode and splits the list in the mapper. The second job splits a data frame in the mapper. Both jobs use the cars data set. This program uses the following ORCH functions: hadoop.run hdfs.get hdfs.put orch.keyval

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-split.map.R") Running example of list splitting in mapper key count splits 1 NA 50 6 Running example of data.frame splitting in mapper key count splits 1 NA 50 8 example-split.reduce.R Shows how to split the data from the cars data set in the reducer. This program uses the following ORCH functions:


hadoop.run hdfs.get hdfs.put orch.keyval

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-split.reduce.R") Running example of data.frame reducer input ^C key count 1 1 9 2 1 9 3 1 9 . . . 20 1 9 21 1 5

example-sum.R Shows how to perform a sum operation in a MapReduce job. The first job sums a vector of numeric values, and the second job sums all columns of a data frame. This program uses the following ORCH functions: hadoop.run orch.keyval

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-sum.R" val2 1 6 val2 val3 1 770 2149

example-teragen.matrix.R Shows how to generate large data sets in a matrix for testing programs in Hadoop. The mappers generate samples of random data, and the reducers merge them. This program uses the following ORCH functions: hadoop.run hdfs.put orch.export orch.keyvals

To run this program, ensure that root is set to /tmp, and then source the file. The program runs without printing any output. R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-teragem.matrix.R" Running TeraGen-PCA example: R>

example-teragen.xy.R Shows how to generate large data sets in a data frame for testing programs in Hadoop. The mappers generate samples of random data, and the reducers merge them. This program uses the following ORCH functions: hadoop.run hdfs.put orch.export orch.keyvals

To run this program, ensure that root is set to /tmp, and then source the file. The program runs without printing any output. R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-teragen2.xy.R") Running TeraGen2 example.R> example-teragen2.xy.R Shows how to generate large data sets in a data frame for testing programs in Hadoop. One mapper generates small samples of random data, and the reducers merge them. This program uses the following ORCH functions: hadoop.run hdfs.put orch.export orch.keyvals

To run this program, ensure that root is set to /tmp, and then source the file. The program runs without printing any output. R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-teragem2.xy.R" Running TeraGen2 example. R> example-terasort.R Provides an example of a TeraSort job on a set of randomly generated values. This program uses the following ORCH function: hadoop.run

To run this program, ensure that root is set to /tmp, and then source the file: R> hdfs.setroot("/tmp") [1] "/tmp" R> source("/usr/lib64/R/library/ORCHcore/examples/example-terasort.R" Running TeraSort example: val1 1 -0.001467344 2 -0.004471376 3 -0.005928546 4 -0.007001193 5 -0.010587280 6 -0.011636190


Security Notes for Oracle R Connector for Hadoop

Oracle R Connector for Hadoop can invoke the Sqoop utility to connect to Oracle Database either to extract data or to store results. Sqoop is a command-line utility for Hadoop that imports and exports data between HDFS or Hive and structured databases. The name Sqoop comes from "SQL to Hadoop."

The following explains how Oracle R Connector for Hadoop stores a database user password and sends it to Sqoop.

Oracle R Connector for Hadoop stores a user password only when the user establishes the database connection in a mode that does not require reentering the password each time. The password is stored encrypted in memory. See orch.connect on page 6-41.

Oracle R Connector for Hadoop generates a configuration file for Sqoop and uses it to invoke Sqoop locally. The file contains the user's database password obtained by either prompting the user or from the encrypted in-memory representation. The file has local user access permissions only. The file is created, the permissions are set explicitly, and then the file is opened for writing and filled with data.

Sqoop uses the configuration file to generate custom JAR files dynamically for the specific database job and passes the JAR files to the Hadoop client software. The password is stored inside the compiled JAR file; it is not stored in plain text. The JAR file is transferred to the Hadoop cluster over a network connection. The network connection and the transfer protocol are specific to Hadoop, such as port 5900.

The configuration file is deleted after Sqoop finishes compiling its JAR files and starts its own Hadoop jobs.


6 ORCH Library Reference

This chapter describes the functions in the ORCH library. It contains the following sections:

■ Functions in Alphabetical Order

■ Functions by Category

Functions in Alphabetical Order

Use R help for the ore functions for Hive and the analytic functions:

R> help("function_name")

There are no Help (Rd) files for orch functions. They are described in this chapter.

as.ore as.ore.character as.ore.frame as.ore.integer as.ore.logical as.ore.numeric as.ore.vector hadoop.exec hadoop.run hdfs.attach hdfs.cd hdfs.cp hdfs.describe hdfs.download hdfs.exists hdfs.get hdfs.head hdfs.id hdfs.ls hdfs.mkdir hdfs.mv hdfs.parts hdfs.pull hdfs.push hdfs.put hdfs.pwd hdfs.rm hdfs.rmdir hdfs.root hdfs.sample hdfs.setroot


hdfs.size hdfs.tail hdfs.upload is.hdfs.id is.ore is.ore.character is.ore.frame is.ore.integer is.ore.logical is.ore.numeric is.ore.vector ore.create ore.drop ore.get ore.pull ore.push ore.recode orch.connect orch.connected orch.dbcon orch.dbg.lasterr orch.dbg.off orch.dbg.on orch.dbg.output orch.dbinfo orch.disconnect orch.dryrun orch.evaluate orch.export orch.export.fit orch.keyval orch.keyvals orch.lm orch.lmf orch.neural orch.nmf orch.nmf.NMFalgo orch.pack orch.reconnect orch.temp.path orch.unpack orch.version predict.orch.lm print.orch.lm print.summary.orch.lm summary.orch.lm

Functions by Category

The functions are grouped into these categories:

■ Making Connections

■ Copying Data

■ Exploring Files

■ Writing MapReduce Functions

■ Debugging Scripts

■ Using Hive Data


■ Writing Analytical Functions

Making Connections

orch.connect orch.connected orch.dbcon orch.dbinfo orch.disconnect orch.reconnect

Copying Data

hdfs.attach hdfs.cp hdfs.download hdfs.get hdfs.mv hdfs.pull hdfs.push hdfs.put hdfs.upload orch.export orch.pack orch.unpack

Exploring Files

hdfs.cd hdfs.describe hdfs.exists hdfs.head hdfs.id hdfs.ls hdfs.mkdir hdfs.parts hdfs.pwd hdfs.rm hdfs.rmdir hdfs.root hdfs.sample hdfs.setroot hdfs.size hdfs.tail is.hdfs.id orch.temp.path

Writing MapReduce Functions

hadoop.exec hadoop.run orch.dryrun orch.keyval orch.keyvals

Debugging Scripts

orch.dbg.lasterr orch.dbg.off


orch.dbg.on orch.dbg.output orch.version

Using Hive Data

See "ORE Functions for Hive" on page 5-3.

Writing Analytical Functions

See "Analytic Functions in Oracle R Connector for Hadoop" on page 5-9.

hadoop.exec

Starts the Hadoop engine and sends the mapper, reducer, and combiner R functions for execution. You must load the data into HDFS first.

Usage

hadoop.exec(
        dfs.id,
        mapper,
        reducer,
        combiner,
        export,
        init,
        final,
        job.name,
        config)

Arguments

dfs.id The name of a file in HDFS containing data to be processed. The file name can include a path that is either absolute or relative to the current path.

mapper Name of a mapper function written in the R language.

reducer Name of a reducer function written in the R language (optional).

combiner Not supported in this release.

export Names of exported R objects from your current R environment that are referenced by any of the mapper, reducer, or combiner functions (optional).

init A function that is executed once before the mapper function begins (optional).

final A function that is executed once after the reducer function completes (optional).

job.name A descriptive name that you can use to track the progress of the MapReduce job instead of the automatically generated job name (optional).

config Sets the configuration parameters for the MapReduce job (optional). This argument is an instance of the mapred.config class, and thus it has this format: config = new("mapred.config", param1, param2,...)

See "ORCH mapred.config Class" on page 5-9 for a description of this class.


Usage Notes

Oracle R Connector for Hadoop does not support mapper-only jobs. Use orch.keyvals as a reducer body. See the example in orch.keyvals.

This function provides more control of the data flow than the hadoop.run function. You must use hadoop.exec when chaining several mappers and reducers in a pipeline, because the data does not leave HDFS. The results are stored in HDFS files.
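The following hedged sketch illustrates that chaining: the identifier returned by one hadoop.exec call feeds the next, and the mappers and reducers shown are trivial pass-throughs rather than a real workload.

# Hedged sketch of chaining two MapReduce stages; intermediate data stays in HDFS.
dfs <- hdfs.attach('ontime_R')
stage1 <- hadoop.exec(dfs,
    mapper  = function(k, v)  { orch.keyval(k, v) },       # pass-through mapper
    reducer = function(k, vv) { orch.keyvals(k, vv) })     # pass-through reducer
stage2 <- hadoop.exec(stage1,                               # reuse the HDFS result
    mapper  = function(k, v)  { orch.keyval(k, v) },
    reducer = function(k, vv) { orch.keyvals(k, vv) })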

Return Value

Data object identifier in HDFS

See Also

hadoop.run on page 6-8, orch.dryrun on page 6-54, orch.keyvals on page 6-57

Example

This sample script uses hdfs.attach to obtain the object identifier of a small, sample data file in HDFS named ontime_R. The MapReduce job calculates the average arrival delay of flights arriving at San Francisco International Airport (SFO).

dfs <- hdfs.attach('ontime_R')
res <- NULL
res <- hadoop.exec(
    dfs,
    mapper = function(key, ontime) {
        if (key == 'SFO') {
            keyval(key, ontime)
        }
    },
    reducer = function(key, vals) {
        sumAD <- 0
        count <- 0
        for (x in vals) {
            if (!is.na(x$ARRDELAY)) {sumAD <- sumAD + x$ARRDELAY; count <- count + 1}
        }
        res <- sumAD / count
        keyval(key, res)
    }
)

After the script runs, the res variable identifies the location of the results in an HDFS file named /user/oracle/xq/orch3d0b8218:

R> res
[1] "/user/oracle/xq/orch3d0b8218"
attr(,"dfs.id")
[1] TRUE
R> print(hdfs.get(res))
  val1     val2
1  SFO 27.05804

This code fragment is extracted from example-kmeans.R. The export option makes the ncenters object from the client R environment available to the mapper and reducer functions. The config options provide a MapReduce job name of k-means.1 and a mapper output format of a data frame.

mapf <- data.frame(key=0, val1=0, val2=0)

dfs.points <- hdfs.put(points)
dfs.centers <- hadoop.exec(
    dfs.id = dfs.points,
    mapper = function(k, v) {
        keyval(sample(1:ncenters, 1), v)
    },
    reducer = function(k, vv) {
        vv <- sapply(vv, unlist)
        keyval(NULL, c(mean(vv[1,]), mean(vv[2,])))
    },
    export = orch.export(ncenters),
    config = new("mapred.config",
        job.name   = "k-means.1",
        map.output = mapf)
)

hadoop.run

Starts the Hadoop engine and sends the mapper, reducer, and combiner R functions for execution. If the data is not already stored in HDFS, then hadoop.run first copies the data there.

Usage

hadoop.run(
        data,
        mapper,
        reducer,
        combiner,
        export,
        init,
        final,
        job.name,
        config)

Arguments

data A data frame, Oracle R Enterprise frame (ore.frame), or an HDFS file name.

mapper The name of a mapper function written in the R language.

reducer The name of a reducer function written in the R language (optional).

combiner Not supported in this release.

export The names of exported R objects.

init A function that is executed once before the mapper function begins (optional).

final A function that is executed once after the reducer function completes (optional).

job.name A descriptive name that you can use to track the progress of the job instead of the automatically generated job name (optional).

config Sets the configuration parameters for the MapReduce job (optional). This argument is an instance of the mapred.config class, and so it has this format: config = new("mapred.config", param1, param2,...

See "ORCH mapred.config Class" on page 5-9 for a description of this class.


Usage Notes

Oracle R Connector for Hadoop does not support mapper-only jobs. Use orch.keyvals as a reducer body. See the example in orch.keyvals.

The hadoop.run function returns the results from HDFS to the source of the input data. For example, the results for HDFS input data are kept in HDFS, and the results for ore.frame input data are copied into an Oracle database.

Return Value

An object in the same format as the input data

See Also

hadoop.exec on page 6-5, orch.dryrun on page 6-54, orch.keyvals on page 6-57

Example

This sample script uses hdfs.attach to obtain the object identifier of a small, sample data file in HDFS named ontime_R. The MapReduce job calculates the average arrival delay of flights arriving at San Francisco International Airport (SFO).

dfs <- hdfs.attach('ontime_R')
res <- NULL
res <- hadoop.run(
    dfs,
    mapper = function(key, ontime) {
        if (key == 'SFO') {
            keyval(key, ontime)
        }
    },
    reducer = function(key, vals) {
        sumAD <- 0
        count <- 0
        for (x in vals) {
            if (!is.na(x$ARRDELAY)) {sumAD <- sumAD + x$ARRDELAY; count <- count + 1}
        }
        res <- sumAD / count
        keyval(key, res)
    }
)

After the script runs, the location of the results is identified by the res variable, in an HDFS file named /user/oracle/xq/orch3d0b8218:

R> res
[1] "/user/oracle/xq/orch3d0b8218"
attr(,"dfs.id")
[1] TRUE
R> print(hdfs.get(res))
  val1     val2
1  SFO 27.05804

The next example increments all values in the mapper and uses orch.keyvals as a pass-through reducer:

hadoop.run(x,
    mapper = function(k, v) {
        orch.keyval(k, v+1)    # increment all values
    },


    reducer = function(k, vv) {
        orch.keyvals(k, vv)    # pass-through
    }
)

hdfs.attach

Copies data from an unstructured data file in HDFS into the Oracle R Connector for Hadoop framework. By default, data files in HDFS are not visible to the connector. However, if you know the name of the data file, you can use this function to attach it to the Oracle R Connector for Hadoop name space.

Usage

hdfs.attach(
        dfs.name,
        force)

Arguments

dfs.name The name of a file in HDFS.

force Controls whether the function attempts to discover the structure of the file and the data type of each column.

FALSE for comma-separated value (CSV) files (default). If a file does not have metadata identifying the names and data types of the columns, then the function samples the data to deduce the data type as number or string. It then re-creates the file with the appropriate metadata.

TRUE for non-CSV files, including binary files. This setting prevents the function from trying to discover the metadata; instead, it simply attaches the file.

Usage Notes

Use this function to attach a CSV file to your R environment, just as you might attach a data frame.

Oracle R Connector for Hadoop does not support the processing of attached non-CSV files. Nonetheless, you can attach a non-CSV file, download it to your local computer, and use it as desired. Alternatively, you can attach the file for use as input to a Hadoop application.

This function can become slow when processing large input HDFS files, as the result of inherited limitations in the Hadoop command-line interface.
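A minimal, hedged sketch of the non-CSV case follows; the HDFS file name logo.png and the local target path are hypothetical.

# Hedged sketch: attach a non-CSV (binary) HDFS file and download it unchanged.
# "logo.png" is a hypothetical HDFS file name.
dfs <- hdfs.attach("logo.png", force=TRUE)
hdfs.download(dfs, "/tmp/logo.png", overwrite=TRUE)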

Return Value

The object ID of the file in HDFS, or NULL if the operation failed

See Also

hdfs.download on page 6-16

Example

This example stores the object ID of ontime_R in a variable named dfs, and then displays its value.

R> dfs <- hdfs.attach('ontime_R')
R> dfs


[1] "/user/oracle/xq/ontime_R" attr(,"dfs.id") [1] TRUE

hdfs.cd

Sets the default HDFS path.

Usage

hdfs.cd(dfs.path)

Arguments

dfs.path A path that is either absolute or relative to the current path.

Return Value

TRUE if the path is changed successfully, or FALSE if the operation failed

Example

This example changes the current directory from /user/oracle to /user/oracle/sample:

R> hdfs.cd("sample")
[1] "/user/oracle/sample"

hdfs.cp

Copies an HDFS file from one location to another.

Usage

hdfs.cp(
        dfs.src,
        dfs.dst,
        force)

Arguments

dfs.src The name of the source file to be copied. The file name can include a path that is either absolute or relative to the current path.

dfs.dst The name of the copied file. The file name can include a path that is either absolute or relative to the current path.

force Set to TRUE to overwrite an existing file, or set to FALSE to display an error message (default).

Return Value

NULL for a successful copy, or FALSE for a failed attempt

Example

This example copies a file named weblog to the parent directory, overwriting the existing weblog file there:

R> hdfs.cp("weblog", "..", force=T)

hdfs.describe

Returns the metadata associated with a file in HDFS.

Usage

hdfs.describe(
        dfs.id)

Arguments

dfs.id The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

Return Value

A data frame containing the metadata, or NULL if no metadata was available in HDFS

Example

This example provides information about an HDFS file named ontime_DB:

R> hdfs.describe('ontime_DB')
           name
1          path
2        origin
3         class
4         types
5           dim
6         names
7       has.key
8    key.column
9      null.key
10 has.rownames
11         size
12        parts

1 values 2 ontime_DB 3 unknown . . .

hdfs.download

Copies a file from HDFS to the local file system.

Usage

hdfs.download(
        dfs.id,
        filename,
        overwrite)

Arguments

dfs.id The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

filename The name of a file in the local file system where the data is copied.

overwrite Controls whether the operation can overwrite an existing local file. Set to TRUE to overwrite filename, or FALSE to signal an error (default).

Usage Notes

This function provides the fastest and easiest way to copy a file from HDFS. No data transformations occur except merging multiple parts into a single file. The local file has the exact same data as the HDFS file.

Return Value

Local file name, or NULL if the copy failed

Example

This example displays a list of files in the current HDFS directory and copies ontime2000_DB to the local file system as /home/oracle/ontime2000.dat.

R> hdfs.ls()
[1] "ontime2000_DB" "ontime_DB"  "ontime_File"  "ontime_R"  "testdata.dat"
R> tmpfile <- hdfs.download("ontime2000_DB", "/home/oracle/ontime2000.dat", overwrite=F)
R> tmpfile
[1] "/home/oracle/ontime2000.dat"

hdfs.exists

Verifies that a file exists in HDFS.

Usage

hdfs.exists(
        dfs.id)

Arguments

dfs.id The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

Usage6 Notes If this function returns TRUE, then you can attach the data and use it in a hadoop.run function. You can also use this function to validate an HDFS identifier and ensure that the data exists.

Return6 Value TRUE if the identifier is valid and the data exists, or FALSE if the object is not found

See6 Also is.hdfs.id on page 6-40

Example6 This example shows that the ontime_R file exists. R> hdfs.exists("ontime_R") [1] TRUE
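
The following sketch shows a common way to combine hdfs.exists with hdfs.attach; the file name ontime_R is reused from the example above, and the error message is only illustrative.

# Attach the HDFS file only if it exists; otherwise stop with a clear error.
if (hdfs.exists("ontime_R")) {
    dfs <- hdfs.attach("ontime_R")
} else {
    stop("ontime_R was not found under the current HDFS path")
}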

hdfs.get

Copies data from HDFS into a data frame in the local R environment. All metadata is extracted, and all attributes, such as column names and data types, are restored if the data originated in an R environment. Otherwise, generic attributes like val1 and val2 are assigned.

Usage

hdfs.get(dfs.id, sep)

Arguments

dfs.id
The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

sep
The symbol used to separate fields in the file. A comma (,) is the default separator.

Usage Notes

If the HDFS file is small enough to fit into an in-memory R data frame, then you can copy the file using this function instead of the hdfs.pull function. The hdfs.get function can be faster, because it does not use Sqoop and thus does not have the overhead incurred by hdfs.pull.

Return Value

A data.frame object in memory in the local R environment pointing to the exported data set, or NULL if the operation failed

Example

This example prints the contents of the HDFS object identified by res as a data frame.

R> print(hdfs.get(res))
   val1      val2
1    AA 1361.4643
2    AS  515.8000
3    CO 2507.2857
4    DL 1601.6154
5    HP  549.4286
6    NW 2009.7273
7    TW 1906.0000
8    UA 1134.0821
9    US 2387.5000
10   WN  541.1538

hdfs.head

Copies a specified number of lines from the beginning of a file in HDFS.

Usage

hdfs.head(dfs.id, n)

Arguments

dfs.id
The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

n
The number of lines to retrieve from the file. The default is 10.

Return Value

The first n lines of the file

See Also

hdfs.tail on page 6-37

Example

This example returns the first two lines of ontime_R:

R> hdfs.head('ontime_R', 2)
[1] "\"\",\"YEAR\",\"MONTH\",\"MONTH2\",\"DAYOFMONTH\",\"DAYOFMONTH2\",\"DAYOFWEEK\",\"DEPTIME\",\"CRSDEPTIME\",\"ARRTIME\",\"CRSARRTIME\",\"UNIQUECARRIER\",\"FLIGHTNUM\",\"TAILNUM\",\"ACTUALELAPSEDTIME\",\"CRSELAPSEDTIME\",\"AIRTIME\",\"ARRDELAY\",\"DEPDELAY\",\"ORIGIN\",\"DEST\",\"DISTANCE\",\"TAXIIN\",\"TAXIOUT\",\"CANCELLED\",\"CANCELLATIONCODE\",\"DIVERTED\""
[2] "\"1\",1994,8,\"M08\",2,\"D02\",2,840,840,1126,1130,\"NW\",878,NA,106,110,NA,-4,0,\"MEM\",\"TPA\",656,NA,NA,0,NA,\"0\""

hdfs.id

Converts an HDFS path name to an R dfs.id object.

Usage

hdfs.id(dfs.x, force)

Arguments

dfs.x
A string or text expression that resolves to an HDFS file name.

force
Set to TRUE if the file need not exist, or set to FALSE to ensure that the file does exist.

Return Value

A dfs.id object if the string resolves to an HDFS file name (or force is TRUE), or NULL if a file by that name is not found

Example

This example creates a dfs.id object for /user/oracle/demo:

R> hdfs.id('/user/oracle/demo')
[1] "user/oracle/demo"
attr(,"dfs.id")
[1] TRUE

The next example creates a dfs.id object named id for a nonexistent directory named /user/oracle/newdemo, after first failing:

R> id <- hdfs.id('/user/oracle/newdemo')
DBG: 16:11:38 [ER] "/user/oracle/newdemo" is not found
R> id <- hdfs.id('/user/oracle/newdemo', force=T)
R> id
[1] "user/oracle/newdemo"
attr(,"dfs.id")
[1] TRUE

hdfs.ls

Lists the names of all HDFS directories containing data in the specified path.

Usage

hdfs.ls(dfs.path)

Arguments

dfs.path
A path relative to the current default path. The default path is the current working directory.

Return Value

A list of data object names in HDFS, or NULL if the specified path is invalid

See Also

hdfs.cd on page 6-13

Example

This example lists the subdirectories in the current directory:

R> hdfs.ls()
[1] "ontime_DB"   "ontime_FILE" "ontime_R"

The next example lists directories in the parent directory:

R> hdfs.ls("..")
[1] "demo"   "input"  "output" "sample" "xq"

This example returns NULL because the specified path is not in HDFS.

R> hdfs.ls("/bin")
NULL

hdfs.mkdir

Creates a subdirectory in HDFS relative to the current working directory.

Usage

hdfs.mkdir(dfs.name, cd)

Arguments

dfs.name
Name of the new directory.

cd
Set to TRUE to change the current working directory to the new subdirectory, or FALSE to keep the current working directory (default).

Return Value

The full path of the new directory as a string, or NULL if the directory was not created

Example

This example creates the /user/oracle/sample directory.

R> hdfs.mkdir('sample', cd=T)
[1] "/user/oracle/sample"
attr(,"dfs.path")
[1] TRUE

hdfs.mv

Moves an HDFS file from one location to another.

Usage

hdfs.mv(dfs.src, dfs.dst, force)

Arguments

dfs.src
The name of the source file to be moved. The file name can include a path that is either absolute or relative to the current path.

dfs.dst
The name of the moved file. The file name can include a path that is either absolute or relative to the current path.

force
Set to TRUE to overwrite an existing destination file, or FALSE to cancel the operation and display an error message (default).

Return Value

NULL for a successful move, or FALSE for a failed attempt

Example

This example moves a file named weblog to the demo subdirectory, overwriting any existing weblog file there:

R> hdfs.mv("weblog", "./demo", force=T)

hdfs.parts

Returns the number of parts composing a file in HDFS.

Usage

hdfs.parts(dfs.id)

Arguments

dfs.id
The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

Usage Notes

HDFS splits large files into parts, which provide a basis for the parallelization of MapReduce jobs. The more parts an HDFS file has, the more mappers can run in parallel.

Return Value

The number of parts composing the object, or 0 if the object does not exist in HDFS

Example

This example shows that the ontime_R file in HDFS has one part:

R> hdfs.parts("ontime_R")
[1] 1

hdfs.pull

Copies data from HDFS into an Oracle database. This operation requires authentication by Oracle Database. See orch.connect on page 6-41.

Usage

hdfs.pull(dfs.id, sep, db.name, overwrite, driver)

Arguments

dfs.id
The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

sep
The symbol used to separate fields in the file (optional). A comma (,) is the default separator.

db.name
The name of a table in an Oracle database.

overwrite
Controls whether db.name can overwrite a table with the same name. Set to TRUE to overwrite the table, or FALSE to signal an error (default).

driver
Identifies the driver used to copy the data: sqoop (default) or olh to use Oracle Loader for Hadoop. You must set up Oracle Loader for Hadoop before using it as a driver. See the Usage Notes and "Oracle Loader for Hadoop Setup" on page 1-12.

Usage Notes

With the Oracle Advanced Analytics option, you can use Oracle R Enterprise to analyze the data after loading it into an Oracle database.

Choosing a Driver

Sqoop is synchronous, so copying a large data set may take a while. The prompt reappears and you regain use of R when copying is complete. Oracle Loader for Hadoop is much faster than Sqoop, so use it as the driver if possible.

Correcting Problems With the OLH Driver

If Oracle Loader for Hadoop is available, then you see this message when the ORCH library is loading:

OLH 2.0.0 is up

If you do not see this message, then Oracle Loader for Hadoop is not installed properly. Check that these environment variables are set correctly:

■ OLH_HOME: Set to the installation directory

■ HADOOP_CLASSPATH: Includes $OLH_HOME/jlib/*

■ CLASSPATH: Includes $OLH_HOME/jlib/*

If hdfs.pull fails and HADOOP_CLASSPATH is set correctly, then the version of Oracle Loader for Hadoop may be incorrect for the version of CDH. Check the Oracle Loader for Hadoop log file.

See Also: "Oracle Loader for Hadoop Setup" on page 1-12 for installation instructions

Return Value

An ore.frame object that points to the database table with data loaded from HDFS, or NULL if the operation failed

See Also

Oracle R Enterprise User's Guide for a description of ore.frame objects.
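
The following is a minimal sketch of a typical hdfs.pull call; the file name ontime_R, the table name ONTIME_IMPORT, and the connection details are placeholders, and Sqoop is assumed to be configured as described in Chapter 1.

# Connect to the database first; hdfs.pull requires an Oracle Database connection.
library(ORCH)
orch.connect("localhost", "RQUSER", "orcl")

# Copy the HDFS file ontime_R into a new table named ONTIME_IMPORT using the
# default Sqoop driver. The result is an ore.frame that points to the new table.
ontime.db <- hdfs.pull("ontime_R",
                       sep = ",",
                       db.name = "ONTIME_IMPORT",
                       overwrite = FALSE)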

hdfs.push

Copies data from an Oracle database to HDFS. This operation requires authentication by Oracle Database. See orch.connect on page 6-41.

Usage

hdfs.push(x, key, dfs.name, overwrite, driver, split.by)

Arguments

x
An ore.frame object with the data in an Oracle database to be pushed.

key
The index or name of the key column.

dfs.name
A unique name for the object in HDFS.

overwrite
Set to TRUE to allow dfs.name to overwrite an object with the same name, or FALSE to signal an error (default).

driver
Identifies the driver used to copy the data. This argument is currently ignored because Sqoop is the only supported driver.

split.by
The column to use for data partitioning (required).

Usage Notes

Because this operation is synchronous, copying a large data set may take a while. The prompt reappears and you regain use of R when copying is complete.

An ore.frame object is an Oracle R Enterprise metadata object that points to a database table. It corresponds to an R data.frame object.

If you omit the split.by argument, then hdfs.push might import only a portion of the data into HDFS.

Return Value

The full path to the file that contains the data set, or NULL if the operation failed

See Also

Oracle R Enterprise User's Guide

Example

This example creates an ore.frame object named ontime_s2000 that contains the rows from the ONTIME_S database table where the year equals 2000. Then hdfs.push uses ontime_s2000 to create /user/oracle/xq/ontime2000_DB in HDFS.

R> ontime_s2000 <- ONTIME_S[ONTIME_S$YEAR == 2000,]
R> class(ontime_s2000)
[1] "ore.frame"
attr(,"package")
[1] "OREbase"
R> ontime2000.dfs <- hdfs.push(ontime_s2000, key='DEST', dfs.name='ontime2000_DB', split.by='YEAR')
R> ontime2000.dfs
[1] "/user/oracle/xq/ontime2000_DB"
attr(,"dfs.id")
[1] TRUE

hdfs.put

Copies data from an R in-memory object (data.frame) to HDFS. All data attributes, such as column names and data types, are stored as metadata with the data.

Usage

hdfs.put(data, key, dfs.name, overwrite, rownames)

Arguments

data
An ore.frame object in the local R environment to be copied to HDFS.

key
The index or name of the key column.

dfs.name
A unique name for the new file.

overwrite
Controls whether dfs.name can overwrite a file with the same name. Set to TRUE to overwrite the file, or FALSE to signal an error.

rownames
Set to TRUE to add a sequential number to the beginning of each line of the file, or FALSE otherwise.

Usage Notes

You can use hdfs.put instead of hdfs.push to copy data from ore.frame objects, such as database tables, to HDFS. The table must be small enough to fit in R memory; otherwise, the function fails. The hdfs.put function first reads all table data into local R memory and then transfers it to HDFS. For a small table, this function can be faster than hdfs.push because it does not use Sqoop and thus does not have the overhead incurred by hdfs.push.

Return Value

The object ID of the new file, or NULL if the operation failed

Example

This example creates a file named /user/oracle/xq/testdata.dat with the contents of the dat data frame.

R> myfile <- hdfs.put(dat, key='DEST', dfs.name='testdata.dat')
R> print(myfile)
[1] "/user/oracle/xq/testdata.dat"
attr(,"dfs.id")
[1] TRUE

hdfs.pwd

Identifies the current working directory in HDFS.

Usage

hdfs.pwd()

Return Value

The current working directory, or NULL if your R environment is not connected to HDFS

Example

This example shows that /user/oracle is the current working directory.

R> hdfs.pwd()
[1] "/user/oracle/"

hdfs.rm

Removes a file or directory from HDFS.

Usage

hdfs.rm(dfs.id, force)

Arguments

dfs.id
The name of a file or directory in HDFS. The name can include a path that is either absolute or relative to the current path.

force
Controls whether a directory that contains files is deleted. Set to TRUE to delete the directory and all its files, or FALSE to cancel the operation (default).

Usage Notes

All object identifiers in Hadoop pointing to this data are invalid after this operation.

Return Value

TRUE if the data is deleted, or FALSE if the operation failed

Example

This example removes the file named data1.log in the current working HDFS directory:

R> hdfs.rm("data1.log")
[1] TRUE

hdfs.rmdir

Deletes a directory in HDFS.

Usage

hdfs.rmdir(dfs.name, force)

Arguments

dfs.name
Name of the directory in HDFS to delete. The directory can be an absolute path or relative to the current working directory.

force
Controls whether a directory that contains files is deleted. Set to TRUE to delete the directory and all its files, or FALSE to cancel the operation (default).

Usage Notes

This function deletes all data objects stored in the directory, which invalidates all associated object identifiers in HDFS.

Return Value

TRUE if the directory is deleted successfully, or FALSE if the operation fails

Example

This example deletes the mydata directory:

R> hdfs.rmdir("mydata")
[1] TRUE

hdfs.root

Returns the HDFS root directory.

Usage

hdfs.root()

Return Value

A data frame with the full path of the HDFS root directory

Usage Notes

The default HDFS root is set to /. Users do not have write privileges to this directory. To change the HDFS root to a path where you have write access, use the hdfs.setroot function.

See Also

hdfs.setroot on page 6-35

Example

This example identifies /user/oracle as the root directory of HDFS.

R> hdfs.root()
[1] "/user/oracle"

hdfs.sample

Copies a random sample of data from a Hadoop file into an R in-memory object. Use this function to copy a small sample of the original HDFS data for developing the R calculation that you ultimately want to execute on the entire HDFS data set on the Hadoop cluster.

Usage

hdfs.sample(dfs.id, lines, sep)

Arguments

dfs.id
The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

lines
The number of lines to return as a sample. The default value is 1000 lines.

sep
The symbol used to separate fields in the Hadoop file. A comma (,) is the default separator.

Usage Notes

If the data originated in an R environment, then all metadata is extracted and all attributes are restored, including column names and data types. Otherwise, generic attribute names, such as val1 and val2, are assigned. This function can become slow when processing large input HDFS files, as the result of inherited limitations in the Hadoop command-line interface.

Return Value

A data.frame object with the sample data set, or NULL if the operation failed

Example

This example returns a three-line sample of the ontime_R file.

R> hdfs.sample("ontime_R", lines=3)
  YEAR MONTH MONTH2 DAYOFMONTH DAYOFMONTH2 DAYOFWEEK DEPTIME...
1 2000    12     NA         31          NA         7    1730...
2 2000    12     NA         31          NA         7    1752...
3 2000    12     NA         31          NA         7    1803...

hdfs.setroot

Sets the HDFS root directory.

Usage

hdfs.setroot(dfs.root)

Arguments

dfs.root
The full path in HDFS to be set as the current root directory. If this argument is omitted, then your HDFS home directory is used as the root.

Usage Notes

Use hdfs.root on page 6-33 to see the current root directory. All HDFS paths and operations using Oracle R Connector for Hadoop are relative to the current HDFS root. You cannot change the current working directory above the root directory. Users do not have write access to the HDFS / directory, which is the default HDFS root directory.

Return Value

The full path to the newly set HDFS root directory, or NULL if an error prevented a new root directory from being set

Example

This example changes the HDFS root directory from /user/oracle to /user/oracle/demo.

R> hdfs.root()
[1] "/user/oracle"
R> hdfs.setroot("/user/oracle/demo")
R> hdfs.root()
[1] "/user/oracle/demo"

hdfs.size

Returns the size of a file in HDFS.

Usage

hdfs.size(dfs.id, units)

Arguments

dfs.id
The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

units
Specifies a unit of measurement for the return value:

■ KB (kilobytes)

■ MB (megabytes)

■ GB (gigabytes)

■ TB (terabytes)

■ PB (petabytes)

The unit defaults to bytes if you omit the argument or enter an unknown value.

Usage Notes

Use this interface to determine, for instance, whether you can copy the contents of an entire HDFS file into local R memory or a local file, or whether you can only sample the data while creating a prototype of your R calculation.

Return Value

Size of the object, or 0 if the object does not exist in HDFS

Example

This example returns a file size for ontime_R of 999,839 bytes.

R> hdfs.size("ontime_R")
[1] 999839
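
To get the size in a larger unit, pass the units argument; this short sketch reuses the ontime_R file from the example above.

# Report the size of ontime_R in megabytes rather than bytes.
# (units is assumed here to accept the unit name as a string.)
hdfs.size("ontime_R", units = "MB")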

hdfs.tail

Copies a specified number of lines from the end of a file in HDFS.

Usage

hdfs.tail(dfs.id, n)

Arguments

dfs.id
The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

n
The number of lines to retrieve from the file.

Return Value

The last n lines of the file

See Also

hdfs.head on page 6-19

Example

This example returns the last three lines of ontime_R:

R> hdfs.tail('ontime_R', 3)
[1] "\"219930\",1994,8,\"M08\",1,\"D01\",1,1947,1945,2038,2048,\"US\",1073,NA,51,63,NA,-10,2,\"PIT\",\"LAN\",275,NA,NA,0,NA,\"0\""
[2] "\"219931\",1994,8,\"M08\",2,\"D02\",2,802,805,948,950,\"AA\",1726,NA,106,105,NA,-2,-3,\"DFW\",\"BNA\",631,NA,NA,0,NA,\"0\""
[3] "\"219932\",1994,8,\"M08\",2,\"D02\",2,829,832,1015,1011,\"US\",1191,NA,166,159,NA,4,-3,\"BWI\",\"MSY\",998,NA,NA,0,NA,\"0\""

hdfs.upload

Copies a file from the local file system into HDFS.

Usage

hdfs.upload(filename, dfs.name, overwrite, split.size, header)

Arguments

filename
The name of a file in the local file system.

dfs.name
The name of the new directory in HDFS.

overwrite
Controls whether dfs.name can overwrite a directory with the same name. Set to TRUE to overwrite the directory, or FALSE to signal an error (default).

split.size
The maximum number of bytes in each part of the Hadoop file (optional).

header
Indicates whether the first line of the local file is a header containing column names. Set to TRUE if it has a header, or FALSE if it does not (default). A header enables you to extract the column names and reference the data fields by name instead of by index in your MapReduce R scripts.

Usage Notes

This function provides the fastest and easiest way to copy a file into HDFS. If the file is larger than split.size, then Hadoop splits it into two or more parts. The new Hadoop file gets a unique object ID, and each part is named part-0000x. Hadoop automatically creates metadata for the file.

Return Value

The HDFS object ID for the loaded data, or NULL if the copy failed

See Also

hdfs.download on page 6-16, hdfs.get on page 6-18, hdfs.put on page 6-29

Example

This example uploads a file named ontime_s2000.dat into HDFS and shows the location of the file, which is stored in a variable named ontime.dfs_File.

R> ontime.dfs_File <- hdfs.upload('ontime_s2000.dat', dfs.name='ontime_File')
R> print(ontime.dfs_File)
[1] "/user/oracle/xq/ontime_File"

is.hdfs.id

Indicates whether an R object contains a valid HDFS file identifier.

Usage

is.hdfs.id(x)

Arguments

x
The name of an R object.

Return Value

TRUE if x is a valid HDFS identifier, or FALSE if it is not

See Also

hdfs.attach on page 6-11, hdfs.id on page 6-20

Example

This example shows that dfs contains a valid HDFS identifier, which was returned by hdfs.attach:

R> dfs <- hdfs.attach('ontime_R')
R> is.hdfs.id(dfs)
[1] TRUE
R> print(dfs)
[1] "/user/oracle/xq/ontime_R"
attr(,"dfs.id")
[1] TRUE

The next example shows that a valid file name passed as a string is not recognized as a valid file identifier:

R> is.hdfs.id('/user/oracle/xq/ontime_R')
[1] FALSE

orch.connect

Establishes a connection to Oracle Database.

Usage

orch.connect(host, user, sid, passwd, port, secure, driver, silent)

Arguments

host
The host name or IP address of the server where Oracle Database is running.

user
The database user name.

sid
The system ID (SID) of the Oracle Database instance.

passwd
The password for the database user. If you omit the password, you are prompted for it.

port
The port number for the Oracle Database listener. The default value is 1521.

secure
The authentication setting for Oracle Database:

■ TRUE: You must enter a database password each time you attempt to connect (default).

■ FALSE: You must enter a database password only once during a session. The encrypted password is kept in memory and used for subsequent connection attempts.

driver
The driver used to connect to Oracle Database (optional). Sqoop is the default driver.

silent
Set to TRUE to suppress the prompts for missing host, user, password, port, and SID values, or FALSE to see them (default).

Usage Notes

Use this function when your analysis requires access to data stored in an Oracle database or must return its results to the database.

With an Oracle Advanced Analytics license for Oracle R Enterprise and a connection to Oracle Database, you can work directly with the data stored in database tables and pass processed data frames to R calculations on Hadoop.

You can reconnect to Oracle Database using the connection object returned by the orch.dbcon function.

Return Value

TRUE for a successful and validated connection, or FALSE for a failed connection attempt

See Also

orch.dbcon on page 6-45, orch.disconnect on page 6-52

Example

This example loads the ORCH library and connects to Oracle Database on the local system:

R> library(ORCH)
Loading required package: OREbase

Attaching package: OREbase

The following object(s) are masked from package:base:

    cbind, data.frame, eval, interaction, order, paste, pmax, pmin, rbind, table

Loading required package: OREstats
Loading required package: MASS
Loading required package: ORCHcore
Oracle R Connector for Hadoop 2.0
Hadoop 2.0.0-cdh4.1.2 is up
Sqoop 1.4.1-cdh4.1.2 is up
OLH 2.0.0 is up

R> orch.connect("localhost", "RQUSER", "orcl")
Connecting ORCH to RDBMS via [sqoop]
    Host: localhost
    Port: 1521
    SID:  orcl
    User: RQUSER
Enter password for [RQUSER]: password
Connected.
[1] TRUE

Note: If you see the message "Something is terribly wrong with Hadoop MapReduce" when you load ORCH, then clear the HADOOP_HOME environment variable. See "Installing the Software on Hadoop" on page 1-14.

The next example uses a connection object to reconnect to Oracle Database:

R> conn <- orch.dbcon()
R> orch.disconnect()
Disconnected from a database.

R> orch.connect(conn)
Connecting ORCH to RDBMS via [sqoop]
    Host: localhost
    Port: 1521
    SID:  orcl
    User: RQUSER
Enter password for [RQUSER]: password
Connected
[1] TRUE
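
If you prefer to be prompted for the password only once per session, you can pass secure=FALSE; the host, user, and SID below are the same placeholder values used above.

# Cache the encrypted password in memory for this session so that later
# connection attempts do not prompt for it again.
orch.connect("localhost", "RQUSER", "orcl", secure = FALSE)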

orch.connected

Checks whether Oracle R Connector for Hadoop is connected to Oracle Database.

Usage

orch.connected()

Return Value

TRUE if Oracle R Connector for Hadoop has a connection to Oracle Database, or FALSE if it does not

Example

This example shows that Oracle R Connector for Hadoop has a connection to Oracle Database:

R> orch.connected()
[1] TRUE

orch.dbcon

Returns a connection object for the current connection to Oracle Database, excluding the authentication credentials.

Usage

orch.dbcon()

Return Value

A data frame with the connection settings for Oracle Database

Usage Notes

Use the connection object returned by orch.dbcon to reconnect to Oracle Database using orch.connect.

See Also

orch.connect on page 6-41

Example

This example shows how you can reconnect to Oracle Database using the connection object returned by orch.dbcon:

R> orch.connect('localhost', 'RQUSER', 'orcl')
Connecting ORCH to RDBMS via [sqoop]
    Host: localhost
    Port: 1521
    SID:  orcl
    User: RQUSER
Enter password for [RQUSER]: password
Connected
[1] TRUE

R> conn <- orch.dbcon()
R> orch.disconnect()
Disconnected from a database.

R> orch.connect(conn)
Connecting ORCH to RDBMS via [sqoop]
    Host: localhost
    Port: 1521
    SID:  orcl
    User: RQUSER
Enter password for [RQUSER]: password
Connected
[1] TRUE

The following shows the connection object returned by orch.dbcon in the previous example:

R> conn
Object of class "orch.dbcon"
data frame with 0 columns and 0 rows
Slot "ok":
[1] TRUE

Slot "host":
[1] "localhost"

Slot "port":
[1] 1521

Slot "sid":
[1] "orcl"

Slot "user":
[1] "RQUSER"

Slot "passwd":
[1] ""

Slot "secure":
[1] TRUE

Slot "drv":
[1] "sqoop"

orch.dbg.lasterr

Returns the last error message.

Usage

orch.dbg.lasterr()

Return Value

The last error message

Example

This example returns the last error message, which indicates that a required split column was not specified.

R> orch.dbg.lasterr()
[1] "split column must be specified"

orch.dbg.off

Turns off debugging mode.

Usage

orch.dbg.off()

Return Value

FALSE

See Also

orch.dbg.on on page 6-49

Example

This example turns off debugging:

R> orch.dbg.off()

orch.dbg.on

Turns on debugging mode.

Usage

orch.dbg.on(severity)

Arguments

severity
Identifies the type of error messages that are displayed. You can identify the severity by the number or by the case-insensitive keyword shown in Table 6–1.

Table 6–1 Debugging Severity Levels

Keyword    Number    Description
all        11        Return all messages.
critical   1         Return only critical errors.
error      2         Return all errors.
warning    3         Return all warnings.

Return Value

The severity level

See Also

orch.dbg.output on page 6-50, orch.dbg.off on page 6-48

Example

This example turns on debugging for all errors:

R> severe <- orch.dbg.on(severity<-2)
R> severe
[1] "ERROR" "2"

orch.dbg.output

Directs the output from the debugger.

Usage

orch.dbg.output(con)

Arguments

con
Identifies the stream where the debugging information is sent: stderr(), stdout(), or a file name.

Usage Notes

You must turn on debugging mode before redirecting the output.

Return Value

The current stream

See Also

orch.dbg.on on page 6-49

Example

This example turns on debugging mode and sends the debugging information to stderr. The orch.dbg.output function returns a description of stderr.

R> orch.dbg.on('all')
R> err <- orch.dbg.output(stderr())
17:32:11 [SY] debug output set to "stderr"
R> print(err)
description       class        mode        text      opened    can read   can write
   "stderr"  "terminal"         "w"      "text"    "opened"        "no"       "yes"

The next example redirects the output to a file named debug.log:

R> err <- orch.dbg.output('debug.log')
17:37:45 [SY] debug output set to "debug.log"
R> print(err)
[1] "debug.log"

orch.dbinfo

Displays information about the current connection.

Usage

orch.dbinfo(dbcon)

Arguments

dbcon
An Oracle Database connection object.

Return Value

NULL

See Also

orch.dbcon on page 6-45

Example

This example displays the information stored in a connection object named conn.

R> orch.dbinfo(conn)
Connected to a database via [sqoop]
    Host: localhost
    Port: 1521
    SID:  orcl
    User: RQUSER

orch.disconnect

Disconnects the local R session from Oracle Database.

Usage

orch.disconnect(silent, dbcon)

Arguments

silent
Set to TRUE to suppress all messages, or FALSE to see them (default).

dbcon
Set to TRUE to return the connection details, or FALSE to suppress this information (default).

Usage Notes

ORCH functions that access Oracle Database do not work without a connection.

Return Value

An Oracle Database connection object when dbcon is TRUE, otherwise NULL

See Also

orch.connect on page 6-41

Example

This example disconnects the local R session from Oracle Database:

R> orch.disconnect()
Disconnected from a database.

The next example disconnects the local R session from Oracle Database and displays the returned connection object:

R> oid <- orch.disconnect(silent=TRUE, dbcon=TRUE)
R> oid
Object of class "orch.dbcon"
data frame with 0 columns and 0 rows
Slot "ok":
[1] TRUE

Slot "host":
[1] "localhost"

Slot "port":
[1] 1521

Slot "sid":
[1] "orcl"

Slot "user":
[1] "RQUSER"

Slot "passwd":
[1] ""

Slot "secure":
[1] TRUE

Slot "drv":
[1] "sqoop"

orch.dryrun

Switches the execution platform between the local host and the Hadoop cluster. No changes in the R code are required for a dry run.

Usage

orch.dryrun(onoff)

Arguments

onoff
Set to TRUE to run a MapReduce program locally, or FALSE to run the program on the Hadoop cluster.

Usage Notes

The orch.dryrun function enables you to run a MapReduce program locally on a laptop using a small data set before running it on a Hadoop cluster using a very large data set. The mappers and reducers are run sequentially on row streams from HDFS. The Hadoop cluster is not required for a dry run.

Return Value

The current setting of orch.dryrun

See Also

hadoop.exec on page 6-5, hadoop.run on page 6-8

Example

This example changes the value of orch.dryrun from FALSE to TRUE.

R> orch.dryrun()
[1] FALSE
R> orch.dryrun(onoff<-T)
R> orch.dryrun()
[1] TRUE
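
The sketch below shows how a dry run typically fits into a workflow: the same hadoop.run call is executed first locally and then, unchanged, on the cluster. The input object x and the identity mapper are placeholders.

# Run the MapReduce job locally on a small attached data set first.
orch.dryrun(TRUE)
res.local <- hadoop.run(x,
    mapper = function(k, v) { orch.keyval(k, v) })

# When the logic looks correct, switch back to the Hadoop cluster and
# run the identical code against the full data set.
orch.dryrun(FALSE)
res.cluster <- hadoop.run(x,
    mapper = function(k, v) { orch.keyval(k, v) })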

orch.export

Makes R objects from a user's local R session available in the Hadoop execution environment, so that they can be referenced in MapReduce jobs.

Usage

orch.export(...)

Arguments

. . .
One or more variables, data frames, or other in-memory objects, by name or as an explicit definition, in a comma-separated list.

Usage Notes

You can use this function to prepare local variables for use in the hadoop.exec and hadoop.run functions. The mapper, reducer, combiner, init, and final arguments can reference the exported variables.

Return Value

A list object

See Also

hadoop.exec on page 6-5, hadoop.run on page 6-8

Example

This code fragment shows orch.export used in the export argument of the hadoop.run function:

hadoop.run(x,
    export = orch.export(a=1, b=2),
    mapper = function(k,v) {
        x <- a + b
        orch.keyval(key=NULL, val=x)
    }
)

The following is a similar code fragment, except that the variables are defined outside the hadoop.run function:

a=1
b=2
hadoop.run(x,
    export = orch.export(a, b),
    . . .
)

orch.keyval

Outputs key-value pairs in a MapReduce job.

Usage

orch.keyval(key, value)

Arguments

key
A scalar value.

value
A data structure such as a scalar, list, data frame, or vector.

Usage Notes

This function can only be used in the mapper, reducer, or combiner arguments of hadoop.exec and hadoop.run. Because the orch.keyval function is not exposed in the ORCH client API, you cannot call it anywhere else.

Return Value

(key, value) structures

See Also

orch.pack on page 6-59

Example

This code fragment creates a mapper function using orch.keyval:

hadoop.run(data,
    mapper = function(k,v) {
        orch.keyval(k,v)
    })

orch.keyvals

Outputs a set of key-value pairs in a MapReduce job.

Usage

orch.keyvals(key, value)

Arguments

key
A scalar value.

value
A data structure such as a scalar, list, data frame, or vector.

Usage Notes

This function can only be used in the mapper, reducer, or combiner arguments of hadoop.exec and hadoop.run. Because the orch.keyvals function is not exposed in the ORCH client API, you cannot call it anywhere else.

See Also

orch.keyval on page 6-56, orch.pack on page 6-59

Return Value

(key, value) structures

Example

This code fragment creates a mapper function using orch.keyval and a reducer function using orch.keyvals:

hadoop.run(data,
    mapper = function(k,v) {
        if (v$value > 10) {
            orch.keyval(k, v)
        } else {
            NULL
        }
    },
    reducer = function(k,vals) {
        orch.keyvals(k,vals)
    }
)

The following code fragment shows orch.keyval in a for loop to perform the same reduce operation as orch.keyvals in the previous example:

reducer = function(k,vals) {
    out <- list()
    for (v in vals) {
        out <- c(out, list(orch.keyval(k, v)))
    }
    out
}

orch.pack

Compresses one or more in-memory R objects that the mappers or reducers must write as the values in key-value pairs.

Usage

orch.pack(...)

Arguments

. . .
One or more variables, data frames, or other in-memory objects in a comma-separated list.

Usage Notes

You should use this function when passing nonscalar or complex R objects, such as data frames and R classes, between the mapper and reducer functions. You do not need to use it on scalar or other simple objects. You can use orch.pack to vary the data formats, data sets, and variable names for each output value.

You should also use orch.pack when storing the resultant data set in HDFS. The compressed data set is not corrupted by being stored in an HDFS file.

The orch.pack function must always be followed by the orch.unpack function to restore the data to a usable format.

Return Value

Compressed character-type data as a long string with no special characters

See Also

hadoop.exec on page 6-5, hadoop.run on page 6-8, orch.keyval on page 6-56, orch.unpack on page 6-62

Example

This code fragment compresses the contents of several R objects into a serialized stream using orch.pack, and then creates key-value pairs using orch.keyval:

orch.keyval(NULL, orch.pack(
    r = r,
    qy = qy,
    yy = yy,
    nRows = nRows))

orch.reconnect

Reconnects to Oracle Database with the credentials previously returned by orch.disconnect.

Note: The orch.reconnect function is deprecated in release 1.1 and will be desupported in release 2.0. Use orch.connect to reconnect using a connection object returned by orch.dbcon.

Usage

orch.reconnect(dbcon)

Arguments

dbcon
Credentials previously returned by orch.disconnect.

Usage Notes

Oracle R Connector for Hadoop preserves all user credentials and connection attributes, enabling you to reconnect to a previously disconnected session. Depending on the orch.connect secure setting for the original connection, you may be prompted for a password. After reconnecting, you can continue data transfer operations between Oracle Database and HDFS.

Reconnecting to a session is faster than opening a new one, because reconnecting does not require extensive connectivity checks.

Return Value

TRUE for a successfully reestablished and validated connection, or FALSE for a failed attempt

See Also

orch.connect on page 6-41

Example

R> orch.reconnect(oid)
Connecting ORCH to RDBMS via [sqoop]
    Host: localhost
    Port: 1521
    SID:  orcl
    User: RQUSER
Enter password for [RQUSER]: password
Connected
[1] TRUE

orch.temp.path

Sets the path where temporary data is stored. When called with no argument, it returns the current temporary path.

Usage

orch.temp.path(path)

Arguments

path
The full path to an existing HDFS directory.

Return Value

The current temporary path

Example

This example returns /tmp as the current temporary directory.

R> orch.temp.path()
[1] "/tmp"
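
To change the location, pass the directory path; the /user/oracle/tmp directory below is only an example and must already exist in HDFS.

# Point ORCH at an existing HDFS directory for its temporary data.
orch.temp.path("/user/oracle/tmp")

# A call with no argument returns the current setting.
orch.temp.path()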

orch.unpack

Restores the R objects that were compressed with a previous call to orch.pack.

Usage

orch.unpack(...)

Arguments

. . .
The name of a compressed object.

Usage Notes

This function is typically used at the beginning of a mapper or reducer function to obtain and prepare the input. However, it can also be used externally, such as at the R console, to unpack the results of a MapReduce job.

Consider this data flow: ORCH client to mapper to combiner to reducer to ORCH client. If the data is packed at one stage, then it must be unpacked at the next stage.

Return Value

Uncompressed list-type data, which can contain any number and any type of variables

See Also

orch.pack on page 6-59

Example

This code fragment restores the data that was compressed in the orch.pack example:

reducer = function(key, vals) {
    x <- orch.unpack(vals[[1]])
    r <- x$qy
    yy <- x$yy
    nRow <- x$nRows
    . . .
}
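
Outside a MapReduce job, a quick round trip at the R console shows how orch.pack and orch.unpack pair up; the small data frame below is only a placeholder.

# Pack a small data frame, then unpack it and look at the restored copy.
df <- data.frame(a = 1:3, b = c("x", "y", "z"))
packed <- orch.pack(df = df)
restored <- orch.unpack(packed)
restored$df    # the restored data frame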

orch.version

Identifies the version of the ORCH package.

Usage

orch.version()

Return Value

The ORCH package version number

Example

This example shows that the ORCH version number is 2.0.

R> orch.version()
[1] "2.0"

