Install Guide


DMX Install Guide
Version 9.10
Last Update: 15 May 2020

Copyright 1990, 2020 Syncsort Incorporated. All rights reserved. This document contains unpublished, confidential, and proprietary information of Syncsort Incorporated. No disclosure or use of any portion of the contents of this document may be made without the express written consent of Syncsort Incorporated.

Getting technical support: Customers with a valid maintenance contract can get technical assistance via MySupport. There you will find product downloads and documentation for the products to which you are entitled, as well as an extensive knowledge base.

Contents

• DMX Overview
• Installing DMX/DMX-h
  • DMX-h Overview
  • Prerequisites
  • Step-by-Step Installation
  • Configuring the DMX Run-time Service
  • Applying a New License Key to an Existing Installation
• Running DMX
  • Graphical User Interfaces
  • DMX Help
• Connecting to Databases from DMX
  • Amazon Redshift
  • Azure Synapse Analytics (formerly SQL Data Warehouse)
  • Databricks
  • DB2
  • Greenplum
  • Hive data warehouses
  • Apache Impala
  • Microsoft SQL Server
  • Netezza
  • NoSQL Databases
  • Oracle
  • Snowflake
  • Sybase
  • Teradata
  • Vertica
  • Other DBMSs
  • Defining ODBC Data Sources
• Connecting to Message Queues from DMX
  • IBM WebSphere MQ
• Connecting to Salesforce from DMX
• Connecting to SAP from DMX
  • Registering DMX in SAP SLD
• Connecting to HDFS from DMX
• Connecting to Connect:Direct nodes from DMX
  • Security
  • Installation and Configuration
• Connecting to CyberArk Enterprise Password Vault
  • CyberArk Licenses
• Connecting to Protegrity Data Security Gateway
• Connecting to QlikView data eXchange files from QlikView or Qlik Sense
  • QlikView desktop installation overview
  • Qlik Sense desktop installation overview
• Connecting to Tableau Data Extract files from Tableau
  • Tableau desktop installation overview
• Removing DMX/DMX-h from Your System
• DMX installation component options
  • DMX Management Service installation and configuration
  • DMX DataFunnel run-time service install and configuration
• Technical Support

Documentation Conventions

The following conventions are used in the format sections of the command options in this manual.

• Regular type: Items in regular type must be entered literally, using either lowercase or uppercase letters. Items may be abbreviated. Examples: ASCII, ascending.
• Italics (non-bold): Items in italics (non-bold) represent variables. You must substitute an appropriate numerical or text value for the variable. Example: file_name.
• Braces { }: Braces indicate that a choice must be made among the items contained in the braces. The choices may be presented in an aligned column, or on one line separated by a vertical bar ( | ). Examples: {"a"} {X"xx"} or {AND | OR}.
• Brackets [ ]: Brackets indicate that an item is optional. A choice may be made among multiple items contained in brackets. Examples: [alias] or [+ | -].
• Slash /: A slash identifies a DMX option keyword. The slash must be included when an option keyword is specified. Examples: /INFILE, /infile.
• Double quotes " ": Double quotation marks that appear in a format statement must be specified literally. Example: "b"-"e".
• Ellipsis …: An ellipsis indicates that the preceding argument or group of arguments may be repeated. Example: [expression…].
• Sequence number: A sequence number indicates that a series of arguments or values may be specified. The sequence number itself must never be specified. Example: field2.

DMX Overview

DMX™ is a high-performance data transformation product. With DMX you can design, schedule, and control all your data transformations from a simple graphical interface on your Windows desktop. Data records can be input from many types of sources such as database tables, SAP systems, Salesforce.com objects, flat files, XML files, pipes, etc. The records can be aggregated, joined, sorted, merged, or just copied to the appropriate target(s). Before output, records can be filtered, reformatted, or otherwise transformed. Metadata, including record layouts, business rules, transformation definitions, run history, and data statistics, can be maintained either within a specific task or in a central repository. The effects of making a change to your application can be analyzed through impact and lineage analysis. You can run your data transformations directly from your desktop or on any UNIX or Windows server, schedule them for later execution, embed them in batch scripts, or invoke them from your own programs.

Installing DMX/DMX-h

The installed DMX components depend on your license key:

• A DMX server license key installs components based on whether you select a Standard, Full, Classic, or Custom installation. See DMX installation component options.
• A DMX workstation license key installs the development client (the Job and Task Editors); the DMX engine, dmxjob/dmexpress; and the service for the development client, which is the DMX Run-time Service, dmxd.

The version of the DMX server software must be at least as high as the version of the DMX client software that is used to develop jobs and connect to the server. Thus, when installing a new version of DMX, ensure that you install the same release of DMX on your client and server machines. If you are upgrading and unable to install both the client and the server at the same time, you need to upgrade the server prior to upgrading the client.

DMX-h Overview

DMX-h is the Hadoop-enabled edition of DMX, providing the following Hadoop functionality:

• ETL Processing in Hadoop – Develop a DMX-h ETL application entirely in the DMX GUI to run seamlessly in the Hadoop MapReduce framework, with no Pig, Hive, or Java programming required. Currently, jobs can be run in either MapReduce
Recommended publications
  • File Formats for Big Data Storage Systems
    International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958, Volume-9 Issue-1, October 2019
    File Formats for Big Data Storage Systems
    Samiya Khan, Mansaf Alam
    Abstract: Big data is one of the most influential technologies of the modern era. However, in order to support maturity of big data systems, development and sustenance of heterogeneous environments is required. This, in turn, requires integration of technologies as well as concepts. Computing and storage are the two core components of any big data system. With that said, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the facet of big data file formats into the picture. This paper classifies available big data file formats into five categories, namely text-based, row-based, column-based, in-memory and data storage services. It also compares the advantages, shortcomings and possible use cases of the file formats.
    … analysis or visualization. Hadoop allows the user to store data on its repository in several ways. Some of the commonly available and understandable working formats include XML [9], CSV [10] and JSON [11]. Although JSON, XML and CSV are human-readable formats, they are not the best way to store data on a Hadoop cluster. In fact, in some cases, storage of data in such raw formats may prove to be highly inefficient. Moreover, parallel storage is not possible for data stored in such formats. In view of the fact that storage efficiency and parallelism are the two leading advantages of using Hadoop, the use of raw file formats may just defeat the whole purpose.
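    The storage-efficiency point above is easy to demonstrate: the same records written as row-oriented CSV and as a columnar Parquet file behave very differently when only part of the data is needed. A minimal sketch, not taken from the paper; the column names are invented and pandas/PyArrow are assumed to be available:

      # Illustrative only: compare a text-based row format (CSV) with a column-based format (Parquet).
      import pandas as pd
      import pyarrow as pa
      import pyarrow.parquet as pq

      df = pd.DataFrame({
          "event_id": range(100_000),
          "status": ["ok"] * 100_000,        # repetitive column: compresses well when stored columnar
          "payload": ["x" * 40] * 100_000,
      })

      df.to_csv("events.csv", index=False)                         # human-readable, but read back row by row
      pq.write_table(pa.Table.from_pandas(df), "events.parquet")   # columnar, compressed, splittable

      # A columnar reader fetches only the columns a query touches,
      # instead of scanning every byte of every record.
      status_only = pq.read_table("events.parquet", columns=["status"])
      print(status_only.num_rows, "rows,", status_only.nbytes, "bytes materialized for one column")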
  • Research Article Improving I/O Efficiency in Hadoop-Based Massive Data Analysis Programs
    Hindawi Scientific Programming, Volume 2018, Article ID 2682085, 9 pages. https://doi.org/10.1155/2018/2682085
    Research Article: Improving I/O Efficiency in Hadoop-Based Massive Data Analysis Programs
    Kyong-Ha Lee,1 Woo Lam Kang,2 and Young-Kyoon Suh3
    1 Research Data Hub Center, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
    2 School of Computing, KAIST, Daejeon, Republic of Korea
    3 School of Computer Science and Engineering, Kyungpook National University, Daegu, Republic of Korea
    Correspondence should be addressed to Young-Kyoon Suh; [email protected]
    Received 30 April 2018; Revised 24 October 2018; Accepted 6 November 2018; Published 2 December 2018
    Academic Editor: Basilio B. Fraguela
    Copyright © 2018 Kyong-Ha Lee et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
    Apache Hadoop has been a popular parallel processing tool in the era of big data. While practitioners have rewritten many conventional analysis algorithms to make them customized to Hadoop, the issue of inefficient I/O in Hadoop-based programs has been repeatedly reported in the literature. In this article, we address the problem of the I/O inefficiency in Hadoop-based massive data analysis by introducing our efficient modification of Hadoop. We first incorporate a columnar data layout into the conventional Hadoop framework, without any modification of the Hadoop internals. We also provide Hadoop with indexing capability to save a huge amount of I/O while processing not only selection predicates but also star-join queries that are often used in many analysis tasks.
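    The two ideas in that abstract, a columnar layout plus pushing column selection and predicates down to the reader so unneeded bytes are never fetched, can be sketched with PyArrow's dataset API. This is only a generic illustration of the technique, not the authors' modification of Hadoop; the file and column names are invented:

      # Illustrative only: column pruning and predicate filtering at read time.
      import pyarrow.dataset as ds

      dataset = ds.dataset("sales.parquet", format="parquet")   # hypothetical columnar file

      # Only the two referenced columns are materialized, and row groups whose
      # statistics cannot satisfy the predicate can be skipped entirely.
      table = dataset.to_table(
          columns=["region", "revenue"],
          filter=ds.field("revenue") > 1_000,
      )
      print(table.num_rows)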
  • PCLTA-21 PCI Network Adapter Models 74501, 74502, 74503, and 74504
    PCLTA-21 PCI Network Adapter, Models 74501, 74502, 74503, and 74504
    ▼ Universal 32-bit PCI adapter card for LONWORKS® networks for PCs with 3.3V or 5V PCI
    ▼ Plug-and-play network driver compatible with Microsoft Windows® 98/2000 and Windows XP
    ▼ Downloadable firmware allows updates without accessing or changing hardware
    ▼ Integral FT 3150® Free Topology Smart Transceiver, RS-485, TPT/XF-78, or TPT/XF-1250 transceiver
    ▼ LNS® Network Services Interface (NSI) supports LNS applications
    ▼ Layer 5 MIP for use with OpenLDV™ driver
    ▼ CE Mark, U.L. Listed, cU.L. Listed
    Description
    The PCLTA-21 Network Adapter is a high-performance LONWORKS interface for personal computers equipped with a 3.3V or 5V 32-bit Peripheral Component Interconnect (PCI) interface and a compatible operating system. Designed for use in LONWORKS control networks that require a PC for monitoring, managing, or diagnosing the network, the PCLTA-21 adapter is ideal for industrial control, building automation, and process control applications. The PCLTA-21 adapter features an integral twisted pair transceiver, downloadable memory, a network management interface, and plug-and-play capability with Microsoft Windows 98/2000 and Windows XP.
    The NSI mode of the PCLTA-21 adapter is used with applications based on the LNS network operating system, such as the LonMaker Tool or the LNS DDE Server. The MIP mode of the PCLTA-21 adapter is used with applications based on OpenLDV. In both NSI and MIP modes, the adapter permits the host PC to act as a LONWORKS application device, such that the PC is running application-specific programs while the adapter handles lower layer functions such as …
  • Systematic Testing of the Continuous Behavior of Automotive Systems
    Systematic Testing of the Continuous Behavior of Automotive Systems
    Eckard Bringmann, Andreas Krämer
    DaimlerChrysler, Alt-Moabit 96a, 10559 Berlin, Germany
    +49 30 39982 242 / +49 30 39982 336, [email protected], [email protected]
    ABSTRACT
    In this paper, we introduce a new test method that enables the systematic definition of executable test cases for testing the continuous behavior of automotive embedded systems. This method is based on a graphical notation for test cases that is not only easy to understand but also powerful enough to express very complex, fully automated tests as well as reactive tests. This new approach is already in use in several production-vehicle development projects at DaimlerChrysler and at some suppliers.
    Categories and Subject Descriptors: D2.5 [Software Engineering]: Testing
    … gap, the testing of automotive systems in practice focuses on simple data tables to describe input signals or on script languages, such as Visual Basic, Python or Perl, to automate tests. Nevertheless, signals are still very difficult to handle in these languages. Even worse, there is no systematic procedure to help testers reveal redundancies and missing, test-relevant aspects within their test cases. In other words, the selection of test data occurs ad hoc and is based on some use cases and typical extreme scenarios but often does not cover all functional requirements of the system under test (SUT). Our new approach – which is called Time Partition Testing
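    As a concrete illustration of the kind of assessment such signal-based tests need, here is a sketch in plain Python/NumPy. It is not the paper's graphical notation; the reference response, tolerances and sampling are invented:

      # Illustrative only: judge a recorded output signal against a reference trace
      # using absolute and relative tolerances, sample by sample.
      import numpy as np

      def assess(actual, expected, abs_tol=0.05, rel_tol=0.02):
          """Pass if every sample lies within abs_tol or rel_tol of the reference signal."""
          margin = np.maximum(abs_tol, rel_tol * np.abs(expected))
          return bool(np.all(np.abs(actual - expected) <= margin))

      t = np.linspace(0.0, 2.0, 201)                                # 2 s scenario, 10 ms sampling
      expected = 1.0 - np.exp(-np.clip(t - 1.0, 0.0, None) / 0.2)   # reference: first-order response to a step at t = 1 s
      measured = expected + 0.004 * np.sin(40.0 * t)                # pretend this was captured from the SUT

      print("PASS" if assess(measured, expected) else "FAIL")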
  • Automated Unit Testing in Model-Based Embedded Software Development
    Automated Unit Testing in Model-based Embedded Software Development
    Christoph Luckeneder (1), Hermann Kaindl (1) and Martin Korinek (2)
    (1) Institute of Computer Technology, TU Wien, Vienna, Austria
    (2) Robert Bosch AG, Gollnergasse 15-17, Vienna, Austria
    Keywords: Automated Testing, Unit Tests, Model-based Development, Embedded Software, Safety-critical Systems, Automotive.
    Abstract: Automating software tests is generally desirable, and especially for the software of safety-critical real-time systems such as automotive control systems. For such systems, also conforming with the ISO 26262 standard for functional safety of road vehicles is absolutely necessary. These are embedded systems, however, which pose additional challenges with regard to test automation. In particular, the questions arise on which hardware platform the tests should be performed and by use of which workflow and tools. This is especially relevant in terms of cost, while still ensuring conformance with ISO 26262. In this paper, we present a practical approach for automated unit testing in model-based embedded software development for a safety-critical automotive application. Our approach includes both a workflow and supporting tools for performing automated unit tests. In particular, we analyze an as-is workflow and propose changes to the workflow for reducing costs and time needed for performing such tests. In addition, we present an improved tool chain for supporting the test workflow. In effect, without manually implementing each test case twice, unit tests can be performed both in a simulation environment and on an open-loop test environment including the embedded platform target hardware.
    1 INTRODUCTION
    Automotive systems have more and more become software-intensive systems, which include large-scale …
    … The following support through a tool chain was planned: modeling and simulation of the resulting software models, as well as model-based test automation.
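    The closing point, writing each test case once and executing it against more than one environment, can be sketched with a parametrized unit test. This is a generic Python/pytest illustration only; the paper's units under test are generated C code on an embedded target, and the controller below is invented:

      # Illustrative only: one test definition, executed against two interchangeable backends.
      import pytest

      def controller_step_sim(error):
          """Stand-in for the controller running in the simulation environment."""
          return max(-1.0, min(1.0, 0.5 * error))      # saturated proportional controller

      def controller_step_target(error):
          """Stand-in for the same controller exercised open-loop on the target rig."""
          return max(-1.0, min(1.0, 0.5 * error))      # in reality this would call into the test rig

      @pytest.mark.parametrize("step", [controller_step_sim, controller_step_target])
      @pytest.mark.parametrize("error, expected", [(0.0, 0.0), (1.0, 0.5), (10.0, 1.0), (-10.0, -1.0)])
      def test_controller_saturation(step, error, expected):
          assert step(error) == pytest.approx(expected)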
  • Impala: a Modern, Open-Source SQL Engine for Hadoop
    Impala: A Modern, Open-Source SQL Engine for Hadoop
    Marcel Kornacker, Alexander Behm, Victor Bittorf, Taras Bobrovytsky, Casey Ching, Alan Choi, Justin Erickson, Martin Grund, Daniel Hecht, Matthew Jacobs, Ishaan Joshi, Lenni Kuff, Dileep Kumar, Alex Leblang, Nong Li, Ippokratis Pandis, Henry Robinson, David Rorke, Silvius Rus, John Russell, Dimitris Tsirogiannis, Skye Wanderman-Milne, Michael Yoder
    Cloudera, http://impala.io/
    ABSTRACT
    Cloudera Impala is a modern, open-source MPP SQL engine architected from the ground up for the Hadoop data processing environment. Impala provides low latency and high concurrency for BI/analytic read-mostly queries on Hadoop, not delivered by batch frameworks such as Apache Hive. This paper presents Impala from a user's perspective, gives an overview of its architecture and main components and briefly demonstrates its superior performance compared against other popular SQL-on-Hadoop systems.
    1. INTRODUCTION
    Impala is an open-source, fully-integrated, state-of-the- …
    … that is on par or exceeds that of commercial MPP analytic DBMSs, depending on the particular workload. This paper discusses the services Impala provides to the user and then presents an overview of its architecture and main components. The highest performance that is achievable today requires using HDFS as the underlying storage manager, and therefore that is the focus of this paper; when there are notable differences in terms of how certain technical aspects are handled in conjunction with HBase, we note that in the text without going into detail. Impala is the highest performing SQL-on-Hadoop system, especially under multi-user workloads. As Section 7 shows, for single-user queries, Impala is up to 13x faster than alternatives, and 6.7x faster on average.
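    For orientation, Impala exposes a standard SQL interface that client programs reach over HiveServer2-compatible connections. A minimal sketch with the impyla client; the host, port and table are placeholders, and the library plus a running impalad are assumptions of this example:

      # Illustrative only: run an analytic query against Impala from Python via impyla.
      from impala.dbapi import connect

      conn = connect(host="impalad.example.com", port=21050)   # typical HiveServer2 port of impalad
      cur = conn.cursor()
      cur.execute("SELECT status, COUNT(*) AS hits FROM web_logs GROUP BY status")
      for status, hits in cur.fetchall():
          print(status, hits)
      cur.close()
      conn.close()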
  • TPT Tutorial
    Time Partition Testing: Systematic automated testing of embedded systems
    PikeTec GmbH, http://www.piketec.com
    There are only a few tools for testing embedded systems in the automotive domain. Their function usually lies in the test management or automation of tests by means of test scripts. Time Partition Testing (TPT) noticeably exceeds this. It combines a systematic and very graphic modelling technique for test cases with a fully automatic test execution in different environments and automatic test evaluation in a unique way. In doing so, TPT even supports testing of control systems.
    Time Partition Testing (TPT) was developed to graphically and compactly model tests of embedded systems – especially those with continuous behaviour –, to automate those tests and to offer a systematic approach that supports the tester in selecting an ideal amount of test cases. The intuitive graphic modelling technique for test cases, the mechanisms for the structuring of complex test problems, the simple integration in any test bench and a number of additional interesting features make TPT a unique test solution. All this has contributed to TPT becoming the central testing tool in a large number of development projects at car manufacturers and suppliers, and it has become an integral part of the process chain.
    Embedded systems
    TPT is specialized in testing embedded systems whose inputs and outputs can be represented as signals. Most control systems belong to this system class. The implementation language of the embedded system (e.g. 'C' code, C++, MATLAB/Simulink/Stateflow, Statemate or a combination of multiple languages) is irrelevant for TPT. TPT does not require a particular implementation language.
  • Hortonworks Data Platform Data Movement and Integration (December 15, 2017)
    Hortonworks Data Platform: Data Movement and Integration (December 15, 2017)
    docs.cloudera.com
    Copyright © 2012-2017 Hortonworks, Inc. Some rights reserved.
    The Hortonworks Data Platform, powered by Apache Hadoop, is a massively scalable and 100% open source platform for storing, processing and analyzing large volumes of data. It is designed to deal with data from many sources and formats in a very quick, easy and cost-effective manner. The Hortonworks Data Platform consists of the essential set of Apache Hadoop projects including MapReduce, Hadoop Distributed File System (HDFS), HCatalog, Pig, Hive, HBase, ZooKeeper and Ambari. Hortonworks is the major contributor of code and patches to many of these projects. These projects have been integrated and tested as part of the Hortonworks Data Platform release process, and installation and configuration tools have also been included.
    Unlike other providers of platforms built using Apache Hadoop, Hortonworks contributes 100% of our code back to the Apache Software Foundation. The Hortonworks Data Platform is Apache-licensed and completely open source. We sell only expert technical support, training and partner-enablement services. All of our technology is, and will remain, free and open source. Please visit the Hortonworks Data Platform page for more information on Hortonworks technology. For more information on Hortonworks services, please visit either the Support or Training page. You can contact us directly to discuss your specific needs.
    Except where otherwise noted, this document is licensed under the Creative Commons Attribution ShareAlike 4.0 License. http://creativecommons.org/licenses/by-sa/4.0/legalcode
  • Testwell CTC++ Test Coverage Analyser
    DATASHEET: Testwell CTC++ Test Coverage Analyser
    What is Testwell CTC++?
    Testwell CTC++ is the leading code coverage tool for measuring code coverage on host and on all embedded targets (even very small ones). The tool can be used to fulfill the code coverage requirements of safety standards like DO-178C, ISO 26262, EN 50128, and IEC 60880. Hundreds of companies in more than 30 countries all over the world use Testwell CTC++ in order to ensure the quality of their software. Testwell CTC++ is the first choice for companies which have to achieve and to prove high code coverage in aerospace, automotive, transportation, healthcare, nuclear power and other industries.
    Why Code Coverage?
    Code coverage is a measure which describes the degree to which the source code of a program is tested. This can be considered an indirect measure of the quality of the software. Code coverage finds areas of a program which have not yet been exercised by a set of test cases. This way it supports you in creating additional test cases to increase the code coverage and prevents you from writing redundant test cases. Code coverage is most useful during the module testing phase, though it also has benefit during integration testing and at other times, depending on how and what you are testing. Code coverage is « highly recommended » (which means de facto mandatory) for safety-critical software. Safety standards like DO-178C (Software Considerations in Airborne Systems and Equipment Certification), IEC/EN 61508 (functional safety of electrical/electronic, programmable electronic safety-related systems), EN 50128 (Railway applications - Communication, signalling and processing systems - Software for railway control and protection systems), IEC 60880 (nuclear power), and ISO 26262 (functional safety of road vehicles) request different levels of code coverage according to the safety level of the application.
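    To make the statement-versus-decision distinction concrete, here is a small sketch using Python's coverage.py, chosen only as a freely available stand-in: CTC++ itself instruments C/C++ and also supports stricter criteria such as MC/DC, which coverage.py does not measure.

      # Illustrative only: statement coverage vs. decision (branch) coverage.
      import coverage

      def classify(x):
          if x < 0:
              return "negative"
          return "non-negative"

      cov = coverage.Coverage(branch=True)   # branch=True adds decision coverage on top of statement coverage
      cov.start()
      classify(5)                            # exercises only the False outcome of the 'if'
      cov.stop()
      cov.save()
      cov.report(show_missing=True)          # lists the line and the branch outcome that were never taken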
  • TPT Test Design and Test Generation
    TPT IN A NUTSHELL
    TPT is a tool for functional testing of embedded software, interfacing a huge number of standard development tools. TPT is suitable for all development phases: it can be used for MiL, SiL, PiL, HiL and in vehicles.
    TPT TEST DESIGN AND TEST GENERATION
    Expressing test cases with TPT is both powerful and easy to handle. Automatons can be used for structuring test phases; tests can be created graphically by the user or generated automatically, with pass and fail criteria and powerful signal patterns.
    DASHBOARD
    User interfaces can be designed for experiments with a System under Test; manual validation, test observation and interaction are possible. The Dashboard is a powerful feature of TPT for many reasons: dashboards can be used for experimental testing long before test design, and they can also be used together with automated tests.
    ASSESSMENT AND REPORTING
    TPT supports fully automated assessment and documentation of test results, including back-to-back testing with relative and absolute tolerances.
    TRACEABILITY OF TESTS AND REQUIREMENTS
    TPT supports analysis and coverage examination of requirements and tests. Requirements can be imported from several formats and tools, analyzed, linked to test cases and reported along with the tests.
    SAFETY TESTING OF SYSTEMS
    TPT supports qualified testing and verification of safety-related systems. Safety standard directives can be satisfied while testing with TPT up to the highest safety level. Related standards are ISO 26262 and IEC 61508.
    (Figure: example test automaton with phases such as INIT and "Test phase 2" and signal conditions such as "light on", "bright", "changing light".)
    PikeTec GmbH, Waldenserstr. 2 - 4, 10551 Berlin, Germany. Tel. +49 30 394 096 830, Mail. [email protected], www.piketec.com. Embedded testing starts here.
  • Major Technical Advancements in Apache Hive
    Major Technical Advancements in Apache Hive
    Yin Huai (1), Ashutosh Chauhan (2), Alan Gates (2), Gunther Hagleitner (2), Eric N. Hanson (3), Owen O'Malley (2), Jitendra Pandey (2), Yuan Yuan (1), Rubao Lee (1), Xiaodong Zhang (1)
    (1) The Ohio State University, (2) Hortonworks Inc., (3) Microsoft
    (1) {huai, yuanyu, liru, zhang}@cse.ohio-state.edu, (2) {ashutosh, gates, ghagleitner, owen, jitendra}@hortonworks.com, (3) [email protected]
    ABSTRACT
    Apache Hive is a widely used data warehouse system for Apache Hadoop, and has been adopted by many organizations for various big data analytics applications. Closely working with many users and organizations, we have identified several shortcomings of Hive in its file formats, query planning, and query execution, which are key factors determining the performance of Hive. In order to make Hive continuously satisfy the requests and requirements of processing increasingly high volumes of data in a scalable and efficient way, we have set two goals related to storage and runtime performance in our efforts on advancing Hive. First, we aim to maximize the effective storage capacity and to accelerate data accesses to the data …
    … than 100 developers have made technical efforts to improve Hive on more than 3000 issues. With its rapid development pace, Hive has been significantly updated by new innovations and research since the original Hive paper [45] was published four years ago. We will present its major technical advancements in this paper.
    Hive was originally designed as a translation layer on top of Hadoop MapReduce. It exposes its own dialect of SQL to users and translates data manipulation statements (queries) to a directed acyclic graph (DAG) of MapReduce jobs. With an SQL interface, users do not need to write tedious and sometimes difficult MapReduce programs to manipulate data stored in Hadoop Distributed Filesystem (HDFS).
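    The translation the abstract describes can be observed from any Hive client: asking Hive to EXPLAIN a statement returns the plan, a DAG of stages, that it would submit. A minimal sketch using the PyHive client; the host, database and table are placeholders, and a running HiveServer2 is an assumption of this example:

      # Illustrative only: inspect the execution plan Hive generates for a SQL statement.
      from pyhive import hive

      conn = hive.Connection(host="hiveserver2.example.com", port=10000, database="default")
      cur = conn.cursor()

      # EXPLAIN returns the plan as rows of text: the DAG of stages (MapReduce,
      # Tez or Spark jobs, depending on the configured execution engine).
      cur.execute("EXPLAIN SELECT page, COUNT(*) AS hits FROM clicks GROUP BY page")
      for (line,) in cur.fetchall():
          print(line)

      cur.close()
      conn.close()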
  • Silver Datasheet
    DATASHEET: Silver - Virtualization of ECUs
    Virtual ECUs bring code to life.
    Overview
    Silver, a virtual ECU platform, is used to move development tasks from road and test rigs to a Windows PC, enabling the most efficient development.
    Virtual ECUs
    Silver is a tool for creating and running virtual ECUs. It moves development tasks from road and test rig to a Windows PC.
    How Much of the ECU Software Runs Inside a vECU?
    Depending on the use case and the availability of source code, different parts of the ECU software can be ported to a PC. The figures below show two typical virtualizations of an AUTOSAR ECU: (A) only the Application Software (ASw) is virtualised, (B) also parts of the Basic Software (BSw).
    (Figure: AUTOSAR layer diagrams contrasting the two cases. Legend: ASw – Application Software; BSw – basic software (Dem, Det, Dcm, Com, NvM, SchM, FiM, …); CDD – complex device driver; MCAL – standard drivers (Can, Lin, Fr, Fls, Adc, Spi, Icu, Pwm, Dio, Port, Wdg, Mcu, …); OS; standard and non-standard APIs and interfaces; unmodified software versus the emulation layer in Silver; target chip/ECU hardware versus Windows PC.)
    Building vECUs with Silver
    Silver supports two ways to build vECUs:
    • Based on ECU source code: by compiling the C code for Windows PC
    • Based on ECU binary code (Hex file): with Silver's chip simulation feature, supported for MCUs of the Tricore (Infineon) and PowerPC (NXP/Freescale and STM) families
    The build procedure is similar in both cases:
    • Write a configuration file that describes the desired vECU, for instance, tasks to run, inputs and outputs
    • Let Silver build the vECU from the configuration file.
    Established coverage measures used by TestWeaver for software controllers are: statement, decision and MC/DC code coverage.
    synopsys.com