Modeling Guide for SAP Data Hub Company
Total Page:16
File Type:pdf, Size:1020Kb
PUBLIC SAP Data Hub 2.6 Document Version: 2.6.1 – 2019-08-23 Modeling Guide for SAP Data Hub company. All rights reserved. All rights company. affiliate THE BEST RUN 2019 SAP SE or an SAP SE or an SAP SAP 2019 © Content 1 Modeling Guide for SAP Data Hub............................................... 5 2 Introduction to the SAP Data Hub Modeler.........................................7 2.1 Log in to SAP Data Hub Modeler...................................................8 2.2 Navigating the SAP Data Hub Modeler..............................................9 3 Creating Graphs............................................................ 11 3.1 Execute Graphs..............................................................16 Parameterize the Process Execution.............................................17 3.2 Schedule Graph Executions.....................................................17 Cron Expression Format.....................................................19 3.3 Maintain Resource Requirements for Graphs.........................................20 3.4 Groups, Tags, and Dockerfiles................................................... 23 3.5 Execution Model.............................................................25 4 Monitoring the SAP Data Hub Modeler...........................................27 4.1 Monitor the Graph Execution Status...............................................28 Graph Execution..........................................................29 Graph States.............................................................30 Process States............................................................31 4.2 Trace Messages............................................................. 31 Trace Severity Levels.......................................................33 4.3 Using SAP Data Hub Monitoring ................................................. 33 Log in to SAP Data Hub Monitoring.............................................34 Monitoring with the SAP Data Hub Monitoring Application.............................35 Schedule Graph Executions with SAP Data Hub Monitoring............................38 4.4 Downloading Diagnostic Information for Graphs.......................................41 Diagnostic Information Archive Structure and Contents.............................. 42 5 Tutorials..................................................................47 5.1 Create a Graph to Execute a TensorFlow Application....................................47 Example Graphs for TensorFlow............................................... 49 5.2 Creating a Graph to Perform Text Analysis...........................................50 Example Graph for Simple Text Analysis..........................................51 Example Graph for Text Analysis of Files..........................................53 6 Create Operators........................................................... 57 6.1 Operators..................................................................61 Modeling Guide for SAP Data Hub 2 PUBLIC Content Operator Details.......................................................... 62 6.2 Port Types.................................................................68 Compatible Port Types......................................................70 7 Working with the Data Workflow Operators........................................72 7.1 Execute an SAP BW Process Chain................................................76 7.2 Execute an SAP HANA Flowgraph Operator..........................................77 7.3 Transform Data .............................................................79 Configure the Data Source Node .............................................. 82 Configure the Data Target Node............................................... 83 Configure the Projection Node................................................ 85 Configure the Aggregation Node...............................................87 Configure the Join Node.....................................................89 Configure the Union Node....................................................91 Configure the Case Node....................................................92 7.4 Execute an SAP Data Hub Pipeline................................................93 7.5 Execute an SAP Data Services Job................................................95 7.6 Transfer Data ...............................................................97 Transfer Data from SAP BW to SAP Vora or Cloud Storage.............................98 Transfer Data from SAP HANA to SAP Vora or Cloud Storage..........................104 7.7 Control Flow of Execution......................................................106 7.8 Control Start and Shut Down of Data Workflows......................................108 7.9 Send E-mail Notifications......................................................108 8 Integrating SAP Cloud Applications with SAP Data Hub.............................110 9 Service Specific Information..................................................113 9.1 Azure Data Lake (ADL)........................................................113 9.2 Local File System (File)........................................................115 9.3 Google Cloud Storage (GCS)....................................................116 9.4 Hadoop File System (HDFS)....................................................118 9.5 Amazon S3................................................................121 9.6 Microsoft Azure Blob Storage (WASB).............................................125 9.7 WebHDFS.................................................................127 10 Subengines...............................................................131 10.1 Working with the C++ Subengine to Create Operators..................................132 Getting Started.......................................................... 133 Creating an Operator......................................................134 Logging and Error Handling..................................................135 Port Data...............................................................136 Setting Values for Configuration Properties.......................................139 Process Handlers.........................................................140 Modeling Guide for SAP Data Hub Content PUBLIC 3 API Reference............................................................141 10.2 Working with Python2.7 and Python3.6 Subengines to Create Operators.................... 153 Normal Usage...........................................................155 Advanced Usage..........................................................157 10.3 Working with the Node.js Subengine to Create Operators...............................170 Node.js Operators and OS Processes...........................................170 Use Cases for the Node.js Subengine...........................................171 The Node.js Subengine SDK................................................. 172 Data Types..............................................................174 Additional Type Specific Constraints............................................174 Develop a Node.js Operator ..................................................175 Project Structure.........................................................177 Project Files and Resources..................................................177 Logging................................................................179 10.4 Working with Flowagent Subengine to Connect to Databases.............................181 11 Create Dockerfiles......................................................... 185 11.1 Docker Inheritance..........................................................186 12 Create Types..............................................................187 13 Creating Scenes for Graphs (Beta).............................................190 13.1 Using the Scene Editor (Beta)...................................................193 13.2 Developing Controls..........................................................197 Understanding Scripts.....................................................199 Modeling Guide for SAP Data Hub 4 PUBLIC Content 1 Modeling Guide for SAP Data Hub The Modeling Guide contains information on using the SAP Data Hub Modeler. The SAP Data Hub Modeler helps create data processing pipelines (graphs), and provides a runtime and design time environment for data-driven scenarios. The tool reuses existing coding and libraries to orchestrate data processing in distributed landscapes. This guide is organized as follows: Title Description Introduction to the SAP Data Hub Modeler Provides an overview of the tool and describes certain business use cases. Creating Graphs Explains the steps for creating, configuring, and executing a graph in the modeler. Monitoring the SAP Data Hub Modeler This section describes the different monitoring options available within the tool to monitor the working of the SAP Data Hub Mod eler. Tutorials Includes tutorials and example graphs to help users understand and work with the SAP Data Hub Modeler. Create Operators Describes operators and explains the steps for creating, configur ing, and using the operators in a graph. Working with the Data Workflow Operators SAP Data Hub Modeler categorizes certain operators as Data Workflow operators. This sections describes the data workflow op erators and how to model graphs with the data workflow opera tors. Subengines Describes working with the C++, Python2.7, and Python3.6 suben gines. Create Dockerfiles Explains the steps for creating