SAP Data Intelligence Administration Guide
PUBLIC | 2020-08-28
© 2021 SAP SE or an SAP affiliate company. All rights reserved.

Content
1 Administration Guide for SAP Data Intelligence...... 6
2 Getting Started in the Cloud...... 7
  2.1 Create an SAP Data Intelligence Instance in SAP BTP...... 7
      Add a Tenant to an Existing SAP Data Intelligence Instance...... 9
  2.2 Update an SAP Data Intelligence Instance in SAP BTP...... 11
  2.3 Managing SAP Data Intelligence Cluster Hibernation and Wakeup...... 12
      Triggering Hibernation and Wakeup...... 12
      Scheduling Hibernation and Wakeup...... 13
  2.4 Configure SAP Cloud Connector...... 15
3 Using SAP Data Intelligence System Management...... 18
  3.1 Log On to SAP Data Intelligence System Management...... 18
  3.2 Manage Applications...... 20
      SAP Data Intelligence Nested Applications...... 22
      Manage Metadata and Preparation Memory...... 22
      Manage Metadata Automatic Lineage Extractions...... 24
      Manage Size Limit for File Upload and Download...... 25
      Manage Metadata Rule Validation CSV Length...... 26
      Manage Metadata Failed Records...... 26
  3.3 Manage Users...... 31
  3.4 Manage Policies...... 33
      Resource Types...... 35
      Pre-Delivered Policies...... 35
      Nested Policies...... 37
      Application Start Policies...... 38
      Resource Quotas...... 43
      Working With Policies...... 44
  3.5 Manage Files...... 48
      Sharing Files Using Solution Repository...... 51
  3.6 Manage Strategies...... 51
      Strategies...... 53
  3.7 Configuring Client Certificate Authentication for NGINX in SAP Data Intelligence...... 54
      Troubleshooting...... 56
      Manage Tenant Certificates...... 56
      Manage User Certificates...... 56
4 Using SAP Data Intelligence Connection Management...... 59
  4.1 Log in to SAP Data Intelligence Connection Management...... 61
  4.2 Create a Connection...... 62
  4.3 Manage Certificates...... 63
  4.4 Supported Connection Types...... 64
      ABAP...... 66
      ABAP LEGACY...... 68
      ADL...... 69
      ADL_V2...... 70
      AWS_SNS...... 71
      AZURE_SQL_DB...... 72
      BW...... 73
      CLOUD_DATA_INTEGRATION...... 74
      CPEM...... 76
      CPI...... 77
      DATASERVICES...... 78
      DB2...... 79
      GCS...... 81
      GCP_BIGQUERY...... 82
      GCP_DATAPROC...... 84
      GCP_PUBSUB...... 85
      HANA_DB...... 86
      HANA_XS...... 88
      HDFS...... 88
      HDL_DB...... 90
      HDL_FILES...... 92
      HTTP...... 93
      IMAP...... 94
      INFORMATION_STEWARD...... 94
      KAFKA...... 95
      MLCLUSTER...... 97
      MSSQL...... 97
      MYSQL...... 98
      ODATA...... 101
      OPEN_CONNECTORS...... 102
      OPENAPI...... 104
      ORACLE...... 105
      OSS...... 108
      REDSHIFT...... 109
      RSERVE...... 112
      S3...... 113
      SAP_IQ...... 114
      SDL...... 116
      SFTP...... 117
      SMTP...... 119
      SNOWFLAKE...... 119
      VORA...... 123
      WASB...... 124
  4.5 Using SAP Cloud Connector Gateway...... 125
  4.6 (Mandatory) Configure Authorizations for Supported Connection Types...... 126
      HANA_DB...... 127
  4.7 Allowing SAP Data Intelligence Access Through Firewalls...... 127
5 SAP Data Intelligence Monitoring...... 128
  5.1 Log In to SAP Data Intelligence Monitoring...... 128
  5.2 Using the Monitoring Application...... 130
6 Maintaining SAP Data Intelligence...... 134
  6.1 On-Demand Certificate Renewal...... 134
      Certificate Rotation for Root Client Certificate and Internal Certificates...... 135
      Functional Cluster...... 135
      Non-Functional Cluster...... 135
7 Exporting Customer Data...... 137
  7.1 Log in to SAP Data Intelligence Customer Data Export...... 140
8 Improving Performance...... 141
  8.1 Improving CDC Graph Generator Operator Performance...... 141
9 Sizing Guide for Metadata Explorer and Self-Service Data Preparation...... 143
  9.1 Configure App-Data...... 143
  9.2 Configure Other Applications...... 145
10 Understanding Security...... 147
  10.1 Data Protection and Privacy in SAP Data Intelligence...... 147
      Managing Audit Logs...... 150
      Viewing Audit Logs...... 151
      Malware Scanning...... 152
  10.2 Securing SAP Data Intelligence...... 152
      Enabling Authentication for SAP Data Intelligence Services and SAP Data Intelligence Users...... 152
      Configuring External Identity Providers in SAP Data Intelligence...... 153
      Giving User Permissions for SAP Data Intelligence Access...... 154
      SAP Data Intelligence Self-Signed CA, X.509 Certificates and TLS for SAP Data Intelligence Services...... 154
      SAP Vora Integration for External Users...... 155
      SAP Data Intelligence on Kubernetes Security...... 155
      Connecting Your On-Premise Systems To SAP Data Intelligence...... 155
11 Troubleshooting SAP Data Intelligence...... 159
  11.1 Troubleshooting Vora Users After Upgrade...... 159
  11.2 Troubleshooting SAP Cloud Connector...... 160
  11.3 Troubleshooting Flowagent...... 160
1 Administration Guide for SAP Data Intelligence
The SAP Data Intelligence Administration Guide contains information about configuring, monitoring, and managing SAP Data Intelligence.
Related Information
Getting Started in the Cloud [page 7]
Using SAP Data Intelligence System Management [page 18]
Using SAP Data Intelligence Connection Management [page 59]
Manage Policies [page 33]
Monitoring Graphs
Using SAP Data Intelligence Diagnostics
Using SAP Data Intelligence Metrics
Maintaining SAP Data Intelligence [page 134]
Exporting Customer Data [page 137]
Improving Performance [page 141]
Understanding Security [page 147]
Troubleshooting SAP Data Intelligence [page 159]
2 Getting Started in the Cloud
Provides the information that you need to get started using SAP Data Intelligence Cloud.
Related Information
Create an SAP Data Intelligence Instance in SAP BTP [page 7]
Update an SAP Data Intelligence Instance in SAP BTP [page 11]
Managing SAP Data Intelligence Cluster Hibernation and Wakeup [page 12]
Configure SAP Cloud Connector [page 15]
2.1 Create an SAP Data Intelligence Instance in SAP BTP
To access SAP Data Intelligence, create a new instance in SAP BTP cockpit.
Prerequisites
● Your global account has a commercial entitlement via either cloud credits (a consumption-based model) or a subscription contract.
● You have administration authorization.
● You are using Google Chrome to properly view popups in SAP BTP.
Context
In the SAP Data Intelligence Administration Guide, we provide high-level steps to create an SAP Data Intelligence instance on SAP BTP. For more detailed information, or for instructions that use the Cloud Foundry Command-Line Interface, see the SAP BTP documentation.
To use SAP Data Intelligence cloud service, perform the following steps to create an instance in SAP BTP.
Procedure
1. In SAP BTP cockpit, select Global Accounts and select the global account that is entitled for SAP Data Intelligence.
2. To create a subaccount, choose New Subaccount, enter or select the subaccount information, and click Create and Save. The SAP Data Intelligence subaccount must contain the following selections:
Option Value
Environment Cloud Foundry
Provider AWS
Region Europe (Frankfurt)
Subdomain Specify the alphanumeric name that will be part of the URL. It can contain only letters, digits, and hyphens (not allowed at the beginning or end), and must be unique across all accounts in the same region of the Cloud Foundry environment of SAP BTP.
A new tile appears on the global account page with the subaccount details.
3. To enable the Cloud Foundry environment, select the subaccount tile that you created, and click Enable Cloud Foundry.
4. Return to the global account overview page and choose Entitlements > Subaccount Assignment, then select Configure Entitlements > Add Service Plans and add the SAP Data Intelligence entitlement for the subaccount.
The subaccount lists the available quota for SAP Data Intelligence.
5. To create a space in your subaccount, click Subaccount > Spaces > New Space, enter the space name, choose the permissions to assign to your ID, and click OK.
The space carries the quota, and consumes all available quota provided by the cloud credits.
6. To create an SAP Data Intelligence instance, choose Services > Service Marketplace, search for "Data Intelligence", and select the SAP Data Intelligence service.
Note
A subaccount can support only one SAP Data Intelligence instance.
7. Choose Instances > New Instance, and follow the steps to create an instance, specifying the desired credentials and sizing configuration. Choose a minimum and maximum number of worker nodes; SAP Data Intelligence will scale based on the usage. If you want to use VPN or VPC peering connectivity, and your network has an IP address conflict with 10.0.0.0/16 (the default CIDR, or Classless Inter-Domain Routing, value used by the SAP Data Intelligence network), you can optionally specify a CIDR block value (/22 or larger) for your SAP Data Intelligence network.
Creation of the instance can take up to an hour.
8. When the instance status is "Created", click Actions > Open Dashboard to the right of the instance name.
9. Log into SAP Data Intelligence on the "default" tenant with the credentials that you provided during instance creation.
Related Information
Add a Tenant to an Existing SAP Data Intelligence Instance [page 9]
Connect Using Site-to-Site VPN [page 156]
Connect Using Virtual Network Peering [page 158]
2.1.1 Add a Tenant to an Existing SAP Data Intelligence Instance
If you have a running SAP Data Intelligence instance, you can add new, logically isolated tenants to it.
Prerequisites
● Your global account has a commercial entitlement via either cloud credits (a consumption-based model) or a subscription contract.
● You already have a running SAP Data Intelligence enterprise instance. For more information, see Create an SAP Data Intelligence Instance in SAP BTP [page 7].
Context
In the SAP Data Intelligence Administration Guide, we provide high-level steps to create an SAP Data Intelligence instance on SAP BTP. For more detailed information, or for instructions that use the Cloud Foundry Command-Line Interface, see the SAP Business Technology Platform (SAP BTP) documentation.
After you create an instance, SAP Data Intelligence provides two service plans:
● Enterprise plan: Gives you a fully isolated cluster with dedicated hardware resources. When you create an enterprise service instance, one tenant called "default" is automatically created for you.
● Tenant plan: Lets you add up to 10 tenants to an existing SAP Data Intelligence instance. The tenants are isolated from each other. Tenants let you create logical partitions inside an SAP Data Intelligence instance. Data from one tenant is not accessible to other tenants; you could, for example, use different tenants for different departments of your business. The two tenants would have different users and data.
Be aware of the following information about tenants:
● You must have an enterprise plan instance before you can add more tenants to it.
● Additional tenants share the same hardware resources of their parent enterprise instance. This is important to consider when sizing your system. A heavy workload on one tenant could slow down workloads running on another tenant. You can specify resource quotas when you create a tenant to limit how many resources it can use.
● Up to 20 tenants can be added to a given enterprise instance.
To add new tenants to your existing SAP Data Intelligence cloud service, perform the following steps:
Procedure
1. In SAP BTP cockpit, select Global Accounts and select the global account that contains your SAP Data Intelligence instance.
2. Select the subaccount that contains your SAP Data Intelligence instance.
3. From the spaces list, select the space where you created your SAP Data Intelligence enterprise instance.
4. To create an SAP Data Intelligence tenant, choose Services > Service Marketplace, search for "Data Intelligence", and select the SAP Data Intelligence service.
5. In Choose Service Plan, choose the tenant plan and click Next.
6. In the Specify Parameters window, choose the name of your tenant, and the user name and password for the initial user that will be added to the tenant.
7. (Optional) Add resource quotas to limit how many resources the tenant can use.
You can have multiple quotas, and specify whether the target you are limiting is the applications or the workloads. You can apply quotas to three types of resources:
○ CPU (millicpu)
○ Memory (megabytes)
○ Pod count (number of pods)
8. The Cluster Name parameter specifies which enterprise service instance you are adding the tenant to. If you have only one enterprise instance, it is automatically selected in this field. If you have more than one SAP Data Intelligence enterprise service instance, choose the instance that you are adding the tenant to, and click Next.
9. In the Confirm window, choose your instance name and click Finish.
The creation of a new tenant can take up to 20 minutes.
Results
To use the new tenant, return to the instance list view. After the tenant Last Operation field is set to Created, click the Open Dashboard button in the Actions column, which redirects you to the tenant login screen. The tenant name and user credentials that you selected in Step 6 should be filled in automatically so that you can log into your new tenant.
2.2 Update an SAP Data Intelligence Instance in SAP BTP
If you have a running SAP Data Intelligence instance, you can update it to extend the capacity or reduce costs.
Prerequisites
You already have a running SAP Data Intelligence enterprise instance.
Context
In the SAP Data Intelligence Administration Guide, we provide high-level steps to update an SAP Data Intelligence instance on SAP BTP. For more detailed information, or for instructions that use the Cloud Foundry Command-Line Interface, see the SAP BTP documentation.
To update an SAP Data Intelligence service instance, perform the following steps to update a running instance in SAP BTP.
Procedure
1. Open the SAP BTP dashboard and select the subaccount where the SAP Data Intelligence service instance was created.
2. Navigate to Instances and Subscriptions.
3. Find the SAP Data Intelligence instance with the plan
Related Information
Managing SAP Data Intelligence Cluster Hibernation and Wakeup [page 12]
Scheduling Hibernation and Wakeup [page 13]
2.3 Managing SAP Data Intelligence Cluster Hibernation and Wakeup
You can "pause" SAP Data Intelligence clusters to reduce costs while the cluster is not in use.
Hibernation means that all cluster nodes are stopped and that only storage is persisted. Although hibernation provides many advantages from a cost perspective, note that costs are greatly reduced but not eliminated.
Note
To trigger hibernation and add schedules, the user must be in the SAP BTP Cloud Foundry Space where SAP Data Intelligence is created, and have either a Space Manager or Space Developer role. For information about adding space users, see Add Space Members Using the Cockpit.
Note the following points:
● Hibernation cannot be triggered while graphs are running. You must first manually stop them.
● Hibernation cannot be triggered while an operational task, such as backup, is in progress.
● During hibernation, the cluster cannot be updated or deleted.
● During hibernation, backups are not triggered.
● During hibernation, scheduled Modeler jobs are not triggered.
● During hibernation, the cluster URL is not accessible and a connection timed out error is received.
● During a maintenance window, hibernating SAP Data Intelligence clusters are woken up and then hibernated again after completion.
● For availability tracking, the hibernation period is not considered.
When a cluster is in hibernation, the View Dashboard button redirects to an informational page.
The hibernation process takes approximately 15 minutes to complete; the wake-up process takes approximately 30 minutes.
2.3.1 Triggering Hibernation and Wakeup
You can trigger hibernation and wakeup using either the command line or the user interface.
To trigger hibernation or wakeup via the command line, use the following commands as an example:
Enable Hibernation
cf update-service $DATA_INTELLIGENCE_INSTANCE_NAME -p enterprise -c '{"enableHibernation":"true"}'
Disable Hibernation
cf update-service $DATA_INTELLIGENCE_INSTANCE_NAME -p enterprise -c '{"enableHibernation":"false"}'
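The -c argument in these calls is a JSON document; when scripting the toggle, building that document programmatically avoids shell-quoting mistakes. A minimal sketch (the hibernation_command helper is our illustration, not part of the cf CLI):

```python
import json

def hibernation_command(instance_name: str, enable: bool) -> str:
    """Compose the documented cf CLI call that enables or disables
    hibernation for an SAP Data Intelligence enterprise instance."""
    params = json.dumps({"enableHibernation": "true" if enable else "false"})
    return f"cf update-service {instance_name} -p enterprise -c '{params}'"

print(hibernation_command("my-di-instance", enable=True))
```

Pass the resulting string to your shell; the instance name above is a placeholder.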
2.3.2 Scheduling Hibernation and Wakeup
In addition to manually triggering hibernation, you can also schedule hibernation and cluster wakeup by configuring hibernation schedules.
Note that scheduled hibernation is triggered even if Modeler graphs are running or if operational tasks are being executed.
You can configure hibernation and wakeup scheduling using either the cf command line or the user interface.

Note
Currently, you can configure a schedule using the SAP BTP cockpit user interface only when you create the instance. Updating an instance schedule is not supported.

Currently, at most seven schedules can be configured simultaneously. When you use the SAP BTP user interface, schedules are configured using the browser time zone. Using the command line, schedules must be provided in UTC.
The hibernation and wakeup schedule is configured via the command line using cron expressions, but the Day of Month and Month fields are not allowed. The start and end properties configure hibernation and wakeup, respectively. Use the following commands as an example:
# Configure a hibernation schedule from Monday to Friday at 18:00 UTC
cf update-service $DATA_INTELLIGENCE_INSTANCE_NAME -p enterprise -c '{"hibernationSchedules": "[{\"start\": \"0 18 * * 1,2,3,4,5\"}]"}'

# Configure a wakeup schedule from Monday to Friday at 08:00 UTC
cf update-service $DATA_INTELLIGENCE_INSTANCE_NAME -p enterprise -c '{"hibernationSchedules": "[{\"end\": \"0 8 * * 1,2,3,4,5\"}]"}'

# Configure hibernation and wakeup schedules together
cf update-service $DATA_INTELLIGENCE_INSTANCE_NAME -p enterprise -c '{"hibernationSchedules": "[{\"start\": \"0 18 * * 1,2,3,4,5\", \"end\": \"0 8 * * 1,2,3,4,5\"}]"}'

# Configure hibernation at 18:00 UTC on Mondays and wakeup at 08:00 UTC on Tuesdays
cf update-service $DATA_INTELLIGENCE_INSTANCE_NAME -p enterprise -c '{"hibernationSchedules": "[{\"start\": \"0 18 * * 1\"}, {\"end\": \"0 8 * * 2\"}]"}'

# Invalid: the Day of Month and Month fields are not allowed
cf update-service $DATA_INTELLIGENCE_INSTANCE_NAME -p enterprise -c '{"hibernationSchedules": "[{\"end\": \"0 8 * 1 1\"}]"}'
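Because only the minute, hour, and day-of-week fields may be set, it can be useful to reject an invalid schedule before calling cf update-service. A small sketch of such a check, assuming the standard five-field cron order (minute, hour, day of month, month, day of week); the function name is ours:

```python
def is_valid_hibernation_cron(expr: str) -> bool:
    """Return True only if the expression has five fields and leaves
    the Day of Month and Month fields (positions 3 and 4) as '*',
    matching the restriction described above."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    _minute, _hour, day_of_month, month, _day_of_week = fields
    return day_of_month == "*" and month == "*"

print(is_valid_hibernation_cron("0 18 * * 1,2,3,4,5"))  # → True
print(is_valid_hibernation_cron("0 8 * 1 1"))           # → False (Month is set)
```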
The following are some schedule examples:
1. Hibernate at 00:00 every Saturday and wake up at 00:00 every Monday (hibernate from 00:00 Saturday to 00:00 Monday; that is, the whole weekend):
2. Hibernate at 20:00 every day and wake up at 06:00 every day (hibernate from 20:00 to 06:00 every day; that is, every night):
3. Hibernate at 20:00 and wake up at 06:00 every Monday to Friday (hibernate from 20:00 Friday to 06:00 Monday and from 20:00 to 06:00 Monday to Friday; that is, every night during workdays, and the whole weekend):
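The original guide illustrates these three schedules with screenshots. Expressed in the command-line JSON format shown earlier, they could look as follows; the cron values are our reading of the prose descriptions (day-of-week 1 = Monday, 6 = Saturday), and times must be given in UTC when using the command line:

```python
import json

# Example 1: hibernate 00:00 Saturday, wake 00:00 Monday (whole weekend)
weekend = [{"start": "0 0 * * 6", "end": "0 0 * * 1"}]

# Example 2: hibernate 20:00 and wake 06:00 every day (every night)
nightly = [{"start": "0 20 * * *", "end": "0 6 * * *"}]

# Example 3: hibernate 20:00 and wake 06:00 Monday to Friday
# (every night during workdays, plus the whole weekend)
workweek = [{"start": "0 20 * * 1,2,3,4,5", "end": "0 6 * * 1,2,3,4,5"}]

def schedule_parameter(schedules):
    """Encode a schedule list as the -c JSON document that
    cf update-service expects (note the doubly encoded inner list)."""
    return json.dumps({"hibernationSchedules": json.dumps(schedules)})

print(schedule_parameter(weekend))
```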
2.4 Configure SAP Cloud Connector
In the SAP Cloud Connector, or SCC, you establish the set of on-premise systems to be made available in SAP Data Intelligence Cloud.
Prerequisites
Before you can configure the SAP Cloud Connector, the following prerequisites must be fulfilled:
● The SAP Cloud Connector is installed (version 2.12.2 or later). For more information about installing SAP Cloud Connector, see SAP Cloud Connector Installation.
● SAP Data Intelligence Cloud is installed. SAP Data Intelligence Cloud provides one entitlement for the connectivity proxy service, which is automatically installed as a service instance next to SAP Data Intelligence Cloud.
● On SAP BTP, the subaccount user used for initial setup has established the SAP Cloud Connector connection.
Context
The SAP Cloud Connector serves as a link between SAP BTP applications, such as SAP Data Intelligence Cloud, and on-premise systems. For more information about configuring and using the SAP Cloud Connector, see SAP Cloud Connector Configuration.
Procedure
1. Log on to the SAP Cloud Connector.
2. Click the Add Subaccount button and enter or select the following information, and then click Save.
Field Description
Region Choose the region where the SAP Data Intelligence Cloud service has been deployed.
Subaccount The values you obtained when you registered your account on SAP BTP. To get your subaccount ID in the Cloud Foundry environment, see Find Your Subaccount ID (Cloud Foundry Environment).
Subaccount User Password The user name and password of the user who approves the Cloud Connector access to SAP BTP. The user must be an administrator of the subaccount.
Location ID The SAP Cloud Connector over which the connection is opened.
3. In the Subaccount dashboard, you can check the state of all subaccount connections managed by the SAP Cloud Connector at a glance. After you create your subaccount, select it from the main menu, and check the status on the Connector Dashboard to verify that it is connected. The Connector Overview section provides additional information about the subaccount.
4. In the main menu, select Cloud to On-Premise. The Cloud To On-Premise dashboard displays the SAP Data Intelligence Cloud on-premise connections that you want to be visible within the SAP Cloud Connector network.
The internal host specifies the host and port under which the target system can be reached within the intranet. It must be an existing network address that can be resolved on the intranet and has network visibility for the SAP Cloud Connector. The SAP Cloud Connector tries to forward the request to the network address specified by the internal host and port, so this address needs to be real.
The virtual host is the name that is displayed within SAP Data Intelligence Cloud. The fields are prepopulated with the values of the internal host and internal port. You can also assign a different port to the virtual host than to the internal host.
5. Now you can establish the set of systems to be made available by the SAP Cloud Connector. To create a connection, click the Add button.
SAP Data Intelligence Cloud supports the following:
Supported Back-End Type Supported Protocols
ABAP System RFC
SAP HANA TCP
Non-SAP System HTTP, HTTPS
Some protocols require additional configuration (for example, HTTP). For additional configuration steps, see SAP Cloud Connector Initial Configuration.
Additional Information ○ If the target requires TLS using self-signed certificates: ○ For HTTP- and HTTPS-based connections, certificates must be stored in the SAP Cloud Connector. ○ For connections tunneled via socks5 through Cloud Connector (SAP HANA), certificates must be stored in SAP Data Intelligence Cloud Connection Management. ○ SAP Data Intelligence Cloud does not support the Trusted applications feature of the SAP Cloud Connector.
Results
After you configure your connections in the SAP Cloud Connector, you can set up your connection in SAP Data Intelligence Cloud Connection Management.
Related Information
Using SAP Cloud Connector Gateway [page 125]
SAP Cloud Connector
Troubleshooting SAP Cloud Connector [page 160]
3 Using SAP Data Intelligence System Management
The SAP Data Intelligence System Management application allows you to manage applications, users, and files. It provides the initial point of access to the user-facing applications running on its server.
The following section lists some of the tasks that you can perform in the System Management application:
● Manage tenant administrator and member users within the tenant
● Manage users' secrets
● Create application instances (deprecated)
● Delete application instances
● Launch applications
● Configure applications
● Manage files
Related Information
Log On to SAP Data Intelligence System Management [page 18]
Manage Applications [page 20]
Manage Users [page 31]
Manage Policies [page 33]
Manage Files [page 48]
Manage Strategies [page 51]
Configuring Client Certificate Authentication for NGINX in SAP Data Intelligence [page 54]
3.1 Log On to SAP Data Intelligence System Management
An initial System Management user is created during the installation of SAP Data Intelligence for the initial logon.
Procedure
1. Open the System Management UI in a browser. It is available at the following URL:
○ For on-premise installations (if TLS is enabled):
https://
○
https://
○
kubectl get ingress -n $NAMESPACE
The login screen appears.
2. Enter the required data:
○ Your tenant name ○ Your username ○ Your password
For a newly created instance, the tenant name is "default". For the user name and password, use the credentials that you specified when creating the service.
The System Management UI opens, displaying the initial screen. Your user is shown at the top-right of the screen.
Note
If you enter an incorrect password five consecutive times within a minute, your account is temporarily locked for ten seconds before you can try again.
Results
An example of the initial screen for regular tenants is shown below. You can manage applications, users, and files using the tabs at the top of the page:
You can view the status of the asynchronous tasks in the (Tasks) panel. You can manage the task history of the completed tasks using Set Preferences.
Related Information
Manage Clusters
3.2 Manage Applications
The application management feature allows you to launch stable instances of applications associated with the tenant. For example, you can launch instances of SAP Vora Tools, Launchpad, Modeler, Connection Management, and more.
Prerequisites
You are a tenant administrator or a member user.
Context
The Applications page lists all applications currently available for the active user:
● An instance of an application launched from application management can be accessed only by the user who started it.
● A stable application instance can be started or restarted.
● Applications can be configured only by a tenant administrator.
Procedure
In the Applications page, perform the appropriate action:
Option Description
Launch an application:
1. From the list of applications in the Applications page, locate the application.
2. In the Action column, click the icon.
Note: You cannot launch applications that have no user interface.
Start a stable application instance:
○ In the Applications page, for an application, click the (Start) icon to start a stable instance.
○ If an application instance is already started, click the (Restart) icon to restart the instance.
Note: You cannot restart an application for one of the following reasons:
○ The application is a nested application. You can instead try restarting the core application.
○ You don't have permissions to restart the application.
Set application configuration: Tenant administrators can set application configurations.
1. In the Applications page, choose the icon.
2. In the View Application Configuration and Secrets page, under the General tab, find the relevant parameter and enter a value.
Note: Some of the application parameters are disabled for modification because they are protected. Protected parameters can only be modified by cluster administrators.
3. Choose Save to save the setting.
Create a secret: Tenant administrators can create and upload secrets.
1. In the Applications page, choose the icon.
2. In the View Application Configuration and Secrets page, under the Secrets tab, choose the icon to create a secret.
3. Enter a name for the new secret.
4. Browse to select and upload your secret file.
5. Choose Create.
Note: Secrets can't be shared across tenants.
Related Information
SAP Data Intelligence Nested Applications [page 22]
Manage Metadata and Preparation Memory [page 22]
Manage Metadata Automatic Lineage Extractions [page 24]
Manage Size Limit for File Upload and Download [page 25]
Manage Metadata Rule Validation CSV Length [page 26]
Manage Metadata Failed Records [page 26]
3.2.1 SAP Data Intelligence Nested Applications
Some SAP Data Intelligence applications that consume fewer resources are packaged and deployed on a single pod. These applications are referred to as nested applications.
SAP Data Intelligence Connection Management, SAP Data Intelligence Monitoring, and SAP Data Intelligence Metadata Explorer are some examples of nested applications. The routing of these nested applications is internally managed by a core application and is always mapped on a stable instance. In general:
● When you launch any nested application, SAP Data Intelligence creates a stable instance for it in the core application (if no stable instance exists). This application consumes resources from the single pod where the core application resides.
● You cannot create new instances of nested applications. These always point to the core application.
● When you delete the stable instance of the core application, SAP Data Intelligence removes all the instances of the nested applications managed by the core application.
3.2.2 Manage Metadata and Preparation Memory
Change the default size for metadata and data preparation memory usage.
Prerequisites
You are logged in as a tenant administrator.
Context
The default memory usage limits are 8192 MB for the Metadata Catalog and 4096 MB for Preparations. It is recommended that approximately 60 percent of the available SAP HANA memory usage is divided between the Metadata Catalog and Preparations. The other 40 percent is used for other applications, queries, and logging.
Remember
The HANA instance is shared among all tenants, but this size limit applies to a single tenant. Hence, consider all tenants when dividing memory usage. For example, if you have one tenant and a 20 GB database, the default is 60 percent. But if you have two tenants, the available space is effectively halved, because each tenant could potentially use 60 percent with the default and HANA would run out of memory.
If you are running out of memory for the Metadata Catalog, you can increase the memory usage for Metadata Catalog and decrease the memory for Preparations, for example. Follow these steps to increase or decrease the memory usage limits.
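As a rough illustration of the guidance above, the limits can be derived from the SAP HANA memory size and the tenant count. The function and the equal-split-per-tenant assumption are ours for illustration, not an SAP-provided formula:

```python
def tenant_memory_limits(hana_memory_mb: int, tenant_count: int) -> dict:
    """Illustrative sizing: reserve ~40% of SAP HANA memory for other
    applications, queries, and logging; divide the remaining ~60%
    equally among tenants; then split each tenant's share between the
    Metadata Catalog and Preparations in the default 2:1 ratio
    (8192 MB vs. 4096 MB)."""
    per_tenant = hana_memory_mb * 0.60 / tenant_count
    return {
        "metadata_catalog_mb": round(per_tenant * 2 / 3),
        "preparations_mb": round(per_tenant / 3),
    }

# A 20 GB (20480 MB) database shared by two tenants:
print(tenant_memory_limits(20480, 2))  # → {'metadata_catalog_mb': 4096, 'preparations_mb': 2048}
```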
Procedure
1. Open the System Management UI. Under Applications, click the View Application Configuration and Secrets icon.
2. In the General tab, click and set the limit for the following options:
Option Description
Metadata Explorer: Metadata catalog memory usage Controls the amount of memory available for the data in limit (MB) the Metadata Catalog. Enter a value in MB.
Applications: Preparation memory usage limit (MB) Controls the amount of memory available when processing data preparations. Enter a value in MB.
Note
If you set the memory usage limit to 0, then the system sets the value of 8192 MB for Metadata Explorer and 4096 MB for Data Preparation.
3. Click Update. Close the Configuration panel.
4. Restart the Data Application by clicking Restart.
Results
When you open the Metadata Explorer, the dataintelligence-app-data application is re-created with the updated memory usage limits. Go to the Metadata Explorer to verify the results in the Memory Usage tile.
Related Information
Using the Metadata Explorer
3.2.3 Manage Metadata Automatic Lineage Extractions
Change the settings to automatically extract lineage from graphs in the Modeler or data preparations in the Metadata Explorer.
Context
Automatic lineage extraction is run on Modeler graphs and Metadata Explorer data preparations. The benefit of automatically extracting lineage is to have a history of lineage extractions over time, and you can view how data sources have been added or removed. It also shows which data sources and transformations are included in the output of a data target.
By default, automatic lineage extraction is turned off.
Note
To view the Automatic Lineage tab in Metadata Explorer, you must assign the app.datahub-app-data.administrator policy to the user in System Management.
Procedure
1. Choose System Management from the Launchpad.
2. On the Applications tab, choose View Application Configuration and Secrets.
3. Click Edit.
4. In the Metadata Explorer: Automatic lineage extraction of Modeler Graphs option, choose one of these settings.
○ enabled_and_publish_datasets: Extracts lineage and publishes the dataset to the catalog in the Metadata Explorer. Access the lineage by browsing the connection or the catalog.
○ enabled_and_do_not_publish_datasets: Extracts lineage but does not publish the dataset to the catalog. Access lineage by browsing the connection in the Metadata Explorer.
○ disabled: Do not automatically extract lineage.
5. In the Metadata Explorer: Automatic lineage extraction of Data Preparations option, choose one of these settings.
○ enabled_and_publish_datasets: Extracts lineage and publishes the dataset to the catalog in the Metadata Explorer. Access the lineage by browsing the connection or the catalog.
○ enabled_and_do_not_publish_datasets: Extracts lineage but does not publish the dataset to the catalog. Access lineage by browsing the connection in the Metadata Explorer.
○ disabled: Do not automatically extract lineage.
6. In the Metadata Explorer: Days until deletion of automatic lineage extractions from the monitoring task list and catalog option, set the number of days after which the automatic lineage is deleted. Set to -1 to keep all automatic lineage extractions.
7. In the Metadata Explorer: Automatic lineage extraction frequency option, set the time when you want automatic lineage started.
8. Click Update.
9. Restart the Data App Daemon by clicking Restart.
3.2.4 Manage Size Limit for File Upload and Download
You can change the default size for files that can be uploaded and downloaded in Metadata Explorer.
Prerequisites
Note
This procedure does not change the limits in Data Preparation.
You’re logged in as a tenant administrator.
Context
The default size limit for file upload is 100 MB. You can increase this limit to support the file sizes you wish to allow. This setting affects all users in the tenant.
Follow these steps to increase or decrease the size limit for file upload.
Procedure
1. Open the System Management UI. Under Applications, click the View Application Configuration and Secrets icon.
2. In the General tab, click the Edit icon and set the limit for the following option.
Option Description

Applications: Size limit on the files that can be uploaded or downloaded (MB)
  Controls the file size that can be uploaded and downloaded. Enter a value in MB. If you enter 0, the system defaults to the value of 100 MB.
3. Click Update. Close the configuration panel. 4. Restart the Data Application by clicking Restart. 5. Refresh the Metadata Explorer browser page to use the new file size limit.
3.2.5 Manage Metadata Rule Validation CSV Length
Set the maximum length for a CSV column when validating rules.
If you receive errors while processing your rules with CSV files, you could have an issue with the string length. By default, a string can have 5000 characters while running a rulebook. If your strings are longer than 5000 characters, you can increase the size in the Maximum CSV column length for rule validation option.
Follow these steps to increase or decrease the maximum CSV column length.
1. Open the System Management UI. Under Applications, click the View Application Configuration and Secrets icon.
2. In the General tab, click the Edit icon and set the limit for Maximum CSV column length for rule validation.
3. Click Update. Close the configuration panel.
4. Restart the Data Application by clicking Restart.
3.2.6 Manage Metadata Failed Records
Save the data that does not pass the rules to a separate dataset.
Context
When you create and run rules, you may have records that do not pass the rules. You can save those records to a separate location and examine why the records failed.
Note
You may need certain rights and permissions to set these options.
Procedure
1. Open the System Management UI. Under Applications, click the View Application Configuration and Secrets icon.
2. In the General tab, click the Edit icon and set these options.
Option Description

Metadata Explorer: Failed record connection ID, only HANA_DB connection types supported
  Enter the SAP HANA_DB connection ID.

Metadata Explorer: Failed record schema, Example: /Failed_Records
  Enter the name of the schema where the failed data is loaded.
3. Click Update. Close the Configuration panel. 4. Restart the Data Application by clicking Restart.
Related Information
Extract Details from Failed Records [page 27] Run Rulebooks and View Results
3.2.6.1 Extract Details from Failed Records
Use your database management system to query metadata tables that contain failed data information.
In Metadata Explorer, you can choose to save all failed records to a separate table while running a rulebook. During processing, the Metadata Explorer adds metadata tables to the relational database system. You can extract information from those tables in your database management system by using SQL-like queries. For example, you can learn about all the rules a given record failed in one run.
This section describes the tables related to the failed data output.
The following diagram shows the tables and how some columns are connected to each table.
RULEBOOK_EXECUTION_HISTORY
This table contains a new record for every rulebook that is run.
Column Name Data Type Description
EXECUTION_ID STRING(32)/NVARCHAR(32) Unique identifier for the run. This column is the primary key for this table.
RULEBOOK_ID STRING(32)/NVARCHAR(32) Unique identifier for a rulebook.
RULEBOOK_NAME STRING(256)/NVARCHAR(256) User-entered name of the rulebook.
RULEBOOK_DESCRIPTION STRING(5000)/NVARCHAR(5000) User-entered description of the rulebook.
START_TIME TIMESTAMP Starting time of the rulebook run.
END_TIME TIMESTAMP Ending time of the rulebook run.
STATUS STRING(20)/NVARCHAR(20) Status of the rulebook execution, for example, OK or ERROR.
RULE_EXECUTION_HISTORY
This table contains a new record for every rule that is part of a rulebook run.
Column Name Data Type Description
EXECUTION_ID STRING(32)/NVARCHAR(32) Unique identifier for the run. This column is part of the primary key for this table.
RULE_ID STRING(32)/NVARCHAR(32) Unique identifier for a rule. This column is part of the primary key for this table.
RULE_ID_NAME STRING(256)/NVARCHAR(256) User-entered ID of the rule.
RULE_DISPLAY_NAME STRING(256)/NVARCHAR(256) User-entered name of the rule.
RULE_DESCRIPTION STRING(5000)/NVARCHAR(5000) User-entered description of the rule.
CATEGORY STRING(256)/NVARCHAR(256) Category where the rule is included.
DATASET_EXECUTION_HISTORY
This table contains a new record for every unique dataset that is part of a rulebook run.
Column Name Data Type Description
EXECUTION_ID STRING(32)/NVARCHAR(32) Unique identifier for the run. This column is part of the primary key for this table.
DATASET_ID STRING(32)/NVARCHAR(32) Unique identifier for a dataset. This column is part of the primary key for this table.
CONNECTION_ID STRING(256)/NVARCHAR(256) Unique connection identifier for the connection used. This ID is what is shown in Connection Management.
QUALIFIED_NAME STRING(256)/NVARCHAR(256) Qualified name of the dataset.
DATASET_PREFIX STRING(32)/NVARCHAR(32) Value for "DATASET_PREFIX" for this run that specifies the _FAIL_DATA and _FAIL_INFO tables.
START_TIME TIMESTAMP Starting time of the dataset run.
END_TIME TIMESTAMP Ending time of the dataset run.
STATUS STRING(20)/NVARCHAR(20) Status of the rulebook run, for example, OK or ERROR.
TOTAL_ROWS FLOATING(8)/DOUBLE Number of rows that were part of the dataset run.
RULE_BINDING_HISTORY
This table contains a new record for every unique binding of a rule that is part of a rulebook run. If the same rule is bound to the same dataset multiple times, then there are multiple rows in this table for the same rule ID.
Column Name Data Type Description
EXECUTION_ID STRING(32)/NVARCHAR(32) Unique identifier for the run. This column is part of the primary key for this table.
RULE_ID STRING(32)/NVARCHAR(32) Unique identifier for a rule. This column is part of the primary key for this table.
BINDING_ID STRING(32)/NVARCHAR(32) Unique identifier for a rule binding. This column is part of the primary key for this table.
COLUMN_BINDING_HISTORY
This table contains a new record for every unique column mapping that is part of a rule binding. If a rule has multiple parameters, then there are multiple rows in this table, one for each parameter within the rule.
Column Name Data Type Description
EXECUTION_ID STRING(32)/NVARCHAR(32) Unique identifier for the run. This column is part of the primary key for this table.
BINDING_ID STRING(32)/NVARCHAR(32) Unique identifier for a rule binding. This column is part of the primary key for this table.
PARAMETER STRING(256)/NVARCHAR(256) Rule parameter name. This column is part of the primary key for this table.
COLUMN_NAME STRING(256)/NVARCHAR(256) Data source column name that the specified parameter is bound to.
<DATASET_PREFIX>_FAIL_DATA

This table contains a new record for every record that fails a run.
Column Name Data Type Description
EXECUTION_ID STRING(32)/NVARCHAR(32) Unique execution identifier that is the same for every record from a single run. This column is part of the primary key for this table.
GENERATED_ID FLOATING(8)/DOUBLE Generated row identifier that is created as part of rule processing that uniquely identifies each record. This ID is helpful for linking with the detailed failure information in the <DATASET_PREFIX>_FAIL_INFO table.
If a rule expression contains either the is_unique or is_data_dependent function, an additional column is created for each function. The column contains the group ID if this record failed the is_unique or is_data_dependent function. The column name is GRP_ID_
<DATASET_PREFIX>_FAIL_INFO

A record is present in this table for every rule failure that happens. Because a single record being processed can fail multiple rules, this table may contain multiple records for each failed record. Because a rule may also contain multiple parameters and columns used, this table may also contain multiple records in that case as well.
Column Name Data Type Description
EXECUTION_ID STRING(32)/NVARCHAR(32) Unique identifier for the run. This column is part of the primary key for this table.
GENERATED_ID FLOATING(8)/DOUBLE Generated row identifier that is created as part of rule processing that uniquely identifies each record. This ID is helpful for linking with the failed records in the <DATASET_PREFIX>_FAIL_DATA table.
BINDING_ID STRING(32)/NVARCHAR(32) Unique identifier for a rule binding that Metadata Explorer already provides as part of the rule configuration. This column is part of the primary key for this table.
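As a sketch of the kind of query this schema enables, the following joins the history tables described above to list every rule a given record failed in one run. It uses SQLite instead of SAP HANA so it is self-contained and runnable; the sample data, the FD dataset prefix, and the reduced column set are invented for illustration only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE RULE_EXECUTION_HISTORY (
    EXECUTION_ID NVARCHAR(32), RULE_ID NVARCHAR(32),
    RULE_DISPLAY_NAME NVARCHAR(256));
CREATE TABLE RULE_BINDING_HISTORY (
    EXECUTION_ID NVARCHAR(32), RULE_ID NVARCHAR(32),
    BINDING_ID NVARCHAR(32));
-- <DATASET_PREFIX>_FAIL_INFO with an invented prefix "FD"
CREATE TABLE FD_FAIL_INFO (
    EXECUTION_ID NVARCHAR(32), GENERATED_ID DOUBLE,
    BINDING_ID NVARCHAR(32));
""")
cur.executemany("INSERT INTO RULE_EXECUTION_HISTORY VALUES (?,?,?)",
                [("run1", "r1", "Not Null"), ("run1", "r2", "Valid Email")])
cur.executemany("INSERT INTO RULE_BINDING_HISTORY VALUES (?,?,?)",
                [("run1", "r1", "b1"), ("run1", "r2", "b2")])
cur.executemany("INSERT INTO FD_FAIL_INFO VALUES (?,?,?)",
                [("run1", 7.0, "b1"), ("run1", 7.0, "b2")])

# All rules that the record with GENERATED_ID 7 failed in run "run1":
# _FAIL_INFO -> RULE_BINDING_HISTORY -> RULE_EXECUTION_HISTORY.
cur.execute("""
    SELECT DISTINCT r.RULE_DISPLAY_NAME
    FROM FD_FAIL_INFO f
    JOIN RULE_BINDING_HISTORY b
      ON f.EXECUTION_ID = b.EXECUTION_ID AND f.BINDING_ID = b.BINDING_ID
    JOIN RULE_EXECUTION_HISTORY r
      ON b.EXECUTION_ID = r.EXECUTION_ID AND b.RULE_ID = r.RULE_ID
    WHERE f.EXECUTION_ID = ? AND f.GENERATED_ID = ?
    ORDER BY r.RULE_DISPLAY_NAME
""", ("run1", 7.0))
failed_rules = [row[0] for row in cur.fetchall()]
print(failed_rules)   # ['Not Null', 'Valid Email']
```

Against SAP HANA the same SELECT would be issued through your SQL client of choice, with BINDING_ID joining the failure detail back to the rule and its display name.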
3.3 Manage Users
A System Management user can use the features in System Management and access the applications it runs.
Prerequisites
SAP Data Intelligence integrates with SAP BTP User Account and Authentication (SAP BTP UAA). In SAP Data Intelligence, users are not managed by System Management, but in the SAP BTP UAA. Nevertheless, it is possible and necessary to assign policies to these users, as described below. To make the user initially visible in System Management, the user needs to log in once. The user automatically gets the Member policy assigned, which can be changed later by a Tenant Administrator.
Context
A user is uniquely identified by its name. A user name can consist of 4 to 64 alphanumeric characters (that is, letters and numbers, but not punctuation marks or white spaces). A user name cannot be changed later. By default, the password must be between 8 and 255 characters. These settings can be changed during the installation.
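A minimal sketch of these constraints as validation code, assuming the stated rules (4 to 64 alphanumeric characters for user names, 8 to 255 characters for default passwords). The function names are illustrative and not part of any SAP API:

```python
import re

# Letters and numbers only, 4-64 characters; no punctuation or
# white space, as described in the text above.
USERNAME_RE = re.compile(r"^[A-Za-z0-9]{4,64}$")

def is_valid_username(name: str) -> bool:
    """Check the documented user-name constraints."""
    return bool(USERNAME_RE.fullmatch(name))

def is_valid_password(password: str) -> bool:
    """Check the default 8-255 character password length."""
    return 8 <= len(password) <= 255

print(is_valid_username("admin01"))     # True
print(is_valid_username("no spaces"))   # False: white space not allowed
print(is_valid_username("abc"))         # False: shorter than 4 characters
print(is_valid_password("s3cret!pw"))   # True
```

Because user names are case insensitive, a caller would typically normalize with name.lower() before comparing against existing users.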
A user is either a tenant administrator or member user depending on the policies assigned to the user. When creating a user, you assign the required policies.
User Allowed Actions
Tenant Administrator The tenant administrator can do the following within their tenant:
● Create, view, and delete other tenant administrators and members of their tenant.
● Reset their own and other users' passwords.
● View and delete user-created instances.
● Assign policies.
Member The member users can change their own passwords.
Procedure
1. Open the System Management UI and at the top of the page choose the Users tab.
The Users page lists all tenant administrators and members in the tenant. 2. Choose the appropriate option:
Option Description
Create a user 1. In the Users page, choose the icon. 2. Enter a username.
Note Usernames are case insensitive. This means that User is the same as user, USER, and so on. Login works with any variation.
3. Enter the new user's password twice. 4. To require the user to change the password upon next login, enable the respective checkbox. 5. Choose Create.
Change a user's password 1. Locate the user in the Users page. 2. Click the icon. 3. In the details page, choose Change Password. 4. Enter the current password. 5. Enter the new password twice. 6. To require the user to change the password upon next login, enable the respective checkbox. 7. Choose Change Password.
Assign policies to a user By default, members are assigned the sap.dh.member policy. This provides them default access to SAP Data Intelligence System Management and SAP Data Intelligence Monitoring. Tenant administrators can modify (add or remove) the default policies assigned to users within their tenant.
SAP Data Intelligence supports the following default policies.
○ sap.dh.admin provides access to all applications deployed in their tenant. This policy is assigned to tenant administrators.
○ sap.dh.developer provides access only to the Modeler along with the default access.
○ sap.dh.metadata provides access only to Connection Management and Metadata Explorer along with default access.
Note sap.dh.metadata is deprecated and will be removed in future releases. Instead, you can use app.datahub-app-data.fullAccess as a close replacement.
○ sap.dh.member provides access only to SAP Data Intelligence System Management, Monitoring, Connection Management, and Vora tools (AutoML, ML Data Manager, and ML Scenario Manager). This policy is assigned to tenant members.
To assign policies, follow these steps: 1. In the Users page, select a user. 2. Choose the Policies tab. 3. To assign a policy, choose the icon. 4. Select the policy you want to assign and choose Assign.
To remove policies, follow these steps: 1. In the Users page, select a user. 2. Choose the Policies tab.
Option Description
From the list of assigned policies, locate the policy that you want to remove for the selected user. 3. In the Action column, click the icon.
Note
The user has to log out of the current session and log in again for the changes to take effect.
Tenant administrators can manage a policy with the SAP Data Intelligence Policy Management application: 1. In the Users page, select a user. 2. In the Policies tab, from the list of assigned policies, click the policy.
This operation launches the SAP Data Intelligence Policy Management application in a new browser tab.
Update policies assigned to a user Only tenant administrators can update policies assigned to a user. They can promote or demote other users within their tenant. 1. Locate a user in the Users page. 2. Click the icon. 3. In the Policies tab, delete the existing policy and assign the new policy.
Delete a user 1. In the Users page, locate the user and click the icon. 2. Confirm user deletion.
Delete a user's instance 1. Select the user in the Users page. 2. In the Instances tab, locate the instance in the list and choose the icon. 3. Confirm instance deletion.
Restriction
If you don't have admin privileges, assigning the admin policy to a user is not reflected in SAP Vora. To get admin privileges on SAP Vora, you must run the following queries from an admin user account. GRANT
3.4 Manage Policies
Policy Management is the component in SAP Data Intelligence responsible for authorizations.
For application developers, policy management provides a flexible way to define domain-specific authorizations and a distributed service to perform authorization checks on object instances. For administrators, it provides an easy way to manage instance-specific authorizations and assign them to users.
Policy management implements attribute-based access control (ABAC) based on policies, following the concept of the extensible access control markup language (XACML). Authorization checks are executed in policy decision points (PDP) with the help of the open-source software Open Policy Agent (OPA). The check service is called by applications for implementing permission checks (policy enforcement point, or PEP).
Note
Policy assignments are used for permission control in multiple product components.
The propagation of a change (for example, a new assignment, a removal, or a change to the assigned policy) can be delayed until it is visible in the entire cluster. A delay of up to 15 seconds is expected.
Applications that use a permission change directly after a policy assignment has changed need to retry their requests for this duration.
Definitions
The following is a list of terms used in policy management:
Term Description
Resource An entity that you want to protect (for example, a connection instance). Each resource has a resource type.
Resource type Defines how resource instances can be identified, what activities can be performed on a resource, and how permissions are checked. It contains the following information:
● A list of possible activities on the resource.
● A list of attributes for identifying a resource instance, if required.
● Rules for how to check the authorization. Rules are given in a "rego" file that is evaluated by the "open policy agent" engine.
Policy A collection of resources and activities, which are specified either directly or referred from other (hence nested) policies. A policy gives authorization to perform the specified activities on the specified resource instances.
Assignment Maps a policy to a user (policy user assignment).
Related Information
Resource Types [page 35] Pre-Delivered Policies [page 35] Nested Policies [page 37] Application Start Policies [page 38] Resource Quotas [page 43] Working With Policies [page 44] Administering Network Policies
3.4.1 Resource Types
A resource type defines how resource instances can be identified, what activities can be performed on a resource, and how permissions are checked.
A resource type contains the following information:
● A list of possible activities on the resource.
● A list of attributes for identifying a resource instance, if required.
● Rules for how to check the authorization. Rules are given in a "rego" file that is evaluated by the "open policy agent" engine.
The following are the resource types that SAP Data Intelligence provides:
Resource Type Activities Instance Identification Policy Check Identification
application start Name of the application data.dataintelligence.vsystem.allowApplication
systemManagement read, write no instance check data.dataintelligence.vsystem.allowSystemManagement
connection read, write, ownerRead, ownerWrite connectionId, connectionIds data.dataintelligence.connection.allow, data.dataintelligence.connection.allowBatch
connectionContent manage, ownerManage connectionId, connectionIds data.dataintelligence.connectionContent.allowConnectionContent, data.dataintelligence.connectionContent.allowConnectionContentBatch
certificate manage no instance check data.datahub.certificate.allow
3.4.2 Pre-Delivered Policies
Pre-delivered policies are the policies that are created by default with the tenant.
Some of the pre-delivered application start policies are only accessible in the system tenant or in the customer tenants, depending on the applications that are installed on those tenants.
The following table describes the pre-delivered policies:
Policy ID Description Exposure Flag
sap.dh.admin Administer the tenant. true
sap.dh.applicationAllStart Start all the current applications. false
sap.dh.connectionContentAllManage Modify content on the connection, such as delete, upload, rename, and create, in the Metadata Explorer. false
sap.dh.connectionContentOwnerManage Access content of the user's own connections in Metadata Explorer. false
sap.dh.certificate.manage Manage server certificates (uploading, deleting). true
sap.dh.connectionsAllWrite Write access to all connection definitions in the tenant. false
sap.dh.connectionsOwnerRead Read access to the user's own connection definitions. false
sap.dh.connectionsOwnerWrite Write access to the user's own connection definitions. false
sap.dh.developer Work with the Pipeline Modeler. true
sap.dh.member Member of the tenant. true
sap.dh.metadata Work with the Metadata Explorer. true
Note sap.dh.metadata is deprecated and will be removed in future releases. Instead, you can use app.datahub-app-data.fullAccess as a close replacement.
sap.dh.metadataStart Start Metadata Explorer. false
Note sap.dh.metadataStart is deprecated and will be removed in future releases. Instead, you can use app.datahub-app-data.fullAccess as a close replacement.
sap.dh.modelerStart Start Pipeline Modeler. false
sap.dh.systemAccess Access to System Management APIs, base policy. false
sap.dh.systemMgtWrite Change operations in system management and policy management. false
sap.dh.connectionCredentialsUnmasked As a tenant administrator or pipeline modeler, explicitly enable this policy to see the username of connections for troubleshooting connectivity issues and to know with what credentials you are accessing the data sources. This policy gives you full visibility to selected connection fields, such as usernames. false
3.4.3 Nested Policies
A nested policy uses the definitions and resources of other policies.
The following table shows policies and their respective nested policies.
Policy ID Referred Policies
sap.dh.admin sap.dh.systemAccess, sap.dh.systemMgtWrite, sap.dh.applicationAllStart, sap.dh.connectionsAllRead, sap.dh.connectionsAllWrite
sap.dh.clusterAdmin sap.dh.systemAccess, sap.dh.systemMgtWrite, sap.dh.applicationAllStart, sap.dh.connectionsAllRead, sap.dh.connectionsAllWrite
sap.dh.developer sap.dh.modelerStart, sap.dh.connectionMgtStart, sap.dh.connectionsAllRead
sap.dh.member sap.dh.systemAccess, sap.dh.connectionMgtStart, sap.dh.connectionsOwnerRead, sap.dh.connectionsOwnerWrite, sap.dh.connectionContentOwnerManage
sap.dh.metadata sap.dh.metadataStart, sap.dh.connectionMgtStart, sap.dh.connectionsAllRead, sap.dh.connectionContentAllManage
Note sap.dh.metadata and sap.dh.metadataStart are deprecated and will be removed in future releases. Instead, you can use app.datahub-app-data.fullAccess as a close replacement.
3.4.4 Application Start Policies
Start policies allow tenant administrators to specify what applications can be started by individual users.
Application start policies allow better control over resource consumption and ease of use for users, who see only a subset of available applications in their launchpad.
A policy to start an application uses the application resource type, with the start activity and the name field. The name field is matched against the application identifier. (The application ID is the ID column that is shown while listing the available templates with the command-line client. It is also equivalent to the link to the application; for example, datahub-app-launchpad.) You can create application start policies using the SAP Data Intelligence System Management Command-Line Client or Policy Management.
All pre-delivered applications have a start policy available to use. The specific policies are not exposed; they can only be nested under other policies and not directly assigned to users.
The sap.dh.member policy includes most of the available start policies. To restrict the set of startable applications, the administrator must create a new policy and reference the appropriate start policies.
The sap.dh.applicationAllStart policy allows all applications to be started, without the need to assign specific start policies. Tenant and cluster administrators have this policy assigned to them.
Start policies are checked when an application is started (that is, when a pod is created). The start policies are not checked when the application is accessed; for example, when a user uses the stable link after the pod has been created. In other words, after a tenant application has been started, any user in possession of the stable link can use it.
Some applications may depend on features of other applications. It is the tenant administrator's responsibility to model the policies so that users have the policy to start the application, as well as the policies to start its dependencies.
Users can stop their own user applications, but not the user applications of other users, nor tenant application instances. Tenant administrators can stop tenant application instances, as well as any user application instance in their tenant. This is governed by the sap.dh.stopAppsForOtherUsers policy, which is given to administrators but not to regular members. Users with the sap.dh.stopAppsForOtherUsers policy can stop any instance of an application in their tenant, provided that they can start that application (that is, they have a start policy for that application that applies to them).
The list of available applications always depends on the policies of the user. The cluster administrator can see how many applications their users can start by using the -t and -u flags of the System Management Command-Line Client, which allow some commands to be executed as another tenant/user tuple.
Note
Start policies do not apply in the system tenant.
Related Information
Required Application Start Policies [page 39]
3.4.4.1 Required Application Start Policies
To start an SAP Data Intelligence application, a user must have access to all of the start policies required for the application.
If your user has the application start policy sap.dh.member, you already have access to many of the available application start policies. However, if your system administrator uses custom policies, you must have the following application start policies to use applications:
Application Required Application Start Policy Dependent Application Dependent Application ID
Audit Log Viewer sap.dh.datahubAppAuditlog.start Audit Log Viewer datahub-app-auditlog
Axino sap.dh.axinoService.start Axino axino-service
sap.dh.connectionMgtStart Connection Management datahub-app-connection
AutoML sap.dh.automl.start AutoML automl
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.dspGitServer.start Git Server dsp-git-server
sap.dh.mlApi.start ML API ml-api
sap.dh.mlDmApi.start Data Manager ml-dm-app
sap.dh.mlTracking.start Tracking API ml-tracking
sap.dh.modelerStart Pipeline Modeler pipeline-modeler
sap.dh.shared.start Shared shared
Flowagent sap.dh.dataHubFlowAgent.start Flowagent data-hub-flow-agent
sap.dh.axinoService.start Axino axino-service
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.datahubAppAuditlog.start Audit Log Viewer datahub-app-auditlog
JupyterLab sap.dh.jupyter.start Jupyter Lab jupyter
sap.dh.automl.start AutoML automl
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.datahubAppCore.start Core Application
sap.dh.datahubAppDaemon.start Data App Daemon
sap.dh.datahubAppData.start Data Application
sap.dh.datahubAppDatabase.start Data Intelligence App DB
sap.dh.datahubAppLaunchpad.start Launchpad datahub-app-launchpad
sap.dh.dataHubFlowAgent.start Flowagent data-hub-flow-agent
sap.dh.metadataStart Metadata Explorer datahub-app-metadata
Note sap.dh.metadataStart is deprecated and will be removed in future releases. Instead, you can use app.datahub-app-data.fullAccess as a close replacement.
sap.dh.mlApi.start ML API ml-api
sap.dh.mlDmApi.start ML DM API ml-dm-api
sap.dh.mlScenarioManager.start ML Scenario Manager ml-scenario-manager
sap.dh.mlTracking.start ML Tracking API ml-tracking
sap.dh.modelerStart Pipeline Modeler pipeline-modeler
sap.dh.shared.start Shared shared
sap.dh.trainingService.start Training Service training-service
Launchpad sap.dh.datahubAppLaunchpad.start Launchpad datahub-app-launchpad
sap.dh.shared.start Shared shared
Metadata Explorer sap.dh.metadataStart Metadata Explorer datahub-app-metadata
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.modelerStart Pipeline Modeler pipeline-modeler
Note sap.dh.metadataStart is deprecated and will be removed in future releases. Instead, you can use app.datahub-app-data.fullAccess as a close replacement.
sap.dh.modelerUI.start Modeler modeler-ui
Metrics Explorer sap.dh.metricsExplorer.start Metrics Explorer metrics-explorer
sap.dh.automl.start AutoML automl
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.jupyter.start Jupyter Lab jupyter
sap.dh.mlApi.start ML API ml-api
sap.dh.mlScenarioManager.start ML Scenario Manager ml-scenario-manager
sap.dh.mlTracking.start ML Tracking API ml-tracking
sap.dh.modelerStart Pipeline Modeler pipeline-modeler
sap.dh.shared.start Shared shared
ML API sap.dh.mlApi.start ML API ml-api
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.dspGitServer.start Git Server dsp-git-server
sap.dh.mlTracking.start Tracking API ml-tracking
sap.dh.modelerStart Pipeline Modeler pipeline-modeler
sap.dh.resourceplanService.start Resource Plan API resourceplan-service
sap.dh.trainingService.start Training Service training-service
ML Data Manager sap.dh.mlDmApp.start ML Data Manager ml-dm-app
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.datahubAppLaunchpad.start Launchpad datahub-app-launchpad
sap.dh.dataHubFlowAgent.start Flowagent data-hub-flow-agent
sap.dh.metadataStart Metadata Explorer datahub-app-metadata
Note sap.dh.metadataStart is deprecated and will be removed in future releases. Instead, you can use app.datahub-app-data.fullAccess as a close replacement.
sap.dh.mlApi.start ML API ml-api
sap.dh.mlDmApi.start ML DM API ml-dm-api
sap.dh.shared.start Shared shared
ML Deployment API sap.dh.resourceplanService.start Resource Plan API resourceplan-service
sap.dh.trainingService.start Training Service training-service
ML DM API sap.dh.mlDmApi.start ML DM API ml-dm-api
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.datahubAppLaunchpad.start Launchpad datahub-app-launchpad
sap.dh.dataHubFlowAgent.start Flowagent data-hub-flow-agent
sap.dh.metadataStart Metadata Explorer datahub-app-metadata
Note sap.dh.metadataStart is deprecated and will be removed in future releases. Instead, you can use app.datahub-app-data.fullAccess as a close replacement.
ML Scenario Manager sap.dh.mlScenarioManager.start ML Scenario Manager ml-scenario-manager
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.dspGitServer.start Git Server dsp-git-server
sap.dh.jupyter.start Jupyter Lab jupyter
sap.dh.mlApi.start ML API ml-api
sap.dh.mlDmApi.start Data Manager ml-dm-app
sap.dh.mlTracking.start Tracking API ml-tracking
sap.dh.modelerStart Pipeline Modeler pipeline-modeler
sap.dh.resourceplanService.start Resource Plan API resourceplan-service
sap.dh.shared.start Shared shared
sap.dh.trainingService.start Training Service training-service
sap.dh.modelerUI.start Modeler modeler-ui
ML Tracking API sap.dh.mlTracking.start ML Tracking API ml-tracking
Resource Plan Service sap.dh.resourceplanService.start Resource Plan Service resourceplan-service
Shared sap.dh.shared.start Shared shared
System Management sap.dh.datahubAppSystemManagement.start System Management datahub-app-system-management
sap.dh.shared.start Shared shared
Training Service sap.dh.trainingService.start Training Service training-service
sap.dh.resourceplanService.start Resource Plan Service resourceplan-service
Vora Tools sap.dh.voraTools.start Vora Tools vora-tools
sap.dh.connectionMgtStart Connection Management datahub-app-connection
sap.dh.shared.start Shared shared
3.4.5 Resource Quotas
You can use resource quotas to restrict the cluster usage for users.
Resource Quotas
Resource quotas are defined as user policies that can be assigned by cluster and tenant administrators. The following table shows the form of a resource quotas policy.
Resource Type Resource Resource Limit Target
resourceQuotas CPU, Memory, PodCount Limit (numerical value) Applications, Workloads
In a resource quotas policy, you specify which resource to restrict: CPU usage, memory usage, or the number of pods that a user is allowed to use. The limit is expressed in the corresponding unit: millicpu for CPU, bytes for Memory, and the number of pods for PodCount. You must also specify the target to be limited, that is, either applications or workloads.
You can assign multiple resource quotas policies to a user, to restrict a combination of the specified resources. If more than one policy is specified for the same resource, the maximum of the limits in those rules is applied automatically.
If a user wants to schedule an application or a workload that would violate the limits of the assigned resource quotas, the scheduling of the application or workload is rejected. This occurs when the requested resource consumption of the application or workload, added to the user's current resource consumption, exceeds the set limit.
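The admission logic described above can be sketched as follows. This is an illustration only, not the product's implementation; the numbers and variable names are invented for the example.

```shell
current_cpu_m=1500        # user's current CPU consumption, in millicpu
requested_cpu_m=600       # CPU requested by the new application/workload
limit_a=2000              # limit from one assigned quota policy
limit_b=2500              # limit from a second policy on the same resource

# When several policies limit the same resource, the maximum limit applies.
effective_limit=$(( limit_a > limit_b ? limit_a : limit_b ))

if [ $(( current_cpu_m + requested_cpu_m )) -gt "$effective_limit" ]; then
  decision="rejected"
else
  decision="scheduled"
fi
echo "$decision"    # 1500 + 600 = 2100, which is within 2500
```

Here the request is scheduled because the combined consumption stays below the effective (maximum) limit of the two policies.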
3.4.6 Working With Policies
The Policy Management application allows the tenant administrator to grant users access to resources through defined policies.
A policy is a control structure in which access rights are granted to users through the use of specific policies. A policy can use multiple attributes, such as user attributes and resource attributes. Policy evaluation is Boolean, returning either true (allow) or false (don't allow) after processing.
The SAP Data Intelligence administrator creates policies, which define what the users can view and run in the application. Each policy includes the following:
● Policy ID: Represents the unique identifier of a policy.
● Description (optional): Description of a policy.
● Exposed: Specifies whether the policy can be assigned to users or not.
● Add From Policy (optional): You can search the existing policies to use or import their resources.
The following table lists policies that are predefined for use within Policy Management.
Policy Category Policy Description Exposed

Base Policies
When creating a new user in System Management, one of these policies must be assigned to the user.
sap.dh.clusterAdmin Administrate cluster true
sap.dh.admin Administrate tenant true
sap.dh.member Member of tenant true

Additional Exposed Policies
Additional policies can be assigned to the user.
app.datahub-app-data.fullAccess Access to Metadata Explorer, including resources, activities, and dependencies true
sap.dh.developer Work with Pipeline Modeler true
app.datahub-app-data.administrator Run profiling and rulebook tasks, connect to systems, view data true
app.datahub-app-data.businessUser View data true
app.datahub-app-data.dataSteward Run profiling and rulebook tasks, create rules, rulebooks, glossary terms, and tags true
app.datahub-app-data.metadataUser View catalog, glossary, and tags true
app.datahub-app-data.publisher Create and run profiling tasks, view data true
app.datahub-app-data.preparationUser Create and run preparation tasks, view data true

Unexposed Policies
Unexposed policies cannot be assigned to a user but can be referred to when creating other policies. As such, they serve as building blocks.
sap.dh.applicationAllStart Start all applications false
sap.dh.systemMgtWrite Change operations in system management and policy management false
sap.dh.connectionsAllRead Read access to all content in a connection, including browsing connections, viewing data and metadata, and creating the replication artifacts necessary to read delta data false
sap.dh.connectionsAllWrite Write access to all connection definitions in the tenant false
app.datahub-app-data.dependencies Use in custom policies to start Metadata Explorer and its resources false
sap.dh.modelerStart Start Pipeline Modeler false
Pre-defined policies cannot be edited or deleted. When you view the policy, you see the details in read-only format.
Remember
You need to add to an existing policy or create a new policy to gain full access to the features in Metadata Explorer, such as creating a rules dashboard, when the following conditions are true:
● Your users are assigned the sap.dh.admin policy without being assigned the sap.dh.metadata policy.
● Your users have custom policies that reference sap.dh.applicationAllStart or sap.dh.metadataStart without being assigned the sap.dh.metadata policy.
To fully access the Metadata Explorer features, add the sap.dh.metadata and app.datahub-app-data.metadataUser policies, or create a custom policy that includes app.datahub-app-data.qualityDashboard.manage.
Note
sap.dh.metadata and sap.dh.metadataStart are deprecated and will be removed in future releases. Instead, you can use app.datahub-app-data.fullAccess as a close replacement.
Resource Types
Adding a resource type to a policy allows you to specify which activities a user is allowed to do within set policies. You can see the resources you’ve recently defined and also the inherited resources that come from the other policies (Add From Policy field). The only resource type currently available is "Connection".
● Connection: When a connection resource type is added to a policy, the administrator can choose whether the user can complete the actions of Read, Write, or both. The user has rights for the respective actions depending on the underlying identifier of resources.
Create a Policy [page 46] Create a policy that grants resource access to users.
Assign a Policy to a User [page 47]
Use System Management to assign previously created or predefined policies to a user.
Related Information
Create a Connection [page 62] Create a Task Workflow
3.4.6.1 Create a Policy
Create a policy that grants resource access to users.
Procedure
1. Log in to SAP Data Intelligence and choose the Policy Management application from the SAP Data Intelligence Launchpad.
You see the list of policies already in place, with their policy IDs, descriptions, and other information about each policy.
2. To add a policy, click Create.
3. Fill in the policy details for the policy you want to create.
All required fields are labeled with an asterisk (*).
Note
The policy ID can contain lowercase alphabetical characters ('a' to 'z'), underscores ( _ ), and dots ( . ). It must not contain any other symbols, and it must not start with 'sap'.
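The naming rule in the note can be mirrored by a small check. This is a hypothetical helper for illustration, not the product's actual validator:

```shell
# Returns 0 (valid) only for IDs made of a-z, underscores, and dots
# that do not start with the reserved "sap" prefix.
is_valid_policy_id() {
  case "$1" in
    sap*)        return 1 ;;   # reserved prefix
    *[!a-z_.]*)  return 1 ;;   # contains a disallowed character
    "")          return 1 ;;   # empty ID
    *)           return 0 ;;
  esac
}

is_valid_policy_id "myteam.read_only" && echo "valid"
is_valid_policy_id "sap.custom"       || echo "reserved prefix"
```

The first call succeeds; the second is rejected because the ID starts with 'sap'.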
4. Choose whether the policy is exposed or not. Only an exposed policy can be assigned to users.
5. Choose whether the policy is enabled or not. If enabled, the policy resources and those of its referenced policies are available for policy checks. If disabled, assignments of this policy are possible, but have no effect on the user's permissions.
6. (Optional) If you want to copy resources from an existing policy, search for that policy and select it.
The policy and resources are added. Select the Inherited tab to see which resources have been added.
7. Click + to add your defined resource.
Connection is the only resource type available.
Note
Each policy must have at least one resource added to it, whether from an inherited policy, or a defined resource.
8. Select an allowed action.
Only one action is allowed per resource. To define multiple actions for a policy, add additional resources.
9. Add a glob expression for the connection ID. If any connection ID should be allowed, add *.
10. Click OK.
11. Click Create.
After the policy is created, it is displayed in the policy list.
12. Click in the Action column to Edit or Delete a policy.
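The glob expression from step 9 behaves like ordinary shell pattern matching. A sketch with invented connection IDs and an invented pattern:

```shell
pattern='HANA_*'           # glob stored in the policy resource
for conn_id in HANA_PROD HANA_DEV S3_BUCKET; do
  # An unquoted variable in a case label is evaluated as a glob pattern.
  case "$conn_id" in
    $pattern) echo "$conn_id: allowed" ;;
    *)        echo "$conn_id: denied"  ;;
  esac
done
```

With this pattern, HANA_PROD and HANA_DEV match and would be accessible, while S3_BUCKET would not; a pattern of just `*` allows every connection ID.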
Task overview: Working With Policies [page 44]
Related Information
Assign a Policy to a User [page 47]
3.4.6.2 Assign a Policy to a User
Use System Management to assign previously created or predefined policies to a user.
Context
The following steps explain how to add policies to users:
Procedure
1. Log in to SAP Data Intelligence and choose the System Management application from the launchpad.
2. Click the Users tab on the top of the page.
3. On the left side of the page, select the user or cluster that you'd like to add the policy to.
4. Click the Policies tab. In the Policies tab you can see which policies the user already has assigned to them and the description for each policy. You can further view the policy and the defined resources by clicking the button and clicking Manage.
5. Click and either search for the policy, or select the policy from the list.
6. Click Assign.
A message lets you know that the policy was added successfully.
7. To remove a policy, click and select Remove.
Task overview: Working With Policies [page 44]
Related Information
Create a Policy [page 46]
3.5 Manage Files
Applications running on the System Management application server can access a shared file system. This file system is synchronized across all application instances.
Prerequisites
You are logged in as a tenant administrator or a member user.
Context
Any modifications made within the file system are visible only to the user who made them. They are present only within the user’s workspace.
Procedure
1. Open the System Management UI and choose the Files tab. By default, the System Management UI provides a tree view of files and folders. Expand the tree to navigate and open the folders and files in the directory structure.
Tip
You can also view all files in all workspaces along with all other files associated with the applications. In the File Management toolbar, choose Union View.
2. Choose the appropriate option:
Option Description
Create a folder
If you want to create a folder to group files in the user workspace:
1. In the user workspace, select where you want to create the folder and in the toolbar, choose the (Create new file or folder) menu.
2. Choose the Create Folder menu option.
3. Enter the name of the new folder and confirm by choosing Create.
Tip You can create a subfolder using a forward slash (/) followed by the name of the folder.
Create a file
1. In the user workspace toolbar, choose the (Create new file or folder) menu.
2. Choose the Create File menu option.
3. Enter the name of the new file.
4. Choose Create.
Delete a file or folder
1. In the user directory structure, select the file or the folder to be deleted and choose the icon.
2. Confirm by choosing Delete.
Remember When you delete a file or folder that is shared with other users, the application deletes the file only for you by creating a whiteout file (for the deleted file) in your workspace. Other users can continue to access the file. If you want to restore the file or folder, delete the whiteout file in your workspace.
Import a file or solution
You can import a file or import a solution as a file. A solution file is a packaged application compressed into a TAR.GZ file.
1. In the user workspace, choose the Import file or solution icon. The dropdown menu allows you to choose between Import File and Import Solution File.
2. Browse to find the file to be uploaded.
3. Select the file and choose Open.
Note When you import a solution, in the respective space, the /files/manifest.json file is created/updated with the manifest of the imported solution.
Import a solution from solution repository
You can import a solution from the solution repository and modify it as per your requirement.
1. In the user workspace, or union view, choose the Import file or solution icon.
2. In the dropdown, choose Import Solution from solution repository.
3. Select a solution from the existing solutions in the repository.
4. Choose Import Solution.
Note
○ If there is a conflict with the file or solution being imported, you can choose to either replace the conflicting files or retain the existing files. Currently, conflict resolution is supported only in the user and union workspaces.
○ When you import a solution, in the respective space, the /files/manifest.json file is created/updated with the manifest of the imported solution.
Export a file or folder
You can export a file or folder to your local system.
1. In the user workspace, select one or more files or folders.
2. In the toolbar, choose (Export files or solution).
3. Choose Export files.
4. Provide a file name and click Export Files.
Note If there is a conflict with the file or solution being imported, you can choose to either replace the conflicting files or retain the existing files. Currently conflict resolution is supported only in user and union workspace.
Export a file or folder as a solution
You can export a file or folder as a solution (a ZIP file) to your local system.
1. In the user workspace, select one or more files or folders.
2. In the toolbar, choose (Export files or solution).
3. Choose Export as Solution.
4. Update the VSolution JSON with the required name, version, and other details and choose Export as Solution.
Export a solution to solution repository
You can export a solution to the solution repository.
1. In your workspace or union view, select a solution.
2. In the toolbar, choose the icon.
3. In the dropdown, choose Export as solution to solution repository.
4. Update the VSolution JSON with the required name, version, and other details and choose Export as Solution.
Rename a file or folder
1. Select the file or folder to be renamed and choose the icon.
2. Enter the new file name or folder name and confirm by choosing Rename.
Copy path of file or folder
You can copy the absolute path of a file or folder to the clipboard.
1. Click the (Show actions) icon of a file or folder in the workspace.
2. Choose Copy Path.
Related Information
Sharing Files Using Solution Repository [page 51]
3.5.1 Sharing Files Using Solution Repository
With the deprecation of the tenant layer, the solution repository allows you to share files among other users.
Context
You can create and modify files in the System Management application, upload them to the solution repository, and fetch existing file solutions from there. The overall workflow is as follows:
Procedure
1. In the System Management application, work on your files in the Files tab.
2. Select one or more files and choose Export as solution to solution repository.
3. Fill in the manifest definition and click Export as Solution. A new solution with the selected files is created in the repository.
4. To import files of an existing solution to your workspace, select Import solution from solution repository, and choose a solution from the list. All solution files are extracted to your current workspace.
5. If conflicts occur while importing files, you can select which files to keep.
You can also share files among other users through the following vctl commands:
○ vctl vrep [space] import-solution [name] [version] [destination]: Imports a solution from the repository of space to the specified destination path. The name and version parameters identify the exact solution. The -r flag can be used to set the conflict resolution mode, that is, to handle file conflicts during the operation.
○ vctl vrep [space] export-solution [name] [version] [source...]: Creates a solution with the files of space described by source and uploads it to the repository. The name and version parameters identify the exact solution. If needed, dependencies can be passed with the -d flag.
3.6 Manage Strategies
As a tenant administrator, you can extend your strategies by importing your own solutions.
Prerequisites
To manage strategies, you must be a tenant administrator.
Administration Guide Using SAP Data Intelligence System Management PUBLIC 51 Context
The SAP Data Intelligence default tenant is assigned a default strategy, which is inherited from the parent strategy available in the SAP Data Intelligence installation package. You can extend this strategy by importing your own solutions.
Feature Description
Solutions A solution file is a packaged application compressed into a TAR.GZ file.
Strategies Strategies can reference a parent strategy. This means that all solutions included in the parent strategy are automatically available to the strategies that reference the parent strategy. If the parent strategy is updated, then all the strategies derived from the parent strategy are also updated.
Tenants Tenant administrators can update the solutions in the strategy with restrictions. They can perform only the following actions:
● Reorder the solutions in the strategy. ● Add or remove solutions in the strategy.
Restriction
They cannot modify or update the solutions of the parent strategy.
● They can change the inherited parent strategy and reference a different parent strategy.
Procedure
1. Open the System Management UI and choose the Cluster tab. If you are a tenant administrator, then choose the Tenant tab.
2. Choose the appropriate option:
Option Description
Create a solution
1. Choose the Solutions tab.
2. In the Solutions panel, choose the icon.
3. Browse to select and upload the solution file.
4. Choose Create Solution.
Update a strategy
1. Choose the Strategy tab.
2. Select the strategy.
3. Click the (Edit) icon.
4. Choose the (Add Solutions) icon.
5. Select the required solutions.
6. Choose Add.
Update the default strategy
Upgrading the SAP Data Intelligence instance does not affect the strategies assigned to the tenant. However, after the upgrade is complete, you can consume the new content.
1. Choose the Strategies tab.
2. Select the strategy.
3. Click the (Edit) icon.
4. From the Parent Strategy dropdown list, select the new default strategy for your tenant.
5. Choose Save.
6. Activate the changes by restarting the application.
Related Information
Strategies [page 53]
3.6.1 Strategies
Strategies are cluster-scoped entities used to administer the applications and content available to the tenants.
Strategies work similarly to the layering concept of Docker images, producing a filesystem snapshot in which the content of the solutions is layered on top of each other. When a path collision occurs between the content of the included solutions, the resulting view's contents are determined by the ordering of the solutions (the topmost prevails).
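The layering behavior above can be demonstrated with a toy sketch that uses plain directories in place of real solutions (the directory names are invented): layers are applied bottom-up, and on a path collision the topmost layer's file wins.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/sol_base" "$workdir/sol_top" "$workdir/merged"
echo "base content" > "$workdir/sol_base/config.txt"
echo "top content"  > "$workdir/sol_top/config.txt"

# Copy lower layers first, then higher ones, so the topmost overwrites
# colliding paths -- analogous to the strategy's solution ordering.
cp -R "$workdir/sol_base/." "$workdir/merged/"
cp -R "$workdir/sol_top/."  "$workdir/merged/"

cat "$workdir/merged/config.txt"   # prints: top content
```

Reordering the two copy steps would flip the result, which is why reordering solutions in a strategy changes the resulting view.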
Strategies are extended by referencing another strategy as a parent strategy, thus inheriting the solutions of its parent. Strategies are divided into the following categories:
● Base strategies: self-contained, no references to other strategies. ● Extension strategies: strategies inheriting from a base strategy.
Depending on the type of strategy assigned to a tenant, the tenant administrator has limited permissions for operations on strategies. The following table describes the operations available to a tenant administrator:
Operation Base Strategy Extension Strategy
Create No No
Delete No No
Get If the base strategy is the base of this tenant's strategy If the extension strategy is assigned to this tenant
List If the base strategy is the base of this tenant's strategy If the extension strategy is assigned to this tenant
Modify (Add/Remove/Reorder Solutions) No Tenant administrator can modify an extension strategy if:
● the extension strategy is assigned to their tenant
● the extension strategy is not assigned to other tenants
Set-Parent No No
Also, tenant administrators cannot remove the parent reference.
If you are a tenant administrator and are updating an extension strategy, it is advised that you include the following set of essential solutions.
● vsolution_vsystem_ui-
3.7 Configuring Client Certificate Authentication for NGINX in SAP Data Intelligence
Prerequisites
● The Kubernetes configuration file for the SAP Data Intelligence cluster, with read/write access to secrets.
● NGINX ingress installed in your cluster.
Context
To use the client certificates feature with web access, follow the procedure below. The ingress handles the TLS authentication according to a commonly employed concept called TLS offloading. Therefore, the secrets vora.conf.secop.client.ca.truststore (used for client authentication) and vora.conf.secop.ingress.client.truststore (used for ingress authentication) must exist in the cluster; they should be created and labeled automatically at installation time. To verify their existence:
kubectl get secrets | grep truststore
To enable user client certificates in NGINX you must set annotations in the ingress resource.
Procedure
1. Locate the file ingress-resource.yaml. You can also extract it with kubectl get ing

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-resource
  annotations:
    nginx.ingress.kubernetes.io/auth-tls-secret:
...
3. Apply the resource through:
export NAMESPACE=
kubectl -n $NAMESPACE apply -f ingress-resource.yaml
4. To ensure that the ingress is configured correctly, verify that the ingress is accessible:
$ ip="$(kubectl get svc nginx-ingress-ingress-nginx-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}')" $ curl -sk "https://${ip}:443"
5. Log in with a client certificate. For more information, see vctl login.
Related Information
Manage Tenant Certificates [page 56] Manage User Certificates [page 56]
Troubleshooting
● Check the logs of the nginx ingress controller (increase verbosity with --v=
openssl s_client -state -debug -status -connect ${ip}:443
3.7.1 Manage Tenant Certificates
In the System Management application, each tenant uses an intermediate Certificate Authority (CA) to allow for individual certificate rotation and renewal. If the private key is compromised, a tenant or cluster administrator can trigger a forced tenant certificate rotation. This operation revokes the old tenant certificate and replaces it with a new one. As a result, all client certificates on this tenant are invalidated. The old tenant certificate also becomes invalid and unusable, even if someone holds its private key. After the rotation, certificates can be signed and used as usual again.
Rotate Tenant Certificates
Prerequisite: To rotate tenant certificates, you must be a cluster or tenant administrator.
In the event of a known key compromise, or merely on suspicion, a cluster or tenant administrator can rotate the certificate for a given tenant with the System Management Command-Line Client (vctl):
vctl tenant certificate rotate
Note
If you are logged in with a certificate in a tenant you are about to rotate, you will be logged out after the cache duration (default: 30s), since your certificate is invalidated. This happens because the certificate tree is cut off at the tenant CA. In this case, ensure that backup authentication mechanisms (for example, passwords) are in place.
3.7.2 Manage User Certificates
A System Management user can use client certificates for authentication instead of basic password-based authentication.
Currently, interaction with System Management Command-Line Client (vctl) is supported.
Prerequisite: To generate and manage certificates for other users, you must be a cluster or tenant administrator.
Generate Certificates
When logged in previously, a user can use the following command to issue a certificate signing request (CSR) for the specified username to System Management, which will return the signed certificate in response:
vctl user certificate generate [-t|--tenant
By default, the files will be written to the ~/.vsystem directory. To write the output to another directory use the -o flag.
Note
The files are named "bundle.pem" and "key.pem", as the bundle also contains the respective tenant CA appended (certificate chain).
Tip
In general, it is good practice to keep certificates of other users outside of the ~/.vsystem directory (where you should keep your own), since overwritten certificates are used immediately and can lead to unwanted impersonation.
User Allowed Actions
Tenant Administrator The tenant administrator can do the following within their tenant: - Generate, list, and revoke other certificates for their tenant.
Cluster Administrator The cluster administrator can do the following within the cluster: - Generate, list, and revoke other certificates for every tenant.
Note
Each user can hold only a configurable number (default: 10) of valid certificates (neither expired nor revoked). Once you reach this limit, you have to revoke at least one certificate to generate a new one.
Revoke Certificates
If a certificate's key is known to be compromised, or the previously mentioned limit is reached, certificates can be revoked using vctl. If no arguments except
vctl user certificate revoke
Alternatively, you can pass a specific serial number with the -s flag, which can be obtained from the bundle.pem, by inspecting it with openssl. In case a user is deleted, the corresponding certificates are automatically revoked and detached from the user's username.
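As a sketch of reading a serial number with openssl, the following generates a throwaway self-signed certificate to inspect; in practice you would inspect the bundle.pem issued by System Management, and the file names here are illustrative:

```shell
# Create a disposable self-signed certificate (demo only).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=demo-user" -keyout demo-key.pem -out demo-bundle.pem 2>/dev/null

# Print the certificate's serial number; this value can be passed
# to the revoke command's -s flag.
openssl x509 -in demo-bundle.pem -noout -serial
```

The second command prints a line of the form serial=..., whose hexadecimal value identifies the certificate to revoke.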
List Certificates
To list all the user certificates with the System Management Command-Line Client (vctl), use the following command:
vctl user certificate list
Providing the --all flag will list all certificates with one of the following statuses: [valid, revoked, expired]. Otherwise, only certificates with status valid will be listed. The access policy follows the same pattern as vctl user certificate generate above, that is, tenant administrators have full privileges within their tenant and cluster administrators have additional cross-tenant privileges.
4 Using SAP Data Intelligence Connection Management
SAP Data Intelligence administrators or other business users with necessary privileges can use the SAP Data Intelligence Connection Management application to create and maintain connections. A connection represents an access point to a remote system or a remote data source.
The table lists the various actions that you can perform in the Connection Management application.
Actions Description
Create Connections To create connections, launch the Connection Management application and in the Connections tab, choose Create. For more information, see Create a Connection [page 62].
Filter Connections Filter connections based on tags and connection types. A tag is an attribute that you can use for grouping and filtering connections. Each connection type is associated with a fixed set of predefined tags (storage, application, http, and db).
To filter the connections, follow these steps:
1. In the Connections tab, choose the Filter button. 2. Define the filter conditions and choose Apply.
Edit a Connection To edit an existing connection, follow these steps:
1. In the Connections tab, select the required connection.
2. In the Actions column for the selected connection, select the overflow menu icon and choose Edit.
3. In the Edit Connection pane, edit the connection values and choose Save.
Note
You cannot change the connection ID or the connection type. If you do not have sufficient permissions to edit a connection, you will not see the Save button on the Edit Connection pane.
Delete a Connection To delete an existing connection, follow these steps:
1. In the Connections tab, select the required connection.
2. In the Actions column for the selected connection, select the overflow menu icon and choose Delete.
3. Confirm the delete operation.
View Connection Status To view a connection status, follow these steps:
1. In the Connections tab, select the required connection. 2. In the Actions column, click the overflow menu icon and choose Check Status.
The status check performs a type-specific reachability test of the remote system referred to in the connection definition. The possible connection statuses are OK, ERROR, and UNKNOWN. The application displays the status in the Connection Status dialog box.
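Conceptually, the status check is a type-specific reachability probe of the remote endpoint named in the connection definition. For a TCP-based connection type, the idea can be sketched as follows (the function and the reduction to a plain TCP connect are illustrative; the application's actual checks are type-specific, and the UNKNOWN status it can report is not modeled here):

```python
import socket

def check_status(host, port, timeout=3.0):
    """Return "OK" if a TCP connection to host:port succeeds, "ERROR" otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "OK"
    except OSError:
        return "ERROR"

# A port where nothing is listening reports ERROR.
print(check_status("localhost", 1))  # → ERROR
```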
Import or export connections You can export a connection as a JSON schema or import a JSON schema that defines a connection to create a connection in SAP Data Intelligence.
To export one or more connections, do the following:
1. Choose the Connections tab.
2. In the menu bar, choose the Export icon.
3. In the Export Connections dialog box, select the IDs of the connections whose JSON schemas you want to export to your local system.
4. Choose Export.
To import a connection, follow these steps:
1. Choose the Connections tab.
2. In the menu bar, choose the Import icon.
3. Browse and select the JSON schema of the connection that you want to import.
Note
The username and password details are not imported. You must provide this information manually after the import to create the connection.
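The behavior described in the note — connection definitions travel without their credentials — can be illustrated with a small sketch. The field names user and password and the connection dictionary are hypothetical; the actual export format is not documented here:

```python
import json

# Illustrative set of credential fields that must not leave the system.
SENSITIVE_FIELDS = {"user", "password"}

def export_connection(conn):
    """Serialize a connection definition, dropping credential fields."""
    public = {k: v for k, v in conn.items() if k not in SENSITIVE_FIELDS}
    return json.dumps(public, indent=2)

conn = {"id": "MY_HANA", "type": "HANA_DB", "user": "alice", "password": "secret"}
exported = export_connection(conn)
print(exported)  # credentials are absent; they must be re-entered on import
```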
Filter Connection Types Filter connection types based on tags and capabilities. Each connection type is associated with a fixed set of predefined tags and capabilities.
To filter the connection types, follow these steps:
1. In the Connection Types tab, choose the Filter button.
2. Define the filter conditions and choose Apply.
Note
Internal connection types: In the Connection Types tab, some connection types are marked with an icon that identifies them as internal connection types used by the Connection Management application. By default, these connection types are disabled, which means that you cannot create or update a connection with a disabled connection type. In the Connection Types tab, click the icon to enable a connection type so that you can create connections with it.
Related Information
Log in to SAP Data Intelligence Connection Management [page 61]
Create a Connection [page 62]
Manage Certificates [page 63]
Using SAP Cloud Connector Gateway [page 125]
(Mandatory) Configure Authorizations for Supported Connection Types [page 126]
Allowing SAP Data Intelligence Access Through Firewalls [page 127]
4.1 Log in to SAP Data Intelligence Connection Management
You can access the SAP Data Intelligence Connection Management application from the SAP Data Intelligence Launchpad or directly launch the application with a stable URL.
Procedure
1. Launch the SAP Data Intelligence Launchpad user interface in a browser using one of the preferred URLs:
○ For on-premise installations (if TLS is enabled): https://<host>:<port>
○ For cloud installations: https://<ingress-host>
where <host>, <port>, and <ingress-host> identify the cluster's ingress endpoint. To determine the ingress host, run:
kubectl get ingress -n $NAMESPACE
The welcome screen appears where you can enter the login credentials.
2. Log in to SAP Data Intelligence using the following credentials:
○ Tenant name ○ Your username ○ Your password
The SAP Data Intelligence Launchpad opens and displays the initial home page. User details and the tenant name are displayed in the upper-right area of the screen. The home page displays all applications available in the tenant.
3. On the home page, choose Connection Management. The application UI opens, displaying the initial screen.
4.2 Create a Connection
Create a connection in SAP Data Intelligence, which represents an access point to a remote system or a remote data source.
Context
Any user with member privileges can create connections and view, edit, or delete their own connections. Connections are private: they are visible only to their owner and to the tenant administrator, not to other member users. To grant access to another user, the tenant administrator must assign a corresponding policy to that user.
You create connections in the SAP Data Intelligence Connection Management application. The application associates each connection with a unique ID and a type that specifies the nature of the access point. In addition to the mandatory ID attribute, every connection can have an optional description. After creating a connection, you can use the connection ID in the SAP Data Intelligence Modeler to reference an external resource.
Procedure
1. Start the SAP Data Intelligence Connection Management application.
2. In the editor toolbar, choose Create to add a connection.
3. In the Create Connection page, enter the connection ID in the ID textbox.
4. In the Connection Type dropdown list, select the required connection type.
The connection type determines the rest of the connection information that you must provide.
Note
For more information, see the topic Supported Connection Types.
As a tenant administrator or pipeline modeler, explicitly enable the app.datahub-app-core.connectionCredentialsUnmasked policy to see the usernames of connections, both to troubleshoot connectivity issues and to know with which credentials you access the data sources.
This policy gives you full visibility to selected connection fields, such as user names.
5. Enter the remainder of the connection information.
6. (Optional) Add custom tags. You can associate a connection with one or more user-defined tags. Tags can help you, for example, to filter connections. After creating a tag, you can reuse it in other connection definitions.
a. In the Tag text field, enter a tag and press ENTER to define another tag.
For example, you can associate the connection with tags such as db, storage, application, and so on. Later, you can use these tags to filter connections in the application UI.
b. If you have already associated connections with tags and want to reuse an existing tag, select it in the Tag dropdown list and press ENTER.
7. (Optional) Test the connection. Before creating a connection to the target system, you can test it based on the connection details. Similarly, if you are editing an existing connection, you can test it before saving.
a. After providing all the necessary connection details, in the bottom right corner, choose Test Connection. If the application can successfully create a connection to the target system with the provided details, it displays the connection status as OK. Otherwise, it displays an error message. If there are errors, verify the connection details that you have provided and test again.
8. Choose Create.
9. (Optional) If you want to change the connection settings, choose Edit.
4.3 Manage Certificates
Use the SAP Data Intelligence Connection Management application to manage certificates for remote systems.
Context
The Connection Management application provides capabilities to import certificates for connections to remote systems. For operators in the SAP Data Intelligence Modeler that use HTTPS as the underlying transport protocol (TLS transport encryption), the certificate of the upstream system must be trusted. To import a certificate into the trust chain, obtain the certificate from the target endpoint and import it using the Connection Management application.
Procedure
1. Start the SAP Data Intelligence Connection Management application.
2. On the home page, choose the Certificates tab.
3. In the menu bar, choose the Import icon to import a certificate.
4. Browse and select your certificate.
The application displays the list of imported certificates under the Certificates tab. The certificates are imported to the ca folder in the repository.
5. (Optional) To delete a certificate, hover over the certificate and choose the delete icon.
Remember
You can’t delete the preinstalled certificates.
Next Steps
You can also import certificates to the ca folder using the SAP Data Intelligence System Management application. If you have imported certificates using the System Management application, then in the Certificates tab, click the refresh icon to update the list of certificates displayed.
4.4 Supported Connection Types
SAP Data Intelligence delivers a set of predefined connection types. Each connection type represents a specific category of remote resource.
The table lists the supported connection types in SAP Data Intelligence.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
Supported Connection Types
Connection Type Connection
ADL [page 69] Microsoft Azure Data Lake Store (ADL)
BW [page 73] SAP Business Warehouse
GCS [page 81] Google Cloud Storage (GCS)
HANA_DB [page 86] SAP HANA Database
HANA_XS [page 88] SAP HANA Database
HTTP [page 93] HTTP or HTTPS server connection
S3 [page 113] Amazon Simple Storage Service (Amazon S3)
VORA [page 123] SAP Vora
WASB [page 124] Microsoft Windows Azure Storage Blob (WASB)
HDFS [page 88] Hadoop Distributed File System (HDFS)
GCP_DATAPROC [page 84] Google Cloud Dataproc cluster
GCP_PUBSUB [page 85] Google publish/subscribe service
SMTP [page 119] SMTP server
IMAP [page 94] IMAP server
RSERVE [page 112] RServe server
CPI [page 77] SAP BTP Integration system
DATASERVICES [page 78] SAP Data Services connection
ORACLE [page 105] Oracle database
ODATA [page 101] OData RESTful APIs
MSSQL [page 97] Microsoft SQL Server database
MYSQL [page 98] MySQL database
DB2 [page 79] IBM DB2 database
GCP_BIGQUERY [page 82] Google Cloud BigQuery
ABAP [page 66] SAP ABAP
AWS_SNS [page 71] Amazon Simple Notification Service (Amazon SNS)
MLCLUSTER [page 97] Machine Learning cluster
KAFKA [page 95] Apache Kafka cluster
CLOUD_DATA_INTEGRATION [page 74] Cloud Data Integration
OPEN_CONNECTORS [page 102] SAP BTP Open Connectors
SDL [page 116] Internal data lake
OPENAPI [page 104] OpenAPI
OSS [page 108] Alibaba Cloud Object Storage Service
SAP_IQ [page 114] SAP IQ Databases
AZURE_SQL_DB [page 72] Microsoft Azure Cloud SQL Database
SFTP [page 117] SSH File Transfer Protocol
ADL_V2 [page 70] Microsoft Azure Data Lake Store (ADL)
REDSHIFT [page 109] Amazon Redshift Database
ABAP LEGACY [page 68] ABAP LEGACY
HDL_DB [page 90] SAP HANA Data Lake Database
INFORMATION_STEWARD [page 94] SAP Information Steward administration server
Related Information
Changing Data Capture (CDC)
4.4.1 ABAP
Provides connection and access information to objects (tables, DDIC views, and CDS views) in an SAP ABAP system (S/4, ECC, or R/3).
The ABAP connection type supports the following protocols:
● RFC ● WebSocket RFC
For more information, see SAP Note 2835207 .
Note
When creating an ABAP connection using the Connection Management application, save the connection before you perform a connection check. The application executes the check based on the connection information that you provided and saved.
Operations
The ABAP connection type supports the following operations:
● Metadata Access ● Metadata Extraction ● ABAP Operator Execution
Attributes
Attribute Description
Protocol WebSocket RFC or RFC.
System ID System ID
Client Client of the ABAP system to use for logon.
Hostname Hostname of WebSocket RFC connection endpoint.
Portnumber Port number of the WebSocket RFC connection endpoint. The default port number is 443.
Application Server Hostname or IP address of the application server for RFC connection.
System Number Two-digit system number for RFC connection.
Gateway Host Hostname of the Gateway server for RFC connection
SAP Router SAP router configuration string for RFC connection.
User Username that the application must use to authenticate.
Password Password that the application must use to authenticate.
Language Two-letter ISO language code. If not set, the application uses the default logon language of the ABAP system.
Connection Type With Load Balancing: Use the ABAP connection against the message server of the ABAP source system.
Without Load Balancing: Use the ABAP connection against a specific application server of the ABAP source system.
Message Server Host Hostname or IP address of the message server for RFC connection.
Message Server Port Port number of the message server for RFC connection.
Logon Group Logon group (default: SPACE).
Enable SNC Switch Secure Network Communication (SNC) on or off. If you enable SNC, it is used to establish the RFC connection.
SNC Partner Name Only relevant if SNC is switched on: SNC name of the communication partner.
Quality of Protection Only relevant if SNC is switched on: level of protection, which can take the following values:
● 1 (apply authentication only)
● 2 (apply integrity protection - includes authentication)
● 3 (apply privacy protection - includes integrity protection and authentication)
● 8 (apply the default protection)
● 9 (apply the maximum protection)
Because connection information is persisted in the connection cache, changes to a connection might not take effect immediately.
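The caching behavior described above — edits may not be visible right away because connection details are served from a cache — follows the usual time-to-live pattern. A generic sketch under that assumption (the class, TTL value, and backend dictionary are illustrative, not SAP's implementation):

```python
import time

class TTLCache:
    """Serve a cached value until its time-to-live expires."""
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, stored_at)

    def get(self, key, load):
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]           # served from cache; may be stale
        value = load(key)
        self._store[key] = (value, now)
        return value

# Simulated clock so the expiry is deterministic.
t = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: t[0])
backend = {"ABAP_PROD": {"host": "old-host"}}

first = cache.get("ABAP_PROD", backend.get)   # loads from the backend
backend["ABAP_PROD"] = {"host": "new-host"}   # the connection is edited
stale = cache.get("ABAP_PROD", backend.get)   # still the old value
t[0] = 61.0                                   # TTL elapsed
fresh = cache.get("ABAP_PROD", backend.get)   # change now visible
print(stale["host"], fresh["host"])  # → old-host new-host
```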
Note
For information about supported versions, see SAP Note 2693555 .
4.4.2 ABAP LEGACY
Provides connection and access information to ODP objects in an SAP ABAP system where DMIS installation is not possible.
Note
ABAP LEGACY does not provide all the features that the ABAP connection provides. For the ABAP connection, see SAP Note 2835207.
● The ABAP LEGACY connection type supports only the RFC protocol.
● The ABAP LEGACY connection doesn't support metadata extraction and lineage via the Metadata Explorer.
● The ABAP LEGACY connection can only browse and preview ODP objects, and it does not support ABAP TABLES.
● The ABAP LEGACY connection is supported in the SAP Application Consumer operator and the Flowagent ABAP ODP Consumer operator.
On-Premise Connectivity via Cloud Connector
For on-premise connectivity via Cloud Connector (for a DI cloud installation), navigate to Cloud Connector Administration > Cloud To On-Premise > Access Control.
Either add the prefixes /SAPDS/, BAPI, RODPS, and RFC to the allowlist, or add the following exact functions to the allowlist:
● /SAPDS/EXTRACTOR_NAVIGATE
● /SAPDS/GET_VERSION
● /SAPDS/MODEL_NAVIGATE
● BAPI_USER_GET_DETAIL
● RFC_FUNCTION_SEARCH
● RODPS_REPL_CONTEXT_GET_LIST
● RODPS_REPL_ODP_OPEN
● RODPS_REPL_ODP_FETCH
● RODPS_REPL_ODP_GET_DETAIL
● RODPS_REPL_ODP_GET_LIST
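With prefix-based allowlisting, Cloud Connector accepts any function whose name starts with one of the listed prefixes. The matching can be sketched as follows (illustrative only; the actual check is performed by Cloud Connector itself):

```python
# Prefixes from the Access Control allowlist described above.
ALLOWED_PREFIXES = ("/SAPDS/", "BAPI", "RODPS", "RFC")

def is_allowlisted(function_name):
    """True if the RFC function name matches one of the allowlisted prefixes."""
    return function_name.startswith(ALLOWED_PREFIXES)

print(is_allowlisted("RODPS_REPL_ODP_OPEN"))  # → True
print(is_allowlisted("STFC_CONNECTION"))      # → False
```

Note that the prefix form is broader than the exact-function list: for example, every BAPI_* function passes, not just BAPI_USER_GET_DETAIL.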
Features
The ABAP LEGACY connection allows you to:
● View connection status in Connection Management
● Browse remote objects in Metadata Explorer
● View fact sheets of remote objects in Metadata Explorer
● Preview content of remote objects in Metadata Explorer
● Read tables and views in Modeler
Attributes
Attribute Description
Protocol RFC only (for other types, use ABAP connection).
System ID System ID
Application Server Hostname or IP address of the application server for RFC connection.
Client Client of the ABAP system to use for logon.
Instance Number Two-digit instance number.
Gateway Host Hostname of the Gateway server for RFC connection.
SAP Router SAP router configuration string for RFC connection.
User Username that the application must use to authenticate.
Password Password that the application must use to authenticate.
Language Two-letter ISO language code. If not set, the application uses the default logon language of the ABAP system.
Connection Type Indicates whether the connection has load balancing.
Message Server Host Hostname or IP address of the message server for the RFC connection.
Message Server Port Port number of the message server for the RFC connection.
Logon Group Logon group (default: SPACE).
4.4.3 ADL
Provides connection and access information to objects in Microsoft Azure Data Lake (ADL).
Operation
The ADL connection type allows:
● Browsing folders and files in the Azure Data Lake Storage server
● Obtaining file metadata
● Profiling data
● Previewing data
● Performing flowgraph tasks with Azure Data Lake files as the source and/or target
You can create the ADL connection and use the SAP Data Intelligence Modeler to:
● Read and write files using the Read File and Write File operators.
● Rename and remove files using the Move File and Remove File operators.
Attributes
Attribute Description
Account name Name of the Azure Data Lake Storage Gen1 account.
Tenant ID ID of the Azure Data Lake Storage Gen1 tenant.
Client ID The client ID (also referred to as Application ID).
Client Key The client key (also referred to as Client Secret or Authentication Key).
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.4 ADL_V2
Provides connection and access information to objects in Microsoft Azure Storage Gen2.
Operation
The ADL_V2 connection type allows you to:
● Browse folders and files in Azure Data Lake Storage Gen2
● Obtain file metadata
● Profile data
● Preview data
● Perform flowgraph tasks with Azure Data Lake Storage Gen2 files as the source and/or target
You can create the ADL_V2 connection and use the SAP Data Intelligence Modeler to:
● Read and write files in byte chunks via the Read File and Write File operators.
● Read and write CSV, Parquet, and ORC files using the Structured File Consumer and Structured File Producer operators.
● Rename and remove files via the Move File and Remove File operators, respectively.
Attributes
Attribute Description
Authorization Method The authorization method to be used. Currently "shared_key" and "shared_access_signature" are supported.
Note
● If you selected shared_key, provide a value for the Account Key attribute.
● If you selected shared_access_signature, provide a value for the SAS Token attribute.
Account Name The account name used in the shared key authorization.
Account Key The account key used in the shared key authorization.
SAS Token The SAS token used in SAS token authentication.
Endpoint Suffix The endpoint suffix. If none is set, the default value "core.windows.net" is used.
Root Path The root path name for browsing. It starts with a slash and the file system name, for example, /MyFileSystem/MyFolder. The file system must be provided. Any path used with this connection is prefixed with this root path.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
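The Root Path semantics described above — every path used with the connection is prefixed with the configured root — can be sketched as follows (the helper is illustrative; it only mirrors the documented prefixing behavior):

```python
def resolve_path(root_path, path):
    """Prefix `path` with the connection's root path, normalizing slashes."""
    if not root_path:
        return path
    return root_path.rstrip("/") + "/" + path.lstrip("/")

print(resolve_path("/MyFileSystem/MyFolder", "/data/in.csv"))
# → /MyFileSystem/MyFolder/data/in.csv
```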
4.4.5 AWS_SNS
Provides connection to the Amazon Simple Notification Service. Amazon SNS is a managed pub/sub service.
Operation
The AWS_SNS connection type allows:
● Creating a topic in the AWS SNS service.
● Subscribing to a topic on the AWS SNS service.
You can create the AWS_SNS connection and use the SAP Data Intelligence Modeler to:
● Send messages to topics using the AWS SNS Producer operator.
● Receive messages from topics using the AWS SNS Consumer operator.
Attributes
Attribute Description
AWS Account ID The account ID assigned to the IAM user that owns the SNS resources.
AWS Access Key The ID of the key that can be used to access the AWS SNS API (in combination with the secret key). The key must have permissions to use the SNS service.
AWS Secret Key The key that can be used to access the AWS SNS API (in combination with the key ID). The key must have permissions to use the SNS service.
Region The region where the SNS topics and subscriptions are stored or looked up.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.6 AZURE_SQL_DB
Provides connection and access information to Microsoft Azure Cloud SQL Database.
Operation
You can create the AZURE_SQL_DB connection and use the SAP Data Intelligence Modeler to:
● Ingest tables and views.
● Ingest data from SQL queries.
● Execute native SQL DDL/DML statements.
Attributes
Attribute Description
Host Host name of the Azure server.
Port Port number of the Azure server.
Validate host certificate Validate the server certificate.
Hostname in certificate Validate if hostname is in certificate.
Database name Database to connect to.
User Azure user with privileges to connect to the database.
Password Password of the Azure user.
Additional session parameters Allows you to set session-specific variables.
The network for Azure SQL DB instances is protected by a firewall that controls incoming traffic.
For SAP Data Intelligence, cloud edition, Connection Management exposes an IP address via a read-only connection called INFO_NAT_GATEWAY_IP. Add this IP address to the allowlist in the Azure dashboard. For SAP Data Intelligence, on-premise edition, as a Kubernetes administrator, obtain the public IP addresses of all the nodes and add them to the allowlist in the Azure dashboard.
Follow the steps below to add the IP addresses to the allowlist:
1. Access your Azure SQL Server.
2. Within Settings, select SQL databases, and then select the database you wish to connect to.
3. Click Set server firewall, and then click + Add client IP.
4. Specify a rule name and the IP range of the Kubernetes nodes.
Note
For information about supported versions, see SAP Note 2693555 .
4.4.7 BW
Provides connection and access information to BW systems (BW, BW on HANA, and BW/4HANA).
Operation
The BW connection type allows:
● Browsing and previewing InfoProviders and BW Queries as datasets in the Metadata Explorer.
● Executing process chains in the SAP Data Intelligence Modeler using the BW Process Chain operator.
● Retrieving data from Queries and InfoProviders and writing them to SAP Vora or other supported cloud storages using the Data Transfer operator.
Attributes
Attribute Description
Host Hostname of the ABAP web server without specifying the protocol (HTTP or HTTPS).
Tip
To identify the host name, log in to the BW system with SAP GUI and execute the transaction INA Testmonitor using the code /nrsbitt. In the subsequent screen, switch the protocol to see the host and port details to be used.
Port Port of the ABAP web server
Protocol Protocol (HTTP or HTTPS). The default value is HTTPS.
Client Client of the BW system to use for login. If you do not provide a value, the application uses the default client of the BW system.
User Username in the BW (ABAP) system
Password Password in the BW (ABAP) system
HANA DB Connection ID ID of the HANA_DB connection to the underlying database if it is a BW on HANA or BW/4 HANA system.
The application requires this ID to optimize the execution of the Data Transfer operator by routing it via SAP HANA. To achieve this optimization, create another connection of type HANA_DB in the connection management application and use the connection ID in this attribute.
Language Two-letter ISO language code. If not set, the application uses the default login language of the BW (ABAP) system.
Note
For information about supported versions, see SAP Note 2693555 .
4.4.8 CLOUD_DATA_INTEGRATION
Provides connection and access information to systems that provide OData-based APIs for data integration with SAP Data Intelligence and SAP BW/4HANA.
Here are some of the SAP cloud applications that provide such APIs:
● Fieldglass
● SAP C/4HANA Sales Cloud and SAP C/4HANA Service Cloud
● S/4HANA Cloud
Features
● Browsing datasets from SAP Cloud Data Integration in the Metadata Explorer.
● Previewing datasets in the Metadata Explorer.
● Indexing datasets in the Metadata Explorer.
You can create the CLOUD_DATA_INTEGRATION connection to read data in the SAP Data Intelligence Modeler.
Attributes
Attribute Description
Host Host for accessing the cloud OData service
Port Port for accessing the cloud OData service
Protocol Protocol (HTTP or HTTPS). The default value is HTTPS.
Service Path Relative path (without host and port) to the Cloud Data Inte gration service endpoint. The value must start with a forward slash ( / ).
Authentication The authentication method that the application must use. You can select NoAuth for no authentication, Basic to authenticate with user/password details, or OAuth2.
If you selected Basic, provide values for the following attributes:
● User: Username that the application must use for authentication.
● Password: Password for authentication.
If you selected OAuth2, provide values for the following attributes:
● OAuth 2 Grant Type: The type of grant. It can be client credentials, password, or password with confidential client.
● OAuth 2 Token Endpoint: Token endpoint that the application must use.
● OAuth 2 User: Required if the grant type is password or password with confidential client.
● OAuth 2 Password: Required if the grant type is password or password with confidential client.
● OAuth 2 Client ID: The client ID.
● OAuth 2 Client Secret: The client secret.
● OAuth 2 Scope: The OAuth scope value.
● OAuth 2 Token Request Content Type: The value for the Content-Type HTTP header that the application must use when requesting a token.
Require CSRF Header Require CSRF Header. The default value is true.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
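For the OAuth2 settings above, the token request sent to the token endpoint is a standard OAuth 2.0 form-encoded POST (Content-Type application/x-www-form-urlencoded, per RFC 6749). Building the request body for the client-credentials grant can be sketched like this; the helper itself is illustrative, not part of the product:

```python
from urllib.parse import urlencode

def client_credentials_body(client_id, client_secret, scope=None):
    """Form-encoded body for an OAuth 2.0 client_credentials token request."""
    params = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scope:
        params["scope"] = scope
    return urlencode(params)

body = client_credentials_body("my-client", "my-secret", scope="cdi.read")
print(body)
# → grant_type=client_credentials&client_id=my-client&client_secret=my-secret&scope=cdi.read
```

The password grant works the same way with grant_type=password plus username and password fields.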
4.4.9 CPEM
SAP BTP Enterprise Messaging (formerly SAP Cloud Platform Enterprise Messaging, or CPEM) allows connection to an SAP BTP Enterprise Messaging service instance using the AMQP protocol over a secure WebSocket (wss) connection with OAuth 2.0 authentication.
Features
The CPEM connection type allows you to:
● Receive messages published via SAP BTP Enterprise Messaging
● Publish messages via SAP BTP Enterprise Messaging
Attributes
Attribute Description
Host Host name or IP address of the SAP BTP Enterprise Messaging gateway.
OAuth 2 Token Endpoint The token endpoint to be used for OAuth 2 authentication.
OAuth 2 Client ID The client's ID for OAuth 2 authentication.
OAuth 2 Client Secret The client's secret for OAuth 2 authentication.
4.4.10 CPI
Provides an HTTPS connection to an SAP BTP Integration system (formerly SAP Cloud Platform Integration system).
Operation
The CPI connection type allows:
● Triggering the execution of iFlows in the connected CPI system
● Providing the response from the execution to other downstream operators in the SAP Data Intelligence Modeler
You can create the CPI connection and use it in the CPI-PI iFlow operator in the modeler. The application supports only HTTP basic authentication.
Note
Using HTTPS as protocol is mandatory. Plain HTTP connections not using TLS encryption are not supported.
Attributes
Attribute Description
Host Hostname of the CPI system (without protocol and port). You can use the domain name or IP address as the host name.
Port Port of the CPI system. If not set, the application uses the HTTPS default port (443).
User Username that the application must use to authenticate at the target CPI system.
Password Password to authenticate at the target system.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.11 DATASERVICES
Provides connection and access information to a SOAP server from an SAP Data Services administration server.
Operation
The DATASERVICES connection type allows:
● Browsing SAP Data Services jobs
● Executing SAP Data Services jobs in the SAP Data Services Job Server
● Connecting to on-premise Data Services via SAP Cloud Connector
The connection type supports only the HTTP connection protocol and only the authentication type secEnterprise.
Note
The connection type allows you to connect to SAP Data Services on-premise systems of version 4.2 SP2 Patch 2 and later. In the SAP Data Intelligence Modeler, for connections to older Data Services systems (earlier than 4.2 SP2 Patch 2), you can manually enter the runtime parameters.
Attributes
Attribute Description
Host Hostname of the SAP Data Services web service host.
In most cases, it is the SAP Data Services management console host.
Port Port of the Data Services web service
CMS Host Hostname of the central management system
CMS Port Port of the central management system
Connection protocol Protocol (HTTP or HTTPS). The default value is HTTPS.
Username Username to authenticate in the DATASERVICES system
Password Password to authenticate in the DATASERVICES system
Note
For information about supported versions, see SAP Note 2693555 .
4.4.12 DB2
Provides connection and access information to IBM DB2 databases. The application supports DB2 Universal Database (UDB) version 10.x onwards.
Prerequisites
● You have installed the IBM Data Server Driver for ODBC and CLI (CLI Driver) for the Linux operating system.
● You have downloaded the compressed TAR archive (tar.gz file) for Linux.
● You have downloaded the IBM DB2 ODBC driver version 11.01 or later, for TLS support.
● Create the vsolution area: mkdir -p db2_vsolution/content/files/flowagent
● Create the vsolution manifest as db2_vsolution/manifest.json:
Sample Code
{
  "name": "vsolution_db2",
  "version": "1.0.0",
  "format": "2",
  "dependencies": []
}
Note
If you need to upload a new driver later, you can modify the version (for example, 1.0.1, 2.0.0, and so on) to upload a new vsolution. Then simply modify your layering strategy appropriately.
● Extract the DB2 package: tar -xvzf ibm-db2-odbc-
Note
You require the Tenant Administrator role. /vrep is equivalent to the files folder in the top level of the System Management files.
2. Choose the Tenant tab.
3. Click +, and then select the newly created db2_vsolution.zip.
4. After the import is complete, choose the Strategy sub-tab, and then click Edit.
5. Click +, and select your newly imported solution vsolution_db2-1.0.0.
6. Click Save.
Note
For changes to take effect, restart the Flowagent application.
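The prerequisite steps above — create the folder layout, write the manifest, and zip the result for import into System Management — can be scripted. A sketch using only the documented layout, assuming the zip root corresponds to the db2_vsolution folder (the extracted driver files would still need to be placed in the flowagent folder before zipping):

```python
import json
import os
import zipfile

root = "db2_vsolution"
flowagent = os.path.join(root, "content", "files", "flowagent")
os.makedirs(flowagent, exist_ok=True)

manifest = {"name": "vsolution_db2", "version": "1.0.0",
            "format": "2", "dependencies": []}
with open(os.path.join(root, "manifest.json"), "w") as f:
    json.dump(manifest, f, indent=2)

# (Extract the IBM CLI driver archive into `flowagent` here before zipping.)

with zipfile.ZipFile("db2_vsolution.zip", "w") as z:
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            z.write(full, os.path.relpath(full, root))

print(zipfile.ZipFile("db2_vsolution.zip").namelist())
# → ['manifest.json']
```

To upload a newer driver later, bump the "version" field as the note above describes and build a fresh zip.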
Operation
The DB2 connection type allows:
● Browsing DB2 schemas in Metadata Explorer
● Previewing DB2 tables and views in Metadata Explorer
● Profiling DB2 tables and views in Metadata Explorer
● Indexing DB2 schemas in Metadata Explorer
● On-premise connectivity via SAP Cloud Connector
You can create the DB2 connection and use the SAP Data Intelligence Modeler to:
● Read tables and views
● Read data from SQL queries
● Execute native SQL DDL/DML statements
Attributes
Attribute Description
Version IBM DB2 version. (DB2 UDB 11.x or DB2 UDB 10.x)
Host Hostname of the DB2 server
Port Port number of the DB2 server
Database name Name of the IBM DB2 database with which you want to establish a connection.
Server Certificate DB2 Server certificate
User DB2 user with privileges to connect to the database.
Password Password of the DB2 user
Additional session parameters Provide more session-specific parameters
Note
For information about supported versions, see SAP Note 2693555 .
4.4.13 GCS
Provides connection and access information to objects in Google Cloud Storage.
Operation
The GCS connection type allows:
● Browsing folders and files in the Google Cloud Storage server ● Obtaining file metadata ● Profiling data ● Previewing data ● Performing flowgraph tasks with Google Cloud Storage files as the source and/or target
You can create the GCS connection and use the SAP Data Intelligence Modeler to:
● Read and write files using the Read File and Write File operators.
● Copy, rename, and remove files using the Copy File, Move File, and Remove File operators respectively.
Attributes
Attribute Description
Project ID ID of the GCS project to which you want to connect
Key file Contents of the key file used for authentication
Root Path The optional root path name for browsing objects. The value starts with the character slash. For example, /My Folder/MySubfolder.
If you have specified the Root Path, then any path used with this connection is prefixed with the Root Path.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.14 GCP_BIGQUERY
Provides connection to Google Cloud BigQuery.
Prerequisites
You have installed the ODBC driver. To install it, follow these steps:
● Download the Magnitude Simba drivers for BigQuery from the Google Cloud platform.
○ Select the Linux 32-bit and 64-bit (tar.gz) version: SimbaODBCDriverforGoogleBigQuery_2.2.5.1012-Linux.tar.gz
Note
Currently the steps refer to the driver version 2.3.1.1001. However, the steps should remain the same for newer minor driver versions.
○ Create the vsolution area: mkdir -p gcp_bigquery_vsolution/content/files/flowagent.
○ Create the vsolution manifest as gcp_bigquery_vsolution/manifest.json:
Sample Code
{
"name": "vsolution_gcp_bigquery", "version": "1.0.0", "format": "2", "dependencies": []
}
Note
If you need to upload a new driver later, you can modify the version (for example, 1.0.1, 2.0.0, and so on) to upload a new vsolution. Then simply modify your layering strategy appropriately.
● Extract the downloaded compressed TAR archive:
Sample Code
tar -xzvf SimbaODBCDriverforGoogleBigQuery_2.3.1.1001-Linux.tar.gz
tar -xzvf SimbaODBCDriverforGoogleBigQuery_2.3.1.1001-Linux/SimbaODBCDriverforGoogleBigQuery64_2.3.1.1001.tar.gz -C gcp_bigquery_vsolution/content/files/flowagent/
cp SimbaODBCDriverforGoogleBigQuery_2.3.1.1001-Linux/GoogleBigQueryODBC.did gcp_bigquery_vsolution/content/files/flowagent/SimbaODBCDriverforGoogleBigQuery64_2.3.1.1001/lib/
● (Optional) Configure the driver to display proper error messages:
Sample Code
mv gcp_bigquery_vsolution/content/files/flowagent/SimbaODBCDriverforGoogleBigQuery64_2.3.1.1001/ErrorMessages/en-US gcp_bigquery_vsolution/content/files/flowagent/SimbaODBCDriverforGoogleBigQuery64_2.3.1.1001/lib/en-US
● The gcp_bigquery_vsolution/content/files/flowagent/SimbaODBCDriverforGoogleBigQuery64_2.3.1.1001/lib folder should have the following structure:
Sample Code
lib/:
cacerts.pem  en-US  EULA.txt  GoogleBigQueryODBC.did  libgooglebigqueryodbc_sb64.so

lib/en-US:
DSMessages.xml  DSOAuthMessages.xml  ODBCMessages.xml  SimbaBigQueryODBCMessages.xml  SQLEngineMessages.xml
● Create a properties file gcp_bigquery_vsolution/content/files/flowagent/gcp_bigquery.properties with the driver manager path relative to the location of the properties file:
Sample Code
GOOGLEBIGQUERY_DRIVERMANAGER=./SimbaODBCDriverforGoogleBigQuery64_2.3.1.1001/lib/libgooglebigqueryodbc_sb64.so
● Compress the vsolution from within the gcp_bigquery_vsolution directory: zip -r gcp_bigquery_vsolution.zip ./
● Import the vsolution in System Management.
1. Start the System Management application from the Launchpad.
Note
You require the Tenant Administrator role.
2. Click the Tenant tab.
3. Click +, and then select the newly created gcp_bigquery_vsolution.zip.
4. After the import is complete, choose the Strategy sub-tab, and then click Edit.
5. Click +, and then select your newly imported solution vsolution_gcp_bigquery-1.0.0.
6. Click Save.
Note
For changes to take effect, restart the Flowagent application.
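The packaging steps above can be condensed into one shell sketch. Paths, the manifest, and the driver version (2.3.1.1001) follow the steps; the driver archive itself is assumed to have been downloaded already, and the extract/copy steps are skipped if it is absent:

```shell
#!/bin/sh
# Sketch of the BigQuery ODBC vsolution packaging steps above.
set -e

VER=2.3.1.1001
DRV=SimbaODBCDriverforGoogleBigQuery
FA=gcp_bigquery_vsolution/content/files/flowagent

# Create the vsolution area and manifest
mkdir -p "$FA"
cat > gcp_bigquery_vsolution/manifest.json <<'EOF'
{
  "name": "vsolution_gcp_bigquery",
  "version": "1.0.0",
  "format": "2",
  "dependencies": []
}
EOF

if [ -f "${DRV}_${VER}-Linux.tar.gz" ]; then
  # Extract the driver into the flowagent folder and copy the .did file
  tar -xzf "${DRV}_${VER}-Linux.tar.gz"
  tar -xzf "${DRV}_${VER}-Linux/${DRV}64_${VER}.tar.gz" -C "$FA/"
  cp "${DRV}_${VER}-Linux/GoogleBigQueryODBC.did" "$FA/${DRV}64_${VER}/lib/"
  # (Optional) move the error-message catalog next to the driver library
  mv "$FA/${DRV}64_${VER}/ErrorMessages/en-US" "$FA/${DRV}64_${VER}/lib/en-US"
fi

# Driver manager path is relative to the properties file
printf 'GOOGLEBIGQUERY_DRIVERMANAGER=./%s64_%s/lib/libgooglebigqueryodbc_sb64.so\n' \
  "$DRV" "$VER" > "$FA/gcp_bigquery.properties"

# Compress the vsolution for import in System Management
if command -v zip >/dev/null 2>&1; then
  (cd gcp_bigquery_vsolution && zip -qr ../gcp_bigquery_vsolution.zip ./)
fi
```

The resulting gcp_bigquery_vsolution.zip is then imported on the Tenant tab of System Management as described above.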
Operation
The GCP_BIGQUERY connection type allows:
● View connection status in Connection Management
● Browse remote objects in Metadata Explorer
● Obtain fact sheets of remote objects in Metadata Explorer
● Preview content of remote objects in Metadata Explorer
● Publish remote objects in Metadata Explorer
● Read tables and views in Modeler
● Read data from SQL queries in Modeler
● Load data to a Google BigQuery table in Modeler
Note
Previewing content in Metadata Explorer and reading tables and views in Modeler require an installed ODBC driver; see the Prerequisites section. The account must have the permission `bigquery.datasets.get` granted at the project level. Access to Google BigQuery sources without an ODBC driver is deprecated; configuring the ODBC driver is recommended for additional features and better performance.
Attributes
Attribute Description
Project ID Google BigQuery Project ID
Keyfile Keyfile to upload that contains the access credentials for a service account.
Additional Regions Set of additional locations (besides the default) that will be used to show datasets.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.15 GCP_DATAPROC
Provides connection to a Google Cloud Dataproc cluster. Google Cloud Dataproc is a scalable, metered, and managed Spark and Hadoop service. Clusters can scale up and down, as required.
Operation
The GCP_DATAPROC connection type allows submitting Spark, PySpark, Hive, or SparkSQL jobs to a Dataproc cluster in a pipeline. You can create this connection and use it in the Submit Hadoop Job operator.
Attributes
Attribute Description
Project ID A unique identifier for the GCP project. It often consists of the project name and a random number.
Projects are high-level groupings used to manage APIs, billing, permissions, and more.
Cluster Name The name of the specific Dataproc cluster with which you want to establish the connection.
Region GCP-specific region in which the cluster is located. For example, europe-west3.
Zone Zone in which the cluster is located using the standard form. For example, europe-west3-b.
Keyfile The private key associated with the user or service account with Dataproc permissions.
In the GCP console of each cluster, you can find all attributes under the Configuration tab.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.16 GCP_PUBSUB
Provides connection to a Google publish/subscribe service.
Operation
The GCP_PUBSUB connection type allows:
● Creating a topic on the Google Cloud Pub/Sub service and sending messages to it using the Google Pub/Sub Producer operator.
● Subscribing to a topic on the Google Cloud Pub/Sub service and receiving messages from it using the Google Pub/Sub Consumer operator.
Attributes
Attribute Description
Project ID The project ID on GCP to which the service account belongs.
Key file Keyfile that contains the access credentials for a service account on GCP.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.17 HANA_DB
Provides connection and access information to tables and views in the SAP HANA database.
Operation
The HANA_DB connection type allows:
● Browsing folders and files in the SAP HANA database
● Obtaining file metadata
● Profiling data
● Previewing data
● Performing data preparation tasks
Attributes
Attribute Description
Host Hostname or the IP address of the SAP HANA server.
For SAP HANA Cloud, use the endpoint without the port suffix.
Port Enter the SQL Port of your HANA database.
In the case of a single-container database, the SQL port is 3<instance number>15. For a multi-container system, find the SQL port of your tenant database with the following query:
SELECT DATABASE_NAME, SERVICE_NAME, PORT, SQL_PORT FROM SYS_DATABASES.M_SERVICES
For SAP HANA Cloud, use port 443 (provided at the end of the endpoint information of the SAP HANA Cloud instance).
Additional Hosts List of host-port pairs to use as fallback hosts. The additional hosts are alternatives to the required Host and Port attributes.
User Username that the application must use to authenticate with the server.
Password Password used to authenticate with the server
Blocked schemas Click Add item and enter the names of the schemas that you want hidden from other applications, such as the browse and publication tasks in Metadata Explorer. Other calls do not use this list, and the objects are shown in the application, including during lineage extraction in Metadata Explorer.
Use the wildcard * to hide objects and datasets. For exam ple:
● "_SYS*" hides the top-level objects that start with _SYS.
● "CUSTOMER/*" hides all of the objects within the CUSTOMER object.
● "CUSTOMER/PROSPECTS*" hides all objects within CUSTOMER objects that begin with PROSPECTS.
● "CUSTOMER_EUROPE" hides the top-level objects that exactly match CUSTOMER_EUROPE.
Use TLS Set a value. The default value is false. This flag helps the application identify whether to connect to the server over TLS.
For SAP HANA Cloud, TLS is always enabled and must be set to true.
SAP HANA Cockpit URL Base URL for the direct access to the SAP HANA database via the SAP HANA Cockpit.
Note
In the connection details page, select the External Links option to navigate to the SAP HANA Cockpit or SQL console in SAP HANA Cockpit.
Note
For information about supported versions, see SAP Note 2693555 .
4.4.18 HANA_XS
Provides connection and access information to tables and views in SAP HANA database.
Operation
The HANA_XS connection type allows:
● Browsing SAP HANA flowgraphs. ● Executing SAP HANA flowgraphs.
Attributes
Attribute Description
Host Hostname or the IP address of the SAP HANA server
Port Port of the SAP HANA server
Protocol Protocol (HTTP or HTTPS). The default value is HTTPS.
User Username that the application must use to authenticate with the server.
Password Password used to authenticate with the server
Note
For information about supported versions, see SAP Note 2693555 .
4.4.19 HDFS
Provides connection and access information to objects in an HDFS server.
Operation
The HDFS connection type allows:
● Browsing folders and files in the HDFS server
● Obtaining file metadata
● Profiling data
● Copying and deleting files in the HDFS server
● Performing flowgraph tasks with HDFS files as the source and/or target
Note
Along with Remote Procedure Call (RPC), HDFS can also extend connections with WebHDFS and SWebHDFS.
To connect to an HDFS server, the HDFS connection type supports both Kerberos authentication and basic authentication in addition to a simple authentication mechanism. You can create the HDFS connection and also use the SAP Data Intelligence Modeler to:
● Read and write files using the Read File and Write File operators.
● Rename and remove files using the Move File and Remove File operators.
Attributes
Attribute Description
Host Hostname or the IP address of the HDFS namenode.
Port Port of the HDFS namenode. If you do not provide any value, the default value for the selected protocol is used.
Additional Hosts List of secondary hostnames or IP addresses. Required for HDFS High Availability (HA) configurations.
Protocol Protocol the application must use (rpc, webhdfs, swebhdfs).
Note
If you are using HDFS in a network other than the Data Intelligence cluster, then you can use only webhdfs or swebhdfs connections, but not RPC. In such cases, run an HTTPFS proxy on HDFS and use the HTTPFS port to reach the HDFS server.
Authentication Type Set the authentication type that the application must use. To connect to HDFS the application supports simple, basic, and kerberos authentication mechanisms.
If you have selected kerberos to create a connection to a Kerberized HDFS machine, then provide values for the following additional parameters:
● krb5.conf File: Browse and select the krb5.conf configuration file from your local system.
● Keytab File: Browse and select the keytab file of the user from the local system.
User For the kerberos authentication type, provide the user principal. For other types, provide the username of the HDFS user.
Root Path The optional root path name for browsing objects. The value starts with the character slash. For example, /My Folder/MySubfolder.
If you have specified the Root Path, then any path used with this connection is prefixed with the Root Path.
Impersonation Set the flag to true if you want to enable impersonation for the connected user. The default value is false.
Custom Parameters List of HDFS custom parameters defined by the customer.
Proxy Optionally configure a connection-specific proxy server by specifying its type, host, and port. HTTP (default) and SOCKS types are available.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
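The webhdfs and swebhdfs protocols above correspond to the standard WebHDFS REST API, which can also be used to verify connectivity outside SAP Data Intelligence. A minimal sketch of how such a request URL is formed (the host, port, and path below are placeholders):

```shell
#!/bin/sh
# Build a WebHDFS REST URL for a directory listing (op=LISTSTATUS).
# webhdfs maps to http, swebhdfs to https.
webhdfs_url() {
  scheme=$1  # webhdfs or swebhdfs
  host=$2
  port=$3
  path=$4    # HDFS path, starting with /
  case "$scheme" in
    webhdfs)  proto=http ;;
    swebhdfs) proto=https ;;
    *) echo "unknown scheme: $scheme" >&2; return 1 ;;
  esac
  printf '%s://%s:%s/webhdfs/v1%s?op=LISTSTATUS\n' "$proto" "$host" "$port" "$path"
}

# Example: list /user/demo (placeholder host); to test connectivity, pass
# the result to curl, e.g. against the HTTPFS proxy port mentioned above.
webhdfs_url webhdfs namenode.example.com 9870 /user/demo
```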
4.4.20 HDL_DB
Provides connection and access information to SAP HANA Data Lake database.
Operation
The SAP HANA Data Lake database connection allows:
● View connection status in Connection Management
● Browse remote objects in Metadata Explorer
● View fact sheets of remote objects in Metadata Explorer
● Preview content of remote objects in Metadata Explorer
● Prepare data using the Preparation application in Metadata Explorer
● Publish remote objects in Metadata Explorer
● Run rules in Metadata Explorer
● Read tables and views in Modeler
● Read data from SQL queries in Modeler
● Execute native SQL DDL/DML statements in Modeler
Attributes
Attribute Description
Host Host name of the SAP HANA Data Lake database server
Port Port number of the SAP HANA Data Lake database server
User SAP HANA Data Lake database user with privileges to connect to the database.
Password Password of the SAP HANA Data Lake database user
Additional session parameters Allows setting session-specific variables
Datatype Conversion
A conversion is performed from SAP HANA Data Lake database datatypes to an agnostic set of types, as shown below. SAP HANA Data Lake database datatypes that do not have a corresponding mapped datatype are not supported.
SAP HANA Data Lake Database Datatype Mapped Datatype
BIGINT DECIMAL(20, 0)
UNSIGNED BIGINT DECIMAL(20, 0)
BIT INTEGER(4)
DECIMAL(p,s) DECIMAL(p, s)
DOUBLE FLOATING(8)
FLOAT FLOATING(4)
INT INTEGER(4)
UNSIGNED INT INTEGER(4)
INTEGER INTEGER(4)
UNSIGNED INTEGER INTEGER(4)
MONEY DECIMAL(19,4)
NUMERIC(p,s) DECIMAL(p, s)
REAL FLOATING(4)
SMALLINT INTEGER(4)
SMALLMONEY DECIMAL(10,4)
TINYINT INTEGER(4)
UNIQUEIDENTIFIER
VARCHAR(s) STRING(s)
SYSNAME STRING(30)
DATE DATE
DATETIME DATETIME
SMALLDATETIME DATETIME
TIME TIME
TIMESTAMP DATETIME
BINARY
VARBINARY
BLOB LARGE_BINARY_OBJECT
CLOB LARGE_CHARACTER_OBJECT
IMAGE LARGE_BINARY_OBJECT
LONG BINARY LARGE_BINARY_OBJECT
TEXT LARGE_CHARACTER_OBJECT
Note
For information about supported versions, see SAP Note 2693555 .
4.4.21 HDL_FILES
Provides connection and access information to file storage on SAP HANA Data Lake.
Attributes
Attribute Description
Keystore File Client keystore to be used. Select the file in P12 (binary) format, or enter the Base64-encoded value of the P12 file content.
Endpoint SAP HANA Data Lake Files endpoint.
Root Path The optional root path name for browsing. Starts with a slash (e.g. /MyFolder/MySubfolder). Any path used with this connection is prefixed with this root path.
Keystore Pwd Password for the client keystore.
Impersonate Impersonate the current user, that is, act on behalf of the connected user (default: false).
4.4.22 HTTP
Provides connection to a server over HTTP or HTTPS.
Operation
The HTTP connection type allows accessing an arbitrary HTTP endpoint.
Attributes
Attribute Description
Host Hostname or IP address of the HTTP server
Port Port for the HTTP server
Protocol Protocol (HTTP or HTTPS). The default value is HTTPS.
Path The path prefix to the API endpoint (for example, /api/ xyz)
Authentication The authentication method that the application must use. You can select NoAuth for no authentication, Basic to authenticate with user/password details, or OAuth2.
If you have selected Basic, then provide values for the following attributes:
● User: Username that the application must use to authenticate with the server.
● Password: Password used to authenticate with the server
If you have selected OAuth2, then provide values for the following attributes:
● OAuth 2 Grant Type: The type of grant
● OAuth 2 Token Endpoint: Token endpoint that the application must use
● OAuth 2 Client ID: The client ID
● OAuth 2 Client Secret: The client secret
● OAuth 2 Scope: The OAuth scope value
● OAuth 2 Resource: The name of the resource
● OAuth 2 Token Request Content Type: The value for the Content-Type HTTP header that the application must use when requesting a token.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.23 IMAP
Provides connection to an IMAP server to receive emails.
Operation
The IMAP connection follows the protocol standard defined in RFC 3501. You can use the connection in the Receive Email operator in the SAP Data Intelligence Modeler.
Attributes
Attribute Description
Host Hostname of the IMAP server
Port Port of the IMAP server
User Username in the IMAP server
Password Password in the IMAP server
Use TLS Set a value. The default value is false. This flag helps the application identify whether to connect to the server over TLS.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.24 INFORMATION_STEWARD
Provides connection and access information to an SAP Information Steward administration server.
The INFORMATION_STEWARD connection type offers the ability to import rules and rule bindings in the Metadata Explorer.
Note
The connection is only shown in the Metadata Explorer when importing rules; it is not shown when browsing connections.
Features
The Information Steward connection type supports:
● View connection status in Connection Management.
● Supports only the HTTP and HTTPS connection protocols.
● Supports only the secEnterprise authentication type.
● View Information Steward projects in Metadata Explorer.
● Import Information Steward rules and bindings in Metadata Explorer.
● Connect to on-premise SAP Information Steward via SAP Cloud Connector.
Attributes
Attribute Description
Host Hostname of the SAP Information Steward web service host.
Port Port of the Information Steward web service.
CMS Host Hostname of the central management system.
CMS Port Port of the central management system.
Connection protocol Protocol (HTTP or HTTPS). The default value is HTTPS.
Username Username to authenticate in the INFORMATION_STEWARD system.
Password Password to authenticate in the INFORMATION_STEWARD system.
When connecting on cloud instances, you will see additional options. See Using SAP Cloud Connector Gateway [page 125] for details.
4.4.25 KAFKA
Provides connection to an Apache Kafka cluster in order to consume and produce messages.
Operation
You can create the KAFKA connection and use the SAP Data Intelligence Modeler to:
● Consume messages from a list of Kafka topics using the Kafka Consumer operator.
● Produce messages to a Kafka topic using the Kafka Producer operator.
Attributes
Attribute Description
Kafka Brokers Comma-separated list of Kafka brokers in host:port format.
Authentication Set the authentication type. The flag helps the application identify whether it must use a username and password for authentication.
Kafka SASL Username Kafka SASL connection username.
Kafka SASL Password Kafka SASL connection password.
Use TLS Set a value. The default value is false. This flag helps the application identify whether to connect to the server over TLS.
Use certificate authority When this property is true, the TLS CA file is requested. Only needed when Use TLS is true.
Certificate authority path Certificate authority path, only considered when Use TLS is true.
Client certificate path Client certificate file that will be loaded with public/private key pair if applicable.
Client key path Client key file to be loaded with client certificate file if appli cable.
SASL Authentication Authentication mechanism to be used, currently PLAIN, SCRAM-256 and SCRAM-512 are supported.
Group ID ID of the consumer group that the consumer belongs to.
Kafka kerberos service name The service name defined for the Kafka server.
Kafka kerberos realm The realm defined for the Kafka kerberos server.
Authentication type for Kafka/GSSAPI Authentication type for Kafka/GSSAPI. When "User/password" is used, username and password must be set; when "Keytab file authentication" is used, the keytab path must be set.
Kerberos config path Kerberos configuration path (krb5.conf file).
Keytab file path Path for keytab file for Kafka client.
Note
For information about supported versions, see SAP Note 2693555 .
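Since the Kafka Brokers attribute expects a comma-separated host:port list, a quick format sanity check can be sketched as follows (the function name and broker names are illustrative, not part of the product):

```shell
#!/bin/sh
# Validate a comma-separated host:port broker list such as the
# Kafka Brokers attribute expects, e.g. "broker1:9092,broker2:9092".
valid_broker_list() {
  printf '%s' "$1" | grep -Eq '^[^,:[:space:]]+:[0-9]+(,[^,:[:space:]]+:[0-9]+)*$'
}

if valid_broker_list "broker1:9092,broker2:9092"; then
  echo "broker list format ok"
fi
```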
4.4.26 MLCLUSTER
Provides connection to a Machine Learning cluster.
Operation
You can create the MLCLUSTER connection and use the operators in the SAP Data Intelligence Modeler to trigger Machine Learning jobs.
Note
MLCLUSTER, by default, is disabled. It is used only internally by the Connection Management application. If you want to create a connection with this connection type, you must first enable it in the Connection Management application. For more information, see Using SAP Data Intelligence Connection Management.
Attributes
Attribute Description
Service Key Service key to authenticate the SAP Data Intelligence instance on the ML side.
4.4.27 MSSQL
Provides connection and access information to Microsoft SQL Server (MSSQL) databases. The application supports MSSQL version 2012 onwards.
Operation
The MSSQL connection type allows:
● Browsing MSSQL schemas in Metadata Explorer
● Previewing MSSQL tables and views in Metadata Explorer
● Profiling MSSQL tables and views in Metadata Explorer
● Indexing MSSQL schemas in Metadata Explorer
● On-premise connectivity via SAP Cloud Connector
You can create the MSSQL connection and use the SAP Data Intelligence Modeler to:
● Read tables and views
● Read data from SQL queries
● Execute native SQL DDL/DML statements
Attributes
Attribute Description
Version Microsoft SQL Server version (2012, 2014, 2016, or 2017)
Subtype Microsoft SQL Server subtype. The application currently supports only SQL Server on Premise
Host Hostname of MSSQL server
Port Port number of the MSSQL server
Use TLS Set a value. The default value is false. This flag helps the application identify whether to connect to the server over TLS.
Database name Name of the MSSQL database with which you want to establish a connection.
User MSSQL user with privileges to connect to the database.
Password Password of the MSSQL user
Additional session parameters Provide more session-specific parameters
Note
For information about supported versions, see SAP Note 2693555 .
4.4.28 MYSQL
Provides connection and access information to Oracle MySQL server. The application supports MySQL version 5.5 onwards.
Prerequisites
Configuration Using Existing Client
MySQL operators require MySQL ODBC Connector for Linux x86-64. Follow the instructions below to set it up:
1. Download the MySQL ODBC Connector here.
   1. Select Linux - Generic and the operating system Linux - Generic (glibc 2.12) (x86, 64-bit) as the OS version.
   2. Select the version 8.0.12.
   3. Download the compressed TAR archive (tar.gz).
2. Create the vsolution area: mkdir -p mysql_vsolution/content/files/flowagent.
3. Create the vsolution manifest as mysql_vsolution/manifest.json:
Sample Code
{
"name": "vsolution_mysql", "version": "1.0.0", "format": "2", "dependencies": []
}
Note
If you need to upload a new driver later, you can modify the version (for example, 1.0.1, 2.0.0, and so on) to upload a new vsolution. Then simply modify your layering strategy appropriately.
4. Extract the downloaded compressed TAR archive:
Sample Code
tar -xvzf mysql-connector-odbc-8.0.12-linux-glibc2.12-x86-64bit.tar.gz -C mysql_vsolution/content/files/flowagent/
5. Create a properties file mysql_vsolution/content/files/flowagent/mysql.properties with driver manager relative to the location of the properties file:
Sample Code
MYSQL_DRIVERMANAGER=./mysql-connector-odbc-8.0.12-linux-glibc2.12-x86-64bit/lib/libmyodbc8w.so
Note
The exact path for the MYSQL_DRIVERMANAGER variable may change depending on the downloaded driver version.
6. Compress the vsolution from within the mysql_vsolution directory: zip -r mysql_vsolution.zip ./
7. Import the vsolution in System Management.
1. Start the System Management application from the Launchpad.
Note
You require the Tenant Administrator role.
2. Click the Tenant tab.
3. Click +, and then select the newly created mysql_vsolution.zip.
4. After the import is complete, click the Strategy sub-tab, and then click Edit.
5. Click +, and select your newly imported solution vsolution_mysql-1.0.0.
6. Click Save.
Note
For changes to take effect, restart the Flowagent application.
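Steps 2 through 6 above can be condensed into one shell sketch (paths, the manifest, and the driver version 8.0.12 follow the steps; the connector archive from step 1 is assumed to be in the current directory, and extraction is skipped if it is absent):

```shell
#!/bin/sh
# Sketch of the MySQL ODBC vsolution packaging steps above.
set -e

DRV=mysql-connector-odbc-8.0.12-linux-glibc2.12-x86-64bit
FA=mysql_vsolution/content/files/flowagent

# Create the vsolution area and manifest
mkdir -p "$FA"
cat > mysql_vsolution/manifest.json <<'EOF'
{
  "name": "vsolution_mysql",
  "version": "1.0.0",
  "format": "2",
  "dependencies": []
}
EOF

# Extract the downloaded connector archive, if present
if [ -f "$DRV.tar.gz" ]; then
  tar -xzf "$DRV.tar.gz" -C "$FA/"
fi

# Driver manager path is relative to the properties file
printf 'MYSQL_DRIVERMANAGER=./%s/lib/libmyodbc8w.so\n' "$DRV" > "$FA/mysql.properties"

# Compress the vsolution for import in System Management
if command -v zip >/dev/null 2>&1; then
  (cd mysql_vsolution && zip -qr ../mysql_vsolution.zip ./)
fi
```

The resulting mysql_vsolution.zip is then imported on the Tenant tab of System Management as described above.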
Operation
The MYSQL connection type allows:
● Browsing MySQL schemas in Metadata Explorer
● Previewing MySQL tables and views in Metadata Explorer
● Profiling MySQL tables and views in Metadata Explorer
● Indexing MySQL schemas in Metadata Explorer
● On-premise connectivity via SAP Cloud Connector
You can create the MYSQL connection and use the SAP Data Intelligence Modeler to:
● Read tables and views
● Read data from SQL queries
● Execute native SQL DDL/DML statements
Attributes
Attribute Description
Version Oracle MySQL server version. (Default value: MySQL 8.x)
Host Hostname of the MySQL server
Port Port number of the MySQL server. (Default value: 3306)
Database name Name of the MySQL database with which you want to establish a connection.
User MySQL user with privileges to connect to the database
Password Password of the MySQL user
Additional session parameters Provide more session-specific parameters
Note
For information about supported versions, see SAP Note 2693555 .
4.4.29 ODATA
Provides connection to OData RESTful APIs. The application supports OData versions V2 and V4.
Operation
The ODATA connection type allows:
● Browsing OData datasets in Metadata Explorer
● Previewing OData resources in Metadata Explorer
● Indexing OData datasets in Metadata Explorer
You can create the ODATA connection and use the SAP Data Intelligence Modeler to:
● Read OData resources
Attributes
Attribute Description
URL OData Service URL
Version OData Version (V2 or V4)
Authentication The authentication method that the application must use: NoAuth for no authentication, Basic to authenticate with user/password details, OAuth2, or ClientCertificate.
User Authentication user, if Basic authentication is chosen
Password Authentication password, if Basic authentication is chosen
Require CSRF Header Require CSRF Header
HTTP Headers Allows passing additional HTTP headers with the request if they are required by the specific OData endpoint.
Client Certificate The X509 client certificate to be used if ClientCertificate authentication is chosen
Provide Client Private Key in separate file Flag indicating whether the X509 client private key is specified in a separate file
Client Private Key The X509 client private key to be used if the client certificate has a separate private key
Use Client Private Key Password Flag indicating whether the X509 client private key is protected by a password
Client Private Key Password Password for the X509 client private key
OAuth2 GrantType OAuth2 GrantType, if OAuth2 authentication is chosen
OAuth2 Token Endpoint OAuth2 Token Endpoint, if OAuth2 Authentication is chosen
OAuth2 User OAuth2 User, if OAuth2 authentication is chosen
OAuth2 Password OAuth2 Password, if OAuth2 authentication is chosen
OAuth2 ClientId OAuth2 ClientId, if OAuth2 authentication is chosen
OAuth2 Client Secret OAuth2 Client Secret, if OAuth2 authentication is chosen
OAuth2 Scope OAuth2 Scope, if OAuth2 authentication is chosen
OAuth2 Resource OAuth2 Resource, if OAuth2 authentication is chosen
OAuth2 Response Type OAuth2 Response Type, if OAuth2 authentication is chosen
OAuth2 Token Request Content Type OAuth2 Token Request Content Type, if OAuth2 authentication is chosen. If url_encoded, the OAuth2 token request parameters are URL-encoded and included in the HTTP request body. If json, the parameters are sent in JSON format in the HTTP request body.
Note
For information about supported versions, see SAP Note 2693555 .
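The OAuth2 Token Request Content Type attribute above selects how the token request parameters are serialized. The two forms look roughly like this (the client ID and secret below are placeholders, and the grant type shown is only an example):

```shell
#!/bin/sh
# Two serializations of the same OAuth2 token request body, as selected
# by "OAuth2 Token Request Content Type" (values are placeholders).
CLIENT_ID=my-client
CLIENT_SECRET=my-secret

# url_encoded: sent with Content-Type: application/x-www-form-urlencoded
BODY_FORM="grant_type=client_credentials&client_id=${CLIENT_ID}&client_secret=${CLIENT_SECRET}"

# json: sent with Content-Type: application/json
BODY_JSON="{\"grant_type\":\"client_credentials\",\"client_id\":\"${CLIENT_ID}\",\"client_secret\":\"${CLIENT_SECRET}\"}"

echo "$BODY_FORM"
echo "$BODY_JSON"
```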
4.4.30 OPEN_CONNECTORS
Provides connection and access information to SAP BTP Open Connectors.
Features
The OPEN_CONNECTORS connection type allows:
● Browsing datasets from Open Connectors in the Metadata Explorer.
● Previewing data from Open Connectors in the Metadata Explorer.
You can create the OPEN_CONNECTORS connection and use the SAP Data Intelligence Modeler to:
● Read tables and views
● Read data from SQL queries
Attributes
Attribute Description
Open Connectors Instance ID ID of an authenticated connector instance.
Open Connectors API Base URL API Base URL to access Open Connectors.
User Secret User secret of Open Connectors.
Organization Secret Organization secret of Open Connectors.
Note
For more information about Open Connectors API Base URL, see Base URLs
For more information about supported remote systems and data sources, see SAP Note 2693555 .
Getting Started with Open Connectors
1. Set up an SAP BTP Open Connectors service. For more information about setting up a trial version, see Enable SAP BTP Open Connectors in Trial .
2. In the SAP BTP Open Connectors landing page, choose the Connectors tab.
3. Choose Authenticate Create Instance to create an authenticated connector instance.
4. Record the following information about Open Connectors. You will need this information when creating the connection in the SAP Data Intelligence Connection Management application.
● Organization secret and user secret
● Instance ID of the connector instance
SAP Data Intelligence supports all open connectors from the following hub types. For more information on the connectors for each of the hub types, see SAP Open Connectors documentation.
● CRM
● DB
● General
● Marketing
● Social
Note
These hub types are experimental. Experimental features are not part of the officially delivered scope that SAP guarantees for future releases - this means that experimental features may be changed by SAP at any time for any reason without notice.
Experimental features are NOT FOR PRODUCTIVE USE. You may not demonstrate, test, examine, evaluate or otherwise use the experimental features in a live operating environment or with data that has not been sufficiently backed up.
4.4.31 OPENAPI
Provides connection and access information to a server according to an OpenAPI 2.0 specification.
Attributes
Attribute Description
Host The service's host.
Port The service's port.
Protocol Connection protocol (HTTP or HTTPS)
Basepath The service's base path.
Authentication Type Authentication type that the application must use. You can select NoAuth, Basic, OAuth2, or ApiKey.
If you've selected Basic, then provide values for the following attributes:
● User: Authentication user
● Password: Authentication password
If you have selected OAuth2, then provide values for the following attributes:
● OAuth2 Grant Type: OAuth2 grant type
● OAuth2 Token Endpoint: OAuth2 token endpoint
● OAuth2 User: OAuth2 user
● OAuth2 Password: OAuth2 password
● OAuth2 Client Id: OAuth2 client ID
● OAuth2 Client Secret: OAuth2 client secret
● OAuth2 Authorization Server: Server for the authorization code grant type
● OAuth2 Scope: OAuth2 scope
● OAuth2 Resource: OAuth2 resource
If you have selected ApiKey, then provide values for the following attributes:
● ApiKey Name: Name of the ApiKey parameter
● ApiKey Type: Type of the ApiKey (header or query)
● ApiKey Value: Value of the ApiKey, if ApiKey authentication is chosen
HTTP Headers Additional HTTP headers that can be passed with the request in the form of name and value pairs.
TLS skips verify If set to true, the client verifies neither the server's certificate chain nor the host name.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.32 ORACLE
Provides connection and access information to an Oracle database. The versions supported for connection are Oracle 12c, Oracle 18c, and Oracle 19c.
Note
Support for version 10g is deprecated and will be removed in an upcoming release.
Support for version 11g is deprecated.
Prerequisites
● You have installed the Oracle Instant Client for Linux x86-64. Select the Basic Light Package from the 12.2.x version for installation. The following are example steps for version 12.2.0.1.0.
● Create the vsolution area: mkdir -p oracle_vsolution/content/files/flowagent.
● Create a vsolution manifest as oracle_vsolution/manifest.json:
Sample Code
{
  "name": "vsolution_oracle",
  "version": "1.0.0",
  "format": "2",
  "dependencies": []
}
Note
If you need to upload a new driver later, you can modify the version (for example, 1.0.1, 2.0.0, and so on) to upload a new vsolution. Then simply modify your layering strategy appropriately.
● Extract the zip file and create the necessary symbolic links using the following commands:
unzip instantclient-basic-linux.x64-12.2.0.1.0.zip -d oracle_vsolution/content/files/flowagent/
cd oracle_vsolution/content/files/flowagent/instantclient_12_2
ln -s libclntsh.so.12.1 libclntsh.so.12
ln -s libclntsh.so.12 libclntsh.so
ln -s libclntshcore.so.12.1 libclntshcore.so.12
ln -s libclntshcore.so.12 libclntshcore.so
ln -s libocci.so.12.1 libocci.so.12
ln -s libocci.so.12 libocci.so
cd ../../../..
Note
The steps above are valid for version 12.2.0.1.0. For your specific version, refer to the readme file that came with the installation package and follow its steps for creating symbolic links. The zip file name can differ depending on the driver version downloaded, so change it accordingly.
● Create a properties file oracle_vsolution/content/files/flowagent/oracle.properties with the Oracle Instant Client path relative to the location of the properties file:
ORACLE_INSTANT_CLIENT=./instantclient_12_2
NLS_LANG=AMERICAN_AMERICA.UTF8
● (Optional) Configure TLS:
1. Create the orapki directory: mkdir -p oracle_vsolution/content/files/flowagent/orapki
2. Find and copy oraclepki.jar, cryptoj.jar, osdt_core.jar, osdt_cert.jar, and ojmisc.jar to this directory.
Note
You can find these files in the $ORACLE_HOME/jlib folder from your server or client Oracle installation.
3. Add the entry to oracle.properties: ORACLE_ORAPKI_PATH=./orapki
● Compress the vsolution from within the oracle_vsolution directory: zip -y -r oracle_vsolution.zip ./
● Import the vsolution in System Management.
1. Start the System Management application from the Launchpad.
Note
You require the Tenant Administrator role.
2. Choose the Tenant tab.
3. Click +, and select the newly created oracle_vsolution.zip.
4. After the import is complete, choose the Strategy sub-tab, and then click Edit.
5. Click +, and then select your newly imported solution vsolution_oracle-1.0.0.
6. Click Save.
Note
For changes to take effect, restart the Flowagent application.
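The steps above can be consolidated into a single script. The sketch below only scaffolds the vsolution layout (manifest and properties file); the steps that need the downloaded Instant Client zip are shown as comments. File names assume version 12.2.0.1.0, as in the steps above.

```shell
#!/bin/sh
# Sketch: scaffold the Oracle driver vsolution described above.
set -e
BASE=oracle_vsolution/content/files/flowagent
mkdir -p "$BASE"

# vsolution manifest
cat > oracle_vsolution/manifest.json <<'EOF'
{
  "name": "vsolution_oracle",
  "version": "1.0.0",
  "format": "2",
  "dependencies": []
}
EOF

# properties file pointing at the Instant Client directory
cat > "$BASE/oracle.properties" <<'EOF'
ORACLE_INSTANT_CLIENT=./instantclient_12_2
NLS_LANG=AMERICAN_AMERICA.UTF8
EOF

# Steps that require the downloaded driver (run manually):
#   unzip instantclient-basic-linux.x64-12.2.0.1.0.zip -d "$BASE"
#   (cd "$BASE/instantclient_12_2" && ln -s libclntsh.so.12.1 libclntsh.so.12 && \
#    ln -s libclntsh.so.12 libclntsh.so)   # ...plus the remaining symlinks
#   (cd oracle_vsolution && zip -y -r oracle_vsolution.zip ./)
echo "scaffolded: $(ls oracle_vsolution)"
```

After adding the driver files, compress and import the solution exactly as in the numbered steps above.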
Operation
This connection type allows:
● Browsing Oracle DB schemas and tables in Metadata Explorer
● Previewing Oracle DB tables and views in Metadata Explorer
● Profiling Oracle DB tables and views in Metadata Explorer
● Indexing Oracle DB schemas in Metadata Explorer
● On-premise connectivity via SAP Cloud Connector
You can create the ORACLE connection and use the SAP Data Intelligence Modeler to:
● Read tables and views in Modeler
● Read data from SQL queries in Modeler
● Execute native SQL DDL or DML statements in Modeler
Attributes
Attribute Description
Version The Oracle DB version, for example, Oracle 19c.
Host Hostname of the Oracle DB server
Port Port number of the Oracle DB server
SID or Service Name Database instance or alias with which you want to establish a connection
Use TLS Enable encrypted connection
Note
To use Oracle with TLS, you must upload the orapki tool utilities to System Management. To learn how to upload and configure the orapki tool utilities, see SAP Note 2988193 .
Validate host certificate Validate the server certificate
Hostname in certificate Validate if hostname is in certificate
User Oracle DB user with privileges to connect to the database instance.
Password Password of the Oracle DB user.
Additional session parameters Allows you to set session-specific variables.
Note
For information about supported versions, see SAP Note 2693555 .
4.4.33 OSS
Provides connection and access information to Alibaba Cloud Object Storage Service (OSS).
Operation
You can create the OSS connection and use the SAP Data Intelligence Modeler to:
● Read and write files using the Read File and Write File operators.
● Copy, rename, and remove files using the Copy File, Move File, and Remove File operators respectively.
Attributes
Attribute Description
Endpoint Endpoint URL of the OSS server. The protocol prefix isn't required. For example, "oss.aliyuncs.com".
Protocol The protocol to be used. It overwrites the value from "Endpoint", if already set.
Region The region used to authenticate and operate over. For example, "oss-cn-hangzhou".
Access Key The user's Access Key ID used to authenticate.
Secret Key The user's Secret Access Key used to authenticate.
Root Path The optional root path name for browsing. The value starts with a slash. For example, /MyFolder/MySubfolder.
If you've specified the Root Path, then any path used with this connection is prefixed with the Root Path.
Note
For information about supported versions, see SAP Note 2693555 .
4.4.34 REDSHIFT
Allows connectivity to Amazon Redshift databases.
Prerequisites
● Install the Amazon Redshift ODBC driver (64-bit .rpm version) for Linux operating systems.
● Create the vsolution area: mkdir -p redshift_vsolution/content/files/flowagent/redshift.
● Create a vsolution manifest as redshift_vsolution/manifest.json:
Sample Code
{
  "name": "vsolution_redshift",
  "version": "1.0.0",
  "format": "2",
  "dependencies": []
}
Note
If you need to upload a new driver later, you can modify the version (for example, 1.0.1, 2.0.0, and so on) to upload a new vsolution. Then simply modify your layering strategy appropriately.
● Extract the downloaded RPM file:
○ On Windows, extract the RPM file using a file archiver.
○ On Linux, install the package or extract it using the rpm2cpio tool:
Sample Code
rpm2cpio AmazonRedshiftODBC-64-bit-
● Copy extracted files to vsolution:
Sample Code
cp
cp
● (Optional) Configure the driver to display proper error messages. To configure:
Sample Code
cp -R
● Create a properties file redshift_vsolution/content/files/flowagent/redshift.properties with driver path relative to the location of the properties file:
Sample Code
AMAZON_REDSHIFT_DRIVERMANAGER=./redshift/libamazonredshiftodbc64.so
● Compress the vsolution from within the redshift_vsolution directory: zip -r redshift_vsolution.zip ./
● Import the vsolution in System Management.
1. Start the System Management application from the Launchpad.
Note
You require the Tenant Administrator role.
2. Choose the Tenant tab.
3. Click +, and then select the newly created redshift_vsolution.zip.
4. After the import is complete, choose the Strategy sub-tab, and then click Edit.
5. Click +, and then select your newly imported solution vsolution_redshift-1.0.0.
6. Click Save.
Note
For changes to take effect, restart the Flowagent application.
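Some of the extraction and copy commands above are truncated in this copy of the guide. The sketch below shows what the full sequence might look like; every driver file name, including the 1.4.52 version, is a hypothetical example, so substitute the names from the package you actually downloaded. Only the scaffolding part is executable here; the driver-dependent steps are comments.

```shell
#!/bin/sh
# Sketch: scaffold the Redshift vsolution and outline the driver copy steps.
set -e
mkdir -p redshift_vsolution/content/files/flowagent/redshift

# properties file naming the driver manager library
cat > redshift_vsolution/content/files/flowagent/redshift.properties <<'EOF'
AMAZON_REDSHIFT_DRIVERMANAGER=./redshift/libamazonredshiftodbc64.so
EOF

# Driver-dependent steps (file names are assumptions; adjust to your rpm):
#   rpm2cpio AmazonRedshiftODBC-64-bit-1.4.52.1000-1.x86_64.rpm | cpio -idmv
#   cp opt/amazon/redshiftodbc/lib/64/libamazonredshiftodbc64.so \
#      redshift_vsolution/content/files/flowagent/redshift/
#   cp -R opt/amazon/redshiftodbc/ErrorMessages \
#      redshift_vsolution/content/files/flowagent/redshift/  # readable errors
#   (cd redshift_vsolution && zip -r redshift_vsolution.zip ./)
echo "scaffolded: $(ls redshift_vsolution)"
```

The exact path inside the extracted rpm varies between driver releases, so verify where the .so file lands before copying.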
Operation
The Amazon Redshift connection type allows you to:
● View connection status in Connection Management
● Browse remote objects in Metadata Explorer
● Obtain fact sheets of remote objects in Metadata Explorer
● Preview content of remote objects in Metadata Explorer
● Publish remote objects in the Metadata Explorer
● Read tables and views in Modeler
● Read data from SQL queries in Modeler
● Execute native SQL DDL/DML statements in Modeler
Attributes
Attribute Description
Version The Amazon Redshift version
Host The host name of the Redshift server
Port The port number of the Redshift server
Attribute Description
Database name The database name to connect
SSL Mode The SSL certificate verification mode to use when connecting.
User Redshift user with privileges to connect to the database
Password Password of the Redshift user
Additional session parameters This attribute allows you to set session-specific variables
SSL Mode allows the following options:
● prefer: if the server supports it, the data is encrypted.
● disable: data is not encrypted.
● allow: if the server requires it, data is encrypted.
● require: data is always encrypted.
● verify-ca: data is always encrypted, and the server certificate is validated.
● verify-full: data is always encrypted, the server certificate is validated, and the server hostname must match the one in the certificate.
If you select verify-ca or verify-full, provide Redshift's certificate via the Certificates tab in the Connection Management application.
Datatype Conversion
A conversion is performed from Redshift datatypes to an agnostic set of types, as shown below.
Redshift Datatype Mapped Datatype
BIGINT BIGINT
DECIMAL(p, s) DECIMAL(p, s)
DOUBLE PRECISION FLOATING(8)
INTEGER INTEGER(4)
REAL FLOATING(4)
SMALLINT INTEGER(4)
BOOLEAN STRING(5)
CHAR(s) STRING(s)
VARCHAR(s) STRING(s)
DATE DATE
TIMESTAMP DATETIME
TIMESTAMPTZ DATETIME
GEOMETRY STRING(127)
Access Firewall
The network for Amazon Redshift instances is protected by a firewall that controls incoming traffic.
For SAP Data Intelligence, cloud edition, Connection Management exposes the IP address via a read-only connection called INFO_NAT_GATEWAY_IP. Add this IP address to the allowlist in the Amazon Redshift dashboard.
For SAP Data Intelligence, on-premise edition, as a Kubernetes administrator, obtain the public IP addresses of all the nodes and add them to the allowlist in the Amazon Redshift dashboard.
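For a Redshift cluster managed in AWS, the allowlisting step can also be done from the command line. This is a sketch: the security group ID is a hypothetical placeholder, the IP uses a documentation range, and 5439 is the default Redshift port; read the real NAT gateway IP from the INFO_NAT_GATEWAY_IP connection. The script only prints the aws command for review rather than executing it.

```shell
#!/bin/sh
# Sketch: allowlist the Data Intelligence NAT gateway IP for Redshift ingress.
NAT_IP="203.0.113.7"                  # placeholder; use the INFO_NAT_GATEWAY_IP value
SECURITY_GROUP="sg-0123456789abcdef0" # placeholder; your cluster's security group
CMD="aws ec2 authorize-security-group-ingress \
  --group-id $SECURITY_GROUP \
  --protocol tcp --port 5439 \
  --cidr ${NAT_IP}/32"
echo "$CMD"   # review, then run the printed command with AWS credentials configured
```

Because the NAT gateway IP is static for the lifetime of the cluster, the rule only needs to be created once.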
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.35 RSERVE
Provides connection to an RServe server.
Operation
RServe is a TCP/IP server that allows other programs to communicate with one or several R sessions. You can create this connection type and use it in the R Client operator in the SAP Data Intelligence Modeler.
Attributes
Attribute Description
Host Hostname or the IP address of the server
Port Port of the server
User Username to authenticate with the RServe server
Password Password for the RServe server
Note
For information about supported versions, see SAP Note 2693555 .
4.4.36 S3
Provides connection and access information to objects in Amazon S3 or compatible services such as Minio and Rook.
Operation
The S3 connection type allows:
● Browsing buckets, folders, and files on an Amazon S3 endpoint
● Obtaining file metadata
● Profiling data
● Copying and deleting files in Amazon S3 buckets
● Performing flowgraph tasks with Amazon S3 files as the source and/or target
You can create the S3 connection and use the SAP Data Intelligence Modeler to:
● Read and write files using the Read File and Write File operators.
● Copy, rename, and remove files using the Copy File, Move File, and Remove File operators respectively.
Note
When connecting to an Amazon S3 compatible service such as Minio, enter an endpoint that is an IP address rather than the hostname.
Attributes
Attribute Description
Endpoint Endpoint URL of the Amazon S3 server. The protocol prefix is not required. For example, s3.amazonaws.com.
Protocol Protocol (HTTP or HTTPS). The default value is HTTPS. The value that you provide overwrites the value from the Endpoint, if already set.
Region The region used to authenticate and operate over. For example, us-east-1.
Access Key The access key ID of the user that the application must use to authenticate.
Secret Key The secret access key of the user that the application must use to authenticate.
Attribute Description
Root Path The optional root path name for browsing objects. The value starts with a slash. For example, /MyFolder/MySubfolder.
If you have specified the Root Path, then any path used with this connection is prefixed with the Root Path.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.37 SAP_IQ
Provides connection and access information to objects in SAP IQ databases.
Operation
The SAP_IQ connection type allows you to:
● View connection status in Connection Management.
● Browse remote objects in Metadata Explorer.
● View fact sheets of remote objects in Metadata Explorer.
● Preview content of remote objects in Metadata Explorer.
● Publish remote objects in Metadata Explorer.
● Prepare data using the Preparation application in Metadata Explorer.
● Run rules in Metadata Explorer.
● Read tables and views in Modeler.
● Read data from SQL queries in Modeler.
● Process native SQL DDL/DML statements in Modeler.
Attributes
Attribute Description
Version SAP IQ version
Host Host name of the SAP IQ server.
Port Port number of the SAP IQ server.
Use TLS Enable encrypted connection.
Attribute Description
Validate host certificate Validate the server certificate.
Hostname in certificate Validate if hostname is in certificate.
Database name Database name to connect.
Server name SAP IQ server name to connect (also referred to as the Engine name).
User SAP IQ user with privileges to connect to the database name.
Password Password of the SAP IQ user.
Additional session parameters Allows you to set session-specific variables.
Datatype Conversion
A conversion is performed from SAP IQ datatypes to an agnostic set of types, called SAP Data Intelligence datatypes, as shown below. SAP IQ datatypes that don't have a corresponding SAP Data Intelligence datatype are not supported.
SAP IQ Datatype SAP Data Intelligence Datatype
BIGINT DECIMAL(20, 0)
UNSIGNED BIGINT DECIMAL(20, 0)
BIT INTEGER(4)
DECIMAL(p,s) DECIMAL(p, s)
DOUBLE FLOATING(8)
FLOAT FLOATING(4)
INT INTEGER(4)
UNSIGNED INT INTEGER(4)
INTEGER INTEGER(4)
UNSIGNED INTEGER INTEGER(4)
MONEY DECIMAL(19,4)
NUMERIC(p,s) DECIMAL(p, s)
REAL FLOATING(4)
SMALLINT INTEGER(4)
SMALLMONEY DECIMAL(10,4)
TINYINT INTEGER(4)
UNIQUEIDENTIFIER
UNIQUEIDENTIFIERSTR STRING(36)
CHAR(s) STRING(s)
VARCHAR(s) STRING(s)
SAP IQ Datatype SAP Data Intelligence Datatype
SYSNAME STRING(30)
DATE DATE
DATETIME DATETIME
SMALLDATETIME DATETIME
TIME TIME
TIMESTAMP DATETIME
BINARY
VARBINARY
BLOB LARGE_BINARY_OBJECT
CLOB LARGE_CHARACTER_OBJECT
IMAGE LARGE_BINARY_OBJECT
LONG BINARY LARGE_BINARY_OBJECT
TEXT LARGE_CHARACTER_OBJECT
Note
For information about supported versions, see SAP Note 2693555 .
4.4.38 SDL
Provides a connection to remote object stores.
Operation
With a connection of type SDL, you can:
● Use it in model training and model serving, or with the artifact producer and artifact consumer in Machine Learning scenarios.
● Experiment with data using JupyterLab in the context of Machine Learning.
Connection
Use the predefined connection DI_DATA_LAKE to access the Semantic Data Lake. This connection is managed by SAP and cannot be modified.
Usage
A connection of type SDL includes the following directories:
● /: the root directory. You cannot create directories or files under the root directory.
● /shared/sap: should be reserved for storing content and data that is produced by SAP components.
● /external: provides read-only access to connections for an authorized user.
● /shared: available by default for read-write access on the Object Store Type defined in the connection.
Note
Previous releases included a /worm directory intended for immutable artifacts. If your version includes a /worm directory, note that it has been deprecated and is now read-only. You can migrate the content from the /worm directory to a new directory. After migration, a Tenant Administrator can delete the /worm directory permanently.
4.4.39 SFTP
Allows connection to an SSH File Transfer Protocol server.
Features
● Read and write files in Modeler via the Read File and Write File operators.
● Remove files in Modeler via the Remove File operator.
Attributes
Attribute Description Mandatory
Hostname Host name or IP address of the SFTP server. yes
Port Port number of the SFTP server. If none is set, the default value for the selected protocol is used. no
Host Key Host's public key, usually in the .pub format. The section Obtaining Host Key [page 118] explains the process to retrieve a host's public key. yes
Authentication Type The authentication type to be used. SSH_Key (default) and password are available. yes
User The user accessing the SFTP server. yes
Private Key The user's SSH private key used for SSH_Key authentication. The server must know the user's SSH public key. no
Use Passphrase Whether a passphrase should be used for the private key. yes
Passphrase The passphrase needed to decrypt the private key in SSH_Key authentication. no
Password The user's password used for authentication. no
Root Path The optional root path name for browsing. It starts with a slash. For example, /MyFolder/MySubfolder. Any path you use with this connection is prefixed with this root path. no
Proxy Optionally configure a connection-specific proxy server; specify type, host, and port. SOCKS (default) type is available. no
Obtaining Host Key
Usually, the host's public key should be provided through a trustable channel.
If your machine has a trustable channel and a Unix-compliant shell with both the ssh-keyscan and sed commands (typically preinstalled on Linux systems), obtain the key by executing the following command:
ssh-keyscan $HOST 2>/dev/null | sed "s/^[^ ]* //" > host_key.pub.txt
To run the command, you may either set the variable HOST or directly change the $HOST string within the command to your Hostname value. After running the command, a file host_key.pub.txt is created in the current working directory. You may use it as the Host Key in the connection.
If you are using a custom Port value, you should also put it into the -p parameter. For that, use the following command, either setting the PORT variable or changing the $PORT string within the command to your Port value.
ssh-keyscan $HOST -p $PORT 2>/dev/null | sed "s/^[^ ]* //" > host_key.pub.txt
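Before trusting the fetched key, compare its fingerprint with one obtained over a trusted channel. ssh-keygen -lf host_key.pub.txt prints the SHA256 fingerprint base64-encoded; the sketch below computes the same digest in hex using only coreutils, with a stand-in key for demonstration, so you can cross-check even on minimal systems.

```shell
#!/bin/sh
# Sketch: compute the SHA-256 fingerprint of a public key with coreutils only.
# The key written here is a stand-in for demonstration; point the awk command
# at the host_key.pub.txt you fetched with ssh-keyscan above.
set -e
printf 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDemoDemoDemoDemoDemoDemoDemoDemoDemoDemoDemoAAA demo-host\n' > host_key.pub.txt

# Field 2 of a public key line is the base64-encoded key blob; hash its bytes.
awk '{print $2}' host_key.pub.txt | base64 -d | sha256sum | cut -d' ' -f1
```

Note that sha256sum prints the digest in hex, while ssh-keygen shows the same digest base64-encoded, so convert before comparing the two forms.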
4.4.40 SMTP
Provides connection and access information to an SMTP server.
Operation
The SMTP connection type:
● Provides a protocol to send emails
● Follows the protocol standard defined in RFC 2821
You can create the SMTP connection and use it in the Send Email operator in the SAP Data Intelligence Modeler.
Attributes
Attribute Description
Host Hostname of the SMTP server
Port Port of the SMTP server
User Username to authenticate with the SMTP server
Password Password of the user on the SMTP server
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.41 SNOWFLAKE
Allows connectivity to Snowflake databases.
Prerequisites
You have installed the ODBC driver. To install it, follow these steps:
● Download the Snowflake ODBC driver for Linux:
○ Select version 2.23.2.
○ Select the TGZ format (TAR file compressed using GZIP).
● Create the vsolution area: mkdir -p snowflake_vsolution/content/files/flowagent.
● Create a vsolution manifest as snowflake_vsolution/manifest.json:
Sample Code
{
  "name": "vsolution_snowflake",
  "version": "1.0.0",
  "format": "2",
  "dependencies": []
}
Note
If you need to upload a new driver later, you can modify the version (for example, 1.0.1, 2.0.0, and so on) to upload a new vsolution. Then simply modify your layering strategy appropriately.
● Extract the downloaded compressed TAR archive:
Sample Code
tar -xvzf snowflake_linux_x8664_odbc-2.23.2.tgz -C snowflake_vsolution/content/files/flowagent/
● Create a properties file snowflake_vsolution/content/files/flowagent/snowflake.properties with the driver manager path relative to the location of the properties file:
Sample Code
SNOWFLAKE_DRIVERMANAGER=./snowflake_odbc/lib/libSnowflake.so
Note
The exact path for the SNOWFLAKE_DRIVERMANAGER variable may change depending on the downloaded driver version.
● Create a driver configuration file snowflake_vsolution/content/files/flowagent/snowflake_odbc/lib/simba.snowflake.ini with the following contents:
Sample Code
[Driver]
ErrorMessagesPath=../ErrorMessages
CABundleFile=./cacert.pem
DriverManagerEncoding=UTF-16
● Compress the vsolution from within the snowflake_vsolution directory:
Sample Code
cd snowflake_vsolution
zip -r snowflake_vsolution.zip ./
● Import the vsolution in System Management. 1. Start the System Management application from the Launchpad.
Note
You require the Tenant Administrator role.
2. Click the Tenant tab.
3. Click +, and then select the newly created snowflake_vsolution.zip.
4. After the import is complete, choose the Strategy sub-tab, and then click Edit.
5. Click +, and then select your newly imported solution vsolution_snowflake-1.0.0 and click Add.
6. Click Save.
Note
For changes to take effect, restart the Flowagent application.
Operation
The Snowflake connection type allows:
● Viewing connection status in Connection Management
● Browsing remote objects in Metadata Explorer
● Obtaining fact sheets of remote objects in Metadata Explorer
● Previewing content of remote objects in Metadata Explorer
● Reading tables and views in Modeler
● Reading data from SQL queries in Modeler
● Executing native SQL DDL/DML statements in Modeler
Attributes
Attribute Description
Host Host name of the Snowflake server
Port Port number of the Snowflake server. (Default value: 443)
Database Database name to connect
User Snowflake user with privileges to connect to the database
Password Password of the Snowflake user
Additional session parameters Allows you to set session-specific variables
Datatype Conversion
A conversion is performed from Snowflake datatypes to an agnostic set of types, as shown below:
SAP HANA Data Lake Database Datatype Mapped Datatype
BIGINT DECIMAL(20, 0)
UNSIGNED BIGINT DECIMAL(20, 0)
BIT INTEGER(4)
DECIMAL(p,s) DECIMAL(p, s)
DOUBLE FLOATING(8)
FLOAT FLOATING(4)
INT INTEGER(4)
UNSIGNED INT INTEGER(4)
INTEGER INTEGER(4)
UNSIGNED INTEGER INTEGER(4)
MONEY DECIMAL(19,4)
NUMERIC(p,s) DECIMAL(p, s)
REAL FLOATING(4)
SMALLINT INTEGER(4)
SMALLMONEY DECIMAL(10,4)
TINYINT INTEGER(4)
UNIQUEIDENTIFIER
VARCHAR(s) STRING(s)
SYSNAME STRING(30)
DATE DATE
DATETIME DATETIME
SMALLDATETIME DATETIME
TIME TIME
TIMESTAMP DATETIME
BINARY
VARBINARY
BLOB LARGE_BINARY_OBJECT
CLOB LARGE_CHARACTER_OBJECT
IMAGE LARGE_BINARY_OBJECT
LONG BINARY LARGE_BINARY_OBJECT
TEXT LARGE_CHARACTER_OBJECT
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.42 VORA
Provides connection and access information to SAP Vora catalog service.
Note
The SAP Vora connection type is deprecated and will be removed in future releases. For more information, see SAP Note 3036489 .
Operation
The VORA connection type allows:
● Browsing schemas and tables in an SAP Vora instance
● Obtaining SAP Vora table metadata
● Profiling data
● Loading other cloud storages, such as Amazon S3, Google Cloud Storage, Microsoft Azure Data Lake, and Microsoft Windows Azure Storage Blobs
Attributes
Attribute Description
Host Host name or IP address of the SAP Vora instance, for example, vora-tx-coordinator-ext. To establish a connection to the local SAP Vora installation, use the information displayed in the Vora Tools connection information. For example, "vora-tx-coordinator:10002".
Port Enter the SQL port of your HANA database. In case of a single DB it is 3
To find the port, run: SELECT DATABASE_NAME, SERVICE_NAME, PORT, SQL_PORT FROM SYS_DATABASES.M_SERVICES
To access the internal VORA, use 10004.
Tenant Tenant name of System Management.
Use TLS This flag should be set to true. The default value is false, so you need to switch it. This flag tells the application whether to connect to the server over TLS.
User The user name that the application uses to authenticate with the server.
Password The password used to authenticate with the server.
Attribute Description
Storage Connection (Optional) A storage connection that is used as a data source and to back up the tables from SAP Vora. This option is used by data transform when using Vora as a target. Data is written to a cloud source, and then a LOAD statement is executed on the specified Vora table.
Storage Base Folder (Optional) The directory relative to the root path in Storage Connection where SAP Vora stores and browses files. This option is used by data transform when using Vora as a target. Data is written to a cloud source, and then a LOAD statement is executed on the specified Vora table.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.4.43 WASB
Provides connection and access information to objects in Microsoft Windows Azure Storage Blobs (WASB).
Operation
The WASB connection type allows:
● Browsing folders and files in the Microsoft Windows Azure Storage Blob server
● Obtaining file metadata
● Profiling data
● Previewing data
● Performing flowgraph tasks with Microsoft Windows Azure Storage Blob files as the source and/or target
You can create the WASB connection and use the SAP Data Intelligence Modeler to:
● Read and write files using the Read File and Write File operators.
● Copy, rename, and remove files using the Copy File, Move File, and Remove File operators respectively.
Attributes
Attribute Description
Protocol Protocol (wasbs or wasb). The default value is wasbs, which uses HTTPS.
Account Name Account name used in the Shared Key authentication
Attribute Description
Account Key Account key used in the Shared Key authentication.
Endpoint Suffix The endpoint suffix. The default value is "core.windows.net".
Root Path The optional root path name for browsing objects. The value starts with a slash. For example, /MyFolder/MySubfolder.
If you have specified the Root Path, then any path used with this connection is prefixed with the Root Path.
Note
For more information about supported remote systems and data sources, see SAP Note 2693555 .
4.5 Using SAP Cloud Connector Gateway
The SAP Cloud Connector is a proxy application that allows SAP Data Intelligence in the cloud to access on-premise connections.
Several connection types support the usage of a gateway. If a connection type supports a gateway and the gateway has been registered during the installation of the SAP Data Intelligence instance, you see an additional field Gateway in the create connection or update connection page in the Connection Management application.
Note
This feature is available for the following connection types:
● ABAP (only RFC)
● BW
● DATASERVICES
● DB2
● HANA_DB
● HTTP
● INFORMATION_STEWARD
● MSSQL
● MYSQL
● OPENAPI
● ORACLE
To establish the connectivity to the on-premise connection via the cloud connector proxy, in the Gateway field, choose SAP Cloud Connector gateway.
For the following connection types, you can access cloud connector instances with locations other than the default ID by setting the Location ID field in the Gateway section.
● ABAP (only RFC)
● BW
● CLOUD_DATA_INTEGRATION
● DATASERVICES
● DB2
● HANA
● INFORMATION_STEWARD
● MSSQL
● MYSQL
● ODATA
● OPENAPI
● ORACLE
Related Information
Troubleshooting SAP Cloud Connector [page 160] Configure SAP Cloud Connector [page 15]
4.6 (Mandatory) Configure Authorizations for Supported Connection Types
To perform the supported operations of a connection type, various privileges are required in the source system.
You must configure minimum or mandatory authorizations that are needed at the source to perform all the supported operations and access data from SAP Data Intelligence.
Example
When you create a connection to the BW system and define the parameters to use the connection, you must also configure other authorizations in the BW system so that you can perform the supported operations of BW connection type.
Related Information
HANA_DB [page 127]
4.6.1 HANA_DB
Prerequisites to configure an SAP HANA database as a data source in SAP Data Intelligence.
Read or Extract Metadata
● HANA tables or SQL views: GRANT SELECT ON SCHEMA
Change Data Capture and Load to HANA Table
GRANT ALTER|CREATE ANY|CREATE TEMPORARY TABLE|CREATE VIRTUAL PACKAGE|DELETE|DROP|EXECUTE|INDEX|INSERT|TRIGGER|UPDATE ON SCHEMA
GRANT ALTER|DELETE|DROP|EXECUTE|INDEX|INSERT|TRIGGER|UPDATE|REFERENCES ON "
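The GRANT lines above are garbled and truncated in this copy of the guide. The statements below are a sketch of how concrete grants might look, assuming a hypothetical schema MYSCHEMA, table MYTABLE, and Data Intelligence user DI_USER; check the privilege list against your HANA version before running them via hdbsql or another SQL console.

```shell
#!/bin/sh
# Emit sample HANA GRANT statements. Schema, table, and user names are
# placeholders; substitute your own before executing the SQL.
SQL_STMTS=$(cat <<'SQL'
-- read or extract metadata from tables and SQL views
GRANT SELECT ON SCHEMA MYSCHEMA TO DI_USER;

-- change data capture and load, granted schema-wide
GRANT CREATE ANY, DELETE, DROP, EXECUTE, INDEX, INSERT, TRIGGER, UPDATE
  ON SCHEMA MYSCHEMA TO DI_USER;

-- or granted on a single table
GRANT ALTER, DELETE, DROP, INDEX, INSERT, TRIGGER, UPDATE, REFERENCES
  ON "MYSCHEMA"."MYTABLE" TO DI_USER;
SQL
)
printf '%s\n' "$SQL_STMTS"
```

Granting on a single table rather than the whole schema keeps the Data Intelligence user's footprint in the source system as small as possible.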
4.7 Allowing SAP Data Intelligence Access Through Firewalls
You can allow SAP Data Intelligence to access external data sources that are protected by a firewall or other network access tools by adding the SAP Data Intelligence IP address to the data source's allowlist.
Your SAP Data Intelligence cluster uses a network address translation (NAT) gateway to enable the cluster to connect to the internet and other network endpoints that are available on its network, including those accessed via VPC peering, VNet peering, or VPN. This IP address is static and will not change throughout the lifetime of the SAP Data Intelligence cluster.
You can find the external IP address in the Description field of the read-only connection with the ID INFO_NAT_GATEWAY_IP in the Connection Management application.
5 SAP Data Intelligence Monitoring
SAP Data Intelligence provides a stand-alone monitoring application to monitor the status of graphs executed in the Modeler.
The Monitoring application provides capabilities to visualize the summary of graphs executed in the Modeler with relevant charts. Additionally, the application allows you to schedule graph executions. For each graph instance, the application provides details such as graph execution status, time of execution, graph type, graph source, and more.
The application allows you to open a graph in the SAP Data Intelligence Modeler, view graph configurations, or stop process executions.
Related Information
Log In to SAP Data Intelligence Monitoring [page 128] Using the Monitoring Application [page 130]
5.1 Log In to SAP Data Intelligence Monitoring
You can access the SAP Data Intelligence Monitoring application from the SAP Data Intelligence Launchpad or directly launch the application with a stable URL.
Procedure
1. Launch the SAP Data Intelligence Launchpad user interface in a browser using one of the following URLs:
○ For on-premise installations (if TLS is enabled):
https://
○
https://
○
○
kubectl get ingress -n $NAMESPACE
The welcome screen appears where you can enter the login credentials. 2. Log in to the SAP Data Intelligence Launchpad application using the following information:
○ Tenant ID ○ Your user name ○ Your password
The SAP Data Intelligence Launchpad opens and displays the initial home page. User details and the tenant ID are displayed in the upper-right area of the screen. The home page displays all applications available in the tenant. 3. On the home page, choose Monitoring. The application user interface opens displaying the initial screen.
5.2 Using the Monitoring Application
The SAP Data Intelligence Monitoring application offers the following capabilities.
Actions Description
Visualize various aspects of graph execution: The Analytics tab on the home page of the monitoring application displays tiles depicting graph execution status.
● The Status tile displays a pie chart with information on:
○ The number of graph instances executed in the Modeler.
○ The status of graph instances executed. Each sector in the pie chart represents a graph state.
● The Runtime Analysis tile displays a scatter chart with a time-period axis and a graph-execution-duration (in seconds) axis. This chart plots each graph execution against its time of execution and duration of execution. Each point in the chart represents a graph instance. Place the cursor on a point for more information on that particular graph instance. You can also configure the chart and view results for a selected time period. In the chart header, select the required time period (Hour, Day, Week).
● The Recently Executed tile displays the top five instances and their execution status. Click Show All to view all the instances.
● The Memory Usage tile displays a line chart for the memory consumption of graphs. You can filter the display by graph, status, user, or submission time. You can also see the resource usage in the last hour, day, or two days, or set a custom time range for which you want to view the resource consumption.
● The CPU Usage tile displays a line chart for the CPU consumption of graphs. You can filter the display by graph, status, user, or submission time. You can also see the resource usage in the last hour, day, or two days, or set a custom time range for which you want to view the resource consumption.
Note
The tiles in the Analytics tab don't include the information on the archived graph instances.
Actions Description
View execution details of individual graph instances: On the home page, choose the Instances tab. You can see a list view of the execution details of all graph instances.
Note
If you are a tenant administrator, you can see the graph instances of all users. By default, a filter is applied to show only the admin's graph instances. Modify the filter to view the graph instances of other users, on which you can perform only limited actions.
For each graph instance, the application provides information on the status of the graph execution, the graph execution name, graph name (the source of the graph in the repository), the time of execution, and more.
Click a graph instance to open the execution details pane. This pane displays more execution details and helps monitor the graph execution.
Note
By default, a filter is applied to exclude the subgraphs and the archived instances from the Instances list. You can remove a filter by clicking the icon corresponding to it in the Filters bar or through the filter options.
View subgraph execution details: When a graph execution triggers or spawns the execution of another graph, the latter is called a subgraph. You can configure the monitoring application to view execution details of all subgraph instances. You can remove the Exclude Subgraphs filter by clicking the icon corresponding to it in the Filters bar or through the filter options. The application refreshes the list view and displays all subgraph instances along with all other graph instances.
Note
All subgraph instances are named as subgraph.
Click a subgraph instance to open the execution details pane. This pane displays more execution details and helps monitor the subgraph execution.
View subgraph hierarchy details: For a selected subgraph, you can view its parent graph or the complete hierarchy of graph instances associated with the subgraph.
1. In the Instances tab, select the required subgraph instance. 2. In the Actions column, click the overflow icon. 3. Choose the Show Hierarchy menu option.
In the Hierarchy dialog box, the application displays all graph instances associated with the subgraph and groups them as a hierarchical representation. Select a graph and choose View details for more details on its execution.
Actions Description
Filter graph instances: You can filter the graph executions that the application displays based on the source name, execution status, time of execution, and more. In the Instances tab, choose the filter option and define the required filter conditions.
You can also filter and view graph instances executed on the same day, hour, or week. In the menu bar, select the required time period.
If you are a tenant administrator, you can modify the filter to view graph instances of the required users.
Open source graph: For any selected graph instance, you can launch the source graph in the Modeler application.
1. In the Instances tab, select a graph instance that you want to open in a graph editor. 2. In the Actions column, click the overflow icon. 3. Choose the Open Source Graph menu option.
The application launches the SAP Data Intelligence Modeler in a new browser tab with the source graph of the selected instance opened in the graph editor. You can view or edit graph configurations.
View graph configurations: For any selected graph instance, you can view the configurations defined for its source graph.
1. In the Instances tab, select the required graph instance. The application opens a Graph Overview pane and displays the source graph of the selected graph instance. 2. Choose Show Configuration. The application opens a Configuration pane, where you can view the graph configurations (read-only). 3. If you want to view configurations for any of the operators in the source graph, right-click the operator and choose Open Configuration. The Configuration pane displays the operator configurations.
Stop a graph instance execution: You can stop the execution of a graph only if its status is running or pending.
1. In the Instances tab, select the required graph instance. 2. In the Actions column, click the overflow icon. 3. Choose the Stop Execution menu option.
Search graph instances: In the Instances tab, use the search bar to search for any graph instance based on the instance name or its source graph name.
Archive graph instance: You can archive a graph instance from the Pipeline Engine only if its status is completed or dead.
1. In the Instances tab, select the required graph instance. 2. In the Actions column, click the overflow icon. 3. Choose Archive.
Note
The archived instances remain in the system for a default period of 90 days. The tenant administrator can configure the retention time via the System Management application.
Actions Description
Download diagnostics information and view logs: You can download diagnostic information for graphs. 1. In the Instances tab, select the required graph instance. 2. In the Actions column, click the overflow icon. 3. Choose the Download Diagnostic Info menu option.
The application automatically downloads a zipped archive of information for the graph that can help diagnose issues, if any.
For certain operators, such as the data workflow operators, you can view logs generated for the operator execution. To view these logs:
1. In the Instances tab, click the required graph instance. 2. In the bottom pane, select the Logs tab.
Managing schedules: In the Monitoring application, you can view graphs that are scheduled for execution and create, edit, or delete a schedule. For more information, see Schedule Graph Executions.
A tenant administrator can view their own schedules and the schedules of other users within the monitoring application. If you are a tenant administrator, you can:
● View other users' schedules by applying filters.
● Edit or stop users' schedules.
● Suspend or resume users' schedules.
● View executed instances of users' schedules.
6 Maintaining SAP Data Intelligence
This section describes how to maintain SAP Data Intelligence.
Related Information
On-Demand Certificate Renewal [page 134]
6.1 On-Demand Certificate Renewal
On-demand certificate renewal provides a solution for clusters with expired certificates, compromised private keys, or security vulnerabilities in crypto libraries.
You trigger this feature using the low-level tooling (dhinstaller). The dhinstaller subcommand "certificates renew" performs the renewal operation.
Currently, an SAP Data Intelligence instance contains three certificate trees:
● root certificate for client certificate authentication ● root certificate for internal communication ● root certificate for nats
The client certificate tree is used for external communication. The internal certificate and nats trees are used for internal communication inside SAP Data Intelligence.
Prerequisite: To rotate root certificates, you must be a cluster administrator.
Related Information
Certificate Rotation for Root Client Certificate and Internal Certificates [page 135] Functional Cluster [page 135] Non-Functional Cluster [page 135]
6.1.1 Certificate Rotation for Root Client Certificate and Internal Certificates
To perform certificate renewal in a functioning or non-functioning cluster, run the following command in a shell window:
kubectl -n datahub-system exec -it datahub-operator-0 -- dhinstaller certificates renew --namespace [installation namespace] --registry [registry used in installation] --stack "" --delete-client-ca
Two different sets of commands and options are used depending on the state of the cluster:
● Functional Cluster ● Non-Functional Cluster
6.1.2 Functional Cluster
To perform certificate renewal in a functioning cluster, run the following command in a shell window:
kubectl -n datahub-system exec -it datahub-operator-0 -- dhinstaller certificates renew --namespace [installation namespace] --registry [registry used in installation] --stack "" --renew-nats-ca --renew-internal-ca
If the renewal of nats ca certificate is not needed, you can omit --renew-nats-ca.
If the renewal of internal ca certificate is not needed, you can omit --renew-internal-ca.
At least one of --renew-nats-ca and --renew-internal-ca should be passed as an argument.
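Conceptually, the flag rule above amounts to the following check. This is a sketch, not actual dhinstaller code:

```shell
# Sketch (not dhinstaller's implementation): require that at least one
# of --renew-nats-ca / --renew-internal-ca was passed.
check_renew_flags() {
  nats=0; internal=0
  for arg in "$@"; do
    case "$arg" in
      --renew-nats-ca)     nats=1 ;;
      --renew-internal-ca) internal=1 ;;
    esac
  done
  if [ "$nats" -eq 0 ] && [ "$internal" -eq 0 ]; then
    echo "error: pass at least one of --renew-nats-ca, --renew-internal-ca" >&2
    return 1
  fi
}

check_renew_flags --renew-nats-ca && echo "flags ok"
```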
If a timeout occurs while switching the runlevel to Stopped, the cluster may be in a non-functional state; in that case, continue with the steps described for a non-functional cluster.
6.1.3 Non-Functional Cluster
If SAP Data Intelligence System Management is not functional (for example, because the certificates are expired), enable the --disable-vsystem-hooks flag and provide the user information. The validating webhook must also be disabled for this operation.
To perform certificate renewal in a non-functioning cluster, run the following commands in a shell window:
kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io validating-webhook-configuration -o yaml > /tmp/validating-webhook-configuration.yaml
kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io validating-webhook-configuration
kubectl -n datahub-system exec -it datahub-operator-0 -- dhinstaller certificates renew --namespace [installation namespace] --registry [registry used in installation] --stack "" --renew-nats-ca --renew-internal-ca --username [installation username] -p [installation password] --system-password [installation system password] --default-tenant-name [default tenant name] --disable-vsystem-hooks true
sed 's/Unknown/None/g' /tmp/validating-webhook-configuration.yaml > /tmp/validating-webhook-configuration2.yaml
kubectl apply -f /tmp/validating-webhook-configuration2.yaml
If the renewal of nats ca certificate is not needed, you can omit --renew-nats-ca.
If the renewal of internal ca certificate is not needed, you can omit --renew-internal-ca.
At least one of --renew-nats-ca and --renew-internal-ca should be passed as an argument.
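The sed step in the sequence above rewrites any Unknown values in the saved webhook configuration to None before re-applying it (for example, the deprecated sideEffects value Unknown is not accepted by the admissionregistration v1 API). A self-contained illustration of that substitution, using an invented YAML fragment:

```shell
# Illustrative only: show what the sed step does to a saved webhook
# configuration. The YAML below is a made-up fragment, not real output.
cat > /tmp/example-webhook.yaml <<'EOF'
webhooks:
- name: validation.example
  sideEffects: Unknown
  failurePolicy: Fail
EOF

sed 's/Unknown/None/g' /tmp/example-webhook.yaml > /tmp/example-webhook2.yaml
cat /tmp/example-webhook2.yaml
```

After the substitution, the fragment contains sideEffects: None and can be re-applied.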
7 Exporting Customer Data
You can export customer data from different SAP Data Intelligence components to a target store for various intended purposes.
For example, you can export to move, extract, or copy the customer data from one system to another, or to process the data for different business use cases. The following components of SAP Data Intelligence can contain customer data that you can export and retrieve:
● SAP Vora Database
● Application Database
● SAP Data Intelligence Files
Remember
Some parts of the stored data are user-specific. Therefore, due to access isolation, only the respective users can access and export those files. Additionally, only tenant administrators can export customer data from application databases and files from tenant layers.
In the current version, you must trigger all exports manually. To retrieve a full export of customer data, you must trigger the export in all existing tenants. This section describes the steps to export the data from the different components.
Component Steps
SAP Vora Database (using SAP HANA JDBC driver): You can use the SAP HANA JDBC driver to export customer data from the SAP Vora database.
1. Establish a connection between SAP Vora and the SAP HANA JDBC driver by providing the URL, the SAP Data Intelligence tenant-prefixed username (for example, default/admin), and the user password. You can retrieve the connection details from the Connection Management application.
Note We recommend using the official SAP HANA JDBC driver ngdbc.jar to query the SAP Vora database.
2. Use the SQL statement to retrieve all visible tables that you can export.
Source Code
SELECT * FROM SYS.TABLES WHERE SCHEMA_NAME != 'SYS'
Component Steps
SAP Vora Database (using SAP Vora Tools): You can use the SAP Vora Tools to export customer data from the SAP Vora database to cloud storage such as S3, ADL, GCS, and more.
1. Log in to SAP Data Intelligence Launchpad with the tenant admin credentials. 2. Start the SAP Vora Tools application. 3. Open the SQL editor. 4. Execute the SQL statement to retrieve all visible tables that can be exported. Sample SQL statement:
Source Code
SELECT * FROM SYS.TABLES
WHERE SCHEMA_NAME != 'SYS'
5. For every table that you want to export, run the export statement.
Source Code
EXPORT ( SELECT * FROM "
INTO S3 ( ORC ( '
Note
Quote the schema and table identifiers.
The application supports CSV, PARQUET, and ORC as the export file formats, and it supports S3, ADL, HDFS, WebHDFS, WAS, and GCS as the target stores for exports.
Component Steps
Application Database You can export the SAP HANA customer data from your landscape using the SAP Data Intelligence Customer Data Export application.
1. Log in to SAP Data Intelligence Launchpad with the tenant admin credentials. 2. Start the Customer Data Export application. 3. In the menu bar, choose Export Data. 4. In the Configure Export dialog: 1. Select a connection type. The connections of the selected connection type are listed in the Export Location dropdown.
Note
You can export only to the S3 connection type.
2. Select an existing connection and enter the bucket details. You can manage the connections in Connection Management application.
Note
For connections with endpoint URLs containing region code, the URL should follow the pattern: s3.
5. Click Export. The application displays a confirmation message on the status of the data export. 6. The exported target folder links are displayed on the Customer Data Export application home page. 7. To view the exported data, use the Browse Connections link in the Metadata Explorer. For more information about the Metadata Explorer, see Using the Metadata Explorer.
SAP Data Intelligence Files: The SAP Data Intelligence System Management application supports exporting SAP Data Intelligence files as .tgz files. You can download the content of all files by executing the following steps in the SAP Data Intelligence System Management application.
1. Log on to every tenant as the tenant administrator and export all files in the Tenant Workspace. 2. Log on as every user of every tenant and download all files in My Workspace.
In the System Management application, choose the Files tab and switch to the Split view to identify files in the Tenant Workspace and My Workspace separately.
Related Information
Log in to SAP Data Intelligence Customer Data Export [page 140]
7.1 Log in to SAP Data Intelligence Customer Data Export
You can access the SAP Data Intelligence Customer Data Export application from the SAP Data Intelligence Launchpad or directly launch the application with a stable URL.
Procedure
1. Launch the SAP Data Intelligence Launchpad user interface in a browser using one of the following URLs:
○ For on-premise installations (if TLS is enabled):
https://
where ○
https://
where ○
kubectl get ingress -n $NAMESPACE
The welcome screen appears where you can enter the login credentials. 2. Log in to the SAP Data Intelligence Launchpad application using the following information:
○ Tenant name ○ Your username ○ Your password
The SAP Data Intelligence Launchpad opens and displays the initial home page. User details and the tenant name are displayed in the upper-right area of the screen. The home page displays all applications available in the tenant. 3. On the home page, choose the Customer Data Export application. The application UI opens, displaying the initial screen.
8 Improving Performance
This section describes ways to improve SAP Data Intelligence performance.
Related Information
Improving CDC Graph Generator Operator Performance [page 141]
8.1 Improving CDC Graph Generator Operator Performance
The performance of the Change Data Capture (CDC) Graph Generator operator depends on the initial table size and how rapidly change data is generated in the source system.
To optimize and process different scenarios, the following are some helpful tips for using the CDC Graph Generator operator.
Initial Load Volume
If the source table is large, it may be helpful to partition the source data using the Partition Specification parameter. After the partitioning is defined based on the resources available, you can set the Max Number of Loaders to the appropriate value (the default is 8).
Delta Change Rate
For a rapidly changing table, it is recommended that you set Delta Graph Run Mode to polling interval with a short polling interval so that the operator fetches data rapidly as well.
For a slowly changing table, you can either set Delta Graph Run Mode to Manual and schedule the graph periodically (for example, every N hours), or set Delta Graph Run Mode to polling interval with a high polling interval.
When you use a polling interval, the graph runs continuously. In manual mode, the operator stops processing, and you can terminate the graph using a graph terminator.
Tracking Multiple Tables
The CDC Graph Generator operator can handle only a single table, so if you must track multiple tables, use one operator instance for each source-target pair. However, based on how much change data is generated in the source, you can optimize resource usage in the SAP Data Intelligence cluster instead of creating a separate graph for every table.
Example: 10 source tables, 2 are rapidly changing and 8 are slowly changing.
For rapidly changing tables, create one graph for each rapidly changing table. This graph runs indefinitely and polls the source for changes rapidly. In addition, it is recommended that you set Delta Graph Run Mode to polling interval with a short polling interval so that the operator fetches data rapidly as well.
For slowly changing tables, you can chain multiple CDC Graph Generator operators one after another in the same graph. While chaining the operators, set the Delta Graph Run Mode to Manual so that the first operator completes its data movement before the graph proceeds to the second one. While chaining, you can also group every N CDC Graph Generator operators in a single group so that they run in their own pod.
At the same time, to further save resource usage, schedule the graph to run periodically based on the delta change rate in the source. This way, you are not continuously consuming resources for slowly changing tables and are not using too many SQL connections to the source.
Recommendations
● The recommended group resource is 0.5 CPU and 1500M memory. More than 2 CPU is not helpful for processing; however, more RAM (maximum 4 GB) is better when there is a chain of CDC operators.
● RAM requirements are based on the fetchSize and the table row size.
○ For wide tables (for example, more than 100 columns), a smaller fetchSize (<1000) and a higher RAM value are recommended.
○ For narrow tables, a larger fetchSize provides better throughput.
"groupResources": {
  "memory": { "request": "512M" },
  "cpu": { "request": "0.5" }
},
● During delta processing, you may use at most 2 SQL connections to the source table; however, the initial load uses 1 + Max Number of Loaders connections.
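The connection rule above can be turned into a rough capacity estimate for the source system. All numbers in the sketch below are illustrative, not measured values from this guide:

```shell
# Rough estimate of peak SQL connections to the source, based on the
# rule above: initial load uses 1 + Max Number of Loaders connections
# per table, delta processing uses at most 2. All inputs are invented.
max_loaders=8          # default Max Number of Loaders
tables_initial=2       # tables currently in initial load
tables_delta=8         # tables already in delta (CDC) mode

initial_conns=$(( tables_initial * (1 + max_loaders) ))
delta_conns=$(( tables_delta * 2 ))
echo "peak source connections: $(( initial_conns + delta_conns ))"
```

An estimate like this helps verify that the source database's connection limit is not exceeded when many CDC graphs run at once.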
Related Information
Changing Data Capture (CDC) CDC Graph Generator (Deprecated) CDC Graph Generator Sample Graph
9 Sizing Guide for Metadata Explorer and Self-Service Data Preparation
Improve your Metadata Explorer and Self-Service Data Preparation performance using the App-Data tenant application.
App-Data is a tenant application that can scale to hundreds of users using a single pod, versus a user application that has one pod per user. Most systems run into resource issues before they reach 100 pods. App-Data contains Metadata Explorer and Self-Service Data Preparation as nested applications. We expose several configurations to help you fine-tune performance.
Performance testing with default options showed the following performance results.
Assumption: A user makes a backend call every 6−10 seconds. Any tests with no time between calls have a conservative factor of 5X to scale up to the number of users supported.
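Put differently: if a stress test with no time between calls sustains N concurrent callers, the think-time assumption means roughly 5×N real users can be supported. A small sketch of that scale-up, with an invented test result:

```shell
# Sketch of the conservative 5X scale-up described above.
# 'tested_concurrent' is an invented stress-test result, not a
# number from this guide.
tested_concurrent=20   # callers sustained with no think time
scale_factor=5         # conservative factor from the guide
echo "estimated supported users: $(( tested_concurrent * scale_factor ))"
```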
Metadata Explorer
Tests show good performance from 1−100 users. A decrease in performance is noted between 100−200 users, with a maximum of 500 users tested on a single instance.
Self-Service Data Preparation
Tests show good performance from 1−50 users. A decrease in performance is noted between 50−100 users, with a maximum of 150 users tested on a single instance.
Related Information
Configure App-Data [page 143] Configure Other Applications [page 145]
9.1 Configure App-Data
App-Data application settings are exposed to help you configure App-Data for optimal performance.
You can resolve HANA connection bottlenecks with the HANA pool options in the App-Data application. Bottlenecks in other applications require changes to those applications and are listed in Configure Other Applications [page 145]. For these changes to take effect, the daemon or app-data pods must be restarted. Restart the Data Application or Data App Daemon by clicking Restart.
Data Application Options
Option Description
HANA Pool Timeout The HANA pool timeout in milliseconds (500, 100000) for each nested application. The default value is 5000.
The Metadata Explorer and Self-Service Data Preparation applications each have their own connection pool for accessing the HANA instance. The HANA Pool Timeout option is used to control the timeout value in both pools. The default of 5000 is good for most cases. If you encounter errors with the pool, increasing the value to 10,000−15,000 may keep the errors from occurring; however, it could result in longer wait times before the error is returned. The most common pool error logs the following message: {"message": "could not acquire connection from pool", "module": "dh-app-metadata"}
HANA Maximum Pool Size The HANA maximum pool size (1, 300) for each nested application. The default value is 100.
The Metadata Explorer and Self-Service Data Preparation applications each have their own connection pool for accessing the HANA instance. The HANA Maximum Pool Size option is used to control the maximum number of connections in the pool, which corresponds to the number of concurrent database accesses that can be executed in parallel (each request to the backend gets a connection from the pool before it is processed). The default of 100 is good for most use cases. If requests fail to connect with the pool before the pool times out, you can increase the value. We recommend 100−150 connections. The most common pool error logs the following message: {"message": "could not acquire connection from pool", "module": "dh-app-metadata"}
Maximum number of concurrent tasks that execute pipelines in Modeler (automatic lineage excluded): Set to 10 or less to have up to 10 parallel batch processes running. Set to -1 to turn this option off. Before each rulebook, preparation, profile, and publication, this process checks that fewer than the maximum number of tasks are currently running.
Time to live after logout: Enter the number of minutes before the pod is stopped when no one is using metadata. This saves resources when the system is not being used for an extended amount of time.
Data Application Daemon Options
Option Description
HANA Pool Timeout The HANA pool timeout in milliseconds (500, 100000) for each nested application. The default value is 5000.
The daemon pod that handles background processes also has a connection pool for accessing HANA. The daemon pod HANA Pool Timeout is used to control the timeout value of the pool. The default of 5000 is good for most cases. If you encounter errors with the pool, increasing the value to 10,000−15,000 may keep the errors from occurring.
HANA Maximum Pool Size The HANA maximum pool size (1, 300) for each nested application. The default value is 50.
The daemon pod that handles background processes also has a connection pool for accessing HANA. The daemon pod HANA Maximum Pool Size option is used to control the maximum number of connections in the pool. The default of 50 is good for most cases. If you encounter errors with the pool, increasing the value to 75−100 may fix the issue.
9.2 Configure Other Applications
Other application settings are exposed to help you configure applications for optimal performance.
The Metadata Explorer and Preparation applications sit on top of the stack and call other applications that can potentially become bottlenecks. You can change the following application configurations to improve performance in other applications that are called by Metadata Explorer and Preparation.
Flowagent
Option Description
Instances: Browse remote connection, view dataset summary, view factsheet and factsheet preview, rulebook dataset binding, and other calls may call the flowagent if that object is not published. If these types of calls become slower, the slowdown may be with the flowagent. Increasing the number of flowagent instances may help.
The default value is 1.
App-Core
Option Description
HANA Maximum Pool Size The HANA Maximum Pool Size option defined for Connection Management is similar to the App-Data pool size configuration. If the connection application is overloaded with requests, then increasing the pool size may improve performance. Metadata Explorer uses a cache to lower the number of calls made to the connection application; however, Self-Service Data Preparation does not, and makes more calls per user as activity increases. The most common pool error logs the following message: {"message": "could not acquire connection from pool", "module": "dh-app-connection"}. If you encounter errors with the pool, increasing the value to 50 may fix the issue.
The default value is 20.
HANA Pool Acquire Timeout The HANA Pool Acquire Timeout option (in milliseconds) defined for Connection Management is similar to the App-Data pool timeout configuration. If the connection application is overloaded with requests, then increasing the pool timeout may improve performance. Metadata Explorer uses a cache to lower the number of calls made to the connection application; however, Self-Service Data Preparation does not, and makes more calls per user as user activity increases. The most common pool error logs the following message: {"message": "could not acquire connection from pool", "module": "dh-app-connection"}.
The default value is 10,000.
10 Understanding Security
This section describes the SAP Data Intelligence security approach and additional ways that you can increase security.
Related Information
Data Protection and Privacy in SAP Data Intelligence [page 147] Securing SAP Data Intelligence [page 152]
10.1 Data Protection and Privacy in SAP Data Intelligence
SAP Data Intelligence provides the technical enablement and infrastructure that allow applications running on SAP Data Intelligence to conform to the legal requirements of data protection in the different scenarios in which SAP Data Intelligence is used.
Introduction to Data Protection
For general information about data protection and privacy in SAP BTP, see the SAP BTP documentation under Data Protection and Privacy.
Data protection is associated with numerous legal requirements and privacy concerns. In addition to compliance with general data privacy acts, it is necessary to consider compliance with industry-specific legislation in different countries. This section describes the specific features and functions that SAP Data Intelligence provides to support compliance with the relevant legal requirements and data privacy.
This guide does not give advice on whether these features and functions are the best method to support company, industry, regional, or country-specific requirements. Furthermore, this guide does not give advice or recommendations about additional features that would be required in a particular environment. Decisions related to data protection must be made on a case-by-case basis and under consideration of the given system landscape and the applicable legal requirements.
Note
In most cases, compliance with data privacy laws is not a product feature. SAP software supports data privacy by providing security features and specific functions relevant to data protection, such as functions for the simplified blocking and deletion of personal data. SAP does not provide legal advice in any form. The definitions and other terms used in this guide are not taken from any given legal source.
Glossary

Term Definition
Personal data Information about an identified or identifiable natural person.
Business purpose A legal, contractual, or in other form justified reason for the processing of personal data. The assumption is that any purpose has an end that is usually already defined when the purpose starts.
Blocking A method of restricting access to data for which the primary business purpose has ended.
Deletion Deletion of personal data so that the data is no longer usable.
Retention period The time during which data must be available.
End of purpose (EoP) A method of identifying the point in time for a data set when the processing of personal data is no longer required for the primary business purpose. After the EoP has been reached, the data is blocked and can only be accessed by users with special authorization.
SAP Data Intelligence Approach to Data Protection
Many data protection requirements depend on how the business semantics or context of the data stored and processed in SAP Data Intelligence are understood.
Note
Using capabilities to communicate with other data sources, SAP Data Intelligence may also be used to process data that is stored in other systems and accessed through virtual tables.
In SAP Data Intelligence installations, the business semantics of data are part of the application definition and implementation. SAP Data Intelligence does not own or store any sensitive data itself; it provides the features for working with data sources, flowgraphs, and so on. Therefore, it is the user who knows, for example, which tables in the database contain sensitive personal data, or how business-level objects, such as sales orders, are mapped to technical objects in the SAP Data Intelligence ecosystem.
Caution
SAP Data Intelligence trace and dump files may potentially expose personal data.
The following data protection and privacy functions enable a company to process personal data in a clear and compliant manner:
Function Description
Erase personal data SAP Data Intelligence needs the user ID, password, and (optionally) name of SAP Data Intelligence users for its operations. These are managed through the SAP Data Intelligence System Management interface and you can delete them, if necessary. For any external data source that SAP Data Intelligence accesses that may contain personal data, you should delete data using the appropriate method for that source, because SAP Data Intelligence does not know which objects store personal data.
Log changes to personal data For the personal data that SAP Data Intelligence owns, all changes are logged in the audit logs, which can be accessed through SAP Data Intelligence diagnostics or the logs.
Information about data subjects For the personal data that SAP Data Intelligence owns, users can see the contents of stored data in SAP Data Intelligence System Management by navigating to user management. All stored data except passwords is shown on this page.
Log read access to sensitive personal data For the personal data that SAP Data Intelligence owns, any access is logged in the audit logs, which can be accessed through SAP Data Intelligence diagnostics or the logs.
Consent for personal data SAP Data Intelligence does not store any data that is subject to consent for its own operations. If any external data source which SAP Data Intelligence can access stores any personal data, the owner of the data source must obtain the consent.
Note
Database trace and dump files may potentially expose personal data, for example, a trace set to a very high trace level, such as DEBUG or FINE.
SAP Data Intelligence provides a variety of security-related features to implement general security requirements that are also required for data protection and privacy:
Aspect of Data Protection and Privacy More Information
Access control Roles and scopes
Enabling Authentication for SAP Data Intelligence Services and SAP Data Intelligence Users [page 152]
Transmission control/communication security Data Provisioning Agent documentation
SAP Data Intelligence Self-Signed CA, X.509 Certificates and TLS for SAP Data Intelligence Services [page 154]
Separation by purpose Roles and scopes
Enabling Authentication for SAP Data Intelligence Services and SAP Data Intelligence Users [page 152]
Caution
The extent to which data protection is ensured depends on secure system operation. Network security, security note implementation, adequate logging of system changes, and appropriate usage of the system are the basic technical requirements for compliance with data privacy legislation and other legislation.
Data Protection and Privacy in Graphs
In the threat model, protecting data and ensuring data privacy are major concerns. Graphs that are executed within the SAP Data Intelligence Modeler are configured, instructed, and started by customer request. Therefore, the system behaves as a data processor, whereas the user or customer of the system is the data controller. Hence, the system does not write audit logs for the input of personal or sensitive data from source systems, nor for transformations or ingestion into target systems. The data owner or the owner of the source and target systems is responsible for ensuring traceability and for instructing the systems to generate the relevant audit logs, which allows the customer to comply with local data protection and privacy laws.
Related Information
Managing Audit Logs [page 150]
Viewing Audit Logs [page 151]
Malware Scanning [page 152]
10.1.1 Managing Audit Logs
All SAP Data Intelligence components write audit logs for accessing, modifying, and erasing personal data, and editing your security configuration.
You can manage the storage of audit logs from the settings page of the System Management application.
Note
Audit log persistence mode applies only to System Management applications. The core SAP Data Intelligence system stores audit logs internally, and their persistence cannot be changed from this setting. Use this setting for debugging purposes only.
10.1.2 Viewing Audit Logs
SAP Data Intelligence provides a comprehensive audit logging system, which includes events that are related to Data Protection Principles.
The audit logs include both audit logs from SAP Data Intelligence applications and infrastructure audit logs of the SAP Data Intelligence cluster while using the SAP BTP Audit Log Services.
To see infrastructure audit logs of SAP Data Intelligence, subscribe to the Audit Log Viewer application instance in your subaccount. After you subscribe, the audit log viewer role is available from the trust configuration panel. Only a subaccount security administrator can assign the audit log viewer role to you from the trust configuration panel, which lets you view audit logs. The subaccount security administrator role is present only for the creator of the subaccount, or it can be granted by an existing subaccount security administrator to another user.
Adding Another Subaccount Security Administrator
If you are a subaccount security administrator, perform the following steps to add another administrator.
1. In the Security tab, navigate to Administrators.
2. Click the Add Administrators button, provide the User ID, and click OK.
The user should now have administrator privileges in the subaccount.
Granting Permissions to View Audit Logs
Perform the following steps to grant permission to view audit logs.
1. In the Subaccount page, navigate to Role Collections under the Security tab.
2. Click New Role Collection, enter a name for the role collection, and click Save. The new role collection appears in the role collection list.
3. Select the new role collection, add the auditlog-viewer! role, and select Auditlog_Auditor as the Role Template and Role. After you configure the role collection, you can assign users the role.
4. Navigate to the subaccount page and choose the Security tab. In Trust Configuration, choose the identity provider that you use and make sure that the Role Collection Assignment page is open.
5. To assign the new role to your user, provide either the e-mail or user ID, depending on your identity provider settings.
Access Audit Log View
To access the Audit Log Viewer application, navigate to Subscriptions from the subaccount page, then click Go To Application under the Audit Log Viewer subscription.
For more information about enabling the audit log viewer in SAP BTP, see Audit Log Viewer for the Cloud Foundry Environment.
Note
Audit logs are kept for a specified amount of time. The SAP BTP default is 30 days, after which the audit logs are deleted. If you want to change the retention period, see https://help.sap.com/viewer/f48e822d6d484fa5ade7dda78b64d9f5/Cloud/en-US/e6b90a5575f8401bb8469904a159788e.html.
10.1.3 Malware Scanning
SAP Data Intelligence scans all uploaded files for malware.
SAP Data Intelligence allows users to upload files to enable some component features, such as SAP Data Intelligence Modeler and Machine Learning operators. The files are stored in an SAP Data Intelligence cluster, and they may be read by other users, depending on their permissions scope.
As a second level of defense, and to satisfy compliance controls, SAP Data Intelligence scans uploaded files for malware. In the case of a positive finding, the upload fails and SAP Data Intelligence logs a security event.
The feature is enabled for all cloud instances.
10.2 Securing SAP Data Intelligence
When using a distributed system, you must be sure that your data and processes support your business needs without allowing unauthorized access to critical information. User errors, negligence, or attempted manipulation of your system should not result in loss of information or processing time.
These demands on security apply likewise to SAP Data Intelligence.
Related Information
Enabling Authentication for SAP Data Intelligence Services and SAP Data Intelligence Users [page 152]
Configuring External Identity Providers in SAP Data Intelligence [page 153]
Giving User Permissions for SAP Data Intelligence Access [page 154]
SAP Data Intelligence Self-Signed CA, X.509 Certificates and TLS for SAP Data Intelligence Services [page 154]
SAP Vora Integration for External Users [page 155]
SAP Data Intelligence on Kubernetes Security [page 155]
Connecting Your On-Premise Systems To SAP Data Intelligence [page 155]
10.2.1 Enabling Authentication for SAP Data Intelligence Services and SAP Data Intelligence Users
SAP Data Intelligence supports user name and password authentication for all services.
After a user is created, it becomes an SAP Data Intelligence user and is enabled on all SAP Data Intelligence endpoints, including all user interfaces and programmatic endpoints.
Note
Synchronizing a user for all SAP Data Intelligence services can take up to 5 minutes.
The first SAP Data Intelligence user is created during installation in the default tenant, and a system user is created in the system tenant with the name system (system\system). You can create and manage additional users through the SAP Data Intelligence System Management user management module.
The SAP Data Intelligence installer also creates different service users that are used for internal authentication between SAP Data Intelligence services.
The following table summarizes the SAP Data Intelligence endpoints that are reachable by the outside world (exposed from Kubernetes by default) and the authentication methods that apply to them:
Endpoint Authentication Type and Proposed Security Measures
SAP Vora Transaction Coordinator JSON Web Token (JWT)-based authentication and SAP Data Intelligence users with username/password authentication via a proprietary SAP Vora protocol.
System Management user interface SAP Data Intelligence users with a user interface authentication form.
System Management REST endpoint JWT-based authentication.
SAP HANA Wire on SAP Vora Transaction Coordinator SAP Data Intelligence users with SAP HANA proprietary wire protocol.
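For context on the JWT-based endpoints listed above: a JSON Web Token is three base64url-encoded parts, header.payload.signature. The sketch below signs and verifies a token with HS256 purely for illustration; the algorithm, claims, and keys that SAP Data Intelligence actually uses are internal and not assumed here.

```python
# Illustration of what "JWT-based authentication" means structurally.
# HS256 is used here only because it needs no external libraries; real
# deployments commonly use asymmetric algorithms (assumption for demo).
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> bytes:
    # base64url without padding, as used in JWTs
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def sign_jwt(payload: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    mac = hmac.new(secret, header + b"." + body, hashlib.sha256).digest()
    return (header + b"." + body + b"." + b64url(mac)).decode()

def verify_jwt(token: str, secret: bytes) -> dict:
    header, body, sig = token.split(".")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(b64url(expected).decode(), sig):
        raise ValueError("invalid signature")
    padded = body + "=" * (-len(body) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))

token = sign_jwt({"sub": "system\\system", "tenant": "system"}, b"demo-secret")
print(verify_jwt(token, b"demo-secret")["sub"])  # system\system
```

An endpoint that accepts JWTs only needs to verify the signature and read the claims; it does not have to re-authenticate the user's password on every request.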
10.2.2 Configuring External Identity Providers in SAP Data Intelligence
The SAP Data Intelligence service integrates with SAP BTP User Account and Authentication (SAP BTP UAA). This allows the federation of external and custom identity providers (IdPs). In this context, SAP BTP UAA works as a service broker. For example, it allows the configuration of identity providers that use Security Assertion Markup Language (SAML).
Configuring External Identity Providers
More information is available in the SAP BTP documentation:
● For information about the configuration of identity providers via SAP BTP UAA, refer to Data Privacy and Security in the SAP BTP documentation.
● For information about establishing trust between SAP BTP UAA and SAP BTP Identity Authentication Service (IAS), refer to Establish Trust and Federation with UAA Using SAP BTP Identity Authentication Service.
The configuration is handled within the SAP BTP security configuration dashboard of the subaccount that contains the SAP Data Intelligence instance.
Every user who can be federated via the SAP BTP UAA can log into SAP Data Intelligence with the initial role of a member.
After the first login, broader access can be granted to the externally federated user.
For more information about user handling and access control settings, see SAP BTP Administration.
Related Information
SAP Vora Integration for External Users [page 155]
10.2.3 Giving User Permissions for SAP Data Intelligence Access
To authenticate an external user for an SAP Data Intelligence service, you must assign the sap.dh.systemAccess policy to the user in SAP Data Intelligence System Management.
After the external user logs in for the first time, the tenant administrator must navigate to the Users page in System Management and assign one of the policies that nests, or references, sap.dh.systemAccess. For more information about policy assignment, see Working With Policies [page 44].
External users with the sap.dh.systemAccess policy can log into the SAP Data Intelligence service by clicking the SAP CP XSUAA button on the login page.
10.2.4 SAP Data Intelligence Self-Signed CA, X.509 Certificates and TLS for SAP Data Intelligence Services
The SAP Data Intelligence installer creates its own Public Key Infrastructure (PKI) and self-signed CA (Certificate Authority). Using this root CA, the installer generates an intermediate certificate and its private key. When the root CA certificate signs this intermediate certificate, it essentially transfers some of its trust to the intermediate. The intermediate certificate is used for signing X.509 client and server certificates, which are used for encryption, digital signatures, and authentication within the cluster.
Note
This CA and the respective certificates are used only for in-cluster communication. For external access to SAP Data Intelligence endpoints, the certificates provided to the respective Kubernetes ingress are used.
When you run the installer, it prompts you with the following information:
"Please enter the SAN (Subject Alternative Name) for the certificate, which must match the fully qualified domain name (FQDN) of the Kubernetes node to be accessed externally."

Note
The required FQDN/DNS name is deprecated and not currently in use.
Related Information
Installing SAP Data Intelligence on the Kubernetes Cluster
10.2.5 SAP Vora Integration for External Users
If you use an external identity provider (IdP), you can access SAP Vora using a predefined connection with the ID vora_technical_default.
Only the default tenant can access SAP Vora using the predefined connection. All users of the default tenant (thus sharing the vora_technical_default connection) share the default schema in SAP Vora.
Related Information
VORA [page 123]
10.2.6 SAP Data Intelligence on Kubernetes Security
SAP recommends that you enable security features on your Kubernetes cluster. Consult SAP partners to enable the appropriate Kubernetes features for your environment.
For more information, see the following:
● Users in Kubernetes
● Configuring Service Accounts for Pods
● Security Best Practices for Kubernetes Deployment
Caution
For on-premise Kubernetes installations, applications deployed on the same Kubernetes cluster can access data of SAP Data Intelligence instances. This can happen mainly if hostPath is used, but NFS scenarios also require you, as an administrator, to take precautions so that external applications cannot access SAP Data Intelligence files on the same Kubernetes cluster.
10.2.7 Connecting Your On-Premise Systems To SAP Data Intelligence
To connect your on-premise systems to SAP Data Intelligence, submit a support request. The SAP operations team will help you set it up.
There are three connectivity options available to connect to SAP Data Intelligence:
● SAP Cloud Connector
● Site-to-Site VPN
● VPC Peering
The approach that you use depends on whether your on-premise environment is hosted. We recommend that you use SAP Cloud Connector, which serves as a link between SAP BTP applications, such as SAP Data Intelligence and on-premise systems. If you use one of the following hosts, use a peering connection:
Hosted In Use
Amazon Web Services (AWS) AWS Virtual Private Cloud (VPC) peering connection
Microsoft Azure VNet Peering
If your on-premise environment is not hosted, a Virtual Private Network (VPN) using the AWS Site-to-Site VPN is required.
Related Information
Connect Using Site-to-Site VPN [page 156]
Connect Using Virtual Network Peering [page 158]
Configure SAP Cloud Connector [page 15]
10.2.7.1 Connect Using Site-to-Site VPN
If your on-premise environment is not hosted in Amazon Web Services (AWS) or Microsoft Azure, the clouds where SAP Data Intelligence can be hosted, connect it to SAP Data Intelligence with a Virtual Private Network (VPN).
Context
VPN connectivity provides an added layer of security and easier traffic routing between your on-premise systems and SAP Data Intelligence. You can connect to SAP Data Intelligence through the Internet or through an IP security (IPSec) VPN tunnel.
Internet connectivity to SAP Data Intelligence APIs and user interfaces using secure protocols is always available. To add another network security layer, you can establish a VPN connection between your on-premise or other hosted environment and SAP Data Intelligence. Submit a support request to the SAP operations team, and the connection will be jointly implemented as follows:
Procedure
1. Contact your network department to verify that they will support a VPN connection to SAP Data Intelligence.
2. In the SAP Support Portal, open a VPN service request ticket using the product component CA-DI-OPS.
3. SAP support will send you a VPN technical configuration exchange document to complete. It informs your network engineers about the technical configuration parameters on the SAP Data Intelligence side, and requests technical parameters from your side to return to SAP support.
4. Request that your network engineers configure the on-premise VPN endpoint according to the technical configurations specified in the VPN exchange form. The SAP Data Intelligence operations team will do the same on the SAP Data Intelligence endpoint side.
Restrictions
Note the following restrictions:
● To ensure that your IPSec VPN endpoint is compatible, see https://docs.aws.amazon.com/vpc/latest/adminguide/Welcome.html or https://docs.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-about-vpn-devices, depending on where your SAP Data Intelligence instance is hosted.
● VPN provides a transparent connection to your network. The SAP Data Intelligence internal default Classless Inter-Domain Routing (CIDR) block is 10.0.0.0/16. The IP address range cannot be changed after the SAP Data Intelligence tenant landscape has been created. Therefore, ensure that you do not have an IP address range conflict with the SAP Data Intelligence default network. If the default CIDR segment conflicts with your network, specify an appropriate non-overlapping CIDR block (/22 or larger) when you create the SAP Data Intelligence service instance. For more information, see Create an SAP Data Intelligence Instance in SAP BTP [page 7].
● High availability setup of a VPN connection is not available.
● Because a VPN connection leverages the Internet, SAP cannot provide performance and availability service-level agreements. Performance depends on your on-premise Internet Service Provider connectivity capacity and other factors.
● If your corporate hosting is also on Amazon AWS, a VPN connection to SAP Data Intelligence hosted on AWS is not supported by Amazon. Contact SAP support to discuss alternatives.
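The CIDR restriction described in these points can be checked up front. The following is a small sketch using the Python standard library; the corporate network ranges shown are placeholders.

```python
# Check whether any corporate network range overlaps the SAP Data
# Intelligence internal default CIDR block (10.0.0.0/16) before
# requesting the VPN or peering connection.
import ipaddress

DI_DEFAULT_CIDR = ipaddress.ip_network("10.0.0.0/16")

def conflicts_with_di(corporate_cidrs):
    """Return the corporate ranges that overlap the DI internal network."""
    return [c for c in corporate_cidrs
            if ipaddress.ip_network(c).overlaps(DI_DEFAULT_CIDR)]

# Placeholder ranges: the first one conflicts, the second does not.
print(conflicts_with_di(["10.0.12.0/24", "192.168.0.0/16"]))  # ['10.0.12.0/24']
```

If a conflict is found, specify a non-overlapping CIDR block (/22 or larger) when you create the SAP Data Intelligence service instance, as noted above.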
10.2.7.2 Connect Using Virtual Network Peering
If your on-premise environment is hosted on the same cloud as SAP Data Intelligence (either Amazon Web Services (AWS) or Microsoft Azure), you can use virtual network peering to connect the two.
Context
Virtual network peering creates network connectivity between two virtual networks (VPC for AWS or VNet for Azure), which are owned by different account holders. To establish virtual network peering between your hosted environment and SAP Data Intelligence, submit a support request, and the connection will be jointly implemented as follows:
Procedure
1. Contact your network department to confirm that virtual network peering is compliant with your organization's rules.
2. In the SAP Support Portal, open a virtual network peering service request ticket using the product component CA-DI-OPS.
3. SAP support will send you a VPC peering technical exchange document to complete. It is based on the required information described in the AWS VPC Peering documentation or the Microsoft Azure VNet documentation.
4. Request that the administrator of your cloud provider account accepts the virtual network peering request and configures the internal network traffic.
Restrictions
Note the following restrictions:
● VPC peering provides a transparent connection to your network. The SAP Data Intelligence internal default Classless Inter-Domain Routing (CIDR) block is 10.0.0.0/16. The IP address range cannot be changed after the SAP Data Intelligence tenant landscape has been created. Therefore, ensure that you do not have an IP address range conflict with the SAP Data Intelligence default network. If the default CIDR segment conflicts with your network, specify an appropriate non-overlapping CIDR block (/22 or larger) when you create the SAP Data Intelligence service instance. For more information, see Create an SAP Data Intelligence Instance in SAP BTP [page 7].
11 Troubleshooting SAP Data Intelligence
Contains information that helps you troubleshoot problems in SAP Data Intelligence.
Related Information
Troubleshooting Vora Users After Upgrade [page 159]
Troubleshooting SAP Cloud Connector [page 160]
Troubleshooting Flowagent [page 160]
11.1 Troubleshooting Vora Users After Upgrade
Describes a possible solution after an SAP Data Intelligence Cloud cluster upgrade.
For users who use a VORA_TECHNICAL_DEFAULT connection for SAP Vora, the user tables are not displayed in SAP Vora Tools after an SAP Data Intelligence Cloud upgrade.
To correct this issue, the administrator user must create a new connection using the corresponding SAP Vora technical user that lost its data.
For the new connection, use the following connection parameters:
Parameter Value
host vora-tx-coordinator-ext
port 10004
enable tls true
username
password The password of the technical user.
If the password has been forgotten, the administrator user can perform the following Vora query to recover it:
ALTER USER “
11.2 Troubleshooting SAP Cloud Connector
Contains information that helps you troubleshoot problems when using SAP Cloud Connector with SAP Data Intelligence.
Failed Connection
Symptom
Connection fails to be established when trying to use SAP Cloud Connector.
Analysis
Checking logs of SAP Cloud Connector (on customer side) reveals an error message like the following: Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Root cause
The certificate of the SAP Data Intelligence ingress is not trusted by SAP Cloud Connector.
Solution
Obtain the certificate from the SAP Data Intelligence ingress endpoint and upload it into the trust store of the SAP Cloud Connector. For an example, see Import the Git Server Certificate into the JVM.
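One possible way to fetch the ingress certificate before importing it is sketched below. This is an illustration only: the host name, PEM path, and keystore path are placeholders, and the actual import is typically done through the SAP Cloud Connector UI or with keytool.

```python
# Sketch: retrieve the certificate presented by the SAP Data Intelligence
# ingress so it can be imported into the SAP Cloud Connector (JVM) trust
# store. The host, PEM path, and keystore path below are placeholders.
import ssl

def fetch_ingress_cert(host: str, port: int = 443) -> str:
    """Return the server certificate as a PEM string (requires network access)."""
    return ssl.get_server_certificate((host, port))

def keytool_import_cmd(pem_path: str, keystore: str, alias: str = "di-ingress") -> str:
    """Build the keytool command that imports the PEM into a JVM trust store."""
    return (f"keytool -importcert -file {pem_path} "
            f"-keystore {keystore} -alias {alias}")

# Usage (requires network access to your tenant):
#   pem = fetch_ingress_cert("my-di-tenant.example.com")
#   with open("di-ingress.pem", "w") as f:
#       f.write(pem)
print(keytool_import_cmd("di-ingress.pem", "/path/to/jvm/cacerts"))
```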
11.3 Troubleshooting Flowagent
Contains information that helps you troubleshoot Flowagent service issues in SAP Data Intelligence.
Flowagent service can hang indefinitely for database connections
In some situations, registering an invalid connection leads to a hang in the Flowagent service, which is responsible for checking the status of connections and for browsing and viewing their metadata. This issue can occur, for example, if you register a Microsoft SQL Server connection using MySQL information; however, other invalid combinations can lead to the same behavior.
The connections affected by this issue may include: Azure SQL DB, DB2, MSSQL, MySQL, Oracle, and Redshift.
Workaround
In SAP Data Intelligence, restart the Flowagent application via SAP Data Intelligence System Management.
Important Disclaimers and Legal Information
Hyperlinks
Some links are classified by an icon and/or a mouseover text. These links provide additional information. About the icons:
● Links with the icon : You are entering a Web site that is not hosted by SAP. By using such links, you agree (unless expressly stated otherwise in your agreements with SAP) to this:
● The content of the linked-to site is not SAP documentation. You may not infer any product claims against SAP based on this information. ● SAP does not agree or disagree with the content on the linked-to site, nor does SAP warrant the availability and correctness. SAP shall not be liable for any damages caused by the use of such content unless damages have been caused by SAP's gross negligence or willful misconduct.
● Links with the icon : You are leaving the documentation for that particular SAP product or service and are entering a SAP-hosted Web site. By using such links, you agree that (unless expressly stated otherwise in your agreements with SAP) you may not infer any product claims against SAP based on this information.
Videos Hosted on External Platforms
Some videos may point to third-party video hosting platforms. SAP cannot guarantee the future availability of videos stored on these platforms. Furthermore, any advertisements or other content hosted on these platforms (for example, suggested videos or by navigating to other videos hosted on the same site), are not within the control or responsibility of SAP.
Beta and Other Experimental Features
Experimental features are not part of the officially delivered scope that SAP guarantees for future releases. This means that experimental features may be changed by SAP at any time for any reason without notice. Experimental features are not for productive use. You may not demonstrate, test, examine, evaluate or otherwise use the experimental features in a live operating environment or with data that has not been sufficiently backed up. The purpose of experimental features is to get feedback early on, allowing customers and partners to influence the future product accordingly. By providing your feedback (e.g. in the SAP Community), you accept that intellectual property rights of the contributions or derivative works shall remain the exclusive property of SAP.
Example Code
Any software coding and/or code snippets are examples. They are not for productive use. The example code is only intended to better explain and visualize the syntax and phrasing rules. SAP does not warrant the correctness and completeness of the example code. SAP shall not be liable for errors or damages caused by the use of example code unless damages have been caused by SAP's gross negligence or willful misconduct.
Gender-Related Language
We try not to use gender-specific word forms and formulations. As appropriate for context and readability, SAP may use masculine word forms to refer to all genders.
www.sap.com/contactsap
© 2021 SAP SE or an SAP affiliate company. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company. The information contained herein may be changed without prior notice.
Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. All other product and service names mentioned are the trademarks of their respective companies.
Please see https://www.sap.com/about/legal/trademark.html for additional trademark information and notices.