
PUBLIC
SAP Data Intelligence
Document Version: 3.1.5 – 2021-09-03

Data Governance User Guide

© 2021 SAP SE or an SAP affiliate company. All rights reserved.

Content

1 Data Governance User Guide for SAP Data Intelligence...... 6
1.1 Using the Metadata Explorer...... 6
    Using the Discovery Dashboard...... 9
    Configure User Settings and View Notifications...... 11

2 Managing Connections...... 13
2.1 Overview of Metadata Extraction...... 19
2.2 Profile a Dataset...... 21
2.3 Publish a Dataset...... 22
    View Published Datasets in a Folder or Connection...... 23
    Update or Delete a Publication...... 24
    View Connection Capabilities...... 25
2.4 Manage Files and Folders...... 25

3 Managing the Catalog...... 28
3.1 Create Hierarchical Tags...... 31
3.2 Manage Tags...... 32

4 Rules...... 35
4.1 Create a Category...... 36
    Edit or Delete a Category...... 37
4.2 Create a Rule and Test Case...... 37
    Edit a Rule...... 41
    Delete a Rule...... 42
    Edit or Delete a Test Case...... 42
    Values for the Match Pattern Operator...... 43
    View Terms Related to a Rule...... 44
4.3 Create a Rulebook...... 44
    Import Rules to the Rulebook...... 46
    Bind a Rule to a Dataset...... 46
    Edit or Remove a Rule Binding...... 47
4.4 Run Rulebooks and View Results...... 48
4.5 Create a Dashboard...... 50
    Create a Scorecard...... 50
    Edit a Dashboard...... 54
4.6 Import Rules from SAP Information Steward...... 55
    View the Status of Imported Rules...... 56
4.7 View Terms Related to a Rulebook...... 58

5 Business Glossary...... 59
5.1 Create a Business Glossary...... 61
5.2 Manage Business Glossaries...... 62
5.3 Edit the Term Template...... 62
5.4 Create a Business Glossary Category...... 64
5.5 Manage the Business Glossary Categories...... 64
5.6 Create a Term...... 65
5.7 Manage a Term...... 66
5.8 Manage a Term Relationship...... 67
5.9 Search for Categories and Terms...... 69
5.10 Import a Business Glossary from Information Steward...... 70

6 Managing Publications...... 72
6.1 Create a Publication...... 73
6.2 Edit or Delete a Publication...... 74

7 Monitoring Tasks...... 75
7.1 View Details About Publication Processing...... 77
7.2 View Details About Rulebook Processing...... 79
7.3 View Details About Glossary Importing...... 80

8 Viewing Metadata...... 83

9 View Fact Sheet...... 84
9.1 View a Summary of the Dataset...... 86
9.2 View a Summary of Column Data...... 89
    View Details About a Single Column...... 91
9.3 Preview Data...... 93
9.4 Analyze Data Lineage...... 94
    View Lineage Results...... 98
    Configure the Lineage View...... 101
    Data Lineage Examples...... 102
9.5 Review and Comment on a Dataset...... 104
9.6 View Dataset Relationships...... 105
9.7 Manage Fact Sheet Versions...... 107
9.8 Search for Fact Sheets on a Connection...... 108
9.9 Search for Fact Sheets Across Datasets...... 109

10 Searching Datasets...... 110

11 Managing Favorites...... 112

12 Functions...... 113
12.1 abs...... 117
12.2 add_months...... 118
12.3 concat_date_time...... 119
12.4 date_diff...... 120
12.5 date_part...... 121
12.6 day_in_month...... 122
12.7 day_in_week...... 123
12.8 day_in_year...... 124
12.9 decode...... 125
12.10 exists...... 127
12.11 fiscal_day...... 129
12.12 isweekend...... 130
12.13 is_valid_date...... 131
12.14 is_valid_datetime...... 132
12.15 is_valid_decimal...... 133
12.16 is_valid_double...... 134
12.17 is_valid_int...... 135
12.18 is_valid_real...... 136
12.19 is_valid_time...... 137
12.20 julian...... 138
12.21 julian_to_date...... 139
12.22 last_date...... 140
12.23 length...... 141
12.24 lookup...... 142
12.25 lower...... 145
12.26 lpad...... 146
12.27 lpad_ext...... 147
12.28 ltrim...... 148
12.29 match_pattern...... 150
12.30 match_regex...... 153
12.31 mod...... 158
12.32 month...... 159
12.33 nvl...... 160
12.34 quarter...... 161
12.35 replace_substr...... 162
12.36 round...... 166
12.37 rpad...... 167
12.38 rpad_ext...... 168
12.39 rtrim...... 170
12.40 soundex...... 171
12.41 sqrt...... 172
12.42 substr...... 173
12.43 sysdate...... 175
12.44 systime...... 176
12.45 to_char...... 177
12.46 to_date...... 179
12.47 to_decimal...... 181
12.48 trunc...... 182
12.49 upper...... 183
12.50 week_in_month...... 184
12.51 week_in_year...... 185
12.52 word...... 187
12.53 year...... 188

13 Operators...... 190

1 Data Governance User Guide for SAP Data Intelligence

SAP Data Intelligence helps manage your data across different systems by using the Metadata Explorer.

The Metadata Explorer gathers information about the location, attributes, quality, and sensitivity of data. With this information, you can make informed decisions about which datasets to publish and determine who has access to use or view information about the datasets.

Use the Metadata Explorer to:

● preview data in the datasets
● profile data to view information about the contents of different datasets
● publish datasets to allow others to view and search the data
● tag the dataset with keywords to aid in searching for datasets and create a tag hierarchy to organize and implement your tagging strategy
● prepare the datasets by applying data quality enhancements to the data
● conduct lineage analysis to learn where the dataset is used and how it’s transformed
● create validation rules to ensure that your data passes data quality standards
● create scorecards to view the quality score of your datasets, categories, and rulebooks
● create a rules dashboard to organize your scorecards
● monitor the status of publishing, profiling, lineage, rules, and data preparation tasks

Related Information

Using the Metadata Explorer [page 6]

1.1 Using the Metadata Explorer

Use the Metadata Explorer to publish, profile, prepare, and monitor metadata.

The Metadata Explorer Home page contains tiles to help you navigate to your desired area. Whether you want to search for data on a connection, create data preparations, profile or publish a dataset, create rules or a business glossary, monitor tasks, or modify your user preferences, the Home page takes you there.

Access the Metadata Explorer from SAP Data Intelligence Launchpad by clicking Metadata Explorer. The Home page consists of several tiles with links to other areas in the Metadata Explorer.

Catalog

The Catalog tile has links to access your data.

Browse Connections: The connection is where you find the source data. You can view information about the connection, search for datasets, and begin to publish and run lineage analysis on a supported dataset.

Browse the Catalog: The Catalog contains your published datasets. You can view the fact sheet, profile the dataset, and prepare data. You can also tag datasets and columns, and create, edit, and delete tags.

View Profiled Datasets: The profiled datasets have a fact sheet that shows an overview of your data and includes a data preview. You can add tags, prepare data, and profile the dataset again to view any trends with the changing data. You can also manage the versions of your fact sheets.

View Preparations: After you have prepared data, you can view some sample data, or view the fact sheet. If necessary, you can continue to edit the preparation and apply it to the dataset.

Rules

The Rules tile has a link to rule categories where you can create your rules. Another link takes you to rulebooks where you can group rules that apply to a business goal such as Customer Validation. You can also create a dashboard and scorecards to view a data quality or trend score. Search through your connections to find the datasets you want to work with.

Business Glossary

The Business Glossary tile has a link to the overview page where you can create terms and categories and associate them with Metadata Explorer objects. The glossary helps you understand the business terms in your organization. Having the terms defined and organized in a category can help you find related terms and objects when you search the glossary.

Monitor

The Monitor tile has links to the page that shows a view of the tasks that were run or are currently running. It also has a link to the Discovery Dashboard, where you can view a snapshot of information related to your dataset, such as memory usage, dataset distribution over connections, as well as profiling, catalog, and rulebook metrics.

User Preferences

The User Preferences tile has links to the folders in the catalog that you’ve selected as a favorite. You can also go to your preferences to set how you want the application to look, and the default views for the catalog, connections, and search results.

Administration

The Administration tile is shown to those users with administration privileges. It has a link to connections, where you can profile and extract lineage. Another link takes you to the manage publication page where you can add, edit, or delete publications and manage the content of the catalog. The preparations link takes you to a page for managing the data preparations, where you can duplicate and edit an existing preparation, or delete a preparation. You can also add, update, and delete tags and view where tags are used.

Metadata Explorer Navigation Menu

At the top of every page in the Metadata Explorer is a navigation menu. Click the Metadata Explorer dropdown list next to the SAP logo. The categories and links described earlier are included in the menu. There are also links to other applications in this product:

● System Management is where you manage applications, users, files, and clusters.
● Audit Log Viewer is where you can see the timestamp of events, users, and messages written to the SAP HANA database from different components.
● Customer Data Export is where you export data from different SAP components to a target store.
● Connection Management is where you can create, edit, delete, or check the status of your connections.
● Monitoring is where you can view the status of instances, runtime analysis, schedules, and so on.
● Policy Management is where you can authorize resource access to a user for defined policies.
● License Management is where you manage system licenses and measurement records.
● Modeler is where you can create processing pipelines (graphs) that orchestrate data processing in distributed landscapes.
● Vora Tools is where you can import, export, partition, and manage tables and views.

Related Information

Using the Discovery Dashboard [page 9]
Configure User Settings and View Notifications [page 11]

1.1.1 Using the Discovery Dashboard

The Discovery Dashboard has an overview of information related to your datasets.

The Discovery Dashboard contains high-level information such as memory usage, dataset distribution over connections, as well as profiling, catalog, and rulebook metrics. To view more information, or navigate to other areas, click the graphs, charts, and links. Access the Discovery Dashboard from the Metadata Explorer Home page by clicking the Discovery Dashboard link in the Monitor tile.

Memory Usage

The Memory Usage tile shows the amount of memory used by loaded preparation and metadata catalog tables and rules. To view the breakdown of memory usage between the catalog and rules, click the Metadata graphic. The Metadata and Preparation indicators show the amount of free versus used memory using different colors to indicate alert levels. Each memory usage type has a set amount of memory available, therefore Preparation could be orange while Catalog is green, for example.

Usage            Level
Below 80%        Green (normal)
80–89%           Gray (low alert)
90–94%           Orange (medium alert)
95% or higher    Red (high alert)
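The thresholds above amount to a simple percentage-to-color mapping. The following sketch is illustrative only (not part of the product) and assumes the threshold boundaries shown in the table:

```python
def alert_level(usage_percent):
    """Map a memory usage percentage to a dashboard alert level.

    Thresholds follow the table above; this helper is an
    illustrative sketch, not product code.
    """
    if usage_percent < 80:
        return "Green (normal)"
    elif usage_percent < 90:
        return "Gray (low alert)"
    elif usage_percent < 95:
        return "Orange (medium alert)"
    else:
        return "Red (high alert)"

# Each memory type is evaluated independently, so Preparation and
# Catalog can show different colors at the same time.
print(alert_level(72))   # Green (normal)
print(alert_level(93))   # Orange (medium alert)
```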

For information about increasing or decreasing the memory usage, see Manage Metadata and Preparation Memory in the SAP Data Intelligence Administration Guide.

Dataset Distribution

The Dataset Distribution tile shows the most used connections and the number of datasets on each of the connections shown. To see the number of datasets on a specific connection, click the section in the pie graph.

Monitoring

The Monitoring tile shows the progress of profiling, publishing, rulebooks, and preparation tasks. Click the number in the chart to go to the Monitoring page filtered on that task and status. Click Manage to go to the Monitoring page and see all the tasks and status. From the Monitoring page, you can filter based on the date range, task type, or task status; and view the connection ID, path, and target folders. For profiling tasks, view the metadata or fact sheets and stop or start profiling datasets.

Favorites

The Favorites tile shows five of the catalog folders that you selected as Favorites. To view all Favorites, click Show all. To view or remove your Favorites, click Manage.

Recently Run Rulebooks

The Recently Run Rulebooks tile shows the number of available rulebooks, and results of five of the most recently run rulebooks. Each rulebook shows the number of rules contained in the rulebook. Click the number to view the category and rules included in the rulebook. Click the gauge to go to the rulebook results. Click Manage to go to the list of rulebooks.

Catalog Metrics

The Catalog Metrics tile shows the number of available datasets. Choose a connection from the All Connections list to filter the trend information for a specific connection. Hover over the trend information graph to view statistics about the published datasets for approximately the last 7 days on the specified connection.

Tags Usage

The Tags Usage tile shows the number of tags that have been created in a hierarchy. To change the hierarchy, click the Display Hierarchy list and choose another hierarchy. The donut graph shows the frequency of use for each tag in the hierarchy. Click a section of the graph to view the number of objects (datasets or columns) that use the selected tag. You can search on a tag when the tag is attached to one or more datasets by clicking a section of the graph and clicking Search on Tag. Click Manage to go to the Catalog where you can create hierarchies and manage tags.

Tag Hierarchies

The Tag Hierarchies tile shows the default hierarchy and the five most recently used hierarchies. You can also switch the Most Recently Used hierarchies to the Most Used hierarchies by clicking the dropdown list under the number of tagged hierarchies. Each hierarchy in the list shows when the tags were last changed, and the number of datasets and columns that are tagged. Click a hierarchy link or click Manage to go to the Catalog where you can create hierarchies and manage tags.

Profiling Metrics

The Profiling Metrics tile shows a bar chart containing the number of datasets that have been profiled on each connection. The number of successfully profiled fact sheets is shown at the top of the tile.

Recently Published

The Recently Published tile provides up to five links to the Catalog where the published datasets are stored.

Glossary Metrics

The Glossary Metrics tile shows up to five links to the user's most recently created or updated terms in the Business Glossary. The top of the tile shows the number of terms and categories that have been created.

Related Information

Self-Service Data Preparation

1.1.2 Configure User Settings and View Notifications

Change the look of the background or set your default preferences.

Configure User Settings

1. From the Home page in the User Preferences tile, click Preferences.
2. (Optional) Click Theme, and then choose one of the options: Default, Belize, Belize Deep, High Contrast Black, or High Contrast White.
3. (Optional) Click Default View. Choose how you want items to be shown on the Browse Connections or Catalog. Items can be shown as a table list or as tiles in a grid format.
4. Click Save.

Manage Notifications

If you receive an error, warning, or informational message, the message pops up at the bottom of the screen, then disappears.

To view the list of recent messages, click  Notifications.

You can sort the notifications By Date or By Priority. Depending on the type of task processing, some buttons are shown after a task is done processing. If a task results in an error state, the button to View Detail is shown. Only the person who initiated the task can view the logs. The table shows which buttons are available for each type of task.

Button Shown                   Preparation    Profile    Publish
View Fact Sheet                No             Yes        No
View Logs                      No             No         No
View in Catalog                No             No         Yes
View in Manage Preparations    Yes            No         No
View in Manage Publications    No             No         Yes
View in Monitoring             Yes            Yes        Yes

To remove the message from the list, click  Close.

To close the Notifications pane, click a different pane, or click the Notifications icon again.

2 Managing Connections

Explore your connections and make informed decisions about the content of your datasets after publishing, profiling, or lineage analysis.

The connections and the content within those connections are the basis of exploring metadata content. The connections are created in the SAP Data Intelligence Connection Management application.

Supported Connections

The following connections are supported for certain features.

 Note

Certain object types may not work for all tasks. For example, SAP HANA Database Calculation View is not supported for running rules.

Alibaba Cloud Object Storage Service (OSS)
    Object Types: CSV, Image Files*, JSON*, JSONL*, ORC, PARQUET, PDF* (*preview only)
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source, Target; Manage Objects: Yes

Amazon Redshift
    Object Types: TABLES, VIEWS
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: No; Publish: No; Lineage: No; Rules: No; Prepare Data: No; Manage Objects: No

Amazon Simple Storage Service (S3)
    Object Types: CSV, Image Files*, JSON*, JSONL*, ORC, PARQUET, PDF* (*preview only)
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source, Target; Manage Objects: Yes

Cloud Data Integration
    Object Types: TABLES
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: No; Publish: No; Lineage: No; Rules: No; Prepare Data: No; Manage Objects: No

Google Cloud BigQuery
    Object Types: TABLES, VIEWS
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: No; Publish: No; Lineage: No; Rules: No; Prepare Data: No; Manage Objects: No

Google Cloud Storage (GCS)
    Object Types: CSV, Image Files*, JSON*, JSONL*, ORC, PARQUET, PDF* (*preview only)
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source, Target; Manage Objects: Yes

Hadoop Distributed File System (HDFS)
    Object Types: CSV, Image Files*, JSON*, JSONL*, ORC, PARQUET, PDF* (*preview only)
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source, Target; Manage Objects: Yes

IBM DB2
    Object Types: TABLES, VIEWS
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: No; Lineage: No; Rules: Yes; Prepare Data: Source; Manage Objects: No

Microsoft Azure Data Lake (ADL)
    Object Types: CSV, Image Files*, JSON*, JSONL*, ORC, PARQUET, PDF* (*preview only)
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source, Target; Manage Objects: Yes

Microsoft Azure Data Lake (ADL) v2
    Object Types: CSV, Image Files*, JSON*, JSONL*, ORC, PARQUET, PDF* (*preview only)
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source, Target; Manage Objects: Yes

Microsoft Azure Cloud SQL
    Object Types: TABLES, VIEWS
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source; Manage Objects: No

Microsoft SQL
    Object Types: TABLES, VIEWS
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source; Manage Objects: No

Microsoft Windows Azure Storage Blob (WASB)
    Object Types: CSV, ORC, PARQUET
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source, Target; Manage Objects: Yes

OData
    Object Types: TABLES
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: No; Publish: Yes; Lineage: No; Rules: No; Prepare Data: No; Manage Objects: No

Oracle
    Object Types: TABLES, VIEWS
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source; Manage Objects: No

Oracle MySQL
    Object Types: TABLES, VIEWS
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: Source; Manage Objects: No

SAP ABAP
    Object Types: TABLES, VIEWS, CDS VIEWS
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: No; Publish: Yes; Lineage: No; Rules: No; Prepare Data: No; Manage Objects: No

SAP ABAP Legacy
    Object Types: ODP Extractors
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: No; Publish: No; Lineage: No; Rules: No; Prepare Data: No; Manage Objects: No

SAP Business Warehouse (BW)
    Object Types: INFOPROVIDER, QUERY, DATASTORES
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: No; Publish: Yes; Lineage: Yes; Rules: No; Prepare Data: No; Manage Objects: No

SAP Cloud Platform Open Connectors
    Object Types: TABLES
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: No; Publish: No; Lineage: No; Rules: No; Prepare Data: No; Manage Objects: No

SAP HANA Database (HANA_DB)
    Object Types: CALCULATION VIEW, COLUMN TABLE, GLOBAL TEMPORARY TABLE, ROW TABLE, SQL VIEWS, VIRTUAL TABLE
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: Yes; Rules: Yes (see note about Calculation View); Prepare Data: Source, Target; Manage Objects: No

SAP IQ (SAP_IQ)
    Object Types: TABLE, VIEW
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: No; Publish: No; Lineage: No; Rules: Yes; Prepare Data: Source, Target (tables only); Manage Objects: No

SAP Vora
    Object Types: STREAMING TABLE, DATASOURCE TABLE, VIEWS
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: Yes; Rules: Yes; Prepare Data: Source, Target; Manage Objects: No

SDL
    Object Types: CSV, Image Files*, JSON*, JSONL*, ORC, PARQUET, PDF* (*preview only)
    Browse: Yes; Fact Sheet: Yes; Preview: Yes; Profile: Yes; Publish: Yes; Lineage: No; Rules: Yes; Prepare Data: No; Manage Objects: Yes

 Note

Certain object types may not work for all tasks. For example, SAP HANA Database Calculation View is supported for running rules in certain conditions only. Calculation views that do not require an input parameter to be set or have required parameters with a default value can be used in a rulebook. Calculation views that have required input parameters without a default value cannot be used in a rulebook. To check whether your calculation view can run rules, preview the data from the fact sheet without setting any input parameters. If you can preview the data, then you can run rules.

 Note

To explore connections and perform certain tasks in Metadata Explorer, you must have one or more connections set and the appropriate permissions for those supported connection types. Depending on the policy configuration, you may have rights to see certain connections and the data within the dataset. Due to the policy configuration, you may not be able to see all connections or view the data from the dataset in areas where it’s exposed. For example, you may not see data in the Preview Data tab on the fact sheet.

Access Connections Page

Navigate to Browse Connections by clicking the Metadata Explorer navigation menu and choosing Catalog > Browse Connections.

Search Connections

In the Filter connection names search box, enter all or a portion of the connection name, and then click  Search.

You can view the connections in a List view or a Grid view. Both options show the connection name, description, and type.

Click  More Actions, and then View Capabilities to see information about whether the contents of the connection have permissions for profiling, publishing, data preparation, and lineage analysis. The permissions are set based on the connection type. If you navigate to the file level, this dialog shows the file type, size, owner, folder path, and the number of columns. To view the column names, data types, and any column tags applied, click the Columns tab. It also shows whether it was published, profiled, or has lineage extracted.

On a folder, click  More Actions to create a New Publication, Delete Folder, or Rename Folder. If you perform the same action on an unpublished dataset, you can also View Fact Sheet, Start Profiling, and Prepare Data. If you click More Actions on a published dataset, you can also View in Catalog. You can also View Lineage, if the published dataset has lineage.

 Note

View in Catalog is only available if the connection has been published.

Related Information

Overview of Metadata Extraction [page 19]
Profile a Dataset [page 21]
Publish a Dataset [page 22]
Manage Files and Folders [page 25]
Managing Policies

2.1 Overview of Metadata Extraction

Learn when to profile, publish, or create rules for a dataset, or discover the lineage of an object.

About Metadata

Metadata is available on published objects. A published object, also known as a published dataset, can be generated from various source objects: a connection, a schema or folder on a connection, or an object like a view, table, or file. You select a source connection or an object within the connection and then publish it. The published object contains the metadata and is placed in the Catalog. This table summarizes the primary metadata actions.

Each action in Browse Connections starts from a source object (a connection, folder, table, file, view, and so on):

● Publish: Create a published dataset in the Catalog.
● Profile: Supplement the dataset metadata with minimum and maximum values, the average value length, and a count of null, blank, and zero values in the Catalog.
● Rules: Update the pass and fail counts in the Rulebook and Rules Dashboard.
● Lineage: Supplement the dataset with information about the source datasets and transformations in the Catalog.

Choose an object and click  More Actions, and then select View Metadata.

When you publish, the metadata is stored in the Catalog, so you can retrieve a snapshot of the metadata at the time the source object was published. However, if the data in the object has changed since it was last run, then the information in the Catalog is likely out of date.

About Publishing and Profiling

Publishing extracts the metadata from the source object and places the information in the Catalog as a published dataset. With the published dataset stored in the Catalog, you can share the content with others, and search and tag your information.

Profiling produces extra metadata about the values in the dataset. For example, you can view the unique or distinct values, the minimum and maximum values, average length, and whether there are null, blank, or zero values. This information can help you determine which datasets may need cleansing, masking, or any number of options available in the SAP Data Intelligence Modeler.
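As a rough illustration of the statistics described above, the kinds of column metrics profiling produces could be computed like this. This is a simplified sketch, not the product's profiling engine:

```python
def profile_column(values):
    """Compute simple profile metrics for one column of string values.

    Illustrative only: counts null, blank, and zero values, and reports
    distinct values, minimum, maximum, and average length.
    """
    non_null = [v for v in values if v is not None]
    return {
        "null_count": sum(1 for v in values if v is None),
        "blank_count": sum(1 for v in non_null if v.strip() == ""),
        "zero_count": sum(1 for v in non_null if v.strip() == "0"),
        "distinct_count": len(set(non_null)),
        "min_value": min(non_null) if non_null else None,
        "max_value": max(non_null) if non_null else None,
        "avg_length": (sum(len(v) for v in non_null) / len(non_null)
                       if non_null else 0),
    }

# Hypothetical sample column with a null, a blank, and a zero value.
stats = profile_column(["A12", None, "", "0", "B9", "B9"])
```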

About Rule Processing

Create rules and rulebooks to analyze your source data. You can define expressions that test for various conditions such as less than, greater than, and so on, to evaluate one or more columns of data. Gain more insights into your data by viewing pass and fail results. You can also see samples of records that failed your expressions. This information can help you identify steps to improve your data.
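Conceptually, a rule is a boolean expression evaluated per record, with pass and fail counts tallied across the dataset and a few failed records kept as samples. The helper below is a hypothetical sketch of that idea, not the product's rule engine or API:

```python
def run_rule(records, rule, sample_size=3):
    """Evaluate a boolean rule against each record and tally results.

    `rule` is any predicate over a record dict; the failed-record
    samples mirror the failure examples shown in rulebook results.
    Illustrative only.
    """
    passed, failed, samples = 0, 0, []
    for record in records:
        if rule(record):
            passed += 1
        else:
            failed += 1
            if len(samples) < sample_size:
                samples.append(record)
    return {"passed": passed, "failed": failed, "failed_samples": samples}

# Hypothetical example: an "age must be between 0 and 120" rule.
customers = [{"age": 34}, {"age": -1}, {"age": 200}]
result = run_rule(customers, lambda r: 0 <= r["age"] <= 120)
```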

About Lineage Analysis

When you're looking for more information about the origin of a dataset or want to learn where a dataset has been used, you can run lineage analysis when publishing a dataset. Lineage is available on some connections and in supported modeling graph operators.

After lineage is processed, you can see a graphical representation that shows the input and output of transformations performed on the dataset.
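A lineage graph of this kind can be thought of as datasets (nodes) connected by transformation edges. The sketch below is illustrative only; the dataset names are hypothetical, and real lineage is extracted by the product:

```python
# Edges map each dataset to the datasets it was produced from.
# These names are hypothetical examples.
lineage = {
    "sales_report": ["sales_clean"],
    "sales_clean": ["sales_raw", "region_codes"],
}

def upstream(dataset, edges):
    """Collect all transitive source datasets of `dataset`."""
    sources = set()
    stack = list(edges.get(dataset, []))
    while stack:
        current = stack.pop()
        if current not in sources:
            sources.add(current)
            stack.extend(edges.get(current, []))
    return sources

origins = upstream("sales_report", lineage)
```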

Workflow Example

This workflow is one possible scenario of using the Metadata Explorer.

● The administrator profiles the data and learns that the data contains a large amount of blank and null values.
● The administrator publishes the data to the Catalog so that a business user, who is more familiar with the dataset, can review the data. The administrator also adds the tag missing data to the dataset in the Catalog.
● The administrator creates rulebooks, rules, and scorecards to set the threshold of quality data.
● The business user searches the Catalog for the missing data tag and finds the published dataset.
● The technical user creates a data pipeline in the SAP Data Intelligence Modeler to populate the missing information.
● The information can replace the original source data, or it can be output to a different target location.
● The administrator can publish and profile the data again to see improvements in the data. Profiling can show the trends between the last few profiling tasks in the fact sheet.
● The administrator can run lineage extraction to see the dataset changes graphically and see the sequence of events from the original source to the target.

2.2 Profile a Dataset

Profile to learn more about your data and where it may be lacking information.

Context

Profiling data helps you learn more about your data. For example, you can see if there are null or blank values, distinct and unique values, and minimum, maximum, and average length values. You must profile the data to view this information in the fact sheet. To view the list of connections available for profiling, see Managing Connections [page 13].

Depending on the dataset size and various computation aspects, the application may return sampling-based profiling. When the application returns sampling-based profiling, you will see a notification on the fact sheet.

Procedure

1. From the Home page, click the Metadata Explorer navigation menu and choose Catalog > Browse Connections.
2. Select the connection and navigate to the object to profile.
3. Click More Actions, and then choose Start Profiling.
4. Confirm that you want to start profiling by clicking Yes.

The profiling task begins.

 Note

SAP HANA objects with parameters or variables cannot be profiled or shown in a data preview.

Results

Click the Metadata Explorer navigation menu, and then choose Monitoring to view the progress of the profiling task. After the profiling task is complete, choose More Actions > View Fact Sheet to preview data and view the fact sheet.

2.3 Publish a Dataset

Publish the dataset to make a local copy of the metadata.

Context

In Browse Connections, you can publish datasets to the Catalog. After the data is published, you or your colleagues can search the metadata, add comments to the objects, and tag datasets. If you have a supported connection and a dataset that has lineage, you can choose to extract lineage and view the lineage on a published dataset. Typically, browsing the Catalog page is faster than browsing the remote connection.

 Note

When using an SAP HANA database, you can’t publish tables in the SYS or _SYS_REPO schemas. You can prevent these schemas from being shown in the Metadata Explorer by adding their names to the Blacklisted Schemas in Connection Management. For details, see HANA_DB.

Datasets can also be published from the Managing Publications page. See Create a Publication [page 73].

Procedure

1. Click the Metadata Explorer navigation menu and choose Catalog > Browse Connections.
2. Select the connection and navigate to the object to publish.
3. Choose one option:
   ○ Click More Actions, and then choose New Publication.
   ○ Drag the object to the right panel.

4. Enter a name and description for the publication on the New Publication tab.
5. If you selected a folder and want to publish the objects in the folder or subfolders (based on any File Names or Patterns specified), then select Include Subfolders. When selected, the file patterns are used in the current and all subfolders. For example, if you specified *.csv, then only the CSV files are processed in the subfolders. When not selected, the file patterns are used in the current folder only.

 Note

You can add multiple datasets to the publication. Drag and drop those objects onto the File Names or Patterns area. These objects must all be within the same source folder.

6. (Optional) When available, set the Lineage option to On to recursively extract the lineage from the dataset in a lineage graph. Any datasets found during lineage extraction are also published in the catalog.
7. Click Publish.

The publishing task begins.
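The Include Subfolders behavior described above can be sketched with standard glob matching. This is an illustrative approximation under the assumption that a single file-name pattern applies per scan, not the product's implementation:

```python
from fnmatch import fnmatch
from pathlib import Path

def select_files(root, pattern, include_subfolders):
    """Pick files under `root` whose names match `pattern`.

    Mirrors the publication behavior described above: with subfolders
    included, the pattern applies in the current and all child folders;
    otherwise only the current folder is scanned. Illustrative only.
    """
    paths = (Path(root).rglob("*") if include_subfolders
             else Path(root).glob("*"))
    return [p for p in paths if p.is_file() and fnmatch(p.name, pattern)]
```

For example, with a pattern of *.csv and Include Subfolders on, a CSV file in a child folder is selected; with it off, only CSV files directly in the source folder are selected.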

Results

Click the Metadata Explorer navigation menu, and then choose Monitor > Monitor Tasks to view the progress of the task. Navigate to the Catalog to view publishing results. Your published object is located in the folder with the same connection name as the source.

Related Information

View Published Datasets in a Folder or Connection [page 23]
Update or Delete a Publication [page 24]
View Connection Capabilities [page 25]

2.3.1 View Published Datasets in a Folder or Connection

View published datasets from a connection or from a particular folder on a connection.

Context

When preparing to publish a source dataset or folder, you may want to know whether other source datasets have been published in a folder or on the connection.

Procedure

1. Click the Metadata Explorer navigation menu and choose Catalog > Browse Connections.
2. Select the connection and navigate to an object.
3. Drag the object to the New Publication tab in the Publications panel.
4. Choose an option.

○ To view the published datasets within the current folder, click All Publications on the New Publication Source Folder tab.
○ To view the published datasets within the current connection, click the Connection Publications tab.

Results

The published datasets are shown. If you are on the Connection tab, then the published folders and datasets are shown. Click the arrow next to the published dataset or folder to edit the publication.

2.3.2 Update or Delete a Publication

Update the name or description, add or remove files, or delete a publication.

Context

You can change the name or description of a publication without having to republish the publication. If you update the source folder, subfolders, or patterns, run the publication again. You can also delete a publication.

Procedure

1. Click the Metadata Explorer navigation menu and choose Catalog > Browse Connections.
2. Select the connection.

○ To view all publications on a connection, click Connection Publications.
○ To view publications on a specific folder, drag the folder to New Publication and then click All Publications on the New Publication Source Folder.
3. Click the arrow.
4. Edit the publication, and then choose an option.

○ To update a publication name or description without having to republish, click Update Publication.
○ To update a publication and then publish the publication, make any necessary changes to the options, and then click Update and Publish.

○ To delete a publication, click Delete.

 Note

Deleting a publication removes the published objects from the Catalog. If you published a folder, the published datasets within the folder are removed from the Catalog. The folders and datasets return to a Not Published state in the Browse Connections page. However, when a dataset or folder is included in another publication, those objects are kept in the Catalog.

2.3.3 View Connection Capabilities

View whether the connection can perform certain tasks such as extracting lineage, applying rules, creating data preparation, and so on.

Context

Some features are available on certain connections. In Metadata Explorer, you can view the supported actions for each of your connections. To view all of the connection capabilities for all supported connections, see Managing Connections [page 13].

Procedure

1. Click the Metadata Explorer navigation menu and choose Catalog > Browse Connections.
2. On the connection you want to view, click More Actions.
3. Choose View Capabilities.

The right-side panel shows the capabilities available on the connection. The items with a red X are not supported. The items with a green checkmark are supported.

2.4 Manage Files and Folders

View, upload, download, delete, and rename files. Add, delete, and rename folders on supported connections.

Context

For certain connection types, you can manage files and folders. The following connection types are supported for uploading files.

● Amazon S3
● Google Cloud Storage (GCS)
● Hadoop Distributed File System (HDFS)
● Microsoft Azure Data Lake (ADL)
● Microsoft Windows Azure Storage Blob (WASB)
● SDL

Procedure

1. Click the Metadata Explorer navigation menu. Choose Catalog > Browse Connections.
2. Choose a supported connection.
3. Choose one or more actions.

Action Additional Steps

Create a folder
1. Click New Folder.
2. Enter a folder name, and then click OK.

Delete a folder
1. Select the folder you want to delete.
2. Click More Actions, and then choose Delete Folder.
3. Click Yes to confirm the deletion.

 Note All subfolders and files are deleted.

Rename a folder
1. Select the folder you want to rename.
2. Click More Actions, and then choose Rename Folder.
3. Enter a name, and then click Save.

Upload a file
1. Navigate to an existing folder, and then click Upload Files.
2. Click Add Files and choose your files, or drag and drop files from an external file browser.

 Note You can add multiple files at once, but each file must be less than 100 MB. This size can be changed by your administrator in System Management.

3. Click Upload.
4. Click Close to return to the Browse Connections page.

Download a file You can also download files from the Catalog.

1. Navigate to the file you want to download.
2. Choose More Actions, and then select Download File.
3. Complete the download process based on the web browser you’re using.

You can also download the file from the Fact Sheet.
1. Choose a supported connection and navigate to the file you want to download. Click View Fact Sheet.
2. Click Download File.


Delete a file
1. Select the file you want to delete.
2. Click More Actions, and then choose Delete.
3. Click Yes to confirm the deletion.

Rename a file
1. Select the file you want to rename.
2. Click More Actions, and then choose Rename.
3. Enter a name, and then click Save.

View a file You can also view files from the Catalog.

 Note

You can only view files that are 15 MB or less. For larger files, you can download and view them locally.

1. Select the file you want to view.
2. Choose More Actions, and then select View File.

 Note

If you’re viewing a JSON or JSONL file that was converted to a single string, click Format to make it easier to read. You cannot change the content of the file.

If you’re viewing an image file, you can make some formatting changes (for example, contrast, brightness, and color), and then click Apply and Save. The edited image is saved locally; the original file on the system remains unchanged. You can upload the edited file on certain connections. See Manage Files and Folders [page 25].

You can also view files on the Fact Sheet.
1. Choose a supported connection and navigate to the file you want to view. Click View Fact Sheet.
2. Click View File.

3 Managing the Catalog

The Catalog is the target location for published datasets.

When you publish a dataset, a folder with the name of the connection is created, and the published datasets are placed in the corresponding connection's folder. You can perform the following tasks in the Catalog:

● explore the contents of the folders and mark folders as a favorite
● view the dataset's metadata
● preview the data
● view the fact sheet
● create and manage tag hierarchies
● start profiling on supported source objects
● prepare data (enhance and enrich source datasets)
● view lineage on supported source objects

Set a Favorite Folder

Setting a favorite folder creates a link to the folder from the Discovery Dashboard page in the Favorites tile and in the Manage Favorites page. You can add a favorite to any top-level or sublevel folders. To add a favorite, click  Toggle Favorite next to the desired folder.

Search Folders or Datasets

To search, enter a portion of the search term in the Search entire catalog text box, and then click  Search. When you are at the root level, you can also filter based on the connection name.

To search for published datasets or columns that are tagged, navigate to the hierarchy that contains the tag you want to search for. Select the tag, click More Actions, and choose Use Tag as Search Filter. You can search on multiple tags. If the tags are from the same hierarchy, the OR operator is used; for example, the search matches the tags Phone OR Email. If tags from different hierarchies are used, the AND operator is used; for example, Phone OR Email AND CustomerID, where CustomerID is in a different hierarchy than Phone and Email.
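The tag-search semantics described above (OR within a hierarchy, AND across hierarchies) can be sketched as a small boolean filter. This is an illustrative model, not the product's search implementation; the hierarchy and tag names are taken from the example.

```python
def dataset_matches(dataset_tags, selected_tags):
    """Sketch of the described semantics: tags chosen from the same
    hierarchy are OR-ed; the per-hierarchy groups are then AND-ed.

    `selected_tags` maps hierarchy name -> set of chosen tags;
    `dataset_tags` is the set of tags on the dataset.
    """
    # all() requires every hierarchy group to contribute at least one match;
    # the nonempty set intersection realizes the OR within a group.
    return all(bool(dataset_tags & tags) for tags in selected_tags.values())

selected = {"ContactInfo": {"Phone", "Email"}, "Keys": {"CustomerID"}}
dataset_matches({"Phone", "CustomerID"}, selected)  # True: (Phone OR Email) AND CustomerID
dataset_matches({"Email"}, selected)                # False: no CustomerID tag
```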

Click Filter to search by minimum average rating or by dataset type. When searching by average rating, the results show datasets with the selected number of stars and higher. For example, if you choose 4 stars, then the results show those datasets with 4- and 5-star ratings. Click Apply.

You can view the folders or objects in a List view or a Grid view.

View Metadata

To view metadata, navigate to the dataset and either click the dataset or click  More Actions, and then choose View Metadata.

A pane on the right side opens showing two tabs:  Information and  Columns.

Depending on the connection, some of the following information is found on the Information tab. In the Properties section, you could see the following information.

Properties Option Description

Name Name of the dataset.

Description The object description pulled from the source.

Type The file type of the dataset such as CSV, ORC, table, view, and so on.

Size The size of the dataset in bytes.

Last Modified The date and time the dataset was changed.

Owner The owner or name of the folder containing the dataset.

Connection ID The name of the connection.

Schema The schema containing the dataset.

Folder The folder containing the dataset.

Status The status showing the tasks run on the dataset such as lineage or profiled.

Search Rank The numerical rank showing how closely the search term matched the dataset. Number 1 is the highest ranking.

Matched Terms Terms used in a search that included this dataset in the search results. Click  View Match Information to view how the term matched the dataset. For example, the match might be a file name or extension, tag, or column name.

Last Profiled The date and time the dataset was most recently profiled.

Last Published The date and time the dataset was most recently published.

In the Related Objects section, you could see the following information.

Related Objects Option Description

Tags A list of the tags applied to the dataset, and the name of hierarchy where the tags are located. You can filter the tags by entering the name of a tag, and then clicking  Filter.

Rulebooks A list of links to rulebooks where the dataset is used. Click the link to view the rulebook details. Rulebooks are available only on datasets and connections that support rulebooks.

Terms A list of terms associated with the dataset. Click the link to view the term details in the glos­ sary.

The Columns tab lists the number of columns in the dataset and the names of the columns. Click the right arrow to view more information about the column such as the type, native type, and description, depending on the object type. You can also view tags and terms associated with the column. To add more tags to the column, you can drag and drop tags.

View the Fact Sheet

To view the fact sheet, navigate to the object. Click  More Actions and then choose View Fact Sheet.

To preview the data, click the Data Preview tab.

 Note

If the structure of the data changed on the remote source, you can’t preview the data; for example, if you change a column name from 'Computer' to 'Device'. Run the publication again, and then preview the data as usual. Alternatively, you can switch to the remote connection and preview the metadata from the source instead of the outdated metadata in the catalog.

View the Dataset in Browse Connections

To view the dataset in Browse Connections, navigate to the object and then click  More Actions, and then choose View in Browse.

Profile Data

To profile the data, navigate to the object and then click  More Actions, and then choose Start Profiling.

Add Tags to Datasets

To add tags, navigate to the object. In the left pane, select the hierarchy that contains the tags you want to use by clicking the hierarchy drop-down list and choosing the hierarchy. Then choose a tag and drag it onto the dataset. If you choose a tag that has subtags, only the tag you chose is added to the dataset, not its subtags. If you select a subtag, the parent tags are included in the tag path. For example, if you have a lighting department with the following tag hierarchy and drag the tag for "table", the tag on your dataset would be "indoor/lamps/table" under the category "lighting".

● indoor
  ○ ceiling
  ○ lamps
    ■ desk
    ■ floor
    ■ table
    ■ under cabinet
    ■ wall
● outdoor
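Resolving a dragged subtag to its full path, as in the "indoor/lamps/table" example, can be sketched as a walk up the parent chain. This is an illustrative sketch; the function and the parent-map representation are assumptions, not the product's data model.

```python
def full_tag_path(hierarchy, tag):
    """Resolve a tag to its full path by walking up the parent chain.
    `hierarchy` maps each tag to its parent (None for top-level tags).
    The hierarchy (category) name itself is tracked separately.
    """
    parts = []
    while tag is not None:
        parts.append(tag)
        tag = hierarchy[tag]
    return "/".join(reversed(parts))

lighting = {"indoor": None, "lamps": "indoor", "table": "lamps", "outdoor": None}
full_tag_path(lighting, "table")   # "indoor/lamps/table"
full_tag_path(lighting, "indoor")  # just "indoor"; its subtags are not added
```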

Permissions for Viewing Data in the Catalog

You must have certain rights or permissions to access all the functionality on the Catalog page. To adjust your permissions, see your system administrator. If you don’t have permission, the following options are disabled: View in Browse, Start Profiling, and Prepare Data.

Related Information

Create Hierarchical Tags [page 31]
Manage Tags [page 32]

3.1 Create Hierarchical Tags

A hierarchical tagging system helps you to organize and manage your tags.

Context

A ContentType hierarchical tagging structure is included. After you publish a dataset, and then profile the dataset, any tags that match the content types in your data are automatically added as tags. For example, if you have address data that has a Country column, then the tag Location/Country is applied to that column automatically. You can always remove these automatically generated tags from your columns. However, you can’t add or remove tags from the ContentType hierarchy.
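The automatic tagging described above can be sketched as a lookup from detected content types to ContentType tags. This is a hypothetical illustration: the content-type codes and the mapping table are invented here; only the Location/Country example comes from the text.

```python
# Hypothetical mapping from a profiler's detected content type to a tag
# in the built-in ContentType hierarchy (codes are illustrative).
CONTENT_TYPE_TAGS = {
    "COUNTRY": "Location/Country",
    "PHONE": "ContactInfo/Phone",
}

def auto_tags(profiled_columns):
    """`profiled_columns` maps column name -> detected content type
    (or None when profiling found no recognizable content type).
    Returns the tags that would be applied automatically."""
    return {col: CONTENT_TYPE_TAGS[ct]
            for col, ct in profiled_columns.items()
            if ct in CONTENT_TYPE_TAGS}

auto_tags({"Country": "COUNTRY", "Notes": None})  # {'Country': 'Location/Country'}
```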

Apply tags to your datasets or columns to help you search for datasets containing those assigned tags. Let's say that you work for a home store and want to create tags for the lighting department. You create a category called Lighting and add these tags:

● indoor
  ○ ceiling
  ○ lamps
    ■ desk
    ■ floor
    ■ table
    ■ under cabinet
    ■ wall
● outdoor
  ○ ceiling
  ○ landscape
  ○ pendants
  ○ post

  ○ security

Procedure

1. Navigate to the Catalog by clicking the Metadata Explorer navigation menu and choosing Catalog > Browse Catalog. In the Tag Hierarchies panel, click More Actions, and then choose Manage Tag Hierarchies.

The Manage Tag Hierarchies dialog opens.
2. Click Add.
3. Enter a name for the hierarchy, for example, Lighting.
4. Optionally, enter a description, such as Lighting department tags.
5. Click Save.
6. Click Close to go to the Catalog.
7. Click the down arrow to choose the hierarchy you created.
8. Click More Actions and choose Add Tag to Hierarchy.

The Add Tag dialog opens.
9. Enter the name of the tag, for example, indoor. Optionally, add a description or choose a color for the tag.
10. Click Save.

○ To add another tag at the same level, repeat the previous step by clicking More Actions and Add Tag to Hierarchy next to the name of the hierarchy. Add another tag, for example, outdoor.
○ To add a subtag, select the tag you want it under; for example, select outdoor and add the subtag ceiling.
11. Repeat adding tags until your hierarchy is complete.

3.2 Manage Tags

Add tags to a dataset or column; edit or delete tags that are no longer relevant.

You can view the tags in the Catalog, Lineage, and Fact Sheet pages.

To complete the following tasks, navigate to the Catalog by clicking the Metadata Explorer navigation menu and choosing Catalog > Browse Catalog.

Add Tags to a Dataset

Tags can be added to published datasets only. You can also add tags to datasets and columns in the Fact Sheet. For details, see View Fact Sheet [page 84].

1. In the Catalog, find the dataset that you want to tag by browsing or searching the Catalog.
2. Open the hierarchy that contains the tags that you want to apply by clicking the down arrow and choosing the hierarchy from the list.
3. Select the tag you want to use.
4. Drag the tag onto the dataset, or click More Actions, and then select Add Tag to Dataset.

Add Tags to a Column

1. In the Catalog, open the hierarchy that contains the tags that you want to apply by clicking the down arrow and choosing the hierarchy from the list.
2. Search for the dataset that you want to tag by entering a search term and clicking Search.
3. Select the dataset, and then click More Actions and select View Metadata.
4. Click the Columns tab within the Dataset Metadata pane, and select the column that you want to tag.
5. Select the tag from the hierarchy that you want to use and drag it to the column, or click More Actions and then choose Add Tag to Column.

Edit or Delete a Tag Hierarchy

To change the hierarchy name or description, or to delete the hierarchy:

1. Click the down arrow and choose the hierarchy you want to edit.
2. Click More Actions, and then choose Manage Tag Hierarchies.
3. Update the hierarchy name or description, and then click Save. To delete the hierarchy, click Delete.

 Note

When deleting a hierarchy, all of the tags are deleted from datasets and columns that used the tags.

Remove Tags from a Dataset or Column

A dataset or column may be mis-tagged, or you may want to delete a tag from a dataset or column, but not from the tag hierarchy.

1. Choose the tag and click More Actions, and then choose Use Tag as Search Filter.
2. Choose the dataset that you want the tag removed from.
3. If removing a tag from a dataset, scroll down on the Info tab in the right-side panel, and then click the X next to the tag you want to remove from the dataset. If removing a tag from a column, click the Columns tab. Click the right arrow next to the column that is tagged. Then click the X next to the tag you want removed from the column.

Delete Tags from a Hierarchy

1. Choose the tag and click More Actions, and then choose Delete Tag from Hierarchy.
2. Click Yes to confirm that you want the tag deleted.

 Note

When deleting a tag, the tag and all of its subtags are removed from the datasets and columns that used them.

Find Datasets or Columns Using a Tag

You can find datasets that use a tag in different ways.

● Choose the tag and click Use Tag as Search Filter.
● Choose the tag and click More Actions, and then choose Use Tag as Search Filter.
● Click Filter and click Edit tag filters. Enter the name of the tag in the Filter tag names box, and click Search. Select the tag and click Use Tag as Search Filter. On the Filters dialog, click Apply.

The datasets that have the tag name in the Matched Terms are shown. To view the Matched Terms, select the dataset. The Dataset Metadata panel opens. On the Info tab under Matched Terms, click  View Match Information to see more details about the matching term.

Change the Default Hierarchy

A default hierarchy is shown in the Fact Sheet and Catalog. You can set the hierarchy that you want to appear on those pages.

1. Click the down arrow and choose the hierarchy you want to be the default.
2. Click More Actions, and then choose Set as Default Hierarchy.

4 Rules

Rules help to determine whether the data complies with business constraints and requirements.

There are several steps for successful rule implementation.

1. Create a category or select an existing category.
2. Create and test a rule.
3. Create a rulebook.
4. Bind the rule to a dataset.
5. Run the rulebook.
6. Create a dashboard to view your customized scorecards.

Rules are created within a rule category. You can use the predefined categories or create your own categories.

When you create a business rule, you can define one or more parameters. Then use the parameter to set one or more conditions that determine which records pass the rule.

Use rulebooks to bind your rules to datasets, by assigning a column to the parameter. Then you can run all your rules in your rulebook.

After running the rules, view the results and create passing and failing thresholds.

After you have rules, categories, and rulebooks defined, you can create a dashboard and create scorecards to reflect whether the dataset has passed the rules. You can configure the passing threshold and apply weighting to certain rules or categories. The rules can be applied to multiple datasets. The dashboard shows all the scorecards, so you can get a quick view regarding the status of your datasets.
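The weighted scoring described above can be sketched in a few lines. This is an illustrative model of how a weighted scorecard could combine rule results against a passing threshold; the exact formula the product uses is not specified here, and the names are invented.

```python
def scorecard(results, weights, passing_threshold):
    """Illustrative sketch of a weighted quality score: each rule
    contributes its pass rate (0-100) scaled by its weight, and the
    dataset passes when the weighted score meets the threshold.
    `results` maps rule -> (passed_records, total_records)."""
    total_weight = sum(weights.values())
    score = sum(100.0 * p / t * weights[rule]
                for rule, (p, t) in results.items()) / total_weight
    return score, score >= passing_threshold

results = {"Completeness": (90, 100), "Conformity": (60, 100)}
score, passed = scorecard(results, {"Completeness": 3, "Conformity": 1}, 80)
# score = (90*3 + 60*1) / 4 = 82.5, so the dataset passes an 80% threshold
```

Raising the Conformity weight in this sketch would pull the score below 80 and fail the dataset, which is the effect weighting is meant to have.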

Related Information

Create a Category [page 36]
Create a Rule and Test Case [page 37]
Create a Rulebook [page 44]
Run Rulebooks and View Results [page 48]
Create a Dashboard [page 50]
Import Rules from SAP Information Steward [page 55]
View Terms Related to a Rulebook [page 58]

4.1 Create a Category

Use categories to create and organize your rules.

Context

There are several predefined categories where you can create your rules.

Accuracy: The data reflects a standard or real-world value. Example: A business address is in an actual building.

Completeness: All the necessary data is present. Example: An order must have values for product, price, and quantity.

Conformity: The data has the correct data type and is in the required format. Example: The date of birth must be a DATE type and have the YYYY/MM/DD format.

Consistency: The data values match within the dataset and across similar datasets. Example: When a product is discontinued, there should not be any sales of the product.

Integrity: The data relationships are valid and connected. Example: A customer name and a valid address must be related; otherwise the record is orphaned.

Timeliness: The data is current and available. Example: A business must report its quarterly results by a certain date.

Uniqueness: The data has no duplicate records, rows, primary keys, homonyms, or synonyms. Example: A new product name is not the same name or similar sounding to a competitor's product.

Validity: The data supports a policy, conclusion, or measurement. Example: A new product claims to weigh a specific amount, and the weight is verified in an independent test.

You can also create your own categories (up to 100, including the pre-defined categories). For example, you may want a category called Part Numbers where you place all the rules to validate the different part numbering schemes within the organization. You must have a category before you create a rule.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules.
2. Choose Create Rule Category.
3. Enter a name and description of the category. Click Save.

Edit or Delete a Category

Context

You may want to edit the name or description of a category, or you may want to remove a category.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules.
2. Select the category that you want to edit or delete. Click Category Actions.

○ To edit the category, choose Edit Rule Category. Change the name or description, and then click Save.
○ To delete the category, choose Delete Rule Category. Click Yes to confirm.

 Note

All rules must be moved or deleted before deleting the rule category.

4.2 Create a Rule and Test Case

Create business rules to validate whether your data complies with the business requirements. Test the rule to learn whether the data passes or fails the rule.

Context

Rules require that you define one or more parameters and one or more conditions. Optionally, you can also add one or more filters for each parameter. A parameter could be the name of a frequently used column in your datasets, such as PartID. A condition defines the criteria that determine whether a record passes or fails. For example, one condition could be that the PartID must start with the numeric sequence 5581. A filter defines the records that are sent to the rule. For example, one filter could be that PartID is not null.

After running the rule, records that have a null PartID are filtered out and never reach the rule. PartID values beginning with 5581 pass the rule; records that begin with a different sequence of numbers fail the rule.

You can also specify logical expressions for your conditions. See the example shown later in this topic.

After the rule is created, you can test the rule with some sample data to see if the results meet your expectations.
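The PartID example from the context above can be sketched as follows, treating filtered records as not sent to the rule. This is an illustrative re-implementation with invented names, not the product's rule engine.

```python
def run_rule(records):
    """Sketch of the PartID example: a filter keeps null PartIDs out of
    the rule, and the condition passes records whose PartID starts with
    5581. Returns (passed, failed, filtered_out)."""
    passed, failed, filtered_out = [], [], []
    for part_id in records:
        if part_id is None:                     # filter: PartID is not null
            filtered_out.append(part_id)
        elif str(part_id).startswith("5581"):   # condition: starts with 5581
            passed.append(part_id)
        else:
            failed.append(part_id)
    return passed, failed, filtered_out

run_rule(["55810023", None, "77810001"])
# (["55810023"], ["77810001"], [None])
```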

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules.
2. Expand the rule category where you want to place your rule.
3. Click Create Rule.
4. Enter the Rule ID, Name, and a Description. Click Save.

The Rule Definition page is shown.
5. Create one or more parameters (up to 20) by clicking Add Parameter.
   a. Enter the parameter name.
   b. Choose a data type.
   c. Optional. If you choose String as the data type, you can choose one or both of the following options. These settings are applied when the parameter is used in rule conditions and filters.

○ Case Insensitive: The value is not case-sensitive. For example, Ocean and ocean are treated the same.
○ Trim: Any spaces before or after the value are removed.
   d. Enter a description.
   e. Repeat the substeps to add more parameters, if necessary.
   f. Click Save.
6. Create one or more conditions (up to 20) by clicking Add Condition.
   a. Enter a condition name.
   b. Choose one of the parameters.
   c. Choose an operator.
   d. Choose a Mode if the selected operator requires it.
      ○ User Entry: Set the value or format manually. For example, if you selected the operator is between, you can enter the values from 150 through 500 and return those records where the parameter value falls from 150 through 500.
      ○ Parameter Value: Set the values or format based on values from parameters with the same data type. Let's say that you created three parameters that have a String data type: Part2020, Part3355, and Part7788. The purpose of your condition is to determine whether the value of Part2020 is also in Part3355 and Part7788. So, your condition has the parameter name Part2020, the operator is set to is in set, the mode is Parameter Value, and the value is Part3355 and Part7788.

 Note

If the Parameter Value is not enabled, you may not have enough parameters with the same data type created. For example, you have a parameter named Date that has a defined Date data type. You’re creating a condition using the Date parameter with an operator of is between. If there are no other parameters defined with a Date data type, then only User Entry is available. If there are one or more additional parameters with a Date data type, then both options are available.

   e. Add one or more values or formats to define the operator, if necessary.
   f. Repeat the substeps to add more conditions, if necessary.

If you have a complex condition or you prefer to manually enter the condition, click Advanced. The condition you entered in Basic is shown as a script. You can also start in Advanced mode and skip the Basic mode. Use the Parameters, Operators, and Functions buttons to help you create an advanced script. Click Save. Validation is run on the script. If there are any errors or warnings, they’re shown to the left of the line number where the error or warning occurred. Click View Warning, Errors to read the list of warnings and errors.

 Note

The scripting language is based on the SAP Information Steward scripting language.

 Note

You cannot return to Basic mode until you delete the advanced script.

   g. Click Save.
7. (Optional) If you used Basic mode, you can change the condition logical expression by clicking Edit.
   a. Either type the expression in the editor, or click the conditions or expressions above the editor. Hover the cursor between the expressions or conditions, and the pencil icon appears. Click it to add conditions or expressions. Add parentheses to group conditions together. For example, (condition1 AND condition2) AND NOT condition3.
   b. Click Save.
8. (Optional) Create one or more filters to send to the rule (up to 20). This filter limits the number of records sent to the rule. Click Add Filter.
   a. Enter a filter name.
   b. Choose one of the parameters.
   c. Choose an operator.
   d. Choose a Mode if the selected operator requires it.
      ○ User Entry: Set the value or format manually.
      ○ Parameter Value: Set the values or format based on values from parameters with the same data type.
   e. Add one or more values or formats to define the operator, if necessary.
   f. Repeat the substeps to add more filters, if necessary.

If you have a complex filter or you prefer to manually enter the filter, click Advanced. The filter you entered in Basic is shown as a script. You can also start in Advanced mode and skip the Basic mode. Use the Parameters, Operators, and Functions buttons to help you create an advanced script. Click Save. Validation is run on the script. If there are any errors or warnings, they’re shown to the left of the line number where the error or warning occurred.

 Note

The scripting language is based on the SAP Information Steward scripting language.

 Note

You cannot return to Basic mode unless you delete the advanced script.

   g. Click Save.
9. (Optional) If you used Basic mode, you can change the filter logical expression by clicking Edit.
   a. Either type the expression in the editor or click the logical expression components above the editor. Hover the cursor between the expressions or filters, and the pencil icon appears. Click it to add filters or expressions. Add parentheses to group filters together. For example, (filter1 AND filter2) AND NOT filter3.

   b. Click Save.
10. Click Test Rule.
   a. Click Add Test Case.

 Note

Any defined conditions and filters are shown in the right-side panel.

b. Enter a value for each parameter. Enter some test cases that pass and some that fail the rule. If your test has filtered records, enter some record values that would not be sent to the rule. Click Run Tests.

Those test cases that pass have a green check mark in the Result. Those test cases that fail have a red X. Those test cases that have a Filter icon are not sent to the rule. You can see the defined conditions and filters in the right panel. When testing a null value, enter null as the value.
11. Click Go Back to return to the Rule Details page.

Example

For example, a certain number of product units must be sold depending on the size of the city. Small cities must meet a minimum quota that is smaller than large cities. Therefore, the minimum units sold and the population size must be met for the rule to pass. The parameters and conditions are set up as follows:

Condition Name | Parameter Name | Operator | Value List

BigCityQuota | UnitSales | is equal or greater than | 45000
SmallCityQuota | UnitSales | is equal or greater than | 15000
BigCitySize | Population | is between | 50001, 100000
SmallCitySize | Population | is between | 1, 50000

The conditional logical expression (BigCityQuota AND BigCitySize) OR (SmallCityQuota AND SmallCitySize) states that both the BigCity conditions within parentheses must be true or both the SmallCity conditions must be true for the record to pass the rule. The default setting is to AND the conditions, so that all of the conditions must be true for the record to pass the rule.

A filter can limit the number of records sent to the rule. This filter passes only those records that are not null to the rule.

Filter Name | Parameter Name | Operator

NoEmptySales | UnitSales | is not null
NoEmptyPopulation | Population | is not null

The filter logical expression (NoEmptySales AND NoEmptyPopulation) states that both the NoEmptySales and NoEmptyPopulation filters must be true for the record to be sent to the rule. This expression means that the values cannot be empty for those parameters; otherwise they are not sent to the rule. The default setting is to AND the filters, so that all of the filters must be true for the record to pass the rule.
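The complete worked example, filters first and then the condition expression, can be sketched as follows. This is an illustrative re-implementation of the example's logic, not the product's rule engine.

```python
def passes_filters(unit_sales, population):
    # (NoEmptySales AND NoEmptyPopulation): both values must be present
    return unit_sales is not None and population is not None

def passes_conditions(unit_sales, population):
    # (BigCityQuota AND BigCitySize) OR (SmallCityQuota AND SmallCitySize)
    big = unit_sales >= 45000 and 50001 <= population <= 100000
    small = unit_sales >= 15000 and 1 <= population <= 50000
    return big or small

def evaluate(unit_sales, population):
    """Records failing the filters are not sent to the rule; the rest
    pass or fail the condition logical expression."""
    if not passes_filters(unit_sales, population):
        return "filtered"
    return "pass" if passes_conditions(unit_sales, population) else "fail"

evaluate(46000, 80000)  # "pass" - big-city quota and size both met
evaluate(20000, 80000)  # "fail" - big city, but below the big-city quota
evaluate(None, 80000)   # "filtered" - empty UnitSales never reaches the rule
```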

Related Information

Edit a Rule [page 41]
Delete a Rule [page 42]
Edit or Delete a Test Case [page 42]
Values for the Match Pattern Operator [page 43]
View Terms Related to a Rule [page 44]

4.2.1 Edit a Rule

Change or delete the rule or the rule options including the parameters, conditions, and filters.

Context

As businesses change and evolve, so must the rules. Rather than creating a new rule, you can modify an existing rule to reflect the updated business standards.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules.
2. Expand the rule category where the rule is located.
3. Click Rule Details.
4. Choose to edit or delete one or more of these options.

Edit the name, description, or rule ID: Click Edit, and then type the new value. Click Save.

Edit a parameter: In the parameter group, click Edit Parameters. Make your changes, and then click Save Parameters. All of the conditions using the old name are automatically updated with the new name.

Delete a parameter: Next to the parameter you want to delete, click Delete Parameter. Click Yes to confirm that you want the parameter deleted.

Edit a condition: In the conditions group, click Edit Conditions. Make your changes, and then click Save Conditions.

Delete a condition: Next to the condition you want to delete, click Delete Condition. Click Yes to confirm that you want the condition deleted.

Edit a filter: In the filter group, click Edit Filters. Make your changes, and then click Save Filters.

Delete a filter: Next to the filter you want to delete, click Delete Filter. Click Yes to confirm that you want the filter deleted.

4.2.2 Delete a Rule

If you find that the rule is outdated or unnecessary, you can delete the rule.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules.
2. Select the rule, and then click Delete.
3. Click the Delete button.
4. Click Yes to confirm that you want the rule deleted.

 Note

Before deleting the rule, you must remove it from all rulebooks.

4.2.3 Edit or Delete a Test Case

Change or delete the test cases used to validate the rule.

Context

You may find that some test cases are outdated, or don't apply to a different dataset. Also, the test cases are reset when the parameters are changed. You can remove or edit the test cases to ensure that the data you want to pass or fail the rule works with the rule definition.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules.
2. Expand the rule category where the rule is located. Click Rule Details.
3. Click Test Rule.
○ To delete all the test cases, click Remove All.
○ To delete a single test case, click Test Case Actions, and then choose Delete Test Case.
○ To remove the content of a single test case, click Test Case Actions, and then choose Clear Test Case. Then you can enter new values in the test case.

4.2.4 Values for the Match Pattern Operator

The Match Pattern operator is an option in the rule conditions when you create a rule.

Rules require you to define a parameter and a condition for the parameter. When you choose to create a condition using the Match Pattern operator, use this information for writing the value.

X: Represents uppercase letters, per the Unicode 4.2 General Category Values specification, key Lu = uppercase letter (for example, Latin, Greek, Cyrillic, Armenian, Deseret, and archaic Georgian).

x: Represents lowercase and related letters, per the Unicode 4.2 General Category Values specification keys:
● Ll = Lowercase letter (for example, Latin, Greek, Cyrillic, Armenian, Deseret, and archaic Georgian.)
● Lt = Title case letters (for example, Latin capital letter D with small letter Z.)
● Lm = Modifier letter (for example, acute accent, grave accent.)
● Lo = Other letter (includes Chinese, Japanese, and so on.)

9: Represents numbers.

\: Escape character.

*: Any characters occurring zero or more times.

?: Any single character occurring once and only once.

[ ]: Any one character inside the brackets occurring once.

[!]: Any character except the characters after the exclamation point. For example, [!12] allows any number that does not start with 1 or 2.

All other characters represent themselves. To specify a special character as itself, use an escape character. For example, [!9] means any character except a digit. To specify any digit except 9, use [!\9].

The following table shows pattern strings and example values that match them:

Pattern String: Example Value
Xxxxxxx: Henrick
XXXXX: DAVID
Xxx Xx: Tom Le
Xxxx-xxxx: Real-time
XXX)$@&*xxX9999xX9#: JJD)$@&*hhN8922hJ7#
9,999: 1,553
9.99: 0.32
-99.99: -43.88
*Jones: Matches values ending in Jones, such as names with the last name Jones
Henrick?: Matches Henrick1 or HenrickZ
David[123]: Matches David1, David2, or David3
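The pattern characters above can be approximated as a regular-expression translation. The sketch below is a simplified, ASCII-only illustration: the real Match Pattern operator uses Unicode general categories (Lu, Ll, and so on), and `pattern_to_regex` is a hypothetical helper name, not part of the product.

```python
import re

# Simplified ASCII approximation of the Match Pattern characters as a regex.
# Illustrative only; the real operator is Unicode-category aware.
def pattern_to_regex(pattern: str) -> str:
    out, i = [], 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):   # escape: next char is literal
            out.append(re.escape(pattern[i + 1])); i += 2; continue
        if c == "X":   out.append("[A-Z]")       # uppercase letter
        elif c == "x": out.append("[a-z]")       # lowercase letter
        elif c == "9": out.append("[0-9]")       # digit
        elif c == "*": out.append(".*")          # zero or more characters
        elif c == "?": out.append(".")           # exactly one character
        elif c == "[":                           # [abc] or [!abc]
            j = pattern.index("]", i)
            body = pattern[i + 1:j]
            if body.startswith("!"):
                out.append("[^" + re.escape(body[1:]) + "]")
            else:
                out.append("[" + re.escape(body) + "]")
            i = j + 1; continue
        else:
            out.append(re.escape(c))             # other characters match themselves
        i += 1
    return "^" + "".join(out) + "$"

print(bool(re.match(pattern_to_regex("Xxxxxxx"), "Henrick")))    # True
print(bool(re.match(pattern_to_regex("9,999"), "1,553")))        # True
print(bool(re.match(pattern_to_regex("David[123]"), "David2")))  # True
```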

4.2.5 View Terms Related to a Rule

View terms assigned to a rule and learn about other term relationships.

Context

Terms can be assigned to many objects, rules being one of them. Term relationships can provide a picture of how objects are related.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules. Click the rule that contains the terms you want to view.
2. Click View Related Terms.
3. Click the link to one of the terms to view the relationship in the glossary.

4.3 Create a Rulebook

Rulebooks contain a group of rules that can be run on one or more datasets.

Context

The rulebooks contain a group of rules. Rather than running a single rule, you run a rulebook that can contain many rules. These rules can come from one or more rule categories, and the rules can be bound to one or more datasets. For example, you may have a Customer Validation rulebook. Within this rulebook, you may have the following rules and rule categories:

Validity:
● Customer address must be verified to an actual building.

Completeness:
● Customer order must include customer name, product number, price, quantity, order date, and ship date.
● Price cannot be null.

Format:
● The order date must be in the format YYYY/MM/DD.
● The ship date must be in the format YYYY/MM/DD.
● Customer phone number must be in the format +nn (nnn) nnn-nnnn.

Timeliness:
● Customer sales receipts must be mailed by the 20th of every month.

After the rulebook is created, import the rules. Grouping the rules into a rulebook helps to gain an understanding of the individual rules that have passed or failed within a larger context.

After rules are run, set passing and failing thresholds. The threshold settings indicate whether the rule passed, failed, or is in a warning state. Adjust the threshold values to determine whether the percentage of records passing the rule means that the overall dataset meets the minimum acceptable values of the business rule. The thresholds provide an overall picture of the quality of data within the rulebook. Thresholds can show quality changes over time, especially when quality falls below a threshold, so that you can take steps to improve the dataset.

To create a rulebook, follow these steps.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rulebooks.
2. Click Create a Rulebook.
3. Enter a rulebook name and a description. Click Save.

Related Information

Import Rules to the Rulebook [page 46]
Bind a Rule to a Dataset [page 46]
Edit or Remove a Rule Binding [page 47]

4.3.1 Import Rules to the Rulebook

After creating a rulebook, add some rules.

Context

The rulebook holds the group of rules for a business case. Import some of the rules you have already created that apply for this rulebook. You can have up to 1000 rules in a rulebook.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rulebooks.
2. Click your rulebook, and then click Import Rules.
3. Expand the rule categories and click the checkbox next to the rules that you want to import. Click Save.

4.3.2 Bind a Rule to a Dataset

Attach one or more rules to a dataset.

Context

After creating a rulebook and importing rules, bind the rule to one or more datasets. You can have up to 10 datasets in a rulebook.

Rules are meant to work with multiple datasets in the same rulebook, and the same rule can be used in multiple rulebooks. The rule can be bound to the same dataset several times if there are different combinations of column and parameter mappings. When searching for connections and datasets, only those connections that support rules and contain datasets with supported column types are shown.

When you bind the rule to a dataset, and then run the rule, you can view the results showing the number of records that pass the rule.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rulebooks. Click the rulebook that contains the rule you want to bind.
2. Expand a rule category. Next to the rule that you want to bind to a dataset, click View Rule Bindings.
3. Click Create Rule Binding.
4. In the Dataset option, browse to the dataset that you want to use. Use the Recent tab to view recently used datasets, or click Browse to choose a connection, and then navigate to your dataset. Click OK.
5. Enter a unique display name and a description to help you identify this rule within the current rulebook.
6. Map the parameter name to the dataset column name. Repeat for other parameters, if any. Click Save.

Related Information

Managing Connections [page 13]

4.3.3 Edit or Remove a Rule Binding

When a rule doesn't serve its purpose for a dataset, you can remove the rule binding.

Context

Sometimes a rule can become outdated or unnecessary for a dataset. You can continue to use the rule for other datasets.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rulebooks.
2. Expand a rule category. Next to the rule with the bindings that you want to edit or delete, click View Rule Bindings.
3. Choose an option.
○ To delete the rule binding, click Delete Rule Binding.
○ To change one or more options in the rule binding, click Edit Rule Binding. Change the options as necessary, and then click Save.

4.4 Run Rulebooks and View Results

Run the rules in your rulebook to view whether the data has passed the rule. Set the pass and fail thresholds.

Context

After creating rules, importing them into a rulebook, and binding the rule to a dataset, you can run the rulebook and learn the rule results. If you have rules in the rulebook that don’t have bindings to a dataset, they aren’t included in the results.

The threshold settings indicate whether the rule passed (green), failed (red), or is in a warning area (orange). You can adjust the threshold values to determine whether the percentage of records passing the rule indicates that the overall dataset passes the minimum acceptable values of the business rule. For example, suppose a dataset has 1000 records and a rule requires that PartID is not null. Let's say that 75 records are null, and 925 records have a PartID. If your business rules dictate that 100% of the records must have a PartID, then you'd set the Pass threshold to 100. If you want a warning area, you can set the Fail threshold to 97. With these settings, the dataset wouldn't pass the business rule, because only 92.5% of its records pass.

The information shown in the gauge is for records that passed all rules. If you have two rules, and all the records pass one rule, but only 12% of the records pass the other rule, then the gauge shows 12%.
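In other words, the gauge reflects the records that satisfy every rule at once, as in the 12% example above. A minimal sketch of that calculation, assuming each rule run yields the set of passing record IDs (the function name is hypothetical, not a product API):

```python
# Sketch: the gauge shows the percentage of records that pass *every* rule.
# passing_ids_per_rule holds, per rule, the IDs of records that passed it.

def gauge_percent(total_records: int, passing_ids_per_rule: list) -> float:
    passing_all = set.intersection(*map(set, passing_ids_per_rule))
    return 100.0 * len(passing_all) / total_records

# 100 records: one rule passes all of them, the other passes only 12,
# so the gauge shows 12%.
print(gauge_percent(100, [range(100), range(12)]))  # 12.0
```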

If you have filtered records so that they are not sent to the rule, then those records are marked as passing the rule and could impact the value on the scorecard and rule results.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rulebooks. Click the rulebook that contains the rules that you want to process.
2. Click Run All.
3. After a rule has finished processing, click View Results.

Results

The Rule Results page is shown. In the top pane, choose to view the Rule Results by selecting View by Dataset or View by Category from the dropdown list. View by Dataset shows the results for all rules applied to the dataset giving you insight into one dataset. The application shows the Datasets > Categories > Rules on the bottom of the screen. View by Category shows the percentage of records that passed all rules in that category giving you insight into a type of rules. The application shows Categories > Rules > Dataset at the bottom of the screen.

The % Rows passing all rules area shows a trend graph of the past five runs of the category or dataset. Selecting a date and time the rulebook was run displays those results at the bottom of the screen. If you see a change in the trend, click the date and view the results, then click the next date to view the changes. The Delta (Pass, Filter) option shows the difference between the selected run and the previous run. The trend graph can change because datasets or rules were added to the rulebook between runs. If the new datasets are not bound to the rule or category that you ran, then the number of records has not changed, resulting in no Delta changes even though the trend value decreased.

Rule Threshold shows whether the rule passed, failed, or is in the warning area. To set the threshold for passing, failing, and warning, click the gauge. Enter the maximum fail number, and the minimum passing number. Click Save. If you want a warning area, leave a gap between the pass and fail numbers. For example, if you set the Fail value to 60 and the Pass value to 75, then any results including 61 through 74 make the gauge appear orange. Values at 60 or below make the gauge appear red, indicating the dataset failed the rule threshold. Likewise, values at 75 or higher make the gauge appear green, indicating the dataset passed the rule threshold.
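The gauge zones from the example (Fail = 60, Pass = 75, warning band 61 through 74) can be summarized as a simple mapping from score to color. This is an illustrative sketch of the documented behavior, not product code:

```python
# Sketch of the threshold gauge coloring from the example:
# Fail = 60, Pass = 75, so 61-74 is the orange warning band.

def gauge_color(score: float, fail: float = 60, passing: float = 75) -> str:
    if score <= fail:
        return "red"      # dataset failed the rule threshold
    if score >= passing:
        return "green"    # dataset passed the rule threshold
    return "orange"       # warning band between the two thresholds

print(gauge_color(60))  # red
print(gauge_color(70))  # orange
print(gauge_color(75))  # green
```

Setting the Pass value equal to the Fail value leaves no gap, so no warning band appears.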

Rulebook Properties lists the owner, the number of times the rulebook has been run, when it was created, and when it was last modified.

The bottom of the page shows the dataset or category information. To view more data from the bottom of the page, click  Collapse Header.

Click in the search box to search for datasets or categories. Click Rulebook Details to view all of the categories and rules. Click Run All to run the rulebook. After the rulebook is done running, click  Refresh to update the rule results. To sort the categories or rules alphabetically, click  Sort.

All of the datasets or categories are listed at the bottom of the page. The right side has a gauge for each dataset or category showing whether it has passed or failed the rules. You can pin the datasets or categories whose results you want to see all the time. Click the  Pin icon next to the dataset or category, and the results are shown at the top of the screen.

Next to the dataset or category name, click  Expand/Collapse to view the results. You'll see the connection name, qualified name, Delta, rows passed, and the total rows when you’re viewing by Dataset. When viewing by Category, you'll see the category description, Delta, rows passed and total rows. You'll also see rules or dataset cards.

Click View Failed Rows on the card to view a sample of five records that didn’t pass the rule. To view more of the failed record data, click  Enter Full Screen Mode.

Regardless of whether you chose to view the results by Category or by Dataset, you’ll see the following information:

Delta (Pass, Filter): The percentage of change in records between the selected run and the previous run that passed and were filtered. For example, if you ran the rulebook and 100 records were added to the dataset the next day, then there could be a difference between the number of records that passed the rule when you run it again. The addition of records could result in an increase or decrease in the Delta percentage.

Records Sent to Rule: The number of records that were sent to the rule. The number of filtered records that are not sent to the rule is listed in parentheses.

Rows Passed: The number of records that passed the rule, not including any filtered records.
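One plausible reading of the Delta (Pass) value is the percentage-point change in the pass rate between the previous and the selected run. The exact formula is an assumption made here for illustration, not the documented behavior, and `delta_pass` is a hypothetical name:

```python
# Hedged sketch: Delta taken as the percentage-point change in pass rate
# between two runs. The formula is an assumption for illustration only.

def delta_pass(prev_passed, prev_total, cur_passed, cur_total):
    prev_rate = 100.0 * prev_passed / prev_total
    cur_rate = 100.0 * cur_passed / cur_total
    return round(cur_rate - prev_rate, 2)

# 1000 records with 900 passing; then 100 new records arrive, 50 of which
# pass, so the pass rate drops even though more records pass overall.
print(delta_pass(900, 1000, 950, 1100))  # -3.64
```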

Click the dataset or rule card to view the parameter description, type and mapping in the right panel.

4.5 Create a Dashboard

A custom dashboard shows the data quality information on scorecards that you’re interested in.

Context

Create a dashboard with custom tiles showing the results of the datasets, categories, and rulebooks that matter to you. You can have several dashboards, and you can have groups within the dashboard, which helps you organize your scorecards in a way that makes sense to you.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules Dashboard.
2. Click Create a Dashboard.
3. Enter a dashboard name and a description. Click Save.

Results

The Rules Dashboard opens where you can create groups and add scorecards.

Related Information

Create a Scorecard [page 50]
Edit a Dashboard [page 54]

4.5.1 Create a Scorecard

Create a scorecard to visualize the rule results in your rulebooks.

Context

You can create many scorecards, and organize them on the rules dashboard. A scorecard applies to one object: a rulebook, one or more categories, or one or more datasets. The rulebook score shows the percentage of objects that passed all of the rules on all of the datasets in the rulebook. The category score is based on the grouping of rules that can be used on multiple datasets. The dataset score is based on the rules bound to a single dataset.

A wizard helps you set up a scorecard based on the object. The result shows either a line graph, bar chart, donut chart, or trend score. The scorecards are shown on the rules dashboard.

You can create groups to organize your scorecards. For example, you could create a trend group that contains all the trend scores from your datasets, categories, and rulebook.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules Dashboard.
2. Click the dashboard.
3. To create groups for your scorecards, click Add Group, enter a name, and then click Add.
4. Click Add.
5. Select the rulebook to which you want to apply the scorecard. Click Step 2.

 Note

Only rulebooks that have been run are shown in the list.

6. Select whether you want the scorecard to present information about the Datasets, Categories, or Rulebook. Click Step 3.
7. Depending on the Reporting Type you selected, choose the Scorecard Type that you want to create.

Reporting Type / Scorecard Type: Description

Datasets / Single Dataset: Choose Show Delta value to show the dataset score of the current run, and the difference between the previous and current run. The Delta value is shown below an arrow indicating whether the score has increased (up arrow), decreased (down arrow), or is unchanged (right arrow).

 Note
Enabling the Delta value means that the scorecard type must be Score, not Pass %.

Choose the reporting type:
○ Score: Creates a single bar graph showing whether the score passes the pass/fail threshold.
○ Pass %: Creates a donut graph showing the percentage of passing records. The graph is color coded: green exceeds the passing threshold value, yellow exceeds the failing threshold value, and red is below the failing threshold value.

Click Step 4 and choose the dataset you want to use in the scorecard.

Datasets / Compare Datasets: Creates a bar graph showing the datasets next to each other. Click Step 4 and choose the datasets you want to use in the scorecard.

Datasets / Dataset Trends: Creates a trend graph showing the scores of one or more datasets over time. Click Step 4 and choose one or more datasets that you want to use in the scorecard.

Categories / Single Rule Category: Choose Show Delta value to show the category score of the current run, and the difference between the previous and current run. The Delta value is shown below an arrow indicating whether the score has increased (up arrow), decreased (down arrow), or is unchanged (right arrow).

 Note
Enabling the Delta value means that the scorecard type must be Score, not Pass %.

Choose the reporting type:
○ Score: Creates a single bar graph showing whether the score passes the pass/fail threshold.
○ Pass %: Creates a donut graph showing the percentage of passing records. The graph is color coded: green exceeds the passing threshold value, yellow exceeds the failing threshold value, and red is below the failing threshold value.

Click Step 4 and choose the category you want to use in the scorecard.

Categories / Compare Rule Categories: Creates a bar graph showing the categories next to each other. Click Step 4 and choose the categories you want to use in the scorecard.

Categories / Category Trends: Creates a trend graph showing the scores of one or more categories over time. Click Step 4 and choose one or more categories that you want to use in the scorecard.

Rulebook: Choose Show Delta value to show the rulebook score of the current run, and the difference between the previous and current run. The number in the middle of the scorecard is the rulebook score of the current run. The Delta value is shown to the right of the current score below an arrow indicating whether the score has increased (up arrow), decreased (down arrow), or is unchanged (right arrow).

 Note
Enabling the Delta value means that the scorecard type must be Score, not Pass %.

Choose the reporting type:
○ Score: Creates a single bar graph showing whether the score passes the pass/fail threshold.
○ Pass %: Creates a donut graph showing the percentage of passing records. The graph is color coded: green exceeds the passing threshold value, yellow exceeds the failing threshold value, and red is below the failing threshold value.

Click Step 4 to confirm the rulebook that is reflected in the scorecard.

8. Click Step 5 and enter the Title and Subtitle of the scorecard.
9. Click Save to view your scorecard.

4.5.2 Edit a Dashboard

Rearrange and edit scorecards in your dashboard.

Context

Customize your dashboard by rearranging the scorecards. You can move them from one group to another, or arrange them within the same group. If you've added a rule or category used in a rulebook, you can update the scorecard to reflect the changed rules or datasets.

Procedure

1. Click  Edit Dashboard.

Move a scorecard: Click the scorecard and drag it to a new location.

Edit a scorecard: Click the scorecard, and then click Edit. The wizard opens and you can change the options. When finished, click Save to view your scorecard.

Add a scorecard: Click Add. Then complete the wizard, and click Save.

Delete a scorecard: Click the scorecard, and then click Delete.

Delete a group: Click Delete next to the group name.

2. Click  View Dashboard to stop editing.

4.6 Import Rules from SAP Information Steward

You can reuse rules that you created in SAP Information Steward.

Context

If you have existing rules that you want to use in this product, you can import the rules. The concept of "rule bindings" in this product is a little different than in Information Steward. Any Information Steward bindings are turned into Metadata Explorer rulebooks. After the rules are imported, you'll bind the rules to datasets on a connection.

Before importing rules, you must go to Information Steward and export the rules into a ZIP file. Information Steward provides two exporting options:

● Project: Exports both rules and bindings from Information Steward into rulebooks in Metadata Explorer. We recommend using this option.
● Rules to file: Exports rules only. A rulebook is not created in Metadata Explorer.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules.
2. Click Import Rules from Information Steward. The Import Rules dialog opens, showing the status of previously imported rules, if any.
3. Click Import. The Select File dialog opens.
4. Enter a name. This name is a prefix to any rulebooks that are created during the import based on Information Steward bindings.
5. Click Browse and select a ZIP file to import. Click Open.
6. (Optional) If you chose the Project option in Information Steward when exporting, make sure that Create rulebooks based on Information Steward bindings is enabled to create rulebooks. The rulebooks will have the name entered previously as a prefix.
7. Click Import. You can view the status of the import here or on the Monitoring page.
8. To close the Import Rules dialog, click OK.
9. After the rules are successfully imported, check the imported rulebooks by clicking the Metadata Explorer navigation menu and choosing Rules > View Rulebooks.

 Note

The rulebooks are only available if you chose the Project option in Information Steward when exporting, and clicked Create rulebooks based on Information Steward bindings in an earlier step.

10. Choose the imported rulebook and expand the rules.
11. To make the rules valid, you must bind the rule to a connection. Click View Rule Bindings.
12. In the Qualified Name option, click  and choose a dataset. Click OK.
13. If the dataset does not have the same column name as defined in the rule, then map the rule to an existing column.
14. Click Save.

Results

The rules are imported into the Metadata Explorer.

Related Information

View the Status of Imported Rules [page 56]
Bind a Rule to a Dataset [page 46]

4.6.1 View the Status of Imported Rules

View the status and details of imported SAP Information Steward rules.

Context

The status can show the progress of imported rules (Completed, Running, or Error). After the importing task is completed, the status provides details for those rules that are active or inactive. It also lists the rulebooks that were created based on Information Steward bindings.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rules.
2. Click Import Rules from Information Steward. The Import Rules dialog opens, showing the status of previously imported rules, if any.
3. To view details about the import process, click the name of the import task.

The Details tab shows:

File Name: Shows the name of the ZIP file chosen for import.

Rules: Shows the number of active and inactive imported rules. The active rules are ready to be used. See the Active Rules tab. The inactive rules have defined components that can't be used in this product. The rules in an Error state cannot be parsed. You can edit the rule to correct the components that cannot be parsed. See the Inactive Rules tab.

Create Rulebooks: Shows Yes when you select Create rulebooks based on Information Steward bindings.

Rulebooks: Shows the number of rulebooks created from the Information Steward rule bindings in the ZIP file.

Completed: Shows the date that the rules were imported.

Runtime: The amount of time it took to import the rules, in days:hours:minutes:seconds.

Status: Shows whether the rules were successfully imported, are currently importing, or completed importing with errors.

4. (Optional) Click the Active Rules tab to view the category where the rule was placed, the rule name, and the status. Click the name of the rule to see the rule settings.
5. (Optional) Click the Inactive Rules tab to view the rule name and errors. The rule has been imported, but it cannot be used because it could have an expression or use a lookup table that isn't currently supported. You can fix the error to make it usable.
6. Click the Rulebooks tab to view a list of rulebooks created based on the Information Steward rule bindings, and the number of rules within each rulebook. Each rulebook has the prefix named when importing the ZIP file. Click the link to open the rulebook.

 Note

If you imported rules without selecting Create rulebooks based on Information Steward bindings, then the Rulebooks tab is not available.

7. Click OK to close the Import Rules dialog.

4.7 View Terms Related to a Rulebook

View terms assigned to a rulebook and learn about other term relationships.

Context

Terms can be assigned to many objects, rulebooks being one of them. Term relationships can provide a picture of how objects are related.

Procedure

1. From the Metadata Explorer navigation menu, choose Rules > View Rulebooks. Click the rulebook that contains the terms you want to view.
2. Click View Related Terms.
3. Click the link to one of the terms to view the relationship in the glossary.

5 Business Glossary

The business glossary provides a central and shared repository for defining terms and describing how and where they're used in the business.

A business glossary can promote a common, consistent understanding of business terms within your organization. The definitions provide context for the terms, and the term relationships provide additional meaning. Term relationships are links to other terms, datasets, and columns. Aim to use the business glossary enterprise-wide or division-wide, and make the terms as clear and understandable as possible. For example, what does "Sales" mean to your organization? Is it currency exchanged for goods or services? Is it a discounted price on a product? Both definitions are correct, but creating the correct definition for your organization helps reduce ambiguity.

A business glossary consists of three main areas:

● The term template defines additional information that is required or optional when the terms are defined.
● The categories group the terms.
● The defined terms provide clarity for the business.

The typical business glossary workflow begins when a group of individuals agree on the rules and guidelines for creating terms. They edit the term template to include the rules and guidelines.

You can create a set of extra input fields for a term. The input fields can be free text or validated using minimum and maximum values or lookup tables. You can set the extra fields as required or optional. You can group the input fields into separate tabs or together on one tab. Create labels for the fields that add context and offer guidance for users when they create terms.

The group who created the term template can also define the category hierarchy. Subject matter experts create and define the terms, create term relationships, and place them in the categories. When business users have questions about terms, they search the business glossary and find the definition and term relationships.

Term Template

The term template provides the rules or guidelines for users who are defining terms. You can update the template in two areas, and those changes are shown for every term. The first area is in the Definition tab, where you can provide instructions for users to think about when they’re defining terms. For example, in the Definition tab, you can provide a list similar to this list:

● Terms must be the singular form.
● Definitions must be in present tense.
● Definitions must use the term in a sentence.
● Definitions cannot contain acronyms or abbreviations.

The second area where changes are shown is in the custom attributes. Create custom attributes to further govern the terms. For example, you can create a Workflow custom attribute tab that contains a Definition Date option and require that the option is defined before the term is saved. You can also create an Approved field that is optional.

The content in the Definition tab and any custom attributes are included when users create a new term.

Spend time designing the term template before creating categories and terms; the more finalized the term template is, the more consistent your glossary will be. If you change the term template after creating terms, the changes are shown in the existing terms. If the changes include required fields, the older terms are not required to complete those fields until they are edited; instead, those terms are placed in an error state so you can find and update them to comply with the new requirements. For example, if you add a required field to the template, existing terms do not have that field completed and must be updated manually. If you remove a custom field, the field is removed from all existing terms and is no longer shown when defining new terms.

Categories and Terms

Create categories to group your terms and definitions. Make clear definitions for your terms and include keywords, synonyms, and relationships to provide additional context. For example, a business can have these categories, terms, and definitions within a Finance category:

Subcategory: Interest
Term: Annual Percentage Yield
Definition: Percentage of the loan or savings amount including compounded interest.
Keywords: APY, interest, rate
Synonyms: APR, Annual Percentage Rate

Subcategory: Interest
Term: Annual Percentage Rate
Definition: Percentage of the loan or savings amount including other fees and costs.
Keywords: APR, interest, rate
Synonyms: APY, Annual Percentage Yield

Subcategory: Interest
Term: Compound Interest
Definition: Money that is earned and added to an account balance so the interest and principal both earn interest.
Keywords: principal, interest, compounding
Synonyms: compounded interest, accrued interest

Subcategory: Funds
Term: Mutual Fund
Definition: A mix of different stocks or bonds.
Keywords: investment, stocks, bonds
Synonyms: bond fund, retirement

Subcategory: Funds
Term: Exchange-Traded Funds
Definition: A mutual fund traded like stocks on a stock exchange. An ETF typically follows the performance of a particular index such as the S&P 500 Index.
Keywords: ETF, stock, index
Synonyms: mutual fund, index fund

Each term could also have a Workflow custom attributes tab with these attributes defined: Definition Date, Term Created By, Approved, Approved By, and Approval Date.

You can have multiple glossaries, and you can import glossaries from SAP Information Steward.

Related Information

Create a Business Glossary [page 61] Manage Business Glossaries [page 62] Edit the Term Template [page 62] Create a Business Glossary Category [page 64] Manage the Business Glossary Categories [page 64] Create a Term [page 65] Manage a Term [page 66] Manage a Term Relationship [page 67] Search for Categories and Terms [page 69] Import a Business Glossary from Information Steward [page 70]

5.1 Create a Business Glossary

You can have multiple business glossaries.

Context

Each business glossary can use a custom term template with its own criteria for definitions, or you can copy a template from an existing glossary. Any changes to the term template apply to a single glossary.

Procedure

1. From the Metadata Explorer navigation menu, choose Glossary  View Business Glossaries .
2. Click  Create a new glossary.
3. Enter a name for the glossary.
4. (Optional.) Choose a template. You can create a new template from scratch by leaving the Template option empty.
5. (Optional.) Enter a description of the glossary.
6. Click Save.

5.2 Manage Business Glossaries

You can set a default glossary, delete a glossary, or edit the name and description of the glossary.

You can perform the following tasks on the Business Glossary Overview page by clicking the Metadata Explorer navigation menu and choosing Glossary View Business Glossaries . Click  More Actions.

Action Description

View Glossary Opens the glossary so you can view the categories, terms, and definitions.

Edit Glossary Properties Opens a panel where you can change the name and description of the glossary.

Set as Default Glossary Sets one glossary as the default glossary. It adds a star icon so that you can distinguish the default from the others.

Delete Glossary Removes the glossary.

View Import Details Shows the details of an imported glossary such as the number of categories, terms, attributes, errors, and status. This option is only available on imported glossaries.

5.3 Edit the Term Template

Use the term template to define how users enter definitions and create custom attributes to further govern the term content.

Context

The term template can be customized so that users who create terms write consistent definitions. For example, in the Definition tab, you can have instructions indicating that the term must have a short and a long description, or that the term must have an image. In the Custom Attributes tab, you can create items that must be completed before the user can save the term. You can create up to 20 groups with up to 25 attributes within each group. You can have multiple term templates, but each glossary can be assigned to one template only. Any changes you make to the term template apply only to the glossary you are in.

Let's say that you want each term approved by a reviewer, and you want the date the term was approved included. You can create an Approval group, and then create several custom attributes such as:

● Approved By: a String data type where the user selects a name from a list of people who can approve the term.
● Approval Date: a Date data type where the user selects a date from a calendar.
● Score: an Integer data type with a defined range of 1 through 5 where the user selects a score for the term.

You can make any of these attributes required or optional.

If you change the term template after you have added terms, you could receive an error or warning. If you see a warning state, check the term to see if any action is needed to correct it. If there is an error status, open the term to view the errors. For example, if you added a required attribute, you see a warning sign next to the term until you edit the term with the required setting completed.

Procedure

1. Click Edit Term Template.
2. Enter instructions for users who are creating definitions in the Definition tab.
3. Click the Custom Attributes tab.
4. Click Add Custom Attribute Group.

 Note

If a group already exists, click  Add New Tab to create a new group.

5. Enter a name for the group.
6. Optional: Select Enable rich text for this group if you want a rich text box at the bottom of the Custom Attributes tab where users can add a note or more information about the attribute choices. The rich text box includes bold, underline, links, pictures, and so on.
7. Click Add Attribute.
8. Choose whether you want the attribute in the Left or Right column in the Display Column option.
9. Enter a name for the attribute.
10. Optional: Enter a description.
11. Optional: Click Required if users must define the attribute before the term is valid.
12. Choose the data type. Some data types have additional options to complete.

Data Type Additional Options

Boolean N/A

Date N/A

Decimal Enter a Validation Type:
○ Any Value: Users can enter any decimal value.
○ List of Values: Enter a list of decimal values that the user can choose. If multiple values are acceptable, click Multi Select.
○ Range: Enter a range of decimal values that the user can choose.

Integer Enter a Validation Type:
○ Any Value: Users can enter any integer value.
○ List of Values: Enter a list of integer values that the user can choose. If multiple values are acceptable, click Multi Select.
○ Range: Enter the Minimum and Maximum values that the user can choose.

String Enter a Validation Type:
○ Any Value: Users can enter any string value.
○ List of Values: Enter a list of string values that the user can choose. If multiple values are acceptable, click Multi Select.
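The validation options in the table above can be sketched as a small check. This is an illustrative sketch only; the attribute-definition structure and field names below are hypothetical and are not part of the product API:

```python
# Illustrative sketch of the custom-attribute validation rules described above.
# The definition dictionary layout is hypothetical, not a product interface.

def validate_attribute(definition, value):
    """Return True if `value` satisfies the attribute definition."""
    data_type = definition["type"]            # Boolean, Date, Decimal, Integer, String
    validation = definition.get("validation", "Any Value")

    if data_type in ("Boolean", "Date"):      # these types have no additional options
        return True
    if validation == "Any Value":
        return True
    if validation == "List of Values":        # Decimal, Integer, or String lists
        allowed = definition["values"]
        if definition.get("multi_select"):    # several selected values allowed
            return all(v in allowed for v in value)
        return value in allowed
    if validation == "Range":                 # Decimal and Integer only
        return definition["minimum"] <= value <= definition["maximum"]
    return False

# Example: the Score attribute from the Approval group, range 1 through 5.
score = {"type": "Integer", "validation": "Range", "minimum": 1, "maximum": 5}
print(validate_attribute(score, 4))   # True
print(validate_attribute(score, 7))   # False
```

The same shape covers a List of Values check by swapping the `validation` key and supplying a `values` list.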

13. Click Save.

5.4 Create a Business Glossary Category

Group similar terms together so they’re easier to find.

Context

The business glossary categories are a hierarchical structure that contains the business terms. The business terms can be included in multiple categories. For example, a financial business that has the categories of Business Finance and Individual Finance can have the terms Customer ID and Budget in both categories.

Procedure

1. From the Category pane, click  Add Category.

 Note

To add a subcategory, choose the root category, and then click  More Actions Add Category .

2. Enter a name and description.
3. Choose one of these actions.

○ To continue creating categories, click Save and New.
○ To save and close the dialog, click Save.
○ To return to the overview page without saving the category, click Cancel.

5.5 Manage the Business Glossary Categories

View, edit, or delete a category.

Context

As terms and definitions evolve, the categories change also. Some categories are renamed while others must be deleted. If you delete a category that contains subcategories, then those categories are deleted also. Any links between the terms and the categories are removed. The terms are still available and can be added to other categories. If the category or a subcategory is used in a filter, then those categories are removed from the filter.

Procedure

1. From the category pane, click  More Actions.
2. Choose one of these actions.

Choice Description

View Category See details about the category, such as the description, the name of the person who created or modified the category, and the date it was created or modified. Click Close to return to the overview page.

Edit Category Change the name or description of the category. Click Save.

Delete Category Delete the category. Click Yes to confirm that you want the category deleted.

5.6 Create a Term

Business terms have several facets that help users understand them and make it easier to find relevant terms.

Context

The only required fields are the Name and Definition, unless the term template has required custom attributes. However, the term can be more easily understood or found when you add keywords, synonyms, and term relationships and then add the term to one or more categories. Keywords can be descriptive names for the terms. For example, APY can be a keyword for the term Annual Percentage Yield. A synonym is a word or phrase that means nearly the same thing as the original term. For example, "come in" is a synonym phrase for "enter". Keywords and synonyms do not have to be terms defined in your glossary.

To create a term relationship, see Manage a Term Relationship [page 67].

Procedure

1. In the term list panel, click  Add Term.
2. Enter a term name.
3. Optional: Enter up to 10 keywords or 10 synonyms by either typing the words or clicking  to view a list of existing keywords and synonyms.
4. Enter a definition. Use the formatting tools to enhance your definition such as color, bold, bullets, links, images, video, and so on.

 Note

Images and videos must be links that are hosted elsewhere and can be accessed from the browser used to view the term. Images and videos are not stored in the repository.

5. Optional: Click Add term to categories to organize your terms. Select up to 25 categories, and then click OK.
6. Click Save to view the term or Save and New to create another term.

5.7 Manage a Term

View, edit, or delete an existing term.

Context

There are several reasons why a term needs to be managed. Terms can change over time. Perhaps there’s a new industry buzzword that can be added as a keyword, or the definition can be either narrowed or expanded. Other times, the term becomes obsolete and must be removed. Another situation is when the term template has changed. If you change the term template after you have added terms, you could receive an error or warning. For details about the term template, see Edit the Term Template [page 62].

Procedure

1. Select the term from the term list.
2. View the term definition and the other settings. Choose an action.

○ To return to the term list without making changes, click  Go back to term list.
○ To delete a term, click Delete.

 Note

Any related terms or connections to objects are removed.

○ To modify the term, click Edit. Make the changes you want, and then click Save to view the term or click Save and Close to return to the term list.

5.8 Manage a Term Relationship

Link a term to another term, a published dataset, rule, rulebook, or a column, or remove the relationship.

Context

You can link a term to multiple other terms, rules, rulebooks, published datasets, and columns. Term relationships create connections that are visualized in a graph, which can give a complete picture of the term's relevance to the business. When those relationships are no longer relevant, you can remove the link. Likewise, when related objects are removed from the catalog, the related objects for the associated terms are automatically updated. For example, if the Sales_Region table is removed from the connection, then the table and its columns are removed from the Relationships tab of any terms that linked to them.

 Note

A new term must be saved before relationships can be added.

Procedure

1. Select a term.
2. Select Edit in the lower right corner.
3. Click the Relationships tab.
4. Click Edit Related Objects. The Edit Related Objects dialog opens.
5. Choose one or more actions.

Action Additional Steps

Link a related term 1. Make sure that you are on the Add related term tab. 2. To link to a term in a different glossary, click the Select a Glossary drop- down list and select the glossary with the term that you want to use. 3. Enter all or a portion of the term name in the Search terms box and click  Search. A list of terms appears. 4. Select the terms that you want linked to this term.

Link to a dataset or column 1. Select the Add related datasets or columns tab. 2. Enter the name of the dataset in the Filter items search box or expand the connections to find the dataset. 3. To select a dataset, select the checkbox next to the dataset name. To link to a column, expand the dataset and select the checkbox next to the column name.

Link to a rule 1. Select the Rules tab. 2. Either enter a portion of the rule name in the search box or expand a rule category.


3. Click the checkbox next to one or more rules.

Link to a rulebook 1. Select the Rulebooks tab. 2. Either enter a portion of the rulebook name in the search box or scroll through the list of rulebooks. 3. Click the checkbox next to one or more rulebooks.

6. Click Save Related Objects.

Results

You can view the relationship chart showing the objects that you selected. There are several options for customizing how you view the chart, and there are several options to learn more information about the related objects.

Change the Chart View

To change the orientation layout, select the drop-down list on the left side of the screen. Choose to view the chart from Top-Bottom, Left-Right, or Right-Left.

To view the legend that identifies columns, datasets, and terms, click  Legend. Zoom in or out with  Zoom In and  Zoom Out. To fit the chart into the space provided, click  Zoom to Fit. To view the chart using the whole screen, click  Enter Full Screen.

Learn About Related Objects

You can learn more about the related terms, datasets, and columns, and perform some actions.

Related Terms

When there are linked related terms, click  More Actions next to the term. Click Jump to Term to view that term and its relationships. You could also see Indirectly Related Terms. These terms have created a link from their definition to the term you are now viewing. However, the term you are now viewing has not created a link to the indirect term. Only those terms that have been related from this term are shown as a Related Term.

To delete the term relationship link, click Delete Relationship.

Related Datasets

When there are linked related datasets, you can see the dataset name, type, connection ID, and qualified name. If you selected multiple columns from one dataset, then you can see the dataset group with a Group Details button to show the same information. You can also see whether the dataset has been profiled or has lineage. Click  More Actions to perform the following tasks:

● View Fact Sheet
● View in Browse
● View in Catalog
● Start Profiling
● Prepare Data
● Delete Relationship

Related Columns

When there are linked related columns, you can see the column name, type, and native type. Click  More Actions to perform the following tasks:

● View Fact Sheet
● View in Browse
● View in Catalog
● Start Profiling
● Prepare Data
● Delete Relationship

5.9 Search for Categories and Terms

Find categories and terms to learn more about the terms and their relationships.

Context

The business glossary serves to make the business clearer and more understandable. You can search for categories only, for terms only, or for a combination of both. When filtering in the Category panel, the results show matching categories only; subcategories are not included. When searching in the term list, the results show terms that match any of the term filters.

Procedure

1. To search for categories, choose one of these actions in the category panel.

○ In the Filter categories search box, enter all or a portion of the category name, and then click  Search.
○ Click  Filter categories. Enter a name or description. To show results only for those categories you have created or modified, click Show my categories only. To remove categories that don’t have any terms associated, click Exclude empty categories. Click Apply.

2. To filter or search for terms, choose one of these actions.

○ Click the letter, number, or character in the filter bar near the top to show those terms that begin with that character.
○ In the Search business glossary box, enter all or a portion of the term and then click  Search.
○ Click  Filter terms. Enter one or more characters in the Search business glossary box. To show results for only those terms you created or modified, select Show my terms only. To narrow the results by status, click the checkbox next to the options you want to view in the Term status list: Valid, Warning, or Invalid. Click Apply.
○ To view terms in a category, select the category, and then click  Add Category as Search Filter. You can also add the filter by clicking  More Actions and choosing Add Category as Search Filter. Another way to add the category is by dragging the category name to the filter bar.

 Note

Category filters use the OR condition, while Show my terms only uses the AND condition. For example, if you filter using the categories Indoor Lighting and Outdoor Lighting, and you choose Show my terms only, then the condition statement is: (Indoor Lighting OR Outdoor Lighting) AND My Terms Only. The results show those terms you created or modified in either the Indoor Lighting or the Outdoor Lighting category.
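As a rough sketch of how those conditions combine (the term records and field names below are hypothetical, not a product API):

```python
# Sketch of the filter logic in the note above: selected categories combine
# with OR, and "Show my terms only" is ANDed on top of the category result.
# The term records here are hypothetical sample data.

def matches(term, categories, my_terms_only, current_user):
    in_category = any(c in term["categories"] for c in categories) if categories else True
    is_mine = term["modified_by"] == current_user
    return in_category and (is_mine if my_terms_only else True)

terms = [
    {"name": "Sconce", "categories": ["Indoor Lighting"], "modified_by": "ana"},
    {"name": "Floodlight", "categories": ["Outdoor Lighting"], "modified_by": "ben"},
    {"name": "Budget", "categories": ["Finance"], "modified_by": "ana"},
]

# (Indoor Lighting OR Outdoor Lighting) AND My Terms Only, as user "ana":
hits = [t["name"] for t in terms
        if matches(t, ["Indoor Lighting", "Outdoor Lighting"], True, "ana")]
print(hits)  # ['Sconce']
```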

5.10 Import a Business Glossary from Information Steward

You can reuse glossaries that you created in SAP Information Steward.

Prerequisites

If you have an existing glossary from Metapedia in Information Steward, you can import it. Before importing the glossary, you must go to Information Steward and export two files:

● Export the Metapedia data into a Microsoft Excel file. The data in the Excel file includes categories and terms. If you have any HTML tags in your Metapedia term descriptions or if you are unsure whether you have HTML tags, we recommend using the Export descriptions in HTML format option. See SAP Information Steward User Guide .

● Export the custom attributes into a ZIP file. The data in this file includes values beyond the default Metapedia values. See SAP Information Steward User Guide .

Context

When a glossary is imported, it creates a new glossary in Metadata Management. You cannot copy the new terms to an existing glossary. You can either recreate the term in an existing glossary or create a term relationship link from an existing term in the glossary to a term in the new glossary. See Manage a Term Relationship [page 67]. At this time, you cannot share a term from one glossary to another glossary, or import a glossary and place it in an existing glossary.

Procedure

1. From the Metadata Explorer navigation menu, choose Glossary  View Business Glossaries .
2. Click  Import Glossary. The Import Glossary dialog opens.
3. Enter a name and description for the glossary.

4. Browse to the location of the exported Metapedia file. Click Open.
5. Browse to the location of the custom attribute file. Click Open.
6. Click Import.

Results

The import task runs. When the task is completed, click  More Actions and choose View Task Details to view more information about the import process. For more information about the details shown after importing glossaries, see View Details About Glossary Importing [page 80]. To see the contents of the imported glossary, click View Glossary.

After importing, each term has an overall status that indicates whether it is ready to use:

● Completed: The term is ready for use.
● Partial: The term was successfully imported, but there is an issue with the term.
● Error: The term was not successfully imported.

You can view these details on the Monitoring Tasks page.

Each term also has a state that you can view in the glossary:

● Active: The term is ready for use.
● Warning: The term could be in an invalid state. Open the term to view any issues.
● Error: The term has an error. Open the term and click View Errors. Edit the term to resolve the error.

6 Managing Publications

View, add, delete, and edit the publications in the catalog.

The catalog may contain many published datasets, and you can manage the datasets in the Manage Publications page.

Access Manage Publications by clicking the Metadata Explorer navigation menu and choosing Administration Manage Publications .

The list of connections is shown. Any obsolete connections have an icon with a triangle. Obsolete connections are those connections that are changed or removed. It may be necessary to update the connection information in Connection Management.

Click any connection to view a list of publications with the following information.

Column Name Description

Obsolete Connection An icon that indicates that the publication connection is changed or removed.

Name The name of the publication.

Description The description of the publication.

Source Folder The name of the dataset folder on the connection where the published object is located.

File Names or Patterns The name of the objects included in the publication.

Include Subfolders Indicates whether the published object (typically a folder) also published the contents of its subfolders.

Lineage Indicates whether lineage was extracted. This option is shown only for connections that support lineage.

User The name of the user who published the object.

Last Modified The date and time the object was most recently changed.

Last Executed The date and time the object was most recently published.

Runtime The duration of the publication process in hours, minutes, and seconds.

Filter Connections and Publications

You can filter the connections by entering the name or type of connection, such as HANA, in the Filter connection names text box.

Related Information

Create a Publication [page 73]

Edit or Delete a Publication [page 74]

6.1 Create a Publication

Publish a dataset from the Manage Publications page to allow others to view or use the contents of the dataset.

Context

When a publication is created, all datasets that match the filter have their metadata extracted and stored in the Catalog. Then colleagues with the correct permissions can view fact sheets, tags, and metadata about the datasets.

Procedure

1. Click the Metadata Explorer navigation menu and choose Administration  Manage Publications .
2. Choose a connection.
3. Click Create Publication.
4. In the Source Folder, browse to the location of the source object. Click OK.
5. If you selected a folder and want to publish the objects in the folder or subfolders (based on any file names or patterns specified), then select Include Subfolders. For example, if you specified *.csv, then only the CSV files are processed in the subfolders.
6. When it's available, you can choose Yes to extract Lineage. A lineage graph is created that shows the source and transformations.
7. In File Names or Patterns, enter an object name or a pattern. For example, if you want to publish all CSV files in a folder, enter the pattern *.csv. If you want to publish a single file in the folder, enter the name, such as customer.csv.
8. Click Run. The task begins.
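The file-name patterns described above follow the usual shell-style wildcard convention, which can be illustrated with Python's standard `fnmatch` module (the folder contents below are hypothetical):

```python
# Illustration of how a pattern such as *.csv selects files in a source folder.
# The file list is hypothetical sample data.
import fnmatch

files = ["customer.csv", "orders.csv", "readme.txt", "sales.parquet"]

# A wildcard pattern selects every matching file...
selected = [f for f in files if fnmatch.fnmatch(f, "*.csv")]
print(selected)  # ['customer.csv', 'orders.csv']

# ...while an exact name selects a single file.
single = [f for f in files if fnmatch.fnmatch(f, "customer.csv")]
print(single)  # ['customer.csv']
```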

Results

To view the results of the task, click the Manage Publications navigation menu, and then choose Monitoring to view the progress of the task. You can navigate to Catalog to view the published results in the folder with the same name as the connection.

6.2 Edit or Delete a Publication

Modify the contents or target of a publication or delete a publication.

Context

You may need to change the contents of an existing publication, or you may need to delete the publication.

 Note

Modifying a publication may add or remove objects in the catalog. Deleting a publication removes all datasets and folders that were created in this publication, including any subfolders. The object is also removed from the monitoring list.

Procedure

1. Click the Metadata Explorer navigation menu and choose Administration  Manage Publications .
2. Choose a connection and click the right arrow next to the publication you want to edit or delete.
3. Choose one of the following:

○ To edit the publication, change the options, set the target folder to publish, and then click Run.

 Note

If the existing publication has the Lineage Depth set, you cannot change it when reprocessing the publication. To set the Lineage Depth option, delete the existing publication and create a new one.

○ To delete the publication, click Delete.

7 Monitoring Tasks

View the status of tasks, filter the monitoring data, and cancel a running profile, publish, preparation, or rulebook task.

The Monitor tile on the Discovery Dashboard page of the Metadata Explorer shows the status of your recent tasks. Anytime you start a data preparation, profiling, rulebook, or publishing task, you can see whether it’s processing, completed, or stopped running due to an error.

Access the Monitoring Page

Access Monitoring by clicking the Metadata Explorer navigation menu, and then select Monitor Monitor Tasks .

The Monitoring page lists all tasks and includes this information:

● Name and type of task
● Status
● User name or ID of the person who started the task
● Date and time the task completed
● Task processing time

If you have an error in any of the tasks, hover over the error in the Status column to see the error message. To view more information about the error, you can go to the SAP Data Intelligence Monitoring application, outside of the Metadata Explorer. Click More Actions View Logs .

When additional information is available, click the value in the Status column to see more information about the task processing.

Depending on the type of processing, different options are shown when you click  More Actions.

Task Type More Actions Options

Automated Lineage View Task Details

View Logs (Shown when there’s an error.)

Import View Task Details

View Glossary (Shown when the Import Type is Glossary.)

View Logs (Shown when there’s an error.)

Preparation View in Manage Preparations

View Logs (Shown when there’s an error. Only the person who started the task can view the logs.)


View Error Details (Shown when there’s an error.)

Profile View Metadata

View Fact Sheet

Start/Cancel Profiling

View Logs (Shown when there’s an error.)

View Error Details (Shown when there’s an error.)

Publish View Task Details

View in Manage Publications

Rulebook Cancel Rulebook Task

View Task Details

View Rulebook Details

View Rulebook Results

View Error Details (Shown when there’s an error.)

Filter by Task Type

You can filter based on the task type or status. To show only one task type, click the tab of the task type you’re interested in viewing. For example, click the Profile tab to view the status of only profiling tasks. To view the complete list of tasks, click All Tasks.

Filter by Status

To filter on the task status, click one of the status icons.

Status Description

 Running or Pending Pending means that the task is sent from the Metadata Explorer to the flowagent.

Validating means that the graph is being created and verified.

Active means that the task is processing.

 Completed The task completed without errors.

 Error Processing didn’t complete successfully due to an issue. To view more information about the error, go to the SAP Data Intelligence Monitoring application, outside of the Metadata Explorer. Click More Actions View Logs . For publishing and rulebook tasks, click the status to view information about the task processing.

 Partial For publishing and rulebook tasks only, a portion of the task completed successfully, and another portion has an error. Click the status to view information about the task processing.

Filter by Task Name

If you’re looking for a specific task, start typing the name in the Filter names search box.

To sort the search results by ascending or descending order, click each of the column headings in the table.

Related Information

View Details About Publication Processing [page 77] View Details About Rulebook Processing [page 79] View Details About Glossary Importing [page 80] Managing Publications [page 72] Managing the Catalog [page 28] Overview of Metadata Extraction [page 19] Self-Service Data Preparation

7.1 View Details About Publication Processing

View information about the published datasets in a single task that successfully completed and the tasks that have a processing error.

After you publish a folder or subfolder, you might see a Partial status in the Monitoring page indicating that some of the datasets in the tasks completed successfully while other datasets encountered an error. To view details of the task from the Monitoring page, find the Status column, and then click Partial, Success, or Error. You can also click  More Actions and choose View Task Details.

 Note

Profiling jobs and preparations show error details only when an error has occurred.

The Publication Information dialog opens.

There are three sections of information: publication details, extracted datasets, and failed datasets.

Publication Details

The Publication Details tab provides an overview of the processing.

Option Description

Connection ID Shows the name of the connection.

Source Folder Provides the path to the published folder.

File Names or Patterns Provides the name of the file or the pattern used to find the dataset. For example, the pattern *.csv includes those datasets with the extension CSV.

Total Extracted Datasets Shows the number of datasets that successfully completed.

Total Failed Datasets Shows the number of datasets that encountered an error during processing and did not complete successfully.

Status Shows the type of error. Click the error to view the error details, such as the error code and message.

View Logs Links to the processing logs.
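The file pattern shown in Publication Details behaves like a glob. As a minimal illustration of how such a pattern selects datasets (generic Python using the standard `fnmatch` module, not the product's implementation; the file names are hypothetical):

```python
import fnmatch

# Hypothetical file names found in the published folder.
files = ["orders.csv", "customers.csv", "notes.txt", "archive.CSV"]

# A pattern such as *.csv selects files with that extension.
# fnmatchcase is used so matching is case-sensitive on every platform,
# which is why the uppercase "archive.CSV" is not matched here.
matched = [f for f in files if fnmatch.fnmatchcase(f, "*.csv")]
print(matched)  # ['orders.csv', 'customers.csv']
```

Whether the product matches case-sensitively may depend on the connection; this sketch only illustrates the glob idea.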

Extracted Datasets

The Extracted Datasets tab shows the new, updated, and unchanged datasets that were successfully published. Each section lists the location and name of the datasets. Click  Filter to select only those datasets in the new, updated, or unchanged categories that you want to view.

To view more information about each dataset, click the dataset name. View the status, last metadata refresh, and last successful publishing dates and times, among other information.

Failed Datasets

The Failed Datasets tab shows the datasets that did not complete successfully. These datasets are placed into one of two categories: dataset extraction error or lineage extraction error. The dataset name and the error code are shown. Click  Filter to select only those datasets with a specific error code in the category that you want to view.

In the Filter datasets by name search box, enter all or a portion of the dataset name to view only those datasets that match the text you have entered.

Click the dataset name to view the error code and message as well as the causes of the error.

To export the error information into a JSON file, click Export Data.

7.2 View Details About Rulebook Processing

View information about the processed datasets within a rulebook.

Details about rulebook processing include the following:

● Success or failure status of the dataset and its rules
● Processing time for each dataset
● Links from the Failed Datasets tab to the Monitoring application

 Note

You can see the link only if you ran the rulebook; other logged-in users cannot see the link.

After you run the rules in a rulebook, you might see a Partial status in the Monitoring page indicating that some of the rules bound to datasets completed successfully while other datasets encountered an error. To view details of the task from the Monitoring page, find the Status column, and then click Partial, Success, or Error. You can also click  More Actions and choose View Task Details.

The Rulebook Information dialog opens.

There are three sections of information: rulebook details, datasets, and failed datasets.

Rulebook Details

The Rulebook Details tab provides an overview of the processing.

Option Description

Rulebook Name Shows the name of the rulebook.

Last Modified Provides the date, time, and user who most recently ran the rulebook.

Total Successful Datasets Shows the number of datasets that completed the task processing.

Total Pending Datasets Shows the number of datasets that are waiting to begin processing.

Total Running Datasets Shows the number of datasets that are processing.

Total Failed Datasets Shows the number of datasets that encountered an error during processing and did not complete successfully.

Datasets

The Datasets tab shows the datasets that were successfully run, and those datasets that are running or pending. Each section lists the location and name of the datasets. Click  Filter to select only those datasets in the successful, running, or pending categories that you want to view.

In the Filter datasets by name search box, enter all or a portion of the dataset name to view only those datasets that match the text you have entered.

To view more information about each dataset, click the dataset name. View the name, qualified name, and the date and time the dataset started and completed processing.

Failed Datasets

The Failed Datasets tab shows the datasets that did not complete successfully. The dataset name and the error code are shown. If the current user is the same person who last ran the rules, then a link to View Logs is shown, where you can learn more about the error.

In the Filter datasets by name search box, enter all or a portion of the dataset name to view only those datasets that match the text you have entered.

Click the dataset name to view the error code and message as well as the causes of the error.

7.3 View Details About Glossary Importing

View information about importing glossaries such as the number of categories, terms, and attributes.

You could see a Partial status when there’s an issue with some of the terms, for example, when a term has a relationship to a dataset that doesn't exist in Metadata Explorer.

Details

Option Description

Name The name of the glossary defined when you started the import process.

Description The description of the glossary.

Metapedia File The name of the exported Microsoft Excel file.

Custom Attribute File The name of the exported custom attribute ZIP file.

Completed Categories The number of categories in the imported file.

Error Categories The number of errors when importing categories.

Completed Terms The number of terms available for use in the glossary.

Partial Terms The number of terms that have an issue, such as a term without a definition or a relationship that is invalid.

Error Terms The number of terms that have an error during importing.

Custom Attributes The number of custom attributes added to the term template.

Completed The date and time the import finished processing.

Runtime The amount of time it took to process the import in days:hours:minutes:seconds.


Status The status of the import, such as Completed, Partial, or Error.
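The Runtime value above is a duration rendered as days:hours:minutes:seconds. As an illustration of that format, a total number of seconds can be broken down like this (a generic sketch, not the product's code):

```python
def format_runtime(total_seconds: int) -> str:
    """Render a duration as days:hours:minutes:seconds."""
    days, rem = divmod(total_seconds, 86400)   # 86400 seconds per day
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}:{hours:02d}:{minutes:02d}:{seconds:02d}"

# 1 day, 2 hours, 3 minutes, 4 seconds:
print(format_runtime(93784))  # 1:02:03:04
```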

Categories

The imported categories are listed in a flat structure (not a tree structure). The Import Status column can have a Completed or Error status. You can filter the categories by entering all or a portion of the category name in the Filter categories search box.

Click the category name link to view the following information.

Option Description

Name The name of the category.

Description The description of the category.

Last Modified By The name of the person who last changed the category.

Last Modified The date and time the category was last changed.

Created By The name of the person who created the category.

Created The date and time the category was created.

Terms

The Terms tab lists the terms and shows whether the term is ready to use, or needs some changes to make the term complete. The Import Status column can have these values:

● Completed: The term is completed and ready for use.
● Partial: The term was successfully imported, but there’s an issue with the term. For example, a synonym value could exceed 256 characters and is truncated. Click Partial to view the issue.
● Error: The term wasn’t successfully imported. Click Error to view the error.

The Term Status column can have a value showing Term is invalid. This status could mean that required custom attributes aren’t set.

You can filter the terms by entering all or a portion of the term name in the Filter terms search box. Click the column heading to filter or sort the results. Click the term name to go to the term, where you can edit the term definition.

Custom Attributes

The custom attributes are split into two groups. The Custom Inputs group shows the defined custom attributes. It has all the attributes that aren’t governance-related.

The Governance group pertains to the workflow and governance attributes, such as the Approver setting. Each group lists the defined attributes, showing the data type and the validation type. In SAP Information Steward, governance is not a custom attribute, but in this product you can find it on the Custom Attributes tab.

You could also see warnings on custom attributes after importing. For example, an attribute can have too many lookup values.

Export Details

Click the Export Data button to download the import data details into a JSON file.

8 Viewing Metadata

Metadata is information about your data, such as the object type, size, owner, column, and connection information. It also includes related information like rules, tags, terms, and so on.

Access Metadata

You can access your metadata in several ways.

● From Browse Connections or the Catalog, navigate to a dataset. Click  More Actions, and then select View Metadata.
● From the Monitoring page, choose any profile task, click  More Actions, and then select View Metadata.

Metadata Details

The metadata information varies depending on the connection, so you may not see all of the options listed here.

On the  Information tab, you can see properties and related objects for the dataset. The properties information shows the name, type, size, connection, and so on. The related objects section shows the tags, rulebooks, and terms applied to the dataset.

To filter the tags:

1. Click in the Filter tag names box, enter all or a portion of the tag, and then click  Search.
2. Filter the results of the search by clicking  Filter.
3. Choose to sort the results by Default, Ascending, or Descending. You can also filter by the tag hierarchy by clicking in the dropdown list and selecting the hierarchies you want to view.

In the Rulebooks section, click the rulebook name link to view the rulebook. Likewise, in the Terms section, click the term link to view the term in the Glossary.

On the  Columns tab, you can see the list of columns. Click  arrow to view the properties and related objects for that column similar to the dataset.

9 View Fact Sheet

Learn more about your data by viewing the columns, data, lineage, comments, and relationships related to your dataset.

You'll gain more information after you profile a dataset. You can view the Fact Sheet before publishing or profiling to view the basic metadata available, such as column data types and unique keys. After profiling, the fact sheet displays more information, such as distinct values and the percentage of null/blank/zero values. You can also edit user descriptions and add tags, reviews, comments, and discussions. To begin profiling from the Fact Sheet, click  Start Profiling.

Access the Fact Sheet from Monitoring, Browse Connections, or Catalog. Choose the dataset, click  More Actions, and then select View Fact Sheet. You can also access the Manage Fact Sheets page by clicking the Metadata Explorer navigation menu and choosing Administration Manage Fact Sheets .

The fact sheet consists of an upper pane (header) that contains some metadata about the dataset. You can hide the header by clicking  Collapse Header. The header is automatically hidden when you scroll down the page. To keep the header visible, pin it open: click  Pin header on press, and the header stays shown when you scroll down. Click the pin icon again to return to hiding the header.

The information in the header includes:

Option Description

Fact Sheet Name Shows the name of the dataset you’re viewing.

Status Lists the publishing, profiling, and lineage (when available).

Rating Shows the overall dataset rating from 0 to 5 stars. Zero stars means that the dataset has not been rated yet.

Published Shows the most current date that the dataset was published. If it hasn’t been published, then the date is blank.

Type Shows the type of dataset you’re viewing. This type can be a file, table, view, query, and so on.

Columns Shows the number of columns in the dataset.

Rows Shows the number of rows in the dataset.

Current Version Shows the number of the most recent profiling version. For example, if the dataset has been profiled 4 times, then the number 4 is shown.

Profiling Runtime Shows the amount of time it took to process the dataset in hours, minutes, and seconds.

Profiled Shows the date and time the dataset was profiled.

You can perform these actions from the header by clicking the icon:

Icon Description Action

Start Profiling Run a profiling task on the current dataset to learn more about your data and where it could be lacking information.


Prepare Data Assess and enhance the quality of your data and share the data.

View in Catalog View the published dataset in the catalog.

View in Browse View the source dataset in browse.

Download File Download the file and view or save it to another location.

Delete Remove all versions of profiling data.

Refresh Refresh the screen if you have published or profiled your data.

The bottom portion of the screen shows information based on the tab that you select.

Tab Description

Overview View general data about the dataset such as the properties, metrics, and reviews.

Columns View the column data including the data type, minimum and maximum values, average length, the percentage of null, blank, and zero values, distinct values, uniqueness, and the number of tags associated with the column.

Data Preview View the first 100-1000 rows of your data.

Lineage View the source, target, and transformations that include this dataset, if your dataset and connection allow lineage processing and you enabled the lineage option during publishing.

Reviews View the ratings, comments, and discussions regarding this dataset.

Relationships View the terms and tags associated with this dataset, the rulebooks that include this dataset, and a list of other datasets that have common terms, tags, or rules.

Permissions for Viewing Data in the Fact Sheet

You must have certain rights or permissions to access all the functionality on the Fact Sheet. To adjust your permissions, see your system administrator.

 Note

Some columns return sample data from the dataset, for example, Minimum - Maximum and Average Length. If you don't have permission to view data on that connection, the values in these columns aren’t shown, and the Top 10 Distinct Values aren’t shown either.

If someone with full permissions publishes a dataset to the Catalog, then only users with full permissions can view all the functionality in the Fact Sheet. Users with restricted permissions can’t see the disabled and hidden values.

Related Information

View a Summary of the Dataset [page 86]
View a Summary of Column Data [page 89]

Preview Data [page 93]
Analyze Data Lineage [page 94]
Review and Comment on a Dataset [page 104]
View Dataset Relationships [page 105]
Manage Fact Sheet Versions [page 107]
Search for Fact Sheets on a Connection [page 108]
Search for Fact Sheets Across Datasets [page 109]
Self-Service Data Preparation

9.1 View a Summary of the Dataset

View metadata about your dataset in the Fact Sheet.

The Fact Sheet Overview tab shows high-level information about your dataset. The information is more robust if you publish and profile the dataset. You can view information about the properties, column types, profiling trends, metrics, ratings, and comments. Only a subset of the blocks is shown if you haven’t published or profiled the dataset or if those tasks aren’t supported on the dataset or connection.

1. Navigate to the Fact Sheet by clicking  View Fact Sheeton the dataset that you want to view. 2. Click the Overview tab, if it isn't already open.

View Dataset Properties and Descriptions

Learn about your dataset and edit the description.

Properties

Depending on the dataset type, you can view several of these properties. The contents of the properties can change after publishing or profiling the dataset. Some sources such as SAP ABAP show additional properties. The additional properties are shown at the bottom of the block in the More Properties section.

Option Description

Connection ID Shows the name of the connection.

Connection Type Shows the type of connection such as ADL, MYSQL, S3, HANA_DB, and so on.

Qualified Name Shows the location of the dataset including the path starting at the root path defined in the connection.

Type Shows the type of dataset you’re viewing. The dataset can be a file, table, view, query, and so on.

Size Shows the size of the dataset in bytes.

Columns Shows the number of columns in the dataset.

Rows Shows the number of rows in the dataset. When viewing a large file that has used sampling in the results, the row count is estimated. See Sampling Used.


Sampling Used Indicates that the dataset was too large to profile all of the data. Therefore, only a subset of the data was profiled.

Profiled Rows Shows the number of rows used in the profiling results. The dataset was too large to profile all of the data.

Owner Shows the name or ID of the owner.

Last Modified Shows the date and time when this dataset was changed. If the Last Published or Last Profiled date is earlier than this date, then the data shown here could be out of date. Consider publishing or profiling again to get the most current data.

Last Published Shows the date and time when this dataset was most recently published.

Last Profiled Shows the date and time when this dataset was most recently profiled.

Header Indicates whether the first row of a CSV file is used as a heading row that contains the column names. If you previewed the data and the first row is actually the names of the columns, enable the Use first row as header option. If you change the Use first row as header option, we recommend that you profile the dataset again.

Character Set Shows the character set or code page used such as ISO-8859-2.

Column Delimiter Shows the character that indicates the end of a column.

Row Delimiter Shows the character that indicates the end of a row.

Text Delimiter Shows the character that indicates the end of a value.
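The delimiter and header options above describe how a delimited file is parsed. As an illustrative sketch using Python's standard `csv` module (not the product's parser; the sample data and semicolon delimiter are hypothetical):

```python
import csv
import io

# A small CSV sample; the first row holds the column names,
# and ";" is the column delimiter.
raw = "id;name;city\n1;Ann;Oslo\n2;Bo;Lima\n"

reader = csv.reader(io.StringIO(raw), delimiter=";")
rows = list(reader)

# With "use first row as header" enabled, the first row becomes
# the column names and the remaining rows are the data.
header, data = rows[0], rows[1:]
print(header)  # ['id', 'name', 'city']
print(data)    # [['1', 'Ann', 'Oslo'], ['2', 'Bo', 'Lima']]
```

If the header option were disabled, the first row would instead be treated as an ordinary data row.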

Descriptions The Descriptions card shows the Native Description and the User Description. The Native Description is assigned from the source connection and is shown here. You can edit the User Description after the dataset is published.

To add a description to the dataset, click  Edit User Description. Enter a description, and then click Save.

View Information About the Column Types

In the Dataset Metrics section of the Fact Sheet Overview tab, you can view the Column Types block. It shows a donut graph of the number of column types in your data, such as Character, Numeric, DateTime, and so on.

View Information About Profiling Trends

The profiling trends indicate the change from previous profiling tasks. If you have a dataset that has a large amount of data that needs to be consolidated, you would want to see the trend go down. If you have a dataset where you’re gaining more data, you would want to see that the trend goes up.

You have two options for viewing the data: Trend by Row Count and Trend by Size (when your connection supports this option). Trend by Row Count shows the number of rows in each profiling task. Trend by Size shows the number of bytes in each profiling task.

Up to five of the most recent profiling tasks are shown. Click one of the versions to view the following information:

Option Description

Version ID Shows the version that you’re viewing. The higher the number, the more current the profiling task.

Profile Date Shows the date and time when this version was profiled.

Row count Shows the number of rows profiled.

Size Shows the size of the dataset in bytes.

Runtime Shows how long it took to run the profiling task in hours, minutes, and seconds.

Status Shows whether the profiling task completed or had errors and warnings.

View Information About the Glossary

The Glossary Metrics block shows the number of glossary terms related to the dataset and columns. Glossary terms define and describe datasets and columns. The terms can provide context and additional meaning to the data.

View Information About the Tags

The Tag Metrics block shows the number of tags applied to the entire dataset. It also shows the number of tags applied to individual columns. You can view the number of hierarchies used and whether the tags were automatically applied or user applied. To add or remove the associated tags, go to the Relationships tab.

View Information About Ratings, Comments, and Discussions

In the People and Reviews section of the Fact Sheet Overview page, you can view information about contributors, ratings, comments, and discussions.

The Contributors block shows information about the people who have added comments, replies, and discussions, and the number of times they’ve contributed. You can also view the date and time of their latest contribution.

The Ratings and Comments block shows the users who have rated and commented on this dataset. It also shows the date and time of their most recent rating or comment.

The Discussions block shows the most recent topics and the number of replies to the topic. You can also view the date and time of their latest discussion.

To add a rating, comment, or create a topic for discussion, go to the Reviews tab.

9.2 View a Summary of Column Data

Learn basic metadata about your columns such as the name, type, and length. Then profile the data to learn about the minimum and maximum values, average lengths, percentage of null/blank/zero values, and more about all of the columns in your dataset.

View high-level data about the information in your columns. This information can help you decide how to improve your data. If you have too many blank or null values, you can work to enhance the data in that column, for example.

To view the column data:

1. Navigate to the Fact Sheet by clicking  View Fact Sheet on the dataset that you want to view.
2. Click the Columns tab.

The columns are shown. If you have a large number of columns, you can click through the pages. Only a subset of the columns described here are shown if you haven’t published or profiled the data. The information shown is also dependent on the type of dataset you're viewing.

 Note

When viewing the results of a large dataset, the results shown could be based on sampled data, not the full dataset. The affected columns include Minimum-Maximum, Average Length, and % Null/Blank/Zero.

Column Name Description

Unique Keys Indicates that the column contains data in a unique key. A unique key can be a value in a single column or a combination of values from different columns. You could see a number of values listed in this column. Each number is a unique key of one or more columns. When a column has multiple numbers, it is part of separate unique key groups. Look for other columns that share the same number to identify the columns that make up a unique key.

Name Shows the name of the column and a native description, if any. The native description is one entered on the dataset at the source and then displayed here.

Type Shows the type of column, such as string, integer, datetime, and so on.

Native Type Shows the type of data used by the remote connection, such as Varchar, Integer, Timestamp, and so on.

Template Type Shows the internal type of column for the input and output ports of the operators in Modeler pipelines.

Minimum-Maximum Includes a range of values, from the smallest value to the largest value, based on the data type.

● Character data types list the first and last values based on the alphabetical order, for example, aardvark - zebra.
● Numeric data types list the highest and lowest numeric values, for example, 1-9999.
● Date data types list the earliest and latest date or time, for example, 01/01/2001 - 08/08/2025.

Average Length Shows the mean length of the values in the specified column. The length is the sum of the lengths of all the values divided by the number of values. Only certain data types have an average length value.


% Null/Blank/Zero Lists the percentage of the values occurring as null (empty), blank (contains a blank space), or zero (value is 0).

Distinct Values Lists the number of values that are different (unique) from other values. To view the top 10 distinct column values, click the column to drill in and see the column details, including distinct values.

Uniqueness Describes the count and percentage of rows that contain unique data. Uniqueness is classified as one of three types:

● Sparse: More than 90% of the values of the column are null or blank.
● Low Cardinality: The number of distinct values in the column is less than 2%.
● Unique Values: All the values in the column are distinct from each other.

Number of Tags Shows the number of tags assigned to the column.
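The per-column metrics above (Minimum-Maximum, Average Length, % Null/Blank/Zero, Distinct Values, Uniqueness) can be illustrated with a minimal sketch. The 90% and 2% thresholds follow the definitions in this topic; everything else (the function name, treating empty strings as null/blank) is an assumption for illustration only, and the product's profiling engine is not shown here:

```python
def profile_column(values):
    """Illustrative per-column profile; empty strings stand in for null/blank."""
    n = len(values)
    non_null = [v for v in values if v not in (None, "")]
    null_blank = n - len(non_null)          # nulls plus blanks
    distinct = len(set(non_null))           # distinct non-null values

    # Uniqueness classification per the definitions above.
    if null_blank / n > 0.9:
        uniqueness = "Sparse"
    elif distinct / n < 0.02:
        uniqueness = "Low Cardinality"
    elif distinct == len(non_null) == n:
        uniqueness = "Unique Values"
    else:
        uniqueness = "None"

    return {
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "avg_length": (sum(len(str(v)) for v in non_null) / len(non_null)
                       if non_null else 0),
        "pct_null_blank": 100 * null_blank / n,
        "distinct": distinct,
        "uniqueness": uniqueness,
    }

print(profile_column(["aardvark", "zebra", "mole"]))
```

For character data, `min`/`max` fall out alphabetically (aardvark - zebra), matching the Minimum-Maximum description above.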

Filtering the Profiled Data

You can filter the data in several ways on the Columns tab.

● Column names: In the filter box above the table, enter a portion or all of a column name. The result shows only the columns that match what you have typed.
● Category: Choose one or more categories and define the type of data that you want to see in the table. Click  Filter. The Filters dialog is shown. You can filter on one category, such as Minimum/Maximum, and further refine those results by adding another Minimum/Maximum filter (by clicking the + icon), or by adding a filter to another category such as Uniqueness. If you have multiple filters within a category, they’re combined with an OR operator. All the set filters are combined with an AND operator, so the data must meet all criteria to be shown in the results table. For example, if you have Length=55 and Average Length=40, they’re combined with the OR operator. If you have Minimum/Maximum, Length/Average Length, and % Null/Blank/Zero set, then all categories are combined with the AND operator.
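The OR-within-a-category, AND-across-categories semantics described above can be sketched as boolean logic. The column metadata keys and filter names here are hypothetical, chosen only to mirror the Length=55 / Average Length=40 example:

```python
# Predicates within a category are OR-ed; the categories are AND-ed.
def matches(column, filters):
    return all(
        any(pred(column) for pred in preds)
        for preds in filters.values()
    )

# Hypothetical column metadata.
col = {"length": 55, "avg_length": 40, "pct_null": 5}

filters = {
    # Two predicates in the Length/Average Length category (OR-ed):
    "length": [lambda c: c["length"] == 55,
               lambda c: c["avg_length"] == 99],
    # AND-ed with the Null/Blank/Zero category:
    "null": [lambda c: c["pct_null"] < 10],
}

# length == 55 satisfies the first category, pct_null < 10 the second.
print(matches(col, filters))  # True
```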

Filter Category Description

Length/Average Length Length is represented in parentheses in the Type column. Average Length is in its own column. Choose Length or Average Length, then choose an operator and enter a number. Only those columns that have a length value are shown by the filter.

Minimum/Maximum There are four types of data available in the column depending on the data type. Use Minimum Length or Maximum Length to filter character type columns. Use Minimum Value or Maximum Value to filter columns of all data types.

Null/Blank/Zero Three types of data are displayed in the column. The data is shown as a percentage of the values. Depending on the data type, you can typically filter on two of the types. For example, the numeric data types can show zero and null, while character data types can show null and blank. Choose Null, Blank or Zero, and then choose an operator and enter a number. Only those columns that have data in the filter you created are shown by the filter.

Distinct Values Select the Distinct Value, choose an operator, and enter a value.

Uniqueness There are three types of data available: Low Cardinality, Sparse Values, and Unique Values. See the descriptions earlier in this topic. Choosing All shows all the uniqueness data.


Column Types Select the checkbox next to those column types that you want returned.

To remove filters, click the Filter icon and make your changes. Click OK.

Related Information

View Details About a Single Column [page 91]

9.2.1 View Details About a Single Column

View the data, tags, terms, descriptions, properties, and distinct values for a column to learn more about a single column of data.

To view the column data:

1. Navigate to the Fact Sheet by clicking  View Fact Sheet on the dataset that you want to view.
2. Click the Columns tab.
3. Click the column that you want to view.

View and Edit Column Tags

The Column Tags block shows the tags assigned to this column. To add or remove tags, follow these steps.

1. Click Manage Tags.
2. To delete tags, click the X next to the tag name in the Column Tags panel.
3. To add tags, choose the hierarchy list and select the hierarchy that contains the tags you want to use.
4. Expand the hierarchy and select the tags you want to use by clicking the checkbox next to the tag.
5. Click Close when you’ve finished.

View Related Column Terms

The Related Column Terms block shows the terms linked to this column. If there are terms from different glossaries, you can change the glossary by clicking the list in the header of the block. Click the term link to view the definition in the glossary.

View or Edit the Column Description

The Descriptions block shows the native description and the user description. The native description is the one carried over from the source. If the dataset was published, you can enter a description in the user description option by clicking  Edit User Description. After entering your description, click Save.

Preview Column Data

The Data Preview block shows up to 100 rows of values in this column.

View Column Properties

The Column Properties block shows the following information. Depending on the dataset and connection, you may not see all of these options. Some sources such as SAP ABAP show additional properties. The additional properties are shown at the bottom of the block in the More Properties section.

 Note

When viewing the results of a large dataset, the results shown could be based on sampled data, not the full dataset. The affected columns include Minimum and Maximum Values, Minimum and Maximum Length, Null, Blank, Zero, Uniqueness, and Distinct Values.

Option Description

Name Shows the name of the column

Type Shows the type of column, such as string, integer, datetime, and so on.

Length Shows the length set for the column size.

Native Type Shows the type of data used by the remote connection, such as Varchar, Integer, Timestamp, and so on.

Template Type Shows the internal type of column for the input and output ports of the operators in Modeler pipelines.

Unique Key Contains a number indicating that the column is a primary key.

Minimum Value Shows the smallest value based on the data type.

Character data types list the first and last values based on the alphabetical order, for example, aardvark-zebra.

Numeric data types list the highest and lowest numeric values, for example, 1-9999.

Date data types list the earliest and latest date or time, for example, 01/01/2001-08/08/2025.

Maximum Value Shows the largest value based on the data type.


Character data types list the first and last values based on the alphabetical order, for example, aardvark-zebra.

Numeric data types list the highest and lowest numeric values, for example, 1-9999.

Date data types list the earliest and latest date or time, for example, 01/01/2001-08/08/2025.

Minimum Length Shows the number of characters in the shortest value.

Maximum Length Shows the number of characters in the longest value.

Null Shows the percentage of null (empty) values.

Blank Shows the percentage of blank values.

Zero Shows the percentage of zero values.

Uniqueness Describes the count and percentage of rows that contain unique data. Uniqueness is classified as one of three types:

Sparse: More than 90% of the values of the column are null or blank.

Low Cardinality: The number of distinct values in the column is less than 2%.

Unique Values: All the values in the column are distinct from each other.

Distinct Values Lists the number of values that are different (unique) from other values.
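As a rough illustration of how these metrics relate to one another, the following sketch computes a subset of them over an in-memory column of strings. This is not the SAP Data Intelligence implementation; the function name is invented, and the thresholds simply follow the descriptions above (Sparse: more than 90% of values null or blank; Low Cardinality: distinct count under 2% of rows).

```python
# Illustrative sketch of the fact sheet profiling metrics described above;
# not the actual SAP Data Intelligence implementation.

def profile_column(values):
    """Compute fact-sheet style metrics for one column of string values."""
    total = len(values)
    non_empty = [v for v in values if v is not None and v != ""]
    null_or_blank = total - len(non_empty)
    distinct = set(non_empty)

    # Uniqueness classification, per the thresholds described above.
    if null_or_blank > 0.9 * total:
        uniqueness = "Sparse"
    elif len(distinct) < 0.02 * total:
        uniqueness = "Low Cardinality"
    elif len(distinct) == len(non_empty):
        uniqueness = "Unique Values"
    else:
        uniqueness = None

    return {
        "Minimum Value": min(non_empty),        # alphabetical order for strings
        "Maximum Value": max(non_empty),
        "Minimum Length": min(len(v) for v in non_empty),
        "Maximum Length": max(len(v) for v in non_empty),
        "Null": 100.0 * null_or_blank / total,  # percentage of null/blank values
        "Distinct Values": len(distinct),
        "Uniqueness": uniqueness,
    }

metrics = profile_column(["aardvark", "zebra", "aardvark", "koala"])
print(metrics["Minimum Value"], metrics["Maximum Value"])  # aardvark zebra
```

For character data, `min` and `max` return the alphabetically first and last values, matching the aardvark-zebra example in the table.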

View the Top Distinct Values

The Top 10 Distinct Values block shows a bar chart of the top 10 values that occur most frequently in the column. You can click a value to learn more about the data. The value, number of rows, and occurrence percentage are shown. If the results show sampled data, then the distinct values block shows the results for the sampled data only.
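The top-distinct-values display described above amounts to a frequency count over the column. A minimal sketch, assuming in-memory column values (the function and field names are illustrative, not part of the product):

```python
from collections import Counter

# Sketch of the Top 10 Distinct Values computation described above
# (illustrative only; assumes the column data fits in memory).
def top_distinct(values, n=10):
    total = len(values)
    return [
        {"value": v, "rows": count, "occurrence_pct": round(100.0 * count / total, 1)}
        for v, count in Counter(values).most_common(n)
    ]

print(top_distinct(["US", "US", "DE", "US", "FR", "DE"]))
```

With sampled data, `values` would hold only the sample, so the percentages describe the sample rather than the full dataset, as the note above explains.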

9.3 Preview Data

View the data from your source.

The default is to view 100 rows, but you can change the option to preview up to 1000 rows of data.

 Note

SAP HANA objects with parameters, variables, or unsupported column types cannot be profiled or shown in a data preview.

1. Navigate to the Fact Sheet.
2. Click the Data Preview tab.
3. (Optional) To change the number of rows in the preview, click the Maximum number of rows to preview value.

Preview SAP Business Warehouse Query Data

You can preview SAP BW query data by providing some variables to define what is shown in the data preview. If your query has required variables or optional variables that you can set, the Edit Parameters button is displayed on the Fact Sheet.

1. Navigate to the Fact Sheet.
2. Click Edit Parameters.
3. Based on the connection, one or more variables are listed. Select the variable that you want to define.

 Note

Click  Filter to find the required parameters, if any. The required parameters have a star next to them.

4. Choose the operator that you want to use. The list of available operators shown is based on the operators defined in the query. If IN BETWEEN is included in the list of operators, enter two values in the next step.
5. Choose from the list of values. The available values are based on those defined in the query. If the query allows multiple values, click + to define additional values.
6. Click OK to preview the data.

Related Information

View Fact Sheet [page 84]

9.4 Analyze Data Lineage

Use lineage analysis to trace back from a dataset to the source.

Use lineage analysis to view and navigate through various dependencies between objects. For example, if you have some data that has been transformed or enhanced, you can find where the data originated to learn how the dataset could have been modified.

Lineage can be run on modeler graphs created in the SAP Data Intelligence Modeler independently of the connection type. For example, a graph could move data from an Amazon S3 connection to a Microsoft ADL connection, and lineage can be extracted to show the connection between the two datasets.

Depending on the source, lineage can be shown without executing a graph. For example, if you have a view, datastores, BW Query, or InfoProviders, you can view the lineage without running a graph with that source in the Modeler. However, some sources require that a graph is executed before lineage is available. For example, data preparations must be executed before lineage is shown.

Lineage Supported Sources

Lineage can be extracted from several types of sources:

● Data preparation tasks.
● Connections on a system with lineage capabilities. Lineage is available on these connections and object types in the Metadata Explorer application:
○ SAP Business Warehouse: DataStores, InfoProviders, and BW Queries
○ SAP HANA: SQL views, column views, and synonyms
○ SAP Vora: datasource tables and views
● Modeler graph tasks. Lineage is extracted from operators that have a dataset referenced through a connection ID that was defined in Connection Management.

 Note

Every Modeler graph has a setting to disable lineage extraction. By default, lineage is extracted.

Lineage can be extracted with some limitations on these graph operators in the Modeler:

○ Azure SQL DB SQL Consumer*
○ Constant Generator
○ Data Transfer
○ Decode Table
○ DB2 SQL Consumer*
○ DB2 Table Consumer
○ Flowagent CSV Producer
○ Flowagent File Producer
○ Flowagent Table Producer
○ Format Converter
○ From File
○ Google BigQuery SQL Consumer*
○ Google BigQuery Producer
○ HANA Table Consumer
○ Initialize HANA Table
○ Multiplexer
○ MySQL SQL Consumer*
○ MySQL Table Consumer
○ OData Query Consumer
○ Open Connectors Table Consumer
○ Oracle Table Consumer
○ Oracle SQL Consumer*
○ Read File
○ Read HANA Table
○ Run HANA SQL*
○ SAP HANA Client*
○ SAP Vora Client*
○ SAP Vora Loader
○ SQL Server SQL Consumer*
○ SQL Server Table Consumer
○ To File
○ ToBlob Converter
○ ToMessage Converter
○ ToString Converter
○ Wiretap
○ Write File
○ Write HANA Table
○ x:y Multiplexer

Limitations include the following:

○ Operators referencing a system with hard-coded credentials; only operators using Connection Management with a connectionID set are extracted.
○ The extraction is based on static analysis, so any dynamic aspect is ignored. For example, lineage is not extracted in these cases:
○ When a folder is referenced in an operator
○ When an operator generates objects with some patterns such as , ,

*Operator supports SQL parsing.

○ Google BigQuery

 Note

Limitations: Google BigQuery supports two different SQL dialects: legacy and standard. The configuration used in this application is based on the standard dialect. Lineage is not extracted when the legacy dialect is used.

Wildcard table syntax is not supported: SQL queries having FROM elements containing a wildcard are not extracted for lineage.

○ IBM DB2

 Note

Limitation: Lineage is not extracted from the following SQL statements: SQL queries having FROM elements containing , , , , or clauses.

○ Microsoft SQL

 Note

Limitations: Lineage is not extracted from the following SQL statements:
○ SQL queries having SELECT elements containing the INTO clause, READTEXT|WRITETEXT|UPDATETEXT, or WITH XMLNAMESPACES clause for XML input.
○ SQL queries having FROM elements containing , <@variable>, <@variable.function_call>, , , or clauses.
○ INSERT|UPDATE|MERGE|DELETE statements with OPENQUERY, OPENROWSET, OUTPUT clause, or clauses.

This configuration can be used on Microsoft SQL server and Azure SQL Database.

○ Oracle

 Note

Limitations: Lineage is not extracted from the following SQL statements:
○ SQL queries with FROM elements containing , , , , or .
○ SQL statements using WITH elements containing , or PL/SQL.
○ INSERT|UPDATE|DELETE|MERGE statements that contain or .

○ Oracle MySQL

 Note

Limitation: Lineage is not extracted from the following SQL statements:
○ SQL queries with SELECT element .
○ The TABLE statement is not supported.

 Note

Check the Product Availability Matrix (PAM) for the supported versions of these products.
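The limitations listed above follow from the extraction being a static analysis of the SQL text. The toy function below illustrates the idea: source tables are read from FROM and JOIN clauses without executing the query, and wildcard tables are skipped rather than guessed at, mirroring the Google BigQuery limitation. This is a simplified illustration, not the parser the product uses.

```python
import re

# Toy illustration of static SQL lineage extraction (hypothetical; the
# product uses a real SQL parser). Source tables are read from the
# statement text itself, which is why dynamic constructs cannot be traced.
def extract_source_tables(sql):
    tables = set()
    for match in re.finditer(r"\b(?:FROM|JOIN)\s+([\w.`*]+)", sql, re.IGNORECASE):
        name = match.group(1).strip("`")
        if "*" in name:
            # Mirrors the wildcard-table limitation above: skip FROM
            # elements containing a wildcard instead of guessing what
            # tables they expand to at run time.
            continue
        tables.add(name)
    return sorted(tables)

sql = "SELECT a.id FROM sales.orders a JOIN sales.customers c ON a.cid = c.id"
print(extract_source_tables(sql))  # ['sales.customers', 'sales.orders']
```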

Extract Lineage

To extract lineage, set the Lineage option when creating a publication or when publishing a dataset. See Create a Publication [page 73] and Publish a Dataset [page 22].

To extract Modeler graph lineage, go to the Manage Publications page, and then click Create Publication. Complete the options as described in Create a Publication [page 73].

 Note

When selecting a source folder, you could see some icons with a lock next to them. The locked graphs are private user graphs and cannot be used. Those graphs without a lock are public/tenant and can have lineage analysis run. To activate your user graph and make it public, go to the System Management application.

Automatically Extract Lineage

Automatic lineage extraction can be run on Modeler graphs and Metadata Explorer data preparations. The benefit of automatically extracting lineage is to have a history of lineage extractions over time, and you can view how data sources have been added or removed. It also shows which datasets and transformations are included in the output of a data target. Those datasets that are grayed out in the lineage results do not directly contribute to the version of the lineage on the date you are viewing.

Lineage extraction is done automatically when you change a few settings in System Management. For details about setting up automatic lineage extraction, see Manage Metadata Automatic Lineage Extractions.

Related Information

View Lineage Results [page 98]
Configure the Lineage View [page 101]
Data Lineage Examples [page 102]

9.4.1 View Lineage Results

View the lineage graph and learn more about the datasets and transforms.

Access Lineage Graphs

From the Catalog or Browse Connections page, select an object that has lineage. Click  More Actions, and then choose View Lineage.

Likewise, you can click  View Fact Sheet, and then click the Lineage tab.

Overview of Lineage Page

The Lineage page is divided into three sections. At the top of the page, you can configure the view and search for objects. For details about configuring the view, see Configure the Lineage View [page 101].

In the middle of the page, you can view the graphical representation of the flow of data. The lineage representation shows all of the datasets and transformations.

At the bottom of the page, you can view dataset and column information, transform, and system artifact summary and detailed information.

Available History

When lineage has been extracted multiple times, click the Available History dropdown list to view the historical runs. You can see the extractions over time and view the datasets that have been added or removed. Some of the datasets and transforms may be grayed out, which means that for the date version you’re viewing, those datasets and transforms did not contribute to the lineage; they contributed on a different date. For example, if you have a graph with monthly sales reports that roll into a quarterly dataset, you may see that the sales for January are grayed out when you view the version for February.
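The grayed-out behavior can be pictured as a lookup against the extraction history: each run records which inputs contributed, and everything else known from other runs is shown as not contributing on that date. A hedged sketch with hypothetical run dates and dataset names:

```python
# Hypothetical extraction history: each run date maps to the datasets
# that contributed to the output on that date (names are illustrative).
runs = {
    "2021-01-31": {"JanSales"},
    "2021-02-28": {"FebSales"},
}

def classify(view_date):
    """Label each known dataset for the selected history version."""
    known = set().union(*runs.values())
    contributed = runs[view_date]
    # Datasets not in this run are the ones a viewer sees grayed out.
    return {d: ("Effective" if d in contributed else "Noneffective")
            for d in sorted(known)}

print(classify("2021-02-28"))  # JanSales shows as Noneffective (grayed out)
```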

Search

Click in the search box and begin typing the name of the object you want to view. You can view the nodes or lines that connect the nodes to learn more information about those objects.

Click a line to learn the name of the nodes that the line is joining and this information:

Option Description

Kind Indicates the function of the data as it passes (for example, read or write).

Update Mode Indicates the process, such as define.

Connection ID Shows the connection where the node is located.

Select a  Dataset node, and information about that dataset is shown at the bottom of the page. In the System Artifact Summary tab, you can view information about the dataset, such as the owner, when it was last modified, system and artifact types. Click the Datasets tab, and then click  More Actions to perform the following actions, when available:

● View Fact Sheet
● View in Browse
● View in Catalog
● Start Profiling
● Prepare Data

Select a  Transform node to view information about the transform. Click  More Actions, and then click Show Defining Dataset, which shows the dataset that was output from the transform.

Graph Overview

The lineage overview shows a small graph with the entire contents of lineage analysis. If you have a large graph, it could be helpful to have the lineage overview displayed. To show or hide the graph, click  Graph Overview.

View Legend

To show or hide the legend, click  Legend.

Change Graph Size

There are several ways to change the graph size.

● To increase the graph size, click  Zoom In. To decrease the graph size, click  Zoom Out.
● To view the entire graph, click  Zoom to Fit.
● To view only the graph, click  Enter Full Screen. This option removes the header information at the top of the screen and the dataset and transform information at the bottom of the screen. To return to the previous view, click  Exit Full Screen.

System Artifact Summary, Locations, and Tags

Select a  Dataset node in the graph. The System Artifact Summary tab contains this information. Depending on the object type, you could see a subset of options.

Option Description

Name Provides the name of the dataset.

Native Qualified Name Provides the location of the dataset.

Owner Provides the name of the person or system who controls the dataset.

Status Lists the processes performed on the dataset such as publish, profile, and lineage.

Type Identifies the type of dataset such as a view, table, or file.

The Datasets tab contains this information.

Column Name Description

Connection ID Provides the name of the connection

Qualified Name Provides location of the dataset.

Status Shows whether the dataset has lineage or was profiled.

Tags Shows the number of tags on the dataset or columns.

Click  More Actions to perform the following actions:

● View Fact Sheet

● View in Browse
● View in Catalog
● Start Profiling
● Create a Preparation

 Note

You may not see all these options depending on your permissions and whether the action can be performed on a dataset.

Click  View Related Objects to show the tags, terms, and rulebooks applied to the dataset. Click the links in the terms and rulebooks to view details about those objects in the glossary and rulebook.

You can filter the tags by entering a filter term, and clicking  Search. Click  Filter to sort the hierarchies, or to remove some hierarchy tags from showing. Click Apply.

In the Columns pane, you can view the columns included in the dataset as well as the data type and a description. When there are many columns, click Enter Filter and type the name of a column. If a column has been tagged, the Tags column shows the number of tags applied to the column.

Click  View Related Objects to show the tags, terms, and rulebooks applied to the column. Click the links in the terms and rulebooks to view details about those objects in the glossary and rulebook.

Transform Summary and Transformation Details

Select a  Transform node in the graph. Transforms can contain information from two types of objects.

● Datasets containing transformations (views, BEx queries)
● Computation artifacts (data pipelines)

The Transform Summary tab contains this information. Depending on the lineage type and whether it is automatic, you could see all or a portion of these options.

Option Description

Name Name of the transform.

Native Qualified Name Location of the transformation.

Defining Dataset The name of the dataset defining the transformation. Click  More Actions and choose Select Graph Node to view the output dataset node details.

Owner Person who ran the lineage task.

View Details Opens the lineage in the Modeler when it is a graph execution. Opens the lineage in Metadata Explorer when it is a data preparation execution.

Native Computation Type Indicates whether the lineage is from a graph or a data preparation.

The Execution History is shown when this lineage is automatically extracted and shows this information.

Option Description

Execution Date Date and time the lineage was extracted.

Status Effective: Indicates that the transform contributed to the output on this execution date.

Noneffective: Indicates that the transform contributed to the output on a different date.

Option Description

User The person who set up the automatic lineage.

Duration The amount of time it took to process the lineage extraction.

More Actions Click  More Actions and choose one of the following.

● View Execution in Modeler opens the graph in the Modeler.
● View Execution Parameters opens the Parameter Details dialog showing the input and output file.
● Select Graph Node shows the output dataset node details.

The Internal Computations tab contains this information.

Option Description

Identifier Lists the internal computation identifiers, when applicable. For example, the identifier could be a Modeler operator identifier in a pipeline graph.

Filter When there are many identifiers, you can click Enter Filter and type the name of the identifier.

The Transform Details pane includes this information.

Option Description

Type Identifies the type of transform such as Identity and Black Box.

Type Definition Provides information about the type of transform such as whether it can contain multiple input and/or output data.

This transform Provides information about the data coming in and going out of this transform, when applicable. Click the links to view the dataset summary of the input or output dataset.

Permissions for Viewing Data in Lineage

You must have certain rights or permissions to access all the functionality on the Lineage page. To adjust your permissions, see your system administrator. One scenario is that you do not have permission to access a dataset on a connection. Therefore, the metadata and lineage cannot be extracted.

Another scenario is that a user has permission to extract lineage, but another user does not have permission to access the connection. If you don’t have permission to access the connection, these options are disabled: View in Browse and Start Profiling. You may not be able to access some datasets. Those datasets have a lock icon and a "Permission Denied" message. You can’t view any column name, type or description information.

9.4.2 Configure the Lineage View

Configure how the nodes and lines are shown.

You can change how the nodes are displayed in the graph. You can have the nodes flow from left to right, right to left, and top to bottom. You can also change the node placement, and whether the lines going to and from the nodes should merge or remain separate.

1. Click  Settings.
2. Configure the following options.

Option Description

Fixed Node Width On truncates long object names and makes all objects the same size.

Off shows the entire object name.

Orientation Left-Right shows the graph data beginning at the left side and flowing to the right.

Right-Left shows the graph data beginning at the right side and flowing to the left.

Top-Bottom shows the graph data beginning at the top and flowing to the bottom.

Node Placement Changes the number of straight edges and the placement of nodes. Choose from Brandes-Koepf, Linear Segments, and Simple.

Node Spacing Changes the distance between the nodes. Smaller numbers place the nodes closer together.

Line Types Merge combines lines that go in the same direction and then split when necessary.

Split separates each line.

3. Click Close.

9.4.3 Data Lineage Examples

Learn about the differences between extracting lineage from a connection and a pipeline.

Extracting lineage can return different results based on whether you’re extracting from a pipeline created in the Pipeline Modeler or from a single connection. Setting up a pipeline and extracting the data varies depending on the connection, graph, and dataset type. These examples give high-level information for accomplishing lineage extraction on connections and pipelines. Read the operator details in the Modeling Guide for specific settings, and read about creating publications in the Data Governance User Guide.

If you click a transformation and click Execution History, you can see those transformations that have a status of Effective or Noneffective. The Effective datasets are contributing directly to this version of the lineage results. The Noneffective (grayed out) datasets have previously contributed or are set to contribute in the future to the lineage results, but did not contribute on this date.

Lineage Pipeline Example

Let's say that you have a table named Sales2020 on an SAP HANA connection. The data is transformed and loaded into the Sales2020 table on SAP Vora. In the Metadata Explorer, the lineage looks as follows:

The high-level steps include:

1. Open the Pipeline Modeler application, create a graph named HANA2Vora, and add the Data Transfer operator.
2. Configure the Data Transfer operator. The Source Connection is SAP HANA, and the table name is Sales2020. The Target Connection is SAP Vora, and the table name is Sales2020.
3. Open the System Management application. Click the Files tab and navigate to your HANA2Vora graph location. Right-click to activate the graph and make it available on the tenant workspace.
4. Open the Metadata Explorer application and click the Metadata Explorer navigation menu.
   1. Choose Administration Manage Publications .
   2. Click the  icon next to a connection.
   3. Click Create Publication, complete the options, and then click Run.
5. After processing is complete, click the Overview navigation pane and select Catalog. Search for Sales2020. In the search results, there are two results: one for the table on SAP Vora, and one on SAP HANA. On the SAP HANA connection, click More Actions View Lineage .
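Conceptually, the Data Transfer operator in step 2 reads every row from the source table and writes it to the identically named target table. The sketch below mimics that flow with two in-memory sqlite3 databases standing in for the SAP HANA source and SAP Vora target; the real operator is configured graphically in the Pipeline Modeler, not in code.

```python
import sqlite3

# Conceptual stand-in for the Data Transfer operator described above,
# using sqlite3 in place of the SAP HANA source and SAP Vora target.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

# The Sales2020 table exists on the source; the operator materializes
# a table of the same name on the target.
source.execute("CREATE TABLE Sales2020 (id INTEGER, amount REAL)")
source.executemany("INSERT INTO Sales2020 VALUES (?, ?)", [(1, 9.5), (2, 3.0)])

target.execute("CREATE TABLE Sales2020 (id INTEGER, amount REAL)")
rows = source.execute("SELECT id, amount FROM Sales2020").fetchall()
target.executemany("INSERT INTO Sales2020 VALUES (?, ?)", rows)

n = target.execute("SELECT COUNT(*) FROM Sales2020").fetchone()[0]
print(n)  # 2
```

Lineage extraction then records that the target Sales2020 was produced from the source Sales2020 through this transfer step.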

When you extract lineage from a pipeline, it might also trigger lineage extraction on one or more connections.

Lineage Connection Example

To create a fuller view on the connection, let's create another publication. This time, it’s a publication on an SAP Vora view named SALES1. When complete, the lineage looks as follows:

The high-level steps include:

1. In the Metadata Explorer, create a publication on the connection named SAP_Vora.
   1. Click the Overview navigation pane and select Manage Publications.
   2. Select the SAP_Vora connection.
   3. Click Create Publication.
   4. In the Source Folder, choose the SALES1 view, and click OK.
   5. Complete the other options, including setting Lineage to a value of Yes.
   6. Click Publish.

2. Click the Overview navigation pane and select Catalog.
3. Search for SALES1.
4. Click More Actions View Lineage .

The lineage graph is shown for SALES1 on SAP_Vora, including the pipeline from the pipeline example.

Related Information

Create a Publication [page 73]

9.5 Review and Comment on a Dataset

Rate, comment, and create a discussion about a published dataset.

You can rate and add comments to any published dataset. Ratings and comments provide opinions about the quality of your dataset. The rating can mean whatever you want to convey to others about the dataset. For example, you can give a dataset 5 stars to mean that it’s ready for another department to consume the data. When you or others have questions or comments about the dataset, you can create a discussion.

Navigate to the Fact Sheet, and then click the Reviews tab.

Rate a Dataset

Go to the Ratings and Comments block. To rate a dataset:

1. Click the number of stars you want to rate the dataset.
2. (Optional) Enter a comment.
3. Click OK.

If others have rated the dataset, then an overall rating is shown. You can view how many people have rated it for each star rating in the bar chart.
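The overall rating is presumably an average weighted by the per-star counts shown in the bar chart. A small sketch with hypothetical counts:

```python
# Hypothetical per-star counts from the ratings bar chart described above.
star_counts = {5: 12, 4: 6, 3: 2, 2: 0, 1: 1}

total = sum(star_counts.values())
# Weighted average: each star level contributes (stars * number of raters).
overall = sum(stars * n for stars, n in star_counts.items()) / total
print(round(overall, 1))  # 4.3
```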

Edit a Rating or Comment

Only you can edit your comments. Click  Edit Rating, and then change the number of stars or your comment. Click OK.

To create a new comment, type your comment in the Add a Comment textbox at the bottom of the block. Click  Submit.

Edit or Delete a Comment

Only you can edit or delete your comments. Select the comment and then click  More.

To delete a comment, click Delete.

To edit a comment, click Edit. When you’re done making changes, click OK.

Filter the Comments

To find a specific comment, enter all or a portion of a term, and then click  Search.

Create a Discussion

To create a topic for discussion, go to the bottom of the Discussions block. Type your comment in the Add a Topic textbox, and then click  Submit.

● To reply to a topic, click Reply. Enter your comment, and then click Reply.
● To edit a topic, click Edit. When you’re done making changes, click Save.
● To delete a topic, click Delete.
● To search for a discussion, click  More. Enter all or a portion of a term, and then click  Search.

Sort Comments or Discussions

To show the most recent comment or topic at the top of the block, click  Sort.

To view the oldest comment at the top, click  Sort.

9.6 View Dataset Relationships

Relationships can help you more fully understand your dataset.

Relationships consist of terms, tags, and rules applied to the dataset. Terms provide definitions and context for your data. Tags provide some classification. Rules provide a view into quality.

To view the relationship data:

1. Navigate to the Fact Sheet by clicking  View Fact Sheet on the dataset that you want to view.
2. Click the Relationships tab.

View Dataset Glossary Terms

The Glossary Terms block shows the terms linked to this dataset. If there are terms from different glossaries, you can change the glossary by clicking the list in the header of the block. Click the term link to view the definition in the glossary.

View, Edit, and Filter Dataset Tags

The Tags block shows the tags assigned to this dataset. To add or remove tags, follow these steps.

1. Click Manage Tags.
2. To delete tags, click the X next to the tag name in the Dataset Tags panel.
3. To add tags, choose the hierarchy list and select the hierarchy that contains the tags you want to use.
4. Expand the hierarchy and select the tags you want to use by clicking the checkbox next to the tag.
5. Click Close.

To search for tags, enter all or a portion of the tag name in the Filter tags box. Click  Search.

To sort or filter tags, click  Filter. A dialog opens showing the sorting and filtering options. In the Sort Hierarchies section, you have several options.

Option Description

Default First Shows the tags from the hierarchy that was set as the default at the top of the list.

Ascending Lists the hierarchies alphanumerically starting with a special character, A, or 0.

Descending Lists the hierarchies alphanumerically starting with Z or 9.

In the Filter Hierarchies section, click Filter tags by hierarchy, and then select the hierarchies with the terms you want to view.

Use these options together to remove some hierarchies, and to sort the hierarchies that remain. Click Apply.

View Dataset Rulebooks

The Rulebooks block shows the rulebooks that have one or more rules applied to the dataset. If there were changes to the rules and the rulebook was run, then you can view the percentage of passing rows that changed since the previous run. Click the name of the rulebook to view the rules. Click the gauge to go to the rule results page.

View Related Datasets

The Suggested Datasets block shows datasets that have user-defined links such as common rulebooks, glossary terms, or tags. Click the dataset name to view its Fact Sheet.

9.7 Manage Fact Sheet Versions

Up to five versions of profiled dataset Fact Sheets are saved. You can select which versions to keep or remove.

Context

When you profile a dataset, a Fact Sheet is created and stored. The trend chart in the Fact Sheet shows changes in the dataset's number of rows, or overall object size for each profiled version.

After you’ve profiled the same dataset five times, the sixth profile causes the oldest Fact Sheet to be removed. If there are certain Fact Sheets that you want to keep, you can remove the versions you no longer want, so you don't exceed five Fact Sheets. For example, you may want to keep the first profiled Fact Sheet as a reference point to show how much the dataset has changed after you added or removed data from the dataset. Perhaps some of the profiling tasks resulted in errors. You may want to remove those Fact Sheets and keep only the Fact Sheets that completed successfully.
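The five-version retention described above behaves like a bounded queue: appending a sixth version evicts the oldest automatically. A minimal sketch (the version labels are illustrative):

```python
from collections import deque

# Sketch of the five-version Fact Sheet retention policy described above:
# a sixth profile run evicts the oldest stored version automatically.
fact_sheets = deque(maxlen=5)

for run in ["v1", "v2", "v3", "v4", "v5", "v6"]:
    fact_sheets.append(run)

print(list(fact_sheets))  # ['v2', 'v3', 'v4', 'v5', 'v6'] -- v1 was evicted
```

Manually deleting a version you no longer want (as the procedure below allows) frees a slot so a version you do want to keep is not the one evicted next.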

Procedure

1. Click the Metadata Explorer navigation menu, and then select Administration Manage Fact Sheets .
2. Select the connection that contains the Fact Sheet that you want to view.
3. The list of Fact Sheets shows the following information.

Column Description

Dataset Name The name of the dataset.

Source Folder The name of the folder where the dataset is located.

Version History The number of times the dataset has been profiled (up to five times).
1. Click the number to view the status of each profile.
2. To delete a version, click  Delete in the row of the Fact Sheet that you want removed.

Last Profiled The date and time the dataset was last profiled.

 More Actions
○ Start Profiling: Profiles the selected dataset to make a new Fact Sheet.
○ View Fact Sheet: Opens the most current Fact Sheet.
○ Delete Fact Sheet: Removes all versions of the selected Fact Sheet.

 Note

You must have certain rights or permissions to access all the functionality on the Fact Sheet. To adjust your permissions, see your system administrator. If you don’t have permission, these options are disabled: Start Profiling and Delete Fact Sheet.

Column Description

 View Fact Sheet Opens the most current Fact Sheet. After this Fact Sheet is opened, you can navigate to the previous versions in the trend chart, when there are multiple versions.

9.8 Search for Fact Sheets on a Connection

Find a fact sheet on a profiled dataset.

Context

When you have a long list of datasets on a connection, you can search by the dataset name.

Procedure

1. From the Metadata Explorer navigation menu, choose Administration Manage Fact Sheets .
2. Choose a connection.
3. Click  Filter datasets.
4. Enter all or a portion of the fact sheet name, and press Enter.

Results

In the results, you can view the dataset name, source folder, version history, when it was last profiled and the connection name. To view the status, and the dates the dataset was profiled, click the link in Version History. You can also delete specific versions from the Version History dialog.

To start profiling, view the fact sheet, or delete all fact sheets for the highlighted dataset, click  More Actions.

9.9 Search for Fact Sheets Across Datasets

Search across all connections to find a fact sheet on a profiled dataset.

Context

When you're not certain which connection holds the fact sheet you're looking for, you can search by the dataset name.

Procedure

1. From the Metadata Explorer navigation menu, choose Catalog View Profiled Datasets .

A list of profiled datasets across connections is shown.
2. Click  Enter search terms to filter the list of datasets.
3. Enter all or a portion of the fact sheet name, and press Enter.

Results

In the results, you can view the dataset name, source folder, version history, when it was last profiled, and the connection name. To view the status, and the dates the dataset was profiled, click the link in Version History. You can also delete specific versions from the Version History dialog.

To start profiling, view the fact sheet, or delete all fact sheets for the highlighted dataset, click  More Actions .

10 Searching Datasets

Find published datasets based on the name, type, tag, or a column contained in the dataset.

Access searching functionality from the banner of the Home page or from the Discovery Dashboard and Catalog pages.

1. Enter all or a portion of the search term in the text box and click  Search. If you’re searching within a Catalog folder, then the results show only the contents of that folder and subfolders. To search across all connections, make sure that you are at the top folder level where you can see the connection names.

 Note

When searching for an SAP ABAP dataset on an older (before SAP S/4HANA) version, enter the technical name of the table.

2. To limit your search results, click  Filter and set one or more of these options:
○ Choose one or more Connection Types from the list.
○ Choose Edit tag filters by clicking the link. Enter the name of the tag in the Filter tag names box. Select the tag and click  Use Tag as Search Filter. On the Filters dialog, click Apply.
○ Choose an Average Rating by clicking the number of stars.

 Note

Datasets with the selected number of stars or higher are shown in the search results. For example, if you select 4 stars, datasets that have 4 or 5 stars are returned.

○ Choose one or more Dataset Types from the list.
3. Click Apply.

The search results are shown in the Catalog for any matches in the table or file name, assigned tag, connection or dataset type, or column name within the dataset.

The number of search results is shown with the time it took to return the results.

View the search results in a grid or list view.

In the  Grid view, you can see the object name, whether it has been published, profiled, or has lineage, and the object type. For additional information, click the following icons.

Icon Description

 View Fact Sheet View the Fact Sheet to learn more about your data by viewing the minimum and maximum values, unique values, length, null or blank values, and more.

 More Actions Published datasets show View Metadata, View Fact Sheet, View in Browse, View Parent Folder, Start Profiling, and Prepare Data. When data lineage is available, the View Lineage option is shown.

In the  List view, the following information is available:

Column Name Description

Name Shows the name of the object.

Description Shows the description of the object from the source.

Status Shows whether the dataset has been published or profiled, or has lineage.

Type Shows the object type, such as CSV, ORC, and so on.

 View Fact Sheet View the Fact Sheet to learn more about your data by viewing the minimum and maximum values, unique values, length, null or blank values, and more.

 More Actions Published datasets show View Metadata, View Fact Sheet, View in Browse, View Parent Folder, Start Profiling, and Prepare Data. When data lineage is available, the View Lineage option is shown.

11 Managing Favorites

Add or remove your favorite catalog folders.

From the Discovery Dashboard page, you can see the Favorites tile that shows five links to your favorite catalog folders. You can have more than five favorites. Click Show all to view all of them. Select one of the favorites to go to that catalog folder. Click Manage to go to the Manage Favorites page.

You can access the Favorites page by clicking the Metadata Explorer navigation menu and choosing Settings Favorites .

To create favorites, navigate to a catalog folder and click  Toggle Favorite. To remove a favorite, click Toggle Favorite again.

To search for a favorite folder, enter all or a portion of the name in Filter folder names.

To change the number of favorites displayed, choose another setting in the Page Size option.

To view the catalog folder, click the name of the folder.

12 Functions

A function takes input, such as values from a column, and produces a return value.

Use built-in functions when you create filters and define rules. Use the Advanced Editor to either manually type the built-in function, or click the Functions button to view the list of categories and functions. The functions are available when creating or editing conditions and filters in rules.

When you use the Advanced Editor, select a function from the category list. Hover over a function, and a tooltip opens to assist you in completing the function you chose. The function is inserted at the current position within the rule script editor.

An expression can call most built-in functions. However, a function cannot call itself or call another function that leads to a recursive call.

 Example

Function “A” can't call function “A”. If function “B” calls function “A”, then function “A” can't call function “B”.

The following table describes each built-in function. Use the features in the table to view functions: Hide select columns, organize columns in ascending or descending order, and search for terms.

Function Option Category Description

abs [page 117] Absolute Math Returns the absolute value of an input number.

add_months [page 118] Add months Date Adds a given number of months to a date.

concat_date_time [page 119] Concat date time Date Returns a datetime from separate date and time inputs.

date_diff [page 120] Date difference Date Returns the difference between two dates or times.

date_part [page 121] Date part Date Extracts a component of a given date.

day_in_month [page 122] Day in month Date Determines the day in the month on which the given date falls. Result is the number from 1 to 31 that represents the day in the month.

day_in_week [page 123] Day in week Date Determines the day in the week on which the given date falls.

day_in_year [page 124] Day in year Date Determines the day in the year on which the given date falls.

decode [page 125] Decode Lookup Returns an expression based on the first condition in the specified list that evaluates to TRUE.

 Note Use the decode function instead of the ifthenelse( ) function, which is not supported.


exists [page 127] Exists Lookup Checks if a row exists that matches the provided filter expression and column.

fiscal_day [page 129] Fiscal day Date Converts a given date into an integer value representing a day in a fiscal year.

isweekend [page 130] Is weekend Date Indicates whether a date corresponds to Saturday or Sunday.

Is data dependent Lookup Checks the data dependencies between primary and dependent columns.

Is unique Lookup Checks the data dependencies between primary and dependent columns.

is_valid_date [page 131] Is valid date Validation Indicates whether an expression can be converted into a valid calendar date value.

is_valid_datetime [page 132] Is valid datetime Validation Indicates whether an expression can be converted into a valid datetime value.

is_valid_decimal [page 133] Is valid decimal Validation Indicates whether an expression can be converted into a valid decimal value.

is_valid_double [page 134] Is valid double Validation Indicates whether an expression can be converted into a valid double value.

is_valid_int [page 135] Is valid int Validation Indicates whether an expression can be converted into a valid integer value.

is_valid_real [page 136] Is valid real Validation Indicates whether an expression can be converted into a valid real value.

is_valid_time [page 137] Is valid time Validation Indicates whether an expression can be converted into a valid time value.

julian [page 138] Julian value Date Converts a date to its integer Julian value, which is a six-digit number. The number represents a date between 1 January 1900 and 31 December 2899.

julian_to_date [page 139] Julian to date Conversion Converts a Julian value to a date.

last_date [page 140] Last date of month Date Returns the last date of the month for a given date.

length [page 141] String length String Returns the number of characters in a given string.

lookup [page 142] Lookup value Lookup Returns the values looked up from the database, SAP application, or views.

lower [page 145] To lower case String Changes the characters in a string to lowercase.

lpad [page 146] Left pad String Pads a string with characters from a specified pattern.

lpad_ext [page 147] Left pad logical String Pads a string with logical characters from a specified pattern.

ltrim [page 148] Left trim String Removes specified characters from the start of a string.


match_pattern [page 150] Match pattern String Matches whole input strings to simple patterns supported by this application. This function does not match substrings.

match_regex [page 153] Match regular expression String Matches whole input strings to the pattern that you specify with regular expressions and flags. The regular expressions are based on the POSIX standard. The match_regex function does not match substrings.

mod [page 158] Modulo Math Returns the remainder when one number is divided by another.

month [page 159] Month Date Determines the month of the given date.

nvl [page 160] Replace null Lookup Replaces NULL values.

quarter [page 161] Quarter Date Determines the quarter in which the given date falls.

replace_substr [page 162] Replace substring String Returns a string where every occurrence of a given search string in the input is substituted by the given replacement string.

round [page 166] Round Math Rounds a given number to the specified precision.

rpad [page 167] Right pad String Pads a string with characters from a specified pattern.

rpad_ext [page 168] Right pad logical String Pads a string with logical characters from a specified pattern.

rtrim [page 170] Right trim String Removes specified characters or blank characters from the end of a string.

soundex [page 171] Soundex String Encodes the input string using the Soundex algorithm and returns a string. Use this function when you want to push down the function to the database level.

sqrt [page 172] Square root Math Returns the square root of the given expression.

substr [page 173] Substring String Returns a specific portion of a string starting at a given point in the string.

sysdate [page 175] System date Date Returns the current date as listed by the operating system.

systime [page 176] System time Time Returns the current time as listed by the operating system.

to_char [page 177] Convert to char Conversion Converts a date or numeric data type to a string.

to_date [page 179] Convert to date Conversion Converts a string to a date.

to_decimal [page 181] Convert to decimal Conversion Converts a string to a decimal, with an optional precision parameter.

trunc [page 182] Truncate Math Truncates a given number to the specified precision, without rounding the value.


upper [page 183] To upper case String Changes the characters in a string to uppercase.

week_in_month [page 184] Week in month Date Determines the week in the month in which the given date falls.

week_in_year [page 185] Week in year Date Returns the week in the year in which the given date falls.

word [page 187] Word in string String Returns the word identified by its position in a delimited string.

year [page 188] Year Date Determines the year in which the given date falls.

Related Information abs [page 117] add_months [page 118] concat_date_time [page 119] date_diff [page 120] date_part [page 121] day_in_month [page 122] day_in_week [page 123] day_in_year [page 124] decode [page 125] exists [page 127] fiscal_day [page 129] isweekend [page 130] is_valid_date [page 131] is_valid_datetime [page 132] is_valid_decimal [page 133] is_valid_double [page 134] is_valid_int [page 135] is_valid_real [page 136] is_valid_time [page 137] julian [page 138] julian_to_date [page 139] last_date [page 140] length [page 141] lookup [page 142] lower [page 145] lpad [page 146] lpad_ext [page 147] ltrim [page 148]

match_pattern [page 150] match_regex [page 153] mod [page 158] month [page 159] nvl [page 160] quarter [page 161] replace_substr [page 162] round [page 166] rpad [page 167] rpad_ext [page 168] rtrim [page 170] soundex [page 171] sqrt [page 172] substr [page 173] sysdate [page 175] systime [page 176] to_char [page 177] to_date [page 179] to_decimal [page 181] trunc [page 182] upper [page 183] week_in_month [page 184] week_in_year [page 185] word [page 187] year [page 188]

12.1 abs

The abs function returns the absolute value of an input number.

 Syntax

abs(<input_number>)

Category

Math

Return value

decimal, double, int, or real

The data type of the return value is the same as the data type of the original number.

Where

<input_number>  The source number.

 Example

Function Results

abs(12.12345) 12.12345

abs(-12.12345) 12.12345

12.2 add_months

The add_months function adds a given number of months to a date or date column.

 Syntax

add_months(<input_date>, <months_to_add>)

Category

Date

Return value

Date

Where

<input_date>  The starting date, in year.month.date format. For example, 2020.10.01. The data type is datetime.

<months_to_add>  The number of months to add to <input_date>. The data type is integer.

Details

Enter an integer for <months_to_add>. If <input_date> is the last day of the month, or the resulting month has fewer days than the day component of <input_date>, the result is the last day of the resulting month. Otherwise, the result has the same day component as <input_date>.

 Example

Function Results

add_months(SalesOrder.DueDate, 1) Result is the date from the DueDate column plus one month.

add_months('2001.10.31', 4) '2002.2.28'
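The end-of-month clamping described above can be sketched in Python. This is an illustration of the documented semantics only, not the product's implementation, and `add_months` here is a hypothetical helper:

```python
import calendar
from datetime import date

def add_months(start: date, months: int) -> date:
    # Mirror the documented rule: clamp the day to the last day
    # of the resulting month when needed.
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))

print(add_months(date(2001, 10, 31), 4))  # 2002-02-28
```

October 31 plus four months lands in February, which has only 28 days in 2002, so the day component is clamped, matching the '2002.2.28' result above.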

12.3 concat_date_time

The concat_date_time function returns a datetime from separate date and time inputs.

 Syntax

concat_date_time(<date>, <time>)

Category

Date

Where

<date>  Input date value or name of column with date value.

<time>  Input time value or name of column with time value.

Return value datetime

The datetime value obtained by combining the inputs.

 Example

Function Result

concat_date_time($shipDate, $shipTime) Concatenates the ship date and time into one datetime data type value.

12.4 date_diff

The date_diff function returns the difference between two dates or times.

 Syntax

date_diff(<start_date>, <end_date>, <date_format>)

Category

Date

Return value int

Where

<start_date>  Input start date.

<end_date>  Input end date.

<date_format>  The string that describes the format of the dates. Format codes:

D Day

H Hours

M Minutes

S Seconds

MM Months

YY Years

 Example

Function Result

date_diff($start_date, sysdate(), 'D') The number of days between the date in $start_date and today's date.

date_diff($start_time, systime(), 'M') The number of minutes between the time in $start_time and the current time.
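The format codes above map naturally onto timedelta arithmetic. The Python sketch below illustrates the documented meaning of each code; it is not the product's implementation, and rounding behavior at boundaries is an assumption:

```python
from datetime import datetime

def date_diff(start: datetime, end: datetime, fmt: str) -> int:
    # 'D'/'H'/'M'/'S' are elapsed-time differences; 'MM'/'YY' are
    # calendar-component differences, per the format-code table.
    delta = end - start
    if fmt == 'D':
        return delta.days
    if fmt == 'H':
        return int(delta.total_seconds() // 3600)
    if fmt == 'M':
        return int(delta.total_seconds() // 60)
    if fmt == 'S':
        return int(delta.total_seconds())
    if fmt == 'MM':
        return (end.year - start.year) * 12 + (end.month - start.month)
    if fmt == 'YY':
        return end.year - start.year
    raise ValueError(f"unknown format code: {fmt}")

print(date_diff(datetime(2021, 1, 1), datetime(2021, 3, 1), 'D'))  # 59
```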

12.5 date_part

The date_part function extracts a component of a given date.

 Syntax

date_part(<input_date>, <date_format>)

 Note

The software displays the year as four digits, not two.

Category

Date

Return value

integer

Where

<input_date>  The input date.

<date_format>  The string that describes the format of the extracted part of the date. Format codes:

YY Year

MM Month

DD Day

HH Hours

MI Minutes

SS Seconds

 Example

Function Result

date_part(to_date('1990.12.31', 'yyyy.mm.dd'), 'YY') 1990

date_part('1991.01.17 23:44:30', 'SS') 30

12.6 day_in_month

The day_in_month function determines the day in the month on which the given date falls.

 Syntax

day_in_month(<input_date>)

Category

Date

Return value integer

The number from 1 to 31 that represents the day in the month.

Where

<input_date>  The source date. Data type is datetime.

 Example

Function Results

day_in_month(to_date('Jan 22, 1997','mon dd, yyyy')) 22

day_in_month(to_date('02/29/1996','mm/dd/yyyy')) 29

day_in_month(to_date('1996.12.31','yyyy.mm.dd')) 31

12.7 day_in_week

The day_in_week function determines the day in the week on which the given date falls.

 Syntax

day_in_week(<input_date>)

Category

Date

Return value

integer

The number from 1 (Monday) to 7 (Sunday) that represents the day in the week on which the date falls.

Where

<input_date>  The source date. Data type is datetime.

 Example

Function Results

day_in_week(to_date('Jan 22, 1997','mon dd, yyyy')) 3 (Wednesday)

day_in_week(to_date('02/29/1996','mm/dd/yyyy')) 4 (Thursday)

day_in_week(to_date('1996.12.31','yyyy.mm.dd')) 2 (Tuesday)
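The 1 (Monday) through 7 (Sunday) numbering above happens to match Python's `date.isoweekday()`, which makes the convention easy to check. A minimal illustration, not product code:

```python
from datetime import date

def day_in_week(d: date) -> int:
    # ISO weekday numbering: 1 = Monday ... 7 = Sunday,
    # the same convention the guide documents.
    return d.isoweekday()

print(day_in_week(date(1997, 1, 22)))  # 3 (Wednesday)
```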

12.8 day_in_year

The day_in_year function determines the day in the year on which the given date falls.

 Syntax

day_in_year(<input_date>)

Category

Date

Return value

integer

A number from 1 to 366 that represents the day in the year on which the date falls.

Where

<input_date>  The source date. Data type must be datetime.

 Example

Function Results

day_in_year(to_date('Jan 22, 1997','mon dd, yyyy')) 22

day_in_year(to_date('02/29/1996','mm/dd/yyyy')) 60

day_in_year(to_date('1996.12.31','yyyy.mm.dd')) 366

(1996 was a leap year.)

12.9 decode

The decode function returns an expression based on the first condition in the specified list that evaluates to true.

 Syntax

decode(<condition>, <expression>, <default_expression>)

 Note

decode is available only for views and rules.

Category

Lookup

Return value

Expression or default expression

Returns the <expression> value associated with the first <condition> that evaluates to true. The data type of the return value is the data type of the first <expression>.

If the data type of any subsequent <expression> or the <default_expression> is not convertible to the data type of the first <expression>, this application produces an error at validation. If the data types are convertible but do not match, a warning appears at validation.

Where

<condition>  Condition that evaluates to true or false.

<expression>  Value that the function returns if the corresponding <condition> is the first to evaluate to true.

<default_expression>  Expression that the function returns if none of the conditions in the <condition> list evaluate to true. You must specify a <default_expression>.

 Example

Function Results

decode(($COUNTRY/REGION = 'FRANCE'), 'French',
($COUNTRY/REGION = 'GERMANY'), 'German',
($COUNTRY/REGION = 'ITALY'), 'Italian',
($COUNTRY/REGION = 'SWITZERLAND'), 'Swiss',
($COUNTRY/REGION = 'USA'), 'America',
($COUNTRY/REGION IS NULL), 'Unknown',
'Others')

If the value in the column $COUNTRY/REGION is FRANCE, the value returned is French. If $COUNTRY/REGION is NULL, the value returned is Unknown. If $COUNTRY/REGION does not contain any of the values listed, the decode function returns the value Others.

Tips for Using Decode

Use decode in place of ifthenelse, which is not supported. The decode function is less error-prone than nested “if else” functions.

 Example

decode((EMPNO = 1), '111',
(EMPNO = 2), '222',
(EMPNO = 3), '333',
(EMPNO = 4), '444',
'NO_ID')

To improve performance, this application pushes this function to the database server when possible. Thus, the database server evaluates the decode function.

Use the decode function to apply multiple conditions when you map columns or select columns. For more flexible control over conditions in a script, use IF from the Keyword list in the Advanced Rule Editor.

If a condition compares a varchar value with trailing blanks, the decode function ignores the trailing blanks.

To compare a NULL value (NULL constant or variable that contains a NULL constant), use IS NULL or IS NOT NULL from the Operator list in the Advanced Rule Editor. If you use the Equal (=) or Not equal to (<>) operator, the comparison against a NULL value always evaluates to FALSE.

When you use decode in the Advanced Expression Editor for a view, add additional <condition> and <expression> parameters to the function by selecting Add Condition.
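The first-true-condition-wins behavior of decode has the same shape as a chained conditional. A Python sketch of the semantics (illustrative only; the rule language itself has its own syntax):

```python
def decode(*args):
    # Arguments are condition/value pairs followed by one default,
    # mirroring the decode signature documented above.
    *pairs, default = args
    it = iter(pairs)
    for condition, value in zip(it, it):
        if condition:
            return value
    return default

empno = 3
print(decode(empno == 1, '111',
             empno == 2, '222',
             empno == 3, '333',
             empno == 4, '444',
             'NO_ID'))  # 333
```

Evaluation stops at the first true condition, which is why decode replaces nested if/else chains cleanly.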

12.10 exists

The exists function checks if a row exists that matches the provided filter expression and column.

 Syntax

exists(<connection_name>, <table_name>, <column_name>, <lookup_value>)

The exists function is available only for rules.

Category

Lookup

Return value

boolean

Where

 Note

The <column_name> and <lookup_value> variables can be defined only once.

<connection_name>  The connection that contains the lookup table.

 Note

If the lookup table is a view, select Views for the connection name.

<table_name>  The lookup table. The table, file, or view that contains the results or values that you are looking up.

When you enter the name, ensure that you use the following format as applicable:

● Table: <schema_name>.<table_name>
● File: <file_name>
● View: <view_name>

<column_name>  The column in the lookup table specified in <table_name> that the function uses to find a matching row.

 Note

When the function reads a varchar column in the lookup table, it does not trim trailing blanks.

<lookup_value>  The value that the function searches for in the specified <column_name>. The data type must be varchar.

When you select a value for , keep in mind the following information:

● The value can be a simple column reference, such as a column found in both a source and the lookup table, or an expression. The value can also be a complex expres­ sion given in terms of constants and input column refer­ ences. ● If you enter a simple column reference that is a unique source column, you do not have to include a table name qualifier. ● If you enter a value from another table, or the value is not unique among the source columns, a table name qualifier is required. ● If you enter an empty string, the function searches for a zero-length varchar value in the . ● The function ignores trailing blanks in comparisons of and values in .

 Example

The following example function searches the table "Orders" in the connection named "HR" for the parameter "$name" in the "customer_name" column.

exists ('HR', 'dbo.Orders', 'customer_name', $name)
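Semantically, that call is a SQL EXISTS probe against the lookup table. The sketch below shows the idea with an in-memory SQLite table; the table name, column, and data are illustrative stand-ins, not the product's connection machinery:

```python
import sqlite3

# Hypothetical stand-in for the lookup table behind a connection.
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE orders (customer_name TEXT)")
conn.execute("INSERT INTO orders VALUES ('Acme')")

def exists(table: str, column: str, value: str) -> bool:
    # Probe whether any row matches, like the rule-language exists().
    row = conn.execute(
        f"SELECT EXISTS(SELECT 1 FROM {table} WHERE {column} = ?)", (value,)
    ).fetchone()
    return bool(row[0])

print(exists('orders', 'customer_name', 'Acme'))   # True
print(exists('orders', 'customer_name', 'Nadir'))  # False
```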

12.11 fiscal_day

The fiscal_day function converts a given date into an integer value representing a day in a fiscal year.

 Syntax

fiscal_day(<start_of_fiscal_year>, <input_date>)

Category

Date

Return value

integer

Where

<start_of_fiscal_year>  The first month and day of a fiscal year. Use this format: 'mm.dd'. The data type must be varchar.

<input_date>  The date you want to convert. The data type must be datetime.


 Example

Function Results

fiscal_day('03.01', to_date('1999.04.20', 'yyyy.mm.dd')) 51
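The fiscal-day arithmetic can be sketched in Python: count days from the most recent fiscal-year start on or before the input date, counting the start day as day 1. This is an illustration of the documented behavior, not the product's implementation:

```python
from datetime import date

def fiscal_day(start_mm_dd: str, d: date) -> int:
    # start_mm_dd is the fiscal year start in 'mm.dd' format.
    mm, dd = (int(p) for p in start_mm_dd.split('.'))
    start = date(d.year, mm, dd)
    if d < start:
        # The fiscal year began in the previous calendar year.
        start = date(d.year - 1, mm, dd)
    return (d - start).days + 1

print(fiscal_day('03.01', date(1999, 4, 20)))  # 51
```

With a March 1 fiscal start, April 20 is the 31 days of March plus 20 days of April, which is day 51, matching the example above.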

12.12 isweekend

The function isweekend indicates whether a date corresponds to Saturday or Sunday.

 Syntax

isweekend(<input_date>)

Category

Date

Return value

boolean

Returns true if the date is a Saturday or Sunday; returns false if the date is not a Saturday or Sunday.

Input values

<input_date>  The date or datetime data type value to test.

 Example

Function Results

isweekend($hire_date) Tests whether the date in $hire_date is a Saturday or Sunday.

isweekend(SYSDATE) Tests whether today is a Saturday or Sunday.
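The weekend test reduces to a weekday-number check. A minimal Python illustration of the documented semantics (not product code):

```python
from datetime import date

def isweekend(d: date) -> bool:
    # isoweekday(): Saturday = 6, Sunday = 7.
    return d.isoweekday() >= 6

print(isweekend(date(2021, 9, 4)))  # True (a Saturday)
print(isweekend(date(2021, 9, 6)))  # False (a Monday)
```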

12.13 is_valid_date

The is_valid_date function indicates whether an expression can be converted into a valid calendar date value.

 Syntax

is_valid_date(<input_string>, <date_format>)

Category

Validation

Return value boolean

Returns true if the string can be converted into a valid date; returns false if the string can't be converted into a valid date.

Where

<input_string>  The expression to be validated.

If the expression doesn't resolve to a value of data type varchar, the software converts the value to varchar and issues a warning.

<date_format>  The string that identifies the date format of the input string. The following are the supported date formats: MM/DD/YYYY, DD/MM/YYYY, YYYY/DD/MM.

The data type must be varchar.


 Example

Function Results

is_valid_date($SubmitDate, 'mm/dd/yyyy') Tests whether the string $SubmitDate can be converted to a calendar date with the mm/dd/yyyy date format.

is_valid_date('01/34/2002', 'mm/dd/yyyy') Returns false because there is no such date as January 34th.
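Validation of this kind is a parse-and-catch pattern. A Python sketch of the idea, where the guide's mm/dd/yyyy token maps to Python's %m/%d/%Y (an analogy, not the product's format grammar):

```python
from datetime import datetime

def is_valid_date(value: str, fmt: str) -> bool:
    # Valid only if the string parses under the given format.
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

print(is_valid_date('01/31/2002', '%m/%d/%Y'))  # True
print(is_valid_date('01/34/2002', '%m/%d/%Y'))  # False -- no January 34th
```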

12.14 is_valid_datetime

The is_valid_datetime function indicates whether an expression can be converted into a valid datetime value.

 Syntax

is_valid_datetime(<input_string>, <datetime_format>)

Category

Validation

Return value boolean

Returns true if the string can be converted into a valid datetime; returns false if the string can't be converted into a valid datetime.

Where

<input_string>  The expression to be validated. The value must be varchar.

<datetime_format>  The string identifying the datetime format of the input expression. The value must be varchar.

 Example

Function Results

is_valid_datetime($Received, 'mm/dd/yyyy hh24:mi:ss') Returns true if the string $Received can be converted to the datetime format of mm/dd/yyyy hh24:mi:ss.

is_valid_datetime('01/14/2002 26:56:09', 'mm/dd/yyyy hh24:mi:ss') Returns false because the date is not valid; there is no such hour as "26", even on the 24-hour clock.

12.15 is_valid_decimal

The is_valid_decimal function indicates whether an expression can be converted into a valid decimal value.

 Syntax

is_valid_decimal(<input_string>, <decimal_format>)

Category

Validation

Return value boolean

Returns true if the string can be converted to a valid decimal; returns false if the string can't be converted to a valid decimal.

Input values

<input_string>  The expression to be validated.

<decimal_format>  A string indicating the decimal format of the input expression.

Keep in mind the following information:

● Use pound characters (#) to indicate digits and a decimal.
● If necessary, include commas as thousands indicators. For example, to specify a decimal format for numbers smaller than 1 million with 2 decimal digits, use the following string: '#,###,###.##'.
● To indicate a negative decimal number, add a minus sign (-) at the beginning or end of this value.

 Example

Function Results

is_valid_decimal($Price, '###,###.##') Tests whether the expression for the parameter $Price can be converted to a decimal format using the stated format.
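One way to picture the '#'-mask rules is as a regular-expression translation. This is a rough sketch only: the real function's handling of partial digit groups and other edge cases is richer than this, and the translation below is an assumption for illustration:

```python
import re

def is_valid_decimal(value: str, fmt: str) -> bool:
    # Turn each '#' in the mask into a digit class; keep ',' and '.'
    # literal; allow an optional leading or trailing minus sign.
    pattern = re.escape(fmt).replace(r'\#', r'\d')
    return re.fullmatch('-?' + pattern + '-?', value) is not None

print(is_valid_decimal('123,456.78', '###,###.##'))  # True
print(is_valid_decimal('abc', '###,###.##'))         # False
```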

12.16 is_valid_double

The is_valid_double function indicates whether an expression can be converted into a valid double value.

 Syntax

is_valid_double(<input_string>, <double_format>)

Category

Validation

Return value boolean

Returns true if the string can be converted into a valid double; returns false if the string can't be converted into a valid double.

Input values

<input_string>  The expression to be validated.

<double_format>  A string indicating the double format of the input expression. Use pound characters (#) to indicate digits and a decimal. If necessary, include commas as thousands indicators. For example, to specify a double format for numbers smaller than 1 million with 2 decimal digits, use the following string: '#,###,###.##'.

 Example

Function Results

is_valid_double ($Weight,'###.###') Tests whether the string $Weight can be converted to double format.

12.17 is_valid_int

The is_valid_int function indicates whether an expression can be converted into a valid integer value.

 Syntax

is_valid_int(<input_string>, <int_format>)

Category

Validation

Return value boolean

Returns true if the string can be converted into a valid integer; returns false if the string can't be converted into a valid integer.

Input values

<input_string>  The expression to be validated.

<int_format>  The format specifying the thousands separator of the input expression. Use pound characters (#) to indicate digits. If necessary, include commas as thousands indicators. For example, to specify an integer format, use the following string: '#.###.###'. Valid separators include the period (.) and the comma (,). However, you can use only one valid separator type in a format. The separator defaults to the comma (,) when none is specified.

 Example

Function Results

is_valid_int($Volume,'#,##,###') Tests whether the string $Volume can be converted to the integer format.

12.18 is_valid_real

The is_valid_real function indicates whether an expression can be converted into a valid real value.

 Syntax

is_valid_real(<input_string>, <real_format>)

Category

Validation

Return value boolean

Returns true if the string can be converted into a valid real; returns false if the string can't be converted into a valid real.

Where

<input_string>  The expression to be validated.

<real_format>  A string indicating the real format of the input expression. Use pound characters (#) to indicate digits and a decimal. For example, to specify a real format for numbers smaller than 1 million with 2 decimal digits, use the following string: '#,###,###.##'.

 Example

Function Results

is_valid_real ($Mean,'#,###.#####') Tests whether the string $Mean can be converted to real format.

12.19 is_valid_time

The function is_valid_time indicates whether an expression can be converted into a valid time value.

 Syntax

is_valid_time(<input_string>, <time_format>)

Category

Validation

Return value

boolean

Returns true if the string can be converted into a valid time; returns false if the string can't be converted into a valid time.

Where

<input_string>  The expression to be validated.

<time_format>  The string identifying the time format of the input expression. Construct the time format using HH, HH24, MI, or SS.

 Example

Function Results

is_valid_time($ReceivedTime, 'hh24:mi:ss') Tests whether the string $ReceivedTime can be converted to the hh24:mi:ss time format.

12.20 julian

The julian function converts a date to its integer Julian value, the number of days between the start of the Julian calendar and the date.

 Syntax

julian(<input_date>)

Category

Date

Return value integer

The Julian representation of the date.

Where

<input_date>  The source date value of data type date or datetime.

 Example

Function Results

julian(to_date('Apr 19, 1997', 'mon dd, yyyy')) 729436

12.21 julian_to_date

The julian_to_date function converts a Julian value to a date.

 Syntax

julian_to_date(<input_julian>)

Category

Conversion

Return value

date

The date that corresponds to the input Julian value.

Where

<input_julian>  The source Julian value. The data type is integer.

 Example

Function Results

julian_to_date($Julian_Date) Converts the number in $Julian_Date to its date value.

12.22 last_date

The last_date function returns the last date of the month for a given date.

 Syntax

last_date(<input_date>)

Category

Date

Return value date

Where

<input_date>  The source value of data type date or datetime.

 Example

Function Result

last_date(to_date('1990.10.01', 'yyyy.mm.dd')) '1990.10.31'
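Finding the last day of a month is a one-liner with the standard calendar module. A Python illustration of the documented behavior (not the product's implementation):

```python
import calendar
from datetime import date

def last_date(d: date) -> date:
    # monthrange returns (first weekday, number of days in month).
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])

print(last_date(date(1990, 10, 1)))  # 1990-10-31
```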

12.23 length

The length function returns the number of characters in a given string.

 Syntax

length(<input_string>)

Category

String

Return value integer

The number of characters in <input_string>.

Where

<input_string>  Source column name, variable, or other element of data type varchar for which the length is calculated.

 Example

Function Result

length($lname) 5 if $lname is 'jones'.

12.24 lookup

The lookup function returns the values looked up from the database, SAP application, or view.

 Syntax

lookup(<connection_name>, <table_name>, <compare_column>, <compare_value>, <result_column>, <default_value>)

 Note

The lookup function is available only for rules.

Category

Lookup

Return value any type

The value in the <result_column> from the row that meets the lookup requirements. The return data type is the same as the <result_column>.

Where

<connection_name>  The connection that contains the lookup table. If the lookup table is a view, select Views for the connection name.

<table_name>  The lookup table. The table, file, or view that contains the results or values that you are looking up.

When you enter the name, ensure that you use the following format as applicable:

● Table: <schema_name>.<table_name>
● File: <file_name>
● View: <view_name>

<compare_column>  The column in the specified <table_name> that the function uses to find a matching row.

 Note

When the function reads a varchar column in the lookup table, it does not trim trailing blanks.

<compare_value>  The value that the function searches for in the specified <compare_column>.

When you enter a value, keep in mind the following information:

● The value can be a simple column reference, such as a column found in both a source and the lookup table, or an expression.
● The value can be a complex expression given in terms of constants and input column references.
● When the value refers to a unique source column, a table name qualifier is not required.
● When the value refers to another table, or is not unique among the source columns, a table name qualifier is required.
● If the value is an empty string, the function searches for zero-length varchar values in the column specified in <compare_column>.
● The function ignores trailing blanks when comparing the values in <compare_value> and values in <compare_column>.

<result_column>  A column in the lookup table that contains the values you want to retrieve.

<default_value>  The value returned when there is no matching record in the lookup table.

 Example

Function Returns

lookup('HR', 'dbo.Orders', 'customer_name', $name, 'order_type', 7, 'order_id', null) Returns the column order_id if there is a row with customer_name=$name and order_type=7 in table HR.dbo.Orders.

Details

You can specify more than one and pair by adding additional pairs. The values must match for all specified pairs so that the lookup function can find a matching row.

The lookup function uses a value that you provide in the expression to find a corresponding value in a file or different table. Specifically, the function searches for the row in the lookup table where the value in the compare column matches the value in the expression. The function returns the value in the result column from this matching row.

 Example

If your source schema uses a customer ID to identify each row, but you want the customer name in your target schema, you can use the lookup function to return the customer name that corresponds with the customer ID.

In SQL terms, the lookup function evaluates the expression for each row, then executes the following command:

SELECT <result_column>
FROM <lookup_table>
WHERE <compare_column> = <expression>

The value returned by this SQL SELECT statement is the result of the lookup function for the row.

When there are no matching records in the lookup table, the function returns the default value. When multiple matching rows exist in the lookup table, the function returns a row based on whether the lookup table is a standard RDBMS table, an SAP application table, or a flat file:

● Standard RDBMS table: The lookup function finds the matching row with the maximum value in the result column and returns that value.
● SAP application tables or flat files: The lookup function randomly selects a matching row and returns the value in the result column for that row.

When creating rules that use the lookup function in the Advanced Editor, be aware that the metadata for the columns involved affects the results that are returned, because the metadata maximums are used in the comparisons. For example, suppose a lookup returns a string from a VARCHAR(5) column. If that string is compared to a string that has 6 characters, only the first 5 characters are used in the comparison.
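The lookup semantics described above can be sketched in Python. This is an illustration only, not the product implementation: the lookup table is modeled as a list of row dicts, the compare pairs as (column, value) tuples, and a hypothetical varchar_lengths map stands in for the column metadata used in comparisons.

```python
# Illustrative model of lookup: all compare pairs must match; trailing
# blanks are ignored and comparisons are clipped to the column's
# declared VARCHAR length, mirroring the metadata note above.
def lookup(table_rows, compare_pairs, result_column, default_value,
           varchar_lengths=None):
    varchar_lengths = varchar_lengths or {}

    def comparable(column, value):
        s = str(value).rstrip()              # ignore trailing blanks
        n = varchar_lengths.get(column)      # VARCHAR(n) metadata, if any
        return s[:n] if n else s

    matches = [row for row in table_rows
               if all(comparable(c, row[c]) == comparable(c, v)
                      for c, v in compare_pairs)]
    if not matches:
        return default_value                 # no match -> default value
    # RDBMS-table behavior: return the maximum result-column value.
    return max(row[result_column] for row in matches)

orders = [
    {"customer_name": "Ava", "order_type": 7, "order_id": 101},
    {"customer_name": "Ava", "order_type": 7, "order_id": 205},
    {"customer_name": "Bo",  "order_type": 3, "order_id": 300},
]
print(lookup(orders, [("customer_name", "Ava"), ("order_type", 7)],
             "order_id", None))   # multiple matches -> maximum: 205
```

With no matching row, the call returns the supplied default value (None here), matching the default-value rule above.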

12.25 lower

The lower function changes the characters in a string to lowercase.

 Syntax

lower(<input_string>)

Category

String

Return value varchar

The return data type is the same as the data type of <input_string>. Any characters that are not letters are left unchanged.

Where

The source column that contains a string to be output in lowercase.

 Example

Function Results

lower('Accounting101') 'accounting101'

12.26 lpad

The lpad function pads the string with characters from a specified pattern.

 Syntax

lpad(<input_string>, <size>, <pad_string>)

Category

String

Return value varchar

Returns the modified string. The return type is the same as <input_string>. Any characters that are not letters are left unchanged.

Where

The source column that contains the string to pad.

An integer value indicating the total number of characters in the returned string.

A character or set of characters that the function concatenates to the left of the input string until the input string reaches the set size.

Details

This function adds the specified pattern of characters from the left of the input string until the total string length is the specified length. If the input string is already longer than the specified length, the software truncates the string.

 Example

Function Results

lpad('Tanaka', 15, ' ') A value padded with 9 spaces to the left of the characters Tanaka:

'         Tanaka'

lpad($last_name, 25, ' ') The value from the $last_name parameter, padded with spaces on the left to total 25 characters. If the value in $last_name totals more than 25 characters, the value is truncated to 25 characters.
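The padding and truncation rules can be sketched in Python; this is a simplified stand-in for the function, assuming plain single-byte characters:

```python
# Sketch of lpad: concatenate the pad pattern on the left until the
# string reaches `size`; inputs already longer than `size` are truncated.
def lpad(input_string, size, pad_string):
    if len(input_string) >= size:
        return input_string[:size]               # too long -> truncate
    fill = size - len(input_string)
    return (pad_string * fill)[:fill] + input_string

print(repr(lpad('Tanaka', 15, ' ')))   # '         Tanaka' (9 spaces)
print(lpad('abcdefgh', 4, 'x'))        # 'abcd'
```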

12.27 lpad_ext

The lpad_ext function pads a string with logical characters from a specified pattern.

 Syntax

lpad_ext(<input_string>, <size>, <pad_string>)

Category

String

Return value varchar

Returns the modified string. The return data type is the same as the <input_string> data type. Any characters that are not letters are left unchanged.

Where

The source for the string.

An integer value indicating the number of characters in the return string.

A logical character or set of logical characters that this function concatenates to the left of <input_string>.

Details

This function repeats the pattern at the beginning of the input string until the final string is the specified length. If the <input_string> is already longer than the specified length, the software truncates the string.

 Note

The logical characters prohibit this function from being pushed down to the database for processing.

 Example

In the following examples, the character "?" represents a Chinese ideograph, which is a double-byte character that occupies two cells on a display device or printer.

Function Results

lpad_ext("abcd", 10, "?") "??????abcd"

lpad_ext("abc??", 4, " ") "abc?" (the input string is truncated to four logical characters)

12.28 ltrim

Use the ltrim function to remove specified characters or blank characters from the start of a string.

 Syntax

ltrim(<input_string>, <trim_string>)

 Note

ltrim is case-sensitive.

Category

String

Return value varchar

Returns the modified string. The return data type is the same as the data type of <input_string>.

Where

The varchar string to be modified.

The varchar character or characters to remove from <input_string>.

Details

The function scans <input_string> left-to-right, removing all characters that appear in <trim_string> until it reaches a character not in <trim_string>.

 Example

To remove all leading blanks in a string, use ltrim as follows:

ltrim(EMPLOYEE.NAME, ' ')

where EMPLOYEE.NAME specifies the NAME column in the EMPLOYEE table.

 Example

Function Results

ltrim('Marilyn', ' ') 'Marilyn'

ltrim('ABCABCD', 'ABC') 'D'

ltrim('ABCABCD', 'EFG') 'ABCABCD'

Function Results

ltrim('ABCDEABCDE', 'ABC') 'DEABCDE'
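The left-to-right scan can be sketched in Python; note that the second argument acts as a set of characters, not a substring:

```python
# Sketch of ltrim: drop characters from the left while each one appears
# anywhere in trim_string; stop at the first character that does not.
def ltrim(input_string, trim_string):
    i = 0
    while i < len(input_string) and input_string[i] in trim_string:
        i += 1
    return input_string[i:]

print(ltrim('ABCABCD', 'ABC'))   # 'D'
print(ltrim('ABCABCD', 'EFG'))   # 'ABCABCD'
```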

12.29 match_pattern

The match_pattern function matches whole input strings to simple patterns supported in this application.

 Syntax

match_pattern(<input_string>, <pattern_string>)

 Note

The match_pattern function does not match substrings.

Category

String

Return value

boolean

Returns true for a match, otherwise false.

Where

Varchar string to be matched. Supports Unicode characters.

Varchar pattern to find in an input string.

 Note

Substring matches are not supported.

Use the characters in the following table for the values in <pattern_string>.

Character Description

X A string of uppercase letters. Refer to Unicode 4.0 General Category, Lu = Letter, uppercase. Also, Unicode case mappings, for example, Latin, Greek, Cyrillic, Armenian, Deseret, and archaic Georgian.

x A string of lowercase letters. Refer to Unicode 4.0 General Categories:

● Ll = Letter, lowercase (case mappings, for example, Latin, Greek, Cyrillic, Armenian, Deseret, and archaic Georgian)
● Lt = Letter, title case (case mappings, for example, Latin capital letter D with small letter Z)
● Lm = Letter, modifier (case mappings, for example, acute accent, grave accent)
● Lo = Letter, other (case mappings, for example, Chinese and Japanese)

9 A string of numbers.

\ Escape character.

* Any characters occurring zero or more times.

? Any single character occurring once and only once.

[ ] Any one character inside the brackets occurring once.

[!] Any character except the characters after the exclamation point. For example, [!12] allows any number pattern, such as a postcode, that does not start with a 1 or 2.

All other characters represent themselves. To specify a special character as itself, use escape characters.

 Example

[!9] means any character except a digit. To exclude only the literal character 9, the correct pattern is [!\9].

 Example

Use Case Pattern Function Call Results

The pattern excludes strings in the ZIP Code column that begin with a 1 or a 2.
Pattern: "[!12]9999"
Function call: if (match_pattern('15014', '[!12]9999') <> 0) print('matched'); else print('not matched');
Results: The string 15014 does not match the pattern. Therefore, the function prints "not matched".

Pattern: "[!12]9999"
Function call: if (match_pattern('55014', '[!12]9999') <> 0) print('matched'); else print('not matched');
Results: The string 55014 matches the pattern. Therefore, the function prints "matched".

The pattern is applied to the PHONE_NUM column in the CUSTOMER table.
Pattern: "999-999-9999"
Function call: WHERE MATCH_PATTERN(CUSTOMER.PHONE_NUM,'999-999-9999') <> 0
Results: When the string in PHONE_NUM does not match the pattern, throw error 0.

More examples: The following table displays example values and the matching pattern strings.

Example Value Pattern string

Henrick Xxxxxxx

DAVID XXXXX

Tom Le Xxx Xx

Real-time Xxxx-xxxx

JJD)$@&*hhN8922hJ7# XXX)$@&*xxX9999xX9#

1,553 9,999

Example Value Pattern string

0.32 9.99

-43.88 -99.99

Returns names with last name Jones *Jones

Returns David1 or David2 or David3 David[123]
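To make the token table above concrete, here is a hedged Python sketch that translates a match_pattern string into a regular expression and applies it to the whole input (substrings are never matched). It covers only the core tokens and simplifies the X/x/9 "string of" semantics to one-or-more character classes; it is not the product's matcher.

```python
import re

TOKEN_MAP = {'X': '[A-Z]+',   # string of uppercase letters (simplified)
             'x': '[a-z]+',   # string of lowercase letters (simplified)
             '9': '[0-9]+',   # string of numbers (simplified)
             '*': '.*',       # any characters, zero or more times
             '?': '.'}        # any single character, once

def match_pattern(input_string, pattern_string):
    parts, i = [], 0
    while i < len(pattern_string):
        ch = pattern_string[i]
        if ch == '\\':                          # escaped literal
            parts.append(re.escape(pattern_string[i + 1])); i += 2
        elif ch == '[':                         # [..] or [!..] set
            j = pattern_string.index(']', i)
            body = pattern_string[i + 1:j]
            neg = body.startswith('!')
            parts.append(('[^' if neg else '[')
                         + re.escape(body[1:] if neg else body) + ']')
            i = j + 1
        else:
            parts.append(TOKEN_MAP.get(ch, re.escape(ch))); i += 1
    # Whole-string match only: match_pattern does not match substrings.
    return re.fullmatch(''.join(parts), input_string) is not None

print(match_pattern('55014', '[!12]9999'))   # True
print(match_pattern('15014', '[!12]9999'))   # False
print(match_pattern('Henrick', 'Xxxxxxx'))   # True
```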

12.30 match_regex

The match_regex function matches whole input strings to the pattern in a regular expression that is based on the POSIX standard, and flags.

 Syntax

match_regex(<input_string>, <pattern_string>, <flags>)

POSIX (Portable Operating System Interface) refers to the POSIX.1 standard, IEEE Standards 1003.1. POSIX.1 defines system interfaces and headers with relevance for string handling and internationalization. The XPG3, XPG4, Single Unix Specification (SUS) and other standards include POSIX.1 as a subset. The patterns in this application adhere to the current standard.

 Note

The match_regex function does not match substrings.

Category

String

Return value boolean

Returns true for a match, otherwise false.

Where

String to be matched. Supports Unicode characters.

Pattern to find in a whole input string. Substring matches are not supported.

Provide the pattern in regular expression format with a varchar data type.

Specifies additional behavior while searching the input string for pattern matches.

To exclude a flag, enter Null.

Separate multiple options for flags with a comma.

 Note

Flag options are case-sensitive and need to be specified in uppercase.

Regular Expression for Pattern

The following table contains the regular expression patterns for the <pattern_string> variable.

Characters for pattern Character Description

\a Match a BELL, \u0007 character.

\A Match at the beginning of the input. Differs from ^ in that \A will not match after a new line within the input.

\b, outside of a [Set] Match if the current position is a word boundary. Boundaries occur at the transitions between word (\w) and non-word (\W) characters, with combining marks ignored. For better word boundaries, see ICU Boundary Analysis.

\b, within a [Set] Match a BACKSPACE, \u0008.

\B Match if the current position is not a word boundary.

\cX Match a control-X character.

\d Match any character with the Unicode General Category of Nd (Number, Decimal Digit.)

\D Match any character that is not a decimal digit.

\e Match an ESCAPE, \u001B.

\E Terminates a \Q ... \E quoted sequence.

Character Description

\f Match a FORM FEED, \u000C.

\G Match if the current position is at the end of the previous match.

\n Match a LINE FEED, \u000A.

\N{UNICODE CHARACTER NAME} Match the named character.

\p{UNICODE PROPERTY NAME} Match any character with the specified Unicode Property.

\P{UNICODE PROPERTY NAME} Match any character not having the specified Unicode Property.

\Q Quotes all following characters until \E.

\r Match a CARRIAGE RETURN, \u000D.

\s Match a white-space character. White space is defined as [\t\n\f\r\p{Z}].

\S Match a non-white space character.

\t Match a HORIZONTAL TABULATION, \u0009.

\uhhhh Match the character with the hex value hhhh.

\Uhhhhhhhh Match the character with the hex value hhhhhhhh. Exactly eight hex digits must be provided, even though the largest Unicode code point is \U0010ffff.

\w Match a word character. Word characters are [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}].

\W Match a non-word character.

\x{hhhh} Match the character with hex value hhhh. From one to six hex digits may be supplied.

\xhh Match the character with two-digit hex value hh

\X Match a Grapheme Cluster.

\Z Match if the current position is at the end of input, but before the final line terminator, if one exists.

\z Match if the current position is at the end of input.

\n Back Reference. Match whatever the nth capturing group matched. n must be at least 1 and no greater than the total number of capture groups in the pattern. Note: ICU regular expressions do not support octal escapes, such as \012.

[pattern] Match any one character from the set. See Unicode Set for a full description of what may appear in the pattern.

. Match any character.

^ Match at the beginning of a line.

$ Match at the end of a line.

Character Description

\ Quotes the following character. Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ | \ . /

Operators for Pattern

The following table contains regular expression operators to use in the <pattern_string> variable.

Operators for pattern Operator Description

| Alternation. A|B matches either A or B.

* Match 0 or more times. Match as many times as possible.

+ Match 1 or more times. Match as many times as possible.

? Match zero or one times. Prefer one.

{n} Match exactly n times.

{n,} Match at least n times. Match as many times as possible.

{n,m} Match between n and m times. Match as many times as possible, but not more than m.

*? Match 0 or more times. Match as few times as possible.

+? Match 1 or more times. Match as few times as possible.

?? Match zero or one times. Prefer zero.

{n}? Match exactly n times.

{n,}? Match at least n times, but no more than required for an overall pattern match

{n,m}? Match between n and m times. Match as few times as possible, but not less than n.

*+ Match 0 or more times. Match as many times as possible when first encountered; do not retry with fewer even if the overall match fails (Possessive Match).

++ Match 1 or more times. Possessive Match.

?+ Match zero or one times. Possessive Match.

{n}+ Match exactly n times

{n,}+ Match at least n times. Possessive Match.

{n,m}+ Match between n and m times. Possessive Match.

( ... ) Capturing parentheses. Range of input that matched the parenthesized subexpression is available after the match.

Operator Description

(?: ... ) Non-capturing parentheses. Groups the included pattern, but does not provide capturing of matching text. More efficient than capturing parentheses.

(?> ... ) Atomic-match parentheses. First match of the parenthesized subexpression is the only one tried; if it does not lead to an overall pattern match, back up the search for a match to a position before the "(?>)".

(?# ... ) Free-format comment (?# comment).

(?= ... ) Look-ahead assertion. True if the parenthesized pattern matches at the current input position, but does not advance the input position.

(?! ... ) Negative look-ahead assertion. True if the parenthesized pattern does not match at the current input position. Does not advance the input position.

(?<= ... ) Look-behind assertion. True if the parenthesized pattern matches text preceding the current input position, with the last character of the match being the input character just before the current position. Does not alter the input position. The length of possible strings matched by the look-behind pattern must not be unbounded (no * or + operators.)

(?<! ... ) Negative look-behind assertion. True if the parenthesized pattern does not match text preceding the current input position. Does not alter the input position.

(?ismx-ismx: ... ) Flag settings. Evaluate the parenthesized expression with the specified flags enabled or disabled.

(?ismx-ismx) Flag settings. Change the flag settings. Changes apply to the portion of the pattern following the setting. For example, (?i) changes to a case insensitive match.

Flag Variables

The following table contains flags for the <flags> variable.

Flag variables Flag Options Description

CASE_INSENSITIVE When set, matching takes place in a case-insensitive manner.

COMMENTS When set, allows white space and #comments within patterns.

Flag Options Description

DOTALL When set, a "." in a pattern matches a line terminator in the input text. By default, it does not match.

A carriage-return / line-feed pair in text behaves as a single line terminator and matches a single "." in a regular expression pattern.

 Example

Use Case Pattern Function Call

Match phone numbers in (408)-933-6000 format
Pattern: '\([0-9]{3}\)-[0-9]{3}-[0-9]{4}'
Function call: match_regex(pho_number,'\([0-9]{3}\)-[0-9]{3}-[0-9]{4}',NULL)

Match a string that starts with "topicA" regardless of case
Pattern: 'topicA.*'
Function call: match_regex(subject,'topicA.*',CASE_INSENSITIVE)

Check a string against a complex pattern and print the result to the trace log
Pattern: 'XXX)$@&*xxX9999xX9#'
Function call: if (match_pattern('JJD)$@&*hhN8922hJ7#', 'XXX)$@&*xxX9999xX9#') <> 0) print('matched'); else print('not matched');

The result for this call is "matched".
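Python's re module follows a very similar POSIX/ICU-style syntax, so the whole-string matching and the flag handling can be sketched as follows (the FLAG_MAP names belong to this sketch, not to the product's internals):

```python
import re

# Map the documented flag names onto Python's re flags.
FLAG_MAP = {'CASE_INSENSITIVE': re.IGNORECASE,
            'COMMENTS': re.VERBOSE,
            'DOTALL': re.DOTALL}

def match_regex(input_string, pattern_string, flags=None):
    re_flags = 0
    for name in (flags.split(',') if flags else []):
        re_flags |= FLAG_MAP[name.strip()]      # flags are uppercase
    # fullmatch mirrors the rule that substrings are never matched.
    return re.fullmatch(pattern_string, input_string, re_flags) is not None

print(match_regex('(408)-933-6000',
                  r'\([0-9]{3}\)-[0-9]{3}-[0-9]{4}', None))          # True
print(match_regex('TOPICA.news', r'topicA.*', 'CASE_INSENSITIVE'))   # True
```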

12.31 mod

The mod function returns the remainder when one number is divided by another.

 Syntax

mod(<dividend>, <divisor>)

 Note

The % operator produces the same result.

Category

Math

Return value

Integer

Where

Integer to be divided.

Divisor of first integer.

 Example

Function Result

mod(10,3) 1

mod(17,5) 2

mod(10,5) 0

12.32 month

The month function determines the month of the given date.

 Syntax

month(<date>)

Category

Date

Return value int

The number from 1 to 12 that represents the month component.

Where

The source date with a datetime data type.

 Example

Function Results

month(to_date('Jan 22, 1997', 'mon dd, yyyy')) 1

month(to_date('3/97', 'mm/yy')) 3

12.33 nvl

The nvl function replaces NULL values.

 Syntax

nvl(<expression>, <replacement_value>)

 Note

The nvl function is available only for views and rules.

Category

Lookup

Return value

any type

If <expression> is not NULL, returns the value of <expression>. If <expression> is NULL, returns the value of <replacement_value>.

Where

The value to be tested for NULL.

The value that replaces <expression> if it is NULL.

<replacement_value> must be the same data type as <expression>.

 Example

Function Results

nvl($modification_date, sysdate()) If the $modification_date for a row has not been set, this function inserts today's date.

12.34 quarter

The quarter function determines the quarter of the given date.

 Syntax

quarter(<date>)

Category

Date

Return value int

The number from 1 to 4 that represents the quarter component.

Where

The source date with a datetime data type.

 Example

Function Results

quarter(to_date('Jan 22, 1997', 'mon dd, yyyy')) 1

quarter(to_date('5/97', 'mm/yy')) 2

12.35 replace_substr

The replace_substr function returns a string where every occurrence of a given search string in the input is substituted by the given replacement string.

 Syntax

replace_substr(<input_string>, <search_string>, <replace_string>, <start_at_occurrence>, <number_of_occurrences>)

Category

String

Return Value varchar

Where

The varchar input string to change. If NULL, returns NULL.

The string to search for. If <search_string> is NULL, the function returns the input as varchar.

You can use /x0000 to specify the hexadecimal value for a special character.

 Example

If <search_string> is /x00A and the software encounters /x, the function converts the next four characters to a hexadecimal value and then to a Unicode character.

Specifying the hexadecimal value for a special character using /x000 provides more flexibility when you use a search string.

You can also represent special characters using the escape character '/'. SAP Data Insight supports the following special characters:

/a: Bell (alert)

/b: Backspace

/f: Formfeed

/n: New line

/r: Carriage return

/t: Horizontal tab

/v: Vertical tab

To include the escape character '/' in the search string, escape it using '//'.

 Example

If <search_string> = 'abc/de', the software converts the <search_string> to 'abcde'.

If <search_string> = 'abc//de', the software converts the <search_string> to 'abc/de'.

The string that replaces <search_string>. If you omit <replace_string>, or it is NULL, the software removes all occurrences of <search_string>.

Specifies the occurrence to start replacing. If <start_at_occurrence> is NULL, the function starts at the first occurrence.

 Example

Enter 2 for <start_at_occurrence> to replace or remove starting from the second occurrence of <search_string>.

The number of occurrences to replace. If <number_of_occurrences> is NULL, the function replaces all occurrences.

 Example

If <number_of_occurrences> is 2, the software replaces or removes two sequential occurrences of the <search_string>.

Details

Also use replace_substr to search for the following substrings:

● A hexadecimal value that refers to a UNICODE character
● A non-printable character reference such as a form feed or new line

Optionally, specify a start position and the number of occurrences to replace.

 Example

Function Result

Function Result

replace_substr('ayyyayyyayyyayyy', 'a', 'B', 2, 2)
Replaces 'a' with 'B' starting from the second occurrence, for two occurrences: 'ayyyByyyByyyayyy'

replace_substr('ayyyayyyayyyayyy', 'a/n', 'B', 2, 2)
Searches for 'a' followed by a new line and replaces it with 'B' starting from the second occurrence, for two occurrences: 'ayyyByyyByyyayyy'

replace_substr('ayyyayyyayyyayyy', 'a/x000a', 'B', 2, 2)
Searches for 'a' followed by a new line (given as a hexadecimal value) and replaces it with 'B' starting from the second occurrence, for two occurrences: 'ayyyByyyByyyayyy'
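A Python sketch of the occurrence-window logic, with NULL arguments modeled as None (assumption: plain substring search, without the /x escape handling):

```python
# Sketch of replace_substr: replace occurrences of search_string,
# starting at the start_at-th occurrence, for at most `count` occurrences.
def replace_substr(input_string, search_string, replace_string=None,
                   start_at=None, count=None):
    if input_string is None:
        return None                         # NULL input -> NULL
    replace_string = replace_string or ''   # omitted/NULL -> remove
    start_at = start_at or 1                # NULL -> first occurrence
    out, i, seen = [], 0, 0
    while True:
        j = input_string.find(search_string, i)
        if j < 0:
            out.append(input_string[i:])
            return ''.join(out)
        seen += 1
        out.append(input_string[i:j])
        in_window = (seen >= start_at and
                     (count is None or seen < start_at + count))
        out.append(replace_string if in_window else search_string)
        i = j + len(search_string)

print(replace_substr('ayyyayyyayyyayyy', 'a', 'B', 2, 2))
# -> 'ayyyByyyByyyayyy'
```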

12.36 round

Use the round function to round a given number to the specified precision.

 Syntax

round(<number>, <precision>)

Category

Math

Return value decimal, double, int, or real

The rounded number. The return data type is the same as the value in <number>.

Where

The source number.

An integer indicating the number of decimals in the result. If <precision> is negative, the software rounds the digits to the left of the decimal point.

 Example

Function Results

round(120.12345, 2) 120.12

round(120.12999, 2) 120.13

round(120, -2) 100

round(120.123, 5) 120.12300
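The negative-precision case is easy to misread, so here is a short Decimal-based sketch of the examples above (ROUND_HALF_UP is this sketch's assumption about the rounding mode; binary floats are routed through str to avoid representation noise):

```python
from decimal import Decimal, ROUND_HALF_UP

def round_prec(number, precision):
    # precision 2 -> quantum 0.01; precision -2 -> quantum 1E+2 (hundreds)
    quantum = Decimal(1).scaleb(-precision)
    return Decimal(str(number)).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_prec(120.12345, 2))   # 120.12
print(round_prec(120.12999, 2))   # 120.13
print(round_prec(120, -2))        # 1E+2 (equals 100)
```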

12.37 rpad

The rpad function pads a string with characters from a specified pattern.

 Syntax

rpad(<input_string>, <size>, <pad_string>)

Category

String

Return value varchar

The modified string. The return type is the same as <input_string>. Any characters that are not letters are left unchanged.

Where

The source string.

An integer value indicating the number of characters in the resulting string.

A character or set of characters that this function concatenates to <input_string>.

Details

The rpad function repeats the pattern at the end of the input string until the final string is the appropriate length. If the input string is already longer than the expected length, the function truncates the string.

 Example

Function Results

rpad('Tanaka',15,' ') 'Tanaka         '

rpad($last_name,25,' ') The value in the column $last_name, padded with spaces to 25 characters, or truncated to 25 characters.

12.38 rpad_ext

The rpad_ext function pads a string with logical characters from a specified pattern.

 Syntax

rpad_ext(<input_string>, <size>, <pad_string>)

Category

String

Return value varchar

The modified string. The return type is the same as the input value. Any characters that are not letters are left unchanged.

Where

The source string.

An integer value indicating the number of characters in the return string.

A logical character or set of logical characters that this function concatenates to <input_string>.

Details

The rpad_ext function repeats the pattern at the end of the input string until the final string is the appropriate length. If the input string is already longer than the expected length, this function truncates the string.

 Note

These logical characters prohibit the rpad_ext function from getting pushed down to an Oracle database.

 Example

Function Results

rpad_ext('Tanaka',15,' ') 'Tanaka         '

rpad_ext($last_name,25,' ') The value in the column $last_name, padded with spaces to 25 characters, or truncated to 25 characters.

12.39 rtrim

The rtrim function removes specified characters or blank characters from the end of a string.

 Syntax

rtrim(<input_string>, <trim_string>)

Category

String

Return value

varchar

The modified string. The return type is the same as <input_string>.

Where

The string to be modified.

The characters to remove from <input_string>.

Details

The rtrim function scans <input_string> from right to left, removing all characters that appear in <trim_string> until it reaches a character not in <trim_string>.

The function removes trailing blanks only if <trim_string> contains trailing blanks. If the length of the modified string becomes zero after trimming, the function returns '' (empty string).

 Example

Function Results

rtrim('Marilyn ', ' ') 'Marilyn'

rtrim('ZABCABC', 'ABC') 'Z'

rtrim('ZABCABC', 'EFG') 'ZABCABC'

12.40 soundex

The soundex function encodes the input string using the Soundex algorithm and returns a string.

 Syntax

soundex(<input_string>)

 Tip

Use the soundex function when you want to push down the function to the database level.

Category

String

Return value

varchar(4)

Returns a string containing the Soundex encoding of the input string. The return string length is always four characters.

Where

The source string that will be encoded.

Details

When you use the soundex function, keep in mind the following information:

● Use this function for input strings in English only. The software ignores non-English characters.
● The software ignores any invalid leading characters in the input string.
● If the software can't encode the input string, it returns '0000'.

 Example

Function Result

print(soundex('Hello')); Prints the Soundex of the word “Hello.”

$VAR=soundex($emp_name); Returns the Soundex encoding for the string stored in the $emp_name and then assigns it to $VAR.

$VAR=soundex('1234567'); Returns '0000' because the input data is numeric.
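For reference, the classic textbook Soundex encoding can be written in a few lines of Python; the product's encoder may differ on edge cases (for example, H/W separators), so treat this as an approximation, not the exact implementation:

```python
# Digit classes of the standard Soundex algorithm.
CODES = {**dict.fromkeys('BFPV', '1'), **dict.fromkeys('CGJKQSXZ', '2'),
         **dict.fromkeys('DT', '3'), 'L': '4',
         **dict.fromkeys('MN', '5'), 'R': '6'}

def soundex(s):
    letters = [c for c in s.upper() if 'A' <= c <= 'Z']  # English only
    if not letters:
        return '0000'                      # nothing encodable
    digits = [CODES.get(c, '') for c in letters]
    out, prev = [letters[0]], digits[0]
    for d in digits[1:]:
        if d and d != prev:                # collapse adjacent duplicates
            out.append(d)
        prev = d
    return (''.join(out) + '000')[:4]      # always four characters

print(soundex('Hello'))    # H400
print(soundex('Robert'))   # R163
print(soundex('1234567'))  # 0000
```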

12.41 sqrt

The sqrt function returns the square root of a given number.

 Syntax

sqrt(<number>)

Category

Math

Return value float

Return value is NULL if the input is negative.

Where

Input value Description

The number for which you want the square root.

 Example

Function Results

sqrt(625.25); 25.005000

12.42 substr

The substr function returns a specific portion of a string, starting at a given point in the string.

 Syntax

substr(<input_string>, <start>, <length>)

Category

String

Return value varchar

The modified string. The return data type is the same as the <input_string>.

Where

The string to be modified.

The position of the first character in the new string.

● Most of the time, the first character in <input_string> is position number 1.
● If <start> is 0, the new string begins with the first character, position number 1.
● If <start> is negative, the function counts characters from the end of <input_string>, and the output string begins with the character in that position from the end of the string.

The function returns NULL or an empty string under the following circumstances:

● If <start> is greater than the number of characters in <input_string>, the function returns NULL.
● If <start> is negative and points before the first character, the function returns an empty string.

The number of characters in the resulting string.

● If <length> is 0 or negative, the function returns an empty string.
● If <length> is greater than the number of characters remaining in <input_string> after <start>, the function returns only the remaining characters.
● The function keeps the trailing blanks in the remaining <input_string> after <start>.

 Example

Function Results

substr('94025-3373', 1, 5) '94025'

substr('94025-3373', 7, 4) '3373'

substr('94025', 7, 4) NULL

substr('Dr. Schultz', 4, 18) 'Schultz'

substr('San Francisco, CA',-4, 18) ', CA'
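The 1-based indexing rules overlap in places, so this Python sketch picks one consistent reading (start 0 behaves like 1, a negative start counts from the end, and an out-of-range start yields NULL or an empty string):

```python
def substr(input_string, start, length):
    if length <= 0:
        return ''                         # zero/negative length -> empty
    n = len(input_string)
    if start > n:
        return None                       # start past the end -> NULL
    if start < 0:
        if start < -n:
            return ''                     # before position 1 -> empty
        idx = n + start                   # count from the end
    else:
        idx = max(start - 1, 0)           # 1-based; 0 behaves like 1
    return input_string[idx:idx + length]

print(substr('94025-3373', 1, 5))            # '94025'
print(substr('94025-3373', 7, 4))            # '3373'
print(substr('San Francisco, CA', -4, 18))   # ', CA'
```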

12.43 sysdate

The sysdate function returns the current date as listed by the operating system.

 Syntax

sysdate()

Category

Date

Return value

date

The current system date.

Details

There are no input fields for this function.

 Note

The value that the sysdate function returns is a date value. Internally, the software reads both the date and the time when it runs a sysdate function. The data that is used by the job depends on the data type of a particular column. For example, if the data type of a column in a query is date, this application uses only the date for calculations and ignores the time data. If you change the data type to datetime, both a date and a time are used.

 Example

Function Results

isweekend(sysdate()) Tests whether the system date is a Saturday or Sunday.

Function Results

to_char(sysdate(), 'yyyy.mm.dd') Converts the output system date, which is a datetime data type, to a string that displays only the date.

As this example shows, you can use sysdate to exclude part of the datetime data by providing a format only for the data to display in a report.

12.44 systime

The systime function returns the current time as listed by the operating system.

 Syntax

systime()

Category

Date

Return value time

The current time.

Details

There are no parameters for the systime function.

 Example

Function $timestamp = sql('my_connection',('UPDATE status_table SET job_start_time = \' '|| to_char(systime(),'hh24:mi:ss.ff'))||'\'');

Result This expression updates the job_start_time column of the status_table with the current time. It also formats the time data.

Function to_char(systime(),'hh24:mi:ss.ff')

Result Trims date data from the systime function in cases where it is added by default. Set the column that contains this expression to the data type varchar.

The data type for a column that calls the systime function should be time. If the data type is set to datetime, this application adds the default date for the datetime data type (1900:01:01) because systime does not read dates.

12.45 to_char

Converts a date or numeric data type to a string.

 Syntax

to_char(<input>, <format>)

The to_char function supports the Oracle 9i timestamp data type, up to 9 digits precision for subseconds.

Category

Conversion

Return value varchar

Where

The value from the source: an int, real, double, numeric, or decimal value, or a date, time, or datetime value.

String that indicates the format of the generated string.

Data Governance User Guide Functions PUBLIC 177 If the input value is a number, use one of the following codes for the value.

Format codes Format Description Example

9 Number. Suppresses leading and trailing zeros. Includes a leading minus sign “-” for negative numbers. Includes one leading space for positive numbers.
to_char(123,'9999') = ' 123'

0 Number. Includes leading and trailing zeros.
to_char(123,'09999') = '0123'
to_char(123,'9999D.00') = '123.00'

D<.|,> Position of the decimal point followed by the character for a decimal separator. Currently, only periods “.” and commas “,” are supported as decimal points.
to_char(12.34,'99D.99') = '12.34'

G<.|,|space> Position of the group separator followed by the character to be used as the group separator. Currently, only periods “.”, commas “,”, and spaces are supported as group separators.
to_char(1234,'9G,999') = '1,234'

x String containing an unsigned hexadecimal integer, using "abcdef". The software pads the output when the input number is 2 bytes.
to_char(123,'xx') = 7b
to_char(12,'x') = c

X String containing an unsigned hexadecimal integer, using "ABCDEF". The software pads the output when the input number is 2 bytes.
to_char(123,'XX') = 7B
to_char(12,'X') = C

o String containing an unsigned octal integer. This option is not case sensitive. The software pads the output if the input is 2 bytes.
to_char(12,'oo') = 14
to_char(1,'o') = 1

When <input> is a date, time, or datetime data type, use one of the following codes for the <format> value.

Format Description

DD Two-digit day of the month

MM Two-digit month

MONTH Full name of month

MON Three-character name of month

YY Two-digit year

YYYY Four-digit year

HH24 Two-digit hour of the day from 00 to 23.

MI Two-digit minute, from 00 to 59

SS Two-digit second, from 00 to 59

FF Up to 9-digit sub seconds

If you include any other values in <format>, the software does not change the results.

 Example

Function Results

to_char($call_date,'dd-mon-yy hh24:mi:ss.ff') The date value from the $call_date column formatted as a string, such as 28-FEB-97 13:45:23.32.
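The date format codes map naturally onto Python strftime directives, which gives a compact way to experiment with them (the mapping table here is this sketch's own; FF is approximated by %f microseconds and is omitted from the examples):

```python
from datetime import datetime

# Map the documented codes onto strftime directives, longest codes first.
TO_CHAR_CODES = [('YYYY', '%Y'), ('MONTH', '%B'), ('MON', '%b'),
                 ('YY', '%y'), ('DD', '%d'), ('MM', '%m'),
                 ('HH24', '%H'), ('MI', '%M'), ('SS', '%S')]

def to_char(value, fmt):
    fmt = fmt.upper()                       # codes are case-insensitive
    for code, directive in TO_CHAR_CODES:
        fmt = fmt.replace(code, directive)
    return value.strftime(fmt)

call_date = datetime(1997, 2, 28, 13, 45, 23)
print(to_char(call_date, 'yyyy.mm.dd'))   # 1997.02.28
print(to_char(call_date, 'hh24:mi:ss'))   # 13:45:23
```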

12.46 to_date

Converts a string to a date.

 Syntax

to_date(<input_string>, <format_string>)

The to_date function supports the Oracle 9i timestamp data type. Its precision allows up to 9 digits for subseconds.

If the input string has more characters than the format string, the software ignores the extra characters in the input string and initializes to the default value.

 Example

The software converts the following expression, but ignores and initializes the extra characters to zero in the time portion of the input string:

to_date('10.02.2007 13:25:45', 'DD.MM.YYYY')

converts to: 10.02.2007 00.00.00.

Category

Conversion

Return value

date, time, or datetime

A date, time, or both, representing the original string.

Where

<input_string>: The source string to convert.

<format_string>: A string indicating the format of the source string.

The following table contains the options for <format_string>.

Format codes Code Description

DD Two-digit day of the month

MM Two-digit month

MONTH Full name of month

MON Three-character name of month

YY Two-digit year

YYYY Four-digit year

HH24 Two-digit hour of the day (0-23)

MI Two-digit minute (0-59)

SS Two-digit second (0-59)

FF Up to 9-digit subseconds

Example

Function: to_date('Jan 8, 1968', 'mon dd, yyyy')
Result: 1968.01.08, stored as a date
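A Python analogue of to_date can be built on datetime.strptime with the same code mapping used for to_char. This is a sketch under an assumed mapping; it does not reproduce the engine's tolerance for extra trailing characters in the input string.

```python
from datetime import datetime

# Assumed mapping from the guide's format codes to strptime directives;
# longer codes are replaced first so YYYY is not consumed by YY, etc.
_CODES = [("MONTH", "%B"), ("MON", "%b"), ("YYYY", "%Y"), ("YY", "%y"),
          ("HH24", "%H"), ("MM", "%m"), ("DD", "%d"), ("MI", "%M"), ("SS", "%S")]

def to_date(input_string, format_string):
    """Sketch of to_date: parse a string into a datetime."""
    fmt = format_string.upper()
    for code, directive in _CODES:
        fmt = fmt.replace(code, directive)
    return datetime.strptime(input_string, fmt)
```

For example, to_date('Jan 8, 1968', 'mon dd, yyyy') returns datetime(1968, 1, 8), matching the result above.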

12.47 to_decimal

The to_decimal function converts a string to a decimal value, with an optional precision parameter of up to 96 digits.

 Syntax

to_decimal(<input_string>, <decimal_separator>, <thousands_separator>, <precision>, <scale>)

Category

Conversion

Where

<input_string>: The number string. A NULL input returns NULL.

<decimal_separator>: The character that separates the decimal component from the whole number component.

<thousands_separator>: The character that separates thousands from hundreds in the whole number component.

<precision>: Optional. The total number of digits in the returned value, up to 96.

<scale>: The number of digits to the right of the decimal point in the returned value.

Return value

Decimal

The function uses the given precision and the given scale. If the input string is invalid, 0 is returned.

Example

In the following example, the input string elements include:

● Input string: 1,000,000.00
● Decimal separator: .
● Thousands separator: ,
● Precision: omitted
● Scale: 3

Function: to_decimal('1,000,000.00', '.', ',', 3)
Result: 1000000.000
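The separator normalization and the invalid-input behavior can be sketched with Python's decimal module. A minimal analogue, assuming the precision parameter is omitted as in the example; the function and parameter names mirror the Where section above.

```python
from decimal import Decimal, InvalidOperation, ROUND_DOWN

def to_decimal(input_string, decimal_sep, thousands_sep, scale):
    """Sketch of to_decimal: strip the thousands separator, normalize
    the decimal separator, and truncate/extend to <scale> digits."""
    if input_string is None:
        return None                      # NULL in, NULL out
    normalized = input_string.replace(thousands_sep, "").replace(decimal_sep, ".")
    try:
        value = Decimal(normalized)
    except InvalidOperation:
        return Decimal(0)                # invalid input returns 0
    return value.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_DOWN)
```

For example, to_decimal('1,000,000.00', '.', ',', 3) returns Decimal('1000000.000'), matching the result above.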

12.48 trunc

The trunc function truncates a given number to the specified precision, without rounding the value.

 Syntax

trunc(<input_number>, <precision>)

Category

Math

Return value

decimal, double, int, or real

The truncated number. The return data type is the same as the data type of <input_number>.

Where

<input_number>: The number from a source column.

<precision>: The number of decimals in the result. If <precision> is negative, the software truncates the digits to the left of the decimal point and pads the value with zeros.

 Example

Function Results

trunc(120.12345, 2) 120.12

trunc(120.12999, 2) 120.12

trunc(180, -2) 100

trunc(120.123, 5) 120.12300
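Truncation without rounding can be reproduced with Decimal.quantize and ROUND_DOWN. A sketch using Decimal to avoid floating-point artifacts; the real function also preserves the input data type, which is not modeled here.

```python
from decimal import Decimal, ROUND_DOWN

def trunc(number, precision):
    """Truncate number to the given number of decimals without rounding.
    A negative precision zeroes out digits left of the decimal point."""
    d = Decimal(str(number))
    return d.quantize(Decimal(1).scaleb(-precision), rounding=ROUND_DOWN)
```

The examples in the table above all hold: trunc(120.12999, 2) yields 120.12 (no rounding up) and trunc(180, -2) yields 100.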

12.49 upper

The upper function changes the characters in a string to uppercase.

 Syntax

upper(<input_string>)

Category

String

Return value varchar

The uppercase string. The return data type is the same as the data type of <input_string>. Any characters that are not letters are left unchanged.

Where

The string to be modified.

 Example

Function Results

upper('Accounting101') 'ACCOUNTING101'
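For ASCII input this matches Python's built-in str.upper(); non-letter characters pass through unchanged, as in the example above. A trivial analogue:

```python
def upper(input_string):
    # Letters are uppercased; digits and punctuation are left unchanged
    return input_string.upper()
```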

12.50 week_in_month

The week_in_month function determines the week in the month in which the given date falls.

 Syntax

week_in_month(<input_date>)

Category

Date

Return value

int

The number from 1 to 5 that represents the week in the month.

Where

The source date.

Details

This function considers the first week of the month, week 1, to be the first seven days. The software ignores the day of the week when calculating the week in the month.

 Example

Function week_in_month(to_date('Jan 22, 1997', 'mon dd, yyyy'))

Result 4

Function week_in_month(to_date('Jan 21, 1997', 'mon dd, yyyy'))

Result 3
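Because week 1 is simply the first seven days of the month and the day of the week is ignored, the computation reduces to integer arithmetic on the day of the month. A Python sketch:

```python
from datetime import date

def week_in_month(d):
    # Week 1 = days 1-7, week 2 = days 8-14, and so on; weekday is ignored
    return (d.day - 1) // 7 + 1
```

This reproduces both examples: January 22 falls in week 4 and January 21 in week 3.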

12.51 week_in_year

The week_in_year function determines the week in the year in which the given date falls.

 Syntax

week_in_year(<input_date>, <week_type>)

Category

Date

Return value int

Where

<input_date>: The source date.

<week_type>: Optional. 'WW' for the absolute week number, or 'IW' for the ISO week number.

Details

This function returns the week in the year in one of two ways:

● 'WW': Absolute week number of the given date.
● 'IW': ISO week number of the given date.

The function returns a number from 1 to 53 that represents the week in the year. When determining the absolute week number, the function considers the first week of the year to be the first seven days.

Under the ISO standard, a week always begins on a Monday and ends on a Sunday. The first week of a year is the week that contains the first Thursday of the year, so week 1 always has at least 4 days. An ISO week number may be between 1 and 53. If January 1 falls on a Friday, Saturday, or Sunday, the first few days of the year are defined as being in the last (52nd or 53rd) week of the previous year.

 Example

Some business applications use week numbers to categorize dates. For example, a business may report sales amounts by week, and identify each period as "9912", representing the 12th week of 1999.

 Note

An ISO week is more meaningful than an absolute week for such a purpose.

The following table contains more examples for week_in_year.

Function: week_in_year(to_date('Jan 01, 2001','mon dd, yyyy'))
Result: 1

Function: week_in_year(to_date('2005.01.01','yyyy.mm.dd'),'WW')
Result: 1

Function: week_in_year(to_date('2005.01.01','yyyy.mm.dd'),'IW')
Result: 53
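Both modes have direct Python counterparts: 'WW' is day-of-year arithmetic and 'IW' is date.isocalendar(). A sketch, assuming the two-argument form shown in the examples:

```python
from datetime import date

def week_in_year(d, week_type="WW"):
    if week_type == "IW":
        return d.isocalendar()[1]          # ISO week number
    # 'WW': week 1 is the first seven days of the year, regardless of weekday
    day_of_year = d.timetuple().tm_yday
    return (day_of_year - 1) // 7 + 1
```

January 1, 2005 was a Saturday, so its ISO week is 53 of the previous year while its absolute week is 1, matching the table above.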

12.52 word

The word function returns the word identified by its position in a delimited string.

 Syntax

word(<input_string>, <word_number>, <separator>)

Category

String

Return value varchar

A string containing the indicated word. The return data type is the same as the <input_string> data type.

Where

<input_string>: The source string.

<word_number>: An integer specifying the index of the target word in the string. The first word in a string is word number 1. If <word_number> is 0 or greater than the number of words in <input_string>, the word function returns a NULL string. A negative word number means count from right to left.

<separator>: Any specified delimiter character or characters.

Details

The word function is useful for parsing web log URLs or file names.

A word is defined as any string of consecutive non-white-space characters terminated by white space, or by the beginning or end of <input_string>. White space characters are the following:

● Space

● Horizontal or vertical tab
● Newline
● Linefeed

 Example

Function Results

word('www.sap.com',2,'.') 'sap'

word('www.cs.wisc.edu', -2, '.') 'wisc'

(A negative word number means count from right to left.)

word('www.cs.wisc.edu', 5, '.') NULL

word('aaa+=bbb+=ccc+zz=dd', 4, '+=') 'zz'

(If 2 separators are specified (+=), the function looks for either one.)

word(',,,,,aaa,,,,bb,,,c ', 2, ',') 'bb'

(This function skips consecutive delimiters.)
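The delimiter behavior in these examples, any one of several separator characters, consecutive delimiters skipped, and negative indexes counting from the right, can be sketched with a regular expression. The additional whitespace-terminator rule from the definition above is not modeled here.

```python
import re

def word(input_string, word_number, separators):
    """Sketch of word(): return the Nth delimited token, or None (NULL)."""
    # Split on any run of the separator characters; dropping empty tokens
    # skips consecutive delimiters
    parts = [p for p in re.split("[" + re.escape(separators) + "]+", input_string) if p]
    if word_number == 0 or abs(word_number) > len(parts):
        return None
    return parts[word_number - 1] if word_number > 0 else parts[word_number]
```

This reproduces each row of the table: word('www.sap.com', 2, '.') yields 'sap' and word('www.cs.wisc.edu', -2, '.') yields 'wisc'.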

12.53 year

The year function determines the year in which the given date falls.

 Syntax

year(<input_date>)

Category

Date

Return value int

The number that represents the year component.

Where

The source date.

 Example

Function Results

year(to_date('Jan 22, 1997','mon dd, yyyy')) 1997

year(to_date('03/97', 'mm/yy')) 1997

year(to_date('03/19', 'mm/yy')) 2019
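The two-digit-year examples follow the same century pivot as Python's %y directive (00-68 map to the 2000s, 69-99 to the 1900s), so datetime.strptime can stand in for to_date here. A sketch:

```python
from datetime import datetime

def year(d):
    """Sketch of year(): extract the year component of a date."""
    return d.year

# strptime's %y applies the same pivot as the examples above:
# '97' -> 1997, '19' -> 2019
```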

13 Operators

The operators you can use in expressions are listed in the following tables, in order of precedence.

Some operators can be selected from the Operators list in the Advanced Editor, while other operators can only be typed in manually. When you select an operator from the Operators list, a Function window opens in which you can construct the statement around the operator.

The operators fall into two categories: Comparison and Logical. The operators in the following tables can be selected in the Operators list or typed in manually.

Comparison Operators

Name in Operators Menu / Operator in Syntax / Description

= / = / Returns true if the operands are equal.

> / > / Returns true if the first operand is greater than the second operand.

>= / >= / Returns true if the first operand is greater than or equal to the second operand.

< / < / Returns true if the first operand is less than the second operand.

<= / <= / Returns true if the first operand is less than or equal to the second operand.

!= / != / Returns true if the operands are not equal.

Logical Operators

Name in Operators Menu / Operator in Syntax / Description

and / AND / Returns the logical AND operation for the two operands.

in / IN (<value list>) / Returns true if the input field is among the list of values specified.

is not null / IS NOT NULL / Returns true if the input field is not NULL.

is null / IS NULL / Returns true if the input field is NULL.

not / NOT / Returns the logical NOT operation for the operand.

not in / NOT IN (<value list>) / Returns true if the input field is not among the list of values specified.

or / OR / Returns the logical OR operation for the two operands.

You can use a comparison in the following ways:

● In an expression as a condition. For example:

if ($x > 1 and $x < 10)
begin
  return true;
end

● As a condition of the IF block.

The following examples illustrate valid comparison expression syntax:

expression = expression

expression != expression
expression < expression
expression > expression
expression <= expression
expression >= expression
expression IS NULL
expression IS NOT NULL
expression IN (expression list)
expression IN domain
expression LIKE constant
expression NOT LIKE constant

NOT (any of the above comparisons). For example, NOT ($x IN (1,2,3))

comparison OR comparison

comparison AND comparison

Note that the following syntax is not valid:

$x NOT IN (1,2,3)

EXIST or NOT EXIST
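The valid and invalid forms above have close Python counterparts, shown here for comparison only. Note that Python, unlike the expression language, does accept the `x not in (...)` spelling.

```python
x = 5

# Comparison as a condition, as in the IF-block example above
assert (x > 1 and x < 10) is True

# The expression language requires NOT ($x IN (1,2,3));
# "$x NOT IN (1,2,3)" is invalid there, though Python allows both forms
assert not (x in (1, 2, 3))
assert x not in (1, 2, 3)
```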

The operators in the following table can only be typed in manually.

Arithmetic and logical operators Operator in Syntax Description

+ Addition

- Subtraction

* Multiplication

/ Division

|| Concatenate

% Returns the remainder when one number is divided by another

Important Disclaimers and Legal Information

Hyperlinks

Some links are classified by an icon and/or a mouseover text. These links provide additional information. About the icons:

● Links with the icon : You are entering a Web site that is not hosted by SAP. By using such links, you agree (unless expressly stated otherwise in your agreements with SAP) to this:

● The content of the linked-to site is not SAP documentation. You may not infer any product claims against SAP based on this information.
● SAP does not agree or disagree with the content on the linked-to site, nor does SAP warrant the availability and correctness. SAP shall not be liable for any damages caused by the use of such content unless damages have been caused by SAP's gross negligence or willful misconduct.

● Links with the icon : You are leaving the documentation for that particular SAP product or service and are entering a SAP-hosted Web site. By using such links, you agree that (unless expressly stated otherwise in your agreements with SAP) you may not infer any product claims against SAP based on this information.

Videos Hosted on External Platforms

Some videos may point to third-party video hosting platforms. SAP cannot guarantee the future availability of videos stored on these platforms. Furthermore, any advertisements or other content hosted on these platforms (for example, suggested videos or by navigating to other videos hosted on the same site), are not within the control or responsibility of SAP.

Beta and Other Experimental Features

Experimental features are not part of the officially delivered scope that SAP guarantees for future releases. This means that experimental features may be changed by SAP at any time for any reason without notice. Experimental features are not for productive use. You may not demonstrate, test, examine, evaluate or otherwise use the experimental features in a live operating environment or with data that has not been sufficiently backed up. The purpose of experimental features is to get feedback early on, allowing customers and partners to influence the future product accordingly. By providing your feedback (e.g. in the SAP Community), you accept that intellectual property rights of the contributions or derivative works shall remain the exclusive property of SAP.

Example Code

Any software coding and/or code snippets are examples. They are not for productive use. The example code is only intended to better explain and visualize the syntax and phrasing rules. SAP does not warrant the correctness and completeness of the example code. SAP shall not be liable for errors or damages caused by the use of example code unless damages have been caused by SAP's gross negligence or willful misconduct.

Gender-Related Language

We try not to use gender-specific word forms and formulations. As appropriate for context and readability, SAP may use masculine word forms to refer to all genders.

www.sap.com/contactsap

© 2021 SAP SE or an SAP affiliate company. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company. The information contained herein may be changed without prior notice.

Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary.

These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. All other product and service names mentioned are the trademarks of their respective companies.

Please see https://www.sap.com/about/legal/trademark.html for additional trademark information and notices.
