User Manual V4.0.7.0 05.03.2019

Copyright: 2150 GmbH || 2150.ch Datavault Builder || datavault-builder.com [email protected]

Table of contents

1. General Overview
   1.1. Requirements
   1.2. Layer Concept
   1.3. Navigation
   1.4. Modules
   1.5. Dialogues
   1.6. Data Preview
2. Modules
   2.1. Staging
      2.1.1. Connecting a new source (Base, Connection)
      2.1.2. Adding a staging table (Base, Columns, Details)
      2.1.3. Source Specific Configuration
   2.2. Data Viewer
      2.2.1. General Usage
      2.2.2. Finding the business key
      2.2.3. Using the log
   2.3. Data Vault
      2.3.1. Using the working canvas
      2.3.2. Extending the model
         2.3.2.1. Adding a Hub (Base, Source, Key Settings)
         2.3.2.2. Add a Satellite (Base, Satellite Columns, Conversions)
         2.3.2.3. Adding a Link (Base, Load)
         2.3.2.4. Adding a Transaction Link (Base, Columns, Conversions)
      2.3.3. Metadata slider
      2.3.4. Style Settings
      2.3.5. Working with bookmarks
      2.3.6. Audit / Tracking Satellites
   2.4. Dimensional Model
      2.4.1. Creating a Business Object
      2.4.2. Working with the Business Object Generator
      2.4.3. Deleting a Business Object
   2.5. Business Rules
      2.5.1. Adding Business Rules
      2.5.2. Business Ruleset Properties
      2.5.3. Deleting a Business Ruleset
   2.6. Accesslayer
   2.7. Data Lineage
   2.8. Operations
      2.8.1. Status
      2.8.2. Jobs
         2.8.2.1. Creating a job
         2.8.2.2. Defining loads within a job
         2.8.2.3. Schedule a job (Base, Scope, Timing)
         2.8.2.4. Job dependencies
         2.8.2.5. Post-Job SQL Query
         2.8.2.6. Deleting a job
      2.8.3. Command line
      2.8.4. Load States
   2.9. Deployment
      2.9.1. Getting started
      2.9.2. Export an environment
      2.9.3. Connect environment
      2.9.4. Comparing environments
      2.9.5. Deploying objects
         2.9.5.1. Direct Deployment
         2.9.5.2. Script based deployment
         2.9.5.3. Structure rollout
3. Installing Datavault Builder
   3.1. Requirements
      3.1.1. Download of Container Images on Another Computer
      3.1.2. Connect to the Docker.Com repository using a proxy
   3.2. Test your Docker Installation
   3.3. On Premise
   3.4. Using non-containered Client-Database
      3.4.1. Install DVB Structures on Client-Database
   3.5. Limitations
   3.6. Setting up Backups
      3.6.1. Database
4. Testing
   4.1. Connecting to a source
5. Integration with other applications
   5.1. REST-API
   5.2. Using a separate scheduler
   5.3. Using a separate staging-tool
Appendix A. System Structures on Processing Database
   A.1. Metadata Views (dvb_core.access_errormart, dvb_core.accesslayers, dvb_core.business_rules, dvb_core.businessobjects, dvb_core.columns, dvb_core.hub_loads, dvb_core.hubs, dvb_core.jobs, dvb_core.latest_datavault_load_info, dvb_core.latest_job_load_info, dvb_core.latest_staging_load_info, dvb_core.link_loads, dvb_core.links, dvb_core.linksatellites, dvb_core.load_log_datavault, dvb_core.load_log_staging, dvb_core.satellites, dvb_core.staging_tables, dvb_core.subject_area_name, dvb_core.system_connections, dvb_core.systems, dvb_core.tables, dvb_core.tables_simple, dvb_core.tracking_satellites, dvb_core.transaction_link_relations, dvb_core.transaction_links, dvb_core.view_relations, dvb_core.views, dvb_core.x_business_rules_distinct, dvb_core.x_business_rules_system, dvb_core.x_businessobjects_distinct, dvb_core.x_businessobjects_system, dvb_core.x_hubs_distinct, dvb_core.x_hubs_system, dvb_core.x_jobs_system, dvb_core.x_latest_load_info, dvb_core.x_links_distinct, dvb_core.x_links_system, dvb_core.x_satellites_system, dvb_core.y_blocking_procs)
   A.2. Log Tables (dvb_log.datavault_load_log, dvb_log.ddl_log, dvb_log.dvbuilder_creation_log, dvb_log.dvbuilder_log, dvb_log.job_load_log, dvb_log.login_log, dvb_log.staging_load_log)
Licences
Release Notes

1. General Overview

1.1. Requirements

The architecture of the Datavault Builder aims at pushing all processing down to the client database. The minimal requirements to run the Datavault Builder are therefore very low, and you are free to increase performance by supplying more resources to the database environment.

All ELT processing is sent directly to the database. For a setup where only the core modules are hosted on the machine and the processing database runs on another, the requirements are:
• Hardware:
  • Disk: 25 GB
  • Memory: 8 GB
  • CPU: 4 cores
• Operating System: any system on which a Docker host can be installed. Our recommendation is Ubuntu 16.04.
• Internet access, especially access to the Docker repositories (login to docker.com)

If hosting the database in the same environment, then the recommended resources for the database need to be added to the setup.

1.2. Layer Concept

The Datavault Builder follows a standardized Data Vault integration model. The core contains data from different sources and can be split up into three categories:
• Persistent Staging Area (optional): historization of the source in the Datavault using the technical identifier (Source Vault). This is needed if the business key is not available in the source columns.
• Raw Vault: object historization on the business key.
• Business Vault: historization/persisting of applied business logic.

Persistent Staging Area The PSA is optional.

There are two main reasons to use the PSA:
1. The Business Key in the source is not clean, meaning a raw vault load from the source is not directly possible.
2. A business key lookup needs to be done.

Alternatively, a PSA can also be used for purely technical reasons, for instance if the Business Keys are not yet defined, or to automatically build a Source Vault based on the Primary Keys of the source.

As a basic principle, between PSA and Raw Vault NO RULES are applied. The only exception to this is cleaning of Business Keys (for instance duplicates, which otherwise couldn't be loaded into the Raw Vault). In this case, however, a link should be built to document which records from the PSA were integrated into the Raw Vault.

Raw Vault The primary way from the source into the Datavault is however the Raw Vault, where the integration into the hub is done using the business key.

After the Raw Vault, Business logic may be applied.

Based on these explanations, the Raw Vault (mainly) and PSA (in addition) are in combination the single source of facts.

Business Object / Business Rules The business objects layer is a virtual denormalization layer, automating the joins needed to build an output from the data vault. The business rules layer is another virtual layer on top of the business objects; here the business logic is applied.

Business Objects & Business Rules should however not be misinterpreted as the Business Vault, as we will see in the next paragraph.

Business Vault The Business Vault is a materialization of the applied business logic from the Business Rules. Therefore, a new source based on the business rules is created in the Datavault Builder itself, allowing you to loop back the output of the applied business logic and store it in a historized manner.

Accesslayer The Accesslayer is the virtualized interface layer for a target system. Here the data is presented either as flat tables or as a dimensional model.

Access Errormart The Errormart is the virtualized interface layer for data quality analysis. Based on error views from the business rules layer, faulty data will be presented in the access errormart.

1.3. Navigation

Main navigation Lists the Datavault Builder modules, sorted according to the workflow.

Sub Navigation Not present in all modules. Allows navigation through submodules.

Search Bar Available within every module. Context-specific search functionality for the module.

Main Actions Located to the right of the Search Bar. Provides the main actions of the module.

Application Profile Opens up the details of the current development environment and the logged-in user.

Working Area Depending on the module, displays the contents to work on as lists or visualizations.

1.4. Modules

In this chapter we give you an overview of all Datavault Builder modules.

Staging In the staging module you connect external data sources such as JDBC sources, CSV files and REST interfaces and load them into the staging area.

Data Viewer Review your data to find business keys and visualize your output for a first analysis.

Datavault This module lets you model and implement your core, including Persistent Staging, Raw Vault and Business Vault.

Business Objects The Business Objects module lets you graphically transform your data vault into a dimensional model.

Business Rules This module lets you apply and version your business rules. You can also define which systems have priority over others in the output, and you can create Error Mart views for data quality monitoring.

Data Lineage Here you see the automatically created data lineage from source to the output of your Data Vault.

Operations Here you find the automatically created (master) jobs for your source systems. You can modify, monitor and schedule them.

1.5. Dialogues

Throughout the application, dialogues guide you through the actions. Three types of dialogues exist:
• Single Tab Dialogues: one tab, with a completion action in the bottom right corner.
• Multi Tab Dialogues: multiple tabs, with a completion action on the last tab in the bottom right corner.
• Combined Dialogues: multiple tabs, with a completion action on the first and last tab in the bottom right corner.

Combined dialogues especially appear in the Datavault Core Module, where you can either create prototype objects to develop the logical model (using the completion action on the first tab) or directly declare a load for the new object (using the completion action on the last tab).

Most Multi Tab and Combined Dialogues have a review tab at the end, allowing you to look over the specification of the new object; it highlights missing or incorrectly declared values.

Dialogue Title Title of the current Dialogue. In some cases it also includes the name of the object you are currently working on (such as a Hub Name).

Dialogue Tabs Available tabs in the dialogue; the workflow runs from left to right. Only available in Multi Tab or Combined Dialogues. Most offer a review section.

Completion Action Located in the bottom right, the button to complete the dialogue.

Cancel Button to cancel the dialogue. Cancellation can also be done using ESC.

1.6. Data Preview

The data preview windows are available in many locations of the Datavault Builder, letting you immediately have a look at the underlying data while developing your data integration flow.

Be aware that using the data preview can cause high load on the database. Keep this especially in mind when using the data preview for source tables in the staging area, as these calls are sent directly to the source system!

Column Header

• Triangle opens the Columns options.
• Text area can be used for filtering of values.
• Column width can be resized.

Columns options

• Sort ascending/descending: sorts the data on the database. Multiple column sortings can be combined by pressing "Shift" while selecting the sorting.
• Not Null: ignores rows containing nulls in the data set.
• Top Occurrences: returns a list of the most frequently occurring values of the field. Important: as it returns a list of all unique values with their count, it is not recommended to apply this function to large datasets with highly diversified values.

2. Modules

In this part of the documentation, we will walk you step by step through all modules and dialogues in the Datavault Builder GUI.

2.1. Staging

The staging area is used to connect the Datavault Builder to sources, such as databases, files or even REST APIs.

Search-Bar Filters the staging list by System Name, System ID or Staging Table.

New Source Allows you to define a new source connection by opening up the New Source Dialogue.

Add Table Opens up the Add Table Dialogue to add a new staging table for a previously defined source system.

System Actions

• Plus: opens up the Add Table Dialogue with a prefilled source system.
• Circled arrow: initiates a full load of all the defined staging tables of the system. Color/hovering indicates the loading status.
• Screw nut: opens up the Edit Source Dialogue, which allows changing all editable properties of the New Source Dialogue.
• Trash bin: removes the Source System. This action is only available if no dependent staging tables or jobs are defined.

Table Actions

• Circled arrow: initiates a full load of the specific staging table. Color/hovering indicates the loading status. While loading, the button can be used for cancelling.
• Screw nut: opens up the Edit Staging Table Dialogue, which allows changing all editable properties of the Add Table Dialogue.
• Trash bin: removes the Staging Table. This action is only available if no dependent data vault loads are defined.

Table Lists the Source Table, Source Schema, Staging Table and the last successful load date and duration. Clicking onto the Source Table* or Staging Table opens up the Data Preview. (*if supported)

System Row Lists the System Name and System ID.

Subset Status

Load definition has/had a General Subset Where Clause.

Load definition has/had Delta load Subset Where Clause.

Coloring:
- black: no load has been executed yet, but a subset where clause is defined
- green: a load has been executed with the current subset where clause definition
- orange: a load has been executed, but the subset definition has changed in the meantime
- red: a load has been executed, but the subset definition has been removed in the meantime

2.1.1. Connecting a new source

Definition A source is a connection to a data origin. This origin can be either a database, files or even a webservice.

Steps
1. Navigate to the Staging Module.
2. Click onto "New Source".
3. Fill out the Base-Tab.
4. Fill out the Connection-Tab.
5. Confirm the creation on the Review-Tab.

Base

Source System Name Declaration of the Name to be displayed for the System. Mandatory; Editable.

Source System ID Declaration of the System ID, which is used for the naming of the technical implementation (such as tables, views, ...) related with that system on the database. Mandatory.

Comment Individual notes about the system. Will also appear in the documentation. Editable.

Connection In this step, the connection properties are declared. The Datavault Builder can connect to any source providing a JDBC driver. You can add your own specific drivers.

Please contact us directly if you are missing a certain default driver.

Connection Type Declaration of the source type to connect to.

Source Type Parameters Depending on the chosen Connection Type, the Datavault Builder will require you to fill in the connection properties.

Connection String Based on the declared connection properties, the JDBC connection string is assembled. The string can be manipulated directly to use properties which are not yet available in the source type parameters block (see the example below).

Test Connection By pressing this button, the Datavault Builder tries to connect to the source using the specified connection string. The test result will appear next to the button within a couple of seconds.
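For illustration, a JDBC connection string for a PostgreSQL source could look like the following sketch; the host, port, database name and user are placeholders and will differ in your environment, and the exact parameters depend on the chosen driver:

    jdbc:postgresql://dbserver.example.com:5432/erp_prod?user=dvb_reader&ssl=true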

2.1.2. Adding a staging table

Adding a staging table will automatically define a staging load.

Definition A staging load is a 1-to-1 copy of the data from the source.

Usage These loads can be done as full or as delta loads. The underlying logic to pull a delta from the source can be specified within the creation dialogue of each staging table.

The result of the dialogue is a table in the staging area, consisting of fields with the same or the most similar datatype as the original type in the source. Length restrictions for certain datatypes are removed, so the data can still be retrieved if a field type is changed in the source. In connection with the built-in loading logic, the added staging table can immediately be loaded.

Steps
1. Navigate to the staging module and click onto "Add Table".
2. Select an existing system on the base tab.
3. Select the columns to load on the columns tab.
4. Fill out the details tab.
5. Confirm the creation on the review tab.

Base

Source System Select a previously defined Source System from the list.

Columns

Source Schema By scanning the source, the Datavault Builder supplies a list of available schemas. When using Flat Files as a source, the placeholder "Default Schema" shall be used.

Source Table A specific table within the chosen schema can be selected. When using Flat Files, this relates to a specific file.

Available Columns List of the available columns in the source table, with column ID and column type.

Selected Columns List of the chosen columns which should be added to the staging table.

Available Columns Actions

• Ordering of the columns either by order or name
• Filtering of columns by name
• Magnifying Glass: opens up the Data Preview (if supported).

Column Selection Columns can be added to the selected columns using the buttons in the middle, by double-clicking or by drag & drop.

Details

Staging Table Name Declaration of the displayed name for the staging table. Editable.

Staging Table ID Declaration of the ID used in the technical implementation related with the staging table.

Batch Size Activation of batch-based loading from the source as well as specification of the batch-size to load in. Editable.

Delta Load Clause Specification of a SQL-based where clause to perform delta loads. The where clause can be parameterized using {{parameters}}, to which values can be assigned on execution in the jobs module. This way, it is also possible to declare different delta loads for the same staging table (see the example below). Editable.

Comment Comment for the staging table, which will also appear in the documentation. Editable.
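As an illustration, a parameterized delta load clause for a source table with a modification timestamp might look like the following sketch; the column and parameter names are purely illustrative:

    last_modified_at >= {{delta_start}} AND last_modified_at < {{delta_end}}

The values for {{delta_start}} and {{delta_end}} would then be supplied when the load is executed in the jobs module.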

2.1.3. Source Specific Configuration

For certain sources, specific configuration questions may arise.

Generally, you can use the configuration wizard to connect to a source. Based on the wizard, the connection string is automatically derived. If you are missing options in the wizard, the connection string can also be edited directly, offering you more advanced configuration options.

CSV Detailed documentation and advanced parameters for loading from CSV files are available at http://csvjdbc.sourceforge.net/doc.html. To load from a TAB-delimited file, use %09 as the separator.
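For example, a connection string for reading TAB-delimited files with the CSV JDBC driver might look like the following sketch; the directory path and parameter values are illustrative, see the linked documentation for the full list of options:

    jdbc:relique:csv:/data/exports?separator=%09&fileExtension=.tsv&charset=UTF-8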

2.2. Data Viewer

A big advantage of the Datavault Builder is the possibility to directly access the data within the development environment. Besides the Data Viewer, the Data Preview also lets you have a quick look at the data. The advantage of the Data Viewer is being able to do more data profiling, by selecting specific columns and using visualizations and aggregates.

There are two main use cases:
1. Access a staging table to find a business key.
2. Access the log tables for debugging your data integration flow.

2.2.1. General Usage

When navigating to the Data Viewer, the working area is empty, so you first have to click into the search-bar to select a table/view to analyze. Once you have picked a table from the list, the visual data profiler opens up on the working canvas.

Important: To avoid overloading the web interface, the result set is limited to 600 records. Make use of database-side aggregation to reduce the result set.

Search-Bar By clicking into the box, a table/view for analysis can be chosen.

Available Columns After picking a source for analysis, the available columns will appear on the left hand side in the panel.

Chosen Columns (row) By dragging a column from the available columns into the area of the chosen columns (row), the data profiler will load the corresponding values with the specified aggregator. The chosen columns can be rearranged in order, as the results are grouped together according to the order specified (according to the chosen visualization).

By double-clicking onto a column-field, the filter-selection appears. This allows you to filter the loaded data by certain values. The filtering is only available for chosen columns and will display "null" as values when double-clicking onto a column field in the available columns section.

Chosen Columns (column) Columns can also be dragged & dropped into the chosen columns (column) area. Use this to build pivot-based analytics.

Visualization Control

The dropdown allows you to pick a certain kind of visualization of the data. The default is "Heatmap", as this gives you an indication for finding the business key.

Aggregation Control

The aggregation gives you the possibility to influence the totals fields for each row. Depending on the chosen aggregator, a second field will appear to further define the desired aggregation (for instance "Count unique" will allow you to pick a specific column to count the uniqueness of). The aggregation is done on the database, returning only the result. This may take some time depending on the size of the dataset.

Refresh The Datavault Builder will always auto-reload new data when adding a field to the selection. Sometimes, when specific columns, a visualization and an aggregator have already been selected, it can be useful to manually trigger a refresh of the data.

2.2.2. Finding the business key

An important step when developing an integration flow within Data Vault modeling is to define the business key which should be loaded into a hub. This business key has to be unique, as it identifies the object. The Data Viewer with its default visualization "Heatmap" will support you in finding a unique business key. An even faster way to check whether your composed key is unique is the uniqueness check while creating a new hub load.

Steps
1. Select the newly staged table with the search-bar.
2. Drag & drop a column (or column combination) from the available columns to the chosen columns (row) section.
3. The visualization will now give visual feedback on whether the chosen combination of columns results in a unique identification, by highlighting duplicates in red.

Example of a column selection with duplicates highlighted in red.

Important: The uniqueness-check is only an indication and does not run over the whole data-set. A full check will be done when loading a hub in the Data Vault core. Then the system will automatically throw an error if the key violates the uniqueness constraint.
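If you want to verify uniqueness over the complete staging table yourself, a manual check on the database could look like the following sketch; the schema, table and column names are illustrative:

    SELECT customer_no, COUNT(*) AS duplicate_count
    FROM staging.crm_customer
    GROUP BY customer_no
    HAVING COUNT(*) > 1;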

2.2.3. Using the log

The Datavault Builder has multiple logs documenting the actions. The logs can be accessed directly on the database (schema: dvb_log) or through the Data Viewer.

Steps
1. Navigate to the Data Viewer.
2. Type "System" into the search-bar.
3. Select one of the logs appearing in the dropdown.

Available Logs
• Staging Load Log: documents every load from a source into the staging area.
• Datavault Load Log: documents every load from a staging table into a data vault object.
• Job Load Log: documents every load initiated as a job load.
• DV Creation Log: documents every action based on the invoked function in the core module.
• DV Builder Log: documents every query executed against the processing database.

The resultset returned to the Data Viewer is limited to 600 records. In case of the logs, the latest 600 records in the log will be returned.
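Because the logs are ordinary tables in the dvb_log schema, they can also be queried directly on the processing database, for example as sketched below; the ordering column is an assumption, check the actual table definition in the appendix:

    SELECT *
    FROM dvb_log.staging_load_log
    ORDER BY load_start DESC   -- assumed timestamp column
    LIMIT 100;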

2.3. Data Vault

2.3.1. Using the working canvas

The working canvas can be used to display existing parts of the model or to extend the model. When parts of the model are opened, you can see the following objects. By double-clicking onto a Hub, it is possible to load everything related to that element, enabling you to browse through the existing model.

Hub

By default, hubs are represented as larger blue squares, with the display name written onto it.

Link

The links connecting two hubs are represented according to their type as a line, arrow, or double-arrow. Multiple links are displayed as bent lines.

Satellite

By default, satellites are smaller circles, represented with the color of the source system. They are connected to the hub with a line, carrying the functional suffix name of the satellite.

Subject Area

Subject Areas are used to group parts of the model. They are represented as colored clouds around the objects.

Same As

To allow the modelling of same-as links, the creation of "Alias Hubs" is possible. In this example, the Hub "Product Alias" is an alias of the hub "Product" and is therefore bound to its parent by a doubled line. Technically, the declared loads for both hubs will be fed into the same object.

2.3.2. Extending the model

The existing model can easily be extended by using the provided creation dialogues. These dialogues can be reached in the Data Vault module through the right-click context menu on the canvas or the menu in the top right.

When working with the right-click context menu and building on top of an existing object (such as a hub, a link, ...), you can right-click onto the existing object on the canvas. When invoking the dialogue this way, some information on the base tab can be prefilled automatically by the Datavault Builder.

2.3.2.1. Adding a Hub

Steps The add hub dialogue can be used in different ways:

• Create the logical model: A new hub is created, without having a load from a source defined. 1. Invoke the create hub dialogue, 2. only fill out the base tab and 3. complete the combined dialogue on the first tab.

• Create a new hub with a load: A new hub is created, with directly assigning a hub load for a staging table. 1. Invoke the create hub dialogue, 2. fill out all tabs and 3. complete the combined dialogue on the last tab.

• Add another load to an existing hub: Another load from a staging table is assigned to an existing hub. 1. Invoke the create hub dialogue by right-clicking onto the existing hub and selecting "add hub load". 2. The Datavault Builder will fill out the information on the base tab and directly take you to the source tab. 3. Complete the combined dialogue on the last tab.

• Create an alias hub: A hub is created which technically refers to the parent hub to allow same-as-linking. 1. Invoke the create hub dialogue and

2. declare the parent hub in the "Make This Hub An Alias For"-field. 3. The dialogue can be completed either on the first or the last tab.

Base

Hub Name Declaration of the displayed name of the hub. Editable. When clicking, a list of existing hub names will appear to avoid declaration of the same name.

Hub ID Declaration of the technical ID represented on the database for the hub. Not editable. It is automatically derived from the entered hub name, but can be manually adjusted.

Make This Hub An Alias For Declare the hub to be an alias for an existing hub. This allows the creation of same-as-links. Not editable.

Subject Area Grouping the hub into a specific part of the model. Editable.

Comment Custom notes about the hub. Will appear in the documentation. Editable.

Add Hub Without Load Completes the creation for logical modelling without declaring a load for the hub.

Source

Source System Selection of a Source System to load from.

Staging Table Selection of a Staging Table to load from.

Available Columns Available columns in the staging table. Can be filtered, sorted as well as previewed with the Data Preview.

Business Key Declaration of the business key. Can be a single column or a composite key. Use the buttons in the middle or drag and drop to add columns from the available columns.

Check Uniqueness

The Uniqueness check allows you to validate your composed business key against the data currently loaded into your staging area. Click onto the icon to start the validation. Once complete, it will either turn green (and directly enable "Keys are Unique" on the next tab), or turn red (and disable "Keys are Unique").

In the second case of duplicates, a data preview window will open up, directly supplying you with the identified duplicates and their count.

Also, by clicking onto the plus-icon, you can retrieve a number of duplicates, helping you determine the root of the problem.

Key Settings

Keys are Unique Defines that the declared business key is the main identifier for the object and is on the same granularity as the staging table. This activates a check on load, which will throw an error when the uniqueness constraint is violated. Disable this option when modeling a denormalized source into the datavault or when declaring a "foreign-key"-like business key for link creation. More details in Adding a Link.

Business Key Prefix Allows you to set a prefix ahead of the declared business key. This is needed when feeding a hub from multiple systems with overlapping keys which do not mean the same thing (for instance: System A Customer 12 != System B Customer 12).

Datavault Category Specification to which part of the datavault category (Persistent Staging Area / Raw Vault / Business Vault) the load belongs to.

2.3.2.2. Add a Satellite

The creation of a satellite requires an existing hub.

Steps The add satellite dialogue can be used in two different ways:

• Create the logical model: A new satellite is created, without having a load from a source defined. 1. Invoke the create satellite dialogue, 2. only fill out the base tab and 3. complete the combined dialogue on the first tab.

• Create a new satellite with a load: A new satellite is created, directly assigning a load from a staging table. 1. Invoke the create satellite dialogue, 2. fill out all tabs and 3. complete the combined dialogue on the last tab.

Base

Hub Name Name of the hub the satellite belongs to. When clicking into the field, a dropdown list with the existing hubs will appear.

Functional Suffix Name Displayed name of the satellite. Editable.

Functional Suffix ID Technical ID of the satellite. Not editable. It is automatically derived from the functional suffix name, but can be manually adjusted. When more than one satellite exists, a functional suffix ID is required.

Subject Area Grouping the satellite into a specific part of the model. Editable. If the chosen hub has an assigned subject area, that area will be preselected automatically.

Comment Custom notes about the satellite. Will appear in the documentation. Editable.

Create Satellite Without Load Completes the creation for logical modelling without declaring a load for the satellite.

Satellite Columns Adding columns to a satellite requires a previously defined load for the parent hub.

Hub Load Selection of an existing hub load from the parent hub.

Available Columns Present columns in the staging table of the chosen hub load.

Selected Columns Chosen columns to add as attributes to the satellite.

Conversions

Columns Selected columns from the tab satellite columns.

Type Converter Allows you to add column-based type conversions on the way into the data vault. When declaring a type conversion, the original field will also be added to the satellite, carrying the naming extension "_raw" (see the sketch below).

Target Name Column-based renaming. Will only affect the displayed name.

Comment Custom notes about the column. Will appear in the documentation.
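To illustrate the effect of a type conversion: if a staging column order_date of type text is converted to date, the resulting satellite carries both values, roughly equivalent to this hedged sketch; all names are illustrative:

    SELECT CAST(order_date AS date) AS order_date,       -- converted value
           order_date               AS order_date_raw    -- original value, kept with the "_raw" extension
    FROM staging.erp_orders;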

2.3.2.3. Adding a Link

The creation of a link requires two existing hubs.

Steps The add link dialogue can be used in three different ways:

• Create the logical model: A new link is created, without having a load from a source defined. 1. Invoke the create link dialogue, 2. only fill out the base tab and 3. complete the combined dialogue on the first tab.

• Create a new link with a load: A new link is created, directly assigning a link load from a staging table. 1. Invoke the create link dialogue, 2. fill out all tabs and 3. complete the combined dialogue on the last tab.

• Add another load to an existing link: Another load from a staging table is assigned to an existing link. 1. Invoke the create link dialogue by right-clicking onto the existing link and selecting "add link load". 2. The Datavault Builder will fill out the information on the base tab and directly take you to the load tab. 3. Complete the combined dialogue on the last tab.

Base

First Hub The first hub to connect from.

Second Hub The second hub to connect to.

Link Type Defines the relation type of the link (one-to-one, one-to-many, ...). This setting is purely declarative, as the technical implementation is always realized as a many-to-many relationship. However, it helps the business object generator to denormalize the data vault structure without fan-out when following the related objects in the data vault.

Subject Area The Subject Area the link belongs to. Editable.

Link Suffix Name The displayed name of the link. Editable.

Link Suffix ID Partial ID of the technical implementation of the link. Mandatory as soon as multiple links exist between the same hubs.

Comment Custom notes about the Link. Also appears in the documentation. Editable.

Completion Action Use this button to create a prototype link for the modeling (without defining the load yet). Otherwise, switch to the next tab in the creation dialogue.

Load To add a load to a link, a hub load based on the same staging table needs to exist for both hubs. In other words: a business key from the same staging table has to be defined for both hubs.

Hub Load 1 Selection of the hub load for the first side of the link.

Hub Load 2 Selection of the hub load for the second side of the link.

Add Hub Load Optional button: enabled when a hub load is selected on one side and no corresponding hub load exists for the second hub. Allows you to go directly to the dialogue for adding the missing hub load, with the base tab prefilled and the underlying staging table selected. After successfully adding the hub load, the dialogue will appear again so you can continue adding the link load.

2.3.2.4. Adding a Transaction Link

Steps Create a new transaction link with a load: A new transaction link is created, directly assigning a load from a staging table. 1. Invoke the create transaction link dialogue, 2. fill out all tabs and 3. complete the combined dialogue on the last tab.

Base

Hub Name Name of the main hub the transaction link belongs to. When clicking into the field, a dropdown list with the existing hubs will appear.

Functional Suffix Name Displayed name of the transaction link. Editable.

Functional Suffix ID Technical ID of the transaction link. Not Editable. Is automatically derived from the functional suffix name, but can manually be adjusted.

Subject Area Grouping the transaction link into a specific part of the model. Editable.

Comment Custom notes about the transaction link. Will appear in the documentation. Editable.

Columns Adding columns to a transaction link requires a previously defined load for the parent hub.

Hub Load Selection of an existing hub load from the parent hub.

Available Columns Present columns in the staging table of the chosen hub load. By default, only hash keys of existing hub loads are shown. By adding a hash key to the transaction link you select which other hubs will be linked to. The hash key of the parent hub is present in the transaction link in any case. If you would also like to add attributes to the link, check the "Show Attributes" checkbox so that the available columns from the staging table appear.

Selected Columns Chosen columns for the transaction link.

Conversions

Columns Selected columns from the tab columns.

Type Converter Allows you to add column-based type conversions on the way into the data vault. When declaring a type conversion, the original field will also be added, carrying the naming extension "_raw". Please note: only columns of type attribute can be converted! The type converter is disabled for hash key columns.

Target Name Column-based renaming. Will only affect the displayed name.

Comment Custom notes about the column. Will appear in the documentation.

2.3.3. Metadata slider

To open up the metadata slider, click onto any element previously loaded onto the working canvas. The slider will then appear from the bottom of the screen. The metadata sliders of the different object types are structured similarly; therefore, we will discuss the details based on the metadata of a hub.

Editable properties can be changed using the corresponding edit icon.

Base In the base part, the data specified on the base tab of the creation dialogue is listed. When multiple users are working on the same model, a change can only be saved if the property has not been modified in the meantime; otherwise the property has to be reloaded first.

Load Lists all loads for the object. (A satellite will only have one load). Only the Datavault Category can be changed. To correct a business key, delete the load and create the correct load again.

Object Actions

• Magnifying glass: opens the Data Viewer to see the historized data in the object.
• Database minus: opens the dialogue to delete loaded data from the hub.
• Trash bin: deletes the hub. This action requires deleting all loads, data and related objects first.

Load Actions

• Circled arrow: initiates a specific data vault load. Color/hovering gives status details. While loading, a stop symbol will appear to cancel the running load.
• Trash bin: removes the specific data vault load.

2.3.4. Style Settings

Canvas Styles The following properties can be changed:
• Color: in the case of satellites, the color only affects prototyped satellites without a load.
• Size
• Font size

General The visual grouping into subject areas on the canvas can be turned on or off.

2.3.5. Working with bookmarks

Bookmarks can significantly accelerate working with the core model by giving direct access to the most often used parts. Bookmarks can also be shared with all other users, so you can show a coworker what you are currently modelling.

Creating a bookmark
1. Load and arrange the objects on the working canvas in a favourable way.
2. Right-click onto the canvas to open up the context menu and select the option to add a bookmark.
3. Give the bookmark a name and save it.

To load a stored view
1. Click onto Menu and navigate to Bookmarks.
2. Select a stored bookmark from the list.

Be aware that loading a bookmark will reinitialize the canvas and clear all objects currently displayed on it.

Delete Bookmarks
1. Click onto Menu and navigate to Bookmarks.
2. Select "Manage Bookmarks..." from the list.
3. A window listing all bookmarks will appear.
4. Remove any unwanted bookmark and confirm to leave the dialogue.

2.3.6. Audit / Tracking Satellites

While you are building your model on the canvas, the technical implementation on the database itself is directly realized. Therefore, besides the elements you see in the logical model, additional objects are automatically created and loaded. One of these object types is the tracking satellite.

Definition A tracking satellite keeps track, at row-level granularity, of if and when a record was last seen in the source.

Purpose The goal is to either actually mark a record as deleted (when not seen in the source in a full load), or track when the record was last seen in a load (when doing delta loads). For example, when you are only doing delta loads, you still need logic to decide when a record can be considered removed from the source. This is where the last-seen date from the delta load tracking satellite comes in: you add the field to your output and include business logic in the business rules defining when a record can be considered deleted.

Implementation Tracking satellites are marked with the ending "_w_trkd_h" (for the historized delta load tracking satellite) or "_w_trkf_h" (for the historized full load tracking satellite). Additionally, you can see elements in the datavault with the ending _c: these directly give you only the currently valid records, without the history.

A trkd consists of the fields:
• ..._h: hash of the business key(s)
• ..._lth: last seen in delta load timestamp
A trkf additionally consists of the fields:
• ..._vh: boolean, whether the record was seen in the source
• ..._lsh: last seen in full load timestamp (updated when the record disappears from the source)
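As a hedged sketch of how the delta load tracking satellite could be used, the query below flags keys that have not been seen in a delta load for more than seven days; the schema, object and column names merely follow the naming pattern described above and are illustrative, and the threshold is business logic you define yourself:

    SELECT t.customer_h                          AS customer_hash,     -- hash of the business key
           t.customer_lth                        AS last_seen_delta,   -- last seen in a delta load
           (t.customer_lth < current_date - 7)   AS probably_deleted   -- illustrative deletion rule
    FROM datavault.customer_w_trkd_h t;          -- schema name is an assumption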

2.4. Dimensional Model

In the business objects layer, we can prepare a denormalized output based on the data vault core. A business object can be composed of fields coming from multiple related elements in the data vault. The Business Object Generator takes away the work of manually joining the data vault elements and generates an as-of-now view, on top of which the business logic can be applied in the business rules.

Definition A business object is a denormalized view layered on top of the datavault. It delivers an as-of-now-view, based on the data-set of a chosen granularity satellite.

When creating a new business object and selecting a hub, additional technical fields are available, such as the effective last-seen date of a record or whether the record was flagged as deleted in the source.
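Conceptually, the generated business object corresponds to an as-of-now join of the hub with the current records of its satellites, roughly like this hedged sketch; the schema, object and column names are illustrative, and the actual view is generated by the Business Object Generator:

    SELECT h.customer_bk,
           s.first_name,
           s.last_name
    FROM datavault.customer_h                 h
    LEFT JOIN datavault.customer_details_s_c  s   -- "_c" view: current records only
           ON h.customer_h = s.customer_h;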

Search Bar Allows you to search for an existing business object and open it in the business object generator.

Create Business Object Opens up the dialogue for Creating a Business Object.

Add Related Business Object Opens up the dialogue to create a related business object. The button is only active if a business object is currently open in the business object generator. It will invoke the dialogue to create a business object and preselect the starting hub. For details: Model a dimensional output.

Business Object Slider The slider will open up after creating a business object or when opening an existing business object. It lets you see and compose the output of the business object.

Element Selector The visual selection is available after creating a business object or when opening an existing business object. It lets you select an element from the data vault you would like to take fields from.

2.4.1. Creating a Business Object

Steps
1. Navigate to the business objects module and click onto "Create business object".
2. Fill out the fields of the dialogue and confirm the creation.
3. Add attributes from the datavault to your output.

Start Hub Declares which hub the business object starts from. Determines the primary name of the business object.

Source System Declares for which source system the business object should be generated.

Granularity Satellite Defines the dataset, which is delivered to the output.

Comment Custom notes about the Business Object. Also appears in the documentation.

Functional Suffix Name Displayed suffix name for the business object.

Functional Suffix ID Technical suffix, which is appended to the created view.

Create Business Object Button to complete the creation dialogue.

2.4.2. Working with the Business Object Generator

Once a business object is created, you can modify its output using the business object generator.

Adding attributes to the output
1. Use the element selector and either click onto a hub or a satellite to select a data vault object.
2. The available columns will appear in the left part of the business object slider.
3. Select a column you would like to have in the output.
4. Rename the column to business-relevant terminology.

Available technical attributes Certain attributes are automatically available within the environment, such as last-seen timestamps in delta loads. With the help of these fields, you can for example cleanse the output data of records deleted in the source.

Hub
• LT (Load Time): timestamp when the record was inserted into the Hub
• LS (Load Source): source from which the record was inserted
• DLS (Delta Load Last Seen, per Staging Table): timestamp for when the key has last been seen in a delta load
• FLVC (Full Load Validity Change, per Staging Table): timestamp for the last state change (added/deleted) of a key in a full load
• FV (Full Load Valid, per Staging Table): flags whether the key was in the last full load or not

Satellite
• LTS (Load Time Satellite): timestamp when the record was inserted into the Satellite
• PS (Entry is Present in Satellite): flags whether the key is available in the Satellite

Link
• LTL (Load Time Link): timestamp when the record was inserted into the Link
• LSL (Load Source Link): source from which the record was inserted
• ELSL (Effective Last Seen Time Link, over all Staging Tables / Systems): timestamp when the record was effectively last seen in a full or delta load
• DLSL (Delta Load Last Seen Time Link, over all Staging Tables / Systems): timestamp when the record was last seen in a delta load
• FLVCL (Full Load Last Validity Change Time Link, over all Staging Tables / Systems): timestamp for the last state change (added/deleted) of a record in the link

Hint: No manual saving is needed. When modifying the output-columns of a business object, after each modification the result is directly sent to the database and the underlying view is rebuilt.

Browsing You can follow related objects in the element selector by double-clicking onto a hub, just as in the data vault module. Some elements can appear in gray. In this case, it is not yet possible to select columns, as no data vault load has been defined to follow along the relation. First add a load for the link.

An important difference to the representation in the data vault is that hubs connected by more than one link will be represented as separate hubs. This way you can directly indicate which relation to follow for the joins.

Traversing Links / available related objects

When modelling in the Datavault, the expected cardinality of each link was already defined. Based on this information, the business object generator can support you in following relations / traversing links without producing fan-out in the output data.

Therefore, only related objects and links of the same or a more aggregated grain are shown. This means you can only follow many-to-1 or 1-to-1 links, but not many-to-many or 1-to-many.

2.4.3. Deleting a Business Object

A business object can only be deleted if no customized business rules are declared for it anymore; therefore, remove all business rules first. An exception to this is the unaltered default ruleset, which will automatically be removed when deleting the business object. If the unaltered business ruleset is still fed into the accesslayer, the deletion will automatically remove it from the accesslayer as well.

Steps
1. Open up the business object you would like to delete.
2. Open up the business object slider.
3. Click onto the delete icon behind the magnifying glass in the Business Object Columns.
4. Confirm the deletion.

If a dependent business ruleset exists, the deletion will directly return an error message.

2.5. Business Rules

This is the layer where your business logic is applied to the data. It is a second layer on top of the business object, which can then be fed to the accesslayer. The business logic is applied in the form of a database view (a minimal example is sketched below), which enables you to make use of any functionality of the underlying database.

Definition A business ruleset is a view on top of a business object. It allows the application of custom business logic.

For auditability or performance reasons, it can sometimes make sense to loop back the data and persist it in the data vault. This procedure is called Business Vault.
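As a minimal hedged example, the SQL body of a business ruleset view on top of a business object might look like this; the view and column names are illustrative:

    SELECT customer_bk,
           UPPER(country_code)                                   AS country_code,
           CASE WHEN yearly_revenue >= 100000 THEN 'KEY ACCOUNT'
                ELSE 'STANDARD' END                              AS customer_segment
    FROM businessobjects.customer_erp;   -- assumed name of the underlying business object view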

Search Bar Filters the business ruleset list.

Business Ruleset List Lists the existing business rulesets. Grouped by Business Object > System. Select a business ruleset from the list to see and edit its code with the editor.

Business Ruleset Editor The editor displays the SQL code of the view and allows implementation of new business logic. Using the corresponding icon in the top right corner, a horizontal split view can be activated.

The icon next to it (de-)activates the data preview of the result at the bottom.

Often used Keyboard-Shortcuts (Action: Windows/Linux / Mac)
• Find: Ctrl-F / Cmd-F
• Replace: Ctrl-H / Cmd-Option-F
• Fold: Alt-L / Cmd-Option-L
• Go to Line: Ctrl-L / Cmd-L
• Remove line: Ctrl-D / Cmd-D
• Move line up, down: Alt-Up, Alt-Down / Option-Up, Option-Down

Add Business Ruleset Opens up the creation dialogue for adding a business ruleset.

Business Ruleset Properties

Clicking onto the properties icon opens up the slider from the left, containing the business ruleset properties.

Quick Inserts Quick inserts can be used to search for tables/views on the database you would like to join to. This panel is made up of a search box at the top and a list of tables/views.
• You can add more entries to the list using the search field.
• Expand the name of the available list entries to see the fields available within the object.
• Use the plus-icon or double-click to add the element to the view.

2.5.1. Adding Business Rules To be able to add a business ruleset, an underlying business object needs to be created first.

Steps
1. Navigate to the Business Rules module and click onto "Add Business Ruleset". Alternatively, you can click onto the add icon in the business ruleset list; this will automatically complete the first two fields of the creation dialogue.
2. Complete the creation dialogue.
3. Modify the view code in the editor to implement business logic.

Business Object Selection of the underlying business object to build the business ruleset on top of.

System Declaration of the source system, defining for which related business object the rule should be created.

Business Ruleset Name Displayed name of the newly created business ruleset.

Business Ruleset ID Technical suffix ID for the business ruleset, which will be added to the view-name on the database.

Add Business Ruleset Button to complete the business ruleset creation.

2.5.2. Business Ruleset Properties

Include in Accesslayer The button toggles whether the specific business ruleset is fed into the accesslayer. If enabled, a prioritization can be given. The prioritization is used when combining different business rulesets in the accesslayer: if two rulesets have a different result for a field, the value of the higher-prioritized business ruleset is taken.

Is Error-Ruleset Defines the ruleset to be an error-view. The error-view will be mirrored into the errormart, as well as combined into an accesslayer-like errormart-view.

Comment Custom notes about the business ruleset.

2.5.3. Deleting a Business Ruleset

Steps

1. Click onto the delete icon located next to a business ruleset in the business ruleset list.
2. Confirm the deletion.

The deletion of a business ruleset will:
• delete the business ruleset view
• remove the business ruleset from the accesslayer (if fed into the accesslayer)
• remove the errormart view (if declared as an error-ruleset)

2.6. Accesslayer

After denormalizing the datavault structure and applying business logic to the data, it can finally be delivered to the target using the accesslayer.

Definition The accesslayer is the interface for the target consuming systems. It combines the result-sets of business rulesets of related business objects.

Usage The accesslayer does not have its own module. You will only see it in the data lineage on the right side. It can be configured using the business ruleset properties.

When you enable a business ruleset to be fed into the accesslayer, the layer is regenerated immediately. Which accesslayer a business ruleset is fed into is determined automatically by the underlying business object. For related business objects (and their business rulesets), there will always be one accesslayer combining the result sets.

Hint: Make use of the functional suffix in the business object to create unrelated business objects with the same starting hub.

If a record appears in multiple business rulesets, the prioritization of the business rulesets takes effect: if both rulesets contain a value for a column, the value of the higher-ranked business ruleset is taken. If the higher-ranked value is NULL, the next lower available value is taken.
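To illustrate the described behavior (this is not the code the Datavault Builder actually generates), combining two rulesets by priority with a NULL fallback is equivalent to something like:

    SELECT COALESCE(r1.customer_bk, r2.customer_bk)            AS customer_bk,
           COALESCE(r1.customer_segment, r2.customer_segment)  AS customer_segment  -- higher priority first, NULL falls back
    FROM businessrules.customer_crm r1                 -- higher-prioritized ruleset (illustrative)
    FULL OUTER JOIN businessrules.customer_erp r2      -- lower-prioritized ruleset (illustrative)
        ON r1.customer_bk = r2.customer_bk;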

2.7. Data Lineage

The Data Lineage gives you an overview of the data integration flow. By using the system colors, it highlights which elements are loaded from which source.

In addition to the existing objects, placeholders for "no source", "no staging", "no datavault" and so on are present in the visualization. This allows you to find objects which are not yet completely used in an integration flow.

Search Bar Filter the data lineage for specific elements.

Source The coloring starts on the left side with a source system. The color is carried along as long as an object has only one source; otherwise, it turns gray, indicating that (potentially) data from different sources is integrated.

Target It is also possible to analyse the data flow starting from the result set. Hovering over an object will immediately highlight all directly dependent elements on the screen.

2.8. Operations

The operations module is all about orchestrating and monitoring the loading process of staging and data vault loads.

2.8.1. Status

The status lineage gives you an immediate overview of the current loading states. In contrast to the data lineage, its flows are not colored by system, but by the state of each staging and data vault load.

Search Bar Use the search bar to filter the status-lineage. Filtering works the same as in the data lineage.

Load Loads are colored by their current status.
• Green: successfully loaded.
• Light blue: initialized, but waiting.
• Blue: loading.
• Red: failed.

When hovering over a load, a window will appear giving details about the load state, such as start time, duration and loaded rows.

2.8.2. Jobs

The jobs module is the heart of automating and orchestrating the data flows in the Datavault Builder. The module allows you to define multiple different packages of staging and data vault loads, giving you maximum flexibility to define custom jobs. Even when you compose jobs manually, the Datavault Builder helps you by taking away the work of defining the correct and most efficient loading order: focus on the loads to include in a job, and the Datavault Builder will optimize and parallelize the loads as far as possible.

Definition A job is a package of loads (datavault / staging) on the granularity of one source system. It is possible to have multiple jobs per system, but not multiple systems per job.

Usage At system creation, a default job is created, automatically taking up all newly created loads. Once you start modelling, the job is automatically updated in the background with each modification of your data integration process, allowing you to directly trigger a reload of your integration flow at any time.

Besides all the automation, many parts can be customized, as described in the following chapters.

Job list When clicking into the search-bar, a dropdown will appear, listing all existing jobs in the Datavault Builder.

Job Load Status / Load Selection On the canvas, the loads within a job as well as the dependencies are visualized. In the status view, the color indicates the current status for each load. When using the view toggler, the selection view allows you to compose your custom job.

View Toggler Click onto the toggler to switch between load status and load selection on the canvas.


Job Actions

• Circled Arrow: Initiate Job Load
• Switch: Enable/Disable Job
• Screw nut: Edit Job
• Trash bin: Delete Job

Job Details Lists when the job last started, how long it ran, as well as the next scheduled run. If you have multiple schedules defined for the same job, only the next run overall will appear.

Schedules Lists defined schedules and lets you add another schedule to a job.

SQL Query It is possible to define custom SQL commands which will be executed every time the job completes. This allows you, for instance, to trigger copying data into another database or to program notifications about the job status.

Triggered by / Triggered on Completion: Jobs Lists and allows you to define dependencies between jobs, which cause jobs to run after one another.


2.8.2.1. Creating a job As stated before, a default job is created on system creation. This default job is configured to automatically include newly created loads. The default job can be modified or deleted if needed.

Also, it is possible to create more jobs for a system. This can be the case if only parts of the source should be loaded more often than others to improve the effectiveness of the integration flow.

Steps 1. Navigate to the jobs module 2. Click onto "Add Job" 3. Fill out the dialogue explained below 4. Complete the creation dialogue. 5. Continue with defining individual loads, schedules and job dependencies.

Activation Switch Enable or disable the job as a whole.

Source System Select a source system which was previously created in the staging module.

Job Name Displayed name of the job.

Job Suffix ID Partial technical ID of the job. Will compose the job ID in form of: [Source System ID] + "_j_" + [Job Suffix ID]
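For example, a hypothetical source system with the ID "crm" and the job suffix ID "daily" would result in the job ID "crm_j_daily".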


Max. Parallel Running Loads Defines the number of loads running in parallel overall in the system. If another job is executed at the same time, this will also impact the number of available load slots.

Auto Add New Elements When creating a new staging table or data vault element, the loads can automatically be added to a job. Enable this continuous integration by ticking the box. Otherwise new loads need to be added manually.

Comment Custom notes about the job. Will appear in the documentation.

Add Job Button to complete the job creation.


2.8.2.2. Defining loads within a job A job can be configured in two ways: 1. Automatically take up new loads 2. Only contain manually specified loads

In both cases, loads can be individually enabled/disabled. If a load is disabled in the first case (automatically take up new loads), then new loads will still be automatically added to the job, but the loads you disabled are skipped. In the second case (manually specified loads only), new loads are not automatically added, so loads have to be picked individually and only selected loads will be contained in the job.

If a load is removed in the Data Vault or Staging, the load will also be automatically removed from the job. Be aware that if you have previously excluded a load from a job, then removed it from the data vault and added it again, it has to be disabled again.

You can also reconfigure a job from automatically adding new loads to manually selecting loads. In this case, the state is preserved, meaning that all loads added to the job up to this point will directly be included in the manual job.

Steps 1. Go to the jobs module and open up an existing job from the list 2. Toggle the view to activate the "Selection"-Appearance of the canvas 3. Click onto a load to add it to / remove it from the job

When you are on the selection view, you will notice that the connections in the lineage are colored in three different ways:
• Purple: Selected load, will be executed on job execution
• Dark Gray: Available load, could be selected, won't be executed on job execution
• Light Gray: Independent data flows, can't be selected, represent virtual layers or loads from different source systems.

The selection composer will only allow you to select loads of the underlying source system of the job. If you would like to initiate dependent loads on job completion, please have a look at job dependencies.


2.8.2.3. Schedule a job

The Datavault Builder includes an integrated scheduling agent, allowing you to directly manage and configure the automation of reloading processes within the environment.

Definition A schedule will trigger one individual job on a given timing.

Steps 1. Navigate to the jobs module, open an existing job and click onto the icon located behind "Schedules" in the Job properties slider. 2. Fill out the Base tab, allowing you to define a start and end date. 3. Fill out the Scope tab, defining whether to load as a full or delta load. 4. Indicate the timing to initiate the job at.

Important It is possible to have overlapping schedules, as well as job dependencies, leading to a job possibly being initiated while it is already loading. To prevent the system from queueing up, the job will first check whether it is already running and will not initiate a second time.


Base

Activation Switch Enables/disables the schedule.

Schedule Name Displayed name of the Schedule.

Start Date / Time Only run the schedule past this initial date and time.

End Date / Time Only run the schedule before this final date and time. Can be left out if infinite.

Comment Individual notes about the schedule.

Schedule ID Technical ID of the schedule. Is directly derived from the Name but can manually be edited.


Scope The scope is the place where you can define a schedule as loading full or delta in the staging.

In case of a full load, all data from the contained source tables in the job will be copied.

In case of a delta load, the where clause parameter specified for each staging table will be used to define a delta-subset in the source. In this case, any parameters used can be assigned a fixed or calculated value in the schedule definition. For details: Working with full / delta loads

Important If a schedule is configured as delta load and has dependent jobs which will be initiated, then the parameter values from the first job triggered by a schedule will be passed down to the next job to be used in the delta load.

Schedule Type Defines whether to load full or only delta. For a full load, no parameters have to be specified.


Delta Load Parameters
Name: Type in the name of the parameter used in the Delta-Load-Where-Clause of the staging table(s).
Value: Assign a fixed or dynamic value to the parameter, which will be filled in on each execution.

The value can also be the result of a function call. For that purpose, three execution environments are possible:
- Execute on source: Just add the function to the value; it will be added to the where clause and passed as is to the source.
- Execute in the Datavault Builder Core: For your convenience, default functions can be added with the prefix 'DVB_CORE_FUNC.'.
- Execute on your client-database: Prefix your value with 'DVB_CDB_FUNC.' and it will be run on the client-database. Important: Make sure that the function call returns a single string (text/varchar) as a result! For instance, create your own function in schema 'dvb_functions' called 'get_last_fullmoon_date'. To use it in the value field, add 'DVB_CDB_FUNC.dvb_functions.get_last_fullmoon_date()'.
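As an illustration, a minimal sketch under the assumption of a Postgres-based client-database on which the schema dvb_functions already exists; the parameter name and the returned date are placeholders only:

-- Hypothetical helper function on the client-database; a real implementation would calculate the date.
CREATE FUNCTION dvb_functions.get_last_fullmoon_date() RETURNS text AS $$
    SELECT '2019-02-19'::text;
$$ LANGUAGE sql;

-- In the schedule, the delta load parameter could then be configured as:
--   Name:  last_loaded_at   (hypothetical parameter name used in the Delta-Load-Where-Clause)
--   Value: DVB_CDB_FUNC.dvb_functions.get_last_fullmoon_date()
-- On execution, the single text value returned by the function is filled into the where clause
-- of the staging table(s) wherever the parameter is referenced.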

Important: Make sure to confirm adding a new parameter by using the plus symbol.


Timing In the last step of adding a new schedule to a job, the timing is assigned.

Available options are:
- Every Minute
- Every Hour (minute)
- Every Day (hours, minute)
- Every Week (weekday, hour, minute)
- Every Month (monthday/lastofmonth, hour, minute)
- Every Year (month, monthday/lastofmonth, hour, minute)

It is possible to also select multiple options within one step, as displayed in the example below, where the job will only run during the workdays and working hours.

Timing Selection Definition of the timing for a schedule. Multiple selections within a row are possible.

Add Schedule Button to complete the creation of the schedule.


2.8.2.4. Job dependencies

Jobs can not only be triggered by schedules, but also by other jobs. In this case, the staging load type (full / delta) as well as the parameters are inherited from the triggering job. The dialogues for the jobs "Triggered by" and "Triggered on Completion" work the same way, so only one of them is shown below.

Definition A dependent job is triggered on completion of another job.

When adding a job to "Triggered on Completion", that job will run after the first job is finished. Therefore, the triggering job will also appear in the "Triggered by" jobs list of the dependent job. Direct circular dependencies are not possible (a dependent job being the trigger for its triggering job). However, indirect circular dependencies are possible.

Job List Search the list for a specific job

Job Status / Action

• Toggle: Indicates whether the chosen dependent job is activated. For enabling/disabling, open up the job.
• Trash bin: Removes the dependency.


2.8.2.5. Post-Job SQL Query

Besides job dependencies, the Datavault Builder also supports the execution of Post-Job SQL Queries, allowing you to perform customized actions once the loading process of the job is finished.

Definition A Post-Job SQL Query is custom code, executed within the job, after all loads, but before triggering dependent jobs.

Steps 1. Navigate to the jobs module, open an existing job and click onto the edit-symbol located behind "Default SQL Query" in the Job properties slider. 2. Fill out the Pop-Up, defining your custom code. 3. Activate the SQL Query.

Configuration In dvb_config.config, you can define a parameter called "run_job_sql_query_on_core", allowing you to specify whether the query will be executed on our core or on the client/process database.

Example The following code shows a case where executing the post-job SQL query on the core makes sense (Important: To run this example, set run_job_sql_query_on_core in the config to "TRUE"). In this case, the job is loading data from CSV files. Once those files are loaded, they should no longer be in the source folder, otherwise they would be reloaded on each and every run. Therefore, we move them into a subfolder after the load.

DO $$
DECLARE
    _moved_files_count INTEGER;
BEGIN
    IF ({{job_completed}}) THEN
        _moved_files_count = dvb_core.f_move_files('test*.csv', 'loaded_files/bak');
        RAISE NOTICE 'Post Job SQL Script moved % source files.', _moved_files_count;
    END IF;
END;
$$;

As you can see, for the file movement we use the core function "dvb_core.f_move_files", specifying which files should be moved into which directory. Also, we only move the files away if the job was actually successful, meaning no load failed. For this purpose, the variable {{job_completed}} can be used, which is automatically evaluated at execution time by the core.


2.8.2.6. Deleting a job

Before deleting a job, make sure that it is not needed anymore, because removing a job will also directly remove:
• All schedules defined for the job
• All SQL Queries within a job
• All job dependencies
• Inclusion logic for manually disabled / enabled loads within a job

Steps 1. Open up the job in the jobs module

2. Click onto the trash bin icon in the job properties. 3. Confirm the deletion.


2.8.3. Command line

The command line is a direct interface to the DVB core module. Every action which can be performed within the GUI can also be called from the command line.

Common use cases • Batch-Processing: Use (metadata-driven) scripts to easily create structures of the integration flow. • Rollout-Scripts: Take the creation-log from one environment and execute it on another one. • Hotfixes: Directly patch a system without restarting it.

Usage You can send any query to the database using the command line. When making use of a function (such as dvb_core.f_create_user(...)), make sure to formulate it as a SELECT-Statement:

SELECT dvb_core.f_create_user(username text,email text,password text,pg_user text,full_name text,password_expires timestamp without time zone);

Important: All functions are part of the dvb_core schema. Therefore, you have to prefix every function call with dvb_core.
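For illustration, a concrete call following the signature above could look as follows; all values are placeholders:

SELECT dvb_core.f_create_user('jane.doe', 'jane.doe@example.com', 'S3cretPassword', 'dvb_user', 'Jane Doe', '2020-12-31 00:00:00');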

Also, the query to execute has to return a text. This is important, as the message will be passed on as the result message to the GUI. If no message is returned, the call will fail. For instance, if you would like to formulate an INSERT statement, prepare it in the following manner:

[your INSERT statement]; SELECT 'ok';
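A minimal sketch with a purely hypothetical table:

INSERT INTO staging.manual_corrections (id, note) VALUES (1, 'corrected manually');
SELECT 'ok';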

While typing, the Command Line Editor will offer you all available functions in the auto-completion. When hitting enter, it will also paste the parameters / parameter types to help you use the function call.

Frequently used commands

User Management
f_create_user Creates a new user. Use "dvb_user" as pg_user.
f_delete_user Removes a user.
f_update_user_password Changes the password of a user.

Staging
f_create_or_update_source_system Creates a new source system.
f_delete_source_system Deletes an existing source system.
f_create_or_update_staging_table Creates a new staging table.
f_remove_source_table Removes a staging table.
f_initiate_staging_load_async Initiates the staging load for a staging table.

Data Vault
f_create_hub Create a new hub.
f_create_hub_load Add a load to an existing hub / Create a new hub with a load.
f_delete_hub_load Removes the load from a hub.
f_delete_hub_data Truncates all loaded data from the hub.
f_delete_hub Deletes the hub (only works when no satellites/links/loads or data loaded)
f_create_link Create a new link.
f_create_link_load Add a load to an existing link / Create a new link with a load.
f_delete_link_load Removes the load from a link.
f_delete_link_data Truncates all loaded data from the link.
f_delete_link Deletes the link (only works when no loads or data loaded)
f_create_sat_prototype Creates a prototype satellite.
f_create_sat_load Create a new satellite with a load.
f_delete_sat_prototype Removes a prototype satellite.
f_delete_sat_data Truncates all loaded data from the satellite.
f_delete_sat Deletes the satellite (only works when no data loaded)
f_initiate_datavault_load_async Initiates the datavault load for an object/staging table.


2.8.4. Load States

There are four basic load types in the Datavault Builder:

• Staging load: A single staging table load is performed.
• System load: All staging tables for a system are loaded.
• Data Vault load: A single load into a Data Vault object (Hub / Link / Satellite).
• Job load: A (group of) Staging and / or Data Vault load(s).

Returned State   Meaning                              Summarized Load Step   Applicable to
Initiating       Finding load parts to execute        Running                St, S, DV, J
Waiting          Waiting for a free processing slot   Running                St, S, DV, J
Loading          Staging Data or Loading the Vault    Running                St, S, DV, J
Incomplete       Parts of the load failed             Failed                 S, J
Failed           The load failed                      Failed                 St, S, DV, J
Succeeded        The load was successful              Succeeded              St, S, DV, J

2.9. Deployment

The deployment module is all about the state of the developed model. With the module, the state can be exported, versioned or imported onto another environment again. Also, the logical model of one environment can be compared with another environment. Lastly, to bring detected changes from one environment to another, both direct deployment and change script generation are supported.

2.9.1. Getting started

Connect Environment Lets you connect to a remote environment or import an existing, locally stored logical state.

Export Local Environment Retrieves the logical state from the connected environment and packages it into human readable JSON files within a zip file.

2.9.2. Export an environment

Definition When exporting an environment, a state-archive is generated. This zip-file contains the information about the current data integration flows on a logical level. As the logical state is exported as json files, the state is human readable.

Purpose

There are many reasons to export an environment. For one, based on the export, the model can be checked into an enterprise versioning software. Furthermore, as the files are human readable, different versions can also be merged externally and reimported into another environment.

Steps You can either export the local environment or a remote environment after connecting to it. a) Open up the deployment module and click onto "export local environment". A zip file will automatically be generated and offered for download. b) Open up the deployment module and click onto "connect environment". Once connected, press the export icon beside the connected environment to generate the state-archive.

2.9.3. Connect environment

Definition An environment can either be a second instance of the Datavault Builder (e.g. Dev/Test/Prod) or an exported logical state of the implementation.

Purpose The goal is to either connect to a remote Datavault Builder Instance or to upload a previously exported state for comparison.

Steps 1. Open up the deployment module and click onto "connect environment". 2. From the environment tabs, select the environment you would like to compare to. 3. Depending on the environment, specify the connection details: a) Remote environment: Specify a host and login details. b) File: Upload a previously exported state zip file. c) Folder: Drag and drop the top root folder of the previously exported, unzipped state archive.


Environment Tab Select the sort of environment to connect to.

Environment Details Specify the login properties or upload a folder to connect to the second environment state.

2.9.4. Comparing environments

Purpose Once connected to a second environment, the compare view lets you see differences and build and execute custom packages of import or export actions to roll out changes to another environment.

Steps 1. Open up the deployment module and connect a second environment. 2. Go through the list and select the objects you would like to roll out. 3. Initiate the deployment through the environment actions.


Export Environment

Download a state-archive from either of the two compared environments.

Expand all

Open up all categories and show contained elements.

Environment actions

From left to right:
- Disconnect target environment
- Switch deployment direction
- Recompare both environments
- Initiate deployment

Filter List Apply a filter to only compare certain elements of the implementation.


Deployment package options

The operation column shows you what action is needed to bring the target environment's element into the same state as in the source environment. With the checkbox you can select which elements should be included in the rollout. When clicking the checkbox, a dependency check is performed which supports you in rolling out dependent objects.

Diff Viewer

When clicking onto an object in the list, the diff viewer will show you the states of both sides and, in case of differences, also the changes in the middle part.

2.9.5. Deploying objects

Purpose Once the deployment has been initiated, you need to select the way to deploy. This can be a direct deployment or a script based deployment.


Deployment way Specify which way to roll out the previously built deployment package.

Deployment direction Reminder of the direction of deployment.

Actions Start direct deployment / export into change script or cancel the process.

2.9.5.1. Direct Deployment

Purpose Instantly start the rollout of a previously defined deployment package to propagate changes onto an environment.

Steps 1. Open up the deployment module and connect a second environment. 2. Initiate the deployment and select the option for direct deployment.

2.9.5.2. Script based deployment

Purpose Script based deployment is usually the way to deploy in enterprise environments where specific deployment packages are created and tested before being rolled out onto a productive environment. For the rollout, a package of API calls is generated, which can then be executed against a target environment.

Steps 1. Open up the deployment module and connect a second environment. 2. Compose a deployment package and initiate the deployment. 3. Select the option to export a deployment script package. This exports a file with the name pattern [date]_[time]_rollout_package.zip (e.g. 20180803_135742_rollout_package).

2.9.5.3. Database structure rollout is described further below.

That package contains two files. One of them contains environment parameters, the other the necessary API calls. 4. Configure the files and set up the required libraries to be able to initiate the script based rollout: a) In the environment file (rollout.env), specify at least the target to roll out against and the login user details. If you are rolling out source systems, you can also update their parameters according to the target environment's needs. b) Install cURL (https://curl.haxx.se/) and jq (https://stedolan.github.io/jq/). 5. Invoke the rollout by calling the rollout.sh script as described in the file (./rollout.sh rollout.env).

Limitations Currently no error handling is in place. Therefore, you should test the rollout against a second environment before running it against production.

2.9.5.3. Database structure rollout

As the Datavault Builder has no separate meta-data repository, all objects are directly created on the processing database. For the rollout, the database objects and structures can be moved between the environments by using schema comparison tools.

By exporting the database structures, the current states of the implementation can as well be versioned by using a standard code-versioning-software, such as SVN or GIT.

Steps
1. Deploy the database structures.
2. Deploy the content of the tables in the dvb_config schema as well (as the environment configuration is stored in there):
- config (adjust if necessary)
- system_data & system_colors (adjust if necessary)
- auth_users (adjust if necessary)
- job_data, job_loads, job_schedules, job_sql_queries & job_triggers. At least the job "_dvb_j_default" has to be present on each environment.
- source_types, source_type_parameters, source_type_parameter_groups, source_type_parameter_group_names, source_type_parameter_defaults > copy & paste as is.
3. After the deployment, depending on your used database type (MSSQL, EXASOL, ORACLE), trigger the refreshing of the metadata by executing the following query in the command line in the GUI: SELECT dvb_core.f_refresh_all_base_views();
4. If you have skipped deploying parts of the dvb_config schema, it may be necessary to:
- create new users
- update system configs (as DEV, TEST and PROD usually have different configs) - can also be done in the GUI
- adjust jobs

3. Installing Datavault Builder

3.1. Requirements

Hardware

Minimal system requirements for all containers except the database:
Disk: 25 GB
Memory: 16-32 GB
CPU: 4-8 cores

Additionally, you will need more disk space, memory and CPU if you also run your database as a container. Please refer to the database manufacturer for the minimal requirements of your database. A rough approximation rule: Required database disk space = 4 * source data disk space.
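For example, roughly 500 GB of source data would, by this rule of thumb, call for about 2 TB of database disk space.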

Software

The Datavault Builder can be installed on any system capable of running a docker host and able to connect to docker.com. For a list with a current overview of supported Environments please visit: https://docs.docker.com/engine/installation/

Windows Server Please note that running Linux based containers on Windows Server environments is officially supported since Windows Server 2019, and the license for Docker EE is included in the server license. https://docs.microsoft.com/en-us/windows-server/get-started-19/whats-new-19#linux-containers-on-windows

Proxy If you use a proxy to connect to the internet, please check the corresponding chapter. If you have no chance at all to connect to the internet from your server, you can download the container images on another computer and transfer them via USB drive or local network connection. Please be aware that this will increase your maintenance efforts and is not recommended.

On the same page, install instructions for each system are also present.

Sample As an example, we have prepared the install instructions for CentOS 7 for you (tested version: 7.4 minimal, normal installation without any additional packages).

Run the following commands as root:

yum remove docker docker-common docker-selinux docker-engine
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install docker-ce
systemctl start docker
systemctl enable docker
curl -L https://github.com/docker/compose/releases/download/1.17.0/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose

In case you would like to start the containers with a user other than root, add that user:

groupadd docker
usermod -aG docker USERNAME


Test This is already it. Test the installation as described in the next chapter and then just copy the provided docker-compose file (an example can be seen in the chapter "On Premise") into a local folder, cd to that folder and start the application:

docker login
docker-compose up -d

3.1.1. Download of Container Images on Another Computer

Short: use the save CLI command.

https://docs.docker.com/engine/reference/commandline/save/

You can pull the image on a computer that has access to the internet.

sudo docker pull hello-world

Then you can save this image to a file:

sudo docker save -o hello-world_image.docker hello-world

Transfer the file on the offline computer (USB/CD/whatever) and load the image from the file:

sudo docker load -i hello-world_image.docker

Reference for docker load: https://docs.docker.com/engine/reference/commandline/load/

3.1.2. Connect to the Docker.Com repository using a proxy

Please use the following manual to configure Docker to use your proxy. You need to scroll down to the section "HTTP/HTTPS proxy":

https://docs.docker.com/engine/admin/systemd/#httphttps-proxy

Important: You need to verify the correct settings using the hello-world application. Testing using cURL is not meaningful, as the proxy settings for curl and Docker can differ.

3.2. Test your Docker Installation

Save the following file in a local folder as docker-compose.yml:

-- start of the file (don't include this line)
version: '3.1'

services:
  helloworld:
    image: 'hello-world'
-- end of the file (don't include this line)

Switch using cmd or bash to the folder where you saved the docker-compose.yml and type:

docker-compose up

You should get a message saying:

helloworld | Hello from Docker!

If this doesn't work, check the following prerequisites:
• your computer is connected to the internet
• if you use a proxy, the proxy is configured correctly as in the linked manual for Docker
• you have the latest version of Docker and Docker Compose installed (depending on your Linux distribution you can't use apt, as it has too old versions of the packages)

3.3. On Premise

The Datavault Builder is shipped as docker images, containing the different parts of the application. This makes not only the setup very easy, but also allows updating one part without touching the others. For the setup, you can use a docker-compose file, such as the one below. It will automatically pull the necessary images from our repository and start the containers.

Steps for initial setup
1. Create a docker cloud account (docker.com) and send us the username, so we can add you to the repository.
2. Install docker and the latest docker-compose (docker for Windows 10 has the latest, Ubuntu 16.04 needs updates).
3. We will send you a docker-compose.yml file, looking similar to the code shown below.
4. Adjust the settings in the docker-compose.yml and save the file, such as:
• initial user settings
• passwords
• usernames
• ports (optional)
• and in case of an external database: jdbc connection string
5. Open the cmd-line within the folder containing the docker-compose.yml and log in to docker cloud: docker login
6. Download the docker images: docker-compose pull
7. Start the docker containers: docker-compose up -d
8. The Datavault Builder interface can now be reached with a Chrome browser on localhost:80.

Updating
1. Adjust the version numbers in your docker-compose.yml file.
2. Download updates with the cmd-line command: docker-compose pull
3. Restart the docker containers: docker-compose down, followed by docker-compose up -d

Optionally, you can remove old images, which are not used anymore: docker image prune

Sample docker-compose.yml

version: '3.1'


services:
  clientdb_postgres:
    environment:
      - AUTHENTICATOR_PASSWORD=authenticatorPassword # use the same authenticatorPassword for every container
      - 'DBADMIN_PASSWORD=dbadminPassword' # dbadminPassword can differ from clientdb and core
    image: 'datavaultbuilder/clientdb_postgres:3.3.1'
    volumes:
      - data:/data
    ports:
      - '5433:5432'
    restart: on-failure
  core:
    environment:
      - AUTHENTICATOR_PASSWORD=authenticatorPassword
      - 'CLIENT_DB_CONNECTIONSTRING=jdbc:postgresql://clientdb_postgres:5432/datavaultbuilder'
      - CLIENT_DB_TYPE=postgres_client_db
      - 'DBADMIN_PASSWORD=dbadminPassword'
      - 'PLJAVA_VMOPTIONS=-Djava.security.egd=file:///dev/urandom -Xms128M'
      - 'GUI_USER_NAME=yourName'
      - 'GUI_USER_PASSWORD=yourPassword'
      - 'GUI_USER_TEXT=This is your User'
      - '[email protected]'
      - 'GUI_USER_GROUP=dvb_user'
    image: 'datavaultbuilder/core:3.3.1'
    ports:
      - '5432:5432'
      - '8444:8444'
    secrets:
      - scheduler_password
      - datavault_builder_license
    restart: on-failure
  webgui:
    environment:
      - 'DAV_USER=yourName'
      - 'DAV_PASSWORD=yourPassword'
    image: 'datavaultbuilder/webgui:3.3.1'
    ports:
      - '80:80'
    restart: on-failure
  api:
    environment:
      - AUTHENTICATOR_PASSWORD=authenticatorPassword
    image: 'datavaultbuilder/api:3.3.1'
    ports:
      - '12334:12334'
    restart: on-failure
  scheduler:
    environment:
      - 'PGAGENT_OPTIONS=-l 1'
    image: 'datavaultbuilder/scheduler:3.3.1'
    restart: on-failure

secrets:
  scheduler_password:
    file: scheduler_password.txt
  datavault_builder_license:
    file: datavault_builder_license.lic

volumes:
  data:

3.4. Using non-containered Client-Database

When using the Datavault Builder, a containerized version of the client database is started up by default as well (if available). Nevertheless, you can also connect to a separate instance of your database. In this case, please modify the connection string in the docker-compose/docker-stack file accordingly.

The necessary meta-data structures are not directly deployed through the core to the client-database. If you would like to use a separate client-database, please contact us for the install-script.

Minimal Requirements When planning on bringing in your own database, please respect the following tested minimal versions and configurations:

General: • Collation: Case-Sensitive • Empty database, preferably called "datavaultbuilder"

Database-Specific:
• Oracle:
- Min. Version 12.2
- User with admin rights
- Grants to initially create SYS-Triggers
- 3*2 GB Redo-Log
- 2 GB Undo-Log
- max_string_size = extended (recommended)
• MS SQL:
- Min. Version: 2016
- Collation: SQL_Latin1_General_CP1_CS_AS
- Login databaseowner user
- Login "authenticator" (does not need specific grants)
- Compatibility level 130 or higher: EXEC dbo.sp_dbcmptlevel @DB, 130
- Set Recovery mode to simple unless taking care of backing up transaction logs: ALTER DATABASE [DB_NAME] SET RECOVERY SIMPLE;
• EXASOL: 6.1
• Postgres: 9.6

Please feel free to contact us for the support of other versions.

3.4.1. Install DVB Structures on Client-Database

To complete the setup of the Datavault Builder, some objects (mainly tables for config and logging, and metadata views) need to be installed on the target database. When using a separate, non-containered version of the database, this can be achieved by starting up the containerized version of the client database once, connecting into the container and executing the install scripts.

Using the Helper Container
1. Start up the client database container.
2. Connect to the running container in the shell: docker-compose exec clientdb*** bash
3. Change into the folder with the source code: cd /dvb_sql/
4. Execute the shell script which deploys the source code. Depending on the database, the script execution may vary:
• EXASOL: db_update.sh domain:Port username password
• ORACLE: ./create_db_sh 'sys/"$DB_PASSWD"@\"localhost:1521/$DB_PDB.$DB_DOMAIN\" as sysdba'
• MSSQL: ./create_db.sh [protocol:]server[instance_name][,port]] [username] [password] [database name] [authenticator password]
Hint: If the named database does not exist, it will be created.
5. Alter the password for the authenticator:
• MSSQL: ALTER LOGIN authenticator WITH PASSWORD = '$AUTHENTICATOR_PASSWORD';

Database Specific Settings
MS SQL:
- Collation: SQL_Latin1_General_CP1_CS_AS
- Login databaseowner user
- Login "authenticator" (does not need specific grants)
- Compatibility level 130 or higher: EXEC dbo.sp_dbcmptlevel @DB, 130
- Set Recovery mode to simple unless taking care of backing up transaction logs: ALTER DATABASE [DB_NAME] SET RECOVERY SIMPLE;


3.5. Limitations

Oracle:
- Does not support transactions for DDL statements. Therefore, it can happen that structures remain in an inconsistent state when an error occurs.
- Loading of CLOBs and BLOBs into the Data Vault is currently not supported.
- Data Viewer / Data Preview does not (fully) support tables/columns with CLOBs and BLOBs.

Exasol:
- Missing Impersonate functionality does not allow creation of different usergroups yet.
- Transactions are deactivated due to unexpected locking behaviour.

3.6. Setting up Backups

3.6.1. Database

Necessary structures and data to back up the state of the installation on the database level. As the Datavault Builder does not have a separate metadata repository, you can also simply back up the current model by creating an export of your database structures, including some configuration tables with data.

Important hints The system passwords are usually encrypted. If you would like to be able to restore the state and make use of the stored passwords, make sure to also back up the Datavault Builder configuration (e.g. the yml-file) and the encryption keys and passwords.

Backup without loaded data
1. Database structures from the following schemas:
• staging
• datavault_staging
• datavault
• businessobjects
• business_rules
• accesslayer
• access_errormart
• dvb_log
• dvb_core
• (sys - triggers)

2. Database structures and data from the following objects:
• dvb_config.config
• dvb_config.system_data
• dvb_config.system_colors
• dvb_config.auth_users
• dvb_config.job_data
• dvb_config.job_loads
• dvb_config.job_schedules
• dvb_config.job_sql_queries
• dvb_config.job_triggers
• dvb_config._dvb_runtime_bookmark


Backup with loaded data To create a backup including the loaded data, make sure to also backup the data from the objects in the schemas mentioned in part 1. above.

Make sure in this case to include the following tables, as they contain load relevant status information:
• dvb_log.staging_load_log
• dvb_log.datavault_load_log
• dvb_log.job_load_log
• datavault._dvb_runtime_load_data
• staging._dvb_runtime_table_status

4. Testing

For interested customers, we offer the possibility to test the Datavault Builder themselves and get a first hands-on experience. The installation is usually set up within our cloud.

4.1. Connecting to a source

Since your test environment is hosted on our server, please make sure that you do not load any confidential data!

Databases Besides that, in the staging area you can connect to any source by using Add Source. You only have to make sure that your source can be reached from the outside and that, for instance, no firewall is blocking the access.

Files (CSV) When connecting to a CSV file, you have to indicate a directory. In the test environment, this directory is located on our server. Because of this, CSV files need to be uploaded into your test environment prior to staging.

To do so, we have enabled a WebDAV connection.
1. We usually use "Cyberduck" as a WebDAV client on our machines. You can download it from their website for free.
2. Start Cyberduck and create a new connection.
• Connection-Type: WebDAV (HTTPS)
• Server: Same as what you use to connect with the GUI in Chrome (something like yourenv.2150.ch)
• Username + Password: Same as you use for login into the GUI.
• Extended Options: Make sure to set the path to "/files"


3. Click onto "Connect" and upload your files into the folder.
4. When adding the new source in the Datavault Builder GUI, select "CSV" as source type and specify "/files" as directory. Add the source.
5. You can now add staging tables for each file within the folder.

If you are missing a source, please feel free to quickly drop us a line using the support-link in the Datavault Builder. We will be glad to add that source for you.

5. Integration with other applications

The Datavault Builder offers possibilities to be integrated within your company's IT landscape. Please note that this article is work in progress and only highlights certain integration options.

5.1. REST-API

Every action in the Datavault Builder calls a REST-API. Therefore, every action you can perform within the GUI can also be performed directly by calling the REST-API.

Using a standard user One possibility is to write a script which first logs in to the Datavault Builder, just as you do with the User Interface, by calling the login API. This API will then return an authentication token, which needs to be sent with each and every request. For security reasons, these tokens time out after a certain while. Therefore, when writing your script, make sure to call renewToken before the login expires.

Using a technical user Another possibility is to make use of a technical user. A technical user is a role which, after login, will receive an access token without an expiration. With these long-living tokens you can then create your REST-API calls without having to worry about token timeouts.

1. As a prerequisite to keep the token stable even when restarting the Datavault Builder, you need to define an additional environment variable in the docker-compose file, the JWT_Secret. By default, the JWT_Secret is assigned randomly on startup, leading to different tokens being generated.


api:
  environment:
    - JWT_SECRET=yoursecret

2. Then a technical user can be created by calling the function dvb_core.f_create_user in the command line. Make sure to set the parameter "non_expiring_token" to TRUE.
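A minimal sketch of such a call, assuming named-argument notation is supported and using placeholder values only; the full parameter list of f_create_user may differ between versions, so treat this purely as an illustration:

SELECT dvb_core.f_create_user(
    username           => 'api_service',            -- placeholder technical user name
    email              => 'api.service@example.com', -- placeholder email
    password           => 'LongRandomPassword',
    pg_user            => 'dvb_user',
    full_name          => 'API Service User',
    password_expires   => NULL,                      -- assumption: no password expiry for a technical user
    non_expiring_token => TRUE
);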

3. Call the login-API to get your non-expiring token. Call-Body: {username: "technical_username", password: "SuperPassword"}

Response-Body: {"token":"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6InRlY2huaWNhbF91c2VybmFtZSIsInJvbGUiOiJkdmJfZGV2In0.Pc_yrMuwUnZoP2iZD_QA93BB_yTgWHy4XUoglf7DJco"}

4. This token can now be used in the header of any REST-Call made to the Datavault Builder API, without requiring a fresh login.

It is recommended to still expire these tokens after a while. To do so, you can change the JWT_SECRET in the docker-compose file and restart the API (can be done by calling "docker-compose up -d"). Thereby all previously issued tokens are invalidated.

5.2. Using a separate scheduler

Insert description text here...

5.3. Using a separate staging-tool

Insert description text here...

Appendix A. System Structures on Processing Database

The Datavault Builder relies on a diverse set of tables, views and functions on the processing database. As each database offers different possibilities for the technical implementation of the Data Vault based integration flow, these structures can slightly differ depending on the used database. Therefore, this chapter explains the structures (tables and views) that are the same overall, which can also be used as a source for metadata analytics.

Be aware that changes to these structures can affect system stability and performance and may cause failures!

Upper- or lowercase naming is based on the used database technology (e.g. Postgres is lowercase, Exasol is uppercase).

A.1. Metadata Views

For performance reasons, for certain types of client databases (e.g. Exasol), the metadata views are materialized to speed up the performance in the GUI. If the client database does not actually support materialized views, then a table is created instead with the name of the metadata view, and the actual view will have the postfix '_v'.
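As a simple illustration of using these views for metadata analytics, the following query lists all hubs together with their subject area (column names as documented below):

SELECT hub_id, hub_name, hub_subject_area_name
FROM dvb_core.hubs
ORDER BY hub_subject_area_name, hub_name;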


dvb_core.access_errormart
The metadata view lists all access errormart views.

Column Meaning

access_errormart_id Complete technical ID of the access errormart. Including Schema.
access_errormart_name Displayed Name of the access errormart.
functional_suffix_id Technical suffix for the access errormart.
functional_suffix_name Displayed Name of the suffix.
business_ruleset_suffix_id Technical suffix of the underlying business ruleset.
business_ruleset_suffix_name Displayed Name of the suffix of the underlying business ruleset.
parent_hub_id Technical non-qualifying ID of the underlying hub. Without Schema.
parent_hub_name Displayed Name of the underlying hub.
access_errormart_comment Custom comment about the access errormart.

dvb_core.accesslayers
The metadata view lists all access layers.

Column Meaning

accesslayer_id Complete technical ID of the access layer. Including Schema.
accesslayer_name Displayed Name of the access layer.
functional_suffix_id Technical suffix for the access layer.
functional_suffix_name Displayed Name of the suffix.
parent_hub_id Technical non-qualifying ID of the underlying hub. Without Schema.
parent_hub_name Displayed Name of the underlying hub.
accesslayer_comment Custom comment about the access layer.

dvb_core.business_rules
The metadata view lists all business rulesets.

Column Meaning

business_ruleset_view_id Complete technical ID of the business ruleset. Including Schema.
functional_suffix_id Technical suffix of the underlying business object.
functional_suffix_name Displayed Name of the suffix of the underlying business object.
business_ruleset_suffix_id Technical suffix of the business ruleset.
business_ruleset_suffix_name Displayed Name of the Suffix for the business ruleset.
system_id Technical ID of the system of the granularity satellite.
system_name Displayed Name of the system.
system_color Displayed color of the system.
system_comment Custom comment about the system.
start_hub_id Technical non-qualifying ID of the underlying hub. Without Schema.
start_hub_name Displayed Name of the underlying hub.
related_businessobject_view_id Technical ID of the underlying business object. Including Schema.
business_rules_comment Custom comment about the business ruleset.
business_ruleset_name Displayed Name of the business ruleset.
business_rules_view_code Defining SQL Code of the business ruleset.
is_error_ruleset Flag, if ruleset is to be delivered into access errormart.

include_in_accesslayer Flag, if ruleset is to be delivered into access layer.

accesslayer_priorization Prioritization number of the dataset in the accesslayer. Lower number means higher prioritization.
quick_inserts Configuration of business ruleset editor quick inserts. Stored as JSON.

dvb_core.businessobjects
The metadata view lists all business objects.

Column Meaning

businessobject_view_id Complete technical ID of the business object. Including Schema.
functional_suffix_id Technical suffix of the business object.
functional_suffix_name Displayed Name of the suffix for the business object.
system_id Technical ID of the system of the granularity satellite.
system_name Displayed Name of the system.
system_color Displayed color of the system.
system_comment Custom comment about the system.
businessobject_comment Custom comment about the business object.
start_hub_id Technical non-qualifying ID of the underlying hub. Without Schema.
start_hub_name Displayed Name of the underlying hub.
businessobject_name Technical ID of the underlying business object. Including Schema.
businessobject_structure Stored Metainformation on the used elements in the business object. Stored as JSON.

dvb_core.columns
The metadata view lists all columns in the Datavault Builder Schemas.

Column Meaning

schema_id Technical ID of the Schema.
table_nq_id Technical ID of the table. Without Schema.
column_nq_id Technical ID of the column. Without Schema and table.
column_id Technical ID of the column. Including Schema and table.
column_name Displayed Name of the column.
column_comment Custom comment about the column.
data_type Data type of the column.
data_type_id Technical ID of the data type.
character_maximum_length Maximum length of the field. If applicable.
numeric_precision Numeric precision of the field. If applicable.
numeric_scale Numeric scale of the field. If applicable.
datetime_precision Datetime precision of the field. If applicable.
ordinal_position Position of the column within the table/view.
complete_datatype Datatype including precision and scale if applicable.

dvb_core.hub_loads
The metadata view lists all defined loads into hubs.

Column Meaning

hub_load_id Technical ID of the hub load. Without Schema.
hub_id Technical ID of the hub. Without Schema.
technical_business_key SQL Statements which will compose the value of the business key.
short_business_key Displayed business key configuration.

hub_load_list_entry Displayed name of the hub load.
system_id Technical ID of the source system.
system_name Displayed name of the system.
system_color Displayed color of the system.
system_comment Custom comment about the system.
staging_table_id Technical ID of the staging table to stage from. Including Schema.
staging_table_view_id_hash CRC32-Hash of the staging table view id.
keys_are_unique Uniqueness is checked when loading the hub.
keys_are_prehashed Keys are already hashed in the source.
datavault_category_id Technical ID of the related datavault category to load (PSA, Raw Vault, Business Vault)

dvb_core.hubs
The metadata view lists all hubs.

Column Meaning

hub_id Technical ID of the hub. Without Schema.
hub_name Displayed name of the hub.
boid Technical ID of the hub without hub identifying prefix (h_) and Schema.
hub_id_of_alias_parent Technical ID of the parent hub if the hub is an alias. Without Schema.
hub_name_of_alias_parent Displayed name of the parent hub if the hub is an alias.
hub_subject_area_name Displayed subject area for the hub.
hub_comment Custom comment about the hub.
hub_is_prototype Flag, if hub is just a prototype (without load).

dvb_core.jobs
The metadata view lists all jobs.

Column Meaning

job_id Technical ID of the job.
system_id Technical ID of the system the job belongs to.
system_name Displayed name of the system.
system_color Displayed color of the system.
system_comment Custom comment about the system.
job_name Displayed name of the job.
job_suffix_id Technical suffix of the job.
job_type Add new loads automatically. Values: Auto or manual.
parallel_loads Max. number of globally running loads, which limits another load of the job to be initiated.
job_comment Custom comment about the job.
job_enabled Flag, if job is active or not.
next_run Over all schedules the next planned runtime.
last_run Last initiation time.
last_run_duration Duration of the last job load.

dvb_core.latest_datavault_load_info
The metadata view lists the current load info of all datavault loads which were initiated at least once.

Column Meaning


load_entry_time Initiation time, when the load was sent into the loading queue.
object_id Technical ID of the datavault object (hub, link or satellite). Without Schema.
staging_table_id Technical ID of the staging table to load from. Including Schema.
load_start_time Last initiation time of the load.
load_end_time Last end time of the load.
load_duration Last duration of the load.
load_state Current state of the load.
load_result Result of the last load.
load_progress Number of rows which have currently been processed within the load.
load_total_rows Number of rows to process within the load. Does not have to be equal to the actual number of rows which are inserted (i.e. for hubs only new entries are added)
login_username Displayed name of the initiating user.
job_id Technical ID of the job within which the load was initiated, if applicable.
pid Process ID in the core, which initiated the load.

dvb_core.latest_job_load_info
The metadata view lists the current load info of all jobs which were initiated at least once.

Column Meaning

load_entry_time Initiation time, when the job was initiated.
job_id Technical ID of the job.
latest_load_start_time Time, when the last job started to initiate loads.
latest_load_end_time Time, when the last job run finished.
succeeded_load_start_time Time, when the last successful load started to load.
succeeded_load_end_time Time, when the last successful job run finished.
failed_load_start_time Time, when the last failed job started to load.
succeeded_load_duration Duration of the last successful job run.
current_loading_duration Duration of the current job run.
load_state Current state of the job.
load_result Result of the last job run.
username Displayed name of the initiating user.
where_clause_parameters Key-Value pairs of parameters which are filled into the where clause for delta loading into staging. Stored as JSON.
is_delta_load Flag, if the current run is executed as a delta load.
pid Process ID in the core, which initiated the job.

dvb_core.latest_staging_load_info
The metadata view lists the current load info of all staging loads which were initiated at least once.

Column Meaning

load_entry_time Initiation time, when the load was initiated.
staging_table_id Technical ID of the staging table. Including Schema.
source_table_id Technical ID of the table / view in the source system.
latest_load_start_time Time, when the last load was started.
latest_load_end_time Time, when the last load ended.
succeeded_load_start_time Time, when the last successful load was started.
succeeded_load_end_time Time, when the last successful load ended.
failed_load_start_time Time, when the last failed load was started.

succeeded_load_duration Duration of the last successful load.
current_loading_duration Duration of the current load.
load_state Current state of the load.
load_result Last result of the load.
load_progress Number of loaded rows.
load_total_rows Number of rows to load.
load_progress_percent Percentage processed rows in relation to total rows.
username Displayed name of the initiating user.
from_system_load Flag, if the staging load was initiated as part of a total system load in staging.
job_id Technical ID of the initiating job if applicable.
pid Process ID in the core, which initiated the load.

dvb_core.link_loads
The metadata view lists all link loads.

Column Meaning

link_load_id Technical ID of the link load. Without Schema.
link_id Technical ID of the link. Without Schema.
system_id Technical ID of the source system.
system_name Displayed name of the system.
system_color Displayed color of the system.
system_comment Custom comment about the system.
staging_table_id Technical ID of the staging table. Including Schema.
staging_view_id_hash CRC32-Hash of the staging table view id.

dvb_core.links
The metadata view lists all links.

Column Meaning

table_nq_id Technical ID of the link table. Without Schema.
link_id Technical ID of the link without schema and technical relevant suffix(es).
boid Technical ID of the link without link identifying prefix (l_) and Schema.
hub_a_boid Technical ID of the first hub without hub identifying prefix (h_) and Schema.
hub_a_name Displayed name of the first hub.
hub_b_boid Technical ID of the second hub without hub identifying prefix (h_) and Schema.
hub_b_name Displayed name of the second hub.
link_suffix_id Technical suffix of the link.
link_suffix_name Displayed Name of the suffix for the link.
link_type Declared / Expected type of link or dummy-entries "linking" hubs to alias hubs. (i.e. many_to_one, many_to_many ...)
link_subject_area_name Displayed subject area for the link.
link_comment Custom comment about the link.
link_name Displayed name of the link.
link_is_prototype Flag, if link is just a prototype (without load).

dvb_core.linksatellites
The metadata view lists all link satellites.


Column Meaning

table_nq_id Technical ID of the linksatellite table. Without Schema.
linksatellite_id Technical ID of the linksatellite without schema and technical relevant suffix(es).
system_id Technical ID of the source system.
system_name Displayed name of the system.
system_color Displayed color of the system.
system_comment Custom comment about the system.
functional_suffix_id Technical suffix of the linksatellite.
functional_suffix_name Displayed Name of the suffix for the linksatellite.
linksatellite_subject_area_name Displayed subject area for the linksatellite.
linksatellite_comment Custom comment about the linksatellite.
parent_link_id Technical ID of the parent link.
parent_link_name Displayed name of the parent link.
linksatellite_name Displayed name of the linksatellite.

dvb_core.load_log_datavault
The metadata view lists the final result of datavault loads without intermediate steps (such as waiting or loading).

Column Meaning

load_entry_id Identifying log sequence number.
load_entry_time Initiation time of load.
object_id Technical ID of the datavault object (hub, link or satellite). Without Schema.
staging_table_id Technical ID of the staging table to load from. Including Schema.
load_start_time Start time of the load.
load_end_time End time of the load.
duration Duration of the load.
load_state Final state of the load.
load_result Final result of the load.
load_total_rows Total rows to process within the load.
login_username Displayed name of the initiating user.
load_progress Number of rows which have been processed within the load.
pg_username User group of initiating user.

dvb_core.load_log_staging
The metadata view lists the final result of staging loads without intermediate steps (such as waiting or loading).

Column Meaning

load_entry_id Identifying log sequence number.
load_entry_time Initiation time of load.
source_table_id Technical ID of the table / view in the source system.
staging_table_id Technical ID of the staging table to load from. Including Schema.
load_start_time Start time of the load.
load_end_time End time of the load.
duration Duration of the load.
load_progress Number of rows which have been processed within the load.


load_total_rows Total rows to process within the load.
load_state Final state of the load.
load_result Final result of the load.
login_username Displayed name of the initiating user.
pg_username User group of initiating user.

dvb_core.satellites
The metadata view lists all satellites.

Column Meaning

table_nq_id Technical ID of the satellite table. Without Schema.
satellite_id Technical ID of the satellite without schema and technical relevant suffix(es).
boid Technical ID of the parent hub without hub identifying prefix (h_) and Schema.
system_id Technical ID of the source system.
system_name Displayed name of the system.
system_color Displayed color of the system.
system_comment Custom comment about the system.
functional_suffix_id Technical suffix of the satellite.
functional_suffix_name Displayed Name of the suffix for the satellite.
satellite_subject_area_name Displayed subject area for the satellite.
satellite_comment Custom comment about the satellite.
staging_table_id Technical ID of the staging table to stage from. Including Schema.
parent_hub_id Technical ID of the parent hub. Without Schema.
parent_hub_name Displayed name of the parent hub.
satellite_name Displayed name of the satellite.
datavault_category_id Technical ID of the related datavault category to load (PSA, Raw Vault, Business Vault).

dvb_core.staging_tables
The metadata view lists all staging tables.

Column Meaning

staging_table_id    Technical ID of the staging table to stage from. Including Schema.
staging_table_name    Displayed name of the staging table.
staging_table_display_string    Displayed name of the staging table including technical ID.
staging_table_comment    Custom comment about the staging table.
staging_table_type_name    Type name of the staging table (e.g. Table or View).
staging_table_type_id    Type id of the staging table (e.g. r for Table or v for View).
schema_id    Technical ID for the schema of the staging table.
schema_name    Displayed name of the schema.
system_id    Technical ID of the source system.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.
source_table_id    Technical ID of the source table / view loading from. Including Schema.
source_name    Displayed name of the source table / view.
source_table_type_id    Type id of the source table (i.e. TABLE or VIEW).

source_object_id    Technical ID of the source object in the source.
source_schema_id    Technical ID for the schema of the source object.
batch_size    Batch size to stage the table with (-1 if not defined).
where_clause_general_part    SQL based where clause, which is added to each staging load, defining a subset from the source which can be loaded like "full".
where_clause_delta_part_template    SQL based where clause template added to the where clause sent to the source when loading delta.
is_delta_load    Flag, if currently loaded data in the staging table is a delta.
is_up_to_date    Flag, if all Business Keys and Hashes have been calculated for loaded data.
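As an illustration, the view can be used to check which staging tables still need their business keys and hashes recalculated. A minimal sketch, assuming direct SQL access and that the flag columns are boolean; adjust the predicate if your installation stores them differently:

    -- Staging tables whose loaded data is not yet fully hashed / keyed.
    SELECT staging_table_id,
           system_name,
           is_delta_load,
           is_up_to_date
    FROM   dvb_core.staging_tables
    WHERE  NOT is_up_to_date
    ORDER BY system_name, staging_table_id;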

dvb_core.subject_area_name
The metadata view lists all subject areas.
Column Meaning

subject_area_name    Displayed name of a subject area.

dvb_core.system_connections
The metadata view lists all systems with connection properties.

Column Meaning

system_id    Technical ID of the source system.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.
source_type_api    Technical ID of the API type to connect through (e.g. jdbc_driver, rest_api).
source_type_id    Technical ID of the source type to connect to (e.g. postgres, mssql).
source_type_url    Technical string used to connect to the source.
source_type_parameters    Parameters to connect to the source. Stored as JSON.

dvb_core.systems
The metadata view lists all user defined systems.

Column Meaning

system_id    Technical ID of the source system.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.

dvb_core.tables
The metadata view lists all tables in the Datavault Builder schemas.

Column Meaning

table_id    Technical ID of the table. Including schema.
table_nq_id    Technical ID of the table. Without schema.
schema_id    Technical ID for the schema of the table.
schema_name    Displayed name for the schema of the table.
table_name    Displayed name of the table.
table_comment    Custom comment about the table.
type_id    Type id of the table (e.g. r for Table or v for View).
type_name    Type name of the table (e.g. Base Table or View).

system_id    Technical ID of the source system.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.
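For example, to get a quick overview of how many objects exist per schema (a sketch, assuming direct SQL access to the processing database):

    -- Count Datavault Builder tables and views per schema.
    SELECT schema_name,
           count(*) AS object_count
    FROM   dvb_core.tables
    GROUP BY schema_name
    ORDER BY object_count DESC;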

dvb_core.tables_simple
The metadata view lists all tables in the Datavault Builder schemas without display names and comments.

Column Meaning

table_id    Technical ID of the table. Including schema.
schema_id    Technical ID for the schema of the table.
table_nq_id    Technical ID of the table. Without schema.
type_id    Type id of the table (e.g. r for Table or v for View).
type_name    Type name of the table (e.g. Base Table or View).
system_id    Technical ID of the source system.

dvb_core.tracking_satellites
The metadata view lists all tracking satellites.

Column Meaning

table_nq_id    Technical ID of the tracking satellite table. Without Schema.
tracking_satellite_id    Technical ID of the tracking satellite without schema.
tracking_satellite_name    Displayed name of the tracking satellite.
tracked_object_id    Technical ID of the tracked datavault object. Without Schema.
tracked_object_name    Displayed name of the tracked datavault object.
is_tracking_a_link    Flag, if the tracked object is a link.
is_tracking_a_satellite    Flag, if the tracked object is a satellite.
is_delta_load_satellite    Flag, if the tracking satellite is tracking changes of delta loads.
is_full_load_satellite    Flag, if the tracking satellite is tracking changes of full loads.
last_full_load_time    Timestamp of the previous full load of the tracking satellite.
system_id    Technical ID of the source system.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.
tracking_satellite_subject_area_name    Displayed subject area for the tracking satellite.
tracking_satellite_comment    Custom comment about the tracking satellite.
staging_table_id    Technical ID of the staging table to stage from. Including Schema.
parent_hub_id    Technical ID of the parent hub. Without Schema.
parent_hub_name    Displayed name of the parent hub.
satellite_name    Displayed name of the satellite.
datavault_category_id    Technical ID of the related datavault category to load (PSA, Raw Vault, Business Vault).

dvb_core.transaction_link_relations
The metadata view lists all transaction link relations. These are just virtual relations.

Column Meaning

transaction_link_relation_id    Technical ID of the transaction link relation.
transaction_link_id    Technical ID of the related transaction link.
transaction_link_name    Displayed name of the transaction link.
linked_hub_id    Technical ID of the connected hub.
linked_hub_name    Displayed name of the connected hub.
link_type    Dummy-entries "linking" transaction links to hubs (e.g. many_to_one, many_to_many ...).

dvb_core.transaction_links
The metadata view lists all transaction links.

Column Meaning

table_nq_id    Technical ID of the transaction link table. Without Schema.
transaction_link_id    Technical ID of the transaction link.
boid    Technical ID of the parent hub without hub identifying prefix (h_) and Schema.
system_id    Technical ID of the source system.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.
functional_suffix_id    Technical suffix of the transaction link.
functional_suffix_name    Displayed name of the suffix for the transaction link.
transaction_link_subject_area_name    Displayed subject area for the transaction link.
transaction_link_comment    Custom comment about the transaction link.
transaction_link_type    Declared / Expected type of link (e.g. transaction_link).
staging_table_id    Technical ID of the staging table to stage from. Including Schema.
parent_hub_id    Technical ID of the parent hub. Without Schema.
parent_hub_name    Displayed name of the parent hub.
transaction_link_name    Displayed name of the transaction link.
datavault_category_id    Technical ID of the related datavault category to load (PSA, Raw Vault, Business Vault).

dvb_core.view_relations
The metadata view lists all view dependencies in the Datavault Builder schemas.

Column Meaning

table_id    Technical ID of the referenced table/view. Including schema.
table_schema_id    Technical ID for the schema of the referenced table/view.
table_nq_id    Technical ID of the referenced table/view. Without schema.
table_type_id    Type id of the referenced table/view (e.g. r for Table or v for View).
dependent_view_id    Technical ID of the dependent view. Including schema.
dependent_view_schema_id    Technical ID for the schema of the dependent view.
dependent_view_nq_id    Technical ID of the dependent view. Without schema.
dependent_view_type_id    Type id of the dependent view (e.g. v for View or m for Materialized View).
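This view is handy for impact analysis before changing an object. A minimal sketch, assuming direct SQL access; the table ID used in the filter is only an illustrative example:

    -- Find all views that directly depend on a given table or view.
    SELECT dependent_view_id,
           dependent_view_type_id
    FROM   dvb_core.view_relations
    WHERE  table_id = 'dvb_core.hubs';   -- example object, replace with your own table/view ID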

dvb_core.views
The metadata view lists all views in the Datavault Builder schemas.


Column Meaning

view_id    Technical ID of the view.
view_nq_id    Technical ID of the view. Without schema.
schema_id    Technical ID for the schema of the view.
schema_name    Displayed name for the schema of the view.
view_is_materialized    Flag, if the view is materialized.
view_code    SQL code of the view definition.
metadata_name    Value of "name"-key in JSON comment of view.
metadata_comment    Value of "comment"-key in JSON comment of view.
metadata_businessobject_structure    Value of "businessobject_structure"-key in JSON comment of view.
metadata_quick_inserts    Value of "quick_inserts"-key in JSON comment of view.
metadata_code    Value of "code"-key in JSON comment of view.
metadata_is_error_ruleset    Value of "is_error_ruleset"-key in JSON comment of view.
metadata_include_in_accesslayer    Value of "include_in_accesslayer"-key in JSON comment of view.
metadata_accesslayer_priorization    Value of "accesslayer_priorization"-key in JSON comment of view.
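For instance, the view can be used to list which views are flagged for the accesslayer together with their priorization. A sketch, assuming direct SQL access; whether the metadata key is stored as text or boolean is an assumption, adjust the comparison to your installation:

    -- Views marked for inclusion in the accesslayer.
    SELECT view_id,
           metadata_name,
           metadata_accesslayer_priorization
    FROM   dvb_core.views
    WHERE  metadata_include_in_accesslayer = 'true'   -- assumed text representation of the JSON value
    ORDER BY view_id;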

dvb_core.x_business_rules_distinct
Helper view to get a distinct subset of business ruleset properties.

Column Meaning

start_hub_id    Technical non-qualifying ID of the underlying hub. Without Schema.
start_hub_name    Displayed name of the underlying hub.
functional_suffix_id    Technical suffix of the underlying business object.
functional_suffix_name    Displayed name of the suffix of the underlying business object.

dvb_core.x_business_rules_system
Helper view to get the per-system subset of business ruleset properties.

Column Meaning

start_hub_id    Technical non-qualifying ID of the underlying hub. Without Schema.
start_hub_name    Displayed name of the underlying hub.
functional_suffix_id    Technical suffix of the underlying business object.
functional_suffix_name    Displayed name of the suffix of the underlying business object.
system_id    Technical ID of the system of the granularity satellite.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.
related_businessobject_view_id    Technical ID of the underlying business object. Including Schema.

dvb_core.x_businessobjects_distinct
Helper view to get a distinct subset of business object properties.

Column Meaning

start_hub_id    Technical non-qualifying ID of the underlying hub. Without Schema.
start_hub_name    Displayed name of the underlying hub.
functional_suffix_id    Technical suffix of the business object.

functional_suffix_name Displayed Name of the suffix of the business object.

dvb_core.x_businessobjects_system
Helper view to get the per-system subset of business object properties.

Column Meaning

start_hub_id    Technical non-qualifying ID of the underlying hub. Without Schema.
functional_suffix_id    Technical suffix of the underlying business object.
system_id    Technical ID of the system of the granularity satellite.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.
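The _distinct and _system helper views can be combined to list each business object together with the systems it draws from. A minimal sketch, assuming the two views join on the underlying hub and suffix columns:

    -- Business objects and the systems of their granularity satellites.
    SELECT d.start_hub_name,
           d.functional_suffix_name,
           s.system_name
    FROM   dvb_core.x_businessobjects_distinct AS d
    JOIN   dvb_core.x_businessobjects_system   AS s
           ON  s.start_hub_id = d.start_hub_id
           AND s.functional_suffix_id = d.functional_suffix_id
    ORDER BY d.start_hub_name, s.system_name;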

dvb_core.x_hubs_distinct
Metadata view to get a distinct subset of hub properties.

Column Meaning

hub_id    Technical ID of the hub. Without Schema.
boid    Technical ID of the hub without hub identifying prefix (h_) and Schema.
hub_name    Displayed name of the hub.
hub_name_of_alias_parent    Displayed name of the parent hub if the hub is an alias.
hub_subject_area_name    Displayed subject area for the hub.
hub_comment    Custom comment about the hub.

dvb_core.x_hubs_system
Metadata view to get the per-system subset of hub properties.

Column Meaning

hub_id    Technical ID of the hub. Without Schema.
system_id    Technical ID of the source system.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.

dvb_core.x_jobs_system
Metadata view to get the per-system subset of job properties.

Column Meaning

system_id    Technical ID of the system the job belongs to.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.

dvb_core.x_latest_load_info
The metadata view lists the current load info of all staging and datavault loads which were initiated at least once.

Column Meaning


entry_time    Initiation time, when the load was initiated.
source_id    Technical ID of the source table / staging table. Including Schema.
target_id    Technical ID of the staging table / datavault object.
start_time    Time, when the last load was started.
end_time    Time, when the last load ended.
duration    Duration of the current load.
state    Current state of the load.
result    Last result of the load.
progress    Number of loaded rows if applicable.
total_rows    Number of rows to load.
username    Displayed name of the initiating user.
job_id    Technical ID of the initiating job if applicable.
pid    Process ID in the core, which initiated the load.
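This is the typical view for a quick operational health check. A minimal sketch, assuming direct SQL access; the result value compared against is an assumption and may differ on your installation:

    -- Show all loads whose most recent run did not end successfully.
    SELECT target_id,
           entry_time,
           state,
           result,
           duration
    FROM   dvb_core.x_latest_load_info
    WHERE  result <> 'success'   -- assumed result value, adjust to your installation
    ORDER BY entry_time DESC;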

dvb_core.x_links_distinct
Metadata view to get a distinct subset of link properties.
Column Meaning

link_id    Technical ID of the link without schema and technical relevant suffix(es).
boid    Technical ID of the link without link identifying prefix (l_) and Schema.
hub_a_boid    Technical ID of the first hub without hub identifying prefix (h_) and Schema.
hub_a_name    Displayed name of the first hub.
hub_b_boid    Technical ID of the second hub without hub identifying prefix (h_) and Schema.
hub_b_name    Displayed name of the second hub.
link_suffix_id    Technical suffix of the link.
link_suffix_name    Displayed name of the suffix for the link.
link_type    Declared / Expected type of link or dummy-entries "linking" hubs to alias hubs (i.e. many_to_one, many_to_many ...).
link_subject_area_name    Displayed subject area for the link.
link_comment    Custom comment about the link.
link_name    Displayed name of the link.
link_is_prototype    Flag, if link is just a prototype (without load).

dvb_core.x_links_system
Metadata view to get the per-system subset of link properties.

Column Meaning

link_id    Technical ID of the link without schema and technical relevant suffix(es).
system_id    Technical ID of the system the link belongs to.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.

dvb_core.x_satellites_system
Metadata view to get the per-system subset of satellite properties.

Column Meaning


satellite_id    Technical ID of the satellite without schema and technical relevant suffix(es).
boid    Technical ID of the parent hub without hub identifying prefix (h_) and Schema.
system_id    Technical ID of the source system.
system_name    Displayed name of the system.
system_color    Displayed color of the system.
system_comment    Custom comment about the system.
functional_suffix_id    Technical suffix of the satellite.
functional_suffix_name    Displayed name of the suffix for the satellite.
parent_hub_id    Technical ID of the parent hub. Without Schema.
parent_hub_name    Displayed name of the parent hub.

dvb_core.y_blocking_procs
Helper view to debug errors related to blocking processes.

Column Meaning

blocking_pid    Process ID of the blocking query.
blocking_user    User ID of the blocking query.
blocking_query    SQL query which is preventing the execution.
blocked_pid    Process ID of the blocked query.
blocked_user    User ID of the blocked query.
blocked_query    SQL query which is prevented from execution.
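If a modelling or load action hangs, this view shows which session is holding it up. A minimal sketch, assuming direct SQL access to the processing database:

    -- Show which queries are currently blocking other queries.
    SELECT blocking_pid,
           blocking_user,
           blocking_query,
           blocked_pid,
           blocked_query
    FROM   dvb_core.y_blocking_procs;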

A.2. Log Tables
The Datavault Builder consistently logs metadata about modelling and loading actions into different logs.

dvb_log.datavault_load_log
The log lists the historized load info of all datavault loads which were initiated at least once.
Column Meaning

load_entry_id    Sequence ID of the log entry.
load_entry_time    Initiation time, when the load was sent into the loading queue.
object_id    Technical ID of the datavault object (hub, link or satellite). Without Schema.
staging_table_id    Technical ID of the staging table to load from. Including Schema.
load_start_time    Initiation time of the actual loading process.
load_end_time    End time of the load.
load_state    Current state of the load.
load_result    Result of the last load.
load_progress    Number of rows which have been processed within the load.
load_total_rows    Number of rows to process within the load. Does not have to be equal to the actual number of rows which are inserted (i.e. for hubs only new entries are added).
login_username    Displayed name of the initiating user.
pg_username    Technical user group of the initiating user.
job_id    Technical ID of the job within which the load was initiated, if applicable.
pid    Process ID in the core, which initiated the load.

dvb_log.ddl_log
The log lists the DDL statements executed on the client database.

Column Meaning

log_entry_id    Sequence ID of the log entry.
log_timestamp    Timestamp of the log entry.
source    Internal category of the trigger (e.g. event_trigger, refresh).
type    DDL type of the executed query (e.g. CREATE TABLE, COMMENT).
object_id    Technical ID of the modified object.
login_username    Displayed name of the initiating user.
pg_username    Technical user group of the initiating user.

dvb_log.dvbuilder_creation_log
The log lists all modelling function calls executed through the datavault builder core.

Column Meaning

log_entry_id    Sequence ID of the log entry.
log_timestamp    Timestamp of the log entry.
function_call    SQL based function call with all parameters to re-execute the modification.
function_name    Name of the function called in the core.
object_id    Technical ID of the modified object. Without Schema.
staging_table_id    Technical ID of the staging table if applicable. Including Schema.
login_username    Displayed name of the initiating user.
pg_username    Technical user group of the initiating user.

dvb_log.dvbuilder_log
The log lists all SQL actions executed through the datavault builder core.

Column Meaning

log_entry_id    Sequence ID of the log entry.
log_timestamp    Timestamp of the log entry.
login_username    Displayed name of the initiating user.
pg_username    Technical user group of the initiating user.
function_name    Name of the function called in the core.
action    Name of the performed action on the client database.
action_status    Resulting status of the execution.
attribute_list    List of attributes passed into the function call of the core.
error_message    Message of the occurred error if applicable.
query    Actual query executed on the client database.
explain    Further error details if applicable.
explain_analyze    Further error details if applicable.
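To troubleshoot a failed action, the log can be filtered for entries that carry an error message. A minimal sketch, assuming direct SQL access; LIMIT is PostgreSQL syntax, use the equivalent on other databases:

    -- Show the most recent core actions that ended with an error.
    SELECT log_timestamp,
           function_name,
           action,
           action_status,
           error_message
    FROM   dvb_log.dvbuilder_log
    WHERE  error_message IS NOT NULL
    ORDER BY log_timestamp DESC
    LIMIT  20;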

dvb_log.job_load_log
The log lists the historized load info of all jobs which were initiated at least once.
Column Meaning

load_entry_id    Sequence ID of the log entry.
load_entry_time    Initiation time, when the job was initiated.
job_id    Technical ID of the job.


load_start_time    Time, when the job started to initiate loads.
load_end_time    Time, when the job run finished.
load_state    State of the job.
load_result    Result of the job run.
login_username    Displayed name of the initiating user.
pg_username    Technical user group of the initiating user.
pid    Process ID in the core, which initiated the job.
where_clause_parameters    Key-Value pairs of parameters which are filled into the where clause for delta loading into staging. Stored as JSON.
is_delta_load    Flag, if the run was executed as delta load.
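As an example, the run history of a single job can be pulled up to check its durations and results. A sketch, assuming direct SQL access; the job ID in the filter is a hypothetical placeholder:

    -- Run history of one job, newest first.
    SELECT load_entry_time,
           load_start_time,
           load_end_time,
           load_state,
           load_result,
           is_delta_load
    FROM   dvb_log.job_load_log
    WHERE  job_id = 'my_job'   -- hypothetical job ID, replace with your own
    ORDER BY load_entry_time DESC;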

dvb_log.login_log
The log lists the historized login info of all login attempts in the GUI.
Column Meaning

log_entry_id    Sequence ID of the log entry.
log_timestamp    Timestamp of the log entry.
login_username    Displayed name of the user.
pg_username    Technical user group of the user if applicable.
login_status    Result of the login attempt.
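For a simple security check, the log can be filtered for unsuccessful login attempts. A minimal sketch, assuming direct SQL access; the status value compared against is an assumption and may differ on your installation:

    -- Recent login attempts that did not succeed.
    SELECT log_timestamp,
           login_username,
           login_status
    FROM   dvb_log.login_log
    WHERE  login_status <> 'success'   -- assumed status value, adjust to your installation
    ORDER BY log_timestamp DESC;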

dvb_log.staging_load_log
The log lists the historized load info of all staging loads which were initiated at least once.
Column Meaning

load_entry_id    Sequence ID of the log entry.
load_entry_time    Initiation time, when the load was initiated.
staging_table_id    Technical ID of the staging table. Including Schema.
source_table_id    Technical ID of the table / view in the source system.
system_id    Technical ID of the source system.
load_start_time    Time, when the load was started.
load_end_time    Time, when the load ended.
load_state    Current state of the load.
load_result    Last result of the load.
load_progress    Number of loaded rows.
load_total_rows    Number of rows to load.
load_progress_percent    Percentage of processed rows in relation to total rows.
login_username    Displayed name of the initiating user.
from_system_load    Flag, if the staging load was initiated as part of a total system load in staging.
pg_username    Technical user group of the initiating user.
pid    Process ID in the core, which initiated the load.
job_id    Technical ID of the initiating job if applicable.

Licences

Software            Container   Licence Type    Link
Postgres            Core        PostgreSQL      https://www.postgresql.org/about/licence/
Postgrest           API         MIT             https://github.com/begriffs/postgrest/blob/master/LICENSE
unirest-java        Core        MIT             https://github.com/Kong/unirest-java/blob/master/LICENSE
pl/java             Core        BSD 2 Clause    https://github.com/tada/pljava/wiki/PLJava-License
apache poi          Core        Apache 2.0      https://poi.apache.org/legal.html
csvjdbc             Core        LGPLv2          https://github.com/kwaxi/csvjdbc/blob/master/src/doc/license.txt
MSSQL JDBC          Core        MIT             https://github.com/Microsoft/mssql-jdbc/blob/dev/LICENSE
Oracle JDBC         Core        Proprietary     http://www.oracle.com/technetwork/licenses/distribution-license-152002.html
Ace Editor          Webgui      BSD             https://github.com/ajaxorg/ace/blob/master/LICENSE
Angular Materials   Webgui      MIT             https://material.angularjs.org/1.0.8/license
jQuery              Webgui      MIT             https://jquery.org/license/
jsZip               Webgui      MIT             https://github.com/Stuk/jszip/blob/master/LICENSE.markdown
PDFMake             Webgui      MIT             https://github.com/bpampuch/pdfmake/blob/master/LICENSE
rxjs                Webgui      Apache 2.0      https://github.com/Reactive-Extensions/RxJS/blob/master/license.txt
jsdiff              Webgui      BSD             https://github.com/kpdecker/jsdiff/blob/master/LICENSE
File-Save           Webgui      MIT             https://github.com/eligrey/FileSaver.js/blob/master/LICENSE.md
Dragability         Webgui      MIT             https://desandro.mit-license.org/
JS-Search           Webgui      MIT             https://github.com/bvaughn/js-search/blob/master/LICENSE.md
Google Angular      Webgui      MIT             https://github.com/angular/angular.js/blob/master/LICENSE
Nginx               Webgui      BSD 2 Clause    http://nginx.org/LICENSE
Angular UI Grid     Webgui      MIT             https://github.com/angular-ui/ui-grid/blob/master/LICENSE.md
PivotTable.js       Webgui      MIT             https://github.com/nicolaskruchten/pivottable/blob/master/LICENSE.md
D3-Sankey           Webgui      BSD 3 Clause    https://github.com/d3/d3-sankey/blob/master/LICENSE
D3-Chart            Webgui      MIT             https://github.com/misoproject/d3.chart/blob/master/LICENSE-MIT
Lodash              Webgui      MIT             https://github.com/lodash/lodash/blob/master/LICENSE
UCanAccess JDBC     Core        Apache 2.0      http://ucanaccess.sourceforge.net/site.html#license
MySQL JDBC          Core        GPLv2           https://dev.mysql.com/downloads/connector/j/
Snowflake JDBC      Core        Apache 2.0      https://github.com/snowflakedb/snowflake-jdbc/blob/master/LICENSE.txt
Postgres JDBC       Core        BSD 2 Clause    https://github.com/pgjdbc/pgjdbc/blob/master/LICENSE

Release Notes

4.0.7.0 05.03.2019
Filter possibilities in the lineage allow you to drill down further into the dataflow. Just right-click a node to add either its ID or its name to the filtering conditions. In the Data Viewer, there is now the possibility to bookmark a prepared view, allowing quick querying of frequently used settings. Also in the Data Viewer, additional system views are available for easier debugging.

Minor Improvements:
- Data Viewer: Extended system views, which allow debugging of function calls and monitoring of system properties (memory/CPU) of the host
- Data Viewer: Added bookmarking feature of the pulled up configuration
- Lineage: Possibility to disable cyclical relations (~business vault loads)
- Filtering of lineage in the jobs module

- Automatically select granularity satellite when opening a business object
- Add Satellite: Automatically select hub load when there is only one

API changes:
- Extended checkConnection API to allow testing whether the clientdb is connected successfully
- Deprecated getJobLineage API, as it is now covered by searchJobLineage

Bugfixes:
- Fixed a cause of possibly wrong load durations in the GUI by updating pljava to version 1.5.2
- MySQL source: Fix for default schema when editing staging tables
- MySQL source: Default URL change: read datetime of 0000-00-00 as NULL
- deleteUser: Deletion of non-admin users was not possible in case only one admin user existed
- Exasol: Fix load times greater than 99 days
- Add Link Load: Fixed "Cannot read property 'display' of undefined" error when coming from Add Hub Load
- Documentation generation could result in error "getBBox of null"

4.0.6.1 09.01.2019
Minor Improvements:
- Updated PLJava to 1.5.2
- Updated Java to 1.8.0.191

4.0.6 20.12.2018
Minor Improvements:
- Added support for Exasol 6.1 security with Impersonate
- Extended bookmark API to take bookmark_type so the function can also be used in other contexts
- Decoupled fetch size from batch size in the staging process (new config param: staging_source_max_fetch_size)

Bugfix:
- Staging table was in some cases truncated twice in a row

4.0.5.2 12.12.2018
Bugfixes:
- Handle Distinct in link loads on all DBs

4.0.5.1 11.12.2018
Bugfixes:
- MS SQL: Handle link hashing of NULL

4.0.5 10.12.2018
Minor Improvements:
- Handle CLOBs in Oracle loads
- Added possibility to provide SSL certificates to the GUI container
- Auto-select Business Vault as load type for Business Vault based systems

Bugfixes:
- Exasol: Create or update job schedule

4.0.4 05.12.2018
Minor Improvements:
- MS SQL: Explicitly declared collation to prevent problems in case of a case-insensitive master db


Bugfixes:
- Oracle: Wrong join in hub_load base view for comments

4.0.3.2 04.12.2018
Minor Improvements:
- Exasol: Updated JDBC driver to 6.1.0
- Better counting behaviour for loads
- Deployment dependency resolution

4.0.3.1 03.12.2018
Added subset status icons in staging. These allow you to see directly whether a subset clause is defined for a staging table and whether the clause was applied to the currently loaded dataset.

Minor Improvements:
- Staging view update for tracking satellites to make better use of indices while loading
- Added Transaction Links to Documentation
- Handling of conflict in BO edit

Bugfixes:
- MS SQL: Deployment of hub loads with composite keys
- Oracle: Handle tracking satellites to recreate a hub load for a hub which was previously loaded and whose load was deleted
- Oracle: MView refresh caused metadata refresh
- Fixed status polling in staging

4.0.3 Internal Release

4.0.2 08.11.2018
Minor Improvements:
- Correct handling of "\" in source connections
- Brute-force protection of webserver
- Added current timestamp in dvbuilder log
- Added popup with deployment summary after deployment

4.0.1 Internal Release

4.0.0 26.10.2018
With Version 4 we further push your agility! We introduce a new "Deployment" module. With this module, you can directly compare and deploy between different environments. Also, you can export the created model on a logical level, to version it with your versioning tool (such as Git, SVN, ...).

Minor Improvements:
- Added authentication against the client database: with this, authentication against an AD is possible
- Introduced moving away of loaded CSV files
- Added possibility to remove a column from datavault staging and thereby take it out of the satellite load
- Prevent empty (space only) Subject Areas
- Added an OCI/Thin switch for Oracle sources
- Connection testing makes use of the stored password if available, so the password does not need to be entered
- Improved speed of editing staging tables
- Dimensional Model: The hash of hubs can now be used in the output
- Mark non-delta loads in a delta load job as full load (reducing data storage for tracking satellites)
- Oracle: Faster calculation of newly added keys with an already filled staging table
- Added parameter to disable IPv6 in nginx

- Added build number
- Different user roles on the core engine: dvb_operations, dvb_user_readonly, dvb_admin
- MS SQL: General performance improvements for loads
- Docker improvement: API container will wait for the core to come up
- Added the possibility to set the timezone
- Actively validate name length of objects before creation

Bugfixes:
- Oracle / MSSQL: Load non-unique satellite records
- MS SQL / Postgres: Corrected the Datavault Category displayed for satellites
- Oracle: Handle null values for functional suffix IDs in the Data Lineage
- Oracle: Corrected index creation to be in the same schema as the tables
- Oracle: Corrected refreshing of hash keys when the staging table was loaded
- Corrected function to update a user password
- MS SQL: Fix trailing space in business key
- Data lineage won't break if custom objects are present in the DVB schema
- Fixed escaping for LIKE and for function name correction
- Oracle: Check dependencies before deleting an object, as Oracle allows dropping objects with dependencies
- Load handling: Fail orphaned jobs / loads
- MS SQL / Oracle / EXASOL: Corrected NULL value handling in driving view
- Oracle: Revert metadata in case of a rollback during replace view
- Fix for creating new system colors in case all predefined ones are already in use

3.9.2 23.04.2018
Minor Improvements:
- Only resync metadata views if needed (Oracle/MSSQL/Exasol) to improve speed
- Improved speed of Business Object and Business Ruleset creation

Bugfixes:
- Oracle: Create Business Ruleset may have led to an error due to Oracle-specific NULL handling
- Oracle: Link loads base view may have had duplicates
- Handle activation of non-existent post-job scripts

3.9.1 20.04.2018
We have added support for full load subsetting. You can define subsets of the source which will be loaded and tracked as if they were a full load, by specifying a fixed where clause in the staging table creation wizard.

Minor Improvements:
- Improved error messages in log for failed loads
- Added additional parameters for CSV files, allowing pattern matching of filenames and creation of virtual columns based on the results
- Removed regex constraints in Create Satellite wizard to reduce confusion
- Extended column conversions (mainly for Postgres & MSSQL)
- Delta load parameters support dynamic functions, which can be customized to be executed on the core, the client database or the source
- Faster putting of object comments allows faster creation and edits
- Updated DVB structures on MSSQL to support NVARCHAR
- In post-job statements, it is now possible to move files after loading them
- Moved quotation of strings into core to speed up overall function calls

Bugfixes:
- Sourcing from REST APIs may not have been possible due to errors when creating the staging tables


- Edit Source System does not retrieve and show passwords anymore
- Command line and create_user function do not log passwords anymore. Please review your existing logs to remove previously logged passwords.
- On some environments, job schedules had to be end-dated in order to be saved
- Longer running post-job SQL queries may have resulted in never-ending jobs
- Previewing CSV files forced a reload of the GUI instead of displaying a "Not Supported" hint
- Multiple savings of Business Rulesets may have crashed the Business Ruleset (Postgres)

3.9.0 Internal Release

3.8.2 Internal Release

3.8.1 16.03.2018
Minor Improvements:
- Additional column '_dvb_row_id' (resp. '_DVB_ROW_ID') in new staging tables for use as surrogate key if no other business key is possible

Bugfixes:
- Transaction Links are now automatically added to auto-jobs and available for selection in manual jobs
- Fixed update of link driving views with alias hubs present
- Update GUI when creating alias hubs
- Fix for editing staging table names
- Filter virtual system 'Hubs' from staging list
- Fixed problem with creating staging tables for sources not having schemas (e.g. csv)
- Fixed fileExtension parameter in csv source creation

3.8.0 09.03.2018
The retrieval of more technical fields in the dimensional output is now enabled. Simply work with standard attributes such as the "Last seen" dates of tracking satellites to cleanse the output data of deleted records. It is now also possible to start out with the data set of the hub itself (based on the virtual system 'Hub') when creating a new Business Object. For satellites, the field "PS" then indicates whether the key exists in the satellite or not.

Minor Improvements:
- Data Preview now works with sources MS SQL, Oracle, Postgres and Exasol
- General performance improvements
- Updated MS SQL Server version

Bugfixes:
- Correct sort order of columns in datavault "Create Satellite" and "Create Transaction Link" wizards
- Fixed link driving view outputs

3.7.1 19.02.2018
Stability bugfixes and improvements.

Minor Improvements:
- Search in Datavault: Sorting by display name + object ID in the dropdown list

Bugfixes:
- MSSQL: Stability improvement of loading processes for slower systems
- Save Businessruleset: Don't crash if an error occurred during the saving, but return the actual error message


- MSSQL: creation of satellites with data type float was not possible due to a mismapping

3.7.0 08.02.2018
We have implemented support for an additional type of processing database to choose from for your data warehouse: Exasol (starting from version 6). On MS SQL, an intermediate materialization of the metadata is introduced to speed up work in the GUI.

Minor Improvements:
- Environment color setting: In the dvb_config.config table new keys for environment_% are available to set specific coloring (e.g. make production red)
- Editing columnar metadata in Satellites and Transaction Links is now possible (Rename / Comment)
- Renamed the Business Objects module in the GUI to "Dimensional Model" to reduce confusion
- Name of a subject area is now directly displayed within the subject area bubble
- Long table names / IDs are now automatically line-wrapped to show the complete string
- When going to empty Jobs / BO etc., the dropdown for selection automatically appears
- Runtime data of full load tracking satellites is shared between current and historized satellite objects
- MSSQL: Tables base view is now materialized if needed after one minute to speed up the Data Viewer and Business Rules Quick Inserts. Therefore, however, these may not be up to date at all times.
- Staging: Get schemas and tables/views from the source separately (to reduce load on the source system) when creating a new staging table
- Removed counter of rows to process if loading from a view to speed up loads
- GUI is not cached anymore to reduce chances of outdated code
- BR: Moved the properties of the business ruleset from the slider at the bottom to the right column
- Added debounce for datavault searching to enhance GUI performance
- Keep Business Rules filter on save
- Operations Status: Additional option to filter the lineage by state
- Operations Status: Possibility to right-click onto a load to initiate/cancel it

Bugfixes:
- Core function fixes to speed up retrieval of business rulesets and business objects in the GUI
- Updated full load tracking satellites with actual last seen date
- Initiating loads quickly after one another could have led to never-ending loading of staging/datavault objects/jobs due to a logging problem
- Staging loads may have failed if the number of rows to load was a multiple of the batch size
- MSSQL: Fixed the latest failed load time of staging loads in the base view
- MSSQL: Fixed possible overflow of percentage number in staging loads base view
- Fixed initiating multiple sats/links for the same hub with option "preload parent" resulting in an error
- Add related BO: Used system was not properly prefilled
- Doc module: Token won't expire during the loading process
- Staging: If manually defining a connection string and then editing a source system, the manual string did not appear
- BO: Fixed empty names of Quick Insert objects
- Renaming of a job was not possible in the GUI

3.6.0 05.12.2017
A new type of object is available in the Datavault: Transaction Links! This enables you to create links connecting multiple hubs and/or additionally store attributes. However, keep in mind that this type of link should only be used on special occasions, e.g. for handling large-volume transactional data. As we wanted to deliver this feature as quickly as possible, please note that the usage of transaction links in the business object is not included yet and will follow shortly.

Minor Improvements:
- Hub metadata opens when adding a hub load, so you see the existing loads / business keys
- Overflowing table/view names in dialogues are now shown in total / on hover


- Better GUI responsiveness (MS SQL)
- Technical users with long-living tokens can be created (for use with REST-API)
- Updated driver for CSV loads (could have led to problems with loads into MS SQL)
- New conversion type (Postgres): Numeric/Text epoch to timestamp
- Status lineage filter for load state

Bugfixes:
- Convert to numeric doesn't fail the load when the conversion fails
- Add related business object: select granularity satellite is now enabled
- Removing of access errormart single view if disabled in business rules
- Stability of manual loads with preload parent
- Default datatype mapping of numeric values for business loads trimmed off values

3.5.0 10.11.2017
Bookmarks in the Datavault are now stored on the server and can therefore be used on any client. Furthermore, they can be directly shared with your coworkers.

Bugfixes:
- Staging: Edit of staging tables based on CSV and REST-API sources

3.4.1 06.11.2017
We proudly present a new module: the Documentation! You can now automatically generate a documentation of your implementation. Use it as a searchable web version or export it to a PDF for sharing and storing. Also, we added the possibility to perform extended queries on the source databases in the jobs module and optimized the accesslayer views for simple cases.

Bugfixes:
- Business Ruleset: Editing adds unwanted newlines (MS SQL)
- Business Rulesets: Quoting of columns (MS SQL)
- Missing indexes on staging tables
- After 'Add Hub Load' dialog called in 'Create Link Load' the link load creation process did not proceed
- Unable to change hubs in 'Create Link' dialog in some constellations
- Staging: Fixed failure when manually initiating the load of a complete source system

3.4.0 27.10.2017
You are now able to verify directly whether your composed business key is unique, right after composing it in the editor! Just click the check icon within the edit window. If there are duplicates in the staged data set, a window will pop up displaying the number of duplicates. You can then click the plus icon to get samples of duplicate records. Secondly, you can now define in the config whether SQL queries in the jobs module are executed on the Datavault Builder core or the client database.

Bugfixes:
- Dangling jobs
- Running tracking sat loads faultily marked as "not found as loading"
- Reduced "shivering" on the canvas to optimize the drawing
- Job logged as finished after performing SQL query
- Empty line insertion in business ruleset on save (only MS SQL)

3.3.8 17.10.2017
The creation log is now completed with updates, so that you can take a creation log of one instance and use it to replicate the state on another installation. We also removed passwords from connection URLs as a security improvement.

Bugfixes:
- For many-to-many links the driving view was not refreshed and thereby no data was delivered in the business objects
- Creation log: Include all attributes
- Handle numeric types without precision
- JSON loads and custom enum types
- Correct metadata of business ruleset
- Handle Nullpointer Exception of tables with datatype NULL
- Improved reading speed for metadata of Oracle source

3.3.7 06.10.2017
Instead of ignoring missing fields in staging, a warning is now logged.

Bugfixes:
- Load timestamp offsets in staging

3.3.6 02.10.2017
Handling of datetime with higher precision in the source than in the fdb by splitting into two fields.

Bugfixes:
- Null BKs with prefix on (MS SQL)

3.3.5 28.09.2017
There is now a connection pool parameter for the API.

Bugfixes:
- Access to table view
- Keeping numeric precisions for staging and dv loads
- Show total rows in staging load again (on mssql batch load)

3.3.4 Internal Release

3.3.3 25.09.2017
The column list in the Business Object generator is now filterable and allows selecting and adding multiple columns at once. Improved load cancelling: wait and check for cancelled loads, but don't terminate them.

3.3.2 Internal Release

3.3.1 15.09.2017
Bugfixes:
- Performance improvement for jobs
- Performance improvement for link metadata

3.3.0 13.09.2017
BulkCopy is now available for staging (MS SQL). It is now possible to add all tables with all columns for a specific system using the command line interface and the function dvb_core.f_add_all_source_tables. In the Datavault, hub names are now shown in the title of link load dialogues.


Bugfixes:
- Missing NULL values in views view
- nolock does not interfere with readpast of base view (MS SQL)
- Deleting tracking sats when deleting hub data
- Performance improvements for extended properties in base views (MS SQL)
- Loading progress percentage rounding
- Select Subject Area issue
- Error when creating business vault load staging table
- datavault_category_id in satellite base view
- Performance improvement GUI: Removed duplicated calls

3.2.0 23.08.2017
Preparation for Documentation module behind the scenes.

Bugfixes:
- Staging "Exception: b.replace is not a function"

3.1.3 23.08.2017
Bugfixes:
- Alias name in BO

3.1.2 21.08.2017
Bugfixes:
- Support button
- Fixed case for satellite columns
- Create link load
- Quoting of BK
- More failsafe and better readable error message in BR
- Error overflow in business ruleset
- Filter on original query

3.1.1 28.07.2017
Technical columns are now available for selection in the business objects generator.

Bugfixes:
- Business Object: Satellite fields not available
- Recreating removed link loads when data is still loaded
- Update link suffix name
- Delete satellite independent from materialized relations view
- Delete business object independent from materialized relations view
- Non-standard business rules not depending on a view or table (i.e. on a function)
- Delete link data - wrong system
- Change metadata of prototype satellite
- Delete prototype satellite

3.1.0 15.06.2017
In the Datavault, subtabs are now available to split the Core into "Persistent Staging", "Raw Vault" and "Business Vault". The declaration of which object belongs to which datavault category can be assigned when creating a hub load. This means that a hub can belong to multiple datavault categories, while satellites will always belong to one category. Changing the datavault category of a hub load is possible by opening the metadata panel of the hub. This also affects the category of the related satellite(s) for that load.

Bugfixes


- Truncate hubs with tracking sats (refresh mat view)
- Truncate hubs when prototype hubs exist

3.0.1 13.06.2017
Fix Business Ruleset.

3.0.0 21.05.2017
Initial release with GUI.
