DKTK Kaapana, Technical Documentation
DKTK Kaapana Technical Documentation
Release 0.1.1
Kaapana Team
Heidelberg
Sep 24, 2021

CONTENTS

1 What is Kaapana?
2 Getting started
  2.1 What’s needed to run Kaapana?
3 Build Kaapana
  3.1 Build Requirements
  3.2 Build modes
  3.3 Build Dockerfiles and Helm Charts
4 Install Kaapana
  4.1 Step 1: Server Installation
  4.2 Step 2: Platform Deployment
5 User guide
  5.1 Default configuration
  5.2 Storage stack: Kibana, Elasticsearch, OHIF and DCM4CHEE
  5.3 Processing stack: Airflow, Kubernetes namespace flow-jobs and the working directory
  5.4 Core stack: Landing Page, Traefik, Louketo, Keycloak, Grafana, Kubernetes and Helm
6 Extensions
  6.1 Workflows
  6.2 Applications
7 Development guide
  7.1 Getting started
  7.2 Write your first own DAG
  7.3 Deploy an own processing algorithm to the platform
  7.4 Deploy a Flask Application on the platform
8 Frequently Asked Questions (FAQ)
  8.1 There seems to be something wrong with the landing-page visualization in the Browser
  8.2 Kibana dashboard does not work
  8.3 Proxy configuration
  8.4 Setup a connection to the Kubernetes cluster from your local workstation
  8.5 Failing to install an extension
9 Glossary
10 Releases
  10.1 Version 0.1.0
11 License and contact
  11.1 License
  11.2 Contact
Index

CHAPTER ONE
WHAT IS KAAPANA?

Kaapana (from the Hawaiian word kaāpana, meaning “distributor” or “part”) is an open-source toolkit for state-of-the-art platform provisioning in the field of medical data analysis. The applications comprise AI-based workflows and federated learning scenarios with a focus on radiological and radiotherapeutic imaging.

Obtaining the large amounts of medical data necessary for developing and training modern machine learning methods is extremely challenging and often fails in a multi-center setting, e.g. due to technical, organizational and legal hurdles. A federated approach, in which the data remains under the authority of the individual institutions and is only processed on-site, is in contrast well suited to overcoming these difficulties.

Following this federated concept, the goal of Kaapana is to provide a framework and a set of tools for sharing data processing algorithms, for standardized workflow design and execution, and for performing distributed method development. This will facilitate data analysis in a compliant way, enabling researchers and clinicians to perform large-scale multi-center studies.

By adhering to established standards and by adopting widely used open technologies for private cloud development and containerized data processing, Kaapana integrates seamlessly with the existing clinical IT infrastructure, such as the Picture Archiving and Communication System (PACS), and ensures modularity and easy extensibility.
Core components of Kaapana are:
• dcm4chee: open source PACS system serving as the central DICOM data storage in Kaapana
• Elasticsearch: search engine used to make the DICOM data searchable via their tags and meta information
• Kibana: visualization dashboard enabling interactive exploration of the DICOM data stored in Kaapana and indexed by Elasticsearch
• Airflow: workflow management system that enables complex and flexible data processing workflows in Kaapana via container chaining
• Kubernetes: container orchestration
• Keycloak: user authentication
• Docker: container system used to provide the algorithms as well as the platform components themselves

Kaapana is under constant development and currently includes the following key features:
• Large-scale image processing with SOTA deep learning algorithms, such as nnU-Net image segmentation
• Analysis, evaluation and viewing of processed images and data
• Simple integration of new, customized algorithms and applications into the framework
• System monitoring
• User management

Currently the most widely used platform realized using Kaapana is the Joint Imaging Platform (JIP) of the German Cancer Consortium (DKTK). The JIP is currently being deployed at all 36 German university hospitals with the objective of distributed radiological image analysis and quantification.

For more information, please also take a look at our recent publication of the Kaapana-based Joint Imaging Platform in JCO Clinical Cancer Informatics (JCO).

CHAPTER TWO
GETTING STARTED

This manual is intended to provide a quick and easy way to get started with Kaapana. Kaapana is not ready-to-use software but a toolkit that enables you to build the platform that fits your specific needs. The steps described in this guide will build an example platform, which is a default configuration and contains many of the typical platform components.
This basic platform can be used as a starting point to derive a customized platform for your specific project.

2.1 What’s needed to run Kaapana?

1. Host System
You will need some kind of server to run the platform on. Minimum specs:
• OS: CentOS 8, Ubuntu 20.04 or Ubuntu Server 20.04
• CPU: 4 cores
• Memory: 8GB (>30GB recommended for processing)
• Storage: 100GB (deploy only) / 150GB (local build); >200GB recommended

2. Container Registry
Hint: Get access to our docker registry
If you just want to try out the platform, you are very welcome to reach out to us via Slack or email. In this case, we will provide you with credentials to our Docker registry, from which you can install the platform directly and skip the build.

To provide the services in Kaapana, the corresponding containers are needed. These can be regarded as the binaries of Kaapana and therefore only need to be built if you do not have access to already built containers via a container registry.
This flow-chart should help you to decide if you need to build Kaapana and which mode to choose:
[Figure: flow chart for deciding whether to build Kaapana and which build mode to choose]

3. Build
Build Kaapana

4. Installation
Install Kaapana

CHAPTER THREE
BUILD KAAPANA

3.1 Build Requirements

Important: Disk space needed: A complete build of the project stores ~50GB of container images at /var/snap/docker/common/var-lib-docker. With build-mode local this grows to ~120GB, since each container is also imported separately into containerd. In the future we will also provide an option to delete the Docker image after the import.

Before you get started, you should be familiar with the basic concepts and components of Kaapana (see What is Kaapana?). You should also have the following packages installed on your build system.
We expect the sudo systemctl restart snapd
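The minimum host specs from section 2.1 and the disk space note in section 3.1 can be checked before starting a build. The following is a minimal sketch and not part of Kaapana: the function names, the MIN_DISK_GB environment variable and the choice to measure free space on /var (the parent of the Docker snap directory, which may not exist yet) are assumptions made for illustration. It relies only on nproc, /proc/meminfo, df and awk.

```shell
#!/bin/sh
# Pre-flight check against the minimum host specs listed in section 2.1.
# Hypothetical helper script, not shipped with Kaapana.

MIN_CORES=4
MIN_MEM_GB=8
# 100 GB for deploy only; set MIN_DISK_GB=150 for a local build.
MIN_DISK_GB=${MIN_DISK_GB:-100}

# meets_minimum ACTUAL REQUIRED: succeeds if ACTUAL >= REQUIRED.
meets_minimum() {
    [ "$1" -ge "$2" ] 2>/dev/null
}

# report LABEL ACTUAL REQUIRED UNIT: prints one OK/FAIL line.
report() {
    if meets_minimum "$2" "$3"; then
        echo "OK   $1: $2 $4 (minimum $3 $4)"
    else
        echo "FAIL $1: $2 $4 (minimum $3 $4)"
    fi
}

cores=$(nproc 2>/dev/null)
# MemTotal is given in kB. Round to the nearest GB, because the kernel
# reserves some memory and an 8 GB machine reports slightly less than 8 GB.
mem_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1048576 + 0.5}' /proc/meminfo 2>/dev/null)
# Free space in GB on the file system holding /var (assumed image location).
disk_gb=$(df -Pk /var 2>/dev/null | awk 'NR == 2 {printf "%d", $4 / 1048576}')

report "CPU" "${cores:-0}" "$MIN_CORES" "cores"
report "Memory" "${mem_gb:-0}" "$MIN_MEM_GB" "GB"
report "Free disk on /var" "${disk_gb:-0}" "$MIN_DISK_GB" "GB"
```

Saved under a name of your choice (for example preflight.sh, a hypothetical file name), it could be run as `MIN_DISK_GB=150 sh preflight.sh` before a local build. The script only prints its findings and does not abort, so a FAIL line is a hint to check the host rather than a hard stop.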
1. Dependencies
Ubuntu:
    sudo apt update && sudo apt install -y curl git python3 python3-pip
CentOS:
    sudo yum install -y curl git python3 python3-pip

2. Clone the repository:
    git clone https://github.com/kaapana/kaapana.git

3. Python requirements
    python3 -m pip install -r kaapana/build-scripts/requirements.txt

4. Snap
Ubuntu:
Check if snap is already installed:
    snap help --all
If not, run the following command:
    sudo apt install -y snapd
A reboot is needed afterwards!
CentOS:
Check if snap is already installed:
    snap help --all
If not, run the following commands:
    sudo yum install -y epel-release
    sudo yum update -y
    sudo yum install snapd
    sudo systemctl enable --now snapd.socket
    sudo snap wait system seed.loaded

5. Docker
    sudo snap install docker --classic --channel=latest/stable

6. In order to run docker commands as a non-root user, you need to execute the following steps:
    sudo groupadd docker
    sudo usermod -aG docker $USER
For more information visit the Docker docs.

7. Helm
    sudo snap install helm --classic --channel=3.5/stable

8. Reboot
    sudo reboot

9. Test Docker
    docker run hello-world
This should now work without root privileges.

10. Helm plugins
    helm plugin install https://github.com/chartmuseum/helm-push
    helm plugin install https://github.com/instrumenta/helm-kubeval

3.2 Build modes

If you don’t have access to a container registry with already built containers for Kaapana, you need to build them first. This is comparable to a binary of regular software projects - if you already have access to it, you can continue with step 3.
The complete build will take ~1h (depending on the system)!
Currently Kaapana supports two different build modes:

1. Local build
By choosing this option you will need no external container registry to install the platform. All containers will be built and used locally on the server.

2. Container registry
This option will use a remote container registry. Since we’re also using charts and other artifacts, the registry must have OCI support. We recommend GitLab or Harbor as registry software. Unfortunately, Docker Hub does not yet support OCI,