Storage and processing of medical data in GRID --- one more example of technology transfer from physics to medicine

FRENCH-UKRAINIAN WINTER WORKSHOP ON MEDICAL PHYSICS

Oleksandr Dyomin, Institute for Scintillation Materials

Physics to physicians

X-ray → X-ray scanner

NMR → MRI

Nuclear power → Nuclear medicine, nuclide production

Problems for GRID

• Computational tasks
• Storage of large data volumes
• Cooperative data access for geographically distributed users

History

• In 1973, John Shoch and Jon Hupp of the Xerox PARC research center in California wrote a program that was launched at night on the PARC LAN and put the idle computers on it to work performing calculations.

• In 1978, Soviet mathematician Victor Glushkov worked on the problem of macro-pipeline distributed computing. He proposed a number of principles for distributing work among processors.

• In 1988, Arjen Lenstra and Mark Manasse wrote a program for factoring long numbers. To speed up the work, the program could be run on multiple machines, each of which processed a small fragment of the task.

• The idea of organizing a mass project was proposed in 1994 by David Gedye. The SETI@Home project used volunteers' computers (so-called volunteer computing). The scientific plan of the project, by David Gedye and Craig Kasnoff, was presented at the Fifth International Conference on Bioastronomy in July 1996.

• In January 1996 the GIMPS project to find Mersenne primes was started. It used ordinary users' computers as a volunteer network.

• The RSA Data Security competition started on January 28, 1997. The goal was to crack a 56-bit RC5 encryption key by simple exhaustive search. Thanks to good technical and organizational preparation, the project, organized by the non-profit community DISTRIBUTED.NET, quickly became widely known.

• OpenMP – an application programming interface standard for parallel systems with shared memory

• MPI (Message Passing Interface) – a language-independent communications protocol used for programming parallel computers

• POSIX Threads – an execution model that exists independently of a language, as well as a parallel execution model

• Multiprogramming – one processor, one program.
• Multithreading – each process consists of one or more threads.
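A minimal Python sketch of the multithreading idea: one process splits a single job (here, a sum of squares) among several threads. The function names `partial_sum` and `sum_of_squares` and the choice of job are assumptions made for illustration only.

```python
import threading


def partial_sum(data, start, stop, results, slot):
    # Each thread handles its own slice and writes to its own slot,
    # so no lock is needed.
    results[slot] = sum(x * x for x in data[start:stop])


def sum_of_squares(data, n_threads=4):
    """Split one job among n_threads threads of the same process."""
    results = [0] * n_threads
    chunk = (len(data) + n_threads - 1) // n_threads  # ceiling division
    threads = []
    for i in range(n_threads):
        t = threading.Thread(
            target=partial_sum,
            args=(data, i * chunk, (i + 1) * chunk, results, i),
        )
        threads.append(t)
        t.start()
    for t in threads:
        t.join()  # wait until every slice has been processed
    return sum(results)


if __name__ == "__main__":
    print(sum_of_squares(list(range(10))))
```

In CPython the global interpreter lock keeps pure-Python threads from running CPU-bound code truly in parallel, so the sketch shows the structure of the split rather than a speed-up.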

The existence of multiple processes allows your computer to perform multiple tasks "simultaneously". The existence of multiple threads allows a process to split a job for parallel execution.

Graphics processors, CUDA

• The architecture is aimed squarely at speeding up the calculation of complex textures and graphics
• The set of commands is limited

Tesla Architecture

Crystal and PM tubes

Light spread function

Figure labels: PMTs, LSF, crystal, scintillation

The PMT data for different scintillations are processed independently, by the same algorithm. This allows parallel processing.

Processing chain: raw data, data input, scintillation parameters estimation, 2D image creation

Processing chain: data input, scintillation parameters estimation, scintillation parameters (x, y, a)
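Because each scintillation is processed independently by the same algorithm, the estimation step maps naturally onto a pool of workers. The sketch below is illustrative only. It assumes a hypothetical 2x2 PMT layout (`PMT_POSITIONS`) and uses a simple signal-weighted centroid as a stand-in for the estimator. The slides do not specify the actual method, which presumably relies on the light spread function.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical 2x2 PMT layout: (x, y) centre of each tube, arbitrary units.
PMT_POSITIONS = [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (1.0, 1.0)]


def estimate_event(signals):
    """Return (x, y, a) for one scintillation from its PMT signals.

    Assumption: a signal-weighted centroid stands in for the real estimator.
    """
    a = sum(signals)  # total amplitude of the scintillation
    if a <= 0:
        return None  # no light registered, nothing to estimate
    x = sum(s * px for s, (px, _) in zip(signals, PMT_POSITIONS)) / a
    y = sum(s * py for s, (_, py) in zip(signals, PMT_POSITIONS)) / a
    return (x, y, a)


def estimate_all(events, workers=4):
    """Apply the same algorithm to every event; events do not interact."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(estimate_event, events))


if __name__ == "__main__":
    print(estimate_all([[1, 1, 1, 1], [0, 2, 0, 2]]))
```

The same per-event function could be handed to a process pool, a GPU kernel, or separate cluster nodes, since no event depends on another.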

Several hardware interfaces can be used: PCI, USB, Ethernet

Several nodes can be used to improve computational performance

Grid

Cluster

Computers in the Cluster

Grid Middleware

gLite

ARC

ISMA has taken part in calculations for the ALICE experiment since 2009.

International HEP project BELLE II

Grid and Cloud

Grid vs. Internet

Grid
• Computing oriented
• Maximum personalization

Internet
• Search oriented
• Maximum anonymity

Certification

MedGrid Virtual Organization

Grid System Structure

Diagram labels: Medical Web portal, HTTP, HTTPS, SSL, Grid certificate, LFC, SE, Grid Storage Facilities

Diagram labels: Medical Web portal, Database, Grid Storage Facilities, SE

Grid users

Client Software

The workplace lets you select medical data in DICOM format, including graphical information, stored on a local disk, anonymize the data, and send it to the grid storage.

Gateway for medical networks
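The anonymization step that the client workplace performs before sending DICOM data to the grid storage can be sketched as follows. This is a minimal illustration, not the MedGrid client's code. The header is modelled as a plain Python dictionary rather than a parsed DICOM file. The list `IDENTIFYING_KEYS` is an assumed, deliberately short subset of identifying attributes, and replacing `PatientID` with a salted hash is one possible pseudonymization choice.

```python
import hashlib

# Assumed set of identifying attributes; a real profile would be much longer.
IDENTIFYING_KEYS = {"PatientName", "PatientBirthDate", "PatientAddress"}


def anonymize(header, salt):
    """Return a copy of `header` with identifying fields removed and
    PatientID replaced by a salted hash (a stable pseudonym)."""
    clean = {k: v for k, v in header.items() if k not in IDENTIFYING_KEYS}
    if "PatientID" in clean:
        raw = (salt + str(clean["PatientID"])).encode("utf-8")
        clean["PatientID"] = hashlib.sha256(raw).hexdigest()[:16]
    return clean


if __name__ == "__main__":
    header = {
        "PatientName": "Doe^John",
        "PatientID": "12345",
        "PatientBirthDate": "19700101",
        "Modality": "CT",
    }
    print(anonymize(header, salt="site-secret"))
```

A stable pseudonym keeps studies of the same patient linkable inside the storage without exposing the original identifier; the salt must stay on the sending side.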

Diagram labels: DICOM, WEB API, GRID

The use of QR-code

You can scan the QR code on the form with a standard tablet or smartphone app and open the file in a standard DICOM viewer.

MedGrid Rainbow service (“ARC in the cloud”)

Diagram labels: runtime environment configuration; CE service; network configuration service; worker node components; gateway; data storage infrastructure; grid job; grid network; grid users; SE, CE

A. Sudakov, A. Boretsky, Shevchenko KNU

Grid-service for automatic VM starting

Diagram labels: physicians, telemedicine consultations, WEB server, Grid-service for VM starting, GRIS, VOMS, central Grid services (LFC, WMS, MyProxy), Grid-infrastructure (CE, SE)

A. Sudakov, A. Boretsky, Shevchenko KNU

Rainbow Cloud Grid (ARC) service for ECG consulting

A. Sudakov, A. Boretsky, Shevchenko KNU

Grid Processing for medical data

• Differential diagnosis
• Medical statistics
• Population medical research
• E-epidemiology
• Long-term disease monitoring
• Automation of diagnosis (second opinion)

Encephalography – a common and powerful method for real-time study of brain functions

EEG – non-invasive safe diagnostic method

EEG in Laboratory animals – powerful research method

• ~3000 clinics in Ukraine
• EEG recordings are available in most of them
• One study takes from minutes to weeks
• Data file size: 1 MB – 10 GB
• Requires many resources for storage and processing – Grid is a good solution

Collective research in biomedicine

Diagram labels: CT, Amosov Institute, New knowledge, KNU, Kyiv Heart Center, ISMA, SPECT, ECG, IC KNU, IMBG, MD, ND, NNCMBTP, EEG, BIPH; shared resources: storage capacity, computing power, processing software; diagnostic data

EEG database

• Creation of an EEG database prototype in UNG
• Import of data from different devices
• Support for different data formats
• Support for different signal and image modalities
• Extensibility

• Import of a test dataset into the database
  • EEG records of humans playing a computer game (~200 studies)
  • EEG records of laboratory animals with epilepsy (~500 studies)

• Test data analysis (artifact removal)

Results of artifacts removal

Figure: EEG before and after artifact removal; p=0.18; 1 - before and 2 - after EMG removal

USI images

• Ultrasonic introscopic images (~800)
• General formal information (name, surname, age, sex, diagnoses, etc.)
• Results of biochemical assay (hormone concentrations: protein-bound triiodothyronine T3 and thyroxine T4, free FT3 and FT4, thyroid-stimulating hormone TSH, and thyroglobulin TG)
• Description
• Verified diagnosis
• Digital characteristics of image texture

Automatic diagnostics
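The diagnostic quality reported on this slide is expressed through true-negative and true-positive rates, usually called specificity and sensitivity. As a reminder of how those two rates are computed from a confusion matrix, here is a minimal Python sketch. The counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def sensitivity(tp, fn):
    """True-positive rate: share of actually ill patients flagged as ill."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True-negative rate: share of healthy patients classified as healthy."""
    return tn / (tn + fp)


# Hypothetical counts, NOT taken from the presentation.
TP, FN, TN, FP = 48, 2, 46, 4

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(TP, FN):.2f}")
    print(f"specificity = {specificity(TN, FP):.2f}")
```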

The probability of correctly identifying a negative diagnosis exceeds 91%. The system identifies a true positive diagnosis of thyrotoxicosis with a probability of more than 95%.

Grid certificate

https://ca.ugrid.org/help.php

List of RA centers

• KPI: NTUU "KPI". Operators: Sergii Stirenko, Oleg Alienin, Oleksandr Rokovyi. Address: Kyiv, Pobedy Avenue, 37, building 6, room 22. Tel.: +38 044 4068013

• RA KIPT: National Science Center "Kharkov Institute of Physics and Technology". Operators: Dmytro Soroka. Address: Akademicheskaya St., 1, 61108, Kharkiv. Tel.: +38 057 3356371

• RA ICMP: Institute for Condensed Matter Physics of the NAS of Ukraine. Operators: Taras Patsahan. Address: Sventsitskogo St., 1, 79011, Lviv. Tel.: +38 032 2760614

• RA ONU: Center for Supercomputer Computing and Free Software, I.I. Mechnikov ONU. Operators: Dmitry Spodarets. Address: Illichivsk, Danchenko St., 17a. Tel.: +38 063 7353526

• RA CHSTU: Chernihiv State Technological University. Operators: Olga Prila. Address: Shevchenko St., 95, Chernihiv, building IV, 6th floor, room 63. Tel.: +38 093 7871972

• RA ISMA: Institute for Scintillation Materials of the NAS of Ukraine. Operators: Sergiy Barannik. Address: Lenin Avenue, 60, 61001, Kharkiv. Tel.: +38 050 3230558

• RA IAP: Institute of Applied Physics of the NAS of Ukraine, Sumy, Petropavlovskaya St., 58. Operators: Victor Kuprienko. Address: Sumy, Prokofiev St., 38A. Tel.: +38 050 3074891

Thank you for your attention

Cluster analysis

• There is a set of objects
• Each object has a combination of features
• The whole set must be partitioned into clusters so that objects with similar features fall into the same cluster

• A metric is defined in the feature space

Euclidean metric

• A functional that depends on the partition of the objects into clusters is minimized

Problem

In many cases, classical processing algorithms are poorly suited to medical images.
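For reference, the classical cluster-analysis formulation listed earlier (a Euclidean metric in feature space, and a functional minimized over partitions) can be sketched with k-means. The slides do not name a specific algorithm, so k-means is an assumption here. It lowers the sum of squared Euclidean distances from each object to its cluster center and reaches a local minimum only.

```python
import math
import random


def euclidean(p, q):
    """Euclidean metric in feature space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def assign(points, centers):
    """Put every point into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means: alternate assignment and center update."""
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = assign(points, centers)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, assign(points, centers)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(pts, 2))
```

On feature vectors extracted from medical images, such a baseline may well show the limitation stated above, which is why it is offered only as an illustration of the formulation.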