NoSQL Databases Are Finding Their Way Into Many Application Stacks

Management of polyglot persistent integrations with virtual administrators

Thomas Clauwaert

Supervisors: Prof. dr. ir. Filip De Turck, Dr. ir. Gregory Van Seghbroeck
Counsellors: ing. Merlijn Sebrechts, Dr. ir. Gregory Van Seghbroeck

Master's dissertation submitted in order to obtain the academic degree of Master of Science in Information Engineering Technology

Department of Information Technology
Chair: Prof. dr. ir. Bart Dhoedt
Faculty of Engineering and Architecture
Academic year 2017-2018

Preface

It is crazy how fast the past few months went by. I have tried my best to research and learn as much as possible while also implementing interesting things. It was not always easy, and from time to time I got stuck. Looking back, I am glad about what I achieved, but the big added value for me is the priceless experience I’ve gained throughout this journey.

It is impossible to list every single person who helped me throughout this period, but a few people deserve to be in the spotlight. First and foremost, I want to thank prof. Filip De Turck, dr. ir. Gregory Van Seghbroeck and ing. Merlijn Sebrechts for writing out this thesis proposal and providing the opportunity for a student like me to tackle this research. Merlijn especially deserves a round of applause for all the guidance and patience he had when I was stuck or in need of some advice. Next, a big shout-out to all the people on the IRC channel of Juju. Even though the community is rather small, the people there really want to help you. Finally, I’m grateful to all my friends and family for their support and every single piece of advice. You guys were great!

Thomas Clauwaert
Ghent, June 2018

Permission for loan

“The author(s) gives (give) permission to make this master dissertation available for consultation and to copy parts of this master dissertation for personal use. In the case of any other use, the copyright terms have to be respected, in particular with regard to the obligation to state expressly the source when quoting results from this master dissertation.”

Thomas Clauwaert, June 2018

Abstract

Data management plays a crucial role in the area of information technology, as it impacts the efficiency of the system in use. End users often expect these systems to be responsive and available at any time. Good infrastructure design choices that provide flexibility and scalability are therefore crucial building blocks of modern applications. In the state of the art, many different database systems have been proposed, each offering advantages and disadvantages in a number of key areas. The traditional relational database is still the predominant system, although NoSQL databases are finding their way into many application stacks. Modern systems often use a combination of several database systems, which makes the development effort considerably more complex. Industry therefore relies on modern data administrators or operations engineers who have the know-how to use, set up and manage these polyglot persistent applications. Since these people are hard to find, developers and data scientists are looking at other solutions to simplify the operation of different technologies.

The goal of this thesis is to propose a service which transparently manages different database systems. The idea behind a script or tool often lies in performing specific tasks that would otherwise need to be performed manually. Such scripts and tools can be seen as virtual administrators that perform predefined tasks.
In this thesis, several possibilities are investigated to create a virtual administrator that is responsible for the management of polyglot persistent applications and all their derivatives. The generic database service as presented in this research offers an easy-to-use platform where users request a specific database technology, and the service itself takes care of installing all required components and sharing all needed information. The virtual administrator makes its own choices in deciding what services need to be deployed in order to provide the requested database technology. This way, developers can ask the virtual administrator for a database technology and a database name, and they end up with the connection details to use it. The developer becomes self-reliant, and the time needed to get the requested operational tasks done is reduced significantly.

A proof of concept of the generic database service was made in the application modelling tool Juju. With the help of a use case and the reactive framework, a requesting service can successfully request multiple databases of different types. The generic database service then correctly shares the details with the requesting service. The current implementation only acts as a proxy that relays the database details. Furthermore, the service is resource-demanding, and more database technologies should be supported. In iterative steps, support for any database technology could be added so the end result becomes a full-fledged application ready for use in Juju. The idea behind the generic database service is not bound to Juju and can be (re-)used in other environments aiming to achieve the same goal.

Samenvatting (Summary)

Data plays a crucial role in most information and technology systems. The way in which data is processed determines how efficient a system really is.
Because end users expect a system to respond and be available at any moment, the choices made in designing the infrastructure form the building blocks of modern applications. In the area of database technologies, there is a lot of choice. Traditional relational database systems are still the most popular, but NoSQL technologies have also found their way into application infrastructures. In modern systems, developers are challenged to use different database technologies in their applications, depending on the type of data. These heterogeneous data storage technologies result in a more complex infrastructure as more and different technologies are used. For that reason, modern database administrators or operations engineers are needed who know how to use, configure and manage these different systems. The goal of this master's dissertation is to propose a service that helps to tackle this problem.

Machines, computers and technology in general help people to automate many processes. The idea behind a script (or tool) often lies in performing specific tasks that would otherwise have to be carried out manually. In a certain sense, they are virtual administrators. This master's dissertation investigates whether it is possible to create a virtual administrator that is responsible for managing these heterogeneous data storage technologies. The generic database service, as presented in this research, offers an easy-to-use platform where users request a database technology and the service itself takes care of installing all required components and sharing all information. The virtual administrator itself determines which services have to be set up when a database is requested.
In this way, the developer becomes independent of a physical administrator, and the time needed to complete the requested operational tasks decreases considerably.

A proof of concept of the generic database service was made in the application modelling tool Juju. With the help of a use case and the reactive framework, an application can successfully request multiple databases of different types. The generic database service then correctly shares the details with the original application. The implementation of the service only acts as a proxy that passes on the database details. Moreover, this service requires (too) many resources, and more database technologies need to be supported. In iterative steps, support for any database technology can be added, so that the end result becomes a full-fledged application. Furthermore, the idea behind the generic database service is not bound to Juju and can be (re)used in other environments to achieve the same goal.

Virtual administrators for the management of heterogeneous data storage technologies

Thomas Clauwaert

Supervisors: prof. Filip De Turck, dr. ir. Gregory Van Seghbroeck, dr. ir. Tim Wauters, ing. Merlijn Sebrechts

Abstract: In a world where everything must be available 24/7, the maintenance of services and applications is of crucial importance. Designing, setting up and ultimately managing these application infrastructures are often the greatest challenges for modern system administrators. This master's dissertation investigates how databases can be made ready for use more easily by means of virtual system administrators. These virtual entities take over the task of the system administrator and configure what is needed automatically, without the …

II. BACKGROUND

The number of machines, services and applications that a modern system administrator has to manage has increased enormously in recent years. Thanks to services such as Amazon AWS, Google cloud computing or Microsoft Azure, it has become easier to quickly get machines operational …
