Optimizing Local File Accesses for FUSE-Based Distributed Storage
Total pages: 16 · File type: PDF · Size: 1020 KB
Recommended publications
-
Instituto de Pesquisas Tecnológicas do Estado de São Paulo – ANDERSON TADEU MILOCHI – Grids de Dados: Implementação e Avaliação do Grid Datafarm – Gfarm File System – como um sistema de arquivos de uso genérico para Internet (Data Grids: Implementation and Evaluation of the Grid Datafarm – Gfarm File System – as a General-Purpose File System for the Internet)
Data Grids: Implementation and Evaluation of the Grid Datafarm – Gfarm File System – as a General-Purpose File System for the Internet. Dissertation presented by Anderson Tadeu Milochi to the Instituto de Pesquisas Tecnológicas do Estado de São Paulo (IPT) for the degree of Master in Computer Engineering. Concentration area: computer networks. Advisor: Prof. Dr. Sérgio Takeo Kofuji. São Paulo, March 2007. 149 pp. Catalogue record (prepared by IPT's Departamento de Acervo e Informação Tecnológica – DAIT): M661g, CDU 004.451.52(043); keywords: 1. File system 2. Internet (computer networks) 3. Grid Datafarm 4. Data grid 5. Virtual machine 6. NISTNet 7. Service-oriented file 8. Thesis. Dedication: To God, the Lord of all, Master of Masters, the only Way, Truth and Life. To my wife, for her unconditional support despite the painful loneliness. To my daughter, for her understanding during my constant moments of isolation.
Comparison of Kernel and User Space File Systems
Comparison of kernel and user space file systems — Bachelor thesis — Scientific Computing group (Arbeitsbereich Wissenschaftliches Rechnen), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg. Submitted by: Kira Isabel Duwe, e-mail address: [email protected], matriculation number: 6225091, programme: Informatik. First examiner: Professor Dr. Thomas Ludwig. Second examiner: Professor Dr. Norbert Ritter. Advisor: Michael Kuhn. Hamburg, 28 August 2014. Abstract: A file system is part of the operating system and defines the interface between the OS and the computer's storage devices. It controls how the computer names, stores and organises files and directories. Because requirements differ widely — efficient use of storage space, for example — a great variety of approaches has arisen. The most important ones run in the kernel, as for a long time this was the only option. In 1994, developers came up with an idea that would allow a file system to be mounted in user space. The FUSE (Filesystem in Userspace) project was started in 2004 and merged into the Linux kernel in 2005. It lets users write their own file system without editing kernel code, thereby avoiding licence problems, and it offers a stable library interface. FUSE is implemented as a loadable kernel module, and because of this design every operation has to pass through the kernel multiple times. The additional data transfers and context switches cause overhead, which this thesis analyses. The thesis gives a basic overview of how exactly a file system operation takes place and examines which mount options for a FUSE-based system result in better performance.
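The overhead comparison described in this abstract comes down to timing the same operation against a native mount and against a FUSE mount. A minimal sketch of such a micro-benchmark (the scratch file, sizes, and paths are illustrative, not taken from the thesis — a real run would point `path` at a file on a FUSE mount and at the same file on a native mount):

```python
import os
import tempfile
import time

def read_throughput(path: str, block_size: int = 128 * 1024) -> float:
    """Sequentially read `path` in fixed-size blocks and return MiB/s.

    Running this against a native mount and a FUSE mount of the same data
    makes the cost of the extra kernel crossings and context switches visible.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # unbuffered: one syscall per block
        while chunk := f.read(block_size):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return (total / (1024 * 1024)) / elapsed

# Illustrative run against a local scratch file:
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(4 * 1024 * 1024))  # 4 MiB of test data
    scratch = f.name
mib_per_s = read_throughput(scratch)
os.unlink(scratch)
```

Varying `block_size` here mirrors the effect of FUSE mount options that control per-request transfer size: larger requests mean fewer user/kernel round trips per byte.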
Accelerating Big Data Analytics on Traditional High-Performance Computing Systems Using Two-Level Storage Pengfei Xuan Clemson University, [email protected]
Clemson University TigerPrints, All Dissertations, December 2016. Pengfei Xuan, Clemson University, [email protected]. Follow this and additional works at: https://tigerprints.clemson.edu/all_dissertations Recommended citation: Xuan, Pengfei, "Accelerating Big Data Analytics on Traditional High-Performance Computing Systems Using Two-Level Storage" (2016). All Dissertations. 2318. https://tigerprints.clemson.edu/all_dissertations/2318 A dissertation presented to the Graduate School of Clemson University in partial fulfillment of the requirements for the degree Doctor of Philosophy in Computer Science, by Pengfei Xuan, December 2016. Accepted by: Dr. Feng Luo (committee chair), Dr. Pradip Srimani, Dr. Rong Ge, Dr. Jim Martin. Abstract: High-performance computing (HPC) clusters, which consist of large numbers of compute nodes, have traditionally been widely employed in industry and academia to run diverse compute-intensive applications. In recent years, the revolution in data-driven science has produced large volumes of data, often terabytes or petabytes in size, and has driven exponential growth in data-intensive applications. Data-intensive computing presents new challenges to HPC clusters because of its different workload characteristics and optimization objectives. One of these challenges is how to efficiently integrate software frameworks developed for big data analytics, such as Hadoop and Spark, with traditional HPC systems so that both data-intensive and compute-intensive workloads are supported.
Design of Store-And-Forward Servers for Digital Media Distribution University of Amsterdam Master of Science in System and Network Engineering
Design of store-and-forward servers for digital media distribution. University of Amsterdam, Master of Science in System and Network Engineering, class of 2006–2007. Daniël Sánchez ([email protected]), 27 August 2007. Abstract: Production of high-quality digital media is increasing in both the commercial and the academic world. This content needs to be distributed to end users on demand and efficiently. Initiatives like CineGrid [1] push the limit by looking at the creation of content-distribution centres connected through dedicated optical circuits. The research question of this project is the following: "What is the optimal architecture for the (CineGrid) storage systems that store and forward content files of a size of hundreds of GBs?" First, I surveyed the current situation: the Rembrandt cluster nodes [16] are used in the storage architecture, and all data has to be transferred to the nodes manually via FTP, which is undesirable because it makes administration difficult. I therefore drew up a list of criteria for the new storage architecture; the most important are bandwidth (6.4 Gb/s) and storage space (31.2 TB a year, expandable). Based on these criteria I compared open-source distributed parallel file systems; Lustre and GlusterFS turned out to be the best matches. I then proposed two architectures built on these file systems: the first consists of cluster nodes only, the second of cluster nodes plus a SAN. In the end, it is recommended to install GlusterFS following the first architecture on the existing DAS-3 nodes [15], with Ethernet as the interconnect network.
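The bandwidth criterion can be sanity-checked with back-of-the-envelope arithmetic. The helper below computes a lower bound on transfer time over a dedicated circuit (the 500 GB file size is an illustrative value in the "hundreds of GBs" range mentioned in the research question, not a figure from the thesis):

```python
def transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Lower bound on transfer time: bytes -> bits, divided by link rate.

    Protocol, file-system, and disk overhead are ignored, so real transfers
    take longer than this.
    """
    return size_bytes * 8 / (link_gbps * 1e9)

# A hypothetical 500 GB content file over the 6.4 Gb/s target bandwidth:
seconds = transfer_seconds(500e9, 6.4)  # 625.0 s, roughly 10.4 minutes
```

At the stated 6.4 Gb/s, moving one such file takes on the order of ten minutes, which is why the storage back end itself must sustain that rate rather than bottleneck below it.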