Summer Student Project Report
Dimitris Kalimeris
National and Kapodistrian University of Athens
June - September 2014

Abstract

This report outlines two projects carried out during a three-month summer internship at CERN. The first project dealt with the Worldwide LHC Computing Grid (WLCG) and its information system. The information system currently conforms to a schema called GLUE and is evolving towards a new version, GLUE2. The aim of the project was to develop and adapt the current information system of the WLCG, as used by the Large Scale Storage Systems at CERN (CASTOR and EOS), to the new GLUE2 schema. In the second project we investigated different RAID configurations with a view to improving the performance of CERN's disk systems in the future. RAID 1, which is currently in use, is no longer an option because of its limited performance and high cost. We therefore tried to identify RAID configurations that improve performance while simultaneously decreasing cost.

1 Information-provider scripts for GLUE2

1.1 Introduction

The Worldwide LHC Computing Grid (WLCG, see also reference 1) is an international collaboration consisting of a grid-based computer network infrastructure incorporating over 170 computing centres in 36 countries. It was originally designed by CERN to handle the large data volume produced by the Large Hadron Collider (LHC) experiments. This data is stored in the CERN storage systems, which are responsible for keeping more than 100 Petabytes (10^5 Terabytes) of data available to the physics community. The data is also replicated from CERN to the main computing centres within the WLCG.

With such a diversity of sites and storage systems, a common information system is crucial to guarantee interoperation among the different computing centres. The current WLCG information system conforms to a schema named GLUE, which is evolving towards a new version: GLUE2. Our specific task was to develop and adapt the current information system used by the Large Scale Storage Systems at CERN (CASTOR and EOS) to the new GLUE2 schema. The reader can consult references 2, 3 and 4 for more information about GLUE and GLUE2.

1.2 Information Providers for CASTOR and EOS

The information system of CERN's storage systems is expected to provide the user with data such as the total storage space used by a given Virtual Organization (VO), how much of this space is online and how much is nearline, the protocols the user can use to access the data, and so on. The user can request this information through an LDAP search (LDAP is the standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network).

There are two monitoring boxes for CASTOR and two for EOS. The information providers are Perl scripts running inside those boxes; they are responsible for fetching all the required information from web pages and from the Service Level Status (SLS) system. Once they have gathered the required data, they print an LDIF file (a standard plain-text data interchange format for representing LDAP content) in a way compatible with the GLUE schema. Our task was to update the Perl scripts so that they display the information in a way compatible with the GLUE2 schema.

The information providers work on two levels (a minimal sketch of such a provider is given after this list):

1. Gather information about the CASTOR and EOS services in the way mentioned above.

2. Arrange this information in classes and print one LDIF file with the attributes of every class.
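To make the two levels concrete, the following is a minimal sketch of such a provider, assuming a hypothetical monitoring URL and page format; the DN and attribute names only approximate the GLUE2 LDAP rendering, and the real scripts read several CASTOR/EOS sources and publish many more classes and attributes.

    #!/usr/bin/perl
    # Minimal information-provider sketch (illustrative only).
    # The monitoring URL, the page layout and the GLUE2 attribute names
    # below are assumptions made for this example.
    use strict;
    use warnings;
    use LWP::Simple qw(get);

    # Level 1: gather a metric from a monitoring page (hypothetical endpoint).
    my $page = get('http://castor-mon.example.cern.ch/space/atlas')
        // die "could not fetch monitoring page\n";
    my ($used) = $page =~ /UsedSpace:\s*(\d+)/;   # assumed page layout
    $used //= 0;                                  # size in GB (assumed)

    # Level 2: arrange the data in classes and print them as LDIF entries.
    my $dn = 'GLUE2ServiceID=castor.cern.ch,GLUE2GroupID=resource,o=glue';
    print "dn: $dn\n";
    print "objectClass: GLUE2StorageService\n";
    print "GLUE2ServiceID: castor.cern.ch\n";
    print "GLUE2ServiceType: storage\n";
    print "GLUE2EntityName: CASTOR\n\n";

    print "dn: GLUE2StorageServiceCapacityID=capacity.castor.cern.ch,$dn\n";
    print "objectClass: GLUE2StorageServiceCapacity\n";
    print "GLUE2StorageServiceCapacityID: capacity.castor.cern.ch\n";
    print "GLUE2StorageServiceCapacityType: online\n";
    print "GLUE2StorageServiceCapacityUsedSize: $used\n";

The actual scripts follow this pattern for every class of the schema and for each CASTOR and EOS instance, rather than for a single hard-coded service.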
There was no need to change the way the information is collected; what we did was organise it under a different structure. By this we mean that some classes that existed in GLUE no longer exist in GLUE2 (e.g. GlueSEControlProtocol), some were split into several others (e.g. GlueSE is split into GLUE2StorageService and GLUE2StorageServiceCapacity), and GLUE2 introduces new classes as well (e.g. GLUE2StorageManager). We first designed and implemented the "architecture" of the new classes. We then pass the collected information into this structure and print an LDIF file that is compatible with the GLUE2 schema.

Before putting the new information providers into production, we verified that the LDIF file was correct under GLUE2 using a tool called the GLUE validator, which confirmed the correctness of the output of the scripts.

The next step was the integration of the information providers into production. We deployed the new (GLUE2) scripts in the monitoring boxes and ran them in parallel with the old (GLUE) ones, because we want to be able to publish both GLUE and GLUE2. After some final tests of the correctness of the LDIF files in production, CERN's storage systems can now publish their information in GLUE2 as well. The complete transition from GLUE to GLUE2 is expected to happen in the future. A sketch of how the published GLUE2 information can be queried over LDAP is given below.
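The following sketch queries a BDII-style LDAP endpoint for GLUE2 storage-service objects using the Net::LDAP Perl module; the host name, port and search base are assumptions for this example (a top-level BDII typically serves GLUE2 data under o=glue on port 2170) and are not necessarily the endpoints used for CERN's storage systems.

    #!/usr/bin/perl
    # Sketch of a consumer-side LDAP query for the published GLUE2 data.
    # Host, port and search base are assumptions for this example.
    use strict;
    use warnings;
    use Net::LDAP;

    my $ldap = Net::LDAP->new('lcg-bdii.cern.ch', port => 2170)
        or die "cannot contact the information system: $@\n";
    $ldap->bind;                                 # anonymous bind

    my $result = $ldap->search(
        base   => 'o=glue',                      # GLUE2 branch
        filter => '(objectClass=GLUE2StorageService)',
    );
    die $result->error, "\n" if $result->code;

    # Print every storage service entry that the providers published.
    $_->dump for $result->entries;
    $ldap->unbind;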
1.3 Reuse of this work

In conclusion, we would like to add that there are more sites outside CERN running CASTOR/EOS (e.g. RAL, Taiwan). Although the scripts developed for this project are CERN-specific, with a little effort they can be adapted to the particularities of each site, so the transition from GLUE to GLUE2 for these sites can be done easily.

2 RAID Configurations

2.1 Introduction

RAID (Redundant Array of Inexpensive Disks, see also reference 1) is a concept developed in 1987 by David Patterson, Garth Gibson and Randy Katz as a way to use several inexpensive disks to create what looks like a single disk from the perspective of the operating system, while also achieving enhanced reliability, enhanced performance, or both. Because the disks of the array are independent, data can be read from or written to several disks at a time, which improves performance. Apart from that, in almost all RAID levels (except RAID 0) there is a way to recover the data after a disk failure and avoid a failure of the whole array. This provides data reliability.

There are seven standard levels of RAID:

1. RAID 0: splits data across drives, resulting in higher data throughput. The performance of this configuration is extremely high, but the loss of any drive in the array results in data loss. This level is commonly referred to as striping.

2. RAID 1: writes all data to two or more drives for 100% redundancy: if either drive fails, no data is lost. Compared to a single drive, RAID 1 tends to be faster on reads and slower on writes. This is a good entry-level redundant configuration. However, since an entire drive is a duplicate, the cost per megabyte is high. This level is commonly referred to as mirroring.

3. RAID 2: stripes data at the bit level instead of the block level (recall that RAID 0 stripes at the block level) and uses a Hamming code for parity computations. In RAID 2 the first bit is written on the first drive, the second bit on the second drive, and so on. A Hamming-code parity is then computed and stored either on the data disks or on a separate disk.

4. RAID 3: uses data striping at the byte level, adds parity computations and stores the parity on a dedicated parity disk.

5. RAID 4: uses data striping at the block level, adds parity computations and stores the parity on a dedicated parity disk. In other words, RAID 4 builds on RAID 0 by adding a parity disk to block-level striping.

6. RAID 5: stripes data at the block level across several drives, with the parity distributed among the drives. The parity information allows recovery from the failure of any single drive. Read performance is good, while write performance suffers somewhat because the parity must be recomputed and rewritten on every write. The low ratio of parity to data means low redundancy overhead.

7. RAID 6: is an upgrade of RAID 5: data is striped at the block level across several drives, with double parity distributed among the drives. As in RAID 5, the parity information allows recovery from the failure of any single drive. The double parity gives RAID 6 additional redundancy at the cost of lower write performance (read performance is the same), and the redundancy overhead remains low.

Note that RAID 2, 3 and 4 are no longer in use.

There are nested RAID levels too, also known as hybrid RAID, which combine two or more of the standard RAID levels to gain even better performance, additional redundancy, or both. Examples: RAID 1+0 (also written RAID 10), 5+0 (RAID 50) and 6+0 (RAID 60).

Figure 1: Hybrid RAID arrays. (a) RAID 10 array. (b) RAID 60 array.

2.2 Hardware and Software RAID arrays

Several operations take place in a RAID array, such as sending chunks of data to the appropriate disks, computing parity, hot-swapping, disk fail-over, checking read transactions to determine whether the read was successful and, if not, declaring that disk as "down", and more. All of these tasks require some sort of computation and have to be performed by a RAID controller. There are two options for RAID controllers:

1. Hardware RAID, which has a dedicated RAID controller to run the RAID application. Hardware RAID has very good performance, but it is expensive and the RAID controller is a single point of failure: if it fails, the whole RAID array fails.

2. Software RAID, which uses the CPU for the RAID chores without the need for any additional hardware. Software RAID is cheaper, but the performance is worse and processing load is added to the CPU. A toy illustration of the parity arithmetic that such a software layer performs is sketched after this list.
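To make the parity idea concrete, the toy script below performs the arithmetic that a software RAID 5 layer carries out on the CPU: the parity block of a stripe is the bytewise XOR of its data blocks, so any single lost block can be rebuilt by XOR-ing the parity with the surviving blocks. This is a sketch of the principle only, not of a real RAID implementation (block sizes, striping and disk handling are all ignored).

    #!/usr/bin/perl
    # Toy illustration of RAID 5 parity: parity = XOR of the data blocks,
    # and any single missing block is the XOR of all the remaining ones.
    use strict;
    use warnings;

    # Three "data blocks" of one stripe (4 bytes each, just for the example).
    my @blocks = ("\x11\x22\x33\x44", "\xAA\xBB\xCC\xDD", "\x01\x02\x03\x04");

    # Compute the parity block, byte by byte.
    my $parity = "\x00" x length $blocks[0];
    $parity ^= $_ for @blocks;

    # Simulate the loss of block 1 and rebuild it from the parity
    # and the surviving blocks.
    my $rebuilt = $parity;
    $rebuilt ^= $blocks[$_] for (0, 2);

    printf "parity : %s\n", unpack 'H*', $parity;
    printf "rebuilt: %s\n", unpack 'H*', $rebuilt;
    print  "block 1 recovered correctly\n" if $rebuilt eq $blocks[1];

A hardware controller performs the same computation in dedicated hardware, which is precisely why software RAID trades some CPU time and performance for the lower cost noted above.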