Large Synoptic Survey Telescope (LSST) Site Specific Infrastructure Estimation Explanation

Mike Freemon and Steve Pietrowicz    LDM-143    7/17/2011

Change Record

Version  Date       Description                                               Owner name
1        5/13/2006  Initial version (as Document-1684)                        Mike Freemon
2        9/27/2006  General updates (as Document-1684)                        Mike Freemon
3        9/7/2007   General updates (as Document-1684)                        Mike Freemon
4        7/17/2011  General updates (as Document-1684)                        Mike Freemon
5        4/11/2012  Modified rates for power, cooling, floorspace, shipping   Mike Freemon

The LSST Site Specific Infrastructure Estimation Explanation

This document provides explanations and the basis for estimates for the technology predictions used in LDM-144 “Site Specific Infrastructure Estimation Model.”

The supporting materials referenced in this document are stored in Collection-974.

1 Overview of Sizing Model and Inputs Into LDM-144

Figure 1. The structure and relationships among the components of the DM Sizing Model

2 Data Flow Among the Sheets Within LDM-144

3 Policies

3.1 Ramp Up

The ramp-up policy during the Commissioning phase of Construction is described in LDM-129. Briefly, in 2018 we acquire and install the computing infrastructure needed to support Commissioning, for which we use the same sizing as that for the first year of Operations.

3.2 Replacement Policy

Compute Nodes        5 years
Disk Drives          3 years
Tape Media           5 years
Tape Drives          3 years
Tape Library System  Once at Year 5

3.3 Storage Overheads

RAID6 8+2   20%
Filesystem  10%

3.4 Spares (hardware failures)

This is margin for hardware failures; it accounts for the fact that, at any given point in time, some number of nodes and drives will be out of service due to hardware failures.

Compute Nodes  3% of nodes
Disk Drives    3% of drives
Tape Media     3% of tapes

3.5 Extra Capacity

Disk  10% of TB
Tape  10% of TB

3.6 Additional Margin

This is additional margin to account for inadequate algorithmic performance on future hardware.

Compute algorithms 50% of TF

3.7 Multiple Copies for Data Protection and Disaster Recovery

Single tape copy at BaseSite
Dual tape copies at ArchSite (one goes offsite for disaster recovery)

See LDM-129 for further details.

4 Key Formulas

This section describes the key formulas used in LDM-144.

Some of these formulas are interrelated. For example, establishing the minimum required number of nodes or drives typically involves evaluating several formulas, each based on a different potentially constraining resource, and then taking the maximum of the set as the minimum needed.
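To illustrate the pattern (this is a sketch, not the LDM-144 implementation), the following Python fragment applies the max-of-constraints approach to the compute-node formulas of sections 4.1 and 4.2; the numeric inputs are hypothetical placeholders.

```python
import math

def compute_nodes_required(sustained_tf_required, sustained_tf_per_node,
                           mem_bw_required_gbs, mem_bw_per_node_gbs):
    """Minimum node count: evaluate each constraining resource, take the max."""
    by_teraflops = sustained_tf_required / sustained_tf_per_node   # section 4.1
    by_memory_bw = mem_bw_required_gbs / mem_bw_per_node_gbs       # section 4.2
    return math.ceil(max(by_teraflops, by_memory_bw))

# Placeholder inputs, for illustration only
print(compute_nodes_required(sustained_tf_required=300.0,
                             sustained_tf_per_node=0.5,
                             mem_bw_required_gbs=40000.0,
                             mem_bw_per_node_gbs=67.6))
```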

4.1 Compute Nodes: Teraflops Required

(number of compute nodes) >= (sustained TF required) / (sustained TF per node)

4.2 Compute Nodes: Bandwidth to Memory

(number of compute nodes) >= (total memory bandwidth required) / (memory bandwidth per node)

4.3 Database Nodes: Teraflops Required

(number of database nodes) >= (sustained TF required) / (sustained TF per node)

4.4 Database Nodes: Bandwidth to Memory

(number of database nodes) >= (total memory bandwidth required) / (memory bandwidth per node)

4.5 Database Nodes: Disk Bandwidth Per Node (Local Drives)

(number of database nodes) >= (total disk bandwidth required) / (disk bandwidth per node)

where the disk bandwidth per node is a scaled function of PCIe bandwidth

4.6 Disk Drives: Capacity

(number of disk drives) >= (total capacity required) / (capacity per disk drive)

4.7 Disk Drives and Controllers (Image Storage): Bandwidth to Disk

(number of disk controllers) = (total aggregate bandwidth required) / (bandwidth per controller)

(number of disks) = MAX of A and B
where
  A = (total aggregate bandwidth required) / (sequential bandwidth per drive)
  B = (number of controllers) * (drives required per controller)

4.8 GPFS NSDs

(number of NSDs) = MAX of A and B
where
  A = (total storage capacity required) / (capacity supported per NSD)
  B = (total bandwidth) / (bandwidth per NSD)

4.9 Disk Drives (Database Nodes): Aggregate Number of Local Drives

(number of disk drives) >= A + B
where
  A = (total disk bandwidth required) / (sequential disk bandwidth per drive)
  B = (total IOPS required) / (IOPS per drive)

4.10 Disk Drives (Database Nodes): Minimum 2 Local Drives

There will be a minimum of two local drives per database node.
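A minimal sketch (again illustrative, with placeholder inputs) of how sections 4.9 and 4.10 combine: the bandwidth-driven and IOPS-driven drive counts are summed, and the result is floored at two local drives per database node.

```python
import math

def db_local_drives(total_disk_bw_mbs, seq_bw_per_drive_mbs,
                    total_iops, iops_per_drive, num_db_nodes):
    by_bandwidth = total_disk_bw_mbs / seq_bw_per_drive_mbs   # section 4.9, term A
    by_iops = total_iops / iops_per_drive                     # section 4.9, term B
    total = math.ceil(by_bandwidth + by_iops)
    return max(total, 2 * num_db_nodes)                       # section 4.10 minimum

# Placeholder inputs, for illustration only
print(db_local_drives(200000.0, 71.0, 90000, 90, num_db_nodes=150))
```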

4.11 Tape Media: Capacity

(number of tapes) >= (total capacity required) / (capacity per tape)

4.12 Tape Drives

(number of tape drives) = (total tape bandwidth required) / (bandwidth per tape drive)

4.13 HPSS Movers

(number of movers) = MAX of A and B
where
  A = (number of tape drives) / (tape drives per mover)
  B = (total bandwidth required) / (bandwidth per mover)

4.14 HPSS Core Servers

(number of core servers) = 2

This is flat over time.

4.15 10GigE Switches

(number of switches) = MAX of A and B
where
  A = (total number of ports required) / (ports per switch)
  B = (total bandwidth required) / (bandwidth per switch)

Note: The details of the 10/40/80 end-point switch may alter this formulation.

4.16 Power Cost

(cost for the year) = (kW on-the-floor) * (rate per kWh) * 24 * 365

4.17 Cooling Cost

(cost for the year) = (mmbtu) * (rate per mmbtu) * 24 * 365
where
  mmbtu = btu / 1000000
  btu = watts * 3.412
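A short sketch of the annual power and cooling cost formulas (sections 4.16 and 4.17), using the 2011 Archive Site rates from section 6.1.1 and an assumed 100 kW on-the-floor load purely as an example:

```python
def annual_power_cost(kw_on_floor, rate_per_kwh):
    # Section 4.16: kW * rate * hours per year
    return kw_on_floor * rate_per_kwh * 24 * 365

def annual_cooling_cost(watts_on_floor, rate_per_mmbtu):
    # Section 4.17: convert watts to MMBTU/hour, then multiply by rate and hours
    btu_per_hour = watts_on_floor * 3.412
    mmbtu_per_hour = btu_per_hour / 1_000_000
    return mmbtu_per_hour * rate_per_mmbtu * 24 * 365

# Example: assumed 100 kW load at the 2011 Archive Site rates (section 6.1.1)
print(round(annual_power_cost(100.0, 0.0791)))        # power
print(round(annual_cooling_cost(100_000.0, 12.89)))   # cooling
```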

4.18 Cooling Connection Fee

Once for the lifetime of the project, paid during Commissioning.

(one-time cost) = ((high water MW) * 0.3412 / 12) * (rate per ton)
where
  high water MW = (high water watts) / 1000000
  high water watts = high-water mark for watts over all the years of Operations

5 Selection of Disk Drive Types

At any particular point in time, disk drives are available in a range of capacities and prices. Optimizing for cost per TB requires selecting a different price point than optimizing for cost per drive. In LDM-144, the “InputTechPredictionsDiskDrives” sheet implements that logic using the technology prediction for disk drives based upon when leading edge drives become available. We assume a 15% drop in price each year for a particular type of drive at a particular capacity, and that drives at a particular capacity are only available for 5 years. The appropriate results are then used for the drives described in this section.
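The sketch below (not the actual sheet logic) illustrates the pricing assumptions just described: each capacity point loses 15% of its price per year and is sold for only five years, and the selection optimizes either cost per TB (image storage) or cost per drive (database nodes). The capacity points, introduction years, and prices are hypothetical placeholders.

```python
ANNUAL_PRICE_DROP = 0.15   # 15% price decline per year at a given capacity
YEARS_AVAILABLE = 5        # a given capacity is only on the market for 5 years

# (capacity_tb, introduction_year, introduction_price_usd) -- placeholder values
DRIVE_MODELS = [
    (2, 2010, 120.0),
    (3, 2011, 160.0),
    (4, 2012, 200.0),
]

def price_in_year(intro_price, intro_year, year):
    """Price after the annual decline, or None if no longer available."""
    age = year - intro_year
    if age < 0 or age >= YEARS_AVAILABLE:
        return None
    return intro_price * (1.0 - ANNUAL_PRICE_DROP) ** age

def select_drive(year, optimize="cost_per_tb"):
    """Cheapest available drive by cost/TB (image storage) or cost/drive (database)."""
    best = None
    for capacity_tb, intro_year, intro_price in DRIVE_MODELS:
        price = price_in_year(intro_price, intro_year, year)
        if price is None:
            continue
        metric = price / capacity_tb if optimize == "cost_per_tb" else price
        if best is None or metric < best[0]:
            best = (metric, capacity_tb, round(price, 2))
    return best

print(select_drive(2013, optimize="cost_per_tb"))
print(select_drive(2013, optimize="cost_per_drive"))
```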

5.1 Image Storage

Disk drives for image storage sit behind disk controllers in a RAID configuration. Manufacturers warn against using commodity SATA drives in such environments, based on considerations such as failure rates caused by heavy duty cycles and time-limited error recovery (TLER) settings. Experience using such devices in RAID configurations supports those warnings. Therefore, we select Enterprise SATA drives for image storage, and optimize for the lowest cost per unit of capacity.

SAS drives are not used as sequential bandwidth is the primary motivation for the drive selection, and SATA provides a more economical solution.

5.2 Database Storage

The disk drives for the database nodes are local, i.e., they are physically contained inside the database worker node and are directly attached. Unlike most database servers, where IOPS is the primary consideration, sequential bandwidth is the driving constraint in our qserv-based database servers. Since these are local drives, and since they are running in a shared-nothing environment where the normal operating procedure is to take a failing node out of service without end-user impact, we do not require RAID or other fault-tolerant solutions at the physical infrastructure layer. Therefore, we strive to optimize for the cheapest cost per drive, and so select consumer SATA drives for the database nodes.

SAS drives are not used as sequential bandwidth is the primary motivation for the drive selection, and SATA provides a more economical solution.

6 Rates and Other Input

6.1 Power and Cooling Rates

6.1.1 Archive Site

The power rate for the University of Illinois for 2011 is $0.0791 per kWh.

The cooling rate for the University of Illinois for 2011 is $12.89 per mmbtu.

See Document-12991, which is also available at:
http://www.energymanagement.illinois.edu/pdfs/FY12UtilityRates.pdf

6.1.2 Base Site

The power rate for La Serena is $0.1623 per kWh (USD).

The cooling rate for La Serena is $25.20 per mmbtu (USD).

See Document-11758.

Figure 2. Historical cost for power in La Serena, Chile.

6.2 Floorspace Costs

6.2.1 Archive Site

The floorspace rate for the National Petascale Computing Facility at the University of Illinois for 2011 is $175/sqft/year.

6.2.2 Base Site

Per Jeff Kantor and Ron Lambert, the lease costs for the Base Site are not part of Data Management.

6.3 Shipping Costs

The shipping rate for 2011 is $11.40 USD per pound.

See Document-11838 for a 2011 FedEx quote. When comparing rate information from previous years, we find that rates have risen by 13-14% per year between 2007 and 2011. However, this model contains base year costs only -- all escalation is done in PMCS.

6.4 Academic and Non-Profit Discounts

The rule-of-thumb for vendor discounts for academic institutions is 35% from list prices. We assume similar pricing for non-profits and volume purchases. This can vary widely for any particular acquisition.

Note that some of the pricing estimates in LDM-144 already have these discounts embedded in them, so this factor is not applied across the board in LDM-144.

7 Additional Descriptions

7.1 Description of Barebones Nodes

Each node includes a smallish single local drive for O/S and swap.

Traditionally these are dual small 10K SAS drives in a RAID1 (mirror), plus another pair of local drives for swap on a separate disk controller, but we are studying whether that is necessary given our approach to failure scenarios. Our standard method of operation for these kinds of failures will be to simply take the node out of service, repair it offline, and then place the node back into service. This will be a routine occurrence, and so we may not need or want multiple drives with RAID1 within each node.

8 Computing

8.1 Gigaflops per Core (Peak)

8.1.1 Trend

Year  Gigaflops per Core
2011  12.0
2012  12.5
2013  13.0
2014  13.5
2015  14.0
2016  14.6
2017  15.2
2018  15.8
2019  16.4
2020  17.1
2021  17.8
2022  18.5
2023  19.2
2024  20.0
2025  20.8
2026  21.6
2027  22.5
2028  23.4
2029  24.3

8.1.2 Description

Core speed is expected to remain constant at around 3 GHz, where it has been since 2005. Vendors can be expected to make additional instructions and capabilities available, but we expect those to be incremental, and we cannot automatically assume LSST codes will be able to leverage those new capabilities. Taking all of this into consideration, we model a 4% per year increase in per-core performance.
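For reference, the tabulated values above can be regenerated with a simple compound-growth calculation; this sketch is illustrative only.

```python
def gflops_per_core(year, base_year=2011, base_gflops=12.0, annual_growth=0.04):
    """Peak gigaflops per core: 12.0 in 2011, growing 4% per year."""
    return base_gflops * (1.0 + annual_growth) ** (year - base_year)

# Reproduces the trend table (rounded to one decimal place)
print({year: round(gflops_per_core(year), 1) for year in range(2011, 2030)})
```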

8.1.3 References

[1] Document-11526 An Overview of Exascale Architecture Challenges
[2] Document-11570 Intel Pins Exascale Dreams to Knights Ferry

8.2 Cores per CPU Chip

8.2.1 Trend

Year  Number of Cores
2011  6
2012  7
2013  8
2014  10
2015  12
2016  14
2017  17
2018  20
2019  24
2020  29
2021  34
2022  40
2023  48
2024  57
2025  68
2026  81
2027  96
2028  114
2029  136

8.2.2 Description

The number of cores per chip doubles every four years. While extremely large core counts are expected in processors such as Intel's Knights Ferry, mainstream adoption is not expected until 2018 [3].
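The core-count table above follows from that doubling rule; a small illustrative check:

```python
def cores_per_chip(year, base_year=2011, base_cores=6, doubling_period=4):
    """Cores per CPU chip: 6 in 2011, doubling every 4 years, rounded to whole cores."""
    return round(base_cores * 2 ** ((year - base_year) / doubling_period))

print([cores_per_chip(year) for year in range(2011, 2030)])
```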

8.2.3 References

[1] Document-11510 International Solid-State Circuits Conference 2011 Trends Report
[2] Document-11511 Assessing Trends Over Time in Performance, Costs, and Energy Use For Servers
[3] Document-11570 Intel Pins Exascale Dreams to Knights Ferry

8.3 Bandwidth to Memory per Node

8.3.1 Trend

Year  Bandwidth to Memory per Node (GB/s)
2011  25.6
2012  29.4
2013  33.8
2014  38.8
2015  44.6
2016  51.2
2017  58.8
2018  67.6
2019  77.6
2020  89.1
2021  102.4
2022  117.6
2023  135.1
2024  155.2
2025  178.3
2026  204.8
2027  235.3
2028  270.2
2029  310.4

8.3.2 Description

This is Front-Side Bus (FSB) / QPI bandwidth. The trend is an initial value of 25.6 GB/s in 2011, doubling every 5 years.

8.4 System Bus Bandwidth per Node

8.4.1 Trend

Year  System Bus Bandwidth per Node (GB/s)
2011  7.0
2012  8.0
2013  9.2
2014  10.6
2015  12.1
2016  13.9
2017  16.0
2018  18.4
2019  21.1
2020  24.3
2021  27.9
2022  32.0
2023  36.8
2024  42.2
2025  48.5
2026  55.7
2027  64.0
2028  73.5
2029  84.5

8.4.2 Description

PCIe v3.0 was finalized in 2011, with products expected in 2012. We assume an initial value of 8 GB/s in 2012, doubling every 5 years thereafter. This represents the theoretical peak bandwidth for the system bus, and is used to scale the actual expected maximum bandwidth (see the next section).

8.4.3 References [1] Document-11542 PCI Express

8.5 Disk Bandwidth per Node

8.5.1 Trend

Year  Disk Bandwidth per Node (GB/s)
2011  3.5
2012  4.0
2013  4.6
2014  5.3
2015  6.1
2016  7.0
2017  8.0
2018  9.2
2019  10.6
2020  12.1
2021  13.9
2022  16.0
2023  18.4
2024  21.1
2025  24.3
2026  27.9
2027  32.0
2028  36.8
2029  42.2

8.5.2 Description

The study in [1] found that the system bus (PCIe v2) did not impose any bottleneck on I/O traffic. They achieved 3.2 GB/s of disk bandwidth and speculate that it would have gone higher with better RAID cards; the RAID cards were their bottleneck. However, it is unrealistic to believe that actual disk I/O could reach the theoretical peak in practice. Therefore, we adopt a model of using 1/2 of the PCIe v2 theoretical peak for this system attribute. For 2012, that translates to an 8 GB/s PCIe peak and a 4 GB/s maximum disk bandwidth per node.

8.5.3 References

[1] Document-11675 Tom's Hardware: The 3GB Project Revisited

8.6 Cost per CPU

8.6.1 Trend The trend for cost per CPU chip is $996 and invariant over time.

8.6.2 Description

We use the Xeon X5650 as the reference model for the type of processor most applicable to our systems. It is a 6-core, 3 GHz processor. Additional specifications, and a comparison with other models, are shown below:

Processor     Cores  Clock             Cache  TDP    Price
Intel X5650   6      3.06 GHz (turbo)  12M    95W    $996
Intel L5640   6      2.8 GHz (turbo)   12M    60W    $996
Intel E5649   6      2.93 GHz (turbo)  12M    80W    $774
AMD 2439SE    6      2.8 GHz           3M     105W   $1229
AMD 6140      8      2.6 GHz           4M     80W    $2300

8.6.3 References

[1] http://www.intel.com/products/server/processor/xeon5000/index.htm
[2] Document-11545 Sandy Bridge
[3] Document-11546 Intel Processor Pricing June 2011

8.7 Power per CPU

8.7.1 Trend

Power per CPU chip is 95 watts for every year from 2011 through 2029 (constant over time).

8.7.2 Description

The power per chip has been steady at 95W TDP. This is closely related to clock speeds. Neither is expected to increase significantly given the physical properties of the materials and technology currently in use.

8.7.3 References

[1] Document-11511 Assessing Trends Over Time in Performance, Costs, and Energy Use For Servers
[2] Document-11545 Sandy Bridge
[3] Document-11546 Intel Processor Pricing June 2011

8.8 Compute Nodes per Rack

8.8.1 Trend 48 nodes per rack, and constant over time

8.8.2 Description A typical full size rack is 42U.

For compute nodes, we go with blade systems. Assuming a few U are consumed by power distribution units, UPS, networking, etc., leaving 30U available in each cabinet, and further assuming we can install three 10U blade chassis of 16 nodes each, we estimate 48 nodes per rack.

8.8.3 References

[1] Document-11685 server-poweredge-m1000e-tech-guidebook.pdf

8.9 Database Nodes per Rack

8.9.1 Trend 34 nodes per rack, and constant over time

8.9.2 Description A typical full size rack is 42U. Assuming we have a few U of power distribution unit, UPS, networking, etc., and further assuming we select 1U nodes, we estimate 34 nodes per rack. We cannot use a blade chassis due to the number of local disk drives in each database node.

8.10 Power per Barebones Node

8.10.1 Trend 100 watts and flat over time

8.10.2 Description Power for everything in a node except CPU chips, disk drives, and memory. This does include a single small local drive for O/S and swap space.

8.11 Cost per Barebones Node

8.11.1 Trend $1500 and flat over time.

8.11.2 Description Cost for everything in a node except CPU chips, disk drives, and memory. This does include a single small local drive (or drives) for O/S and swap space. It also includes PCIe cards, such as for 10GigE or IB. A reference barebones system is shown in [1].

8.11.3 References [1] Document-11674 Intel R1304BTL Barebones System

9 Memory

9.1 DIMMs per Node

9.1.1 Trend

There are 16 DIMM sockets per node for every year from 2011 through 2029 (constant over time).

9.1.2 Description

The growth in memory per node comes from the growth in DIMM capacity, not the number of DIMMs. Since 2006, technology changes have moved us from 512MB DIMMs being the norm to 4GB DIMMs. See the next section for Capacity per DIMM estimates.

9.1.3 References

[1] Document-11547 You Probably Don't Need More DIMMs

9.2 Capacity per DIMM

9.2.1 Trend

Year  Capacity per DIMM (GB)
2011  4.0
2012  5.0
2013  6.3
2014  8.0
2015  10.1
2016  12.7
2017  16.0
2018  20.2
2019  25.4
2020  32.0
2021  40.3
2022  50.8
2023  64.0
2024  80.6
2025  101.6
2026  128.0
2027  161.3
2028  203.2
2029  256.0

9.2.2 Description

From [1], the “# Core” line shows the projected trend of cores per socket, while the dynamic random access memory (DRAM) line shows the projected trend of capacity per socket.

The trend is that capacity per DIMM doubles every 3 years.

The initial value chosen is 4GB for 2011. Although larger-capacity DIMMs are available, the 4GB modules are the most cost-effective solution that provides the required memory per core. The prices in section 9.4 match this capacity point.

9.2.3 References

[1] Document-11670 Disaggregated Memory Architectures for Blade Servers
[2] Document-11671 DDR Memory Prices from Crucial.com
[3] Document-11549 Trends in Memory Systems

9.3 Bandwidth per DIMM

9.3.1 Trend

Year  Bandwidth per DIMM (GB/s)
2011  14.4
2012  17.1
2013  20.4
2014  24.2
2015  28.8
2016  34.2
2017  40.7
2018  48.4
2019  57.6
2020  68.5
2021  81.5
2022  96.9
2023  115.2
2024  137.0
2025  162.9
2026  193.7
2027  230.4
2028  274.0
2029  325.8

9.3.2 Description

This doubles every 4 years [1]. The initial value is based upon mid-range memory: DDR3-1800 (PC3-14400) in a dual-channel, 128-bit configuration has a full theoretical bandwidth of 28.8 GB/s. Real-world performance is closer to that of single-channel operation, or about 14.4 GB/s.

9.3.3 References [1] Document-11571 List of Device Bit Rates

9.4 Cost per DIMM

9.4.1 Trend The trend for cost per DIMM is $120 and constant over time.

9.4.2 Description

Memory prices tend to start high, fall to a trough, and then rise again because of reduced production. Memory prices in the trough remain relatively constant, regardless of the memory size. The initial cost is for a single DDR3-1333 4GB DIMM in 2011 for a representative server (Dell M710 blade), from crucial.com [1]. Additional 4GB DIMMs available at approximately $120 are shown in [2].

9.4.3 References

[1] Document-11556 Computer memory upgrades for Dell PowerEdge M710 Blade from Crucial
[2] Document-11671 DDR Memory Prices from Crucial.com

9.5 Power per DIMM

9.5.1 Trend 5 watts and flat over time

9.5.2 Description This is difficult to trend out for the reasons indicated in the references. 5 watts is a reasonable estimate for our purposes.

9.5.3 References

[1] Document-11590 DDR3 DIMM Memory Module.pdf
[2] Document-11591 How many watts DDR2 memory module.pdf

10 Disk Storage

10.1 Capacity per Drive (Consumer SATA)

10.1.1 Trend

Year  Capacity per Drive (TB)
2011  3
2012  4
2013  5
2014  6
2015  8
2016  10
2017  13
2018  16
2019  20
2020  25
2021  32
2022  40
2023  51
2024  64
2025  81
2026  102
2027  128
2028  161
2029  203

10.1.2 Description Disk space per drive doubles every 3 years for consumer SATA drives [1].

Kogge, in section 6.4.1.1 of [2], states, and we do not dispute, that “10X growth over about 6 year periods seems to have been the standard for decades.” However, that report was written in 2008, and we now see emerging dynamics with solid-state devices that may alter the picture. Alex Szalay, in a meeting with LSST in July 2011, supported 2x every three years over 10x every 6 years. We adopt the more conservative projection.

10.1.3 References

[1] Document-11568 History of Hard Disk Drives – Wikipedia
[2] Document-11672 ExaScale Computing Study: Technology Challenges in Achieving Exascale Systems, Peter Kogge, Editor & Study Lead

10.2 Sequential Bandwidth Per Drive (Consumer SATA)

10.2.1 Trend

Year  Sequential Bandwidth (MB/s)
2011  64
2012  71
2013  79
2014  89
2015  99
2016  111
2017  124
2018  139
2019  156
2020  174
2021  195
2022  218
2023  244
2024  273
2025  305
2026  342
2027  382
2028  427
2029  478

10.2.2 Description

Reference [1] gives a good description of the relationship between capacity and sequential bandwidth. Since bandwidth is proportional to linear density times rotation speed, and assuming that rotation speeds stay constant over time and that linear density is proportional to the square root of the areal density, we get:

2TB drives: 50 MB/s
4TB drives: 71 MB/s
8TB drives: 100 MB/s
16TB drives: 141 MB/s
32TB drives: 200 MB/s
64TB drives: 283 MB/s
128TB drives: 400 MB/s

These numbers are equivalent to a 40% increase every 3 years. Our initial value is 71 MB/s with 4 TB drives in 2012.

10.2.3 References

[1] Document-11673 Sequential Transfer Rates: An examination of its effects on performance, http://www.storagereview.com/articles/9910/991014str.html
[2] Document-11580 Seagate Barracuda Green 2TB Review
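As an illustrative check of the square-root scaling described in section 10.2.2 (not part of LDM-144), with the 2 TB drive at 50 MB/s as the reference point:

```python
import math

def seq_bandwidth_mbs(capacity_tb, ref_capacity_tb=2.0, ref_bandwidth_mbs=50.0):
    """Sequential bandwidth scales with the square root of capacity
    (rotation speed held constant, linear density ~ sqrt(areal density))."""
    return ref_bandwidth_mbs * math.sqrt(capacity_tb / ref_capacity_tb)

for tb in (2, 4, 8, 16, 32, 64, 128):
    print(tb, round(seq_bandwidth_mbs(tb)))   # 50, 71, 100, 141, 200, 283, 400
```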

10.3 IOPS Per Drive (Consumer SATA)

10.3.1 Trend The trend for IOPS is 90 and flat over time.

10.3.2 Description Neither rotational latency nor seek time can be significantly improved, so we expect IOPS to remain essentially flat over time.

10.3.3 References [1] Document-11569 Calculate IOPS in a Storage Array

10.4 Cost Per Drive (Consumer SATA)

10.4.1 Trend The cost for a mid-range consumer SATA drive is $80, and will remain constant over time.

10.4.2 Description Drive costs remain relatively constant as capacity increases.

10.4.3 References [1] Document-11581 Seagate Barracuda Green ST2000DL003 Internal Hard Drive

10.5 Power Per Drive (Consumer SATA)

10.5.1 Trend Power per drive for a representative consumer SATA drive is 5.8 watts, flat over time.

10.5.2 Description

Power consumption per drive is constant and is expected to remain so into the future. The reference drive is the ST2000DL003, which consumes about 5.8 watts; this figure is due to the power management of this "green" drive, and would otherwise be higher.

10.5.3 References

[1] Document-11579 Seagate Barracuda Green Consumer SATA Specification

10.6 Capacity Per Drive (Enterprise SATA)

10.6.1 Trend

Year  Capacity per Drive (TB)
2011  3
2012  4
2013  5
2014  6
2015  8
2016  10
2017  13
2018  16
2019  20
2020  25
2021  32
2022  40
2023  51
2024  64
2025  81
2026  102
2027  128
2028  161
2029  203

10.6.2 Description See the comments for Consumer SATA.

10.6.3 References See the comments for Consumer SATA.

10.7 Sequential Bandwidth Per Drive (Enterprise SATA)

10.7.1 Trend

Year  Sequential Bandwidth (MB/s)
2011  64
2012  71
2013  79
2014  89
2015  99
2016  111
2017  124
2018  139
2019  156
2020  174
2021  195
2022  218
2023  244
2024  273
2025  305
2026  342
2027  382
2028  427
2029  478

10.7.2 Description See the comments for Consumer SATA.

10.7.3 References See the comments for Consumer SATA.

10.8 IOPS Per Drive (Enterprise SATA)

10.8.1 Trend The trend for IOPS is 90 and flat over time.

10.8.2 Description Neither rotational latency nor seek time can be significantly improved, so we expect IOPS to remain essentially flat over time.

10.8.3 References

[1] Document-11569 Calculate IOPS in a Storage Array
[2] Document-11582 Hitachi Ultrastar A7K2000 2TB HDD Performance

10.9 Cost Per Drive (Enterprise SATA)

10.9.1 Trend The cost for a mid-range enterprise SATA drive is $220, and will remain constant over time.

10.9.2 Description Drive costs remain relatively constant as capacity increases. The representative drive is the Hitachi Ultrastar A7K2000, with a 24x7 duty cycle [1].

10.9.3 References [1] Document-11560 HITACHI Ultrastar A7K2000 Hard Drive

10.10 Power Per Drive (Enterprise SATA)

10.10.1 Trend Power per drive for a representative enterprise SATA drive is about 11 watts.

10.10.2 Description Power consumption per drive is constant and is expected to remain so into the future. The reference drive is the Hitachi Ultrastar A7K2000.

10.10.3 References [1] Document-11561 Ultrastar A7K2000 Specification

10.11 Disk Drive per Rack

10.11.1 Trend 360 disk drives per rack

10.11.2 Description 36 SATA drives per 4U of rack space (10 such units per rack gives 360 drives).

10.11.3 References [1] Document-11839 LSI NetApp IBM DS3500 [2] Sun X4540

11 Disk Controllers

11.1 Bandwidth per Controller

11.1.1 Trend 1 GB/s per controller for the DS3500, rising at 2x every 5 years for mid-range controllers of this class [1]

11.1.2 Description The DS3500 is a reasonable balance between low-end controllers with unproven reliability and high-end, very expensive controllers.

11.1.3 References

[1] Document-11839 LSI NetApp IBM DS3500
[2] DDN SFA10K Infiniband

11.2 Drives Required per Controller

11.2.1 Trend 24 drives behind each controller in order to achieve the rated bandwidth.

11.2.2 Description The DS3500 is a reasonable balance between low-end controllers with unproven reliability and high-end, very expensive controllers.

11.2.3 References [1] Document-11839 LSI NetApp IBM DS3500

11.3 Cost per Controller

11.3.1 Trend $10K for a midrange controller, such as [1].

11.3.2 Description The DS3500 is a reasonable balance between low-end controllers with unproven reliability and high-end, very expensive controllers.

11.3.3 References [1] Document-11839 LSI NetApp IBM DS3500

12 GPFS

12.1 Capacity Supported per NSD

12.1.1 Trend 500 TB per NSD server

12.1.2 Description Capacity supported by Linux 2.6 64-bit kernels is >2TB, up to the device driver limit, so this really depends on the configuration of the NSD.

12.1.3 References

See section 5.6 in the following document:
http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=%2Fcom.ibm.cluster.gpfs.doc%2Fgpfs_faqs%2Fgpfsclustersfaq.html

12.2 Hardware Cost per NSD

12.2.1 Trend $12K per server for hardware

12.2.2 Description This is the estimated price for the hardware needed to serve as a GPFS NSD.

12.3 Software Cost per NSD

12.3.1 Trend $4K per server for software

12.3.2 Description This is special pricing from IBM due to the University of Illinois’ campus licensing agreement, and includes the initial purchase plus 3 years of maintenance.

Note that this pricing is nearly the same as what we would pay for support of a Lustre installation.

Licenses are priced by IBM on a per-processor-core basis. A dual-CPU system with 8 cores in each CPU, but with only 4 cores in each CPU dedicated to GPFS, would require 8 client licenses.

12.3.3 References

[1] See section 1.7 of:
http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=%2Fcom.ibm.cluster.gpfs.doc%2Fgpfs_faqs%2Fgpfsclustersfaq.html

12.4 Software Cost per GPFS Client

12.4.1 Trend $0

12.4.2 Description GPFS clients are free.

This is special pricing from IBM due to the University of Illinois’ campus licensing agreement, and includes the initial purchase plus 3 years of maintenance.

13 Tape Storage

13.1 Capacity Per Tape

13.1.1 Trend

Year  Type    Capacity (TB)
2011  LTO-5   1.5
2012  LTO-5   2.1
2013  LTO-6   3.0
2014  LTO-6   4.2
2015  LTO-7   6.0
2016  LTO-7   6.9
2017  LTO-8   7.9
2018  LTO-8   9.1
2019  LTO-9   10.4
2020  LTO-9   12.0
2021  LTO-10  13.8
2022  LTO-10  15.8
2023  LTO-11  18.2
2024  LTO-11  20.9
2025  LTO-12  24.0
2026  LTO-12  27.6
2027  LTO-13  31.7
2028  LTO-13  36.4
2029  LTO-14  41.8

13.1.2 Description Tape capacity doubles every 2 years from 2007 to 2015 [1] [2].

Since we have no reliable information for dates beyond 2015, we adopt a less aggressive performance curve beyond that point, namely that tape capacity doubles every 5 years. These are uncompressed capacities.
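A small sketch of this piecewise assumption (doubling every 2 years through 2015, every 5 years thereafter), which reproduces the capacities tabulated above; illustrative only.

```python
def tape_capacity_tb(year):
    """Uncompressed capacity per tape cartridge."""
    if year <= 2015:
        # 1.5 TB (LTO-5) in 2011, doubling every 2 years through 2015
        return 1.5 * 2 ** ((year - 2011) / 2)
    # 6.0 TB in 2015, doubling every 5 years thereafter
    return 6.0 * 2 ** ((year - 2015) / 5)

print({year: round(tape_capacity_tb(year), 1) for year in range(2011, 2030)})
```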

13.1.3 References

[1] Document-11533 Two new LTO tape gens announced 2010
[2] Sun roadmap, Oct 2005
[3] NCSA's T2 proposal

13.2 Cost per Tape

13.2.1 Trend The cost per tape is $70 and constant over time.

13.2.2 Description

As a starting point we used 400GB for $60 based on a recent NCSA tape purchase (2007). This is actually less expensive ($150/TB) than the estimate we used before ($200/TB) from the vendor roadmap.

It is estimated (based on historical data) that, for a given tape technology, prices will decrease by 20% each year.

The current (2011) street price for LTO-5 tapes is $70 in quantities of one; the price goes down for larger quantities.

13.2.3 References [1] Document-11564 LTO Ultrium 5 Tape Cartridge 1 Pack

13.3 Cost of Tape Library

13.3.1 Trend The acquisition cost for the tape library system is $375K, with maintenance at $75K/year.

The one-time software licensing for HPSS is $500K, which includes the movers, core servers, and all clients. The annual software licensing for HPSS is $150K per year.

13.3.2 Description This library includes 8000 slots, no media, and no drives. Media and drives are purchased separately.

Older information: We received a vendor quote of $1.2M for the license and $265K/year for maintenance for a 30PB library. We will upgrade the license once the total size of the archive goes over 30PB, which will most likely happen sometime in 2018-2019.

13.4 Bandwidth Per Tape Drive

13.4.1 Trend

Year  Bandwidth per Tape Drive (MB/s)
2011  140
2012  140
2013  168
2014  168
2015  202
2016  202
2017  242
2018  242
2019  290
2020  290
2021  348
2022  348
2023  418
2024  418
2025  502
2026  502
2027  602
2028  602
2029  722
2030  722

13.4.2 Description

The current (2011) rate is 280 MB/s compressed, increasing 20% per LTO generation. Note that LTO-5 is 15% faster than LTO-4 and 75% faster than LTO-3 (see the Fujitsu page).

Vendors typically assume a 50% compression ratio. We assume no compression. The estimates above are adjusted accordingly.
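A sketch of that adjustment: the vendor's compressed rate is halved to remove the assumed 50% compression, then increased 20% with each two-year LTO generation. This reproduces the tabulated values and is illustrative only.

```python
def tape_drive_bandwidth_mbs(year, base_year=2011, base_uncompressed_mbs=140.0,
                             gain_per_generation=0.20, years_per_generation=2):
    """Uncompressed tape drive bandwidth: 280 MB/s compressed in 2011 -> 140 MB/s
    uncompressed, then +20% for each new LTO generation (one every two years)."""
    generations = (year - base_year) // years_per_generation
    return base_uncompressed_mbs * (1.0 + gain_per_generation) ** generations

print({year: round(tape_drive_bandwidth_mbs(year)) for year in range(2011, 2031)})
```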

13.4.3 References

[1] Document-11565 LTO Bandwidth
[2] Document-11566 LTO-5 Tape Drives

13.5 Cost Per Tape Drive

13.5.1 Trend The trend is $8K per tape drive and constant over time.

13.5.2 Description This is the estimated price for a tape drive for a tape library capable of meeting LSST's requirements.

13.5.3 References

[1] LTO-5 HP StorageWorks Ultrium 3000 SAS Internal drive

13.6 Tape Drives per HPSS Mover

13.6.1 Trend The trend is 8 tape drives per HPSS mover and constant over time.

13.6.2 Description Based upon our current design, we estimate that we’ll need 7-8 drives per mover. The outside range is around 5-10, so this has a small effect on budget estimates.

13.7 Hardware Cost per HPSS Mover

13.7.1 Trend $10-15K per mover for hardware.

13.7.2 Description This is an estimate of the cost of the hardware needed to serve as an HPSS mover.

13.8 Cost for 2 HPSS Core Servers

13.8.1 Trend We need 2 servers, fixed over time. $80K each, hardware only. The software license is free (bundled with the HPSS software licensing above).

13.8.2 Description These are “beefy” machines that manage the HPSS metadata and control the robots.

14 Networking

14.1 Bandwidth per Infiniband Port

14.1.1 Trend Initial value in 2011 is 10 Gbps, or 1 GB/s (QDR), and increases at the rate of 6x every 8 years.

14.1.2 Description

14.1.3 References [1] Document-15536 IBTA – Infiniband Trade Association

14.2 Ports per Infiniband Edge Switch

14.2.1 Trend 18 and flat over time.

14.2.2 Description This is the estimate for the number of ports per switch.

14.2.3 References [1] Document-11534 Infiniband Switch Price – The Technology and its Cost of Ownership

14.3 Cost per Infiniband Edge Switch

14.3.1 Trend The current price for an edge switch is $12K.

14.3.2 Description Assume 36-port unoptimized switches, at $340 per IB edge port and $1000 per core switch port

14.3.3 References [1] Document-11534 Infiniband Switch Price – The Technology and its Cost of Ownership

14.4 Cost per Infiniband Core Switch

14.4.1 Trend The current price for a core switch is $36K.

14.4.2 Description Assume 36-port unoptimized switches, at $340 per IB edge port and $1000 per core switch port

14.4.3 References [1] Document-11534 Infiniband Switch Price – The Technology and its Cost of Ownership

14.5 Bandwidth per 10GigE Switch

14.5.1 Trend 80 GB/s per switch, and doubles every 5 years.

14.5.2 Description Based on Juniper Ex4500 40-port switch, which does 900 Mpps, or, at 90 bytes per frame, about 80 GB/s.
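The arithmetic behind that figure, for reference (900 Mpps at the stated 90 bytes per frame):

```python
packets_per_second = 900e6     # Juniper Ex4500: 900 Mpps
bytes_per_frame = 90           # frame size used in the estimate above
gbytes_per_second = packets_per_second * bytes_per_frame / 1e9
print(gbytes_per_second)       # 81.0, rounded to ~80 GB/s in the model
```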

14.5.3 References [1] Document-11575 10gb-switch-compare-6-2011.xlsx

14.6 Cost per 10GigE Switch

14.6.1 Trend $30K per switch

14.6.2 Description

The Juniper Ex4500 40-port switch is $29,000 list. For reference, a mid-range Juniper 8200 64-port switch is $380,000, and a high-end Juniper Ex8216 128-port switch is $730,000.

Our estimate is based on the Juniper Ex4500 40-port switch. It has lower throughput than the next class up, the Juniper 8200, but the 8200 costs over 10 times as much. We expect throughput rates to go higher over time, but this price point should remain relatively steady.

14.6.3 References [1] Document-11575 10gb-switch-compare-6-2011.xlsx

14.7 Cost per UPS

14.7.1 Trend $3K per rack UPS unit.

14.7.2 Description This is the estimated cost for each rack-based UPS unit, to ensure a controlled shutdown (and flush of data buffers to disk) in the event of a facility power outage.