Disk Scrubbing Large Archival Storage Systems
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
An Analysis of Data Corruption in the Storage Stack
An Analysis of Data Corruption in the Storage Stack Lakshmi N. Bairavasundaram∗, Garth R. Goodson†, Bianca Schroeder‡ Andrea C. Arpaci-Dusseau∗, Remzi H. Arpaci-Dusseau∗ ∗University of Wisconsin-Madison †Network Appliance, Inc. ‡University of Toronto {laksh, dusseau, remzi}@cs.wisc.edu, [email protected], [email protected] Abstract latent sector errors, within disk drives [18]. Latent sector errors are detected by a drive’s internal error-correcting An important threat to reliable storage of data is silent codes (ECC) and are reported to the storage system. data corruption. In order to develop suitable protection Less well-known, however, is that current hard drives mechanisms against data corruption, it is essential to un- and controllers consist of hundreds-of-thousandsof lines derstand its characteristics. In this paper, we present the of low-level firmware code. This firmware code, along first large-scale study of data corruption. We analyze cor- with higher-level system software, has the potential for ruption instances recorded in production storage systems harboring bugs that can cause a more insidious type of containing a total of 1.53 million disk drives, over a pe- disk error – silent data corruption, where the data is riod of 41 months. We study three classes of corruption: silently corrupted with no indication from the drive that checksum mismatches, identity discrepancies, and par- an error has occurred. ity inconsistencies. We focus on checksum mismatches since they occur the most. Silent data corruptionscould lead to data loss more of- We find more than 400,000 instances of checksum ten than latent sector errors, since, unlike latent sector er- mismatches over the 41-month period. -
Data & Computer Recovery Guidelines
Data & Computer Recovery Guidelines Data & Computer Recovery Guidelines This document contains general guidelines for restoring computer operating following certain types of disasters. It should be noted these guidelines will not fit every type of disaster or every organization and that you may need to seek outside help to recover and restore your operations. This document is divided into five parts. The first part provides general guidelines which are independent of the type of disaster, the next three sections deal with issues surrounding specific disaster types (flood/water damage, power surge, and physical damage). The final section deals with general recommendations to prepare for the next disaster. General Guidelines 2. Your first step is to restore the computing equipment. These are general guidelines for recovering after any type If you do try to power on the existing equipment, it of disaster or computer failure. If you have a disaster is best to remove the hard drive(s) first to make sure recovery plan, then you should be prepared; however, the system will power on. Once you have determined there may be things that were not covered to help the system powers on, you can reinstall the hard drive you recover. This section is divided into two sections and power the system back on. Hopefully, everything (computer system recovery, data recovery) works at that point. Note: this should not be tried in the case of a water or extreme heat damage. Computer System Recovery 3. If the computer will not power on then you can either The first step is to get your physical computer systems try to fix the computer or in many cases it is easier, running again. -
Data Remanence in Non-Volatile Semiconductor Memory (Part I)
Data remanence in non-volatile semiconductor memory (Part I) Security Group Sergei Skorobogatov Web: www.cl.cam.ac.uk/~sps32/ Email: [email protected] Introduction Data remanence is the residual physical representation of data that has UV EPROM EEPROM Flash EEPROM been erased or overwritten. In non-volatile programmable devices, such as UV EPROM, EEPROM or Flash, bits are stored as charge in the floating gate of a transistor. After each erase operation, some of this charge remains. It shifts the threshold voltage (VTH) of the transistor which can be detected by the sense amplifier while reading data. Microcontrollers use a ‘protection fuse’ bit that restricts unauthorized access to on-chip memory if activated. Very often, this fuse is embedded in the main memory array. In this case, it is erased simultaneously with the memory. Better protection can be achieved if the fuse is located close to the memory but has a separate control circuit. This allows it to be permanently monitored as well as hardware protected from being erased too early, thus making sure that by the time the fuse is reset no data is left inside the memory. In some smartcards and microcontrollers, a password-protected boot- Structure, cross-section and operation modes for different memory types loader restricts firmware updates and data access to authorized users only. Usually, the on-chip operating system erases both code and data How much residual charge is left inside the memory cells memory before uploading new code, thus preventing any new after a standard erase operation? Is it possible to recover data application from accessing previously stored secrets. -
Databridge ETL Solution Datasheet
DATASHEET Extract and Transform MCP Host Data for Improved KEY FEATURES Client configuration tool for Analysis and Decision Support easy customization of table layout. Fast, well-informed business decisions require access to your organization’s key performance Dynamic before-and-after indicators residing on critical database systems. But the prospect of exposing those systems images (BI-AI) based on inevitably raises concerns around security, data integrity, cost, and performance. key change. 64-bit clients. For organizations using the Unisys ClearPath MCP server and its non-relational DMSII • Client-side management database, there’s an additional challenge: Most business intelligence tools support only console. relational databases. • Ability to run the client as a service or a daemon. The Only True ETL Solution for DMSII Data • Multi-threaded clients to That’s why businesses like yours are turning to Attachmate® DATABridge™. It’s the only increase processing speed. true Extract, Transform, Load (ETL) solution that securely integrates Unisys MCP DMSII • Support for Windows Server and non-DMSII data into a secondary system. 2012. • Secure automation of Unisys With DATABridge, you can easily integrate production data into a relational database or MCP data replication. another DMSII database located on an entirely different Unisys host system. And because • Seamless integration of DATABridge clients for Oracle and Microsoft SQL Server support a breadth of operating both DMSII and non-DMSII environments (including Windows 7, Windows Server 2012, Windows Server 2008, UNIX, data with Oracle, Microsoft SQL, and other relational AIX, SUSE Linux, and Red Hat Linux), DATABridge solutions fit seamlessly into your existing databases. infrastructure. -
Database Analyst Ii
Recruitment No.: 20.186 Date Opened: 5/25/2021 DATABASE ANALYST II SALARY: $5,794 to $8,153 monthly (26 pay periods annually) FINAL FILING DATE: We are accepting applications until closing at 5 pm, June 8, 2021 IT IS MANDATORY THAT YOU COMPLETE THE SUPPLEMENTAL QUESTIONNAIRE. YOUR APPLICATION WILL BE REJECTED IF YOU DO NOT PROVIDE ALL NECESSARY INFORMATION. THE POSITION The Human Resources Department is accepting applications for the position of Database Analyst II. The current opening is for a limited term, benefitted and full-time position in the Information Technology department, but the list may be utilized to fill future regular and full- time vacancies for the duration of the list. The term length for the current vacancy is not guaranteed but cannot exceed 36 months. The normal work schedule is Monday through Friday, 8 – 5 pm; a flex schedule may be available. The Information Technology department is looking for a full-time, limited-term Database Analyst I/II to develop and manage the City’s Open Data platform. Initiatives include tracking city council goals, presenting data related to capital improvement projects, and measuring budget performance. This position is in the Data Intelligence Division. Our team sees data as more than rows and columns, it tells stories that yield invaluable insights that help us solve problems, make better decisions, and create solutions. This position is responsible for building and maintaining systems that unlock the power of data. The successful candidate will be able to create data analytics & business -
Increasing Reliability and Fault Tolerance of a Secure Distributed Cloud Storage
Increasing reliability and fault tolerance of a secure distributed cloud storage Nikolay Kucherov1;y, Mikhail Babenko1;yy, Andrei Tchernykh2;z, Viktor Kuchukov1;zz and Irina Vashchenko1;yz 1 North-Caucasus Federal University,Stavropol,Russia 2 CICESE Research Center,Ensenada,Mexico E-mail: [email protected], [email protected], [email protected], [email protected], [email protected] Abstract. The work develops the architecture of a multi-cloud data storage system based on the principles of modular arithmetic. This modification of the data storage system allows increasing reliability of data storage and fault tolerance of the cloud system. To increase fault- tolerance, adaptive data redistribution between available servers is applied. This is possible thanks to the introduction of additional redundancy. This model allows you to restore stored data in case of failure of one or more cloud servers. It is shown how the proposed scheme will enable you to set up reliability, redundancy, and reduce overhead costs for data storage by adapting the parameters of the residual number system. 1. Introduction Currently, cloud services, Google, Amazon, Dropbox, Microsoft OneDrive, providing cloud storage, and data processing services, are gaining high popularity. The main reason for using cloud products is the convenience and accessibility of the services offered. Thanks to the use of cloud technologies, it is possible to save financial costs for maintaining and maintaining servers for storing and securing information. All problems arising during the storage and processing of data are transferred to the cloud provider [1]. Distributed infrastructure represents the conditions in which competition for resources between high-priority computing tasks of data analysis occurs regularly [2]. -
Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid-State Drives
ERRORS, MITIGATION, AND RECOVERY IN FLASH MEMORY SSDS 1 Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid-State Drives Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo, and Onur Mutlu Abstract—NAND flash memory is ubiquitous in everyday life The transistor traps charge within its floating gate, which dic- today because its capacity has continuously increased and cost has tates the threshold voltage level at which the transistor turns on. continuously decreased over decades. This positive growth is a The threshold voltage level of the floating gate is used to de- result of two key trends: (1) effective process technology scaling, termine the value of the digital data stored inside the transistor. and (2) multi-level (e.g., MLC, TLC) cell data coding. Unfortu- When manufacturing process scales down to a smaller tech- nately, the reliability of raw data stored in flash memory has also nology node, the size of each flash memory cell, and thus the continued to become more difficult to ensure, because these two trends lead to (1) fewer electrons in the flash memory cell (floating size of the transistor, decreases, which in turn reduces the gate) to represent the data and (2) larger cell-to-cell interference amount of charge that can be trapped within the floating gate. and disturbance effects. Without mitigation, worsening reliability Thus, process scaling increases storage density by enabling can reduce the lifetime of NAND flash memory. As a result, flash more cells to be placed in a given area, but it also causes relia- memory controllers in solid-state drives (SSDs) have become bility issues, which are the focus of this article. -
EEPROM Emulation
...the world's most energy friendly microcontrollers EEPROM Emulation AN0019 - Application Note Introduction This application note demonstrates a way to use the flash memory of the EFM32 to emulate single variable rewritable EEPROM memory through software. The example API provided enables reading and writing of single variables to non-volatile flash memory. The erase-rewrite algorithm distributes page erases and thereby doing wear leveling. This application note includes: • This PDF document • Source files (zip) • Example C-code • Multiple IDE projects 2013-09-16 - an0019_Rev1.09 1 www.silabs.com ...the world's most energy friendly microcontrollers 1 General Theory 1.1 EEPROM and Flash Based Memory EEPROM stands for Electrically Erasable Programmable Read-Only Memory and is a type of non- volatile memory that is byte erasable and therefore often used to store small amounts of data that must be saved when power is removed. The EFM32 microcontrollers do not include an embedded EEPROM module for byte erasable non-volatile storage, but all EFM32s do provide flash memory for non-volatile data storage. The main difference between flash memory and EEPROM is the erasable unit size. Flash memory is block-erasable which means that bytes cannot be erased individually, instead a block consisting of several bytes need to be erased at the same time. Through software however, it is possible to emulate individually erasable rewritable byte memory using block-erasable flash memory. To provide EEPROM functionality for the EFM32s in an application, there are at least two options available. The first one is to include an external EEPROM module when designing the hardware layout of the application. -
PROTECTING DATA from RANSOMWARE and OTHER DATA LOSS EVENTS a Guide for Managed Service Providers to Conduct, Maintain and Test Backup Files
PROTECTING DATA FROM RANSOMWARE AND OTHER DATA LOSS EVENTS A Guide for Managed Service Providers to Conduct, Maintain and Test Backup Files OVERVIEW The National Cybersecurity Center of Excellence (NCCoE) at the National Institute of Standards and Technology (NIST) developed this publication to help managed service providers (MSPs) improve their cybersecurity and the cybersecurity of their customers. MSPs have become an attractive target for cyber criminals. When an MSP is vulnerable its customers are vulnerable as well. Often, attacks take the form of ransomware. Data loss incidents—whether a ransomware attack, hardware failure, or accidental or intentional data destruction—can have catastrophic effects on MSPs and their customers. This document provides recommend- ations to help MSPs conduct, maintain, and test backup files in order to reduce the impact of these data loss incidents. A backup file is a copy of files and programs made to facilitate recovery. The recommendations support practical, effective, and efficient back-up plans that address the NIST Cybersecurity Framework Subcategory PR.IP-4: Backups of information are conducted, maintained, and tested. An organization does not need to adopt all of the recommendations, only those applicable to its unique needs. This document provides a broad set of recommendations to help an MSP determine: • items to consider when planning backups and buying a backup service/product • issues to consider to maximize the chance that the backup files are useful and available when needed • issues to consider regarding business disaster recovery CHALLENGE APPROACH Backup systems implemented and not tested or NIST Interagency Report 7621 Rev. 1, Small Business planned increase operational risk for MSPs. -
Active Storage Activeraid ES Data Sheet
RAID Storage Evolved Welcome to RAID Storage Evolved Key Features Introducing the ActiveRAID ES — the affordable high performance RAID storage solution Optimized RAID performance in Apple Mac for Apple® users. The ActiveRAID ES was designed using the same principles of the OS X® Applications original ActiveRAID, but represents a previously unobtainable level of value by providing the capabilities of the ActiveRAID in a lower cost configuration that does not sacrifice Modern Linux RAID kernel and RAID the performance, reliability or ease of management of the original ActiveRAID. The controller design ActiveRAID ES was carefully designed for use in business critical applications where Proprietary self-optimizing architecture organizations require a high level of data integrity with ease of installation and management, but not the 24/7 365 day duty cycles of a large enterprise or Post-Production or Broadcast facility. Native Mac OS X Storage management suite The ActiveRAID ES provides redundant power, redundant cooling and redundant Fibre Dashboard Widget monitoring Channel connections, and the same powerful, yet easy to use management suite of the original ActiveRAID. The ES differs from the original ActiveRAID in that it has a single high Enterprise performance and reliability performance RAID controller and uses business class hard drives with a smaller cache size. without Enterprise complexity This results in a very robust RAID solution ideal for business class applications. Performance No long Quick Start guides to memorize of the ActiveRAID ES is impressive at up to 774 MB/s*† throughput and will easily handle Performance profiling and statistical most Mac OS X Server applications. The ActiveRAID ES is the ideal storage for business planning tools class applications such as File serving, database services, web hosting, calendar, Wiki, Mail, iChat and much more. -
Algorithms for Data Cleaning in Knowledge Bases
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by VFAST - Virtual Foundation for Advancement of Science and Technology (Pakistan) VFAST Transactions on Software Engineering http://vfast.org/journals/index.php/VTSE@ 2018, ISSN(e): 2309-6519; ISSN(p): 2411-6327 Volume 15, Number 2, May-August, 2018 pp. 78-83 ALGORITHMS FOR DATA CLEANING IN KNOWLEDGE BASES 1 1 1 ADEEL ASHRAF , SARAH ILYAS , KHAWAJA UBAID UR REHMAN AND SHAKEEL AHMAD2 ` 1Department of Computer Science, University of Management and Technology, Punjab, Pakistan 2Department of Computer Sciences, Faculty of Computing and Information Technology in rabigh, King Abdulaziz University Jeddah Email: [email protected] ABSTRACT. Data cleaning is an action which includes a process of correcting and identifying the inconsistencies and errors in data warehouse. Different terms are uses in these papers like data cleaning also called data scrubbing. Using data scrubbing to get high quality data and this is one the data ETL (extraction transformation and loading tools). Now a day there is a need of authentic information for better decision-making. So we conduct a review paper in which six papers are reviewed related to data cleaning. Relating papers discussed different algorithms, methods, problems, their solutions and approaches etc. Each paper has their own methods to solve a problem in efficient way, but all the paper have common problem of data cleaning and inconsistencies. In these papers data inconsistencies, identification of the errors, conflicting, duplicates records etc problems are discussed in detail and also provided the solutions. These algorithms increase the quality of data. -
And Intra-Set Write Variations
i2WAP: Improving Non-Volatile Cache Lifetime by Reducing Inter- and Intra-Set Write Variations 1 2 1,3 4 Jue Wang , Xiangyu Dong ,YuanXie , Norman P. Jouppi 1Pennsylvania State University, 2Qualcomm Technology, Inc., 3AMD Research, 4Hewlett-Packard Labs 1{jzw175,yuanxie}@cse.psu.edu, [email protected], [email protected], [email protected] Abstract standby power, better scalability, and non-volatility. For example, among many ReRAM prototype demonstrations Modern computers require large on-chip caches, but [11, 14, 21–25], a 4Mb ReRAM macro [21] can achieve a the scalability of traditional SRAM and eDRAM caches is cell size of 9.5F 2 (15X denser than SRAM) and a random constrained by leakage and cell density. Emerging non- read/write latency of 7.2ns (comparable to SRAM caches volatile memory (NVM) is a promising alternative to build with similar capacity). Although its write access energy is large on-chip caches. However, limited write endurance is usually on the order of 10pJ per bit (10X of SRAM write a common problem for non-volatile memory technologies. energy, 5X of DRAM), the actual energy saving comes from In addition, today’s cache management might result in its non-volatility property. Non-volatility can eliminate the unbalanced write traffic to cache blocks causing heavily- standby leakage energy, which can be as high as 80% of the written cache blocks to fail much earlier than others. total energy consumption for an SRAM L2 cache [10]. Unfortunately, existing wear-leveling techniques for NVM- Given the potential energy and cost saving opportunities based main memories cannot be simply applied to NVM- (via reducing cell size) from the adoption of non-volatile based on-chip caches because cache writes have intra-set technologies, replacing SRAM and eDRAM with them can variations as well as inter-set variations.