Technologies for Data

Total Page:16

File Type:pdf, Size:1020Kb

Technologies for Data This report can be found at www.acola.org.au © Australian Council of Learned Academies Technologies for Data Kirsty Douglas Working paper April 2015 This working paper contributes to Securing Australia’s Future (SAF) Project 05. 1 This report can be found at www.acola.org.au © Australian Council of Learned Academies Table of contents Table of contents ............................................................................................................................ 2 Technologies for data ............................................................................................................................. 3 Overview ......................................................................................................................................... 3 Information in a digital age ................................................................................................................. 6 The ages of information ................................................................................................................ 12 The rise of a fourth paradigm? ..................................................................................................... 14 The age of interoperability ........................................................................................................... 17 Technologies for data in an age of interoperability .......................................................................... 18 ‘Big’ data capture, collection, and analysis ................................................................................... 22 Open data...................................................................................................................................... 25 Technologies for research data ..................................................................................................... 28 Policy issues and implications ....................................................................................................... 30 Summary ........................................................................................................................................... 33 References ........................................................................................................................................ 34 2 This report can be found at www.acola.org.au © Australian Council of Learned Academies Technologies for data Governments, intelligence agencies, businesses, time, some stochastic – with different and citizens must adapt rapidly to an environment properties of stability and longevity, and where data is both a commodity and a currency. different importance. The data accumulated from centuries of If the current ‘flood’ of data can be observation, and rapid improvements in digital distinguished from previous so-called interoperability in the last 50 years, continue to information revolutions, perhaps it is better transform the ways in which people understand characterised as an age of interoperability and describe the world, and the uses to which rather than a new information age. information is put. Countries and organisations The status of data is ambivalent. One which develop and foster technologies to ensure researcher’s data is another’s metadata; one that data can efficiently, reliably and economically person’s technology is another’s be stored, curated, managed, retrieved, accessed, infrastructure; the browsing and purchase shared, and protected have an opportunity to histories of an online community are a capitalise on this global data ecosystem. commodity and a resource to other users; and the size of a ‘big’ dataset in one field of Overview research, such as sociology, can be trifling in This paper addresses some of the another, such as astronomy. opportunities of three interconnected types of In the face of this pervasive interoperability technology for data: data analytics (‘big data’); and classificatory instability, regulators, users technologies for open data; and infrastructure and curators of data collections, and designers for collaborative research data. and administrators of data infrastructure may The notion of an information age or a data find it difficult to balance preservation, flood is not new. storage, and use with the management of However, the rise of digital information and digital traces, personal information harvested communication technologies (ICT) in the late from online behaviour and sensors. 20th century has been described as If ICT is a general purpose technology, then engendering an intellectual, social and data are its general-purpose constituents. The economic transformation ‘akin to the impact of technologies for data stems from emergence of literacy’. the interactions of data with other technology With the advent of digital programmable and with other data, ensuring a multiplicative computers in the mid-20th century, data effect: as 'data intermediaries', technologies outputs from simulations of complex for data can themselves be the mechanisms phenomena have increased enormously. by which things interoperate. More recently, improvements in networking Merely having more data does not translate and interoperability (broadly, the capacity for automatically to having better information. systems, organisations, processes and ‘Big data’ analysis is subject to the same limits products to work together) have provided as any statistical analysis. unprecedented access to information about The various uses of data creates tensions that the behaviour of people interacting with feed into a democratic deficit, whereby data online platforms. that is collected and generated for some The emergence of digital ICT as a general purposes is being co-opted for others, purpose technology (a GPT is a generic including by elected representatives. technology, widely used across the economy, For example, data producers (i.e. citizens) may with many different uses and spillover effects, have trouble accessing data about themselves, after (Lipsey et al., 2005) over the last 50 years while non-state entities use this data for has been accompanied by a huge increase in commercial or criminal purposes, and state the amount of information generated, agencies may themselves be complicit in the collected, stored, and shared. corruption of infrastructure. In an environment of networked sensors and Technologies to store, manage, access, ubiquitous internet access, virtually all human analyse, and share data can optimise the use activity is now a source of data. These data of large and complex datasets and ensure have different characteristics – some their preservation, providing Australian streaming, some stable, some updated over researchers and industries with a competitive 3 This report can be found at www.acola.org.au © Australian Council of Learned Academies advantage. But to realise these opportunities, consumers and producers of data need to be policymakers and users need to understand able to trust that their information is in safe the limits of data and data technologies, and hands. 4 This report can be found at www.acola.org.au © Australian Council of Learned Academies Keywords access, analyse, big data, capture, collect, data, disseminate, generate, ICT, information, information age, interoperability, infrastructure, metadata, open data, research data, re-use, share, standards, store, technologies for data, use ______________________________ This paper discusses the history of information and technologies for information, challenging the idea that the late 20th and early 21st centuries are distinguished from other periods for being an information or data age; the notion of a digital revolution and the development of a digital economy and its increasing entanglement with the rest of the economy (see also (Douglas, 2015); types of data problems including ‘big’ data, open data, research data collections, and the interlinked problems of privacy, security and surveillance; and the technologies that are needed to store, manage, access, analyse, share and regulate data to optimise the use of these collections for Australia. The paper does not dwell on the distinctions between data and information. For these purposes, data are items of ‘information considered collectively’, typically ‘used for reference, analysis, or calculation’, or ‘something given or granted; something known or assumed as fact, and made the basis of reasoning; an assumption or premise from which inferences are drawn’.1 So data and information describe or represent the world. Since at least the 1960s, data as raw numbers and facts have sometimes been distinguished from information as processed data, particularly in the domains of knowledge management and informatics, but this distinction is not particularly useful here. Data need infrastructure, or technology, for their capture, generation, collection, manipulation, analysis, use, dissemination, and storage. This data infrastructure may include data itself. Metadata as a type of data that characterises other data is a kind of infrastructure, for example (Research Data Infrastructure Committee, 2014). Legal and regulatory frameworks, licenses, and industry standards at all points of the data lifecycle and across sectors are also technologies and infrastructure for data (Korte, 2014) (see Table 2). In 2014 the Australian Government Minister for Communications, the Hon. Malcolm Turnbull MP, observed that ‘the cost of storing, transmitting and processing or analysing data is cheaper today than it has
Recommended publications
  • United States Patent (19) 11 Patent Number: 5,761,485 Munyan (45) Date of Patent: Jun
    USOO5761485A United States Patent (19) 11 Patent Number: 5,761,485 Munyan (45) Date of Patent: Jun. 2, 1998 54 PERSONAL ELECTRONIC BOOKSYSTEM Miyazawa et al. ("An Electronic Book: APTBook". Human-Computer Interaction-Interact '90. Proceedings of 76) Inventor: Daniel E. Munyan, 805 Mt. Gretna the IFIPTC 13 Third International Conference, 1 Jan. 1990, Rd., Elizabethtown, Pa. 17022 pp. 513-519). 21 Appl. No.: 565,915 (List continued on next page.) 22 Filed: Dec. 1, 1995 (51) Int. Cl. ............................ G06F 15/02; G06F 17/40: Primary Examiner Emanuel Todd Voeltz G09G 1/02 Assistant Examiner-Phalaka Kik 52 U.S. Cl. .............................. 395/500; 345/901; 326/8: Attorney, Agent, or Firm-Earl F. Clifford; Clifford & 395/187.01; 455/411 Clifford Law Firm 58) Field of Search .................................... 395/2.69, 500, 395/145. 2.82, 186, 187.01, 188.01, 200.09. 57 ABSTRACT 650; 455/89; 379/98. 368, 58: 348/134: The Personal Electronic Book System invention replaces a 382/14, 56; 345/192, 127, 130,901: 434/317: standard handheld book with an electronic equivalent. The 326/8; 364/286.4, 286.5, 949.81 260: 365/185.04; invention is sized and configured to be book size and to open 462/903: 463/29: 902/4 like a book for use. When opened, the user sees two facing page-like touch-sensitive, display screens with black print 56) References Cited on white background. Icons represent the electronically U.S. PATENT DOCUMENTS stored material. "artwork, audio clips, books, E-mail, faxes, 3,718,906 2/1973 Lighter.
    [Show full text]
  • Human Services Examiner
    Cayuga County Department of Human Resources and Civil Service Commission JOB SPECIFICATION Civil Service Title: HUMAN SERVICES EXAMINER Jurisdictional Class: Competitive Civil Division: County Adoption: CSM 12/17/03 Revised: CSM 02/09/05 (Change in Title); 10/18/06; 6/14/17; 3/16/21 DISTINGUISHING FEATURES OF THE CLASS: This position exists in the Department of Social Services and involves responsibility to participate in the delivery of financial service programs including: Public Assistance, Medical Assistance, Supplemental Nutrition Assistance Program (SNAP), and Child Support Enforcement Services. The work is performed in accordance with State and Federal regulations and department policy and involves responsibility in determining financial eligibility, investigations, in-depth interviewing, establishing amounts of assistance, making appropriate referrals; and the processing and maintenance of a variety of forms and records. In addition, the incumbents may represent the department in court as custodian of record to ascertain the completeness of records. Depending upon unit and/or assignment, work is performed under the direct or general supervision of a higher-level employee with leeway allowed in the performance of work assignments. Supervision is not normally a function of the class. Does related work as required. TYPICAL WORK ACTIVITIES: (Illustrative Only) Conducts investigations, including in-depth interviews to elicit sufficient information to approve, deny or determine the feasibility of a financial service/program, make an
    [Show full text]
  • Data Security and Confidentiality Guidelines for Clinical Research at Sparrow Health System
    Data Security and Confidentiality Guidelines for Clinical Research at Sparrow Health System Introduction This document was created to give you guidelines to follow in order to ensure that all confidential health information is protected as you procure, use, transfer, and store the data to complete your project. If you have additional questions after reading this document, a reference list is provided at the end to help you find answers to your questions or you may contact the IRB Office or Privacy Department. Guidelines for Data Procurement • Follow all Institutional Review Board (IRB) policies and procedures when requesting data located in Policy and Procedure Manual (PPM) • Limit your data requests to the minimum necessary. The minimum necessary standard is the minimum necessary to accomplish the intended purpose of the project. For example, do not collect age and date of birth if collecting age only will meet the intended purpose of the project. De-identify data using one of the two approved methods: the statistical method or the “safe harbor” method. See HIPAA Policy, HP-22, De-identified Information and Department of Health and Human Services De-identification Guidelines for more information. • Follow the correct path to data procurement – direct all inquiries for data to Sparrow’s Data Analytics Department. You can complete the general report request and email the completed form to Sparrow’s IT HelpDesk at [email protected]. Your request will then be forwarded to the Data Analytics Department. (Note: Case studies are excluded from this requirement.) • Use of Protected Health Information Preparatory to Research - an investigator may review protected health information solely to prepare a research protocol, or for similar purposes preparatory to research.
    [Show full text]
  • Network Storage 1St Edition Pdf, Epub, Ebook
    NETWORK STORAGE 1ST EDITION PDF, EPUB, EBOOK James OReilly | 9780128038659 | | | | | Network Storage 1st edition PDF Book Sorry, this product is currently out of stock. Electricity transmission and distribution systems carry electricity from suppliers to demand sites. Categories : Network-attached storage Server appliance Software appliances. Updating Results. Clustered NAS, like a traditional one, still provides unified access to the files from any of the cluster nodes, unrelated to the actual location of the data. Following the Newcastle Connection, Sun Microsystems ' release of NFS allowed network servers to share their storage space with networked clients. NAS is designed as an easy and self-contained solution for sharing files over the network. Advanced grid technologies to sustain higher network efficiency and maintaining power quality and security are under development. NAS units rarely limit clients to a single protocol. Imprint: Morgan Kaufmann. Paper data storage media. Computer data storage server. This book explains the components, as well as how to design and implement a resilient storage network for workgroup, departmental, and enterprise environments. All Pages Books Journals. Jim is currently a consultant specializing in storage systems, virtualization, infrastructure software, and cloud hardware. All Pages Books Journals. Enterprise storage architects, designers, and managers, as well as executives will find "Resilient Storage Networks" both interesting and informative. President, Volanto, USA. IT managers and administrators, network system sales and marketing staff, network system integrators and support staff, Networking students, faculty, and researchers. Be the first to write a review. About Overview Electricity transmission and distribution systems carry electricity from suppliers to demand sites. In an appropriately configured RAID array, a single bad block on a single drive can be recovered completely via the redundancy encoded across the RAID set.
    [Show full text]
  • Secondary Storage
    Memory More commonly known as RAM, memory is a location where information is stored that is currently being being utilized by the operating system, software program, hardware device, and/or the user. There are two types of memory, volatile memory and non-volatile memory. Volatile memory is memory that loses its contents when the computer or hardware device loses power. Computer RAM is a good example of a volatile memory. Non-volatile memory, sometimes abbreviated as NVRAM, is memory that keeps its contents even if the power is lost. CMOS is a good example of a non-volatile memory. Below is an example picture of computer memory. It is very common for users to confuse what memory is exactly. For example, a computer hard drive is sometimes thought of as memory. A hard drive is a type of storage but not memory. As mentioned above, memory is more commonly known as RAM. CMOS Picture of CMOS lithium battery on motherboardAlso known as a Real Time Clock (RTC), Non-Volatile RAM (NVRAM) or CMOS RAM, CMOS is short for Complementary Metal-Oxide Semiconductor. CMOS is an on-board semiconductor chip powered by a CMOS battery inside computers that stores information such as the system time and system settings for your computer. A CMOS is similar to the Apple Macintosh computer's PRAM. To the right is an image of a CMOS battery on a computers motherboard and the most common CMOS battery you're likely to encounter with your computer. To the right is some examples of other types of batteries that may be used in a computers to power the CMOS memory.
    [Show full text]
  • File Organization & Management
    1 UNESCO -NIGERIA TECHNICAL & VOCATIONAL EDUCATION REVITALISATION PROJECT -PHASE II NATIONAL DIPLOMA IN COMPUTER TECHNOLOGY File Organization and Management YEAR II- SE MESTER I THEORY Version 1: December 2008 1 2 Table of Contents WEEK 1 File Concepts .................................................................................................................................6 Bit: . .................................................................................................................................................7 Binary digit .....................................................................................................................................8 Representation ...............................................................................................................................9 Transmission ..................................................................................................................................9 Storage ............................................................................................................................................9 Storage Unit .....................................................................................................................................9 Abbreviation and symbol ............................................................................................................ 10 More than one bit ......................................................................................................................... 11 Bit, trit,
    [Show full text]
  • Unisys Clearpath 4400 Is the Last Unisys Computer
    Chapter 4. RECOVERY OF HISTORICAL U.S. CENSUS BUREAU MICRODATA: SUCCESS TO DATE B.K. Atrostic, Randy Becker, Todd Gardner, Cheryl Grim, and Mark Mildorf, Center for Economic Studies Additional years of microdata (CES). See Text Box 4-1 for high- many household and business from the Annual Survey of lights of recovered data. files to be recovered and sought Manufactures (ASM), Survey the help of the research com- Historic data from over 2,500 of Industrial Research and munity (Becker and Grim, 2009 tapes were recovered before the Development (SIRD), and the and Gardner, 2009). Without the Census Bureau’s Unisys main- Current Population Survey are intensive recovery effort led by frame computer was decom- just a few examples of the valu- CES during the last half of 2009, missioned in 2010. In the 2008 able data recently recovered by important information would Research Report, CES noted the the Center for Economic Studies have been lost forever. Text Box 4-1. RECOVERED DATA: HIGHLIGHTS The economic and demographic data recovered from the Unisys will be valuable additions to the data already available at the Center for Economic Studies (CES) and the Research Data Centers (RDCs). Most files will require additional work before they can be used for research purposes, and some may require approval by sponsoring agencies. Examples of data recovered from the Unisys include: • Earlier years of series already available at CES · Censuses of Mining, Retail, Wholesale, and Services · Annual Survey of Manufactures (ASM) · Survey of Industrial Research and Development (SIRD) · Survey of Minority-Owned Business Enterprises · Commodity Transport Survey (now called the Commodity Flow Survey) · Decennial Census data ~ Puerto Rico sample and complete count files ~ U.S.
    [Show full text]
  • And Predictions for the Next 20 Years
    20 Years of Storage Innovation …and Predictions for the Next 20 Years Michael Oros Executive Director How did we get here… Cave paintings…40,000 years ago Pictograms…9,000 years ago Writing…5,000 years ago Paper data storage…300 years ago Commercial electricity generation…~140 years ago © 2017 Storage Networking Industry Association. All Rights Reserved. 2 How did we get here… Transistor…70 years ago Hard disk drive…63 years ago Networked storage…34 years ago Flash memory…33 years ago Storage area networks…23 years ago SNIA…20 years ago © 2017 Storage Networking Industry Association. All Rights Reserved. 3 Why standards are important: Electricity as an example AC / DC – the battle of the currents AC voltage and frequency 220-240V and 100-127V are the standard for commercial power 14 frequencies were initially commercialized 50 and 60 Hz are the dominant world frequencies today 4-combination of voltage and frequency today…and 15 plugs © 2017 Storage Networking Industry Association. All Rights Reserved. 4 Why standards are important: Electricity as an example Impact: Electronics industry has to account for all permutations Electric machinery/motors has to be customized for each voltage/frequency combination Higher costs for manufacturers and customers Examples and consequences: Los Angeles power grid Japan’s incompatible power grids © 2017 Storage Networking Industry Association. All Rights Reserved. 5 Remembering how far we’ve come… © 2017 Storage Networking Industry Association. All Rights Reserved. 6 Who remembers dialup? © 2017 Storage Networking Industry Association. All Rights Reserved. 7 Data Center Dependent © 2017 Storage Networking Industry Association. All Rights Reserved.
    [Show full text]
  • Université De Fribourg Department of Informatics E-Passport Control
    Université de Fribourg Department of Informatics E-Passport Control: Freedom to Trade by Alvarez, Pedro Paul 09-202-490 Route des Arsenaux 22, 1700- Fribourg [email protected] Supervised by Prof. Andreas Meier and Luis Teran Fribourg 2010 Table of Contents I. Introduction II. Passports III. “Freedom to Travel is Freedom to Trade” IV. Terrorist Vulnerabilities V. The e-Passport i. Technical Information ii. Criticisms VI. ESTA & US VISIT VII. Global Entry VIII. Centralization of Data IX. Conclusion 2 I. Introduction Globalization has truly opened up our interest to cross national borders and today we can find evidence everywhere to support that. We can see human mobility as we have never seen it before. But unlike years before, we have increasing necessity to control some human mobility. This seminar paper will give a brief background on the current issues of human mobility, from immigration and tourism, business travelers and the distribution of goods, to e-government control and national security. Although the topic of this paper is on the subject of e-passports, it is important to fully understand the underlying basis for enhanced security and how they relate to e- passports. Informatics scientists as well as travelers must fully understand the importance of e-documents. Technology is never fully exclusive and we can only imagine e-documents trickling down for commercial purposes as well. We are on the eve of next-generation identification that not only strengthens security and verification at a global level, but will also risk the very same thing we are trying to protect- our identities.
    [Show full text]
  • Data Storage Security Using Steganography Techniques
    International Journal of Technical Research and Applications e-ISSN: 2320-8163, www.ijtra.com Volume 4, Issue 6 (Nov-Dec 2016), PP.93-98 Data Storage Security Using Steganography Techniques 1 Nancy Garg, 2 Kamalinder Kaur 1,2 Computer Science Of Engineering And Technology, Chandigarh Engineering College , IK Gujral Punjab Technical University, LANDRAN, INDIA 1 [email protected] Abstract— Cloud computing is an advanced technology systems .Public key system uses two keys that is public key throughout the world. As cloud computing is based on Internet which is known to everyone and private key which is used by which is a computer technology. The computer store in the only the recipient of messages. Symmetric key system use a available space and whenever it is requested by the authenticated single key that both the receiver and the sender have .In user it retrieve the stored information. Measures of security that process of Cryptography, a cipher message, may provoke are assumed in the cloud should be made available to the customers for gaining their trust. Steganography is used to hide suspicion on the behalf of the recipient while message which data. First acquire an image from various sources and will read is invisible created with method of steganography will not. the required details of an image. After that secret data is read However, when the use of cryptography is illegal ,then and is converted into integer values. Then it is encrypted and steganography is useful .However, cryptography and embedded with the cover image with the use of transform steganography are judged in a different way.
    [Show full text]
  • Institutional Protocols for Suitable Research Data Storage Mediums
    Institutional protocols for suitable research data storage mediums Purpose To establish best practice for researchers handling sensitive data in the course of their research. Background: Many researchers opt to store data on devices such as USBs and external hard drives. These types of storage solution are vulnerable to loss and theft, posing a risk of data loss if data is not backed up. Such loss may also put researchers and the University in breach of Privacy legislation if the dataset contains sensitive personal information, or in breach of contractual arrangements where those arrangements require researchers to protect intellectual property rights, commercial interests, or to keep sensitive information safe. The Bond University Code of Conduct for Research Policy – TLR 5.06 (issue 3) states the following with respect to security in storage of Research Data: 3.1.6 Research data and records must be maintained securely to prevent unauthorised access, destruction, alteration or removal, accidental or intended damage or destruction. 3.1.7 Confidential research data and records must be stored securely (for example, in lockable filing cabinets or a lockable room with controlled access). When confidential research data and records are stored electronically (for example on a personal computer) precautions must be taken to control access to the research data and records. Such precautions include password access and ‘locking’ data files. Australian Privacy Principles stipulate that 11.1 If an APP entity holds personal information, the entity must take such steps as are reasonable in the circumstances to protect the information: a from misuse, interference and loss; and b. from unauthorised access, modification or disclosure Recommendations: In view of these risks, Research Services and ITS seek to discourage the use of vulnerable portable storage solutions in favour of password protected data storage on University network drives, which are accessible off Campus via the Citrix desktop environment and securely backed up.
    [Show full text]
  • Epistemic Information in Stratified M-Spaces
    Information 2011, 2, 697-726; doi:10.3390/info2040697 OPEN ACCESS information ISSN 2078-2489 www.mdpi.com/journal/information Article Epistemic Information in Stratified M-Spaces Mark Burgin Department of Mathematics, University of California, Los Angeles, 405 Hilgard Ave. Los Angeles, CA 90095, USA; E-Mail: [email protected] Received: 15 September 2011; in revised form: 24 November 2011 / Accepted: 1 December 2011 / Published: 16 December 2011 Abstract: Information is usually related to knowledge. However, the recent development of information theory demonstrated that information is a much broader concept, being actually present in and virtually related to everything. As a result, many unknown types and kinds of information have been discovered. Nevertheless, information that acts on knowledge, bringing new and updating existing knowledge, is of primary importance to people. It is called epistemic information, which is studied in this paper based on the general theory of information and further developing its mathematical stratum. As a synthetic approach, which reveals the essence of information, organizing and encompassing all main directions in information theory, the general theory of information provides efficient means for such a study. Different types of information dynamics representation use tools of mathematical disciplines such as the theory of categories, functional analysis, mathematical logic and algebra. Here we employ algebraic structures for exploration of information and knowledge dynamics. In Introduction (Section 1), we discuss previous studies of epistemic information. Section 2 gives a compressed description of the parametric phenomenological definition of information in the general theory of information. In Section 3, anthropic information, which is received, exchanged, processed and used by people is singled out and studied based on the Componential Triune Brain model.
    [Show full text]