CIF21 Dibbs: Middleware and High Performance Analytics Libraries for Scalable Data Science

Total Pages: 16

File Type: PDF, Size: 1020 KB

CIF21 DIBBs: Middleware and High Performance Analytics Libraries for Scalable Data Science (NSF 14-43054, started October 1, 2014)

Datanet: CIF21 DIBBs: Middleware and High Performance Analytics Libraries for Scalable Data Science
• Indiana University (Fox, Qiu, Crandall, von Laszewski)
• Rutgers (Jha)
• Virginia Tech (Marathe)
• Kansas (Paden)
• Stony Brook (Wang)
• Arizona State (Beckstein)
• Utah (Cheatham)

Overview by Gregor von Laszewski, April 6, 2016
http://spidal.org/
http://news.indiana.edu/releases/iu/2014/10/big-data-dibbs-grant.shtml
http://www.nsf.gov/awardsearch/showAward?AWD_ID=1443054

Some Important Components of SPIDAL DIBBs
• NIST Big Data Application Analysis: features of data-intensive applications.
• HPC-ABDS: Cloud-HPC interoperable software combining the performance of HPC (High Performance Computing) with the rich functionality of the commodity Apache Big Data Stack.
– A reservoir of software subsystems, nearly all from outside the project and a mix of the HPC and Big Data communities.
– Leads to Big Data – Simulation – HPC convergence.
• MIDAS: integrating middleware, from the project.
• Applications: biomolecular simulations, network and computational social science, epidemiology, computer vision, spatial geographical information systems, remote sensing for polar science, and pathology informatics.
• SPIDAL (Scalable Parallel Interoperable Data Analytics Library): scalable analytics consisting of
– domain-specific data analytics libraries, mainly from the project;
– core machine learning libraries, mainly from the community;
– performance of Java and MIDAS inter- and intra-node.
• Benchmarks: the project adds to the community; see WBDB 2015, the Seventh Workshop on Big Data Benchmarking.

64 Features in 4 Views for Unified Classification of Big Data and Simulation Applications
[Figure: the "diamonds" classification chart. Applications are characterized by facets in four views: the Problem Architecture View (12 facets, nearly all Data+Model: Pleasingly Parallel, Classic MapReduce, Map-Collective, Map Point-to-Point, Map Streaming, Shared Memory, Single Program Multiple Data, Bulk Synchronous Parallel, Fusion, Dataflow, Agents, Workflow); the Execution View (22 facets, a mix of Data and Model: performance metrics, flops per byte and per watt, I/O, data volume, velocity, variety, veracity, communication structure, dynamic vs. static, regular vs. irregular, iterative or not, data and model size and abstraction); the Data Source and Style View (10 facets, nearly all Data: SQL/NoSQL/NewSQL, enterprise data model, files/objects, HDFS/Lustre/GPFS, archived/batched/streaming S1–S5, shared/dedicated/transient/permanent, metadata/provenance, Internet of Things, HPC simulations, geospatial information system analytics); and the Processing View (14 facets, all Model, with kernels and many subclasses: micro-benchmarks, local and global machine learning, base data statistics, recommender engine, data search/query/index, data classification, graph algorithms, linear algebra, optimization methodology, streaming data algorithms, data alignment, N-body methods, spectral methods, multiscale methods, iterative PDE solvers, particles and fields, evolution of discrete systems, nature of mesh if used, core libraries, visualization).]
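To make the view/facet scheme concrete, here is a minimal sketch (illustrative only, not project code; the View enum, the data structure, and the chosen facet strings are assumptions for the example) of tagging one application with facets drawn from the four views:

```cpp
// Sketch: represent the four classification views as a map from view to the
// set of facets that apply to a given application.
#include <cstdio>
#include <map>
#include <set>
#include <string>

enum class View { ProblemArchitecture, Execution, Processing, DataSourceStyle };

int main() {
    // Hypothetical tagging of one application; facet names are taken from the
    // figure, but the selection below is invented for illustration.
    std::map<View, std::set<std::string>> app = {
        {View::ProblemArchitecture, {"Map-Collective", "Single Program Multiple Data",
                                     "Bulk Synchronous Parallel"}},
        {View::Execution,           {"Iterative", "Regular"}},
        {View::Processing,          {"Linear Algebra", "Optimization Methodology"}},
        {View::DataSourceStyle,     {"HDFS/Lustre/GPFS", "Batched"}},
    };
    // Print the Problem Architecture facets as a quick demonstration.
    for (const auto& facet : app[View::ProblemArchitecture])
        std::printf("%s\n", facet.c_str());
}
```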
Java MPI Performs Better than Threads
[Figure: performance of the SPIDAL DA-MDS code on 128 24-core Haswell nodes, comparing the best all-MPI configuration (MPI both inter- and intra-node) with the best hybrid configuration (threads intra-node, MPI inter-node); the all-MPI configuration performs best.]
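For readers unfamiliar with the two configurations being compared, the following hybrid sketch (C++ with MPI and OpenMP, a stand-in for the project's Java MPI code and not the DA-MDS algorithm itself) shows how the same reduction can be run as one rank per core with one thread each (the "all MPI" case) or as one rank per node with many threads (the "threads intra-node" case):

```cpp
// Hypothetical mini-benchmark contrasting the two configurations from the slide.
#include <mpi.h>
#include <omp.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long n = 1 << 24;              // elements per rank (arbitrary size)
    std::vector<double> x(n, 0.5);

    double t0 = MPI_Wtime();
    double local = 0.0;
    // Intra-node parallelism: with OMP_NUM_THREADS=1 and one rank per core this
    // is the "all MPI" case; with one rank per node and 24 threads it is the
    // "threads intra-node" case.
    #pragma omp parallel for reduction(+ : local)
    for (long i = 0; i < n; ++i) local += x[i] * x[i];

    // Inter-node parallelism: an MPI collective combines the per-rank results.
    double global = 0.0;
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    if (rank == 0)
        std::printf("ranks=%d threads=%d sum=%.1f time=%.3fs\n",
                    size, omp_get_max_threads(), global, t1 - t0);
    MPI_Finalize();
}
```

On 128 24-core nodes, `mpirun -np 3072` with `OMP_NUM_THREADS=1` corresponds to the first configuration, and `mpirun -np 128` with `OMP_NUM_THREADS=24` to the second.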
Kaleidoscope of (Apache) Big Data Stack (ABDS) and HPC Technologies (Big Data and (Exascale) Simulation Convergence II; 21 layers, over 350 software packages, January 2016)

Cross-cutting functions:
1) Message and Data Protocols: Avro, Thrift, Protobuf
2) Distributed Coordination: Google Chubby, Zookeeper, Giraffe, JGroups
3) Security & Privacy: InCommon, Eduroam, OpenStack Keystone, LDAP, Sentry, Sqrrl, OpenID, SAML OAuth
4) Monitoring: Ambari, Ganglia, Nagios, Inca

Layers:
17) Workflow-Orchestration: ODE, ActiveBPEL, Airavata, Pegasus, Kepler, Swift, Taverna, Triana, Trident, BioKepler, Galaxy, IPython, Dryad, Naiad, Oozie, Tez, Google FlumeJava, Crunch, Cascading, Scalding, e-Science Central, Azure Data Factory, Google Cloud Dataflow, NiFi (NSA), Jitterbit, Talend, Pentaho, Apatar, Docker Compose, KeystoneML
16) Application and Analytics: Mahout, MLlib, MLbase, DataFu, R, pbdR, Bioconductor, ImageJ, OpenCV, Scalapack, PetSc, PLASMA, MAGMA, Azure Machine Learning, Google Prediction API & Translation API, mlpy, scikit-learn, PyBrain, CompLearn, DAAL (Intel), Caffe, Torch, Theano, DL4j, H2O, IBM Watson, Oracle PGX, GraphLab, GraphX, IBM System G, GraphBuilder (Intel), TinkerPop, Parasol, Dream:Lab, Google Fusion Tables, CINET, NWB, Elasticsearch, Kibana, Logstash, Graylog, Splunk, Tableau, D3.js, three.js, Potree, DC.js, TensorFlow, CNTK
15B) Application Hosting Frameworks: Google App Engine, AppScale, Red Hat OpenShift, Heroku, Aerobatic, AWS Elastic Beanstalk, Azure, Cloud Foundry, Pivotal, IBM BlueMix, Ninefold, Jelastic, Stackato, appfog, CloudBees, Engine Yard, CloudControl, dotCloud, Dokku, OSGi, HUBzero, OODT, Agave, Atmosphere
15A) High-level Programming: Kite, Hive, HCatalog, Tajo, Shark, Phoenix, Impala, MRQL, SAP HANA, HadoopDB, PolyBase, Pivotal HD/Hawq, Presto, Google Dremel, Google BigQuery, Amazon Redshift, Drill, Kyoto Cabinet, Pig, Sawzall, Google Cloud DataFlow, Summingbird
14B) Streams: Storm, S4, Samza, Granules, Neptune, Google MillWheel, Amazon Kinesis, LinkedIn Databus, Twitter Heron, Facebook Puma/Ptail/Scribe/ODS, Azure Stream Analytics, Floe, Spark Streaming, Flink Streaming, DataTurbine
14A) Basic Programming model and runtime, SPMD, MapReduce: Hadoop, Spark, Twister, MR-MPI, Stratosphere (Apache Flink), Reef, Disco, Hama, Giraph, Pregel, Pegasus, Ligra, GraphChi, Galois, Medusa-GPU, MapGraph, Totem
13) Inter-process communication (collectives, point-to-point, publish-subscribe): MPI, HPX-5, Argo BEAST, PULSAR, Harp, Netty, ZeroMQ, ActiveMQ, RabbitMQ, NaradaBrokering, QPid, Kafka, Kestrel, JMS, AMQP, Stomp, MQTT, Marionette Collective; public cloud: Amazon SNS, Lambda, Google Pub Sub, Azure Queues, Event Hubs
12) In-memory databases/caches: Gora (general object from NoSQL), Memcached, Redis, LMDB (key value), Hazelcast, Ehcache, Infinispan, VoltDB, H-Store
12) Object-relational mapping: Hibernate, OpenJPA, EclipseLink, DataNucleus, ODBC/JDBC
12) Extraction Tools: UIMA, Tika
11C) SQL (NewSQL): Oracle, DB2, SQL Server, SQLite, MySQL, PostgreSQL, CUBRID, Galera Cluster, SciDB, Rasdaman, Apache Derby, Pivotal Greenplum, Google Cloud SQL, Azure SQL, Amazon RDS, Google F1, IBM dashDB, N1QL, BlinkDB, Spark SQL
11B) NoSQL: Lucene, Solr, Solandra, Voldemort, Riak, ZHT, Berkeley DB, Kyoto/Tokyo Cabinet, Tycoon, Tyrant, MongoDB, Espresso, CouchDB, Couchbase, IBM Cloudant, Pivotal Gemfire, HBase, Google Bigtable, LevelDB, Megastore and Spanner, Accumulo, Cassandra, RYA, Sqrrl, Neo4J, graphdb, Yarcdata, AllegroGraph, Blazegraph, Facebook Tao, Titan:db, Jena, Sesame; public cloud: Azure Table, Amazon Dynamo, Google DataStore
11A) File management: iRODS, NetCDF, CDF, HDF, OPeNDAP, FITS, RCFile, ORC, Parquet
10) Data Transport: BitTorrent, HTTP, FTP, SSH, Globus Online (GridFTP), Flume, Sqoop, Pivotal GPLOAD/GPFDIST
9) Cluster Resource Management: Mesos, Yarn, Helix, Llama, Google Omega, Facebook Corona, Celery, HTCondor, SGE, OpenPBS, Moab, Slurm, Torque, Globus Tools, Pilot Jobs
8) File systems: HDFS, Swift, Haystack, f4, Cinder, Ceph, FUSE, Gluster, Lustre, GPFS, GFFS; public cloud: Amazon S3, Azure Blob, Google Cloud Storage
7) Interoperability: Libvirt, Libcloud, JClouds, TOSCA, OCCI, CDMI, Whirr, Saga, Genesis
6) DevOps: Docker (Machine, Swarm), Puppet, Chef, Ansible, SaltStack, Boto, Cobbler, Xcat, Razor, CloudMesh, Juju, Foreman, OpenStack Heat, Sahara, Rocks, Cisco Intelligent Automation for Cloud, Ubuntu MaaS, Facebook Tupperware, AWS OpsWorks, OpenStack Ironic, Google Kubernetes, Buildstep, Gitreceive, OpenTOSCA, Winery, CloudML, Blueprints, Terraform, DevOpSlang, Any2Api
5) IaaS Management from HPC to hypervisors: Xen, KVM, QEMU, Hyper-V, VirtualBox, OpenVZ, LXC, Linux-Vserver, OpenStack, OpenNebula, Eucalyptus, Nimbus, CloudStack, CoreOS, rkt, VMware ESXi, vSphere and vCloud, Amazon, Azure, Google and other public clouds
Networking: Google Cloud DNS, Amazon Route 53

HPC-ABDS Mapping of Activities
• Level 17: Orchestration. Apache Beam (Google Cloud Dataflow) integrated with Cloudmesh.
• Level 16: Applications. Data mining for molecular dynamics, image processing for remote sensing and pathology, graphs, streaming.
• Level 16: Algorithms. MLlib, Mahout, SPIDAL; release of SPIDAL clustering, dimension reduction (multi-dimensional scaling), and web point-set visualization.
• Level 14: Programming. Storm, Heron, Hadoop, Spark, Flink; inter- and intra-node performance.
• Level 13: Communication. Enhanced Storm and Hadoop using HPC runtime technologies.
• Level 11: Data management. HBase and MongoDB integrated via use of Beam and other Apache tools.
• Level 9: Cluster management. Integrate Pilot Jobs with Yarn, Spark, Hadoop.
• Level 6: DevOps. Python Cloudmesh virtual cluster interoperability.
Recommended publications
  • Integration of CUDA Processing Within the C++ Library for Parallelism and Concurrency (HPX)
Integration of CUDA Processing within the C++ Library for Parallelism and Concurrency (HPX)
Patrick Diehl, Madhavan Seshadri, Thomas Heller, Hartmut Kaiser
Abstract—Experience shows that on today's high performance systems the utilization of different acceleration cards in conjunction with a high utilization of all other parts of the system is difficult. Future architectures, like exascale clusters, are expected to aggravate this issue, as the number of cores is expected to increase and memory hierarchies are expected to become deeper. One big challenge for distributed applications is to guarantee high utilization of all available resources, including local or remote acceleration cards on a cluster, while fully using all available CPU resources and integrating the GPU work into the overall programming model. For the integration of CUDA code we extended HPX, a general-purpose C++ runtime system for parallel and distributed applications of any scale, and enabled asynchronous data transfers from and to the GPU device as well as the asynchronous invocation of CUDA kernels on this data. Both operations are well integrated into the general programming model of HPX, which allows any GPU operation to be seamlessly overlapped with work on the main cores. Any user-defined CUDA kernel can be launched on any (local or remote) GPU device available to the distributed application. We present asynchronous implementations of the data transfers and kernel launches for CUDA code as part of an HPX asynchronous execution graph. Using this approach we can combine all remotely and locally available acceleration cards on a cluster to utilize its full performance capabilities.
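The paper integrates CUDA work into HPX's own future machinery; as a rough standalone illustration of the underlying pattern (asynchronous transfers and a kernel queued on a CUDA stream, surfaced to the caller as a future so CPU work can overlap), here is a sketch using plain std::future rather than the HPX API. The kernel, names, and sizes are invented for the example:

```cpp
// sketch.cu - compile with nvcc; illustrative pattern only, not the HPX API.
#include <cuda_runtime.h>
#include <cstdio>
#include <future>
#include <vector>

__global__ void scale(float* x, int n, float a) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

// Queue copy-in, kernel, and copy-out on a stream; the returned future becomes
// ready once the stream drains, so the caller's thread stays free for CPU work.
std::future<void> scale_async(std::vector<float>& host, float a) {
    return std::async(std::launch::async, [&host, a] {
        int n = static_cast<int>(host.size());
        float* dev = nullptr;
        cudaStream_t s;
        cudaStreamCreate(&s);
        cudaMalloc(&dev, n * sizeof(float));
        cudaMemcpyAsync(dev, host.data(), n * sizeof(float),
                        cudaMemcpyHostToDevice, s);
        scale<<<(n + 255) / 256, 256, 0, s>>>(dev, n, a);
        cudaMemcpyAsync(host.data(), dev, n * sizeof(float),
                        cudaMemcpyDeviceToHost, s);
        cudaStreamSynchronize(s);  // blocks this helper thread, not the caller
        cudaFree(dev);
        cudaStreamDestroy(s);
    });
}

int main() {
    std::vector<float> v(1 << 20, 1.0f);
    auto done = scale_async(v, 2.0f);  // GPU pipeline runs in the background
    // ... unrelated CPU work proceeds here, overlapping the GPU operations ...
    done.wait();
    std::printf("v[0] = %f\n", v[0]);  // 2.0
}
```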
  • Workload and Resource Aware, Proactive Autoscaler for PaaS Cloud Frameworks
    University of Moratuwa, Department of Computer Science & Engineering
    CS4202 - Research and Development Project
    inteliScaler: Workload and Resource Aware, Proactive Autoscaler for PaaS Cloud Frameworks
    Group Members: Ridwan Shariffdeen [110532R], Tharindu Munasinghe [110375L], Janaka Bandara [110056K], Bhathiya Supun [110059X]
    Supervisors: Internal: Dr. H.M.N. Dilum Bandara; External: Mr. Lakmal Warusawithana, WSO2, and Dr. Srinath Perera, WSO2. Coordinated by Dr. Malaka Walpola.
    THIS REPORT IS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE AWARD OF THE DEGREE OF BACHELOR OF SCIENCE OF ENGINEERING AT UNIVERSITY OF MORATUWA, SRI LANKA. February 21, 2016.
    Declaration: We, the project group inteliScaler, hereby declare that except where specified reference is made to the work of others, the project inteliScaler, a resource & cost aware, proactive autoscaler for PaaS cloud, is our own work and contains nothing which is the outcome of work done in collaboration with others, except as specified in the text and Acknowledgement.
    Signatures of the candidates: R.S. Shariffdeen [110532R], D.T.S.P. Munasinghe [110375L], U.K.J.U. Bandara [110056K], H.S. Bhathiya [110059X]
    Supervisor: (Signature and Date) Dr. H.M.N. Dilum Bandara
    Coordinator: ......................................................
  • Bench - Benchmarking the State-of-the-Art Task Execution Frameworks of Many-Task Computing
    MATRIX: Bench - Benchmarking the State-of-the-Art Task Execution Frameworks of Many-Task Computing
    Thomas Dubucq, Tony Forlini, Virgile Landeiro Dos Reis, and Isabelle Santos
    Illinois Institute of Technology, Chicago, IL, USA; {tdubucq, tforlini, vlandeir, isantos1}@hawk.iit.edu
    Abstract — Technology trends indicate that exascale systems will have billion-way parallelism, and each node will have about three orders of magnitude more intra-node parallelism than today's peta-scale systems. The majority of current runtime systems focus a great deal of effort on optimizing the inter-node parallelism by maximizing the bandwidth and minimizing the latency of the use of interconnection networks and storage, but suffer from the lack of scalable solutions to expose the intra-node parallelism. Many-task computing (MTC) is a distributed fine-grained paradigm that aims to address the challenges of managing parallelism and locality of exascale systems. MTC applications are typically structured as directed acyclic graphs of loosely coupled short tasks with explicit input/output data dependencies.
    [From the introduction:] ... developed by Stanford University. Finally, HPX is a general-purpose C++ runtime system for parallel and distributed applications of any scale developed by Louisiana State University, and STAPL is a framework for developing parallel programs from Texas A&M. MATRIX is a many-task computing job scheduling system [3]. There are many resource-managing systems aimed towards data-intensive applications. Furthermore, distributed task scheduling in many-task computing is a problem that has been considered by many research teams. In particular, Charm++ [4], Legion [5], Swift [6], [10], Spark [1][2], HPX [12], STAPL [13] and MATRIX [11] offer solutions to this problem and have ...
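As a toy illustration of the DAG structure the abstract describes (plain C++ futures, not MATRIX or any of the frameworks benchmarked; tasks and values are invented), two independent short tasks feed a third through explicit data dependencies:

```cpp
// Minimal sketch of the MTC pattern: short tasks wired into a DAG
// by explicit input/output data dependencies.
#include <cstdio>
#include <future>

int main() {
    // Two independent "short tasks" run in parallel...
    auto a = std::async(std::launch::async, [] { return 2; });
    auto b = std::async(std::launch::async, [] { return 3; });
    // ...and a dependent task consumes their outputs, forming a small DAG:
    //        a   b
    //         \ /
    //          c
    auto c = std::async(std::launch::async,
                        [&a, &b] { return a.get() * b.get(); });
    std::printf("c = %d\n", c.get());  // 6
}
```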
  • What Is Bluemix
    IBM Brings Bluemix to Developers! This document has been prepared for the TMForum Hackathon in Nice, France. The first section of this document shares Bluemix-related notes; it is followed by notes appropriate for viewing content from exposed APIs (provided by TMForum and FIware), and then you see the node flows that are available for you.
    IBM® Bluemix™ is an open-standard, cloud-based platform for building, managing, and running apps of all types, such as web, mobile, big data, and smart devices. Capabilities include Java, mobile back-end development, and application monitoring, as well as features from ecosystem partners and open source, all provided as-a-service in the cloud.
    • Get started with Bluemix: ibm.biz/LearnBluemix
    • Sign up for Bluemix: https://ibm.biz/sitefrbluemix
    • Getting started with runtimes: http://bluemix.net/docs/#
    • View the catalog and select the mobile cloud boilerplate: http://bluemix.net/#/store/cloudOEPaneId=store
    • Tap into the Internet of Things: http://bluemix.net/#/solutions/solution=internet_of_things
    • Bluemix tutorial in Open Classroom: http://openclassrooms.com/courses/deployez-des-applications-dans-le-cloud-avec-ibm-bluemix
    The table below can be used for general enablement; it has been useful to developers at previous hackathons.
    Technical Asset Name: Connected Home Automation App | URL/Mobile App: ibm.biz/ATTconnhome2 | Source Code / Quick Start Guide: ibm.biz/ATTconnhome2qs | Description: uses Node.js runtime, Internet of Things boilerplate, Node-RED editor and MQTT protocol
    Uses Node.js runtime, Connected
  • HPX – a Task Based Programming Model in a Global Address Space
    HPX – A Task Based Programming Model in a Global Address Space
    Hartmut Kaiser (1) [email protected]; Thomas Heller (2) [email protected]; Bryce Adelstein-Lelbach (1) [email protected]; Adrian Serio (1) [email protected]; Dietmar Fey (2) [email protected]
    (1) Center for Computation and Technology, Louisiana State University, Louisiana, U.S.A.
    (2) Computer Science 3, Computer Architectures, Friedrich-Alexander-University, Erlangen, Germany
    ABSTRACT: The significant increase in complexity of Exascale platforms due to energy-constrained, billion-way parallelism, with major changes to processor and memory architecture, requires new energy-efficient and resilient programming techniques that are portable across multiple future generations of machines. We believe that guaranteeing adequate scalability, programmability, performance portability, resilience, and energy efficiency requires a fundamentally new approach, combined with a transition path for existing scientific applications, to fully explore the rewards of today's and tomorrow's systems. We present HPX – a parallel runtime system which extends the C++11/14 standard to facilitate distributed operations, enable fine-grained constraint-based parallelism, and support run...
    1. INTRODUCTION: Today's programming models, languages, and related technologies that have sustained High Performance Computing (HPC) application software development for the past decade are facing major problems when it comes to programmability and performance portability of future systems. The significant increase in complexity of new platforms due to energy constraints, increasing parallelism, and major changes to processor and memory architecture requires advanced programming techniques that are portable across multiple future generations of machines [1]. A fundamentally new approach is required to address these challenges.
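A minimal sketch of the future-based style this paper builds on, using only the standard C++11/14 facilities that HPX extends (in real HPX code, std::async and std::future would become hpx::async and hpx::future, which add continuations and distributed operation; the two-way split below is invented for illustration):

```cpp
// Fine-grained parallelism expressed through futures rather than barriers:
// the combining step runs only after both partial reductions complete.
#include <cstdio>
#include <future>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> v(1 << 20, 1.0);
    auto mid = v.begin() + v.size() / 2;

    // Two tasks reduce the halves concurrently.
    auto lo = std::async(std::launch::async,
                         [&] { return std::accumulate(v.begin(), mid, 0.0); });
    auto hi = std::async(std::launch::async,
                         [&] { return std::accumulate(mid, v.end(), 0.0); });

    // The data dependency (both halves done) is the only synchronization.
    double sum = lo.get() + hi.get();
    std::printf("sum = %.0f\n", sum);  // 1048576
}
```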
  • A Complete Survey on Software Architectural Styles and Patterns
    Available online at www.sciencedirect.com (ScienceDirect). Procedia Computer Science 70 (2015) 16–28.
    4th International Conference on Eco-friendly Computing and Communication Systems, ICECCS 2015
    A Complete Survey on Software Architectural Styles and Patterns
    Anubha Sharma*, Manoj Kumar, Sonali Agarwal
    Indian Institute of Information Technology, Allahabad 211012, India
    Abstract: Software brought revolutionary change, making entrepreneurs fortunate enough to make money in less time, with least effort, and with correct output. The SDLC (software development life cycle) is responsible for software's reliability, performance, scalability, functionality, and maintainability. Although all phases of the SDLC have their own importance, software architecture serves as the foundation for the other phases of the SDLC. Just as the sketch of a building helps the constructor to correctly construct the building, software architecture helps the software developer to develop the software properly. There are various styles available for software architecture. In this paper, a clear picture of all important software architectural styles is presented, along with recent advancements in the software architecture and design phases. It could help a software developer to select an appropriate style according to his/her project's requirements. An architectural style must be chosen correctly to get all of its benefits in the system. All the architectural styles are compared on the basis of various quality attributes. This paper also specifies the application area, advantages, and disadvantages of each architectural style.
    © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Organizing
  • Cutter IT Journal
    Cutter IT Journal: The Journal of Information Technology Management, Vol. 26, No. 3, March 2013
    The Emerging Cloud Ecosystem: Innovative New Services and Business Models
    "Cloud service providers, the IT industry, professional and industry associations, governments, and IT professionals all have a role to play in shaping, fostering, and harnessing the full potential of the emerging cloud ecosystem." (San Murugesan, Guest Editor)
    Contents:
    • Opening Statement, by San Murugesan
    • Merging IaaS with PaaS to Deliver Robust Development Tools, by Beth Cohen
    • Intrusion Detection as a Service (IDaaS) in an Open Source Cloud Infrastructure, by John Prakash Veigas and K Chandra Sekaran
    • Cloud Ecology: Surviving in the Jungle, by Claude R. Baudoin
    • The Promise of a Diverse, Interoperable Cloud Ecosystem — And Recommendations for Realizing It, by Kathy L. Grise
    NOT FOR DISTRIBUTION. For authorized use, contact Cutter Consortium: +1 781 648 8700, [email protected]
    About Cutter IT Journal: Part of Cutter Consortium's mission is to foster debate and dialogue on the business technology issues challenging enterprises today, helping organizations leverage IT for competitive advantage and business success. Cutter's philosophy is that most of the issues that managers face are complex enough to merit examination that goes beyond simple statements. Cutter IT Journal subscribers consider the Journal a "consultancy in print" and liken each month's issue to the impassioned debates they participate in at the end of a day at a conference. Every facet of IT — application integration, security, portfolio management, and testing...
    Cutter Business Technology Council: Rob Austin, Ron Blitstein, Tom DeMarco, Lynne Ellyn, Israel Gat, Vince Kellen, Tim Lister, Lou Mazzucchelli, Ken Orr, and Robert D. Scott. Editor Emeritus: Ed Yourdon. Publisher: Karen Fine Coburn. Group Publisher: Chris Generali. Managing Editor: Karen Pasley. Production Editor: Linda M.
  • Cloud Computing
    PA200 - Cloud Computing. Lecture 6: Cloud providers, by Ilya Etingof, Red Hat
    In this lecture: cloud service providers; IaaS/PaaS/SaaS and variations; in the eyes of the user; pros & cons; Google Cloud walk-through.
    Cloud software vs. cloud service:
    • Cloud software providers: OpenStack, RedHat OpenShift, oVirt; many in-house implementations
    • Cloud service providers: Amazon EC2, Microsoft Azure, Google Cloud Platform, RedHat OpenShift Online, ...
    What does a CSP do? A combination of IaaS, PaaS/stackless/FaaS, and SaaS. Business use of CSPs: the hyperscalers.
    What does an IaaS CSP do?
    • Abstracts away the hardware
    • The operating system is the unit of scale (before IaaS, the hardware computer was the unit of scale)
    IaaS CSP business model:
    • Owns/rents physical DC infrastructure
    • Owns/buys Internet connectivity (links, IX, etc.)
    • Provides IaaS to end customers
    • Serves other CSPs: PaaS and SaaS
    • Base for multi-cloud CSPs
    Typical IaaS offering:
    • Compute nodes (VMs)
    • Virtual networks
    • Bare-metal nodes
    • Managed storage (block, file systems)
    • Instance-based scaling and redundancy
    • Pay per allocated resources (instances, RAM, storage, traffic)
    Example IaaS: Amazon Elastic Compute Cloud (EC2), Google Cloud, Microsoft Azure.
    IaaS CSP differentiation: technically interchangeable; similar costs; the customer is not heavily locked in.
    Multicloud: multiple cloud services under a single control plane.
    • Reduces dependence on a single CSP
    • Balances load/location/costs
    • Gathers resources
    • Same deployment model (unlike hybrid cloud)
    Examples of multicloud software: IBM Cloud Orchestrator, RedHat CloudForms, Flexera RightScale.
    What does a PaaS CSP do?
    • Abstracts away the OS
    • A containerized application is the unit of scale
    PaaS CSP business model:
    • Owns or rents the IaaS
    • Maintains the platform
    • Maintains services, data collections, etc.
  • Innovative Feedback System Based on IBM Bluemix Cloud Service
    INNOVATIVE FEEDBACK SYSTEM BASED ON IBM BLUEMIX CLOUD SERVICE
    P.P.N.G. Phani Kumar, N. Anil Chakravarthy, D.V.S. Ravi Varma, M.Y.V. Nagesh
    Assistant Professors, Department of CSE, Raghu Engineering College, Visakhapatnam, India
    ABSTRACT: Getting the right feedback at the right time is most important for any organization or institution. Getting feedback from users helps an organization or institution provide better services to its users or students. Ongoing interaction with users can improve the efficiency of an organization and enable it to provide better service to the users. Until now, several feedback systems have been in use, and they are mostly manual. We propose an efficient feedback system using cloud-based computing to generate reports on faculty performance. In the proposed system all the activities are carried out through the cloud application Platform as a Service, using IBM Bluemix.
    Keywords: Android, Bluemix, Cloud Computing, PaaS (Platform as a Service)
    I INTRODUCTION
    Cloud computing [1] is a popular technology in which the Internet and central remote servers are used to store and maintain data and applications. Cloud computing allows people to use applications without installing any specialized software and to access the required computing facilities from anywhere via the Internet. By using cloud computing we can achieve much more efficient computing power by centralizing data storage, processing, and bandwidth. The cloud is available as various services: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS).
    1.1 Services
    [Figure 1: Cloud Computing Services]
    Software as a Service (SaaS): Software as a Service is one type of cloud service in which application software is installed and managed in the cloud.
  • Regeldokument
    Master's degree project: Source Code Quality in Connection to Self-Admitted Technical Debt
    Author: Alina Hrynko. Supervisor: Morgan Ericsson. Semester: VT20. Subject: Computer Science
    Abstract: The importance of software code quality is increasing rapidly. With more code being written every day, its maintenance and support are becoming harder and more expensive. New automatic code review tools are developed to reach quality goals; one of these tools is SonarQube. However, people keep their leading role in the development process, and sometimes they sacrifice quality in order to speed up development. This is called technical debt. In some cases, this is admitted by the developer, which is called self-admitted technical debt (SATD). Code quality can also be measured by static code analysis tools such as SonarQube, which detect various kinds of issues. The purpose of this study is to find a connection between code quality issues found by SonarQube and those marked as SATD. The research questions include: 1) Is there a connection between the size of the project and the SATD percentage? 2) Which types of issues are the most widespread in code marked by SATD? 3) Did the introduction of SATD influence the bug-fixing time? As a result of the research, SATD percentages between 0% and 20.83% were found. No connection between the size of the project and the percentage of SATD was found. There are certain issues that seem to relate to SATD, such as "Duplicated code", "Unused method parameters should be removed", "Cognitive Complexity of methods should not be too high", etc.
  • Trifacta Data Preparation for Amazon Redshift and S3 Must Be Deployed Into an Existing Virtual Private Cloud (VPC)
    Install Guide for Data Preparation for Amazon Redshift and S3
    Version: 7.1. Doc Build Date: 05/26/2020.
    Copyright © Trifacta Inc. 2020 - All Rights Reserved. CONFIDENTIAL. These materials (the "Documentation") are the confidential and proprietary information of Trifacta Inc. and may not be reproduced, modified, or distributed without the prior written permission of Trifacta Inc. EXCEPT AS OTHERWISE PROVIDED IN AN EXPRESS WRITTEN AGREEMENT, TRIFACTA INC. PROVIDES THIS DOCUMENTATION AS-IS AND WITHOUT WARRANTY AND TRIFACTA INC. DISCLAIMS ALL EXPRESS AND IMPLIED WARRANTIES TO THE EXTENT PERMITTED, INCLUDING WITHOUT LIMITATION THE IMPLIED WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT AND FITNESS FOR A PARTICULAR PURPOSE AND UNDER NO CIRCUMSTANCES WILL TRIFACTA INC. BE LIABLE FOR ANY AMOUNT GREATER THAN ONE HUNDRED DOLLARS ($100) BASED ON ANY USE OF THE DOCUMENTATION. For third-party license information, please select About Trifacta from the Help menu.
    Contents:
    1. Quick Start
      1.1 Install from AWS Marketplace
      1.2 Upgrade for AWS Marketplace
    2. Configure
      2.1 Configure for AWS
        2.1.1 Configure for EC2 Role-Based Authentication
        2.1.2 Enable S3 Access
          2.1.2.1 Create Redshift Connections
    3. Contact Support
    4. Legal
      4.1 Third-Party License Information
    Quick Start: Install from AWS Marketplace. Contents: Product Limitations, Internet Access, Install, Desktop Requirements, Pre-requisites, Install Steps - CloudFormation Template, SSH Access, Troubleshooting, SELinux, Upgrade, Documentation, Related Topics. This guide steps through the requirements and process for installing Trifacta® Data Preparation for Amazon Redshift and S3 through the AWS Marketplace.
  • Exascale Computing Project -- Software
    Exascale Computing Project -- Software
    Paul Messina, ECP Director; Stephen Lee, ECP Deputy Director
    ASCAC Meeting, Arlington, VA, Crystal City Marriott, April 19, 2017. www.ExascaleProject.org
    ECP scope and goals:
    • Develop applications to tackle a broad spectrum of mission-critical problems of unprecedented complexity
    • Partner with vendors to develop computer architectures that support exascale applications
    • Support national security applications
    • Develop a software stack that is both exascale-capable and usable on industrial- and academic-scale systems, in collaboration with vendors
    • Train a next-generation workforce of computational scientists, engineers, and computer scientists
    • Contribute to the economic competitiveness of the nation
    ECP has formulated a holistic approach that uses co-design and integration to achieve capable exascale:
    • Application Development: science and mission applications
    • Software Technology: scalable and productive software stack
    • Hardware Technology: hardware technology elements
    • Exascale Systems: integrated exascale supercomputers
    [Figure: the ECP software stack diagram, spanning correctness, visualization, and data analysis; applications co-design; programming models, development environment, and runtimes; math libraries and frameworks; tools; system software (resource management, threading, scheduling, monitoring, and control); memory and burst buffer; data management, I/O, and file system; node OS and runtimes; hardware interface.]
    ECP's work encompasses applications, system software, hardware technologies and architectures, and workforce