WANDISCO FUSION® GOOGLE CLOUD DATAPROC

Cloud migration and hybrid cloud
WANdisco Fusion for Google Cloud Dataproc extends on-premises Hadoop clusters to the cloud for active burst-out processing, offsite disaster recovery and data archiving.

Guaranteed data consistency
Take total control of your data with WANdisco Fusion Active Data Replication™ from on-premises Hadoop clusters running any distribution to Google Cloud Dataproc. Data is replicated between on-premises and cloud environments as it changes. Consistency is guaranteed and recovery is automatic after hardware or network outages.
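As an illustration of what "consistent" means for two copies of the same data, the following minimal sketch compares content checksums of an on-premises file set and a cloud file set. It is not how WANdisco Fusion works internally; the function names (`manifest`, `diff_manifests`) and the sample paths are hypothetical and exist only for this example.

```python
import hashlib


def manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Map each path to the SHA-256 hex digest of its content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff_manifests(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Report paths missing from target, extra in target, or differing in content."""
    shared = set(source) & set(target)
    return {
        "missing": sorted(set(source) - set(target)),
        "extra": sorted(set(target) - set(source)),
        "changed": sorted(p for p in shared if source[p] != target[p]),
    }


# Hypothetical sample data standing in for the two environments.
on_prem = manifest({"/data/a.csv": b"1,2,3\n", "/data/b.csv": b"4,5,6\n"})
cloud = manifest({"/data/a.csv": b"1,2,3\n", "/data/b.csv": b"4,5,0\n", "/data/c.csv": b""})

report = diff_manifests(on_prem, cloud)
print(report)
```

Two environments are consistent in this sense when all three lists in the report are empty.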

On-demand analytics
Replicate fast streaming data to Google Cloud Dataproc as it is ingested on-premises to leverage Dataproc’s scalability and performance for demanding real-time analytics applications. WANdisco Fusion supports Dataproc’s ability to spin up and shut down clusters on demand, so you only pay for resources when you use them.
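The pay-per-use point can be made concrete with a small worked calculation. The hourly rate and usage figures below are hypothetical placeholders, not Google Cloud pricing; substitute your own numbers.

```python
def cluster_cost(hourly_rate: float, hours_running: float) -> float:
    """Cost of keeping a cluster up for the given number of hours."""
    return hourly_rate * hours_running


# All figures are hypothetical and chosen only to illustrate the arithmetic.
HOURLY_RATE = 4.00        # cost of the whole cluster per hour
DAYS_IN_MONTH = 30
JOB_HOURS_PER_DAY = 3     # hours per day the analytics jobs actually run

always_on = cluster_cost(HOURLY_RATE, DAYS_IN_MONTH * 24)
on_demand = cluster_cost(HOURLY_RATE, DAYS_IN_MONTH * JOB_HOURS_PER_DAY)
savings = 1 - on_demand / always_on

print(f"always on: {always_on:.2f}")
print(f"on demand: {on_demand:.2f}")
print(f"savings:   {savings:.1%}")
```

The sketch ignores storage, network and replication costs, which are billed separately from cluster uptime.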

Automatic recovery
Recovery is automatic after planned or unplanned network or hardware outages in both on-premises and cloud environments.

Seamless, flexible and easy to install
Replicates data between on-premises clusters deployed on any Hadoop-compatible storage and Google Cloud Dataproc. Uses standard Google utilities for installation and deployment.

Copyright © 2017 WANdisco, Inc. All rights reserved.

Overcomes challenges of other hybrid cloud solutions for Hadoop
Fusion runs as a proxy to on-premises Hadoop clusters and Google Cloud Dataproc, replicating data as it changes in either environment. DistCp-based solutions offered by Hadoop vendors:
• Run in batch
• Require significant administrator overhead for setup, maintenance and monitoring
• Impose significant overhead when moving data that prevents other applications from performing
• Don’t guarantee data consistency across on-premises and Google Cloud Dataproc clusters
• Can’t replicate data as it’s ingested
• Require manual intervention to handle out-of-sync conditions
• Risk administrator error leading to data loss and extended downtime during recovery.

About WANdisco
WANdisco is the world leader in Active Data Replication™. Its patented technology provides continuous consistent access to changing data anytime and anywhere with guaranteed consistency, no downtime and no business disruption. WANdisco Fusion serves crucial high availability requirements including cloud migration, Hadoop Big Data and Application Lifecycle Management, including Subversion, , Gerrit and Access. WANdisco works directly with Fortune 1000 companies around the world and in all sectors to ensure their data, whether on-premises or in the cloud, can give them the real insight they need. WANdisco is committed to the ODPi Interoperable Compliance Program to ensure its products are interoperable across a wide range of commercial Hadoop platforms.

About Google Cloud Dataproc
Google Cloud Dataproc is a managed Spark and Hadoop service that takes advantage of open source data tools for batch processing, querying, streaming, and machine learning.

For more information
About Google Cloud Dataproc: cloud.google.com/dataproc
About WANdisco Fusion: wandisco.com
Email [email protected]

Supported environments

Hadoop
• HDP 2.1.2 – 2.5
• CDH 4.4, 5.2 – 5.10
• PHD 3.0 – 3.4
• IBM BI 2.1.2 – 4.2
• Amazon EMR
• Google Compute Engine
• HDInsight

Cloud
• Google Cloud Storage
• Amazon S3
• MS Azure WASB
• OpenStack Swift

Operating systems
• RHEL 6, 7 (x86-64)
• 6, 7 (x86-64)
• CentOS 6, 7 (x86-64)
• 12.04, 14.04
• SLES 11 (x86-64)
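The versioned Hadoop entries in the supported environments list can be checked programmatically. The sketch below is a minimal example, not a WANdisco tool: the `SUPPORTED` table is copied from the list, the function names are hypothetical, and it assumes that patch releases of an upper bound (for example HDP 2.5.3) count as within range, which the datasheet does not state. Versions are compared numerically, so CDH 5.10 correctly sorts after 5.9.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '5.10' into (5, 10)."""
    return tuple(int(part) for part in text.split("."))


# (lowest, highest) inclusive ranges copied from the supported environments list.
SUPPORTED = {
    "HDP": [("2.1.2", "2.5")],
    "CDH": [("4.4", "4.4"), ("5.2", "5.10")],
    "PHD": [("3.0", "3.4")],
    "IBM BI": [("2.1.2", "4.2")],
}


def is_supported(distribution: str, version: str) -> bool:
    """Return True if the version falls inside a listed range for the distribution."""
    v = parse_version(version)
    for low_text, high_text in SUPPORTED.get(distribution, []):
        low, high = parse_version(low_text), parse_version(high_text)
        # Assumption: compare against the upper bound only at the bound's own
        # precision, so '2.5.3' is treated as part of the '2.5' line.
        if low <= v and v[: len(high)] <= high:
            return True
    return False


for dist, ver in [("CDH", "5.9"), ("CDH", "5.0"), ("HDP", "2.5.3")]:
    print(dist, ver, is_supported(dist, ver))
```

Entries listed without versions, such as Amazon EMR, are left out of the table because the list gives no range to check against.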

Talk to one of our specialists today
US TOLL FREE +1-877-WANDISCO (926-3472)
EMEA +44 114 3039985
APAC +61 2 8211 0620
ALL OTHER +1 925 380 1728
wandisco.com

Join us online to access our extensive resource library and view our webinars
5000 Executive Parkway, Suite 270, San Ramon, California 94583
Follow us to stay in touch: @WANdisco