FAST FOR THE CURIOUS

Accelerating CloudStor's Storage Performance

Chris Myers, Gavin Kennedy
AARNet Cloud Services

AARNET’S VISION IS OF A GLOBALLY NETWORKED DATA-SHARING ECOSYSTEM

Accelerating knowledge creation and innovation to benefit future generations of Australians.

Connect. Collaborate. Innovate.

AARNET IS THE USER DATA CONNECTOR

[Diagram: the AARNet network at the centre, connecting Research Cloud, Public Cloud, HPC, Secure Cloud, Domain Repositories, Institution, University Repositories, Research Storage, and Government Repositories.]

CLOUDSTOR

Research Data Storage Accessible to all Australian Researchers
• Easy, Familiar
• Cross Institutional
• On-Net
• Fast
• Flexible Integration
• Domestic/Local Support

aarnet.edu.au

APPLICATION STACK
• ownCloud Web Portal
• ownCloud Mobile Client
• ownCloud Sync Client
• OnlyOffice Collaborative Document Editing
• AARNet Tenant Portal
• AARNet Rocket Fast Parallel Uploads
• FileSender Secure File Sharing
• AARNet S3 Gateway
• SWAN Jupyter Notebooks
• Collections Data Packaging
• EOS Scalable File System

CLOUDSTOR NODES

• 4 geographic locations
• 2 x geographic replication
• 1 x tape backup

SERVER ARCHITECTURE

EOS – A geographically distributed filesystem developed by CERN

EOSD – A client process that allows multi-user access to EOS

S3 – AWS Simple Storage Service: object storage accessed through a web interface

MINIO – S3-like object storage interface

ownCloud – Sync & Share & User Interface software

S3 COMPLIANT STORAGE

The challenge: Implement an S3 gateway to CloudStor's storage.

The problem: ownCloud and EOS did not natively support S3.
• Proof of concept: Minio talking to EOSD.
• Various workarounds to speed up large file uploads.
• Seemed okay until the first upload of 200,000 1 KB files.

The solution: Several iterations and a new version of EOS, until Minio could talk directly to EOS.
• Faster
• Stable
• Cleaner
• Maintainable
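A rough back-of-the-envelope model suggests why 200,000 tiny files hurt so much more than one large upload: every object pays a fixed per-request cost regardless of its size. The sketch below is illustrative only. The 50 ms per-request overhead, the 1 Gbit/s link, and the function name are assumptions, not measurements from CloudStor.

```python
def upload_seconds(n_files, file_bytes, per_request_s, link_bytes_per_s, parallel=1):
    """Estimate upload time as fixed per-request overhead plus raw transfer time.

    Overhead is divided across `parallel` concurrent requests; transfer time is
    not, because concurrent requests still share the same link.
    """
    overhead = n_files * per_request_s / parallel
    transfer = n_files * file_bytes / link_bytes_per_s
    return overhead + transfer


# Assumed figures for illustration only (not from the talk).
PER_REQUEST_S = 0.05        # 50 ms of round trips and metadata work per object
LINK_BYTES_PER_S = 125e6    # 1 Gbit/s expressed in bytes per second

many_small = upload_seconds(200_000, 1024, PER_REQUEST_S, LINK_BYTES_PER_S)
many_small_parallel = upload_seconds(
    200_000, 1024, PER_REQUEST_S, LINK_BYTES_PER_S, parallel=16
)
one_large = upload_seconds(1, 200_000 * 1024, PER_REQUEST_S, LINK_BYTES_PER_S)

print(f"200,000 x 1 KB, serial:      {many_small:,.0f} s")
print(f"200,000 x 1 KB, 16 parallel: {many_small_parallel:,.0f} s")
print(f"1 x 200 MB:                  {one_large:,.1f} s")
```

Under these assumptions the per-request overhead, not the link bandwidth, dominates the small-file case, and parallel requests reduce it but do not remove it.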

BUT THEN

Photo by Freddie Collins on Unsplash

SO

WHY?

Photo by Zach Lezniewicz on Unsplash

UPGRADE EVERYTHING

Upgrade, upgrade, upgrade. Rinse. Repeat.

BUT

EOS is a disk-based, low-latency storage service. Having a highly-scalable hierarchical namespace, and with data access possible by the XROOT protocol, it was initially used for physics data storage. Today, EOS provides storage for both physics and user use cases. Instances of EOS include EOSUSER, EOSPUBLIC, EOSATLAS, EOSCMS.

The main target area for the service is physics data analysis, which is characterised by many concurrent users, a significant fraction of random data access and a large file-open rate.

For user authentication EOS supports Kerberos (for local access) and X.509 certificates for grid access. To ease experiment workflow integration SRM as well as gridftp access are provided. EOS further supports the XROOT third-party copy mechanism from/to other XROOT enabled storage services at CERN.

AND

SO I SAID

Photo by Iker Urteaga on Unsplash

BUT DJ SAID

Photo by Jason D on Unsplash

SHARDING

DEDICATED S3 SHARDS

Now in production

Much tuning of Minio to match client behaviours

QuarkDB to stabilise EOS

Containers and orchestration for rapid deployment and robust management

Allows us to stripe shards across the same infrastructure or deploy to new infrastructure.

Photo by Alessio Lin on Unsplash
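The slides do not say how requests are routed to shards. As a purely hypothetical sketch, one way to combine dedicated shards with a shared pool is a lookup table with a hash-based fallback. Every endpoint name, tenant name, and the `shard_for` function below is invented for illustration and does not describe the CloudStor implementation.

```python
import hashlib

# Hypothetical shared shard pool; these hostnames are placeholders.
SHARED_SHARDS = [
    "s3-shard-a.example.org",
    "s3-shard-b.example.org",
    "s3-shard-c.example.org",
]

# Hypothetical tenants that have been given a dedicated shard.
DEDICATED_SHARDS = {
    "institution-x": "s3-institution-x.example.org",
}


def shard_for(tenant: str) -> str:
    """Return the shard endpoint for a tenant.

    Dedicated assignments win; everyone else is spread over the shared pool
    with a stable hash, so the same tenant always lands on the same shard.
    """
    if tenant in DEDICATED_SHARDS:
        return DEDICATED_SHARDS[tenant]
    digest = hashlib.sha256(tenant.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARED_SHARDS)
    return SHARED_SHARDS[index]


print(shard_for("institution-x"))
print(shard_for("lab-42"))
```

A simple modulo over the pool size reshuffles most tenants when a shard is added, so a real deployment would more likely keep an explicit tenant-to-shard mapping or use consistent hashing.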

THE NEW S3 IS SHINY!

WHAT WE CHANGED

• New storage model using shards
• Moved to Kubernetes (K8s)
• Calico overlay
• Complete rewrite of MinIO gateway
• Increased bandwidth available to crl2 and acti CloudStor nodes

NETWORK UPGRADE FOR CLOUDSTOR

[Diagram: Cloudstor crlt node network. Labels: PE crlt, PE wmelb, 100G uplinks, Spine1, Spine2, 2 x 100G, Leaf1 to Leaf6, 4 x 40G, server links at 100/50/40/25/10G, CS Blade Server, CS Compute Server, CS Storage Server, 1G OOB.]

• Multiple points of redundancy
• OOB and fabric separated
• In-service maintenance of switching
• Improved connectivity to AARNet4
• Improved connectivity to servers
• Improved server interface options 10/25/50/100G
• Regular switch OS patching for features and security possible
• Raw available bandwidth to server/services increased
• Improved scaling and reduced oversubscription
• Single data flow performance improvements

S3 CLIENTS

• MinIO (https://min.io/)
• Cyberduck (https://cyberduck.io/)
• Rclone (https://rclone.org/)
• S3 Browser (https://s3browser.com/)
• s3cmd
• s3fs
• And loads more (Google is your friend)

MINE IS NOT THAT FAST

• Server or desktop network connection
• Storage speed
• Firewalls and other such security devices
• Congestion
• Lots of really little files
• Hosts not tuned for WAN performance
• Incorrect settings in the tool
• Not all tools are equal in performance and/or stupidity
• Is this the right use case for S3?
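The point about untuned hosts can be made concrete with the bandwidth-delay product: a single TCP flow can have at most one window of data in flight per round trip, so a small window caps throughput on a long path no matter how fast the link is. The sketch below uses assumed figures (a 10 Gbit/s path, 50 ms round-trip time, a 64 KiB window); none of them come from the talk.

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_bits_per_s * rtt_s / 8


def max_throughput_bits_per_s(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on single-flow TCP throughput for a given window and RTT."""
    return window_bytes * 8 / rtt_s


# Assumed path characteristics, for illustration only.
LINK_BITS_PER_S = 10e9    # 10 Gbit/s
RTT_S = 0.05              # 50 ms round trip
SMALL_WINDOW = 64 * 1024  # a 64 KiB window, as on a host without window tuning

needed = bdp_bytes(LINK_BITS_PER_S, RTT_S)
capped = max_throughput_bits_per_s(SMALL_WINDOW, RTT_S)

print(f"Window needed to fill the path: {needed / 1e6:.1f} MB")
print(f"Throughput with a 64 KiB window: {capped / 1e6:.1f} Mbit/s")
```

With these numbers the flow needs tens of megabytes of buffer to fill the path, while the small window limits it to a tiny fraction of the link, which is one reason two hosts on the same network can see very different S3 speeds.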

OPPORTUNITIES FOR SHARDS

Data Services – F.A.I.R.?

Domain Services – Bio, HaSS, Geo, etc.

Sensitive Data – ASD and/or ISO 27001 compliance

Digital Preservation – Archivematica

Institutional Shards

Data distribution shards for BIG data generators

AARNet Mirror

Photo by Sharon McCutcheon on Unsplash

MOONSHOT

CLOUDSTOR UNDER YOUR DESK

Photo by Bill Jelen on Unsplash

BUT WAIT, THERE’S ALWAYS MORE

“The new ownCloud architecture … combines the new user interface “Phoenix” with the (CERN) next generation storage engine “Reva” … switch from the scripting language PHP to the compiled programming language Go … introduces gRPC based high-performance cloud-native microservices which handle connections to storage and to the pluggable app framework”

“AARNet, the Australian research and education network, is co-developing ownCloud Infinite Scale and will be the first organization using it in the field.”

https://owncloud.com/owncloud-infinite-scale-owncloud-unveils-new-architecture-for-unlimited-scalability/

CONTACT:

[email protected]

Photo by Quino Al on Unsplash