Infrastructure and Performance

An Islandoracon Workshop
Instructors: Gavin Morris & Luke Taylor

Instructor: Luke Taylor
DevOps Team Lead
discoverygarden inc.
155 Queen St. Suite 101
Charlottetown, PE C1A 4B4
discoverygarden.ca

Instructors - Gavin Morris
● Team Lead & Dev Ops at Born Digital
  ○ http://born-digital.com/
● Convener of the Islandora DevOps Interest Group
  ○ https://github.com/islandora-interest-groups/Islandora-DevOps-Interest-Group
● Led the Islandora DevOps Panel: Building Islandora at the inaugural Islandora Conference (Islandoracon) on Prince Edward Island, Canada
● Presented Automating Islandora Upgrations, Maintenance and Deploys at the Islandora Camp in Hartford, CT

Overview
● Intro to the stack
  ○ Build Types
  ○ Split the stack
  ○ Operating Systems
  ○ Packages
  ○ Services / Software
● Provisioning
  ○ Deploy & Config management tools
  ○ Pipeline
● Performance
● Scaling
● Security
● Best Practices
● The future of the stack
  ○ CLAW
  ○ ISLE
● Q&A
● Resources

Intro to the Stack - What is Islandora?
Source: http://islandora.mnpals.net/pals/islandora/object/PALSrepository%3A412/datastream/OBJ/download/2016-08_Detailed_Islandora_Introduction.pdf

Intro to the Stack - Build Types
● All in One
● 2-3 servers
● 5-7 servers

Intro to the Stack - Build Types: All in One
Recommended Minimum
● 4-6 cores
● 16-32 GB RAM
● 100-200 GB for OS, temp files etc.
● Volume large enough for repository data
● Additional space could be required for staging content

Intro to the Stack - Build Types: 2-3 servers
Web & Database Server (Minimum Requirements)
● 1-2 cores
● 4-16 GB RAM (*depends on platform type, e.g. staging, dev)
● 150-250 GB for OS, temp files etc.
● Additional space could be required for staging content
Fedora Repository Server (Minimum Requirements)
● 2-4 cores
● 8-32 GB RAM (*depends on collection size)
● 150-250 GB for OS, temp files etc.
● Additional space / volume for repository data, e.g. 2-20 TB

Intro to the Stack - Build Types: 5-7 servers
[Diagram: components of a 5-7 server build]
● Storage mount (e.g. NFS)
● Staff/Ingest Front End
● Public Front End
● Read-only Fedora
● Read-write Fedora
● DB server
● Blazegraph
● Solr

Intro to the Stack - Split the Stack
● Remote Solr
  ○ Use Gsearch 2.8+
  ○ Edit fgsindex.indexBase in index.properties in Gsearch.
  ○ You still have to maintain a "dummy" index on the Gsearch server.
● Blazegraph
  ○ Used to replace Mulgara (triplestore) for performance and stability gains
  ○ https://github.com/discoverygarden/trippi-sail
  ○ https://github.com/Smithsonian/trippi-sparql

Intro to the Stack - Operating Systems
Current Stable Recommendations
● Ubuntu 14.04 LTS
● RHEL/CentOS 6.9
Needing more definitive testing
● Ubuntu 16.04 LTS (w/ PHP7)
● RHEL/CentOS 7 (challenges with temporary file system)
Community Poll from Melissa Anez (Have your say!)
● Survey https://docs.google.com/forms/d/1E7NmS4944LD3E51A7SK_8MiNoOWCPnjgY8YWOUC7I-o
● Google Group topic https://groups.google.com/forum/?hl=en#!searchin/islandora/php$20testing|sort:relevance/islandora/WftNSPr7Xi0/vlh6eJUbAwAJ

Intro to the Stack - Operating Systems packages (basic)
man, vim, curl, perl, unzip, automake, subversion, kernel-headers, gcc, zip, dkms, bzip2, openssh, mercurial, pkg-config, build-essential, git, wget, htop, cmake, libtool, apt-utils, kernel-devel, libfreetype6-dev, ntp, yasm, nasm, rsync, autoconf, zlib1g-dev, linux-headers, Development tools

Intro to the Stack - Services / Software
● Apache 2.2 - 2.4 web server
  ○ Modules include but are not limited to
    ■ ssl, rewrite, deflate, headers, expires, xml2enc
    ■ reverse proxy for multi-system setups: proxy, proxy_http, proxy_html, proxy_connect
● Databases
  ○ MySQL 5.5+
  ○ Percona
  ○ MariaDB
  ○ Postgres
  ○ Recommend UTF-8 encoding

Intro to the Stack - Services / Software
● Tomcat 7.0.52+
  ○ Oracle Java JDK or OpenJDK 7/8
  ○ SSL & port 8443
    ■ You will need to build your own JKS/P12/truststore (how to automate?)
  ○ See the Gotchas section regarding versions 7.0.72/8.0.39 and above
● Apache Solr
  ○ Versions 4.2, 4.6.1, 4.10
  ○ Don't use the schema generated by the Gsearch Ant task; it is incomplete (missing catch_all entries etc.)
  ○ A helpful starting point for the schema and solrconfig .xml files: https://github.com/discoverygarden/basic-solr-config

Intro to the Stack - Services / Software
● PHP 5.3.x+
  ○ Drupal 7.54
    ■ Islandora 7.x / HEAD modules
    ■ Additional modules, e.g. ctools, imagemagick, date, views etc.
  ○ Composer
    ■ Drush
● Fedora-Commons 3.8.1
  ○ Triplestore (Mulgara, Blazegraph)
● Fedoragsearch HEAD / 2.7.1
  ○ DGI GSearch Extensions https://github.com/discoverygarden/dgi_gsearch_extensions
  ○ XSL Transforms for Gsearch https://github.com/discoverygarden/islandora_transforms

Intro to the Stack - Services / Software
● Binaries, derivative generation
  ○ ImageMagick
  ○ LAME (audio, mp3 etc.)
  ○ FFmpeg 3.3 (video), built from source
  ○ FITS
  ○ EXIF
  ○ XPDF
  ○ Ghostscript 9.05 (from source)
  ○ Tesseract (OCR)
  ○ Adore-djatoka 1.1
    ■ On multi-system setups, libraries should additionally be installed on web servers
    ■ Requires use of Oracle JDK 7/8

Provisioning - Deploy & Config management tools
Tool          | Language     | Free edition      | Paid edition           | URL                                | Notes
Puppet        | DSL / Ruby   | Free              | Puppet Enterprise      | https://puppet.com/                | (up to 10 nodes)
Chef          | DSL / Ruby   | Free              | Chef Automate / Hosted | https://www.chef.io                |
Ansible       | DSL / Python | Free              | Tower                  | https://www.ansible.com/           | Red Hat owned; agentless
Saltstack     | DSL / Python | Salt Open         | Salt Enterprise        | https://saltstack.com/             |
CFEngine      | DSL / C      | Community Edition | CFEngine Enterprise    | https://cfengine.com/              |
Shell Scripts | Bash / sh    | Free              |                        | https://www.gnu.org/software/bash/ |
Packer        | DSL / JSON   | Free              |                        | https://www.packer.io/             | Builds images

Provisioning - Pipeline
[Diagram: Example Pipeline. Developers #1, #2 and #3 each run a Web & DB server VM and a Fedora server VM. The Development and Production environments each have a Web & DB server and a Fedora repo server.]
● Code Up! Theming, solution packs, modules, XSLTs, schemas, config etc.
● Data Down! Package & software updates, system configuration changes, data migrations, re-indexing of triplestore etc.
● Continuous Integration w/ Testing Suites for Code & Data

Performance
● Using Solr vs SPARQL/iTQL
  ○ Collection Solution Pack (Display Generation)
  ○ Islandora OAI (Query Backend)
  ○ Paged Content Module (Use Solr to derive pages and sequence numbers)
  ○ Breadcrumbs (Breadcrumb Generation)
● Breadcrumbs - Disable if not required, or use Solr
● Enable Drupal caching options (Configuration - Development - Performance)
● Memcached / Varnish

Performance
"(XmlUsersFileModule) null" error
  ERROR 2017-03-10 08:56:54.796 [http-8080-21] (XmlUsersFileModule) null
  ERROR 2017-03-10 08:56:54.805 [http-8080-21] (AuthFilterJAAS) javax.security.auth.login.LoginException: Login Failure: all modules ignored
Source: /usr/local/fedora/server/logs/fedora.log
Reference:
● https://issues.apache.org/jira/browse/XERCESJ-211
● https://jira.duraspace.org/browse/FCREPO-1230
Fix! https://github.com/discoverygarden/fcrepo3-security-jaas

Performance
● Help, too many multisites!
  ○ Islandora installations with Drupal multisites can cause unnecessary database connections.
● Multi-site optimization
  ○ https://github.com/discoverygarden/fcrepo3-security-jaas

Performance
● Islandora Jobs
  ○ https://github.com/discoverygarden/islandora_job
  ○ Faster ingests
  ○ Allows you to have multiple Gearman workers processing derivatives.
● Islandora Gsearcher
  ○ https://github.com/discoverygarden/islandora_gsearcher
  ○ Updates the Solr index upon ingest completion instead of waiting for ActiveMQ

Security
● Directory permissions (Tomcat/Drupal)
● Run services as non-privileged users with no shell.
● Firewalls
  ○ Fail2ban (https://www.fail2ban.org)
  ○ ModSecurity (https://modsecurity.org/)
  ○ Ports / Rules
● Central logging
  ○ Syslog
  ○ Tripwire (https://www.tripwire.com/) (can be used for extended logging in addition to security)
  ○ ELK (ElasticSearch, Logstash & Kibana) https://logz.io/learn/complete-guide-elk-stack/

Best Practices, Gotchas, Tips
● Gsearch issues with Tomcat 7.0.72/8.0.39+
  ○ https://github.com/discoverygarden/gsearch.git
● Try the Islandora Deploy on Ubuntu guide https://github.com/islandora-interest-groups/Islandora-DevOps-Interest-Group/blob/master/Deployment%20Guides/Provisioning-Islandora-on-Ubuntu.md
● AWS S3 mounting as a file system
  ○ https://github.com/danilop/yas3fs
    ■ Debug mode first!
    ■ Make sure it re-mounts properly if the system is restarted.
    ■ Gotcha: There may be an object size limit of 60 GB for ingested binaries, e.g. video etc.
    ■ Mount the datastreamStore to S3 and leave objectStore on EBS for better performance
      ● Caution! Challenges with restoration!
  ○ Alternative: https://bitbucket.org/nikratio/s3ql (same Gotchas apply!)

The future of the stack - Islandora 7.2.x - CLAW
● https://github.com/Islandora-CLAW/CLAW/blob/master/docs/user-documentation/intro-to-claw.md
● https://github.com/Islandora-CLAW/CLAW/blob/master/docs/mvp/mvp_doc.md

The future of the stack - ISLE
Islandora Enterprise (ISLE)
● https://github.com/Islandora-Collaboration-Group
● https://islandora-collaboration-group.github.io/
● https://islandora.ca/content/islandora-together-meet-islandora-consortial-group

Q&A

Resources
● Islandora http://islandora.ca
● Islandora sandbox https://sandbox.islandora.ca/
● Vagrant up with Islandora Labs! https://github.com/Islandora-Labs/islandora_vagrant
● Please join the growing global community! http://islandora.ca/membership
● Perhaps jump on a call with one of the Islandora Interest Groups?
  ○ https://github.com/islandora-interest-groups
  ○ https://github.com/islandora-interest-groups/Islandora-DevOps-Interest-Group
● One can learn so much from the Islandora Community on Google Groups!
  ○ https://groups.google.com/forum/?hl=en#!forum/islandora-dev
  ○ https://groups.google.com/forum/?hl=en#!forum/islandora

Thank you!
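
Split the Stack - Remote Solr (example sketch)

The Remote Solr bullets earlier in this deck mention editing fgsindex.indexBase in Gsearch's index.properties. A minimal sketch of how that edit could be automated, for example from a provisioning script, is shown here. The helper name set_property, the sample properties text, and the host solr.example.org are illustrative assumptions and are not taken from Gsearch itself; check the real file on your server before adapting this.

```python
import re


def set_property(text: str, key: str, value: str) -> str:
    """Return Java-style properties text with `key` set to `value`.

    An existing uncommented `key = ...` or `key: ...` line is replaced in
    place; if no such line exists, the setting is appended at the end.
    Comment lines and all other settings are left untouched.
    """
    pattern = re.compile(r"^\s*" + re.escape(key) + r"\s*[=:]")
    out = []
    found = False
    for line in text.splitlines():
        if pattern.match(line):
            out.append(f"{key} = {value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"


# Hypothetical sample content; a real index.properties has more settings.
sample = (
    "# example only\n"
    "some.other.setting = unchanged\n"
    "fgsindex.indexBase = http://localhost:8080/solr\n"
)

# solr.example.org is a placeholder for your remote Solr server.
updated = set_property(
    sample, "fgsindex.indexBase", "http://solr.example.org:8080/solr"
)
print(updated)
```

In a real deployment the text would be read from, and written back to, the index.properties file of the relevant Gsearch index (its location depends on your install), and Gsearch would likely need a Tomcat restart to pick up the change.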
