ALICE security research site at University of Frankfurt
9th ALICE Tier-1/Tier-2 Workshop
Andres Gomez Ramirez, Udo Kebschull IRI - Goethe University Frankfurt ALICE UF Grid site
[email protected] UF Grid Site
➢ Security research, not production data processing. ➢ PhD thesis: “Deep Learning and Isolation Based Security for Intrusion Detection and Prevention in Grid Computing”. ➢ 5 Ubuntu 16.04 nodes. ➢ Centos 6 containers. ➢ CernVM-FS installed in the hosts and shared as a volume inside Docker containers.
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 2 Motivation: Grid Security Challenges
➢ Users can execute any application: arbitrary code execution by design. ➢ Payloads are frequently executed directly on host Operating System. ➢ Network sections are shared. ➢ Hundreds of thousands of jobs running simultaneously. ➢ Expensive to have many security experts monitoring the Grid. ➢ Similar to Cloud computing.
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 3 Proposed Solutions
➢ Linux Containers to execute payloads. ➢ Network isolation with virtual networks. ➢ Isolation to extract better payload behavior data from the host and from the network. ➢ Automated Intrusion Detection and Prevention. ➢ Deep Learning to enable this automation.
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 4 Security by Isolation: Linux Containers
/ Container / Container / Container / /boot /boot /boot /boot /dev Squid /dev rrdTool /dev Nagios /dev /etc /etc /etc /etc /kernfs /kernfs /kernfs /kernfs /home /home PHP /home Perl Python /home /media /media /media /media /opt /opt /opt /mnt /proc /proc /proc /opt /run Apache /run Cherokee /run Lighttpd GStreamer core /proc /srv /srv /srv /root /sys /sys /sys /run /usr MySQL /usr MariaDB /usr Drizzle /srv /var /var /var /sys /usr systemd-nspawn /var d systemd user D-Bus n PulseAudio systemd logind journald networkd IPC u
PID1 session daemon o daemon s e v i s u
l System Call Interface (SCI) c x e
Render Process Memory Virtual File Network cgroups Nodes & KMS scheduler manager System interface DRM Linux kernel
System Graphics Display Hardware CPU Storage Network GPU RAM RAM controller
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 5 Grid Job Execution and Network Isolation
Worker node Worker node
Pilot job Bob's grid Alice's grid Bob's grid job job Virtual network job Container Pilot Pilot Container Eve's grid job job Alice's grid job Eve's grid job job
Container Pilot job
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 6 Behavior Monitoring for the Grid
Working Node
Bob's grid Alice's grid Job Job Virtual network Container Pilot Pilot Container Classification: Job Job • Normal Eve's grid Job • Malicious
Container Pilot Job Hidden Input Output
Grid jobs word2vec monitoring embedding data vectors
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 7 ALICE Grid Security Monitoring
➢ Linux Containers ➢ Deep Learning ➢ Convolutional Neural Networks ➢ Recurrent Neural Networks ➢ Generative method for improving training ➢ Grid Jobs – normal vs malware
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 8 Grid job classification with Convolutional Neural Network Trace log
Word2vec
Output class Input
Dense connected layer with dropout and softmax single output word embedding Convolutional layer vectors multiple filters
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 9 Training data generation with Recurrent Neural Networks
Input layer x 1 x 2 ... x N
IO W W HH
Hidden layer h 1 h 2 ... h M
W HO
Output layer o 1 o 2 ... o L
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 10 Arhuaco: proof-of-concept implementation
Administrator Worker node workstation
Grid job Grid job
Virtual network Container Pilot Pilot Container job job
Grid job
Container Pilot job
Source Source interface 1 interface 2 VoBOX Worker node (Sysdig) (Bro) Monitoring Execution Grid job Grid job engine interface Virtual network Virtual network Container Container Pilot Pilot Container Feature Distributed job job container Eve's grid extraction job engine Classification Generation Pilot job
Response engine
Worker node Arhuaco
Grid job Grid job
Virtual network Container Pilot Pilot Container job job
Grid job
Container Pilot job
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 11 Arhuaco: proof-of-concept Implementation
➢ Linux Containers: Docker, Docker Swarm ➢ Deep Learning: Keras, Theano, TensorFlow, Python 3. ➢ Data collection: System Calls - Sysdig, Network connection - The zeek Network analysis tool. ➢ Grid Middelware - ALICE AliEn.
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 12 Evaluation: Grid Jobs vs Malware
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 13 Evaluation: Grid Jobs vs Malware
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 14 Results: Impact of Isolation over performance
1600 Grid Jobs in total
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 15 Results: Grid job classification – normal vs malware
➢ CNN: Convolutional Neural Network ➢ SVM: Support Vector Machine ➢ FPR: False Positive rate ➢ ACC: Accuracy
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 16 Results: training data generation results
➢ TPR: Sensitivity or True Positive Rate ➢ SPC: Specificity ➢ FPR: False Positive rate ➢ ACC: Accuracy
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 17 Conclusions
➢ Docker containers can be used to isolate and extract behavior information from Grid jobs without big performance impact. ➢ Deep learning is highly effective to identify “malicious” Grid jobs. ➢ CNNs with Word2Vec preprocessing provides improved accuracy than traditional SVM. ➢ Synthetic generated data can improve the training process.
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 18 Future work
➢ Thesis submitted. ➢ Exploring options: Arhuaco as open source and/or commercial tool. ➢ UF Grid site probably will not continue operations. ➢ Future research topics of interest: differential privacy, adversarial machine learning training, privacy preserving intrusion detection.
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 19 Thank you!
Questions? Appendix: CNN results
System Calls Network connections
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 21 Appendix: System call results
CNN CNN vs SVM
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 22 Appendix: Network traces results
CNN CNN vs SVM
Andrés Gómez – 9th ALICE Tier-1/Tier-2 Workshop Slide 23 CERN Security Operations Center
Andrés Gómez – Frankfurt University Slide 24