Data Science in Industry
Ellie Dobson
Pivotal

Network intrusion detection networks in telecommunications

What has industry ever done for us?

Theory-driven approach: ‘start with the system and work towards the data’
Data-driven approach: ‘start with the data and work towards the system’

[Chart: value against complexity]
• Descriptive Analytics: What happened?
• Diagnostic Analytics: Why did it happen?
• Predictive Analytics: What will happen?
• Reactive Analytics: How can we make it happen?

The value of data over time
• Big Data (size): make insights from a large historical dataset…
• Fast Data (speed): … and use them to make split-second decisions on real time data
• drive automated low latency actions in response to events of interest
[Time axis: year+, year, month, day, s, ms, µs]

drive automated low latency actions in response to events of interest
• predictive maintenance
• payment fraud
• formula 1 racing

[Figure: true positive rate vs. false positive rate, with series labelled Telemetry; Telemetry, Car setup; Telemetry, Car setup, Traffic; Telemetry, Car setup, Traffic, Weather]

Big Data / Complex Data
• Operational Data
• Commercial & Public Data
• Dark Data
• Social Data

TOOLKIT

1 Find Data
Platforms
• Greenplum DB
• Pivotal HD
• Hadoop (other)
• SAS HPA
• AWS

2 Write Code
Editing Tools
• Vi/Vim
• Emacs
• Smultron
• TextWrangler
• Eclipse
• Notepad++
• IPython
• Sublime
Languages
• SQL
• Bash scripting
• C
• C++
• C#
• Java
• Python
• R

3 Run Code
Interfaces
• pgAdminIII
• psql
• psycopg2
• Terminal
• Cygwin
• Putty
• Winscp

4 Write Code for Big Data
In-Database
• SQL
• PL/Python
• PL/Java
• PL/R
• PL/pgSQL
Hadoop
• Pig
• Hive
• Java
• Spark

5 Implement Algorithms
Libraries
• MADlib
Java
• Mahout
R
• (Too many to list!)
Text
• OpenNLP
• NLTK
• GPText
C++
• opencv
Python
• numpy
• scipy
• scikit-learn
• Pandas
Programs
• Alpine Miner
• Rstudio
• MATLAB
• SAS
• Stata

6 Show Results
Visualization
• python-matplotlib
• python-networkx
• D3.js
• Tableau
• GraphViz
• Gephi
• R (ggplot2, lattice, shiny)
• Excel

7 Collaborate
Sharing Tools
• Chorus
• Confluence
• Socialcast
• Github
• Google Drive & Hangouts

fashion analytics

video analysis

call centre analysis

What has industry ever done for us?

Thanks for listening!