Student Resume Book

Class of 2018

[email protected]

CLASS OF 2018 PROFILE

WOMEN: 44%
CLASS SIZE: 41
AVERAGE YEARS PRIOR WORK EXPERIENCE: 1.5

PRIOR DEGREE CONCENTRATIONS
SOCIAL SCIENCES 1.16%
BUSINESS & FINANCE 11.24%
OTHER 2.33%
ECONOMICS 25.19%
STEM 69.38% (Computer Science 4.65%, Science 5.81%, Engineering 22.09%, Mathematics & Statistics 36.82%)

INTERNSHIP PLACEMENTS
A.T. KEARNEY, ABC SUPPLY, AIRBNB, APPLE, INC, BALYASNY ASSET MANAGEMENT, BUZZFEED, CME GROUP, ENOVA, EXPEDIA, FORD MOTOR COMPANY, GODADDY, HCSC, KPMG, LAZARD, LINKEDIN, MOLEX, INC, NASA, NORDSTROM, INC, ON POINT TECHNOLOGY, OPEX ANALYTICS, PROCTER & GAMBLE CO, SCHNEIDER, SOCIAL FINANCE, TRANSUNION, LLC, TRELLO, UNIVERSITY OF CHICAGO, ZURICH AMERICAN INSURANCE CO

PATRICK CHANG JAMIE CHEN JERRY CHEN JOHNNY CHIU

LUCA COLOMBO GRACE CUI ANISHA DUBHASHI JILL FAN

MATT GALLAGHER MICHAEL GAO LAUREN GARDINER JOE GILBERT

SARAH GREENWOOD VARUN GUPTA VERONICA HSIEH WENZE HU

RISHABH JOSHI BROOKE KENNEDY ARVIND KOUL TUCKER LEWIS

WEI LI EMMA LI ZILI LI JUNXIONG LIU

YUQING LIU DANIEL LÜTOLF-CARROLL SPENCER MOON ERIC PAN

MICHAEL PAULEEN CHRIS ROZOLIS WILL SONG CHRISTA SPIETH

PENNY SUN PHYLLIS SUN SAURABH TRIPATHI VINCENT WANG

LOGAN WILSON HAO XIAO WENJING YANG TONG YIN

ETHEL ZHANG

PATRICK CHANG
Product Manager by Trade
Data Scientist in Training

510.710.7317 | [email protected]

EDUCATION

Northwestern University
Master of Science in Analytics
Sep 2017 – Dec 2018
Honors: Graduate Fellowship
GPA: 3.83
The Master of Science in Analytics is a cross-disciplinary master’s degree with an applied curriculum exploring data science, machine learning, and business informatics. Developed profiling clusters and predictive pricing model for the Chicago Parks District Day Camp which accounts for 33% of the Parks District’s revenue.
Topics: Unsupervised Learning, Predictive Analytics, Text Analytics, Deep Learning, Big Data, Optimization, Healthcare Analytics, Data Mining, GIS for Public Health, Databases

Columbia University
Bachelor of Science
Aug 2007 – May 2011
Major: Industrial Engineering - Minor: Sociology - Dean’s List Honors
Activities: Engineering Class Council, Asian American Alliance Exec Board
Conducted research in both Political Science and Public Health. Credited in:
• "Costly Jobs: Trade-related Layoffs, Government Compensation, and Voting in U.S. Elections", Yotam Margalit, American Political Science Review.
• “Systems biology of human benzene exposure”, Luoping Zhang, Chem Biol Interact.

WORK EXPERIENCE

Urban Labs (UChicago)
Chicago
Jun 2018 – Present
Data Research Intern – Poverty Lab
Developing scope and building model predicting families at risk of first-time homelessness so that LA County can optimize distribution of homelessness prevention services. For another project, crafting an interactive dashboard displaying the results of a model predicting the likelihood of college freshmen to drop out before graduation.

Nielsen
San Francisco & New York
Jul 2012 – Sep 2017
Senior Product Manager – Nielsen Marketing Cloud
Managed the growth and activation of Nielsen datasets for use in programmatic environments, including: Television Viewership and Card Spend.
• Grew product revenue 200% YOY to $40MM in 2014 and continued growth 150% YOY through 2015; brokered partnerships with 30 new platforms.
• Oversaw team composed of two data scientists and three offshore analysts.
• Led the activation of Nielsen datasets on mobile advertising platforms; at launch, Mobile Precision Marketing was the first mobile ad targeting solution utilizing television viewership data.
Emerging Leaders Associate
Member of the Nielsen Emerging Leaders Program, Class of 2012. The Emerging Leaders Program is an 18-month management accelerator program, which provides knowledge and experience through exposure to a diversity of projects over four business units.

CS Technology
New York
Jun 2011 – May 2012
Associate Consultant
Oversaw multi-million dollar infrastructure technology projects including the JP Morgan Chase company wide refresh and Thomson Reuters global upgrade. Created an internal resource forecasting and allocation tool.

SKILLS
R, Python, SQL, GIS, Spark, Java, Hadoop, Tableau, STATA, SAS, HTML5
Conversational Mandarin
Eagle Scout
Drums, Photography, Climbing, Cooking

ACTIVITIES
UChicago Civic Scopathon – Organizing Committee: Onboarding non-profit partners for solution building event.
Techsoup – Volunteer Consultant: Advised on projects including Techsoup’s free phone rollout.
Bay Area Rescue Mission – Volunteer: Coordinated volunteer and resource allocation of tech and food services for the shelter.

Jamie Chen
805.405.3924 | [email protected] | .com/in/jamie-chen | .com/jchen0529 | public.tableau.com/profile/jamie

EDUCATION Northwestern University, McCormick School of Engineering Evanston, IL Master of Science in Analytics, Merit-based Fellowship Expected Dec 2018 Courses: Predictive Analytics, Database & Information Retrieval, Data Mining, Deep Learning, Optimization

College of William and Mary Williamsburg, VA Bachelor of Science in Mathematics, Marketing Double Major, Magna Cum Laude May 2014

PROFESSIONAL EXPERIENCE Microsoft Corporation Redmond, WA Data Scientist Intern June 2018 – Aug 2018 • Defined the objectives of improving bot traffic detection for Microsoft News to protect revenue and improve user experience. Built complex data pipelines to query, join, and preprocess daily clickstream data (>1B records) for analyzing bot traffic, performed exploratory data analysis on anomalous bot users’ behaviors • Trained predictive models to classify bots, designed and tested bot detection rules that can be applied at the hourly level, improved warm path analytics and resulted in daily data storage saving on over 3.7M page views

Ernst & Young, Quantitative Economics and Statistics (QUEST) Washington, D.C. Senior Analyst July 2016 – July 2017 • Led a team of four to publish the 2016 US Investment Monitor Study that analyzed investment trends by industry • Quantified companies’ economic impact as a result of planned investments. Final reports helped multiple companies qualify for state tax incentives and aided their communication with stakeholders • Led marketing initiatives with cross-functional teams to expand service offerings and won two projects Analyst Aug 2014 – June 2016 • Designed stratified statistical samples in SAS to estimate companies’ tax deductions and provide litigation support • Launched 25+ web surveys, analyzed survey responses and created dynamic Tableau visualization dashboards

Harman International Industries, AHA Radio Palo Alto, CA Strategic Planning Intern June – Aug 2013 • Forecasted 3G mobile user growth in the target market and influenced management’s decision to enter market

TECHNICAL SKILLS & PUBLICATIONS Skills: Python • PySpark • R • SQL • Java • Hadoop • Hive • C# • SAS • Tableau • SPSS • (advanced) Publications: Ernst & Young, Quantitative Economics and Statistics (QUEST) • Viewpoints on paid family and medical leave Mar 2017 • 2016 US Investment Monitor Study – EY Aug 2016 • Impact of the Orphan Drug Tax Credit on treatments for diseases June 2015

PROJECT WORK & INTERESTS Chicago Botanic Garden, IBM Analytics Jan 2018 – May 2018 • Identified data scope and joined multiple datasets together to extract insights on CBG members. Created new segments and behavioral profiles for 48k members using unsupervised learning methods (K-Means, PCA, and Gaussian mixture models), recommended strategies for increasing donation based on member segments Shopify, Industry Practicum Project Oct 2017 – May 2018 • Analyzed clickstream data by developing classification and clustering models to understand malicious bot traffic

Compass Pro Bono Consulting, Compass DC Oct 2016 – May 2017 • Developed a 3-year strategic plan for a non-profit in Virginia to increase annual philanthropy event participation

Interests: Swim, kickboxing, marathons, travel, karaoke, tango, hiking, music and food exploration

Zheyuan (Jerry) Chen
[email protected] · (347) 604-0819

Education
Northwestern University – Evanston, IL December 2018 (Expected)
Master of Science in Analytics GPA: 3.98/4.0

Columbia University – New York, NY August 2011 Ph.D. in Chemical Physics

University of Science and Technology of China (USTC) – Hefei, China July 2006 B.S. in Chemical Physics, Admitted to Special Class for the Gifted Young

Professional Experience NASA Jet Propulsion Laboratory (JPL) Pasadena, CA Data Science Intern June 2018 – August 2018 . Built a recommender that helped engineers fill out categorical fields in failure reports, which was estimated to save 40 hours per year . Implemented classification models in Python for form fields with predictions of top-3 options and reduced model prediction time by a factor of 60 using NumPy . Trained random forest, naïve Bayes and support vector machine models with recall larger than 0.8 using engineered features (text and recency/frequency)

Projects Zurich Insurance Practicum Project September 2017 – June 2018 . Built a better pricing model (neural network) of Worker’s Compensation insurance to help Zurich optimize premium pricing strategy . Converted in-production model from SAS to Python (using H2O library) with a replication error of 0.02% . Calculated model scores from more than 0.6 million policies using generalized linear model (GLM) . Assessed the model performance in terms of accuracy, execution and business impacts

Embiggening The Simpsons Dialogues April 2018 – June 2018 . Created deep learning models that wrote dialogues in the style of The Simpsons . Achieved the human-level performance of identifying the 4 main characters by applying character-level language model with long short-term memory (LSTM) networks . Selected as the best poster in the deep learning class

ShopRunner Repurchasing Analysis January 2018 – March 2018 . Applied discrete time survival model logistic regression to analyze the repurchasing behavior of customers in the ShopRunner network . Identified factors that affected repurchasing: recency/frequency, holiday season, and network effect . Found a positive network effect which showed that healthy competition boosted repurchasing

Catalog Mailing Marketing Analysis September 2017 – December 2017 . Predicted future purchase to maximize the total actual purchases (payoff) of top 1000 prospects . Achieved 44% of the theoretical maximum payoff in a highly imbalanced dataset, 11 times higher than the payoff of the baseline method . Developed a two-step model by combining logistic and multiple regression models in R . Performed substantial data cleaning and feature engineering

Technical Skills Python, R, SQL, Java, TensorFlow, Hadoop, Spark, AWS, Tableau, D3

Research Experience
Columbia University New York, NY
Postdoctoral Research Associate September 2011 – May 2013
Graduate Research Assistant September 2006 – August 2011
. Deployed nanocrystals and thin carbon film in next-generation photovoltaics
. Initialized, designed and executed experiments that pinpointed a key issue limiting photovoltaic performance
. Published results in high-impact journals and presented findings at prestigious conferences

SHIH-CHUAN (JOHNNY) CHIU
(847) 262-7315 | [email protected] | ://github.com/johnnychiuchiu

EDUCATION
NORTHWESTERN UNIVERSITY Evanston, IL
Master of Science in Analytics Dec 2018 (expected)
Coursework: Predictive Analytics, Data Mining, Data Visualization, Deep Learning, Analytics for Big Data, Data Warehouse, Text Analytics, Social Network Analysis
NATIONAL TAIWAN UNIVERSITY Taipei, Taiwan
Bachelor of Mathematics/Bachelor of Economics Jun 2013
Relevant Courses: Computer Programming, Computational Mathematics, Statistics, Probability Theory, Mathematical Software, Economic Forecasting
ULM UNIVERSITY Ulm, Germany
Exchange Program, Baden-Württemberg Scholarship Mar 2012-Jul 2012

SKILLS & CERTIFICATIONS
Patent: Marketing Intelligence Platform, Utility Model Patent in Taiwan, Dec 1, 2017-Jun 22, 2027, Certification Number M552625
Programming & Software: Python, R, Spark, Hive, Pig, SQL, , Bash, Java, Tableau, Microsoft Office
Languages: Native speaker of Mandarin & Taiwanese; fluent in English; intermediate German
Certifications: Google Analytics Certification, AdWords Certification, DoubleClick Bid Manager Certification

PROFESSIONAL EXPERIENCE
LinkedIn Mountain View, CA
Data Science, Analytics Intern Jun 2018-Sep 2018
• Provided product suggestions to improve app retention by analyzing the early user actions that determine users’ long-term engagement; presented findings to key stakeholders to influence the product roadmap.
• Built automated insight-extraction tools to produce actionable insights from multi-dimensional data.

urAD (marketing & advertising agency) Taipei, Taiwan Data Analytics Specialist Jul 2014-Jun 2017 • Enhanced data management platform’s functionality by generating user segment based on behavior; resulted in an additional $20k of quarterly revenue. • Built Python API for the data analytics features of company's marketing intelligence platform; produced insightful results including ad performance ranking, ad design and campaign optimization suggestions. • Wrote data analysis report helping marketers make business decisions & increase sales using website clickstream data; generated 10 new partnerships with e-commerce clients in 2 months. • Managed in-house mobile performance tracking platform; completed server-to-server integration with 30 companies; resulted in 120k user downloads for 40 apps in 6 months. • Led company ERP platform’s full development cycle from ideas, system flow, to refining user experience and leading cross-departmental communication; cut internal workflow processing time by 50%.

PROJECT WORK
BP NORTH AMERICA, Northwestern University Oct 2017-Jun 2018
Generated customer segmentation and built a consumer lifetime value model to drive incremental revenue.
ShopRunner, Northwestern University Feb 2018-Mar 2018
Implemented algorithms to generate personalized recommendations for ShopRunner’s members.

Luca Colombo
(224) 334-5010 • [email protected] • www.linkedin.com/in/lucacolombo1

Education

Northwestern University. Evanston IL Anticipated Dec 2018 Master of Science in Analytics. GPA: 4.0/4.0 • Coursework: Predictive Analytics, Data Mining, Big Data, Java & Python Programming, Databases & Data Warehouses, Data Visualization, Machine Learning Model Deployment, Deep Learning • Future coursework: Optimization & Heuristics, Reinforcement Learning, Text Analytics • ABC Supply Hackathon, 3rd Place. Enova Data Smackdown, 3rd Place Università Bocconi (Italy) and Université catholique de Louvain (Belgium) Apr 2016 Joint Master of Science in Economics. Graduated magna cum laude Università Bocconi. Italy Oct 2013 Bachelor’s Degree in Economics. Graduated cum laude University of Chicago. Chicago IL. Exchange quarter Sept 2012 – Dec 2012

Skills & Certifications

Computer skills: Python, R, SQL, Spark, MapReduce, Java, JavaScript (D3), Git, Tableau, SAS, AWS, Excel Languages: English, Italian, French, Spanish

Work Experience

Blue Cross and Blue Shield of Illinois. Chicago IL Jun 2018 – Sep 2018 Data Science Sr. Intern • Deployed to production a Python module to automatize the update of a database table (currently run on a weekly basis) saving man-hours, improving scalability and mitigating risk of errors • Increased writing performance by 10 times and parallelized execution of hundreds of SQL queries • Enabled to outsource the running of the tool to non-technical colleagues, by focusing on traceability, reproducibility and user friendliness (using logging, Sphinx, batch files and YAML files) Citadel. Chicago IL Mar 2017 – Aug 2017 Electronic Communications Surveillance Analyst • Conducted case-driven surveillance of e-communications to detect inappropriate and illicit behavior • Evaluated proofs of concept for a new holistic employee surveillance machine learning software MAPP Economics, an economic consulting firm Analyst. Brussels, Belgium Jan 2016 – Jul 2016 • Worked closely with managing director in the context of antitrust investigations and merger control involving the European Commission and National Competition Authorities • Used statistical learning and data visualization to study competitive behavior of firms Intern. Paris, France Jul 2015 – Dec 2015 • Performed data cleansing, exploratory data analysis and data visualization • Supported analysts and economists in the preparation of reports and client presentations Banco Desio, a publicly traded commercial bank. Desio, Italy Jun 2014 – Sept 2014 Summer intern, Office of Strategic Planning • Analyzed financial statements of competing banks to benchmark quarterly earnings Università Bocconi. Milan, Italy Nov 2011 – Jul 2014 Teaching Assistant for the courses “IT skills for economics” and “Introduction to STATA”

Projects

Industry Practicum: Zurich North America Nov 2017 – Jun 2018 • Monitored business impact of in-production pricing model for workers’ compensation insurance • Developed machine learning model to help Zurich optimize its pricing strategy NBA History Visualization. Github repo: https://bit.ly/2sTMwuZ May 2018 – Jun 2018 • Created an interactive dashboard using D3 to explore the evolution of the NBA’s style of play Employee Turnover Analysis. Github repo: https://bit.ly/2KqOLw3 Feb 2018 – Mar 2018 • Trained a Random Forest classifier to predict probability of quitting with 95% recall • Built a web-app to interact with the model and deployed it to

Interests

Swimming, cycling and running. Member of the Northwestern Triathlon Club and Bocconi Running Club

Yue (Grace) Cui
(626)203-1729 | [email protected]

EDUCATION
Northwestern University, Evanston Dec 2018 (Expected)
M.S. in Analytics, McCormick School of Engineering GPA: 3.85/4.00
Coursework: Predictive Analysis, Data Mining, Big Data Analytics, Deep Learning, Data Visualization, Data Management, Databases & Information Retrieval, Text Analytics, Optimization & Heuristics, A/B Testing, Time Series, Statistical Consulting
University of California, Los Angeles Jun 2017
B.S. Statistics; B.A. Economics Cumulative GPA: 3.83/4.00, Statistics Major GPA: 3.90/4.00
Honors and Awards: Magna Cum Laude; Winner - Best Use of External Data, Marketing Optimization, ASA DataFest; 1st Place, Target Corp. Case Competition; 2nd Place, PWC Challenge Case Competition

TECHNICAL SKILLS R, Python, SQL, Spark, Hadoop, Impala, Hive, MapReduce, Tableau, AWS, Java, D3.js, HTML/CSS, Git, Unix, SAS, Tensorflow

EXPERIENCE Procter & Gamble, Cincinnati Jun 2018 – Aug 2018 Data Science Intern, Global Skin & Personal Care • Initiated and led a project that focused on using NLP to recognize specific vocabulary involving P&G product ingredients from 7M+ call center conversations (Python). • Identified and visualized trends in consumer concerns about ingredients with text analytics methods such as TF-IDF scoring and Word2Vec (pandas, numpy, nltk, scikit-learn, textblob, seaborn, matplotlib) and developed algorithms to detect anomalies. • Created data pipelines and performed sales lift analysis using Big Data tools (Impala, Hive, Spark) on 14B+ transaction records. • Improved due diligence process for $300,000/year external data purchase by converting data cleansing process and exploratory analysis from Excel to R and Tableau (dplyr, tidyr, plotly), reducing evaluation timeline from 3 months to 3 weeks. British Petroleum, Chicago Oct 2017 – Jun 2018 Data Science Student Consultant • Identified key factors affecting customers’ gas purchasing behavior with customer segmentation analysis (R, Python). • Built a Consumer Lifetime Value model and developed an interactive Tableau dashboard tool to help BP determine optimal spending in marketing activities. • Translated technical algorithms and methodologies into client deliverables and effectively communicated the results to clients. Junction of Statistics and Biology Lab, UCLA Jul 2016 – Feb 2017 Research Assistant • Led two other students in building an R package to help users compare tissue or cell types based on chromatin states. • Wrote functions that allow users to convert bigwig files into csv files and turn the data into eligible formats for statistical analysis. 
Credit Reference Center, The People’s Bank of China, Beijing, China Jun 2015 - Aug 2015 Intern, Research and Development Department • Designed marketing plans for the company’s social media platform with knowledge of the credit reporting system in order to attract more followers. Followers of the WeChat platform rose from 20,000 to 90,000.

PROJECTS Gaming Analytics Project, Northwestern University Dec 2017 – Jun 2018 • Designed a novel team-based Player-versus-Player recommender system framework for players to boost performance using 16M+ matches’ data provided by Destiny 2, a modern massively multi-player online game by . • Constructed players’ profiles by applying various clustering methods such as K-means, GMM, and Archetypal analysis (Python). • Selected teams and players with similar playstyles but higher performance or faster improvement for recommendation using KNN. • Submitted paper to the Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE 2018). Grocery Sales Forecasting Project, Northwestern University Jan 2018 – Mar 2018 • Applied supervised learning methods such as Logistic Regression, GAM, Decision Tree, Random Forests, GBMs, and Neural Network to forecast products’ unit sales for a large Ecuador-based grocery retailer. • Developed a front-end interactive web app using Python, HTML, JavaScript and Flask and deployed it on AWS EC2. Predictive Analytics Class Project, Northwestern University Oct 2017 – Dec 2017 • Built logistic regression, lasso regression, and multiple linear regression models to identify the top 1000 customers with the highest expected dollar purchase from catalog mailing for a retail company.

VOLUNTEER AND INTERESTS
• Volunteer: Recorded electronic books in both English and Mandarin for visually-impaired people from China.
• Interests: Tennis, ping-pong, badminton, photography, cooking, baking

ANISHA DUBHASHI
949-981-2132 | [email protected]

EDUCATION
Northwestern University 9/2017 - 12/2018 (Expected)
Master of Science, Analytics GPA: 3.98
Coursework: Predictive Analytics (Supervised Learning), Data Mining (Unsupervised Learning), Deep Learning, Big Data Analytics, Data Visualization, Text Analytics, Optimization and Heuristics
Honors: Graduate Fellowship

University of California, Los Angeles 9/2009 - 6/2013 Bachelor of Science, Mathematics-Economics

TECHNICAL SKILLS R, Python, SQL, Java, Tableau, Git, HTML/CSS, JavaScript (D3.js), Hadoop, Spark, Hive, AWS (Redshift, S3, EC2), VBA

WORK EXPERIENCE Nordstrom, Seattle, WA 6/2018 - 8/2018 Data Science Intern, Marketing Analytics • Implemented clustering algorithm using R to segment millions of customers in order to gain insight into behaviors, measure marketing effectiveness, and better allocate future marketing spend. Presented results to leadership. • Created a model that classifies new customers into a cluster for incorporation into ETL of productionized model. Synchrony Financial, Chicago, IL 9/2017 - 6/2018 Graduate Student Consultant • Collaborated in a team to create a chatbot prototype in Python that handles various customer service tasks utilizing natural language processing and a Naïve Bayes machine learning model on customer service e-chat data.

Conversant Media/Epsilon, Chicago, IL Senior Analyst, Technical Account Analytics 3/2016 - 8/2017 Analyst, Technical Account Analytics 12/2014 - 3/2016 • Queried a SQL database with nine billion daily bids and over eight billion client retail transactions to optimize success of campaigns; presented actionable insights in Tableau to clients to win five new business targets. • Designed and launched A/B tests to determine optimal promotional and marketing strategies for clients. • Created a monitoring system for test and control messaging using Python to detect operational issues. • Led new hire trainings and biweekly knowledge sessions for the Analytics department up to the VP level on company database specifics, table structures, and mapping across profiles.

UnitedHealth Group, Irvine, CA Analyst, Analytics 7/2014 - 11/2014 Analyst, Pricing 7/2013 - 7/2014 • Built, enhanced, and troubleshot complex models used by the pricing team to improve accuracy and efficiency. • Conceptualized new, competitive pricing rates while maintaining profit levels by analyzing competitive intelligence and historical forecast data. Presented a study of standard rates to demonstrate technique effectiveness.

PROJECTS AND AWARDS 1st place, Enova Data Smackdown, Evanston, IL 10/2017 • Trained a boosted tree model in R to predict the future value of properties for Enova’s data science competition. Gaming Analytics Research, Evanston, IL 12/2017 - 6/2018 • Developed an archetypal analysis machine learning model in R to detect anomalies and exotic playstyles in the eSports game League of Legends based on Riot’s API data of previously recorded matches.

Cuisine Prediction Web App Development, Evanston, IL 1/2018 - 3/2018
• Built a text classification web app for cuisine prediction with Python, Flask, and HTML deployed on AWS EB.

QINGJIN (JILL) FAN
[email protected] § (217) 898-0681 § www.linkedin.com/in/jill-fan

EDUCATION NORTHWESTERN UNIVERSITY EVANSTON, IL Master of Science in Analytics EXPECTED: DEC 2018 § GPA: 3.83/4.00 § Relevant Coursework: Predictive Analytics, Big Data Analytics, Data Mining, Machine Learning, Deep Learning, Optimization, Databases, Data Warehousing, A/B Testing, Recommender System, Text Analytics, Data Visualization

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN URBANA-CHAMPAIGN, IL Bachelor of Science in Civil Engineering; Minor in Computer Science AUG 2013 – MAY 2017 § GPA: 3.85/4.00, High Honors, Engineering Dean’s List

SKILLS § Programming: Python, R, SQL, Hadoop, Spark, MapReduce, Hive, Java, JavaScript (d3.js), C/C++, Bash § Tools: Tableau, Git, Amazon Web Services, Azure, Matlab, MAPR, Mathematica, Minitab

PROFESSIONAL EXPERIENCE SCHNEIDER NATIONAL GREEN BAY, WI Advanced Analytics Intern JUN 2018 – AUG 2018 Associate Engagement Survey Analytics § Applied an RNN model for sentiment analysis of text data, leading to a 20% increase in testing accuracy with data augmentation. § Extracted linguistic features using NLP techniques, visualized word embeddings using nltk, gensim and HoloViews. Work Experience Text Analytics § Built an ensemble of various SVM models to classify candidates based on their previous work experience with 80% accuracy. § Derived confidence scores from SVM models as factors for employee turnover prediction using scikit-learn. Driver Message Analytics § Constructed an ETL process to identify freeform notification messages from drivers with Spark, Python and Hive. § Leveraged big data infrastructure to store driver messages and enhance analytics capabilities in the system.

BOSCH WUXI, CHINA Human Capital Analytics Intern JUN 2017 – AUG 2017 § Migrated KPI calculation tools from Excel to Python (pandas, NumPy) and built a user-friendly interface that reduced KPI reporting process from 3 hours to 20 minutes. § Identified correlation among company’s medical clinic visiting rate, workforce illness rate and absence rate, interpreted the results and presented deliverables to senior manager to improve productivity and reduce absenteeism. § Acquired, cleaned, and structured massive data of 300k+ rows from multiple sources and built a master database in Excel for future workforce analysis used by the HR department.

PRICEWATERHOUSECOOPERS SHANGHAI, CHINA Technology Consulting Intern JUN 2016 – AUG 2016 § Designed a weight-adjusted grading scheme to evaluate system suppliers for a core banking platform transformation project. § Re-engineered business processes that increased the efficiency of processing time by avoiding human-errors, and achieved a smooth transition between two distinct electronic back-end processing systems.

PROJECT EXPERIENCE ANALYTICS HACKATHON 1ST PLACE - NORTHWESTERN UNIVERSITY & ABC SUPPLY MAY 2018 § Implemented K-Means clustering to segment fleet vehicles and to determine potential replaceable commercial vehicles. § Identified significant factors for vehicle replacement in Random Forest with 93% accuracy. § Developed a Holt-Winters time series model to predict sales loss due to vehicle replacement.

BRAND RECOMMENDATION ENGINE - SHOPRUNNER JAN 2018 – MAR 2018 § Designed a hybrid recommender system to personalize a top list of brands for customers based on their purchase behavior.

CUSTOMER PURCHASE PREDICTION - NORTHWESTERN UNIVERSITY NOV 2017 – DEC 2017
§ Classified and identified highly responsive customers resulting from catalog mailing and built predictive models for dollar purchase amount using logistic regression, multiple linear regression, and lasso regression in R.
§ Selected the best model based on statistical significance, goodness of fit, parsimony, interpretability, and financial payoff.

Matthew Todd Gallagher
[email protected] | (302)-241-6027

Technical Skills / Recent Projects:
Languages: Python, Java, R, SQL, JavaScript, D3.js
Libraries: scikit-learn, pandas, NumPy, Matplotlib, Flask, PySpark
Software: AWS Cloud Stack (Redshift, Lambda, Athena, Kinesis), SSIS, Tableau, Hadoop

Industry Practicum: BP North America | Northwestern University • Performed customer segmentation analysis from consumer purchase data using R and Python. Created clusters using Gaussian mixture models and designed a recency-frequency customer lifetime value model to identify segments for targeted marketing. Ultimately designed a Tableau dashboard and presented findings to BP North America marketing stakeholders.

Predicting Food Inspection Outcomes | Northwestern University • Built and trained a logistic regression to predict restaurant food inspection failures. Utilized Python, Flask, and AWS to stand up a web app allowing real time API pulls, analysis, and retraining.

Education: Northwestern University | McCormick School of Engineering September 2017–December 2018 Master of Science in Analytics Relevant Coursework: •Optimization and Heuristics •Predictive Analytics •Data Mining •Text Analytics •Analytics for Big Data •Deep Learning

Lewis and Clark College September 2010–May 2014 Majors: Economics and Mathematics Member of Pi Mu Epsilon (Mathematics Honor Society) Honors Economic Thesis: Determining The Effect Of Increased To Higher Education On The Skill Premium Varsity Track and Field

Professional Experience: Expedia Group | Bellevue, WA June 2018–September 2018 BI Engineer Intern • Deployed a reporting and visualization pipeline designed to monitor and analyze the booking transaction data flow. The pipeline leveraged the AWS cloud stack, specifically Kinesis, Lambda, Athena, and Redshift. • Created a metric to capture both overall data flow health and the health of individual microservices. • Built an interactive Tableau dashboard from the reporting pipeline, which acts as the source of truth for overall microservice ecosystem health. • Presented visualization and reporting infrastructure to division and organization leadership.

GetInsured (acquired Array Health) | Seattle, WA December 2016–September 2017 Data Analyst • Improved billing integration scalability through designing ETL scripts, automating data processing in SQL and PowerShell, and writing detailed documentation • Took ownership of insurer billing integrations for our largest client, involving monthly flow of $2-3 million in premiums • Managed client expectations, vendor responsibilities, and coordinated the multi-team effort • Led weekly billing integration meetings with clients, vendors, and GetInsured department leadership • Created custom decision support features for sales department

Array Health | Seattle, WA February 2016–November 2016 Data Operations Support Analyst • Researched and resolved flagged consumer-facing data discrepancies, requiring clear communication with our Customer Support, Product, and DevOps departments • Automated outdated Excel manual procedures using SQL, reducing time required for data QA and cleansing • Developed ETL solutions to standardize and process external enrollment data, allowing easy merging with our schema • Performed QA testing to validate production data processing tools

MICHAEL (YIFEI) GAO (610) 517-3696 · [email protected] EDUCATION NORTHWESTERN UNIVERSITY Evanston, IL Master of Science in Analytics Expected Dec. 2018 • GPA 3.9 LEHIGH UNIVERSITY Bethlehem, PA Bachelor of Science in Accounting with CPA Eligibility (Cum Laude) May 2016 Integrated Degree of Computer Science and Business (Magna Cum Laude) May 2016 • GPA 3.6 – Top 10% TECHNICAL SKILL SKILLS: Python, R, SQL, Spark, Hadoop, Tableau, AWS, Java (MapReduce), JavaScript (d3), Flask, TensorFlow, Hive COURSES: Predictive Analytics, Data Mining, Machine Learning, Data Visualization, Big Data, Deep Learning, Optimization, Text Analytics, A/B Testing, Data Warehousing, Bloomberg Professional Certificates WORK EXPERIENCE A.T. KEARNEY Chicago, IL Summer Business Analyst – Analytics Practice Jun. 2018 – Aug. 2018 • Designed, modeled, and piloted carrier performance program for Fortune 10 automotive manufacturer that is expected to create $6M annual saving and significantly improve operational efficiency by reducing dwell time • Created and presented to 6 partners and client executive directors the interactive geospatial dashboard to visualize vehicle distribution network to be used in carrier contract negotiation • Performed baseline analysis on ocean shipping capacity to validate $5M saving potential after web-scraping JEFFERIES New York, NY Risk Analyst May 2016 – Aug. 2017 • Elevated operational efficiency by 30% through automating new account onboarding process • Streamlined client relationship management by creating KYC-based macros that helped reduce over $1M charges • Mitigated risk through fail analysis, position coverage, and client negotiation on fixed-income and equity products DELOITTE New York, NY Risk Consulting Intern Jun. 2015 – Aug. 
2015 • Conducted 4 field interviews with plant management to identify potential risks in internal controls • Wrote an operational risk report for a leading insurance firm to streamline work processes for the audit team • Tested internal controls quantitatively on 60 samples with supporting materials to identify control weaknesses BANK OF CHINA (Global Fortune 100) Shenzhen, China Software Engineer Intern – PMO Resource Management System Jun. 2014 – Aug. 2014 • Led 7 interns in the design of the system which won 3rd place in the 2014 Creativity Contest (out of 60 teams) • Developed and presented prototype to the CTO which was adopted, serving 1200+ employees at 3 offices DATA SCIENCE PROJECTS CHICAGO BOTANIC GARDEN (NGO Project with IBM Analytics) Feb. 2018 – Jun. 2018 • Identified $50K annual revenue growth opportunity in upselling from segmenting 1M members with GMM • Visualized donation trend and segment profiles with Tableau to lead digital marketing and event redesign PRINCIPAL FINANCIAL GROUP Sep. 2017 – Jun. 2018 • Built regime-switching machine learning models to forecast returns to power factor-based investment strategy SHOPRUNNER Jan. 2018 – Mar. 2018 • Developed NLP-based collaborative filtering brand recommendation engine to engage users and retailer network • Designed and implemented front-end web infrastructure with Flask hosted on AWS and RDS LEADERSHIP & INTERESTS LEADERSHIP: Co-founder of Phi Delta Theta fraternity, Student Leadership Council Board Member, Peer Mentor INTEREST: Visual Design, Travel (20+ countries), Physical Fitness (ACE certificate in progress), Films, NGOs LAUREN GARDINER LG MS IN ANALYTICS CANDIDATE

CONTACT EDUCATION MASTER OF SCIENCE IN ANALYTICS Expected Dec 2018 (859) 492-7381 Northwestern University / Evanston, IL GPA 3.95/4.00

LaurenGardiner2018@u.northwestern.edu Expected Coursework: Predictive Analytics I & II, Deep Learning, Big Data, Text Analytics, Data Visualization, Data Mining, Reinforcement Learning

linkedin.com/in/legardiner BACHELOR OF SCIENCE IN DATA SCIENCE May 2017 Lipscomb University / Nashville, TN GPA 3.97/4.00 github.com/legardiner Minors: Pure Mathematics, Internet and Social Media Marketing Relevant Coursework: Information Structures, Research Methods, Linear Algebra, Statistical Analysis & Decision Modeling, Data Mining & Analysis SKILLS Python WORK EXPERIENCE

Spark SIRI SOFTWARE ENGINEER INTERN Jun-Sep 2018

Scala Identified upstream factors affecting the joinability of Siri usage log data SQL streams by building an explanatory model in Spark MLlib R Engineered a strongly typed derived dataset of Siri usage data in Spark and Scala to ensure data quality for financial reporting Java DATA SCIENTIST INTERN May-Aug 2017 CERTIFICATIONS Provided a data science perspective to the Applied Research team’s R Programming on-going deep learning automatic speech recognition project Getting and Cleaning Data Improved the language model by manipulating 100M lines of text training Exploratory Data Analysis data to mimic conversational speech and creating reproducible script Created tooling for Expert Services to assess client data and model PROJECT WORK transferability using pre-trained word embeddings and Tensorboard

DATA SCIENCE INTERN Jun-Aug 2016 INDUSTRY PRACTICUM: SHOPIFY Oct 2017-May 2018 Built campus classifier with Flask, Sci-kit Learn, Fiona, & Shapely Develop algorithm to classify users as Python packages to predict a user’s university and on/off campus status bots or real shoppers for e-commerce Performed a quantitative analysis in Excel and text analysis in Python platform based on click-stream data for declined B2B contracts that prompted reporting process changes Created a nightly script in Databricks using SQL, Python, and Spark to HONORS enable the measurement of ROI on order inserts for brand partners Nashville Technology Council Student IMPLEMENTATIONS INTERN Jan-May 2016, Sep 2016-Apr 2017 of the Year 2017

College of Computing and Technology Designed, implemented, and documented Juicebox data visualizations Outstanding Senior 2017

[email protected] 512-964-5229 github.com/jl-gilbert Joseph Gilbert

Education Northwestern University, McCormick School of Engineering - Evanston, IL Expected December 2018 Master of Science in Analytics Cumulative GPA: 3.9/4.0 Current Coursework in: Text Analytics, Optimization & Heuristics

University of Minnesota, College of Liberal Arts – Minneapolis, MN May 2015 Bachelor of Science in Economics, Magna Cum Laude with Distinction Minors in Mathematics and Management Cumulative GPA: 3.8/4.0 Dean’s List, Gold Scholar Award, National Scholarship, Presidential Scholarship

Skills R, Python, Java, Git, Bash, Spark, Hive, SQL, Tensorflow, HTML, D3.js, Tableau, AWS

Relevant Experiences Ford Motor Company - Dearborn, MI June-September 2018 Global Data Insight & Analytics Intern • Initiated efforts to enhance forecasting models built with Excel by using Python and PySpark to incorporate bigger and more complex data sources • Forged new relationships between various business units to build a more comprehensive understanding of the autonomous vehicle industry outlook in the next decade, leading to more informed long-term strategy decisions • Mined consumer data from mTAB to gain insight into risk and opportunity in Ford’s highest impact segment

Boom Lab - Minneapolis, MN June 2015-July 2017 Business Analysis Senior Associate • Promoted from Associate to Senior Associate in March 2017 based on performance feedback from Fortune 50 online retail client • Brainstormed with business and engineering leaders to determine most important features to add to or improve on ecommerce search application, resulting in prioritization of future work and conceptual development of projects • Contributed to sprint planning and standup facilitation on an Agile team; filled in as scrum master at times • Provided detailed analysis and feedback to engineers to increase effectiveness and functionality of programs in development • Managed a software engineering intern during Summer 2016 to develop a new spellchecker application from the ground up, resulting in a 30% increase in spelling correction precision • Applied econometric techniques to data in RStudio to calibrate algorithms used by guest-facing applications

Contata Solutions, Ltd. - Minneapolis, MN June-August 2014 Data Analytics Intern • Conducted initial meetings with consulting clients to determine needs and evaluate datasets • Built slideshow presentations for communicating results of text analytics projects to clients • Verified accuracy of data scraped from the web to ensure effectiveness of analysis

Projects We Energies – Industry Practicum October-June 2018 • Developed models in R tying overtime to productivity and employee attrition in customer care centers • Visualized patterns in employee behavior with Python and Tableau • Recommended HR strategy based on observed trends, designed to improve employee productivity, attendance, and satisfaction • Presented project and findings at annual Northwestern University Analytics Exchange conference

NBA Stat Predictions Web App January-March 2018 • Envisioned, designed, and developed self-updating Flask web app to make predictions for NBA stat lines with automated machine learning algorithms • Hosted app and database on AWS EC2 and RDS platforms

Sarah Greenwood 248.953.5227 | [email protected]

EDUCATION Northwestern University, Evanston, IL — Master of Science in Analytics ​ ​ DECEMBER 2018 (EXPECTED) GPA: 3.95/4.0 Masters of Science in Analytics Fellowship Recipient Coursework: Databases, Data Mining, Predictive Analytics, Data Visualization, Big Data, Deep Learning, Text Analytics, Machine Learning Model Deployment Venture for America — Fellow ​ ​ JUNE 2015 - JUNE 2017 Received training from a variety of firms, such as McKinsey, IDEO, Flatiron School, Kaplan, and Goldman Sachs University of Michigan, Ann Arbor, MI — Bachelor of Science, Data Mining and Information Analysis ​ ​ MAY 2015 Minor in Spanish Language and Culture GPA: 3.74/4.0 with University Honors Coursework: Statistical Principles for Problem Solving, Statistical Computing, Quantitative Research Methods, Effective Communication in Statistics, Data Mining SKILLS Programming: R, Python, Java, SQL, C++, Spark, MapReduce, Hive, Javascript (d3) ​ Software: Git, AWS, Google Analytics, Excel, Tableau, Salesforce, Pardot, Adobe Marketing Cloud, Brandwatch ​ EXPERIENCE Ford Motor Company, Dearborn, MI — Smart Mobility, Machine Learning Intern ​ ​ JUNE 2018 - SEPTEMBER 2018 • Investigated root causes of spark plug fouling for 5,000 vehicles by applying survival analysis and temporal data mining on millions of rows of embedded modem and diagnostic trouble code data • Conducted cluster analysis using modem data and connected vehicle data to identify driver behavior segments using KMeans and Gaussian Mixture Models • Created a data pipeline using Hive, Spark, and Hadoop to combine modem data, repair data, and dealership data for millions of vehicles • Optimized data processing pipeline for weekly dataset acquisition using PySpark, which resulted in a 40% reduction in runtime and a 60% reduction in required processing power Rendia, Baltimore, MD — Business Analyst and Strategist ​ ​ AUGUST 2015 - AUGUST 2017 • Built and maintained logistic regression and random forest models which predicted customers’ 
likelihood to renew with 85-90% accuracy based on customer usage patterns and characteristics • Collaborated across departments to harness statistical models to proactively target at-risk accounts and forecast cash flow • Performed ad-hoc analysis to determine best practices for improving customer success and re-engaging expired customers • Developed data tracking methods to improve the cleanliness and quality of data, resulting in clearer, faster, and more accurate reporting • Identified prospective customers and planned and iterated on targeted outreach through market research and search engine marketing MRM//McCann, Birmingham, MI — Marketing Analytics Intern ​ ​ ​ ​ JUNE 2014 - AUGUST 2014 • Queried, analyzed, summarized, and presented insights from large social-media data sets to better understand the public opinion of General Motors’ competitors • Analyzed variables associated with GM Card’s different credit cards’ programs to calculate under which circumstances each credit card program is the most beneficial, and reported these findings internally PROJECT WORK Instacart Recommendation Engine, Northwestern University ​ ​ JANUARY 2018 - MARCH 2018 • Developed a proof of concept collaborative filtering recommendation engine using Scikit-Learn Surprise • Built and maintained front-end web application infrastructure as well as RDS and AWS components

VARUN GUPTA +1-312-964-0262 | [email protected] EDUCATION Northwestern University, Evanston IL Dec 2018 (Expected) Master of Science in Analytics (MSiA) GPA 3.90/4.00 Relevant Coursework: Predictive Analytics 1&2, Data Mining, Deep Learning, Data Visualization, Java and Python Programming, Analytics for Big Data, Databases and Information Retrieval, Reinforcement Learning, Text Analytics Nanyang Technological University, Singapore Jun 2015 B.Eng. (Mechanical Engineering) with Business Minor. GPA 4.41/5.00 SKILLS AND LANGUAGES • Programming Languages: Python, R, Java, SQL, HTML/CSS, Javascript (D3.js) • Software: Spark, MapReduce, Hive, Hadoop, Tensorflow, NetworkX, AWS (E3, EMR, RDS), Unix, Tableau, Git, Office PROJECTS Analytics Consultant, TransUnion (Chicago, IL) – Practicum Project Sep 2017 – Jun 2018 • Created graph structures composed of shared personal identity attributes between customers to help predict the occurrence of Synthetic Identity Fraud within a well-studied sample of 500,000 customers. • Engineered graph features manually using parallel-processing to reduce the time consumed, and then fed these to an XGBoost Tree model to appraise the value of using this graph data and to identify suspicious graph features. • Developed a graph-kernel based Convolutional Neural Network model to bypass manual user-defined feature creation. • Achieved test-set Recall above 70% and Precision above 90% with both models, far outperforming initial expectations. Used these graphs to capture numerous suspicious under-performing customers that were undetected by previous models. Gaming Analytics, Digital Creativity Labs – PvP Recommender System Framework for Teams Jan 2018 – Jun 2018 • Developed a novel, flexible framework for providing different types of recommendations to improve the performance of teams in a Player versus Player (PvP) environment. Evaluated it by building a working recommender system for Destiny 2. 
• Created multiple individual and team profiles by using Gaussian Mixture Models to find clusters of equipment- and playstyle features. Shared the strategies employed by better or faster-learning teams among pools of nearest-. Deep Learning Project – Object Detection for Autonomous Vehicles Apr 2018 – Jun 2018 • Adapted the Mask-RCNN model code to use it for an object detection and segmentation dataset provided by Baidu, with the goal of quickly and accurately detecting object boundaries and correctly classifying them. • Achieved a pixelwise accuracy (IOU) of over 50% on average, and over 90% on images with many close-range objects. Big Data Analytics – Venmo Transaction Analysis Apr 2018 – Jun 2018 • Classified over 7 million Venmo transactions stored on Hadoop using emojis in their descriptions, and analyzed the popularity of each transaction category over various time-frames using PySpark. • Conducted network analysis to study in/out degree growth and the proportion of reciprocal relationships over time. Full Stack Web App Development Jan 2018 – Mar 2018 • Built and deployed a Chicago crime-forecasting web application using Flask and HTML on an AWS EC2 instance. • Automated regular weather-forecast API pulls to store parsed data in an RDS instance, which then fed these as inputs along with historical crime data and a user-selected zipcode to a simple Python-based Random Forest model for predictions. PROFESSIONAL EXPERIENCE Analytics Intern, Enova International (Chicago) Jun 2018 – Aug 2018 • Integrated unused customer data from other Enova credit products to find groups of customers that strongly overperform relative to the existing model’s expectations for longer-term, large principal UK loans using classification tree-based models. 
• Investigated the process by which customers are purchased from aggregators by exploring the feature-space to find customer groups that are either rarely or very frequently purchased by competitors, or far more likely to have an issued loan with one Enova product than another, to maximize profit or volume through smarter acquisitions. • Designed tests to sample applicants and measure the impact of suggested changes for both analyses. Advanced Analyst, Ernst & Young LLP (Bangalore) Nov 2015 – Jun 2017 • Performed valuation as well as difference resolution procedures using analytical and simulation-based pricing models for various vanilla and complex FX, interest rate, equity, and commodity derivative products as a member of the Derivatives Valuation Centre (DVC) in EY’s Risk Advisory division. • Led a team of Analysts performing valuation procedures, while handling all client communications. • Developed a comprehensive tool, using VBA for Access and Excel, that automates the standard DVC process for a variety of vanilla products by interfacing with the user via Access forms and Excel spreadsheets, and serves as an accurate one-stop market data repository. Won the tri-annual “Extra-Miler Award” for potentially saving thousands of man-hours. • Worked onsite in market risk model development and documentation, specializing in FX risk, for a global investment bank. EXTRA-CURRICULAR ACTIVITIES Sports Editor, The Tribune by NTU Students Union Aug 2013 – Aug 2014 • Selected articles, directed journalists, edited their writing, authored complete articles, and finalized the newspaper’s appearance for each issue. Systemic content-choice changes resulted in improved readership of the Sports section.

Veronica Hsieh DATA SCIENCE Passionate about combining data with natural curiosity to understand people & solve business questions creatively; graduate student with prior consulting experience in implementing technology solutions

CONTACT EDUCATION (408) 507-0639 MS in Analytics, McCormick School of Engineering [email protected] Northwestern University | Expected December 2018 linkedin.com/in/veronica-hsieh GPA: 3.8

BS Economics Mathematics, Business Administration LANGUAGES University of Southern California | Aug 2010 – Sept 2014 • Python • R EXPERIENCE • Spark • Java (MapReduce) Jun 2018 Spotify – New York, NY • D3 Aug 2018 Data Science Intern, User Growth & Markets • HTML/CSS • Developed a clustering model to identify similar markets • SQL based on product feature usage, listening behavior, and content preferences TOOLS & FRAMEWORKS • Conducted ad-hoc analysis to support strategy and ops teams on evaluating growth opportunities in various • Tableau markets; created a product transition model to analyze • Git changes in user engagement over time • Flask • MS Excel, Project Oct 2017 BP North America – Chicago, IL • Jira • Distributed computing Jun 2018 Customer Segmentation & Lifetime Value Analysis • Built a customer segmentation model using purchasing behavior, fuel preferences, and station demographic data for RELEVANT COURSEWORK the driver loyalty program; wrote a script to automate the data processing and feature engineering for the model Predictive Analytics • Created a Tableau dashboard for the marketing team to Machine Learning measure the impact on customer lifetime value and perform Optimization scenario analysis on marketing campaigns Data Mining Data Visualization Sept 2014 Optimity Advisors – New York, NY Analytics for Big Data July 2017 Senior Associate, Associate Deep Learning • AMC Networks – Managed the development and delivery of a Text Analytics web API which passed show and movie metadata to various Probability Theory Mathematical Statistics consumer applications • Moody’s Investor Services – Led a team of UX/UI designers and software developers on creating an enterprise survey tool which automated the distribution and ingestion of financial INTERESTS data collected from 500+ companies iMentor ~ Oct 2017 – Present Music Discovery OTHER PROJECTS & PREVIOUS WORK Outdoor Activities • Cryptocurrency Analysis Fashion Editorials • ShopRunner – Recommender system 
Collecting Condiments • Beats by Dre – Global sales analysis

Wenze Hu [email protected] | (425)-589-7187

EDUCATION Northwestern University Evanston IL, USA Master of Science in Analytics Dec 2018 (expected) Select Courses: Deep Learning, Social Network Analysis, Text Analytics, Data Visualization, Analytics for Big Data, Analytics Value Chain, Data Mining, Predictive Analytics, Optimization & Heuristics

Tsinghua University Beijing, China Master of Engineering Jul 2004 2nd Prize – Tsinghua-Guanghua Scholarship; Tsinghua Excellent Freshman Award (top 1%)

SELECT DATA SCIENCE PROJECTS Unemployment Insurance Fraud Detection (Internship Project) Jun 2018 – Aug 2018 • Converted a social network of 2 million unemployment insurance claims to Graph Kernel matrices. • Designed and implemented a Convolutional Neural Network model which takes in the Graph Kernel matrices to detect fraud. Northwestern Admissions Data (Natural Language Processing) Sep 2017 – Dec 2017 • Analyzed archived MSiA admission emails using NLTK (Python, Natural Language Processing). • Generated FAQ and reference answers using Okapi BM25 ranking model (Gensim). Personalized Restaurant Recommendation (Web & Android App, Recommendation) Nov 2017 – Jan 2018 • Designed and implemented Java Servlets to allow users to explore and get recommended restaurants by applying a collaborative filtering & sorting algorithm on Yelp data (, JSON, MySQL). • Designed and implemented an LBS Android app for users to explore, visit and get recommended restaurants. • Integrated Google Map API to display restaurant locations.

PROFESSIONAL EXPERIENCE On Point Technology Chicago, IL Data Scientist Intern Jun 2018 - Aug 2018 • Analyzed 2 million unemployment insurance claims by using Graph Kernels and Convolutional Neural Network to detect frauds. AWS GPU was utilized to accelerate the training process. Everbridge (Nasdaq: EVBG) Beijing, China Technical Project Manager | Scrum Evangelist Sep 2015 - Aug 2017 • Coached 6 Scrum teams (40 people) to accomplish the organizational Scrum transformation from traditional workflow (accelerated the release cadence from quarterly to monthly). • Led cross-functional 10-person Scrum team to achieve all release milestones with 100% fulfillment. Yottaa Beijing, China Product Manager | Certified Scrum Master Jan 2014 - Mar 2015 • Supervised the Traffic Management product line and led a cross-functional 5-person Scrum team to increase team productivity by 50% (measured in the story points completed) in 2015 Q1. France Telecom Beijing, China Product Manager | Certified Scrum Master Dec 2007 - Dec 2013 • Supervised customization & delivery of signature Android devices (6M+ per year). • Led 2 Scrum teams covering 5 feature mobile applications & Managed a 20-person outsourcing QA team. Siemens Beijing, China R&D Engineer Jul 2004 - Dec 2007 • Implemented and maintained the protocol stack of Siemens carrier-grade WAP gateway in Java. • Met with clients as technical specialist and worked across organization to support Pre-Sales, Product Management, QA, and Customer Service.

SKILLS Programming: Java, Python, R, JavaScript, Android SDK Machine Learning Package: TensorFlow, PyTorch Distributed Computing: AWS, Hadoop, Hive

Rishabh Joshi [email protected] | github.com/rishabh-joshi | 224-565-3647

Education Northwestern University, Evanston, IL Expected December 2018 Master of Science in Analytics GPA: 4.00/4.00 Coursework: Predictive Analytics, Databases, Data Mining, A/B Testing, Deep Learning, Big Data Analytics (Hadoop, Spark & Hive)

Indian Institute of Technology (IIT) Guwahati, India June 2017 Bachelor of Technology in Mathematics and Computing GPA: 9.19/10.00 Honors: Institute Merit Scholarship 2014 (Rank 1/50 in the department) Coursework: Statistical Methods & Time Series Analysis, Probability, Linear Algebra, Optimization, Data Structures and Algorithms Technical Skills Python (tensorflow, keras, scikit-learn, xgboost, pandas, seaborn, networkx, numpy), R (xgboost, caret, dplyr, tidyr, ggplot2), C++, Java, SQL, MATLAB, Hadoop, MapReduce, Spark (pyspark, sparklyr), Hive, HBase, Amazon Web Services, d3.js, Tableau, Bash, Git Experience TransUnion, Chicago, IL | Analytics Intern June 2018 – September 2018 • Worked closely with the Sr. Director of Analytics to design and productionize a Mortgage Interest Rate Estimator from end-to-end using 1.2 trillion records which is estimated to generate more than $1M in revenue. • Collaborated with the sales and credit teams to collect information on previously unknown motorcycle loans; identified these loans from 119 billion trades using SQL, Hive (Hadoop) and clustering on Spark MLlib (85% accuracy) to build a risk score model. • Automated and scaled audit reports generation with effective visualizations such as Maps and Waterfall Charts to spot trends and anomalies in auto loans; enabled rapid product iteration to prescreen customers for auto loans targeted marketing. Digital Creativity Labs, Chicago, IL | Data Science Consultant December 2017 – June 2018 • Developed a novel team-based recommender system for MMOG video games to improve player performance. • Created multiple behavioral profiles for 100,000 players of Destiny II using unsupervised learning methods including Gaussian mixture models, K-Means, and Archetype Analysis to recommend appropriate weapons and optimal playing strategies for players. 
TransUnion, Chicago, IL | Fraud Analytics Consultant December 2017 – June 2018 • Engineered graph structures and novel features for 500,000 customers using parallel processing to represent shared identity attributes and predicted synthetic identity fraud by training an XGBoost model on those features. • Automated a time-consuming manual feature creation process by training a graph kernel based Convolutional Neural Network (CNN) model and further improved upon the XGBoost model. • Elevated the performance of the original productionized model by 25%; uncovered several suspicious customers and previously undetected sharing patterns; presented to the entire analytics department of more than 70 people. Saarland University, Germany | Machine Learning Intern May 2016 – July 2016 • Built an author citation network (1.3 million nodes, 172 million edges) to distinguish between authors with the same name by comparing their affiliation, specialization, and co-authors using social network analysis and fuzzy matching on 1.2 million papers. Heritage Institute of Technology, India | Data Science Intern May 2015 – July 2015 • Developed three algorithms in C++ to solve the community detection problem in dynamic social networks that out-performed the state-of-the-art algorithm in optimizing the community structures. Project Work Activity Classification in Videos with Deep Learning for Better Ad Placement April 2018 – June 2018 • Built an application for real-time human activity classification in videos to improve the allocation of ads with 90% accuracy. • Generated features from the video frames using the Inception Model and passed the features to an LSTM Model to utilize the temporal flow of the frames and identify one of the 101 possible activities; built the models using TensorFlow and Keras on GPU. 
Full-Stack AWS Web App Development, Character Recognizer January 2018 – March 2018 • Designed a handwritten character recognition web app following agile methodology with Flask, HTML, and JavaScript, deployed on the AWS EC2 Platform, with a CNN model trained on 800,000 images stored in the Amazon RDS database. Customer Purchase Activity Prediction November 2017 • Predicted customer purchases by creating features to capture the recency, frequency, and monetary value using regression models; successfully identified the top 1,000 customers with highest expected payoff and lowest RMSE among 10 teams. Extracurricular Activities and Volunteer Work • Certified KPMG Six Sigma Green Belt holder. • Coursera Mentor for four data science courses since Sep ‘17. • Top 50 finalist in the national level ITA Coding Challenge ’17. • Tutor for underprivileged children from Dec ‘14 to Dec ‘15.

BROOKE KENNEDY [email protected] · (270) 617-1700 · www.linkedin.com/in/brooke-kennedy

EDUCATION DECEMBER 2018 (EXPECTED) MASTER OF SCIENCE IN ANALYTICS, NORTHWESTERN UNIVERSITY, EVANSTON, IL Coursework: Predictive Analytics • Data Mining, Big Data • Java and Python Programming • Deep Learning • Data Visualization • Databases and Data Warehousing • Analytics Consulting • Optimization • Text Analytics | Awards: Enova Data Smackdown Team Competition – 1st Place

MAY 2017 BACHELOR OF ARTS IN COMPUTER SCIENCE, BELLARMINE UNIVERSITY, LOUISVILLE, KY Honors: Summa Cum Laude | Minor: Mathematics | Awards: Computer Science Faculty Merit Award • National Science Foundation STEM Scholar • Dean’s List • Clare Boothe Luce Undergraduate Research Scholar • Kentucky Academy of Science Annual Meeting 1st place undergraduate engineering poster TECHNICAL PROFICIENCIES Python • Java • R • SQL • Tableau • Hadoop • Spark • Hive • JavaScript (D3) • HTML/CSS • PHP • Git • Linux EXPERIENCE JUNE 2018 – SEPTEMBER 2018 DATA SCIENCE INTERN, OPEX ANALYTICS, EVANSTON, IL Analytics firm that combines machine learning and optimization to answer complex questions o Developed full stack solution utilizing Python, PostgreSQL, Tableau, and internal deployment platform to help Fortune 150 quick service restaurant identify and reduce global supply chain risks by recommending backup options in supply chain for times of contingency o Led initiative on reproducible data science through company-wide presentation and published article on company’s public blog, leading to adoption of techniques in new client projects

OCTOBER 2017 – JUNE 2018 STUDENT DATA SCIENCE CONSULTANT, SHOPIFY, EVANSTON, IL Canadian e-commerce company offering a platform for online stores and retail point-of-sale systems o Used Python and SQL to analyze Shopify clickstream data o Utilized supervised machine learning algorithms to distinguish between sneaker bots and human shoppers

OCTOBER 2017 – DECEMBER 2017 STUDENT DATA SCIENCE CONSULTANT, GREENWICH.HR, EVANSTON, IL Company offering real-time labor market intelligence data o Analyzed Greenwich.HR labor market data, developed multiple predictive models to recommend job roles and salaries based on skills, and presented analysis to Greenwich.HR CEO

MAY 2017 – JULY 2017 QA TECHNOLOGY INTERN, THE LEARNING HOUSE INC., LOUISVILLE, KY Academic program management company offering technology-enabled education solutions o Performed quality assurance testing including weekly regression tests on software deployments o Automated test scripts in Java and Python using Selenium WebDriver to improve testing speed o Created an upgrade test plan for internal enterprise resource planning software (ERP) by documenting test cases o Used PHP and MySQL to create a script that logs needed information to files on a monthly basis

Arvind Koul

[email protected] · (773) 961-4829 · linkedin.com/in/arvindkoul

EDUCATION Master of Science in Analytics, Northwestern University, 09/2017 – 12/2018, Evanston, IL • Post Graduate Diploma in Applied Statistics, IGNOU, 08/2016 – 07/2017, New Delhi, India • Bachelor's in Management Studies, University of Delhi, 07/2013 – 06/2016, New Delhi, India

SKILLS R (caret, xgboost, keras, tidyverse family) • Python (tensorflow, scikit-learn, numpy, pandas, matplotlib) • SQL, Hive • Hadoop, Spark • Tableau

RELEVANT COURSEWORK Predictive Analytics • Data Mining • Deep Learning • Big Data Analytics • Data Visualization • Statistical Inference • Probability Theory • Corporate Finance

WORK EXPERIENCE Analytics Intern, TransUnion, 06/2018 – 09/2018, Chicago, IL Achievements: Implemented challenger xgboost model to predict credit card bust-outs (3M records) resulting in a 26% reduction of annual cumulative bad debt. Used auto-encoders to automate feature generation from raw transactions data to feed into the xgboost model. Improved interpretability of xgboost model by tracing the path of each observation along all the trees in the model thereby decomposing each prediction into impacts due to an observation's feature values.

Analytics Consultant, Principal Financial Group, 10/2017 – 06/2018, Des Moines, IA Achievements: Simulated current Random Forest model used by the firm to predict factor returns in Russell 1000 market. Conducted feature engineering, developed lasso-logistic model, and implemented regime modeling (using Markov chains) on weekly Russell 1000 market data (1994 - 2017) to improve the factor-based investment process.

Co-founder, Fishdeal (subsidiary of YesLife Enterprises), 03/2014 – 02/2017, New Delhi, India Achievements: Co-founded Fishdeal, a retail company for all kinds of aquaria. Relevant analytics experience included conducting A/B tests to increase customer engagement (improved response rate by 15%).

ACADEMIC PROJECTS Image Colorization (04/2018 – 06/2018) Used a sequence of convolutional neural networks in Python for colorization of photos of different bird species, with initial layers used for feature extraction, and deeper layers for colorization.

Movie Recommender Web App (01/2018 – 03/2018) Built a recommendation system using collaborative filtering in Python on the MovieLens dataset. Designed and developed flask web app, which was hosted on AWS Elastic Beanstalk.

Customer Response Prediction (11/2017 – 12/2017) Predicted customer response to a catalog mailing using logistic (probability of purchase) and linear (total amount of purchase) regression in R.

Matthew Tucker Lewis [email protected] (720) 290-0372 EDUCATION Northwestern University – Evanston, IL Anticipated December, 2018 Master of Science in Analytics Cornell College - Mount Vernon, IA May, 2017 Bachelor of Arts in Business Analytics Bachelor of Arts in Psychology Thesis: Mobile Phone Applications for Behavior Change

PROFESSIONAL EXPERIENCE ABC Supply – Chicago, IL Data Science Intern June 2018 – September 2018 • Leveraged in-house and external data sources for feature engineering to develop a predictive model for new markets and branch productivity • Utilized advanced web scraping techniques with Selenium to leverage new data sources for predictive modelling and D3 visualizations Synchrony Financial – Chicago, IL Data Science Consultant November 2017 – Present • Engineered a chatbot solution for expediting customer service interaction, reducing call center load and costs using Python, NLTK, and other NLP tools • Led the team through stakeholder meetings and technical advisor sessions to ensure alignment of product and expectations • Design and implement an interactive personal shopper, building upon the chatbot infrastructure. Integrate natural language processing to ensure seasonally appropriate, personalized recommendations for cross-selling opportunities. ProMazo Inc. – Chicago, IL Project Manager November 2017 – February 2018 • Managed and worked with students to capture and prioritize the enterprise analytics demand of over 100 stakeholders to facilitate the efficient adoption of advanced analytics stack Cornell College Quantitative Reasoning Studio - Mount Vernon, IA Peer Consultant August 2015 – May 2017 • Guided students seeking help with quantitative subjects: homework and exam reviews in Statistics, Economics, Calculus; Experimental Design and Paper Review in Biology, Psychology, Analytics TECHNICAL SKILLS Languages: R, Python, Java, SQL, HTML, CSS, D3, JavaScript Tools/Skills: Tableau, Spark, Hive, MapReduce, HDFS, AWS (S3, EBS, EC2), Git, SPSS, Minitab, Selenium, NLP, Web Scraping, Network Analysis, Linear Programming, Analytic Solver, Flask Applications RELEVANT PROJECTS Google Landmark Recognition - Designed a Deep Learning model based on Xception CNN to classify landmark photos, reaching 93% accuracy with 100 classes Venmo Transaction Classification - Used Spark RDDs and Spark data frames to cluster and classify both Emoji and Text transactions - Modelled Transactional relationships between users to better understand customer activity

Xiaowei (Wei) Li [email protected] (415) 606-1185 EDUCATION Northwestern University, Evanston, IL Sept 2017 – Dec 2018 (Expected) M.S. in Analytics (MSiA) GPA: 3.82/4.0 Relevant Coursework: Predictive Analytics I & II, Data Mining, Analytics for Big Data, Deep Learning, Text Analytics, Data Visualization, Modern Database & Data Warehouse, Analytical Value Chain, Analytical Consulting Leadership

St Mary’s College of California, Moraga, CA Sept 2009 – Feb 2013 B.S. Mathematics, summa cum laude, Minor in Economics GPA: 3.89/4.0

SKILLS Python (pandas, Seaborn, sklearn, H2O, Keras), R, SQL, Spark, MapReduce, Java (Hadoop), Tableau, Git, AWS, D3.js

EXPERIENCE SoFi, Inc. San Francisco, CA Data Science Intern, Marketing Analytics Jun 2018 – Sept 2018 • Created data pipeline to clean, transform, and combine over 200 million prospects’ credit attributes, card transactions, and demographic data stored in AWS S3 and data warehouse with Hive and PySpark • Developed xgboost uplift model for direct mail campaign cadence with an AUC of 0.83 and improved lift of top 2 deciles by 12%; scored prospects by propensity to fund personal loan; extracted recommendation for monthly A/B test experiment and re-allocated marketing budget to reduce cost-per-start by 3% • Built regression models using ensemble method to predict and profile members and applicants by their potential values in SoFi Money product to facilitate marketing with lead prioritization; deployed the model and integrated results by collaborating with warehouse team

Gridsum, Inc. Beijing, China Data Scientist, Data Center Dec 2015 – Jul 2017 • Segmented mobile gaming app users with k-means clustering based on activity ranking, in-app behavior pattern, and purchase history; automated weekly Tableau reports to facilitate design of activities, resulting in increased ARPU (average revenue per user) of 21% over a quarter • Formulated quantitative approach to exploring mobile app user retention: developed solution using decision tree based on users’ in-app behavioral data; collaborated with marketing and operations teams to design A/B test strategies that improved 30-day retention rate by 6% • Oversaw web/app user behavior analytics projects that assisted clients in enhancing online marketing effectiveness, including user segmentation, churn analysis, customer lifetime value (CLV), lead generation, and campaign optimization • Leveraged data science tools and techniques such as regression and classification models, A/B testing, k-means clustering, recommender systems in Python and R using data from disparate data sources (online, offline, transactional, aggregated, qualitative)

Berkeley Research Group, LLC San Francisco, CA Senior Associate Jul 2013 – Oct 2015 • Conducted statistical and probabilistic studies of claims value and potential insurance recovery in light of factual and legal uncertainties; performed allocations of liability costs to insurance coverage by analyzing historical claims and estimating future liabilities; helped client successfully recover 90% of insurance claim payments of $1.6 billion from 120+ insurers • Developed quantitative financial models in R and Excel using structured data on pricing, sales volumes, costs, and competing products to assist clients in damage analysis relating to disputes in antitrust, intellectual property, securities, financial reporting, and mergers and acquisitions

SELECTED CASES/PROJECTS ShopRunner – Retailer Network Analysis (Github: https://bit.ly/2LQnPu1) Jan 2018 – Jun 2018 • Evaluated retailers’ in-network value by aggregating over 5 million members transactions to generate retailer network; constructed retailer segmentation model with network features using PCA and gaussian mixture models; profiled 100+ active retailers to strengthen network connection and facilitate cross-sell; developed an interactive dashboard with D3.js and deployed it on AWS to visualize active retailers’ interactions and sales trends

Zurich North America Insurance – Workers’ Compensation Pricing Model Sept 2017 – Jun 2018 • Migrated current GLM model from SAS to Python; created benchmark metrics to measure and monitor model performance in terms of accuracy, execution, and business impacts; profiled and segmented over 90,000 companies according to their risk level; developed new pricing models with boosted trees, random forest, and RNN models and improved pricing tool accuracy by 10%

Zili Li [email protected] • (406)208-7878 EDUCATION Northwestern University – GPA 3.92 Expected Graduation: December 2018 Master of Science in Analytics • Coursework: Predictive Analytics, Databases, Data Mining, Analytics Value Chain (A/B Testing), Deep Learning, Data Visualization, Analytics for Big Data • Award: ABC Supply Hackathon 1st Place – Identified delivery vehicles that needed to be replaced, estimated the potential loss due to replacement using time-series analysis, and proposed recommendations for reallocating resources The Ohio State University – GPA 3.97 May 2017 Bachelor of Science in Business Administration with Specialization in Accounting, summa cum laude • Honors Thesis: The Impacts of the Affordable Care Act on Preventive Services among Racial Groups Bachelor of Science in Actuarial Science, summa cum laude • Exams: Probability (November 2014), Financial Mathematics (April 2015), Models for Financial Economics (July 2015) TECHNICAL SKILLS • Programming Languages: Python (pandas, numpy, scikit-learn, tensorflow), R (dplyr, tidyr), Java, JavaScript (D3.js), SQL • Tools: Hadoop, Spark, Hive, Amazon Web Services (AWS), Azure, Linux, Tableau WORK EXPERIENCE Expedia Group Chicago, Illinois Air Shopping Optimization Intern June 2018 – September 2018 • Conducted quantitative analysis for 160 million searches to understand the difference in conversion rates by device types • Used K-means clustering to segment mobile and traditional browser users based on their search attributes • Identified the key factors that drove flight booking conversion rates, quantified their importance by building tree-based models with AUC of 0.78, and provided implementable recommendations for optimizing the flight search results State of Ohio Columbus, Ohio Research Intern June 2017 – August 2017 • Utilized data analytics to monitor childcare providers’ activities, detect potential fraud, and improve investigation process • Developed a proactive monitoring dashboard by manipulating and analyzing data from multiple sources, identifying patterns and anomalies, and communicating with subject matter experts for continuous improvement The Dow Chemical Company Midland, Michigan Tax Accounting Intern May 2016 – August 2016 • Identified and analyzed the discrepancies between two software systems in deferred taxes for more than 400 foreign entities and contacted foreign tax managers to understand the discrepancies and determine appropriate adjustments PROJECTS Chicago Park District Practicum Project, Northwestern University October 2017 – June 2018 • Worked with the client to develop a tiered pricing system and remove the need for manual price adjustments with a regression tree model • Created interactive dashboards that assisted the CFO and pricing team to visualize socio-economic data and update prices Game Analytics Research, League of Legends, Northwestern University January 2018 – June 2018 • Assisted broadcasters in real-time storytelling by using archetype analysis to detect anomalous events and exotic plays in the massive esports game League of Legends • Performed exploratory data analysis in MySQL, identified sequences of events, and extracted features from raw data ShopRunner Repurchase Analysis, Northwestern University January 2018 – March 2018 • Engineered features (RFM) from top 10 retailers’ data for survival analysis in R, evaluated the network effects and the impacts of time-dependent covariates on retention rates, and presented actionable marketing strategies to the Analytics team Book Recommender System, Northwestern University January 2018 – March 2018 • Built a recommender system for 10K books based on 6 million user reviews using content-based and collaborative filtering methods in Python and deployed the model as a web app on AWS EC2

Xinyue (Emma) Li Emeryville CA 94608 · (530) 564-2418 · [email protected] EDUCATION Northwestern University Evanston, IL MS in Analytics, GPA: 3.80/4.00 Expected 12/2018 • Relevant Courses: Predictive Analytics, Data Mining, Data Visualization, Analytics for Big Data (Hadoop/Spark/Hive), Database Design and Information Retrieval, A/B Testing University of California, Davis Davis, CA B.S. Applied Statistics, GPA: 3.91/4.00; B.S. Managerial Economics, GPA: 3.76/4.00 06/2017

SKILLS R, Python (Pandas, NumPy, Scikit-learn, Keras), SQL (MySQL, Hive, Netezza), Spark, Hadoop, Java, D3.js, HTML/CSS, Tableau, AWS, SAS, C, Stata, Matlab

WORKING EXPERIENCE TransUnion Chicago, IL Data Science Intern 06/2018 – 09/2018 • Constructed risk score models to review personal loan applications using various methods such as tree-based models (XGBoost, Gradient Boosting, C5.0, Random Forest), SVM and Artificial Neural Networks • Researched and implemented methods of variable interpretation in Neural Networks for adverse selection • Performed quantitative analysis on 1B+ trades records to identify the customers’ capacity to absorb ongoing credit products as the interest rate increases, which improved its cycle readiness through early identification of a shift in consumers’ debt Graduate Analytics Consultant 09/2017 - 06/2018 • Created graph components and generated features on 500k+ credit accounts with shared identity information • Trained Boosting Trees with XGBoost to identify the fraudulent accounts and achieved 95.9% precision • Researched and applied a Convolutional Neural Network using Graph Kernels to improve the fraud detection performance • Detected undiscovered suspicious accounts and improved the previous model by 25% Agricultural Issues Center Davis, CA Undergraduate Researcher 07/2016 - 06/2017 • Analyzed the effect of the legalization on Cannabis price in California through Hypothesis Testing in R • Performed exploratory data analysis and analyzed and researched reasons for price change across different agricultural commodities through Functional Principal Component Analysis on the 1995-2015 California agricultural exports Standard Chartered Bank Shanghai Intern 08/2016 - 09/2016 • Predicted key indicators (revenue growth, EBITDA, taxable profit, etc) of the client to ensure liquidity for debt issuance • Explored potential collaboration opportunities between Standard Chartered Bank and Alibaba through analyzing Alibaba’s operational model and cash flow

PROJECTS Predictive Modeling on Clothing Sales 10/2017 - 12/2017 • Cleaned data inconsistencies and imputed missing values with MICE algorithm and KNN methods • Generated features measuring the recency and frequency of consumers’ purchasing behaviors • Evaluated the efficacy of catalog-driven marketing through predicting customers’ future purchases with stacking Logistic Regression and Multiple Linear Regression models • Estimated the expected profit to assist the company’s marketing strategy decision Gaming Analytics on Destiny II 01/2018 - 06/2018 • Designed a Player versus Player recommendation system framework based on team play to improve team performance • Implemented clustering analysis through K-means, GMM, Archetype Analysis on 16M+ matches from Destiny II to create player profiles • Produced team profiles accordingly and provided recommendations via K-Nearest Neighbor method • Submitted paper to AIIDE(Artificial Intelligence and Interactive Digital Entertainment Conference) 2018 Venmo Transaction Study 04/2018 - 06/2018 • Conducted quantitative analysis with effective visualizations on 7M+ transactions via PySpark and SparkSQL to summarize Venmo's social network • Analyzed different emoji use patterns in various time frames to learn users' spending habits using RDD and Spark data frame • Clustered the transaction messages with PySpark using text-based attributes to improve the text classification algorithm in each segmentation

ACTIVITIES AND LEADERSHIP • Vice President of Career Development Department at CSSA – Davis, CA 01/2016 - 06/2017 • Academic Coordinator at UC Davis Statistics Club – Davis, CA 06/2015 - 12/2016 • Volunteer at NYBL Foundation of America – Sacramento, CA 01/2015 - 03/2016

YUQING LIU C: (530) 574-9428 | E: [email protected] | https://www.linkedin.com/in/yuqing-liu11 EDUCATION Northwestern University | Evanston, IL Expected Dec 2018 Master of Science in Analytics University of California, Davis | Davis, CA Sep 2013 – Jun 2017 Bachelor of Science in Computational Statistics, Major GPA: 3.87/4.00 University Honor Program, Outstanding Undergraduate Research Scholar, Outstanding Performance in Statistics SKILLS Programming: R, Python, SQL, C, C++, Java, MATLAB, SPSS, SCIP, HTML/CSS/Bootstrap, JavaScript, D3 Data Science: Machine Learning, A/B Testing, Recommender System, Time Series Analysis, Latent Class Analysis, Consumer Analysis, Visualization, Model Lifecycle, Deep Learning, Data Warehousing Tools: AWS, Tableau, Qlik, Plotly, tensorflow, scikit-learn, MapReduce, Spark, Hive Expected by graduation: Text Analysis, Social Network Analysis INDUSTRY EXPERIENCE The Boeing Company St. Louis, MO Data Science Intern Jun 2018 – Aug 2018 • Applied word clustering based on syntagmatic analysis, LSTM, and classification models to find patterns in red status project updates written by managers, diagnosed pain points in the production process, and provided data-supported insights of censoring to management. • Identified causes of safety incidents at Boeing with text analysis of incident reports and categorized injury levels. • Improved the current time series model for finance expenses prediction by constructing stacked models, acquiring market data and feature engineering. Principal Financial Group Evanston, IL Student Researcher Sep 2017 – June 2018 • Used Regime Switching Model to predict economic factors’ forward returns to support factor-based investing.
Lam Research Corporation Fremont, CA Data Science Intern Jun 2016 – Jun 2017 • Built classification tools for 30,000+ semiconductor parts using Support Vector Machine and Natural Language Processing methods in R and Python; performed analysis and created dashboards in Qlik. • Improved inventory planning by predicting demand for parts with regression model and survival analysis; reduced average delivery time from 8 days to 6 days. • Presented model performance with data visualization techniques to both technical and nontechnical audiences; beat the accuracy of the Microsoft Azure team. RESEARCH Manifold Learning with Outliers, Davis, CA Sep 2015 – Jun 2016 • Researched the current algorithms and tuning parameters applied in Principal Curve method and analyzed disadvantages when abnormalities exist in data. • Improved the implemented Principal Curve method in R by making the fitted curve less sensitive to outliers. PROJECTS Moving Object Detection – CVPR 2018 Challenge Apr 2018 – June 2018 • Performed image segmentation of moving objects in street-view images with Mask-RCNN and YOLO algorithms. Zillow Home Value Prediction Web Application Jan 2018 – Mar 2018 • Predicted future home value based on Zillow Research data and managed the model following production level lifecycle and agile software development requirements. • Created a web application providing visualization and model results with Java, HTML and AWS. Consumer Analysis – ShopRunner Jan 2018 – Mar 2018 • Estimated consumer lifetime value with retention models for retailer segmentation. • Developed metrics to measure customer and retailer similarities; built recommendation system for top retailers with collaborative filtering and dimensionality reduction methods. AWARDS iidata Convention 1st Place, Stephen Curry Shooting Prediction | University of California, Davis Jan 2017 Hackathon 2nd Place, Optimizing Fleet Productivity | Northwestern University MSiA & ABC Supply May 2018

JUNXIONG LIU

[email protected] · https://www.linkedin.com/in/junxiongliu · https://github.com/junxiongliu · 612-356-8798

EDUCATION Northwestern University, Evanston, IL Expected December 2018 Master of Science in Analytics GPA: 3.87/4.00 Coursework: Big Data, Data Mining, Data Visualization, Data Warehouse, Deep Learning, Predictive Analytics, Text Analytics Carleton College, Northfield, MN March 2017 Bachelor of Arts in Mathematics/Statistics, cum laude GPA: 3.78/4.00 Honors & Activities: 2013-14 Dean’s List, Chess Club President, 2014 Pan-Am Intercollegiate Chess Championship

SKILLS

Programming R (tidyverse, data.table, Shiny), Python (pandas, numpy, sklearn, seaborn), SQL, Java, JavaScript (D3.js) Tools & Systems Git, Hadoop, MapReduce, Spark, Hive, HBase, Bash, Amazon Web Services, Tableau, Linux

EXPERIENCE Data Science Intern - Predictive Analytics, Zurich North America, Schaumburg, IL June - August 2018 · Re-examined and improved geographical pricing strategies for commercial auto line with Python, R, Spark, and Hive · Designed reproducible PySpark pipelines to reduce dimensions of 20GB geographical data with more than 2,000 features · Collaborated closely with business partners and constructed novel model evaluation metrics that best served business needs · Built machine learning models (XGBoost, GLM, etc.) that outperformed Zurich’s current model by 10%. Results and recommendations will be fully utilized in next generation pricing models Graduate Student Data Science Consultant, Shopify, Evanston, IL October 2017 - June 2018 · Researched bot behaviors with Shopify’s web clickstream data and prototyped bot detection methods in Python and R · Developed reproducible scripts to clean unstructured URL data and engineered important features relevant to bot behaviors · Implemented algorithms to propagate bot behaviors and built a Random Forest model achieving 99% accuracy (93% baseline) Graduate Student Data Science Consultant, ShopRunner, Chicago, IL January - June 2018 · Segmented and visualized 146 active retailers in ShopRunner’s e-commerce network in team of 5 with R and D3.js [GitHub] · Generated valuable features with PCA and performed analysis with Gaussian Mixture and Ordinal Logistic Regression models · Designed an interactive dashboard incorporating KPI and segmentation analysis results for ShopRunner’s internal use Data Analysis Intern, Delos, New York, NY April - July 2017 · Conducted literature review on statistics-related health and building science papers as part of team’s core research projects · Scraped, cleansed, and visualized data from various sources with Python and R and delivered research recommendations Statistical Consultant, Carleton College, Northfield, MN January 2016 - March 2017 · Participated in and led 4 data-oriented consulting projects for clients from Hennepin County or Northfield community · Cleaned (dplyr, tidyr), analyzed (regression and clustering models), and visualized (ggplot2, Shiny) massive data (generally 200K+ observations) with R. Generated 100+ pages of reports and delivered 5+ presentations

PROJECTS

Graduate School Projects, Northwestern University, Evanston, IL September 2017 - Present · eSports Analytics: Researching unsupervised and supervised techniques to detect anomalous events in eSports game Dota 2 · College Recommendation: Built a web app to guide students’ college application processes with Flask and HTML [GitHub] · Labor Analysis: Analyzed job market data (4M observations) for an HR company and presented insights to the CEO Multiple Data Competitions, DrivenData/Kaggle, Online February 2018 - Present · Santander Challenge: Building models to predict customers’ value of transactions with R and Python (currently top 25%) · Pump it Up: Used supervised methods to predict Tanzania waterpoint conditions in Python and R. Finished top 8% [GitHub] · Power Laws: Conducted time-series analysis to forecast building energy consumption with R. Finished top 8% [GitHub] Minnesota Water Quality, MinneMUDAC, Eden Prairie, MN October - November 2016 · Built GLM models to quantify relationships between Minnesota water quality and property values from 70GB water data · Won 1st place (for highest level of business insights and most actionable recommendations) out of 15 undergraduate teams and awarded total prize of $2,600

Daniel F. Lütolf-Carroll +1(914)-837-8918 • [email protected] • github.com/dlutolf Education Northwestern University (Evanston, IL) September 2017 – December 2018 Robert R. McCormick School of Engineering and Applied Science Master of Science in Analytics, Candidate in the Class of 2018 • Relevant Courses: Predictive Analytics, Deep Learning, Optimization & Heuristics, Data Visualization, Data Mining, A/B Testing, Text Analytics, Big Data Analytics, Java & Python Programming, Social Network Analytics, Reinforcement Learning Stanford University (Palo Alto, CA) June – July 2014 Stanford Graduate School of Business Certificate from the Summer Institute for General Management • Relevant Courses: Strategy, Accounting, Finance, Statistics, Economics, Operations, and Organizational Behavior Iona College (New Rochelle, NY) September 2010 – June 2014 Bachelor of Science, Major in Chemistry and Minor in Mathematics • Patrick J. Martin Scholar (Top Scholarship Award at Iona College) and Dean’s Honor List (2010, 2011, 2013, 2014) • Supported faculty research on bilayer surfactants by developing Java programs for image processing of surfactant crystallization.

Skills • Programming: Java, R, Python (keras, xgboost), C++, SQL, HTML/CSS, JavaScript, PHP, VB, Objective C, Lisp, Fortran, Perl, Lua • Software: Gaussian (Monte Carlo), MATLAB, AWS, and APIs, Apache, Git, Minitab • Simulations: Computational modeling of chemical processes using ab initio, semi-empirical, or experimental methods. • Development: iPhone iOS and Android mobile apps, ETL with distributed databases (Hadoop, Hive, HBase, MapReduce, Spark)

Projects Improved Advertisement: Activity Classification in Videos using Deep Learning • Built a real-time application to classify human activity from 101 classes with 90%+ accuracy using Tensorflow and Keras. • Model: Convolutional Neural Net (CNN) leveraging transfer learning of InceptionV3 and LSTMs integrating temporal flow of frames. Bot Detection Algorithm • Practicum project with ecommerce platform Shopify – Data pulled directly from production Kafka streams • Designed and developed a predictive model using clickstream analytics to classify bot traffic at the application level Modeling Customer Purchases and Labor Data Analysis • Built a predictive regression and classification model using Python/R (xgboost) to identify target customers for marketing based on activity and RFM (frequency, recency and monetary) metrics • Developed a visualization dashboard and predictive model using client’s labor data (6 million+)

Work Experience Molex (Lisle, IL) Data Science Intern June 2018 – August 2018 • Designed, built and executed an ensemble model of Deep Learning and Computer Vision to digitize 40 years of document archives. • Developed production code in Java and Scala to leverage Google APIs Optical Character Recognition and applied ETL using Spark. • Assisted in the design of a predictive model using transducer vibration data to determine corrosion levels in industrial pipes. Netspan AG (Liestal, Switzerland) Project Manager August 2015 – May 2017 • Managed a remote mobile phone App development team based in India. Responsible for the design of product specs for a social media event hub customer. Supervised app testing and implemented quality controls. Negotiated development team’s compensation. IT Consultant July 2008 – August 2009 • System administrator for Windows and Linux servers. Part of a team that programmed customer applications and . Cenciarini & Co Merchant Bank (Milan, Italy) Summer Intern June – July 2013 • Identified a need for and programmed a customized (VB) spreadsheet system to efficiently analyze a large dataset of an Italian client to improve its accounts receivable collections and pinpoint problematic invoices. • Programmed a VB system using GPS technology to optimize transport delivery times between client warehouses & local pharmacies. Additional Information • American Chemical Society, Award for Excellence in Undergraduate Inorganic Chemistry at Iona College (2014) • Elected to Gamma Sigma Epsilon, the National Chemistry Honor Society (2014) • Trilingual in English, German, and Italian – Lived in Spain, Argentina, Mexico, Italy, Switzerland; traveled extensively in Asia

SPENCER MOON linkedin.com/in/moonspencer | [email protected] | (214) 228-2372

PROJECTS League of Legends Dec 2017 – Present • Define anomalous behaviors in professional eSports gameplay, generate models for predicting events in real-time, and visualize these behaviors for coaches, producers and players

TransUnion Sep 2017 – Jun 2018 • Detected synthetic fraud accounts by building graph structures with personal and credit data and feeding them as inputs for neural networks

PROFESSIONAL Atlassian New York, NY Data Science Intern, Trello Jun 2018 – Sep 2018 • Supplemented A/B testing of feature limitation by modeling change in product usage after implementation, estimating lift in premium plan conversion rate, and determining workarounds • Productionized data pipelines for summary tables in Trello database to eliminate potential errors from complex joins and reduce query execution time by 45% • Designed visual dashboard for mobile metrics including signups, monthly active users, app rating, and Net Promoter Score

Phreesia New York, NY Analyst, Provider Insights Mar 2017 – Aug 2017 Select Engagement Experiences: • Closed enterprise client deal worth $1.1M in recurring revenues by presenting product value and return on investment o Projected annual financial impact using product utilization rates and payment collection rates in early adopters to show additional value that can be delivered through expansion o Determined increase in appointments through patient usage of clinical assessments • Estimated patient tendency of paying balances and submitting demographic information through product modalities in order to highlight gaps in user interaction to engineering teams o Created automated dashboards to provide organizational performance, end user data, and market trends for sales department pursuing customer leads

FTI Consulting New York, NY Consultant, Health Solutions Aug 2014 – Feb 2017 Select Engagement Experiences: • Built web-based dashboard that summarizes individual patient history to improve information transfer between care settings for 300-bed nonprofit children’s hospital o Mined millions of health records using SQL to parse out specific hospital events and created timeline visualization in QlikView, allowing physicians to interact with queried data • Served as lead analyst to conduct comprehensive assessment of care processes and quality outcomes for 375-bed hospital system to capture clinical and financial opportunities o Analyzed clinical variation of costs incurred by physicians when performing various procedures and found $4.4M in opportunities for internally standardizing care o Identified $6.2M in potential savings by comparing client’s average length of stay across all service lines to external benchmark consisting of peer provider organizations

COMMUNITY PACPI, Nonprofit focused on eliminating fatalities from pediatric AIDS, Pro-Bono Consultant MicroMentor, Mentoring program for small business owners, Mentor

EDUCATION Northwestern University, Master of Science Evanston, IL Concentration: Analytics Sep 2017 – Present • Cumulative GPA: 3.90/4.00 • Activities & Awards: Enova Data Smackdown Competition – 3rd Place

Northwestern University, Bachelor of Science, Cum Laude Evanston, IL Concentration: Economics & Learning and Organizational Change Sep 2010 – Jun 2014 • Cumulative GPA: 3.75/4.00 • Activities & Awards: Lending for Evanston and Northwestern Development · Students Consulting for Nonprofit Organizations · Arthur Siehrs Scholarship · Weinberg Research Grant

SKILLS Python · R · SQL · Java · Spark · AWS · Mode · Tableau · JavaScript · Microsoft Office · Korean INTERESTS Microfinance · English Premier League · Streetwear · USA’s Mr. Robot and HBO’s Silicon Valley

Kehan Pan 773-273-0336 | [email protected] | Github: https://github.com/pankh13

EDUCATION Northwestern University, Evanston, IL Expected Dec 2018 Master of Science in Analytics GPA: 3.9/4.0 Coursework: Predictive Analytics, Data Mining, Data Visualization, Data Warehousing, MapReduce & Hadoop, Deep Learning Tsinghua University, Beijing, China Sep 2013 - Jul 2017 Bachelor of Engineering in Industrial Engineering GPA: 3.9/4.0 Award: National Scholarship (top 0.2% in China), China Merited Undergraduate Student (top 5000 in China) Coursework: Data Structure & Algorithm, Operations Research, Database Management Systems, Modelling and Simulation Georgia Institute of Technology (Exchange), Industrial & Systems Engineering, Atlanta, GA Aug 2015 - Dec 2015 SKILLS Programming: Python, R, Java, C, HTML, JavaScript (d3), SQL, Bash, Scala Software & Tools: Tableau, MATLAB, LaTeX, Gurobi, Airflow, , AWS, Spark, Redis, Excel, Linux OS EXPERIENCE Data Intelligence Intern | Balyasny Asset Management, Chicago, IL Jun 2018 - Aug 2018 • Devised web scraper infrastructure and internal Python package for data group based on AWS Auto Scaling Group, Selenium, Splash and Scrapy; distributed computation, making scrapers 50 times more efficient, completely anonymous and untraceable • Designed scrapers for eBay, MercadoLibre, and Carvana, collecting 4 million pages of product information per day, capturing data of ~80% total revenue; modeled data in time series and delivered to Excel function; helped PM generate millions in returns • Built full-stack data ETL and data QA platforms to analyze, visualize and monitor data with Redis, Plotly Dash & Flask; scheduled via Airflow; deployed with , Docker & Nomad; saved 80% cloud resource; delivered reports 3 times faster Data Analyst Intern | Lenovo, Beijing, China Jun 2016 - Dec 2016 • Developed a statistical methodology for processing call center transcripts data; increased efficiency by 5 times (adopted by international branches); added 25 Q&As to customer support knowledge base, reducing the number of related calls by 10% • 
Created a model with Python and Java based on word embedding and Neural Network to analyze customer feedback sentiment; increased accuracy by 90% through optimization; used time-series analysis to model and monitor sentiment trend • Designed centroid-based clustering algorithm with Python to discover new customer pain points and their importance PROJECTS Landmark Picture Recognition Deep Learning Bot, Northwestern University Apr 2018 - Jun 2018 • Presided over the construction of a deep learning model to classify pictures of landmarks based on Xception and convolutional neural network with Keras and TensorFlow; optimized with transfer learning and stepwise training • Reached 93% top-1 accuracy on test set among 100 landmarks, outperforming human recognition (at 60%) Day Camp Pricing Model, Chicago Park District (CPD) & Northwestern University Sep 2017 - Jun 2018 • Collected customer socioeconomic data via Google API and geospatial mapping; visualized customer and park data as an interactive dashboard with Tableau • Reengineered the pricing model with machine learning & integer programming approach and integrated into an Excel app Recommendation System Design, ShopRunner & Northwestern University Jan 2018 - Mar 2018 • Design lead of a hybrid recommendation algorithm based on customer data and product description; addressed usage of text mining and collaborative filtering in cold start recommendation system problem • Conducted Design of Experiments to find optimal strategy, reaching recommendation recall rate of 82% on test set Movie Recommendation Website Based on Natural Language Processing, Northwestern University Dec 2017 - Mar 2018 • Scraped 8 million movie ratings, reviews and posters from Amazon and stored in RDS PostgreSQL database and S3 • Trained doc2vec (word2vec) model with Python; predicted movie genres and similar movies (70% consistent with IMDB) • Built a full-stack website with Flask, HTML, JavaScript, Bootstrap and AWS to provide visualization and 
interaction LEADERSHIP EXPERIENCE & INTERESTS President, Volunteer and Public Welfare Association | Tsinghua University, Beijing, China Jul 2016 - Jun 2017 President, Students’ Union of Department | Tsinghua University, Beijing, China Jun 2016 - May 2017 Interests: freediving (semi-professional), swimming (university amateur group champion), taekwondo (1st dan), ballroom dance MICHAEL PAULEEN 503.781.8393 mobile • [email protected]

Work Experience Airbnb San Francisco, CA Data Scientist Intern Summer 2018 • Joined new Lux team to develop host and guest growth strategies for newly acquired LuxuryRetreats rentals business. • Defined key metrics and designed A/B experiment to analyze overall business impact of Lux launch on Airbnb. • Conducted analysis of visitor conversions at searching, booking, and payment stages; presented findings to leadership to direct product development. • Built model to identify cross-listed properties on Airbnb and LuxuryRetreats and delivered to leaders to drive host onboarding process design. Enodo Chicago, Illinois Data Scientist Intern Fall 2017 – Winter 2018 • Designed new spatial clustering product to generate comparable rental sub-markets for analysis of rental values. • Developed novel Bayesian models to analyze drivers of multifamily real-estate values in markets nationwide. • Synthesized domain knowledge, market data and building features to quantify the value of amenities in each market. Allstate Chicago, Illinois Data Scientist Intern Summer 2017 • Used deep learning and transfer learning to segment cars in images to prototype automation of claims estimates. • Built Python tool to facilitate labeling of over 450k Allstate claimant submitted images for training and validation. • Achieved 93% pixel-wise detection accuracy for instance-aware semantic segmentation of damaged vehicles. • Established end-to-end training and validation strategy to fine-tune model and assess performance for business goals. Data Engineer Intern Summer 2016 • Created and maintained new schema and data pipeline to enable sales and loss modeling at first contact for dataset of 90MM quote records. • Developed new metrics to analyze agent adoption of new marketing technologies to evaluate impact on conversion rate.

Education Northwestern University Evanston, Illinois M.S. in Analytics • GPA: 3.90/4.00 December 2018 • Courses: Predictive Analytics, Analytics for Big Data, Data Visualization, Deep Learning B.S. in Industrial Engineering, Magna Cum Laude June 2017 • GPA: 3.84/4.00 • Courses: Stats for Data Mining, Optimization Methods in Data Science, Machine Learning • 2017 ICSA All-Academic Sailing Team, 2017 MCSA All-Conference Sailing Team

Projects WE Energies – Employee Productivity Analysis Fall 2017 – Spring 2018 • Developed explanatory models in R to correlate employee productivity, overtime and attrition in customer care centers. • Visualized trends in call volumes and customer wait times to highlight opportunities to increase caller satisfaction. ML Strategies for Forex Trading • 40 hours Spring 2017 • Developed ensemble and boosted methods to predict short-term price movements in EUR-USD Forex market and designed a profitable trading strategy and complete back-testing simulation framework.

Skills • Technologies: Python, R, PySpark, H2O, Hadoop, SQL, BigQuery, C++, AMPL, Caffe, MXNet, Tableau, d3 • Foreign Language: French and Spanish (ILR Level 4 – Full professional proficiency) Leadership and Interests • Northwestern University Sailing Team (Vice-Commodore 2x, Regatta Chair), rock climbing, credit card churning

Christian John Rozolis
[email protected] | Cell: (815) 451-9675

Northwestern University, McCormick School of Engineering and Applied Science
Bachelor of Science, Industrial Engineering, Minor in Psychology, Sep. 2013-Jun. 2017
GPA: 3.701 | Honors: Tau Beta Pi
Master of Science, Analytics, Sep. 2017-Dec. 2018 (expected)
GPA: 3.877 | Honors: INFORMS IAAA

Work Experience
Northwestern University, Department of Industrial Engineering
Graduate Research Assistant, June 2017-Present
• Use historic runner data and visualization in ongoing humanitarian logistics project for the Chicago and Houston Marathons
• Train machine learning models on historic runner data to simulate runner speed and locations for marathon situational awareness planning
• INFORMS Analytics Society Innovative Applications in Analytics Award (IAAA) Winner

ABC Supply Company, Inc.
Data Science Intern, June – September 2018
• Aggregated and combined various data sources from the business to create a visualization tool to aid users in understanding population and geographic data all in one tool (d3js, Flask)
• Selected and engineered relevant features from delivery, population, sales, and homebuilder data to develop a predictive model for valuable market identification and branch success

HP Inc. Advanced Technology and Platform Solutions
Operations Planning Industrial Engineering Intern, June – August 2016
• Performed current-state manufacturing start process analysis and implemented solutions aligning inter-department communication and metric reports
• Designed and implemented linear programming model used to reduce resource requirements and operating costs by optimizing cycle-time planning process
• Created line management course curriculum and facilitation guide used to train technicians and supervisors in the theory of constraints and best practices

Walt Disney Parks and Resorts Industrial Engineering Co-Op
Walt Disney World Transportation, May-August 2015
• Internally consulted on projects, responsible for planning, data acquisition, and analysis
• Led client meetings to coordinate project tasks, goal-setting, and implement recommended solutions matching agreed upon measures of success
• Developed internal tools using SQL and Excel that improved information flow and analytics of available data for current and future projects
Walt Disney World Facilities and Operations Services (FOS), January–May 2015
• Extracted and analyzed maintenance workforce data using SAS to minimize future headcount needs due to park operational changes
• Optimized attractions maintenance schedules using historical trends to reduce unnecessary costs

Leadership Experience
Northwestern University, Class Gift Committee
Executive Board Member, January-June 2017
Northwestern University, New Student and Family Programs (NSFP)
Board of Directors, Director of Staff Training and Design, November 2015-December 2016
• Collaborated with staff training team to develop and teach a course involving social justice education, campus inclusion, and effective dialogue skills for 200+ student leaders
• Interviewed and evaluated candidates for peer leadership, board member, and professional roles in the office
Peer Adviser and First Year Experience Co-Instructor, Spring 2014-Spring 2016
• Provided academic and campus life guidance to new students throughout their first year
• Led biweekly sessions to assist with course, major, and minor selections and facilitated dialogues about emotional, mental, and physical health as a college student

Languages
• Python • Java • R • C++ • Spark (RDD, DFs) • d3, JS, CSS • SQL (U-SQL, MySQL, PostgreSQL, SQL Server)

Tools/Skills
• Tableau • Hadoop DFS • MapReduce • Azure Web Apps • Git • Amazon Web Services • Flask Apps • Hive • SPSS • SAS (SQL, Data Step) • MATLAB • Linear Programming • Simio • ArcGIS • @Risk • Microsoft Access, Excel, Project • Webscraping

Volunteering
Northwestern University, NSFP Orientation Assistant, September 2017
Ultimate Chicago, Youth U19 Chicago League Volunteer Coach, June 2018-August 2018
Cary-Grove High School Choir Department Audition Assistant, Fall 2013-Winter 2016
Cary-Grove High School Tennis Camp Instructor/Coach, Summer 2013, 2014
Disney’s Ultimate EnginEARing Exploration (DUEE) Industrial Engineering Representative, Summer 2015

Interests
• Discovering Music • Public Speaking • Data Visualization • Teaching • Tennis, Volleyball, Ultimate Frisbee • Freehand Sketching

Jingwei (Will) Song

[email protected] | (310) 910-4205 EDUCATION Northwestern University, Evanston, IL December 2018 (Expected) Master of Science in Analytics (MSiA) GPA: 3.8 • Relevant Courses: Predictive Analytics, Database System, Data Mining, Java & Python Programming, Analytics for Big Data, Deep Learning, Data Visualization, Text Analytics (upcoming)

University of California Los Angeles, Los Angeles, CA June 2017 Bachelor of Science in Statistics GPA: 3.9 • Relevant Courses: Intro to Probability, Mathematical Statistics, Linear Models, Computational Statistics with R, Markov Chain Monte Carlo Methods, Intro to Programming in C++

SKILLS Programming Languages: R, Python, Java, SAS, SQL, HTML, C++ Big Data Analytics: Hadoop, Spark, AWS, D3.js, Tableau

PROFESSIONAL EXPERIENCE TransUnion, Chicago, IL June 2018 – August 2018 Insurance Analytics Intern • Conducted independent research on the relationship between daily events and public interest in the insurance industry by applying XGBoost and Random Forest models • Utilized NLP techniques, such as sentiment analysis, n-grams and TF-IDF, to perform feature engineering using news and trends data scraped from major media websites and Google API in R • Reproduced key functionalities of a costly commercial modeling software for insurance pricing into an interactive R Shiny app to improve global communication within TransUnion’s insurance realm • Developed documented R functions that evaluate policyholder risks from existing SAS codes Doodod Technology, Beijing, China July 2017 – August 2017 Data Scientist Intern • Built web scraping templates with Python to extract millions of data points every day from China’s 10+ mainstream video websites and ticketing platforms for further analysis on China’s film market • Screened and gathered data from major social media platforms like Weibo with Doodod’s customized tools and further cleaned them in MySQL • Analyzed aggregated datasets from Weibo in a used-car market analysis to boost companies’ brand exposure on social media platforms by identifying relevant key opinion leaders

PROJECT EXPERIENCE Yelp Data Visualization Project, Northwestern University April 2018 – June 2018 • Built an interactive dashboard in D3.js to display local catering sites in Las Vegas in detail using graphics such as geo map, heat map, bubble chart and line chart • Developed two interfaces – one with consumer insights and competitive landscape and the other with rating patterns – for local businesses and potential customers respectively Used Car Price Prediction Web App Development Project, Northwestern University February 2018 – March 2018 • Trained a linear regression model to predict future used car prices based on Kaggle’s used car transaction data from the European sales market • Built a user-friendly Flask web app in Python to interact with the model and deployed the app with EC2 instance on Amazon Web Services Valued Customers Predictive Analytics Project, Northwestern University November 2017 – December 2017 • Created a two-step model combining multiple linear regression and logistic regression models to identify customers who were likely to respond to a mailed catalog with high dollar purchase • Improved model performance and prediction results by conducting data cleansing, variable transformation and model validation

INTERESTS AND ACTIVITIES Languages: English, Chinese, French Extracurricular activity: Chinese Student A Cappella Club at UCLA

CHRISTA SPIETH
MASTER OF SCIENCE IN ANALYTICS STUDENT

CONTACT
Christa Spieth | 715 316 1767 | [email protected] | linkedin.com/in/christa-spieth

EDUCATION
Present: Master of Science in Analytics, Northwestern University, Graduating December 2018
2013-2017: Mechanical Engineering & Mathematical Studies, Minor in Computing, Andrews University, Graduated May 2017 Magna Cum Laude

SKILLS
R · Python · SQL · Tableau & MicroStrategy · Java · Jenkins

RELEVANT EXPERIENCE
Summer 2018 Enova International, Fraud Analytics Consultant
Created a SQL function to describe the lifecycle of loan applications – information that would entail long, complex queries across analytics teams. Designed updateable MicroStrategy visualizations for management’s product reports. Organized data study to identify a suitable vendor for account takeover predictions. Migrated and streamlined fraud alerts for operations within Jenkins.

Spring 2018 Synchrony Financial, Data Science Consultant
Developed a chatbot proof of concept in Python, capable of common interactions in an inbound customer service call. With a predictive model-based chatbot that identifies customer intent, recurrent issues could easily be tracked and collected as structured data to produce business insights to show where improvements could be made.

Summer 2016 Texas Tech University, Knowledge Representation REU Researcher
Proposed standardized method of converting clinical practice guidelines into declarative programming-based algorithms. Developed and implemented thyroid nodule algorithms while collaborating with PhD and medical school students, physicians, and professors.

2015-2016 Andrews University, Mathematics Student Researcher
Collaborated with biology researcher on the subject of Antillean manatees. Utilized MATLAB to find a deterministic mathematical model for weight as a function of standard morphometric measurements. Presented research at university research conference open to public and at Michigan Academy of Science, Arts, and Letters.

HONORS
Enova Data Smackdown, First Place · ABC Supply, Second Place · President’s Full Tuition Scholarship · National Merit Finalist · Undergraduate Research Scholarship · Phi Kappa Phi Honor Society · Pi Mu Epsilon Honor Society · Engineering Excellence · Who’s Who Among Students

RELATED COURSEWORK
Analytics for Big Data · Analytics Value Chain · Business Communication · Data Management for BI · Data Mining · Data Visualization · Databases and Information Retrieval · Deep Learning · Predictive Analytics

Penny (Mengyu) Sun 765.491.0208 | [email protected]

EDUCATION Northwestern University Evanston, IL Master of Science in Analytics Sep 2017- Dec 2018 Relevant Courses: Predictive Analytics, Data Mining, Big Data, Text Analytics, Deep Learning, Data Visualization, Machine Learning Model Deployment, Databases and Data Warehouse, Social Network Analysis Imperial College London London, UK Master of Finance Aug 2012 – Aug 2013 Peking University Beijing, China Bachelor of Economics Freshman Scholarship, Li & Fung Scholarship Sep 2008 – Jul 2012

TECHNICAL SKILLS: Languages/Tools: Python, R, Java, Spark, Hive, SQL, Git, Bash, AWS, Azure, Flask, Tableau, JavaScript (D3, Leaflet) Python Libraries: Pandas, Scikit-learn, TensorFlow, Keras, PySpark, Seaborn, NLTK, CRON

WORK EXPERIENCE OPEX Analytics Evanston, IL Data Science Intern Jun 2018- Aug 2018

• Designed and developed inventory risk assessment tools for 8 regional Supply Planning teams of the world’s largest consumer goods company, which are expected to improve workflow operational efficiency by 80% • Automated daily data scraping process from 2 multi-dimensional data sources, created interactive geospatial dashboard to visualize network inventory supply and actionable insights utilizing optimization algorithm • Performed root cause analysis for current multi-stop routing solution and improved on cost saving prediction model by introducing billing, fuel and carrier acceptance data Synchrony Financial Practicum Chicago, IL Data Consultant Intern Oct 2017- Jun 2018

• Created a chatbot prototype in Python to handle customer service tasks utilizing natural language processing (stop words, stemming, word2vec) and a Naïve Bayes machine learning model on credit card customers’ e-chat data • Presented at the Synchrony Monthly Townhall and was highly praised by the Chief Information Office KPMG, LLP London, UK Assistant Manager Sep 2013 – Jan 2017

• ACA qualified accountant (ICAEW) specialized in financial services companies, recipient of KPMG Encore rewards • Managed multiple engagements and coordinated integrated audit efforts for medium-size, multi-location teams

PROJECTS & AWARDS Flower Species Detection Web App Apr 2018 - Jun 2018 • Developed a flower image classification web app which implements Xception model and convolutional neural network • Deployed front-end web infrastructure with Flask hosted on AWS Elastic Beanstalk and RDS Venmo Transactions Unsupervised Learning Project June 2018 • Processed 7M+ transaction comments in PySpark, and identified most popular topics using TF-IDF and Latent Dirichlet Allocation model and Gaussian Mixture Model, concluded meaningful insights from both models Rubikloud Case Competition (2nd/39 teams with cash prizes) May 2018 • Identified promotion strategies for retailers by engineering novel features, clustering customers, computing segment transition matrix and customer lifetime value, and predicting segment improvement with random forest model ShopRunner Recommender Project Jan 2018 – Mar 2018 • Developed NLP-based collaborative filtering brand recommendation engine to engage users and retailer network YIWEI (PHYLLIS) SUN Evanston, IL | (617) 838-6926 | [email protected] EDUCATION Northwestern University Evanston, IL M.S in Analytics (MSiA) 09/2017 – 12/2018 . Overall GPA: 3.85/4.00 . Coursework: Big Data, Data Mining & Machine Learning, Predictive Analytics, Deep Learning, Database System, Text Analytics, Business Intelligence, Time Series, Value Chain (A/B Testing), Data Visualization, and Bayesian Data Analysis

Boston University Boston, MA B.A. and M.A. in Statistics with Minor in Computer Science 09/2012 – 01/2017 . Overall GPA: 3.68/4.00, Master GPA: 3.88/4.00, cum laude . Leadership: Teaching Assistant in Statistics I & II; Research Assistant; Treasurer, Mathematical Association of America

SKILLS . Programming: R, Python, SQL, Spark, Hadoop/MapReduce, Hive, Java, JavaScript, HTML/CSS, SAS . Software: Unix, Git, AWS, Tableau, Salesforce, Bloomberg Terminal, Google Analytics, Qlikview, InfoSource WORK EXPERIENCE Chicago Mercantile Exchange (CME Group) Chicago, IL Business Intelligence Intern 06/2018 – Present . Create decision rules in SQL through volume trend analysis by K-means clustering and Random Forest in R to classify client trading accounts to improve efficiency of customer segmentation for sales and marketing teams . Develop Quarterly Liquidity Deck Report for all offices in APAC, EMEA and North America for customer reference . Visualize average daily volume and revenue for Ad-hoc analysis to generate business opportunities in a timely manner . Improve pricing and sales data quality in Hadoop and Oracle Database and design test for new liquidity tool . Perform correlation analysis for futures returns across top performing futures in R on price data through Bloomberg API

Chicago Park District (CPD) Chicago, IL Data Science Consultant (Industry Practicum) 09/2017 – 06/2018 . Visualized enrollee, discount, and park information datasets (510K+ rows) as an interactive dashboard in Tableau and channeled external demographic datasets in Python to identify key day camps’ trends at park level . Updated CPD’s summer camp prices for over 600 parks by developing Random Forest, Gradient Boosting and Decision Tree to improve participation rates of the camps while minimizing the additional overhead required to process the discounts

Scitics Inc. Acton, MA Data Analytics Intern 12/2016 – 05/2017 . Conducted exploratory data analysis and variable selection on membership and national event data (30K+ rows) for American Marketing Association (AMA) . Predicted customer renewal probability using logistic regression and validated model through concordance statistic (AUC) to help retention team develop marketing plan

RESEARCH PROJECTS

Google Landmark Recognition 04/2018 – 06/2018 . Produced bootstrap images to ensure 1000 training and 100 testing images for top 100 most frequent landmarks . Predicted landmark labels with 93% accuracy through training the Xception model with fine tuning and transfer learning by adding two fully connected layers

Toxic Comment Classification Web App 01/2018 – 04/2018 . Conducted sentiment analysis to classify toxic comments from Wikipedia by a logistic classification model with 84.3% accuracy and deployed the interactive Flask Web App on AWS

ShopRunner Repurchase Prediction Analysis 01/2018 – 04/2018 . Developed marketing strategies to improve retention rate with network effect analysis on the top 15 retailers’ data in R PUBLICATION F. Fang, Y. Sun and K. Spiliopoulos, “The Effect of Heterogeneity on Flocking Behavior and Systemic Risk”, Statistics and Risk Modelling, Vol. 34, No. 3-4, (2017), pp. 141-155 . Investigated the default activities for banking agents based on their mean-reversion rates and volatilities in heterogeneous mean-field interacting coupled diffusions using Monte Carlo Simulation to help stabilize the financial system SAURABH TRIPATHI +1(312)-273-7157 [email protected]

EDUCATION Master of Science - Analytics • Northwestern University Evanston, Illinois • Dec 2018 Bachelor of Technology - Mechanical Engineering • Indian Institute of Technology (IIT) Varanasi, India • May 2012

SKILLS
Languages: Python, R, Scala, Bash, C#, JavaScript, jQuery, HTML5, CSS3, T-SQL, git-bash
Libraries: Tensorflow, Keras, SciKit-Learn, NLTK, Gensim, Scrapy, Statsmodels, NumPy, PySpark, SciPy, Pandas, matplotlib, seaborn, D3, Caret, randomForest, rpart, nnet
Data Science Coursework: Predictive Analytics, Data Mining, Deep Learning, Text Analytics, Analytics of Big Data, Data Visualization, Analytical Consulting.
Databases: SQL Server 2008 R2, PostgreSQL, Hadoop, Spark, Hive

WORK HISTORY Data Science Intern • GoDaddy Tempe, AZ • June 2018 to Current Built a highly configurable, intuitive, end-to-end machine learning pipeline capable of running and tuning various supervised learning and deep learning algorithms on the fly. The pipeline served as a one-click solution for benchmarking and identifying an apt algorithm for a dataset. Built a classification model to predict the intent of customer support calls. Data Science Consultant • Synchrony Financial Chicago, IL • January 2018 to June 2018 Built a chatbot capable of interactions that are commonplace in an inbound customer service call in the banking industry. Full Stack Developer • Seven Lakes Technologies Bangalore, India • December 2015 to July 2017 Developed an intelligent dynamic routing solution which directs the field personnel to the highest priority assets to maximize production efficiency of the oil fields under them. Led a team of UI developers to overhaul the front-end architecture to optimize CPU usage and reduce memory footprint of the organization’s data visualization product. Technology Analyst • Infosys Technologies Mysore, India • August 2012 to December 2015 Developed a solution for analyzing risk and value associated with any opportunity (oil well) based on conventional analysis techniques. Created and implemented database schema and architecture for multiple projects.

ACADEMIC PROJECTS Built a deep learning model to predict the genre of a movie based on its poster. Built a recommendation engine for ranking vendors for an e-commerce platform. Built a predictive model to capture customer purchase response to a catalog mailing.

ACCOMPLISHMENTS Awarded Most Valuable Employee at Infosys Technologies for two consecutive years in 2014, 2015. Ziwen (Vincent) Wang [email protected] | (425) 208-1748 | Los Angeles, CA 90034 LinkedIn: https://www.linkedin.com/in/ziwen-wang/ | GitHub: https://github.com/vincent9514 | Tableau: https://public.tableau.com/profile/vincent.wang1896#!/

EDUCATION

Northwestern University, Master of Science in Analytics, GPA: 3.93/4.00 Evanston, IL, Dec 2018

• Coursework: Predictive Analytics I&II, Deep Learning, Data Mining, Analytics for Big Data, Data Visualization, A/B Testing, Text Analytics, Databases & Information Retrieval

The Hong Kong Polytechnic University, BEng in Industrial and Systems Engineering, GPA: 3.78/4.00 Hong Kong, May 2017

• Coursework: Operations Research, Object-oriented Programming, Modeling and Simulation, Calculus I-III

TECHNICAL SKILLS & LANGUAGES

• Programming: Python, R, SQL, Java, JavaScript (D3.js, Node.js), HTML/CSS, C++, Bash • Software: Tableau, PySpark, MapReduce, Hadoop, Hive, AWS (EC2, Beanstalk, EMR, S3), Git, Tensorflow, Dialogflow • Certifications: Associate Certified Analytics Professional (INFORMS), Lean Six Sigma Green Belt (IISE) • Languages: Fluent in English, Mandarin, and Cantonese

PROFESSIONAL EXPERIENCE

KPMG US LLP Chicago, IL Data Science Intern, Artificial Intelligence Jun 2018 – Aug 2018 • Launched a Chatbot automated development pipeline with functions including query variations generation (Tensorflow), real-time query simplification (OpenNMT), and testing automation (Node.js), improving Chatbot intent detection accuracy by 32% • Trained sentential paraphrase generation and simplification models using LSTM seq2seq on 70M+ paraphrase pairs • Developed a production-ready HR Chatbot with 200+ intents using NLP and Machine Learning deployed on Google Dialogflow • Maintained an internal signals repository and scoring engine on AWS, providing external data for modeling purposes and offering solutions including customer retention and demand planning for Fortune 5 telecom company on a subscription basis

BP North America Chicago, IL Data Science Practicum Consultant Sep 2017 – May 2018 • Conducted customer segmentation analysis (GMM, K-Means, PCA) on 10M+ transaction records, accounting for purchasing behavior, seasonality, demography, and geography data in Python; created end-user Tableau dashboards for marketing teams • Built a dynamic customer lifetime value (CLV) model to track customer migration and optimize targeted marketing decisions

Digital Creativity Lab Chicago, IL Research Scientist Nov 2017 – May 2018 • Created a production-ready recommender system applying unsupervised learning (K-means, GMM, Archetypal Analysis) in Python on 16M+ matches’ data from Destiny II, a massively multi-player online game (MMOG) by Bungie Studio • Constructed player profiles and team representations based on equipment preference, player playstyle, and character preference • Implemented KNN to recommend teams and players with similar playstyles but higher performance or faster improvement rate • Submitted paper to AIIDE 2018: Artificial Intelligence and Interactive Digital Entertainment Conference

Audi, Innovation Research Beijing, China

Data Analytics Intern May 2017 – Aug 2017 • Delivered Price-Sensitivity Meter models on 60K+ rows of customer feedback to determine the optimal price ranges • Conducted Latent Factor Analysis and Principal Component Analysis (PCA) to identify the drivers on product premium

PROJECTS

Big Data Analytics | Venmo Text Mining and Network Analysis Jan 2018 – May 2018 • Extracted and classified emoji from 7M+ Venmo transactions using PySpark and RegEx to analyze its popularity • Conducted Venmo’s network analysis by exploring in-degrees and out-degrees, as well as reciprocal relationships using RDD

Advanced Data Visualization | Chicago Botanic Garden Jan 2018 – May 2018 • Created a visualization dashboard using D3.js and Tableau to map the network of 300+ endangered plant species in the Midwest

Deep Learning | Object Detection and Segmentation for Autonomous Driving Jan 2018 – Apr 2018 • Implemented Mask R-CNN model to segment road objects in images (93G) provided by CVPR Autonomous Driving conference • Generated evaluation scripts based on pixel-wise IOU (Intersection over Union) and achieved 50%+ accuracy for testing dataset

Full-Stack Web App Development | Flask Framework and AWS Deployment Jan 2018 – Apr 2018 • Built a web app with Flask, HTML, and JS deployed on AWS Beanstalk and EC2 to predict FIFA soccer player transfer value • Evaluated 11 different supervised learning models including Lasso, Ridge, GAM, Random Forest, Gradient Boosting, Neural Network, XGB, etc. on 18K soccer players records with 40+ attributes stored in Amazon RDS PostgreSQL database

LOGAN WILSON

(760)-450-4934 / [email protected] / github.com/lwilson18

EDUCATION Northwestern University, Evanston, IL Dec. 2018 (Expected) Master of Science in Analytics – GPA: 3.88 Relevant Coursework: Big Data, Predictive Analytics, Data Mining, Data Visualization, Social Network Analysis, Text Analytics, Database Management, Deep Learning, Data Warehousing

Washington and Lee University, Lexington, VA May 2017

Bachelor of Science in Mathematics and Engineering with Minor in Music – GPA: 3.72 (Cum Laude)

TECHNICAL SKILLS • Languages: Python, R, SQL • Libraries: Spark, NLTK, Pandas, Scikit-learn, Flask, H2O, D3, spaCy • Tools: Amazon Web Services, Hadoop, BigQuery, Excel

RELEVANT EXPERIENCE BuzzFeed, New York, NY June 2018 – Aug. 2018 Data Science Intern • Developed algorithm to identify trending topics in the news by clustering and labeling content through natural language processing and ranking according to pageview trends • Collaborated with stakeholders to build product solution connecting data pipelines from internal data sources to implement algorithm in real-time as a RESTful API with caching • Pitched tool to editors to explain underlying algorithm and integrate into news curation workflow to influence content seen by hundreds of thousands of users daily

DJ Random Forest – Song Recommendations through Machine Learning Nov. 2017 – Feb. 2018 Independent Research Project - http://djrandomforest.us-east-1.elasticbeanstalk.com/ • Aggregated audio features for 80,000 songs and over 8,000 artists with Spotify Web API • Tuned and cross-validated random forest model for predicting song preferences from user ratings • Developed Flask web application hosted on AWS implementing model to make song recommendations in real-time and visualize personalized audio profiles • Built and maintained database of song ratings to analyze trends in musical preferences

OTHER EXPERIENCE Zurich North America, Schaumburg, IL Sept. 2017 – June 2018 Analytics Consultant • Evaluated accuracy, execution, and business impact of workers compensation pricing tool • Leveraged deep learning in H2O to improve pricing tool accuracy by 10% • Profiled and segmented over 90,000 companies by risk level to identify high-value customers

Twine Analytics, San Diego, CA Aug. 2017 – Sept. 2017 Data Engineer Intern • Developed analytics platform for aggregating and processing biopharmaceutical data • Implemented scalable Spark modules to perform ETL on data from over 200,000 clinical trials

MyEyeDr., Vienna, VA June 2017 – Aug. 2017 Marketing Analytics Intern • Designed data-driven pricing strategy to streamline customer purchases of eyewear products • Supported CRM initiatives through analysis of customer spending patterns

AWARDS AND ACTIVITIES • Johnson Scholar – Four-year full merit-based scholarship to Washington and Lee University • Analytics Council – Industry Chair – Student organization for planning analytics networking events • Musicians on Call – Volunteer Musician – Nonprofit bringing live music to healthcare facilities • Keynotes – Assistant Music Director – Northwestern University’s graduate student a cappella group

Hao Xiao

847-644-2536 | [email protected]

EDUCATION Northwestern University, Evanston, IL Sep 2017 - Dec 2018 (Expected) • Master of Science in Analytics (GPA 3.90/4.00) • Core courses: Predictive Analytics, Data Mining, Deep Learning, Big Data, Text Analytics, A/B Testing

Peking University, Beijing, China Sep 2012 - Jul 2017 • BS in Urban Planning, B.A. in Economics • Awards: 2013 Merit Award (top 20%), 2014 TC Scholarship (top 10%), 2015 CASC Scholarship (top 5%)

SKILLS • Python, R, SQL, Java, C++, JavaScript/HTML/D3, Hadoop/MapReduce, Spark, Hive, Tableau, Git, AWS

WORK EXPERIENCE TransUnion Chicago, IL Data Science Intern Jun 2018 - Sep 2018 • Designed and conducted research on big data GLM algorithm implementation in R (Spark, H2O, etc.) and proposed feasible solutions to help Insurance Analytics team transfer modeling platform from Emblem to R • Developed an internal R package and an interactive R Shiny App to assist in insurance modeling and visualization • Built GLM and GBM to predict auto insurance risk; elevated the performance of the previous model by 1.1% • Further improved prediction performance by 4.6% by combining GBM and Recurrent Neural Network

Fraud Analytic Graduate Consultant Nov 2017 - Jun 2018 • Created graphs of identity sharing between customers and built xgboost models to detect synthetic identity fraud • Automated time-consuming graph feature creation process by developing a graph-kernel based Convolutional Neural Network model; elevated the performance of the original productionized model by 25%

ShopRunner Chicago, IL Data Science Graduate Consultant Jan 2018 - Jun 2018 • Evaluated retailers’ network value by developing metrics with Principal Component Analysis and segmenting on network features using Gaussian Mixture Model • Provided key insights and possible actions on defined segments to facilitate cross-sell and network growth • Developed an interactive dashboard incorporating KPI and segmentation analysis in D3 for ShopRunner's internal use; deployed sample application on Amazon Web Services [Github Link]

China Sustainable Transportation Center Beijing, China Spatial Data Science Intern Jul 2016 - Aug 2016 • Developed an end-to-end road congestion data analysis pipeline including open-source data collection and processing, metric calculation and visualization; utilized Python, JavaScript, and ArcGIS

PROJECTS Venmo Transaction Comments Classification May 2018 - Jun 2018 Big Data course project Evanston, IL • Identified emoji popularity patterns over 7M+ Venmo transaction comments utilizing RDD and Spark data frame • Developed metrics of comments to help comment classification and information extraction

Video Activity Classification May 2018 - Jun 2018 Deep Learning course project Evanston, IL • Developed deep learning models (CNN + LSTM, 3D CNN) to classify 101 human activities in videos • Created a video processing pipeline adding dynamically-changing activity class tags to videos in Python

Customer Lifetime Value Analysis Apr 2018 - May 2018 Runner-up of 2018 Analytics By Design Competition Toronto, Canada • Identified customer value increase opportunities by engineering novel purchase pattern features, segmenting customers and building revenue prediction models on each segment

Wenjing (Karen) Yang

[email protected] | 404-632-3558

EDUCATION Northwestern University, Evanston, IL Expected December 2018 M.S. in Analytics (MSiA), 3.7/4.0 Relevant Coursework: Predictive Analytics, Data Mining, Deep Learning, Text Analytics, Java & Python Programming, Data Visualization, Big Data Analytics, Data Warehousing, Databases, Recommender Systems

Emory University, Atlanta, GA May 2017 B.S. in Applied Mathematics, double major in Economics, 3.9/4.0 Honors: High Honors in Economics, Dean’s list (6 semesters), Phi Beta Kappa Honor Society

SKILLS Programming: Python (pandas, sklearn, seaborn, nordypy), R, Java, SQL, JavaScript (D3.js), HTML/CSS Tools & Systems: Hadoop, MapReduce, Spark, Hive, AWS (EC2, S3, Redshift, EMR), Git, Tableau, Bash, Ubuntu, Stata

EXPERIENCE Nordstrom, Seattle, WA June 2018 - August 2018 Data Science Intern | Grab-and-Run (GnR) Exploratory Analysis and Predictive Modeling • Led exploratory data analysis and machine learning methodologies of the GnR project, and organized weekly meetings with Nordstrom loss prevention department • Manipulated 10M+ data records stored on SQL server and AWS Redshift database using Nordypy in EC2 Ubuntu system • Investigated time-series and geographical patterns of GnR occurrences from 2010 to 2018 with Python and SQL queries • Experimented with machine learning algorithms to predict expected money loss of future GnR incidents, and to study occurrence triggers to provide strategic advice on store organizations and employee deployments

Synchrony Financial, Chicago, IL October 2017 - May 2018 Data Science Practicum Consultant | Chatbot Development • Developed a chatbot prototype capable of common interactions in an inbound customer service call by applying Naïve-Bayes classification model on simulated dataset using Python (Phase I) • Boosted model performance in terms of classification rate from 85.3% to 94.5% by incorporating eChat data with data manipulations of upsampling and downsampling, and text analytics methods of stopwords and stemming (Phase II)

ShopRunner, Chicago, IL February 2018 - June 2018 Graduate Student Analytics Consultant | Network Segmentation and Visualization • Constructed retailer segmentation model with network features using PCA and Gaussian Mixture models (GMM) in R • Profiled 140+ active retailers at ShopRunner network to facilitate cross-sell and improve network completeness • Designed a user interactive dashboard for visualizing dynamics of network clusters and changes of retailer performances with D3.js and HTML/CSS for internal use as strategic insights

PROJECTS Venmo Transaction Big Data Project, Northwestern University, Evanston, IL April 2018 - May 2018 • Classified emoji popularity patterns in various time frames using RDD and Spark data frame • Analyzed Venmo network by visualizing in-degrees, out-degrees and reciprocal relationships on 7M+ transactions using SparkSQL and PySpark

Movie Recommender System Project, Northwestern University, Evanston, IL January 2018 - March 2018 • Implemented a content-based recommender system with text analytics by NLTK library in Python • Deployed the model as web application on AWS Beanstalk using Flask library to bridge Python and HTML

Tong Yin

(310) 745-4851· [email protected]· www.linkedin.com/in/tongyin10

EDUCATION

Northwestern University | Evanston, IL

M.S. in Analytics Expected December 2018 • Current GPA 3.90/4.00 • Coursework: Predictive Analytics, Machine Learning, Deep Learning, Big Data Analytics, Database Design & Information Retrieval, A/B Testing, Data Mining, Data Visualization, Optimization, Text Analytics, NLP

University of California, Los Angeles | Los Angeles, CA

B.S. in Financial Actuarial Mathematics with Minor in French September 2013 – June 2017 • Cumulative GPA 3.92/4.00 • Honors: Summa Cum Laude, Phi Beta Kappa Honor Society, Dean’s Honors List • Actuarial Exams Passed: Probability (01/2016); Financial Mathematics (08/2016)

TECHNICAL SKILLS

Programming Techniques: SQL, R, Python, Hive, Hadoop, Spark, JavaScript (D3.js), C++, Java Tools: Tableau, AWS (RDS, EC2, Elastic Beanstalk), Git, Flask, Adobe Analytics, Omniture, Excel, PowerPoint

PROFESSIONAL EXPERIENCE

Expedia Group|Chicago, IL Product Analytics Intern June 2018 – August 2018 • Assessed the incremental value and self-selection bias of Favorite/Shortlist feature in predicting customer quality using Hive and adapting Random Forest Classifier and Logistic Regression with Python scikit-learn • Constructed data pipelines from 40M+ data points and examined flight customers' cross-device shopping patterns through the segmentation of various shopping attributes using Hive and Python

Principal Financial Group|Practicum Student Analytics Consultant October 2017 – June 2018 • Evaluated and simulated the current Random Forest forecast model on factor returns on a financial risk factor dataset with 200+ feature variables and 100K+ data points with R • Researched and implemented Regime Switching model to enhance factor-based investing strategy

California Department of Insurance|Rate Regulation Branch, Los Angeles, CA Student Analytics Assistant June 2016 – August 2016 • Predicted marketing channels for insurance companies by building an Ensemble model with Bootstrap Aggregation using R • Visualized mismatches and helped the branch target 20+ companies out of compliance with regulations

PROJECT WORK

Multivariate Time Series Prediction of User Behavior with Amplero September 2018 – Present • Capture the longitudinal behavior of mobile phone users along various dimensions by applying state-space models and optimize marketing decisions with Python

Customer Segmentation & Visualization Project with ShopRunner January 2018 – June 2018 • Segmented and analyzed 140+ retailers from 5M+ transaction data in ShopRunner’s network using Principal Component Analysis (PCA) and Gaussian Mixture clustering model (GMM) with R • Developed an interactive KPI dashboard with D3.js that enabled ShopRunner to visualize its retailer network

H1B Petition Status Prediction Web Application January 2018 – April 2018 • Engineered features from raw case certification data and forecasted H1B petition case status with Python • Designed and deployed a web application by incorporating Flask, HTML, and AWS components

Ethel Shiqi Zhang

[email protected] | +1 (213) 550-6743 https://github.com/0ethel0zhang

TECHNICAL SKILLS • Languages: SQL, Python, R, Java, JavaScript, Spark, Hadoop, Tensorflow, Sklearn, MDX, GraphDB • Software: Tableau, Spotfire, PostgreSQL, Git, Google Analytics, AWS, Microsoft Office Suite, Matlab

EDUCATION Northwestern University Evanston, IL Master of Science in Analytics, School of Engineering Sep 2017 – Dec 2018 • Relevant Coursework: Predictive Analytics, Python and Java Programming, Database and SQL, Data Mining, A/B Testing, Data Visualization, Deep Learning, Text Analytics, Optimization University of Southern California Los Angeles, CA Bachelor of Science in Business Administration, Concentration in Finance Aug 2011 – May 2015 • GPA: 3.88, GMAT: 760 (99th percentile), Marshall Honor Student (1% of student body)

PROFESSIONAL EXPERIENCE Lazard Frères & Co., LLC New York, NY Data Science Intern Jun 2018 – Present • Built GMM clustering and comparables-finding models to enable bankers to perform benchmarking analysis; the models are at production-level and integrated into a company-wide data analysis tool • Explored the relationship between financial metrics and stock price using Boosted Tree (AdaBoost) and Random Forest models in Python to help C-suite executives make sound business decisions • Prepared a data pipeline to clean, combine, and transform billions of rows of transaction data in Spark Shell Game Venture Los Angeles, CA Co-founder and Data Science Lead Jul 2016 – Jul 2017 • Co-founded a business venture that optimizes the return of select equity and alternative investments • Utilized Excel, R, Python and SQL to create a local web application and optimization models that automate the data warehousing and performance evaluation processes for select portfolio in a team of 5 Ernst & Young, LLP (EY) Irvine, CA Project Management Consultant Jul 2015 – Jul 2016 • Increased internal sales win percentage by 20% through big data analytics utilizing Spotfire and Tableau • Expedited the procurement and the implementation of a software through benchmarking analysis • Developed a project management plan for client’s more than 70 breakthrough initiatives by setting up a PMO equipped with project management and visualization tools such as reporting dashboards • Led the incubation of EY Presents and organized 20 members to practice public speaking monthly Morgan Stanley Beverly Hills, CA Global Wealth Management Intern Aug 2013 – Jun 2014 • Piloted a marketing project that was projected to increase the team’s asset value by 1% • Managed relationships with 30 high-net-worth clients daily through phone calls and emails

DATA SCIENCE PROJECTS Chicago Botanic Garden (Joint Project with IBM Analytics) Jan 2018 – Jun 2018 • Identify clusters for client’s 1 million donors in Python to analyze gifting patterns and upselling trends • Visualize the clusters on Tableau and present upselling recommendations to the EVP along with IBM Chicago Park District (CPD) – Day Camp Price Modeling Sep 2017 – Jun 2018 • Reengineered the pricing model for Day Care program using machine learning methods such as neural network, boosted tree, K-nearest neighbors, and factor analysis in R and built an interactive dashboard Music Recommender Web Application Development Project Jan 2018 – Apr 2018 • Built a recommender system using Flask with communications to Spotify API and dynamically read user inputs to present the recommendations in a web application hosted by AWS and RDS

SKILLS, AWARDS & INTERESTS • Languages: fluent in Mandarin Chinese (written and spoken), conversational in French • Awards: Winner of ’18 NYC Product Tournament, EY Bravo awards for excellent professional services • Interests: Music, piano, traveling, psychology, board games, yoga, and track (high-school track team)

NumPy, SciPy, pandas, scikit-learn, dplyr, ggplot2, statistics, machine learning, data extraction, data cleaning, data analysis, large datasets, forecasting algorithms, Time-Series Regression, Logistic Regression, Factor Analysis, Excel, PowerPoint, market research