Ethen (MingYu) Liu
[email protected] | (224) 714-7841 | https://github.com/ethen8181

EDUCATION
Northwestern University | Evanston, IL | December 2017 (Expected)
Master of Science in Analytics

National Taiwan University of Science and Technology | Taipei, Taiwan | January 2015
Bachelor of Industrial Management, Dean's List (5 semesters)

TECHNICAL SKILLS
Programming: R, SQL, Python
Python Scientific Libraries: pandas, Matplotlib, NumPy, SciPy, Numba, PySpark
Libraries: scikit-learn, Gensim, XGBoost, Keras, TensorFlow, LIME, H2O

EXPERIENCE
Cambia Health Solutions | Portland, OR | June 2017 – August 2017
Data Science Marketing Intern
• Generated a targeted marketing list from 1.5M members for an asynchronous health service
• Refactored the ETL and modeling pipeline to achieve a 3X speedup without losing performance
• Automated PostgreSQL queries using Jinja templates and visualized embeddings using TensorBoard

Allstate | Chicago, IL | October 2016 – May 2017
Graduate Student Consultant
• Won a team-based competition to work on a year-long analytics project. Awarded a 30k cash prize
• Implemented collaborative filtering methods in Cython (1.7 times faster than Quora’s open-source qmf package when benchmarked on the publicly available Last.fm 360K dataset) to generate personalized recommendations for Allstate’s loyalty program
• Designed an intuitive scikit-learn-like API and Jupyter Notebook documentation for beginners in recommendation systems

PROJECTS
Machine Learning Tutorials | January 2016 – Present
https://github.com/ethen8181/machine-learning
• Approximately 1.2K stars and 160 forks
• Developed machine learning tutorials. The content aims to strike a good balance between math, educational from-scratch implementations of algorithms, and open-source library usage

Northwestern University | Evanston, IL | February 2017 – May 2017
Group Project - Amazon Project Category Classification
• Built an ensemble of a Convolutional Neural Network and Logistic Regression to predict Amazon product categories from image and text data with 96% accuracy (25% baseline)
• Identified weaknesses in the ensemble model using LIME for model-agnostic interpretation

Northwestern University | Evanston, IL | May 2017
Hackathon 2nd place - Sentiment Analysis On Amazon Product Reviews
• Leveraged scikit-learn and LIME to build a sentiment analysis pipeline

Northwestern University | Evanston, IL | October 2016 – December 2016
Group Project - Text Analysis of Job Postings to Predict Supply and Demand
• Cleaned 1.6 million job postings (text data) using spaCy
• Utilized Gensim and scikit-learn to train a Phrase model and Latent Dirichlet Allocation (LDA) for assigning job categories to job postings
• Implemented a Topic Stability Algorithm in Python to determine a suitable number of topics for LDA
• Trained Random Forest and Gradient Boosting Machine models with H2O to predict supply and demand for different jobs over different time spans, achieving an R² value of 0.90