Data Science Resources

PAWEL CISLO
MSC DATA SCIENTIST
PAWELCISLO.COM

FOREWORD

First of all, thank you very much for deciding to become one of my newsletter subscribers. I will keep this short, as this e-book consists of many resources, sorted alphabetically (tutorials, links, tools and datasets), that I definitely recommend checking out in your free time.

All of the links you are about to see were collected during my long internet journey (since 2016). Without a doubt, there are tons of other sites to visit; however, these are the ones that stood out to me (I wish I had known about them when I was starting).

As I browse the web, I might add more links to this e-book. You can find the most up-to-date version of the book at the address given in your e-mail. Enjoy!

TUTORIALS

Side Note: If you are looking for any other tutorials, I recommend using the online course browser courseroot.com; however, you will usually be redirected to Udemy, Coursera and edX. Moreover, on my website, I will try to post my own notes from the courses that I have completed.

• 39 Machine Learning Resources that will help you in every essential step
• A Data Science Framework: To Achieve 99% Accuracy <--- very good tutorial for beginners in Python
• Another Book on Data Science <--- learn R & Python in parallel
• A short & practical HOW-TO guide to scrape data from a website using Python
• Awesome Learn Datascience <--- list of tutorials & resources for beginners
• AWS Machine Learning Course <--- free (30 lessons, 45 hours)
• D3 Graph Theory <--- learn graph theory visually
• Data Science Essentials (edX) <--- one of the courses I have finished
• Deep Learning for Self-Driving Cars (MIT 6.S094)
• Deep Learning Ocean <--- kick-starter into deep learning
• Deep Learning World (repo) <--- resources for Deep Learning Researchers and Developers
• Easy-TensorFlow <--- comprehensive tutorials
• Guide to deep learning
• Immersive Linear Algebra <--- world's first linear algebra book with fully interactive figures
• Kaggle Learn <--- set of courses to go through
• Lecture Collection | Convolutional Neural Networks for Visual Recognition (Spring 2017) <--- very good, YouTube-based, recommended by "About Data - Krzysztof Sopyła"
• Learn Data Science
• Learn Machine Learning in 3 Months <--- list of resources by Siraj Raval
• Machine Learning & Deep Learning Tutorials
• Machine Learning Crash Course <--- by Google, free; afterwards, continue with the paid TensorFlow course
• Machine Learning Course by Andrew Ng <--- the most recommended course, by Stanford University, which I finished as well
• Machine Learning Course with Python (repo)
• Machine Learning Crash Course (HN) <--- with TensorFlow APIs (by Google)
• Machine Learning for Everyone <--- explained in simple words
• Machine Learning Guides <--- by Google
• Machine Learning with Python <--- small-scale machine learning projects
• March 2019 Machine Learning Study Path <--- complete ML study path
• Microsoft Professional Program for Big Data
• MIT Deep Learning <--- MIT Deep Learning related courses
• mlcourse.ai <--- open Machine Learning course, both in English and Russian
• MVA - Introduction to Data Science
• Notes from Coursera Deep Learning courses by Andrew
• Pandas Cookbook <--- online e-book (534 pages)
• PracticalAI <--- practical approach to learning machine learning
• Principles and Techniques of Data Science <--- online book
• Production Data Science (Reddit) <--- bridge the gap between exploration in data science and productionisation in software development
• Project Based Learning <--- curated list of project-based tutorials
• Python Data Science Handbook
• PyTorch Tutorial
• Quantee Tutorial <--- Data Science tutorial by Dawid Kopczyk
• R and Python <--- how to integrate both into your workflow
• Seeing Theory <--- visualization of data
• Spinning Up <--- learn deep reinforcement learning
• Stawiamy własny serwer (Setting up our own server) <--- for programming in R and Python (in Polish)
• TensorFlow Course <--- simple and ready-to-use tutorials for TensorFlow
• The most comprehensive Data Science learning plan for 2017
• The neural network zoo <--- explanation of all the neural architectures
• The Open Source Data Science Masters

LINKS

Side Note: Here you can find links to interesting articles and things that did not fit into any other category.

• 5 Career Paths in Big Data and Data Science, Explained
• 7 myths in machine learning research
• 8 ways to perform simple linear regression and measure their speed using Python
• 10+2 Data Science Methods that Every Data Scientist Should Know in 2016
• 10 must-know algorithms and data structures for a software engineer
• 11 most read Deep Learning Articles from Analytics Vidhya in 2017
• 12 essential command line tools for data scientists
• 16 Useful Advices for Aspiring Data Scientists
• 20 Big Data Repositories You Should Check Out
• 21 Must-Know Data Science Interview Questions and Answers
• 100 Days of ML Coding <--- great repository with lots of ML terms
• A Concise Handbook of TensorFlow <--- well-explained PDF
• AI & Architecture <--- use of AI to generate floor designs and their styles
• Algorithm_Interview_Notes <--- in Chinese
• All the best big data tools and how to use them
• Amazon Mechanical Turk <--- access a global, on-demand, 24x7 workforce
• Artificial Neural Networks Explained
• As a data scientist, what tips would you have for a younger version of yourself?
• Awesome (better graphical form)
  ○ Awesome AI
  ○ Awesome Big Data
  ○ Awesome Computer Vision
  ○ Awesome Data Science
  ○ Awesome Data Science Interview Questions
  ○ Awesome Deep Learning
  ○ Awesome Deep Vision
  ○ Awesome Information Retrieval
  ○ Awesome Machine Learning
  ○ Awesome Pytorch List
  ○ Awesome Speech and NLP
• Awful AI <--- curated list to track current scary usages of AI
• Big Data Landscape 2017 (Source)
• blink-182 Song Similarity <--- style progression of a band over time
• Bringing the best out of Jupyter Notebooks for Data Science
• Building a Deep Neural Net In Google Sheets
• Build Handwriting Recognizer & Ship It To App Store
• Cheat Sheets
  ○ Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning
  ○ Cheat Sheets for Machine Learning and Deep Learning Engineers
  ○ Cheat Sheets - Data Science Resources <--- R/Python/Numpy/Pandas
  ○ DataCamp Cheat Sheets
  ○ Data Science Cheatsheets <--- list of cheatsheets on GitHub
  ○ Data-Science--Cheat-Sheet
  ○ Deep Learning cheatsheets for Stanford's CS 230
  ○ Machine Learning cheatsheets for Stanford's CS 229 <--- repo
  ○ Most general cheat sheet
• Checklist for debugging neural networks
• Choosing the right metric for evaluating Machine Learning models
• Classification datasets results (MNIST, CIFAR, STL-10, SVHN, ILSVRC2012)
• Convolutional Neural Networks <--- very good explanation
• Cybercrimes Investigation and Intrusion Detection in Internet of Thing
• Data Science Blogs <--- list of all blogs
• Data Science Interview Guide
• Data Science map
• DataScienceWeekly <--- weekly newsletter
• Deep Learning 500 questions <--- with answers (Chinese)
• Deep Learning Papers Reading Roadmap
• DeepForge <--- modern development environment for deep learning
• DeepLearn <--- implementation of research papers on Deep Learning
• deep learning object detection <--- paper list
• Deep Neural Network implemented in pure SQL over BigQuery
• Descriptive Statistics
• Docker for Data Science
• Encoding data in dubstep songs (HN)
• End-to-End Deep Learning for Self-Driving Cars
• Essentials of Machine Learning Algorithms <--- with Python & R code
• Feature Selection algorithms <--- in Python
• Generate Quick and Accurate Time Series Forecasts using Facebook’s Prophet
• Genetic algorithm in machine learning
• Getting Spark, Python, and Jupyter Notebook running on Amazon EC2
• Global Heatmap
• Google AI Research <--- repository with code released by Google AI Research
• Homemade Machine Learning <--- Python examples of popular ML algorithms
• How Docker Can Help You Become A More Effective Data Scientist
• How far can you travel in one hour by car?
• How Much Hotter Is Your Hometown Than When You Were Born?
• How to Build an End-to-End Conversational AI System using Behavior Trees
• How to build your own AlphaZero AI using Python and Keras
• How to Deploy Machine Learning Models: The Ultimate Guide
• How to easily Detect Objects with Deep Learning on Raspberry Pi (HN)
• How to extract data from MS Word Documents using Python
• How to Develop a Word-Level Neural Language Model and Use it to Generate Text
• How to recognize fake AI-generated images
• How to setup a data science blog
• How to solve 90% of NLP problems <--- step-by-step guide
• How to train Keras model x20 times faster with TPU for free
• How you can train an AI to convert your design mockups into HTML and CSS (HN)
• Illustrated Word2vec
• Inside - AI <--- subscribe to daily news about AI
• Interactive Machine Learning List (Repository) <--- I have also contributed ;)
• Introduction to Matplotlib <--- good summary of my Udemy notes
• Introduction to NumPy and Pandas <--- simple tutorial
• Japanese scientists just used AI to read minds and it's amazing
• Kaggle <--- home of data science & machine learning
• Kaggle’s State of Data Science & Machine Learning Report, 2017 (conclusions)
• Kto wygra finał mistrzostw świata w piłce nożnej 2018? (Who will win the 2018 football World Cup final?) <--- good tutorial on cleaning and merging two datasets by Mateusz Grzyb (in Polish)
