Stream 1 program

Stream 1 (Starting out in ML)

This stream is suitable for participants who have no ML experience and are interested in a curriculum for self-learning. The following curriculum, together with a support group of peers going through the same journey, will prepare participants for Stream 2.

You don’t need to know everything about everything. At the beginning, focus on learning a few things really well.

Machine learning basics: how to formulate a machine learning problem
- Introduction and Chapter 5 of the Deep Learning book
  - Note: if you need a refresher on linear algebra, probability theory, and numerical computation, Chapters 2-4 of the Deep Learning book are a great resource.
- [optional] Chapter 1 of the Hands-On Machine Learning book
Learn the theory of 5 basic algorithms, how to evaluate them, and how to use them in practice (sklearn):
- Regression:
  - Linear regression
- Clustering:
  - K-means clustering
- Classification:
  - Logistic regression
  - SVMs
  - Random forests
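
As a starting point for the "in practice" part, here is a minimal sketch that fits all five algorithms with scikit-learn. The toy datasets (diabetes, iris), the train/test split, and every hyper-parameter value are assumptions chosen for illustration; they are not prescribed by this curriculum.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_diabetes, load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Regression: predict a continuous target (assumed toy dataset: diabetes).
X_reg, y_reg = load_diabetes(return_X_y=True)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_reg, y_reg, random_state=0
)
linreg = LinearRegression().fit(Xr_train, yr_train)
r2 = linreg.score(Xr_test, yr_test)  # R^2 on held-out data
print(f"linear regression R^2: {r2:.2f}")

# Clustering: group samples without using the labels (assumed: iris, k=3).
X_cls, y_cls = load_iris(return_X_y=True)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_cls)
cluster_labels = kmeans.labels_  # one cluster index per sample

# Classification: three classifiers on the same split, for comparison.
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    X_cls, y_cls, stratify=y_cls, random_state=0
)
classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
accuracies = {}
for name, clf in classifiers.items():
    clf.fit(Xc_train, yc_train)
    accuracies[name] = clf.score(Xc_test, yc_test)  # held-out accuracy
    print(f"{name}: accuracy {accuracies[name]:.2f}")
```

All five estimators share the same `fit` / `predict` / `score` interface, which is what makes it easy to swap one algorithm for another while studying them.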
Model evaluation
- Cross-validation, overfitting
- accuracy, recall, precision, F1 score, ROC curve, loss functions
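
The evaluation topics above could be explored with a sketch like the following. It assumes a binary classification task (scikit-learn's breast cancer dataset), a scaled logistic regression model, and 5-fold cross-validation; all of these are illustrative choices, not requirements of the curriculum.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    log_loss,
    precision_score,
    recall_score,
    roc_auc_score,
    roc_curve,
)
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled using only
# its own training portion (avoids leaking validation data).
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Cross-validation: 5 accuracy estimates from the training data alone.
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")

# Final check on the held-out test set.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)              # hard class labels
y_prob = model.predict_proba(X_test)[:, 1]  # probability of class 1

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),  # area under the ROC curve
    "log_loss": log_loss(y_test, y_prob),      # a loss: lower is better
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Points of the ROC curve, ready to plot (false vs. true positive rate).
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
```

Note that precision, recall, and F1 are computed from hard predictions, while the ROC curve and the log loss need predicted probabilities.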
Project 1 [Kaggle]
- Study problem formulation
- Follow others’ solutions with various algorithms
- Replicate existing solutions and understand various aspects of data preparation and modeling

Guidance on picking Projects

Project 1 (Stream 1):
- Should be doable within a week (an estimated 40-60 hours).
- The focus is on learning by example: how a data problem is formulated and how others solved it.
  - Test: The main question to be able to answer at every step is “Why?”
- What business need does solving this problem meet? What is the value of the project if it is solved?
  - Test: Think about the business: is this problem worth solving?
- Understand the data: how are similar datasets pre-processed, explored, and normalized, why, and what tools are used?
  - Test: Can you do the data preparation on a similar dataset?
- Why was a specific algorithm used to model the dataset? How were the hyper-parameters chosen?
  - Test: Can you compare the algorithms? What are their pros and cons?
  - Test: Can you apply the algorithms you studied on a different but similar problem?
- How is each solution evaluated? What is the evaluation metric?
  - Test: Why does that metric make sense? Can you justify it? What are the alternatives?
  - Test: How did you make sure the model didn’t overfit? Can you justify why the solution is correct?
- Can you write a blog post about the problem, the solutions you studied, and the tools you used?
  - Test: Find gaps in your knowledge while writing your blog post.
  - Test: What did you learn? What did you find very useful? What are the caveats?
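
For the questions above about how hyper-parameters were chosen and how to check for overfitting, one common approach is a cross-validated grid search followed by a comparison of training, validation, and test scores. The sketch below assumes a random forest on scikit-learn's breast cancer dataset and a small, arbitrary grid over `max_depth`; a solution you study on Kaggle may well do this differently.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Try each candidate depth with 5-fold cross-validation on the training set.
# The grid values are illustrative assumptions, not recommendations.
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid={"max_depth": [2, 4, None]},
    cv=5,
    return_train_score=True,  # keep training scores to inspect the gap
)
search.fit(X_train, y_train)

best = search.best_index_
train_score = search.cv_results_["mean_train_score"][best]
val_score = search.cv_results_["mean_test_score"][best]
test_score = search.score(X_test, y_test)  # data never used during the search

print(f"best params: {search.best_params_}")
print(f"train {train_score:.3f} | validation {val_score:.3f} | test {test_score:.3f}")
```

A training score far above the validation score suggests overfitting, while a test score close to the validation score is some evidence that the chosen hyper-parameters generalize.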
