The machine learning mentorship program and this blog were established to help people who are trying to move into machine learning and data science from other STEM fields. The aim of ML mentorship is to provide guidance on what to focus on in the learning path, direct participants’ efforts toward projects with a higher return on invested time, and advise on ML problem formulation and solving. The ultimate goal is for participants to find an ML/DS-related job in industry and advance their careers. We’ll also invite DS/ML professionals in our network to share their experience with mentees. The program is structured in three streams listed below, each supported by a community of peers and advised by a mentor. You can read more about each stream and sign up on their respective pages.
Here’s a summary video from our introduction session where more details are explained:
Program Charter
- We as a group commit to being respectful of each other and to not discriminating based on sex, race, religion, etc.
- We as a group commit to being respectful of each other’s time.
- We as mentees commit to self-learning, self-development, and seeking guidance when we need it.
- We as mentors commit to helping guide mentees in their self-learning and professional journeys.
- We as a group don’t expect each other to do things for us or set aside time for us, but we will try to help and guide each other as far as our time allows.
- We as a group are building this community in the hope that the bond remains strong enough to form a professional network for us, and we believe that supporting each other will come back to us full circle during our careers.
Organization and logistics
For the mentorship program to be scalable, we need to organize so that everyone gets support without the whole burden of mentorship falling on a single person (we all have full-time jobs, after all). What I’d like to see is that you help mentor the people who entered the field after you, so I am using the term mentor to refer to all of you. Here is a suggested organizational structure that can help everyone get support from their peers and also contribute to the group.
- Shared Slack channel
  - Everyone joins Slack as a communication channel for asking and answering questions, and posting relevant resources.
  - Developing a community for supporting each other.
  - A platform for forming teams to do joint projects.
- GitHub and blogs
  - Everyone creates a personal GitHub account to share code and contribute to others’ code (here’s an intro article on how to do that; you can google for more: https://product.hubspot.com/blog/git-and-github-tutorial-for-beginners)
  - Everyone creates a GitHub Pages blog to share their work and learning (here’s an article on how to do that; you can google for more: https://towardsdatascience.com/how-to-create-a-free-github-pages-website-53743d7524e1)
  - Once you’ve established your blog, if you would like it featured on the MLmentorship blog, provide a headshot, a brief introduction to yourself, and the address of your blog; it will be listed at https://mlmentorship.github.io/mentees/
- Within-stream sessions
  - The group within a stream will meet bi-weekly.
  - Aim: Develop the community within that stream and support each other.
  - Agenda:
    - Discuss progress and challenges of the members.
    - Share resources.
    - Ask and answer questions.
    - Get guidance from the mentor from the upper stream.
    - Find areas where help and support are needed.
  - Each stream will have a coordinator, a role that rotates between members after each session. The coordinator is responsible for:
    - Coordinating and running the next bi-weekly session for their stream (1.5 hours at most).
    - Making sure meeting notes (minutes) are taken.
    - Inviting a rotating mentor from an upper stream.
    - Making sure the coordinator for the next meeting is chosen.
    - Sharing meeting notes and action items with the group and me.
  - I will read the meeting minutes from all three streams and, if required, hold a monthly meeting with the rotating coordinators from each stream to discuss challenges, resources, projects, and how to help.
Program focus on writing and sharing
Writing is a big part of learning, finding a job, and growing your career, especially in ML/DS, because it solidifies what you have learned and gives your work visibility. I encourage everyone in the program to take personal notes during their learning journey. These notes will come in handy when it’s time to prepare for interviews, and they will help you think of ideas for helpful blog posts.
Google is your best friend in your learning journey, and you will have to constantly google things to learn and understand them. As you search, make a note of the topics for which you haven’t found a good resource. Those are great candidates for helpful blog posts. When you think you have an idea or enough content for a blog post, discuss it with me and I’ll try to help guide you in writing it. If you are interested in getting more visibility for your post, we can discuss editing it and cross-posting it on the MLmentorship blog. You’ll also learn how to work with GitHub, submit a pull request (PR), and contribute to another repository. You’ll be building your portfolio along the way, which helps your job search. Let’s aim for everyone to write at least one blog post in the first two months. I’ll help you edit it and will cross-post it on the MLmentorship blog to further promote your work.
Resources for learning
I found this list of tutorials and resources very useful. It’s broken down by topic, and all the resources are free. It may be a great starting point.
FAQ
- Who are you? Why are you doing this?
- My name is Hamidreza Saghir, and I am an applied scientist at Amazon working on machine learning and NLP. I got my PhD in biomedical engineering from the University of Toronto, and then switched from STEM to machine learning some 5 years ago. I get a lot of questions about my experience in switching and thought that, instead of answering one-off questions, it would be more impactful to mentor people who want to follow the same path I took. So MLmentorship is just a platform for doing that, and I am currently the only mentor. Hopefully it can grow beyond just myself so that I can invite others to join.
- Is it worth it to start in DS/ML?
- I personally think the field of DS/ML has a lot of opportunity and is only getting started. There’s always hype when a new technology becomes hot, and it tends to calm down after a few years, but that doesn’t mean the technology lacks real potential. ML has a lot of potential for adding business value in many industries, and that only grows with the ever-growing amount of data those industries collect. But at the end of the day, it’s your decision whether you want to invest the time and energy to enter a new field. I made this choice for myself and am happy with my decision.
- Do I need a degree in ML/DS to get a job in this industry?
- You are welcome to invest in getting a degree in ML if you are interested, but my personal opinion, and the experience of many people who have made the switch from STEM to ML/DS, suggest that a degree is not required. What is required is the skills. I believe that more and more we are living in a skill-oriented world, where not many care about your degree if you can do the job and have some evidence to show that. This is even more true for data science, since many data scientists are actually people with advanced degrees from other STEM fields.
- What programming language should we start with?
- I suggest Python. It’s the de facto language in the industry, and unless you have a very strong opinion about another programming language, I suggest everyone start with Python so that group projects are easier to do.
- How many projects do we need to do to get a job?
- It’s hard to put a number on such a thing because everyone learns differently. However, if I were forced to pick a number, I’d say 4 projects: 3 strong individual Kaggle-like projects and 1 larger-scale group project that’s more similar to the actual work happening in the industry.
- Where do we get data?
- Kaggle is a good candidate for individual projects, but I suggest going beyond that in the group project and sourcing data by, e.g., scraping the web or using alternative data sources.
- Should we use Jupyter notebooks?
- Jupyter notebooks are a great tool for data science and ML projects since you can write your explanations, code, and results in the same document. You’ll also get the benefit of easily turning your work into a blog post by converting a Jupyter notebook to Markdown.
- How do we prepare for interviews?
- I will introduce resources, and you can form sub-groups for mock-interview preparation when the time comes.
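
On the Jupyter question above, here is a minimal sketch of what converting a notebook to Markdown involves. In practice the `jupyter nbconvert --to markdown` command does this for you, including cell outputs. The function below, `notebook_to_markdown`, is a hypothetical helper written only with the Python standard library; it assumes the nbformat 4 JSON layout and keeps only markdown and code cells, dropping outputs.

```python
import json

FENCE = "`" * 3  # Markdown code fence, built here to avoid a literal fence in the source


def notebook_to_markdown(notebook_json: str, code_language: str = "python") -> str:
    """Convert the JSON text of an nbformat-4 notebook to a Markdown string.

    Hypothetical helper for illustration: cell outputs are ignored.
    """
    notebook = json.loads(notebook_json)
    parts = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # nbformat stores a cell's source as a string or as a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if not text.strip():
            continue  # skip empty cells
        if cell.get("cell_type") == "markdown":
            parts.append(text.strip())
        elif cell.get("cell_type") == "code":
            parts.append(f"{FENCE}{code_language}\n{text.rstrip()}\n{FENCE}")
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    # A tiny in-memory notebook standing in for the contents of an .ipynb file.
    demo = json.dumps({
        "nbformat": 4,
        "cells": [
            {"cell_type": "markdown", "source": ["# My first project\n", "Some notes."]},
            {"cell_type": "code", "source": "print('hello')\n", "outputs": []},
        ],
    })
    print(notebook_to_markdown(demo))
```

To use it on a real notebook, read the `.ipynb` file as text and pass the contents to the function, then save the result as a `.md` file in your blog repository.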