**The Importance of Vision and Thought Process in Machine Learning Engineers**
When hiring machine learning engineers, consider not only their technical skills but also their thought process and their vision for ML. Many candidates have the technical side down; what separates great candidates from good ones is the ability to recognize where ML stands within the organization and the industry today, and to anticipate where it can go. For example, central feature repositories are becoming increasingly common, so even if your organization does not have one today, a strong candidate will think about how to build ML systems that can incorporate one later. As a hiring manager, you want engineers who can develop systems that let your team take that next step forward.
**Asking the Right Questions**
To assess a candidate's suitability for the role, ask what they view as important within ML. Do they understand the trade-offs between high accuracy and low latency, both for individual models and for ML systems overall, and can they explain why one matters more than the other in a given situation? Pose a concrete scenario, such as building a model that detects cancer, and ask what is important to have in that model. Let the conversation start with broad topics and then drill down into specific technical details; asking candidates to walk you through their reasoning shows how well they understand the problem and how they make decisions.
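
One way to make the accuracy versus latency trade-off concrete in such a conversation is a small selection rule: choose the most accurate model that still fits a latency budget. The sketch below is only an illustration, not a method from the interview. The `Candidate` class, the `pick_model` function, the model names, and every number are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A hypothetical model with offline accuracy and measured serving latency."""
    name: str
    accuracy: float        # fraction of correct predictions on a held-out set
    p95_latency_ms: float  # 95th-percentile response time in milliseconds


def pick_model(candidates: Sequence[Candidate],
               latency_budget_ms: float) -> Optional[Candidate]:
    """Return the most accurate candidate that fits the latency budget, or None."""
    eligible = [c for c in candidates if c.p95_latency_ms <= latency_budget_ms]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.accuracy)


# Made-up numbers, for illustration only.
CANDIDATES = [
    Candidate("logistic_regression", accuracy=0.88, p95_latency_ms=3.0),
    Candidate("gradient_boosting", accuracy=0.92, p95_latency_ms=20.0),
    Candidate("deep_ensemble", accuracy=0.95, p95_latency_ms=400.0),
]

# A tight, interactive budget gives up some accuracy for speed.
print(pick_model(CANDIDATES, latency_budget_ms=50.0).name)
# A relaxed, offline budget can afford the slowest, most accurate model.
print(pick_model(CANDIDATES, latency_budget_ms=1000.0).name)
```

The latency budget itself depends on the use case. An offline screening workflow may tolerate a slow model, while a user-facing feature may not, which is the kind of reasoning the interview question is meant to surface.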
**Portfolio and Projects**
In an interview, it is not enough to ask candidates about their experience or skills. You want to see evidence of their work, such as projects they have completed or open-source initiatives they have contributed to. The most convincing portfolio piece is a project the candidate can walk through end to end: how it works, why they implemented it the way they did, which trade-offs they faced, and what impact it had. Impact here means the value the project delivered, not a headline metric such as 93% accuracy on a model. Seeing that end-to-end process gives you insight into the candidate's thought process and approach to complex problems.
**Breaking into the Field**
For those without a CS degree who want to break into machine learning engineering, the barrier to entry is coming down, and plenty of resources are available. MOOCs (massive open online courses), blog posts, and newsletters have made technical knowledge easier to acquire. The risk of learning without a formal education is imbalance, so it helps to sort resources into two buckets: technical knowledge (models, algorithms, engineering) and high-level knowledge (blogs, essays, and papers about how to approach the field and its problems). Dedicating roughly equal time to each bucket builds a well-rounded understanding of the field.
**Conclusion**
Machine learning engineers play a critical role in driving innovation and solving complex problems within organizations. Hiring is not just about finding someone with technical skills; you need someone who thinks critically, approaches problems from different angles, and builds systems that deliver real value. Asking the right questions and reviewing a candidate's portfolio end to end will help you find that person. For aspiring engineers, balancing technical knowledge with high-level thinking about the field is the best preparation for this rapidly evolving discipline.
**Transcript**
Molly: Hey, it's Molly from Springboard, and I'm here with Shooby, a machine learning engineer at SurveyMonkey. Shooby, thanks so much for being here.

Shooby: Thanks so much for having me, Molly.

Molly: Let's start off. Can you introduce yourself and give us a little bit of your background?

Shooby: Sure. I grew up in the Bay Area and went to undergrad at Caltech, down in Pasadena. There I studied business economics and computer science, specializing in machine learning systems. I've been in industry about two years, and at SurveyMonkey about five months so far.

Molly: Specifically talking about machine learning, what does a machine learning engineer actually do, and how does that differ from being a data scientist or a software engineer?

Shooby: A data scientist today would primarily be responsible for translating a business problem (for example, figuring out what product we should sell next to customers who have already bought a product from us) into more of a technical model, and then producing a model that takes in a certain set of attributes about a customer and outputs some sort of result. An ML engineer would probably then take that model the data scientist has developed and integrate it with the rest of the company's platform. That could involve building, say, an API around the model so that it can be served and consumed, and then maintaining the integrity and quality of the model so that it continues to serve really accurate predictions. Software engineers tend to have two main areas they focus on: front-end engineering and back-end engineering. Front-end engineering is your UI and UX, what your end users really end up seeing. Back-end engineering is developing the systems through which data flows; a lot of the business logic lives there. An ML engineer ends up using a
lot of what a back-end software engineer does day to day, combined with knowledge of data science and ML, in order to integrate ML models into engineering systems.

Molly: Awesome. So how did you get started in machine learning?

Shooby: There are actually two interesting things related to that. First, when I got to Caltech, my freshman-year roommate had been doing machine learning for several years, and I had never heard of it. When I asked him what machine learning was, he explained it this way: you can use data from the past to decipher patterns and make predictions about the future. That totally blew my mind, that you could use data from the past to make predictions about the future in literally any space whatsoever. Second, what really drives me to pursue machine learning even more today is that it's kind of amazing what you can do with data insights at scale, and machine learning really allows you to do that.

Molly: So what's a typical day as a machine learning engineer, and what kinds of teams do you normally work with?

Shooby: Day to day, my projects and tasks fall into two different buckets. One is developing infrastructure or a platform to automate ML within our organization. That could mean making sure our models are retrained automatically whenever their performance degrades, or automating the way a data scientist may want to develop a model in the future. The second is that, as an ML engineer, I'm also responsible for a couple of different models and model services, especially once they're in production. That means making sure that their quality is intact, that they're still performing as expected, and being able to take the necessary actions if anything goes wrong. As part of that, I guess the third
thing is that we're working to put other models into production. For that I work with the data science team, the data engineering team, product managers, and other software engineering teams to really make it happen: with the data scientists and data engineers during model development, to make sure the model will still work in production in a practical and usable manner, and with the PMs and other software engineering teams, who are responsible for consuming the model once it's in production.

Molly: That makes a lot of sense. For those of us who don't know the world of machine learning, what would surprise us about your day-to-day or your job in general?

Shooby: A lot of people have asked me, "Oh, so do you just end up typing code on a computer all day long?" I guess this applies to most software engineers too, but we interface a lot with the business problem, so we really have to understand it. A lot of that is collaboration and communication, working with different teams; it's really a cross-team, collaborative effort. Even after we've understood the problem and translated it into a more technical architecture, we still need to be on the same page about the technical problems, and only then do we translate that into code.

Molly: Definitely, that makes a lot of sense. What would you say is your biggest challenge in being a machine learning engineer?

Shooby: Having access to clean, reproducible data. Whenever we're developing these ML systems, you need clean data. You could have the most well-organized datasets or data pipelines, but we end up finding again and again that there's still room for improvement, especially when we try to automate a lot of our processes. We might not have access
to a certain set of attributes within our data, or we realize that we were able to get this data one time, but because the data ownership has changed, it's no longer available in that way. Because this process is constantly in flux, it's almost impossible to always have that data available. But I know our organization is taking a lot of steps, and a lot of tools are coming out, that make this process much easier.

Molly: Very cool. Speaking more to your journey as a machine learning engineer, what's a really cool problem that you've solved, and how did you use machine learning to solve it?

Shooby: I used to work in the credit union space; I consulted there. A really interesting thing about credit unions is that they are a few years behind banks in using machine learning within their organizations, within their industry really. What I was doing for them was trying to demonstrate that ML has value within their organization. Our goal was to see whether we could derive a really simple use case that could be implemented very quickly, to show our stakeholders that this is valuable for their organization and valuable for their members. Sometimes you need a little bit more than just ML for these projects to really come to life and for work to actually happen. What we ended up doing for that project was developing a churn prediction model and putting it into production, so that when people visited the credit union to withdraw money or make a deposit, the tellers would know how likely that member was to churn, that is, to no longer be a member of the credit union. That's really valuable information. The business got some immediate value out of the project, and we were able to move forward with even more ML projects in their pipeline.

Molly: Yeah, that sounds like a really
cool and useful project. Piggybacking off of that, what are some tools and technologies that you use in your day-to-day?

Shooby: Honestly, as a machine learning engineer you almost have to be an expert in knowing what tools are out there. Day to day at SurveyMonkey I use mostly Python as my main development language, with a little bit of Scala here and there. In terms of frameworks and technologies, it's really across the board. I end up using a lot of the AWS stack (S3, EMR, Kinesis, Lambda, and so on), and also containerization tools like Docker, Kubernetes, and Ansible. On the ML-specific side, there are things like scikit-learn, TensorFlow, PySpark, and so on. There are a lot of different domains within the tooling, and you have to know not just one tool that can get the job done but almost three or four, because sometimes you're trading off speed against ease of development against having multiple people be able to collaborate. When you're making those decisions, you really need to know what's up.

Molly: So let's say someone's just getting started in machine learning. What different types of machine learning jobs are out there right now?

Shooby: Machine learning engineering as a field is in a pretty interesting spot right now, because a lot of businesses have found that this ML thing is really working for them: "We've got three or four models out there and they're working for us, but now we want ML at scale. We don't want just three or four models; we want 25 or 50 or 100." That's where machine learning engineering as a field is growing, to handle that kind of scale. As a result, I've found that there are probably three groups of machine learning engineers. One is the ML engineer who works at a startup. They end up doing everything, from integrating an ML model into any
consuming service to doing the model development, data cleaning, and building out a data pipeline. Then there are the more mid-size companies, where ML engineers are responsible for working with software engineers and data scientists to productionize and deploy a model, which in itself is a really huge task, but they may not be as responsible for the actual algorithmic model development. And then there's a third set of ML engineers who I think work at much larger companies, your Googles and your Facebooks. I might even call them ML infrastructure engineers: they're building the infrastructure or platform to automate ML or to make ML much easier to do within the organization.

Molly: Thank you for explaining that, I really appreciate it. Let's go through the interview process. I know a lot of our students are from the Bay Area or the West Coast in general and would love to know any tips or tricks you might have for that interview process.

Shooby: To be honest with you, interviewing in the Bay Area is very hard. You could be the best of the best, but it's sometimes very difficult to even get that interview. I would recommend, at least on your resume, listing a lot of different projects, especially ones that aren't commonly done. A lot of people have tackled, say, the Netflix challenge as a project within their engineering courses or as part of their MOOCs, and a lot of people have tackled things like the Titanic project. These are very commonly known projects. I would recommend that students try something in a field they're really interested in. For example, I'm really interested in sports, so I've done a lot of sports analytics projects where I scrape data from the troves of publicly available data and try to make some interesting projects out of
that. That's what really makes you stand out on a resume, I'd say. Once you get that first interview, I think it's most important that you know the ins and outs of models: the math behind them, why you would apply certain algorithms over others, and how they work. And again, realizing that ML isn't just the modeling part, that there's a greater system around it, and understanding how it all fits together is really key.

Molly: So outside of being extremely passionate about one specific area and having that background in modeling, what do you think sets apart great candidates from just good candidates?

Shooby: When I'm looking to hire more ML engineers for our team, what separates the good candidates from the great ones is their thought process and their approach to ML. A lot of people have the technical stuff down, and that's amazing. But it's really about having a vision: recognizing where ML is within our organization and the industry today, where it can go in the future, and how to develop systems that let us take that step forward. For example, these days it's becoming increasingly common, or even necessary, to have central feature repositories within your ML systems. Although an organization may not have one today, how can we develop our ML systems so that we can incorporate one later? It's about developing and thinking for the future, because this is a rapidly changing field where new technologies and new ideas are always flowing through.

Molly: Let's say we're putting you in the hiring manager position again. What are some questions that you might ask those who are applying to become a machine learning engineer at your company?

Shooby: I'm always interested in what people view as important within ML. Does the ML engineer I'm looking to
have join my team really understand the trade-offs between really high accuracy and really low latency, on my models and within ML systems overall, and understand why one matters more than the other? I might ask questions specific to situations: let's say I'm going to create a model that detects cancer; what's important to have in that model? I really love hearing from ML engineers themselves, having it be more of a conversation about broad topics, and then drilling down into the specific technical details from there.

Molly: Yeah, that makes a lot of sense. I know a lot of our students are really passionate about that interview process. If you were to have someone come in for an interview, what would you want them to bring? A portfolio, anything specific?

Shooby: Something that would probably blow me away is if they have a project they can point to and show me, end to end, how it works, why they chose to implement it in a certain way, the trade-offs that exist there, and what the impact of the project really is. Not in terms of "Hey, I just got 93% accuracy on this machine learning model," but what value came out of it. Did you help a team of people detect that deforestation is going on, where they couldn't do that in an automated fashion before? Things like that really close the loop on why we're doing ML, right? We're trying to translate a business problem into something technical, solve it, and then close the loop and say, "OK, what was the impact of solving this technical problem?"

Molly: For those of us who do not have a degree in computer science, if you're an aspiring machine learning engineer, how would you break into that field?

Shooby: Honestly, I think in today's world the need for a CS degree to enter
the field of machine learning or just engineering in general is really going down that barriers going down and there are a plethora of resources out there especially like MOOCs or just blog posts and everything and I find myself to this day still you know using these mooks blogs and newsletters all the time in order to increase my knowledge I think when approached the idea of kind of entering into new field I always like to kind of bucket things and say like alright there are two types of resources that I think exist out there one is your really your technical knowledge you know your models your algorithms your engineering stuff and then a different type of resource are like the blogs or the or the essays or the papers that really help think about your approach to the field or approach to solving a problem and I often think that you have to separate your resources into either bucket and almost apply them fifty-fifty where I want to dedicate 50% of my time to this technical knowledge that I make that I will look to gain and I also want to maybe also apply about 50% of my time to this more high-level knowledge of my approach to solving problems or my approach to the field in general because if I often find that there's an imbalance that kind of persists when we try to enter a field without saying maybe a formal education and it's really helpful to have both when you're you know kind of joining the workforce at the end of the day thanks so much for sharing that well I'm shooby it's been so nice to chat with you and get to know a little bit more about machine learning engineers and more about your job so thank you so much for taking the time to talk with us today thank you so much for having me Molly it's been a pleasure youhey it's Molly from springboard and I'm here with shooby a machine-learning engineer at Survey Monkey should be thanks so much for being here thanks so much for having me Molly let's start off can you introduce yourself and give us a little bit of 
your background sure so I grew up in the Bay Area and I went to undergrad at Cal Tech down in Pasadena over there I studied Business Economics and computer science specializing in machine learning systems and I've been in industry about two years and at Survey Monkey about five months so far so specifically talking about machine learning what is a machine learning engineer actually do and how does that differ from being a data scientist or a software engineer a data scientist today would primarily be responsible for translating this business problem of for example we want to figure out what product we should sell next to our customers if they've already bought a product from us and translating that business problem into more of a technical model and being able to then output a model that can take in certain set of attributes about our customer and then spit out some sort of result an ml engineer would probably then take that model that the state of science has developed and integrate it in with the rest of the company's platform and that and that could involve building maybe say an API around this model so that it can be served and consumed and then being able to maintain the integrity and quality of this model so that continues to serve really accurate predictions software engineers tend to have you know two main areas that they focus on generally there's a front-end engineering aspect and a back-end engineering aspect the front-end engineering is your UI your UX what your end users really end up seeing and then the backend engineering is developing systems in which data is flowing there's a lot of logic and business logic that really exists there and an ml engineer ends up utilizing a lot of what the software engineer back-end engineer would end up doing day to day and using their knowledge of data science and ml in order to integrate ml models into engineering systems awesome and so how did you get started in machine learning there's actually two interesting 
issues related to that so one thing is when I first got to Caltech my freshman year roommate had been doing machine learning for like several years before that and I had never heard of it before and when I asked him what machine learning was he kind of explained it in the way of you can use data from the past to decipher patterns in the data and make predictions about the future and that like totally blew my mind that you could use data from you know the past to make predictions about the future in literally any space whatsoever and today the reason that really drives me in machine learning to kind of pursue it even more I would say is it's kind of amazing what you can do with data insights that scale and ml machine learning really you know allows you to do that so what's a typical day day to day as a machine learning engineer and what kind of teams do you normally work with sure so day to day I might do kind of projects or tasks that are kind of in to kind of different buckets one might be you know more developing infrastructure or platform to automate ml within our organization and whether that could be making sure that our models are retrained automatically whenever their performance tends to degrade or whether that means automating the way that a data scientist may want to develop a model in the future and then the second thing is as an ml engineer I'm also responsible for a couple of different models and model services especially when they're placed into production so that could be kind of making sure that their quality is integrating that they're still performing is expected and being able to take the actions necessary if anything were to go wrong as part of that I guess the third thing actually might be to where we're working to put other models into and as part of that I'm working with the data science team the data engineering team you know product managers and also other software engineering teams in order to really make that happen where the data 
scientists and data engineers during model development and making sure that this model will still work in production and a practical and usable manner and then with the PM's and other software engineering teams who are really responsible for then consuming the model once it's out there in production really that makes a lot of sense so for those of us maybe who don't know their world the world of machine learning um what would surprise us about your day to day or your your job in general I think a lot of people have asked me like oh so you know do you end up just typing code on a computer all day long and I guess this applies to most software engineers too but I think also because we interface a lot with you know the business problem at the end of the day and so we really have to understand the business problem and a lot of that is collaboration communication working with you know different teams that's really a cross collaboration 'el effort and know after we've really understood this problem and really translating it and does more technical architecture we still need to you know be on the same page about the technical problems there and then really do we translate that into code definitely that makes a lot of sense um what would you say is your biggest challenge in being a machine learning engineer having access to clean reproducible data at the end of the day whenever we're developing these ML systems you need clean data and you could have the most well organized data sets or data pipelines or whatever but we end up finding again and again that hey there's still some improvement here especially when we get into the realm of trying to automate a lot of our processes where we don't have access to these certain set of attributes within you know our data or we realize that we were able to get this data this one time but you know because this data has been the data ownership has changed no it's not no longer available in that way and because this is a process that's 
constantly in flux it's almost impossible to to you know always have that available but I know that there are a lot of steps that our organization is taking plus a lot of tools that are coming out that make this process much easier very cool so speaking more to your journey once you've become you know machine learning engineer what's a really cool problem that you've solved and how did you use machine learning to solve it so I used to work in the credit union space I consulted back there and really interesting thing about credit unions is that they are a few years behind banks and utilizing machine learning within their organization within their industry really and what I was doing for them was to try to demonstrate that ml has value within their organization and our goal really was to see can we derive a really simple use case that can be implemented very quickly to show to our stakeholders that hey you know this is very valuable for your organization valuable for your members and sometimes you need a little bit more than just ml4 to - how these projects really come to life and for work to actually happen and what we ended up doing for that specific project was developing a churn prediction model and putting it into production such that when people came and visited the credit union - you know maybe extract money or make a deposit the bank tellers or the credit union tellers would know how likely this member was to churn or no longer be a member of this credit union which is really valuable information and the business got some immediate business value out of this project and you know we're able to move forward to even have even more ml projects in their pipeline yeah that sounds like a really cool and useful project as well kind of piggybacking off of that project what are some tools and technologies that you using your date today honestly as a machine learning engineer you almost have to be an expert in knowing what tools are out there my day-to-day at Survey 
Monkey I am using mostly Python as my main development language but a little bit of Scala here and there but in terms of frameworks and technologies it's really across the board so I end up using pretty much a lot of the full AWS stack of s3 EMR Kinesis lambda so on and so forth but also container izing tools like docker kubernetes and Sybil and then on the ml specific side things like you know scikit-learn tensorflow PI spark and so on really there's really a lot of different domains that exist within the tools and you have to know not just one tool that can get the job done but almost three or four because sometimes you might find yourself needing a specific you know speed versus you know the the ease of development versus having multiple people be able to collaborate and when you're making those decisions you know you really need to know what's up I'm so let's say someone's getting just getting started in machine learning what different types of machine learning jobs are out there right now so machine learning engineering as a field is actually in a pretty interesting spot right now because a lot of corporations or businesses have found that all right this ml thing is really working for us so we've got three or four models out there we if they're working for us but now we want on link scale we don't want just three or four models we want like 25 or 50 or 100 models and that's where machine learning engineering as a field is really growing to handle that kind of scale as a result of that I found that there are probably three groups of machine learning engineers one is you know your ml engineer that works at a start-up they end up doing everything you know from integrating your ml model into any consuming service to actually doing the model development data cleaning building out a data pipeline and then you know that your more mid-level companies their ml engineers are really responsible for working with software engineers and the data scientists in order to 
productionize there's a ploy a model which in itself is a really huge task or project but they may not be as responsible for the actual algorithmic model development and then there's the third set of ml engineers that I think work at much larger companies or Google's your Facebook's and really I might even call them like ml infrastructure engineers where they're building the infrastructure or platform to automate ml or to make ml much easier to do within the organization thank you for explaining that really appreciate it um so let's kind of go through the interview process I know a lot of our students are from the Bay Area or West Coast in general and would love to know any tips or tricks you might have during that interview process to be honest with you interviewing in the Bay Area is very hard I don't think it really matters you could be you know the best of the best but it's very difficult to sometimes even get that interview I would recommend you know at least on your resume listing a lot of different projects that you might tend to do especially ones that you aren't really commonly performed you know I think a lot of people have tackled say the Netflix challenge is a project within their engineering courses or even as part of their MOOCs and a lot of people have tackled things like the Titanic project and these are very commonly known projects I would kind of recommend students to try out something in a field that they're maybe really interested in for example I'm really interested in sports and so I've done a lot of sports analytics projects where I'm really working on scraping data from the troves of data warehouses that are publicly available and trying to make some interesting projects out of that and that's what really makes you stand out I say on a resume once you maybe get that first interview I think it's a most important that you know the ins and outs of models for sure knowing really the math behind them why you would apply certain algorithms over 
some other set, and how they work, and realizing that, again, ML isn't just the modeling part. There's a greater aspect to it, and understanding how it all fits together is really key.

So outside of being extremely passionate about one specific concept and actually having that background in modeling, what do you think sets apart great candidates from just good candidates?

What really separates good candidates from great candidates, when I'm looking to hire more ML engineers for our team, is their thought process and their approach to ML. A lot of people really have the technical stuff down, and that's amazing, but it's really about having a vision: recognizing where ML is within our organization today and in the industry today, where it can go in the future, and how to develop systems that allow us to take that step forward for what may come. For example, these days it's becoming increasingly common, or needed, to have central feature repositories within your ML systems. Although an organization may not have one today, how can we develop our ML systems such that we can incorporate one later? It's really about developing and thinking for the future, because this is a rapidly changing field where new technologies and new ideas are always flowing through.

So let's say you're putting yourself in the hiring manager position again. What are some questions that you might ask those who are applying to become a machine learning engineer at your company?

I'm always interested in what people view as important within ML. Does the ML engineer that I'm looking to have join my team really understand the trade-offs between really high accuracy and really low latency on my models and within ML systems overall, and understand why one thing matters more than the other? I might ask questions specific to
situations where, let's say, I'm going to create a model that detects cancer: what's important to have in that model? I really love hearing from ML engineers themselves, having it be more of a conversation about broad topics and then being able to drill down to the specific technical details from there.

Yeah, that makes a lot of sense, definitely. I know a lot of our students are really passionate about that interview process. So if you were to have someone come in for the interview, what would you want them to bring? A portfolio? Anything specific?

I think something that would blow me away is if they have a project that they can really point to and show me, end to end, how it works, why they chose to implement it in a certain way, the trade-offs that exist there, and what the impact of the project really is. Not in terms of "Hey, I just got 93% accuracy on this machine learning model," but what was the value that came out of it? Did you help a team of people detect that there's deforestation going on, where they couldn't do that in an automated fashion before? Things like that really close the loop on why we're doing ML. We're trying to translate a business problem into something technical, solving it, and then closing the loop and saying, "Okay, what was the impact of solving this technical problem?"

So for those of us who do not have a degree in computer science, if you're an aspiring machine learning engineer, how would you get into that field, or how would you break in?

Honestly, I think in today's world the need for a CS degree to enter the field of machine learning, or just engineering in general, is really going down. That barrier is going down, and there are a plethora of resources out there, especially MOOCs or just blog posts. I find myself to this day still using
these MOOCs, blogs, and newsletters all the time in order to increase my knowledge. When I approach the idea of entering a new field, I always like to bucket things and say, all right, there are two types of resources out there. One is your technical knowledge: your models, your algorithms, your engineering stuff. A different type of resource is the blogs or the essays or the papers that really help you think about your approach to the field or your approach to solving a problem. I often think that you have to separate your resources into either bucket and almost apply them fifty-fifty: I want to dedicate 50% of my time to the technical knowledge that I'm looking to gain, and about 50% of my time to this more high-level knowledge of my approach to solving problems or to the field in general. I often find that there's an imbalance that persists when we try to enter a field without, say, a formal education, and it's really helpful to have both when you're joining the workforce at the end of the day.

Thanks so much for sharing that. Well, Shooby, it's been so nice to chat with you and get to know a little bit more about machine learning engineers and more about your job, so thank you so much for taking the time to talk with us today.

Thank you so much for having me, Molly. It's been a pleasure.
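
The interview's cancer-detection question, and the remark that a bare "93% accuracy" figure says little about impact, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the 100 hypothetical patients, the two hypothetical models, and the choice of recall as the second metric are not from the interview.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means cancer present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all patients classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual cancer cases that the model catches."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical data: 7 patients with cancer, 93 without.
y_true = [1] * 7 + [0] * 93

# Hypothetical model A: always predicts "no cancer".
always_negative = [0] * 100

# Hypothetical model B: catches 6 of 7 cancers but raises 10 false alarms.
screening_model = [1] * 6 + [0] * 1 + [1] * 10 + [0] * 83

for name, preds in [("always_negative", always_negative),
                    ("screening_model", screening_model)]:
    print(f"{name}: accuracy={accuracy(y_true, preds):.2f}, "
          f"recall={recall(y_true, preds):.2f}")
```

On this made-up data, the model that never flags anyone reaches 93% accuracy while catching no cancers at all, whereas the second model scores lower on accuracy but finds six of the seven cases. Which behavior matters more depends on the cost of a missed case versus a false alarm, which is the kind of trade-off reasoning the interview question is probing for.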