4 answers
Leandro (Leo)’s Answer
Machine learning (ML) is a broad topic, and the essential skills depend on the area you intend to work in. It splits into roughly three areas:
1. Data Engineering: the ability to discover and transform data into valuable features is critical, if not the most important task, for successful ML applications. The essentials continue to be SQL and NoSQL (including graph databases). Master those and it will be much easier to learn other technologies such as Hadoop and Spark, as well as the feature engineering modules of Python libraries such as Pandas and Dask.
2. Model Training: the essential skill is data structures (with special attention to vectors and matrices). Read chapters 2 and 4 of the book "Data Science from Scratch". The book is also great for software engineers who want to do data science. The time spent is worth it, and higher-level libraries (e.g. scikit-learn) will make much more sense afterwards.
3. Model Deployment: at this last stage, it is all about DevOps. Spend your time learning the 7 key practices of DevOps, and MLOps will follow.
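To make the first two areas a bit more concrete, here is a minimal sketch in plain Python. It assumes a made-up `orders` table and made-up weights (neither comes from a real project): SQL turns raw rows into one feature vector per customer, and a hand-written dot product scores those vectors. It is only an illustration of the idea, not how any particular library does it.

```python
import sqlite3


def build_features(conn):
    """Aggregate raw order rows into one feature vector per customer using SQL."""
    rows = conn.execute(
        """
        SELECT customer_id,
               COUNT(*)    AS n_orders,
               AVG(amount) AS avg_amount
        FROM orders
        GROUP BY customer_id
        ORDER BY customer_id
        """
    ).fetchall()
    # Each customer maps to the vector [n_orders, avg_amount].
    return {cid: [float(n), float(avg)] for cid, n, avg in rows}


def dot(v, w):
    """Dot product of two equal-length vectors."""
    assert len(v) == len(w), "vectors must have the same length"
    return sum(v_i * w_i for v_i, w_i in zip(v, w))


def mat_vec(matrix, vector):
    """Multiply a matrix (a list of row vectors) by a vector."""
    return [dot(row, vector) for row in matrix]


# Hypothetical in-memory data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (1, 30.0), (2, 5.0)],
)

features = build_features(conn)
weights = [0.5, 0.1]  # hypothetical model weights, not learned from data
scores = mat_vec([features[1], features[2]], weights)
print(features, scores)
```

Once this makes sense, the same two steps map onto a Pandas `groupby` and a NumPy matrix product, which is why the fundamentals pay off when you move to the higher-level libraries.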
I hope it helps.
Great answer, Leo! Couldn't have said it better myself.
Eric Loxton
Jerin’s Answer
All the previous comments give good insight into your question. ML is a wide area, and the knowledge you need depends on the requirements. The software is also evolving day by day, with new tools that make development easier. However, a few programming languages are popular for predictive and prescriptive analytics: Python, Java, R, and SQL.
Vikas’s Answer
To add to the two previous posts, which have great advice, I wanted to pen down some thoughts.
Software engineers in machine learning are in high demand nowadays, as data scientists come from a variety of backgrounds (not necessarily computer science). The roles are now clearly separated within teams: the data science folks build ML models, and the software engineers are involved mainly with model deployment and MLOps.
I suggest trying to figure out which part of data science you like more: the machine learning and math involved in building models, or the software engineering side, which is mainly aimed at deployment. These are distinct areas that you could specialize in.
The software engineering side of data science includes understanding data engineering, hosting a model as an API, technologies like Kubernetes (infrastructure management), and model monitoring. ML models degrade over time as the data changes, and they require recalibration. In most teams, as a software engineer you would be tasked with developing services to track model performance and results after deployment, and with setting up modules to automatically refresh and retrain models post-deployment.
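As a rough illustration of the monitoring idea above, here is a small Python sketch. The class name `AccuracyMonitor`, the window size, and the threshold are all hypothetical choices of mine; a real service would also need to collect the true outcomes and trigger an actual retraining job, which this sketch leaves out.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag degradation."""

    def __init__(self, window=100, threshold=0.8):
        # Only the last `window` outcomes are kept; older ones drop off.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, prediction, actual):
        """Store whether a deployed prediction matched the observed outcome."""
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        """Rolling accuracy, or None if nothing has been recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy is below the threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        acc = self.accuracy()
        return window_full and acc is not None and acc < self.threshold


# Hypothetical usage: a tiny window so the effect is easy to see.
monitor = AccuracyMonitor(window=4, threshold=0.75)
for predicted, actual in [(1, 1), (0, 0), (1, 1), (0, 0)]:
    monitor.record(predicted, actual)
healthy = monitor.needs_retraining()

# The data shifts and the model starts getting things wrong.
for predicted, actual in [(1, 0), (0, 1)]:
    monitor.record(predicted, actual)
degraded = monitor.needs_retraining()
print(healthy, degraded)
```

Waiting for a full window before raising the flag is a deliberate choice here, so that a couple of early mistakes do not trigger a retrain on their own.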
As a data scientist, the role is more business-facing. You would be expected to thoroughly understand the business need, translate it into an analytic solution, and solve it using logic or an ML model. There is a somewhat heavier requirement on communication and presentation of results, as you will likely be communicating with people who may or may not have a background in data science and statistics.
Hope this helps, and best of luck!
Nicole’s Answer
Hi Jiarong J. Thanks for your awesome question.
I agree with the previously provided comments, and I will add a few. Software engineering and machine learning both involve lots of trial and error. When individuals are building models or writing code for a specific result, it sometimes takes a little while to achieve it. And sometimes, once the desired result IS achieved, there may be a need to change the model or code inputs or logic.
An individual who is patient and tenacious, and who knows great places to go for walks ;), will enjoy making great discoveries and will likely maintain a sustainable career in this space.
Hope this is helpful, and best of luck to you!