Continuous Learning Can Help Close the Data Science Skills Gap
Here's how organizations will benefit from developing their own data science continuous learning programs.
December 5, 2022
When discussing the most in-demand IT jobs for 2022 and beyond, "data scientist" regularly appears at the top of many lists. The problem is that there aren't enough potential data scientists with the proper skill sets to meet the current demand. In fact, a 2022 survey found that 56% of respondents believed that insufficient talent or headcount in data science is one of the biggest barriers to the successful enterprise adoption of data science.
To overcome this barrier, organizations should consider developing their own data science continuous learning programs to complement what is being taught in schools. Such programs can be instrumental in exposing budding data scientists to professionals and processes involved in bringing data models to life. The programs can also help these individuals understand where they fit in the machine learning operations (MLOps) lifecycle while helping companies find much-needed talent.
Data Scientist Needed: 'Works Well with Others'
One of the key job requirements for a data scientist is being able to work well with others. That may sound surprising, since the archetypal picture of a data scientist is someone working in isolation with their models, but modeling is only one part of the job.
In reality, creating an intelligent application or model is a complex process that involves several steps, including:
Understanding the business goals behind the application
Gathering and preparing the data for the application
Developing the model
Deploying the model
Monitoring and managing the model post-deployment
Each of these steps is handled by different members of the MLOps team, which includes business leaders, data engineers, developers, and IT operations managers. There's naturally some crossover between responsibilities for the different steps. Still, data scientists are involved in all of them in some way and must routinely interact with other leaders to get their models deployed. Thus, the ideal data scientist has both business acumen and technical expertise and is just as comfortable working on models as they are collaborating with their peers.
Providing Hands-on Opportunities for Future Data Scientists
Enterprises have a unique advantage when it comes to teaching the collaborative "soft skills" described above. That's because many organizations already have MLOps teams in place or are currently building out their MLOps capabilities.
These teams effectively provide built-in resources to help train the next generation of data scientists. Organizations can invite prospective data scientists, from inside or outside their companies, into a hands-on continuous learning program where they learn how the teams work, which tools they use, and what personalities are involved.
Within these programs, prospects can work side by side with current data scientists to understand how they interact with their colleagues and observe the subtleties that come with those interactions. They can ask questions: What's the best way to hand over models to a developer? How do operations managers like to work with data scientists? How can I ensure my model will end up becoming a useful and deployable application?
Learning the Tools of the Trade
Students can also gain valuable experience with tools fundamental to MLOps. These could include Jupyter Notebooks, Apache Spark, Python, and other technologies that are commonplace among data scientists. Exposure to the wider team's work will also help them better understand the technologies that developers use to create their code and applications, as well as the solutions that data engineers use to extract and transform data.
While data science students do not need to become experts on technologies like containers or the latest MLOps governance software, it's helpful to know who uses what technology and why. Every additional bit of knowledge provides the student with a better understanding of how each person on the team works and helps them work more closely with team members so they can deliver models more effectively.
The Benefits for Organizations
Creating a continuous learning experience for data scientists is also beneficial to companies. For example, organizations can train prospective data scientists on their own unique processes and introduce potential new employees to their MLOps culture. If students show the necessary aptitude, the organization may consider hiring them and adding to its data science ranks. Thanks to the continuous learning program, the organization will have a new employee who is already trained and ready to hit the ground running on day one.
At the very least, companies with continuous learning programs show they are interested in investing in their employees. That alone can be a difference-maker that energizes an employee base, and that base will hopefully include a lot more future data scientists.
Audrey Reznik is senior principal software engineer at Red Hat.