All the evidence suggests it's tough to get hold of talented data scientists – and that's even true at NASA, says David Meza, acting branch chief of people analytics and senior data scientist at the US space agency.
So what does a skills gap look like at NASA? Meza says his team is still taking a "deep dive" into the organisation's data science talent demands – but clear patterns are emerging, particularly in terms of identifying capability that already exists within the organisation.
"One of the biggest challenges has been to identify where our data science skills are within NASA. It's not a terminology or an occupation that's been labelled data science within the government. It's still something that's in development to have a work role or an occupation of 'data science'," he says.
What's clear so far is that NASA has data talent across the organisation, some of which is not easily identified or categorised due to the wide range of work taking place at the space agency.
"You know, we're NASA, so we're doing a lot of that type of stuff," says Meza. "We need to identify how it's being done, in which organisations, what type of analytics they're doing, and then from there also to identify the data literacy of our other workforce."
That's where Meza's organisation comes in, with the team working on the creation of a workforce talent-mapping database to identify the data skills required for all kinds of projects, whether that's getting back to the Moon, going to Mars or working on scientific endeavours closer to home, such as climate change, aeronautical engineering or medical research.
"There's a wide range of things that we do within NASA. So we have a wide range of data sets and skill sets that we need to identify and make sure that we have the right people in the right place," says Meza.
Fittingly then, the solution to filling the data science skills gap at NASA lies in data science itself. The talent-mapping database that Meza is developing uses Neo4j technology to build a knowledge graph, which is designed to show the complex and varied relationships between data – and in this case, the relationships between people, skills and projects at NASA.
An experienced data scientist, Meza first started using Neo4j's knowledge graph technology more than a decade ago. He has used that experience to craft a talent-mapping database that helps match employees with new projects based on their skill sets.
"That's a knowledge graph problem for most cases – anytime you look at people, there's usually some kind of relationship and connection you're looking for and it just made sense to use graphs for that," he says.
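To make the idea concrete, here is a minimal sketch of how people, skills and projects can be linked as a graph and then queried for a match. It uses plain Python rather than Neo4j, and every name in it (the people, skills, projects and the `match_people` function) is a hypothetical illustration, not part of NASA's actual database.

```python
from collections import defaultdict

# Hypothetical edges: (person)-[HAS_SKILL]->(skill)
HAS_SKILL = [
    ("Ana", "machine learning"),
    ("Ana", "python"),
    ("Ben", "data modelling"),
    ("Ben", "python"),
    ("Cal", "technical writing"),
]

# Hypothetical edges: (project)-[REQUIRES]->(skill)
REQUIRES = [
    ("Lunar imagery", "machine learning"),
    ("Lunar imagery", "python"),
    ("Climate archive", "data modelling"),
]


def build_index(edges):
    """Group edge targets by their source node."""
    index = defaultdict(set)
    for source, target in edges:
        index[source].add(target)
    return index


def match_people(project, has_skill, requires):
    """Rank people by how many of the project's required skills they hold."""
    person_skills = build_index(has_skill)
    needed = build_index(requires)[project]
    ranked = []
    for person, skills in person_skills.items():
        shared = skills & needed
        if shared:
            ranked.append((person, sorted(shared)))
    # Most shared skills first, then alphabetical by name.
    ranked.sort(key=lambda item: (-len(item[1]), item[0]))
    return ranked


matches = match_people("Lunar imagery", HAS_SKILL, REQUIRES)
print(matches)
```

In a graph database the same question would typically be expressed as a path query from a project through its required skills to the people who hold them; the set intersection here stands in for that traversal.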
The project is still in its implementation phase. Meza says the first six to eight months were spent on research and development. His team focused on creating an occupational taxonomy, which analysed the various components of a role from an employee, training and project perspective. They used O*NET, a database from the US Department of Labor that holds descriptions and skill sets for hundreds of occupations.
Meza used this taxonomy to create the basis for his Neo4j graph database. Capturing those components allowed his team to build a model and to start identifying people with skills in specific occupations. That model highlights the kinds of abilities that other individuals in NASA might need to complete tasks in each occupation successfully.
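A rough sketch of what such a taxonomy-driven model might look like: each occupation carries a set of skills and tasks, and a person is scored against every occupation by the share of its skills they already hold. The occupations, skills and the coverage formula below are illustrative assumptions; they are not taken from O*NET or from Meza's model.

```python
from dataclasses import dataclass, field


@dataclass
class Occupation:
    """One taxonomy entry; the skills and tasks are hypothetical examples."""
    title: str
    skills: set = field(default_factory=set)
    tasks: set = field(default_factory=set)


TAXONOMY = [
    Occupation(
        "Data Scientist",
        {"statistics", "python", "machine learning"},
        {"build models"},
    ),
    Occupation(
        "Management Analyst",
        {"statistics", "reporting", "process analysis"},
        {"review workflows"},
    ),
]


def coverage(person_skills, occupation):
    """Fraction of the occupation's skills that the person already has."""
    if not occupation.skills:
        return 0.0
    return len(person_skills & occupation.skills) / len(occupation.skills)


def rank_occupations(person_skills, taxonomy):
    """Return (title, coverage) pairs, best-covered occupation first."""
    scored = [
        (occ.title, round(coverage(person_skills, occ), 2)) for occ in taxonomy
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


ranking = rank_occupations({"statistics", "python"}, TAXONOMY)
print(ranking)
```

A simple coverage ratio treats every skill as equally important. A real model would likely weight skills and validate them with employees, as the next phase of the project describes.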
Research suggests that identifying and then finding data science talent is a common business challenge. More than two-thirds (69%) of businesses have found it difficult to fill at least one data vacancy in the past two years, reports Ipsos Mori.
In fact, most companies are flying "data blind" with regard to finding the skills they need, says tech analyst Gartner. More than half (53%) of businesses say the inability to identify in-demand skills is the biggest impediment to business transformation. What's more, 31% say they have no way to identify market-leading skills.
That lack of awareness is untenable in a fast-moving world, where strength in accessing and cultivating digital and data talent is likely to be the key to helping organisations in all sectors gain a competitive advantage over their rivals. Business leaders must look to identify and elevate current employees' skill levels in specific data and analysis fields, says consultant Deloitte, such as machine learning, data analytics, data modelling, data architecture, and data engineering.
The next phase of the NASA project involves getting employees to validate the skills and tasks associated with each occupation and to train the model. Meza and his team will then formalise the end-user application and create an interface, hopefully by year-end, to help people in NASA search for talent and potential job opportunities.
Meza expects the final application to provide benefits for both managers and employees: managers will be able to track and predict where the organisation is missing skills and where it might need to shift its training programmes; employees will be able to see what opportunities exist and they'll get a roadmap of how they'll need to upskill to fill any skills gaps.
"In other words, if I'm a management analyst and I want to go more into business intelligence, what skills do I have as a management analyst that are comparable to a business intelligence analyst and what do I need to learn?"
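The comparison Meza describes comes down to two set operations: the skills the two occupations share, and the skills only the target occupation needs. The sketch below shows that logic with made-up skill lists; the `OCCUPATION_SKILLS` mapping and the `skill_gap` function are hypothetical and do not reflect real occupational data.

```python
# Hypothetical skill sets for two occupations (illustrative only).
OCCUPATION_SKILLS = {
    "management analyst": {"reporting", "process analysis", "statistics"},
    "business intelligence analyst": {
        "reporting",
        "statistics",
        "sql",
        "dashboard design",
    },
}


def skill_gap(current, target, occupation_skills):
    """Split the target role's skills into those already held and those to learn."""
    have = occupation_skills[current]
    need = occupation_skills[target]
    return {
        "transferable": sorted(have & need),
        "to_learn": sorted(need - have),
    }


gap = skill_gap(
    "management analyst", "business intelligence analyst", OCCUPATION_SKILLS
)
print(gap)
```

The `to_learn` list is the starting point for the kind of upskilling roadmap the application is meant to give employees.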
Meza has already learnt some important lessons about using graph technology. As with any AI or machine-learning project, his team has had to make sure it operates with ethics firmly in mind. They've worked with psychologists within NASA's human capital workforce function to ensure the models they're using aren't inaccurate or biased against certain populations.
Leading this project has also helped Meza to create a sharper sense of NASA's data scientists of the future. He envisages a more cohesive and collaborative group of experts who share information, knowledge and models. The end result will be the development of broad data science expertise across NASA that can be applied to projects on demand.
"We still have a lot of relationships and knowledge to share across each of our domains," he says. "And this is where I think the knowledge graphs will definitely help us to identify how each of our individual domains may be connected in certain ways."