Advancing human exploration: Is space the final frontier, and how can data and AI get us there?

Fifty years after the moon landing, it's not just NASA working on what many consider the final frontier for humanity: space travel. NASA, however, is special, and one of the reasons is that data is at the heart of what it does.
Written by George Anadiotis, Contributor

Fifty years is a long time by human standards, and an eon by technology standards. In 1969, not many organizations even knew what a computer was, let alone used one. Trivial as it may seem, comparing the compute power of that era with what we have now helps us appreciate the effort behind the achievement that the moon landing was.

The scale of our compute and storage capabilities has changed dramatically as Moore's law has been in full effect. Like many "laws," Moore's law is more like a rule of thumb, stating that the number of transistors in a dense integrated circuit doubles about every two years. So, theoretically, our compute power roughly doubles every two years.
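As a rough check on that rule of thumb, the arithmetic can be sketched in a few lines of Python. The two-year doubling period comes from the text; the 50-year span (1969 to 2019) and the function name are assumptions made for illustration, and real-world performance does not track transistor counts exactly.

```python
def moores_law_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Idealized growth factor when capacity doubles once per period."""
    return 2 ** (years / doubling_period_years)


# 50 years at one doubling every two years means 25 doublings.
factor = moores_law_factor(50)
print(f"{factor:,.0f}x")  # 33,554,432x
```

Under these idealized assumptions, 25 doublings give a factor of roughly 33.5 million.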

In 1969, astronauts had access to only 72KB of computer memory. By comparison, a 64GB cell phone today carries almost a million times more storage space. Pioneer software engineers like Margaret Hamilton (who is also credited with coining the term "software engineering") had to use paper punch cards to feed information into room-sized computers with no screen interface.

Certain things remain unchanged, compared to 50 years ago. The need for a holistic view is one. Hamilton, for example, viewed the Apollo mission as a system: "part is realized as software, part is peopleware, part is hardware." 

Margaret Hamilton, lead software engineer of the Apollo spacecraft in 1969. The paper threatening to fall over on her is some of her assembler source code. And you think programming is hard today!

Outsourcing is another thing that has not changed. In 1969, the code that Hamilton and others wrote was sent to a Raytheon factory. There it was woven into a long "rope" of wire, encoding ones and zeroes. This was a workaround for the Apollo computers' limited memory. Today, contractors still undertake significant parts of NASA's projects.

One of the most high-profile among those projects is Orion, a collaborative project involving NASA and ESA that is currently under development. Orion is intended to be the main crew vehicle for the Artemis lunar exploration program, as well as for potential crew flights to asteroids and Mars.

Keeping it real-time with streaming data

Just a few days back, Orion was successfully launched on a test mission into Earth's atmosphere. The main objective of this mission was to collect and analyze data from 12 data recorders that were ejected during the test capsule's descent. Analysis of the information will provide insight into the abort system's performance. 

In a presentation given in 2015, Lockheed Martin experts expanded on the data collection needs for Orion and the related infrastructure they have implemented. A total of about 350,000 parameters are monitored for Orion's operation via 1,200 telemetry sensors, each sending measurements 40 times per second. This amounts to 2TB of data captured each hour.
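To get a feel for these figures, here is a back-of-the-envelope sketch. It assumes that every one of the 350,000 parameters is sampled 40 times per second (the figures could also be read as 40 messages per second per sensor) and that 2TB means 2 x 10^12 bytes. The function and variable names are made up for this illustration.

```python
PARAMETERS = 350_000        # monitored parameters (from the presentation)
SAMPLE_RATE_HZ = 40         # measurements per second (assumed per parameter)
BYTES_PER_HOUR = 2 * 10**12  # 2TB per hour, decimal terabytes assumed


def telemetry_rates(parameters: int, rate_hz: int, bytes_per_hour: int):
    """Return (samples/s, bytes/s, bytes per sample) for the given figures."""
    samples_per_second = parameters * rate_hz
    bytes_per_second = bytes_per_hour / 3600
    return samples_per_second, bytes_per_second, bytes_per_second / samples_per_second


samples, rate, per_sample = telemetry_rates(PARAMETERS, SAMPLE_RATE_HZ, BYTES_PER_HOUR)
print(f"{samples:,} samples/s, {rate / 1e6:.1f} MB/s, {per_sample:.1f} bytes per sample")
```

Under these assumptions the stream carries about 14 million measurements per second at roughly 556 MB/s, or about 40 bytes per measurement, which is plausible for a numeric value plus a timestamp and some metadata.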

Viewed simplistically, it may be tempting to say 2TB of data is not an awful lot. It's still something most of us can relate to -- about the size of a modern hard disk. But there is some critical nuance here. Unlike the average user's disk, the (imaginary) hard disk that Orion's measurements fill up every hour is not mostly taken up by images and videos.

2TB of measurements, in other words mostly numeric values (and perhaps some metadata), makes for a lot of numbers. Keeping track and making sense of all that data is not entirely straightforward. Especially if we consider this has to be done in real-time, for obvious reasons: Orion's navigation and the crew's safety depend on this. This is a case for real-time, streaming data analytics if there ever was one. 


Joint NASA and ESA space venture Orion produces great volumes of data, which need to be analyzed in real-time. Image: Lockheed Martin

Traditionally, engineers responsible for analyzing this telemetry data would watch a handful of the indicators on the real-time monitors as a test progresses, or in review. Certain behavior and reporting will result in further analysis on a few other measurements. 

Most values of most telemetry measurements are ignored if they are not out of limits. Specific studies are done on some measurements in historical context, usually after detection of anomalous behavior, to determine if the behavior has been observed in the past. 

Today, automation has been applied to the problems of collecting data, running scripted tests, and detecting out-of-limit values. Lockheed Martin's presentation showed how platform developers, data scientists, and subject matter experts worked together to build the infrastructure required for this. 
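Out-of-limit detection, in its simplest form, can be sketched as follows. This is a minimal illustration and not Lockheed Martin's implementation: the parameter names, limit values, and function name are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical parameters and (low, high) limits, for illustration only.
LIMITS: Dict[str, Tuple[float, float]] = {
    "cabin_pressure_kpa": (95.0, 105.0),
    "battery_temp_c": (-10.0, 45.0),
}


def out_of_limit(
    samples: Iterable[Tuple[str, float]],
    limits: Dict[str, Tuple[float, float]],
) -> List[Tuple[str, float]]:
    """Return the samples whose value falls outside the configured limits."""
    flagged = []
    for name, value in samples:
        bounds = limits.get(name)
        if bounds is None:
            continue  # no limits configured for this parameter; skip it
        low, high = bounds
        if not (low <= value <= high):
            flagged.append((name, value))
    return flagged


stream = [
    ("cabin_pressure_kpa", 101.3),
    ("battery_temp_c", 51.2),
    ("cabin_pressure_kpa", 93.8),
]
print(out_of_limit(stream, LIMITS))
```

Only the flagged samples are passed on for closer inspection, which mirrors the practice described above of ignoring values that stay within limits.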

Several tools were used, bundled together in a Lambda Architecture composed of Apache open source projects: Data is ingested by Kafka, processed by Spark, and stored in Hadoop HBase over HDFS. Other tools and techniques used include SAS, SPSS, R, MATLAB, Python+SciPy, SQL, natural language processing, Bayesian models, and Petri nets, as well as Tableau, D3 or other JavaScript libraries, and Cesium for visualization and communication.

This is quite a stack, and that's not necessarily a good thing. By today's standards, improvements could be made to this architecture. Most notably, a Lambda Architecture features separate batch and real-time processing layers, which introduces complexity and duplicated logic.

The Lambda Architecture was a design built out of necessity, as streaming platforms were not mature enough to handle all data processing at the time. Today, however, streaming platforms have matured, and the flattened Kappa Architecture is the architecture of choice for new deployments.
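The core idea behind the Kappa Architecture can be illustrated with a toy, in-memory sketch: there is a single processing path over an append-only log, and recomputing a view means replaying that log rather than maintaining a separate batch layer. Everything here (the event schema, the function names, the list standing in for a log such as a Kafka topic) is an assumption made for illustration, not how Orion's infrastructure works.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[str, float]  # (parameter name, value); hypothetical schema


def replay(log: List[Event], update: Callable[[Dict, Event], None], start: int = 0) -> Dict:
    """Build a view by applying `update` to each event from offset `start`."""
    view: Dict = {}
    for event in log[start:]:
        update(view, event)
    return view


def track_max(view: Dict, event: Event) -> None:
    """Keep the highest value seen so far for each parameter."""
    name, value = event
    view[name] = max(value, view.get(name, value))


def track_count(view: Dict, event: Event) -> None:
    """Count how many events were seen for each parameter."""
    name, _ = event
    view[name] = view.get(name, 0) + 1


# A plain list stands in for the immutable event log.
log: List[Event] = [("temp", 20.5), ("temp", 22.0), ("pressure", 101.3), ("temp", 21.1)]

print(replay(log, track_max))    # the current view
print(replay(log, track_count))  # new logic: replay the same log from the start
```

When the processing logic changes, the same log is simply replayed through the new function, so there is only one code path to maintain.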

Exploring lessons learned with graph databases

NASA has also built its own streaming analytics platform called Streams. It was developed using a rapid prototyping methodology and uses techniques from the stock market, open-source software, open and hyper-agile development, and an analytics cloud. Within weeks, NASA's Jet Propulsion Laboratory was able to infuse new technology into missions that used to take years to accomplish.

But how does NASA search through its troves of data on past missions and research? David Meza, chief knowledge architect at NASA, started exploring graph databases toward this goal around the same time Lockheed Martin was building Orion's real-time analytics infrastructure. Meza is adamant about the importance of reviewing lessons learned before starting a new project:

"Lesson learned databases are filled with nuggets of valuable information to help project teams increase the likelihood of project success. Why then do most lesson learned databases go unused by project teams? In my experience, they are challenging to search through and require hours to review the result set.

Recently I had a project engineer ask me if we could search our lessons learned using a list of 22 key terms the team was interested in. Our current keyword search engine would require him to enter each term individually, select the link, and save the document for review. [...] This would not do.

I asked our search team if they would run a special query against the lesson database only, using the terms provided. They returned a spreadsheet with a link to each document containing the term or terms. Over 1,100 documents were on the list. The engineer had his work cut out for him. I started thinking there had to be a better way."

Meza began to explore, and before long he concluded a graph database was the best way to connect the data needed to achieve what he wanted. He created a graph model for NASA's lesson learned database, imported the data into a graph database, used graph queries to navigate and narrow down results, and used graph visualization to explore further.
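To give a flavor of why connecting lessons to terms helps with a multi-term search like the one in the engineer's request, here is a minimal sketch using plain Python dictionaries as a stand-in for a graph. It is not Meza's model and does not use a graph database; the lesson identifiers, terms, and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical lessons and the terms they mention; not NASA data.
LESSON_TERMS: Dict[str, Set[str]] = {
    "lesson-1": {"valve", "corrosion"},
    "lesson-2": {"valve", "software"},
    "lesson-3": {"software", "testing"},
}


def build_term_index(lesson_terms: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Create term -> lessons edges from lesson -> terms edges."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for lesson, terms in lesson_terms.items():
        for term in terms:
            index[term].add(lesson)
    return index


def rank_lessons(index: Dict[str, Set[str]], query_terms: Iterable[str]) -> List[Tuple[str, int]]:
    """Rank lessons by how many of the query terms they are connected to."""
    hits: Dict[str, int] = defaultdict(int)
    for term in query_terms:
        for lesson in index.get(term, ()):
            hits[lesson] += 1
    # Most matching terms first; ties broken alphabetically for stable output.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


index = build_term_index(LESSON_TERMS)
print(rank_lessons(index, ["valve", "software"]))
```

Instead of one flat list of every document containing any term, all terms are queried at once and the lessons connected to the most terms surface first.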

Meza noted his belief that this stack would allow him to explore and visualize data in ways his search engine at that time could not. Back in 2015, that was work in progress, and graph databases were a nascent market. Early adopters like NASA's Meza showed how to provide users with a more effective search experience, reducing time to find answers and allowing them to start their project on the right foot.

In naming graph DBs one of the 10 biggest data and analytics trends of 2019, Gartner predicted that the category will grow a whopping 100 percent annually through 2022 "due to the need to ask complex questions across complex data, which is not always practical or even possible at scale using SQL queries."

Is space the final frontier? What does the data tell us?

These examples paint a picture. Progress and mass adoption in compute and storage technology mean that NASA no longer needs to innovate in those areas to be able to achieve its mission. It seems that NASA these days is more of an early adopter than an innovator when it comes to this technology, and this is perfectly fine.

This kind of data infrastructure is the substrate on which AI is built. This AI is what is meant to power grandiose plans such as navigating a Mars landing. But without denying the value of scientific endeavor, isn't there a striking absurdity in committing billions to reach space where no people live, while only a fraction of that amount is appropriated to service the densely populated slums? 

This (slightly paraphrased) question was initially asked by Martin Luther King in 1967. The question and the debate remain strikingly relevant in 2019. If anything, today the debate is richer. Sylvia Earle, explorer-in-residence at the National Geographic Society and Time magazine "hero for the planet" stated in 2004: 

"The resources going into the investigation of our planet and its oceans are trivial compared to investment looking for water elsewhere in the universe. Real oceans need scientific attention more than the dried-up remnants on Mars." She said that she does not want to cut funding for space science, but noted that "we have better maps of Mars than our own ocean floor. That's just not right." 

Today, besides NASA, two of the world's wealthiest and most powerful people, Jeff Bezos and Elon Musk, are also pursuing space exploration. Both claim their space ventures are a way to save humanity: Bezos by moving polluting industries off Earth, Musk by colonizing other planets.

However, scientists such as cosmologist Martin Rees of Cambridge University, who recently published a book about existential threats to our world, are critical of such views: "I think that's a dangerous delusion because Mars will be a more hostile environment than the top of Everest or the South Pole, and dealing with climate change here on Earth is far more important than terraforming Mars."

In the meantime, NASA has been doing more than planning space missions. NASA is also an open data provider. In the 1970s, NASA's planetary exploration budget fell dramatically. It was then that the agency got into the business of studying our home planet from orbit. It was also a time when people were beginning to realize that our climate could change relatively fast, on the scale of the human lifespan.

Today, NASA notes, we know that our climate is changing at an unprecedented rate and that humans are a key part of that change. NASA continues to launch new satellite missions and also relies on aircraft, as well as scientists on the ground, to take vital measurements of things like snowpack and hurricanes, augmenting the big-picture view we get from space.

Perhaps the final frontier for exploration is not space, but humanity itself, and our planet. NASA is using space to collect data around the globe and the clock, making it available to the scientific community. Data, not rockets, may prove to be NASA's most significant contribution to humanity.
