Making sense of big data

How do you know what you need to know, before you know you need to know it? Find out in this Q&A with Teradata Labs president Oliver Ratzesberger.
Written by Colin Barker, Contributor

Teradata has been selling data warehousing technology since long before big data analytics became the next big thing.

Teradata's Oliver Ratzesberger: "We can scale linearly -- or close to that -- the largest data volume of any company."

Teradata specialises in analytic data platforms, marketing applications, and related software, which are becoming an essential component of any company's data strategy.

ZDNet discussed the company's evolution with the president of Teradata Labs, Oliver Ratzesberger.

ZDNet: Over the years Teradata has changed and evolved, so what exactly does it do now?

Ratzesberger: Teradata today is focused on large-scale analytics, so we target any company that has data, whether structured, semi-structured, or unstructured, and needs that data to scale. That is not so much the OLTP side of things; it's more a matter of taking all of these transactions -- billions of them -- and building something. That means taking all of these transactions and bringing them into a Teradata database or what we call a Unified Data Architecture.

So if we look at something like eBay -- a customer of ours -- they handle something like 10 trillion website clicks, and we can relate that to hundreds of billions of transactions over multiple years. When we do that we can see seasonal patterns, shifts in behavior, and do experimentation.

If you look at advanced companies nowadays, they are competing based on data analytics and trying to leverage the data so that they can make better decisions. Now that could be for activity-based costing, efficiency, security, customer behavior, or product feature development.

How can you deal with the data and make smarter decisions based on the actual user behaviour that you see from customers? It could be as detailed a change as: 'Should we have a large image or a small image?' All of this means that when they search for something, they can find the image in fewer clicks. Or it could be, how do you help the customer get to the checkout faster?

So who are your customers?

Our target customer base is the top 3,000 to 5,000 companies in the world, of which we have about 1,500 of the largest. The Fortune 3,000 are our customers, whether that is a Vodafone here in the UK or an AT&T in the US. It is banks like HSBC. Companies like VW and Maersk when you look at transportation. Pretty much all the airlines in the world are customers of ours. Anyone, basically, who has large-scale data.

So what's your unique selling point, or USP? Is it your size?

Our unique proposition is that we can scale linearly -- or close to that -- the largest data volume of any company. Many of our customers have either thousands of stores or millions of customers. So at that level, first of all they produce a lot of data. They might have terabytes of data or they might generate hundreds of terabytes every day, all of which has to be processed.

They usually have a lot of employees who need that data. Not only do they have to run that data but they have to be able to ask, 'What if I ask this?' That can be a new question that has not been asked before.


But with thousands of users there will be thousands of different questions, so the application has to be able to scale and scale concurrently. Then, in terms of depth, there are the types of analytical capability that we at Teradata have. For example, we have a technology that we call Teradata Aster, which makes very complex questions very easy to answer.

So, for example, you can have multi-channel pathing. Let's say, a customer comes to a bank branch or an ATM. He comes back to the website and may make a call as well. Now to visualize all of the different kinds of interaction from that customer is a difficult task for companies.

If you take Aster, we have functions in there that, as long as you know SQL, you can call like any other function as part of an SQL question. Then we have a function that we call nPath, which you can use to pose pattern queries like, 'show me any customer who was on the website at least three times, has been in a branch at least once and so on', and the function will give you all of the customers that match that kind of pattern across multiple channels.
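To make the kind of query described above concrete, here is a minimal Python sketch of the idea. It is not Aster's nPath syntax or API: the `Event` type, the `customers_matching` function, the channel names, and the sample data are all invented for illustration, and the sketch only counts interactions per channel without considering the order in which they happened.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    """One customer interaction (hypothetical structure, for illustration only)."""
    customer_id: str
    channel: str  # e.g. "web", "branch", "atm", "call"


def customers_matching(events: Iterable[Event], minimums: dict[str, int]) -> set[str]:
    """Return the customers whose per-channel interaction counts meet every minimum."""
    counts: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for event in events:
        counts[event.customer_id][event.channel] += 1
    return {
        customer
        for customer, per_channel in counts.items()
        if all(per_channel.get(channel, 0) >= needed
               for channel, needed in minimums.items())
    }


# Made-up sample data: only "alice" has three web visits and a branch visit.
events = [
    Event("alice", "web"), Event("alice", "web"), Event("alice", "web"),
    Event("alice", "branch"), Event("alice", "call"),
    Event("bob", "web"), Event("bob", "web"), Event("bob", "web"),
    Event("carol", "web"), Event("carol", "web"),
    Event("carol", "branch"), Event("carol", "branch"),
]

# "On the website at least three times, in a branch at least once."
matches = customers_matching(events, {"web": 3, "branch": 1})
print(matches)
```

At the data volumes discussed in this interview, the same question would be pushed down to the analytics platform rather than run in application code; the sketch only shows the shape of the pattern being matched.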

Now, most of our B2C customers understand that this concept of omnichannel is really big. So no matter when or where the customer interacts with us, we can stitch this together in near real-time.

So, [a customer service rep would know] 'the person I have on the phone at the moment was on the website just five minutes ago and for some reason couldn't do what they were trying to do'. I need to know that, and I need the type of analytics that can do that at scale. At the same time, it needs to be as easy as possible for companies to implement, without needing low-level programmers to do it.

So your USP is that while this can be easily done with one, or a few, customers, it is a different proposition with hundreds of thousands of customers -- and that's where you come in?

Take eBay, a big customer of ours, as an example. They have tens of petabytes of data that they make available to thousands of users within the company and over time they will track over 200 billion customers. Then they analyze them in thousands of micro-segments and show the customers who wanted this and customers who wanted that. Then they can customer-tailor their offerings and whatnot to them.

But they don't stop just with customer data. They also integrate technology data such as sensor data, IoT data, and so on -- data from their data centers that can be integrated into their capacity planning. So then the system can answer questions like: 'Do we have enough capacity for the website to stay up? At Thanksgiving? Or Christmas?' All of a sudden, datacenter data becomes vital.

If you look at the likes of Siemens, for example, one of their businesses is Siemens Mobility. So the rail connection between Barcelona and Madrid is now handled by Siemens and what they sell is on-time arrivals. If they get 99.99 percent on-time arrivals, they do well and part of that ability is sensor data.

Predictive and preventative maintenance is vital so that they can predict problems before something breaks, which helps them avoid having stuck trains. That is all about machine behavior data.

Then look at the oil companies, which have sensor networks on the ocean floor with thousands of sensors listening so that they can get better pictures of what's underneath the sea bed. This happens in real time so that they can see potential risks before something breaks.

This is all software-only?

It has been, but now we are starting to offer Teradata cloud services through other companies such as Amazon. Now we can offer Teradata appliances, which are hardware and software bundled.

The Teradata appliance from 20 years ago was custom hardware and custom software -- everything was custom.

Today our appliances are standard, open-spec servers from Intel or Dell, standard storage with InfiniBand interconnects where we just do the interconnection. We don't do anything in silicon.

Do you go to datacenter customers and tell them how to do it better?

It depends on the customer. Some want the most highly tuned hardware and we do infrastructures for that. And then there are other customers who are starting to shift to the cloud. Another customer of ours, Netflix, used to have their own datacenters but a couple of years ago they decided to get completely out of datacenters. Their thinking was that they couldn't be nearly as good at datacenters as a company as big as Amazon. Our engineering at the front end is the difference we can offer, not the back end.

So we want to give our customers a choice of deployment. They can use an appliance, they can use our software, or they can do it all in the cloud. Many of our customers go hybrid.

So what is your next move?

We are pushing heavily into the cloud and services architectures. Nowadays a lot of data comes from the cloud and it wants to stay in the cloud -- this concept of data gravity -- and it is processed in the cloud. So you will see us coming out with a lot of new products.

One product we are launching in Q1 is Teradata Listener. That is an enterprise, web-scale listening infrastructure that you can deploy in the cloud, privately or publicly; it integrates peer stream data, or public data.

So rather than go down the traditional path of ETL tools, we can turn this upside down so that any developer in the company (or third parties if you want to open it up) can register the forms of data in much more of a self-serve environment. They can get an API key to an API and then go straight to the destination so as to turn that structure into a listening infrastructure.

We guarantee to listen in real-time and then ask: Where do you want to deposit that data? Do you want it in the system, do you want it in your data centre, or do you want it out there in the cloud?
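The self-serve flow described here (register a data source, receive an API key, send records, choose where they land) can be sketched in a few lines of Python. This is a toy in-memory model, not Teradata Listener's actual API: the `ListeningHub` class, its method names, and the destination labels are all hypothetical and exist only to show the shape of the workflow.

```python
import secrets
from collections import defaultdict


class ListeningHub:
    """Hypothetical in-memory stand-in for a self-serve ingestion service."""

    def __init__(self) -> None:
        # Maps an API key to (source name, chosen destination).
        self._sources: dict[str, tuple[str, str]] = {}
        # Maps a destination label to the records deposited there.
        self.destinations: dict[str, list[dict]] = defaultdict(list)

    def register_source(self, name: str, destination: str) -> str:
        """Register a data source and return the API key it should send with."""
        api_key = secrets.token_hex(16)
        self._sources[api_key] = (name, destination)
        return api_key

    def ingest(self, api_key: str, record: dict) -> None:
        """Accept one record and deposit it in the source's chosen destination."""
        if api_key not in self._sources:
            raise PermissionError("unknown API key")
        name, destination = self._sources[api_key]
        self.destinations[destination].append({"source": name, **record})


hub = ListeningHub()

# A developer registers a stream and picks where the data should be deposited.
key = hub.register_source("checkout-clicks", destination="cloud")

# From then on, records sent with that key go straight to the destination.
hub.ingest(key, {"event": "click", "page": "/checkout"})
print(hub.destinations["cloud"])
```

The point of the design is that the developer who owns the data does the registration, instead of waiting for a central team to build an ETL job for each new source.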

What about governments? Do you have relationships there?

A lot actually. For example, in the US a lot of governments use Teradata apps to audit tax revenues. They use it so they can identify companies that are not paying their taxes.

Then there are a lot more in the smart grid or smart city, part-industry area. We have a lot of people on this. There are more than 10,000 employees here at Teradata of which more than 5,000 are consultants -- some on the technical side and some on the industry side.

For the last two years, I have been working with the Kellogg School of Management, the business school at Northwestern University. We have built a capability maturity model for companies that want to be advanced analytical companies, independent of industry. They want to know, 'what are the capabilities that you need to build?'

So they came up with a capability model called the sentient enterprise, which would be an enterprise that is self-aware. It is all about how to make companies agile with data.

Everybody today is focused on time-to-market and agility, but the problem is that you really need to plan it, design it, and build it. But a lot of companies focus on the Wild West, instead of agility. So they build something and then it becomes a house of cards, and you pull one card out and it falls apart.

A lot of companies are still measuring their business on transactions: 'How many products can I sell on Monday? How will that compare with last Monday? Last year?'

What they should be thinking about is: 'Why did the customer just buy that product? And what is the reason for that? And who else is in that customer base?'

Ultimately then you are talking about collaboration and scale in the business. In a corporation you may have thousands of people working on data, but how do you ensure that they know about each other?
