What's the most important issue when dealing with data? In the old days, like 10 years ago, it used to be how do you organize your data into tables to make it easy to write that to a database, pull the information out, and access it quickly. Now, the issue is how do you deal with data in real time as it streams in to you from all sorts of transactions.
To show what I mean, let's look at where we used to be. We used to have a struggle between flat file formats and relational database formats. In a flat file database, every record looks exactly the same. You put in every conceivable field that you need: first name, address information, last name, the date of a transaction, the state or the principality, zip code, what they bought, the amount they paid for it, whether there were taxes. You ended up with really long files even if each record only used a fraction of those fields. So it was an inefficient way to store the data, and it also made it harder to get the information out.
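To make the flat file problem concrete, here is a minimal sketch in Python. The field list and the helper names (`FLAT_FIELDS`, `to_flat_record`, `unused_fraction`) are illustrative assumptions, not taken from any particular system. It shows that every record carries every field, even when most of them are empty.

```python
# Hypothetical field list: every flat record carries all of these columns.
FLAT_FIELDS = [
    "first_name", "last_name", "address", "state", "zip_code",
    "transaction_date", "item", "amount", "tax",
]

def to_flat_record(data):
    """Pad a partial record out to the full flat layout, using "" for unused fields."""
    return {field: data.get(field, "") for field in FLAT_FIELDS}

def unused_fraction(record):
    """Return the share of fields in a flat record that are empty."""
    empty = sum(1 for value in record.values() if value == "")
    return empty / len(record)

# A contact-only record still occupies all nine columns.
contact_only = to_flat_record(
    {"first_name": "Ana", "last_name": "Ruiz", "zip_code": "78701"}
)
```

In this sketch, `unused_fraction(contact_only)` reports that six of the nine columns are stored but empty, which is the inefficiency described above.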
So when it comes to how you write and retrieve information, a relational database is superior. As you can see, independent tables are joined by a common ID field. So you can have an address table here, you could have a purchase table here, you could have a customer service table, and in each one of these you're only storing the information you need for that particular purpose. But you can always join them together and get a good look at all the transactions and all the information around that particular customer.
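Here is a small sketch of that idea using Python's built-in `sqlite3` module. The table names, columns, and sample rows are made up for illustration. Each table stores only what it needs, and a shared `customer_id` field joins them back together.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with three small tables sharing customer_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE addresses (customer_id INTEGER, city TEXT, state TEXT);
        CREATE TABLE purchases (customer_id INTEGER, item TEXT, amount REAL);
        CREATE TABLE service_calls (customer_id INTEGER, note TEXT);
        INSERT INTO addresses VALUES (1, 'Austin', 'TX'), (2, 'Denver', 'CO');
        INSERT INTO purchases VALUES
            (1, 'lamp', 40.0), (1, 'desk', 250.0), (2, 'chair', 85.0);
        INSERT INTO service_calls VALUES (1, 'desk arrived scratched');
    """)
    return conn

def customer_view(conn, customer_id):
    """Join address and purchase rows for one customer on the common ID field."""
    return conn.execute(
        """
        SELECT a.city, a.state, p.item, p.amount
        FROM addresses AS a
        JOIN purchases AS p ON p.customer_id = a.customer_id
        WHERE a.customer_id = ?
        ORDER BY p.item
        """,
        (customer_id,),
    ).fetchall()
```

Calling `customer_view(build_demo_db(), 1)` returns one row per purchase for that customer, each combined with the address stored once in its own table.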
So we think that we've come a long way in solving the problem of how to write and retrieve information about transactions that have already occurred. But with data streaming, the issue is really different. I've got this huge amount of data coming at me like a hammerhead shark, and I've got to decide: what do I do with this? What do I have to deal with right now? Unlike in the good old days, I can't just write it to a database and deal with it the next day. There are things I've got to do. I've got to make calculations. I've got to queue my inventory routines to see if I'm running out of inventory in a particular item. I've got to be concerned about fraud and make sure that my fraud controls are looking at these transactions in real time.
So retailers, financial institutions, and banks are all dealing with huge amounts of information coming in. What do they have to deal with in real time, and what can they afford to deal with later? So the new issue with data, the new problem, is this huge influx of stuff that's coming in to you. How do you determine what's important and what's not? How do you deal with the important stuff in real time, and how do you do that without losing the information or losing the opportunity to make the calculation and make the sale? So streaming, that's the new issue in data these days.
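One way to picture that triage is the rough Python sketch below. Everything in it is an assumption made for illustration: the thresholds, the transaction fields (`id`, `item`, `qty`, `amount`), and the simple amount-based fraud rule. Real fraud controls and inventory systems are far more involved. The point is only the split between what gets handled as each transaction arrives and what is set aside for later.

```python
from collections import deque

LOW_STOCK_THRESHOLD = 2       # hypothetical reorder level
FRAUD_AMOUNT_LIMIT = 1000.0   # hypothetical, overly simple fraud rule

def process_stream(transactions, inventory):
    """Handle urgent checks per transaction; defer everything else for later."""
    alerts = []
    deferred = deque()  # work that can wait, e.g. a batch write to a database
    for tx in transactions:
        # Real time: flag suspicious amounts as they arrive.
        if tx["amount"] > FRAUD_AMOUNT_LIMIT:
            alerts.append(("fraud_review", tx["id"]))
        # Real time: update stock and warn when an item is running low.
        inventory[tx["item"]] -= tx["qty"]
        if inventory[tx["item"]] <= LOW_STOCK_THRESHOLD:
            alerts.append(("low_stock", tx["item"]))
        # Later: keep the full record so nothing is lost.
        deferred.append(tx)
    return alerts, deferred
```

Because `process_stream` only iterates over `transactions`, it can be fed a generator that yields records as they arrive rather than a finished list. The `deferred` queue stands in for whatever durable storage would hold the records until they are processed later.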



















