Big data deluge: How Dutch water is trying to turn the tide

With the Dutch water system efficiently churning out data from countless incompatible sources, a research project is now trying to work out how to open up the databases.
Written by Toby Wolpe, Contributor

The Netherlands is a few months into its year-long Digital Delta big data research project. When the scheme ends in June 2014, it should provide the Dutch with answers on how to deal with the data pumped out by their water system — and the money pouring into it.

With 55 percent of the population of the Netherlands living under the threat of flooding, water is understandably close to the hearts of the Dutch.

The Dutch Water Ministry's Digital Delta project aims to establish a central registry of data sources, to be built by IBM.

Their government also takes it seriously, spending €7bn ($9.5bn) annually managing water and a network of dykes or levees, canals, locks, harbours, dams, rivers, storm-surge barriers, sluices and pumping stations. That cost could rise to as much as €9bn ($12.2bn) by 2020.

Because of the potential impact of flooding — and drought — on the Dutch population and economy, the authorities run a sophisticated water-management system, constantly monitoring and modelling to understand and anticipate adverse events.

A deluge of data

According to Raymond Feron, programme director for Digital Delta at the Dutch Ministry of Water (the Rijkswaterstaat), the Dyke Data Service Centre database alone handles 2 petabytes (PB) of sensor data annually, while a typical water management project — of which there are up to 100 — can easily generate 10TB to 30TB of structured and unstructured information.
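To put those figures side by side, here is a rough back-of-the-envelope calculation in Python. It assumes decimal units (1PB = 1,000TB) and treats "up to 100" as exactly 100 projects, each in the 10TB to 30TB range; the function name is purely illustrative and the result is an upper-end estimate, not a figure given by Feron.

```python
TB_PER_PB = 1000  # assumption: decimal units, 1 PB = 1,000 TB


def project_volume_range_pb(projects=100, low_tb=10, high_tb=30):
    """Return (low, high) total project data volume in petabytes."""
    return projects * low_tb / TB_PER_PB, projects * high_tb / TB_PER_PB


low_pb, high_pb = project_volume_range_pb()
# Compare with the 2 PB of dyke sensor data quoted per year.
print(f"Project data: {low_pb:.0f} PB to {high_pb:.0f} PB")
```

On those assumptions, the projects together would hold 1PB to 3PB, the same order of magnitude as the Dyke Data Service Centre's annual sensor volume.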

So it's a water system that just can't help generating data. The problem is there's little consistency in its collection and, because of the volumes, finding relevant data is difficult.

"This project combines and interacts with other projects that will generate this information. It is not in itself a data-generation programme. But what the water sector in the Netherlands will benefit from is being able to select the relevant data for your purpose quickly. I'm not looking for big data but for small, relevant data," says Feron.

The trouble is, the amount of data is growing without any control, says Feron, because newer, cheaper sensor technologies are being added to conventional data-collection methods.

"We're seeing a shift in thinking. We used to have well-controlled monitoring programmes for single-purpose, strict-governance, high-quality data," says Feron. "We're seeing a shift to open, flexible, multi-purpose, multi-sensor type of monitoring. This shift gives an exponential shift in data, with a mixture of low- and high-quality and multi-purpose data."

"You can put very cheap sensors in your infrastructure and these sensors will spit out huge amounts of data. In the operational water-management processes, we're not used to getting these new types of data, which may be lower quality than we're used to, but give you a wider geographic view."

One of the aims of the Digital Delta project, which is a collaboration between the Rijkswaterstaat, IBM, local water authority Delfland, the Deltares Science Institute and Delft University of Technology, is to establish a central registry of data sources, which will be built by IBM.
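What such a registry will look like has not been settled. As a minimal sketch of the general idea only, the Python below shows a catalogue that stores descriptions of data sources rather than the data itself, so a user can look up which sources cover a topic. The class names, fields, owners and entries are all invented for illustration and do not describe IBM's design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceEntry:
    """Describes where a dataset lives; the data itself stays with its owner."""
    name: str
    owner: str
    topics: frozenset


class Registry:
    """In-memory catalogue of data-source descriptions, keyed by name."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def find(self, topic):
        """Return the sorted names of all sources tagged with the topic."""
        return sorted(e.name for e in self._entries.values() if topic in e.topics)


# Hypothetical entries, made up for this example.
registry = Registry()
registry.register(SourceEntry("dyke-sensors", "national", frozenset({"dykes", "sensors"})))
registry.register(SourceEntry("polder-levels", "regional", frozenset({"water-levels"})))
registry.register(SourceEntry("hydro-model-output", "institute", frozenset({"water-levels", "models"})))

print(registry.find("water-levels"))
```

The point of the design is that each owner keeps its own database; only the descriptions are shared centrally.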

The idea is that better integrated information will enable authorities to anticipate disasters and reduce the cost of managing water by up to 15 percent.

For example, Feron cites the selective use of sensors in dykes as a way of making more accurate spending decisions.

"We choose a couple of locations and see whether this new sensor technology, measuring all kinds of new things in and on the dyke, can gather information for a better design of a new dyke infrastructure," he said.

"That's an example that directly saves money. If you do good sensoring for one or two years and can prove it's not necessary to do something, then you save money for infrastructure and you can spend it at another location. These new sensors are not meant at this moment to replace existing monitoring, so it's not an efficiency operation for monitoring."

Outcomes, not technology

Project goals include promoting cooperation between national and regional government, local authorities and cities by improving the connections between their various data-collection processes.

"The basic driving force for the government is to keep innovating in the process of water management. But this is not a big IT project. We're focusing on the outcomes and not the technology," maintains Feron.

"We're looking at public-private research and how large companies like IBM can work with small companies. We would like to see the water discipline interact better with other disciplines — so agriculture, environment, city planning, and traffic," he adds.

According to Feron, the large volumes of data in the Dutch system do not originate solely from live data collection, but also from the hydrological models developed by Deltares.

"These models used to be used to make decisions and for planning and predictions, but now they spit out huge amounts of output and this model output is the new input data for other processes," he said.

"The output is several times greater than the data you put into the model so there's an exponential growth. That can be used in other disciplines — wind-farm planning in the North Sea or agriculture or city planning — and then it will be used by other types of experts who may not be sure about its origin — is it real data or model data?"

Big data issues such as this make greater transparency and standardisation a priority for the Digital Delta project.
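The provenance question Feron raises (is it real data or model data?) can be pictured with a small Python sketch in which every dataset carries an explicit origin label. This is an illustration under assumptions, not part of the project: the `Origin` enum, the `Dataset` class and the example dataset names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    MEASURED = "measured"  # produced by a sensor or survey
    MODELLED = "modelled"  # produced as the output of a model


@dataclass(frozen=True)
class Dataset:
    name: str
    origin: Origin
    derived_from: tuple = ()  # names of the input datasets, if any


def describe(dataset):
    """Return a one-line provenance summary a downstream user could read."""
    if dataset.origin is Origin.MEASURED:
        return f"{dataset.name}: measured data"
    inputs = ", ".join(dataset.derived_from) or "unknown inputs"
    return f"{dataset.name}: model output derived from {inputs}"


# Hypothetical datasets, made up for this example.
gauge = Dataset("river-gauge-levels", Origin.MEASURED)
forecast = Dataset("flood-forecast", Origin.MODELLED, ("river-gauge-levels",))
print(describe(gauge))
print(describe(forecast))
```

With a label like this attached, an expert in another discipline who picks up the forecast can see at a glance that it is model output and which measurements fed it.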

"They say, 'We'll open up our data', but the database in Rotterdam is completely different to the database in Amsterdam or a smaller town. What you'd like to see is some interoperability between these databases if they are to open up to private parties or the public. There must be something that interconnects to make more efficient use of all the available data," says Feron.

The research is looking at various architectures that could open up all the new and old data so that it can be shared more easily and new users can find it.

"Over the past 10 years we have introduced all kinds of standards — maybe too many standards. We have well-paid people thinking about IT architectures. We build all kinds of new tools and platforms but we still think — and this is the main research question — there could be some additional IT infrastructure above the standards and architectures. But we're not sure what it is," says Feron.

"We don't want it to be too heavy or complex. The question is what do we need extra? Is it just some rules, or a repository, or is it a very complex enterprise service bus?"
