The data centre of 2015 is highly evolved, designed to meet the needs of the familiar connected world. Services and clients create and consume corporate and consumer data, large and small, over well-characterised networks. And with the transition to cloud computing well under way, the basic mechanisms and dynamics of most IT tasks are well understood and well supported.
The Internet of Things (IoT) will not follow these rules, but it's not clear what rules it will follow: a powerful concept eagerly marketed, the IoT nevertheless lacks many useful definitions. There will be millions of autonomous connected sensors, certainly, creating regular pulses of data in profusion. That data will be used in real time, as well as stored for advanced analysis later. The data centre will be the central organising nexus. More than that, though, will depend on what is commercially interesting, and there's no general agreement about what sort of data, in what quantities, with what sort of access requirements, will best characterise the operational IoT.
The first operational IoT systems will be for specific markets -- medical, energy, transport -- and these will provide valuable lessons in managing such technology, but there are common characteristics that can help set guidelines from the outset. The single most important realisation is that an IoT data centre cannot see itself as a physical entity with four walls and a ceiling: that will be part of it, but the actual functions and responsibilities must extend out across the networks, with its managerial, processing and storage purviews reaching potentially as far as the sensors themselves.
There will be two main kinds of sensor data: streaming and event-driven. Streaming data about known conditions will arrive at regular intervals, putting demands on throughput and storage; event data will advise of unexpected or unpredictable conditions and will put a premium on cross-system response time. All data will also be associated with metadata describing sensor ID, location, the nature of the data and so on. All this data, or a useful subset of it, will end up in storage, but it may well be highly heterogeneous. Thus, storage, communication, database design and management processes will be key factors in effective IoT data management.
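As a rough illustration of the distinction, a reading and its metadata might be modelled as below. This is only a sketch: the field names, the `Kind` enum, the `route` function and the two destination names are assumptions made for the example, not part of any IoT standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class Kind(Enum):
    STREAMING = "streaming"  # regular readings of known conditions
    EVENT = "event"          # unexpected conditions needing a fast response


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    location: tuple[float, float]  # (latitude, longitude), an assumed convention
    kind: Kind
    quantity: str                  # the nature of the data, e.g. "temperature_c"
    value: Any                     # payloads are heterogeneous by design
    timestamp: float               # seconds since the epoch


def route(reading: Reading) -> str:
    """Send events down a low-latency path and streams to bulk storage."""
    # Both destination names are hypothetical placeholders.
    return "alert_queue" if reading.kind is Kind.EVENT else "bulk_store"
```

Keeping the metadata on every reading is what allows heterogeneous payloads to share one storage and query layer later on.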
Sensors will often be geographically dispersed. Moving all data to a central point all the time will be both expensive and unreliable, even where it is possible. Thus, there'll be a need for intermediate storage and processing, including on the sensors themselves, with significant regional, distributed databases potentially removing the need for a single central store at all. Biology provides a good analogy, where the visual system starts processing and filtering data the moment it's collected on the retina, with further layers extracting the relevant signals as they move into the brain. Separate systems manage important real-time events, like the blink reflex, and different parts of the brain analyse and store different aspects of the images.
An efficient IoT network will have to make best use of all its resources, so processing tasks, queries and management functions will be pushed out to whichever layer is best equipped to deal with them. Where data is consumed near to the location that generated it, it may never need to transit to a data centre at all, or only in heavily pre-processed, summarised form.
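A minimal sketch of that kind of pre-processing, assuming a gateway that collects a window of numeric readings and forwards only a summary. The function name `summarise` and the choice of statistics are illustrative assumptions; a real deployment would pick whatever summary its consumers actually need.

```python
from statistics import fmean


def summarise(window: list[float]) -> dict[str, float]:
    """Reduce a window of raw readings to a compact summary record."""
    if not window:
        raise ValueError("cannot summarise an empty window")
    return {
        "count": len(window),
        "min": min(window),
        "max": max(window),
        "mean": fmean(window),
    }
```

However many readings arrive in the window, only four numbers travel upstream; the raw values can stay in local storage for as long as local consumers want them.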
Data management in today's data centres is primarily concerned with the data that's already in them, reporting on and controlling data objects such as files and records and the storage in which they live. IoT data management will extend that to live data, including that outside the local data centre and residing in intermediate storage layers -- or, again, on sensors themselves. It will have to provide an on-demand, homogeneous interface to what is actually a heterogeneous network of intermittently connected devices, as well as carrying out its traditional roles.
Security and reliability will also need to be managed in new ways. Current security methods won't scale to managing thousands or millions of remote objects, so much of the work will be devolved to the sensors and their gateways onto the network. Concepts such as access whitelists, intrusion detection and cryptographically guaranteed identity will characterise the outer layers of the IoT, while intelligent backup methodologies will have to archive the minimal acceptable subset of data. In some cases, this may be a highly compressed synopsis of many readings; in others, high-resolution continuous records may be required.
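One way a gateway could combine a whitelist with a cryptographic identity check is sketched below, using a per-sensor shared secret and an HMAC tag from Python's standard library. This is an assumption chosen for brevity, not a recommendation: the `ALLOWED` table, the example key and the `sign` and `accept` functions are all hypothetical, and a production system might well use certificates or hardware-backed keys instead.

```python
import hashlib
import hmac

# Hypothetical whitelist: sensor ID -> shared secret provisioned at install time.
ALLOWED = {"sensor-001": b"example-secret-key"}


def sign(sensor_id: str, payload: bytes, key: bytes) -> str:
    """Compute the tag a sensor would attach to its message."""
    message = sensor_id.encode() + b"|" + payload
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def accept(sensor_id: str, payload: bytes, tag: str) -> bool:
    """Accept a message only from a whitelisted sensor with a valid tag."""
    key = ALLOWED.get(sensor_id)
    if key is None:
        return False  # not on the whitelist
    expected = sign(sensor_id, payload, key)
    # Constant-time comparison avoids leaking how much of the tag matched.
    return hmac.compare_digest(expected, tag)
```

The check runs entirely at the gateway, so an unknown or tampered message is dropped before it consumes any bandwidth or storage further in.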
IoT data may be of immediate interest, or more valuable historically. In the latter case, a wide variety of tasks can be carried out asynchronously from the business of collecting the data: normalising data for efficient storage, comparison and query optimisation, hierarchical data reporting, aggregation and so on. As data passes through each of these processes, it will impose computational demands that depend on the amount of data arriving and the degree of transformation required. Changes at any point along this chain will have consequences for later stages, as will changes in the scope of the sensor network or the demands for its data.
In general, the further the data moves away from the sensors through the system, the greater the opportunity to structure and simplify it towards a more traditional model. However, there may be valid reasons to use the data at any stage in this process. Think of patient data, where everything from a real-time event requiring immediate attention to decades-long analysis of a condition may be clinically significant. This is not a mature area, and remains to be explored.
Many proposed IoT services and products depend on mobile sensors and platforms, whether patient monitoring, vehicular, aeronautical or robotic, whether indoors or out. For the foreseeable future, mobile aspects of any network must accept unreliable connections and often other restrictions on electrical power availability and processing capabilities. These are hard enough to manage within the existing internet, but will need much more attention with the IoT. How a real-time query will work when a sensor is offline but recent historical data is available, or how a mobile endpoint can be effectively managed when transiting different service zones, are not issues that have traditionally been a concern for data centre system design.
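One possible answer to the offline-sensor question is for a gateway to keep the last known value for each sensor and to answer queries from that cache, stating how old the answer is. The sketch below assumes this approach; the class `LastKnownCache`, the `Answer` record and the two thresholds are hypothetical names and parameters introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Answer:
    value: float
    age_seconds: float
    stale: bool  # True when older than the expected reporting interval


class LastKnownCache:
    """Answers queries from the most recent reading held for each sensor."""

    def __init__(self, report_interval: float, max_age: float) -> None:
        self.report_interval = report_interval  # how often a healthy sensor reports
        self.max_age = max_age                  # beyond this, refuse to answer
        self._latest: dict[str, tuple[float, float]] = {}

    def record(self, sensor_id: str, value: float, timestamp: float) -> None:
        self._latest[sensor_id] = (value, timestamp)

    def query(self, sensor_id: str, now: float) -> Optional[Answer]:
        entry = self._latest.get(sensor_id)
        if entry is None:
            return None  # nothing has ever been heard from this sensor
        value, timestamp = entry
        age = now - timestamp
        if age > self.max_age:
            return None  # too old to present as a current value
        return Answer(value, age, stale=age > self.report_interval)
```

Returning the age alongside the value leaves the decision about whether a stale reading is good enough with the caller, which matters when the same data feeds both a dashboard and a clinical alarm.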
A set of tasks
The question isn't so much how to design a data centre for the IoT, as what new priorities and tools are required. One of the most important will be the development of statistical models that can take the range of parameters within which a particular IoT project will operate, and produce workable specifications for storage, network, distributed architecture, management overhead, security and robustness. Many more small transactions than are usual today may swamp network routing or bandwidth capabilities. On the other hand, they may have much more flexible latency and responsiveness requirements. Storage planning may start to require input about weather or significant external events. The data centre itself may start to generate information about sensor or data patterns that a client may not expect, but that may help them manage and plan their services better.
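Even before such statistical models exist, a back-of-envelope calculation shows how project parameters turn into specifications. The sketch below is a deliberately simple deterministic estimate, not the statistical model described above; the function name `estimate` and its parameters are assumptions, and it counts payload bytes only, ignoring protocol overhead, bursts and replication.

```python
def estimate(
    sensors: int,
    readings_per_hour: float,
    bytes_per_reading: int,
    retained_fraction: float = 1.0,
) -> dict[str, float]:
    """Estimate steady-state message rate, ingress bandwidth and daily storage."""
    messages_per_second = sensors * readings_per_hour / 3600
    ingress_bits_per_second = messages_per_second * bytes_per_reading * 8
    storage_bytes_per_day = (
        sensors * readings_per_hour * 24 * bytes_per_reading * retained_fraction
    )
    return {
        "messages_per_second": messages_per_second,
        "ingress_bits_per_second": ingress_bits_per_second,
        "storage_bytes_per_day": storage_bytes_per_day,
    }
```

For a hypothetical fleet of a million sensors each sending a 100-byte reading once a minute, this gives roughly 16,700 messages per second, about 13 Mbit/s of payload and 144 GB of raw data per day. The bandwidth is modest; the message rate is the figure more likely to trouble conventional infrastructure.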
The IoT has great potential, and many unknowns. It will bring the complexities and excitement of the real world through the doors of the data centre, and will demand a much more flexible and outward-looking approach to data management and service provision. The data centre manager who is best prepared to look outward and learn will be the one best equipped to create a product worthy of the challenge.