Interview with Andrea Vaccari, research associate at MIT's SENSEable City Laboratory

Now that machine and sensor data is joining social data traffic on the Internet, the ability to interpret and create meaning out of the information to improve life could be the post-Web 2.0 manifesto.

MIT's Senseable City Lab has been on the task for years with various projects that utilize sensors and hand-held electronics to help describe and understand cities. But some skeptics, while impressed by the "info-porn" generated by the lab's data visualizations of city-scale data, are now asking if the work will create meaningful change in cities.

Andrea Vaccari, a research associate at the lab, spoke recently at the ETech conference in San Jose, and ZDNet caught up with the researcher to get more resolution around the latest work and aims of the Senseable City Lab.

For those unfamiliar with the lab, tell us a little about the work and your background.

I'm a Research Associate at the Senseable City Lab of the Massachusetts Institute of Technology where I investigate how digital technologies are revolutionizing the way we live in urban areas, the way we study people's flows through urban space, and the way we can configure more livable, sustainable, and efficient cities.

I leverage the huge volume of real time geotagged data provided by mobile devices, sensor networks, and pervasive systems to better understand cities as real-time control systems, and to provide new tools to innovate and anticipate the effects of such innovations.

I'm currently working on CurrentCity, a new initiative that aims to analyze and visualize aggregate information on cell phone activity to identify unexpected events and assist public authorities and first responders. Another project is MIT Enernet, a platform to identify, assess, and communicate energy efficiency opportunities by studying energy consumption, HVAC levels and human occupancy at the scale of the room, estimated through connections to the WiFi network.

Credit: CurrentCity.org

What are the implications of the knowledge your lab creates from the data?

Over the past decade there has been an explosion in the deployment of pervasive systems like cell phone networks and user-generated content aggregators on the Internet that produce massive amounts of data as a by-product of their interaction with users. This data is related to the actions of people and thereby to the overall dynamics of cities, how they function and evolve over time. Electronic logs of cell phone calls, subway rides, GPS-enabled buses, and geotagged photographs are all digital footprints that today allow researchers to better understand how people flow through urban space, and could ultimately help those who manage and live in urban areas to configure more livable, sustainable, and efficient cities.

The SENSEable City Laboratory is a research initiative in the Department of Urban Studies and Planning at the Massachusetts Institute of Technology. Research in the Lab focuses on developing technologies that can mediate between physical urban space and the layers of digital flows produced by everyday urban functions, and on analyzing the changes our cities undergo due to this new coupling with digital technologies. Through our projects to date, we have explored areas such as interactive urban furniture, methods for data fusion, pervasive data-mining, real-time data visualization, and many others. This is possible because we consciously integrate aspects of urban studies, architecture, engineering, interaction design, computer science, and social science.

We all love the cool data visualizations and info-motion graphics, but what types of problem(s) is the SENSEable City Lab at MIT trying to solve?

Data visualizations are just one aspect of our work and they are aimed at fostering the interest of local authorities, telecom operators and service providers. They are of course complemented by active research aimed at advancing our understanding of how people move in the city and how cities themselves function and evolve. The research I am currently conducting has the potential to solve real-world problems that touch on many aspects of the city as a complex system, like the following:

  • Management of crowds. Estimate how many people are present in a given area, like a square or a stadium, and how many are entering or leaving it. Assess the current demand for public transportation. Forecast where traffic is piling up.
  • Support for special events. Monitor public gatherings and detect unauthorized aggregations of people, traffic jams, car accidents and other emergency situations. Support police and first responders during emergencies like fires and evacuations.
  • Analysis of urban dynamics. Identify the most popular areas of a city and the most crowded neighborhoods in real time. Estimate how many people visit a tourist spot or look at a billboard, and how much time they spend in such activities.
  • Analysis of urban policies. Identify the patterns of inflows and outflows of people to and from the city. Reconstruct commuting patterns and origin/destination flows during the day.

These objectives are inherently multidisciplinary and they will require drawing on a range of fields from data mining and machine learning to network analysis and statistics. Moreover, the increased understanding of the interaction between urban dwellers and the pervasive systems embedded in the built environment will enable local authorities, service providers, enterprises, and citizens themselves to develop new services and applications for the city and its inhabitants.

What would the takeaway be for the average citizen versus an urban planner?

For urban planners, one critical question in the study of cities is how cities perform in normal conditions and during a special event or a sudden emergency. Until today it has been difficult to provide a precise answer to this question, and impossible to provide it in a timely fashion. Traditional surveys and people counts, together with the use of helicopters and satellite imagery, are cumbersome ways to capture the dynamic changes in the functioning of a city and the behavior of its inhabitants. Today, however, information produced by the interaction with pervasive systems can help to create and define new methods of observing, recording, and analyzing activity logs, and therefore human dynamics, in the context of the city. This new study of a city as a real-time control system will enact a paradigm shift from the concept of urban planning to that of urban programming, where the city can adapt to variations in its state.

For citizens and enterprises, mobile phones and wireless networks form an infrastructure that allows us to extract and insert information almost anywhere in the built environment in real time. Processing this information and making it publicly accessible enables people to make better decisions about the use of urban resources, mobility, and social interaction. This feedback loop influences all aspects of the city and can help local authorities, service providers, enterprises, and citizens themselves to improve the economic, social, and environmental sustainability of the places we inhabit. New services can be built, and old ones can be improved using geo-tagging, time-tagging, and review information added by companies and citizens themselves. For example, special events like concerts could be better organized and experienced by the community, and interactive maps and other devices can be used to augment our senses in the city.

In addition to real-time data such as cell phone usage and information uploaded by citizens, will RFID-tagged, IP-enabled smart objects and other data sources be incorporated into your models, or are they already?

In my opinion, there are two major types of urban data that we can collect, and two primary ways of collecting them. We have digital shadows, which are electronic logs created by our interactions with urban pervasive systems (e.g. mobile phone networks, WiFi networks, RFID systems for bike sharing and metro). And we have digital footprints, all the explicitly disclosed information that is publicly available online, like pictures on Flickr and micro messages on Twitter. To collect this information, we can choose to deploy new technology in the form of urban infrastructure or user agent applications on people's phones, or we can choose to leverage existing systems. The former solution can provide higher resolution data but can be extremely costly to deploy in large areas, or could simply fail to reach a critical user mass (in the case of mobile applications).

We chose to focus on both types of urban data, but to do it only through existing systems. We believe that the two data sets can tell us very different stories about the dynamics that drive people's movements and actions in the urban environment. For example, digital shadows can tell us about the nature of mobility in modern cities, whereas digital footprints can provide new insights on specific activities like leisure and tourism. Moreover, we recognize that the unique advantage of leveraging existing systems is the superior spatial and temporal precision and breadth: collecting data from the mobile phone network, we can reach uniformly every road and square of a city, and we can capture the unbiased behavior of its entire population (or at least of the share that subscribes to the specific telecom operator).

In the future, we will try to aggregate more urban data from other providers like transportation authorities, waste management services, newspapers, and most importantly the citizens themselves. We aim to build an open platform where distributed sensors, local authorities, businesses, and citizens can mutually provide information on the urban environment.

Do you have any examples of how the analysis of city-scale data can inform a city's development and planning?

The most interesting example is the New York Waterfalls study, which used mobile phone activity logs and geotagged photos from Flickr to study urban dynamics in the vicinity of the New York Waterfalls, a $20 million public art project of four artificial waterfalls rising from New York Harbor between June 26 and October 13, 2008. The research was included in the Economic Impact Study commissioned by the New York Economic Development Corporation, which estimated that nearly 1.4 million people viewed the Waterfalls from an official vantage point or from a ferry or tour boat, directly and indirectly generating about $69 million in total economic impact. While traditional methods (e.g. people counts and surveys) employed for the study extrapolated point estimates of visitors at specific locations, our analyses provided novel insights to quantify the influence of the public art exhibit on the distribution of visitors and on the attractiveness of areas of interest in the proximity of the event. For instance, the analysis of cellular network traffic and tagged photos provided evidence on where visitors were attracted in Lower Manhattan and whether the areas of interest at the waterfronts benefited in attractiveness and popularity.

In this case study, we illustrated through the collection and analysis of digital shadows and digital footprints how we can develop observations and define indicators to inform the urban design, planning and management processes. Our approach relied on several "indicators of urban attractiveness" inspired by financial indicators and network theory. For instance, we compared the attractiveness of the main points of interest in New York using their relative strength based on the density of digital footprints, and we tracked the evolution of the centrality of the waterfront among the network of points of interest generated by the flows of visitors in the Lower Manhattan area. The mapping of this new type of digital footprint analysis shows how an event can open up new parts of a city and draw people to them over time. This information can feed tourism studies, helping a city to understand the behavior of people who, by definition, are unlikely to return any time soon and whose favorable impressions can have a huge impact on the urban economy.
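
As a rough illustration of comparing points of interest by the density of digital footprints, the Python sketch below computes each location's share of all footprints before and during an event, and the change between the two. This is a plain share-of-total comparison chosen for simplicity; it is an assumption, not the lab's actual "relative strength" indicator. The function names, place names and counts are all hypothetical.

```python
def footprint_share(footprints):
    """Return each point of interest's share of all footprints (0.0 to 1.0)."""
    total = sum(footprints.values())
    if total == 0:
        return {poi: 0.0 for poi in footprints}
    return {poi: count / total for poi, count in footprints.items()}


def share_change(before, during):
    """Return the change in footprint share per point of interest."""
    share_before = footprint_share(before)
    share_during = footprint_share(during)
    places = set(share_before) | set(share_during)
    return {
        poi: share_during.get(poi, 0.0) - share_before.get(poi, 0.0)
        for poi in places
    }


# Made-up counts of geotagged photos per area, for illustration only.
photos_before = {"waterfront": 10, "midtown": 30}
photos_during = {"waterfront": 30, "midtown": 30}

change = share_change(photos_before, photos_during)
for poi, delta in sorted(change.items()):
    print(f"{poi}: {delta:+.2f}")
```

A positive value means an area attracted a larger share of footprints during the event than before it. A real study would also need to control for seasonality and for overall growth in photo uploads.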

Can you tell us about any upcoming projects and cities of interest to the lab?

Most Lab projects are carried out in partnerships with cities and commercial entities worldwide. Some of our partners include the City of Copenhagen, Denmark; Florence, Italy; New York City; Amsterdam; and the United Nations, as well as companies such as AT&T, Telecom Italia, AUDI Volkswagen, and several other leading mobile phone service providers. We are also discussing new collaborations, but I cannot talk about them until they are official. :)

What do you like most about being a researcher at MIT?

Being at MIT is very important for this kind of initiative. On one side, it allows us to meet researchers from very different disciplines whose ideas can help solve problems related to urban dynamics. For example, we are using network theory, financial analysis, and data mining and clustering to study the topological properties of the urban data. On the other side, MIT's authority and respectability are essential to build the trust necessary to work on the confidential data provided by our partners. In this sense we hope that our projects might stimulate a dialogue on the responsible access to private data and on how it could provide value-added services to local and regional communities.

Any final comments you'd like to make?

There are still many open questions. What will be the impact of these pervasive systems on urban life itself? What will be the effects of the real-time feedback that will be enacted between people and the built environment? What type of new public services and business models could emerge, and how will current services and businesses be affected and transformed by the upcoming disruptions?

The data analyzed in our projects always consists of aggregate information on network activity, such as the number of calls and text messages served by one base station in one hour. To ensure the complete privacy of AT&T's mobile customers, data is provided in accordance with our partners' privacy policies and analyses are performed in compliance with the 2002 Directive of the European Parliament and Council on Privacy. The use of aggregate data implies that at no time could individual users be identified. To make a simple comparison, think of information about traffic on highways: we know how many cars from each state are traveling, but we do not know their particular license numbers or who is driving them.
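
To make the idea of aggregate data concrete, here is a minimal Python sketch of counting network events per base station per hour. It assumes a hypothetical input of (station, timestamp) pairs; the record layout, the station names and the `hourly_activity` function are illustrative assumptions, not the format the lab or its partners actually use. Note that the output keeps only counts, so nothing about an individual caller survives the aggregation.

```python
from collections import Counter
from datetime import datetime


def hourly_activity(events):
    """Count events per (base station, hour) from (station_id, timestamp) pairs."""
    counts = Counter()
    for station_id, timestamp in events:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        counts[(station_id, hour)] += 1
    return counts


# Made-up events, for illustration only.
events = [
    ("station_A", datetime(2009, 3, 10, 9, 5)),
    ("station_A", datetime(2009, 3, 10, 9, 40)),
    ("station_A", datetime(2009, 3, 10, 10, 2)),
    ("station_B", datetime(2009, 3, 10, 9, 15)),
]

counts = hourly_activity(events)
for (station_id, hour), n in sorted(counts.items()):
    print(station_id, hour.isoformat(), n)
```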

Here is a link for a download of Vaccari's ETech presentation in video form. A big thanks to Andrea for taking the time to answer my questions in such great detail.