The video sequences, recorded in HD, also contain GPS locations, IMU data, and timestamps across 1,100 hours of driving.
UC Berkeley's BDD100K database can be used by engineers and developers of self-driving car technologies to train autonomous systems.
Datasets like these are needed to teach systems how to cope with different environments and driving conditions, including how to distinguish road surfaces from pedestrian areas, detect objects such as other vehicles, and recognize potential hazards.
Classification can take countless hours, so to speed up object mapping, the database already includes 2D bounding boxes annotating over 100,000 images, covering objects of note including traffic signs, people, bicycles, other vehicles, trains, and traffic lights.
In addition, 100,000 images carry annotations intended to help vehicles make complicated driving decisions, such as at busy intersections, on cluttered road systems, or where multiple lane markings are present.
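For developers working with annotations like these, a minimal sketch of how 2D bounding-box labels might be read looks like the following. The field names used here ("name", "labels", "category", "box2d") are an assumption based on the JSON label schema published alongside BDD100K; check the official documentation for the exact format of your dataset release.

```python
import json

# Illustrative, assumed label schema: a list of frames, each with a list of
# object labels carrying a category and a 2D bounding box.
sample = json.loads("""
[
  {
    "name": "example_frame.jpg",
    "labels": [
      {"category": "traffic light",
       "box2d": {"x1": 100.0, "y1": 50.0, "x2": 120.0, "y2": 90.0}},
      {"category": "car",
       "box2d": {"x1": 300.0, "y1": 200.0, "x2": 450.0, "y2": 320.0}}
    ]
  }
]
""")

def boxes_by_category(frames):
    """Collect (x1, y1, x2, y2) tuples keyed by object category."""
    out = {}
    for frame in frames:
        for label in frame.get("labels", []):
            box = label.get("box2d")
            if box is None:  # some label types (e.g. lane markings) have no box
                continue
            coords = (box["x1"], box["y1"], box["x2"], box["y2"])
            out.setdefault(label["category"], []).append(coords)
    return out

print(boxes_by_category(sample))
```

Grouping boxes by category in this way is a common first step before training a detector, since per-class counts quickly reveal class imbalance in the annotations.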
The video clips are approximately 40 seconds long at 30 frames per second, and the project uses a variety of methods to annotate objects, according to a paper describing the dataset (.PDF).
"To achieve rich annotation at scale, we found that existing tooling was insufficient, and therefore develop novel schemes to annotate driving data more efficiently and flexibly than previous methods," the researchers behind the project said. "Current tools are difficult to deploy at scale and are rarely extensible to new tasks or data-structures."
This is not the only autonomous vehicle dataset to be released to the public. In March, Baidu released Apollo Scape, a dataset based on Baidu's autonomous driving platform Apollo.
The open-source database is not as large as UC Berkeley's, but the more data that self-driving technology developers can get their hands on, the smarter our vehicles will become.