Oracle supercomputer AI glitch impacts elections in Brazil

The problems caused delays in the totalization of votes in the first round of municipal elections.

Technical problems in the artificial intelligence (AI) component of a supercomputer set-up provided by Oracle delayed the processing of votes during the first round of municipal elections in Brazil last weekend, the Superior Electoral Court (TSE, in the Portuguese acronym) has said.

In 2020, for the first time, the TSE centralized countrywide totalization of votes on a supercomputer using database platforms with artificial intelligence technology provided by Oracle. Previously, each of the 27 regional electoral courts across all the Brazilian states counted the votes and forwarded them over to the TSE.

The problems with the equipment during the elections on Sunday (15) delayed vote processing by nearly three hours. Brazil is one of the few countries in the world where the voting process is entirely electronic. The system, which includes an estate of about 455,000 voting machines, enables results to be processed within a matter of minutes of the ballots closing.

The service provided by Oracle to the Brazilian electoral authorities consists of two sets of hardware, as well as a cloud-enabled high-performance database and storage servers. As part of the contract, the vendor provides the TSE with a main Exadata X8 Full Rack server, with eight processing nodes, and an Exadata X8 Half Rack, with four processing nodes, for redundancy.

During the vote processing exercise on Sunday, one of the eight processing nodes of the main server was disconnected. Although the delay in counting the votes was initially attributed to this glitch, and the IT team at the TSE focused on solving the issue, it was later found that the equipment was able to automatically distribute the load to the other processing nodes.

While the node failure did not directly and immediately delay the vote counting process, the authority said that the glitch happened at the same time that slowness in the system was detected, which meant the team focused on fixing the node problem.

According to the authority, this series of events relating to the failure "delayed the identification of the direct cause of the problem to be solved" for votes to be counted faster. The electoral authority noted that the problem was fixed on Tuesday (17) and that the system is now fully functioning.

The real problem was elsewhere, in what the TSE statement described as a "lack of calibration" of the artificial intelligence component of the Oracle database optimizer, whose function is to ensure faster data processing. As the TSE note stated, the optimization requires an execution plan, which the Oracle database generates using statistics such as the size of the tables and the amount of data in them.

However, the equipment was new, which meant the results of the first round of the elections were totaled in a database with completely empty tables. According to the TSE, from the moment the ballots closed at 5pm the database tables started receiving more than one million rows per minute.
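
To illustrate the general mechanism the TSE describes, here is a minimal Python sketch of a plan choice driven by row-count statistics. It is a toy model and not Oracle's optimizer: the two plan names, the cost formulas and the 0.1% selectivity are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Row count as last gathered (hypothetical structure, not Oracle's format)."""
    rows: int


def plan_costs(rows: int, selectivity: float) -> dict[str, float]:
    """Toy cost estimates for two candidate plans (assumed formulas)."""
    matches = rows * selectivity
    return {
        # Reading every row: cost grows linearly with table size.
        "full_scan": float(rows),
        # Fixed overhead to walk the index, plus a cost per matching row.
        "index_lookup": 50.0 + 2.0 * matches,
    }


def choose_plan(stats: TableStats, selectivity: float) -> str:
    """Pick the plan that looks cheapest given the recorded statistics."""
    costs = plan_costs(stats.rows, selectivity)
    return min(costs, key=costs.get)


def actual_cost(plan: str, real_rows: int, selectivity: float) -> float:
    """Cost of running an already-chosen plan against the real table size."""
    return plan_costs(real_rows, selectivity)[plan]


SELECTIVITY = 0.001      # assumed fraction of rows a query touches
REAL_ROWS = 1_000_000    # table size once results start arriving

# Plan chosen while the statistics still describe an empty table.
stale_plan = choose_plan(TableStats(rows=0), SELECTIVITY)
# Plan chosen once the statistics reflect the loaded table.
fresh_plan = choose_plan(TableStats(rows=REAL_ROWS), SELECTIVITY)

stale_cost = actual_cost(stale_plan, REAL_ROWS, SELECTIVITY)
fresh_cost = actual_cost(fresh_plan, REAL_ROWS, SELECTIVITY)

print(f"plan with empty-table stats: {stale_plan}, cost on real data: {stale_cost}")
print(f"plan with fresh stats:       {fresh_plan}, cost on real data: {fresh_cost}")
```

In this toy model, the plan that looks cheapest for an empty table is a full scan, and it becomes far more expensive than the alternative once a million rows are present. Real optimizers weigh many more factors, but the sketch shows why statistics gathered on empty tables can lead to a plan that is unsuitable for a full database.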

"The execution plan generated by the computer with the empty database proved to be inadequate for processing with a full database", notes the TSE, adding that the Oracle equipment "did not manage to, simultaneously and with the necessary speed, learn a new execution plan suitable for processing the large volume of data and perform the totalization with the expected speed."

For the second round of municipal elections, which will take place in Brazil on November 29, the TSE noted that its technical team and Oracle understand that the failure in the execution plan will not happen again, as the optimizer is already calibrated to process a larger volume of information quickly. Even so, measures are being implemented to avoid similar issues.

The process of calibrating the equipment's AI component takes time and that, according to the TSE, is normal. However, the authority also noted the problem could have been avoided with tests to calibrate the optimizer. The point about testing relates to other issues surrounding the delivery of the Oracle service, also detailed in the statement.

According to the TSE, the issues started with a delay of over a month in the delivery of the Oracle equipment, in July 2020, caused by the unavailability of parts due to the Covid-19 pandemic. This delay affected the testing timescales for the supercomputer, so rather than five test events, the TSE only managed to carry out two tests prior to the election.

"It should be noted that the vote totalization test is a complex procedure and involves the mobilization of all electoral offices in the country so that ballot reports can be transmitted en masse to the TSE, simulating what happens on election day", the TSE statement pointed out.

Oracle retains exclusivity for cloud provision to Brazil's federal government bodies, so other companies - including Amazon Web Services, which has long been investing in broadening its public sector customer base in the country - cannot compete in that space. The supercomputer private cloud deal was also sealed without a tender, since Brazilian law does not require such processes when competition is impossible.

According to the Superior Electoral Court, the recent change in procedure and the adoption of the Oracle technology for centralized vote counting result from a recommendation from the Federal Police, which is responsible for the tests of the electronic ballot boxes. The assessments are intended to submit the electoral systems to the scrutiny of the academic community and other civil society bodies to identify any potential weaknesses in the electronic voting set-up.

A 2018 assessment from the Federal Police on the previous set-up noted that having a physical server in each of the regional courts, with the TSE tasked with running and maintaining them, was not ideal. The report suggested that changing the server architecture to be located centrally in the TSE would bring a "considerable improvement" to the overall operational security of the elections.

The TSE noted that the process of continuous improvement through the adoption of new technology is "absolutely normal" and that "there was no concrete event of vulnerability" in terms of vote counting by the regional courts that prompted the decision to change the previous arrangement.

According to the Federal Police report, "a decentralized architecture and the fact that there is a database and a local application server on a computer at each [regional court] increases the range of potential attacks on the environment, which can be mitigated with the physical location of these machines in the TSE environment."

After that assessment, the TSE procured the centralization of vote counting from Oracle. According to the electoral authority, the supplier has provided the technology supporting all elections that have used the electronic voting system since 1996, and has been supplying database technology to the TSE for over a decade.

The TSE is also carrying out a trial to investigate the adoption of online voting, in a move that aims to phase out the current electronic voting machine set-up and generate savings.