Automating chip debugging

You all know that fixing bugs in computer chips after they've been fabricated in silicon is a tedious and costly process. This is why researchers at the University of Michigan have developed a new technology to automate post-silicon debugging. FogClear, as the new method is called, 'uses puzzle-solving search algorithms to diagnose problems early on and automatically adjust the blueprint for the chip. It reduces parts of the process from days to hours.' This technology looks promising. However, it remains to be seen if or when semiconductor companies will use it. But read more...
Written by Roland Piquepaille

The current post-silicon debugging methodology

The figure above shows the current post-silicon debugging methodology. "To verify the correctness of a silicon die, engineers apply a large number of test vectors to the die and then check their output responses. If the responses are correct for all the applied test vectors, then the die passes verification. If not, then the test vectors that expose the design errors become the bug trace that can be used to diagnose and correct the errors. The trace will then be diagnosed to identify the causes of the errors. Typically, there are three types of errors, including functional, electrical, and manufacturing/yield." (Credit: University of Michigan)
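
To make the verification step in this caption more concrete, here is a minimal Python sketch. It is only an illustration under strong assumptions: the die and its specification are modeled as plain functions on tuples of bits, and the names `collect_bug_trace`, `reference_and` and `faulty_die` are hypothetical, not taken from the University of Michigan work.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Tuple[int, ...]  # one test vector: a tuple of input bits


def collect_bug_trace(
    die: Callable[[Vector], int],
    reference: Callable[[Vector], int],
    vectors: Sequence[Vector],
) -> List[Vector]:
    """Apply every test vector and keep the ones whose response is wrong."""
    return [v for v in vectors if die(v) != reference(v)]


# Hypothetical two-input example: the specification is an AND gate,
# but the fabricated die behaves like an OR gate.
def reference_and(v: Vector) -> int:
    return v[0] & v[1]


def faulty_die(v: Vector) -> int:
    return v[0] | v[1]


test_vectors = [(0, 0), (0, 1), (1, 0), (1, 1)]
bug_trace = collect_bug_trace(faulty_die, reference_and, test_vectors)
die_passes = not bug_trace  # the die passes only if no vector exposed an error

print("passes verification:", die_passes)
print("bug trace:", bug_trace)
```

In this toy case the die fails verification, and the two vectors on which OR and AND disagree form the bug trace that would be handed to diagnosis.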

The FogClear post-silicon debugging methodology

And the figure above shows the FogClear post-silicon debugging methodology. "When post-silicon verification fails, a bug trace will be produced. Since silicon dies offer simulation speeds orders of magnitude faster than those provided by logic simulators, constrained-random testing is used extensively, generating extremely long bug traces. To simplify error diagnosis, we introduce a step called bug trace minimization to reduce the complexity of the trace." (Credit: University of Michigan)
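
The idea of bug trace minimization can be illustrated with a short Python sketch. Please note that this is not the algorithm used in FogClear: it is a simple greedy reduction that drops one vector at a time and keeps the shorter trace whenever the failure still reproduces. The toy design, the operation names and the functions `minimize_trace` and `reproduces_bug` are assumptions made up for this example.

```python
from typing import Callable, List, Sequence


def minimize_trace(
    trace: Sequence[str],
    still_fails: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Greedily remove vectors that are not needed to reproduce the failure."""
    current = list(trace)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if still_fails(candidate):
            current = candidate  # vector i was irrelevant, drop it
        else:
            i += 1  # vector i is needed, keep it and move on
    return current


def reproduces_bug(trace: Sequence[str]) -> bool:
    """Hypothetical sequential design: it fails when 'fire' arrives
    after a 'load' with no 'reset' in between."""
    armed = False
    for op in trace:
        if op == "load":
            armed = True
        elif op == "reset":
            armed = False
        elif op == "fire" and armed:
            return True
    return False


long_trace = ["nop", "load", "nop", "reset", "load", "nop", "fire", "nop"]
short_trace = minimize_trace(long_trace, reproduces_bug)
print(short_trace)
```

Here the eight-step trace shrinks to the two operations that actually trigger the failure, which is the kind of simplification that makes the later diagnosis step easier.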

This research project has been conducted by Kai-hui Chang, a recent doctoral graduate, under the supervision of Valeria Bertacco, an assistant professor, and Igor Markov, an associate professor, both in electrical engineering and computer science.

"Today's silicon technology has reached such levels of small-scale fabrication and of sheer complexity that it is almost impossible to produce computer chips that work correctly under all scenarios," said Valeria Bertacco, assistant professor of electrical engineering and computer science and co-investigator in the new technology. "Almost all manufacturers must produce several prototypes of a given design before they attain a working chip." The new method, called FogClear, "uses puzzle-solving search algorithms to diagnose problems early on and automatically adjust the blueprint for the chip. It reduces parts of the process from days to hours."

And how does FogClear automate the chip debugging process? "The computer-aided design tool can catch subtle errors that several months of simulations would still miss. Some bugs might take days or weeks before causing any miscomputation, and they might only do so under very rare circumstances, such as operating at high temperature. The new application searches for and finds the simplest way to fix a bug, the one that has the least impact on the working parts of the chip. The solution usually requires reconnecting certain wires, and does not affect transistors."
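
To give a feel for what "finding the simplest fix by reconnecting wires" could mean, here is a toy Python sketch. It is a brute-force search over single-wire reconnections in a tiny gate-level netlist, checked against a specification on all input combinations. It is not FogClear's actual search, which works on real layouts under physical constraints. The netlist format and the names `simulate`, `matches_spec` and `single_wire_repairs` are hypothetical.

```python
from itertools import product

OPS = {"AND": lambda x, y: x & y, "OR": lambda x, y: x | y}
INPUTS = ["a", "b", "c"]


def simulate(netlist, values):
    """Evaluate the gates in order and return the value of signal 'out'."""
    signals = dict(values)
    for name, (op, x, y) in netlist:
        signals[name] = OPS[op](signals[x], signals[y])
    return signals["out"]


def matches_spec(netlist, spec):
    """Compare the netlist with the spec on every input combination."""
    for bits in product([0, 1], repeat=len(INPUTS)):
        values = dict(zip(INPUTS, bits))
        if simulate(netlist, values) != spec(**values):
            return False
    return True


def single_wire_repairs(netlist, spec):
    """List every one-wire reconnection that makes the netlist correct."""
    repairs = []
    for g, (name, (op, x, y)) in enumerate(netlist):
        # A pin may only be rewired to a signal that already exists upstream.
        available = INPUTS + [n for n, _ in netlist[:g]]
        for pin, old in enumerate((x, y)):
            for new in available:
                if new == old:
                    continue
                pins = [x, y]
                pins[pin] = new
                candidate = list(netlist)
                candidate[g] = (name, (op, pins[0], pins[1]))
                if matches_spec(candidate, spec):
                    repairs.append((name, pin, old, new))
    return repairs


def spec(a, b, c):
    return (a & b) | c


# Buggy design: gate g1 reads 'c' where it should read 'b'.
buggy = [("g1", ("AND", "a", "c")), ("out", ("OR", "g1", "c"))]
repairs = single_wire_repairs(buggy, spec)
print(repairs)
```

The search leaves the gates themselves untouched and only changes one connection, which mirrors the point made in the quote: the repair reconnects a wire and does not affect transistors.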

This research work will be presented on November 6, 2007, at the International Conference on Computer-Aided Design (ICCAD) held in San Jose, California, during a session about Connecting Physical Challenges and Design Approaches. This presentation is called "Automating Post-Silicon Debugging and Repair" (PDF format, 8 pages, 617 KB), and the above images and captions have been extracted from it.

And here is the abstract of this presentation. "Modern IC designs have reached unparalleled levels of complexity, resulting in more and more bugs discovered after design tape-out. However, so far only very few EDA tools for post-silicon debugging have been reported in the literature. In this work we develop a methodology and new algorithms to automate this debugging process. Key innovations in our technique include support for the physical constraints specific to post-silicon debugging and the ability to repair functional errors through subtle modifications of an existing layout. In addition, our proposed post-silicon debugging methodology (FogClear) can repair some electrical errors while preserving functional correctness. Thus, by automating this traditionally manual debugging process, our contributions promise to reduce engineers' debugging effort. As our empirical results show, we can automatically repair more than 70% of our benchmark designs."

And if you really need more information, you can read Chang's Ph.D. thesis, "Functional Design Error Diagnosis, Correction and Layout Repair of Digital Circuits" (PDF format, 255 pages, 2007).

Sources: University of Michigan news release, November 2, 2007; and various websites
