Saturday, September 15, 2012

Communicating BUGS or BUGgy Communication

A few decades back when the designs had limited gate count, designers used to verify their code themselves. With design complexity increasing, the verification engineers were introduced to the ASIC teams. As Moore’s law continues to drive the complexity further, IP reuse picked up and this ensured that the engineers are spread all around the globe and communication across geographies is inevitable.
 
The reason for introducing the verification engineer was BUGS. A lot has been written and discussed on verification but reference to bugs is limited. With IPs getting sourced from different parts of the world and companies having extended teams everywhere, communicating the BUGs effectively becomes all the more important. “Wrong assumptions are costly” and in semiconductor industry this can throw a team out of business completely. Recently, in a mid-program verification audit for a startup with teams sitting between US & India, I realized that putting a well defined structure on communicating bugs could improve the turnaround time tremendously. Due to different time zones and working style, there was a lot of back & forth communication between the team members. Having a well defined format for communicating bugs helped a lot.
 
BUG COMMUNICATION CYCLE
 
BUG reported à BUG reproduced by designer à BUG fixed à FIX validated à BUG closed à revisit later for review, reproducing when required or data mining.
 
APPROACH
 
Before the advent of CRV, directed verification approach was common. Communicating bugs was simple and required limited information.  Introduction of CRV helped in the finding the bugs faster but sharing and preserving information around bugs became complicated. With a well defined approach, this problem can be moderated. Here is a sample format on what should be taken care of at each stage of the above cycle –
 
BUG reporting
 
Defining mnemonics for blocks or data path or scenarios etc. that can be appended along with the bug title helps. This enables categorizing the bugs at any time during project execution.
 
While the tool enforces adding the project related information, severity, prioritization and contact details of all concerned to broadcast information, the detailed description section should include –
 
- Brief description of the issue
- Information on diagnosis based on the debug
- Test case(s) used to discover this bug
- Seed value(s) for which the bug is surfaced
- Command line option to simulate the scenario and reproduce the issue
- Testbench Changelist/TAG on which the issue was diagnosed
- RTL Changelist/ TAG on which this issue was diagnosed
- Assertion/Cover point that covers this bug
- Link to available logs & dump that has the failure
 
BUG fixing
 
After the designer has root caused and fixed the bug, the bug report needs to be updated with –
 
- Root cause analysis and the fix
- Files affected during this fix
- RTL changelist/TAG that has the fix
 
FIX Validation
 
After the BUG moves to fixed stage, the verification engineer needs to update the TB if required and rerun the test. The test should pass with the seed value(s) required and then with multiple random seeds. The assertion/cover point should be covered in these runs multiple times. With this, the report can be updated further with –
 
- RTL changelist/TAG used to validate the bug
- Testbench changelist/TAG on which the issue was validated
- Pointer to the logs & waveforms of the validated test (if required to be preserved)
 
BUG Closed
 
After validating the fix, the bug can be moved to closed state. The verification engineer needs to check if there is a need to move this test to the smoke test list/mini regression and update the list accordingly.
 
Following the above approach would really help to communicate, reproduce and preserve the information around BUGs. The rising intricacies of the design demand a disciplined approach to attain efficiency in the outcome. Communicating bugs is the first step towards it!
 
HAPPY BUG HUNTING!!!