Fuzzing FPGA synthesis tools

When you want to do some computation on an FPGA, it is traditional to enter your design in a language like Verilog, and then to use automatic synthesis tools to turn your Verilog design into a “configuration bitstream” that can be fed to your FPGA to make it perform the computation you want.

These synthesis tools occupy an extremely privileged position. They are, after all, responsible for almost all of the world’s FPGA-based computation. There is no feasible way to compute with an FPGA other than via these tools. They also occupy an extremely trusted position. We have to trust that they faithfully reproduce the computation we specified in Verilog, because we have no completely reliable way to check whether they have made a mistake. This is especially the case given that the details of how FPGAs are configured are generally not made public.

In our FPGA 2020 paper, my student Yann Herklotz and I ask whether these tools really deserve our trust.

To answer this, we set about doing some fuzzing. We were inspired by a tool called Csmith, whose sole purpose is generate large, random C programs. These C programs don’t do anything in particular, but they are a nightmare for compiler-writers because they combine C language features in weird ways that humans wouldn’t ordinarily think of. Programs generated by Csmith have exposed hundreds of bugs in widely-used C compilers. We wanted to see if we could do the same thing for FPGA synthesis tools.

To this end, we produced a tool called Verismith that generates large, random Verilog programs. Like Csmith, we have to take care that the programs we generate are “legal”; after all, if you put bogus Verilog into a synthesis tool, you can’t expect sensible output. This means making sure that we don’t do things like drive a wire from two different sources, pass the wrong number of bits to a module, or divide by zero.

We used Verismith to generate about fifty thousand Verilog programs, and we gave each of these to four different FPGA synthesis tools: Xilinx XST, Xilinx Vivado, Intel Quartus, and Yosys. The output of these tools is a netlist – a list of interconnected low-level logic gates. (This netlist can then be turned into a configuration bitstream by other automatic tools that we have not tested.) To check whether the synthesis tool has worked correctly, we ask a SAT solver if there are any discrepancies between the input Verilog with the output netlist. If there are, then we have found a bug!

Each time we find a bug, we use Verismith to reduce it so that it is as small (and hence understandable) as possible. To do this, Verismith repeatedly chops the Verilog program in half, each time discarding the half that does not trigger the bug, until it cannot make it any smaller.

After reducing all the buggy programs to be as small as possible, we decided that some of these programs are really exposing the same bug. Ultimately, we reported 4 bugs in Yosys, 5 in Xilinx Vivado, and 1 in a Verilog simulator called Icarus. All of these bugs were confirmed by the vendors. (We didn’t report the bugs we found in Xilinx XST as this tool is no longer supported.)

Our programs trigger two different kinds of bugs in the synthesis tools. Some of them trigger “crash” bugs, which means that the synthesis tool goes wrong even though the input is perfectly valid. Others trigger “mis-synthesis” bugs, which means that the synthesis tool produces a netlist that is not equivalent to the input Verilog. In our paper, we dig into a handful of the bugs that we found, to explain exactly what the tools are doing wrong.

It’s hard to say for sure how much these bugs really matter. If they only come to light when the tools are given fairly baroque Verilog that no human would realistically write, and no Verilog-targeting compiler would automatically generate, then perhaps we need not be too concerned. That said, one observer on the Xilinx bugtracking forums did comment that one of the bugs we found “looks […] to be a rather critical bug”. Personally, I think any bug in a synthesis tool is enough to undermine its trustworthiness, whether or not it is likely to affect the computation that I want to do on my FPGA.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s