High-Level Synthesis
Functional Equivalence Verification Tools in High-Level Synthesis Flows
Anmol Mathur, Calypto Design Systems
Edmund Clarke, Carnegie Mellon University
Masahiro Fujita, University of Tokyo
Pascal Urard, STMicroelectronics
Editor's note:
High-level synthesis facilitates the use of formal verification methodologies that check the equivalence of the generated RTL model against the original source specification. This article provides an overview of sequential equivalence checking techniques, their challenges, and successes in real-world designs.
Andres Takach, Mentor Graphics

MOST MODERN DESIGNS start with an algorithmic description of the target design that is used to identify high-level system, or hardware and software, performance trade-offs. The designers' goal is to transform these algorithmic descriptions into ones that are hardware-friendly, such that the final transformed descriptions are accepted by high-level synthesis tools to generate high-quality RTL designs.1 These design refinements consist of many transformation steps, including changes of data types (for example, floating point to fixed point, or refinements of bit widths), removal of certain types of pointer manipulations, and partitioning of memories. Design models at levels higher than RTL, such as algorithmic and high-level models, are called system-level models (SLMs). Commonly, C/C++ or SystemC is used to represent these SLMs. Once a high-level synthesizable description is obtained in this manner, it can automatically be transformed into an RTL description through appropriate high-level synthesis tools.

Figure 1 illustrates a high-level design flow that starts from an algorithmic design description. To enforce the correctness of the various descriptions in such a design flow, we must resolve two issues:

- Eliminate bugs insofar as possible from the given description levels. This is necessary at the highest level of abstraction, which cannot be verified via equivalence checking, but it is also necessary when equivalence checking cannot verify all the behaviors of a model by comparing it against a higher-level model.
- Guarantee the equivalence of the two descriptions. We must guarantee equivalence both of a validated higher-level model and of a lower-level model obtained automatically or through manual refinements of the higher-level model.

These two issues complement each other to assure the correctness of the descriptions as a whole. To eliminate bugs as much as possible from a given design description, simulation is not sufficient, especially for large and complicated designs. Therefore, we also use various formal methods to verify the design descriptions.

Model checking for SLM
Validating the SLM's functional correctness can be done via simulation (a dynamic technique). However, simulation can never be exhaustive, due to the large input and state space of real models, and it can miss functional issues in corner cases. Consequently, we also use formal (static) techniques to validate the SLM. These techniques do not require the user to specify test vectors.
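One of the refinement steps mentioned above, converting floating-point data types to fixed point, can be sketched as follows. This is our own minimal illustration in Python (real SLM refinements use bit-accurate C/C++/SystemC types); the small filter, the coefficient values, and the error tolerance are all hypothetical:

```python
# Sketch of a float-to-fixed-point refinement step (illustrative only;
# real refinements are written in C/C++/SystemC with bit-accurate types).

def to_fixed(x, frac_bits):
    """Quantize a real value to a fixed-point integer with frac_bits
    fractional bits (round to nearest)."""
    return round(x * (1 << frac_bits))

def from_fixed(x, frac_bits):
    return x / (1 << frac_bits)

def fir_float(samples, coeffs):
    """Algorithmic (floating-point) model of a tiny FIR filter."""
    return sum(s * c for s, c in zip(samples, coeffs))

def fir_fixed(samples, coeffs, frac_bits=8):
    """Hardware-friendly model: all arithmetic on fixed-point integers."""
    s_fx = [to_fixed(s, frac_bits) for s in samples]
    c_fx = [to_fixed(c, frac_bits) for c in coeffs]
    acc = sum(s * c for s, c in zip(s_fx, c_fx))  # products carry 2*frac_bits
    return from_fixed(acc, 2 * frac_bits)

samples = [0.5, -0.25, 0.125]
coeffs = [0.3, 0.6, 0.1]
err = abs(fir_float(samples, coeffs) - fir_fixed(samples, coeffs))
assert err < 2 ** -6  # quantization error stays within the chosen tolerance
```

Simulation can confirm the two models agree on a handful of vectors like this one, but, as noted above, it cannot establish that the refined model is equivalent for all inputs; that is what the formal techniques discussed next address.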
IEEE Design & Test of Computers, July/August 2009. 0740-7475/09/$26.00 © 2009 IEEE. Copublished by the IEEE CS and the IEEE CASS.

Figure 1. High-level design flow. (The figure shows the progression from an algorithmic design, through many steps of manual refinement and design-description optimization, to a high-level synthesizable description and then, via high-level synthesis, to a register-transfer-level description, with static/model checking and sequential equivalence checking applied between the levels.)

Static analysis methods
Static analysis does not execute and follow design behaviors globally; instead, it checks behaviors of the design locally. The most basic static methods are those used in lint-type tools, and there have been significant extensions for more detailed and accurate analysis.2 Designers generate control and data flow graphs from given design descriptions, and examine dependencies on control and data that result in various dependency graphs. With appropriate sets of rules, we can check various items. For example, we can detect uninitialized variables by traversing control/data dependencies to check whether a variable is used before it is assigned a value. Note that this search process is local and does not need to start from the beginning of the description. Whereas model-checking methods basically examine the design descriptions exhaustively, starting from initial or reset states, static checking analyzes design descriptions only in small and local areas. Instead of starting from initial states, static checking first picks up target statements and then examines only small portions of the design description that are very close to those target statements, as shown in the left part of Figure 2.

Because the analysis is local, it can deal with very large design descriptions, although it is less accurate. This means that static checking methods can produce false errors or warnings. Null pointer references can be checked with backward traversals of control and data dependencies, and relative execution orders among concurrent statements can be checked with backward and forward traversals starting from the target concurrent statements to be checked. Static checking methods, which have been applied to various software verification tasks, have proven highly effective even for very large descriptions having millions of lines of code. For example, Unix kernels and utilities have been automatically analyzed by static methods, which have revealed several design bugs.2 Also, researchers have been working to combine static and model-checking methods within the same framework for more efficient and accurate analysis.3

Figure 2. Static checking and model-checking approaches. (Static checking analyzes the description only locally, around a target statement; model checking starts from initial states and checks all possible paths under constraints.)

Model-checking methods
After designers apply static checking methods to the design descriptions, they can use model-checking methods to detect more complicated bugs that can be found only through systematic traversals starting from initial states or user-specified states. Given a property, model-checking methods determine whether all possible execution paths from initial states satisfy that property. Properties are conditions on possible values of variables across multiple time frames, such as "if a request comes, a response must be returned within two cycles" and "even if an error happens, the system eventually returns to its reset state." In one approach, the C bounded model checking tool, CBMC,4 verifies that a given ANSI-C program satisfies given properties by converting them into bit-vector equations and solving satisfiability (SAT) using a SAT solver.
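The flavor of such bounded, bit-precise property checking can be sketched as follows. This toy Python illustration simply enumerates a bounded input space; it is not how CBMC works internally (CBMC translates the program and property into a bit-vector SAT problem), and the program and property shown are our own hypothetical examples:

```python
# Toy bounded assertion check over all 8-bit inputs (illustrative only;
# CBMC encodes the program and property as a bit-vector SAT formula
# instead of enumerating inputs as done here).

WIDTH = 8
MASK = (1 << WIDTH) - 1

def saturating_add(a, b):
    """Program under verification: 8-bit saturating addition."""
    s = a + b
    return MASK if s > MASK else s

def check(prop):
    """Exhaustively check prop(a, b) for every 8-bit input pair; return a
    counterexample if one exists, else None (a proof for this bound)."""
    for a in range(MASK + 1):
        for b in range(MASK + 1):
            if not prop(a, b):
                return (a, b)  # counterexample
    return None

# Property: the saturating result never wraps below either operand.
cex = check(lambda a, b: saturating_add(a, b) >= max(a, b))
assert cex is None
```

Enumeration like this collapses for realistic bit widths and state depths, which is exactly why the SAT- and SMT-based encodings discussed next matter.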
One such solver is Chaff.5 All statements in C descriptions are automatically translated into Boolean formulas, and they are analyzed up to given specific cycles. Although CBMC and similar approaches can generally verify C descriptions, they cannot deal with large descriptions, because the SAT processing time grows exponentially with design size.

Model-checking methods for RTL or gate-level designs are mostly based on Boolean functions. These methods make each variable a single bit and use decision procedures for Boolean formulas, such as SAT procedures. In SLMs, however, there can be many word-level variables that have multiple bit widths, such as integer variables. If we expand all of them into Boolean variables, the number of variables can easily exceed the capacity of SAT solvers. Therefore, when reasoning about high-level descriptions, we commonly use word-level decision procedures. Extended SAT solvers, called satisfiability modulo theories (SMT) solvers, such as CVC,6 can deal with multibit variables as they are. They are similar to theorem provers and consist of several decision procedures.

Although reasoning about the SLM can be much more efficient with SMT solvers, realistic design sizes are still simply too large to be processed. Moreover, the types of formulas that word-level decision procedures can deal with are limited, and sometimes the formulas must be expanded into Boolean ones if they are outside the scope of word-level decision procedures. Therefore, researchers have proposed several automatic abstraction techniques, such as that proposed by Clarke et al.,7 by which large C design descriptions can be sufficiently reduced so that SAT/SMT solvers can quickly reason about the reduced design description. This automatic reduction is generally target-property dependent. That is, given a property to be verified, portions of the design description that do not influence the correctness of the property are automatically eliminated.

Depending on the characteristics of target properties, the amount of reduction varies. For example, if the target is to check whether array accesses never exceed array bounds, most of the design description is irrelevant and can be omitted for model checking. Abstraction-based approaches, however, have the problem of possibly generating false counterexamples; that is, the generated counterexamples are true only for the abstracted descriptions, but not true (executable) for the original design descriptions. The abstraction must be refined to avoid such false negative cases. A considerable amount of ongoing research concerns automatic abstraction refinement for model checking.7 State-of-the-art model checkers for SLMs can process tens of thousands of lines of code within several hours.

Sequential equivalence checking
Sequential equivalence checking (SEC) is a formal technique that checks two designs for equivalence even when there is no one-to-one correspondence between the two designs' state elements. In contrast, traditional combinational equivalence checkers need a one-to-one correspondence between the flip-flops and latches in the two designs. SLMs can be untimed C/C++ functions and have very little internal state. RTL models, on the other hand, implement the full microarchitecture, with the computation scheduled over multiple cycles. Accordingly, significant state differences exist between the SLM and RTL model, and SLM-to-RTL equivalence checking clearly needs SEC. Researchers have investigated SEC techniques,8-12 and commercial SEC tools are now available, such as the one from Calypto Design Systems. SEC can check models for functional equivalence even when they have differences along the following axes:

- Temporal differences at interfaces: The SLM typically has a parallel interface where all inputs arrive at the same time, whereas the RTL model accepts inputs serially.
- Differences in internal state and microarchitecture: Various sequential transformations in converting an SLM to an RTL model (such as scheduling of a computation over multiple cycles, pipelining, resource sharing to minimize area or power, and retiming of registers across logic) create significant divergence between the state machines in the SLM and RTL.

The output of the equivalence checking process is either a counterexample illustrating a scenario in which the behaviors of the SLM and the RTL differ, or a proof that the two are equivalent.

The use of equivalence checking to verify RTL functional correctness has two key advantages. The first advantage is complete verification of the RTL model with respect to the SLM. Unlike simulation-based or assertion-based approaches, in which functional coverage of the RTL model is an issue, SEC checks that all the RTL behaviors are consistent with those in the SLM.
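As a concrete (and much simplified) illustration of what sequential equivalence means here, the following Python sketch compares an untimed SLM-style function against a toy "RTL-like" model that accepts its inputs serially over three cycles. The models, the 8-bit datapath, and the brute-force comparison are all our own assumptions; real SEC tools reason symbolically rather than by enumerating inputs:

```python
# Toy sequential-equivalence check (illustrative only; commercial SEC
# tools prove this symbolically instead of simulating input samples).

MASK = 0xFF  # model an 8-bit datapath

def slm(a, b, c):
    """Untimed SLM: one 'transaction' computes (a*b + c) mod 2^8."""
    return (a * b + c) & MASK

class RtlModel:
    """RTL-like model: inputs arrive serially, and the computation is
    scheduled over three cycles through internal registers."""
    def __init__(self):
        self.regs = {}

    def cycle(self, phase, value):
        if phase == 0:                              # cycle 0: latch a
            self.regs["a"] = value
        elif phase == 1:                            # cycle 1: latch b, multiply
            self.regs["p"] = (self.regs["a"] * value) & MASK
        else:                                       # cycle 2: add c, drive output
            return (self.regs["p"] + value) & MASK

def rtl_transaction(a, b, c):
    # Temporal I/O mapping: one SLM call corresponds to three RTL cycles.
    m = RtlModel()
    m.cycle(0, a)
    m.cycle(1, b)
    return m.cycle(2, c)

# Sampled check over the input space (a real tool covers it exhaustively).
for a in range(0, 256, 17):
    for b in range(0, 256, 19):
        for c in range(0, 256, 23):
            assert slm(a, b, c) == rtl_transaction(a, b, c)
```

Note that the two models have no matching state elements at all: the SLM holds no state, while the RTL-like model carries registers across cycles. This is precisely the situation a combinational checker cannot handle and SEC can.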
This results in very high coverage of the RTL behaviors. It should be noted that SLM-to-RTL equivalence checking verifies only the RTL behaviors that are also present in the SLM. Thus, if the SLM has a memory implemented as a simple array while the RTL model implements a hierarchical memory with a cache, equivalence checking will not verify whether the cache is working as intended, as long as the overall memory system works as expected.

The second advantage of equivalence checking is simplified debugging. In case of a functional difference between the SLM and RTL, SEC produces the shortest possible counterexample that shows the difference. This contrasts with traditional simulation-based approaches, which may find the difference, but only after millions of cycles of simulation. The conciseness of the counterexample makes the process of debugging and localizing the error much more efficient.

Key technology in SEC
Four key technology pieces are required to make SEC feasible:

- Model extraction from the SLM
- Sequential analysis
- Word-level solvers
- A mechanism for specifying temporal mappings at I/Os and state points

In model extraction from the SLM, extracting the hardware model from SystemC or C/C++ with extensions is a key area where new techniques will be required. Four of the issues that the model extraction technology will handle are identification of persistent state, generation of state machines that are implicitly specified via wait statements, synthesis of channels and interfaces, and handling of arrays and pointers.

In sequential analysis, SEC requires technology for the efficient unrolling of finite-state machines (FSMs) to align the SLM and RTL state machines to synchronizing states, in which their output and next-state functions might need to be computed and compared. It requires efficient symbolic analysis, and state- and output-based induction procedures, to prove the correctness of the transactions of interest. To ensure that SEC can handle designs of sufficient complexity, it is also imperative to leverage the SLM's existence to find intermediate equivalences and decompose the problems given to the solvers.

Typical SLMs and RTL models use word-level arithmetic operators. Most existing formal verification tools synthesize these operators to gates and then use bit-level techniques, such as binary decision diagrams (BDDs) or bit-level SAT, to solve the resulting problems. To allow SEC to perform enough sequential unrolling, it is important to reason about arithmetic operators at the word level, as well as to develop solvers that can efficiently switch between word-level and bit-level techniques.

Finally, because the SLM and RTL model have different interface protocols, it is important for an SEC tool to provide mechanisms for a designer to specify the temporal relationships between corresponding I/Os and internal state points. Since specifying such relationships can be cumbersome and error-prone, it is also useful for the SEC tool to automatically infer these mappings. In flows where SEC is used in conjunction with high-level synthesis tools, it is often possible for the HLS tool to generate the I/O and state maps. This automation is crucial to making SEC usable in an HLS flow.

Deploying and using SEC
Design teams should follow four guidelines to effectively use SEC or other simulation-based techniques for keeping SLMs and RTL models functionally equivalent.

Coordination between SLM and RTL design teams. The teams designing and maintaining the SLM and RTL model must be well coordinated, and both need to realize the value of keeping the SLM and RTL models equivalent. This is naturally the case in HLS-based flows, but in flows where the RTL model is manually created, this needs to be enforced for SLM-to-RTL verification to work. The SLM team must be aware that bugs can often be found in the SLM when comparing the SLM and RTL model. Further, the modeling style might need to be adjusted to allow effective use of formal tools such as SEC.

Consistent design partitioning. SEC is a block-level verification tool, due to capacity limitations of formal technology. To effectively use SEC, it is crucial that the SLM and RTL model be consistently partitioned into subfunctions and submodules. Clean and consistent design partitioning provides an opportunity to use sequential equivalence checking at the level of individual SLM/RTL blocks.
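The sequential analysis described above can be illustrated, in a very reduced form, by explicit-state exploration of a product machine. The two toy FSMs below have no one-to-one state correspondence (a binary counter versus a one-hot ring register), yet produce identical outputs in every reachable product state; everything in this sketch is our own simplified construction, not the algorithm of any particular SEC tool:

```python
# Sketch: prove two FSMs with different state encodings equivalent by
# exploring all reachable states of their product machine (an explicit-
# state stand-in for the symbolic sequential analysis described above).

def step_binary(state, en):
    """Design A: 2-bit binary counter; output pulses as it wraps."""
    nxt = (state + 1) % 4 if en else state
    return nxt, int(state == 3 and en)

def step_onehot(state, en):
    """Design B: 4-bit one-hot ring register with the same behavior."""
    nxt = ((state << 1) | (state >> 3)) & 0xF if en else state
    return nxt, int(state == 0b1000 and en)

def product_equivalent(init_a, init_b):
    seen, frontier = set(), [(init_a, init_b)]
    while frontier:
        a, b = frontier.pop()
        if (a, b) in seen:
            continue
        seen.add((a, b))
        for en in (0, 1):                  # all input values
            na, oa = step_binary(a, en)
            nb, ob = step_onehot(b, en)
            if oa != ob:                   # outputs diverge: counterexample
                return False
            frontier.append((na, nb))
    return True                            # outputs agree on every reachable pair

assert product_equivalent(0, 0b0001)
```

Starting the two machines from misaligned initial states (for example, the counter at 1 and the ring at 0b0001) makes the same procedure return a failure, which is the explicit-state analogue of the counterexamples SEC tools report.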
Creation of SLMs with hardware intent. To perform SLM-to-RTL equivalence checking using a sequential equivalence checker, the SLM must be written to let the tool infer a hardware-like model statically from the source. This requires that the team create the SLM following certain coding guidelines that allow static analysis of the SLM. Use of statically sized data structures instead of dynamically allocated memory, explicit use of memories to reuse the same storage for multiple arrays instead of pointer aliasing, and statically bounded loops are some examples of constructs that make SLMs more amenable to SEC and HLS tools.

Orthogonalization of communication and computation. Clear separation between the computational and communication aspects of the SLM allows easier refinement of the communication protocol, if needed, to make the interface timing more closely aligned with that of an RTL model.

Challenges in SEC usage
The main challenges in the usage of SEC-based flows stem from the fact that SEC has capacity limitations. SEC's complexity is a function of the following factors:

SEC tools are typically effective when the size of the blocks being verified is between 500K and 700K gates and where the latency or throughput difference between the models is in the hundreds of cycles. If the SLM and RTL model to be verified exceed these limits, the design teams should find smaller subfunction (SLM) or submodule (RTL) pairs where SEC can be applied. This requires the SLM and RTL model to have consistent partitioning and clean interface mappings at the subfunction/submodule level.

Deploying SEC in HLS flows
High-level synthesis generates an RTL description from a C/C++/SystemC model.1 By using HLS, a designer's development time is greatly reduced via coding at a higher abstraction level than RTL. However, to take advantage of system-level design and its time savings, formal verification between the two design descriptions is required. Verification helps reveal any bugs in HLS tools and permits any manual modifications at the synthesized RTL description.

Figure 3 illustrates a working model in which an HLS tool is used. Algorithmic C/C++ models of the application (such as video codecs and wireless digital baseband modems) are the starting point. After manual hardware and software partitioning, blocks to be implemented as IP blocks are individually split (A, B, C in the figure). These C/C++ models are used