Roll Your Own 11:58 (2006.06.28:1)

Designing, building and finally running with a custom-crafted, free and open processor is a hacker's dream which dates well back to the early days of homebrewing. Martin Howse examines how soft hardware in the form of FPGAs goes some of the distance towards realising this promise.

[originally published in LinuxUser and Developer issue 54]

The quest for open hardware is a tough call well worthy of any contemporary hacker hunting down this veritable holy grail. Casting patent free CPU or other key hardware designs in silicon is out of the question given the vast associated planning and fabrication costs. Reconfigurable computing, courtesy of FPGAs (Field Programmable Gate Arrays) which allow for total control of a vast network of logic gates to create all manner of processor and device, opens up its own can of worms with the boundaries of software and hardware impossible to distinguish and thus increasing demand for freedom at all levels. Yet short of wirewrapping a bag full of logic ICs in a gesture of solidarity with the early days of hardware hacking in which free software has its roots, or hiding behind the abstractions of machine simulation or emulation, FPGAs present one of the only valid steps towards free or open hardware. And it's worth remembering that free hardware, in common with its software brethren, is no idle quest, but rather a most serious concern when viewed under the threatening light of contemporary DRM implementations and trusted computing initiatives. As well as offering political and social freedoms, in parallel with the culture of free software and copyleft, freely reconfigurable hardware furnishes the promise of freedom from established models of computing transfixed by the serial and the centralised; freedom for computation at a real core level.

Yet whilst on paper FPGAs do appear to offer a veritable yellow brick road to free computing, in practice such bleeding edge technology is hemmed in by proprietary concerns. Until very recently GNU/Linux users had little choice but to whine over Wine, run virtually with VMware and the like, or dual boot in order to undertake even the most basic FPGA-related operations. Costly and poorly documented tools restricted experimental usage to a hardcore of heavily bearded geeks ensconced in universities funnelling research into narrow fields. Things have changed in some respects: key manufacturers Xilinx and Altera now offer free as in beer, proprietary tools which at least run on most GNU/Linux systems, and plentiful open source initiatives address nearly all areas of FPGA use. Yet there is quite simply no complete, free software toolchain in existence for these versatile and conceptually powerful beasts, and those wishing to experiment must, hopefully only temporarily, embrace the proprietary. The stranglehold of vendors, locking down their products with code and patents, has given rise to many flamewars on mailing lists and the like, with newcomers failing to grasp the complexity of the situation, and others twisting arguments to attack the very principles of open source. In order to understand the magnitude of the task faced by the free hardware movement, it's necessary to run through both theory and practice in the contemporary FPGA scene.

Soft is hard

FPGAs present the absolute meeting point of hardware and software. Abandoning the physical, hardware flips into software and vice versa, blurring the boundaries at the magical point of execution and opening up a field of intriguing experimentation. As key media theorist Friedrich Kittler explains in his seminal essay There is No Software, software erodes and obscures hardware but is itself constrained and dependent on the physical. Both sides of the equation vanish under the signs of execution and noise and FPGAs well demonstrate this fact especially with reference to the strange position they occupy within the worlds of free software and heavy industry. Experience with FPGA development tools which allow for language-based code chunks to be manipulated as hardware components within a graphical schematic diagram is enough to unravel the vast array of concerns here.

Without running through the well rehearsed historical relation of Boolean logic to the computational, with reference to Boole, DeMorgan and most importantly Claude Shannon, who enabled the translation of such theory into hardware and thus the physical embedding of notation, it's easy to see that FPGAs are well rooted in this field as their very name would suggest. In the early 60s, and subsequently within the golden age of homebrewing, discrete logic, individual chips offering a specific logic function, would be wire wrapped to offer fully functional systems. Nearly all computer systems could thus be effected, spaghetti style, but the sheer complexity and lack of flexibility in design for large scale systems would be incredibly prohibitive. Burning logic into silicon to arrive at CPU and memory, the historical next step, takes care of complexity but certainly not flexibility. Hardware was shut down design-wise within the hermetically sealed clean rooms of industry. PLDs (Programmable Logic Devices) helped to keep things open, offering programmable connectivity between a limited number of logic gates. CPLDs, with the word complex prefixing the terms of the good old programmable logic device, took things a stage further, offering an extra configurable layer. The interconnections between well defined PLDs could readily be programmed. CPLDs still see good use today, often in concert with beefier FPGAs, their seriously scaled up older brothers.

FPGAs, which now pack in over a million logic gates and are thus capable of supporting complex CPU or SoC (System on a Chip) designs, allow for a matrix of programmable logic blocks. Blocks are configured to perform a specific logic function such as AND, and switches are then programmed to join such blocks or cells. Each logic cell comprises a small lookup table, from which the logic is selected, some gates and a D-flipflop, or basic memory cell. Without exposed input and output FPGAs would prove more than useless, and indeed non programmable, and thus a vast array of I/O cells and pins populate the edges of the chip. Dedicated memory blocks, and fast dedicated routing lines round out the advanced picture of a contemporary FPGA.
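The logic cell described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any vendor's actual cell: the four-input table size and the names LogicCell and configure are illustrative only. The point it makes is that the "logic" of a cell is nothing more than a handful of stored configuration bits addressed by the inputs.

```python
from itertools import product


class LogicCell:
    """Toy model of one logic cell: a 4-input lookup table feeding a D flip-flop."""

    def __init__(self, lut_bits):
        # lut_bits holds 16 configuration bits, one per input combination.
        if len(lut_bits) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 configuration bits")
        self.lut_bits = list(lut_bits)
        self.q = 0  # value currently held by the D flip-flop

    def combinational(self, a, b, c, d):
        # The four inputs simply form an address into the table.
        index = (a << 3) | (b << 2) | (c << 1) | d
        return self.lut_bits[index]

    def clock(self, a, b, c, d):
        # On a clock edge the flip-flop captures the current LUT output.
        self.q = self.combinational(a, b, c, d)
        return self.q


def configure(function):
    """Build the 16 configuration bits for any 4-input Boolean function."""
    return [function(a, b, c, d) & 1 for a, b, c, d in product((0, 1), repeat=4)]


# The same cell becomes an AND gate or a parity checker purely by configuration.
and_cell = LogicCell(configure(lambda a, b, c, d: a & b & c & d))
xor_cell = LogicCell(configure(lambda a, b, c, d: a ^ b ^ c ^ d))
```

Reprogramming the cell means nothing more than loading a different list of sixteen bits, which is essentially what a configuration file does for every cell on the chip at once.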

Most contemporary FPGAs use SRAM or Flash to define the grid of connections and thus they can readily be reprogrammed instantly in the field, or even by their own hand. It's simply a question of defining the logic and connections which can then be uploaded as a binary file to configure the FPGA. And, as is to be expected when it comes down to binary matters, this is where the proprietary creeps in with such binary files, or bitstreams, nailed down by the big two FPGA manufacturers, Xilinx and Altera.

High level

What's totally clear is that we need a consistent way of describing our logic and interconnects at a sufficient level of abstraction and a plethora of HDLs, or hardware description languages, have been devised to attack this problem. The two major standard driven players, VHDL (Very high speed integrated circuit Hardware Description Language), and Verilog, both of which are also used in regular chip design, offer quite different approaches and descriptive means with VHDL as a higher level, more behavioural affair heavily based on the Ada programming language. Verilog can readily be used in a behavioural manner but also attacks low level logic. What was once the domain of serious flamewar, VHDL versus Verilog, is still a tough call which quite simply boils down to personal taste once we're outside the complex domain of market forces. Verilog is touted as easier to learn, particularly for those used to the C programming language, but VHDL example code is also plentiful within the community and easy enough to follow. Indeed it's tough to say which HDL is more favoured within the open source world, as both receive equal attention as to distributed source, tools such as editors and simulators, and tutorial material. The LGPLed LEON2 processor design of European Space Agency fame makes good use of VHDL, yet many large scale projects listed elsewhere, particularly with reference to opencores.org, rely on Verilog. Whichever HDL takes your fancy, time and place are perhaps the most important issues facing any hardware coder. Sequential operation may well be the intended modus operandi of the yet to be designed soft CPU, but within our HDL we have to take care of all manner of complex clock related issues implied both in concurrency and operations in sequence whilst at the same time taking care of our available spatial resources. Hard coding is certainly not for the faint hearted.
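The difference between concurrency and operations in sequence is easier to see in a small sketch. The Python below is not an HDL and makes no claim to model VHDL or Verilog semantics in full; the helper clock_edge and the register names are hypothetical. It shows only the one idea that trips up most software programmers: on a clock edge every register takes its new value from the old state, all at once.

```python
def clock_edge(state, next_state_logic):
    """Advance every register at once, as hardware does on a clock edge."""
    # All next values are computed from the OLD state first, and only the
    # complete set of results becomes the new state.
    return {name: logic(state) for name, logic in next_state_logic.items()}


# Two registers wired to exchange their contents on every clock edge.
swap_logic = {
    "a": lambda s: s["b"],
    "b": lambda s: s["a"],
}

state = {"a": 1, "b": 0}
concurrent = clock_edge(state, swap_logic)

# The same two assignments executed one after another, software style:
# the second assignment sees the result of the first.
sequential = dict(state)
sequential["a"] = sequential["b"]
sequential["b"] = sequential["a"]
```

In the concurrent version the two values swap; in the sequential version both registers end up holding the same value. An HDL has to let the designer express the former, which is why time is such a central concern.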

Indeed, it's tempting to avoid the soft look at hardware altogether in systems design and focus instead on a more traditional electronics engineering methodology, the schematic diagram. Integrated tools within both Altera and Xilinx' mammoth workflow environments allow for simple graphical entry of schematics, though only the most basic designs should be managed in such a manner. Alternative techniques, such as the use of JBits to manipulate low level bitstreams by way of a Java API, abound but the use of such tools demands a more thorough understanding of the FPGA-based design process. And there are certainly a good many more HDLs out there matching up to the vast diversity of contemporary programming languages. You'll find seriously enticing Hydra, an HDL in Haskell, the very attractive Python-based MyHDL, lavaHDL, rubyHDL and JHDL. Confluence, an HDL with GPLed compiler, which by way of FNF, the Free Netlist Format, can generate VHDL or Verilog code, is also attracting a good deal of attention. Yet, before an overview of such an array of toolkits and means of translation between high level abstractions can be attempted, it's well advised that those new to the territory equip themselves with a knowledge of one major HDL such as VHDL, and the general workflow involved in implementing any design.

Descending the tree of abstraction

The rules of software development well apply even if we're now in the terrain of ill-defined hardware specification. Abstraction is still the name of the game, and just as software development starts at the highest level, before proceeding by way of debugging and optimisation to consider lowly matters, so FPGA-led workflow is an exercise in descending abstractions starting from our preferred top level HDL or schematic description. It's worth noting before we walk down our ladder, that all steps, including entry of HDL code, are well contained within the ISEs (Integrated Software Environments) supplied by the big two manufacturers, though all stages can readily be substituted by way of alternative tools.

First up is synthesis, with the description transformed into what's known as a netlist: a description of gates and interconnections, of which the FNF format noted above is one example. We're still well removed from actual hardware, so next up implementation tools map our netlist over the FPGA in question. Translation, mapping, and the twins of placement and routing are the key processes here, whereby CLBs (Configurable Logic Blocks) are decomposed into LUTs (Look-up Tables) dictating lowest level logic which is then pinned down and physically routed. Now that we have a precise low level description of our logic we can descend one more step to generate the bitstream which is downloaded to a real chip. Onboard switches route and reconfigure. It's time for the magical moment of execution as a new chip is born. Or so we hope, for as in all such efforts a range of practical considerations cloud the picture. Testing and parallel debugging practice are essential, and open simulation tools such as Icarus Verilog, which can also synthesise into a range of netlist formats, are readily available. And before we even get so far a host of practical issues is encountered as soon as we attempt our basic workflow under GNU/Linux.
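To make the first two steps less abstract, here is a minimal sketch of a netlist and of mapping it onto lookup tables. The dictionary format, the net names and the function map_to_lut are assumptions made for illustration; real netlist formats and real mapping tools are far more involved, and placement and routing are not attempted here at all.

```python
from itertools import product

# A netlist as a plain dictionary: output net -> (gate type, input nets).
# It describes a one-bit full adder; all net names are illustrative.
NETLIST = {
    "axb":  ("XOR", ("a", "b")),
    "sum":  ("XOR", ("axb", "cin")),
    "ab":   ("AND", ("a", "b")),
    "cab":  ("AND", ("axb", "cin")),
    "cout": ("OR",  ("ab", "cab")),
}

GATES = {
    "AND": lambda x, y: x & y,
    "OR":  lambda x, y: x | y,
    "XOR": lambda x, y: x ^ y,
}


def evaluate(netlist, net, inputs):
    """Resolve the value of a net by walking back through the gates driving it."""
    if net in inputs:
        return inputs[net]
    gate, sources = netlist[net]
    return GATES[gate](*(evaluate(netlist, s, inputs) for s in sources))


def map_to_lut(netlist, output, input_names):
    """Collapse all the gates behind one output into the contents of a single LUT."""
    table = []
    for values in product((0, 1), repeat=len(input_names)):
        table.append(evaluate(netlist, output, dict(zip(input_names, values))))
    return table


# Five gates become two three-input lookup tables.
sum_lut = map_to_lut(NETLIST, "sum", ("a", "b", "cin"))
cout_lut = map_to_lut(NETLIST, "cout", ("a", "b", "cin"))
```

Each resulting list is indexed by the inputs read as a binary number, with a as the most significant bit. What remains for the real tools is to decide which physical cell holds each table and which wires connect them.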

Free flow

In order to provide concrete evidence of such a workflow certain choices must be made. With many open source projects making use of Xilinx devices, and free software friendly XESS corporation supplying decently priced Xilinx-led development boards, the choice requires little effort. XESS helpfully provide open code for utilities which upload configuration to the development board by way of the parallel port so various ports have been attempted to GNU/Linux. Other low level utilities are also made available. Their online documentation is second to none when it comes to using Xilinx' own free as in beer WebPACK ISE and they outline a huge range of design projects, with full source, which not only provides reusable code, but acts as excellent tutorial material, particularly with reference to VHDL. You'll also find helpful hints and tips including the use of Makefiles to automate workflow.

XESS' tutorials for the WebPACK software form an excellent introduction both to use of the ISE and generic FPGA design and implementation. Two simple example tutorials targeted at a range of XESS boards including the budget XSA-50 board exercise the on board seven segment LED and also demonstrate simple bit level communication between FPGA and PC. Though heavily Windows-biased, with multiple screenshots of the ISE running on that platform, these tutorials translate well to the ported interface and those running with a well setup WebPACK will find more than enough information which can readily be extrapolated from XESS' own datasheets and those of Xilinx to venture into pretty advanced territory.

XESS offer a concise range of small Xilinx Spartan-based development boards with generous I/O which can well be extended using their cunningly named XStend boards which add video inputs and audio input and output alongside funky switches, lights and prototyping area. Interface by way of Ethernet, USB, RS-232 and IDE make the larger extended boards an excellent choice for GNU/Linux loaded SoC experimentation. Yet the very first stages in working with the XESS board, through trying to test basic functionality of the CPLD which links board to PC, truly prove that the field of FPGA endeavour still has much work to do in running well with free software platforms. XESS provide either broken links or point to older ports of basic test software which refused to compile under a modern stock system. Only an RPM for these xstools from Columbia University proved adequate to the job. Installing the free Xilinx WebPACK which consists of both ISE and command line tools was an experience typical of working with products from large companies possessing little if any GNU/Linux experience. One installer simply doesn't work and only the 400 MB installer from Xilinx succeeds in its purpose after being upgraded with the latest service pack. At this point running GNU/Linux is purely a convenience or rather, practically speaking, a damn inconvenience. In terms of look, feel and approach we're very much in Windows world but at least we have our tools and can get down to work.

It's perhaps worth rehashing our workflow, now placed within the full context of the Xilinx ISE. After defining the features of a new project, for example details such as FPGA model and HDL, we can begin to pin down the design through specifying the input and output buses for our circuit. Working with VHDL code within a highlighting source code editor, which has already filled out some code detail in template fashion, we can readily create hardware modules or components. Again the XESS tutorials provide a well illustrated walk through of GUI tasks within this framework. Syntax can usefully be checked for errors and these components can later be tied together to form our complete system through defining a schematic layout. Again the correctness of the diagram, for example as to matching data bus widths, can be readily checked. We're only now ready for our second stage, synthesis, through launching the Xilinx XST tool. Yet before we can implement our design and thence generate our FPGA-banging bitstream, we need to tell the ISE a bit more about the real world, for example exactly what pins map to the input and output of the segmented LED. Such is defined within an implementation constraints file which we can easily generate with a GUI. Pin assignments can readily be gleaned from data sheets. Under simple designs, implementation and bitstream generation are thoroughly automated affairs, with the former able to be well investigated through a range of graphical interpretations of our chip. Time now to break out a shell, launch xsload with correct options and within no time we have a custom chip.

Further online community documentation provides a great source of tutorial and inspiration with a range of projects, ranging from retro style re-implementation of intriguing ancient systems such as the transputer to innovative micro CPU designs employing ingenious architectures, for example a Forth-based processor.

The OpenCores project is a more serious endeavour which is perhaps less readily accessible for new FPGA hackers but which nevertheless provides an essential resource within the field of open hardware. Through the site opencores.org, free cores or rather digital modules, which range in complexity from stepper motor interface to complete SoC, are published and documented in unencumbered manner for re-use and further development.

Open bitstreams

Yet, without totally free tools the open hardware movement can only travel so far. The hardware may well be soft but without specs it's tough closed stuff and we may as well resort to simulation. We've seen how open high level tools abound, but it's the low level specs which the vendors are keeping close to their chests. Bitstream data is well obscured and likely to remain so. In this sad light FPGAs could well be viewed as more closed than most CPUs. Xilinx' JBits effort, which exposes a low level API and packs in a Java library to access CLBs and routing, gets us a bit closer to the metal, but it's still far from libre. Development of a truly free FPGA architecture and toolchain is far from possible and reverse engineering to further create low level tools such as place and route would be arduous to say the least. The MPGA, or meta FPGA project, takes another rather tortuous approach to freeing up low level FPGA specs, through implementing or rather simulating a free and open FPGA on top of a closed device. It's an appealing if somewhat tricksy notion which has borne little fruit so far.

JBits is probably the closest call worthy of full blooded investigation. And although hacker and all round FPGA expert Neil Franklin closed down his VirtexTools project with some degree of despair in late 2003, its code and documentation are well worthy of a look, coming very near indeed to actually freely generating an FPGA bitstream. It's interesting to note that this project grew from Franklin's effort to implement an ancient PDP-10 mainframe, a key machine within the history of free software, inside an FPGA.

Under reconfigurable computing, the physical is very much in the hands of the machinic at a very fine grained level. Indeed, this field of enquiry marks an essential crossing point or even oscillation between the digital and the analogue. Under such an emphasis hardware seems to win out over software, and we could well agree with Kittler when he argues that "there is no software," and thus we can pursue questions of real freedom given such a lockdown. Our hardware is opaque and perhaps even free software is an illusion supported only within simulation.

Grow your own

In the face of the often minute complexities involved in both the design and workflow of advanced electronic systems, one tempting path, already enacted within mainstream software engineering, is to turn to nature for inspiration and leverage evolutionary strategies for implementation. Or to put it simply, why not breed your own reconfigurable hardware? And, by way of a similar logic and approach, we can well envisage highly flexible fault tolerant hardware systems, which can even undertake repairs on themselves. That various space agencies have shown keen interest in such projects is a good sign. A self healing system can well head out for a long space voyage without fear of accidental demise. And within the increasingly common context of embedding multiple hard or soft processors, such as the PowerPC or Xilinx MicroBlaze, within a vast FPGA, the prospect of a flexible OS surrounded by reconfigurable peripherals is an attractive option; indeed uClinux has already been ported to the latter soft core.

It's an appealing strategy on paper, but the practice is somewhat more demanding. Downloading arbitrary bitstreams to nearly all contemporary FPGAs will most likely result in irreparable damage to the chip by way of signal contention, with two connected logic gate outputs driven to opposing levels. At the same time, whilst early experiments using more tolerant hardware pursued a low-level GA (Genetic Algorithm) approach, with bitstream as genotype, questions of syntax, semantics and modularity have recently suggested new approaches. Again JBits is an essential resource for those pursuing open research in this field.
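The low-level GA approach, with bitstream as genotype, can be sketched in simulation without risking any silicon. Everything below is an assumption made for illustration: the genotype is the sixteen configuration bits of a single hypothetical four-input lookup table, the target behaviour is a parity function, and the population size, mutation rate and selection scheme are arbitrary choices rather than anything drawn from published experiments.

```python
import random

GENOME_BITS = 16  # configuration bits of one hypothetical 4-input LUT


def target(index):
    # Desired behaviour: output 1 when an odd number of the 4 inputs are set.
    return bin(index).count("1") % 2


def fitness(genome):
    # Number of input combinations for which the configured LUT is correct.
    return sum(1 for i in range(GENOME_BITS) if genome[i] == target(i))


def mutate(genome, rng, rate=0.05):
    # Flip each bit independently with a small probability.
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def crossover(mum, dad, rng):
    # Single-point crossover between two parent genomes.
    cut = rng.randrange(1, GENOME_BITS)
    return mum[:cut] + dad[cut:]


def evolve(generations=60, population_size=20, seed=1):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENOME_BITS)]
                  for _ in range(population_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        if history[-1] == GENOME_BITS:
            break
        parents = population[:population_size // 2]
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
                    for _ in range(population_size - 1)]
        # Elitism: the best genome survives unchanged into the next generation.
        population = [population[0]] + children
    return max(population, key=fitness), history


best, history = evolve()
```

The fitness function here inspects a simulated table. On real hardware each candidate would have to be downloaded and measured, which is exactly where tolerance of rogue bitstreams becomes a hard requirement.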

Kicking off the first studies in this intriguing domain, Dr. Adrian Thompson is the person most people will associate with evolving hardware. His seminal experiments way back in 1997 expose many of the well argued issues associated with reconfigurable hardware and indeed the very dependence of software on a physical substrate and the erosion of its autonomy. Using a Xilinx 6216 chip, part of a family which was seriously tolerant of rogue bitstreams, over the course of three weeks he was able to evolve a functional circuit which could distinguish between two square wave inputs, one oscillating at a frequency of 1 kHz, the other at 10 kHz. It's a task which one could argue as being purely analogue, but digital circuits can readily attempt it if clock driven. The winning evolved circuit was incredibly efficient, occupying only a ten by ten celled corner of the FPGA, and more importantly verged on the impossible; the cells had no access to a clock or any other resource which could provide timing clues against which to compare the frequencies. Even more interesting results were to come when Thompson tried shifting the cell block to another portion of the silicon. The results changed markedly, indicating that the naughty little evolved circuit was taking advantage of minute electronic and electromagnetic interactions at a supremely low level. The very same kind of cells on the very same FPGA responded differently within the rich network of logic and feedback. It did prove possible to evolve new, successful circuits in this fresh region but such experiments demonstrated the very dependence on a supremely detailed physical substrate; the triumph of noise over design in a shift away from digital approaches to complex systems implementation.
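For contrast, the conventional clock driven solution to the same task takes only a few lines. The sketch below is emphatically not Thompson's evolved circuit, which had no clock to lean on; it assumes a 100 kHz reference clock, an ideal square wave and a 5 kHz decision threshold, all chosen purely for illustration.

```python
SAMPLE_RATE_HZ = 100_000  # assumed reference clock used to sample the input


def square_wave(frequency_hz, n_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Ideal square wave, high during the first half of each period."""
    period = sample_rate_hz // frequency_hz  # samples per period
    return [1 if (n % period) < period // 2 else 0 for n in range(n_samples)]


def count_rising_edges(samples):
    # A rising edge is a 0 followed directly by a 1.
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev == 0 and cur == 1)


def is_high_tone(samples, threshold_hz=5_000, sample_rate_hz=SAMPLE_RATE_HZ):
    # Compare edges per second against the threshold using integer arithmetic.
    edges = count_rising_edges(samples)
    return edges * sample_rate_hz >= threshold_hz * len(samples)


slow = square_wave(1_000, 1_000)   # 10 ms of the 1 kHz input
fast = square_wave(10_000, 1_000)  # 10 ms of the 10 kHz input
```

A counter and a comparator suffice once a clock is available, which makes it all the more striking that the evolved circuit managed without one.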

Key links

Xilinx: http://www.xilinx.com

Altera: http://www.altera.com

Friedrich Kittler: http://www.ctheory.net/text_file.asp?pick=74

Hamburg VHDL archive: http://tech-www.informatik.uni-hamburg.de/vhdl

Hydra: http://www.dcs.gla.ac.uk/~jtod/Hydra

MyHDL: http://www.jandecaluwe.com/Tools/MyHDL/Overview.html

Confluence: http://www.confluent.org/wiki/doku.php

Icarus Verilog: http://www.icarus.com/eda/verilog

XESS: http://www.xess.com

Xstools RPM: http://www1.cs.columbia.edu/~sedwards/software/xstools-4.0.3-1.i386.rpm

Opencores: http://www.opencores.org

FPGACPU: http://www.fpgacpu.org

Opencollector: http://opencollector.org

MPGA: http://sourceforge.net/projects/mpga

JBits: http://www.geocities.com/:/Pines/6639/fpga/jbits.html

VirtexTools: http://neil.franklin.ch/Projects/

Adrian Thompson: http://www.informatics.sussex.ac.uk/users/adrianth/ade.html


Xilinx' WebPACK ISE covers all the bases when it comes down to the full FPGA workflow, from basic project description, through HDL editing and syntax checking, constraints input and synthesis, place and route right down to bitstream generation. Under GNU/Linux it's decidedly ugly, very slow and at times heavily dysfunctional.