Last week we talked about the processors in embedded systems. This week we will discuss the programming languages used to program them.
Processors execute machine instructions. Machine instructions are simple commands like move, add, store, call, and compare. They are not something that most humans can read; they are just a bunch of binary values. Each command is very simple, so it can be carried out very quickly. Machine code looks like this (in hexadecimal):
4B0C480B 429A1842 4A0BD3F6 2300E002
If the proper sequence is given to a processor, the result can be something like the web browser that you are looking at right now. An incorrect sequence could crash the computer, destroy data, or shut down your nuclear power plant.
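To make the "bunch of binary values" concrete, here is a small sketch in C that turns one of the hexadecimal words shown above into the string of ones and zeros the processor actually sees. The names machine_code and word_to_bits are made up for this illustration.

```c
#include <stdint.h>

/* The four machine code words quoted in the text, stored as plain numbers. */
const uint32_t machine_code[4] = {
    0x4B0C480Bu, 0x429A1842u, 0x4A0BD3F6u, 0x2300E002u
};

/* Hypothetical helper: write the 32 bits of 'word' into 'out' as the
   characters '0' and '1', most significant bit first.
   'out' must have room for 32 characters plus the terminating '\0'. */
void word_to_bits(uint32_t word, char out[33])
{
    for (int i = 0; i < 32; i++) {
        uint32_t mask = UINT32_C(1) << (31 - i);
        out[i] = (word & mask) ? '1' : '0';
    }
    out[32] = '\0';
}
```

Calling word_to_bits on each entry of machine_code and printing the result gives four rows of 32 bits each, which is all the processor ever gets to work with.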
Machine code is really hard to get correct manually, so tools were created to help. A human-readable equivalent called assembly language was made. It has a one-to-one correspondence to the machine code. It looks like this:
ldr r0, =_sdata
ldr r3, =_edata
adds r2, r0, r1
cmp r2, r3
bcc CopyDataInit
ldr r2, =_sbss
b LoopFillZerobss
Each processor has its own instruction set, its own binary patterns that represent those instructions, and its own assembly language dialect. All assembly language dialects look similar, but they are not interchangeable. Nor are these instructions very “expressive”: each one does very little. So to get something complex done, you need to write a lot of assembly code.
If you change from one computer system to another, you have to rewrite all of your assembly code to match the instruction set of the new computer. This is one of the problems that the C programming language tried to solve.
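Judging by its symbol names (_sdata, _edata, _sbss), the assembly fragment above looks like part of a typical startup routine that copies initial values into RAM and zero-fills the uninitialized data area. That reading is an assumption, as are the function names and pointer arguments below, which stand in for the linker symbols. With that caveat, here is a minimal sketch of the same job in C. Unlike the assembly, it does not have to be rewritten for a different processor, only recompiled.

```c
#include <stdint.h>

/* Hypothetical helper: copy initial values, one word at a time, from where
   they are stored (for example flash) to where the program uses them
   (for example RAM), stopping when dst reaches dst_end. */
void copy_data_init(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Hypothetical helper: set every word from start up to (not including)
   end to zero. */
void fill_zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0;
    }
}
```

Each of these short loops takes several assembly instructions to write, and the instructions differ for every processor family. The C version stays the same.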
C was written in the early ’70s by Dennis Ritchie at AT&T Bell Labs as an alternative to assembly language, specifically for writing operating systems. Many other languages existed at the time, such as ALGOL, COBOL, FORTRAN, LISP, and even BASIC, but writing operating systems requires the low-level control of assembly language to get complex things done fast. Since assembly was hard to get right and an operating system used a lot of it, a language higher than assembly was needed, but one without the features that made COBOL and FORTRAN good for developing applications. What was needed was something like a super macro assembler, and that thing was C. (The name C doesn’t stand for anything. It is simply the successor to the language B.) *
Embedded systems live in this same space. They need very tight control since resources are restricted, but need a language expressive enough to get the system going. So, the overwhelmingly popular language for programming embedded systems is C.
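As an illustration of the kind of tight control meant here, the sketch below switches a single bit in a hardware-style register on and off. The bit position and the function names are hypothetical. On a real chip the pointer would be the register address given in the datasheet, while on a desktop you can pass the address of an ordinary variable to try it out.

```c
#include <stdint.h>

/* Hypothetical position of an LED control bit within an output register. */
#define LED_BIT 5u

/* Set the LED bit without disturbing the other bits in the register.
   'volatile' tells the compiler that every read and write must really
   happen, because hardware may be watching or changing the value. */
void led_on(volatile uint32_t *reg)
{
    *reg |= (UINT32_C(1) << LED_BIT);
}

/* Clear the LED bit, again leaving the other bits alone. */
void led_off(volatile uint32_t *reg)
{
    *reg &= ~(UINT32_C(1) << LED_BIT);
}
```

Each function typically compiles down to a handful of machine instructions, which is about as close to assembly as you can get while still writing something readable.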
C is a mature language, and it has been adapted to run on more processors than any other language. This is important in embedded systems, where the life of a processor depends on sales. Without compilers, nobody will buy a new chip, and without lots of sales, nobody will write a compiler for a chip. Many times, C is the only option because it is the only language with a compiler available for your processor.
Chip manufacturers are buying compiler companies, both to have compilers ready when their new chips are released and to drop support for their competitors' processors**. They may offer a free version of the compiler. The tradeoff is that support from the company is very expensive, or support comes only from the user community, or the free compiler will only generate programs up to a limited size.
Commercial C compilers for embedded systems run about $2500 for each developer, plus 30% per year for bug fixes and updates. You may have to budget extra for the debugger. That assumes that you stay within the same family of processors; the compilers typically won’t build code for other processors unless you pay for that option too. That is a serious amount of money unless you are backed by venture capitalists who consider $5 million chump change. A small startup with no income would rather pay wages than buy a $5000 development system. (I know this to be true.)
There are some really good free C compilers by the name of gcc. (I know this to be true too.)
C++ is becoming available on microcontrollers and, with judicious restraint, can be used successfully. Java needs a “virtual machine” to execute your program, and one exists for only a few embedded processors. This code is owned by Oracle and is available for about $2500 for a 1-year subscription.
Ada is a language used in systems where reliability is very important, such as transportation, satellite communications, avionics, and nuclear power plant control. Initially criticized for being overly complex and hence unreliable***, Ada is now reported to have a program error rate approximately 10% that of C****. For safety-critical systems, Ada would be a fine substitute for C.
Other languages never quite got any traction. General Motors used a version of Modula-2 (son of Pascal) to program their engine management computers, but it was never made available to others and was subsequently replaced by C. Dialects of BASIC and FORTH were used in development boards, but these boards were far too expensive to consider for large-scale products, much like the Raspberry Pi today.
Will another language take over from C in the future? I don’t know. Apple has open-sourced its Swift language; maybe it can be adapted to generate embedded programs.
C is getting more and more features in its old age, some very applicable to embedded systems. C is also falling out of fashion in desktop computer programming, so maybe embedded systems will be what keeps it going. The parts of C that make it dangerous to use are slowly being identified. Maybe someone will come up with a safer, compact language that can be put onto our little processors.
For a quick introduction to the C language from Brian Kernighan, see: Programming in C: A Tutorial.
* The definitive documentation of C was written in 1978 by Ritchie and a Canadian dude by the name of Brian Kernighan, and is known as “The C Programming Language” or just K&R. The book was updated once, in 1988, for the ANSI standard version of C, and no more. Kernighan is an old guy now, and Ritchie passed away in 2011.
A more up to date reference is “C: A Reference Manual” (5th Edition) by Harbison and Steele, 2002, Pearson. This book has a higher probability of being updated for the 2011 version of C in a future edition.
**** Ganssle on ADA
This post is part of a series. Please see the other posts here.