
SILICON UPDATE by Tom Cantrell

More Than a Core

While examining 32-bit MCUs last month, Tom decided that the STMicroelectronics STM32 was worth a second look. With the new ARM Cortex M3 core, good peripherals, integration, and energy efficiency, this could be just the MCU for your next project.

Having covered the territory last month (“More Bits, Less Filling,” Circuit Cellar 212, 2008), it’s not my intention to get stuck on the topic of 32-bit MCUs. Believe me, there’s plenty of other neat stuff going on with FPGAs, wireless, sensors, and other wonders of the silicon age. Nevertheless, if you have anything to do with embedded systems, you need to stay up to speed with the latest hot rod chips or you’ll get left behind.

In some ways these fast and furious MCUs remind me of the brand new Tesla Motors high-performance electric vehicle just now hitting the streets. It’s got the efficiency and green aspects of a golf cart, but can smoke the tires when you punch it. The big difference is that the 32-bit MCUs don’t cost an arm and a leg, but in fact are a luxury any designer can afford.

So this month, you’re invited to look over my shoulder as I pop the hood on the STMicroelectronics STM32 (see Figure 1). You’ll recall from last time that its main claim to fame is the use of the new ARM Cortex M3 core. Sure, that’s newsworthy, but there’s more to the STM32 than that.

WORLD BEYOND CORE

Indeed, over the years, I’ve come to the conclusion that “core wars” debates have become less relevant, especially for blue-collar embedded apps. Maybe it’s just battle fatigue, having seen so many architectures march off to war. Remember way back in the mainframe years (1960–1970s) when companies like Univac, Burroughs, and Honeywell challenged IBM with “better” architectures? All dead and gone.

Then there were the fabulous minicomputers such as the Data General Nova and the Digital Equipment VAX. Like teenagers, they seemed invincible. “Nova” indeed. Who would have thought these shining stars would burn out so fast?

The microprocessor was barely born before it headed into battle. Early 8-bit skirmishes foreshadowed the epic struggle between the Intel ’x86 and the then Motorola 68K, a battle that counted a myriad of upstart architectures as collateral casualties. May the 88K, i860, Clipper, 29K, and all of the others rest in peace.

True believers are entitled to pitch their favorite architecture and poo-poo the others. Taking nothing away from Cortex M3, the fact is that all of the

[Figure 1: block diagram. The Cortex-M3 core connects through the ICode, DCode, and System buses to the flash interface and flash memory, SRAM, and a seven-channel DMA controller. The AHB system bus feeds two bridges: APB2 (GPIOA–GPIOE, EXTI, AFIO, USART1, SPI1, ADC1, ADC2, TIM1) and APB1 (USART2, USART3, SPI2, I2C1, I2C2, USB, CAN, BKP, PWR, WWDG, IWDG, TIM2, TIM3, TIM4). Peripherals return DMA requests to the DMA controller.]

Figure 1—The ARM Cortex M3 core is the attention-getter in the new STM32 MCU from STMicroelectronics. But there’s more to an MCU than a processor core, including lots of flash memory, fast SRAM, and a bunch of I/O.

Issue 213 April 2008 CIRCUIT CELLAR® www.circuitcellar.com

Package pins                 36         36         48         48         48         64         64         64         100        100
Flash (KB)                   32         64         32         64         128        32         64         128        64         128
SRAM (KB)                    10 (6)     20 (10)    10 (6)     20 (10)    20 (16)    10 (6)     20 (10)    20 (16)    20 (10)    20 (16)
General-purpose timers       2          3          2          3          3          2          3          3          3          3
Advanced control timer       1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)
SPI                          1          1          1          2          2          1          2          2          2          2
I2C                          1          1          1          2          2          1          2          2          2          2
USART                        2          2          2          3          3          2          3          3          3          3
Full-speed USB 2.0           1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)
CAN 2.0B                     1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)      1 (0)
12-bit 1-µs A/D (× ch)       2 (1) × 10 2 (1) × 10 2 (1) × 10 2 (1) × 10 2 (1) × 10 2 (1) × 16 2 (1) × 16 2 (1) × 16 2 (1) × 16 2 (1) × 16
General-purpose I/Os         26         26         37         37         37         51         51         51         80         80
CPU frequency (MHz)          72 (36)    72 (36)    72 (36)    72 (36)    72 (36)    72 (36)    72 (36)    72 (36)    72 (36)    72 (36)

Table 1—STMicroelectronics blasts off the starting line with a full complement of 20 STM32 parts, divided equally between “Performance” and “Access” lines. In this table, the “Access” line features are shown in parentheses where they differ from the “Performance” line. Another difference is that both lines come standard with a –40° to 85°C temperature range, but the “Performance” parts also have an extended temperature range (–40° to 105°C) option.

major 32-bit MCUs (including the ARM7 and ARM9 chips ST also offers) are fully capable of getting the job done in most applications.

Look at a die photo of any 32-bit flash MCU and what you’ll find is a little processor core stuck in the corner, dwarfed by surrounding memory and I/O silicon. The fact is, while the architecture chosen for the core may be the sizzle, it’s the implementation of an entire chip that’s the steak.

FLASH FOR CASH

Sure, architecture has an impact on performance, but so do a lot of other things, starting with bus bandwidth. The differences (relatively minor actually) in the way competing architectures choose to deal with instructions and data don’t matter nearly as much as how fast a particular chip can actually do it.

In the blue-collar space these chips target, we’re generally talking about non-cache implementations. That means flash (i.e., instruction fetch) bandwidth is a critical limiting factor. The STM32 comes in two flavors, “Access” and “Performance,” with a major difference being that the former runs up to 36 MHz and the latter to 72 MHz (see Table 1). Just keep in mind that higher clock rates require 0, 1, or 2 flash wait states for clock rates up to 24, 48, and 72 MHz, respectively.

If something isn’t done, wait states can lead to the awkward situation where more “megahertz” means less performance. It’s no surprise that most 32-bit MCUs devote silicon to the cause of getting around the flash bottleneck. The STM32 is no exception, using a 64-bit wide flash bus in conjunction with two 64-bit prefetch buffers to hide the flash latency. Even though this simple prefetch scheme is relatively unobtrusive, there may be times when you’d prefer to turn it off, which the STM32 allows you to do.

If you really need max MIPS, take advantage of the fact that the STM32 allows execution of code from the on-chip SRAM at full speed. You can use the SRAM as a “programmer directed cache,” preloading it with performance-critical routines such as DSP inner loops and interrupt handlers.

Just remember that a MIPS rating is only half the story. You can crank through all of the instructions you want, but nothing useful happens until data makes its way to and from the pins. As a practical matter, the on-chip I/O devices are just as important as the processor core itself in achieving peak system performance.

I/O U

I/O throughput starts with the number and performance of the on-chip I/O devices themselves. The STM32 has a lot of fast I/O, but that can actually be a curse if the I/O traffic clogs available bus bandwidth and demands a lot of handholding by the processor. The STM32 avoids that pitfall with multiple on-chip I/O busses to boost bandwidth and a powerful seven-channel DMA controller that offloads the processor of I/O grunt work.

Another way to boost bus bandwidth is to demand less of it in the first place. As I went through the specs, I was impressed with the way the STM32 uses “smart” I/O devices that take care of their own dirty laundry rather than bugging the processor to do it for them.

Even the simple stuff such as serial and parallel I/O is pretty fancy these days. Every STM32 I/O line is individually programmable as input (pull-up and pull-down options) or output (push/pull or open collector with output drive strength options). I/O lines are also 5-V tolerant and can source/sink a whopping 25 mA, with the not unexpected caveat that total chip current is limited to 150 mA. A measure of port-remapping capability enables juggling peripheral pin assignments to best fit a particular application.

As I’ve noted before, the traditional RISC load/store architecture is problematic for “atomic” bit operations because an interrupt might occur between the load and the store. The


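[Figure 2: timing diagram showing inputs TI1 and TI2 and the counter value through forward, jitter, backward, jitter, and forward phases; the counter counts up, then down, then up.]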

Figure 2—Smart timers are needed to enable real-time applications to handle tasks in hardware that would otherwise bog down the processor core. The Encoder mode of the Advanced Control Timer (ACT) included in STM32 “Performance” parts is a good example. It automatically monitors the phase relationship of two inputs and keeps track of the cumulative count.
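To make the Encoder mode in Figure 2 a little more concrete, here is a minimal software model of the same idea: watch the phase relationship of two inputs and keep a running count. This is only a sketch of generic quadrature decoding, not ST’s hardware or register interface. The names `quad_decoder` and `quad_update`, and the choice that “forward” means TI1 leads TI2, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of a quadrature (encoder) counter. */
typedef struct {
    uint8_t prev;   /* previous 2-bit input state: (TI1 << 1) | TI2 */
    int32_t count;  /* cumulative position count */
} quad_decoder;

/* Step table indexed by (prev << 2) | curr.
 * Assumed forward sequence (TI1 leads TI2): 00 -> 10 -> 11 -> 01 -> 00.
 * +1 = forward step, -1 = backward step, 0 = no change or invalid jump. */
static const int8_t QUAD_STEP[16] = {
     0, -1, +1,  0,   /* prev = 00 */
    +1,  0,  0, -1,   /* prev = 01 */
    -1,  0,  0, +1,   /* prev = 10 */
     0, +1, -1,  0    /* prev = 11 */
};

/* Feed one sample of both inputs; the count moves up or down by one
 * for each valid edge, so jitter on one input cancels itself out. */
void quad_update(quad_decoder *d, int ti1, int ti2)
{
    assert(d != 0);
    uint8_t curr = (uint8_t)(((ti1 ? 1 : 0) << 1) | (ti2 ? 1 : 0));
    d->count += QUAD_STEP[(d->prev << 2) | curr];
    d->prev = curr;
}
```

Feeding the sequence (1,0), (1,1), (0,1), (0,0) advances the count by four, and a jitter pair such as (1,0), (0,0) leaves it unchanged. The point of the hardware mode is that the timer does this bookkeeping on every edge, so the processor never has to.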

Cortex M3 architecture takes a crack at the problem with a “bit-banding” capability that provides atomic access to single bits. In addition, the STM32 also incorporates “set/reset” shadow registers for I/O, a solution that has the advantage of being able to deal with multiple bits at a time.

In safety-critical applications (e.g., transportation, medical, and industrial), a single lowly I/O line can have life and death riding on its shoulders. The STM32 has a unique capability to “lock” the configuration of an I/O line against unintended reprogramming to help keep a software crash from leading to a real one.

Moving on to serial I/O, every STM32 includes a SPI port, an I2C port, and two USARTs while the larger parts add an extra one of each. That’s a total of up to seven fast and full-featured serial ports, quite impressive for an entry-level part.

The SPI ports run at up to 18 MHz as master or slave in half- or full-duplex mode. Besides the usual options (clock rate, mode, 8- to 16-bit frame), there’s hardware that takes care of the CRC for flash cards (e.g., SD Card). Likewise, the I2C port handles different modes (e.g., Slave, Multi-Master), speeds (standard and fast), and standards (e.g., SM Bus 2.0). No surprise that the USARTs are fast (up to 4.5 Mbps) and capable (e.g., LIN, IrDA) as well. Note that any or all of these serial I/Os work with the DMAC, taking advantage of its intelligence (e.g., 8-, 16-, and 32-bit bus matching, circular buffer manager), which leaves the processor free for more important tasks.

The “Performance” parts include USB 2.0 (full-speed, 12.0 Mbps) and CAN interfaces. This seems like a rather unlikely pairing and indeed the datasheet reveals that you can really only use one function at a time (they share the use of a 512-byte buffer). Once again, you’ll find that these interfaces have the “smart” features that make life easier for the processor and programmer. For instance, the CAN controller has programmable message filters so it can screen message traffic by itself without bothering the processor.

If you want to do real-time, you need plenty of timers. General housekeeping is handled with an RTC, a free-running “SysTick” counter, and two separate watchdog timers, while three 16-bit units with input capture, output compare, and PWM do the heavy lifting. “Performance” parts go even further by throwing in an “Advanced Control Timer” that has even more bells and whistles (see Figure 2).

Analog capability is another difference between the two STM32 lines. The “Access” parts include one converter while the “Performance” line has two converters with the simultaneous sampling capability required for many applications (e.g., motor control

[Figure 3: timing diagram. After the first trigger, the ADC1 regular group scans CH0 through CH4; an injected conversion of CH0 interrupts the regular scan partway through, after which the regular scan resumes.]

Figure 3—Automatic scanning of a group of analog inputs is a common feature in modern ADCs. The STM32 takes the concept a step further with the ability to interrupt one group scan by “injecting” another.
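For readers curious how the “bit-banding” mentioned in the text turns a single bit into something that can be written atomically, the sketch below computes the alias address for a given bit. The region base addresses and the mapping formula are the ones ARM documents for the Cortex-M3 memory map; the function name `bitband_alias` and the example addresses are illustrative assumptions, and the code only does address arithmetic so it runs on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Cortex-M3 bit-band regions (1 MB each) and their alias bases. */
#define BB_SRAM_BASE      0x20000000u
#define BB_SRAM_ALIAS     0x22000000u
#define BB_PERIPH_BASE    0x40000000u
#define BB_PERIPH_ALIAS   0x42000000u
#define BB_REGION_SIZE    0x00100000u

/* Return the alias word address for bit `bit` of the byte at `addr`,
 * or 0 if `addr` is outside both bit-band regions.
 * Each bit gets its own 32-bit word: alias = base + offset*32 + bit*4.
 * (Hypothetical helper name, for illustration only.) */
uint32_t bitband_alias(uint32_t addr, unsigned bit)
{
    assert(bit < 32);
    if (addr >= BB_SRAM_BASE && addr < BB_SRAM_BASE + BB_REGION_SIZE)
        return BB_SRAM_ALIAS + (addr - BB_SRAM_BASE) * 32u + bit * 4u;
    if (addr >= BB_PERIPH_BASE && addr < BB_PERIPH_BASE + BB_REGION_SIZE)
        return BB_PERIPH_ALIAS + (addr - BB_PERIPH_BASE) * 32u + bit * 4u;
    return 0;
}
```

On the chip, a single store of 0 or 1 to the returned alias address changes just that one bit, with no separate load and store for an interrupt to slip between. The “set/reset” shadow registers solve the same problem a different way and can change several bits in one write.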


and power factor correction). While the basic converter specs (12 bits, 1 µs, up to 16 channels) are competitive, it’s the sophisticated CPU cycle-savers that set this ADC apart from most.

Many ADCs include a “scan” capability to automatically convert a sequence of channels. The STM32 takes it to the next level by adding a second scan group that can be “injected” into (i.e., interrupt) the regular scan (see Figure 3). An “analog watchdog” capability provides independent threshold comparison for any/all pins in either the regular or injected scan groups, or both.

Above and beyond their individual capabilities, the timers, ADC(s), and DMAC can work together to handle high-speed timing critical tasks in hardware. Purists will argue that no MCU can match a DSP or specialized chip for applications like motor control, but I bet the STM32 might surprise them.

REALITY SHOW

There is no doubt that the processor and peripherals are the attention-getters for any MCU. But there are also a lot of nuts and bolts required to lash together a real-world design. Some particular little piece of “glue logic” may seem insignificant, until you need it and it’s not there. Then all of a sudden it’s a big deal with the potential to complicate the design or otherwise compromise the application.

Traditional RISCs, reflecting their “computer” (versus “controller”) background, can be pretty lame when it comes to interrupts, but not so for the STM32. In addition to the Cortex M3 architectural improvements (e.g., built-in vectored interrupt controller and “tail-chaining” to minimize stack operations), the STM32 includes dedicated hardware to configure up to 19 I/O lines as external interrupt/event inputs.

While it sometimes seems that all of the focus is on MIPS and megahertz, there is also the small matter of power consumption. “Small matter”

[Figure 4: clock tree diagram. The 8-MHz HSI RC oscillator, the 4- to 16-MHz HSE oscillator, and the PLL (x2 to x16) feed SYSCLK (72 MHz max), which is divided down by the AHB, APB1 (36 MHz max), APB2 (72 MHz max), and ADC prescalers, each with peripheral clock enables. Also shown are the 48-MHz USBCLK, the 32.768-kHz LSE oscillator for the RTC, the 40-kHz LSI RC oscillator for the independent watchdog (IWDG), the clock security system (CSS), and the main clock output (MCO). Legend: HSE = high-speed external clock signal, HSI = high-speed internal clock signal, LSI = low-speed internal clock signal, LSE = low-speed external clock signal.]

Figure 4—Some may consider it mere “glue logic,” but the clock generator on a modern MCU such as the STM32 plays a critical role in achieving system price, power, and performance goals.
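The numbers quoted in the article and in Figure 4 fit together with a little arithmetic, sketched below. The wait-state thresholds (0, 1, or 2 up to 24, 48, and 72 MHz), the x2 to x16 PLL multiplier range, the 72-MHz system ceiling, and the 36-MHz APB1 ceiling come from the text and figure. The function names are hypothetical, and the code models only the arithmetic, not the actual clock-control registers.

```c
#include <assert.h>
#include <stdint.h>

#define SYSCLK_MAX_HZ  72000000u
#define APB1_MAX_HZ    36000000u

/* Flash wait states needed at a given system clock, per the article:
 * 0, 1, or 2 for clocks up to 24, 48, and 72 MHz. Returns -1 if the
 * requested clock is above the 72-MHz limit. */
int flash_wait_states(uint32_t sysclk_hz)
{
    if (sysclk_hz <= 24000000u) return 0;
    if (sysclk_hz <= 48000000u) return 1;
    if (sysclk_hz <= SYSCLK_MAX_HZ) return 2;
    return -1;
}

/* PLL output for a given input clock and multiplier (x2..x16).
 * Returns 0 if the multiplier is out of range or the result would
 * exceed 72 MHz. */
uint32_t pll_output_hz(uint32_t pll_in_hz, unsigned mul)
{
    if (mul < 2 || mul > 16) return 0;
    uint64_t out = (uint64_t)pll_in_hz * mul;
    return (out <= SYSCLK_MAX_HZ) ? (uint32_t)out : 0;
}

/* Smallest APB1 prescaler from {1, 2, 4, 8, 16} that keeps the APB1
 * clock at or below 36 MHz. */
unsigned apb1_prescaler(uint32_t hclk_hz)
{
    assert(hclk_hz <= SYSCLK_MAX_HZ);
    unsigned div = 1;
    while (hclk_hz / div > APB1_MAX_HZ && div < 16)
        div *= 2;
    return div;
}
```

For example, an 8-MHz source multiplied by 9 gives 72 MHz, which calls for two flash wait states and an APB1 prescaler of 2. Dropping the multiplier to 3 gives 24 MHz and no wait states at all, which is one reason a slower clock costs less performance than the raw megahertz suggest.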


as in your design had better consume a small amount of power, or else. After all, a main claim to fame for all of the new-age 32-bit MCUs is that they can go head-to-head with 8-bit parts and that means battery-powered applications.

Powering the chip couldn’t be simpler. Just hook it up to anything from 2 to 3.6 V and it springs to life. An on-chip regulator supplies 1.8 V internally while power-up/power-fail RESET and over- and under-voltage interrupts are built-in.

The ADC features a precise on-chip 1.2-V reference voltage, but you can connect an external reference if you wish (noting that using the ADC boosts the minimum required chip voltage from 2 to 2.4 V). Finally, just hang a 1.8- to 3.6-V battery on the VBAT supply pins if you want to take advantage of the RTC and related backup features. Switchover between the primary and battery backup supplies is handled automatically on-chip.

Besides the RTC, VBAT also provides power for ten 16-bit “backup” registers (i.e., RAM). A unique protection option automatically clears the contents of these registers if “tampering” (i.e., unexpected activity on a pin) is detected.

More evidence that the STM32 takes the nuts and bolts seriously is the clock generator (see Figure 4). Make that clock(s) generator(s). This chip’s got so many clocking options I thought I was in Switzerland. The primary 8-MHz oscillator (factory trimmed for accuracy) drives a PLL to generate the myriad of high-frequency clocks required for the processor and peripherals. Alternatively, you can provide an external 4- to 16-MHz clock, in which case the internal clock serves as a monitor and backup should the external clock fail.

There’s a separate low-speed (40-kHz) clock that’s powered from the VBAT backup power supply. It’s not accurate enough for real time, but it does fill the key role of providing an on-chip wakeup source when the MCU core (i.e., 1.8-V domain) is powered down.

And while better than nothing, a single watchdog timer always raises the question of who will watch the watchers? Taking advantage of the additional clock, the STM32 integrates two independent watchdog timers for a level of protection only true redundancy provides.

Together the power and clock systems give you a lot of power-saving options. Embellishments to a trio of low-power modes (Sleep, Stop, and Standby) include the ability to tweak various dials on the clock generator and the voltage regulator (run, power down, off). The lowest power mode (Standby) takes advantage of the separate backup supply domain to shut primary power off yet retain the ability to wake up from an RTC alarm or the independent watchdog.

And just how low power are we talking? According to the datasheet, even running full bore at 72 MHz with all peripherals enabled, you’re looking at just 0.5 mA per 1 MHz typical (i.e., 36 mA at 72 MHz at room temperature). And here’s another reason to put your most frequently executed routines in RAM: not only is it fast (zero wait states), but running code from RAM also consumes less than half the power (e.g., 14.4 mA at 72 MHz) of running code from flash memory. Another power-saving trick

Photo 1—Drape this gadget around your neck and you’ll be the life of the party! A good MCU needs a good starter kit and those provided by the likes of Raisonance (the STM32 primer shown here), Keil, IAR Systems, and Hitex Development Tools make it easy and inexpensive to check out the new STM32 MCU.

[Figure 5: memory map. The 128 KB of flash memory starts at 0x08000000 and holds the OS plus several application regions. The 20 KB of RAM spans 0x20000000 to 0x20004FFF, split into 16 KB of application data and 4 KB for the OS starting at 0x20004000.]

Figure 5—The STM32 primer may look like a toy, but under the hood is a “Circle OS” that supports application development and experimentation. There’s plenty of room in the STM32 on-chip flash and SRAM for both “Circle OS” and application code and data.
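The power figures quoted in the article lend themselves to a quick back-of-the-envelope budget. The sketch below uses the article’s numbers (0.5 mA per MHz when running from flash, 14.4 mA at 72 MHz when running from RAM) and assumes, purely for illustration, that current scales linearly with clock rate in both cases. The function names and the duty-cycle model are hypothetical, not something taken from the datasheet.

```c
#include <assert.h>

/* Typical run current per MHz, from the figures quoted in the article.
 * Linear scaling with frequency is an assumption for this sketch. */
#define FLASH_MA_PER_MHZ  0.5
#define RAM_MA_PER_MHZ    (14.4 / 72.0)   /* 0.2 mA per MHz */

/* Estimated run current in mA at `mhz`, executing from flash or RAM. */
double run_current_ma(double mhz, int from_ram)
{
    assert(mhz >= 0.0 && mhz <= 72.0);
    return mhz * (from_ram ? RAM_MA_PER_MHZ : FLASH_MA_PER_MHZ);
}

/* Average current for a design that runs for a fraction `duty` of the
 * time (0.0 to 1.0) and sits in a low-power mode the rest of the time. */
double average_current_ma(double run_ma, double lowpower_ma, double duty)
{
    assert(duty >= 0.0 && duty <= 1.0);
    return duty * run_ma + (1.0 - duty) * lowpower_ma;
}
```

As a usage example, a design that runs flat out from flash (36 mA) for 1% of the time and otherwise idles at about 1 µA (0.001 mA) averages roughly 0.36 mA. In a budget like that the active time dominates, which is why both the run-from-RAM trick and short wake periods matter.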


is to take advantage of the fact that every peripheral has its own power switch (i.e., clock gate) and the datasheet helpfully itemizes the power consumption of each. The savings can add up considering the higher-power peripherals (e.g., timers and ADCs) consume a milliamp or two each.

Beyond active power consumption, low-power modes are where batteries live and die. The STM32 Sleep mode cuts power consumption roughly in half yet remains functional enough (i.e., many fast wakeup options) to use routinely. Taking a big step down the ladder, Stop mode specs at just 15 to 25 µA or so depending on the particulars (e.g., voltage regulator on/off, temperature). That’s not bad considering the on-chip RAM is kept alive and it’s relatively easy to wake up (e.g., via pin, interrupt, USB). If you don’t need to preserve the contents of RAM, Standby mode slashes power to little more than 1 µA, yet still gives you some tools to work with besides just RESET (e.g., RTC wakeup alarm and the backup registers).

ONE LAST THING

I think you can see that the STM32 is firing on all cylinders (i.e., good core, good peripherals, good integration, and good energy efficiency). Guess what? A good chip is useless unless it’s got some good tools to go with it. Fortunately, the STM32 gets to ride on the ARM bandwagon, which is standing room only with third-party tool suppliers including ARM and Keil (owned by ARM), Raisonance, IAR Systems, and Hitex Development Tools with no doubt more to come.

I got a chance to play around with the cute little “STM32 Primer” gadget courtesy of STMicroelectronics and Raisonance. Although the evaluation version of the Raisonance RIDE7 toolchain (GNU-based) that comes with the primer is limited to debugging 32 KB (a full-function upgrade is available from Raisonance), at just $32, the kit is still quite a bargain.

A close look reveals two MCUs (see Photo 1). At the top is the STM32 of interest, a 128-KB flash unit. On the left is an ARM7 MCU acting as a debug interface between your PC USB port and the STM32 software/JTAG debug pins. A benefit of the two-chip approach is that it leaves the STM32 USB port free for application use.

In the upper left is a part that raises the primer’s fun quotient, a three-axis low-g MEMS accelerometer enabling a “tilt-o-whirl” user interface. Scrolling and menu selection are accomplished by tilting the gadget. The display automatically switches between Portrait and Landscape mode depending on orientation.

Taking advantage of the accelerometer, the Primer comes preprogrammed with some simple maze and breakout games. But it’s more than a toy. Indeed, under the hood is a “Circle OS” that includes a simple task scheduler and a variety of I/O libraries for both the STM32 on-chip peripherals and the primer add-ons (MEMS accelerometer, graphics LCD, button, buzzer, and more) (see Figure 5). You can find the source code for Circle OS and example applications at www.stm32circle.com. The primer documentation walked me through the process of creating my own “Hello World” application in a matter of minutes, and everything worked without a hitch.

Rolling your own prototype is another option, but not always an easy one with fine-pitch surface-mount parts. Olimex provides a handy solution with a “header board” that includes the STM32 MCU, a USB connector, and easy access via standard headers to the chip’s I/O lines (see Photo 2).

Photo 2—Small is beautiful, except when it comes to hand-wiring a tiny surface-mount chip. The STM32-H103 header board from Olimex makes it quick and easy to prototype your own STM32-based design.

MOST SMARTEST

In the reality show that’s the MCU business, the STM32 is more than a pretty face. Behind the looks of a flashy new core is a down-to-earth chip that’s sophisticated, but not fragile or high maintenance.

And this is a supermodel that’s accessible to mere mortals. Judging from all of the promotion commotion and third-party support, it is clear that STMicroelectronics is serious about going after the mass-MCU market, not just a few big-ticket focus customers. Wise move, because staying power in the MCU business is as much a matter of seats (i.e., number of designs) as sockets.

Is the STM32 the “best” 32-bit MCU? Who knows, and who cares? What matters is that it is a great MCU that leverages an entire ecosystem of chips, tools, and third-party support. Bottom line for designers shopping 32-bit MCUs? If you’ve got a short list of favorites, it just got a little longer.

Tom Cantrell has been working on chip, board, and systems design and marketing for several years. You may reach him by e-mail at tom.cantrell@circuitcellar.com.

SOURCES

Cortex M3 core
ARM
www.arm.com

STM32 Development tools
Hitex Development Tools
www.hitex.com

STM32 Development tools
IAR Systems
www.iar.com

STM32 Development tools
Keil
www.keil.com

STM32 Evaluation boards
Olimex
www.olimex.com

STM32 Development tools
Raisonance
www.raisonance.com

STM32 Cortex M3-based 32-bit flash MCU
STMicroelectronics
www.st.com
