Sunday, May 10, 2009

The Propeller chip: the good, the weird, and the really weird?

I've been using the Parallax Propeller chip as the controller for my quadrotor. Specifically, I've been using the PropStick USB.

The Propeller is an embedded microprocessor with some unusual features. Basically, it's a controller optimized for bit-banging. It has 8 symmetrical cores called "cogs", each running at (up to) 80MHz, each of which has access to read and write all 32 I/O pins. Instead of dedicated peripheral controllers such as UARTs, SPI drivers, or EEPROM controllers, you just dedicate one of the cores to driving the pins directly. You can even generate NTSC or VGA video output this way. There is literally no interrupt controller! The basic chip can be had for $8 in DIP, QFN, or QFP form factors.

(The PropStick is an expensive ($80) but super convenient option -- it has the same form factor as the DIP variant, but includes power regulators, clock crystal, EEPROM, and a USB connector right on the part, so you need basically no external parts to be up and running. I'm paying for convenience here.)

The Propeller's design is super great in many ways. In my application I need to interface with the XBee radio (115kbps serial), the GPS (57kbps serial), the IMU (115kbps serial), and the motor speed controllers (four 50Hz PWM lines). A conventional embedded CPU such as the LPC2138 might have one or two UARTs and a PWM driver, so I'd still have to bit-bang the remaining serial port -- quite challenging at 57kbps if you want to do anything else at the same time! -- or use an external UART. In any case, I would have to fuss with interrupt controllers and so on, which is always a pain. In the Propeller, I simply spin up three cogs running FullDuplexSerial and one running Servo32v3 (both of which are included in the standard library) and tell them which pins to use. I dedicate another cog to parsing the IMU and GPS data as it comes in, leaving the main cog to do the actual flight control.
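
To put rough numbers on why bit-banging is hard on a conventional CPU, here's a small Python sketch (a model that runs on a PC, not Propeller code). It assumes the standard 57600 and 115200 baud rates behind the rounded "57kbps" and "115kbps" figures, plain 8N1 framing, and the 80MHz top clock speed. The function names are made up for illustration.

```python
# Rough model of bit-banged 8N1 serial. Assumptions: standard baud rates,
# 8 data bits, no parity, 1 stop bit, and an 80 MHz clock.
CLOCK_HZ = 80_000_000

def cycles_per_bit(baud, clock_hz=CLOCK_HZ):
    """Whole clock cycles available for each serial bit at this baud rate."""
    return clock_hz // baud

def frame_byte(value):
    """Line levels for one 8N1 frame: start bit, 8 data bits LSB first, stop bit."""
    data_bits = [(value >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]

def unframe(levels):
    """Recover the byte from a 10-level frame, checking start and stop bits."""
    if len(levels) != 10 or levels[0] != 0 or levels[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(levels[1:9]))

for baud in (57_600, 115_200):
    print(baud, "baud:", cycles_per_bit(baud), "cycles per bit")

frame = frame_byte(ord("A"))
print(frame, "->", chr(unframe(frame)))
```

At 57600 baud a bit lasts roughly 17 microseconds, so the pin has to be sampled or driven on that schedule no matter what else the CPU is doing. A cog with nothing else to do can simply sit in that loop.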

There's a catch, though.

The Propeller is normally programmed in a proprietary language called "Spin", possibly with some assembly language mixed in. Spin is a BASIC-like interpreted language, and it's kind of cute, but its limitations get frustrating after a while. The syntax is idiosyncratic. For example, the greater-than-or-equal-to comparison operator is "x => y", while the conventional "x >= y" actually means "x := x > y", because there are assignment versions of every conceivable operator. So you get weird bugs just because you slipped and used normal syntax in an if statement. The language is untyped and unchecked; you're constantly passing around pointers as integer values. There are no data structures, only arrays of primitive types. There is no heap. You can define code modules and instantiate them multiple times, but they must be instantiated statically, and cannot be passed outside the calling module. The libraries tend to be buggy, and people make weird cargo-cult modifications to standard modules (the GPS module, for example, demands a custom serial port module that includes decimal parsing routines).
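
The operator trap is easy to misread, so here it is modeled in Python. This is only a sketch of the semantics described above, with made-up function names. The one detail it adds is that Spin represents true as -1 and false as 0.

```python
TRUE, FALSE = -1, 0  # Spin's boolean values: true is all bits set

def spin_equal_or_greater(x, y):
    """Model of Spin's `x => y`: a plain comparison. Returns (result, x)."""
    result = TRUE if x >= y else FALSE
    return result, x          # x is left untouched

def spin_greater_assign(x, y):
    """Model of Spin's `x >= y`, meaning `x := x > y`. Returns (result, new x)."""
    result = TRUE if x > y else FALSE
    return result, result     # x is overwritten with the comparison result

# What you meant: "is 5 at least 5?"
print(spin_equal_or_greater(5, 5))
# What `if x >= y` does instead: a strict comparison that also clobbers x.
print(spin_greater_assign(5, 5))
print(spin_greater_assign(7, 5))
```

So the slip bites twice: the test quietly becomes a strict greater-than, and the variable being tested is replaced by 0 or -1.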

I could live with all that, but the real problem is that Spin is slow, because it's a bytecode-interpreted language running on a processor that's about as powerful as an old 486 to begin with (and a 486SX at that -- no FPU!). For example, I mentioned that I dedicate a cog to parsing the GPS and IMU output; that's because keeping up with 100Hz IMU updates and 5Hz GPS updates in Spin requires a dedicated cog. You can write assembly code, which runs much more quickly, but then, well, you have to write assembly code.

Worse, each cog only has 2KB of local "cog RAM" (there's 32KB of shared "hub RAM"), and instructions can only be executed from cog RAM. The Spin bytecode interpreter takes up all 2KB. That means if you want to call assembly code from Spin, you have to send it off to another cog to execute, so those 8 cogs get used up quickly. In my case, I have three dedicated to the serial ports, one to servo control, one to parsing the GPS and IMU output, and one for the main control logic. And because the software floating point library uses assembly code, that uses up another cog. That's all but one of the cogs used up already.
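
The cog budget is simple arithmetic, but it's worth spelling out. Here's a quick Python tally. The job labels are just for illustration, and the word size of 32 bits (one "long", which is also the size of one instruction) is the only fact not stated above.

```python
TOTAL_COGS = 8
COG_RAM_BYTES = 2 * 1024
LONG_BYTES = 4  # assumption: cog RAM is counted in 32-bit longs

# One entry per cog in use (labels are illustrative only).
cog_jobs = [
    "serial port: XBee radio",
    "serial port: GPS",
    "serial port: IMU",
    "servo (PWM) control",
    "GPS and IMU parsing (Spin)",
    "main flight control (Spin)",
    "software floating point (assembly)",
]

def cogs_free(jobs, total=TOTAL_COGS):
    """Number of cogs left over when each job occupies one whole cog."""
    if len(jobs) > total:
        raise ValueError("more jobs than cogs")
    return total - len(jobs)

print(len(cog_jobs), "cogs used,", cogs_free(cog_jobs), "free")
print(COG_RAM_BYTES // LONG_BYTES, "longs of cog RAM per cog")
```

That's 7 of 8 cogs spoken for, and each one has room for at most 512 longs of code and data combined.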

I would love a Propeller-like architecture that I could program in compiled C. ICCV7 for Propeller costs $99 and runs on Windows only (yuck). Catalina, a free port of lcc, is in "beta release", and apparently generates rather poor code so far. In any case you're not going to get very far squeezing compiled code into the 2KB of cog RAM, so both C compilers use the "Large Memory Model" (LMM), which copies instructions from hub RAM into cog RAM, runs them, and repeats. That means that compiled code in hub RAM runs 5 times slower than native code in cog RAM. (This is still many times faster than Spin code.) Also, the instruction set is apparently designed for hand-assembled code, not compiled code -- there is no explicit support for a stack, for example, nor are there indexed addressing modes. I'm guessing clever compiler designers could work around these things, but the Propeller is a niche product, and clever compiler designers have other things to do with their time.
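
The fetch-and-run loop is easier to see in a toy model. The Python sketch below is not real Propeller code, and it is not either compiler's actual kernel. The four-instruction mini language and all the names are invented for illustration, the 4 cycles per native instruction is an assumption about typical cog timing, and the 5x slowdown is just the figure quoted above.

```python
NATIVE_CYCLES = 4   # assumption: typical clock cycles per cog instruction
LMM_SLOWDOWN = 5    # slowdown figure quoted in the text, not measured here

def run_lmm(hub_program, registers):
    """Fetch one instruction at a time from 'hub RAM' and execute it.

    Returns the final registers and the number of instructions fetched.
    """
    pc = 0
    fetched = 0
    while pc < len(hub_program):
        instr = hub_program[pc]   # copy one instruction out of hub RAM
        pc += 1
        fetched += 1
        op = instr[0]
        if op == "set":           # set reg, immediate
            registers[instr[1]] = instr[2]
        elif op == "add":         # add dst, src
            registers[instr[1]] += registers[instr[2]]
        elif op == "dec":         # dec reg
            registers[instr[1]] -= 1
        elif op == "jnz":         # jump to target if reg is nonzero
            if registers[instr[1]] != 0:
                pc = instr[2]
        else:
            raise ValueError(f"unknown op: {op}")
    return registers, fetched

# Multiply 4 by 3 using repeated addition.
program = [
    ("set", "acc", 0),
    ("set", "n", 3),
    ("add", "acc", "step"),   # index 2: top of the loop
    ("dec", "n"),
    ("jnz", "n", 2),
]
regs, fetched = run_lmm(program, {"step": 4})
native = fetched * NATIVE_CYCLES
print("acc =", regs["acc"], "after", fetched, "instructions")
print("estimated cycles: native", native, "vs LMM", native * LMM_SLOWDOWN)
```

The point of the model is that the program itself never lives in cog RAM. Only the small loop does, and every instruction pays for a trip to hub RAM before it runs.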

The Propeller 2 will have 256KB of hub RAM but still only 2KB of cog RAM, so I'm not sure it will make things much better. Maybe things like Catalina will be more mature and optimized by then and we'll just live with variants of the LMM. Apparently the Propeller 2 will have an entire IDE *on the chip*, so if you connect a TV (!) and a keyboard you can write code without ever using a PC.

2 comments:

  1. Hmm, the Propeller sounds clever, but really annoying to use for real tasks. I'd think perhaps the simplest fix would be to use something like an ATmega328P in 28-pin DIP form as your main processor, and maybe just treat the Propeller as a peripheral interface and preprocessor. To me it sounds kind of like an FPGA for implementing peripherals, but a lot easier to program.

  2. I may have made it sound more annoying than it is. In general it really is fun to work with, but I think it's better for I/O glue than for hardcore computation.

    Amusingly, the IMU itself has an LPC2138 in it, which in theory can be reprogrammed, so I could offload some of my filtering logic onto it. (But it doesn't really have the exposed I/O that would be needed to draw in other data like the GPS.)
