C vs C++, Part II, Beautiful & efficient

In the first part of this series, we used C++ features (operator overloading and templates) to eliminate all the defines and macros needed to use a Pin. We achieved the same performance as the C code, slightly improved readability, and hugely increased code safety. The result is a library that looks like this:

template<uint16_t address_>
struct Register {
 void operator= (uint8_t _r) {
  *reinterpret_cast<volatile uint8_t*>(address_) = _r;
 }
 operator uint8_t () const {
  return *reinterpret_cast<volatile uint8_t*>(address_);
 }
 operator volatile uint8_t& () {
  return *reinterpret_cast<volatile uint8_t*>(address_);
 }

 template<uint8_t bit_>
 void setBit() { *reinterpret_cast<volatile uint8_t*>(address_) |= (1 << bit_); }
 template<uint8_t bit_>
 void clearBit() { *reinterpret_cast<volatile uint8_t*>(address_) &= ~(1 << bit_); }
};

Register<0x24> DDRB;
Register<0x25> PORTB;

constexpr uint8_t DDB5 = 5;
constexpr uint8_t PORTB5 = 5;


C vs C++, performance on AVR

The aim of this post is to fight the widespread belief that C++ is too slow a language for embedded environments. According to this belief, microcontrollers should still be programmed in C, or even in assembler. You probably don't agree with me right now. The idea that C is much more efficient than C++ is so entrenched that it almost seems like sacrilege to debate it. That's why I'm going to make a series of comparisons between both languages, throwing in some real and objective numbers (code size, execution time, etc.). After proving that C++ can compete with good old C, we'll see that it's actually a better alternative. For that, besides performance metrics, I will compare things like safety, code readability and portability.


Data Oriented Design vs Object Oriented Programming

I've been raised in the culture of Object Oriented Programming. I've always been told about the benefits of encapsulation, cohesion, locality, etc. There are very good reasons why a lot of smart people deeply support OOP. Designing good OOP architectures pays off. It saves a lot of time debugging errors, makes code easy to read and understand, and lets you focus on one part of the problem at a time.

But what if it's all wrong? In the last few years I've read about a concept known as Data Oriented Design, which many claim is a different paradigm promising huge performance improvements, one that will make you question why you ever used OOP in the first place. That's a big claim, and big claims require good proof. So when I came across this talk by Mike Acton, I did the only thing I could: I wrote a test.

The idea is simple: have a bunch of squares defined by their radius and compute their areas. This is where a traditional OOP beginner tutorial would say "Make a class for Square ...".

class NiceSquare {
 float radius;
 float color[3];
public:
 NiceSquare() : radius(3.f) {}
 void computeArea(float& area) const { area = radius*radius; }
};

However, following the principles of DoD, we realise that our data is not a square, but a bunch of squares, so...

struct BunchOfSquares {
 float* radius;
 float* color;
};

There is a good reason for that color member. We will use it later to control the packing factor of our data. But we just sacrificed encapsulation for no good reason. If computing a square's area is something the square can do itself, then computing a bunch of areas should be something a bunch of squares can do itself too. What if we took this DoD approach to the problem, but implemented it with OOP?

class BunchOfSquares {
 float* radius;
 float* color;
public:
 BunchOfSquares() : radius(new float[N_SQUARES]) {
  for(unsigned i = 0; i < N_SQUARES; ++i) radius[i] = 3.f;
 }

 ~BunchOfSquares() {
  delete[] radius;
 }

 void computeAreas(float* area) const {
  for (unsigned i = 0; i < N_SQUARES; ++i) {
   float rad = radius[i];
   area[i] = rad*rad;
  }
 }
};
Much better now. Notice we didn't really sacrifice any Object-Orientation here. We just realised which objects really belong to our problem. And that's actually the key. Most of the time, when we do OOP, we tend to design our classes to fit our mental model of day-to-day life. WRONG! You are not solving day-to-day life, you are solving a specific problem! Now the question of performance still remains, so let's measure it:

duration<double> oldBenchmark() {
  NiceSquare* squares = new NiceSquare[N_SQUARES];
  float* areas = new float[N_SQUARES];
  auto begin = high_resolution_clock::now();

  for (unsigned i = 0; i < N_SQUARES; ++i)
    squares[i].computeArea(areas[i]);

  duration<double> timing = high_resolution_clock::now() - begin;
  delete[] areas;
  delete[] squares;

  return timing;
}

duration<double> dodBenchmark() {
 BunchOfSquares squares;
 float* areas = new float[N_SQUARES];
 auto begin = high_resolution_clock::now();

 squares.computeAreas(areas);

 duration<double> timing = high_resolution_clock::now() - begin;
 delete[] areas;

 return timing;
}

int main(int, const char**) {
 ofstream log("log.txt");

 for(int i = 0; i < 100; ++i) {
  double oldTiming = real_millis(oldBenchmark()).count();
  double dodTiming = real_millis(dodBenchmark()).count();
  log << oldTiming << ", " << dodTiming << endl;
 }

 return 0;
}

If you pay attention, you will see the benchmarks have been written the old-fashioned way. It would be better to realise that I don't want just one timing, and that I won't perform just one benchmark, and so write the test to run a bunch of benchmarks and store the results in a bunch of timing records. But for now, we'll stick to this format because it will be easier to read for people not used to DoD, and because I like the irony of it.

Back to the test, a quick run shows this.

The improvement is obvious: even for a dumb example like this, DoD is about 40% faster. Cool. But can we do better? Theory says the big performance improvements of DoD come from not wasting cache space: the better we use our caches, the faster the test runs. That's what the color member is there for. It represents the more realistic scenario where classes have more than one member. By controlling the size of color, we control how sparsely the radii are laid out in memory. That way, completely removing the color should make both paradigms perform almost identically, right?
Definitely right. And if we move the other way around and increase color from 3 to 63 floats ...

That's definitely a win. We get an almost 85% improvement; the DoD code is running more than 6x faster now. And it's still Object Oriented! We've lost none of the benefits of OOP!

In conclusion, Data Oriented Design doesn't mean throwing away everything you know about OOP and good programming practices. It is a reminder to solve the problems we actually have, instead of the problems we are comfortable thinking about. Even though its performance gains are tightly coupled to low-level hardware, DoD principles tell us that our code is really messed up at a very high level: the moment you forget what data you are dealing with, you're already going the wrong way. Know your problem, know your data. Then you can apply whichever programming paradigm fits best. And if you decide to go for OOP, remember there's no rule saying an "object" in your code has to match any object in your day-to-day life. So just choose the right objects for your data.


Home made 3d printed rocket nozzle

One of the most important parts in a rocket engine is the nozzle. A well designed nozzle will provide the optimum expansion of the exhaust gases and maximize thrust. I am working on a home made rocket, and decided to do a few tests with a real nozzle just to get a real feel of how these devices work.
So for starters, how does a rocket nozzle work and what is it for? The nozzle is basically a tube that redirects the combustion gases in one direction so as to push the rocket. While doing so, it also accelerates the gases and lowers their temperature. And all of that without a single moving part. Rocket nozzles achieve this because they belong to the category of Convergent-Divergent Nozzles, or de Laval Nozzles, which take advantage of the properties of gases at supersonic speed. I won't get into much detail about the thermodynamics of ConDi Nozzles, as there are many resources that explain them perfectly (https://en.wikipedia.org/wiki/De_Laval_nozzle).
The important point here is how to design the nozzle, and for that there are a few things we need to know:
- The properties of the gas that will flow through the nozzle (i.e. air), namely gamma.
- The amount of air we can provide per second (mass flow).
- Our working pressures.
About air, all we need to know is the heat capacity ratio, which happens to be just about 1.4.
The amount of air is a bit trickier. In my case it is limited by the air compressor I use to power the system. Knowing the exact mass flow is less important than making sure the nozzle is the limiting factor. For the air to go supersonic (locally), it needs to choke the nozzle. That means you must be able to supply enough air to saturate it, or conversely, the nozzle must be small enough to saturate with the air you can provide. For this reason, the simplest thing you can do is design the nozzle with a throat smaller than the smallest area of your air feed system. Just measure the ducts of your feed system and choose a smaller size for the throat. My throat is about 4 mm in diameter because the smallest duct of my air compressor has a diameter of 6 mm. Thanks to pressure losses, the throat could actually be a bit bigger than the duct and still choke (with a bigger area, a smaller pressure can choke the same mass flow), but this way we have some margin.
Finally, the working pressures (along with gamma) define the expansion ratio of the nozzle, the only thing left to fix its geometry. The expansion ratio, the ratio between the area at the end of the nozzle and the area at the throat, follows this formula:
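(The formula itself appears to have been an image in the original post; written out, the standard isentropic relation for a de Laval nozzle, with A_e the exit area, A_t the throat area, P_c the chamber pressure and P_e the exit pressure, is:)

```latex
\epsilon = \frac{A_e}{A_t}
         = \left[
             \left(\frac{\gamma+1}{2}\right)^{\frac{1}{\gamma-1}}
             \left(\frac{P_e}{P_c}\right)^{\frac{1}{\gamma}}
             \sqrt{\frac{\gamma+1}{\gamma-1}
                   \left(1-\left(\frac{P_e}{P_c}\right)^{\frac{\gamma-1}{\gamma}}\right)}
           \right]^{-1}
```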
Taking into account that Pe, the exit pressure, will be equal to ambient pressure (1.013 bar), and that my air compressor can deliver up to 8 bar, that results in an expansion ratio of about 4, so the exit radius will be about twice that of the throat. I will build my nozzle with a contraction angle of about 30º and an expansion angle of 15º.
As you can see in the pictures, the nozzle is a simple revolution solid.

Notice I left a pretty big input hole. That's because I use 3/8" plumbing to feed the system: conventional plumbing that you can buy in any hardware store, and that is pretty easy to work with and to seal. I also printed a small cap that turns a PVC tube into an adapter to hold the nozzle in place. This way I can put it on top of a weighing scale to measure thrust. The picture below shows my poor man's engine test bench, where I can measure pressure vs force.

I didn't expect much of this at first, but the results impressed me. The whole "engine" (chamber plus nozzle) weighs less than 10 grams, and it delivers more than 200 grams of thrust: a thrust-to-weight ratio of more than 20. Not bad for 10 grams of plastic.


Prometheus Arm: Latest changes.

In the last few weeks, we have made some improvements to the design of the Prometheus arm.
To use antagonistic actuation in a simple fashion, it is important that the actuation of each degree of freedom remains symmetrical. This means that if you open a finger a little, both the inner and outer tensors should be displaced by the same length. Initially, we chose a pulley system to guarantee this. That iteration was tested in the first version (the A3 project) and showed some major drawbacks: the tensors degraded quickly and loosened the joints, leading to poor performance of the fingers. To solve this issue, we developed a system with adjustable pulleys that allowed the tensors to be readjusted.
These moving pulleys are hard to print and require the printer to be very well calibrated. Besides, the teeth degrade rather quickly. Given that, and the fact that the fingers were becoming complex to assemble, this solution didn't seem suited for anyone to print and assemble at home. A deep redesign was needed.

So we wanted to do three things: get rid of as many tensors as possible, increase robustness, and simplify assembly. The solution is a simple bar-actuated mechanism.

Two solid bars link the moving parts of the finger and reduce the number of degrees of freedom to one. This way, the whole finger can be operated from the base, and a symmetric mechanism is only needed there. The design is also robust if you choose the right orientation at print time, and the reduced feature complexity and part count make it easier to print and assemble. Here is a proof of concept of the finger. The next step is to print a complete finger with the servo adapter, and then test its actuation triggered by electrodes.


Helicopter Electronics: Test #1

Quadcopters are so popular these days, but I still prefer traditional helicopters. They are more efficient, can carry heavier payloads, respond faster ... and are a bigger challenge. That's why I'm converting an RC helicopter into an autonomous platform to test a few technologies. This is the first test of some of its electronics.

Sidenote: that's my little brother you hear in the video, and he is actually operating the controls.


A perfect fit for A3

It's been a while since I last wrote about the A3 project. It has recently morphed into the Prometheus Arm project, in which I work with a friend of mine. What motivated the change? Take a look for yourself:

I must admit it: I can't resist Iron Man. That's just it, and now I want to contribute. My friend Pablo and I are now working on a better design for the arm, balancing cost, simplicity, repairability and functionality, and we will try to contribute it to Limbitless' project.

Talking specifics, we are making the first tests with a 3d printed, improved version of the antagonistic mechanism that allows for better, finer and simpler adjustment of the joints.

The main problem with antagonistic actuation is obviously cost. Since you are basically doubling the number of actuators, the cost goes up very quickly. However, a good mechanical design can simplify things a lot. I still don't have it fully documented, but I will soon post a method that uses symmetric kinematic chains to factor out some actuators while retaining the basic functionality. Reducing the number of actuators necessarily reduces the number of degrees of freedom (originally two per joint), but I think this is a good trade-off if all you are losing is strength degrees.
More specifically, the tension regulators are shared among similar joints, so they are all adjusted at the same time, but individual elasticity and torque tolerance are fully kept. I don't think trying to exert different amounts of force with each finger is a very common use case, so it is definitely worth the cost reduction.