Technik's blog: 2016

I was raised in the culture of Object Oriented Programming. I've always been told about the benefits of encapsulation, cohesion, locality, etc. There are very good reasons why a lot of smart people deeply support OOP. Designing good OOP architectures pays off. It saves a lot of time debugging errors, makes code easy to read and understand, and lets you focus on one part of the problem at a time.

But what if it's all wrong? In the last few years I've read about a concept known as Data Oriented Design, which many claim is a different paradigm that promises huge performance improvements and will make you question why you ever used OOP in the first place. That's a big claim, and big claims require good proof. So, when I came across this talk by Mike Acton, I did the only thing I could do: I wrote a test.

The idea is simple: have a bunch of squares defined by their side length (named radius in the code) and compute their areas. This is where a traditional OOP beginner tutorial would say "Make a class for Square ...".

class NiceSquare {
 float radius;
 float color[3];
public:
 NiceSquare() : radius(3.f) {}
 void computeArea(float& area) { area = radius*radius; }
};

However, following the principles of DoD, we realise that our data is not a square, but a bunch of squares, so...

struct BunchOfSquares {
 float * radius;
 float * color;
};

There is a good reason for that color member. We will use it later to control the packing factor of our data. But we just sacrificed encapsulation for no good reason. If computing a square's area is something the square can do itself, then computing a bunch of areas should be something a bunch of squares can do itself too. What if we took this DoD approach to the problem, but implemented it with OOP?

class BunchOfSquares {
 float *radius;
 float* color; // never allocated or read; it only mirrors NiceSquare's members
public:
 BunchOfSquares() : radius(new float[N_SQUARES]), color(nullptr) {
  for(unsigned i = 0; i < N_SQUARES; ++i) radius[i] = 3.f;
 }

 ~BunchOfSquares() {
  delete[] radius;
 }

 void computeAreas(float* area) {
  for (unsigned i = 0; i < N_SQUARES; ++i) {
   float rad = radius[i];
   area[i] = rad*rad;
  }
 }
};

Much better now. Notice we didn't really sacrifice any Object-Orientation here. We just realised which objects really belong to our problem. And that's actually the key. Most of the time, when we do OOP, we tend to design our classes to fit our mental model of day to day life. WRONG! You are not solving day to day life, you are solving a specific problem! Now the question of performance still remains, so let's measure it:

duration<double> oldBenchmark() {
  NiceSquare *squares = new NiceSquare[N_SQUARES];
  float*areas = new float[N_SQUARES];
  auto begin = high_resolution_clock::now();

  for (unsigned i = 0; i < N_SQUARES; ++i) {
   squares[i].computeArea(areas[i]);
  }

  duration<double> timing = high_resolution_clock::now() - begin;
  delete[] areas;
  delete[] squares;

  return timing;
}

duration<double> dodBenchmark() {
 BunchOfSquares squares;
 float* areas = new float[N_SQUARES];
 auto begin = high_resolution_clock::now();

 squares.computeAreas(areas);

 duration<double> timing = high_resolution_clock::now() - begin;
 delete[] areas;

 return timing;
}

int main(int, const char**) {
 ofstream log("log.txt");

 for(int i = 0; i < 100; ++i) {
  double oldTiming = real_millis(oldBenchmark()).count();
  double dodTiming = real_millis(dodBenchmark()).count();
  log << oldTiming << ", " << dodTiming << endl;
 }

 return 0;
}
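The listings above use a few names that are never defined in the post: N_SQUARES, real_millis, and the unqualified std and std::chrono names. A minimal sketch of a preamble that would make them compile could look like this. The square count and the definition of real_millis are assumptions, not the values used for the measurements.

```cpp
#include <chrono>
#include <fstream>

using namespace std;
using namespace std::chrono;

// Assumption: the post never states how many squares were used.
// One million is only a placeholder.
constexpr unsigned N_SQUARES = 1000000;

// Assumption: real_millis is not defined in the post. A duration with a
// floating-point representation and a milli period makes
// real_millis(d).count() return fractional milliseconds.
using real_millis = duration<double, std::milli>;
```

With this alias, real_millis(oldBenchmark()) converts the duration<double> in seconds returned by the benchmark into milliseconds without an explicit duration_cast, because conversions to a floating-point duration are implicit.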

If you pay attention, you will see the benchmarks have been written the old-fashioned way. It would be better to realise that I don't want just one timing, and that I won't perform just one benchmark, and to write the test so that it runs a bunch of benchmarks and stores the results in a bunch of timing records. But for now, we'll stick to this format because it will be easier to read for people not used to DoD, and because I like the irony of it.
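For the curious, a harness written that way might look roughly like the sketch below. TimingRecords and runBenchmarks are hypothetical names, not part of the original test, and std::steady_clock is swapped in for high_resolution_clock because it is guaranteed to be monotonic.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical record layout: millis[b][r] is run r of benchmark b,
// so all timings of one benchmark sit in one contiguous array.
struct TimingRecords {
    std::vector<std::vector<double>> millis;
};

// Runs every benchmark `runs` times and returns all timings at once.
// Note: this times the whole callable, so setup such as allocation
// must happen outside the callable if it should be excluded.
TimingRecords runBenchmarks(const std::vector<std::function<void()>>& benchmarks,
                            std::size_t runs) {
    assert(runs > 0 && "at least one run is needed");
    using clock = std::chrono::steady_clock;

    TimingRecords records;
    records.millis.assign(benchmarks.size(), std::vector<double>(runs, 0.0));

    for (std::size_t b = 0; b < benchmarks.size(); ++b) {
        for (std::size_t r = 0; r < runs; ++r) {
            const auto begin = clock::now();
            benchmarks[b]();
            const std::chrono::duration<double, std::milli> elapsed = clock::now() - begin;
            records.millis[b][r] = elapsed.count();
        }
    }
    return records;
}
```

Unlike the main loop above, which alternates between the two benchmarks, this sketch finishes all runs of one benchmark before starting the next, so the numbers would not be directly comparable with the ones in this post.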

Back to the test, a quick run shows this.

The improvement is obvious: even for a dumb example like this, DoD is about 40% faster. Cool. But can we do better? Theory says that the big performance improvements of DoD come from not wasting cache space. The better we use our caches, the faster the test will run. That's what the color member is there for. It represents the more realistic scenario where classes have more than one member. By controlling the size of color, we control how sparsely the radii are spread in memory. That way, completely removing the color should make both paradigms perform almost identically, right?
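To put rough numbers on that sparsity, here is a small sketch of the layout arithmetic. It assumes 4-byte floats and 64-byte cache lines (the post does not say which CPU ran the test), and SquareLayout is a hypothetical stand-in for NiceSquare with a configurable color size.

```cpp
#include <cstddef>

// Assumption: 64-byte cache lines. The post does not say which CPU was used.
constexpr std::size_t kCacheLine = 64;

// Hypothetical stand-in for NiceSquare with a configurable color size.
template <std::size_t ColorFloats>
struct SquareLayout {
    float radius;
    float color[ColorFloats];
};

// No color member at all (a zero-length array is not standard C++).
template <>
struct SquareLayout<0> {
    float radius;
};

// Bytes between one radius and the next in an array of squares.
template <std::size_t ColorFloats>
constexpr std::size_t radiusStride() {
    return sizeof(SquareLayout<ColorFloats>);
}

// How many radius values share each cache line that the loop touches.
constexpr std::size_t radiiPerTouchedLine(std::size_t stride) {
    return stride >= kCacheLine ? 1 : kCacheLine / stride;
}

static_assert(sizeof(float) == 4, "this sketch assumes 4-byte floats");
static_assert(radiusStride<0>() == 4, "no color: radii are contiguous");
static_assert(radiusStride<3>() == 16, "color[3]: one radius every 16 bytes");
static_assert(radiusStride<63>() == 256, "color[63]: one radius every 256 bytes");
```

Under those assumptions a plain float array packs 16 radii into every cache line, an array of squares with color[3] packs 4, and with color[63] every radius lands on a cache line of its own. These figures only bound how much memory has to be fetched. They are not a prediction of run time, which also depends on things like hardware prefetching and the writes to the area array.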

Definitely right. And if we move the other way around and increase color from 3 to 63 floats ...

That's absolutely a win. We have an almost 85% reduction in run time, which means the DoD code is running more than 6x faster now. And it's still Object Oriented! We've lost none of the benefits of OOP!

In conclusion, Data Oriented Design doesn't mean throwing away all you know about OOP and good programming practices. It is a reminder to solve the problems we do have, instead of the problems we are comfortable thinking of. Even though its performance gain is very tightly coupled with low-level hardware, DoD principles tell us that our code is really messed up from a very high level. The moment you forget what data you are dealing with, you're already going the wrong way. Know your problem, know your data. Then you can apply whatever programming paradigm you see fit. And if you decide to go for OOP, remember there's no rule saying an "object" in your code has to match any object in your day to day life. So just choose the right objects for your data.

2016/11/21

C vs C++, performance on AVR

2016/07/05

Data Oriented Design vs Object Oriented Programming