Technik's blog

2018/08/10

The other pathtracer 5: Optimizing Triangle-Ray Intersections

In the last post of this series, we spent some time optimizing AABB-ray intersections. In this post, we will do the same for triangles and, in the process, find one more optimization for AABBs.

The code I'm using now is what I wrote for this previous post. It was supposed to be simple to understand, and overall it gets the job done. But we can do better. Let's start with some profiling to establish a performance baseline.

The other Pathtracer 4: Optimizing AABB-Ray intersection

This post is about optimizing the AABB tree that we're using as our main acceleration structure.
I will use the Polly scene from the previous post for the tests, but I have increased the output resolution to get more meaningful results.

Initial performance:
Scene: Project Polly
Resolution: Full HD (1920x1080)
Primary rays per pixel: 64
Results: 200 s, ~660k Rays/s
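As a sanity check on the numbers above, the throughput can be recomputed from the resolution, the sample count and the render time. This is a small sketch, assuming the reported figure counts primary rays only (the post does not say whether bounces are included); the function names are made up for the example.

```cpp
#include <cstdint>

// Total primary rays for one frame: every pixel shoots samplesPerPixel rays.
constexpr std::uint64_t primaryRays(std::uint64_t width,
                                    std::uint64_t height,
                                    std::uint64_t samplesPerPixel)
{
    return width * height * samplesPerPixel;
}

// Throughput in rays per second for a render that took `seconds`.
// Assumption: only primary rays are counted, secondary bounces are ignored.
constexpr double raysPerSecond(std::uint64_t rays, double seconds)
{
    return static_cast<double>(rays) / seconds;
}
```

With 1920x1080 pixels and 64 rays per pixel, `primaryRays` gives 132,710,400 rays, and dividing by 200 s gives 663,552 rays/s, which matches the ~660k Rays/s reported above.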

Taking a first look at the profiler, there's an obvious fail:

The other pathtracer 3: Complex scenes

Going from 1 triangle to many triangles is a trivial thing to do. At least if you don't care about performance at all. Just add a vector of triangles, and test them all.

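float t = tMax; // Closest hit distance found so far

// Brute-force approach: test every triangle and keep the closest hit
bool hit_anything = false;
HitRecord tmp_hit;
for(auto& tri : mTris)
{
    // Passing t as the upper bound rejects hits farther than the current closest one
    if(tri.hit(r,tMin,t,tmp_hit))
    {
        collision = tmp_hit;
        t = tmp_hit.t;
        hit_anything = true;
    }
}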

That's all the code you need to render a bunch of triangles. However, that's not very useful unless you can use those triangles to render something interesting.

The other Pathtracer 2: The triangle

Following on from the idea of my last post, today we're tackling a very important matter: intersecting triangles. Or more specifically, intersecting one triangle. This is not covered in many ray tracing tutorials, and that's a shame. Triangle intersection is the cornerstone of a lot of interesting functionality (like loading full meshes), and it's actually a very simple thing to do.
All the relevant code is in this commit.

The Algorithm

There are several possible algorithms for intersecting triangles, with varying degrees of complexity and performance. For example, see this Wikipedia article or, if you have access to the GDC Vault, this talk by Earl Hammon (which has a ton of valuable material).
However, since my goal here is to get a working implementation quickly and easily, I will explain the algorithm that I find most intuitive. In later posts, we will revisit it for performance improvements, and even then it will be useful to have a solid baseline to benchmark against.

So the idea is to do intersection in two parts: First, we find whether our ray segment intersects the triangle plane, and if it does, then we see if the intersection point lies inside the triangle.

Part one is basic geometry, and can be decomposed into two steps as well: find the plane defined by the three vertices of the triangle, then intersect that plane with our ray.

auto edge0 = v[1]-v[0];
auto edge1 = v[2]-v[1];
auto normal = normalize(cross(edge0,edge1));
auto planeOffset = dot(v[0],normal);

In regular production code, we should handle the case of a degenerate triangle, where cross(edge0,edge1) can't be normalized, but for now we will behave ourselves and just not make weird triangles. We can already see a possible optimization path: since none of the above depends on the ray, we could cache the plane definition, instead of recomputing it for every ray. Not now, anyway.
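The two parts described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the `Vec3`, `Ray` and `Triangle` types and the `hit` signature are assumptions made up for the example, and the plane is cached in the constructor, as hinted at above. The inside test checks that the hit point lies on the inner side of all three edges, which is one common way to do it.

```cpp
#include <cmath>

// Minimal vector type standing in for a real math library (assumption).
struct Vec3 { float x, y, z; };
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b)
{
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
inline Vec3 normalize(Vec3 a) { return a * (1.0f / std::sqrt(dot(a, a))); }

struct Ray { Vec3 origin, dir; };

// Hypothetical triangle that caches its plane (normal and offset) once.
struct Triangle
{
    Vec3 v[3];
    Vec3 normal;
    float planeOffset;

    Triangle(Vec3 a, Vec3 b, Vec3 c)
    {
        v[0] = a; v[1] = b; v[2] = c;
        // Degenerate triangles are not handled here, as in the text.
        Vec3 edge0 = v[1] - v[0];
        Vec3 edge1 = v[2] - v[1];
        normal = normalize(cross(edge0, edge1));
        planeOffset = dot(v[0], normal);
    }

    // Returns true and writes the hit distance to tOut if the ray segment
    // [tMin, tMax] crosses the triangle.
    bool hit(const Ray& r, float tMin, float tMax, float& tOut) const
    {
        // Part one: intersect the ray with the triangle's plane.
        float denom = dot(r.dir, normal);
        if (std::fabs(denom) < 1e-8f)
            return false; // Ray is parallel to the plane
        float t = (planeOffset - dot(r.origin, normal)) / denom;
        if (t < tMin || t > tMax)
            return false; // Plane is hit outside the ray segment

        // Part two: the point must be on the inner side of every edge.
        Vec3 p = r.origin + r.dir * t;
        for (int i = 0; i < 3; ++i)
        {
            Vec3 edge = v[(i + 1) % 3] - v[i];
            Vec3 toPoint = p - v[i];
            if (dot(cross(edge, toPoint), normal) < 0.0f)
                return false;
        }
        tOut = t;
        return true;
    }
};
```

For instance, a ray starting at (0.25, 0.25, 1) and pointing down the negative z axis hits the triangle (0,0,0), (1,0,0), (0,1,0) at distance 1, while the same ray started from (2, 2, 1) reaches the plane but fails the edge test.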

The other pathtracer: Basic job system

Inspired by the Daily Pathtracer, by Aras Pranckevičius, I decided to also write a path tracer, mainly to better follow along with his posts (seriously, they're very good). I also noticed there are some other interesting aspects of a ray tracer that he hasn't mentioned (so far) and that may be worth talking about, so this post is my first attempt at doing just that. I want to touch on topics like triangle intersection, texture blending and BVH optimization, but let me know in the comments if you're interested in specific parts. As of now, there are:
- Part one: Job system (this post)
- Part two: The triangle
- Part three: Complex scenes
- Part four: AABB-Ray optimization
- Part five: Ray-Triangle optimization

Spoiler: This will not be a daily series, but we will do some cool stuff with triangle meshes, and loading gltf scenes.

C vs C++, Part II, Beautiful & efficient

In the first part of this series, we used C++ features (operator overloading and templates) to eliminate all the defines and macros necessary for using a Pin. This way, we achieved the same performance as C code, while slightly increasing readability and hugely increasing code safety. The result is a library that looks like this:

template<uint16_t address_>
struct Register {
 void operator=   (uint8_t _r)
 {
  *reinterpret_cast<volatile uint8_t*>(address_) = _r;
 }
 operator uint8_t   () const
 {
  return *reinterpret_cast<volatile uint8_t*>(address_);
 }
 operator volatile uint8_t& ()
 {
  return *reinterpret_cast<volatile uint8_t*>(address_);
 }

 template<uint8_t bit_>
 void setBit() { *reinterpret_cast<volatile uint8_t*>(address_) |= (1 << bit_); }
 template<uint8_t bit_>
 void clearBit() { *reinterpret_cast<volatile uint8_t*>(address_) &= ~(1 << bit_); }
};

Register<0x24> DDRB;
Register<0x25> PORTB;

constexpr uint8_t DDB5 = 5;
constexpr uint8_t PORTB5 = 5;

C vs C++, performance on AVR

The aim of this post is to fight the widespread belief that C++ is too slow a language for embedded environments. According to this belief, microcontrollers should still be programmed in C, or even in assembler. You probably don't agree with me right now. The idea that C is much more efficient than C++ is so common that it almost seems like sacrilege to debate it. That's why I'm about to make a series of comparisons between the two languages, throwing in some real, objective numbers (code size, execution time, etc.). Once we have shown that C++ can compete with good old C, we'll see that it is actually a better alternative. For that, besides performance metrics, I will compare things like safety, code readability and portability.

2018/08/10

The other pathtracer 5: Optimizing Triangle-Ray Intersections

2018/06/21

The other Pathtracer 4: Optimizing AABB-Ray intersection

2018/06/17

The other pathtracer 3: Complex scenes

2018/06/07

The other Pathtracer 2: The triangle

2018/06/06

The other pathtracer: Basic job system

2017/03/28

C vs C++, Part II, Beautiful & efficient

2016/11/21

C vs C++, performance on AVR