Megabyte Softworks
C++, OpenGL, Algorithms

Current series: OpenGL 3.3

Download (4.94 MB)
1449 downloads. 6 comments
27.) Occlusion Query

Hello guys! This is the 27th tutorial from my series. This one is about occlusion query, which is used to speed up rendering by discarding complex geometry that cannot possibly be visible in the scene. The principle is really simple - before rendering the complex geometry of an object with many meshes and triangles, we first render a simple bounding shape of the object (a bounding box is the easiest, for example) and ask OpenGL whether any fragments (pixels) would have been rendered. If rendering the bounding box wouldn't change the framebuffer (i.e. it's not visible in the scene at all), then the object itself cannot be visible either, and we can skip rendering the whole complex shape! That's the whole simple point, and coding it won't be very difficult, so let's dig deeper into it.

Occlusion Query

The whole process described above is called an occlusion query. Unfortunately, there are no Slovak translations of the words occlusion, occluder and occludee, so I hope I won't misuse them (if I do, let me know). Strictly speaking, an occluder is geometry that blocks the view of something behind it, and the occludee is the object whose visibility we are testing - in our case the highly tessellated sphere. Since testing the sphere triangle by triangle would defeat the purpose, we test its bounding box instead, as a cheap stand-in that fully contains it. We are not going to actually render the bounding box - we only ask OpenGL whether any of its pixels would have ended up on screen if we had rendered it. If we could see at least one pixel of it, then some part (or maybe all) of the occludee is probably visible too, and thus we will render it. The whole rendering code is here and I will explain it line by line:

bool bShowOccluders = false;
bool bEnableOcclusionQuery = true;

void RenderScene(LPVOID lpParam)
{
   // ...

   int iSpheresPassed = 0;
   bool bRenderSphere[3][3][3];
   glm::mat4 mModelMatrices[3][3][3];

   spOccluders.SetUniform("matrices.projMatrix", oglControl->GetProjectionMatrix());
   spOccluders.SetUniform("matrices.viewMatrix", cCamera.Look());
   spOccluders.SetUniform("vColor", glm::vec4(1, 0, 0, 0));

   // Occlusion query begins here
   // First of all, disable writing to the color buffer and depth buffer.
   // We just want to check whether the boxes would be rendered, not actually render them
   if(bEnableOcclusionQuery)
   {
      glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
      glDepthMask(GL_FALSE);
   }

   FOR(x, 3)
      FOR(y, 3)
         FOR(z, 3)
         {
            bRenderSphere[x][y][z] = false;
            float fLocalRotAngle = fGlobalAngle + x*60.0f + y*20.0f + z*6.0f;
            glm::vec3 vOcclusionCubePos = glm::vec3(
               -fCubeHalfSize + fCubeHalfSize*x*2.0f/3.0f + fCubeHalfSize/3.0f,
               -fCubeHalfSize + fCubeHalfSize*y*2.0f/3.0f + fCubeHalfSize/3.0f,
               -fCubeHalfSize + fCubeHalfSize*z*2.0f/3.0f + fCubeHalfSize/3.0f);

            mModelMatrices[x][y][z] = glm::translate(glm::mat4(1.0), glm::vec3(0.0f, fCubeHalfSize, 0.0f));
            mModelMatrices[x][y][z] = glm::translate(mModelMatrices[x][y][z], vOcclusionCubePos);
            mModelMatrices[x][y][z] = glm::rotate(mModelMatrices[x][y][z], fLocalRotAngle, glm::vec3(1, 0, 0));
            mModelMatrices[x][y][z] = glm::rotate(mModelMatrices[x][y][z], fLocalRotAngle, glm::vec3(0, 1, 0));
            mModelMatrices[x][y][z] = glm::rotate(mModelMatrices[x][y][z], fLocalRotAngle, glm::vec3(0, 0, 1));

            if(bEnableOcclusionQuery)
            {
               mModel = glm::scale(mModelMatrices[x][y][z], glm::vec3(fCubeHalfSize/3, fCubeHalfSize/3, fCubeHalfSize/3));
               spOccluders.SetUniform("matrices.modelMatrix", mModel);

               // Begin occlusion query
               glBeginQuery(GL_SAMPLES_PASSED, uiOcclusionQuery);
                  // Every fragment that passes the depth test now gets added to the result
                  glDrawArrays(GL_TRIANGLES, 0, 36);
               glEndQuery(GL_SAMPLES_PASSED);
               // Now get the number of samples that passed
               int iSamplesPassed = 0;
               glGetQueryObjectiv(uiOcclusionQuery, GL_QUERY_RESULT, &iSamplesPassed);
               // If some samples passed, we should render the whole sphere,
               // because we were able to see its bounding box
               if(iSamplesPassed > 0)
               {
                  bRenderSphere[x][y][z] = true;
                  iSpheresPassed++; // Increase the number of spheres that have passed
               }
            }
            else // If we do not use occlusion query, then all of the spheres have passed
            {
               bRenderSphere[x][y][z] = true;
               iSpheresPassed++;
            }
         }

   // Re-enable writing to color buffer and depth buffer
   if(bEnableOcclusionQuery)
   {
      glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
      glDepthMask(GL_TRUE);
   }

   // ...

So what we're basically doing here is that we have a 3x3x3 array of booleans, where we store whether or not to render each particular sphere. First of all, we must disable writing to the color and depth buffers using glColorMask and glDepthMask. That means that anything rendered now (our bounding boxes) will not actually get written into any of the buffers. But thanks to the occlusion query, we can still ask the important question - how many fragments (pixels) would have passed the depth test and made it to the final stage of rendering, i.e. writing to the color buffer? To start the occlusion query, we call glBeginQuery(GL_SAMPLES_PASSED, uiOcclusionQuery). This function has two parameters - the first is the query type, in our case GL_SAMPLES_PASSED, because we would like to find out how many fragments make it to the rendering phase. The second parameter is the query object - to run a query, you must have an OpenGL-generated ID associated with it. This is done in initScene() using the glGenQueries function, which we also used in the 23rd tutorial about Particle Systems, where queries were used to figure out the number of output primitives.
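As an aside, the long vOcclusionCubePos expression in the loop above just computes the center of grid cell (x, y, z) when the big cube of half-size fCubeHalfSize is split into 3x3x3 equal cells. Pulled out as a standalone function (CellCenter is my own helper name, not from the tutorial's code), the per-axis math looks like this:

```cpp
#include <cmath>

// Center of cell i (i = 0, 1, 2) along one axis of a cube with half-size h
// split into 3 equal cells: -h + h*i*2/3 + h/3, which simplifies to (2*i - 2) * h / 3
float CellCenter(float h, int i)
{
   return -h + h * i * 2.0f / 3.0f + h / 3.0f;
}
```

For example, with h = 3 the three cell centers along an axis come out as -2, 0 and 2, evenly spaced inside the cube.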

Now that we have begun the GL_SAMPLES_PASSED query and have turned off writing to the color and depth buffers, every pixel that would make it through counts. So we just render the bounding box of the sphere and then call the glEndQuery function, which, as the name suggests, ends the query. Note that we have a separate shader program for bounding box rendering without any texturing or lighting calculations, just a flat fragment output, because that is the only thing we care about at the moment. Now we are able to ask for the result of the query - the number of pixels that passed. This is done using glGetQueryObjectiv(uiOcclusionQuery, GL_QUERY_RESULT, &iSamplesPassed), after which iSamplesPassed holds the number of pixels that passed. Only if this number is greater than zero do we want to render the sphere. We mark it in our array of booleans and proceed to the next sphere.
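One caveat worth knowing: asking for GL_QUERY_RESULT stalls the CPU until the GPU has actually finished the query. For this tutorial that's fine, but in bigger applications you can first poll GL_QUERY_RESULT_AVAILABLE and do other work in the meantime. A minimal sketch of that pattern (assuming a valid OpenGL context and the same uiOcclusionQuery object as above):

```cpp
// Poll until the query result is ready instead of blocking right away
GLint iAvailable = GL_FALSE;
while(iAvailable == GL_FALSE)
{
   glGetQueryObjectiv(uiOcclusionQuery, GL_QUERY_RESULT_AVAILABLE, &iAvailable);
   // ... do some other useful CPU work here instead of busy-waiting ...
}
GLint iSamplesPassed = 0;
glGetQueryObjectiv(uiOcclusionQuery, GL_QUERY_RESULT, &iSamplesPassed); // no stall now
```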

When we are done testing which spheres should be rendered, the next step is naturally to render the (possibly) visible spheres. We just need to re-enable writing to the depth and color buffers and then go through every sphere in the scene using the data from the previously filled array of booleans. And that's it.
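The second pass can be sketched like this (a sketch only - spDirectional and RenderTesselatedSphere are placeholder names I made up, not necessarily what the tutorial's source uses):

```cpp
// Re-enable writing to color and depth buffer before the real rendering pass
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthMask(GL_TRUE);

FOR(x, 3)
   FOR(y, 3)
      FOR(z, 3)
      {
         if(!bRenderSphere[x][y][z])
            continue; // bounding box was fully occluded, skip the expensive sphere
         // Placeholder names below - use whatever main shader program and
         // sphere draw call your application has
         spDirectional.SetUniform("matrices.modelMatrix", mModelMatrices[x][y][z]);
         RenderTesselatedSphere();
      }
```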

NOTE: There is also a query type GL_ANY_SAMPLES_PASSED, whose result is GL_TRUE if any pixels passed and GL_FALSE if none did. This query would also work in our case, but when I tested it I didn't find any performance speed-up, so I stuck with GL_SAMPLES_PASSED and comparing against 0. You can try it out yourself.
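Switching to the boolean variant only changes the query target and how the result is interpreted; a sketch under the same assumptions as the code above:

```cpp
// GL_ANY_SAMPLES_PASSED answers "did anything pass?" rather than "how much passed?",
// which in theory lets the implementation stop counting after the first passing sample
glBeginQuery(GL_ANY_SAMPLES_PASSED, uiOcclusionQuery);
   glDrawArrays(GL_TRIANGLES, 0, 36); // render the bounding box
glEndQuery(GL_ANY_SAMPLES_PASSED);

GLint iAnyPassed = GL_FALSE;
glGetQueryObjectiv(uiOcclusionQuery, GL_QUERY_RESULT, &iAnyPassed);
bool bVisible = (iAnyPassed == GL_TRUE);
```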

When To Use Occlusion Query

Occlusion query is a really nice and simple way to avoid rendering objects that are not visible in the scene. But I found out that it only pays off when rendering really complex objects - there is a certain threshold of geometric complexity above which it's worth using. For example, with a sphere of only 10 stacks and 10 slices, the occlusion query (at least on my computer) resulted in lower FPS than rendering without it. However, with spheres of 200 stacks and 200 slices, the FPS difference between rendering with and without occlusion query was remarkable. In such cases, the occlusion query really paid off. You can test this in the application by turning occlusion query ON / OFF and also by modifying the stacks and slices numbers of a single sphere in the sphere.ini file, which is in the bin directory (where the exe is). So you can try modifying this value and see the results for yourself.
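The threshold makes sense if you compare triangle counts: the bounding box always costs 12 triangles, while the sphere's cost grows with stacks and slices. A rough back-of-the-envelope estimate (my own approximation - roughly two triangles per stack/slice cell - not the exact count of the tutorial's sphere class):

```cpp
// Roughly two triangles per stack/slice cell; the exact number depends on how
// the poles are handled, but the order of magnitude is what matters here
int EstimateSphereTriangles(int iStacks, int iSlices)
{
   return iStacks * iSlices * 2;
}
```

A 10x10 sphere is only about 200 triangles, so a 12-triangle pre-test saves very little and the query overhead can dominate, while a 200x200 sphere is about 80,000 triangles, and skipping it clearly outweighs the cost of the query.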


This is the fruit of today's effort:

I think this tutorial was one of the easiest I have ever written. The logic behind it is not difficult to understand, nor is it difficult to turn into code. I recommend playing a little with the stacks / slices parameter to see the FPS differences. Next time I will probably create a slightly more complex tutorial, as this one was really chill.


