Megabyte Softworks
C++, OpenGL, Algorithms

Current series: OpenGL 3.3
23.) Particle System


15.7.2013 - I fixed a small but significant bug that caused a malfunction on nVidia cards and flickering on some AMD cards. In particles_render.vert I simply forgot to pass the particle type further. nVidia cards apparently set outputs that haven't been written to default values (zero in the case of int), so the geometry shader used for rendering thought that all incoming particles were generators. AMD cards don't seem to care and just go on with whatever was in memory, so whenever that memory happened to contain a zero, the particles sometimes disappeared. That's it!


Hello guys! The impossible just happened! Another tutorial after about 3 months! I'm sorry it took me so long, but you know the drill (or you will probably know it in the future when you attend a university) - writing the never-ending pages of a diploma thesis, then immediately studying for the final state exams (I actually had two parts of final exams that were two weeks apart). But this is over now and I'm finally a Master of Informatics. And I bring you the next round of OpenGL tutorials. Today, I am going to teach you how to make a particle system that runs purely on the GPU, taking advantage of its parallel nature, which is really welcome in this case, because when simulating explosions or fluids, we have to deal with thousands of particles. So no more CPU bottlenecks - we can now render many, many particles really quickly by calling just a few OpenGL commands. So let's go make some fires, fountains or explosions!

The key ingredient of the particle system is transform feedback. This OpenGL feature allows us to send some geometry through the shaders and then record the outputted geometry in a buffer. You may now ask: 'Wat? How can rendering something help me with simulating particles?' At first sight, you probably don't see how we can do it. But once I tell you the key idea, you will probably be able to program a particle system yourself.

The key idea is to make two shader programs: one for rendering particles and one for updating them and creating new ones. The rendering part is pretty easy, as you will see later - we simply take all the particles in the scene and render them, really nothing special. The shader program for updating is where all the magic happens. The trick is to treat every particle as a vertex and all the attributes of a particle (position, velocity, color etc.) as vertex attributes. Just as in normal rendering, where you send vertices with attributes such as position, normal and texture coordinates, you will send the particles with their respective attributes. The geometry shader will then deal with updating, deleting and creating particles. If a particle is still alive, our geometry shader will emit its vertex further. If the particle's lifetime has expired, the geometry shader will simply not emit it, thus deleting it. And when we want to create new particles, our geometry shader will emit more vertices. Transform feedback will record the new set of particles, which will be used as input for the next update pass and for rendering. Simple as that. So let's cover all the details and functions required to do so. The whole transform feedback particle system is stored in the class CParticleSystemTransformFeedback. You can see the class definition here:

class CParticleSystemTransformFeedback
{
public:
   bool InitalizeParticleSystem();

   void RenderParticles();
   void UpdateParticles(float fTimePassed);

   void SetGeneratorProperties(glm::vec3 a_vGenPosition, glm::vec3 a_vGenVelocityMin, glm::vec3 a_vGenVelocityMax, glm::vec3 a_vGenGravityVector,
      glm::vec3 a_vGenColor, float a_fGenLifeMin, float a_fGenLifeMax, float a_fGenSize, float fEvery, int a_iNumToGenerate);

   void ClearAllParticles();
   bool ReleaseParticleSystem();

   int GetNumParticles();

   void SetMatrices(glm::mat4* a_matProjection, glm::vec3 vEye, glm::vec3 vView, glm::vec3 vUpVector);

private:
   bool bInitialized;

   UINT uiTransformFeedbackBuffer;

   UINT uiParticleBuffer[2];
   UINT uiVAO[2];

   UINT uiQuery;
   UINT uiTexture;

   int iCurReadBuffer;
   int iNumParticles;

   glm::mat4 matProjection, matView;
   glm::vec3 vQuad1, vQuad2;

   float fElapsedTime;
   float fNextGenerationTime;

   glm::vec3 vGenPosition;
   glm::vec3 vGenVelocityMin, vGenVelocityRange;
   glm::vec3 vGenGravityVector;
   glm::vec3 vGenColor;

   float fGenLifeMin, fGenLifeRange;
   float fGenSize;

   int iNumToGenerate;

   CShader shVertexRender, shGeomRender, shFragRender;
   CShader shVertexUpdate, shGeomUpdate, shFragUpdate;
   CShaderProgram spRenderParticles;
   CShaderProgram spUpdateParticles;
};

This is the class definition for our particles:

class CParticle
{
public:
   glm::vec3 vPosition;
   glm::vec3 vVelocity;
   glm::vec3 vColor;
   float fLifeTime;
   float fSize;
   int iType;
};

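The class above has six properties. Let's go through each one of them:

vPosition - position of the particle in the scene
vVelocity - velocity of the particle; it moves the particle every frame and is itself changed by gravity
vColor - color of the particle
fLifeTime - how many seconds the particle still has to live; once it runs out, the particle is removed
fSize - size of the rendered particle
iType - type of the particle, either generator (0) or normal (1); this is explained later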

These are all important properties of particles that we should implement in order to create nice and versatile effects. Now let's divide this tutorial into two main parts - particle creation and updating, and particle rendering. Updating is more difficult, so let's begin with that.

1.) Particle Creation And Updating

How does a particle system work? Well, every now and then (every few milliseconds or seconds) it should generate a bunch of new particles. In between, we go through all the particles in a loop (a classic for loop, for example) and update their properties (position, velocity etc.). If a particle is still alive, we also render it; if not, we remove it from our list of particles. Then, when the desired time has come, we create a bunch (for example 30) of new particles. If we were to simulate particles normally in our C++ program on the CPU, the code would look something like this:

for(int i = 0; i < number_of_particles_on_scene; i++)
{
   if(particle[i].lifetime > 0.0) { /* alive: update its properties and render it; otherwise remove it from the list */ }
}
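To make the loop above concrete, here is a minimal, self-contained sketch of one simulation step on the CPU. The names Vec3, CpuParticle and UpdateOnCpu, and the fixed spawn velocity, are assumptions made only for this illustration; they are not part of the tutorial's code, which does the same work on the GPU.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical minimal types, for illustration only (the tutorial itself uses glm::vec3).
struct Vec3 { float x, y, z; };

struct CpuParticle
{
    Vec3 position;
    Vec3 velocity;
    float lifetime; // seconds left to live
};

// One simulation step: move and age every particle, drop the expired ones,
// then spawn numToSpawn new particles at the generator position.
void UpdateOnCpu(std::vector<CpuParticle>& particles, float dt, Vec3 gravity,
                 Vec3 spawnPosition, Vec3 spawnVelocity, float spawnLifetime, int numToSpawn)
{
    for (CpuParticle& p : particles)
    {
        // Same order as in the update shader: position first, then velocity.
        p.position.x += p.velocity.x * dt;
        p.position.y += p.velocity.y * dt;
        p.position.z += p.velocity.z * dt;
        p.velocity.x += gravity.x * dt;
        p.velocity.y += gravity.y * dt;
        p.velocity.z += gravity.z * dt;
        p.lifetime -= dt;
    }

    // Remove every particle whose lifetime has run out.
    particles.erase(std::remove_if(particles.begin(), particles.end(),
                        [](const CpuParticle& p) { return p.lifetime <= 0.0f; }),
                    particles.end());

    for (int i = 0; i < numToSpawn; i++)
        particles.push_back(CpuParticle{spawnPosition, spawnVelocity, spawnLifetime});
}
```

Every particle is processed independently of the others, which is exactly why this work maps so well onto the GPU.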

And this is what we will do on the GPU too! However, we won't be programming any for loops to do so; we will use shader programs instead. And with a single draw call we will perform the whole creation and updating at once. We will do this by having two big buffers allocated in GPU memory, where the particles are going to be stored. One buffer is for reading, the second one is for writing the updated particles with their new properties. We swap the two buffers every frame - in the N-th frame the first buffer is used for reading and the second for writing, and in the (N+1)-th frame the first buffer is used for writing and the second for reading. It must be done this way, because we cannot reliably read from and write into the same buffer on the GPU (*1).

The shader program that does all of the above takes vertices as input. These vertices are actually particles, and all the vertex attributes that we set are actually particle attributes. The buffer that's gonna be filled with particles stores them one after another, packed as tightly as possible, so in GPU memory all six attributes of the first particle are followed directly by all six attributes of the second particle, and so on.
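The tightly packed layout can be checked on the CPU side. The sketch below mirrors CParticle with plain float arrays; PackedParticle and ParticleOffset are hypothetical names for this illustration, and it assumes that glm::vec3 is three tightly packed floats and that float and int are 4 bytes each, which holds on common desktop platforms.

```cpp
#include <cstddef>

// Hypothetical stand-in for CParticle without the glm dependency.
struct PackedParticle
{
    float position[3]; // bytes  0..11
    float velocity[3]; // bytes 12..23
    float color[3];    // bytes 24..35
    float lifetime;    // bytes 36..39
    float size;        // bytes 40..43
    int   type;        // bytes 44..47
};

// Compile-time checks that there is no padding between the attributes.
static_assert(offsetof(PackedParticle, position) == 0,  "position must start the particle");
static_assert(offsetof(PackedParticle, velocity) == 12, "velocity must follow position");
static_assert(offsetof(PackedParticle, color)    == 24, "color must follow velocity");
static_assert(offsetof(PackedParticle, lifetime) == 36, "lifetime must follow color");
static_assert(offsetof(PackedParticle, size)     == 40, "size must follow lifetime");
static_assert(offsetof(PackedParticle, type)     == 44, "type must follow size");
static_assert(sizeof(PackedParticle) == 48, "one particle must occupy exactly 48 bytes");

// Byte offset of the particle with the given index in a tightly packed buffer.
constexpr std::size_t ParticleOffset(std::size_t index)
{
    return index * sizeof(PackedParticle);
}
```

These are the same byte offsets (0, 12, 24, 36, 40, 44) and the same stride that the initialization code later passes to OpenGL when it describes the vertex attributes.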

With such a buffer, the only thing we need to remember is the number of particles stored in it. Then we just call a drawing command to draw points, with exactly this buffer as the source. Each point (vertex) represents one particle. However, we will not literally draw anything; we will just run our shader program, which updates all the particles in this buffer and writes the result into the second buffer, and we remember the new number of particles. This number can differ from the previous one, because some particles may have expired and been removed, and new particles may have been generated.

Now I will explain probably the most important part of this tutorial - the updating shader program. This shader program consists of only two shaders - a vertex shader and a geometry shader. We don't need a fragment shader, because we aren't rendering anything; we just need to update particles. The vertex shader is very simple: it just passes the vertices and associated data further to the geometry shader, nothing special. The vertex shader looks as follows:

#version 330

layout (location = 0) in vec3 vPosition;
layout (location = 1) in vec3 vVelocity;
layout (location = 2) in vec3 vColor;
layout (location = 3) in float fLifeTime;
layout (location = 4) in float fSize;
layout (location = 5) in int iType;

out vec3 vPositionPass;
out vec3 vVelocityPass;
out vec3 vColorPass;
out float fLifeTimePass;
out float fSizePass;
out int iTypePass;

void main()
{
  vPositionPass = vPosition;
  vVelocityPass = vVelocity;
  vColorPass = vColor;
  fLifeTimePass = fLifeTime;
  fSizePass = fSize;
  iTypePass = iType;
}

Really simple stuff. However, the geometry shader is where all the magic happens. Here it is; the explanation continues below it:

#version 330

layout(points) in;
layout(points) out;
layout(max_vertices = 40) out;

// All that we get from vertex shader

in vec3 vPositionPass[];
in vec3 vVelocityPass[];
in vec3 vColorPass[];
in float fLifeTimePass[];
in float fSizePass[];
in int iTypePass[];

// All that we send further

out vec3 vPositionOut;
out vec3 vVelocityOut;
out vec3 vColorOut;
out float fLifeTimeOut;
out float fSizeOut;
out int iTypeOut;

uniform vec3 vGenPosition; // Position where new particles are spawned
uniform vec3 vGenGravityVector; // Gravity vector for particles - updates velocity of particles
uniform vec3 vGenVelocityMin; // Velocity of new particle - from min to (min+range)
uniform vec3 vGenVelocityRange;

uniform vec3 vGenColor;
uniform float fGenSize; 

uniform float fGenLifeMin, fGenLifeRange; // Life of new particle - from min to (min+range)
uniform float fTimePassed; // Time passed since last frame

uniform vec3 vRandomSeed; // Seed number for our random number function
vec3 vLocalSeed;

uniform int iNumToGenerate; // How many particles will be generated next time, if greater than zero, particles are generated

// This function returns random number from zero to one
float randZeroOne()
{
    uint n = floatBitsToUint(vLocalSeed.y * 214013.0 + vLocalSeed.x * 2531011.0 + vLocalSeed.z * 141251.0);
    n = n * (n * n * 15731u + 789221u);
    n = (n >> 9u) | 0x3F800000u;
    float fRes = 2.0 - uintBitsToFloat(n);
    vLocalSeed = vec3(vLocalSeed.x + 147158.0 * fRes, vLocalSeed.y*fRes + 415161.0 * fRes, vLocalSeed.z + 324154.0*fRes);
    return fRes;
}

void main()
{
  vLocalSeed = vRandomSeed;
  // gl_Position doesn't matter now, as rendering is discarded, so I don't set it at all

  vPositionOut = vPositionPass[0];
  vVelocityOut = vVelocityPass[0];
  if(iTypePass[0] != 0)vPositionOut += vVelocityOut*fTimePassed;
  if(iTypePass[0] != 0)vVelocityOut += vGenGravityVector*fTimePassed;

  vColorOut = vColorPass[0];
  fLifeTimeOut = fLifeTimePass[0]-fTimePassed;
  fSizeOut = fSizePass[0];
  iTypeOut = iTypePass[0];

  if(iTypeOut == 0)
  {
    // The generator itself is always emitted, so it is still there in the next pass
    EmitVertex();
    EndPrimitive();

    for(int i = 0; i < iNumToGenerate; i++)
    {
      vPositionOut = vGenPosition;
      vVelocityOut = vGenVelocityMin+vec3(vGenVelocityRange.x*randZeroOne(), vGenVelocityRange.y*randZeroOne(), vGenVelocityRange.z*randZeroOne());
      vColorOut = vGenColor;
      fLifeTimeOut = fGenLifeMin+fGenLifeRange*randZeroOne();
      fSizeOut = fGenSize;
      iTypeOut = 1;
      EmitVertex();
      EndPrimitive();
    }
  }
  else if(fLifeTimeOut > 0.0)
  {
    // A normal particle is emitted only while it is still alive
    EmitVertex();
    EndPrimitive();
  }
}

Wow, dat shader! The first lines of the geometry shader are simple. They say that we want incoming vertices to be treated as points (one point = one particle). The outputted vertices should also be points. The line layout(max_vertices = 40) out; declares that a single invocation of the shader will emit at most 40 vertices. This will make sense a little later.

The next few lines, starting with the comment 'All that we get from vertex shader', are just the input vertex attributes. Every vertex that comes into the shader has these attributes associated with it, and they are nothing other than particle attributes. The lines starting with the comment 'All that we send further' are the same attributes again, but this time we are outputting them to the next processing stage (this data will actually be written to our write buffer using transform feedback).

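After that we can see several uniform variables that control particle generation. These variables deserve a thorough examination:

vGenPosition - position where new particles are spawned
vGenGravityVector - gravity vector that is added to the velocity of particles over time
vGenVelocityMin, vGenVelocityRange - velocity of a new particle, chosen randomly between min and (min+range)
vGenColor - color of new particles
fGenSize - size of new particles
fGenLifeMin, fGenLifeRange - lifetime of a new particle, chosen randomly between min and (min+range)
fTimePassed - time passed since the last frame
vRandomSeed - seed for our random number function
iNumToGenerate - how many particles the generator should emit in this pass; particles are generated only if it is greater than zero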

Now you should understand the meaning of all uniform variables in the geometry shader. After them comes the randZeroOne function, which generates a random number from 0 to 1 depending on a seed. How does it work? Well, it's magic, as Coldplay sings.

But if you look closer, it just reinterprets the bits of a float as an integer, scrambles them, and reinterprets the integer bits as a float again. The final OR with 0x3F800000 builds a float between 1 and 2 out of the scrambled bits, and subtracting it from 2 gives the result. It's not my creation, I found it on the internet. There are many other rand implementations, even one-liners, but I found this one to be nice and usable, so just believe it works for now.
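If you want to poke at the bit trick outside of a shader, here is a sketch of a CPU port of randZeroOne. The helper names (FloatBitsToUint, UintBitsToFloat, Seed, RandZeroOne) are made up for this illustration, and it assumes 32-bit IEEE 754 floats; results may differ in the last bits from what a particular GPU produces.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the bits of a float as an unsigned integer, like GLSL's floatBitsToUint.
inline std::uint32_t FloatBitsToUint(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Reinterprets the bits of an unsigned integer as a float, like GLSL's uintBitsToFloat.
inline float UintBitsToFloat(std::uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Hypothetical stand-in for the shader's vLocalSeed.
struct Seed { float x, y, z; };

// CPU port of the shader's randZeroOne(); it advances the seed the same way the shader does.
inline float RandZeroOne(Seed& s)
{
    std::uint32_t n = FloatBitsToUint(s.y * 214013.0f + s.x * 2531011.0f + s.z * 141251.0f);
    n = n * (n * n * 15731u + 789221u);       // unsigned arithmetic wraps around
    n = (n >> 9u) | 0x3F800000u;              // 23 scrambled mantissa bits, exponent of 1.0: a float in [1, 2)
    float result = 2.0f - UintBitsToFloat(n); // therefore a value in (0, 1]
    s = Seed{ s.x + 147158.0f * result, s.y * result + 415161.0f * result, s.z + 324154.0f * result };
    return result;
}
```

Note that the result can be exactly 1 but never exactly 0, so strictly speaking the range is (0, 1].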

Finally we proceed to the main function of the geometry shader. The first line just sets vLocalSeed to the seed provided by the uniform. Then comes the actual updating of particle attributes - we update position with velocity, velocity with gravity, and subtract the elapsed time from the lifetime. The other attributes remain unchanged, specifically color, size and type. And now you will finally get to know what that type means. A particle can be of two types - PARTICLE_TYPE_GENERATOR or PARTICLE_TYPE_NORMAL.

A normal particle is easy. It's just a regular particle with its own lifetime. Whenever such a particle comes in, it's updated. If its lifetime is still above zero, we actually EMIT this vertex and primitive, thus sending the particle further. When its lifetime has dropped to zero or below, the particle is not emitted, and thus it won't be available in the next frame. By not emitting it we have effectively eliminated the particle.

The particle generator is a special particle that is always there. It always passes the test; we never check its lifetime. The special thing about it is that this type of particle is unique - i.e. there is no other generator among the particles. Why do we need something like this? Because when we want to generate particles, we want to do it exactly once per pass, not once per incoming particle. If our geometry shader receives a particle of type PARTICLE_TYPE_GENERATOR (0), it emits new vertices, and therefore new particles, according to our iNumToGenerate uniform variable. And that's the trick! If we want to generate new particles, we simply emit more vertices (that's what geometry shaders are for - they can generate new geometry for us, in this case new vertices, which are particles). In our application, we keep track of time locally. If we want to generate particles every 0.25 seconds and that time has passed, we set the iNumToGenerate uniform to the desired number of particles and the geometry shader will emit them.
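The emit-or-don't-emit logic, together with the two swapped buffers, can be emulated on the CPU. This is only a sketch with made-up names (SimParticle, PingPongBuffers, RunUpdatePass) that tracks nothing but lifetime and type; the real work happens in the geometry shader and transform feedback.

```cpp
#include <vector>

// Hypothetical CPU stand-ins, for illustration only.
enum ParticleType { TYPE_GENERATOR = 0, TYPE_NORMAL = 1 };

struct SimParticle
{
    float lifetime;
    int type;
};

struct PingPongBuffers
{
    std::vector<SimParticle> buffer[2];
    int curRead = 0; // index of the buffer that the next pass reads from
};

// Emulates one update pass: every input particle is visited once (like one geometry
// shader invocation) and "emits" zero, one or many particles into the write buffer.
// Returns the number of particles written, which is what the query reports on the GPU.
int RunUpdatePass(PingPongBuffers& b, float dt, int numToGenerate, float newLifetime)
{
    const std::vector<SimParticle>& in = b.buffer[b.curRead];
    std::vector<SimParticle>& out = b.buffer[1 - b.curRead];
    out.clear();

    for (const SimParticle& p : in)
    {
        if (p.type == TYPE_GENERATOR)
        {
            out.push_back(p); // the generator always survives
            for (int i = 0; i < numToGenerate; i++)
                out.push_back(SimParticle{newLifetime, TYPE_NORMAL});
        }
        else if (p.lifetime - dt > 0.0f)
        {
            out.push_back(SimParticle{p.lifetime - dt, TYPE_NORMAL});
        }
        // else: expired, nothing is emitted, so the particle disappears
    }

    b.curRead = 1 - b.curRead; // swap: what was just written is read next time
    return static_cast<int>(out.size());
}
```

Starting from a buffer that holds only the generator, the particle count grows whenever numToGenerate is greater than zero and shrinks again as the lifetimes run out, while the generator stays in the buffer forever.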

Transform Feedback

Emitted particles must, however, be tracked. And that's what transform feedback is about. It captures the outputted geometry into buffers. In our case, we record every emitted particle into a previously allocated buffer, and this buffer can then be used to render particles. We also need double buffering, as mentioned before - one buffer serves as the read buffer and the other one as the write buffer. First we need to initialize transform feedback. All of the particle system initialization, including the transform feedback initialization, is done in the pretty long InitalizeParticleSystem() function. The first part deals with shader loading:

bool CParticleSystemTransformFeedback::InitalizeParticleSystem()
{
   if(bInitialized)return false;

   // Names of the geometry shader outputs that transform feedback should record
   const char* sVaryings[NUM_PARTICLE_ATTRIBUTES] = 
   {
      "vPositionOut",
      "vVelocityOut",
      "vColorOut",
      "fLifeTimeOut",
      "fSizeOut",
      "iTypeOut",
   };

   // Updating program

   shVertexUpdate.LoadShader("data\\shaders\\particles_update.vert", GL_VERTEX_SHADER);
   shGeomUpdate.LoadShader("data\\shaders\\particles_update.geom", GL_GEOMETRY_SHADER);

   // (the calls that create spUpdateParticles and attach both shaders are missing from this listing)
   // One call registers all varyings; it takes effect only when the program is linked afterwards
   glTransformFeedbackVaryings(spUpdateParticles.GetProgramID(), NUM_PARTICLE_ATTRIBUTES, sVaryings, GL_INTERLEAVED_ATTRIBS);

   // Rendering program

   shVertexRender.LoadShader("data\\shaders\\particles_render.vert", GL_VERTEX_SHADER);
   shGeomRender.LoadShader("data\\shaders\\particles_render.geom", GL_GEOMETRY_SHADER);
   shFragRender.LoadShader("data\\shaders\\particles_render.frag", GL_FRAGMENT_SHADER);

   // ...
}




Besides the code for loading shaders, there is only one special thing that requires explanation - glTransformFeedbackVaryings. This is the function that tells OpenGL which output variables should be recorded by transform feedback. In our case, we simply take all the particle attributes. The first parameter is the shader program ID, the second is the number of variables to record, the third is an array with the names of those variables (the names correspond to the output variables of the geometry shader) and the last one is either GL_INTERLEAVED_ATTRIBS or GL_SEPARATE_ATTRIBS. In our case we use GL_INTERLEAVED_ATTRIBS, because our output is written into a single buffer where particles are stored one after another. Note that the function must be called before the program is linked, because the setting takes effect only at link time.

The second part of InitalizeParticleSystem() creates all the necessary buffers:

bool CParticleSystemTransformFeedback::InitalizeParticleSystem()
{
   // ...

   glGenTransformFeedbacks(1, &uiTransformFeedbackBuffer);
   glGenQueries(1, &uiQuery);

   glGenBuffers(2, uiParticleBuffer);
   glGenVertexArrays(2, uiVAO);

   CParticle partInitialization;
   partInitialization.iType = PARTICLE_TYPE_GENERATOR;

   FOR(i, 2)
   {
      glBindVertexArray(uiVAO[i]);
      glBindBuffer(GL_ARRAY_BUFFER, uiParticleBuffer[i]);
      // Storage must be allocated before glBufferSubData can fill it. The allocation call is missing from
      // this listing; MAX_PARTICLES_ON_SCENE is an assumed name for the capacity of the buffer.
      glBufferData(GL_ARRAY_BUFFER, sizeof(CParticle)*MAX_PARTICLES_ON_SCENE, NULL, GL_DYNAMIC_DRAW);
      glBufferSubData(GL_ARRAY_BUFFER, 0, sizeof(CParticle), &partInitialization);

      FOR(j, NUM_PARTICLE_ATTRIBUTES)glEnableVertexAttribArray(j);

      glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(CParticle), (const GLvoid*)0); // Position
      glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, sizeof(CParticle), (const GLvoid*)12); // Velocity
      glVertexAttribPointer(2, 3, GL_FLOAT, GL_FALSE, sizeof(CParticle), (const GLvoid*)24); // Color
      glVertexAttribPointer(3, 1, GL_FLOAT, GL_FALSE, sizeof(CParticle), (const GLvoid*)36); // Lifetime
      glVertexAttribPointer(4, 1, GL_FLOAT, GL_FALSE, sizeof(CParticle), (const GLvoid*)40); // Size
      // An integer attribute (in int iType) needs the I variant, otherwise the value is converted to float
      glVertexAttribIPointer(5, 1, GL_INT, sizeof(CParticle), (const GLvoid*)44); // Type
   }
   iCurReadBuffer = 0;
   iNumParticles = 1;

   bInitialized = true;

   return true;
}

There are several important functions here. The first is glGenTransformFeedbacks. As usual in OpenGL, it generates a transform feedback object and gives us its name. glGenQueries generates a query object. This query will be used later to ask OpenGL how many particles were emitted in the last pass. There is no need to count the particles manually, we will just run this query. The next lines generate two buffers for storing particles. For each of them, we create a VBO and an associated VAO. There is one special thing though - these buffers will contain one single particle after initialization - the generator particle. The code is very similar to normal VBO and VAO creation, we just add some initial data using glBufferSubData. We also set the corresponding vertex attribute pointers with glVertexAttribPointer to tell OpenGL the layout of attributes in memory. Also don't forget to initialize iNumParticles to 1. And that's it!

The update function of our transform feedback particle system class is the following:

void CParticleSystemTransformFeedback::UpdateParticles(float fTimePassed)
{
   // (the update program must be made current here, before its uniforms are set)

   spUpdateParticles.SetUniform("fTimePassed",         fTimePassed);
   spUpdateParticles.SetUniform("vGenPosition",      vGenPosition);
   spUpdateParticles.SetUniform("vGenVelocityMin",      vGenVelocityMin);
   spUpdateParticles.SetUniform("vGenVelocityRange",   vGenVelocityRange);
   spUpdateParticles.SetUniform("vGenColor",         vGenColor);
   spUpdateParticles.SetUniform("vGenGravityVector",   vGenGravityVector);

   spUpdateParticles.SetUniform("fGenLifeMin",         fGenLifeMin);
   spUpdateParticles.SetUniform("fGenLifeRange",      fGenLifeRange);

   spUpdateParticles.SetUniform("fGenSize",         fGenSize);
   spUpdateParticles.SetUniform("iNumToGenerate",         0);

   fElapsedTime += fTimePassed;

   if(fElapsedTime > fNextGenerationTime)
   {
      spUpdateParticles.SetUniform("iNumToGenerate", iNumToGenerate);
      fElapsedTime -= fNextGenerationTime;

      glm::vec3 vRandomSeed = glm::vec3(grandf(-10.0f, 20.0f), grandf(-10.0f, 20.0f), grandf(-10.0f, 20.0f));
      spUpdateParticles.SetUniform("vRandomSeed", &vRandomSeed);
   }

   glEnable(GL_RASTERIZER_DISCARD);
   glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, uiTransformFeedbackBuffer);

   glBindVertexArray(uiVAO[iCurReadBuffer]);
   glEnableVertexAttribArray(1); // Re-enable velocity

   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, uiParticleBuffer[1-iCurReadBuffer]);

   glBeginQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, uiQuery);
   glBeginTransformFeedback(GL_POINTS);

   glDrawArrays(GL_POINTS, 0, iNumParticles);

   glEndTransformFeedback();
   glEndQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN);

   glGetQueryObjectiv(uiQuery, GL_QUERY_RESULT, &iNumParticles);

   iCurReadBuffer = 1-iCurReadBuffer;

   glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, 0);
}

This function takes only one parameter - the time passed since the last frame. There are several important things here. First, we need to set all generator uniforms, nothing really special. Then there's a very important part - whenever the time that we count locally exceeds a certain threshold (fNextGenerationTime), we set the number of particles to generate and a new random seed in our shader program, so that particles are generated properly. In all other frames iNumToGenerate stays zero and nothing is generated.
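The timing part is easy to isolate. The following sketch uses a hypothetical SpawnTimer struct (not part of the tutorial's class) that follows the same accumulate-and-subtract logic and returns the value that would be uploaded as iNumToGenerate for the current frame.

```cpp
// Hypothetical helper for illustration: decides how many particles
// the generator should emit in the current frame.
struct SpawnTimer
{
    float elapsed;   // time accumulated since the last burst, in seconds
    float interval;  // desired time between bursts, in seconds
    int   burstSize; // number of particles emitted per burst

    // Advances the timer by dt seconds and returns the value for iNumToGenerate.
    int Tick(float dt)
    {
        elapsed += dt;
        if (elapsed > interval)
        {
            elapsed -= interval; // keep the remainder instead of resetting to zero
            return burstSize;
        }
        return 0;
    }
};
```

Keeping the remainder means that leftover time is carried into the next interval. On the other hand, a single very long frame still triggers only one burst, because the interval is subtracted once per call, just like in UpdateParticles.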

Then the most important part of the tutorial happens. Line by line - glEnable(GL_RASTERIZER_DISCARD) disables rasterization. This means that we don't want any graphical output; we are just updating particles. glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, uiTransformFeedbackBuffer) tells OpenGL that we want to use our previously created transform feedback object. glBindVertexArray(uiVAO[iCurReadBuffer]) binds the current VAO, i.e. the read buffer. Then there is glEnableVertexAttribArray(1). When updating particles, we do need the velocity vector, but when rendering them, we don't, so the rendering code disables this attribute and we have to re-enable it here. This way we can save some processing time by not sending down things we aren't using. Now listen carefully - glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, uiParticleBuffer[1-iCurReadBuffer]) tells OpenGL where it should store the result of the transform feedback operation. Because our current read buffer is iCurReadBuffer, the write buffer is 1-iCurReadBuffer. This means that the VBO uiParticleBuffer[1-iCurReadBuffer] serves as storage for the transform feedback output.

Now everything is set up for rendering with transform feedback. First, we call glBeginQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, uiQuery) to count the number of outputted primitives. Then we tell OpenGL that we want to begin transform feedback with glBeginTransformFeedback(GL_POINTS). After that, we call the actual drawing function glDrawArrays(GL_POINTS, 0, iNumParticles), and then end the transform feedback with glEndTransformFeedback. Because rendering is done, we can also end the query with glEndQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN). Now we just need to get the result of the query, i.e. the new number of particles, by calling glGetQueryObjectiv(uiQuery, GL_QUERY_RESULT, &iNumParticles). The last thing that remains is to swap the read and write buffers and to unbind the transform feedback object by calling glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, 0).

Wow, so much stuff at once! But that's all we need for updating, now we can proceed to rendering, which is a lot less complicated.

2.) Rendering Particles
Now we're getting to a slightly easier part - rendering of particles. What we have at this moment is a buffer full of particles and their number. All we need to do is create a shader program that takes these particles and renders them. What's the issue that comes to your mind? Think about it. Particles are only points, but what we would like to render are quads with a texture applied to them. So we want to generate a quad from a point. And what kind of OpenGL thing can create more geometry from less geometry? That's right! A geometry shader! We must create a geometry shader that takes particles as input and outputs textured quads. Not that difficult.

However, the problems are not over yet. The last thing that remains is how to orient the quad so that it faces the camera. This technique is called billboarding and there are several ways to do it. I'm going to show you the simple approach I've been using my whole life, and the results are good.


My custom method is maybe not the most efficient one, but it's really simple to understand. We just take the camera's view vector and calculate the vectors of the billboard plane from it. So it's like having the normal of a plane (the view vector) and wanting to get the plane from it. We are looking for two vectors, vQuad1 and vQuad2, that lie in this plane.

These two vectors and the view vector should all be perpendicular to each other. When all three vectors are perpendicular to each other and have unit length, they form an orthonormal basis (random math wisdom). We will find them in the SetMatrices function of CParticleSystemTransformFeedback. This function not only calculates these two vectors, but also stores the projection and view matrices for the particle rendering program. Here it is:

void CParticleSystemTransformFeedback::SetMatrices(glm::mat4* a_matProjection, glm::vec3 vEye, glm::vec3 vView, glm::vec3 vUpVector)
{
   matProjection = *a_matProjection;
   matView = glm::lookAt(vEye, vView, vUpVector);

   vView = vView-vEye;
   vView = glm::normalize(vView);
   vQuad1 = glm::cross(vView, vUpVector);

   vQuad1 = glm::normalize(vQuad1);
   vQuad2 = glm::cross(vView, vQuad1);
   vQuad2 = glm::normalize(vQuad2);
}

We just do some cross products. First, we find the first quad vector as the cross product of the view vector and the camera up vector, and normalize it. The second quad vector is the cross product of the view vector and the newly calculated first quad vector. These two vectors are then set as uniforms in our rendering shader program. The quad is generated around the position of the particle, with the desired size, using these two vectors.
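To see that the two cross products really give a perpendicular pair, here is a small self-contained sketch of the same steps without glm. The V3 type and the helper functions are assumptions for this illustration only.

```cpp
#include <cmath>

// Hypothetical minimal vector type, for illustration only.
struct V3 { float x, y, z; };

inline V3 Sub(V3 a, V3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline float Dot(V3 a, V3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
inline V3 Cross(V3 a, V3 b)
{
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}
inline V3 Normalize(V3 v)
{
    float len = std::sqrt(Dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

struct BillboardBasis { V3 quad1, quad2; };

// Same steps as SetMatrices: eye and target are points, up is a direction.
inline BillboardBasis ComputeBillboardBasis(V3 eye, V3 target, V3 up)
{
    V3 view  = Normalize(Sub(target, eye));
    V3 quad1 = Normalize(Cross(view, up));    // perpendicular to view and up
    V3 quad2 = Normalize(Cross(view, quad1)); // perpendicular to view and quad1
    return { quad1, quad2 };
}
```

For a camera at (0, 0, 5) looking at the origin with up (0, 1, 0), this gives quad1 = (1, 0, 0) and quad2 = (0, -1, 0). One caveat: if the view direction is parallel to the up vector, the first cross product is a zero vector and the normalization divides by zero, so that case would need special handling.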

The biggest problem is solved, so let's examine the rendering shader program. It consists of a vertex, a geometry and a fragment shader. The vertex shader is really simple - it just passes data further to the geometry shader:

#version 330

layout (location = 0) in vec3 vPosition;
layout (location = 2) in vec3 vColor;
layout (location = 3) in float fLifeTime;
layout (location = 4) in float fSize;
layout (location = 5) in int iType;

out vec3 vColorPass;
out float fLifeTimePass;
out float fSizePass;
out int iTypePass;

void main()
{
   gl_Position = vec4(vPosition, 1.0);
   vColorPass = vColor;
   fSizePass = fSize;
   fLifeTimePass = fLifeTime;
   iTypePass = iType; // Must be passed on, otherwise the geometry shader reads an undefined type
}

The geometry shader is where we create a quad from a point, so we need to emit 4 vertices. Because there is nothing like GL_QUADS anymore, we generate a triangle strip by outputting the 4 vertices in the correct order. The uniform variables vQuad1 and vQuad2 are the ones we calculated before. We also need to set the texture coordinates of the generated vertices. We could set normals as well, but there is no need for any kind of shading on the particles. For now, we are fine with just vertices and their texture coordinates:

#version 330

uniform struct Matrices
{
   mat4 mProj;
   mat4 mView;
} matrices;

uniform vec3 vQuad1, vQuad2;

layout(points) in;
layout(triangle_strip) out;
layout(max_vertices = 4) out;

in vec3 vColorPass[];
in float fLifeTimePass[];
in float fSizePass[];
in int iTypePass[];

smooth out vec2 vTexCoord;
flat out vec4 vColorPart;

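void main()
{
  if(iTypePass[0] != 0) // Generator particles are not rendered
  {
    vec3 vPosOld = gl_in[0].gl_Position.xyz;
    float fSize = fSizePass[0];
    mat4 mVP = matrices.mProj*matrices.mView;
    vec4 vColor = vec4(vColorPass[0], fLifeTimePass[0]); // Remaining lifetime serves as alpha
    // Outputs are undefined after EmitVertex(), so each of them is written before every emit
    vColorPart = vColor;
    vec3 vPos = vPosOld+(-vQuad1-vQuad2)*fSize;
    vTexCoord = vec2(0.0, 0.0);
    gl_Position = mVP*vec4(vPos, 1.0);
    EmitVertex();
    vColorPart = vColor;
    vPos = vPosOld+(-vQuad1+vQuad2)*fSize;
    vTexCoord = vec2(0.0, 1.0);
    gl_Position = mVP*vec4(vPos, 1.0);
    EmitVertex();
    vColorPart = vColor;
    vPos = vPosOld+(vQuad1-vQuad2)*fSize;
    vTexCoord = vec2(1.0, 0.0);
    gl_Position = mVP*vec4(vPos, 1.0);
    EmitVertex();
    vColorPart = vColor;
    vPos = vPosOld+(vQuad1+vQuad2)*fSize;
    vTexCoord = vec2(1.0, 1.0);
    gl_Position = mVP*vec4(vPos, 1.0);
    EmitVertex();
    EndPrimitive();
  }
}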
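The corner arithmetic of the shader above can be tried out on the CPU. This sketch uses hypothetical names (P3, QuadVertex, BuildBillboard) and leaves out the projection; it only computes the four world-space corners and their texture coordinates in the same triangle-strip order.

```cpp
#include <array>

// Hypothetical minimal types, for illustration only.
struct P3 { float x, y, z; };
struct QuadVertex { P3 position; float u, v; };

// Builds the four corners of a billboard around center, in the same order
// as the shader emits them: (-,-), (-,+), (+,-), (+,+).
inline std::array<QuadVertex, 4> BuildBillboard(P3 center, P3 quad1, P3 quad2, float size)
{
    const float signs[4][2] = { {-1.0f, -1.0f}, {-1.0f, 1.0f}, {1.0f, -1.0f}, {1.0f, 1.0f} };
    std::array<QuadVertex, 4> out{};
    for (int i = 0; i < 4; i++)
    {
        const float a = signs[i][0]; // sign applied to quad1
        const float b = signs[i][1]; // sign applied to quad2
        out[i].position = P3{ center.x + (a*quad1.x + b*quad2.x)*size,
                              center.y + (a*quad1.y + b*quad2.y)*size,
                              center.z + (a*quad1.z + b*quad2.z)*size };
        out[i].u = (a + 1.0f) * 0.5f; // -1 maps to 0, +1 maps to 1
        out[i].v = (b + 1.0f) * 0.5f;
    }
    return out;
}
```

Because the corners lie at plus and minus size along both quad vectors, the side of the rendered quad is twice the particle's fSize when the quad vectors have unit length.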

The last stage is the fragment shader, but this one does really little. It just textures the quad and multiplies the result by the particle color:

#version 330

uniform sampler2D gSampler;

smooth in vec2 vTexCoord;
flat in vec4 vColorPart;

out vec4 FragColor;

void main()
{
  vec4 vTexColor = texture(gSampler, vTexCoord); // texture2D is deprecated in GLSL 3.30
  FragColor = vec4(vTexColor.xyz, 1.0)*vColorPart;
}

And that's it! All we need to do now is call the right OpenGL commands to draw from the buffer that transform feedback has filled before. This is all done in the RenderParticles function:

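void CParticleSystemTransformFeedback::RenderParticles()
{
   glEnable(GL_BLEND);
   glBlendFunc(GL_SRC_ALPHA, GL_ONE);
   glDepthMask(0); // Particles are depth-tested, but they don't write depth

   glDisable(GL_RASTERIZER_DISCARD); // The update pass left rasterization disabled

   // (the rendering program must be made current here, before its uniforms are set)
   spRenderParticles.SetUniform("matrices.mProj", &matProjection);
   spRenderParticles.SetUniform("matrices.mView", &matView);
   spRenderParticles.SetUniform("vQuad1", &vQuad1);
   spRenderParticles.SetUniform("vQuad2", &vQuad2);
   spRenderParticles.SetUniform("gSampler", 0);

   glBindVertexArray(uiVAO[iCurReadBuffer]);
   glDisableVertexAttribArray(1); // Disable velocity, because we don't need it for rendering

   glDrawArrays(GL_POINTS, 0, iNumParticles);

   glDepthMask(1);
   glDisable(GL_BLEND);
}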


There are a few things worth mentioning. The first is disabling writes to the depth buffer. We simply don't want particles to overwrite depth, as it would do not-so-nice things (try removing glDepthMask to see). For this reason, in the final scene we render particles after we have rendered everything else. Then we also turn on blending. Setting uniforms and binding the correct VAO are the last things we need to do before calling glDrawArrays(GL_POINTS, 0, iNumParticles). And that's all!
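The blend function glBlendFunc(GL_SRC_ALPHA, GL_ONE) gives additive blending: the incoming color is multiplied by its alpha and added to what is already in the framebuffer. Here is a sketch of that arithmetic for a single pixel, assuming the default blend equation (GL_FUNC_ADD) and a normalized fixed-point color buffer, where values are clamped to the range 0 to 1. Rgb and BlendAdditive are hypothetical names for this illustration.

```cpp
#include <algorithm>

// Hypothetical color type, for illustration only.
struct Rgb { float r, g, b; };

// result = src * srcAlpha + dst * 1, with inputs and result clamped to [0, 1].
inline Rgb BlendAdditive(Rgb src, float srcAlpha, Rgb dst)
{
    auto clamp01 = [](float v) { return std::min(1.0f, std::max(0.0f, v)); };
    const float a = clamp01(srcAlpha); // the fragment output is clamped before blending
    return { clamp01(clamp01(src.r) * a + dst.r),
             clamp01(clamp01(src.g) * a + dst.g),
             clamp01(clamp01(src.b) * a + dst.b) };
}
```

Since the shaders put the remaining lifetime into the alpha channel, a particle under these assumptions is drawn at full strength while it has more than one second left and fades out during its last second. Additive blending does not depend on the order in which particles are drawn, so no sorting is needed.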

Last few things

In the InitScene function, there is the particle system initialization code:

void InitScene(LPVOID lpParam)
{
   // ...

   psMainParticleSystem.InitalizeParticleSystem();

   psMainParticleSystem.SetGeneratorProperties(
      glm::vec3(-10.0f, 17.5f, 0.0f), // Where the particles are generated
      glm::vec3(-5, 0, -5), // Minimal velocity
      glm::vec3(5, 20, 5), // Maximal velocity
      glm::vec3(0, -5, 0), // Gravity force applied to particles
      glm::vec3(0.0f, 0.5f, 1.0f), // Color (light blue)
      1.5f, // Minimum lifetime in seconds
      3.0f, // Maximum lifetime in seconds
      0.75f, // Rendered size
      0.02f, // Spawn every 0.02 seconds
      30); // And spawn 30 particles

   // ...

In the RenderScene function, updating and rendering particles is just a matter of a few function calls:

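void RenderScene(LPVOID lpParam)
{
   // ...

   tTextures[6].BindTexture(); // Bind particle texture

   psMainParticleSystem.SetMatrices(oglControl->GetProjectionMatrix(), cCamera.vEye, cCamera.vView, cCamera.vUp);

   // fTimePassed stands for the frame time in seconds; the original argument is missing from this listing
   psMainParticleSystem.UpdateParticles(fTimePassed);
   psMainParticleSystem.RenderParticles();

   // ...
}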

And that's really all the important code you need. I haven't explained yet why I have max_vertices in the geometry shader set to 40. The thing is that we cannot emit an arbitrary number of vertices. The actual number we can emit depends on the GPU; newer GPUs can emit more vertices at once. It's because of safety issues, I guess. If every vertex could output another 1000 vertices, well, these things could grow exponentially. The limits can be queried with glGetIntegerv using GL_MAX_GEOMETRY_OUTPUT_VERTICES and GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS. I chose the number 40 because it's pretty safe and works on older cards (I tested it with an AMD HD5870). You can change this number to whatever you want, but the GPU will emit only as many vertices as it can. And if you would like to emit more stuff (for example, the maximum emitted per vertex is 50 and you want to emit 100), you can simply add more generator particles! So there's a pretty easy way to overcome this.
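How many generator particles would be needed for a given burst size is simple arithmetic. The helper below is a hypothetical sketch (GeneratorsNeeded is not part of the tutorial's code); it assumes that every generator also emits itself, as in the update geometry shader, so one of the max_vertices slots is already taken.

```cpp
// Hypothetical helper: how many generator particles are needed to spawn
// `desired` new particles per burst when the geometry shader declares
// layout(max_vertices = maxVertices). One emitted vertex is the generator
// itself, so each generator can add at most maxVertices - 1 particles.
inline int GeneratorsNeeded(int desired, int maxVertices)
{
    const int perGenerator = maxVertices - 1;
    if (desired <= 0 || perGenerator <= 0) return 0; // nothing to spawn, or no room to spawn
    return (desired + perGenerator - 1) / perGenerator; // ceiling division
}
```

With max_vertices = 40, one generator is enough for up to 39 new particles per burst, and 100 particles would need three generators. Keep in mind that all generators would read the same vRandomSeed uniform, so some per-generator variation of the seed would be needed to keep them from spawning identical particles.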

As this was quite a lot of stuff, here is a summary of all the new and important functions that appeared in this tutorial:

glTransformFeedbackVaryings // Tells OpenGL which output variables transform feedback should record

glGenTransformFeedbacks // Generates transform feedback object

glBindTransformFeedback // Binds a transform feedback object; if you bind 0, you cancel all transform feedbacks

glBindBufferBase // Tells where to store the results of transform feedback

glGenQueries // Generates a general query object; in our case we use it to determine the number of emitted particles

glBeginTransformFeedback // Starts recording of outputted geometry

glEndTransformFeedback // Ends recording of outputted geometry

glBeginQuery // Starts query, in our case we call it with parameter GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN

glGetQueryObjectiv // Gets query result, in our case number of emitted particles

glEnable(GL_RASTERIZER_DISCARD) // Disables rasterization completely


This is what the result looks like:

I hope you guys enjoyed this tutorial and learned a lot from it. This one was pretty long, because the topic isn't that straightforward. If you want to read an extended version of this tutorial, I have written a Bachelor's thesis (the thesis for an undergraduate university degree) about a transform feedback particle system. It explains some more things, and I also created the Blaze Particle System Library, which I've been using in my projects. The work can be found here:

OpenGL Library For Particle Systems

Don't be discouraged by the first pages, which had to be written in Slovak. The later pages are written in English.

So that's it for today! Let your head recover after this tutorial, as it may have had some headaches after reading so much text and code.

(*1) - I said that you can't use the same buffer for reading and writing. However, when I was working on that thesis, I tried using only one buffer. On AMD cards the particle system worked normally, but on nVidia cards it didn't. So it's simply better to use two buffers: it works on both GPU types, and it makes more sense to use two than to read from and write to the same buffer at once.
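The two-buffer idea can be illustrated on the CPU, without any OpenGL at all. The following sketch is only an analogy, not the tutorial's implementation: Particle, updatePass and PingPong are hypothetical names, the particle is reduced to one dimension, and the returned count plays the role of the GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified, hypothetical particle (height only).
struct Particle {
    float y;
    float velocity;
    float lifetime; // seconds left to live
};

// CPU stand-in for one update pass: reads from one buffer and records
// the surviving particles into a different buffer.
std::size_t updatePass(const std::vector<Particle>& readBuf,
                       std::vector<Particle>& writeBuf,
                       float dt, float gravity) {
    assert(&readBuf != &writeBuf); // never read and write the same buffer
    writeBuf.clear();
    for (const Particle& p : readBuf) {
        Particle next = p;
        next.velocity += gravity * dt;
        next.y += next.velocity * dt;
        next.lifetime -= dt;
        if (next.lifetime > 0.0f) {
            writeBuf.push_back(next); // dead particles are simply not emitted
        }
    }
    return writeBuf.size(); // analogous to the "primitives written" query
}

// Two buffers that swap roles after every update.
struct PingPong {
    std::vector<Particle> buffers[2];
    int readIndex = 0;

    std::size_t step(float dt, float gravity) {
        std::size_t written =
            updatePass(buffers[readIndex], buffers[1 - readIndex], dt, gravity);
        readIndex = 1 - readIndex; // what was written is read next frame
        return written;
    }

    const std::vector<Particle>& current() const { return buffers[readIndex]; }
};
```

After each step, the buffer that was just written becomes the one that is read (and rendered) next, so no pass ever reads from and writes to the same storage.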

AndreyOGL_D3D (geecandrey@gmail.com) on 08.02.2017 10:19:05
Fix shaders:(terrain.frag,particles_render.frag,font2D.frag , ortho2D.frag,main_shader.frag)
AndreyOGL_D3D (geecandrey@gmail.com) on 08.02.2017 10:08:27
Doesn't work on Intel HD 4000 (OpenGL 4.0 support)
I have some compilation errors for GLSL shaders.
Misu on 29.07.2014 22:25:33
Hi, thank you very much for this work, It's very easy to understand.

I think there is a mistake in the tutorial, when you show us the paticles_update.geom file,

"if(iGenerate == 1) " that condition is not present in the real file in the /data folder.

I was trying to find where you set this uniform in your code, but wasn't able to find that.

Michal Bubnar (michalbb1@gmail.com) on 31.07.2014 11:10:17
True that! That iGenerate parameter was present in the Blaze Particle System Library, all I forgot is to update this shader code in the article to a newer one, after polishing it. Thanks, it's repaired now
Sebastian on 16.07.2014 11:07:29
It works well.

Es funktioniert gut.
Michal Bubnar (michalbb1@gmail.com) on 14.07.2014 16:51:42
I've been looking through the issue with not rendering particles on nVidias today and it is indeed strange, because particles are generated correctly with their properties, I mapped buffer to see values and they were correct... I still didn't find it, but I will look onto it tomorrow, because today I have band practice till the evening, and I will be probably exhausted then.

But it really doesn't make much sense, I compared it with blaze code, and I can't see any significant difference that can cause this kind of problem
Michal Bubnar (michalbb1@gmail.com) on 15.07.2014 13:34:22
Sebastian on 05.07.2014 09:37:55
I do not see particles. The scene renders correctly. Nvidia 330gt latest drivers, Windows 7 64bit.
Sebastian on 05.07.2014 09:40:33
I'll use two buffers.
Michal Bubnar (michalbb1@gmail.com) on 05.07.2014 12:36:31
Hm, this is strange, I will debug it on some nVidia card. And if you run my demo The Enchanted Forest, do you see the particles?
Nick on 05.07.2014 16:31:05
I can't see the particles too. However the Enchanted Forest demo works fine.
Michal Bubnar (michalbb1@gmail.com) on 05.07.2014 17:12:47
All you guys have nVidia cards? I have changed a little bit of code since Enchanted Forest, but I'll be able to use nVidia card on Monday, when I am at workplace - I don't have nVidia card in my own computer.

When I find out what makes it not work on nVidias, I'll let you all know.
Sebastian on 05.07.2014 17:44:00
Yes, Enchanted Forest works fine.

GPUanalizer output

ERROR: error(#160) Cannot convert from 'highp 3-component vector of float' to 'default out array of 3-component vector of float'

Sebastian on 05.07.2014 17:49:53

Change code like this
in vec3 vPositionPass[];

To this

in vec3 vPositionPass[1];

Now the GPU analyzer ok, but still can not see particles.
Michal Bubnar (michalbb1@gmail.com) on 05.07.2014 21:18:10
So it seems like nVidia has some compilation problems. vPositionPass[1] cannot work, because input geometry are points and points have only 1 vertex, therefore we must use vPositionPass[0]. It's kinda strange, but I will look at it on Monday as I said
dima (dmitry.trok@gmail.com) on 02.07.2014 11:45:03
Great, thx you !!
deadmau5 (deadmau5@mau5trap.com) on 01.07.2014 18:53:31
dat kick doe