Simple GPU Path Tracing, Part 8 : Denoising

 

We now have a decent path tracer that's able to render a variety of materials, so now is a good time to get some actually good imagery out of it. To do that we can either accumulate a very large number of samples, or we can use a denoiser. A denoiser is an image processing tool that removes the noise left in path traced images when too few samples have been accumulated.

 


Here's the code for this post. 


There are 2 main kinds of denoisers : 

* Neural network based: a neural network is trained to denoise images. I don't know much about that topic, but here's a series of blog posts that is very helpful for understanding the basics of how it works.

* Filter based: image processing filters are applied to the noisy image, as in the Spatiotemporal Variance-Guided Filtering (SVGF) approach.

In our case, we will be using a neural network approach, using the Open Image Denoise (OIDN) library from Intel. The good thing about this library is that there's a GPU implementation, so we can denoise CUDA buffers directly from our path tracer output.

The only drawback is that it's not temporally stable: each frame is denoised on its own, without using previous frames, so moving the camera can lead to visual artifacts.

But it's good enough for our simple purpose, so let's add it in !

I've added it in the vendor folder, and added the library and includes to the project folders.

The process will be a bit different for CUDA and OpenGL, as we'll be passing data around on the GPU between the 2 APIs.

Cuda

First, we'll add some new fields into our app class :
    // Denoiser
    oidn::DeviceRef Device;
    oidn::FilterRef Filter;
    cudaStream_t Stream;
    std::shared_ptr<bufferCu> DenoisedBuffer;
    bool Denoised=false;
    bool DoDenoise=true;
 
Then, in Init(), we call a new CreateOIDNFilter() function that creates the denoising filter. Here's the function:

void application::CreateOIDNFilter()
{
    cudaStreamCreate(&Stream);
    Device = oidn::newCUDADevice(0, Stream);
    Device.commit();
    DenoisedBuffer = std::make_shared<bufferCu>(RenderWidth * RenderHeight * 4 * sizeof(float));

    const char* errorMessage;
    if (Device.getError(errorMessage) != oidn::Error::None)
        std::cout << "Error: " << errorMessage << std::endl;

    Filter = Device.newFilter("RT");
    Filter.setImage("color",  RenderBuffer->Data,
            oidn::Format::Float3, RenderWidth, RenderHeight, 0,
            sizeof(glm::vec4), sizeof(glm::vec4) * RenderWidth);
    Filter.setImage("output", DenoisedBuffer->Data,
            oidn::Format::Float3, RenderWidth, RenderHeight, 0,
            sizeof(glm::vec4), sizeof(glm::vec4) * RenderWidth);
    Filter.set("hdr", true);               
    Filter.set("quality", OIDN_QUALITY_BALANCED);        
    Filter.commit();
}
We first create a CUDA stream, and then create an OIDN device that uses that stream.
We then create a new CUDA buffer that will contain our denoised image.
We can then create the filter, using the "RT" filter, which is the generic ray tracing denoiser (refer to the docs for more info).
We only use a colour input, which is our noisy image. Both images are declared as Float3 with a pixel stride of sizeof(glm::vec4), so the filter reads and writes the RGB channels of our vec4 buffers and skips the alpha channel. We set "hdr" to true, because our floating point values can be higher than 1.
We set the quality to Balanced, because we want it to run fast so we can still interact with the scene.
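To make the stride arguments of setImage more concrete, here is a small CPU-only sketch of how a strided image description maps a pixel coordinate to a byte address. It is not OIDN code: vec4 is a stand-in for glm::vec4, and rgb, ReadFloat3 and MakeTestImage are hypothetical helpers written only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for glm::vec4: four tightly packed floats (16 bytes).
struct vec4 { float x, y, z, w; };

// The three channels a Float3 image exposes.
struct rgb { float r, g, b; };

// Reads the colour of pixel (X, Y) from a strided image:
// address = base + ByteOffset + Y * RowByteStride + X * PixelByteStride.
rgb ReadFloat3(const void* Data, std::size_t X, std::size_t Y,
               std::size_t ByteOffset,
               std::size_t PixelByteStride,
               std::size_t RowByteStride)
{
    // A pixel must be at least as large as the 3 floats we read from it.
    assert(PixelByteStride >= sizeof(rgb));

    const unsigned char* Bytes = static_cast<const unsigned char*>(Data);
    rgb Result;
    std::memcpy(&Result,
                Bytes + ByteOffset + Y * RowByteStride + X * PixelByteStride,
                sizeof(rgb));
    return Result;
}

// Builds a test image where pixel i holds (i, i + 0.25, i + 0.5, 1).
std::vector<vec4> MakeTestImage(std::size_t Width, std::size_t Height)
{
    std::vector<vec4> Image(Width * Height);
    for (std::size_t i = 0; i < Image.size(); i++)
    {
        Image[i] = { float(i), float(i) + 0.25f, float(i) + 0.5f, 1.0f };
    }
    return Image;
}
```

With a 4x2 image, a pixel stride of sizeof(vec4) and a row stride of sizeof(vec4) * 4, reading pixel (2, 1) lands on element 6 of the buffer and returns its first three floats; the fourth float (alpha) is simply stepped over.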

We also have to change ResizeRenderTextures to account for this:
    cudaDeviceSynchronize();
    TonemapTexture = std::make_shared<textureGL>(RenderWidth, RenderHeight, 4);
    RenderBuffer = std::make_shared<bufferCu>(RenderWidth * RenderHeight * 4 * sizeof(float));
    TonemapBuffer = std::make_shared<bufferCu>(RenderWidth * RenderHeight * 4 * sizeof(float));
    RenderTextureMapping = CreateMapping(TonemapTexture);

    DenoisedBuffer = std::make_shared<bufferCu>(RenderWidth * RenderHeight 
                      * 4 * sizeof(float));
    Filter.setImage("color",  RenderBuffer->Data,   oidn::Format::Float3, 
                        RenderWidth, RenderHeight, 0, sizeof(glm::vec4), 
                        sizeof(glm::vec4) * RenderWidth);
    Filter.setImage("output", DenoisedBuffer->Data, oidn::Format::Float3, 
                        RenderWidth, RenderHeight, 0, sizeof(glm::vec4), 
                        sizeof(glm::vec4) * RenderWidth);
    Filter.commit();
 
Great, now we have all the objects ready for denoising our images.
We'll create a function called Denoise() that we will call after tracing, and before tonemapping.
Here's the content of Denoise() : 
void application::Denoise()
{
    Filter.execute();
    Denoised = true;
}
 
Very simple! But this will change slightly later when we do the OpenGL version.

Inside of the Trace() function, we call that function : 

    if(DoDenoise && !Denoised)
    {
        Denoise();
    }
DoDenoise defines whether we want to use the denoiser. We only call Denoise() if the image is not already denoised, so we don't denoise the same image multiple times.
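The interplay between the two flags can be sketched in isolation. This is only an illustration: denoiseState, AfterTrace and OnRenderBufferChanged are hypothetical names, and the assumption that Denoised is reset whenever the render buffer changes (new sample, camera move, resize) is mine, not something taken from the code above.

```cpp
#include <cassert>

// Hypothetical stand-in for the denoising-related state of the application.
struct denoiseState
{
    bool DoDenoise = true;  // user toggle
    bool Denoised = false;  // true once the current image has been denoised
    int FilterRuns = 0;     // counts how often the (expensive) filter ran

    // Assumption: called whenever the content of the render buffer changes.
    void OnRenderBufferChanged() { Denoised = false; }

    // Stand-in for Denoise(): the real code calls Filter.execute() here.
    void Denoise()
    {
        // The guard in AfterTrace() should make a second run impossible.
        assert(!Denoised);
        FilterRuns++;
        Denoised = true;
    }

    // The check performed inside Trace().
    void AfterTrace()
    {
        if (DoDenoise && !Denoised)
        {
            Denoise();
        }
    }
};
```

Calling AfterTrace() twice in a row runs the filter once; it only runs again after OnRenderBufferChanged(), and never while DoDenoise is false.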

We also need to change the call to the Tonemap kernel to take the denoised image as input, if it exists : 
TonemapKernel<<<gridSize, blockSize>>>(Denoised ? (glm::vec4*)DenoisedBuffer->Data :
        (glm::vec4*)RenderBuffer->Data, (glm::vec4*)TonemapBuffer->Data,
        RenderWidth, RenderHeight);

And that's all, we now get some nicely denoised images out : 
 
 

OpenGL

Ok, let's tackle the OpenGL implementation now. It will be a bit more complex: OIDN's GPU path works on CUDA memory, not on OpenGL textures, so we'll have to do some interop between the 2 APIs.

We'll have 3 more gpu objects : 
    std::shared_ptr<textureGL> DenoisedTexture;
    std::shared_ptr<cudaTextureMapping> RenderMapping;
    std::shared_ptr<cudaTextureMapping> DenoiseMapping;

DenoisedTexture is the texture that will be passed to the tonemap shader instead of the regular render output.
The 2 cudaTextureMapping objects give us GL/CUDA interop for the denoised texture and the render texture.

Here's the InitGPUObjects function now for GL : 
    TonemapTexture = std::make_shared<textureGL>(RenderWidth, RenderHeight, 4);    
    RenderTexture = std::make_shared<textureGL>(RenderWidth, RenderHeight, 4);
    DenoisedTexture = std::make_shared<textureGL>(RenderWidth, RenderHeight, 4);    

    DenoiseMapping = CreateMapping(DenoisedTexture, true);    
    RenderMapping = CreateMapping(RenderTexture);    

    TracingParamsBuffer = std::make_shared<uniformBufferGL>(sizeof(tracingParameters), &Params);
    MaterialBuffer = std::make_shared<bufferGL>(sizeof(material) * Scene->Materials.size(), Scene->Materials.data());
    LightsBuffer = std::make_shared<bufferGL>(sizeof(lights), &Lights);

For the OIDN filter creation, here's the code:
    Filter.setImage("color",  RenderMapping->CudaBuffer,   oidn::Format::Float3, 
            RenderWidth, RenderHeight, 0, sizeof(glm::vec4), 
            sizeof(glm::vec4) * RenderWidth);
    Filter.setImage("output", DenoisedBuffer->Data, oidn::Format::Float3, 
            RenderWidth, RenderHeight, 0, sizeof(glm::vec4), 
            sizeof(glm::vec4) * RenderWidth);

The input is the mapping of the render texture, and the output is our denoised buffer.
Now, here's the Denoise() function code : 
    GLTexToCuBuffer(RenderMapping->CudaBuffer, RenderMapping->TexObj,  
                    RenderWidth, RenderHeight);
    Filter.execute();
    cudaMemcpyToArray(DenoiseMapping->CudaTextureArray, 0, 0, DenoisedBuffer->Data,  
                      RenderWidth * RenderHeight * sizeof(glm::vec4),  
                      cudaMemcpyDeviceToDevice);

We first have to copy the content of the GL render texture to the CUDA buffer, using the GLTexToCuBuffer function. This function runs a CUDA kernel that reads every texel of the texture and writes it into a linear buffer.
We can then execute the filter. After that, we still have to copy the result of the filter into the CUDA array of our DenoiseMapping object, so that OpenGL can render it.
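The per-thread logic of such a texture-to-buffer copy can be sketched on the CPU. This is not the actual GLTexToCuBuffer kernel: a std::vector stands in for the texture (the real kernel would do a texture fetch), vec4 stands in for glm::vec4, and CopyTexel and EmulateTexToBuffer are hypothetical names. What it shows is the usual pattern of rounding the grid size up and letting threads outside the image return early.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for glm::vec4.
struct vec4 { float x, y, z, w; };

// What one GPU thread would do: copy a single texel into the linear buffer.
void CopyTexel(std::vector<vec4>& Buffer, const std::vector<vec4>& Texture,
               int X, int Y, int Width, int Height)
{
    // Threads that fall past the image edge do nothing.
    if (X >= Width || Y >= Height) return;

    std::size_t Index = std::size_t(Y) * std::size_t(Width) + std::size_t(X);
    Buffer[Index] = Texture[Index];
}

// Emulates a 2D launch of BlockSize x BlockSize thread blocks on the CPU.
// Returns the number of threads launched, which can exceed Width * Height.
int EmulateTexToBuffer(std::vector<vec4>& Buffer, const std::vector<vec4>& Texture,
                       int Width, int Height, int BlockSize)
{
    assert(BlockSize > 0);
    assert(Buffer.size() == Texture.size());

    // Round up so that partially covered blocks at the edges are still launched.
    int GridX = (Width + BlockSize - 1) / BlockSize;
    int GridY = (Height + BlockSize - 1) / BlockSize;

    int Launched = 0;
    for (int BlockY = 0; BlockY < GridY; BlockY++)
    for (int BlockX = 0; BlockX < GridX; BlockX++)
    for (int ThreadY = 0; ThreadY < BlockSize; ThreadY++)
    for (int ThreadX = 0; ThreadX < BlockSize; ThreadX++)
    {
        // Same index computation as blockIdx * blockDim + threadIdx.
        int X = BlockX * BlockSize + ThreadX;
        int Y = BlockY * BlockSize + ThreadY;
        CopyTexel(Buffer, Texture, X, Y, Width, Height);
        Launched++;
    }
    return Launched;
}
```

For a 5x3 image and 4x4 blocks, the grid is 2x1 blocks, so 32 threads are launched for 15 pixels; the bounds check is what keeps the extra 17 threads from writing outside the buffer.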

Then, when running the tonemap shader, we pass the DenoisedTexture object if a denoised output exists : 
TonemapShader->SetTexture(0, Denoised ? DenoisedTexture->TextureID : RenderTexture->TextureID, GL_READ_WRITE);

And we're done with denoising !

