2011-02-04

Logarithmic Depth Buffer
========================

Background
----------

Brano Kemen has written a couple of good blog postings about
z-buffers for planetary-scale rendering.  He explains how to use a
log() function to map view z into depth buffer values, which
distributes resolution more evenly than the commonly available z- and
w-buffers.

http://outerra.blogspot.com/search/label/depth%20buffer

Slightly earlier versions appear on Gamasutra as well:
http://www.gamasutra.com/blogs/BranoKemen/20090812/2725/Logarithmic_Depth_Buffer.php
http://www.gamasutra.com/blogs/BranoKemen/20091231/3972/Floating_Point_Depth_Buffers.php

He also explains that you can get a similar benefit to a logarithmic
z buffer by using a floating-point z-buffer and running it backwards,
i.e. map the near values to larger depth-buffer values, and use
glDepthFunc(GL_GEQUAL) instead of glDepthFunc(GL_LEQUAL) (or the DX
equivalent).

Humus also has an earlier article that advocates running the
floating-point depth buffer backwards:
http://www.humus.name/index.php?ID=255

Kemen's formula for the z buffer is:

f(z) = log(C*z + 1) / log(C*Far + 1) * K

Where K = maximum value in depth buffer and C is a tuning parameter.
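
As a rough illustration, the formula can be written out directly.
This is a minimal Python sketch, not Kemen's code; the function name
kemen_depth and the default values for C and K are assumptions chosen
for the example.

```python
import math

def kemen_depth(z, far, C=1.0, K=65535):
    """Evaluate log(C*z + 1) / log(C*far + 1) * K for a view-space depth z.

    The name and the defaults (C=1.0, K=65535 for a 16-bit buffer) are
    illustrative assumptions, not values taken from Kemen's posts.
    """
    return math.log(C * z + 1.0) / math.log(C * far + 1.0) * K

if __name__ == "__main__":
    far = 100e6
    for C in (0.001, 1.0):
        # Depth value reached at z = 1; a larger C spends more of the
        # buffer's range on depths close to the camera.
        print(C, kemen_depth(1.0, far, C=C))
```

Note that this form maps z = 0 to 0 and z = Far to K, so it has no
explicit near-plane term.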

My Slight Variation
-------------------

I studied the same problem and came to similar conclusions, but
without the tuning parameter.  Here's the function I derived:

  assume an integer depth buffer
  k is the number of bits in the depth buffer
  K = 2^k - 1
  z is the view-space z value for the pixel in question
  zn is the position of the near-clip plane
  zf is the position of the far-clip plane
  f(z) = integer value to write into depth buffer

  f(z) = K * log(z / zn) / log(zf / zn)

This function gives constant relative precision.  I.e.

 g(i) = the view z value that maps to depth-buffer value i
      = zn * exp((i / K) * log(zf / zn))

 delta(z) = how far (in view space) to the next discrete depth value
          = g(f(z) + 1) - z

 rel(z) = relative precision
        = delta(z) / z
        = (zn * exp((K * log(z/zn) / log(zf/zn) + 1) / K * log(zf/zn)) - z) / z
        = (z * exp(log(zf/zn) / K) - z) / z
        = (zf / zn) ^ (1 / K) - 1
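
The derivation can be checked numerically.  The following is a
minimal Python sketch of f, g and rel as defined above; passing zn,
zf and K as arguments (and the helper name rel_closed_form) is my own
choice for the example, and f is left real-valued rather than rounded
to an integer.

```python
import math

def f(z, zn, zf, K):
    """Depth-buffer value for view-space z (real-valued; round to store)."""
    return K * math.log(z / zn) / math.log(zf / zn)

def g(i, zn, zf, K):
    """View-space z that maps to depth-buffer value i (inverse of f)."""
    return zn * math.exp((i / K) * math.log(zf / zn))

def rel(z, zn, zf, K):
    """Relative distance from z to the next discrete depth value."""
    return (g(f(z, zn, zf, K) + 1, zn, zf, K) - z) / z

def rel_closed_form(zn, zf, K):
    """The simplified expression: (zf / zn) ^ (1 / K) - 1."""
    return (zf / zn) ** (1.0 / K) - 1.0

if __name__ == "__main__":
    zn, zf, K = 1.0, 100e6, 2 ** 16 - 1
    for z in (1.0, 10.0, 1e3, 1e6):
        # rel(z) should not depend on z, up to floating-point noise.
        print(z, rel(z, zn, zf, K), rel_closed_form(zn, zf, K))
```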


Implications
------------

The function f(z) is designed so that rel(z) is constant!  In some
sense this is the optimal way to distribute z-buffer precision.  And
in practice, it is DRASTICALLY better than an ordinary z or w buffer.
The z-buffer piles up nearly all the resolution right up against the
near plane, at the expense of everything else, and the w buffer
distributes things linearly in view space, which is very bad for
close/mid objects.

For example, using the log formula we can use a 16-bit z-buffer to
draw planetary-scale scenes!  Let's make the world coords in meters,
set zn to 1 meter and zf to 100M meters (~16 Earth radii):

 let zn = 1
 let zf = 100e6
 f(zn) == 0
 f(zf) == 65535
 g(0) = 1
 g(1) = 1.000281
 g(2) = 1.000562
 ...
 g(65534) = 99971895.79
 g(65535) = 100000000.0

Relative precision is ~ 0.000281, meaning at a depth of z, the depth
buffer can resolve a difference in depth of (z * 0.000281).

So for some values of z:

     z         precision
-----------   -----------
1 meter         0.28 mm
100 meters      2.8 cm
100 km          28 m
100,000 km      28 km

That's sort of on the edge of what is easy to deal with, although
compared to a regular 16-bit z-buffer it's a miracle.

If that's not good enough, go to 24 bits.  Same zf and zn, at 24 bits,
gives a relative precision of about 0.0000011.  At zn = 1 meter,
precision is about a micron.  At zf = 100M meters, precision is about
110 meters.  That should be enough for any imaginable system.
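
The figures in this section can be recomputed for any near/far range
and bit depth.  A minimal Python sketch, assuming the closed form
rel = (zf / zn) ^ (1 / K) - 1 with K = 2^bits - 1; the function names
are made up for the example.

```python
import math

def relative_precision(zn, zf, bits):
    """(zf / zn) ^ (1 / K) - 1, with K = 2^bits - 1."""
    K = 2 ** bits - 1
    # expm1 keeps precision when the result is very close to zero.
    return math.expm1(math.log(zf / zn) / K)

def resolvable_step(z, zn, zf, bits):
    """Smallest depth difference (in world units) resolvable at depth z."""
    return z * relative_precision(zn, zf, bits)

if __name__ == "__main__":
    zn, zf = 1.0, 100e6
    for bits in (16, 24):
        print(bits, "bits: rel =", relative_precision(zn, zf, bits))
        for z in (1.0, 100.0, 100e3, 100e6):
            print("  z =", z, "m -> step =",
                  resolvable_step(z, zn, zf, bits), "m")
```

With bits = 16 this reproduces the 0.000281 figure and the table
above, up to rounding.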

Caveats
-------

The main problem is that this function is not currently built into
hardware.  You can simulate it in a pixel shader using gl_FragDepth or
DX equivalent, although that is not available in OpenGL ES 2.0
(i.e. mobile chips and WebGL).  Also, even where it is available, it
will disable hardware speedups like early z-rejection and hierarchical
z test.

You can also sort of simulate it in a vertex shader, but honestly I
tried it and it sucks, because interpolated values in the middle of
triangles are way off, and you end up with crazy depth-sorting errors.
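
One way to see the size of the interpolation error, as a hedged
sketch: assume the vertex shader writes the log depth at each vertex
and the rasterizer interpolates that value linearly in screen space.
These assumptions and all the function names below are mine, not taken
from a specific implementation.  Along an edge, the true view z at the
screen-space midpoint comes from linear interpolation of 1/z, while
the interpolated log depth corresponds to the geometric mean of the
endpoint depths.

```python
import math

def log_depth(z, zn, zf):
    """Normalized log depth in [0, 1]: log(z/zn) / log(zf/zn)."""
    return math.log(z / zn) / math.log(zf / zn)

def inverse_log_depth(d, zn, zf):
    """View z that corresponds to normalized log depth d."""
    return zn * math.exp(d * math.log(zf / zn))

def midpoint_depths(z0, z1, zn, zf):
    """Return (true_z, implied_z) at the screen-space midpoint of an edge
    whose endpoints are at view depths z0 and z1."""
    # Under perspective projection, 1/z varies linearly in screen space.
    true_z = 2.0 / (1.0 / z0 + 1.0 / z1)
    # Assumed behavior: per-vertex log depth, interpolated linearly.
    d_mid = 0.5 * (log_depth(z0, zn, zf) + log_depth(z1, zn, zf))
    implied_z = inverse_log_depth(d_mid, zn, zf)
    return true_z, implied_z

if __name__ == "__main__":
    print(midpoint_depths(1.0, 1000.0, 1.0, 100e6))
```

For an edge running from z = 1 to z = 1000, the true depth at the
screen midpoint is about 2, but the interpolated value corresponds to
a depth of about 31.6.  The mismatch shrinks as triangles cover a
smaller depth range.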

The best thing would be to get the function added to hardware.

Compared to floating-point Z
----------------------------

As Kemen and Humus observe, running a floating point z buffer
backwards has a similar effect.  The advantage is, current high-end
hardware does support this, yay!  The disadvantage is, current low-end
hardware (i.e. mobile) does not.  Also, floats need more bandwidth for
a similar result, so if we could fix the hardware, we'd go for the
integer depth buffer using the log function.

Compared to conventional Z or W buffering
-----------------------------------------

No contest; those both totally suck.

Visualizer
----------

This page lets you interactively graph relative imprecision of several
depth functions with various parameters.  The code is embedded in the
page as Javascript so you can inspect it or change it.

http://tulrich.com/geekstuff/log_depth_buffer_vis.html

Some insights from the visualizer:

* There's sort of a zero-sum situation -- getting more precision in
  one part of the view depth means you will lose precision somewhere
  else.  No free lunch.

* Yet some functions are better than others.  I happen to think that
  worst-case relative imprecision is the ideal metric to minimize, and
  the functions behave differently with respect to this metric.

* Adding bits helps all of the functions.  With enough bits, even the
  bad ones can be fine in practice.

* Conventional Z and conventional W are both quite bad.  Z is really
  precise at the near plane and bad everywhere else, while W is the
  reverse.

* Conventional floating-point Z is really extra bad.  32-bit float z
  is (in the worst-case) no better than 24-bit fixed-point Z!

* Floating-point W and reverse floating point Z are both pretty good,
  and exact mirror images of each other.  I'm not sure if
  half-precision (16-bit) floating-point Z is common in GPUs, but if
  so, doing reverse half-precision Z could be a good option for many
  apps.

* The log(z/zn)/log(zf/zn) function beats everything else in all
  conditions (it's optimal for the metric).
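
The fixed-point bullets above can be roughly checked with a first-order
estimate: one depth code covers about 1 / (K * d'(z)) in view z, where
d(z) is the mapping to [0, 1].  This Python sketch is not the
visualizer's code; the particular Z mapping (0 at zn, 1 at zf) and all
function names are assumptions for the example, and floating-point
buffers are not covered.

```python
import math

def rel_z(z, zn, zf, K):
    """Conventional Z: d(z) = zf/(zf-zn) * (1 - zn/z)."""
    slope = zf * zn / ((zf - zn) * z * z)   # d'(z)
    return 1.0 / (K * z * slope)

def rel_w(z, zn, zf, K):
    """Conventional W: d(z) = (z - zn) / (zf - zn)."""
    slope = 1.0 / (zf - zn)                 # d'(z)
    return 1.0 / (K * z * slope)

def rel_log(z, zn, zf, K):
    """Log depth: d(z) = log(z/zn) / log(zf/zn)."""
    slope = 1.0 / (z * math.log(zf / zn))   # d'(z)
    return 1.0 / (K * z * slope)

def worst_case(rel_fn, zn, zf, K, samples=1000):
    """Largest relative imprecision over log-spaced depths in [zn, zf]."""
    ratio = zf / zn
    zs = [zn * ratio ** (i / (samples - 1)) for i in range(samples)]
    return max(rel_fn(z, zn, zf, K) for z in zs)

if __name__ == "__main__":
    zn, zf, K = 1.0, 100e6, 2 ** 16 - 1
    for name, fn in (("Z", rel_z), ("W", rel_w), ("log", rel_log)):
        print(name, worst_case(fn, zn, zf, K))
```

Under these assumptions Z and W share the same worst case,
(zf - zn) / (K * zn), reached at opposite ends of the range (Z at the
far plane, W at the near plane), while the log function stays at
log(zf / zn) / K everywhere.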

References
----------

Brano Kemen's 2009-08-12 posting on gamedev.net, which looks like a
slightly earlier version of the Gamasutra posting.  It has a good
graph of his log function near the origin.
http://www.gamedev.net/blog/715/entry-2001520-logarithmic-depth-buffer/

Steve Baker's "Learning to Love your Z-buffer" mentions "Floating
point or Logrithmic Z (these are pretty similar concepts
actually). This also tends to even out the distribution of bits over
the Z range - but without completely eliminating the improved
resolution close to the eye. SGI Infinite Reality machines do this -
and claim to get the equivelent *useful* Z precision using 15 bits as
a conventional machine gets with 24bits."  The article gives no date
or further details, but the Infinite Reality approach sounds similar,
and IR came out in 1996.
http://en.wikipedia.org/wiki/InfiniteReality (as of 2011-02-04)
doesn't say anything about this.