3 January 2009
Short String Optimization
Happy New Year!
Won Chun and I came up with a clever way to squeeze an extra byte of
storage into the C++ "short string optimization".
Quick review: a vanilla C++ string is a class that contains a string
length, a pointer to a memory buffer, and a buffer capacity. The
buffer is the sequence of bytes that make up the string, and is
compatible with C strings, meaning the last byte is 0. Normally, the
buffer is allocated on the heap, and is resized on demand when the
string is changed, so that the buffer capacity is always large enough
to hold the string. On a 32-bit machine, the C++ class instance
itself (not including the allocated buffer) normally is at least 12
bytes: 4 bytes each for length, buffer* and capacity.
The Short String Optimization adds a small byte buffer directly in the
string class. If the string length is short enough, the bytes are
stored directly in the string instance itself instead of allocated on
the heap. There are different ways of doing this, but one of the
better ways is to overlay the local buffer on top of the buffer* and
capacity fields that are only needed by the heap buffer. E.g.:
class string {
...
private:
union {
struct {
unsigned char local_length;
char local_buffer[15];
} local_data_;
struct {
unsigned char flag_value; // set to 0xFF when using heap_data_
char* buffer;
size_t length;
size_t capacity;
} heap_data_;
};
};
That gives sizeof(string)==16, the local buffer is 15 bytes, and the
max local string length is 14 (i.e. not counting the terminating 0 char).
(In practice, STLport and some other STL libs are less size-conscious
than that, and don't reuse fields to that extent.)
Anyway, we think strings should be as efficient as possible. So we
improve this slightly, expanding the local buffer to use the full 16
bytes. We do it like so:
class string {
...
private:
union {
struct {
unsigned char local_buffer[15];
unsigned char fifteen_minus_length;
} local_data_;
struct {
char* buffer;
size_t length;
size_t capacity;
int dummy;
} heap_data_;
};
};
Our clever twist is to store (15 - length) when using local data, and
put that byte at the end of the local struct. So, when length == 15,
(15 - length) == 0, and that means the fifteen_minus_length byte
serves double duty as both the length field and the terminating 0
byte! All 15 local_buffer[] bytes are used for string data! When not
using local_data, we fill fifteen_minus_length with a flag value like
0xFF, and the heap_data fields are active.
This trick is implemented and tested in tu_string here and it
seems to work fine. Also it's parameterized so that you can safely
try different sizes for the local buffer, and it should compile and
work correctly on 64-bit machines, etc.
Hot!
rants index:
current |
2008 archive |
2007 archive |
2006 archive |
2005 archive |
2004 archive |
2003 archive |
2002 archive