Bitfields

From HerzbubeWiki

This page illustrates how bit fields are laid out in memory in C++. All experiments documented on this page were made with Microsoft's Visual C++ compiler. Because the C and C++ standards leave most details of bit field layout to the implementation, other compilers may well produce different results; it would be interesting to compare them, and to check whether bit fields in C are managed in the same way as in C++.

Also see the Wikipedia article on bit fields.


Helper functions to visualize bit layout in memory

Summary

The helper functions presented in this section analyze the bit layout of a chunk of memory that is passed as the function argument. The mem2bitstring() function is used in the code snippets further down on this page when we look at concrete bit field examples. To follow those snippets you should at least know what the function does, even if you skip its source code.

  • mem2bits() returns a std::vector with bool elements, each element representing a bit in memory. Bytes are visited in the order of their addresses, and within each byte the bits are listed from the most significant to the least significant.
  • mem2bitstring() internally uses mem2bits(). It generates a string from the result of mem2bits(), where each bit that is set is represented by the digit "1" and each bit that is not set by the digit "0". The order of digits is the same as the order of elements returned by mem2bits().


mem2bits() source code

#include <vector>

// Memory layout: 01100101110010100110
//                ^      ^^      ^^  ^
// Bytes          +------++------++--+
// Vector index:  0      78      11  1
//                               56  9
std::vector<bool> mem2bits(void* pMem, int numberOfBits)
{
  std::vector<bool> bits;
  const int bitsPerByte = 8;
  int wholeBytes = numberOfBits / bitsPerByte;
  int trailingBits = numberOfBits % bitsPerByte;
  // Use unsigned char so that byte values >= 0x80 are never sign-extended
  unsigned char* pMemAsChar = static_cast<unsigned char*>(pMem);

  // The last iteration handles the trailing bits of a partial byte (if any).
  // Bytes are accessed by index so that the pointer is never moved past
  // the end of the memory chunk.
  for (int byteCounter = 0; byteCounter <= wholeBytes; ++byteCounter)
  {
    // Start with highest bit of a byte
    int bitMask = 1 << (bitsPerByte - 1);
    // Special handling for trailing bits
    int maxBits = (byteCounter < wholeBytes) ? bitsPerByte : trailingBits;
    for (int bitCounter = 0; bitCounter < maxBits; ++bitCounter, bitMask >>= 1)
    {
      bits.push_back((pMemAsChar[byteCounter] & bitMask) != 0);
    }
  }
  return bits;
}


mem2bitstring() source code

std::string mem2bitstring(void* pMem, int numberOfBits)
{
  std::vector<bool> bits = mem2bits(pMem, numberOfBits);
  std::string bitString;
  std::vector<bool>::const_iterator bitsIter = bits.begin();
  for (; bitsIter != bits.end(); ++bitsIter)
  {
    if (*bitsIter)
      bitString += "1";
    else
      bitString += "0";
  }
  return bitString;
}


More preliminaries

Endianness

If a data type that is larger than 1 byte (e.g. WORD) is analyzed, one must keep endianness in mind. For instance, Windows running on Intel processors is a Little Endian system: the least significant byte of a value is stored at the lowest address, so the bytes appear in memory in the reverse of the order in which the number is normally written.

Example:

std::string bitString;
WORD w = 0x00ff;                     // Binary: 00000000 11111111
bitString = mem2bitstring(&w, 16);   // Memory: 11111111 00000000
DWORD dw = 0x1a2b3c4d;               // Binary: 00011010 00101011 00111100 01001101
bitString = mem2bitstring(&dw, 32);  // Memory: 01001101 00111100 00101011 00011010


The sizeof() operator

The size of a data structure in memory can be figured out using the sizeof() operator:

struct bitField
{
  [...]
};

void foo()
{
  int i = sizeof(bitField);
}


Basic bitfield example

The following example shows the basics of how bitfields work. WORD and DWORD are Microsoft-specific data types that represent a 16-bit and a 32-bit unsigned integer, respectively. Also see this MSDN article on Windows data types.

struct bitField1
{
  WORD member1:1;   // use 1 bit
  WORD member2:1;   // use 1 bit
  WORD member3:14;  // use 14 bits
};

void foo()
{
  int s = sizeof(bitField1);  // 2 Bytes

  std::string bitString;

  bitField1 b1;
  b1.member1 = 0x1;                    // -------+
  b1.member2 = 0x0;                    //        |
  b1.member3 = 0x0000;                 //        v
  bitString = mem2bitstring(&b1, 16);  // 00000001 00000000

  bitField1 b2;
  b2.member1 = 0x0;
  b2.member2 = 0x1;                    // ------+
  b2.member3 = 0x0000;                 //       v
  bitString = mem2bitstring(&b2, 16);  // 00000010 00000000

  bitField1 b3;
  b3.member1 = 0x0;
  b3.member2 = 0x0;
  b3.member3 = 0x00ff;                 // 00000000 11111111
  bitString = mem2bitstring(&b3, 16);  // 11111100 00000011

  bitField1 b4;
  b4.member1 = 0x0;
  b4.member2 = 0x0;
  b4.member3 = 0x3fc0;                 // 00111111 11000000
  bitString = mem2bitstring(&b4, 16);  // 00000000 11111111

  bitField1 b5;
  b5.member1 = 0xaaaa;                 // 10101010 10101010  only the last 0 is retained
  b5.member2 = 0x5555;                 // 01010101 01010101  only the last 1 is retained
  b5.member3 = 0xff00;                 // 11111111 00000000  the 2 highest bits are discarded
  bitString = mem2bitstring(&b5, 16);  // 00000010 11111100
}

Notes:

  • Since all members of the bitfield have the same data type, and the bitfield's total number of bits does not exceed the size of that data type, the entire bitfield is packed into a single unit of that data type (here 2 bytes)
  • Example b1 shows that the first member occupies the lowest-order bit of the 16-bit unit, and that this bit ends up in the first byte in memory, as expected on a Little Endian system
  • Examples b2, b3 and b4 demonstrate where the value of each bitfield member is placed within the bitfield's memory chunk
  • Example b5 shows that a value too wide for a bitfield member is truncated: only as many low-order bits as the member declares are kept, and the higher-order bits are discarded


Variations

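The following data structures demonstrate how data type variations affect the size of the bitfield. As the memory dumps show, Visual C++ starts a new storage unit whenever the size of the member's data type changes. Note that the sizes of 6 bytes reported for bitField3, bitField4 and bitField5 suggest that the experiments were made with a packing alignment of 1 or 2 bytes; with the default packing, a DWORD unit would normally be aligned on a 4-byte boundary and these structures would be expected to occupy 8 bytes.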

// ------------------------------------------------------------
// sizeof(bitField2) == 4
// member1 = 0x1
// member2 = 0x0
// member3 = 0xffffffff
// mem2bitstring(&bf2, 32) == 11111101 11111111 00000000 00000000
//                                  ^^
//                                  ||
//                        member2 --++-- member1
struct bitField2
{
  DWORD member1:1;
  DWORD member2:1;
  DWORD member3:14;
};
 
// ------------------------------------------------------------
// sizeof(bitField3) == 6
// member1 = 0x1
// member2 = 0x0
// member3 = 0xffffffff
// mem2bitstring(&bf3, 48) == 00000001 00000000 11111110 01111111 00000000 00000000
//                                   ^                 ^
//                                   |                 |
//                                   +-- member1       +-- member2
struct bitField3
{
  WORD member1:1;
  DWORD member2:1;
  DWORD member3:14;
};
 
// ------------------------------------------------------------
// sizeof(bitField4) == 6
// member1 = 0x1
// member2 = 0x0
// member3 = 0xffffffff
// mem2bitstring(&bf4, 48) == 00000001 00000000 11111111 00111111 00000000 00000000
//                                  ^^
//                                  ||
//                        member2 --++-- member1
struct bitField4
{
  WORD member1:1;
  WORD member2:1;
  DWORD member3:14;
};
 
// ------------------------------------------------------------
// sizeof(bitField5) == 6
// member1 = 0x1
// member2 = 0x0
// member3 = 0xffff
// mem2bitstring(&bf5, 48) == 00000001 00000000 00000000 00000000 11111111 00111111
//                                  ^^
//                                  ||
//                        member2 --++-- member1
struct bitField5
{
  DWORD member1:1;
  DWORD member2:1;
  WORD member3:14;
};