C++

From HerzbubeWiki

This page is a collection of various bits of knowledge related to C++. When I come across something that I find hard to remember, I just note it down here.

Also see the C++ Snippets page on this wiki.


References

  • The C++ standard documents: Can be found via this Stack Overflow answer, which contains a continuously updated list of URLs.


No "super" keyword

The lamento

Some programming languages have a keyword super, or some similar construct, that allows references to the superclass of a subclass. Alas, C++ has no such keyword. The reason is simple: Since C++ supports multiple inheritance, C++ has no way to know which of the possibly several superclasses super is supposed to refer to.

If there is this inheritance tree "C1 -> C2 -> C3", referring to a superclass implementation of a method looks like this:

void C3::Foo()
{
  C1::Foo();
}

Note that if C2 also implements Foo(), that implementation is going to be skipped by the above code snippet. The proper way to write this would therefore be:

void C3::Foo()
{
  C2::Foo();
}

If C2 does not implement Foo(), this example does not trigger a compilation error. Instead, C1's implementation of Foo() is invoked.


A possible workaround

In projects where multiple inheritance is not used, the clever use of typedef allows the following workaround:

// A.hpp
class A : public B
{
private:
   typedef B Super;
};

// A.cpp
#include "A.hpp"

A::A()
:  Super()
{
}

Notes

  • The typedef is class-specific so it won't pollute the global namespace
  • The typedef is private, so it won't affect subclasses
  • The approach could be used even in multiple inheritance projects. The only problem is to find (and stick to) a convention for which of the several superclasses should be identified by the typedef.


Functions in subclasses hide same-name functions in superclasses

The following code snippet (slightly adapted from the first Stackoverflow question below) results in a compiler error in C::bar():

#include <string>

class A
{
public:
  void foo(const std::string& s) {}
};

class B : public A
{
public:
  int foo(int i) { return i; }  // hides A::foo() in B and in all subclasses of B
};

class C : public B
{
public:
  void bar()
  {
    std::string s;
    foo(s);
  }
};

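The explanation can be found in the answers to Stack Overflow questions on this topic. In short: name lookup stops at the first scope in which the name foo is found. That scope is B, so A::foo() is hidden and never takes part in overload resolution. The error is still weird and unintuitive to me, but apparently it is according to the standard.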
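
One common way out is a using-declaration in B that re-introduces the overloads of A. The following is only a sketch: the return values are made up so that the selected overload can be observed.

```cpp
#include <string>

class A
{
public:
  std::string foo(const std::string& s) { return "A::foo(string)"; }
};

class B : public A
{
public:
  using A::foo;  // make A's overloads visible next to B's own foo()
  std::string foo(int i) { return "B::foo(int)"; }
};

class C : public B
{
public:
  std::string bar()
  {
    std::string s;
    return foo(s);  // now finds A::foo(const std::string&)
  }
};
```

Without the using-declaration, the call in C::bar() fails to compile just like in the snippet above.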


Function pointers and member function pointers

This codeproject.com article is a thorough (some might say "excessive") treatment of function pointers, member function pointers and their use as delegates. Here I only paraphrase the most essential parts in the simplest possible ways; read the article to get the full story.


Example of a function pointer:

float foo(int i, const char* s);    // the function prototype (const char*, so that string literals can be passed)

float (*aFunctionPointer1)(int, const char*);             // a function pointer variable named "aFunctionPointer1"
typedef float (*AFunctionPointerType)(int, const char*);  // typedef makes usage of the type much more readable
AFunctionPointerType aFunctionPointer2;                   // see?

aFunctionPointer2 = foo;          // assignment; aFunctionPointer1 works the same as aFunctionPointer2
(*aFunctionPointer2)(42, "bar");  // calling the function


Example of a member function pointer:

class Bar
{
public:
  float foo(int i, const char* s);    // the function prototype
  bool operator !();                  // an operator, for the operator syntax below
};

float (Bar::*aFunctionPointer1)(int, const char*);             // a member function pointer variable named "aFunctionPointer1"
float (Bar::*aConstFunctionPointer)(int, const char*) const;   // same, but for a const member function
typedef float (Bar::*AFunctionPointerType)(int, const char*);  // typedef is your friend
AFunctionPointerType aFunctionPointer2;

aFunctionPointer2 = &Bar::foo;                        // assignment; aFunctionPointer1 works the same as aFunctionPointer2
bool (Bar::*anOperatorPointer)() = &Bar::operator !;  // the syntax for operators

Bar* aHeapObject = new Bar();                  // operator ->* is used to call the function for objects on the heap
Bar aStackObject;                              // operator .* is used to call the function for objects on the stack
(aHeapObject->*aFunctionPointer2)(42, "bar");  // the operators have low precedence, so parentheses must be placed around them
(aStackObject.*aFunctionPointer2)(42, "bar");

Notes:

  • The ::* in the declaration of the function pointer is an operator!
  • Member function pointers can only point to member functions of a single class, i.e. if another class also has a member function with the same signature, this counts as a different type and a different function pointer must be used! This is reflected by the fact that the function pointer declaration includes the class name.
  • Only non-static member functions can be referenced in this way. For static member functions, regular function pointers must be used.
  • There is no way to obtain a function pointer to a constructor or destructor.


Stack Unwinding

Stack unwinding also takes place in those functions that do not catch an exception, but that are just passed over as the exception propagates up the stack.

One exception is Microsoft's VC++: There is a compiler option (/EH) that specifies the exception handling model. Unfortunately the default model assumes that extern "C" functions do not throw exceptions, so for those functions the compiler optimizes away the code that would take care of stack unwinding.
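
To illustrate the first paragraph, here is a minimal sketch. Tracer, inner(), middle() and run() are hypothetical names; the log only exists to make the order of events visible.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: records its own destruction in a shared log
struct Tracer
{
  std::vector<std::string>& log;
  std::string name;

  Tracer(std::vector<std::string>& aLog, const std::string& aName)
    : log(aLog), name(aName) {}
  ~Tracer() { log.push_back("~" + name); }
};

void inner(std::vector<std::string>& log)
{
  Tracer t(log, "inner");
  throw std::runtime_error("boom");
}

// Does not catch anything, yet its local object is destroyed
// when the exception passes through
void middle(std::vector<std::string>& log)
{
  Tracer t(log, "middle");
  inner(log);
}

std::vector<std::string> run()
{
  std::vector<std::string> log;
  try
  {
    middle(log);
  }
  catch (const std::runtime_error&)
  {
    log.push_back("caught");
  }
  return log;
}
```

The log returned by run() contains "~inner", "~middle" and "caught", in this order: both destructors run before the handler is entered.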


Passing variable number of arguments to another function

// Calling this function from f3() does NOT work
void f1(int iii, char* pFormat, ...)
{
  printf(pFormat, &pFormat + sizeof(pFormat));
}

// Calling this function from f3() DOES work
void f2(int iii, char* pFormat, va_list pArguments)
{
  vprintf(pFormat, pArguments);
}

void f3(int iii, char* pFormat, ...)
{
  // f1(iii, pFormat, ???);                     // variable argument list has no name, so no way to call f1() directly

  // Although slightly complicated, this is the correct way to do it.
  // The function we are invoking must have a va_list argument.
  va_list pArguments;
  va_start(pArguments, pFormat);
  f2(iii, pFormat, pArguments);
  va_end(pArguments);

  // This can only work on platforms where va_list is a plain pointer and
  // the values in the variable argument list are on the stack in one
  // contiguous memory region (and even there the intended address would be
  // &pFormat + 1). It is not portable, so always prefer the method above.
  f2(iii, pFormat, &pFormat + sizeof(pFormat));

  // Compiles but does NOT work at runtime, because of how f1() is implemented:
  // Here we pass to f1() merely a POINTER to the variable argument list,
  // not a COPY of the arguments themselves. But f1() does not know that, and
  // it in turn passes a copy of the pointer on to printf(). Now printf() starts
  // interpreting the pointer as the arguments themselves, which at best crashes
  // the program, or at worst results in weird behaviour.
  f1(iii, pFormat, &pFormat + sizeof(pFormat));
}

Also see the answer to this Stack Overflow question.
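
The va_list pattern can be condensed into a small runnable sketch. The function names format() and vformat() are made up for this example, and the fixed buffer size is an arbitrary simplification.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Takes a va_list, so that variadic functions can forward their arguments to it
std::string vformat(const char* pFormat, va_list pArguments)
{
  char buffer[256];  // fixed size keeps the sketch short; longer output is truncated
  std::vsnprintf(buffer, sizeof(buffer), pFormat, pArguments);
  return std::string(buffer);
}

// Variadic front end: wraps its arguments in a va_list and forwards them
std::string format(const char* pFormat, ...)
{
  va_list pArguments;
  va_start(pArguments, pFormat);
  std::string result = vformat(pFormat, pArguments);
  va_end(pArguments);
  return result;
}
```

This mirrors how the standard library pairs printf() with vprintf().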


delete on void* does not work

The following does not work! Depending on the compiler it is rejected with an error, or merely accepted with a warning:

Foo* f = new Foo();
void* v = f;
delete v;

Reason: According to the C++ standard, the operand of delete must be a pointer to an object type, which void is not. In addition, the static type of the operand must either be the object's actual type, or a base class of the object's actual type that has a destructor marked as virtual. Anything else is undefined behaviour (see the comment by laserlight in forum post).

So not only is it impossible to destroy an arbitrary object using the type void, it would also not work if we tried to destroy an object by passing a wholly unrelated type to delete. I have not tried out what the following does, but I assume it compiles and then barfs at runtime:

Foo* f = new Foo();
Bar* b = reinterpret_cast<Bar*>(f);
delete b;
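
For contrast, a minimal sketch of a delete that is well defined although the pointer type differs from the object's actual type. Base, Derived and deleteThroughBase() are hypothetical names; the flag only makes the destructor call observable.

```cpp
struct Base
{
  virtual ~Base() {}  // virtual destructor: required for deleting through a Base*
};

struct Derived : Base
{
  bool& destroyed;

  explicit Derived(bool& flag) : destroyed(flag) {}
  ~Derived() { destroyed = true; }
};

bool deleteThroughBase()
{
  bool destroyed = false;
  Base* b = new Derived(destroyed);
  delete b;  // static type Base, actual type Derived: runs ~Derived(), then ~Base()
  return destroyed;
}
```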


On the use of NDEBUG

Who defines NDEBUG?

The preprocessor macro NDEBUG is used by the standard header assert.h to detect whether the use of the assert preprocessor macro should resolve to nothing, or actually do something. Although the C++ standard describes the use of NDEBUG in relation to assert.h, it does not say anything about who is responsible for defining the macro.

Consequently, C++ compilers cannot be relied upon to automatically define the macro.

In his 3rd edition of "The C++ Programming Language" Stroustrup says the following about NDEBUG when he talks about assertions (p. 750):

NDEBUG is usually set by compiler options on a per-compilation-unit basis. [...] Like all macro magic, this use of NDEBUG is too low-level, messy, and error-prone. [...] Using NDEBUG in this way requires that we define NDEBUG with a suitable value whether or not we are debugging. A C++ implementation does not do this for us by default [...]


When to use NDEBUG

One should never use NDEBUG in any other context than to control the behaviour of assertions, since the C++ standard allows anyone to define/undefine NDEBUG at will and multiple times, even within the same translation unit.

It is therefore common to use another macro to control a "compile-wide" debug flag. The application programmer could use an application-specific macro MY_DEBUG. On the MSVC platform, the compiler automatically defines _DEBUG when the compiler options /MTd or /MDd are specified.

Useful references for this topic are this and this Stackoverflow question.
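
A minimal sketch of this separation, assuming the application-specific macro is called MY_DEBUG as above. kDebugBuild and checkedDivide() are made-up names.

```cpp
#include <cassert>

// MY_DEBUG is the application's own switch, e.g. set with -DMY_DEBUG on the
// compiler command line. It is independent of NDEBUG.
#ifdef MY_DEBUG
const bool kDebugBuild = true;
#else
const bool kDebugBuild = false;
#endif

int checkedDivide(int a, int b)
{
  assert(b != 0);  // only this line is governed by NDEBUG
  return a / b;
}
```

Application code then tests MY_DEBUG (or kDebugBuild), and leaves NDEBUG to assert.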


Name mangling / Decoration

Basics

C++ supports function overloading. For this reason a C++ compiler performs so-called "name mangling" or "decoration" when it generates the symbol name for a function. Encoded in the symbol name are the function's name and the function's parameter types. The encoding scheme is entirely compiler dependent, i.e. different compilers will produce different symbol names.

See the GCC page for a method on how to undecorate mangled symbol names produced by GCC.

On Windows the popular and free developer tool Dependency Walker is capable of undecorating symbols in an executable binary.


extern "C"

C does not support function overloading, so a C compiler has no need for name-mangling/decoration. When a C program tries to use a function from a C++ library, the C linker therefore expects to find an undecorated symbol in the library file that it uses for linking. This will fail because the C++ compiler/linker used to generate the library file has generated decorated symbols.

To circumvent this problem, a programmer can use the extern "C" declaration to specify that the symbol for a certain function should not be decorated. For instance:

extern "C" void foo(int i);

It is also possible to use the declaration on multiple functions/symbols:

extern "C"
{
   void foo(int i);
   void bar(char c);
   int i;
}
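
A header that is meant to be included from both C and C++ code usually wraps the declarations in a check for the predefined macro __cplusplus, because a C compiler does not understand extern "C". A sketch with a made-up function foo_add():

```cpp
// foo.h (hypothetical header, usable from both C and C++)
#ifdef __cplusplus
extern "C" {
#endif

int foo_add(int a, int b);

#ifdef __cplusplus
}
#endif

// foo.cpp: compiled as C++, but the symbol is generated without decoration
extern "C" int foo_add(int a, int b)
{
  return a + b;
}
```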


New stuff in C++11

This section contains some of the new stuff in the C++11 standard that I found useful. The section is not complete! Most of the information is taken from the Wikipedia page on C++11.


New type long long int

Up until now we had long int, which is guaranteed to have at least as many bits as int. On some implementations, long int is 64 bits, on others it is 32 bits.

The new type long long int is guaranteed to be at least as large as a long int, and have no fewer than 64 bits. long long int was already introduced to standard C by C99, so most compilers already support this.


Enumerations

Enumerations finally become first class citizens! Up until now enumerations were merely glorified integer values, but with C++11 enumerations are now, finally, strongly typed. To take advantage we must opt in by using a new syntax. It's as simple as adding the keyword class (or struct as a synonym) to the declaration:

// Still has the underlying type "int"
enum class Foo
{
  Value1,        // still gets implicit value 0
  Value42 = 42,
  Value43        // still gets implicit value 43
};

Even better, we can now also specify an underlying type, which must be an integral type:

enum class Bar : unsigned char
{
  Value1 = 1,
  Value2 = 2,
  ValueUnknown    // implicit values follow the same rule for every underlying type: previous value + 1, i.e. 3
};

More goodness: Value names are now properly scoped within their enum declaration, so no more pollution of the global namespace and conflicting enum value names. We write this:

Foo fooValue = Foo::Value1;
Bar barValue = Bar::Value1;

Last, but definitely not least: Enums can now be forward declared. As with any forward declarations, the compiler must know about the size of the forward-declared type, so the underlying type of an enum must be part of the forward declaration:

// Without specification of the underlying type the compiler assumes the default "int"
enum class Foo;
enum class Bar : unsigned char;
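
The following sketch checks some of these properties. Color and toInt() are made-up names; std::underlying_type comes from the header type_traits, which is also new in C++11.

```cpp
#include <cstdint>
#include <type_traits>

enum class Color : std::uint8_t
{
  Red,         // 0
  Green = 10,
  Blue         // 11
};

// The underlying type is exactly the one that was requested
static_assert(std::is_same<std::underlying_type<Color>::type, std::uint8_t>::value,
              "unexpected underlying type");

// There is no implicit conversion to an integer, a cast is required
int toInt(Color c)
{
  return static_cast<int>(c);
}
```

Writing int i = Color::Red; without the cast is a compiler error, which is the whole point of the stronger typing.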


Null pointer constant

A new constant has been introduced that denotes a null pointer:

MyClass* myClass = nullptr;

The type of the constant is std::nullptr_t, which is implicitly convertible and comparable to any pointer type. It is not implicitly convertible or comparable to integral types. The one exception is bool, and a later defect report restricted even that conversion to direct-initialization. Apparently it was decided that making nullptr behave more like regular pointer types would be less surprising - personally I find this a bit strange.

char* pc = nullptr;  // OK
int* pi = nullptr;   // OK
bool b(nullptr);     // Weird, but OK. b is false.
bool b2 = nullptr;   // Allowed by the original C++11 text, but rejected by newer compilers
int i = nullptr;     // Error

Finally, some overloading examples:

void foo(MyClass*);
void foo(int);
void bar(MyClass*);
void bar(std::nullptr_t);

foo(0);        // calls the int overload
foo(nullptr);  // calls the MyClass overload
bar(nullptr);  // calls the nullptr_t overload
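
The same overloads as a compilable sketch. The bodies and return values are made up so that the chosen overload can be checked; std::nullptr_t is declared in the header cstddef.

```cpp
#include <cstddef>
#include <string>

struct MyClass {};

std::string foo(MyClass*) { return "pointer"; }
std::string foo(int)      { return "int"; }

std::string bar(MyClass*)       { return "pointer"; }
std::string bar(std::nullptr_t) { return "nullptr_t"; }
```

A variable of type MyClass* that holds a null pointer still selects bar(MyClass*), because overload resolution looks at the type, not at the value.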


Final classes and methods

To make sure that a class cannot be subclassed:

class Foo final {};

To make sure that a virtual method cannot be overridden in a subclass:

virtual void f() final;

Note: final is not a keyword. It is an identifier that gets special meaning only in the specific context of a class/method declaration. It is therefore legal to use "final" as a variable name.


Explicit overrides

Up until now when a subclass wanted to override a virtual method in a base class, it simply did so by repeating the method declaration with the exact same signature as it appeared in the base class:

class Base
{
  virtual void f(int);
  virtual void g(int);
};

class Sub : public Base
{
  virtual void f(int);    // implicitly override base class f()
  virtual void g(short);  // oops! signature is not the same, so no override - but code compiles!
};

What happens if in the above example the writer of class Base decides to change the signature of method f()? The intended subclass override suddenly stops working! To prevent this it is now possible to explicitly declare that one wants to override a base class method:

class Base
{
  virtual void f(int);
  virtual void g(int);
};

class Sub : public Base
{
  virtual void f(int) override;    // explicitly override base class f()
  virtual void g(short) override;  // signature is not the same - compiler error!
};


Right angle bracket

Up until now we annoyingly had to use an artificial space character in declarations that involved even semi-complex template usage:

std::vector<std::pair<int, int> >;

The reason was that without a space the compiler would treat the two consecutive right angle brackets as the ">>" operator, which would result in a compiler error. This is no longer necessary, so we can finally write:

std::vector<std::pair<int, int>>;


Explicit conversion operators

The explicit keyword can now also be applied to operator declarations, to prevent an operator from being surprisingly used in an implicit conversion.

class Testable
{
public:
  explicit operator bool() const
  {
    return false;
  }
};
 
int main()
{
  Testable a, b;
  if (a)      { /*do something*/ }  // OK
  if (a == b) { /*do something*/ }  // compiler error
}


Type inference

The new keyword auto can be used to let the compiler figure out the right type of a variable. This is useful when writing out a complicated type would make the code unreadable. As far as I know, auto is pretty much the same as var in C# / .NET.

For instance:

for (std::vector<int>::const_iterator it = aVector.cbegin(); it != aVector.cend(); ++it)

can simply become

for (auto it = aVector.cbegin(); it != aVector.cend(); ++it)

Related to this is the keyword decltype, which lets the programmer say "use the same type as this other thing". Examples:

int a;
decltype(a) b;  // b has type int, i.e. the same type as a

auto c = 0;     // c has type int
decltype(c) d;  // d has type int, i.e. the same type as c

decltype((c)) e = c;  // e has type int&, because (c) is an lvalue
decltype(0) f;        // f has type int, because 0 is an rvalue
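
These claims can be verified by the compiler itself with static_assert and the type traits from the header type_traits. This is only a sketch; parenthesesYieldReference() is a made-up helper.

```cpp
#include <type_traits>

int a = 0;
auto c = 0;

static_assert(std::is_same<decltype(a), int>::value, "decltype(a) is int");
static_assert(std::is_same<decltype(c), int>::value, "auto deduced int");
static_assert(std::is_same<decltype((c)), int&>::value, "decltype((c)) is int&");
static_assert(std::is_same<decltype(0), int>::value, "decltype(0) is int");

// The same check as a function, so that it can also be queried at run time
bool parenthesesYieldReference()
{
  return std::is_reference<decltype((c))>::value;
}
```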


Simpler for loops

The for loop syntax has been simplified if you want to iterate over some container. The new syntax can be used on

  • C-style arrays
  • Initializer lists
  • Any type that has begin() and end() functions defined for it that return iterators (e.g. all containers in the STL)

This is the new syntax:

int myArray[5] = {1, 2, 3, 4, 5};
for (int& x : myArray)
{
  x *= 2;
}

// Combine new syntax with type inference / auto keyword
for (auto& x : myArray)
{
  x *= 2;
}


Constructor chaining

The syntax for constructor chaining (or "delegation", as it's officially called in the C++ terminology) is obvious and follows the syntax for member initialization.

class MyClass
{
private:
  int myValue;

public:
  MyClass(int aValue)
    : myValue(aValue)
  {
  }
  MyClass()
    : MyClass(42)
  {
  }
};

Side effects that need to be considered:

  • Object construction is considered to be complete once any constructor is finished. This means that a delegating constructor will operate on a fully constructed object!
  • Subclass constructors will execute only after all constructor delegation in the superclass is complete


Member initialization

Member initialization is now possible directly as part of the declaration:

class MyClass
{
private:
  int myValue = 42;

public:
  // Will use the member initialization code
  MyClass()
  {
  }

  // Will ***NOT*** use the member initialization code
  MyClass(int aValue)
    : myValue(aValue)
  {
  }
};


Explicit control over default constructor and copy constructor

class MyClass
{
  // Up until now, specifying a custom constructor prevented the compiler from
  // generating a default constructor
  MyClass(int value);

  // Here we explicitly tell the compiler to generate a default constructor for us
  MyClass() = default;

  // Here we can explicitly remove copy constructor and copy assignment operator
  // from legal use, instead of declaring them with visibility "private"
  MyClass(const MyClass&) = delete;
  MyClass & operator=(const MyClass&) = delete;

  void f(double d);
  // Without this, someone could call f(42) and the integer would be silently
  // converted into a double. Here we explicitly disallow such silent conversion,
  // if someone calls f(42) the result is a compiler error.
  void f(int) = delete;

  void g(double d);
  // Nice use of templates: Disallow calling the function with any type other
  // than double
  template<class T> void g(T) = delete;
};


Initializer lists

Standard containers from the STL can now be initialized with initializer lists, e.g.

std::vector<int> v = { 1, 2, 3 };

This is possible due to a new STL type

std::initializer_list<>

The STL type can be used in your own constructors, which are then called "initializer-list constructors", a constructor type that is specially treated during uniform initialization:

class MyClass
{
public:
  MyClass(std::initializer_list<int> list);

  void doIt(std::initializer_list<int> list);
};

MyClass myClass = {1, 2, 3};

For what it's worth, std::initializer_list<int> can also be used in regular non-constructor methods:

void doIt(std::initializer_list<int> list);
doIt({1, 2, 3});


Uniform initialization

There is a new uniform initialization syntax that uses braces ("{}"):

struct MyStruct
{
  int x;
  double y;
};
 
class MyClass
{
public:
  MyClass(const std::string& s);
  MyClass(std::initializer_list<int> list);
  MyClass(int x);
};

MyStruct myStruct{5, 3.2};
MyClass myClass1{"foo"};
MyClass myClass2{42};  // with uniform initialization syntax, the initializer list constructor takes precedence over the constructor that takes an int
MyClass myClass3(42);  // use classic constructor syntax to access the constructor that takes an int

MyStruct getMyStruct()
{
  return {5, 3.2};
}
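
The precedence rule is easy to get wrong, so here is a small sketch that makes the selected constructor visible. Probe, withBraces() and withParentheses() are made-up names.

```cpp
#include <initializer_list>
#include <string>

// Hypothetical class that remembers which constructor was chosen
struct Probe
{
  std::string chosen;

  Probe(int) : chosen("int") {}
  Probe(std::initializer_list<int>) : chosen("initializer_list") {}
};

std::string withBraces()      { Probe p{42}; return p.chosen; }
std::string withParentheses() { Probe p(42); return p.chosen; }
```

The classic pitfall of this rule is std::vector<int>: v{5} creates one element with the value 5, while v(5) creates five elements.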


Static assertions

Static assertions evaluate constant expressions at compile time:

template<class T>
struct Check
{
 static_assert(sizeof(int) <= sizeof(T), "T is not big enough!");
};


Improved sizeof()

This is now valid:

class A {};
class B
{
public:
  A myMember;
};
size_t size = sizeof(B::myMember);


Function pointer declarations

A new, more readable way to declare function pointers:

typedef void (*FunctionType)(double);   // old style
using FunctionType = void (*)(double);  // new style with "using" keyword


New character types, string literals, and Unicode support

These are the basic character types. Two of them are new in C++11:

char c1;      // intended to store UTF-8 (or any plain old C-style string)
char16_t c2;  // intended to store UTF-16
char32_t c3;  // intended to store UTF-32

Here is an example of the new string literal syntax:

// Note: this is the C++11 behaviour. Since C++20, u8 literals have the type const char8_t[].
const char utf8String1[] = u8"A UTF-8 string";
// The \u escape is followed by exactly 4 hexadecimal digits (without the usual 0x prefix) that denote a Unicode code point
const char utf8String2[] = u8"A UTF-8 string with a Unicode character: \u2018";
const char16_t utf16String1[] = u"A UTF-16 string";
const char16_t utf16String2[] = u"A UTF-16 string with a Unicode character: \u2018";
const char32_t utf32String1[] = U"A UTF-32 string";
// The \U escape is followed by exactly 8 hexadecimal digits
const char32_t utf32String2[] = U"A UTF-32 string with a Unicode character: \U00002018";
// The literal is enclosed between "( and )"
const char rawLiteralString1[] = R"(No need to escape special characters like \ or " until the end)";
// "delimiter" can be any string up to 16 characters, excluding spaces, control characters, and the characters '(', ')' and '\'
const char rawLiteralString2[] = R"delimiter(Another one with "( and ") character sequences)delimiter";
// Combine Unicode and raw literal prefixes
const char rawLiteralString3[] = u8R"(A raw UTF-8 string)";


Lambda expressions / closures

Lambda expressions (= anonymous functions) have this form

[capture](parameters) -> return_type { function_body }

Example:

[](int x, int y) -> int { return x + y; }

TODO: Closure example
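
Until the TODO is filled in properly, here is a minimal closure sketch. makeAdder() and sumUpTo() are made-up names; the first lambda captures by value, the second by reference.

```cpp
#include <functional>

// Returns a closure that has captured a copy of "step"
std::function<int(int)> makeAdder(int step)
{
  return [step](int x) -> int { return x + step; };
}

// Captures "total" by reference, so the lambda modifies the local variable
int sumUpTo(int n)
{
  int total = 0;
  auto add = [&total](int x) { total += x; };
  for (int i = 1; i <= n; ++i)
  {
    add(i);
  }
  return total;
}
```

The closure returned by makeAdder() stays valid after the function has returned because it owns a copy of step. Returning a lambda that captures a local variable by reference would leave a dangling reference.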


Multi-threading

The STL now has a number of new types to work with threads:

std::thread
std::mutex
std::recursive_mutex
std::condition_variable
std::condition_variable_any
std::lock_guard
std::unique_lock
std::async
std::future

In addition there is now a new storage specifier

thread_local

No details available at this time.


Additions to the STL

  • Hash tables: std::unordered_map, std::unordered_set and more
  • Regular expressions: std::regex, std::regex_search, std::regex_replace and std::match_results
  • Smart pointers: std::unique_ptr, std::shared_ptr and std::weak_ptr
  • Tuples: std::tuple and std::make_tuple()
  • Random number generator facilities


Standard Library I/O streams

Summary

Some key points about streams:

  • A stream delegates all operations to an intermediate stream buffer.
  • The stream buffer is responsible for performing the actual reading/writing operations on behalf of the stream.
  • The stream buffer reads from / writes to an input and output sequence. It is said to control those sequences.
  • An input or output sequence can be a file, or a string, or something else. The C++ Standard Library defines streams and stream buffers for files and strings only.
  • Stream and stream buffer classes are templates that must be parameterized with the character type (char or wchar_t)


Stream buffers

Interface reference: http://www.cplusplus.com/reference/streambuf/basic_streambuf/


As its name says, a stream buffer's function is to add the capability of buffering read or write operations. Whether or not a given stream buffer implementation actually performs such buffering is another question, but at least the concept is there.

A stream buffer has a set of internal pointers that reference positions of the buffered part of the input/output sequence. Note that the "end" positions refer to the memory location one character past the end of the buffer array.

                         beginning   current position   end
                                     (get/put pointer)
-------------------------------------------------------------
Input sequence buffer    eback       gptr               egptr
Output sequence buffer   pbase       pptr               epptr


Overflow/underflow

  • Overflow is the term used to describe the event when an operation wants to write to the stream buffer, but the buffer is full. In technical terms, the current write position (pptr) is equal to the end write position (epptr).
  • Underflow is the equivalent term for reading: The underflow event occurs when an operation wants to read from the stream buffer, but there is no content to read. In technical terms, the current read position (gptr) is equal to the end read position (egptr).


I found this Stack Overflow question extremely helpful when at one point I had to create my own custom I/O stream. The first sentence of the accepted answer is the most important part:
The proper way to create a new stream in C++ is to derive from std::streambuf and to override the underflow() operation for reading and the overflow() and sync() operations for writing.
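
A minimal sketch of the writing half of that advice. UpperCaseBuf and shout() are made-up names. The buffer never sets up a put area, so every character that is written triggers overflow(); because nothing is buffered, sync() does not need to be overridden here.

```cpp
#include <cctype>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical unbuffered stream buffer: converts each character to upper
// case and appends it to an internal string
class UpperCaseBuf : public std::streambuf
{
public:
  const std::string& str() const { return myData; }

protected:
  int_type overflow(int_type ch) override
  {
    if (!traits_type::eq_int_type(ch, traits_type::eof()))
    {
      unsigned char c = static_cast<unsigned char>(traits_type::to_char_type(ch));
      myData += static_cast<char>(std::toupper(c));
    }
    return traits_type::not_eof(ch);  // anything but eof signals success
  }

private:
  std::string myData;
};

// Writes through a regular std::ostream that sits on top of the custom buffer
std::string shout(const std::string& text)
{
  UpperCaseBuf buf;
  std::ostream out(&buf);
  out << text << 42;
  return buf.str();
}
```

A real implementation would usually set up a put area with setp() and flush it in overflow() and sync(), otherwise the "buffer" part of the stream buffer is not used at all.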