11 November 2017

This summer, as a sophomore at UCSC, I decided to enroll in summer school to get some requirements done with less pressure and at a lower cost. One of the classes I took was CMPS 109 “Advanced Programming”. It turned out to be a class on C++.

“No problem!” I thought. C++ was my first programming language. I spent years writing poor C++03 code, and one of my projects on GitHub is an HTTP server written in C++11. C++ was no longer my favorite language (it lost that title when I discovered that standardized build systems and package management existed), but it was still one that I was very familiar with. I figured the class would be easy.

It wasn’t.

It wasn’t that I didn’t know C++, or that I didn’t do my homework. It was that C++ is hard. Not because the syntax is complicated or the patterns and concepts are difficult to understand, but because the language and the standard library are full of idiosyncrasies that the programmer has to keep track of. If you assume or extrapolate a behavior in C++, it’s almost definitely incorrect. The only way to be sure of anything is by reading the standard, and that gets frustrating real fast.

So, in this post, I’m going to document all the little pitfalls I’ve encountered while writing C++. This is mainly for myself, should I find myself returning to C++ in the future, but maybe someone else will find it useful.

Most Vexing Parse

Of course, I have to include this one here. Most programmers, especially those coming from languages such as Java or Python, would assume that the way to call the constructor of a class with no arguments is to write MyClass var_name();. That is a reasonable assumption, because calling a function with no parameters is done with my_function(); and calling a constructor with parameters is done with MyClass var_name(parameter1);.

Unfortunately, most people coming from other languages forget that C++ has function declarations (often used as “forward declarations”), which look exactly like our assumed no-parameter constructor call. And unfortunately, by the standard, a statement that consists of a type, an identifier, and an empty pair of parentheses is always considered a function declaration, and never a variable.

Which, okay, in most cases where someone confuses the two, there would obviously be a compiler error, right? Well, yes, but declaring a function inside another function is legal in C++, which means that the compiler only detects an error once you try to use the name as an object. And even then, the error most compilers (MSVC and GCC) generate is a type mismatch. Only Clang actually tells you that you probably didn’t want a function declaration there!

C++11 introduced a new syntax, MyClass var_name{}, to instantiate classes, but even that has its own pitfalls.
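To make the difference concrete, here is a small sketch. MyClass and read_value are made-up names for illustration, and the static_assert lines are only there to show what the compiler thinks each name is.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class, used only for illustration.
struct MyClass {
    int value = 42;
};

int read_value() {
    MyClass a();  // most vexing parse: declares a function a() returning MyClass
    MyClass b;    // an actual object, default-initialized
    MyClass c{};  // C++11 brace syntax: also an actual object

    // a is a function, so something like a.value would not compile.
    static_assert(std::is_function<decltype(a)>::value,
                  "a is a function declaration, not an object");
    static_assert(std::is_same<decltype(b), MyClass>::value, "b is an object");
    static_assert(std::is_same<decltype(c), MyClass>::value, "c is an object");

    assert(b.value == c.value);  // both objects got the default member value
    return b.value + c.value;
}
```

Note that the line declaring a compiles on its own (some compilers only warn about it); the trouble starts when you try to use a as if it were an object.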

Translation Units and Multiple Definitions

Here’s how C++’s build process works: each implementation file is concatenated together with the header files it includes (this is what #include does), and the compiler generates one object file from each result. The linker then takes these object files and stitches them together into an executable.

The input that the compiler needs to create an object file is called a “translation unit”. This is why you generally declare an object in a header file and define it in an implementation file. This is also why you can’t define a non-inline function or variable in a header file if you plan on including that header more than once. And it’s also why, if you do define an object in a header file and use it in two different translation units, it is the linker, not the compiler, that throws an error about multiple definitions.
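Here is a rough sketch of the declare-in-header, define-in-implementation split. The file names (counter.h, counter.cpp, main.cpp) and every identifier are invented for this example, and the three “files” are squeezed into one listing so that it compiles as a single unit.

```cpp
#include <cassert>

// ---- counter.h (hypothetical header, included by several .cpp files) ----
extern int counter;   // declaration only: safe to repeat in every translation unit
int next_id();        // declaration only
inline int twice(int x) { return 2 * x; }  // inline definitions are allowed in headers
// int counter = 0;   // a definition here would end up in every translation unit
//                    // that includes the header, and the linker would complain

// ---- counter.cpp (hypothetical implementation file) ----
int counter = 0;                      // the one and only definition
int next_id() { return ++counter; }

// ---- main.cpp (hypothetical second translation unit) ----
int use_counter() {
    int first = next_id();
    int second = next_id();
    assert(second == first + 1);  // both calls share the single counter
    return twice(second);
}
```

In a real project each section would live in its own file, and main.cpp would only ever see the declarations from counter.h.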

You can’t create implementation files for templates

Sorry. You can’t. Templates aren’t generated until they’re used. Since each translation unit is compiled independently, the compiler can’t generate code for a template used in one unit when its definition lives only in another.

You have three, technically two, choices:

  1. Define the entire thing in the header
  2. Declare the template in the header file, then define it in an “implementation” file which is included at the bottom of the header file. To the compiler, this is the same as 1.
  3. Explicitly instantiate the template in the implementation file for every set of template arguments it is used with.

Oh, but don’t worry. Templates don’t count as multiple definitions.
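Choice 3 is probably the least familiar, so here is a minimal sketch of it. The maximum template and the file names are made up for illustration, and the “files” are again shown in one listing so that it compiles as a single unit.

```cpp
#include <cassert>

// ---- maximum.h (hypothetical header) ----
template <typename T>
T maximum(T a, T b);  // declaration only, no body in the header

// ---- maximum.cpp (hypothetical implementation file) ----
template <typename T>
T maximum(T a, T b) {
    return a < b ? b : a;
}

// Explicit instantiations: the only versions this file puts in its object file.
template int maximum<int>(int, int);
template double maximum<double>(double, double);

// ---- main.cpp (hypothetical user of the header) ----
void use_maximum() {
    assert(maximum(3, 7) == 7);        // links against maximum<int>
    assert(maximum(2.5, 1.5) == 2.5);  // links against maximum<double>
    // In a real multi-file build, maximum('a', 'b') would compile but fail to
    // link, because maximum<char> was never instantiated in maximum.cpp.
}
```

The trade-off is that you have to know every type up front, which is why most code ends up going with choice 1.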