I'm currently experiencing a strong cognitive dissonance, and it won't let me go. You see, I visit various programmers' forums and see topics where people discuss noble ideas about how to write super-reliable classes; somebody says he builds his project with the switches -Wall -Wextra -pedantic -Weffc++, and so on. But, God, where are all these scientific and technological achievements? Why do I come across the silliest mistakes again and again? Perhaps something is wrong with me?
Some of our users run static analysis only occasionally. They find new errors in their code and, feeling glad about this, willingly renew PVS-Studio licenses. I should feel glad too, shouldn't I? But I feel sad - because you get only 10-20% of the tool's efficiency when using it in such a way, while you could obtain at least 80-90% if you used it otherwise. In this post I will tell you about the most common mistake among users of static code analysis tools.
We thought of checking the Boost library long ago but were not sure we would collect enough results to write an article. However, the wish remained. We tried twice but gave up each time because we didn't know how to replace a compiler call with a PVS-Studio.exe call. Now we have new tools at hand, and the third attempt has been successful. So, are there any bugs to be found in Boost?
I find this question pretty strange. The answer is yes, of course, and that will be so for a long time. But I'm asked this question from time to time at conferences or when communicating with developers on forums. I've decided to answer this question in the form of a brief post so that I could just refer people to it in the future.
We develop the PVS-Studio code analyzer for C/C++ software developers. People sometimes ask me why we chose these particular languages: C/C++ is old and few developers use it, isn't it? When I tell them it is quite popular and widely used, they look sincerely astonished.
Perhaps the reason is that the Internet is full of articles, forums and news about new languages and their capabilities. Programmers who don't work with the C/C++ language simply don't notice rare news items about it among all that stuff. It's quite natural: there's no point in advertising what has been widely known and used for a long time. As a result, they come to the conclusion that this language was abandoned long ago and now is used only to maintain some old projects.
It's not so. This is a very popular, live and actively developing language. Just have a look at the rating of programming languages to see that I'm right. Currently it can be found here: TIOBE Programming Community Index for January 2013.
If you sum up C, C++ and Objective-C, you'll get 37%. It's 6 times higher than PHP, for instance. The extinction of the C/C++ language family is quite out of the question.
That is why we prefer to support C/C++ in PVS-Studio: these are among the most popular languages nowadays. Besides, they are complex, tricky and highly error-prone. It's a paradise where static code analyzers can thrive.
Note. Don't take this as criticism of C or C++. It's just the price we pay for the flexibility of these languages and their ability to produce fast, optimized code.
Once again I would like to touch upon the wrong belief that C/C++ is now used only in old projects or microcontrollers. No, many contemporary and popular applications are written in these languages. Chromium is one example: you can't call it an ancient project by any means.
Here is a list of popular applications written in C++: C++ Applications.
To finish the article, I would like to give you one more link to a discussion: Why is C++ still a very popular language in quantitative finance?
There is no fragment of program code where you cannot make a mistake. You may actually make one in a very simple fragment. While programmers have developed the habit of testing algorithms, data exchange mechanisms and interfaces, the situation with security testing is much worse. It is often done on a leftover basis. A programmer thinks: "I'll just write a couple of lines now, and everything will be OK. I don't even need to test it. The code is too simple to make a mistake there!" That's not right. Since you're working on security and writing code for this purpose, test it just as carefully!
To be honest, I don't know what the TPP project is intended for. As far as I understand, it is a set of tools to assist in research on proteins and their interaction in living organisms. However, that's not so important. What is important is that its source code is open. It means that I can check it with the PVS-Studio static analyzer, which I'm very fond of doing.
So, we have checked the Trans-Proteomic Pipeline (TPP) version 4.5.2 project. To learn more about the project, see the following links:
I'm going to continue telling you about how programmers walk on thin ice without even noticing it. Let's talk about the shift operators << and >>. How they work seems obvious, and many programmers don't even know that, according to the C/C++ standards, using them may cause undefined or unspecified behavior.
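To make the risk concrete, here is a minimal sketch of a guarded left shift. The helper name checked_shl is hypothetical and not taken from any library. It relies on the following rules: a shift count that is negative or not less than the width of the type is undefined behavior, and left-shifting a negative value or overflowing a signed type is undefined in C and in C++ before C++20.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: stores value << count in `out` and returns true
// only when the shift is well defined for a 32-bit signed integer.
bool checked_shl(std::int32_t value, int count, std::int32_t& out) {
    const int bits = std::numeric_limits<std::int32_t>::digits + 1; // 32
    if (count < 0 || count >= bits) {
        return false; // shift count out of range: undefined behavior
    }
    if (value < 0) {
        return false; // negative left operand: undefined before C++20
    }
    if (value > (std::numeric_limits<std::int32_t>::max() >> count)) {
        return false; // result would not fit into int32_t
    }
    out = static_cast<std::int32_t>(value << count);
    return true;
}
```

With this sketch, checked_shl(1, 4, r) returns true and stores 16 in r, while checked_shl(1, 32, r) and checked_shl(-1, 1, r) return false instead of executing a shift with undefined behavior.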
In the C language, you may call functions without declaring them. Note that I'm speaking about C, not C++. Of course, this ability is very dangerous. Let us have a look at an interesting example of a 64-bit error related to it.
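The mechanism behind such errors can be sketched as follows. In older C, a function called without a declaration is assumed to return int, so a 64-bit pointer returned by it may lose its high bits. Since C++ does not allow such calls, the snippet below only simulates the effect on plain integers. The function names are hypothetical, and sign extension, which may also occur in practice, is ignored.

```cpp
#include <cstdint>

// Simulates passing a 64-bit address through a 32-bit integer:
// only the low 32 bits survive. (Hypothetical helper for illustration.)
std::uint64_t through_32bit_int(std::uint64_t address) {
    std::uint32_t low = static_cast<std::uint32_t>(address); // high 32 bits are dropped
    return low;
}

// True when the address comes back unchanged from the round trip.
bool survives(std::uint64_t address) {
    return through_32bit_int(address) == address;
}
```

An address below 4 GB, such as 0x7FFF0000, survives the round trip, which is why such code may appear to work in small tests. An address such as 0x100000000 does not.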
The id Software company has a PVS-Studio license. However, we decided to test the Doom 3 source code that has recently been published on the Internet. The result is the following: we managed to find just a few errors, but they are still there. I think it can be explained by the following fact.
I decided to find out if there is any practical sense in writing ++iterator instead of iterator++ when handling iterators. My interest in this question arose not from a love of art but from practical reasons. We have long intended to develop PVS-Studio not only in the direction of error search but also in the direction of giving tips on code optimization. A message telling you that you'd better write ++iterator fits well within the scope of optimization.
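For context, the usual argument behind this advice is that the postfix form must return the old value and therefore creates a temporary copy. The sketch below uses a hypothetical CountingIterator that counts its own copies. It measures copies only, not running time, and an optimizing compiler may well remove the difference for simple iterators.

```cpp
// Hypothetical iterator over positions that counts how often it is copied.
struct CountingIterator {
    static int copies;
    int pos;

    explicit CountingIterator(int p) : pos(p) {}
    CountingIterator(const CountingIterator& other) : pos(other.pos) { ++copies; }

    // Prefix form: advance and return a reference to itself, no copy needed.
    CountingIterator& operator++() {
        ++pos;
        return *this;
    }

    // Postfix form: must return the old value, so a temporary copy is made.
    CountingIterator operator++(int) {
        CountingIterator old(*this);
        ++pos;
        return old;
    }
};

int CountingIterator::copies = 0;

// Advances an iterator `steps` times and reports how many copies were made.
int count_copies(bool use_prefix, int steps) {
    CountingIterator it(0);
    CountingIterator::copies = 0;
    for (int i = 0; i < steps; ++i) {
        if (use_prefix) {
            ++it;
        } else {
            it++;
        }
    }
    return CountingIterator::copies;
}
```

Here count_copies(true, 100) reports no copies, while count_copies(false, 100) reports at least one copy per step.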
But how relevant is this recommendation nowadays? In the old days, for instance, programmers were advised not to repeat calculations. It was considered good style to write:
TMP = A + 10;
X = TMP + B;
Y = TMP + C;
instead of:
X = A + 10 + B;
Y = A + 10 + C;
Such subtle manual optimization is meaningless now. The compiler handles this task just as well. It only complicates the code unnecessarily.