Some of our users run static analysis only occasionally. They find new errors in their code and, feeling glad about this, willingly renew PVS-Studio licenses. I should feel glad too, shouldn't I? But I feel sad - because you get only 10-20% of the tool's efficiency when using it in such a way, while you could obtain at least 80-90% if you used it otherwise. In this post I will tell you about the most common mistake among users of static code analysis tools.
We thought of checking the Boost library long ago but were not sure we would collect enough results to write an article. However, the wish remained. We tried twice but gave up each time because we didn't know how to replace a compiler call with a PVS-Studio.exe call. Now we have new weapons at our disposal, and the third attempt has been successful. So, are there any bugs to be found in Boost?
I find this question pretty strange. The answer is yes, of course, and that will be so for a long time. But I'm asked this question from time to time at conferences or when communicating with developers on forums. I've decided to answer this question in the form of a brief post so that I could just refer people to it in the future.
We develop the PVS-Studio code analyzer for C/C++ software developers. People sometimes ask me why we chose these particular languages; C/C++ is old and few developers use it, isn't that so? When I tell them that it is quite popular and widely used, they look sincerely astonished.
Perhaps the reason is that the Internet is full of articles, forums and news about new languages and their capabilities. Programmers who don't work with the C/C++ language simply don't notice rare news items about it among all that stuff. It's quite natural: there's no point in advertising what has been widely known and used for a long time. As a result, they come to the conclusion that this language was abandoned long ago and now is used only to maintain some old projects.
It's not so. This is a very popular, living and actively developing language. Just have a look at the rating of programming languages to see that I'm right. Currently it can be found here: TIOBE Programming Community Index for January 2013.
If you sum up C, C++ and Objective-C, you'll get 37%. It's 6 times higher than PHP, for instance. The extinction of the C/C++ language family is quite out of the question.
Here's the answer to the question of why it's C/C++ that we prefer to support in PVS-Studio: because these are the most popular languages nowadays. Besides, they are complex, tricky and very error-prone. It's a paradise where static code analyzers can thrive.
Note. Don't take this as criticism of the C or C++ language. It's just the price we have to pay for the flexibility of these languages and for the fast, optimized code they make possible.
Once again I would like to touch upon the wrong belief that C/C++ is now used only in old projects or microcontrollers. No, many contemporary and popular applications are being written in this language. Chromium is one example, and you can't call it an ancient project by any means.
Here is a list of popular applications written in C++: C++ Applications.
To finish the article, I would like to give you one more link to a discussion: Why is C++ still a very popular language in quantitative finance?
To be honest, I don't know what the TPP project is intended for. As far as I understand, it is a set of tools to assist in research on proteins and their interaction in living organisms. However, that's not so important. What is important is that its source code is open. It means that I can check it with the PVS-Studio static analyzer, which I'm very fond of doing.
So, we have checked the Trans-Proteomic Pipeline (TPP) version 4.5.2 project. To learn more about the project, see the following links:
I'm going to continue telling you how programmers walk on thin ice without even noticing it. Let's talk about the shift operators << and >>. The way the shift operators work seems obvious, and many programmers don't even know that using them may, according to the C/C++ standard, cause undefined or unspecified behavior.
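To make the risk concrete, here is a minimal sketch of a guarded left shift. It assumes the rules of the C++ standards before C++20, under which shifting a negative value to the left, or shifting by a count that is negative or not less than the bit width, is undefined. The helper name checked_shl is hypothetical and is not taken from any library; the check is deliberately conservative and also rejects results that do not fit in the signed type.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: performs value << count only when the operation is
// well-defined under pre-C++20 rules and the result fits in std::int32_t.
// Returns false (and leaves result untouched) otherwise.
bool checked_shl(std::int32_t value, int count, std::int32_t& result) {
    const int width = std::numeric_limits<std::int32_t>::digits + 1; // 32 bits
    if (count < 0 || count >= width) {
        return false; // shift count out of range: undefined behavior
    }
    if (value < 0) {
        return false; // left shift of a negative value: undefined before C++20
    }
    if (value > (std::numeric_limits<std::int32_t>::max() >> count)) {
        return false; // the result would not fit in the signed type
    }
    result = static_cast<std::int32_t>(value << count);
    return true;
}
```

With this helper, checked_shl(1, 4, r) succeeds and stores 16 in r, while checked_shl(-1, 1, r) and checked_shl(1, 32, r) are rejected instead of being executed.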
I would like to begin this article by pointing out that it is now 2012. I say this because, both at work and as a hobby, I often read C++ code that was written 10-20 years ago (and is still in use), or code written recently by people who learned to program in C++ 20 years ago. Afterwards I get the feeling that there has been no progress over the years, that nothing has changed or developed, and that mammoths still roam the Earth.
Programming was very different 20 years ago. Memory and CPU resources were counted in bytes and cycles, many things had not been invented yet, and we had to deal with that situation. But that is no excuse for writing code based on those assumptions today. The world is changing now. I can feel it in the water. I can feel it in the ground. We need to keep up with the progress.
Everything I write below applies only to programming in C++ with the mainstream compilers (gcc, Intel, and Microsoft). Unfortunately, I have worked less with other programming languages and compilers, so I will not say much about them. Also, I will only talk about application programming for desktop operating systems (trends may differ in clusters, microprocessors and system programming).
More than a year has passed since we analyzed Notepad++ with PVS-Studio. We wanted to see how much better the PVS-Studio analyzer has become since then and which of the previous errors have been fixed in Notepad++.
In the C language, you may use functions without declaring them. Note that I am talking about the C language, not C++. Of course, this ability is very dangerous. Let us have a look at an interesting example of a 64-bit error related to it.
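The effect can be modeled with a small sketch. It assumes the old C rule that a function used without a declaration is treated as returning int, and a 64-bit platform where int is 32 bits wide. Since C++ itself rejects undeclared functions, the C++ code below only simulates the loss of the high address bits; the function names are hypothetical, and real programs may additionally sign-extend the truncated value.

```cpp
#include <cstdint>

// Models what the caller receives when a 64-bit address is returned through
// a function assumed to return a 32-bit int: only the low 32 bits survive.
std::uint64_t address_seen_by_caller(std::uint64_t real_address) {
    const std::uint32_t low_bits = static_cast<std::uint32_t>(real_address);
    return low_bits;
}

// True when the address is small enough to pass through the int unchanged.
bool survives_truncation(std::uint64_t real_address) {
    return address_seen_by_caller(real_address) == real_address;
}
```

An address such as 0x1000 passes through unchanged, which is why this kind of code may appear to work for a long time, while an address above 4 GB such as 0x123456789 comes back as 0x23456789.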
I decided to find out whether there is any practical sense in writing ++iterator instead of iterator++ when handling iterators. My interest in this question arose not from love of art but from practical considerations. We have long intended to develop PVS-Studio not only in the direction of error search but also in the direction of offering tips on code optimization. A message telling you that you'd better write ++iterator fits well within the scope of optimization.
But how relevant is this recommendation nowadays? In ancient times, for instance, it was advised not to repeat calculations. It was considered good style to write:
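The reasoning behind the recommendation can be sketched with a toy iterator that counts how often it is copied. CountingIter and the two helper functions are hypothetical names invented for this illustration; they say nothing about how standard library iterators behave in an optimized build, which is exactly the question under investigation.

```cpp
// Hypothetical iterator over an int array that counts its own copies.
struct CountingIter {
    const int* ptr;
    static int copies;

    explicit CountingIter(const int* p) : ptr(p) {}
    CountingIter(const CountingIter& other) : ptr(other.ptr) { ++copies; }

    // Prefix form: advances in place and returns a reference, no copy needed.
    CountingIter& operator++() { ++ptr; return *this; }

    // Postfix form: must keep the old position, so it makes at least one copy.
    CountingIter operator++(int) {
        CountingIter old(*this);
        ++ptr;
        return old;
    }

    bool operator!=(const CountingIter& other) const { return ptr != other.ptr; }
};

int CountingIter::copies = 0;

// Walks [begin, end) with ++it and reports how many copies were made.
int copies_with_prefix(const int* begin, const int* end) {
    CountingIter::copies = 0;
    CountingIter last(end);
    for (CountingIter it(begin); it != last; ++it) {}
    return CountingIter::copies;
}

// Walks [begin, end) with it++ and reports how many copies were made.
int copies_with_postfix(const int* begin, const int* end) {
    CountingIter::copies = 0;
    CountingIter last(end);
    for (CountingIter it(begin); it != last; it++) {}
    return CountingIter::copies;
}
```

For this toy type the prefix loop makes no copies, while the postfix loop makes at least one copy per step. Whether that difference is measurable for real iterators depends on the iterator type and on the optimization settings.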
TMP = A + 10;
X = TMP + B;
Y = TMP + C;
instead of:
X = A + 10 + B;
Y = A + 10 + C;
Such subtle manual optimization is meaningless now. The compiler handles this task just as well. It's just an unnecessary complication of the code.
I develop the PVS-Studio analyzer, which detects errors in the source code of C/C++/C++11 software. So I have to review a large amount of source code from various applications in which PVS-Studio has detected suspicious fragments. I have collected a lot of examples demonstrating that an error occurred because a code fragment was copied and then modified. Of course, it has been known for a long time that using Copy-Paste in programming is a bad thing. But let's try to investigate this problem closely instead of limiting ourselves to just saying "do not copy the code".
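As a minimal illustration of this kind of mistake, consider the sketch below. It is an invented example, not a fragment from any of the checked projects: the Point3 type and both functions are hypothetical. The third comparison was copied from the second one, and only one of the two field names was updated.

```cpp
struct Point3 {
    int x;
    int y;
    int z;
};

// Buggy version: the last comparison was copy-pasted from the previous one
// and still reads b.y where b.z was intended.
bool equal_buggy(const Point3& a, const Point3& b) {
    return a.x == b.x && a.y == b.y && a.z == b.y;
}

// Corrected version: each field is compared with its counterpart.
bool equal_fixed(const Point3& a, const Point3& b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}
```

The buggy function can fail in both directions: it reports the point {1, 2, 3} as different from itself, and it reports {1, 2, 2} and {1, 2, 9} as equal.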