As you know, our main activity is development of the code analyzers PVS-Studio and CppCat. Although we have been doing this for a long time now and - as we believe - quite successfully, an unusual idea struck us recently. You see, we do not use our own tools in exactly the same way our customers do. Well, we analyze the code of PVS-Studio by PVS-Studio of course, but, honestly, the PVS-Studio project is far from large. Also, the manner of working with PVS-Studio's code is different from that of working with Chromium's or LLVM's code, for example.
We felt like putting ourselves in our customers' shoes to see how our tool is used in long-term projects. You see, project checks we regularly do and report about in our numerous articles are done just the way we would never want our analyzer to be used. Running the tool on a project once, fixing a bunch of bugs, and repeating it all again just one year later is totally incorrect. The routine of coding implies that the analyzer ought to be used regularly - daily.
OK, what's the purpose of all that talk? Our theoretical wishes about trying ourselves in third-party projects have coincided with practical opportunities we started to be offered not so long ago. Last year we decided to allocate a separate team in our company to take up - ugh! - outsourcing; that is, take part in third-party projects as a developer team. Moreover, we were interested in long-term and rather large projects, i.e. requiring not less than 2-3 developers and not less than 6 months of development. We had two goals to accomplish:
- try an alternative kind of business (custom development as opposed to own product development);
- see with our own eyes how PVS-Studio is used in long-term projects.
In this article I'm going to discuss a problem few people think of. Computer simulation of various processes becomes more and more widespread. This technology is wonderful because it allows us to save time and materials which would be otherwise spent on senseless chemical, biological, physical and other kinds of experiments. A computer simulation model of a wing section flow may help significantly reduce the number of prototypes to be tested in a real wind tunnel. Numerical experiments are given more and more trust nowadays. However, dazzled by the triumph of computer simulation, nobody notices the problem of software complexity growth behind it. People treat computer and computer programs just as a means to obtain necessary results. I'm worried that very few know and care about the fact that software size growth leads to a non-linear growth of the number of software bugs. It's dangerous to exploit a computer treating it just as a big calculator. So, that's what I think - I need to share this idea with other people.
I'm currently experiencing a strong cognitive dissonance, and it won't let me go. You see, I visit various programmers' forums and see topics where people discuss noble ideas about how to write super-reliable classes; somebody tells he has his project built with the switches -Wall -Wextra -pedantic -Weffc++, and so on. But, God, where are all these scientific and technological achievements? Why do I come across most silly mistakes again and again? Perhaps something is wrong with me?
Some of our users run static analysis only occasionally. They find new errors in their code and, feeling glad about this, willingly renew PVS-Studio licenses. I should feel glad too, shouldn't I? But I feel sad - because you get only 10-20% of the tool's efficiency when using it in such a way, while you could obtain at least 80-90% if you used it otherwise. In this post I will tell you about the most common mistake among users of static code analysis tools.
We thought of checking the Boost library long ago but were not sure if we would collect enough results to write an article. However, the wish remained. We tried to do that twice but gave up each time because we didn't know how to replace a compiler call with a PVS-Studio.exe call. Now we've got us new arms, and the third attempt has been successful. So, are there any bugs to be found in Boost?
I find this question pretty strange. The answer is yes, of course, and that will be so for a long time. But I'm asked this question from time to time at conferences or when communicating with developers on forums. I've decided to answer this question in the form of a brief post so that I could just refer people to it in the future.
We develop the PVS-Studio code analyzer for C/C++ software developers. People sometimes ask me why these particular languages; C/C++ is old and few developers use it, isn't it so? When I tell them this is quite a popular language and it is widely used, they look sincerely astonished.
Perhaps the reason is that the Internet is full of articles, forums and news about new languages and their capabilities. Programmers who don't work with the C/C++ language simply don't notice rare news items about it among all that stuff. It's quite natural: there's no point in advertising what has been widely known and used for a long time. As a result, they come to the conclusion that this language was abandoned long ago and now is used only to maintain some old projects.
It's not so. This is a very popular, live and actively developing language. Just have a look at the rating of programming languages to see that I'm right. Currently it can be found here: TIOBE Programming Community Index for January 2013.
If you sum up C, C++ and Objective-C, you'll get 37%. It's 6 times higher than PHP, for instance. The extinction of the C/C++ language family is quite out of the question.
Here's the answer to the question why it's C/C++ that we prefer to support in PVS-Studio: because these are the most popular languages nowadays. Besides, they are complex, tricky and much error-prone. It's just a paradise where static code analyzers can thrive.
Note. Don't take it as a criticism of the C or C++ language. It's just the price we have to pay for the flexibility of these language and the capability of getting fast optimized code they generate.
Once again I would like to touch upon the wrong belief that C/C++ is now used only in old projects or microcontrollers. No, many contemporary and popular applications are being written in this language. For instance, such is Chromium - you can't say it's an ancient project by any means.
Here is a list of popular applications written in C++: C++ Applications.
To finish the article, I would like to give you one more link to a discussion: Why is C++ still a very popular language in quantitative finance?
There is no fragment in program code where you cannot make mistakes. You may actually make them in very simple fragments. While programmers have worked out the habit of testing algorithms, data exchange mechanisms and interfaces, it's much worse concerning security testing. It is often implemented on the leftover principle. A programmer is thinking: "I just write a couple of lines now, and everything will be ok. And I don't even need to test it. The code is too simple to make a mistake there!". That's not right. Since you're working on security and writing some code for this purpose, test it as carefully!
To be honest, I don't know what the TPP project is intended for. As far as I understand, this is a set of tools to assist in research of proteins and their interaction in living organisms. However, that's not so much important. What is important is that their source codes are open. It means that I can check them with the PVS-Studio static analyzer. Which I'm very much fond of.
So, we have checked the Trans-Proteomic Pipeline (TPP) version 4.5.2 project. To learn more about the project, see the following links:
I'm going on to tell you about how programmers walk on thin ice without even noticing it. Let's speak on shift operators <<, >>. The working principles of the shift operators are evident and many programmers even don't know that using them according to the C/C++ standard might cause undefined or unspecified behavior.
In C language, you may use functions without defining them. Pay attention that I speak about C language, not C++. Of course, this ability is very dangerous. Let us have a look at an interesting example of a 64-bit error related to it.