To let the world know about our product, we check open-source projects. So far we have checked 245 projects. A side effect: we found 9574 errors and notified the authors about them.
PVS-Studio team experience: checking various open source projects, or mistakes C, C++ and C# programmers make
1. PVS-Studio team experience:
checking various open source
projects, or mistakes C, C++ and C#
programmers make
Authors:
Candidate of Engineering Sciences,
Evgeniy Ryzhkov, evg@viva64.com
Candidate of Physico-Mathematical Sciences,
Andrey Karpov, karpov@viva64.com
2. OOO "Program Verification Systems"
(www.viva64.com)
• Development, marketing and sales of our software product
• Office: Tula, 200 km away from Moscow.
• Staff: 14 people
3. A couple of words about static analysis
• Does everyone know what static analysis is?
• PVS-Studio performs static analysis of source
code written in C, C++ and C#.
• C, C++: 300 diagnostics;
• C#: 100 diagnostics
4. Our achievements
• To let the world know about our product, we check open-source
projects. So far we have checked 245 projects.
• A side effect: we found 9574 errors and notified the authors about
them.
• 9574/245 ≈ 39 errors per project, which is not that many. I would like
to stress that this is a side effect. We did not set out to find as
many errors as possible. Quite often we stop once we have found
enough errors for an article.
6. So, we have checked a lot of open source
projects...
• ... thus we have accumulated various observations that we would like
to share
7. Let’s start with boring stuff - typical errors
• Let’s talk about how programmers usually picture the work of a
static analyzer
8. A boring example N1
OpenMW (C++)
std::string rangeTypeLabel(int idx)
{
const char* rangeTypeLabels [] = {
"Self", "Touch", "Target"
};
if (idx >= 0 && idx <= 3)
return rangeTypeLabels[idx];
else
return "Invalid";
}
The array has 3 elements. If idx == 3, the array index is out of
bounds.
V557 Array overrun is possible. The value of 'idx'
index could reach 3. esmtool labels.cpp 502
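A minimal sketch of one possible fix, not the actual OpenMW patch: the upper bound is computed from the array itself, so the check cannot drift away from the number of labels. The variable name `count` is an assumption introduced for illustration.

```cpp
#include <cassert>
#include <string>

// Sketch of a corrected rangeTypeLabel: the bound comes from the array,
// not from a hand-written constant.
std::string rangeTypeLabel(int idx)
{
    static const char* const rangeTypeLabels[] = {
        "Self", "Touch", "Target"
    };
    // Number of elements in the array (3 here).
    const int count =
        static_cast<int>(sizeof(rangeTypeLabels) / sizeof(rangeTypeLabels[0]));
    if (idx >= 0 && idx < count)
        return rangeTypeLabels[idx];
    return "Invalid";
}
```

With this version, idx == 3 falls through to "Invalid" instead of reading past the end of the array.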
9. A boring example N2
CamStudio (C++)
int CopyStream(PAVIFILE pavi, PAVISTREAM pstm)
{
//....
BYTE p[20000];
//....
free(p);
return 0;
}
V726 An attempt to free memory containing the 'p' array by
using the 'free' function. This is incorrect as 'p' was created on
stack. playplusview.cpp 7059
10. A boring example N3
Sony ATF (C#)
public static QuatF Slerp(QuatF q1, QuatF q2, float t)
{
double dot = q2.X * q1.X + q2.Y * q1.Y +
q2.Z * q1.Z + q2.W * q1.W;
if (dot < 0)
q1.X = -q1.X; q1.Y = -q1.Y; q1.Z = -q1.Z; q1.W = -q1.W;
....
}
V3043 The code's operational logic does not correspond with its formatting.
The statement is indented to the right, but it is always executed. It is possible
that curly brackets are missing. Atf.Core.vs2010 QuatF.cs 282
11. A boring example N4
Xenko (C#)
public string ToString(string format,
IFormatProvider formatProvider)
{
if (format == null) return ToString(formatProvider);
return string.Format(formatProvider,
"Red:{1} Green:{2} Blue:{3}",
R.ToString(format, formatProvider),
G.ToString(format, formatProvider),
B.ToString(format, formatProvider));
}
V3025 Incorrect format. A different number of
format items is expected while calling 'Format'
function. Expected: 4. Present: 3.
SiliconStudio.Core.Mathematics Color3.cs 765
12. But life is way more interesting
• Let’s look at the dark side
13. Programmers do not check comparison
functions
• Psychoanalysis;
• "Can't be wrong" in functions like:
public static int Compare(FooType left, FooType right) {
if (left < right) return -1;
if (left > right) return 1;
return 0;
}
14. Easy. Example N1.
IronPython and IronRuby (C#)
public static int Compare(SourceLocation left,
SourceLocation right) {
if (left < right) return -1;
if (right > left) return 1;
return 0;
}
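A hedged sketch, in C++ rather than C#, of one way to make such a function harder to mistype: only operator< is used, with the operands swapped for the second test. The `Location` type is a hypothetical stand-in, not the IronPython class.

```cpp
#include <cassert>

// Hypothetical stand-in for the compared type.
struct Location {
    int line;
    int column;
};

// Orders locations by line first, then by column.
bool operator<(const Location& a, const Location& b)
{
    if (a.line != b.line)
        return a.line < b.line;
    return a.column < b.column;
}

// Three-way comparison built on a single operator, so the two tests
// cannot silently collapse into the same condition.
int compare(const Location& left, const Location& right)
{
    if (left < right) return -1;
    if (right < left) return 1;
    return 0;
}
```

In the IronPython fragment above, `left < right` and `right > left` are the same condition, so the function can never return 1.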
16. Example N3.
MySQL (C++)
A lot of similar lines. It
should be fine.
static int rr_cmp(uchar *a, uchar *b)
{
if (a[0] != b[0])
return (int)a[0] - (int)b[0];
if (a[1] != b[1])
return (int)a[1] - (int)b[1];
if (a[2] != b[2])
return (int)a[2] - (int)b[2];
if (a[3] != b[3])
return (int)a[3] - (int)b[3];
if (a[4] != b[4])
return (int)a[4] - (int)b[4];
if (a[5] != b[5])
return (int)a[1] - (int)b[5];
if (a[6] != b[6])
return (int)a[6] - (int)b[6];
return (int)a[7] - (int)b[7];
}
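A sketch of how the copy-paste could be avoided, assuming the intent was to compare eight bytes in order (the a[1] in the a[5] branch looks like the typo). The name `rr_cmp_loop` is hypothetical; this is not the MySQL fix.

```cpp
#include <cassert>

typedef unsigned char uchar;

// Compares two 8-byte keys byte by byte and returns the difference of
// the first pair of bytes that differ, or 0 if all eight are equal.
int rr_cmp_loop(const uchar* a, const uchar* b)
{
    for (int i = 0; i < 8; ++i) {
        if (a[i] != b[i])
            return (int)a[i] - (int)b[i];
    }
    return 0;
}
```

Since the index appears only once in the loop body, there is no sixth line to get wrong.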
18. PVS-Studio is coming to the aid
G3D Content Pak (C++)
bool Matrix4::operator==(const Matrix4& other) const {
if (memcmp(this, &other, sizeof(Matrix4) == 0)) {
return true;
}
....
}
V575 The 'memcmp' function processes '0' elements. Inspect
the 'third' argument. graphics3D matrix4.cpp 269
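What goes wrong here: `sizeof(Matrix4) == 0` is false, so memcmp is asked to compare 0 bytes, returns 0, and the `if` branch is never taken. Below is a minimal sketch of the intended call, using a hypothetical plain-data `Mat4` struct instead of the real G3D class.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical plain-data matrix used only for illustration.
struct Mat4 {
    float elt[4][4];
};

// Bytewise equality: sizeof(Mat4) is the third argument of memcmp,
// and the comparison with 0 is applied to memcmp's result.
bool sameBytes(const Mat4& a, const Mat4& b)
{
    return std::memcmp(&a, &b, sizeof(Mat4)) == 0;
}
```

Note that a bytewise comparison treats +0.0f and -0.0f as different, so an element-wise comparison may be the better choice for floating-point data.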
19. PVS-Studio is coming to the aid
It detects errors in all the previous cases:
1. V3021 There are two 'if' statements with identical conditional expressions.
The first 'if' statement contains method return. This means that the
second 'if' statement is senseless. SourceLocation.cs 156
2. V501 There are identical sub-expressions to the left and to the right of the
'>' operator: i2->pid > i2->pid brlock.c 1901
3. V525 The code containing the collection of similar blocks. Check items '0',
'1', '2', '3', '4', '1', '6' in lines 680, 682, 684, 689, 691, 693, 695. sql
records.cc 680
4. V549 The first argument of 'stricmp' function is equal to the second
argument. ishader.h 2089
20. Last line effect
• About mountain climbers;
• The statistics were gathered from our
error database when it contained about 1500 error
examples.
• 84 suitable fragments were detected.
• In 43 cases the mistake was in the last
line.
23. Example N3.
Qt (C++)
.....::method_getImageData(.....) {
....
qreal x = ctx->callData->args[0].toNumber();
qreal y = ctx->callData->args[1].toNumber();
qreal w = ctx->callData->args[2].toNumber();
qreal h = ctx->callData->args[3].toNumber();
if (!qIsFinite(x) || !qIsFinite(y) ||
!qIsFinite(w) || !qIsFinite(w))
....
}
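One way to sidestep the last line effect in checks like this is to stop repeating the call by hand. A rough sketch, assuming plain doubles and std::isfinite instead of Qt's qreal and qIsFinite; the helper name `allFinite` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <initializer_list>

// Returns true only if every value in the list is finite
// (neither infinity nor NaN).
bool allFinite(std::initializer_list<double> values)
{
    for (double v : values) {
        if (!std::isfinite(v))
            return false;
    }
    return true;
}
```

A call such as `allFinite({x, y, w, h})` names each variable exactly once, so a forgotten `h` is easier to spot.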
24. Example N4.
Space Engineers (C#)
void DeserializeV0(XmlReader reader)
{
....
if (property.Name == "Rotation" ||
property.Name == "AxisScale" ||
property.Name == "AxisScale")
continue;
....
}
25. PVS-Studio is coming to the aid
Xamarin.Forms (C#)
internal bool IsDefault
{
get { return Left == 0 && Top == 0 &&
Right == 0 && Left == 0; }
}
V3001 There are identical sub-expressions 'Left == 0' to the
left and to the right of the '&&' operator. Thickness.cs 29
26. PVS-Studio is coming to the aid
It detects errors in all the previous cases:
1. V537 Consider reviewing the correctness of 'y' item's usage. g3dlib
vector3int32.h 77
2. V525 The code containing the collection of similar blocks. Check items
'SetX', 'SetY', 'SetZ', 'SetZ' in lines 455, 456, 457, 458. Client (HL2)
networkvar.h 455
3. V501 There are identical sub-expressions '!qIsFinite(w)' to the left and to
the right of the '||' operator. qquickcontext2d.cpp 3305
4. V3001 There are identical sub-expressions 'property.Name == "AxisScale"'
to the left and to the right of the '||' operator. Sandbox.Graphics
MyParticleEmitter.cs 352
27. Let’s take a dark break: the compiler is to
blame for everything!
Ffdshow
TprintPrefs::TprintPrefs(....)
{
memset(this, 0, sizeof(this)); // This doesn't seem to
// help after optimization.
dx = dy = 0;
isOSD = false;
xpos = ypos = 0;
align = 0;
....
}
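The compiler is innocent: `sizeof(this)` is the size of a pointer, so only the first 4 or 8 bytes of the object are zeroed. A minimal sketch of the intended call, using a hypothetical `Prefs` struct rather than the real ffdshow class; zeroing with memset is only reasonable for trivially copyable types.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the preferences class.
struct Prefs {
    int dx, dy;
    int xpos, ypos;
    int align;
    bool isOSD;
    char fontName[64];

    // sizeof(*this) is the size of the whole object, not of the pointer.
    void reset() { std::memset(this, 0, sizeof(*this)); }
};

static_assert(std::is_trivially_copyable<Prefs>::value,
              "memset is only safe for trivially copyable types");
static_assert(sizeof(Prefs) > sizeof(Prefs*),
              "the object is larger than a pointer to it");
```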
28. It only seems that people
verify the pointers
(references) against null
• In fact, the programs are not ready to
face nullptr/null;
• This is the most common error that we
find in both C++ and in C# projects.
29. Example N1.
Linux kernel (C)
static int tc_ctl_action(struct sk_buff *skb,
struct nlmsghdr *n)
{
struct net *net = sock_net(skb->sk);
struct nlattr *tca[TCA_ACT_MAX + 1];
u32 portid = skb ? NETLINK_CB(skb).portid : 0;
....
}
The function got an argument: skb. It is dereferenced in skb->sk.
Oops, it should be checked too: skb ? ... : 0.
30. Example N2.
These bugs have ALWAYS been there. Taken from Cfront compiler, year 1985:
Pexpr expr::typ(Ptable tbl)
{
....
Pclass cl;
....
cl = (Pclass) nn->tp;
cl->permanent=1;
if (cl == 0) error('i',"%k %s'sT missing",CLASS,s);
....
}
31. Example N3.
Nothing has changed for the past 30 years. Contemporary Clang compiler:
Instruction *InstCombiner::visitGetElementPtrInst(....) {
....
Value *StrippedPtr = PtrOp->stripPointerCasts();
PointerType *StrippedPtrTy =
dyn_cast<PointerType>(StrippedPtr->getType());
if (!StrippedPtr)
return 0;
....
}
32. Example N4.
C# projects are no better. In the source code of 270 controls written by
DevExpress we found 460 errors of this kind (1.7 errors per project). Example:
public IList<ISeries> CreateBindingSeries(....) {
DataBrowser seriesBrowser = CreateDataBrowser(....);
....
int currentPosition = seriesBrowser.Position;
if (seriesBrowser != null &&
seriesBrowser.Position >= 0)
....
}
33. PVS-Studio is coming to the aid
Unreal Engine 4 (C++)
FName UKismetNodeHelperLibrary::GetEnumeratorName(
const UEnum* Enum, uint8 EnumeratorValue)
{
int32 EnumeratorIndex = Enum->GetIndexByValue(EnumeratorValue);
return (NULL != Enum) ?
Enum->GetEnum(EnumeratorIndex) : NAME_None;
}
V595 The 'Enum' pointer was utilized before it
was verified against nullptr. Check lines: 146, 147.
kismetnodehelperlibrary.cpp 146
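The usual remedy for V595 is to move the check above the first use. A sketch with hypothetical types (`EnumInfo` and `enumeratorName` are illustrations, not the Unreal Engine API):

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal enum descriptor with three named values.
struct EnumInfo {
    std::string names[3];

    // Returns the index for a value, or -1 if the value is unknown.
    int indexByValue(unsigned char v) const { return v < 3 ? v : -1; }
};

std::string enumeratorName(const EnumInfo* info, unsigned char value)
{
    // Verify the pointer first, then dereference it.
    if (info == nullptr)
        return "None";
    const int index = info->indexByValue(value);
    return index >= 0 ? info->names[index] : "None";
}
```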
34. PVS-Studio is coming to the aid
It detects errors in all the previous cases:
1. V595 The 'skb' pointer was utilized before it was verified against nullptr.
Check lines: 949, 951. act_api.c 949
2. V595 The 'cl' pointer was utilized before it was verified against nullptr.
Check lines: 927, 928. expr.c 927
3. V595 The 'StrippedPtr' pointer was utilized before it was verified against
nullptr. Check lines: 918, 920. LLVMInstCombine instructioncombining.cpp
918
4. V3095 The 'seriesBrowser' object was used before it was verified against
null. Check lines: 509, 510.
DevExpress.Charts.Core BindingProcedure.cs 509
35. What does a “normal”
programmer think about a code
analyzer?
Myths and stereotypes
36. Laziness is on my side
• "It is hard to start using static analysis, because
of the large number of messages at the first
stage."
37. PVS-Studio is coming to the aid:
markup base
• Old messages can be marked as "uninteresting". This is a key point
when you embed the code analyzer into a real project.
38. All settings turned to the maximum!
• “The more messages the analyzer issues, the
better the analyzer”
39. “The first 10 messages”
• People’s attention weakens very quickly.
• The analyzer must take this into account.
• Default settings are chosen in such a way that you have
maximum chances to see the error immediately.
40. The hardest part about static analysis:
not to issue warnings
• C++: 105 open source projects
• C#: 36 open source projects
• Example V501
41. V501.
An infix operation is considered dangerous if
its left and right operands are the same.
while (X < X)
if (A == B || A == B)
42. V501. The devil is in the details
• X*X
• while (*p++ == *a++ && *p++ == *a++)
• There are number literals to the left and to the right
if (0 == 0)
… 15 | 15 …
• #define M1 100
#define M2 100
if (x == M1 || x == M2)
• float x = foo();
if (x == x)
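The last exception deserves a note: for floating-point values, `x == x` is a well-known way to test for NaN, because NaN compares unequal to everything, including itself. A small sketch (the function name is hypothetical):

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Returns true for NaN using the self-comparison idiom:
// NaN is the only floating-point value for which x == x is false.
bool isNanBySelfCompare(float x)
{
    return !(x == x);
}
```

This relies on IEEE 754 comparison semantics, so it can break under aggressive floating-point optimization flags; std::isnan states the intent more directly.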
43. V501. The devil is in the details
• '/' or '-' applied to numeric constants: 1./1.
• A line from zlib:
if (opaque) items += size - size; /* make compiler happy */
• rand() - rand()
rand() % N - rand() % N
• There are classes to the left and right of '|', '&', '^', '%'.
if (str == str): report it
if (vect ^ vect): we’d better skip
• sizeof(__int64) < sizeof(__int64)
44. V501. The devil is in the details
• 0 << 31 | 0 << 30 | ...
(0 << 6) | (0 << 3) | …
• '0' == 0x30 && 'A' == 0x41 && 'a' == 0x61
• This is a template function to define NaN numbers.
• Read(x) && Read(x)
• #define USEDPARAM(p) ((&p) == (&p)) and others
• To the right and left there is a function call with such names as
pop, _pop
• Etc …
46. PVS-Studio is coming to the aid:
Ability to work with the list of messages.
• Filters by the code of the message;
• Filters by the message text;
• Filters by the name of a file or a folder;
• False alarm markup in the code
(Mark As False Alarm: //-V501), including macros;
• 100 messages for an .h-file.
• Interactivity is super important!
47. PVS-Studio is coming to the aid:
Different ways to run the analyzer
• Integration with IDE;
• A separate application;
• Monitoring of the compiler;
• Command line version;
• Integration with nightly builds;
• IncrediBuild Support.
48. Static analysis is not a panacea
• This is an answer to the question: “What else can I do to improve the
quality of the code?”
49. On the topic of programming culture in Russia and
in the world, or “Why should I care about static
analysis at all?”
• Western developers have been using static analysis quite successfully for a long time.
• Knowing the principles and tools of static code analysis gives you +10
points at a job interview and +20 when you introduce it in
your project. On top of that: a Team Leader position.
• Where else can we find articles about static code analysis?
50. Q&A
• Contact: evg@viva64.com
• Follow us on twitter: https://twitter.com/Code_Analysis
• Visit the site: www.viva64.com
• Come and talk to us during the conference (mostly, we are friendly
people and won’t bite you, we promise)