1. Jun. 30 IJASCSE Vol 1 Issue 1 2012
A LITERATURE SURVEY ON VARIOUS SOFTWARE COMPLEXITY
MEASURES
Anurag Bhatnagar, Nikhar Tak, Shweta Shukla
Abstract - The word "complex" derives from the Latin word “complexus”,
which signifies "entwined" or "twisted together". This may be interpreted
in the following way: in order to have a complex we need two or more
components, which are joined in such a way that it is difficult to
separate them. Similarly, the Oxford Dictionary defines something as
"complex" if it is "made of (usually several) closely connected parts".
Software systems change over time, and understanding and measuring the
effect of these changes is a very complex problem in most modern software
systems, which consist of thousands of program modules. Thus one of the
central problems in software engineering is its inherent complexity.
This paper is a survey of code based complexity measures, such as the
Halstead Complexity Measure and McCabe's Cyclomatic Complexity Measure,
and cognitive based complexity measures, such as the KLCID Complexity
Measure, the Cognitive Functional Complexity Measure and the Cognitive
Information Complexity Measure.

Keywords - Software complexity, code based complexity measures,
cognitive based complexity measures.

1. INTRODUCTION TO COMPLEXITY

When the interacting system is a computer, complexity is defined by the
execution time and storage required to perform the computation. When the
interacting system is a programmer, complexity is defined by the
difficulty of performing tasks such as coding, debugging, testing, or
modifying the software. The term software complexity is often applied to
the interaction between a program and a programmer working on some
programming task. Complexity measures focus on designs and actual code.
They assume there is a direct correlation between design complexity and
design errors, and between code complexity and latent defects. By
recognizing the properties of each that correlate with their complexity,
we can identify those high risk applications that should either be
revised or subjected to additional testing. The software properties that
correlate with complexity are size, interfaces among modules (usually
measured as fan-in, the number of modules invoking a given application,
or fan-out, the number of modules invoked by a given application), and
structure (the number of paths within a module). Complexity metrics help
determine the number and type of tests needed to cover the design
(interfaces or calls) or coded logic (branches and statements). There are
several accepted methods for measuring complexity, most of which can be
calculated by using automated tools. Basili defines complexity as a
measure of the resources expended by a system while interacting with a
piece of software to perform a given task [1]. Software complexity
measures are based on program code, disregarding comments and stylistic
attributes such as indentation and naming conventions. Measures typically
depend on program size, control structure, or the nature of module
interfaces.

Software complexity can be measured by direct measures, also known as
internal attributes, and indirect measures, also known as external
attributes. Direct measures are measured directly, for example cost,
effort, LOC, speed and memory. Indirect measures cannot be measured
directly, for example functionality, quality, complexity, efficiency,
reliability and maintainability.

"Open reengineering" provides a criterion for metrics selection. "Open
systems" are popular for commercial software applications today because
the user is guaranteed a certain level of interoperability: the
applications work together in a common framework, and applications can be
ported across hardware platforms with minimal impact. Open reengineering
consists of an abstract model which states that the measurement of
software systems must be as independent as possible of source code
formatting and programming language. The objective is to be able to set
complexity standards and interpret the resultant numbers uniformly across
projects and languages. Once we calculate the complexity, the value
should be the same whether the code is in Ada, FORTRAN, or some other
programming language. One of the most basic but important complexity
measures, the number of lines of code (LOC), does not meet the open
reengineering criterion, as it is extremely sensitive to programming
language, coding style, and textual formatting of the source code. The
cyclomatic complexity measure is an example of open reengineering, as it
measures the amount of decision logic in a source code function.
Cyclomatic complexity is entirely independent of text formatting and is
also independent of programming language
since the same fundamental decision structures are available and
uniformly used in all procedural programming languages.

2. CODE BASED COMPLEXITY MEASURES

Code based complexity measures, as the name indicates, are based on the
code of the program and not on its comments and stylistic attributes.
Code based measures typically depend on program size, program flow
graphs, or module interfaces; examples are Halstead's software science
metrics [2] and the most widely known measure, the cyclomatic complexity
developed by McCabe [3]. However, Halstead's software metrics purely
count operators and operands and do not consider the internal structures
of modules, while McCabe's cyclomatic complexity does not consider the
I/Os of a system, as it is based on the flow graph of the program: it
derives the complexity of the program from the nodes and edges of that
graph.

2.1 HALSTEAD COMPLEXITY MEASURE

A set of software metrics was developed by Maurice Halstead in 1977 to
find the complexity of procedural programs. It is used to measure a
software component's complexity directly from source code, with emphasis
on computational complexity [2]. Halstead's measures use the distinct and
total numbers of operators and operands: they count the operators (the
operations of a programming language that may change or move the
operands) and the operands (the variables, constants, and addresses on
which operations are performed) in a software component in order to
measure complexity. Halstead makes the observation that metrics of the
software should reflect the implementation or expression of algorithms in
different languages, but be independent of their execution on a specific
platform. These metrics are therefore computed statically from the code.
Because the Halstead complexity measure uses the code of the software to
calculate complexity, it is known as a code based complexity measure. The
goal of the Halstead complexity measure was to identify measurable
properties of software or code, and the relations between them. The main
limitation of this method is that it does not distinguish the differences
among the same operators and among the same operands in a program.
Besides, it ignores nested structure and fails to analyze the case
statement
when code is not accessible. Some of the important parameters used in
Halstead's measures are based on four scalar numbers of a program. All
these parameters are shown in Table 1.

Table - 1. Meta Measures in Halstead Complexity Measure

Factor   Meta Measure   Description
n1       $operators     The number of distinct operators
n2       $operands      The number of distinct operands
N1       $operators     The total number of operators
N2       $operands      The total number of operands

The measures that can be derived from Table 1 are shown in Table 2.
Halstead's software metrics were developed in the context of assembly
language programs and go into very fine details of measurement. The
metrics identified the need for three types of measures for software:
basic, derived, and estimated measures. However, some of the measures and
constants are arguable, e.g. T = E/18 and B = E^(2/3) / 3000, and the
physical meanings of the measures are not very clear.

Table - 2. Derived Measures of Halstead's Software Metrics

Measure              Symbol   Formula
Program length       N        N = N1 + N2
Program vocabulary   n        n = n1 + n2
Volume               V        V = N * log2(n)
Difficulty           D        D = (n1 / 2) * (N2 / n2)
Effort               E        E = D * V
Time                 T        T = E / 18
Total bugs           B        B = E^(2/3) / 3000

2.2 MCCABE'S CYCLOMATIC COMPLEXITY

In 1976, Thomas J. McCabe provided a method that calculates cyclomatic
complexity from the control flow graph of a program. McCabe's complexity
measure is a mathematical technique for calculating the logical
complexity of a computer program. Cyclomatic complexity, also known as
conditional complexity, is a software metric used to indicate the
complexity of a program. It directly measures the number of linearly
independent paths through a program's source code. Cyclomatic complexity
is computed
using the control flow graph of the program: the nodes of the graph
correspond to indivisible groups of commands of a program, and a directed
edge connects two nodes if the second command might be executed
immediately after the first command. Cyclomatic complexity may also be
applied to individual functions, modules, methods or classes within a
program. A number representing its logical weight can be used to show the
complexity of a computer program. This quantitative complexity number
depends on a program's decision structure, or the number of basic paths
through the program, but it is independent of the program's size.
Complexity can assume a value from one to infinity; a reasonable upper
limit of intellectual manageability has been placed by McCabe at ten.
Software should be tested for complexity once it is created, and when the
complexity exceeds ten, sub-functions should be given their own
procedures or the software should be redone. The theoretical basis for
McCabe's complexity measure is graph theory. The following connection
exists between graph theory and computer programs. Each node in the graph
corresponds to a block of code in the program where the flow is
sequential, and the arcs correspond to branches taken in the program.
Thus, all computer programs may be expressed as graphs or, to be precise,
"program control graphs." Thus mathematical analysis may be applied to
computer programs in order to calculate complexity. Graph theory allows
such a graph to yield a quantitative cyclomatic complexity number via the
formula -

V(F) = E - N + 2

F is the flow graph of the code.
N is the number of vertices (nodes).
E is the number of edges.

Figure - 1. McCabe Cyclomatic Complexity Example.

Here we have
E = No. of Edges = 7
N = No. of Nodes = 5.
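The formula V(F) = E - N + 2 can be illustrated with a short C sketch. The edge list below is a hypothetical flow graph chosen only because it has the same counts as the example (7 edges, 5 nodes); it is not the graph of Figure 1, which is not reproduced here. The names `edge`, `node_count`, `cyclomatic_complexity` and `example_graph` are illustrative, and the sketch assumes a connected graph whose nodes are numbered from 0 without gaps.

```c
#include <stddef.h>

/* A directed edge between two nodes of a control flow graph. */
struct edge {
    int from;
    int to;
};

/* Number of nodes, assuming they are numbered 0 .. N-1 with no gaps. */
size_t node_count(const struct edge *edges, size_t edge_count)
{
    int highest = -1;
    for (size_t i = 0; i < edge_count; i++) {
        if (edges[i].from > highest)
            highest = edges[i].from;
        if (edges[i].to > highest)
            highest = edges[i].to;
    }
    return (size_t)(highest + 1);
}

/* V(F) = E - N + 2 for a connected flow graph. */
int cyclomatic_complexity(const struct edge *edges, size_t edge_count)
{
    size_t nodes = node_count(edges, edge_count);
    return (int)edge_count - (int)nodes + 2;
}

/* Hypothetical graph with 5 nodes and 7 edges (not the graph of Figure 1). */
static const struct edge example_graph[] = {
    {0, 1}, {0, 2}, {1, 2}, {1, 3}, {2, 3}, {2, 4}, {3, 4},
};
```

Calling `cyclomatic_complexity(example_graph, 7)` evaluates 7 - 5 + 2. Deriving N from the edge list rather than passing it separately is a design choice of this sketch; it keeps the two counts from drifting apart when the graph is edited.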
Cyclomatic Complexity
V(F) = 7 - 5 + 2 = 4

3. COGNITIVE COMPLEXITY MEASURES

Cognitive complexity measures quantify the human effort needed to perform
a task or the difficulty of understanding the software. A cognitive
complexity measure is an attempt to quantify the effort or degree of
difficulty in understanding the software, based on a cognitive
informatics foundation. The cognitive complexity of software depends on
three fundamental factors: inputs, outputs, and internal processing.
Cognitive complexity measures consider all the factors affecting the
effort of comprehending the software, e.g. data objects such as inputs,
outputs and variables, and loops and branches, when evaluating
complexity; but if the factors are not carefully thought out and
organized, this may cause trouble in calculating complexity. Some of the
factors considered in calculating the complexity of the software are
separately derived weights. Such a calculation ignores the relationships
among the factors of the modules and has little relevance to the human
cognitive process of comprehending the code of the program.

3.1 KLCID COMPLEXITY METRICS

Klemola and Rilling proposed the KLCID based complexity measure in 2004.
It treats identifiers as programmer defined variables and uses the
density of identifiers (ID) in the software as it is built up.

ID = Total no. of identifiers / LOC

In order to calculate KLCID, we need to find the number of unique lines
of code in a module; lines that have the same type and kind of operands
with the same arrangement of operators are considered equal. It defines
KLCID as -

KLCID = No. of identifiers in the set of unique lines / No. of unique
lines containing identifiers

This is a time consuming method, as each line of code must be compared
with every other line of the program. KLCID assumes that the internal
control structures of different software are identical.

3.2 COGNITIVE FUNCTIONAL SIZE (CFS)

Cognitive Functional Size (CFS) [1] was defined by Wang as -

CFS = (Ni + No) * Wc

where Ni = number of inputs,
No = number of outputs,
Wc = the total cognitive weight of basic control structures (BCSs).
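A minimal C sketch of the CFS formula follows. The weights are those of the cognitive weights table of this paper; the type and function names are illustrative, and the sketch assumes a flat list of BCSs with no nesting, so that Wc is the plain sum of the individual weights.

```c
#include <stddef.h>

/* Basic control structures (BCSs), in the order of the weights table. */
enum bcs {
    BCS_SEQUENCE,
    BCS_IF_THEN_ELSE,
    BCS_CASE,
    BCS_FOR_DO,
    BCS_REPEAT_UNTIL,
    BCS_WHILE_DO,
    BCS_FUNCTION_CALL,
    BCS_RECURSION,
    BCS_PARALLEL,
    BCS_INTERRUPT,
    BCS_KINDS
};

/* Cognitive weight of each BCS kind. */
static const int bcs_weight[BCS_KINDS] = {1, 2, 3, 3, 3, 3, 2, 3, 4, 4};

/* Wc for a flat (non-nested) list of BCSs: the sum of their weights. */
int total_cognitive_weight(const enum bcs *structures, size_t count)
{
    int wc = 0;
    for (size_t i = 0; i < count; i++)
        wc += bcs_weight[structures[i]];
    return wc;
}

/* CFS = (Ni + No) * Wc */
int cognitive_functional_size(int inputs, int outputs, int wc)
{
    return (inputs + outputs) * wc;
}
```

For a program made of one sequence and one while loop, `total_cognitive_weight` gives 1 + 3 = 4; with one input and three outputs, `cognitive_functional_size(1, 3, 4)` gives (1 + 3) * 4 = 16.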
Wc is defined as the sum of the cognitive weights of its q linear blocks
composed of individual BCSs. Since each block may consist of m layers of
nested BCSs, and each layer of n linear BCSs, then -

$$W_c = \sum_{j=1}^{q} \left[ \prod_{k=1}^{m} \sum_{i=1}^{n} w_c(j, k, i) \right]$$

Table 3. Cognitive Weights (Wc) of BCSs

Category             BCS                   Wc
Sequence             Sequence (SEQ)        1
Branch               If-Then-Else (ITE)    2
                     Case (CASE)           3
Iteration            For-do (Ri)           3
                     Repeat-Until (Ri)     3
                     While-do (Ro)         3
Embedded Component   Function call (FC)    2
                     Recursion (REC)       3
Concurrency          Parallel (PAR)        4
                     Interrupt (INT)       4

Cognitive Functional Size began the study of software complexity
measurement on a cognitive informatics foundation [6], and states that
the complexity of software depends on its inputs, outputs, and internal
processing.

3.3 COGNITIVE INFORMATION COMPLEXITY MEASURE

Cognitive informatics plays an important role in understanding the
fundamental characteristics of software. Wang [8] defined information as
the third essence in modelling the natural world, supplementing matter
and energy. Wang [9] also defined software as “Software in cognitive
informatics is perceived as formally described design information and
implementations instructions of computing application”, i.e.

Software ≃ Information

This states that software is equivalent to information, which implies
that

Difficulty in understanding software ≃ Difficulty in understanding
information

Software is computational information and a mathematical entity. In other
words, software is a piece of information that consists of identifiers,
which represent the information, and operators, which represent the
functions to be performed.

Software = fun(Identifiers, Operators)
Identifiers can be variable names, defined constants or other labels in
software. Thus information can be defined as -

Definition 1: Information can be represented by LOC. This means that
different operators and identifiers can be used to express information.
Thus the information contained in the kth line of code is -

$$I_k = (\text{Identifiers} + \text{Operators})_k = (ID_k + OP_k)\ \text{IU}$$

where ID = total number of identifiers in the kth LOC of the software,
OP = total number of operators in the kth LOC of the software,
IU = Information Unit, representing that any identifier or operator
carries one unit of information.

Definition 2: The overall information contained in software (Itotal) is
the sum of the information contained in each line of code, where LOCS is
the number of lines of code of the software, i.e.

$$I_{total} = \sum_{k=1}^{LOCS} I_k$$

Once we have established that software can be comprehended in terms of
information units (IUs), a measure of the complexity of the software
should incorporate the above parameters. Based on this fact, the weighted
information count is introduced in Definition 3.

Definition 3: The Weighted Information Count of a line of code (WICL) of
software is a function of identifiers, operators and LOC and is defined
as -

$$WICL_k = I_k / (LOCS - k)$$

where WICLk = weighted information count for the kth line,
Ik = information contained in the software for the kth line.
So the Weighted Information Count of the Software (WICS) is defined as -

$$WICS = \sum_{k=1}^{LOCS} WICL_k$$

The internal control structure of the software, i.e. its cognitive
weight, must be considered in order to provide a complete and robust
complexity measure.

Definition 4: The sum of the cognitive weights of basic control
structures (SBCS) is defined as follows. Let W1, W2, ..., Wn be the
cognitive weights of the basic control structures [8] in the software;
then -

$$SBCS = \sum_{i=1}^{n} W_i$$

Definition 5: The Cognitive Information Complexity Measure (CICM) is
defined as -

CICM = WICS * SBCS
Thus, with CICM = WICS * SBCS, we have an approach to measure the amount
of information contained in the software, which enables us to calculate
the coding efficiency (EI).

Definition 6: The Efficiency of Information Coding (EI) of software is
defined as -

EI = Itotal / LOCS

Now, to analyse the KLCID, CFS and CICM complexity measures, we take an
algorithm that calculates the average of a set of N numbers and apply all
these methods to it to calculate the complexity.

    #include <stdio.h>   /* needed to compile; not one of the 17 lines analysed */

    #define N 10
    int main(void)
    {
    int total;
    float sum, avg, num;
    sum = 0;
    total = 0;
    while (total < N)
    {
    scanf("%f", &num);
    sum = sum + num;
    total = total + 1;
    }
    avg = sum / N;
    printf("N=%d sum = %f", N, sum);
    printf("average = %f", avg);
    }

Calculation of KLCID complexity measure -
Total no. of identifiers in the above program = 18
Total no. of lines of code = 17
ID = 18/17 = 1.06
No. of unique lines containing identifiers = 9
No. of identifiers in the set of unique lines = 11
KLCID = 11/9 = 1.22

Calculation of CFS -
Number of inputs Ni = 1
Number of outputs No = 3
BCS (sequence) W1 = 1
BCS (while) W2 = 3
Wc = W1 + W2 = 1 + 3 = 4
CFS = (Ni + No) * Wc = (1 + 3) * 4 = 16

Calculation of CICM -
LOC = 17
Total no. of identifiers = 18
Total no. of operators = 4
BCS (sequence) W1 = 1
BCS (while) W2 = 3
SBCS = W1 + W2 = 1 + 3 = 4
WICS = [1/16 + 1/13 + 3/12 + 1/11 + 1/10 + 3/9 + 1/7 + 4/6 + 3/5 + 4/3]
     = 3.66
CICM = WICS * SBCS = 3.6565 * 4 = 14.63
Information Coding Efficiency (EI) of the above program = 22/17 = 1.29
The cognitive information complexity of the given algorithm is 14.63 CICU
(Cognitive Information Complexity Unit).
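The CICM calculation for the sample program can be re-evaluated with a few lines of C. The table below is read directly off the ten terms of the WICS sum, taking each numerator as I_k and each denominator as LOCS - k with LOCS = 17; the line numbers k are therefore inferred from the terms, and all names are illustrative.

```c
#include <stddef.h>

#define LOCS 17   /* lines of code in the sample program */
#define SBCS 4    /* W1 + W2 = 1 + 3 */

/* One term of the WICS sum: line number k and its information I_k. */
struct term {
    int k;
    int info;
};

/* Numerator is I_k, denominator is LOCS - k. */
static const struct term terms[] = {
    {1, 1}, {4, 1}, {5, 3}, {6, 1}, {7, 1},
    {8, 3}, {10, 1}, {11, 4}, {12, 3}, {14, 4},
};
static const size_t term_count = sizeof terms / sizeof terms[0];

/* I_total: information summed over all lines. */
int total_information(void)
{
    int total = 0;
    for (size_t i = 0; i < term_count; i++)
        total += terms[i].info;
    return total;
}

/* WICS = sum of I_k / (LOCS - k). */
double example_wics(void)
{
    double sum = 0.0;
    for (size_t i = 0; i < term_count; i++)
        sum += (double)terms[i].info / (double)(LOCS - terms[i].k);
    return sum;
}

/* CICM = WICS * SBCS. */
double example_cicm(void)
{
    return example_wics() * SBCS;
}

/* EI = I_total / LOCS. */
double example_ei(void)
{
    return (double)total_information() / LOCS;
}
```

Evaluated in double precision, the terms as listed give I_total = 22, WICS of about 3.66, CICM of about 14.63 and EI of about 1.29.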
4. CONCLUSION

In this paper we have studied various software complexity measures along
with appropriate examples. The code based complexity measures that we
have studied, the Halstead Complexity Measure and McCabe's Cyclomatic
Complexity Measure, are based on the code of the program, so they can be
applied only after the coding phase has been completed. McCabe's
cyclomatic complexity measure is based on the flow graph of the program
from which the code is generated, so it is also classified as a code
based complexity measure. Cognitive based complexity measures such as the
KLCID Complexity Measure, the Cognitive Functional Complexity Measure and
the Cognitive Information Complexity Measure are also based on the code
of the program, so they too need the code of the procedural program to
which they are applied in order to compute the complexity. We can
therefore say that, for procedural programs, the time at which complexity
can be measured by both code based and cognitive based complexity
measures depends on the time needed to complete the coding phase.

REFERENCES

[1] B. Auprasert and Y. Limpiyakorn, "Structuring Cognitive Information
for Software Complexity Measurement", accepted for CSIE 2009, Los
Angeles, USA, April 2009.
[2] Halstead, M.H., Elements of Software Science, Elsevier North-Holland,
New York, 1977.
[3] McCabe, T.J., A Complexity Measure, IEEE Transactions on Software
Engineering, SE-2, 6, pp. 308-320, 1976.
[4] Kushwaha, D.S. and Misra, A.K., A Modified Cognitive Information
Complexity Measure of Software, ACM SIGSOFT Software Engineering Notes,
Vol. 31, No. 1, January 2006.
[5] Kushwaha, D.S. and Misra, A.K., A Complexity Measure based on
Information Contained in Software, Proceedings of the 5th WSEAS Int.
Conf. on Software Engineering, Parallel and Distributed Systems, Madrid,
Spain, February 15-17, 2006, pp. 187-195.
[6] Tuomas Klemola and Juergen Rilling, A Cognitive Complexity Metric
based on Category Learning, IEEE International Conference on Cognitive
Informatics (ICCI-04).
[5] Kushwaha, D.S. and Misra, A.K., Cognitive Information Complexity
Measure: A Metric Based on Information Contained in the Software, WSEAS
Transactions on Computers, Issue 3, Vol. 5, March 2006, ISSN: 1109-2750.
[6] Kushwaha, D.S. and Misra, A.K., Improved Cognitive Information
Complexity Measure: A metric that establishes program comprehension
effort, ACM SIGSOFT Software Engineering, Page 1, September 2006,
Volume 31, Number 5.
[7] IEEE Computer Society, IEEE Recommended Practice for Software
Requirement Specifications, New York, 1994.
[8] Wang, Y. and Shao, J., "On Cognitive Informatics", 1st IEEE
International Conference on Cognitive Informatics, pages 34-42,
August 2002.
[9] Wang, Y. and Shao, J., "Measurement of the Cognitive Functional
Complexity of Software", 3rd IEEE International Conference on Cognitive
Informatics (ICCI'04).