Shri Shivaji Education Society Amravati’s
COLLEGE OF ENGINEERING & TECHNOLOGY AKOLA
LAB MANUAL
Compiler Design Lab
Subject code: 5KS07
SEMESTER: V
Department of Computer Science & Engineering
College of Engineering & Technology, Akola
Affiliated to Sant Gadge Baba Amravati University,
Maharashtra, India.
PREFACE
A compiler is a program that reads a program in one language, the source language, and
translates it into an equivalent program in another language, the target language. The compiler
may produce an assembly-language program as its output.
Compiler design can be divided into two parts: analysis and synthesis. The analysis part
breaks up the source program into constituent pieces and imposes a grammatical structure on
them. The synthesis part constructs the desired target program from the intermediate
representation and the information in the symbol table. If we examine the compilation process in
more detail, we see that it operates as a sequence of phases: lexical analysis, syntax and semantic
analysis, intermediate code generation, code optimization, and code generation, each of which
transforms one representation of the source program into another.
The practicals of the Compiler Design Lab for CSE are divided into two parts: i)
implementation using a programming language and ii) implementation using tools. In the first part,
the practicals can be performed in any programming language, such as C or Python; C is
preferred. C is a middle-level language that combines low-level hardware control with
high-level programming capabilities. It is a procedural language built around functions and
pointers.
To implement the phases of a compiler, various tools are available on Linux systems. Lex is a
lexical analyzer generator, whereas YACC is a parser generator. Flex and Bison are more recent
implementations of Lex and YACC respectively. Windows versions of these tools are now also
available. Flex Windows (Lex and Yacc) contains the GnuWin32 ports of Flex and Bison,
which are used for generating lexical analyzers and parsers.
The Flex Windows package contains built-in gcc and g++ libraries and C and C++ compilers, which are
ported to Windows from Linux. Programming using tools is typically more interesting. This lab
manual covers various programs using tools. I hope that this manual will be useful for students.
Prof. Kalpana S. Gilda
List of practical
Subject: Compiler Design
Sr.No. Title Page no.
1 Program to check valid ‘c’ language identifier. 1
2 Program to check whether given string is keyword or not. 5
3 Program to count lines and characters in given input file. 8
4 Program to check parenthesis of expression is balanced or not. 10
5 Program for lexical analyzer which produces tokens for given expression. 12
6 Program for word recognizer using Lex. 16
7 Program for number system using Lex. 19
8 Program to count words, characters and lines using Lex. 21
9 Program for lexical analyzer using Lex. 23
10 Program for simple desk calculator using YACC. 25
Additional set of practicals
Sr.No. Title Page no.
1 Write a C program to eliminate single line comments from given input file. 29
2 Program for predictive parser. 31
3 Program to check if given grammar is left recursive and to remove left recursion. 36
4 Program code generator. 37
5 Program to check whether the string end with bb using Lex. 39
6 Program to count no. of vowels and consonants in a given input string using Lex. 40
7 YACC program for advance desk calculator by creating lexical analyzer using Lex. 41
Text Books:
[1] Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, "Compilers: Principles,
Techniques and Tools", Pearson Education, Second Edition.
Reference Books:
[1] Doug Brown, John Levine, and Tony Mason, "Lex & Yacc", O'Reilly & Associates, Inc., Second
Edition.
System Software Lab Manual
COET, Akola 1
Practical 1
Aim: Program to check valid ‘c’ language identifier.
Theory:
Identifiers
Identifiers are the names that are given to various program elements such as variables, symbolic
constants and functions. Identifiers can be freely named by following some restrictions.
Identifiers in C language
In C language identifiers are the names given to variables, constants and functions. These
identifiers are defined against a set of rules.
Rules for an identifier in C language
1. An identifier can only contain alphanumeric characters (a-z, A-Z, 0-9) and the underscore (_).
2. The first character of an identifier must be a letter (a-z, A-Z) or an underscore (_).
3. Identifiers are case sensitive in C. For example, name and Name are two different
identifiers in C.
4. Keywords cannot be used as identifiers.
5. No special characters, such as semicolon, period, whitespace, slash or comma, are permitted
in an identifier.
6. Only the first 31 characters of an identifier are guaranteed to be significant; the exact limit
depends on the C standard and the compiler in use.
Examples of valid C language identifiers:
_a12, abc_, pr_12, abc, then_next_val etc.
Examples of invalid C language identifiers:
1a, abc$12, ab c, abc# etc.
Functions Used in programming
• int isalpha(int c)
isalpha(c) is a C library function that checks whether the passed character is a letter. It
returns a non-zero value if it is a letter; otherwise it returns 0. For example, it returns non-zero
values for 'a' to 'z' and 'A' to 'Z' and zero for other characters.
Declaration
int isalpha(int c);
Parameters
c − This is the character to be checked.
Return Value
This function returns a non-zero value if c is a letter, else it returns 0.
• int isdigit(int c)
isdigit(c) is a C library function that checks whether the passed character is a digit. It
returns a non-zero value if it is a digit; otherwise it returns 0. For example, it returns a non-zero
value for '0' to '9' and zero for other characters.
Declaration
int isdigit(int c);
Parameters
c − This is the character to be checked.
Return Value
This function returns a non-zero value if c is a digit, else it returns 0.
• int isalnum(int c)
The function isalnum() checks whether a character is alphanumeric, that is, a letter or a digit.
It returns a non-zero value if the character is alphanumeric; otherwise it returns zero. Like
isalpha() and isdigit(), it is declared in the "ctype.h" header file.
Declaration
int isalnum(int c);
Parameters
c − This is the character to be checked.
Return Value
This function returns a non-zero value if c is a digit or a letter, else it returns 0.
// Program to check valid 'C' language identifier.
#include<stdio.h>
#include<string.h>
#include<ctype.h>
int main()
{
    char id[32];
    int i,flag,len;
    printf("\n Enter a string: ");
    if(fgets(id,sizeof(id),stdin)==NULL) // gets() is unsafe, fgets() limits the input length
        return 1;
    id[strcspn(id,"\n")]='\0'; // drop the newline kept by fgets()
    len=strlen(id);
    flag=0;
    // first character: letter or underscore
    if(isalpha((unsigned char)id[0]) || id[0]=='_')
    {
        // remaining characters: letter, digit or underscore
        for(i=1;i<len;i++)
        {
            if(!isalnum((unsigned char)id[i]) && id[i]!='_')
            {
                flag=1;
                break;
            }
        }
        if(flag==0)
            printf("\n%s is valid c-language identifier",id);
        else
            printf("\nInvalid identifier, %c char not allowed",id[i]);
    }
    else
        printf("\nERROR:Variable should begin with letter");
    return 0;
}
----------------------------------------------------------------------------------------------------------------------
//using switch-case
#include<stdio.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int main()
{
    char id[20];
    int i,state,len;
    printf("\n Enter a string: ");
    if(fgets(id,sizeof(id),stdin)==NULL) // gets() is unsafe, fgets() limits the input length
        return 1;
    id[strcspn(id,"\n")]='\0'; // drop the newline kept by fgets()
    len=strlen(id);
    state=0;
    while(1)
    {
        switch(state)
        {
        case 0: // start state: first character must be a letter or underscore
            if(isalpha((unsigned char)id[0]) || id[0]=='_')
                state=1;
            else
            {
                printf("\nERROR:Variable should start with alphabet");
                exit(0);
            }
            break;
        case 1: // remaining characters: letter, digit or underscore
            for(i=1;i<len;i++)
            {
                if(!isalnum((unsigned char)id[i]) && id[i]!='_')
                {
                    printf("\nERROR: Invalid identifier, %c char not allowed",id[i]);
                    exit(0);
                }
            }
            state=2;
            break;
        case 2: // accepting state
            printf("\n%s is valid c-language identifier",id);
            exit(0);
        }
    }
}
OUTPUT
Enter a string: abc_123
abc_123 is valid c-language identifier
Enter a string: 1fgj
ERROR: Variable should start with alphabet
Enter a string: af$d1
Invalid identifier, $ char not allowed
Practical 2
Aim: Program to check whether given string is keyword or not.
Theory:
Keywords
Keywords are reserved words that have a special, predefined meaning in the C language. This
meaning cannot be changed. There are 32 keywords in total in the C language (ANSI C), as
shown in Figure 1.
auto double int struct
break else long switch
case enum register typedef
const extern return union
char float short unsigned
continue for signed volatile
default goto sizeof void
do if static while
Figure1: List of keywords in C language.
String as Character array
A string is a sequence of characters that is treated as a single data item and terminated by the
null character '\0'. Remember that the C language does not support strings as a data type. A string
is actually a one-dimensional array of characters in C.
To declare the list of keywords, we need a two-dimensional array of characters. A 2D array
is also known as a matrix (a table of rows and columns), as shown in the following example.
f o r \0
i f \0
e l s e \0
d o \0
w h i l e \0
b r e a k \0
s w i t c h \0
c a s e \0
v o i d \0
s t r u c t \0
Figure2 : 2D character array for keywords
Functions used in programming
• strcmp():
strcmp() is a built-in library function that is used for string comparison. This function takes two
strings (arrays of characters) as arguments, compares them lexicographically, and returns zero, a
positive value, or a negative value as the result (not necessarily 1 or -1). It is declared in the
<string.h> header file with the following prototype.
Syntax of strcmp()
strcmp(first_str, second_str );
Parameters of strcmp()
This function takes two strings (array of characters) as parameters:
first_str: First string is taken as a pointer to the constant character (i.e. immutable string).
second_str: Second string is taken as a pointer to a constant character.
Return Value of strcmp()
The strcmp() function returns one of three kinds of values after comparing the two strings:
1. Zero ( 0 )
A value equal to zero is returned when both strings are identical, that is, all of the characters in
both strings are the same.
2. Greater than Zero ( > 0 )
A value greater than zero is returned when the first non-matching character in first_str has a greater
ASCII value than the corresponding character in second_str. In other words, if first_str comes
lexicographically after second_str, a value greater than zero is returned.
3. Less than Zero ( < 0 )
A value less than zero is returned when the first non-matching character in first_str has a lesser
ASCII value than the corresponding character in second_str. In other words, if first_str comes
lexicographically before second_str, a value less than zero is returned.
Program
//Program to identify whether given string is keyword or not
#include<stdio.h>
#include<string.h>
int iskey(const char *);
char keywrd[10][7]={"for","if","else","do","while","break","switch","case","void","struct"};
int main()
{
    char temp[10];
    printf("\nEnter string:");
    if(scanf("%9s",temp)!=1) // width limit keeps the input inside temp
        return 1;
    if(iskey(temp))
        printf("It is a KEYWORD");
    else
        printf("It is not a KEYWORD");
    return 0;
}
// returns 1 if temp matches an entry of keywrd, else 0
int iskey(const char *temp)
{
    int flag=0,i;
    for(i=0;i<10;i++)
    {
        if(strcmp(keywrd[i],temp)==0)
        {
            flag=1;
            break;
        }
    }
    return flag;
}
OUTPUT
Enter string: for
It is a KEYWORD
Enter string: make
It is not a KEYWORD
Practical 3
Aim: Program to count lines and characters in given input file.
Theory:
Introduction
In this program, we are going to count the number of characters and new lines present in the given
input file. If a character read from file is not a white space then we can increment character count.
White space can be determined by using function isspace().
Functions Used in programming
• isspace()
isspace() is a predefined C function used for character handling. It checks whether the given
character is a whitespace character (blank, tab, newline, vertical tab, form feed or carriage return).
It is declared inside the <ctype.h> header file.
Syntax of isspace()
isspace (character);
Parameters of isspace()
The isspace() function takes only one parameter of type int. It is the character to be tested.
Return Value of isspace()
The isspace() function returns an integer value that tells whether the passed parameter is a
whitespace character or not. The possible return values of isspace() function are:
If the character is a whitespace character, then the return value is non-zero.
If the character is not a whitespace character, then the return value is zero.
• fgetc()
fgetc() is used to read input from a file one character at a time. It returns the character at the
position indicated by the file pointer, as an unsigned char converted to int. After reading the
character, the file pointer is advanced to the next character. If the pointer is at end of file or if
an error occurs, EOF is returned by this function.
Syntax:
int fgetc(FILE *pointer)
pointer: pointer to a FILE object that identifies the stream on which the operation is to be
performed.
Program
//Program to count lines and characters in input file
#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
int main()
{
    int line_count=0,char_count=0;
    FILE *p;
    int ch; // int, so that EOF can be told apart from every character
    char fname[20];
    printf("\nEnter file name:");
    if(scanf("%19s",fname)!=1) // width limit keeps the input inside fname
        return 1;
    p=fopen(fname,"r");
    if(p==NULL)
        printf("\nError in opening the file");
    else
    {
        while((ch=fgetc(p))!=EOF)
        {
            if(isspace(ch))
            {
                if(ch=='\n') // only newlines are counted among whitespace
                    line_count++;
            }
            else
                char_count++;
        }
        fclose(p);
        printf("\nLine count = %d",line_count);
        printf("\nCharacter count = %d",char_count);
    }
    return 0;
}
OUTPUT
Enter file name: comment.cpp
Line count = 74
Character count = 715
Practical 4
Aim: Program to check parenthesis of expression is balanced or not.
Theory:
Introduction
There are two types of parentheses: opening and closing. A well-formed parenthesis string
is a balanced parenthesis string. A string of parentheses is called balanced if every opening
bracket has a matching closing bracket that comes after it, and the pairs are properly nested.
Characters such as '(', ')', '[', ']', '{', and '}' are considered brackets. In this program, we are
considering only round brackets.
Examples of balanced parenthesis
( )( ), (( )), (( )( )), ( ( ( ) ) ( ) )
Examples of unbalanced parenthesis
There are two possibilities of unbalanced parenthesis:
1. Extra or misplaced closing parenthesis
For example: ( ) ), ) ( ) , ) (
2. Extra opening parenthesis
For example: (( ), ( )(, ( ) ( ) (
Program
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
int main()
{
    int count=0,len,i;
    char expr[20];
    printf("\nEnter expression :");
    if(fgets(expr,sizeof(expr),stdin)==NULL) // gets() is unsafe, fgets() limits the input length
        return 1;
    len=strlen(expr);
    i=0;
    while(i<len)
    {
        if(expr[i]=='(')
            count++;
        else if(expr[i]==')')
        {
            count--;
            if(count<0) // ')' before its matching '(' is not allowed
            {
                printf("Error: misplaced/extra ')' brace");
                exit(0);
            }
        }
        i++;
    }
    if(count==0)
        printf("\nValid expression: balanced parenthesis");
    else // count>0: some '(' was never closed
        printf("\nError: unbalanced parenthesis, Extra ( brace");
    return 0;
}
output
Enter expression :())
Error: misplaced/extra ')' brace
Enter expression :((()())
Error: unbalanced parenthesis, Extra ( brace
Enter expression :()()(())
Valid expression: balanced parenthesis
Practical 5
Aim: Program for lexical analyzer.
Theory:
Role of lexical analyzer
As the first phase of a compiler, the main task of the lexical analyzer is to read the input characters
of the source program, group them into lexemes, and produce as output a sequence of tokens for
each lexeme in the source program.
Transition diagrams
Patterns are expressed using regular expressions. Regular-expression patterns are then converted to
transition diagrams. Following figure shows transition diagrams for relational operators, identifiers,
unsigned numbers and white space.
Figure: Transition diagrams for relational operators, identifiers, unsigned numbers and white space,
respectively.
Architecture of a Transition-Diagram-Based Lexical Analyzer
These transition diagrams are then converted into code. Let us consider the ways to implement
the entire lexical analyzer.
1. We could arrange for the transition diagrams for each token to be tried sequentially. Then, each
time the function fail() is called, it resets the pointer forward and starts the next transition
diagram.
2. We could run the various transition diagrams "in parallel," feeding the next input character to all
of them and allowing each one to make whatever transitions it requires.
3. We could combine all the transition diagrams into one. We allow the transition diagram to read
input until there is no possible next state, and then take the longest lexeme that matched any pattern,
as we discussed in case (2) above. This combination is easy, because no two tokens can start with the
same character; i.e., the first character immediately tells us which token we are looking for. Thus,
we could simply combine states 0, 9, 12, and 22 into one start state, leaving other transitions intact.
We are going to follow this third approach for the implementation of the lexical analyzer.
Functions used in the program
gets()
The C library function char *gets(char *str) reads a line from stdin and stores it into the string
pointed to by str. It stops either when the newline character is read or when the end-of-file is
reached, whichever comes first. It allows blank and tab to be a part of input string.
Declaration
char *gets(char *str)
Parameters
str − This is the pointer to an array of chars where the C string is stored.
Return Value
This function returns str on success, and NULL on error or when end of file occurs, while no
characters have been read.
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include<ctype.h>
int isop(char);
int main()
{
    int state,i,j;
    char ch,input[80],temp[80];
    printf("\n\n\nEnter expression :"); //input
    if(fgets(input,sizeof(input),stdin)==NULL) // gets() is unsafe, fgets() limits the input length
        return 1;
    state=0;
    i=0;
    do
    {
        switch(state)
        {
        case 0: // start state: the first character selects the diagram
            ch=input[i];
            if(isalpha((unsigned char)ch))
                state=1;
            else if(isdigit((unsigned char)ch))
                state=2;
            else if(isop(ch))
                state=3;
            else if(isspace((unsigned char)ch))
                state=4;
            else if(ch=='\0')
                state=5;
            else
                state=6;
            break;
        case 1: // identifier: letter followed by letters, digits or '_'
            j=0;
            temp[j]=input[i];
            do
            {
                i++; j++;
                if(isalnum((unsigned char)input[i]) || input[i]=='_')
                    temp[j]=input[i];
                else
                    break;
            }while(1);
            temp[j]='\0';
            printf(" <ID, %s>",temp);
            state=0;
            break;
        case 2: // unsigned number: one or more digits
            j=0;
            temp[j]=input[i];
            do
            {
                i++; j++;
                if(isdigit((unsigned char)input[i]))
                    temp[j]=input[i];
                else
                    break;
            }while(1);
            temp[j]='\0';
            printf(" <NUM, %s>",temp);
            state=0;
            break;
        case 3: // single-character operator
            printf(" <OP, %c>",input[i]);
            i++;
            state=0;
            break;
        case 4: // skip white space
            while(isspace((unsigned char)input[i]))
                i++;
            state=0;
            break;
        case 5: // end of input
            exit(0);
        case 6: // any other character
            printf(" cant recognize");
            i++;
            state=0;
            break;
        }
    }while(1);
    return 0;
}
// returns 1 if ch is one of the operators + - * / % =, else 0
int isop(char ch)
{
    if(ch=='+' || ch=='-' || ch=='*' || ch=='/' || ch=='%' || ch=='=')
        return 1;
    else
        return 0;
}
OUTPUT
Enter expression :abc_12 = pqr + 50$
<ID, abc_12> <OP, => <ID, pqr> <OP, +> <NUM, 50> cant recognize
Practical 6
Aim: Program for word recognizer using Lex.
Theory:
Lexical analyzer Generator - Lex Tool:
Let us study a tool called Lex, or in a more recent implementation Flex (Fast Lexical analyzer
Generator), that allows one to specify a lexical analyzer by specifying regular expressions to
describe patterns for tokens. The input notation for the Lex tool is referred to as the Lex language
and the tool itself is the Lex compiler. The Lex compiler transforms the input patterns into a
transition diagram and generates code in a file called lex.yy.c.
How to create Lexical analyzer using Lex?
The figure below shows how Lex is used. An input file, which we call lex.l, is written in the Lex
language and describes the lexical analyzer to be generated. The Lex compiler transforms lex.l to a
C program, that is always named lex.yy.c. The latter file is compiled by the C compiler into a file
called a.out as always. The C-compiler output is a working lexical analyzer that can take a stream
of input characters and produce a stream of tokens.
lex.l (flex source program) -> [Lex compiler] -> lex.yy.c -> [C compiler] -> a.out
Figure: Creating a lexical analyzer with flex
Structure of Lex Program:
A lex program has the following form:
Declarations
%%
Translation rules
%%
Auxiliary functions
The declarations section includes declarations of variables, manifest constants (identifiers
declared to stand for a constant, e.g., the name of a token), and regular definitions (names given
to regular expressions so that those names can be used in subsequent expressions).
The translation rules each have the form-
Pattern { Action }
Each pattern is a regular expression, which may use the regular definitions of the declaration
section. The actions are fragments of code, typically written in C.
The third section holds whatever additional functions are used in the actions. These functions
can be compiled separately and loaded with the lexical analyzer.
Tool used for programming: Flex Windows
Flex Windows (Lex and Yacc) contains the GnuWin32 ports of Flex and Bison, which are Lex
and Yacc compilers respectively, and are used for generating lexical analyzers and parsers. The Flex
Windows package contains built-in gcc and g++ libraries and C and C++ compilers, which are ported to
Windows from Linux. The package also contains the EditPlus IDE, which provides pre-defined blank
templates for Lex/Yacc/C/C++/Java files, so each time you want to type a program you can
simply use the New Lex / New Yacc template, and the basic code will be inserted.
%option noyywrap
The %option noyywrap generally causes a lex compatible lexer generator (e.g. lex or flex) to emit a
macro version of yywrap() that returns 1, which causes the lexer to stop lexing when the first end-
of-file is reached.
Word Recognizer using Lex
Let’s build a simple program that recognizes different types of English words. It identifies different
parts of speech (noun, verb, etc.) and handles multiword sentences that conform to a simple English
grammar.
Program
%option noyywrap
%{
#include<stdio.h>
%}
%%
[\t ]+ /* ignore whitespace */ ;
is|am|are|were|was |
be|being|been|do|did |
will|would|should|can|could |
has|have|had |
go { printf("%s: verb\n", yytext); }
very|simply |
gently|quietly |
calmly|angrily { printf("%s: adverb\n", yytext); }
to|from|behind|above |
below|between { printf("%s: preposition\n", yytext); }
if|then |
and|but |
or { printf("%s: conjunction\n", yytext); }
their|my|your |
his|her|its |
good|bad|nice { printf("%s: adjective\n", yytext); }
I|you|we |
he|she|it|they { printf("%s: pronoun\n", yytext); }
girl|boy |
student|teacher|parent { printf("%s: common noun\n", yytext); }
.|\n { ECHO;/* normal default anyway */ }
%%
int main()
{
yylex();
return 0;
}
OUTPUT
I can do it very effectively.
I can do it very effectively.
I: pronoun
can: verb
do: verb
it: pronoun
very: adverb
effectively.
I am a good girl.
I: pronoun
am: verb
a good: adjective
girl: common noun
.
do it calmly.
do: verb
it: pronoun
calmly: adverb.
Practical 7
Aim: Program for number system using Lex.
Theory:
Regular Expressions in Lex:
1. Basics:
Meaning of some special characters in regular expression is as follows:
.        matches any single character except \n
*        matches 0 or more instances of the preceding regular expression
+        matches 1 or more instances of the preceding regular expression
?        matches 0 or 1 of the preceding regular expression
[ ]      defines a character class
( )      groups enclosed regular expression into a new regular expression
"..."    matches everything within the " " literally
x|y      x or y
{i}      definition of i
x/y      x, only if followed by y (y not removed from input)
x{m,n}   m to n occurrences of x
^x       x, but only at beginning of line
x$       x, but only at end of line
"s"      exactly what is in the quotes (except for "\" and following character)
Note that a regular expression finishes with a space, tab or newline.
2. Meta characters:
Meta-characters do not match themselves, because they have a special meaning in regular
expressions:
• ( ) [ ] { } < > + / , ^ * | . \ " $ ? - %
To match a meta-character, prefix it with "\" (e.g. \.). Similarly, to match a backslash, tab or
newline, use \\, \t, or \n.
Program
%option noyywrap
%{
#include<stdio.h>
%}
%%
[01]+ {printf("Binary number");}
[0-7]+ {printf("Octal number");}
[0-9]+ {printf("Decimal number");}
[0-9a-fA-F]+ {printf("Hexadecimal Number");}
[0-9]*\.[0-9]+ {printf("Fractional Number");}
[0-9]*E[+-]?[0-9]+ {printf("Number with exponent");}
. {printf("it is not a number");}
%%
int main()
{
printf("\nEnter any numbers:");
yylex();
return 0;
}
OUTPUT
Enter any numbers:
1011
Binary number
290
Decimal number
345
Octal number
3D
Hexadecimal Number
18.92
Fractional Number
89E+12
Number with exponent
g
it is not a number
$
it is not a number
Practical 8
Aim: Program to count words, characters and lines using Lex.
Theory:
Variables & functions provided by flex:
Some variables used by the lexical analyzer that flex generates are as follows:
1. yylval: yylval can hold the attribute value of a token, whether it be a numeric code, a pointer
to the symbol table, or nothing. yylval is shared between the lexical analyzer and the parser.
2. yytext: It is a pointer to the beginning of the lexeme.
3. yyleng: It holds the length of the lexeme found.
Some functions provided by flex are as follows:
1. yymore(): append the next string matched to the current contents of yytext
2. yyless(n): remove from yytext all but the first n characters
3. unput(c): return character c to the input stream
4. yywrap(): may be replaced by the user. yywrap() is called by the lexical
analyzer whenever it reads an EOF as the first character when trying to match a
regular expression.
Meaning of some special characters used in regular expression:
^x x, but only at beginning of line
[ ] defines a character class
+ matches 1 or more instances of the preceding regular expression
Program
%option noyywrap
%{
#include<stdio.h>
int wordcount=0,charcount=0,linecount=0;
%}
word [^ \t\n]+
line \n
%%
"#" {printf("\n word count =%d, character count =%d, line count =%d", wordcount, charcount,
linecount); }
{word} {wordcount++;charcount+=yyleng;}
{line} {linecount++;}
%%
int main()
{
printf("Enter any text (Terminate input with #):\n");
yylex();
// printf("\n wc=%d,cc=%d,lc=%d",wordcount,charcount,linecount);
return 0;
}
OUTPUT:-
Enter any text (Terminate input with #):
I live in India.
Maharasthra Akola
#
word count =6, character count =29, line count =3
Practical 9
Aim: Program for lexical analyzer using Lex.
Theory:
The Lex program below recognizes tokens and prints each token found. A few observations about
this code will introduce us to many of the important features of Lex.
In the declarations section we see a pair of special brackets, %{ and %}. Anything within
these brackets is copied directly to the file lex.yy.c. Also in the declarations section is a sequence of
regular definitions. These use the extended notation for regular expressions. Regular definitions
that are used in later definitions or in the patterns of the translation rules are surrounded by curly
braces. Thus, for instance, delim is defined to be shorthand for the character class consisting of the
blank, newline and tab. Then, whitespace is defined to be one or more delimiters, by the regular
expression {delim}+.
Notice that in the definitions of id and num, parentheses are used as grouping
metasymbols and do not stand for themselves. If we wish to use one of the Lex metasymbols, such
as any of the parentheses, +, *, or ?, to stand for themselves, we may precede them with a
backslash. For instance, we see \. in the definition of num, to represent the dot, since that
character is a metasymbol representing "any character," as usual in UNIX regular expressions.
Finally, let us examine some of the patterns and rules in the middle section. First, whitespace, an
identifier declared in the first section, has an associated empty action. The second rule has the
simple regular expression pattern if. Should we see the two letters if on the input, and they are not
followed by another letter or digit (which would cause the lexical analyzer to find a longer prefix of
the input matching the pattern for id), then the lexical analyzer consumes these two letters from the
input and prints the token name IF. The keyword else is treated similarly.
Conflict Resolution in Lex
We have alluded to the two rules that Lex uses to decide on the proper lexeme to select, when
several prefixes of the input match one or more patterns:
1. Always prefer a longer prefix to a shorter prefix.
2. If the longest possible prefix matches two or more patterns, prefer the pattern listed first in the
Lex program.
Program
%option noyywrap
%{
#include<stdio.h>
%}
/*Regular definitions */
digit [0-9]
letter [a-zA-Z_]
id {letter}({letter}|{digit})*
digits {digit}+
num {digits}(\.{digits})?
delim [ \n\t]
op [-+*/%]
whitespace {delim}+
%%
{whitespace} {/*no action,ignore*/}
if {printf(" <IF>");}
else {printf(" <ELSE>");}
{id} {printf(" <ID,%s>",yytext);}
{num} {printf(" <NUM,%s>",yytext);}
"<=" {printf(" <RELOP,LE>");}
"<" {printf(" <RELOP,LT>");}
">=" {printf(" <RELOP,GE>");}
">" {printf(" <RELOP,GT>");}
"<>" {printf(" <RELOP,NE>");}
"==" {printf(" <RELOP,EQ>");}
"=" {printf(" <ASSIGN>");}
{op} {printf(" <OP,%s>",yytext);}
. {printf(" <%s>",yytext);}
%%
int main()
{
printf("Enter expression/code fragment:\n");
yylex();
return 0;
}
OUTPUT
Enter expression/code fragment:
if(x<y)
<IF> <(> <ID,x> <RELOP,LT> <ID,y> <)>
x=x+y;
<ID,x> <ASSIGN> <ID,x> <OP,+> <ID,y><;>
else
<ELSE>
y=y*10;
<ID,y> <ASSIGN> <ID,y> <OP,*> <NUM,10> <;>
Practical 10
Aim: Program for simple desk calculator using YACC.
Theory:
Parser Generator - Yacc tool:
Yacc (Yet Another Compiler Compiler) is a tool for automatically generating an LALR parser from a
grammar written in a Yacc specification (.y file). A grammar specifies a set of production rules,
which define a language. A production rule specifies a sequence of symbols that is legal in the
language. Bison, a more recent parser generator, is also available on Linux.
How to create parser using Yacc?
Yacc transforms translate.y into a C program called y.tab.c (Bison names this file
translate.tab.c). The program y.tab.c is a representation of an LALR parser written in C. By
compiling y.tab.c with the ly library that contains the LR parsing program, using the command
cc y.tab.c -ly
we obtain the desired object program a.out that performs the translation specified by translate.y.
translate.y (Yacc specification) -> [Yacc compiler] -> y.tab.c (translate.tab.c with Bison) -> [C compiler] -> a.out
Figure: Creating translator with Yacc.
Structure of Yacc program:
The structure of Yacc source program is as follows:
%{
< C global variables, prototypes, comments >
%}
[DEFINITION SECTION]
%%
[PRODUCTION RULES SECTION]
%%
< Supporting C routines >
Definition section contains token declarations as follows-
%token DIGIT
This statement declares DIGIT to be a token. These tokens are also available to lexical analyzer
generated by flex.
In Production rule section we put grammar production rules and associated semantic
actions for each rule. A set of productions in grammar of the form-
<head> → <body>1 | <body>2 |…….. | <body>n
would be written in Yacc as
<head> : <body>1 {<semantic action>1}
| <body>2 {<semantic action>2}
……..
| <body>n {<semantic action>n}
;
A semantic action is a sequence of C statements. Within an action, the attribute value of
each grammar symbol in the production is referred to as follows: $$ is the value of the head
and $i is the value of the i-th symbol of the body.
Head : symbol1 symbol2 … symboln { semantic action }
 $$      $1      $2        $n
Note that { $$ = $1; } is the default semantic action.
The third part of a Yacc specification holds the supporting C routines. A lexical analyzer
named yylex() must be provided here. The lexical analyzer produces tokens of the form
< Token-name, Attribute value >
where the token-name (such as DIGIT) must be declared in the Yacc specification. The attribute
value associated with a token is communicated to the parser through the Yacc-defined variable
yylval.
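The yylex() contract described above can be tried out without running Yacc. The sketch below is only an illustration: the token code NUMBER, the global yylval defined here, and the helper scan_token() are stand-ins introduced for this example (in a real build the token codes and yylval come from the file that Yacc generates), and the input is taken from a string instead of standard input.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Stand-ins for what a Yacc-generated parser would normally provide.
   The token code is chosen above 255 so that it cannot collide with a
   single-character token such as '+'. */
#define NUMBER 258
static int yylval;

/* Hypothetical helper: returns the next token found at *src and advances
   *src past it. A run of digits yields NUMBER with its value in yylval,
   any other character is returned as its own token code, and 0 marks the
   end of the input. */
static int scan_token(const char **src)
{
    const char *p;

    assert(src != NULL && *src != NULL);
    p = *src;
    while (*p == ' ' || *p == '\t')      /* skip blanks between tokens */
        p++;
    if (*p == '\0') {
        *src = p;
        return 0;
    }
    if (isdigit((unsigned char)*p)) {
        yylval = 0;
        while (isdigit((unsigned char)*p))
            yylval = yylval * 10 + (*p++ - '0');
        *src = p;
        return NUMBER;
    }
    *src = p + 1;
    return (unsigned char)*p;
}
```

Calling scan_token() repeatedly on the text 12+345 would be expected to yield NUMBER (with yylval 12), then '+', then NUMBER (with yylval 345), then 0. The token name tells the parser what kind of symbol it saw, and yylval carries the attribute value.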
Program:
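%{
#include <stdio.h>
#include <ctype.h>
int yylex(void);
int yyparse(void);
void yyerror(const char *s);
%}
%token DIGIT
%left '+' '-'
%left '*' '/'
%%
line    : expr '\n'         { printf("%d\n", $1); }
        ;
expr    : expr '+' term     { $$ = $1 + $3; }
        | expr '-' term     { $$ = $1 - $3; }
        | term              { /* $$ = $1; (default action) */ }
        ;
term    : term '*' factor   { $$ = $1 * $3; }
        | term '/' factor   { $$ = $1 / $3; }
        | factor
        ;
factor  : '(' expr ')'      { $$ = $2; }
        | DIGIT
        ;
%%
int main(void)
{
    yyparse();
    return 0;
}
void yyerror(const char *s)
{
    fprintf(stderr, "%s\n", s);
}
/* Hand-written lexical analyzer: a digit is returned as the token DIGIT
   with its value in yylval; any other character is returned as itself. */
int yylex(void)
{
    int c;
    c = getchar();
    if (isdigit(c))
    {
        yylval = c - '0';
        return DIGIT;
    }
    return c;
}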
OUTPUT
(4+2)*(7-5)/6
2
Extra set of practicals
Practical 1
Aim: Program to eliminate comments from given input file.
Theory:
This program removes single-line comments from a C program. A single-line comment
starts with '//' and runs to the end of the line, so the program looks for an
occurrence of these two characters and skips everything after them up to the end of the line.
Block comments (/* ... */) are not handled.
Program
#include <stdio.h>
int main(void)
{
    FILE *in;
    char fname[100];
    int c, next;
    printf("\nEnter file name: ");
    if (scanf("%99s", fname) != 1)
        return 1;
    in = fopen(fname, "r");
    if (!in)
    {
        printf("\nfile cannot be opened");
        return 1;
    }
    while ((c = fgetc(in)) != EOF)
    {
        if (c == '/')
        {
            next = fgetc(in);
            if (next == '/')
            {
                /* single-line comment: skip up to the end of the line */
                while ((c = fgetc(in)) != EOF && c != '\n')
                    ;
                if (c == EOF)
                    break;
            }
            else if (next != EOF)
                ungetc(next, in); /* lone '/': keep the character after it */
        }
        putchar(c);
    }
    fclose(in);
    return 0;
}
INPUT
void main()
{
int a=10; //first
int b=20; //second
if (a<b) //check
printf("/a is greater/");
else
printf("b is greater");
getch();
}
OUTPUT
Enter file name: testin.c
void main()
{
int a=10;
int b=20;
if (a<b)
printf("/a is greater/");
else
printf("b is greater");
getch();
}
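A '//' that appears inside a string literal is not a comment, and C also has block comments. As a hedged sketch of how both cases could be covered, the helper below (strip_comments() is a name made up for this example, and it works on strings in memory rather than on a file) tracks whether it is inside code, a string, a character literal, or a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src to dst with C comments removed.
   dst must have room for strlen(src) + 1 bytes. */
static void strip_comments(const char *src, char *dst)
{
    enum { CODE, STRING, CHARLIT, LINE_COMMENT, BLOCK_COMMENT } state = CODE;
    size_t i = 0;

    assert(src != NULL && dst != NULL);
    while (src[i] != '\0') {
        char c = src[i], next = src[i + 1];
        switch (state) {
        case CODE:
            if (c == '/' && next == '/') { state = LINE_COMMENT; i += 2; continue; }
            if (c == '/' && next == '*') { state = BLOCK_COMMENT; i += 2; continue; }
            if (c == '"')
                state = STRING;
            else if (c == '\'')
                state = CHARLIT;
            *dst++ = c;
            break;
        case STRING:
        case CHARLIT:
            *dst++ = c;
            if (c == '\\' && next != '\0') {      /* copy an escaped character as is */
                *dst++ = next;
                i++;
            } else if ((state == STRING && c == '"') ||
                       (state == CHARLIT && c == '\'')) {
                state = CODE;                     /* closing quote */
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') {                      /* keep the line break */
                state = CODE;
                *dst++ = c;
            }
            break;
        case BLOCK_COMMENT:
            if (c == '*' && next == '/') {        /* a comment becomes one space */
                state = CODE;
                *dst++ = ' ';
                i++;
            }
            break;
        }
        i++;
    }
    *dst = '\0';
}
```

Replacing a block comment with a single space keeps the tokens on either side of it from running together.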
Practical 2
Aim: Program for predictive parser.
Theory:
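Predictive parsers can be depicted using transition diagrams, one for each non-terminal symbol, where
the edges between the initial and the final states are labeled by the symbols (terminals and
non-terminals) of the right side of the production rule. The program below is a table-driven
predictive parser for the grammar
E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id
It keeps grammar symbols on a stack and uses the symbol on top of the stack together with the
current input symbol to select a production from the parsing table.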
Program
// PROGRAM FOR PREDICTIVE PARSER
#include <stdio.h>
#include <stdlib.h>
#define MAX 100
void display(void);
void push(char);
char pop(void);
int expand(const char *row, size_t size, char lookahead);
char stack[MAX];
int top = -1, idx = 0;
char istr[MAX];
// Parsing table, one row per non-terminal. Each entry is: input symbol,
// then the right side in the order it is pushed (i.e. reversed), then '\n'.
// d=id B=blank (error entry) N=null production A=E' H=T'
char E[]={'d','A','T','\n','+','B','\n','*','B','\n','(','A','T','\n',')','B','\n','$','B','\n'};
char A[]={'d','B','\n','+','A','T','+','\n','*','B','\n','(','B','\n',')','N','\n','$','N','\n'};
char T[]={'d','H','F','\n','+','B','\n','*','B','\n','(','H','F','\n',')','B','\n','$','B','\n'};
char H[]={'d','B','\n','+','N','\n','*','H','F','*','\n','(','B','\n',')','N','\n','$','N','\n'};
char F[]={'d','d','\n','+','B','\n','*','B','\n','(',')','E','(','\n',')','B','\n','$','B','\n'};
int main(void)
{
    char tos, curr_symb;
    int ok;
    printf("\n Enter Input String (at the end $) : ");
    if (scanf("%99s", istr) != 1)
        return 1;
    push('$');
    push('E');
    printf("Stack\tInput\n");
    display();
    while (1)
    {
        tos = pop();
        curr_symb = istr[idx];
        ok = 1;
        switch (tos)
        {
        case 'E': ok = expand(E, sizeof E, curr_symb); break;
        case 'A': ok = expand(A, sizeof A, curr_symb); break;
        case 'T': ok = expand(T, sizeof T, curr_symb); break;
        case 'H': ok = expand(H, sizeof H, curr_symb); break;
        case 'F': ok = expand(F, sizeof F, curr_symb); break;
        case 'N': break; /* null production: nothing to push */
        case '$':
            if (curr_symb == '$')
            {
                printf("Accepted\n");
                return 0;
            }
            ok = 0;
            break;
        default: /* terminal on top of the stack: it must match the input */
            if (curr_symb == tos)
                idx++;
            else
                ok = 0;
            break;
        }
        if (!ok)
        {
            printf("Error\n");
            return 1;
        }
        display();
    }
}
/* Finds the table entry of this row for the given input symbol and pushes
   its right side. Only the first symbol of each entry is compared.
   Returns 0 for a blank (error) entry or a symbol that has no entry. */
int expand(const char *row, size_t size, char lookahead)
{
    size_t i = 0, j;
    while (i < size)
    {
        if (row[i] == lookahead)
        {
            for (j = i + 1; row[j] != '\n'; j++)
            {
                if (row[j] == 'B')
                    return 0;
                push(row[j]);
            }
            return 1;
        }
        while (i < size && row[i] != '\n') /* skip to the next entry */
            i++;
        i++;
    }
    return 0;
}
void display(void)
{
    int t;
    for (t = 0; t <= top; t++)
        printf("%c", stack[t]);
    printf("\t");
    for (t = idx; istr[t] != '\0'; t++)
        printf("%c ", istr[t]);
    printf("\n");
}
void push(char ch)
{
    if (top >= MAX - 1)
    {
        printf("\n Stack overflow");
        exit(1);
    }
    stack[++top] = ch;
}
char pop(void)
{
    if (top < 0)
    {
        printf("\n Stack is empty");
        exit(1);
    }
    return stack[top--];
}
OUTPUT
Enter Input String (at the end $) : d+d*d$
Stack   Input
$E      d + d * d $
$AT     d + d * d $
$AHF    d + d * d $
$AHd    d + d * d $
$AH     + d * d $
$AN     + d * d $
$A      + d * d $
$AT+    + d * d $
$AT     d * d $
$AHF    d * d $
$AHd    d * d $
$AH     * d $
$AHF*   * d $
$AHF    d $
$AHd    d $
$AH     $
$AN     $
$A      $
$N      $
$       $
Accepted
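The transition-diagram view mentioned in the theory maps directly onto a recursive-descent parser, where each non-terminal becomes one function. The sketch below is an alternative illustration for the same grammar, not the table-driven method of the program above; the function names and the accepts() entry point are invented for this example, and d again stands for id.

```c
#include <assert.h>
#include <stddef.h>

/* Grammar (d stands for id):
     E  -> T E'        E' -> + T E' | epsilon
     T  -> F T'        T' -> * F T' | epsilon
     F  -> ( E ) | d                              */
static const char *look;            /* current position in the input */

static int parse_E(void);

static int parse_F(void)
{
    if (*look == 'd') {
        look++;
        return 1;
    }
    if (*look == '(') {
        look++;
        if (!parse_E() || *look != ')')
            return 0;
        look++;
        return 1;
    }
    return 0;
}

static int parse_Tprime(void)
{
    if (*look == '*') {
        look++;
        return parse_F() && parse_Tprime();
    }
    return 1;                       /* epsilon: the caller checks what follows */
}

static int parse_T(void)
{
    return parse_F() && parse_Tprime();
}

static int parse_Eprime(void)
{
    if (*look == '+') {
        look++;
        return parse_T() && parse_Eprime();
    }
    return 1;                       /* epsilon */
}

static int parse_E(void)
{
    return parse_T() && parse_Eprime();
}

/* Returns 1 if s is a sentence of the grammar followed by the end marker $. */
static int accepts(const char *s)
{
    assert(s != NULL);
    look = s;
    return parse_E() && look[0] == '$' && look[1] == '\0';
}
```

Here the C call stack plays the role of the explicit stack, and the one-symbol lookahead *look selects the production, which is possible because the grammar has no left recursion.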
Practical 3
Aim: Program to check if given grammar is left recursive and to remove left recursion.
Theory:
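Removing left recursion
Left recursion often poses problems for parsers because it can lead them into infinite recursion
(as in the case of most top-down parsers). Therefore, a grammar is often preprocessed to eliminate
the left recursion. A left-recursive pair of productions A → Aα | β is replaced by
A → βA' and A' → αA' | ε. The program below writes X for the new non-terminal A' and ^ for ε.
Program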
#include <stdio.h>
#include <string.h>
int main(void)
{
    int i, j;
    char pr[50], NT, beta[50], alpha[50];
    printf("\nEnter production rule :");
    if (scanf("%49s", pr) != 1)
        return 1;
    NT = pr[0];
    j = 3; /* skip -> */
    if (strlen(pr) > 3 && pr[j] == NT)
    {
        printf("\nGrammar is left recursive");
        j++;
        i = 0;
        while (pr[j] != '|' && pr[j] != '\0') /* alpha */
            alpha[i++] = pr[j++];
        alpha[i] = '\0';
        if (pr[j] != '|')
        {
            printf("\nNo alternative without left recursion was given");
            return 1;
        }
        j++; /* skip | */
        i = 0;
        while (pr[j] != '\0') /* beta */
            beta[i++] = pr[j++];
        beta[i] = '\0';
        printf("\nNew production rules are:");
        printf("\n\t%c -> %sX", NT, beta);
        printf("\n\tX->%sX|^", alpha);
    }
    else
        printf("grammar is not left recursive");
    return 0;
}
OUTPUT
Enter production rule :T->T*F|F
Grammar is left recursive
New production rules are:
T -> FX
X->*FX|^
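The same rewriting extends to a rule with several alternatives: every alternative of the form Aα contributes αX to the new non-terminal, and every other alternative β contributes βX to A. The following is a minimal sketch under stated assumptions: remove_left_recursion() is a hypothetical helper, non-terminals are single characters, X is assumed not to occur in the grammar already, and ^ again stands for the empty string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. rule has the form "A->alt1|alt2|...", where A is a
   single character. If at least one alternative starts with A and at least
   one does not, the two replacement rules are written into out, separated
   by a newline, and 1 is returned. Otherwise 0 is returned. */
static int remove_left_recursion(const char *rule, char *out, size_t outsz)
{
    char copy[128], rec[256] = "", nonrec[256] = "";
    char nt, *alt;

    assert(rule != NULL && out != NULL);
    if (strlen(rule) < 4 || strlen(rule) >= sizeof copy ||
        rule[1] != '-' || rule[2] != '>')
        return 0;
    nt = rule[0];
    strcpy(copy, rule + 3);
    for (alt = strtok(copy, "|"); alt != NULL; alt = strtok(NULL, "|")) {
        if (alt[0] == nt) {              /* A -> A alpha  gives  X -> alpha X */
            if (alt[1] == '\0')
                continue;                /* A -> A adds nothing; drop it */
            strcat(rec, alt + 1);
            strcat(rec, "X|");
        } else {                         /* A -> beta  gives  A -> beta X */
            strcat(nonrec, alt);
            strcat(nonrec, "X|");
        }
    }
    if (rec[0] == '\0' || nonrec[0] == '\0')
        return 0;
    nonrec[strlen(nonrec) - 1] = '\0';   /* drop the trailing '|' */
    snprintf(out, outsz, "%c->%s\nX->%s^", nt, nonrec, rec);
    return 1;
}
```

For the rule E->E+T|E-T|T this sketch would be expected to produce E->TX and X->+TX|-TX|^, while a rule such as T->F*T|F is reported as not left recursive.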
Practical 4
Aim: Program for code generator.
Theory:
In computing, code generation is the process by which a compiler's code generator converts
some intermediate representation of source code into target code. The input to the code generator
typically consists of a parse tree, an abstract syntax tree, or three-address code.
Major tasks in code generation
Tasks which are typically part of a sophisticated compiler's "code generation" phase include:
• Instruction selection: which instructions to use.
• Instruction scheduling: in which order to put those instructions. Scheduling is a speed
optimization that can have a critical effect on pipelined machines.
• Register allocation: the allocation of variables to processor registers.
Program
#include <stdio.h>
#include <stdlib.h>
char op;
char soc1, soc2, dest;
void code_generate(void);
int main(void)
{
    FILE *p;
    int ch;
    p = fopen("tac.c", "r");
    if (p == NULL)
    {
        printf("\nError in opening");
        exit(1);
    }
    /* each input line has the form dest=soc1 op soc2, for example x=a*b */
    while ((ch = fgetc(p)) != EOF && ch != '\n')
    {
        dest = ch;
        fgetc(p); /* assignment op */
        soc1 = fgetc(p);
        op = fgetc(p);
        soc2 = fgetc(p);
        code_generate();
        fgetc(p); /* end of line */
    }
    fclose(p);
    return 0;
}
/* Emits code for dest = soc1 op soc2. The operands are loaded into R1 and R2;
   the result is taken to be left in R1, which is then stored in dest. */
void code_generate(void)
{
    printf("MOV %c,R1\n", soc1);
    printf("MOV %c,R2\n", soc2);
    switch (op)
    {
    case '+': printf("ADD R1,R2\n");
        break;
    case '-': printf("SUB R1,R2\n");
        break;
    case '*': printf("MUL R1,R2\n");
        break;
    case '/': printf("DIV R1,R2\n");
        break;
    }
    printf("MOV R1,%c\n", dest);
}
INPUT (contents of tac.c)
x=a*b
y=c+x
d=x+y
OUTPUT
MOV a,R1
MOV b,R2
MUL R1,R2
MOV R1,x
MOV c,R1
MOV x,R2
ADD R1,R2
MOV R1,y
MOV x,R1
MOV y,R2
ADD R1,R2
MOV R1,d
Practical 5
Aim: Program to check whether the string ends with bb using Lex.
Theory:
Regular expressions used are described as-
[ab]* - zero or more occurrences of a or b
[ab]*bb - zero or more occurrences of a or b followed by bb
Program
%option noyywrap
%{
#include<stdio.h>
%}
%%
[ab]*bb {printf("string accepted");}
[ab]* {printf("string rejected");}
%%
int main()
{
printf("\nEnter string of a & b:");
yylex();
return 0;
}
OUTPUT
Enter string of a & b:
abaaa
string rejected
abaab
string rejected
abb
string accepted
bb
string accepted
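Lex turns the pattern [ab]*bb into a finite automaton. A hand-written equivalent can be sketched in C as follows; ends_with_bb() is a name chosen for this illustration, and its state simply records how many b's (at most two) have been seen at the end of the input read so far.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s consists only of a and b and ends
   with bb, otherwise 0. The state is the number of consecutive b's at the
   end of the part read so far, capped at 2. */
static int ends_with_bb(const char *s)
{
    int state = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == 'a') {
            state = 0;              /* an a breaks the run of b's */
        } else if (*s == 'b') {
            if (state < 2)
                state++;
        } else {
            return 0;               /* symbol outside the alphabet {a, b} */
        }
    }
    return state == 2;              /* state 2 is the only accepting state */
}
```

With the sample inputs from the OUTPUT above, abaaa and abaab would be rejected, and abb and bb would be accepted.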
Practical 6
Aim: Program to count no. of vowels and consonants in a given input string using Lex.
Theory:
Regular expressions used are described as-
[aeiouAEIOU] - Any vowel, in either upper or lower case
[a-zA-Z] - Any letter, in either upper or lower case; because the vowel rule is listed first, only consonants reach this rule
Conflict Resolution in Lex
Lex uses two rules to decide on the proper lexeme to select when several prefixes of the input
match one or more patterns:
1. Always prefer a longer prefix to a shorter prefix.
2. If the longest possible prefix matches two or more patterns, prefer the pattern listed first in the
Lex program. So, even though vowels also fall within a-z, a vowel is matched by the first regular expression (i.e. [aeiouAEIOU]).
Program
%option noyywrap
%{
#include<stdio.h>
int vowel_count=0, consonant_count=0;
%}
%%
[aeiouAEIOU] {vowel_count++;}
[a-zA-Z] {consonant_count++;}
[$] {printf("vowel count=%d consonant count=%d ",vowel_count,consonant_count);}
.|\n { /* ignore every other character */ }
%%
int main(){
printf("Enter any text (Terminate input with $):\n");
yylex();
return 0;
}
OUTPUT
Enter any text (Terminate input with $):
I live in India.$
vowel count=7 consonant count=5
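The effect of listing the vowel rule before the general letter rule can be mimicked in plain C by testing the conditions in the same order. The sketch below is only an illustration of that ordering, not what Lex generates; count_letters() is a hypothetical helper that stops at the terminating $.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: counts vowels and consonants in text up to the
   first '$' (or the end of the string). */
static void count_letters(const char *text, int *vowels, int *consonants)
{
    assert(text != NULL && vowels != NULL && consonants != NULL);
    *vowels = 0;
    *consonants = 0;
    for (; *text != '\0' && *text != '$'; text++) {
        unsigned char c = (unsigned char)*text;
        if (strchr("aeiouAEIOU", c) != NULL)   /* first rule: vowels */
            (*vowels)++;
        else if (isalpha(c))                   /* second rule: any other letter */
            (*consonants)++;
        /* every other character is ignored */
    }
}
```

If the two tests were swapped, every vowel would be caught by the letter test and counted as a consonant, which is the situation that rule 2 of the conflict resolution avoids.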
Practical 7
Aim: YACC program for advance desk calculator by creating lexical analyzer using Lex.
Theory:
Creating lexical analyzer for Yacc using flex:
If Lex is used to produce the lexical analyzer, we replace the routine yylex() in the Yacc
specification by the statement-
#include "lex.yy.c"
The file lex.yy.c is obtained by compiling the Lex program (using the Lex compile option). The lexical
analyzer generated using Lex recognizes the token NUM (a number with multiple digits). The Yacc
program is then compiled using the YACC compile option in the tools menu. This produces the file
y.tab.c, which can be compiled using a C compiler.
Precedence and Associativity:
To reduce conflicts, Yacc provides a facility to assign precedences and associativities to
terminals. In the declaration section, the declaration
%left '+' '-'
makes + and - have the same precedence and be left associative. We can declare an operator to be right
associative as-
%right '^'
The operators are given precedences in the order in which they appear in the declaration
part, lowest first.
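What the precedence levels and the left/right associativity mean for evaluation can be seen in a small hand-written evaluator. The sketch below uses precedence climbing, which is a different technique from the LALR tables that Yacc builds; the functions prec(), right_assoc() and evaluate() are invented for this illustration, the ^ operator is included only to show right associativity, and there is no error reporting (division by zero simply yields 0).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *pos;                 /* current position in the input */

static int prec(char op)                /* a higher number binds tighter */
{
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    case '^':           return 3;
    default:            return 0;       /* not a binary operator */
    }
}

static int right_assoc(char op) { return op == '^'; }

static int ipow(int base, int exp)
{
    int r = 1;
    while (exp-- > 0)
        r *= base;
    return r;
}

static int parse_expr(int min_prec);

static int parse_primary(void)          /* a number or a parenthesised expression */
{
    int v = 0;
    if (*pos == '(') {
        pos++;
        v = parse_expr(1);
        if (*pos == ')')
            pos++;
        return v;
    }
    while (isdigit((unsigned char)*pos))
        v = v * 10 + (*pos++ - '0');
    return v;
}

static int parse_expr(int min_prec)
{
    int lhs = parse_primary();
    while (prec(*pos) >= min_prec) {
        char op = *pos++;
        int p = prec(op);
        /* left associative: the right operand may only contain operators
           that bind tighter; right associative: the same level is allowed */
        int rhs = parse_expr(right_assoc(op) ? p : p + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs = (rhs != 0) ? lhs / rhs : 0; break;
        case '^': lhs = ipow(lhs, rhs); break;
        }
    }
    return lhs;
}

static int evaluate(const char *s)
{
    assert(s != NULL);
    pos = s;
    return parse_expr(1);
}
```

With these settings 8-3-2 groups as (8-3)-2, because - is left associative, whereas 2^3^2 groups as 2^(3^2), because ^ is right associative.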
Program
Lex file- cal.l
%option noyywrap
number [0-9]+
%%
{number} { sscanf(yytext, "%d", &yylval); return NUM;}
\n|. { return yytext[0];}
%%
Yacc file-cal.y
%{
#include <stdio.h>
int yylex(void);
int yyparse(void);
void yyerror(const char *s);
%}
%token NUM
%left '+' '-'
%left '*' '/'
%%
line    : expr '\n'         { printf("%d\n", $1); }
        ;
expr    : expr '+' expr     { $$ = $1 + $3; }
        | expr '-' expr     { $$ = $1 - $3; }
        | expr '*' expr     { $$ = $1 * $3; }
        | expr '/' expr     { $$ = $1 / $3; }
        | '(' expr ')'      { $$ = $2; }
        | NUM
        ;
%%
#include "lex.yy.c"
int main(void)
{
    printf("Enter expression:");
    yyparse();
    return 0;
}
void yyerror(const char *s)
{
    fprintf(stderr, "%s\n", s);
}
OUTPUT
Enter expression: (12*5)/10+35
41
Enter expression:10*(280+20)/10+1
301

  • 1. Shri Shivaji Education Society Amravati’s COLLEGE OF ENGINEERING & TECHNOLOGY AKOLA LAB MANUAL Compiler Design Lab Subject code: 5KS07 SEMESTER: V Department of Computer Science & Engineering College of Engineering & Technology, Akola Affiliated to Sant Gadge Baba Amravati University, Maharashtra, India.
  • 2. PREFACE A compiler is a program that can read a program in one language- the source language - and translate it into an equivalent program in another language - the target language. The compiler may produce an assembly-language program as its output. Compiler design can be divided into two parts: analysis and synthesis. The analysis part breaks up the source program into constituent pieces and imposes a grammatical structure on them. The synthesis part constructs the desired target program from the intermediate representation and the information in the symbol table. If we examine the compilation process in more detail, we see that it operates as a sequence of phases-lexical analysis, syntax & semantic analysis, intermediate code generation, code optimization & code generation, each of which transforms one representation of the source program to another. Practical of Compiler design Lab for engineering in CSE are divided into two parts- i) implementation using programming language and ii) implementation using tools. In first part, practical can be performed in any programming language such as C language or Python. C language is always preferred. C is the middle level language combining low-level hardware controlling ability and high level programming capabilities. It is the procedural language focusing on functions and pointers. To implement phases of compiler, various tools are available in Linux system. Lex tool is lexical analyzer generator whereas YACC tool is parser generator. Recent versions of Lex and YACC tools are Flex and Bison respectively. Recently windows version of these tools is available. Flex Windows (Lex and Yacc) contains the GNU Win 32 Ports of Flex and Bison, which are Lex and Yacc Compilers respectively, and are used for generating tokens and parsers. The Flex Windows Package contains inbuilt gcc and g libraries, C and C++ compilers, which are ported to Windows from Linux. Programming using tools is typically more interesting. 
This lab manual covers various programs using tools. I hope that this manual will be useful for students. Prof. Kalpana S. Gilda
  • 3. List of practical Subject: Compiler Design Sr.No. Title Page no. 1 Program to check valid ‘c’ language identifier. 1 2 Program to check whether given string is keyword or not. 5 3 Program to count lines and characters in given input file. 8 4 Program to check parenthesis of expression is balanced or not. 10 5 Program for lexical analyzer which produces tokens for given expression. 12 6 Program for word recognizer using Lex. 16 7 Program for number system using Lex. 19 8 Program to count words, characters and lines using Lex. 21 9 Program for lexical analyzer using Lex. 23 10 Program for simple desk calculator using YACC. 25 Additional set of practicals Sr.No. Title Page no. 1 Write a C program to eliminate single line comments from given input file. 29 2 Program for predictive parser. 31 3 Program to check if given grammar is left recursive and to remove left recursion. 36 4 Program code generator. 37 5 Program to check whether the string end with bb using Lex. 39 6 Program to count no. of vowels and consonants in a given input string using Lex. 40 7 YACC program for advance desk calculator by creating lexical analyzer using Lex. 41 Text Books: [1] Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman Compilers: ―Principles, Techniques and Tools‖, Pearson Education, Second Edition. Reference Books: [1] Doug Brown, John Levine, and Tony Mason, ―Lex & Yacc‖, O‘Reilly & Associates, Inc., Second Edition.
  • 4. System Software Lab Manual COET, Akola
Practical 1
Aim: Program to check valid C language identifier.
Theory:
Identifiers
Identifiers are the names given to program elements such as variables, symbolic constants and functions. They can be chosen freely, subject to the rules below.
Rules for an identifier in C language
1. An identifier can only have alphanumeric characters (a-z, A-Z, 0-9) and underscore (_).
2. The first character of an identifier can only be a letter (a-z, A-Z) or underscore (_).
3. Identifiers are case sensitive in C. For example, name and Name are two different identifiers.
4. Keywords are not allowed to be used as identifiers.
5. No special characters, such as semicolon, period, whitespace, slash or comma, are permitted in an identifier.
6. Only the first 31 characters of an identifier are guaranteed to be significant in C89 (C99 raises this limit to 63 for internal identifiers).
Examples of valid C language identifiers: _a12, abc_, pr_12, abc, then_next_val etc.
Examples of invalid C language identifiers: 1a, abc$12, ab c, abc# etc.
Functions used in programming
int isalpha(int c)
isalpha(c) checks whether the passed character is a letter. It returns a non-zero value for 'a' to 'z' and 'A' to 'Z' and zero for other characters. It is declared in the <ctype.h> header file.
Declaration: int isalpha(int c);
Parameters: c − the character to be checked.
  • 5. Return Value: This function returns a non-zero value if c is a letter, else it returns 0.
int isdigit(int c)
isdigit(c) checks whether the passed character is a digit. It returns a non-zero value for '0' to '9' and zero for other characters.
Declaration: int isdigit(int c);
Parameters: c − the character to be checked.
Return Value: This function returns a non-zero value if c is a digit, else it returns 0.
int isalnum(int c)
isalnum(c) checks whether the passed character is alphanumeric, that is, a letter or a digit. It returns a non-zero value if it is, otherwise it returns zero. Like isalpha() and isdigit(), it is declared in the <ctype.h> header file.
Declaration: int isalnum(int c);
Parameters: c − the character to be checked.
Return Value: This function returns a non-zero value if c is a digit or a letter, else it returns 0.
  • 6. // Program to check valid C language identifier.
#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main(void)
{
    char id[64];
    int i, flag, len;

    printf("\n Enter a string: ");
    /* fgets() replaces gets(), which cannot limit the input length */
    if (fgets(id, sizeof id, stdin) == NULL)
        return 1;
    id[strcspn(id, "\n")] = '\0';   /* drop the trailing newline */
    len = strlen(id);
    flag = 0;
    if (isalpha((unsigned char)id[0]) || id[0] == '_')
    {
        for (i = 1; i < len; i++)
        {
            if (!(isalnum((unsigned char)id[i]) || id[i] == '_'))
            {
                flag = 1;
                break;
            }
        }
        if (flag == 0)
            printf("\n%s is valid c-language identifier", id);
        else
            printf("\nERROR: Invalid identifier, %c char not allowed", id[i]);
    }
    else
        printf("\nERROR: Variable should start with alphabet or underscore");
    return 0;
}
----------------------------------------------------------------------------------------------------------------------
// using switch-case
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>

int main(void)
{
    char id[64];
    int i, state, len;

    printf("\n Enter a string: ");
    if (fgets(id, sizeof id, stdin) == NULL)
        return 1;
    id[strcspn(id, "\n")] = '\0';
    len = strlen(id);
    state = 0;
    while (1)
    {
        switch (state)
        {
        case 0:   /* start state: check the first character */
            if (isalpha((unsigned char)id[0]) || id[0] == '_')
                state = 1;
            else
            {
                printf("\nERROR: Variable should start with alphabet or underscore");
                exit(0);
            }
            break;
        case 1:   /* check the remaining characters */
            for (i = 1; i < len; i++)
            {
                if (!(isalnum((unsigned char)id[i]) || id[i] == '_'))
                {
                    printf("\nERROR: Invalid identifier, %c char not allowed", id[i]);
                    exit(0);
                }
            }
            state = 2;
            break;
        case 2:   /* accepting state */
            printf("\n%s is valid c-language identifier", id);
            exit(0);
        }
    }
}
OUTPUT
Enter a string: abc_123
abc_123 is valid c-language identifier
Enter a string: 1fgj
ERROR: Variable should start with alphabet or underscore
Enter a string: af$d1
ERROR: Invalid identifier, $ char not allowed
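The programs above check rules 1 and 2 only. As a minimal sketch of how rule 4 (keywords are not identifiers) could be folded into the same check, the helper below is hypothetical: the name is_valid_identifier and the shortened keyword list are assumptions made for illustration, not part of the manual's program.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s satisfies rules 1, 2 and 4
   from the theory section, else 0. */
int is_valid_identifier(const char *s)
{
    /* Only a few keywords are listed, for brevity (assumption). */
    static const char *const keywords[] = {
        "if", "else", "for", "while", "int", "return"
    };
    size_t i;

    /* Rule 2: the first character is a letter or underscore. */
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;
    /* Rule 1: the remaining characters are letters, digits or underscores. */
    for (i = 1; s[i] != '\0'; i++)
        if (!(isalnum((unsigned char)s[i]) || s[i] == '_'))
            return 0;
    /* Rule 4: keywords are rejected. */
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i]) == 0)
            return 0;
    return 1;
}
```

Calling is_valid_identifier("then_next_val") gives 1, while "1a", "ab c" and "while" all give 0. The casts to unsigned char matter because the <ctype.h> functions are only defined for values in that range (or EOF).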
  • 8. Practical 2
Aim: Program to check whether given string is keyword or not.
Theory:
Keywords
Keywords are reserved words that have a special, predefined meaning in C language. This meaning cannot be changed. There are 32 keywords in C language (C89), as shown in Figure 1.
auto      double    int       struct
break     else      long      switch
case      enum      register  typedef
const     extern    return    union
char      float     short     unsigned
continue  for       signed    volatile
default   goto      sizeof    void
do        if        static    while
Figure 1: List of keywords in C language.
String as character array
A string is a sequence of characters that is treated as a single data item and terminated by the null character '\0'. C language does not support strings as a data type; a string is a one-dimensional array of characters. To declare the list of keywords, we need a 2-dimensional array of characters. A 2D array is also known as a matrix (a table of rows and columns), as shown in the following example.
f o r \0
i f \0
e l s e \0
d o \0
w h i l e \0
b r e a k \0
s w i t c h \0
c a s e \0
v o i d \0
s t r u c t \0
Figure 2: 2D character array for keywords
Functions used in programming
strcmp()
strcmp() is a built-in library function used for string comparison. It takes two strings (arrays of characters) as arguments, compares them lexicographically, and returns zero, a positive value, or a negative value as the result. It is declared in the <string.h> header file.
Syntax of strcmp()
strcmp(first_str, second_str);
Parameters of strcmp()
first_str: the first string, taken as a pointer to constant characters (the function does not modify it).
second_str: the second string, also taken as a pointer to constant characters.
Return Value of strcmp()
The sign of the return value tells how the two strings compare:
1. Zero ( 0 )
Returned when both strings are identical, that is, all of the characters in both strings are the same.
2. Greater than zero ( > 0 )
Returned when the first non-matching character in first_str has a greater character code than the corresponding character in second_str. In other words, first_str comes lexicographically after second_str.
3. Less than zero ( < 0 )
Returned when the first non-matching character in first_str has a smaller character code than the corresponding character in second_str. In other words, first_str comes lexicographically before second_str.
The exact non-zero value is not specified, so only its sign should be tested.
Program:
// Program to identify whether given string is keyword or not
#include <stdio.h>
#include <string.h>

int iskey(const char *);

char keywrd[10][7] = {"for", "if", "else", "do", "while", "break", "switch", "case", "void", "struct"};

int main(void)
{
    char temp[32];

    printf("\nEnter string:");
    if (scanf("%31s", temp) != 1)   /* the width keeps scanf inside temp */
        return 1;
    if (iskey(temp))
        printf("It is a KEYWORD");
    else
        printf("It is not a KEYWORD");
    return 0;
}

/* Linear search of the keyword table; returns 1 on a match, else 0 */
int iskey(const char *temp)
{
    int flag = 0, i;
    for (i = 0; i < 10; i++)
    {
        if (strcmp(keywrd[i], temp) == 0)
        {
            flag = 1;
            break;
        }
    }
    return flag;
}
OUTPUT
Enter string: for
It is a KEYWORD
Enter string: make
It is not a KEYWORD
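The program above searches a table of ten keywords linearly. A possible variant, sketched here under the assumption that the full list of Figure 1 is wanted, keeps all 32 keywords in ascending order and lets the standard bsearch() function do the lookup. The names c_keywords, cmp_keyword and is_keyword are made up for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* The 32 keywords of Figure 1, in ascending strcmp() order so that
   bsearch() can be applied. */
static const char *const c_keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof", "static",
    "struct", "switch", "typedef", "union", "unsigned", "void", "volatile", "while"
};

/* bsearch() passes the key first and a pointer to a table element second. */
static int cmp_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Hypothetical helper: returns 1 if word is one of the 32 keywords, else 0. */
int is_keyword(const char *word)
{
    return bsearch(word, c_keywords,
                   sizeof c_keywords / sizeof c_keywords[0],
                   sizeof c_keywords[0], cmp_keyword) != NULL;
}
```

Only the sign of strcmp() is used by the comparison function, which is all that bsearch() requires. The lookup is case sensitive, so is_keyword("for") is 1 but is_keyword("For") is 0.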
  • 11. Practical 3
Aim: Program to count lines and characters in given input file.
Theory:
Introduction
In this program, we count the number of characters and newlines present in the given input file. If a character read from the file is not white space, we increment the character count. White space can be detected with the function isspace().
Functions used in programming
isspace()
isspace() is a predefined function used for character handling. It checks whether its argument is a whitespace character (blank, tab, newline, vertical tab, form feed or carriage return). It is declared in the <ctype.h> header file.
Syntax of isspace()
isspace(character);
Parameters of isspace()
The isspace() function takes one parameter of type int, which holds the character to be tested.
Return Value of isspace()
The isspace() function returns an integer value that tells whether the passed parameter is a whitespace character or not:
If the character is a whitespace character, the return value is non-zero.
If the character is not a whitespace character, the return value is zero.
fgetc()
fgetc() reads input from a file one character at a time. It returns the character at the position indicated by the file pointer, as an unsigned char converted to int, and then advances the file pointer to the next character. If the pointer is at end of file or if an error occurs, EOF is returned. Because EOF is a value outside the character range, the result must be stored in an int, not a char.
Syntax:
int fgetc(FILE *pointer)
pointer: pointer to a FILE object that identifies the stream on which the operation is to be performed.
Program
// Program to count lines and characters in input file
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

int main(void)
{
    int line_count = 0, char_count = 0;
    FILE *p;
    int ch;              /* int, so that EOF can be told apart from data */
    char fname[64];

    printf("\nEnter file name:");
    if (scanf("%63s", fname) != 1)
        return 1;
    p = fopen(fname, "r");
    if (p == NULL)
        printf("\nError in opening the file");
    else
    {
        while ((ch = fgetc(p)) != EOF)
        {
            if (isspace(ch))
            {
                if (ch == '\n')
                    line_count++;
            }
            else
                char_count++;
        }
        fclose(p);
        printf("\nLine count = %d", line_count);
        printf("\nCharacter count = %d", char_count);
    }
    return 0;
}
OUTPUT
Enter file name: comment.cpp
Line count = 74
Character count = 715
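The counting loop can also be separated from the file handling, which makes it easier to try on small inputs. The sketch below assumes a hypothetical helper named count_stream that works on any already opened stream; it applies the same rule as the program (newlines count as lines, non-whitespace characters count as characters).

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: counts newline characters and non-whitespace
   characters read from an already opened stream. */
void count_stream(FILE *fp, long *lines, long *chars)
{
    int ch;   /* int, not char, so that EOF stays distinguishable from data */

    *lines = 0;
    *chars = 0;
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n')
            (*lines)++;
        else if (!isspace(ch))
            (*chars)++;
    }
}
```

For a stream containing the two lines "ab c" and a tab followed by "d", the helper reports 2 lines and 4 characters. It can be called with stdin as well as with a file returned by fopen().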
  • 13. Practical 4
Aim: Program to check parenthesis of expression is balanced or not.
Theory:
Introduction
There are two types of parentheses: opening and closing. A well-formed parenthesis string is a balanced parenthesis string. A string of parentheses is called balanced if, for every opening bracket, there is a matching closing bracket after it. Characters such as '(', ')', '[', ']', '{' and '}' are considered brackets. In this program, we consider only round brackets.
Examples of balanced parenthesis
( )( ), (( )), (( )( )), ( ( ( ) ) ( ) )
Examples of unbalanced parenthesis
There are two possibilities of unbalanced parenthesis:
1. Extra or misplaced closing parenthesis
For example: ( ) ), ) ( ) , ) (
2. Extra opening parenthesis
For example: (( ), ( )(, ( ) ( ) (
  • 14. Program
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
    int count = 0;       /* number of '(' not yet closed */
    char expr[80];
    int len, i;

    printf("\nEnter expression :");
    if (fgets(expr, sizeof expr, stdin) == NULL)
        return 1;
    len = strlen(expr);
    i = 0;
    while (i < len)
    {
        if (expr[i] == '(')
            count++;
        else if (expr[i] == ')')
        {
            count--;
            if (count < 0)   // ')' cannot come before its matching '('
            {
                printf("Error: misplaced/extra ')' brace");
                exit(0);
            }
        }
        i++;
    }
    if (count == 0)
        printf("\nValid expression: balanced parenthesis");
    else
        printf("\nError: unbalanced parenthesis, Extra ( brace");
    return 0;
}
OUTPUT
Enter expression :())
Error: misplaced/extra ')' brace
Enter expression :((()())
Error: unbalanced parenthesis, Extra ( brace
Enter expression :()()(())
Valid expression: balanced parenthesis
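A single counter is enough for round brackets, but it cannot tell "([)]" from "([])". As a rough sketch of how the check could be extended to the square and curly brackets mentioned in the theory, the function below uses a small stack of open brackets. The name brackets_balanced and the fixed stack depth of 128 are assumptions made for this example.

```c
/* Hypothetical helper: returns 1 if every bracket in expr is closed by a
   bracket of the same kind in the right order, else 0.
   Characters that are not brackets are ignored. */
int brackets_balanced(const char *expr)
{
    char stack[128];
    int top = 0;            /* number of open brackets currently stacked */
    const char *p;

    for (p = expr; *p != '\0'; p++) {
        char c = *p;
        if (c == '(' || c == '[' || c == '{') {
            if (top == (int)sizeof stack)
                return 0;   /* nesting deeper than this sketch supports */
            stack[top++] = c;
        } else if (c == ')' || c == ']' || c == '}') {
            char open = (c == ')') ? '(' : (c == ']') ? '[' : '{';
            /* A closer must match the most recently opened bracket. */
            if (top == 0 || stack[--top] != open)
                return 0;
        }
    }
    return top == 0;        /* leftover openers mean "extra opening bracket" */
}
```

With this version "{[()]}" is accepted, while "([)]", "())" and "((()())" are rejected.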
  • 15. Practical 5
Aim: Program for lexical analyzer.
Theory:
Role of lexical analyzer
As the first phase of a compiler, the main task of the lexical analyzer is to read the input characters of the source program, group them into lexemes, and produce as output a token for each lexeme in the source program.
Transition diagrams
Patterns are expressed using regular expressions. Regular-expression patterns are then converted to transition diagrams. The following figure shows transition diagrams for relational operators, identifiers, unsigned numbers and white space.
  • 16. Figure: Transition diagrams for relational operators, identifiers, unsigned numbers and white space respectively.
Architecture of a Transition-Diagram-Based Lexical Analyzer
These transition diagrams are then converted into code. There are several ways to implement the entire lexical analyzer:
1. We could arrange for the transition diagrams for each token to be tried sequentially. The function fail() then resets the forward pointer and starts the next transition diagram each time it is called.
2. We could run the various transition diagrams "in parallel", feeding the next input character to all of them and allowing each one to make whatever transitions it requires. The usual strategy is then to take the longest prefix of the input that matches any pattern.
3. We could combine all the transition diagrams into one. We allow the combined diagram to read input until there is no possible next state, and then take the longest lexeme that matched any pattern, as in approach 2. This combination is easy here, because no two tokens can start with the same character; i.e., the first character immediately tells us which token we are looking for. Thus, we could simply combine states 0, 9, 12, and 22 into one start state, leaving other transitions intact.
We follow the third approach for the implementation of the lexical analyzer.
Functions used in the program
gets()
The C library function char *gets(char *str) reads a line from stdin and stores it into the string pointed to by str. It stops either when the newline character is read or when end-of-file is reached, whichever comes first. It allows blanks and tabs to be part of the input string. Because gets() cannot limit the number of characters it stores, it was removed from the language in C11; fgets() is the usual replacement.
Declaration: char *gets(char *str)
Parameters: str − pointer to an array of chars where the C string is stored.
Return Value: This function returns str on success, and NULL on error or when end of file occurs while no characters have been read.
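To make the step from a transition diagram to code concrete, here is a small sketch that simulates only the relational-operator diagram. The state numbers 0, 1 and 6 follow the numbering used in the textbook's figure, which is an assumption here because the figure itself is not reproduced in this transcript; the names Relop and scan_relop are also made up for the illustration.

```c
typedef enum { RELOP_NONE, LT, LE, EQ, NE, GT, GE } Relop;

/* Simulates the relational-operator transition diagram on the text at s.
   On success stores the lexeme length in *len and returns the operator;
   returns RELOP_NONE (with *len = 0) if no relational operator starts at s. */
Relop scan_relop(const char *s, int *len)
{
    int state = 0;
    int i = 0;

    for (;;) {
        char c = s[i];
        switch (state) {
        case 0:                       /* start state */
            if (c == '<') state = 1;
            else if (c == '=') { *len = 1; return EQ; }
            else if (c == '>') state = 6;
            else { *len = 0; return RELOP_NONE; }
            i++;
            break;
        case 1:                       /* seen '<' */
            if (c == '=') { *len = 2; return LE; }
            if (c == '>') { *len = 2; return NE; }
            *len = 1;                 /* "other": retract one character */
            return LT;
        case 6:                       /* seen '>' */
            if (c == '=') { *len = 2; return GE; }
            *len = 1;                 /* "other": retract one character */
            return GT;
        }
    }
}
```

For the input "<=" the function returns LE with a lexeme length of 2, and for "<a" it returns LT with length 1, leaving 'a' for the next token. The program of this practical recognizes arithmetic operators rather than relational ones, so this sketch complements it instead of replacing any part of it.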
  • 17. Program
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

int isop(char);

int main(void)
{
    int state, i, j;
    char ch, input[80], temp[80];

    printf("\n\n\nEnter expression :");   // input
    if (fgets(input, sizeof input, stdin) == NULL)
        return 1;
    input[strcspn(input, "\n")] = '\0';   // drop the trailing newline
    state = 0;
    i = 0;
    do
    {
        switch (state)
        {
        case 0:   // start state: the first character selects the token class
            ch = input[i];
            if (isalpha((unsigned char)ch)) state = 1;
            else if (isdigit((unsigned char)ch)) state = 2;
            else if (isop(ch)) state = 3;
            else if (isspace((unsigned char)ch)) state = 4;
            else if (ch == '\0') state = 5;
            else state = 6;
            break;
        case 1:   // identifier: letter (letter | digit | _)*
            j = 0;
            temp[j] = input[i];
            do
            {
                i++; j++;
                if (isalnum((unsigned char)input[i]) || input[i] == '_')
                    temp[j] = input[i];
                else
                    break;
            } while (1);
            temp[j] = '\0';
            printf(" <ID, %s>", temp);
            state = 0;
            break;
        case 2:   // unsigned number: digit+
            j = 0;
            temp[j] = input[i];
            do
            {
                i++; j++;
                if (isdigit((unsigned char)input[i]))
                    temp[j] = input[i];
                else
                    break;
            } while (1);
            temp[j] = '\0';
            printf(" <NUM, %s>", temp);
            state = 0;
            break;
        case 3:   // operator
            printf(" <OP, %c>", input[i]);
            i++;
            state = 0;
            break;
        case 4:   // skip white space
            while (isspace((unsigned char)input[i])) { i++; }
            state = 0;
            break;
        case 5:   // end of input
            exit(0);
        case 6:   // any other character
            printf(" cant recognize");
            i++;
            state = 0;
            break;
        }
    } while (1);
    return 0;
}

int isop(char ch)
{
    if (ch == '+' || ch == '-' || ch == '*' || ch == '/' || ch == '%' || ch == '=')
        return 1;
    else
        return 0;
}
OUTPUT
Enter expression :abc_12 = pqr + 50$
<ID, abc_12> <OP, => <ID, pqr> <OP, +> <NUM, 50> cant recognize
  • 19. Practical 6
Aim: Program for word recognizer using Lex.
Theory:
Lexical analyzer generator - Lex tool:
Let us study a tool called Lex, or in a more recent implementation Flex (Fast Lexical analyzer Generator), that allows one to specify a lexical analyzer by writing regular expressions that describe the patterns for tokens. The input notation for the Lex tool is referred to as the Lex language, and the tool itself is the Lex compiler. The Lex compiler transforms the input patterns into a transition diagram and generates code in a file called lex.yy.c.
How to create a lexical analyzer using Lex?
The figure suggests how Lex is used. An input file, which we call lex.l, is written in the Lex language and describes the lexical analyzer to be generated. The Lex compiler transforms lex.l into a C program, in a file that is always named lex.yy.c. The latter file is compiled by the C compiler into a file called a.out, as always. The C-compiler output is a working lexical analyzer that can take a stream of input characters and produce a stream of tokens.
flex source program lex.l → Lex compiler → lex.yy.c
lex.yy.c → C compiler → a.out
Figure: Creating a lexical analyzer with flex
Structure of Lex program:
A Lex program has the following form:
Declarations
%%
Translation rules
%%
Auxiliary functions
The declarations section includes declarations of variables, manifest constants (identifiers declared to stand for a constant, e.g., the name of a token), and regular definitions (names given to regular expressions so that the names can be used in subsequent expressions).
The translation rules each have the form:
Pattern { Action }
  • 20. Each pattern is a regular expression, which may use the regular definitions of the declarations section. The actions are fragments of code, typically written in C. The third section holds whatever additional functions are used in the actions. These functions can be compiled separately and loaded with the lexical analyzer.
Tool used for programming: Flex Windows
Flex Windows (Lex and Yacc) contains the GNU Win32 ports of Flex and Bison, which are the Lex and Yacc compilers respectively and are used for generating tokenizers and parsers. The Flex Windows package contains built-in gcc and g++ libraries (C and C++ compilers), which are ported to Windows from Linux. The package also contains the EditPlus IDE, which provides predefined blank templates for Lex/Yacc/C/C++/Java files; each time you want to type a program, you can use the New Lex / New Yacc template and the basic code will be inserted.
%option noyywrap
%option noyywrap tells flex that no yywrap() function is supplied. The generated scanner then behaves as if yywrap() returned 1, so it stops lexing when the first end-of-file is reached.
Word recognizer using Lex
Let's build a simple program that recognizes different types of English words. It identifies different parts of speech (noun, verb, etc.) word by word. A pattern followed by | shares the action of the next pattern.
Program
%option noyywrap
%{
#include<stdio.h>
%}
%%
[\t ]+ /* ignore whitespace */ ;
is |
am |
are |
were |
was |
be |
being |
been |
do |
did |
will |
would |
should |
can |
could |
has |
have |
had |
go { printf("%s: verb\n", yytext); }
very |
simply |
gently |
quietly |
calmly |
angrily { printf("%s: adverb\n", yytext); }
to |
from |
behind |
above |
below |
between { printf("%s: preposition\n", yytext); }
if |
then |
and |
but |
or { printf("%s: conjunction\n", yytext); }
their |
my |
your |
his |
her |
its |
good |
bad |
nice { printf("%s: adjective\n", yytext); }
I |
you |
we |
he |
she |
it |
they { printf("%s: pronoun\n", yytext); }
girl |
boy |
student |
teacher |
parent { printf("%s: common noun\n", yytext); }
.|\n { ECHO; /* normal default anyway */ }
%%
int main(void)
{
    yylex();
    return 0;
}
OUTPUT
I can do it very effectively.
I: pronoun
can: verb
do: verb
it: pronoun
very: adverb
effectively.
I am a good girl.
I: pronoun
am: verb
agood: adjective
girl: common noun
.
do it calmly.
do: verb
it: pronoun
calmly: adverb
.
  • 22. Practical 7
Aim: Program for number system using Lex.
Theory:
Regular Expressions in Lex:
1. Basics: The meaning of some special characters in regular expressions is as follows:
.        matches any single character except \n
*        matches 0 or more instances of the preceding regular expression
+        matches 1 or more instances of the preceding regular expression
?        matches 0 or 1 instance of the preceding regular expression
[ ]      defines a character class
( )      groups the enclosed regular expression into a new regular expression
"…"      matches everything within the quotes literally
x|y      x or y
{i}      definition of i
x/y      x, only if followed by y (y not removed from input)
x{m,n}   m to n occurrences of x
^x       x, but only at beginning of line
x$       x, but only at end of line
"s"      exactly what is in the quotes (except for "\" and the following character)
Note that a regular expression finishes with a space, tab or newline.
2. Meta-characters: Meta-characters do not match themselves, because they have a special meaning in regular expressions:
( ) [ ] { } < > + / , ^ * | . \ " $ ? - %
To match a meta-character, prefix it with "\" (e.g. \.). Similarly, to match a backslash, tab or newline, use \\, \t, or \n.
  • 23. Program
%option noyywrap
%{
#include<stdio.h>
%}
%%
[01]+ {printf("Binary number");}
[0-7]+ {printf("Octal number");}
[0-9]+ {printf("Decimal number");}
[0-9a-fA-F]+ {printf("Hexadecimal Number");}
[0-9]*\.[0-9]+ {printf("Fractional Number");}
[0-9]*E[+-]?[0-9]+ {printf("Number with exponent");}
. {printf("it is not a number");}
%%
int main(void)
{
    printf("\nEnter any numbers:");
    yylex();
    return 0;
}
OUTPUT
Enter any numbers:
1011
Binary number
290
Decimal number
345
Octal number
3D
Hexadecimal Number
18.92
Fractional Number
89E+12
Number with exponent
g
it is not a number
$
it is not a number
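The order of the rules in the Lex program matters: "345" fits the octal, decimal and hexadecimal patterns, and the octal rule wins because it is listed first. The hand-written C sketch below mimics that behaviour for a single complete token by trying the same six patterns in the same order. It is an illustration only; classify_number is a hypothetical name and the function is not generated by Lex.

```c
#include <stddef.h>
#include <string.h>

#define DIGITS "0123456789"

/* Hypothetical helper: classifies one whole token by testing the patterns
   of the Lex program in the order in which they are listed. */
const char *classify_number(const char *s)
{
    size_t n = strlen(s);
    size_t k, m;

    if (n == 0) return "not a number";
    if (strspn(s, "01") == n) return "Binary number";
    if (strspn(s, "01234567") == n) return "Octal number";
    if (strspn(s, DIGITS) == n) return "Decimal number";
    if (strspn(s, DIGITS "abcdefABCDEF") == n) return "Hexadecimal Number";

    k = strspn(s, DIGITS);            /* length of the leading [0-9]* part */
    /* [0-9]*\.[0-9]+ */
    if (s[k] == '.' && s[k + 1] != '\0' && strspn(s + k + 1, DIGITS) == n - k - 1)
        return "Fractional Number";
    /* [0-9]*E[+-]?[0-9]+ */
    if (s[k] == 'E') {
        m = k + 1;
        if (s[m] == '+' || s[m] == '-')
            m++;
        if (s[m] != '\0' && strspn(s + m, DIGITS) == n - m)
            return "Number with exponent";
    }
    return "not a number";
}
```

One consequence of the rule order is worth noticing: "89E+12" is reported as a number with exponent, but "89E12" is reported as hexadecimal, because every character of it is a hexadecimal digit and the hexadecimal pattern is tried before the exponent pattern.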
  • 24. Practical 8
Aim: Program to count words, characters and lines using Lex.
Theory:
Variables and functions provided by flex:
The lexical analyzer that flex generates uses the following variables:
1. yylval: holds the attribute value of a token, whether that is another numeric code, a pointer to the symbol table, or nothing. It is assigned in the rule actions and is shared between the lexical analyzer and the parser.
2. yytext: a pointer to the beginning of the lexeme.
3. yyleng: holds the length of the lexeme found.
Some functions provided by flex are as follows:
1. yymore(): append the next string matched to the current contents of yytext.
2. yyless(n): remove from yytext all but the first n characters.
3. unput(c): return character c to the input stream.
4. yywrap(): may be replaced by the user. yywrap() is called by the lexical analyzer whenever it reads an EOF as the first character when trying to match a regular expression.
Meaning of some special characters used in regular expressions:
^x       x, but only at beginning of line
[ ]      defines a character class
+        matches 1 or more instances of the preceding regular expression
  • 25. Program
%option noyywrap
%{
#include<stdio.h>
int wordcount=0,charcount=0,linecount=0;
%}
word [^ \t\n]+
line \n
%%
"#" {printf("\n word count =%d, character count =%d, line count =%d", wordcount, charcount, linecount); }
{word} {wordcount++;charcount+=yyleng;}
{line} {linecount++;}
%%
int main(void)
{
    printf("Enter any text (Terminate input with #):\n");
    yylex();
    // printf("\n wc=%d,cc=%d,lc=%d",wordcount,charcount,linecount);
    return 0;
}
OUTPUT:-
Enter any text (Terminate input with #):
I live in India.
Maharasthra
Akola
#
word count =6, character count =29, line count =3
  • 26. Practical 9
Aim: Program for lexical analyzer using Lex.
Theory:
The Lex program of this practical recognizes tokens and prints the token found. A few observations about its code introduce many of the important features of Lex.
In the declarations section we see a pair of special brackets, %{ and %}. Anything within these brackets is copied directly to the file lex.yy.c. Also in the declarations section is a sequence of regular definitions. These use the extended notation for regular expressions. Regular definitions that are used in later definitions or in the patterns of the translation rules are surrounded by curly braces. Thus, for instance, delim is defined to be shorthand for the character class consisting of the blank, tab and newline. Then, whitespace is defined to be one or more delimiters, by the regular expression {delim}+.
Notice that in the definitions of id and num, parentheses are used as grouping metasymbols and do not stand for themselves. If we wish to use one of the Lex metasymbols, such as any of the parentheses, +, *, or ?, to stand for itself, we may precede it with a backslash. For instance, we see \. in the definition of num, to represent the dot, since that character is a metasymbol representing "any character", as usual in UNIX regular expressions.
Finally, let us examine some of the patterns and rules in the middle section. First, whitespace, an identifier declared in the first section, has an associated empty action. The second rule has the simple regular expression pattern if. Should we see the two letters if on the input, and they are not followed by another letter or digit (which would cause the lexical analyzer to find a longer prefix of the input matching the pattern for id), then the lexical analyzer consumes these two letters from the input and prints the token name IF. The keyword else is treated similarly.
Conflict Resolution in Lex
We have alluded to the two rules that Lex uses to decide on the proper lexeme to select when several prefixes of the input match one or more patterns:
1. Always prefer a longer prefix to a shorter prefix.
2. If the longest possible prefix matches two or more patterns, prefer the pattern listed first in the Lex program.
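The two rules can be illustrated without Lex. The sketch below is a hand-written stand-in, not what Lex generates: it tries three patterns in their listed order (if, else, identifier), remembers the longest match, and keeps the earlier pattern on a tie. All names in it (match_keyword, match_id, next_token) are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Length of the prefix of s that matches the literal keyword kw, or 0. */
static size_t match_keyword(const char *s, const char *kw)
{
    size_t n = strlen(kw);
    return strncmp(s, kw, n) == 0 ? n : 0;
}

/* Length of the prefix of s that matches letter(letter|digit)*, or 0. */
static size_t match_id(const char *s)
{
    size_t n = 0;
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;
    while (isalnum((unsigned char)s[n]) || s[n] == '_')
        n++;
    return n;
}

/* Patterns in "listed" order: IF, ELSE, ID. Returns the chosen token name
   and stores the lexeme length in *len ("NONE" and 0 if nothing matches). */
const char *next_token(const char *s, size_t *len)
{
    static const char *const names[] = { "IF", "ELSE", "ID" };
    size_t lens[3];
    size_t best = 0;
    int i, winner = -1;

    lens[0] = match_keyword(s, "if");
    lens[1] = match_keyword(s, "else");
    lens[2] = match_id(s);
    for (i = 0; i < 3; i++) {
        if (lens[i] > best) {   /* rule 1: strictly longer wins;           */
            best = lens[i];     /* rule 2: on a tie the earlier one stays  */
            winner = i;
        }
    }
    *len = best;
    return winner < 0 ? "NONE" : names[winner];
}
```

On "if(x)" both the if pattern and the identifier pattern match two characters, so the earlier pattern gives IF. On "ifx" the identifier pattern matches three characters and wins by rule 1, giving ID.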
  • 27. Program
%option noyywrap
%{
#include<stdio.h>
%}
/*Regular definitions */
digit [0-9]
letter [a-zA-Z_]
id {letter}({letter}|{digit})*
digits {digit}+
num {digits}(\.{digits})?
delim [ \n\t]
op [-+*/%]
whitespace {delim}+
%%
{whitespace} {/*no action,ignore*/}
if {printf(" <IF>");}
else {printf(" <ELSE>");}
{id} {printf(" <ID,%s>",yytext);}
{num} {printf(" <NUM,%s>",yytext);}
"<=" {printf(" <RELOP,LE>");}
"<" {printf(" <RELOP,LT>");}
">=" {printf(" <RELOP,GE>");}
">" {printf(" <RELOP,GT>");}
"<>" {printf(" <RELOP,NE>");}
"==" {printf(" <RELOP,EQ>");}
"=" {printf(" <ASSIGN>");}
{op} {printf(" <OP,%s>",yytext);}
. {printf(" <%s>",yytext);}
%%
int main(void)
{
    printf("Enter expression/code fragment:\n");
    yylex();
    return 0;
}
In the definition of op, the minus sign is written first inside the brackets; placed between two other characters it would be read as a range.
OUTPUT
Enter expression/code fragment:
if(x<y)
<IF> <(> <ID,x> <RELOP,LT> <ID,y> <)>
x=x+y;
<ID,x> <ASSIGN> <ID,x> <OP,+> <ID,y> <;>
else
<ELSE>
y=y*10;
<ID,y> <ASSIGN> <ID,y> <OP,*> <NUM,10> <;>
  • 28. Practical 10
Aim: Program for simple desk calculator using YACC.
Theory:
Parser generator - Yacc tool:
Yacc (Yet Another Compiler-Compiler) is a tool for automatically generating an LALR parser from a grammar written in a Yacc specification (.y file). A grammar specifies a set of production rules, which define a language. A production rule specifies a sequence of symbols that is legal in the language. A more recent parser generator, Bison, is also available on Linux.
How to create a parser using Yacc?
Yacc transforms the file translate.y into a C program called y.tab.c (Bison names this file translate.tab.c by default). The program y.tab.c is a representation of an LALR parser written in C. By compiling y.tab.c with the ly library, which contains the LR parsing program, using the command
cc y.tab.c -ly
we obtain the desired object program a.out, which performs the translation specified by translate.y.
Yacc specification translate.y → Yacc compiler → y.tab.c
y.tab.c → C compiler → a.out
Figure: Creating a translator with Yacc.
Structure of Yacc program:
The structure of a Yacc source program is as follows:
%{
< C global variables, prototypes, comments >
%}
[DEFINITION SECTION]
%%
[PRODUCTION RULES SECTION]
%%
< Supporting C routines >
  • 29. The definition section contains token declarations such as
%token DIGIT
This statement declares DIGIT to be a token. These tokens are also available to the lexical analyzer generated by flex.
In the production rules section we put the grammar production rules and the semantic action associated with each rule. A set of productions in a grammar of the form
<head> → <body>1 | <body>2 | … | <body>n
would be written in Yacc as
<head> : <body>1 {<semantic action>1}
       | <body>2 {<semantic action>2}
       …
       | <body>n {<semantic action>n}
       ;
A semantic action is a sequence of C statements. Inside a semantic action, the attribute values associated with the grammar symbols of the production are referred to as follows:
Head : symbol1 symbol2 … symboln { semantic action }
 $$      $1      $2    …   $n
Note that { $$=$1; } is the default semantic action.
The third part of a Yacc specification contains the supporting C routines. Here a lexical analyzer by the name yylex() must be provided. The lexical analyzer produces tokens of the form
< Token-name, Attribute value >
where the token-name (such as DIGIT) must be declared in the Yacc specification. The attribute value associated with the token is communicated to the parser through the Yacc-defined variable yylval.
  • 30. Program:
%{
#include<stdio.h>
#include<ctype.h>
int yylex(void);
void yyerror(const char *s);
%}
%token DIGIT
%left '+' '-'
%left '*' '/'
%%
line   : expr '\n'        { printf("%d\n", $1); }
       ;
expr   : expr '+' term    { $$ = $1 + $3; }
       | expr '-' term    { $$ = $1 - $3; }
       | term             { /* $$ = $1; */ }
       ;
term   : term '*' factor  { $$ = $1 * $3; }
       | term '/' factor  { $$ = $1 / $3; }
       | factor
       ;
factor : '(' expr ')'     { $$ = $2; }
       | DIGIT
       ;
%%
int main(void)
{
    yyparse();
    return 0;
}
void yyerror(const char *s)
{
    fprintf(stderr, "%s", s);
}
/* Hand-written lexical analyzer: a single digit is returned as DIGIT,
   every other character is returned as itself. */
int yylex(void)
{
    int c;
    c = getchar();
    if (isdigit(c))
    {
        yylval = c - '0';
        return DIGIT;
    }
    return c;
}
OUTPUT
(4+2)*(7-5)/6
2
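For comparison, the same grammar can be evaluated by a hand-written recursive-descent routine. This is a different technique from the LALR parser that Yacc generates, and the sketch below is only meant to show what the semantic actions compute: the left-recursive rules for expr and term become loops, and operands are single digits as in the Yacc program. The names cur, expr, term, factor and eval are assumptions of this example, and error reporting is left out.

```c
#include <ctype.h>

static const char *cur;   /* next unread character of the input */

static int expr(void);

/* factor : '(' expr ')' | DIGIT */
static int factor(void)
{
    int v = 0;
    if (*cur == '(') {
        cur++;
        v = expr();
        if (*cur == ')')
            cur++;
    } else if (isdigit((unsigned char)*cur)) {
        v = *cur - '0';
        cur++;
    }
    return v;
}

/* term : term '*' factor | term '/' factor | factor */
static int term(void)
{
    int v = factor();
    while (*cur == '*' || *cur == '/') {
        char op = *cur++;
        int r = factor();
        if (op == '*')
            v *= r;
        else if (r != 0)    /* division by zero is skipped in this sketch */
            v /= r;
    }
    return v;
}

/* expr : expr '+' term | expr '-' term | term */
static int expr(void)
{
    int v = term();
    while (*cur == '+' || *cur == '-') {
        char op = *cur++;
        int r = term();
        v = (op == '+') ? v + r : v - r;
    }
    return v;
}

/* Hypothetical entry point: evaluates an expression without blanks. */
int eval(const char *s)
{
    cur = s;
    return expr();
}
```

eval("(4+2)*(7-5)/6") gives 2, the same result as the Yacc program, and eval("8-3-2") gives 3 because the loops preserve left associativity.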
  • 31. Extra set of practicals
  • 32. Practical 1
Aim: Program to eliminate comments from given input file.
Theory:
This program eliminates comments from a C program. Only single-line comments, which begin with '//', are handled. Our aim is to detect the occurrence of these two characters and ignore the rest of the line. The program also drops newline and tab characters outside comments, so the output is more compact than the input.
Program
#include <stdio.h>

int main(void)
{
    FILE *in;
    char fname[64];
    int c, next;

    printf("\nEnter file name: ");
    if (scanf("%63s", fname) != 1)
        return 1;
    in = fopen(fname, "r");
    if (!in)
    {
        printf("\nfile cannot be opened");
        return 1;
    }
    while ((c = fgetc(in)) != EOF)
    {
        if (c == '\n' || c == '\t')   // eliminate newlines and tabs
            continue;
        if (c == '/')
        {
            next = fgetc(in);
            if (next == '/')
            {
                do   // skip the comment up to the end of the line
                {
                    c = fgetc(in);
                } while (c != '\n' && c != EOF);
                if (c == EOF)
                    break;
            }
            else if (next != EOF)
                ungetc(next, in);   // not a comment: put the character back
        }
        putchar(c);
    }
    fclose(in);
    return 0;
}
INPUT
void main()
{
int a=10; //first
int b=20; //second
if (a<b) //check
printf("/a is greater/");
else printf("b is greater");
getch();
}
OUTPUT
Enter file name: testin.c
void main(){int a=10; 
int b=20; 
if (a<b) 
printf("/a is greater/");else printf("b is greater");getch();}
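The program treats every "//" as the start of a comment, even inside a string literal such as "http://example". One way to avoid that, sketched here under stated assumptions, is to track whether the scan is currently inside double quotes. The function name strip_line_comments is hypothetical, it works on an in-memory string rather than a file, and character constants such as '"' are not handled.

```c
#include <stddef.h>

/* Hypothetical helper: copies src to dst without // comments.
   "//" inside a string literal is kept. Newlines are preserved.
   dst must have room for strlen(src) + 1 bytes. */
void strip_line_comments(const char *src, char *dst)
{
    int in_string = 0;
    size_t i = 0, j = 0;

    while (src[i] != '\0') {
        char c = src[i];
        if (in_string) {
            dst[j++] = c;
            if (c == '\\' && src[i + 1] != '\0')
                dst[j++] = src[++i];   /* copy the escaped character as is */
            else if (c == '"')
                in_string = 0;         /* closing quote */
            i++;
        } else if (c == '"') {
            in_string = 1;             /* opening quote */
            dst[j++] = c;
            i++;
        } else if (c == '/' && src[i + 1] == '/') {
            while (src[i] != '\0' && src[i] != '\n')
                i++;                   /* skip to end of line, keep the newline */
        } else {
            dst[j++] = c;
            i++;
        }
    }
    dst[j] = '\0';
}
```

Given the text `puts("a//b"); // c`, the helper keeps the string literal intact and removes only the trailing comment.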
  • 34. Practical 2
Aim: Program for predictive parser.
Theory: Predictive parsers can be depicted using transition diagrams for each non-terminal symbol, where the edges between the initial and the final states are labeled by the symbols (terminals and non-terminals) of the right side of the production rule. The program below is the table-driven form: it keeps a stack of grammar symbols and, for the non-terminal on top of the stack and the current input symbol, looks up the production to apply.
Program
// PROGRAM FOR PREDICTIVE PARSER
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX 50

void display(void);
void push(char);
char pop(void);
int expand(const char *row, char curr);

char stack[MAX];
int top = -1, ip = 0, len;
char istr[MAX], tos, curr_symb;

/* Grammar: E -> T E'   E' -> + T E' | ^   T -> F T'   T' -> * F T' | ^
            F -> id | ( E )
   Parsing-table rows, one per non-terminal. Each entry is a terminal,
   the right-hand side in reverse order (so that its first symbol ends up
   on top of the stack), and a '\n' terminator.
   d=id  B=blank (error)  N=null production  A=E'  H=T' */
const char E[] = "dAT\n+B\n*B\n(AT\n)B\n$B\n";
const char A[] = "dB\n+AT+\n*B\n(B\n)N\n$N\n";
const char T[] = "dHF\n+B\n*B\n(HF\n)B\n$B\n";
const char H[] = "dB\n+N\n*HF*\n(B\n)N\n$N\n";
const char F[] = "dd\n+B\n*B\n()E(\n)B\n$B\n";

int main(void)
{
    int error_flag = 0;

    printf("\n Enter Input String (at the end $) : ");
    if (scanf("%49s", istr) != 1)
        return 1;
    len = strlen(istr);
    push('$');
    push('E');
    printf("Stack\tInput\n");
    display();
    while (1) {
        tos = pop();
        curr_symb = istr[ip];
        switch (tos) {
        case 'E': error_flag = expand(E, curr_symb); display(); break;
        case 'A': error_flag = expand(A, curr_symb); display(); break;
        case 'T': error_flag = expand(T, curr_symb); display(); break;
        case 'H': error_flag = expand(H, curr_symb); display(); break;
        case 'F': error_flag = expand(F, curr_symb); display(); break;
        case 'd': case '+': case '*': case '(': case ')':
            if (curr_symb == tos)
                ip++;                 /* terminal matched: advance input */
            else
                error_flag = 1;
            display();
            break;
        case '$':
            if (curr_symb == '$') {
                printf("\n Accepted\n");
                return 0;
            }
            error_flag = 1;
            break;
        case 'N':                     /* null production: nothing to push */
            display();
            break;
        }
        if (error_flag == 1) {
            printf("\nError\n");
            return 1;
        }
    }
}

/* Find the entry for curr in a table row and push its right-hand side.
   Returns 1 for a blank entry or an unknown input symbol, 0 otherwise. */
int expand(const char *row, char curr)
{
    int i = 0, j;

    while (row[i] != '\0') {
        if (row[i] == curr) {
            for (j = i + 1; row[j] != '\n'; j++) {
                if (row[j] == 'B')
                    return 1;
                push(row[j]);
            }
            return 0;
        }
        while (row[i] != '\n')        /* skip to the next entry */
            i++;
        i++;
    }
    return 1;
}

void display(void)
{
    int t;

    for (t = 0; t <= top; t++)
        printf("%c", stack[t]);
    printf("\t");
    for (t = ip; t < len; t++)
        printf("%c ", istr[t]);
    printf("\n");
}

void push(char ch)
{
    if (top >= MAX - 1) {
        printf("\n Stack overflow\n");
        exit(1);
    }
    stack[++top] = ch;
}

char pop(void)
{
    if (top < 0) {
        printf("\n Stack is empty\n");
        exit(1);
    }
    return stack[top--];
}
  • 38. OUTPUT
Enter Input String (at the end $) : d+d*d$
Stack   Input
$E      d + d * d $
$AT     d + d * d $
$AHF    d + d * d $
$AHd    d + d * d $
$AH     + d * d $
$AN     + d * d $
$A      + d * d $
$AT+    + d * d $
$AT     d * d $
$AHF    d * d $
$AHd    d * d $
$AH     * d $
$AHF*   * d $
$AHF    d $
$AHd    d $
$AH     $
$AN     $
$A      $
$N      $
$       $
Accepted
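The transition-diagram view mentioned in the theory of Practical 2 maps directly onto recursive descent: one function per non-terminal, each choosing its production from the current input symbol. The following is a minimal sketch for the same expression grammar (E -> T E', E' -> + T E' | ^, T -> F T', T' -> * F T' | ^, F -> d | ( E )). The function names and the accepts() entry point are assumptions made for illustration; this is an alternative to the stack-based program, not a rewrite of it.

```c
/* Recursive-descent recognizer; d stands for id, as in the table-driven
   program. All names here are illustrative. */
static const char *in;            /* next unread input symbol */

static int parse_E(void);

static int parse_F(void)          /* F -> d | ( E ) */
{
    if (*in == 'd') {
        in++;
        return 1;
    }
    if (*in == '(') {
        in++;
        if (!parse_E() || *in != ')')
            return 0;
        in++;
        return 1;
    }
    return 0;
}

static int parse_Tprime(void)     /* T' -> * F T' | ^ */
{
    if (*in == '*') {
        in++;
        return parse_F() && parse_Tprime();
    }
    return 1;                     /* null production */
}

static int parse_T(void)          /* T -> F T' */
{
    return parse_F() && parse_Tprime();
}

static int parse_Eprime(void)     /* E' -> + T E' | ^ */
{
    if (*in == '+') {
        in++;
        return parse_T() && parse_Eprime();
    }
    return 1;                     /* null production */
}

static int parse_E(void)          /* E -> T E' */
{
    return parse_T() && parse_Eprime();
}

/* Returns 1 if s (terminated by '$') is a sentence of the grammar. */
int accepts(const char *s)
{
    in = s;
    return parse_E() && *in == '$';
}
```

For example, accepts("d+d*d$") returns 1, while accepts("d+*d$") returns 0 because F cannot start with '*'.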
  • 39. Practical 3
Aim: Program to check if given grammar is left recursive and to remove left recursion.
Theory: Removing left recursion
Left recursion poses problems for most top-down parsers because it leads them into infinite recursion. Therefore, a grammar is often preprocessed to eliminate the left recursion. A rule of the form A -> Aα | β is replaced by the two rules A -> βX and X -> αX | ^, where X is a new non-terminal and ^ denotes the empty string. The program handles one production with a single left-recursive alternative followed by one other alternative.
Program
#include <stdio.h>
#include <string.h>

int main(void)
{
    int i, j;
    char pr[20], NT, beta[20], alpha[20];

    printf("\nEnter production rule :");
    if (scanf("%19s", pr) != 1 || strlen(pr) < 4) {
        printf("\nInvalid production rule\n");
        return 1;
    }
    j = 0;
    NT = pr[j];
    j += 3;                                     /* skip -> */
    if (pr[j] == NT) {
        printf("\nGrammar is left recursive");
        j++;
        i = 0;
        while (pr[j] != '|' && pr[j] != '\0')   /* alpha */
            alpha[i++] = pr[j++];
        alpha[i] = '\0';
        if (pr[j] == '|')
            j++;                                /* skip | */
        i = 0;
        while (pr[j] != '\0')                   /* beta */
            beta[i++] = pr[j++];
        beta[i] = '\0';
        printf("\nNew production rules are:");
        printf("\n\t%c -> %sX", NT, beta);
        printf("\n\tX->%sX|^\n", alpha);
    } else
        printf("\ngrammar is not left recursive\n");
    return 0;
}
OUTPUT
Enter production rule :T->T*F|F
Grammar is left recursive
New production rules are:
        T -> FX
        X->*FX|^
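The program in Practical 3 assumes exactly one left-recursive alternative and one other alternative. A minimal sketch of the general immediate case, A -> Aα1 | Aα2 | ... | β1 | β2 | ..., which becomes A -> β1X | β2X | ... and X -> α1X | α2X | ... | ^, is given below. The function name remove_left_recursion, its output format, and the fixed buffer sizes are assumptions made for illustration; symbols are single characters and X is the new non-terminal, as in the program above. It assumes at least one alternative that does not start with A.

```c
#include <stdio.h>
#include <string.h>

/* Rewrites one production with immediate left recursion.
   Returns 1 if the rule was left recursive (out then holds both new
   rules separated by '\n'), otherwise 0 and out is left untouched.
   (remove_left_recursion is a hypothetical helper.) */
int remove_left_recursion(const char *pr, char *out, size_t outsz)
{
    char rule[64], alphas[192] = "", betas[192] = "";
    char nt, *alt;

    if (strlen(pr) < 4 || strlen(pr) >= sizeof rule)
        return 0;
    strcpy(rule, pr);
    nt = rule[0];

    /* rule + 3 skips "A->"; alternatives are separated by '|' */
    for (alt = strtok(rule + 3, "|"); alt != NULL; alt = strtok(NULL, "|")) {
        char piece[64];
        if (alt[0] == nt) {                       /* A -> A alpha */
            snprintf(piece, sizeof piece, "%sX|", alt + 1);
            strcat(alphas, piece);
        } else {                                  /* A -> beta */
            snprintf(piece, sizeof piece, "%sX|", alt);
            strcat(betas, piece);
        }
    }
    if (alphas[0] == '\0')
        return 0;                                 /* not left recursive */
    if (betas[0] != '\0')
        betas[strlen(betas) - 1] = '\0';          /* drop trailing '|' */
    snprintf(out, outsz, "%c->%s\nX->%s^", nt, betas, alphas);
    return 1;
}
```

Called with "E->E+T|E-T|T", the sketch writes the two rules E->TX and X->+TX|-TX|^ into out.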
  • 40. Practical 4
Aim: Program for code generator.
Theory: In computing, code generation is the process by which a compiler's code generator converts some intermediate representation of source code into target code. The input to the code generator typically consists of a parse tree, an abstract syntax tree, or three-address code.
Major tasks in code generation
Tasks which are typically part of a sophisticated compiler's "code generation" phase include:
- Instruction selection: which instructions to use.
- Instruction scheduling: in which order to put those instructions. Scheduling is a speed optimization that can have a critical effect on pipelined machines.
- Register allocation: the allocation of variables to processor registers.
The program reads three-address statements of the form x=a*b (single-character operands, one statement per line) from the file tac.c and emits a load of each operand, the operation, and a store of the result.
Program
#include <stdio.h>
#include <stdlib.h>

char op;
char soc1, soc2, dest;
void code_generate(void);

int main(void)
{
    FILE *p;
    int ch;                       /* int, so that EOF can be detected */

    p = fopen("tac.c", "r");
    if (p == NULL) {
        printf("\nError in opening");
        exit(1);
    }
    while ((ch = fgetc(p)) != EOF) {
        if (ch == '\n')           /* end of a statement or blank line */
            continue;
        dest = ch;
        fgetc(p);                 /* assignment op */
        soc1 = fgetc(p);
        op = fgetc(p);
        soc2 = fgetc(p);
        code_generate();
    }
    fclose(p);
    return 0;
}

void code_generate(void)
{
    printf("MOV %c,R1\n", soc1);
    printf("MOV %c,R2\n", soc2);
    switch (op) {
    case '+': printf("ADD R1,R2\n"); break;
    case '-': printf("SUB R1,R2\n"); break;
    case '*': printf("MUL R1,R2\n"); break;
    case '/': printf("DIV R1,R2\n"); break;
    }
    printf("MOV R1,%c\n", dest);
}
  • 41. INPUT (tac.c)
x=a*b
y=c+x
d=x+y
OUTPUT
MOV a,R1
MOV b,R2
MUL R1,R2
MOV R1,x
MOV c,R1
MOV x,R2
ADD R1,R2
MOV R1,y
MOV x,R1
MOV y,R2
ADD R1,R2
MOV R1,d
  • 42. Practical 5
Aim: Program to check whether a string ends with bb using Lex.
Theory: The regular expressions used are:
[ab]* - zero or more occurrences of a or b
[ab]*bb - zero or more occurrences of a or b, followed by bb
Lex selects the longest match; when both patterns match the same text, the rule listed first wins, so a string ending in bb is reported by the first rule.
Program
%option noyywrap
%{
#include<stdio.h>
%}
%%
[ab]*bb   {printf("string accepted");}
[ab]*     {printf("string rejected");}
%%
int main()
{
    printf("\nEnter string of a & b: ");
    yylex();
    return 0;
}
OUTPUT
Enter string of a & b: abaaa
string rejected
abaab
string rejected
abb
string accepted
bb
string accepted
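Lex turns the pattern [ab]*bb of Practical 5 into a finite automaton. As an illustration of what such an automaton does, the following is a minimal hand-written DFA in C for the same language. The function name ends_with_bb and the state numbering are assumptions made for this sketch; it is not the code that Lex generates.

```c
/* DFA for [ab]*bb.
   State 0: the input so far does not end in b.
   State 1: the input so far ends in exactly one b.
   State 2: the input so far ends in bb (accepting). */
int ends_with_bb(const char *s)
{
    int state = 0;

    for (; *s != '\0'; s++) {
        if (*s == 'a')
            state = 0;                     /* an a resets the run of b's */
        else if (*s == 'b')
            state = (state == 0) ? 1 : 2;  /* one more trailing b */
        else
            return 0;                      /* symbol outside {a, b} */
    }
    return state == 2;
}
```

With the inputs from the OUTPUT above, ends_with_bb("abaab") returns 0 and ends_with_bb("abb") returns 1.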
  • 43. Practical 6
Aim: Program to count the number of vowels and consonants in a given input string using Lex.
Theory: The regular expressions used are:
[aeiouAEIOU] - any vowel, in upper or lower case
[a-zA-Z] - any letter, in upper or lower case; because the vowel rule is listed first, this rule fires only for consonants
Conflict Resolution in Lex
Lex uses two rules to decide on the proper lexeme to select when several prefixes of the input match one or more patterns:
1. Always prefer a longer prefix to a shorter prefix.
2. If the longest possible prefix matches two or more patterns, prefer the pattern listed first in the Lex program.
So, even though vowels also fall within a-z, a vowel is matched by the first regular expression (i.e. [aeiouAEIOU]).
Program
%option noyywrap
%{
#include<stdio.h>
int vowel_count=0, consonant_count=0;
%}
%%
[aeiouAEIOU]  {vowel_count++;}
[a-zA-Z]      {consonant_count++;}
[$]           {printf("vowel count=%d consonant count=%d ",vowel_count,consonant_count);}
.|\n          { /* ignore spaces, punctuation and newlines instead of echoing them */ }
%%
int main()
{
    printf("Enter any text (Terminate input with $):\n");
    yylex();
    return 0;
}
OUTPUT
Enter any text (Terminate input with $):
I live in India.$
vowel count=7 consonant count=5
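The effect of rule order in Practical 6 can be mirrored in plain C by testing for a vowel before testing for a letter. The following is a minimal sketch; the function name count_letters is an assumption made for illustration, and the code only imitates the two Lex rules rather than reproducing Lex's matching algorithm.

```c
#include <string.h>

/* Classify each character the way the two Lex rules do: the vowel test
   comes first, so the general letter test only ever sees consonants. */
void count_letters(const char *s, int *vowels, int *consonants)
{
    *vowels = *consonants = 0;
    for (; *s != '\0'; s++) {
        if (strchr("aeiouAEIOU", *s) != NULL)      /* rule 1: [aeiouAEIOU] */
            (*vowels)++;
        else if ((*s >= 'a' && *s <= 'z') ||
                 (*s >= 'A' && *s <= 'Z'))         /* rule 2: [a-zA-Z] */
            (*consonants)++;
        /* anything else (spaces, digits, punctuation) is ignored */
    }
}
```

If the two tests were swapped, every letter would be counted as a consonant, which is the same effect as listing [a-zA-Z] first in the Lex program.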
  • 44. Practical 7
Aim: YACC program for an advanced desk calculator, with the lexical analyzer created using Lex.
Theory: Creating a lexical analyzer for Yacc using Flex:
If Lex is used to produce the lexical analyzer, we replace the routine yylex() in the Yacc specification by the statement
#include "lex.yy.c"
The file lex.yy.c is obtained by compiling the Lex program (using the Lex compile option). The lexical analyzer generated by Lex recognizes the token NUM (a number with one or more digits). The Yacc program is then compiled using the YACC compile option in the tools menu. This produces the file y.tab.c, which can be compiled using a C compiler.
Precedence and Associativity:
To reduce conflicts, Yacc provides a facility to assign precedences and associativities to terminals. In the declaration section, the declaration
%left '+' '-'
makes + and - have the same precedence and be left associative. We can declare an operator to be right associative as
%right '^'
The operators are given precedences in the order in which they appear in the declaration part, lowest first.
Program
Lex file - cal.l
%option noyywrap
number [0-9]+
%%
{number}  { sscanf(yytext, "%d", &yylval); return NUM; }
\n|.      { return yytext[0]; }
%%
  • 45. Yacc file - cal.y
%{
#include<stdio.h>
int yylex(void);
void yyerror(const char *s);
%}
%token NUM
%left '+' '-'
%left '*' '/'
%%
line : expr '\n'      {printf("%d\n",$1);}
     ;
expr : expr '+' expr  {$$=$1+$3;}
     | expr '-' expr  {$$=$1-$3;}
     | expr '*' expr  {$$=$1*$3;}
     | expr '/' expr  {$$=$1/$3;}
     | '(' expr ')'   {$$=$2;}
     | NUM
     ;
%%
#include "lex.yy.c"
int main()
{
    printf("Enter expression:");
    yyparse();
    return 0;
}
void yyerror(const char *s)
{
    fprintf(stderr,"%s\n",s);
}
OUTPUT
Enter expression:(12*5)/10+35
41
Enter expression:10*(280+20)/10+1
301
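To see what the %left declarations of Practical 7 mean for evaluation order, the following minimal sketch evaluates the same expressions by precedence climbing in C. This is a different technique from the LALR parser that Yacc generates; it is used here only because it makes the precedence levels and left associativity explicit. The function names are assumptions made for illustration, there is no error reporting, and division by zero is arbitrarily treated as 0.

```c
#include <ctype.h>

static const char *p;             /* next unread character */

/* Precedence levels, lowest first, mirroring the declaration order. */
static int prec(char op)
{
    switch (op) {
    case '+': case '-': return 1; /* %left '+' '-' */
    case '*': case '/': return 2; /* %left '*' '/' */
    default:            return 0; /* not a binary operator */
    }
}

static int parse_expr(int min_prec);

static int parse_primary(void)    /* NUM or ( expr ) */
{
    int v = 0;

    if (*p == '(') {
        p++;
        v = parse_expr(1);
        if (*p == ')')
            p++;
        return v;
    }
    while (isdigit((unsigned char)*p))
        v = v * 10 + (*p++ - '0');
    return v;
}

static int parse_expr(int min_prec)
{
    int lhs = parse_primary();

    while (prec(*p) >= min_prec) {
        char op = *p++;
        /* Left associative: the right operand may contain only
           operators of strictly higher precedence. */
        int rhs = parse_expr(prec(op) + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs = rhs ? lhs / rhs : 0; break;  /* sketch: x/0 -> 0 */
        }
    }
    return lhs;
}

int evaluate(const char *s)
{
    p = s;
    return parse_expr(1);
}
```

Here evaluate("2-3-4") groups as (2-3)-4 because '-' is left associative, and evaluate("2+3*4") multiplies first because '*' is declared after '+' and so has higher precedence.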