|
Algorithms:
Procedure main
begin
if symb = '#' then
begin
  advance to next token in input file
  if symb = 'i' then
  begin
    advance to next token in input file
    while symb != '\n' do
    begin
      advance to next token in input file
    end {while}
    print symb is a preprocessor directive
  end {if symb = 'i'}
  if symb = 'd' then
  begin
    advance to next token in input file
    while symb != ' ' do
    begin
      advance to next token in input file
    end {while}
    advance to next token in input file
    print symb is a constant
    advance to next token in input file
    while symb != '\n' do
    begin
      advance to next token in input file
    end {while}
  end {if symb = 'd'}
end {if symb = '#'}
if symb is a letter or symb = '_' then
begin
  advance to the next token in input file
  while symb is a digit or a letter or symb = '_' do
  begin
    advance to the next token in input file
  end {while}
  call function verify to check whether symb is an identifier or a keyword
end {if}
if symb = '+' then
begin
  advance to the next token in input file
  if symb = '+' then
    print symb is ++ operator
  else
  begin
    ungetc symb back to the input file
    print symb is + operator
  end
end {if}
if symb = '-' then
begin
  advance to the next token in input file
  if symb = '-' then
    print symb is -- operator
  else
  begin
    ungetc symb back to the input file
    print symb is - operator
  end
end {if}
if symb = '|' then
begin
  advance to the next token in input file
  if symb = '|' then
    print symb is logical or operator
  else
  begin
    ungetc symb back to the input file
    print symb is bitwise or operator
  end
end {if}
if symb = '*' then
begin
  print symb is a multiplication operator
end {if}
if symb = '?' then
begin
  print symb is a conditional operator
end {if}
if symb = '!' or symb = '>' or symb = '<' then
begin
  advance to the next token in input file
  if symb = '=' then
    print symb is a relational operator
  else
  begin
    ungetc symb back to the input file
    print symb is an operator
  end
end {if}
if symb = '=' then
begin
  advance to next token in input file
  if symb = '=' then
    print symb is equal to operator
  else
  begin
    ungetc symb back to the input file
    print symb is assignment operator
  end
end {if}
if symb = '&' then
begin
  advance to next token in input file
  if symb = '&' then
    print symb is a logical and operator
  else
    print & symb is an address operator
end {if}
if symb = '/' then
begin
  advance to next token in input file
  if symb = '*' then
  begin
    advance to next token in input file
    while symb != '/' do
    begin
      advance to next token in input file
    end {while}
  end {if}
  else if symb = '/' then
  begin
    advance to next token in input file
    while symb != '\n' do
    begin
      advance to next token in input file
    end {while}
  end {if}
  else
  begin
    ungetc symb back to the input file
    print symb is a division operator
  end
end {if}
if symb is a digit then
begin
  advance to next token in input file
  while symb is a digit or symb = '.' do
  begin
    advance to next token in input file
  end {while}
  print symb is a number
end {if}
if symb = '"' then
begin
  advance to next token in input file
  while symb != '"' do
  begin
    advance to next token in input file
  end {while}
  print symb is a string
end {if}
if symb = '{' then
  print open brace
if symb = '}' then
  print close brace
if symb = '[' then
  print open bracket
if symb = ']' then
  print close bracket
if symb = '(' then
  print open parenthesis
if symb = ')' then
  print close parenthesis
end {procedure main}
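
The one-character lookahead that procedure main uses for operators, and its comment handling, can be sketched in C as follows. This is only an illustrative sketch and not the code of lex.c: the function names two_char_op, classify_operator and skip_block_comment are assumptions, and only '+', '-', '|' and '=' are covered. The comment skipper is a slightly stricter variation of the loop in the algorithm, since it waits for a '*' immediately followed by '/' instead of stopping at the first '/'.

```c
#include <stdio.h>

/* Read the character after an operator's first character. If it equals
 * `second`, consume it and return the two-character name; otherwise push
 * it back with ungetc() and return the one-character name. */
static const char *two_char_op(FILE *in, int second,
                               const char *double_name,
                               const char *single_name)
{
    int next = fgetc(in);
    if (next == second)
        return double_name;
    if (next != EOF)
        ungetc(next, in);   /* the character belongs to the next token */
    return single_name;
}

/* Classify an operator whose first character `symb` was already read.
 * Returns NULL if `symb` does not start an operator handled here. */
const char *classify_operator(FILE *in, int symb)
{
    switch (symb) {
    case '+': return two_char_op(in, '+', "++ operator", "+ operator");
    case '-': return two_char_op(in, '-', "-- operator", "- operator");
    case '|': return two_char_op(in, '|', "logical or operator",
                                 "bitwise or operator");
    case '=': return two_char_op(in, '=', "equal to operator",
                                 "assignment operator");
    default:  return NULL;
    }
}

/* Skip the rest of a block comment whose opening two characters were
 * already consumed. Returns 1 when the closing star-slash pair is found,
 * 0 if the input ends first. */
int skip_block_comment(FILE *in)
{
    int prev = 0;
    int c;
    while ((c = fgetc(in)) != EOF) {
        if (prev == '*' && c == '/')
            return 1;
        prev = c;
    }
    return 0;
}
```

A caller would read one character with fgetc, pass it to classify_operator, and print the returned description when it is not NULL.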
procedure verify
begin
  scan the symbol table to check if encountered token exists
  if exists
    return token value
end {procedure}
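
A minimal C sketch of the identifier scan and of procedure verify is given below. It is an illustration under stated assumptions, not the table used in lex.c: the token values 1, 5, 6, 7, 8 and 18 are the ones that appear in the sample output of this report, while the table layout, the helper name scan_word and the decision to return 18 for anything not found in the table are assumptions.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define TOKEN_IDENTIFIER 18   /* value shown for identifiers in the sample output */

struct keyword {
    const char *name;
    int value;
};

/* Only the entries visible in the sample output; the complete symbol
 * table is not given in this report. */
static const struct keyword symbol_table[] = {
    { "int", 1 }, { "printf", 5 }, { "scanf", 6 }, { "void", 7 }, { "if", 8 },
};

/* Copy the leading run of letters, digits and '_' from src into buf,
 * provided it starts with a letter or '_'. Words longer than the buffer
 * are truncated. Returns the number of characters copied. */
size_t scan_word(const char *src, char *buf, size_t bufsize)
{
    size_t len = 0;
    if (bufsize == 0)
        return 0;
    if (isalpha((unsigned char)src[0]) || src[0] == '_') {
        while ((isalnum((unsigned char)src[len]) || src[len] == '_')
               && len + 1 < bufsize) {
            buf[len] = src[len];
            len++;
        }
    }
    buf[len] = '\0';
    return len;
}

/* Linear scan of the symbol table: return the stored token value if the
 * lexeme is found, otherwise treat it as an identifier (assumption). */
int verify(const char *lexeme)
{
    size_t n = sizeof symbol_table / sizeof symbol_table[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(lexeme, symbol_table[i].name) == 0)
            return symbol_table[i].value;
    }
    return TOKEN_IDENTIFIER;
}
```

A linear scan is enough for a table of this size; a sorted table with binary search would be a natural alternative for a larger keyword set.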
USER MANUAL
The code for the modules appears in two files: lex.c
and output.c. The file lex.c contains the main source code of the lexical analyzer,
and the input to the lexical analyzer is contained in test.c. Under the DOS
operating system, the program is compiled using Alt+F9 and executed using
Ctrl+F9. The output, i.e. the token types, is stored in the output file output.txt.
Sample Input:
#include<stdio.h>
#include<stdlib.h>
#define abc 100
void main()
{
int a_,b=30;
printf("enter 2 no.s\n"); // printf statement
scanf("%d%d",&a,&b);
/* scanf
statement*/
if(a<20)
a=a+1;
}
Sample Output:
LINE NO    TOKENS
-----------------------------------------------
1:         #include<stdio.h> is a header file
2:         #include<stdlib.h> is a header file
3:         #define statement: abc is a constant
4:         void: token value : 7
           main :identifier, token value : 18
           (: open parenthesis
           ): close parenthesis
5:         {: open brace
6:         int: token value : 1
           a_ :identifier, token value : 18
           , : comma
           b :identifier, token value : 18
           =: assignment operator
           30 is a number
           ; : semi colon
7:         printf: token value : 5
           (: open parenthesis
           enter 2 no.s\n : is a string
           ): close parenthesis
           ;: semi colon
8:         scanf: token value : 6
           (: open parenthesis
           %d%d : is a string
           ,: comma
           &a: address operator
           , : comma
           &b: address operator
           ): close parenthesis
           ;: semi colon
9:
10:
11:        if: token value : 8
           (: open parenthesis
           a :identifier, token value : 18
           <: less than operator
           20 is a number
           ): close parenthesis
12:        a: token value : 18
           =: assignment operator
           a: token value : 18
           +: plus operator
           1 is a number
           ;: semi colon
13:        }: close parenthesis
CONCLUSION:
Generally, when syntactic analysis is being carried out, the parser calls
upon the scanner to tokenize the input as needed. The LEXICAL ANALYZER
designed by us, however, is an independent program. It takes as input a file
containing C source code and writes the token types to an output file.
Therefore, the parser cannot make use of the designed scanner as and when required.
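
For contrast, the arrangement in which a parser calls the scanner on demand could look roughly like the C sketch below. It is not part of the designed program: the names token_kind, struct token and next_token are hypothetical, and only identifiers, integer numbers and single-character tokens are distinguished.

```c
#include <ctype.h>
#include <stdio.h>

enum token_kind { TOK_EOF, TOK_IDENTIFIER, TOK_NUMBER, TOK_OTHER };

struct token {
    enum token_kind kind;
    char text[64];          /* longer lexemes are truncated */
};

/* Return the next token from `in`, so that a caller such as a parser can
 * pull tokens one at a time instead of reading a finished output file. */
struct token next_token(FILE *in)
{
    struct token tok = { TOK_EOF, "" };
    size_t len = 0;
    int c;

    do {                                    /* skip whitespace */
        c = fgetc(in);
    } while (c != EOF && isspace(c));
    if (c == EOF)
        return tok;

    if (isalpha(c) || c == '_') {
        tok.kind = TOK_IDENTIFIER;
        while (c != EOF && (isalnum(c) || c == '_')) {
            if (len + 1 < sizeof tok.text)
                tok.text[len++] = (char)c;
            c = fgetc(in);
        }
    } else if (isdigit(c)) {
        tok.kind = TOK_NUMBER;
        while (c != EOF && isdigit(c)) {
            if (len + 1 < sizeof tok.text)
                tok.text[len++] = (char)c;
            c = fgetc(in);
        }
    } else {
        tok.kind = TOK_OTHER;               /* single-character token */
        tok.text[len++] = (char)c;
        c = EOF;                            /* nothing to push back */
    }
    if (c != EOF)
        ungetc(c, in);                      /* first character of the next token */
    tok.text[len] = '\0';
    return tok;
}
```

Called repeatedly on the input ch[20], this sketch yields an identifier, '[', a number and ']', and the caller is then free to combine that sequence into an array reference.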
Consider as an example an array ch[20]. The designed lexical analyzer will
tokenize 'ch' as an identifier, '[' as an opening bracket, '20' as a number,
and ']' as a closing bracket. But the parser might require ch[20] to be
identified as an array. Similarly, there may arise a number of cases where the
parser has to identify a token in a manner different from the one specified
and designed. Hence, we conclude that the LEXICAL ANALYZER so designed is an
independent program which is not flexible. |