University of Calgary

problem: cross-reference map for code


"Cross-Reference Map"

You are to write an efficient C program that will take as its input a file containing a C, C++, or Java program and produce as output a cross-reference map of all the identifiers encountered in the program, together with the line numbers on which they appear.

Your program should be run like a command, and take two (command-line) arguments:

1. the name of a C, C++, or Java source program (<name>.c, <name>.cc, <name>.cpp, or <name>.java)
2. an output name for the program listing (if not given, the source name with the suffix ".out" is assumed)

For example, the following program as input:
------------------------------------------------------------------------------
#include <iostream.h>
int main()
{
int i;
cout << "Hello World" << endl;
i = 42;
cout << i << endl;
return 0;
}
should make your program produce:
identifier used on:
cout 5, 7
endl 5, 7
i 4, 6, 7
main 2
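
One possible way to hold the map shown above is a fixed-size table, sorted just before printing. The following is only a sketch under stated assumptions: the names XrefEntry, XrefTable, xref_add, xref_sort and xref_print are hypothetical, and the limits MAX_NAME and MAX_REFS are invented for illustration (only the 250-identifier limit comes from this handout).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_IDENTS 250  /* limit stated in the assignment */
#define MAX_NAME   64   /* assumed maximum identifier length (not in the handout) */
#define MAX_REFS   128  /* assumed maximum line numbers kept per identifier */

typedef struct {
    char name[MAX_NAME];
    int  lines[MAX_REFS];
    int  count;
} XrefEntry;

typedef struct {
    XrefEntry entries[MAX_IDENTS];
    int       count;
} XrefTable;

/* Record that `name` appears on `line`.
   Returns 0 on success, -1 if a limit is exceeded (the caller can then give up gracefully). */
int xref_add(XrefTable *t, const char *name, int line)
{
    XrefEntry *e = NULL;
    int i;

    if (strlen(name) >= MAX_NAME) return -1;
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) { e = &t->entries[i]; break; }
    }
    if (e == NULL) {
        if (t->count == MAX_IDENTS) return -1;
        e = &t->entries[t->count++];
        strcpy(e->name, name);
        e->count = 0;
    }
    /* An identifier used twice on one line is listed once for that line. */
    if (e->count > 0 && e->lines[e->count - 1] == line) return 0;
    if (e->count == MAX_REFS) return -1;
    e->lines[e->count++] = line;
    return 0;
}

static int cmp_entry(const void *a, const void *b)
{
    return strcmp(((const XrefEntry *)a)->name, ((const XrefEntry *)b)->name);
}

/* Sort the entries by identifier name. */
void xref_sort(XrefTable *t)
{
    qsort(t->entries, (size_t)t->count, sizeof t->entries[0], cmp_entry);
}

/* Print the map in the format of the sample output; call xref_sort first. */
void xref_print(const XrefTable *t, FILE *out)
{
    int i, j;

    fprintf(out, "identifier used on:\n");
    for (i = 0; i < t->count; i++) {
        fprintf(out, "%s ", t->entries[i].name);
        for (j = 0; j < t->entries[i].count; j++)
            fprintf(out, "%s%d", j ? ", " : "", t->entries[i].lines[j]);
        fputc('\n', out);
    }
}
```

The linear search in xref_add is probably acceptable with at most 250 identifiers; a binary search over a table kept in sorted order, or a hash table, would be alternatives if lookup speed matters.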
 

Assumptions:
- you may assume the original source compiled without errors or warnings
 
Specifications:
- your program must be written in 'C' and you must use a proper 'C' compiler {'cc' on UNIX; turn on the standard 'C' switch on a PC C++ compiler}
Note: if you are implementing this as a Java solution, then your solution must be in plain Java. That means no fancy I/O or utility routines. You may use string operations and character I/O. Your solution must still be EFFICIENT.
- your program may not terminate abnormally. (It can give up, claim to have found no identifiers, etc., but it cannot bomb.)
- any messages produced by your program should be written to the monitor.
- you do not need to design this as an OO program.
- the list must be sorted before being printed
- you may assume that the first occurrence of an identifier will be the declaration
- you will be given a file of standard C++ keywords
- assume a maximum of 250 identifiers
 
- if no arguments are provided, the utility will print a brief message outlining how it is used. (For a UNIX example, try typing cp with no arguments)
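
The command-line handling described above (two arguments, a default ".out" name, and a usage message) might be sketched as follows. The function names make_output_name and print_usage are hypothetical. The sketch also assumes that the default name is formed by replacing the source file's extension with ".out"; the handout could equally be read as appending the suffix.

```c
#include <stdio.h>
#include <string.h>

/* Build the output file name in buf.
   If out_arg is non-NULL it is used unchanged; otherwise the extension of
   in_name is replaced by ".out" (an assumed reading of the handout).
   Returns 0 on success, -1 if the name does not fit in buf. */
int make_output_name(const char *in_name, const char *out_arg,
                     char *buf, size_t buflen)
{
    const char *dot, *slash;
    size_t stem;

    if (out_arg != NULL) {
        if (strlen(out_arg) >= buflen) return -1;
        strcpy(buf, out_arg);
        return 0;
    }
    dot = strrchr(in_name, '.');
    slash = strrchr(in_name, '/');
    /* A dot inside a directory name (before the last '/') is not an extension. */
    if (dot == NULL || (slash != NULL && dot < slash))
        stem = strlen(in_name);
    else
        stem = (size_t)(dot - in_name);
    if (stem + strlen(".out") >= buflen) return -1;
    memcpy(buf, in_name, stem);
    strcpy(buf + stem, ".out");
    return 0;
}

/* Brief usage message for the case where no arguments are given. */
void print_usage(const char *prog)
{
    fprintf(stderr, "usage: %s source-file [output-file]\n", prog);
}
```

A main function would typically call print_usage(argv[0]) when argc is less than 2, and otherwise pass argv[1] and either argv[2] or NULL to make_output_name.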

Testing:
1. your own source
     
Test your program using at least the following files: [tba - see instructor]

Requirements & Grading:

Your program does not need to parse C, C++, or Java. You may assume the program compiles.
'C' Version:
- lists all identifiers and the line numbers on which they appear.
- converts characters correctly
- echoes characters ('raw' and encoded)
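
For the first 'C' Version item, identifiers have to be picked out of the text and keywords dropped. A minimal sketch, assuming that preprocessor lines, strings and comments have already been filtered out before the text reaches it: the names is_keyword and next_identifier are hypothetical, and the small keyword array is only a stand-in for the keyword file you will be given.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the supplied keyword file; must stay sorted for bsearch. */
static const char *const keywords[] = {
    "char", "else", "for", "if", "int", "return", "void", "while"
};

static int cmp_kw(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns 1 if word is in the keyword list, 0 otherwise. */
int is_keyword(const char *word)
{
    return bsearch(word, keywords, sizeof keywords / sizeof keywords[0],
                   sizeof keywords[0], cmp_kw) != NULL;
}

/* Copy the next non-keyword identifier at or after *pos into buf
   (buflen must be at least 1; longer names are truncated).
   Returns 1 and advances *pos past the identifier, or 0 at end of text. */
int next_identifier(const char *text, size_t *pos, char *buf, size_t buflen)
{
    size_t i = *pos, n;

    while (text[i] != '\0') {
        unsigned char c = (unsigned char)text[i];
        if (isdigit(c)) {
            /* Numeric literal such as 42 or 0x1F: skip it whole,
               so that "x1F" is not reported as an identifier. */
            while (isalnum((unsigned char)text[i]) || text[i] == '.') i++;
        } else if (isalpha(c) || c == '_') {
            n = 0;
            while (isalnum((unsigned char)text[i]) || text[i] == '_') {
                if (n + 1 < buflen) buf[n++] = text[i];
                i++;
            }
            buf[n] = '\0';
            if (!is_keyword(buf)) { *pos = i; return 1; }
        } else {
            i++;
        }
    }
    *pos = i;
    return 0;
}
```

Calling next_identifier repeatedly on one source line, with pos starting at 0, yields each identifier on that line in order; the caller supplies the current line number when recording it.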
'B' Version:
- produces a listing of the original source with line numbers attached, printed before the cross-reference map:
1: #include <iostream.h>
2: int main()
3: {
4:    int i;
5:    cout << "Hello World" << endl;
6:    i = 42;
7:    cout << i << endl;
8:    return 0;
9: }
- marks the locations of strings and comments in your listing, something like:
1 : #include <iostream.h>
2 : int main()
3 : {
4 :    int i;
5*:    cout << "Hello World" << endl;
6 :    i = 42;
7 :    cout << i << endl;
8 :    return 0;
9 : }
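
The '*' marker in the listing above could be driven by a small state machine that remembers, from one line to the next, whether the scanner is inside a string or comment. This is a sketch only: LexState, scan_line and print_listing_line are hypothetical names, and it assumes that character literals and // comments end at the end of the line (backslash line continuations are ignored).

```c
#include <stdio.h>

typedef enum {
    IN_CODE, IN_STRING, IN_CHAR, IN_LINE_COMMENT, IN_BLOCK_COMMENT
} LexState;

/* Scan one line (without its trailing newline), carrying *state across calls.
   Returns 1 if any part of the line lies inside a string or comment. */
int scan_line(const char *s, LexState *state)
{
    int marked = (*state == IN_BLOCK_COMMENT || *state == IN_STRING);
    size_t i;

    for (i = 0; s[i] != '\0'; i++) {
        switch (*state) {
        case IN_CODE:
            if (s[i] == '"') { *state = IN_STRING; marked = 1; }
            else if (s[i] == '\'') *state = IN_CHAR;
            else if (s[i] == '/' && s[i + 1] == '/') { *state = IN_LINE_COMMENT; marked = 1; i++; }
            else if (s[i] == '/' && s[i + 1] == '*') { *state = IN_BLOCK_COMMENT; marked = 1; i++; }
            break;
        case IN_STRING:
            if (s[i] == '\\' && s[i + 1] != '\0') i++;   /* skip escaped character */
            else if (s[i] == '"') *state = IN_CODE;
            break;
        case IN_CHAR:
            if (s[i] == '\\' && s[i + 1] != '\0') i++;   /* skip escaped character */
            else if (s[i] == '\'') *state = IN_CODE;
            break;
        case IN_BLOCK_COMMENT:
            if (s[i] == '*' && s[i + 1] == '/') { *state = IN_CODE; i++; }
            break;
        case IN_LINE_COMMENT:
            break;                                       /* runs to end of line */
        }
    }
    /* Assumed: line comments and character literals never span lines. */
    if (*state == IN_LINE_COMMENT || *state == IN_CHAR) *state = IN_CODE;
    return marked;
}

/* Print one listing line in the "5*:" / "6 :" style shown above. */
void print_listing_line(FILE *out, int lineno, const char *line, int marked)
{
    fprintf(out, "%d%c: %s\n", lineno, marked ? '*' : ' ', line);
}
```

Because the string and block-comment states survive across lines, the same state variable can be inspected at end of file: if it is still IN_STRING or IN_BLOCK_COMMENT, the input was unbalanced and a warning can be printed instead of failing.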
'A' Version:
- tolerates non-compiling programs (i.e. it shouldn't blow up if comment markers or quotes are unbalanced). Mismatched quotes would produce output like:
1 : #include <iostream.h>
2 : int main()
3 : {
4 :    int i;
5*:    cout << "Hello World << endl;
  i = 42;
   cout << " i " << endl;
6 :    return 0;
}
Warning: end-of-file reached while in string.

BONUSES:
1. identify in which function (or at global scope) the identifier was first found
2. identify the type of each identifier and where it was declared (this includes distinguishing between different versions of "int i;" found in different functions)
3. mark all function calls
4. identify and mark off (in the margin of the listing) different scope levels
1   : #include <iostream.h>
2   : int main()
3 1>: {
4   :    int i;
5 * :    cout << "Hello World" << endl;
6   :    i = 42;
7   :    cout << i << endl;
8   :    return 0;
9 <1: }
5. If you are still looking for more to do, you need to get a hobby.


Updated: August 5, 2005 12:40 AM