Both the indexer and the extractor need to parse the source code. The indexer needs to know which functions/subroutines are defined in each source file so that it can write the Routines index. The extractor needs to know which pieces of text are references to another file so that it can generate hyperlinks, and where function/subroutine definitions occur so that references can point to the right place in source files (via <a name='...'> tags).
Both programs therefore use the same parsing routines. A detailed description of how these interface with the indexer and extractor programs is given in the module Scb.pm, but the basic idea is that these routines take the raw source code as input and output the same text with added HTML-like tags indicating where functions are defined and where they are called.
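The idea of a tagging routine can be illustrated with a minimal sketch. It is not the SCB implementation: the real taggers are external lex/yacc programs, and the tag names `<def>` and `<call>`, the function name `tag_c_source`, and the brace-depth heuristic used here are all assumptions made purely for illustration. The sketch treats an identifier followed by `(` as a definition when it appears at file scope and as a call when it appears inside braces, and it leaves all other text unchanged.

```python
import re

# Keywords that can be followed by "(" but are not routine names.
C_KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}

# An identifier immediately followed (after optional spaces) by "(".
NAME_PAREN_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def _tag_match(match, depth):
    """Wrap one matched routine name in a hypothetical <def> or <call> tag."""
    name = match.group(1)
    if name in C_KEYWORDS:
        return match.group(0)          # leave keywords untouched
    kind = "def" if depth == 0 else "call"
    rest = match.group(0)[len(name):]  # the spaces and "(" after the name
    return f"<{kind} name='{name}'>{name}</{kind}>{rest}"


def tag_c_source(source):
    """Return the source text with added tags; all other text is unchanged.

    Crude assumption: brace depth is tracked per line, so a name followed
    by "(" at depth 0 is taken to be a definition (prototypes are therefore
    mis-tagged), and comments, strings and macros are not handled.
    """
    out = []
    depth = 0  # brace nesting depth at the start of the current line
    for line in source.splitlines(keepends=True):
        out.append(NAME_PAREN_RE.sub(lambda m: _tag_match(m, depth), line))
        depth += line.count("{") - line.count("}")
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "int helper(int x)\n"
        "{\n"
        "    return x + 1;\n"
        "}\n"
        "\n"
        "int main(void)\n"
        "{\n"
        "    return helper(2);\n"
        "}\n"
    )
    print(tag_c_source(sample))
```

Removing the added tags from the output gives back the input text exactly, which is the property described above: the output is the same text with tags added. The sketch does not escape `<`, `>` or `&` characters that occur in the source itself, which any tagger producing HTML-like output would also have to consider.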
One routine is supplied for tagging C source and one for Fortran. These are currently implemented as external programs written using lex and yacc, with rather simplified views of the grammars of the languages. A previous version of the package contained tagging routines written in Perl, together with supporting routines to assist in writing new taggers in Perl; this is no longer distributed but can be made available on request.
The SCB package has been designed so that replacing one of the language-specific tagging modules (for instance with a more efficient or more accurate one), or adding a new one (for instance for a different language), is fairly easy. Aspects of this procedure are documented in the module Scb.pm.
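One common way to keep language-specific taggers replaceable is a lookup table from file type to tagging routine. The sketch below shows that general pattern only; it is not the mechanism documented in Scb.pm, and the names `TAGGERS`, `register_tagger` and `tag_file_text`, the file extensions, and the stand-in taggers are all hypothetical.

```python
import os

# Hypothetical registry: file extension -> tagging routine.
# A tagging routine takes raw source text and returns tagged text.
TAGGERS = {}


def register_tagger(extensions, tagger):
    """Associate a tagging routine with one or more file extensions."""
    for ext in extensions:
        TAGGERS[ext.lower()] = tagger


def tag_file_text(filename, source):
    """Tag source text with the routine registered for the file's extension.

    Files of an unknown type are returned untagged.
    """
    ext = os.path.splitext(filename)[1].lower()
    tagger = TAGGERS.get(ext)
    if tagger is None:
        return source
    return tagger(source)


# Stand-in taggers for illustration; real ones would add definition/call tags.
def c_tagger(source):
    return "<!-- tagged as C -->\n" + source


def fortran_tagger(source):
    return "<!-- tagged as Fortran -->\n" + source


register_tagger([".c", ".h"], c_tagger)
register_tagger([".f", ".gen"], fortran_tagger)

if __name__ == "__main__":
    print(tag_file_text("example.c", "int main(void) { return 0; }\n"))
```

With this arrangement, replacing a tagger means registering a different routine for the same extensions, and adding a language means one more `register_tagger` call; the code that calls `tag_file_text` does not change.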
The existing tagging routines are good enough to generate accurate indexing information and hypertext for a large majority of the source files in the Starlink software collection, but they are not perfectly accurate, and could no doubt be improved upon. However, tagging cannot be done perfectly without walking the include files, which would make the browser far too slow.
SCB --- Source Code Browser