 "TPT User's Guide" 

Chapter 8 voyant_nav.pl

The voyant_nav.pl perl program is the central tool in the TechPubTools suite, on which all of the other shell scripts and perl programs rely.

Overview

The voyant_nav.pl perl program does most of the work in creating a comprehensive HTML system that spans the mini-HTML systems generated by Doxygen and Mif2Go (from FrameMaker source).

Specifically, this tool:

• reads a master file that specifies information for the head, the top, and the bottom of each topic.

• swaps out information from a master for each topic and removes certain HTML tags specified by the master (for better CSS control).

• creates several hash tables that ultimately determine topic order, topic level in the tree (table of contents, TOC), etc.

• generates a mini-Table of Contents file (tree.html and tree.script) for the directory. This file is then re-used and combined with others later to create the comprehensive system.

• processes index tokens coming from the output of Mif2Go.

• creates index tokens from the topic hash table.

• parses index tokens generated from Mif2Go or already existing in the file.

• writes index entries to _index_file. This file is then re-used and combined with others later to create the comprehensive system.

• figures out the previous-next topic browsing.

The Beginnings

The voyant_nav.pl perl program started out as a tool just to swap out the header, navigation, and copyright areas. These were items that needed to remain consistent for all generated HTML files in a subproject.

The goal was to have a tool that could post-process HTML files and, say, update the copyright area at the bottom of each file with newer information without having to go back to the (unchanged) source and re-export.

Likewise, the header and navigation areas were items that could change frequently, based on trial-and-error with the look-and-feel and the whims of my audience. Again, there was no sense in re-running extraction programs against the source if the general content remained unchanged.

To keep the voyant_nav.pl program manageable, it focuses only on the HTML files in one directory. A shell script (such as 50_nav_update.b and specifically 55_nav_gen.b) can conveniently call this program with the specific input files for the project.

The voyant_nav.pl program reads in information from external master files in order to avoid hard-coding data that frequently changes or could potentially change, such as the tags to look for in the HTML files.

The Extensions

The index and table of contents are two special areas which theoretically could be created with their own tools. Such tools would require opening and scanning each file in the input directory in order to locate pieces of information of interest.

Because the voyant_nav.pl program already opens each file, reads it into memory, and scans it for information of interest, it was enhanced to tackle the mini-index and mini-table of contents just for the files in its directory.

CYA

The voyant_nav.pl program was built up over time. Programming was done iteratively and piecemeal. Debugging statements were added liberally and, at first, deleted once the section of code seemed to work.

As the complexity of the program increased, much of the same debugging code had to be re-entered to help trace the operation and verify the results at various stages. Over time, I stopped deleting debug sections and instead conditioned them out in order to keep them available when needed later.

An early CYA effort was to always write the processed output to a _temp file instead of writing directly over the original HTML file. This way the input could be compared to the output. Then, in one line (that could be turned on or off), the program overwrites the original input HTML file with the contents of the _temp file.
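The write-to-_temp-then-overwrite safeguard can be sketched as follows. This is a minimal illustration in Python, not the Perl of voyant_nav.pl. The function name process_file, the OVERWRITE_ORIGINAL switch, the transform callback, and the exact placement of the _temp suffix are all assumptions made for the example.

```python
import os
import shutil
import tempfile

# Hypothetical switch standing in for the "one line that could be turned on or off".
OVERWRITE_ORIGINAL = True


def process_file(html_path, transform, overwrite=OVERWRITE_ORIGINAL):
    """Write the transformed HTML to '<name>_temp', then optionally replace the original."""
    temp_path = html_path + "_temp"  # assumed naming; the real suffix rule may differ
    with open(html_path, "r", encoding="utf-8") as src:
        original = src.read()
    with open(temp_path, "w", encoding="utf-8") as dst:
        dst.write(transform(original))
    if overwrite:
        # The original is only touched here, after the _temp file is complete.
        shutil.copyfile(temp_path, html_path)
    return temp_path


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    page = os.path.join(workdir, "01001Intro.html")
    with open(page, "w", encoding="utf-8") as f:
        f.write("<p>Copyright 2002</p>")

    def update(text):
        return text.replace("2002", "2003")

    temp = process_file(page, update, overwrite=False)
    with open(page, encoding="utf-8") as f:
        kept = f.read()       # original is untouched
    with open(temp, encoding="utf-8") as f:
        preview = f.read()    # processed copy, available for comparison

    process_file(page, update, overwrite=True)
    with open(page, encoding="utf-8") as f:
        final = f.read()      # original now carries the processed content
```

With the switch off, the input and the _temp output sit side by side for comparison; with it on, the original is replaced only after the _temp file has been written completely.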

Data Structures

The voyant_nav.pl program initially had simple variables.

When other tools were needed, such as voyant_mt_app.pl to create the master table of contents or voyant_indexer.pl to create the master index, I discovered that many of the same variables and routines were required.

I placed global variables and frequently used routines in the globe.pm file, a Perl package.

More complex data structures were created later as the program grew, because they simplify the program execution, make it more reliable, and allow for code re-use.

Topic Browsing

Another extension of the voyant_nav.pl program was previous-next topic browsing: either extending existing browsing so that it crosses chapter boundaries, or simply implementing it where none existed.

• A limitation of our off-the-shelf extraction tool from FrameMaker was that it could only offer browsing within a chapter.

• A limitation of our off-the-shelf source code extraction tool was that it had no browsing.

To allow browsing across chapter boundaries in the FrameMaker output, I imposed a naming convention on the generated HTML files. The naming convention is:

• a two-digit number for the chapter [required], followed by

• a three-digit number for the topic in the chapter, followed by

• plain text regarding the topic title [required to be reader-friendly].

In this manner, all files for a given chapter are grouped together by the prefix; all topics are placed in order when sorted alpha-numerically; and all files have meaningful names.
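As an illustration of the convention, the sketch below parses and sorts a few file names. It is written in Python rather than Perl, and the regular expression, the function name parse_name, and the sample file names are assumptions; only the two-digit/three-digit/title layout comes from the description above.

```python
import re

# Assumed pattern: 2-digit chapter, 3-digit topic, free-text title, ".html".
NAME_RE = re.compile(r"^(\d{2})(\d{3})(.+)\.html$")


def parse_name(filename):
    """Return (chapter, topic, title), or None if the name does not follow the convention."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    chapter, topic, title = match.groups()
    return int(chapter), int(topic), title


# Hypothetical directory listing; "index.html" does not follow the convention.
names = ["02001Install.html", "01002Overview.html", "01001Intro.html", "index.html"]

# A plain alphanumeric sort puts chapters, then topics, in order.
ordered = sorted(n for n in names if parse_name(n))

# Group the sorted files by chapter prefix.
chapters = {}
for n in ordered:
    chapters.setdefault(parse_name(n)[0], []).append(n)
```

Because the numbers are zero-padded to a fixed width, no numeric comparison is needed: the ordinary string sort already yields chapter and topic order.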

The grouping of files by chapter (the first two digits) is the most important aspect of the naming convention, because it allows the brain-dead voyant_nav.pl program to know the order of the chapters from the name.

Specifically, each HTML file extracted from a FrameMaker file knows whether or not it is the first topic in the chapter, the last topic in the chapter, or a topic in the middle of the chapter. Information about the first and last topics is stored in a hash. Once all of the files in the directory have been processed, all first and last topic files are re-visited so that their previous and next links can be updated with information that is known about the last topic of the previous chapter or the first topic of the next chapter.
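The two-pass approach described above might look like the following sketch. It is Python, not the program's Perl, and the function name link_topics, the shape of the input dictionary, and the sample file names are assumptions. It also assumes every chapter has at least one topic file.

```python
def link_topics(files_by_chapter):
    """Build {file: (previous, next)} with links that cross chapter boundaries.

    files_by_chapter maps a chapter number to its topic files, already in topic order.
    """
    links = {}
    first_last = {}  # chapter -> (first file, last file), like the hash described above

    # Pass 1: link topics inside each chapter and remember each chapter's edges.
    for chapter, files in files_by_chapter.items():
        first_last[chapter] = (files[0], files[-1])
        for i, name in enumerate(files):
            prev_name = files[i - 1] if i > 0 else None
            next_name = files[i + 1] if i + 1 < len(files) else None
            links[name] = (prev_name, next_name)

    # Pass 2: revisit only the first and last topics and stitch neighboring chapters.
    order = sorted(first_last)
    for earlier, later in zip(order, order[1:]):
        last_of_earlier = first_last[earlier][1]
        first_of_later = first_last[later][0]
        links[last_of_earlier] = (links[last_of_earlier][0], first_of_later)
        links[first_of_later] = (last_of_earlier, links[first_of_later][1])
    return links


links = link_topics({
    1: ["01001Intro.html", "01002Overview.html"],
    2: ["02001Install.html"],
})
```

Here the last topic of chapter 1 gains a next link to the first topic of chapter 2, and vice versa; the very first and very last topics keep None on their outer side.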

In the case of code files (which have no topic order tags), their names determine the order. Once all of the files in a code directory have been processed normally, they are revisited so that their previous and next links can be updated according to that order.

Index Tokens

When pulling together multiple manuals or mini-HTML documentation systems, the most effective and easily understood way of implementing cross-references between manuals or systems is to have a good index.

Two extensions of the voyant_nav.pl program were to have it generate index entries for the output file based on content (e.g., <H1> tags) if none existed, and to have it handle index tokens that might be inserted by other programs, such as Doxygen or Mif2Go.
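The first of these extensions, falling back to heading text when a file has no index entries, could be sketched like this. The sketch is in Python rather than Perl; the function name fallback_index_entries, the (text, url) pair representation, and the sample file name are assumptions, not the program's actual data structures.

```python
import re

H1_RE = re.compile(r"<h1[^>]*>(.*?)</h1>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def fallback_index_entries(html, filename, existing_entries):
    """Return existing entries, or (text, url) pairs derived from <H1> headings if there are none."""
    if existing_entries:
        return list(existing_entries)
    entries = []
    for heading in H1_RE.findall(html):
        # Strip any inner tags and collapse whitespace to get plain entry text.
        text = " ".join(TAG_RE.sub("", heading).split())
        if text:
            entries.append((text, filename))
    return entries


page = "<html><body><H1>Topic <em>Browsing</em></H1><p>...</p></body></html>"
generated = fallback_index_entries(page, "08005TopicBrowsing.html", [])
```

Regular expressions are adequate here only because the input is machine-generated HTML with a predictable shape; arbitrary HTML would call for a real parser.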

Doxygen Index Tokens

In the case of the Doxygen extraction from source code, the generated HTML files had many anchor tags of the format:

<a name="a0" doxytag="dox_comment_chg.pl::comment_count"></a>

These anchors were used as mid-topic jumps, sometimes from within the same (often very long) HTML file and sometimes from other HTML files. The doxytag attribute was easy to spot. Moreover, the value of the doxytag attribute was useful information for the index.

Hence the voyant_nav.pl program was enhanced to locate these tags and extract relevant information for the entries into the index file.
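A minimal sketch of that extraction, in Python rather than the program's Perl, is shown below. The function name doxygen_index_entries, the sample output file name, and the choice to display only the member name after "::" are assumptions. The pattern also assumes the attribute order shown in the example anchor.

```python
import re

# Matches anchors such as <a name="a0" doxytag="file::member"></a>.
DOXYTAG_RE = re.compile(r'<a\s+name="([^"]+)"\s+doxytag="([^"]+)"\s*>', re.IGNORECASE)


def doxygen_index_entries(html, filename):
    """Return (index text, url) pairs for every doxytag anchor in one HTML file."""
    entries = []
    for anchor, doxytag in DOXYTAG_RE.findall(html):
        member = doxytag.split("::")[-1]  # assumed display text: the member name
        entries.append((member, f"{filename}#{anchor}"))
    return entries


sample = '<a name="a0" doxytag="dox_comment_chg.pl::comment_count"></a>'
doxy_entries = doxygen_index_entries(sample, "dox_comment_chg_8pl.html")
```

Each entry pairs the text to show in the index with a URL made of the file name and the anchor's name attribute, so the index link lands on the mid-topic jump target.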

Mif2Go Index Tokens

As was previously mentioned, my faith in Mif2Go as the export tool was well founded, particularly when it came to OmniSys’s responsiveness to my request for better index support.

My proposal to OmniSys used the Doxygen doxytag anchors as a model. Once implemented and supported in my mif2htm.ini files, I could create anchor tags of the format:

<a name="<$$objectid>" class="v_index" value="index entry"></a>

The <$$objectid> value actually comes from FrameMaker; Mif2Go uses it, and it uniquely identifies the paragraph format. The index entry is text that was extracted from a writer-defined index token in FrameMaker.

The voyant_nav.pl perl program can easily locate anchor tags. When they are determined to be of class="v_index", it then knows how to handle the name attribute as part of a fully qualified URL to the HTML filename and target within the file (00000FileOwner.html#$$objectid), as well as the text to display to the reader ("index entry").

Hence the voyant_nav.pl program was enhanced to locate these new v_index tags and extract relevant information for the entries into the index file.

Note: For this to work properly, index tokens should have only one entry per token and should have no more than two levels.
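The handling of v_index anchors can be sketched as follows, again in Python rather than Perl. The function name v_index_entries and the sample object ID are assumptions. The sketch also assumes that the two index levels arrive in the value attribute separated by a colon, as in FrameMaker index markers; the actual separator in the generated HTML may differ.

```python
import re

# Matches anchors such as <a name="12345" class="v_index" value="entry"></a>.
V_INDEX_RE = re.compile(
    r'<a\s+name="([^"]+)"\s+class="v_index"\s+value="([^"]*)"\s*>', re.IGNORECASE
)


def v_index_entries(html, filename):
    """Return (levels, url) pairs; levels is a list of one or two index levels."""
    entries = []
    for object_id, value in V_INDEX_RE.findall(html):
        # Assumption: levels are separated by ":" as in FrameMaker index markers.
        levels = [part.strip() for part in value.split(":") if part.strip()]
        if not 1 <= len(levels) <= 2:
            continue  # skip tokens outside the two-level limit noted above
        entries.append((levels, f"{filename}#{object_id}"))
    return entries


sample = '<a name="123456" class="v_index" value="index:tokens"></a>'
v_entries = v_index_entries(sample, "00000FileOwner.html")
```

The URL is built from the owning file name plus the anchor's name attribute, matching the 00000FileOwner.html#$$objectid form described above.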




Open-Source tools compliments of Voyant Technologies, Inc. and Glenn C. Maxey.
01/13/2003

TP Tools v2-00-0a

# tpt-hug-02