 "Perl Program Reference" 

Indexer Tools

Creates a comprehensive index from previously generated index_ files.


Detailed Description

Creates a comprehensive index from previously generated index_ files.


Function Documentation

add_to_index_struct  
 

Adds an element to the complicated hash table.

Parameters:
in_entry  the compacted entry into the hash
in_title  the display text to show for the item
in_url  the anchor to add to the URL array. If you send in $globe::word_c_boundary for the in_url, then it won't add the URL to the list.
Return values:
Always  returns 1.
Index structure:
$entry = compacted and clean display text for sorting.
$subentry = compacted and clean display text for sorting.
$idx_struct{$entry}{display} = display text for the $entry.
$idx_struct{$entry}{url}[] = array of URLs for the $entry.
$idx_struct{$entry}{sub}{$subentry}{display} = display text for the $entry's $subentry.
$idx_struct{$entry}{sub}{$subentry}{url}[] = array of URLs for the $entry's $subentry.
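
The top-level part of this layout can be sketched in Perl as follows. This is a minimal illustration and not the code from voyant_indexer.pl: the sub name sketch_add_entry, the sentinel value assigned to $word_c_boundary, and the sample entries and URLs are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $globe::word_c_boundary; the real value is
# not given in this reference.
my $word_c_boundary = '__WORD_C_BOUNDARY__';

my %idx_struct;

# Sketch of a top-level add: record the display text and, unless the
# sentinel is passed as the URL, append the URL to the entry's URL array.
sub sketch_add_entry {
    my ($in_entry, $in_title, $in_url) = @_;

    $idx_struct{$in_entry}{display} = $in_title;
    $idx_struct{$in_entry}{url} ||= [];

    if ($in_url ne $word_c_boundary) {
        push @{ $idx_struct{$in_entry}{url} }, $in_url;
    }
    return 1;    # mirrors the documented "always returns 1"
}

# Sample data (hypothetical entries and anchors).
sketch_add_entry('apigetmovielist', 'api_GetMovie-list', 'api.html#a1');
sketch_add_entry('movie', 'movie', $word_c_boundary);    # no URL recorded
```

Passing the sentinel still creates the entry and its display text, so it can act as a heading that only carries subentries.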

Definition at line 736 of file voyant_indexer.pl.

add_to_lev2_index_struct  
 

Adds an element to the second level of the complicated hash table.

Parameters:
in_entry  the compacted entry into the hash.
in_sub  the compacted subentry into the hash.
in_title  the display text to show for the item.
in_url  the anchor to add to the URL array.
Return values:
Always  returns 1.
Index structure:
$entry = compacted and clean display text for sorting.
$subentry = compacted and clean display text for sorting.
$idx_struct{$entry}{display} = display text for the $entry.
$idx_struct{$entry}{url}[] = array of URLs for the $entry.
$idx_struct{$entry}{sub}{$subentry}{display} = display text for the $entry's $subentry.
$idx_struct{$entry}{sub}{$subentry}{url}[] = array of URLs for the $entry's $subentry.
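
A second-level add, and a walk over the resulting structure, might look like the sketch below. The sub name sketch_add_lev2_entry and the sample entries and anchors are hypothetical; the actual routine in voyant_indexer.pl may differ, for example in how it treats duplicate URLs.

```perl
use strict;
use warnings;

my %idx_struct;

# Sketch of a second-level add: file the subentry under its parent entry.
sub sketch_add_lev2_entry {
    my ($in_entry, $in_sub, $in_title, $in_url) = @_;

    my $node = $idx_struct{$in_entry}{sub}{$in_sub} ||= {};
    $node->{display} = $in_title;
    push @{ $node->{url} }, $in_url;    # autovivifies the URL array
    return 1;
}

# Sample data (hypothetical entries and anchors).
sketch_add_lev2_entry('movie', 'apigetmovielist', 'api_GetMovie-list', 'api.html#a1');
sketch_add_lev2_entry('movie', 'apigetmovielist', 'api_GetMovie-list', 'api.html#a2');

# Walk the structure in sorted key order, as an index writer might.
for my $entry (sort keys %idx_struct) {
    for my $subentry (sort keys %{ $idx_struct{$entry}{sub} }) {
        my $node = $idx_struct{$entry}{sub}{$subentry};
        print "$entry > $node->{display}: @{ $node->{url} }\n";
    }
}
```

Keeping the compacted text as the hash key and the display text as a value lets the index sort on the key while showing the original title.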

Definition at line 792 of file voyant_indexer.pl.

ignore_item  
 

Compares the input term to a list of ignore terms.

Parameters:
term_to_test  Term to test.
Returns:
Returns 1 if the term matches an ignore term; otherwise returns 0.
It tests for fragments twice, so that "get" doesn't match on "together".
Limitations and Caveats:
If the term contains a Perl special character, it is sent back immediately and left alone.
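
The behavior described above can be approximated with the sketch below. It is not the routine from voyant_indexer.pl: the sub name sketch_ignore_item and the ignore list are hypothetical, a whole-term comparison stands in for the double fragment test, and returning 0 for a term with special characters is an assumption about what "sent back immediately" means.

```perl
use strict;
use warnings;

# Hypothetical ignore list; the real terms are not listed in this reference.
my @ignore_terms = qw(the a to get);

sub sketch_ignore_item {
    my ($term_to_test) = @_;

    # Assumption: a term containing regex metacharacters is returned
    # immediately as "not ignored".
    return 0 if $term_to_test =~ /[\\\^\$.|?*+()\[\]{}]/;

    for my $ignore (@ignore_terms) {
        # Whole-term, case-insensitive comparison, so "get" does not
        # match inside "together".
        return 1 if lc($term_to_test) eq lc($ignore);
    }
    return 0;
}

for my $term ('The', 'together', 'a+b') {
    print "$term => ", sketch_ignore_item($term), "\n";
}
```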

Definition at line 648 of file voyant_indexer.pl.

trash_special_characters  
 

Removes all special characters that are not wanted in the index as word chunks.

Parameters:
in_word  The word that might have special characters.
Returns:
The word without special characters.
Limitations and Caveats:
Debug statements are left in.

Definition at line 1048 of file voyant_indexer.pl.

word_chunking  
 

Performs word-chunking on the passed-in entries that were extracted from $globe::master_raw.

Parameters:
unproc_title  the unprocessed title passed in $entry_chunk[0].
assoc_t_data  the associated title and data given by $entry_chunk[1].
Returns:
Updated entries in the hash $globe::master_index. If a word-chunk already exists as a key in the hash, its information is appended to the contents of that key using $globe::division_mult_entry.
Word-chunking is performed on the $unproc_title. Natural boundaries (spaces, dashes, underscores, changes in case in the middle of a word) are used to create additional two-level index entries that contain the word-chunk followed by where it came from.

The $globe::ignore_terms_file is used to eliminate word-chunked entries that are not useful (such as "the", "a", and "to").

The additional useful entries are appended to the contents of the hash $globe::master_raw using the $globe::division_mult_entry separator only if the new entry is not a duplicate.

Word-chunking is particularly useful for API documentation so that the reader does not have to remember the exact name of a code item in order to find it. An initial index token of "api_GetMovie-list" could be found not just under its name in the "A's", but under "get", "movie", and "list".
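
The splitting on natural boundaries can be sketched as shown below. The sub name sketch_word_chunks is hypothetical, and the sketch covers only the splitting step; filtering against the ignore terms and appending to the hash are left out.

```perl
use strict;
use warnings;

sub sketch_word_chunks {
    my ($unproc_title) = @_;

    # Insert a space at each lower-to-upper case change, e.g. "GetMovie".
    (my $spaced = $unproc_title) =~ s/(?<=[a-z])(?=[A-Z])/ /g;

    # Split on runs of spaces, dashes, and underscores; drop empty chunks.
    my @chunks = grep { length $_ } split /[\s_-]+/, $spaced;

    # Lowercase and de-duplicate while preserving first-seen order.
    my %seen;
    return grep { !$seen{$_}++ } map { lc $_ } @chunks;
}

my @chunks = sketch_word_chunks('api_GetMovie-list');
print join(', ', @chunks), "\n";    # prints "api, get, movie, list"
```

Each chunk would then become a first-level entry with the original token as its subentry.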

Index structure:
$entry = compacted and clean display text for sorting.
$subentry = compacted and clean display text for sorting.
$idx_struct{$entry}{display} = display text for the $entry.
$idx_struct{$entry}{url}[] = array of URLs for the $entry.
$idx_struct{$entry}{sub}{$subentry}{display} = display text for the $entry's $subentry.
$idx_struct{$entry}{sub}{$subentry}{url}[] = array of URLs for the $entry's $subentry.

Limitations and Caveats:
None.

Definition at line 868 of file voyant_indexer.pl.





Open-Source tools compliments of Voyant Technologies, Inc. and Glenn C. Maxey.
01/13/2003

TP Tools v2-00-0a

# tpt-perl-hcr-02