.\"------------------------------------------------------------
.\" Id - set Rv, revision, and Dt, Date using rcs-Id tag.
.de Id
.ds Rv \\$3
.ds Dt \\$4
..
.Id $Id$
.\"------------------------------------------------------------
.TH mg_stem_idx 1 \*(Dt CITRI
.SH NAME
mg_stem_idx \- builds a stem index file
.SH SYNOPSIS
.B mg_stem_idx
[
.B \-h
]
[
.BI \-b " entries-per-block"
]
.if n .ti +12n
[
.BI \-a " stemmer"
]
[
.BI \-d " directory"
]
.if n .ti +12n
.B \-s 1|2|3
.BI \-f " name"
.SH DESCRIPTION
.B mg_stem_idx
generates a stem index file for a collection.
This program should be run three times, once for each value of the
.B \-s
parameter.
It uses the stemmed dictionary to create the stem index, which
contains pointers into the stemmed dictionary.
.SH OPTIONS
Options may appear in any order.
.TP "\w'\fB\-d\fP \fIdirectoryyyyyyyy\fP'u+2n"
.B \-h
This displays a usage line on
.IR stderr .
.TP
.BI \-b " entries-per-block"
The dictionary is stored in blocks on disk; this option sets the
number of entries per block.
The default is 16.
.TP
.BI \-a " stemmer"
The name of the stemmer to use.
The default is the Lovins stemmer.
.TP
.B \-s 1|2|3
The stem method to apply for the stem index.
.br
1 = casefolded and non-stemmed
.br
2 = non-casefolded and stemmed
.br
3 = casefolded and stemmed
.TP
.BI \-d " directory"
This specifies the directory where the document collection can be
found.
.TP
.BI \-f " name"
This specifies the base name of the document collection.
.SH ENVIRONMENT
.TP "\w'\fBMGDATA\fP'u+2n"
.SB MGDATA
If this environment variable exists, its value is used as the
default directory for the mg collection files.
If this variable does not exist, the directory \*(lq\fB.\fP\*(rq
is used by default.
The command line option
.BI \-d " directory"
overrides the directory in
.BR MGDATA .
.SH FILES
.TP 22
.B *.invf.dict
Compressed stemmed dictionary.
.TP
.B *.invf.dict.blocked.1
Stem index with stem index method 1.
.TP
.B *.invf.dict.blocked.2
Stem index with stem index method 2.
.TP
.B *.invf.dict.blocked.3
Stem index with stem index method 3.
.SH "SEE ALSO"
.na
.BR mg_compression_dict (1),
.BR mg_fast_comp_dict (1),
.BR mg_invf_dict (1),
.BR mg_passes (1),
.BR mg_perf_hash_build (1),
.BR mg_weights_build (1)
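.ad
.SH EXAMPLES
A minimal sketch of the three runs called for under DESCRIPTION,
assuming a collection whose base name is the placeholder
.I mycoll
and whose files are in the placeholder directory
.IR /data/mg .
Both names are illustrative only; substitute your own.
.PP
.nf
.RS
mg_stem_idx \-d /data/mg \-f mycoll \-s 1
mg_stem_idx \-d /data/mg \-f mycoll \-s 2
mg_stem_idx \-d /data/mg \-f mycoll \-s 3
.RE
.fi
.PP
Each run is expected to write the matching
.B *.invf.dict.blocked.1|2|3
file listed under FILES.
If
.SB MGDATA
is set to the collection directory, the
.B \-d
option can be omitted.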