.\"------------------------------------------------------------
.\" Id - set Rv,revision, and Dt, Date using rcs-Id tag.
.de Id
.ds Rv \\$3
.ds Dt \\$4
..
.Id $Id$
.\"------------------------------------------------------------
.TH mg_perf_hash_build 1 \*(Dt CITRI
.SH NAME
mg_perf_hash_build \- generate an order-preserving hash function for the stemmed dictionary
.SH SYNOPSIS
.B mg_perf_hash_build
[
.B \-h
]
[
.BI \-r " rndseed"
]
[
.BI \-d " directory"
]
.if n .ti +9n
.BI \-f " name"
.SH DESCRIPTION
.B mg_perf_hash_build
generates an order-preserving hash function from the compressed
stemmed dictionary and writes it to disk.
.BR mg_passes (1)
uses this hash function when it builds the inverted file.
.SH OPTIONS
Options may appear in any order.
.TP "\w'\fB\-d\fP \fIdirectory\fP'u+2n"
.B \-h
Displays a usage line on
.IR stderr .
.TP
.BI \-r " rndseed"
Specifies the random seed used to generate the hash function; it must
be an integer.
If this option is not given, the current time is used as the random seed.
.TP
.BI \-d " directory"
Specifies the directory where the document collection can be found.
.TP
.BI \-f " name"
Specifies the base name of the document collection.
.SH ENVIRONMENT
.TP "\w'\fBMGDATA\fP'u+2n"
.SB MGDATA
If this environment variable exists, its value is used as the default
directory for the mg collection files.
If it does not exist, the directory \*(lq\fB.\fP\*(rq is used by default.
The command line option
.BI \-d " directory"
overrides the directory in
.BR MGDATA .
.SH FILES
.TP 20
.B *.invf.dict
Stemmed dictionary.
.TP
.B *.invf.dict.hash
Data for an order-preserving perfect hash function.
.SH "SEE ALSO"
.na
.BR mg_compression_dict (1),
.BR mg_fast_comp_dict (1),
.BR mg_invf_dict (1),
.BR mg_passes (1),
.BR mg_stem_idx (1),
.BR mg_weights_build (1)