Using the Z39.50 Client in Greenstone
Table of Contents
About: What is Z39.50?
Configuring Greenstone - Adding Servers
Currently known problems, and TODO list
Contacting Us
What is Z39.50?
Z39.50 is an international client/server protocol for searching bibliographic
data. It can use the Internet Protocol (TCP/IP), which makes the databases
on a server available from almost anywhere around the globe. It is widely
used, for example, in on-line library catalogues. It allows a user to search
one or more databases and retrieve the results of the query.
Our implementation uses the YAZ Z39.50 toolkit, written by
Index Data.
Adding Servers
The file z3950.cfg in the etc/packages directory contains an entry for
each server. No servers are set up by default, but the config file
comes with two commented-out example entries, both for servers at the
United States Library of Congress.
Each entry consists of:
- A unique "short name" for internal use by Greenstone.
- The internet name or address of the server, and optionally the
port that the server is running on if not the default port 210.
- The name of the database to search on that server.
- A string that provides a meaningful name for the "collection".
- An optional "About" string, providing some information about the database
and/or server.
- Optional icon fields, which are displayed instead of the text on the
front page.
Entries need only be separated by whitespace, but for clarity the
sample entry uses newlines and tabs.
The sample Library of Congress entry looks like this:
LOC
z3950.loc.gov:7090
Voyager
"Library of Congress z39.50 server"
About en "This is the z39.50 interface to the US LoC catalogue
system. It contains approximately 12 million bibliographic records.
For server capabilities, look at
<a href=\"http://lcweb.loc.gov/z3950/lcserver.html\">their web pages</a>."
The Library of Congress website has a list of some servers that are
publicly available for testing.
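To make the entry format described above more concrete, here is a minimal
Python sketch that splits a single entry into its fields. It is not
Greenstone's own parser. The function name parse_entry, the dictionary keys,
and the use of shlex are assumptions made for illustration. It also assumes
that quoted strings follow shell-like quoting rules (including \" escapes),
and it does not handle comment lines, multiple entries, or the layout of the
icon fields, none of which are specified here.

```python
import shlex

DEFAULT_PORT = 210  # default Z39.50 port, as stated in the field list

# The sample Library of Congress entry, copied from this document.
SAMPLE = r'''
LOC
z3950.loc.gov:7090
Voyager
"Library of Congress z39.50 server"
About en "This is the z39.50 interface to the US LoC catalogue
system. It contains approximately 12 million bibliographic records.
For server capabilities, look at
<a href=\"http://lcweb.loc.gov/z3950/lcserver.html\">their web pages</a>."
'''


def parse_entry(text):
    """Split one whitespace-separated entry into named fields (hypothetical helper)."""
    # shlex keeps quoted strings together, even across newlines,
    # and turns \" inside double quotes into a plain quote character.
    tokens = shlex.split(text)
    if len(tokens) < 4:
        raise ValueError("an entry needs a short name, address, database and title")

    short_name, address, database, title = tokens[:4]
    host, _, port = address.partition(":")

    entry = {
        "short_name": short_name,
        "host": host,
        "port": int(port) if port else DEFAULT_PORT,
        "database": database,
        "title": title,
        "about": {},   # language code -> text, assumed from "About en ..."
        "extra": [],   # anything else (e.g. icon fields), left uninterpreted
    }

    rest = tokens[4:]
    while rest:
        if rest[0] == "About" and len(rest) >= 3:
            entry["about"][rest[1]] = rest[2]
            rest = rest[3:]
        else:
            entry["extra"].append(rest[0])
            rest = rest[1:]
    return entry


if __name__ == "__main__":
    parsed = parse_entry(SAMPLE)
    print(parsed["short_name"], parsed["host"], parsed["port"], parsed["database"])
```

Because the port is optional, an address such as "z3950.loc.gov" with no
":port" suffix falls back to port 210 in this sketch.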
Known Problems/Issues list
- Because of the open nature of the standard, our client may not work with
some servers. Please notify the author (contact details below) of any
such problems.
- Because of the large number of MARC fields, only the most frequently
used fields have been given explicit names in the results. Furthermore,
these are currently hard-coded to correspond to the USMARC field names.
- The Z39.50 client will not work when fast-cgi is used. Currently,
enabling the fast-cgi package (off by default) disables the
Z39.50 client code in Greenstone.
- There are spurious <B> tags in the title of the window when
viewing an individual record.
- Greenstone will claim there are no collections if the "collect"
directory has no sub-directories, even if Z39.50 servers are configured.
It does work if the directory contains unbuilt collections.
History
Jul 2001 - finally tracked down and fixed an annoying bug that caused
Greenstone to segfault when retrieving documents for some queries.
Jun 2001 - decided to drop href for field 856, as some examples have
explanatory text in the field as well as the URI.
Oct 2000 - results from the server are now cached, which cuts down the number
of connections needed for a single query/document. Also added href for MARC
field 856.
Aug 2000 - Various bug fixes and minor modifications.
Fri Aug 4 2000 - The z39.50 code is now "stable" in the main source code (in CVS).
Tue 1 Aug 2000 - selecting a field from the drop-down box now actually works.
This results in both quicker replies and more accurate results.
Mon 24 Jul 2000 - queries are now "AND"ed - for example, the query "computer
science" will now only return records that contain both words.
May-June 2000 - Code written.
Contact Details
For general comments about Greenstone, or suggestions for improvements, send
email to
[email protected]
For bug reports or questions about the z39.50 code itself, send email to
the author, John McPherson -
[email protected].
Last modified: July 12, 2001 by John McPherson