
Registering Proxy Servers and Gateways

Applications do not have to provide native support for every protocol; in many situations they can rely on proxies and gateways to do the job. Proxy servers are often used to carry client requests through a firewall, where they can also provide services such as corporate caching and other network optimizations. Both proxy servers and gateways can serve as "protocol translators" that convert a request in the main Web protocol, HTTP, into an equivalent request in another protocol, for example NNTP, FTP, or Gopher. If a proxy server or a gateway is available, the application can therefore forward all its requests over HTTP to, for example, a proxy server. The proxy then handles the communication with the remote server, for example using FTP to fetch the document, and returns the document to the application (the proxy client) using HTTP.

The Library supports both proxies and gateways through the HTProxy module, and all requests can be redirected to a proxy or a gateway, even requests on the local file system. Of course, the Library can also be used in proxy or gateway applications, which in turn can use other proxies or gateways, so that a single request can be passed through a series of intermediate agents.

There is one main mechanism for registering both proxies and gateways, but there are two different APIs to choose from. The application is free to choose whichever suits it best; the functionality provided by the Library is the same in both cases. The first API is based on a set of registration functions, as we have seen so often throughout this guide. The second reads the registrations from environment variables. Regardless of the registration mechanism used, proxy servers always take precedence over gateways: if both a proxy server and a gateway are registered for the same access method, the proxy server is used.

Registration of Proxies

A proxy server is registered with a corresponding access method, for example http or ftp. The `proxy' parameter should be a fully valid name, like http://proxy.w3.org:8001, although the domain name is not required. If an entry already exists for this access method, it is deleted and replaced by the new one.
extern BOOL HTProxy_add		(CONST char * access, CONST char * proxy);
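The replace-on-duplicate behaviour described above can be illustrated with a small stand-alone model. This is only a sketch and does not call the Library: the `Registry` type and the `registry_add` and `registry_find` functions are hypothetical names invented for the example, and the fixed table sizes are arbitrary assumptions.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 16
#define MAX_FIELD   128

/* Hypothetical stand-in for a proxy list; not the Library's own data structure. */
typedef struct {
    char access[MAX_FIELD];   /* access method, e.g. "ftp" */
    char server[MAX_FIELD];   /* proxy URL, e.g. "http://proxy.w3.org:8001" */
} Entry;

typedef struct {
    Entry entries[MAX_ENTRIES];
    int   count;
} Registry;

/* Register a server for an access method. An existing entry for the same
 * access method is overwritten. Returns 1 on success, 0 on bad input or
 * when the table is full. */
static int registry_add(Registry *reg, const char *access, const char *server)
{
    int i;
    if (!reg || !access || !server || !*access || !*server) return 0;
    if (strlen(access) >= MAX_FIELD || strlen(server) >= MAX_FIELD) return 0;
    assert(reg->count >= 0 && reg->count <= MAX_ENTRIES);

    for (i = 0; i < reg->count; i++) {
        if (strcmp(reg->entries[i].access, access) == 0) {
            strcpy(reg->entries[i].server, server);   /* replace old entry */
            return 1;
        }
    }
    if (reg->count == MAX_ENTRIES) return 0;
    strcpy(reg->entries[reg->count].access, access);
    strcpy(reg->entries[reg->count].server, server);
    reg->count++;
    return 1;
}

/* Return the server registered for an access method, or NULL if none.
 * The comparison is case sensitive. */
static const char *registry_find(const Registry *reg, const char *access)
{
    int i;
    if (!reg || !access) return NULL;
    for (i = 0; i < reg->count; i++)
        if (strcmp(reg->entries[i].access, access) == 0)
            return reg->entries[i].server;
    return NULL;
}
```

In this model, registering "ftp" twice leaves a single entry that holds the second server, which mirrors the rule that a new registration replaces the old one.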
In addition to the proxy list, the Library supports a list of servers for which a proxy should not be consulted. This is useful for avoiding the proxy when contacting servers inside a firewall, when the remote server is known to be as well connected as the proxy, or when the remote server is in fact itself a proxy server.
extern BOOL HTNoProxy_add	(CONST char * host, CONST char * access, unsigned port);
The servers registered using this function are host names and domain names for which no proxy is contacted, even though a proxy is in fact registered for the access method in question. When registering a server as a noproxy element, you can specify a port for this access method, in which case the entry is valid only for requests to that port. If `port' is '0' then the entry applies to all ports, and if `access' is NULL then it applies to all access methods. Examples of host names are:
	w3.org
	www.fastlink.com
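The matching rules for the noproxy list can be sketched as follows. This is a stand-alone model, not the Library's implementation: the `NoProxyEntry` type and the `host_matches` and `bypass_proxy` functions are hypothetical, and matching a domain name as a suffix on a dot boundary is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one entry on the noproxy list. */
typedef struct {
    const char *host;     /* host or domain name, e.g. "w3.org" */
    const char *access;   /* access method, or NULL for all access methods */
    unsigned    port;     /* port number, or 0 for all ports */
} NoProxyEntry;

/* Assumption: an entry matches a request host that is equal to it or that
 * ends with it on a dot boundary ("www.w3.org" matches "w3.org"). */
static int host_matches(const char *host, const char *suffix)
{
    size_t hl = strlen(host), sl = strlen(suffix);
    if (sl == 0 || sl > hl) return 0;
    if (strcmp(host + (hl - sl), suffix) != 0) return 0;
    return hl == sl || suffix[0] == '.' || host[hl - sl - 1] == '.';
}

/* Return 1 if a request to host:port with the given access method should
 * go directly to the server instead of through a proxy. */
static int bypass_proxy(const NoProxyEntry *list, size_t n,
                        const char *host, const char *access, unsigned port)
{
    size_t i;
    assert(host != NULL && access != NULL);
    for (i = 0; i < n; i++) {
        const NoProxyEntry *e = &list[i];
        if (e->access != NULL && strcmp(e->access, access) != 0) continue;
        if (e->port != 0 && e->port != port) continue;
        if (host_matches(host, e->host)) return 1;
    }
    return 0;
}
```

With an entry for w3.org that has a NULL access method and port 0, every request to w3.org or one of its hosts bypasses the proxy in this model, whereas an entry restricted to http on port 8080 only affects requests with exactly that combination.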

Registration of Gateways

Gateways are registered exactly like proxy servers: each gateway is registered with a corresponding access method, for example http or ftp. The `gate' parameter should be a fully valid name, like http://gateway.w3.org:8001, although the domain name is not required. If an entry already exists for this access method, it is deleted and replaced by the new one.
extern BOOL HTGateway_add	(CONST char * access, CONST char * gate);
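The rule that a proxy server takes precedence over a gateway registered for the same access method can be sketched as a two-step lookup. The code below is a stand-alone illustration rather than the Library's code: the `Mapping` type and the `lookup` and `select_intermediary` functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pairing of an access method with a proxy or gateway URL. */
typedef struct {
    const char *access;   /* e.g. "ftp" */
    const char *server;   /* e.g. "http://proxy.w3.org:8001" */
} Mapping;

/* Return the server registered for an access method, or NULL if none. */
static const char *lookup(const Mapping *table, size_t n, const char *access)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(table[i].access, access) == 0) return table[i].server;
    return NULL;
}

/* Pick the intermediary for an access method: a proxy if one is registered,
 * otherwise a gateway. NULL means that neither is registered. */
static const char *select_intermediary(const Mapping *proxies, size_t np,
                                       const Mapping *gateways, size_t ng,
                                       const char *access)
{
    const char *server;
    assert(access != NULL);
    server = lookup(proxies, np, access);
    if (server == NULL) server = lookup(gateways, ng, access);
    return server;
}
```

When both tables hold an entry for ftp, this sketch returns the proxy; the gateway is used only for access methods that have no proxy.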

Backwards Compatibility with Environment Variables

Proxy servers and gateways have historically been registered through environment variables, which is a Unix'ism and not especially portable. However, in order to support this way of registration, the Library provides the following function to read the environment variables that define proxy servers and gateways.
extern void HTProxy_getEnvVar	(void);
There is no standard for the format of the environment variables, but the most accepted convention is the format described here:
WWW_<access>_GATEWAY
Definition of a gateway. Note that a WAIS gateway can be defined this way to change the default gateway at wais://www.w3.org:8001/.
<access>_proxy
Definition of a proxy server
no_proxy
This is a comma-separated list of remote servers for which a proxy server should not be consulted when handling the request. An example is:
	no_proxy="cern.ch,ncsa.uiuc.edu,some.host:8080"
	export no_proxy
<access> is the specific access scheme, and it is case sensitive, as access schemes in URIs are case sensitive. Proxy servers have precedence over gateways, so if both a proxy server and a gateway have been defined for a specific access scheme, the proxy server is selected to handle the request.
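The format of the no_proxy value can be made concrete with a small parser. This is a sketch under stated assumptions, not the code that HTProxy_getEnvVar uses: the `NoProxyItem` type and the `parse_no_proxy` function are hypothetical, an item without a port is stored with port 0 to mean "all ports", and leading spaces before an item are skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HOST 256

/* Hypothetical parsed form of one item in a no_proxy list. */
typedef struct {
    char     host[MAX_HOST];
    unsigned port;            /* 0 means all ports */
} NoProxyItem;

/* Split a value such as "cern.ch,ncsa.uiuc.edu,some.host:8080" into items.
 * Returns the number of items stored in out (at most max). Empty items and
 * items whose host part is too long are skipped. */
static size_t parse_no_proxy(const char *value, NoProxyItem *out, size_t max)
{
    size_t count = 0;
    assert(value != NULL && out != NULL);
    while (*value != '\0' && count < max) {
        size_t len, hostlen;
        const char *colon;

        while (*value == ' ') value++;          /* skip leading spaces */
        len = strcspn(value, ",");              /* length of this item */
        colon = memchr(value, ':', len);        /* optional ":port" part */
        hostlen = colon ? (size_t)(colon - value) : len;

        if (hostlen > 0 && hostlen < MAX_HOST) {
            memcpy(out[count].host, value, hostlen);
            out[count].host[hostlen] = '\0';
            out[count].port = colon ? (unsigned)strtoul(colon + 1, NULL, 10) : 0;
            count++;
        }
        value += len;
        if (*value == ',') value++;             /* step over the separator */
    }
    return count;
}
```

An application could pass the result of the standard `getenv("no_proxy")` call to such a function after checking that it is not NULL, and then register each item with HTNoProxy_add.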

It is important to note that the use of proxy servers or gateways is an extension to the binding between an access scheme and a protocol module. An application can be set up to redirect all URLs with a specific access scheme without knowing about the semantics of this access scheme or how to access the information directly. In this way, powerful client applications can be built with direct support for a single protocol, for example HTTP.

Finding Proxies and Gateways

Registering a proxy server or a gateway does not by itself mean that a request is redirected to the new location instead of the origin server. To actually redirect a request, you can register a Pre-Request Callback function, which is called before the request is sent over the wire. The section Proxies and Gateways has more information on how to redirect a request to, for example, a proxy server.


Henrik Frystyk, libwww@w3.org, December 1995