SO_BINDANY socket option in BSD/OS, along with the Squid cache
(http://www.squid-cache.org) to
implement this function. Recently, I was asked for the specifics of how to do
this, so I spent a little time getting this technology working. What follows is a
short writeup of the work I did to make transparent Web caching work.
/usr/contrib/bin/squid.
The source code for that version of squid is on the "Contributed Sources"
CD-ROM that ships with both the binary and source releases of BSD/OS. I
didn't try to make transparent caching work with any version of Squid other
than the version that is shipped with BSD/OS 4.2, since that version of
Squid is relatively recent. I didn't know of any good reason to upgrade to
a newer release of Squid.
SO_BINDANY socket option. This socket
option will allow an application to bind to and accept connections for any
IP address that gets routed to localhost for processing. To that end, I
created a small set of patches for Squid that turns on the
SO_BINDANY option inside of Squid. In order to apply these
patches to Squid, you will need to find your "Contributed Sources" CD-ROM
and put it in your CD-ROM drive to retrieve the Squid source code.
# mount /cdrom
# cd /var/tmp
# gzcat < /cdrom/contrib/squid.tar.gz | tar xvf -
# umount /cdrom
Apply the patches in the squid.patch file.
# patch -p0 < squid.patch
Build the (now patched) Squid distribution.
# gmake all
First, save the original binary in case you need to revert. Then
copy the squid binary to the installation directory.
# mv /usr/contrib/bin/squid /usr/contrib/bin/squid.FCS
# cp src/squid /usr/contrib/bin/squid
You have now completed the first step of the process -- your squid
binary will now set the SO_BINDANY socket option on the sockets
it creates for accepting connections.
Next you need to create a configuration file and apply the patches in the squid.conf.patch file.
# cd /var/www/squid/conf
# cp squid.conf.default squid.conf
# patch -p0 < squid.conf.patch
# vi squid.conf
Note: You WILL need to add an ACL for your network/hosts that allows them to connect to the proxy. I've called my ACL "dummy" and it allows anybody on 192.168.1.0/24 to connect to the proxy. You MUST edit this line of the configuration file for it to work properly on your network. There are many other settings that you may want to adjust to make Squid behave the way you need.
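As an illustration of the ACL described above, here is a minimal sketch that prints a source-address ACL and the rule that allows it. The contents of squid.conf.patch are not reproduced here, so treat this as an assumption about what the relevant lines look like rather than a copy of them; the helper name print_acl_lines is hypothetical. The acl and http_access directives themselves are standard squid.conf syntax.

```shell
# Print the two squid.conf lines that define a source-address ACL and
# allow it through the proxy.
# Arguments: ACL name, netblock in CIDR form.
# (print_acl_lines is a hypothetical helper, not part of Squid.)
print_acl_lines() {
    acl_name=$1
    netblock=$2
    printf 'acl %s src %s\n' "$acl_name" "$netblock"
    printf 'http_access allow %s\n' "$acl_name"
}
```

Running print_acl_lines dummy 192.168.1.0/24 prints the pair of lines for the network used in this writeup. Squid evaluates http_access rules in order, so the allow line has to appear before any rule that denies everything else.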
Now, start the Squid process:
# /var/www/squid/bin/start-squid
IPFW option on in the kernel. BSD/OS 4.2 ships with
this facility turned on, so unless you turned it off explicitly in the
kernel you are running, you should have it in your current kernel. You can
verify that this facility is turned on in your kernel by examining the
kernel configuration file that was used to build your kernel and making
sure that a line like the following is in the configuration file:
options IPFW # IP Filtering
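The check described above can be scripted. The following is a minimal sketch that looks for an uncommented IPFW options line; the function name has_ipfw_option is hypothetical, and the location of your kernel configuration file is something you will have to supply yourself.

```shell
# Return success if the given kernel configuration file enables IPFW.
# Argument: path to the kernel configuration file (site-specific).
has_ipfw_option() {
    config_file=$1
    # Match "options IPFW" at the start of a line, allowing any
    # whitespace between the words and an optional trailing comment.
    # A commented-out line ("#options IPFW") does not match.
    grep -Eq '^[[:space:]]*options[[:space:]]+IPFW([[:space:]]|#|$)' "$config_file"
}
```

For example, has_ipfw_option /path/to/kernel/config && echo "IPFW is enabled" prints the message only when the option is present.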
You need to install a small filter at the pre-input
location in the kernel. Here's a filter similar to what is installed on my
router host:
tcp && srcaddr(192.168.1.25) && dstport(service(http/tcp)) {
forcelocal;
accept;
}
This filter only turns on the transparent proxying for a single host
(192.168.1.25) -- which is all that I needed for demonstrating that the
proxying was working. In a normal situation, you will need to change the
filter to allow your entire netblock(s) to connect to the service. This could
be done by modifying the srcaddr(...) part in the above example
to srcaddr(192.168.1.0/24). If you have multiple netblocks that
you want to allow, you can list them like this: srcaddr(192.168.1.0/24,
192.168.2.0/24).
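If the list of netblocks changes often, the filter text can be generated instead of edited by hand. This is a minimal sketch that emits the same filter shown above for whatever addresses or netblocks it is given; the function name make_pre_input_filter is hypothetical.

```shell
# Emit a pre-input filter that forces local delivery of HTTP traffic
# from the given sources.
# Arguments: one or more addresses or netblocks.
make_pre_input_filter() {
    srcs=""
    # Join the arguments with ", " to build the srcaddr(...) list.
    for block in "$@"; do
        if [ -z "$srcs" ]; then
            srcs=$block
        else
            srcs="$srcs, $block"
        fi
    done
    printf 'tcp && srcaddr(%s) && dstport(service(http/tcp)) {\n' "$srcs"
    printf '    forcelocal;\n'
    printf '    accept;\n'
    printf '}\n'
}
```

For example, make_pre_input_filter 192.168.1.0/24 192.168.2.0/24 > /path/to/pre-input.filter writes a filter covering both netblocks.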
To install the filter, you will need to run the following commands.
# ipfwcmp -o /var/run/ipfw.pre-input /path/to/pre-input.filter
# ipfw pre-input -replace /var/run/ipfw.pre-input
Don't forget to put these commands in your startup files so this filter
will get installed each time your machine is rebooted! I would suggest
putting these commands at the end of the /etc/netstart
file.
The first command compiles the ASCII representation of the filter into
the binary format that the ipfw command uses. The second
command will actually download the filter into the running kernel,
replacing any existing pre-input filter. If you need to make changes to the
filter, you can execute these commands again (after editing the filter
file) and implement the changes to the filtering rules in the running
kernel without having to reboot.
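For the startup file, the two commands can be wrapped so that a missing or uncompilable filter does not replace the one already loaded. This is only a sketch: the function name install_pre_input_filter is hypothetical, and it uses exactly the ipfwcmp and ipfw invocations shown above.

```shell
# Compile and load the pre-input filter, but only if the source exists.
# Arguments: path to the ASCII filter, path for the compiled output.
install_pre_input_filter() {
    filter_src=$1
    filter_bin=$2
    if [ ! -r "$filter_src" ]; then
        echo "pre-input filter $filter_src not found; skipping" >&2
        return 1
    fi
    # Stop at the first failure so that a filter which fails to compile
    # is never loaded over the one already in the kernel.
    ipfwcmp -o "$filter_bin" "$filter_src" || return 1
    ipfw pre-input -replace "$filter_bin"
}
```

A call such as install_pre_input_filter /path/to/pre-input.filter /var/run/ipfw.pre-input can then go at the end of the startup file, and can be rerun by hand after editing the filter.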
I examined the log files as they were written by the Squid proxy. Whenever a new Web site was visited, there was a long pause before the first log entry for that site was written into the access file. All the subsequent log entries were written quite rapidly. This resembled a DNS lookup problem: the Squid cache was making the end user wait while it resolved the name of the new Web host. This is not acceptable!
Reading more of the Web pages at the Squid Web server, I stumbled across an extremely important piece of information. As of the 2.3 release of Squid, the default nameserver lookup routines are the internal ones. In other words, all the nameservice lookups were being done inside the Squid process by the internal Squid resolver code.
While there should be nothing wrong with the Squid resolver, it does have
one significant misfeature that may or may not affect your installation.
I have been given a patch from the Squid
development team to resolve this problem. The misfeature is this -- the
Squid internal resolver routines open and parse the
/etc/resolv.conf file to retrieve a list of nameservers to
query during normal operation. This isn't a bad method to use to get a list
of nameservers, except that the parsing code doesn't know that it needs to
handle the following nameserver entry specially:
nameserver 0.0.0.0
Since the dawn of time (well, OK, at least since the dawn of the BIND
resolver code, in the late 1980s) this has been a legal mechanism to signal
the resolver to query the local running named process.
Squid dutifully sends DNS requests to this address, which get delivered
to the local DNS server. The DNS server then sends back the response, from
one of the IP addresses it has bound to on the machine, but never from the
address 0.0.0.0. Because the response comes from an IP address
that isn't on the list of nameservers it will believe, Squid tosses out the
answer and then queries the next nameserver on the list. What Squid should do
when the special nameserver 0.0.0.0 is listed in the
/etc/resolv.conf file is to query the machine for all local IP addresses
bound to its interfaces and accept DNS answers from any of those IP
addresses.
Because of this misfeature in Squid, it appeared as if every nameservice request was failing. While the proxy waited for a nameserver to fail on every request, it wasn't relaying other HTTP traffic, and the end user had to wait while the DNS information was resolved from a distant nameserver.
If you apply the above patch that works around this problem, you should
not need the following workaround. To implement the workaround, you will
need to rebuild the squid executable with the
--disable-internal-dns option. This flag forces Squid to use
the external dnsserver program, which uses the system resolver
routines and will happily accept answers from the local nameserver's IP
addresses. After restarting the Web proxy with this workaround in place,
browsing through the proxy did not seem noticeably slower than without the
proxy.
The very small patch for the Makefile
to specify this flag is available. You probably don't need this patch if
you don't have 0.0.0.0 listed in your
/etc/resolv.conf file as a nameserver!
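A quick way to find out whether your installation is affected is to look for the special entry directly. This is a minimal sketch; the function name uses_wildcard_nameserver is hypothetical.

```shell
# Return success if the given resolver configuration file lists the
# special nameserver 0.0.0.0.
# Argument: path to the file to check, normally /etc/resolv.conf.
uses_wildcard_nameserver() {
    resolv_file=$1
    # The dots are escaped so that only the literal address matches,
    # and the trailing group rejects longer addresses such as 0.0.0.01.
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+0\.0\.0\.0([[:space:]]|$)' "$resolv_file"
}
```

For example, uses_wildcard_nameserver /etc/resolv.conf && echo "affected" prints the message only when the 0.0.0.0 entry is present.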
If you choose to implement this workaround, you will want to rebuild from scratch and reinstall the resulting binary:
# gmake clean
# gmake all
# cp src/squid /usr/contrib/bin/squid
Don't forget to kill and restart the squid daemon for the change to take effect!
If you do run the external dnsserver program (and you
probably do not need to do this), the following section
describes how to collect some more information about how
many instances of the dnsserver program to run.
The notes on the Squid homepage that describe using the external
dnsserver program say that you should always try to run at
least as many copies of the dnsserver program as the cache
will have nameservice requests outstanding. And then run two more copies of
the program for good measure. However, it doesn't appear that Squid keeps
track of how many requests each of the dnsserver instances has
handled, so figuring out when there are enough dnsserver
processes running is not as simple as it could be.
Hacking a little code into the dnsserver program to do this
counting seemed like the right solution. So, after a little work on the
code to put in a call to setproctitle(), you can now look at
the dnsserver processes with ps and see how many requests each
of the dnsserver processes has handled. The patches for the dnsserver program are
available.
# ps -auxw -U www | grep dnsserver
www 829 0.0 1.7 1396 492 ?? Is 10:30PM 0:00.06 (16 requests) (dnsserver)
www 830 0.0 1.7 1396 492 ?? Is 10:30PM 0:00.04 (1 requests) (dnsserver)
www 831 0.0 0.5 1060 144 ?? Is 10:30PM 0:00.02 (0 requests) (dnsserver)
www 832 0.0 0.5 1060 144 ?? Is 10:30PM 0:00.02 (0 requests) (dnsserver)
www 833 0.0 0.5 1060 144 ?? Is 10:30PM 0:00.02 (0 requests) (dnsserver)
This is much more useful than the default listing in the process table
for dnsserver, at least in my opinion. If you patch
dnsserver, you will need to recompile everything and reinstall
at least the dnsserver binary. You should probably save the
original copy of the program, in case you need to revert to it for
some reason.
# gmake clean
# gmake all
# mv /var/www/squid/bin/dnsserver /var/www/squid/bin/dnsserver.FCS
# install -c -o bin -g bin src/dnsserver /var/www/squid/bin/dnsserver
It's not completely obvious, but you will need to kill and restart the
squid daemon for the new version of dnsserver to
be started. This is necessary because squid starts up all the
copies of the dnsserver program when it first starts and uses
them until the squid daemon is stopped.
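Once the patched dnsserver is running, the per-process counts in the ps listing shown earlier can be totalled automatically. This is a minimal sketch that assumes the process title format from that listing, "(N requests) (dnsserver)"; the function name summarize_dnsserver_requests is hypothetical.

```shell
# Read ps output on standard input and print how many patched dnsserver
# processes were seen, how many requests they have handled in total,
# and how many of them have handled no requests at all.
summarize_dnsserver_requests() {
    awk '
        /\(dnsserver\)/ {
            for (i = 1; i < NF; i++) {
                # The patched title looks like "(16 requests)": a field
                # made of "(" plus digits, followed by "requests)".
                if ($(i + 1) == "requests)" && $i ~ /^\([0-9]+$/) {
                    count = substr($i, 2) + 0
                    procs++
                    total += count
                    if (count == 0) idle++
                }
            }
        }
        END {
            printf "%d processes, %d requests, %d idle\n", procs, total, idle
        }
    '
}
```

Run it as ps -auxw -U www | summarize_dnsserver_requests. If several processes stay idle under normal load, that suggests, though it does not prove, that enough copies are running.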
SO_BINDANY socket option works in BSD/OS. Also, my thanks go
to Adrian Chadd for pointing out that the Squid resolver routines are
asynchronous and really ought to be used, now that the resolver misfeature
in Squid is fixed.