North American Network Operators Group


Re: decreased caching efficiency?

  • From: Scott Gifford
  • Date: Thu Oct 19 15:06:21 2000

All of these problems are solvable, using common and well-known
techniques:

Daniel Senie writes:

> It might be worth thinking about the problem from the other end. From a
> web site owner's perspective, caching is a major annoyance. Here are the
> arguments you may encounter from a web site owner or web developer:
> 
> 1. It interferes with content in many cases (web site visitors may see
> cached pages instead of current content). I know cache products claim
> this doesn't happen, but it has, and often.

Most (all?) reasonable caching products will honor whatever expiration
information you put in the page, such as the Cache-Control header and
Expires header.  Where I've made careful use of these, I've never had
problems with stale content, even from browser caches.
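
As a rough illustration of the headers involved, here is a minimal
Python sketch.  The helper name cache_headers and the one-hour lifetime
are illustrative assumptions, not part of any particular server or
cache product; it only shows the general shape of the Cache-Control and
Expires values an origin server might send.

```python
from email.utils import formatdate


def cache_headers(max_age_seconds, now):
    """Build Cache-Control and Expires headers (hypothetical helper).

    max_age_seconds: how long a cache may reuse the response.
    now: current time as a Unix timestamp.
    """
    if max_age_seconds <= 0:
        # "no-cache" asks caches to revalidate before reuse; an Expires
        # value of "0" is an invalid date, which HTTP/1.1 caches are
        # required to treat as already expired.
        return {"Cache-Control": "no-cache", "Expires": "0"}
    return {
        "Cache-Control": f"max-age={max_age_seconds}",
        # HTTP dates are always expressed in GMT.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }


if __name__ == "__main__":
    import time

    print(cache_headers(3600, time.time()))  # page reusable for an hour
    print(cache_headers(0, time.time()))     # page rechecked every time
```

Passing the current time in as an argument, rather than reading the
clock inside the helper, just makes the output easy to check.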

> 2. The website owner loses information on how many visitors are coming
> to the site.

A common technique for counting page hits is to <img src> a small
uncacheable image on the page and count requests for that image.
Another is to make the page itself uncacheable while leaving the images
(which are most of the load time) cacheable.  A dynamic, uncacheable
page with always-cacheable images can be a big win all around; dynamic
images are fairly rare (except from MRTG. :) )
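
The second approach might look something like the following Python
sketch.  The names handle_request and page_hits, the list of image
suffixes, and the one-day image lifetime are all assumptions made for
illustration; a real server would do this in its own configuration.

```python
from collections import Counter

# Hypothetical list of suffixes treated as static images.
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")

# Hit counts per page; images are deliberately not counted.
page_hits = Counter()


def handle_request(path):
    """Return cache headers for a request and count page hits."""
    if path.lower().endswith(IMAGE_SUFFIXES):
        # Images may be served from any cache for a day.
        return {"Cache-Control": "public, max-age=86400"}
    # Pages must be revalidated with the origin on every view, so each
    # view reaches this server and can be counted.
    page_hits[path] += 1
    return {"Cache-Control": "no-cache", "Expires": "0"}


if __name__ == "__main__":
    for p in ("/index.html", "/logo.gif", "/index.html"):
        print(p, handle_request(p))
    print(dict(page_hits))
```

The origin still sees one small request per page view, while the bulk
of the bytes (the images) can come from a cache near the user.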

> 3. The website owner loses the demographics on where visitors are coming
> from, and especially the number of unique visitors. (It's not helpful to
> know that one cache engine visited, if that cache engine equated to
> 10,000 visits in an hour).

You can use the X-Forwarded-For header that many caches provide to
gather this same information.  In the future, you may be able to use
the protocol described in RFC 2227 to get more detailed information.
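
As a sketch of how a site might use that header when counting unique
visitors, assuming the usual convention that X-Forwarded-For is a
comma-separated list with the original client first: the function name
original_client is hypothetical, and since the header is supplied by
the far end it can be missing or forged, so the result is only a hint.

```python
def original_client(x_forwarded_for, peer_address):
    """Guess the visitor's address behind a cache (hypothetical helper).

    x_forwarded_for: value of the X-Forwarded-For header, or None.
    peer_address: address of the host that connected to the server.
    """
    if x_forwarded_for:
        # By convention the leftmost entry is the original client; later
        # entries are the caches the request passed through.
        first = x_forwarded_for.split(",")[0].strip()
        # Some caches send "unknown" instead of revealing the client.
        if first and first.lower() != "unknown":
            return first
    # No usable header: fall back to whoever connected to us.
    return peer_address


if __name__ == "__main__":
    requests = [
        ("10.1.2.3, 192.0.2.7", "192.0.2.7"),
        ("10.1.2.4", "192.0.2.7"),
        (None, "198.51.100.9"),
    ]
    unique = {original_client(xff, peer) for xff, peer in requests}
    print(len(unique), "unique visitors")
```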

> 4. Banner advertising may or may not display properly when caching is
> involved, thereby costing the website money.

I've never experienced this; I've been viewing the Web through a cache
or a hierarchy of caches for 2 years now, and I've never noticed
anything weird with banner ads.  At least nothing an "Expires: 0"
wouldn't solve.

> 5. There's NOTHING in it for the website owner, other than the
> possibility that SOME pages might display faster for SOME users.
> 
> If folks running networks really think website designers and owners
> should care about caching, then there needs to be some sort of benefit
> (perhaps paid in dollars) to those affected. Otherwise, there's little
> reason for them to care.

I don't understand this; having Web pages which are effectively cached
around the world reduces the load on your servers significantly
(especially as more and more ISPs start to cache), and saves you
significant bandwidth.  This lets you buy fewer servers for your farm,
and buy less upstream bandwidth.

Right now, having a site which is cache friendly can save you money in
a big way, while also saving ISPs money, making your page display how
you want it (since the ISPs are already deploying caching, whether
your pages are friendly to it or not), and making the page load faster
for quite a few users.

How is that not a benefit?  How is that not paid in dollars?

In the future, if Web server operators took cache performance (while
still maintaining correct display) into account when configuring their
servers, and made sure that page designers did the same, caches could
become more ubiquitous, and people would be pushed to set up
large-scale cache hierarchies.  It could get to the
point where all of the non-dynamic content from an infinitely large
Website could be served by an old desktop computer over a 28.8 modem,
since it would just have to send its content once to the UUNet cache,
once to the MCI/Worldcom cache, once to the Sprint cache, etc.  Of
course, that's still a ways off.  :)


Just my 2 cents,

------ScottG.