North American Network Operators Group


RE: PGP keyserver infrastructure

  • From: L. Sassaman
  • Date: Wed Jun 28 18:44:25 2000


A couple of people have emailed me and asked exactly what the resource
requirements for a keyserver are. I asked Randy Harmon, the administrator
for, to see if he could answer that question for
me. His response is below.



L. Sassaman

System Administrator                |  
Technology Consultant               |  "Common sense is wrong." 
icq.. 10735603                      |  
pgp.. finger:// |    --Practical C Programming


Many parts of the answer to that question:

 - The replication infrastructure affects the bandwidth used.  Take a
10-node keyserver network.  If one node is the master for accepting
keys and the other 9 accept replications from it (without
cross-replicating), the bandwidth used is far different from the case
where each node replicates every change it receives to each of the
other 9.  In the first case, each replica would send its new keys and
updates to the master, which would replicate them back to all 9 (or 8,
arguably).

 - Keyservers do get out of sync with each other (replications not
arriving, replicas unavailable) and should be periodically
cross-sync'd.  Databases can also become corrupted and should be
periodically rebuilt (combined with cross-syncing, ideally).  These
rebuilds help make up for the possibility of failure in the
more-efficient replication mechanisms, and should probably use a
"pull" approach rather than a "push" approach.  So the master server
could pull keysets from its replicas, rebuild and merge its own
database with the keysets from others, then send a notification that
a database update is available.  In the 10-server example, this
approach would transfer 1.2kb per key x 2 transmissions x 9 servers.

 - The disk hardware should be capable of withstanding the search
volume (depends on the number/types of searches performed) and of
rebuilding the databases without major impact on search performance. 
I'd suggest 8-15 gigs of FREE space, beyond ~6 gigs for the main
database.  Single drives are OK for low search volume and low search
performance.  A larger database needs more disk space, growing
approximately linearly with the number of keys, modulo differences in
indexing techniques as the key count grows very large.

 - Day-to-day bandwidth is a function of the number of replications
triggered by a key add/update, the number of keys added/updated
(roughly 4/3 * 1.2kb per key added/updated) and the number/types of
searches performed.  We currently receive about 20,000 searches per
day (5 keys returned per search, on average), about 1500 adds, and
about 250 updates. For today's volume with the more efficient
replication approach discussed above, the master server would

	Searches:     100,000 * 1.2k * 4/3 radix-64 overhead = 160 MB
	Adds/Mods:    1750 * 1.2k * 4/3 overhead = 2.8 MB
	Replications: 2.8 MB * 9 = 25.2 MB
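
For anyone who wants to rerun this arithmetic with their own numbers, here is a minimal Python sketch. It is not part of the keyserver software; the function and constant names are made up for illustration, and it assumes decimal megabytes (1 MB = 1000 kb) and that every added or updated key is pushed once to each of the 9 replicas.

```python
# Rough daily-bandwidth estimate for the master keyserver.
# All names are illustrative; the inputs are the figures quoted above.

KEY_SIZE_KB = 1.2          # average key size quoted above
RADIX64_OVERHEAD = 4 / 3   # radix-64 armor turns every 3 bytes into 4 characters
KB_PER_MB = 1000           # assumption: decimal megabytes


def armored_mb(key_count: float) -> float:
    """Approximate size in MB of `key_count` keys sent in radix-64 form."""
    return key_count * KEY_SIZE_KB * RADIX64_OVERHEAD / KB_PER_MB


def daily_estimate(searches, keys_per_search, adds, updates, replicas):
    """Return (search, add/mod, replication) traffic in MB per day."""
    search_mb = armored_mb(searches * keys_per_search)
    addmod_mb = armored_mb(adds + updates)
    # Assumption: each added or updated key is pushed once to every replica.
    replication_mb = addmod_mb * replicas
    return search_mb, addmod_mb, replication_mb


search_mb, addmod_mb, replication_mb = daily_estimate(
    searches=20_000, keys_per_search=5, adds=1_500, updates=250, replicas=9
)
print(f"Searches:     {search_mb:6.1f} MB")
print(f"Adds/Mods:    {addmod_mb:6.1f} MB")
print(f"Replications: {replication_mb:6.1f} MB")
```

With the inputs shown, this gives roughly 160 MB of search traffic, 2.8 MB of adds and updates, and about 25 MB of replication traffic per day. Changing `replicas` or `keys_per_search` shows quickly which term dominates.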

If we assume 10 times today's volume (1.1 million keys on the
server), and 10 servers to balance the search load, then the
bandwidth for each server would be roughly:

	Searches:                160 MB
	Adds/Mods:                28 MB
	Replications (outgoing):  28 MB
	Replications (incoming): 280 MB
	Total:                   496 MB/day
	                         (about 15 GB/month)

Multiply by 10 for 100,000,000 keys.
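
The per-server total and the monthly figure can be checked the same way. This is only a sketch: the dictionary keys and the `totals` helper are hypothetical names, and it assumes a 30-day month and 1 GB = 1000 MB. The per-category numbers are copied from the table above, not derived.

```python
# Sum the per-server daily figures from the scaled-up scenario and
# convert them to a monthly total. Names are illustrative only.

PER_SERVER_MB_PER_DAY = {
    "searches": 160,
    "adds_mods": 28,
    "replication_out": 28,
    "replication_in": 280,
}

DAYS_PER_MONTH = 30   # assumption
MB_PER_GB = 1000      # assumption: decimal gigabytes


def totals(per_server_mb, scale=1.0):
    """Return (MB per day, GB per month) for one server, optionally scaled."""
    daily_mb = sum(per_server_mb.values()) * scale
    monthly_gb = daily_mb * DAYS_PER_MONTH / MB_PER_GB
    return daily_mb, monthly_gb


daily_mb, monthly_gb = totals(PER_SERVER_MB_PER_DAY)
print(f"{daily_mb:.0f} MB/day, {monthly_gb:.1f} GB/month")

# The "multiply by 10" case from the text:
big_daily_mb, big_monthly_gb = totals(PER_SERVER_MB_PER_DAY, scale=10)
print(f"{big_daily_mb:.0f} MB/day, {big_monthly_gb:.1f} GB/month")
```

Under those assumptions the first call gives 496 MB/day, or a little under 15 GB/month, and the scaled call gives ten times that.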


--------
Randy Harmon <[email protected]>
Engineer, PGP Keyserver
PGP KeyID: 0x5cb7b7f2a0aa5c1e
