North American Network Operators Group


2006.06.06 NANOG-NOTES IDC power and cooling panel

  • From: Matthew Petach
  • Date: Thu Jun 08 10:29:24 2006

(ok, one more set of notes and then off to sit in traffic for an hour on
the way to work... --Matt)


2006.06.06 Power and Cooling panel
Dan Golding, Tier1 research, moderator

Hot Time in the Big IDC
Cooling, Power, and the Data Center

3 IDC vendors, 4 hardware vendors
Michael Laudon, Force10
Jay Park, Equinix
Rob Snevely, Sun
Josh Snowhorn, Terremark
David Tsiang, Cisco
Brad Turner, Juniper
Brian Young, S&D

The power and cooling crisis
internet datacenters are getting full
most of the slack capacity has been used up
devices are using more and more power
low power density - routers, full sized servers
medium power density - 1u servers, switches
high power density - blade servers
Many data centers are full at 70-80% floor space
utilized
North America IDC occupancy is around 50%
 most sought-after space is around 70%

A facility is "full" when power and cooling capacity
is used up; floor space sits vacant but can't be used.

There is a relationship between power and cooling
devices are not 100% efficient
I^2R losses mean that power becomes heat
 (conservation of energy)
heat must be dissipated
The ability to dissipate heat with normal cooling
technologies is hitting the wall
need new techniques
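
[Aside from me, not the panel: a quick sanity check in Python on "every watt
in becomes heat that must be removed," using the standard conversions of
3.412 BTU/hr per watt and 12,000 BTU/hr per ton of cooling; the 3kw rack is
just an example figure of mine.]

    # every watt delivered to the gear ends up as heat the HVAC must remove
    WATTS_TO_BTU_HR = 3.412      # 1 W ~= 3.412 BTU/hr
    BTU_HR_PER_TON = 12000       # 1 ton of refrigeration = 12,000 BTU/hr

    def cooling_load(it_load_watts):
        """Return (BTU/hr, tons of cooling) needed to remove the heat."""
        btu_hr = it_load_watts * WATTS_TO_BTU_HR
        return btu_hr, btu_hr / BTU_HR_PER_TON

    print(cooling_load(3000))    # a 3kw rack -> ~10,236 BTU/hr, just under 1 ton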

Some quick rules of thumb
a rack or cabinet is a standard unit of space
from 30-40sqft per rack
power is measured in watts
many facilities do around 80-100w/sqft; at 30sqft
 per rack, that's about 3kw/rack
high
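
[Another aside from me: the 80-100w/sqft rule of thumb above in code form;
the 30-40 sqft/rack figures are from the slide, the rest is just arithmetic.]

    def kw_per_rack(watts_per_sqft, sqft_per_rack=30):
        # floor-loading density times the floor area a rack occupies
        return watts_per_sqft * sqft_per_rack / 1000.0

    print(kw_per_rack(100))       # 100 w/sqft * 30 sqft -> 3.0 kw per rack
    print(kw_per_rack(80, 40))    # 80 w/sqft * 40 sqft  -> 3.2 kw per rack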

how did we get here?
what is current situation
where are we going?
[dang,  he's flying through his slides!!]

Hardware engineers
T-series hardware engineer for Juniper
CRS-1 hardware
E-series
datacenter design issues for Sun,
there were other hardware vendors who were not
interested in showing up, these people were brave
for coming up here!

Josh Snowhorn, IDC planner
Jay Park, electrical engineer for Equinix
Brian Young, S&D cage design specialist

What do the IDC vendors feel the current situation
is in terms of power/cooling, how did we get here?

Josh--designed datacenters at 100w/sqft, more than
enough for the carriers; the server guys hit 100w/sqft
in a quarter rack.  You could cannibalize some power
and cooling, but still ran out of cooling.
Now they spend hundreds of millions to make 200w/sqft
datacenters, or higher.

Now, to hardware vendors--why are their boxes
using up so much electricity, putting out so
much heat?
What are economics behind increasing density
and heat load?

From the high-end router space--it's been simple: the
bandwidth demand has grown faster than power
efficiency can keep up with.  In the past, efficiency
could keep pace: a power spin about every 2 years
cut power in half; but now bandwidth is doubling
every year, while it still takes two years to drop
power in half.  We've been losing at this game
for a while, and we're running out of room on the
voltage scale; 90nm is down at 1v, can't go much
lower, since the diode drop is at 0.7v; at 65nm, it's
still at 1v, so there's no big hammer anymore for
power efficiency.  Need to pull some tricks out, like
clock gating, which may get some 20-30%
efficiency gains, but not much more can be
pulled out of the bag now.
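
[Aside from me, a sketch of the scaling argument above; the doubling/halving
rates are the ones the panelist quoted, the code and function name are mine.]

    # if bandwidth doubles every year but power-per-bit only halves every
    # two years, total power per box still grows ~41% (sqrt(2)) per year
    def relative_power(years, bw_doubling_yrs=1.0, power_halving_yrs=2.0):
        bandwidth = 2 ** (years / bw_doubling_yrs)           # relative capacity
        watts_per_bit = 0.5 ** (years / power_halving_yrs)   # relative efficiency
        return bandwidth * watts_per_bit                     # relative total power

    for y in (0, 2, 4, 6):
        print(y, relative_power(y))   # 1.0, 2.0, 4.0, 8.0 -- doubling every 2 yrs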

Newton was right; you can do some tricks, but no
magic.  Chip multithreading is one area they're
trying to squeeze more performance out of; don't
replicate ancillary ASICs for each core.  Also
can more easily share memory, and nobody has a
100% efficient power supply, so you lose some
power there too.

More and more getting squeezed in each rack.

Also a drive on cost; amortizing costs over
space and capability.
reducing costs per port is a big driver.

And customers are pushing for more and more
density, since the cost of real estate is getting
so high--each square foot costs so much.
In Ginza, it's $120/sq ft for space.

If you go to places where real estate is cheap,
easier/cheaper to just build really big rooms,
and let power dissipate more naturally.

IDC people agree, some cities are just crazy
in real-estate costs.  But for those in suburban
areas, real estate isn't so expensive.
3kw per blade server, put a few in a rack, you
hit nearly 10kw in a rack.  Soon, will need
direct chilled water in the rack to cool them.
But chilled water mixed with other colocation
and lower density cabinets is very challenging
to build.
But need to have enclosed space to handle local
chilled water coolers in localized racks.
20 years ago at IBM, nobody wanted chilled water
in their hardware.  Now, we're running out of
options.

Disagree--other ways of handling the challenge;
how thermally efficient are the rooms in the
first place, and are there other ways of handling
heat issues?
Cable cutouts in floor tiles allow air to escape
in areas that don't provide cooling.

Josh notes the diversity between carriers at 40w/sq/ft
vs hosting providers at 400w/sq/ft is making engineering
decisions challenging.

It's not about power really anymore, we can get power,
it's about cooling now.

Dealing with space in wrong terms--watts/sq ft, vs
requirements of each rack.  Charge customers based
on the cooling requirements?

If you try to cool 15kw per cabinet, you still have
limits of how many cfm you can move through a given
space.  At some point, the air flow vertically through
the rack starts to starve.
What about a dual push-pull air system that pushes from
the bottom and pulls from the top?
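
[Aside from me: why airflow, not just watts, becomes the limit.  This uses
the standard sensible-heat formula BTU/hr = 1.08 * CFM * deltaT(F); the 20F
temperature rise and the rack loads are illustrative numbers of mine.]

    def cfm_needed(load_watts, delta_t_f=20.0):
        # convert the heat load to BTU/hr, then solve the sensible-heat
        # formula for the airflow required to carry it away
        btu_hr = load_watts * 3.412
        return btu_hr / (1.08 * delta_t_f)

    print(round(cfm_needed(3000)))     # ~3kw rack:    ~474 CFM
    print(round(cfm_needed(15000)))    # 15kw cabinet: ~2369 CFM through one rack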

Q: Randy, IIJ, question from the audience.  He'll put
stuff in there as hot as he can cool, because he wants
the power; that's life.
Problem is cooling; over 3kw/4kw over current level,
the wind tunnel effect gets painful.
the option of putting water in the cabinets is a
dealbreaker for many people.
Fact is, most facilities can't even handle 3kw per
square meter; any build that can't meet that today
is unrealistic.
That's close to 300w per sq. ft.
Josh has some cabinets at NOTA; Akamai is at 386w/sqft,
and they can cool it with a huge hot aisle behind it,
surrounded by carriers at 40w/sqft.
Those are the densest cabinets they have.
IDCs need to build them and charge a realistic amount;
people will burn as hot as possible, since they need
to move more and more data.
Raise plenums higher, move more air, air coming up
side of rack and across.
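
[Aside from me, illustrative numbers only: one way to see why a dense pod can
still be cooled when it's surrounded by low-density carrier space -- the
blended load across the floor stays near what the room was designed for.]

    def blended_watts_per_sqft(zones):
        # zones: list of (fraction_of_floor, watts_per_sqft) pairs
        return sum(frac * density for frac, density in zones)

    # 10% of the floor at 386 w/sqft (dense cabinets), 90% at 40 w/sqft (carriers)
    print(round(blended_watts_per_sqft([(0.10, 386), (0.90, 40)]), 1))   # 74.6 w/sqft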

Currently, Equinix is building 4kw per cabinet,
planning for that in 2007.  A cabinet is about
2sq meters, so still not at the density Randy's
looking for.
Starting to separate high density users from
medium and low density users.

Q: GNI, Derek ? datacenter in SF, 1008 IBM blade
servers, 2500 sq ft, ping pong table, soda
machine in surrounding areas.
need 2500w/sq/ft to deliver the same cabinet space.
20kw per cabinet is what they can deliver in
cabinets.
He's paying for 100 cabinets, has 12 installed.
He's still netting efficiency from it: still gets
3% better efficiency, still beats 84 1u pizza box
servers.  If IDCs could keep up with that, they could
keep physical space requirements more reasonable.

The costs are exponential for more density.
Up to a year lead time for 2MW generators, we're
pushing the envelope on that.  It is an exponential
increase.
Budget trauma when those costs get passed on.
Let the demand stimulate ingenuity.
The internet industry in general is short-sighted;
with 22 million blade servers being installed, where
will they all be located?

Q: BillNorton.  one other dimension.  Life span for
new datacenters is 10-40 year timeframe, so it's hard
to adjust midlife to hugely different power and
cooling demands.

From a technology point of view, CMOS was last
great quantum leap, need another great quantum
leap before we get relief on the cooling footprint.
Randy is right, the cooling architecture isn't
optimal.
CRS1, 20% of power goes to fans to move air past
convoluted air paths.
Spreading out the equipment is a mitigation.
multi-chassis systems will help with that.
Sun, Juniper, do you see power continuing
to grow linearly, or flatten out?
As they go to 40gig or 100gig, the power and
heat will continue to grow; more gates, more
heat, more power.  We'll hit a wall soon.

Cisco and Juniper agree, it's 6/6/06, take
note, world!

20 year shelf life for datacenter, look at where
they were 20 years ago.  10w/sq/foot back in 1986.
We've greatly increased the amount of work that
can be done since that time.   Will machines
continue to do the same amount of work, or will
we flatten out on the machine capability curve?
Element of geometric progression as you double.

Nobody will ever need more than 20kw per rack!
(Dan Golding)

Running into some roadblocks: with 100M gate ASICs
packing so much power into a single chip, growth may
not stay linear, since you can't move that much
heat out of the chip from the point source into
the system.

Q: Patrick from Akamai--McMurdo base, South Pole?
As a brighter light, a spot of hope: the hottest colo
in Terremark is finding they don't need more power,
they're coming in at less power.  They're running into
limits getting enough spindles, but the processors are
getting faster and drawing less power.  At 40 amps per
rack, racks used to be non-full; now they're able to
fill them more completely.  Not sure if everyone is
seeing this, but their power consumption is going down.
Not all doom and gloom, but for next 12 months,
at least somewhat lucky.

Chip multicore is a good leap that can help for
a bit; with Sun's Niagara multicore chips or chip
multithreading, only about 50% of power is used by
real processing, the rest is ancillary power.

Q: Rob Seastrom; BillN danced around the question;
seen it happen before.  MAE-EAST, mark 3, additional
Liebert Challengers tucked in....if one builds a
datacenter to 4kw/meter^2, how long will that be
premium space vs no-longer-up-to-par.  Does 20
year colo life even make sense?  Is the run rate
steep enough that the number is just one we're
fooling ourselves over?
Josh: none of them are running at this density.  it's
the server density that hurts; carriers aren't as much
of the pain.
Separation of infrastructure most likely; Voice, carriers,
etc. there, and separate datacenters where server floors
exist.
20 years from now--will it be obsolete?  Yes, probably.
they'll keep doing what they can to help service their
customers.

Q: Joel K, from ? -- what's coming in network equipment
to help cut power?  throttle back linecards that
aren't running at full bore?
A: If you make it automatic, service providers would
consider it; but from the bandwidth demand growth,
the exponential growth--technology isn't keeping
up, it's plateauing.  Multithreaded ideas, turn
off idle logic portions, incremental improvements,
they're one-shot efforts, won't really help fix
the slope of the curve.
May just accept that we need more space, period.
High speed/low speed fans, only kick up to high
speed during thermal extremes.  Again, both Cisco
and Juniper have explored suppressing some gear,
but customers still want 50ms protect gear response,
so they can't really shut down.
Even making heat sinks to move the heat is getting
challenging!
Force10 also talked to customers about it; turning off
portions and then turning them back on incurs latency,
requires buffering some packets, etc.; people
can be sensitive to jitter and latency.
Pushback has been fairly large from other sources.

Q: Rick Wesson--to colo vendors--when will heat/BTUs be
a part of the charges?  And to server vendors, when
will heat be a listed component upon which vendors make
sales?
A: IDCs don't charge based on heat load.
Power as proxy for heat right now.
Cooling overhead is wrapped into cost of sq ft and power;
costs from utilities have been going up 30% due to
oil prices going up.
Might make it easier to add that charge for customers.
Hardware vendors are certainly seeing power/heat
limitations in RFPs.
Building smaller systems with fewer slots to meet those
RFPs.
Customers asking for gbits/kw now from network gear.
Sun notes that in total cost of ownership terms,
the power to run a server may cost more than the
server itself.
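
[Aside from me, hypothetical numbers: a back-of-the-envelope on that TCO
claim, assuming a 400w 1u server, $0.10/kWh, a 4-year life, and roughly one
watt of cooling for every watt of IT load.]

    def lifetime_power_cost(draw_watts, years, usd_per_kwh=0.10,
                            cooling_overhead=2.0):
        # energy used by the box itself, plus the cooling to remove its heat
        kwh = draw_watts / 1000.0 * 8760 * years
        return kwh * usd_per_kwh * cooling_overhead

    print(round(lifetime_power_cost(400, 4)))   # ~$2803 -- comparable to the
                                                # purchase price of a cheap 1u box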

Q: Lane Patterson, equinix.  T640, if you redesigned it
today, what fraction of power would it use today
compared to past?
Do they engineer gear to see how much they can pack
into the same power/heat footprint?
But customers are also asking for more and more
capacity, less likely to pay for holding the line
at same power/heat as previous generation.
Cisco reiterates that we're running out of tricks;
we can hold the line for a product generation, but
after that, we're out of luck.  We may need to
shift architecture of pops going forward.
Why not build in cheaper places, and backhaul?

Q: Jared Mauch, NTT--huge customer demand; no vendors
are providing interfaces greater than 40gig; not for
next 3 years at least will there be faster links;
backhauling from remote locations requires aggregating
more and more traffic; if link speeds aren't increasing,
backhaul isn't practical.
As media companies continue deciding they can
sell movies, music, and the like online, we may
start hitting the wall; demand on all sides is
growing, and we're running out of ways to address
these challenges.

Q: Avi Freedman.  Talks to people doing lots of very
dense disk solutions.  Rackable solutions working
on high density storage racks using laptop drives.
48 disks in 4u starts to generate a lot of heat;
the Thumper product?
A 4u, 196 laptop disk rack unit?  For people who need
lots of spindles, lots of IOPS.
A: can't talk about those products, they showed
up in Jonathan's blog, but don't exist yet.
There are always going to be limitations, the
vendor will expect you can run the box in the
location you're going to put it; that is, box
has requirements, need to make sure customers
are installing boxes in areas where the thermal
issues are being considered.

Q: Phil Rosenthal, ISPrime.
The people on the panel are pretty good, not the worst
offenders.  You need to hit the 1u server people
where most power is being wasted; Dell 1650 vs
Dell 1850; processor time sitting at 90% idle
on both systems for bottom-of-the-line servers; do
we need lower-end CPUs on server lines so the
CPUs won't sit idle?
A: why not use fewer machines, but have them do
more work each?  Virtualization might help us a
bit in these areas, where we get more efficient
use of the servers already in place.

Equinix notes neutral current dropped a lot when
people use 208V instead of 120V; it generates
less heat in the datacenter as well.
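
[Aside from me, illustrative numbers: the simple part of the 208V point is
just Ohm's law -- the same load at a higher voltage draws less current, and
conductor heating goes as I^2 * R.  (The neutral-current drop comes from
feeding loads line-to-line on a 120/208V three-phase plant, which this
sketch doesn't model.)  The 5kw load and 0.05 ohm wire are assumptions.]

    def branch_current(load_watts, volts):
        return load_watts / volts             # amps, assuming unity power factor

    def conductor_heat(load_watts, volts, wire_resistance_ohms=0.05):
        i = branch_current(load_watts, volts)
        return i ** 2 * wire_resistance_ohms  # watts wasted as heat in the wiring

    for v in (120, 208):
        print(v, round(branch_current(5000, v), 1), "A,",
              round(conductor_heat(5000, v), 1), "W lost in the wire")
    # 120V: 41.7 A, 86.8 W lost;  208V: 24.0 A, 28.9 W lost -- same 5kw load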

Q: Randy; on left, crew singing I want my p2p.
will always have max heat in the rack; servers and
router vendors will keep working as hard as they
can to do what they can.
They had to leave 30% of their datacenter empty
and build a bistro in it because they couldn't
handle the heat budget.
The IDC vendors [sorry, missed the comment]

Q: Tim Elisio, new metric?
To what extent is standardization, like using
larger, more efficient power supplies, or more
efficient fans, cooling systems, etc. helping?
A: IEEE meetings are talking about some standardization,
which gets some savings; economies of scale help make
the standardized products more efficient.
Telecom/router industry is working to old standards;
may need to re-think what airflow standards are,
for example.
More dollars in a particular area helps push
research and development in that direction.
Juniper notes internals can be optimized, but
the external plant and interfaces therein need
better standardization to get economies of scale.
Everyone's using multispeed fans, use them when
you need them.
3 orgs: SPEC, the ECL forum/EPA (Energy Star for servers
and blade servers), and the Green Grid.
We'll see benchmarks coming out; they'll start asking
all vendors to compete on how much work they can do
per unit of power consumed and heat generated.
Make the hardware vendors compete on how efficiently
they use power and generate heat; we can then decide
with dollars on who will win.

Helps motivate people to optimize on the axis they
care about.

But are vendors talking to each other about how they
can use standardized gear and standardized facilities
designs more efficiently?
ASHRAE, the heating/refrigeration group, puts specs on
how machines should be cooled (front to back, etc).
But vendors don't want to help each other compete
because that hurts their business.

Dan Golding: 30 seconds, what would each person like
to see the folks on the other side do to help?

Brian: asked vendors to have more efficient power
supplies, more efficient systems that generate less
heat for them to dissipate.
Equinix--challenged by power density; customers
don't understand, they want to put things in as
small a cage as possible.  Need people to understand
heat load better!
Asks customers to use blanks in unused rack space
to isolate cold and hot aisles.  Too much leakage
from cold aisles to hot aisles.  Put blanks in!!
Josh.  Everyone building hot datacenters; would like
to see vendors come into IDCs, test them in real
world environments, put them in labs, see how they
stand up to environment, test glycol taps, water taps,
etc.  Building servers is faster than building
datacenters!

Hardware vendor: PGE worked with them to measure
the increased efficiency; blanks DO help!!
Education, amongst each other and customers.
Watts per sq ft is crazy, do it on a per rack
basis, makes it easier for customers to understand
the limitations.
Force10, if IDC groups got together, if there was
a forum or group they could work with; right now,
everyone has different requirements.  Otherwise,
always doing multiple tradeoffs, if there were
a more general consensus, easier to engineer for.
Cisco--good point, get IDCs and service providers
to meet with vendors, come up with a next generation
facility architecture to aim and build for.  Hopefully
make cooling and airflow easier, reduce the amount
of power used by fans.
Juniper--sees RFPs from customers, environment specs
are very diverse; would be good to have common
standards for customers to aim for; also, update
some outdated nomenclature, use common terminology.

Michael, Josh, Rob, Brian, thanks to all the panelists,
Steve Feldman, we've slipped by 15 minutes, start at 2:15,
everything will slip thereafter

LUNCH!