The Quest for Better Performance
By Johan De Gelas
Friday, April 21, 2000 3:07 AM EDT
Performance: you can never get enough. We all crave more, and
we expect the industry to build faster and faster systems, but designing
and building a high-speed system has never been as complex as it is right now.
You need more than a recklessly fast CPU and strong video card.
Performance is limited by certain bottlenecks, and that is what this article
is all about. We will try to find out exactly what the bottlenecks are
in a top-notch system. Do we need more memory? Faster memory?
Faster video cards? A faster Front Side Bus (FSB)? Bigger, faster L2-caches?
Which of these solutions will have the most impact?
Whether you are a hardware enthusiast, a system engineer, or a novice
searching for the best components, we hope to enlighten you with some astonishing results. In this article we focus on the Athlon platform, as I had the opportunity to test this system for weeks. We will serve up quite a
few benchmarks on the Athlon 1000 system that AMD sent us.
I have to warn you, though: this is not the usual review of the Athlon
1000. The purpose of this article is to understand what
is going on in the system, what slows it down and what speeds it up. Let us get started.
More Clockspeed
Never skimp on CPU power: that is probably one of the most basic rules
in building a fast system. So we buy higher and higher clocked CPUs. We
all know that the faster-clocked Athlons are showing diminishing returns,
and we assume that this is the result of higher cache dividers. Is the L2-cache the only bottleneck in today's Athlon systems, or is there more than meets the eye? Let us investigate this further, but first we will try to understand how bad the bottlenecks really are.
I ran three different game benchmarks: an OpenGL first-person shooter
(Quake 3), a DirectX one (Expendable), and a polygon-rich (25,000 polygons
per frame) OpenGL benchmark (Indy3D's Animation benchmark).
The Athlon 861 was an Athlon 800 that we overclocked by raising the
bus speed to 107 MHz. All CPUs were tested on the same configuration,
and the memory bus/FSB ratio was 4:3. In other words, in the case of the Athlon
650, 850 and 1000, the memory was running at 133 MHz. The Athlon 861 was
running with a 214 MHz FSB and 142 MHz SDRAM. The Athlon 950 was an Athlon
1000 underclocked to 950 MHz by lowering the bus speed to 95 MHz.
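To make the clock arithmetic above easier to follow, here is a minimal sketch that derives the FSB and memory clocks from the bus clock. It assumes only what the text states: the FSB figure is twice the bus clock (107 MHz gives 214 MHz) and the memory runs at 4:3 of the bus clock. The 100 MHz stock bus is inferred from the 200 MHz FSB mentioned later, and the function name `derived_clocks` is made up for illustration.

```python
from fractions import Fraction

# Ratios taken from the text: the FSB figure is twice the bus clock
# (107 MHz bus -> 214 MHz FSB) and memory runs at 4:3 of the bus clock.
FSB_FACTOR = 2
MEMORY_RATIO = Fraction(4, 3)


def derived_clocks(bus_mhz):
    """Return (fsb_mhz, memory_mhz) for a given bus clock in MHz."""
    fsb = bus_mhz * FSB_FACTOR
    memory = float(bus_mhz * MEMORY_RATIO)
    return fsb, memory


if __name__ == "__main__":
    # Bus clocks for the three kinds of configuration described above.
    configs = [
        ("stock (Athlon 650, 850, 1000)", 100),
        ("Athlon 861 (overclocked 800)", 107),
        ("Athlon 950 (underclocked 1000)", 95),
    ]
    for label, bus in configs:
        fsb, memory = derived_clocks(bus)
        print(f"{label}: bus {bus} MHz, FSB {fsb} MHz, memory {memory:.1f} MHz")
```

Note that 107 x 4/3 comes out at about 142.7 MHz, which the text quotes as 142 MHz, and that the underclocked Athlon 950 would also run its memory slower than stock if the 4:3 ratio was kept.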

The results are surprising, don't you think? Games run only between
10 and 20 percent faster on the Athlon 1000 than on the Athlon 650, even though the
former is clocked 54% higher than the latter! Talk about diminishing returns...
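One way to put a number on these diminishing returns is a "scaling efficiency": the performance gain divided by the clock gain. The sketch below is only an illustration using the figures quoted above (650 to 1000 MHz, 10 to 20 percent faster games); the metric and the function names are my own shorthand, not something the benchmarks report.

```python
def relative_gain(old, new):
    """Relative increase from old to new, e.g. 0.25 for +25%."""
    return new / old - 1.0


def scaling_efficiency(clock_old, clock_new, perf_gain):
    """Fraction of the clock increase that shows up as extra performance."""
    return perf_gain / relative_gain(clock_old, clock_new)


if __name__ == "__main__":
    clock_gain = relative_gain(650, 1000)
    print(f"clock gain: {clock_gain:.0%}")
    # The article quotes game speedups between 10 and 20 percent.
    for perf_gain in (0.10, 0.20):
        eff = scaling_efficiency(650, 1000, perf_gain)
        print(f"game gain {perf_gain:.0%} -> scaling efficiency {eff:.2f}")
```

With these inputs the efficiency lands between roughly 0.19 and 0.37, so well under half of the extra clockspeed turns into extra frames.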
Back to our main goal: finding the bottlenecks. Is the Athlon 1000 only
slightly faster because the cache is clocked 3 times slower than the CPU? Let us try to explain each benchmark.
In the case of Quake 3 High Quality (800x600 pixels, 32-bit color),
the problem is obvious. The video card's fillrate and bandwidth are
too limited to show a major difference.
What about Q3 Normal? That is not exactly a fillrate-limited
test. Quake 3 at the "Normal" setting means that the video card has to render
frames of 640x480 pixels, bilinear filtered, with only 16-bit color. A
decent video card like our Creative Labs Annihilator Pro (GeForce, 32
MB DDR RAM) has no problem whatsoever with these kinds of resolutions
and color depths. If this were simply a fillrate-limited test, then
the 861 would not show a 6% difference with the Athlon 800, which is almost
the difference in clockspeed. Also, notice that the Athlon 861 is as fast
as the Athlon 1000. Why? The L2-cache of the Athlon 861 works
at 40 percent of CPU speed, or 344 MHz. The Athlon 1000 works with a 333
MHz L2-cache. Or is it the memory, which works at 142 MHz compared to the Athlon
1000's 133 MHz (with the FSB at 214 MHz instead of 200 MHz)? That is a lot of questions,
and we will investigate them further.
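To see the 861 versus 1000 puzzle at a glance, the sketch below puts the subsystem clocks of both configurations side by side. The L2 ratios (40 percent, and one third) and the bus clocks come from the text; the dictionary layout and the name `subsystem_clocks` are just illustrative choices.

```python
from fractions import Fraction

# name: (core clock in MHz, L2-cache ratio, bus clock in MHz), all from the text.
SYSTEMS = {
    "Athlon 861": (861, Fraction(2, 5), 107),   # L2 at 40 percent of core speed
    "Athlon 1000": (1000, Fraction(1, 3), 100),  # L2 at one third of core speed
}


def subsystem_clocks(core_mhz, l2_ratio, bus_mhz):
    """Return the core, L2-cache, FSB and memory clocks in MHz."""
    return {
        "core": float(core_mhz),
        "l2_cache": float(core_mhz * l2_ratio),
        "fsb": float(bus_mhz * 2),                   # FSB figure is twice the bus clock
        "memory": float(bus_mhz * Fraction(4, 3)),   # 4:3 memory ratio
    }


if __name__ == "__main__":
    for name, params in SYSTEMS.items():
        clocks = subsystem_clocks(*params)
        print(name, {key: round(value, 1) for key, value in clocks.items()})
```

The Athlon 861 loses on core clock but wins on every other clock in the list, which is exactly why the benchmark alone cannot tell us which of the three (L2-cache, FSB, or memory) is making up the difference.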
The Expendable and Indy3D Animation benchmarks display similar
behavior. Let us take a look at a few applications that I have benchmarked.

The polygon-rich benchmarks, MCAD40 and MCAD150, are not really impressed
by the muscles of the Athlon 1000. It is clear that there is a huge bottleneck
somewhere. Granted, there are applications that will benefit much more
from higher clockspeeds, but they are rather rare. Games, 3D applications,
and the applications that the content creation benchmark is based upon
are what most computer users out there actually run.
The content creation benchmark, with tests like Photoshop, is known to be
very memory intensive, yet it seems to appreciate the extra
power of the Athlon 1000 a bit more. That is quite surprising when you consider that these tests involve quite a bit of disk activity, which should
flatten the curve. Still, the content creation benchmark scales better
with clockspeed than the Quake 3 benchmark does. Amazing, considering that Quake 3 has no hard-disk activity at all.
To make things complete, here is a comparison between the Athlon 850 and the
Athlon 1000 with the SPEC ViewPerf benchmark suite.

| Test | Athlon 850 | Athlon 1000 | Performance increase |
| ADWAV | 80.33 | 80.36 | 0% |
| Drv6 | 35.09 | 36.99 | 5% |
| Dx05 | 48.21 | 50.53 | 5% |
| Light | 3.924 | 4.131 | 5% |
| Pro-CDRS | 13.54 | 13.87 | 2% |

Nothing. Zip. The benchmark couldn't care less whether we have an
850 or a 1000 MHz CPU. So, now that we have clearly demonstrated that the
fastest-clocked Athlon systems contain quite a few bottlenecks, let us
pinpoint the problems. To do that, we need to know what is going on in
our system.
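For readers who want to check the ViewPerf percentages themselves, here is a small sketch that recomputes them from the scores in the table and sets them against the clock increase from 850 to 1000 MHz. The scores are copied from the table; the helper `percent_increase` is only an illustrative name.

```python
# Scores copied from the SPEC ViewPerf table: (Athlon 850, Athlon 1000).
VIEWPERF = {
    "ADWAV": (80.33, 80.36),
    "Drv6": (35.09, 36.99),
    "Dx05": (48.21, 50.53),
    "Light": (3.924, 4.131),
    "Pro-CDRS": (13.54, 13.87),
}


def percent_increase(old, new):
    """Percentage increase from old to new."""
    return (new / old - 1.0) * 100.0


if __name__ == "__main__":
    print(f"clock: +{percent_increase(850, 1000):.1f}%")
    for test, (athlon_850, athlon_1000) in VIEWPERF.items():
        print(f"{test}: +{percent_increase(athlon_850, athlon_1000):.1f}%")
```

The clock goes up by about 17.6 percent, while none of the five tests gains more than about 5 percent.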
All Content is Copyright (C) 1998-2003 Ace's Hardware. All Rights Reserved.