Re: Re[2]: PCI latency and VGA drivers
- To: Mailing List Recipients <pci-sig-request@znyx.com>
- Subject: Re: Re[2]: PCI latency and VGA drivers
- From: "John R Pierce" <pierce@hogranch.com>
- Date: Tue, 10 Mar 1998 21:55:33 -0800
- Resent-Date: Tue, 10 Mar 1998 22:38:50 -0800
- Resent-From: pci-sig-request@znyx.com
- Resent-Message-ID: <"b-W_i3.0.jc1.USY1r"@electra.znyx.com>
- Resent-Sender: pci-sig-request@znyx.com
>It's simple -- WinBench. You get better graphics benchmarks if you push
>that graphics card as hard as physically possible. Which means, ideally
>(in the minds of the graphics card folks), you have the CPU locked to
>the graphics card over PCI, to the exclusion, as much as possible, of
>all else. This has to be the default in the graphics driver, because the
>benchmarks are run from off-the-shelf cards. This should be a
>strictly-Windows thing.
This situation was complicated by two things. The write FIFOs were too darn
small on most graphics chips, especially the ones designed a few years ago
or based on those designs... And stopping to poll a status port before each
write exacts an extreme penalty in PCI bus throughput (suddenly, you are
looking at 5 or 6 or more PCI clocks per 32-bit word written instead of one
dword per clock in a burst...). You don't want to know the lengths I used to
go to as a graphics driver writer to try to resolve this. It was
especially pronounced on the various S3 chips I worked on, up to and
including the 968. (I'm not currently in the GFX biz, so I am somewhat out
of touch with the current chips; one would hope they've fixed this.)
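To put rough numbers on that penalty, here is a back-of-the-envelope sketch
in C. It assumes a 33 MHz, 32-bit PCI bus, which the post does not state,
and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bus parameters (not given in the post): 33 MHz, 32-bit PCI. */
#define PCI_CLOCK_HZ    33000000u
#define BYTES_PER_DWORD 4u

/* Bytes per second written when each dword costs `clocks_per_dword`
 * PCI clocks. Hypothetical helper, for illustration only. */
uint32_t pci_write_rate(uint32_t clocks_per_dword)
{
    assert(clocks_per_dword > 0);
    return (PCI_CLOCK_HZ / clocks_per_dword) * BYTES_PER_DWORD;
}
```

Under those assumptions, pci_write_rate(1) gives 132,000,000 bytes/s for an
ideal burst, while pci_write_rate(5) and pci_write_rate(6) give 26,400,000
and 22,000,000 bytes/s, i.e. roughly a fifth to a sixth of the burst rate.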
Classic problem... scrolling a window in a text application... First you
issue a large bitblt command, which might have to move 3 MBytes or more of
memory. Next you write text, which consists of a mono memory bitmap ->
screen transfer... to do a character, you write a command, about 4 words of
parameters (starting X, Y, width, height), then several dwords of packed
monochrome font pixel data. Meanwhile, that bitblt is still executing and
the command FIFO is empty... but if you start writing those characters
without waiting for chip IDLE, the 16-word FIFO fills up in a big hurry, and
blam, you've blocked the bus. If you wait for chip idle, then the chip's
graphics engine sits there with nothing to do until you've seen that it's
idle and done about 8 writes. This can easily take several microseconds,
which is an eternity on a 200MHz processor.
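The trade-off above can be sketched with a toy model in C. Everything here
is hypothetical: the struct, the function names, and the one-operation-per-
tick timing are made up for illustration and do not describe any real
chip's registers. Only the 16-entry FIFO depth comes from the text. The
model compares writing blindly into the FIFO with reading a free-slot count
and then writing only that many words, which is one possible middle ground
and not necessarily what any particular driver did.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_DEPTH 16u  /* the 16-word FIFO mentioned above */

/* Hypothetical accelerator model, not a real register set. */
struct gfx_model {
    uint32_t fifo_used;   /* entries currently queued                 */
    uint32_t busy_ticks;  /* ticks until the running bitblt finishes  */
    uint32_t stall_ticks; /* ticks the bus was held by a full FIFO    */
    uint32_t poll_reads;  /* status-port reads issued by the driver   */
};

/* Advance one tick: the blit counts down first; once the engine is
 * free it consumes one queued entry per tick. */
static void tick(struct gfx_model *g)
{
    if (g->busy_ticks > 0)
        g->busy_ticks--;
    else if (g->fifo_used > 0)
        g->fifo_used--;
}

/* Write n words without checking status. A write to a full FIFO holds
 * the bus (counted in stall_ticks) until a slot opens. */
void write_blind(struct gfx_model *g, uint32_t n)
{
    while (n-- > 0) {
        while (g->fifo_used == FIFO_DEPTH) {
            g->stall_ticks++;
            tick(g);
        }
        g->fifo_used++;
        tick(g);
    }
}

/* Read the free-slot count, then write at most that many words before
 * polling again. Never stalls the bus, but pays for the status reads. */
void write_polled(struct gfx_model *g, uint32_t n)
{
    while (n > 0) {
        uint32_t free_slots = FIFO_DEPTH - g->fifo_used;
        uint32_t burst = free_slots < n ? free_slots : n;

        g->poll_reads++;
        tick(g);                      /* the status read costs a tick */
        for (uint32_t i = 0; i < burst; i++) {
            g->fifo_used++;
            tick(g);
        }
        n -= burst;
    }
}
```

For example, start both models with an empty FIFO and busy_ticks = 100, and
write 27 words (three characters, assuming 1 command + 4 parameters + 4
font dwords each; the 4 font dwords are a guess at "several"). In this
model the blind version ends with stall_ticks == 85, all of it time the bus
is blocked, while the polled version ends with stall_ticks == 0 but spends
its time on status reads instead. Neither is free, which is the dilemma.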
-jrp