Re: Read Speed
Andre,
the problem is worse than you are estimating. First off, the host bridge
on a 486-class machine could start a read about every 6 clock cycles (i.e.
fast decode on the target device and 0 wait states). When you move to a
Pentium-class machine you get to 7 clock cycles per read, and a
Pentium III class machine is even slower.
Those numbers don't include any target latency. You are going to need
to assume at least a medium decode speed, and if you are accessing DRAM or
some other off-chip memory, you should add another 5-10 clock cycles.
So the number of clock cycles that you will roughly see for a read
access is going to be about:
     8  (new processor limitations)
     1  (decode speed)
    10  (memory wait states)
  = 19 clock cycles @ 33 MHz => 1.7 Mdwords/second => 6.9 Mbytes/second
Those are just rough estimates (and I didn't do the math until I had
chosen all of the clock cycle estimates). You will notice that my estimated
read rate is actually a bit low compared to what you are getting.
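
For anyone who wants to replay the arithmetic, here is a minimal sketch. It assumes a nominal 33 MHz clock and one dword (4 bytes) per single-data-phase read; the function and variable names are made up for illustration. It also turns the question around and asks how many clock cycles per read the measured rates in the quoted message would imply under the same assumptions.

```python
BUS_HZ = 33e6          # nominal 33 MHz PCI clock (assumption)
BYTES_PER_READ = 4     # one dword per single-data-phase read

def read_bandwidth(cycles_per_read, bus_hz=BUS_HZ):
    """Bytes/second for back-to-back single-dword reads."""
    return bus_hz / cycles_per_read * BYTES_PER_READ

def implied_cycles(mbytes_per_s, bus_hz=BUS_HZ):
    """Clock cycles per read implied by a measured rate in Mbyte/s."""
    return bus_hz * BYTES_PER_READ / (mbytes_per_s * 1e6)

# Cycle budget from the estimate above.
budget = {"host bridge": 8, "decode": 1, "memory wait states": 10}
total_cycles = sum(budget.values())
estimate = read_bandwidth(total_cycles)

# Read rates reported in the quoted message (Mbyte/s).
measured = {"440FX": 7.03, "440BX": 8.62, "KT133": 7.46}
implied = {name: implied_cycles(rate) for name, rate in measured.items()}

print(f"{total_cycles} cycles -> {estimate / 1e6:.2f} Mbyte/s")
for name, cycles in implied.items():
    print(f"{name}: about {cycles:.1f} cycles per read")
```

Under these assumptions all three measured rates work out to somewhat fewer than 19 cycles per read, which is consistent with the estimate being slightly pessimistic.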
If you try going to a 66 MHz bus speed, your bus frequency will double,
but ALL of your latencies (counted in clock cycles) will double also, and
you will be running at the same bandwidth.
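
A quick sanity check of that claim, as a sketch only: it assumes every latency is a fixed amount of wall-clock time, so the cycle count doubles exactly when the clock doubles. The 19-cycle figure is the rough estimate from above and the helper names are hypothetical.

```python
def read_time_ns(cycles, bus_hz):
    """Wall-clock time of one read, in nanoseconds."""
    return cycles / bus_hz * 1e9

def bandwidth_mbytes(cycles, bus_hz, bytes_per_read=4):
    """Mbyte/s for back-to-back single-dword reads."""
    return bus_hz / cycles * bytes_per_read / 1e6

CYCLES_33 = 19             # rough estimate at 33 MHz
CYCLES_66 = 2 * CYCLES_33  # assumption: every latency doubles in cycles

t33 = read_time_ns(CYCLES_33, 33e6)
t66 = read_time_ns(CYCLES_66, 66e6)
bw33 = bandwidth_mbytes(CYCLES_33, 33e6)
bw66 = bandwidth_mbytes(CYCLES_66, 66e6)

print(f"33 MHz: {t33:.0f} ns per read, {bw33:.2f} Mbyte/s")
print(f"66 MHz: {t66:.0f} ns per read, {bw66:.2f} Mbyte/s")
```

If some of the latencies scale with the bus clock instead of staying fixed in time, the 66 MHz number would come out somewhat better than this.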
Lesson to be learned: use DMA if you need peripheral to host transfers
to be fast.
-- Neal
On Thu, 5 Apr 2001, André David wrote:
> Hi!
>
> I am working on a group developing a PCI board for data acquisition.
> Since our priority is getting it running, busmastering capabilities for
> such things as DMA to the host memory are not on the front line of
> development.
>
> Now, since I'm the guy behind the device driver, I have done some
> benchmarking with a simple device driver (in Linux, of course) using a
> standard PCI VGA adapter.
> This "driver" just uses memcpy() to transfer some data between the main
> memory and the board's framebuffer.
>
> I have tried three different processor/chipset combinations and the
> results I get, are:
>
> (results after BIOS and MTRR parameters tweaking)
>
> Chipset (CPU)              Reading (Mbyte/s)   Writing (Mbyte/s)
> Intel 440FX (PII@233)      7.03                36.16
> Intel 440BX (2*PII@400)    8.62                102.4
> VIA KT133 (Athlon@900)     7.46                119.6
>
> Now this points to a pattern in which the north bridge seems unable to
> read with a reasonable speed from the board. I know writing is always
> easier than reading (from the specs, a single data phase read is slower
> than a single data phase write: 4 clock cycles vs. 3).
>
> The north bridge behaviour is inadmissible even if we assume that all
> the reads are single data phase reads (4 clock cycles), with even medium
> devsel (1 more clock cycle lost) and a wait-state from the VGA board
> (another clock cycle lost), because this would give a total of 6 clock
> cycles, or 22 Mbyte/s total bandwidth.
>
> So my questions are:
>
> - Since it looks like north bridges have always been like this, has
> anyone found one that is not?
> - Is it admissible (logical) that the north bridge is like this?
> - Since I have only talked about commodity PC's, could there be
> something on the industrial market that does not suffer from this
> apparent "feature"?
>
> Thanks in advance for all comments,
>
> Andre David
>
--
-- Neal Palmer
The Dini Group
1010 Pearl St #6
La Jolla, CA 92037
(858) 454-3419 x16
(858) 454-1728 (Fax)
- References:
- Read Speed
- From: André David <Andre.David@cern.ch>