
Re: Read Speed




Hi Neal,
Your second claim: "But from a single read point of view, a 33Mhz and 66Mhz
system behave roughly at the same bandwidth (not including 66Mhz systems
slowed down to 33Mhz)" is still wrong.

I have been designing a 66MHz/64-bit PCI core and an SDRAM controller system
for almost a year. The project is close to completion, and all simulations
run very well.

1. Whether it is a single read or a burst read, at 66MHz or 33MHz, the
response time of a read operation, measured in clocks, is fixed for a given
design.
Counting from clock 1, when nFRAME is asserted, my SDRAM design drives data
onto the PCI bus on clock 15, whether it is a single read or a burst read,
at 66MHz or 33MHz. The SDRAM read logic makes no distinction between single
and burst reads: every read is treated as a burst. When a single read is
detected, the PCI signals are generated differently from those of a burst,
but the internal SDRAM read operations are the same. Over-read data is
pushed into an internal ReadFIFO for the next transaction, since its
starting address is likely to be contiguous.

2. With a 66MHz PCI bus, the SDRAM is upgraded to 100MHz parts. That means
that, after the fixed read overhead, one data word appears on the PCI bus
per clock, so data bandwidth is proportional to the number of clocks per
second. At 66MHz, 1 us has 66 clocks, while at 33MHz, 1 us has only 33
clocks. So a 66MHz PCI bus has double the data bandwidth of a 33MHz one,
whether the reads are single or burst.
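
As a rough sketch of the arithmetic in points 1 and 2: the code below assumes
the first data word appears on clock 15 at either frequency, a 64-bit
(8-byte) data path, and one word per clock after that. The function name and
its parameters are hypothetical, chosen only for illustration.

```python
def read_bandwidth_mbytes(bus_mhz, words, first_data_clock=15, bytes_per_word=8):
    """Estimated read bandwidth in Mbyte/s for one transaction.

    Assumes the first data word appears on clock `first_data_clock`
    (counting from clock 1) regardless of bus frequency, and that each
    further word takes one more clock.
    """
    total_clocks = first_data_clock + (words - 1)
    time_us = total_clocks / bus_mhz  # the bus runs bus_mhz clocks per microsecond
    return words * bytes_per_word / time_us  # bytes per microsecond == Mbyte/s


# Compare a single read with an 8-word burst at both frequencies.
for words in (1, 8):
    bw33 = read_bandwidth_mbytes(33, words)
    bw66 = read_bandwidth_mbytes(66, words)
    print(f"{words} word(s): {bw33:.1f} Mbyte/s at 33MHz, {bw66:.1f} Mbyte/s at 66MHz")
```

Under these assumptions the 66MHz figure is twice the 33MHz figure for any
transaction length, because the clock count does not depend on frequency.
The result holds only as long as that clock count really is the same at both
frequencies.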

Weng Tianxiang

wtx@umem.com
wengtianxiang@yahoo.com

Micro Memory Inc.
9540 Vassar Avenue
Chatsworth, CA 91311
Tel: 818-998-0070
Fax: 818-998-4459


----- Original Message -----
From: Neal Palmer <neal@dinigroup.com>
To: Weng <wtx@umem.com>
Cc: <pci-sig@znyx.com>
Sent: Thursday, April 05, 2001 5:19 PM
Subject: Re: Read Speed



Weng,
  Start with the assumption that you have 2 separate designs in the same
technology: one designed for 33Mhz operation and one designed for 66Mhz
operation.  You should then find that the 33Mhz device does twice the
amount of work per clock cycle that the 66Mhz device does.
  We are only talking about single read accesses (not bursts).
  Also if you are accessing SDRAM, it has a fixed latency for a read
(about 70ns).  That latency is going to be twice as many clock cycles at
66Mhz as it would be at 33Mhz.
  If you take a look at read access on PCI to main memory on a
motherboard, you will find that at 33Mhz you will get data back from the
motherboard within 16 clock cycles.  But on a 66Mhz bus you don't get data
back until about 25-35 clocks (depending on the chipset).  The exact same
thing applies to peripheral devices when doing a read access.
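
  To put rough numbers on that, here is a small sketch that converts a fixed
latency in nanoseconds into whole bus clocks, and converts the motherboard
clock counts above into elapsed time. It uses nominal 33 and 66 MHz clocks,
and the helper names are made up for illustration.

```python
import math


def latency_in_clocks(latency_ns, bus_mhz):
    """Whole bus clocks needed to cover a fixed latency given in ns."""
    return math.ceil(latency_ns * bus_mhz / 1000)


def clocks_to_ns(clocks, bus_mhz):
    """Elapsed time in ns for a given number of bus clocks."""
    return clocks * 1000 / bus_mhz


# A fixed 70ns SDRAM latency expressed in clocks at each bus speed.
print(latency_in_clocks(70, 33), latency_in_clocks(70, 66))

# Motherboard read latency: 16 clocks at 33, versus 25-35 clocks at 66.
print(clocks_to_ns(16, 33))
print(clocks_to_ns(25, 66), clocks_to_ns(35, 66))
```

  With these figures, 16 clocks at 33 is roughly 485ns, while 25-35 clocks at
66 is roughly 380-530ns, so the elapsed time for a single read lands in the
same range on both buses.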

  Everything changes when you start talking about writes or about read
bursts.  But from a single read point of view, a 33Mhz and 66Mhz system
behave roughly at the same bandwidth (not including 66Mhz systems slowed
down to 33Mhz).

-- Neal

On Thu, 5 Apr 2001, Weng wrote:

>
> Hi Neal,
> Your claim that "If you try going to a 66Mhz bus speed, your bus frequency
> will double, but ALL of your latencies will double also, and you will be
> running at the same bandwidth." is wrong.
>
> When switching from 33MHz to 66MHz, the number of clocks spent waiting for
> read data is the same at both frequencies, but data bandwidth actually
> doubles.
>
> Weng Tianxiang
>
> wtx@umem.com
> wengtianxiang@yahoo.com
>
> Micro Memory Inc.
> 9540 Vassar Avenue
> Chatsworth, CA 91311
> Tel: 818-998-0070
> Fax: 818-998-4459
> ----- Original Message -----
> From: Neal Palmer <neal@dinigroup.com>
> To: André David <Andre.David@cern.ch>
> Cc: <pci-sig@znyx.com>
> Sent: Thursday, April 05, 2001 11:22 AM
> Subject: Re: Read Speed
>
>
>
> Andre,
>
>   the problem is worse than you are estimating.  First off the host bridge
> on a 486 class machine could start a read about every 6 clock cycles (i.e.
> fast decode on the target device and 0 wait states).  When you move to the
> Pentium class machine you get to 7 clock cycles per read.  And a
> PentiumIII class machine is even slower.
>
>   Those numbers don't include any target latency.  You are going to need
> to assume at least a medium decode speed, and if you are accessing DRAM or
> some other off-chip memory, you should add another 5-10 clock cycles.
>
>   So the number of clock cycles that you will roughly see for a read
> access is going to be about:
> 8 (new processor limitations)
> 1 decode speed
> 10 memory wait states
>         = 19 clock cycles @33Mhz => 1.7Mdwords/second => 6.9Mbytes/second
>
>   Those are just rough estimates (and I didn't do the math until I had
> chosen all of the clock cycle estimates).  You notice that my estimates
> are actually a bit low compared to what you are getting.
>
>   If you try going to a 66Mhz bus speed, your bus frequency will double,
> but ALL of your latencies will double also, and you will be running at the
> same bandwidth.
>
>   Lesson to be learned: use DMA if you need peripheral to host transfers
> to be fast.
>
> -- Neal
>
> On Thu, 5 Apr 2001, André David wrote:
>
> > Hi!
> >
> > I am working in a group developing a PCI board for data acquisition.
> > Since our priority is getting it running, busmastering capabilities for
> > such things as DMA to the host memory are not on the front line of
> > development.
> >
> > Now, since I'm the guy behind the device driver, I have done some
> > benchmarking with a simple device driver (in Linux, of course) using a
> > standard PCI VGA adapter.
> > This "driver" just uses memcpy() to transfer some data between the main
> > memory and the board's framebuffer.
> >
> > I have tried three different processor/chipset combinations and the
> > results I get, are:
> >
> > (results after BIOS and MTRR parameters tweaking)
> >
> >                              Reading (Mbyte/s)    Writing (Mbyte/s)
> > Intel 440FX (PII@233)              7.03                36.16
> > Intel 440BX (2*PII@400)            8.62               102.4
> > VIA KT133 (Athlon@900)             7.46               119.6
> >
> > Now this points to a pattern in which the north bridge seems unable to
> > read with a reasonable speed from the board. I know writing is always
> > easier than reading (from the specs a single data phase read is slower
> > than a single data phase write (4 clock cycles vs. 3)).
> >
> > The north bridge behaviour is inadmissible even if we assume that all
> > the reads are single data phase reads (4 clock cycles), with even medium
> > devsel (1 more clock cycle lost) and a wait-state from the VGA board
> > (another clock cycle lost), because this would give a total of 6 clock
> > cycles, or 22 Mbyte/s total bandwidth.
> >
> > So my questions are:
> >
> > - Since it looks like north bridges have always been like this, has
> > anyone found one that is not?
> > - Is it admissible (logical) that the north bridge is like this?
> > - Since I have only talked about commodity PC's, could there be
> > something on the industrial market that does not suffer from this
> > apparent "feature"?
> >
> > Thanks in advance for all comments,
> >
> > Andre David
> >
>
> --
> -- Neal Palmer
>
> The Dini Group
> 1010 Pearl St #6
> La Jolla, CA 92037
> (858) 454-3419 x16
> (858) 454-1728 (Fax)
>
>

--
-- Neal Palmer

The Dini Group
1010 Pearl St #6
La Jolla, CA 92037
(858) 454-3419 x16
(858) 454-1728 (Fax)