
RE: Behavior of retries




In many systems, particularly data networking, most of the critical data
movement is handled by DMA transfers.  In these systems, deterministic
latency when arbitrating for the bus turns out to be THE critical design
issue.  Data FIFOs and all sorts of other stuff depend on this.

PCI retries after some maximum number of clocks help these systems.
OK, if the target can provide data after 20 clocks but the spec says
retry at 16, we've given up some bus bandwidth.
But if a worst-case target cycle is more like 100 clocks
(ok this is remote but possible if you have multiple bridges
and a burst going somewhere) then we have a major improvement.
If nobody else is waiting, and the initiator continuously retries,
what's the harm? 
The master has to wait anyway.
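
To put rough numbers on the trade-off above, here is a minimal Python
sketch.  It is not taken from the PCI spec: the function names, the
arrival rate of one byte every two clocks, and the simplification that a
waiting DMA device only has to ride out a single bus tenure are all
assumptions made for illustration.

```python
import math

def longest_bus_hold(target_ready_clocks, retry_limit=None):
    """Longest unbroken stretch (in clocks) one target access keeps the bus.

    retry_limit=None models a target that simply holds the bus until its
    data is ready; otherwise the target signals retry at retry_limit clocks.
    """
    if retry_limit is None:
        return target_ready_clocks
    return min(target_ready_clocks, retry_limit)

def min_fifo_bytes(bytes_per_clock, worst_wait_clocks):
    """Bytes a DMA device must buffer while it waits (assumed simple model)."""
    return math.ceil(bytes_per_clock * worst_wait_clocks)

if __name__ == "__main__":
    # Hypothetical targets that need 20 and 100 clocks to produce data.
    for ready in (20, 100):
        held = longest_bus_hold(ready)
        retried = longest_bus_hold(ready, retry_limit=16)
        print(ready, held, retried,
              min_fifo_bytes(0.5, held), min_fifo_bytes(0.5, retried))
```

With these assumed numbers, the 100-clock target forces a 50-byte FIFO if
it holds the bus but only 8 bytes if it retries at 16 clocks, while the
20-clock case barely changes (10 bytes versus 8).  The sketch ignores the
latency contributed by other masters.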

Yes, the above comments completely ignore latency effects of other
DMA devices in the system (that's a different conversation).
But very long target cycles were a common sin in older bus architectures,
and the PCI designers provided a quite reasonable solution.

Most PC applications share features that do not play to PCI's strengths:
1)	Performance is bound purely by the CPU,
2)	the CPU gets to resources via direct target access over PCI, and
3)	the CPU DOES NOT SUPPORT BURST TRANSFERS over PCI to targets
	(ok, there are some exceptions to this, but in general it is true
	for the x86 family).
Pretty funny for a bus architecture that depends on bursts for
performance, don't you think?

But if you follow it through, you will probably end up agreeing with the
PCI architects, who (although they may not say it explicitly) clearly
believe that:
1)	data that is critical to CPU performance belongs in the CPU's
	LOCAL memory, NOT on the PCI bus
2)	DMA and/or intelligent peripherals are needed to move the data
	to/from there
3)	Peripherals become responsible for buffering and bursting features
	appropriate for their application.
4)	CPUs should initiate PCI cycles to targets only rarely, for
	initialization or perhaps for minimal interrupt status.
And yes, this is not perfect either, but welcome to the next level of
performance....







	-----Original Message-----
	From:	Mike Dini [SMTP:mdini@dinigroup.com]
	Sent:	Thursday, March 26, 1998 4:35 PM
	To:	Mailing List Recipients
	Subject:	Behavior of retries

	>> No, there are not obvious answers here.  We have found that the best
	>> behavior of a slow device is to simply hold the bus until it can
	>> generate the data and ignore the disconnect requirement of the spec.
	>
	>Hmm, interesting philosophy: "We've noticed that under our specific
	>situation that if we break PCI compliance then we work better, therefore
	>it is ok to break compliance."
	>

	No, reality.  In embedded systems, sometimes the PCI spec is wrong, and
	therefore in order to get a system to work, it is necessary to violate the
	PCI specification.  You imply that you have a problem with this philosophy.
	I don't.

	>> This
	>> actually significantly *increases* the performance of a PCI bus by
	>> eliminating the retries which chew up most of the bandwidth of the bus.
	>
	>Sometimes, yes, reducing retries would increase throughput.  However, that
	>is highly dependent on the other devices on the bus, and under other
	>circumstances might severely restrict throughput.
	>
	>For example, suppose there is a bus segment with 4 devices on it.  Device
	>A only talks with B and Device C only talks with D.  If device B is slow
	>and breaks the spec by holding on to the bus for long periods of time,
	>then it is interfering with the transfers that could happen between C & D.
	>

	I agree, in this situation.

	>> Further, if an initiator attempts a different data phase without completing
	>> the first, there is a chance that the target device does not decode the
	>> full retry address and returns the data from the first phase.  Several
	>> off-the-shelf controllers behave this way.  Note that on retries, a target
	>> should decode the entire address, not just the BAR to determine if the
	>  ^^^^^^
	>  ^^^^^^ This should be "must"; it is a requirement.  PCI 2.1, Section
	>3.3.3.3.2, page 51.  The only exception is that reads from prefetchable
	>memory space may ignore the byte enables.
	>
	>If a device does not decode the entire address then it is broken, plain
	>and simple.
	>

	I agree, but not all off-the-shelf controllers and bridges behave this way.
	There are, therefore, hundreds of thousands of 'broken devices' in the
	field today.

	>> initiator is attempting to get the data that the target originally
	>> responded to with a retry.  I believe that an initiator should be able to
	>> do other stuff while it is waiting for the data, but in practical reality,
	>> I have not seen a system that will work if an initiator does this. The
	>> problem is compounded with multiple masters.
	>
	>Should devices be designed to the specification, or to work with other
	>*broken* devices?

	The DEC bridges *DO NOT* fully meet the specification.  This is not the
	only example.  Therefore it is necessary in a real system to design to all
	known broken devices, even if this means violating the spec.  PLX and AMCC
	also have known problems which force compromise.  Several SCSI controllers
	have unique behavior.  Might I note that holding the bus for a few extra
	clocks is not a very serious *violation* of the spec?

	dini
	      
	Mike Dini
	The DINI Group
	1010 Pearl Street, Suite #6
	mdini@dinigroup.com
	http://www.dinigroup.com
	phone (619) 454-3419
	FAX (619) 454-1728
	cellular (619) 888-9173
	home (619) 454-7983