Re: "out-of-context" delayed completions
- To: Mailing List Recipients <pci-sig-request@znyx.com>
- Subject: Re: "out-of-context" delayed completions
- From: "David O'Shea" <daveo@corollary.com>
- Date: Mon, 27 Jan 1997 17:32:44 GMT
- Old-Return-Path: <daveo@corollary.com>
- Resent-Date: Mon, 27 Jan 1997 17:32:44 GMT
- Resent-From: pci-sig-request@znyx.com
- Resent-Message-Id: <"usTIK1.0.Z54.LSExo"@dart>
- Resent-Sender: pci-sig-request@znyx.com
Joe,
See my not very helpful comments below.
At 12:08 PM 1/23/97 MST, Joe Cowan wrote:
>Greetings,
>
>If a device is reset while it has an outstanding delayed request, I
>assume the device won't complete the delayed transaction, leaving an
>"out-of-context" delayed completion lying around in an HB or PPB. This
>"out-of-context" delayed completion might later be used to complete a
>delayed transaction under a different context, effectively corrupting
>the transaction. When a driver does a reset, how can it overcome this
>problem? Below is a specific example of the problem, though the problem
>also can occur with delayed writes as well as delayed reads.
>
> - A device attempts a DMA read from a work queue in host memory, and is
> retried by the HB since the HB is handling it as a delayed
> transaction.
>
> - The device driver issues a reset to the device, via a PIO write. The
> timing is such that the PIO write is on its way down to the device
> ahead of the DMA DRC.
>
> - The reset reaches the device and causes it to forget about completing
> the earlier DMA read from host memory.
>
> - The driver sets up a new work queue (context) in host memory.
>
> - The DMA DRC for the earlier context remains in the HB.
>
> - The driver restarts the device and the device resumes. The device
> attempts a DMA read to the same location it did just before the
> reset, and the HB returns the now "out-of-context" data in the DMA
> DRC, effectively corrupting the transaction.
>
>Joe Cowan
Your device is broken and does not comply with revision 2.1 of the
specification. The specification says pretty clearly that a 2.1
compliant device MUST retry FOREVER any RETRIED transaction, until
the request is finally completed by the HB or PPB, specifically so
that the PPB or HB will not hang. The only exception is when
the ENTIRE PCI bus is reset using PCIRST#. In that case, the HB
or PPB is also reset and forgets about the uncompleted Delayed Request.
To make the device compliant, your programmed I/O, vendor-specific
"reset" would have to wait until all outstanding requests which have
been retried have been completed, and only then undergo the internal
reset operations of the device. You are not allowed to just drop the
attempted request in mid-stream, even if your device defines a
"self-reset" condition port. The PCI bus does not recognize that a
device is ever allowed to just forget about transactions unless the
entire bus is reset using PCIRST#.
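As a rough illustration of the "wait, then reset" ordering described
above, here is a toy model in Python. It is only a sketch: the class
and method names are hypothetical, and real hardware would track its
retried requests in logic rather than in a set.

```python
# Toy model only. ToyDevice and its methods are hypothetical names used
# for illustration; they do not come from the PCI specification.

class ToyDevice:
    def __init__(self):
        self.outstanding = set()    # addresses of requests that were retried
        self.reset_pending = False  # soft reset requested but not yet done
        self.reset_count = 0        # number of internal resets performed

    def request_retried(self, address):
        """The HB or PPB answered a request with Retry."""
        self.outstanding.add(address)

    def request_completed(self, address):
        """A previously retried request has finally completed."""
        self.outstanding.discard(address)
        self._maybe_reset()

    def soft_reset(self):
        """Driver writes the vendor-specific reset port."""
        self.reset_pending = True
        self._maybe_reset()

    def _maybe_reset(self):
        # Perform the internal reset only once nothing retried is outstanding.
        if self.reset_pending and not self.outstanding:
            self.reset_pending = False
            self.reset_count += 1
```

In this model a reset requested while a retried read is outstanding
stays pending, and takes effect only when that read completes.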
In short, there is no way to overcome this problem. Your device
driver should never reset your device in this manner if the behavior
of the device would be to forget transactions. As you point out,
and as has been demonstrated on a few known systems already, the bus
would hang in this situation, or your device would receive incorrect
data.
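The incorrect-data case can be sketched with another toy model in
Python. The names are hypothetical, and the model is deliberately
simplified: it matches a repeated read on the address alone and
captures the data at the first attempt, which is an assumption made
for brevity rather than a description of any real bridge.

```python
# Toy model only. ToyHostBridge is a hypothetical name; it holds delayed
# read completions (DRCs) keyed by address, which is a simplification.

class ToyHostBridge:
    def __init__(self, memory):
        self.memory = memory  # dict of address -> value, stands in for host memory
        self.pending = {}     # address -> data captured for a delayed read

    def read(self, address):
        """Return data if a completion is waiting, else None to signal Retry."""
        if address in self.pending:
            # A repeated request matches the stored completion, so the
            # bridge returns whatever it captured earlier.
            return self.pending.pop(address)
        # First attempt: capture the data and tell the master to retry.
        self.pending[address] = self.memory[address]
        return None


def stale_read_demo():
    memory = {0x1000: "old work queue"}
    hb = ToyHostBridge(memory)

    first = hb.read(0x1000)            # device is retried; the DRC sits in the HB
    # The device is soft-reset here and forgets the request; the DRC stays behind.
    memory[0x1000] = "new work queue"  # driver sets up a new context
    second = hb.read(0x1000)           # restarted device reads the same address
    return first, second
```

In this sketch the second read returns the data captured before the
reset, not the contents of the new work queue.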
IF AND ONLY IF the HB or PPB designer used some foresight and
took advantage of the rule that allows the HB or PPB to drop an
outstanding Delayed Request after 2**15 PCI clocks, you could
feel safe if, after "internally resetting" your device, you waited
2**15 clocks before re-enabling it. This has two problems. First,
the HB and PPB are NOT required to dump the transaction after
2**15 clocks, as opposed to the DEFINITE requirement that your
device never stop retrying the transaction. Second, 2**15 clocks
is a really long time.
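To put a number on that wait, here is a small calculation in Python.
The 33 MHz bus clock is an assumption for illustration (the message
does not name a clock rate), and the function name is hypothetical.

```python
DISCARD_CLOCKS = 2**15  # clock count quoted in the message above

def discard_wait_seconds(clock_hz):
    """Time taken by 2**15 bus clocks at the given clock frequency."""
    return DISCARD_CLOCKS / clock_hz

# Assumed 33 MHz bus clock: the wait comes to roughly 993 microseconds.
wait_us = discard_wait_seconds(33e6) * 1e6
```

A slower bus clock stretches the wait proportionally, so a driver
relying on this would need to know the actual clock rate.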
No easy solutions here.
Regards,
David O'Shea
daveo@corollary.com
Corollary Inc.
Principal Software Engineer