Re: Interrupt latency...
- To: Mailing List Recipients <pci-sig-request@znyx.com>
- Subject: Re: Interrupt latency...
- From: "Jasper Balraj" <jasper@utopia.hclt.com>
- Date: Tue, 21 Jan 1997 16:08:13 +0530 (IST)
Dear PCI Experts,
Thank you for the really useful suggestions regarding interrupt
latencies. I understand that the latency is OS-specific, and also that
my card need not interrupt the CPU for every 8-DWORD transfer, since it
can become a bus master when enabled. I especially thank JRP, David
O'Shea, Ward, and DaveN.
I have decided to use AMCC's S5933QB in posted mode. From the add-on
side, I'd fill up the FIFO and then enable the part to become a bus
master and dump the FIFO's contents into host memory. For this I wanted
to use the add-on side signals FWE (add-on-to-PCI FIFO Empty) and FRF
(PCI-to-add-on FIFO Full), ORed together, as an interrupt to my
processor, because IRQ# cannot be programmed to assert every time the
corresponding FIFO goes empty or full. FWE, however, appears to be
subject to an erratum (errata D14.1): FWE can assert even though the
last DWORD from the FIFO has not really been transferred by the PCI
controller (bus master) but is still sitting in a holding register
inside the controller, after the controller took a target retry due to
a cache line boundary crossing or the like.
Can anybody tell me how long, in the worst case, my bus master may have
to wait for bus grant to complete the `retried' transfer? The reason I
ask is that my add-on processor will have to sit in a wait loop until
this completes (polling the add-on general control and status register
on the PCI controller), then disable the controller's bus mastering and
fill the add-on-to-PCI FIFO again. I am doing this kind of
32-bytes-at-a-time transfer because I am not implementing a DMAC on the
add-on side.
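To make that concrete, here is a minimal sketch (in C) of the wait loop
the add-on firmware would run. The register address and bit masks below
are made up for illustration; the real AGCSTS offset and bit
assignments are in the S5933 data book.

    #include <stdint.h>

    /* Hypothetical mapping of the add-on general control/status
     * register (AGCSTS) and two of its bits -- check the data book. */
    #define AGCSTS         (*(volatile uint32_t *)0x3C)
    #define A2P_FIFO_EMPTY 0x00000020u   /* true FIFO-empty status   */
    #define BUS_MASTER_EN  0x00000400u   /* bus mastering enable     */

    static void refill_after_drain(void)
    {
        /* FWE may assert one DWORD early (errata D14.1), so poll
         * the real FIFO status instead of trusting the pin alone.   */
        while (!(AGCSTS & A2P_FIFO_EMPTY))
            ;                            /* spin until truly empty   */

        AGCSTS &= ~BUS_MASTER_EN;        /* stop bus mastering       */
        /* ... write the next 8 DWORDs into the add-on-to-PCI FIFO   */
        AGCSTS |= BUS_MASTER_EN;         /* let it drain again       */
    }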
If someone knows, please tell me whether the last DWORD sitting in the
holding register (after a target retry) will be lost if AMWEN is
deasserted to fill data into the add-on-to-PCI FIFO and then asserted
again to re-enable bus mastering. If it won't be lost, maybe my add-on
processor will not have to go into the wait loop after seeing FWE just
to get the actual FIFO-empty status from the add-on general control and
status register. I'm sorry for putting such a device- and
design-specific question on the reflector. Thanx in advance.
regards.
-Jasper
Jasper Balraj wrote:
>
> Hello!
>
> I'd like to know the exact way to calculate the interrupt latency
> in PCI bus. I understand, since the interrupt pins at the PCI
> connectors are shared, when there's an interrupt, the device driver
> has to read from the device's status register and check whether that
> device is the source of interrupt. If that device had not generated
> the interrupt, pass control to the previous device driver. So while
> calculating the interrupt latency, am I right that one has to take
> into account all these PCI bus reads and compare-and-jump instructions
> (for the worst case)?
Yes.
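For concreteness, the check each chained handler makes looks roughly
like this in C. The register layout and return convention are
illustrative, not any particular OS's driver interface:

    #include <stdint.h>

    #define INT_PENDING 0x0001u              /* hypothetical bit    */

    extern volatile uint16_t *my_status_reg; /* device status reg   */

    int my_isr(void)
    {
        if (!(*my_status_reg & INT_PENDING))
            return 0;            /* not ours: next driver in chain  */

        *my_status_reg = INT_PENDING;  /* ack (write-one-to-clear)  */
        /* ... service the device ...                               */
        return 1;                /* interrupt claimed               */
    }

Each of those status reads is a PCI bus cycle, so the worst case must
charge one read (plus a compare and a branch) for every driver ahead
of you in the chain.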
> Is there any typical interrupt latency period
> value for the PCI bus?
You're not really interested in 'typical'. If you don't want to
drop any data, you need worst case. This number is dependent
on the speed and load of the CPU, and other masters running on
the bus.
> Why I am asking this is I am planning to have
> an 8-DWORD FIFO in my PCI controller. So after filling up the FIFO, my
> on-board processor would generate an interrupt in the PCI bus to tell
> the host to put my PCI controller in the initiator mode and do bus
> master xfer directly to the host memory.
Rather than having your board interrupt the processor every time
it wants to move 8 DWORDs, you should build something similar to
a scatter/gather DMA engine into your board. Program it for a
series of transfers, up to some larger block size. Then you can
take the interrupt-processing load off of the CPU.
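Such an engine walks a list of descriptors, roughly like the sketch
below (in C). The field layout, flag bit, and the desc_bus_addr()
helper are all made up for illustration:

    #include <stdint.h>

    struct sg_desc {
        uint32_t host_addr;   /* PCI address of this fragment        */
        uint32_t byte_count;  /* length of this fragment             */
        uint32_t next_desc;   /* bus address of next descriptor, or 0 */
        uint32_t flags;       /* control bits for this fragment      */
    };

    #define SG_IRQ_ON_DONE 0x1u           /* hypothetical flag bit   */

    extern uint32_t desc_bus_addr(struct sg_desc *d);

    /* Chain n fragments; interrupt only after the last completes.   */
    void build_chain(struct sg_desc *d, int n)
    {
        int i;
        for (i = 0; i < n; i++) {
            d[i].next_desc = (i + 1 < n) ? desc_bus_addr(&d[i + 1]) : 0;
            d[i].flags     = (i + 1 < n) ? 0 : SG_IRQ_ON_DONE;
        }
    }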
Your major worry then will be the maximum bus mastering latency
on the PCI bus. This is dependent on the arbiter algorithm and
the behavior of the other boards plugged in with you. A good
systems queueing-theory modeling program may help you there. As
far as I know, no one vends an off-the-shelf package for such an
analysis of the PCI bus (but there ought to be) that takes into
account popular host chipsets, video boards, and so on. One
company vends a Rate Monotonic Analysis package for interrupts,
which has some built-in numbers for interrupt rates of popular
adapters, but it assumes no bus latency, and it's a pricey
package (IMHO), given that RMA is nothing more than an arithmetic
series calculation. There are published papers from the SEI on
RMA. It's mostly good for fixed-priority schemes, where a CPU is
hooked to a set of devices on different fixed-priority interrupt
lines, although some research has been done on adapting it to
other situations.
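For what it's worth, the core of the classic rate-monotonic test (Liu
and Layland's utilization bound) really is just this much arithmetic;
the task numbers in main() are only an example:

    #include <math.h>
    #include <stdio.h>

    /* n periodic tasks are schedulable under fixed RMA priorities
     * if total utilization <= n * (2^(1/n) - 1).                   */
    int rma_schedulable(const double *cost, const double *period, int n)
    {
        double u = 0.0;
        int i;
        for (i = 0; i < n; i++)
            u += cost[i] / period[i];     /* per-task utilization   */
        return u <= n * (pow(2.0, 1.0 / n) - 1.0);
    }

    int main(void)
    {
        /* A 4000 Hz interrupt (250 us period) costing 100 us, plus
         * a 1 ms task costing 200 us: U = 0.4 + 0.2 = 0.6, and the
         * two-task bound is about 0.828, so this set passes.       */
        double cost[]   = { 100e-6, 200e-6 };
        double period[] = { 250e-6, 1e-3 };
        printf("%s\n", rma_schedulable(cost, period, 2)
                           ? "schedulable" : "overloaded");
        return 0;
    }

Note the bound assumes no bus latency, which is exactly the weakness
mentioned above.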
> According to my data xfer
> rates, my add-on processor may interrupt the host, some 4000 times in
> a second to transfer 8 DWORDS of data each time thru' the FIFO. So the
> total time wasted because of interrupt latency itself, in a second,
> itself will be 4000 times that of a single interrupt. Will this
> create any problem with other PCI cards like graphics adapters,
> PCI-SCSI, and so on?
>
Anything that interrupts the CPU at 4 kHz is likely to impact other
system resources. You will have to decide if the perceived slowness
seen by the user is OK. If your interrupt routines are really tight,
you might impact the system somewhat less than 10% (at 4000 interrupts
a second, every 25 microseconds of handler time costs 10% of the CPU).
> I'll be really glad to get any info. in this regard. Thanx in advance.
>
> -Jasper
>
> jasper@hclt.com
--
Cheers,
DaveN
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Dave New, den@aisinc.com | Machine vision?
Applied Intelligent Systems | I'm glad *they* can see the future...
3980 Ranchero Drive |
Ann Arbor, MI 48108 | Opinions expressed are mine. | PGP 2.6
(313) 332-7010 | 08 12 9F AF 5B 3E B2 9B 6F DC 66 5A 41 0B AB 29
(313) 332-7077 FAX
Interrupt latency is going to kill your throughput in this design.
It will also bring most OSes to their knees. In the PC arena, if you
are running on an 8259A interrupt controller with a Unix OS, your
latency will be VERY high (because of the masking required to
implement Unix's interrupt "levels"). On NT or Win95 it is still
very high, but not as bad. On Unix you might count on 0.3 milliseconds
(pathetically slow). On something like NT it's going to be faster,
but still a slow 0.1 milliseconds. Note that 4000 interrupts a second
at 0.3 ms each is 1.2 seconds of latency per second of real time: the
machine cannot even keep up. You are also correct that with PCI
devices you will be on a software interrupt chain, and will have to
wait out the unpredictable latency of the other PCI drivers in the
chain ahead of you. This will also kill the performance, since those
other drivers will have to check their interrupt status registers on
each of your 4000 interrupts a second.
It's going to be slow! Now for the almost-good news. If your primary
market is uniprocessor PCs, and there are only 3 or 4 PCI slots, and
the user has properly gone through configuration hell with today's
Plug-and-Pray BIOS, then your device may not be sharing an interrupt
with another device. [No guarantee, just a matter of fact.]
Why don't you put some intelligence on your adapter and have it
determine when to do the data transfer? The day of "dumb" controllers
is past. [I have worked on many "dumb" controllers... those places
are all out of business now, though they thrived five to ten years
ago.] Put a processor on the board (the i960 comes to mind) and stop
interrupting the host for every little thing. Just let the host know
when the transfer is all done.
You will find that this is definitely the trend, and it is becoming
less of a trend and more of a requirement. Notice that Microsoft has
changed the requirements for Windows 97 hardware platforms for IDE
controllers.
The IDE controllers now MUST support the DMA modes of ATA-3 so that
the controller can transfer all of the data on its own without interrupting
the host for each and every block of data (512 bytes). The DMA-IDE will
just interrupt on completion of transfer. Microsoft wants this
because a) the technology is already there, ATA-3 is defined and implemented,
and b) interrupt latency kills OS performance. Here they are concerned
with an interrupt per 512 bytes, not an interrupt per 8*4=32 bytes
that you are talking about.
I do not want to be too harsh. Your design will work. It will be
simple. It will be SLOW!!!!! And it will fail in the marketplace.
[Better to hear harsh words early than to hear them later from the
marketplace. All of this is definitely (obviously) In My Humble
Opinion.]
Regards,
David O'Shea
corollary.com
Jasper -
Don't try this one on a Macintosh! As you surmise, interrupts require
the OS to determine the interrupting device. While the Mac can directly
determine which slot is requesting the interrupt, the PCI interrupt
manager is layered on the older OS interrupt manager. The older code is
not native PowerPC code, so there are a couple of context switches
between emulated 68K and native PPC before your interrupt handler gets
the signal. 4000 interrupts/sec would create an unacceptable load on
the CPU.
My guess is that Unix or NT systems would be substantially faster
responding, but IMHO, 4K int/sec is still a lot of traffic to sustain under
program control.
I expect you will get a number of suggestions to use either deeper
FIFOs or a more efficient DMA scheme. DMA for 8 DWORDs seems quite
inefficient anyway: by the time you set up your DMA controller
parameters, you could have moved the data programmatically. Why not
set up longer transfers and use normal DMA "handshaking" to do it for
you, then generate an interrupt only when the requested DMA count
reaches 0?
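A sketch of that scheme, in C, with made-up register addresses and
bits; the 4 KB block size is just an example:

    #include <stdint.h>

    #define DMA_ADDR  (*(volatile uint32_t *)0x40)  /* hypothetical */
    #define DMA_COUNT (*(volatile uint32_t *)0x44)  /* hypothetical */
    #define DMA_CTRL  (*(volatile uint32_t *)0x48)  /* hypothetical */
    #define DMA_GO        0x1u
    #define DMA_IRQ_ON_TC 0x2u   /* interrupt when count hits zero  */

    void start_block(uint32_t host_addr, uint32_t bytes)
    {
        DMA_ADDR  = host_addr;
        DMA_COUNT = bytes;            /* e.g. 4096 instead of 32    */
        DMA_CTRL  = DMA_GO | DMA_IRQ_ON_TC;
    }

At the same 128 KB/s the FIFO scheme delivers (4000 x 32 bytes), 4 KB
blocks cut the interrupt rate from 4000/sec to about 31/sec.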
-- ward
*********************************************************************
Ward McFarland MegaWolf, Inc Macintosh Serial Expansion
ward@megawolf.com (203)562-1243 http://www.megawolf.com/
**********************************************************************
Well, depending on your target operating system, this isn't going to
work at all. Interrupt latency is generally more a function of the OS
architecture, and the Wintel style of PC OSes just doesn't let you
specify this stuff very well. Also, there are a lot of poorly designed
devices in the PC world, such as many graphics adapters, which can
block the processor (by putting it in a wait state during target
disconnects) for milliseconds.
If your device is a bus master, why in the heck would it need the
CPU's attention for each short burst of 8 DWORDs? By the time you
handle an IRQ and set up a bus master operation, you might as well
have just read those 8 DWORDs directly... Most bus masters should be
able to transfer KILObytes unattended; otherwise there really isn't
much point in them BEING a bus master.
-jrp