
RE: Purposeful Parity control



> 1.  Is there some reason why it is felt that the PCI bus is more
> susceptible to single bit failures than the rest of the data paths in
> the machine?

Yes, but let's qualify that further by saying "more susceptible than _some_
other data paths".  There are connectors on the bus (connector junctions are
more prone to failure than fixed connections), and it is meant to accept a
wide variety of 3rd party components.  Those components have relatively
strict design requirements to ensure signal integrity, but they are not
necessarily designed or manufactured to the highest quality.

> How about verification of hardware control lines (PCI included)?
> Are they less likely to fail?

No, but a control line failure is often more readily apparent than a single
bit failure. 

> Maybe it could be argued that it is better to have some detection
> capability rather than none.  I have always suspected that parity and
> EDC (error detection and correction) existed on buses because of the
> possibility of single point upsets in memory devices (specifically
> DRAM).  Since the PCI bus does not connect directly to memory chips, I
> would argue that the PCI device which interfaces to this type of
> memory is the one who should implement the EDC/Parity algorithm.

A single bit failure can occur anywhere in a data path, not just inside a
memory device.
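
To make the detection limits concrete, here is a rough sketch of the even
parity scheme PCI uses: the PAR signal is driven so that the number of 1s
across AD[31:0], C/BE#[3:0], and PAR itself is even.  The function names
(pci_par, parity_ok) and the sample values are made up for illustration;
this models the arithmetic only, not bus timing or PERR# signalling.

```python
def pci_par(ad: int, cbe: int) -> int:
    """Return the even-parity bit over 32 AD bits and 4 C/BE# bits.

    Hypothetical helper: PAR is the XOR of all 36 covered bits, so the
    total count of 1s including PAR comes out even.
    """
    bits = ((cbe & 0xF) << 32) | (ad & 0xFFFFFFFF)
    return bin(bits).count("1") & 1


def parity_ok(ad: int, cbe: int, par: int) -> bool:
    """Receiver-side check: recompute parity and compare with PAR."""
    return pci_par(ad, cbe) == (par & 1)


# Illustrative values only, not taken from any real transaction.
ad, cbe = 0xDEADBEEF, 0b0000
par = pci_par(ad, cbe)

single_flip = ad ^ (1 << 7)               # one bit corrupted in transit
double_flip = ad ^ (1 << 7) ^ (1 << 20)   # two bits corrupted in transit

print(parity_ok(ad, cbe, par))            # clean transfer passes
print(parity_ok(single_flip, cbe, par))   # single-bit error is caught
print(parity_ok(double_flip, cbe, par))   # double-bit error slips through
```

Note that the check catches any odd number of flipped bits wherever in the
path they were flipped, but an even number of flips goes unnoticed.  That is
the trade-off for spending only one extra signal on detection.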


> 2.  If we are talking about the desktop, my suspicion is that the
> likelihood of a serious system problem due to a software crash (either
> OS or application) is orders of magnitude more likely than a PCI data
> bus parity error.  I think that the same argument might be made for
> most embedded systems (except possibly those whose software has been
> tested very rigorously (e.g. formal verification)).

Consider the question of how many of these intermittent, undiagnosed system
crashes may actually be caused by flaky hardware.  I've heard of systems
being tossed out by frustrated Hardware Quality Lab engineers when they
finally prove that a system is or has become flaky (I'm referring to bad data
transfers in particular).  How many manufacturers don't send their machines
to rigorous HQLs?  In the cost-conscious personal computer market, parity and
other error detection schemes are disappearing.  Instead (my theory),
manufacturers allow the small number of errors to exist, on the assumption
that the user will restart the system often and so not notice a cumulative
effect of errors.
-- BrooksL


> Lame Brooks-G14738 <Brooks_Lame-G14738@email.mot.com> on 04/17/2001
> 01:46:52 PM
> 
> To:   "PCISIG List (E-mail)" <pci-sig@znyx.com>
> cc:   "'Dimiter Popoff'" <tgi@bulnet.bg>
> 
> Subject:  Purposeful Parity control RE: PCI Politics (was RE: why Target
>       cannot change its mind)
> 
> 
> > Why was the parity control necessary for a chip-to-chip bus?
> 
> For the same reason it's necessary in other reliable, high-speed data
> transfer mechanisms - it is very easy for single bit errors to sneak in,
> be they due to design margin, bad components, external influence, or
> whatever, and without some bandwidth-efficient, cost-effective detection
> method, a single bit error could easily run undetected for a while and
> worm its way into causing system instability.
> -- BrooksL