Re: Write Combining difference between 98 and NT
Thanks for your thoughts - I've added replies to your questions below.
> I'll throw out a couple of thoughts:
>
> 1. I would concentrate on the timing on the CPU side of things. I suspect
> that the WC flush occurs under two different conditions -- buffer full,
> or timeout. If the timeout is sufficiently short, any delay in writing to
> the WC buffer would result in a flush of a partial buffer.
What timeout are you referring to - a timeout in the processor's memory
manager, the PCI chipset or the target PCI device? Intel don't seem to
document any internal timeouts on the WC buffers.
> 2. With the above in mind, I would concentrate on what things can cause a
> user-mode program that is writing directly to mapped memory locations
> to run slower on NT. The first thing I would suspect is NT's vaunted
> security mechanisms. While 98 will essentially let you party on just about
> any memory location with abandon, NT will generally trap any direct I/O or
> memory calls that don't look like they belong to the user mode app.
> Depending on the situation, it may allow the access to occur, but of
> course, only after a delay due to the additional processing overhead
> incurred.
Under NT the user-mode pointer is obtained by the driver using the
VideoPortGetDeviceBase() Video Miniport Driver DDK call. The InIoSpace
parameter is set to VIDEO_MEMORY_SPACE_P6CACHE to specify that the memory
should be mapped as Write Combined memory.
I would be very surprised if the OS traps accesses to the mapped region
since the whole point of the VideoPortGetDeviceBase() call is to allow very
fast access to display memory.
Additionally, if I structure my data such that each call to the function
writes out a multiple of eight 32-bit words, then the code runs
significantly faster (around 20%) under NT than under 98. This would seem
to indicate that there is little or no OS overhead being incurred by the
accesses.
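
To make "a multiple of eight 32-bit words" concrete, here is a minimal
sketch of that kind of structuring. It is not the actual writeData() code:
padded_words(), write_padded() and WC_LINE_WORDS are hypothetical names,
and it assumes that padding the tail of each burst with filler words is
harmless to the target device, which may not hold for every adapter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one burst is eight 32-bit words (32 bytes). */
#define WC_LINE_WORDS 8u

/* Round a word count up to the next multiple of WC_LINE_WORDS. */
static size_t padded_words(size_t n)
{
    return (n + WC_LINE_WORDS - 1u) / WC_LINE_WORDS * WC_LINE_WORDS;
}

/*
 * Hypothetical stand-in for a writeData()-style routine: copy n words
 * to dst in ascending order, then pad with 'fill' so that the total
 * written is always a whole number of eight-word groups.
 * dst must have room for padded_words(n) words.
 * Returns the number of words actually written.
 */
static size_t write_padded(volatile uint32_t *dst, const uint32_t *src,
                           size_t n, uint32_t fill)
{
    size_t total = padded_words(n);
    size_t i;

    assert(dst != NULL);
    assert(src != NULL || n == 0);

    for (i = 0; i < n; i++)
        dst[i] = src[i];      /* payload words */
    for (; i < total; i++)
        dst[i] = fill;        /* filler words up to the group boundary */

    return total;
}
```

With this, a call with 11 payload words writes 16 words in total, so no
call ends partway through an eight-word group.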
> 3. You didn't say exactly how you arrived at getting a user-mode pointer to
> a physical memory block (I'm assuming) hosted on your PCI adapter under
> NT.
> If you didn't use ZwMapViewOfSection() [see the 'memmap' or 'mapmem'
> example (I can't remember which) in the NT DDK], then you probably
> don't have an unencumbered user-mode pointer that is also valid in
> kernel-mode. Using ZwMapViewOfSection() will let you get such a
> pointer (after all the pre-requisite PhysToVirt(), etc. stuff -- see the
> example, which shows how to get a pointer to video RAM that is valid
> in both kernel and user mode). You can then use such a pointer to
> party on the memory without any interference from the OS kernel.
> Of course, Microsoft does not recommend using such system interfaces
> (even though they document them and provide example code!) because
> it circumvents the NT security mechanism. In other words, the presence
> of such a utility on a system could allow someone to exploit the
> memory mapping for nefarious purposes.
See answer to previous question.
> 4. Unless the call to writeData() is inlined by the compiler/linker, there
> would probably be at least a couple of memory hits outside of your
> WC buffer area (or the CPU's instruction cache), that is, to the system
> stack area. That may be enough to throw things off in this case.
True, but I would expect that after the first iteration through the
function all code and data would be cached for all subsequent
iterations. There is not a huge amount of code or data being accessed
(much less than 1 KByte).
> It sounds like you have a PCI bus analyzer hooked up, but that you don't
> have an ICE (or equivalent) watching what the CPU is doing on its side
> of the bridge. If you hook up an ICE, and capture the code execution,
> you may see a *lot* of extraneous activity on the part of the NT kernel
> that you don't see under 98, thus allowing the WC buffer to time out and
> flush its contents.
I hooked up a standard logic analyzer to some of the PCI signals.
Unfortunately I don't have access to an ICE.
Cheers,
-Paul