Re: Greetings. Having a FIFO overrun when my amplc_pci230 card is set on the other side of a PCI bridge

On 26/01/06 16:35, Michael R. Head wrote:
> On Wed, 2006-01-25 at 16:27 +0000, Ian Abbott wrote: 
>> On 25/01/2006 16:03, Michael R. Head wrote:
>>> On Wed, 2006-01-25 at 13:18 +0000, Ian Abbott wrote:
>>>
>>>> On 24/01/06 19:21, Michael R. Head wrote:
>>>>
>>>>> I've discovered that the card was (and is) _not_ working properly when
>>>>> it is on the main PCI bus. When I run the inp or inpn tests, I get
>>>>> 'timeout' printed on the console and in dmesg. I don't get this when the
>>>>> card is on the other side of the PCI bridge. It appears the problem was
>>>>> more subtle than I first thought.
>>>> It seems the test that checks the ADC busy flag is inverted.  The 
>>>> attached patch should fix it.
>>> Damn... I'll try this out today. But I'm guessing this means that I will
>>> stop getting timeouts on the primary bus and start getting timeouts on
>>> the expansion bus again...
>> I'm hoping it will work on both unless there's something seriously 
>> strange going on!
> 
> Hmm... it is as I feared. It now works on the main PCI bus, but I get a
> bunch of timeouts when the card is on the expansion bus. What's more, I
> don't think it was working properly before. It seems your patches should
> be combined and committed.

They're in bugzilla as separate bugs:

https://bugzilla.comedi.org/cgi-bin/bugzilla/show_bug.cgi?id=203
https://bugzilla.comedi.org/cgi-bin/bugzilla/show_bug.cgi?id=204

> Here are the various results I get running inpn from comedilib:
> 
> Without this new patch, on the main PCI bus: 
> 
> [hwtester_at_host demo]$ ./inpn
> 0: 656968 328484 164242 82121 328489 164245 82122.3
> 1: 656968 328484 164242 82121 328489 164245 82122.3
> 2: 656968 328484 164242 82121 328489 164245 82122.3
> 3: 656968 328484 164242 82121 328489 164245 82122.3
> 4: 656968 328484 164242 82121 328489 164245 82122.3
> 5: 656968 328484 164242 0.000915751 5.00366 2.50183 1.25092

Those big numbers are most likely garbage: the 'data' variable is left 
uninitialized when the read times out.

> 6: 0.00732601 0.003663 0.0018315 0.000915751 5.00366 2.50183 1.25092
> 7: 0.00732601 0.003663 0.0018315 0.000915751 5.00366 2.50183 1.25092
> 8: 0.00732601 0.003663 0.0018315 0.000915751 5.00366 2.50183 1.25092
> 9: 0.00732601 0.003663 0.0018315 0.000915751 5.00366 2.50183 1.25092
> 10: 0.00732601 0.003663 0.0018315 0.000915751 5.00366 2.50183 1.25092
> 11: 0.00732601 0.003663 -0.0177045 -0.00885226 4.96459 2.4823 1.24115
> 12: -0.0708181 -0.035409 -0.0177045 -0.00885226 4.96459 0.763126 0.381563
> 13: -6.9475 -3.47375 -1.73687 -0.868437 1.52625 0.763126 0.381563
> 14: -6.9475 -3.47375 -1.73687 -0.868437 1.52625 0.763126 0.381563
> 15: -6.9475 -3.47375 -1.73687 -0.868437 1.52625 0.763126 0.381563
> 
> The numbers above vary, but this particular run gave a good variety of
> the values. If I run dmesg, I see the timeout message printed many times
> over. If I run it from a text console (outside of X), I get all the
> timeouts printed inline with the output (as you might imagine for printk
> messages).
> 
> Here is what I get on the expansion PCI bus:
> [hwtester_at_host demo]$ ./inpn
> 0: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 1: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 2: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 3: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 4: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 5: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 6: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 7: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 8: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 9: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 10: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 11: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 12: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 13: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 14: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 15: -0.002442 -0.001221 -0.000610501 -0.00030525 nan nan nan
> 
> Those numbers are always the same.

It looks like it is always reading 0xFFFF out of the ADC data register 
in this case.  For bipolar ranges, the driver converts that to 0x7FF 
(2047); for unipolar ranges it is converted to 0xFFF (4095).
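
To make that arithmetic concrete, here is a rough user-space sketch of the 
conversion.  It is not the driver code: the function names are made up, and 
it assumes the 12-bit sample sits in the top 12 bits of the 16-bit register, 
that bipolar samples are two's complement, and that comedilib maps 0 and 
maxdata to NaN (which I believe is its default out-of-range behaviour).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

#define MAXDATA 0xFFFu  /* 12-bit ADC */

/* Hypothetical helper: raw 16-bit register value -> 12-bit comedi sample.
 * Assumes the sample occupies the top 12 bits of the register. */
static unsigned int raw_to_sample(uint16_t raw, int bipolar)
{
    unsigned int sample = (raw >> 4) & MAXDATA;

    if (bipolar)
        sample ^= 0x800u;  /* two's complement -> offset binary */
    return sample;
}

/* Hypothetical helper: comedi sample -> volts for the range [min, max].
 * Samples at either end of the scale are treated as out of range (NaN),
 * which is an assumption about comedilib's default behaviour. */
static double sample_to_volts(unsigned int sample, double min, double max)
{
    assert(sample <= MAXDATA);
    if (sample == 0 || sample == MAXDATA)
        return NAN;
    return min + (max - min) * (double)sample / MAXDATA;
}
```

With a raw value of 0xFFFF, raw_to_sample() gives 0x7FF for a bipolar range 
and 0xFFF for a unipolar one.  On the +/-10V range 0x7FF works out to about 
-0.002442, matching the first column above, and 0xFFF on any unipolar range 
comes out as nan, matching the last three columns.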

> Now, here are the results with your new patch:
> 
> card on main PCI bus:
> [hwtester_at_host demo]$ ./inpn
> 0: -4.98901 -0.199023 -0.0873016 -0.0363248 1.52137 nan nan
> 1: -5.62882 -0.860806 -0.468254 -0.249389 1.40415 nan nan
> 2: -5.04274 -0.474969 -0.214286 -0.0973748 1.49206 nan nan
> 3: -5.6337 -0.86569 -0.471917 -0.251832 1.35531 nan nan
> 4: -5.0525 -0.465201 -0.202076 -0.0906593 1.46276 nan nan
> 5: -5.61416 -0.868132 -0.452381 -0.240232 1.40659 nan nan
> 6: -5.06227 -0.506716 -0.230159 -0.106532 1.48718 nan nan
> 7: -5.55067 -0.868132 -0.449939 -0.227411 1.37485 nan nan
> 8: -5.04762 -0.357753 -0.154457 -0.0699023 1.50672 nan nan
> 9: -5.62393 -0.82906 -0.434066 -0.235348 1.41148 nan nan
> 10: -4.9304 -0.433455 -0.197192 -0.0912698 1.45055 nan nan
> 11: -5.6044 -0.863248 -0.459707 -0.243895 1.41148 nan nan
> 12: -5.0525 -0.555556 -0.260684 -0.118742 nan nan nan
> 13: -1.81441 -0.821734 -0.447497 -0.245726 1.39683 nan nan
> 14: -5.03297 -0.504274 -0.23138 -0.108364 1.48962 nan nan
> 15: -4.59341 -0.836386 -0.459707 -0.266484 1.37973 nan nan
> 
> Which looks right. Hurray!
> 
> And here's the card on the secondary PCI bus:
> [hwtester_at_csi-lear-1 demo]$ ./inpn
> 0: 656968 328484 164242 82121 328489 164245 82122.3
> 1: 656968 328484 164242 82121 328489 164245 82122.3
> 2: 656968 328484 164242 82121 328489 164245 82122.3
> 3: 656968 328484 164242 82121 328489 164245 82122.3
> 4: 656968 328484 164242 82121 328489 164245 82122.3
> 5: 656968 328484 164242 82121 328489 164245 82122.3
> 6: 656968 328484 164242 82121 328489 164245 82122.3
> 7: 656968 328484 164242 82121 328489 164245 82122.3
> 8: 656968 328484 164242 82121 328489 164245 82122.3
> 9: 656968 328484 164242 82121 328489 164245 82122.3
> 10: 656968 328484 164242 82121 328489 164245 82122.3
> 11: 656968 328484 164242 82121 328489 164245 82122.3
> 12: 656968 328484 164242 82121 328489 164245 82122.3
> 13: 656968 328484 164242 82121 328489 164245 82122.3
> 14: 656968 328484 164242 82121 328489 164245 82122.3
> 15: 656968 328484 164242 82121 328489 164245 82122.3
> 
> And when I check dmesg, I get the timeouts.

My guess is that the driver is always reading 0xFFFF from the ADCCON 
register that has the "ADC busy" bit.  The ADCCON is in the same region 
as the ADCDATA register, so I think the driver is reading 0xFFFF from 
all registers in that region.  (The digital I/O, counter timer and 
interrupt control registers are in a separate region to the ADC and DAC 
stuff.)

It would be interesting to change the following line in amplc_pci230.c 
to check this suspicion:

			rt_printk("timeout\n");

becomes:

			rt_printk("timeout (adccon=%x)\n", status);

If it prints "timeout (adccon=ffff)" then my suspicion is correct.
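
For illustration, the check amounts to something like the following 
user-space sketch.  This is not the code in amplc_pci230.c: the busy bit 
mask, the retry count and all the function names are placeholders, and the 
register read is faked with a function pointer so the two cases can be 
compared.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define ADC_BUSY_MASK 0x8000u  /* placeholder bit position, not from the driver */
#define POLL_LIMIT    100      /* placeholder retry count */

typedef uint16_t (*read_reg_fn)(void);

/* Fake register region that never responds: every read is all ones. */
static uint16_t read_all_ones(void) { return 0xFFFFu; }

/* Fake healthy card whose busy bit is already clear. */
static uint16_t read_idle(void) { return 0x0000u; }

/* Poll the (fake) ADCCON register until the busy bit clears.
 * Returns 0 on success, -1 on timeout; the last value read is stored in
 * *status_out either way so the caller can inspect it. */
static int wait_adc_idle(read_reg_fn read_adccon, uint16_t *status_out)
{
    uint16_t status = 0;
    int i;

    assert(read_adccon != NULL && status_out != NULL);
    for (i = 0; i < POLL_LIMIT; i++) {
        status = read_adccon();
        if (!(status & ADC_BUSY_MASK)) {
            *status_out = status;
            return 0;
        }
    }
    *status_out = status;
    printf("timeout (adccon=%x)\n", (unsigned int)status);
    return -1;
}
```

If every read from the region returns 0xFFFF, any busy bit will appear 
permanently set, so wait_adc_idle(read_all_ones, &status) times out and 
prints "timeout (adccon=ffff)", whereas wait_adc_idle(read_idle, &status) 
returns 0 straight away.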

> I'm finding trouble with other PCI cards on the expansion, so it could
> be a chipset issue or a general linux problem. If you can make an
> educated guess as to what's causing the timeout across the PCI bridge,
> I'd appreciate it. Do you think interrupts are simply not being
> propagated? Maybe I/O or memory port data isn't getting to the card?

./inpn won't be using interrupts, but the other bug-fix for the 
never-ending interrupts suggests that interrupts are getting through and 
that the region containing the interrupt status and control registers is 
at least partially working.  The region containing the ADC and DAC 
registers probably isn't working, but I don't know if that is due to the 
PLX PCI9052 bridge chip on the card, the PCI-to-PCI bridge or something 
else.  The fact that you're having trouble with other cards lets us off 
the hook a bit unless they are also using PLX chips!

> Either way, thanks for looking into this.

No problem.  It would be nice to get to the bottom of this.

-- 
-=( Ian Abbott _at_ MEV Ltd.    E-mail: <abbotti_at_mev.co.uk>        )=-
-=( Tel: +44 (0)161 477 1898   FAX: +44 (0)161 718 3587         )=-

Received on 2006-01-27Z15:47:32