It appears that a memory barrier soon after a mispredicted
branch, not just in the delay slot, can cause the hang
condition of this cpu errata.
So move them out-of-line, and explicitly put them into
a "branch always, predict taken" delay slot which should
fully kill this problem.
Signed-off-by: David S. Miller <davem@davemloft.net>
Firstly, if the direction is TODEVICE, then dirty data in the
streaming cache is impossible so we can elide the flush-flag
synchronization in that case.
Next, the context allocator is broken. It is highly likely
that contexts get used multiple times for different dma
mappings, which confuses the strbuf flushing code and makes
it run inefficiently.
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent change to add a timeout to strbuf flushing had
a negative performance impact. The udelay()'s are too long,
and they were done in the wrong order wrt. the register read
checks. Fix both, and things are happy again.
There are more possible improvements in this area. In fact,
PCI streaming buffer flushing seems to be part of the bottleneck
in network receive performance on my SunBlade1000 box.
Signed-off-by: David S. Miller <davem@davemloft.net>
If some hardware error occurs and the flush flag never updates,
we will hang forever in these routines. Add a timeout, and
print out a diagnostic if it is reached.
Signed-off-by: David S. Miller <davem@davemloft.net>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!