In case of unlinked files with dirty pages GFS2 wasn't clearing
the pages in quite the right order. This patch clears the pages
earlier (before the qlock_dq) to avoid the situation that the
release of the glock results in attempting to write back data that
has already been deallocated.
This fixes Red Hat bugzilla: #220117
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
I just noticed this message when testing some other changes I'd made to
lowcomms (to use workqueues) but the problem seems to be in the current
git trees too. I'm amazed no-one has seen it.
BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
I was a little over-enthusiastic turning schedule() calls int cond_sched() when fixing the DLM for Andrew Morton.
These four should really be calls to schedule() or the dlm can busy-wait.
Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bugzilla 215088
Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2
partition. The gfs2_rename() apparently needs block allocation for the
new name (into the directory) where it requires rg locks. At the same
time, while updating the nlink count for the replaced file,
gfs2_change_nlink() tries to return the inode meta-data back to resource
group where it needs rg locks too. Our logic doesn't allow process to
acquire these locks recursively by the same process (RHEL installer)
that results a BUG call. This only happens within rename code path and
only if the destination file exists before the rename operation.
Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This is partially derrived from a patch written by Russell Cattelan.
It fixes a bug where there is a race between readpages and truncate
by ignoring readpages for stuffed files. This is ok because a stuffed
file will never be more than one block (minus sizeof(struct gfs2_dinode))
in size and block size is always less than page size, so we do not lose
anything efficiency-wise by not doing readahead for stuffed files. They
will have already been "read ahead" by the action of reading the inode
in, in the first place.
This is the remaining part of the fix for Red Hat bugzilla #218966
which had not yet made it upstream.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Russell Cattelan <cattelan@redhat.com>
This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs
due to trying to take the i_mutex while holding a glock. The correct
locking order is defined as i_mutex -> glock in all cases.
I've left dealing with allocating writes. I know that we need to do
that, but for now this should do the trick. We don't need to take the
i_mutex on write, because the VFS has already taken it for us. On read
we don't need it since the glock is enough protection. The reason that
I've made some of the checks into a separate function is that we'll need
to do the checks again in the allocating write case eventually, so this
is partly in preparation for this. Likewise the return value test of !=
1 might look a bit odd and thats because we'll need a third return value
in case of requiring an allocation.
I've made the change to deferred mode on the glock to ensure flushing
read caches on other nodes. I notice that (using blktrace to look at
whats going on) we appear to do a better job of large I/Os than ext3
after this patch (in terms of not splitting up the I/Os).
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Wendy Cheng <wcheng@redhat.com>
Remove the following unused functions:
- lowcomms_send_message()
- lowcomms_max_buffer_size()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
When the dlm fakes an unlock/cancel reply from a failed node using a stub
message struct, it wasn't setting the flags in the stub message. So, in
the process of receiving the fake message the lkb flags would be updated
and cleared from the zero flags in the message. The problem observed in
tests was the loss of the USER flag which caused the dlm to think a user
lock was a kernel lock and subsequently fail an assertion checking the
validity of the ast/callback field.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
LVB's are not sent as part of new requests, but the code receiving the
request was copying data into the lvb anyway. The space in the message
where it mistakenly thought the lvb lived actually contained the resource
name, so it wound up incorrectly copying this name data into the lvb. Fix
is to just create the lvb, not copy junk into it.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The send_args() function is used to copy parameters into a message for a
number different message types. Only some of those types are set up
beforehand (in create_message) to include space for sending lvb data.
send_args was wrongly copying the lvb for all message types as long as the
lock had an lvb. This means that the lvb data was being written past the
end of the message into unknown space.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Check if we receive a message from another lockspace member running a
version of the dlm with an incompatible inter-node message protocol.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
A reply to a recovery message will often be received after the relevant
recovery sequence has aborted and the next recovery sequence has begun.
We need to ignore replies to these old messages from the previous
recovery. There's already a way to do this for synchronous recovery
requests using the rc_id number, but not for async.
Each recovery sequence already has a locally unique sequence number
associated with it. This patch adds a field to the rcom (recovery
message) structure where this recovery sequence number can be placed,
rc_seq. When a node sends a reply to a recovery request, it copies the
rc_seq number it received into rc_seq_reply. When the first node receives
the reply to its recovery message, it will check whether rc_seq_reply
matches the current recovery sequence number, ls_recover_seq, and if not
then it ignores the old reply.
An old, inadequate approach to filtering out old replies (checking if the
current stage of recovery has moved back to the start) has been removed
from two spots.
The protocol version number is changed to reflect the different rcom
structures.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
There's a chance the new master of resource hasn't learned it's the new
master before another node sends it a lock during recovery. The node
sending the lock needs to resend if this happens.
- A sends a master lookup for resource R to C
- B sends a master lookup for resource R to C
- C receives A's lookup, assigns A to be master of R and
sends a reply back to A
- C receives B's lookup and sends a reply back to B saying
that A is the master
- B receives lookup reply from C and sends its lock for R to A
- A receives lock from B, doesn't think it's the master of R
and sends an error back to B
- A receives lookup reply from C and becomes master of R
- B gets error back from A and resends its lock back to A
(this resending is what this patch does)
- A receives lock from B, it now sees it's the master of R
and takes the lock
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
If an fs has already been shut down, a lockfs callback should do nothing.
An fs that's been shut down can't acquire locks or do anything with
respect to the cluster.
Also, remove FIXME comment in withdraw function. The missing bits of the
withdraw procedure are now all done by user space.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Some HID devices by Apple have both keyboard and mouse interfaces; the
keyboard interface is handled by usbhid, but the mouse (really
touchpad) interface must be handled by the separate 'appletouch'
driver. Using HID_QUIRK_IGNORE will make hiddev ignore both
interfaces, therefore a new quirk flag to ignore only the mouse
interface is required.
Signed-off-by: Soeren Sonnenburg <kernel@nn7.de>
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
CONFIG_INPUT_DEBUG is non-existent option, so remove anything depending
on it.
Also, as we have new CONFIG_HID_DEBUG, this should be used on places
where ifdef DEBUG was used before.
Suggested by Adrian Bunk.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The comment in hid_get_class_descriptor() says a very obvious thing
and is also violating codingstyle. Just remove it.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The unused hid_find_field_by_usage() function has been commented out for
a pretty long time. Remove it completely.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
hidinput_{open,close}() functions do not belong to usbhid, but
to the generic HID layer. Move them, and fix hooks in struct
hid_device, so that now the callbacks are done to transport-specific
_open() functions, but not input_open() functions.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
hid-debug.h contains a lot of code, and should not therefore
be a header.
This patch moves the code to generic hid layer as .c source, and
introduces CONFIG_HID_DEBUG to conditionally compile it, instead
of playing with #define DEBUG and including hid-debug.h.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add a force feedback driver for PantherLord USB/PS2 2in1 Adapter,
0810:0001. The device identifies itself as "Twin USB Joystick".
Signed-off-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add new quirk HID_QUIRK_SKIP_OUTPUT_REPORTS to skip output reports
when enumerating reports on a hid-input device. Add this quirk and
HID_QUIRK_MULTI_INPUT to 0810:0001.
PantherLord Twin USB Joystick, 0810:0001 has separate input reports
for 2 distinct game controllers in the same interface, so it needs
HID_QUIRK_MULTI_INPUT. However, the device also contains one output
report per controller which is used to control the force feedback
function, and we do not want those to appear as separate input
devices as well. The simplest approach seems to be to add a quirk to
skip output reports on 0810:0001, and allow the force feedback
driver to handle those.
Signed-off-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Allow hid devices with HID_QUIRK_MULTI_INPUT to have force feedback.
This was previously disabled because there were not any force
feedback drivers for such devices. This will change with my upcoming
patch.
Signed-off-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Remove prototypes for functions that don't exist.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch removes do_mmap() from ehca:
- Call remap_pfn_range() for hardware register block
- Use vm_insert_page() to register memory allocated for completion
queues and queue pairs
- The actual mmap() call/trigger is now controlled by user space,
ie. libehca
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The iWARP connection manager uses the ib_addr services to do route
resolution (neighbour discovery in the IP world). The ib_addr
netevent callback routine, however, currently only acts on InfiniBand
neighbour updates. It needs to act on ethernet neighbour updates as
well.
This patch just removes filtering on device type altogether and will
trigger on any neighour updates where the nud_type is valid. This
simplifies the code some.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Make the untyped data region in ib_user_mad have type u64 so that it
gets aligned properly. This avoids alignment faults in ib_umad when
casting the data field to an rmpp_mad and accessing the 64-bit tid
field on architectures like ia64.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When there is a call to send_tsk_mgmt SRP posts a send and waits for 5
seconds to get a response.
When the QP is in the error state it is obvious that there will be no
response so it is quite useless to wait. In fact, the timeout causes
SRP to wait a long time to reconnect when a QP error occurs. (Each
abort and each reset_device calls send_tsk_mgmt, which waits for the
timeout). The following patch solves this problem by identifying the
failure and returning an immediate error code.
Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
struct ib_wc currently only includes the local QP number: this matches
the IB spec, but seems mostly useless. The following patch replaces
this with the pointer to qp itself, and updates all low level drivers
and all users.
This has the following advantages:
- Ability to get a per-qp context through wc->qp->qp_context
- Existing drivers already have the qp pointer ready in poll cq, so
this change actually saves a tiny bit (extra memory read) on data path
(for ehca it would actually be expensive to find the QP pointer when
polling a CQ, but ehca does not support SRQ so we can leave wc->qp as
NULL for ehca)
- Users that need the QP number can still get it through wc->qp->qp_num
Use case:
In IPoIB connected mode code, I have a common CQ shared by multiple
QPs. To track connection usage, I need a way to get at some per-QP
context upon the completion, and I would like to avoid allocating
context object per work request just to stick a QP pointer into it.
With this code, I can just use wc->qp->qp_context.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
<rdma/ib_verbs.h> uses struct kref, so it should include <linux/kref.h>
explicitly to avoid hidden include dependencies.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Since we actively avoid highmem, calling kmap_atomic() instead
of page_address() is effectively only obfuscation.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Since we actively avoid highmem, calling kmap_atomic() instead
of page_address() is effectively only obfuscation.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Since we actively avoid highmem, calling kmap_atomic() instead
of page_address() is effectively only obfuscation.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
MMC high-speed, wide bus support and SD high-speed
are functions that aren't critical for correct
operation of the card. As such, they shouldn't mark
the card as bad or dead when there is a failure
activating these features.
This is needed in particular on some really stupid
hardware (e.g. Winbond's) where not all data transfer
commands are supported.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
The wbsd hardware is so incredibly brain damaged that it has an internal
list of commands that result in data transfers. The result being that
commands that aren't on this list aren't supported.
Instead of locking up, waiting for a data interrupt that will never come,
we try to fail a bit more gracefully.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Many controllers have an upper limit on the number of blocks that can be
transferred in one request. Allow the host drivers to specify this and make
sure we avoid hitting this limit.
Also change the max_sectors field to avoid confusion. This makes it map
less directly to the block layer limits, but as they didn't apply directly
on MMC cards anyway, this isn't a great loss.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Most controllers have an upper limit on the block size. Allow the host
drivers to specify this and make sure we avoid hitting this limit.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Fix some spaces and tabs. No semantic changes are introduced.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
This patch also adds symbolic defines for supported pci ids.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
As there's only one work item (media_switcher) to handle and it's effectively
serialized with itself, I found it more convenient to use kthread instead of
workqueue. This also allows for a working implementation of suspend/resume,
which were totally broken in the past version.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Hardware does not say whether card was inserted or removed when reporting
socket events. Moreover, during suspend, media can be removed or switched
to some other card type without notification. Therefore, for each socket
in the change set the following is performed:
1. If there's active device in the socket it's unregistered
2. Media detection is performed
3. If detection recognizes supportable media, new device is registered
This patch also alters some macros and variable names to enhance clarity.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Eject function can take advantage of the socket_id field instead of explicit
pointer comparison.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>