linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Sean Hefty	923c100ef0	IB/addr: Simplify resolving IPv4 addresses Merge resolve local/remote address resolution into a single data flow to ensure consistent access and use of the local routing tables. Based on work from: David Wilder <dwilder@us.ibm.com> Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 13:26:51 -08:00
Sean Hefty	6f8372b69c	RDMA/cm: fix loopback address support The RDMA CM is intended to support the use of a loopback address when establishing a connection; however, the behavior of the CM when loopback addresses are used is confusing and does not always work, depending on whether loopback was specified by the server, the client, or both. The defined behavior of rdma_bind_addr is to associate an RDMA device with an rdma_cm_id, as long as the user specified a non-zero address. (ie they weren't just trying to reserve a port) Currently, if the loopback address is passed to rdma_bind_addr, no device is associated with the rdma_cm_id. Fix this. If a loopback address is specified by the client as the destination address for a connection, it will fail to establish a connection. This is true even if the server is listening across all addresses or on the loopback address itself. The issue is that the server tries to translate the IP address carried in the REQ message to a local net_device address, which fails. The translation is not needed in this case, since the REQ carries the actual HW address that should be used. Finally, clean up loopback support to be more transport neutral. Replace separate calls to get/set the sgid and dgid from the device address to a single call that behaves correctly depending on the format of the device address. And support both IPv4 and IPv6 address formats. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ Fixed RDS build by s/ib_addr_get/rdma_addr_get/ - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 13:26:06 -08:00
Sean Hefty	c4315d85f9	IB/addr: Store net_device type instead of translating to RDMA transport The struct rdma_dev_addr stores net_device address information: the source device address, destination hardware address, and broadcast address. For consistency, store the net_device type rather than converting it to the rdma_node_type. The type indicates the format of the various hardware addresses, which is what we're concerned with, and not the RDMA node type that the address may map to. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:57:18 -08:00
Sean Hefty	d2e0886245	IB/addr: Verify source and destination address families match If a source address is provided, verify that the address family matches that of the destination address. If the source is not specified, use the same address family as the destination. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:55:22 -08:00
Sean Hefty	6266ed6e41	RDMA/cma: Replace net_device pointer with index Provide the device interface when resolving route information to ensure that the correct outbound device is used. This will also simplify processing of sin6_scope_id for IPv6 support. Based on work from: David Wilder <dwilder@us.ibm.com> Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:55:22 -08:00
Jason Gunthorpe	e2e626972e	RDMA/cma: Fix AF_INET6 support in multicast joining If joining to an AF_INET6 address, we need to map the address to a MGID in the same way as the IP stack. The old code would just fall through to the IPv4 case and generate garbage. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:55:22 -08:00
Jason Gunthorpe	1c9b281997	RDMA/cma: Correct detection of SA Created MGID RDMA CM treats AF_INET6 addresses that are either 0 or prefixed with FF1x:A01B::/32 as MGIDs, but the detection for the prefix was buggy; fix it up. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:55:21 -08:00
Eric Dumazet	0f9ea5d2ab	RDMA/addr: Use appropriate locking with for_each_netdev() for_each_netdev() should be used with RTNL or dev_base_lock held, or else we risk a crash. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-18 14:24:34 -08:00
Sean Hefty	a7ca1f00ed	RDMA/ucma: Add option to manually set IB path Export rdma_set_ib_paths to user space to allow applications to manually set the IB path used for connections. This allows alternative ways for a user space application or library to obtain path record information, including retrieving path information from cached data, avoiding direct interaction with the IB SA. The IB SA is a single, centralized entity that can limit scaling on large clusters running MPI applications. Future changes to the rdma cm can expand on this framework to support the full range of features allowed by the IB CM, such as separate forward and reverse paths and APM. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-16 09:30:33 -08:00
Alexey Dobriyan	d43c36dc6b	headers: remove sched.h from interrupt.h After m68k's task_thread_info() doesn't refer to current, it's possible to remove sched.h from interrupt.h and not break m68k! Many thanks to Heiko Carstens for allowing this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-10-11 11:20:58 -07:00
David J. Wilder	85f20b39fd	RDMA/addr: Fix resolution of local IPv6 addresses This patch allows a local IPv6 address to be resolved by rdma_cm. To reproduce the problem: $ rping -s -v -a ::0 & $ rping -c -v -a <IPv6 address local to this system> rdma_resolve_addr error -1 Local IPv6 address was obtained with "ip addr show ib0" Addresses: https://bugs.openfabrics.org/show_bug.cgi?id=1759 Signed-off-by: David Wilder <dwilder@us.ibm.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-10-07 16:03:18 -07:00
Steve Wise	54e05f15cc	RDMA/iwcm: Don't call provider reject func with irqs disabled In commit `cb58160e` ("RDMA/iwcm: Reject the connection when the cm_id is destroyed") a call to the provider's reject handler was added to destroy_cm_id() to fix a provider endpoint leak. This call needs to be done with interrupts enabled. So unlock and relock around this call. This is safe because: 1) the provider will do nothing with this endpoint until the iwcm either accepts or rejects. 2) the lock is only released after the iwcm state is changed, so an errant iwcm app that is destroying -and- rejecting the connection concurrently will get a failure on one of the calls. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-10-07 15:38:12 -07:00
Alexey Dobriyan	a99bbaf5ee	headers: remove sched.h from poll.h Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-04 15:05:10 -07:00
Roland Dreier	0e442afd92	IB/mad: Fix lock-lock-timer deadlock in RMPP code Holding agent->lock across cancel_delayed_work() (which does del_timer_sync()) in ib_cancel_rmpp_recvs() leads to lockdep reports of possible lock-timer deadlocks if a consumer ever does something that connects agent->lock to a lock taken in IRQ context (cf http://marc.info/?l=linux-rdma&m=125243699026045). Fix this by changing the list items to a new state "CANCELING" while holding the lock, and then canceling the delayed work without holding the lock. If the delayed work runs after the lock is dropped, it will see the state is CANCELING and return immediately, so the list will stay stable while we traverse it with the lock not held. Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-09-23 11:10:15 -07:00
Roland Dreier	73f526da02	Merge branch 'mad' into for-linus Conflicts: drivers/infiniband/core/mad.c	2009-09-10 21:19:45 -07:00
Steve Wise	cb58160e72	RDMA/iwcm: Reject the connection when the cm_id is destroyed If the cm_id of a connect request is destroyed prior to the ULP accepting or rejecting the connection, then the provider never cleans up the connection. The iwcm should explicitly reject these connections if the cm_id is destroyed. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-09-09 11:37:38 -07:00
Hal Rosenstock	b76aabc395	IB/mad: Allow tuning of QP0 and QP1 sizes MADs are UD and can be dropped if there are no receives posted, so allow receive queue size to be set with a module parameter in case the queue needs to be lengthened. Send side tuning is done for symmetry with receive. Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-09-07 08:28:48 -07:00
Roland Dreier	6b2eef8fd7	IB/mad: Fix possible lock-lock-timer deadlock Lockdep reported a possible deadlock with cm_id_priv->lock, mad_agent_priv->lock and mad_agent_priv->timed_work.timer; this happens because the mad module does cancel_delayed_work(&mad_agent_priv->timed_work); while holding mad_agent_priv->lock. cancel_delayed_work() internally does del_timer_sync(&mad_agent_priv->timed_work.timer). This can turn into a deadlock because mad_agent_priv->lock is taken inside cm_id_priv->lock, so we can get the following set of contexts that deadlock each other: A: holding cm_id_priv->lock, waiting for mad_agent_priv->lock B: holding mad_agent_priv->lock, waiting for del_timer_sync() C: interrupt during mad_agent_priv->timed_work.timer that takes cm_id_priv->lock Fix this by using the new __cancel_delayed_work() interface (which internally does del_timer() instead of del_timer_sync()) in all the places where we are holding a lock. Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=13757 Reported-by: Bart Van Assche <bart.vanassche@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-09-07 08:27:50 -07:00
Jack Morgenstein	b1b8afb833	IB/uverbs: Return ENOSYS for unimplemented commands (not EINVAL) Since the original commit `883a99c7` ("[IB] uverbs: Add a mask of device methods allowed for userspace"), the uverbs core returns EINVAL for commands not implemented by a specific low-level driver. This creates a problem that there is no way to tell the difference between an unimplemented command and an implemented one which is incorrectly invoked (which also returns EINVAL). The fix is to have unimplemented commands return ENOSYS. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-09-05 20:24:24 -07:00
Yossi Etigin	e1d7806df3	IB/core: Fix send multicast group leave retry Until now, retries were only sent when joining a multicast group. This patch adds retries when leaving a multicast group as well. Signed-off-by: Ron Livne <ronli@voltaire.com> Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-09-05 20:24:24 -07:00
Roland Dreier	6276e08a9b	IB: Use DEFINE_SPINLOCK() for static spinlocks Rather than just defining static spinlock_t variables and then initializing them later in init functions, simply define them with DEFINE_SPINLOCK() and remove the calls to spin_lock_init(). This cleans up the source a tad and also shrinks the compiled code; eg on x86-64: add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-40 (-40) function old new delta ib_uverbs_init 336 326 -10 ib_mad_init_module 147 137 -10 ib_sa_init 123 103 -20 Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-09-05 20:24:23 -07:00
Roland Dreier	60f2b652f5	IB/mad: Check hop count field in directed route MAD to avoid array overflow The hop count field in a directed route MAD is only allowed to be in the range 0 to 63 (by spec). Check that this really is the case to avoid accessing outside the bounds of the hop array. Reported-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-09-05 20:24:10 -07:00
Peter Huewe	716abb1fdf	RDMA: Add __init/__exit macros to addr.c and cma.c Add __init and __exit annotations to the module_init/module_exit functions from drivers/infiniband/core/addr.c and cma.c. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-06-23 10:38:42 -07:00
Greg Kroah-Hartman	3f7c58a05f	infiniband: remove driver_data direct access of struct device In the near future, the driver core is going to not allow direct access to the driver_data pointer in struct device. Instead, the functions dev_get_drvdata() and dev_set_drvdata() should be used. These functions have been around since the beginning, so are backwards compatible with all older kernel versions. Cc: general@lists.openfabrics.org Cc: Roland Dreier <rolandd@cisco.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-06-15 21:30:26 -07:00
Yossi Etigin	d2ca39f262	RDMA/cma: Create cm id even when IB port is down When doing rdma_resolve_addr(), if the relevant IB port is down, the function fails and the cm_id is not bound to the correct device. Therefore, application does not have a device handle and cannot wait for the port to become active. The function fails because the underlying IPoIB interface is not joined to the broadcast group and therefore the SA does not have a multicast record to take a Q_Key from. The fix is to use lazy Q_Key resolution - cma_set_qkey() will set id_priv->qkey if it was not set, and will be called just before the Q_Key is really required. Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-04-08 13:42:33 -07:00
Yossi Etigin	84adeee9aa	RDMA/cma: Use rate from IPoIB broadcast when joining IPoIB multicast groups When joining an IPoIB multicast group, use the same rate as in the broadcast group. Otherwise, if the RDMA CM creates this group before IPoIB does, it might get a different rate. This will cause IPoIB to fail joining to the same group later on, because IPoIB uses strict rate selection. Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-04-01 13:55:32 -07:00
Roland Dreier	09f98bafea	Merge branches 'cxgb3', 'endian', 'ipath', 'ipoib', 'iser', 'mad', 'misc', 'mlx4', 'mthca', 'nes' and 'sysfs' into for-next	2009-03-24 20:44:41 -07:00
Roland Dreier	6432f36684	IB: Remove useless ibdev_is_alive() tests from sysfs code Some attribute show functions test ibdev_is_alive() to make sure that it's OK to access device state. However, the sysfs attributes will not be registered until the device is fully initialized, and they'll be unregistered before anything is torn down, so ibdev_is_alive() doesn't do anything useful. Remove it. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-03-04 15:22:39 -08:00
Jack Morgenstein	6b708b3dde	IB/sa_query: Fix AH leak due to update_sm_ah() race Our testing uncovered a race condition in ib_sa_event(): spin_lock_irqsave(&port->ah_lock, flags); if (port->sm_ah) kref_put(&port->sm_ah->ref, free_sm_ah); port->sm_ah = NULL; spin_unlock_irqrestore(&port->ah_lock, flags); schedule_work(&sa_dev->port[event->element.port_num - sa_dev->start_port].update_task); If two events occur back-to-back (e.g., client-reregister and LID change), both may pass the spinlock-protected code above before the scheduled work updates the port->sm_ah handle. Then if the scheduled work ends up running twice, the second operation will then find a non-NULL port->sm_ah, and will simply overwrite it in update_sm_ah -- resulting in an AH leak. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-03-03 14:30:01 -08:00
Ralph Campbell	4780c1953f	IB/mad: Fix ib_post_send_mad() returning 0 with no generate send comp If ib_post_send_mad() returns 0, the API guarantees that there will be a callback to send_buf->mad_agent->send_handler() so that the sender can call ib_free_send_mad(). Otherwise, the ib_mad_send_buf will be leaked and the mad_agent reference count will never go to zero and the IB device module cannot be unloaded. The above can happen without this patch if process_mad() returns (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED). If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent registered to receive the mad being sent, handle_outgoing_dr_smp() returns zero which causes a MAD packet which is at the end of the directed route to be incorrectly sent on the wire but doesn't cause a hang since the HCA generates a send completion. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-03-03 14:22:17 -08:00
Ralph Campbell	d9620a4c82	IB/mad: initialize mad_agent_priv before putting on lists There is a potential race in ib_register_mad_agent() where the struct ib_mad_agent_private is not fully initialized before it is added to the list of agents per IB port. This means the ib_mad_agent_private could be seen before the refcount, spin locks, and linked lists are initialized. The fix is to initialize the structure earlier. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-02-27 14:44:32 -08:00
Ralph Campbell	1d9bc6d648	IB/mad: Fix null pointer dereference in local_completions() handle_outgoing_dr_smp() can queue a struct ib_mad_local_private *local on the mad_agent_priv->local_work work queue with local->mad_priv == NULL if device->process_mad() returns IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY and (!ib_response_mad(&mad_priv->mad.mad) || !mad_agent_priv->agent.recv_handler). In this case, local_completions() will be called with local->mad_priv == NULL. The code does check for this case and skips calling recv_mad_agent->agent.recv_handler() but recv == 0 so kmem_cache_free() is called with a NULL pointer. Also, since recv isn't reinitialized each time through the loop, it can cause a memory leak if recv should have been zero. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>	2009-02-27 10:34:30 -08:00
Roland Dreier	9206dff157	IB: Remove sysfs files before unregistering device Move the ib_device_unregister_sysfs() call from ib_dealloc_device() to ib_unregister_device(). The old code allows device unregister to proceed even if some sysfs files are open, which leaves a window where userspace can open a file before a device is removed but then end up reading the file after the device is removed, which leads to various kernel crashes either because the device data structure is freed or because the low-level driver code is gone after module removal. By not returning from ib_unregister_device() until after all sysfs entries are removed, we make sure that data structures and/or module code is not freed until after all sysfs access is done. Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-02-25 13:27:46 -08:00
Harvey Harrison	9c3da09917	IB: Remove __constant_{endian} uses The base versions handle constant folding just fine, use them directly. The replacements are OK in the include/ files as they are not exported to userspace so we don't need the __ prefixed versions. This patch does not affect code generation at all. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-01-17 17:11:57 -08:00
Kay Sievers	d927e38c6c	infiniband: struct device - replace bus_id with dev_name(), dev_set_name() Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:39 -08:00
Roland Dreier	2c4ab6243f	RDMA/addr: Fix build breakage when IPv6 is disabled Commit `38617c64` ("RDMA/addr: Add support for translating IPv6 addresses") broke the build when CONFIG_IPV6=n, because the ib_addr module unconditionally attempted to call ipv6_chk_addr() and other IPv6 functions that are not defined when IPv6 is disabled. Fix this by only building IPv6 support if CONFIG_IPV6 is turned on, and add a Kconfig dependency to prevent the ib_addr code from being built in when IPv6 is built modular. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-12-29 23:37:14 -08:00
Linus Torvalds	0191b625ca	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits) net: Allow dependancies of FDDI & Tokenring to be modular. igb: Fix build warning when DCA is disabled. net: Fix warning fallout from recent NAPI interface changes. gro: Fix potential use after free sfc: If AN is enabled, always read speed/duplex from the AN advertising bits sfc: When disabling the NIC, close the device rather than unregistering it sfc: SFT9001: Add cable diagnostics sfc: Add support for multiple PHY self-tests sfc: Merge top-level functions for self-tests sfc: Clean up PHY mode management in loopback self-test sfc: Fix unreliable link detection in some loopback modes sfc: Generate unique names for per-NIC workqueues 802.3ad: use standard ethhdr instead of ad_header 802.3ad: generalize out mac address initializer 802.3ad: initialize ports LACPDU from const initializer 802.3ad: remove typedef around ad_system 802.3ad: turn ports is_individual into a bool 802.3ad: turn ports is_enabled into a bool 802.3ad: make ntt bool ixgbe: Fix set_ringparam in ixgbe to use the same memory pools. ... Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due to the conversion to %pI (in this networking merge) and the addition of doing IPv6 addresses (from the earlier merge of CIFS).	2008-12-28 12:49:40 -08:00
Aleksey Senin	1f5175adea	RDMA/cma: Add IPv6 support Handle AF_INET6 cases where required, and use struct sockaddr_storage wherever an IPv6 address might be stored. Signed-off-by: Aleksey Senin <aleksey@alst60.(none)> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-12-24 10:16:45 -08:00
Aleksey Senin	38617c64bf	RDMA/addr: Add support for translating IPv6 addresses Add support for translating AF_INET6 addresses to the IB address translation service. This requires using struct sockaddr_storage instead of struct sockaddr wherever an IPv6 address might be stored, and adding cases to handle IPv6 in addition to IPv4 to the various translation functions. Signed-off-by: Aleksey Senin <aleksey@alst60.(none)> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-12-24 10:16:37 -08:00
David S. Miller	9eeda9abd1	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/ath5k/base.c net/8021q/vlan_core.c	2008-11-06 22:43:03 -08:00
Al Viro	233e70f422	saner FASYNC handling on file close As it is, all instances of ->release() for files that have ->fasync() need to remember to evict file from fasync lists; forgetting that creates a hole and we actually have a bunch that does forget. So let's keep our lives simple - let __fput() check FASYNC in file->f_flags and call ->fasync() there if it's been set. And lose that crap in ->release() instances - leaving it there is still valid, but we don't have to bother anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-01 09:49:46 -07:00
Harvey Harrison	5b095d9892	net: replace %p6 with %pI6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 12:52:50 -07:00
Harvey Harrison	8867cd7c86	infiniband: use %p6 for printing message ids Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 23:02:35 -07:00
Linus Torvalds	724bdd097e	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/ehca: Reject dynamic memory add/remove when ehca adapter is present IB/ehca: Fix reported max number of QPs and CQs in systems with >1 adapter IPoIB: Set netdev offload features properly for child (VLAN) interfaces IPoIB: Clean up ethtool support mlx4_core: Add Ethernet PCI device IDs mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC mlx4_core: Multiple port type support mlx4_core: Ethernet MAC/VLAN management mlx4_core: Get ethernet MTU and default address from firmware mlx4_core: Support multiple pre-reserved QP regions Update NetEffect maintainer emails to Intel emails RDMA/cxgb3: Remove cmid reference on tid allocation failures IB/mad: Use krealloc() to resize snoop table IPoIB: Always initialize poll_timer to avoid crash on unload IB/ehca: Don't allow creating UC QP with SRQ mlx4_core: Add QP range reservation support RDMA/ucma: Test ucma_alloc_multicast() return against NULL, not with IS_ERR()	2008-10-23 08:16:03 -07:00
Roland Dreier	56f2fdaade	Merge branches 'cma', 'cxgb3', 'ehca', 'ipoib', 'mad', 'mlx4' and 'nes' into for-next	2008-10-22 15:56:41 -07:00
Parag Warudkar	01e8ef11bc	x86: sysfs: kill owner field from attribute Tejun's commit `7b595756ec` made sysfs attribute->owner unnecessary. But the field was left in the structure to ease the merge. It's been over a year since that change and it is now time to start killing attribute->owner along with its users - one arch at a time! This patch is attempt #1 to get rid of attribute->owner only for CONFIG_X86_64 or CONFIG_X86_32 . We will deal with other arches later on as and when possible - avr32 will be the next since that is something I can test. Compile (make allyesconfig / make allmodconfig / custom config) and boot tested. akpm: the idea is that we put the declaration of attribute.owner inside `#ifndef CONFIG_X86'. But that proved to be too ambitious for now because new usages kept on turning up in subsystem trees. [akpm: remove the ifdef for now] Signed-off-by: Parag Warudkar <parag.lkml@gmail.com> Cc: Greg KH <greg@kroah.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tejun Heo <htejun@gmail.com> Cc: Len Brown <lenb@kernel.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Jean Delvare <khali@linux-fr.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: David Brownell <david-b@pacbell.net> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:42 -07:00
Greg Kroah-Hartman	91bd418fdc	device create: infiniband: convert device_create_drvdata to device_create Now that device_create() has been audited, rename things back to the original call to be sane. Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-10-16 09:24:42 -07:00
Roland Dreier	528051746b	IB/mad: Use krealloc() to resize snoop table Use krealloc() instead of kmalloc() followed by memcpy() when resizing the MAD module's snoop table. Also put parentheses around the new table size to avoid calculating the wrong size to allocate, which fixes a bug pointed out by Haven Hash <haven.hash@isilon.com>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-10-14 14:05:36 -07:00
Julien Brunel	6aea938f54	RDMA/ucma: Test ucma_alloc_multicast() return against NULL, not with IS_ERR() In case of error, the function ucma_alloc_multicast() returns a NULL pointer, but never returns an ERR pointer. So after a call to this function, an IS_ERR test should be replaced by a NULL test. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @match bad_is_err_test@ expression x, E; @@ x = ucma_alloc_multicast(...) ... when != x = E IS_ERR(x) // </smpl> Signed-off-by: Julien Brunel <brunel@diku.dk> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-10-10 12:00:19 -07:00
Roland Dreier	eedd5d0a70	Merge branches 'cma', 'cxgb3', 'ehca', 'ipath', 'ipoib', 'mad', 'misc', 'mlx4', 'mthca' and 'nes' into for-next	2008-10-09 17:41:15 -07:00
Hefty, Sean	a7e80ce26c	IB/cm: Correctly free cm_device structure commit `110cf374` ("infiniband: make cm_device use a struct device and not a kobject.") introduced a memory leak, since it deleted cm_release_dev_obj(), which was where cm_dev was freed. Fix this by freeing the leaked structure after calling device_unregister(). Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-09-30 10:36:54 -07:00
Michael Brooks	7097228c54	IB/mad: Don't discard BMA responses in kernel This fixes the problem of incoming BMA responses being dropped due to a bad "is response" check. Fix the test to use the ib_response_mad() predicate, which correctly handles BMA MADs. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=988>. Signed-off-by: Michael Brooks <michael.brooks@qlogic.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-09-20 20:06:16 -07:00
Roland Dreier	06a91a02e9	Merge branches 'cma', 'cxgb3', 'ipath', 'ipoib', 'mad' and 'mlx4' into for-linus	2008-08-07 14:12:03 -07:00
Julien Brunel	cd55ef5a10	IB/mad: Test ib_create_send_mad() return with IS_ERR(), not == NULL In case of error, the function ib_create_send_mad() returns an ERR pointer, but never returns a NULL pointer. So testing the return value for error should be done with IS_ERR, not by comparing with NULL. A simplified version of the semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @correct_null_test@ expression x,E; statement S1, S2; @@ x = ib_create_send_mad(...) <... when != x = E if ( ( - x@p2 != NULL + ! IS_ERR ( x ) | - x@p2 == NULL + IS_ERR( x ) ) ) S1 else S2 ...> ? x = E; // </smpl> Signed-off-by: Julien Brunel <brunel@diku.dk> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-08-07 14:11:56 -07:00
Roland Dreier	3f44675439	RDMA/cma: Remove padding arrays by using struct sockaddr_storage There are a few places where the RDMA CM code handles IPv6 by doing struct sockaddr addr; u8 pad[sizeof(struct sockaddr_in6) - sizeof(struct sockaddr)]; This is fragile and ugly; handle this in a better way with just struct sockaddr_storage addr; [ Also roll in patch from Aleksey Senin <alekseys@voltaire.com> to switch to struct sockaddr_storage and get rid of padding arrays in struct rdma_addr. ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-08-04 11:02:14 -07:00
Roland Dreier	5ba18b186c	RDMA/ucm: BKL is not needed for ib_ucm_open() Remove explicit cycle_kernel_lock() call and document why the code is safe. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-24 20:36:59 -07:00
Roland Dreier	f7a6117ee5	RDMA/ucma: BKL is not needed for ucma_open() Remove explicit lock_kernel() calls and document why the code is safe. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-24 20:36:59 -07:00
Linus Torvalds	5c402355ad	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: MAINTAINERS: Remove Glenn Streiff from NetEffect entry mlx4_core: Improve error message when not enough UAR pages are available IB/mlx4: Add support for memory management extensions and local DMA L_Key IB/mthca: Keep free count for MTT buddy allocator mlx4_core: Keep free count for MTT buddy allocator mlx4_code: Add missing FW status return code IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg mlx4_core: Add module parameter to enable QoS support RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one() IB/ehca: Release mutex in error path of alloc_small_queue_page() IB/ehca: Use default value for Local CA ACK Delay if FW returns 0 IB/ehca: Filter PATH_MIG events if QP was never armed IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event	2008-07-24 12:56:07 -07:00
Roland Dreier	2cc177364e	Merge branches 'bkl-removal', 'cma', 'ehca', 'for-2.6.27', 'mlx4', 'mthca' and 'nes' into for-linus	2008-07-24 08:38:47 -07:00
Dotan Barak	1ca8d15619	RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes Remove IB_ACCESS_LOCAL_WRITE from qp.qp_access_flags because this attribute is only used to set remote permissions. Signed-off-by: Dotan Barak <dotanba@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:18:34 -07:00
Ralph Campbell	64b784b583	IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one() If update_sm_ah() fails, it leaves the port's sm_ah as NULL. Then if the device or module is removed, ib_sa_remove_one() will dereference a NULL pointer when it calls kref_put(). Fix this by testing if sm_ah is NULL before dropping the reference. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:18:33 -07:00
Amir Vadai	38ca83a588	RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event Consumers that want to re-use their QPs in new connections need to know when the QP has exited the timewait state. Report the timewait event through the rdma_cm. Signed-off-by: Amir Vadai <amirv@mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:14:23 -07:00
Or Gerlitz	dd5bdff83b	RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event Add an RDMA_CM_EVENT_ADDR_CHANGE event that can be used by rdma-cm consumers that wish to have their RDMA sessions always use the same links (eg <hca/port>) as the IP stack does. In the current code, this does not happen when bonding is used and fail-over happened but the IB link used by an already existing session is operating fine. Use the netevent notification for sensing that a change has happened in the IP stack, then scan the rdma-cm ID list to see if there is an ID that is "misaligned" with respect to the IP stack, and deliver RDMA_CM_EVENT_ADDR_CHANGE for this ID. The consumer can act on the event or just ignore it. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:14:22 -07:00
Greg Kroah-Hartman	110cf374a8	infiniband: make cm_device use a struct device and not a kobject. This object really should be a struct device, or at least contain a pointer to a struct device, as it is trying to create a separate device tree outside of the main device tree. This patch fixes this problem. It is needed for the class core rework that is being done in the driver core. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:49 -07:00
Greg Kroah-Hartman	d4c4196f24	infiniband: rename "device" to "ib_device" in cm_device This pointer really is a struct ib_device, not a struct device, so name it properly to help prevent confusion. This makes the followon patch in this series much smaller and easier to understand as well. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:49 -07:00
Or Gerlitz	de910bd921	RDMA/cma: Simplify locking needed for serialization of callbacks The RDMA CM has some logic in place to make sure that callbacks on a given CM ID are delivered to the consumer in a serialized manner. Specifically it has code to protect against a device removal racing with a running callback function. This patch simplifies this logic by using a mutex per ID instead of a wait queue and atomic variable. This means that cma_disable_remove() now is more properly named to cma_disable_callback(), and cma_enable_remove() can now be removed because it just would become a trivial wrapper around mutex_unlock(). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:53 -07:00
Or Gerlitz	64c5e613b9	RDMA/addr: Keep pointer to netdevice in struct rdma_dev_addr Keep a pointer to the local (src) netdevice in struct rdma_dev_addr, and copy it in as part of rdma_copy_addr(). Use rdma_translate_ip() in cma_new_conn_id() to reduce some code duplication and also make sure the src_dev member gets set. In a high-availability configuration the netdevice pointer can be used by the RDMA CM to align RDMA sessions to use the same links as the IP stack does under fail-over and route change cases. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:53 -07:00
Steve Wise	7f624d023b	RDMA/core: Add iWARP protocol statistics attributes in sysfs This patch adds a sysfs attribute group called "proto_stats" under /sys/class/infiniband/$device/ and populates this group with protocol statistics if they exist for a given device. Currently, only iWARP stats are defined, but the code is designed to allow InfiniBand protocol stats if they become available. These stats are per-device and more importantly -not- per port. Details: - Add union rdma_protocol_stats in ib_verbs.h. This union allows defining transport-specific stats. Currently only iwarp stats are defined. - Add struct iw_protocol_stats to define the current set of iwarp protocol stats. - Add new ib_device method called get_proto_stats() to return protocol statistics. - Add logic in core/sysfs.c to create iwarp protocol stats attributes if the device is an RNIC and has a get_proto_stats() method. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:48 -07:00
Roland Dreier	468f2239bc	RDMA/cma: Add missing newlines to printk()s Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Sean Hefty <sean.hefty@intel.com>	2008-07-14 23:48:47 -07:00
Ralph Campbell	e5a5e7d59a	IB/core: Reset to error QP state transition is not allowed I was reviewing the QP state transition diagram in the IB 1.2.1 spec and the code for qp_state_table[], and noticed that the code allows a QP to be modified from IB_QPS_RESET to IB_QPS_ERR whereas the notes for figure 124 (pg 457) specifically says that this transition isn't allowed. This is a clarification from earlier versions of the IB spec, which were ambiguous in this area and suggested that the RESET to ERR transition was allowed. Fix up the qp_state_table[] to make RESET->ERR not allowed. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:46 -07:00
Steve Wise	00f7ec36c9	RDMA/core: Add memory management extensions support This patch adds support for the IB "base memory management extension" (BMME) and the equivalent iWARP operations (which the iWARP verbs mandates all devices must implement). The new operations are: - Allocate an ib_mr for use in fast register work requests. - Allocate/free physical buffer lists for use in fast register work requests. This allows device drivers to allocate this memory as needed for use in posting send requests (eg via dma_alloc_coherent). - New send queue work requests: * send with remote invalidate * fast register memory region * local invalidate memory region * RDMA read with invalidate local memory region (iWARP only) Consumer interface details: - A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added to indicate device support for these features. - New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV, IB_WR_RDMA_READ_WITH_INV are added. - A new consumer API function, ib_alloc_mr() is added to allocate fast register memory regions. - New consumer API functions, ib_alloc_fast_reg_page_list() and ib_free_fast_reg_page_list() are added to allocate and free device-specific memory for fast registration page lists. - A new consumer API function, ib_update_fast_reg_key(), is added to allow the key portion of the R_Key and L_Key of a fast registration MR to be updated. Consumers call this if desired before posting an IB_WR_FAST_REG_MR work request. Consumers can use this as follows: - MR is allocated with ib_alloc_mr(). - Page list memory is allocated with ib_alloc_fast_reg_page_list(). - MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key(). - MR made VALID and bound to a specific page list via ib_post_send(IB_WR_FAST_REG_MR) - MR made INVALID via ib_post_send(IB_WR_LOCAL_INV), ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with invalidate operation. - MR is deallocated with ib_dereg_mr() - page lists dealloced via ib_free_fast_reg_page_list(). Applications can allocate a fast register MR once, and then can repeatedly bind the MR to different physical block lists (PBLs) via posting work requests to a send queue (SQ). For each outstanding MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be allocated (the fast_reg_page_list is owned by the low-level driver from the consumer posting a work request until the request completes). Thus pipelining can be achieved while still allowing device-specific page_list processing. The 32-bit fast register memory key/STag is composed of a 24-bit index and an 8-bit key. The application can change the key each time it fast registers thus allowing more control over the peer's use of the key/STag (ie it can effectively be changed each time the rkey is rebound to a page list). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:45 -07:00
Roland Dreier	f3781d2e89	RDMA: Remove subversion $Id tags They don't get updated by git and so they're worse than useless. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:44 -07:00
Moni Shoua	164ba0893c	IB/sa: Fail requests made while creating new SM AH This patch solves a race that occurs after an event occurs that causes the SA query module to flush its SM address handle (AH). When SM AH becomes invalid and needs an update it is handled by the global workqueue. On the other hand this event is also handled in the IPoIB driver by queuing work in the ipoib_workqueue that does multicast joins. Although queuing is in the right order, it is done to 2 different workqueues and so there is no guarantee that the first to be queued is the first to be executed. This causes a problem because IPoIB may end up sending a request to the old SM, which will take a long time to time out (since the old SM is gone); this leads to a much longer than necessary interruption in multicast traffic. The patch sets the SA query module's SM AH to NULL when the event occurs, and until update_sm_ah() is done, any request that needs sm_ah fails with -EAGAIN return status. For consumers, the patch doesn't make things worse. Before the patch, MADs are sent to the wrong SM so the request gets lost. Consumers can be improved if they examine the return code and respond to EAGAIN properly but even without an improvement the situation is not getting worse. Signed-off-by: Moni Levy <monil@voltaire.com> Signed-off-by: Moni Shoua <monis@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:43 -07:00
Sean Hefty	a947491709	RDMA: Fix license text The license text for several files references a third software license that was inadvertently copied in. Update the license to what was intended. This update was based on a request from HP. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:43 -07:00
Jonathan Corbet	2fceef397f	Merge commit 'v2.6.26' into bkl-removal	2008-07-14 15:29:34 -06:00
Roland Dreier	feae1ef116	IB/umad: BKL is not needed for ib_umad_open() Remove explicit lock_kernel() calls and document why the code is safe. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2008-07-11 16:40:58 -06:00
Roland Dreier	5b2d281acb	IB/uverbs: BKL is not needed for ib_uverbs_open() Remove explicit lock_kernel() calls and document why the code is safe. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2008-07-04 10:32:28 -06:00
Arnd Bergmann	6b0ee363b2	infiniband-ucma: BKL pushdown Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2008-06-20 14:05:57 -06:00
Jonathan Corbet	f2b9857eee	Add a bunch of cycle_kernel_lock() calls All of the open() functions which don't need the BKL on their face may still depend on its acquisition to serialize opens against driver initialization. So make those functions acquire then release the BKL to be on the safe side. Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2008-06-20 14:05:53 -06:00
Jonathan Corbet	057e7c7ff9	infiniband: more BKL pushdown Be extra-cautious and protect the remaining open() functions. Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2008-06-20 14:05:51 -06:00
Jonathan Corbet	d21c95c569	Add "no BKL needed" comments to several drivers This documents the fact that somebody looked at the relevant open() functions and concluded that, due to their trivial nature, no locking was needed. Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2008-06-20 14:05:50 -06:00
Jack Morgenstein	fb77bcef9f	IB/uverbs: Fix check of is_closed flag check in ib_uverbs_async_handler() Commit `1ae5c187` ("IB/uverbs: Don't store struct file * for event files") changed the way that closed files are handled in the uverbs code. However, after the conversion, is_closed flag is checked incorrectly in ib_uverbs_async_handler(). As a result, no async events are ever passed to applications. Found by: Ronni Zimmerman <ronniz@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-06-18 15:36:38 -07:00
Roland Dreier	8079ffa0e1	IB/umem: Avoid sign problems when demoting npages to integer On a 64-bit architecture, if ib_umem_get() is called with a size value that is so big that npages is negative when cast to int, then the length of the page list passed to get_user_pages(), namely min_t(int, npages, PAGE_SIZE / sizeof (struct page *)) will be negative, and get_user_pages() will immediately return 0 (at least since `900cf086`, "Be more robust about bad arguments in get_user_pages()"). This leads to an infinite loop in ib_umem_get(), since the code boils down to: while (npages) { ret = get_user_pages(...); npages -= ret; } Fix this by taking the minimum as unsigned longs, so that the value of npages is never truncated. The impact of this bug isn't too severe, since the value of npages is checked against RLIMIT_MEMLOCK, so a process would need to have an astronomical limit or have CAP_IPC_LOCK to be able to trigger this, and such a process could already cause lots of mischief. But it does let buggy userspace code cause a kernel lock-up; for example I hit this with code that passes a negative value into a memory registration function where it is promoted to a huge u64 value. Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-06-06 21:38:37 -07:00
Linus Torvalds	c2448278e3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mad: Fix kernel crash when .process_mad() returns SUCCESS|CONSUMED IPoIB: Test for NULL broadcast object in ipiob_mcast_join_finish() MAINTAINERS: Add cxgb3 and iw_cxgb3 NIC and iWARP driver entries IB/mlx4: Fix creation of kernel QP with max number of send s/g entries IB/mthca: Fix max_sge value returned by query_device RDMA/cxgb3: Fix uninitialized variable warning in iwch_post_send() IB/mlx4: Fix uninitialized-var warning in mlx4_ib_post_send() IB/ipath: Fix UC receive completion opcode for RDMA WRITE with immediate IB/ipath: Fix printk format for ipath_sdma_status	2008-05-23 11:11:44 -07:00
Dave Olson	5a4f2b6752	IB/mad: Fix kernel crash when .process_mad() returns SUCCESS|CONSUMED If a low-level driver returns IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED, handle_outgoing_dr_smp() doesn't clean up properly. The fix is to kfree the local data and break, rather than falling through. This was observed with the ipath driver, but could happen with any driver. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1027>. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-05-23 10:52:59 -07:00
Greg Kroah-Hartman	6c06aec248	IB: fix race in device_create There is a race from when a device is created with device_create() and then the drvdata is set with a call to dev_set_drvdata() in which a sysfs file could be open, yet the drvdata will be NULL, causing all sorts of bad things to happen. This patch fixes the problem by using the new function, device_create_drvdata(). Cc: Kay Sievers <kay.sievers@vrfy.org> Reviewed-by: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-05-20 13:31:55 -07:00
Arthur Kepner	cb9fbc5c37	IB: expand ib_umem_get() prototype Add a new parameter, dmasync, to the ib_umem_get() prototype. Use dmasync = 1 when mapping user-allocated CQs with ib_umem_get(). Signed-off-by: Arthur Kepner <akepner@sgi.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Jes Sorensen <jes@sgi.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:12 -07:00
Linus Torvalds	e80ab411e5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6: (36 commits) SCSI: convert struct class_device to struct device DRM: remove unused dev_class IB: rename "dev" to "srp_dev" in srp_host structure IB: convert struct class_device to struct device memstick: convert struct class_device to struct device driver core: replace remaining __FUNCTION__ occurrences sysfs: refill attribute buffer when reading from offset 0 PM: Remove destroy_suspended_device() Firmware: add iSCSI iBFT Support PM: Remove legacy PM (fix) Kobject: Replace list_for_each() with list_for_each_entry(). SYSFS: Explicitly include required header file slab.h. Driver core: make device_is_registered() work for class devices PM: Convert wakeup flag accessors to inline functions PM: Make wakeup flags available whenever CONFIG_PM is set PM: Fix misuse of wakeup flag accessors in serial core Driver core: Call device_pm_add() after bus_add_device() in device_add() PM: Handle device registrations during suspend/resume block: send disk "change" event for rescan_partitions() sysdev: detect multiple driver registrations ... Fixed trivial conflict in include/linux/memory.h due to semaphore header file change (made irrelevant by the change to mutex).	2008-04-21 15:49:58 -07:00
Tony Jones	f4e91eb4a8	IB: convert struct class_device to struct device This converts the main ib_device to use struct device instead of struct class_device as class_device is going away. Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-04-19 19:10:30 -07:00
Matthew Wilcox	6188e10d38	Convert asm/semaphore.h users to linux/semaphore.h Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2008-04-18 22:22:54 -04:00
Eli Cohen	2dd5716227	IB/core: Add support for modify CQ Add support for modifying CQ parameters for controlling event generation moderation. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:33 -07:00
Roland Dreier	0f39cf3d54	IB/core: Add support for "send with invalidate" work requests Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a "send with invalidate" work request as defined in the iWARP verbs and the InfiniBand base memory management extensions. Also put "imm_data" and a new "invalidate_rkey" member in a new "ex" union in struct ib_send_wr. The invalidate_rkey member can be used to pass in an R_Key/STag to be invalidated. Add this new union to struct ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in ib_uverbs_post_send(). Fix up low-level drivers to deal with the change to struct ib_send_wr, and just remove the imm_data initialization from net/sunrpc/xprtrdma/, since that code never does any send with immediate operations. Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since the iWARP drivers currently in the tree set the bit. The amso1100 driver at least will silently fail to honor the IB_SEND_INVALIDATE bit if passed in as part of userspace send requests (since it does not implement kernel bypass work request queueing). Remove the flag from all existing drivers that set it until we know which ones are OK. The value chosen for the new flag is not consecutive to avoid clashing with flags defined in the XRC patches, which are not merged yet but which are already in use and are likely to be merged soon. This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:32 -07:00
Dotan Barak	7ce5eacb45	IB/core: Check optional verbs before using them Make sure that a device implements the modify_srq and reg_phys_mr optional methods before calling them. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:28 -07:00
Eli Cohen	b846f25aa2	IB/core: Add creation flags to struct ib_qp_init_attr Add a create_flags member to struct ib_qp_init_attr that will allow a kernel verbs consumer to pass special flags when creating a QP. Add a flag value for telling low-level drivers that a QP will be used for IPoIB UD LSO. The create_flags member will also be useful for XRC and ehca low-latency QP support. Since no create_flags handling is implemented yet, add code to all low-level drivers to return -EINVAL if create_flags is non-zero. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:27 -07:00
Robert P. J. Day	157de22946	IB: Use shorter list_splice_init() for brevity Convert list_splice() + INIT_LIST_HEAD() to the equivalent list_splice_init() Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:26 -07:00
Julia Lawall	10f32065a2	RDMA/iwcm: Test rdma_create_id() for IS_ERR rather than 0 The function rdma_create_id() always returns either a valid pointer or a value made with ERR_PTR, so its result should be tested with IS_ERR, not with a test for 0. The problem was found using the following semantic match. (http://www.emn.fr/x-info/coccinelle/) //<smpl> @a@ expression E, E1; statement S,S1; position p; @@ E = rdma_create_id(...) ... when != E = E1 if@p (E) S else S1 @n@ position a.p; expression E,E1; statement S,S1; @@ E = NULL ... when != E = E1 if@p (E) S else S1 @depends on !n@ expression E; statement S,S1; position a.p; @@ * if@p (E) S else S1 //</smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:25 -07:00
Roland Dreier	a7dab9e887	IB/uverbs: Use alloc_file() instead of get_empty_filp() Christoph Hellwig wants to unexport get_empty_filp(), which is an ugly internal interface. Change the modular user in ib_uverbs_alloc_event_file() to use the better alloc_file() interface; this makes the code cleaner too. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:01:08 -07:00
Roland Dreier	1ae5c187ac	IB/uverbs: Don't store struct file * for event files The file member of struct ib_uverbs_event_file was only used to keep track of whether the file had been closed or not. The only thing we ever did with the value was check if it was NULL or not. Simplify the code and get rid of the need to keep track of the struct file * we allocate by replacing the file member with an is_closed member. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:01:08 -07:00
Roland Dreier	9cda779cc2	RDMA/ucma: Endian annotation Add __force cast of node_guid to __u64, since we are sticking it into a structure whose definition is shared with userspace. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:01:07 -07:00
Roland Dreier	a88f488857	IB/cm: Endianness annotations Mostly update the RB tree comparisons to force __be types to normal integers, but the change to cm_format_sidr_req() is a real fix: param->path->pkey is already __be16. Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Sean Hefty <sean.hefty@intel.com>	2008-04-16 21:01:07 -07:00

534 Commits