OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Jack Morgenstein	e6028c0e00	IB/mlx4: mlx4_ib_fmr_alloc() should call mlx4_fmr_enable() Currently mlx4_ib_fmr_alloc() calls mlx4_mr_enable() instead of mlx4_fmr_enable(). The two functions are equivalent at the moment, but this is not really correct (and the change is needed to fix a bug). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-14 10:39:36 -08:00
Eli Cohen	a9d1884925	IPoIB: Remove unused struct ipoib_cm_tx.ibwc member struct ipoib_cm_tx.ibwc is unused since commit `1b524963` ("IPoIB/cm: Use common CQ for CM send completions"), so remove it. Signed-off-by: Eli Cohen <eli@mellanox.co.il>	2008-02-14 10:30:50 -08:00
Jack Morgenstein	167c42655c	IPoIB: On P_Key change event, reset state properly In P_Key event handling, if the old P_Key is no longer available, the driver must call ipoib_ib_dev_stop() -- just as it does when the P_Key is still available (see procedure __ipoib_ib_dev_flush()). When a P_Key becomes available, the driver will perform ipoib_open(), which assumes that the QP is in RESET, the cm_id has been destroyed/deleted, etc. If ipoib_ib_dev_stop() is not called as described above, then these assumptions will be false, and the attempt to bring the interface up will fail. Found by Mellanox QA. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-14 10:15:06 -08:00
Marcin Slusarz	5163dc1a64	IB/mthca: Convert to use be16_add_cpu() replace: big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) + expression_in_cpu_byteorder); with: beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder); Generated with a semantic patch. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-13 07:47:47 -08:00
Steve Wise	8704e9a879	RDMA/cxgb3: Fail loopback connections The cxgb3 HW and driver don't support loopback RDMA connections. So fail any connection attempt where the destination address is local. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-13 07:47:42 -08:00
Roland Dreier	7c7a9bccd2	IB/cm: Fix infiniband_cm class kobject ref counting Commit `9af57b7a` ("IB/cm: Add basic performance counters") introduced a bug in how the reference count for cm_class.subsys.kobj was handled: the path that released a device did a kobject_put() on that kobject, but there was no kobject_get() in the path the handles adding a device. So the reference count ended up too low, which leads to bad things. Fix up and simplify the reference counting to avoid this. (Actually, I introduced the bug when fixing the patch up to match some of Greg's kobject changes, but who's counting) Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-12 14:38:27 -08:00
Roland Dreier	ab64b96067	IB/cm: Remove debug printk()s that snuck upstream Pesky little devils, sneaking around... Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-12 14:38:27 -08:00
Roland Dreier	fe174357eb	IB/mthca: Add missing sg_init_table() in mthca_map_user_db() Usually harmless, since the scatterlist is always hard-coded to a length of 1, but it triggers a BUG() if CONFIG_DEBUG_SG=y, so we better fix it. This fixes <http://bugzilla.kernel.org/show_bug.cgi?id=9934>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-12 14:38:22 -08:00
Eli Cohen	7143740d26	IPoIB: Add send gather support This patch acts as a preparation for using checksum offload for IB devices capable of inserting/verifying checksum in IP packets. The patch does not actaully turn on NETIF_F_SG - we defer that to the patches adding checksum offload capabilities. We only add support for send gathers for datagram mode, since existing HW does not support checksum offload on connected QPs. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-08 14:32:37 -08:00
Eli Cohen	eb14032f9e	IPoIB: Add high DMA feature flag All current InfiniBand devices can handle all DMA addresses, and it's hard to imagine anyone would be silly enough to build a new device that couldn't. Therefore, enable the NETIF_F_HIGHDMA feature for IPoIB. This has no effect for no, but is needed when we enable gather/scatter support and checksum stateless offloads. Signed-off-by: Eli Cohen <eli@mellnaox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-08 13:39:26 -08:00
Jack Morgenstein	ea54b10c77	IB/mlx4: Use multiple WQ blocks to post smaller send WQEs ConnectX HCA supports shrinking WQEs, so that a single work request can be made of multiple units of wqe_shift. This way, WRs can differ in size, and do not have to be a power of 2 in size, saving memory and speeding up send WR posting. Unfortunately, if we do this then the wqe_index field in CQEs can't be used to look up the WR ID anymore, so our implementation does this only if selective signaling is off. Further, on 32-bit platforms, we can't use vmap() to make the QP buffer virtually contigious. Thus we have to use constant-sized WRs to make sure a WR is always fully within a single page-sized chunk. Finally, we use WRs with the NOP opcode to avoid wrapping around the queue buffer in the middle of posting a WR, and we set the NoErrorCompletion bit to avoid getting completions with error for NOP WRs. However, NEC is only supported starting with firmware 2.2.232, so we use constant-sized WRs for older firmware. And, since MLX QPs only support SEND, we use constant-sized WRs in this case. When stamping during NOP posting, do stamping following setting of the NOP WQE valid bit. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-08 13:30:02 -08:00
Roland Dreier	1c69fc2a90	IB/mlx4: Consolidate code to get an entry from a struct mlx4_buf We use struct mlx4_buf for kernel QP, CQ and SRQ buffers, and the code to look up an entry is duplicated in get_cqe_from_buf() and the QP and SRQ versions of get_wqe(). Factor this out into mlx4_buf_offset(). This will also make it easier to switch over to using vmap() for buffers. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-06 21:07:54 -08:00
Glenn Streiff	3c2d774cad	RDMA/nes: Add a driver for NetEffect RNICs Add a standard NIC and RDMA/iWARP driver for NetEffect 1/10Gb ethernet adapters. Signed-off-by: Glenn Streiff <gstreiff@neteffect.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:45 -08:00
Olaf Kirch	2c78853472	IB/mthca: Return proper error codes from mthca_fmr_alloc() If the allocation of the MTT or the mailbox failed, mthca_fmr_alloc() would return 0 (success) no matter what. This leads to crashes a little down the road, when we try to dereference eg mr->mtt, which was really ERR_PTR(-Ewhatever). Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:44 -08:00
Roland Dreier	f33afc26dc	IB: Avoid marking __devinitdata as const Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:44 -08:00
Roland Dreier	68f3948dab	IB/mlx4: Actually print out the driver version The string mlx4_ib_version was defined, but never used. Print out the version once when the first device is initialized. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:44 -08:00
Eli Cohen	1d368c5465	IB/ib_mthca: Pre-link receive WQEs in Tavor mode We have recently discovered that Tavor mode requires each WQE in a posted list of receive WQEs to have a valid NDA field at all times. This requirement holds true for regular QPs as well as for SRQs. This patch prelinks the receive queue in a regular QP and keeps the free list in SRQ always properly linked. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Reviewed-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:44 -08:00
Eli Cohen	1203c42e7b	IB/mthca: Remove checks for srq->first_free < 0 The SRQ receive posting functions make sure that srq->first_free never becomes negative, so we can remove tests of whether it is negative. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:44 -08:00
Or Gerlitz	1d96354e61	IB/fmr_pool: Allocate page list for pool FMRs only when caching enabled Allocate memory for the page_list field of struct ib_pool_fmr only when caching is enabled for the FMR pool, since the field is not used otherwise. This can save significant amounts of memory for large pools with caching turned off. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:44 -08:00
David Dillow	9fe4bcf45e	IB/srp: Retry stale connections When a host just goes away (crash, power loss, etc.) without tearing down its IB connections, it can get stale connection errors when it tries to reconnect to targets upon rebooting. Retrying the connection a few times will prevent sysadmins from playing the "which disk(s) went missing?" game. This would have made things slightly quicker when tracking down some of the recent bugs, but it also helps quite a bit when you've got a large number of targets hanging off a wedged server. Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:43 -08:00
Jack Morgenstein	893da75956	mlx4_core: Don't read reserved fields in mlx4_QUERY_ADAPTER() The firmware QUERY_ADAPTER command does not return vendor_id, device_id, and revision_id; eliminate these fields from the query. Initialize the rev_id field of the mlx4 device via init_node_data (MAD IFC query), as is done in the query_device verb implementation. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:43 -08:00
Jack Morgenstein	6ccef1de2c	IB/mthca: Don't read reserved fields in mthca_QUERY_ADAPTER() For memfree devices, the firmware QUERY_ADAPTER command does not return vendor_id, device_id, and revision_id; do not return these fields in the QUERY_ADAPTER function for memfree devices. Instead, for memfree devices, initialize the rev_id field of the mthca device via init_node_data (MAD IFC query), as is done in the query_device verb implementation. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:43 -08:00
Or Gerlitz	7bc531dd88	IPoIB: Remove a misleading debug print Commit `732a2170` ("IB/ipoib: Bound the net device to the ipoib_neigh structue") left a misleading debug print (n->dev would be a bond device only if boding is used). Clean it up. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:43 -08:00
Or Gerlitz	bafff97417	IPoIB: Handle bonding failover race for connected neighbours too Move up the code that checks for a situation where the remote GID stored in the ipoib_neigh is different than the one present in the neighbour (handle gratuitous ARP) or that a bonding fail over has happened but the neighbour still has a pointer to an ipoib_neigh created by a different device than the current slave. This will cause the driver to apply the check also for connected mode neighbours. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:43 -08:00
Roland Dreier	0d89fe2c0c	IB/mthca: Fix and simplify page size calculation in mthca_reg_phys_mr() In mthca_reg_phys_mr(), we calculate the page size for the HCA hardware to use to map the buffer list passed in by the consumer. For example, if the consumer passes in [0] addr 0x1000, size 0x1000 [1] addr 0x2000, size 0x1000 then the algorithm would come up with a page size of 0x2000 and a list of two pages, at 0x0000 and 0x2000. Usually, this would work fine since the memory region would start at an offset of 0x1000 and have a length of 0x2000. However, the old code did not take into account the alignment of the IO virtual address passed in. For example, if the consumer passed in a virtual address of 0x6000 for the above, then the offset of 0x1000 would not be used correctly because the page mask of 0x1fff would result in an offset of 0. We can fix this quite neatly by making sure that the page shift we use is no bigger than the first bit where the start of the first buffer and the IO virtual address differ. Also, we can further simplify the code by removing the special case for a single buffer by noticing that it doesn't matter if we use a page size that is too big. This allows the loop to compute the page shift to be replaced with __ffs(). Thanks to Bryan S Rosenburg <rosnbrg@us.ibm.com> for pointing out the original bug and suggesting several ways to improve this patch. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:42 -08:00
Hoang-Nam Nguyen	2b5e6b120e	IB/ehca: Add PMA support This patch enables ehca to redirect any PMA queries to the actual PMA QP. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Reviewed-by: Joachim Fenkes <fenkes@de.ibm.com> Reviewed-by: Christoph Raisch <raisch@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:42 -08:00
Joachim Fenkes	528b03f732	IB/ehca: Update sma_attr also in case of disruptive config change Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:42 -08:00
Joachim Fenkes	2b7274c392	IB/ehca: Prevent sending UD packets to QP0 The IB spec doesn't allow packets to QP0 sent on any other VL than VL15. Hardware doesn't filter those packets on the send side, so we need to do this in the driver and firmware. As eHCA doesn't support QP0, we can just filter out all traffic going to QP0, regardless of SL or VL. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:42 -08:00
Sean Hefty	3971c9f6db	IB/cm: Add interim support for routed paths Paths with hop_limit > 1 indicate that the connection will be routed between IB subnets. Update the subnet local field in the CM REQ based on the hop_limit value. In addition, if the path is routed, then set the LIDs in the REQ to the permissive LIDs. This is used to indicate to the passive side that it should use the LIDs in the received local route header (LRH) associated with the REQ when programming the QP. This is a temporary work-around to the IB CM to support IB router development until the IB router specification is completed. It is not anticipated that this work-around will cause any interoperability issues with existing stacks or future stacks that will properly support IB routers when defined. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-02-04 20:20:42 -08:00
James Bottomley	d3f46f39b7	[SCSI] remove use_sg_chaining With the sg table code, every SCSI driver is now either chain capable or broken (or has sg_tablesize set so chaining is never activated), so there's no need to have a check in the host template. Also tidy up the code by moving the scatterlist size defines into the SCSI includes and permit the last entry of the scatterlist pools not to be a power of two. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-30 13:14:02 -06:00
Linus Torvalds	0ba6c33bcd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25: (1470 commits) [IPV6] ADDRLABEL: Fix double free on label deletion. [PPP]: Sparse warning fixes. [IPV4] fib_trie: remove unneeded NULL check [IPV4] fib_trie: More whitespace cleanup. [NET_SCHED]: Use nla_policy for attribute validation in ematches [NET_SCHED]: Use nla_policy for attribute validation in actions [NET_SCHED]: Use nla_policy for attribute validation in classifiers [NET_SCHED]: Use nla_policy for attribute validation in packet schedulers [NET_SCHED]: sch_api: introduce constant for rate table size [NET_SCHED]: Use typeful attribute parsing helpers [NET_SCHED]: Use typeful attribute construction helpers [NET_SCHED]: Use NLA_PUT_STRING for string dumping [NET_SCHED]: Use nla_nest_start/nla_nest_end [NET_SCHED]: Propagate nla_parse return value [NET_SCHED]: act_api: use PTR_ERR in tcf_action_init/tcf_action_get [NET_SCHED]: act_api: use nlmsg_parse [NET_SCHED]: act_api: fix netlink API conversion bug [NET_SCHED]: sch_netem: use nla_parse_nested_compat [NET_SCHED]: sch_atm: fix format string warning [NETNS]: Add namespace for ICMP replying code. ...	2008-01-29 22:54:01 +11:00
Denis V. Lunev	f206351a50	[NETNS]: Add namespace parameter to ip_route_output_key. Needed to propagate it down to the ip_route_output_flow. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:11:07 -08:00
Denis V. Lunev	f1b050bf7a	[NETNS]: Add namespace parameter to ip_route_output_flow. Needed to propagate it down to the __ip_route_output_key. Signed_off_by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:11:06 -08:00
Denis V. Lunev	1ab352768f	[NETNS]: Add namespace parameter to ip_dev_find. in_dev_find() need a namespace to pass it to fib_get_table(), so add an argument. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:11:04 -08:00
Joe Perches	6360a02af1	[IPV4] drivers/infiniband: Use ipv4_is_<type> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:17 -08:00
WANG Cong	aff5905778	INFINIBAND: Remove 'TOPDIR' from Makefiles This patch removes TOPDIR from infiniband Makefile and delete one include statement pointing to a non-existing directory Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <mshefty@ichips.intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2008-01-28 23:14:37 +01:00
Linus Torvalds	9b73e76f3c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits) [SCSI] usbstorage: use last_sector_bug flag universally [SCSI] libsas: abstract STP task status into a function [SCSI] ultrastor: clean up inline asm warnings [SCSI] aic7xxx: fix firmware build [SCSI] aacraid: fib context lock for management ioctls [SCSI] ch: remove forward declarations [SCSI] ch: fix device minor number management bug [SCSI] ch: handle class_device_create failure properly [SCSI] NCR5380: fix section mismatch [SCSI] sg: fix /proc/scsi/sg/devices when no SCSI devices [SCSI] IB/iSER: add logical unit reset support [SCSI] don't use __GFP_DMA for sense buffers if not required [SCSI] use dynamically allocated sense buffer [SCSI] scsi.h: add macro for enclosure bit of inquiry data [SCSI] sd: add fix for devices with last sector access problems [SCSI] fix pcmcia compile problem [SCSI] aacraid: add Voodoo Lite class of cards. [SCSI] aacraid: add new driver features flags [SCSI] qla2xxx: Update version number to 8.02.00-k7. [SCSI] qla2xxx: Issue correct MBC_INITIALIZE_FIRMWARE command. ...	2008-01-25 17:19:08 -08:00
Steve Wise	8176d297c7	RDMA/cxgb3: Fix the T3A workaround checks Correctly work around T3A issues by checking "hwtype != T3A" instead of "hwtype == T3B". This will be needed for new hardware types. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:47 -08:00
Jan Engelhardt	f7fca1e8a8	IB/ipath: Remove unnecessary cast Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:46 -08:00
Jan Engelhardt	1cf18d5aab	IPoIB: Constify seq_operations function pointer tables Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:46 -08:00
Steve Wise	c6b5b50474	RDMA/cxgb3: Mark QP as privileged based on user capabilities This is needed to support zero-stag properly. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:45 -08:00
Steve Wise	d08ca26cee	RDMA/cxgb3: Fix page shift calculation in build_phys_page_list() The existing logic incorrectly maps this buffer list: 0: addr 0x10001000, size 0x1000 1: addr 0x10002000, size 0x1000 To this bogus page list: 0: 0x10000000 1: 0x10002000 The shift calculation must also take into account the address of the first entry masked by the page_mask as well as the last address+size rounded up to the next page size. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:45 -08:00
Steve Wise	856b592504	RDMA/cxgb3: Flush the receive queue when closing - for kernel mode cqs, call event notification handler when flushing. - flush QP when moving from RTS -> CLOSING. - fix logic to identify a kernel mode qp. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:45 -08:00
Ralph Campbell	4e1e93a418	IB/ipath: Trivial simplification of ipath_make_ud_req() Move the increment of s_hdrwords into the existing if block that tests if we're doing a send with immediate, to save one test of the opcode. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:44 -08:00
Roland Dreier	950529e5c6	IB/mthca: Update latest "native Arbel" firmware revision Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:44 -08:00
Krishna Kumar	48fe5e594c	IPoIB: Remove redundant check of netif_queue_stopped() in xmit handler qdisc_run() now tests for queue_stopped() before calling __qdisc_run(), and the same check is done in every iteration of __qdisc_run(), so another check is not required in the driver xmit. This means that ipoib_start_xmit() no longer needs to test netif_queue_stopped(); the test was added to fix earlier kernels, where the networking stack did not guarantee that the xmit method of an LLTX driver would not be called after the queue was stopped, but current kernels do provide this guarantee. To validate, I put a debug in the TX_BUSY path which never hit with 64 threads running overnight exercising this code a few 100 million times. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:44 -08:00
Ralph Campbell	3d68ea3261	IB/ipath: Add mappings from HW register to PortInfo port physical state Add new mappings from port physical state (a HW register value) to the IB SubnGet(PortInfo) port physical state. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:44 -08:00
Dave Olson	6ac50727bd	IB/ipath: Changes to support PIO bandwidth check on IBA7220 The IBA7220 uses a count-based triggering mechanism, and therefore can't use the same bandwidth verification mechanism as older chips. To support the 7220, allow enabling and disabling armlaunch errors on application request. Minor robustness improvements as well. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:43 -08:00
Dave Olson	ddb70c83a5	IB/ipath: Minor cleanup of unused fields and chip-specific errors Clean up some unused header fields, minor related cleanup. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:43 -08:00
Michael Albaugh	359193ef43	IB/ipath: New sysfs entries to control 7220 features IBA7220 includes many more configurable IB settings. Getting/setting these is now grouped into a pair of chip specific functions accessed via function pointers. Provide sysfs access to these settings. Signed-off-by: Michael Albaugh <michael.albaugh@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:17:32 -08:00
Dave Olson	c4bce8032e	IB/ipath: Add new chip-specific functions to older chips, consistent init This adds the new (sometimes empty) chip-specific functions to the older chips, and makes the initialization and related functions consistent across all 3 chips. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:45 -08:00
Dave Olson	7387273307	IB/ipath: Remove unused MDIO interface code This code has been unused for some time, but still had leftovers from when it was used. Signed-off-by: Dave Olson <dave.olson@qlogic.com Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:44 -08:00
Joachim Fenkes	2ec8e66241	IB/ehca: Prevent RDMA-related connection failures on some eHCA2 hardware Some HW revisions of eHCA2 may cause an RC connection to break if they received RDMA Reads over that connection before. This can be prevented by assuring that, after the first RDMA Read, the QP receives a new RDMA Read every few million link packets. Include code into the driver that inserts an empty (size 0) RDMA Read into the message stream every now and then if the consumer doesn't post them frequently enough. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:44 -08:00
Hoang-Nam Nguyen	bbdd267ef2	IB/ehca: Add "port connection autodetect mode" This patch enhances ehca with a capability to "autodetect" the ports being connected physically. In order to utilize that function the module option nr_ports must be set to -1 (default is 2 - two ports). This feature is experimental and will made the default later. More detail: If the user connects only one port to the switch, current code requires 1) port one to be connected and 2) module option nr_ports=1 to be given. If autodetect is enabled, ehca will not wait at creation of the GSI QP for the respective port to become active. Since firmware does not accept modify_qp() while the port is down at initialization, we need to cache all calls to modify_qp() for the SMI/GSI QP and just return a good return code. When a port is activated and we get a PORT_ACTIVE event, we replay the cached modify-qp() parms and re-trigger any posted recv WRs. Only then do we forward the PORT_ACTIVE event to registered clients. The result of this autodetect patch is that all ports will be accessible by the users. Depending on their respective cabling only those ports that are connected properly will become operable. If a user tries to modify a regular QP of a non-connected port, modify_qp() will fail. Furthermore, ibv_devinfo should show the port state accordingly. Note that this patch primarily improves the loading behaviour of ehca. If the cable is removed while the driver is operating and plugged in again, firmware will handle that properly by sending an appropriate async event. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:44 -08:00
Hoang-Nam Nguyen	b8b50e353b	IB/ehca: Define array to store SMI/GSI QPs Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:44 -08:00
Hoang-Nam Nguyen	0c86e280fe	IB/ehca: Remove CQ-QP-link before destroying QP in error path of create_qp() Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:43 -08:00
Erez Zilber	6410627eb9	IB/iser: Add change_queue_depth method Add a .change_queue_depth handler to the scsi_host_template in the iSER driver. iscsi_change_queue_depth was added to iscsi_tcp in order to solve the problem of queue depth which was too high for some targets. It is also applicable for iSER. Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:43 -08:00
Erez Zilber	a4ef1451df	IB/iser: Print information about unhandled RDMA CM events Some RDMA CM events are not supported or not handled in iSER. This patch adds some info (printk) for the user about them. Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:43 -08:00
Olaf Kirch	a3cd7d9070	IB/fmr_pool: ib_fmr_pool_flush() should flush all dirty FMRs When a FMR is released via ib_fmr_pool_unmap(), the FMR usually ends up on the free_list rather than the dirty_list (because we allow a certain number of remappings before actually requiring a flush). However, ib_fmr_batch_release() only looks at dirty_list when flushing out old mappings. This means that when ib_fmr_pool_flush() is used to force a flush of the FMR pool, some dirty FMRs that have not reached their maximum remap count will not actually be flushed. Fix this by flushing all FMRs that have been used at least once in ib_fmr_batch_release(). Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:43 -08:00
Olaf Kirch	a656eb758f	IB/fmr_pool: Flush serial numbers can get out of sync Normally, the serial numbers for flush requests and flushes executed for an FMR pool should be in sync. However, if the FMR pool flushes dirty FMRs because the dirty_watermark was reached, we wake up the cleanup thread and let it do its stuff. As a side effect, the cleanup thread increments pool->flush_ser, which leaves it one higher than pool->req_ser. The next time the user calls ib_flush_fmr_pool(), the cleanup thread will be woken up, but ib_flush_fmr_pool() won't wait for the flush to complete because flush_ser is already past req_ser. This means the FMRs that the user expects to be flushed may not have all been flushed when the function returns. Fix this by telling the cleanup thread to do work exclusively by incrementing req_ser, and by moving the comparison of dirty_len and dirty_watermark into ib_fmr_pool_unmap(). Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>	2008-01-25 14:15:42 -08:00
Roland Dreier	2fe7e6f7c9	IB/umad: Simplify and fix locking In addition to being overly complex, the locking in user_mad.c is broken: there were multiple reports of deadlocks and lockdep warnings. In particular it seems that a single thread may end up trying to take the same rwsem for reading more than once, which is explicitly forbidden in the comments in <linux/rwsem.h>. To solve this, we change the locking to use plain mutexes instead of rwsems. There is one mutex per open file, which protects the contents of the struct ib_umad_file, including the array of agents and list of queued packets; and there is one mutex per struct ib_umad_port, which protects the contents, including the list of open files. We never hold the file mutex across calls to functions like ib_unregister_mad_agent(), which can call back into other ib_umad code to queue a packet, and we always hold the port mutex as long as we need to make sure that a device is not hot-unplugged from under us. This even makes things nicer for users of the -rt patch, since we remove calls to downgrade_write() (which is not implemented in -rt). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:42 -08:00
Roland Dreier	cf9542aa92	IB/ipath: Fix some sparse warnings about shadowed symbols There are a few places in the ipath driver where a variable is re-declared within a block where it is already in scope. Most of these extra declarations can simply be removed, since the variable from the outer scope is used in a way so that it does not need to keep its variable across the block with the re-declaration. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:42 -08:00
Roland Dreier	1d6e658e8e	RDMA/cxgb3: Endianness annotation for irs field t3_rdma_init_wr.irs is a big-endian field, so declare it as __be32. This fixes one sparse warning. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:42 -08:00
Anton Blanchard	1a7d2dce41	IB/ehca: Use round_jiffies() for EQ polling timer Use round_jiffies() to align ehca's 1-second timer with other timers and potentially save power by sleeping cores for longer. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:41 -08:00
Sean Hefty	5851bb893e	RDMA/cma: Override default responder_resources with user value By default, the responder_resources parameter is set to that received in a connection request. The passive side may override this value when accepting the connection. Use the value provided by the passive side when transitioning the QP to RTR state, rather than the value given in the connect request. Without this change, the RTR transition may fail if the passive side supports fewer responder_resources than that in the request. For code consistency and to protect against QP destruction, restructure overriding initiator_depth to match how responder_resources is set. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:41 -08:00
Dave Olson	1f813ca830	IB/ipath: Drop support for the original QHT7040 board The original QHT7040 had significant performance issues so there was an additional check in the driver for a newer serial number. Support for the small quantities of that board shipped has been dropped, so this patch removes the special checks to simplify the code. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:40 -08:00
Arthur Jones	7da0498e7f	IB/ipath: Add ipath_read_ireg() abstraction Different chips have different width interrupt status registers, so add a flag and accessor function to decide which width register read to use. Signed-off-by: Arthur Jones <arthur.jones@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:40 -08:00
Ralph Campbell	4ea61b548b	IB/ipath: Add flag and handling for chips with swapped register bug The 6110 had a bug that caused some registers to be swapped; it was fixed for the 7220 (and didn't affect the 6120 because it had fewer registers). This adds a flag and related code to handle that, and includes some minor cleanups in the same area. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>	2008-01-25 14:15:39 -08:00
Ralph Campbell	60948a4158	IB/ipath: Port config has on-chip effects for 7220 The number of configured ports for the 7220 changes the number of eager TIDs available per port, for all but port 0 (kernel port) which remains constant, so add a field to give port0 count separate from the portdata structure. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:39 -08:00
Ralph Campbell	a18e26ae44	IB/ipath: Allow more flexible user register alignments User registers have different alignments on different chips (4KB on older, 64KB on 7220). Allow mapping the user registers on kernels with page sizes up to 64K. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:39 -08:00
Dave Olson	9e2ef36b5a	IB/ipath: Clean up some comments Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:38 -08:00
Ralph Campbell	3029fcc3d4	IB/ipath: Export hardware counters more consistently Various hardware counters are exported via the ipath file system (since it is binary data). The old file format was very dependent on the HW offsets for these registers. Newer HCA chips can have different counters at different offsets. This patch adds a level of indirection to make the file format consistent across HCAs. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:38 -08:00
Ralph Campbell	6c719cae0b	IB/ipath: MAD performance sampling registers support Add support for QLogic HCAs which have hardware performance sampling registers for PortSamplesControl and PortSamplesResult MADs. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:38 -08:00
David Dillow	7aa54bd730	IB/srp: Add identifying information to log messages When you have multiple targets, it gets really confusing when you try to track down who did a reset when there is no identifying information in the log message, especially when the same extension ID is mapped through two different local IB ports. So, add an identifier that can be used to track back to which local IB port/remote target pair is the one having problems. Signed-off-by: David Dillow <dillowda@ornl.gov> Acked-by: Pete Wyckoff <pw@osc.edu> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:38 -08:00
Pradeep Satyanarayana	586a693448	IPoIB/CM: Enable SRQ support on HCAs that support fewer than 16 SG entries Some HCAs (such as ehca2) support SRQ, but only support fewer than 16 SG entries for SRQs. Currently IPoIB/CM implicitly assumes all HCAs will support 16 SG entries for SRQs (to handle a 64K MTU with 4K pages). This patch removes that restriction by limiting the maximum MTU in connected mode to what the maximum number of SRQ SG entries allows. This patch addresses <https://bugs.openfabrics.org/show_bug.cgi?id=728> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:37 -08:00
David Dillow	fff09a8e6e	IB/srp: Enable SG list chaining By default, the SCSI mid-layer seems to send down 512KB requests (sg_tablesize = 256), with some requests occasionally combined. By allowing the mid-layer to chain requests, we can easily grow to 1024KB or larger -- I've tested 4096KB I/O requests with no problems. I looked through the DMA paths on the hardware drivers to ensure they could take advantage of the SG chaining, and it seems that every one except ipath uses the system's DMA routines, which have been converted to handle chaining. ipath looks like it should be OK, but I have no way to test it. Signed-off-by: David Dillow <dillowda@ornl.gov> [ Tested on ipath. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:37 -08:00
David Dillow	8cba207732	IB/srp: Respect target credit limit The current SRP initiator will send requests even if it has no credits available. The results of sending extra requests are vendor specific, but on some devices, overrunning credits will cost 85% of peak performance -- e.g. 100 MB/s vs 720 MB/s. Other devices may just drop the requests. This patch will tell the SCSI midlayer to queue requests if there are fewer than two credits remaining, and will not issue a task management request if there are no credits remaining. The mid-layer will retry the queued command once an outstanding command completes. The patch also removes the unlikely() in __srp_get_tx_iu(), as it is not at all unlikely to hit this limit under heavy load. Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:37 -08:00
Rolf Manderscheid	a9e527e3f9	IPoIB: improve IPv4/IPv6 to IB mcast mapping functions An IPoIB subnet on an IB fabric that spans multiple IB subnets can't use link-local scope in multicast GIDs. The existing routines that map IP/IPv6 multicast addresses into IB link-level addresses hard-code the scope to link-local, and they also leave the partition key field uninitialised. This patch adds a parameter (the link-level broadcast address) to the mapping routines, allowing them to initialise both the scope and the P_Key appropriately, and fixes up the call sites. The next step will be to add a way to configure the scope for an IPoIB interface. Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:37 -08:00
Dave Olson	755807a296	IB/ipath: Changes for fields moving from devdata to portdata This patch moves some arrays that were defined per-device to be variables defined in the per context data structure, thus avoiding extra kzalloc() calls. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:36 -08:00
Dave Olson	d8274869d7	IB/ipath: Generalize some xxx_SHIFT macros In preparation for upcoming chips that have different values for INFINIPATH_R_PORTENABLE_SHIFT, INFINIPATH_R_INTRAVAIL_SHIFT, INFINIPATH_R_TAILUPD_SHIFT, and portcfg_shift, remove the shared #defines and use device-specific variables instead. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:36 -08:00
Ralph Campbell	c59a80aca0	IB/ipath: kreceive uses portdata rather than devdata kreceive is now portdata * instead of devdata * and other kreceive related cleanups.... Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:35 -08:00
Ralph Campbell	d65708f3a7	IB/ipath: Cleanup ipath_get_egrbuf() Remove an unused parameter and fix up the comment. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:35 -08:00
Ralph Campbell	cc65edcf0c	IB/ipath: Fix RNR NAK handling This patch fixes a couple of minor problems with RNR NAK handling: - The insertion sort was causing extra delay when inserting ahead vs. behind an existing entry on the list. - A resend of a first packet of a message which is still not ready, needs another RNR NAK (i.e., it was suppressed when it shouldn't). - Also, the resend tasklet doesn't need to be woken up unless the ACK/NAK actually indicates progress has been made. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:34 -08:00
Hoang-Nam Nguyen	e57d62a147	IB/ehca: Forward event client-reregister-required to registered clients This patch allows ehca to forward event client-reregister-required to registered clients. One such event is generated by a switch eg. after its reboot. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:34 -08:00
Roland Dreier	b3226184af	IB/mlx4: Micro-optimize mlx4_ib_poll_one() Rather than byte-swapping cqe->g_mlpath_rqpn each time we extract a field from it, byte-swap it once into a temporary variable. This results in smaller, better code -- eg, on 32-bit x86: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5) function old new delta mlx4_ib_poll_cq 1188 1183 -5 Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:34 -08:00
Adrian Bunk	e57895d389	IB/mthca: Remove MSI support as scheduled Remove MSI support from the mthca driver, as scheduled. There is no reason to use MSI instead of MSI-X, since MSI-X performs better. No one has spoken up since MSI support was deprecated in commit `f6be6fbe` ("IB/mthca: Schedule MSI support for removal"), so apparently the MSI support is unused. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:33 -08:00
Oliver Pinter	38dc732f47	IB/iser: Typo fix (s/destory/destroy/) Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:32 -08:00
Erez Zilber	bd5d7a8585	IB/iser: update URLs of iSER docs Signed-off-by: Erez Zilber <erezz@voltaire.com>	2008-01-25 14:15:32 -08:00
Sean Hefty	88314e4dda	RDMA/cma: add support for rdma_migrate_id() This is based on user feedback from Doug Ledford at RedHat: Events that occur on an rdma_cm_id are reported to userspace through an event channel. Connection request events are reported on the event channel associated with the listen. When the connection is accepted, a new rdma_cm_id is created and automatically uses the listen event channel. This is suboptimal where the user only wants listen events on that channel. Additionally, it may be desirable to have events related to connection establishment use a different event channel than those related to already established connections. Allow the user to migrate an rdma_cm_id between event channels. All pending events associated with the rdma_cm_id are moved to the new event channel. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:32 -08:00
Vladimir Sokolovsky	45d9478da1	RDMA/cma: Reenable device removal on passive side Enable conn_id remove on the passive side after connection establishment. This corrects an issue where the IB driver can't be unloaded after running applications over RDS. The 'dev_remove' counter does not reach 0 for established connections on the passive side. This problem is limited to device removal, and only occurs on the passive side if there are established connections. Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:31 -08:00
Sean Hefty	b61d92d8ae	IB/mad: Fix incorrect access to items on local_list In cancel_mads(), MADs are moved from the wait_list and local_list to a cancel_list for processing. However, the structures on these two lists are not the same. The wait_list references struct ib_mad_send_wr_private, but local_list references struct ib_mad_local_private. Cancel_mads() treats all items moved to the cancel_list as struct ib_mad_send_wr_private. This leads to a system crash when requests are moved from the local_list to the cancel_list. Fix this by leaving local_list alone. All requests on the local_list have completed are just awaiting processing by a queued worker thread. Bug (crash) reported by Dotan Barak <dotanb@dev.mellanox.co.il>. Problem with local_list access reported by Robert Reynolds <rreynolds@opengridcomputing.com>. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:31 -08:00
Sean Hefty	9af57b7a27	IB/cm: Add basic performance counters Add performance/debug counters to track sent/received messages, retries, and duplicates. Counters are tracked per CM message type, per port. The counters are always enabled, so intrusive state tracking is not done. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:30 -08:00
Sean Hefty	4fc8cd4919	IB/mad: Report number of times a mad was retried To allow ULPs to tune timeout values and capture retry statistics, report the number of times that a mad send operation was retried. For RMPP mads, report the total number of times that the any portion (send window) of the send operation was retried. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:30 -08:00
Sean Hefty	547af76521	IB/multicast: Report errors on multicast groups if P_key changes P_key changes can invalidate multicast groups. Report errors on all multicast groups affected by a pkey change. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:29 -08:00
Joe Perches	94545e8c51	IB: Spelling fixes in comments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:29 -08:00
Nick Piggin	3c8450860b	IB/ipath: Convert from .nopage to .fault Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:29 -08:00
Ralph Campbell	a2f76cd69f	IB/ipath: Add the work completion error code to the QP error debug output Add the work completion error code to the QP error debug output. This makes it easier to determine the cause of the error. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:29 -08:00
Arthur Jones	2f01a70011	IB/ipath: Better comment for rmb() in ipath_intr() An internal code review found the comment here lacking -- update it with more specifics of how and why the rmb() is there. Signed-off-by: Arthur Jones <arthur.jones@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:28 -08:00
Ralph Campbell	6276980138	IB/ipath: Fix comments for ipath_create_srq() During a code review, someone noticed the comments didn't match the code. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:28 -08:00
Ralph Campbell	733d12813b	IB/ipath: Fix error returned from ib_resize_cq if new size smaller than # entries The gen2_basic tests check for the errno value when a CQ is resized smaller than the number of outstanding completions queue on the CQ. This patch changes ib_ipath to return EINVAL which is what ib_mthca returns and what gen2_basic expects. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:28 -08:00
John Gregor	e342c11917	IB/ipath: Fix sendctrl locking Code review pointed out that the locking around uses of ipath_sendctrl and kr_sendctrl were, in several places, incorrect and/or inconsistent. Signed-off-by: John Gregor <john.gregor@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:27 -08:00
Ralph Campbell	9ab4295d1d	IB/ipath: Remove dead code for user process waiting for send buffer At one point in time there was code to allow a user process to wait for a send buffer if none were available. This feature was never used and most of the code was removed. This removes some missed unused code. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:27 -08:00
Steve Wise	457fe7b8a6	RDMA/cxgb3: Support version 5.0 firmware The 5.0 firmware now supports translating sgls in recv work requests, so remove the host driver logic currently doing the translation. Note: this change requires 5.0 firmware. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:26 -08:00
Steve Wise	7f049f2f42	RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call Currently the call into cxgb3 to get the driver info is not serialized. The iw_cxgb3 module needs to hold the rtnl_lock around the ethtool ops call like dev_ioctl() does. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:26 -08:00
Joe Perches	908cf9a565	drivers/infiniband: Add missing "space" Add missing spaces in the middle of format strings. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:26 -08:00
Matthias Kaehlcke	2c45688fae	IB/ipath: Convert ipath_eep_sem semaphore to a mutex Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Acked-by: Michael Albaugh <Michael.Albaugh@qlogic.com> Tested-by: Arthur Jones <arthur.jones@qlogic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:26 -08:00
Steve Welch	727792da2b	IB/mad: Enable loopback of DR SMP responses from userspace The local loopback of an outgoing DR SMP response is limited to those that originate at the driver specific SMA implementation during the driver specific process_mad() function. This patch enables a returning DR SMP originating in userspace (or elsewhere) to be delivered to the local managment stack. In this specific case the driver process_mad() function does not consume or process the MAD, so a reponse mad has not be created and the original MAD must manually be copied to the MAD buffer that is to be handed off to the local agent. Signed-off-by: Steve Welch <swelch@systemfabricworks.com> Acked-by: Hal Rosenstock <hal@xsigo.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:25 -08:00
Ralph Campbell	f9b4035322	IB/ipath: Enable loopback of DR SMP responses from userspace This patch is in response to reviewing a patch to the core MAD processing which fixes loopback of directed route packets to/from user level MAD agents. This change enables the core code to work for ib_ipath by fixing the return code from the ipath process_mad method. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:25 -08:00
Ralph Campbell	3828ff457a	IB/mad: Remove redundant NULL pointer check in ib_mad_recv_done_handler() In ib_mad_recv_done_handler(), the response pointer is checked for NULL after allocating it. It is then checked again in the local process_mad() path but there is no possibility of it changing in between. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Acked-by: Hal Rosenstock <hal@xsigo.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:25 -08:00
Steve Wise	8d8293cfb3	RDMA/iwcm: Set initiator depth and responder resources to device max values Set the initiator depth and responder resources to the device max values for new connect request events in the iWARP connection manager. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:25 -08:00
Dave Olson	e193e3326c	IB/ipath: Improve interrupt handler cache footprint Improve interrupt handler cache footprint by noinline'ing error functions that are rarely called. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:25 -08:00
Pradeep Satyanarayana	68e995a295	IPoIB/cm: Add connected mode support for devices without SRQs Some IB adapters (notably IBM's eHCA) do not implement SRQs (shared receive queues). The current IPoIB connected mode support only works on devices that support SRQs. Fix this by adding support for using the receive queue of each connected mode receive QP. The disadvantage of this compared to using an SRQ is that it means a full queue of receives must be posted for each remote connected mode peer, which means that total memory usage is potentially much higher than when using SRQs. To manage this, add a new module parameter "max_nonsrq_conn_qp" that limits the number of connections allowed per interface. The rest of the changes are fairly straightforward: we use a table of struct ipoib_cm_rx to hold all the active connections, and put the table index of the connection in the high bits of receive WR IDs. This is needed because we cannot rely on the struct ib_wc.qp field for non-SRQ receive completions. Most of the rest of the changes just test whether or not an SRQ is available, and post receives or find received packets in the right place depending on the answer. Cleaning up dead connections actually becomes simpler, because we do not have to do the "last WQE reached" dance that is required to destroy QPs attached to an SRQ. We just move the QP to the error state and wait for all pending receives to be flushed. Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> [ Completely rewritten and split up, based on Pradeep's work. Several bugs fixed and no doubt several bugs introduced. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:24 -08:00
Roland Dreier	efcd99717f	IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list() Factor out the code for going through the rx_reap list of struct ipoib_cm_rx and freeing each one. This consolidates the code duplicated between ipoib_cm_dev_stop() and ipoib_cm_rx_reap() and reduces the risk of error when adding additional accounting. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:24 -08:00
Roland Dreier	7b3687df66	IPoIB/cm: Factor out ipoib_cm_create_srq() Factor out the code to create an SRQ and allocate the receive ring in ipoib_cm_dev_init() into a new function ipoib_cm_create_srq(). This will make the code neater when support for devices that don't implement SRQs is added. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:24 -08:00
Roland Dreier	1efb61444c	IPoIB/cm: Factor out ipoib_cm_free_rx_ring() Factor out the code to unmap/free skbs and free the receive ring in ipoib_cm_dev_cleanup() into a new function ipoib_cm_free_rx_ring(). This function will be called from a couple of other places when support for devices that don't implement SRQs is added. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:24 -08:00
Roland Dreier	2337f80941	IPoIB: Trivial formatting cleanups Fix whitespace blunders, convert "foo* bar" to "foo *bar", etc. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:23 -08:00
Roland Dreier	657c2f2cbc	IB/ipath: Fix crash on unload introduced by sysfs changes Commit `23b9c1ab` ("Infiniband: make ipath driver use default driver groups.") introduced a bug in the ipath driver where ipath_device_create_group() fell through into the error path, even on success, which meant that the sysfs groups it created would always get removed right away. This made ipath_device_remove_group() hit the BUG_ON() in sysfs_remove_group() when it tried to remove those groups a second time. Correct the return path so that the groups stick around until they are supposed to be cleaned up. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:21 -08:00
Greg Kroah-Hartman	c10997f657	Kobject: convert drivers/* from kobject_unregister() to kobject_put() There is no need for kobject_unregister() anymore, thanks to Kay's kobject cleanup changes, so replace all instances of it with kobject_put(). Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman	23b9c1ab5b	Infiniband: make ipath driver use default driver groups. Make the ipath driver use the new driver functions so that it does not touch the sysfs portion of the driver structure. We also remove the redundant symlink from the device back to the driver, as it is already in the sysfs tree. Any userspace tools should be using the standard symlink, not some driver specific one. Cc: Roland Dreier <rdreier@cisco.com> Cc: Bryan O'Sullivan <bryan.osullivan@qlogic.com> Cc: Arthur Jones <arthur.jones@qlogic.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:34 -08:00
Greg Kroah-Hartman	35be068198	Kobject: change drivers/infiniband to use kobject_init_and_add Stop using kobject_register, as this way we can control the sending of the uevent properly, after everything is properly initialized. Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <mshefty@ichips.intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:26 -08:00
Erez Zilber	90c18f3c28	[SCSI] IB/iSER: add logical unit reset support eh_device_reset_handler was already added to scsi_host_template in iscsi_tcp, and is now added also for iscsi_iser. Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-23 13:39:43 -06:00
Ralph Campbell	0a69631b28	IB/ipath: Fix receiving UD messages with immediate data This fixes a small bug in ipath_ud_rcv()'s handling of UD messages with immediate data. We need to test whether immediate data is present and update the header size accordingly before testing the packet size from the header against the actual received length. Otherwise the wrong header size will be used and all messages with immediate data will be dropped. This bug keeps MVAPICH-UD and HP MPI from working at all on ipath devices. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-16 14:42:35 -08:00
Olaf Kirch	a8ac6311cc	[SCSI] iscsi: convert xmit path to iscsi chunks Convert xmit to iscsi chunks. from michaelc@cs.wisc.edu: Bug fixes, more digest integration, sg chaining conversion and other sg wrapper changes, coding style sync up, and removal of io fields, like pdu_sent, that are not needed. Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-11 18:28:42 -06:00
Mike Christie	f6d5180c78	[SCSI] libiscsi: fix nop handling During root boot and shutdown the target could send us nops. At this time iscsid cannot be running, so the target will drop the session and the boot or shutdown will hang. To handle this and allow us to better control when to check the network this patch moves the nop handling to the kernel. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-11 18:28:35 -06:00
Mike Christie	b3a7ea8d50	[SCSI] libiscsi: do not block session during logout There is not need to block the session during logout. Since we are going to fail the commands that were blocked just fail them immediately instead. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-11 18:28:28 -06:00
Boaz Harrosh	38ad03de3f	[SCSI] libiscsi,iser: patch for AHS support - The default initialization of hdr_max is the minimum - sizeof(struct iscsi_cmd) - Once this patch goes into iser the default initialization at libiscsi can be removed. - This is not yet full support for AHSs at iser end. But it should be easy. Just allocate more space at iser_desc right after iscsi_hdr. Than at transmission time use ctask->hdr_len to retrieve the total size of all iscsi pdu headers. See previous patch at iscsi_tcp.[ch] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-11 18:28:25 -06:00
Mike Christie	843c0a8a76	[SCSI] libiscsi, iscsi_tcp: add device support This patch adds logical unit reset support. This should work for ib_iser, but I have not finished testing that driver so it is not hooked in yet. This patch also temporarily reverts the iscsi_tcp r2t write out patch. That code is completely rewritten in this patchset. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-11 18:28:19 -06:00
Dave Dillow	ad696989b4	IB/srp: Release transport before removing host The documented call sequence for removing a host is to call the transport xxx_remove_host() prior to scsi_remove_host(). The SRP transport used to crash when that order was followed, but as it is now fixed, use the documented order. Signed-off-by: David Dillow <dillowda@ornl.gov> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-08 12:08:10 -08:00
Dotan Barak	e1bb7843e4	IB/mlx4: Fix value of pkey_index in QP1 completions Fix the value of pkey_index in completions to get a valid value for GSI QPs. Without this fix, incoming GSI packets on port 2 get an invalid P_Key index in the completion, which prevents the MAD layer from sending back a response, which can make the second port of ConnectX HCAs completely useless. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-08 12:05:53 -08:00
David Dillow	b0e47c8b79	IB/srp: Fix list corruption/oops on module reload Add a missing call to srp_remove_host() in srp_remove_one() so that we don't leak SRP transport class list entries. Tested-by: David Dillow <dillowda@ornl.gov> Acked-by: FUJITA Tomonori <tomof@acm.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-03 10:25:27 -08:00
Joachim Fenkes	3d758a4a48	IB/ehca: Fix lock flag variable location, bump version number Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-12-13 09:37:23 -08:00
Joachim Fenkes	4faf775795	IB/ehca: Serialize HCA-related hCalls if necessary Several pSeries firmware versions share a rare locking issue in the HCA-related hCalls. Check for a feature flag that indicates the issue being fixed and serialize all HCA hCalls if not. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-12-12 14:09:43 -08:00
Joachim Fenkes	1457edc72d	IB/ehca: Return correct number of SGEs for SRQ Firmware would round up the number of SGEs to four, because the WQE structure holds four SGEs. For SRQ, only three are supported, so return a fixed value instead. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-12-12 14:09:43 -08:00
Joachim Fenkes	b1812582ba	IB/ehca: Fix static rate if path faster than link The formula would yield -1 if the path is faster than the link, which is wrong in a bad way (max throttling). Clamp to 0, which is the correct value. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-30 16:19:41 -08:00
Jack Morgenstein	1401b53acc	IPoIB: Fix oops if xmit is called when priv->broadcast is NULL If a port goes down, ipoib_ib_dev_down() is invoked -- which flushes the mcasts (clearing priv->broadcast) and clearing the path record cache. If ipoib_start_xmit() is then invoked (before the broadcast group is rejoined), a kernel oops results from attempting to access priv->broadcast, which is still unset. Returning NULL from path_rec_create() if priv->broadcast is NULL is a harmless way of bypassing the problem -- the offending packet is simply discarded "without prejudice." Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-27 15:40:10 -08:00
Erez Zilber	a316b79c33	IB/iser: Add missing counter increment in iser_data_buf_aligned_len() While adding sg chaining support to iSER, a "for" loop was replaced with a "for_each_sg" loop. The "for" loop included the incrementation of 2 variables. Only one of them is incremented in the current "for_each_sg" loop. This caused iSER to think that all data is unaligned, and all data was copied to aligned buffers. This patch increments the missing counter inside the "for_each_sg" loop whenever necessary. Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-24 13:50:39 -08:00
Joachim Fenkes	3fe2ed344d	IB/ehca: Fix static rate regression Wrong choice of port number caused modify_qp() to fail -- fixed. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-24 13:47:59 -08:00
Ralph Campbell	4187b915a0	IB/ipath: Normalize error return codes for posting work requests The error codes for ib_post_send(), ib_post_recv(), and ib_post_srq_recv() were inconsistent. Use EINVAL for too many SGEs and ENOMEM for too many WRs. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-20 11:05:42 -08:00
Ralph Campbell	14de986a0b	IB/ipath: Fix offset returned to ibv_modify_srq() The wrong offset was being returned to libipathverbs so that when ibv_modify_srq() calls mmap(), it always fails. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-20 11:04:41 -08:00
Ralph Campbell	8a278e6d57	IB/ipath: Fix error path in QP creation This patch fixes the code which frees the partially allocated QP resources if there was an error while creating the QP. In particular, the QPN wasn't deallocated and the QP wasn't removed from the hash table. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-20 11:04:10 -08:00
Ralph Campbell	fb74dacb0f	IB/ipath: Fix offset returned to ibv_resize_cq() The wrong offset was being returned to libipathverbs so that when ibv_resize_cq() calls mmap(), it always fails. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-20 11:03:26 -08:00
Steve Wise	9a7666494b	RDMA/cxgb3: Set the max_qp_init_rd_atom attribute in query_device The device attribute max_qp_init_rd_atom is not getting set in cxgb3's query_device method. Version 1.0.4 of librdmacm now validates the user's requested initiator and responder resources against the max supported by the device. Since iw_cxgb3 wasn't setting this attribute (and it defaulted to 0), all rdma_connect()s fail if there are initiator resources requested by the app. Fix this by setting the correct value in iwch_query_device(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-13 15:27:00 -08:00
Joachim Fenkes	51aaa54eb9	IB/ehca: Fix static rate calculation The IPD (inter-packet delay) formula was a little off and assumed a fixed physical link rate; fix the formula and query the actual physical link rate, now that we can get it. Also, refactor the calculation into a common function ehca_calc_ipd() and use that instead of duplicating code. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-13 15:27:00 -08:00
Joachim Fenkes	40ebb5615e	IB/ehca: Return physical link information in query_port() Newer firmware versions return physical port information to the partition, so hand that information to the consumer if it's present. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-13 15:26:59 -08:00
Ralph Campbell	f4ad1bcc44	IB/ipath: Fix race with ACK retry timeout list management When an ACK is received, the QP is removed from the timeout list and then if there are still pending send WQEs, the QP is put back on the timeout list. It is possible that another post send has put the QP on the timeout list thus, a check needs to be made before trying to do it again or the list is corrupted. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-13 15:26:58 -08:00
Ralph Campbell	a6e7550d8f	IB/ipath: Fix memory leak in ipath_resize_cq() if copy_to_user() fails Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Patrick Marchand Latifi <patrick.latifi@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-11-13 15:26:57 -08:00
Linus Torvalds	53173920da	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/fmr_pool: Stop ib_fmr threads from contributing to load average IB/ipath: Fix incorrect use of sizeof on msg buffer (function argument) IB/ipath: Limit length checksummed in eeprom IB/ipath: Fix a race where s_last is updated without lock held IB/mlx4: Lock SQ lock in mlx4_ib_post_send() IPoIB/cm: Fix receive QP cleanup	2007-10-30 15:26:56 -07:00
Anton Blanchard	3f776e8a25	IB/fmr_pool: Stop ib_fmr threads from contributing to load average I noticed my machine was at a constant load average of 1. This was because ib_create_fmr_pool calls kthread_create but does not immediately wake the thread up. Change to using kthread_run so we enter ib_fmr_cleanup_thread(), set TASK_INTERRUPTIBLE, then go to sleep. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-30 14:57:43 -07:00
Dave Olson	164ef7a252	IB/ipath: Fix incorrect use of sizeof on msg buffer (function argument) Inside a function declared as void foo(char bar[512]) the value of sizeof bar is the size of a pointer, not 512. So avoid constructions like this by passing the size explicitly. Also reduce the size of the buffer to 128 bytes (512 was overly generous). Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-30 11:05:49 -07:00
Michael Albaugh	627934448e	IB/ipath: Limit length checksummed in eeprom The small eeprom that holds the GUID etc. contains a data-length, but if the actual eeprom is new or has been erased, that byte will be 0xFF, which is greater than the maximum physical length of the eeprom, and more importantly greater than the length of the buffer we vmalloc'd. Sanity-check the length to avoid the possbility of reading past end of buffer. Signed-off-by: Michael Albaugh <Michael.Albaugh@Qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-30 10:58:53 -07:00
Ralph Campbell	fffbfeaa68	IB/ipath: Fix a race where s_last is updated without lock held There is a small window where a send work queue entry could be overwritten by ib_post_send() because s_last is updated before the entry is read. This patch closes the window by acquiring the lock and updating the last send work queue entry index after reading the wr_id. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-30 10:57:24 -07:00
Roland Dreier	96db0e0335	IB/mlx4: Lock SQ lock in mlx4_ib_post_send() Because of a typo, mlx4_ib_post_send() takes the same lock rq.lock as mlx4_ib_post_recv(). Correct the code so the intended sq.lock is taken when posting a send. Noticed by Yossi Leybovitch and pointed out by Jack Morgenstein from Mellanox. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-30 10:53:54 -07:00
Roland Dreier	09f60f8f54	IPoIB/cm: Fix receive QP cleanup Commit `1b524963` ("IPoIB/cm: Use common CQ for CM send completions") changed how the high-order bits of work request IDs were used, which had the effect that IPOIB_CM_RX_DRAIN_WRID was no longer handled as a connected mode receive completion. This leads to the messages ib1: cm send completion event with wrid 1073741823 (> 64) ib1: RX drain timing out when an interface with connected mode QPs is brought down. Fix this by making sure that both IPOIB_OP_CM and IPOIB_OP_RECV are set in IPOIB_CM_RX_DRAIN_WRID. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-26 13:44:25 -07:00
Jens Axboe	642f149031	SG: Change sg_set_page() to take length and offset argument Most drivers need to set length and offset as well, so may as well fold those three lines into one. Add sg_assign_page() for those two locations that only needed to set the page, where the offset/length is set outside of the function context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-24 11:20:47 +02:00
Linus Torvalds	0b776eb542	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4_core: Increase command timeout for INIT_HCA to 10 seconds IPoIB/cm: Use common CQ for CM send completions IB/uverbs: Fix checking of userspace object ownership IB/mlx4: Sanity check userspace send queue sizes IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))" IB/ehca: Enable large page MRs by default IB/ehca: Change meaning of hca_cap_mr_pgsize IB/ehca: Fix ehca_encode_hwpage_size() and alloc_fmr() IB/ehca: Fix masking error in {,re}reg_phys_mr() IB/ehca: Supply QP token for SRQ base QPs IPoIB: Use round_jiffies() for ah_reap_task RDMA/cma: Fix deadlock destroying listen requests RDMA/cma: Add locking around QP accesses IB/mthca: Avoid alignment traps when writing doorbells mlx4_core: Kill mlx4_write64_raw()	2007-10-23 09:56:11 -07:00
Olof Johansson	c8ac5a7309	IB/ehca: Fix sg_page() fallout More fallout from sg_page changes: drivers/infiniband/hw/ehca/ehca_mrmw.c: In function 'ehca_set_pagebuf_user1': drivers/infiniband/hw/ehca/ehca_mrmw.c:1779: error: 'struct scatterlist' has no member named 'page' drivers/infiniband/hw/ehca/ehca_mrmw.c: In function 'ehca_check_kpages_per_ate': drivers/infiniband/hw/ehca/ehca_mrmw.c:1835: error: 'struct scatterlist' has no member named 'page' drivers/infiniband/hw/ehca/ehca_mrmw.c: In function 'ehca_set_pagebuf_user2': drivers/infiniband/hw/ehca/ehca_mrmw.c:1870: error: 'struct scatterlist' has no member named 'page' Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-23 09:12:52 +02:00
Jens Axboe	45711f1af6	[SG] Update drivers to use sg helpers Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 21:19:53 +02:00
Michael S. Tsirkin	1b524963fd	IPoIB/cm: Use common CQ for CM send completions Use the same CQ for CM send completions as for all other IPoIB completions. This means all completions are processed via the same NAPI polling routine. This should help reduce the number of interrupts for bi-directional traffic (such as TCP) and fixes "driver is hogging interrupts" errors reported for IPoIB send side, e.g. <https://bugs.openfabrics.org/show_bug.cgi?id=508> To do this, keep a per-interface counter of outstanding send WRs, and stop the interface when this counter reaches the send queue size to avoid CQ overruns. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-19 21:39:34 -07:00
Roland Dreier	cbfb50e6e2	IB/uverbs: Fix checking of userspace object ownership Commit `9ead190b` ("IB/uverbs: Don't serialize with ib_uverbs_idr_mutex") rewrote how userspace objects are looked up in the uverbs module's idrs, and introduced a severe bug in the process: there is no checking that an operation is being performed by the right process any more. Fix this by adding the missing check of uobj->context in __idr_get_uobj(). Apparently everyone is being very careful to only touch their own objects, because this bug was introduced in June 2006 in 2.6.18, and has gone undetected until now. Cc: stable <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-19 20:01:43 -07:00
Anton Arapov	a25de534f8	[INET]: Justification for local port range robustness. There is a justifying patch for Stephen's patches. Stephen's patches disallows using a port range of one single port and brakes the meaning of the 'remaining' variable, in some places it has different meaning. My patch gives back the sense of 'remaining' variable. It should mean how many ports are remaining and nothing else. Also my patch allows using a single port. I sure we must be able to use mentioned port range, this does not restricted by documentation and does not brake current behavior. usefull links: Patches posted by Stephen Hemminger http://marc.info/?l=linux-netdev&m=119206106218187&w=2 http://marc.info/?l=linux-netdev&m=119206109918235&w=2 Andrew Morton's comment http://marc.info/?l=linux-kernel&m=119248225007737&w=2 1. Allows using a port range of one single port. 2. Gives back sense of 'remaining' variable. Signed-off-by: Anton Arapov <aarapov@redhat.com> Acked-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 22:00:17 -07:00
Joe Perches	898eb71cb1	Add missing newlines to some uses of dev_<level> messages Found these while looking at printk uses. Add missing newlines to dev_<level> uses Add missing KERN_<level> prefixes to multiline dev_<level>s Fixed a wierd->weird spelling typo Added a newline to a printk Signed-off-by: Joe Perches <joe@perches.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Mark M. Hoffman <mhoffman@lightlink.com> Cc: Roland Dreier <rolandd@cisco.com> Cc: Tilman Schmidt <tilman@imap.cc> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Greg KH <greg@kroah.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: James Smart <James.Smart@Emulex.Com> Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Jaroslav Kysela <perex@suse.cz> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:28 -07:00
Jack Morgenstein	839041329f	IB/mlx4: Sanity check userspace send queue sizes Add sanity checks to send queue sizes passed in from userspace. The minimum sq stride value below is taken from the MT25408 PRM (section 11.10, Table 306, log_sq_stride definition). Without this check, userspace can submit arbitrarily large/small values for the number of WQEs and the stride, which can crash the kernel. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-18 09:27:26 -07:00
Roland Dreier	fd312561ad	IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))" It's too hard to figure out what "!likely(...)" really means, and who knows how compilers interpret the hint. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-17 21:54:44 -07:00
Joachim Fenkes	8da9ee9c1e	IB/ehca: Enable large page MRs by default Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-17 21:47:47 -07:00
Joachim Fenkes	abc39d3672	IB/ehca: Change meaning of hca_cap_mr_pgsize ehca_shca.hca_cap_mr_pgsize now contains all supported page sizes ORed together. This makes some checks easier to code and understand, plus we can return this value verbatim in query_hca(), fixing a problem with SRP (reported by Anton Blanchard -- thanks!). Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-17 21:47:24 -07:00
Joachim Fenkes	8c08d50d4f	IB/ehca: Fix ehca_encode_hwpage_size() and alloc_fmr() Simplify ehca_encode_hwpage_size(), fixing an infinite loop for pgsize == 0 in the process. Fix the bug in alloc_fmr() that triggered the loop. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-17 21:46:37 -07:00
Joachim Fenkes	9511724da9	IB/ehca: Fix masking error in {,re}reg_phys_mr() Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-17 21:45:56 -07:00
Joachim Fenkes	c0c84d566d	IB/ehca: Supply QP token for SRQ base QPs Because hardware reports the SRQ token in RWQEs of SRQ base QPs, supply the base QP token as SRQ token, so we can properly find the SRQ base QP. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-17 21:45:17 -07:00
Joachim Fenkes	6b08f3ae8e	[POWERPC] ibmebus: Move to of_device and of_platform_driver, match eHCA and eHEA drivers Replace struct ibmebus_dev and struct ibmebus_driver with struct of_device and struct of_platform_driver, respectively. Match the external ibmebus interface and drivers using it. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Roland Dreier <rolandd@cisco.com> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-17 22:30:08 +10:00
Anton Blanchard	69fc507a14	IPoIB: Use round_jiffies() for ah_reap_task Use round_jiffies() to align the 1 second ah_reap_task with other work and potentially save power by sleeping cores for longer. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-16 12:28:56 -07:00
Sean Hefty	d02d1f5359	RDMA/cma: Fix deadlock destroying listen requests Deadlock condition reported by Kanoj Sarcar <kanoj@netxen.com>. The deadlock occurs when a connection request arrives at the same time that a wildcard listen is being destroyed. A wildcard listen maintains per device listen requests for each RDMA device in the system. The per device listens are automatically added and removed when RDMA devices are inserted or removed from the system. When a wildcard listen is destroyed, rdma_destroy_id() acquires the rdma_cm's device mutex ('lock') to protect against hot-plug events adding or removing per device listens. It then tries to destroy the per device listens by calling ib_destroy_cm_id() or iw_destroy_cm_id(). It does this while holding the device mutex. However, if the underlying iw/ib CM reports a connection request while this is occurring, the rdma_cm callback function will try to acquire the same device mutex. Since we're in a callback, the ib_destroy_cm_id() or iw_destroy_cm_id() calls will block until their callback thread returns, but the callback is blocked waiting for the device mutex. Fix this by re-working how per device listens are destroyed. Use rdma_destroy_id(), which avoids the deadlock, in place of cma_destroy_listen(). Additional synchronization is added to handle device hot-plug events and ensure that the id is not destroyed twice. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-16 12:25:49 -07:00
Sean Hefty	c5483388bb	RDMA/cma: Add locking around QP accesses If a user allocates a QP on an rdma_cm_id, the rdma_cm will automatically transition the QP through its states (RTR, RTS, error, etc.) While the QP state transitions are occurring, the QP itself must remain valid. Provide locking around the QP pointer to prevent its destruction while accessing the pointer. This fixes an issue reported by Olaf Kirch from Oracle that resulted in a system crash: "An incoming connection arrives and we decide to tear down the nascent connection. The remote ends decides to do the same. We start to shut down the connection, and call rdma_destroy_qp on our cm_id. ... Now apparently a 'connect reject' message comes in from the other host, and cma_ib_handler() is called with an event of IB_CM_REJ_RECEIVED. It calls cma_modify_qp_err, which for some odd reason tries to modify the exact same QP we just destroyed." Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-16 12:25:00 -07:00
Jens Axboe	53d412fce0	infiniband: sg chaining support Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-16 11:20:59 +02:00
Roland Dreier	ab8403c424	IB/mthca: Avoid alignment traps when writing doorbells Architectures such as ia64 see alignment traps when doing a 64-bit read from __be32 doorbell[2] arrays to do doorbell writes in mthca_write64(). Fix this by just passing the two halves of the doorbell value into mthca_write64(). This actually improves the generated code by allowing the compiler to see what's going on better. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-15 20:17:27 -07:00
Moni Shoua	200d1713b4	IB/ipoib: Verify address handle validity on send When the bonding device senses a carrier loss of its active slave it replaces that slave with a new one. In between the times when the carrier of an IPoIB device goes down and ipoib_neigh is destroyed, it is possible that the bonding driver will send a packet on a new slave that uses an old ipoib_neigh. This patch detects and prevents this from happenning. Signed-off-by: Moni Shoua <monis at voltaire.com> Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com> Acked-by: Roland Dreier <rdreier@cisco.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-15 14:20:45 -04:00
Moni Shoua	732a2170f4	IB/ipoib: Bound the net device to the ipoib_neigh structue IPoIB uses a two layer neighboring scheme, such that for each struct neighbour whose device is an ipoib one, there is a struct ipoib_neigh buddy which is created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour) call. When using the bonding driver, neighbours are created by the net stack on behalf of the bonding (master) device. On the tx flow the bonding code gets an skb such that skb->dev points to the master device, it changes this skb to point on the slave device and calls the slave hard_start_xmit function. Under this scheme, ipoib_neigh_destructor assumption that for each struct neighbour it gets, n->dev is an ipoib device and hence netdev_priv(n->dev) can be casted to struct ipoib_dev_priv is buggy. To fix it, this patch adds a dev field to struct ipoib_neigh which is used instead of the struct neighbour dev one, when n->dev->flags has the IFF_MASTER bit set. Signed-off-by: Moni Shoua <monis at voltaire.com> Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com> Acked-by: Roland Dreier <rdreier@cisco.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-15 14:20:45 -04:00
Linus Torvalds	df3d80f5a5	Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (207 commits) [SCSI] gdth: fix CONFIG_ISA build failure [SCSI] esp_scsi: remove __dev{init,exit} [SCSI] gdth: !use_sg cleanup and use of scsi accessors [SCSI] gdth: Move members from SCp to gdth_cmndinfo, stage 2 [SCSI] gdth: Setup proper per-command private data [SCSI] gdth: Remove gdth_ctr_tab[] [SCSI] gdth: switch to modern scsi host registration [SCSI] gdth: gdth_interrupt() gdth_get_status() & gdth_wait() fixes [SCSI] gdth: clean up host private data [SCSI] gdth: Remove virt hosts [SCSI] gdth: Reorder scsi_host_template intitializers [SCSI] gdth: kill gdth_{read,write}[bwl] wrappers [SCSI] gdth: Remove 2.4.x support, in-kernel changelog [SCSI] gdth: split out pci probing [SCSI] gdth: split out eisa probing [SCSI] gdth: split out isa probing gdth: Make one abuse of scsi_cmnd less obvious [SCSI] NCR5380: Use scsi_eh API for REQUEST_SENSE invocation [SCSI] usb storage: use scsi_eh API in REQUEST_SENSE execution [SCSI] scsi_error: Refactoring scsi_error to facilitate in synchronous REQUEST_SENSE ...	2007-10-15 08:19:33 -07:00
Kay Sievers	7eff2e7a8b	Driver core: change add_uevent_var to use a struct This changes the uevent buffer functions to use a struct instead of a long list of parameters. It does no longer require the caller to do the proper buffer termination and size accounting, which is currently wrong in some places. It fixes a known bug where parts of the uevent environment are overwritten because of wrong index calculations. Many thanks to Mathieu Desnoyers for finding bugs and improving the error handling. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-10-12 14:51:01 -07:00
FUJITA Tomonori	aebd5e476e	[SCSI] transport_srp: add rport roles attribute This adds a 'roles' attribute to rport like transport_fc. The role can be initiator or target. That is, the initiator driver creates target remote ports and the target driver creates initiator remote ports. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2007-10-12 14:37:46 -04:00
FUJITA Tomonori	3236822b1c	[SCSI] ib_srp: convert to use the srp transport class This converts ib_srp to use the srp transport class. I don't have ib hardware so I've not tested this patch. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2007-10-12 14:37:42 -04:00
Linus Torvalds	ce9d3c9a6a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (87 commits) mlx4_core: Fix section mismatches IPoIB: Allow setting policy to ignore multicast groups IB/mthca: Mark error paths as unlikely() in post_srq_recv functions IB/ipath: Minor fix to ordering of freeing and zeroing of tid pages. IB/ipath: Remove redundant link state checks IB/ipath: Fix IB_EVENT_PORT_ERR event IB/ipath: Better handling of unexpected GPIO interrupts IB/ipath: Maintain active time on all chips IB/ipath: Fix QHT7040 serial number check IB/ipath: Indicate a couple of chip bugs to userspace IB/ipath: iba6110 rev4 no longer needs recv header overrun workaround IB/ipath: Use counters in ipath_poll and cleanup interrupts in ipath_close IB/ipath: Remove duplicate copy of LMC IB/ipath: Add ability to set the LMC via the sysfs debugging interface IB/ipath: Optimize completion queue entry insertion and polling IB/ipath: Implement IB_EVENT_QP_LAST_WQE_REACHED IB/ipath: Generate flush CQE when QP is in error state IB/ipath: Remove redundant code IB/ipath: Future proof eeprom checksum code (contents reading) IB/ipath: UC RDMA WRITE with IMMEDIATE doesn't send the immediate ...	2007-10-11 19:43:13 -07:00
Stephen Hemminger	227b60f510	[INET]: local port range robustness Expansion of original idea from Denis V. Lunev <den@openvz.org> Add robustness and locking to the local_port_range sysctl. 1. Enforce that low < high when setting. 2. Use seqlock to ensure atomic update. The locking might seem like overkill, but there are cases where sysadmin might want to change value in the middle of a DoS attack. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 17:30:46 -07:00
Roland Dreier	9153f66a5b	IPoIB: Fix unused variable warning The conversion to use netdevice internal stats left an unused variable in ipoib_neigh_free(), since there's no longer any reason to get netdev_priv() in order to increment dropped packets. Delete the unused priv variable. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-10 16:55:30 -07:00
Roland Dreier	de90351219	[IPoIB]: Convert to netdevice internal stats Use the stats member of struct netdevice in IPoIB, so we can save memory by deleting the stats member of struct ipoib_dev_priv, and save code by deleting ipoib_get_stats(). Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:41 -07:00
Stephen Hemminger	3b04ddde02	[NET]: Move hardware header operations out of netdevice. Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:52 -07:00
Ralf Baechle	10d024c1b2	[NET]: Nuke SET_MODULE_OWNER macro. It's been a useless no-op for long enough in 2.6 so I figured it's time to remove it. The number of people that could object because they're maintaining unified 2.4 and 2.6 drivers is probably rather small. [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:13 -07:00
Eric W. Biederman	881d966b48	[NET]: Make the device list and device lookups per namespace. This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:10 -07:00
Stephen Hemminger	bea3348eef	[NET]: Make NAPI polling independent of struct net_device objects. Several devices have multiple independant RX queues per net device, and some have a single interrupt doorbell for several queues. In either case, it's easier to support layouts like that if the structure representing the poll is independant from the net device itself. The signature of the ->poll() call back goes from: int foo_poll(struct net_device dev, int budget) to int foo_poll(struct napi_struct napi, int budget) The caller is returned the number of RX packets processed (or the number of "NAPI credits" consumed if you want to get abstract). The callee no longer messes around bumping dev->quota, budget, etc. because that is all handled in the caller upon return. The napi_struct is to be embedded in the device driver private data structures. Furthermore, it is the driver's responsibility to disable all NAPI instances in it's ->stop() device close handler. Since the napi_struct is privatized into the driver's private data structures, only the driver knows how to get at all of the napi_struct instances it may have per-device. With lots of help and suggestions from Rusty Russell, Roland Dreier, Michael Chan, Jeff Garzik, and Jamal Hadi Salim. Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra, Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan. [ Ported to current tree and all drivers converted. Integrated Stephen's follow-on kerneldoc additions, and restored poll_list handling to the old style to fix mutual exclusion issues. -DaveM ] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:47:45 -07:00
Or Gerlitz	335a64a5a9	IPoIB: Allow setting policy to ignore multicast groups The kernel IB stack allows (through the RDMA CM) userspace applications to join and use multicast groups from the IPoIB MGID range. This allows multicast traffic to be handled directly from userspace QPs, without going through the kernel stack, which gives better performance for some applications. However, to fully interoperate with IP multicast, such userspace applications need to participate in IGMP reports and queries, or else routers may not forward the multicast traffic to the system where the application is running. The simplest way to do this is to share the kernel IGMP implementation by using the IP_ADD_MEMBERSHIP option to join multicast groups that are being handled directly in userspace. However, in such cases, the actual multicast traffic should not also be handled by the IPoIB interface, because that would burn resources handling multicast packets that will just be discarded in the kernel. To handle this, this patch adds lookup on the database used for IB multicast group reference counting when IPoIB is joining multicast groups, and if a multicast group is already handled by user space, then the IPoIB kernel driver ignores the group. This is controlled by a per-interface policy flag. When the flag is set, IPoIB will not join and attach its QP to a multicast group which already has an entry in the database; when the flag is cleared, IPoIB will behave as before this change. For each IPoIB interface, the /sys/class/net/$intf/umcast attribute controls the policy flag. The default value is off/0. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-10 13:02:30 -07:00
Eli Cohen	55a98e955c	IB/mthca: Mark error paths as unlikely() in post_srq_recv functions Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-10 11:24:33 -07:00
Dave Olson	3ac8c70f74	IB/ipath: Minor fix to ordering of freeing and zeroing of tid pages. Fixed to be the same as everywhere else. copy and then zero the page * in the array first, and then pass the copy to the VM routines. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 21:03:02 -07:00
Ralph Campbell	bda94e32b3	IB/ipath: Remove redundant link state checks This patch removes some redundant checks when the SMA changes the link state since the same checks are made in the lower level function that sets the state. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 21:02:08 -07:00
Ralph Campbell	49739b3e24	IB/ipath: Fix IB_EVENT_PORT_ERR event The link state event calls were being generated when the SM told the SMA to change link states. This works for IB_EVENT_PORT_ACTIVE but not if the link goes down and stays down. The fix is to generate event calls from the interrupt handler when the HW link state changes. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 21:01:38 -07:00
Michael Albaugh	6a733cdc71	IB/ipath: Better handling of unexpected GPIO interrupts The General Purpose I/O pins can be configured to cause interrupts. At the end of the interrupt code dealing with all known causes, a message is output if any bits remain un-handled. Since this is a "can't happen" scenario, it should only be triggered by bugs elsewhere. It is harmless, and potentially beneficial, to limit the damage by masking any such unexpected interrupts. This patch adds disabling of interrupts from any pins that should not have been allowed to interrupt, in addition to emitting a message. Signed-off-by: Michael Albaugh <Michael.Albaugh@Qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 21:00:47 -07:00
Michael Albaugh	192594d523	IB/ipath: Maintain active time on all chips There is a count of "active hours" maintained in EEPROM, to aid troubleshooting. The definition of "active" is based on traffic exceeding a threshold in any given 5-second polling interval. As originally written, the check was inadvertently bypassed for chips whose counters were 64-bits wide, and only applied to chips with 32-bit wide counters. This patch moves the test for amount of traffic "out" to a more common location, rather than depending on a side-effect of the software emulation of 64-bit counts on chips whose hardware is only 32-bits wide. Signed-off-by: Michael Albaugh <Michael.Albaugh@Qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 21:00:08 -07:00
Dave Olson	aa7c79abd1	IB/ipath: Fix QHT7040 serial number check Remove all the OEM and bringup boards, and complain and fail initialization if one is found. QHT7040 with GPIO rework (128ywwuuuu) is OK, older 112ywwuuuu is no longer supported). The check that had been added was failing both the 112 and 128 series. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:58:49 -07:00
Arthur Jones	20bed34314	IB/ipath: Indicate a couple of chip bugs to userspace A couple of chip bugs in the iba6110 and in the iba6120 are not in more recent chips. This first bug swaps two of the pioavail register locations. In the second bug, the chip can sometimes forget to dma the pio avail register to memory. We indicate the presence of these bugs with runtime flags and we indicate the presence of the flags by bumping the SWMINOR. Signed-off-by: Arthur Jones <arthur.jones@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:57:54 -07:00
Arthur Jones	4bec0b9155	IB/ipath: iba6110 rev4 no longer needs recv header overrun workaround iba6110 rev3 and earlier had a chip bug where the chip could overrun the recv header queue. rev4 fixed this chip bug so userspace no longer needs to workaround it. Now we only set the workaround flag for older chip versions. Signed-off-by: Arthur Jones <arthur.jones@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:56:59 -07:00
Arthur Jones	70c51da2c4	IB/ipath: Use counters in ipath_poll and cleanup interrupts in ipath_close ipath_poll() suffered from a couple subtle bugs. Under the right conditions we could leave recv interrupts enabled on an ipath user context on close, thereby taking potentially unwanted interrupts on the next open -- this is fixed by unconditionally turning off recv interrupts on close. Also, we now use counters rather than set/clear bits which allows us to make sure we catch all interrupts at the cost of changing the semantics slightly (it's now give me all events since the last time I called poll() rather than give me all events since I called _this_ poll routine). We also added some memory barriers which may help ensure we get all notifications in a timely manner. Signed-off-by: Arthur Jones <arthur.jones@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:56:23 -07:00
Ralph Campbell	542869a17e	IB/ipath: Remove duplicate copy of LMC The LMC value was being saved by the SMA in two places. This patch cleans it up so only one copy is kept. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:55:06 -07:00
Ralph Campbell	15cba26f42	IB/ipath: Add ability to set the LMC via the sysfs debugging interface This patch adds the ability to set the LMC via a sysfs file as if the SM sent a SubnSet(PortInfo) MAD. It is useful for debugging when no SM is running. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:53:50 -07:00
Ralph Campbell	6cff2faaf1	IB/ipath: Optimize completion queue entry insertion and polling The code to add an entry to the completion queue stored the QPN which is needed for the user level verbs view of the completion queue entry but the kernel struct ib_wc contains a pointer to the QP instead of a QPN. When the kernel polled for a completion queue entry, the QPN was lookup up and the QP pointer recovered. This patch stores the CQE differently based on whether the CQ is a kernel CQ or a user CQ thus avoiding the QPN to QP lookup overhead. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:52:23 -07:00
Ralph Campbell	d42b01b584	IB/ipath: Implement IB_EVENT_QP_LAST_WQE_REACHED This patch implements the IB_EVENT_QP_LAST_WQE_REACHED event which is needed by ib_ipoib to destroy the QP when used in connected mode. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:51:20 -07:00
Ralph Campbell	c9cf7db2bc	IB/ipath: Generate flush CQE when QP is in error state Follow the IB spec. (C10-96) for post send which states that a flushed completion event should be generated for work requests posted when a QP is in the error state. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:50:29 -07:00
Ralph Campbell	036be09ca5	IB/ipath: Remove redundant code This patch removes some redundant initialization code. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:46:23 -07:00
Dave Olson	d29cc6efb9	IB/ipath: Future proof eeprom checksum code (contents reading) In an earlier change, the amount of data read from the flash was mistakenly limited to the size known to the current driver. This causes problems when the length is increased, and written with the new longer version; the checksum would fail because not enough data was read. Always read the full 128 byte length to prevent this. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:45:57 -07:00
Ralph Campbell	55046698fa	IB/ipath: UC RDMA WRITE with IMMEDIATE doesn't send the immediate This patch fixes a bug in the receive processing for UC RDMA WRITE with immediate which caused the last packet to be dropped. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:44:56 -07:00
Dave Olson	9ef8617af7	IB/ipath: Correctly describe workaround for TID write chip bug This is a comment change, only, correcting the comment to match the implemented workaround, rather than the original workaround, and clarifying why it's needed. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:44:20 -07:00
Ralph Campbell	1793b4771d	IB/ipath: Remove unneeded code for ipathfs The ipathfs file system is used to export binary data verses ASCII data such as through /sys. This patch removes some unneeded files since the data is available through other /sys files. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:43:17 -07:00
Dave Olson	9bec399231	IB/ipath: Verify host bus bandwidth to chip will not limit performance There have been a number of issues where host bandwidth via HT or PCIe to the InfiniPath chip has been limited in some fashion (BIOS, configuration, etc.), resulting in user confusion. This check gives a clear warning that something is wrong and needs to be resolved. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:20:15 -07:00
Ralph Campbell	4ee97180ac	IB/ipath: Change UD to queue work requests like RC & UC The code to post UD sends tried to process work requests at the time ib_post_send() is called without using a WQE queue. This was fine as long as HW resources were available for sending a packet. This patch changes UD to be handled more like RC and UC and shares more code. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:05:49 -07:00
Ralph Campbell	210d6ca3db	IB/ipath: Performance optimization for CPU differences Different processors have different ordering restrictions for write combining. By taking advantage of this, we can eliminate some write barriers when writing to the send buffers. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:04:14 -07:00
Arthur Jones	327a338d4f	IB/ipath: iba6110 rev4 GPIO counters support On iba6110 rev4, support for three more IB counters were added. The LocalLinkIntegrityError counter, the ExcessiveBufferOverrunErrors counter and support for error counting of flow control packets on an invalid VL. These counters trigger GPIO interrupts and the sw keeps track of the counts. Since we also use GPIO interrupts to signal packet reception, we need to turn off the fast interrupts, or we risk losing a GPIO interrupt. Signed-off-by: Arthur Jones <arthur.jones@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 20:02:46 -07:00
Roland Dreier	76dea3bc26	IB/ehca: Fix clipping of device limits to INT_MAX Doing min_t(int, foo, INT_MAX) doesn't work correctly, because if foo is bigger than INT_MAX, then when treated as a signed integer, it will become negative and hence such an expression is just an elaborate NOP. Fix such cases in ehca to do min_t(unsigned, foo, INT_MAX) instead. This fixes negative reported values for max_cqe, max_pd and max_ah: Before: max_cqe: -64 max_pd: -1 max_ah: -1 After: max_cqe: 2147483647 max_pd: 2147483647 max_ah: 2147483647 Based on a bug report and fix from Anton Blanchard <anton@samba.org>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:18 -07:00
Dotan Barak	ede6bc04f3	IPoIB/cm: Clean up initialization of QP attr in ipoib_cm_create_tx_qp() Make the way QP is being created in ipoib_cm_create_tx_qp() consistent with ipoib_cm_create_rx_qp(). Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:18 -07:00
Roland Dreier	76d7cc0345	IB/mthca: Use mmiowb() to avoid firmware commands getting jumbled up Firmware commands are sent to the HCA by writing multiple words to a command register block. Access to this block of registers is serialized with a mutex. However, on large SGI systems, problems were seen with multiple CPUs issuing FW commands at the same time, because the writes to the register block may be reordered within the system interconnect and reach the HCA in a different order than they were issued (even with the mutex). Fix this by adding an mmiowb() before dropping the mutex. Tested-by: Arthur Kepner <akepner@sgi.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:17 -07:00
Sean Hefty	dcb3f974da	RDMA/cma: Queue IB CM MRAs to avoid unnecessary remote retries Automatically queue MRA message to decrease the number of retries sent by the remote side during connection establishment. This also has the effect of increasing the overall connection timeout without using a longer retry time in the case of dropped packets. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:17 -07:00
Sean Hefty	de98b693e9	IB/cm: Modify interface to send MRAs in response to duplicate messages The IB CM provides a message received acknowledged (MRA) message that can be sent to indicate that a REQ or REP message has been received, but will require more time to process than the timeout specified by those messages. In many cases, the application may not know how long it will take to respond to a CM message, but the majority of the time, it will usually respond before a retry has been sent. Rather than sending an MRA in response to all messages just to handle the case where a longer timeout is needed, it is more efficient to queue the MRA for sending in case a duplicate message is received. This avoids sending an MRA when it is not needed, but limits the number of times that a REQ or REP will be resent. It also provides for a simpler implementation than generating the MRA based on a timer event. (That is, trying to send the MRA after receiving the first REQ or REP if a response has not been generated, so that it is received at the remote side before a duplicate REQ or REP has been received) Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:17 -07:00
Roland Dreier	1a1eb6a646	IB/mthca: Increase max number of QPs per multicast group to 56 Increase the number of QPs allowed per multicast group from 8 to 56. This allows for one QP per core on 16-core systems, which are now quite common, and allows some space for future growth. This is basically the same patch that Jack Morgenstein <jackm@dev.mellanox.co.il> just supplied for mlx4. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:17 -07:00
Jack Morgenstein	8ad11fb6b0	IB/mlx4: Implement FMRs Implement FMRs for mlx4. This is an adaptation of code from mthca. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:16 -07:00
Jack Morgenstein	d7bb58fb1c	mlx4_core: Write MTTs from CPU instead with of WRITE_MTT FW command Write MTT entries directly to ICM from the driver (eliminating use of WRITE_MTT command). This reduces the number of FW commands needed to register an MR by at least a factor of 2 and speeds up memory registration significantly. This code will also be used to implement FMRs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:16 -07:00
Roland Dreier	04d29b0ede	IB/uverbs: Make ib_uverbs_release_event_file() static ib_uverbs_release_event_file() is only used in uverbs_main.c, so make it static to that file. Also move the definition before the first use, so a forward declaration is not needed. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:15 -07:00
Roland Dreier	a394f83bdf	IB/umad: Fix bit ordering and 32-on-64 problems on big endian systems The declaration of struct ib_user_mad_reg_req.method_mask[] exported to userspace was an array of __u32, but the kernel internally treated it as a bitmap made up of longs. This makes a difference for 64-bit big-endian kernels, where numbering the bits in an array of__u32 gives: \|31.....0\|63....31\|95....64\|127...96\| while numbering the bits in an array of longs gives: \|63..............0\|127............64\| 64-bit userspace can handle this by just treating method_mask[] as an array of longs, but 32-bit userspace is really stuck: the meaning of the bits in method_mask[] depends on whether the kernel is 32-bit or 64-bit, and there's no sane way for userspace to know that. Fix this by updating <rdma/ib_user_mad.h> to make it clear that method_mask[] is an array of longs, and using a compat_ioctl method to convert to an array of 64-bit longs to handle the 32-on-64 problem. This fixes the interface description to match existing behavior (so working binaries continue to work) in almost all situations, and gives consistent semantics in the case of 32-bit userspace that can run on either a 32-bit or 64-bit kernel, so that the same binary can work for both 32-on-32 and 32-on-64 systems. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:15 -07:00
Roland Dreier	2be8e3ee8e	IB/umad: Add P_Key index support Add support for setting the P_Key index of sent MADs and getting the P_Key index of received MADs. This requires a change to the layout of the ABI structure struct ib_user_mad_hdr, so to avoid breaking compatibility, we default to the old (unchanged) ABI and add a new ioctl IB_USER_MAD_ENABLE_PKEY that allows applications that are aware of the new ABI to opt into using it. We plan on switching to the new ABI by default in a year or so, and this patch adds a warning that is printed when an application uses the old ABI, to push people towards converting to the new ABI. Signed-off-by: Roland Dreier <rolandd@cisco.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Hal Rosenstock <hal@xsigo.com>	2007-10-09 19:59:15 -07:00
Joachim Fenkes	c01759cee9	IB/ehca: Return srq_attr->max_sge in ehca_query_srq() Totally forgot this. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:15 -07:00
Hoang-Nam Nguyen	a660722375	IB/ehca: Adjust 64-bit alignment of create QP response for userspace Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:14 -07:00
Hoang-Nam Nguyen	03f72a51cb	IB/ehca: Fix mem leak of firmware ctrlblock in ehca_create_srq() Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:14 -07:00
Jack Morgenstein	cd9281d873	IB/mlx4: Display misc device information under /sys/class/infiniband/ display the following device information under /sys/class/infiniband/mlx4_X: board_id, fw_ver, hw_rev, hca_type. This patch makes this information available to userspace utilities such as ibstat and ibv_devinfo. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:14 -07:00
Ralph Campbell	57cb61d587	IB/core: Fix handling of multicast response failures I was looking at the code for multicast.c and noticed that ib_sa_join_multicast() calls queue_join() which puts the request at the front of the group->pending_list. If this is a second request, it seems like it would interfere with process_join_error() since group->last_join won't point to the member at the head of the pending_list. The sequence would thus be: 1. ib_sa_join_multicast() puts member1 on head of pending_list and starts work thread 2. mcast_work_handler() calls send_join() which sets group->last_join to member1 3. ib_sa_join_multicast() puts member2 on head of pending_list 4. join operation for member1 receives failures response from SA. 5. join_handler() is called with error status 6. process_join_error() fails to process member1 since it doesn't match the first entry in the group->pending_list. The impact is that the failed join request is tossed. The second request is processed, and after it completes, the original request ends up being retried. This change also results in join requests being processed in FIFO order. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:14 -07:00
Satyam Sharma	9faa559c01	IB/ehca: Misc cpuinit section annotations and #ifdef cleanups * Replace {un}register_cpu_notifier with {un}register_hotcpu_notifier thereby losing a couple of #ifdef HOTPLUG_CPU pairs. * Move comp_pool_callback_nb declaration to below that of callback function so that initialization of .notifier_call and .priority can occur at build time itself and not runtime. * Mark the notifier_block (and callback function, and another static function used by it) as __cpuinit{data} for the sake of consistency and remove enclosing #ifdef. (This may increase size for modular build of this module, however, because these are no longer dropped unconditionally now.) Signed-off-by: Satyam Sharma <satyam@infradead.org> Acked-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:14 -07:00
Roland Dreier	ec2a1344ad	IB/iser: Remove unnecessary includes <asm/scatterlist.h> is not needed because everyplace it appears, <linux/scatterlist.h> also appears. <asm/io.h> is not needed because nothing seems to be using device IO anyway. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:13 -07:00
Steve Wise	935ef2d7a2	RDMA/cma: Use neigh_event_send() to start neighbour discovery Calling arp_send() to initiate neighbour discovery (ND) doesn't do the full ND protocol. Namely, it doesn't handle retransmitting the arp request if it is dropped. The function neigh_event_send() does all this. Without doing full ND, RDMA address resolution fails in the presence of dropped ARP broadcast packets. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:13 -07:00
Joachim Fenkes	3a31c41901	IB/ehca: Only use MR large pages for hugetlb regions ...because, on virtualized hardware like System p, we can't be sure that the physical pages behind them are contiguous otherwise. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:13 -07:00
Joachim Fenkes	c8d8beea03	IB/umem: Add hugetlb flag to struct ib_umem During ib_umem_get(), determine whether all pages from the memory region are hugetlb pages and report this in the "hugetlb" member. Low-level drivers can use this information if they need it. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:13 -07:00
Sean Hefty	247e020ee5	IB/srp: Add QoS support through service ID Provide the target service ID when performing a path record query to support optional QoS capability. QoS requires support from the SA. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:12 -07:00
Sean Hefty	7ce86409ad	RDMA/ucma: Allow user space to set service type Export the ability to set the type of service to user space. Model the interface after setsockopt. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:12 -07:00
Sean Hefty	a81c994d5e	RDMA/cma: Add ability to specify type of service Provide support to specify a type of service for a communication identifier. A new function call is used when dealing with IPv4 addresses. For IPv6 addresses, the ToS is specified through the traffic class field in the sockaddr_in6 structure. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ The comments Eitan Zahavi and myself have made over the v1 post at <http://lists.openfabrics.org/pipermail/general/2007-August/039247.html> were fully addressed. ] Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:12 -07:00
Sean Hefty	733d65fe33	IB/sa: Add new QoS fields to path record The QoS annex defines new fields for path records. Add them to the ib_sa for consumers that want to use them. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:12 -07:00
Sean Hefty	81668838c4	IPoIB: Specify Traffic Class with path record queries for QoS support To support QoS within and between subnets, modify IPoIB to request specific Traffic Class values with path record queries, using the value associated with the IPoIB broadcast group. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ See some comments I made on this at v1 and v2 of the posts <http://lists.openfabrics.org/pipermail/general/2007-August/039275.html> <http://lists.openfabrics.org/pipermail/general/2007-September/040312.html> ] Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:11 -07:00
Hoang-Nam Nguyen	08c283ac26	IB/ehca: Fix large page HW cap defines Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:11 -07:00
Joachim Fenkes	39089e7774	IB/ehca: Bump version number and change its format Nobody needed the SVNEHCA_ prefix anyway. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:11 -07:00
Joachim Fenkes	5110e4de49	IB/ehca: Replace get_paca()->paca_index by the more portable raw_smp_processor_id() We can use raw_smp_processor_id() here because the processor ID is only used for debug output and therefore our use is preemption-unsafe. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:11 -07:00
Joachim Fenkes	0b5de96858	IB/ehca: Serialize MR alloc and MR free hvCalls Some firmware levels exhibit a race condition between H_ALLOC_RESOURCE(MR) and H_FREE_RESOURCE(MR). Work around this problem by locking these hvCalls against each other. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:11 -07:00
Joachim Fenkes	e90d0b3dae	IB/ehca: Path migration support Fix some modify_qp() issues related to path migration. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:10 -07:00
Joachim Fenkes	b708fba3c2	IB/ehca: Add check for max #SGE to create_qp() Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:10 -07:00
Joachim Fenkes	86dce445e0	IB/ehca: ehca_gen_warn() should always print Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:10 -07:00
Joachim Fenkes	e37221928b	IB/ehca: Print return codes as signed decimal integers ...because -12 is easier to read than FFFFFFF4. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:10 -07:00
Joachim Fenkes	2863ad4bdd	IB/ehca: Refactor hvcall tracing Change hvcall trace output towards better readability: reg numbers instead of argument numbers, return code as signed decimal instead of unsigned hex. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:10 -07:00
Hoang-Nam Nguyen	e390d3b52f	IB/ehca: Use remap_4k_pfn() to map firmware contexts to user space Use Paul's new remap_4k_pfn() function to map our 4K firmware contexts into user space on 64K-page machines without exposing neighboring firmware contexts. Return the context's offset within a 64K page to user space so it can determine the proper virtual address. For details about remap_4k_pfn(), see commit `721151d0` or http://patchwork.ozlabs.org/linuxppc/patch?id=10281 Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:09 -07:00
Stefan Roscher	5281a4b8a0	IB/ehca: Support more than 4k QPs for userspace and kernelspace Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:08 -07:00

... 3 4 5 6 7 ...

1688 Commits