The resource tracker is used to track usage of HCA resources by the different
guests.
Virtual functions (VFs) are attached to guest operating systems but
resources are allocated from the same pool and are assigned to VFs. It is
essential that hostile/buggy guests not be able to affect the operation of
other VFs, possibly attached to other guest OSs since ConnectX firmware is not
tolerant to misuse of resources.
The resource tracker module associates each resource with a VF and maintains
state information for the allocated object. It also defines allowed state
transitions and enforces them.
Relationships between resources are also referred to. For example, CQs are
pointed to by QPs, so it is forbidden to destroy a CQ if a QP refers to it.
ICM memory is always accessible through the primary function and hence it is
allocated by the owner of the primary function.
When a guest dies, an FLR is generated for all the VFs it owns and all the
resources it used are freed.
The tracked resource types are: QPs, CQs, SRQs, MPTs, MTTs, MACs, RES_EQs,
and XRCDNs.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
MTTs are resources which are allocated and tracked by the PF driver.
In multifunction mode, the allocation and icm mapping is done in
the resource tracker (later patch in this sequence).
To accomplish this, we have "work" functions whose names start with
"__", and "request" functions (same name, no __). If we are operating
in multifunction mode, the request function actually results in
comm-channel commands being sent (ALLOC_RES or FREE_RES).
The PF-driver comm-channel handler will ultimately invoke the
"work" (__) function and return the result.
If we are not in multifunction mode, the "work" handler is invoked
immediately.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following commands are added here:
1. QUERY_FUNC_CAP and its wrapper. This function is used by VFs when
they start up to receive configuration information from the PF, such
as resource quotas for this VF, which ports should be used (currently
two), what protocol is running on the port (currently Ethernet ONLY,
or port not active).
2. QUERY_PORT and its wrapper. Previously, this FW command was invoked directly
by the ETH driver (en_port.c) using mlx4_cmd_box. Virtualization is now
required here (the VF's MAC address must be substituted for the PFs
MAC address returned by the FW). We changed the invocation
in the ETH driver to use mlx4_QUERY_PORT, and added the wrapper.
3. QUERY_HCA. Used by the VF to determine how the HCA was initialized.
For now, we need only the multicast table member entry size
(log2_mc_table_entry_sz, in the ConnectX PRM). No wrapper is needed
here, because the data may be passed as is to the VF without modification).
In this command, we have added a GLOBAL_CAPS field for passing required
configuration information from FW to a VF (this field is to allow safely
adding new SRIOV capabilities which require support in VF drivers, too).
Bits will set here by FW in response to PF-driver configuration commands which
will activate as yet undefined new SRIOV features. The VF will test to see that
all required capabilities indicated by this field are supported (i.e., if a bit
is set and the VF driver does not recognize that bit, it must abort
its initialization). Currently, no bits are set.
4. Added a CLOSE_PORT wrapper. The PF context needs to keep track of how many VF contexts
have the port open. The PF context will not actually issue the FW close port command
until the last port user issues a CLOSE_PORT request.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Marcel Apfelbaum <marcela@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
When SRIOV is enabled on the chip (at FW burning time),
the HCA uses only 17 bits for the PD. The remaining 7 high-order bits
are ignored.
Change the allocator to return only 17 bits for the PD. The MSB 7
bits will be used to encode the slave number for consistency
checking later on in the resource tracker.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
These changes will not affect module operation as yet. They
are only to get some structs and enums in place for use by
subsequent patches (making those smaller).
Added here:
* sriov state structs and inlines (mlx4_is_master/slave/mfunc)
* comm-channel and vhcr support structures
* enum values for new FW and comm-channel virtual commands
(i.e., commands, passed via the comm channel to the PF-driver).
* prototypes for many command wrapper functions (used by the
PF context for processing FW commands passed to it by the VFs).
* struct mlx4_eqe is moved from eq.c to mlx4.h (it will be used
by other mlx4_core source files).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (62 commits)
mlx4_core: Deprecate log_num_vlan module param
IB/mlx4: Don't set VLAN in IBoE WQEs' control segment
IB/mlx4: Enable 4K mtu for IBoE
RDMA/cxgb4: Mark QP in error before disabling the queue in firmware
RDMA/cxgb4: Serialize calls to CQ's comp_handler
RDMA/cxgb3: Serialize calls to CQ's comp_handler
IB/qib: Fix issue with link states and QSFP cables
IB/mlx4: Configure extended active speeds
mlx4_core: Add extended port capabilities support
IB/qib: Hold links until tuning data is available
IB/qib: Clean up checkpatch issue
IB/qib: Remove s_lock around header validation
IB/qib: Precompute timeout jiffies to optimize latency
IB/qib: Use RCU for qpn lookup
IB/qib: Eliminate divide/mod in converting idx to egr buf pointer
IB/qib: Decode path MTU optimization
IB/qib: Optimize RC/UC code by IB operation
IPoIB: Use the right function to do DMA unmap pages
RDMA/cxgb4: Use correct QID in insert_recv_cqe()
RDMA/cxgb4: Make sure flush CQ entries are collected on connection close
...
Moves the Mellanox driver into drivers/net/ethernet/mellanox/ and
make the necessary Kconfig and Makefile changes.
CC: Roland Dreier <roland@kernel.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>