OpenCloudOS-Kernel/drivers/pci/pci.h

374 lines
12 KiB
C
Raw Normal View History

#ifndef DRIVERS_PCI_H
#define DRIVERS_PCI_H
#define PCI_FIND_CAP_TTL 48
PCI: Recognize Thunderbolt devices Detect on probe whether a PCI device is part of a Thunderbolt controller. Intel uses a Vendor-Specific Extended Capability (VSEC) with ID 0x1234 on such devices. Detect presence of this VSEC and cache it in a newly added is_thunderbolt bit in struct pci_dev. Also, add a helper to check whether a given PCI device is situated on a Thunderbolt daisy chain (i.e., below a PCI device with is_thunderbolt set). The necessity arises from the following: * If an external Thunderbolt GPU is connected to a dual GPU laptop, that GPU is currently registered with vga_switcheroo even though it can neither drive the laptop's panel nor be powered off by the platform. To vga_switcheroo it will appear as if two discrete GPUs are present. As a result, when the external GPU is runtime suspended, vga_switcheroo will cut power to the internal discrete GPU which may not be runtime suspended at all at this moment. The solution is to not register external GPUs with vga_switcheroo, which necessitates a way to recognize if they're on a Thunderbolt daisy chain. * Dual GPU MacBook Pros introduced 2011+ can no longer switch external DisplayPort ports between GPUs. (They're no longer just used for DP but have become combined DP/Thunderbolt ports.) The driver to switch the ports, drivers/platform/x86/apple-gmux.c, needs to detect presence of a Thunderbolt controller and, if found, keep external ports permanently switched to the discrete GPU. v2: Make kerneldoc for pci_is_thunderbolt_attached() more precise, drop portion of commit message pertaining to separate series. (Bjorn Helgaas) Cc: Andreas Noever <andreas.noever@gmail.com> Cc: Michael Jamet <michael.jamet@intel.com> Cc: Tomas Winkler <tomas.winkler@intel.com> Cc: Amir Levy <amir.jer.levy@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: http://patchwork.freedesktop.org/patch/msgid/0ab165a4a35c0b60f29d4c306c653ead14fcd8f9.1489145162.git.lukas@wunner.de
2017-03-11 04:23:45 +08:00
#define PCI_VSEC_ID_INTEL_TBT 0x1234 /* Thunderbolt */
extern const unsigned char pcie_link_speed[];
bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
/* Functions internal to the PCI core code */
int pci_create_sysfs_dev_files(struct pci_dev *pdev);
void pci_remove_sysfs_dev_files(struct pci_dev *pdev);
#if !defined(CONFIG_DMI) && !defined(CONFIG_ACPI)
static inline void pci_create_firmware_label_files(struct pci_dev *pdev)
{ return; }
static inline void pci_remove_firmware_label_files(struct pci_dev *pdev)
{ return; }
#else
void pci_create_firmware_label_files(struct pci_dev *pdev);
void pci_remove_firmware_label_files(struct pci_dev *pdev);
#endif
void pci_cleanup_rom(struct pci_dev *dev);
enum pci_mmap_api {
PCI_MMAP_SYSFS, /* mmap on /sys/bus/pci/devices/<BDF>/resource<N> */
PCI_MMAP_PROCFS /* mmap on /proc/bus/pci/<BDF> */
};
int pci_mmap_fits(struct pci_dev *pdev, int resno, struct vm_area_struct *vmai,
enum pci_mmap_api mmap_api);
int pci_probe_reset_function(struct pci_dev *dev);
/**
* struct pci_platform_pm_ops - Firmware PM callbacks
*
* @is_manageable: returns 'true' if given device is power manageable by the
* platform firmware
*
* @set_state: invokes the platform firmware to set the device's power state
*
* @get_state: queries the platform firmware for a device's current power state
*
* @choose_state: returns PCI power state of given device preferred by the
* platform; to be used during system-wide transitions from a
* sleeping state to the working state and vice versa
*
* @sleep_wake: enables/disables the system wake up capability of given device
PCI ACPI: Rework PCI handling of wake-up * Introduce function acpi_pm_device_sleep_wake() for enabling and disabling the system wake-up capability of devices that are power manageable by ACPI. * Introduce function acpi_bus_can_wakeup() allowing other (dependent) subsystems to check if ACPI is able to enable the system wake-up capability of given device. * Introduce callback .sleep_wake() in struct pci_platform_pm_ops and for the ACPI PCI 'driver' make it use acpi_pm_device_sleep_wake(). * Introduce callback .can_wakeup() in struct pci_platform_pm_ops and for the ACPI 'driver' make it use acpi_bus_can_wakeup(). * Move the PME# handlig code out of pci_enable_wake() and split it into two functions, pci_pme_capable() and pci_pme_active(), allowing the caller to check if given device is capable of generating PME# from given power state and to enable/disable the device's PME# functionality, respectively. * Modify pci_enable_wake() to use the new ACPI callbacks and the new PME#-related functions. * Drop the generic .platform_enable_wakeup() callback that is not used any more. * Introduce device_set_wakeup_capable() that will set the power.can_wakeup flag of given device. * Rework PCI device PM initialization so that, if given device is capable of generating wake-up events, either natively through the PME# mechanism, or with the help of the platform, its power.can_wakeup flag is set and its power.should_wakeup flag is unset as appropriate. * Make ACPI set the power.can_wakeup flag for devices found to be wake-up capable by it. * Make the ACPI wake-up code enable/disable GPEs for devices that have the wakeup.flags.prepared flag set (which means that their wake-up power has been enabled). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-07 09:34:48 +08:00
*
PCI / ACPI / PM: Platform support for PCI PME wake-up Although the majority of PCI devices can generate PMEs that in principle may be used to wake up devices suspended at run time, platform support is generally necessary to convert PMEs into wake-up events that can be delivered to the kernel. If ACPI is used for this purpose, PME signals generated by a PCI device will trigger the ACPI GPE associated with the device to generate an ACPI wake-up event that we can set up a handler for, provided that everything is configured correctly. Unfortunately, the subset of PCI devices that have GPEs associated with them is quite limited. The devices without dedicated GPEs have to rely on the GPEs associated with other devices (in the majority of cases their upstream bridges and, possibly, the root bridge) to generate ACPI wake-up events in response to PME signals from them. Add ACPI platform support for PCI PME wake-up: o Add a framework making is possible to use ACPI system notify handlers for run-time PM. o Add new PCI platform callback ->run_wake() to struct pci_platform_pm_ops allowing us to enable/disable the platform to generate wake-up events for given device. Implemet this callback for the ACPI platform. o Define ACPI wake-up handlers for PCI devices and PCI root buses and make the PCI-ACPI binding code register wake-up notifiers for all PCI devices present in the ACPI tables. o Add function pci_dev_run_wake() which can be used by PCI drivers to check if given device is capable of generating wake-up events at run time. Developed in cooperation with Matthew Garrett <mjg@redhat.com>. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-18 06:44:09 +08:00
* @run_wake: enables/disables the platform to generate run-time wake-up events
* for given device (the device's wake-up capability has to be
* enabled by @sleep_wake for this feature to work)
*
* @need_resume: returns 'true' if the given device (which is currently
* suspended) needs to be resumed to be configured for system
* wakeup.
*
* If given platform is generally capable of power managing PCI devices, all of
* these callbacks are mandatory.
*/
struct pci_platform_pm_ops {
bool (*is_manageable)(struct pci_dev *dev);
int (*set_state)(struct pci_dev *dev, pci_power_t state);
pci_power_t (*get_state)(struct pci_dev *dev);
pci_power_t (*choose_state)(struct pci_dev *dev);
PCI ACPI: Rework PCI handling of wake-up * Introduce function acpi_pm_device_sleep_wake() for enabling and disabling the system wake-up capability of devices that are power manageable by ACPI. * Introduce function acpi_bus_can_wakeup() allowing other (dependent) subsystems to check if ACPI is able to enable the system wake-up capability of given device. * Introduce callback .sleep_wake() in struct pci_platform_pm_ops and for the ACPI PCI 'driver' make it use acpi_pm_device_sleep_wake(). * Introduce callback .can_wakeup() in struct pci_platform_pm_ops and for the ACPI 'driver' make it use acpi_bus_can_wakeup(). * Move the PME# handlig code out of pci_enable_wake() and split it into two functions, pci_pme_capable() and pci_pme_active(), allowing the caller to check if given device is capable of generating PME# from given power state and to enable/disable the device's PME# functionality, respectively. * Modify pci_enable_wake() to use the new ACPI callbacks and the new PME#-related functions. * Drop the generic .platform_enable_wakeup() callback that is not used any more. * Introduce device_set_wakeup_capable() that will set the power.can_wakeup flag of given device. * Rework PCI device PM initialization so that, if given device is capable of generating wake-up events, either natively through the PME# mechanism, or with the help of the platform, its power.can_wakeup flag is set and its power.should_wakeup flag is unset as appropriate. * Make ACPI set the power.can_wakeup flag for devices found to be wake-up capable by it. * Make the ACPI wake-up code enable/disable GPEs for devices that have the wakeup.flags.prepared flag set (which means that their wake-up power has been enabled). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-07 09:34:48 +08:00
int (*sleep_wake)(struct pci_dev *dev, bool enable);
PCI / ACPI / PM: Platform support for PCI PME wake-up Although the majority of PCI devices can generate PMEs that in principle may be used to wake up devices suspended at run time, platform support is generally necessary to convert PMEs into wake-up events that can be delivered to the kernel. If ACPI is used for this purpose, PME signals generated by a PCI device will trigger the ACPI GPE associated with the device to generate an ACPI wake-up event that we can set up a handler for, provided that everything is configured correctly. Unfortunately, the subset of PCI devices that have GPEs associated with them is quite limited. The devices without dedicated GPEs have to rely on the GPEs associated with other devices (in the majority of cases their upstream bridges and, possibly, the root bridge) to generate ACPI wake-up events in response to PME signals from them. Add ACPI platform support for PCI PME wake-up: o Add a framework making is possible to use ACPI system notify handlers for run-time PM. o Add new PCI platform callback ->run_wake() to struct pci_platform_pm_ops allowing us to enable/disable the platform to generate wake-up events for given device. Implemet this callback for the ACPI platform. o Define ACPI wake-up handlers for PCI devices and PCI root buses and make the PCI-ACPI binding code register wake-up notifiers for all PCI devices present in the ACPI tables. o Add function pci_dev_run_wake() which can be used by PCI drivers to check if given device is capable of generating wake-up events at run time. Developed in cooperation with Matthew Garrett <mjg@redhat.com>. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-18 06:44:09 +08:00
int (*run_wake)(struct pci_dev *dev, bool enable);
bool (*need_resume)(struct pci_dev *dev);
};
int pci_set_platform_pm(const struct pci_platform_pm_ops *ops);
void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
void pci_power_up(struct pci_dev *dev);
void pci_disable_enabled_device(struct pci_dev *dev);
int pci_finish_runtime_suspend(struct pci_dev *dev);
int __pci_pme_wakeup(struct pci_dev *dev, void *ign);
bool pci_dev_keep_suspended(struct pci_dev *dev);
void pci_dev_complete_resume(struct pci_dev *pci_dev);
void pci_config_pm_runtime_get(struct pci_dev *dev);
void pci_config_pm_runtime_put(struct pci_dev *dev);
void pci_pm_init(struct pci_dev *dev);
void pci_ea_init(struct pci_dev *dev);
void pci_allocate_cap_save_buffers(struct pci_dev *dev);
void pci_free_cap_save_buffers(struct pci_dev *dev);
bool pci_bridge_d3_possible(struct pci_dev *dev);
void pci_bridge_d3_update(struct pci_dev *dev);
static inline void pci_wakeup_event(struct pci_dev *dev)
{
/* Wait 100 ms before the system can be put into a sleep state. */
pm_wakeup_event(&dev->dev, 100);
}
static inline bool pci_has_subordinate(struct pci_dev *pci_dev)
{
return !!(pci_dev->subordinate);
}
PCI: Put PCIe ports into D3 during suspend Currently the Linux PCI core does not touch power state of PCI bridges and PCIe ports when system suspend is entered. Leaving them in D0 consumes power unnecessarily and may prevent the CPU from entering deeper C-states. With recent PCIe hardware we can power down the ports to save power given that we take into account few restrictions: - The PCIe port hardware is recent enough, starting from 2015. - Devices connected to PCIe ports are effectively in D3cold once the port is transitioned to D3 (the config space is not accessible anymore and the link may be powered down). - Devices behind the PCIe port need to be allowed to transition to D3cold and back. There is a way both drivers and userspace can forbid this. - If the device behind the PCIe port is capable of waking the system it needs to be able to do so from D3cold. This patch adds a new flag to struct pci_device called 'bridge_d3'. This flag is set and cleared by the PCI core whenever there is a change in power management state of any of the devices behind the PCIe port. When system later on is suspended we only need to check this flag and if it is true transition the port to D3 otherwise we leave it in D0. Also provide override mechanism via command line parameter "pcie_port_pm=[off|force]" that can be used to disable or enable the feature regardless of the BIOS manufacturing date. Tested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-02 16:17:12 +08:00
static inline bool pci_power_manageable(struct pci_dev *pci_dev)
{
/*
* Currently we allow normal PCI devices and PCI bridges transition
* into D3 if their bridge_d3 is set.
*/
return !pci_has_subordinate(pci_dev) || pci_dev->bridge_d3;
}
struct pci_vpd_ops {
ssize_t (*read)(struct pci_dev *dev, loff_t pos, size_t count, void *buf);
ssize_t (*write)(struct pci_dev *dev, loff_t pos, size_t count, const void *buf);
int (*set_size)(struct pci_dev *dev, size_t len);
};
struct pci_vpd {
const struct pci_vpd_ops *ops;
struct bin_attribute *attr; /* descriptor for sysfs VPD entry */
struct mutex lock;
unsigned int len;
u16 flag;
u8 cap;
u8 busy:1;
u8 valid:1;
};
int pci_vpd_init(struct pci_dev *dev);
void pci_vpd_release(struct pci_dev *dev);
/* PCI /proc functions */
#ifdef CONFIG_PROC_FS
int pci_proc_attach_device(struct pci_dev *dev);
int pci_proc_detach_device(struct pci_dev *dev);
int pci_proc_detach_bus(struct pci_bus *bus);
#else
static inline int pci_proc_attach_device(struct pci_dev *dev) { return 0; }
static inline int pci_proc_detach_device(struct pci_dev *dev) { return 0; }
static inline int pci_proc_detach_bus(struct pci_bus *bus) { return 0; }
#endif
/* Functions for PCI Hotplug drivers to use */
int pci_hp_add_bridge(struct pci_dev *dev);
#ifdef HAVE_PCI_LEGACY
void pci_create_legacy_files(struct pci_bus *bus);
void pci_remove_legacy_files(struct pci_bus *bus);
#else
static inline void pci_create_legacy_files(struct pci_bus *bus) { return; }
static inline void pci_remove_legacy_files(struct pci_bus *bus) { return; }
#endif
/* Lock for read/write access to pci device and bus lists */
extern struct rw_semaphore pci_bus_sem;
extern raw_spinlock_t pci_lock;
extern unsigned int pci_pm_d3_delay;
#ifdef CONFIG_PCI_MSI
void pci_no_msi(void);
#else
static inline void pci_no_msi(void) { }
#endif
static inline void pci_msi_set_enable(struct pci_dev *dev, int enable)
{
u16 control;
pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
control &= ~PCI_MSI_FLAGS_ENABLE;
if (enable)
control |= PCI_MSI_FLAGS_ENABLE;
pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
}
static inline void pci_msix_clear_and_set_ctrl(struct pci_dev *dev, u16 clear, u16 set)
{
u16 ctrl;
pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &ctrl);
ctrl &= ~clear;
ctrl |= set;
pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, ctrl);
}
void pci_realloc_get_opt(char *);
static inline int pci_no_d1d2(struct pci_dev *dev)
{
unsigned int parent_dstates = 0;
if (dev->bus->self)
parent_dstates = dev->bus->self->no_d1d2;
return (dev->no_d1d2 || parent_dstates);
}
extern const struct attribute_group *pci_dev_groups[];
extern const struct attribute_group *pcibus_groups[];
extern struct device_type pci_dev_type;
extern const struct attribute_group *pci_bus_groups[];
/**
* pci_match_one_device - Tell if a PCI device structure has a matching
* PCI device id structure
* @id: single PCI device id structure to match
* @dev: the PCI device structure to match against
*
* Returns the matching pci_device_id structure or %NULL if there is no match.
*/
static inline const struct pci_device_id *
pci_match_one_device(const struct pci_device_id *id, const struct pci_dev *dev)
{
if ((id->vendor == PCI_ANY_ID || id->vendor == dev->vendor) &&
(id->device == PCI_ANY_ID || id->device == dev->device) &&
(id->subvendor == PCI_ANY_ID || id->subvendor == dev->subsystem_vendor) &&
(id->subdevice == PCI_ANY_ID || id->subdevice == dev->subsystem_device) &&
!((id->class ^ dev->class) & id->class_mask))
return id;
return NULL;
}
PCI: introduce pci_slot Currently, /sys/bus/pci/slots/ only exposes hotplug attributes when a hotplug driver is loaded, but PCI slots have attributes such as address, speed, width, etc. that are not related to hotplug at all. Introduce pci_slot as the primary data structure and kobject model. Hotplug attributes described in hotplug_slot become a secondary structure associated with the pci_slot. This patch only creates the infrastructure that allows the separation of PCI slot attributes and hotplug attributes. In this patch, the PCI hotplug core remains the only user of this infrastructure, and thus, /sys/bus/pci/slots/ will still only become populated when a hotplug driver is loaded. A later patch in this series will add a second user of this new infrastructure and demonstrate splitting the task of exposing pci_slot attributes from hotplug_slot attributes. - Make pci_slot the primary sysfs entity. hotplug_slot becomes a subsidiary structure. o pci_create_slot() creates and registers a slot with the PCI core o pci_slot_add_hotplug() gives it hotplug capability - Change the prototype of pci_hp_register() to take the bus and slot number (on parent bus) as parameters. - Remove all the ->get_address methods since this functionality is now handled by pci_slot directly. [achiang@hp.com: rpaphp-correctly-pci_hp_register-for-empty-pci-slots] Tested-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: make headers_check happy] [akpm@linux-foundation.org: nuther build fix] [akpm@linux-foundation.org: fix typo in #include] Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Matthew Wilcox <matthew@wil.cx> Cc: Greg KH <greg@kroah.com> Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Cc: Len Brown <lenb@kernel.org> Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-06-11 05:28:50 +08:00
/* PCI slot sysfs helper code */
#define to_pci_slot(s) container_of(s, struct pci_slot, kobj)
extern struct kset *pci_slots_kset;
struct pci_slot_attribute {
struct attribute attr;
ssize_t (*show)(struct pci_slot *, char *);
ssize_t (*store)(struct pci_slot *, const char *, size_t);
};
#define to_pci_slot_attr(s) container_of(s, struct pci_slot_attribute, attr)
enum pci_bar_type {
pci_bar_unknown, /* Standard PCI BAR probe */
pci_bar_io, /* An io port BAR */
pci_bar_mem32, /* A 32-bit memory BAR */
pci_bar_mem64, /* A 64-bit memory BAR */
};
bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int crs_timeout);
int pci_setup_device(struct pci_dev *dev);
int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
struct resource *res, unsigned int reg);
void pci_configure_ari(struct pci_dev *dev);
void __pci_bus_size_bridges(struct pci_bus *bus,
PCI / ACPI: Use boot-time resource allocation rules during hotplug On x86 platforms, the kernel respects PCI resource assignments from the BIOS and only reassigns resources for unassigned BARs at boot time. However, with the ACPI-based hotplug (acpiphp), it ignores the BIOS' PCI resource assignments completely and reassigns all resources by itself. This causes differences in PCI resource allocation between boot time and runtime hotplug to occur, which is generally undesirable and sometimes actively breaks things. Namely, if there are enough resources, reassigning all PCI resources during runtime hotplug should work, but it may fail if the resources are constrained. This may happen, for instance, when some PCI devices with huge MMIO BARs are involved in the runtime hotplug operations, because the current PCI MMIO alignment algorithm may waste huge chunks of MMIO address space in those cases. On the Alexander's Sony VAIO VPCZ23A4R the BIOS allocates limited MMIO resources for the dock station which contains a device (graphics adapter) with a 256MB MMIO BAR. An attempt to reassign that during runtime hotplug causes the dock station MMIO window to be exhausted and acpiphp fails to allocate resources for the majority of devices on the dock station as a result. To prevent that from happening, modify acpiphp to follow the boot time resources allocation behavior so that the BIOS' resource assignments are respected during runtime hotplug too. [rjw: Changelog] References: https://bugzilla.kernel.org/show_bug.cgi?id=56531 Reported-and-tested-by: Alexander E. Patrakov <patrakov@gmail.com> Tested-by: Illya Klymov <xanf@xanf.me> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: 3.9+ <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-23 07:01:35 +08:00
struct list_head *realloc_head);
void __pci_bus_assign_resources(const struct pci_bus *bus,
struct list_head *realloc_head,
struct list_head *fail_head);
bool pci_bus_clip_resource(struct pci_dev *dev, int idx);
void pci_reassigndev_resource_alignment(struct pci_dev *dev);
void pci_disable_bridge_window(struct pci_dev *dev);
2009-03-16 16:13:39 +08:00
/* Single Root I/O Virtualization */
struct pci_sriov {
int pos; /* capability position */
int nres; /* number of resources */
u32 cap; /* SR-IOV Capabilities */
u16 ctrl; /* SR-IOV Control */
u16 total_VFs; /* total VFs associated with the PF */
u16 initial_VFs; /* initial VFs associated with the PF */
u16 num_VFs; /* number of VFs available */
u16 offset; /* first VF Routing ID offset */
u16 stride; /* following VF stride */
u32 pgsz; /* page size for BAR alignment */
u8 link; /* Function Dependency Link */
u8 max_VF_buses; /* max buses consumed by VFs */
u16 driver_max_VFs; /* max num VFs driver supports */
struct pci_dev *dev; /* lowest numbered PF */
struct pci_dev *self; /* this PF */
PCI: Lock each enable/disable num_vfs operation in sysfs Enabling/disabling SRIOV via sysfs by echo-ing multiple values simultaneously: # echo 63 > /sys/class/net/ethX/device/sriov_numvfs& # echo 63 > /sys/class/net/ethX/device/sriov_numvfs # sleep 5 # echo 0 > /sys/class/net/ethX/device/sriov_numvfs& # echo 0 > /sys/class/net/ethX/device/sriov_numvfs results in the following bug: kernel BUG at drivers/pci/iov.c:495! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 8050 Comm: bash Tainted: G W 4.9.0-rc7-net-next #2092 RIP: 0010:[<ffffffff813b1647>] [<ffffffff813b1647>] pci_iov_release+0x57/0x60 Call Trace: [<ffffffff81391726>] pci_release_dev+0x26/0x70 [<ffffffff8155be6e>] device_release+0x3e/0xb0 [<ffffffff81365ee7>] kobject_cleanup+0x67/0x180 [<ffffffff81365d9d>] kobject_put+0x2d/0x60 [<ffffffff8155bc27>] put_device+0x17/0x20 [<ffffffff8139c08a>] pci_dev_put+0x1a/0x20 [<ffffffff8139cb6b>] pci_get_dev_by_id+0x5b/0x90 [<ffffffff8139cca5>] pci_get_subsys+0x35/0x40 [<ffffffff8139ccc8>] pci_get_device+0x18/0x20 [<ffffffff8139ccfb>] pci_get_domain_bus_and_slot+0x2b/0x60 [<ffffffff813b09e7>] pci_iov_remove_virtfn+0x57/0x180 [<ffffffff813b0b95>] pci_disable_sriov+0x65/0x140 [<ffffffffa00a1af7>] ixgbe_disable_sriov+0xc7/0x1d0 [ixgbe] [<ffffffffa00a1e9d>] ixgbe_pci_sriov_configure+0x3d/0x170 [ixgbe] [<ffffffff8139d28c>] sriov_numvfs_store+0xdc/0x130 ... RIP [<ffffffff813b1647>] pci_iov_release+0x57/0x60 Use the existing mutex lock to protect each enable/disable operation. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> CC: Alexander Duyck <alexander.h.duyck@intel.com>
2017-01-07 05:59:08 +08:00
struct mutex lock; /* lock for setting sriov_numvfs in sysfs */
resource_size_t barsz[PCI_SRIOV_NUM_BARS]; /* VF BAR size */
bool drivers_autoprobe; /* auto probing of VFs by driver */
};
/* pci_dev priv_flags */
#define PCI_DEV_DISCONNECTED 0
static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
{
set_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags);
return 0;
}
static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
{
return test_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags);
}
#ifdef CONFIG_PCI_ATS
void pci_restore_ats_state(struct pci_dev *dev);
#else
static inline void pci_restore_ats_state(struct pci_dev *dev)
{
}
#endif /* CONFIG_PCI_ATS */
#ifdef CONFIG_PCI_IOV
int pci_iov_init(struct pci_dev *dev);
void pci_iov_release(struct pci_dev *dev);
void pci_iov_update_resource(struct pci_dev *dev, int resno);
resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
void pci_restore_iov_state(struct pci_dev *dev);
int pci_iov_bus_range(struct pci_bus *bus);
#else
static inline int pci_iov_init(struct pci_dev *dev)
{
return -ENODEV;
}
static inline void pci_iov_release(struct pci_dev *dev)
{
}
static inline void pci_restore_iov_state(struct pci_dev *dev)
{
}
static inline int pci_iov_bus_range(struct pci_bus *bus)
{
return 0;
}
#endif /* CONFIG_PCI_IOV */
unsigned long pci_cardbus_resource_alignment(struct resource *);
static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
struct resource *res)
PCI SR-IOV: correct broken resource alignment calculations An SR-IOV capable device includes an SR-IOV PCIe capability which describes the Virtual Function (VF) BAR requirements. A typical SR-IOV device can support multiple VFs whose BARs must be in a contiguous region, effectively an array of VF BARs. The BAR reports the size requirement for a single VF. We calculate the full range needed by simply multiplying the VF BAR size with the number of possible VFs and create a resource spanning the full range. This all seems sane enough except it artificially inflates the alignment requirement for the VF BAR. The VF BAR need only be aligned to the size of a single BAR not the contiguous range of VF BARs. This can cause us to fail to allocate resources for the BAR despite the fact that we actually have enough space. This patch adds a thin PCI specific layer over the generic resource_alignment() function which is aware of the special nature of VF BARs and does sorting and allocation based on the smaller alignment requirement. I recognize that while resource_alignment is generic, it's basically a PCI helper. An alternative to this patch is to add PCI VF BAR specific information to struct resource. I opted for the extra layer rather than adding such PCI specific information to struct resource. This does have the slight downside that we don't cache the BAR size and re-read for each alignment query (happens a small handful of times during boot for each VF BAR). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Yu Zhao <yu.zhao@intel.com> Cc: stable@kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-08-29 04:00:06 +08:00
{
#ifdef CONFIG_PCI_IOV
int resno = res - dev->resource;
if (resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCE_END)
return pci_sriov_resource_alignment(dev, resno);
#endif
if (dev->class >> 8 == PCI_CLASS_BRIDGE_CARDBUS)
return pci_cardbus_resource_alignment(res);
PCI SR-IOV: correct broken resource alignment calculations An SR-IOV capable device includes an SR-IOV PCIe capability which describes the Virtual Function (VF) BAR requirements. A typical SR-IOV device can support multiple VFs whose BARs must be in a contiguous region, effectively an array of VF BARs. The BAR reports the size requirement for a single VF. We calculate the full range needed by simply multiplying the VF BAR size with the number of possible VFs and create a resource spanning the full range. This all seems sane enough except it artificially inflates the alignment requirement for the VF BAR. The VF BAR need only be aligned to the size of a single BAR not the contiguous range of VF BARs. This can cause us to fail to allocate resources for the BAR despite the fact that we actually have enough space. This patch adds a thin PCI specific layer over the generic resource_alignment() function which is aware of the special nature of VF BARs and does sorting and allocation based on the smaller alignment requirement. I recognize that while resource_alignment is generic, it's basically a PCI helper. An alternative to this patch is to add PCI VF BAR specific information to struct resource. I opted for the extra layer rather than adding such PCI specific information to struct resource. This does have the slight downside that we don't cache the BAR size and re-read for each alignment query (happens a small handful of times during boot for each VF BAR). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Yu Zhao <yu.zhao@intel.com> Cc: stable@kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-08-29 04:00:06 +08:00
return resource_alignment(res);
}
void pci_enable_acs(struct pci_dev *dev);
#ifdef CONFIG_PCIE_PTM
void pci_ptm_init(struct pci_dev *dev);
#else
static inline void pci_ptm_init(struct pci_dev *dev) { }
#endif
struct pci_dev_reset_methods {
u16 vendor;
u16 device;
int (*reset)(struct pci_dev *dev, int probe);
};
#ifdef CONFIG_PCI_QUIRKS
int pci_dev_specific_reset(struct pci_dev *dev, int probe);
#else
static inline int pci_dev_specific_reset(struct pci_dev *dev, int probe)
{
return -ENOTTY;
}
#endif
#if defined(CONFIG_PCI_QUIRKS) && defined(CONFIG_ARM64)
int acpi_get_rc_resources(struct device *dev, const char *hid, u16 segment,
struct resource *res);
#endif
#endif /* DRIVERS_PCI_H */