This is loosely based on a patch by Jesse Barnes to check the user-space
PCI mappings though the sysfs interfaces. Quoting Jesse's original
explanation:
It's fairly common for applications to map PCI resources through sysfs.
However, with the current implementation, it's possible for an application
to map far more than the range corresponding to the resourceN file it
opened. This patch plugs that hole by checking the range at mmap time,
similar to what is done on platforms like sparc64 in their lower level
PCI remapping routines.
It was initially put together to help debug the e1000e NVRAM corruption
problem, since we initially thought an X driver might be walking past the
end of one of its mappings and clobbering the NVRAM. It now looks like
that's not the case, but doing the check is still important for obvious
reasons.
and this version of the patch differs in that it uses a helper function
to clarify the code, and does all the checks in pages (instead of bytes)
in order to avoid overflows when doing "<< PAGE_SHIFT" etc.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
VPD end tag will hang the device. This problem was initially
observed when a vpd entry was created in sysfs
('/sys/bus/pci/devices/<id>/vpd'). A read to this sysfs entry
will dump 32k of data. Reading a full 32k will cause an access
beyond the VPD end tag causing the device to hang. Once the device
is hung, the bnx2 driver will not be able to reset the device.
We believe that it is legal to read beyond the end tag and
therefore the solution is to limit the read/write length.
A majority of this patch is from Matthew Wilcox who gave code for
reworking the PCI vpd size information. A PCI quirk added for the
Broadcom NIC's to limit the read/write's.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some PCI devices will lock up if we attempt to read from VPD addresses
beyond some device-dependent limit. Until we can identify these
devices and adjust the file size accordingly, only let root read VPD
through sysfs to prevent a DoS by normal users.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For the ranges with IORESOURCE_PREFETCH, export a new resource_wc interface in
pci /sysfs along with resource (which is uncached).
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Vital Product Data (VPD) may be exposed by PCI devices in several
ways. It is generally unsafe to read this information through the
existing interfaces to user-land because of stateful interfaces.
This adds:
- abstract operations for VPD access (struct pci_vpd_ops)
- VPD state information in struct pci_dev (struct pci_vpd)
- an implementation of the VPD access method specified in PCI 2.2
(in access.c)
- a 'vpd' binary file in sysfs directories for PCI devices with VPD
operations defined
It adds a probe for PCI 2.2 VPD in pci_scan_device() and release of
VPD state in pci_release_dev().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
PCI Express ASPM defines a protocol for PCI Express components in the D0
state to reduce Link power by placing their Links into a low power state
and instructing the other end of the Link to do likewise. This
capability allows hardware-autonomous, dynamic Link power reduction
beyond what is achievable by software-only controlled power management.
However, The device should be configured by software appropriately.
Enabling ASPM will save power, but will introduce device latency.
This patch adds ASPM support in Linux. It introduces a global policy for
ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control
it. The interface can be used as a boot option too. Currently we have
below setting:
-default, BIOS default setting
-powersave, highest power saving mode, enable all available ASPM
state and clock power management
-performance, highest performance, disable ASPM and clock power
management
By default, the 'default' policy is used currently.
In my test, power difference between powersave mode and performance mode
is about 1.3w in a system with 3 PCIE links.
Note: some devices might not work well with aspm, either because chipset
issue or device issue. The patch provide API (pci_disable_link_state),
driver can disable ASPM for specific device.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Cleaned up references to cpumask_scnprintf() and added new
cpulist_scnprintf() interfaces where appropriate.
* Fix some small bugs (or code efficiency improvments) for various uses
of cpumask_scnprintf.
* Clean up some checkpatch errors.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 6c723d5bd8.
It caused build errors on non-x86 platforms, config file confusion, and
even some boot errors on some x86-64 boxes. All around, not quite ready
for prime-time :(
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This moves the pci_bus class device to be a real struct device and at
the same time, place it in the device tree in the correct location.
Note, the old "bridge" symlink is now gone, but this was a non-standard
link and no userspace program used it. If you need to determine the
device that the bus is on, follow the standard device symlink, or walk
up the device tree.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
PCI Express ASPM defines a protocol for PCI Express components in the D0
state to reduce Link power by placing their Links into a low power state
and instructing the other end of the Link to do likewise. This
capability allows hardware-autonomous, dynamic Link power reduction
beyond what is achievable by software-only controlled power management.
However, The device should be configured by software appropriately.
Enabling ASPM will save power, but will introduce device latency.
This patch adds ASPM support in Linux. It introduces a global policy for
ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control
it. The interface can be used as a boot option too. Currently we have
below setting:
-default, BIOS default setting
-powersave, highest power saving mode, enable all available ASPM
state
and clock power management
-performance, highest performance, disable ASPM and clock power
management
By default, the 'default' policy is used currently.
In my test, power difference between powersave mode and performance mode
is about 1.3w in a system with 3 PCIE links.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There should be a pci_dev_put when breaking out of a loop that iterates
over calls to pci_get_device and similar functions.
This was fixed using the following semantic patch.
// <smpl>
@@
identifier d;
type T;
expression e;
iterator for_each_pci_dev;
@@
T *d;
...
for_each_pci_dev(d)
{... when != pci_dev_put(d)
when != e = d
(
return d;
|
+ pci_dev_put(d);
? return ...;
)
...}
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I forgot to remove capability.h from mm.h while removing sched.h! This
patch remedies that, because the only inline function which was using
CAP_something was made out of line.
Cross-compile tested without regressions on:
all powerpc defconfigs
all mips defconfigs
all m68k defconfigs
all arm defconfigs
all ia64 defconfigs
alpha alpha-allnoconfig alpha-defconfig alpha-up
arm
i386 i386-allnoconfig i386-defconfig i386-up
ia64 ia64-allnoconfig ia64-defconfig ia64-up
m68k
mips
parisc parisc-allnoconfig parisc-defconfig parisc-up
powerpc powerpc-up
s390 s390-allnoconfig s390-defconfig s390-up
sparc sparc-allnoconfig sparc-defconfig sparc-up
sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up
um-x86_64
x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up
as well as my two usual configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
PCI: Only build PCI syscalls on architectures that want them
PCI: limit pci_get_bus_and_slot to domain 0
PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
PCI: hotplug: pciehp: wait for 1 second after power off slot
PCI: pci_set_power_state(): check for PM capabilities earlier
PCI: cpci_hotplug: Convert to use the kthread API
PCI: add pci_try_set_mwi
PCI: pcie: remove SPIN_LOCK_UNLOCKED
PCI: ROUND_UP macro cleanup in drivers/pci
PCI: remove pci_dac_dma_... APIs
PCI: pci-x-pci-express-read-control-interfaces cleanups
PCI: Fix typo in include/linux/pci.h
PCI: pci_ids, remove double or more empty lines
PCI: pci_ids, add atheros and 3com_2 vendors
PCI: pci_ids, reorder some entries
PCI: i386: traps, change VENDOR to DEVICE
PCI: ATM: lanai, change VENDOR to DEVICE
PCI: Change all drivers to use pci_device->revision
...
Well, first of all, I don't want to change so many files either.
What I do:
Adding a new parameter "struct bin_attribute *" in the
.read/.write methods for the sysfs binary attributes.
In fact, only the four lines change in fs/sysfs/bin.c and
include/linux/sysfs.h do the real work.
But I have to update all the files that use binary attributes
to make them compatible with the new .read and .write methods.
I'm not sure if I missed any. :(
Why I do this:
For a sysfs attribute, we can get a pointer pointing to the
struct attribute in the .show/.store method,
while we can't do this for the binary attributes.
I don't know why this is different, but this does make it not
so handy to use the binary attributes as the regular ones.
So I think this patch is reasonable. :)
Who benefits from it:
The patch that exposes ACPI tables in sysfs
requires such an improvement.
All the table binary attributes share the same .read method.
Parameter "struct bin_attribute *" is used to get
the table signature and instance number which are used to
distinguish different ACPI table binary attributes.
Without this parameter, we need to offer different .read methods
for different ACPI table binary attributes.
This is impossible as there are various ACPI tables on different
platforms, and we don't know what they are until they are loaded.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
sysfs is now completely out of driver/module lifetime game. After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners. Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.
This patch kills now unnecessary attribute->owner. Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.
For more info regarding lifetime rule cleanup, please read the
following message.
http://article.gmane.org/gmane.linux.kernel/510293
(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently pcibios_add_platform_entries() returns void, but could fail,
so instead have it return an int and propagate errors up to
pci_create_sysfs_dev_files().
Fixes:
arch/powerpc/kernel/pci_64.c: In function 'pcibios_add_platform_entries':
arch/powerpc/kernel/pci_64.c:878: warning: ignoring return value of
'device_create_file', declared with attribute warn_unused_result
arch/powerpc/kernel/pci_32.c: In function 'pcibios_add_platform_entries':
arch/powerpc/kernel/pci_32.c:1043: warning: ignoring return value of
'device_create_file', declared with attribute warn_unused_result
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I'm not sure if this is going to fly, weak symbols work on the compilers I'm
using, but whether they work for all of the affected architectures I can't say.
I've cc'ed as many arch maintainers/lists as I could find.
But assuming they do, we can use a weak empty definition of
pcibios_add_platform_entries() to avoid having an empty definition on every
arch.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
pci_create_sysfs_dev_files() should call pci_remove_resource_files() in
its error path, to match the call it makes to pci_create_resource_files().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
At one time, if a BIOS ROM shadow was detected for the boot video
device (stored at offset 0xc0000), we'd set a special resource flag,
IORESOURCE_ROM_SHADOW, so that the sysfs ROM file code could handle
it properly. That broke along the way somewhere though, so current
kernels will be missing 'rom' files in sysfs if the video device
doesn't have an explicit ROM BAR.
This patch fixes the regression by moving the video fixup quirk to a
little later in the boot cycle (to avoid having its work undone by
PCI resource allocation) and checking in the PCI sysfs code whether
a rom file should be created due to a shadow resource, which is also
moved to a little later in the boot cycle so it will occur after the
video fixup. Tested and works on my i386 test box.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Export the numa-node attribute of PCI devices in sysfs so that
user applications may choose where to be placed accordingly.
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Changes the pci_{enable,disable}_device() functions to work in a
nested basis, so that eg, three calls to enable_device() require three
calls to disable_device().
The reason for this is to simplify PCI drivers for
multi-interface/capability devices. These are devices that cram more
than one interface in a single function. A relevant example of that is
the Wireless [USB] Host Controller Interface (similar to EHCI) [see
http://www.intel.com/technology/comms/wusb/whci.htm].
In these kind of devices, multiple interfaces are accessed through a
single bar and IRQ line. For that, the drivers map only the smallest
area of the bar to access their register banks and use shared IRQ
handlers.
However, because the order at which those drivers load cannot be known
ahead of time, the sequence in which the calls to pci_enable_device()
and pci_disable_device() cannot be predicted. Thus:
1. driverA starts pci_enable_device()
2. driverB starts pci_enable_device()
3. driverA shutdown pci_disable_device()
4. driverB shutdown pci_disable_device()
between steps 3 and 4, driver B would loose access to it's device,
even if it didn't intend to.
By using this modification, the device won't be disabled until all the
callers to enable() have called disable().
This is implemented by replacing 'struct pci_dev->is_enabled' from a
bitfield to an atomic use count. Each caller to enable increments it,
each caller to disable decrements it. When the count increments from 0
to 1, __pci_enable_device() is called to actually enable the
device. When it drops to zero, pci_disable_device() actually does the
disabling.
We keep the backend __pci_enable_device() for pci_default_resume() to
use and also change the sysfs method implementation, so that userspace
enabling/disabling the device doesn't disable it one time too much.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The PCI sysfs attributes are created after the initial PCI bus scan. With
the addition of more return value checking and assertions in the device and
sysfs layers we now can get dumps like this on sparc64:
[ 20.135032] Call Trace:
[ 20.135042] [0000000000537f88] pci_remove_bus_device+0x30/0xc0
[ 20.135076] [000000000078f890] pci_fill_in_pbm_cookies+0x98/0x440
[ 20.135109] [000000000042e828] sabre_scan_bus+0x230/0x400
[ 20.135139] [000000000078c710] pcibios_init+0x58/0xa0
[ 20.135159] [0000000000416f14] init+0x9c/0x2e0
[ 20.135190] [0000000000417a50] kernel_thread+0x38/0x60
[ 20.135211] [0000000000417170] rest_init+0x18/0x40
[ 20.135514] PCI0(PBMB): Bus running at 33MHz
It's triggering because removal of the "config" PCI sysfs file for the
device fails.
On sparc64, after probing the device, we'll delete the PCI device via
pci_remove_bus_device() if we cannot find the firmware device tree node
corresponding to it.
This is fine, but at this point the sysfs files for the PCI device won't be
setup yet.
So we should not try to do anything in pci_remove_sysfs_dev_files() if
pci_sysfs_init() has not run yet.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Export the PCI_BUS_FLAGS_NO_MSI flag of a PCI bus in the sysfs files
of its parent device and make it writable. Could be used to:
* disable MSI on a device which has not been blacklisted yet
* allow MSI when some setpci hacks enable MSI support (for instance
on the ServerWorks HT2000 chipset where the MSI HT cap is disabled
by default).
Architecture where some bus have no parent chipset cannot use this
strategy to change MSI support.
If the chipset does not have a subordinate bus, its 'bus_msi' file
is empty.
Also document and warn about the possible danger of changing the flag.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Based on a patch series originally from Vivek Goyal <vgoyal@in.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
From: Doug Thompson <norsk5@yahoo.com>
This patch adds the 'broken_parity_status' sysfs attribute file to a PCI device.
Reading this attribute a userland program can determine if PCI device provides false
positives (value of 1) in its generation of PCI Parity status, or not (value of 0).
As PCI devices are found to be 'bad' in this regard, userland programs can also set
the appropriate value (root access only) of a faulty device. This per device
information will be used in the EDAC PCI Parity scanner code in a future patch once
this interface becomes available.
Signed-off-by: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds an "enable" sysfs attribute to each PCI device. When read it
shows the "enabled-ness" of the device, but you can write a "0" into it to
disable a device, and a "1" to enable it.
This later is needed for X and other cases where userspace wants to enable
the BARs on a device (typical example: to run the video bios on a secundary
head). Right now X does all this "by hand" via bitbanging, that's just evil.
This allows X to no longer do that but to just let the kernel do this.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
CC: Peter Jones <pjones@redhat.com>
Acked-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
this patch converts drivers/pci to kzalloc usage.
Compile tested with allyes config.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some PCI adapters (eg. ipr scsi adapters) have an exposure today in that they
issue BIST to the adapter to reset the card. If, during the time it takes to
complete BIST, userspace attempts to access PCI config space, the host bus
bridge will master abort the access since the ipr adapter does not respond on
the PCI bus for a brief period of time when running BIST. On PPC64 hardware,
this master abort results in the host PCI bridge isolating that PCI device
from the rest of the system, making the device unusable until Linux is
rebooted. This patch is an attempt to close that exposure by introducing some
blocking code in the PCI code. When blocked, writes will be humored and reads
will return the cached value. Ben Herrenschmidt has also mentioned that he
plans to use this in PPC power management.
Signed-off-by: Brian King <brking@us.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
drivers/pci/access.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci-sysfs.c | 20 +++++-----
drivers/pci/pci.h | 7 +++
drivers/pci/proc.c | 28 +++++++--------
drivers/pci/syscall.c | 14 +++----
include/linux/pci.h | 7 +++
6 files changed, 134 insertions(+), 31 deletions(-)
This patch converts kcalloc(1, ...) calls to use the new kzalloc() function.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
pcibus_to_cpumask expands into more than just an initialiser so gcc
moans about code before variable declarations.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is an updated version of Ben's fix-pci-mmap-on-ppc-and-ppc64.patch
which is in 2.6.12-rc4-mm1.
It fixes the patch to work on PPC iSeries, removes some debug printks
at Ben's request, and incorporates your
fix-pci-mmap-on-ppc-and-ppc64-fix.patch also.
Originally from Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch was discussed at length on linux-pci and so far, the last
iteration of it didn't raise any comment. It's effect is a nop on
architecture that don't define the new pci_resource_to_user() callback
anyway. It allows architecture like ppc who put weird things inside of
PCI resource structures to convert to some different value for user
visible ones. It also fixes mmap'ing of IO space on those archs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds the possibility to do word-aligned 16-bit atomic PCI
configuration space accesses via the sysfs PCI interface. As a result, problems
with Emulex LFPC on IBM PowerPC64 are fixed.
Patch is present in SLES 9 SP1.
Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!