Commit Graph

2655 Commits

Author SHA1 Message Date
Yinghai Lu ac205b7bb7 PCI: make sriov work with hotplug remove
When hot removing a pci express module that has a pcie switch and supports
SRIOV, we got:

[ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1
[ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received
[ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3)
[ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9
[ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press.
[ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10
[ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain🚌device=0000:b0:00
[ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9
[ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain🚌dev = 0000:b0:00
[ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6
[ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14
[ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15
[ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16
[ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled
[ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 5926.127189] IP: [<ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83
[ 5926.133377] PGD 0
[ 5926.135402] Oops: 0000 [#1] SMP
[ 5926.138659] CPU 2
[ 5926.140499] Modules linked in:
...
[ 5926.143754]
[ 5926.275823] Call Trace:
[ 5926.278267]  [<ffffffff81353a38>] pci_stop_bus_device+0x30/0x83
[ 5926.284180]  [<ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba
[ 5926.290264]  [<ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b
[ 5926.296866]  [<ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188
[ 5926.303123]  [<ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188
[ 5926.309206]  [<ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0
...

 +-[0000:80]-+-00.0-[81-8f]--
 |           +-01.0-[90-9f]--
 |           +-02.0-[a0-af]--
 |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device
 |           |                               |            +-00.1 Device
 |           |                               |            +-00.2 Device
 |           |                               |            \-00.3 Device
 |           |                               \-03.0-[b3]--+-00.0 Device
 |           |                                            +-00.1 Device
 |           |                                            +-00.2 Device
 |           |                                            \-00.3 Device

root complex: 80:02.2
pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0.
end devices  are b2:00.0 and b3.00.0.
VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3

Root cause: when doing pci_stop_bus_device() with phys fn, it will stop
virt fn and remove the fn, so
	list_for_each_safe(l, n, &bus->devices)
will have problem to refer freed n that is pointed to vf entry.

Solution is just replacing list_for_each_safe() with
list_for_each_prev_safe().  This will make sure we can get valid n pointer
to PF instead of the freed VF pointer (because newly added devices are
inserted to the bus->devices list tail).

During reviewing the patch, Bjorn said:
|   The PCI hot-remove path calls pci_stop_bus_devices() via
|   pci_remove_bus_device().
|
|   pci_stop_bus_devices() traverses the bus->devices list (point A below),
|   stopping each device in turn, which calls the driver remove() method.  When
|   the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which
|   also uses pci_remove_bus_device() to remove the VF devices from the
|   bus->devices list (point B).
|
|       pci_remove_bus_device
|         pci_stop_bus_device
|           pci_stop_bus_devices(subordinate)
|             list_for_each(bus->devices)             <-- A
|               pci_stop_bus_device(PF)
|                 ...
|                   driver->remove
|                     pci_disable_sriov
|                       ...
|                         pci_remove_bus_device(VF)
|                             <remove from bus_list>  <-- B
|
|   At B, we're changing the same list we're iterating through at A, so when
|   the driver remove() method returns, the pci_stop_bus_devices() iterator has
|   a pointer to a list entry that has already been freed.

Discussion thread can be found : https://lkml.org/lkml/2011/10/15/141
				 https://lkml.org/lkml/2012/1/23/360

-v5: According to Linus to make remove more robust, Change to
     list_for_each_prev_safe instead. That is more reasonable, because
     those devices are added to tail of the list before.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:59 -08:00
Yinghai Lu 67cc7e26a5 PCI: remove add_to_failed_list()
Only one user; just use add_to_list instead.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:58 -08:00
Yinghai Lu b592443d90 PCI: add debug print out for add_size
For use in debugging resource reallocation.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:58 -08:00
Yinghai Lu bffc56d411 PCI: make free_list() into a function
After merging struct pci_dev_resource_x and pci_dev_resource,
We can use a function instead of macro now.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:57 -08:00
Yinghai Lu b9b0bba96c PCI: Rename dev_res_x to add_res or fail_res
Linus says don't use dev_res_x because it doesn't communicate anything
about usage.  Rename them to add_res or fail_res etc according to
context.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:57 -08:00
Yinghai Lu 764242a0ae PCI: Merge pci_dev_resource_x and pci_dev_resource
pci_dev_resource_x is a superset of pci_dev_resource and they're just
temp structs used during resource reallocation.

pci_dev_resource usage is quite limted.

So just use pci_dev_resource_x, and rename it as new pci_dev_resource.

-v2: According to Linus, Separate free_list change to another patch

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:56 -08:00
Yinghai Lu bdc4abecae PCI: Replace resource_list with generic list
So we can use helper functions for generic list.  This makes the
resource re-allocation code much more readable.

-v2: Use list_add_tail instead of adding list_insert_before, Pointed out
     by Linus.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:55 -08:00
Yinghai Lu 2934a0de09 PCI: Move struct resource_list to setup-bus.c
No user outside of setup-bus.c now.  Later patches will convert
resource_list to a regular list.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:55 -08:00
Yinghai Lu 78c3b329b9 PCI: Move pdev_sort_resources() to setup-bus.c
This allows us to move the definition of struct resource_list to
setup_bus.c and later convert resource_list to a regular list.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:54 -08:00
Yinghai Lu 19aa7ee432 PCI: make re-allocation try harder by reassigning ranges higher in the heirarchy
On a system with devices that support SRIOV connected to a pcie switch
to pcie root port:

 +-[0000:80]-+-00.0-[81-8f]--
 |           +-01.0-[90-9f]--
 |           +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device 207a
 |           |                               \-03.0-[a3]--+-00.0 Oracle Corporation Device 207a
 |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Oracle Corporation Device 207a
 |           |                               \-03.0-[b3]--+-00.0 Oracle Corporation Device 207a

When the BIOS does not assign resources for SRIOV BARs, kernel pci
reallocation only goes up one bridge and then gives up, failing to to
get resources for all sSRIOV BARs, even though the range is large enough
in the peer root bus.

Specifically, only the bridge at the a1:02.0 level has its resources
cleared and reallocated.  The kernel does not go up to clear the bridge
at the 80:02.0 level.

To make it go to upper levels, during retry, we need to treat "good to have"
resources as "must have".

Only on the last try will we treat good to have resources as optional.
At that time, parent bridge resources will already have been released so
we'll have a chance to get everything assigned with must_have plus
good_to_have for all child devices.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:54 -08:00
Yinghai Lu 9b03088f95 PCI: Make pci_rescan_bus handle add_list
This allows us to allocate resources to hotplug bridges during
remove/rescan.

We need to move the function to setup-bus.c so it can use
__pci_bus_size_bridges and __pci_bus_assign_resources directly to take
the add_list resource tracking list.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:53 -08:00
Yinghai Lu 2f320521a0 PCI: Make rescan bus increase bridge resource size if needed
Current rescan will not touch bridge MMIO and IO.

Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge
resources, if child devices need more resources.

Only do that for bridges whose children are all removed already; i.e. don't
release resources that could already be in use by drivers on child devices.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:53 -08:00
Yinghai Lu 8424d7592e PCI: Use add_list in pcie hotplug path.
We need add size for hot plug path when pluging in hotplug chassis
without cards.

-v2: change descriptions. make it applicable after "pci: Check bridge
     resources after resource allocation."

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:52 -08:00
Yinghai Lu 3e6e0d8094 PCI: try to assign required+option size first
We found reassignment can not find a range for one resource, even if the
total available range is large enough.

bridge b1:02.0 will need 2M+3M
bridge b1:03.0 will need 2M+3M

so bridge b0:00.0 will get assigned: 4M : [f8000000-f83fffff]
   later is reassigned to 10M : [f8000000-f9ffffff]

b1:02.0 is assigned to 2M : [f8000000-f81fffff]
b1:03.0 is assigned to 2M : [f8200000-f83fffff]

After that b1:03.0 get chance to be reassigned to [f8200000-f86fffff],
but b1:02.0 will not have chance to expand, because b1:03.0 is using in
middle one.

[  187.911401] pci 0000:b1:02.0: bridge window [mem 0x00100000-0x002fffff] to [bus b2-b2] add_size 300000
[  187.920764] pci 0000:b1:03.0: bridge window [mem 0x00100000-0x002fffff] to [bus b3-b3] add_size 300000
[  187.930129] pci 0000:b1:02.0: [mem 0x00100000-0x002fffff] get_res_add_size  add_size 300000
[  187.938500] pci 0000:b1:03.0: [mem 0x00100000-0x002fffff] get_res_add_size  add_size 300000
[  187.946857] pci 0000:b0:00.0: bridge window [mem 0x00100000-0x004fffff] to [bus b1-b3] add_size 600000
[  187.956206] pci 0000:b0:00.0: BAR 14: assigned [mem 0xf8000000-0xf83fffff]
[  187.963102] pci 0000:b0:00.0: BAR 15: assigned [mem 0xf5000000-0xf51fffff pref]
[  187.970434] pci 0000:b0:00.0: BAR 14: reassigned [mem 0xf8000000-0xf89fffff]
[  187.977497] pci 0000:b1:02.0: BAR 14: assigned [mem 0xf8000000-0xf81fffff]
[  187.984383] pci 0000:b1:02.0: BAR 15: assigned [mem 0xf5000000-0xf50fffff pref]
[  187.991695] pci 0000:b1:03.0: BAR 14: assigned [mem 0xf8200000-0xf83fffff]
[  187.998576] pci 0000:b1:03.0: BAR 15: assigned [mem 0xf5100000-0xf51fffff pref]
[  188.005888] pci 0000:b1:03.0: BAR 14: reassigned [mem 0xf8200000-0xf86fffff]
[  188.012939] pci 0000:b1:02.0: BAR 14: can't assign mem (size 0x200000)
[  188.019471] pci 0000:b1:02.0: failed to add 300000 to res=[mem 0xf8000000-0xf81fffff]
[  188.027326] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[  188.034071] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[  188.040795] pci 0000:b2:00.0: BAR 2: assigned [mem 0xf8000000-0xf80fffff 64bit]
[  188.048119] pci 0000:b2:00.0: BAR 2: set to [mem 0xf8000000-0xf80fffff 64bit] (PCI address [0xf8000000-0xf80fffff])
[  188.058550] pci 0000:b2:00.0: BAR 6: assigned [mem 0xf5000000-0xf50fffff pref]
[  188.065802] pci 0000:b2:00.0: BAR 0: assigned [mem 0xf8100000-0xf8103fff 64bit]
[  188.073125] pci 0000:b2:00.0: BAR 0: set to [mem 0xf8100000-0xf8103fff 64bit] (PCI address [0xf8100000-0xf8103fff])
[  188.083596] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[  188.090310] pci 0000:b2:00.0: BAR 9: can't assign mem (size 0x300000)
[  188.096773] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[  188.103479] pci 0000:b2:00.0: BAR 7: assigned [mem 0xf8104000-0xf810ffff 64bit]
[  188.110801] pci 0000:b2:00.0: BAR 7: set to [mem 0xf8104000-0xf810ffff 64bit] (PCI address [0xf8104000-0xf810ffff])
[  188.121256] pci 0000:b1:02.0: PCI bridge to [bus b2-b2]
[  188.126512] pci 0000:b1:02.0:   bridge window [mem 0xf8000000-0xf81fffff]
[  188.133328] pci 0000:b1:02.0:   bridge window [mem 0xf5000000-0xf50fffff pref]
[  188.140608] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[  188.147341] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[  188.154076] pci 0000:b3:00.0: BAR 2: assigned [mem 0xf8200000-0xf82fffff 64bit]
[  188.161417] pci 0000:b3:00.0: BAR 2: set to [mem 0xf8200000-0xf82fffff 64bit] (PCI address [0xf8200000-0xf82fffff])
[  188.171865] pci 0000:b3:00.0: BAR 6: assigned [mem 0xf5100000-0xf51fffff pref]
[  188.179090] pci 0000:b3:00.0: BAR 0: assigned [mem 0xf8300000-0xf8303fff 64bit]
[  188.186431] pci 0000:b3:00.0: BAR 0: set to [mem 0xf8300000-0xf8303fff 64bit] (PCI address [0xf8300000-0xf8303fff])
[  188.196884] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[  188.203591] pci 0000:b3:00.0: BAR 9: assigned [mem 0xf8400000-0xf86fffff 64bit]
[  188.210909] pci 0000:b3:00.0: BAR 9: set to [mem 0xf8400000-0xf86fffff 64bit] (PCI address [0xf8400000-0xf86fffff])
[  188.221379] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[  188.228089] pci 0000:b3:00.0: BAR 7: assigned [mem 0xf8304000-0xf830ffff 64bit]
[  188.235407] pci 0000:b3:00.0: BAR 7: set to [mem 0xf8304000-0xf830ffff 64bit] (PCI address [0xf8304000-0xf830ffff])
[  188.245843] pci 0000:b1:03.0: PCI bridge to [bus b3-b3]
[  188.251107] pci 0000:b1:03.0:   bridge window [mem 0xf8200000-0xf86fffff]
[  188.257922] pci 0000:b1:03.0:   bridge window [mem 0xf5100000-0xf51fffff pref]
[  188.265180] pci 0000:b0:00.0: PCI bridge to [bus b1-b3]
[  188.270443] pci 0000:b0:00.0:   bridge window [mem 0xf8000000-0xf89fffff]
[  188.277250] pci 0000:b0:00.0:   bridge window [mem 0xf5000000-0xf51fffff pref]
[  188.284512] pcieport 0000:80:02.2: PCI bridge to [bus b0-bf]
[  188.290184] pcieport 0000:80:02.2:   bridge window [io  0xa000-0xbfff]
[  188.296735] pcieport 0000:80:02.2:   bridge window [mem 0xf8000000-0xf8ffffff]
[  188.303963] pcieport 0000:80:02.2:   bridge window [mem 0xf5000000-0xf5ffffff 64bit pref]

Thus b2:00.0 BAR 9 does not get assigned...

root cause:
b1:02.0 can not be added more range, because b1:03.0 is just after it;
no space between the required ranges.

Solution:
Try to assign required + optional all together at first, and if that
fails, try again with just the required resources.

-v2: seperate add_to_list change() to another patch according to Jesse.
     seperate get_res_add_size() moving to another patch according to Jesse.
     add !realloc_head->next check if the list is empty to bail early
     according to Jesse.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:52 -08:00
Yinghai Lu 1c372353e9 PCI: Move get_res_add_size() function
Need to call it from __assign_resources_sorted() later and we'd like to
avoid a forward declaraion.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:51 -08:00
Yinghai Lu ef62dfefa9 PCI: Make add_to_list() return status
Will be used for resource_list_x duplication when trying
requested+optional at first.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:51 -08:00
Yinghai Lu a4ac9fea01 PCI : Calculate right add_size
During debug of one SRIOV enabled hotplug device, we found found that
add_size is not passed properly.

The device has devices under two level bridges:

 +-[0000:80]-+-00.0-[81-8f]--
 |           +-01.0-[90-9f]--
 |           +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0  Oracle Corporation Device
 |           |                               \-03.0-[a3]--+-00.0  Oracle Corporation Device

Which means later the parent bridge will not try to add a big enough range:

[  557.455077] pci 0000:a0:00.0: BAR 14: assigned [mem 0xf9000000-0xf93fffff]
[  557.461974] pci 0000:a0:00.0: BAR 15: assigned [mem 0xf6000000-0xf61fffff pref]
[  557.469340] pci 0000:a1:02.0: BAR 14: assigned [mem 0xf9000000-0xf91fffff]
[  557.476231] pci 0000:a1:02.0: BAR 15: assigned [mem 0xf6000000-0xf60fffff pref]
[  557.483582] pci 0000:a1:03.0: BAR 14: assigned [mem 0xf9200000-0xf93fffff]
[  557.490468] pci 0000:a1:03.0: BAR 15: assigned [mem 0xf6100000-0xf61fffff pref]
[  557.497833] pci 0000:a1:03.0: BAR 14: can't assign mem (size 0x200000)
[  557.504378] pci 0000:a1:03.0: failed to add optional resources res=[mem 0xf9200000-0xf93fffff]
[  557.513026] pci 0000:a1:02.0: BAR 14: can't assign mem (size 0x200000)
[  557.519578] pci 0000:a1:02.0: failed to add optional resources res=[mem 0xf9000000-0xf91fffff]

It turns out we did not calculate size1 properly.

static resource_size_t calculate_memsize(resource_size_t size,
                resource_size_t min_size,
                resource_size_t size1,
                resource_size_t old_size,
                resource_size_t align)
{
        if (size < min_size)
                size = min_size;
        if (old_size == 1 )
                old_size = 0;
        if (size < old_size)
                size = old_size;
        size = ALIGN(size + size1, align);
        return size;
}

We should not pass add_size with min_size in calculate_memsize since
that will make add_size not contribute final add_size.

So just pass add_size with size1 to calculate_memsize().

With this change, we should have chance to remove extra addon in
pci_reassign_resource.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:50 -08:00
Masanari Iida 0dea210b17 PCI: Fix typo in setup-res.c
Correct spelling "resouce" to "resource" in
dricers/pci/setup-res.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:50 -08:00
Konrad Rzeszutek Wilk 6fbf9e7a90 PCI: Introduce __pci_reset_function_locked to be used when holding device_lock.
The use case of this is when a driver wants to call FLR when a device
is attached to it using the SysFS "bind" or "unbind" functionality.

The call chain when a user does "bind" looks as so:

 echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind

and ends up calling:
  driver_bind:
    device_lock(dev);  <=== TAKES LOCK
    XXXX_probe:
         .. pci_enable_device()
         ...__pci_reset_function(), which calls
                 pci_dev_reset(dev, 0):
                        if (!0) {
                                device_lock(dev) <==== DEADLOCK

The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:48 -08:00
Julia Lawall 8f0cdddcd3 PCI: drivers/pci/hotplug/ibmphp_ebda.c: add missing iounmap
Add missing iounmap in error handling code, in a case where the function
already preforms iounmap on some other execution path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e;
statement S,S1;
int ret;
@@
e = \(ioremap\|ioremap_nocache\)(...)
... when != iounmap(e)
if (<+...e...+>) S
... when any
    when != iounmap(e)
*if (...)
   { ... when != iounmap(e)
     return ...; }
... when any
iounmap(e);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:47 -08:00
Amos Kong f382a086f3 PCI: Can continually add funcs after adding func0
Boot up a KVM guest, and hotplug multifunction
devices(func1,func2,func0,func3) to guest.

for i in 1 2 0 3;do
qemu-img create /tmp/resize$i.qcow2 1G -f qcow2
(qemu) drive_add 0x11.$i id=drv11$i,if=none,file=/tmp/resize$i.qcow2
(qemu) device_add virtio-blk-pci,id=dev11$i,drive=drv11$i,addr=0x11.$i,multifunction=on
done

In linux kernel, when func0 of the slot is hot-added, the whole
slot will be marked as 'enabled', then driver will ignore other new
hotadded funcs.
But in Win7 & WinXP, we can continaully add other funcs after adding
func0, all funcs will be added in guest.

drivers/pci/hotplug/acpiphp_glue.c:
static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
{
....
        for (slot = bridge->slots; slot; slot = slot->next) {
                if (slot->flags & SLOT_ENABLED) {
                        acpiphp_disable_slot()
                else
                        acpiphp_enable_slot()
....                              |
}                                 v
                            enable_device()
                                  |
                                  v
        //only don't enable slot if func0 is not added
	list_for_each_entry(func, &slot->funcs, sibling) {
               ...
        }
       slot->flags |= SLOT_ENABLED; //mark slot to 'enabled'

This patch just make pci driver can continaully add funcs after adding
func 0. Only mark slot to 'enabled' when all funcs are added.

For pci multifunction hotplug, we can add functions one by one(func 0 is
necessary), and all functions will be removed in one time.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:47 -08:00
Myron Stowe 6535943fbf x86/PCI: Convert maintaining FW-assigned BIOS BAR values to use a list
This patch converts the underlying maintenance aspects of FW-assigned
BIOS BAR values from a statically allocated array within struct pci_dev
to a list of temporary, stand alone, entries.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:46 -08:00
Myron Stowe 351fc6d1a5 PCI: Fix starting basis for resource requests
pci_revert_fw_address() is used to reinstate a PCI device's original
FW-assigned BIOS BAR value(s) if normal resource assignment fails.

When attempting to reinstate an address, the point within the resource
tree from which to attempt the new resource request should be the parent
resource corresponding to the device, not the base of the resource tree
(ioport_resource or iomem_resource).  For PCI devices this would
typically be the resource corresponding to the upstream PCI host bridge
or P2P bridge aperture.

This patch sets the point within the resource tree to attempt a new
resource assignment request to the PCI device's parent resource and only
if that fails does it fall back to the base ioport_resource or
iomem_resource.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:45 -08:00
Yinghai Lu 3682a3946d PCI: Fix pci cardbus removal
During test busn_res allocation with cardbus, found pci card removal is not
working anymore, and it turns out it is broken by:

|commit 79cc9601c3
|Date:   Tue Nov 22 21:06:53 2011 -0800
|
|    PCI: Only call pci_stop_bus_device() one time for child devices at remove

The above changed the behavior of pci_remove_behind_bridge that
yenta_cardbus depended on.  So restore the old behavoir of
pci_remove_behind_bridge (which requires stopping and removing of all
devices) by:

1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
   __pci_remove_bus_device() call it instead.
2. add pci_stop_behind_bridge that will stop devices behind a bridge
3. add back pci_remove_behind_bridge that will stop and remove devices
   under bridge.

-v2: update commit description a little bit.

Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-10 12:19:31 -08:00
Vaidyanathan Srinivasan 8161fe91d8 PCI: set pci sriov page size before reading SRIOV BAR
For an SRIOV device, PCI_SRIOV_SYS_PGSIZE should be set before
the PCI_SRIOV_BAR are queried.  The sys pagesize defaults to 4k,
so this change is required on powerpc box with 64k base page size.
    
This is a regression caused due to moving SRIOV init to sriov_enable().
    
| commit afd24ece5c
| Author: Ram Pai <linuxram@us.ibm.com>
    
| PCI: delay configuration of SRIOV capability
| The SRIOV capability, namely page size and total_vfs of a device are
| configured during enumeration phase of the device.  This can potentially
| interfere with the PCI operations of the platform, if the IOV capability
| of the device is not enabled.
    
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-10 12:01:56 -08:00
Yinghai Lu 71f6bd4a23 PCI: workaround hard-wired bus number V2
Fixes PCI device detection on IBM xSeries IBM 3850 M2 / x3950 M2
when using ACPI resources (_CRS).
This is default, a manual workaround (without this patch)
would be pci=nocrs boot param.

V2: Add dev_warn if the workaround is hit. This should reveal
how common such setups are (via google) and point to possible
problems if things are still not working as expected.
-> Suggested by Jan Beulich.

Cc: stable@vger.kernel.org
Tested-by: garyhade@us.ibm.com
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-10 11:34:42 -08:00
Randy Dunlap 6e9292c588 kernel-doc: fix new warnings in pci
Fix new kernel-doc warnings:

Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported'
Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx'
Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-23 08:44:53 -08:00
Linus Torvalds c49c41a413 Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
  capabilities: remove __cap_full_set definition
  security: remove the security_netlink_recv hook as it is equivalent to capable()
  ptrace: do not audit capability check when outputing /proc/pid/stat
  capabilities: remove task_ns_* functions
  capabitlies: ns_capable can use the cap helpers rather than lsm call
  capabilities: style only - move capable below ns_capable
  capabilites: introduce new has_ns_capabilities_noaudit
  capabilities: call has_ns_capability from has_capability
  capabilities: remove all _real_ interfaces
  capabilities: introduce security_capable_noaudit
  capabilities: reverse arguments to security_capable
  capabilities: remove the task from capable LSM hook entirely
  selinux: sparse fix: fix several warnings in the security server cod
  selinux: sparse fix: fix warnings in netlink code
  selinux: sparse fix: eliminate warnings for selinuxfs
  selinux: sparse fix: declare selinux_disable() in security.h
  selinux: sparse fix: move selinux_complete_init
  selinux: sparse fix: make selinux_secmark_refcount static
  SELinux: Fix RCU deref check warning in sel_netport_insert()

Manually fix up a semantic mis-merge wrt security_netlink_recv():

 - the interface was removed in commit fd77846152 ("security: remove
   the security_netlink_recv hook as it is equivalent to capable()")

 - a new user of it appeared in commit a38f7907b9 ("crypto: Add
   userspace configuration API")

causing no automatic merge conflict, but Eric Paris pointed out the
issue.
2012-01-14 18:36:33 -08:00
Rusty Russell 90ab5ee941 module_param: make bool parameters really bool (drivers & misc)
module_param(bool) used to counter-intuitively take an int.  In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option.  For this version
it'll simply give a warning, but it'll break next kernel version.

Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:20 +10:30
Linus Torvalds 7b67e75147 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits)
  x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
  PCI: Increase resource array mask bit size in pcim_iomap_regions()
  PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES
  PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT)
  PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB
  x86/PCI: amd: factor out MMCONFIG discovery
  PCI: Enable ATS at the device state restore
  PCI: msi: fix imbalanced refcount of msi irq sysfs objects
  PCI: kconfig: English typo in pci/pcie/Kconfig
  PCI/PM/Runtime: make PCI traces quieter
  PCI: remove pci_create_bus()
  xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources
  x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus()
  x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented()
  x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan
  sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources
  sparc/PCI: convert to pci_create_root_bus()
  sh/PCI: convert to pci_scan_root_bus() for correct root bus resources
  powerpc/PCI: convert to pci_create_root_bus()
  powerpc/PCI: split PHB part out of pcibios_map_io_space()
  ...

Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due
to the same patches being applied in other branches.
2012-01-11 18:50:26 -08:00
Linus Torvalds 1c8106528a Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits)
  iommu/amd: Set IOTLB invalidation timeout
  iommu/amd: Init stats for iommu=pt
  iommu/amd: Remove unnecessary cache flushes in amd_iommu_resume
  iommu/amd: Add invalidate-context call-back
  iommu/amd: Add amd_iommu_device_info() function
  iommu/amd: Adapt IOMMU driver to PCI register name changes
  iommu/amd: Add invalid_ppr callback
  iommu/amd: Implement notifiers for IOMMUv2
  iommu/amd: Implement IO page-fault handler
  iommu/amd: Add routines to bind/unbind a pasid
  iommu/amd: Implement device aquisition code for IOMMUv2
  iommu/amd: Add driver stub for AMD IOMMUv2 support
  iommu/amd: Add stat counter for IOMMUv2 events
  iommu/amd: Add device errata handling
  iommu/amd: Add function to get IOMMUv2 domain for pdev
  iommu/amd: Implement function to send PPR completions
  iommu/amd: Implement functions to manage GCR3 table
  iommu/amd: Implement IOMMUv2 TLB flushing routines
  iommu/amd: Add support for IOMMUv2 domain mode
  iommu/amd: Add amd_iommu_domain_direct_map function
  ...
2012-01-10 11:08:21 -08:00
Linus Torvalds 90160371b3 Merge branch 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (37 commits)
  xen/pciback: Expand the warning message to include domain id.
  xen/pciback: Fix "device has been assigned to X domain!" warning
  xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the "[un|]bind"
  xen/xenbus: don't reimplement kvasprintf via a fixed size buffer
  xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX
  xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX.
  Xen: consolidate and simplify struct xenbus_driver instantiation
  xen-gntalloc: introduce missing kfree
  xen/xenbus: Fix compile error - missing header for xen_initial_domain()
  xen/netback: Enable netback on HVM guests
  xen/grant-table: Support mappings required by blkback
  xenbus: Use grant-table wrapper functions
  xenbus: Support HVM backends
  xen/xenbus-frontend: Fix compile error with randconfig
  xen/xenbus-frontend: Make error message more clear
  xen/privcmd: Remove unused support for arch specific privcmp mmap
  xen: Add xenbus_backend device
  xen: Add xenbus device driver
  xen: Add privcmd device driver
  xen/gntalloc: fix reference counts on multi-page mappings
  ...
2012-01-10 10:09:59 -08:00
Joerg Roedel 00fb5430f5 Merge branches 'iommu/fixes', 'arm/omap' and 'x86/amd' into next
Conflicts:
	drivers/pci/hotplug/acpiphp_glue.c
2012-01-09 13:04:05 +01:00
Linus Torvalds 972b2c7199 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
  reiserfs: Properly display mount options in /proc/mounts
  vfs: prevent remount read-only if pending removes
  vfs: count unlinked inodes
  vfs: protect remounting superblock read-only
  vfs: keep list of mounts for each superblock
  vfs: switch ->show_options() to struct dentry *
  vfs: switch ->show_path() to struct dentry *
  vfs: switch ->show_devname() to struct dentry *
  vfs: switch ->show_stats to struct dentry *
  switch security_path_chmod() to struct path *
  vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
  vfs: trim includes a bit
  switch mnt_namespace ->root to struct mount
  vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
  vfs: opencode mntget() mnt_set_mountpoint()
  vfs: spread struct mount - remaining argument of next_mnt()
  vfs: move fsnotify junk to struct mount
  vfs: move mnt_devname
  vfs: move mnt_list to struct mount
  vfs: switch pnode.h macros to struct mount *
  ...
2012-01-08 12:19:57 -08:00
Konrad Rzeszutek Wilk 76ccc29701 x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
The MSI restore function will become a function pointer in an
x86_msi_ops struct. It defaults to the implementation in the
io_apic.c and msi.c. We piggyback on the indirection mechanism
introduced by "x86: Introduce x86_msi_ops".

Cc: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 14:02:26 -08:00
Linus Torvalds 67b0243131 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Skip cpus with apic-ids >= 255 in !x2apic_mode
  x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS
  x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping
  x86, acpi: Skip acpi x2apic entries if the x2apic feature is not present
  x86, apic: Add probe() for apic_flat
  x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'
  x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat
  x86: Add per-cpu stat counter for APIC ICR read tries
  pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86
  x86: Fix the !CONFIG_NUMA build of the new CPU ID fixup code support
  x86: Add NumaChip support
  x86: Add x86_init platform override to fix up NUMA core numbering
  x86: Make flat_init_apic_ldr() available
2012-01-06 13:58:21 -08:00
Hao, Xudong 1900ca132f PCI: Enable ATS at the device state restore
During S3 or S4 resume or PCI reset, ATS regs aren't restored correctly.
This patch enables ATS at the device state restore if PCI device has ATS
capability.

Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:18 -08:00
Neil Horman 424eb39159 PCI: msi: fix imbalanced refcount of msi irq sysfs objects
This warning was recently reported to me:

------------[ cut here ]------------
WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60()
Hardware name: VMware Virtual Platform
kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is
being called.
Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10
vmw_pvscsi
Pid: 630, comm: modprobe Tainted: G        W   3.1.6-1.fc16.x86_64 #1
Call Trace:
 [<ffffffff8106b73f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff8106b836>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff810da293>] ? free_desc+0x63/0x70
 [<ffffffff812a9aa0>] kobject_put+0x50/0x60
 [<ffffffff812e4c25>] free_msi_irqs+0xd5/0x120
 [<ffffffff812e524c>] pci_enable_msi_block+0x24c/0x2c0
 [<ffffffffa017c273>] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3]
 [<ffffffffa0182e94>] vmxnet3_probe_device+0x615/0x834 [vmxnet3]
 [<ffffffff812d141c>] local_pci_probe+0x5c/0xd0
 [<ffffffff812d2cb9>] pci_device_probe+0x109/0x130
 [<ffffffff8138ba2c>] driver_probe_device+0x9c/0x2b0
 [<ffffffff8138bceb>] __driver_attach+0xab/0xb0
 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
 [<ffffffff8138a8ac>] bus_for_each_dev+0x5c/0x90
 [<ffffffff8138b63e>] driver_attach+0x1e/0x20
 [<ffffffff8138b240>] bus_add_driver+0x1b0/0x2a0
 [<ffffffffa0188000>] ? 0xffffffffa0187fff
 [<ffffffff8138c246>] driver_register+0x76/0x140
 [<ffffffff815ca414>] ? printk+0x51/0x53
 [<ffffffffa0188000>] ? 0xffffffffa0187fff
 [<ffffffff812d2996>] __pci_register_driver+0x56/0xd0
 [<ffffffffa018803a>] vmxnet3_init_module+0x3a/0x3c [vmxnet3]
 [<ffffffff81002042>] do_one_initcall+0x42/0x180
 [<ffffffff810aad71>] sys_init_module+0x91/0x200
 [<ffffffff815dccc2>] system_call_fastpath+0x16/0x1b
---[ end trace 44593438a59a9558 ]---
Using INTx interrupt, #Rx queues: 1.

It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to
be called.  Because populate_msi_sysfs fails, we never registered any of the
msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put
on each of them, which gets flagged in the above stack trace.

The fix is pretty straightforward.  We can key of the parent pointer in the
kobject.  It is only set if the kobject_init_and_add succededs in
populate_msi_sysfs.  If anything fails there, each kobject has its parent reset
to NULL

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: linux-pci@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:17 -08:00
P. Christeas d56641c772 PCI: kconfig: English typo in pci/pcie/Kconfig
Just fix this help text.

Signed-off-by: P. Christeas <xrg@linux.gr>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:17 -08:00
Vincent Palatin 85b8582d7c PCI/PM/Runtime: make PCI traces quieter
When the runtime PM is activated on PCI, if a device switches state
frequently (e.g. an EHCI controller with autosuspending USB devices
connected) the PCI configuration traces might be very verbose in the
kernel log.  Let's guard those traces with DEBUG condition.

Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:16 -08:00
Bjorn Helgaas 118faafaf9 PCI: remove pci_create_bus()
All users of pci_create_bus() have been converted to pci_create_root_bus(),
so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:15 -08:00
Bjorn Helgaas 7e00fe2e53 PCI: deprecate pci_scan_bus_parented()
Users of pci_scan_bus_parented() should be converted to use either
    pci_scan_root_bus() (preferred, but also calls pci_bus_add_devices)
or
    pci_create_root_bus()
    pci_scan_child_bus()

Since pci_scan_bus_parented(), I'm marking it deprecated now and will
actually remove it later.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:55 -08:00
Bjorn Helgaas 1e39ae9f90 PCI: convert pci_scan_bus_parented() to use pci_create_root_bus()
This converts pci_scan_bus_parented() to use pci_create_root_bus()
instead of pci_create_bus().  The new bus still has the default (incorrect)
resources, so this patch doesn't help fix that problem, but it does remove
one more use of pci_create_bus().

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:54 -08:00
Bjorn Helgaas de4b2f76d6 PCI: convert pci_scan_bus() to use pci_create_root_bus()
I plan to deprecate pci_scan_bus_parented(), so use pci_create_root_bus()
directly instead.  pci_scan_bus() itself will be removed as soon as all
callers are gone, so this is just an interim step.

v2: export pci_scan_bus

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:53 -08:00
Bjorn Helgaas a2ebb82795 PCI: add pci_scan_root_bus() that accepts resource list
"Early" and "header" quirks often use incorrect bus resources because they
see the default resources assigned by pci_create_bus(), before the
architecture fixes them up (typically in pcibios_fixup_bus()).  Regions
reserved by these quirks end up with the wrong parents.

Here's the standard path for scanning a PCI root bus:

  pci_scan_bus or pci_scan_bus_parented
    pci_create_bus                     <-- A create with default resources
    pci_scan_child_bus
      pci_scan_slot
        pci_scan_single_device
          pci_scan_device
            pci_setup_device
              pci_fixup_device(early)  <-- B
          pci_device_add
            pci_fixup_device(header)   <-- C
      pcibios_fixup_bus                <-- D fill in correct resources

Early and header quirks at B and C use the default (incorrect) root bus
resources rather than those filled in at D.

This patch adds a new pci_scan_root_bus() function that sets the bus
resources correctly from a supplied list of resources.

I intend to remove pci_scan_bus() and pci_scan_bus_parented() after
fixing all callers.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:52 -08:00
Bjorn Helgaas 166c637075 PCI: add pci_create_root_bus() that accepts resource list
pci_create_bus() assigns ioport_resource and iomem_resource as the default
bus resources, i.e., the entire address space.  Architectures fix these
later, typically in pcibios_fixup_bus() or after pci_scan_bus_parented()
returns, but code that runs in the interim sees incorrect resource
information.

This patch adds a new pci_create_root_bus() that sets the bus resources
correctly from a supplied list of resources.

I intend to remove pci_create_bus() after changing all callers.

Based on original patch by Deng-Cheng Zhu.

Reference: http://www.spinics.net/lists/mips/msg41654.html
Reference: https://lkml.org/lkml/2011/8/26/88
Signed-off-by: Deng-Cheng Zhu <dczhu@mips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:51 -08:00
Bjorn Helgaas a9d9f5276c PCI: show host bridges and root bus resources
Show the bus number and resources for every root bus we create.  This
will become more interesting when we supply the correct resources
instead of using the defaults (ioport_resource and iomem_resource).

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:51 -08:00
Bjorn Helgaas 45ca9e9730 PCI: add helpers for building PCI bus resource lists
We'd like to supply a list of resources when we create a new PCI bus,
e.g., the root bus under a PCI host bridge.  These are helpers for
constructing that list.

These are exported because the plan is to replace this exported interface:
    pci_scan_bus_parented()
with this one:
    pci_add_resource(resources, ...)
    pci_scan_root_bus(..., resources)

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:50 -08:00
Ram Pai afd24ece5c PCI: delay configuration of SRIOV capability
The SRIOV capability, namely page size and total_vfs of a device are
configured during enumeration phase of the device.  This can potentially
interfere with the PCI operations of the platform, if the IOV capability
of the device is not enabled.

The following patch postpones the configuration of the IOV capability of
the device to a later point, when the IOV capability is explicitly
enabled by the device driver.

The patch is tested on x86 and power platform.

Tested-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:49 -08:00
Yinghai Lu 79cc9601c3 PCI: Only call pci_stop_bus_device() one time for child devices at remove
During debugging pcie hotplug with SRIOV with pcie switch, I found
pci_stop_bus_device() is called several times for some child devices.

So change original pci_remove_bus_device() to __pci_remove_bus_device(),
and make it only do remove work, and add a new pci_remove_bus_device
that calls pci_stop_bus_device() one time, and then call
__pci_remove_bus_device().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:48 -08:00