Commit Graph

677719 Commits

Author SHA1 Message Date
Quan Nguyen d09d3629b6 drivers: net: xgene: Use rgmii mdio mac access
This patch switches to use rgmii mdio mac access routines if available,
as they share the same HW.

Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16 11:41:08 -04:00
Quan Nguyen 8ec7074a6b drivers: net: phy: xgene: Add lock to protect mac access
This patch,

- refactors mac access routine
- adds lock to protect mac indirect access

Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16 11:41:08 -04:00
Iyappan Subramanian ae1aed95d0 drivers: net: xgene: Protect indirect MAC access
This patch,

     - refactors mac read/write functions
     - adds lock to protect indirect mac access

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16 11:41:08 -04:00
Linus Torvalds a95cfad947 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Track alignment in BPF verifier so that legitimate programs won't be
    rejected on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS architectures.

 2) Make tail calls work properly in arm64 BPF JIT, from Deniel
    Borkmann.

 3) Make the configuration and semantics Generic XDP make more sense and
    don't allow both generic XDP and a driver specific instance to be
    active at the same time. Also from Daniel.

 4) Don't crash on resume in xen-netfront, from Vitaly Kuznetsov.

 5) Fix use-after-free in VRF driver, from Gao Feng.

 6) Use netdev_alloc_skb_ip_align() to avoid unaligned IP headers in
    qca_spi driver, from Stefan Wahren.

 7) Always run cleanup routines in BPF samples when we get SIGTERM, from
    Andy Gospodarek.

 8) The mdio phy code should bring PHYs out of reset using the shared
    GPIO lines before invoking bus->reset(). From Florian Fainelli.

 9) Some USB descriptor access endian fixes in various drivers from
    Johan Hovold.

10) Handle PAUSE advertisements properly in mlx5 driver, from Gal
    Pressman.

11) Fix reversed test in mlx5e_setup_tc(), from Saeed Mahameed.

12) Cure netdev leak in AF_PACKET when using timestamping via control
    messages. From Douglas Caetano dos Santos.

13) netcp doesn't support HWTSTAMP_FILTER_ALl, reject it. From Miroslav
    Lichvar.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  ldmvsw: stop the clean timer at beginning of remove
  ldmvsw: unregistering netdev before disable hardware
  net: netcp: fix check of requested timestamping filter
  ipv6: avoid dad-failures for addresses with NODAD
  qed: Fix uninitialized data in aRFS infrastructure
  mdio: mux: fix device_node_continue.cocci warnings
  net/packet: fix missing net_device reference release
  net/mlx4_core: Use min3 to select number of MSI-X vectors
  macvlan: Fix performance issues with vlan tagged packets
  net: stmmac: use correct pointer when printing normal descriptor ring
  net/mlx5: Use underlay QPN from the root name space
  net/mlx5e: IPoIB, Only support regular RQ for now
  net/mlx5e: Fix setup TC ndo
  net/mlx5e: Fix ethtool pause support and advertise reporting
  net/mlx5e: Use the correct pause values for ethtool advertising
  vmxnet3: ensure that adapter is in proper state during force_close
  sfc: revert changes to NIC revision numbers
  net: ch9200: add missing USB-descriptor endianness conversions
  net: irda: irda-usb: fix firmware name on big-endian hosts
  net: dsa: mv88e6xxx: add default case to switch
  ...
2017-05-15 15:50:49 -07:00
Linus Torvalds 1319a2856d Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "A set of minor cifs fixes"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  [CIFS] Minor cleanup of xattr query function
  fs: cifs: transport: Use time_after for time comparison
  SMB2: Fix share type handling
  cifs: cifsacl: Use a temporary ops variable to reduce code length
  Don't delay freeing mids when blocked on slow socket write of request
  CIFS: silence lockdep splat in cifs_relock_file()
2017-05-15 15:27:02 -07:00
David S. Miller 66f4bc819d Merge branch 'ldmsw-fixes'
Shannon Nelson says:

====================
ldmvsw: port removal stability

Under heavy reboot stress testing we found a couple of timing issues
when removing the device that could cause the kernel great heartburn,
addressed by these two patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 15:36:09 -04:00
Shannon Nelson 8b671f906c ldmvsw: stop the clean timer at beginning of remove
Stop the clean timer earlier to be sure there's no asynchronous
interference while stopping the port.

Orabug: 25748241

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 15:36:08 -04:00
Thomas Tai b18e5e86b4 ldmvsw: unregistering netdev before disable hardware
When running LDom binding/unbinding test, kernel may panic
in ldmvsw_open(). It is more likely that because we're removing
the ldc connection before unregistering the netdev in vsw_port_remove(),
we set up a window of time where one process could be removing the
device while another trying to UP the device. This also sometimes causes
vio handshake error due to opening a device without closing it completely.
We should unregister the netdev before we disable the "hardware".

Orabug: 25980913, 25925306

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 15:36:08 -04:00
Miroslav Lichvar ca9df7ede4 net: netcp: fix check of requested timestamping filter
The driver doesn't support timestamping of all received packets and
should return error when trying to enable the HWTSTAMP_FILTER_ALL
filter.

Cc: WingMan Kwok <w-kwok2@ti.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 15:21:03 -04:00
Christoph Hellwig f98e0eb680 dm mpath: multipath_clone_and_map must not return -EIO
Since 412445ac ("dm: introduce a new DM_MAPIO_KILL return value"), the
clone_and_map_rq methods must not return errno values, so fix it up
to properly return DM_MAPIO_KILL, instead of the -EIO value that snuck
in due to a conflict between two patches.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-15 15:09:53 -04:00
Christoph Hellwig 18a482f524 dm mpath: don't return -EIO from dm_report_EIO
Instead just turn the macro into a helper for the warning message.
This removes an unnecessary assignment and will allow the next commit to
fix a place where -EIO is the wrong return value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-15 15:09:52 -04:00
Christoph Hellwig ece0728037 dm rq: add a missing break to map_request
We don't want to bug when receiving a DM_MAPIO_KILL value..

Fixes: 412445ac ("dm: introduce a new DM_MAPIO_KILL return value")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-15 15:09:51 -04:00
Joe Thornber 0377a07c7a dm space map disk: fix some book keeping in the disk space map
When decrementing the reference count for a block, the free count wasn't
being updated if the reference count went to zero.

Cc: stable@vger.kernel.org
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-15 15:09:50 -04:00
Joe Thornber 91bcdb92d3 dm thin metadata: call precommit before saving the roots
These calls were the wrong way round in __write_initial_superblock.

Cc: stable@vger.kernel.org
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-15 15:09:49 -04:00
David S. Miller 42a928ced3 mlx5-fixes-2017-05-12
Misc fixes for mlx5 driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZGDM8AAoJEEg/ir3gV/o+svMH/1lAl+FIGCWgZ82/UbFHCZCW
 SIXEnP2id+Nic7JxQSa/RVQ43I75wkM8pN929hFoyz8p/pPLhZkpo7vX6yLWG2SL
 fTx/4Qn5jR/eow/D+fdrlyKMPg0A7fijY+ZnvDPsQkjtakIedCgc/A1xDufpKi7+
 I7nOJa4yACuZK0gzy32VGgpJw02q32eRTJjKHRiEYdmNQSIJpmRbG2m4e0z/me2s
 hrMt358/llPOZNwkAPD2SHZxH68oSq5EbSRmz5jDwXfTVFkjWNQVowwm3pCFjsQn
 3xDtkjakVPVBUagR3hngtLPcsDpUR4GU3tHr4l/UPp0wFBXVKgdupC7B+mCkSp4=
 =ubJG
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2017-05-12-V2' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2017-05-12

This series contains some mlx5 fixes for net.
Please pull and let me know if there's any problem.

For -stable:
("net/mlx5e: Fix ethtool pause support and advertise reporting") kernels >= 4.8
("net/mlx5e: Use the correct pause values for ethtool advertising") kernels >= 4.8

v1->v2:
 Dropped statistics spinlock patch, it needs some extra work.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 14:38:04 -04:00
Mahesh Bandewar 66eb9f86e5 ipv6: avoid dad-failures for addresses with NODAD
Every address gets added with TENTATIVE flag even for the addresses with
IFA_F_NODAD flag and dad-work is scheduled for them. During this DAD process
we realize it's an address with NODAD and complete the process without
sending any probe. However the TENTATIVE flags stays on the
address for sometime enough to cause misinterpretation when we receive a NS.
While processing NS, if the address has TENTATIVE flag, we mark it DADFAILED
and endup with an address that was originally configured as NODAD with
DADFAILED.

We can't avoid scheduling dad_work for addresses with NODAD but we can
avoid adding TENTATIVE flag to avoid this racy situation.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 14:31:51 -04:00
Mintz, Yuval aa4ad88cfc qed: Fix uninitialized data in aRFS infrastructure
Current memset is using incorrect type of variable, causing the
upper-half of the strucutre to be left uninitialized and causing:

  ethernet/qlogic/qed/qed_init_fw_funcs.c: In function 'qed_set_rfs_mode_disable':
  ethernet/qlogic/qed/qed_init_fw_funcs.c:993:3: error: '*((void *)&ramline+4)' is used uninitialized in this function [-Werror=uninitialized]

Fixes: d51e4af5c2 ("qed: aRFS infrastructure support")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 14:31:27 -04:00
Julia Lawall 8c977f5a85 mdio: mux: fix device_node_continue.cocci warnings
Device node iterators put the previous value of the index variable, so an
explicit put causes a double put.

In particular, of_mdiobus_register can fail before doing anything
interesting, so one could view it as a no-op from the reference count
point of view.

Generated by: scripts/coccinelle/iterators/device_node_continue.cocci

CC: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 14:28:10 -04:00
Douglas Caetano dos Santos d19b183cdc net/packet: fix missing net_device reference release
When using a TX ring buffer, if an error occurs processing a control
message (e.g. invalid message), the net_device reference is not
released.

Fixes c14ac9451c ("sock: enable timestamping using control messages")
Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 14:22:12 -04:00
yuval.shaia@oracle.com 4762010f09 net/mlx4_core: Use min3 to select number of MSI-X vectors
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 14:20:26 -04:00
Vlad Yasevich 70957eaecc macvlan: Fix performance issues with vlan tagged packets
Macvlan always turns on offload features that have sofware
fallback (NETIF_GSO_SOFTWARE).  This allows much higher guest-guest
communications over macvtap.

However, macvtap does not turn on these features for vlan tagged traffic.
As a result, depending on the HW that mactap is configured on, the
performance of guest-guest communication over a vlan is very
inconsistent.  If the HW supports TSO/UFO over vlans, then the
performance will be fine.  If not, the the performance will suffer
greatly since the VM may continue using TSO/UFO, and will force the host
segment the traffic and possibly overlow the macvtap queue.

This patch adds the always on offloads to vlan_features.  This
makes sure that any vlan tagged traffic between 2 guest will not
be segmented needlessly.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 14:18:11 -04:00
Peter Rosin 9fce894d03 i2c: mux: only print failure message on error
As is, a failure message is printed unconditionally, which is confusing.
And noisy.

Fixes: 8d4d159f25 ("i2c: mux: provide more info on failure in i2c_mux_add_adapter")
Signed-off-by: Peter Rosin <peda@axentia.se>
2017-05-15 18:49:11 +02:00
Peter Rosin a36d4637e4 i2c: mux: reg: rename label to indicate what it does
That maintains sanity if it is ever called from some other spot, and
also makes the label names coherent.

Signed-off-by: Peter Rosin <peda@axentia.se>
2017-05-15 18:49:10 +02:00
Peter Rosin 68118e0e73 i2c: mux: reg: put away the parent i2c adapter on probe failure
It is only prudent to let go of resources that are not used.

Fixes: b3fdd32799 ("i2c: mux: Add register-based mux i2c-mux-reg")
Signed-off-by: Peter Rosin <peda@axentia.se>
2017-05-15 18:44:58 +02:00
Niklas Cassel 66c25f6e31 net: stmmac: use correct pointer when printing normal descriptor ring
There are two pointers in sysfs_display_ring,
one that increments if using normal dma descriptors,
another if using extended dma descriptors.

When printing the normal dma descriptors, the wrong pointer is used,
thus the printed descriptor addresses are incorrect.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15 10:02:19 -04:00
Heiko Carstens fb317002ab s390/virtio: change virtio_feature_desc:features type to __le32
The feature member of virtio_feature_desc contains little endian
values, given that it contents will be converted with
le32_to_cpu(). The "wrong" __u32 type leads to the sparse warnings
below.
In order to avoid them, use the correct __le32 type instead.

drivers/s390/virtio/virtio_ccw.c:749:14: warning: cast to restricted __le32
drivers/s390/virtio/virtio_ccw.c:762:28: warning: cast to restricted __le32

Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-05-15 12:20:54 +02:00
Joe Thornber 2e63309507 dm cache policy smq: don't do any writebacks unless IDLE
If there are no clean blocks to be demoted the writeback will be
triggered at that point.  Preemptively writing back can hurt high IO
load scenarios.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-14 21:54:33 -04:00
Joe Thornber 49b7f76890 dm cache: simplify the IDLE vs BUSY state calculation
Drop the MODERATE state since it wasn't buying us much.

Also, in check_migrations(), prepare for the next commit ("dm cache
policy smq: don't do any writebacks unless IDLE") by deferring to the
policy to make the final decision on whether writebacks can be
serviced.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-14 21:54:33 -04:00
Joe Thornber 701e03e4e1 dm cache: track all IO to the cache rather than just the origin device's IO
IO tracking used to throttle writebacks when the origin device is busy.

Even if all the IO is going to the fast device, writebacks can
significantly degrade performance.  So track all IO to gauge whether the
cache is busy or not.

Otherwise, synthetic IO tests (e.g. fio) that might send all IO to the
fast device wouldn't cause writebacks to get throttled.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-14 21:54:33 -04:00
Joe Thornber 6cf4cc8f8b dm cache policy smq: stop preemptively demoting blocks
It causes a lot of churn if the working set's size is close to the fast
device's size.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-14 21:54:33 -04:00
Joe Thornber 4d44ec5ab7 dm cache policy smq: put newly promoted entries at the top of the multiqueue
This stops entries bouncing in and out of the cache quickly.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-14 21:54:33 -04:00
Joe Thornber 78c45607b9 dm cache policy smq: be more aggressive about triggering a writeback
If there are no clean entries to demote we really want to writeback
immediately.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-14 21:54:32 -04:00
Joe Thornber a8cd1eba61 dm cache policy smq: only demote entries in bottom half of the clean multiqueue
Heavy IO load may mean there are very few clean blocks in the cache, and
we risk demoting entries that get hit a lot.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-14 21:54:32 -04:00
Joe Thornber 072792dcdf dm cache: fix incorrect 'idle_time' reset in IO tracker
Some bios have no payload (eg, a FLUSH), don't reset the idle_time when
these come in.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-14 21:53:11 -04:00
Thomas Gleixner 90b4f30b6d hwmon: (coretemp) Handle frozen hotplug state correctly
The recent conversion to the hotplug state machine missed that the original
hotplug notifiers did not execute in the frozen state, which is used on
suspend on resume.

This does not matter on single socket machines, but on multi socket systems
this breaks when the device for a non-boot socket is removed when the last
CPU of that socket is brought offline. The device removal locks up the
machine hard w/o any debug output.

Prevent executing the hotplug callbacks when cpuhp_tasks_frozen is true.

Thanks to Tommi for providing debug information patiently while I failed to
spot the obvious.

Fixes: e00ca5df37 ("hwmon: (coretemp) Convert to hotplug state machine")
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-05-14 07:49:32 -07:00
Yishai Hadas 508541146a net/mlx5: Use underlay QPN from the root name space
Root flow table is dynamically changed by the underlying flow steering
layer, and IPoIB/ULPs have no idea what will be the root flow table in
the future, hence we need a dynamic infrastructure to move Underlay QPs
with the root flow table.

Fixes: b3ba51498b ("net/mlx5: Refactor create flow table method to accept underlay QP")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-05-14 13:33:45 +03:00
Saeed Mahameed 5360fd473c net/mlx5e: IPoIB, Only support regular RQ for now
IPoIB doesn't support striding RQ at the moment, for this
we need to explicitly choose non striding RQ in IPoIB init,
even if the HW supports it.

Fixes: 8f493ffd88 ("net/mlx5e: IPoIB, RX steering RSS RQTs and TIRs")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-05-14 13:33:45 +03:00
Saeed Mahameed 20b6a1c782 net/mlx5e: Fix setup TC ndo
Fail-safe support patches introduced a trivial bug,
setup tc callback is doing a wrong check of the netdevice state,
the fix is simply to invert the condition.

Fixes: 6f9485af40 ("net/mlx5e: Fail safe tc setup")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-05-14 13:33:45 +03:00
Gal Pressman e3c1950371 net/mlx5e: Fix ethtool pause support and advertise reporting
Pause bit should set when RX pause is on, not TX pause.
Also, setting Asym_Pause is incorrect, and should be turned off.

Fixes: 665bc53969 ("net/mlx5e: Use new ethtool get/set link ksettings API")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-05-14 13:33:45 +03:00
Gal Pressman b383b544f2 net/mlx5e: Use the correct pause values for ethtool advertising
Query the operational pause from firmware (PFCC register) instead of
always passing zeros.

Fixes: 665bc53969 ("net/mlx5e: Use new ethtool get/set link ksettings API")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-05-14 13:33:45 +03:00
Kirill Tkhai 3fd3722621 pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes()
Imagine we have a pid namespace and a task from its parent's pid_ns,
which made setns() to the pid namespace. The task is doing fork(),
while the pid namespace's child reaper is dying. We have the race
between them:

Task from parent pid_ns             Child reaper
copy_process()                      ..
  alloc_pid()                       ..
  ..                                zap_pid_ns_processes()
  ..                                  disable_pid_allocation()
  ..                                  read_lock(&tasklist_lock)
  ..                                  iterate over pids in pid_ns
  ..                                    kill tasks linked to pids
  ..                                  read_unlock(&tasklist_lock)
  write_lock_irq(&tasklist_lock);   ..
  attach_pid(p, PIDTYPE_PID);       ..
  ..                                ..

So, just created task p won't receive SIGKILL signal,
and the pid namespace will be in contradictory state.
Only manual kill will help there, but does the userspace
care about this? I suppose, the most users just inject
a task into a pid namespace and wait a SIGCHLD from it.

The patch fixes the problem. It simply checks for
(pid_ns->nr_hashed & PIDNS_HASH_ADDING) in copy_process().
We do it under the tasklist_lock, and can't skip
PIDNS_HASH_ADDING as noted by Oleg:

"zap_pid_ns_processes() does disable_pid_allocation()
and then takes tasklist_lock to kill the whole namespace.
Given that copy_process() checks PIDNS_HASH_ADDING
under write_lock(tasklist) they can't race;
if copy_process() takes this lock first, the new child will
be killed, otherwise copy_process() can't miss
the change in ->nr_hashed."

If allocation is disabled, we just return -ENOMEM
like it's made for such cases in alloc_pid().

v2: Do not move disable_pid_allocation(), do not
introduce a new variable in copy_process() and simplify
the patch as suggested by Oleg Nesterov.
Account the problem with double irq enabling
found by Eric W. Biederman.

Fixes: c876ad7682 ("pidns: Stop pid allocation when init dies")
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Mike Rapoport <rppt@linux.vnet.ibm.com>
CC: Michal Hocko <mhocko@suse.com>
CC: Andy Lutomirski <luto@kernel.org>
CC: "Eric W. Biederman" <ebiederm@xmission.com>
CC: Andrei Vagin <avagin@openvz.org>
CC: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Serge Hallyn <serge@hallyn.com>
Cc: stable@vger.kernel.org
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-05-13 17:26:02 -05:00
Eric W. Biederman b9a985db98 pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes
The code can potentially sleep for an indefinite amount of time in
zap_pid_ns_processes triggering the hung task timeout, and increasing
the system average.  This is undesirable.  Sleep with a task state of
TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these
undesirable side effects.

Apparently under heavy load this has been allowing Chrome to trigger
the hung time task timeout error and cause ChromeOS to reboot.

Reported-by: Vovo Yang <vovoy@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 6347e90091 ("pidns: guarantee that the pidns init will be the last pidns process reaped")
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-05-13 17:26:01 -05:00
Linus Torvalds 2ea659a9ef Linux 4.12-rc1 2017-05-13 13:19:49 -07:00
Linus Torvalds cd63645890 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull some more input subsystem updates from Dmitry Torokhov:
 "An updated xpad driver with a few more recognized device IDs, and a
  new psxpad-spi driver, allowing connecting Playstation 1 and 2 joypads
  via SPI bus"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: cros_ec_keyb - remove extraneous 'const'
  Input: add support for PlayStation 1/2 joypads connected via SPI
  Input: xpad - add USB IDs for Mad Catz Brawlstick and Razer Sabertooth
  Input: xpad - sync supported devices with xboxdrv
  Input: xpad - sort supported devices by USB ID
2017-05-13 10:25:05 -07:00
Linus Torvalds b53c4d5eb7 This pull request contains updates for both UBI and UBIFS:
- New config option CONFIG_UBIFS_FS_SECURITY
 - Minor improvements
 - Random fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZFuwKAAoJEEtJtSqsAOnWYrUP/0/y7PEh0ZGdi4kkQy/CnuJr
 pmybsQ0TbLoljahuDXqShKkMNvuXIvKSKcHROIsXreG+DCfC3v/srZlvRt7UCPOE
 QVvjh0sQTaMUrfcaTTM9g3Im/BZX9MueTaSF2Rgx1lF+R2t3InW1bv9hvmQxfoEA
 N75tJgH69mii5pDWuGgLLjmmxhbSkMGpM31QeO5DUaLRqdXcc5L5iK5Hnd+Wtj81
 oSB5RsergCfk17jaWH2e7G03LB2tm6AhM5oksTOpZ9+OIW9GOiUMfjYFC2ZYRwzx
 zHhnh0rGPfFv0jO5u4CXtWaQDfxyw6Z7XLK+Xo1RemkhM/7AQl2xetfIVDXErgoA
 NxxN/a8MWcEpJ2x6y/Z8740HXjyQjt9h3nHzlVPNP8hz68J796E7UzRjCQtf7Iyh
 xqhfjMabxfBqcLkTESvgmjcuwo1IkqOaFBjIw2Cd2nfBEkCKzoaINjRHitgUGj/z
 Mm1CJNWvaK6QTdZ3iCETCyPQI02A+4ZXhDf/QZS3wRAMc1v45pS/dVeBn+0F8Nrc
 ASiQwcd7u1IfJa3A6d6DgMECUWBXjc1GGMfMyhS/ta56pOfe1RyR3bg9WuISqUMe
 86id9tiSs7cP2UVFTrFFFWAO3rATj+9cOO9f2LTujPzcd88cJhKSykaLPmELfyE9
 YUPw9lpExwyXLn7S46LQ
 =9ZJe
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.12-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:

 - new config option CONFIG_UBIFS_FS_SECURITY

 - minor improvements

 - random fixes

* tag 'upstream-4.12-rc1' of git://git.infradead.org/linux-ubifs:
  ubi: Add debugfs file for tracking PEB state
  ubifs: Fix a typo in comment of ioctl2ubifs & ubifs2ioctl
  ubifs: Remove unnecessary assignment
  ubifs: Fix cut and paste error on sb type comparisons
  ubi: fastmap: Fix slab corruption
  ubifs: Add CONFIG_UBIFS_FS_SECURITY to disable/enable security labels
  ubi: Make mtd parameter readable
  ubi: Fix section mismatch
2017-05-13 10:23:12 -07:00
Linus Torvalds ec059019b7 Merge branch 'for-linus-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
 "No new stuff, just fixes"

* 'for-linus-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Add missing NR_CPUS include
  um: Fix to call read_initrd after init_bootmem
  um: Include kbuild.h instead of duplicating its macros
  um: Fix PTRACE_POKEUSER on x86_64
  um: Set number of CPUs
  um: Fix _print_addr()
2017-05-13 10:20:02 -07:00
Linus Torvalds 1251704a63 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "15 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, docs: update memory.stat description with workingset* entries
  mm: vmscan: scan until it finds eligible pages
  mm, thp: copying user pages must schedule on collapse
  dax: fix PMD data corruption when fault races with write
  dax: fix data corruption when fault races with write
  ext4: return to starting transaction in ext4_dax_huge_fault()
  mm: fix data corruption due to stale mmap reads
  dax: prevent invalidation of mapped DAX entries
  Tigran has moved
  mm, vmalloc: fix vmalloc users tracking properly
  mm/khugepaged: add missed tracepoint for collapse_huge_page_swapin
  gcov: support GCC 7.1
  mm, vmstat: Remove spurious WARN() during zoneinfo print
  time: delete current_fs_time()
  hwpoison, memcg: forcibly uncharge LRU pages
2017-05-13 09:49:35 -07:00
Steve French 67b4c889cc [CIFS] Minor cleanup of xattr query function
Some minor cleanup of cifs query xattr functions (will also make
SMB3 xattr implementation cleaner as well).

Signed-off-by: Steve French <steve.french@primarydata.com>
2017-05-12 20:59:10 -05:00
Karim Eshapa 4328fea77c fs: cifs: transport: Use time_after for time comparison
Use time_after kernel macro for time comparison
that has safety check.

Signed-off-by: Karim Eshapa <karim.eshapa@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-05-12 19:56:44 -05:00
Christophe JAILLET cd1230070a SMB2: Fix share type handling
In fs/cifs/smb2pdu.h, we have:
#define SMB2_SHARE_TYPE_DISK    0x01
#define SMB2_SHARE_TYPE_PIPE    0x02
#define SMB2_SHARE_TYPE_PRINT   0x03

Knowing that, with the current code, the SMB2_SHARE_TYPE_PRINT case can
never trigger and printer share would be interpreted as disk share.

So, test the ShareType value for equality instead.

Fixes: faaf946a7d ("CIFS: Add tree connect/disconnect capability for SMB2")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-05-12 19:55:56 -05:00