Commit Graph

9 Commits

Author SHA1 Message Date
Lv Zheng 2fb037b89a ACPI / IPMI: Cleanup several acpi_ipmi_device members
This (trivial) patch:
 1. Deletes a member of the acpi_ipmi_device, smi_data, which is not
    actually used.
 2. Updates a member of the acpi_ipmi_device, pnp_dev, which is only used
    by dev_warn() invocations, so changes it to a struct device.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 19:46:12 +02:00
Lv Zheng 7b98447722 ACPI / IPMI: Add reference counting for ACPI IPMI transfers
This patch adds reference counting for ACPI IPMI transfers to tune the
locking granularity of tx_msg_lock.

This patch also makes the whole acpi_ipmi module's coding style consistent
by using reference counting for all its objects (i.e., acpi_ipmi_device and
acpi_ipmi_msg).

The acpi_ipmi_msg handling is re-designed using referece counting.
 1. tx_msg is always unlinked before complete(), so that it is safe to put
    complete() out side of tx_msg_lock.
 2. tx_msg reference counters are incremented before calling
    ipmi_request_settime() and tx_msg_lock protection is added to
    ipmi_cancel_tx_msg() so that a complete() can be safely called in
    parellel with tx_msg unlinking in failure cases.
 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed
    and freed in the contexts other than acpi_ipmi_space_handler().

The lockdep_chains shows all acpi_ipmi locks are leaf locks after the
tuning:
 1. ipmi_lock is always leaf:
    irq_context: 0
    [ffffffff81a943f8] smi_watchers_mutex
    [ffffffffa06eca60] driver_data.ipmi_lock
    irq_context: 0
    [ffffffff82767b40] &buffer->mutex
    [ffffffffa00a6678] s_active#103
    [ffffffffa06eca60] driver_data.ipmi_lock
 2. without this patch applied, lock used by complete() is held after
    holding tx_msg_lock:
    irq_context: 0
    [ffffffff82767b40] &buffer->mutex
    [ffffffffa00a6678] s_active#103
    [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
    irq_context: 1
    [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
    irq_context: 1
    [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
    [ffffffffa06eccf0] &x->wait#25
    irq_context: 1
    [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
    [ffffffffa06eccf0] &x->wait#25
    [ffffffff81e36620] &p->pi_lock
    irq_context: 1
    [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
    [ffffffffa06eccf0] &x->wait#25
    [ffffffff81e36620] &p->pi_lock
    [ffffffff81e5d0a8] &rq->lock
 3. with this patch applied, tx_msg_lock is always leaf:
    irq_context: 0
    [ffffffff82767b40] &buffer->mutex
    [ffffffffa00a66d8] s_active#107
    [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock
    irq_context: 1
    [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 19:46:12 +02:00
Lv Zheng e96a94edd7 ACPI / IPMI: Use global IPMI operation region handler
It is found on a real machine, in its ACPI namespace, the IPMI
OperationRegions (in the ACPI000D - ACPI power meter) are not defined under
the IPMI system interface device (the IPI0001 with KCS type returned from
_IFT control method):
  Device (PMI0)
  {
      Name (_HID, "ACPI000D")  // _HID: Hardware ID
      OperationRegion (SYSI, IPMI, 0x0600, 0x0100)
      Field (SYSI, BufferAcc, Lock, Preserve)
      {
          AccessAs (BufferAcc, 0x01),
          Offset (0x58),
          SCMD,   8,
          GCMD,   8
      }

      OperationRegion (POWR, IPMI, 0x3000, 0x0100)
      Field (POWR, BufferAcc, Lock, Preserve)
      {
          AccessAs (BufferAcc, 0x01),
          Offset (0xB3),
          GPMM,   8
      }
  }

  Device (PCI0)
  {
      Device (ISA)
      {
          Device (NIPM)
          {
              Name (_HID, EisaId ("IPI0001"))  // _HID: Hardware ID
              Method (_IFT, 0, NotSerialized)  // _IFT: IPMI Interface Type
              {
                  Return (0x01)
              }
          }
      }
  }

Current ACPI_IPMI code registers IPMI operation region handler on a
per-device basis, so for the above namespace the IPMI operation region
handler is registered only under the scope of \_SB.PCI0.ISA.NIPM.  Thus
when an IPMI operation region field of \PMI0 is accessed, there are errors
reported on such platform:
  ACPI Error: No handlers for Region [IPMI]
  ACPI Error: Region IPMI(7) has no handler
The solution is to install an IPMI operation region handler from root node
so that every object that defines IPMI OperationRegion can get an address
space handler registered.

When an IPMI operation region field is accessed, the Network Function
(0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are
passed to the operation region handler, there is no system interface
specified by the BIOS.  The patch tries to select one system interface by
monitoring the system interface notification.  IPMI messages passed from
the ACPI codes are sent to this selected global IPMI system interface.

The ACPI_IPMI will always select the first registered IPMI interface
with an ACPI handle (i.e., defined in the ACPI namespace).  It's hard to
determine the selection when there are multiple IPMI system interfaces
defined in the ACPI namespace. According to the IPMI specification:

  A BMC device may make available multiple system interfaces, but only one
  management controller is allowed to be 'active' BMC that provides BMC
  functionality for the system (in case of a 'partitioned' system, there
  can be only one active BMC per partition).  Only the system interface(s)
  for the active BMC allowed to respond to the 'Get Device Id' command.

According to the ipmi_si desigin:

  The ipmi_si registeration notifications can only happen after a
  successful "Get Device ID" command.

Thus it should be OK for non-partitioned systems to do such selection.
However, we do not have much knowledge on 'partitioned' systems.

References: https://bugzilla.kernel.org/show_bug.cgi?id=46741
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 19:46:12 +02:00
Lv Zheng a1a69b297e ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user
This patch uses reference counting to fix the race caused by the
unprotected ACPI IPMI user.

There are two rules for using the ipmi_si APIs:
 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will
    be passed to ipmi_msg_handler(), but ipmi_request_settime() can not
    use an invalid ipmi_user_t.  This means the ipmi_si users must ensure
    that there won't be any local references on ipmi_user_t before invoking
    ipmi_destroy_user().
 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by
    smi_watchers_mutex, so their execution is serialized.  But as a
    new smi can re-use a freed intf_num, it requires that the callback
    implementation must not use intf_num as an identification mean or it
    must ensure all references to the previous smi are all dropped before
    exiting smi_gone() callback.

As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler()
can happen before setting user_interface to NULL and codes after the check
in acpi_ipmi_space_handler() can happen after user_interface becomes NULL,
the on-going acpi_ipmi_space_handler() still can pass an invalid
acpi_ipmi_device->user_interface to ipmi_request_settime().  Such race
conditions are not allowed by the IPMI layer's API design as a crash will
happen in ipmi_request_settime() if something like that happens.

This patch follows the ipmi_devintf.c design:
 1. Invoke ipmi_destroy_user() after the reference count of
    acpi_ipmi_device drops to 0.  References of acpi_ipmi_device dropping
    to 0 also means tx_msg related to this acpi_ipmi_device are all freed.
    This matches the IPMI layer's API calling rule on ipmi_destroy_user()
    and ipmi_request_settime().
 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be
    running in acpi_ipmi_space_handler().  And it is invoked after invoking
    __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a
    "dead" flag set, and the "dead" flag check is also introduced to the
    point where a tx_msg is going to be added to the tx_msg_list so that no
    new tx_msg can be created after returning from the __ipmi_dev_kill().
 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not
    required since this patch ensures no acpi_ipmi reference is still held
    for ipmi_user_t before calling ipmi_destroy_user() and
    ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen
    after returning from ipmi_destroy_user().
 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch.

The forthcoming IPMI operation region handler installation changes also
requires acpi_ipmi_device be handled in this style.

The header comment of the file is also updated due to this design change.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 19:46:12 +02:00
Lv Zheng 8584ec6ae9 ACPI / IPMI: Fix race caused by the timed out ACPI IPMI transfers
This patch fixes races caused by timed out ACPI IPMI transfers.

This patch uses timeout mechanism provided by ipmi_si to avoid the race
that the msg_done flag is set but without any protection, its content can
be invalid.  Thanks for the suggestion of Corey Minyard.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 19:46:11 +02:00
Lv Zheng 5ac557ef49 ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI transfers
This patch fixes races caused by unprotected ACPI IPMI transfers.

We can see that the following crashes may occur:
 1. There is no tx_msg_lock held for iterating tx_msg_list in
    ipmi_flush_tx_msg() while it may be unlinked on failure in
    parallel in acpi_ipmi_space_handler() under tx_msg_lock.
 2. There is no lock held for freeing tx_msg in acpi_ipmi_space_handler()
    while it may be accessed in parallel in ipmi_flush_tx_msg() and
    ipmi_msg_handler().

This patch enhances tx_msg_lock to protect all tx_msg accesses to solve
this issue.  Then tx_msg_lock is always held around complete() and tx_msg
accesses.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 19:46:11 +02:00
Lv Zheng 6b68f03f95 ACPI / IPMI: Fix potential response buffer overflow
This patch enhances sanity checks on message size to avoid potential buffer
overflow.

The kernel IPMI message size is IPMI_MAX_MSG_LENGTH(272 bytes) while the
ACPI specification defined IPMI message size is 64 bytes.  The difference
is not handled by the original codes.  This may cause crash in the response
handling codes.

This patch closes this gap and also combines rx_data/tx_data to use single
data/len pair since they need not be seperate.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 19:46:11 +02:00
Lv Zheng 06a8566bcf ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler()
This patch fixes the issues indicated by the test results that
ipmi_msg_handler() is invoked in atomic context.

BUG: scheduling while atomic: kipmi0/18933/0x10000100
Modules linked in: ipmi_si acpi_ipmi ...
CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G       AW    3.10.0-rc7+ #2
Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010
 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8
 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4
 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8
Call Trace:
 <IRQ>  [<ffffffff814c4a1e>] dump_stack+0x19/0x1b
 [<ffffffff814bfbab>] __schedule_bug+0x46/0x54
 [<ffffffff814c73e0>] __schedule+0x83/0x59c
 [<ffffffff81058853>] __cond_resched+0x22/0x2d
 [<ffffffff814c794b>] _cond_resched+0x14/0x1d
 [<ffffffff814c6d82>] mutex_lock+0x11/0x32
 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58
 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si]
 [<ffffffff812bf6e4>] deliver_response+0x55/0x5a
 [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65
 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19
 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc
 [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si]
 ...

Also Tony Camuso says:

 We were getting occasional "Scheduling while atomic" call traces
 during boot on some systems. Problem was first seen on a Cisco C210
 but we were able to reproduce it on a Cisco c220m3. Setting
 CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around
 tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device.

 =================================
 [ INFO: inconsistent lock state ]
 2.6.32-415.el6.x86_64-debug-splck #1
 ---------------------------------
 inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
 ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes:
  (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126
 {SOFTIRQ-ON-W} state was registered at:
   [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570
   [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120
   [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400
   [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60
   [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234
   [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be

The fix implemented by this change has been tested by Tony:

 Tested the patch in a boot loop with lockdep debug enabled and never
 saw the problem in over 400 reboots.

Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-25 03:12:05 +02:00
Zhao Yakui e92b297cc7 IPMI/ACPI: Add the IPMI opregion driver to enable ACPI to access BMC controller
ACPI 4.0 spec adds the ACPI IPMI opregion, which means that the ACPI AML
code can also communicate with the BMC controller. This is to install
the ACPI IPMI opregion and enable the ACPI to access the BMC controller
through the IPMI message.

     It will create IPMI user interface for every IPMI device detected
in ACPI namespace and install the corresponding IPMI opregion space handler.
Then it can enable ACPI to access the BMC controller through the IPMI
message.

The following describes how to process the IPMI request in IPMI space handler:
    1. format the IPMI message based on the request in AML code.
    IPMI system address. Now the address type is SYSTEM_INTERFACE_ADDR_TYPE
    IPMI net function & command
    IPMI message payload
    2. send the IPMI message by using the function of ipmi_request_settime
    3. wait for the completion of IPMI message. It can be done in different
routes: One is in handled in IPMI user recv callback function. Another is
handled in timeout function.
    4. format the IPMI response and return it to ACPI AML code.

At the same time it also addes the module dependency. The ACPI IPMI opregion
will depend on the IPMI subsystem.

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-12-14 00:22:14 -05:00