While doing some testing I discovered that if the BIOS on a board does not
properly setup the DMI information it leads to a panic in the IPMI code.
The panic is due to dereferencing a pointer which is not initialized. The
pointer is initialized in port_setup() and/or mem_setup() and used in
init_one_smi() and cleanup_one_si(), however if either port_setup() or
mem_setup() return ENODEV the pointer does not get initialized.
Signed-off-by: Paolo Galtieri <pgaltieri@mvista.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The IPMI specifcation says the generator ID is 0x20, but that is for bits
7-1. Bit 0 is set to specify it is a software event. The correct value is
0x41. Without this fix, panic events written into the System Event Log
appear to come from an "unknown" generator, rather than from the kernel.
Signed-off-by: Jordan Hargrave <Jordan_Hargrave@dell.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On IPMI systems with BT interfaces, we don't start the kernel thread, so
smi_info->thread is NULL. Test for NULL when stopping the thread, because
kthread_stop() doesn't, and an oops ensues otherwise.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Lots of good changes to the driver lately that userspace will care about
the version of the driver. Bump the version from 36.0 to 38.0 to be higher
than 37 that the 2.4 driver came out with a few weeks ago which doesn't
have all the same changes.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Part of a patch was accidentally reverted, this corrects an
inconsistent spinlock use in the IPMI message handler.
Signed-off-by: Hironobu Ishii <hishii@soft.fujitsu.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If a panic came from the IPMI watchdog pretimeout and that was reported via
an NMI, it would also be reported via the standard IPMI flags, which would
get picked up when reporting panic events and cause another panic. This
adds an atomic to avoid calling panic twice.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use rcu_read_lock for the cmd_rcvrs list, since that was what what
intended, anyway. This means that all the users of the cmd_rcvrs_lock are
tasks, so the irq disables are no longer required for that lock and it can
become a semaphore.
Signed-off-by: Corey Minyard <minyard@acm.org>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Convert ipmi driver thread to kthread API, only sleep when interface is
idle.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Cc: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We must poll for responses to commands when interrupts aren't in use. The
default poll interval is based on using a kernel timer, which varies with HZ.
For character-based interfaces like KCS and SMIC though, that can be way too
slow (>15 minutes to flash a new firmware with KCS, >20 seconds to retrieve
the sensor list).
This creates a low-priority kernel thread to poll more often. If the state
machine is idle, so is the kernel thread. But if there's an active command,
it polls quite rapidly. This decrease a firmware flash time from 15 minutes
to 1.5 minutes, and the sensor list time to 4.5 seconds, on a Dell PowerEdge
x8x system.
The timer-based polling remains, to ensure some amount of responsiveness even
under high user process CPU load.
Checking for a stopped timer at rmmod now uses atomics and del_timer_sync() to
ensure safe stoppage.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
BMCs can get into ERROR0 state while flashing new firmware, particularly while
the BMC is erasing the next flash block, which may take a just under 2 seconds
on a Dell PowerEdge 2800 (1.75 seconds typical), during which time the
single-threaded firmware may not be able to process new commands. In
particular, clearing OBF may not take effect immediately.
We want it to delay in ERROR0 after clearing OBF a bit waiting for OBF to
actually be clear before proceeding.
This introduces a new return value from the LLDD's event loop,
SI_SM_CALL_WITH_TICK_DELAY. This means the calling thread/timer should
schedule_timeout() at least 1 tick, rather than busy-wait. This is a longer
delay than SI_SM_CALL_WITH_DELAY, which is typically a 250us busy-wait.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The current BT retry/reset mechanism fails to succeed on a PowerEdge 1650,
when the controller is wedged with B2H_ATN asserted at XACTION_START. If this
occurs, no further commands will ever succeed unless the state of the
controller is first cleared out.
Furthermore, the soft reset would only occur if the first command after insmod
was the one that timed out, not if a later command timed out.
This patch changes the retry/reset mechanism to be as follows:
Before retrying a command, clear the state of the BT controller such that the
flags represent ready for a new transaction. This increases the chance of
success of the restarted transaction.
After 2 retries, issue a soft reset and retry one more time before giving up
and reporting back a failure.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Rocky Craig <rocky.craig@hp.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some commands, on some system BMCs, don't respond at at all. This is seen on
Dell PowerEdge x6xx and x7xx systems with IPMI 1.0 BT controllers when a "Get
SDR" command is issued, with a length field of 0x3A, which happens to be the
length of about SDR entries. If another length is passed, this command
succeeds.
This patch adds general infrastructure for receiving commands before they're
passed down to the low-level drivers, such that they can be completed
immediately, or modified, prior to being sent to ->start_transaction().
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make SMIC driver ignore EVT_AVAIL and SMS_ATN bits in flags register, as
they're used by systems management interrupts, not the host OS.
Make the OEM0 Data Available handler work for pre-IPMI 1.5 systems from Dell
too.
Without these two fixes, PowerEdge 2650 and other similar systems with SMIC
may hang a process (modprobe or anything using /dev/ipmi0).
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make module_param and MODULE_PARAM_DESC agree on poweroff_powercycle name.
There was an extraneous ifdef in the IPMI poweroff code that prevented it from
working if PROC_FS was disabled.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Modify the IPMI watchdog parameters (the ones that make sense) to be exported
from sysfs. This is somewhat complicated because these parameters have
side-effects that must be handled.
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A number of small changes for the various system interface drivers,
consolidated from a number of patches from Matt Domsch.
Clear B2H_ATN and drain the BMC message buffer on command timeout. This
prevents further commands from failing after a timeout.
Add bt_debug and smic_debug module parameters, expose them in sysfs. This
lets you enable and disable debugging messages at runtime.
Unsigned jiffies math in ipmi_si_intf.c causes a too-large value to be passed
to ->event() after jiffies wrap-around. The BT driver had caught this, but
didn't know how to fix it. Now all calls to ->event() use a sane value for
time.
Increase timeout for commands handed to the BT driver from 2 seconds to 5
seconds. This is necessary particularly when the previous command was a
"Clear SEL", as that command completes, yet the BMC isn't really ready to
handle another command yet.
Silence BT debugging messages which were being printed on the console.
Increase SMIC timeout form 1/10s to 2s. This is needed on Dell PowerEdge 2650
and PowerEdge 750 with ERA/O cards to allow commands to complete without
timing out.
Adds kcs_debug module param, to match behavior of BT and SMIC. This also
prevents messages from being sent to the console unless explicitly requested.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch is rather large, but it really can't be done in smaller chunks
easily and I believe it is an important change. This has been out and tested
for a while in the latest IPMI driver release. There are no functional
changes, just changes as necessary to convert the locking over (and a few
minor style updates).
The IPMI driver uses read/write locks to ensure that things exist while they
are in use. This is bad from a number of points of view. This patch removes
the rwlocks and uses refcounts and RCU lists to manage what the locks did.
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The previous patch adding the ability to nest struct class_device
changed the paramaters to the call class_device_create(). This patch
fixes up all in-kernel users of the function.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Put the IPMI poweroff_powercycle parameter into sysfs. This field is
dynamically settable and is valuable to have in sysfs.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I found an inconsistent spin_lock usage in ipmi_smi_msg_received.
Signed-off-by: Hironobu Ishii <hishii@soft.fujitsu.com>
Cc: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The IPMI power control function proc_write_chassctrl was badly written, it
directly used userspace pointers, it assumed that strings were NULL
terminated, and it used the evil sscanf function. This converts over to
using the sysctl interface for this data and changes the semantics to be a
little more logical.
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This removes the unused "all_cmd_rcvr" variable from the IPMI driver.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up various style issues in the IPMI driver. Should be no functional
changes.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch allows Dell servers with IPMI controllers that predate IPMI 1.5
to use the standard poweroff or powercycle commands. These systems
firmware don't set the chassis capability bit in the Get Device ID, but
they do implement the standard poweroff and powercycle commands.
Tested on RHEL3 kernel 2.4.21-20.ELsmp on a PowerEdge 2600. The standard
ipmi_poweroff driver cannot drive these systems. With this patch, they
power off or powercycle as expected.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The "null message handler" in the IPMI driver is used in startup and panic
situations to handle messages. It was only designed to work with messages
from the local management controller, but in some cases it was used to get
messages from remote managmenet controllers, and the system would then
panic. This patch makes the "null message handler" in the IPMI driver more
general so it works with any kind of message.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds MODULE_VERSION, MODULE_DESCRIPTION, and MODULE_AUTHOR tags to the
IPMI driver modules. Also changes the MODULE_VERSION to remove the
prepended 'v' on each value, consistent with the module versioning policy.
This patch also removes all the version information from everything except
the ipmi_msghandler module.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The ipmi driver does not have a way to handle firmware-generated events
which have the OEM[012] Data Available flags set. In such a case, the
SMS_ATN bit may never get cleared by firmware, leaving the driver looping
infinitely but never able to make any progress.
This patch first simplifies storage and use of the data returned from an
IPMI Get Device ID command.
It then creates a new per-OEM handler hook, which should know how to handle
events with the OEM[012] Data Available flags set. It then uses this to
implement a workaround for IPMI 1.5-capable Dell PowerEdge servers which
are susceptable to setting the OEM[012] Data Available flags when the
driver can't handle it.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There are some interactions between IPMI NMI timeouts and the other operations
of the IPMI driver. This make sure those interactions are handled properly.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix some problems with the high-res timer support.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
IPMI allows multiple IPMB channels on a single interface, and each channel
might have a different IPMB address. However, the driver has only one IPMB
address that it uses for everything. This patch adds new IOCTLS and a new
internal interface for setting per-channel IPMB addresses and LUNs. New
systems are coming out with support for multiple IPMB channels, and they are
broken without this patch.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
`gcc -W' likes to complain if the static keyword is not at the beginning of
the declaration. This patch fixes all remaining occurrences of "inline
static" up with "static inline" in the entire kernel tree (140 occurrences in
47 files).
While making this change I came across a few lines with trailing whitespace
that I also fixed up, I have also added or removed a blank line or two here
and there, but there are no functional changes in the patch.
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Attached patch removes #ifdef CONFIG_WATCHDOG_NOWAYOUT mess duplicated in
almost every watchdog driver and replaces it with common define in
linux/watchdog.h.
Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
)
From: Corey Minyard <minyard@acm.org>
This contains the patch for supporting 32-bit compatible ioctls on x86_64
systems. The current x86_64 driver will not work with 32-bit applications.
Signed-off-by: Jordan Hargave <jordan_hargrave@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Don't use semaphores for IPC in the poweroff code, use completions instead.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch to adds "power cycle" functionality to the IPMI power off module
ipmi_poweroff. It also contains changes to support procfs control of the
feature.
The power cycle action is considered an optional chassis control in the IPMI
specification. However, it is definitely useful when the hardware supports
it. A power cycle is usually required in order to reset a firmware in a bad
state. This action is critical to allow remote management of servers.
The implementation adds power cycle as optional to the ipmi_poweroff module.
It can be modified dynamically through the proc entry mentioned above. During
a power down and enabled, the power cycle command is sent to the BMC firmware.
If it fails either due to non-support or some error, it will retry to send
the command as power off.
Signed-off-by: Christopher A. Poblete <Chris_Poblete@dell.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up the timer shutdown handling in the IPMI driver.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It looks like the recent IPMI patches had some -mm-onlyisms.
Signed-off-by: Neil Horman <nhorman@redhat.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add support for sysfs to the IPMI device interface.
Clean-ups based on Dimitry Torokovs comment by Philipp Hahn.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Philipp Hahn <pmhahn@titan.lahn.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
annotated, a bunch of direct dereferencing replaced with readb().
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Correct an issue with the IPMI message layer taking a lock and calling
lower layer driver. If an error occrues at the lower layer the lock can be
taken again causing a deadlock. The lock is released before calling the
lower layer.
Signed-off-by: David Griego <dgriego@mvista.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Enable interrupts for a BT interface. There is a specific register that
needs to be set up to enable interrupts that also must be modified to clear
the irq.
Also, don't reset the BMC on a BT interface. That's probably not a good
idea as the BMC may be performing other important functions and a reset
should only be a last resort. Also, that register is also used to
enable/disable interrupts to the BT; modifying it may screw up the
interrupts.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If there is an unexpected close, still allow the watchdog interface to be
re-opened on the IPMI watchdog.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the ACPI register bit width is zero (an invalid value) assume it is the
default spacing. This avoids some coredumps on invalid data and makes some
systems work that have broken ACPI data.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ignore the bottom bit of the base address from the DMI data. It is
supposed to be set to 1 if it is I/O space. Few systems do this, but this
enables the ones that do set it to work properly.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch changes calls to synchronize_kernel(), deprecated in the earlier
"Deprecate synchronize_kernel, GPL replacement" patch to instead call the new
synchronize_rcu() and synchronize_sched() APIs.
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!