Add optional verbose debug - which when turned off, quiets down
userspace errors. Saves ~8k of code/data for production systems
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
The kernel does not properly clear the EBIU Error Master (EBIU_ERRMST) Register
on BF548, which causes the kernel to panic.
We need to make sure that we clear the EBIU_ERRMST (necessary on BF54x)
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Better error handling of unknown exceptions, allows userspace to do a
EXCPT n instruction for a not installed exception handler, and the
kernel doesn't crash (like it use to before this).
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
as pointed out by Michael McTernan in the forums, when expanding
the trace buffer, it does not print out the decoded instruction.
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
so that we always send the same signal and we handle the NULL ptr condition properly
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Remove the circular buffering mechanism for exceptions. Instead, point RETX
at a safe location from which to fetch three NOPs.
This safe location is now in the fixed code area, and also used for certain
anomaly workarounds, to ensure that user space can find a valid ICPLB when
things are built with CONFIG_MPU.
Also, save I/DCPLB_FAULT_ADDRESS when lowering to level 5, since the hardware
reg is valid only at exception level.
Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
grab locks when not atomic - this fixes the issues
sometimes seen when using magic sysrq.
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Sometimes when we crash, current is not valid, (has been written
over), so the existing code causes a invalid read during exception
context - which is a unrecoverable double fault. This fixes this.
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
d_path() is used on a <dentry,vfsmount> pair. Lets use a struct path to
reflect this.
[akpm@linux-foundation.org: fix build in mm/memory.c]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Bryan Wu <bryan.wu@analog.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't oops_in_progress if single step is comming from the
kernel, which happens if a single step occurs after a exception cause.
This fixes up the remaining issues in the toolchain bug.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Clean up dump_bfin_mem so that it will display
content from the kernel, as well as l1 instruction, when deferred
HW errors happen, print out the last frame info if it makes sense.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
http://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_item_id=3719
When the CPLBs get a miss, we do:
- find a victim in the HW table
- remove the victim
- find the replacement in the software table
- put it into the HW table.
If we can't find a replacement in the software table, we accidently
leave a duplicate in the HW table. This patch ensures that duplicate
is marked as not valid.
What we should do is find the replacement in the software table, before
we find a victim in the HW table - but its too late in the release cycle
to do that much restructuring of this code.
Rather that duplicate code, connect Hardware Errors (irq5) into trap_c,
so user space processes get killed properly.
The rest of irq_panic() can be moved into traps.c (later)
There is still a small corner case that causes problems when a
pheriperal interrupt goes off a single cycle before a user space
hardware error. This causes a kernel panic, rather than the user
space process being killed.
But, this checkin makes things work in 99.9% of the cases, and is a vast
improvement from what is there today (which fails 100% of the time).
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
We need to send signals with the proper PC, or gdb gets
confused, and lots of tests fail. This should fix that.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
We currently do not. Also make it easier to handle cplb violations - in traps.c
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
- move the CONFIG_KGDB into one block, for easier reading
- remove printk from printk_address, and pass around buffers. Also
print out the labels when decoding CPLB errors, so you know exactly
where the error was.
- Do not use fixed addresses, becuase people do not know where they come from.
- Turn the printing level down on the dump, so if you don't want,
only the signal prints out - just like on other archs. If a kernel/interrupt
crashes, it should dump everything all the time
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
This fixes two things:
- stop calling write_lock_irq/write_unlock_irq which can turn modify
irq levels
- don't calling mmput when handing exceptions - since this might_sleep,
which does a rti, and leaves us in kernel space (irq15, rather
than irq5).
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
We must balance calls to get_task_mm with corresponding mmput calls, otherwise
refcounting is screwed up and mms don't get freed when their task exits.
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Also ensure that the traps_c code doesn't cause a double fault, by
sending a signal to a faulting kernel before the memory subsystem
is fully initialized, by printing out the error message before sending
the signal.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Today when a double fault happens (exception during an exception
handling event), we go into an endless loop, with nothing comming out
the UART. With this patch, we actually see that we have commited a
double fault event
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
add an exception request/free api similar to the interrupt request/fre
api so people can utilize the free software based exceptions for their
own purposes
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Add ability to expend the hardware trace buffer via a configurable
software buffer - so you can have lots of history when a crash occurs.
The interesting way we do printk in the traps.c confusese the checking
script
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
we converted to using a system call for userspace spinlocks
rather than a dedicated exception long ago
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Turns on trace earlier, so crashes at kernel start should print out a
trace, making things easier to debug.
Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>