The FrameMask register is relevant to the TLB so it should be dumped by
dump_tlb_regs(), however it is only present in certain cores (r10000,
r12000, r14000, r16000). Add dumping of it, conditional upon
current_cpu_type().
Suggested-by: Joshua Kinard <kumba@gentoo.org>
Suggested-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Joshua Kinard <kumba@gentoo.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10724/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The PageGrain register may not exist if certain architectural features
aren't present, therefore only print out its value when dumping the TLB
registers if it is expected to contain fields relevant to the TLB.
Fixes: d1e9a4f547 ("MIPS: Add SysRq operation to dump TLBs on all CPUs")
Reported-by: Joshua Kinard <kumba@gentoo.org>
Reported-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Joshua Kinard <kumba@gentoo.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10723/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The TLB registers are dumped in a couble of places:
- sysrq_tlbdump_single() - when dumping TLB state.
- do_mcheck() - in response to a machine check error.
The main TLB registers also differ between r3k and r4k, but r4k appears
to be assumed.
Refactor this code into a dump_tlb_regs() function, implemented for both
r3k and r4k, and used by both of the above functions.
Fixes: d1e9a4f547 ("MIPS: Add SysRq operation to dump TLBs on all CPUs")
Suggested-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10721/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Move the initialisation of the CP0.Wired register implemented by Toshiba
TX3922 and TX3927 processors from `tx39_cache_init' to `tlb_init' where
it belongs, correcting code structure and making sure initialisation
does not rely on `tx39_cache_init' being called before `tlb_init' to
work correctly.
Make `r3k_have_wired_reg' static as it's no longer externally referred
to; remove a stale declaration too.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10195/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
XPA extends the physical addresses on MIPS32, including the EntryLo
registers. Update dump_tlb() to concatenate the PFNX field from the high
end of the EntryLo registers (as read by mfhc0).
The width of physical and virtual addresses are also separated to show
only 8 nibbles of virtual but 11 nibbles of physical with XPA.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10077/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The RI/XI bits when present are above the PFN field in the EntryLo
registers, at bits 63,62 when read with dmfc0, and bits 31,30 when read
with mfc0. This makes them appear as part of the physical address, since
the other bits are masked with PAGE_MASK, for example:
Index: 253 pgmask=16kb va=77b18000 asid=75
[pa=1000744000 c=5 d=1 v=1 g=0] [pa=100134c000 c=5 d=1 v=1 g=0]
The physical addresses have bit 36 set, which corresponds to bit 30 of
EntryLo1, the XI bit.
Explicitly mask off the RI and XI bits from the printed physical
address, and print the RI and XI bits separately if they exist, giving
output more like this:
Index: 226 pgmask=16kb va=77be0000 asid=79
[ri=0 xi=1 pa=01288000 c=5 d=1 v=1 g=0] [ri=0 xi=0 pa=010e4000 c=5 d=0 v=1 g=0]
Cc: linux-mips@linux-mips.org
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Daney <ddaney@caviumnetworks.com>
Patchwork: https://patchwork.linux-mips.org/patch/10080/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The EHINV bit in EntryHi allows a TLB entry to be properly marked
invalid so that EntryHi doesn't have to be set to a unique value to
avoid machine check exceptions due to multiple matching entries.
Unfortunately dump_tlb() doesn't take this into account so it will print
all the uninteresting invalid TLB entries if the current ASID happens to
be 00. Therefore add a condition to skip entries which are marked
invalid with the EHINV bit.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10076/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The TLB only matches the ASID when the global bit isn't set, so
dump_tlb() shouldn't really be skipping global entries just because the
ASID doesn't match. Fix the condition to read the TLB entry's global bit
from EntryLo0. Note that after a TLB read the global bits in both
EntryLo registers reflect the same global bit in the TLB entry.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10079/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Make use of recently added EntryLo bit definitions in mipsregs.h when
dumping TLB contents.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10075/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Refactor the TLB matching code in dump_tlb() slightly so that the
conditions which can cause a TLB entry to be skipped can be more easily
extended. This should prevent the match condition getting unwieldy once
it is updated to take further conditions into account.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10081/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Use the new tlb read hazard macros from <asm/hazards.h> rather than the
local BARRIER() macro which uses 7 ops regardless of the kernel
configuration.
We use mtc0_tlbr_hazard for the hazard between mtc0 to the index
register and the tlbr, and tlb_read_hazard for the hazard between the
tlbr and the mfc0 of the TLB registers written by tlbr.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10074/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Correct a regression introduced with 8453eebd [MIPS: Fix strnlen_user()
return value in case of overlong strings.] causing assembler warnings
and broken code generated in __strnlen_kernel_nocheck_asm:
arch/mips/lib/strnlen_user.S: Assembler messages:
arch/mips/lib/strnlen_user.S:64: Warning: Macro instruction expanded into multiple instructions in a branch delay slot
with the CPU_DADDI_WORKAROUNDS option set, resulting in the function
looping indefinitely upon mounting NFS root.
Use conditional assembly to avoid a microMIPS code size regression.
Using $at unconditionally would cause such a regression as there are no
16-bit instruction encodings available for ALU operations using this
register. Using $v1 unconditionally would produce short microMIPS
encodings, but would prevent this register from being used across calls
to this function.
The extra LI operation introduced is free, replacing a NOP originally
scheduled into the delay slot of the branch that follows.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10205/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Computing sum introduces true data dependency. This patch removes some
true data depdendencies, hence increases instruction level parallelism.
This patch brings up to 50% csum performance gain on Loongson 3a.
One example about how this patch works is in CSUM_BIGCHUNK1:
// ** original ** vs ** patch applied **
ADDC(sum, t0) ADDC(t0, t1)
ADDC(sum, t1) ADDC(t2, t3)
ADDC(sum, t2) ADDC(sum, t0)
ADDC(sum, t3) ADDC(sum, t2)
In the original implementation, each ADDC(sum, ...) depends on the sum
value updated by previous ADDC(as source operand).
With this patch applied, the first two ADDC operations are independent,
hence can be executed simultaneously if possible.
Another example is in the "copy and sum calculating chunk":
// ** original ** vs ** patch applied **
STORE(t0, UNIT(0) ... STORE(t0, UNIT(0) ...
ADDC(sum, t0) ADDC(t0, t1)
STORE(t1, UNIT(1) ... STORE(t1, UNIT(1) ...
ADDC(sum, t1) ADDC(sum, t0)
STORE(t2, UNIT(2) ... STORE(t2, UNIT(2) ...
ADDC(sum, t2) ADDC(t2, t3)
STORE(t3, UNIT(3) ... STORE(t3, UNIT(3) ...
ADDC(sum, t3) ADDC(sum, t2)
With this patch applied, ADDC and the **next next** ADDC are independent.
Signed-off-by: chenj <chenj@lemote.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9608/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS R6 dropped the unaligned load and store instructions so
we need to re-write this part of the code for R6 to store
one byte at a time.
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
MIPS R6 does not support the unaligned load and store instructions
so we add a special MIPS R6 case to copy one byte at a time if we
need to read/write to unaligned memory addresses.
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
The following instructions have been removed from MIPS R6
ulw, ulh, swl, lwr, lwl, swr.
However, all of them are used in the MIPS specific checksum implementation.
As a result of which, we will use the generic checksum on MIPS R6
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
The toolchain defines exactly one of __MIPSEB__ and
__MIPSEL__. As a result, simplify the ifdefery a little bit.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/8522/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Having #ifdefs just to guard comments is not really helpful
so drop them. Moreover, the code wasn't really reached anyway
since there is a #ifndef CONFIG_CPU_MIPSR2 on the top of the file.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/8513/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit cf62a8b813 ("MIPS: lib: memcpy: Use macro to build the
copy_user code") switched to a macro in order to build the memcpy
symbols in preparation for the EVA support. However, this commit
also removed the NOP instruction after the 'jr ra' when returning
back to the caller. This had no visible side-effects since the next
instruction was a load to the t0 register which was already in the
clobbered list, but it may have undesired effects in the future
if some other code is introduced in between the .Ldone and
the .Ll_exc_copy labels.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: <stable@vger.kernel.org> # v3.15+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/8512/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Virtual page number of R3000 in entryhi is 20 bit from MSB. But in
dump_tlb(), the bit mask to read it from entryhi is 19 bit (0xffffe000).
The patch fixes that to 0xfffff000.
Signed-off-by: Isamu Mogi <isamu@leafytree.jp>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/8290/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We were returning maxlen like the userland strnlen if no '\0' character
was encountered while the kernel version is expected to return a value
larger than maxlen. Fixed to return maxlen + 1.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This small update to the previous fix to __delay removes a conditional
around the ABI-dependent subtraction operation within an inline asm in
favor to the standard <asm/asm.h> LONG_SUBU macro. No change in code
produced.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6703/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Nobody is maintaining SMTC anymore and there also seems to be no userbase.
Which is a pity - the SMTC technology primarily developed by Kevin D.
Kissell <kevink@paralogos.com> is an ingenious demonstration for the MT
ASE's power and elegance.
Based on Markos Chandras <Markos.Chandras@imgtec.com> patch
https://patchwork.linux-mips.org/patch/6719/ which while very similar did
no longer apply cleanly when I tried to merge it plus some additional
post-SMTC cleanup - SMTC was a feature as tricky to remove as it was to
merge once upon a time.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This change reverts most of commit
60724ca59e [MIPS: IP checksums: Remove
unncessary .set pseudos] that introduced warnings with the
CPU_DADDI_WORKAROUNDS option set:
arch/mips/lib/csum_partial.S: Assembler messages:
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:467: Warning: used $3 with ".set at=$3"
[...]
arch/mips/lib/csum_partial.S:577: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:577: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:577: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:601: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:601: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:601: Warning: used $3 with ".set at=$3"
arch/mips/lib/csum_partial.S:601: Warning: used $3 with ".set at=$3"
[and so on, and so on...]
The warnings are benign and good code is produced regardless because no
macros that'd use the assembler's temporary register are involved, however
the `.set noat' directives removed by the commit referred are crucial to
guarantee this is still going to be the case after any changes in the
future. Therefore they need to be brought back to place which this
change does.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6686/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This corrects assembler warnings and broken code generated in
__strncpy_from_user_asm:
arch/mips/lib/strncpy_user.S: Assembler messages:
arch/mips/lib/strncpy_user.S:52: Warning: Macro instruction expanded into
multiple instructions in a branch delay slot
with the CPU_DADDI_WORKAROUNDS option set. The function schedules delay
slots manually where there is really no need to as GAS is happy to do it
all itself, so undo it all and remove `.set noreorder'.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6685/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
With CPU_DADDI_WORKAROUNDS enabled __delay assembles with a macro in a
branch delay slot:
{standard input}: Assembler messages:
{standard input}:18: Warning: Macro instruction expanded into multiple
instructions in a branch delay slot
and broken code results:
0000000000000000 <__delay>:
0: 1480ffff bnez a0,0 <__delay>
4: 24010001 li at,1
8: 0081202f dsubu a0,a0,at
c: 03e00008 jr ra
10: 00000000 nop
14: 00000000 nop
Consequently the function loops indefinitely, showing up prominently as a
hang in the delay loop calibration at bootstrap.
This change corrects the problem by forcing the immediate 1 into a
register while keeping code produced identical where CPU_DADDI_WORKAROUNDS
is disabled.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6669/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In preparation for EVA support, we use a macro to build the
__csum_partial_copy_user main code so it can be shared across
multiple implementations. EVA uses the same code but it replaces
the load/store/prefetch instructions with the EVA specific ones
therefore using a macro avoids unnecessary code duplications.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Each load/store macro always adds an entry to the __ex_table
using the EXC macro. There are cases where a load instruction may
never fail such as when we are sure the load happens in the kernel
address space. Therefore, we merge these the EXC and LOADX/STOREX
macros into a single one. We also expand the argument list in the EXC
macro to make the macro more flexible. The extra 'type' argument is not
used by this commit, but it will be used when EVA support is added to
memcpy.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
The 'copy_user' symbol can be used to copy from or to
userland so we will use two different symbols for these
operations. This makes no difference in the existing code,
but when the core is operating in EVA mode, different instructions
need to be used to read and write to userland address space.
The old function has also been renamed to 'copy_kernel' to denote
that it is suitable for copy data to and from kernel space.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Build the __bzero function using the EVA load/store instructions
when operating in the EVA mode. This function is only used when
accessing user code so there is no need to build two distinct symbols
for user and kernel operations respectively.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Build the __bzero symbol using a macor. In EVA mode we will
need to use similar code to do the userspace load operations so
it is better if we use a macro to avoid code duplications.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Add copy_{to,from,in}_user when the CPU operates in EVA mode.
This is necessary so the EVA specific instructions can be used
to perform the virtual to physical translation for user space
addresses. We will use the non-EVA functions to read from kernel
if needed.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
The code can be shared between EVA and non-EVA configurations,
therefore use a macro to build it to avoid code duplications.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
In preparation for EVA support, the PREF macro is split into two
separate macros, PREFS and PREFD, for source and destination data
prefetching respectively.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Each load/store macro always adds an entry to the __ex_table
using the EXC macro. Therefore, these load/store macros are now merged
with the EXC one. The argument list is also expanded in order to make
the macro more flexible. The extra 'type' argument is not used by this
commit, but it will be used when the EVA support is added to the memcpy.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
In non-EVA mode, strncpy_from_user* aliases are used for the
strncpy_from_kernel* symbols since the code is identical. In EVA
mode, new strcpy_from_user* symbols are used which use the EVA
specific instructions to load values from userspace.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Build the __strncpy_from_user symbol using a macro. In EVA mode we will
need to use similar code to do the userspace load operations so
it is better if we use a macro to avoid code duplications.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
In non-EVA mode, strlen_user* aliases are used for the
strlen_kernel* symbols since the code is identical. In EVA
mode, new strlen_user* symbols are used which use the EVA
specific instructions to load values from userspace.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Build the __strlen_user symbol using a macro. In EVA mode we will
need to use similar code to do the userspace load operations so
it is better if we use a macro to avoid code duplications.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
In non-EVA mode, a strlen_user* alias is used for the
strlen_kernel* symbols since the code is identical. In EVA
mode, a new strlen_user* symbol is used which uses the EVA
specific instructions to load values from userspace.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Build the __strnlen_user symbol using a macro. In EVA mode we will
need to use similar code to do the userspace load operations so
it is better if we use a macro to avoid code duplications.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>. Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: John Crispin <blogic@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/6320/
commit 3747069b25e419f6b51395f48127e9812abc3596 upstream.
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
and are flagged as __cpuinit -- so if we remove the __cpuinit from
the arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
related content into no-ops as early as possible, since that will get
rid of these warnings. In any case, they are temporary and harmless.
Here, we remove all the MIPS __cpuinit from C code and __CPUINIT
from asm files. MIPS is interesting in this respect, because there
are also uasm users hiding behind their own renamed versions of the
__cpuinit macros.
[1] https://lkml.org/lkml/2013/5/20/589
[ralf@linux-mips.org: Folded in Paul's followup fix.]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5494/
Patchwork: https://patchwork.linux-mips.org/patch/5495/
Patchwork: https://patchwork.linux-mips.org/patch/5509/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This reverts commit d532f3d267.
The original commit has several problems:
1) Doesn't work with 64-bit kernels.
2) Calls TLBMISS_HANDLER_SETUP() before the code is generated.
3) Calls TLBMISS_HANDLER_SETUP() twice in per_cpu_trap_init() when
only one call is needed.
[ralf@linux-mips.org: Also revert the bits of the ASID patch which were
hidden in the KVM merge.]
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: "Steven J. Hill" <Steven.Hill@imgtec.com>
Cc: David Daney <david.daney@cavium.com>
Patchwork: https://patchwork.linux-mips.org/patch/5242/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Optimise 'strnlen' to use microMIPS instructions and/or optimisations
for binary size reduction. When the microMIPS ISA is not being used,
the library function compiles to the original binary code.
Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Optimise 'strlen' to use microMIPS instructions and/or optimisations
for binary size reduction. When the microMIPS ISA is not being used,
the library function compiles to the original binary code.
Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Optimise 'strncpy' to use microMIPS instructions and/or optimisations
for binary size reduction. When the microMIPS ISA is not being used,
the library function compiles to the original binary code.
Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Optimise 'memset' to use microMIPS instructions and/or optimisations
for binary size reduction. When the microMIPS ISA is not being used,
the library function compiles to the original binary code.
Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Original patch by Ralf Baechle and removed by Harold Koerfgen
with commit f67e4ffc79905482c3b9b8c8dd65197bac7eb508. This
allows for more generic kernels since the size of the ASID
and corresponding masks can be determined at run-time. This
patch is also required for the new Aptiv cores and has been
tested on Malta and Malta Aptiv platforms.
[ralf@linux-mips.org: Added relevant part of fix
https://patchwork.linux-mips.org/patch/5213/]
Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The operations on the bitmap pointers are protected by "memory"
clobbering raw_local_irq_{save,restore}(), so there is no need for
volatile here. By removing the volatile we get better code generation
out of the compiler.
Signed-off-by: David Daney <david.daney@cavium.com>
Patchwork: http://patchwork.linux-mips.org/patch/4966/
Acked-by: John Crispin <blogic@openwrt.org>
commit 92d11594f6 (MIPS: Remove irqflags.h dependency from bitops.h)
factored some of the bitops code out into a separate file
(arch/mips/lib/bitops.c). Unfortunately the logic converting a bit
mask into a boolean result was lost in some of the functions. We had:
int res;
unsigned long shifted_result_bit;
.
.
.
res = shifted_result_bit;
return res;
Which truncates off the high 32 bits (thus yielding an incorrect
value) on 64-bit systems.
The manifestation of this is that a non-SMP 64-bit kernel will not
boot as the bitmap operations in bootmem.c are all screwed up.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: Jim Quinlan <jim2101024@gmail.com>
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/4965/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The csum_partial implementation contain optimalizations for the MIPS R2
instruction set. This optimization is never enabled however because the
if directive uses the CPU_MIPSR2 constant which is not defined anywhere.
Use the CONFIG_CPU_MIPSR2 constant instead.
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4971/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Having received another series of whitespace patches I decided to do this
once and for all rather than dealing with this kind of patches trickling
in forever.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When building a 32-bit kernel for RBTX4927 with gcc version 4.1.2 20061115
(prerelease) (Ubuntu 4.1.1-21), I get:
arch/mips/lib/delay.c:24:5: warning: "__SIZEOF_LONG__" is not defined
As a consequence, __delay() always uses the 64-bit "dsubu" instruction.
Replace the check for "__SIZEOF_LONG__ == 4" by "BITS_PER_LONG == 32" to
fix this.
Introduced by commit 5210edcd52 [MIPS: Make
__{,n,u}delay declarations match definitions and generic delay.h"]
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Patchwork: https://patchwork.linux-mips.org/patch/4678/
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
A recent patch changed some irq routines from inlines to functions.
These routines are called by the tracer code. Now that they're functions,
if they are compiled for function tracing they will call the tracer
and crash the system due to infinite recursion. The fix disables
tracing in these functions by using "notrace" in the function
definition.
Signed-off-by: Al Cooper <alcooperx@gmail.com>
Reviewed-by: David Daney <david.daney@cavium.com>
Pathchwork: https://patchwork.linux-mips.org/patch/4564/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
For non MIPSr2 processors, such as the BMIPS 5000, calls to
arch_local_irq_disable() and others may be preempted, and in doing
so a stale value may be restored to c0_status. This fix disables
preemption for such processors prior to the call and enables it
after the call.
Those functions that needed this fix have been "outlined" to
mips-atomic.c, as they are no longer good candidates for inlining.
This bug was observed in a BMIPS 5000, occuring once every few hours
in a continuous reboot test. It was traced to the write_lock_irq()
function which was being invoked in release_task() in exit.c.
By placing a number of "nops" inbetween the mfc0/mtc0 pair in
arch_local_irq_disable(), which is called by write_lock_irq(), we
were able to greatly increase the occurance of this bug. Similarly,
the application of this commit silenced the bug.
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: David Daney <ddaney.cavm@gmail.com>
Cc: Kevin Cernekee cernekee@gmail.com
Cc: Jim Quinlan <jim2101024@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/4321/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The "else clause" of most functions in bitops.h invoked
raw_local_irq_{save,restore}() and in doing so had a dependency on
irqflags.h. This fix moves said code to bitops.c, removing the
dependency.
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: David Daney <ddaney.cavm@gmail.com>
Cc: Kevin Cernekee cernekee@gmail.com
Cc: Jim Quinlan <jim2101024@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/4320/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
At some recent point arch/mips/include/asm/delay.h has started being
included into csrc-octeon.c where the __?delay() functions are defined.
This causes a compile failure due to conflicting declarations and
definitions of the functions.
It turns out that the generic definitions in arch/mips/lib/delay.c also
conflict.
Proposed fix: Declare the functions to take unsigned long parameters
just like asm-generic (and x86) does. Update __delay to agree
(__ndelay and __udelay need no change).
Bonus: Get rid of 'inline' from __delay() definition, as it is globally
visible, and the compiler should be making this decision itself (it does
in fact inline the function without being told to).
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4354/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Allows us not to duplicate more lines in arch/mips/lib/Makefile.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/3329/
Signed-off-by: John Crispin <blogic@openwrt.org>
We can save the 451 lines of code that comprise memcpy-inatomic.S at the
expense of a single instruction in the memcpy prolog. We also use an
additional register (t6), so this may cause increased register pressure in
some places as well. But I think the reduced maintenance burden, of not
having two nearly identical implementations, makes it worth it.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
commit eab90291d3
(mips: switch to GENERIC_PCI_IOMAP)
failed to take into account the PCI controller's
io_map_base for mapping IO BARs.
This also caused a new warning on mips.
Fix this, without re-introducing code duplication,
by setting NO_GENERIC_PCI_IOPORT_MAP
and supplying a mips-specific __pci_ioport_map.
Reported-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
mips copied pci_iomap from generic code, probably to avoid
pulling the rest of iomap.c in. Since that's in
a separate file now, we can reuse the common implementation.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add NLM_XLR_BOARD, CPU_XLR and other config options
Makefile updates, mostly based on r4k
Signed-off-by: Jayachandran C <jayachandranc@netlogicmicro.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2334/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
partial_fixup is used in noreorder block.
Separating two consecutive loads can save one cycle on processors with
GPR intrelock and can fix load-use on processors that need a load delay slot.
Also do so for fwd_fixup.
[Ralf: Only R2000/R3000 class processors are lacking the the load-user
interlock and even some of those got it retrofitted. With R2000/R3000
being fairly uncommon these days the impact of this bug should be minor.]
Signed-off-by: Tony Wu <tung7970@gmail.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1768/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Outlining fixes the issue were on certain CPUs such as the R10000 family
the delay loop would need an extra cycle if it overlaps a cacheline
boundary.
The rewrite also fixes build errors with GCC 4.4 which was changed in
way incompatible with the kernel's inline assembly.
Relying on pure C for computation of the delay value removes the need for
explicit. The price we pay is a slight slowdown of the computation - to
be fixed on another day.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Beyond the requirements of the architecture standard Cavium also supports
8k and 32k pages.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: David Daney <ddaney@caviumnetworks.com>
The special IP27 DMA code selected by DMA_IP27 has been removed a while
ago turning DMA_IP27 into almost a nop. Also fixup the broken logic of
its last users memcpy.S and memcpy-inatomic.s.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Take all the OCTEON specific files that were added, and hook them into
the build system for the arch/mips. For versions of GCC that lack
OCTEON support, override gas target architecture.
Signed-off-by: Tomaso Paoletti <tpaoletti@caviumnetworks.com>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We already have sufficient infrastructure to support VR5500 and VR5500A
series processors. Here's a Makefile support to make it selectable by
ports, and enable it for NEC EMMA2RH Markeins board.
This patch also fixes a confused target help, and adds 1Gb PageMask bits
supported by VR5500 and its variants.
Signed-off-by: Shinya Kuribayashi <shinya.kuribayashi@necel.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Use unsigned loads to avoid possible misscalculation of IP checksums. This
bug was instruced in f761106cd728bcf65b7fe161b10221ee00cf7132 (lmo) /
ed99e2bc1d (kernel.org).
[Original fix by Atsushi. Improved instruction scheduling and fix for
unaligned unsigned load by me -- Ralf]
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Almost all implementations of pci_iomap() in the kernel, including the generic
lib/iomap.c one, copies the content of a struct resource into unsigned long's
which will break on 32 bits platforms with 64 bits resources.
This fixes all definitions of pci_iomap() to use resource_size_t. I also
"fixed" the 64bits arch for consistency.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These symbols appear in oprofile output, stacktraces and similar but only
make the output harder to read. Many identical symbol names such as
"both_aligned" were also being used in multiple source files making it
impossible to see which file actually was meant. So let's get rid of them.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
IP28 needs special treatment to avoid speculative accesses. gcc
takes care for .c code, but for assembly code we need to do it
manually.
This is taken from Peter Fuersts IP28 patches.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Since all the callers of the PHYS_TO_XKPHYS macro call with a constant,
put the cast to LL inside the macro where it really should be rather
than in all the callers. This makes macros like PHYS_TO_XKSEG_UNCACHED
work without gcc whining.
Signed-off-by: Andrew Sharp <andy.sharp@onstor.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This complements the generic R4000/R4400 errata workaround code and adds
bits for the daddiu problem. In most places it just modifies handwritten
assembly code so that the assembler is allowed to use a temporary register
as daddiu may now be treated as a macro that expands to a sequence of li
and daddu. It is the AT register or, where AT is unavailable or used
explicitly for another purpose, an explicitly-named register is selected,
using the .set at=<reg> feature added recently to gas. This feature is
only used if CONFIG_CPU_DADDI_WORKAROUNDS has been set, so if the
workaround remains disabled, the required version of binutils stays
unchanged.
Similarly, daddiu instructions put in branch delay slots in noreorder
fragments are now taken out of them and the assembler is allowed to
reorder them itself as possible (which it does making the whole idea of
scheduling them into delay slots manually questionable).
Also in the very few places where such a simple conversion was not
possible, a handcoded longer sequence is implemented.
Other than that there are changes to code responsible for building the
TLB fault and page clear/copy handlers to avoid daddiu as appropriate.
These are only effective if the erratum is verified to be present at the
run time.
Finally there is a trivial update to __delay(), because it uses daddiu in
a branch delay slot.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Certain 32-bit kernel configurations seem to be able to cause references,
this was observed with gcc 4.1.2.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>