Many PPC instructions have a so-called 'record form' which stores to a specific
condition register the result of comparing the result of the instruction with
zero (always as a signed comparison). For integer operations on PPC64, this is
always a 64-bit comparison.
This implementation is derived from the implementation in the ARM backend;
there are some differences because PPC condition registers are allocatable
virtual registers (although the record forms always use a specific one), and we
look for a matching subtraction instruction after the compare (but before the
first use) in addition to before it.
llvm-svn: 179802
Semantics of parameters named Index and Idx were inconsistent between
"include/llvm/IR/Attributes.h", "lib/IR/AttributeImpl.h" and
"lib/IR/Attributes.cpp": sometimes these were fixed 1-based indexes of IR
parameters (or AttributeSet::ReturnIndex for IR return values or
AttributeSet::FunctionIndex for IR functions), other times they were the
internal slot for storage in the underlying AttributeSetImpl. I renamed usage of
the former to "Index" and usage of the latter to "Slot" ("Slot" was already
being used consistently for the latter in a subset of cases)
Patch by Stephen Lin!
llvm-svn: 179791
1. Verify::VerifyParameterAttrs in "lib/IR/Verifier.cpp" and
AttrBuilder::removeFunctionOnlyAttrs in "lib/IR/Attributes.cpp" (only called
by Verify::VerifyFunctionAttrs) separately maintained a list of function-only
attribute types. I've consolidated the logic into a new function used for
both cases in "lib/IR/Verifier.cpp", so this logic is in one place (other
than the AsmParser front-end)
2. Various functions in "lib/IR/Verifier.cpp" passed AttributeSet around by
reference needlessly, as it's just a handle to an immutable pimpl body.
Patch by Stephen Lin!
llvm-svn: 179790
In X86FastISel::X86SelectStore(), improperly aligned stores are rejected and
handled by the DAG-based ISel. However, X86FastISel::X86SelectLoad() makes
no such requirement. There doesn't appear to be an x86 architectural
correctness issue with allowing potentially unaligned store instructions.
This patch removes this restriction.
Patch by Jim Stichnot.
llvm-svn: 179774
A min/max operation is represented by a select(cmp(lt/le/gt/ge, X, Y), X, Y)
sequence in LLVM. If we see such a sequence we can treat it just as any other
commutative binary instruction and reduce it.
This appears to help bzip2 by about 1.5% on an imac12,2.
radar://12960601
llvm-svn: 179773
This occurs due to an alloca representing a separate ownership from the
original pointer. Thus consider the following pseudo-IR:
objc_retain(%a)
for (...) {
objc_retain(%a)
%block <- %a
F(%block)
objc_release(%block)
}
objc_release(%a)
From the perspective of the optimizer, the %block is a separate
provenance from the original %a. Thus the optimizer pairs up the inner
retain for %a and the outer release from %a, resulting in segfaults.
This is fixed by noting that the signature of a mismatch of
retain/releases inside the for loop is a Use/CanRelease top down with an
None bottom up (since bottom up the Retain-CanRelease-Use-Release
sequence is completed by the inner objc_retain, but top down due to the
differing provenance from the objc_release said sequence is not
completed). In said case in CheckForCFGHazards, we now clear the state
of %a implying that no pairing will occur.
Additionally a test case is included.
rdar://12969722
llvm-svn: 179747
* We only ever specialize these templates with an instantiation of ELFType,
so we don't need a template template.
* Replace LLVM_ELF_COMMA with just passing the individual parameters to the
macro. This requires a second macro for when we only have ELFT, but that
is still a small win.
llvm-svn: 179726
unable to handle cases such as __asm mov eax, 8*-8.
This patch also attempts to simplify the state machine. Further, the error
reporting has been improved. Test cases included, but more will be added to
the clang side shortly.
rdar://13668445
llvm-svn: 179719
for the sdiv/srem/udiv/urem bitcode instructions. This is done for the i8,
i16, and i32 types, as well as i64 for the x86_64 target.
Patch by Jim Stichnoth
llvm-svn: 179715
PR15000 has a testcase where the time to compile was bordering on 30s. When I
dropped the limit value to 100, it became a much more managable 6s. The compile
time seems to increase in a roughly linear fashion based on increasing the limit
value. (See the runtimes below.)
So, let's lower the limit to 100 so that they can get a more reasonable compile
time.
Limit Value Time
----------- ----
10 0.9744s
20 1.8035s
30 2.3618s
40 2.9814s
50 3.6988s
60 4.5486s
70 4.9314s
80 5.8012s
90 6.4246s
100 7.0852s
110 7.6634s
120 8.3553s
130 9.0552s
140 9.6820s
150 9.8804s
160 10.8901s
170 10.9855s
180 12.0114s
190 12.6816s
200 13.2754s
210 13.9942s
220 13.8097s
230 14.3272s
240 15.7753s
250 15.6673s
260 16.0541s
270 16.7625s
280 17.3823s
290 18.8213s
300 18.6120s
310 20.0333s
320 19.5165s
330 20.2505s
340 20.7068s
350 21.1833s
360 22.9216s
370 22.2152s
380 23.9390s
390 23.4609s
400 24.0426s
410 24.6410s
420 26.5208s
430 27.7155s
440 26.4142s
450 28.5646s
460 27.3494s
470 29.7255s
480 29.4646s
490 30.5001s
llvm-svn: 179713
The reference manual defines only 5 permitted values for the immediate field of the "hint" instruction:
1. nop (imm == 0)
2. yield (imm == 1)
3. wfe (imm == 2)
4. wfi (imm == 3)
5. sev (imm == 4)
Therefore, restrict the permitted values for the "hint" instruction to 0 through 4.
Patch by Mihail Popa <Mihail.Popa@arm.com>
llvm-svn: 179707
GCC complains: Core.cpp:1449:27: warning: overflow in implicit constant conversion [-Woverflow]
I'm not sure if that's really a problem here, but using the enum type is better
style anyways.
llvm-svn: 179696
A couple of recently introduced conditional branch patterns
also need to be marked as isCodeGenOnly since they cannot
be handled by the asm parser.
No change in generated code.
llvm-svn: 179690