Commit Graph

5 Commits

Author SHA1 Message Date
Michael Kuperstein d36e24a166 [X86] Avoid folding scalar loads into unary sse intrinsics
Not folding these cases tends to avoid partial register updates:
sqrtss (%eax), %xmm0
Has a partial update of %xmm0, while
movss (%eax), %xmm0
sqrtss %xmm0, %xmm0
Has a clobber of the high lanes immediately before the partial update,
avoiding a potential stall.

Given this, we only want to fold when optimizing for size.
This is consistent with the patterns we already have for some of
the fp/int converts, and in X86InstrInfo::foldMemoryOperandImpl()

Differential Revision: http://reviews.llvm.org/D15741

llvm-svn: 256671
2015-12-31 09:45:16 +00:00
James Y Knight 7c905063c5 Make utils/update_llc_test_checks.py note that the assertions are
autogenerated.

Also update existing test cases which appear to be generated by it and
weren't modified (other than addition of the header) by rerunning it.

llvm-svn: 253917
2015-11-23 21:33:58 +00:00
Sanjay Patel 3eb5146b3c add SSE run to check non-AVX codegen
llvm-svn: 235809
2015-04-25 20:41:51 +00:00
David Blaikie a79ac14fa6 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

llvm-svn: 230794
2015-02-27 21:17:42 +00:00
Sanjay Patel f34a29a845 add X86 load folding tests for unary math ops
X86 load folding is fragile; eg, the tests here
don't work without AVX even though they should. This
is because we have a mix of tablegen patterns that have
been added over time, and we have a load folding table
used by the peephole optimizer that has to be kept in 
sync with the ever-changing ISA and tablegen defs.

llvm-svn: 229870
2015-02-19 16:59:11 +00:00