- Still O(N^2), just a faster form, and now its the MCAsmLayout's fault.
On the .s I am tuning against (combine.s from 403.gcc):
--
ddunbar@lordcrumb:MC$ diff stats-before.txt stats-after.txt
5,10c5,10
< 1728 assembler - Number of assembler layout and relaxation steps
< 7707 assembler - Number of emitted assembler fragments
< 120588 assembler - Number of emitted object file bytes
< 2233448 assembler - Number of evaluated fixups
< 1727 assembler - Number of relaxed instructions
< 6723845 mcexpr - Number of MCExpr evaluations
---
> 3 assembler - Number of assembler layout and relaxation steps
> 7707 assembler - Number of emitted assembler fragments
> 120588 assembler - Number of emitted object file bytes
> 14796 assembler - Number of evaluated fixups
> 1727 assembler - Number of relaxed instructions
> 67889 mcexpr - Number of MCExpr evaluations
--
Feel free to LOL at the -before numbers, if you like.
I am a little surprised we make more than 2 relaxation passes. It's pretty
trivial for us to do relaxation out-of-order if that would give a speedup.
llvm-svn: 99543
the custom insertion hook deletes the instruction, then we try to set dead
flags on it. Neither the code that I added nor the code that was there
before was safe.
llvm-svn: 99538
On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register
in a different domain than where it was defined. Some instructions have
equvivalents for different domains, like por/orps/orpd.
The SSEDomainFix pass tries to minimize the number of domain crossings by
changing between equvivalent opcodes where possible.
This is a work in progress, in particular the pass doesn't do anything yet. SSE
instructions are tagged with their execution domain in TableGen using the last
two bits of TSFlags. Note that not all instructions are tagged correctly. Life
just isn't that simple.
The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
issue handled by NEONMoveFixPass. This pass may become target independent to
handle both.
llvm-svn: 99524
gcc, and the common expectation seems to be that they are unused. If and when
someone cares we can add them back with well documented demantics.
llvm-svn: 99522
- When substituting template arguments as part of template argument
deduction, introduce a new local instantiation scope.
- When substituting into a function prototype type, introduce a new
"temporary" local instantiation scope that merges with its outer
scope but also keeps track of any additions it makes, removing
them when we exit that scope.
Fixes PR6700, where we were getting too much mixing of local
instantiation scopes due to template argument deduction that
substituted results into function types.
llvm-svn: 99509
now configures prerequisite projects individually but also ignores them in the
big project switch statement to avoid the incorrect warning.
llvm-svn: 99506
the redeclaration chain. Recommitted from r99477 with a fix: we need to
merge in default template arguments from previous declarations.
llvm-svn: 99496
bytes instead of one byte. This is important because
we're running up to too many opcodes to fit in a byte
and it is aggrevated by FIRST_TARGET_MEMORY_OPCODE
making the numbering sparse. This just bites the
bullet and bloats out the table. In practice, this
increases the size of the x86 isel table from 74.5K
to 76K. I think we'll cope :)
This fixes rdar://7791648
llvm-svn: 99494
If a TableGen class has an initializer expression containing an X.Y subexpression,
AND X depends on template parameters,
AND those template parameters have defaults,
AND some parameters with defaults are beyond position 1,
THEN parts of the initializer expression are evaluated prematurely with the default values when the first explicit template parameter is substituted, before the remaining explicit template parameters have been substituted.
llvm-svn: 99492
happening.
Enhance scheduling to set the DEAD flag on implicit defs
more aggressively. Before, we'd set an implicit def operand
to dead if it were present in the SDNode corresponding to
the machineinstr but had no use. Now we do it in this case
AND if the implicit def does not exist in the SDNode at all.
This exposes a couple of problems: one is the FIXME, which
causes a live intervals crash on CodeGen/X86/sibcall.ll.
The second is that it makes machinecse and licm more
aggressive (which is a good thing) but also exposes a case
where licm hoists a set0 and then it doesn't get resunk.
Talking to codegen folks about both these issues, but I need
this patch in in the meantime.
llvm-svn: 99485
buildbot. The tramp3d test fails.
--- Reverse-merging r99477 into '.':
U test/SemaTemplate/friend-template.cpp
U test/CXX/temp/temp.decls/temp.friend/p1.cpp
U lib/Sema/SemaTemplateInstantiateDecl.cpp
U lib/Sema/SemaAccess.cpp
llvm-svn: 99481
(1) Do not assume the data arguments start after the format string
(2) Do not use the fact that a function is variadic to treat it like a va_list printf function
Fixes PR 6697.
llvm-svn: 99480