This fixes a very subtle bug. vr defined by an implicit_def is allowed overlap with any register since it doesn't actually modify anything. However, if it's used as a two-address use, its live range can be extended and it can be spilled. The spiller must take care not to emit a reload for the vn number that's defined by the implicit_def. This is both a correctness and performance issue.
llvm-svn: 69743
type to truncate to should be the number of bits of the value that are
preserved, not the number that are clobbered with sign-extension.
This fixes regressions in ldecod.
llvm-svn: 69704
clang: error: unable to make temporary file: /etc/cc: can't make
unique filename: Permission denied
instead of
clang: error: unable to make temporary file: /etc/cc: can't make
unique filenamePermission denied
for example.
Also, audited the uses of MakeErrMsg to make the prefix strings
consistent (not end with newline/punctuation/space/": ").
llvm-svn: 69626
in the MachineFunction class, renaming it to addLiveIn for consistency with
the same method in MachineBasicBlock. Thanks for Anton for suggesting this.
llvm-svn: 69615
%reg1498<def> = MOV32rm %reg1024, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
%reg1506<def> = MOV32rm %reg1024, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
%reg1486<def> = MOV32rr %reg1506
%reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
%reg1510<def> = MOV32rm %reg1024, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
=>
%reg1498<def> = MOV32rm %reg2036, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
%reg1506<def> = MOV32rm %reg2037, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
%reg1486<def> = MOV32rr %reg1506
%reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
%reg1510<def> = MOV32rm %reg2038, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
From linearscan's point of view, each of reg2036, 2037, and 2038 are separate registers, each is "killed" after a single use. The reloaded register is available and it's often clobbered right away. e.g. In thise case reg1498 is allocated EAX while reg2036 is allocated RAX. This means we end up with multiple reloads from the same stack slot in the same basic block.
Now linearscan recognize there are other reloads from same SS in the same BB. So it'll "downgrade" RAX (and its aliases) after reg2036 is allocated until the next reload (reg2037) is done. This greatly increase the likihood reloads from SS are reused.
This speeds up sha1 from OpenSSL by 5.8%. It is also an across the board win for SPEC2000 and 2006.
llvm-svn: 69585
type as the vector element type: allow them to be of
a wider integer type than the element type all the way
through the system, and not just as far as LegalizeDAG.
This should be safe because it used to be this way
(the old type legalizer would produce such nodes), so
backends should be able to handle it. In fact only
targets which have legal vector types with an illegal
promoted element type will ever see this (eg: <4 x i16>
on ppc). This fixes a regression with the new type
legalizer (vec_splat.ll). Also, treat SCALAR_TO_VECTOR
the same as BUILD_VECTOR. After all, it is just a
special case of BUILD_VECTOR.
llvm-svn: 69467