Commit Graph

1410 Commits

Author SHA1 Message Date
Liu, Chen3 756f597841 [X86] Support Intel avxvnni
This patch mainly made the following changes:

1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.

Differential Revision: https://reviews.llvm.org/D89105
2020-10-31 12:39:51 +08:00
Tianqing Wang be39a6fe6f [X86] Add User Interrupts(UINTR) instructions
For more details about these instructions, please refer to the latest
ISE document:
https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D89301
2020-10-22 17:33:07 +08:00
Wang, Pengfei 412cdcf2ed [X86] Add HRESET instruction.
For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D89102
2020-10-13 08:47:26 +08:00
Craig Topper 375849518d [X86] Add a X86ISD::BEXTRI to distinquish the case where the control must be a constant.
The bextri intrinsic has a ImmArg attribute which will be converted
in SelectionDAG using TargetConstant. We previously converted this
to a plain Constant to allow X86ISD::BEXTR to call SimplifyDemandedBits
on it.

But while trying to decide if D89178 was safe, I realized that
this conversion of TargetConstant to Constant would be one case
where that would break.

So this patch adds a new opcode specifically for the immediate case.
And then teaches computeKnownBits and SimplifyDemandedBits to also
handle it, but not try to SimplifyDemandedBits on it. To make up
for that, I immediately masked the constant to 16 bits when
converting from the intrinsic node to the X86ISD node.
2020-10-10 19:18:06 -07:00
Craig Topper 68e1a8d207 [X86] Defer the creation of LCMPXCHG16B_SAVE_RBX until finalize-isel
We need to use LCMPXCHG16B_SAVE_RBX if RBX/EBX is being used as
the frame pointer. We previously checked for this during type
legalization, but that's too early to know for sure if the base
pointer is needed.

This patch adds a new pseudo instruction to emit from isel that
uses a virtual register for the RBX input. Then we use the custom
inserter hook to emit LCMPXCHG16B if RBX isn't needed as a base
pointer or LCMPXCHG16B_SAVE_RBX if it is.

Fixes PR42064.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D88808
2020-10-07 17:00:43 -07:00
Craig Topper 4da4e7cb20 [X86] Remove X86ISD::LCMPXCHG8_SAVE_EBX_DAG and LCMPXCHG8B_SAVE_EBX pseudo instruction
This and its friend X86ISD::LCMPXCHG8_SAVE_RBX_DAG are used if we need to avoid clobbering the frame pointer in EBX/RBX. EBX/RBX are only used a frame pointer in 64-bit mode. In 64-bit mode we don't use CMPXCHG8B since we have a GR64 cmpxchg available. So we don't need special handling for LCMPXCHG8B.

Split from D88808

Differential Revision: https://reviews.llvm.org/D88853
2020-10-05 15:03:07 -07:00
Craig Topper a7e45ea30d [X86] Add memory operand to AESENC/AESDEC Key Locker instructions.
This removes FIXMEs from selectAddr.
2020-10-03 21:42:16 -07:00
Craig Topper 7f3da48885 [X86] Remove X86ISD::MWAITX_DAG. Just match the intrinsic to the custom inserter pseudo instruction during isel. 2020-10-03 18:44:53 -07:00
Craig Topper adccc0bfa3 [X86] Add X86ISD opcodes for the Key Locker AESENC*KL and AESDEC*KL instructions
Instead of emitting MachineSDNodes during lowering, emit X86ISD
opcodes. These opcodes will either be selected by tablegen
patterns or custom selection code.

Emitting MachineSDNodes during lowering is uncommon so this makes
things more consistent. It also allows selectAddr to be called to
perform address matching during instruction selection.

I had trouble getting tablegen to accept XMM0-XMM7 as results in
an isel pattern for the WIDE instructions so I had to use custom
instruction selection.
2020-10-03 16:55:19 -07:00
Xiang1 Zhang 413577a879 [X86] Support Intel Key Locker
Key Locker provides a mechanism to encrypt and decrypt data with an AES key without having access
to the raw key value by converting AES keys into “handles”. These handles can be used to perform the
same encryption and decryption operations as the original AES keys, but they only work on the current
system and only until they are revoked. If software revokes Key Locker handles (e.g., on a reboot),
then any previous handles can no longer be used.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D88398
2020-09-30 18:08:45 +08:00
Freddy Ye bc7f6c6dd8 [X86] Add TDX instructions.
For more details about these instructions, please refer to the latest TDX document: https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D88006
2020-09-24 09:35:44 +08:00
Pierre Gousseau cda6b09242 [X86] Make sure we do not clobber RBX with mwaitx when used as a base
pointer.

mwaitx uses EBX as one of its argument.
Using this instruction clobbers RBX as it is defined to hold one of the
input. When the backend uses dynamically allocated stack, RBX is used as
a reserved register for the base pointer.

This patch is adapted from @qcolombet patch for cmpxchg at r263325.

This fixes PR43528.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D73475
2020-08-26 11:20:31 +01:00
Hiroshi Yamauchi 28ccc52c40 [X86] Add feature for Fast Short REP MOV (FSRM) for Icelake or newer.
Differential Revision: https://reviews.llvm.org/D85989
2020-08-19 13:39:42 -07:00
Craig Topper 71b49aa438 [X86] Allow lsl/lar to be parsed with a GR16, GR32, or GR64 as source register.
This matches GNU assembler behavior. Operand size is determined
only from the destination register.
2020-07-15 23:51:37 -07:00
Xiang1 Zhang aded4f0cc0 [X86-64] Support Intel AMX instructions
Summary:
INTEL ADVANCED MATRIX EXTENSIONS (AMX).
AMX is a new programming paradigm, it has a set of 2-dimensional registers
(TILES) representing sub-arrays from a larger 2-dimensional memory image and
operate on TILES.

Spec can be found in Chapter 3 here https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewers: LuoYuanke, annita.zhang, pengfei, RKSimon, xiangzhangllvm

Reviewed By: xiangzhangllvm

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82705
2020-07-02 08:57:04 +08:00
Craig Topper 6673d69226 [X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature.
The PREFETCHW instruction was originally part of the 3DNow. But
it was given its own CPUID bit on later CPUs just before 3DNow
was deprecated.

We were setting the -mprfchw flag if -m3dnow was passed or the CPU
supported 3dnow unless -mno-prfchw was passed. But -march=native
on a CPU without the PRFCHW CPUID bit set will pass -mno-prfchw.
So -march=k8 will behave differently than -march=native on a K8
for example.

So remove this implicit setting from the frontend and instead
enable the backend to use PREFETCHW if 3dnow OR prfchw is enabled.

Also enable PRFCHW flag on amdfam10/barcelona which seems to be
where this CPUID bit was introduced. That CPU also supported
3dnow.
2020-06-25 12:46:52 -07:00
Craig Topper 01c18f9199 Revert "[X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature."
This is failing on the bots.

This reverts commit 636d31a5c3.
2020-06-25 11:43:02 -07:00
Craig Topper 636d31a5c3 [X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature.
The PREFETCHW instruction was originally part of the 3DNow. But
it was given its own CPUID bit on later CPUs just before 3DNow
was deprecated.

We were setting the -mprfchw flag if -m3dnow was passed or the CPU
supported 3dnow unless -mno-prfchw was passed. But -march=native
on a CPU without the PRFCHW CPUID bit set will pass -mno-prfchw.
So -march=k8 will behave differently than -march=native on a K8
for example.

So remove this implicit setting from the frontend and instead
enable the backend to use PREFETCHW if 3dnow OR prfchw is enabled.

Also enable PRFCHW flag on amdfam10/barcelona which seems to be
where this CPUID bit was introduced. That CPU also supported
3dnow.
2020-06-25 11:25:35 -07:00
Craig Topper c721bc081e [X86] Correct the implementation of ud1(a.k.a. ud2b) instruction.
We were missing the modrm byte this instruction has according
to current Intel SDM. Experiments with gcc indicate that different
modrm values are chosen based on 2 operands so I've added those
as well.

I think our previous implementation was based on an older behavior of
binutils that has since been changed.
2020-06-19 23:57:48 -07:00
Craig Topper d72cb4ce21 Recommit "[X86] Separate imm from relocImm handling."
Fix the copy/paste mistake that caused it to fail previously
2020-06-15 10:59:43 -07:00
Craig Topper ad1c46c3c0 [X86] Remove printanymem/printopaquemem from the InstPrinters. Just tell tablegen to printMemReference directly. NFC
Most of the wrappers exist to print the memory size in Intel syntax
and then call the printMemReference. But printanymem/printopaquemem
don't print anything extra in Intel syntax so just drop them.
2020-06-15 09:46:06 -07:00
Hans Wennborg f47a776628 Revert "[X86] Separate imm from relocImm handling."
> relocImm was a complexPattern that handled both ConstantSDNode
> and X86Wrapper. But it was only applied selectively because using
> it would cause patterns to be not importable into FastISel or
> GlobalISel. So it only got applied to flag setting instructions,
> stores, RMW arithmetic instructions, and rotates.
>
> Most of the test changes are a result of making patterns available
> to GlobalISel or FastISel. The absolute-cmp.ll change is due to
> this fixing a pattern ordering issue to make an absolute symbol
> match to an 8-bit immediate before trying a 32-bit immediate.
>
> I tried to use PatFrags to reduce the repetition, but I was getting
> errors from TableGen.

This caused "Invalid EmitNode" assertions, see the llvm-commits thread for
discussion.
2020-06-15 16:14:59 +02:00
Craig Topper 8885a7640b [X86] Separate imm from relocImm handling.
relocImm was a complexPattern that handled both ConstantSDNode
and X86Wrapper. But it was only applied selectively because using
it would cause patterns to be not importable into FastISel or
GlobalISel. So it only got applied to flag setting instructions,
stores, RMW arithmetic instructions, and rotates.

Most of the test changes are a result of making patterns available
to GlobalISel or FastISel. The absolute-cmp.ll change is due to
this fixing a pattern ordering issue to make an absolute symbol
match to an 8-bit immediate before trying a 32-bit immediate.

I tried to use PatFrags to reduce the repetition, but I was getting
errors from TableGen.
2020-06-13 11:29:28 -07:00
Craig Topper 269d843720 [X86] Replace TB with PS on instructions that are documented in the SDM with 'NP'
'NP' means that the instruction is not recognized with a 66, F2 or F3
prefix. It will either #UD or decode to a different instruction.

All of the cases are here should fall into the #UD variety since
we should be detecting the collision with other instructions when
we build the disassembler tables.
2020-06-11 12:20:29 -07:00
Guozhi Wei 587af86f1d [X86] Add a flag to guard the wide load
As shown in http://lists.llvm.org/pipermail/llvm-dev/2020-May/141854.html,
widen load can also cause stall. Add a flag to guard the widening code,
so users can disable it and evaluate its performance impact.

Differential Revision: https://reviews.llvm.org/D80943
2020-06-02 16:16:13 -07:00
Craig Topper 84cf8ed8fd [X86] Lower sse_cmp_ss/sse2_cmp_sd intrinsics to X86ISD::FSETCC with vector types.
Isel match that instead of the intrinsic. Similar to what we do
for avx512.

Trying to move more intrinsics to target specific ISD opcodes.
Hoping to add DAG combines to shrink simple loads going into
scalar intrinsics that only read 32 or 64 bits.
2020-05-26 23:48:16 -07:00
Craig Topper d7e2d937bc [X86] Add X86ISD nodes for PDEP and PEXT.
This will allow use to add DAG combines for these instructions.
2020-04-19 16:14:13 -07:00
WangTianQing a3dc949000 [X86] Add TSXLDTRK instructions.
Summary: For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference

Reviewers: craig.topper, RKSimon, LuoYuanke

Reviewed By: craig.topper

Subscribers: mgorny, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77205
2020-04-09 13:17:29 +08:00
Scott Constable 71e8021d82 [X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to "Indirect Thunks"
There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g.,

https://software.intel.com/security-software-guidance/software-guidance/load-value-injection

Therefore it makes sense to refactor X86RetpolineThunks as a more general capability.

Differential Revision: https://reviews.llvm.org/D76810
2020-04-02 21:55:13 -07:00
WangTianQing d08fadd662 [X86] Add SERIALIZE instruction.
Summary: For more details about this instruction, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference

Reviewers: craig.topper, RKSimon, LuoYuanke

Reviewed By: craig.topper

Subscribers: mgorny, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77193
2020-04-02 16:19:23 +08:00
Simon Cook a26bd4ec16 [TableGen] Support combining AssemblerPredicates with ORs
For context, the proposed RISC-V bit manipulation extension has a subset
of instructions which require one of two SubtargetFeatures to be
enabled, 'zbb' or 'zbp', and there is no defined feature which both of
these can imply to use as a constraint either (see comments in D65649).

AssemblerPredicates allow multiple SubtargetFeatures to be declared in
the "AssemblerCondString" field, separated by commas, and this means
that the two features must both be enabled. There is no equivalent to
say that _either_ feature X or feature Y must be enabled, short of
creating a dummy SubtargetFeature for this purpose and having features X
and Y imply the new feature.

To solve the case where X or Y is needed without adding a new feature,
and to better match a typical TableGen style, this replaces the existing
"AssemblerCondString" with a dag "AssemblerCondDag" which represents the
same information. Two operators are defined for use with
AssemblerCondDag, "all_of", which matches the current behaviour, and
"any_of", which adds the new proposed ORing features functionality.

This was originally proposed in the RFC at
http://lists.llvm.org/pipermail/llvm-dev/2020-February/139138.html

Changes to all current backends are mechanical to support the replaced
functionality, and are NFCI.

At this stage, it is illegal to combine features with ands and ors in a
single AssemblerCondDag. I suspect this case is sufficiently rare that
adding more complex changes to support it are unnecessary.

Differential Revision: https://reviews.llvm.org/D74338
2020-03-13 17:13:51 +00:00
Simon Pilgrim b3b4727a3e [X86] Replace (most) X86ISD::SHLD/SHRD usage with ISD::FSHL/FSHR generic opcodes (PR39467)
For i32 and i64 cases, X86ISD::SHLD/SHRD are close enough to ISD::FSHL/FSHR that we can use them directly, we just need to account for the operand commutation for SHRD.

The i16 SHLD/SHRD case is annoying as the shift amount is modulo-32 (vs funnel shift modulo-16), so I've added X86ISD::FSHL/FSHR equivalents, which matches the generic implementation in all other terms.

Something I'm slightly concerned with is that ISD::FSHL/FSHR legality is controlled by the Subtarget.isSHLDSlow() feature flag - we don't normally use non-ISA features for this but it allows the DAG combines to continue to operate after legalization in a lot more cases.

The X86 *bits.ll changes are all affected by the same issue - we now have a "FSHR(-1,-1,amt) -> ROTR(-1,amt) -> (-1)" simplification that reduces the dependencies enough for the branch fall through code to mess up.

Differential Revision: https://reviews.llvm.org/D75748
2020-03-11 11:17:49 +00:00
Craig Topper 8875ee18d7 [X86] Add a new format type for instructions that represent named prefix bytes like data16 and rep. Use it to make a simpler version of isPrefix.
isPrefix was added to support the patches to align branches.
it relies on a switch over instruction names.

This moves those opcodes to a new format so the information is
tablegen and we can just check for a specific value in some bits
in TSFlags instead.

I've left the other function in place for now so that the
existing patches in phabricator will still work. I'll work with
the owner to get them migrated.
2020-02-21 12:34:59 -08:00
Craig Topper 68400a2308 [X86] Add missing isel pattern for BLCFILL producing flags. 2020-02-17 13:20:13 -08:00
Craig Topper d26f11108b [X86] Split X86ISD::CMP into an integer and FP opcode. 2020-02-16 10:10:19 -08:00
serge_sans_paille e67cbac812 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 10:42:45 +01:00
serge-sans-paille 4546211600 Revert "Support -fstack-clash-protection for x86"
This reverts commit 0fd51a4554.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354
2020-02-09 10:06:31 +01:00
serge_sans_paille 0fd51a4554 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 09:35:42 +01:00
serge-sans-paille 658495e6ec Revert "Support -fstack-clash-protection for x86"
This reverts commit e229017732.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308
2020-02-08 14:26:22 +01:00
serge_sans_paille e229017732 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with better option
handling and more portable testing

Differential Revision: https://reviews.llvm.org/D68720
2020-02-08 13:31:52 +01:00
Nico Weber b03c3d8c62 Revert "Support -fstack-clash-protection for x86"
This reverts commit 4a1a0690ad.
Breaks tests on mac and win, see https://reviews.llvm.org/D68720
2020-02-07 14:49:38 -05:00
serge_sans_paille 4a1a0690ad Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with correct option
flags set.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-07 19:54:39 +01:00
serge-sans-paille f6d98429fc Revert "Support -fstack-clash-protection for x86"
This reverts commit 39f50da2a3.

The -fstack-clash-protection is being passed to the linker too, which
is not intended.

Reverting and fixing that in a later commit.
2020-02-07 11:36:53 +01:00
serge_sans_paille 39f50da2a3 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

Differential Revision: https://reviews.llvm.org/D68720
2020-02-07 10:56:15 +01:00
Craig Topper 4175d7e22e [X86] Custom isel floating point X86ISD::CMP on pre-CMOV targets. Eliminate ConvertCmpIfNecessary
If we don't have cmov, X87 compares write to FPSW and we need to
move the bits to EFLAGS to use as JCC/SETCC/CMOV conditions.

Previously this was done by calling ConvertCmpIfNecessary in
multiple places which would emit the extra code for the FNSTSW,
a shift, a truncate, and a SAHF instructions. Isel would then
select trunc+X86ISD::CMP to a FUCOM instruction that produces FPSW.

This patch centralizes all of the handling into a single custom
isel handler. This allows us to remove ConvertCmpIfNecessary and
a couple target specific ISD opcodes.

Differential Revision: https://reviews.llvm.org/D73863
2020-02-06 10:43:06 -08:00
Eric Astor 1c545f6dbc [ms] [X86] Use "P" modifier on all branch-target operands in inline X86 assembly.
Summary:
Extend D71677 to apply to all branch-target operands, rather than special-casing call instructions.

Also add a regression test for llvm.org/PR44272, since this finishes fixing it.

Reviewers: thakis, rnk

Reviewed By: thakis

Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72417
2020-01-09 14:55:03 -05:00
Ganesh Gopalasubramanian 3408940f73 [X86] AMD Znver2 (Rome) Scheduler enablement
The patch gives out the details of the znver2 scheduler model.
There are few improvements with respect to execution units, latencies and
throughput when compared with znver1.
The tests that were present for znver1 for llvm-mca tool were replicated.
The latencies, execution units, timeline and throughput information are updated for znver2.

Reviewers: craig.topper, Simon Pilgrim

Differential Revision: https://reviews.llvm.org/D66088
2020-01-10 00:44:59 +05:30
Craig Topper 3186b18b99 [X86] Reorder X86any* PatFrags to put the strict node first so that chain property will be inferred for the instruction by the tablegen backend.
Also use X86any_vfpround instead of X86vfpround in some instruction
definitions so the strict version can be used to infer the chain
property.

Without these changes we don't propagate strict FP chain through
isel for some instructions.
2020-01-03 00:11:55 -08:00
Wang, Pengfei 21bc8631fe [FPEnv][X86] Constrained FCmp intrinsics enabling on X86
Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparision.

Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor

Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70582
2019-12-11 08:23:09 +08:00
Hiroshi Yamauchi 52e377497d [PGO][PGSO] DAG.shouldOptForSize part.
Summary:
(Split of off D67120)

SelectionDAG::shouldOptForSize changes for profile guided size optimization.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70095
2019-11-21 14:16:00 -08:00