Commit Graph

355699 Commits

Author SHA1 Message Date
Alexey Bataev a888fc6b34 [OPENMP50]Initial support for use_device_addr clause.
Summary:
Added parsing/sema analysis/serialization support for use_device_addr
clauses.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, arphaman, sstefan1, llvm-commits, cfe-commits, caomhin

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80404
2020-05-27 11:35:31 -04:00
Alex Richardson 3be5e53f20 [FileCheck] Allow parenthesized expressions
With this change it is be possible to write FileCheck expressions such
as [[#(VAR+1)-2]]. Currently, the only supported arithmetic operators are
plus and minus, so this is not particularly useful yet. However, it our
CHERI fork we have tests that benefit from having multiplication in
FileCheck expressions. Allowing parenthesized expressions is the simplest
way for us to work around the current lack of operator precedence in
FileCheck expressions.

Reviewed By: thopre, jhenderson
Differential Revision: https://reviews.llvm.org/D77383
2020-05-27 16:31:39 +01:00
Eduardo Caldas 461af57de7 Add support for UnaryOperator in SyntaxTree
Reviewers: gribozavr2

Reviewed By: gribozavr2

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80624
2020-05-27 17:12:46 +02:00
Simon Pilgrim b5b0087722 SpecialCaseList.h - reduce unnecessary includes to forward declarations. NFC.
Remove Regex forward declaration as we already require the Regex.h include.

Add missing VirtualFileSystem.h include to dependent source files.
2020-05-27 15:51:03 +01:00
Lei Huang 559845f8fe Revert "[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm"
This reverts commit 7eb666b155.
2020-05-27 09:40:21 -05:00
Ties Stuij 78bd0c0e5e [AArch64][BFloat] add BFloat instruction support for AArch64
Summary:
Add support for lowering various BFloat related SelDAG nodes:
- load/store (ldrh/strh)
- concat
- dup/duplane
- bitconvert/bitcast
- insert_subvector/insert_subreg

This patch is part of a series implementing the Bfloat16 extension of the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

Reviewers: ab, t.p.northover, john.brawn, fpetrogalli, sdesmalen, LukeGeeson

Reviewed By: fpetrogalli

Subscribers: LukeGeeson, pbarrio, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79712
2020-05-27 15:36:54 +01:00
Dmitry Vyukov 4408eeed0f tsan: fix false positives in AcquireGlobal
Add ThreadClock:: global_acquire_ which is the last time another thread
has done a global acquire of this thread's clock.

It helps to avoid problem described in:
https://github.com/golang/go/issues/39186
See test/tsan/java_finalizer2.cpp for a regression test.
Note the failuire is _extremely_ hard to hit, so if you are trying
to reproduce it, you may want to run something like:
$ go get golang.org/x/tools/cmd/stress
$ stress -p=64 ./a.out

The crux of the problem is roughly as follows.
A number of O(1) optimizations in the clocks algorithm assume proper
transitive cumulative propagation of clock values. The AcquireGlobal
operation may produce an inconsistent non-linearazable view of
thread clocks. Namely, it may acquire a later value from a thread
with a higher ID, but fail to acquire an earlier value from a thread
with a lower ID. If a thread that executed AcquireGlobal then releases
to a sync clock, it will spoil the sync clock with the inconsistent
values. If another thread later releases to the sync clock, the optimized
algorithm may break.

The exact sequence of events that leads to the failure.
- thread 1 executes AcquireGlobal
- thread 1 acquires value 1 for thread 2
- thread 2 increments clock to 2
- thread 2 releases to sync object 1
- thread 3 at time 1
- thread 3 acquires from sync object 1
- thread 1 acquires value 1 for thread 3
- thread 1 releases to sync object 2
- sync object 2 clock has 1 for thread 2 and 1 for thread 3
- thread 3 releases to sync object 2
- thread 3 sees value 1 in the clock for itself
  and decides that it has already released to the clock
  and did not acquire anything from other threads after that
  (the last_acquire_ check in release operation)
- thread 3 does not update the value for thread 2 in the clock from 1 to 2
- thread 4 acquires from sync object 2
- thread 4 detects a false race with thread 2
  as it should have been synchronized with thread 2 up to time 2,
  but because of the broken clock it is now synchronized only up to time 1

The global_acquire_ value helps to prevent this scenario.
Namely, thread 3 will not trust any own clock values up to global_acquire_
for the purposes of the last_acquire_ optimization.

Reviewed-in: https://reviews.llvm.org/D80474
Reported-by: nvanbenschoten (Nathan VanBenschoten)
2020-05-27 16:27:47 +02:00
Ties Stuij 42eba9b40b [AArch64][BFloat] basic AArch64 bfloat support
Summary:
This patch adds the bfloat type to the AArch64 backend:
- adds it as part of the FPR16 register class
- adds bfloat calling conventions
- as f16 is now not the only FPR16 type anymore, we need to constrain a number
  of instruction patterns using FPR16Op to help out the TableGen type inferrer

This patch is part of a series implementing the Bfloat16 extension of the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

Reviewers: t.p.northover, c-rhodes, fpetrogalli, sdesmalen, ostannard, LukeGeeson, ab

Reviewed By: fpetrogalli

Subscribers: pbarrio, LukeGeeson, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79709
2020-05-27 15:26:40 +01:00
Alex Zinenko cadb7ccf2c [mlir] SCF: provide function_ref builders for IfOp
Now that OpBuilder is available in `build` functions, it becomes possible to
populate the "then" and "else" regions directly when building the "if"
operation. This is desirable in more structured forms of builders, especially
in when conditionals are mixed with loops. Provide new `build` APIs taking
callbacks for body constructors, similarly to scf::ForOp, and replace more
clunky edsc::BlockBuilder uses with these. The original APIs remain available
and go through the new implementation.

Differential Revision: https://reviews.llvm.org/D80527
2020-05-27 16:12:58 +02:00
Jinsong Ji 5ee902bb5f [compiler-rt][asan] Add noinline to use-after-scope testcases
Some testcases are unexpectedly passing with NPM.
This is because the target functions are inlined in NPM.

I think we should add noinline attribute to keep these test points.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D79648
2020-05-27 14:05:02 +00:00
Georgii Rymar 4ab03e62fd [llvm-readobj] - Do not crash when an invalid .eh_frame_hdr is dumped using --unwind.
When the p_offset/p_filesz of the PT_GNU_EH_FRAME is invalid
(e.g larger than the file size) then llvm-readobj might crash.

This patch fixes the issue. I've introduced `ELFFile<ELFT>::getSegmentContent`
method, which is very similar to `ELFFile<ELFT>::getSectionContentsAsArray` one.

Differential revision: https://reviews.llvm.org/D80380
2020-05-27 16:41:09 +03:00
Ties Stuij ad5d319ee8 [IR][BFloat] add BFloat IR intrinsics support
Summary:
This patch is part of a series that adds support for the Bfloat16 extension of
the Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

Reviewers: scanon, fpetrogalli, sdesmalen, craig.topper, LukeGeeson

Reviewed By: fpetrogalli

Subscribers: LukeGeeson, pbarrio, kristof.beyls, hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79707
2020-05-27 14:37:47 +01:00
David Green 70d4a20299 [UnJ] Update LI for inner nested loops
This makes sure to correctly register the loop info of the children
of unroll and jammed loops. It re-uses some code from the unroller for
registering subloops.

Differential Revision: https://reviews.llvm.org/D80619
2020-05-27 14:36:38 +01:00
Matt Arsenault 833996cef1 AMDGPU: Fix backwards s_cselect_* operands
The vector equivalent has backwards operands, but the scalar version
does not. The passes that use these hooks aren't enabled by default,
so this doesn't really change anything.
2020-05-27 09:26:09 -04:00
Sanjay Patel 2ee4ec6b6f [IR] add set function for FMF 'contract'
This was missed when the flag was added with D31164.
2020-05-27 09:14:51 -04:00
Simon Pilgrim 0865d41492 ObjectFile.h - reduce unnecessary includes to forward declarations. NFC.
Fix SubtargetFeature.h include dependency in XCOFFObjectFile.cpp
2020-05-27 14:02:14 +01:00
Simon Pilgrim ae07fabf6a ObjCARCInstKind.h - remove unused includes. NFC. 2020-05-27 14:02:14 +01:00
Ties Stuij 0508fb45df [CodeGen][BFloat] Add bfloat MVT type
Summary:
This patch adds BFloat MVT support. It also adds fixed and scalable vector MVT
types for BFloat.

This patch is part of a series that adds support for the Bfloat16 extension of the Armv8.6-a architecture, as
detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

Reviewers: aemerson, huntergr, craig.topper, fpetrogalli, sdesmalen, LukeGeeson, ostannard

Reviewed By: ostannard

Subscribers: LukeGeeson, pbarrio, dschuff, kristof.beyls, hiraditya, aheejin, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79706
2020-05-27 13:38:12 +01:00
Stephen Kelly 63f927b17a Update release notes with porting guide for AST Matchers 2020-05-27 13:21:06 +01:00
Guillaume Chatelet 5b84ee4f61 [Alignment] Fix misaligned interleaved loads
Summary: Tentatively fixing https://bugs.llvm.org/show_bug.cgi?id=45957

Reviewers: craig.topper, nlopes

Subscribers: hiraditya, llvm-commits, RKSimon, jdoerfert, efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80276
2020-05-27 12:12:22 +00:00
Gongyu Deng 763bc23057 [lldb] Tab completion for process plugin name
Summary:

1. Added tab completion to `process launch -p`, `process attach -P`, `process
connect -p`;

2. Bound the plugin name common completion as the default completion for
`eArgTypePlugin` arguments.

Reviewers: teemperor, JDevlieghere

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D79929
2020-05-27 14:11:16 +02:00
Victor Campos c7593b0f0d [ARM] Fix rewrite of frame index in Thumb2's address mode i8s4
Summary:
In Thumb2's frame index rewriting process, the address mode i8s4, which
is used by LDRD and STRD instructions, is handled by taking the
immediate offset operand and multiplying it by 4.

This behaviour is wrong, however. In this specific address mode, the
MachineInstr's immediate operand is already in the expected form. By
consequence of that, multiplying it once more by 4 yields a flawed
offset value, four times greater than it should be.

Differential Revision: https://reviews.llvm.org/D80557
2020-05-27 13:09:13 +01:00
Raphael Isemann 18bb1f1067 [lldb] Fix a potential bug that may cause assert failure in CommandObject::CheckRequirements
Summary: `CommandObject::CheckRequirements` requires cleaning up `m_exe_ctx`
between commands. Function `HandleOptionCompletion` returns without cleaning up
`m_exe_ctx` could cause assert failure in later `CheckRequirements`.

Reviewers: teemperor, JDevlieghere

Reviewed By: teemperor

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D80447
2020-05-27 14:05:17 +02:00
Guillaume Chatelet 6e1eff7858 [NFC] Updating tests
Summary:
Updating IR now that alignment is explicitly set.
This is a prerequisite to D80276.

Reviewers: efriedma

Subscribers: llvm-commits, craig.topper

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80549
2020-05-27 12:02:46 +00:00
Florian Hahn 9b507b2127 [LAA] We only need pointer checks if there are non-zero checks (NFC).
If it turns out that we can do runtime checks, but there are no
runtime-checks to generate, set RtCheck.Need to false.

This can happen if we can prove statically that the pointers passed in
to canCheckPtrAtRT do not alias. This should not change any results, but
allows us to skip some work and assert that runtime checks are
generated, if LAA indicates that runtime checks are required.

Reviewers: anemet, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D79969

Note: This is a recommit of 259abfc7cb,
with some suggested renaming.
2020-05-27 12:47:36 +01:00
Florian Hahn 2d0389821e Revert "[LAA] We only need pointer checks if there are non-zero checks (NFC)."
This reverts commit 259abfc7cb.

Reverting this, as I missed a case where we return without setting
RtCheck.Need.
2020-05-27 12:39:45 +01:00
Florian Hahn 259abfc7cb [LAA] We only need pointer checks if there are non-zero checks (NFC).
If it turns out that we can do runtime checks, but there are no
runtime-checks to generate, set RtCheck.Need to false.

This can happen if we can prove statically that the pointers passed in
to canCheckPtrAtRT do not alias. This should not change any results, but
allows us to skip some work and assert that runtime checks are
generated, if LAA indicates that runtime checks are required.

Reviewers: anemet, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D79969
2020-05-27 12:37:20 +01:00
Daniil Suchkov 706b22e3e4 [SimpleLoopUnswitch] Drop uses of instructions before block deletion
Currently if instructions defined in a block are used in unreachable
blocks and SimpleLoopUnswitch attempts deleting the block, it triggers
assertion "Uses remain when a value is destroyed!".
This patch fixes it by replacing all uses of instructions from BB with
undefs before BB deletion.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D80551
2020-05-27 18:25:18 +07:00
Georgii Rymar d804b334ed [llvm-readelf] - Split GNUStyle<ELFT>::printHashHistogram. NFC.
As was mentioned in review comments for D80204,
`printHashHistogram` has 2 lambdas that are probably too large
and deserves splitting into member functions.

This patch does it.

Differential revision: https://reviews.llvm.org/D80546
2020-05-27 13:59:20 +03:00
Simon Pilgrim 1e9462a201 ArchiveWriter.h - remove unnecessary includes. NFC. 2020-05-27 11:51:26 +01:00
Simon Pilgrim 8062602810 DOTGraphTraitsPass.h - remove unnecessary includes. NFC. 2020-05-27 11:51:25 +01:00
Georgii Rymar fc98447af6 [llvm-readobj] - Do not skip building of the GNU hash table histogram.
When the `--elf-hash-histogram` is used, the code first tries to build
a histogram for the .hash table and then for the .gnu.hash table.

The problem is that dumper might return early when unable or do not need to
build a histogram for the .hash.

This patch reorders the code slightly to fix the issue and adds a test case.

Differential revision: https://reviews.llvm.org/D80204
2020-05-27 13:46:41 +03:00
Raphael Isemann 019bd6485c [lldb] Don't complete ObjCInterfaceDecls in ClangExternalASTSourceCallbacks::FindExternalVisibleDeclsByName
Summary:
For ObjCInterfaceDecls, LLDB iterates over the `methods` of the interface in FindExternalVisibleDeclsByName
since commit ef423a3ba5 .
However, when LLDB calls `oid->methods()` in that function, Clang will pull in all declarations in the current
DeclContext from the current ExternalASTSource (which is again, `ClangExternalASTSourceCallbacks`). The
reason for that is that `methods()` is just a wrapper for `decls()` which is supposed to provide a list of *all*
(both currently loaded and external) decls in the DeclContext.

However, `ClangExternalASTSourceCallbacks::FindExternalLexicalDecls` doesn't implement support for ObjCInterfaceDecl,
so we don't actually add any declarations and just mark the ObjCInterfaceDecl as having no ExternalLexicalStorage.

As LLDB uses the ExternalLexicalStorage to see if it can complete a type with the ExternalASTSource, this causes
that LLDB thinks our class can't be completed any further by the ExternalASTSource
and will from on no longer make any CompleteType/FindExternalLexicalDecls calls to that decl. This essentially
renders those types unusable in the expression parser as they will always be considered incomplete.

This patch just changes the call to `methods` (which is just a `decls()` wrapper), to some ad-hoc `noload_methods`
call which is wrapping `noload_decls()`. `noload_decls()` won't trigger any calls to the ExternalASTSource, so
this prevents that ExternalLexicalStorage will be set to false.

The test for this is just adding a method to an ObjC interface. Before this patch, this unset the ExternalLexicalStorage
flag and put the interface into the state described above.

In a normal user session this situation was triggered by setting a breakpoint in a method of some ObjC class. This
caused LLDB to create the MethodDecl for that specific method and put it into the the ObjCInterfaceDecl.
Also `ObjCLanguageRuntime::LookupInCompleteClassCache` needs to be unable to resolve the type do
an actual definition when the breakpoint is set (I'm not sure how exactly this can happen, but we just
found no Type instance that had the `TypePayloadClang::IsCompleteObjCClass` flag set in its payload in
the situation where this happens. This however doesn't seem to be a regression as logic wasn't changed
from what I can see).

The module-ownership.mm test had to be changed as the only reason why the ObjC interface in that test had
it's ExternalLexicalStorage flag set to false was because of this unintended side effect. What actually happens
in the test is that ExternalLexicalStorage is first set to false in `DWARFASTParserClang::CompleteTypeFromDWARF`
when we try to complete the `SomeClass` interface, but is then the flag is set back to true once we add
the last ivar of `SomeClass` (see `SetMemberOwningModule` in `TypeSystemClang.cpp` which is called
when we add the ivar). I'll fix the code for that in a follow-up patch.

I think some of the code here needs some rethinking. LLDB and Clang shouldn't infer anything about the ExternalASTSource
and its ability to complete the current type form the `ExternalLexicalStorage` flag. We probably should
also actually provide any declarations when we get asked for the lexical decls of an ObjCInterfaceDecl. But both of those
changes are bigger (and most likely would cause us to eagerly complete more types), so those will be follow up patches
and this patch just brings us back to the state before commit ef423a3ba5 .

Fixes rdar://63584164

Reviewers: aprantl, friss, shafik

Reviewed By: aprantl, shafik

Subscribers: arphaman, abidh, JDevlieghere

Differential Revision: https://reviews.llvm.org/D80556
2020-05-27 12:39:24 +02:00
Simon Pilgrim 35963f6d85 VPlanValue.h - reduce unnecessary includes to forward declarations. NFC. 2020-05-27 11:26:14 +01:00
Simon Pilgrim 410667f1b7 [X86][SSE] Convert PTEST to MOVMSK for allsign bits vector results
If we are using PTEST to check 'allsign bits' vector elements we can use MOVMSK to extract the signbits directly and perform the comparison on the scalar value.

For vXi16 cases, as we don't have a MOVMSK for this type, we must mask each signbit out of a PMOVMSKB v2Xi8 result, which folds into the TEST comparison.

If this allows us to remove a vector op (via the SimplifyMultipleUseDemandedBits call) this is consistently faster than a PTEST (https://godbolt.org/z/ziJUst).

I'm investigating whether we ever get regressions without the SimplifyMultipleUseDemandedBits call, even if this means we don't remove a vector op, but that has exposed some other poor codegen issues that I'm still investigating and would have to wait for a later patch.

Suggested on PR42035 to avoid unnecessary ashr(x,bw-1)/pcmpgt(0,x) sign splat patterns feeding into ptest.

Differential Revision: https://reviews.llvm.org/D80563
2020-05-27 11:06:16 +01:00
Konstantin Schwarz f2fad3f703 [GlobalISel][InlineAsm] Add missing EarlyClobber flag to inline asm output operands
Summary:
Previously, we only added early-clobber flags to the 'group' immediate flag operand
of an inline asm operand.
However, we also have to add the EarlyClobber flag to the MachineOperand itself.

This fixes PR46028

Reviewers: arsenm, leonardchan

Reviewed By: arsenm, leonardchan

Subscribers: phosek, wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80467
2020-05-27 12:04:18 +02:00
Vitaly Buka f6383643d9 [StackSafety] Bailout on some function calls
Don't miss values used in calls outside regular argument list.
2020-05-27 02:48:42 -07:00
Vitaly Buka 06a07dd608 [StackSafety] Fix formatting in the test 2020-05-27 02:48:41 -07:00
Vitaly Buka b101c6251a [StackSafety] Ignore some use of values
We should ignore value used in MemTransferInst
as other then src/dst argument.
2020-05-27 02:48:41 -07:00
Georgii Rymar 84c6433586 [DebugInfo] - Fix typo in comment. NFC.
I've forgot to address this bit when landed D80476.
2020-05-27 12:21:19 +03:00
Djordje Todorovic 65030821d4 [NFC][Debugify] Format the CheckModuleDebugify output
This fixes the output of the check-debugify option.
Without the patch an example of running the option:

$ opt -check-debugify test.ll -S -o testDebugify.ll
CheckModuleDebugifySkipping module without debugify metadata

After the patch:

$ opt -check-debugify test.ll -S -o testDebugify.ll
CheckModuleDebugify: Skipping module without debugify metadata

Differential Revision: https://reviews.llvm.org/D80553
2020-05-27 10:32:40 +02:00
Craig Topper a1dfd6d828 [X86] Add helper function to reduce some code duplication when shrinking a vector load to a vzext_load.
There's more code for calling CombineTo and replacing the nodes
that I'd like to share, but its complicated by the getNode call
in the middle that needs to be specific to each opcode.

While there are also make sure we recursively delete the load
we're replacing. It eventually gets removed by a RemoveDeadNodes
call at the end of DAG combine, but we should be more eager about
it. We were inconsistently doing this in some places but not all.
2020-05-27 01:32:13 -07:00
Kazushi (Jam) Marukawa dedaf3a2ac [VE] Dynamic stack allocation
Summary:
This patch implements dynamic stack allocation for the VE target. Changes:
* compiler-rt: `__ve_grow_stack` to request stack allocation on the VE.
* VE: base pointer support, dynamic stack allocation.

Differential Revision: https://reviews.llvm.org/D79084
2020-05-27 10:11:06 +02:00
Daniil Suchkov fc44da746f Add test exposing a bug in SimpleLoopUnswitch. 2020-05-27 15:02:28 +07:00
Saiyedul Islam 602d9b0afc [OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 1
Summary:
Allow AMDGCN as a GPU offloading target for OpenMP during compiler
invocation and allow setting CUDAMode for it.

Originally authored by Greg Rodgers (@gregrodgers).

Reviewers: ronlieb, yaxunl, b-sumner, scchan, JonChesterfield, jdoerfert, sameerds, msearles, hliao, arsenm

Reviewed By: sameerds

Subscribers: sstefan1, jvesely, wdng, arsenm, guansong, dexonsmith, cfe-commits, llvm-commits, gregrodgers

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D79754
2020-05-27 07:51:27 +00:00
Mehdi Amini 0b5d81e6bb Automatically configure MLIR when flang is enabled
This is more friendly than the "Unknown CMake command “mlir_tablegen”."
that would be issued instead.

Differential Revision: https://reviews.llvm.org/D80359
2020-05-27 07:31:49 +00:00
serge-sans-paille de02a75e39 [PGO] Fix computation of function Hash
And bump its version number accordingly.

This is a patched recommit of 7c298c104b

Previous hash implementation was incorrectly passing an uint64_t, that got converted
to an uint8_t, to finalize the hash computation. This led to different functions
having the same hash if they only differ by the remaining statements, which is
incorrect.

Added a new test case that trivially tests that a small function change is
reflected in the hash value.

Not that as this patch fixes the hash computation, it would invalidate all hashes
computed before that patch applies, this is why we bumped the version number.

Update profile data hash entries due to hash function update, except for binary
version, in which case we keep the buggy behavior for backward compatibility.

Differential Revision: https://reviews.llvm.org/D79961
2020-05-27 09:15:21 +02:00
Craig Topper 84cf8ed8fd [X86] Lower sse_cmp_ss/sse2_cmp_sd intrinsics to X86ISD::FSETCC with vector types.
Isel match that instead of the intrinsic. Similar to what we do
for avx512.

Trying to move more intrinsics to target specific ISD opcodes.
Hoping to add DAG combines to shrink simple loads going into
scalar intrinsics that only read 32 or 64 bits.
2020-05-26 23:48:16 -07:00
Craig Topper b4978b2444 [X86] Use SIMD_EXC to remove some let statements in tablegen. NFCI 2020-05-26 23:48:16 -07:00
Wang, Pengfei 6565b58584 [X86][llvm-mc] Make the suffix matcher more accurate.
Summary:
Some instruction like VPMULDQ is NOT the variant of VPMULD but a new
one.
So we should make sure the suffix matcher only works for memory variant
that has the same size with the suffix.
Currently we only check for SSE/AVX* instructions, because many legacy
instructions didn't declare the alias instructions of their variants.

Differential Revision: https://reviews.llvm.org/D80608
2020-05-27 14:45:17 +08:00