Commit Graph

116329 Commits

Author SHA1 Message Date
Tom Stellard ffc6bd6f3d AMDGPU/GlobalISel: Define instruction mapping for G_SELECT
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D49737

llvm-svn: 341271
2018-09-01 02:41:19 +00:00
Zhaoshi Zheng f5297fb24b [Constant Hoisting] Hoisting Constant GEP Expressions
Leverage existing logic in constant hoisting pass to transform constant GEP
expressions sharing the same base global variable. Multi-dimensional GEPs are
rewritten into single-dimensional GEPs.

Differential Revision: https://reviews.llvm.org/D51396

llvm-svn: 341269
2018-09-01 00:04:56 +00:00
Jessica Paquette a69696dca6 Fix typo in size remarks for module passes
ModuleCount = InstrCount was incorrect. It should have been
InstrCount = ModuleCount. This was making it emit an extra, incorrect remark
for Print Module IR.

The test didn't catch this, because it didn't ensure that the only remark
output was from the desired pass. So, it was possible to have an extra remark
come through and not fail. Updated the test so that we ensure that the last
remark that's output comes from the desired pass. This is done by ensuring
that whatever is being read after the last remark is YAML output rather than
some incorrect garbage.

llvm-svn: 341267
2018-08-31 22:43:41 +00:00
Stanislav Mekhanoshin 44451b3344 [AMDGPU] Split v32i32 loads
Differential Revision: https://reviews.llvm.org/D51555

llvm-svn: 341266
2018-08-31 22:43:36 +00:00
Krzysztof Parzyszek 4cef462922 [Hexagon] Don't access non-existent instructions
llvm-svn: 341264
2018-08-31 22:10:04 +00:00
Craig Topper caf6672779 [X86] Add intrinsics for KTEST instructions.
These intrinsics use the same implementation as PTEST intrinsics, but use vXi1 vectors.

New clang builtins will be accompanying them shortly.

llvm-svn: 341259
2018-08-31 21:31:53 +00:00
Jessica Paquette 71e9778006 [NFC] Optionally pass a function to emitInstrCountChangedRemark
In basic block, loop, and function passes, we already have a function that
we can use to emit optimization remarks. We can use that instead of searching
the module for the first suitable function (that is, one that contains at
least one basic block.)

llvm-svn: 341253
2018-08-31 20:54:37 +00:00
Jessica Paquette 397c05dd7d [NFC] Check if P is a pass manager on entry to emitInstrCountChangedRemark
There's no point in finding a function to use for remark output when we're
not going to emit anything.

llvm-svn: 341252
2018-08-31 20:51:54 +00:00
Jessica Paquette 9a23c55920 [NFC] Pass the instruction delta to emitInstrCountChangedRemark
Instead of counting the size of the entire module every time we run a pass,
pass along a delta instead and use that to emit the remark.

This means we only have to use (on average) smaller IR units to calculate
instruction counts. E.g, in a BB pass, we only need to look at the delta of
the BB instead of the delta of the entire module.

6/6

(This improved compile time for size remarks on sqlite3 + O2 significantly)

llvm-svn: 341250
2018-08-31 20:20:57 +00:00
Jessica Paquette 1fc443b887 [NFC] Pre-calculate SCC IR counts in size remarks.
Same vein as the previous commits. Pre-calculate the size of
the module and use that to decide if we're going to emit a
remark.

This one comes with a FIXME and TODO. First off, CallGraphSCC
and CallGraphNode don't have a getInstructionCount function. So,
for now, we do the same thing as in a module pass.

Second off, we're not really saving anything here yet, because
as before, I need to change emitInstrCountChangedRemark to take
in a delta. Keeping the patches small though, so that's coming up
next.

5/6

llvm-svn: 341249
2018-08-31 20:20:56 +00:00
Jessica Paquette 454d1032e9 [NFC] Pre-calculate module IR counts in size remarks.
Same as the previous NFC commits in the same vein.

This one introduces a TODO. I'm going to change emitInstrCountChangedRemark
so that it takes in a delta. Since the delta isn't necessary yet, it's not
there. For now, this means that we're calculating the size of the module
twice.

Just done separately to keep the patches small.

4/6

llvm-svn: 341248
2018-08-31 20:20:55 +00:00
Jessica Paquette 872a4c92b2 [NFC] Pre-calculate loop IR counts in size remarks.
Another commit reducing compile time in size remarks.

Cache the size of the module and loop, and update values based
off of deltas instead. Avoid recalculating the size of the
whole module whenever possible.

3/6

llvm-svn: 341247
2018-08-31 20:20:54 +00:00
Jessica Paquette 9eda13e976 [NFC] Pre-calculate basic block IR counts in size remarks.
Size remarks are slow due to lots of recalculation of the module.

This is similar to the previous commit. Cache the size of the module and
update counts in basic block passes based off a less-expensive delta.

2/6

llvm-svn: 341246
2018-08-31 20:20:53 +00:00
Jessica Paquette f2a202ce7a [NFC] Pre-calculate function IR counts in size remarks.
Size remarks are slow due to lots of recalculation of the module.

Pre-calculate the module size and initial function size for a remark. Use
deltas calculated using the less-expensive function IR count to update the
module counts for Function passes.

1/6

llvm-svn: 341245
2018-08-31 20:19:41 +00:00
Dean Michael Berris f135ac4bbc [XRay] Update RecordInitializer for PIDRecord
Since we changed the storage for the PID in PIDRecord instances, we need
to also update the way we load the data from a DataExtractor through the
RecordInitializer.

llvm-svn: 341243
2018-08-31 20:02:55 +00:00
Dean Michael Berris 4cae04873b [XRay] Use correct type for PID records
Previously we've been reading and writing the wrong types which only
worked in little endian implementations. This time we're writing the
same typed values the runtime is using, and reading them appropriately
as well.

llvm-svn: 341241
2018-08-31 19:32:46 +00:00
Dean Michael Berris 250c56d127 [XRay] Use correct type for thread ID parsing
Previously we were reading only a uint16_t when we really needed to read
an int32_t from the log.

llvm-svn: 341239
2018-08-31 19:11:19 +00:00
Sid Manning b1c9813042 [Hexagon] Add support for getRegisterByName.
Support required to build the Hexagon Linux kernel.

Differential Revision: https://reviews.llvm.org/D51363

llvm-svn: 341238
2018-08-31 19:08:23 +00:00
Dean Michael Berris 3fc4cbfe10 [XRay] Change function record reader to be endian-aware
This change allows us to let the compiler do the right thing for when
handling big-endian and little-endian records for FDR mode function
records.

Previously, we assumed that the encoding was little-endian that reading
the first byte to look for the function id and function record types was
ordered in a little-endian manner. This change allows us to better
handle function records where the first four bytes may actually be
encoded in big-endian thus giving us the wrong bytes where we're seeking
the function information from.

This is a follow-up to D51210 and D51289.

llvm-svn: 341236
2018-08-31 18:36:58 +00:00
Dean Michael Berris c1dceee50b [XRay] Fix FunctionRecord serialization
This change makes the writer implementation more consistent with the way
fields are written down to avoid assumptions on bitfield order and
padding. We also fix an inconsistency between the type returned by the
`delta()` accessor to match the data member it's returning.

This is a follow-up to D51289 and D51210.

llvm-svn: 341230
2018-08-31 17:49:59 +00:00
Alexandre Ganea 6a7efef4af [DebugInfo] Common behavior for error types
Following D50807, and heading towards D50664, this intermediary change does the following:

1. Upgrade all custom Error types in llvm/trunk/lib/DebugInfo/ to use the new StringError behavior (D50807).
2. Implement std::is_error_code_enum and make_error_code() for DebugInfo error enumerations.
3. Rename GenericError -> PDBError (the file will be renamed in a subsequent commit)
4. Update custom error messages to follow the same formatting: (\w\s*)+\.
5. Keep generic "file not found" (ENOENT) errors as they are in PDB code. Previously, there used to be a custom enumeration for that purpose.
6. Remove a few extraneous LF in log() implementations. Printing LF is a responsability at a higher level, not at the error level.

Differential Revision: https://reviews.llvm.org/D51499

llvm-svn: 341228
2018-08-31 17:41:58 +00:00
Craig Topper b7bb9f0078 [X86] Add support for turning vXi1 shuffles into KSHIFTL/KSHIFTR.
This patch recognizes shuffles that shift elements and fill with zeros. I've copied and modified the shift matching code we use for normal vector registers to do this. I'm not sure if there's a good way to share more of this code without making the existing function more complex than it already is.

This will be used to enable kshift intrinsics in clang.

Differential Revision: https://reviews.llvm.org/D51401

llvm-svn: 341227
2018-08-31 17:17:21 +00:00
Dean Michael Berris 5b7548c653 [XRay] Make Trace loading endian-aware
This change makes the XRay Trace loading functions first use a
little-endian data extractor, then on failures try a big-endian data
extractor. Without this change, the trace loading facility will not work
with data written from a big-endian machine.

Follow-up to D51210 and D51289.

llvm-svn: 341226
2018-08-31 17:06:28 +00:00
Dean Michael Berris 98717978c9 [XRay] Make the FDRTraceWriter Endian-aware
Before this patch, the FDRTraceWriter would not take endianness into
account when writing data into the output stream.

This is a follow-up to D51289 and D51210.

llvm-svn: 341223
2018-08-31 16:08:38 +00:00
Andrea Di Biagio a59ec4efa0 [X86][BtVer2] Remove wrong ReadAdvance from AVX vbroadcast(ss|sd|f128) instructions.
The presence of a ReadAdvance for input operand #0 is problematic
because it changes the input latency of the register used as the base address
for the folded load.

A broadcast cannot start executing if the load address hasn't been computed yet.

In the llvm-mca example, the VBROADCASTSS is dependent on the address generated
by the LEAQ.  That means, it cannot start until LEAQ reaches the write-back
stage. If we apply ReadAdvance, then we wrongly assume that the load can start 3
cycles in advance.

Differential Revision: https://reviews.llvm.org/D51534

llvm-svn: 341222
2018-08-31 16:05:48 +00:00
Simon Atanasyan 3785e84cf2 [mips] Fix `mtc1` and `mfc1` definitions for microMIPS R6
The `mtc1` and `mfc1` definitions in the MipsInstrFPU.td have MMRel,
but do not have StdMMR6Rel tags. When these instructions are emitted
for microMIPS R6 targets, `Mips::MipsR62MicroMipsR6` nor
`Mips::Std2MicroMipsR6` cannot find correct op-codes and as a result the
backend uses mips32 variant of the instructions encoding.

The patch fixes this problem by adding the StdMMR6Rel tag and check
instructions encoding in the test case.

Differential revision: https://reviews.llvm.org/D51482

llvm-svn: 341221
2018-08-31 15:57:17 +00:00
Matt Arsenault bf07a50a98 AMDGPU: Restrict extract_vector_elt combine to loads
The intention is to enable the extract_vector_elt load combine,
and doing this for other operations interferes with more
useful optimizations on vectors.

Handle any type of load since in principle we should do the
same combine for the various load intrinsics.

llvm-svn: 341219
2018-08-31 15:39:52 +00:00
Jonas Devlieghere e3d6b9786e [Wasm] Add missing EOF checks for floats
Adds the same checks we already do for ints to floats.

Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=8698
llvm-svn: 341216
2018-08-31 14:54:01 +00:00
Matt Arsenault c807ce0ee4 SLPVectorizer: Fix assert with different sized address spaces
llvm-svn: 341215
2018-08-31 14:34:53 +00:00
Dean Michael Berris fc774e29d2 [XRay] Remove array for Metadata Record Types
This simplifies the implementation of the metadata lookup by using
scoped enums, rather than using enum classes. This way we can get the
number-name mapping without having to resort to comments.

Follow-up to D51289.

llvm-svn: 341205
2018-08-31 11:41:08 +00:00
Alexander Ivchenko 9d053074a1 [GlobalISel][X86] Add the support for G_FPTRUNC
Differential Revision: https://reviews.llvm.org/D49855

llvm-svn: 341202
2018-08-31 11:26:51 +00:00
Alexander Ivchenko 9b0b492653 [GlobalISel][X86_64] Support for G_FPTOSI
Differential Revision: https://reviews.llvm.org/D49183

llvm-svn: 341200
2018-08-31 11:16:58 +00:00
Alexander Ivchenko 58a5d6fde7 [GlobalIsel][X86] Support for llvm.trap intrinsic
Differential Revision: https://reviews.llvm.org/D49180

llvm-svn: 341199
2018-08-31 11:05:13 +00:00
Alexander Ivchenko 5b8418983c [NFC] Fix unused variable warning in X86RegisterBankInfo.cpp
llvm-svn: 341198
2018-08-31 10:39:54 +00:00
Dean Michael Berris 7a07a41cbb [XRay] Attempt to fix failure on Windows
Original version of the code relied on implementation-defined order of bitfields.

Follow-up on D51210.

llvm-svn: 341194
2018-08-31 10:03:52 +00:00
Alexander Ivchenko a26a364e75 [GlobalIsel][X86] Support for G_FCMP
Differential Revision: https://reviews.llvm.org/D49172

llvm-svn: 341193
2018-08-31 09:38:27 +00:00
Simon Pilgrim 95f4120f09 Fix MSVC "not all control paths return a value" warning. NFCI.
llvm-svn: 341191
2018-08-31 09:24:09 +00:00
Andrea Di Biagio b998eae2f2 [X86][BtVer2] Fix WriteFShuffle256 schedule write info.
This patch fixes the number of micro opcodes, and processor resource cycles for
the following AVX instructions:

vinsertf128rr/rm
vperm2f128rr/rm
vbroadcastf128

Tests have been regenerated using the usual scripts in the llvm/utils directory.

Differential Revision: https://reviews.llvm.org/D51492

llvm-svn: 341185
2018-08-31 08:30:47 +00:00
Dean Michael Berris 146d5791d9 [XRay] FDR Record Producer/Consumer Implementation
Summary:
This patch defines two new base types called `RecordProducer` and
`RecordConsumer` which have default implementations for convenience
(particularly for testing).

A `RecordProducer` implementation has one member function called
`produce()` which serves as a factory constructor for `Record`
instances. This code exercises the `RecordInitializer` code path in the
implementation for `FileBasedRecordProducer`.

A `RecordConsumer` has a single member function called `consume(...)`
which, as the name implies, consumes instances of
`std::unique_ptr<Record>`. We have two implementations, one of which is
used in the test to generate a vector of `std::unique_ptr<Record>`
similar to how the `LogBuilder` implementation works.

We introduce a test in `FDRProducerConsumerTest` which ensures that
records we write through the `FDRTraceWriter` can be loaded by the
`FileBasedRecordProducer`. The record(s) loaded this way are written
again through the `FDRTraceWriter` into a separate string, which we then
compare. This ensures that the read-in bytes to create the `Record`
instances in memory can be replicated when written out through the
`FDRTraceWriter`.

This change depends on D51210 and is part of the refactoring of D50441
into smaller, more focused changes.

Reviewers: eizan, kpw

Subscribers: mgorny, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51289

llvm-svn: 341180
2018-08-31 08:04:56 +00:00
Martin Storsjo 9e4d5f9b7b [AArch64] Hook up the missed machine operand flag name for MO_DLLIMPORT
llvm-svn: 341178
2018-08-31 08:00:34 +00:00
Martin Storsjo f010872b5c [MinGW] [X86] Pass true for the second parameter to StubValueTy for MO_COFFSTUB. NFC.
These stubs should never be emitted for internal symbols, and
nothing in AsmPrinter ever actually use this value when producing
the stubs for COFF anyway.

llvm-svn: 341177
2018-08-31 08:00:31 +00:00
Martin Storsjo 2dcaa41e1e [MinGW] [ARM] Add stubs for potential automatic dllimported variables
The runtime pseudo relocations can't handle the ARM format embedded
addresses in movw/movt pairs. By using stubs, the potentially
dllimported addresses can be touched up by the runtime pseudo relocation
framework.

Differential Revision: https://reviews.llvm.org/D51450

llvm-svn: 341176
2018-08-31 08:00:25 +00:00
Craig Topper 83e9f928ba [X86] Don't do anything in ReplaceNodeResults for (v2i32 (fptoui/fptosi v2f32)) when -x86-experimental-vector-widening-legalization is on.
We don't need to do our own widening, the generic legalizer can do it.

llvm-svn: 341174
2018-08-31 07:05:39 +00:00
Craig Topper e9af89a78b [X86] Don't custom widen (v2i32 (setcc v2f32)) when -x86-experimental-vector-widening-legalization is in effect.
We aren't doing anything than what the generic legalizer will do so just let it do it.

llvm-svn: 341172
2018-08-31 07:05:37 +00:00
Matt Arsenault 988df63525 AMDGPU: Stop forcing internalize at -O0
This doesn't really matter if clang is always emitting
the visibility as hidden by default.

llvm-svn: 341168
2018-08-31 06:02:36 +00:00
Matt Arsenault 0da6350dc8 AMDGPU: Remove remnants of old address space mapping
llvm-svn: 341165
2018-08-31 05:49:54 +00:00
Lang Hames 6d32002e2b [ORC] Add utilities to RTDyldObjectLinkingLayer2 to simplify symbol flag
management and materialization responsibility registration.

The setOverrideObjectFlagsWithResponsibilityFlags method instructs
RTDyldObjectlinkingLayer2 to override the symbol flags produced by RuntimeDyld with
the flags provided by the MaterializationResponsibility instance. This can be used
to enable symbol visibility (hidden/exported) for COFF object files, which do not
currently support the SF_Exported flag.

The setAutoClaimResponsibilityForObjectSymbols method instructs
RTDyldObjectLinkingLayer2 to claim responsibility for any symbols provided by a
given object file that were not already in the MaterializationResponsibility
instance. Setting this flag allows higher-level program representations (e.g.
LLVM IR) to be added based on only a subset of the symbols they provide, without
having to write intervening layers to scan and add the additional symbols. This
trades diagnostic quality for convenience however: If all symbols are enumerated
up-front then clashes can be detected and reported early. If this option is set,
clashes for the additional symbols may not be detected until late, and detection
may depend on the flow of control through JIT'd code.

llvm-svn: 341154
2018-08-31 00:53:17 +00:00
Max Kazantsev c683d643c9 Revert "[NFC] Add severe validation of InstructionPrecedenceTracking" for discussion
llvm-svn: 341147
2018-08-31 00:01:54 +00:00
Krzysztof Parzyszek d51f7b3b43 [Hexagon] Check validity of register class when generating bitsplit
llvm-svn: 341137
2018-08-30 22:26:43 +00:00
Eli Friedman d5d0a4d27f [ARM] Enable GEP offset splitting for 32-bit ARM.
It has essentially the same benefit it has on 64-bit ARM: it
substantially reduces the number of constants used by large GEP
operations. Seems to be generally helpful across a few different
codebases I've tried.

Differential Revision: https://reviews.llvm.org/D51462

llvm-svn: 341136
2018-08-30 22:18:27 +00:00