Commit Graph

219163 Commits

Author SHA1 Message Date
Rui Ueyama c7b073a23a Simplify. NFC.
llvm-svn: 256846
2016-01-05 16:35:48 +00:00
Rui Ueyama f53b1b7fde Update comments.
llvm-svn: 256845
2016-01-05 16:35:46 +00:00
Rui Ueyama e57c487eee Consistently use 'Bss' instead of 'BSS'.
llvm-svn: 256844
2016-01-05 16:35:43 +00:00
Rui Ueyama 7f9e7ea343 Remove redundant typedef.
llvm-svn: 256843
2016-01-05 16:35:41 +00:00
Samuel Antao 4d5f0bbea1 [OpenMP] Offloading descriptor registration and device codegen.
Summary:
For offloading to work properly, two things need to be in place:
- a descriptor with all the offloading information (device entry functions and global variables) has to be created by the host and registered in the OpenMP offloading runtime library.
- all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point to the actual function in the device.

This patch adds support for these two things. However, only entry functions are being registered given that the 'declare target' directive is not yet implemented.

About offloading descriptor:

The descriptor is explained in more detail in http://goo.gl/L1rnKJ. Basically, the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to the beginning and end of a table whose entries contain information about a specific entry point. Each entry has the type:
```
struct __tgt_offload_entry{
 void *addr;
 char *name;
 int64_t size;
};
```  
and will be implemented in a predetermined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (it will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device at the same index of the table, and efficiently implement the mapping.
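The same-index convention can be illustrated with a minimal sketch. `OffloadEntry` and `findDeviceEntry` are hypothetical names used only for illustration, and the linear scan is an assumption; the actual runtime library may search and map entries differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for __tgt_offload_entry, with the same three fields.
struct OffloadEntry {
  void *addr;
  const char *name;
  std::int64_t size;
};

// Hypothetical helper: the host and device tables are assumed to hold the
// same number of entries in the same order. The device entry is the one that
// sits at the same index as the host entry whose address equals hostAddr.
const OffloadEntry *findDeviceEntry(const OffloadEntry *hostBegin,
                                    const OffloadEntry *hostEnd,
                                    const OffloadEntry *deviceBegin,
                                    const void *hostAddr) {
  assert(hostBegin <= hostEnd && "host table bounds are reversed");
  for (const OffloadEntry *e = hostBegin; e != hostEnd; ++e)
    if (e->addr == hostAddr)
      return deviceBegin + (e - hostBegin); // same index in the device table
  return nullptr; // hostAddr is not a registered entry point
}
```

Because the tables are laid out without padding, the begin and end pointers behave like the bounds of a plain array, which is what makes the index arithmetic above valid.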

The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high priority global initializer so that the registration always happens before any initializer (that can potentially include target regions) is run.

The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple.


About target codegen:

The target codegen is fairly straightforward, as it completely reuses the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions on the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in the host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the emission of global declarations is intercepted to check whether it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information is checked. Right now, there is only one form of metadata, which passes information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related to the metadata. The metadata looks like this:
```
!omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6}

!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5}
!2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
!6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2}
```
The fields in each metadata entry are (in sequence):
Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region".
Entry 2) a unique ID of the device where the input source file that contains the target region lives.
Entry 3) a unique ID of the file where the input source file that contains the target region lives.
Entry 4) a mangled name of the function that encloses the target region.
Entries 5) and 6) line and column number where the target region was found.
Entry 7) is the order the entry was emitted.

Entry 2) and 3) are required to distinguish files that have the same function name.
Entry 4) is required to distinguish different instances of the same declaration (usually templated ones)
Entries 5) and 6) are required to distinguish the particular target region in the body of the function (it is possible that a given target region is not an entry point, because its if clause can always evaluate to zero, and therefore we need to identify the "interesting" target regions).
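As a rough sketch of how fields 2 to 6 can act together as a lookup key, with field 7 as the value: `RegionKey` and `RegionTable` below are hypothetical names, not the interface of `OffloadEntriesInfoManagerTy`, and the use of `std::map` is an assumption made for brevity.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Hypothetical key: (device ID, file ID, mangled parent name, line, column).
using RegionKey =
    std::tuple<unsigned, unsigned, std::string, unsigned, unsigned>;

// Hypothetical table mapping each target region to its emission order.
class RegionTable {
  std::map<RegionKey, unsigned> orderByKey;

public:
  // Records one metadata entry; the order is the last metadata field.
  void add(const RegionKey &key, unsigned order) {
    bool inserted = orderByKey.emplace(key, order).second;
    assert(inserted && "each target region should be registered once");
    (void)inserted;
  }

  // Returns true and sets order if the region is a registered entry point.
  bool lookup(const RegionKey &key, unsigned &order) const {
    auto it = orderByKey.find(key);
    if (it == orderByKey.end())
      return false;
    order = it->second;
    return true;
  }
};
```

With the sample metadata above, the four `_Z3fooi` entries share device ID, file ID and name, so only the line and column tell them apart.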

This patch replaces http://reviews.llvm.org/D12306.

Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao

Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits

Differential Revision: http://reviews.llvm.org/D12614

llvm-svn: 256842
2016-01-05 16:23:04 +00:00
Daniel Jasper 411af72e8c clang-format: Fix corner case in "if it saves columns"-calculation.
Before:
  aaaa
      .aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
	  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
      .aaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);

After:
  aaaa.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(
	  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
      .aaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);

llvm-svn: 256841
2016-01-05 16:10:39 +00:00
Michael Zuckerman 5cbae95916 [AVX512] add PSLLD and PSLLQ Intrinsic
Differential Revision: http://reviews.llvm.org/D15885

llvm-svn: 256840
2016-01-05 15:17:39 +00:00
MinSeong Kim 4a9a4e198f [MISched] Explanatory error message when machine model is not complete. NFC
When not all instructions have a scheduling class,
the error message now provides a possible solution.

Differential Revision: http://reviews.llvm.org/D15854

llvm-svn: 256839
2016-01-05 14:50:15 +00:00
Anastasia Stulova cf04d04ccf [OpenCL] Disallow taking an address of a function.
An undecorated function designator implies taking the address of a function,
which is illegal in OpenCL. Implementing a check for this earlier to allow
the error to be reported even in the presence of other more obvious errors.

Patch by Neil Hickey!

http://reviews.llvm.org/D15691

llvm-svn: 256838
2016-01-05 14:39:27 +00:00
Aaron Ballman e09fcfb108 Reverting r256836; it causes a build bot failure: http://lab.llvm.org:8011/builders/lldb-x86-win7-msvc/builds/14050/steps/build/logs/stdio
llvm-svn: 256837
2016-01-05 14:35:01 +00:00
Aaron Ballman 2566f4be50 Enable more strict standards conformance in MSVC for rvalue casting and string literal type conversion to non-const types. Also enables generation of intrinsics for more functions.
Patch by Alexander Riccio

llvm-svn: 256836
2016-01-05 14:24:01 +00:00
Pavel Labath 6f38342af1 Mark a test_lldbmi_gdb_set_target_async_on as flaky on linux
Test fails in about 1% of buildbot runs. Marking as flaky to avoid the noise.

llvm-svn: 256835
2016-01-05 14:21:15 +00:00
Sagar Thakur 307a3ba3b3 [LLDB][MIPS] Make register read/write to set/get the size of register according to abi.
Summary:
For the O32 ABI, register size should be 4 bytes.
For the N32 and N64 ABIs, register size should be 8 bytes.
This patch makes register read/write set/get the size of the register according to the ABI.
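The rule stated in this summary amounts to a small mapping from ABI to register width. The enum and function names below are hypothetical and are not LLDB's actual API; this is only a sketch of the rule.

```cpp
#include <cassert>

// Hypothetical enumeration of the three MIPS ABIs named in the summary.
enum class MipsABI { O32, N32, N64 };

// Register size in bytes: 4 for O32, 8 for N32 and N64.
unsigned registerSizeInBytes(MipsABI abi) {
  switch (abi) {
  case MipsABI::O32:
    return 4;
  case MipsABI::N32:
  case MipsABI::N64:
    return 8;
  }
  assert(false && "unknown MIPS ABI");
  return 0;
}
```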

Reviewers: clayborg, tberghammer
Subscribers: lldb-commits, nitesh.jain, mohit.bhakkad, bhushan, jaydeep
Differential: http://reviews.llvm.org/D15884
llvm-svn: 256834
2016-01-05 14:03:45 +00:00
Ewan Crawford 7093cccf92 Revert r256769
Reverts "Use correct format identifiers to print something meaningful."

Original format specifiers were correct.
Instead use void* casts to remove warnings, since this is what the %p specifier expects.
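A minimal illustration of the pattern, assuming a hypothetical helper `formatPointer` (not code from the reverted commit): the pointer argument is cast to `void *` before it is passed to a `%p` conversion. The printed text is implementation-defined, so no particular output is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a pointer with %p. The explicit cast to
// void * gives the argument the type that the %p specifier expects.
std::string formatPointer(int *p) {
  char buf[64];
  int n = std::snprintf(buf, sizeof(buf), "%p", static_cast<void *>(p));
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  (void)n;
  return std::string(buf);
}
```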

llvm-svn: 256833
2016-01-05 13:18:46 +00:00
Daniel Jasper 0a589416e8 clang-format: Handle \n the same way as std::endl with stream operator.
clang-format breaks multi-line streams after std::endl.
It now also breaks for '\n', the suggested replacement for std::endl:

  http://llvm.org/docs/CodingStandards.html#avoid-std-endl

Before:
  llvm::errs() << aaaaaaaaaaaaaaaaaaaaaa << '\n' << bbbbbbbbbbbbbbbbbbbbbb
               << '\n';
  llvm::errs() << aaaa << "aaaaaaaaaaaaaaaaaa\n" << bbbb
               << "bbbbbbbbbbbbbbbbbb\n";

After:
  llvm::errs() << aaaaaaaaaaaaaaaaaaaaaa << '\n'
               << bbbbbbbbbbbbbbbbbbbbbb << '\n';
  llvm::errs() << aaaa << "aaaaaaaaaaaaaaaaaa\n"
               << bbbb << "bbbbbbbbbbbbbbbbbb\n";

This changeset ensures that multiline streams have a line break after:
  - std::endl
  - '\n'
  - "\n"
  - "Some Text\n"

Patch by Jean-Philippe Dufraigne, thank you.

llvm-svn: 256832
2016-01-05 13:06:27 +00:00
Daniel Jasper 801cdb27e4 clang-format: Avoid creating hanging indents in call sequences.
Before:
  aaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaaaaaa(
                      aaaaaaaaaaaaaaaaaaaa)
        .aaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaa);

After:
  aaaaaaaaaaaaaaaa
        .aaaaaaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaa)
        .aaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaa);

llvm-svn: 256831
2016-01-05 13:03:59 +00:00
Daniel Jasper 00492f96bf clang-format: Improve line wrapping behavior in call sequences.
r256750 has been leading to an undesired behavior:

  aaaaaaaaaa
      .aaaaaaaaaaaaaaaaaaaaaaaa.aaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);

This change increases penalty for wrapping before member accesses that aren't
calls. Thus, this is again formatted as (as it has been before r256750):

  aaaaaaaaaa.aaaaaaaaaaaaaaaaaaaaaaaa.aaaaaa(
      aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);

llvm-svn: 256830
2016-01-05 13:03:50 +00:00
MinSeong Kim 50d9c156dd [AArch64] Teaches clang about Samsung Exynos-M1
Adds core tuning support for new Samsung Exynos-M1 core (ARMv8-A).

Differential Revision: http://reviews.llvm.org/D15664

llvm-svn: 256829
2016-01-05 12:53:24 +00:00
MinSeong Kim a7385ebf78 [AArch64] Add support for Samsung Exynos-M1
Adds core tuning support for new Samsung Exynos-M1 core (ARMv8-A).

Differential Revision: http://reviews.llvm.org/D15663

llvm-svn: 256828
2016-01-05 12:51:59 +00:00
Pavel Labath b4872150d2 Remove XTIMEOUT from TestEvents on linux
I'm getting rid of the expected timeouts. I'll XFAIL/skip any tests that show up as failing after
this (I haven't seen any when running locally, but maybe the buildbot will disagree).

llvm-svn: 256827
2016-01-05 12:51:26 +00:00
Tobias Grosser c28ae257c0 TODO: Polly can handle boolean expressions (Open->Done)
The necessary support was committed by Johannes in r249971.

llvm-svn: 256826
2016-01-05 11:48:59 +00:00
Tobias Grosser 100ef6b30c TODO: We do not use -independent-blocks any more (Open -> Done)
llvm-svn: 256825
2016-01-05 11:45:26 +00:00
Pavel Labath 773e86f255 Remove old flaky test rerun logic
Summary:
This removes the old logic for rerunning flaky tests. The new test runners will take care of
rerunning failing tests.

Reviewers: tfiala

Subscribers: lldb-commits

Differential Revision: http://reviews.llvm.org/D15855

llvm-svn: 256824
2016-01-05 10:44:36 +00:00
Artyom Skrobov 8c6992344d (NFC) Change SubtargetFeatures::ToggleFeature and
SubtargetFeatures::ApplyFeatureFlag to be static, so that
MCSubtargetInfo doesn't need to instantiate SubtargetFeatures
for nothing. Also change the return type to void, as it
wasn't ever used.

This is a partial commit of http://reviews.llvm.org/D15746

llvm-svn: 256823
2016-01-05 10:25:56 +00:00
Alexandros Lamprineas d162b5c8c4 [ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics.
Differential Revision: http://reviews.llvm.org/D15223

llvm-svn: 256822
2016-01-05 09:58:29 +00:00
Junmo Park a1cc2d31b5 Remove extra whitespace. NFC.
llvm-svn: 256821
2016-01-05 09:40:03 +00:00
Junmo Park 3b8c715b2f Remove extra whitespace. NFC.
llvm-svn: 256820
2016-01-05 09:36:47 +00:00
Simon Pilgrim d47ac60f00 [X86][SSE] Merge PerformBLENDICombine into PerformShuffleCombine
PBLEND/BLENDPD/BLENDPS are no different to the other target shuffles and this will make future improvements to the target shuffle combines more straightforward.

llvm-svn: 256819
2016-01-05 09:12:17 +00:00
Craig Topper e00bffbc13 [X86] Make MOV32ri64 a post-RA pseudo instead of a CodeGenOnly instruction. It was only needed for rematerialization.
llvm-svn: 256818
2016-01-05 07:44:14 +00:00
Craig Topper 9583f51348 [X86] Add OpSize32 to OR32mrLocked instruction to match the normal OR32mr instruction.
llvm-svn: 256817
2016-01-05 07:44:11 +00:00
Craig Topper ad2ce36be0 [AVX512] Add hasSideEffects=0 to kunpck instructions since they lack a pattern in their instructions.
llvm-svn: 256816
2016-01-05 07:44:08 +00:00
David Majnemer 59eb733af1 [SimplifyCFG] Further improve our ability to remove redundant catchpads
In r256814, we managed to remove catchpads which were trivially redundant
because they were the same SSA value.  We can do better using the same
algorithm but with a smarter datastructure by hashing the SSA values
within the catchpad and comparing them structurally.
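
The general idea of structural de-duplication via hashing can be sketched as follows. This is not the SimplifyCFG implementation: handlers are modelled as plain vectors of integer IDs standing in for SSA values, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Hypothetical model of a handler: the list of values it is built from.
using HandlerArgs = std::vector<int>;

// Hashes a handler by combining the hashes of its arguments in order.
struct ArgsHash {
  std::size_t operator()(const HandlerArgs &args) const {
    std::size_t h = args.size();
    for (int a : args)
      h = h * 31 + std::hash<int>()(a);
    return h;
  }
};

// Returns the indices of the handlers to keep: the first occurrence of each
// structurally distinct handler. Equality is element-wise vector comparison.
std::vector<std::size_t>
uniqueHandlers(const std::vector<HandlerArgs> &handlers) {
  std::unordered_set<HandlerArgs, ArgsHash> seen;
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < handlers.size(); ++i)
    if (seen.insert(handlers[i]).second)
      kept.push_back(i);
  assert(kept.size() == seen.size());
  return kept;
}
```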

llvm-svn: 256815
2016-01-05 07:42:17 +00:00
David Majnemer 2fa8651a8f [SimplifyCFG] Remove redundant catchpads
Remove duplicate catchpad handlers from a catchswitch.

llvm-svn: 256814
2016-01-05 06:27:50 +00:00
Matt Arsenault 905042774d AMDGPU: Remove redundant let mayLoad = 1
This is already set on the SMRD format class.

llvm-svn: 256813
2016-01-05 04:50:28 +00:00
Manuel Jacob 75cbfdcf03 [RS4GC] Simplify handling of Constants in findBaseDefiningValue(). NFC.
Summary:
Previously there were three conditionals, checking for global
variables, undef values and everything constant except these two, all three
returning the same value.  This commit replaces them by one conditional.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15818

llvm-svn: 256812
2016-01-05 04:06:21 +00:00
Manuel Jacob 83eefa6d20 [Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC.
Summary:
This commit renames GCRelocateOperands to GCRelocateInst and makes it an
intrinsic wrapper, similar to e.g. MemCpyInst.  Also, all users of
GCRelocateOperands were changed to use the new intrinsic wrapper instead.

Reviewers: sanjoy, reames

Subscribers: reames, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15762

llvm-svn: 256811
2016-01-05 04:03:00 +00:00
Tom Stellard 5cd09ade38 AMDGPU/SI: Select non-uniform constant addrspace loads to flat instructions for HSA
Summary: This fixes a regression caused by r256282.

Reviewers: arsenm, cfang

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15736

llvm-svn: 256810
2016-01-05 03:40:16 +00:00
Joseph Tremoulet 0d808888c1 [WinEH] Simplify unreachable catchpads
Summary:
At least for CoreCLR, a catchpad which immediately executes an
`unreachable` instruction indicates that the exception can never have a
matching type, and so such catchpads can be removed, and so can their
catchswitches if the catchswitch becomes empty.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15846

llvm-svn: 256809
2016-01-05 02:37:41 +00:00
David Majnemer 869be0a4a6 Revert "[X86] Use push-pop for materializing small constants under 'minsize'"
The red zone consists of 128 bytes beyond the stack pointer so that the
allocation of objects in leaf functions doesn't require decrementing
rsp.  In r255656, we introduced an optimization that would cheaply
materialize certain constants via push/pop.  Push decrements the stack
pointer and stores its result at what is now the top of the stack.
However, this means that using push/pop would encroach on the red zone.
PR26023 gives an example where this corrupts an object in the red zone.

llvm-svn: 256808
2016-01-05 02:32:06 +00:00
Tom Stellard 2c82ee60c3 AMDGPU/SI: Consolidate FLAT patterns
Summary:
We had two sets of identical FLAT patterns one inside the
HasFlatAddressSpace predicate and one inside the useFlatForGloabl
predicate.  This patch merges these sets into a single pattern
under the isCIVI predicate.

The reason we can remove the predicates is that when MUBUF instructions
are legal, the instruction selector will prefer selecting those over
FLAT instructions because MUBUF patterns have a higher complexity score.
So, in this case having patterns for FLAT instructions will have no effect.

This change also simplifies the process for forcing global address space
loads to use FLAT instructions, since we now only have to disable the
MUBUF patterns instead of having to disable the MUBUF patterns and
enable the FLAT patterns.

Reviewers: arsenm, cfang

Subscribers: llvm-commits
llvm-svn: 256807
2016-01-05 02:26:37 +00:00
Mike Aizatsky cc56ac3669 [sancov] adding internal function
llvm-svn: 256806
2016-01-05 02:09:54 +00:00
Pete Cooper 52db793d33 Improved debugging printing. NFC
llvm-svn: 256805
2016-01-05 01:56:59 +00:00
Mike Aizatsky 54fc6575c5 [sancov] coverage pc buffer
Differential Revision: http://reviews.llvm.org/D15871

llvm-svn: 256804
2016-01-05 01:49:39 +00:00
Richard Smith 40b14d4893 Avoid walking all the declarations in the TU when a tag is declared in function
prototype scope in a function definition.

llvm-svn: 256803
2016-01-05 01:21:53 +00:00
Philip Reames a694a0b141 [MDA] Don't be quite as conservative for noalias functions
If we encounter a noalias call that alias analysis can't analyse, we can fall down into the generic call handling rather than giving up entirely. I noticed this while reading through the code for another purpose.

I can't seem to write a test case which changes; that sorta makes sense given any test case would have to be an inconsistency in AA. Suggestions welcome.

Differential Revision: http://reviews.llvm.org/D15825

llvm-svn: 256802
2016-01-05 00:49:14 +00:00
Matthias Braun d9fe082ba7 X86: Add a testcase for PR25951
llvm-svn: 256801
2016-01-05 00:48:16 +00:00
Pete Cooper 921227c3bc Fix test case comment after r256786. NFC.
The comment spacing was meant to show the interesting bytes in the prior
line, but the prior line moved slightly.

llvm-svn: 256800
2016-01-05 00:47:22 +00:00
Matthias Braun 7e762e4f9c MachineInstrBundle: Fix reversed isSuperRegisterEq() call
Unfortunately this fix had the effect of exposing the
-verify-machineinstrs FIXME of X86InstrInfo.cpp in two testcases for
which I disabled it for now.
Two testcases also have additional pushq/popq where the corrected code
cannot prove that %rax is dead any longer. Looking at the examples, this
could potentially be fixed by improving computeRegisterLiveness() to check
the live-in lists of the successors blocks when reaching the end of a
block.

This fixes http://llvm.org/PR25951.

llvm-svn: 256799
2016-01-05 00:45:35 +00:00
Matthias Braun d84af9ba8b Fix typo in comment
llvm-svn: 256798
2016-01-05 00:45:31 +00:00
David Majnemer 758e79858e Remove an unused parameter
No functionality change is intended

llvm-svn: 256797
2016-01-05 00:08:41 +00:00