Commit Graph

40 Commits

Author SHA1 Message Date
Atmn Patel ac73b73c16 [clang] Add mustprogress and llvm.loop.mustprogress attribute deduction
Since C++11, the C++ standard has a forward progress guarantee
[intro.progress], so all such functions must have the `mustprogress`
requirement. In addition, from C11 and onwards, loops without a non-zero
constant conditional or no conditional are also required to make
progress (C11 6.8.5p6). This patch implements these attribute deductions
so they can be used by the optimization passes.

Differential Revision: https://reviews.llvm.org/D86841
2020-11-04 22:03:14 -05:00
Duncan P. N. Exon Smith d4c667c9af Avoid unnecessary uses of `MDNode::getTemporary`, NFC
This is a long-delayed follow-up to
5e5b85098d.

`TempMDNode` includes a bunch of machinery for RAUW, and should only be
used when necessary. RAUW wasn't being used in any of these cases... it
was just a placeholder for a self-reference.

Where the real node was using `MDNode::getDistinct`, just replace the
temporary argument with `nullptr`.

Where the real node was using `MDNode::get`, the `replaceOperandWith`
call was "promoting" the node to a distinct one implicitly due to
self-reference detection in `MDNode::handleChangedOperand`. The
`TempMDNode` was serving a purpose by delaying uniquing, but it's way
simpler to just call `MDNode::getDistinct` in the first place.

Note that using a self-reference at all in these places is a hold-over
from before `distinct` metadata existed. It was an old trick to create
distinct nodes. It would be intrusive to change, including bitcode
upgrades, etc., and it's harmless so I'm not sure there's much value in
removing it from existing schemas. After this commit it still has a tiny
memory cost (in the extra metadata operand) but no more overhead in
construction.

Differential Revision: https://reviews.llvm.org/D90079
2020-10-26 17:03:25 -04:00
Florian Hahn 338be9c595 [Clang] Add llvm.loop.unroll.disable to loops with -fno-unroll-loops.
Currently Clang does not respect -fno-unroll-loops during LTO. During
D76916 it was suggested to respect -fno-unroll-loops on a TU basis.

This patch uses the existing llvm.loop.unroll.disable metadata to
disable loop unrolling explicitly for each loop in the TU if
unrolling is disabled. This should ensure that loops from TUs compiled
with -fno-unroll-loops are skipped by the unroller during LTO.

This also means that if a loop from a TU with -fno-unroll-loops
gets inlined into a TU without this option, the loop won't be
unrolled.

Due to the fact that some transforms might drop loop metadata, there
potentially are cases in which we still unroll loops from TUs with
-fno-unroll-loops. I think we should fix those issues rather than
introducing a function attribute to disable loop unrolling during LTO.
Improving the metadata handling will benefit other use cases, like
various loop pragmas, too. And it is an improvement to clang completely
ignoring -fno-unroll-loops during LTO.

If that direction looks good, we can use a similar approach to also
respect -fno-vectorize during LTO, at least for LoopVectorize.

In the future, this might also allow us to remove the UnrollLoops option
LLVM's PassManagerBuilder.

Reviewers: Meinersbur, hfinkel, dexonsmith, tejohnson

Reviewed By: Meinersbur, tejohnson

Differential Revision: https://reviews.llvm.org/D77058
2020-04-07 14:01:55 +01:00
Reid Kleckner 26d254f084 Sink more Attr.h inline methods, NFC
This has very little impact on build time, but is a mechanical pre-req
to removing the OpenMPClause.h include, which matters. Most of these
pretty print methods require Expr to be complete.
2020-03-12 11:54:31 -07:00
Sjoerd Meijer 0216854917 [Clang] Pragma vectorize_width() implies vectorize(enable)
Let's try this again; this has been reverted/recommited a few times. Last time
this got reverted because for this loop:

  void a() {
    #pragma clang loop vectorize(disable)
    for (;;)
      ;
  }

vectorisation was incorrectly enabled and the vectorize.enable metadata was set
due to a logic error. But with this fixed, we now imply vectorisation when:

1) vectorisation is enabled, which means: VectorizeWidth > 1,
2) and don't want to add it when it is disabled or enabled, otherwise we would
   be incorrectly setting it or duplicating the metadata, respectively.

This should fix PR27643.

Differential Revision: https://reviews.llvm.org/D69628
2019-12-11 10:37:40 +00:00
Jordan Rupprecht 6d424a161b Revert "Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)""
This reverts commit 80371c74ae.

Given the following source:
```
void a() {
  for (;;)
    ;
}
```

It incorrectly enables vectorization (with vector width 1), as well as generating a warning that vectorization could not be performed.
2019-10-24 16:35:45 -07:00
Sjoerd Meijer 80371c74ae Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)"
This was further discussed at the llvm dev list:

http://lists.llvm.org/pipermail/llvm-dev/2019-October/135602.html

I think the brief summary of that is that this change is an improvement,
this is the behaviour that we expect and promise in ours docs, and also
as a result there are cases where we now emit diagnostics whereas before
pragmas were silently ignored. Two areas where we can improve: 1) the
diagnostic message itself, and 2) and in some cases (e.g. -Os and -Oz)
the vectoriser is (quite understandably) not triggering.

Original commit message:

Specifying the vectorization width was supposed to implicitly enable
vectorization, except that it wasn't really doing this. It was only
setting the vectorize.width metadata, but not vectorize.enable.

This should fix PR27643.

llvm-svn: 374288
2019-10-10 08:27:14 +00:00
Hans Wennborg 858a1ae37d Revert r372082 "[Clang] Pragma vectorize_width() implies vectorize(enable)"
This broke the Chromium build. Consider the following code:

  float ScaleSumSamples_C(const float* src, float* dst, float scale, int width) {
    float fsum = 0.f;
    int i;
  #if defined(__clang__)
  #pragma clang loop vectorize_width(4)
  #endif
    for (i = 0; i < width; ++i) {
      float v = *src++;
      fsum += v * v;
      *dst++ = v * scale;
    }
    return fsum;
  }

Compiling at -Oz, Clang  now warns:

  $ clang++ -target x86_64 -Oz -c /tmp/a.cc
  /tmp/a.cc:1:7: warning: loop not vectorized: the optimizer was unable to
  perform the requested transformation; the transformation might be disabled or
  specified as part of an unsupported transformation ordering
  [-Wpass-failed=transform-warning]

this suggests it's not actually enabling vectorization hard enough.

At -Os it asserts instead:

  $ build.release/bin/clang++ -target x86_64 -Os -c /tmp/a.cc
  clang-10: /work/llvm.monorepo/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2734: void
  llvm::InnerLoopVectorizer::emitMemRuntimeChecks(llvm::Loop*, llvm::BasicBlock*): Assertion `
  !BB->getParent()->hasOptSize() && "Cannot emit memory checks when optimizing for size"' failed.

Of course neither of these are what the developer expected from the pragma.

> Specifying the vectorization width was supposed to implicitly enable
> vectorization, except that it wasn't really doing this. It was only
> setting the vectorize.width metadata, but not vectorize.enable.
>
> This should fix PR27643.
>
> Differential Revision: https://reviews.llvm.org/D66290

llvm-svn: 372225
2019-09-18 13:41:51 +00:00
Sjoerd Meijer e573a9c035 [Clang] Pragma vectorize_width() implies vectorize(enable)
Specifying the vectorization width was supposed to implicitly enable
vectorization, except that it wasn't really doing this. It was only
setting the vectorize.width metadata, but not vectorize.enable.

This should fix PR27643.

Differential Revision: https://reviews.llvm.org/D66290

llvm-svn: 372082
2019-09-17 08:43:11 +00:00
Aaron Ballman b6ab533b93 Don't keep stale pointers to LoopInfos.
CGLoopInfo was keeping pointers to parent loop LoopInfos, but when the loop info vector grew, it reallocated the storage and invalidated all of the parent pointers, causing use-after-free. Manage the lifetimes of the LoopInfos separately so that the pointers aren't stale.

Patch by Bevin Hansson.

llvm-svn: 369259
2019-08-19 13:37:41 +00:00
Sjoerd Meijer 535efab2e5 [Clang] Pragma vectorize_predicate implies vectorize
New pragma "vectorize_predicate(enable)" now implies "vectorize(enable)",
and it is ignored when vectorization is disabled with e.g.
"vectorize(disable) vectorize_predicate(enable)".

Differential Revision: https://reviews.llvm.org/D65776

llvm-svn: 368970
2019-08-15 06:24:40 +00:00
Sjoerd Meijer a48f58c97f [Clang] New loop pragma vectorize_predicate
This adds a new vectorize predication loop hint:

  #pragma clang loop vectorize_predicate(enable)

that can be used to indicate to the vectoriser that all (load/store)
instructions should be predicated (masked). This allows, for example, folding
of the remainder loop into the main loop.

This patch will be followed up with D64916 and D65197. The former is a
refactoring in the loopvectorizer and the groundwork to make tail loop folding
a more general concept, and in the latter the actual tail loop folding
transformation will be implemented.

Differential Revision: https://reviews.llvm.org/D64744

llvm-svn: 366989
2019-07-25 07:33:13 +00:00
Michael Kruse 58e7642669 [CodeGen] Generate follow-up metadata for loops with more than one transformation.
Before this patch, CGLoop would dump all transformations for a loop into
a single LoopID without encoding any order in which to apply them.
rL348944 added the possibility to encode a transformation order using
followup-attributes.

When a loop has more than one transformation, use the follow-up
attribute define the order in which they are applied. The emitted order
is the defacto order as defined by the current LLVM pass pipeline,
which is:

  LoopFullUnrollPass
  LoopDistributePass
  LoopVectorizePass
  LoopUnrollAndJamPass
  LoopUnrollPass
  MachinePipeliner

This patch should therefore not change the assembly output, assuming
that all explicit transformations can be applied, and no implicit
transformations in-between. In the former case,
WarnMissedTransformationsPass should emit a warning (except for
MachinePipeliner which is not implemented yet). The latter could be
avoided by adding 'llvm.loop.disable_nonforced' attributes.

Because LoopUnrollAndJamPass processes a loop nest, generation of the
MDNode is delayed to after the inner loop metadata have been processed.
A temporary LoopID is therefore used to annotate instructions and
RAUW'ed by the actual LoopID later.

Differential Revision: https://reviews.llvm.org/D57978

llvm-svn: 357415
2019-04-01 17:47:41 +00:00
Andrew Savonichev 76b178d949 [OpenCL] Generate 'unroll.enable' metadata for __attribute__((opencl_unroll_hint))
Summary:
[OpenCL] Generate 'unroll.enable' metadata for  __attribute__((opencl_unroll_hint))
    
For both !{!"llvm.loop.unroll.enable"} and !{!"llvm.loop.unroll.full"} the unroller
will try to fully unroll a loop unless the trip count is not known at compile time.
In that case for '.full' metadata no unrolling will be processed, while for '.enable'
the loop will be partially unrolled with a heuristically chosen unroll factor.
    
See: docs/LanguageExtensions.rst
    
From https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/attributes-loopUnroll.html

    __attribute__((opencl_unroll_hint))
    for (int i=0; i<2; i++)
    {
        ...
    }
    
In the example above, the compiler will determine how much to unroll the loop.

    
Before the patch for  __attribute__((opencl_unroll_hint)) was generated metadata
!{!"llvm.loop.unroll.full"}, which limits ability of loop unroller to decide, how
much to unroll the loop.

Reviewers: Anastasia, yaxunl

Reviewed By: Anastasia

Subscribers: zzheng, dmgreen, jdoerfert, cfe-commits, asavonic, AlexeySotkin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59493

llvm-svn: 356571
2019-03-20 16:43:07 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Aaron Ballman 9bdf515c74 Add two new pragmas for controlling software pipelining optimizations.
This patch adds #pragma clang loop pipeline and #pragma clang loop pipeline_initiation_interval for debugging or reducing compile time purposes. It is possible to disable SWP for concrete loops to save compilation time or to find bugs by not doing SWP to certain loops. It is possible to set value of initiation interval to concrete number to save compilation time by not doing extra pipeliner passes or to check created schedule for specific initiation interval.

Patch by Alexey Lapshin.

llvm-svn: 350414
2019-01-04 17:20:00 +00:00
Michael Kruse 0535137e4a [CodeGen] Generate llvm.loop.parallel_accesses instead of llvm.mem.parallel_loop_access metadata.
Instead of generating llvm.mem.parallel_loop_access metadata, generate
llvm.access.group on instructions and llvm.loop.parallel_accesses on
loops. There is one access group per generated loop.

This is clang part of D52116/r349725.

Differential Revision: https://reviews.llvm.org/D52117

llvm-svn: 349823
2018-12-20 21:24:54 +00:00
Richard Trieu 0614cff40d Move LoopHint.h from Sema to Parse
struct LoopHint was only used within Parse and not in any of the Sema or
Codegen files.  In the non-Parse files where it was included, it either wasn't
used or LoopHintAttr was used, so its inclusion did nothing.

llvm-svn: 347728
2018-11-28 04:36:31 +00:00
Chandler Carruth 4aaaaabe87 [TI removal] Test predicate rather than casting to detect a terminator
and use the range based successor API.

llvm-svn: 344730
2018-10-18 08:16:20 +00:00
Michael Kruse cba47b4978 [CodeGen] Emit parallel_loop_access for each loop in the loop stack.
Summary:
Emit !llvm.mem.parallel_loop_access metadata for memory accesses even if the parallel loop is not the top on the loop stack.

Fixes llvm.org/PR37558.

Reviewers: ABataev, hfinkel, amusman, tyler.nowicki

Reviewed By: hfinkel

Subscribers: Meinersbur, hfinkel, cfe-commits

Differential Revision: https://reviews.llvm.org/D48808

llvm-svn: 338810
2018-08-03 04:42:52 +00:00
David Green c8e3924b3b [UnrollAndJam] Add unroll_and_jam pragma handling
This adds support for the unroll_and_jam pragma, to go with the recently
added unroll and jam pass. The name of the pragma is the same as is used
in the Intel compiler, and most of the code works the same as for unroll.

#pragma clang loop unroll_and_jam has been separated into a different
patch. This part adds #pragma unroll_and_jam with an optional count, and
#pragma no_unroll_and_jam to disable the transform.

Differential Revision: https://reviews.llvm.org/D47267

llvm-svn: 338566
2018-08-01 14:36:12 +00:00
Fangrui Song 6907ce2f8f Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338291
2018-07-30 19:24:48 +00:00
Benjamin Kramer 81cb4b7103 [CodeGen] Pass objects that are expensive to copy by const ref.
No functionality change. Found by clang-tidy's
performance-unnecessary-value-param.

llvm-svn: 287894
2016-11-24 16:01:20 +00:00
Amara Emerson 652795db16 Add the loop end location to the loop metadata. This additional information
can be used to improve the locations when generating remarks for loops.

Depends on the companion LLVM change r286227.

Patch by Florian Hahn.

Differential Revision: https://reviews.llvm.org/D25764

llvm-svn: 286456
2016-11-10 14:44:30 +00:00
Adam Nemet 9c84859075 [Pragma] Clear loop distribution attribute between loops
llvm-svn: 279608
2016-08-24 04:31:56 +00:00
Adam Nemet 2de463ece3 Add loop pragma for Loop Distribution
Summary:
This is similar to other loop pragmas like 'vectorize'.  Currently it
only has state values: distribute(enable) and distribute(disable).  When
one of these is specified the corresponding loop metadata is generated:

  !{!"llvm.loop.distribute.enable", i1 true/false}

As a result, loop distribution will be attempted on the loop even if
Loop Distribution in not enabled globally.  Analogously, with 'disable'
distribution can be turned off for an individual loop even when the pass
is otherwise enabled.

There are some slight differences compared to the existing loop pragmas.

1. There is no 'assume_safety' variant which makes its handling slightly
different from 'vectorize'/'interleave'.

2. Unlike the existing loop pragmas, it does not have a corresponding
numeric pragma like 'vectorize' -> 'vectorize_width'.  So for the
consistency checks in CheckForIncompatibleAttributes we don't need to
check it against other pragmas.  We just need to check for duplicates of
the same pragma.

Reviewers: rsmith, dexonsmith, aaron.ballman

Subscribers: bob.wilson, cfe-commits, hfinkel

Differential Revision: http://reviews.llvm.org/D19403

llvm-svn: 272656
2016-06-14 12:04:26 +00:00
Hal Finkel c07e19b2c1 Add a loop's debug location to its llvm.loop metadata
Getting accurate locations for loops is important, because those locations are
used by the frontend to generate optimization remarks. Currently, optimization
remarks for loops often appear on the wrong line, often the first line of the
loop body instead of the loop itself. This is confusing because that line might
itself be another loop, or might be somewhere else completely if the body was
an inlined function call. This happens because of the way we find the loop's
starting location. First, we look for a preheader, and if we find one, and its
terminator has a debug location, then we use that. Otherwise, we look for a
location on an instruction in the loop header.

The fallback heuristic is not bad, but will almost always find the beginning of
the body, and not the loop statement itself. The preheader location search
often fails because there's often not a preheader, and even when there is a
preheader, depending on how it was formed, it sometimes carries the location of
some preceeding code.

I don't see any good theoretical way to fix this problem. On the other hand,
this seems like a straightforward solution: Put the debug location in the
loop's llvm.loop metadata. When emitting debug information, this commit causes
us to add the debug location as an operand to each loop's llvm.loop metadata.
Thus, we now generate this metadata for all loops (not just loops with
optimization hints) when we're otherwise generating debug information.

The remark test case changes depend on the companion LLVM commit r270771.

llvm-svn: 270772
2016-05-25 21:53:24 +00:00
Duncan P. N. Exon Smith f72d5b608d CGLoopInfo: Use the MD_loop metadata kind from r264371, NFC
Besides a small compile-time speedup, there should be no real
functionality change here.

llvm-svn: 264372
2016-03-25 00:38:14 +00:00
Anastasia Stulova 6bdbcbb3d9 [OpenCL] Generate metadata for opencl_unroll_hint attribute
Add support for opencl_unroll_hint attribute from OpenCL v2.0 s6.11.5.

Reusing most of metadata generation from CGLoopInfo helper class.

The code is based on Khronos OpenCL compiler:
https://github.com/KhronosGroup/SPIR/tree/spirv-1.0

Patch by Liu Yaxun (Sam)!

Differential Revision: http://reviews.llvm.org/D16686

llvm-svn: 261350
2016-02-19 18:30:11 +00:00
Mark Heffernan 397a98d86d Add new llvm.loop.unroll.enable metadata for use with "#pragma unroll".
This change adds the new unroll metadata "llvm.loop.unroll.enable" which directs
the optimizer to unroll a loop fully if the trip count is known at compile time, and
unroll partially if the trip count is not known at compile time. This differs from
"llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not
known at compile time

With this change "#pragma unroll" generates "llvm.loop.unroll.enable" rather than
"llvm.loop.unroll.full" metadata. This changes the semantics of "#pragma unroll" slightly
to mean "unroll aggressively (fully or partially)" rather than "unroll fully or not at all".

The motivating example for this change was some internal code with a loop marked
with "#pragma unroll" which only sometimes had a compile-time trip count depending
on template magic. When the trip count was a compile-time constant, everything works
as expected and the loop is fully unrolled. However, when the trip count was not a
compile-time constant the "#pragma unroll" explicitly disabled unrolling of the loop(!).
Removing "#pragma unroll" caused the loop to be unrolled partially which was desirable
from a performance perspective.

llvm-svn: 244467
2015-08-10 17:29:39 +00:00
Tyler Nowicki 54c020d372 Use CGLoopInfo to emit metadata for loop hint pragmas.
When ‘#pragma clang loop vectorize(assume_safety)’ was specified on a loop other loop hints were lost. The problem is that CGLoopInfo attaches metadata differently than EmitCondBrHints in CGStmt. For do-loops CGLoopInfo attaches metadata to the br in the body block and for while and for loops, the inc block. EmitCondBrHints on the other hand always attaches data to the br in the cond block. When specifying assume_safety CGLoopInfo emits an empty llvm.loop metadata shadowing the metadata in the cond block. Loop transformations like rotate and unswitch would then eliminate the cond block and its non-empty metadata.

This patch unifies both approaches for adding metadata and modifies the existing safety tests to include non-assume_safety loop hints.

llvm-svn: 243315
2015-07-27 20:10:20 +00:00
Tyler Nowicki da46d0ea8c Make the variable names match the name of the metadata they control.
Rename Vectorizer to Vectorize and VectorizeUnroll to InterleaveCount.

llvm-svn: 242241
2015-07-14 23:03:09 +00:00
David Majnemer 03a9056f58 [IRGen] Fix the MSVC2013 build
llvm-svn: 239576
2015-06-12 00:17:26 +00:00
Tyler Nowicki 9d268e178e Add assume_safety option for pragma loop vectorize and interleave.
Specifying #pragma clang loop vectorize(assume_safety) on a loop adds the
mem.parallel_loop_access metadata to each load/store operation in the loop. This
metadata tells loop access analysis (LAA) to skip memory dependency checking.

llvm-svn: 239572
2015-06-11 23:23:17 +00:00
Tyler Nowicki 4e8e900dd1 Eliminate unnecessary namespace to prevent conflicts.
llvm-svn: 239365
2015-06-08 23:27:35 +00:00
Duncan P. N. Exon Smith 7fd74acd0b CodeGen: Update LoopAttributes for LLVM API change
`MDNode::getTemporary()` returns a `unique_ptr<>` as of r226504.

llvm-svn: 226505
2015-01-19 21:30:48 +00:00
Duncan P. N. Exon Smith fb49491477 IR: Update clang for Metadata/Value split in r223802
Match LLVM API changes from r223802.

llvm-svn: 223803
2014-12-09 18:39:32 +00:00
Mark Heffernan 34735af3cb Rename metadata llvm.loop.vectorize.unroll to llvm.loop.vectorize.interleave.
llvm-svn: 213587
2014-07-21 23:10:56 +00:00
Eli Bendersky b198b4e864 Rename loop unrolling and loop vectorizer metadata to have a common prefix.
[Clang part]

These patches rename the loop unrolling and loop vectorizer metadata
such that they have a common 'llvm.loop.' prefix.  Metadata name
changes:

llvm.vectorizer.* => llvm.loop.vectorizer.*
llvm.loopunroll.* => llvm.loop.unroll.*

This was a suggestion from an earlier review
(http://reviews.llvm.org/D4090) which added the loop unrolling
metadata. 

Patch by Mark Heffernan.

llvm-svn: 211712
2014-06-25 15:42:16 +00:00
Alexander Musman 515ad8c490 This patch adds a helper class (CGLoopInfo) for marking memory instructions with llvm.mem.parallel_loop_access metadata.
It also adds a simple initial version of codegen for pragma omp simd (it will change in the future to support all the clauses).

Differential revision: http://reviews.llvm.org/D3644

llvm-svn: 209411
2014-05-22 08:54:05 +00:00