Commit Graph

1256 Commits

Alexey Bader 7818906ca1 [SYCL] Implement SYCL address space attributes handling
Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.

Added SYCL named address spaces `sycl_global`, `sycl_local` and
`sycl_private` defined as sub-sets of the default address space.

Static variables without an address space now reside in the global address
space when compiling for the SPIR target, unless they have an explicit address
space qualifier in the source code.
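For illustration, a minimal sketch of the static-variable behavior described above (hypothetical SYCL device code compiled for a SPIR target):

```
int nextId() {
  // No explicit address space qualifier: with this change the variable is
  // emitted in the global address space when compiling for SPIR.
  static int id = 0;
  return ++id;
}
```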

Differential Revision: https://reviews.llvm.org/D89909
2021-04-26 13:44:10 +03:00
Joshua Haberman 8344675908 Implemented [[clang::musttail]] attribute for guaranteed tail calls.
This is a Clang-only change and depends on the existing "musttail"
support already implemented in LLVM.

The [[clang::musttail]] attribute goes on a return statement, not
a function definition. There are several constraints that the user
must follow when using [[clang::musttail]], and these constraints
are verified by Sema.

Tail calls are supported on regular function calls, calls through a
function pointer, member function calls, and even pointer to member.

Future work would be to emit a warning if a user tries to pass
a pointer or reference to a local variable through a musttail call.
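A minimal example of the attribute as described above (illustrative only):

```
int target(int x);

int caller(int x) {
  // The attribute goes on the return statement; Sema verifies the
  // constraints mentioned above before accepting it.
  [[clang::musttail]] return target(x);
}
```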

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D99517
2021-04-15 17:12:21 -07:00
Saurabh Jha 71ab6c98a0 [Matrix] Implement C-style explicit type conversions for matrix types.
This implements C-style type conversions for matrix types, as specified
in clang/docs/MatrixTypes.rst.

Fixes PR47141.
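For illustration, a minimal sketch of such a cast (assumes -fenable-matrix and the matrix_type attribute from clang/docs/MatrixTypes.rst):

```
typedef int   ix4x4 __attribute__((matrix_type(4, 4)));
typedef float fx4x4 __attribute__((matrix_type(4, 4)));

fx4x4 to_float(ix4x4 m) {
  // C-style element-wise conversion between matrix types of the same shape.
  return (fx4x4)m;
}
```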

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D99037
2021-04-10 11:48:41 +01:00
Nikita Popov 42eb658f65 [OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC)
This removes some (but not all) uses of type-less CreateGEP()
and CreateInBoundsGEP() APIs, which are incompatible with opaque
pointers.

There are still a number of tricky uses left, as well as many
more CreateGEP API variations.
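A minimal sketch of the explicitly typed form these call sites move to (illustrative; the helper name is hypothetical):

```
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Emit &Base[Idx] with an explicit element type, so the call keeps working
// once pointer types become opaque.
Value *emitElementAddr(IRBuilder<> &B, Type *ElemTy, Value *Base, Value *Idx) {
  // Type-less form being removed: B.CreateGEP(Base, Idx);
  return B.CreateGEP(ElemTy, Base, Idx);
}
```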
2021-03-12 21:01:16 +01:00
Nikita Popov 68e01339cc [CGBuilder] Remove type-less CreateAlignedLoad() APIs (NFC)
These are incompatible with opaque pointers. This is in preparation
for dropping this API on the IRBuilder side as well.

Instead explicitly pass the loaded type.
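A minimal sketch of the explicitly typed replacement (illustrative; names are hypothetical):

```
#include "llvm/IR/IRBuilder.h"
#include "llvm/Support/Alignment.h"
using namespace llvm;

Value *loadElement(IRBuilder<> &B, Type *ElemTy, Value *Ptr, Align A) {
  // The loaded type is passed explicitly instead of being derived from the
  // pointee type of Ptr.
  return B.CreateAlignedLoad(ElemTy, Ptr, A);
}
```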
2021-03-11 10:41:23 +01:00
Yaxun (Sam) Liu 53d30381f5 Fix build failure due to dump()
Change-Id: I86b534223d63bf8bb8f49af5a64b300efbeba77b
2021-03-01 16:44:08 -05:00
Yaxun (Sam) Liu 5cf2a37f12 [HIP] Emit kernel symbol
Currently clang uses a stub function to launch a kernel. This is inconvenient
for interop with C++ programs, since the stub function has a different name
than the kernel, which is required by the ROCm debugger.

This patch emits a variable symbol which has the same name as the kernel
and uses it to register and launch the kernel. This allows a C++ program to
launch a kernel by using the original kernel name.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D86376
2021-03-01 16:31:40 -05:00
Melanie Blower e64fcdf8d5 [clang][patch] Inclusive language, modify filename SanitizerBlacklist.h to NoSanitizeList.h
This patch responds to a comment from @vitalybuka in D96203: suggestion to
do the change incrementally, and start by modifying this file name. I modified
the file name and made the other changes that follow from that rename.

Reviewers: vitalybuka, echristo, MaskRay, jansvoboda11, aaron.ballman

Differential Revision: https://reviews.llvm.org/D96974
2021-02-22 15:11:37 -05:00
Simon Pilgrim 55488bd3cd CGExpr - EmitMatrixSubscriptExpr - fix getAs<> null-dereference static analyzer warning. NFCI.
getAs<> can return null if the cast is invalid, which can lead to null pointer dereferences. Use castAs<> instead, which will assert that the cast is valid.
2021-01-05 17:08:11 +00:00
Thorsten Schütt 2fd11e0b1e Revert "[NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II)"
This reverts commit efc82c4ad2.
2021-01-04 23:17:45 +01:00
Thorsten Schütt efc82c4ad2 [NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II)
Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D93765
2021-01-04 22:58:26 +01:00
Juneyoung Lee 9b29610228 Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.

Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)

The order is swapped, but in terms of correctness it is still fine.
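For illustration, the two forms side by side (a sketch; names are hypothetical):

```
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

Value *splatLane0(IRBuilder<> &B, Value *Vec, unsigned NumElts) {
  SmallVector<int, 8> Mask(NumElts, 0);
  // Unary form: the second vector operand is implicitly a placeholder value.
  return B.CreateShuffleVector(Vec, Mask);
  // Equivalent older form being replaced:
  //   B.CreateShuffleVector(Vec, UndefValue::get(Vec->getType()), Mask);
}
```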

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93923
2020-12-30 22:36:08 +09:00
Tim Northover c5978f42ec UBSAN: emit distinctive traps
Sometimes people get minimal crash reports after a UBSAN incident. This change
tags each trap with an integer representing the kind of failure encountered,
which can aid in tracking down the root cause of the problem.
2020-12-08 10:28:26 +00:00
Yaxun (Sam) Liu 3a781b912f Fix assertion in tryEmitAsConstant
due to cd95338ee3

Need to check whether the result is an LValue before calling getLValueBase.
2020-12-02 19:10:01 -05:00
Yaxun (Sam) Liu cd95338ee3 [CUDA][HIP] Fix capturing reference to host variable
In C++, when a reference variable is captured by copy, the lambda
is supposed to make a copy of the referenced variable in its captures
and refer to the copy in the lambda. Therefore, it is valid to capture
a reference to a host global variable in a device lambda, since the
device lambda will refer to the copy of the host global variable instead
of accessing the host global variable directly.

However, clang tries to avoid capturing a reference to a host global variable
if it determines that the use of the reference variable in the lambda function is
not an odr-use. Clang also tries to emit a load of the reference to a global variable
as a load of the global variable if it determines that the reference variable is
a compile-time constant.

For a device lambda to capture a reference to a host global variable
and use the captured value, clang needs to be taught that in such cases the use of the reference
variable is an odr-use and the reference variable is not a compile-time constant.

This patch fixes that.
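A minimal sketch of the case being fixed (illustrative CUDA code; the names and launch details are hypothetical):

```
int host_global = 42;                 // host global variable

template <typename F>
__global__ void kernel(F f) { f(); }

void launch() {
  int &ref = host_global;             // reference to the host global
  // The device lambda captures ref by copy, so it must capture the value of
  // host_global rather than fold the access into a direct host-variable load.
  auto dev_lambda = [=] __device__() { (void)ref; };
  kernel<<<1, 1>>>(dev_lambda);
}
```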

Differential Revision: https://reviews.llvm.org/D91088
2020-12-02 10:14:46 -05:00
Kevin P. Neal 2069403cdf [FPEnv] Use strictfp metadata in casting nodes
The strictfp metadata was added to the casting AST nodes in D85960, but
we aren't using that metadata yet. This patch adds that support.

In order to avoid lots of ad-hoc passing around of the strictfp bits I
updated the IRBuilder when moving from a function that has the Expr* to a
function that lacks it. I believe we should switch to this pattern to keep
the strictfp support from being overly invasive.

For the purpose of testing that we're picking up the right metadata, I
also made my tests use a pragma to make the AST's strictfp metadata not
match the global strictfp metadata. This exposes issues that we need to
deal with in subsequent patches, and I believe this is the right method
for nearly all of our clang strictfp tests.
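For illustration, the kind of local FP-environment setup described above (a sketch; it assumes the standard FENV_ACCESS pragma, which may differ from the pragma the tests actually use):

```
#pragma STDC FENV_ACCESS ON
double widen(float x) {
  // With FP-environment access enabled here, this cast should be emitted with
  // strictfp/constrained-FP semantics even if the rest of the TU is not built
  // in a strict FP mode.
  return static_cast<double>(x);
}
```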

Differential Revision: https://reviews.llvm.org/D88913
2020-11-06 11:56:12 -05:00
Richard Smith ba4768c966 [c++20] For P0732R2 / P1907R1: Basic frontend support for class types as
non-type template parameters.

Create a unique TemplateParamObjectDecl instance for each such value,
representing the globally unique template parameter object to which the
template parameter refers.

No IR generation support yet; that will follow in a separate patch.
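A minimal example of the feature (C++20; illustrative only):

```
// A structural class type: all public members, no user-provided operator==.
struct Version {
  int major, minor;
};

template <Version V>
constexpr int major_of() { return V.major; }

// Each distinct value gets a unique template parameter object behind the scenes.
static_assert(major_of<Version{3, 14}>() == 3);
```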
2020-10-21 13:21:41 -07:00
Caroline Concatto 145e44bb18 [SVE]Fix implicit TypeSize casts in EmitCheckValue
Use TypeSize::getFixedSize() instead of relying upon the implicit
TypeSize-to-uint64_t cast, as the type is always fixed width.

Differential Revision: https://reviews.llvm.org/D89313
2020-10-15 13:25:46 +01:00
Bevin Hansson 9fa7f48459 [Fixed Point] Add fixed-point to floating point cast types and consteval.
Reviewed By: leonardchan

Differential Revision: https://reviews.llvm.org/D86631
2020-10-13 13:26:56 +02:00
Ties Stuij 208987844f [ARM] Follow AACPS standard for volatile bit-fields access width
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the width of their
declarative type. In such a case:
```
struct S1 {
  short a : 1;
}
```
should be accessed using loads and stores of the width
(sizeof(short)), whereas the compiler currently only loads
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicts with the C and C++ object models by creating
data races on memory locations that are not part of the bit-field,
e.g.
```
struct S2 {
  short a;
  int  b : 16;
}
```
Accessing `S2.b` would also access `S2.a`.

The AAPCS Release 2020Q2
(https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=)
section 8.1 Data Types, page 36, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths of bit-fields where the container
overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field
placed between two other bit-fields. This is because the C/C++ memory model defines these as being
separate memory locations, which can be accessed by two threads simultaneously. For this reason,
compilers must be permitted to use a narrower memory access width (including splitting the access into
multiple instructions) to avoid writing to a different memory location. For example, in
struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two
memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };,
writes to a or b must not overwrite each other.
```

I've updated the patch D16586 to follow such behavior by verifying that we
only change volatile bit-field access when:
 - it won't overlap with any other non-bit-field member
 - we only access memory inside the bounds of the record
 - it doesn't overlap any zero-length bit-field.

Regarding the number of memory accesses: preserving that as well will
be implemented by D67399.

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D72932
2020-10-13 10:31:48 +01:00
Vedant Kumar 06bc685fa2 [ubsan] nullability-arg: Fix crash on C++ member pointers
Extend -fsanitize=nullability-arg to handle call sites which accept C++
member pointers.

rdar://62476022

Differential Revision: https://reviews.llvm.org/D88336
2020-09-28 09:41:18 -07:00
Ties Stuij d6f3f61231 Revert "[ARM] Follow AACPS standard for volatile bit-fields access width"
This reverts commit 514df1b2bb.

Some of the buildbots got llvm-lit errors on CodeGen/volatile.c
2020-09-08 18:46:27 +01:00
Ties Stuij 514df1b2bb [ARM] Follow AACPS standard for volatile bit-fields access width
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the width of their
declarative type. In such a case:
```
struct S1 {
  short a : 1;
}
```
should be accessed using loads and stores of the width
(sizeof(short)), whereas the compiler currently only loads
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicts with the C and C++ object models by creating
data races on memory locations that are not part of the bit-field,
e.g.
```
struct S2 {
  short a;
  int  b : 16;
}
```
Accessing `S2.b` would also access `S2.a`.

The AAPCS Release 2020Q2
(https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=)
section 8.1 Data Types, page 36, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths of bit-fields where the container
overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field
placed between two other bit-fields. This is because the C/C++ memory model defines these as being
separate memory locations, which can be accessed by two threads simultaneously. For this reason,
compilers must be permitted to use a narrower memory access width (including splitting the access into
multiple instructions) to avoid writing to a different memory location. For example, in
struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two
memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };,
writes to a or b must not overwrite each other.
```

Patch D16586 was updated to follow such behavior by verifying that we
only change volatile bit-field access when:
 - it won't overlap with any other non-bit-field member
 - we only access memory inside the bounds of the record
 - it doesn't overlap any zero-length bit-field.

Regarding the number of memory accesses: preserving that as well will
be implemented by D67399.

Differential Revision: https://reviews.llvm.org/D72932

The following people contributed to this patch:
- Diogo Sampaio
- Ties Stuij
2020-09-08 17:49:49 +01:00
Christopher Tetreault 19e883fc59 [SVE] Remove calls to VectorType::getNumElements from clang
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D82582
2020-08-26 11:12:26 -07:00
Saiyedul Islam 160ff83765 [OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 3
Provides AMDGCN and NVPTX specific specialization of getGPUWarpSize,
getGPUThreadID, and getGPUNumThreads methods. Adds tests for AMDGCN
codegen for these methods in generic and simd modes. Also changes the
precondition in InitTempAlloca to be slightly more permissive. Useful for
AMDGCN OpenMP codegen where allocas are created with a cast to an
address space.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84260
2020-08-03 05:38:39 +00:00
David Blaikie 36036aa70e Reapply "Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression"
Reapply 49e5f603d4
which had been reverted in c94332919b.

Originally reverted because I hadn't updated it in quite a while when I
got around to committing it, so there were a bunch of missing changes to
new code since I'd written the patch.

Reviewers: aaron.ballman

Differential Revision: https://reviews.llvm.org/D76646
2020-07-21 20:57:12 -07:00
David Blaikie c94332919b Revert "Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression"
Broke buildbots since I hadn't updated this patch in a while. Sorry for
the noise.

This reverts commit 49e5f603d4.
2020-07-12 20:29:19 -07:00
David Blaikie 49e5f603d4 Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression
There is a version that just tests (also called
isIntegerConstantExpression), whereas this version is specifically used
when the value itself is of interest (a few call sites were actually refactored
to call the test-only version), so let's make the API name reflect
that.

Reviewers: aaron.ballman

Differential Revision: https://reviews.llvm.org/D76646
2020-07-12 19:43:24 -07:00
cchen 2da9572a9b [OPENMP50] extend array section for stride (Parsing/Sema/AST)
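For illustration, an array section using the extended lower-bound : length : stride form (a sketch; OpenMP 5.0):

```
void map_strided(int *a) {
  // Array section with lower-bound 0, length 4 and stride 2.
  #pragma omp target map(to: a[0:4:2])
  {
  }
}
```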
Reviewers: ABataev, jdoerfert

Reviewed By: ABataev

Subscribers: yaxunl, guansong, arphaman, sstefan1, cfe-commits, sandoval, dreachem

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82800
2020-07-09 13:28:51 -05:00
sstefan1 6aab27ba85 [OpenMPIRBuilder][Fix] Move llvm::omp::types to OpenMPIRBuilder.
Summary:
D82193 exposed a problem with global type definitions in
`OMPConstants.h`. This causes a race when running in thinLTO mode.
Types now live inside of OpenMPIRBuilder to prevent this from happening.

Reviewers: jdoerfert

Subscribers: yaxunl, hiraditya, guansong, dexonsmith, aaron.ballman, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D83176
2020-07-08 17:23:55 +02:00
Fady Ghanim 80e15b4574 [Clang][OpenMP][OMPBuilder] Moving OMP allocation and cache creation code to OMPBuilderCBHelpers
Summary:
Modified the OMPBuilderCBHelpers in the following ways:
- Moved location of class definition and deleted all constructors
- Moved OpenMP-specific address allocation of local variables
- Moved threadprivate variable creation for the current thread

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79676
2020-06-28 19:04:20 -04:00
Tyker 51e4aa87e0 attempt to fix failing buildbots after 3bab88b7ba
Prevent IR-gen from emitting consteval declarations

Summary: with this patch, instead of emitting calls to consteval functions, IR-gen will emit a store of the already-computed result.
2020-06-15 12:58:37 +02:00
Kirill Bobyrev 550c4562d1 Revert "Prevent IR-gen from emitting consteval declarations"
This reverts commit 3bab88b7ba.

This patch causes test failures:
http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/17260
2020-06-15 12:14:15 +02:00
Tyker 3bab88b7ba Prevent IR-gen from emitting consteval declarations
Summary: with this patch, instead of emitting calls to consteval functions, IR-gen will emit a store of the already-computed result.
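A minimal example of the effect (illustrative only):

```
consteval int square(int x) { return x * x; }

// IR-gen now emits a store of the already-computed constant 49 into n;
// no call to square() appears in the IR.
int n = square(7);
```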

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76420
2020-06-15 10:47:14 +02:00
Akira Hatanaka c9a52de002 [CodeGen] Simplify the way lifetime of block captures is extended
Rather than pushing inactive cleanups for the block captures at the
entry of a full expression and activating them during the creation of
the block literal, just call pushLifetimeExtendedDestroy to ensure the
cleanups are popped at the end of the scope enclosing the block
expression.

rdar://problem/63996471

Differential Revision: https://reviews.llvm.org/D81624
2020-06-11 16:06:22 -07:00
Florian Hahn 8f3f88d2f5 [Matrix] Implement matrix index expressions ([][]).
This patch implements matrix index expressions
(matrix[RowIdx][ColumnIdx]).

It does so by introducing a new MatrixSubscriptExpr(Base, RowIdx, ColumnIdx).
MatrixSubscriptExprs are built in 2 steps in ActOnMatrixSubscriptExpr. First,
if the base of a subscript is of matrix type, we create an incomplete
MatrixSubscriptExpr(base, idx, nullptr). Second, if the base is an incomplete
MatrixSubscriptExpr, we create a complete
MatrixSubscriptExpr(base->getBase(), base->getRowIdx(), idx).

Similar to vector elements, it is not possible to take the address of
a MatrixSubscriptExpr.
For CodeGen, a new MatrixElt type is added to LValue, which is very
similar to VectorElt. The only difference is that we may need to cast
the type of the base from an array to a vector type when accessing it.
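A minimal sketch of the new expression (assumes -fenable-matrix and the matrix_type attribute):

```
typedef float fx3x3 __attribute__((matrix_type(3, 3)));

float element(fx3x3 m, unsigned row, unsigned col) {
  // MatrixSubscriptExpr; taking its address (&m[row][col]) is not allowed.
  return m[row][col];
}
```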

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76791
2020-06-01 20:08:49 +01:00
Christopher Tetreault 796898172c [SVE] Eliminate calls to default-false VectorType::get() from Clang
Reviewers: efriedma, david-arm, fpetrogalli, ddunbar, rjmccall

Reviewed By: fpetrogalli, rjmccall

Subscribers: tschuett, rkruppe, psnobl, dmgreen, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80323
2020-06-01 10:02:14 -07:00
Eli Friedman 62f3ef2b53 [CGCall] Annotate references with "align" attribute.
If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.

See also D80072.
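For illustration (a sketch): a reference parameter whose IR argument, with this change, should be marked as aligned to alignof(int) in addition to dereferenceable(sizeof(int)):

```
int read(const int &r) {
  return r;   // the reference is assumed both dereferenceable and aligned
}
```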

Differential Revision: https://reviews.llvm.org/D80166
2020-05-19 20:21:30 -07:00
Anastasia Stulova a6a237f204 [OpenCL] Added addrspace_cast operator in C++ mode.
This operator is intended for casting between
pointers to objects in different address spaces
and follows similar logic as const_cast in C++.
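A minimal sketch of the intended usage in C++ for OpenCL (illustrative only):

```
// Like const_cast, only the address space changes, not the pointee type.
__global int *to_global(__generic int *p) {
  return addrspace_cast<__global int *>(p);
}
```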

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60193
2020-05-18 12:07:54 +01:00
Eli Friedman 11aa3707e3 StoreInst should store Align, not MaybeAlign
This is D77454, except for stores.  All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.

Differential Revision: https://reviews.llvm.org/D79968
2020-05-15 12:26:58 -07:00
Florian Hahn 1065869195 [Matrix] Add matrix type to Clang.
This patch adds a matrix type to Clang as described in the draft
specification in clang/docs/MatrixSupport.rst. It introduces a new option
-fenable-matrix, which can be used to enable the matrix support.

The patch adds new MatrixType and DependentSizedMatrixType types along
with the plumbing required. Loads of and stores to pointers to matrix
values are lowered to memory operations on 1-D IR arrays. After loading,
the loaded values are cast to a vector. This ensures matrix values use
the alignment of the element type, instead of LLVM's large vector
alignment.

The operators and builtins described in the draft spec will be added in
follow-up patches.

Reviewers: martong, rsmith, Bigcheese, anemet, dexonsmith, rjmccall, aaron.ballman

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D72281
2020-05-11 18:55:45 +01:00
Ayke van Laethem 215dc2e203 [AVR] Use the correct address space for non-prototyped function calls
Some function declarations like this:

    void foo();

do not have a prototype; for that you'd use:

    void foo(void);

Clang internally bitcasts the variadic function declaration to a
function pointer, but doesn't use the correct address space on AVR. This
commit fixes that.

This fix is necessary to let Clang compile compiler-rt for AVR.

Differential Revision: https://reviews.llvm.org/D78125
2020-04-15 23:44:51 +02:00
Richard Smith bab6df86ae Rework how UuidAttr, CXXUuidofExpr, and GUID template arguments and constants are represented.
Summary:
Previously, we treated CXXUuidofExpr as quite a special case: it was the
only kind of expression that could be a canonical template argument, it
could be a constant lvalue base object, and so on. In addition, we
represented the UUID value as a string, whose source form we did not
preserve faithfully, and that we partially parsed in multiple different
places.

With this patch, we create an MSGuidDecl object to represent the
implicit object of type 'struct _GUID' created by a UuidAttr. Each
UuidAttr holds a pointer to its 'struct _GUID' and its original
(as-written) UUID string. A non-value-dependent CXXUuidofExpr behaves
like a DeclRefExpr denoting that MSGuidDecl object. We cache an APValue
representation of the GUID on the MSGuidDecl and use it from constant
evaluation where needed.

This allows removing a lot of the special-case logic to handle these
expressions. Unfortunately, many parts of Clang assume there are only
a couple of interesting kinds of ValueDecl, so the total amount of
special-case logic is not really reduced very much.

This fixes a few bugs and issues:
 * PR38490: we now support reading from GUID objects returned from
   __uuidof during constant evaluation.
 * Our Itanium mangling for a non-instantiation-dependent template
   argument involving __uuidof no longer depends on which CXXUuidofExpr
   template argument we happened to see first.
 * We now predeclare ::_GUID, and permit use of __uuidof without
   any header inclusion, better matching MSVC's behavior. We do not
   predefine ::__s_GUID, though; that seems like a step too far.
 * Our IR representation for GUID constants now uses the correct IR type
   wherever possible. We will still fall back to using the
      {i32, i16, i16, [8 x i8]}
   layout if a definition of struct _GUID is not available. This is not
   ideal: in principle the two layouts could have different padding.

Reviewers: rnk, jdoerfert

Subscribers: arphaman, cfe-commits, aeubanks

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78171
2020-04-15 12:20:42 -07:00
Benjamin Kramer 6f64daca8f Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints
No functionality change intended.
2020-04-15 12:51:38 +02:00
Christopher Tetreault f22fbe3a15 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sdesmalen, efriedma, krememek

Reviewed By: sdesmalen, efriedma

Subscribers: dexonsmith, Charusso, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77257
2020-04-13 13:01:40 -07:00
Raul Tambre 878d96011a [clang][CodeGen] Handle throw expression in conditional operator constant folding
Summary:
We're smart and do constant folding when emitting conditional operators.
Thus we emit the live value as an lvalue. This doesn't work if the live value is a throw expression.
Handle this by emitting the throw and returning the dead value as the lvalue.

Fixes PR28184.
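An illustrative reduction of the kind of case involved (a sketch, not the exact PR test):

```
int f() {
  int x = 0;
  // The conditional operator is constant-folded on its literal condition, so
  // the selected ("live") operand is the throw expression, which previously
  // crashed when emitted as an lvalue.
  int &r = true ? throw 0 : x;
  return r;
}
```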

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77502
2020-04-08 12:32:21 -07:00
Eli Friedman 1ee6ec2bf3 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
Richard Smith f18233dad4 Fix -fsanitize=array-bound to treat T[0] union members as flexible array
members regardless of whether they're the last member of the union.
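For illustration, the shape of type now handled (a sketch):

```
union U {
  int tail[0];   // zero-length array member: treated as a flexible array
  int other;     // ...even though it is not the last member of the union
};
```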
2020-03-18 15:47:24 -07:00
Michael Liao 4cf01ed75e [hip] Revise `GlobalDecl` constructors. NFC.
Summary:
- https://reviews.llvm.org/D68578 revises the `GlobalDecl` constructors
  to ensure all GPU kernels have `ReferenceKernelKind` initialized
  properly with an explicit constructor and a static one. But there are
  lots of places using the implicit constructor, triggering the assertion
  on non-GPU kernels. That was found in the compilation of many tests and
  workloads.
- Fixing all of them may change more code and, more importantly, all of
  them assume the default kernel reference kind. This patch changes
  that constructor to detect `CUDAGlobalAttr` and construct `GlobalDecl`
  properly.

Reviewers: yaxunl

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76344
2020-03-18 09:33:39 -04:00
Akira Hatanaka 40568fec7e [CodeGen] Emit destructor calls to destruct compound literals
Fix a bug in IRGen where it wasn't destructing compound literals in C
that are ObjC pointer arrays or non-trivial structs. Also diagnose jumps
that enter or exit the lifetime of the compound literals.

rdar://problem/51867864

Differential Revision: https://reviews.llvm.org/D64464
2020-03-10 14:08:28 -07:00
Yaxun (Sam) Liu 22c457a869 [HIP] Fix device stub name
HIP emits a device stub function for each kernel in host code.

The HIP debugger requires the device stub function to have a different unmangled name than the kernel.

Currently the name of the device stub function is the mangled name with a postfix .stub. However,
this does not work with the HIP debugger since the unmangled name is the same as the kernel.

This patch adds the prefix __device__stub__ to the unmangled name of the device stub before mangling,
so the device stub function has a valid mangled name which is different from the device kernel
name. The device-side kernel name is kept unchanged. Kernels with extern "C" also get the prefix added
to the corresponding device stub function.

Differential Revision: https://reviews.llvm.org/D68578
2020-03-09 16:40:05 -04:00
Yaxun (Sam) Liu 29e1a16be8 [NFC] Let mangler accept GlobalDecl
Differential Revision: https://reviews.llvm.org/D75700
2020-03-07 23:51:41 -05:00
Reid Kleckner 86565c1309 Avoid SourceManager.h include in RawCommentList.h, add missing incs
SourceManager.h includes FileManager.h, which is expensive due to
dependencies on LLVM FS headers.

Remove dead BeforeThanCompare specialization.

Sink ASTContext::addComment to cpp file.

This reduces the time to compile a file that does nothing but include
ASTContext.h from ~3.4s to ~2.8s for me.

Saves these includes:
    219 -    ../clang/include/clang/Basic/SourceManager.h
    204 -    ../clang/include/clang/Basic/FileSystemOptions.h
    204 -    ../clang/include/clang/Basic/FileManager.h
    165 -    ../llvm/include/llvm/Support/VirtualFileSystem.h
    164 -    ../llvm/include/llvm/Support/SourceMgr.h
    164 -    ../llvm/include/llvm/Support/SMLoc.h
    161 -    ../llvm/include/llvm/Support/Path.h
    141 -    ../llvm/include/llvm/ADT/BitVector.h
    128 -    ../llvm/include/llvm/Support/MemoryBuffer.h
    124 -    ../llvm/include/llvm/Support/FileSystem.h
    124 -    ../llvm/include/llvm/Support/Chrono.h
    124 -    .../MSVCSTL/include/stack
    122 -    ../llvm/include/llvm-c/Types.h
    122 -    ../llvm/include/llvm/Support/NativeFormatting.h
    122 -    ../llvm/include/llvm/Support/FormatProviders.h
    122 -    ../llvm/include/llvm/Support/CBindingWrapping.h
    122 -    .../MSVCSTL/include/xtimec.h
    122 -    .../MSVCSTL/include/ratio
    122 -    .../MSVCSTL/include/chrono
    121 -    ../llvm/include/llvm/Support/FormatVariadicDetails.h
    118 -    ../llvm/include/llvm/Support/MD5.h
    109 -    .../MSVCSTL/include/deque
    105 -    ../llvm/include/llvm/Support/Host.h
    105 -    ../llvm/include/llvm/Support/Endian.h

Reviewed By: aaron.ballman, hans

Differential Revision: https://reviews.llvm.org/D75279
2020-02-27 13:49:40 -08:00
Diogo Sampaio 9d869180c4 [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields
Summary:
Following the AAPCS, every store to a volatile bit-field requires generating one load of that field, even if all the bits are going to be replaced.
This patch allows the user to opt in to following such a rule.

AAPCS Release 2019Q1.1 (https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, paragraph: Volatile bit-fields – preserving number and width of container accesses

```
When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
container must be read exactly once and written exactly once using the access width appropriate to the
type of the container. The two accesses are not atomic.

```

Reviewers: lebedev.ri, ostannard, jfb, eli.friedman

Reviewed By: jfb

Subscribers: rsmith, rjmccall, dexonsmith, kristof.beyls, jfb, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67399
2020-02-07 10:11:54 +00:00
Yonghong Song 9271cab270 [BPF] use base lvalue type for preserve_{struct,union}_access_index metadata
Linux commit
  1cf5b23988 (diff-289313b9fec99c6f0acfea19d9cfd949)
uses "#pragma clang attribute push (__attribute__((preserve_access_index)),
      apply_to = record)"
to apply CO-RE relocations to all records including the following pattern:
  #pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
  typedef struct {
    int a;
  } __t;
  #pragma clang attribute pop
  int test(__t *arg) { return arg->a; }

The current approach of using the struct type in the relocation record will
result in an anonymous struct, which makes later type matching difficult
in the bpf loader. In fact, the current BPF backend will fail the above program
with an assertion:
  clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:796: ...
     Assertion `TypeName.size()' failed.

The patch uses the base lvalue type for the "base" value to annotate
preserve_{struct,union}_access_index intrinsics. In the above example,
the type will be "__t", which preserves the type name.

Differential Revision: https://reviews.llvm.org/D73900
2020-02-04 09:28:30 -08:00
Richard Smith 0130b6cb5a Don't assume a reference refers to at least sizeof(T) bytes.
When T is a class type, only nvsize(T) bytes need be accessible through
the reference. We had matching bugs in the application of the
dereferenceable attribute and in -fsanitize=undefined.
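An illustrative case where only nvsize(T) bytes are guaranteed, because derived classes may reuse the base's tail padding (a sketch):

```
struct Base {
  virtual ~Base();
  char c;           // nvsize(Base) ends here; sizeof(Base) includes padding
};

struct Derived : Base {
  char d;           // may be placed inside Base's tail padding
};

// Only nvsize(Base) bytes may be assumed accessible through b.
int read(Base &b) { return b.c; }
```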
2020-01-31 19:08:17 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
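For illustration, the now-required explicit conversion (a sketch):

```
#include "llvm/ADT/StringRef.h"
#include <string>

std::string toOwned(llvm::StringRef S) {
  // The implicit conversion (std::string Copy = S;) no longer compiles;
  // the conversion has to be spelled out.
  return S.str();          // or: std::string(S)
}
```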
2020-01-28 23:25:25 +01:00
Diogo Sampaio 2147703bde Revert "[ARM] Follow AACPS standard for volatile bit-fields access width"
This reverts commit 6a24339a45.
Submitted using the IDE button by mistake.
2020-01-21 15:31:33 +00:00
Diogo Sampaio 6a24339a45 [ARM] Follow AACPS standard for volatile bit-fields access width
Summary:
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the width of their
declarative type. In such a case:
```
struct S1 {
  short a : 1;
}
```
should be accessed using loads and stores of the width
(sizeof(short)), whereas the compiler currently only loads
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicts with the C and C++ object models by creating
data races on memory locations that are not part of the bit-field,
e.g.
```
struct S2 {
  short a;
  int  b : 16;
}
```
Accessing `S2.b` would also access `S2.a`.

The AAPCS Release 2019Q1.1
(https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths
of bit-fields where the container overlaps with a non-bit-field member.
 This is because the C/C++ memory model defines these as being separate
memory locations, which can be accessed by two threads
 simultaneously. For this reason, compilers must be permitted to use a
narrower memory access width (including splitting the access
 into multiple instructions) to avoid writing to a different memory location.
```

I've updated the patch D16586 to follow such behavior by verifying that we
only change volatile bit-field access when:
 - it won't overlap with any other non-bit-field member
 - we only access memory inside the bounds of the record

Regarding the number of memory accesses: preserving that as well will
be implemented by D67399.

Reviewers: rsmith, rjmccall, eli.friedman, ostannard

Subscribers: ostannard, kristof.beyls, cfe-commits, carwil, olista01

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72932
2020-01-21 15:23:38 +00:00
serge-sans-paille d437fba8ef Reapply Allow system header to provide their own implementation of some builtin
This reverts commit 3d210ed3d1.

See https://reviews.llvm.org/D71082 for the patch and discussion that make it
possible to reapply this patch.
2020-01-17 09:58:32 +01:00
Amy Huang 3d210ed3d1 Revert "Allow system header to provide their own implementation of some builtin"
This reverts commit 921f871ac4 because it
causes libc++ code to trigger __warn_memset_zero_len.

See https://reviews.llvm.org/D71082.
2020-01-15 15:03:45 -08:00
Simon Pilgrim 16c53ffcb9 Fix "pointer is null" static analyzer warnings. NFCI.
Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
2020-01-11 16:02:23 +00:00
serge-sans-paille 921f871ac4 Allow system header to provide their own implementation of some builtin
If a system header provides an (inline) implementation of some of its
functions, clang still matches on the function name and generates the appropriate
llvm builtin, e.g. memcpy. This behavior is in line with the glibc recommendation «
users may not provide their own version of symbols » but doesn't account for the
fact that glibc itself can provide inline versions of some functions.

This is the case for the memcpy function when -D_FORTIFY_SOURCE=1 is on. In that
case an inline version of memcpy calls __memcpy_chk, a function that performs
extra runtime checks. Clang currently ignores the inline version and thus
provides no runtime checks.

This patch fixes the issue by detecting functions whose name is a builtin name
but that also have an inline implementation.

Differential Revision: https://reviews.llvm.org/D71082
2020-01-10 09:44:20 +01:00
Alexey Bataev 7b518dcb29 [OPENMP50]Support lastprivate conditional updates in inc/dec unary ops.
Added support for checking updates of variables used in unary
pre/post increment/decrement expressions.
2020-01-06 16:37:01 -05:00
Alexey Bataev a58da1a2ff [OPENMP50]Codegen for lastprivate conditional list items.
Added codegen support for lastprivate conditional. According to the
standard, when the conditional modifier appears on the clause and an
assignment to a list item is encountered in the construct, the
original list item is assigned the value that is assigned to the new
list item in the sequentially last iteration or lexically last section
in which such an assignment is encountered.
We look for the assignment operations and check whether the left side
references a lastprivate conditional variable. Then the following code is
emitted:
if (last_iv_a <= iv) {
  last_iv_a = iv;
  last_a = lp_a;
}

At the end, an implicit barrier is generated to wait for all threads to
finish, and then, in the check for the last iteration, the private copy is
assigned the last value.

if (last_iter) {
  lp_a = last_a; // <--- new code
  a = lp_a;      // <--- store of private value to the original  variable.
}
2020-01-02 16:43:00 -05:00
Alexey Bataev 0860db966a [OPENMP50]Codegen for nontemporal clause.
Summary:
Basic codegen for the declarations marked as nontemporal. Also, if the
base declaration in the member expression is marked as nontemporal, the
lvalue for the member decl access inherits the nontemporal flag from the base
lvalue.
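A minimal example of the clause whose codegen is added (OpenMP 5.0; illustrative only):

```
float a[1024], b[1024];

void scale() {
  // Accesses whose base declaration is a or b can be emitted with
  // nontemporal hints.
  #pragma omp simd nontemporal(a, b)
  for (int i = 0; i < 1024; ++i)
    a[i] = 2.0f * b[i];
}
```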

Reviewers: rjmccall, hfinkel, jdoerfert

Subscribers: guansong, arphaman, caomhin, kkwli0, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D71708
2019-12-23 10:04:46 -05:00
Akira Hatanaka d8136f14f1 [CodeGen][ObjC] Emit a primitive store to store a __strong field in
ExpandTypeFromArgs

This fixes a bug in IRGen where a call to `llvm.objc.storeStrong` was
being emitted to initialize a __strong field of an uninitialized
temporary struct, which caused crashes at runtime.

rdar://problem/51807365
2019-12-03 23:44:30 -08:00
Akira Hatanaka f139ae3d93 [NFC] Pass a reference to CodeGenFunction to methods of LValue and
AggValueSlot

This reapplies 8a5b7c3570 after fixing a null
dereference bug in CGOpenMPRuntime::emitUserDefinedMapper.

Original commit message:

This is needed for the pointer authentication work we plan to do in the
near future.

a63a81bd99/clang/docs/PointerAuthentication.rst
2019-12-03 15:22:13 -08:00
Akira Hatanaka 9f37c0e703 Revert "[NFC] Pass a reference to CodeGenFunction to methods of LValue and"
This reverts commit 8a5b7c3570. This seems
to have broken UBSan because of a null dereference.
2019-12-03 13:08:01 -08:00
Akira Hatanaka 8a5b7c3570 [NFC] Pass a reference to CodeGenFunction to methods of LValue and
AggValueSlot

This is needed for the pointer authentication work we plan to do in the
near future.

a63a81bd99/clang/docs/PointerAuthentication.rst
2019-12-03 11:30:09 -08:00
Peter Collingbourne 90b8bc003c IRGen: Call SetLLVMFunctionAttributes{,ForDefinition} on __cfi_check_fail.
This has the main effect of causing target-cpu and target-features to be set
on __cfi_check_fail, causing the function to become ABI-compatible with other
functions in the case where these attributes affect ABI (e.g. reserve-x18).

Technically we only need to call SetLLVMFunctionAttributes to get the target-*
attributes set, but since we're creating a definition we probably ought to
call the ForDefinition function as well.

Fixes PR44094.

Differential Revision: https://reviews.llvm.org/D70692
2019-11-25 15:16:43 -08:00
Tyker b0561b3346 [NFC] Refactor representation of materialized temporaries
Summary:
this patch refactors the representation of materialized temporaries to prevent an issue raised by rsmith in https://reviews.llvm.org/D63640#inline-612718

Reviewers: rsmith, martong, shafik

Reviewed By: rsmith

Subscribers: thakis, sammccall, ilya-biryukov, rnkovacs, arphaman, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D69360
2019-11-19 18:20:45 +01:00
Nico Weber c9276fbfdf Revert "[NFC] Refactor representation of materialized temporaries"
This reverts commit 08ea1ee2db.
It broke ./ClangdTests/FindExplicitReferencesTest.All
on the bots, see comments on https://reviews.llvm.org/D69360
2019-11-17 02:09:25 -05:00
Tyker 08ea1ee2db [NFC] Refactor representation of materialized temporaries
Summary:
this patch refactors the representation of materialized temporaries to prevent an issue raised by rsmith in https://reviews.llvm.org/D63640#inline-612718

Reviewers: rsmith, martong, shafik

Reviewed By: rsmith

Subscribers: rnkovacs, arphaman, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D69360
2019-11-16 17:56:09 +01:00
Yonghong Song dd16b3fe25 [BPF] Restrict preserve_access_index attribute to C only
This patch is a follow-up for commit 4e2ce228ae
  [BPF] Add preserve_access_index attribute for record definition
to restrict the attribute to C only. A new test case is added
to check for this restriction.

Additional code polishing is done based on
Aaron Ballman's suggestion in https://reviews.llvm.org/D69759/new/.

Differential Revision: https://reviews.llvm.org/D70257
2019-11-14 14:14:59 -08:00
Yonghong Song 4e2ce228ae [BPF] Add preserve_access_index attribute for record definition
This is a resubmission of the previously reverted commit
9434360401 with the same subject. This commit fixed the
segfault issue and addressed additional review comments.

This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
  struct s { ... } __attribute__((preserve_access_index));
  union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user code for cases
where preserving the access index is desired for certain structs/unions,
so the user does not need to use clang's __builtin_preserve_access_index
for every member access.

The attribute has no effect if -g is not specified.

When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
  __builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.

The following is an example to illustrate the usage:
  -bash-4.4$ cat t.c
  #define __reloc__ __attribute__((preserve_access_index))
  struct s1 {
    int c;
  } __reloc__;

  struct s2 {
    union {
      struct s1 b[3];
    };
  } __reloc__;

  struct s3 {
    struct s2 a;
  } __reloc__;

  int test(struct s3 *arg) {
    return arg->a.b[2].c;
  }
  -bash-4.4$ clang -target bpf -g -S -O2 t.c

A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.

A forward declaration with the attribute is also handled properly, such
that the attribute is copied and populated in the real record definition.

Differential Revision: https://reviews.llvm.org/D69759
2019-11-13 08:23:44 -08:00
Yonghong Song 9434360401 Revert "[BPF] Add preserve_access_index attribute for record definition"
This reverts commit 4a5aa1a7bf.

There are some other test failures. Investigate them first.
2019-11-09 08:32:44 -08:00
Yonghong Song 4a5aa1a7bf [BPF] Add preserve_access_index attribute for record definition
This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
  struct s { ... } __attribute__((preserve_access_index));
  union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user code for cases
where preserving the access index is desired for certain structs/unions,
so the user does not need to use clang's __builtin_preserve_access_index
for every member access.

The attribute has no effect if -g is not specified.

When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
  __builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.

The following is an example to illustrate the usage:
  -bash-4.4$ cat t.c
  #define __reloc__ __attribute__((preserve_access_index))
  struct s1 {
    int c;
  } __reloc__;

  struct s2 {
    union {
      struct s1 b[3];
    };
  } __reloc__;

  struct s3 {
    struct s2 a;
  } __reloc__;

  int test(struct s3 *arg) {
    return arg->a.b[2].c;
  }
  -bash-4.4$ clang -target bpf -g -S -O2 t.c

A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.

A forward declaration with the attribute is also handled properly, such
that the attribute is copied and populated in the real record definition.

Differential Revision: https://reviews.llvm.org/D69759
2019-11-09 08:17:12 -08:00
Craig Topper 910718bd03 [opaque pointer types] Add element type argument to IRBuilder CreatePreserveStructAccessIndex and CreatePreserveArrayAccessIndex
Summary:
These were the only remaining users of the GetElementPtrInst::getGEPReturnType
method that gets the element type from the pointer type.

Remove that method since it's now dead.

Reviewers: jyknight, t.p.northover, arsenm

Reviewed By: arsenm

Subscribers: wdng, arsenm, arphaman, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69756
2019-11-03 10:27:18 -08:00
Richard Smith 778dc0f1d4 [c++20] Add CXXRewrittenBinaryOperator to represent a comparison
operator that is rewritten as a call to multiple other operators.

No functionality change yet: nothing creates these expressions.
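For illustration, the kind of comparison that will eventually be represented this way (C++20; a sketch):

```
#include <compare>

struct Point {
  int x, y;
  auto operator<=>(const Point &) const = default;
};

// No operator< is declared; in C++20 the expression is rewritten to
// "(a <=> b) < 0", which CXXRewrittenBinaryOperator is meant to represent.
bool less(const Point &a, const Point &b) { return a < b; }
```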

llvm-svn: 375305
2019-10-19 00:04:38 +00:00
Dmitry Mikulin 15984457a6 Revert Tag CFI-generated data structures with "#pragma clang section" attributes.
This reverts r375022 (git commit e2692b3bc0)

llvm-svn: 375069
2019-10-17 00:55:38 +00:00
Dmitry Mikulin e2692b3bc0 Tag CFI-generated data structures with "#pragma clang section" attributes.
Differential Revision: https://reviews.llvm.org/D68808

llvm-svn: 375022
2019-10-16 17:51:40 +00:00
Yonghong Song 05e46979d2 [BPF] do compile-once run-everywhere relocation for bitfields
A bpf specific clang intrinsic is introduced:
   u32 __builtin_preserve_field_info(member_access, info_kind)
Depending on info_kind, different information will
be returned to the program. A relocation is also
recorded for this builtin so that bpf loader can
patch the instruction on the target host.
This clang intrinsic is used to get certain information
to facilitate struct/union member relocations.

The offset relocation is extended by 4 bytes to
include relocation kind.
Currently supported relocation kinds are
 enum {
    FIELD_BYTE_OFFSET = 0,
    FIELD_BYTE_SIZE,
    FIELD_EXISTENCE,
    FIELD_SIGNEDNESS,
    FIELD_LSHIFT_U64,
    FIELD_RSHIFT_U64,
 };
for __builtin_preserve_field_info. The old
access offset relocation is covered by
    FIELD_BYTE_OFFSET = 0.

An example:
struct s {
    int a;
    int b1:9;
    int b2:4;
};
enum {
    FIELD_BYTE_OFFSET = 0,
    FIELD_BYTE_SIZE,
    FIELD_EXISTENCE,
    FIELD_SIGNEDNESS,
    FIELD_LSHIFT_U64,
    FIELD_RSHIFT_U64,
};

void bpf_probe_read(void *, unsigned, const void *);
int field_read(struct s *arg) {
  unsigned long long ull = 0;
  unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET);
  unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE);
 #ifdef USE_PROBE_READ
  bpf_probe_read(&ull, size, (const void *)arg + offset);
  unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
 #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  lshift = lshift + (size << 3) - 64;
 #endif
 #else
  switch(size) {
  case 1:
    ull = *(unsigned char *)((void *)arg + offset); break;
  case 2:
    ull = *(unsigned short *)((void *)arg + offset); break;
  case 4:
    ull = *(unsigned int *)((void *)arg + offset); break;
  case 8:
    ull = *(unsigned long long *)((void *)arg + offset); break;
  }
  unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
 #endif
  ull <<= lshift;
  if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS))
    return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
  return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
}

There is a minor overhead for bpf_probe_read() on big endian.

The code and relocation generated for field_read where bpf_probe_read() is
used to access argument data on little endian mode:
        r3 = r1
        r1 = 0
        r1 = 4  <=== relocation (FIELD_BYTE_OFFSET)
        r3 += r1
        r1 = r10
        r1 += -8
        r2 = 4  <=== relocation (FIELD_BYTE_SIZE)
        call bpf_probe_read
        r2 = 51 <=== relocation (FIELD_LSHIFT_U64)
        r1 = *(u64 *)(r10 - 8)
        r1 <<= r2
        r2 = 60 <=== relocation (FIELD_RSHIFT_U64)
        r0 = r1
        r0 >>= r2
        r3 = 1  <=== relocation (FIELD_SIGNEDNESS)
        if r3 == 0 goto LBB0_2
        r1 s>>= r2
        r0 = r1
LBB0_2:
        exit

Compared to the above code between the FIELD_LSHIFT_U64 and
FIELD_RSHIFT_U64 relocations, the code in big endian mode has four more
instructions.
        r1 = 41   <=== relocation (FIELD_LSHIFT_U64)
        r6 += r1
        r6 += -64
        r6 <<= 32
        r6 >>= 32
        r1 = *(u64 *)(r10 - 8)
        r1 <<= r6
        r2 = 60   <=== relocation (FIELD_RSHIFT_U64)

The code and relocation generated when using direct load.
        r2 = 0
        r3 = 4
        r4 = 4
        if r4 s> 3 goto LBB0_3
        if r4 == 1 goto LBB0_5
        if r4 == 2 goto LBB0_6
        goto LBB0_9
LBB0_6:                                 # %sw.bb1
        r1 += r3
        r2 = *(u16 *)(r1 + 0)
        goto LBB0_9
LBB0_3:                                 # %entry
        if r4 == 4 goto LBB0_7
        if r4 == 8 goto LBB0_8
        goto LBB0_9
LBB0_8:                                 # %sw.bb9
        r1 += r3
        r2 = *(u64 *)(r1 + 0)
        goto LBB0_9
LBB0_5:                                 # %sw.bb
        r1 += r3
        r2 = *(u8 *)(r1 + 0)
        goto LBB0_9
LBB0_7:                                 # %sw.bb5
        r1 += r3
        r2 = *(u32 *)(r1 + 0)
LBB0_9:                                 # %sw.epilog
        r1 = 51
        r2 <<= r1
        r1 = 60
        r0 = r2
        r0 >>= r1
        r3 = 1
        if r3 == 0 goto LBB0_11
        r2 s>>= r1
        r0 = r2
LBB0_11:                                # %sw.epilog
        exit

Considering that the verifier is able to do limited constant
propagation following branches, the following is the
code actually traversed.
        r2 = 0
        r3 = 4   <=== relocation
        r4 = 4   <=== relocation
        if r4 s> 3 goto LBB0_3
LBB0_3:                                 # %entry
        if r4 == 4 goto LBB0_7
LBB0_7:                                 # %sw.bb5
        r1 += r3
        r2 = *(u32 *)(r1 + 0)
LBB0_9:                                 # %sw.epilog
        r1 = 51   <=== relocation
        r2 <<= r1
        r1 = 60   <=== relocation
        r0 = r2
        r0 >>= r1
        r3 = 1
        if r3 == 0 goto LBB0_11
        r2 s>>= r1
        r0 = r2
LBB0_11:                                # %sw.epilog
        exit

For the native load case, the load size is calculated to be the
same as the load width LLVM would otherwise use to load
the value from which the bitfield value is then extracted.

Differential Revision: https://reviews.llvm.org/D67980

llvm-svn: 374099
2019-10-08 18:23:17 +00:00
Simon Pilgrim 7e38f0c408 Codegen - silence static analyzer getAs<> null dereference warnings. NFCI.
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<> directly, and if not, the assert will fire for us.

llvm-svn: 373918
2019-10-07 16:42:25 +00:00
Guillaume Chatelet c79099e0f4 [Alignment][Clang][NFC] Add CharUnits::getAsAlign
Summary:
This is a prerequisite to removing `llvm::GlobalObject::setAlignment(unsigned)`.
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D68274

llvm-svn: 373592
2019-10-03 13:00:29 +00:00
Guillaume Chatelet ab11b9188d [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68141

llvm-svn: 373207
2019-09-30 13:34:44 +00:00
Richard Smith 00223827a9 Improve code generation for thread_local variables:
Summary:
 * Don't bother using a thread wrapper when the variable is known to
   have constant initialization.
 * Emit the thread wrapper as discardable-if-unused in TUs that don't
   contain a definition of the thread_local variable.
 * Don't emit the thread wrapper at all if the thread_local variable
   is unused and discardable; it will be emitted by all TUs that need
   it.

Reviewers: rjmccall, jdoerfert

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67429

llvm-svn: 371767
2019-09-12 20:00:24 +00:00
Alexey Bataev b5890a329a Fix for PR43175: compiler crash when trying to emit noncapturable
constant.

If the constexpr variable is partially initialized, the initializer can
be emitted as a structure, not as an array, because of some early
optimizations. The llvm variable gets its type from this constant and,
thus, gets a type which is a pointer to a struct rather than a pointer to an
array. We need to convert this type back to an array type, otherwise it may
lead to a compiler crash when trying to emit an array subscript
expression.

llvm-svn: 371548
2019-09-10 19:16:56 +00:00
Aaron Ballman 0299dbd2ae Implement codegen for MSVC unions with reference members.
Currently, clang accepts a union with a reference member when given the -fms-extensions flag. This change fixes the codegen for this case.
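A minimal sketch of the accepted case (requires -fms-extensions; illustrative only):

```
// With -fms-extensions, clang accepts a reference member inside a union.
union U {
  int &ref;
};

int get(U &u) { return u.ref; }
```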

Patch by Dominic Ferreira.

llvm-svn: 370052
2019-08-27 12:42:45 +00:00
Vitaly Buka 669d111c52 hwasan, codegen: Keep more lifetime markers used for hwasan
Reviewers: eugenis

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D66697

llvm-svn: 369980
2019-08-26 22:16:05 +00:00
Vitaly Buka aeca56964f msan, codegen, instcombine: Keep more lifetime markers used for msan
Reviewers: eugenis

Subscribers: hiraditya, cfe-commits, #sanitizers, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D66695

llvm-svn: 369979
2019-08-26 22:15:50 +00:00
Peter Collingbourne 2452d7030b IR. Change strip* family of functions to not look through aliases.
I noticed another instance of the issue where references to aliases were
being replaced with aliasees, this time in InstCombine. In the instance that
I saw, it turned out to be only a QoI issue (a symbol ended up missing
from the symbol table because the last reference to the alias was removed,
preventing HWASAN from symbolizing a global reference), but it could easily
have manifested as incorrect behaviour.

Since this is the third such issue encountered (previously: D65118, D65314),
it seems to be time to address this common error/QoI issue once and for all
and make the strip* family of functions not look through aliases.
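
A rough, self-contained sketch of the resulting split (assuming the stripPointerCastsAndAliases() entry point that accompanies this change; not code from the patch):

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/GlobalAlias.h"
  #include "llvm/IR/GlobalVariable.h"
  #include "llvm/IR/LLVMContext.h"
  #include "llvm/IR/Module.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  int main() {
    LLVMContext Ctx;
    Module M("demo", Ctx);
    Type *I32 = Type::getInt32Ty(Ctx);

    // A global plus an alias that refers to it.
    auto *GV = new GlobalVariable(M, I32, /*isConstant=*/false,
                                  GlobalValue::ExternalLinkage,
                                  ConstantInt::get(I32, 0), "g");
    auto *GA = GlobalAlias::create(I32, /*AddressSpace=*/0,
                                   GlobalValue::ExternalLinkage, "g_alias",
                                   GV, &M);

    // With this change, stripPointerCasts() stops at the alias...
    outs() << GA->stripPointerCasts()->getName() << "\n";            // g_alias
    // ...and only the explicit variant looks through it to the aliasee.
    outs() << GA->stripPointerCastsAndAliases()->getName() << "\n";  // g
    return 0;
  }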

Includes a test for the specific issue that I saw, but no doubt there are
other similar bugs fixed here.

As with D65118, this has been tested to make sure that the optimization isn't
load-bearing. I built Clang, Chromium for Linux, Android and Windows, as well
as the test-suite, and there were no size regressions.

Differential Revision: https://reviews.llvm.org/D66606

llvm-svn: 369697
2019-08-22 19:56:14 +00:00
Yonghong Song d0ea05d5ef [BPF] annotate DIType metadata for builtin preseve_array_access_index()
Previously, debuginfo types were annotated on the
IR builtins preserve_struct_access_index() and
preserve_union_access_index(), but not on
preserve_array_access_index(). The debug info
is useful for identifying the root type name, which
will later be used for type comparison.

For user accesses without explicit type conversions,
the previous scheme works, as we can ignore intermediate
compiler-generated type conversions (e.g., from union types to
union members) and still generate a correct access index string.

The issue comes with user explicit type conversions, e.g.,
converting an array to a structure like below:
  struct t { int a; char b[40]; };
  struct p { int c; int d; };
  struct t *var = ...;
  ... __builtin_preserve_access_index(&(((struct p *)&(var->b[0]))->d)) ...
Although the BPF backend can derive the type of &(var->b[0]),
an explicit type annotation makes checking more consistent
and less error-prone.

Another benefit is multi-dimensional array handling.
For example,
  struct p { int c; int d; } g[8][9][10];
  ... __builtin_preserve_access_index(&g[2][3][4].d) ...
If array debug info is available, it would be possible to calculate
the number of "struct p" objects to skip before accessing its member
"d", since the debug info contains each dimension's range.
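
For concreteness (just arithmetic over the declaration above), the per-dimension ranges are what make that count computable:

  // Flattened element offset of g[2][3][4] within g[8][9][10]:
  // 2 * (9 * 10) + 3 * 10 + 4 = 214 "struct p" objects precede it.
  enum { kPreceding = 2 * (9 * 10) + 3 * 10 + 4 };   // == 214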

This patch enables annotating the IR builtin preserve_array_access_index()
with the proper debuginfo type. The unit test case and language reference
are updated as well.

Signed-off-by: Yonghong Song <yhs@fb.com>

Differential Revision: https://reviews.llvm.org/D65664

llvm-svn: 367724
2019-08-02 21:28:28 +00:00
Yonghong Song 4754814c5a fix unnamed bitfield issue and add tests for __builtin_preserve_access_index intrinsic
The original commit is r366076. It was temporarily reverted (r366155)
due to a test failure. This resubmit makes the tests more robust by
accepting regexes instead of hardcoded names/references in several places.

This is a followup patch for https://reviews.llvm.org/D61809.
Handle unnamed bitfield properly and add more test cases.

Fixed the unnamed bitfield issue. The unnamed bitfield is ignored
by debug info, so we need to ignore such a struct/union member
when we try to get the member index in the debug info.
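
A small sketch of the case being handled (hypothetical names; clang, typically with -g and a BPF-style target; not one of the added tests):

  struct S {
    int a;
    int :16;   // unnamed bit-field: part of the record layout, ignored by debug info
    int b;
  };

  int read_b(struct S *s) {
    // The debug-info member index recorded for 'b' must skip the unnamed
    // bit-field, even though the record layout does not.
    return *__builtin_preserve_access_index(&s->b);
  }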

D61809 contains two test cases, but they are not enough: they do not
check the generated IR at a fine-grained level, and there are no
semantic-checking tests.
This patch adds unit tests for both code generation and semantic checking
of the new intrinsic.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 366231
2019-07-16 17:24:33 +00:00
Stephan Bergmann e215996a29 Finish "Adapt -fsanitize=function to SANITIZER_NON_UNIQUE_TYPEINFO"
i.e., recent 5745eccef54ddd3caca278d1d292a88b2281528b:

* Bump the function_type_mismatch handler version, as its signature has changed.

* The function_type_mismatch handler can return successfully now, so
  SanitizerKind::Function must be AlwaysRecoverable (like for
  SanitizerKind::Vptr).

* But the minimal runtime would still unconditionally treat a call to the
  function_type_mismatch handler as failure, so disallow -fsanitize=function in
  combination with -fsanitize-minimal-runtime (like it was already done for
  -fsanitize=vptr).

* Add tests.

Differential Revision: https://reviews.llvm.org/D61479

llvm-svn: 366186
2019-07-16 06:23:27 +00:00
Rui Ueyama 49a3ad21d6 Fix parameter name comments using clang-tidy. NFC.
This patch applies clang-tidy's bugprone-argument-comment tool
to LLVM, clang and lld source trees. Here is how I created this
patch:

$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \
    -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm
$ ninja
$ parallel clang-tidy -checks='-*,bugprone-argument-comment' \
    -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \
    ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h}

llvm-svn: 366177
2019-07-16 04:46:31 +00:00
Eric Christopher fdcbd5fa48 Temporarily Revert "fix unnamed bitfield issue and add tests for __builtin_preserve_access_index intrinsic"
The commit had tests that would only work when value names are present in the IR.

This reverts commit r366076.

llvm-svn: 366155
2019-07-15 23:49:31 +00:00
Yonghong Song e5086481b6 fix unnamed bitfield issue and add tests for __builtin_preserve_access_index intrinsic
This is a followup patch for https://reviews.llvm.org/D61809.
Handle unnamed bitfield properly and add more test cases.

Fixed the unnamed bitfield issue. The unnamed bitfield is ignored
by debug info, so we need to ignore such a struct/union member
when we try to get the member index in the debug info.

D61809 contains two test cases, but they are not enough: they do not
check the generated IR at a fine-grained level, and there are no
semantic-checking tests.
This patch adds unit tests for both code generation and semantic checking
of the new intrinsic.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 366076
2019-07-15 15:42:41 +00:00
Yonghong Song 048493f882 [BPF] Preserve debuginfo array/union/struct type/access index
For background of BPF CO-RE project, please refer to
  http://vger.kernel.org/bpfconf2019.html
In summary, BPF CO-RE intends to compile BPF programs so that they
are adjustable to struct/union layout changes, allowing the same
program to run on multiple kernels with an adjustment made before
loading, based on the native kernel structures.

In order to do this, we need to keep track of the base and result
debuginfo types of GEP (getelementptr) instructions, so we can adjust
on the host based on kernel BTF info. Capturing such information as an
IR optimization is hard: various optimizations may have tweaked the
GEPs, and because unions are replaced by structures it is impossible
to track the field index for union member accesses.

Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introduced.
  addr = preserve_array_access_index(base, index, dimension)
  addr = preserve_union_access_index(base, di_index)
  addr = preserve_struct_access_index(base, gep_index, di_index)
here,
  base: the base pointer for the array/union/struct access.
  index: the last access index for array, the same for IR/DebugInfo layout.
  dimension: the array dimension.
  gep_index: the access index based on IR layout.
  di_index: the access index based on user/debuginfo types.

Using these intrinsics blindly, i.e., transforming all GEPs
into these intrinsics and later reducing them back to GEPs, we have
seen up to 7% more instructions generated. To avoid such overhead,
a clang builtin is proposed:
  base = __builtin_preserve_access_index(base)
so that the user wraps to-be-relocated GEPs in this builtin and the
preserve_*_access_index intrinsics apply only to those GEPs. Such an
opt-in prevents performance degradation if people do not use CO-RE,
even for programs which use bpf_probe_read().

For example, given the following program,
  $ cat test.c
  struct sk_buff {
     int i;
     int b1:1;
     int b2:2;
     union {
       struct {
         int o1;
         int o2;
       } o;
       struct {
         char flags;
         char dev_id;
       } dev;
       int netid;
     } u[10];
  };

  static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
      = (void *) 4;

  #define _(x) (__builtin_preserve_access_index(x))

  int bpf_prog(struct sk_buff *ctx) {
    char dev_id;
    bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
    return dev_id;
  }
  $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
    test.c >& log

The generated IR looks like this:
  ...
  define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
    %2 = alloca %struct.sk_buff*, align 8
    %3 = alloca i8, align 1
    store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
    call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
    call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
    call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
    %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
    %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
    %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
         %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
    %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
         [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
    %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
         %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
    %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
    %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
         %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
    %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
    %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
    %13 = sext i8 %12 to i32, !dbg !54
    call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
    ret i32 %13, !dbg !57
  }

  !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
  !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
  !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)

Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.

For &ctx->u[5].dev.dev_id,
  . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
  . The "%7 = ..." represents array subscript "5".
  . The "%8 = ..." represents union member "dev" with index 1 for DI layout.
  . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.

Basically, by traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access indices
can be obtained.

The intrinsics also contain enough information to regenerate code for the IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.

Signed-off-by: Yonghong Song <yhs@fb.com>

Differential Revision: https://reviews.llvm.org/D61809

llvm-svn: 365438
2019-07-09 04:21:50 +00:00
Yonghong Song e085b40e9c Revert "[BPF] Preserve debuginfo array/union/struct type/access index"
This reverts commit r365435.

Forgot to add the Differential Revision link. Will add it to the
commit message and resubmit.

llvm-svn: 365436
2019-07-09 04:15:12 +00:00