This fixes PR45336.
Output sections described in a linker script as NOLOAD, but given no input sections, would incorrectly be marked as SHT_PROGBITS.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D76981
MC already knows how to emulate the .weak directive (with its ELF
semantics; i.e., an undefined weak symbol resolves to 0, and a defined
weak symbol has lower link precedence than a strong symbol of the same
name) using COFF weak externals. Plumb this through the ASM printer too,
so that definitions marked with __attribute__((weak)) at the language
level (which gets translated to weak linkage at the IR level) have the
corresponding .weak directive emitted. Note that declarations marked
with __attribute__((weak)) at the language level (which translates to
extern_weak at the IR level) already have .weak directives emitted.
Weak*/linkonce* symbols without an associated comdat (in particular, ones
generated with __attribute__((weak)) in C/C++) were earlier emitted as
normal unique globals, as the comdat is required to provide the linkonce
semantics. This change makes sure they are emitted as .weak instead,
allowing other symbols to override them.
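As a hedged illustration (the function name below is made up, not taken from this patch's tests), a definition like the following gets weak linkage at the IR level and, with this change, a corresponding .weak directive in the emitted assembly:
```
// Hypothetical example; 'fallback_handler' is an arbitrary name.
// This becomes 'define weak ...' without a comdat at the IR level;
// with this change the asm printer emits a .weak directive for it,
// so a strong definition elsewhere can override it at link time.
extern "C" __attribute__((weak)) int fallback_handler(int x) {
  return x;
}
```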
Rename the existing coff-weak.ll test to coff-linkonce.ll. I'm not
quite sure what that test covers, since the behavior being tested in it
(the emission of a one_only section) is just a result of passing
-function-sections to llc; the linkonce_odr makes no difference.
Add a new coff-weak.ll which tests the new directive emission.
Based on a previous patch by Shoaib Meenai.
Differential Revision: https://reviews.llvm.org/D44543
The preprocessor reads the whole line even if the first condition of an `&&` is false, so this broke when compiling with older GCC versions that don't recognize `__has_builtin`.
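A sketch of the usual workaround (the builtin and macro names here are just examples): nest the checks so that a preprocessor without `__has_builtin` never sees the function-like use.
```
// Broken on older GCC: the whole line is processed even when the first
// operand is false, so the function-like __has_builtin(...) errors out.
//   #if defined(__has_builtin) && __has_builtin(__builtin_expect)
// Portable form: nest the conditions instead.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_expect)
#    define LIKELY(x) __builtin_expect(!!(x), 1)
#  endif
#endif
#ifndef LIKELY
#  define LIKELY(x) (x)
#endif
```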
The LatticeVal alias was introduced to reduce the diff size for the
transition to ValueLatticeElement, which is done now.
This patch removes the unnecessary alias and updates some very verbose
type uses with auto.
This seems to be used in some resource files, e.g.
f3217573d7/include/wx/msw/wx.rc (L28).
MSVC rc.exe and GNU windres both allow any value here, and silently
just truncate it to the uint16_t range. This change explicitly allows
the -1 value and errors out on other out-of-range values - the same was
done for control IDs in dialogs in c1a67857ba.
Differential Revision: https://reviews.llvm.org/D76951
I added a list of configure options for anyone who runs into long build
times or runs out of memory. This was added under common problems in
the getting started section of the documentation.
Reviewed By: Meinersbur, dim, e-leclercq
Differential Revision: https://reviews.llvm.org/D75425
Minor updates/fixes to comments for the Attributor pass, and a dyn_cast -> cast change.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D76972
In ObjectFileMachO we construct the symbol table from multiple
sources -- primarily the binary's nlist records, but when the nlist
symbols have been stripped, we would augment those with function
start addresses from LC_FUNCTION_STARTS or eh_frame. This patch
adds another source of symbols - the exported symbols that the
dynamic linker, dyld, uses at runtime from its trie structure. This
provides us with names and addresses for these functions/data.
This patch removes the code from ParseSymtab that would reject an
empty symbol table / nlist source. It adds a new symbols_added
set which tracks the address of every symbol we've added to the
symtab. We add symbols from the most informative sources first, and
before adding a symbol from a less informative source (e.g.
LC_FUNCTION_STARTS, which has no function names), we check whether
that symbol has already been added.
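A minimal sketch of that dedup scheme (illustrative types and names, not the actual ObjectFileMachO code):
```
#include <cstdint>
#include <set>
#include <string>
#include <vector>

struct Sym {
  uint64_t file_addr;
  std::string name; // may be empty for nameless sources
};

// Add symbols from the most informative source first; skip a less
// informative entry (e.g. a bare LC_FUNCTION_STARTS address) if a
// symbol at that address has already been added.
std::vector<Sym> BuildSymtab(const std::vector<Sym> &nlist_syms,
                             const std::vector<uint64_t> &func_starts) {
  std::vector<Sym> symtab;
  std::set<uint64_t> symbols_added;
  for (const Sym &s : nlist_syms) {
    symtab.push_back(s);
    symbols_added.insert(s.file_addr);
  }
  for (uint64_t addr : func_starts)
    if (symbols_added.insert(addr).second)
      symtab.push_back({addr, ""});
  return symtab;
}
```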
On targets with thumb code generation, instead of using the 0th bit of
the addresses in FunctionStarts (or now the trie entries) to mark thumb
code, we encode this in the data field of FunctionStarts (formerly used
to track whether the func_start should be added) and in a flag for the
trie entries, and store only the actual addresses in symbols_seen and
these vectors.
<rdar://problem/50791451>
Differential revision: https://reviews.llvm.org/D76758
This change implements constant folding for the constrained versions of
the rounding intrinsics: floor, ceil, trunc, round, rint and
nearbyint.
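A hedged sketch of the folding idea (not the actual ConstantFolding code), using APFloat::roundToIntegral with the rounding mode each intrinsic implies:
```
#include "llvm/ADT/APFloat.h"
using llvm::APFloat;

// llvm.experimental.constrained.floor: round toward negative infinity.
APFloat foldConstrainedFloor(APFloat V) {
  V.roundToIntegral(APFloat::rmTowardNegative);
  return V;
}

// llvm.experimental.constrained.ceil: round toward positive infinity.
APFloat foldConstrainedCeil(APFloat V) {
  V.roundToIntegral(APFloat::rmTowardPositive);
  return V;
}

// llvm.experimental.constrained.trunc: round toward zero.
APFloat foldConstrainedTrunc(APFloat V) {
  V.roundToIntegral(APFloat::rmTowardZero);
  return V;
}
```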
Differential Revision: https://reviews.llvm.org/D72930
The main purpose of introducing these builtins is to attach range
metadata [1, 1025) to the work group size loaded from the dispatch
ptr, which cannot be done from source code.
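A sketch of what attaching that range looks like (a hypothetical helper, not the actual AMDGPU lowering code), assuming an integer load of the work group size:
```
#include "llvm/ADT/APInt.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/MDBuilder.h"

// Attach !range [1, 1025) to the load of the work group size so later
// optimizations know the loaded value is bounded.
void annotateWorkGroupSizeLoad(llvm::LoadInst *Load) {
  unsigned Bits = Load->getType()->getIntegerBitWidth();
  llvm::MDBuilder MDB(Load->getContext());
  Load->setMetadata(llvm::LLVMContext::MD_range,
                    MDB.createRange(llvm::APInt(Bits, 1),
                                    llvm::APInt(Bits, 1025)));
}
```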
Differential Revision: https://reviews.llvm.org/D76772
scope.
There are a few contexts in which we assume a name is a template name;
if such a context is one where we should perform an unqualified lookup,
and lookup finds nothing, we would form a dependent template name even
if the name is not dependent. This happens in particular for the lookup
of a pseudo-destructor.
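As a purely illustrative example (not this patch's test case), a pseudo-destructor call on a non-dependent scalar type looks like this:
```
using Int = int;

void destroy(Int *p) {
  // Pseudo-destructor call: the name after '~' still has to be looked
  // up, even though scalar types have no real destructor; the call
  // itself is a no-op.
  p->~Int();
}
```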
In passing, rename ActOnDependentTemplateName to just ActOnTemplateName
given that we apply it for non-dependent template names too.
Summary:
Commit 5f5fb56c68 ("[compiler-rt] Intercept the uname() function")
broke sanitizer-x86_64-linux and clang-cmake-thumbv7-full-sh (again)
builds:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/26313
http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-full-sh/builds/4324
The reason is that uname() can be called as early as
__pthread_initialize_minimal_internal(). When intercepted, this
triggers ASan initialization, which eventually calls dlerror(), which
in turn uses pthreads, causing all sorts of issues.
Fix by falling back to internal_uname() when the interceptor runs before
ASan is initialized. This is only done for Linux at the moment.
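A simplified sketch of the shape of the fix (the flag and function names are illustrative stand-ins, not the exact sanitizer_common code):
```
#include <sys/utsname.h>

extern bool asan_inited;                 // set once ASan is initialized
int internal_uname(struct utsname *buf); // syscall-based, avoids libc
int real_uname(struct utsname *buf);     // the intercepted libc function

int intercepted_uname(struct utsname *buf) {
  // uname() can run from __pthread_initialize_minimal_internal() before
  // ASan is up; initializing ASan here would reach dlerror() and
  // pthreads, so fall back to the internal implementation instead.
  if (!asan_inited)
    return internal_uname(buf);
  return real_uname(buf);
}
```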
Reviewers: eugenis, vitalybuka
Reviewed By: eugenis
Subscribers: dberris, #sanitizers, pcc
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D76919
Move test/lib/TestDialect to test/lib/Dialect/Test - makes the dir
structure more uniform.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D76677
This patch introduces a utility to separate full tiles from partial
tiles when tiling affine loop nests where trip counts are unknown or
where tile sizes don't divide trip counts. A conditional guard is
generated to separate out the full tile (with constant trip count loops)
into the then block of an 'affine.if' and the partial tile to the else
block. The separation allows the 'then' block (which has constant trip
count loops) to be optimized better subsequently, e.g., via
unroll-and-jam or register tiling, vectorized without generating
cleanup code, or offloaded to accelerators. Among techniques from the
literature, the if/else based separation leads to the most compact
cleanup code for multi-dimensional cases (because a single version is
used to model all partial tiles).
INPUT

  affine.for %i0 = 0 to %M {
    affine.for %i1 = 0 to %N {
      "foo"() : () -> ()
    }
  }

OUTPUT AFTER TILING W/O SEPARATION

  #map0 = affine_map<(d0) -> (d0)>
  #map1 = affine_map<(d0)[s0] -> (d0 + 32, s0)>

  affine.for %arg2 = 0 to %M step 32 {
    affine.for %arg3 = 0 to %N step 32 {
      affine.for %arg4 = #map0(%arg2) to min #map1(%arg2)[%M] {
        affine.for %arg5 = #map0(%arg3) to min #map1(%arg3)[%N] {
          "foo"() : () -> ()
        }
      }
    }
  }

OUTPUT AFTER TILING WITH SEPARATION

  #map0 = affine_map<(d0) -> (d0)>
  #map1 = affine_map<(d0) -> (d0 + 32)>
  #map2 = affine_map<(d0)[s0] -> (d0 + 32, s0)>
  #set0 = affine_set<(d0, d1)[s0, s1] : (-d0 + s0 - 32 >= 0, -d1 + s1 - 32 >= 0)>

  affine.for %arg2 = 0 to %M step 32 {
    affine.for %arg3 = 0 to %N step 32 {
      affine.if #set0(%arg2, %arg3)[%M, %N] {
        // Full tile.
        affine.for %arg4 = #map0(%arg2) to #map1(%arg2) {
          affine.for %arg5 = #map0(%arg3) to #map1(%arg3) {
            "foo"() : () -> ()
          }
        }
      } else {
        // Partial tile.
        affine.for %arg4 = #map0(%arg2) to min #map2(%arg2)[%M] {
          affine.for %arg5 = #map0(%arg3) to min #map2(%arg3)[%N] {
            "foo"() : () -> ()
          }
        }
      }
    }
  }
The separation is tested via a command-line flag on the loop tiling pass.
The utility itself allows one to pass in any band of contiguously nested
loops, and can be used by other transforms/utilities. The current
implementation works for hyperrectangular loop nests.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D76700
When we see this:
```
%a = COPY $physreg
...
SOMETHING implicit-def $physreg
...
%b = COPY $physreg
```
The two copies are not equivalent, and so we shouldn't perform any folding
on them.
When we have two instructions which use a physical register, check that
they define the same virtual register(s) as well.
e.g., if we run into this case
```
%a = COPY $physreg
...
%b = COPY %a
```
we can say that the two copies are the same, and can be folded.
Differential Revision: https://reviews.llvm.org/D76890
Extend the FileCollector's API with addDirectory, which adds a directory
and its contents to the VFS mapping.
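A hypothetical usage sketch (the paths and the exact addDirectory signature are assumptions based on this description):
```
#include "llvm/Support/FileCollector.h"

void collectInputs() {
  // Root for the copied files and the overlay root for the VFS mapping.
  llvm::FileCollector Collector("/tmp/reproducer/root", "/tmp/reproducer");
  // New in this patch: add a directory and everything underneath it.
  Collector.addDirectory("/usr/include/mylib");
  Collector.copyFiles();
}
```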
Differential revision: https://reviews.llvm.org/D76671
Instead of bailing out of parsing when we encounter an invalid
template-name or template arguments in a template-id, produce an
annotation token describing the invalid construct.
This avoids duplicate errors and generally allows us to recover better.
In principle we should be able to extend this to store some kinds of
invalid template-id in the AST for tooling use, but that isn't handled
as part of this change.
- add `to_extent_tensor`
- rename `create_shape` to `from_extent_tensor` for symmetry
- add `split_at` and `concat` ops for basic shape manipulations
This set of ops is inspired by the requirements of lowering a dynamic-shape-aware batch matmul op. For such an op, the "matrix" dimensions aren't subject to broadcasting but the others are, and so we need to slice, broadcast, and reconstruct the final output shape. Furthermore, the actual broadcasting op used downstream uses a tensor of extents as its preferred shape interface for the actual op that does the broadcasting.
However, this functionality is quite general. It's obvious that `to_extent_tensor` is needed long-term to support many common patterns that involve computations on shapes. We can evolve the shape manipulation ops introduced here. The specific choices made here took into consideration the potentially unranked nature of the !shape.shape type, which means that a simple listing of dimensions to extract isn't possible in general.
Differential Revision: https://reviews.llvm.org/D76817
Without this, some instances of copy construction would use the
converting constructor and lead to the destination function_ref
referring to the source function_ref instead of to the underlying
functor.
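A hedged illustration of the hazard (not the test added here):
```
#include "llvm/ADT/STLExtras.h"
using llvm::function_ref;

int callThroughCopy(function_ref<int()> outer) {
  // Copy construction from a non-const lvalue: if overload resolution
  // picks the converting constructor rather than the copy constructor,
  // 'inner' calls through 'outer' (another function_ref) instead of
  // aliasing the underlying functor, and can dangle once the source
  // function_ref goes away.
  function_ref<int()> inner = outer;
  return inner();
}
```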
Discovered in feedback from 857bf5da35
Thanks to Johannes Doerfert, Arthur O'Dwyer, and Richard Smith for the
discussion and debugging.
Summary:
This makes it easier/safer to add bits (such as an error bit) to other
node types without worrying about the bit layout all the time.
For now, just use it to implement the ad-hoc conversion functions.
Next: remove these functions and use this directly.
Reviewers: hokein
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76939
Summary:
It is safe to assume that the TypeSize associated with scalar types has
a fixed size.
This avoids an implicit cast of TypeSize to integer inside
`Type::getScalarSizeInBits()`, as such implicit cast is deprecated.
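A sketch of the pattern (a simplified, hypothetical helper, not the code changed here):
```
#include "llvm/IR/Type.h"
#include "llvm/Support/TypeSize.h"
#include <cstdint>

// Ask for the size explicitly as a fixed quantity instead of relying on
// the deprecated implicit TypeSize -> uint64_t conversion; scalar types
// are never scalable, so this is safe.
uint64_t scalarSizeInBits(llvm::Type *Ty) {
  llvm::TypeSize TS = Ty->getPrimitiveSizeInBits();
  return TS.getFixedSize();
}
```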
Reviewers: efriedma, sdesmalen
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76892
Currently -fno-unroll-loops is ignored when doing LTO on Darwin. This
patch adds a new -lto-no-unroll-loops option to the LTO code generator
and forwards it to the linker if -fno-unroll-loops is passed.
Reviewers: thegameg, steven_wu
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D76916
The current implementation of the JSONWriter does not support writing
out directory entries. Earlier today I added a unit test to illustrate
the problem. When an entry is added to the YAMLVFSWriter and the path is
a directory, it will incorrectly emit the directory as a file, and any
files inside that directory will not be found by the VFS.
It's possible to partially work around the issue by only adding "leaf
nodes" (files) to the YAMLVFSWriter. However, this doesn't work for
representing empty directories. This is a problem for clients of the VFS
that want to iterate over a directory. The directory not being there is
not the same as the directory being empty.
This is not just a hypothetical problem. The FileCollector for example
does not differentiate between file and directory paths. I temporarily
worked around the issue for LLDB by ignoring directories, but I suspect
this will prove problematic sooner rather than later.
This patch fixes the issue by extending the JSONWriter to support
writing out directory entries. We store whether an entry should be
emitted as a file or directory.
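A sketch of the idea (the struct and field names are illustrative, not necessarily the exact ones in VirtualFileSystem.h):
```
#include <string>

// Each mapping entry remembers whether it should be written out as a
// "file" or a "directory" entry in the JSON output, so empty
// directories survive the round trip through the VFS description.
struct VFSMapEntry {
  std::string VPath;        // virtual path presented by the VFS
  std::string RPath;        // real path on disk it maps to
  bool IsDirectory = false; // emit "type": "directory" when true
};
```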
Differential revision: https://reviews.llvm.org/D76670
CPlusPlusNameParser is used in several places; one of them is during IR execution and when setting breakpoints, to pull out C++ information like the basename, the context and the arguments.
Currently it does not handle a templated operator< properly, because of an idiosyncrasy in how clang generates debug info for these cases.
It uses clang::Lexer, which will tokenize operator<<A::B> into:
  tok::kw_operator
  tok::lessless
  tok::raw_identifier
Later on the parser in ConsumeOperator() does not handle this case properly and we end up failing to parse.
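A hypothetical reproduction of the shape involved (names invented for illustration):
```
namespace A { struct B {}; }

struct S {
  template <typename T>
  bool operator<(const T &) const { return true; }
};

// The instantiation below shows up in the debug info with a name along
// the lines of "bool S::operator<<A::B>(const A::B &) const", where
// "operator<" and its template argument list run together and
// clang::Lexer produces a single '<<' token.
bool cmp(const S &s, const A::B &b) { return s < b; }
```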
Differential Revision: https://reviews.llvm.org/D76168