Summary:
The BFloat IR type is introduced to provide support for, initially, the BFloat16
datatype introduced with the Armv8.6 architecture (optional from Armv8.2
onwards). It has an 8-bit exponent and a 7-bit mantissa and behaves like an IEEE
754 floating point IR type.
This is part of a patch series upstreaming Armv8.6 features. Subsequent patches
will upstream intrinsics support and C-lang support for BFloat.
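As a rough illustration of the layout described above (1 sign bit, 8 exponent bits, 7 mantissa bits): a BFloat16 value occupies the same bit positions as the upper half of an IEEE 754 binary32 value. The sketch below is not part of this patch. The helper names are hypothetical, and the narrowing conversion truncates instead of rounding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: keep the top 16 bits of a binary32 value
// (sign, 8 exponent bits, upper 7 mantissa bits). Truncates; no rounding.
inline uint16_t floatToBFloat16Bits(float F) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return static_cast<uint16_t>(Bits >> 16);
}

// Hypothetical helper: widen BFloat16 bits back to binary32 by zero-filling
// the 16 low mantissa bits.
inline float bfloat16BitsToFloat(uint16_t B) {
  uint32_t Bits = static_cast<uint32_t>(B) << 16;
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}
```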
Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, sdesmalen, deadalnix, ctetreau
Subscribers: hiraditya, llvm-commits, danielkiss, arphaman, kristof.beyls, dexonsmith
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78190
D79276 caused the following builder to fail:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/23489
Specifically, FileCheck dumped stack in the following tests:
LLVM :: MC/Mips/micromips-jump-pc-region.s
LLVM :: MC/Mips/mips-jump-pc-region.s
Those tests contained characters encoded as 160 but that render (at
least for me in vim) like a single space (32). Those characters
appeared between the `#` and `RUN:` on several lines, and D79276
caused FileCheck to process those lines differently: `RUN:` is a
comment directive. As a result, D79276 caused FileCheck to start
calling `isalnum` on those characters.
The problem is that FileCheck calls `isalnum` on type `char` without
casting to `unsigned char` first, so it sign-extends 160 beyond what
`unsigned char` or `EOF` can represent. C says that has undefined
behavior. This problem is general to FileCheck's prefix parsing and
so exists independently of D79276.
524457edbc fixed the above tests. This patch changes FileCheck to
use LLVM's replacements for `ctype.h` functions, and it adds tests for
cases that are representative with or without D79276.
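A minimal sketch of the two ways to avoid the undefined behavior described above: either convert to `unsigned char` before calling the `ctype.h` function, or use a locale-independent check that never calls into `ctype.h`. Both helper names are hypothetical and are not the functions used by this patch.

```cpp
#include <cassert>
#include <cctype>

// Safe use of the C library: convert to unsigned char before the call so a
// byte such as 0xA0 (160) is passed as 160 rather than as a negative int.
inline bool isAlnumViaCtype(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) != 0;
}

// Hypothetical locale-independent alternative that never calls <cctype>.
inline bool isAlnumASCII(char C) {
  return (C >= '0' && C <= '9') || (C >= 'a' && C <= 'z') ||
         (C >= 'A' && C <= 'Z');
}
```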
Reviewed By: jhenderson, thopre, efriedma
Differential Revision: https://reviews.llvm.org/D79810
Summary:
The order of Z3_mk_fpa_mul, Z3_mk_fpa_div, Z3_mk_fpa_add and Z3_mk_fpa_sub functions' arguments is: context, rounding_mode, ast1, ast2.
See for example: a14c2a3051/src/api/api_fpa.cpp (L433)
At function calls from LLVM the argument order was different: rounding_mode was passed as last argument.
Unfortunately these Z3_ast and other function parameter types are technically like void*, which are reinterpret_cast-ed to a specific class type. So there was no type error, but the assertions fail at runtime if something goes wrong. Such a crash happened during Z3 refutation while using StaticAnalyzer.
Reviewers: Szelethus, xazax.hun, baloghadamsoftware, steakhal, martong, mikhail.ramalho
Reviewed By: martong
Subscribers: hiraditya, rnkovacs, mikhail.ramalho, martong, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79883
Patch by Tibor Brunner!
In D49466, sys::path::replace_path_prefix was used instead of startswith for the -f[macro/debug/file]-prefix-map options.
However those were reverted later (commit rG3bb24bf25767ef5bbcef958b484e7a06d8689204) due to broken Windows tests.
This patch restores those replace_path_prefix calls.
It also modifies the prefix matching to be case-insensitive under Windows.
Differential Revision : https://reviews.llvm.org/D76869
Sometimes you want to disable a FileCheck directive without removing
it entirely, or you want to write comments that mention a directive by
name. The `COM:` directive makes it easy to do this. For example,
you might have:
```
; X32: pinsrd_1:
; X32: pinsrd $1, 4(%esp), %xmm0
; COM: FIXME: X64 isn't working correctly yet for this part of codegen, but
; COM: X64 will have something similar to X32:
; COM:
; COM: X64: pinsrd_1:
; COM: X64: pinsrd $1, %edi, %xmm0
```
Without this patch, you need to use some combination of rewording and
directive syntax mangling to prevent FileCheck from recognizing the
commented occurrences of `X32:` and `X64:` above as directives.
Moreover, FileCheck diagnostics have been proposed that might complain
about the occurrences of `X64` that don't have the trailing `:`
because they look like directive typos:
<http://lists.llvm.org/pipermail/llvm-dev/2020-April/140610.html>
I think dodging all these problems can prove tedious for test authors,
and directive syntax mangling already makes the purpose of existing
test code unclear. `COM:` can avoid all these problems.
This patch also updates the small set of existing tests that define
`COM` as a check prefix:
- clang/test/CodeGen/default-address-space.c
- clang/test/CodeGenOpenCL/addr-space-struct-arg.cl
- clang/test/Driver/hip-device-libs.hip
- llvm/test/Assembler/drop-debug-info-nonzero-alloca.ll
I think lit should support `COM:` as well. Perhaps `clang -verify`
should too.
Reviewed By: jhenderson, thopre
Differential Revision: https://reviews.llvm.org/D79276
This will prove especially helpful after D79276, which introduces
comment prefixes. Specifically, identifying whether there's a
uniqueness violation will be helpful as prefixes will be required to
be unique across both check prefixes and comment prefixes.
Also, remove a related comment about `cl::list` that no longer seems
relevant now that FileCheck is also a library.
Reviewed By: jhenderson, thopre
Differential Revision: https://reviews.llvm.org/D79375
Once we hit AT_NULL, we need to bail out of the loop; not just the
enclosing switch. This fixes basic usage (e.g. `cc --version`) when
AT_EXECPATH isn't present on older branches (e.g. under
emu-user-static, at the moment), where we would previously run off
the end of ::environ.
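The control-flow issue can be sketched as follows: inside a `switch` nested in a loop, `break` only leaves the `switch`. The types, tag names and tag values below are hypothetical stand-ins for the auxiliary vector and are for illustration only.

```cpp
#include <cassert>

// Hypothetical stand-ins for the ELF auxiliary vector; names and values are
// illustrative only.
enum AuxTag : int { TagNull = 0, TagExecPath = 15, TagOther = 99 };
struct AuxEntry {
  int Tag;
  const char *Value;
};

// Returns the exec path if present, or nullptr once the terminator is seen.
inline const char *findExecPath(const AuxEntry *Aux) {
  for (;; ++Aux) {
    switch (Aux->Tag) {
    case TagExecPath:
      return Aux->Value;
    case TagNull:
      // A plain `break` here would only leave the switch, and the loop would
      // walk past the terminator. Leave the loop (and function) instead.
      return nullptr;
    default:
      break; // Not interesting; continue with the next entry.
    }
  }
}
```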
Patch By: kevans
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D79239
For empty directories (except the first one) we've been adding a file
with the same name as the directory to the result VFS mapping.
Differential Revision: https://reviews.llvm.org/D79551
Currently, when compiling to IR (presumably at the clang level) LLVM
mangles symbols and sometimes they have illegal file characters
including `?`'s in them. This causes a problem when building a graph via
llc on Windows because the code currently passes the machine function
name all the way down to the Windows API which frequently returns error
123 **ERROR_INVALID_NAME**
https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499-
Thus, we need to remove those illegal characters from the machine
function name before generating a graph, which is the purpose of this
patch.
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file
I've created a static helper function, replace_illegal_filename_chars,
within GraphWriter.cpp to help with replacing illegal filename
characters before generating a dot graph filename.
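A minimal sketch of such a replacement step, assuming the reserved character set listed on the "Naming a File" page linked above. The function name and signature here are hypothetical and use `std::string` so the example stands alone; the helper in the patch may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: replace characters that Windows rejects in file names
// with a caller-chosen replacement character.
inline std::string replaceIllegalChars(std::string Name, char Replacement) {
  const std::string Illegal = "\\/:*?\"<>|";
  for (char &C : Name)
    if (Illegal.find(C) != std::string::npos)
      C = Replacement;
  return Name;
}
```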
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D76863
Currently, normalize() for posix replaces backslashes with slashes, except
that two backslashes in sequence are kept as-is.
clang calls normalize() to convert \ to / in Microsoft compat mode. This
generally works well, but a path like "c:\\foo\\bar.h" with two
backslashes doesn't work due to the exception in normalize().
These paths happen naturally on Windows hosts with e.g.
`#include __FILE__`, and them not working on other hosts makes it
more difficult to write tests for this case.
The special case has been around without justification since this code
was added in r203611 (since then moved around in r215241 r215243). No
integration tests fail if I remove it.
Try removing the special case.
Differential Revision: https://reviews.llvm.org/D79265
Size==0 triggers `assert(Size != 0)` in mapped_file_region::init.
I plan to use an empty file in D79339 (llvm-objcopy --dump-section).
According to POSIX, "If len is zero, mmap() shall fail and no mapping
shall be established." Just specialize case Size=0 to use
createInMemoryBuffer.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D79338
This reverts commit fb5fd74685.
Re-instates commit 53913a65b4
The fix is to trim off trailing separators, as in `/foo/bar/` and
produce `/foo/bar`. VFS tests rely on this. I added unit tests for
remove_dots.
LLD calls this on every source file string in every object file when
writing PDBs, so it is somewhat hot.
Avoid rewriting paths that do not contain path traversal components
(./..). Use find_first_not_of(separators) directly instead of using the
path iterators. The path component iterators appear to be slow, and
directly searching for slashes makes it easier to find double separators
that need to be canonicalized.
I discovered that the VFS relies on remove_dots to not canonicalize
early slashes (/foo or C:/foo) on Windows, so I had to leave that
behavior behind with unit tests for it. This is undesirable, but I claim
that my change is NFC.
This generalizes the main Windows command line tokenizer to be able to
produce StringRef substrings as well as freshly copied C strings. The
implementation is still shared with the normal tokenizer, which is
important, because we have unit tests for that.
.drectve sections can be very long. They can potentially list up to
every symbol in the object file by name. It is worth avoiding these
string copies.
This saves a lot of memory when linking chrome.dll with PGO
instrumentation:
              BEFORE       AFTER        % IMP
peak memory:  6657.76MB    4983.54MB    -25%
real:         4m30.875s    2m26.250s    -46%
The time improvement may not be real, as my machine was noisy while running
this, but the peak memory usage improvement should be real.
This change may also help apps that heavily use dllexport annotations,
because those also use linker directives in object files. Apps that do
not use many directives are unlikely to be affected.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D79262
Summary:
This was introduced in dda3c19a36 aka D77621.
The unused template instantiation causes a warning on 32 bit systems
about truncating a uint64_t to 32-bit size_t.
Reviewed By: dblaikie, smeenai
Differential Revision: https://reviews.llvm.org/D79214
This reverts commit ad38f4b371.
As it broke building the unittests:
.../sources/llvm-project/llvm/unittests/Support/Path.cpp:334:5: error: use of undeclared identifier 'set'
set(Value);
^
1 error generated.
Summary:
This patch adds a function that is similar to `llvm::sys::path::home_directory`, but provides access to the system cache directory.
For Windows, that is %LOCALAPPDATA%, and applications should put their files under %LOCALAPPDATA%\Organization\Product\.
For *nixes, it adheres to the XDG Base Directory Specification, so it first looks at the XDG_CACHE_HOME environment variable and falls back to ~/.cache/.
Subsequently, the Clangd Index storage leverages this new API to put index files somewhere other than the user's home directory.
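The lookup order for *nixes can be sketched as below. This is an illustration only: the helper name is hypothetical, and it takes the environment values as arguments (for example from `std::getenv`) so it stays easy to test.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the XDG lookup order. Arguments are the
// values of XDG_CACHE_HOME and HOME (nullptr when unset). Returns an empty
// string if neither is usable.
inline std::string xdgCacheDir(const char *XdgCacheHome, const char *Home) {
  if (XdgCacheHome && *XdgCacheHome)
    return XdgCacheHome;
  if (Home && *Home)
    return std::string(Home) + "/.cache";
  return std::string();
}
```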
Fixes https://github.com/clangd/clangd/issues/341
Reviewers: sammccall, chandlerc, Bigcheese
Reviewed By: sammccall
Subscribers: hiraditya, ilya-biryukov, MaskRay, jkorous, dexonsmith, arphaman, kadircet, ormris, usaxena95, cfe-commits, llvm-commits
Tags: #clang-tools-extra, #clang, #llvm
Differential Revision: https://reviews.llvm.org/D78501
* Merge QueueLock and CompletionLock.
* Avoid spurious CompletionCondition.notify_all() when ActiveThreads is greater than 0.
* Use default member initializers.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D78856
SmallVector currently uses 32bit integers for size and capacity to reduce
sizeof(SmallVector). This limits the number of elements to UINT32_MAX.
For a SmallVector<char>, this limits the SmallVector size to only 4GB.
Buffering bitcode output uses SmallVector<char>, but needs >4GB output.
This changes SmallVector size and capacity to conditionally use word-size
integers if the element type is small (<4 bytes). For larger element types,
the vector size can reach ~16GB with 32bit size.
Making this conditional on the element type provides both the smaller
sizeof(SmallVector) for larger types which are unlikely to grow so large,
and supports larger capacities for smaller element types.
This recommit fixes the same template being instantiated twice on platforms
where uintptr_t is the same as uint32_t.
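A minimal sketch of how a size type could be selected from the element type. The alias name is hypothetical, and the choice of `uint64_t` on 64-bit hosts is an assumption standing in for "word-size integers"; on hosts where the word size is 32 bits, every element type maps to the same 32-bit type.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical sketch: elements smaller than 4 bytes get a 64-bit size on
// 64-bit hosts; everything else keeps the compact 32-bit size.
template <class T>
using SizeTypeFor =
    typename std::conditional<sizeof(T) < 4 && sizeof(void *) >= 8, uint64_t,
                              uint32_t>::type;
```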
With a fix to unittests/Support/TarWriterTest.cpp
This makes lld's --reproduce output more compatible with tar 1.13 and
before. This is a very old version of tar, but it's the version in
both gnuwin and unxutils, and the cost of supporting them is very
low, so we might as well just do that.
https://bugs.chromium.org/p/chromium/issues/detail?id=1073524#c21
and onward has more details.
Differential Revision: https://reviews.llvm.org/D78945
Summary:
Specifically make some simple refactorings to get PointerUnion.h and
Twine.h out of the public includes. While here, trim out a lot of
transitive includes as well.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78870
Summary:
This introduces a new SourceMgr::FindLocForLineAndColumn method that
uses the OffsetCache in SourceMgr::SrcBuffer to do a constant-time
lookup for the line number (once the cache is populated).
Use this method in MLIR's SourceMgrDiagnosticHandler::convertLocToSMLoc,
replacing the O(n) scanning logic. This resolves a long standing TODO
in MLIR, and makes one of my usecases go dramatically faster (which is
currently producing many diagnostics in a 40MB SourceBuffer).
NFC, this is just a performance speedup and cleanup.
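The idea of a cached line-offset lookup can be sketched as follows. This is a standalone illustration with hypothetical function names and `std::string`/`std::vector` in place of the SourceMgr types; it is not the implementation in this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an offset cache: Offsets[i] is the buffer offset
// of the '\n' that ends line i + 1. Built once with a single O(n) scan.
inline std::vector<size_t> buildNewlineOffsets(const std::string &Buf) {
  std::vector<size_t> Offsets;
  for (size_t I = 0; I < Buf.size(); ++I)
    if (Buf[I] == '\n')
      Offsets.push_back(I);
  return Offsets;
}

// Map a 1-based (line, column) pair to a buffer offset with one vector
// lookup. Returns std::string::npos when the position is out of range.
inline size_t offsetFor(const std::string &Buf,
                        const std::vector<size_t> &Offsets, unsigned Line,
                        unsigned Col) {
  if (Line == 0 || Col == 0 || Line > Offsets.size() + 1)
    return std::string::npos;
  size_t LineStart = Line == 1 ? 0 : Offsets[Line - 2] + 1;
  size_t LineEnd = Line <= Offsets.size() ? Offsets[Line - 1] : Buf.size();
  size_t Pos = LineStart + (Col - 1);
  return Pos <= LineEnd ? Pos : std::string::npos;
}
```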
Reviewers: rriddle!, ftynse!
Subscribers: hiraditya, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78868
As reported here: https://reviews.llvm.org/D75153#1987272
Before, each instance of llvm-cov was creating one thread per hardware core, which probably wasn't needed because the number of inputs was small. This was probably causing a thread rlimit issue on large core count systems.
After this patch, the previous behavior is restored (to what was before rG8404aeb5):
If --num-threads is not specified, we create one thread per input, up to num.cores.
When specified, --num-threads indicates any number of threads, with no upper limit.
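The selection rule above can be sketched as a small function. The name and the convention that 0 means "not specified" are assumptions for illustration, as is the lower bound of one thread.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper. Requested == 0 means --num-threads was not given.
inline unsigned pickThreadCount(unsigned Requested, unsigned NumInputs,
                                unsigned NumCores) {
  if (Requested != 0)
    return Requested; // Explicit request: no upper limit applied.
  // One thread per input, capped at the core count (and at least one).
  return std::max(1u, std::min(NumInputs, NumCores));
}
```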
Differential Revision: https://reviews.llvm.org/D78408
There's an ABI breakage here if LLVM is compiled in C++14 without
aligned allocation and a user tries to use the result with aligned
allocation. If DenseMap or unique_function is used across that ABI
boundary it will break (PR45413). Moving it out of line is a bit of
a band-aid and LLVM doesn't really give ABI guarantees at this level,
but given the number of complaints I've received over this it still
seems worth fixing.
Time profiler emits relative timestamps for events (the number of
microseconds passed since the start of the current process).
This patch allows combining events from different processes while
preserving their relative timing by emitting a new attribute
"beginningOfTime". This attribute contains the system time that
corresponds to the zero timestamp of the time profiler.
This has at least two use cases:
- Build systems can use this to merge time traces from multiple compiler
invocations and generate statistics for the whole build. Tools like
ClangBuildAnalyzer could also leverage this feature.
- Compilers that use LLVM as their backend by invoking llc/opt in
a child process. If such a compiler supports generating time traces
of its own events, it could merge those events with LLVM-specific
events received from llc/opt, and produce a more complete time trace.
A proof-of-concept script that merges multiple logs that
contain a synchronization point into one log:
https://github.com/broadwaylamb/merge_trace_events
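As a rough sketch of how a merging tool might use the new attribute: each event's relative timestamp is shifted by the difference between its process's "beginningOfTime" and a common origin. The struct, the function name, and the assumption that both values are in microseconds are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Event {
  int64_t TsMicros; // Timestamp relative to the start of its own process.
};

// Hypothetical merge step: rebase events from one process onto a common
// origin, given that process's beginningOfTime (assumed microseconds).
inline void rebase(std::vector<Event> &Events, int64_t BeginningOfTime,
                   int64_t CommonOrigin) {
  for (Event &E : Events)
    E.TsMicros += BeginningOfTime - CommonOrigin;
}
```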
Differential Revision: https://reviews.llvm.org/D78030
Context:
  /// Double the size of the allocated memory, guaranteeing space for at
  /// least one more element or MinSize if specified.
  void grow(size_t MinSize = 0) { this->grow_pod(MinSize, sizeof(T)); }
  void push_back(const T &Elt) {
    if (LLVM_UNLIKELY(this->size() >= this->capacity()))
      this->grow();
    memcpy(reinterpret_cast<void *>(this->end()), &Elt, sizeof(T));
    this->set_size(this->size() + 1);
  }
When grow() is called in push_back() without a MinSize specified, this is
relying on the guarantee of space for at least one more element.
There is an edge case bug where the SmallVector is already at its maximum size
and push_back() calls grow() with the default MinSize of zero. grow() is unable to
provide space for one more element, but push_back() assumes the additional
element will be available. This can result in silent memory corruption, as
this->end() will be an invalid pointer and the program may continue executing.
Another way to fix this would be to remove the default argument from
grow(), which would mean changing grow() to grow(this->size()+1)
in several places.
No test case added because it would require allocating ~4GB.
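A minimal sketch of a capacity computation that refuses to return when no growth is possible, which is the property push_back() relies on. The function name is hypothetical, and throwing an exception is an assumption made for this standalone example; the actual fix may report the failure differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical sketch for a vector whose size field is 32 bits wide.
// Either returns a capacity strictly larger than the current one, or throws.
inline uint64_t nextCapacity(uint64_t Capacity, uint64_t MinSize) {
  const uint64_t MaxSize = UINT32_MAX;
  // Already at the limit (or asked for more than the limit): cannot honor
  // the "space for at least one more element" guarantee.
  if (MinSize > MaxSize || Capacity >= MaxSize)
    throw std::length_error("capacity cannot grow");
  uint64_t NewCapacity = std::max(2 * Capacity + 1, MinSize);
  return std::min(NewCapacity, MaxSize);
}
```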
Reviewers: echristo
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77601
There are a few `std::vector<std::string>` members in
`FileCheckRequest`. This patch changes these arrays to `std::vector<StringRef>`
and refactors the related code to clean it up and simplify it.
Differential revision: https://reviews.llvm.org/D78202
sched_getaffinity (Linux specific) has been available
* in glibc since 2002-08-08 (commit 972e719e8154eec5f543b027e2a08dfa285d55d5)
* in musl since the initial check-in.
SmallVector currently uses 32bit integers for size and capacity to reduce
sizeof(SmallVector). This limits the number of elements to UINT32_MAX.
For a SmallVector<char>, this limits the SmallVector size to only 4GB.
Buffering bitcode output uses SmallVector<char>, but needs >4GB output.
This changes SmallVector size and capacity to conditionally use word-size
integers if the element type is small (<4 bytes). For larger element types,
the vector size can reach ~16GB with 32bit size.
Making this conditional on the element type provides both the smaller
sizeof(SmallVector) for larger types which are unlikely to grow so large,
and supports larger capacities for smaller element types.
This change also includes a fix for the bug where a SmallVector with 32bit
size has reached UINT32_MAX elements, and cannot provide guaranteed growth.
Context:
  // Double the size of the allocated memory, guaranteeing space for at
  // least one more element or MinSize if specified.
  void grow(size_t MinSize = 0) { this->grow_pod(MinSize, sizeof(T)); }
  void push_back(const T &Elt) {
    if (LLVM_UNLIKELY(this->size() >= this->capacity()))
      this->grow();
    memcpy(reinterpret_cast<void *>(this->end()), &Elt, sizeof(T));
    this->set_size(this->size() + 1);
  }
When grow() is called in push_back() without a MinSize specified, this is
relying on the guarantee of space for at least one more element.
There is an edge case bug where the SmallVector is already at its maximum size
and push_back() calls grow() with the default MinSize of zero. grow() is unable to
provide space for one more element, but push_back() assumes the additional
element will be available. This can result in silent memory corruption, as
this->end() will be an invalid pointer and the program may continue executing.
An alternative to this fix would be to remove the default argument from
grow(), which would mean changing grow() to grow(this->size()+1)
in several places.
No test case added because it would require allocating a large amount of memory.
Differential Revision: https://reviews.llvm.org/D77621
Summary:
Currently, cl::ConsumeAfter only works when there is exactly one
positional argument. Without the fix, it skips filling the first positional
argument and puts the additional positional argument in the interpreter arguments.
Reviewers: bkramer, Mordante, rnk, lattner, beanz, craig.topper
Reviewed By: rnk
Subscribers: JosephTremoulet, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77242
Currently, `--dump-input` implies that all `--implicit-check-not`
patterns appear on line 1 by printing annotations like:
```
1: foo bar baz
not:1 !~~ error: no match expected
```
This patch changes that to:
```
1: foo bar baz
not:imp1 !~~ error: no match expected
```
`imp1` indicates the first `--implicit-check-not` pattern.
Reviewed By: thopre
Differential Revision: https://reviews.llvm.org/D77605
Imagine we have the following invocation:
`FileCheck -check-prefix=UNKNOWN-PREFIX -implicit-check-not=something`
When the check prefix does not exist, FileCheck does not fail.
This patch fixes the issue.
Differential revision: https://reviews.llvm.org/D78024
Summary:
Instead of storing a vptr in each FoldingSet instance, form an
equivalent struct and pass it implicitly from FoldingSet into the
various FoldingSetBase methods.
This has three benefits:
* FoldingSet becomes one pointer smaller.
* Under LTO, the "virtual" functions are much easier to inline.
* The element type no longer needs to be complete when instantiating
FoldingSet<T>, only when instantiating an insert / lookup member.
Reviewers: rnk
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78247
This is needed to fix the reason
0a2be46cfd (Modules: Invalidate out-of-date PCMs as they're
discovered) and 5b44a4b07fc1d ([modules] Do not cache invalid state for
modules that we attempted to load.) were reverted.
These patches changed Clang to use `isVolatile` when loading modules.
This had the side effect of not using mmap when loading modules, and
thus greatly increased memory usage.
The reason it wasn't using mmap is because `MemoryBuffer` plays some
games with file size when you request null termination, and it has to
disable these when `isVolatile` is set as the size may change by the
time it's mmapped. Clang by default passes
`RequiresNullTerminator = true`, and `shouldUseMmap` ignored whether
`RequiresNullTerminator` was even requested.
This patch adds `RequiresNullTerminator` to the `FileManager` interface
so Clang can use it when loading modules, and changes `shouldUseMmap` to
only take volatility into account if `RequiresNullTerminator` is true.
This is fine as both `mmap` and a `read` loop are vulnerable to
modifying the file while reading, but are immune to the rename Clang
does when replacing a module file.
Differential Revision: https://reviews.llvm.org/D77772
Summary:
Improve the error message for a conflict between several implicit
formats so that it mentions the operands that conflict.
Reviewers: jhenderson, jdenny, probinson, grimar, arichardson, rnk
Reviewed By: jdenny
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77741
Summary:
This revision adds two utilities currently present in MLIR to LLVM StringExtras:
* convertToSnakeFromCamelCase
Convert a string from a camel case naming scheme, to a snake case scheme
* convertToCamelFromSnakeCase
Convert a string from a snake case naming scheme, to a camel case scheme
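The two conversions can be sketched as below. These are standalone illustrations with hypothetical names operating on `std::string`; edge cases such as leading underscores or runs of capitals may be handled differently by the utilities added in this revision.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical sketch: "camelCase" -> "camel_case".
inline std::string toSnakeFromCamel(const std::string &In) {
  std::string Out;
  for (char C : In) {
    unsigned char U = static_cast<unsigned char>(C);
    if (std::isupper(U)) {
      // Separate words, but never start the result with an underscore.
      if (!Out.empty() && Out.back() != '_')
        Out.push_back('_');
      Out.push_back(static_cast<char>(std::tolower(U)));
    } else {
      Out.push_back(C);
    }
  }
  return Out;
}

// Hypothetical sketch: "snake_case" -> "snakeCase".
inline std::string toCamelFromSnake(const std::string &In) {
  std::string Out;
  bool UpperNext = false;
  for (char C : In) {
    if (C == '_') {
      UpperNext = true; // Drop the underscore, capitalize what follows.
      continue;
    }
    unsigned char U = static_cast<unsigned char>(C);
    Out.push_back(UpperNext ? static_cast<char>(std::toupper(U)) : C);
    UpperNext = false;
  }
  return Out;
}
```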
Differential Revision: https://reviews.llvm.org/D78167
The internal header FileCheckImpl.h uses a type defined in the public
FileCheck.h header but fails to include it. This commit fixes that.
Test Plan: Built it locally successfully.
This patch extracts the RTTI part of llvm::ErrorInfo into its own class
(RTTIExtends) so that it can be used in other non-error hierarchies, and makes
it compatible with the existing LLVM RTTI function templates (isa, cast,
dyn_cast, dyn_cast_or_null) by adding the classof method.
Differential Revision: https://reviews.llvm.org/D39111
Summary:
StringPool has many caveats and isn't used in the monorepo. I will
propose removing it as a patch separate from this refactoring patch.
Reviewers: rriddle
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77976
Summary:
StringMapEntry.h can have fewer dependencies than StringMap.h, which
is useful for public headers that want to expose inline methods on
StringMapEntry<> but don't need to expose all of StringMap.h. One
example of this is mlir's Identifier.h, another example is the existing
LLVM StringPool.h.
StringPool also could use a cleanup, I'll deal with that in a follow-on
patch.
Reviewers: rriddle
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77963
- Move Adapters array to the stack, we know the size precisely
- Parse format string on demand into a SmallVector. In theory this could
lead to parsing it multiple times, but I couldn't find a single instance
of that in LLVM.
- Make more of the implementation details private.
Currently the library is separately linked, but this does not allow
fast math flags to be implemented correctly. Each module should get the
version of the library appropriate for its combination of fast math
and related flags, with the attributes propagated into its functions
and internalized.
HIP already maintains the list of libraries, but this is not used for
OpenCL. Unfortunately, HIP uses a separate --hip-device-lib argument,
despite both languages using the same bitcode library. Eventually
these two searches need to be merged.
An additional problem is there are 3 different locations the libraries
are installed, depending on which build is used. This also needs to be
consolidated (or at least the search logic needs to deal with this
unnecessary complexity).
Summary:
There are at least three clients for KnownBits calculations:
ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the
common logic should be moved out of these clients and into KnownBits
itself.
This patch does this for AND, OR and XOR calculations by implementing
and using appropriate operator overloads KnownBits::operator& etc.
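The bitwise rules behind such overloads can be sketched with a simplified 32-bit model. The struct below is a hypothetical stand-in that uses plain integers rather than the types in KnownBits itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit model: a bit set in Zero is known to be 0, a bit set in
// One is known to be 1, and a bit set in neither is unknown.
struct Known {
  uint32_t Zero;
  uint32_t One;
};

// AND: a result bit is 0 if either input bit is 0, and 1 only if both are 1.
inline Known operator&(Known L, Known R) {
  return {L.Zero | R.Zero, L.One & R.One};
}

// OR: a result bit is 1 if either input bit is 1, and 0 only if both are 0.
inline Known operator|(Known L, Known R) {
  return {L.Zero & R.Zero, L.One | R.One};
}

// XOR: a result bit is known only when both input bits are known.
inline Known operator^(Known L, Known R) {
  return {(L.Zero & R.Zero) | (L.One & R.One),
          (L.Zero & R.One) | (L.One & R.Zero)};
}
```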
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74060
Currently the compiler defines 5 sets of constants to represent rounding mode.
These are:
1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.
2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.
3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in a different order. It is
used to specify rounding mode in IR and in functions that operate on IR.
4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.
5. Values like `FE_DOWNWARD` and others, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.
The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.
This change defines all the rounding mode types via one `llvm::RoundingMode`,
which also contains the rounding mode for the IEEE rounding direction `roundTiesToAway`.
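A sketch of what a unified enumeration could look like if it shares its numbering with `FLT_ROUNDS`. The first four values and -1 follow C11 5.2.4.2.2p7; the enum name and the values chosen here for ties-to-away and the dynamic mode are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Sketch only: the first four values follow FLT_ROUNDS (C11 5.2.4.2.2p7).
// The values for NearestTiesToAway and Dynamic are assumptions.
enum class RoundingModeSketch : int8_t {
  TowardZero = 0,
  NearestTiesToEven = 1,
  TowardPositive = 2,
  TowardNegative = 3,
  NearestTiesToAway = 4, // assumed value
  Dynamic = 7,           // assumed value
  Invalid = -1           // FLT_ROUNDS uses -1 for "indeterminable"
};

// With shared numbering, converting to the FLT_ROUNDS convention is a cast.
inline int toFltRounds(RoundingModeSketch RM) { return static_cast<int>(RM); }
```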
Differential Revision: https://reviews.llvm.org/D77379
Summary:
In `Unix/Process.inc`, we seed a random number generator from
`/dev/urandom` if possible, but if not, we're happy to fall back to
ordinary pseudorandom strategies, like the current time and PID.
The corresponding function on Windows calls `CryptGenRandom`, but it
//doesn't// have a fallback if that strategy fails. But `CryptGenRandom`
//can// fail, if a cryptography provider isn't properly initialized, or
occasionally (by our observation) simply intermittently.
If it's reasonable on Unix to implement traditional pseudorandom-number
seeding as a fallback, then it's surely reasonable to do the same on
Windows. So this patch adds a last-ditch use of ordinary rand(), using
much the same strategy as the Unix fallback code.
Reviewers: hans, sammccall
Reviewed By: hans
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77553
Summary:
This patch adds the optional Error argument, and the Cursor variants to
more DataExtractor methods. The functions now behave the same way as
other error-aware functions (they set the error when they fail, and
don't do anything if the error is already set).
I have merged the LEB128 implementations via a template (similarly to
how fixed-size functions are handled) to reduce code duplication.
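The error-aware convention described above can be sketched with a miniature ULEB128 reader. This is a standalone illustration: the function name is hypothetical, and a `std::string` stands in for the Error argument (empty means success).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical miniature of the convention. A call made with Err already set
// does nothing and returns 0; a failing call sets Err and leaves Offset
// unchanged; a successful call advances Offset past the value.
inline uint64_t readULEB128(const std::vector<uint8_t> &Data, size_t &Offset,
                            std::string &Err) {
  if (!Err.empty())
    return 0;
  uint64_t Result = 0;
  unsigned Shift = 0;
  for (size_t I = Offset; I < Data.size(); ++I) {
    uint64_t Slice = Data[I] & 0x7f;
    if (Shift >= 64 || (Slice << Shift) >> Shift != Slice) {
      Err = "uleb128 too big for uint64";
      return 0;
    }
    Result |= Slice << Shift;
    Shift += 7;
    if (!(Data[I] & 0x80)) {
      Offset = I + 1;
      return Result;
    }
  }
  Err = "unexpected end of data";
  return 0;
}
```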
Depends on D77304.
Reviewers: dblaikie, aprantl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77306
Summary:
This patch adds an optional Error argument to DataExtractor functions
for string extraction, and makes them behave like other DataExtractor
functions (set the error if extraction fails, don't do anything if the
error is already set).
I have merged the StringRef and C string versions of the functions to
reduce code duplication.
Reviewers: dblaikie, MaskRay
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77307
As detailed on PR45043, static analysis was warning that the StringRef::iterator Position argument was being ignored and the function was hardwired to use the Current iterator.
This patch ensures we use the provided iterator and removes the (barely necessary) setError wrapper that always used Current.
Differential Revision: https://reviews.llvm.org/D76512
Added unit tests for 2 scenarios that were failing.
Reduced replace_path_prefix back to 3 parameters instead of 5, simplifying the implementation. The other 2 were always used with their default values.
This commit is intended to be the first of 3:
1) simplify/fix replace_path_prefix.
2) use it in the context of -fdebug-prefix-map and -fmacro-prefix-map (see D76869).
3) Make Windows version of replace_path_prefix insensitive to both case and separators (slash vs backslash).
Differential Revision: https://reviews.llvm.org/D77223
--no-threads is a name copied from gold.
gold has --no-threads, --thread-count and several other --thread-count-*.
There is a need to customize the number of threads (running several lld
processes concurrently or customizing the number of LTO threads).
Having a single --threads=N is a straightforward replacement of gold's
--no-threads + --thread-count.
--no-threads is used rarely. So just delete --no-threads instead of
keeping it for compatibility for a while.
If --threads= is specified (ELF,wasm; COFF /threads: is similar),
--thinlto-jobs= defaults to --threads=,
otherwise all available hardware threads are used.
There is currently no way to override a --threads={1,2,...}. Whether we
should use --threads=all is still under debate.
Reviewed By: rnk, aganea
Differential Revision: https://reviews.llvm.org/D76885
Leverage ARM ELF build attribute section to create ELF attribute section
for RISC-V. Extract the common part of parsing logic for this section
into ELFAttributeParser.[cpp|h] and ELFAttributes.[cpp|h].
Differential Revision: https://reviews.llvm.org/D74023
Extend the FileCollector's API with addDirectory which adds a directory
and its contents to the VFS mapping.
Differential revision: https://reviews.llvm.org/D76671
The current implementation of the JSONWriter does not support writing
out directory entries. Earlier today I added a unit test to illustrate
the problem. When an entry is added to the YAMLVFSWriter and the path is
a directory, it will incorrectly emit the directory as a file, and any
files inside that directory will not be found by the VFS.
It's possible to partially work around the issue by only adding "leaf
nodes" (files) to the YAMLVFSWriter. However, this doesn't work for
representing empty directories. This is a problem for clients of the VFS
that want to iterate over a directory. The directory not being there is
not the same as the directory being empty.
This is not just a hypothetical problem. The FileCollector for example
does not differentiate between file and directory paths. I temporarily
worked around the issue for LLDB by ignoring directories, but I suspect
this will prove problematic sooner rather than later.
This patch fixes the issue by extending the JSONWriter to support
writing out directory entries. We store whether an entry should be
emitted as a file or directory.
Differential revision: https://reviews.llvm.org/D76670
Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, instructed by usage of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means SMT disabled.
One can now say in LLD:
/opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified.
/opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency().
/opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask.
When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows).
When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows).
Differential Revision: https://reviews.llvm.org/D75153
When Clang crashes a useful message is output:
"PLEASE submit a bug report to https://bugs.llvm.org/ and include the
crash backtrace, preprocessed source, and associated run script."
A similar message is now output for all tools.
Differential Revision: https://reviews.llvm.org/D74324
Summary:
This patch introduces command-line support for the Armv8.6-a architecture and assembly support for BFloat16. Details can be found at
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
in addition to the GCC patch for the 8.6-a CLI:
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg02647.html
In detail, this patch adds:
- march options for armv8.6-a
- BFloat16 assembly
This is part of a patch series, starting with command-line and BFloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.
Based on work by:
- labrinea
- MarkMurrayARM
- Luke Cheeseman
- Javed Asbar
- Mikhail Maltsev
- Luke Geeson
Reviewers: SjoerdMeijer, craig.topper, rjmccall, jfb, LukeGeeson
Reviewed By: SjoerdMeijer
Subscribers: stuij, kristof.beyls, hiraditya, dexonsmith, danielkiss, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D76062
The algorithm supports both assigning a fixed offset to a field prior to
layout and allowing fields to have sizes that aren't multiples of their
required alignments. This means that the well-known algorithm of sorting
by decreasing alignment isn't always good enough. Still, we start with
that, and only if that leaves padding around do we fall back on a greedy
padding-minimizing algorithm.
There is no known efficient algorithm for producing a guaranteed-minimal
layout in all cases. In fact, allowing arbitrary fixed-offset fields means
there's a straightforward reduction from bin-packing, making this NP-hard.
But as usual with such problems, we can still efficiently produce adequate
solutions to the cases that matter most to us.
I intend to use this in coroutine frame layout, where the retcon lowerings
very badly want to minimize total space usage, and where the switch lowering
can indeed produce a header with interior padding if the promise field is
highly-aligned. But it may be useful in a much wider variety of situations.
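The starting point described above (sort by decreasing alignment, then place fields in order) can be sketched as follows. This is only the simple first pass, not the greedy padding-minimizing fallback and not the implementation in this patch; `Field`, `alignTo` and `layoutByDecreasingAlignment` are hypothetical names for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical field description; Offset is filled in by the layout.
struct Field {
  uint64_t Size;
  uint64_t Align;
  uint64_t Offset;
};

// Round Value up to the next multiple of Align.
uint64_t alignTo(uint64_t Value, uint64_t Align) {
  assert(Align != 0);
  return (Value + Align - 1) / Align * Align;
}

// Sort by decreasing alignment, assign offsets, and return the end offset
// of the last field (no tail padding is added in this sketch).
uint64_t layoutByDecreasingAlignment(std::vector<Field> &Fields) {
  std::stable_sort(Fields.begin(), Fields.end(),
                   [](const Field &A, const Field &B) {
                     return A.Align > B.Align;
                   });
  uint64_t Offset = 0;
  for (Field &F : Fields) {
    Offset = alignTo(Offset, F.Align);
    F.Offset = Offset;
    Offset += F.Size;
  }
  return Offset;
}
```

When every size is a multiple of its alignment this leaves no interior padding. With fields of (size, alignment) (5, 4), (4, 4) and (1, 1), however, the sorted layout ends at offset 13, while placing the 1-byte field in the gap after the 5-byte field ends at 12. That is the kind of case where a padding-minimizing second pass helps.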
Summary:
When building a large Xcode project with multiple module dependencies, and mixed Objective-C & Swift, I observed a large number of clang processes stalling at zero CPU for 30+ seconds throughout the build. This was especially prevalent on my 18-core iMac Pro.
After some sampling, the major cause appears to be the lock file implementation for precompiled modules in the module cache. When the lock is heavily contended by multiple clang processes, the exponential backoff runs in lockstep, with some of the processes sleeping for 30+ seconds in order to acquire the file lock.
In the attached patch, I implemented a more aggressive polling mechanism that limits the sleep interval to a max of 500ms, and randomizes the wait time. I preserved a limited form of exponential backoff. I also updated the code to use cross-platform timing, thread sleep, and random number capabilities available in C++11.
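A bounded, randomized backoff of the kind described can be sketched with the C++11 `<chrono>` and `<random>` facilities. This is an illustration under assumptions, not the LockFileManager code: the `Backoff` class, its constructor parameters and the doubling policy are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical helper: each call to next() picks a random wait in
// [MinWait, CurMax], then doubles CurMax up to the MaxWait cap.
class Backoff {
public:
  using ms = std::chrono::milliseconds;

  Backoff(ms Min, ms Max, unsigned Seed)
      : MinWait(Min), MaxWait(Max), CurMax(Min), Rng(Seed) {
    assert(Min.count() > 0 && Min <= Max);
  }

  ms next() {
    std::uniform_int_distribution<long long> Dist(MinWait.count(),
                                                  CurMax.count());
    ms Wait(Dist(Rng));
    // Limited exponential growth: the window never exceeds MaxWait.
    CurMax = std::min<ms>(CurMax * 2, MaxWait);
    return Wait;
  }

private:
  ms MinWait, MaxWait, CurMax;
  std::mt19937 Rng;
};
```

A waiter would call something like `std::this_thread::sleep_for(B.next())` between polls of the lock file. Seeding each process differently is what keeps contending processes from sleeping in lockstep.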
On iMac Pro (2.3 GHz Intel Xeon W, 18 core):
Xcode 11.1 bundled clang:
502.2 seconds (average of 5 runs)
Custom clang build with LockFileManager patch applied:
276.6 seconds (average of 5 runs)
This is a 1.82x speedup for this use case.
On MacBook Pro (4 core 3.1GHz Intel i7):
Xcode 11.1 bundled clang:
539.4 seconds (average of 2 runs)
Custom clang build with LockFileManager patch applied:
509.5 seconds (average of 2 runs)
As expected, machines with fewer cores benefit less from this change.
```
Call graph:
2992 Thread_393602 DispatchQueue_1: com.apple.main-thread (serial)
2992 start (in libdyld.dylib) + 1 [0x7fff6a1683d5]
2992 main (in clang) + 297 [0x1097a1059]
2992 driver_main(int, char const**) (in clang) + 2803 [0x1097a5513]
2992 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (in clang) + 1608 [0x1097a7cc8]
2992 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (in clang) + 3299 [0x1097dace3]
2992 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (in clang) + 509 [0x1097dcc1d]
2992 clang::FrontendAction::Execute() (in clang) + 42 [0x109818b3a]
2992 clang::ParseAST(clang::Sema&, bool, bool) (in clang) + 185 [0x10981b369]
2992 clang::Parser::ParseFirstTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&) (in clang) + 37 [0x10983e9b5]
2992 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&) (in clang) + 141 [0x10983ecfd]
2992 clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*) (in clang) + 695 [0x10983f3b7]
2992 clang::Parser::ParseObjCAtDirectives(clang::Parser::ParsedAttributesWithRange&) (in clang) + 637 [0x10a9be9bd]
2992 clang::Parser::ParseModuleImport(clang::SourceLocation) (in clang) + 170 [0x10c4841ba]
2992 clang::Parser::ParseModuleName(clang::SourceLocation, llvm::SmallVectorImpl<std::__1::pair<clang::IdentifierInfo*, clang::SourceLocation> >&, bool) (in clang) + 503 [0x10c485267]
2992 clang::Preprocessor::Lex(clang::Token&) (in clang) + 316 [0x1098285cc]
2992 clang::Preprocessor::LexAfterModuleImport(clang::Token&) (in clang) + 690 [0x10cc7af62]
2992 clang::CompilerInstance::loadModule(clang::SourceLocation, llvm::ArrayRef<std::__1::pair<clang::IdentifierInfo*, clang::SourceLocation> >, clang::Module::NameVisibilityKind, bool) (in clang) + 7989 [0x10bba6535]
2992 compileAndLoadModule(clang::CompilerInstance&, clang::SourceLocation, clang::SourceLocation, clang::Module*, llvm::StringRef) (in clang) + 296 [0x10bba8318]
2992 llvm::LockFileManager::waitForUnlock() (in clang) + 91 [0x10b6953ab]
2992 nanosleep (in libsystem_c.dylib) + 199 [0x7fff6a22c914]
2992 __semwait_signal (in libsystem_kernel.dylib) + 10 [0x7fff6a2a0f32]
```
Differential Revision: https://reviews.llvm.org/D69575
Check the path length limit against the length of the UTF-16 version of
the input rather than the UTF-8 equivalent, as the UTF-16 length may be
shorter. Move widenPath from the llvm::sys::path namespace in Path.h to
the llvm::sys::windows namespace in WindowsSupport.h. Only use the
reduced path length limit for create directory. Canonicalize using
sys::path::remove_dots().
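To illustrate why the two lengths can differ, the sketch below counts the UTF-16 code units that a UTF-8 string would occupy. It assumes well-formed UTF-8 and is not the conversion routine LLVM uses; `utf16Length` is a hypothetical helper.

```cpp
#include <cstddef>
#include <string>

// Count UTF-16 code units for a well-formed UTF-8 string.
// 1-, 2- and 3-byte sequences become one code unit; 4-byte sequences
// become a surrogate pair (two code units).
std::size_t utf16Length(const std::string &Utf8) {
  std::size_t Units = 0;
  for (unsigned char C : Utf8) {
    if ((C & 0xC0) == 0x80)
      continue; // Continuation byte: already counted with its lead byte.
    Units += (C >= 0xF0) ? 2 : 1;
  }
  return Units;
}
```

A path made of non-ASCII characters can therefore exceed a limit when measured in UTF-8 bytes while still fitting when measured in UTF-16 code units, which is the unit the Windows limit applies to.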
Differential Revision: https://reviews.llvm.org/D75372
Currently, when building with the Unix support library and `isatty` does
not exist for the target platform (i.e. `HAVE_ISATTY` is false),
compilation of the file `raw_ostream.cpp` will fail due to direct use of
`isatty` in the function `raw_fd_ostream::preferred_buffer_size()`.
Use is_displayed() to fix the problem.
Reviewed By: probinson, MaskRay
Differential Revision: https://reviews.llvm.org/D75278
This adds the --debug-vars option to llvm-objdump, which prints
locations (registers/memory) of source-level variables alongside the
disassembly based on DWARF info. A vertical line is printed for each
live-range, with a label at the top giving the variable name and
location, and the position and length of the line indicating the program
counter range in which it is valid.
Currently, this only works for object files, not executables or shared
libraries.
Differential revision: https://reviews.llvm.org/D70720
After a crash caught by the CrashRecoveryContext, this patch prevents access to dangling pointers in TimerGroup structures before the clang tool exits. Previously, the default TimerGroup had internal linked lists which were still pointing to old Timer or TimerGroup instances, which lived in stack frames released by the CrashRecoveryContext.
Fixes PR45164.
Differential Revision: https://reviews.llvm.org/D76099
Behavior of IEEEFloat::roundToIntegral is aligned with IEEE-754
operation roundToIntegralExact. In particular this function now:
- returns opInvalid for signaling NaNs,
- returns opInexact if the result of rounding differs from argument.
Differential Revision: https://reviews.llvm.org/D75246
Added a write method for TimeTrace that takes two strings representing
file names. The first is any file name that may have been provided by the
user via `time-trace-file` flag, and the second is a fallback that should
be configured by the caller. This method makes it cleaner to write the
trace output because there is no longer a need to check file names at the
caller and simplifies future TimeTrace usages.
Reviewed By: modocache
Differential Revision: https://reviews.llvm.org/D74514
* Delete boilerplate
* Change functions to return `Error`
* Test parsing errors
* Update callers of ARMAttributeParser::parse() to check the `Error` return value.
Since this patch touches nearly everything in the file, I apply
http://llvm.org/docs/Proposals/VariableNames.html and change variable
names to lower case.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D75015
and follow-ups:
a2ca1c2d "build: disable zlib by default on Windows"
2181bf40 "[CMake] Link against ZLIB::ZLIB"
1079c68a "Attempt to fix ZLIB CMake logic on Windows"
This changed the output of llvm-config --system-libs, and more
importantly it broke stand-alone builds. Instead of piling on more fix
attempts, let's revert this to reduce the risk of more breakages.
This patch upstreams support for the ARM Armv8.1m cpu Cortex-M55.
In detail adding support for:
- mcpu option in clang
- Arm Target Features in clang
- llvm Arm TargetParser definitions
details of the CPU can be found here:
https://developer.arm.com/ip-products/processors/cortex-m/cortex-m55
Reviewers: chill
Reviewed By: chill
Subscribers: dmgreen, kristof.beyls, hiraditya, cfe-commits,
llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74966
Lots of headers pass around MemoryBuffer objects, but very few open
them. Let those that do include FileSystem.h.
Saves ~250 includes of Chrono.h & FileSystem.h:
$ diff -u thedeps-before.txt thedeps-after.txt | grep '^[-+] ' | sort | uniq -c | sort -nr
254 - ../llvm/include/llvm/Support/FileSystem.h
253 - ../llvm/include/llvm/Support/Chrono.h
237 - ../llvm/include/llvm/Support/NativeFormatting.h
237 - ../llvm/include/llvm/Support/FormatProviders.h
192 - ../llvm/include/llvm/ADT/StringSwitch.h
190 - ../llvm/include/llvm/Support/FormatVariadicDetails.h
...
This requires duplicating the file_t typedef, which is unfortunate. I
sunk the choice of mapping mode down into the cpp file using variable
template specializations instead of class members in headers.
llvm-ar is using CompareStringOrdinal which is available
only starting with Windows Vista (WINVER 0x600).
Fix this by hoisting WindowsSupport.h, which sets _WIN32_WINNT
to 0x0601, up to llvm/include/llvm/Support and use it in llvm-ar.
Patch by Cristian Adam!
Differential revision: https://reviews.llvm.org/D74599
Summary: Include the offset at which this happened.
Reviewers: dblaikie, jhenderson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75265
MathExtras.h was just wrapping SwapByteOrder.h functionality, so have
the callers use it directly. Use the MathExtras.h name (ByteSwap_NN) as
the standard naming, since it appears to be the most popular.
Summary:
These modifications will be used in D74883.
Fixed length C strings can have trailing NULLs or sometimes spaces (BSD archive files), so the fixed length C string extraction defaults to stripping trailing NULLs, but its arguments can specify one or more kinds of spaces to strip as well if needed. This is used to extract fixed length C strings from ELF NOTEs in D74883.
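The trimming behaviour can be sketched as below. This is a standalone illustration, not the DataExtractor method itself: `trimFixedLengthString` and its `TrimChars` parameter are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Read exactly Length bytes and strip any trailing characters found in
// TrimChars. By default only trailing NUL bytes are stripped.
std::string trimFixedLengthString(
    const char *Data, std::size_t Length,
    const std::string &TrimChars = std::string(1, '\0')) {
  assert(Data != nullptr);
  std::string S(Data, Length);
  std::size_t Last = S.find_last_not_of(TrimChars);
  return Last == std::string::npos ? std::string() : S.substr(0, Last + 1);
}
```

For a NUL-padded 8-byte field holding `abc`, the default call returns `abc`; for a space-padded field, the caller passes `" "` as the third argument to strip the trailing spaces instead.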
Reviewers: labath, dblaikie, aprantl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74991
This patch upstreams support for the AArch64 Armv8-A cpu Cortex-A34.
In detail adding support for:
- mcpu option in clang
- AArch64 Target Features in clang
- llvm AArch64 TargetParser definitions
details of the cpu can be found here:
https://developer.arm.com/ip-products/processors/cortex-a/cortex-a34
Reviewers: SjoerdMeijer
Reviewed By: SjoerdMeijer
Subscribers: SjoerdMeijer, kristof.beyls, hiraditya, cfe-commits,
llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74483
Change-Id: Ida101fc544ca183a0a0e61a1277c8957855fde0b
The CheckAtomic module performs two tests to determine if passing
'-latomic' to the linker is required: one for 64-bit atomics, and
another for non-64-bit atomics. Include the missing check for 64-bit
atomics.
Reviewers: beanz, compnerd
Reviewed By: beanz, compnerd
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69444
As noted on D74621, the bswap intrinsic has a self imposed limitation that the type's bitwidth must be divisible by 16, but there's no reason that APInt::byteSwap must have the same limitation, given that it can already handle any byte width.
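A byte swap over an arbitrary number of bytes can be sketched on a plain 64-bit integer. This is not APInt's implementation; `byteSwapN` is a hypothetical helper that only shows that nothing in the operation requires the width to be a multiple of 16 bits.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the order of the low NumBytes bytes of Value (1 <= NumBytes <= 8).
// Any bytes above NumBytes are discarded.
uint64_t byteSwapN(uint64_t Value, unsigned NumBytes) {
  assert(NumBytes >= 1 && NumBytes <= 8);
  uint64_t Result = 0;
  for (unsigned I = 0; I < NumBytes; ++I) {
    Result = (Result << 8) | (Value & 0xFF); // Move the lowest byte up.
    Value >>= 8;
  }
  return Result;
}
```

For example, a 24-bit value `0x123456` swaps to `0x563412`, even though 24 is not divisible by 16.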
Summary:
This review is extracted from D74308.
It creates two error handlers that allow the error reporting
routine to be redefined; they should be used in all places
where errors are reported:
std::function<void(Error)> RecoverableErrorHandler = defaultErrorHandler;
std::function<void(Error)> WarningHandler = defaultWarningHandler;
It also creates accessors to above handlers which should be used to
report errors.
function_ref<void(Error)> getRecoverableErrorHandler() {
return RecoverableErrorHandler;
}
function_ref<void(Error)> getWarningHandler() { return WarningHandler; }
It patches all error reporting places inside DWARFContext and DWARFLinker.
Reviewers: jhenderson, dblaikie, probinson, aprantl, JDevlieghere
Reviewed By: jhenderson, JDevlieghere
Subscribers: hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D74481