These names have been changed from CamelCase to camelCase, but there were
many places (comments mostly) that still used the old names.
This change is NFC.
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.
- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute
AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).
Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.
Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.
-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.
Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.
This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.
AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
It's been an empty target since r360498 and friends
(`git log --grep='Move InstPrinter files to MCTargetDesc.' llvm/lib/Target`),
but due to hwo the way these targets are structured it was silently
an empty target without anyone noticing.
No behavior change.
This patch updates the formatting and language of the Features section of the
ORCv2 design document. It also fixes a TBD by adding discussion of the
absoluteSymbols, symbolAliases, and reexports utilities.
Typos found during editing were also fixed.
When LLVM_APPEND_VC_REV=OFF is set, the current git hash is no
longer embedded into binaries (mostly for --version output).
Without it, most binaries need to relink after every single
commit, even if they didn't change otherwise (due to, say,
a documentation-only commit).
LLVM_APPEND_VC_REV is ON by default, so this doesn't change the
default behavior of anything.
With this, all clients of GenerateVersionFromVCS.cmake honor
LLVM_APPEND_VC_REV.
Differential Revision: https://reviews.llvm.org/D72855
Summary:
This patch could be treated as a rebase of D33960. It also fixes PR35547.
A fix for `llvm/test/Other/close-stderr.ll` is proposed in D68164. Seems
the consensus is that the test is passing by chance and I'm not
sure how important it is for us. So it is removed like in D33960 for now.
The rest of the test fixes are just adding `--crash` flag to `not` tool.
** The reason it fixes PR35547 is
`exit` does cleanup including calling class destructor whereas `abort`
does not do any cleanup. In multithreading environment such as ThinLTO or JIT,
threads may share states which mostly are ManagedStatic<>. If faulting thread
tearing down a class when another thread is using it, there are chances of
memory corruption. This is bad 1. It will stop error reporting like pretty
stack printer; 2. The memory corruption is distracting and nondeterministic in
terms of error message, and corruption type (depending one the timing, it
could be double free, heap free after use, etc.).
Reviewers: rnk, chandlerc, zturner, sepavloff, MaskRay, espindola
Reviewed By: rnk, MaskRay
Subscribers: wuzish, jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, arichardson, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, lenary, s.egerton, pzheng, cfe-commits, MaskRay, filcab, davide, MatzeB, mehdi_amini, hiraditya, steven_wu, dexonsmith, rupprecht, seiya, llvm-commits
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D67847
This updates the discussion of lazy reexports, fixes a TBD for a usage example,
and adds a reference to the fully worked lazy reexports example that was added
in e9e26c01cd.
When using the option, draw the histogram representing the debug
location buckets. The resulting histogram will be saved in a png
file.
Differential Revision: https://reviews.llvm.org/D71869
Making these changes, the code becomes more robust and easier for
adding the new features.
-Introduce the LocationStats class representing the statistics
-Add the pretty_print() method in the LocationStats class
-Add additional '-' for the program options
-Add the verify_program_inputs() function
-Add the parse_locstats() function
-Rename 'results' => 'opts'
-Add more comments
Differential Revision: https://reviews.llvm.org/D71868
Summary:
This allows you to make some of the defs in a multiclass or `foreach`
conditional on an expression computed from the parameters or iteration
variables.
It was already possible to simulate an if statement using a `foreach`
with a dummy iteration variable and a list constructed using `!if` so
that it had length 0 or 1 depending on the condition, e.g.
foreach unusedIterationVar = !if(condition, [1], []<int>) in { ... }
But this syntax is nicer to read, and also more convenient because it
allows an else clause.
To avoid upheaval in the implementation, I've implemented `if` as pure
syntactic sugar on the `foreach` implementation: internally, `ParseIf`
actually does construct exactly the kind of foreach shown above (and
another reversed one for the else clause if present).
Reviewers: nhaehnle, hfinkel
Reviewed By: hfinkel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71474
Summary:
This allows you to define a global or local variable to an arbitrary
value, and refer to it in subsequent definitions.
The main use I anticipate for this is if you have to compute some
difficult function of the parameters of a multiclass, and then use it
many times. For example:
multiclass Foo<int i, string s> {
defvar op = !cast<BaseClass>("whatnot_" # s # "_" # i);
def myRecord {
dag a = (op this, (op that, the other), (op x, y, z));
int b = op.subfield;
}
def myOtherRecord<"template params including", op>;
}
There are a couple of ways to do this already, but they're not really
satisfactory. You can replace `defvar x = y` with a loop over a
singleton list, `foreach x = [y] in { ... }` - but that's unintuitive
to someone who hasn't seen that workaround idiom before, and requires
an extra pair of braces that you often didn't really want. Or you can
define a nested pair of multiclasses, with the inner one taking `x` as
a template parameter, and the outer one instantiating it just once
with the desired value of `x` computed from its other parameters - but
that makes it awkward to sequentially compute each value based on the
previous ones. I think `defvar` makes things considerably easier.
You can also use `defvar` at the top level, where it inserts globals
into the same map used by `defset`. That allows you to define global
constants without having to make a dummy record for them to live in:
defvar MAX_BUFSIZE = 512;
// previously:
// def Dummy { int MAX_BUFSIZE = 512; }
// and then refer to Dummy.MAX_BUFSIZE everywhere
Reviewers: nhaehnle, hfinkel
Reviewed By: hfinkel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71407
Summary:
The older method of adding 'Patch by John Doe' is documented in the
`Attribution of Changes` section to support correct attribution of commits
that pre-date the adoption of git.
Reviewers: hfinkel, aaron.ballman, mehdi_amini
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72468
Summary:
This patch adds intrinsics and ISelDAG nodes for
signed and unsigned fixed-point division:
llvm.sdiv.fix.*
llvm.udiv.fix.*
These intrinsics perform scaled division on two
integers or vectors of integers. They are required
for the implementation of the Embedded-C fixed-point
arithmetic in Clang.
Patch by: ebevhan
Reviewers: bjope, leonardchan, efriedma, craig.topper
Reviewed By: craig.topper
Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70007
Summary: As a novice here I tried to `git push` my changes for a while before figuring out the correct workflow which is described on other pages. This small change doesn't reduce redundancy between those pages, but at least readers can follow the links now.
Reviewers: Kokan, Jim
Reviewed By: Kokan, Jim
Subscribers: riccibruno, kiszk, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72077
Summary:
Remove the restrictions that preventing "asm goto" from returning non-void
values. The values returned by "asm goto" are only valid on the "fallthrough"
path.
Reviewers: jyknight, nickdesaulniers, hfinkel
Reviewed By: jyknight, nickdesaulniers
Subscribers: rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69876
Adds the RISC-V asm template argument modifiers currently supported by LLVM.
Additional ones supported by GCC will be added to the documentation when we
start supporting them.
There's quite a lot of references to Polly in the LLVM CMake codebase. However
the registration pattern used by Polly could be useful to other external
projects: thanks to that mechanism it would be possible to develop LLVM
extension without touching the LLVM code base.
This patch has two effects:
1. Remove all code specific to Polly in the llvm/clang codebase, replaicing it
with a generic mechanism
2. Provide a generic mechanism to register compiler extensions.
A compiler extension is similar to a pass plugin, with the notable difference
that the compiler extension can be configured to be built dynamically (like
plugins) or statically (like regular passes).
As a result, people willing to add extra passes to clang/opt can do it using a
separate code repo, but still have their pass be linked in clang/opt as built-in
passes.
Differential Revision: https://reviews.llvm.org/D61446
D56351 (included in LLVM 8.0.0) introduced "frame-pointer". All tests
which use "no-frame-pointer-elim" or "no-frame-pointer-elim-non-leaf"
have been migrated to use "frame-pointer".
Implement UpgradeFramePointerAttributes to upgrade the two obsoleted
function attributes for bitcode. Their semantics are ignored.
Differential Revision: https://reviews.llvm.org/D71863
llvm-symbolizer is used by sanitizers to symbolize errors discovered by
sanitizer, but there's no way to pass options to llvm-symbolizer since
the tool is invoked directly by the sanitizer runtime. Therefore, we
don't have a way to pass options needed to find debug symbols such as
-dsym-hint or -debug-file-directory. This change enables reading options
from the LLVM_SYMBOLIZER_OPTS in addition to command line which can be
used to pass those additional options to llvm-symbolizer invocations
made by sanitizer runtime.
Differential Revision: https://reviews.llvm.org/D71668
Add new intrinsics
llvm.experimental.constrained.minimum
llvm.experimental.constrained.maximum
as strict versions of llvm.minimum and llvm.maximum.
Includes SystemZ back-end support.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D71624
Remappings involving extern "C" names were already supported in the
context of <local-name>s, but this support didn't work for remapping the
complete mangling itself. (Eg, we would remap X<foo> but not foo itself,
if foo is an extern "C" function.)
The following intrinsics currently carry a rounding mode metadata argument:
llvm.experimental.constrained.minnum
llvm.experimental.constrained.maxnum
llvm.experimental.constrained.ceil
llvm.experimental.constrained.floor
llvm.experimental.constrained.round
llvm.experimental.constrained.trunc
This is not useful since the semantics of those intrinsics do not in any way
depend on the rounding mode. In similar cases, other constrained intrinsics
do not have the rounding mode argument. Remove it here as well.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D71218
of integers to floating point.
This includes some of Craig Topper's changes for promotion support from
D71130.
Differential Revision: https://reviews.llvm.org/D69275
Summary:
This is a follow up for D70548.
Currently, variables with debug info coverage between 0% and 1% are put into
zero-bucket. D70548 changed the way statistics calculate a variable's coverage:
we began to use enclosing scope rather than a possible variable life range.
Thus more variables might be moved to zero-bucket despite they have some debug
info coverage.
The patch is to distinguish between a variable that has location info but
it's significantly less than its enclosing scope and a variable that doesn't
have it at all.
Reviewers: djtodoro, aprantl, dblaikie, avl
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71070
Summary:
This changes the representation of 'coverage buckets' in llvm-dwarfdump and
llvm-locstats to one that makes more clear what the buckets contain.
See some related details in D71070.
Reviewers: djtodoro, aprantl, cmtice, jhenderson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71366
The target feature matrix in the code generator documentation is
outdated. This PR fixes some entries for PowerPC and SystemZ.
Both have:
- assembly parser
- disassembler
- .o file writing
Reviewers: uweigand
Differential Revision: https://reviews.llvm.org/D71004
Summary:
- Clarify AMDGPU address spaces.
- Correct path to AMDGPU backend since now in the mono-repo.
- Fix numerous text style and typo issues.
- Correct reStructure text formatting warnings.
- Made reStructure directive usage more consistent.
- Add references for gfx10 ISA specification.
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71392
This is the first patch adding an initial set of matrix intrinsics and a
corresponding lowering pass. This has been discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html
The first patch introduces four new intrinsics (transpose, multiply,
columnwise load and store) and a LowerMatrixIntrinsics pass, that
lowers those intrinsics to vector operations.
Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix
embedded in a <16 x float> vector) and the intrinsics take the dimension
information as parameters. Those parameters need to be ConstantInt.
For the memory layout, we initially assume column-major, but in the RFC
we also described how to extend the intrinsics to support row-major as
well.
For the initial lowering, we split the input of the intrinsics into a
set of column vectors, transform those column vectors and concatenate
the result columns to a flat result vector.
This allows us to lower the intrinsics without any shape propagation, as
mentioned in the RFC. In follow-up patches, we plan to submit the
following improvements:
* Shape propagation to eliminate the embedding/splitting for each
intrinsic.
* Fused & tiled lowering of multiply and other operations.
* Optimization remarks highlighting matrix expressions and costs.
* Generate loops for operations on large matrixes.
* More general block processing for operation on large vectors,
exploiting shape information.
We would like to add dedicated transpose, columnwise load and store
intrinsics, even though they are not strictly necessary. For example, we
could instead emit a large shufflevector instruction instead of the
transpose. But we expect that to
(1) become unwieldy for larger matrixes (even for 16x16 matrixes,
the resulting shufflevector masks would be huge),
(2) risk instcombine making small changes, causing us to fail to
detect the transpose, preventing better lowerings
For the load/store, we are additionally planning on exploiting the
intrinsics for better alias analysis.
Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D70456
Summary:
These allow you to get and set the operator of a dag node, without
affecting its list of arguments.
`!getop` is slightly fiddly because in many contexts you need its
return value to have a static type more specific than 'any record'. It
works to say `!cast<BaseClass>(!getop(...))`, but it's cumbersome, so
I made `!getop` take an optional type suffix itself, so that can be
written as the shorter `!getop<BaseClass>(...)`.
Reviewers: hfinkel, nhaehnle
Reviewed By: nhaehnle
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71191
New C code snippet is more viable for SLP vectorization in most architectures.
Patch by: @lsandov1 (Leonardo Sandoval)
Differential Revision: https://reviews.llvm.org/D70866
This adds support for constrained floating-point comparison intrinsics.
Specifically, we add:
declare <ty2>
@llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
metadata <condition code>,
metadata <exception behavior>)
declare <ty2>
@llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
metadata <condition code>,
metadata <exception behavior>)
The first variant implements an IEEE "quiet" comparison (i.e. we only
get an invalid FP exception if either argument is a SNaN), while the
second variant implements an IEEE "signaling" comparison (i.e. we get
an invalid FP exception if either argument is any NaN).
The condition code is implemented as a metadata string. The same set
of predicates as for the fcmp instruction is supported (except for the
"true" and "false" predicates).
These new intrinsics are mapped by SelectionDAG codegen onto two new
ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again
representing quiet vs. signaling comparison operations. Otherwise
those nodes look like SETCC nodes, with an additional chain argument
and result as usual for strict FP nodes. The patch includes support
for the common legalization operations for those nodes.
The patch also includes full SystemZ back-end support for the new
ISD nodes, mapping them to all available SystemZ instruction to
fully implement strict semantics (scalar and vector).
Differential Revision: https://reviews.llvm.org/D69281
Summary:
Add a new cl::callback attribute to Option.
This attribute specifies a callback function that is called when
an option is seen, and can be used to set other options, as in
option A implies option B. If the option is a `cl::list`, and
`cl::CommaSeparated` is also specified, the callback will fire
once for each value. This could be used to validate combinations
or selectively set other options.
Reviewers: beanz, thomasfinch, MaskRay, thopre, serge-sans-paille
Reviewed By: beanz
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70620
Revise the coverage mapping format to reduce binary size by:
1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.
This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).
Rationale for changes to the format:
- With the current format, most coverage function records are discarded.
E.g., more than 97% of the records in llc are *duplicate* placeholders
for functions visible-but-not-used in TUs. Placeholders *are* used to
show under-covered functions, but duplicate placeholders waste space.
- We reached general consensus about giving (1) a try at the 2017 code
coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
duplicates is simpler than alternatives like teaching build systems
about a coverage-aware database/module/etc on the side.
- Revising the format is expensive due to the backwards compatibility
requirement, so we might as well compress filenames while we're at it.
This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).
See CoverageMappingFormat.rst for the details on what exactly has
changed.
Fixes PR34533 [2], hopefully.
[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533
Differential Revision: https://reviews.llvm.org/D69471
Summary: There is a discussion of git-format-patch in GettingStarted guide, but no mention of it in the Phabricator.html page.
Reviewers: jyknight, delcypher
Reviewed By: delcypher
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69323
This revision is revised to update Go-bindings and Release Notes.
The original commit message follows.
This patch, adds support for DW_AT_alignment[DWARF5] attribute, to be emitted with typdef DIE.
When explicit alignment is specified.
Patch by Awanish Pandey <Awanish.Pandey@amd.com>
Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok,
deadalinx
Differential Revision: https://reviews.llvm.org/D70111
Without this patch, `FILECHECK_OPTS` isn't propagated to FileCheck's
test suite so that `FILECHECK_OPTS` doesn't inadvertently affect test
results by affecting the output of FileCheck calls under test. As a
result, `FILECHECK_OPTS` is useless for debugging FileCheck's test
suite.
In `llvm/test/FileCheck/lit.local.cfg`, this patch provides a new
subsitution, `%ProtectFileCheckOutput`, to address this problem for
both `FILECHECK_OPTS` and the deprecated
`FILECHECK_DUMP_INPUT_ON_FAILURE`. The rest of the patch uses
`%ProtectFileCheckOutput` throughout the test suite
Fixes PR40284.
Reviewed By: probinson, thopre
Differential Revision: https://reviews.llvm.org/D65121
Hostcall is a service that allows a kernel to submit requests to the
host using shared buffers, and block until a response is
received. This will eventually replace the shared buffer currently
used for printf, and repurposes the same hidden kernel argument. This
change introduces a new ValueKind in the HSA metadata to represent the
hostcall buffer.
Differential Revision: https://reviews.llvm.org/D70038
Summary
In several places we need to enumerate all constrained intrinsics or IR
nodes that should be represented by them. It is easy to miss some of
the cases. To make working with these intrinsics more convenient and
robust, this change introduces file containing definitions of all
constrained intrinsics and some of their properties. This file can be
included to generate constrained intrinsics processing code.
Reviewers: kpn, andrew.w.kaylor, cameron.mcinally, uweigand
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69887
Several links in this document referred to `LangImpl4.html` or
`LangImpl7.html`. However, now these pages use two digits, so for these
links to function they need to be modified to `LangImpl04.html`, and so
on -- note the extra `0`.
Avoids the need to include TargetMachine.h from various places just for
an enum. Various other enums live here, such as the optimization level,
TLS model, etc. Data suggests that this change probably doesn't matter,
but it seems nice to have anyway.
The parsing error tests in ELF/redefine-symbols.test are not specific to ELF.
Move them to redefine-symbols.test.
Add COFF/redefine-symbols.test for COFF specific tests.
Also fix the documentation regarding --redefine-syms: the old and new
names are separated by whitespace, not an equals sign.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D70036
Summary: The options aren't supported so they can be removed.
Reviewers: beanz, smeenai, compnerd
Reviewed By: compnerd
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69877
It looks like I pushed an older version of this commit without the review
fixups earlier. This applies the review changes
Differential Revision: https://reviews.llvm.org/D69545
Summary:
Rework the GMIR documentation to focus more on the end user than the
implementation and tie it in to the MIR document. There was also some
out-of-date information which has been removed.
The quality of the GenericOpcode reference is highly variable and drops
sharply as I worked through them all but we've got to start somewhere :-).
It would be great if others could expand on this too as there is an awful
lot to get through.
Also fix a typo in the definition of G_FLOG. Previously, the comments said
we had two base-2's (G_FLOG and G_FLOG2).
Reviewers: aemerson, volkan, rovka, arsenm
Reviewed By: rovka
Subscribers: wdng, arphaman, jfb, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69545
Summary:
This is largely based off of the slides from the keynote
Depends on D69545
Reviewers: volkan, rovka, arsenm
Subscribers: wdng, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69644
--only-keep-debug produces a debug file as the output that only
preserves contents of sections useful for debugging purposes (the
binutils implementation preserves SHT_NOTE and non-SHF_ALLOC sections),
by changing their section types to SHT_NOBITS and rewritting file
offsets.
See https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
The intended use case is:
```
llvm-objcopy --only-keep-debug a a.dbg
llvm-objcopy --strip-debug a b
llvm-objcopy --add-gnu-debuglink=a.dbg b
```
The current layout algorithm is incapable of deleting contents and
shrinking segments, so it is not suitable for implementing the
functionality.
This patch adds a new algorithm which assigns sh_offset to sections
first, then modifies p_offset/p_filesz of program headers. It bears a
resemblance to lld/ELF/Writer.cpp.
Reviewed By: jhenderson, jakehehrlich
Differential Revision: https://reviews.llvm.org/D67137
-mvzeroupper will force the vzeroupper insertion pass to run on
CPUs that normally wouldn't. -mno-vzeroupper disables it on CPUs
where it normally runs.
To support this with the default feature handling in clang, we
need a vzeroupper feature flag in X86.td. Since this flag has
the opposite polarity of the fast-partial-ymm-or-zmm-write we
used to use to disable the pass, we now need to add this new
flag to every CPU except KNL/KNM and BTVER2 to keep identical
behavior.
Remove -fast-partial-ymm-or-zmm-write which is no longer used.
Differential Revision: https://reviews.llvm.org/D69786
This reverts commit 004ed2b0d1.
Original commit hash 6d03890384
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.
https://reviews.llvm.org/D67723
As discussed in https://bugs.llvm.org/show_bug.cgi?id=43870,
this transform is missing a crucial legality check:
the old (non-countable) loop would early-return upon first mismatch,
but there is no such guarantee for bcmp/memcmp.
We'd need to ensure that [PtrA, PtrA+NBytes) and [PtrB, PtrB+NBytes)
are fully dereferenceable memory regions. But that would limit
the transform to constant loop trip counts and would further
cripple it because dereferenceability analysis is *very* partial.
Furthermore, even if all that is done, every single test
would need to be rewritten from scratch.
So let's just give up.
Summary: A lot of this is work in progress...
Reviewers: kcc, pcc
Subscribers: cryptoad, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69289
This works around a bug in Debian's patchset for glibc. The bug is
described in detail in the upstream debian bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=943798, but the short
version of it is that glibc on any Debian based distro don't load
libraries unless it has a .ARM.attribute section.
Reviewed by: jhenderson, rupprecht, MaskRay, jakehehrlich
Differential Revision: https://reviews.llvm.org/D69188
Patch by Tobias Hieta.
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.
See https://bugs.llvm.org/show_bug.cgi?id=42344
Reviewers: rnk
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D67723
The legalizer page was in a fairly good state. I've mostly just inlined
some information as a note and removed a reference to potential future
work that I think is very unlikely to be done (it's very hard to tell if
a pattern or set of patterns fully covers a node due to C++ predicates).
Also added a note that 'selectable' doesn't mean that InstructionSelect
must do it.
Summary:
Delete the BasicBlockPass and BasicBlockManager, all its dependencies and update documentation.
The BasicBlockManager was improperly tested and found to be potentially broken, and was deprecated as of rL373254.
In light of the switch to the new pass manager coming before the next release, this patch is a first cleanup of the LegacyPassManager.
Reviewers: chandlerc, echristo
Subscribers: mehdi_amini, sanjoy.google, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69121
Summary:
This extends the rules for when a call instruction is deemed to be an
FPMathOperator, which is based on the type of the call (i.e. the return
type of the function being called). Previously we only allowed
floating-point and vector-of-floating-point types. Now we also allow
arrays (nested to any depth) of floating-point and
vector-of-floating-point types.
This was motivated by llpc, the pipeline compiler for AMD GPUs
(https://github.com/GPUOpen-Drivers/llpc). llpc has many math library
functions that operate on vectors, typically represented as <4 x float>,
and some that operate on matrices, typically represented as
[4 x <4 x float>], and it's useful to be able to decorate calls to all
of them with fast math flags.
Reviewers: spatel, wristow, arsenm, hfinkel, aemerson, efriedma, cameron.mcinally, mcberg2017, jmolloy
Subscribers: wdng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69161
Add the `-whitelist-filename-regex` option to restrict coverage
reporting to file paths that match a whitelist regex.
Patch by Michael Daniels!
rdar://56720320
This patch adds support for deleted C++ special member functions in
clang and llvm. Also added Defaulted member encodings for future
support for defaulted member functions.
Patch by Sourabh Singh Tomar!
Differential Revision: https://reviews.llvm.org/D69215
I had hoped that I could have some
```
.. code-block:: MIR
```
sections for MIR examples which causes a warning about pygments not
supporting it but we have warnings treated as errors
Summary:
I haven't refreshed the Function Calls section as I don't feel I have
sufficient knowledge of that area. It would be appreciated if someone could
review that section.
Note: I'm aware that pygments doesn't support 'mir' as used in one of the
code-block directives. This currently emits a warning and I decided to
keep it to enable finding them later. Maybe we can teach pygments to
support it.
Depends on D69456
Reviewers: volkan, aditya_nandakumar
Subscribers: rovka, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69457
Summary:
Rewrite the pipeline overview to be more focused on the structure and
flexibility as well as highlight the increased usefulness of
MachineVerifier and increased testability resulting from the smaller
incremental passes approach.
The diagrams are lifted from the slides for the LLVMDev 2019 talk
'Generating Optimized Code with GlobalISel' and adapted to be readable on
the white background used in the docs.
Reviewers: volkan
Subscribers: rovka, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69456
Emit a remarks section by default for the following formats:
* bitstream
* yaml-strtab
while still providing -remarks-section=<bool> to override the defaults.
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.
Reviewers: thakis, rnk, theraven, pcc
Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65761
This extension point is not needed. Provide the equivalent option
through `CMAKE_CXX_STANDARD` which mirrors the previous extension point. Rely on
CMake to provide the check for the compiler instead.
This patch was reviewed and approved by chandlerc.
"Getting Started with the LLVM System" is the first point of contact for many newcomers in the LLVM community.
* Make the first two paragraphs more welcoming
* Use more inclusive language
The llvm-ar command guide had not been updated in some time, it was
missing current functionality and contained information that was out
of date. This change:
- Updates the use of reStructuredText directives, as seen in other tools
command guides.
- Updates the command synopsis.
- Updates the descriptions of the tool behaviour.
- Updates the options section.
- Adds details of MRI script functionality.
- Removes the sections "Standards" and "File Format"
Differential Revision: https://reviews.llvm.org/D68998
llvm-svn: 375412
Summary: GNU objcopy accepts the --wildcard flag to allow wildcard matching on symbol-related flags. (Note: it's implicitly true for section flags).
The basic syntax is to allow *, ?, \, and [] which work similarly to how they work in a shell. Additionally, starting a wildcard with ! causes that wildcard to prevent it from matching a flag.
Use an updated GlobPattern in libSupport to handle these patterns. It does not fully match the `fnmatch` used by GNU objcopy since named character classes (e.g. `[[:digit:]]`) are not supported, but this should support most existing use cases (mostly just `*` is what's used anyway).
Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap
Reviewed By: MaskRay
Subscribers: nickdesaulniers, emaste, arichardson, hiraditya, jakehehrlich, abrachet, seiya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66613
llvm-svn: 375169
Since GNU ar 2.31, the 't' operation prints member offsets beside file
names if the 'O' modifier is specified. 'O' is ignored for thin
archives.
Reviewed By: gbreynoo, ruiu
Differential Revision: https://reviews.llvm.org/D69087
llvm-svn: 375106
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.
Original commit message:
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.
This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.
To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.
The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.
This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.
To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.
I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.
On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.
I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.
I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).
Differential revision: https://reviews.llvm.org/D63932
llvm-svn: 375094