Commit Graph

7955 Commits

Author SHA1 Message Date
Richard Smith b6e90a1a10 llvm-cxxmap: fix support for remapping non-mangled names.
Remappings involving extern "C" names were already supported in the
context of <local-name>s, but this support didn't work for remapping the
complete mangling itself. (Eg, we would remap X<foo> but not foo itself,
if foo is an extern "C" function.)
2019-12-18 10:47:02 -08:00
Justin Bogner b6f5caa48f [docs] Remove `git llvm push` and `git llvm revert` from GettingStarted
These sections aren't accurate since the github move.

Differential Revision: https://reviews.llvm.org/D71640
2019-12-17 17:16:20 -08:00
Ulrich Weigand 1e89188d35 [FPEnv] Remove unnecessary rounding mode argument for constrained intrinsics
The following intrinsics currently carry a rounding mode metadata argument:

    llvm.experimental.constrained.minnum
    llvm.experimental.constrained.maxnum
    llvm.experimental.constrained.ceil
    llvm.experimental.constrained.floor
    llvm.experimental.constrained.round
    llvm.experimental.constrained.trunc

This is not useful since the semantics of those intrinsics do not in any way
depend on the rounding mode. In similar cases, other constrained intrinsics
do not have the rounding mode argument. Remove it here as well.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D71218
2019-12-17 21:10:36 +01:00
Kevin P. Neal b1d8576b0a This adds constrained intrinsics for the signed and unsigned conversions
of integers to floating point.

This includes some of Craig Topper's changes for promotion support from
D71130.

Differential Revision: https://reviews.llvm.org/D69275
2019-12-17 10:06:51 -05:00
Dmitri Gribenko 51707196a0 Fix title underline in LangRef
The docs didn't compile:
http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/38906
2019-12-16 09:05:13 +01:00
Seiya Nuta 9e119ad69d
[llvm-objcopy][MachO] Implement --add-section
Reviewers: alexshap, rupprecht, jhenderson

Reviewed By: alexshap, jhenderson

Subscribers: mgorny, jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66283
2019-12-16 14:07:29 +09:00
Kristina Bessonova d5655c4d2e [llvm-dwarfdump][Statistics] Don't count coverage less than 1% as 0%
Summary:
This is a follow up for D70548.
Currently, variables with debug info coverage between 0% and 1% are put into
zero-bucket. D70548 changed the way statistics calculate a variable's coverage:
we began to use enclosing scope rather than a possible variable life range.
Thus more variables might be moved to zero-bucket despite they have some debug
info coverage.
The patch is to distinguish between a variable that has location info but
it's significantly less than its enclosing scope and a variable that doesn't
have it at all.

Reviewers: djtodoro, aprantl, dblaikie, avl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71070
2019-12-13 17:34:58 +03:00
Kristina Bessonova 1cc4b603ba [llvm-dwarfdump][Statistics] Change the coverage buckets representation. NFC
Summary:
This changes the representation of 'coverage buckets' in llvm-dwarfdump and
llvm-locstats to one that makes more clear what the buckets contain.

See some related details in D71070.

Reviewers: djtodoro, aprantl, cmtice, jhenderson

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71366
2019-12-13 16:08:25 +03:00
Kai Nacke caa7c9e6f3 [Docs] Fix target feature matrix for PowerPC and SystemZ
The target feature matrix in the code generator documentation is
outdated. This PR fixes some entries for PowerPC and SystemZ.

Both have:
- assembly parser
- disassembler
- .o file writing

Reviewers: uweigand

Differential Revision: https://reviews.llvm.org/D71004
2019-12-13 06:18:08 -05:00
Tony 7a54f727a2 [AMDGPU] AMDGPUUsage clarify address space information and other typo and formatting fixes
Summary:
- Clarify AMDGPU address spaces.
- Correct path to AMDGPU backend since now in the mono-repo.
- Fix numerous text style and typo issues.
- Correct reStructure text formatting warnings.
- Made reStructure directive usage more consistent.
- Add references for gfx10 ISA specification.

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71392
2019-12-12 14:51:27 -05:00
Florian Hahn 526244b187 [Matrix] Add first set of matrix intrinsics and initial lowering pass.
This is the first patch adding an initial set of matrix intrinsics and a
corresponding lowering pass. This has been discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html

The first patch introduces four new intrinsics (transpose, multiply,
columnwise load and store) and a LowerMatrixIntrinsics pass, that
lowers those intrinsics to vector operations.

Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix
embedded in a <16 x float> vector) and the intrinsics take the dimension
information as parameters. Those parameters need to be ConstantInt.
For the memory layout, we initially assume column-major, but in the RFC
we also described how to extend the intrinsics to support row-major as
well.

For the initial lowering, we split the input of the intrinsics into a
set of column vectors, transform those column vectors and concatenate
the result columns to a flat result vector.

This allows us to lower the intrinsics without any shape propagation, as
mentioned in the RFC. In follow-up patches, we plan to submit the
following improvements:
 * Shape propagation to eliminate the embedding/splitting for each
   intrinsic.
 * Fused & tiled lowering of multiply and other operations.
 * Optimization remarks highlighting matrix expressions and costs.
 * Generate loops for operations on large matrixes.
 * More general block processing for operation on large vectors,
   exploiting shape information.

We would like to add dedicated transpose, columnwise load and store
intrinsics, even though they are not strictly necessary. For example, we
could instead emit a large shufflevector instruction instead of the
transpose. But we expect that to
  (1) become unwieldy for larger matrixes (even for 16x16 matrixes,
      the resulting shufflevector masks would be huge),
  (2) risk instcombine making small changes, causing us to fail to
      detect the transpose, preventing better lowerings

For the load/store, we are additionally planning on exploiting the
intrinsics for better alias analysis.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70456
2019-12-12 15:42:18 +00:00
Simon Tatham 1fed9a0c0c [TableGen] Add bang-operators !getop and !setop.
Summary:
These allow you to get and set the operator of a dag node, without
affecting its list of arguments.

`!getop` is slightly fiddly because in many contexts you need its
return value to have a static type more specific than 'any record'. It
works to say `!cast<BaseClass>(!getop(...))`, but it's cumbersome, so
I made `!getop` take an optional type suffix itself, so that can be
written as the shorter `!getop<BaseClass>(...)`.

Reviewers: hfinkel, nhaehnle

Reviewed By: nhaehnle

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71191
2019-12-11 12:05:22 +00:00
Sanjay Patel a0c558ee4c [Docs] Improve SLP code snippet
New C code snippet is more viable for SLP vectorization in most architectures.

Patch by: @lsandov1 (Leonardo Sandoval)

Differential Revision: https://reviews.llvm.org/D70866
2019-12-10 09:32:40 -05:00
Nico Weber 761dd780ea Fix a few doc typos, to cycle bots. 2019-12-08 18:51:48 -05:00
Ulrich Weigand 9db13b5a7d [FPEnv] Constrained FCmp intrinsics
This adds support for constrained floating-point comparison intrinsics.

Specifically, we add:

      declare <ty2>
      @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
                                          metadata <condition code>,
                                          metadata <exception behavior>)
      declare <ty2>
      @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
                                           metadata <condition code>,
                                           metadata <exception behavior>)

The first variant implements an IEEE "quiet" comparison (i.e. we only
get an invalid FP exception if either argument is a SNaN), while the
second variant implements an IEEE "signaling" comparison (i.e. we get
an invalid FP exception if either argument is any NaN).

The condition code is implemented as a metadata string.  The same set
of predicates as for the fcmp instruction is supported (except for the
"true" and "false" predicates).

These new intrinsics are mapped by SelectionDAG codegen onto two new
ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again
representing quiet vs. signaling comparison operations.  Otherwise
those nodes look like SETCC nodes, with an additional chain argument
and result as usual for strict FP nodes.  The patch includes support
for the common legalization operations for those nodes.

The patch also includes full SystemZ back-end support for the new
ISD nodes, mapping them to all available SystemZ instruction to
fully implement strict semantics (scalar and vector).

Differential Revision: https://reviews.llvm.org/D69281
2019-12-07 11:28:39 +01:00
Don Hinton 6555995a6d [CommandLine] Add callbacks to Options
Summary:
Add a new cl::callback attribute to Option.

This attribute specifies a callback function that is called when
an option is seen, and can be used to set other options, as in
option A implies option B.  If the option is a `cl::list`, and
`cl::CommaSeparated` is also specified, the callback will fire
once for each value.  This could be used to validate combinations
or selectively set other options.

Reviewers: beanz, thomasfinch, MaskRay, thopre, serge-sans-paille

Reviewed By: beanz

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70620
2019-12-06 15:16:45 -08:00
Nico Weber 3b42eb3512 wrap an rst file to 80 cols, to cycle bots 2019-12-06 17:28:02 -05:00
Georgii Rymar cd2c409ceb [llvm-readobj] - Implement --dependent-libraries flag.
There is no way to dump SHT_LLVM_DEPENDENT_LIBRARIES sections
currently. This patch implements this.

The section is described here:
https://llvm.org/docs/Extensions.html#sht-llvm-dependent-libraries-section-dependent-libraries

Differential revision: https://reviews.llvm.org/D70665
2019-12-06 14:28:29 +03:00
Daniel Sanders 82f3c5d4a6 [lit] Document the undocumented pre-defined substitutions 2019-12-04 14:25:12 -08:00
Sanjay Patel ead0d77409 [LangRef] make per-element poison behavior explicit
As discussed in D70246 and PR43958:
https://bugs.llvm.org/show_bug.cgi?id=43958

The LangRef seems ambiguous about the behavior of poison with respect
to vectors.

We could go further with text and/or examples - suggestions welcome.

Also, see discussion on llvm-dev;
http://lists.llvm.org/pipermail/llvm-dev/2019-November/137243.html

Differential Revision: https://reviews.llvm.org/D70641
2019-12-04 15:32:19 -05:00
Vedant Kumar f208b70fbc Revert "[Coverage] Revise format to reduce binary size"
This reverts commit e18531595b.

On Windows, there is an error:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/54963/steps/stage%201%20check/logs/stdio

error: C:\b\slave\sanitizer-windows\build\stage1\projects\compiler-rt\test\profile\Profile-x86_64\Output\instrprof-merging.cpp.tmp.v1.o: Failed to load coverage: Malformed coverage data
2019-12-04 10:35:14 -08:00
Vedant Kumar e18531595b [Coverage] Revise format to reduce binary size
Revise the coverage mapping format to reduce binary size by:

1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

- With the current format, most coverage function records are discarded.
  E.g., more than 97% of the records in llc are *duplicate* placeholders
  for functions visible-but-not-used in TUs. Placeholders *are* used to
  show under-covered functions, but duplicate placeholders waste space.

- We reached general consensus about giving (1) a try at the 2017 code
  coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
  duplicates is simpler than alternatives like teaching build systems
  about a coverage-aware database/module/etc on the side.

- Revising the format is expensive due to the backwards compatibility
  requirement, so we might as well compress filenames while we're at it.
  This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Differential Revision: https://reviews.llvm.org/D69471
2019-12-04 10:10:55 -08:00
Kit Barton 06911aee7f Add discussion of git-format-patch to Phabricator.html
Summary: There is a discussion of git-format-patch in GettingStarted guide, but no mention of it in the Phabricator.html page.

Reviewers: jyknight, delcypher

Reviewed By: delcypher

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69323
2019-12-03 18:54:46 -05:00
Sourabh Singh Tomar f1e3988aa6 Recommit "[DWARF5]Addition of alignment atrribute in typedef DIE."
This revision is revised to update Go-bindings and Release Notes.

The original commit message follows.

This patch, adds support for DW_AT_alignment[DWARF5] attribute, to be emitted with typdef DIE.
When explicit alignment is specified.

Patch by Awanish Pandey <Awanish.Pandey@amd.com>

Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok,
deadalinx

Differential Revision: https://reviews.llvm.org/D70111
2019-12-03 09:51:43 +05:30
Seiya Nuta d72a8a4dd5
[llvm-objcopy][MachO] Implement --dump-section
Reviewers: alexshap, rupprecht, jhenderson

Reviewed By: alexshap, rupprecht, jhenderson

Subscribers: MaskRay, jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66408
2019-11-25 12:30:37 +09:00
Joel E. Denny f471eb8e99 [FileCheck] Make FILECHECK_OPTS useful for its test suite
Without this patch, `FILECHECK_OPTS` isn't propagated to FileCheck's
test suite so that `FILECHECK_OPTS` doesn't inadvertently affect test
results by affecting the output of FileCheck calls under test.  As a
result, `FILECHECK_OPTS` is useless for debugging FileCheck's test
suite.

In `llvm/test/FileCheck/lit.local.cfg`, this patch provides a new
subsitution, `%ProtectFileCheckOutput`, to address this problem for
both `FILECHECK_OPTS` and the deprecated
`FILECHECK_DUMP_INPUT_ON_FAILURE`.  The rest of the patch uses
`%ProtectFileCheckOutput` throughout the test suite

Fixes PR40284.

Reviewed By: probinson, thopre

Differential Revision: https://reviews.llvm.org/D65121
2019-11-21 18:01:12 -05:00
Dmitri Gribenko 161742a612 Make coding standards document more inclusive
Summary: Patch by Doug Gregor, Tres Popp, and Dmitri Gribenko.

Reviewers: chandlerc

Subscribers: hfinkel, bmcreusillet, arsenm, doug.gregor, mgrang, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69354
2019-11-21 13:37:17 +01:00
Josh Kunz 6760ca8c76 [docs] Tiny rewording in the portability FAQ entry
The entry reads better with these two words swapped.
2019-11-20 16:40:30 -08:00
Djordje Todorovic 979592a6f7 [DebugInfo] Remove the DIFlagArgumentNotModified debug info flag
Due to changes in D68206, we remove the DIFlagArgumentNotModified
and its usage.

Differential Revision: https://reviews.llvm.org/D68207
2019-11-20 13:18:40 +01:00
Sameer Sahasrabuddhe 52c5014da0 [AMDGPU] add support for hostcall buffer pointer as hidden kernel argument
Hostcall is a service that allows a kernel to submit requests to the
host using shared buffers, and block until a response is
received. This will eventually replace the shared buffer currently
used for printf, and repurposes the same hidden kernel argument. This
change introduces a new ValueKind in the HSA metadata to represent the
hostcall buffer.

Differential Revision: https://reviews.llvm.org/D70038
2019-11-20 15:53:55 +05:30
Fangrui Song 7d980319ab [FEnv] Fix AddingConstrainedIntrinsics.rst after llvmorg-10-init-10282-g0c50c0b0552 2019-11-19 23:09:13 -08:00
Serge Pavlov 0c50c0b055 [FEnv] File with properties of constrained intrinsics
Summary
In several places we need to enumerate all constrained intrinsics or IR
nodes that should be represented by them. It is easy to miss some of
the cases. To make working with these intrinsics more convenient and
robust, this change introduces file containing definitions of all
constrained intrinsics and some of their properties. This file can be
included to generate constrained intrinsics processing code.

Reviewers: kpn, andrew.w.kaylor, cameron.mcinally, uweigand

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69887
2019-11-20 13:30:07 +07:00
Tim Northover 75b5db3094 [docs] Remove dangling parenthesis from documentation
Patch by leiteg.
2019-11-19 20:47:21 +00:00
Brian Gesiak 5864cb38da [docs] Fix broken links in Kaleidoscope chapter 3
Several links in this document referred to `LangImpl4.html` or
`LangImpl7.html`. However, now these pages use two digits, so for these
links to function they need to be modified to `LangImpl04.html`, and so
on -- note the extra `0`.
2019-11-17 21:35:02 -05:00
kristina 5e782e74b3 [Docs] Remove stray :doc: directive. 2019-11-16 23:32:48 +00:00
kristina fb55d56fcf [Docs] Fix sphinx warning.
Fix sphinx warning over an ambigious reference.
2019-11-16 23:23:26 +00:00
kristina 63cf704081 [Docs] Try fixing the tutorial toctree
Unorphan the old tutorial and reference every page in the index
explicitly. This should hopefully make Sphinx generate correct
hyperlinks now.
2019-11-16 23:06:50 +00:00
kristina 2916489c54 [Docs] Fix relative links in tutorial.
Update relative links in Kaleidoscope tutorial.
2019-11-16 21:09:16 +00:00
Seiya Nuta bc11830c6a
[llvm-objcopy][MachO] Implement --remove-section
Reviewers: alexshap, rupprecht, jhenderson

Reviewed By: rupprecht, jhenderson

Subscribers: jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66282
2019-11-15 14:20:11 +09:00
Kevin P. Neal d2b6cc7ff6 Document more specifically the rounding for "llvm.round".
Differential Revision: https://reviews.llvm.org/D68810
2019-11-14 13:15:15 -05:00
Kevin P. Neal 56ae3e2692 Make the language more consistent since I'm about to commit a content
change next.
2019-11-14 13:10:59 -05:00
Reid Kleckner 1dfede3122 Move CodeGenFileType enum to Support/CodeGen.h
Avoids the need to include TargetMachine.h from various places just for
an enum. Various other enums live here, such as the optimization level,
TLS model, etc. Data suggests that this change probably doesn't matter,
but it seems nice to have anyway.
2019-11-13 16:39:34 -08:00
Fangrui Song 7af6025bd1 [llvm-objcopy][COFF] Implement --redefine-sym and --redefine-syms
The parsing error tests in ELF/redefine-symbols.test are not specific to ELF.
Move them to redefine-symbols.test.
Add COFF/redefine-symbols.test for COFF specific tests.

Also fix the documentation regarding --redefine-syms: the old and new
names are separated by whitespace, not an equals sign.

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D70036
2019-11-12 11:28:00 -08:00
Nuno Lopes a7244c56bd docs: fix warning in LangRef parsing 2019-11-11 10:45:42 +00:00
drichards-87 bcca123bd0 Docs: Updates Sphinx Quickstart template for new contributors 2019-11-10 09:27:32 -07:00
Simon Pilgrim 1dbcf8ba8a Try to fix sphinx "Could not lex literal_block as "llvm"" warning.
Code block isn't IR - so treat it as "none" instead.
2019-11-09 22:15:26 +00:00
Stephan T. Lavavej 3a7a22445e [www] More HTTPS and outdated link fixes.
Resolves D69981.
2019-11-08 14:41:27 -08:00
Tom Stellard 3ffbf9720f [cmake] Remove LLVM_{BUILD,LINK}_LLVM_DYLIB options on Windows
Summary: The options aren't supported so they can be removed.

Reviewers: beanz, smeenai, compnerd

Reviewed By: compnerd

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69877
2019-11-08 10:37:16 -08:00
Lang Hames baaa097360 [docs] Fix references to a renamed flag.
The -use-mcjit option was replaced with -jit-kind=mcjit a while back. This patch
updates the docs to reflect that.

Patch by Yu Jian. Thanks Jian!
2019-11-06 14:42:57 -08:00
Daniel Sanders e0dd8f36ce [globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference
It looks like I pushed an older version of this commit without the review
fixups earlier. This applies the review changes

Differential Revision: https://reviews.llvm.org/D69545
2019-11-05 15:44:26 -08:00
Daniel Sanders ad0dfb0a25 [globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference
Summary:
Rework the GMIR documentation to focus more on the end user than the
implementation and tie it in to the MIR document. There was also some
out-of-date information which has been removed.

The quality of the GenericOpcode reference is highly variable and drops
sharply as I worked through them all but we've got to start somewhere :-).
It would be great if others could expand on this too as there is an awful
lot to get through.

Also fix a typo in the definition of G_FLOG. Previously, the comments said
we had two base-2's (G_FLOG and G_FLOG2).

Reviewers: aemerson, volkan, rovka, arsenm

Reviewed By: rovka

Subscribers: wdng, arphaman, jfb, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69545
2019-11-05 15:16:43 -08:00
Daniel Sanders 7060840bc9 [globalisel][docs] Add a section about debugging with the block extractor
Summary: Depends on D69644

Reviewers: rovka, volkan, arsenm

Subscribers: wdng, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69645
2019-11-05 14:48:27 -08:00
Daniel Sanders 312932a334 [globalisel][docs] Add KnownBits Analysis documentation
Summary:
This is largely based off of the slides from the keynote

Depends on D69545

Reviewers: volkan, rovka, arsenm

Subscribers: wdng, arphaman, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69644
2019-11-05 09:55:33 -08:00
Fangrui Song 5ad0103d8a [llvm-objcopy][ELF] Implement --only-keep-debug
--only-keep-debug produces a debug file as the output that only
preserves contents of sections useful for debugging purposes (the
binutils implementation preserves SHT_NOTE and non-SHF_ALLOC sections),
by changing their section types to SHT_NOBITS and rewritting file
offsets.

See https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

The intended use case is:

```
llvm-objcopy --only-keep-debug a a.dbg
llvm-objcopy --strip-debug a b
llvm-objcopy --add-gnu-debuglink=a.dbg b
```

The current layout algorithm is incapable of deleting contents and
shrinking segments, so it is not suitable for implementing the
functionality.

This patch adds a new algorithm which assigns sh_offset to sections
first, then modifies p_offset/p_filesz of program headers. It bears a
resemblance to lld/ELF/Writer.cpp.

Reviewed By: jhenderson, jakehehrlich

Differential Revision: https://reviews.llvm.org/D67137
2019-11-05 08:56:15 -08:00
Nuno Lopes 2d21068d9f [Docs] Add LangRef documentation for freeze instruction
Summary:
 - Describe the new freeze instruction
 - Make it explicit that branch on undef/poison is UB

Reviewers: chandlerc, majnemer, efriedma, nikic, reames, jdoerfert, lebedev.ri, regehr

Subscribers: fhahn, bollu, lebedev.ri, delcypher, spatel, filcab, llvm-commits, aqjune

Differential Revision: https://reviews.llvm.org/D29121
2019-11-05 11:35:55 +00:00
Craig Topper b2b6a54f84 [X86] Add support for -mvzeroupper and -mno-vzeroupper to match gcc
-mvzeroupper will force the vzeroupper insertion pass to run on
CPUs that normally wouldn't. -mno-vzeroupper disables it on CPUs
where it normally runs.

To support this with the default feature handling in clang, we
need a vzeroupper feature flag in X86.td. Since this flag has
the opposite polarity of the fast-partial-ymm-or-zmm-write we
used to use to disable the pass, we now need to add this new
flag to every CPU except KNL/KNM and BTVER2 to keep identical
behavior.

Remove -fast-partial-ymm-or-zmm-write which is no longer used.

Differential Revision: https://reviews.llvm.org/D69786
2019-11-04 11:03:54 -08:00
Amy Huang ab76cfdd20 Recommit "[CodeView] Add option to disable inline line tables."
This reverts commit 004ed2b0d1.
Original commit hash 6d03890384

Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.

https://reviews.llvm.org/D67723
2019-11-04 09:15:26 -08:00
Stefan Stipanovic f35740d6e9 NoFree argument attribute.
Summary: Deducing nofree atrribute for function arguments.

Reviewers: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67886
2019-11-02 19:40:48 +01:00
Stefan Stipanovic 5fb1782918 Revert "NoFree argument attribute."
This reverts commit c12efa2ed0.
2019-11-02 17:31:02 +01:00
Stefan Stipanovic c12efa2ed0 NoFree argument attribute.
Summary: Deducing nofree atrribute for function arguments.

Reviewers: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67886
2019-11-02 16:35:38 +01:00
Roman Lebedev c4b757be02
Revert BCmp Loop Idiom recognition transform (PR43870)
As discussed in https://bugs.llvm.org/show_bug.cgi?id=43870,
this transform is missing a crucial legality check:
the old (non-countable) loop would early-return upon first mismatch,
but there is no such guarantee for bcmp/memcmp.

We'd need to ensure that [PtrA, PtrA+NBytes) and [PtrB, PtrB+NBytes)
are fully dereferenceable memory regions. But that would limit
the transform to constant loop trip counts and would further
cripple it because dereferenceability analysis is *very* partial.

Furthermore, even if all that is done, every single test
would need to be rewritten from scratch.

So let's just give up.
2019-11-02 12:48:03 +03:00
Evgenii Stepanov 27c9abae65 Add MemTagSanitizer documentation.
Summary: A lot of this is work in progress...

Reviewers: kcc, pcc

Subscribers: cryptoad, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69289
2019-11-01 10:46:04 -07:00
Adrian Prantl 9370a74158 Fix a few typos in SourceLevelDebugging.rst 2019-10-31 16:03:44 -07:00
James Henderson fb4a55010e [llvm-objcopy] Preserve .ARM.attributes section when stripping files
This works around a bug in Debian's patchset for glibc. The bug is
described in detail in the upstream debian bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=943798, but the short
version of it is that glibc on any Debian based distro don't load
libraries unless it has a .ARM.attribute section.

Reviewed by: jhenderson, rupprecht, MaskRay, jakehehrlich

Differential Revision: https://reviews.llvm.org/D69188

Patch by Tobias Hieta.
2019-10-31 11:57:19 +00:00
Seiya Nuta 9bbf2a1544
[llvm-objcopy][MachO] Implement --strip-all
Reviewers: alexshap, rupprecht, jdoerfert, jhenderson

Reviewed By: alexshap

Subscribers: jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66281
2019-10-31 14:26:46 +09:00
Amy Huang 004ed2b0d1 Revert "[CodeView] Add option to disable inline line tables."
because it breaks compiler-rt tests.

This reverts commit 6d03890384.
2019-10-30 17:31:12 -07:00
Amy Huang 6d03890384 [CodeView] Add option to disable inline line tables.
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.

See https://bugs.llvm.org/show_bug.cgi?id=42344

Reviewers: rnk

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D67723
2019-10-30 16:52:39 -07:00
Daniel Sanders 204a529cb0 [globalisel][docs] Add the tutorial to the Porting document
In lieu of converting that tutorial to text, add a link to the porting
tutorial from the 2017 Dev Meeting to the porting page
2019-10-30 14:53:39 -07:00
Alina Sbirlea bbb43df011 [ReleaseNotes] Add item on deleting the BasicBlockPass(Manager). 2019-10-30 14:26:46 -07:00
Daniel Sanders 2d098bea03 [globalisel][docs] Rework the Legalizer page slightly
The legalizer page was in a fairly good state. I've mostly just inlined
some information as a note and removed a reference to potential future
work that I think is very unlikely to be done (it's very hard to tell if
a pattern or set of patterns fully covers a node due to C++ predicates).

Also added a note that 'selectable' doesn't mean that InstructionSelect
must do it.
2019-10-30 13:42:19 -07:00
Evandro Menezes 215da6606c [clang][llvm] Obsolete Exynos M1 and M2 2019-10-30 15:02:59 -05:00
Daniel Sanders 91e2151d04 [globalisel][docs] Add a pass index 2019-10-30 12:06:22 -07:00
Daniel Sanders 443f99eae2 [globalisel][docs] Fix a label that was renamed 2019-10-30 11:47:29 -07:00
Alina Sbirlea 9f0ff0b263 [LegacyPassManager] Delete BasicBlockPass/Manager.
Summary:
Delete the BasicBlockPass and BasicBlockManager, all its dependencies and update documentation.
The BasicBlockManager was improperly tested and found to be potentially broken, and was deprecated as of rL373254.

In light of the switch to the new pass manager coming before the next release, this patch is a first cleanup of the LegacyPassManager.

Reviewers: chandlerc, echristo

Subscribers: mehdi_amini, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69121
2019-10-30 11:40:16 -07:00
Jay Foad 2da4b6e514 [IR] Allow fast math flags on calls with floating point array type.
Summary:
This extends the rules for when a call instruction is deemed to be an
FPMathOperator, which is based on the type of the call (i.e. the return
type of the function being called). Previously we only allowed
floating-point and vector-of-floating-point types. Now we also allow
arrays (nested to any depth) of floating-point and
vector-of-floating-point types.

This was motivated by llpc, the pipeline compiler for AMD GPUs
(https://github.com/GPUOpen-Drivers/llpc). llpc has many math library
functions that operate on vectors, typically represented as <4 x float>,
and some that operate on matrices, typically represented as
[4 x <4 x float>], and it's useful to be able to decorate calls to all
of them with fast math flags.

Reviewers: spatel, wristow, arsenm, hfinkel, aemerson, efriedma, cameron.mcinally, mcberg2017, jmolloy

Subscribers: wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69161
2019-10-30 14:00:33 +00:00
Vlad Tsyrklevich 8d24d72f7f Revert "[llvm-cov] Add option to whitelist filenames"
This reverts commit bfed824b57, the
included test fails on many bots including the sanitier bots, e.g. in
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/36140
2019-10-29 22:38:38 -07:00
Vedant Kumar bfed824b57 [llvm-cov] Add option to whitelist filenames
Add the `-whitelist-filename-regex` option to restrict coverage
reporting to file paths that match a whitelist regex.

Patch by Michael Daniels!

rdar://56720320
2019-10-29 18:26:33 -07:00
Adrian Prantl f919be3365 [DWARF5] Added support for deleted C++ special member functions.
This patch adds support for deleted C++ special member functions in
clang and llvm. Also added Defaulted member encodings for future
support for defaulted member functions.

Patch by Sourabh Singh Tomar!

Differential Revision: https://reviews.llvm.org/D69215
2019-10-29 13:44:06 -07:00
Daniel Sanders 3260fa2cb0 [globalisel][docs] Fix warning treated as error
I had hoped that I could have some
```
.. code-block:: MIR
```
sections for MIR examples which causes a warning about pygments not
supporting it but we have warnings treated as errors
2019-10-29 13:27:48 -07:00
Daniel Sanders 6f665fc786 [globalisel][docs] Rewrite the IRTranslator documentation
Summary:
I haven't refreshed the Function Calls section as I don't feel I have
sufficient knowledge of that area. It would be appreciated if someone could
review that section.

Note: I'm aware that pygments doesn't support 'mir' as used in one of the
code-block directives. This currently emits a warning and I decided to
keep it to enable finding them later. Maybe we can teach pygments to
support it.

Depends on D69456

Reviewers: volkan, aditya_nandakumar

Subscribers: rovka, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69457
2019-10-29 13:14:58 -07:00
Philip Reames e14f935ce2 [Docs] Reflect the slow migration from guard to widenable condition which is currently in progress. 2019-10-29 12:46:24 -07:00
Daniel Sanders 1765f31f5a [globalisel][docs] Rewrite the pipeline overview
Summary:
Rewrite the pipeline overview to be more focused on the structure and
flexibility as well as highlight the increased usefulness of
MachineVerifier and increased testability resulting from the smaller
incremental passes approach.

The diagrams are lifted from the slides for the LLVMDev 2019 talk
'Generating Optimized Code with GlobalISel' and adapted to be readable on
the white background used in the docs.

Reviewers: volkan

Subscribers: rovka, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69456
2019-10-29 11:21:24 -07:00
Francis Visoiu Mistrih c7557dd692 [Remarks] Remove references to ELF support
There is no ELF support at the moment.

Remove all the references to the `.remarks` section.
2019-10-28 12:50:46 -07:00
Francis Visoiu Mistrih 209d5a12c5 [Remarks] Emit the remarks section by default for certain formats
Emit a remarks section by default for the following formats:

* bitstream
* yaml-strtab

while still providing -remarks-section=<bool> to override the defaults.
2019-10-28 12:50:46 -07:00
Andrew Paverd d157a9bc8b Add Windows Control Flow Guard checks (/guard:cf).
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.

Reviewers: thakis, rnk, theraven, pcc

Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65761
2019-10-28 15:19:39 +00:00
Seiya Nuta 7f19dd1ebf
[llvm-objcopy][MachO] Implement --only-section
Reviewers: alexshap, rupprecht, jdoerfert, jhenderson

Reviewed By: alexshap, rupprecht, jhenderson

Subscribers: mgorny, jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65541
2019-10-28 16:00:20 +09:00
Daniel Sanders feab0334f5 [globalisel] Restructure the GlobalISel documentation
There's a couple minor deletions amongst this but 99% of it is just moving
the documentation around to prepare the way for more meaningful changes.
2019-10-25 15:51:09 -07:00
Daniel Sanders 27887bc1e7 [globalisel] Fix typo in 'Add LLVMDev 2019 talks and links for the 2017 talks' 2019-10-25 15:01:14 -07:00
Daniel Sanders 7913126a08 [globalisel] Add LLVMDev 2019 talks and links for the 2017 talks 2019-10-25 14:53:58 -07:00
Saleem Abdulrasool 2724d9e129 build: remove `LLVM_CXX_STD` extension point
This extension point is not needed. Provide the equivalent option
through `CMAKE_CXX_STANDARD` which mirrors the previous extension point. Rely on
CMake to provide the check for the compiler instead.
2019-10-25 11:51:47 -07:00
Simon Atanasyan 77b3c794e3 [docs] Update Mips feature table in CodeGenerator.rst
Patch by Miloš Stojanović

Differential Revision: https://reviews.llvm.org/D69381
2019-10-25 12:17:34 +03:00
Tom Stellard 27bfee01e9 docs: Update instructions for requesting commit access 2019-10-24 20:42:02 -07:00
Simon Atanasyan fd77e578e9 [docs] Add Mips as a supported architecture in GettingStarted.rst
Patch by Miloš Stojanović

Differential Revision: https://reviews.llvm.org/D69380
2019-10-24 15:56:30 +03:00
Simon Atanasyan c84cfaf9bc [docs] Update link to the MIPS 64-bit ELF object file specification
Patch by Miloš Stojanović

Differential Revision: https://reviews.llvm.org/D69377
2019-10-24 15:56:30 +03:00
Marek Kurdej 73cebfe412 [libFuzzer] docs: update note to include REDUCE event. 2019-10-24 12:04:12 +02:00
Meike Baumgärtner 23fdd513a3
Improve language in GettingStarted.rst
This patch was reviewed and approved by chandlerc.

"Getting Started with the LLVM System" is the first point of contact for many newcomers in the LLVM community.
 * Make the first two paragraphs more welcoming
 * Use more inclusive language
2019-10-23 12:32:57 -07:00
Chandler Carruth bf2975eca0 Remove a no longer accurate sentence from the coding standards.
(And test my commit access. We're working on larger changes here.)
2019-10-23 11:40:45 -07:00
Kit Barton efd7caaa4e Fix broken sphinx link in CMake.rst.
Reviewers: delcypher, beanz

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69325
2019-10-22 14:49:58 -07:00
Owen Reynolds fe263c4f0f [docs][llvm-ar] Update llvm-ar command guide
The llvm-ar command guide had not been updated in some time, it was
missing current functionality and contained information that was out
of date. This change:
- Updates the use of reStructuredText directives, as seen in other tools
  command guides.
- Updates the command synopsis.
- Updates the descriptions of the tool behaviour.
- Updates the options section.
- Adds details of MRI script functionality.
- Removes the sections "Standards" and "File Format"

Differential Revision: https://reviews.llvm.org/D68998

llvm-svn: 375412
2019-10-21 13:13:31 +00:00
Sylvestre Ledru 751e0bb6af Explicit in the doc the current list of projects (with easy copy and paste)
llvm-svn: 375339
2019-10-19 09:55:24 +00:00
Sylvestre Ledru 963e0d6755 Make it clear in the doc that 'all' in LLVM_ENABLE_PROJECTS does install ALL projects
llvm-svn: 375337
2019-10-19 09:27:14 +00:00
Jay Foad aa3806b47c Update docs for fast-math flags.
This adds fneg, phi and select to the list of operations that may use
fast-math flags.

llvm-svn: 375250
2019-10-18 16:07:09 +00:00
Jordan Rupprecht edeebad771 [llvm-objcopy] Add support for shell wildcards
Summary: GNU objcopy accepts the --wildcard flag to allow wildcard matching on symbol-related flags. (Note: it's implicitly true for section flags).

The basic syntax is to allow *, ?, \, and [] which work similarly to how they work in a shell. Additionally, starting a wildcard with ! causes that wildcard to prevent it from matching a flag.

Use an updated GlobPattern in libSupport to handle these patterns. It does not fully match the `fnmatch` used by GNU objcopy since named character classes (e.g. `[[:digit:]]`) are not supported, but this should support most existing use cases (mostly just `*` is what's used anyway).

Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap

Reviewed By: MaskRay

Subscribers: nickdesaulniers, emaste, arichardson, hiraditya, jakehehrlich, abrachet, seiya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66613

llvm-svn: 375169
2019-10-17 20:51:00 +00:00
Fangrui Song 5095a67a1a [docs][llvm-ar] Fix option:: O after r375106
docs-llvm-html fails => unknown option: O

There are lots of formatting issues in the file but they will be fixed by D68998.

llvm-svn: 375107
2019-10-17 11:56:26 +00:00
Fangrui Song a69cc92cb5 [llvm-ar] Implement the O modifier: display member offsets inside the archive
Since GNU ar 2.31, the 't' operation prints member offsets beside file
names if the 'O' modifier is specified. 'O' is ignored for thin
archives.

Reviewed By: gbreynoo, ruiu

Differential Revision: https://reviews.llvm.org/D69087

llvm-svn: 375106
2019-10-17 11:34:29 +00:00
Oliver Stannard 3b598b9c86 Reland: Dead Virtual Function Elimination
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.

Original commit message:

Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 375094
2019-10-17 09:58:57 +00:00
Alina Sbirlea c0e6a92e34 Update ReleaseNotes: expand the section on enabling MemorySSA
llvm-svn: 375045
2019-10-16 21:52:09 +00:00
Owen Reynolds 28a3b2aeb4 [llvm-ar] Make paths case insensitive when on windows
When on windows gnu-ar treats member names as case insensitive. This
commit implements the same behaviour.

Differential Revision: https://reviews.llvm.org/D68033

llvm-svn: 375002
2019-10-16 14:07:57 +00:00
DeForest Richards 75b991ebdf [Docs] Updates sidebar links and sets max-width property for div.body
Updates the sidebar links for Getting Started. Also sets max-width on div.body to 1000px.

llvm-svn: 374949
2019-10-15 21:27:20 +00:00
David Stenberg 1ae2d9a2bd [DebugInfo] Add a DW_OP_LLVM_entry_value operation
Summary:
Internally in LLVM's metadata we use DW_OP_entry_value operations with
the same semantics as DWARF; that is, its operand specifies the number
of bytes that the entry value covers.

At the time of emitting entry values we don't know the emitted size of
the DWARF expression that the entry value will cover. Currently the size
is hardcoded to 1 in DIExpression, and other values causes the verifier
to fail. As the size is 1, that effectively means that we can only have
valid entry values for registers that can be encoded in one byte, which
are the registers with DWARF numbers 0 to 31 (as they can be encoded as
single-byte DW_OP_reg0..DW_OP_reg31 rather than a multi-byte
DW_OP_regx). It is a bit confusing, but it seems like llvm-dwarfdump
will print an operation "correctly", even if the byte size is less than
that, which may make it seem that we emit correct DWARF for registers
with DWARF numbers > 31. If you instead use readelf for such cases, it
will interpret the number of specified bytes as a DWARF expression. This
seems like a limitation in llvm-dwarfdump.

As suggested in D66746, a way forward would be to add an internal
variant of DW_OP_entry_value, DW_OP_LLVM_entry_value, whose operand
instead specifies the number of operations that the entry value covers,
and we then translate that into the byte size at the time of emission.

In this patch that internal operation is added. This patch keeps the
limitation that a entry value can only be applied to simple register
locations, but it will fix the issue with the size operand being
incorrect for DWARF numbers > 31.

Reviewers: aprantl, vsk, djtodoro, NikolaPrica

Reviewed By: aprantl

Subscribers: jyknight, fedor.sergeev, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D67492

llvm-svn: 374881
2019-10-15 11:31:21 +00:00
Jorge Gorbe Moya b052331bd6 Revert "Dead Virtual Function Elimination"
This reverts commit 9f6a873268.

llvm-svn: 374844
2019-10-14 23:25:25 +00:00
DeForest Richards 22373c595e [Docs] Moves Control Flow Document to User Guides
Moves Control Flow document from Reference docs page to User guides page.

llvm-svn: 374733
2019-10-13 20:05:22 +00:00
Roman Lebedev 76cdcf25b8 [LoopIdiomRecognize] Recommit: BCmp loop idiom recognition
Summary:
This is a recommit, this originally landed in rL370454 but was
subsequently reverted in  rL370788 due to
https://bugs.llvm.org/show_bug.cgi?id=43206
The reduced testcase was added to bcmp-negative-tests.ll
as @pr43206_different_loops - we must ensure that the SCEV's
we got are both for the same loop we are currently investigating.

Original commit message:

@mclow.lists brought up this issue up in IRC.
It is a reasonably common problem to compare some two values for equality.
Those may be just some integers, strings or arrays of integers.

In C, there is `memcmp()`, `bcmp()` functions.
In C++, there exists `std::equal()` algorithm.
One can also write that function manually.

libstdc++'s `std::equal()` is specialized to directly call `memcmp()` for
various types, but not `std::byte` from C++2a. https://godbolt.org/z/mx2ejJ

libc++ does not do anything like that, it simply relies on simple C++'s
`operator==()`. https://godbolt.org/z/er0Zwf (GOOD!)

So likely, there exists a certain performance opportunities.
Let's compare performance of naive `std::equal()` (no `memcmp()`) with one that
is using `memcmp()` (in this case, compiled with modified compiler). {F8768213}

```
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iterator>
#include <limits>
#include <random>
#include <type_traits>
#include <utility>
#include <vector>

#include "benchmark/benchmark.h"

template <class T>
bool equal(T* a, T* a_end, T* b) noexcept {
  for (; a != a_end; ++a, ++b) {
    if (*a != *b) return false;
  }
  return true;
}

template <typename T>
std::vector<T> getVectorOfRandomNumbers(size_t count) {
  std::random_device rd;
  std::mt19937 gen(rd());
  std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(),
                                       std::numeric_limits<T>::max());
  std::vector<T> v;
  v.reserve(count);
  std::generate_n(std::back_inserter(v), count,
                  [&dis, &gen]() { return dis(gen); });
  assert(v.size() == count);
  return v;
}

struct Identical {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto Tmp = getVectorOfRandomNumbers<T>(count);
    return std::make_pair(Tmp, std::move(Tmp));
  }
};

struct InequalHalfway {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto V0 = getVectorOfRandomNumbers<T>(count);
    auto V1 = V0;
    V1[V1.size() / size_t(2)]++;  // just change the value.
    return std::make_pair(std::move(V0), std::move(V1));
  }
};

template <class T, class Gen>
void BM_bcmp(benchmark::State& state) {
  const size_t Length = state.range(0);

  const std::pair<std::vector<T>, std::vector<T>> Data =
      Gen::template Gen<T>(Length);
  const std::vector<T>& a = Data.first;
  const std::vector<T>& b = Data.second;
  assert(a.size() == Length && b.size() == a.size());

  benchmark::ClobberMemory();
  benchmark::DoNotOptimize(a);
  benchmark::DoNotOptimize(a.data());
  benchmark::DoNotOptimize(b);
  benchmark::DoNotOptimize(b.data());

  for (auto _ : state) {
    const bool is_equal = equal(a.data(), a.data() + a.size(), b.data());
    benchmark::DoNotOptimize(is_equal);
  }
  state.SetComplexityN(Length);
  state.counters["eltcnt"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariant);
  state.counters["eltcnt/sec"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariantRate);
  const size_t BytesRead = 2 * sizeof(T) * Length;
  state.counters["bytes_read/iteration"] =
      benchmark::Counter(BytesRead, benchmark::Counter::kDefaults,
                         benchmark::Counter::OneK::kIs1024);
  state.counters["bytes_read/sec"] = benchmark::Counter(
      BytesRead, benchmark::Counter::kIsIterationInvariantRate,
      benchmark::Counter::OneK::kIs1024);
}

template <typename T>
static void CustomArguments(benchmark::internal::Benchmark* b) {
  const size_t L2SizeBytes = []() {
    for (const benchmark::CPUInfo::CacheInfo& I :
         benchmark::CPUInfo::Get().caches) {
      if (I.level == 2) return I.size;
    }
    return 0;
  }();
  // What is the largest range we can check to always fit within given L2 cache?
  const size_t MaxLen = L2SizeBytes / /*total bufs*/ 2 /
                        /*maximal elt size*/ sizeof(T) / /*safety margin*/ 2;
  b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN);
}

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, Identical)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, Identical)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, Identical)
    ->Apply(CustomArguments<uint64_t>);

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, InequalHalfway)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, InequalHalfway)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, InequalHalfway)
    ->Apply(CustomArguments<uint64_t>);
```
{F8768210}
```
$ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks build-{old,new}/test/llvm-bcmp-bench
RUNNING: build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx
2019-04-25 21:17:11
Running build-old/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 0.65, 3.90, 4.14
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000           432131 ns       432101 ns         1613 bytes_read/iteration=1000k bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s
BM_bcmp<uint8_t, Identical>_BigO               0.86 N          0.86 N
BM_bcmp<uint8_t, Identical>_RMS                   8 %             8 %
<...>
BM_bcmp<uint16_t, Identical>/256000          161408 ns       161409 ns         4027 bytes_read/iteration=1000k bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s
BM_bcmp<uint16_t, Identical>_BigO              0.67 N          0.67 N
BM_bcmp<uint16_t, Identical>_RMS                 25 %            25 %
<...>
BM_bcmp<uint32_t, Identical>/128000           81497 ns        81488 ns         8415 bytes_read/iteration=1000k bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s
BM_bcmp<uint32_t, Identical>_BigO              0.71 N          0.71 N
BM_bcmp<uint32_t, Identical>_RMS                 42 %            42 %
<...>
BM_bcmp<uint64_t, Identical>/64000            50138 ns        50138 ns        10909 bytes_read/iteration=1000k bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s
BM_bcmp<uint64_t, Identical>_BigO              0.84 N          0.84 N
BM_bcmp<uint64_t, Identical>_RMS                 27 %            27 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000      192405 ns       192392 ns         3638 bytes_read/iteration=1000k bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.38 N          0.38 N
BM_bcmp<uint8_t, InequalHalfway>_RMS              3 %             3 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000     127858 ns       127860 ns         5477 bytes_read/iteration=1000k bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint16_t, InequalHalfway>_RMS             0 %             0 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000      49140 ns        49140 ns        14281 bytes_read/iteration=1000k bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.40 N          0.40 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            18 %            18 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000       32101 ns        32099 ns        21786 bytes_read/iteration=1000k bytes_read/sec=29.7101G/s eltcnt=1.3943G eltcnt/sec=1.99381G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint64_t, InequalHalfway>_RMS             1 %             1 %
RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0
2019-04-25 21:19:29
Running build-new/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 1.01, 2.85, 3.71
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000            18593 ns        18590 ns        37565 bytes_read/iteration=1000k bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s
BM_bcmp<uint8_t, Identical>_BigO               0.04 N          0.04 N
BM_bcmp<uint8_t, Identical>_RMS                  37 %            37 %
<...>
BM_bcmp<uint16_t, Identical>/256000           18950 ns        18948 ns        37223 bytes_read/iteration=1000k bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s
BM_bcmp<uint16_t, Identical>_BigO              0.08 N          0.08 N
BM_bcmp<uint16_t, Identical>_RMS                 34 %            34 %
<...>
BM_bcmp<uint32_t, Identical>/128000           18627 ns        18627 ns        37895 bytes_read/iteration=1000k bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s
BM_bcmp<uint32_t, Identical>_BigO              0.16 N          0.16 N
BM_bcmp<uint32_t, Identical>_RMS                 35 %            35 %
<...>
BM_bcmp<uint64_t, Identical>/64000            18855 ns        18855 ns        37458 bytes_read/iteration=1000k bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s
BM_bcmp<uint64_t, Identical>_BigO              0.32 N          0.32 N
BM_bcmp<uint64_t, Identical>_RMS                 33 %            33 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000        9570 ns         9569 ns        73500 bytes_read/iteration=1000k bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.02 N          0.02 N
BM_bcmp<uint8_t, InequalHalfway>_RMS             29 %            29 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000       9547 ns         9547 ns        74343 bytes_read/iteration=1000k bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.04 N          0.04 N
BM_bcmp<uint16_t, InequalHalfway>_RMS            29 %            29 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000       9396 ns         9394 ns        73521 bytes_read/iteration=1000k bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.08 N          0.08 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            30 %            30 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000        9499 ns         9498 ns        73802 bytes_read/iteration=1000k bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.16 N          0.16 N
BM_bcmp<uint64_t, InequalHalfway>_RMS            28 %            28 %
Comparing build-old/test/llvm-bcmp-bench to build-new/test/llvm-bcmp-bench
Benchmark                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000                      -0.9570         -0.9570        432131         18593        432101         18590
<...>
BM_bcmp<uint16_t, Identical>/256000                     -0.8826         -0.8826        161408         18950        161409         18948
<...>
BM_bcmp<uint32_t, Identical>/128000                     -0.7714         -0.7714         81497         18627         81488         18627
<...>
BM_bcmp<uint64_t, Identical>/64000                      -0.6239         -0.6239         50138         18855         50138         18855
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000                 -0.9503         -0.9503        192405          9570        192392          9569
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000                -0.9253         -0.9253        127858          9547        127860          9547
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000                -0.8088         -0.8088         49140          9396         49140          9394
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000                 -0.7041         -0.7041         32101          9499         32099          9498
```

What can we tell from the benchmark?
* Performance of naive equality check somewhat improves with element size,
  maxing out at eltcnt/sec=1.58603G/s for uint16_t, or bytes_read/sec=19.0209G/s
  for uint64_t. I think, that instability implies performance problems.
* Performance of `memcmp()`-aware benchmark always maxes out at around
  bytes_read/sec=51.2991G/s for every type. That is 2.6x the throughput of the
  naive variant!
* eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at
  eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so 24x) and
  linearly decreases with element size.
  For uint64_t, it's ~4x+ the elements/second.
* The call obvious is more pricey than the loop, with small element count.
  As it can be seen from the full output {F8768210}, the `memcmp()` is almost
  universally worse, independent of the element size (and thus buffer size) when
  element count is less than 8.

So all in all, bcmp idiom does indeed pose untapped performance headroom.
This diff does implement said idiom recognition. I think a reasonable test
coverage is present, but do tell if there is anything obvious missing.

Now, quality. This does succeed to build and pass the test-suite, at least
without any non-bundled elements. {F8768216} {F8768217}
This transform fires 91 times:
```
$ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp result-new.json
Tests: 1149
Metric: loop-idiom.NumBCmp

Program                                         result-new

MultiSourc...Benchmarks/7zip/7zip-benchmark    79.00
MultiSource/Applications/d/make_dparser         3.00
SingleSource/UnitTests/vla                      2.00
MultiSource/Applications/Burg/burg              1.00
MultiSourc.../Applications/JM/lencod/lencod     1.00
MultiSource/Applications/lemon/lemon            1.00
MultiSource/Benchmarks/Bullet/bullet            1.00
MultiSourc...e/Benchmarks/MallocBench/gs/gs     1.00
MultiSourc...gs-C/TimberWolfMC/timberwolfmc     1.00
MultiSourc...Prolangs-C/simulator/simulator     1.00
```
The size changes are:
I'm not sure what's going on with SingleSource/UnitTests/vla.test yet, did not look.
```
$ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json --filter-hash
Tests: 1149
Same hash: 907 (filtered out)
Remaining: 242
Metric: size..text

Program                                        result-old result-new diff
test-suite...ingleSource/UnitTests/vla.test   753.00     833.00     10.6%
test-suite...marks/7zip/7zip-benchmark.test   1001697.00 966657.00  -3.5%
test-suite...ngs-C/simulator/simulator.test   32369.00   32321.00   -0.1%
test-suite...plications/d/make_dparser.test   89585.00   89505.00   -0.1%
test-suite...ce/Applications/Burg/burg.test   40817.00   40785.00   -0.1%
test-suite.../Applications/lemon/lemon.test   47281.00   47249.00   -0.1%
test-suite...TimberWolfMC/timberwolfmc.test   250065.00  250113.00   0.0%
test-suite...chmarks/MallocBench/gs/gs.test   149889.00  149873.00  -0.0%
test-suite...ications/JM/lencod/lencod.test   769585.00  769569.00  -0.0%
test-suite.../Benchmarks/Bullet/bullet.test   770049.00  770049.00   0.0%
test-suite...HMARK_ANISTROPIC_DIFFUSION/128    NaN        NaN        nan%
test-suite...HMARK_ANISTROPIC_DIFFUSION/256    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/64    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/32    NaN        NaN        nan%
test-suite...ENCHMARK_BILATERAL_FILTER/64/4    NaN        NaN        nan%
Geomean difference                                                   nan%
         result-old    result-new       diff
count  1.000000e+01  10.00000      10.000000
mean   3.152090e+05  311695.40000  0.006749
std    3.790398e+05  372091.42232  0.036605
min    7.530000e+02  833.00000    -0.034981
25%    4.243300e+04  42401.00000  -0.000866
50%    1.197370e+05  119689.00000 -0.000392
75%    6.397050e+05  639705.00000 -0.000005
max    1.001697e+06  966657.00000  0.106242
```

I don't have timings though.

And now to the code. The basic idea is to completely replace the whole loop.
If we can't fully kill it, don't transform.
I have left one or two comments in the code, so hopefully it can be understood.

Also, there is a few TODO's that i have left for follow-ups:
* widening of `memcmp()`/`bcmp()`
* step smaller than the comparison size
* Metadata propagation
* more than two blocks as long as there is still a single backedge?
* ???

Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet

Reviewed By: courbet

Subscribers: miyuki, hiraditya, xbolva00, nikic, jfb, gchatelet, courbet, llvm-commits, mclow.lists

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61144

llvm-svn: 374662
2019-10-12 15:35:32 +00:00
Oliver Stannard 9f6a873268 Dead Virtual Function Elimination
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 374539
2019-10-11 11:59:55 +00:00
Kai Nacke 5b5b2fd2b8 [FileCheck] Implement --ignore-case option.
The FileCheck utility is enhanced to support a `--ignore-case`
option. This is useful in cases where the output of Unix tools
differs in case (e.g. case not specified by Posix).

Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D68146

llvm-svn: 374538
2019-10-11 11:59:14 +00:00
Tom Stellard 97578b14fc docs/DeveloperPolicy: Add instructions for requesting GitHub commit access
Subscribers: mehdi_amini, jtony, xbolva00, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66840

llvm-svn: 374474
2019-10-10 23:36:06 +00:00
Julian Lettner b858895c85 [lit] Bring back `--threads` option alias
Bring back `--threads` option which was lost in the move of the
command line argument parsing code to cl_arguments.py.  Update docs
since `--workers` is preferred.

llvm-svn: 374432
2019-10-10 19:43:57 +00:00
Jinsong Ji 26cd5c9370 [PowerPC][docs] Update IBM official docs in Compiler Writers Info page
Summary:
Just realized that most of the links in this page are deprecated.
So update some important reference here:
* adding PowerISA 3.0B/2.7B
* adding P8/P9 User Manual
* ELFv2 ABI and errata

Move deprecated ones into "Other documents..".

Reviewers: #powerpc, hfinkel, nemanjai

Reviewed By: hfinkel

Subscribers: shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68817

llvm-svn: 374428
2019-10-10 19:25:30 +00:00
Roman Lebedev a5e65c1cf7 [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219)
Summary:
As disscused in https://bugs.llvm.org/show_bug.cgi?id=43219,
i believe it may be somewhat useful to show //some// aggregates
over all the sea of statistics provided.

Example:
```
Average Wait times (based on the timeline view):
[0]: Executions
[1]: Average time spent waiting in a scheduler's queue
[2]: Average time spent waiting in a scheduler's queue while ready
[3]: Average time elapsed from WB until retire stage

      [0]    [1]    [2]    [3]
0.     3     1.0    1.0    4.7       vmulps     %xmm0, %xmm1, %xmm2
1.     3     2.7    0.0    2.3       vhaddps    %xmm2, %xmm2, %xmm3
2.     3     6.0    0.0    0.0       vhaddps    %xmm3, %xmm3, %xmm4
       3     3.2    0.3    2.3       <total>
```
I.e. we average the averages.

Reviewers: andreadb, mattd, RKSimon

Reviewed By: andreadb

Subscribers: gbedwell, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68714

llvm-svn: 374361
2019-10-10 14:46:21 +00:00
Dmitri Gribenko d3aed7fc79 Revert "[FileCheck] Implement --ignore-case option."
This reverts commit r374339. It broke tests:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19066

llvm-svn: 374359
2019-10-10 14:27:14 +00:00
Kai Nacke dfd2b6f07f [FileCheck] Implement --ignore-case option.
The FileCheck utility is enhanced to support a `--ignore-case`
option. This is useful in cases where the output of Unix tools
differs in case (e.g. case not specified by Posix).

Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D68146

llvm-svn: 374339
2019-10-10 13:15:41 +00:00
Roman Lebedev 536b0ee40a [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
Summary:
Quote from http://eel.is/c++draft/expr.add#4:
```
4     When an expression J that has integral type is added to or subtracted
      from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0,
      the result is a null pointer value.
(4.2) Otherwise, if P points to an array element i of an array object x with n
      elements ([dcl.array]), the expressions P + J and J + P
      (where J has the value j) point to the (possibly-hypothetical) array
      element i+j of x if 0≤i+j≤n and the expression P - J points to the
      (possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
(4.3) Otherwise, the behavior is undefined.
```

Therefore, as per the standard, applying non-zero offset to `nullptr`
(or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value
from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined,
i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.)

To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer
is undefined, although Clang front-end pessimizes the code by not lowering
that info, so this UB is "harmless".

Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`)
LLVM middle-end uses those guarantees for transformations.
If the source contains such UB's, said code may now be miscompiled.
Such miscompilations were already observed:
* https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html
* https://github.com/google/filament/pull/1566

Surprisingly, UBSan does not catch those issues
... until now. This diff teaches UBSan about these UB's.

`getelementpointer inbounds` is a pretty frequent instruction,
so this does have a measurable impact on performance;
I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%),
and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark:
(all measurements done with LLVM ToT, the sanitizer never fired.)
* no sanitization vs. existing check: average `+21.62%` slowdown
* existing check vs. check after this patch: average `22.04%` slowdown
* no sanitization vs. this patch: average `48.42%` slowdown

Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers

Reviewed By: rsmith

Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D67122

llvm-svn: 374293
2019-10-10 09:25:02 +00:00
DeForest Richards edbb895b18 [Docs] Adds section for Additional Topics on Reference page
Adds a new section for Additional Topics on the Reference documentation page. Also moves Support Library topic to User Guides page.

llvm-svn: 374230
2019-10-09 21:09:09 +00:00
DeForest Richards 02d264a547 [Docs] Adds Documentation links to sidebar
Adds links to Getting Started/Tutorials, User Guides, and Reference documentation pages to sidebar. Also adds a new section for LLVM IR on the Reference documentation page.

llvm-svn: 374214
2019-10-09 20:26:13 +00:00
DeForest Richards b7538c5140 [Docs] Fixes broken sphinx build - undefined label
Removes label ref pointing to non-existent subsystem docs page.

llvm-svn: 374128
2019-10-08 22:45:20 +00:00
Clement Courbet 2cd0f28959 [llvm-exegesis] Add options to SnippetGenerator.
Summary:
This adds a `-max-configs-per-opcode` option to limit the number of
configs per opcode.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68642

llvm-svn: 374054
2019-10-08 14:30:24 +00:00
Kevin P. Neal c91f1992a6 Nope, I'm wrong. It looks like someone else removed these on purpose and
it just happened to break the bot right when I did my push. So I'm undoing
this mornings incorrect push. I've also kicked off an email to hopefully
get the bot fixed the correct way.

llvm-svn: 374049
2019-10-08 14:10:26 +00:00
Kevin P. Neal 0929e5eca2 Restore documentation that 'svn update' unexpectedly yanked out from under me.
llvm-svn: 374045
2019-10-08 13:38:42 +00:00
Joerg Sonnenberger 2b9f0b064b Fix the spelling of my name.
llvm-svn: 373980
2019-10-07 22:55:42 +00:00
Reid Kleckner f9b67b810e [X86] Add new calling convention that guarantees tail call optimization
When the target option GuaranteedTailCallOpt is specified, calls with
the fastcc calling convention will be transformed into tail calls if
they are in tail position. This diff adds a new calling convention,
tailcc, currently supported only on X86, which behaves the same way as
fastcc, except that the GuaranteedTailCallOpt flag does not need to
enabled in order to enable tail call optimization.

Patch by Dwight Guth <dwight.guth@runtimeverification.com>!

Reviewed By: lebedev.ri, paquette, rnk

Differential Revision: https://reviews.llvm.org/D67855

llvm-svn: 373976
2019-10-07 22:28:58 +00:00
Kevin P. Neal 9f4de84eb0 Fix another sphinx warning.
Differential Revision:	https://reviews.llvm.org/D64746

llvm-svn: 373909
2019-10-07 14:14:46 +00:00
Kevin P. Neal a6fc72fba9 Fix sphinx warnings.
Differential Revision:	https://reviews.llvm.org/D64746

llvm-svn: 373902
2019-10-07 13:39:56 +00:00
Kevin P. Neal 1c3d19c82d [FPEnv] Add constrained intrinsics for lrint and lround
Earlier in the year intrinsics for lrint, llrint, lround and llround were
added to llvm. The constrained versions are now implemented here.

Reviewed by:	andrew.w.kaylor, craig.topper, cameron.mcinally
Approved by:	craig.topper
Differential Revision:	https://reviews.llvm.org/D64746

llvm-svn: 373900
2019-10-07 13:20:00 +00:00
Djordje Todorovic 0c56f425a0 [llvm-locstats] Fix a typo in the documentation; NFC
llvm-svn: 373880
2019-10-07 07:31:49 +00:00
DeForest Richards 38d16c15b7 [Docs] Removes Subsystem Documentation page
Removes Subsystem Documentation page. Also moves existing topics on Subsystem Documentation page to User Guides and Reference pages.

llvm-svn: 373872
2019-10-06 22:49:22 +00:00
DeForest Richards de0e3aac2a [Docs] Removes Programming Documentation page
Removes Programming Documentation page. Also moves existing topics on Programming Documentation page to User Guides and Reference pages. 

llvm-svn: 373856
2019-10-06 16:10:11 +00:00
DeForest Richards 6d19651410 [Docs] Adds new Getting Started/Tutorials page
Adds a new page for Getting Started/Tutorials topics. Also updates existing topic categories on the User Guides and Reference pages.

llvm-svn: 373854
2019-10-06 15:36:37 +00:00
Sylvestre Ledru 68eef2bcd0 Update the FAQ: remove stuff related to the previous license +
update info about the portability of LLVM.

llvm-svn: 373576
2019-10-03 09:43:54 +00:00
Fangrui Song 671fb34358 [llvm-objcopy] Add --set-section-alignment
Fixes PR43181. This option was recently added to GNU objcopy (binutils
PR24942).

`llvm-objcopy -I binary -O elf64-x86-64 --set-section-alignment .data=8` can set the alignment of .data.

Reviewed By: grimar, jhenderson, rupprecht

Differential Revision: https://reviews.llvm.org/D67656

llvm-svn: 373461
2019-10-02 12:41:25 +00:00
Djordje Todorovic 2ef18fb41a Reland "[utils] Implement the llvm-locstats tool"
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.

Differential Revision: https://reviews.llvm.org/D66526

The cause of the test failure was resolved.

llvm-svn: 373427
2019-10-02 07:00:01 +00:00
Vedant Kumar a1e7efaaa8 [ReleaseProcess] Document requirement to set MACOSX_DEPLOYMENT_TARGET
llvm-svn: 373356
2019-10-01 17:10:45 +00:00
Djordje Todorovic 372048e908 Revert "Reland "[utils] Implement the llvm-locstats tool""
This reverts commit rL373317 due to test failure on the
clang-s390x-linux build bot.

llvm-svn: 373336
2019-10-01 13:21:15 +00:00
Djordje Todorovic 6d7f7e6792 Reland "[utils] Implement the llvm-locstats tool"
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.

Differential Revision: https://reviews.llvm.org/D66526

llvm-svn: 373317
2019-10-01 09:59:15 +00:00
Fangrui Song 2d92c8844e [llvm-readobj/llvm-readelf] Delete --arm-attributes (alias for --arch-specific)
D68110 added --arch-specific (supported by GNU readelf) and made
--arm-attributes an alias for it. The tests were later migrated to use
--arch-specific.

Note, llvm-readelf --arch-specific currently just uses llvm-readobj
style output for ARM attributes. The readelf-style output is not
implemented.

Reviewed By: compnerd, kongyi, rupprecht

Differential Revision: https://reviews.llvm.org/D68196

llvm-svn: 373291
2019-10-01 01:31:15 +00:00
Pablo Barrio ffac4e8603 Fix doc for t inline asm constraints for ARM/Thumb
Summary: The constraint goes up to regs d15 and q7, not d16 and q8.

Subscribers: kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68090

llvm-svn: 373228
2019-09-30 16:55:10 +00:00
Kevin P. Neal 71c5b38acd Fix breakage of sphinx builders. Sorry for leaving this broken over the
weekend!

llvm-svn: 373215
2019-09-30 14:51:59 +00:00
Djordje Todorovic 8180f3b1cc Revert "Reland "[utils] Implement the llvm-locstats tool""
This reverts commit rL373183.

llvm-svn: 373200
2019-09-30 11:19:11 +00:00
Djordje Todorovic 0f30960619 Reland "[utils] Implement the llvm-locstats tool"
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.

Differential Revision: https://reviews.llvm.org/D66526

llvm-svn: 373183
2019-09-30 07:35:17 +00:00
DeForest Richards eb78dea4cc [Docs] Moves article links to new pages
Moves existing article links on the Programming, Subsystem, and Reference documentation pages to new locations. Also moves Github Repository and Publications links to the sidebar.

llvm-svn: 373169
2019-09-29 15:31:52 +00:00
DeForest Richards ac5969933a [Docs] Adds sections for Command Line and LibFuzzer articles
Adds sections for Command Line and Libfuzzer articles on Programming Documentation page.

llvm-svn: 373158
2019-09-29 02:16:38 +00:00