Commit Graph

53 Commits

Author SHA1 Message Date
Diego Novillo c04270d2e4 Fix SamplePGO segfault when debug info is missing.
When emitting a remark for a conditional branch annotation, the remark
uses the line location information of the conditional branch in the
message.  In some cases, that information is unavailable and the
optimization would segfaul. I'm still not sure whether this is a bug or
WAI, but the optimizer should not die because of this.

llvm-svn: 251420
2015-10-27 17:37:00 +00:00
Diego Novillo 7963ea1996 SamplePGO - Add optimization reports.
This adds a couple of optimization remarks to the SamplePGO
transformation. When it decides to inline a hot function (to mimic the
inline stack and repeat useful inline decisions in the original build).

It will also report branch destinations. For instance, given the code
fragment:

     6      if (i < 1000)
     7        sum -= i;
     8      else
     9        sum += -i * rand();

If the 'else' branch is taken most of the time, building this code with
-Rpass=sample-profile will produce:

a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile]
      sum += -i * rand();
             ^

llvm-svn: 251330
2015-10-26 18:52:53 +00:00
Diego Novillo 5eb5ad09d5 Cleanup test case debug info. NFC.
llvm-svn: 251320
2015-10-26 18:14:44 +00:00
Dehao Chen 100424124b Tolerate negative offset when matching sample profile.
In some cases (as illustrated in the unittest), lineno can be less than the heade_lineno because the function body are included from some other files. In this case, offset will be negative. This patch makes clang still able to match the profile to IR in this situation.

http://reviews.llvm.org/D13914

llvm-svn: 250873
2015-10-21 01:22:27 +00:00
Diego Novillo b93483dbce Sample profiles - Re-arrange binary format to emit head samples only on top functions.
The number of samples collected at the head of a function only make
sense for top-level functions (i.e., those actually called as opposed to
being inlined inside another).

Head samples essentially count the time spent inside the function's
prologue.  This clearly doesn't make sense for inlined functions, so we
were always emitting 0 in those.

llvm-svn: 250539
2015-10-16 18:54:35 +00:00
Diego Novillo 760c5a8f45 Sample profiles - Add a name table to the binary encoding.
Binary encoded profiles used to encode all function names inline at
every reference.  This is clearly suboptimal in terms of space.  This
patch fixes this by adding a name table to the header of the file.

llvm-svn: 250241
2015-10-13 22:48:46 +00:00
Diego Novillo a7f1e8ef83 Add inline stack streaming to binary sample profiles.
With this patch we can now read and write inline stacks in sample
profiles to the binary encoded profiles.

In a subsequent patch, I will add a string table to the binary encoding.
Right now function names are emitted as strings every time we find them.
This is too bloated and will produce large files in applications with
lots of inlining.

llvm-svn: 249861
2015-10-09 17:54:24 +00:00
Dehao Chen 7c41dd6498 Update sample profile propagation algorithm.
http://reviews.llvm.org/D13218

llvm-svn: 248968
2015-10-01 00:26:56 +00:00
Dehao Chen aae9e1f2bd Add unittest for new samle profile format.
http://reviews.llvm.org/D13145

llvm-svn: 248870
2015-09-30 01:05:37 +00:00
Dehao Chen 6722688eaa http://reviews.llvm.org/D13145
Support hierarachical sample profile format.

llvm-svn: 248865
2015-09-30 00:42:46 +00:00
Dehao Chen 028e122ca9 Revert r248810 which breaks tests.
llvm-svn: 248814
2015-09-29 18:18:49 +00:00
Dehao Chen 410a25aa7a http://reviews.llvm.org/D13231
Change lookup functions to const functions.

llvm-svn: 248810
2015-09-29 17:59:58 +00:00
Cong Hou 15ea016346 Use fixed-point representation for BranchProbability.
BranchProbability now is represented by its numerator and denominator in uint32_t type. This patch changes this representation into a fixed point that is represented by the numerator in uint32_t type and a constant denominator 1<<31. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons of this change:

Pros:

1. It uses only a half space of the current one.
2. Some operations are much faster like plus, subtraction, comparison, and scaling by an integer.

Cons:

1. Constructing a probability using arbitrary numerator and denominator needs additional calculations.
2. It is a little less precise than before as we use a fixed denominator. For example, 1 - 1/3 may not be exactly identical to 1 / 3 (this will lead to many BranchProbability unit test failures). This should not matter when we only use it for branch probability. If we use it like a rational value for some precise calculations we may need another construct like ValueRatio.

One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock. We also want clients to use probability instead of weight when adding successors to a MBB. The current BranchProbability has more space which may be a concern.

Differential revision: http://reviews.llvm.org/D12603

llvm-svn: 248633
2015-09-25 23:09:59 +00:00
Diego Novillo d2e2137b4c Temporarily fix gcov failures in big-endian hosts.
This test uses a gcov file generated in a little-endian host. The gcov
reader does not allow different endianness, so the test fails on big
endian hosts.

XFAILing for now.

llvm-svn: 247920
2015-09-17 19:05:48 +00:00
Diego Novillo 3376a78781 GCC AutoFDO profile reader - Initial support.
This adds enough machinery to support reading simple GCC AutoFDO
profiles. It now supports reading flat profiles (no function calls).
Subsequent patches will add support for:

- Inlined calls (in particular, the inline call stack is not traversed
  to accumulate samples).

- Working sets and modules. These are used mostly for GCC's LIPO
  optimizations, so they're not needed in LLVM atm. I'm not sure that
  we will ever need them. For now, I've if0'd around the calls.

The patch also adds support in GCOV.h for gcov version V704 (generated
by GCC's profile conversion tool).

llvm-svn: 247874
2015-09-17 00:17:24 +00:00
Duncan P. N. Exon Smith 814b8e91c7 DI: Require subprogram definitions to be distinct
As a follow-up to r246098, require `DISubprogram` definitions
(`isDefinition: true`) to be 'distinct'.  Specifically, add an assembler
check, a verifier check, and bitcode upgrading logic to combat testcase
bitrot after the `DIBuilder` change.

While working on the testcases, I realized that
test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore.  Its
purpose was to check for a corner case in PR22792 where two subprogram
definitions match exactly and share the same metadata node.  The new
verifier check, requiring that subprogram definitions are 'distinct',
precludes that possibility.

I updated almost all the IR with the following script:

    git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/'

Likely some variant of would work for out-of-tree testcases.

llvm-svn: 246327
2015-08-28 20:26:49 +00:00
Duncan P. N. Exon Smith 55ca964e94 DI: Disallow uniquable DICompileUnits
Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s.
The backend is liable to start relying on that (if it hasn't already),
so make uniquable `DICompileUnit`s illegal and automatically upgrade old
bitcode.  This is a nice cleanup, since we can remove an unnecessary
`DenseSet` (and the associated uniquing info) from `LLVMContextImpl`.

Almost all the testcases were updated with this script:

    git grep -e '= !DICompileUnit' -l -- test |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,'

I imagine something similar should work for out-of-tree testcases.

llvm-svn: 243885
2015-08-03 17:26:41 +00:00
Duncan P. N. Exon Smith ed013cd221 DI: Remove DW_TAG_arg_variable and DW_TAG_auto_variable
Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags,
using `DW_TAG_variable` in their place Stop exposing the `tag:` field at
all in the assembly format for `DILocalVariable`.

Most of the testcase updates were generated by the following sed script:

    find test/ -name "*.ll" -o -name "*.mir" |
    xargs grep -l 'DILocalVariable' |
    xargs sed -i '' \
      -e 's/tag: DW_TAG_arg_variable, //' \
      -e 's/tag: DW_TAG_auto_variable, //'

There were only a handful of tests in `test/Assembly` that I needed to
update by hand.

(Note: a follow-up could change `DILocalVariable::DILocalVariable()` to
set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable`
(as appropriate), instead of having that logic magically in the backend
in `DbgVariable`.  I've added a FIXME to that effect.)

llvm-svn: 243774
2015-07-31 18:58:39 +00:00
Diego Novillo ffc84e378a Add function entry counts from sample profiles.
This patch uses the new function profile metadata "function_entry_count"
to annotate entry counts from sample profiles.

In a sampling profile, the total samples collected at the function entry
are an approximation for the number of times that function was invoked.

llvm-svn: 237265
2015-05-13 17:04:29 +00:00
Diego Novillo 14f94de1ee Allow 0-weight branches in BranchProbabilityInfo.
Summary:
When computing branch weights in BPI, we used to disallow branches with
weight 0. This is a minor nuisance, because a branch with weight 0 is
different to "don't have information". In the context of
instrumentation, it may mean "never executed", in the context of
sampling, it means "never or seldom executed".

In allowing 0 weight branches, I ran into issues with the switch
expansion code in selection DAG. It is currently hardwired to not handle
branches with weight 0. To maintain the current behaviour, I changed it
to use 1 when it finds 0, but perhaps the algorithm needs changes to
tolerate branches with weight zero.

Reviewers: hansw

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9533

llvm-svn: 236617
2015-05-06 17:55:11 +00:00
Duncan P. N. Exon Smith a9308c49ef IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

llvm-svn: 236120
2015-04-29 16:38:44 +00:00
David Blaikie 23af64846f [opaque pointer type] Add textual IR support for explicit type parameter to the call instruction
See r230786 and r230794 for similar changes to gep and load
respectively.

Call is a bit different because it often doesn't have a single explicit
type - usually the type is deduced from the arguments, and just the
return type is explicit. In those cases there's no need to change the
IR.

When that's not the case, the IR usually contains the pointer type of
the first operand - but since typed pointers are going away, that
representation is insufficient so I'm just stripping the "pointerness"
of the explicit type away.

This does make the IR a bit weird - it /sort of/ reads like the type of
the first operand: "call void () %x(" but %x is actually of type "void
()*" and will eventually be just of type "ptr". But this seems not too
bad and I don't think it would benefit from repeating the type
("void (), void () * %x(" and then eventually "void (), ptr %x(") as has
been done with gep and load.

This also has a side benefit: since the explicit type is no longer a
pointer, there's no ambiguity between an explicit type and a function
that returns a function pointer. Previously this case needed an explicit
type (eg: a function returning a void() function was written as
"call void () () * @x(" rather than "call void () * @x(" because of the
ambiguity between a function returning a pointer to a void() function
and a function returning void).

No ambiguity means even function pointer return types can just be
written alone, without writing the whole function's type.

This leaves /only/ the varargs case where the explicit type is required.

Given the special type syntax in call instructions, the regex-fu used
for migration was a bit more involved in its own unique way (as every
one of these is) so here it is. Use it in conjunction with the apply.sh
script and associated find/xargs commands I've provided in rr230786 to
migrate your out of tree tests. Do let me know if any of this doesn't
cover your cases & we can iterate on a more general script/regexes to
help others with out of tree tests.

About 9 test cases couldn't be automatically migrated - half of those
were functions returning function pointers, where I just had to manually
delete the function argument types now that we didn't need an explicit
function type there. The other half were typedefs of function types used
in calls - just had to manually drop the * from those.

import fileinput
import sys
import re

pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
func_end = re.compile("(?:void.*|\)\s*)\*$")

def conv(match, line):
  if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
    return line
  return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]

for line in sys.stdin:
  sys.stdout.write(conv(re.search(pat, line), line))

llvm-svn: 235145
2015-04-16 23:24:18 +00:00
Duncan P. N. Exon Smith 988a7f8b79 DebugInfo: Fix bad debug info for compile units and types
Fix debug info in these tests, which started failing with a WIP patch to
verify compile units and types.  The problems look like they were all
caused by bitrot.  They fell into these categories:

  - Using `!{i32 0}` instead of `!{}`.
  - Using `!{null}` instead of `!{}`.
  - Using `!MDExpression()` instead of `!{}`.
  - Using `!8` instead of `!{!8}`.
  - `file:` references that pointed at `MDCompileUnit`s instead of the
    same `MDFile` as the compile unit.
  - `file:` references that were numerically off-by-one or (off-by-ten).

llvm-svn: 233415
2015-03-27 20:46:33 +00:00
Duncan P. N. Exon Smith b786572d7c DebugInfo: Fix testcases that fail -verify-debug-info=true
As part of PR22777, fix testcases that fail the debug info verifier.
The changes fall into the following categories:

  - Empty `filename:` fields in `MDFile`s.  Compile units and some types
    require non-empty filenames.  A number of testcases have empty
    filenames, probably due to hand-reduction of testcases.
  - Not-quite empty arrays: `!{i32 0}`.  This used to be equivalent in
    the debug info schema to `!{}`.  They cause problems for
    `!MDSubroutineType`'s `types:` array, since it requires all operands
    to be valid types.  (Note that `!{null}` is the correct type array
    for functions that take no arguments and return `void`.)
  - Significantly bitrotted testcases.  Nodes got left behind a few
    upgrades ago because of missing or invalid tags.

llvm-svn: 232415
2015-03-16 21:10:12 +00:00
Duncan P. N. Exon Smith 166121ad0b Verifier: Check debug info intrinsic arguments
Verify that debug info intrinsic arguments are valid.  (These checks
will not recurse through the full debug info graph, so they don't need
to be cordoned of in `DebugInfoVerifier`.)

With those checks in place, changing the `DbgIntrinsicInst` accessors to
downcast to `MDLocalVariable` and `MDExpression` is natural (added isa
specializations in `Metadata.h` to support this).

Added tests to `test/Verifier` for the new -verify checks, and fixed the
debug info in all the in-tree tests.

If you have out-of-tree testcases that have started to fail to -verify,
hopefully the verify checks are helpful.  The most likely problem is
that the expression argument is `!{}` (instead of `!MDExpression()`).

llvm-svn: 232296
2015-03-15 01:21:30 +00:00
David Blaikie f72d05bc7b [opaque pointer type] Add textual IR support for explicit type parameter to gep operator
Similar to gep (r230786) and load (r230794) changes.

Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.

(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)

import fileinput
import sys
import re

rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s*\()((<\d*\s+x\s+)?([^@]*?)(|\s*addrspace\(\d+\))\s*\*(?(3)>)\s*)(?=$|%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|zeroinitializer|<|\[\[[a-zA-Z]|\{\{)", re.MULTILINE | re.DOTALL)

def conv(match):
  line = match.group(1)
  line += match.group(4)
  line += ", "
  line += match.group(2)
  return line

line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
  sys.stdout.write(line[off:match.start()])
  sys.stdout.write(conv(match))
  off = match.end()
sys.stdout.write(line[off:])

llvm-svn: 232184
2015-03-13 18:20:45 +00:00
Duncan P. N. Exon Smith e274180f0e DebugInfo: Move new hierarchy into place
Move the specialized metadata nodes for the new debug info hierarchy
into place, finishing off PR22464.  I've done bootstraps (and all that)
and I'm confident this commit is NFC as far as DWARF output is
concerned.  Let me know if I'm wrong :).

The code changes are fairly mechanical:

  - Bumped the "Debug Info Version".
  - `DIBuilder` now creates the appropriate subclass of `MDNode`.
  - Subclasses of DIDescriptor now expect to hold their "MD"
    counterparts (e.g., `DIBasicType` expects `MDBasicType`).
  - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp`
    for printing comments.
  - Big update to LangRef to describe the nodes in the new hierarchy.
    Feel free to make it better.

Testcase changes are enormous.  There's an accompanying clang commit on
its way.

If you have out-of-tree debug info testcases, I just broke your build.

  - `upgrade-specialized-nodes.sh` is attached to PR22564.  I used it to
    update all the IR testcases.
  - Unfortunately I failed to find way to script the updates to CHECK
    lines, so I updated all of these by hand.  This was fairly painful,
    since the old CHECKs are difficult to reason about.  That's one of
    the benefits of the new hierarchy.

This work isn't quite finished, BTW.  The `DIDescriptor` subclasses are
almost empty wrappers, but not quite: they still have loose casting
checks (see the `RETURN_FROM_RAW()` macro).  Once they're completely
gutted, I'll rename the "MD" classes to "DI" and kill the wrappers.  I
also expect to make a few schema changes now that it's easier to reason
about everything.

llvm-svn: 231082
2015-03-03 17:24:31 +00:00
David Blaikie a79ac14fa6 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

llvm-svn: 230794
2015-02-27 21:17:42 +00:00
David Blaikie 79e6c74981 [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.

This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.

* This doesn't modify gep operators, only instructions (operators will be
  handled separately)

* Textual IR changes only. Bitcode (including upgrade) and changing the
  in-memory representation will be in separate changes.

* geps of vectors are transformed as:
    getelementptr <4 x float*> %x, ...
  ->getelementptr float, <4 x float*> %x, ...
  Then, once the opaque pointer type is introduced, this will ultimately look
  like:
    getelementptr float, <4 x ptr> %x
  with the unambiguous interpretation that it is a vector of pointers to float.

* address spaces remain on the pointer, not the type:
    getelementptr float addrspace(1)* %x
  ->getelementptr float, float addrspace(1)* %x
  Then, eventually:
    getelementptr float, ptr addrspace(1) %x

Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.

update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).

The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7636

llvm-svn: 230786
2015-02-27 19:29:02 +00:00
Duncan P. N. Exon Smith 9885469922 IR: Move MDLocation into place
This commit moves `MDLocation`, finishing off PR21433.  There's an
accompanying clang commit for frontend testcases.  I'll attach the
testcase upgrade script I used to PR21433 to help out-of-tree
frontends/backends.

This changes the schema for `DebugLoc` and `DILocation` from:

    !{i32 3, i32 7, !7, !8}

to:

    !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8)

Note that empty fields (line/column: 0 and inlinedAt: null) don't get
printed by the assembly writer.

llvm-svn: 226048
2015-01-14 22:27:36 +00:00
Duncan P. N. Exon Smith be7ea19b58 IR: Make metadata typeless in assembly
Now that `Metadata` is typeless, reflect that in the assembly.  These
are the matching assembly changes for the metadata/value split in
r223802.

  - Only use the `metadata` type when referencing metadata from a call
    intrinsic -- i.e., only when it's used as a `Value`.

  - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode`
    when referencing it from call intrinsics.

So, assembly like this:

    define @foo(i32 %v) {
      call void @llvm.foo(metadata !{i32 %v}, metadata !0)
      call void @llvm.foo(metadata !{i32 7}, metadata !0)
      call void @llvm.foo(metadata !1, metadata !0)
      call void @llvm.foo(metadata !3, metadata !0)
      call void @llvm.foo(metadata !{metadata !3}, metadata !0)
      ret void, !bar !2
    }
    !0 = metadata !{metadata !2}
    !1 = metadata !{i32* @global}
    !2 = metadata !{metadata !3}
    !3 = metadata !{}

turns into this:

    define @foo(i32 %v) {
      call void @llvm.foo(metadata i32 %v, metadata !0)
      call void @llvm.foo(metadata i32 7, metadata !0)
      call void @llvm.foo(metadata i32* @global, metadata !0)
      call void @llvm.foo(metadata !3, metadata !0)
      call void @llvm.foo(metadata !{!3}, metadata !0)
      ret void, !bar !2
    }
    !0 = !{!2}
    !1 = !{i32* @global}
    !2 = !{!3}
    !3 = !{}

I wrote an upgrade script that handled almost all of the tests in llvm
and many of the tests in cfe (even handling many `CHECK` lines).  I've
attached it (or will attach it in a moment if you're speedy) to PR21532
to help everyone update their out-of-tree testcases.

This is part of PR21532.

llvm-svn: 224257
2014-12-15 19:07:53 +00:00
NAKAMURA Takumi d9913e6d35 llvm/test/Transforms/SampleProfile/syntax.ll: Relax MISSING-FILE not to
check locale-aware message catalog.

llvm-svn: 220934
2014-10-30 22:28:46 +00:00
Diego Novillo c572e92c76 Add profile writing capabilities for sampling profiles.
Summary:
This patch finishes up support for handling sampling profiles in both
text and binary formats. The new binary format uses uleb128 encoding to
represent numeric values. This makes profiles files about 25% smaller.

The profile writer class can write profiles in the existing text and the
new binary format. In subsequent patches, I will add the capability to
read (and perhaps write) profiles in the gcov format used by GCC.

Additionally, I will be adding support in llvm-profdata to manipulate
sampling profiles.

There was a bit of refactoring needed to separate some code that was in
the reader files, but is actually common to both the reader and writer.

The new test checks that reading the same profile encoded as text or
raw, produces the same results.

Reviewers: bogner, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6000

llvm-svn: 220915
2014-10-30 18:00:06 +00:00
Diego Novillo a67c0b43e1 Change error to warning when a profile cannot be found.
When the profile for a function cannot be applied, we use to emit an
error. This seems extreme. The compiler can continue, it's just that the
optimization opportunities won't include profile information.

llvm-svn: 220386
2014-10-22 13:36:35 +00:00
Diego Novillo 8027b80b41 Support using sample profiles with partial debug info.
Summary:
When using a profile, we used to require the use -gmlt so that we could
get access to the line locations. This is used to match line numbers in
the input profile to the line numbers in the function's IR.

But this is actually not necessary. The driver can provide source
location tracking without the emission of debug information. In these
cases, the annotation 'llvm.dbg.cu' is missing from the IR, but the
actual line location annotations are still present.

This patch adds a new way of looking for the start of the current
function. Instead of looking through the compile units in llvm.dbg.cu,
we can walk up the scope for the first instruction in the function with
a debug loc. If that describes the function, we use it. Otherwise, we
keep looking until we find one.

If no such instruction is found, we then give up and produce an error.

Reviewers: echristo, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5887

llvm-svn: 220382
2014-10-22 12:59:00 +00:00
Duncan P. N. Exon Smith 176b691d32 Revert "Revert "DI: Fold constant arguments into a single MDString""
This reverts commit r218918, effectively reapplying r218914 after fixing
an Ocaml bindings test and an Asan crash.  The root cause of the latter
was a tightened-up check in `DILexicalBlock::Verify()`, so I'll file a
PR to investigate who requires the loose check (and why).

Original commit message follows.

--

This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 219010
2014-10-03 20:01:09 +00:00
Duncan P. N. Exon Smith 786cd049fc Revert "DI: Fold constant arguments into a single MDString"
This reverts commit r218914 while I investigate some bots.

llvm-svn: 218918
2014-10-02 22:15:31 +00:00
Duncan P. N. Exon Smith 571f97bd90 DI: Fold constant arguments into a single MDString
This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 218914
2014-10-02 21:56:57 +00:00
Adrian Prantl 87b7eb9d0f Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.
llvm-svn: 218787
2014-10-01 18:55:02 +00:00
Adrian Prantl b458dc2eee Revert r218778 while investigating buldbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

llvm-svn: 218782
2014-10-01 18:10:54 +00:00
Adrian Prantl 25a7174e7a Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

llvm-svn: 218778
2014-10-01 17:55:39 +00:00
David Blaikie 2f3f76fdb1 Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks.
Somewhat unnoticed in the original implementation of discriminators, but
it could cause instructions to end up in new, small,
DW_TAG_lexical_blocks due to the use of DILexicalBlock to track
discriminator changes.

Instead, use DILexicalBlockFile which we already use to track file
changes without introducing new scopes, so it works well to track
discriminator changes in the same way.

llvm-svn: 216239
2014-08-21 22:45:21 +00:00
Diego Novillo 213bb00245 Tolerate unmangled names in sample profiles.
Summary:
The compiler does not always generate linkage names. If a function
has been inlined and its body elided, its linkage name may not be
generated.

When the binary executes, the profiler will use its unmangled name
when attributing samples. This results in unmangled names in the
input profile.

We are currently failing hard when this happens. However, in this case
all that happens is that we fail to attribute samples to the inlined
function. While this means fewer optimization opportunities, it should
not cause a compilation failure.

This patch accepts all valid function names, regardless of whether
they were mangled or not.

Reviewers: chandlerc

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3087

llvm-svn: 204142
2014-03-18 12:03:12 +00:00
NAKAMURA Takumi 64587433ce llvm/test/Transforms/SampleProfile/syntax.ll: Suppress checking the message catalog in ENOENT. It is locale-dependent on Windows.
llvm-svn: 203997
2014-03-15 02:32:21 +00:00
Diego Novillo a32aa3251c Use DiagnosticInfo facility.
Summary:
The sample profiler pass emits several error messages. Instead of
just aborting the compiler with report_fatal_error, we can emit
better messages using DiagnosticInfo.

This adds a new sub-class of DiagnosticInfo to handle the sample
profiler.

Reviewers: chandlerc, qcolombet

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D3086

llvm-svn: 203976
2014-03-14 21:58:59 +00:00
Diego Novillo 92aa8c220a Use discriminator information in sample profiles.
Summary:
When the sample profiles include discriminator information,
use the discriminator values to distinguish instruction weights
in different basic blocks.

This modifies the BodySamples mapping to map <line, discriminator> pairs
to weights. Instructions on the same line but different blocks, will
use different discriminator values. This, in turn, means that the blocks
may have different weights.

Other changes in this patch:

- Add tests for positive values of line offset, discriminator and samples.
- Change data types from uint32_t to unsigned and int and do additional
  validation.

Reviewers: chandlerc

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2857

llvm-svn: 203508
2014-03-10 22:41:28 +00:00
NAKAMURA Takumi a64d0bccc8 llvm/test/Transforms/SampleProfile/syntax.ll: Eliminate locale-sensitive message check.
llvm-svn: 199000
2014-01-11 09:23:52 +00:00
Diego Novillo 9518b63bfc Extend and simplify the sample profile input file.
1- Use the line_iterator class to read profile files.

2- Allow comments in profile file. Lines starting with '#'
   are completely ignored while reading the profile.

3- Add parsing support for discriminators and indirect call samples.

   Our external profiler can emit more profile information that we are
   currently not handling. This patch does not add new functionality to
   support this information, but it allows profile files to provide it.

   I will add actual support later on (for at least one of these
   features, I need support for DWARF discriminators in Clang).

   A sample line may contain the following additional information:

   Discriminator. This is used if the sampled program was compiled with
   DWARF discriminator support
   (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators). This
   is currently only emitted by GCC and we just ignore it.

   Potential call targets and samples. If present, this line contains a
   call instruction. This models both direct and indirect calls. Each
   called target is listed together with the number of samples. For
   example,

                    130: 7  foo:3  bar:2  baz:7

   The above means that at relative line offset 130 there is a call
   instruction that calls one of foo(), bar() and baz(). With baz()
   being the relatively more frequent call target.

   Differential Revision: http://llvm-reviews.chandlerc.com/D2355

4- Simplify format of profile input file.

   This implements earlier suggestions to simplify the format of the
   sample profile file. The symbol table is not necessary and function
   profiles do not need to know the number of samples in advance.

   Differential Revision: http://llvm-reviews.chandlerc.com/D2419

llvm-svn: 198973
2014-01-10 23:23:51 +00:00
Diego Novillo 0accb3d2bc Propagation of profile samples through the CFG.
This adds a propagation heuristic to convert instruction samples
into branch weights. It implements a similar heuristic to the one
implemented by Dehao Chen on GCC.

The propagation proceeds in 3 phases:

1- Assignment of block weights. All the basic blocks in the function
   are initial assigned the same weight as their most frequently
   executed instruction.

2- Creation of equivalence classes. Since samples may be missing from
   blocks, we can fill in the gaps by setting the weights of all the
   blocks in the same equivalence class to the same weight. To compute
   the concept of equivalence, we use dominance and loop information.
   Two blocks B1 and B2 are in the same equivalence class if B1
   dominates B2, B2 post-dominates B1 and both are in the same loop.

3- Propagation of block weights into edges. This uses a simple
   propagation heuristic. The following rules are applied to every
   block B in the CFG:

   - If B has a single predecessor/successor, then the weight
     of that edge is the weight of the block.

   - If all the edges are known except one, and the weight of the
     block is already known, the weight of the unknown edge will
     be the weight of the block minus the sum of all the known
     edges. If the sum of all the known edges is larger than B's weight,
     we set the unknown edge weight to zero.

   - If there is a self-referential edge, and the weight of the block is
     known, the weight for that edge is set to the weight of the block
     minus the weight of the other incoming edges to that block (if
     known).

Since this propagation is not guaranteed to finalize for every CFG, we
only allow it to proceed for a limited number of iterations (controlled
by -sample-profile-max-propagate-iterations). It currently uses the same
GCC default of 100.

Before propagation starts, the pass builds (for each block) a list of
unique predecessors and successors. This is necessary to handle
identical edges in multiway branches. Since we visit all blocks and all
edges of the CFG, it is cleaner to build these lists once at the start
of the pass.

Finally, the patch fixes the computation of relative line locations.
The profiler emits lines relative to the function header. To discover
it, we traverse the compilation unit looking for the subprogram
corresponding to the function. The line number of that subprogram is the
line where the function begins. That becomes line zero for all the
relative locations.

llvm-svn: 198972
2014-01-10 23:23:46 +00:00
NAKAMURA Takumi 9ae4da2e6d llvm/test/Transforms/SampleProfile/syntax.ll: Relax an expression, not to check locale-dependent message.
llvm-svn: 196195
2013-12-03 02:20:53 +00:00