Commit Graph

292 Commits

Author SHA1 Message Date
Roman Lebedev 5b94fe9623 [llvm-exegesis] Cut run time of analysis mode by -84% (*sic*) (YamlContext::getInstrOpcode())
Summary:
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (9 runs):

           9465.46 msec task-clock                #    1.000 CPUs utilized            ( +-  0.05% )
                60      context-switches          #    6.363 M/sec                    ( +- 79.45% )
                 0      cpu-migrations            #    0.000 K/sec
             11364      page-faults               # 1200.697 M/sec                    ( +-  0.60% )
       37935623543      cycles                    # 4008083.912 GHz                   ( +-  0.05% )  (83.32%)
        2371625356      stalled-cycles-frontend   #    6.25% frontend cycles idle     ( +-  0.37% )  (83.32%)
        8476077875      stalled-cycles-backend    #   22.34% backend cycles idle      ( +-  0.18% )  (33.36%)
       41822439158      instructions              #    1.10  insn per cycle
                                                  #    0.20  stalled cycles per insn  ( +-  0.02% )  (50.03%)
       11607658944      branches                  # 1226405861.486 M/sec              ( +-  0.01% )  (66.69%)
         210864633      branch-misses             #    1.82% of all branches          ( +-  0.06% )  (83.34%)

           9.46636 +- 0.00441 seconds time elapsed  ( +-  0.05% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-bew.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-bew.html' (9 runs):

           1480.66 msec task-clock                #    1.000 CPUs utilized            ( +-  0.19% )
                13      context-switches          #    8.483 M/sec                    ( +- 83.10% )
                 0      cpu-migrations            #    0.075 M/sec                    ( +-100.00% )
             11596      page-faults               # 7834.247 M/sec                    ( +-  0.59% )
        5933732194      cycles                    # 4008977.535 GHz                   ( +-  0.19% )  (83.22%)
         438111928      stalled-cycles-frontend   #    7.38% frontend cycles idle     ( +-  0.37% )  (83.25%)
        1454969705      stalled-cycles-backend    #   24.52% backend cycles idle      ( +-  0.94% )  (33.53%)
        7724218604      instructions              #    1.30  insn per cycle
                                                  #    0.19  stalled cycles per insn  ( +-  0.07% )  (50.14%)
        1979796413      branches                  # 1337599858.945 M/sec              ( +-  0.06% )  (66.74%)
          32641638      branch-misses             #    1.65% of all branches          ( +-  0.18% )  (83.31%)

           1.48128 +- 0.00284 seconds time elapsed  ( +-  0.19% )

$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-bew.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-old.html
```

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, llvm-commits, RKSimon

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57657

llvm-svn: 353024
2019-02-04 09:12:21 +00:00
Roman Lebedev 1a0d595f15 [llvm-exegesis] Throughput support in analysis mode
Summary:
D57000 / [[ https://bugs.llvm.org/show_bug.cgi?id=37698 | PR37698 ]] added support for measuring of the inverse throughput.
But the support for the analysis was not added.
This attempts to fix that. (analysis done o bdver2 / piledriver)

First, small-scale experiment:
```
$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=inverse_throughput -opcode-name=BSF64rr
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-d0acdd.o
---
mode:            inverse_throughput
key:
  instructions:
    - 'BSF64rr RAX RDX'
  config:          ''
  register_initial_values:
    - 'RDX=0x0'
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: inverse_throughput, value: 3.0278, per_snippet_value: 3.0278 }
error:           ''
info:            instruction has no tied variables picking Uses different from defs
assembled_snippet: 48BA0000000000000000480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2C3
...
```
If we plug `bsfq	%r12, %r10` into llvm-mca:
https://godbolt.org/z/ZtOyhJ
```
Dispatch Width:    4
uOps Per Cycle:    3.00
IPC:               0.50
Block RThroughput: 2.0
```
So RThroughput mismatch exists.

Now, let's upscale and analyse:
{F8207148}
`$ ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html`:
{F8207172}
{F8207197}
And if we now look at https://www.agner.org/optimize/instruction_tables.pdf,
`Reciprocal throughput` for `BSF r,r` is listed as `3`.
Yay?

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57647

llvm-svn: 353023
2019-02-04 09:12:17 +00:00
Roman Lebedev dc78bc277d [llvm-exegesis] deserializeMCInst(): bump SmallVector small size up to 16
Summary:
... from 8.
`VALIGNDZ128rmbik XMM0 XMM0 K1 XMM3 RDI i_0x1  i_0x0  i_0x1` instruction already has 9 components.
It does not matter much in terms of performance, but avoiding allocation seems to come with low cost here..

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57654

llvm-svn: 353022
2019-02-04 09:12:13 +00:00
Clement Courbet 362653f7af [llvm-exegesis] Add throughput mode.
Summary:
This just uses the latency benchmark runner on the parallel uops snippet
generator.

Fixes PR37698.

Reviewers: gchatelet

Subscribers: tschuett, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D57000

llvm-svn: 352632
2019-01-30 16:02:20 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Clement Courbet 176388c973 Revert rL350035 "[llvm-exegesis] Clustering: don't enqueue a point multiple times"
Let's discuss this on the review thread before submitting.

llvm-svn: 350207
2019-01-02 09:21:00 +00:00
Fangrui Song cd93d7ef43 [llvm-exegesis] Clustering: don't enqueue a point multiple times
Summary:
SetVector uses both DenseSet and vector, which is time/memory inefficient. The points are represented as natural numbers so we can replace the DenseSet part by indexing into a vector<char> instead.

Don't cargo cult the pseudocode on the wikipedia DBSCAN page. This is a standard BFS style algorithm (the similar loops have been used several times in other LLVM components): every point is processed at most once, thus the queue has at most NumPoints elements. We represent it with a vector and allocate it outside of the loop to avoid allocation in the loop body.

We check `Processed[P]` to avoid enqueueing a point more than once, which also nicely saves us a `ClusterIdForPoint_[Q].isUndef()` check.

Many people hate the oneshot abstraction but some favor it, therefore we make a compromise, use a lambda to abstract away the neighbor adding process.

Delete the comment `assert(Neighbors.capacity() == (Points_.size() - 1));` as it is wrong.

llvm-svn: 350035
2018-12-23 20:48:52 +00:00
Simon Pilgrim 96408bb04a Revert rL349136: [llvm-exegesis] Optimize ToProcess in dbScan
Summary:
Use `vector<char> Added + vector<size_t> ToProcess` to replace `SetVector ToProcess`

We also check `Added[P]` to enqueueing a point more than once, which
also saves us a `ClusterIdForPoint_[Q].isUndef()` check.

Reviewers: courbet, RKSimon, gchatelet, john.brawn, lebedev.ri

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54442
........
Patch wasn't approved and breaks buildbots

llvm-svn: 349139
2018-12-14 09:25:08 +00:00
Fangrui Song 92537ccc7e [llvm-exegesis] Optimize ToProcess in dbScan
Summary:
Use `vector<char> Added + vector<size_t> ToProcess` to replace `SetVector ToProcess`

We also check `Added[P]` to enqueueing a point more than once, which
also saves us a `ClusterIdForPoint_[Q].isUndef()` check.

Reviewers: courbet, RKSimon, gchatelet, john.brawn, lebedev.ri

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54442

llvm-svn: 349136
2018-12-14 08:27:35 +00:00
Jinsong Ji 56c74cff70 [llvm-exegesis][NFC] Some code style cleanup
Apply review comments of https://reviews.llvm.org/D54185 to other target as well, specifically:

1. make anonymous namespaces as small as possible, avoid using static inside anonymous namespaces
2. Add missing header to some files
3. GetLoadImmediateOpcodem-> getLoadImmediateOpcode
4. Fix typo

Differential Revision: https://reviews.llvm.org/D54343

llvm-svn: 347309
2018-11-20 14:41:59 +00:00
Clement Courbet bbab546a71 [llvm-exegesis][NFC] More tests for ExegesisTarget::fillMemoryOperands().
Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54304

llvm-svn: 347209
2018-11-19 14:31:43 +00:00
Roman Lebedev 71fdb57640 [llvm-exegesis] (+final perf overview) InstructionBenchmarkClustering::rangeQuery(): reserve for the upper bound of Neighbors
Summary:
As it was pointed out in D54388+D54390, the maximal size of `Neighbors` is known,
it will contain at most Points_.size() minus one (the center of the cluster)

While that is the upper bound, meaning in the most cases, the actual count
will be much smaller, since D54390 made the allocation persistent,
we no longer have to worry about overly-optimistically `reserve()`ing.

Old: (D54393)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       6553.167456      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.21% )
...
            6.5547 +- 0.0134 seconds time elapsed  ( +-  0.20% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       6315.057872      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.24% )
...
            6.3187 +- 0.0160 seconds time elapsed  ( +-  0.25% )
```
And that is another -~4%.


Since this is the last (as of this moment) patch in this patch series,
it is a good time to summarize:
Old: (svn trunk, as stated in D54381)
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m24.884s
user    0m24.099s
sys     0m0.785s
```
So these patches, on a given benchmark,
has decreased llvm-exegesis analysis time by 74.62%.

There surely is more room for further improvements.
D54514 may improve thins by -11.5% more (relative to this patch).
Parallelization may improve things further significantly, too.


Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54415

llvm-svn: 347204
2018-11-19 13:28:41 +00:00
Roman Lebedev 8e315b66c2 [llvm-exegesis] Move InstructionBenchmarkClustering::isNeighbour() into header
Summary:
Old: (D54390)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       7432.421721      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.15% )
...
            7.4336 +- 0.0115 seconds time elapsed  ( +-  0.15% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       6569.936144      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.22% )
...
            6.5711 +- 0.0143 seconds time elapsed  ( +-  0.22% )
```
And another -12%. You'd think it would be `inline`d anyway, but no! :)

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54393

llvm-svn: 347203
2018-11-19 13:28:36 +00:00
Roman Lebedev 666d855fbb [llvm-exegesis] InstructionBenchmarkClustering::rangeQuery(): write into llvm::SmallVectorImpl& output parameter
Summary:
I do believe this is the correct fix.
We call `rangeQuery()` *very* often. And many times it's output vector is large (tens of thousands entries), so small-size-opt won't help.

Old: (D54389)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       7934.528363      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.19% )
...
            7.9354 +- 0.0148 seconds time elapsed  ( +-  0.19% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       7383.793440      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.47% )
...
            7.3868 +- 0.0340 seconds time elapsed  ( +-  0.46% )
```
And another -7%. And that isn't even the good bit yet.

Old:
* calls to allocation functions: 2081419
* temporary allocations: 219658 (10.55%)
* bytes allocated in total (ignoring deallocations): 4.31 GB

New:
* calls to allocation functions: 1880295 (-10%)
* temporary allocations: 18758 (1%) (-91% *sic*)
* bytes allocated in total (ignoring deallocations): 545.15 MB (-88% *sic*)

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54390

llvm-svn: 347202
2018-11-19 13:28:31 +00:00
Roman Lebedev 5c5b1ea725 [llvm-exegesis] InstructionBenchmarkClustering::dbScan(): replace std::vector<> with std::deque<> in llvm::SetVector<>
Summary:
Old: (D54388)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       8606.323981      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.11% )
...
           8.60773 +- 0.00978 seconds time elapsed  ( +-  0.11% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       7971.403653      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.14% )
...
            7.9728 +- 0.0113 seconds time elapsed  ( +-  0.14% )
```
Another -~7%.

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, RKSimon

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54389

llvm-svn: 347201
2018-11-19 13:28:26 +00:00
Roman Lebedev 8aecb0c489 [llvm-exegesis] InstructionBenchmarkClustering::rangeQuery(): use llvm::SmallVector<size_t, 0> for storage.
Summary:
Old: (D54383)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       9098.781978      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.16% )
...
            9.1015 +- 0.0148 seconds time elapsed  ( +-  0.16% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs):

       8553.352480      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.12% )
...
            8.5539 +- 0.0105 seconds time elapsed  ( +-  0.12% )
```
So another -6%.
That is because the `SmallVector` **doubles** it size when reallocating, which is great here,
since we can't `reserve()` since we can't know how many `Neighbors` we will have.

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54388

llvm-svn: 347200
2018-11-19 13:28:22 +00:00
Roman Lebedev b311c1d6b8 [llvm-exegesis] Analysis: writeMeasurementValue(): don't alloc string for double each time.
Summary:
Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!)
Old time: (D54382)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       9024.354355      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.18% )
...
            9.0262 +- 0.0161 seconds time elapsed  ( +-  0.18% )
```
New time:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       8996.541057      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.19% )
...
            9.0045 +- 0.0172 seconds time elapsed  ( +-  0.19% )
```
-~0.3%, not that much. But this isn't the important part.

Old:
* calls to allocation functions: 2109712
* temporary allocations: 33112
* bytes allocated in total (ignoring deallocations): 4.43 GB

New:
* calls to allocation functions: 2095345 (-0.68%)
* temporary allocations: 18745 (-43.39% !!!)
* bytes allocated in total (ignoring deallocations): 4.31 GB (-2.71%)

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54383

llvm-svn: 347199
2018-11-19 13:28:17 +00:00
Roman Lebedev f8b28e9bf4 [llvm-exegesis] Analysis::writeSnippet(): be smarter about memory allocations.
Summary:
Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!)
Old time: (D54381)
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m10.487s
user    0m9.745s
sys     0m0.740s
```
New time:
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m9.599s
user    0m8.824s
sys     0m0.772s

```
Not that much, around -9%. But that is not the good part yet, again.

Old:
* calls to allocation functions: 3347676
* temporary allocations: 277818
* bytes allocated in total (ignoring deallocations): 10.52 GB

New:
* calls to allocation functions: 2109712 (-36%)
* temporary allocations: 33112 (-88%)
* bytes allocated in total (ignoring deallocations): 4.43 GB (-58% *sic*)

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54382

llvm-svn: 347198
2018-11-19 13:28:14 +00:00
Roman Lebedev 0b4b512826 [llvm-exegesis] InstructionBenchmarkClustering::dbScan(): use llvm::SetVector<> instead of ILLEGAL std::unordered_set<>
Summary:
Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!)
Old time:
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m24.884s
user    0m24.099s
sys     0m0.785s
```
New time:
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m10.469s
user    0m9.797s
sys     0m0.672s
```
So -60%. And that isn't the good bit yet.

Old:
* calls to allocation functions: 106560180  (yes, 107 *million* allocations.)
* bytes allocated in total (ignoring deallocations): 12.17 GB

New:
* calls to allocation functions: 3347676  (-96.86%)  (just 3 mil)
* bytes allocated in total (ignoring deallocations): 10.52 GB (~2GB less)

---

Two points i want to raise:
* `std::unordered_set<>` should not have been used there in the first place.
  It is banned by the https://llvm.org/docs/ProgrammersManual.html#other-set-like-container-options
* There is no tests, so i'm not fully sure this is correct.
  Since it was unordered set, i guess there are zero restrictions on the order, and anything will be ok?
* I tried other containers suggested in https://llvm.org/docs/ProgrammersManual.html#set-like-containers-std-set-smallset-setvector-etc,
  this `llvm::SetVector<>` seems to be best here.

Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet

Subscribers: kristina, bobsayshilol, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54381

llvm-svn: 347197
2018-11-19 13:28:09 +00:00
Clement Courbet eee2e06e2a [llvm-exegesis][NFC] Add a way to declare the default counter binding for unbound CPUs for a target.
Summary:
This simplifies the code and moves everything to tablegen for consistency. This
also prepares the ground for adding issue counters.

Reviewers: gchatelet, john.brawn, jsji

Subscribers: nemanjai, mgorny, javed.absar, kbarton, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54297

llvm-svn: 346489
2018-11-09 13:15:32 +00:00
Jinsong Ji 5fd3e75478 [PowerPC][llvm-exegesis] Add a PowerPC target
This is patch to add PowerPC target to llvm-exegesis.
The target does just enough to be able to run llvm-exegesis in latency mode for at least some opcodes.

Differential Revision: https://reviews.llvm.org/D54185

llvm-svn: 346411
2018-11-08 16:51:42 +00:00
Clement Courbet 54c2fa1202 [llvm-exegesis][NFC] Add missing header guard + cosmetics.
Reviewers: gchatelet

Reviewed By: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54252

llvm-svn: 346400
2018-11-08 12:37:56 +00:00
Clement Courbet 0d79aaf1a7 Revert "[llvm-exegesis] Add a snippet generator to generate snippets to compute ROB sizes."
This reverts accidental commit rL346394.

llvm-svn: 346398
2018-11-08 12:09:45 +00:00
Clement Courbet c0950ae990 [llvm-exegesis] Add a snippet generator to generate snippets to compute ROB sizes.
llvm-svn: 346394
2018-11-08 11:45:14 +00:00
Clement Courbet 5b0d783078 [llvm-exegesis] Remove superfluous move.
/Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/tools/llvm-exegesis/lib/X86/Target.cpp:155:12: error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move]
    return std::move(Error);
           ^
/Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/tools/llvm-exegesis/lib/X86/Target.cpp:155:12: note: remove std::move call here
    return std::move(Error);
           ^~~~~~~~~~     ~

llvm-svn: 346333
2018-11-07 16:52:50 +00:00
Clement Courbet c544838f87 [llvm-exegesis] Correclty handle all X86 memory encoding formats.
Summary:
Add unit tests to check the support for each supported format to avoid
regressions such as the one in PR36906.

Reviewers: gchatelet

Subscribers: tschuett, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D54144

llvm-svn: 346330
2018-11-07 16:14:55 +00:00
Clement Courbet 7066769223 [llvm-exegesis] Increasing wrapping limit.
Summary: Fixes PR39097.

Reviewers: gchatelet

Subscribers: llvm-commits, tschuett

Differential Revision: https://reviews.llvm.org/D54151

llvm-svn: 346328
2018-11-07 15:46:45 +00:00
Clement Courbet 003e08ff28 [llvm-exegesis] Ignore X86 pseudo instructions.
Summary: They do not lower to actual MCInsts and have no scheduling info.

Reviewers: gchatelet

Subscribers: llvm-commits, tschuett

Differential Revision: https://reviews.llvm.org/D54147

llvm-svn: 346227
2018-11-06 14:11:58 +00:00
Matthias Braun 3d849f67cb MachineModuleInfo: Store more specific reference to LLVMTargetMachine; NFC
MachineModuleInfo can only be used in code using lib/CodeGen, hence we
can keep a more specific reference to LLVMTargetMachine rather than just
TargetMachine around.

llvm-svn: 346182
2018-11-05 23:49:13 +00:00
Clement Courbet 4d837fce88 [llvm-exegesis] Fix SNB counter definition and handling.
Summary: SNB is the only one that has P23 as a single proc res.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53766

llvm-svn: 345480
2018-10-28 19:09:14 +00:00
Simon Pilgrim 2a9c728088 Fix MSVC llvm-exegesis build. NFCI.
MSVC is a bit funny about is_pod.....

llvm-svn: 345252
2018-10-25 10:45:38 +00:00
Clement Courbet b4b6ec01c6 [llvm-exegesis] Add missing initializer.
This is a better fix than rL345245.

llvm-svn: 345246
2018-10-25 08:11:35 +00:00
Clement Courbet fa99b36e4d [llvm-exegesis] Fix VC build of r345243.
"const members cannot be default initialized unless their type has a user defined default constructor"

Make members non-const.

llvm-svn: 345245
2018-10-25 08:08:58 +00:00
Clement Courbet 8902c885d6 [llvm-exegesis] Fix warning in r345243.
warning C4099: 'llvm::exegesis::PfmCountersInfo': type name first seen using 'class' now seen using 'struct'

llvm-svn: 345244
2018-10-25 08:06:35 +00:00
Clement Courbet 41c8af3924 [MCSched] Bind PFM Counters to the CPUs instead of the SchedModel.
Summary:
The pfm counters are now in the ExegesisTarget rather than the
MCSchedModel (PR39165).

This also compresses the pfm counter tables (PR37068).

Reviewers: RKSimon, gchatelet

Subscribers: mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D52932

llvm-svn: 345243
2018-10-25 07:44:01 +00:00
Guillaume Chatelet da11b85606 [llvm-exegesis] Implements a cache of Instruction objects.
llvm-svn: 345130
2018-10-24 11:55:06 +00:00
Fangrui Song a342834b24 [llvm-exegesis] Fix name lookup ambiguity in MSVC after 344922
llvm-svn: 344927
2018-10-22 17:52:31 +00:00
Fangrui Song 32401afd8c [llvm-exegesis] Move namespace exegesis inside llvm::
Summary:
This allows simplifying references of llvm::foo with foo when the needs
come in the future.

Reviewers: courbet, gchatelet

Reviewed By: gchatelet

Subscribers: javed.absar, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53455

llvm-svn: 344922
2018-10-22 17:10:47 +00:00
Guillaume Chatelet 18ef4a4a0d [llvm-exegesis] Crash when assembling invalid Operand
llvm-svn: 344907
2018-10-22 15:06:10 +00:00
Guillaume Chatelet 02f70a3fde [llvm-exegesis] Mark x86 segment register instructions as unsupported.
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53499

llvm-svn: 344906
2018-10-22 14:55:43 +00:00
Guillaume Chatelet 3c639f33b4 [llvm-exegesis] Reject x86 instructions that use non uniform memory accesses
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53438

llvm-svn: 344905
2018-10-22 14:46:08 +00:00
Clement Courbet 8d0dd0ba0e [llvm-exegesis] Mark second-form X87 instructions as unsupported.
Summary:
We only support the first form because we rely on information that is
only available there.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53430

llvm-svn: 344782
2018-10-19 12:24:49 +00:00
Clement Courbet 22bad0497e [llvm-exegesis] Re-enable liveliness tracker.
Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53429

llvm-svn: 344780
2018-10-19 12:08:05 +00:00
Clement Courbet c51f45239d [llvm-exegesis] X87 RFP setup code.
Summary:
This was lost during refactoring in rL342644.

Fix and simplify simplify value size handling: always go through a 80 bit value,
because the value can be 1 byte). Add unit tests.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53423

llvm-svn: 344779
2018-10-19 09:56:54 +00:00
Fangrui Song 2e83b2e9ee Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFC
llvm-svn: 344774
2018-10-19 06:12:02 +00:00
Krasimir Georgiev 11bc3a18e2 [llvm-exegesis] Mark destructor virtual after r344695
This was causing a -Wnon-virtual-dtor warning.

llvm-svn: 344721
2018-10-18 02:06:16 +00:00
Clement Courbet f973c2df9d [llvm-exegesis] Allow measuring several instructions in a single run.
Summary:
We try to recover gracefully on instructions that would crash the
program.

This includes some refactoring of runMeasurement() implementations.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53371

llvm-svn: 344695
2018-10-17 15:04:15 +00:00
Guillaume Chatelet 6f4bc17309 Fix uninitialized variable
llvm-svn: 344692
2018-10-17 12:27:46 +00:00
Guillaume Chatelet 952b121a9c BuildBot fix, compiler complains about array decay to pointer
llvm-svn: 344690
2018-10-17 12:09:21 +00:00
Guillaume Chatelet fcbb6f3c2b [llvm-exegeis] Computing Latency configuration upfront so we can generate many CodeTemplates at once.
Summary: LatencyGenerator now computes all possible mode of serial execution for an Instruction upfront and generates CodeTemplate for the ones that give the best results (e.g. no need to generate a two instructions snippet when repeating a single one would do). The next step is to generate even more configurations for cases (e.g. for XOR we should generate "XOR EAX, EAX, EAX" and "XOR EAX, EAX, EBX")

Reviewers: courbet

Reviewed By: courbet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53320

llvm-svn: 344689
2018-10-17 11:37:28 +00:00
Guillaume Chatelet a3849490b1 [llvm-exegesis] Fix missing std::move.
llvm-svn: 344496
2018-10-15 09:21:21 +00:00
Guillaume Chatelet 296a862cbe [llvm-exegesis][NFC] Return many CodeTemplates instead of one.
Summary: This is part one of the change where I simply changed the signature of the functions. More work need to be done to actually produce more than one CodeTemplate per instruction.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53209

llvm-svn: 344493
2018-10-15 09:09:19 +00:00
Guillaume Chatelet 946fb0517a [llvm-exegesis][NFC] Simplify code at the cost of small code duplication
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53198

llvm-svn: 344351
2018-10-12 15:12:22 +00:00
Guillaume Chatelet f33b60258e [llvm-exegesis] Fix always true assert
llvm-svn: 344151
2018-10-10 16:16:43 +00:00
Guillaume Chatelet 9b59238822 [llvm-exegesis][NFC] Pass Instruction instead of bare Opcode
llvm-svn: 344145
2018-10-10 14:57:32 +00:00
Guillaume Chatelet ee9c2a17b8 [llvm-exegesis][NFC] Code simplification
Summary: Simplify code by having LLVMState hold the RegisterAliasingTrackerCache.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53078

llvm-svn: 344143
2018-10-10 14:22:48 +00:00
John Brawn c616a7236c [llvm-exegesis] Fix function return generation so it doesn't return register 0
When fillMachineFunction generates a return on targets without a return opcode
(such as AArch64) it should pass an empty set of registers as the return
registers, not 0 which means register number zero.

Differential Revision: https://reviews.llvm.org/D53074

llvm-svn: 344139
2018-10-10 13:03:23 +00:00
Guillaume Chatelet d227754973 [llvm-exegesis] Fix broken build.
llvm-svn: 344131
2018-10-10 10:09:42 +00:00
Guillaume Chatelet ffc3ffac7d [llvm-exegesis][NFC] Simplify code now that Instruction has more semantic
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53065

llvm-svn: 344130
2018-10-10 09:45:17 +00:00
Guillaume Chatelet 0c17cbf790 [llvm-exegesis] Remove unused variable, add more semantic to Instruction.
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53062

llvm-svn: 344127
2018-10-10 09:12:36 +00:00
Guillaume Chatelet 22cccffa06 Fix function case.
llvm-svn: 344051
2018-10-09 14:51:33 +00:00
Guillaume Chatelet 547d2dd1dd [llvm-exegesis] Fix invalid return type and add a Dump function.
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53020

llvm-svn: 344050
2018-10-09 14:51:29 +00:00
Guillaume Chatelet fe5b0b488e [llvm-exegesis] Fix wrong index type.
llvm-svn: 344032
2018-10-09 10:06:19 +00:00
Guillaume Chatelet cf6d5fab32 [llvm-exegesis] Fix unused lambda capture.
llvm-svn: 344029
2018-10-09 09:33:29 +00:00
Guillaume Chatelet 09c2839c02 [llvm-exegesis][NFC] Use accessors for Operand.
Summary:
This moves checking logic into the accessors and makes the structure smaller.
It will also help when/if Operand are generated from the TD files.

Subscribers: tschuett, courbet, llvm-commits

Differential Revision: https://reviews.llvm.org/D52982

llvm-svn: 344028
2018-10-09 08:59:10 +00:00
Guillaume Chatelet 9157bc914f [llvm-exegesis][NFC] Improve parsing of the YAML files
Summary: sscanf turns out to be slow for reading floating points.

Reviewers: courbet

Subscribers: tschuett, llvm-commits, RKSimon

Differential Revision: https://reviews.llvm.org/D52866

llvm-svn: 343771
2018-10-04 12:33:46 +00:00
Simon Pilgrim 92d02027c2 [llvm-exegesis] Avoid yaml parser from calling sscanf for obvious non-matches (PR39102)
deserializeMCOperand - ensure that we at least match the first character of the sscanf pattern before calling

This reduces llvm-exegesis uops analysis of the instructions supported from btver2 from 5m13s to 2m1s on debug builds.

llvm-svn: 343690
2018-10-03 14:51:09 +00:00
Clement Courbet 5a768ddd44 [llvm-exegesis][NFC] Revert rL343682 "Fix unused variable warning".
That was not the proper fix: the variable is used in debug mode.

llvm-svn: 343685
2018-10-03 12:48:50 +00:00
Clement Courbet 8a5a6be47a [llvm-exegesis] Fix rL343680 in release mode.
llvm-svn: 343684
2018-10-03 12:35:35 +00:00
Clement Courbet af50a5b85f [llvm-exegesis][NFC] Fix unused variable warning.
llvm-svn: 343682
2018-10-03 12:27:43 +00:00
Clement Courbet d5a39553ff [llvm-exegesis] Resolve variant classes in analysis.
Summary: See PR38884.

Reviewers: gchatelet

Subscribers: tschuett, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D52825

llvm-svn: 343680
2018-10-03 11:50:25 +00:00
Guillaume Chatelet 415b2fbef5 [llvm-exegesis][NFC] Move random functions from CodeTemplate to SnippetGenerator.
Summary: Just moving methods around.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52720

llvm-svn: 343461
2018-10-01 12:19:10 +00:00
Guillaume Chatelet c6268f3ba2 [llvm-exegesis][NFC] Make randomizeUnsetVariables a free function.
Summary: This is prelimineary to moving random functions to SnippetGenerator.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52718

llvm-svn: 343456
2018-10-01 11:46:06 +00:00
Clement Courbet 30183093ab [llvm-exegesis] Fix PR39096.
Summary: The key is now the resource name, not the resource id.

Reviewers: gchatelet

Subscribers: tschuett, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D52607

llvm-svn: 343208
2018-09-27 13:26:37 +00:00
Guillaume Chatelet 70ac019efa [llvm-exegesis][NFC] moving code around.
Summary: Renaming InstructionBuilder into InstructionTemplate and moving code generation tools from MCInstrDescView to CodeTemplate.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52592

llvm-svn: 343188
2018-09-27 09:23:04 +00:00
Fangrui Song 0cac726a00 llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Clement Courbet 28d4f85824 [llvm-exegesis] Get rid of debug_string.
Summary:
THis is a backwards-compatible change (existing files will work as
expected).

See PR39082.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52546

llvm-svn: 343108
2018-09-26 13:35:10 +00:00
Guillaume Chatelet 7f8d310b76 [llvm-exegesis][NFC] Move CodeTemplate to it's own file.
Summary: This is is preparation of exploring value ranges.

Reviewers: courbet

Reviewed By: courbet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52542

llvm-svn: 343098
2018-09-26 11:57:24 +00:00
Clement Courbet 596c56ff9c [llvm-exegesis] Add support for measuring NumMicroOps.
Summary:
Example output for vzeroall:

---
mode:            uops
key:
  instructions:
    - 'VZEROALL'
  config:          ''
  register_initial_values:
cpu_name:        haswell
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { debug_string: HWPort0, value: 0.0006, per_snippet_value: 0.0006,
      key: '3' }
  - { debug_string: HWPort1, value: 0.0011, per_snippet_value: 0.0011,
      key: '4' }
  - { debug_string: HWPort2, value: 0.0004, per_snippet_value: 0.0004,
      key: '5' }
  - { debug_string: HWPort3, value: 0.0018, per_snippet_value: 0.0018,
      key: '6' }
  - { debug_string: HWPort4, value: 0.0002, per_snippet_value: 0.0002,
      key: '7' }
  - { debug_string: HWPort5, value: 1.0019, per_snippet_value: 1.0019,
      key: '8' }
  - { debug_string: HWPort6, value: 1.0033, per_snippet_value: 1.0033,
      key: '9' }
  - { debug_string: HWPort7, value: 0.0001, per_snippet_value: 0.0001,
      key: '10' }
  - { debug_string: NumMicroOps, value: 20.0069, per_snippet_value: 20.0069,
      key: NumMicroOps }
error:           ''
info:            ''
assembled_snippet: C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C5FC77C3
...

Reviewers: gchatelet

Subscribers: tschuett, RKSimon, andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52539

llvm-svn: 343094
2018-09-26 11:22:56 +00:00
Clement Courbet 684a5f6753 [llvm-exegesis] Output the unscaled value as well as the scaled one.
Summary: See PR38936 for context.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52500

llvm-svn: 343081
2018-09-26 08:37:21 +00:00
Guillaume Chatelet 345fae5d56 [llvm-exegesis] Serializes registers initial values.
Summary: Adds the registers initial values to the YAML output of llvm-exegesis.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52460

llvm-svn: 342982
2018-09-25 15:15:54 +00:00
Guillaume Chatelet 6078f82241 [llvm-exegesis] Fix missing document separator in YAML output.
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52496

llvm-svn: 342981
2018-09-25 14:48:24 +00:00
Clement Courbet 86baebc5fd [llvm-exegesis] Add lit tests (v2).
Summary: This revisits rL342953 by adding detection of host support.

Reviewers: gchatelet, lebedev.ri, alexshap

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52464

llvm-svn: 342975
2018-09-25 13:59:35 +00:00
Guillaume Chatelet 55ad087a4c [llvm-exegesis][NFC] Rewrite of the YAML serialization.
Summary: This is a NFC in preparation of exporting the initial registers as part of the YAML dump

Reviewers: courbet

Reviewed By: courbet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52427

llvm-svn: 342967
2018-09-25 12:18:08 +00:00
Clement Courbet 78b2e73d15 [llvm-exegesis] Allow benchmarking arbitrary code snippets.
Summary:

This is a step towards fixing PR38048.

Note that right now the measurements are given per instruction. We'll
need to give measurements a per code snippet and update the analysis (PR38731).

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52041

llvm-svn: 342947
2018-09-25 07:31:44 +00:00
Clement Courbet 1e8fdbe3c3 [llvm-exegesis] Fix PR39021.
Summary:
The `set` statements was incorrectly reading the value of the local variable and
setting the value of the parent variable.

Reviewers: tycho, gchatelet, john.brawn

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52343

llvm-svn: 342865
2018-09-24 08:39:48 +00:00
Guillaume Chatelet c96a97bac7 [llvm-exegesis] Improve Register Setup (roll forward of D51856).
Summary:
Added function to set a register to a particular value + tests.
Add EFLAGS test, use new setRegTo instead of setRegToConstant.

Reviewers: courbet, javed.absar

Subscribers: llvm-commits, tschuett, mgorny

Differential Revision: https://reviews.llvm.org/D52297

llvm-svn: 342644
2018-09-20 12:22:18 +00:00
Simon Pilgrim f652ef3d52 Revert rL342465: Added function to set a register to a particular value + tests.
rL342465 is breaking the MSVC buildbots.

llvm-svn: 342490
2018-09-18 15:38:16 +00:00
Simon Pilgrim 0242689725 Revert rL342466: [llvm-exegesis] Improve Register Setup.
rL342465 is breaking the MSVC buildbots, but I need to revert this dependent revision as well.

Summary:
Added function to set a register to a particular value + tests.
Add EFLAGS test, use new setRegTo instead of setRegToConstant.

Reviewers: courbet, javed.absar

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D51856

llvm-svn: 342489
2018-09-18 15:35:49 +00:00
Guillaume Chatelet 937f3fedec [llvm-exegesis] Improve Register Setup.
Summary:
Added function to set a register to a particular value + tests.
Add EFLAGS test, use new setRegTo instead of setRegToConstant.

Reviewers: courbet, javed.absar

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D51856

llvm-svn: 342466
2018-09-18 11:26:48 +00:00
Guillaume Chatelet 8721ad98d1 Added function to set a register to a particular value + tests.
llvm-svn: 342465
2018-09-18 11:26:35 +00:00
Guillaume Chatelet 5ad2909e52 Improve Register Setup
llvm-svn: 342464
2018-09-18 11:26:27 +00:00
Simon Pilgrim a2fd56c3e4 Fix "not all control paths return a value" MSVC warning. NFCI.
llvm-svn: 342394
2018-09-17 13:56:42 +00:00
Guillaume Chatelet cd488efe7e [llvm-exegesis] Add predefined floating point values so we can test impact of special values on latency.
Summary: This will be useful to generate many configurations and test instruction regimes (NaN, Inf, subnormal, normal).

Reviewers: courbet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D51858

llvm-svn: 342369
2018-09-17 11:09:32 +00:00
Nico Weber b09a8c9bd9 Revert r342148 (and follow-on fix attempts r342154, r342180, r342182, r342193)
Many bots buildling with make have been broken for several days, e.g.
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13

llvm-svn: 342336
2018-09-15 19:04:27 +00:00
Richard Diamond f29b36c76d [cmake] Fix missing DEPENDS.
Not sure how I didn't catch this.

llvm-svn: 342154
2018-09-13 17:10:44 +00:00
Richard Diamond f3063baa6e Renovate CMake files in the `llvm-(cfi-verify|exegesis|mca)` tools.
llvm-svn: 342148
2018-09-13 16:15:03 +00:00
Clement Courbet d939f6d013 [llvm-exegesis][NFC] Split BenchmarkRunner class
Summary:
The snippet-generation part goes to the SnippetGenerator class.

This will allow benchmarking arbitrary code (see PR38437).

Reviewers: gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D51979

llvm-svn: 342117
2018-09-13 07:40:53 +00:00
Clement Courbet 903667e956 [llvm-exegesis][NFC]Remove dead function parameter
llvm-svn: 342035
2018-09-12 09:26:32 +00:00
Simon Pilgrim fc2931375d [llvm-exegesis] Ignore double spaced separators in asm strings
Some asm has double spaces between operands, the deserializer was keeping these empty split pieces, causing assertions later on:

'ADC16mi RDI i_0x1x  i_0x0x  i_0x1x'

llvm-svn: 341799
2018-09-10 10:45:04 +00:00
Guillaume Chatelet e60866a4e0 [llvm-exegesis] Renaming classes and functions.
Summary: Functional No Op.

Reviewers: gchatelet

Subscribers: tschuett, courbet, llvm-commits

Differential Revision: https://reviews.llvm.org/D50231

llvm-svn: 338836
2018-08-03 09:29:38 +00:00
Guillaume Chatelet 171f3f46c8 [llvm-exegesis] Rename InstructionInstance into InstructionBuilder.
Summary: Non functional change.

Subscribers: tschuett, courbet, llvm-commits

Differential Revision: https://reviews.llvm.org/D50176

llvm-svn: 338701
2018-08-02 11:12:02 +00:00
Guillaume Chatelet fb94354d2d [llvm-exegesis] Provide a way to handle memory instructions.
Summary:
And implement memory instructions on X86.

This fixes PR36906.

Reviewers: gchatelet

Reviewed By: gchatelet

Subscribers: lebedev.ri, filcab, mgorny, tschuett, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D48935

llvm-svn: 338567
2018-08-01 14:41:45 +00:00
Clement Courbet f9a0bb330d [llvm-exegesis] Add uop computation for more X87 instruction classes.
Summary:
This allows measuring comparisons (UCOM_FpIr32,UCOM_Fpr32,...),
conditional moves (CMOVBE_Fp32,...)

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48713

llvm-svn: 336352
2018-07-05 13:54:51 +00:00
Clement Courbet 2c278cdd98 [llvm-exegesis][NFC]clang-format
llvm-svn: 336343
2018-07-05 12:26:12 +00:00
Clement Courbet e945fad250 [llvm-exegesis] Remove dead comment.
llvm-svn: 336266
2018-07-04 12:31:00 +00:00
John Brawn c4ed60042f [llvm-exegesis] Add an AArch64 target
The target does just enough to be able to run llvm-exegesis in latency mode for
at least some opcodes.

Differential Revision: https://reviews.llvm.org/D48780

llvm-svn: 336187
2018-07-03 10:10:29 +00:00
Clement Courbet e785169fce [llvm-exegesis] ExegisX86Target::setRegToConstant() should depend on the subtarget features.
Summary: This fixes PR38008.

Reviewers: gchatelet, RKSimon

Subscribers: tschuett, craig.topper, llvm-commits

Differential Revision: https://reviews.llvm.org/D48820

llvm-svn: 336171
2018-07-03 06:17:05 +00:00
John Brawn 346856dc6c [llvm-exegesis] Change how the native architecture is determined
Currently the llvm-exegesis native architecture is determined by comparing the
llvm native architecture with X86, so to add a new target would mean adding a
new check. Change this to building up a list of the targets llvm-exegesis
supports then using that, as this means that when adding a new target you just
add the target to the list of supported targets.

Differential Revision: https://reviews.llvm.org/D48778

llvm-svn: 336105
2018-07-02 13:53:46 +00:00
John Brawn 8fc5ec78d5 [llvm-exegesis] Delegate the decision of cycle counter name to the target
Currently the cycle counter is taken from the subtarget schedule model, which
isn't any use if the subtarget doesn't have one. Delegate the decision to the
target benchmark runner, as it may know better what to do in that case, with
the default being the current behaviour.

Differential Revision: https://reviews.llvm.org/D48779

llvm-svn: 336099
2018-07-02 13:14:49 +00:00
Clement Courbet a53349251c [llvm-exegesis][NFC] Cleanup useless braces.
llvm-svn: 336076
2018-07-02 06:39:55 +00:00
Clement Courbet 717c9768d3 [llvm-exegesis] Add partial X87 support.
Summary:
This enables the X86-specific X86FloatingPointStackifierPass, and allow
llvm-exegesis to generate and measure X87 latency/uops for some FP ops.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48592

llvm-svn: 335815
2018-06-28 07:41:16 +00:00
Clement Courbet 650db339a5 [llvm-exegesis][NFC] Fix windows warning in rL335465.
llvm-svn: 335591
2018-06-26 10:52:41 +00:00
Clement Courbet 4860b98443 [llvm-exegesis] Get the BenchmarkRunner from the ExegesisTarget.
Summary:
This allows targets to override code generation for some instructions.
As an example of override, this also moves ad-hoc instruction filtering
for X86 into the X86 ExegesisTarget.

Reviewers: gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48587

llvm-svn: 335582
2018-06-26 08:49:30 +00:00
Clement Courbet 0e8bf4e5aa [llvm-exegesis][NFC] Remove unnecessary member variables.
llvm-svn: 335470
2018-06-25 13:44:27 +00:00
Clement Courbet 6a60f2fcfc [llvm-exegesis] Fix warning in r22752: Initialize IsSnippetSetupComplete.
llvm-svn: 335467
2018-06-25 13:39:50 +00:00
Clement Courbet a51efc266c [llvm-exegesis] Generate snippet setup code.
Summary:
This ensures that the snippet always sees the same values for registers,
making measurements reproducible.
This will also allow exploring different values.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48542

llvm-svn: 335465
2018-06-25 13:12:02 +00:00
Clement Courbet 9523ef0b29 [llvm-exegesis][NFC] Simplify BenchmarkRunner ctor.
llvm-svn: 335456
2018-06-25 11:44:29 +00:00
Clement Courbet cff2caac75 [llvm-exegesis][NFC] clang-format
llvm-svn: 335452
2018-06-25 11:22:23 +00:00
Clement Courbet e2fc89fdd6 [llvm-exegesis][NFC] Fix `Operand` class comments.
llvm-svn: 335450
2018-06-25 11:12:30 +00:00
Clement Courbet 1ef6aa814d [llvm-exegesis][NFC] Simplify BenchmarkRunner.
Get rid of createExecutableFunction().

llvm-svn: 335240
2018-06-21 14:49:04 +00:00
Clement Courbet 760d1d5741 [llvm-exegesis][NFC] Simplify LLVMState.
Summary: Pretty much everything we need is in llvm::TargetMachine.

Reviewers: gchatelet

Subscribers: llvm-commits, tschuett

Differential Revision: https://reviews.llvm.org/D48428

llvm-svn: 335237
2018-06-21 14:11:09 +00:00
Clement Courbet 6fd00e32e5 [llvm-exegesis] Add mechanism to add target-specific passes.
Summary:
createX86FloatingPointStackifierPass is disabled until we handle
TracksLiveness correctly.

Reviewers: gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48360

llvm-svn: 335117
2018-06-20 11:54:35 +00:00
Clement Courbet e4f885b5a2 [llvm-exegesis] Remove noexcept in r335105.
gcc checks for transitivity (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53903)

llvm-svn: 335109
2018-06-20 09:18:37 +00:00
Clement Courbet 2c409702a9 [llvm-exegesis] Fix missing move in r335105.
llvm-svn: 335108
2018-06-20 09:18:32 +00:00
Guillaume Chatelet ef6cef5b57 [llvm-exegesis] Use a Prototype to defer picking a value for free vars.
Summary: Introducing a Prototype object to capture Variables that must be set but keeps degrees of freedom as Invalid. This allows exploring non constraint variables later on.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48316

llvm-svn: 335105
2018-06-20 08:52:30 +00:00
Roman Lebedev 3de9664494 llvm-exegesis: mark ~ExegesisTarget() as virtual. Fixes build.
/build/llvm/tools/llvm-exegesis/lib/X86/../Target.h:32:3: error: 'exegesis::ExegesisTarget' has virtual functions but non-virtual destructor [-Werror,-Wnon-virtual-dtor]
  ~ExegesisTarget();
  ^
/build/llvm/tools/llvm-exegesis/lib/X86/Target.cpp:15:7: error: 'exegesis::(anonymous namespace)::ExegesisX86Target' has virtual functions but non-virtual destructor [-Werror,-Wnon-virtual-dtor]
class ExegesisX86Target : public ExegesisTarget {
      ^

llvm-svn: 335042
2018-06-19 11:58:10 +00:00
Clement Courbet 44b4c54e26 Re-land r335038 "[llvm-exegesis] A mechanism to add target-specific functionality.""
Fix typo: LLVM_NATIVE_ARCH -> LLVM_EXEGESIS_NATIVE_ARCH.

llvm-svn: 335041
2018-06-19 11:28:59 +00:00
Clement Courbet 46751785ee Revert r335038 "[llvm-exegesis] A mechanism to add target-specific functionality."
Breaks buildbots.

llvm-svn: 335040
2018-06-19 10:54:12 +00:00
Clement Courbet 6780b5f97d [llvm-exegesis] A mechanism to add target-specific functionality.
Summary: This is a step towards implementing memory operands and X87.

Reviewers: gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48210

llvm-svn: 335038
2018-06-19 10:39:50 +00:00
Clement Courbet 205276bf37 [llvm-exegesis][NFC] Remove dead variable.
llvm-svn: 334813
2018-06-15 09:46:57 +00:00
Clement Courbet f64007fe82 [llvm-exegesis][NFC] Add more comments.
llvm-svn: 334811
2018-06-15 09:27:12 +00:00
Clement Courbet 4273e1e828 [llvm-exegesis] Print the whole snippet in analysis.
Summary:
On hover, the whole asm snippet is displayed, including operands.

This requires the actual assembly output instead of just the MCInsts:
This is because some pseudo-instructions get lowered to actual target
instructions during codegen (e.g. ABS_Fp32 -> SSE or X87).

Reviewers: gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48164

llvm-svn: 334805
2018-06-15 07:30:45 +00:00
Clement Courbet 49fad1cbf2 [llvm-exegesis] Use BenchmarkResult::Instructions instead of OpcodeName
Summary:
Get rid of OpcodeName.

To remove the opcode name from an old file:
```
cat old_file | sed '/opcode_name.*/d'
```

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48121

llvm-svn: 334691
2018-06-14 06:57:52 +00:00
Guillaume Chatelet b391f24303 [llvm-exegesis] Fix buildbot - power was using native target for X86.
Reviewers: courbet

Reviewed By: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48125

llvm-svn: 334601
2018-06-13 14:07:36 +00:00
Guillaume Chatelet 60e3d582f6 [llvm-exegesis] Fix failing assert when creating Snippet for LAHF.
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48123

llvm-svn: 334599
2018-06-13 13:53:56 +00:00
Guillaume Chatelet c9f727bb85 [llvm-exegesis] Cleaner design without mutable data.
Summary: Previous design was relying on the 'mutate' keyword and was quite confusing. This version separate mutable from immutable data and makes it clearer what changes and what doesn't.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48020

llvm-svn: 334596
2018-06-13 13:24:41 +00:00
Clement Courbet 3827537abc [llvm-exegesis] Sum counter values when several counters are specified for a ProcRes.
Summary: This allows handling memory ports on SNB.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48076

llvm-svn: 334502
2018-06-12 13:28:37 +00:00
Guillaume Chatelet 0782881161 [llvm-exegesis] Move libpfm linking to LLVMExegesis.
Summary: This patch moves linking of libpfm from different places to a single one.

Reviewers: courbet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48075

llvm-svn: 334499
2018-06-12 13:07:16 +00:00
Zachary Turner 1f67a3cba9 [FileSystem] Split up the OpenFlags enumeration.
This breaks the OpenFlags enumeration into two separate
enumerations: OpenFlags and CreationDisposition.  The first
controls the behavior of the API depending on whether or not
the target file already exists, and is not a flags-based
enum.  The second controls more flags-like values.

This yields a more easy to understand API, while also allowing
flags to be passed to the openForRead api, where most of the
values didn't make sense before.  This also makes the apis more
testable as it becomes easy to enumerate all the configurations
which make sense, so I've added many new tests to exercise all
the different values.

llvm-svn: 334221
2018-06-07 19:58:58 +00:00
Guillaume Chatelet b4f1582ac5 [llvm-exegesis] Make BenchmarkRunner handle multiple configurations.
Summary: BenchmarkRunner subclasses can now create many configurations - although this patch still generates one.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D47877

llvm-svn: 334197
2018-06-07 14:00:29 +00:00
Guillaume Chatelet 7b852cd814 [llvm-exegesis] Add a Configuration object for Benchmark.
Summary: This is the first step to have the BenchmarkRunner create and measure many different configurations (different initial values for instance).

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D47826

llvm-svn: 334169
2018-06-07 08:11:54 +00:00
Guillaume Chatelet 8c91d4cb04 [llvm-exegesis] Improve error reporting.
Summary: BenchmarkResult IO functions now return an Error or Expected so caller can deal take proper action.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D47868

llvm-svn: 334167
2018-06-07 07:51:16 +00:00
Guillaume Chatelet 083a0c1621 [llvm-exegesis] Serializes instruction's operand in BenchmarkResult's key.
Summary: Follow up patch to https://reviews.llvm.org/D47764.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D47785

llvm-svn: 334165
2018-06-07 07:40:40 +00:00
Clement Courbet 62b34fa89a [llvm-exegesis] move Mode from Key to BenchmarResult.
Moves the Mode field out of the Key. The existing yaml benchmark results can be fixed with the following script:

```
readonly FILE=$1
readonly MODE=latency # Change to uops to fix a uops benchmark.
cat $FILE | \
  sed "/^\ \+mode:\ \+$MODE$/d" | \
  sed "/^cpu_name.*$/i mode:            $MODE"
```

Differential Revision: https://reviews.llvm.org/D47813

Authored by: Guillaume Chatelet

llvm-svn: 334079
2018-06-06 09:42:36 +00:00
Clement Courbet 53d35d2dc4 [llvm-exegesis] Add instructions to BenchmarkResult Key.
We want llvm-exegesis to explore instructions (effect of initial register values, effect of operand selection). To enable this a BenchmarkResult muststore all the relevant data in its key. This patch starts adding such data. Here we simply allow to store the generated instructions, following patches will add operands and initial values for registers.

https://reviews.llvm.org/D47764

Authored by: Guilluame Chatelet

llvm-svn: 334008
2018-06-05 10:56:19 +00:00
Clement Courbet 2cb97b95a2 [llvm-exegesis][NFC] Use an enum instead of a string for benchmark mode.
Summary: YAML encoding is backwards-compatible.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D47705

llvm-svn: 333886
2018-06-04 11:43:40 +00:00
Clement Courbet 7228721b30 [llvm-exegesis] Analysis: Show inconsistencies between checked-in and measured data.
Summary:
We now highlight any sched classes whose measurements do not match the
LLVM SchedModel. "bad" clusters are marked in red.

Screenshot in phabricator diff.

Reviewers: gchatelet

Subscribers: tschuett, mgrang, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D47639

llvm-svn: 333884
2018-06-04 11:11:55 +00:00
Clement Courbet df79e79e22 [llvm-exegesis] Analysis: Display idealized sched class port pressure.
Summary: Screenshot in phabricator diff.

Reviewers: gchatelet

Subscribers: mgorny, tschuett, mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D47329

llvm-svn: 333753
2018-06-01 14:18:02 +00:00
Roman Lebedev 71d4afb90a [llvm-exegesis][NFCI] Counter::Counter(): more useful msg on event open error
Summary:
I'm slowly looking into a new X86 scheduler model,
for AMD Bulldozer CPU, model 2 (bdver2, Piledriver).

And naturally, i have hit that assert :)
I happened to know what it meant, and how to fix it,
but that is not too common knowledge.

Reviewers: courbet, RKSimon

Reviewed By: courbet

Subscribers: tschuett, llvm-commits, craig.topper

Differential Revision: https://reviews.llvm.org/D47572

llvm-svn: 333632
2018-05-31 07:08:26 +00:00