===================================================================
How To Add Your Build Configuration To LLVM Buildbot Infrastructure
===================================================================

Introduction
============

This document explains how to add a build configuration and a
buildbot-worker to the LLVM Buildbot Infrastructure.

Buildmasters
============

There are two buildmasters running.

* The main buildmaster at `<https://lab.llvm.org/buildbot>`_. All builders
  attached to this machine will notify commit authors every time they break
  the build.
* The staging buildmaster at `<https://lab.llvm.org/staging>`_. All builders
  attached to this machine will be completely silent by default when the build
  is broken.

In order to remain connected to the main buildmaster (and thus notify
developers of failures), a buildbot must:

* Be building a supported configuration. Builders for experimental backends
  should generally be attached to the staging buildmaster.
* Be able to keep up with new commits to the main branch, or at a minimum
  recover to tip of tree within a couple of days of falling behind.

Additionally, we encourage all bot owners to point their bots towards the
staging buildmaster during maintenance windows, while troubleshooting
instability, and in similar situations.

Roles & Expectations
====================

Each buildbot has an owner who is the responsible party for addressing problems
which arise with said buildbot. We generally expect the bot owner to be
reasonably responsive.

For some bots, the ownership responsibility is split between a "resource owner"
who provides the underlying machine resource, and a "configuration owner" who
maintains the build configuration. Generally, operational responsibility lies
with the "config owner". We do expect "resource owners" - who are generally
the contact listed in a worker's attributes - to proxy requests to the relevant
"config owner" in a timely manner.

Most issues with a buildbot should be addressed directly with a bot owner
via email. Please CC `Galina Kistanova <mailto:gkistanova@gmail.com>`_.

Steps To Add Builder To LLVM Buildbot
=====================================

Volunteers can provide their build machines to work as build workers for the
public LLVM Buildbot.

Here are the steps you can follow to do so:

#. Check the existing build configurations to make sure the one you are
   interested in is not covered yet or gets built on your computer much
   faster than on the existing one. We prefer faster builds so developers
   will get feedback sooner after changes get committed.

#. The computer you will be registering with the LLVM buildbot
   infrastructure should have all dependencies installed, and you should be
   able to build your configuration on it successfully. Please check what
   degree of parallelism (-j param) would give the fastest build. You can
   build multiple configurations on one computer.

#. Install buildbot-worker (currently we are using buildbot version 2.8.5).
   Depending on the platform, buildbot-worker could be available to download and
   install with your package manager, or you can download it directly from
   `<http://trac.buildbot.net>`_ and install it manually.

#. Create a designated user account for your buildbot-worker to run under,
   and set appropriate permissions.

#. Choose the buildbot-worker root directory (all builds will be placed under
   it), and the buildbot-worker access name and password the build master
   will use to authenticate your buildbot-worker.
#. Create a buildbot-worker in context of that buildbot-worker account. Point it
|
|
to the **lab.llvm.org** port **9994** (see `Buildbot documentation,
|
|
Creating a worker
|
|
<http://docs.buildbot.net/current/tutorial/firstrun.html#creating-a-worker>`_
|
|
for more details) by running the following command:
|
|
|
|
.. code-block:: bash
|
|
|
|
$ buildbot-worker create-worker <buildbot-worker-root-directory> \
|
|
lab.llvm.org:9994 \
|
|
<buildbot-worker-access-name> \
|
|
<buildbot-worker-access-password>
|
|
|
|
This will cause your new worker to connect to the staging buildmaster
|
|
which is silent by default. Only once a new worker is stable, and
|
|
approval from Galina has been received (see last step) should it
|
|
be pointed at the main buildmaster.
|
|
|
|
#. Fill the buildbot-worker description and admin name/e-mail. Here is an
|
|
example of the buildbot-worker description::
|
|
|
|
Windows 7 x64
|
|
Core i7 (2.66GHz), 16GB of RAM
|
|
|
|
g++.exe (TDM-1 mingw32) 4.4.0
|
|
GNU Binutils 2.19.1
|
|
cmake version 2.8.4
|
|
Microsoft(R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86
|
|
|
|
#. Make sure you can actually start the buildbot-worker successfully. Then set
|
|
up your buildbot-worker to start automatically at the start up time. See the
|
|
buildbot documentation for help. You may want to restart your computer
|
|
to see if it works.
|
|
|
|
#. Send a patch which adds your build worker and your builder to
|
|
`zorg <https://github.com/llvm/llvm-zorg>`_. Use the typical LLVM
|
|
`workflow <https://llvm.org/docs/Contributing.html#how-to-submit-a-patch>`_.
|
|
|
|
* workers are added to ``buildbot/osuosl/master/config/workers.py``
|
|
* builders are added to ``buildbot/osuosl/master/config/builders.py``
|
|
|
|
Please make sure your builder name and its builddir are unique through the
|
|
file.
|
|
|
|
All new builders should default to using the "'collapseRequests': False"
|
|
configuration. This causes the builder to build each commit individually
|
|
and not merge build requests. To maximize quality of feedback to developers,
|
|
we *strongly prefer* builders to be configured not to collapse requests.
|
|
This flag should be removed only after all reasonable efforts have been
|
|
exhausted to improve build times such that the builder can keep up with
|
|
commit flow.
|
|
|
|
It is possible to allow email addresses to unconditionally receive
|
|
notifications on build failure; for this you'll need to add an
|
|
``InformativeMailNotifier`` to ``buildbot/osuosl/master/config/status.py``.
|
|
This is particularly useful for the staging buildmaster which is silent
|
|
otherwise.

#. Send the buildbot-worker access name and the access password directly to
   `Galina Kistanova <mailto:gkistanova@gmail.com>`_, and wait until she
   lets you know that your changes have been applied and the buildmaster has
   been reconfigured.

#. Check the status of your buildbot-worker on the `Waterfall Display (Staging)
   <http://lab.llvm.org/staging/#/waterfall>`_ to make sure it is
   connected, and the `Workers Display (Staging)
   <http://lab.llvm.org/staging/#/workers>`_ to see if administrator
   contact and worker information are correct.

#. At this point, you have a working builder connected to the staging
   buildmaster. You can now make sure it is reliably green and keeps
   up with the build queue. No notifications will be sent, so you can
   keep an unstable builder connected to staging indefinitely.

#. (Optional) Once the builder is stable on the staging buildmaster with
   several days of green history, you can choose to move it to the production
   buildmaster to enable developer notifications. Please email `Galina
   Kistanova <mailto:gkistanova@gmail.com>`_ for review and approval.

   To move a worker to production (once approved), stop your worker, edit the
   buildbot.tac file to change the port number from 9994 to 9990 and start it
   again.
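
The port change in the last step can be scripted. The following is a minimal
sketch, not an official tool: ``switch_worker_port`` is a hypothetical helper,
and it assumes that ``buildbot.tac`` contains a line of the exact form
``port = 9994``, so check your file before relying on it.

.. code-block:: bash

   # Hypothetical helper: rewrite the buildmaster port in a buildbot.tac file.
   # $1 = path to buildbot.tac, $2 = current port, $3 = new port.
   # Assumption: the file has a line of the exact form "port = <number>".
   switch_worker_port() {
     # The original file is kept as buildbot.tac.bak so the change is easy to revert.
     sed -i.bak "s/^port = $2\$/port = $3/" "$1"
   }

   # Usage, with the worker stopped before the edit and started afterwards:
   #   buildbot-worker stop <buildbot-worker-root-directory>
   #   switch_worker_port <buildbot-worker-root-directory>/buildbot.tac 9994 9990
   #   buildbot-worker start <buildbot-worker-root-directory>

Moving back to staging, for example during a maintenance window, is the same
call with the two port numbers swapped.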

Best Practices for Configuring a Fast Builder
=============================================

As mentioned above, we generally have a strong preference for
builders which can build every commit as it comes in. This section
includes best practices and some recommendations as to how to achieve
that end.

The goal
  In 2020, the monorepo had just under 35 thousand commits. This works
  out to an average of 4 commits per hour. Already, we can see that a
  builder must cycle in less than 15 minutes to have a hope of being
  useful. However, those commits are not uniformly distributed. They
  tend to cluster strongly during US working hours. Looking at a couple
  of recent (Nov 2021) working days, we routinely see ~10 commits per
  hour during peak times, with occasional spikes as high as ~15 commits
  per hour. Thus, as a rule of thumb, we should plan for our builder to
  complete ~10-15 builds an hour.
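
  The averages quoted above can be reproduced with a few lines of shell. This
  is only a sketch: it rounds the 2020 total to 35000 commits, and the function
  names are illustrative.

  .. code-block:: bash

     # Average commits per hour, given the number of commits in one year.
     commits_per_hour() {
       awk -v commits="$1" 'BEGIN { printf "%.1f\n", commits / (365 * 24) }'
     }

     # Average number of minutes between two consecutive commits.
     minutes_between_commits() {
       awk -v commits="$1" 'BEGIN { printf "%.1f\n", (365 * 24 * 60) / commits }'
     }

     commits_per_hour 35000          # prints 4.0
     minutes_between_commits 35000   # prints 15.0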

Resource Appropriately
  At 10-15 builds per hour, we need to complete a new build on average every
  4 to 6 minutes. For anything except the fastest of hardware/build configs,
  this is going to be well beyond the ability of a single machine. In buildbot
  terms, we are likely going to need multiple workers to build requests in
  parallel under a single builder configuration. For some rough back of the
  envelope numbers, if your build config takes e.g. 30 minutes, you will need
  something on the order of 5-8 workers. If your build config takes ~2 hours,
  you'll need something on the order of 20-30 workers. The rest of this
  section focuses on how to reduce cycle times.
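
  The worker estimates above follow from dividing the build time by the
  interval between builds and rounding up. A rough sketch of that arithmetic
  (``workers_needed`` is an illustrative name, and the model ignores queueing
  effects and variance in build times):

  .. code-block:: bash

     # $1 = build time in minutes, $2 = builds per hour the builder must sustain.
     # Prints ceil($1 * $2 / 60) using integer arithmetic.
     workers_needed() {
       echo $(( ($1 * $2 + 59) / 60 ))
     }

     workers_needed 30 10    # prints 5
     workers_needed 30 15    # prints 8
     workers_needed 120 10   # prints 20
     workers_needed 120 15   # prints 30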

Restrict what you build and test
  Think hard about why you're setting up a bot, and restrict your build
  configuration as much as you can. Basic functionality is probably
  already covered by other bots, and you don't need to duplicate that
  testing. You only need to be building and testing the *unique* parts
  of the configuration. For example, for a multi-stage clang builder, you
  probably don't need to be enabling every target or building all the
  various utilities.

  It can sometimes be worthwhile splitting a single builder into two or more,
  if you have multiple distinct purposes for the same builder. As an example,
  if you want to both a) confirm that all of LLVM builds with your host
  compiler, and b) do a multi-stage clang build on your target, you
  may be better off with two separate bots. Splitting increases resource
  consumption, but makes it easy for each bot to keep up with commit flow.
  Additionally, splitting bots may assist in triage by narrowing attention to
  relevant parts of the failing configuration.

  In general, we recommend Release build types with Assertions enabled. This
  generally provides a good balance between build times and bug detection for
  most buildbots. There may be room for including some debug info (e.g. with
  ``-gmlt``), but in general the balance between debug info quality and build
  times is a delicate one.
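
  As an illustration, a restricted Release build with Assertions could be
  configured along these lines. The wrapper name is hypothetical, and the
  project and target selections (``clang`` and ``X86``) are placeholders to
  replace with whatever is unique to your configuration.

  .. code-block:: bash

     # Hypothetical wrapper: configure a Release+Assertions build limited to
     # one project and one target. Extra arguments, including the path to the
     # llvm source directory, are passed through to cmake unchanged.
     configure_restricted_builder() {
       cmake \
         -DCMAKE_BUILD_TYPE=Release \
         -DLLVM_ENABLE_ASSERTIONS=ON \
         -DLLVM_ENABLE_PROJECTS=clang \
         -DLLVM_TARGETS_TO_BUILD=X86 \
         "$@"
     }

     # Usage, from an empty build directory:
     #   configure_restricted_builder <path-to-llvm-project>/llvm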

Use Ninja & LLD
  Ninja really does help build times over Make, particularly for highly
  parallel builds. LLD helps to reduce both link times and memory usage
  during linking significantly. With a build machine with sufficient
  parallelism, link times tend to dominate the critical path of the build,
  and are thus worth optimizing.
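
  A sketch of selecting both tools at configure time, assuming ``ninja`` and
  ``lld`` are already installed on the worker and that the host compiler can
  drive ``lld``. The wrapper name is made up for this example.

  .. code-block:: bash

     # Hypothetical wrapper: generate Ninja build files and link with LLD.
     # Extra arguments, including the path to the llvm source directory, are
     # passed through to cmake unchanged.
     configure_ninja_lld() {
       cmake -G Ninja -DLLVM_USE_LINKER=lld "$@"
     }

     # Usage, from an empty build directory:
     #   configure_ninja_lld <path-to-llvm-project>/llvm
     #   ninja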

Use CCache and NOT incremental builds
  Using ccache materially improves average build times. Incremental builds
  can be slightly faster, but introduce the risk of build corruption due to
  e.g. state changes. At this point, the recommendation is not to
  use incremental builds and instead use ccache as the latter captures the
  majority of the benefit with less risk of false positives.

  One of the non-obvious benefits of using ccache is that it makes the
  builder less sensitive to which projects are being monitored vs built.
  If a change triggers a build request, but doesn't change the build output
  (e.g. doc changes, python utility changes, etc.), the build will entirely
  hit in cache and the build request will complete in just the testing time.

  With multiple workers, it is tempting to try to configure a shared cache
  between the workers. Experience to date indicates this is difficult to
  do well, and that having local per-worker caches gets most of the benefit
  anyway. We don't currently recommend shared caches.

  CCache does depend on the builder hardware having sufficient IO to access
  the cache with reasonable access times - i.e. a fast disk, or enough memory
  for a RAM cache, etc. For builders without, incremental builds may be your
  best option, but are likely to require higher ongoing involvement from the
  sponsor.
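
  One way to combine a clean build directory with a warm cache is sketched
  below. The helper name is hypothetical, and the sketch assumes that ccache
  keeps its cache outside the build directory, so wiping the build directory
  discards stale build state without discarding cached objects.

  .. code-block:: bash

     # Hypothetical helper: run a from-scratch build that still benefits from
     # the local ccache.
     # $1 = build directory (wiped on every call)
     # $2 = absolute path to the llvm source directory
     clean_build_with_ccache() {
       build_dir="${1:?build directory required}"
       src_dir="${2:?source directory required}"
       rm -rf "$build_dir"      # no incremental state survives between builds
       mkdir -p "$build_dir"
       (
         cd "$build_dir" &&
         cmake -G Ninja -DLLVM_CCACHE_BUILD=ON "$src_dir" &&
         ninja
       )
     }

     # Usage:
     #   clean_build_with_ccache <build-directory> <path-to-llvm-project>/llvm
     #   ccache --show-stats    # afterwards, inspect the cache hit rate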

Enable batch builds
  As a last resort, you can configure your builder to batch build requests.
  This makes the build failure notifications markedly less actionable, and
  should only be done once all other reasonable measures have been taken.

Leave it on the staging buildmaster
  While most of this section has been biased towards builders intended for
  the main buildmaster, it is worth highlighting that builders can run
  indefinitely on the staging buildmaster. Such a builder may still be
  useful for the sponsoring organization, without concern of negatively
  impacting the broader community. The sponsoring organization simply
  has to take on the responsibility of all bisection and triage.