+++ /dev/null
-# Mesa testing
-
-The goal of the "test" stage of the .gitlab-ci.yml is to do pre-merge
-testing of Mesa drivers on various platforms, so that we can ensure no
-regressions are merged, as long as developers are merging code using
-marge-bot.
-
-There are currently 4 automated testing systems deployed for Mesa.
-LAVA and gitlab-runner on the DUTs are used in pre-merge testing and
-are described in this document. Managing bare metal using
-gitlab-runner is described under [bare-metal/README.md]. Intel also
-has a jenkins-based CI system with restricted access that isn't
-connected to gitlab.
-
-## Mesa testing using LAVA
-
-[LAVA](https://lavasoftware.org/) is a system for functional testing
-of boards including deploying custom bootloaders and kernels. This is
-particularly relevant to testing Mesa because we often need to change
-kernels for UAPI changes (and this lets us do full testing of a new
-kernel during development), and our workloads can easily take down
-boards when mistakes are made (kernel oopses, OOMs that take out
-critical system services).
-
-### Mesa-LAVA software architecture
-
-The gitlab-runner will run on some host that has access to the LAVA
-lab, with tags like "lava-mesa-boardname" to control only taking in
-jobs for the hardware that the LAVA lab contains. The gitlab-runner
-spawns a docker container with lava-cli in it, and connects to the
-LAVA lab using a predefined token to submit jobs under a specific
-device type.
-
-The LAVA instance manages scheduling those jobs to the boards present.
-For a job, it will deploy the kernel, device tree, and the ramdisk
-containing the CTS.
-
-### Deploying a new Mesa-LAVA lab
-
-You'll want to start with setting up your LAVA instance and getting
-some boards booting using test jobs. Start with the stock QEMU
-examples to make sure your instance works at all. Then, you'll need
-to define your actual boards.
-
-The device type in lava-gitlab-ci.yml is the device type you create in
-your LAVA instance, which doesn't have to match the board's name in
-`/etc/lava-dispatcher/device-types`. You create your boards under
-that device type and the Mesa jobs will be scheduled to any of them.
-Instantiate your boards by creating them in the UI or at the command
-line attached to that device type, then populate their dictionary
-(using an "extends" line probably referencing the board's template in
-`/etc/lava-dispatcher/device-types`). Now, go find a relevant
-healthcheck job for your board as a test job definition, or cobble
-something together from a board that boots using the same boot_method
-and some public images, and figure out how to get your boards booting.
-
-Once you can boot your board using a custom job definition, it's time
-to connect Mesa CI to it. Install gitlab-runner and register as a
-shared runner (you'll need a gitlab admin for help with this). The
-runner *must* have a tag (like "mesa-lava-db410c") to restrict the
-jobs it takes or it will grab random jobs from tasks across fd.o, and
-your runner isn't ready for that.
-
-The runner will be running an ARM docker image (we haven't done any
-x86 LAVA yet, so that isn't documented). If your host for the
-gitlab-runner is x86, then you'll need to install qemu-user-static and
-the binfmt support.
-
-The docker image will need access to the lava instance. If it's on a
-public network it should be fine. If you're running the LAVA instance
-on localhost, you'll need to set `network_mode="host"` in
-`/etc/gitlab-runner/config.toml` so it can access localhost. Create a
-gitlab-runner user in your LAVA instance, log in under that user on
-the web interface, and create an API token. Copy that into a
-`lavacli.yaml`:
-
-```
-default:
- token: <token contents>
- uri: <url to the instance>
- username: gitlab-runner
-```
-
-Add a volume mount of that `lavacli.yaml` to
-`/etc/gitlab-runner/config.toml` so that the docker container can
-access it. You probably have a `volumes = ["/cache"]` already, so now it would be
-
-```
- volumes = ["/home/anholt/lava-config/lavacli.yaml:/root/.config/lavacli.yaml", "/cache"]
-```
-
-Note that this token is visible to anybody that can submit MRs to
-Mesa! It is not an actual secret. We could just bake it into the
-gitlab CI yml, but this way the current method of connecting to the
-LAVA instance is separated from the Mesa branches (particularly
-relevant as we have many stable branches all using CI).
-
-Now it's time to define your test runner in
-`.gitlab-ci/lava-gitlab-ci.yml`.
-
-## Mesa testing using gitlab-runner on DUTs
-
-### Software architecture
-
-For freedreno and llvmpipe CI, we're using gitlab-runner on the test
-devices (DUTs), cached docker containers with VK-GL-CTS, and the
-normal shared x86_64 runners to build the Mesa drivers to be run
-inside of those containers on the DUTs.
-
-The docker containers are rebuilt from the debian-install.sh script
-when DEBIAN\_TAG is changed in .gitlab-ci.yml, and
-debian-test-install.sh when DEBIAN\_ARM64\_TAG is changed in
-.gitlab-ci.yml. The resulting images are around 500MB, and are
-expected to change approximately weekly (though an individual
-developer working on them may produce many more images while trying to
-come up with a working MR!).
-
-gitlab-runner is a client that polls gitlab.freedesktop.org for
-available jobs, with no inbound networking requirements. Jobs can
-have tags, so we can have DUT-specific jobs that only run on runners
-with that tag marked in the gitlab UI.
-
-Since dEQP takes a long time to run, we mark the job as "parallel" at
-some level, which spawns multiple jobs from one definition, and then
-deqp-runner.sh takes the corresponding fraction of the test list for
-that job.
-
-To reduce dEQP runtime (or avoid tests with unreliable results), a
-deqp-runner.sh invocation can provide a list of tests to skip. If
-your driver is not yet conformant, you can pass a list of expected
-failures, and the job will only fail on tests that aren't listed (look
-at the job's log for which specific tests failed).
-
-### DUT requirements
-
-#### DUTs must have a stable kernel and GPU reset.
-
-If the system goes down during a test run, that job will eventually
-time out and fail (default 1 hour). However, if the kernel can't
-reliably reset the GPU on failure, bugs in one MR may leak into
-spurious failures in another MR. This would be an unacceptable impact
-on Mesa developers working on other drivers.
-
-#### DUTs must be able to run docker
-
-The Mesa gitlab-runner based test architecture is built around docker,
-so that we can cache the debian package installation and CTS build
-step across multiple test runs. Since the images are large and change
-approximately weekly, the DUTs also need to be running some script to
-prune stale docker images periodically in order to not run out of disk
-space as we rev those containers (perhaps [this
-script](https://gitlab.com/gitlab-org/gitlab-runner/issues/2980#note_169233611)).
-
-Note that docker doesn't allow containers to be stored on NFS, and
-doesn't allow multiple docker daemons to interact with the same
-network block device, so you will probably need some sort of physical
-storage on your DUTs.
-
-#### DUTs must be public
-
-By including your device in .gitlab-ci.yml, you're effectively letting
-anyone on the internet run code on your device. docker containers may
-provide some limited protection, but how much you trust that and what
-you do to mitigate hostile access is up to you.
-
-#### DUTs must expose the dri device nodes to the containers.
-
-Obviously, to get access to the HW, we need to pass the render node
-through. This is done by adding `devices = ["/dev/dri"]` to the
-`runners.docker` section of /etc/gitlab-runner/config.toml.
-
-### HW CI farm expectations
-
-To make sure that testing of one vendor's drivers doesn't block
-unrelated work by other vendors, we require that a given driver's test
-farm produces a spurious failure no more than once a week. If every
-driver had CI and failed once a week, we would be seeing someone's
-code getting blocked on a spurious failure daily, which is an
-unacceptable cost to the project.
-
-Additionally, the test farm needs to be able to provide a short enough
-turnaround time that people can regularly use the "Merge when pipeline
-succeeds" button successfully (until we get
-[marge-bot](https://github.com/smarkets/marge-bot) in place on
-freedesktop.org). As a result, we require that the test farm be able
-to handle a whole pipeline's worth of jobs in less than 5 minutes (to
-compare, the build stage is about 10 minutes, if you could get all
-your jobs scheduled on the shared runners in time.).
-
-If a test farm is short the HW to provide these guarantees, consider
-dropping tests to reduce runtime.
-`VK-GL-CTS/scripts/log/bottleneck_report.py` can help you find what
-tests were slow in a `results.qpa` file. Or, you can have a job with
-no `parallel` field set and:
-
-```
- variables:
- CI_NODE_INDEX: 1
- CI_NODE_TOTAL: 10
-```
-
-to just run 1/10th of the test list.
-
-If a HW CI farm goes offline (network dies and all CI pipelines end up
-stalled) or its runners are consistenly spuriously failing (disk
-full?), and the maintainer is not immediately available to fix the
-issue, please push through an MR disabling that farm's jobs by adding
-'.' to the front of the jobs names until the maintainer can bring
-things back up. If this happens, the farm maintainer should provide a
-report to mesa-dev@lists.freedesktop.org after the fact explaining
-what happened and what the mitigation plan is for that failure next
-time.
+++ /dev/null
-# bare-metal Mesa testing
-
-Testing Mesa with gitlab-runner running on the devices being tested
-(DUTs) proved to be too unstable, so this set of scripts is for
-running Mesa testing on bare-metal boards connected to a separate
-system using gitlab-runner. Currently only "fastboot" and "ChromeOS
-Servo" devices are supported.
-
-In comparison with LAVA, this doesn't involve maintaining a separate
-webservice with its own job scheduler and replicating jobs between the
-two. It also places more of the board support in git, instead of
-webservice configuration. On the other hand, the serial interactions
-and bootloader support are more primitive.
-
-## Requirements (fastboot)
-
-This testing requires power control of the DUTs by the gitlab-runner
-machine, since this is what we use to reset the system and get back to
-a pristine state at the start of testing.
-
-We require access to the console output from the gitlab-runner system,
-since that is how we get the final results back from the tests. You
-should probably have the console on a serial connection, so that you
-can see bootloader progress.
-
-The boards need to be able to have a kernel/initramfs supplied by the
-gitlab-runner system, since the initramfs is what contains the Mesa
-testing payload.
-
-The boards should have networking, so that (in a future iteration of
-this code) we can extract the dEQP .xml results to artifacts on
-gitlab.
-
-## Requirements (servo)
-
-For servo-connected boards, we can use the EC connection for power
-control to reboot the board. However, loading a kernel is not as easy
-as fastboot, so we assume your bootloader can do TFTP, and that your
-gitlab-runner mounts the runner's tftp directory specific to the board
-at /tftp in the container.
-
-Since we're going the TFTP route, we also use NFS root. This avoids
-packing the rootfs and sending it to the board as a ramdisk, which
-means we can support larger rootfses (for piglit or tracie testing),
-at the cost of needing more storage on the runner.
-
-Telling the board about where its TFTP and NFS should come from is
-done using dnsmasq on the runner host. For example, this snippet in
-the dnsmasq.conf.d in the google farm, with the gitlab-runner host we
-call "servo".
-
-```
-dhcp-host=1c:69:7a:0d:a3:d3,10.42.0.10,set:servo
-
-# Fixed dhcp addresses for my sanity, and setting a tag for
-# specializing other DHCP options
-dhcp-host=a0:ce:c8:c8:d9:5d,10.42.0.11,set:cheza1
-dhcp-host=a0:ce:c8:c8:d8:81,10.42.0.12,set:cheza2
-
-# Specify the next server, watch out for the double ',,'. The
-# filename didn't seem to get picked up by the bootloader, so we use
-# tftp-unique-root and mount directories like
-# /srv/tftp/10.42.0.11/jwerner/cheza as /tftp in the job containers.
-tftp-unique-root
-dhcp-boot=tag:cheza1,cheza1/vmlinuz,,10.42.0.10
-dhcp-boot=tag:cheza2,cheza2/vmlinuz,,10.42.0.10
-
-dhcp-option=tag:cheza1,option:root-path,/srv/nfs/cheza1
-dhcp-option=tag:cheza2,option:root-path,/srv/nfs/cheza2
-```
-
-## Setup
-
-Each board will be registered in fd.o gitlab. You'll want something
-like this to register a fastboot board:
-
-```
-sudo gitlab-runner register \
- --url https://gitlab.freedesktop.org \
- --registration-token $1 \
- --name MY_BOARD_NAME \
- --tag-list MY_BOARD_TAG \
- --executor docker \
- --docker-image "alpine:latest" \
- --docker-volumes "/dev:/dev" \
- --docker-network-mode "host" \
- --docker-privileged \
- --non-interactive
-```
-
-For a servo board, you'll need to also volume mount the board's NFS
-root dir at /nfs and TFTP kernel directory at /tftp.
-
-The registration token has to come from a fd.o gitlab admin going to
-https://gitlab.freedesktop.org/admin/runners
-
-The name scheme for Google's lab is google-freedreno-boardname-n, and
-our tag is something like google-freedreno-db410c. The tag is what
-identifies a board type so that board-specific jobs can be dispatched
-into that pool.
-
-We need privileged mode and the /dev bind mount in order to get at the
-serial console and fastboot USB devices (--device arguments don't
-apply to devices that show up after container start, which is the case
-with fastboot, and the servo serial devices are acctually links to
-/dev/pts). We use host network mode so that we can (in the future)
-spin up a server to collect XML results for fastboot.
-
-Once you've added your boards, you're going to need to add a little
-more customization in `/etc/gitlab-runner/config.toml`. First, add
-`concurrent = <number of boards>` at the top ("we should have up to
-this many jobs running managed by this gitlab-runner"). Then for each
-board's runner, set `limit = 1` ("only 1 job served by this board at a
-time"). Finally, add the board-specific environment variables
-required by your bare-metal script, something like:
-
-```
-[[runners]]
- name = "google-freedreno-db410c-1"
- environment = ["BM_SERIAL=/dev/ttyDB410c8", "BM_POWERUP=google-power-up.sh 8", "BM_FASTBOOT_SERIAL=15e9e390"]
-```
-
-Once you've updated your runners' configs, restart with `sudo service
-gitlab-runner restart`
--- /dev/null
+LAVA CI
+=======
+
+`LAVA <https://lavasoftware.org/>`_ is a system for functional testing
+of boards including deploying custom bootloaders and kernels. This is
+particularly relevant to testing Mesa because we often need to change
+kernels for UAPI changes (and this lets us do full testing of a new
+kernel during development), and our workloads can easily take down
+boards when mistakes are made (kernel oopses, OOMs that take out
+critical system services).
+
+Mesa-LAVA software architecture
+-------------------------------
+
+The gitlab-runner will run on some host that has access to the LAVA
+lab, with tags like "lava-mesa-boardname" to control only taking in
+jobs for the hardware that the LAVA lab contains. The gitlab-runner
+spawns a docker container with lava-cli in it, and connects to the
+LAVA lab using a predefined token to submit jobs under a specific
+device type.
+
+The LAVA instance manages scheduling those jobs to the boards present.
+For a job, it will deploy the kernel, device tree, and the ramdisk
+containing the CTS.
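+
+Those jobs are described by plain LAVA YAML definitions. As a rough
+sketch of their shape (the device type, URLs, timeouts, and test
+repository below are placeholders, not Mesa's actual definitions):
+
+.. code-block:: yaml
+
+   device_type: db410c        # the device type you defined in LAVA
+   job_name: mesa-deqp-sanity
+   visibility: public
+   timeouts:
+     job:
+       minutes: 60
+     action:
+       minutes: 10
+
+   actions:
+   - deploy:
+       to: tftp
+       kernel:
+         url: https://example.com/artifacts/Image
+       dtb:
+         url: https://example.com/artifacts/board.dtb
+       ramdisk:
+         url: https://example.com/artifacts/rootfs-with-cts.cpio.gz
+         compression: gz
+   - boot:
+       method: u-boot         # whichever boot_method your board uses
+       commands: ramdisk
+       prompts:
+       - 'root@'
+   - test:
+       definitions:
+       - repository: https://example.com/mesa-lava-tests.git
+         from: git
+         path: deqp.yaml
+         name: deqp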
+
+Deploying a new Mesa-LAVA lab
+-----------------------------
+
+You'll want to start with setting up your LAVA instance and getting
+some boards booting using test jobs. Start with the stock QEMU
+examples to make sure your instance works at all. Then, you'll need
+to define your actual boards.
+
+The device type in lava-gitlab-ci.yml is the device type you create in
+your LAVA instance, which doesn't have to match the board's name in
+``/etc/lava-dispatcher/device-types``. You create your boards under
+that device type and the Mesa jobs will be scheduled to any of them.
+Instantiate your boards by creating them in the UI or at the command
+line attached to that device type, then populate their dictionary
+(using an "extends" line probably referencing the board's template in
+``/etc/lava-dispatcher/device-types``). Now, go find a relevant
+healthcheck job for your board as a test job definition, or cobble
+something together from a board that boots using the same boot_method
+and some public images, and figure out how to get your boards booting.
+
+Once you can boot your board using a custom job definition, it's time
+to connect Mesa CI to it. Install gitlab-runner and register as a
+shared runner (you'll need a gitlab admin for help with this). The
+runner *must* have a tag (like "mesa-lava-db410c") to restrict the
+jobs it takes or it will grab random jobs from tasks across fd.o, and
+your runner isn't ready for that.
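+
+The registration itself is the standard gitlab-runner procedure, for
+example (the runner name, token, and image are placeholders; the image
+just needs lavacli in it, and the tag is whatever you picked above):
+
+.. code-block:: console
+
+   sudo gitlab-runner register \
+        --url https://gitlab.freedesktop.org \
+        --registration-token <registration token> \
+        --name mesa-lava-db410c-runner \
+        --tag-list mesa-lava-db410c \
+        --executor docker \
+        --docker-image "<image with lavacli>" \
+        --non-interactive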
+
+The runner will be running an ARM docker image (we haven't done any
+x86 LAVA yet, so that isn't documented). If your host for the
+gitlab-runner is x86, then you'll need to install qemu-user-static and
+the binfmt support.
+
+The docker image will need access to the LAVA instance. If it's on a
+public network it should be fine. If you're running the LAVA instance
+on localhost, you'll need to set ``network_mode="host"`` in
+``/etc/gitlab-runner/config.toml`` so it can access localhost. Create a
+gitlab-runner user in your LAVA instance, log in under that user on
+the web interface, and create an API token. Copy that into a
+``lavacli.yaml``:
+
+.. code-block:: yaml
+
+ default:
+ token: <token contents>
+ uri: <url to the instance>
+ username: gitlab-runner
+
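+To sanity-check the token before wiring it into CI, you can exercise it
+with lavacli by hand from any host (or container) that has this file at
+``~/.config/lavacli.yaml``, e.g. listing the devices or resubmitting the
+healthcheck definition you used earlier::
+
+    lavacli devices list
+    lavacli jobs submit my-healthcheck.yaml
+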
+Add a volume mount of that ``lavacli.yaml`` to
+``/etc/gitlab-runner/config.toml`` so that the docker container can
+access it. You probably have a ``volumes = ["/cache"]`` already, so now it would be::
+
+ volumes = ["/home/anholt/lava-config/lavacli.yaml:/root/.config/lavacli.yaml", "/cache"]
+
+Note that this token is visible to anybody that can submit MRs to
+Mesa! It is not an actual secret. We could just bake it into the
+gitlab CI yml, but this way the current method of connecting to the
+LAVA instance is separated from the Mesa branches (particularly
+relevant as we have many stable branches all using CI).
+
+Now it's time to define your test runner in
+``.gitlab-ci/lava-gitlab-ci.yml``.
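+
+What that boils down to is a job that carries your runner's tag and
+tells the submission side which LAVA device type to use. Purely as an
+illustration of the shape (the job name, template, and variable names
+here are hypothetical, not Mesa's actual definitions):
+
+.. code-block:: yaml
+
+   db410c-deqp:
+     extends: .lava-test          # hypothetical shared job template
+     variables:
+       DEVICE_TYPE: db410c        # the device type you created in LAVA
+     tags:
+       - mesa-lava-db410c         # the tag your runner was registered with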
--- /dev/null
+Bare-metal CI
+=============
+
+The bare-metal scripts run on a system with gitlab-runner and docker,
+connected to potentially multiple bare-metal boards that run tests of
+Mesa. Currently only "fastboot" and "ChromeOS Servo" devices are
+supported.
+
+In comparison with LAVA, this doesn't involve maintaining a separate
+webservice with its own job scheduler and replicating jobs between the
+two. It also places more of the board support in git, instead of
+webservice configuration. On the other hand, the serial interactions
+and bootloader support are more primitive.
+
+Requirements (fastboot)
+-----------------------
+
+This testing requires power control of the DUTs by the gitlab-runner
+machine, since this is what we use to reset the system and get back to
+a pristine state at the start of testing.
+
+We require access to the console output from the gitlab-runner system,
+since that is how we get the final results back from the tests. You
+should probably have the console on a serial connection, so that you
+can see bootloader progress.
+
+The boards need to be able to have a kernel/initramfs supplied by the
+gitlab-runner system, since the initramfs is what contains the Mesa
+testing payload.
+
+The boards should have networking, so that (in a future iteration of
+this code) we can extract the dEQP .xml results to artifacts on
+gitlab.
+
+Requirements (servo)
+--------------------
+
+For servo-connected boards, we can use the EC connection for power
+control to reboot the board. However, loading a kernel is not as easy
+as fastboot, so we assume your bootloader can do TFTP, and that your
+gitlab-runner mounts the runner's tftp directory specific to the board
+at /tftp in the container.
+
+Since we're going the TFTP route, we also use NFS root. This avoids
+packing the rootfs and sending it to the board as a ramdisk, which
+means we can support larger rootfses (for piglit or tracie testing),
+at the cost of needing more storage on the runner.
+
+Telling the board where its TFTP and NFS should come from is done
+using dnsmasq on the runner host. For example, this snippet from the
+dnsmasq.conf.d in the google farm configures the gitlab-runner host we
+call "servo" and a couple of cheza boards::
+
+ dhcp-host=1c:69:7a:0d:a3:d3,10.42.0.10,set:servo
+
+ # Fixed dhcp addresses for my sanity, and setting a tag for
+ # specializing other DHCP options
+ dhcp-host=a0:ce:c8:c8:d9:5d,10.42.0.11,set:cheza1
+ dhcp-host=a0:ce:c8:c8:d8:81,10.42.0.12,set:cheza2
+
+ # Specify the next server, watch out for the double ',,'. The
+ # filename didn't seem to get picked up by the bootloader, so we use
+ # tftp-unique-root and mount directories like
+ # /srv/tftp/10.42.0.11/jwerner/cheza as /tftp in the job containers.
+ tftp-unique-root
+ dhcp-boot=tag:cheza1,cheza1/vmlinuz,,10.42.0.10
+ dhcp-boot=tag:cheza2,cheza2/vmlinuz,,10.42.0.10
+
+ dhcp-option=tag:cheza1,option:root-path,/srv/nfs/cheza1
+ dhcp-option=tag:cheza2,option:root-path,/srv/nfs/cheza2
+
+Setup
+-----
+
+Each board will be registered in fd.o gitlab. You'll want something
+like this to register a fastboot board:
+
+.. code-block:: console
+
+ sudo gitlab-runner register \
+ --url https://gitlab.freedesktop.org \
+ --registration-token $1 \
+ --name MY_BOARD_NAME \
+ --tag-list MY_BOARD_TAG \
+ --executor docker \
+ --docker-image "alpine:latest" \
+ --docker-volumes "/dev:/dev" \
+ --docker-network-mode "host" \
+ --docker-privileged \
+ --non-interactive
+
+For a servo board, you'll need to also volume mount the board's NFS
+root dir at /nfs and TFTP kernel directory at /tftp.
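+
+Concretely, that means two extra ``--docker-volumes`` arguments at
+registration time, pointing at the paths set up for that board on the
+runner host (the paths below reuse the dnsmasq example above; adjust to
+your own layout):
+
+.. code-block:: console
+
+   sudo gitlab-runner register \
+        <other arguments as above> \
+        --docker-volumes "/dev:/dev" \
+        --docker-volumes "/srv/nfs/cheza1:/nfs" \
+        --docker-volumes "/srv/tftp/10.42.0.11/jwerner/cheza:/tftp"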
+
+The registration token has to come from an fd.o gitlab admin going to
+https://gitlab.freedesktop.org/admin/runners.
+
+The name scheme for Google's lab is google-freedreno-boardname-n, and
+our tag is something like google-freedreno-db410c. The tag is what
+identifies a board type so that board-specific jobs can be dispatched
+into that pool.
+
+We need privileged mode and the /dev bind mount in order to get at the
+serial console and fastboot USB devices (``--device`` arguments don't
+apply to devices that show up after container start, which is the case
+with fastboot, and the servo serial devices are actually links to
+/dev/pts). We use host network mode so that we can (in the future)
+spin up a server to collect XML results for fastboot.
+
+Once you've added your boards, you're going to need to add a little
+more customization in ``/etc/gitlab-runner/config.toml``. First, add
+``concurrent = <number of boards>`` at the top ("we should have up to
+this many jobs running managed by this gitlab-runner"). Then for each
+board's runner, set ``limit = 1`` ("only 1 job served by this board at a
+time"). Finally, add the board-specific environment variables
+required by your bare-metal script, something like::
+
+ [[runners]]
+ name = "google-freedreno-db410c-1"
+ environment = ["BM_SERIAL=/dev/ttyDB410c8", "BM_POWERUP=google-power-up.sh 8", "BM_FASTBOOT_SERIAL=15e9e390"]
+
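+Putting those pieces together, the config for a hypothetical two-board
+farm ends up shaped like this (the url/token/executor keys that
+``gitlab-runner register`` writes are omitted, and the second board's
+values are placeholders)::
+
+    concurrent = 2
+
+    [[runners]]
+      name = "google-freedreno-db410c-1"
+      limit = 1
+      environment = ["BM_SERIAL=/dev/ttyDB410c8", "BM_POWERUP=google-power-up.sh 8", "BM_FASTBOOT_SERIAL=15e9e390"]
+
+    [[runners]]
+      name = "google-freedreno-db410c-2"
+      limit = 1
+      environment = ["BM_SERIAL=<serial tty>", "BM_POWERUP=<power-up command>", "BM_FASTBOOT_SERIAL=<fastboot serial>"]
+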
+Once you've updated your runners' configs, restart with ``sudo service
+gitlab-runner restart``.
--- /dev/null
+Docker CI
+=========
+
+For llvmpipe and swrast CI, we run the tests in a docker container
+that has VK-GL-CTS built into it, on the shared gitlab runners provided
+by `freedesktop <http://freedesktop.org>`_.
+
+Software architecture
+---------------------
+
+The docker containers are rebuilt from the debian-install.sh script
+when ``DEBIAN_TAG`` is changed in .gitlab-ci.yml, and from
+debian-test-install.sh when ``DEBIAN_ARM64_TAG`` is changed in
+.gitlab-ci.yml. The resulting images are around 500MB, and are
+expected to change approximately weekly (though an individual
+developer working on them may produce many more images while trying to
+come up with a working MR!).
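+
+A rebuild is triggered just by editing the corresponding variable in
+.gitlab-ci.yml; the value itself is only used to name the image, so a
+date-style bump is a natural choice (the values below are examples, not
+the current tags):
+
+.. code-block:: yaml
+
+   variables:
+     DEBIAN_TAG: "2019-11-22"              # bump to rebuild from debian-install.sh
+     DEBIAN_ARM64_TAG: "arm64-2019-11-22"  # bump to rebuild from debian-test-install.sh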
+
+gitlab-runner is a client that polls gitlab.freedesktop.org for
+available jobs, with no inbound networking requirements. Jobs can
+have tags, so we can have DUT-specific jobs that only run on runners
+with that tag marked in the gitlab UI.
+
+Since dEQP takes a long time to run, we mark the job as "parallel" at
+some level, which spawns multiple jobs from one definition, and then
+deqp-runner.sh takes the corresponding fraction of the test list for
+that job.
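+
+In gitlab-ci terms that is just a ``parallel:`` key on the job; each
+spawned job sees its own ``CI_NODE_INDEX``/``CI_NODE_TOTAL`` pair,
+which is what deqp-runner.sh uses to pick its slice of the list. A
+hypothetical sketch (the job name and script path are illustrative):
+
+.. code-block:: yaml
+
+   deqp-gles2:                         # hypothetical job name
+     parallel: 4                       # spawns 4 jobs from this one definition
+     script:
+       - ./.gitlab-ci/deqp-runner.sh   # each job runs a quarter of the tests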
+
+To reduce dEQP runtime (or avoid tests with unreliable results), a
+deqp-runner.sh invocation can provide a list of tests to skip. If
+your driver is not yet conformant, you can pass a list of expected
+failures, and the job will only fail on tests that aren't listed (look
+at the job's log for which specific tests failed).
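+
+The skip and expected-failure lists themselves are just text files with
+one dEQP test name per line, along the lines of (how they get passed to
+deqp-runner.sh depends on your job definition)::
+
+    dEQP-GLES2.functional.flush_finish.wait
+    dEQP-GLES2.functional.flush_finish.flush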
+
+DUT requirements
+----------------
+
+In addition to the general :ref:`CI-farm-expectations`, using
+docker requires:
+
+* DUTs must have a stable kernel and GPU reset (if applicable).
+
+If the system goes down during a test run, that job will eventually
+time out and fail (default 1 hour). However, if the kernel can't
+reliably reset the GPU on failure, bugs in one MR may leak into
+spurious failures in another MR. This would be an unacceptable impact
+on Mesa developers working on other drivers.
+
+* DUTs must be able to run docker
+
+The Mesa gitlab-runner based test architecture is built around docker,
+so that we can cache the debian package installation and CTS build
+step across multiple test runs. Since the images are large and change
+approximately weekly, the DUTs also need to be running some script to
+prune stale docker images periodically in order to not run out of disk
+space as we rev those containers (perhaps `this script
+<https://gitlab.com/gitlab-org/gitlab-runner/issues/2980#note_169233611>`_).
+
+Note that docker doesn't allow containers to be stored on NFS, and
+doesn't allow multiple docker daemons to interact with the same
+network block device, so you will probably need some sort of physical
+storage on your DUTs.
+
+* DUTs must be public
+
+By including your device in .gitlab-ci.yml, you're effectively letting
+anyone on the internet run code on your device. docker containers may
+provide some limited protection, but how much you trust that and what
+you do to mitigate hostile access is up to you.
+
+* DUTs must expose the dri device nodes to the containers.
+
+Obviously, to get access to the HW, we need to pass the render node
+through. This is done by adding ``devices = ["/dev/dri"]`` to the
+``runners.docker`` section of ``/etc/gitlab-runner/config.toml``.
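+
+That section already exists in the config ``gitlab-runner register``
+writes; with the device pass-through added, the relevant part looks
+like (other keys omitted)::
+
+    [runners.docker]
+      devices = ["/dev/dri"]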
Continuous Integration
======================
-
GitLab CI
---------
- Sanity checks (``meson test`` & ``scons check``)
- Some drivers (softpipe, llvmpipe, freedreno and panfrost) are also tested
using `VK-GL-CTS <https://github.com/KhronosGroup/VK-GL-CTS>`__
+- Replay of application traces
A typical run takes between 20 and 30 minutes, although it can go up very quickly
if the GitLab runners are overwhelmed, which happens sometimes. When it does happen,
`Eric Anholt <https://gitlab.freedesktop.org/anholt>`__ (``anholt`` on
IRC).
+The three GitLab CI systems currently integrated are:
+
+.. toctree::
+ :maxdepth: 1
+
+ bare-metal
+ LAVA
+ docker
Intel CI
--------
<https://gitlab.freedesktop.org/craftyguy>`__ (``craftyguy`` on IRC) or
`Nico Cortes <https://gitlab.freedesktop.org/ngcortes>`__ (``ngcortes``
on IRC).
+
+.. _CI-farm-expectations:
+
+CI farm expectations
+--------------------
+
+To make sure that testing of one vendor's drivers doesn't block
+unrelated work by other vendors, we require that a given driver's test
+farm produces a spurious failure no more than once a week. If every
+driver had CI and failed once a week, we would be seeing someone's
+code getting blocked on a spurious failure daily, which is an
+unacceptable cost to the project.
+
+Additionally, the test farm needs to be able to provide a short enough
+turnaround time that we can get our MRs through marge-bot without the
+pipeline backing up. As a result, we require that the test farm be
+able to handle a whole pipeline's worth of jobs in less than 5 minutes
+(to compare, the build stage is about 10 minutes, if you could get all
+your jobs scheduled on the shared runners in time).
+
+If a test farm is short the HW to provide these guarantees, consider
+dropping tests to reduce runtime.
+``VK-GL-CTS/scripts/log/bottleneck_report.py`` can help you find what
+tests were slow in a ``results.qpa`` file. Or, you can have a job with
+no ``parallel`` field set and:
+
+.. code-block:: yaml
+
+ variables:
+ CI_NODE_INDEX: 1
+ CI_NODE_TOTAL: 10
+
+to just run 1/10th of the test list.
+
+If a HW CI farm goes offline (network dies and all CI pipelines end up
+stalled) or its runners are consistently spuriously failing (disk
+full?), and the maintainer is not immediately available to fix the
+issue, please push through an MR disabling that farm's jobs by adding
+'.' to the front of the job names until the maintainer can bring
+things back up. If this happens, the farm maintainer should provide a
+report to mesa-dev@lists.freedesktop.org after the fact explaining
+what happened and what the mitigation plan is for that failure next
+time.