ci/bare-metal: Retry booting chezas instead of failing when !POWER_GOOD
author Eric Anholt <eric@anholt.net>
Wed, 19 Aug 2020 23:28:27 +0000 (16:28 -0700)
committer Marge Bot <eric+marge@anholt.net>
Fri, 21 Aug 2020 20:10:18 +0000 (20:10 +0000)
If we get this error, we can just try rebooting again and see if it comes
up then.  The POWER_GOOD failures are clustered in time, but it's better
to retry a few times in a row in one job (which has its own 60min timeout)
than to spuriously fail someone's pipeline.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>
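
The retry behaviour described above can be sketched in isolation. This is a minimal illustration, not the code from cros_servo_run.py: classify_line, run_with_retries, RETRY and the max_attempts parameter are hypothetical names introduced here. The change itself loops until the run returns something other than 2 and relies on the job's 60min timeout as the upper bound; the optional cap below is an assumption added only to show how a bounded variant might look.

```python
import re

# Return code used by this change to mean "POWER_GOOD not seen, boot again".
RETRY = 2


def classify_line(line):
    """Map one serial-console line to a return code, or None to keep reading.

    Hypothetical helper: mirrors the two failure checks in the diff.
    """
    if re.match("---. end Kernel panic", line):
        return 1  # hard failure, do not retry
    if re.match("POWER_GOOD not seen in time", line):
        return RETRY  # power never came up, worth another boot attempt
    return None


def run_with_retries(run_once, max_attempts=None):
    """Call run_once() until it returns something other than RETRY.

    max_attempts is an assumption for illustration; with None the loop is
    unbounded, as in the change, and an outer job timeout must stop it.
    Returns (final return code, number of attempts made).
    """
    attempts = 0
    while True:
        attempts += 1
        retval = run_once()
        if retval != RETRY:
            return retval, attempts
        if max_attempts is not None and attempts >= max_attempts:
            return retval, attempts
```

For example, a run that fails to power up twice and then passes would yield (0, 3) from run_with_retries, while a board that never powers up with max_attempts=3 would yield (2, 3) so the caller can still report a failure.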

.gitlab-ci/bare-metal/cros_servo_run.py

index 0c43419854f05f2042eca2adb32df980c981cc4d..976371d18af38955a138734ea0bf07df6da2b861 100755 (executable)
@@ -55,8 +55,12 @@ class CrosServoRun:
         for line in self.cpu_ser.lines():
             if re.match("---. end Kernel panic", line):
                 return 1
+
+            # The Cheza boards have issues with failing to bring up power to
+            # the system sometimes, possibly dependent on ambient temperature
+            # in the farm.
             if re.match("POWER_GOOD not seen in time", line):
-                return 1
+                return 2
 
             result = re.match("bare-metal result: (\S*)", line)
             if result:
@@ -76,7 +80,10 @@ def main():
 
     servo = CrosServoRun(args.cpu, args.ec)
 
-    retval = servo.run()
+    while True:
+        retval = servo.run()
+        if retval != 2:
+            break
 
     # power down the CPU on the device
     servo.ec_write("power off\n")