Appendix E - Troubleshooting
Fixing Deployment Problems
Sometimes a node fails to deploy. When this happens, check the installation output on the node’s MAAS page. (Click the Logs tab and then click Installation Output.) Often, a clue to the nature of the problem appears near the end of that output. If you don’t spot anything obvious, copy that output into a file and send it to the Server Certification Team.
One common cause of deployment problems is IP address assignment issues. Depending on your MAAS configuration and local network needs, your network might work better with DHCP, Auto Assign, or Static Assign as the method of IP address assignment. To change this setting, you must first release the node. You can then click the Network tab on the node’s summary page in MAAS and reconfigure the network options by using the Actions field, as described earlier, in Installing Ubuntu on the System.
If, when you try to deploy a GA kernel, MAAS complains that the kernel is too old, try this:
Click the node’s Configuration tab in MAAS.
Click Edit under Machine Configuration.
Under Minimum Kernel, select No Minimum Kernel.
Click Save Changes.
Try to re-deploy.
Adding PPAs Manually
Sometimes you may need to add a PPA manually. For this to work, your
SUT must be able to reach the Internet and, more specifically,
launchpad.net. If either requirement is not met, you will receive a
somewhat confusing message like this:
ubuntu@ubuntu:~$ sudo add-apt-repository ppa:checkbox-dev/stable
Cannot add PPA: 'ppa:checkbox-dev/stable'.
Please check that the PPA name or format is correct.
To resolve this, ensure that your SUT can reach the Internet and can reach
launchpad.net directly.
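A quick way to test the second requirement is to attempt a TCP connection from the SUT. The following is a minimal sketch, not part of the Server Test Suite: can_reach is a hypothetical helper name, and the choice of port 443 and a 5-second timeout are assumptions. It tests only a direct connection and says nothing about proxy settings.

```shell
# Hypothetical helper: report whether a TCP connection to HOST:PORT succeeds.
# It relies on bash's /dev/tcp redirection, so no extra packages are needed.
can_reach() {
    local host="$1" port="${2:-443}"
    # timeout(1) bounds the attempt; the child bash closes the socket on exit.
    if timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "reachable: ${host}:${port}"
    else
        echo "NOT reachable: ${host}:${port}"
        return 1
    fi
}

# Example (run on the SUT):
#   can_reach launchpad.net 443
```

If the helper reports that launchpad.net is not reachable, fix the network path before retrying add-apt-repository.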
Submitting Results
If submitting results from the Server Test Suite itself fails, you can use
the checkbox-cli program, as described earlier, in
Manually Uploading Test Results to the Certification Site. You can try
this on the SUT, but if network problems prevented a successful submission,
you may need to bring the files out on a USB flash drive or other removable
medium and submit them from a computer with better Internet connectivity.
Resolving Network Problems
Network problems are common in testing. These problems can manifest as complete failures of all network tests or as failures of just some tests. Specific suggestions for fixing these problems include:
Check cables and other hardware – Yes, this is very basic; but bad cables can cause problems. For instance, one bad cable at Canonical resulted in connections at 100 Mbps rather than 1 Gbps, and therefore failures. Some of these failures were identified in the output as the lack of a route to the host. Similarly, if a switch connecting the SUT to the iperf3 server is deficient, it will affect the network test results.
Use the simplest possible network – Complex network setups and those with heavy traffic from computers uninvolved in the testing or those with multiple switches, bridges, etc., can create problems for network testing. Simplifying the network in whatever way is practical can improve matters.
Check firewall settings – Successful deployments may require access to several network sites. These include repositories at archive.ubuntu.com (or a regional mirror), Ubuntu’s PPA site at ppa.launchpad.net, and Ubuntu’s key server at keyserver.ubuntu.com. (You may instead use local mirrors of the archive and PPA sites.) If your site implements strict outgoing firewall rules, you may need to open access to these sites on ports 80 and/or 443.
Check the iperf3 server – Ensure that the server computer is up and that the iperf3 server program is running on it. Also ensure that the computer has no issues, such as a runaway process that’s consuming too much CPU time.
Verify the iperf3 server is not overworked – The iperf3 server program refuses connections if it’s already talking to another client. Thus, a SUT may fail its network test if the iperf3 server is already in use. You may need to re-run the network tests on one or more SUTs if this is the case. Note that a faster iperf3 server (say, one with a 10 Gbps NIC used to test 1 Gbps SUTs) requires special configuration to handle multiple simultaneous connections, as described in the Certification Environment Setup Guide.
Ensure the iperf3 server is on the SUT’s local network – The network tests temporarily remove the default route from the routing table, so the iperf3 server must be on the same network segment as the SUT.
Check the SUT’s network configuration – A failure to configure the network ports will cause a failure of the network tests. Likewise, a failure to bring up a network interface before testing will cause the test to fail, even if the Server Test Suite detects the interface.
Check your DHCP server – A sluggish or otherwise malfunctioning DHCP server can delay bringing up the SUT’s network interfaces (which repeatedly go down and come up during testing). This in turn can cause network testing failures.
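A link that negotiated a lower speed than expected, as in the bad-cable example above, can be spotted by reading the speed the kernel reports for an interface. The following is a minimal sketch under stated assumptions: link_speed_ok is a hypothetical helper, eth0 is a placeholder interface name, and the SYSFS_NET variable exists only so the function can be tried against a test directory.

```shell
# Hypothetical helper: compare an interface's negotiated speed to a minimum.
# SYSFS_NET is an assumption added so the function can be exercised off-target.
link_speed_ok() {
    local iface="$1" min_mbps="$2"
    local speed_file="${SYSFS_NET:-/sys/class/net}/${iface}/speed"
    local speed
    # The kernel reports the speed in Mb/s; the read fails if the link is down.
    if ! speed="$(cat "$speed_file" 2>/dev/null)" || [ -z "$speed" ]; then
        echo "${iface}: speed unavailable (link down or no such interface?)"
        return 2
    fi
    if [ "$speed" -ge "$min_mbps" ]; then
        echo "${iface}: ${speed} Mb/s (ok)"
    else
        echo "${iface}: ${speed} Mb/s, expected at least ${min_mbps} Mb/s"
        return 1
    fi
}

# Example: confirm that a 1 Gbps port really linked at 1 Gbps.
#   link_speed_ok eth0 1000
```

A lower-than-expected value usually points to the cable, the switch port, or the NIC rather than to the test suite.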
If you end up having to re-run the network tests, you can do so as described earlier, in Appendix B - Re-Testing and Installing Updated Tests.
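Before re-running, it can help to confirm that the SUT would reach the iperf3 server without going through a gateway, since the tests remove the default route. The sketch below approximates "same network segment" as "the kernel's chosen route has no via hop", which is an assumption and not how the Server Test Suite itself decides. Both function names are hypothetical.

```shell
# Hypothetical helper: classify one line of `ip route get` output.
route_is_on_link() {
    case "$1" in
        *" via "*) return 1 ;;   # traffic would go through a gateway
        *" dev "*) return 0 ;;   # delivered directly on a local interface
        *) return 2 ;;           # unrecognized or empty output
    esac
}

# Hypothetical wrapper: ask the kernel how it would route to the server.
check_iperf3_route() {
    local server="$1" route
    route="$(ip route get "$server" 2>/dev/null | head -n 1)"
    if route_is_on_link "$route"; then
        echo "${server}: on the local segment"
    else
        echo "${server}: not directly reachable (${route:-no route})"
        return 1
    fi
}

# Example (replace the placeholder with your iperf3 server's address):
#   check_iperf3_route IPERF3_SERVER_IP
```

If the route goes through a gateway, move the iperf3 server or the SUT so that both sit on the same segment.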
Fixing Virtualization Test Problems
Virtualization tests can fail for a number of reasons. If these tests fail, you should first try these diagnostic or corrective actions:
Type sudo apt install -f on the SUT. This command repairs some package installation problems, which can sometimes cause the KVM test to fail.
Check your virtualization image sources, as described in Running the Certification Tests. Note that you may need to check the configuration on the SUT (in /etc/xdg/canonical-certification.conf) and on whatever server you use to host your virtualization images.
If you’re not hosting virtualization images locally, be aware that the virtualization tests will try to download images from the Internet. In this case, you must ensure that the SUT has Internet access.
You can run the virtualization tests alone by typing test-virtualization on the SUT.
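When reviewing the image sources configured on the SUT, it may be easier to look only at the settings that are actually active. The following sketch prints the non-blank, non-comment lines of the configuration file. It assumes that comments in that file begin with #, and show_active_settings is a hypothetical helper name.

```shell
# Hypothetical helper: print the active lines of a configuration file.
# Assumes comment lines begin with '#', optionally preceded by whitespace.
show_active_settings() {
    local conf="${1:-/etc/xdg/canonical-certification.conf}"
    if [ ! -r "$conf" ]; then
        echo "cannot read ${conf}" >&2
        return 1
    fi
    # Drop blank lines and comment lines; print everything else unchanged.
    grep -Ev '^[[:space:]]*(#|$)' "$conf"
}

# Example (run on the SUT):
#   show_active_settings
```

Any image locations shown in the output should be reachable from the SUT.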
Handling Secure Boot MOKs
Although most Ubuntu components, such as GRUB, the Linux kernel, and
standard Linux kernel modules, are cryptographically signed with
Canonical’s key, some third-party and specialized modules (notably
including some used by the firmware test suite, or fwts) are not.
Before such modules can be used, they must be signed with a machine
owner key (MOK), which is stored in the computer’s NVRAM. UEFI Secure
Boot policy requires manual boot-time approval to store the MOK. Thus,
if the computer is deployed with Secure Boot active and certain packages
are updated via apt, the apt program will prompt for a password. Upon
reboot, the computer’s console will display a prompt to enter a
password, and the MOK will be added only if that password matches the
one you entered as part of the apt package update. The prompt at reboot
has no timeout, so if you can’t see the console, the reboot will fail.
If console access is not available, it’s best to configure computers with Secure Boot disabled; however, as a general rule, we encourage use of Secure Boot so as to ensure that this feature works. “Console access” can be via a remote KVM or even IPMI SoL. Enabling and disabling Secure Boot generally requires this access, too.
Repeatedly deploying a server with Secure Boot active may result in the
accumulation of multiple MOKs in the computer’s NVRAM. In theory, these
could grow to consume enough space in the NVRAM to cause problems. Typing
sudo mokutil --reset at an Ubuntu console will cause all the MOKs to be
deleted; however, this will cause kernel modules signed with a MOK to fail
to load. It’s best to use this command just prior to releasing a node.
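To judge whether MOKs have been accumulating, you can count the entries that mokutil reports. The following is a rough sketch: count_moks is a hypothetical helper, and it assumes that mokutil --list-enrolled introduces each entry with a line of the form [key N]. The count may include certificates other than MOKs generated on the machine.

```shell
# Hypothetical helper: count "[key N]" headers read from standard input.
# Assumes each enrolled key is introduced by a line such as "[key 1]".
count_moks() {
    grep -c '^\[key [0-9][0-9]*\]'
}

# Example (run on the SUT):
#   mokutil --sb-state                        # reports whether Secure Boot is enabled
#   sudo mokutil --list-enrolled | count_moks
```

A count that grows with each deployment suggests that a reset before releasing the node is worthwhile.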
Handling Miscellaneous Issues During Testing
The testing process should be straightforward and complete without issue.
Should you encounter problems during testing, please contact your account
manager. Be sure to save the ~/.local/share/checkbox-ng and
~/.cache/plainbox directory trees, as they will contain logs and other
data that will help the Server Certification Team determine if the issue is
a testing issue or a hardware issue that will affect the certification
outcome.
If possible, please also save a copy of any terminal output or tracebacks you notice to a text file and save that along with the previously-noted directories. (Feel free to send us a photo of the screen taken with a digital camera.)
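One convenient way to save both directory trees is to bundle them into a single archive. The following is a minimal sketch; save_checkbox_logs and the default archive name are hypothetical, and the function simply skips a directory that does not exist.

```shell
# Hypothetical helper: archive the Checkbox log directories under $HOME.
save_checkbox_logs() {
    local out="${1:-checkbox-logs-$(date +%Y%m%d-%H%M%S).tar.gz}"
    local dirs="" d
    for d in .local/share/checkbox-ng .cache/plainbox; do
        [ -d "${HOME}/${d}" ] && dirs="${dirs} ${d}"
    done
    if [ -z "$dirs" ]; then
        echo "no log directories found under ${HOME}" >&2
        return 1
    fi
    # Paths are stored relative to $HOME. $dirs is deliberately unquoted so
    # that it splits into separate arguments (the names contain no spaces).
    tar -czf "$out" -C "$HOME" $dirs && echo "wrote ${out}"
}

# Example (run on the SUT as the user who ran the tests):
#   save_checkbox_logs
```

The resulting archive can be sent along with any saved terminal output.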