Latest revision as of 15:24, 20 February 2017
Overview
- login to bodhi-backend01.phx2.fedoraproject.org
- sign testing/stable update packages
- Things to consider before pushing updates.
- push testing/stable update packages
- monitor the updates push
- troubleshooting
- pushing very important update packages to stable
Bodhi2 push monitoring
- https://bodhi.fedoraproject.org/masher
Bodhi2 push
ssh to bodhi-backend01
sudo true ; yes | sudo -u apache -S bodhi-push --releases="f24,f25,el-5,el-6,epel-7" --username parasense
That's it. Because of automatic signing and other recent improvements, there is usually nothing obvious left to troubleshoot or fix. The rest of this page is kept mainly as a historical look at how pushes used to be done.
Troubleshooting
# get the errors
sudo journalctl --since=yesterday -o short --full --no-pager -u fedmsg-hub > ~/error.out
awk '/E[Rr][Rr]/' ~/error.out
# or with color
egrep 'ERROR|Errno' ~/error.out
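When the saved log is long, it can help to see which error codes dominate before reading individual lines. The following is a minimal sketch, not part of bodhi or the releng scripts: `summarize_errnos` is a hypothetical helper name, and it assumes the log contains Python-style `[Errno N]` messages like the ones shown on this page.

```shell
#!/bin/bash
# Hypothetical helper (not part of bodhi or releng): count how often each
# "Errno N" code appears in a saved journal dump such as ~/error.out.
summarize_errnos() {
    local logfile="$1"
    # Extract every "Errno N" token, count identical ones, and print the
    # most frequent code first as "Errno N: count".
    grep -o 'Errno [0-9][0-9]*' "$logfile" |
        sort |
        uniq -c |
        sort -rn |
        awk '{printf "%s %s: %d\n", $2, $3, $1}'
}
```

Usage would be `summarize_errnos ~/error.out`; the codes it reports can then be matched against the error messages listed in this section.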
OSError: [Errno 39] Directory not empty: '/mnt/koji/mash/updates/dist-6E-epel-testing-151201.1956/../dist-6E-epel-testing.repocache/repodata/'
Reset the fedmsg hub if that happens
# reset the fedmsg-hub service
sudo systemctl restart fedmsg-hub
NOTE: This is apparently due to stale NFSv3 locks. Bodhi2 is attempting to remove an old repocache, but stale NFS locks prevent the removal.
Optionally, inspect the repocache areas to verify that they are empty:
ls \
/mnt/fedora_koji/koji/mash/updates/dist-5E-epel{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/dist-6E-epel{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/epel7{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/f23-updates{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/f24-updates{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/f25-updates{,-testing}.repocache/repodata/
OSError: [Errno 16] Device or resource busy: '/var/lib/mock/fedora-23-updates-x86_64/root/var/tmp/rpm-ostree.hjvMfC'
A bind mount needs to be removed. Look for TMPFS relics from rpm-ostree
findmnt -t tmpfs -o TARGET | grep rpm-ostree
Unmount ALL the relic TMPFS mounts. Warning: it is unwise to do this unless you know bodhi2 is completely idle.
sudo umount $(findmnt -t tmpfs -o TARGET | grep rpm-ostree)
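Because unmounting everything at once is risky, a dry run that only prints what would be unmounted may be worth doing first. This is a minimal sketch under stated assumptions: `plan_ostree_umounts` is a hypothetical helper, it reads mount targets one per line on standard input, and it never runs `umount` itself.

```shell
#!/bin/bash
# Hypothetical helper: read mount targets (one per line) on stdin and print
# the umount command for each rpm-ostree relic instead of executing it.
plan_ostree_umounts() {
    local target
    while IFS= read -r target; do
        case "$target" in
            *rpm-ostree*) printf 'sudo umount -v "%s"\n' "$target" ;;
        esac
    done
}
```

One way to feed it would be `findmnt -l -n -t tmpfs -o TARGET | plan_ostree_umounts`, where `-l` and `-n` ask findmnt for a plain list without a header line. Review the printed commands, then run them by hand once bodhi2 is known to be idle.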
IOError: Cannot open /mnt/koji/mash/updates/dist-6E-epel-151105.1606/../dist-6E-epel.repocache/repodata/repomd.xml: File /mnt/koji/mash/updates/dist-6E-epel-151105.1606/../dist-6E-epel.repocache/repodata/repomd.xml doesn't exists or not a regular file
rm -rf /mnt/koji/mash/updates/dist-6E-epel-151105.1606/../dist-6E-epel.repocache
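Before removing a repocache with `rm -rf`, it may be safer to confirm which caches are actually missing their `repomd.xml`. The sketch below is an illustration only: `find_broken_repocaches` is a hypothetical helper, and it assumes each argument is a `*.repocache` directory that should contain `repodata/repomd.xml`. It reports candidates and deletes nothing.

```shell
#!/bin/bash
# Hypothetical helper: print every repocache directory passed as an argument
# whose repodata/repomd.xml is missing or is not a regular file.
find_broken_repocaches() {
    local cache
    for cache in "$@"; do
        if [ -d "$cache" ] && [ ! -f "$cache/repodata/repomd.xml" ]; then
            printf '%s\n' "$cache"
        fi
    done
}
```

For example, `find_broken_repocaches /mnt/koji/mash/updates/*.repocache` would list only the caches that match the IOError above, leaving the decision to remove them to the operator.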
ERROR: can't download kf5-kdesu-None:5.13.0-1.fc24.i686 from signed path /mnt/koji/packages/kf5-kdesu/5.13.0/1.fc24/data/signed/8e1431d5/i686/kf5-kdesu-5.13.0-1.fc24.i686.rpm
Sometimes an update has lingered so long that the signed RPMs have been garbage collected, so we simply put them back from the signature cache.
NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-24 -v --write-all --sigul-batch-size=25 kf5-kdesu-5.13.0-1.fc24
Signing
Start the bodhi push for signing (responding 'no' when prompted)
cd /var/cache/sigul
sudo true ; yes 'no' | sudo -u apache -S bodhi-push --releases="f24,f25,el-5,el-6,epel-7" --username parasense
Sign the builds
for i in 25 24 ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-$i -v --write-all --sigul-batch-size=500 $(cat /var/cache/sigul/{Stable,Testing}-F${i}) ; done
for i in 7 6 5 ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py epel-$i -v --write-all --sigul-batch-size=500 $(cat /var/cache/sigul/{Stable,Testing}-*EL-${i}) ; done
Another way to do this is to use screen or tmux: start the push, but delay the y/n answer. By that point the files in /var/cache/sigul have been created, so go ahead and sign those builds, then send "y" to the bodhi-push prompt.
Test if bodhi masher is running
These checks are not conclusive, but they can help you judge whether a push is active.
Check for existing masher locks left by a currently running push or by a previous failed push
ls -l /mnt/koji/mash/updates/MASHING-*
Check for running bodhi2 push (via masher)
pgrep -af /usr/bin/mash
Also check for rsync
pgrep -af rsync
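The lock check above can be wrapped so that it gives a usable exit status instead of an `ls` error when no lock exists. This is a minimal sketch: `list_masher_locks` is a hypothetical helper, and it assumes only what this page states, namely that lock files are named `MASHING-*` inside the updates directory.

```shell
#!/bin/bash
# Hypothetical helper: print any MASHING-* lock files under the given
# directory. Returns 0 if at least one lock exists, 1 otherwise.
list_masher_locks() {
    local dir="$1"
    local lock
    local status=1
    for lock in "$dir"/MASHING-*; do
        # If the glob matched nothing, the literal pattern is left behind.
        [ -e "$lock" ] || continue
        printf '%s\n' "$lock"
        status=0
    done
    return "$status"
}
```

For instance, `list_masher_locks /mnt/koji/mash/updates || echo "no masher locks"` prints the locks if there are any. A lock alone does not prove a push is running, so it is still worth combining this with the `pgrep` checks.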
Resuming failed push
# resume the push interactively
sudo -u apache bodhi-push --resume --username parasense
# resume the push responding yes to everything
sudo true; yes | sudo -u apache -S bodhi-push --resume --username parasense
# Follow the output of fedmsg-hub
sudo journalctl -o short -u fedmsg-hub -l -f
Sign Bridge Tasks
# monitor the signing on bridge for potential stalls from bodhi-backend
ssh -v -o'ControlPath=none' sign-bridge01 'tail -f /var/log/sigul_bridge.log'
# Verify the bridge is running or not
pgrep -af bridge.py
# Restart the bridge as necessary
sudo pkill -f -9 bridge.py
sudo NSS_HASH_ALG_SUPPORT=+MD5 sigul_bridge -d -v -v
# Review the bridge.py output
# Find out the location of the log file
lsof | awk '/sigul/ && /log/ {print $NF;exit}'
tail -f /var/log/sigul_bridge.log
Stable push requests
Sometimes there is an urgent request to have something pushed to stable.
Here we had two lorax builds in the testing queue, and a QA engineer requested that they go to stable ASAP. To do that, head over to the bodhi2 web front end, revoke the testing push request, then choose to push the build to stable. Once that is done, the bodhi-push command will permit the stable push.
sudo -u apache bodhi-push --releases='f21,f22,f23' --request=stable --builds 'lorax-21.34-1.fc21 lorax-22.13-1.fc22 freeipa-4.2.3-1.fc23' --username parasense
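In the command above, the `--releases` value has to cover the dist tags of the builds being pushed. The sketch below shows one way that list could be derived rather than typed by hand. It is an assumption-laden illustration: `releases_for_builds` is a hypothetical helper, it only understands Fedora builds whose NVR ends in `.fcNN`, and EPEL builds would need their own mapping.

```shell
#!/bin/bash
# Hypothetical helper: map build NVRs ending in ".fcNN" to a sorted,
# de-duplicated, comma-separated list of release names ("fNN").
# Builds without a .fcNN suffix are silently ignored.
releases_for_builds() {
    printf '%s\n' "$@" |
        sed -n 's/.*\.fc\([0-9][0-9]*\)$/f\1/p' |
        sort -u |
        paste -sd, -
}
```

With the three builds from the example, `releases_for_builds lorax-21.34-1.fc21 lorax-22.13-1.fc22 freeipa-4.2.3-1.fc23` prints `f21,f22,f23`, which could then be passed as `--releases="$(releases_for_builds ...)"`.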
Testing push request
In this case a stable push was in progress while people were requesting a testing push. The testing push uses a different lock file, so it can run in parallel with the stable push.
sudo -u masher bodhi-push --releases="f23,f24,f25" --request=testing --username parasense
Script to avoid common problems
Here is a script that automates the most common bodhi2 troubleshooting steps.
- Invalid repocache
- Restarting fedmsg hub to clear persisting sqlite file locks left behind by createrepo
- Umount persisting tmpfs mounts from Atomic compose
#!/bin/bash
#
# Copyright (C) 2016 Red Hat, Inc.
# SPDX-License-Identifier: MIT
#
# Authors:
#     Jon Disnard <jdisnard@redhat.com>
#
# Attempt to fix common bodhi2 issues prior to running

## Stale NFS locks
## "OSError: [Errno 39] Directory not empty"
##
## REMARKS:
## These are caused by createrepo sqlite database locking.
## This issue could be solved by using either using createrepo_c, and/or generate the metadata off to the side in /tmp, then move to NFS.
## Also, koji signed-repos would solve this
##
printf '* Restarting fedmsg-hub\n'
sudo systemctl restart fedmsg-hub

## Stale tmpfs mounts
## "OSError: [Errno 16] Device or resource busy"
##
## REMARKS:
## A recursive 'umount -R' would solve this.
## Atomic needs to clean after itself, or the atomic compose parts in bodhi2.
##
while read tmpfs
do  if test -z "$tmpfs"
    then continue
    else
        printf '* tmpfs found; umounting %s\n' "$tmpfs"
        sudo umount -v "$tmpfs"
    fi
done < <(findmnt -t tmpfs -o TARGET | grep rpm-ostree)

## Missing repodata/repomd.xml
## "IOError: Cannot open"
##
## REMARKS:
## The repocache improve speed, less effort
## But it's ridiculous to fail here
## Easy Bodhi2 fix.
##
for I in /mnt/fedora_koji/koji/mash/updates/dist-5E-epel{,-testing}.repocache/repodata/ \
         /mnt/fedora_koji/koji/mash/updates/dist-6E-epel{,-testing}.repocache/repodata/ \
         /mnt/fedora_koji/koji/mash/updates/epel7{,-testing}.repocache/repodata/ \
         /mnt/fedora_koji/koji/mash/updates/f23-updates{,-testing}.repocache/repodata/ \
         /mnt/fedora_koji/koji/mash/updates/f24-updates{,-testing}.repocache/repodata/ \
         /mnt/fedora_koji/koji/mash/updates/f25-updates{,-testing}.repocache/repodata/ ;
do  if test -d "$I"
    then if test -f "${I}/repomd.xml"
         then continue
         else printf '* No repomd.xml found; Removing %s\n' "$I"
              sudo rmdir "$I"
         fi
    fi
done
#THE END