From Fedora Project Wiki
 
(20 intermediate revisions by the same user not shown)
Line 13: Line 13:
=== Bodhi2 push monitoring===
=== Bodhi2 push monitoring===
* https://bodhi.fedoraproject.org/masher
* https://bodhi.fedoraproject.org/masher
=== Bodhi2 push===
ssh to bodhi-backend01
<pre>
sudo true ; yes  | sudo -u apache -S bodhi-push --releases="f24,f25,el-5,el-6,epel-7" --username parasense
</pre>
That's it. Because of auto signing, and recent improvements there are no more obvious things to troubleshoot or fix. The remaining page might be of amusement for historical look into pushing.


=== Troubleshooting ===
=== Troubleshooting ===
Line 18: Line 26:
<pre>
<pre>
# get the errors
# get the errors
sudo journalctl --since=yesterday -o short -u fedmsg-hub > ~/error.out
sudo journalctl --since=yesterday -o short --full --no-pager -u fedmsg-hub > ~/error.out
awk '/E[Rr][Rr]/' ~/error.out
awk '/E[Rr][Rr]/' ~/error.out


Line 34: Line 42:
sudo systemctl restart fedmsg-hub
sudo systemctl restart fedmsg-hub
</pre>
</pre>
NOTE: This is apparently due to stale NFSv3 locks. Bodhi2 is attempting to remove an old repocache, but there are stale NFS locks that prevent.


Inspect the repo cache areas (optional) to verify empty ?
Inspect the repo cache areas (optional) to verify empty ?
Line 42: Line 53:
/mnt/fedora_koji/koji/mash/updates/dist-6E-epel{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/dist-6E-epel{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/epel7{,-testing}.repocache/repodata/        \
/mnt/fedora_koji/koji/mash/updates/epel7{,-testing}.repocache/repodata/        \
/mnt/fedora_koji/koji/mash/updates/f21-updates{,-testing}.repocache/repodata/  \
/mnt/fedora_koji/koji/mash/updates/f23-updates{,-testing}.repocache/repodata/  \
/mnt/fedora_koji/koji/mash/updates/f22-updates{,-testing}.repocache/repodata/  \
/mnt/fedora_koji/koji/mash/updates/f24-updates{,-testing}.repocache/repodata/  \
/mnt/fedora_koji/koji/mash/updates/f23-updates{,-testing}.repocache/repodata/
/mnt/fedora_koji/koji/mash/updates/f25-updates{,-testing}.repocache/repodata/
</pre>
</pre>


Line 58: Line 69:
Dismount ALL the relic TMPFS mounts (warning it's unwise to do this unless you know bodhi2 is completely idle)
Dismount ALL the relic TMPFS mounts (warning it's unwise to do this unless you know bodhi2 is completely idle)
<pre>
<pre>
sudo umount /var/lib/mock/*/root/var/tmp/rpm-ostree.??????
sudo umount $(findmnt -t tmpfs -o TARGET | grep rpm-ostree)
</pre>
</pre>


Line 70: Line 81:




ERROR: can't download kf5-kdesu-None:5.13.0-1.fc22.i686 from signed path /mnt/koji/packages/kf5-kdesu/5.13.0/1.fc22/data/signed/8e1431d5/i686/kf5-kdesu-5.13.0-1.fc22.i686.rpm
ERROR: can't download kf5-kdesu-None:5.13.0-1.fc24.i686 from signed path /mnt/koji/packages/kf5-kdesu/5.13.0/1.fc24/data/signed/8e1431d5/i686/kf5-kdesu-5.13.0-1.fc24.i686.rpm


Sometimes an update has lingered so long that the signed rpm's have been garbage collected, so we simply put them back from the signature cache
Sometimes an update has lingered so long that the signed rpm's have been garbage collected, so we simply put them back from the signature cache
<pre>
<pre>
NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-22 -v --write-all --sigul-batch-size=25 kf5-kdesu-5.13.0-1.fc22
NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-24 -v --write-all --sigul-batch-size=25 kf5-kdesu-5.13.0-1.fc24
</pre>
</pre>


Line 82: Line 93:
<pre>
<pre>
cd /var/cache/sigul
cd /var/cache/sigul
yes 'no' | sudo -u masher -S bodhi-push --releases '23 22 5 6 7' --username parasense
sudo true ; yes 'no' | sudo -u apache -S bodhi-push --releases="f24,f25,el-5,el-6,epel-7" --username parasense
</pre>
</pre>


Line 88: Line 99:
Sign the builds
Sign the builds
<pre>
<pre>
for i in 23 22 ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-$i -v --write-all --sigul-batch-size=25 $(cat /var/cache/sigul/{Stable,Testing}-F${i})    ; done
for i in 25 24  ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-$i -v --write-all --sigul-batch-size=500 $(cat /var/cache/sigul/{Stable,Testing}-F${i})    ; done
for i in  7  6  5 ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py epel-$i  -v --write-all --sigul-batch-size=25 $(cat /var/cache/sigul/{Stable,Testing}-*EL-${i}) ; done
for i in  7  6  5 ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py epel-$i  -v --write-all --sigul-batch-size=500 $(cat /var/cache/sigul/{Stable,Testing}-*EL-${i}) ; done
</pre>
</pre>
Another way to do this would be to use screen to tmux, start the push but dellay the y/n answer. The files in /var/cache/sigul are created, so go ahead and sign those builds, then send "y" to the bodhi-push prompt.
Another way to do this would be to use screen to tmux, start the push but delay the y/n answer. The files in /var/cache/sigul are created, so go ahead and sign those builds, then send "y" to the bodhi-push prompt.


=== Test if bodhi masher is running ===
=== Test if bodhi masher is running ===
Line 115: Line 126:
<pre>
<pre>
# resume the push interactivly
# resume the push interactivly
sudo -u masher bodhi-push --resume --username parasense
sudo -u apache bodhi-push --resume --username parasense


# resume the push responding yes to everything
# resume the push responding yes to everything
yes| sudo -u masher -S bodhi-push --resume --username parasense
sudo true; yes | sudo -u apache -S bodhi-push --resume --username parasense
</pre>
</pre>


Line 142: Line 153:


# Review the bridge.py output
# Review the bridge.py output
# Findout the location of log file
lsof | awk '/sigul/ && /log/ {print $NF;exit}'
tail -f /var/log/sigul_bridge.log
tail -f /var/log/sigul_bridge.log
</pre>
</pre>


=== Stable push requests ===
=== Stable push requests ===
Line 154: Line 169:
Once that is done the bodhi-push comment will permit the stable push.  
Once that is done the bodhi-push comment will permit the stable push.  
<pre>
<pre>
sudo -u masher bodhi-push --releases '21 22 23' --request=stable  --builds 'lorax-21.34-1.fc21 lorax-22.13-1.fc22 freeipa-4.2.3-1.fc23' --username parasense
sudo -u apache bodhi-push --releases='f21,f22,f23' --request=stable  --builds 'lorax-21.34-1.fc21 lorax-22.13-1.fc22 freeipa-4.2.3-1.fc23' --username parasense
</pre>
</pre>


=== Testing push request ===
=== Testing push request ===
Line 164: Line 177:
The testing push uses a different lock file, so they will go in parallel with the stable push in progress.
The testing push uses a different lock file, so they will go in parallel with the stable push in progress.
<pre>
<pre>
sudo -u masher bodhi-push --releases '21 22 23' --request=testing  --username parasense
sudo -u masher bodhi-push --releases="f23,f24,f25" --request=testing  --username parasense
</pre>
 
=== Script to avoid common problems ===
 
Here is a script automates most common bodhi2 troubleshooting.
 
* Invalid repocache
* Restarting fedmsg hub to clear persisting sqlite file locks left behind by createrepo
* Umount persisting tmpfs mounts from Atomic compose
 
<pre>
#!/bin/bash
#
# Copyright (C) 2016 Red Hat, Inc.
# SPDX-License-Identifier:      MIT
#
# Authors:
#    Jon Disnard <jdisnard@redhat.com>
#
# Attempt to fix common bodhi2 issues prior to running
 
 
## Stale NFS locks
## "OSError: [Errno 39] Directory not empty"
##
## REMARKS:
## These are caused by createrepo sqlite database locking.
## This issue could be solved by using either using createrepo_c, and/or generate the metadata off to the side in /tmp, then move to NFS.
## Also, koji signed-repos would solve this
##
printf '* Restarting fedmsg-hub\n'
sudo systemctl restart fedmsg-hub
 
 
## Stale tmpfs mounts
## "OSError: [Errno 16] Device or resource busy"
##
## REMARKS:
## A recursive 'umount -R' would solve this.
## Atomic needs to clean after itself, or the atomic compose parts in bodhi2.
##
while read tmpfs
do  if test -z "$tmpfs"
    then  continue
    else 
          printf '* tmpfs found; umounting %s\n' "$tmpfs"
          sudo umount -v "$tmpfs"
    fi
done < <(findmnt -t tmpfs -o TARGET | grep rpm-ostree)
 
 
## Missing repodata/repomd.xml
## "IOError: Cannot open"
##
## REMARKS:
## The repocache improve speed, less effort
## But it's ridiculous to fail here
## Easy Bodhi2 fix.
##
for I in /mnt/fedora_koji/koji/mash/updates/dist-5E-epel{,-testing}.repocache/repodata/ \
        /mnt/fedora_koji/koji/mash/updates/dist-6E-epel{,-testing}.repocache/repodata/ \
        /mnt/fedora_koji/koji/mash/updates/epel7{,-testing}.repocache/repodata/        \
        /mnt/fedora_koji/koji/mash/updates/f23-updates{,-testing}.repocache/repodata/  \
        /mnt/fedora_koji/koji/mash/updates/f24-updates{,-testing}.repocache/repodata/  \
        /mnt/fedora_koji/koji/mash/updates/f25-updates{,-testing}.repocache/repodata/  ;
do  if test -d "$I"
    then  if test -f "${I}/repomd.xml"
          then  continue
          else  printf '* No repomd.xml found; Removing %s\n' "$I"
                sudo rmdir "$I"
          fi
    fi
done
 
 
#THE END
</pre>
</pre>

Latest revision as of 15:24, 20 February 2017


Overview

  • login to bodhi-backend01.phx2.fedoraproject.org
  • sign testing/stable update packages
  • Things to consider before pushing updates.
  • push testing/stable update packages
  • monitor the updates push
  • troubleshooting
  • pushing very important update packages to stable

Bodhi2 push monitoring

  • https://bodhi.fedoraproject.org/masher


Bodhi2 push

ssh to bodhi-backend01

sudo true ; yes  | sudo -u apache -S bodhi-push --releases="f24,f25,el-5,el-6,epel-7" --username parasense

That's it. Because of auto-signing and other recent improvements, there is nothing obvious left to troubleshoot or fix. The rest of this page is kept mainly as a historical look at how pushing used to work.

Troubleshooting

# get the errors
sudo journalctl --since=yesterday -o short --full --no-pager -u fedmsg-hub > ~/error.out
awk '/E[Rr][Rr]/' ~/error.out

# or with color
grep -E --color=auto 'ERROR|Errno' ~/error.out
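
As a rough sketch, the captured log can also be condensed into a count per errno message, which makes it easier to see which of the failures described on this page is occurring. The function name summarize_errors is made up for illustration; it assumes only that the log contains Python-style "[Errno N] message:" text like the examples shown here.

```shell
# Hypothetical helper: count each distinct "[Errno N] message" in a log file.
summarize_errors() {
    # -o prints only the matching part, e.g. "[Errno 39] Directory not empty"
    grep -Eo '\[Errno [0-9]+\] [^:]+' "$1" | sort | uniq -c | sort -rn
}

# Example: summarize_errors ~/error.out
```

Errors without an errno, such as the "IOError: Cannot open" case, are not counted by this pattern and still need the plain grep.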


OSError: [Errno 39] Directory not empty: '/mnt/koji/mash/updates/dist-6E-epel-testing-151201.1956/../dist-6E-epel-testing.repocache/repodata/'

Reset the fedmsg hub if that happens

# reset the fedmsg-hub service
sudo systemctl restart fedmsg-hub

NOTE: This is apparently due to stale NFSv3 locks. Bodhi2 attempts to remove an old repocache, but stale NFS locks prevent the removal.


Optionally, inspect the repo cache areas to verify that they are empty:

ls \
/mnt/fedora_koji/koji/mash/updates/dist-5E-epel{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/dist-6E-epel{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/epel7{,-testing}.repocache/repodata/        \
/mnt/fedora_koji/koji/mash/updates/f23-updates{,-testing}.repocache/repodata/  \
/mnt/fedora_koji/koji/mash/updates/f24-updates{,-testing}.repocache/repodata/  \
/mnt/fedora_koji/koji/mash/updates/f25-updates{,-testing}.repocache/repodata/
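
A read-only variant of this inspection might look like the sketch below. It reports, for each repodata directory passed as an argument, whether the directory is absent, contains repomd.xml, or exists without it. The function name check_repocache and the three status words are illustrative assumptions, not part of bodhi2.

```shell
# Hypothetical helper: report the state of each repodata directory given as
# an argument, without modifying anything.
check_repocache() {
    for repodata in "$@"; do
        if [ ! -d "$repodata" ]; then
            printf 'absent   %s\n' "$repodata"
        elif [ -f "$repodata/repomd.xml" ]; then
            printf 'ok       %s\n' "$repodata"
        else
            # Directory exists but has no repomd.xml
            printf 'broken   %s\n' "$repodata"
        fi
    done
}

# Example (bash brace expansion, as in the ls command above):
#   check_repocache /mnt/fedora_koji/koji/mash/updates/f25-updates{,-testing}.repocache/repodata/
```

A "broken" entry corresponds to the situation behind the "IOError: Cannot open ... repomd.xml" failure described on this page.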


OSError: [Errno 16] Device or resource busy: '/var/lib/mock/fedora-23-updates-x86_64/root/var/tmp/rpm-ostree.hjvMfC'

A bind mount needs to be removed. Look for TMPFS relics from rpm-ostree

findmnt -t tmpfs -o TARGET | grep rpm-ostree

Unmount ALL the relic TMPFS mounts. Warning: it is unwise to do this unless you know bodhi2 is completely idle.

sudo umount $(findmnt -t tmpfs -o TARGET | grep rpm-ostree)
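
Because unmounting is disruptive, a dry run can help. The sketch below narrows the findmnt output to paths that end in an rpm-ostree directory with a six-character suffix, as in the error above, and prints the umount commands instead of running them. The function names are invented for this example, and the assumption that the suffix is always six alphanumeric characters comes only from the sample path on this page.

```shell
# Hypothetical helper: keep only mount points that look like rpm-ostree
# temporary directories (six-character suffix, as in the error above).
filter_ostree_mounts() {
    grep -E '/var/tmp/rpm-ostree\.[A-Za-z0-9]{6}$'
}

# Hypothetical helper: print, but do not run, one umount command per target.
print_umounts() {
    while IFS= read -r target; do
        printf 'sudo umount -v %s\n' "$target"
    done
}

# Example (prints commands only):
#   findmnt -ln -t tmpfs -o TARGET | filter_ostree_mounts | print_umounts
```

Reviewing the printed commands before pasting them keeps an unrelated tmpfs mount whose path merely contains "rpm-ostree" from being unmounted by accident.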


IOError: Cannot open /mnt/koji/mash/updates/dist-6E-epel-151105.1606/../dist-6E-epel.repocache/repodata/repomd.xml: File /mnt/koji/mash/updates/dist-6E-epel-151105.1606/../dist-6E-epel.repocache/repodata/repomd.xml doesn't exists or not a regular file

rm -rf /mnt/koji/mash/updates/dist-6E-epel-151105.1606/../dist-6E-epel.repocache


ERROR: can't download kf5-kdesu-None:5.13.0-1.fc24.i686 from signed path /mnt/koji/packages/kf5-kdesu/5.13.0/1.fc24/data/signed/8e1431d5/i686/kf5-kdesu-5.13.0-1.fc24.i686.rpm

Sometimes an update has lingered so long that the signed RPMs have been garbage collected, so we simply put them back from the signature cache:

NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-24 -v --write-all --sigul-batch-size=25 kf5-kdesu-5.13.0-1.fc24

Signing

Start the bodhi push for signing (responding 'no' when prompted)

cd /var/cache/sigul
sudo true ; yes 'no' | sudo -u apache -S bodhi-push --releases="f24,f25,el-5,el-6,epel-7" --username parasense


Sign the builds

for i in 25 24  ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-$i -v --write-all --sigul-batch-size=500 $(cat /var/cache/sigul/{Stable,Testing}-F${i})    ; done
for i in  7  6  5 ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py epel-$i   -v --write-all --sigul-batch-size=500 $(cat /var/cache/sigul/{Stable,Testing}-*EL-${i}) ; done
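
Before signing, it can be useful to see exactly which builds the loop would pass to sigulsign_unsigned.py. The following sketch lists the unique entries from the Stable and Testing files for one Fedora release. The function name list_builds is hypothetical, and the file contents are assumed to be whitespace-separated build NVRs, which is what the $(cat ...) word splitting above implies.

```shell
# Hypothetical helper: print the unique build NVRs listed in the Stable and
# Testing files for one Fedora release, e.g. Stable-F25 and Testing-F25.
list_builds() {
    cache_dir=$1
    release=$2
    for list in "$cache_dir/Stable-F$release" "$cache_dir/Testing-F$release"; do
        # Skip a list that bodhi-push did not write
        if [ -f "$list" ]; then
            cat "$list"
        fi
    done | tr -s ' \t' '\n\n' | sed '/^$/d' | sort -u
}

# Example: list_builds /var/cache/sigul 25
```

The EPEL files follow a different naming pattern ({Stable,Testing}-*EL-N) and are not handled by this sketch.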

Another way to do this is to use screen or tmux: start the push but delay the y/n answer. Once the files in /var/cache/sigul have been created, sign those builds, then send "y" to the bodhi-push prompt.

Test if bodhi masher is running

These checks are not scientific, but may help inform.

Check for existing masher locks from a currently running push or a failed previous push

ls -l /mnt/koji/mash/updates/MASHING-*

Check for running bodhi2 push (via masher)

pgrep -af /usr/bin/mash

Also check for rsync

pgrep -af rsync
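
The lock-file check can be wrapped in a small function so that other commands can be made conditional on it. This is only a sketch: the name list_masher_locks is invented here, it covers the MASHING-* lock files but not the pgrep checks, and an empty result does not prove that bodhi2 is idle.

```shell
# Hypothetical helper: print any MASHING-* lock files in the given directory.
# Returns 0 when at least one lock exists, 1 when none are found.
list_masher_locks() {
    lock_dir=${1:-/mnt/koji/mash/updates}
    found=1
    for lock in "$lock_dir"/MASHING-*; do
        # An unmatched glob stays literal, so confirm the path exists
        [ -e "$lock" ] || continue
        printf '%s\n' "$lock"
        found=0
    done
    return "$found"
}

# Example:
#   if list_masher_locks; then echo 'push running or failed; do not clean up'; fi
```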


Resuming failed push

# resume the push interactively
sudo -u apache bodhi-push --resume --username parasense

# resume the push responding yes to everything
sudo true; yes | sudo -u apache -S bodhi-push --resume --username parasense
# Follow the output of fedmsg-hub
sudo journalctl -o short -u fedmsg-hub -l -f

Sign Bridge Tasks

# monitor the signing on bridge for potential stalls from bodhi-backend
ssh -v -o'ControlPath=none' sign-bridge01 'tail -f /var/log/sigul_bridge.log'
# Verify the bridge is running or not
pgrep -af bridge.py

# Restart the bridge as necessary 
sudo pkill -f -9 bridge.py
sudo NSS_HASH_ALG_SUPPORT=+MD5 sigul_bridge -d -v -v

# Review the bridge.py output

# Find out the location of the log file
lsof | awk '/sigul/ && /log/ {print $NF;exit}'


tail -f /var/log/sigul_bridge.log

Stable push requests

Sometimes there is an urgent request to have something pushed to stable.

Here we had two lorax builds in the testing queue, and a QA engineer requested that they go to stable ASAP. In that case, head over to the bodhi2 web front end, revoke the "testing push", then choose to push the build to stable. Once that is done, the bodhi-push command will permit the stable push.

sudo -u apache bodhi-push --releases='f21,f22,f23' --request=stable  --builds 'lorax-21.34-1.fc21 lorax-22.13-1.fc22 freeipa-4.2.3-1.fc23' --username parasense
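
The --releases value has to agree with the builds being pushed. As a minimal sketch, the release list can be derived from the dist tags of the build NVRs. The function releases_from_builds is hypothetical, and it assumes Fedora builds whose NVR ends in .fcNN; EPEL builds and NVRs with anything after the dist tag are not handled.

```shell
# Hypothetical helper: turn build NVRs ending in ".fcNN" into a sorted,
# comma-separated release list such as "f21,f22,f23".
releases_from_builds() {
    for nvr in "$@"; do
        case $nvr in
            *.fc[0-9]*)
                # Strip everything up to and including the last ".fc"
                printf 'f%s\n' "${nvr##*.fc}"
                ;;
            *)
                printf 'skipping %s: no .fcNN suffix\n' "$nvr" >&2
                ;;
        esac
    done | sort -u | paste -sd, -
}

# Example:
#   releases_from_builds lorax-21.34-1.fc21 lorax-22.13-1.fc22 freeipa-4.2.3-1.fc23
```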

Testing push request

A stable push was in progress when people requested a testing push. The testing push uses a different lock file, so it can run in parallel with the stable push.

sudo -u apache bodhi-push --releases="f23,f24,f25" --request=testing  --username parasense

Script to avoid common problems

Here is a script that automates the most common bodhi2 troubleshooting steps:

  • Removing invalid repocache directories
  • Restarting fedmsg-hub to clear persisting sqlite file locks left behind by createrepo
  • Unmounting persisting tmpfs mounts from the Atomic compose

#!/bin/bash
#
# Copyright (C) 2016 Red Hat, Inc.
# SPDX-License-Identifier:      MIT
# 
# Authors:
#     Jon Disnard <jdisnard@redhat.com>
#
# Attempt to fix common bodhi2 issues prior to running


## Stale NFS locks
## "OSError: [Errno 39] Directory not empty"
##
## REMARKS:
## These are caused by createrepo sqlite database locking.
## This issue could be solved by using createrepo_c, and/or by generating the metadata off to the side in /tmp and then moving it to NFS.
## Also, koji signed-repos would solve this
##
printf '* Restarting fedmsg-hub\n'
sudo systemctl restart fedmsg-hub


## Stale tmpfs mounts
## "OSError: [Errno 16] Device or resource busy"
##
## REMARKS:
## A recursive 'umount -R' would solve this.
## Atomic needs to clean after itself, or the atomic compose parts in bodhi2.
##
while read tmpfs
do  if test -z "$tmpfs"
    then  continue
    else  
          printf '* tmpfs found; umounting %s\n' "$tmpfs"
          sudo umount -v "$tmpfs"
    fi
done < <(findmnt -t tmpfs -o TARGET | grep rpm-ostree)


## Missing repodata/repomd.xml
## "IOError: Cannot open"
##
## REMARKS:
## The repocache improves speed and reduces effort,
## but it is ridiculous for the push to fail here.
## This would be an easy fix in Bodhi2.
##
for I in /mnt/fedora_koji/koji/mash/updates/dist-5E-epel{,-testing}.repocache/repodata/ \
         /mnt/fedora_koji/koji/mash/updates/dist-6E-epel{,-testing}.repocache/repodata/ \
         /mnt/fedora_koji/koji/mash/updates/epel7{,-testing}.repocache/repodata/        \
         /mnt/fedora_koji/koji/mash/updates/f23-updates{,-testing}.repocache/repodata/  \
         /mnt/fedora_koji/koji/mash/updates/f24-updates{,-testing}.repocache/repodata/  \
         /mnt/fedora_koji/koji/mash/updates/f25-updates{,-testing}.repocache/repodata/  ;
do  if test -d "$I"
    then  if test -f "${I}/repomd.xml"
          then  continue
          else  printf '* No repomd.xml found; Removing %s\n' "$I"
                sudo rmdir "$I"
          fi
    fi
done


#THE END