From Fedora Project Wiki


Revision as of 22:19, 30 September 2009

Shortcuts:
ISOP:FEDORACLOUD
ISOP:OVIRT


We are still working on the Fedora cloud setup; the content of this page will grow as we work on and troubleshoot the services.

Fedora Cloud computing

Contact Information

Owner: Fedora Infrastructure Team

Contact: #fedora-admin, sysadmin-cloud group

Persons: mmcgrath, SmootherFrOgZ, G

Location: Phoenix ?

Servers: capp1.fedoraproject.org, cnode[1-5].fedoraproject.org, store[1-4]

Purpose: Provide virtual machines for Fedora contributors.

Description

blabalbalablab
blablablabal

Rebuild capp1 (ovirt-server)

Log into cnode1
Check that no capp1 domain is running

sudo virsh list


If there is a capp1 domain running, proceed as follows

sudo virsh destroy capp1
sudo virsh undefine capp1
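
The check-then-remove steps above can be wrapped in a small guard so that the destroy only runs when a capp1 domain is actually listed. This is a minimal sketch, not part of the original procedure: the function names domain_is_running and remove_domain are hypothetical, and it assumes the usual virsh list table layout (two header lines, then Id, Name and State columns).

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of oVirt or libvirt.

# Return 0 if a domain with the given name shows up in "virsh list"
# (which prints running domains only), 1 otherwise.
domain_is_running() {
    # Skip the two header lines and compare the Name column exactly.
    sudo virsh list |
        awk -v name="$1" 'NR > 2 && $2 == name { found = 1 } END { exit !found }'
}

# Destroy and undefine the domain only if it is currently running.
remove_domain() {
    if domain_is_running "$1"; then
        sudo virsh destroy "$1" && sudo virsh undefine "$1"
    else
        echo "no running $1 domain, nothing to do"
    fi
}

# Usage on cnode1:
#   remove_domain capp1
```

Note that virsh list shows running domains only; a capp1 domain that is defined but shut off is not covered by this guard (virsh list --all would show it).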


Format the capp1 disk so the new virtual install starts from a clean filesystem

sudo /sbin/mkfs.ext3 -j /dev/VolGroup00/appliance1


You can now start installing a fresh capp1 virtual system

sudo virt-install -n capp1 -r 1024 --vcpus=2 --os-variant fedora11 --os-type linux \
-l http://mirrors.kernel.org/fedora/releases/11/Fedora/x86_64/os/ \
--disk="path=/dev/VolGroup00/appliance1" --nographics --noacpi --hvm --network=bridge:br2 \
--accelerate -x "console=ttyS0 ks=http://infrastructure.fedoraproject.org/rhel/ks/fedora ip=209.132.178.19 netmask=255.255.254.0 gateway=209.132.179.254 dns=4.2.2.2"

Note: If the network setup fails during the install prompts, just configure it manually. NM will take care of it afterwards.

Note2: The above ks file seems to use a graphical install as its install method. Build a new ks file or do a manual install to continue.

Network configuration

The capp1 network interfaces need to be set up manually in order to work against the physical ones.
To proceed, create the network interface file

sudo vi /etc/sysconfig/network-scripts/ifcfg-eth1


Then add the following configuration to the file

DEVICE=eth1
BOOTPROTO=static
ONBOOT=yes
PEERNTP=yes
IPADDR=$physical_br_IP
NETMASK=$physical_br_NETMASK
HWADDR=$random_mac_addr


Repeat the above for eth3 against br3.
You can get the br? IP and netmask on cnode1 with the ifconfig command.
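
The template above can be filled in from the values read on cnode1. The following is a rough sketch under stated assumptions: random_mac and render_ifcfg are hypothetical helper names, the 52:54:00 prefix is the one conventionally used for QEMU/KVM guests, and the address and netmask in the usage line are documentation placeholders, not the real bridge values.

```shell
#!/bin/sh
# Sketch only: helper names and sample values are illustrative.

# Print a random MAC address in the 52:54:00 range commonly used for KVM guests.
random_mac() {
    od -An -N3 -tx1 /dev/urandom |
        awk '{ printf "52:54:00:%s:%s:%s\n", $1, $2, $3 }'
}

# Print an ifcfg file for the given device, IP address, netmask and MAC.
render_ifcfg() {
    cat <<EOF
DEVICE=$1
BOOTPROTO=static
ONBOOT=yes
PEERNTP=yes
IPADDR=$2
NETMASK=$3
HWADDR=$4
EOF
}

# Usage (placeholder address; take the real values from ifconfig on cnode1):
#   render_ifcfg eth1 192.0.2.10 255.255.255.0 "$(random_mac)" |
#       sudo tee /etc/sysconfig/network-scripts/ifcfg-eth1
```

Printing the file to standard output rather than writing it directly makes it easy to review the result before it is installed.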


Troubleshooting

VMs can't receive tasks anymore

If for some reason a VM appears not to receive tasks, it is because the VM is no longer reachable.
Reloading your browser will show the VM in state <unreachable> in the ovirt UI.
First, check that the hosts are still available. If they are, you will have to manually restart ovirt taskomatic to re-establish its connection to the broker.

Log in to capp1 and restart the services in this order (wait a few seconds after each)

sudo service ovirt-taskomatic restart


Cnodes or VM are unreachable

If cnode(s) or VMs become unreachable, there are a couple of ways to figure out what is going on.

  • 0. First off, logs are always useful (specifically db-omatic.log and task-omatic.log).
  • 1. Check that the cnode(s) or VMs are still "physically" reachable.
  • 2. If the cnode(s) are reachable but the VMs are not, check whether libvirt-qpid and libvirtd are still running on the related cnode(s).

If they are not:

sudo service libvirt-qpid start
sudo service libvirtd start


  • 3. If both the cnode and the VM are still up and running, the cause could be a timeout on the qmf connection, or db-omatic may have died for no apparent reason. The best fix is to restart the ovirt qmf/qpid processes as follows:
sudo service qpidd restart
sudo service ovirt-db-omatic restart
sudo service ovirt-taskomatic restart
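
The restart sequence above can be scripted so that the order and the pause between services stay consistent. This is a minimal sketch, assuming the service names listed on this page; restart_in_order and the PAUSE variable are hypothetical, and the 5-second default is an arbitrary stand-in for "a few seconds".

```shell
#!/bin/sh
# Sketch only: restart_in_order and PAUSE are illustrative names.

# Seconds to wait after each restart (arbitrary default).
PAUSE=${PAUSE:-5}

# Restart each named service in the order given, stopping at the first failure.
restart_in_order() {
    for svc in "$@"; do
        echo "restarting $svc"
        sudo service "$svc" restart || return 1
        sleep "$PAUSE"
    done
}

# Usage on capp1:
#   restart_in_order qpidd ovirt-db-omatic ovirt-taskomatic
```

Stopping at the first failed restart avoids restarting ovirt-db-omatic and ovirt-taskomatic against a broker that did not come back up.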