From Fedora Project Wiki

Revision as of 16:27, 24 May 2008 by Ravidiip (talk | contribs) (1 revision(s))

Wiki - SOP

Contact Information

Owner: Fedora Infrastructure Team / Fedora Website Team

Contact: #fedora-admin or #fedora-websites on irc.freenode.net

Location: http://fedoraproject.org/wiki/

Servers: proxy[1-2] app[1-2]

Purpose: Provides our production wiki

Description

Our wiki currently runs Moin. It is based on the stock version in EPEL with an ACL patch applied. Common performance issues relate to the size of our wiki. Page saves iterate over every user to determine whom to notify, and pages that use dynamic lists based on a category can DoS the site, because building the list iterates over every page to determine which category it is in.

Architecture

File:Infrastructure SOP wiki wiki.png

Troubleshooting and Resolution

Pages only partially loading

Symptom: Pages only partially load. Content or images are missing, or there are CSS / formatting issues.
Problem: The most common cause is that one of the app servers has become overloaded.
Solution: Remove the offending app server from the mix by disabling its proxy server. (Note: this is a temporary solution until we get an actual load balancer between the proxy servers and the app servers.) For example, if app1 is overloaded, shut off puppet and httpd on proxy1 (proxy1 -> app1, proxy2 -> app2).
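The proxy/app pairing above can be written down as a small dry-run helper. This is only a sketch: the function names are made up for illustration, the short host names are taken from the Servers line, and the init-script names "puppet" and "httpd" are assumptions to check on the actual hosts. It prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: print the command for taking one app server's proxy out of rotation.
# Assumes the 1:1 pairing from this SOP (proxy1 -> app1, proxy2 -> app2) and
# init-script service names "puppet" and "httpd" (assumptions, verify locally).

proxy_for() {
    # Map appN to its paired proxyN; reject anything else.
    case "$1" in
        app1) echo proxy1 ;;
        app2) echo proxy2 ;;
        *)    echo "unknown app server: $1" >&2; return 1 ;;
    esac
}

disable_plan() {
    proxy=$(proxy_for "$1") || return 1
    # Puppet is stopped first, presumably so it cannot start httpd again.
    echo "ssh $proxy 'service puppet stop && service httpd stop'"
}

# Dry run: shows what would be executed for an overloaded app1.
disable_plan app1
```

Printing the command keeps the helper safe to run anywhere; the operator copies the line once the pairing looks right. Remember to start both services again when the app server has recovered.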

High load / unresponsive app server

Symptom: An application server has become unresponsive and has high load.
Problem: The most common cause is that one of the app servers has become overloaded doing something inefficient on the wiki. Some page formatting, searches and emails can overload an app server. This is especially true if the user keeps clicking search or save. It can also be caused by a popular page being hit heavily (as on release day).
Solution: Remove the offending app server from the mix by disabling its proxy server. (Note: this is a temporary solution until we get an actual load balancer between the proxy servers and the app servers.) For example, if app1 is overloaded, shut off puppet and httpd on proxy1 (proxy1 -> app1, proxy2 -> app2).
Solution 2: If load is high because the wiki is simply popular (Slashdot, release day, etc.), find which pages are being hit the most. The following command lists the top 20 pages hit over the last 5 hours. Run it on the proxy servers. Take abnormally popular pages and convert them to static HTML pages using wget or by saving them from your browser. Place these static pages on the proxy servers and create an alias or redirect for them. (Don't forget to use puppet to create these aliases; otherwise puppet will overwrite your changes. Disable puppet while you're testing if needed.) If it is not possible to get a static copy of the pages, shut the website down until load comes down enough to fetch the page.
awk '{ print $7 }' $(ls -tr fedoraproject.org-access.log.* | \
tail -n 5) | grep -v "css\|js\|wikidata\|/wiki/WikiGraphics" | sort | uniq -c | \
sort -n | tail -n 20
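The static-copy step in Solution 2 might look roughly like the sketch below. The directory /srv/static-wiki, the helper names and the example page /wiki/Releases/9 are all hypothetical; only the wget flags (-q for quiet, -O for the output file) are standard. The Alias line appears as a comment because the real directive should be added through puppet.

```shell
#!/bin/sh
# Sketch: save a static copy of one hot wiki page so the proxy can serve it.
STATIC_DIR=${STATIC_DIR:-/srv/static-wiki}        # hypothetical directory on the proxy
WIKI_BASE=${WIKI_BASE:-http://fedoraproject.org}  # site root from the Location line

static_path() {
    # Flatten the URL path into one file name:
    # /wiki/Some/Page -> $STATIC_DIR/wiki_Some_Page.html
    name=$(printf '%s' "$1" | sed -e 's|^/||' -e 's|/|_|g')
    printf '%s/%s.html\n' "$STATIC_DIR" "$name"
}

snapshot_page() {
    # Fetch the rendered page once and write it to the flattened file name.
    wget -q -O "$(static_path "$1")" "$WIKI_BASE$1"
}

# Example (not run here): snapshot_page /wiki/Releases/9
# Matching Apache directive, to be added through puppet:
#   Alias /wiki/Releases/9 /srv/static-wiki/wiki_Releases_9.html
```

Take the snapshot before load gets so high that the fetch itself times out, and remove the alias once traffic returns to normal so that edits to the page become visible again.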

UnicodeEncodeError

Symptom: Pages fail with UnicodeEncodeError.
Problem: NUL characters in the edit-log of the page in question and in the main edit-log.

Solution: Edit the edit-log of the page in question and the main edit-log to remove entries with NUL characters. An update to Moin is ready upstream to fix this bug.
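Removing the affected entries by hand is error-prone, so a filter along the following lines may help. It is a sketch under assumptions: it treats each line of an edit-log as one entry, the function name is made up, and it assumes the byte \001 never occurs in a healthy log, since it is used as a stand-in for NUL.

```shell
#!/bin/sh
# Sketch: drop edit-log entries (lines) that contain NUL bytes.
# A shell variable cannot hold NUL, so NUL is first mapped to \001;
# this assumes \001 never appears in a healthy edit-log.
strip_nul_entries() {
    sentinel=$(printf '\001')
    # Every line that carried a NUL now carries the sentinel; filter those out.
    tr '\000' '\001' < "$1" | LC_ALL=C grep -v "$sentinel"
}

# Typical use (file names are placeholders); keep a backup and write a new file:
#   cp edit-log edit-log.bak
#   strip_nul_entries edit-log.bak > edit-log
```

Run it once against the page's own edit-log and once against the main edit-log, and compare line counts before and after to confirm that only the damaged entries were dropped.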