Mediawiki Setup Guide

From Opinionated Free Software Wiki

Introduction

tldr: For GNU/Linux (with a bit of Debian bias), a more concise, holistic and automated install than the official Mediawiki docs. Do some initial configuration then download this page and run it, or execute it as you read.

Goals / Why use this guide?

  • Good recommendations. Official docs mostly avoid recommendations among a myriad of possibilities
  • Closely references & supplements official documentation
  • Automatic security updates
  • Explicit automation support wherever practical
  • Used to set up this site (style is optional)
  • Support for multiple GNU/Linux distros
  • Holistic scope (backups, server setup), but sections are independent
  • Code blocks are idempotent
  • Edits to this page are tested on this site and reviewed by the main author.

Assumptions

  • Self hosting, single GNU/Linux system with root Bash shell


Version Support

Very minor adjustments needed for other distros. Help expand this list.

  • Mediawiki 1.28, updated as new versions are released
  • Debian 8 + backports
  • Debian 8
  • Debian testing (last tested Aug 7, 2016)

Pre 5/2016 revisions ran Mediawiki 1.23, tested on Fedora 20 and Ubuntu 14.04.

Prerequisites

Getting a Server & a Domain

The most common route, and the one taken by this site, is buying a domain name from a site like Namecheap, and a cheap VPS from companies like Linode or DigitalOcean. They have good getting started guides which mostly apply beyond their own sites.

Email Setup

Setting up email can be an involved process, and this guide assumes that some program (usually postfix or exim) is implementing a functional sendmail interface. Mediawiki uses email to send password reminders and notifications, and this guide includes cronjobs for updating mediawiki and doing backups which will send mail in the case of an error. Email is also the recommended way to get notifications of package updates which require manual steps such as restarting services.

If you are not setting up your server to send mail with a program that uses the default sendmail interface, see these pages when you are configuring mediawiki: Manual:$wgEnableEmail, Manual:Email_settings, Manual:$wgSMTP

Setup Guide Configuration

  1. Set variables below
  2. Save the code in this section to a file (~/mw_vars is suggested)
  3. Source it at the beginning of scripts containing later commands
  4. Source it from your .bashrc file while you are setting up Mediawiki

Requires customization:

# Replace REPLACE_ME as appropriate

export mwdescription="REPLACE_ME" # eg. Opinionated Free Software Wiki

# username/pass of the first wiki admin user
export wikiuser="REPLACE_ME"
export wikipass=REPLACE_ME

# root password for the mysql database
export dbpass=REPLACE_ME

export mwdomain=REPLACE_ME # domain name. for this site, it's ofswiki.org

# customize these questions. Try not to have the answer be a word in the question.
captchaArray() {
    if ! grep -Fx '$localSettingsQuestyQuestions = array (' $mwc; then
       tee -a $mwc <<'EOF'
$localSettingsQuestyQuestions = array (
    "What is the name of the wiki software this site (and wikipedia) uses?" => "Mediawiki",
    "REPLACE_ME with a question" => "REPLACE_ME with an answer"
);
EOF
    fi
}

# The rest of this section will work fine with no changes.

# git branch for mediawiki + extensions. See intro for supported versions.
# branch names: https://git.wikimedia.org/branches/mediawiki%2Fcore.git
export mw_branch=REL1_28

# As set by gui installer when choosing cc by sa.
export mw_RightsUrl='https://creativecommons.org/licenses/by-sa/4.0/'
export mw_RightsText='Creative Commons Attribution-ShareAlike'
export mw_RightsIcon='$wgScriptPath/resources/assets/licenses/cc-by-sa.png'

# Alphanumeric site name for pywikibot.
# Here we use the domain minus the dots, which should work fine without changing.
export mwfamily=${mwdomain//./}
# install path for mediawiki. This should work fine.
export mw=/var/www/$mwdomain/html/w


# wiki sender address / wiki & wiki server contact email.
# see email section for more info on email
export mw_email="admin@$mwdomain"

# Leave as is:
mwc="$mw/LocalSettings.php"
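Since every later code block sources this file, it can help to confirm that no placeholder was left behind before running anything. The following is only a sketch: mw_check_vars is a hypothetical helper name that is not part of this guide's scripts, and the variable list is simply the set of REPLACE_ME settings above.

```shell
# Hypothetical helper (not part of the original guide): report any
# required setting that is empty or still contains REPLACE_ME.
mw_check_vars() {
    local v rc=0
    for v in mwdescription wikiuser wikipass dbpass mwdomain; do
        # ${!v} is bash indirect expansion: the value of the variable named by $v
        if [[ -z ${!v:-} || ${!v} == *REPLACE_ME* ]]; then
            echo "mw_vars: $v still needs a value" >&2
            rc=1
        fi
    done
    return $rc
}
```

Running mw_check_vars right after sourcing ~/mw_vars would print one line per setting that still needs attention and return non-zero if there is any.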

Download this page and run it

This is an option for automated setup. Optional code blocks are skipped (they have a bold warning just before them and a tag on the source block). The only important step left after running this is to run the automated backup setup code on another machine.

Requires manual step: inspect output file: /tmp/mw-setup, then run it

start=' *<source lang="bash"> *'
end=' *<\/source> *'
ruby <<'EOF' | sed -rn "/^$start$/,/^$end$/{s/^$start|$end$/# \0/;p}" > /tmp/mw-setup
require 'json'
puts JSON.parse(`curl 'https://ofswiki.org/w/api.php?\
action=query&titles=Mediawiki_Setup_Guide&prop=revisions&rvprop=content&\
format=json'`.chomp)['query']['pages'].values[0]['revisions'][0]['*']
EOF
chmod +x /tmp/mw-setup
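To see what the sed filter above keeps, here is a minimal sketch that runs the same range expression on a few lines of sample wikitext instead of the downloaded page. The function name extract_bash_blocks is hypothetical, GNU sed is assumed, and & is used in place of \0 as the more portable spelling of "the whole match".

```shell
# Hypothetical helper: read wikitext on stdin, print only the bash
# source blocks, with the <source> tag lines turned into comments.
extract_bash_blocks() {
    local start=' *<source lang="bash"> *'
    local end=' *<\/source> *'
    # print only lines between a start tag and an end tag (inclusive),
    # prefixing the tag lines themselves with "# "
    sed -rn "/^$start$/,/^$end$/{s/^$start|$end$/# &/;p}"
}
```

Prose outside the source tags is dropped, which is why the output file can be run as a script after inspection.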

Required Bash Functions

Here we define some small useful bash functions. This should be part of the same ~/mw_vars file if you are running the code step by step.

# identify if this is a debian based distro
isdeb() { command -v apt &>/dev/null; }
# tee unique. append each stdin line if it does not exist in the file
teeu () {
    local MAPFILE
    mapfile -t
    for line in "${MAPFILE[@]}"; do
        grep -xFq "$line" "$1" &>/dev/null || tee -a "$1" <<<"$line"
    done
}

# get and reset an extension/skin repository, and enable it
mw-clone() {
    local url=$1
    local original_pwd="$PWD"
    local target
    local re='[^/]*/[^/]*$' # last 2 parts of path
    [[ $url =~ $re ]] ||:
    target=$mw/${BASH_REMATCH[0]}
    if [[ ! -e $target/.git ]]; then
        git clone "$url" "$target"
    fi
    if ! cd "$target"; then
        echo "mw-clone error: failed cd $target" >&2
        exit 1
    fi
    git fetch
    git checkout -qf origin/$mw_branch || git checkout -qf origin/master
    git clean -xffd
    cd "$original_pwd"

}
mw-ext () {
    local ext
    for ext; do
        mw-clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/$ext
        if [[ -e $mw/extensions/$ext/extension.json ]]; then
            # new style extension
            teeu $mwc <<EOF
wfLoadExtension( '$ext' );
EOF
        else
            teeu $mwc <<EOF
require_once( "\$IP/extensions/$ext/$ext.php" );
EOF
        fi
    done
    # --quick is quicker than default flags,
    # but still add a sleep to make sure everything works right
    sudo -u $apache_user php $mw/maintenance/update.php -q --quick; sleep 1
}
mw-skin() {
    local skin=$1
    mw-clone https://gerrit.wikimedia.org/r/p/mediawiki/skins/$skin
    sed -i --follow-symlinks '/^wfLoadSkin/d' $mwc
    sed -i --follow-symlinks '/^\$wgDefaultSkin/d' $mwc
    teeu $mwc <<EOF
\$wgDefaultSkin = "${skin,,*}";
wfLoadSkin( '$skin' );
EOF
    sudo -u $apache_user php $mw/maintenance/update.php -q --quick; sleep 1
}

if command -v apt &>/dev/null; then
    apache_user=www-data
else
    apache_user=apache
fi
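Most of the idempotence of later code blocks comes from teeu. As a small illustration, the sketch below applies the same two settings twice to a scratch file and shows that the file does not grow the second time. teeu is repeated from above so the example runs on its own; teeu_demo is a hypothetical name used only here.

```shell
# Same definition as in the guide, repeated so this sketch runs on its own.
teeu () {
    local MAPFILE
    mapfile -t
    for line in "${MAPFILE[@]}"; do
        grep -xFq "$line" "$1" &>/dev/null || tee -a "$1" <<<"$line"
    done
}

# Hypothetical demo: apply the same settings twice to a scratch file
# and print how many lines the file ends up with.
teeu_demo() {
    local f i
    f=$(mktemp)
    for i in 1 2; do
        teeu "$f" >/dev/null <<'EOF'
$wgEnableUploads = true;
$wgUseInstantCommons = true;
EOF
    done
    wc -l < "$f"
    rm -f "$f"
}
```

Because teeu compares whole lines, it suits single-line settings. Multi-line blocks need a different guard.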

Install Mediawiki Dependencies

The best way to get core dependencies is to install the mediawiki package itself. Nothing about it will get in the way of using a version from upstream.

Mediawiki Main Page: the beginning of the official docs.

Manual:Installation_requirements: Overview of installation requirements.

Note, this guide needs a little adjustment before it will work with php7.0: make sure settings are still valid, update ini path.


# From here on out, exit if a command fails.
# This will prevent us from not noticing an important failure.
# We recommend setting this for the entire installation session.
# If you are running commands interactively, it might be best to
# put it in your ~/.bashrc temporarily.
set -eE -o pipefail
trap 'echo "$0:$LINENO:error: \"$BASH_COMMAND\" returned $?" >&2' ERR
source ~/mw_vars

if isdeb; then
    # main reference:
    # https://www.mediawiki.org/wiki/Manual:Running_MediaWiki_on_Ubuntu
    apt-get update
    DEBIAN_FRONTEND=noninteractive apt-get install -y imagemagick curl
    if apt-get install -s mediawiki &>/dev/null; then
        # mediawiki is packaged in jessie backports.
        DEBIAN_FRONTEND=noninteractive apt-get -y install php5-apcu mediawiki
    else
        # https://www.mediawiki.org/wiki/Manual:Installation_requirements
        if apt-get install -s php7.0 &>/dev/null; then
            # note, 7.0 is untested by the editor here, since it's not
            # available in debian 8. it's listed as supported
            # in the mediawiki page.
            # noninteractive to avoid mysql password prompt.
            DEBIAN_FRONTEND=noninteractive apt-get install -y apache2  \
                           default-mysql-server \
                           php7.0 php7.0-mysql libapache2-mod-php7.0 php7.0-xml \
                           php7.0-apcu php7.0-mbstring
        else
            # note: mbstring is recommended, but it's not available for php5 in
            # debian jessie.
            DEBIAN_FRONTEND=noninteractive apt-get install -y apache2 \
                           default-mysql-server \
                           php5 php5-mysql libapache2-mod-php5 php5-apcu
        fi
    fi
    service apache2 restart
else
    # note
    # fedora deps are missing a database, so some is translated from debian packages
    yum -y install mediawiki ImageMagick php-mysqlnd php-pecl-apcu mariadb-server

    systemctl restart mariadb.service
    systemctl enable mariadb.service
    systemctl enable httpd.service
    systemctl restart httpd.service
fi


# slightly different depending on if we already set the root pass
if echo exit|mysql -u root -p"$dbpass"; then
    # answer interactive prompts:
    # mysql root pass, change pass? no, remove anon users? (default, yes)
    # disallow remote root (default, yes), reload? (default, yes)
    echo -e "$dbpass\nn\n\n\n\n" | mysql_secure_installation
else
    # I had 1 less newline at the start when doing ubuntu 14.04,
    # compared to debian 8, so can't say this is especially portable.
    # It won't hurt if it fails.
    echo -e "\n\n$dbpass\n$dbpass\n\n\n\n\n" | mysql_secure_installation
fi
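The set -eE -o pipefail and trap lines at the top of the block above are what make a failed command stop the install instead of passing unnoticed. A minimal sketch of that behavior, run in a child bash so it does not affect the current shell (err_trap_demo is a hypothetical name):

```shell
# Hypothetical demo: run a tiny script under the same settings and
# show that it stops at the first failing command.
err_trap_demo() {
    bash <<'EOF'
set -eE -o pipefail
trap 'echo "error: \"$BASH_COMMAND\" returned $?" >&2' ERR
echo "step 1"
false
echo "step 2 is never reached"
EOF
}
```

The child shell prints step 1, reports the failing command on stderr through the ERR trap, and exits with a non-zero status before the last echo.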


Skippable notes


php[5]-mysqlnd is a faster mysql driver package, but the default in Debian is php-mysql, apparently because some non-mediawiki packages are not compatible with mysqlnd. If you run into this issue, simply use the php-mysql package.


Additional packages rationale

  • ImageMagick is recommended.
  • Gui install and mediawikiwiki:Manual:Cache recommend the apc package.
  • Clamav for virus scanning of uploads is mentioned in the mediawiki manual. However, wikipedia doesn't seem to do it, so it doesn't seem like it's worth bothering. It also makes uploading a set of images take twice as long on broadband.

Install Mediawiki

Here, we download Mediawiki from git (mediawikiwiki:Download_from_Git), or reset our installation if it is already there, and create the wiki database. Reference: mediawikiwiki:Manual:Installing_MediaWiki

mkdir -p $mw
cd $mw
# clone only if the repository is not already there
if [[ ! -e .git ]]; then
    git clone https://gerrit.wikimedia.org/r/p/mediawiki/core.git .
fi
# to see available branches: https://www.mediawiki.org/wiki/Version_lifecycle
# and
# git branch -r
git checkout -f origin/$mw_branch
git clean -ffxd
# apply librejs patch
curl "https://iankelling.org/git/?p=mediawiki-librejs-patch;a=blob_plain;f=mediawiki-1.28-librejs.patch;hb=HEAD" | patch -r - -N -p1
# Get the php libraries wmf uses. Based on:
# https://www.mediawiki.org/wiki/Download_from_Git#Fetch_external_libraries
if [[ ! -e vendor/.git ]]; then
    git clone  https://gerrit.wikimedia.org/r/p/mediawiki/vendor.git
fi
cd vendor
git checkout -f origin/$mw_branch
cd ..

# Drop any previous database which may have been installed while testing.
# If upgrading, we should have a db backup which will get restored.
# https://www.mediawiki.org/wiki/Manual:Upgrading
mysql -u root -p$dbpass <<'EOF' ||:
drop database my_wiki;
exit
EOF
php $mw/maintenance/install.php --pass $wikipass --scriptpath /w \
    --dbuser root --dbpass $dbpass "$mwdescription" "$wikiuser"
teeu $mwc <<'EOF'
# lock down the wiki to only the initial owner until anti-spam measures are put in place
# limit edits to registered users
$wgGroupPermissions['*']['edit'] = false;
# don't allow any account creation
$wgGroupPermissions['*']['createaccount'] = false;
EOF


Note: When testing, you may need to clear the apc cache to see changes take effect in the browser. The simplest solution is to just restart apache. http://stackoverflow.com/questions/911158/how-to-clear-apc-cache-entries

Skippable Notes

If we wanted to reset our installation, but leave the extension repositories alone, alter the command above to be git clean -fxd
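The difference between the two flags is that git clean leaves nested git repositories (such as cloned extensions) alone unless -f is given twice. The sketch below demonstrates this on a throwaway checkout. It assumes git is installed, and git_clean_demo and the directory names are hypothetical.

```shell
# Hypothetical demo: build a scratch checkout containing a nested
# repository (standing in for an extension) and compare the two flags.
git_clean_demo() {
    local tmp
    local -a id=(-c user.name=demo -c user.email=demo@example.invalid)
    tmp=$(mktemp -d)
    git init -q "$tmp/w"
    git -C "$tmp/w" "${id[@]}" commit -q --allow-empty -m init
    mkdir -p "$tmp/w/extensions"
    git init -q "$tmp/w/extensions/Demo"
    git -C "$tmp/w/extensions/Demo" "${id[@]}" commit -q --allow-empty -m init
    touch "$tmp/w/stray.txt"

    # single -f: untracked files go, nested repositories stay
    git -C "$tmp/w" clean -q -fxd
    [[ -e $tmp/w/extensions/Demo/.git ]] && echo "after -fxd: nested repo kept"
    # double -f: nested repositories are removed as well
    git -C "$tmp/w" clean -q -ffxd
    [[ -e $tmp/w/extensions/Demo ]] || echo "after -ffxd: nested repo removed"
    rm -rf "$tmp"
}
```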

Rationale for choosing git sources

Upstream vs distro packages. Upstream is responsive, and it's distributed within a single directory, so packaging does not integrate with the distro's filesystem. The only potential value would be fewer bugs by using stable versions, but we choose not to make that tradeoff.

Why use git over zip file releases? Mediawiki supports git usage through release branches which get post-release fixes. This means we can auto-update, get more granular fixes, manage updates more easily, and roll back.

Configure Apache

Note, non-debian based installs: modify instructions below to use /etc/httpd/conf.d/$mwdomain.conf, and don't run a2ensite.

I use scripts I maintain separately to set up Let's Encrypt certificates and apache config: (url pending).

If you are doing a test setup on your local machine, you can make your domain resolve to your local test installation, then remove it later when you are done. Note, you will need a non-local site to get Let's Encrypt certificates, and then transfer them locally, or disable ssl from the apache config (neither is covered here) and replace all instances of https in these instructions with http. Another option is to get a cheap 2 dollar domain for your test site.

Not for production:

teeu /etc/hosts<<<"127.0.0.1 $mwdomain"

To not use my scripts, and still use Let's Encrypt: follow this doc page: https://letsencrypt.org/getting-started/. It's a little long winded, so I would boil it down to this:

Optional & requires additional steps:

git clone https://github.com/certbot/certbot
cd certbot
./certbot-auto --apache
cd /etc/apache2/sites-available
mv 000-default-le-ssl.conf $mwdomain.conf
rm ../sites-enabled/000-default-le-ssl.conf
# edit $mwdomain.conf, so documentroot is /var/www/$mwdomain/html
# and ServerName is $mwdomain
a2ensite $mwdomain.conf

Then, copy the input to apache-site below and insert it into the apache config.

Here, we use some scripts to automate setting up the Let's Encrypt cert and the apache config.

temp=$(mktemp -d)
cd $temp
git_site=https://iankelling.org/git
git clone $git_site/acme-tiny-wrapper
l=$mw/../../logs
mkdir -p $l

acme-tiny-wrapper/acme-tiny-wrapper -t $mwdomain

git clone $git_site/basic-https-conf
{ cat <<EOF
ServerAdmin $mw_email
RewriteEngine On
# use short urls https://www.mediawiki.org/wiki/Manual:Short_URL
RewriteRule ^/?wiki(/.*)?\$ %{DOCUMENT_ROOT}/w/index.php [L]
# make the site's root url go to our main page
RewriteRule ^/*\$ %{DOCUMENT_ROOT}/w/index.php [L]
EOF
find -L $(readlink -f $mw) -name .htaccess \
    | while read line; do
    echo -e "<Directory ${line%/.htaccess}>\n $(< $line)\n</Directory>";
done
} | basic-https-conf/apache-site -r ${mw%/*} - $mwdomain
cd
rm -rf $temp

Now mediawiki should load in your browser at $mwdomain .

Allow proper search bots and internet archiver bots, via Mediawiki:Robots.txt, and install the default skin.

dd of=$mw/../robots.txt <<'EOF'
User-agent: *
Disallow: /w/
User-agent: ia_archiver
Allow: /*&action=raw
EOF
mw-skin Vector

Skippable Notes

This section assumes we are redirecting www to a url without www.

Apache recommends moving .htaccess rules into its config for performance. So we look for .htaccess files from mediawiki and copy their contents into this config. In modern apache versions, we would have to explicitly set options like AllowOverride to allow .htaccess files to take effect.
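To make the .htaccess step concrete, here is a sketch of the same find loop pulled out into a function, so its output can be inspected without touching the apache config. The name htaccess_to_directory_blocks is hypothetical, and printf is used instead of echo -e so backslashes in the rules are not interpreted.

```shell
# Hypothetical helper: print one <Directory> block per .htaccess file
# found under the given root, using the same loop as the guide.
htaccess_to_directory_blocks() {
    local root=$1 line
    find -L "$root" -name .htaccess | sort | while read -r line; do
        # the directory is the file path minus the trailing /.htaccess
        printf '<Directory %s>\n %s\n</Directory>\n' \
            "${line%/.htaccess}" "$(< "$line")"
    done
}
```

For example, htaccess_to_directory_blocks "$(readlink -f $mw)" would show the blocks that get piped into apache-site above.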

Mediawiki Settings

Overall reference: mediawikiwiki:Manual:Configuration_settings.

Settings which the gui setup prompts for but aren't set by the automated install script.

teeu $mwc<<EOF
\$wgServer = "https://$mwdomain";
\$wgDBserver = "localhost";
\$wgRightsUrl = "$mw_RightsUrl";
\$wgRightsText = "$mw_RightsText";
\$wgRightsIcon = "$mw_RightsIcon";
EOF

Settings asked by the gui setup which are different from the install script defaults. They are different because the install script defaults are the most compatible and unobtrusive.

teeu $mwc<<EOF
\$wgPasswordSender = "$mw_email";
\$wgEmergencyContact = "$mw_email";
\$wgEnotifUserTalk = true; # UPO
\$wgEnotifWatchlist = true; # UPO
\$wgMainCacheType = CACHE_ACCEL;
\$wgEnableUploads = true;
\$wgUseInstantCommons = true;
\$wgPingback = true;
EOF

Other misc settings

teeu $mwc <<'EOF'
# from https://www.mediawiki.org/wiki/Manual:Short_URL
$wgArticlePath = "/wiki/$1";

# https://www.mediawiki.org/wiki/Manual:Combating_spam
# check that url if our precautions don't work
# not using nofollow is good practice, as long as we avoid spam.
$wgNoFollowLinks = false;
# Allow user customization.
$wgAllowUserCss = true;
# use imagemagick over GD
$wgUseImageMagick = true;
# manual says this is not production ready, I think that is mostly
# because they are using MobileFrontend extension instead, which gives
# an even cleaner more minimal view, I plan to try setting it up
# sometime but this seems like a very nice improvement for now.
$wgVectorResponsive = true;
EOF


# https://www.mediawiki.org/wiki/Manual:Configuring_file_uploads
# Increase from default of 2M to 100M.
# This will at least allow high res pics etc.
php_ini=$(php -r 'echo(php_ini_loaded_file());')
sed -i --follow-symlinks 's/^\(upload_max_filesize\|post_max_size\)\b.*/\1 = 100M/' $php_ini
if isdeb; then
    service apache2 restart
else
    systemctl restart httpd.service
fi

# if you were to install as a normal user, you would need this for images
# sudo usermod -aG $apache_user $USER

# this doesn't propagate right away
chgrp -R $apache_user $mw/images
chmod -R g+w $mw/images
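If you want to check what the upload limit edit does before it touches the real php.ini, the same sed expression can be run as a filter on sample text. This is a sketch assuming GNU sed. raise_upload_limits is a hypothetical name.

```shell
# Hypothetical helper: apply the guide's sed expression to ini text on
# stdin instead of editing the real php.ini in place (GNU sed assumed).
raise_upload_limits() {
    sed 's/^\(upload_max_filesize\|post_max_size\)\b.*/\1 = 100M/'
}
```

Only lines that start with one of the two setting names are rewritten, so a setting that is commented out with a leading ; is left alone. You could preview the real change with raise_upload_limits < "$php_ini" | diff "$php_ini" - .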

Style settings. Omit to use a different style.

teeu $mwc <<'EOF'
$wgLogo = null;
#$wgFooterIcons = null;
EOF
# Make the toolbox go into the drop down.
cd $mw/skins/Vector
if ! git remote show ian-kelling &>/dev/null; then
    git remote add ian-kelling https://iankelling.org/git/forks/Vector
fi
git fetch ian-kelling
git checkout ian-kelling/${mw_branch}-toolbox-in-dropdown

Install and Configure Mediawiki Extensions

When installing extensions on a wiki with important content, backup first as a precaution.

Extensions with no configuration needed

Name                             Description
Extension:Cite                   Have references in footnotes*
Extension:CiteThisPage           Ability to generate citations to pages in a variety of styles*
Extension:CheckUser              Get ip addresses from inside mediawiki so you can ban users
Extension:CSS                    Allows CSS stylesheets to be included in specific articles
Extension:Echo                   Notification subsystem for usage by other extensions
Extension:Gadgets                UI extension system for users*
Extension:ImageMap               Links for a region of an image*
Extension:Interwiki              Tool for nice links to other wikis*
Extension:News                   Embed or rss recent changes
Extension:Nuke                   Mass delete of pages, in the case of spam*
Extension:ParserFunctions        Useful for templates*
Extension:Poem                   Useful for formatting things various ways*
Extension:Renameuser             Allows bureaucrats to rename user accounts*
Extension:SyntaxHighlight_GeSHi  Source code highlighting*
Extension:Variables              Define per-page variables

* = Comes with the MediaWiki default download.

mw-ext Cite CiteThisPage CheckUser CSS Echo Gadgets ImageMap Interwiki News \
       Nuke ParserFunctions Poem Renameuser SyntaxHighlight_GeSHi Variables
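For each name, mw-ext picks between the two ways of enabling an extension by checking for an extension.json file. A sketch of just that decision, separated from the cloning and the database update, looks like this. ext_load_line is a hypothetical name, and the first argument stands in for $mw.

```shell
# Hypothetical helper: print the LocalSettings.php line that would
# enable extension $2 under MediaWiki root $1.
ext_load_line() {
    local root=$1 ext=$2
    if [[ -e $root/extensions/$ext/extension.json ]]; then
        # new style extension with a registration file
        echo "wfLoadExtension( '$ext' );"
    else
        # older extension loaded through its PHP entry point
        echo "require_once( \"\$IP/extensions/$ext/$ext.php\" );"
    fi
}
```

Running ext_load_line $mw Cite after the clone shows which form ended up in LocalSettings.php for that extension.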


Extension:AntiSpoof: Disallow usernames with unicode trickery to look like existing names

mw-ext AntiSpoof
# recommended setup script to account for existing users
sudo -u $apache_user php $mw/extensions/AntiSpoof/maintenance/batchAntiSpoof.php


Extension:Wikidiff2: Faster and international character supported page diffs

I used the packaged version since this is a C++ extension and probably not very tied to the Mediawiki version. It isn't packaged in fedora, and I haven't gotten around to testing and adding the code to compile it for fedora.

if isdeb; then
    apt-get -y install php-wikidiff2
    teeu $mwc <<'EOF'
$wgExternalDiffEngine = 'wikidiff2';
EOF
    dir=$(dirname $(php -r 'echo(php_ini_loaded_file());'))/../apache2/conf.d
    ln -sf ../../mods-available/wikidiff2.ini $dir
    service apache2 restart
fi


Extension:Math Display equations

mw-ext Math
# php5-curl according to Math readme
if isdeb; then
    curl_pkg=php7.0-curl
    if ! apt-get -s install $curl_pkg &>/dev/null; then
        curl_pkg=php5-curl
    fi
    apt-get -y install latex-cjk-all texlive-latex-extra texlive-latex-base \
            ghostscript imagemagick ocaml $curl_pkg make
else
    # todo, php5-curl equivalent on fedora
    yum -y install texlive-cjk ghostscript ImageMagick texlive ocaml
fi
if isdeb; then
    service apache2 restart
else
    systemctl restart httpd.service
fi

cd $mw/extensions/Math/math; make # makes texvc
cd $mw/extensions/Math/texvccheck; make

teeu $mwc <<'EOF'
# Enable MathJax as rendering option
$wgUseMathJax = true;
# Enable LaTeXML as rendering option
$wgMathValidModes[] = 'latexml';
# Set LaTeXML as default rendering option, because it is nicest
$wgDefaultUserOptions['math'] = 'latexml';
EOF

Skippable notes

There is no current list of package dependencies so I took dependencies from the mediawiki-math package in Debian 7. Fedora didn't have a mediawiki math package, so I just translated from debian. Ocaml is for math png rendering, as a backup option to the nicer looking LatexML and MathJax. Debian has a texvc package, but it didn't work right for me, plus it required additional configuration in mediawiki settings.


Extension:SpamBlacklist: Import/create IP blacklists, mainly for spam

Comes with MediaWiki.

mw-ext SpamBlacklist
if ! grep -F '$wgSpamBlacklistFiles = array(' $mwc &>/dev/null; then
    tee -a $mwc <<'EOF'
$wgEnableDnsBlacklist = true;
$wgDnsBlacklistUrls = array( 'xbl.spamhaus.org', 'dnsbl.tornevall.org' );

ini_set( 'pcre.backtrack_limit', '10M' );
$wgSpamBlacklistFiles = array(
   "[[m:Spam blacklist]]",
   "http://en.wikipedia.org/wiki/MediaWiki:Spam-blacklist"
);
EOF
fi
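Multi-line settings like this array are guarded with grep and tee -a rather than teeu, since teeu works line by line and would skip a line such as ); if an identical line were already in the file. The pattern can be sketched as a small function. append_block_once is a hypothetical name, not something this guide defines.

```shell
# Hypothetical helper: append the block on stdin to file $2 unless a
# line containing marker $1 is already present.
append_block_once() {
    local marker=$1 file=$2
    if ! grep -Fq -- "$marker" "$file"; then
        cat >> "$file"
    fi
}
```

The marker should be a line that only this block would contain, which is why the code above uses the opening line of the array.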

Extension:TitleBlacklist: Anti-spam

Comes with Mediawiki.

mw-ext TitleBlacklist
if ! grep -F '$wgTitleBlacklistSources = array(' $mwc &>/dev/null; then
    tee -a $mwc <<'EOF'
$wgTitleBlacklistSources = array(
    array(
         'type' => 'local',
         'src'  => 'MediaWiki:Titleblacklist',
    ),
    array(
         'type' => 'url',
         'src'  => 'http://meta.wikimedia.org/w/index.php?title=Title_blacklist&action=raw',
    ),
);
EOF
fi

Extension:WikiEditor: Editing box extras and a fast preview tab

Comes with MediaWiki.

mw-ext WikiEditor
teeu $mwc <<'EOF'
# Enable Wikieditor by default
$wgDefaultUserOptions['usebetatoolbar'] = 1;
$wgDefaultUserOptions['usebetatoolbar-cgd'] = 1;

# Display the Preview and Changes tabs
$wgDefaultUserOptions['wikieditor-preview'] = 1;
EOF

Extension:CategoryTree: Enables making nice outlines of pages in a category

mw-ext CategoryTree
teeu $mwc <<'EOF'
# Mediawiki setting dependency for CategoryTree
$wgUseAjax = true;
EOF

Extension:AbuseFilter: Complex abilities to stop abuse

Used by big wiki sites. As a smaller site, we won't use it much, but it's good to have. Its page suggests a few defaults:

mw-ext AbuseFilter
teeu $mwc<<'EOF'
$wgGroupPermissions['sysop']['abusefilter-modify'] = true;
$wgGroupPermissions['*']['abusefilter-log-detail'] = true;
$wgGroupPermissions['*']['abusefilter-view'] = true;
$wgGroupPermissions['*']['abusefilter-log'] = true;
$wgGroupPermissions['sysop']['abusefilter-private'] = true;
$wgGroupPermissions['sysop']['abusefilter-modify-restricted'] = true;
$wgGroupPermissions['sysop']['abusefilter-revert'] = true;
EOF

Extension:ConfirmEdit: Custom Captcha.

Uses captchaArray defined in mw_vars. Comes with MediaWiki.

mw-ext ConfirmEdit
captchaArray
teeu $mwc <<'EOF'
wfLoadExtension( 'ConfirmEdit/QuestyCaptcha' );
$wgCaptchaClass = 'QuestyCaptcha';
# only captcha on registration
$wgGroupPermissions['user'         ]['skipcaptcha'] = true;
$wgGroupPermissions['autoconfirmed']['skipcaptcha'] = true;
EOF
if ! grep -Fx 'foreach ( $localSettingsQuestyQuestions as $key => $value ) {' $mwc; then
    tee -a $mwc <<'EOF'
foreach ( $localSettingsQuestyQuestions as $key => $value ) {
        $wgCaptchaQuestions[] = array( 'question' => $key, 'answer' => $value );
}
EOF
fi

Enable account creation that we initially disabled.

sed -i --follow-symlinks "/\\\$wgGroupPermissions\\['\\*'\\]\\['createaccount'\\] = false;/d" $mwc
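The escaping in that sed command is dense, so here is a sketch that runs the same expression on a scratch copy of the two lock-down lines. It assumes GNU sed, and unlock_demo is a hypothetical name. After the shell removes one level of backslashes, sed sees \$, \[, \* and \] as literal characters.

```shell
# Hypothetical demo: run the guide's sed expression on a scratch copy
# of the two lock-down settings and print what is left.
unlock_demo() {
    local f
    f=$(mktemp)
    cat > "$f" <<'EOF'
$wgGroupPermissions['*']['edit'] = false;
$wgGroupPermissions['*']['createaccount'] = false;
EOF
    sed -i "/\\\$wgGroupPermissions\\['\\*'\\]\\['createaccount'\\] = false;/d" "$f"
    cat "$f"
    rm -f "$f"
}
```

Only the createaccount line is deleted. The edit restriction for anonymous users stays in place.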

Additional Configuration with Pywikibot

There are quite a few special pages which act like variables to configure special wiki content and style. A big part of this wiki's style is configured in this section. We use Pywikibot to automate editing those pages.


Pywikibot Install

Manual:Pywikibot/Installation

# get repo
if [[ ! -e ~/pywikibot/.git ]]; then
    git clone --recursive \
        https://gerrit.wikimedia.org/r/pywikibot/core.git ~/pywikibot
fi
cd ~/pywikibot
#updating
git pull --all
git submodule update


Pywikibot Configuration

Relevant docs: Manual:Pywikibot/Use_on_non-WMF_wikis, Manual:Pywikibot/Quick_Start_Guide


cd $HOME/pywikibot
dd of=user-config.py <<EOF
mylang = 'en'
usernames["$mwfamily"]['en'] = u'$wikiuser'
family = "$mwfamily"
console_encoding = 'utf-8'
password_file = "secretsfile"
EOF

dd of=secretsfile <<EOF
("$wikiuser", "$wikipass")
EOF

# it won't overwrite an existing file. Remove it if one exists
rm -f  pywikibot/families/${mwfamily}_family.py
if isdeb; then
    apt-get install -y python-requests
else
    yum -y install python-requests
fi

python generate_family_file.py https://$mwdomain/wiki/Main_Page "$mwfamily"

# Note, this is needed only for an ssl site
tee -a pywikibot/families/${mwfamily}_family.py<<'EOF'
    def protocol(self, code):
        return 'https'
EOF


Pywikibot Script

This will take a full minute or so because the bot waits a few seconds between edits. Useful doc: mediawikiwiki:Pywikipediabot/Create_your_own_script.

cd "$HOME/pywikibot"

dd of=scripts/${mwfamily}_setup.py<<EOF
import pywikibot
import time
import sys
site = pywikibot.Site()
def x(p, t=""):
   page = pywikibot.Page(site, p)
   page.text = t
   #force is for some anti-bot thing, not necessary in my testing, but might as well include it
   page.save(force=True)

# Small/medium noncommercial wiki should be fine with no privacy policy
# based on https://www.mediawiki.org/wiki/Manual:Footer
x("MediaWiki:Privacy")

# licenses for uploads. Modified from the mediawiki's wiki
x("MediaWiki:Licenses", u"""* Same as this wiki's text (preferred)
** CC BY-SA or GFDL| Creative Commons Attribution ShareAlike or GNU Free Documentation License
* Others:
** Unknown_copyright|I don't know exactly
** PD|PD: public domain
** CC BY|Creative Commons Attribution
** CC BY-SA|Creative Commons Attribution ShareAlike
** GFDL|GFDL: GNU Free Documentation License
** GPL|GPL: GNU General Public License
** LGPL|LGPL: GNU Lesser General Public License""")
x("MediaWiki:Copyright", '$mw_license')
x("MediaWiki:Mainpage-description", "$mwdescription")



# The rest of the settings are for the site style

# Remove various clutter
x("MediaWiki:Lastmodifiedat")
x("MediaWiki:Disclaimers")
x("MediaWiki:Viewcount")
x("MediaWiki:Aboutsite")
# remove these lines from sidebar
# ** recentchanges-url|recentchanges
# ** randompage-url|randompage
# ** helppage|help
x("MediaWiki:Sidebar", """* navigation
** mainpage|mainpage-description
* SEARCH
* TOOLBOX
* LANGUAGES""")

# remove side panel
# helpful doc: https://www.mediawiki.org/wiki/Manual:Interface/Sidebar
x("mediawiki:Common.css", """/* adjust sidebar to just be home link and up top  */
/* panel width increased to fit full wiki name. */
/* selectors other than final id are for increasing priority of rule */
div#mw-panel { top: 10px; padding-top: 0em; width: 20em }
div#footer, #mw-head-base, div#content { margin-left: 1em; }
#left-navigation { margin-left: 1em; }


/* logo, and toolbar hidden */
#p-logo, div#mw-navigation div#mw-panel #p-tb {
   display:none;
}

div#mw-content-text {
    max-width: 720px;
}
""")
EOF

# this can spam a warning, so uniq it
python pwb.py ${mwfamily}_setup |& uniq


Skippable Notes

The docs suggest manually entering the pass with python pwb.py login.py, then it should stay logged in. That didn't work for me, and anyway, we want automation, so we use the secrets file method.

The family name, and all its duplications, is documented as supposed to be $wgSitename, but it works fine using any name.

Automatic Backups

Here we will have a daily cronjob where a backup host sshs to the mediawiki host, makes a backup then copies it back. Copy ~/mw_vars to the backup host at /root/mw_vars. Setup passwordless ssh from the backup host to the mediawiki host. Then run this code on the backup host. This will make a versioned backup of the wiki to ~/backup.

backup_script=/etc/cron.daily/mediawiki_backup
sudo dd of=$backup_script <<'EOFOUTER'
#!/bin/bash
# if we get an error, keep going but return it at the end
last_error=0
trap 'last_error=$?' ERR
source ~/mw_vars
# No strict because the host is likely not named the same as
# the domain.
ssh="ssh -oStrictHostKeyChecking=no"
logfile=/var/log/${mwdomain}_backup.log
{
echo "#### starting backup at $(date) ####"
$ssh root@$mwdomain <<ENDSSH
set -x
tee -a $mwc<<'EOF'
\$wgReadOnly = 'Dumping Database, Access will be restored shortly';
EOF
mkdir -p ~/wiki_backups
mysqldump -p$dbpass --default-character-set=binary my_wiki  > ~/wiki_backups/wiki_db_backup
sed -i '\$ d' $mwc # delete read only setting
ENDSSH
# add no strict option to the defaults

rdiff() { rdiff-backup --remote-schema "$ssh -C  %s rdiff-backup --server" "$@"; }
set -x
rdiff root@$mwdomain::/root/wiki_backups ~/backup/${mwdomain}_wiki_db_backup
rdiff root@$mwdomain::$mw ~/backup/${mwdomain}_wiki_file_backup
set +x
echo "=== ending backup at $(date) ===="
}  &>>$logfile
if [[ $last_error != 0 ]]; then
    echo "backup for $mwdomain failed. See $logfile"
fi
exit $last_error
EOFOUTER

sudo chmod +x $backup_script
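The backup script makes the wiki read-only by appending one line to LocalSettings.php and later deleting the last line of the file. A sketch of those two edits as separate functions, which can be tried on a scratch file, is below. The names set_read_only and clear_read_only are hypothetical, and GNU sed is assumed.

```shell
# Hypothetical helpers mirroring the two edits the backup script makes
# to LocalSettings.php around the database dump.
set_read_only() {
    tee -a "$1" >/dev/null <<'EOF'
$wgReadOnly = 'Dumping Database, Access will be restored shortly';
EOF
}
clear_read_only() {
    sed -i '$ d' "$1"   # delete the last line, i.e. the one just appended
}
```

This round trip leaves the file as it was only if nothing else appends to LocalSettings.php between the two calls, so it is best not to run other setup code during a backup.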

Optional & requires additional steps

If you are like most people and don't use the old-school mail spool, set up the backup system to send mail externally so you are notified if it fails. For examples of how to do this, see stackoverflow or the script I use.

Restoring Backups

Whenever you implement a backup system, you should test that restoring the backup works.

You should be able to restore your wiki to a new machine by repeating all install steps, then restoring the database and the images directory. I've done this many times. However, we back up the entire Mediawiki directory in case you forget to record a step or some corner case happens. Since most people don't record the steps they took to set up Mediawiki, this is also the officially recommended method. In the code below we restore only the database and images folder from the full backup. You can try this after setting up a wiki from scratch. If it doesn't work, you know your fresh setup is not replicating your backed-up wiki correctly. In that case, you can fall back to doing a full restore by copying the full directory instead of just the images. See mediawikiwiki:Manual:Restoring a wiki from backup if you run into any problems.

To test a backup restore:

  1. Do a backup of your wiki with some content in it, as described in the previous section
  2. Move your mediawiki install directory, or setup Mediawiki on a new machine
  3. Re-execute the mediawiki install steps
  4. Change REPLACE_ME in the code below (as in the backup section so you get the right variables)
  5. Execute the code on the backup machine.

Optional

#!/bin/bash
source ~/mw_vars
# restore the most recent backup; --force allows overwriting an existing restore directory
restore="rdiff-backup --force -r now"
$restore ~/backup/${mwdomain}_wiki_file_backup /tmp/wiki_file_restore
$restore ~/backup/${mwdomain}_wiki_db_backup /tmp/wiki_db_restore
o=-oStrictHostKeyChecking=no
# copy the restored images and database dump to the wiki host
scp $o -r /tmp/wiki_file_restore/images/* root@$mwdomain:$mw/images
scp $o -r /tmp/wiki_db_restore root@$mwdomain:/tmp
# unquoted EOF: $mw and $dbpass are expanded on this machine before being sent
ssh $o root@$mwdomain <<EOF
set -e
chmod -R g+w $mw/images
chgrp -R www-data $mw/images
mysql -u root -p$dbpass my_wiki < /tmp/wiki_db_restore/wiki_db_backup
php $mw/maintenance/update.php
EOF

Then browse to your wiki and see if everything appears to work.

Updates

Subscribe to MediaWiki-announce to get release and security announcements.

For updates, we simply git pull all the repos, then run the maintenance script. This should be done after a backup. We recommend automatic updates to get security fixes and since not much is changing on the release branch. In this example, we update at 5 am daily (1 hour after the automatic backup example).

Major version upgrades should be done manually, and it is recommended to use a new installation directory and the same procedure as for backup & restore. Official reference: Manual:Upgrading

Minor updates script:

s=/etc/cron.daily/mediawiki_update
dd of=$s<<'EOF'
#!/bin/bash
source ~/mw_vars
update() {
    dir=$1
    cd $mw
    [[ -d $dir ]] || return 1
    cd $dir
    branch=$(git describe --all)
    branch=${branch#remotes/}
    git fetch --all -q
    new_head=$(git rev-parse $branch)
    log=$(git log HEAD..$new_head)
    if [[ ! $log ]]; then
        return 1
    fi
    pwd
    echo "$log"
    git checkout -qf $new_head
    cd $mw
    return 0
}
for dir in extensions/* skins/* vendor; do
    update "$dir" ||:
done
if update .; then
    curl "https://iankelling.org/git/?p=mediawiki-librejs-patch;a=blob_plain;f=mediawiki-1.28-librejs.patch;hb=HEAD" | patch -r - -N -p1
fi
php $mw/maintenance/update.php -q --quick
EOF
chmod +x $s

Upgrading Major Versions

Reference documentation is at mediawikiwiki:Manual:Upgrading

My strategy is:

  1. Read the "Upgrade notices for MediaWiki administrators" on the upgrade version and any skipped versions at mediawikiwiki:Version_lifecycle.
  2. Set up a blank test wiki with the new version.
  3. Back up the old database, restore it to the new wiki, run php maintenance/update.php.
  4. If everything looks good, repeat and replace the old wiki with the new one.

Stopping Spam

There is a balance between effective anti-spam measures and blocking/annoying contributors. Mediawiki documentation on how to combat spam is not very good, but it has improved over time: manual: Combating Spam. It's possible for a spammer to quickly make thousands of edits, and there is no good documentation on purging lots of spam, so you should have a good strategy up front. My current strategy is threefold, and is limited to small/medium wikis:

  • Find new spam quickly, revert it & ban the user.
    • Watch, and get notified of changes on, all primary content pages: in Special:Preferences, at the bottom of the page, set an email address, then turn on "Email me also for minor edits of pages and files."
    • Use an rss/atom feed reader, and subscribe to recent changes across the wiki. Newer browsers have an rss feed subscribe button you can click after going to Special:RecentChanges. If that is not available, you can construct the proper url based on these instructions.
  • Require registration to edit, and a custom captcha question on registration.
  • Install all non-user inhibiting anti-spam extensions / settings that take a reasonable amount of time to figure out.
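For the feed reader item above, here is a rough sketch of constructing the url by hand. It assumes the commonly documented feed=atom query parameter on Special:RecentChanges, and a hypothetical wiki at wiki.example.org whose index.php lives under /w. Adjust both to your own wiki:

```shell
#!/bin/bash
# Build the Recent Changes feed url for a wiki.
# $1: url of the wiki's index.php (the value used below is a made-up example)
# $2: feed format, atom (default) or rss
recent_changes_feed_url() {
    index_url=$1
    format=${2:-atom}
    printf '%s?title=Special:RecentChanges&feed=%s\n' "$index_url" "$format"
}

recent_changes_feed_url "https://wiki.example.org/w/index.php"
```

Paste the printed url into your feed reader as a new subscription.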

Choosing Extensions

Mediawiki.org has pages for ~5200 extensions. Mediawiki maintains ~700 extensions in its git repo. Wikipedia uses over 100 extensions. Major distributors package ~36 extensions. We looked closely at the distributors' extensions and briefly at the Mediawiki repo extensions. We haven't found any other useful list or recommendations.

Here are brief descriptions of extensions that are part of distributions and why they were rejected for this wiki.

  • Footnote: deprecated in newer versions.
  • InputBox: add html forms to pages. Can't imagine using it. Would install if I did.
  • LocalisationUpdate: update localization only. I'm fine updating all of mediawiki, there aren't many updates.
  • NewestPages: a page creation history that doesn't expire like recent-changes. Meh.
  • NewUserNotif: send me a notification when a user registers. Seems like an excessive notification.
  • Openid: poor UI. 2 pages & 2 links <login> <login with openid> which is confusing & ugly.
  • Pdfhandler: gallery of pages from a pdf file. Can't imagine using it. Would install if I did.
  • RSSReader: embed an rss feed. Can't imagine using it. Would install if I did.
  • Semantic: seems like a lot of trouble around analyzing kinds of data which my wiki will not have.
  • Validator: dependency of Semantic.

Misc Notes

Web Analytics Software

I do not recommend using google analytics: it's proprietary software and gives private information of your website visitors to google for them to make money. Piwik has the best features and I recommend it, but I use goaccess because it is simpler to manage and good enough.

Mediawiki Documentation Quality

Overall the documentation is good, but like wikipedia, it depends.

The closer a topic is to core functionality and commonly used features, the better the documentation is likely to be. Wikimedia Foundation (WMF) has competing priorities: being a good upstream to mediawiki users, and being good for their own sites. Because of that, plus the multitude of unconnected extension developers, official documentation is sometimes neglected in favor of bug reports, readme files, comments, code, and unpublished knowledge. Users' documentation edits vary in quality, and often aren't reviewed by anyone. If you run into an issue, try viewing/diffing the most recent version of a page by each of the last few editors.

One issue is that mediawiki.org needs a lot of organizing, deleting, and verifying of material, and that is relatively unpopular, tedious, and sometimes difficult work. The discussion pages of mediawiki.org are a wasteland of unanswered questions and outdated conversations, which is poor form for a wiki. However, if you communicate well, you can get great help from their support forum, irc, and mailing list.


Bash here documents, EOF vs 'EOF'

Here documents are used throughout this page, and some people may not be aware of a small but important piece of syntax. When the delimiter is quoted, as in <<'EOF', the contents of the here document are taken exactly verbatim. Otherwise $ and ` are expanded as in bash, and must be escaped by prefixing them with \, which itself must then also be escaped to be used literally.
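As a small illustration of the difference (the variable name is made up for this example), the same kind of body gives different results depending on whether the delimiter is quoted:

```shell
#!/bin/bash
# "name" is a hypothetical variable used only for this illustration.
name=wiki

# Unquoted delimiter: $name is expanded, \$ yields a literal dollar sign.
unquoted=$(cat <<EOF
expanded: $name
escaped: \$name
EOF
)

# Quoted delimiter: the body is taken verbatim, nothing is expanded.
quoted=$(cat <<'EOF'
verbatim: $name
EOF
)

echo "$unquoted"
echo "$quoted"
```

This is why scripts written to a file with a quoted delimiter keep their own variables unexpanded until they run, while blocks with an unquoted delimiter have variables filled in at the time the here document is read.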


Mediawiki automation tools survey 7/2014

Barely maintained:

Getting basic maintenance

Actively developed, used by wikimedia foundation a lot.


Troubleshooting Errors

If mediawiki fails to load, or shows an error in the browser, enable some debugging settings and it will print much more useful information. See mediawikiwiki:Manual:How to debug.
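A hedged sketch of turning on a few of the settings that manual describes. The function name and the BEGIN/END marker comments are our own convention, used only to keep the block idempotent. The settings shown are a small selection, not a complete list:

```shell
#!/bin/bash
# Append debugging settings to a LocalSettings.php, at most once.
# $1: path to LocalSettings.php (in this guide that would be $mw/LocalSettings.php)
enable_mw_debug() {
    settings_file=$1
    # The marker comment is a made-up convention so reruns do nothing.
    if grep -qF '# BEGIN debug settings' "$settings_file"; then
        return 0
    fi
    # Quoted delimiter: the php variables below must not be expanded by bash.
    cat >>"$settings_file" <<'EOF'
# BEGIN debug settings
error_reporting( -1 );
ini_set( 'display_errors', 1 );
$wgShowExceptionDetails = true;
$wgDebugToolbar = true;
# END debug settings
EOF
}
```

Call it as enable_mw_debug $mw/LocalSettings.php, reload the failing page, and remove the marked lines once the problem is found, since they expose internal details to visitors.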

License

This page and this wiki are licensed under cc-by-sa 4.0. This means the code is compatible with gplv3.

todo list for this page

  • Get Visual editor extension.
  • Don't require registration for edits