Sean Feeney
Architect of the digital age

MediaWiki to Pepperminty Wiki Migration

8 August 2020

Looking for a lighter-weight wiki that might not break with every update, I stumbled upon Pepperminty Wiki. It’s an open-source project still early in development, but it seems better suited to novice wiki users than TiddlyWiki, a similarly lightweight, single-page wiki. Most importantly, unlike TiddlyWiki, it’s designed with multiple users in mind.

Pepperminty, like many wikis these days, uses markdown. Converting your MediaWiki content to markdown will make it more portable, whether you try Pepperminty or some other solution.

Pandoc is a useful tool for converting between text file formats. There is an online converter, but it isn’t suitable for long text, and for any sufficiently large wiki you’ll want to automate the conversion anyway. So the first step is to install pandoc locally (or on your web server, wherever you want to run this process).

Now we need the source content, the wiki markup, for each article in MediaWiki. You could copy and paste this from the “View source” tab on each page, but that doesn’t scale. There is a special page, index.php?title=Special:Export, that will allow you to export your content to XML. Or it should, but seeing as my MediaWiki install is mostly nonfunctional after recent updates, I had to explore another option. If you have command-line access to your MediaWiki server, you’ll find a helper script, dumpBackup.php, in the maintenance folder. It also exports to XML. I used a command similar to this:

php dumpBackup.php --current > dump.xml

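The relevant content sits inside the <page> elements of this dump: each one holds a <title> and, within its <revision>, the wiki markup in a <text> element. I chose to ignore pages whose titles start with File: (images) or Template: (MediaWiki-style templates) for now, to focus on text migration.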
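Splitting the dump into per-page files can be scripted too. The following is a minimal Python sketch, not the method used for this migration. It assumes the standard MediaWiki export layout (<page>, <title>, <revision>, <text>); the function names and the file-naming scheme (replacing characters Windows rejects with an underscore) are hypothetical choices made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Title prefixes skipped in this post: images and MediaWiki templates.
SKIP_PREFIXES = ("File:", "Template:")


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def safe_filename(title):
    """Replace characters Windows rejects in file names (assumed naming scheme)."""
    return re.sub(r'[<>:"/\\|?*]', "_", title).strip()


def iter_pages(dump_path):
    """Yield (title, wikitext) for each <page> that is not skipped."""
    for _, elem in ET.iterparse(dump_path, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                # If a page has several revisions, the last one in the file wins.
                text = child.text or ""
        elem.clear()  # release the parsed page to keep memory use low
        if title and not title.startswith(SKIP_PREFIXES):
            yield title, text


def split_dump(dump_path, out_dir):
    """Write one .wiki file per page and return the file names written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for title, text in iter_pages(dump_path):
        target = out / (safe_filename(title) + ".wiki")
        target.write_text(text, encoding="utf-8")
        written.append(target.name)
    return written
```

Calling split_dump("dump.xml", ".") would write the .wiki files next to the dump. Two titles that differ only in a replaced character would map to the same file name, so a larger wiki may need a smarter naming scheme.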

I pulled each page’s <text> field into a text file of its own with a .wiki extension. From there, we can run pandoc in an automated fashion. I’m doing this on a Windows box, so I wrote a PowerShell script, convert.ps1, to facilitate the conversion:

$filePath = Get-ChildItem -Path ./ -File

ForEach ($i in $filePath) {
  # Only process .wiki files; this skips this script and your MediaWiki dump file
  if ($i.Extension -eq '.wiki') {
    $sourceFile = $i.Name
    $destinationFile = $i.BaseName + '.md'
    # Convert MediaWiki markup to PHP Markdown Extra
    pandoc $sourceFile -f mediawiki -t markdown_phpextra -s -o $destinationFile
    # pandoc returns a non-zero exit code when the conversion fails
    if ($LASTEXITCODE -ne 0) {
      Write-Host 'Alert! ' -ForegroundColor Red -NoNewline
      Write-Host "Error processing file: $sourceFile" -BackgroundColor Yellow -ForegroundColor Black
    }
  }
}

I exclude the script and dump file, in case you’re doing this all in one directory. It turns out there isn’t just one Markdown standard. The closest thing to one in wide use is GitHub Flavored Markdown (GFM). Pepperminty uses PHP Parsedown Extreme, which is based on PHP Parsedown Extra, hence the choice of markdown_phpextra as pandoc’s output format.
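If you aren’t on Windows, the same loop can be approximated in Python. This is a rough sketch rather than a drop-in replacement for convert.ps1: the pandoc options are the ones used in the script, while the function names pandoc_command and convert_all are made up for this example, and pandoc is assumed to be on your PATH.

```python
import subprocess
from pathlib import Path


def pandoc_command(source):
    """Build the pandoc call for one .wiki file, with the same options as convert.ps1."""
    source = Path(source)
    destination = source.with_suffix(".md")
    return ["pandoc", str(source), "-f", "mediawiki",
            "-t", "markdown_phpextra", "-s", "-o", str(destination)]


def convert_all(directory="."):
    """Convert every .wiki file in `directory`; return the names that failed."""
    failed = []
    for source in sorted(Path(directory).glob("*.wiki")):
        # Raises FileNotFoundError if pandoc is not installed or not on PATH.
        result = subprocess.run(pandoc_command(source))
        if result.returncode != 0:
            print(f"Error processing file: {source.name}")
            failed.append(source.name)
    return failed
```

Globbing for *.wiki means the script itself, the XML dump, and any .md files from an earlier run are never passed to pandoc.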

A help page is provided as a starting point for confirming syntax support. With my source content, I ran into these issues that needed to be addressed manually:

All those styled transclusion templates likely won’t work in Pepperminty, and I don’t want to enable HTML support because this is meant to be a publicly editable site, so I have to make some design choices. All of my images were in styled templates, so this is where I decide whether to bring them over and how they should look on the page if I do. Note: Imagick is a required PHP extension for image uploads in Pepperminty, and for video uploads you need to allow /etc/mime.types in your php.ini open_basedir. Other non-default PHP extensions required for Pepperminty are listed in the docs.

Now that the md files are in working order, you could theoretically drop them into your Pepperminty folder and update navigation (idindex.json x1, pageindex.json x3) as needed, but I’m sure some additional steps are necessary. For one, you’d need to duplicate each file to filename.md.r0 to simulate the first commit/revision. My wiki was small enough that I manually uploaded the images and pasted the md content into Pepperminty Wiki’s new page GUI (I know, I know, what happened to automation?).
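For anyone who does want to try the file-drop route, the .r0 duplication step is easy to script. Here is a small Python sketch of just that step, untested against a live Pepperminty install. The function name seed_first_revisions is hypothetical, and the script deliberately leaves idindex.json and pageindex.json alone, since their format isn’t covered here.

```python
import shutil
from pathlib import Path


def seed_first_revisions(wiki_dir):
    """Copy each page.md to page.md.r0 unless that revision file already exists."""
    created = []
    for md_file in sorted(Path(wiki_dir).glob("*.md")):
        revision = md_file.with_name(md_file.name + ".r0")
        if not revision.exists():
            shutil.copyfile(md_file, revision)
            created.append(revision.name)
    return created
```

Existing .r0 files are never overwritten, so running it a second time does nothing.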

Mad props to @SBRLabs for sharing this tech with the world.

Which route did you go when exiting your MediaWiki install? Put your experiences in the comments below.

Posted in mediawiki, wiki, migration

