Jeff Epler's blog

17 January 2013, 16:34 UTC

moinmoin cleanup script


I run a small wiki using MoinMoin. Recently I noticed that moin.cgi processes often ran for over a minute, which was worrying!

I determined that the slow-to-render page was TitleIndex, and that this was slow because there were a lot of 'unborn' pages which spammers had attempted to create. These take the form of directories under data/pages with only a zero-byte 'edit-log' file. Rendering TitleIndex has to walk all these directories, even though they don't contribute anything to the list of pages.

So I wrote a Python script to clean these up. It's intended to be run from the wiki's 'data' directory, and it prints a list of Bourne shell commands that delete the offending directories, as well as any users with the same name as an offending directory. Redirect the output of the script to a file, look over the file, and once you've decided it's good, just 'sh' it (running it as your wiki user if necessary).
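For readers who just want the idea, here is a minimal sketch of that approach. It is not the attached wikiclean.py, just a reconstruction under some assumptions: it is written for Python 3, it treats a page as 'unborn' only when its directory holds nothing but a zero-byte 'edit-log', it assumes user records are plain-text files under data/user containing a `name=` line, and it compares the raw directory name against the user name without undoing any filename quoting the wiki may apply. The function names are made up for illustration.

```python
import os
import shlex


def is_unborn(page_dir):
    """True if page_dir contains nothing but a zero-byte 'edit-log' file."""
    try:
        entries = os.listdir(page_dir)
    except OSError:
        return False
    if entries != ["edit-log"]:
        return False
    log = os.path.join(page_dir, "edit-log")
    return os.path.isfile(log) and os.path.getsize(log) == 0


def unborn_pages(pages_root):
    """Names of the unborn page directories directly under pages_root."""
    if not os.path.isdir(pages_root):
        return []
    return [name for name in sorted(os.listdir(pages_root))
            if is_unborn(os.path.join(pages_root, name))]


def users_named(user_root, names):
    """Paths of user files whose 'name=' line matches one of names.

    Assumption: one plain-text file per user, with a 'name=' line.
    """
    wanted = set(names)
    hits = []
    if not os.path.isdir(user_root):
        return hits
    for filename in sorted(os.listdir(user_root)):
        path = os.path.join(user_root, filename)
        if not os.path.isfile(path):
            continue
        with open(path, errors="replace") as f:
            for line in f:
                if line.startswith("name="):
                    if line[len("name="):].strip() in wanted:
                        hits.append(path)
                    break
    return hits


def cleanup_commands(data_dir="."):
    """Build shell commands for review; nothing is deleted here."""
    pages_root = os.path.join(data_dir, "pages")
    names = unborn_pages(pages_root)
    cmds = ["rm -rf -- " + shlex.quote(os.path.join(pages_root, n))
            for n in names]
    cmds += ["rm -f -- " + shlex.quote(p)
             for p in users_named(os.path.join(data_dir, "user"), names)]
    return cmds


if __name__ == "__main__":
    # Run from the wiki's 'data' directory; redirect the output to a file.
    for cmd in cleanup_commands("."):
        print(cmd)
```

Printing commands rather than deleting directly keeps a review step in the loop: nothing is removed until the generated file has been read and passed to sh.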

This dropped the render time of TitleIndex quite substantially!

I did a few other things for the sake of my wiki at the same time: I turned on Xapian indexing (which seems to help searches but not TitleIndex); I switched from CGI to WSGI; and I enabled textchas for signup. That last step didn't work for long: within 48 hours the bots had started successfully creating new accounts again, though the overall rate may be lower now.

Files currently attached to this page:

wikiclean.py (637 bytes)

Website Copyright © 2004-2024 Jeff Epler