<b>moinmoin cleanup script</b><br>
Jeff Epler (jepler@unpythonic.net), Jeff Epler's blog: Photos, electronics, cnc, and more<br>
2013-01-17T16:34:27Z, https://emergent.unpythonic.net/01358440467
<p>
I run a small wiki using moinmoin. Recently I noticed that it was not uncommon
for moin.cgi processes to run for over a minute, which was worrying!
<p>I determined that the slow-to-render page was TitleIndex, and that this was
slow because there were a lot of 'unborn' pages which spammers had attempted
to create. These take the form of directories under data/pages with only a
zero-byte 'edit-log' file. Rendering TitleIndex has to walk all these
directories, even though they don't contribute anything to the list of pages.
<p>So I wrote a Python script to clean these up. It's intended to be run
from the wiki's 'data' directory, and it prints a list of Bourne shell
commands to delete the offending directories as well as any users with the same
name as an offending directory. Redirect the output of the script to a file,
look over the file, and once you've decided it's good, just 'sh' it (running it
as your wiki user if necessary).
<p>This dropped the render time of TitleIndex quite substantially!
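<p>The cleanup described above can be sketched roughly as follows. This is a minimal Python 3 illustration, not the attached wikiclean.py. It assumes that an 'unborn' page is a directory under pages/ whose only entry is a zero-byte 'edit-log', that accounts live as files under a 'user' directory containing a <code>name=...</code> line, and that page directory names can be compared to account names directly (ignoring any quoting moin applies to unusual characters). The function names are hypothetical.

```python
import os
import shlex


def is_unborn(page_dir):
    """True if page_dir contains nothing but a zero-byte 'edit-log' file."""
    try:
        entries = os.listdir(page_dir)
    except OSError:
        # Not a directory, or unreadable: leave it alone.
        return False
    if entries != ["edit-log"]:
        return False
    log = os.path.join(page_dir, "edit-log")
    return os.path.isfile(log) and os.path.getsize(log) == 0


def user_files_by_name(user_dir):
    """Map each account name to its user files (assumes 'name=...' lines)."""
    found = {}
    if not os.path.isdir(user_dir):
        return found
    for filename in sorted(os.listdir(user_dir)):
        path = os.path.join(user_dir, filename)
        if not os.path.isfile(path):
            continue
        with open(path, errors="replace") as f:
            for line in f:
                if line.startswith("name="):
                    name = line[len("name="):].strip()
                    found.setdefault(name, []).append(path)
                    break
    return found


def cleanup_commands(data_dir="."):
    """Return shell commands that remove unborn pages and same-named users."""
    commands = []
    pages_dir = os.path.join(data_dir, "pages")
    if not os.path.isdir(pages_dir):
        return commands
    users = user_files_by_name(os.path.join(data_dir, "user"))
    for name in sorted(os.listdir(pages_dir)):
        page_dir = os.path.join(pages_dir, name)
        if not is_unborn(page_dir):
            continue
        commands.append("rm -r " + shlex.quote(page_dir))
        for path in users.get(name, []):
            commands.append("rm " + shlex.quote(path))
    return commands


if __name__ == "__main__":
    # Only print the commands; nothing is deleted until the output is reviewed.
    for command in cleanup_commands():
        print(command)
```

<p>Printing commands instead of deleting anything keeps the review step: redirect the output to a file, read it, and run it with 'sh' only once it looks right.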
<p>I did a few other things for the sake of my wiki at the same time: I turned
on Xapian indexing (which seems to help searches but not TitleIndex); I
switched from CGI to WSGI; and I enabled textchas for signup. That last step
didn't work for long: within 48 hours the bots had started successfully
creating new accounts again, though the overall rate may be lower now.
<p><b>Files currently attached to this page:</b>
<table cellpadding=5 style="width:auto!important; clear:none!important"><col><col style="text-align: right"><tr bgcolor=#eeeeee><td><a href="https://media.unpythonic.net/emergent-files/01358440467/wikiclean.py">wikiclean.py</a></td><td>637 bytes</td></tr></table><p>