Some Tidbits From ~/bin

Here are a few random scripts from my ~/bin. Please take 'em and use 'em, but do note where they came from originally; it's simply good manners.

cdhistory: web-browser style history for UNIX shells
[use the bash functions and aliases instead of invoking directly]
Install these bash functions and aliases in your $HOME/.bashrc file to use this:
# override the default "cd" with a custom one, so that cd history
# is recorded
cd() {
    command cd "$*" && _cdhist="|$PWD"
}

# cdh - display the cd history
cdh() {
    cdhistory ls "$_cdhist"
}

# cdfwd - go "forward" through the history, or switch between the
# current and previous dirs, if we're already at the "front" of history
cdfwd() {
    _cdhist=`cdhistory fwd "$_cdhist" "$1"`
    new=`cdhistory entry "$_cdhist" "$1" -1`
    echo "$new"
    command cd "$new"
}

# cdback - go "back" through the cd history
cdback() {
    new=`cdhistory entry "$_cdhist" "$1"`
    _cdhist=`cdhistory back "$_cdhist" "$1"`
    echo "$new"
    command cd "$new"
}

# a few short helper aliases, easy to type quickly
alias +='cdfwd 1'
alias ++='cdfwd 2'
alias +++='cdfwd 3'
alias -- -='cdback 1'

cdhistory is a perl script used to implement web-browser style "history" for UNIX shells; as you use the cd command to explore the filesystem, your moves are remembered, and you can go "back" through history, and "forward" again, as you like.
It's easier to display an example here. First, let's build up a few directories in the history:
$ cd /tmp ; cdh
: (1); cd "/tmp"
$ cd /dev ; cdh
: (1); cd "/dev"
: (2); cd "/tmp"
$ cd /etc ; cdh
: (1); cd "/etc"
: (2); cd "/dev"
: (3); cd "/tmp"

Now, let's start going backwards!
$ - ; cdh
: (1) [forward]; cd "/etc"
: (2); cd "/dev"
: (3); cd "/tmp"
$ - ; cdh
: (1) [forward]; cd "/etc"
: (2) [forward]; cd "/dev"
: (3); cd "/tmp"

OK, let's say instead of going forwards, I decide to cd to another dir:
$ cd /usr ; cdh
: (1); cd "/usr"
: (2); cd "/tmp"

See, web-browser style, even when it may be an annoying feature that should probably be revisited later. ;)
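The back/forward behaviour in the session above can be modelled in a few lines of plain shell. This is only a sketch of the idea, not how cdhistory itself works (that's a perl script); the hist_* names are made up for illustration, and it assumes directory names never contain a "|".

```shell
# Hypothetical model of browser-style history; not the real cdhistory code.
# Each stack is a "|"-terminated list of directories, most recent first.
hist_cur=""     # directory we are "in"
hist_back=""    # directories reachable by going back
hist_fwd=""     # directories reachable by going forward

# Visit a new directory: like following a link, this discards the forward list.
hist_visit() {
    if [ -n "$hist_cur" ]; then
        hist_back="$hist_cur|$hist_back"
    fi
    hist_cur=$1
    hist_fwd=""
}

# Step back one entry; fails if there is nothing to go back to.
hist_go_back() {
    [ -n "$hist_back" ] || return 1
    hist_fwd="$hist_cur|$hist_fwd"
    hist_cur=${hist_back%%|*}
    hist_back=${hist_back#*|}
}

# Step forward one entry; fails if we are already at the front.
hist_go_fwd() {
    [ -n "$hist_fwd" ] || return 1
    hist_back="$hist_cur|$hist_back"
    hist_cur=${hist_fwd%%|*}
    hist_fwd=${hist_fwd#*|}
}

# Replay the session shown above.
hist_visit /tmp; hist_visit /dev; hist_visit /etc
hist_go_back; hist_go_back      # now at /tmp, with /dev and /etc "forward"
hist_visit /usr                 # the forward entries are dropped here
echo "current=$hist_cur back=$hist_back forward=$hist_fwd"
```

Keeping two stacks rather than one list with a cursor makes the "cd somewhere new and lose the forward entries" rule a one-liner.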
Dec 7 2004 jm
Justin Mason


'cookie' spamtrap address generator in PHP

Include this PHP code in a page on your site, and hey presto, within weeks you'll have your very own pet spammers scraping it and sending you spam!

Very handy, since these addresses are perfect proof that they were scraped from your website, and provide a legal trail to *who scraped them*.

For best results, add text on the same page to note that the addresses are not to be used for commercial e-mail, or similar; this may help in clarifying grey areas in some spam laws.

License: same license as Perl: dual GPL/PAL.

Version: Oct 24 2003 jm

extract-rfc822-attachment: README
extract a "message/rfc822" attachment from a mail.
extract-rfc822-attachment < msg > newmsg

Exit status will be 0 if there was an attachment and the attachment was extracted successfully, 1 if there was no attachment found. The remaining non-zero exit statuses are reserved for other failure modes.
Quoted-printable or base64-encoded attachments are not currently supported.
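Here's one way a caller might act on those exit statuses. It's a sketch under assumptions: extract_or_keep and the EXTRACT variable are names invented for this example, and falling back to the original message when nothing was attached is a choice made here, not something the script does.

```shell
# Sketch only: EXTRACT and extract_or_keep are illustrative names.
# EXTRACT defaults to the real tool; override it to try the logic without it.
extract_or_keep() {
    # $1 = input message file, $2 = output file
    status=0
    "${EXTRACT:-extract-rfc822-attachment}" < "$1" > "$2" || status=$?
    case $status in
        0) echo "extracted" ;;
        1) cp "$1" "$2"          # no attachment: keep the original message
           echo "unchanged" ;;
        *) echo "failed with status $status" >&2
           return "$status" ;;
    esac
}
```

Usage would be something like "extract_or_keep msg newmsg", with anything other than 0 or 1 treated as a real failure.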
Feb 21 2003 jm

find-hidden-word-text: find hidden text in MS Word documents
find-hidden-word-text word.doc > hidden.txt
This is a command-line UNIX tool to ease the task of discovering hidden text in MS Word documents.
More specifically, it is an implementation of Method 2 from Simon Byers' paper, _Scalable Exploitation of, and Responses to Information Leakage Through Hidden Data in Published Documents_.
This goes a little further in that it removes some common 'noise' strings, like 'Word.Document.8', 'Title', 'PAGE', 'Microsoft Word Document' and the like. It will also remove any strings that do not contain at least 1 whitespace character.
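The noise-removal step could look roughly like this. It's a sketch only: filter_hidden_text is a made-up name, the noise list is just the four strings named above, and matching them as whole lines is an assumption about how the real script behaves.

```shell
# Sketch of the filtering step only; the real script's noise list and
# matching rules may differ. Reads candidate strings on stdin, one per line.
filter_hidden_text() {
    # Drop lines that are exactly one of the known noise strings...
    grep -v -x -F \
        -e 'Word.Document.8' \
        -e 'Title' \
        -e 'PAGE' \
        -e 'Microsoft Word Document' |
    # ...then keep only lines containing at least one whitespace character.
    grep '[[:space:]]'
}
```

Note that the noise list has to run first: "Microsoft Word Document" contains spaces, so the whitespace test alone would let it through.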
This tool requires antiword to be installed.
Justin Mason
1.0 Aug 15 2003 jm

graph-top-referers: produce a PNG graph of the top 10 referers in access_logs
Justin Mason, jm at jmason dot org

id3-from-filename: generate ID3 tags based on the path to an MP3 or OGG file

usage: id3-from-filename [--sub 's/.../.../g'] --format '...' [--capitalize FIELD1[,FIELD2...] ] file1 [file2 ...] > shcommands

sh -x shcommands

format:
    ALBUM becomes the ID3 album name tag
    ARTIST becomes the ID3 artist name tag
    TRACK becomes the ID3 track title
    TRACKNUM becomes the ID3 track number

otherwise, format is a standard Perl regular expression. Example:

id3-from-filename --format '.*/ARTIST_-_ALBUM/TRACKNUM-TRACK.mp3' \
    ./Ska/The_Selecter_-_Greatest_Hits/*.mp3
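To show how a format like that can be turned into a pattern with one capture per placeholder, here's a rough shell sketch. It is not the script's own code: parse_tags is an invented name, it uses POSIX extended regexes via sed rather than Perl ones, it assumes each placeholder appears at most once and that the format contains no "#", and the track filename in the usage line is made up.

```shell
# Sketch only: print FIELD=value lines for a path, given a format string.
parse_tags() {
    format=$1
    path=$2
    # Swap each placeholder for a capture group. TRACKNUM goes first so
    # that the shorter TRACK rule cannot eat part of it.
    regex=$(printf '%s\n' "$format" | sed \
        -e 's,TRACKNUM,([0-9]+),' \
        -e 's,ARTIST,([^/]+),' \
        -e 's,ALBUM,([^/]+),' \
        -e 's,TRACK,([^/]+),')
    # The order the placeholders appear in decides which group is which.
    fields=$(printf '%s\n' "$format" | grep -oE 'TRACKNUM|ARTIST|ALBUM|TRACK')
    i=0
    for field in $fields; do
        i=$((i + 1))
        value=$(printf '%s\n' "$path" | sed -nE "s#^${regex}\$#\\${i}#p")
        [ -n "$value" ] || return 1    # path did not match the format
        printf '%s=%s\n' "$field" "$value"
    done
}

parse_tags '.*/ARTIST_-_ALBUM/TRACKNUM-TRACK.mp3' \
    './Ska/The_Selecter_-_Greatest_Hits/01-On_My_Radio.mp3'
```

The real tool goes on to emit shell commands that write the tags; this only covers pulling the fields out of the path.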

knewtab: create a new tab in a konsole window, from the commandline

usage: knewtab {tabname} {command line ...}


Creates a new tab in a "konsole" window (the current window, or a new one if the command is not run from a konsole).

Requires that the konsole app be run with the "--script" switch.


Justin Mason

lndir-dupes: reduce disk usage by linking identical files
lndir-dupes dir1 [...]
This script will descend one or more directory trees provided on the command line, and will hard-link all identical files found, of sizes greater than 1024 bytes, to each other.
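The general technique can be sketched in shell like so. This is not the lndir-dupes implementation: link_dupes is an invented name, cksum is used here as a cheap first-pass filter before a byte-for-byte cmp, and filenames containing newlines are not handled.

```shell
# Sketch only: hard-link identical files larger than 1024 bytes.
# usage: link_dupes dir1 [...]
link_dupes() {
    prev_key=""
    prev_path=""
    # cksum prints "checksum size path"; sorting puts candidates side by side.
    find "$@" -type f -size +1024c -exec cksum {} + |
    LC_ALL=C sort |
    while read -r sum size path; do
        if [ "$sum $size" = "$prev_key" ] && cmp -s "$prev_path" "$path"; then
            # Same checksum, same size, same bytes: replace with a hard link,
            # unless the two names already point at the same file.
            [ "$prev_path" -ef "$path" ] || ln -f "$prev_path" "$path"
        else
            prev_key="$sum $size"
            prev_path=$path
        fi
    done
}
```

Try it on a scratch copy first; ln -f replaces files in place.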
"lndir-dupes", run on over 200 GB of backups with lots of duplicated files, took over a day to complete, used 620 MB of temporary storage, and peaked at 24 MB of memory usage, with 18 MB resident.
Version: 1.1 Jan 3 2009 jm
License: same as Perl itself
Author: Justin Mason
Bits from

mailman-archive-to-rss: scrape the archives of a MailMan list and convert to an RSS feed using XML::RSS.

To use, either edit and update the @LISTS array, or run with -help to view the options used to specify list details on the command line. Requires XML::RSS.

Released under the same license as Perl itself.

Nov 28 2001 jm -

Updated 28 July 2002 -

Ryan Wise hacked support for threaded mode, post counts, printing of the subject rather than the author as the item title and stripping the list name in the subject from the RSS title. You can see examples at or in use at

Updated Nov 11 2002 jm - now escapes HTML stuff in description correctly, thanks to Bill Kearney for pointing this out

Updated Feb 17 2003 jm - Sean M. Burke pointed out that high-bit chars were not escaped, which is illegal XML. fixed using HTML::Entities.

Updated Nov 8 2005 jm - Bill McGonigle sent over a patch to add support for removing the 'description' field from the start of the Subject line, a la "[foo] real subject".

mailman-block-non-members: insert in front of a MailMan list-submission address to block non-members of that list from posting. If a non-member posts, it will respond and request confirmation that they are human before allowing it through.

mailtunnel: see "perldoc mailtunnel" for documentation

mhthread: sort an MH folder into 'threaded' order
mhthread [options] +folder
mhthread [options] /path/to/folder

options accepted: [-debug] [-no-write] [-fast] [-lock]
This will thread an MH folder. It re-orders the messages (as sortm(1) would do), and annotates each one with a new header, "X-MH-Thread-Markup", which can be displayed by scan(1).
Together, this results in the messages being displayed in "threaded" order, as in trn(1) or mutt(1).
Sequences will be rewritten appropriately. The folder will also be "packed", as if 'folder -pack' had been run; see folder(1).
Here's some sample output from scan(1), after threading the folder:
430 03/23 mathew 3 [Asrg] Re: [OffTopic - NNTP]
431 03/23 Kee Hinckley 5 |- [Asrg] Re: [OffTopic - NNTP]
432 -03/23 Chuq Von Rospach 11 | |- Parameters for success? (was Re: [A
433 03/23 To:Chuq Von Rospa 4 | | \- Re: Parameters for success? (was
434 03/23 Matt Sergeant 3 | \- Re: [Asrg] Re: [OffTopic - NNTP]
435 03/23 Chuq Von Rospach 7 \- Re: [Asrg] Re: [OffTopic - NNTP]

-fast
    Use an on-disk cache to speed up operation.
-lock
    Use a folder-wide lock-file to synchronize access to folders, so that multiple processes will not stomp on each other's changes or cause folder corruption. If you use this, you should ensure that you also use a locking version of other tools, such as the script that comes with ExMH.
-no-write
    Do not rewrite the messages; instead, output a line for each message noting the actions that would be taken.
-debug
    Output debugging info to stderr.

Note that options will also be read from your .mh_profile file, in traditional MH style.
To display the results in scan(1) output, use something like the following for the subject-display part of the scan.form file:

If you do not have a "scan.form" file of your own, you will need to set it up. This functionality is accessed using the -form or -format switches to the scan(1) command. To use this, copy the /etc/nmh/scan.default file to your ~/Mail dir and modify it with the above line, then add
scan: -form scan.form

to your ~/.mh_profile.
Copy this script to somewhere in your path, called mhthread. Then run that whenever you want to re-thread the folder.
To use it from within ExMH, add the following function to your ~/.tk/exmh/user.tcl file:
proc Folder_Thread {} {
    global exmh
    Exmh_Status "Threading folder..." blue
    if {[Ftoc_Changes "Thread"] == 0} then {
        if {[catch {MhExec mhthread +$exmh(folder)} err]} {
            Exmh_Status $err error
        } else {
            # finish off by using the ExMH packing logic to redisplay folder
            # then show the first unseen message
        }
    }
}

Next, you need to rebuild the tclIndex file. Run a Tcl shell and type:
auto_mkindex ~/.tk/exmh *.tcl

Now add a button to run this function. To do this, you must exit ExMH first, then edit the ~/.exmh/exmh-defaults file and add these lines at the top of the file:
*Fops.ubuttonlist: thread
*Fops.thread.text: Thread
*Fops.thread.command: Folder_Thread

Restart ExMH, and there should be a new button marked Thread on the folder button-bar. Press this to re-thread the current folder.
The threading algorithm uses the In-Reply-To, Message-Id and References headers. Thanks to JWZ for guidance, in the form of his page on threading.
The 'X-MH-Thread-Markup' headers are encoded using RFC-2047 encoding, using 'no-break space' characters for whitespace, as otherwise MH's scan(1) format code will strip them. Here's an example of the results:
X-MH-Thread-Markup: =?US-ASCII?Q?=a0=a0=a0=a0=5c=2d=a0?=
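For the curious, that header value can be reproduced with a few standard tools. This is only a sketch of the encoding as it appears in the example, not code taken from mhthread; encode_markup is an invented name.

```shell
# Sketch only: encode a thread-markup prefix the way the example header
# looks. Every byte is written as =XX, with spaces (0x20) swapped for the
# no-break space, 0xa0.
encode_markup() {
    body=$(printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n' |
           sed -e 's/../=&/g' -e 's/=20/=a0/g')
    printf 'X-MH-Thread-Markup: =?US-ASCII?Q?%s?=\n' "$body"
}

# Four spaces, a backslash, a hyphen and a space: the "\- " branch marker
# at one level of indentation.
encode_markup '    \- '
```

Run as is, this prints the same header line as the example above.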

dealing with private sequences (stored in .mh_profile); limiting displayed thread-depth to keep UI readable (so far has not been a problem).
Duplicate messages will always be shuffled in order each time mhthread is run, due to handling of identical Message-Ids.
Latest version can be found at .
Justin Mason
version = 1.7, Jul 25 2003 jm

moderate-list: moderate some mailing lists from the command-line
moderate-list [options] { --dir maildir | --file filename }
[... script outputs message synopsis...]
Moderation action [ynab] (y=yes, n=no, a=allow in future, b=block in future)?
[... script mails the correct address appropriately.]

moderate-list --auto [options] { --dir maildir | --file filename }
[... script operates automatically with no user intervention...]

--config configfile: file where previous moderation choices,
and list passwords, are stored

Works for ezmlm and MailMan 2.1.x lists, which both produce 'message awaiting moderation' mails with enough info to allow this to operate.
--dir=/path/to/maildir
    Directory containing the pending moderation requests, RFC-822 format, one per file. (MH or Maildir style!)
--config=/path/to/configfile
    Where moderation passwords and 'always accept mail from this address' choices are stored between runs.

These config settings can be used in ~/.moderate_sa/:
# The directory containing your pending moderation requests, RFC-822
# format, one per file. Maildir 'cur' and 'new' subdirectories are OK,
# as are MH folders.
defaultdir ~/Mail/Mod

# block messages that score this high in SpamAssassin score.
blockscore 5.0

1.6, May 1 2007 jm

mp3info-id3v1: tag an MP3 file with an ID3v1 tag.

Needed because mp3info(1) uses ID3v2 tags, which xmms doesn't like.

mythsshimport: transcode and install video files onto a MythTV box
TODO: use ffmpeg
mythsshimport file1 [file2 ...]

Transcodes video files (AVI, MPEG, MOV, WMV etc.) into MythTV-compatible and PVR-350-optimised MPEG-2 .nuv files, suitable for viewing on a 4:3 screen, then transfers them to the MythTV backend, inserts them into the "recorded programs" listings, and builds seek tables.
All this happens on-the-fly, at faster-than-real-time rates; with a recent CPU in the transcoding box, and over an 802.11b wifi home network, you can start the process and start watching the video within 20 seconds, while it is transcoded and transferred in the background.
SSH is used as the network transport. If you have the CPU power available on the MythTV backend itself, you can run this script there (as the mythtv user) and it will skip the SSH parts entirely.
- ssh password-less key access from transcode box into mythtv@mythbox (this could be localhost, if you're transcoding on the mythbox).
Test using: "ssh mythtv@mythbox echo hi"
If you run this script on the mythbox as the mythtv user, this is not required.

- mencoder. Tested with 2:0.99+1.0pre7try2+cvs20060117-0ubuntu8 (I swear that's a version string and not just me rolling my head around the keyboard)

- MythTV. Tested with MythTV 0.20.
- The "contrib/" script from the MythTV source tarball, installed on the mythbox in $PATH: download from

- screen(1) installed on the transcoding box, used to keep the mencoder output readable

Edit the 'CHANGE THIS' section below for configuration.
- If an error occurs (e.g. read bad block from a DVD) during mencoder use, the mencoder screen will immediately disappear. This is suboptimal.

Mar 31 2009 jm

new-referrer-rss: generate RSS feed of new referrer URLs from access_log
new-referrers-rss nameofsite [source ...] > new-referrers.xml

source: access_log files or directory containing same.
'new-referrers.xml' should be at a web-visible location.

Given the name of a web site, and a selection of Apache combined log format 'access_log' files containing referrer URL data, this will generate an RSS feed containing the latest referrers.
The script should be run periodically with 'fresh' access_log data, from cron.
A file called 'hist' in the current directory is created to hold historical context information; using this, if a URL is listed in the RSS output, it will not be listed again. (As a result, subscribers should ensure that they do not update less frequently than the cron job executes!)
new-referrers-rss /var/log/apache/* \
> scraped/
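The "only list each referrer once" part of this can be sketched in shell as follows. It's an illustration under assumptions, not the script's own logic: new_referrers is a made-up name, the history file here is just a sorted list of URLs (the real 'hist' format isn't documented above), and it emits plain URLs rather than RSS.

```shell
# Sketch only: print referrers not seen on any earlier run.
# usage: new_referrers histfile access_log [access_log ...]
new_referrers() {
    hist=$1
    shift
    touch "$hist"
    # Splitting a combined-format log line on double quotes puts the
    # referrer in field 4; "-" means no referrer was sent.
    awk -F'"' '$4 != "" && $4 != "-" { print $4 }' "$@" |
        LC_ALL=C sort -u |
        LC_ALL=C comm -23 - "$hist" > "$hist.new"
    cat "$hist.new"
    # Remember what was reported so it is never listed again.
    LC_ALL=C sort -u -o "$hist" "$hist" "$hist.new"
    rm -f "$hist.new"
}
```

Running it twice over the same logs prints each referrer the first time and nothing the second, which is exactly why subscribers who poll less often than the cron job runs will miss entries.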

This tool requires the following CPAN modules:

Justin Mason
SEE ALSO for comments
- 1.2 May 14 2006 jm: also exclude "/"
- 1.1 May 10 2006 jm: put full URL into 'link' area, consider 3xx HTTP response codes as valid
- 1.0 May 9 2006 jm: initial rev

reboot-zyxel: reboot a Zyxel P-660RU DSL modem/router
[edit script to set $ROUTER_IP and $ADMIN_PASSWORD]
reboot-zyxel
Reboots a Zyxel P-660RU ADSL2+ "ethernet/USB router" -- ie. one of these:

It's cheap and nasty, and it's what ESAT/BT deliver to customers as their ADSL router. It offers an *interesting* javascript-based web UI, and this script contains enough reverse-engineered smarts to reboot the router using that.
It'll cause the router to reboot, then wait for the connection to return, up to a maximum of 3 minutes. If the reboot and reconnection are successful, it'll exit with a status of 0; otherwise, the exit status will be non-zero.
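The wait-with-a-deadline part is a handy pattern on its own. Here's a minimal sketch; wait_until is an invented helper, not something lifted from the script, and it counts sleeps rather than wall-clock time, so a slow check command stretches the limit.

```shell
# Sketch only: retry a command once a second until it succeeds or the
# time limit (in seconds) runs out. Returns 0 on success, 1 on timeout.
wait_until() {
    limit=$1
    shift
    waited=0
    until "$@"; do
        if [ "$waited" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Hypothetical use after triggering the reboot (ping flags vary by system):
#   wait_until 180 ping -c 1 "$ROUTER_IP" || exit 1
```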

snbencode: snarf-n-barf encode

Pass files back and forth between windows, by encoding them in SNB encoding (snarf-n-barf encoding ;). This is much more compact than base64, and ignores all embedded spaces. However, it allows high-bit chars (0xa1 to 0xff) through, so if you use different charsets, it may screw up.


#!21/usr/bin/perl_-w`#`#_snbencode/decode_--_snarf-n-barf_encode`#`#_Pass_ files_back_and_forth_between_windows,_by_encoding_them_in_SNB_encoding`#_(s narf-n-barf_encoding_;).__This_is_much_more_compact_than_base64,_and`#_igno res_all_embedded_spaces.__However,_it_allows_high-bit_chars_(0xa1_to`#_0xff )_through,_so_if_you_use_different_charsets,_it_may_screw_up.``my_@passthru !5franges_=_(0x22_.._0x5e,_0x61_.._0x7e,_0xa1_.._0xff);`my_%mappings_=_(`!0 90x20_=>_0x5f,!09!09!09#_space_->_!5f`!090x0a_=>_0x60!09!09!09#_nl_->_!60`) ... etc.
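Going by the sample above, the encoding rules are simple enough to sketch with od and awk. This is inferred from the sample, not copied from the script: snb_encode is an invented name, only the encoding direction is covered, and the output is not wrapped into lines the way the real thing is.

```shell
# Sketch of the encoder only, inferred from the sample output.
snb_encode() {
    # od turns the input into decimal byte values; awk maps each one.
    od -An -v -tu1 | LC_ALL=C awk '
        {
            for (i = 1; i <= NF; i++) {
                b = $i + 0
                if (b == 32)      printf "_"          # space
                else if (b == 10) printf "`"          # newline
                else if ((b >= 34 && b <= 94) || (b >= 97 && b <= 126) || b >= 161)
                    printf "%c", b                    # passed through unchanged
                else
                    printf "!%02x", b                 # anything else: ! plus hex
            }
        }
        END { print "" }'
}

printf '#!/usr/bin/perl -w\n' | snb_encode
```

The last line encodes the same first line as the sample, "!" becoming "!21" and so on.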

tap-to-junit-xml: convert perl-style TAP test output to JUnit-style XML
tap-to-junit-xml [--help|--man] [--[no]hidesummary]
    [--input file]
    [--output file]
    ["test suite name"] [outputprefix]

Parse test suite output in TAP (Test Anything Protocol) format, and produce XML output in a similar format to that produced by the ant JUnit task. This is useful for consumption by continuous-integration systems like Hudson.
"test suite name" is a descriptive string used as an attribute on the top-level node of the output XML. Defaults to "make test".
If outputprefix is specified, multi-file output will be generated, with multiple XML files created using outputprefix as the start of their filenames. The files are separated by testplan. This option is ignored if --puretap is specified (TAP only allows one testplan per input file). This prefix may contain slashes, in which case the files will be placed into a directory hierarchy accordingly (although care should be taken to ensure these directories exist in advance).
If --input is not specified, STDIN will be read. If outputprefix or --output is not specified, a single XML file will be generated on STDOUT.
--output writes a single XML file to the named file.
--puretap parses a single TAP source and handles parse errors and directives (todo, skip, bailout). --puretap ignores unknown (non-TAP) input. Without --puretap, the script will parse some additional non-TAP test input, such as Perl tests that can include a "Test Summary Report", but it won't generate correct XML unless the TAP testplan comes before the test cases. --hidesummary (the default) will hide the summary report; --no-hidesummary will display it (neither has an effect when --puretap is specified).
prove -v 2>&1 | tee tests.log
tap-to-junit-xml "make test" testxml/tests < tests.log

(JUnit-formatted XML is now in "testxml/tests*.xml".)
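To give a feel for the conversion, here's a toy version in shell and awk. It is nowhere near the real script: tap_to_junit is an invented name, it only understands plain "ok" and "not ok" lines, and the element and attribute names follow common JUnit XML usage, which is an assumption about what a consumer expects rather than a copy of the script's output.

```shell
# Sketch only: a tiny TAP reader that understands plain "ok"/"not ok" lines.
# usage: tap_to_junit ["test suite name"] < tests.log
tap_to_junit() {
    awk -v suite="${1:-make test}" '
        # Escape the characters that are special in XML attribute values.
        function esc(s) {
            gsub(/&/, "\\&amp;", s); gsub(/</, "\\&lt;", s)
            gsub(/>/, "\\&gt;", s);  gsub(/"/, "\\&quot;", s)
            return s
        }
        /^(not )?ok( |$)/ {
            n++
            failed[n] = ($1 == "not")
            if (failed[n]) nfail++
            name = $0
            sub(/^(not )?ok *[0-9]* *-? */, "", name)   # strip status and number
            names[n] = esc(name)
        }
        END {
            printf "<testsuite name=\"%s\" tests=\"%d\" failures=\"%d\">\n", esc(suite), n, nfail
            for (i = 1; i <= n; i++) {
                if (failed[i])
                    printf "  <testcase name=\"%s\"><failure/></testcase>\n", names[i]
                else
                    printf "  <testcase name=\"%s\"/>\n", names[i]
            }
            print "</testsuite>"
        }'
}
```

Everything that isn't a test line (the plan, diagnostics, summary reports) is simply skipped, and todo/skip directives are not recognised.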

- Output is optimized for Hudson, and may not look quite as good in
other UIs.
- Doesn't do anything with the STDERR from tests.
- Doesn't fill in the 'errors' attribute in the element.
(--puretap handles parse errors)
- Doesn't handle "todo" or "skip" (--puretap does)
- Doesn't get the elapsed time for each 'test' (i.e. assertion.)
(TAP output has no elapsed time convention).

Original by Matisse Enzer.
Pretty much entirely rewritten by Justin Mason, Feb 2008.
Miscellaneous fixes and mods (--puretap) by Jascha Lee, Mar 2009.
Mar 27 2008 jm
Mar 17 2009 jl

Copyright (c) 2007 Matisse Enzer. All Rights Reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

vbulletin2mail: gate from a vBulletin web board to email
[edit script to change settings]
vbulletin2mail recipient@email.address

Requires 'lynx'.
Dec 2 2001 jm

wqvlinktoppm: README
convert Palm WQVLink database backup to ppm images
This is a quick hack to parse out Casio Wrist Camera images from the Palm's WQVLinkDB.pdb backup file.
To use (unashamedly *nix-specific instructions):
echo "Backing up Palm"

mkdir -p ~/pilot/backup; pilot-xfer -s ~/pilot/backup

echo "Finding images"

wqvlinktoppm ~/pilot/backup/WQVLinkDB.pdb

echo "viewing images"

ee wqv_image_*.ppm

- - details of image bytes and

WQVLink as Palm PRC

To do:
- make it friendlier. I'm not going to, this is a quick hack ;)

- read metadata. Again, not bothered

Author: jm /at/ -- License: GPL Last modified Sep 26 2001 jm