Handy utilities, listed roughly in order of handiness.
safeunrar (also: safeuntar, safeunzip, safeunjar) - extract into a single directory
This tool saves you from having to always list the contents of an
archive before you extract it, just to make sure that it doesn't
pollute the current directory with a million files, instead of
putting them in a subdirectory like you expect. (I actually stopped
doing this for a while with source code, since it seemed like it
always followed the right convention -- until I got burned and wrote
this tool.)
It extracts zip, jar, tar.gz, tar.bz2, and rar files (support for
.7z forthcoming), putting all of the extracted files into a single
directory, or into the current directory if the root contains only
one file like it should.
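The decision described above can be sketched in Python. This is only an illustration, not the actual safeunrar code; the function names are made up here, and the member list is assumed to come from something like tarfile's getnames() or zipfile's namelist().

```python
def top_level_entries(member_names):
    """Return the distinct first path components found in an archive listing."""
    roots = set()
    for name in member_names:
        # Drop empty and "." components so "./proj/a.c" counts as "proj".
        parts = [part for part in name.split("/") if part not in ("", ".")]
        if parts:
            roots.add(parts[0])
    return roots


def needs_wrapper_directory(member_names):
    """True when extracting would put more than one entry into the current directory."""
    return len(top_level_entries(member_names)) > 1
```

When needs_wrapper_directory() is true, the extraction would go into a freshly created directory; otherwise the archive can be unpacked in place.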
duhsort - numerical sort for du --human output
Sorts the output of "du -h" according to size. I suggest defining a
bash function like this:
duh() { du -h "$@" | duhsort | less -E; }
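For illustration, the core of such a sort might look like this in Python. It is a sketch, not the duhsort source: it assumes the size is the first whitespace-separated column, that suffixes are powers of 1024, and that the smallest entries should come first.

```python
# Powers of 1024 for the suffixes that "du -h" prints.
UNITS = {"": 0, "B": 0, "K": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6}


def human_to_bytes(size):
    """Convert a "du -h" size such as "4.0K" or "512" to a number of bytes."""
    suffix = size[-1].upper() if size[-1].isalpha() else ""
    number = size[:-1] if suffix else size
    return float(number) * 1024 ** UNITS[suffix]


def duh_sort(lines):
    """Sort "du -h" output lines by their first column, smallest first."""
    rows = [line for line in lines if line.strip()]
    return sorted(rows, key=lambda line: human_to_bytes(line.split(None, 1)[0]))
```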
interactive-rename - rename files using a text editor
Specify a list of files on the command-line, and edit them in "vi"
(or your editor of choice). This is often even more convenient
and useful than perl's "rename" util, since you can apply multiple
patterns, see intermediate results, use undo, etc. I mainly used
this for renaming mp3s back when they were often untagged.
A similar tool is included in the "renameutils" package, but
this one is better. However, you'll still want "renameutils" for
"imv", which allows you to edit a single filename in a "readline"
buffer. That's even more handy, since it's rare to rename multiple
files.
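The edit-a-list-then-rename idea can be sketched as follows. This is not the interactive-rename source, only a minimal Python outline; plan_renames and interactive_rename are names invented for the example, and it assumes one filename per line (so names containing newlines are not handled).

```python
import os
import shlex
import subprocess
import tempfile


def plan_renames(old_names, edited_text):
    """Pair each original name with the edited line at the same position."""
    new_names = edited_text.splitlines()
    if len(new_names) != len(old_names):
        raise ValueError("line count changed; cannot tell which file is which")
    pairs = [(old, new) for old, new in zip(old_names, new_names) if old != new]
    targets = [new for _, new in pairs]
    if len(set(targets)) != len(targets):
        raise ValueError("two files would end up with the same name")
    return pairs


def interactive_rename(old_names):
    """Let the user edit the names in $EDITOR (default vi), then rename."""
    editor = shlex.split(os.environ.get("EDITOR", "vi"))
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("\n".join(old_names) + "\n")
        path = tmp.name
    try:
        subprocess.run(editor + [path], check=True)
        with open(path) as handle:
            edited = handle.read()
    finally:
        os.unlink(path)
    # Note: this simple loop does not handle swaps such as a -> b, b -> a.
    for old, new in plan_renames(old_names, edited):
        os.rename(old, new)
```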
choose-wlan - A simple menu to connect to a wireless network.
"choose-wlan" uses iwlist, iwconfig, and dhclient to handle the
network interface directly. It uses 9menu to present a menu of
available networks, and provides options to connect to one of them,
to rescan, or to reload the wifi driver. (This is often necessary
in my experience.) It's a very easy, simple, fast way to get
online, originally intended to be used as a desktop icon for my
sister, but I use it from the command line (with ratmenu instead of
9menu).
wget-patch - add "--rename-output" option to wget
This patch adds an option that allows the user to specify a perl
expression used to modify the target filenames of a call to wget.
(Basically, it lets you "rename" the output, although it skips the
part where the output is given the original name.)
See wget-patch for an extended description
along with source and binaries.
In order to make it brain-dead simple to build and install the
patch, I wrote an installer shell script that clones the repo,
applies the patch, downloads build prerequisites, creates a .deb
package, and installs it. I wrote the script in a general enough
way that it could be used as an installer for patched versions of
other Debian packages; you only have to change the value of some
environment variables.
vimix - adjust PCM and master volumes simultaneously, with vi keys.
A command line tool to change the volume. Unlike most "mixer"
programs, this one adjusts both PCM and "master" volume on the
sound card simultaneously, so that you don't have to mess with
two controls in order to get the full volume on your card. (If
these values are not the same when it starts out, it sets them each
to their average. This is only appropriate if you're not using
anything but PCM to generate sound.)
It's called "vimix" because it uses vi-like keybindings and
numerical prefixes. The keys "j" and "k" lower and raise the volume
by 2 percent; if you type a number instead, they will lower or raise
the volume by that percentage instead.
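The key handling described above is easy to model. The following Python sketch is an illustration only (vimix itself is not shown here); the clamping to 0..100 and the rule that any other key discards a pending number are assumptions.

```python
def initial_level(pcm, master):
    """Start both controls at their average, as described for mismatched levels."""
    return (pcm + master) // 2


def apply_keys(volume, keys, default_step=2):
    """Apply vi-style keys: "j" lowers, "k" raises, digits form a numeric prefix."""
    prefix = ""
    for key in keys:
        if key.isdigit():
            prefix += key
        elif key in ("j", "k"):
            step = int(prefix) if prefix else default_step
            volume += step if key == "k" else -step
            volume = max(0, min(100, volume))  # assumed: stay within 0..100 percent
            prefix = ""
        else:
            prefix = ""  # assumed: an unrelated key cancels a pending number
    return volume
```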
It also doesn't muck up your command line history with a bunch of
unnecessary graphics the way that "aumix" does: it uses one line of
the terminal to display a horizontal percentage bar. (Since I wrote
this, aumix may have figured out how to use the "alternative screen"
to avoid mucking up the command line history.)
make-pdf.pl
This will make a pdf out of jpegs specified on the command
line. I made it to make PDFs out of some books I ripped
from books.google.com, back when you could download entire
books without creating multiple gmail accounts (you only
had to use multiple cookies). See also [[piratext.xpi]],
[[google_books_helper.user.js]] and [[googlebooks.user.js]], my
[[google books ripping script]] and my [[book copying machine]].
out.c, out.pl
I wrote out.c years ago to fill a gap in the GNU coreutils. It
allows you to pipe in data and writes it to a file that you specify
-- but it doesn't start writing the file until you finish sending
data, so that you can safely overwrite the input.
Since writing it, I learned of several alternative implementations,
including one called 'sponge' in Debian package 'moreutils'. I
think mine might be better than most, since it tries to minimize
copying. I wrote the version in perl much later than the C one,
because I couldn't find the C source at the time and thought I had
lost it in a hard drive crash. The perl version is better: it tries
to avoid using the temporary file at all, only opening it after a
memory buffer is full.
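The same buffering strategy (memory first, temporary file only on overflow) can be approximated in Python with tempfile.SpooledTemporaryFile. This is a stand-in sketch, not a port of out.c or out.pl; the function name and the 1 MiB default limit are assumptions.

```python
import shutil
import tempfile


def write_after_eof(source, destination_path, memory_limit=1 << 20):
    """Read all of source before the destination is opened for writing."""
    # The spooled file stays in memory until it grows past memory_limit,
    # at which point it rolls over to a real temporary file.
    with tempfile.SpooledTemporaryFile(max_size=memory_limit) as buffer:
        shutil.copyfileobj(source, buffer)
        buffer.seek(0)
        # Only now is the destination truncated, so it may also be the input.
        with open(destination_path, "wb") as destination:
            shutil.copyfileobj(buffer, destination)
```

With sys.stdin.buffer as the source, this gives the behaviour of a pipeline that filters a file and writes the result back over the same file.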
addgigs.pl - quickly add data to a file
Adds the specified number of gigs to a file. It uses "seek" to
add this space as a "hole"; i.e., disk space will not actually be
allocated to the file until it is used. This makes it much faster
than "dd if=/dev/zero", besides saving you disk space.
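A minimal Python sketch of growing a file with a hole follows. Note the swap: it uses truncate() to extend the file rather than the seek-based approach addgigs.pl is described as using; on filesystems that support sparse files the effect should be the same. The function names are invented for the example.

```python
import os

GIB = 1024 ** 3


def add_hole(path, extra_bytes):
    """Grow a file by extra_bytes without writing any data; return the new size."""
    # "ab" creates the file if it is missing and never truncates existing data.
    with open(path, "ab") as handle:
        new_size = handle.seek(0, os.SEEK_END) + extra_bytes
        handle.truncate(new_size)  # extending leaves a hole where supported
    return new_size


def add_gigs(path, gigs):
    """Convenience wrapper taking a size in GiB."""
    return add_hole(path, gigs * GIB)
```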
Quite useful for creating disk images. To manage partitions within
disk images, see [[lodisk]].
lodisk - manage loopback files containing partitions
Handles the (un)mounting of loopback devices for partitions
within files. Lodisk will create a loopback device, run a command on
it, then (unless the command was mount) deallocate the loopback device.
The following commands are supported:
mount mkfs.ext3 mkswap fsck tune2fs resize2fs [c|s]fdisk
Adding new commands is trivial, and it would be possible to support
unanticipated commands; however, because lodisk is intended to
be fool-proof, and because linux loopback devices on partitions
allow the user to write beyond the end of a partition, lodisk
refuses to run unsupported commands. An example of the danger
is "resize2fs": lodisk will specify the command line necessary
to enlarge the filesystem to the end of the partition; however,
if "resize2fs" were not specifically recognized by lodisk, the
typical usage of "resize2fs" would destroy all trailing partitions
by enlarging to the end of the disk.
sea.c - a fast, somewhat fancy bash prompt
This is a bash prompt that lists running and stopped jobs (only if
there are any) and the exit code of the last program run (unless
it's 0). In fact, it can be used to conditionally display any
integer along with a single character code indicating to the user
what it is. It will also display the current working directory in
the prompt, but only the end of it if it's too long.
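The conditional display and the path shortening can be illustrated in a few lines of Python. This is not sea.c; the single-character codes ("r", "s", "?"), the "..." marker and the 30-character limit are all made up for the example.

```python
def shorten_cwd(path, max_len=30, marker="..."):
    """Keep only the end of an over-long path, prefixed with a marker."""
    if len(path) <= max_len or max_len <= len(marker):
        return path
    return marker + path[-(max_len - len(marker)):]


def tagged(code, value):
    """Show an integer with its one-character code, but only when nonzero."""
    return f"{code}{value} " if value else ""


def build_prompt(cwd, running=0, stopped=0, exit_code=0, max_len=30):
    """Assemble the prompt from the optional counters and the shortened cwd."""
    return (tagged("r", running) + tagged("s", stopped)
            + tagged("?", exit_code) + shorten_cwd(cwd, max_len) + "$ ")
```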
I wrote this back when I used to write code just for the fun of
it, or for educational experience, or something, because it's
implemented in C without much good reason. I was probably just
looking for a C project. (I once wrote an implementation of 'ed' in
pure bash, too.) I wrote it after I read the "bash prompt HOWTO"
and noticed how terrible the shell script code in there was (in
terms of performance, and not just that). That shell code could
have been improved, but I decided to write my bash prompt in C.
xcopy.c
This program copies text into the X clipboard. I had to read a
bunch of X documentation to write this, and I didn't end up using it
the way I intended (to unify X and [[GNU screen]] paste buffers).
Nor did I ever do any X programming again. However, I do at least
know how X works, so I can be very confident in my final conclusion
that X sucks.
ssh-remote-add-public-key
This does what openssh's ssh-copy-id does, except that it will also
create a key if none exists, and won't read keys from the ssh agent.
I don't think ssh-copy-id existed when I wrote it; anyway, I didn't
know about it.
rrmmod - recursive rmmod
Amazingly, modprobe lacks any facility to remove a module with
dependencies by removing its dependencies. This does just that, by
parsing the output of rmmod.
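The removal order can be worked out as a depth-first walk. The sketch below is an assumption-laden illustration, not rrmmod itself: instead of parsing rmmod's messages it reads the "used by" column (the fourth field) of /proc/modules, and the function names are hypothetical.

```python
def parse_proc_modules(text):
    """Map each module name to the list of modules that are using it."""
    users = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        name, used_by = fields[0], fields[3]
        # The field is "-" when unused, otherwise a comma-terminated list.
        users[name] = [] if used_by == "-" else [m for m in used_by.split(",") if m]
    return users


def removal_order(module, users, seen=None):
    """Return modules in an order where every user is removed before what it uses."""
    seen = set() if seen is None else seen
    if module in seen:
        return []
    seen.add(module)
    order = []
    for dependent in users.get(module, []):
        order.extend(removal_order(dependent, users, seen))
    order.append(module)
    return order
```

Each name in the returned list would then be passed to rmmod in turn.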
S16startx - a script for fast Debian boot-up
This is a startx script designed to run in /etc/rcS.d on a Debian
system. I use this on my laptop to make it start X *much* sooner
than it would otherwise. It integrates with a matching xinitrc in a
clever and secure way, to allow the user to give priority to his own
processes (e.g. KDE startup). It is the result of a morning spent
experimenting with starting X at various points in the boot process
and seeing what broke and why. Since then I have used it on a few
Debian KDE machines with great success. However it remains easy for
hacks like this to be broken by upstream Debian changes.
In any case, you can't just install this and have it work: you have
to specify a user and you have to do something about the other
init scripts mentioned in this script's comments. You might also
have to load some other modules or scripts particular to your X
configuration.
tmpwm.pl - cascade windows for ratpoison's "tmpwm" command
This is a wrapper for ratpoison's "tmpwm" command that will cascade
the existing windows using "nawm". It suffers from an unfortunate
"nawm" limitation regarding multiple windows with the same name.
I don't use ratpoison anymore; now I use [[xmonad]]. xmonad,
unfortunately, lacks a tmpwm facility, although it's less necessary.
rename-tvshow.pl - rename tv shows using tvrage.com data
I wrote this for a friend so he could rename TV shows he downloads
using bittorrent. It's meant to rename the sort of filenames that
are used by release groups. Since he uses Windows cmd.exe for his
shell, it performs glob expansion on its arguments. (I myself don't
use Windows - or watch TV shows.)
editxmp - edit XMP metadata (Adobe PDF)
I wrote this for DIzzIEe. It allows the user to edit XMP metadata
in PDF files (or other Adobe product file formats) using a text
editor. Such metadata is often responsible for information leaks.
I should have made this into a web app, because I don't think
DIzzIEe ever figured out how to compile programs, so that, far from
"handy," this has probably never been used! (Except I did test it.)
CGI scripts
I've written a lot of CGI programs, and plan to put more of them
online eventually...
caching-proxy - a CGI work-alike to apache mod_proxy
This is a caching proxy that works like apache's mod_cache, except
as a perl CGI script that can run on a cheap unlimited shared
host. I use it on one such host in order to mirror the web page I
host on my DSL link.
I wrote this because when I uploaded large files to my web host, I
always wanted to post the link before the upload had finished, but
sometimes people would end up downloading incomplete versions --
depending on which transfer won the race. This proxy eliminates
that race.
Now when I want to share a large file on IRC or over the phone,
I can post a link to the cache as soon as I copy it into my web
root. If more than one person clicks the link, the transfers will
each go at the full upstream speed of my DSL line (or faster, as
more of the file is uploaded), instead of each user having to
share the link.
I would like to implement the full HTTP cache control semantics
for this CGI proxy, like mod_proxy does. However, the important
features for my use are already there, so I'm putting that off
for now. The only really important feature that it is missing is
the ability to satisfy HTTP Range requests from clients (i.e., to
allow clients to resume files). However, it does make
Range requests to the server in order to resume the downloads it
initiates.
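Resuming a download with a Range request comes down to asking for the bytes after what is already on disk. A hedged Python sketch (the proxy itself is a perl CGI script; the function name and file layout here are hypothetical):

```python
import os
import urllib.request


def resume_request(url, partial_path):
    """Build a request that continues from the end of a partial download."""
    have = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if have:
        # Ask only for the bytes that are still missing.
        request.add_header("Range", f"bytes={have}-")
    return request
```

A server that honours the header answers 206 Partial Content; a plain 200 means it sent the whole file again, so the partial copy should be discarded rather than appended to.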
Another feature I'd like to add is the ability to force an upload
into the cache -- which is simple enough. But I also want a
tool on the other end, on my DSL line web server, that will
intelligently seed the cache with new additions. (Not necessarily
immediately, but whenever there is available upstream bandwidth.)
jp2a.sh - convert an image URL to ASCII art
This is a simple wrapper around jp2a which accepts a URL as an
argument and caches its output to avoid unnecessary fetching.
Despite its name, it is not limited to JPEG images (unlike jp2a).
ttf-cgi - render text as PNG using true-type fonts
Supports different fonts, sizes, foreground and background
colors, and word-wrapping behaviors. This is useful
for including fonts that people don't normally have on
web pages. For a really cool way to use it, check out:
http://www.kryogenix.org/code/browser/lir/
listen together - sync your playlist with friends in real-time
This is a streaming media player in which the controls are shared
by multiple clients, so that they stay in sync with what they're
playing. (I didn't write the media player, only the synchronization
part.)
A couple years ago, I had a long phone conversation with someone
during which she recommended I listen to a particular album. I
wound up putting the album on while talking to her, and she put it
on as well; I synced it up manually by listening to the music on
her end coming through the phone. (It took a couple tries.)
I think we were both playing the album as mp3s on our computers,
and the thing could have been made perfect if the computers had
done the sync for us over the network. If she had been using Linux at
the time, I might have set up a ssh link or something. Anyway,
later on I had the idea of making it into a web app that streamed
to several clients simultaneously.
Eventually I got around to making this happen in the form of a web
app. It supports video as well, and in fact the intention was to
be able to watch movies with someone over the phone.
Although this code works, it's far from polished, and there is, or
at least was, a rather unfortunate bug inherited from the GPL'd
FLV player that it is based on. That is probably fixed by now; in
any case, there might be better alternatives (I could only find
one free software FLV player at the time).
More seriously, there is no way to upload videos (one specifies a
URL, which is then sent to all other clients for streaming), and
the playlist editor is extremely rudimentary. If I wanted to make
this really useful, I think what I would do is add support
for easily adding youtube videos (whose FLV URLs can be fetched
easily with "youtube-dl").
I also never got around to compensating for any potential
differences in the clocks of the clients. This is no problem
if they're all using NTP, but it would be trivial to check the
date on the server against the date on the client, and this would
even (automatically) compensate for lag in the HTTP response.
[TODO: Actually, I might have implemented this, I don't remember;
go check.]
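The clock comparison amounts to simple arithmetic. The sketch below shows one common way to do it (taking the midpoint of the round trip), which may or may not match what the app does; in the real web app this would run in the browser, and all names here are invented.

```python
def estimate_clock_offset(sent_at, server_time, received_at):
    """Estimate (server clock - client clock) in seconds from one round trip."""
    # Assume the server read its clock halfway through the round trip.
    midpoint = (sent_at + received_at) / 2
    return server_time - midpoint


def playback_position(track_started_server, client_now, offset):
    """Seconds into the track, using the client's clock corrected by the offset."""
    return (client_now + offset) - track_started_server
```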
In any case, I don't have a lot of interest in this code right now
because I don't often have extended phone conversations anymore.
I'm more likely to chat on IRC, where it's unimportant to have the
video in real-time sync. Still, I intend to finish this some day,
since it's so close to being useful.
blog w/ threaded comments
sarah's mysql forum
unthreaded blog thing
Irssi scripts
client:
simple_away.pl
autorejoin.pl
move_active_windows.pl
rignore.pl
save_channels.pl
tts-pub.pl
anarchobot:
ban.pl
downforeveryone.pl
http-head.pl
knockout-repeat.pl
mode.pl
urban-dictionary.pl
see irssi/, ~/.irssi/scripts, ~anarchobot/.irssi/scripts
Greasemonkey scripts & firefox extensions - excluding download helpers
FEBE.xpi
xkcd.user.js
w3m.user.js
google_books_helper.user.js
googlebooks.user.js
dumparump.user.js
Downloaders
piratext.xpi
pirate bay ripper
gigapedia ripper
4chan-save
ehow.com-download
google books ripper
jstor-save-links + jstor-rename-history.txt
Filesharing sites
ifile.it.user.js
filefactory.com.user.js
sendspace.com.user.js
rapidshare.com.user.js
HTML tools
I have written various tools for manipulating HTML files, which I'll
try to release here when I can.
html-add-smart-quotes - convert "" into “ and ”
This program uses perl's HTML::Parser to parse the html and is
clever about handling markup in the middle of your text (along
quotation boundaries, etc.)
html-extract-element - find HTML elements via tag/attribute/value
This is an extremely handy script for the initial extraction of
content from web sites. You can run it like this:
html-extract-element http://google.com?q=test h3 class r
...and you have extracted google results. It allows any number of
attribute/value pairs after the initial tag. You can also specify
a filename instead of a URL, or use it like this:
html-extract-element http://google.com?q=test h3 class r |
html-extract-element - a
...and you have extracted google results again, but better.
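To show the kind of matching involved, here is a small Python sketch built on the standard html.parser module. It is not the perl script: attribute values are compared for exact equality, which is an assumption, and the class and function names are made up.

```python
from html.parser import HTMLParser


class ElementExtractor(HTMLParser):
    """Collect the markup of every element matching a tag and attribute values."""

    def __init__(self, tag, wanted=None):
        super().__init__(convert_charrefs=False)
        self.tag = tag
        self.wanted = wanted or {}
        self.depth = 0      # nesting depth of the wanted tag while inside a match
        self.parts = []     # markup of the match currently being collected
        self.results = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.parts.append(self.get_starttag_text())
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag and all(
                dict(attrs).get(key) == value for key, value in self.wanted.items()):
            self.depth = 1
            self.parts = [self.get_starttag_text()]

    def handle_startendtag(self, tag, attrs):
        if self.depth:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth:
            self.parts.append(f"</{tag}>")
            if tag == self.tag:
                self.depth -= 1
                if self.depth == 0:
                    self.results.append("".join(self.parts))

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth:
            self.parts.append(f"&#{name};")


def extract_elements(html, tag, wanted=None):
    """Return the markup of all matching elements found in html."""
    parser = ElementExtractor(tag, wanted)
    parser.feed(html)
    parser.close()
    return parser.results
```

Feeding one call's results into a second call, for example extracting "a" from the matched "h3" elements, mirrors the pipeline usage above.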
I plan to change the syntax of this so that it accepts multiple
tags, differentiating them from attributes in the form
"attribute=value".
html-linkify-urls - convert URLs in text elements to links
html-generate-slides.pl - display a sequence of images without javascript
This generates HTML for a "slideshow"-like presentation of a list
of images, in the sense that only one image is visible at a time.
You click on the image to view the next image in the sequence. I
created it to showcase a sequence of screenshots from a movie,
made into sort of a comic book.
Originally I had written some javascript to do this, but DIzzIEe
complained about it not working without javascript, so I figured
out how to do it without any. It just uses internal anchors
to link between images. It's enough to create a pleasant
viewing experience, compared to a directory of images with an
apache-generated index.
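The anchor trick can be sketched in Python as follows. This is a guess at the general shape, not the output of html-generate-slides.pl: the "slideN" ids, the wrap-around from the last image to the first, and the full-viewport height used to keep one image on screen are all assumptions.

```python
from html import escape


def generate_slides(image_urls):
    """Return an HTML page where clicking an image jumps to the next one."""
    parts = ["<!DOCTYPE html>", "<html><body>"]
    count = len(image_urls)
    for index, url in enumerate(image_urls):
        next_index = (index + 1) % count  # the last slide links back to the first
        parts.append(
            f'<div id="slide{index}" style="min-height: 100vh">'
            f'<a href="#slide{next_index}">'
            f'<img src="{escape(url, quote=True)}" alt="slide {index + 1}">'
            f'</a></div>'
        )
    parts.append("</body></html>")
    return "\n".join(parts)
```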
Once someone used it to release a scanned comic book in HTML form
on piratebay, which I thought was pretty cool.
html-to-info
This converts HTML (a subset of it) into a GNU Info file, which
can then be converted into texinfo, LaTeX, and finally a PDF
pamphlet. It seemed like the easiest way to turn HTML into a PDF
with printed book quality. I used it to print out a book that
I had converted to HTML. And high quality it is! Footnotes,
intelligent page- and line-breaks, internal references between
pages: beautiful features.
However, I was never able to figure out how to set the margins in
the final PDF! This made the printed text much smaller than I
would like, wasting the edges of the paper. The problem is that
texinfo's LaTeX template is inflexible, so I would have had to
modify it, and in order to do so, I'd have to learn more about TeX
than I know. Or ask somebody, which is what I had planned to do
(but haven't as of yet).
This software: http://html2latex.sourceforge.net/ may be a much
better solution. Unlike my script (and GNU Info itself), it
supports tables and international characters. The next time I
need to print a book from HTML, I'll try it, and see if I can
make a decent book out of the output. Otherwise, I might fix the
margins on texinfo output. (It would be nice to be able to print
texinfo documentation the way that I want, anyway.)
Unwritten:
html-generate-toc - generate table of contents from headers
html-link-footnotes - link (and back-link) footnotes using anchors
html-fix-punctuation - move punctuation inside quotation marks and before footnotes
System administration tools
debchanges.pl - this needs a long writeup
push-lvm-snapshots - use rsync & lvm to remotely backup daily snapshots
This uses the famous rsync hard link snapshot technique
for daily remote backups, combined with LVM in order to
keep the copy atomic. The technique is described here:
http://www.mikerubel.org/computers/rsync_snapshots/
I wrote this after looking at a number of the scripts linked from
that page and also searching the debian repo. Everything I saw
seemed needlessly complicated -- and wrongly designed, copying
the original approach of Mike Rubel involving renaming backups
-- for what is really a very simple task. All you need to do is
ascertain the --link-dest argument before doing an ordinary rsync.
I looked at a few different scripts until I finally read this page:
http://rsnapshot.org/faq.html which details bug after bug after
bug -- serious bugs -- in a major software project which was even
written about in O'Reilly books! After reading that I decided to
avoid the possibility of bugs by writing it myself.
As explained, I wrote this script because backups are too important
to trust some random internet asshole's code. For that same reason,
my code is intentionally short and transparent: you can
verify its correctness pretty easily yourself. And it probably
still does everything you need: it goes through each specified local
volume, creates an LVM snapshot if it's an LVM volume, and rsyncs
the data to the specified remote backup location using the specified
rsync options. It first uses rsync to get a list of existing remote
backups, and uses --link-dest on the subsequent copy in order to
share data with the latest of them.
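Picking the --link-dest argument is the only non-obvious step, so here is a sketch of it in Python. It is not the script itself: it assumes backup directories are named with sortable timestamps (so the latest is simply the maximum) and the function name is hypothetical. A relative --link-dest is interpreted by rsync relative to the destination directory.

```python
def rsync_command(source, remote_root, existing, stamp, extra_options=()):
    """Build the rsync call for a new snapshot that shares data with the latest one."""
    command = ["rsync", "-a", *extra_options]
    if existing:
        # Unchanged files become hard links into the most recent backup.
        command.append(f"--link-dest=../{max(existing)}")
    command += [source.rstrip("/") + "/", f"{remote_root}/{stamp}/"]
    return command
```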
A separate function exists to prune old backups intelligently. By
default, it will keep every day for two weeks, every week for 10
weeks, and every month forever. You can take the snapshots as often
as you want by running the script more often. You can prune them
however you want by modifying the arguments.
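One way to express that default retention policy is shown below. This is an interpretation, not the script's pruning function: it keeps the oldest backup of each ISO week and of each calendar month, and the parameter names are invented.

```python
from datetime import date


def backups_to_keep(dates, today, daily_days=14, weekly_weeks=10):
    """Return the backup dates that the daily/weekly/monthly policy retains."""
    keep = set()
    seen_weeks, seen_months = set(), set()
    for day in sorted(dates):
        age = (today - day).days
        week = tuple(day.isocalendar())[:2]   # (ISO year, ISO week number)
        month = (day.year, day.month)
        if age < daily_days:
            keep.add(day)                     # every backup from the last two weeks
        if age < weekly_weeks * 7 and week not in seen_weeks:
            keep.add(day)                     # first backup of each recent week
        if month not in seen_months:
            keep.add(day)                     # first backup of each month, forever
        seen_weeks.add(week)
        seen_months.add(month)
    return keep
```

Everything not in the returned set would be a candidate for deletion.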
It's safe to use multiple copies of this script at the same time
to back up the same volume to the same remote destination, as long
as they aren't started within the same second on the system clock.
However, since it's more likely you'd like to wait for the previous
snapshot to finish before taking a new one, that's what the script
does, by waiting on a lock before doing anything.
wait_for_files.c
This uses the Linux kernel's new "inotify" feature to wait for a
set of files to come into existence. Its purpose is actually to
schedule init scripts according to dependencies without any central
daemon. In the initrd scripts I wrote for [[Samizdat Live CD]] I
used the following shell functions as wrappers:
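    # Wait until every named dependency has been marked as done.
    bootwait()
    {
        mkdir -p /bootwait
        # Rotate through the arguments, prefixing each with /bootwait/.
        local i=$#; while let i--; do
            local f="$1"; shift; set -- "$@" "/bootwait/$f"
        done
        wait_for_files "$@"
    }
    # Mark the named services as done by creating their flag files.
    bootdone()
    {
        mkdir -p /bootwait
        local i=$#; while let i--; do
            local f="$1"; shift; set -- "$@" "/bootwait/$f"
        done
        touch "$@"
    }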
Init scripts can specify their dependencies at the top with a single
call to "bootwait" and specify what they provide at the bottom
with a call to "bootdone". With the new Linux facility "udev",
this could be used to rewrite the entire init process along an
event-driven model (where scripts called by udevd do everything, and
init doesn't do anything but maybe run udevd and maybe getty) --
the way that it should be. (And the way that it is on my initrd.)
However, to my dismay, the distributions seem to be moving towards
giant central daemon processes for init, the same way that Apple's
Mac OS X does it.
UNSORTED:
alternative-representation-for-trees.txt
borf
draggable.js
dynmenu.c
FEBE 2009 01-23 06.03.59.xpi
FEBE.xpi
funtoo
interfaces.txt
mkquine.scm
random.org.sh
UrbanDictionary.pm
wget-log
xo-debian-install.sh
xo-debian.tgz
vi-storable
ALSO:
in ~/bin: diff2html df block-device-size fetchcookie p readline-stream pagesort
.bashrc, hermes:src/skel & code.work/skel
in old backups: mayo bot, jerkface blog, sarah forum, unthreaded blog thing
on may.org: jstor downloader, listen together
/backup/2004-home.squashfs.mnt/mathchat*
/backup/2004-home.squashfs.mnt/src: bash-ed, C/clipboard/ mayo/
/backup/2000-home.squashfs.mnt/my-docs/work/ <-- oldest shit