Handy utilities, listed roughly in order of handiness.

safeunrar (also: safeuntar, safeunzip, safeunjar) - extract into a single directory

This tool saves you from having to always list the contents of an archive before you extract it, just to make sure that it doesn't pollute the current directory with a million files, instead of putting them in a subdirectory like you expect. (I actually stopped doing this for a while with source code, since it seemed like it always followed the right convention -- until I got burned and wrote this tool.) It extracts zip, jar, tar.gz, tar.bz2, and rar files (support for .7z forthcoming), putting all of the extracted files into a single directory, or into the current directory if the archive root contains only one file, like it should.

duhsort - numerical sort for du --human output

Sorts the output of "du -h" according to size. I suggest defining a bash function like this:

    duh() { du -h "$@" | duhsort | less -E; }

interactive-rename - rename files using a text editor

Specify a list of files on the command-line, and edit them in "vi" (or your editor of choice). This is often even more convenient and useful than perl's "rename" util, since you can apply multiple patterns, see intermediate results, use undo, etc. I mainly used this for renaming mp3s back when they were often untagged. A similar tool is included in the "renameutils" package, but this one is better. However, you'll still want "renameutils" for "imv", which allows you to edit a single filename in a "readline" buffer. That's even more handy, since it's rare to rename multiple files.

choose-wlan - A simple menu to connect to a wireless network.

"choose-wlan" uses iwlist, iwconfig, and dhclient to handle the network interface directly. It uses 9menu to present a menu of available networks, and provides options to connect to one of them, to rescan, or to reload the wifi driver. (This is often necessary in my experience.)
It's a very easy, simple, fast way to get online, originally intended to be used as a desktop icon for my sister, but I use it from the command line (with ratmenu instead of 9menu).

wget-patch - add "--rename-output" option to wget

This patch adds an option that allows the user to specify a perl expression used to modify the target filenames of a call to wget. (Basically, it lets you "rename" the output, although it skips the part where the output is given the original name.) See wget-patch for an extended description along with source and binaries.

In order to make it brain-dead simple to build and install the patch, I wrote an installer shell script that clones the repo, applies the patch, downloads build prerequisites, creates a .deb package, and installs it. I wrote the script in a general enough way that it could be used as an installer for patched versions of other Debian packages; you only have to change the value of some environment variables.

vimix - adjust PCM and master volumes simultaneously, with vi keys.

A command line tool to change the volume. Unlike most "mixer" programs, this one adjusts both PCM and "master" volume on the sound card simultaneously, so that you don't have to mess with two controls in order to get the full volume on your card. (If these values are not the same when it starts out, it sets them each to their average. This is only appropriate if you're not using anything but PCM to generate sound.)

It's called "vimix" because it uses vi-like keybindings and numerical prefixes. The keys "j" and "k" lower and raise the volume by 2 percent; if you type a number first, they will lower or raise the volume by that percentage instead. It also doesn't muck up your command line history with a bunch of unnecessary graphics the way that "aumix" does: it uses one line of the terminal to display a horizontal percentage bar.
(Since I wrote this, aumix may have figured out how to use the "alternate screen" to avoid mucking up the command line history.)

make-pdf.pl

This will make a pdf out of jpegs specified on the command line. I made it to make PDFs out of some books I ripped from books.google.com, back when you could download entire books without creating multiple gmail accounts (you only had to use multiple cookies). See also [[piratext.xpi]], [[google_books_helper.user.js]] and [[googlebooks.user.js]], my [[google books ripping script]] and my [[book copying machine]].

out.c, out.pl

I wrote out.c years ago to fill a gap in the GNU coreutils. It allows you to pipe in data and writes it to a file that you specify -- but it doesn't start writing the file until you finish sending data, so that you can safely overwrite the input. Since writing it, I learned of several alternative implementations, including one called 'sponge' in Debian package 'moreutils'. I think mine might be better than most, since it tries to minimize copying.

I wrote the version in perl much later than the C one, because I couldn't find the C source at the time and thought I had lost it in a hard drive crash. The perl version is better: it tries to avoid using the temporary file at all, only opening it after a memory buffer is full.

addgigs.pl - quickly add data to a file

Adds the specified number of gigs to a file. It uses "seek" to add this space as a "hole"; i.e., disk space will not actually be allocated to the file until it is used. This makes it much faster than "dd if=/dev/zero", besides saving you disk space. Quite useful for creating disk images. To manage partitions within disk images, see [[lodisk]].

lodisk - manage loopback files containing partitions

Handles the (un)mounting of loopback devices for partitions within files. Lodisk will create a loopback device, run a command on it, then (unless the command was mount) deallocate the loopback device.
The following commands are supported:

    mount
    mkfs.ext3
    mkswap
    fsck
    tune2fs
    resize2fs
    [c|s]fdisk

Adding new commands is trivial, and it would be possible to support unanticipated commands; however, because lodisk is intended to be fool-proof, and because linux loopback devices on partitions allow the user to write beyond the end of a partition, specifying an unsupported command is not allowed. An example of the danger is "resize2fs": lodisk will specify the command line necessary to enlarge the filesystem to the end of the partition; however, if "resize2fs" were not specifically recognized by lodisk, the typical usage of "resize2fs" would destroy all trailing partitions by enlarging to the end of the disk.

sea.c - a fast, somewhat fancy bash prompt

This is a bash prompt that lists running and stopped jobs (only if there are any) and the exit code of the last program run (unless it's 0). In fact, it can be used to conditionally display any integer along with a single character code indicating to the user what it is. It will also display the current working directory in the prompt, but only the end of it if it's too long.

I wrote this back when I used to write code just for the fun of it, or for educational experience, or something, because it's implemented in C without much good reason. I was probably just looking for a C project. (I once wrote an implementation of 'ed' in pure bash, too.) I wrote it after I read the "bash prompt HOWTO" and noticed how terrible the shell script code in there was (in terms of performance, and not just that). That shell code could have been improved, but I decided to write my bash prompt in C.

xcopy.c

This program copies text into the X clipboard. I had to read a bunch of X documentation to write this, and I didn't end up using it the way I intended (to unify X and [[GNU screen]] paste buffers). Nor did I ever do any X programming again.
However, I do at least know how X works, so I can be very confident in my final conclusion that X sucks.

ssh-remote-add-public-key

This does what openssh's ssh-copy-id does, except that it will also create a key if none exists, and won't read keys from the ssh agent. I don't think ssh-copy-id existed when I wrote it; anyway, I didn't know about it.

rrmmod - recursive rmmod

Amazingly, modprobe lacks any facility to remove a module that other modules depend on by first removing those dependent modules. This does just that, by parsing the output of rmmod.

S16startx - a script for fast Debian boot-up

This is a startx script designed to run in /etc/rcS.d on a Debian system. I use this on my laptop to make it start X *much* sooner than it would otherwise. It integrates with a matching xinitrc in a clever and secure way, to allow the user to give priority to his own processes (e.g. KDE startup). It is the result of a morning spent experimenting with starting X at various points in the boot process and seeing what broke and why. Since then I have used it on a few Debian KDE machines with great success. However, it remains easy for hacks like this to be broken by upstream Debian changes. In any case, you can't just install this and have it work: you have to specify a user and you have to do something about the other init scripts mentioned in this script's comments. You might also have to load some other modules or scripts particular to your X configuration.

tmpwm.pl - cascade windows for ratpoison's "tmpwm" command

This is a wrapper for ratpoison's "tmpwm" command that will cascade the existing windows using "nawm". It suffers from an unfortunate "nawm" limitation regarding multiple windows with the same name. I don't use ratpoison anymore; now I use [[xmonad]]. xmonad, unfortunately, lacks a tmpwm facility, although it's less necessary.

rename-tvshow.pl - rename tv shows using tvrage.com data

I wrote this for a friend so he could rename TV shows he downloads using bittorrent.
It's meant to rename the sort of filenames that are used by release groups. Since he uses Windows cmd.exe for his shell, the script performs its own glob expansion on its arguments. (I myself don't use Windows - or watch TV shows.)

editxmp - edit XMP metadata (Adobe PDF)

I wrote this for DIzzIEe. It allows the user to edit XMP metadata in PDF files (or other Adobe product file formats) using a text editor. Such metadata is often responsible for information leaks. I should have made this into a web app, because I don't think DIzzIEe ever figured out how to compile programs, so that, far from "handy," this has probably never been used! (Except I did test it.)

CGI scripts

I've written a lot of CGI programs, and plan to put more of them online eventually...

caching-proxy - a CGI work-alike to apache mod_proxy

This is a caching proxy that works like apache's mod_cache, except as a perl CGI script that can run on a cheap unlimited shared host. I use it on one such host in order to mirror the web page I host on my DSL link.

I wrote this because when I uploaded large files to my web host, I always wanted to post the link before the upload had finished, but sometimes people would end up downloading incomplete versions -- depending on which transfer won the race. This proxy eliminates that race. Now when I want to share a large file on IRC or over the phone, I can post a link to the cache as soon as I copy it into my web root. If more than one person clicks the link, the transfers will each go at the full upstream speed of my DSL line (or faster, as more of the file is uploaded), instead of each user having to share the link.

I would like to implement the full HTTP cache control semantics for this CGI proxy, like mod_proxy does. However, the important features for my use are already there, so I'm putting that off for now. The only really important feature that it is missing is the ability to satisfy HTTP Range requests from clients (i.e., to allow clients to resume downloads).
However, it does make Range requests to the server in order to resume the downloads it initiates.

Another feature I'd like to add is the ability to force an upload into the cache -- which is simple enough. But I also want a tool on the other end, on my DSL line web server, that will intelligently seed the cache with new additions. (Not necessarily immediately, but whenever there is available upstream bandwidth.)

jp2a.sh - convert an image URL to ASCII art

This is a simple wrapper around jp2a which accepts an URL as an argument and caches its output to avoid unnecessary fetching. Despite its name, it is not limited to JPEG images (unlike jp2a).

ttf-cgi - render text as PNG using true-type fonts

Supports different fonts, sizes, foreground and background colors, and word-wrapping behaviors. This is useful for including fonts that people don't normally have on web pages. For a really cool way to use it, check out: http://www.kryogenix.org/code/browser/lir/

listen together - sync your playlist with friends in real-time

This is a streaming media player in which the controls are shared by multiple clients, so that they stay in sync with what they're playing. (I didn't write the media player, only the synchronization part.)

A couple years ago, I had a long phone conversation with someone during which she recommended I listen to a particular album. I wound up putting the album on while talking to her, and she put it on as well; I synced it up manually by listening to the music on her end coming through the phone. (It took a couple tries.) I think we were both playing the album as mp3s on our computers, and the thing could have been made perfect if the computers had done the sync for us over the network. If she had been using Linux at the time, I might have set up a ssh link or something. Anyway, later on I had the idea of making it into a web app that streamed to several clients simultaneously. Eventually I got around to making this happen in the form of a web app.
It supports video as well, and in fact the intention was to be able to watch movies with someone over the phone.

Although this code works, it's far from polished, and there is, or at least was, a rather unfortunate bug inherited from the GPL'd FLV player that it is based on. That is probably fixed by now; in any case, there might be better alternatives (I could only find one free software FLV player at the time). More seriously, there is no way to upload videos (one specifies an URL, which is then sent to all other clients for streaming), and the playlist editor is extremely rudimentary. If I wanted to make this really useful, I think what I would do is add support for easily adding youtube videos (whose FLV URLs can be fetched easily with "youtube-dl").

I also never got around to compensating for any potential differences in the clocks of the clients. This is no problem if they're all using NTP, but it would be trivial to check the date on the server against the date on the client, and this would even (automatically) compensate for lag in the HTTP response. [TODO: Actually, I might have implemented this, I don't remember; go check.]

In any case, I don't have a lot of interest in this code right now because I don't often have extended phone conversations anymore. I'm more likely to chat on IRC, where it's unimportant to have the video in real-time sync. Still, I intend to finish this some day, since it's so close to being useful.
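
A rough sketch of one way the clock comparison for "listen together" could work, under assumptions that are not taken from its code: the client notes its own clock when it sends a request and again when the response arrives, the server reports its clock in the response, and the network delay is about the same in each direction. The function names and the millisecond timestamps are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sketch, not code from "listen together".
# All times are integer milliseconds.
#   $1 = client clock when the request was sent
#   $2 = server clock as reported in the response
#   $3 = client clock when the response arrived
# Assumes the network delay is roughly equal in both directions.
clock_offset() {
    # Client-clock estimate of the moment the server read its clock.
    midpoint=$(( ($1 + $3) / 2 ))
    # Positive result: the server clock is ahead of the client clock.
    echo $(( $2 - midpoint ))
}

# Translate a server-side timestamp into the client's clock.
#   $1 = server timestamp, $2 = offset from clock_offset
to_client_time() {
    echo $(( $1 - $2 ))
}

offset=$(clock_offset 1000 1650 1200)
echo "offset: $offset ms"
echo "server time 5000 on the client clock: $(to_client_time 5000 "$offset")"
```

With these sample numbers the estimated offset is 550 ms, so an event the server stamps at 5000 lands at 4450 on the client's clock. Taking the midpoint of the round trip is what cancels most of the HTTP lag; a plain one-way comparison would include the response delay in the offset.
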

blog w/ threaded comments
sarah's mysql forum
unthreaded blog thing

Irssi scripts

client:
    simple_away.pl
    autorejoin.pl
    move_active_windows.pl
    rignore.pl
    save_channels.pl
    tts-pub.pl

anarchobot:
    ban.pl
    downforeveryone.pl
    http-head.pl
    knockout-repeat.pl
    mode.pl
    urban-dictionary.pl

see irssi/, ~/.irssi/scripts, ~anarchobot/.irssi/scripts

Greasemonkey scripts & firefox extensions - excluding download helpers

    FEBE.xpi
    xkcd.user.js
    w3m.user.js
    google_books_helper.user.js
    googlebooks.user.js
    dumparump.user.js

Downloaders

    piratext.xpi
    pirate bay ripper
    gigapedia ripper
    4chan-save
    ehow.com-download
    google books ripper
    jstor-save-links + jstor-rename-history.txt

Filesharing sites

    ifile.it.user.js
    filefactory.com.user.js
    sendspace.com.user.js
    rapidshare.com.user.js

HTML tools

I have written various tools for manipulating HTML files, which I'll try to release here when I can.

html-add-smart-quotes - convert "" into “ and ”

This program uses perl's HTML::Parser to parse the html and is clever about handling markup in the middle of your text (along quotation boundaries, etc.)

html-extract-element - find HTML elements via tag/attribute/value

This is an extremely handy script for initial extracting of content from web sites. You can run it like this:

    html-extract-element http://google.com?q=test h3 class r

...and you have extracted google results. It allows any number of attribute/value pairs after the initial tag. You can also specify a filename instead of an URL, or use it like this:

    html-extract-element http://google.com?q=test h3 class r | html-extract-element - a

...and you have extracted google results again, but better. I plan to change the syntax of this so that it accepts multiple tags, differentiating them from attributes in the form "attribute=value".
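
A small sketch of how the planned "attribute=value" syntax for html-extract-element might be told apart from tag names on the command line. This is only an illustration, not the script's code: the function name parse_selectors and the rule that an attribute applies to the most recently named tag are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of the planned argument syntax; this is not the
# actual html-extract-element code. A word containing "=" is read as an
# attribute filter on the most recent tag; any other word starts a new tag.
parse_selectors() {
    current_tag=""
    for arg in "$@"; do
        case "$arg" in
            *=*)
                if [ -z "$current_tag" ]; then
                    echo "attribute '$arg' given before any tag" >&2
                    return 1
                fi
                # Split at the first "=": name before it, value after it.
                printf 'attr %s %s %s\n' "$current_tag" "${arg%%=*}" "${arg#*=}"
                ;;
            *)
                current_tag=$arg
                printf 'tag %s\n' "$current_tag"
                ;;
        esac
    done
}

parse_selectors h3 class=r a
```

Run as written, this prints one line per recognized word: a "tag" line for h3, an "attr" line tying class and r to h3, and a "tag" line for a. Splitting at the first "=" leaves any later "=" inside the value, which matters for attribute values such as URLs with query strings.
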

html-linkify-urls - convert URLs in text elements to links

html-generate-slides.pl - display a sequence of images without javascript

This generates HTML for a "slideshow"-like presentation of a list of images, in the sense that only one image is visible at a time. You click on the image to view the next image in the sequence. I created it to showcase a sequence of screenshots from a movie, made into sort of a comic book. Originally I had written some javascript to do this, but DIzzIEe complained about it not working without javascript, so I figured out how to do it without any. It just uses internal anchors to link between images. It's enough to create a pleasant viewing experience, compared to a directory of images with an apache-generated index. Once someone used it to release a scanned comic book in HTML form on piratebay, which I thought was pretty cool.

html-to-info

This converts HTML (a subset of it) into a GNU Info file, which can then be converted into texinfo, LaTeX, and finally a PDF pamphlet. It seemed like the easiest way to turn HTML into a PDF with printed book quality. I used it to print out a book that I had converted to HTML. And high quality it is! Footnotes, intelligent page- and line-breaks, internal references between pages: beautiful features.

However, I was never able to figure out how to set the margins in the final PDF! This made the printed text much smaller than I would like, wasting the edges of the paper. The problem is that texinfo's LaTeX template is inflexible, so I would have had to modify it, and in order to do so, I'd have to learn more about TeX than I know. Or ask somebody, which is what I had planned to do (but haven't as of yet).

This software: http://html2latex.sourceforge.net/ may be a much better solution. Unlike my script (and GNU Info itself), it supports tables and international characters. The next time I need to print a book from HTML, I'll try it, and see if I can make a decent book out of the output.
Otherwise, I might fix the margins on texinfo output. (It would be nice to be able to print texinfo documentation the way that I want, anyway.)

Unwritten:

    html-generate-toc - generate table of contents from headers
    html-link-footnotes - link (and back-link) footnotes using anchors
    html-fix-punctuation - move punctuation inside quotation marks and before footnotes

System administration tools

debchanges.pl - this needs a long writeup

push-lvm-snapshots - use rsync & lvm to remotely back up daily snapshots

This uses the famous rsync hard link snapshot technique for daily remote backups, combined with LVM in order to keep the copy atomic. The technique is described here: http://www.mikerubel.org/computers/rsync_snapshots/

I wrote this after looking at a number of the scripts linked from that page and also searching the debian repo. Everything I saw seemed needlessly complicated -- and wrongly designed, copying the original approach of Mike Rubel involving renaming backups -- for what is really a very simple task. All you need to do is ascertain the --link-dest argument before doing an ordinary rsync. I looked at a few different scripts until I finally read this page: http://rsnapshot.org/faq.html which details bug after bug after bug -- serious bugs -- in a major software project which was even written about in O'Reilly books! After reading that I decided to avoid the possibility of bugs by writing it myself.

As explained, I wrote this script because backups are too important to trust some random internet asshole's code. For that same reason, my code is intentionally short and transparent: you can verify its correctness pretty easily yourself. And it probably still does everything you need: it goes through each specified local volume, creates an LVM snapshot if it's an LVM volume, and rsyncs the data to the specified remote backup location using the specified rsync options.
It first uses rsync to get a list of existing remote backups, and uses --link-dest on the subsequent copy in order to share data with the latest of them. A separate function exists to prune old backups intelligently. By default, it will keep every day for two weeks, every week for 10 weeks, and every month forever. You can take the snapshots as often as you want by running the script more often. You can prune them however you want by modifying the arguments.

It's safe to use multiple copies of this script at the same time to back up the same volume to the same remote destination, as long as they aren't started within the same second on the system clock. However, since it's more likely you'd like to wait for the previous snapshot to finish before taking a new one, that's what the script does, by waiting on a lock before doing anything.

wait_for_files.c

This uses the Linux kernel's new "inotify" feature to wait for a set of files to come into existence. Its purpose is actually to schedule init scripts according to dependencies without any central daemon. In the initrd scripts I wrote for [[Samizdat Live CD]] I used the following shell functions as wrappers:

    # Block until a flag file exists for every named dependency.
    bootwait() {
        mkdir -p /bootwait
        # Rewrite each argument NAME as /bootwait/NAME, one per pass.
        local i=$#
        while let i--; do
            local f="$1"; shift
            set -- "$@" "/bootwait/$f"
        done
        wait_for_files "$@"
    }

    # Create the flag file for everything this script provides.
    bootdone() {
        mkdir -p /bootwait
        local i=$#
        while let i--; do
            local f="$1"; shift
            set -- "$@" "/bootwait/$f"
        done
        touch "$@"
    }

Init scripts can specify their dependencies at the top with a single call to "bootwait" and specify what they provide at the bottom with a call to "bootdone". With the new Linux facility "udev", this could be used to rewrite the entire init process along an event-driven model (where scripts called by udevd do everything, and init doesn't do anything but maybe run udevd and maybe getty) -- the way that it should be. (And the way that it is on my initrd.)
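
The "bootwait"/"bootdone" pattern can be tried out without the C program. The sketch below is an illustration under stated assumptions, not the author's code: it replaces wait_for_files with a polling loop (the real one is described as using inotify), uses a temporary flag directory instead of /bootwait so it runs without root, and the two "init scripts" are made-up functions.

```shell
#!/bin/sh
# Polling stand-in for wait_for_files, for experimenting only.
# Assumption: the real program blocks via inotify; this one simply
# re-checks once a second until every named file exists.
wait_for_files() {
    for f in "$@"; do
        while [ ! -e "$f" ]; do
            sleep 1
        done
    done
}

# Hypothetical flag directory; the functions in the text use /bootwait.
flagdir=$(mktemp -d)

# Stand-ins for two init scripts: "network" provides a flag and
# "mount_nfs" depends on it.
network() {
    sleep 1                     # pretend to bring an interface up
    touch "$flagdir/network"    # the job "bootdone network" would do
}

mount_nfs() {
    wait_for_files "$flagdir/network"   # the job "bootwait network" would do
    echo "network is up, mounting"
}

network &      # started in parallel, in no particular order
mount_nfs
wait
rm -r "$flagdir"
```

The dependent script is free to start first; it just blocks until the flag file appears, which is the whole scheduling mechanism. Polling costs up to a second of latency per dependency, which is the gap an inotify-based waiter closes.
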
However, to my dismay, the distributions seem to be moving towards giant central daemon processes for init, the same way that Apple's Mac OS X does it.

UNSORTED:

alternative-representation-for-trees.txt borf draggable.js dynmenu.c FEBE 2009 01-23 06.03.59.xpi FEBE.xpi funtoo interfaces.txt mkquine.scm random.org.sh UrbanDictionary.pm wget-log xo-debian-install.sh xo-debian.tgz vi-storable

ALSO:

in ~/bin: diff2html df block-device-size fetchcookie p readline-stream pagesort
.bashrc, hermes:src/skel & code.work/skel
in old backups: mayo bot, jerkface blog, sarah forum, unthreaded blog thing
on may.org: jstor downloader, listen together
/backup/2004-home.squashfs.mnt/mathchat*
/backup/2004-home.squashfs.mnt/src: bash-ed, C/clipboard/ mayo/
/backup/2000-home.squashfs.mnt/my-docs/work/ <-- oldest shit