I noticed that my original implementation of linkripper was included in Mark Pilgrim’s blinks archive for January. This inspired me to go and update linkripper to automagically open up zip files, so you don’t need to do that on your own. Behold, the new version of linkripper:
# linkripper.py
# by Patrick Wagstrom
# this work is dedicated to the public domain
# see: http://creativecommons.org/licenses/publicdomain/
import libxml2
import sys
import zipfile

def readfile(filename):
    data = None
    try:
        zip = zipfile.
Posts
That’s right! Just when I thought it would never get published, a book showed up for me today in the office. Right there in chapter 5 is what I wrote. It’s a little strange because it has three different affiliations listed for me: Argonne National Lab, Illinois Institute of Technology, and Carnegie Mellon University. Anyway, go buy a copy or just ask to borrow mine.
Today I received a syllabus for one of my classes as a Word document with a bunch of hyperlinks in it. The hyperlinks are important, as they link to the readings for the course. Clicking them all by hand would have taken a little while, so, undaunted, I went looking for another way.
OpenOffice.org makes this quite easy. Just save the file in OpenOffice.org’s own format, then unzip it, and you’ve got some XML files with all your document guts in them.
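The unzip-and-read-the-XML idea can be sketched in a few lines of Python. This is not the linkripper code itself (which uses libxml2), just a minimal illustration using only the standard library. It assumes the links live in content.xml as "a" elements carrying an xlink:href attribute, which is how OpenOffice.org documents usually store them; the function name extract_links is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Attribute name for xlink:href in ElementTree's {namespace}local notation.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def extract_links(source):
    """Return the hyperlink targets found in an OpenOffice.org document.

    `source` is a path or a file-like object pointing at the zipped document.
    Assumption: the document body is stored in the archive as content.xml.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    links = []
    for element in root.iter():
        # Compare only the local name, so the check does not depend on
        # which namespace URI the text elements happen to use.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "a" and XLINK_HREF in element.attrib:
            links.append(element.attrib[XLINK_HREF])
    return links
```

Calling extract_links("syllabus.sxw") (a hypothetical filename) would return the link targets in document order, ready to print or hand to a downloader.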
This weekend I’ve been learning how to do GUI programming in Python with PyGTK. I’ve been working on an editor for BibTeXML so you don’t always need to type all the fields by hand. This is what I’ve got going so far:
Only the people editing works right now, and even that isn’t 100% yet. It still needs a few tweaks, like auto-updating the list of people on the right and being able to add people to the file.
That’s right, I literally found a Mac Classic last night when I was walking home from the Dean Meetup. It was sitting by a trash can, so I picked it up and took it home. It was a cold walk home with it: I had about a mile left to go and it was about 15 degrees outside. Luckily it was pretty light and I could carry it with one hand.
As if I really needed another project that I will only half finish, today I started to muck around with BibTeXML because I wanted something more powerful than the standard BibTeX setup. Well, I’ve done some nifty hacking with my mad XSLT skills and got it to cross-reference author names when authors show up multiple times. Right now it just renders to an HTML document, .
I just noticed that James Ewing has some custom firmware for the WRT54G on his website. I think I’ll mess around with some of these this weekend. They’ve got all the features of the newest LinkSys firmware and have some additional stuff so it’s SSHable and whatnot. Pretty smooth.
Over the last few days I’ve been writing my own XML database in Python. Today I discovered Apache Xindice (it’s pronounced zeen-dee-chay). It seems like it could be a lot like what I’m trying to do. Only it’s written in Java and I hate Java. So I’ll continue on doing my Python stuff for PriDB.
One other interesting thing that I found was XUpdate, which is a language for updating documents in XML databases.
Amazingly enough, PriDB seems to be working pretty well. I’m also working on a test project to work in tandem with it called PriBlog. Basically, it will use some of the features of PriDB to do interesting things with my blog. Anyway, I’ve managed to implement the LIMIT keyword. I also fixed the regex parser that handles the XPathQL on the server side.
I’ve also been thinking about moving away from using XMLRPC for the interface method.
This is a brief introduction to what I’m thinking for PriDB’s query language. It’s a weird mixture of SQL and XPath right now, so you kinda need to know both. The simplest request will look something like this:
SELECT "//*" FROM documentName
Such a query is equivalent to the more conventional:
SELECT * FROM tableName
except that it will select each node in the document instead of each row in the table.
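To make the idea a bit more concrete, here is a rough sketch of how a query like that could be parsed with a regex and evaluated. It is not PriDB’s actual parser. The names QUERY_RE and run_query, the dictionary of documents, and the placement of an optional LIMIT clause at the end of the query are all assumptions for illustration. It uses the standard library’s ElementTree, which only understands a subset of XPath.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical grammar: SELECT "<xpath>" FROM <name> [LIMIT <n>]
QUERY_RE = re.compile(
    r'^\s*SELECT\s+"(?P<xpath>[^"]+)"\s+FROM\s+(?P<doc>\w+)'
    r'(?:\s+LIMIT\s+(?P<limit>\d+))?\s*$',
    re.IGNORECASE,
)


def run_query(query, documents):
    """Evaluate a query against `documents`, a dict of name -> root Element."""
    match = QUERY_RE.match(query)
    if match is None:
        raise ValueError("could not parse query: %r" % query)

    root = documents[match.group("doc")]
    xpath = match.group("xpath")

    # ElementTree rejects absolute paths on an element, so hang the root
    # under a throwaway parent and search relative to that parent. This
    # way "//*" also matches the document's root element.
    holder = ET.Element("holder")
    holder.append(root)
    if xpath.startswith("/"):
        xpath = "." + xpath
    nodes = holder.findall(xpath)

    limit = match.group("limit")
    if limit is not None:
        nodes = nodes[: int(limit)]
    return nodes
```

With a document parsed from `<a><b/><c><d/></c></a>` and stored under the name documentName, the query SELECT "//*" FROM documentName returns every element in document order, and adding LIMIT 2 keeps just the first two.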