
Archive for June, 2008

Why Python Rocks II: Data structures

Okay. So what's cool about Python? I can't count the number of times I've had to show skeptics why Python is cool, what Python can do that their favorite language can't do. So I'm writing a bunch of articles showing off Python's Awesome.


MiniOrganizer v0.1 released

I just released the first version of MiniOrganizer.

MiniOrganizer is a small no-nonsense personal digital organizer written in GTK2, featuring appointments, todos and notes. It uses iCalendar files as its native back-end, supports multiple alarms per appointment/todo and has recursive todos.

MiniOrganizer currently features:

  • Appointments.
  • Hierarchical Todos.
  • Multiple alarms per appointment and todo.
  • Alarm notification.
  • (Gnome) Panel docking.

You can visit its homepage at http://miniorganizer.electricmonk.nl.

Remember that this is only v0.1 – the first released version – so it will contain some bugs and other problems. Any feedback is greatly appreciated.

Breakpoint-induced Python debugging with IPython

Most of the Python programmers out there will know about IPython. Most of them will also know about the Python Debugger (PDB).

IPython has an advanced version of PDB (spectacularly named 'ipdb') which does the same for PDB as IPython does for the normal interactive Python interpreter. It adds tab completion, color syntax highlighting, etc. The -pdb switch to IPython gives us access to the debugger automatically in the event of uncaught exceptions:

def ham():
	x = 5
	raise NotImplementedError('Use the source, luke!')
 
ham()

We now run this code using ipython -pdb:

[todsah@jib]~$ ipython -pdb
In [1]: import test
<type 'exceptions.NotImplementedError'>: Use the source, luke!
> /home/todsah/test.py(5)ham()
      4         x = 5
----> 5         raise NotImplementedError('Use the source, luke!')
      6 

ipdb> print x
5

And as you can see we get dropped at a nice ipdb> prompt which allows us to use the additional power of IPython to investigate the problem.

Like most decent debuggers, PDB also lets us set breakpoints. In Python, we do that in the code, rather than via external means. To do this, we import the pdb module and call pdb.set_trace() at the point where we want execution to drop into the debugger.

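import pdb
def ham():
	x = 5
	pdb.set_trace()  # execution pauses here and we get the (Pdb) prompt
	raise NotImplementedError('Use the source, luke!')
 
ham()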

We run it, and lo and behold, we get dropped into the pdb debugger:

In [1]: import test
> /home/todsah/test.py(8)ham()
-> raise NotImplementedError('Use the source, luke!')
(Pdb) print x
5

Buuuuut… as you can see from the (Pdb) prompt, this is the normal PDB debugger, not IPython's enhanced version. Wouldn't it be neat to be able to use IPython's debugger for breakpoint-induced debugging too? I spent some time looking around on the Interwebz, trying to find out how to do this, and to my surprise I couldn't find anything. So I dove into the IPython source and discovered IPython.Debugger.Tracer:

from IPython.Debugger import Tracer; debug_here = Tracer()
 
def ham():
	x = 5
	debug_here()
	raise NotImplementedError('Use the source, luke!')
 
ham()

Now when we execute this code, we're dropped into the IPython enhanced IPDB debugger:

In [1]: import test
> /home/todsah/test.py(8)ham()
      7         debug_here()
----> 8         raise NotImplementedError('Use the source, luke!')
      9 

ipdb> print x
5

We can use tab completion and all the other goodies IPython offers over standard Python now.
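The IPython.Debugger module used above belongs to the IPython releases of the time. As a rough sketch for newer setups, and assuming IPython 5.1 or later, the same breakpoint idea is available as IPython.core.debugger.set_trace. The fallback to plain pdb and the debug flag on ham() are additions for illustration only, and the code is written for Python 3:

```python
import pdb

try:
    # Assumption: IPython >= 5.1, where set_trace lives in IPython.core.debugger.
    from IPython.core.debugger import set_trace as debug_here
except ImportError:
    # Fall back to the plain debugger when IPython is not installed.
    debug_here = pdb.set_trace


def ham(debug=False):
    x = 5
    if debug:
        # Pauses here: an ipdb> prompt with IPython, a (Pdb) prompt without.
        debug_here()
    return x


# Normal run: no breakpoint is hit. Call ham(debug=True) to stop inside ham().
ham()
```

Guarding the call with a flag keeps the breakpoint in the code without stopping every run.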

George Carlin is dead.

George Carlin, an American stand-up comedian (though I would rather classify him as a political speaker), has died. Here's an excerpt from one of his acts on American politics:

Everybody complains about politicians. Everybody says they suck. Well, where do people think these politicians come from? They don't fall out of the sky. They don't pass through a membrane from another reality. They come from American parents and American families, American homes, American schools, American churches, American businesses and American universities, and they are elected by American citizens. This is the best we can do folks. This is what we have to offer. It's what our system produces: Garbage in, garbage out.

If you have selfish, ignorant citizens, you're going to get selfish, ignorant leaders. Term limits ain't going to do any good; you're just going to end up with a brand new bunch of selfish, ignorant Americans. So, maybe, maybe, maybe, it's not the politicians who suck. Maybe something else sucks around here – like, the public. Yeah, the public sucks. There's a nice campaign slogan for somebody: 'The Public Sucks. Fuck Hope'.

Because if it really is just the fault of these politicians, then where are all the other bright people of conscience? Where are all the bright, intelligent Americans ready to step in and save the nation and lead the way? We don't have people like that in this country. Everybody's at the mall, scratching his ass and picking his nose and getting his credit card out of his fanny-pack to buy a pair of sneakers with lights in them! So I have solved this little political dilemma for myself in a very simple way. On election day, I stay home! I don't vote! Fuck 'm!

Fuck 'm! I don't vote. Two reasons: First of all, it's meaningless. This country was bought and sold and paid for a long time ago. The shit they shovel around every four years? (mimics masturbation) Doesn't mean a fucking thing. And secondly I don't vote because I believe if you vote, you have no right to complain. People like to twist that around, I know. "If you don't vote you have no right to complain", but where's the logic in that? If you vote and you elect dishonest, incompetent people, they get into office and screw everything up, well, you are responsible for what they have done. You caused the problem, you voted them in. You have no right to complain. I, on the other hand, who did not vote? Who, in fact, did not even leave the house on election day, am in no way responsible for what these people have done and have every right to complain about the mess that you people created and that I have nothing to do with. I know that later on this year you're gonna have another one of those really swell elections that you like so much. You enjoy yourselves, it'll be a lot of fun, and I'm sure as soon as the election is over your country will improve immediately.

Now go and watch all his videos on YouTube. The man knew what he was talking about.

Why you shouldn't be using S3 or Google App Engine

Recently a new 'hype' has been popping up, namely Amazon's S3 and Google App Engine. For those who don't know, S3 and App Engine are basically hosting facilities for web applications which run, and data that's stored, remotely on Amazon's or Google's infrastructure. This allows you to benefit from the huge and reasonably super-scaling architectures that these companies have built.

While this may seem nice, I've had my doubts about the usefulness of such services. For one, it fosters vendor lock-in. Your application with an Amazon S3 database back-end won't be able to run anywhere else. Neither will your Google App Engine application. Sure, some of the software is released as Open Source, but the software is just icing on the cake. It's the architecture that really counts, and it won't be easy to reproduce Google's or Amazon's architecture. And when you build your application against those architectures, it's bound to become limited to them, in the same way (and probably even beyond that) that SQL queries you write for one database will perform badly on a different database.

And I am now seeing the first proof of these problems, as well as an entirely different problem: debugging. You see, beyond the tools provided by the service, you are out of options when it comes to debugging. When hosting your own data storage or application platform, you can sink your teeth into it when problems arise. Even when most of the stuff you're using isn't Open Source, you'll still be able to sic a whole range of diagnostic tools on the problem. No such luck with remote application hosting services. They're black boxes beyond the most common debugging situations.

That's what the people on a thread at Amazon's Web Services Forum are experiencing right now, if you ask me. I'll quote the gist of the conversation:

all data we store on S3 has gone through the same code path for months. starting a couple days ago a small percentage of the objects we are retrieving are not checksumming to the correct values. we hash and store objects by checksum and rehash the objects when we retrieve to ensure there is no data corruption. all the objects we're having issues with were uploaded at approximately the same time period a few days ago.

we've stored 10's of millions of objects in S3 and never encountered such problems. please let me know ASAP if you have any idea what could be going on here. thanks.

I'm having similiar problems. [...] I've been investigating our end to find the problem, and it was just suggested that I should check the forums to see if anyone else was having problems. [...] This is super-high priority for us (both corporately and personally, since lack of sleep dealing with this is killing me

The first post was made on June 22, 2008 5:05 PM. An Amazon support engineer (I assume) is working on it, but as of June 23, 2008 11:12 AM there is still no answer.
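The integrity check the first poster describes (store each object under its checksum, rehash it on retrieval) can be sketched roughly like this. It is only an illustration: ChecksumStore is a made-up in-memory stand-in, not the S3 API, and SHA-256 is an assumption, since the thread doesn't say which hash they use.

```python
import hashlib


class ChecksumStore:
    """Hypothetical in-memory stand-in for a remote object store (not the S3 API)."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The object's key is the checksum of its content (SHA-256 is an assumption).
        key = hashlib.sha256(data).hexdigest()
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._objects[key]
        # Rehash on retrieval; a mismatch means the stored bytes changed.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("checksum mismatch for object " + key)
        return data


store = ChecksumStore()
key = store.put(b"hello world")
assert store.get(key) == b"hello world"

# Simulate corruption happening on the storage side.
store._objects[key] = b"hello w0rld"
try:
    store.get(key)
    corruption_detected = False
except ValueError:
    corruption_detected = True
```

A check like this tells you that something got corrupted, but not where or why, and with a remote service that's exactly the part you can't investigate yourself.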

This illustrates exactly my objection to such remotely hosted application services: they're out of your control. That's something I couldn't live with, and I think most companies shouldn't want to either. As one user comments: "This is super-high priority for us (both corporately and personally". Staking your entire business on some black-box remote service seems like a silly idea to me. Most of the service-providing software that I (and the company I work at) use is Free or otherwise Open Software, which means we'll always have the source to dive into when we run into problems. And even if the source isn't there, at least you'll get to look at the problem from both sides. When it comes to database problems, you'll be able to view the logs, turn on debugging, and inspect the entire environment the service is running in, from software to hardware. That's something you can't do with a remote service.

Yes, there will always be problems for which a fix is hard to find or for which there simply isn't a fix. If you're not willing to run that risk, you can pay a company a lot of money for support, and let them handle the hard problems (or for some even the easy problems). Naturally, Amazon and Google both offer that support too, but that's still different. You see, when you run into a really unfixable problem in your own architecture, you can always swap any part of the software or hardware and try again. Or at least, that's what most developers try to achieve anyway: interchangeable software (databases, application servers, programming libraries) and hardware. But with a remote service such as S3 or App Engine, you've already committed your entire application to one, huge, non-interchangeable component from which you can never escape. Hence the vendor lock-in.

But who knows what the future brings? This may all turn out to be a non-issue, and companies and developers may all flock to such general service providers in the future without any difficulties whatsoever. I guess I'm a conservative person when it comes to these kinds of changes. I'd rather wait and see.

Free Speech

Free Speech. Why is it important? Because it's an extension of Free Thought. Should we be able to think whatever the hell we want? Yes we should. Controlling Free Speech is about nothing more than controlling Free Thought. "You're not allowed to say this, because somebody might not agree with it. You're not allowed to say that, because somebody might feel hurt by it". What they're really trying to do is control what you can think. Trying to generate a "mindset", a "zeitgeist". Brainwashing is more like it. Well, fuck that. I'll think about whatever the hell I want and as long as I'm thinking it, I'll be saying it.

So fuck the Dutch government for trying to outlaw Free Thought, and keep on publishing cartoons showing Mohammed, wearing t-shirts implying cops are corrupt (which they are), making Death-Threat Raps and telling the public about how the politicians are the real terrorists. Remember that little rhyme you used to use when you were a kid? "Sticks and stones may break my bones, but words will never hurt me"? Guess what? Kids are smarter than our police, politicians, religious fanatics and the whole government. Grow the fuck up.

This country is going to shit. Time to move to Cuba, where you're allowed more freedoms these days.

Why Python Rocks I: Inline documentation

Okay. So what's cool about Python? I can't count the number of times I've had to show skeptics why Python is cool, what Python can do that their favorite language can't do. So I'm writing a bunch of articles showing off Python's Awesome.

First up: Documentation. I'm talking about inline documentation here: annotating modules, classes, methods, etc. Most languages have third-party tools that parse the source code and extract documentation from comments. This is nice, of course, but the comments get out of date and you have to regenerate the documentation each time. Different people use different documentation generators (Doxygen vs. PHPDoc, JSDoc vs. ScriptDoc, etc.) which, in turn, use different documentation standards, causing untold documentation chaos even within the same language space. You may have heard some code monkeys say "The code IS the documentation". In Python, that's actually not far from the truth. Let's look at some of the things you can do with inline Python documentation.
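As a small taste of what that means: in Python the documentation string is part of the object itself, so no external generator is needed to get at it. The spam() function here is a made-up example, written for Python 3:

```python
import inspect


def spam(count):
    """Return the word 'spam' repeated `count` times, separated by spaces."""
    return " ".join(["spam"] * count)


# The docstring is a plain attribute on the function object.
raw_doc = spam.__doc__

# inspect.getdoc() returns the same text with indentation cleaned up.
clean_doc = inspect.getdoc(spam)

# In the interactive interpreter, help(spam) displays this documentation.
```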


Links

Here are some random links to interesting stuff:

FirePHP
FirePHP is a PHP debugging library and a Firefox plugin which allow you to output debugging information to the Firebug debugging panel. Since it doesn't intermingle debugging information with your page output, but writes it in a special HTTP header instead, it's especially useful for AJAX debugging. It can also come in handy when you're trying to debug a server-side script which generates something other than an HTML page, such as a PDF or PNG file.

OpenProj
OpenProj is a project management application written in Java and therefore platform independent. It has a lot of the features Microsoft Project has (according to the webpage; I have never used MS Project before, so I wouldn't know) such as Resources, Gantt Charts, Network Diagrams (PERT Charts), WBS and RBS charts, etc. There are also various representations of tasks for resources. It doesn't really outshine Gnome Planner, but at least it's platform independent.

Typechecking Python module
Typecheck provides powerful run-time typechecking facilities for Python functions, methods and generators. Without requiring a custom preprocessor or alterations to the language, the typecheck package allows programmers and quality assurance engineers to make precise assertions about the input to, and output from, their code.

Here's a little code example:

@accepts(String, [Number], {str: Number})
def my_func(a, *vargs, **kwargs):
    pass
 
@accepts(String, Number, Number)
def my_func(a, *vargs, **kwargs):
    pass
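To give an idea of how such run-time checks can work without a preprocessor, here's a minimal sketch of a decorator in the same spirit. This is not the typecheck package's implementation: this accepts() is a simplified, hypothetical stand-in that only handles positional arguments and plain classes, and it is written for Python 3.

```python
import functools


def accepts(*types):
    """Hypothetical, simplified stand-in for a typechecking decorator."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            if len(args) != len(types):
                raise TypeError(
                    f"{func.__name__} expects {len(types)} arguments, got {len(args)}"
                )
            # Compare each argument against the type declared for its position.
            for position, (value, expected) in enumerate(zip(args, types)):
                if not isinstance(value, expected):
                    raise TypeError(
                        f"{func.__name__}: argument {position} should be "
                        f"{expected.__name__}, got {type(value).__name__}"
                    )
            return func(*args)
        return wrapper
    return decorator


@accepts(str, int)
def repeat(word, times):
    return word * times


ok = repeat("ab", 2)  # passes the check

try:
    repeat("ab", "2")  # second argument is a str, not an int
    rejected = False
except TypeError:
    rejected = True
```

The check runs on every call, so mistakes show up at the call site instead of somewhere deep inside the function.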

It's Alive! Aliive!!

My personal website, Subversion, the projects website and most of the other stuff are finally back online. They disappeared somewhere in April, after another hard disk crash. This time, three of my machines decided to go belly-up all at the same time. All three were in different locations, spread across the country.

The worst thing was that I use those three servers as online backups for each other. Imagine my surprise when all three went down with defective hard disks at the same time. After that, I couldn't really bring myself to restore everything, so I put the hard disks in a cupboard somewhere and decided to go without a website and assorted other junk for a while. But I've managed to recover almost all of my stuff from one hard disk or another, and now most of it is back online.

Let's hope it stays that way for a while.