Log <-

Archive for the ‘tech’ Category

Stop Pingback/Trackback Spam on WordPress

I guess the spammers finally found my blog, because I've been getting a lot of pingback/trackback spam. I tried some anti-spam plugins, but none really worked, so I disabled pingbacks altogether. Here's how:

First, log into WordPress as an admin. Go to Settings → Discussion, and uncheck the "Allow link notifications from other blogs (pingbacks and trackbacks)" box.

That's not all though, because that setting only applies to new articles. Old ones will still allow pingbacks and trackbacks. As far as I could tell, WordPress doesn't have a feature to disable those for older posts, so we'll have to fiddle around in the database directly.

Connect to your MySQL server using the WordPress username and password. If you no longer remember those, you can find them in the wp-config.php file.

$ mysql -u USERNAME -p -h localhost
Password: PASSWORD
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1684228
Server version: 5.0.51a-24+lenny5 (Debian)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>
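
As an aside: if digging through wp-config.php by hand is a chore, the database settings can be pulled out with a few lines of Python. This is only a sketch, assuming the file uses the usual define('DB_NAME', '...'); style lines. The function name read_db_settings and the sample text are made up for illustration, and values that themselves contain quote characters are not handled.

```python
import re

# Matches lines such as: define('DB_NAME', 'em_wordpress');
# Single or double quotes and optional whitespace are accepted.
# Assumption: the value itself contains no quote characters.
DEFINE_RE = re.compile(
    r"""define\(\s*['"](DB_NAME|DB_USER|DB_PASSWORD|DB_HOST)['"]\s*,\s*['"]([^'"]*)['"]\s*\)"""
)


def read_db_settings(config_text):
    """Return the DB_* constants found in the text of a wp-config.php file."""
    return {name: value for name, value in DEFINE_RE.findall(config_text)}


# Hypothetical sample content; a real wp-config.php contains many more lines.
SAMPLE = """
define('DB_NAME', 'em_wordpress');
define( 'DB_USER', 'USERNAME' );
define("DB_PASSWORD", "PASSWORD");
define('DB_HOST', 'localhost');
"""

if __name__ == "__main__":
    for key, value in read_db_settings(SAMPLE).items():
        print(key, "=", value)
```

To use it on a real installation, read the file first (for example with open("wp-config.php").read()) and pass the text to read_db_settings.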

At the MySQL prompt, we must now switch to our WordPress database. Again, if you don't remember the name, you can find it in the wp-config.php file. In my case, it is em_wordpress:

mysql> USE em_wordpress;
Database changed

Finally, we update all posts and disable pingbacks on them:

mysql> UPDATE wp_posts SET ping_status = 'closed';
Query OK, 1084 rows affected (0.10 sec)
Rows matched: 1084  Changed: 1084  Warnings: 0

There we go. No more ping- and trackback spam on old posts.
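
If you want to see what that statement does before running it against a live blog, here is a small dry run in Python. It is a sketch under loud assumptions: an in-memory SQLite database stands in for MySQL, the table is a cut-down imitation of wp_posts with only the columns needed here, and the three sample rows and the helper count_open are invented for the demonstration.

```python
import sqlite3

# In-memory stand-in for the WordPress database (assumption: only the
# columns needed for the demonstration are created).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, ping_status TEXT)")
conn.executemany(
    "INSERT INTO wp_posts (ping_status) VALUES (?)",
    [("open",), ("open",), ("closed",)],
)


def count_open(connection):
    """Number of posts that still accept pingbacks and trackbacks."""
    row = connection.execute(
        "SELECT COUNT(*) FROM wp_posts WHERE ping_status = 'open'"
    ).fetchone()
    return row[0]


open_before = count_open(conn)

# The same statement as in the MySQL session.
conn.execute("UPDATE wp_posts SET ping_status = 'closed'")
conn.commit()

open_after = count_open(conn)
print(f"open before: {open_before}, open after: {open_after}")
```

The same SELECT COUNT(*) query can be run at the mysql> prompt before and after the UPDATE to confirm that no post is left open.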

This is why I don't use Apple products or DRM media

This company is going out of business because they put all their eggs in a very delicate and quite frankly evil basket:

BeamItDown Software and the iFlow Reader will cease operations as of May 31, 2011. We absolutely do not want to do this, but Apple has made it completely impossible for anyone but Apple to make a profit selling contemporary ebooks on any iOS device.

If you're a company, and you do this:

We bet everything on Apple and iOS and then Apple killed us by changing the rules in the middle of the game.

you need to have your head examined :-) This is not the first time this has happened, and it will most certainly not be the last. Apple will do anything it can to make a buck off other companies' backs!

It's not just the company that is being royally screwed over by Apple:

Many of you have purchased books and would like to keep them. You may still be able to read them using iFlow Reader although we cannot guarantee that it will work beyond May 31, 2011 [...] your computer which will let you access them with Adobe Digital Editions or any other ebook application that is compatible with Adobe DRM protected epubs.

So iFlow Reader's users have probably also lost all their ebooks because they had DRM on them. DRM (Digital Rights Management) is a technology which restricts media to a certain application or device; opening it in third-party applications is usually impossible.

And that's why I have never bought and will never buy an Apple product, or use any media that is DRM protected.

Lessons on development of 64-bit C/C++ applications

Lessons on development of 64-bit C/C++ applications:

The course is devoted to creation of 64-bit applications in C/C++ language and is intended for the Windows developers who use Visual Studio 2005/2008/2010 environment. Developers working with other 64-bit operating systems will learn much interesting as well. The course will consider all the steps of creating a new safe 64-bit application or migrating the existing 32-bit code to a 64-bit system.

Google Calendar Holidays

Just found this out: You can easily add all the holidays for your country (or particular religion) to your Google Calendar by going to: Other Calendars → Add → Browse Interesting Calendars. There you can subscribe to calendars containing your country's holidays.

BoxBackup on Debian/Ubuntu

(The latest version of this article is always available in stand-alone HTML format and in PDF format. The original AsciiDoc source is also available. Please link to the HTML version, not this blog post!)

1. Introduction

BoxBackup is an online remote backup tool for Unix systems (BSDs, Linux, Mac OS X). It is robust, secure, low on resources and can work both in continuous backup mode and in snapshot mode. In continuous backup mode changes will be pushed to the server soon after they happen; in snapshot mode BoxBackup behaves more like traditional backup programs and creates snapshots at fixed intervals.

This article will describe how to set up a BoxBackup server and client on Debian and Ubuntu machines. Much of this article can also be used on other Unix-like systems; however, it will not discuss how to compile BoxBackup.

MacOS X on VirtualBox

I've been trying to get MacOS X working on VirtualBox for a while now, and it never worked due to some ACPI problems. The latest versions of VirtualBox have added MacOS X Server as a guest possibility, and they also seem to have fixed some problems with running the normal Mac OS X.

I got this working on the following hardware/software. If your hardware/software differs, your mileage may vary:

  • CPU: Intel Pentium(R) Dual-Core CPU E5300 @ 2.60GHz
  • Videocard: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03) (Some Intel integrated piece of crap).
  • Ubuntu GNU/Linux v10.04
  • VirtualBox v3.2 (important)

Here are the instructions:

  • First, you'll need to get MacOS X. I used a pre-made illegal VMware image I got from here. (I'm sure Apple's legal team will be on my neck soon, but fuck it).
  • Second, you need the latest VirtualBox. I'm using the Non-OSI v3.2.10. You can get it from the VirtualBox download page.
  • Now, add a new VirtualBox guest and select 'MacOS X' as the Operating System and either 'Mac OS X Server' or 'Mac OS X Server (64 bits)' as the version. I'm not quite sure it works on 32-bit host processors/operating systems, but it does work on 64-bit hosts.
  • You need at least 1024 MB of memory. Less will NOT work.
  • For the Virtual Hard Disk, add the Mac OS X vmdk image as a hard disk.

Okay, now you'll have to go through the settings and match them to the ones listed below. The rest of the settings do not seem to matter, at least for getting MacOS X to boot successfully:

  • Enable IO APIC must be ON.
  • Enable absolute pointing device must be ON.
  • You must enable only ONE SINGLE CPU. Mac OS X will not boot on VirtualBox with two or more CPUs.
  • Enable PAE/NX must be ON.
  • Enable VT-x/AMD-V must be ON. This also means your hardware must support it. For Linux users, run the following command to check if your hardware has support for virtualisation enhancements (vmx is the Intel flag, svm the AMD one):

    grep -E "vmx|svm" /proc/cpuinfo

    You should see one or more of these lines:


    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm tpr_shadow vnmi flexpriority

    If it's not there, but you have a recent CPU, you may have to enable VT-x in your BIOS.

  • Enable nested paging must be OFF.
  • Enable 3D acceleration must be ON. This also means your hardware AND host operating system have to support 3D acceleration. Linux users can use the 'glxgears' and 'glxinfo' commands to see if 3D acceleration is working correctly.
  • Virtual Storage must use a SATA controller of the AHCI type, and must NOT use host I/O cache.
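
The virtualisation check from the list above can also be scripted. The sketch below is an illustration only: it assumes the Linux /proc/cpuinfo layout with a "flags" line per CPU, and the function name virtualization_flags is my own. It compares whole flag names, so the unrelated vme flag is not mistaken for vmx.

```python
import os


def virtualization_flags(cpuinfo_text):
    """Return the hardware virtualisation flags ('vmx', 'svm') found in the text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Compare whole flag names; 'vme' must not count as 'vmx'.
            found.update(flag for flag in value.split() if flag in ("vmx", "svm"))
    return found


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # only present on Linux
    if os.path.exists(path):
        with open(path) as handle:
            flags = virtualization_flags(handle.read())
        print("supported:", ", ".join(sorted(flags)) if flags else "none found")
    else:
        print(path, "not found; this check only works on Linux")
```

An empty result on a recent CPU usually points at the BIOS setting mentioned above rather than at missing hardware support.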

IMPORTANT: You MAY have to boot Mac OS X with the -v boot option. Directly after starting up the Virtual Machine, hit enter and at the boot: prompt, enter -v. I had to do this the first time to get Mac OS X successfully booted. After the first successful boot, it doesn't seem necessary anymore. Safe-mode may also help in case of problems.

The Mac OS X image I linked to at the top of this post has its language set to Russian. Here's a nice blog post about how to change it, including screenshots.

I'm ditching Chrome because of the http:// stripping.

New development builds, and apparently the Beta build of Chrome for the Mac, strip the 'http://' part from the URL input field. Since I run Chromium for Linux, which uses nightly builds of Chrome, I am already affected by this retarded decision.

For this reason I will no longer be using Chrome, nor will I recommend Chrome to anybody anymore. In fact, I will actively recommend using any browser other than Chrome, including Internet Explorer 6.

I could explain why such a 'trivial' change upsets me so much that I'd stop using an otherwise… promising.. product, but life is too short to argue with stupid people, so I'll just leave it at that.

SQuirreL SQL database browser

I finally found a decent replacement for the MySQLcc database browser:

SQuirreL SQL

SQuirreL SQL Client is a graphical Java program that will allow you to view the structure of a JDBC compliant database, browse the data in tables, issue SQL commands etc

It's Java, so it's slow, but it does everything I want, and more:

  1. Syntax highlighting
  2. Multiple query tabs
  3. Multiple queries in the same tab (select the query and press ctrl-enter to run it)
  4. Export results

It has tons of options you can tweak, and it's got plugins if you want to extend it. It supports just about every relational (and some non-relational) database out there.

Awesome.

Simple Revision Control with RCS

I do a lot of stuff in text files on my system. I keep notes and to-dos in them, I write drafts using a text editor and even complete user guides and other documentation. I also keep my personal planning in Gnome Planner, which stores its data in an XML file, which in essence is also text. Every so often I wish I could revert a draft back to an older version, or perhaps just have a little peek at how it was a couple of days ago. I also really wanted to be able to look back at my planning to see where I made mistakes and underestimated the time required for a particular task.

A revision control system would be ideal in these circumstances. However, a full-featured revision control system such as Subversion or Git would be a little overkill in this situation. Enter our old and forgotten friend RCS. RCS has been around since at least 1982, and has since been superseded by CVS, Subversion and the various other centralized and decentralized versioning systems. It is, however, perfectly suited for keeping revision histories of a single file or a small collection of files. RCS is easier to understand than just about any other revision control system too.

Let's look at how it works. First, we have to install it. It used to be installed by default on most systems, but that's no longer the case. If you're running a Debian-based distribution (such as Ubuntu), you can simply install it by typing:

aptitude install rcs

We've now got the RCS revision system installed, and we can start using it.

Let's create an empty text file called 'simple_revision_control_with_rcs.txt' in the 'drafts' directory:

~$ cd drafts
~/drafts$ touch simple_revision_control_with_rcs.txt

RCS consists of a couple of simple command-line utilities:

  • 'ci': Check In RCS revisions. This allows you to save the changes made to a file that is managed by RCS.
  • 'co': Check out RCS working copy. This command retrieves a specific version (the latest by default) of a file in an RCS repository.

So let's convert our empty draft to an RCS repository by checking it in. This means RCS will create a file based on the file you want to check in, along with the appropriate information such as when it was changed, what changed, etc.

~/drafts$ ci simple_revision_control_with_rcs.txt
simple_revision_control_with_rcs.txt,v  <--  simple_revision_control_with_rcs.txt
enter description, terminated with single '.' or end of file:
NOTE: This is NOT the log message!
>> Simple revision control with RCS
>> .
initial revision: 1.1
done

There. Our initial empty file is gone, and in its place is a file under RCS revision control: 'simple_revision_control_with_rcs.txt,v'. It has been given a revision number by RCS: revision 1.1.

We can now make a working copy of this file by doing a checkout. This will retrieve the latest revision from the RCS file and create a new file with the original name in the current directory. The first thing we should do though, is turn off strict locking. We're working on this file ourselves, and it will never be shared, so we don't care about locking at all. Strict locking can be turned off using the '-U' option of the 'rcs' tool:

~/drafts$ rcs -U ./simple_revision_control_with_rcs.txt,v 
RCS file: ./simple_revision_control_with_rcs.txt,v
done

Now we can do a checkout so we can edit our draft. Let's do this, and add a line to it:

~/drafts$ co simple_revision_control_with_rcs.txt,v 
simple_revision_control_with_rcs.txt,v  -->  simple_revision_control_with_rcs.txt
revision 1.1
done
~/drafts$ echo "Hello world" > simple_revision_control_with_rcs.txt

Now let's commit this change to the repository so RCS knows about it:

~/drafts$ ci simple_revision_control_with_rcs.txt
simple_revision_control_with_rcs.txt,v  <--  simple_revision_control_with_rcs.txt
new revision: 1.2; previous revision: 1.1
enter log message, terminated with single '.' or end of file:
>> Added greeting.
>> .
done

RCS keeps an entire history of all changes made to the files it keeps under revision control each time you do a check in. Take a look at the history of the file with the command 'rlog':

~/drafts$ rlog ./simple_revision_control_with_rcs.txt,v
RCS file: ./simple_revision_control_with_rcs.txt,v
Working file: simple_revision_control_with_rcs.txt
head: 1.2
branch:
locks: non-strict
access list:
symbolic names:
keyword substitution: kv
total revisions: 2; selected revisions: 2
description:
Simple revision control with RCS
----------------------------
revision 1.2
date: 2009/06/09 07:48:50;  author: todsah;  state: Exp;  lines: +1 -0
Added greeting.
----------------------------
revision 1.1
date: 2009/06/09 07:46:24;  author: todsah;  state: Exp;
Initial revision
=============================================================================

That lists some basic information on the file, the repository and the two revisions we've made so far. We can now easily check out earlier revisions of the file by specifying a specific revision, or a date:

~/drafts$ co -r1.1 simple_revision_control_with_rcs.txt
simple_revision_control_with_rcs.txt,v  -->  simple_revision_control_with_rcs.txt
revision 1.1
done
~/drafts$ cat simple_revision_control_with_rcs.txt
~/drafts$

RCS has given us back our file as it existed at revision 1.1, which was the empty file. Let's retrieve the version of the file as it was at 14:00 today:

~/drafts$ co -d"14:00" simple_revision_control_with_rcs.txt,v
simple_revision_control_with_rcs.txt,v  -->  simple_revision_control_with_rcs.txt
revision 1.2
writable simple_revision_control_with_rcs.txt exists; remove it? [ny](n): y
done
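
Because rlog prints plain text, the history shown earlier is also easy to process in a script. Here is a minimal Python sketch, assuming the output layout seen in this article (a 'revision' line, then a 'date: ...; author: ...;' line, then the log message up to a separator line). The function name parse_rlog is my own invention, and a log message that itself contains a line of dashes would confuse it.

```python
import re

REVISION_RE = re.compile(r"^revision (\d+(?:\.\d+)+)")
DATE_RE = re.compile(r"^date: ([^;]+);\s+author: ([^;]+);")


def parse_rlog(text):
    """Return a list of (revision, date, author, log message) tuples."""
    entries = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        rev = REVISION_RE.match(lines[i])
        meta = DATE_RE.match(lines[i + 1]) if rev and i + 1 < len(lines) else None
        if rev and meta:
            # The log message runs until the next separator line.
            message = []
            j = i + 2
            while j < len(lines) and not lines[j].startswith(("-----", "=====")):
                message.append(lines[j])
                j += 1
            entries.append(
                (rev.group(1), meta.group(1), meta.group(2), "\n".join(message))
            )
            i = j
        else:
            i += 1
    return entries


# The revision part of the rlog output from this article.
SAMPLE = """\
----------------------------
revision 1.2
date: 2009/06/09 07:48:50;  author: todsah;  state: Exp;  lines: +1 -0
Added greeting.
----------------------------
revision 1.1
date: 2009/06/09 07:46:24;  author: todsah;  state: Exp;
Initial revision
=============================================================================
"""

if __name__ == "__main__":
    for revision, date, author, message in parse_rlog(SAMPLE):
        print(revision, date, author, message, sep=" | ")
```

In practice the text would come from running rlog on the ,v file, for example through Python's subprocess module, instead of from a pasted sample.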

In conclusion: RCS may appear to be completely dead, but it still has its uses. I know many people out there will think to themselves "Why not just use Git or Subversion?" I understand the attraction of very powerful tools, but I also really believe in keeping things simple, and using the right tool for the job. I don't know much about Git, so perhaps it is ideal for such a simple requirement, but I don't really feel like learning it just to version control a single file.

Performance optimization: The first thing to do

In the last couple of years, I've done a lot of performance optimization. I've optimized raw C, Python and PHP code. I've optimized databases: tweaked settings, memory usage, caches, SQL code, the query analyzer, hardware and indexes. I've optimized templates for disk I/O, compilation and rendering. I've optimized various caches and all kinds of other stuff like VMware configurations.

As I've done a lot of optimization, I've noticed a couple of things. One of these things is how persons in different roles look at optimization:

Programmers always want to optimize the code. The code isn't optimal, and a lot of speed can be gained from optimizing the code. This usually boils down to optimizing algorithms to be more efficient with either CPU cycles or memory. In contrast to the programmer, the system administrator always wants to tweak the configuration of either the operating system, the application itself, or some piece of middleware in between. Meanwhile, managers always want to optimize the hardware. "Developer time is more expensive than hardware", they quip, so they decide to throw money at faster and more hardware, instead of letting the developer optimize the code.

What none of them realise is that they're all wrong. None of these approaches is any good. When it comes to optimizations, all of the people above are stuck in their own little world. Of course managers just love the fact that programmers "don't understand cost versus benefit", and a nice saying such as "Developer time is more expensive than hardware" has a really nice ring to it. Managers have a high-level view of the application's ecosystem, and so they search for the solution in the cheapest component: hardware. Programmers, on the other hand, know the system from a very low-level point of view. And they naturally love the fact that managers don't understand technology. They are intimately familiar with the code of their application, or the database running behind it, and so they know a lot of its weak spots. Of course, they'll assume the optimization is best performed there. System administrators have limited options either way, so they stick to what they can influence: configuration.

An excellent example of this is a recent post on the JoelOnSoftware blog. I'll recap the main points I'd like to illustrate here:

One of the FogBugz developers complained that compiling was pretty slow (about 30 seconds). [...] He asked if it would be OK if someone spent a few weeks looking for ways to parallelize and speed it up, since we all have multiple CPU cores and plenty of memory. [...] I thought it might be a good idea to just try throwing money at the problem first, before we spent a lot of (expensive and scarce) developer time. [...] so I thought I'd experiment with replacing some of the hard drives around here with solid state, flash hard drives to see if that helped.

Suddenly everything was faster. Booting, launching apps… even Outlook is ready to use in about 1 second. This was a really great upgrade. But… compile time. Hmm. That wasn't much better. I got it down from 30 seconds to … 30 seconds. Our compiler is single threaded, and, I guess, a lot more CPU-bound than IO bound.

This is an excellent example of how a manager would try to solve optimization problems. At the start of the quote we see the typical way a developer would tackle the problem: parallelize and speed up. In other words: low-level optimizations.

Now it turns out Joel was wrong. Solid state disks didn't help at all, since their problem wasn't with disk I/O. But that doesn't mean the developer was right either! I like to see it as a kind of Schrödinger's Cat situation: both are wrong, until one is proven right. Why is that? Because they have no idea what the problem is! All they're doing is guessing away at the problem in the hopes of finding out what exactly will solve it, without having any clue about the actual problem! We can see this quite clearly: after having dismissed disk I/O as the problem, they assume it must be because "our compiler is single threaded, and, I guess, a lot more CPU-bound than IO bound." Again, they jump to conclusions without knowing what the problem is. So now they have not only spent time and money on solid state disks without fixing the problem, but they're also about to spend weeks of developer time without knowing if that will fix the problem.

So, here is my point:

The most important thing about optimization is analysis.

You can't fix a problem by simply trying different solutions to see if they work. In order to fix a problem, you have to understand the problem first.

So, please, if you're a developer, don't assume saving a couple of CPU cycles here or there will solve the problem. And if you're a manager, don't assume some new hardware will solve the problem. Do some analysis first. Finding out if disk I/O, memory, CPU cycles or single threading is the problem is really not that hard if you spend a little time thinking about it and benchmarking various things. And in the end, you'll have a much better overview of the situation and the problem, and you'll be able to come up with specific solutions which will actually work.

And that's how you save money.
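
To make "do some analysis first" a bit more concrete, here is one very rough first measurement in Python: compare the wall-clock time of a task with the CPU time the process actually used. It is a sketch, not a profiler. The names measure, cpu_share, busy_loop and waiting are made up for this example, and the two sample tasks are stand-ins for real work. Note that time.process_time() only counts the current process, so work done in child processes (a compiler started from a build script, say) would need a different tool.

```python
import time


def measure(task):
    """Run task() once and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def cpu_share(wall, cpu):
    """Fraction of the elapsed time spent on the CPU (0.0 if nothing elapsed)."""
    return cpu / wall if wall > 0 else 0.0


def busy_loop():
    # Stand-in for CPU-bound work.
    total = 0
    for i in range(2_000_000):
        total += i * i


def waiting():
    # Stand-in for work that mostly waits (disk, network, locks).
    time.sleep(0.2)


if __name__ == "__main__":
    for name, task in (("busy_loop", busy_loop), ("waiting", waiting)):
        wall, cpu = measure(task)
        share = cpu_share(wall, cpu)
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s share={share:.0%}")
```

A share close to 100% suggests the task is CPU-bound, so faster disks will not help; a low share suggests it spends its time waiting on something else, which is the next thing to investigate.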