Archive for the ‘tech’ Category

Quick-n-dirty HAR (HTTP Archive) viewer

HAR, HTTP Archive, is a JSON-encoded dump of a list of requests and their associated headers, bodies, etc. Here's a partial example containing a single request:

{
  "startedDateTime": "2013-09-16T18:02:04.741Z",
  "time": 51,
  "request": {
    "method": "GET",
    "url": "http://electricmonk.nl/",
    "httpVersion": "HTTP/1.1",
    "headers": [],
    "queryString": [],
    "cookies": [],
    "headersSize": 38,
    "bodySize": 0
  },
  "response": {
    "status": 301,
    "statusText": "Moved Permanently",
    "httpVersion": "HTTP/1.1",
    "headers": [],
    "cookies": [],
    "content": {
      "size": 0,
      "mimeType": "text/html"
    },
    "redirectURL": "",
    "headersSize": 32,
    "bodySize": 0
  },
  "cache": {},
  "timings": {
    "blocked": 0
  }
},

HAR files can be exported from Chrome's Network analyser developer tool (ctrl-shift-i → Network tab → capture some requests → right-click and select Save as HAR with contents). Additional tip: check the "Preserve Log on Navigation" option, which looks like a recording button, to capture multi-level redirects and such.

As human-readable as JSON is, it's still difficult to get a good overview of the requests. So I wrote a quick Python script that turns the JSON into something that's a little easier on our poor sysadmin's eyes:

[Screenshot: harview output]

It supports colored output and can dump request headers, response headers and the bodies of POSTs and responses (although this will be very slow). You can filter out uninteresting requests such as images or CSS/JS with the --filter-X options.
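To give an idea of what such a script boils down to, here is a minimal sketch. It is not the actual harview code: the function name summarize_har and the output format are made up for illustration. It assumes a complete HAR file, where the requests are stored in a list under log.entries.

```python
import json


def summarize_har(har_text):
    """Return one summary line per request in a HAR document."""
    har = json.loads(har_text)
    lines = []
    # In a complete HAR file, the requests live in the log.entries list
    for entry in har["log"]["entries"]:
        request = entry["request"]
        response = entry["response"]
        lines.append("{} {} {} ({} ms)".format(
            response["status"], request["method"], request["url"],
            entry["time"]))
    return lines


# A tiny hand-written HAR document, based on the partial example above
sample = json.dumps({"log": {"entries": [{
    "time": 51,
    "request": {"method": "GET", "url": "http://electricmonk.nl/"},
    "response": {"status": 301, "statusText": "Moved Permanently"},
}]}})

for line in summarize_har(sample):
    print(line)
```

For a real export you would read the file contents and pass them to summarize_har() instead of the hand-written sample.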

You can get it by cloning the Git repository from Bitbucket.

Cheers!

bbcloner: create mirrors of your public and private Bitbucket Git repositories

I wrote a small tool that assists in creating mirrors of your public and private Bitbucket Git repositories and wikis. It also synchronizes already existing mirrors. Initial mirror setup requires that you manually enter your username/password. Subsequent synchronization of mirrors is done using Deployment Keys.

You can download a tar.gz, a Debian/Ubuntu package or clone it from the Bitbucket page.

Features

  • Clone / mirror / backup public and private repositories and wikis.
  • No need to store your username and password to update clones.
  • Exclude repositories.
  • No need to run an SSH agent. Uses passwordless private Deployment Keys, which have no write access to your repositories.

Usage

Here's how it works in short. Generate a passwordless SSH key:

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key: /home/fboender/.ssh/bbcloner_rsa<ENTER>
Enter passphrase (empty for no passphrase):<ENTER>
Enter same passphrase again: <ENTER>

You should add the generated public key to your repositories as a Deployment Key. The first time you use bbcloner, or whenever you've added new public or private repositories, you have to specify your username/password. bbcloner will retrieve a list of your repositories and create mirrors for any new repositories not yet mirrored:

$ bbcloner -n -u fboender /home/fboender/gitclones/
Password: 
Cloning new repositories
Cloning project_a
Cloning project_a wiki
Cloning project_b

Now you can update the mirrors without using a username/password:

$ bbcloner /home/fboender/gitclones/
Updating existing mirrors
Updating /home/fboender/gitclones/project_a.git
Updating /home/fboender/gitclones/project_a-wiki.git
Updating /home/fboender/gitclones/project_b.git

You can run the above from a cronjob. Specify the -s argument to prevent bbcloner from showing normal output.

The mirrors are full remote git repositories, which means you can clone them:

$ git clone /home/fboender/gitclones/project_a.git/
Cloning into project_a...
done.

Don't push changes to it, or the mirror won't be able to sync. Instead, point the remote origin to your Bitbucket repository:

$ git remote rm origin
$ git remote add origin git@bitbucket.org:fboender/project_a.git
$ git push
remote: bb/acl: fboender is allowed. accepted payload.

Get it

Here are ways of getting bbcloner:

More information

For more information, please see the Bitbucket repository.

Quick Introduction to LDAP Basics

Every now and then I have to work on something that involves LDAP, and every time I seem to have completely forgotten how it works. So I'm putting this here for future me: a quick introduction to LDAP basics. Remember, future me (and anyone else reading this), at the time of writing you are by no means an LDAP expert, so take that into consideration! Also, this will be very terse. There are enough books on LDAP on the internet. I don't think we need another.

What is LDAP?

  • LDAP stands for Lightweight Directory Access Protocol.
  • It is a standard for storing and accessing "Directory" information. Directory as in the yellow pages, not the filesystem kind.
  • OpenLDAP (unix) and Active Directory (Microsoft) implement LDAP.
  • Commonly used to store organisational information such as employee information.
  • Queried for access control definitions (logging in, checking access), addressbook information, etcetera.

How is information stored?

  • LDAP is a hierarchical (tree-based) database.
  • Information is stored as key-value pairs.
  • The tree structure is basically free-form. Every organisation can choose how to arrange the tree for themselves, although there are some commonly used patterns.

The tree

An example of an LDAP tree structure (some otherwise required attributes are left out for clarity!):

dc=com
    dc=megacorp
        ou=people
            uid=jjohnson
                objectClass=inetOrgPerson,posixAccount
                cn=John Johnson
                uid=jjohnson
                mail=j.johnson@megacorp.com
            uid=ppeterson
                objectClass=inetOrgPerson,posixAccount
                cn=Peter Peterson
                uid=ppeterson
                mail=p.peterson@megacorp.com
  • Each leaf in the tree has a specific unique path called the Distinguished Name (DN). For example: uid=ppeterson,ou=people,dc=megacorp,dc=com
  • Unlike file paths and most other tree-based paths which have their roots on the left, the Distinguished Name has the root of the tree on the right.
  • Instead of the conventional path separators such as the dot ( . ) or forward-slash ( / ), the DN uses the comma ( , ) to separate path elements.
  • Unlike conventional paths (e.g. /com/megacorp/people/ppeterson), the DN path includes an attribute type for each element in the path. For instance: dc=, ou= and uid=. These are abbreviations that specify the type of the attribute. More on attribute types in the Entries section.
  • It is common to arrange the tree in a globally unique way, using dc=megacorp,dc=com to specify the organisation.
  • Entries are parts of the tree that actually store information. In this case: uid=jjohnson and uid=ppeterson.
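The relation between a DN and a conventional root-first path can be sketched in a few lines of Python. This is a naive illustration only: the helper name dn_to_path is made up, and it assumes the DN contains no escaped commas or multi-valued path elements, which real DNs may have.

```python
def dn_to_path(dn):
    """Split a DN into (attribute type, value) pairs, ordered root-first."""
    # Naive split on commas: assumes no escaped commas in the DN
    pairs = [tuple(part.strip().split("=", 1)) for part in dn.split(",")]
    # The DN has its root on the right, so reverse it to get the root first
    return list(reversed(pairs))


path = dn_to_path("uid=ppeterson,ou=people,dc=megacorp,dc=com")
print(path)

# Dropping the attribute types gives the "conventional" path
print("/" + "/".join(value for _, value in path))
```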

Entries

An example entry for DN uid=jjohnson,ou=people,dc=megacorp,dc=com (some otherwise required attributes are left out for clarity!):

objectClass=inetOrgPerson,posixAccount
cn=John Johnson
uid=jjohnson
mail=j.johnson@megacorp.com
  • An entry has a Relative Distinguished Name (RDN). The RDN is a unique identifier for the entry in that part of the tree. For the entry with Distinguished Name (DN) uid=jjohnson,ou=people,dc=megacorp,dc=com, the RDN is uid=jjohnson.
  • An entry stores key/value pairs. In LDAP lingo, these are called attribute types and attribute values. Attribute types are sometimes abbreviations. In this case, the attribute types are cn= (CommonName), uid= (UserID) and mail=.
  • Keys may appear multiple times, in which case they are considered a list of values.
  • An entry has one or more objectClasses.
  • Object classes are defined by schemas, and they determine which attributes must and may appear in an entry. For instance, the posixAccount object class is defined in the nis.schema and must include cn, uid, etc.
  • Different object classes may define the same attribute types.
  • A reference of common object classes can be found in Appendix E of the excellent Zytrax LDAP Guide.
  • A reference of common attribute types can also be found in Appendix E.
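One way to picture an entry is as a mapping from attribute types to lists of values. The sketch below models the example entry like that and checks it against a list of required attributes. The REQUIRED table is purely illustrative and deliberately incomplete (it only lists the two attributes named above), and missing_attributes is a made-up helper; a real server does this validation itself using its schemas.

```python
# The example entry: every attribute type maps to a list of values,
# because attribute types may appear multiple times
entry = {
    "objectClass": ["inetOrgPerson", "posixAccount"],
    "cn": ["John Johnson"],
    "uid": ["jjohnson"],
    "mail": ["j.johnson@megacorp.com"],
}

# Illustrative and incomplete: only the required attributes mentioned above
REQUIRED = {"posixAccount": ["cn", "uid"]}


def missing_attributes(entry, required=REQUIRED):
    """Return required attribute types that the entry does not have."""
    missing = []
    for object_class in entry.get("objectClass", []):
        for attr_type in required.get(object_class, []):
            if attr_type not in entry and attr_type not in missing:
                missing.append(attr_type)
    return missing


print(missing_attributes(entry))
```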

Connecting and searching LDAP servers

The most common action to perform on LDAP servers is to search for information in the directory. For instance, you may want to search for a username to verify if they entered their password correctly, or you may want to search for Common Names (CNs) to auto-complete names and email addresses in your email client. In order to search an LDAP server, we must perform the following:

  1. Connect to the LDAP server
  2. Authenticate against the LDAP server so we are allowed to search. This is called binding. Basically it's just logging in. We bind against an LDAP server by specifying a user's DN and password. This can be confusing, because some DN/password combinations can be used to bind to the LDAP server itself, while other usernames and passwords are merely stored in the directory so that other systems can authenticate users against the LDAP server.
  3. Specify which sub-part of the tree we wish to search. This is called the Base DN (Base Distinguished Name). For example: ou=people,dc=megacorp,dc=com, to search only people. Different bind DNs may be allowed to search different parts of the tree.
  4. Specify how deep we want to search in the tree. This is called the scope (sometimes referred to as the level). The scope can be: baseObject (search just the named entry, typically used to read one entry), singleLevel (entries immediately below the base DN), or wholeSubtree (the entire subtree starting at the base DN).
  5. Specify what kind of entries we'd like to search for. This is called the filter. For example, (objectClass=*) will search for ANY kind of object class. (objectClass=posixAccount) will only search for entries of the posixAccount object class.

Here's an example of connecting, binding and searching an LDAP server using the ldapsearch commandline util:

$ ldapsearch -W -h ldap.megacorp.com -D "uid=ldapreader,dc=megacorp,dc=com" \
  -b ou=people,dc=megacorp,dc=com "(objectclass=*)"
password: ********
  • -W tells ldapsearch to prompt for a password.
  • -h is the hostname of the LDAP server to connect to.
  • -D is the Distinguished Name (DN), a.k.a. the username, with which to connect. In this case, a special ldapreader account.
  • -b is the Base DN, a.k.a the subtree, we want to search.

Finally, we specify a search filter: "(objectclass=*)". This means we want to search for all object classes.

The previous example, but this time in the Python programming language:

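import ldap

# Connect to the LDAP server (389 is the default, unencrypted LDAP port)
l = ldap.initialize('ldap://ldap.megacorp.com:389')

# Bind (log in) synchronously; plain bind() only returns a message id
l.simple_bind_s('uid=ldapreader,dc=megacorp,dc=com', 'Myp4ssw0rD')

# Search the whole subtree below the Base DN; returns (dn, attributes) tuples
results = l.search_s('ou=people,dc=megacorp,dc=com', ldap.SCOPE_SUBTREE,
                     filterstr="(objectclass=*)")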
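A synchronous search in python-ldap returns a list of (dn, attributes) tuples, where the attributes are a dictionary mapping each attribute type to a list of values (bytes under Python 3). A rough sketch of printing such a result follows. The format_results helper and the canned sample data are made up for illustration, so it runs without an LDAP server.

```python
def format_results(results):
    """Turn a list of (dn, attributes) tuples into printable lines."""
    lines = []
    for dn, attrs in results:
        lines.append(dn)
        for attr_type, values in sorted(attrs.items()):
            for value in values:
                # Values arrive as bytes under Python 3; decode for display
                if isinstance(value, bytes):
                    value = value.decode("utf-8", errors="replace")
                lines.append("    {}={}".format(attr_type, value))
    return lines


# Canned data in the same shape as a search result, for illustration only
sample_results = [
    ("uid=jjohnson,ou=people,dc=megacorp,dc=com",
     {"cn": [b"John Johnson"], "uid": [b"jjohnson"]}),
]

for line in format_results(sample_results):
    print(line)
```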

Further Reading

That's it! Like I said, it's terse! If you need to know more about LDAP, here are some good resources on it:

Subversion svn:ignore property doesn't (seem to) work? [FIXED]

Say you're trying to set the "ignore" property on something in a subversion checkout like this:

svn propset svn:ignore "foo.pyc" .

Next you do a svn status:

M       foo.pyc

It seems it isn't working. In order to fix this, you must remember to first:

  • Remove the file from subversion and commit
  • svn update all the checkouts of that repository so that the file is gone everywhere!
  • Set the svn:ignore property
  • Now commit the property change, or svn status will still show it (even in the local checkout)!
  • svn update all the checkouts of the repository

So:

host1$ svn rm foo.pyc && svn commit -m "Remove compiled python code"
host2$ svn update
host1$ svn propset svn:ignore "foo.pyc" .
host1$ svn commit -m "Ignore compiled python code"
host2$ svn update

If you get conflicts because you didn't follow these steps exactly:

host2$ svn update
   C foo.pyc
host2$ svn resolve --accept working foo.pyc
host2$ svn rm foo.pyc
host2$ svn update
At revision 123

That should solve it.

If you want all your subversion problems solved, try this.

Stop Joomla (v2.5) from logging you out of the administrator interface

The Joomla v2.5 backend administrator interface by default will log you out after you've been inactive for 24 minutes (some on the internet claim it's 15 minutes, others 30; for me, it seemed to be 24). This is quite annoying, and in most PHP applications it is easily fixed by changing the session timeout. Joomla also requires that you modify some other parts. Here's how I got it to work:

Summary

Summary for the lazy technical people. These are the steps to modify the session timeout:

  1. In php.ini, find the session.gc_maxlifetime setting, and change it.
  2. In the Joomla Admin interface, go to Site → Global Configuration → System and change the Session Lifetime value.
  3. In Joomla's root directory, open configuration.php and change public $lifetime = '1440'; to the number of seconds.

If this wasn't enough information for you, read the following which explains more in-depth:

Steps

Step 1: Modify php.ini

Figure out which php.ini Joomla uses by creating the following "info.php" file in your Joomla directory:

<?php
phpinfo();
?>

Direct your browser to the file, for instance: http://mysite.example.com/info.php. You should see the purple/blue PHP info page. Locate the "Loaded Configuration File" setting. This is which php.ini file will be used. Make sure to delete the info.php file when you're done!

Edit the file (for me it's /etc/php5/apache2/php.ini) and find the following setting:

session.gc_maxlifetime = ....

Change the setting to however many seconds you want to remain logged in without activity, before being logged out automatically. I set mine to 8 hours (28800 seconds):

session.gc_maxlifetime = 28800
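Since the timeout has to be entered in seconds in several places, it is easy to mistype. A tiny sketch of the arithmetic, with made-up helper names:

```python
def hours_to_seconds(hours):
    """Convert a number of hours to whole seconds."""
    return int(hours * 60 * 60)


def seconds_to_minutes(seconds):
    """Convert a number of seconds to whole minutes."""
    return seconds // 60


print(hours_to_seconds(8))       # 28800, the 8 hours used here
print(seconds_to_minutes(1440))  # 24, the default of 1440 seconds in minutes
```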

Step 2: Timeout in the Joomla interface

I'm not sure this step is required, but I changed it, so you may also have to.

Open the Joomla Administrator backend (http://mysite.example.com/administrator/), log in as a Super User ('admin' usually), and open Site → Global Configuration → System. On the right side, change Session Lifetime to the number of seconds you want to keep the session alive. For me, that's 28800 seconds again.

Step 3: Joomla's configuration.php

Final step. In the Joomla top directory, you'll find a file called configuration.php. Open this file with your editor, and search for:

public $lifetime = '1440';

Change the number (1440) to the number of seconds you want the session to stay alive:

public $lifetime = '28800';

Save the file.

Step 4: Restart your webserver

As a final step, you may have to restart your webserver. How to do this depends on your installation.

Now your session should remain alive for the number of seconds specified, even if you're not active.

MobaXterm – (Free) All-in-one Xserver/SSH/Linux environment for Windows


I recently stumbled on MobaXterm. It's a complete unix environment including an X Server/SSH/Telnet/SCP/FTP client all in one. The list of features is impressive to say the least. This is an excellent replacement for Putty.

A small selection of the most useful features:

  • Free. What more is there to say?
  • Tabs and Horizontal / Vertical split panels finally bring the full power of native Unix/Linux terminal emulators to Windows
  • Integrated X server. MobaXterm comes with an integrated X Server. Everything is set up correctly out-of-the-box. X11 forwarding means you can simply SSH to a remote machine and start X11 programs. It supports displaying remote X11 windows as native windows, or you can run the X Server in a separate tab/window.
  • Session Management makes it easy to quickly connect to the machine you want.
  • Integrated SFTP when SSHing to a remote machine means you don't have to start a separate SFTP/SCP session. Just browse, upload and download remote files from the left side of the SSH session.
  • Many supported services, such as SSH, Telnet, local Linux/Cygwin terminal, local Windows command prompt, RSH, XDMCP, RDP, VNC, FTP, etc.
  • Session multiplexing provides a quick method of running commands on multiple machines at the same time.
  • SSH bouncing through a gateway SSH server means no more SSHing from machine to machine.
  • Cygwin environment so you can actually get some work done natively on Windows. Batteries, bells, whistles and kitchen sinks (as well as games) included: full unix environment with tools like grep, find, vim, etc, etc, etc.

There are countless more features. This is the terminal emulator app I always hoped Putty would become. Of all the different shells around Putty, separate SSH connection managers and terminals I've tried, this is by far the best one.

The user isn't always wrong

Some time ago, my mother bought a new laptop. It came preinstalled with Windows Vista, which proved to be quite the disaster. The laptop was nowhere near fast enough to run it, so I installed Ubuntu on it. This allowed my mom to do everything she needed to do with the laptop, while at the same time making it easy for me to administer the beast.

One day my mom phoned me, and explained about a problem she was having:

"Whenever I move the laptop into the kitchen, it stops working!"

Now my mom is no computer expert, but she picked up Ubuntu quickly and has never needed much hand-holding when it comes to using the laptop. This one, however, sounded to me like one of those situations where the user couldn't possibly be correct. We went through the basic telephone support routine, but she persisted in her observation that somehow the kitchen was responsible for her laptop misery.

Eventually, after deciding the problem couldn't be fixed over the phone, I agreed to come over to my parents' house the next evening to take a look at it. With my general moody "a family member's PC needs fixing" attitude and a healthy dose of skepticism ("this is going to be one of those typical the-cable-isn't-plugged-in problems"), I arrived at my parents'.

"Okay, let's see if we can't fix this problem", I said, as I powered up the laptop upstairs. Everything worked fine. Picking up the laptop, I moved it downstairs into the living room. No problems whatsoever. Next, the kitchen. And lo and behold:

The laptop crashed almost immediately.

"Coincidence", I thought, and tried it again. And again, as soon as I entered the kitchen, the laptop crashed. I… was… Stunned! I had never encountered a problem like this before. What could possibly make it behave like that?

After pondering this strange problem for a while, I thought "what's the only location-dependent thing in a laptop?", and it dawned on me that it might just be related to the WiFi. I powered up the laptop once again in the living room, completely turned off the WiFi by rmmod-ing the relevant kernel modules, and entered the kitchen. No crash. It kept on working perfectly. Until I turned on the WiFi again.

With the aid of some log files (which I should have checked in the first place, I admit), I quickly found the culprit. The very last thing I saw in the log files just before the computer crashed… an attempt to discover the neighbor's WiFi! A wonky WiFi router in combination with buggy drivers caused the laptop to crash, but only when it came in range of said WiFi router. And that happened only in the kitchen!

In the end I disabled automatic WiFi discovery on the laptop, since my mom didn't really take it out of the house anyway, and the problems disappeared. I never encountered a problem like that again, but I did learn one thing though:

No matter how impossible the problem may seem… The user isn't always wrong.

Read less

A programmer once built a vast database containing all the literature, facts, figures, and data in the world. Then he built an advanced querying system that linked that knowledge together, allowing him to wander through the database at will. Satisfied and pleased, he sat down before his computer to enjoy the fruits of his labor.

After three minutes, the programmer had a headache. After three hours, the programmer felt ill. After three days, the programmer destroyed his database. When asked why, he replied: “That system put the world at my fingertips. I could go anywhere, see anything. Because I was no longer limited by external conditions, I had no excuse for not knowing everything there is to know. I could neither sleep nor eat. All I could do was wander through the database. Now I can rest.”

— Geoffrey James, Computer Parables: Enlightenment in the Information Age

I was a major content consumer on the Internet. My Google Reader had over 120 feeds in it. It produced more than 1000 new items every couple of hours. I religiously read Hacker News, Reddit and a variety of other high-volume sources of content. I have directories full of theoretical science papers, articles on a wide range of topics and many, many tech books. I scoured the web for interesting articles to save to my tablet for later reading. I was interested in everything. Programming, Computer Science, Biology, Theoretical Particle Physics, Psychology, rage-comics, and everything else. I could get lost for hours on Wikipedia, jumping from article to article, somehow, without noticing it, ending up at articles titled "Gross–Pitaevskii equation" or "Grand Duchy of Moscow", when all I needed to know was what the abbreviation "SCPD" stood for. (Which, by the way, Wikipedia doesn't have an article for, and means "Service Control Point Definition".)

I want to make it clear I wasn't suffering from Information Overload by any definition. I was learning things. I knew things about technology which I hadn't even ever used myself. I can tell you some of the ins and outs of iPhone development. I don't even own an iPhone. I can talk about Distributed Computing, Transactional Memory and why it is and isn't a good idea, without having written more than a simple producer/consumer routine. I'm even vehemently against writing to shared memory in any situation! I can tell you shit about node.js and certain NoSQL databases without even ever having installed – much less dived into – them. Hell, I don't even like Javascript!

The thing is: even though I was learning about stuff, it was superficial knowledge without context, and without the kind of basic information that allows you to draw the conclusions you're reading about for yourself, without the help of some article. I didn't pause to think about conclusions drawn in an article, or to let the information sink in. I read article after article. I wasn't putting the acquired knowledge into practice. The Learning Pyramid may have been discredited, but I'm convinced that we learn more from doing than we do from reading about something.

So what makes reading so attractive that we'd rather read about things than actually do them? And I know for a fact that I'm not alone in having this problem. I think – and this might be entirely personal – it's because of a couple of reasons.

One is that it's much easier to read about something than to actually figure things out yourself. I want to experiment with sharding in NoSQL databases? I have to set up virtual machines, set up the software, write scripts to generate testing data, think about how to perform some experiments, and actually run them. Naturally I'd want to collect some data from those experiments; maybe reach a couple of conclusions even. That's a lot of work. It's much easier to just read about it. It's infinitely easier to stumble upon and read an article on "How to Really Get Things Done Using GettingThingsDone2.0 and Reverse Todo Lists" than it is to actually get something done.

The second reason, at least for me, is that it gives me the feeling that I'm learning more about things. In the time it takes me to set up all the stuff above, I could have read who-knows-how-many articles. And it's true in a sense. The information isn't useless per se. I'm learning more shallow knowledge about a lot of different things, versus in-depth knowledge about a few things. It gives me all kinds of cool ideas, things to do, stuff to try out. But I never get around to those things, because I'm always busy reading about something else!

So I have taken drastic measures.

I have removed close to 95% of my feeds from Google Reader. I've blocked access to Reddit and HackerNews so I'm not tempted to read the comments there. I check hackurls.com (an aggregator for Hacker News, Reddit's /r/programming and some other stuff) at most once a day. Anything interesting I see, I send to my tablet (at most two articles a day), which I only read on the train (where I don't have anything better to do anyway). I avoid Wikipedia like the plague.

I distinctly remember being without an Internet connection for about a month almost four years ago. It was the most productive time of my life since the Internet came around. I want to return to the times when the Internet was a resource for solving problems and doing research, not an interactive TV shoveling useless information into my head.

Now if you'll excuse me, I have an algorithm to write and a website to finish.

Stop Pingback/Trackback Spam on WordPress

I guess the spammers finally found my blog, cause I've been getting a lot of pingback/trackback spam. I tried some anti-spam plugins, but none really worked, so I disabled pingbacks altogether. Here's how:

First, log into WordPress as an admin. Go to Settings → Discussion, and uncheck the Allow link notifications from other blogs (pingbacks and trackbacks.) box.

That's not all though, cause that just works for new articles. Old ones will still allow ping/trackbacks. As far as I could tell, WordPress doesn't have a feature to disable those for older posts, so we'll have to fiddle around in the database directly.

Connect to your MySQL server using the WordPress username and password. If you no longer remember those, you can find them in the wp-config.php file.

$ mysql -u USERNAME -p -h localhost
Password: PASSWORD
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1684228
Server version: 5.0.51a-24+lenny5 (Debian)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>

At the MySQL prompt, we must now switch to our WordPress database. Again, if you don't remember the name, you can find it in the wp-config.php file. In my case, it is em_wordpress

mysql> USE em_wordpress;
Database changed

Finally, we update all posts and disable ping backs on them:

mysql> UPDATE wp_posts SET ping_status = 'closed';
Query OK, 1084 rows affected (0.10 sec)
Rows matched: 1084  Changed: 1084  Warnings: 0

There we go. No more ping- and trackback spam on old posts.
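If you'd like to see what that statement does before running it against a live database, the sketch below replays it on a throwaway table. Note the assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and the table is a stripped-down imitation of wp_posts with made-up rows.

```python
import sqlite3

# In-memory stand-in for the WordPress database (not MySQL!)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, ping_status TEXT)")
conn.executemany(
    "INSERT INTO wp_posts (ping_status) VALUES (?)",
    [("open",), ("open",), ("closed",)])

# The same statement as above: close pingbacks on every post
conn.execute("UPDATE wp_posts SET ping_status = 'closed'")
conn.commit()

# Count the posts that still accept pingbacks
still_open = conn.execute(
    "SELECT COUNT(*) FROM wp_posts WHERE ping_status != 'closed'"
).fetchone()[0]
print(still_open)
```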

This is why I don't use Apple products or DRM media

This company is going out of business because they put all their eggs in a very delicate and quite frankly evil basket:

BeamItDown Software and the iFlow Reader will cease operations as of May 31, 2011. We absolutely do not want to do this, but Apple has made it completely impossible for anyone but Apple to make a profit selling contemporary ebooks on any iOS device.

If you're a company, and you do this:

We bet everything on Apple and iOS and then Apple killed us by changing the rules in the middle of the game.

you need to have your head examined :-) This is not the first time this has happened, and it will most certainly not be the last time. Apple will do anything it can to make a buck off other companies' backs!

It's not just the company that is being royally screwed over by Apple:

Many of you have purchased books and would like to keep them. You may still be able to read them using iFlow Reader although we cannot guarantee that it will work beyond May 31, 2011 [...] your computer which will let you access them with Adobe Digital Editions or any other ebook application that is compatible with Adobe DRM protected epubs.

So iFlow Reader's users have probably also lost all their ebooks because they had DRM on them. DRM (Digital Rights Management) is a technology which restricts media to a certain application or device; opening it in third-party applications is usually impossible.

And that's why I have never and will never buy an Apple product, or use any media that is DRM protected.