Home Page

Keep your home dir in Git with a detached working directory

logo@2xMany posts have been written on putting your homedir in git. Nearly everyone uses a different method of doing so. I've found the method I'm about to describe in this blog post to work the best for me. I've been using it for more than a year now, and it haven't failed me yet. My method was put together from different sources all over the web; long since gone or untracable. So I'm documenting my setup here.

The features

So, what makes my method better than the rest? What makes it better than the multitude of pre-made tools out there? The answer is: it depends. I've simply found that this methods suits me personally because:

How does it work?

It's simple. We create what is called a "detached working tree". In a normal git repository, you've got your .git dir, which is basically your repository database. When you perform a checkout, the directory containing this .git dir is populated with files from the git database. This is problematic when you want to keep your home directory in Git, since many tools (including git itself) will scan upwards in the directory tree in order to find a .git dir. This creates crazy scenario's such as Vim's CtrlP plugin trying to scan your entire home directory for file completions. Not cool. A detached working tree means your .git dir lives somewhere else entirely. Only the actual checkout lives in your home dir. This means no more nasty .git directory.

An alias 'dgit' is added to your .profile that wraps around the git command. It understands this detached working directory and lets you use git like you would normally. The dgit alias looks like this:

alias dgit='git --git-dir ~/.dotfiles/.git --work-tree=$HOME'

Simple enough, isn't it? We simply tell git that our working tree doesn't reside in the same directory as the .git dir (~/.dotfiles), but rather it's our directory. We set the git-dir so git will always know where our actual git repository resides. Otherwise it would scan up from the curent directory your in and won't find the .git dir, since that's the whole point of this exercise.

Setting it up

Create a directory to hold your git database (the .git dir):

$ mkdir ~/.dotfiles/
$ cd ~/.dotfiles/
~/.dotfiles$ git init .

Create a .gitifnore file that will ignore everything. You can be more conservative here and only ignore things you don't want in git. I like to pick and choose exactly which things I'll add, so I ignore everything by default and then add it later.

~/.dotfiles$ echo "*" > .gitignore
~/.dotfiles$ git add -f .gitignore 
~/.dotfiles$ git commit -m "gitignore"

Now we've got a repository set up for our files. It's out of the way of our home directory, so the .git directory won't cause any conflicts with other repositories in your home directory. Here comes the magic part that lets us use this repository to keep our home directory in. Add the dgit alias to your .bashrc or .profile, whichever you prefer:

~/.dotfiles$ echo "alias dgit='git --git-dir ~/.dotfiles/.git --work-tree=\$HOME'" >> ~/.bashrc

​You'll have to log out and in again, or just copy-paste the alias defnition in your current shell. We can now the repository out in our home directory with the dgit command:

~/.dotfiles$ cd ~
$ dgit reset --hard
HEAD is now at 642d86f gitignore

Now the repository is checked out in our home directory, and it's ready to have stuff added to it. The dgit reset --hard command might seem spooky (and I do suggest you make a backup before running it), but since we're ignoring everything, it'll work just fine.

Using it

Everything we do now, we do with the dgit command instead of normal git. In case you forget to use dgit, it simply won't work, so don't worry about that.

A dgit status shows nothing, since we've gitignored everything:

$ dgit status
On branch master
nothing to commit, working directory clean

We add things by overriding the ignore with -f:

$ dgit add -f .profile 
$ dgit commit -m "Added .profile"
[master f437f9f] Added .profile
 1 file changed, 22 insertions(+)
 create mode 100644 .profile

We can push our configuration files to a remote repository:

$ dgit remote add origin ssh://fboender@git.electricmonk.nl:dotfiles
$ dgit push origin master
 * [new branch]      master -> master

And easily deploy them to a new machine:

$ ssh someothermachine
$ git clone ssh://fboender@git.electricmonk.nl:dotfiles ./.dotfiles
$ alias dgit='git --git-dir ~/.dotfiles/.git --work-tree=$HOME'
$ dgit reset --hard
HEAD is now at f437f9f Added .profile

Please note that any files that exist in your home directory will be overwritten by the files from your repository if they're present.

Conclusion

This DIY method of keeping your homedir in git should be easy to understand. Although there are tools out there that are easier to use, this method requires no installing other than Git. As I've stated in the introduction, I've been using this method for more than a year, and have found it to be the best way of keeping my home directory in git. 

SquashFS: Mountable compressed read-only filesystem

SquashFS is generally used for LiveCDs or embedded devices to store a compressed read-only version of a file system. This saves space at the expense of slightly slower access times from the media. There's another use for SquashFS: keeping an easily accessible compressed mounted image available. This is particularly useful for archival purposes such as keeping a full copy of an old server or directory around.

Usage is quite easy under Debian-derived systems. First we install the squashfs-tools package

$ sudo apt-get install squashfs-tools

Create an compressed version of a directory:

$ sudo mksquashfs /home/fboender/old-server_20150608/ old-server_20150608.sqsh

Remove the original archive:

$ sudo rm -rf /home/fboender/old-server_20150608

Finally, mount the compressed archive:

$ sudo mkdir /home/fboender/old-server_2015060
$ sudo mount -t squashfs -o loop old-server_20150608.sqsh /home/fboender/old-server_2015060

Now you can directly access files in the compressed archive:

$ sudo ls /home/fboender/old-server_2015060
home/
usr/
etc/
...

The space savings are considerable too.

$ sudo du -b -s /home/fboender/old-server_2015060
17329519042	/home/fboender/old-server_2015060
$ sudo ls -l old-server_20150608.sqsh
-rw-r--r-- 1 root root 1530535936 Jun  8 12:45

17 Gb for the full uncompressed archive versus only 1.5 Gb for the compressed archive. We just saved 15.5 Gb of diskspace. .

 Optionally, you may want to have it mounted automatically at boottime:

$ sudo vi /etc/fstab
/home/fboender/old-server_20150608.sqsh   /home/fboender/old-server_2015060        squashfs        ro,loop 0       0

If the server starts up, the archive directory will be automatically mounted.

I've released cfgtrack v1.0: Get notified of changes on your server

I needed a simple way of being notified when configuration files had been changed on some servers. Nothing fancy. No configuration management, no intrusion detection, no centralised versioning control repositories. Just a simple email saying what's been changed. I couldn't find a tool that did just that, and didn't require massive amounts of configuration, so I wrote one myself.

I've just released version 1.0 of the tool, which is available in source, Debian, Redhat and zip packages.

Here's how simple it is:

$ sudo cfgtrack track /etc/
Now tracking /etc/

# Make some changes in a file

$ sudo cfgtrack -a -m ferry.boender@example.com compare

And I'll get an email in my mailbox if anything's been changed since the last time I ran compare. A diff is included to easily spot what has changed.

Add the above to a daily cronjob and you'll be kept up-to-date about changes to your configuration files. Now you'll have a heads-up if automatic package upgrades modify configuration files or a co-administrator decided to make some changes.

More information is available on the Github project page.

 

 

Chrome’s Console API: Greatest Hits

The Chrome debugger is the best tool for locating problematic code in a JavaScript application, but there are times that diving into your code line-by-line isn’t the fastest or most convenient means to that end. We all know about console.log(), but I thought I’d write about a few of its lesser-known cousins that are more refined, and can be a lot more expressive.

Check out the helpful tips on using Chrome's javascript debugging console.

Minimalising the Gmail interface

Since a few months I've been using the Gmail web interface as my main email client. So far my experience has been pretty good, although it took some getting used to. I'm running it in a separate window instead of my main browser. For this I'm using it as an Application in Chrome (Open Gmail in Chrome and select Menu → Tools Create Application Shortcut). 

Since I'm running it in a separate window, much like a normal desktop email client, I'd like the interface to be as minimal and simple as possible. I don't use labels; either an email is in my inbox, or it's archived. Gmail's search is good enough that I don't require the use of labels.

I wrote a UserStyles style to remove unneeded elements from the interface. This is what Gmail looked like before:

gmail_cluttered

This is what it looks like with my UserStyle active:

gmail_clean

If you'd like your gmail interface to look the same:

  1. Get the Stylish plugin for your browser (Firefox, Chrome)
  2. Install the "Gmail minimal" UserStylle

It removes the labels sidebar, so you'll need to use Gmail a bit differently than you're used to if you use this style:

Dependency Injection in web.py

webpyweb.py is a lightweight Python web framework that gets out of your way and just let's you write Python.

Here's a simple program written in web.py:

import web

class index:
    def GET(self):
        return "Hello, World!`"

urls = (
    '/', 'index',
)

if __name__ == "__main__":
    app = web.application(urls, globals())
    app.run()

I quickly ran into the issue of writing larger well-structured applications in web.py though. If our program becomes bigger, we really want to break up our program into multiple files. This is of course no problem with web.py:

frontpage.py

class index:
    def GET(self):
        return "Hello, World!`"

webapp.py

import web

import frontpage

urls = (
    '/', 'frontpage.index',
)

if __name__ == "__main__":
    app = web.application(urls, globals())
    app.run()

In the example above, we put some of our routes in a seperate file and import it. web.py's urls definition understands this and happily use the Index class from the module. However, what if we want to pass some application-wide settings to the Index route? web.py's examples all use globals, but that's not gonna work if our route lives in another file. Besides, globals are annoying and make unit testing more difficult.

The way to get around this is with a technique called Dependency Injection. I couldn't find any best practices on how to do this with web.py, so I came up with the following:

frontpage.py

import web

class index:
    def GET(self):
        smtp_server = web.ctx.deps['config']['smtp_server']
        smtp_port = web.ctx.deps['config']['smtp_port']
        return "Sending email via %s:%s" % (smtp_server, smtp_port)

webapp.py

import web
import frontpage


class InjectorClass:
    def __init__(self, deps):
        self.deps = deps

    def __call__(self, handler):
        web.ctx.deps = self.deps
        return handler()

urls = (
    '/', 'frontpage.index',
)

if __name__ == "__main__":
    config = {
        'smtp_server': '127.0.0.1',
        'smtp_port': 25,
    }

    app = web.application(urls, globals())
    app.add_processor(InjectorClass({'config': config}))
    app.run()

If we run the webapp, we'll see:

Sending email via 127.0.0.1:25

The way this works is that we define an InjectorClass which simply holds a variable for us. In this case a dictionary containing a 'config' key with our configuration values. The InjectorClass also defines a __call__ method. This means any instances of the class become executable, as if it was a function. This lets us pass it to web.py as a processor (add_processor()).

Whenever a new request comes in, web.py does some magic with web.ctx (context) to ensure that the values it contains only apply to the current request. No other request sees values of any other request's web.ctx. For each request, web.py also calls every processor. In our case, that's an instance of the InjectorClass. When called, the __call__ method is invoked, which adds the dependencies to the web.ctx so the current request can access them.

So now we can pass any value to our InjectorClass on application startup, and it will automatically become available in each request.

You should be careful about what dependencies you inject. Generally, read-only values are fine, but you should realize that injected dependencies are shared among every request and should therefor be threadsafe.

I feel I should also note that we could have gone with a closure, but as I explained in an earlier article, I prefer a class.

 

Keep an archive of all mails going through a Postfix SMTP server

OB-PZ529_junkma_G_20111006070316

All too often I get asked questions about emails which I can't answer because I can't actually see the emails. While mail logging goes a long way, I'd really like to keep an archive of all mails sent via an SMTP server on that machine. Here's how to do that with Postfix. This was tested on Ubuntu 14.04, but should be applicable to other foonixes without too much trouble. Run all this as the root user.

Add a user to the system so postfix can send BCC's of all emails to it

adduser --system --home /var/archive/mail/ --no-create-home --disabled-password mailarchive

Next, create the Mailbox layout for the mail archive:

mkdir -p /var/archive/mail/tmp
mkdir -p /var/archive/mail/cur
mkdir -p /var/archive/mail/new
chown -R nobody:nogroup /var/archive

Configure Postfix to always send a copy of any emails sent to the mailarchive user:

postconf -e always_bcc=mailarchive@localhost

Configure the mail storage for the mailacrhive user so it uses the Mailbox format. This makes it easier to delete old emails:

# echo "mailarchive: /var/archive/mail/" >> /etc/aliases
# newaliases

Finally, restart postfix

/etc/init.d/postfix restart

Now to test it send an email through the SMTP server. I'll do a manual SMTP session here:

telnet localhost 25
HELO localhost
MAIL FROM: fboender@localhost
RCPT TO: ferry.boender@example.com
DATA
Subject: Mail test
Here's some mail.
.
QUIT

We check the /var/archive/mail/new directory:

ls /var/archive/mail/cur/
1425653888.Vfc00I461477M767603.posttest4:2,

And there is our mail.

To easily view the mail in the archive, install mutt:

apt-get install mutt
mutt -f /var/archive/mail/

You should probably write a cronjob that regularly cleans out the old mail, otherwise your filesystem will fill up. The following cronjob will delete all mail older than 30 days

cat /etc/cron.daily/mailarchive_clean
#!/bin/sh
find /var/archive/mail/ -type f -mtime +30 -exec rm "{}" \;
chmod 755 /etc/cron.daily/mailarchive_clean

Good luck!

Edit: Changed postfix configuration addition to postconf -e

Script to start a Chrome browser with an SSH Socks5 proxy

Socks5 proxies are great. They allow you to tunnel all traffic for applications that support Socks proxies through the proxy. One example I frequently use is starting a Chrome window that will do everthing as if it was an a remote machine. This is especially useful to bypass firewalls so you can test websites that are only available on localhost on a remote machine, or sites that can only be accessed if you're on a remote network. Basically it's a poor-man's application-specific VPN over SSH.

Normally I run the following:

ssh -D 8000 -N remote.example.com &
chromium-browser --temp-profile --proxy-server="socks5://localhost:8000"

However that quickly becomes tedious to type, so I wrote a script:

#!/bin/bash
HOST=$1
SITE=$2

if [ -z "$HOST" ]; then
    echo "Usage; $0 <HOST> [SITE]"
    exit 1
fi

while `true`;
do
    PORT=$(expr 8000 + $RANDOM / 32) # random port in range 8000 - 9000
    if [ \! "$(netstat -lnt | awk '$6 == "LISTEN" && $4 ~ ".$PORT"')" ]; then
        # Port not in use
        ssh -D $PORT -N $HOST &
        PID=$!
        chromium-browser --temp-profile --proxy-server="socks5://localhost:$PORT" $SITE
        kill $PID
        exit 0
    fi
done

The script finds a random unused port in the range 8000 – 9000, starts a Socks5 proxy to it and then starts Chromium on that socks proxy.

Together with the excellent Scripts panel plugin for Cinnamon, this makes for a nice menu to easily launch a browser to access remote sites otherwise unreachable:

lm_panel_scripts

Update: Added a second optional parameter to specify the site you want the browser to connect too, for added convience.

BBCloner v1.4: Bitbucket backup tool

I've released v1.4 of BBCloner.

BBCloner (Bitbucket Cloner) creates mirrors of your public and private Bitbucket Git repositories. It also synchronizes already existing mirrors. Initial mirror setup requires you manually enter your username/password. Subsequent synchronization of mirrors is done using Deployment Keys.

This release features a new flag: –tolerant (-t). It prevents bbcloner from complaining about failed repository synchronisation if a repository fails the first time. Only on the second failure to synchronize the same repository does bbcloner send out an email when using the -t switch. This should save on a lot of unwarranted emails, given that Bitbucket quite regularly seems to bork while syncing repositories.

Get the new release from the Downloads page or directly:

Host inventory overview using Ansible's Facts

UPDATE: I've written a fancier version of the above script as a separate project called ansible-cmdb. It uses templates and can generate a feature-laden HTML version and text versions. It also lets you extend the information from your hosts very easily; even adding completely new hosts. Packages are available for Debian, Redhat and other operating systems.

Ansible is a multiplexing configuration orchestrator. It allows you to run commands and configure servers from a central system. For example, to run the uname -a command on servers in the group "intranet":

[fboender@jib]~/Projects/ansible$ ansible -m shell -a "uname -a" intranet
host001.example.com | success | rc=0 >>
Linux host001.example.com 2.6.32-45-server #102-Ubuntu SMP Wed Jan 2 22:53:00 UTC 2013 x86_64 GNU/Linux

host004.example.com | success | rc=0 >>
Linux vps004c.example.com 2.6.32-55-server #117-Ubuntu SMP Tue Dec 3 17:45:11 UTC 2013 x86_64 GNU/Linux

Ansible can also gather system information using the 'setup' module. It returns the information as a JSON structure:

[fboender@jib]~/Projects/ansible$ ansible -m setup intranet
host001.example.com | success >> {
    "ansible_facts": {
        "ansible_all_ipv4_addresses": [
            "182.78.44.33", 
            "10.0.0.1"
        ], 
        "ansible_architecture": "x86_64", 
        "ansible_bios_date": "NA", 
        "ansible_bios_version": "NA", 
        ... etc

We can use this to display a short tabulated overview of important system information such as the FQDN, configured IPs, Disk and memory information. I wrote a quick script to do this. The result looks like this:

[fboender@jib]~/Projects/ansible$ ./hostinfo intranet
Name                     FQDN                     Datetime                    OS            Arch    Mem              Disk                 Diskfree          IPs
-----------------------  -----------------------  --------------------------  ------------  ------  ---------------  -------------------  ----------------  -------------------------------------------------------------------------
host001                  host001.example.com      2015-01-20 14:37 CET +0100  Ubuntu 12.04  x86_64  4g (free 0.16g)  80g                  40g               182.78.44.33, 10.0.0.1
host002                  host002.example.com      2015-01-20 14:37 CET +0100  Ubuntu 14.04  x86_64  2g (free 1.21g)  40g                  18g               182.78.44.34, 10.0.0.2
xxxxxx.xxxxxx.xx         xxxxxxx.example.com      2015-01-20 13:37 CET +0000  Ubuntu 10.04  x86_64  2g (free 0.04g)  241g                 20g               192.168.0.2, 10.000.0.4
xxxx.xxxxxx.xxx          xxxx.otherdom.com        2015-01-20 14:37 CET +0100  Ubuntu 13.04  x86_64  8g (free 0.14g)  292g, 1877g, 1877g   237g, 583g, 785g  192.168.1.9, 192.168.1.10, 10.0.0.6
xxxxxxxx.xxxxxx.xx       xxxx.otherdom.com        2015-01-20 14:36 CET +0100  Ubuntu 14.04  i386    6g (free 0.25g)  1860g, 1877g, 1877g  960g, 292g, 360g  10.0.0.5, 10.0.0.14, 192.168.1.12
xxxxx.xxxxx.xxx          test.otherdom.com        2015-01-20 14:37 CET +0100  Ubuntu 9.10   x86_64  2g (free 0.28g)  40g                  16g               10.0.0.15, 10.0.0.9

The script:

#!/usr/bin/python
# MIT license

import os
import sys
import shutil
import json
import tabulate
import pprint

host = sys.argv[1]
tmp_dir = 'tmp_fact_col'

try:
    shutil.rmtree(tmp_dir)
except OSError:
    pass
os.mkdir(tmp_dir)
cmd = "ansible -t {} -m setup {} >/dev/null".format(tmp_dir, host)
os.system(cmd)

headers = [
    'Name', 'FQDN', 'Datetime', 'OS', 'Arch', 'Mem', 'Disk', 'Diskfree', 'IPs',
]
d = []

for fname in os.listdir(tmp_dir):
    path = os.path.join(tmp_dir, fname)
    j = json.load(file(path, 'r'))
    if 'failed' in j:
        continue
    d.append(
        (
            fname,
            j['ansible_facts']['ansible_fqdn'],
            "%s %s:%s %s %s" % (
                j['ansible_facts']['ansible_date_time']['date'],
                j['ansible_facts']['ansible_date_time']['hour'],
                j['ansible_facts']['ansible_date_time']['minute'],
                j['ansible_facts']['ansible_date_time']['tz'],
                j['ansible_facts']['ansible_date_time']['tz_offset'],
            ),
            "%s %s" % (
                j['ansible_facts']['ansible_distribution'],
                j['ansible_facts']['ansible_distribution_version'],
            ),
            j['ansible_facts']['ansible_architecture'],
            '%0.fg (free %0.2fg)' % (
                (j['ansible_facts']['ansible_memtotal_mb'] / 1000.0),
                (j['ansible_facts']['ansible_memfree_mb'] / 1000.0)
                ),
            ', '.join([str(i['size_total']/1048576000) + 'g' for i in j['ansible_facts']['ansible_mounts']]),
            ', '.join([str(i['size_available']/1048576000) + 'g' for i in j['ansible_facts']['ansible_mounts']]),
            ', '.join(j['ansible_facts']['ansible_all_ipv4_addresses']),
        )
    )
    os.unlink(path)
shutil.rmtree(tmp_dir)
print tabulate.tabulate(d, headers=headers)

The script requires the Tabulator python library. Put the script in the directory containing your ansible hosts file, and run it.

UPDATE: I've written a fancier version of the above script as a separate project called ansible-cmdb. It uses templates and can generate a feature-laden HTML version and text versions. It also lets you extend the information from your hosts very easily; even adding completely new hosts. Packages are available for Debian, Redhat and other operating systems.