
Disable "New release available" emails on Ubuntu

We have our Ubuntu machines set up to mail us the output of cron jobs like so:

$ cat /etc/crontab 
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=admin@example.com

# m h dom mon dow user    command

This is incredibly useful, since cronjobs should never output anything unless something is wrong.

Unfortunately, this means we also get emails like this:

/etc/cron.weekly/update-notifier-common:
New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

You can fairly easily disable these by modifying the corresponding cronjob /etc/cron.weekly/update-notifier-common:

 #!/bin/sh
 
 set -e
 
 [ -x /usr/lib/ubuntu-release-upgrader/release-upgrade-motd ] || exit 0
 
 # Check to see whether there is a new version of Ubuntu available
-/usr/lib/ubuntu-release-upgrader/release-upgrade-motd
+/usr/lib/ubuntu-release-upgrader/release-upgrade-motd > /dev/null

Now you'll no longer receive these emails. It's also possible to remove the cronjob entirely, but then an upgrade is likely to put it back, and I have no idea if the cronjob has any other side effects besides emailing.
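
If you have to apply this on multiple machines, a one-liner makes the same edit (a sketch; it assumes the cronjob still matches the version shown above):

$ sudo sed -i 's|release-upgrade-motd$|release-upgrade-motd > /dev/null|' /etc/cron.weekly/update-notifier-common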

Ansible-cmdb v1.15: Generate a host overview of Ansible facts.

I've just released ansible-cmdb v1.15. Ansible-cmdb takes the output of Ansible's fact gathering and converts it into a static HTML overview page containing system configuration information. It supports multiple templates and extending information gathered by Ansible with custom data.

This release includes various bugfixes and feature improvements.

As always, packages are available for Debian, Ubuntu, Red Hat, CentOS and other systems. Get the new release from the Github releases page.

cfgtrack: A simple tool that tracks and reports diffs in files between invocations.

Sometimes other people change configurations on machines that I help administer. Unfortunately, I wouldn't know when they changed something or what they changed. There are many tools available to track configuration changes, but most are way overpowered. As a result they require too much time to set up and configure properly. All I want is a notification when things have changed, and a Unified Diff of what changes were made. I don't even care who made the changes.

So I wrote cfgtrack:

cfgtrack tracks and reports diffs in files between invocations.

It lets you add directories and files to a tracking list by keeping a separate copy of each file in a tracking directory. When invoked with the 'compare' command, it outputs a unified diff of any changes made since the previous 'compare' invocation and then automatically updates the tracked copy. It can also send an email with the diff attached.

It's super simple to install and use. Packages are available for the most common distributions; install one of them (see the README for instructions).

Specify something to track:

$ sudo cfgtrack /etc/
Now tracking /etc/

Show the differences since the last compare, store them in the archive (/var/lib/cfgtrack/archive) and send an email to the admin with the diff attached:

$ sudo cfgtrack -a -m admin@example.com compare
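
To get these reports automatically, you could run the compare from cron. A hypothetical /etc/cron.d entry (assuming the cfgtrack binary lives in /usr/bin):

# Mail a diff of any configuration changes every day at 07:00
0 7 * * * root /usr/bin/cfgtrack -a -m admin@example.com compare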

For more info, see the project page on github.

Exploring UPnP with Python

UPnP stands for Universal Plug and Play. It's a standard for discovering and interacting with services offered by various devices on a network. Common examples include media servers that announce the audio and video they offer to media players, and routers that let clients on the local network set up port forwards.

In this article we'll explore the client side (usually referred to as the Control Point side) of UPnP using Python. I'll explain the different protocols used in UPnP and show how to write some basic Python code to discover and interact with devices. There's lots of information on UPnP on the Internet, but a lot of it is fragmented, discusses only certain aspects of UPnP or is vague on whether we're dealing with the client or a server. The UPnP standard itself is quite an easy read though.

Disclaimer: The code in this article is rather hacky and not particularly robust. Do not use it as a basis for any real projects.

Protocols

UPnP uses a variety of different protocols to accomplish its goals: SSDP (HTTP over UDP) for discovering devices, SCPD (XML) for describing the services those devices offer, and SOAP (XML over HTTP) for actually calling the services.

Here's a schematic overview of the flow of a UPnP session and where the different protocols come into play.

[Diagram: UPnP session flow]

The standard flow of operations in UPnP is to first use SSDP to discover which UPnP devices are available on the network. Those devices return the location of an XML file which defines the various services offered by each device. Next we use SCPD on each service to discover the various actions offered by each service. Essentially, SCPD is an XML-based protocol which describes SOAP APIs, much like WSDL. Finally we use SOAP calls to interact with the services.

SSDP: Service Discovery

Let's take a closer look at SSDP, the Simple Service Discovery Protocol. SSDP operates over UDP rather than TCP. While TCP is a stateful protocol, meaning both end-points of the connection are aware of who they're talking to, UDP is stateless. This means we can just throw UDP packets over the line, and we don't care much whether they are received properly or even received at all. UDP is often used in situations where missing a few packets is not a problem, such as streaming media.

SSDP uses HTTP over UDP (called HTTPU) in multicast mode. This allows all UPnP devices on the network to receive the request, regardless of whether we know where they are located. Here's a very simple example of how to perform an HTTPU query using Python:

import socket
 
msg = \
    'M-SEARCH * HTTP/1.1\r\n' \
    'HOST:239.255.255.250:1900\r\n' \
    'ST:upnp:rootdevice\r\n' \
    'MX:2\r\n' \
    'MAN:"ssdp:discover"\r\n' \
    '\r\n'  # An empty line terminates the HTTP request
 
# Set up UDP socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
s.settimeout(2)
s.sendto(msg, ('239.255.255.250', 1900) )
 
try:
    while True:
        data, addr = s.recvfrom(65507)
        print addr, data
except socket.timeout:
    pass

This little snippet of code creates an HTTP message using the M-SEARCH HTTP method, which is specific to UPnP. It then sets up a UDP socket and sends the HTTPU message to IP address 239.255.255.250, port 1900. That IP is a special multicast address: unlike normal IPs, it is not tied to any specific host. Port 1900 is the one UPnP devices listen on for discovery requests.

Next, we listen on the socket for any replies. The socket has a timeout of 2 seconds, so if no data arrives within two seconds, the s.recvfrom() call times out and raises an exception. The exception is caught, and the program continues.

[Diagram: UDP multicast discovery]

You will recall that we don't know how many devices might be on the network. We also don't know where they are nor do we have any idea how fast they will respond. This means we can't be certain about the number of seconds we must wait for replies. This is the reason why so many UPnP control points (clients) are so slow when they scan for devices on the network.

In general, all devices should be able to respond in less than 2 seconds. It seems that manufacturers would rather be on the safe side and sometimes wait up to 10 seconds for replies. A better approach would be to cache previously found devices and immediately check their availability upon startup. A full device search could then be done asynchronously in the background. Then again, many UPnP devices set the cache validity timeout extremely low, so clients (if they properly implement the standard) are forced to rediscover them every time.

Anyway, here's the output of the M-SEARCH on my home network. I've stripped some of the headers for brevity:

('192.168.0.1', 1900) HTTP/1.1 200 OK
USN: uuid:2b2561a3-a6c3-4506-a4ae-247efe0defec::upnp:rootdevice
SERVER: Linux/2.6.18_pro500 UPnP/1.0 MiniUPnPd/1.5
LOCATION: http://192.168.0.1:40833/rootDesc.xml

('192.168.0.2', 53375) HTTP/1.1 200 OK
LOCATION: http://192.168.0.2:1025/description.xml
SERVER: Linux/2.6.35-31-generic, UPnP/1.0, Free UPnP Entertainment Service/0.655
USN: uuid:60c251f1-51c6-46ae-93dd-0a3fb55a316d::upnp:rootdevice

Two devices responded to our M-SEARCH query within the specified number of seconds. One is a cable internet router, the other is Fuppes, a UPnP media server. The most interesting things in these replies are the LOCATION headers, which point us to an SCPD XML file: http://192.168.0.1:40833/rootDesc.xml.
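
Extracting the LOCATION header from such a reply takes only a few lines. A minimal sketch, assuming the data variable from the discovery snippet above:

# Parse the LOCATION header out of an SSDP reply
location = None
for line in data.splitlines():
    if line.lower().startswith('location:'):
        location = line.split(':', 1)[1].strip()
        break
print location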

SCPD, Phase I: Fetching and parsing the root SCPD file

The SCPD XML file (http://192.168.0.1:40833/rootDesc.xml) contains information on the UPnP server such as the manufacturer, the services offered by the device, etc. The XML file is rather big and complicated. You can see the full version, but here's a greatly reduced one from my router:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="urn:schemas-upnp-org:device-1-0">
  <device>
    <deviceType>urn:schemas-upnp-org:device:InternetGatewayDevice:1</deviceType>
    <friendlyName>Ubee EVW3226</friendlyName>
    <serviceList>
      <service>
        <serviceType>urn:schemas-upnp-org:service:Layer3Forwarding:1</serviceType>
        <controlURL>/ctl/L3F</controlURL>
        <eventSubURL>/evt/L3F</eventSubURL>
        <SCPDURL>/L3F.xml</SCPDURL>
      </service>
    </serviceList>
    <deviceList>
      <device>
        <deviceType>urn:schemas-upnp-org:device:WANDevice:1</deviceType>
        <friendlyName>WANDevice</friendlyName>
        <serviceList>
          <service>
            <serviceType>urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1</serviceType>
            <serviceId>urn:upnp-org:serviceId:WANCommonIFC1</serviceId>
            <controlURL>/ctl/CmnIfCfg</controlURL>
            <eventSubURL>/evt/CmnIfCfg</eventSubURL>
            <SCPDURL>/WANCfg.xml</SCPDURL>
          </service>
        </serviceList>
        <deviceList>
          <device>
            <deviceType>urn:schemas-upnp-org:device:WANConnectionDevice:1</deviceType>
            <friendlyName>WANConnectionDevice</friendlyName>
            <serviceList>
              <service>
                <serviceType>urn:schemas-upnp-org:service:WANIPConnection:1</serviceType>
                <controlURL>/ctl/IPConn</controlURL>
                <eventSubURL>/evt/IPConn</eventSubURL>
                <SCPDURL>/WANIPCn.xml</SCPDURL>
              </service>
            </serviceList>
          </device>
        </deviceList>
      </device>
    </deviceList>
  </device>
</root>

It basically consists of three important things: the URLBase, the (possibly nested) devices, and the services each device offers.

URLBase

Not all SCPD XML files contain a URLBase (the one above from my router doesn't), but if they do, it looks like this:

<URLBase>http://192.168.1.254:80</URLBase>

This is the base URL for the SOAP requests. If the SCPD XML does not contain a URLBase element, the LOCATION header from the server's discovery response may be used as the base URL. Any paths should be stripped off, leaving only the protocol, IP and port. In the case of my internet router that would be: http://192.168.0.1:40833/

Devices

The XML file then specifies devices, which are virtual devices that the physical device contains. These devices can contain a list of services in the <serviceList> tag. A list of sub-devices can be found in the <deviceList> tag. The devices in the deviceList can themselves contain a list of services and devices. Thus, devices can recursively contain sub-devices, as shown in the following diagram:

[Diagram: recursive device hierarchy]

As you can see, a virtual device can contain a device list, which can itself contain virtual devices, etc. We are most interested in the <service> elements from the <serviceList>. They look like this:

<service>
  <serviceType>urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1</serviceType>
  <serviceId>urn:upnp-org:serviceId:WANCommonIFC1</serviceId>
  <controlURL>/ctl/CmnIfCfg</controlURL>
  <eventSubURL>/evt/CmnIfCfg</eventSubURL>
  <SCPDURL>/WANCfg.xml</SCPDURL>
</service>
...
<service>
  <serviceType>urn:schemas-upnp-org:service:WANIPConnection:1</serviceType>
  <controlURL>/ctl/IPConn</controlURL>
  <eventSubURL>/evt/IPConn</eventSubURL>
  <SCPDURL>/WANIPCn.xml</SCPDURL>
</service>

The <URLBase> in combination with the <controlURL> gives us the URL of the SOAP endpoint where we can send our requests. The URLBase in combination with the <SCPDURL> points us to an SCPD (Service Control Point Definition) XML file which contains a description of the SOAP calls.

The following Python code extracts the URLBase, ControlURL and SCPDURL information:

import urllib2
import urlparse
from xml.dom import minidom
 
def XMLGetNodeText(node):
    """
    Return text contents of an XML node.
    """
    text = []
    for childNode in node.childNodes:
        if childNode.nodeType == node.TEXT_NODE:
            text.append(childNode.data)
    return(''.join(text))
 
location = 'http://192.168.0.1:40833/rootDesc.xml'
 
# Fetch SCPD
response = urllib2.urlopen(location)
root_xml = minidom.parseString(response.read())
response.close()
 
# Construct BaseURL
base_url_elem = root_xml.getElementsByTagName('URLBase')
if base_url_elem:
    base_url = XMLGetNodeText(base_url_elem[0]).rstrip('/')
else:
    url = urlparse.urlparse(location)
    base_url = '%s://%s' % (url.scheme, url.netloc)
 
# Output Service info
for node in root_xml.getElementsByTagName('service'):
    service_type = XMLGetNodeText(node.getElementsByTagName('serviceType')[0])
    control_url = '%s%s' % (
        base_url,
        XMLGetNodeText(node.getElementsByTagName('controlURL')[0])
    )
    scpd_url = '%s%s' % (
        base_url,
        XMLGetNodeText(node.getElementsByTagName('SCPDURL')[0])
    )
    print '%s:\n  SCPD_URL: %s\n  CTRL_URL: %s\n' % (service_type,
                                                     scpd_url,
                                                     control_url)

Output:

urn:schemas-upnp-org:service:Layer3Forwarding:1:
  SCPD_URL: http://192.168.0.1:40833/L3F.xml
  CTRL_URL: http://192.168.0.1:40833/ctl/L3F

urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1:
  SCPD_URL: http://192.168.0.1:40833/WANCfg.xml
  CTRL_URL: http://192.168.0.1:40833/ctl/CmnIfCfg

urn:schemas-upnp-org:service:WANIPConnection:1:
  SCPD_URL: http://192.168.0.1:40833/WANIPCn.xml
  CTRL_URL: http://192.168.0.1:40833/ctl/IPConn

SCPD, Phase II: Service SCPD files

Let's look at the WANIPConnection service. We have an SCPD XML file for it at http://192.168.0.1:40833/WANIPCn.xml and a SOAP URL at http://192.168.0.1:40833/ctl/IPConn. We must find out which SOAP calls we can make, and which parameters they take. Normally SOAP would use a WSDL file to define its API. With UPnP, however, this information is contained in the SCPD XML file for the service. Here's an example of the full version of the WANIPCn.xml file. There are two interesting things in the XML file:

ActionList

The <ActionList> tag contains a list of actions understood by the SOAP server. It looks like this:

<actionList>
  <action>
    <name>SetConnectionType</name>
    <argumentList>
      <argument>
        <name>NewConnectionType</name>
        <direction>in</direction>
        <relatedStateVariable>ConnectionType</relatedStateVariable>
      </argument>
    </argumentList>
  </action>
  <action>
    [... etc ...]
  </action>
</actionList>

In this example, we discover an action called SetConnectionType. It takes one incoming argument: NewConnectionType. The relatedStateVariable specifies which StateVariable this argument should adhere to.
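
Rather than reading through the XML by hand, we can dump each action and its arguments with a bit of Python. A minimal sketch (the URL is my router's WANIPCn.xml from earlier):

import urllib2
from xml.dom import minidom

scpd_xml = minidom.parseString(
    urllib2.urlopen('http://192.168.0.1:40833/WANIPCn.xml').read())

# The first <name> inside an <action> is the action's own name; the ones
# inside <argument> elements are the argument names.
for action in scpd_xml.getElementsByTagName('action'):
    print action.getElementsByTagName('name')[0].firstChild.data
    for arg in action.getElementsByTagName('argument'):
        print '  %s: %s' % (
            arg.getElementsByTagName('direction')[0].firstChild.data,
            arg.getElementsByTagName('name')[0].firstChild.data
        )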

serviceStateTable

Looking at the <serviceStateTable> section later on in the XML file, we see:

<serviceStateTable>
  <stateVariable sendEvents="no">
    <name>ConnectionType</name>
    <dataType>string</dataType>
  </stateVariable>
  <stateVariable>
  [... etc ...]
  </stateVariable>
</serviceStateTable>

From this we conclude that the NewConnectionType argument we send to the SetConnectionType SOAP call should be a string.

Another example is the GetExternalIPAddress action. It takes no incoming arguments, but does return a value with the name "NewExternalIPAddress". The action will return the external IP address of your router. That is, the IP address you use to connect to the internet. 

<action>
  <name>GetExternalIPAddress</name>
  <argumentList>
    <argument>
      <name>NewExternalIPAddress</name>
      <direction>out</direction>
      <relatedStateVariable>ExternalIPAddress</relatedStateVariable>
    </argument>
  </argumentList>
</action>

Let's make a SOAP call to that action and find out what our external IP is.

SOAP: Calling an action

Normally we would use a SOAP library to create a call to a SOAP service. In this article I'm going to cheat a little and build a SOAP request from scratch.

import urllib2
 
soap_encoding = "http://schemas.xmlsoap.org/soap/encoding/"
soap_env = "http://schemas.xmlsoap.org/soap/envelope/"
service_ns = "urn:schemas-upnp-org:service:WANIPConnection:1"
# Note: SOAP is case-sensitive, so the element and attribute names below must
# be capitalized exactly like this.
soap_body = """<?xml version="1.0"?>
<SOAP-ENV:Envelope SOAP-ENV:encodingStyle="%s" xmlns:SOAP-ENV="%s">
  <SOAP-ENV:Body>
    <m:GetExternalIPAddress xmlns:m="%s">
    </m:GetExternalIPAddress>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>""" % (soap_encoding, soap_env, service_ns)
 
soap_action = "urn:schemas-upnp-org:service:WANIPConnection:1#GetExternalIPAddress"
headers = {
    'SOAPAction': u'"%s"' % (soap_action),
    'Host': u'192.168.0.1:40833',
    'Content-Type': 'text/xml',
    'Content-Length': len(soap_body),
}
 
ctrl_url = "http://192.168.0.1:40833/ctl/IPConn"
 
request = urllib2.Request(ctrl_url, soap_body, headers)
response = urllib2.urlopen(request)
 
print response.read()

The SOAP server returns a response with our external IP in it. I've pretty-printed it for your convenience and removed some XML namespaces for brevity:

<?xml version="1.0"?>
<s:Envelope xmlns:s=".." s:encodingStyle="..">
  <s:Body>
    <u:GetExternalIPAddressResponse xmlns:u="urn:schemas-upnp-org:service:WANIPConnection:1">
      <NewExternalIPAddress>212.100.28.66</NewExternalIPAddress>
    </u:GetExternalIPAddressResponse>
  </s:Body>
</s:Envelope>

We can now put the response through an XML parser and combine it with the SCPD XML's <argumentList> and <serviceStateTable> to figure out which output parameters we can expect and what type they are. Doing that generically is beyond the scope of this article, since it's rather straightforward yet takes a reasonable amount of code. Suffice to say that our external IP is 212.100.28.66.
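
If all we want is the address itself, a few lines of minidom suffice. A minimal sketch, assuming the response body from the SOAP call above was stored in resp_body instead of printed:

from xml.dom import minidom

# Parse the SOAP response and pull out the NewExternalIPAddress value
resp_xml = minidom.parseString(resp_body)
print resp_xml.getElementsByTagName('NewExternalIPAddress')[0].firstChild.data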

Summary

To summarise, these are the steps we take to actually do something useful with a UPnP service:

  1. Broadcast an HTTP-over-UDP (HTTPU) message to the network asking for UPnP devices to respond.
  2. Listen for incoming UDP replies and extract the LOCATION header.
  3. Do an HTTP GET to fetch the root SCPD XML file from the LOCATION.
  4. Extract services and/or devices from the SCPD XML file.
    1. For each service, extract the Control and SCPD URLs.
    2. Combine the BaseURL (or, if it was not present in the SCPD XML, the LOCATION header) with the Control and SCPD URLs.
  5. Do an HTTP GET to fetch the service's SCPD XML file that describes the actions it supports.
  6. Send a SOAP POST request to the service's Control URL to call one of the actions that it supports.
  7. Receive and parse the reply.

An example with Requests on the left and Responses on the right. Like all other examples in this article, the XML has been heavily stripped of redundant or unimportant information:

[Diagram: UPnP request/response examples]

Conclusion

I underwent this whole UPnP journey because I wanted a way to transparently support connections from external networks to my locally running application. While UPnP allows me to do that, I feel that UPnP is needlessly complicated. The standard, while readable, feels like it was designed by committee. The indirectness of having to fetch multiple SCPD files, the use of non-standard protocols, the nestable virtual sub-devices… it all feels slightly unnecessary. Then again, it could be a lot worse. One only needs to take a quick look at SAML v2 to see that UPnP isn't all that bad.

All in all, it let me do what I needed, and it didn't take too long to figure out how it worked. As a kind of exercise I partially implemented a simple-to-use, high-level UPnP client for Python, which is available on Github. Take a look at the source for more insights on how to deal with UPnP.

Alien Deb to RPM convert fails with error: "Use of uninitialized value in split…" [FIXED]

TL;DR: Run alien under the script tool.

I was trying to get my build server to build packages for one of my projects. One step involves converting a Debian package to an RPM by means of the Alien tool. Unfortunately it failed with the following error:

alien -r -g ansible-cmdb-9.99.deb
Warning: alien is not running as root!
Warning: Ownerships of files in the generated packages will probably be wrong.
Use of uninitialized value in split at /usr/share/perl5/Alien/Package/Deb.pm line 52.
no entry data.tar.gz in archive

gzip: stdin: unexpected end of file
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors
Error executing "ar -p 'ansible-cmdb-9.99.deb' data.tar.gz | gzip -dc | tar tf -":  at /usr/share/perl5/Alien/Package.pm line 481.
make: *** [release_rpm] Error 2

The same process ran fine when started manually from the commandline, so I suspected something related to the controlling terminal. One trick that often works is to fool the script into thinking it has a working interactive TTY, using the script tool. Here's how that looks:

script -c "sh ansible-cmdb-tests.sh"
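
By default, script also writes a transcript of the session to a file called typescript; if you don't care about that, you can point it at /dev/null:

script -q -c "sh ansible-cmdb-tests.sh" /dev/null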

The job now runs fine:

alien -r -g ansible-cmdb-9.99.deb
Warning: alien is not running as root!
Warning: Ownerships of files in the generated packages will probably be wrong.
Directory ansible-cmdb-9.99 prepared.
sed -i '\:%dir "/":d' ansible-cmdb-9.99/ansible-cmdb-9.99-2.spec
sed -i '\:%dir "/usr/":d' ansible-cmdb-9.99/ansible-cmdb-9.99-2.spec
sed -i '\:%dir "/usr/share/":d' ansible-cmdb-9.99/ansible-cmdb-9.99-2.spec
sed -i '\:%dir "/usr/share/man/":d' ansible-cmdb-9.99/ansible-cmdb-9.99-2.spec
sed -i '\:%dir "/usr/share/man/man1/":d' ansible-cmdb-9.99/ansible-cmdb-9.99-2.spec
sed -i '\:%dir "/usr/lib/":d' ansible-cmdb-9.99/ansible-cmdb-9.99-2.spec
sed -i '\:%dir "/usr/bin/":d' ansible-cmdb-9.99/ansible-cmdb-9.99-2.spec
cd ansible-cmdb-9.99 && rpmbuild --buildroot='/home/builder/workspace/ansible-cmdb/ansible-cmdb-9.99/' -bb --target noarch 'ansible-cmdb-9.99-2.spec'
Building target platforms: noarch
Building for target noarch
Processing files: ansible-cmdb-9.99-2.noarch
Provides: ansible-cmdb = 9.99-2
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: /usr/bin/env
Checking for unpackaged file(s): /usr/lib/rpm/check-files /home/builder/workspace/ansible-cmdb/ansible-cmdb-9.99
warning: Installed (but unpackaged) file(s) found:
   /ansible-cmdb-9.99-2.spec
Wrote: ../ansible-cmdb-9.99-2.noarch.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.3ssxv0
+ umask 022
+ cd /home/builder/rpmbuild/BUILD
+ /bin/rm -rf /home/builder/workspace/ansible-cmdb/ansible-cmdb-9.99
+ exit 0

Hope that helps someone out there.

Ansible-cmdb v1.14: Generate a host overview of Ansible facts.

I've just released ansible-cmdb v1.14. Ansible-cmdb takes the output of Ansible's fact gathering and converts it into a static HTML overview page containing system configuration information. It supports multiple templates and extending information gathered by Ansible with custom data.

This release includes various bugfixes and feature improvements.

As always, packages are available for Debian, Ubuntu, Red Hat, CentOS and other systems. Get the new release from the Github releases page.

Terrible Virtualbox disk performance

For a while I've noticed that Virtualbox has terrible performance when installing Debian / Ubuntu packages. A little top, iotop and atop research later, it turned out the culprit was disk I/O, which was just ludicrously slow. The cause is the fact that Virtualbox doesn't have "Use host I/O cache" turned on by default for SATA controllers. Turning that option on gives a massive speed improvement.

To turn "Host IO  cache":

 

 

hostiocache
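
If you prefer the command line over the GUI, the same setting can be toggled with VBoxManage. A sketch, where "myvm" and the controller name "SATA" are placeholders for your own VM and controller names (VBoxManage showvminfo myvm lists them):

$ VBoxManage storagectl "myvm" --name "SATA" --hostiocache on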

Update:

Craig Gill was kind enough to mail me with the reasoning behind why the "Host I/O cache" setting is off by default:

Hello, thanks for the blog post, I just wanted to comment on it below:

In your blog post "Terrible Virtualbox disk performance", you mention
that the 'use host i/o cache' is off by default.
In the VirtualBox help it explains why it is off by default, and it
basically boils down to safety over performance.

Here are the disadvantages of having that setting on (which these
points may not apply to you):

1. Delayed writing through the host OS cache is less secure. When the
guest OS writes data, it considers the data written even though it has
not yet arrived on a physical disk. If for some reason the write does
not happen (power failure, host crash), the likelihood of data loss
increases.

2. Disk image files tend to be very large. Caching them can therefore
quickly use up the entire host OS cache. Depending on the efficiency
of the host OS caching, this may slow down the host immensely,
especially if several VMs run at the same time. For example, on Linux
hosts, host caching may result in Linux delaying all writes until the
host cache is nearly full and then writing out all these changes at
once, possibly stalling VM execution for minutes. This can result in
I/O errors in the guest as I/O requests time out there.

3. Physical memory is often wasted as guest operating systems
typically have their own I/O caches, which may result in the data
being cached twice (in both the guest and the host caches) for little
effect.

"If you decide to disable host I/O caching for the above reasons,
VirtualBox uses its own small cache to buffer writes, but no read
caching since this is typically already performed by the guest OS. In
addition, VirtualBox fully supports asynchronous I/O for its virtual
SATA, SCSI and SAS controllers through multiple I/O threads."

Thanks,
Craig

That's important information to keep in mind. Thanks Craig!

My reply:

Thanks for the heads up!

For what it's worth: I already wanted to write this article a few weeks ago, and have been running all my VirtualBox hosts with Host I/O cache set to "on" since then without noticing any problems. I haven't lost any data in databases running on Virtualbox (even though they've crashed a few times; or actually, I killed them with ctrl-alt-backspace), and the host is actually more responsive when VirtualBoxes are under heavy disk load. I do have 16 GB of memory, which may make the difference.

The speedups in the guests are more than worth any other drawbacks though. The gitlab omnibus package now installs in 4 minutes rather than 5+ hours (yes, hours).

I'll keep this blog post updated in case I run into any weird problems.

mdpreview, a Markdown previewer to be used with an external editor

There are many Markdown previewers out there, from the simplest commandline tool + webbrowser to full-fledged Markdown IDEs. I've tried quite a few, and I like none of them. I write my Markdown in an external editor (Vim), something very few Markdown previewers take into account. The ones that do are buggy. So I wrote mdpreview, a standalone Markdown previewer for Linux that works great with an external editor such as Vim.

A feature to automatically scroll to the last made change in the Markdown file is currently being implemented.

Here's mdpreview running the Solarized theme:

[Screenshot: mdpreview with the Solarized theme]

The Github theme:

[Screenshot: mdpreview with the Github theme]

And the BitBucket theme:

[Screenshot: mdpreview with the BitBucket theme]

More information and installation instructions are available on the Github page.

Manually scrolling a Python GTK Webview

I was trying to manually scroll a (Python) GTK embedded WebView in order to return the webview to its previous position after setting new contents with webview.load_html_string(html, 'file:///'). I couldn't get it to work, and Google wasn't of much help either.

I could scroll the Webview just fine from a key-press-event handler on the main window like this:

def __init__(self):
    # -- Removed some code here for brevity --
    self.scroll_window = gtk.ScrolledWindow(None, None)
    self.scroll_window.add(self.webview)
    self.win_main.connect("key-press-event", self.ev_key_press_event)

def ev_key_press_event(self, widget, ev):
    if ev.keyval == gtk.keysyms.t:
        self.scroll_window.get_vadjustment().set_value(100)    

But automatically scrolling when something happened to the webview (in my case new content being set via webview.load_html_string()) didn't work. 

It turns out that the webview is still handling events and won't allow scrolling using `scroll_window.get_vadjustment().set_value()` until all the events are handled.

You can manually handle all the pending GTK events before starting scrolling like this:

def __init__(self):
    # -- Removed some code here for brevity --
    self.scroll_window = gtk.ScrolledWindow(None, None)
    self.scroll_window.add(self.webview)
    self.webview.connect('notify::load-status', self.ev_load_status)

def ev_load_status(self, webview, load_status):
    if self.webview.get_load_status() == webkit.LOAD_FINISHED:
        while gtk.events_pending():
            gtk.main_iteration_do()

        self.scroll_window.get_vadjustment().set_value(100)

The solution above works for me on both initial load of a document and subsequent changing of the webkit contents using `webview.load_html_string()`.

Multithreaded dev web server for the Python Bottle web framework

I'm writing a simple web application in the Bottle framework. I ran into an issue where I had a single long-running request, but needed to make some additional requests from the browser to the server. It turns out that Bottle's built-in development web server is single-threaded and can't handle multiple requests at the same time. This is annoying, since I don't want to have to deploy my application each time I make a change; that's what the built-in development web server is for.

The solution is easy: create a very simple multithreaded WSGI web server and use that to serve the Bottle application.

wsgiserver.py

"""
Simple multithreaded WSGI HTTP server.
"""

from wsgiref.simple_server import make_server, WSGIServer
from SocketServer import ThreadingMixIn

class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
    daemon_threads = True

class Server:
    def __init__(self, wsgi_app, listen='127.0.0.1', port=8080):
        self.wsgi_app = wsgi_app
        self.listen = listen
        self.port = port
        self.server = make_server(self.listen, self.port, self.wsgi_app,
                                  ThreadingWSGIServer)

    def serve_forever(self):
        self.server.serve_forever()

We then include that in the file where we create our Bottle app:

app.py

import bottle
import wsgiserver

wsgiapp = bottle.default_app()
httpd = wsgiserver.Server(wsgiapp)
httpd.serve_forever()

We now have a Bottle app that can handle multiple concurrent requests. I'm not sure how well this works with automatic reloading and such, but I think it should be fine.
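
For a quick test, here's the same app.py with an actual route defined; the /hello route is just a hypothetical example:

import bottle
import wsgiserver

@bottle.route('/hello')
def hello():
    return 'Hello!'

wsgiapp = bottle.default_app()
httpd = wsgiserver.Server(wsgiapp)
httpd.serve_forever()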