Monitoring Amazon EC2 instances and other Cloud Resources with Hyperic HQ (and other monitoring platforms)

I had to tackle this task recently and could not find a write-up. Nice folks from Hyperic, and others on Twitter, suggested OpenVPN or an SSH tunnel. I opted for the second option, and after setting up two tunnels and properly configuring the agent, I now have an Amazon EC2 Windows instance showing up as a platform in my Dashboard. Note that these instructions will work for other software too (Zabbix comes to mind). Here’s how you can have yours too:

1. Install an SSH server on the to-be-monitored cloud instance. For Linux, OpenSSH is easy to install and set up, and usually already comes with most distributions. All you have to do is create a user and a password, or keys. On Windows, CopSSH will do the trick – you just have to add a new user and configure it through the CopSSH control panel. Make sure the SSH server runs, and the login credentials work.

2. Install an SSH client on your Hyperic HQ server. For Linux, again, OpenSSH will do the trick and is most likely already there. For Windows, try Cygwin or PuTTY.

3. Designate a unique name for localhost in the hosts file of both the Hyperic server and the cloud instance. On Linux, this is /etc/hosts. On Windows, the location moves between versions but is usually C:\Windows\system32\drivers\etc\hosts . Call it cloudagent1. The line should look like this:

127.0.0.1     localhost cloudagent1
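If you prefer not to edit the file by hand on every new instance, something along these lines should do. This is only a sketch: the function name add_localhost_alias is made up for illustration, and it appends a separate 127.0.0.1 line instead of editing the existing localhost line (hosts files allow several lines for the same address).

```shell
#!/bin/sh
# Hypothetical helper (not part of Hyperic or OpenSSH):
# add a localhost alias to a hosts file unless it is already there.
# Usage: add_localhost_alias ALIAS [HOSTS_FILE]
add_localhost_alias() {
    alias_name=$1
    hosts_file=${2:-/etc/hosts}
    # Ignore comment lines, then look for the alias as a whole word
    if grep -v '^[[:space:]]*#' "$hosts_file" | grep -qw -- "$alias_name"; then
        echo "$alias_name already present in $hosts_file"
    else
        printf '127.0.0.1     %s\n' "$alias_name" >> "$hosts_file"
        echo "$alias_name added to $hosts_file"
    fi
}
```

Run it as root on both machines, for example: add_localhost_alias cloudagent1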

4. From the Hyperic server, initiate an SSH connection which forwards two ports. The first tunnel goes from the cloud instance to the Hyperic server (usually on port 7443). The second goes from the Hyperic server to the cloud instance, to the port on which the Hyperic agent runs. If you already have a Hyperic agent on your Hyperic server, you MUST use a different port. As the local agent usually runs on port 2144, you may want to pick something like port 22144. With OpenSSH on Cygwin and Linux you can create the tunnels like this (assuming your username is “user” and your cloud instance is “cloud-instance.com”):

$ ssh user@cloud-instance.com -R 7443:cloudagent1:7443 -L 22144:cloudagent1:22144 -N -f
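If you monitor more than one instance, it may help to keep the names and ports in variables and build the command from them. The sketch below only prints the command, it does not run it. The two -o options (ExitOnForwardFailure and ServerAliveInterval) are my own additions and not part of the command above: the first makes ssh give up if a forward cannot be established, the second keeps an idle tunnel alive.

```shell
#!/bin/sh
# Values from the example above; change them to match your setup.
CLOUD_USER=user
CLOUD_HOST=cloud-instance.com
LOCAL_ALIAS=cloudagent1
SERVER_PORT=7443     # Hyperic server port, reached from the cloud instance (-R)
AGENT_PORT=22144     # Hyperic agent port, reached from the Hyperic server (-L)

# Print the tunnel command (the function name is illustrative)
tunnel_command() {
    echo "ssh $CLOUD_USER@$CLOUD_HOST" \
         "-R $SERVER_PORT:$LOCAL_ALIAS:$SERVER_PORT" \
         "-L $AGENT_PORT:$LOCAL_ALIAS:$AGENT_PORT" \
         "-o ExitOnForwardFailure=yes -o ServerAliveInterval=60 -N -f"
}

tunnel_command
```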

5. Configure the Hyperic agent on your cloud instance to use port 22144. The rest of the settings can be copied from your locally monitored agents. You can use “cloudagent1” (or whichever name you have assigned to the localhost) in the configuration.

Hope this helped!

Filed under #!

Make your PC link the Ben to the Internet, Automagically!

A familiar pain for Ben Nanonote users is getting the Ben online every time they plug it in. udev can remove this pain with a simple rule that runs all the necessary commands on the host as soon as the Ben is connected to it. To get this done, you will need 2 pieces: a udev rule, and a script.

Your udev rule can be a file under  /etc/udev/rules.d/ . I called mine “72-BenNanoNnote-net.rules”

Its content should look like this:

SUBSYSTEM=="usb", ACTION=="add", ATTR{idVendor}=="0525", ATTR{idProduct}=="a4a1", RUN+="/usr/local/bin/ben-net.sh"
# where RUN+= points to your script

The script should look like this, and can feel comfortable under /usr/local/bin :

#!/bin/bash
GATEWAY_IF=ppp0
# Only act if the Ben NanoNote (USB ID 0525:a4a1) is actually connected
if /usr/bin/lsusb -d "0525:a4a1"; then
        echo .
        echo "Ben NanoNote found, setting up USB network ... "
        # Load ip_tables unless lsmod already lists it
        if ! /sbin/lsmod | grep -q 'ip_tables'; then
                /sbin/modprobe ip_tables && echo "ip_tables is now loaded"
        else
                echo "ip_tables already loaded"
        fi
        # Enable IPv4 forwarding if it is currently off
        if grep -q '0' /proc/sys/net/ipv4/ip_forward; then
                echo "1" > /proc/sys/net/ipv4/ip_forward
                echo "IP forwarding is now enabled"
        else
                echo "IP forwarding already enabled"
        fi
        # The MASQUERADE rule lives in the nat table, so look for it there
        # (-v is needed for the interface column to be listed)
        if ! /usr/sbin/iptables -t nat -L POSTROUTING -v -n | grep -q "$GATEWAY_IF"; then
                /usr/sbin/iptables -t nat -A POSTROUTING -o "$GATEWAY_IF" -j MASQUERADE
                echo "Routing is now enabled"
        else
                echo "Routing already set up on $GATEWAY_IF"
        fi
        /sbin/ifconfig usb0 192.168.254.100 netmask 255.255.255.0
fi
# where GATEWAY_IF is the interface that is connected to your LAN or the Internet.

Filed under Ben Nanonote

GNU Pem on the Ben NanoNote

Pem, the personal expenses manager, was ported to the NanoNote and feels at home!

Filed under Ben Nanonote

Slackware 13.37 and the ASUS PCE-N13 Wireless Adapter

ASUS PCE-N13

The ASUS PCE-N13 is not especially pretty, but it’s cheap, fast, and officially supported!

If you are in the market for a wireless adapter for your Linux desktop, the best bang for the buck today seems to be the ASUS PCE-N13. Not only will ~$30 get you a/b/g/n support, 300Mbps transfer rates, 2 antennas and a PCIe bus, but it also says “Linux Support” right on the box, and not in some fine print in an obscure corner. It was the only card in my local shop to say so, although all of them work just fine. So this is a *moral* choice as well 😉

The card is indeed supported by the rt2860sta module. Unfortunately, with both Slackware 13.37 and Ubuntu 10.10, the kernel module fails to bind to the card because the various rt2800 and rt2x00 modules conflict with rt2860sta. The module loads, but all attempts to initialize the card result in error messages. To remedy this, simply keep the other modules from loading by adding them to /etc/modprobe.d/blacklist.conf like this:

# Blacklist rt2800 and rt2x00 modules
# This will allow the rt2860sta module to bind to the ASUS PCE-N13 card:
blacklist rt2800lib
blacklist rt2800pci
blacklist rt2x00lib
blacklist rt2x00pci
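After a reboot you may want to confirm that none of the blacklisted modules sneaked back in. A small sketch, assuming the same four module names as above (the function name loaded_conflicts is made up):

```shell
#!/bin/sh
# The modules blacklisted above
CONFLICTING="rt2800lib rt2800pci rt2x00lib rt2x00pci"

# Read lsmod-style output on stdin and print every conflicting
# module that is currently loaded (prints nothing if all is well).
loaded_conflicts() {
    awk -v list="$CONFLICTING" '
        BEGIN { n = split(list, names, " ")
                for (i = 1; i <= n; i++) want[names[i]] = 1 }
        ($1 in want) { print $1 }'
}

# Typical use on the running system:
#   lsmod | loaded_conflicts
```

If the pipeline prints nothing, only rt2860sta is left to claim the card.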

Filed under Slackware

The Quest For The Fastest Linux Filesystem

What’s this thing about?

This post has a few main points:

1. Speeding up a filesystem’s performance by setting it up on a tuned RAID0/5 array.

2. Picking the fastest filesystem.

3. The fastest format options for Ext3/4 or XFS filesystems.

4. Tuning an Ext3/4 filesystem’s journal and directory index for speed.

5. Filesystem mount options that increase performance, such as noatime and barrier=0.

6. Setting up LILO to boot from a RAID1 /boot partition.

The title is a bit of an oversimplification 😉 The article is intended to remain a work in progress as “we” learn, and as new, faster tools become available. This article is not intended to cover the fastest hardware (yet). The goal is the “fastest” filesystem possible on whatever device you have. Basically, “we” want to set up and tweak whatever is possible to get our IO writes and reads to happen quicker. Which IO reads? Random or sequential? Long or short? The primary goal is a quick Linux root filesystem, which is slightly different from, let’s say, a database-only filesystem, or a /home partition for user files. Oh, and by the way, do not use this on your production machines, people. Seriously.

RAID

WTF is RAID?!

The first question is, how many devices would you like your filesystem to span? The simple and correct answer is – the more the faster. To use one filesystem across multiple devices, a single “virtual” device can be created from multiple partitions with RAID. (Recently developed filesystems, like BTRFS and ZFS, are capable of splitting themselves intelligently across partitions to optimize performance on their own, without RAID.) Linux uses a software RAID tool which comes free with every major distribution – mdadm. Read about mdadm here, and read about using it here. There’s also a quick 10-step guide I wrote here which will give you an idea about the general procedure of setting up a RAID mdadm array.

Plan your array, and then think about it for a while before you execute – you can’t change the array’s geometry (which is the performance sensitive part) after it’s created, and it’s a real pain to migrate a filesystem between arrays, not to mention a Linux root filesystem.

Deciding on a performance oriented type of RAID ( RAID0 vs. RAID5 )

The rule of thumb is to use 3 or more drives in a RAID5 array to gain redundancy at the cost of a slight performance loss over a RAID0 array (10% CPU load at peak times on my 2.8 GHz AthlonX2 with a 3 disk RAID5 array). If you only have 2 drives, you cannot use RAID5. Whatever your situation is, RAID0 will always be the fastest, but less responsible, choice.

RAID0 provides no redundancy and will fail irrecoverably when one of the drives in the array fails. Some would say you should avoid putting your root filesystem on an un-redundant array, but we’ll do it anyways! RAID0 is, well, the *fastest* (I threw that caution to the wind and I’m typing this from a RAID0 root partition, for what it’s worth). If you are going to be or have been using a RAID0 array, please comment about your experiences. Oh, and do back up often. At least weekly. To an *external* drive. If you only have one drive you can skip to the filesystem tuning part. If you are going to use RAID0/5, remember to leave room for a RAID1 array, or a regular partition, for /boot. Today, LILO cannot yet boot from a RAID0/5 array.

Deciding on a RAID stripe size ( 4 / 8 / 16 / 32 / 64 / 128 / 256 … )

You will need to decide, for both RAID0 and RAID5, about the size of the stripe you will use. See how such decisions affect performance here. I find the best results for my personal desktop to be 32kb chunks. 64 does not feel much different. I would not recommend going below 32 or above 128 for a general desktop’s root partition. I surf, play games, stream UPnP, run virtual machines, and use a small MySQL database. If I were doing video editing, for example, a significantly bigger stripe size would be faster. Such special-purpose filesystems should be set up for their own needs and not on the root filesystem, if possible. Comments?

RAID 5 – deciding on a parity algorithm ( Symmetric vs. Asymmetric )

For RAID5, the parity algorithm can be set to 4 different types. Symmetric-Left, Symmetric-Right, Asymmetric-Left, and Asymmetric-Right. They are explained here, but they appear to only affect performance to a small degree for desktop usage, as one thread summarized.

Creating a RAID0 array

Using the suggestions above, the command to create a 2-disk RAID0 array for a root partition on /dev/md0 using the partitions /dev/sda1 and /dev/sdb1 should look like this:

# mdadm --create /dev/md0 --metadata=0.90 --level=0 --chunk=32 --raid-devices=2 /dev/sd[ab]1

Note the --metadata option, which with 0.90 specifies the older mdadm metadata format. If you use anything other than 0.90, you will find LILO failing to boot.

The Fastest Filesystem – Setup and Tuning

Deciding on a Filesystem ( Ext3 vs. Ext4 vs. XFS vs. BTRFS )

The Ext4 filesystem does seem to outperform Ext3, XFS and BTRFS, and it can be optimized for striping on RAID arrays. I recommend Ext4 until BTRFS catches up in performance, becomes compatible with LILO/GRUB, and gets an FSCK tool.

Deciding on a Filesystem Block Size ( 1 vs. 2 vs. 4 )

It is hard to overstate how important this part is. Luckily, if you don’t know what this is and just don’t touch it, most mkfs tools default to the fastest choice – 4kb. Why you would not want to use 1 or 2 is neatly shown in the benchmarking results of RAID performance on those block sizes. Even if you are not using RAID, you will find 4kb blocks to perform faster. Much like the RAID geometry, this is permanent and cannot be changed.

Creating a RAID-optimized Ext4 filesystem ( stride and stripe-width )

Use these guidelines to calculate the values:

stride = RAID chunk size / filesystem block size
stripe-width = stride * number of data drives in the RAID array ( all of the drives for RAID0, and that minus one for RAID5 )

Pass the stride and the stripe-width to mkfs.ext4, along with the block size in bytes, like this:

# mkfs.ext4 -b 4096 -E stride=8,stripe-width=16 /dev/md0

A handy tool to calculate those things for you can be found here.
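If you would rather not rely on an online calculator, the arithmetic fits in a few lines of shell. This is a sketch with a made-up function name (ext4_raid_opts). It divides the RAID chunk by the filesystem block size to get the stride, which is what yields stride=8 for the 32kb chunks and 4kb blocks used in the command above, and it only knows about RAID0 and RAID5.

```shell
#!/bin/sh
# Usage: ext4_raid_opts CHUNK_KB BLOCK_BYTES DRIVES RAID_LEVEL
# Prints the value to pass to "mkfs.ext4 -E".
ext4_raid_opts() {
    chunk_kb=$1
    block_bytes=$2
    drives=$3
    level=$4
    # stride: how many filesystem blocks fit in one RAID chunk
    stride=$(( chunk_kb * 1024 / block_bytes ))
    # RAID5 spends the equivalent of one drive on parity
    if [ "$level" = 5 ]; then
        data_drives=$(( drives - 1 ))
    else
        data_drives=$drives
    fi
    echo "stride=$stride,stripe-width=$(( stride * data_drives ))"
}

# The 2-disk RAID0 array with 32kb chunks from this article:
ext4_raid_opts 32 4096 2 0
```

For the array in this article it prints stride=8,stripe-width=16, the same values used in the mkfs.ext4 command above.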

Creating an optimized XFS filesystem ( sunit and swidth )

The XFS options for RAID optimization are sunit and swidth. A good explanation about those two options can be found in this post. A quick and dirty formula to calculate those parameters was taken from here:

sunit = RAID chunk in bytes / 512
swidth = sunit * number of data drives in the RAID array ( all of the drives for RAID0, and that minus one for RAID5 )

The sunit for a 32kb (or 32768 byte) chunk would be 32768 / 512 = 64

The command to create such a filesystem for a 32kb chunk size RAID0 array with 2 drives and a 4096 (4kb) block size will look something like this:

# mkfs.xfs -b size=4096 -d sunit=64,swidth=128 /dev/md0
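The same quick and dirty formula can be scripted. Again, this is just a sketch: the function name xfs_raid_opts is invented, and it handles RAID0 and RAID5 only.

```shell
#!/bin/sh
# Usage: xfs_raid_opts CHUNK_KB DRIVES RAID_LEVEL
# Prints the value to pass to "mkfs.xfs -d".
xfs_raid_opts() {
    chunk_kb=$1
    drives=$2
    level=$3
    # sunit is the RAID chunk expressed in 512-byte units
    sunit=$(( chunk_kb * 1024 / 512 ))
    # RAID5 spends the equivalent of one drive on parity
    if [ "$level" = 5 ]; then
        data_drives=$(( drives - 1 ))
    else
        data_drives=$drives
    fi
    echo "sunit=$sunit,swidth=$(( sunit * data_drives ))"
}

# The 2-disk RAID0 array with 32kb chunks from this article:
xfs_raid_opts 32 2 0
```

For the example array this prints sunit=64,swidth=128, matching the mkfs.xfs command above.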

Tuning the Ext3 / Ext4 Filesystem ( Journal )

There’s a good explanation about the 3 modes in which a filesystem’s journal can be used on the OpenSUSE Wiki. That same guide will rightly recommend avoiding writing actual data to the journal to improve performance. On a newly created but unmounted filesystem, disable the writing of actual data to the journal:

# tune2fs -O has_journal -o journal_data_writeback /dev/md0

Turning on Ext3 / Ext4 Directory Indexing:

Your filesystem will perform faster if the directories are indexed:

# tune2fs -O dir_index /dev/md0
# e2fsck -D /dev/md0

Filesystem Mounting Options ( noatime, nodiratime, barrier, data and errors options ):

Some options should be passed to the filesystem on mount to increase its performance:

noatime, nodiratime – Do not log access of files and directories.

barrier=0 – Disable barrier sync (Only safe if you can assure uninterrupted power to the drives, such as a UPS battery)

errors=remount-ro – When we have filesystem errors, we should remount our root filesystem readonly (and generally panic).

data=writeback – For Ext3 / Ext4. If your journal is in writeback mode (as we previously advised), set this option.

My fstab looks like this:

/dev/md0         /                ext4        noatime,nodiratime,data=writeback,stripe=16,barrier=0,errors=remount-ro      1   1

And my manual mount command will look like this:

# mount /dev/md0 /mnt -o noatime,nodiratime,data=writeback,stripe=16,barrier=0,errors=remount-ro

Did I mention to NEVER do this on a production machine?

Installing your Linux

Install as usual, but do not format the root partition you’ve set up! If you are using RAID0/5, you have to set up a separate RAID1 or primary /boot partition. In my experience, leaving the boot partition unoptimized does not affect regular performance, but if you are keen on shaving a few milliseconds off your boot-time you can go ahead and tune that filesystem yourself as well.

Making sure LILO boots

If you are using RAID0/5 for your root partition, you must set up a separate non-RAID or RAID1 partition as /boot. If you do set up your /boot partition on a RAID1 array, you have to make sure to point LILO to the right device by editing /etc/lilo.conf :

boot = /dev/md1

and make sure LILO knows about the mirroring of the /boot partitions by adding the line:

raid-extra-boot = mbr-only

Then, LILO must be reinstalled to the Master Boot Record while the /boot partition is mounted on the root partition. From a system rescue CD, with a properly edited lilo.conf file this will look something like this:

# mount /dev/md0 /mnt
# mount /dev/md1 /mnt/boot
# /mnt/sbin/lilo -C /mnt/etc/lilo.conf

… and reboot.

Experience and thoughts:

I’ve been following my own advice for the last couple of weeks. The system is stable and best of all, *fast*. May those not be “famous last words”, but I’ll update this post as I go. The only thing we all really need is comments and input. If you use something else that works faster for you – let us know. If something downgraded your stability to the level of Win98, please let us know. More importantly – if you see any errors, you got it – let us know.

TO DO:

Test this interesting post about Aligning Partitions

Test BTRFS on 2 drives without RAID/LVM

Filed under #!

MediaTomb on the Ben Nanonote

What can I say, the title speaks for itself. To no great surprise, the most versatile UPnP streaming media server out there, MediaTomb, is humming along with no problems on Qi Hardware’s Ben Nanonote. Real world usage scenarios could include using the Ben as a little DJ in parties by streaming to VLC or other UPnP supporting players, or other wild fantasies Ben owners might have. The best news here is that there is absolutely no brain work involved. I simply had to fire up the network connection on the Ben, grab the right statically linked binary, untar it and run. All of this can be done directly from the Nanonote (once it’s online):

# wget http://downloads.sourceforge.net/mediatomb/mediatomb-static-0.11.0-r2-linux-uclibc-mips32el.tar.gz
# tar vxzf mediatomb-static-0.11.0-r2-linux-uclibc-mips32el.tar.gz
# cd ./mediatomb
# ./mediatomb.sh

To automate, add this to /etc/rc.local and make it executable, but remember MediaTomb must be started from the mediatomb folder.
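Here is roughly what that could look like in /etc/rc.local. Treat it as a sketch: /root/mediatomb is only my guess at where you unpacked the tarball, and the start_mediatomb function is not something MediaTomb ships.

```shell
#!/bin/sh
# ASSUMPTION: the tarball was unpacked in /root, adjust to your own path.
MEDIATOMB_DIR=${MEDIATOMB_DIR:-/root/mediatomb}

# Start MediaTomb from inside its own folder, in the background.
# The subshell keeps the caller's working directory untouched.
start_mediatomb() {
    ( cd "$1" && ./mediatomb.sh ) &
}

# In /etc/rc.local this would be followed by the actual call:
#   start_mediatomb "$MEDIATOMB_DIR"
```

Running the start-up script in a subshell means the rest of rc.local carries on from wherever it was, and the trailing & keeps the boot from waiting on MediaTomb.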

Once started, MediaTomb can be accessed on port 49152 with your browser. For me, this translates to http://192.168.3.2:49152 and looks like this:

MediaTomb on Nanonote

So far, it’s an awesome remote file browser, and as soon as I can get VLC to compile on my Slackware, it’s party time!

Filed under Ben Nanonote

Checking for Rootkits

What are rootkits? You can ask Wikipedia, or take this over-simplification:

The Windows flock has to worry about various security concerns – viruses, worms, trojans, bots. So many, in fact, that they have been categorized into AdWare (unwanted commercial content), SpyWare (compromise of personal information) and Viruses (generally malicious code).

Linux folks, on the other hand, have to fear only one thing – a rootkit. Because Windows vulnerabilities allow exploiting a machine without administrative access for specific goals, a diversity of exploits was invented, while a properly set up Linux machine can only be compromised with root (Administrator) access. However, once that has been achieved, the machine is fully compromised along with the network it’s on. Namely, there are very few things an attacker will not be able to do with a rooted Linux machine.

Periodic scans are a good measure to stay clean, similar to scheduled anti-virus scans in Windows PCs. chkrootkit is such an implementation, and appears to be the most popular today, for good reasons: it’s easy to build, easy to run, and easy to understand. A good idea might be to run it periodically and email the output to a system administrator.

Great, until you realize the catch-22 of checking for rootkits: once a machine has been infected with a rootkit (usually through a badly secured SSH account), there’s virtually no way to determine that it was compromised, unless a reference set of binaries can be used to compare the infected binaries to “clean” ones. This means that you may not use a machine to check itself for rootkits, because if it was already compromised at the time it was checked, the check would be useless.

There is, of course, a hack. Namely, Network Security Hack #99 from O’Reilly suggests we can run chkrootkit in Busybox! Right, Busybox has some 200 programs built into its binary, and we could use those instead of the host’s own suspected ones. Unfortunately, it does not show how.

Let’s think about this: We assume the machine might be compromised when we scan it. We would not want to use the machine’s own binaries (that could also be compromised). We would not want to use software that had been lying around on the machine while it was compromised, for the same reason, including Busybox or chkrootkit. Ideally, then, we would use a fresh copy of Busybox to download a fresh copy of chkrootkit, run one in the other and dump both (that way, one cannot compromise something that is not there).

I ended up with the (almost entirely self-sufficient) solution below. It will run on any machine that has internet access and GNU wget (to download Busybox and chkrootkit). Other than wget, the script makes use of Busybox applets instead of the system installed binaries, for extra safety 😉

#!/bin/bash

# Temporary folder path. (It's a good idea to use a random name):
TEMPORARY_FOLDER=/tmp/security-$RANDOM

# A URL from which Busybox binaries can be downloaded
BUSYBOX_URL=http://www.busybox.net/downloads/binaries/1.17.2

# A URL from which the chkrootkit source can be downloaded:
CHKROOTKIT_URL=ftp://ftp.pangeia.com.br/pub/seg/pac

# Download and prepare the Busybox binary in the temporary folder:
/usr/bin/mkdir -p $TEMPORARY_FOLDER
cd $TEMPORARY_FOLDER
/usr/bin/wget $BUSYBOX_URL/busybox-"`uname -m`"
/usr/bin/chmod +x $TEMPORARY_FOLDER/busybox-"`uname -m`"
BUSYBOX_BIN=$TEMPORARY_FOLDER/busybox-"`uname -m`"

# Download, confirm the MD5 sum, and extract the chkrootkit source using Busybox:
$BUSYBOX_BIN wget $CHKROOTKIT_URL/chkrootkit.tar.gz
$BUSYBOX_BIN wget $CHKROOTKIT_URL/chkrootkit.md5
if [ "`$BUSYBOX_BIN cat chkrootkit.md5`" != "`$BUSYBOX_BIN md5sum chkrootkit.tar.gz`" ]; then
  $BUSYBOX_BIN echo " !!! MD5 SUM check FAILED !!! "
  exit 1
else
  $BUSYBOX_BIN echo "MD5 checksum for chkrootkit source passed..."
fi
$BUSYBOX_BIN tar vxzf $TEMPORARY_FOLDER/chkrootkit.tar.gz

# Run the freshly downloaded chkrootkit in the just downloaded Busybox:
$BUSYBOX_BIN sh $TEMPORARY_FOLDER/chkrootkit*/chkrootkit

# Clean up:
cd /
$BUSYBOX_BIN echo "Done."
$BUSYBOX_BIN rm -rf $TEMPORARY_FOLDER
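One weak spot in the script is the MD5 test, which compares whole lines of text, so a difference in file name or spacing between the two outputs would fail the check even when the hash itself matches. A possible variation compares just the hash. This is a sketch: it assumes the hash is the first field of the .md5 file, verify_md5 is a made-up name, and it uses the system awk for brevity (inside the script above you would want Busybox’s awk applet instead, if your Busybox build has it).

```shell
#!/bin/sh
# Which md5sum to use. Inside the script above this could be set to
# "$BUSYBOX_BIN md5sum" (left unquoted below on purpose, so that it
# splits into command and applet name).
MD5SUM=${MD5SUM:-md5sum}

# Usage: verify_md5 FILE MD5_FILE
# Succeeds only if the first field of MD5_FILE equals the MD5 of FILE.
verify_md5() {
    expected=$(awk '{ print $1; exit }' "$2")
    actual=$($MD5SUM "$1" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example use:
#   verify_md5 chkrootkit.tar.gz chkrootkit.md5 || exit 1
```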

Filed under #!