24 November, 2013

Striped LVM (RAID 0) performance on a NAS system

I am setting up a new NAS, and to decide whether I should use striped LVM I did some measurements. Striped LVM means that the data volume is spread across two physical disks, and reads and writes are executed in parallel on both disks, so in theory this doubles the throughput. However, striping has one disadvantage: if one of the disks fails, all data is lost, while with normal volumes there is a chance that the data on the functional disk can be restored. The NAS has a Gigabit Ethernet LAN connection, so in theory speeds of 125 MB/s can be achieved.

For the tests I set up two 200 GB logical volumes in an LVM volume group containing two 4 TB physical disks, one striped across the two disks and the other without striping. First I measured the speed within the NAS, using the Linux dd command:

On the normal volume:

>dd if=/dev/zero of=/media/data/Adatok/test bs=1M count=2000
2000+0 records in
2000+0 records out
2097152000 bytes (2.1 GB) copied, 13.979 s, 150 MB/s

and on the striped volume:

>dd if=/dev/zero of=/media/data/Adatok/test bs=1M count=2000
2000+0 records in
2000+0 records out
2097152000 bytes (2.1 GB) copied, 7.52203 s, 279 MB/s

This looks promising: the striped volume is almost 2 times faster (and the overall speed is also pretty impressive).
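To make the "almost 2 times" figure concrete, here is a minimal Python sketch that recomputes the throughput from the two dd summary lines above and derives the speedup. The function name `dd_throughput_mb_s` is my own, and the regular expression only assumes the summary format shown in this post.

```python
import re

# Summary lines copied from the two dd runs above.
NORMAL = "2097152000 bytes (2.1 GB) copied, 13.979 s, 150 MB/s"
STRIPED = "2097152000 bytes (2.1 GB) copied, 7.52203 s, 279 MB/s"


def dd_throughput_mb_s(summary: str) -> float:
    """Return throughput in MB/s (10^6 bytes per second) from a dd summary line."""
    match = re.match(r"(\d+) bytes .* copied, ([\d.]+) s", summary)
    if match is None:
        raise ValueError(f"not a dd summary line: {summary!r}")
    size_bytes = int(match.group(1))
    seconds = float(match.group(2))
    return size_bytes / seconds / 1e6


normal = dd_throughput_mb_s(NORMAL)
striped = dd_throughput_mb_s(STRIPED)
print(f"normal:  {normal:.0f} MB/s")
print(f"striped: {striped:.0f} MB/s")
print(f"speedup: {striped / normal:.2f}x")
```

The speedup comes out at roughly 1.86x, so local striping gets close to the theoretical factor of 2, but does not quite reach it.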

In the next step I set up a Samba share and used CrystalMark to measure the speeds from a Windows 7 client machine:

The results show that with a 100 MB test size the two volumes are almost identical, and with a 1000 MB test size they are also very similar, with some differences at small block sizes. These differences are probably some kind of measurement error.

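So the conclusion is clear: it is not worth using RAID 0 or LVM striping in a NAS, because it only reduces data safety and doesn't improve the speed seen by the client. Even the normal volume writes at 150 MB/s locally, which is already above the 125 MB/s that Gigabit Ethernet can carry.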
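One plausible way to read this result is as a simple bottleneck model: the client can never see more than the slower of the disk and the network link. The sketch below is only that model, not a measurement; the function name `client_throughput_mb_s` is hypothetical, and it ignores protocol overhead of Ethernet, TCP and Samba.

```python
# 1 Gbit/s expressed in MB/s (10^6 bytes per second), before protocol overhead.
GIGABIT_LINK_MB_S = 1_000_000_000 / 8 / 1e6


def client_throughput_mb_s(disk_mb_s: float,
                           link_mb_s: float = GIGABIT_LINK_MB_S) -> float:
    """Upper bound on client throughput: the slower of disk and link."""
    return min(disk_mb_s, link_mb_s)


# Local dd results from the measurements above.
for name, disk in [("normal", 150.0), ("striped", 279.0)]:
    bound = client_throughput_mb_s(disk)
    print(f"{name}: disk {disk:.0f} MB/s -> client at most {bound:.0f} MB/s")
```

Under this model both volumes end up with the same upper bound, which matches the nearly identical benchmark results over the network.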

For comparison, here are 3 other measurements done on the same Windows machine with internal/USB disks:

These measurements may also explain the poor 4K performance: it may simply be normal for HDDs, while on the NAS, with small file sizes, the cache may help to improve it to an almost SSD level.

20 November, 2013

Speeding up NTFS file system access on Linux with a Windows virtual machine

Summary: ntfs-3g does not perform well; here are some measurements and a complicated way to get faster NTFS file system access on Linux.

Disclaimer: These tests were done on an Ubuntu 10.04 LTS installation, with the latest software versions included in this Ubuntu distribution, but there are more recent versions of ntfs-3g on the ntfs-3g site.

I wanted to back up my Linux-based NAS to an NTFS-formatted hard drive, which meant copying around 4 TB of data.

As the first step I installed the disk in the NAS machine over a SATA connection and measured the drive speed:

>sudo hdparm -t /dev/sdf

 Timing buffered disk reads:  432 MB in  3.01 seconds = 143.62 MB/sec

This looks good: 4 TB / 143 MB/s ≈ 7.7 hours of copy time.

Next I mounted the already formatted drive and did a copy test:

>sudo mount -t auto /dev/sdf2 /media/windows/
>dd if=/dev/zero of=/media/windows/test bs=512 count=390625
390625+0 records in
390625+0 records out
200000000 bytes (200 MB) copied, 94.6419 s, 2.1 MB/s

Oh no, this looks terrible: 4 TB / 2.1 MB/s ≈ 22 days of copy time.
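The two copy-time estimates so far follow from the same division. Here is a small sketch of that arithmetic, assuming 4 TB means 4 * 10^12 bytes and MB/s means 10^6 bytes per second; the helper name `copy_time_seconds` is my own.

```python
BACKUP_SIZE_BYTES = 4e12  # assumption: 4 TB = 4 * 10^12 bytes


def copy_time_seconds(size_bytes: float, rate_mb_s: float) -> float:
    """Time to copy size_bytes at a sustained rate given in MB/s."""
    return size_bytes / (rate_mb_s * 1e6)


# Rates taken from the hdparm and dd outputs above.
raw_seconds = copy_time_seconds(BACKUP_SIZE_BYTES, 143.62)
ntfs_seconds = copy_time_seconds(BACKUP_SIZE_BYTES, 2.1)

print(f"raw disk read:      {raw_seconds / 3600:.1f} hours")
print(f"ntfs-3g at bs=512:  {ntfs_seconds / 86400:.1f} days")
```

These are best-case numbers, since they assume the measured rate holds for the whole copy.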

Then I started to check whether there are known performance issues with the ntfs-3g program.

On the Tuxera site, it says: "Normally the I/O performance of NTFS-3G and the other file systems are very comparable." and there are some hints as to why it can be slow. After checking all of them, only the too-small block size applies in my case, so let's try to increase the block size:

>dd if=/dev/zero of=/media/windows/test bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 11.3721 s, 18.4 MB/s

Ok, this now looks much better, but still far from the hard disk speed.
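To see why the block size matters so much, it helps to look at the average time per write call in the two dd runs. The sketch below only divides the measured times by the number of blocks. Reading the result as per-call overhead is my interpretation (ntfs-3g is a FUSE driver running in user space, so every write has some fixed cost), not something I measured separately.

```python
# Block size, block count and elapsed time from the two dd runs above.
RUNS = {
    "bs=512": {"block_bytes": 512, "count": 390625, "seconds": 94.6419},
    "bs=1M": {"block_bytes": 1024 * 1024, "count": 200, "seconds": 11.3721},
}


def per_write_microseconds(count: int, seconds: float) -> float:
    """Average wall-clock time spent per write call, in microseconds."""
    return seconds / count * 1e6


for name, run in RUNS.items():
    per_write = per_write_microseconds(run["count"], run["seconds"])
    total_mb = run["block_bytes"] * run["count"] / 1e6
    print(f"{name}: {per_write:.1f} us per write, "
          f"{total_mb / run['seconds']:.1f} MB/s overall")
```

With 512-byte blocks each write takes about 242 microseconds while moving very little data, whereas each 1 MiB write carries 2048 times more data for about 57 milliseconds. That is consistent with a fixed cost per call dominating the small-block run.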

After reading a bit more on the Tuxera site, I found a comparison of Embedded NTFS and other file systems, among them NTFS-3G:

NTFS-3G is the bottom one, so there is no point in searching further for why ntfs-3g is slow on my machine: it is slow in general, and it is probably not possible to speed it up.

Then came the idea to set up a Windows virtual machine on that computer and use it as the driver for the NTFS hard disk.

I set up the virtual machine with VirtualBox (easy apt-get installation) and an evaluation copy of Windows Server 2008 R2 (which is also very easy to install) and continued the tests.

First I wanted to use iSCSI, but it also had some overhead, so I finally decided to assign the complete 4 TB drive as a physical drive to the VirtualBox virtual machine.

For copying I shared the 4 TB disk in Windows Server and mounted it on the Linux system as a CIFS file system.

So let's test:

>dd if=/dev/zero of=/media/test/test bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 3.8882 s, 53.9 MB/s

Still not even close to 100 MB/s, but already an acceptable speed: the 4 TB can be copied in about 21 hours, so within one day.

There are 2 more tuning possibilities that I have thought about but did not try:

  1. Somewhere I read that ntfs-3g is CPU limited and uses only one core, so the Cool'n'Quiet feature of my AMD processor may reduce the clock speed. If I disabled it, the ntfs-3g driver might perform better.
  2. The network between the virtual machine and the Linux host was bridged Ethernet. I do not know much about the internals of this, but using the lo interface may provide better speeds.
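For the first idea, a starting point would be to check whether frequency scaling is active at all. The following is a minimal sketch, assuming the standard Linux cpufreq sysfs layout under /sys/devices/system/cpu; the function name `read_governors` is hypothetical, and I have not run this on the NAS itself.

```python
from pathlib import Path
from typing import Dict

# Assumption: standard Linux cpufreq sysfs location.
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_governors(cpu_root: Path = CPU_ROOT) -> Dict[str, str]:
    """Map each CPU directory name (cpu0, cpu1, ...) to its cpufreq governor."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(cpu_root.glob(pattern)):
        cpu_name = gov_file.parent.parent.name
        governors[cpu_name] = gov_file.read_text().strip()
    return governors


if __name__ == "__main__":
    # Prints nothing if the kernel exposes no cpufreq information.
    for cpu, governor in read_governors().items():
        print(f"{cpu}: {governor}")
```

A governor such as "ondemand" would mean the clock is scaled with load, while "performance" keeps it at the maximum. Whether this really limits ntfs-3g would still have to be measured.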