Speeding up NTFS file system access on Linux with a Windows virtual machine
Summary: ntfs-3g performs poorly. Here are some measurements and a complicated way to get faster NTFS file system access on Linux.
Disclaimer: These tests were done on an Ubuntu 10.04 LTS installation, with the latest software versions included in that distribution. The ntfs-3g site offers more recent versions of ntfs-3g than the one tested here.
I wanted to back up my Linux-based NAS to an NTFS-formatted hard drive, which meant copying around 4 TB of data.
As the first step I installed the disk in the NAS machine over a SATA connection and measured the drive speed:
>sudo hdparm -t /dev/sdf
/dev/sdf:
Timing buffered disk reads: 432 MB in 3.01 seconds = 143.62 MB/sec
This looks good: 4 TB / 143.62 MB/s ~ 7.7 hours of copy time.
Next I mounted the already formatted drive and did a copy test:
>sudo mount -t auto /dev/sdf2 /media/windows/
>dd if=/dev/zero of=/media/windows/test bs=512 count=390625
390625+0 records in
390625+0 records out
200000000 bytes (200 MB) copied, 94.6419 s, 2.1 MB/s
Ouch, this looks terrible: 4 TB / 2.1 MB/s ~ 22 days of copy time.
Then I started to check whether there are known performance issues with the ntfs-3g program.
On the Tuxera site, it says: "Normally the I/O performance of NTFS-3G and the other file systems are very comparable." There are also some hints on why it can be slow. After checking them all, only the too-small block size applies in my case, so let's try increasing the block size:
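The back-of-the-envelope estimates can be reproduced with a small shell helper. This is only a sketch: the function name copy_time is made up for illustration, and it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes).

```shell
# copy_time SIZE_TB RATE_MB_S
# Prints the estimated copy time in hours and days.
# Assumes decimal units: 1 TB = 10^12 bytes, 1 MB = 10^6 bytes.
copy_time() {
    awk -v tb="$1" -v rate="$2" 'BEGIN {
        seconds = tb * 1e12 / (rate * 1e6)
        printf "%.1f hours (%.1f days)\n", seconds / 3600, seconds / 86400
    }'
}

copy_time 4 143.62   # raw read speed reported by hdparm
copy_time 4 2.1      # ntfs-3g write speed with bs=512
```

The same helper can be fed any of the later dd results to see what a given throughput means for the full 4 TB.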
>dd if=/dev/zero of=/media/windows/test bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 11.3721 s, 18.4 MB/s
Ok, this now looks much better, but still far from the hard disk speed.
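To see how throughput changes with block size, the two dd runs can be generalized into a small sweep. This is a rough sketch, not what I actually ran: the function name bench_write, the file name ddtest and the chosen block sizes are arbitrary, and conv=fsync (a GNU dd option) is an addition that was not used in the tests above.

```shell
# bench_write DIR [SIZE_MB]
# Writes about SIZE_MB megabytes of zeros into DIR once per block size
# and prints dd's summary line for each run.
bench_write() {
    dir="$1"
    size_mb="${2:-200}"
    for bs in 512 4096 65536 1048576; do
        count=$(( size_mb * 1000000 / bs ))
        printf 'bs=%s: ' "$bs"
        # conv=fsync makes dd flush the file to disk before it reports,
        # so the page cache does not inflate the result.
        dd if=/dev/zero of="$dir/ddtest" bs="$bs" count="$count" conv=fsync 2>&1 | tail -n 1
        rm -f "$dir/ddtest"
    done
}
```

On the setup described here it would be called as bench_write /media/windows. Because of the added flush, the numbers may come out somewhat lower than the ones quoted above.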
After reading a bit more on the Tuxera site, I found a comparison of Embedded NTFS and other file systems, among them NTFS-3G. In that comparison NTFS-3G comes out at the bottom, so there is no sense in searching further for why ntfs-3g is slow on my machine: it is slow in general, and it is probably not possible to speed it up.
Then came the idea to set up a virtual Windows machine on that computer and use it as a driver for the NTFS hard disk.
I set up the virtual machine with VirtualBox (easy apt-get installation) and an evaluation copy of Windows Server 2008 R2 (which is also very easy to install) and continued the tests.
First I wanted to use iSCSI, but it also had some overhead, so I finally decided to assign the complete 4 TB drive as a physical drive to the VirtualBox virtual machine.
For copying, I shared the 4 TB disk in Windows Server and mounted the share on the Linux system as a CIFS file system.
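For reference, raw disk access in VirtualBox is usually set up with two VBoxManage commands: one creates a small VMDK descriptor that points at the physical disk, the other attaches it to the VM. The sketch below only prints those commands so they can be reviewed before touching a real disk. The VM name win2008, the file name ntfs-raw.vmdk, the controller name "SATA Controller" and the port number are placeholders, not the values I used, and newer VirtualBox releases may offer a different command for creating the descriptor.

```shell
# rawdisk_cmds VM DEVICE VMDK
# Prints the VBoxManage commands that would expose DEVICE to the VM as a
# raw disk. Nothing is executed, so the output can be reviewed first.
rawdisk_cmds() {
    vm="$1"; dev="$2"; vmdk="$3"
    # Step 1: a VMDK descriptor file that refers to the physical disk.
    printf 'VBoxManage internalcommands createrawvmdk -filename "%s" -rawdisk "%s"\n' "$vmdk" "$dev"
    # Step 2: attach it to the VM (controller name and port are placeholders).
    printf 'VBoxManage storageattach "%s" --storagectl "SATA Controller" --port 1 --device 0 --type hdd --medium "%s"\n' "$vm" "$vmdk"
}

rawdisk_cmds win2008 /dev/sdf "$HOME/ntfs-raw.vmdk"
```

The user running VirtualBox needs read and write access to the raw device, and the disk must not be mounted on the host while the VM is using it.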
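Mounting the Windows share on the Linux side looks roughly like the sketch below. The function name mount_share and the DRY_RUN switch are made up for illustration, and the host, share and user names in the usage note are placeholders. The CIFS mount helper must be installed (the package is called smbfs or cifs-utils, depending on the release).

```shell
# mount_share HOST SHARE MOUNTPOINT USER
# Mounts //HOST/SHARE at MOUNTPOINT over CIFS and maps the files to the
# calling user. Set DRY_RUN=1 to print the command instead of running it.
mount_share() {
    set -- sudo mount -t cifs "//$1/$2" "$3" -o "username=$4,uid=$(id -u),gid=$(id -g)"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, mount_share 192.168.1.50 backup /media/test Administrator would prompt for the Windows password and then mount the share under /media/test.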
So let's test:
>dd if=/dev/zero of=/media/test/test bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 3.8882 s, 53.9 MB/s
Still not even close to 100 MB/s, but already an acceptable speed: at this rate the 4 TB can be copied in under a day.
There are two more tuning possibilities that I thought about but did not try:
- Somewhere I read that ntfs-3g is CPU limited and uses only one core, so the Cool&Quiet feature of my AMD processor may reduce the clock speed. If I disabled it, the ntfs-3g driver might perform better.
- The network between the virtual machine and the Linux host was a bridged Ethernet connection. I do not know much about the internals of this, but using the lo interface may provide better speeds.
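For the first idea, one way to check whether the clock is being scaled down is to look at the cpufreq governor in sysfs. This is an untested sketch: the function name show_governors is made up, and I am assuming that switching the governor to performance has roughly the same effect as disabling Cool&Quiet, which could also be done in the BIOS.

```shell
# show_governors [SYSFS_CPU_DIR]
# Prints the cpufreq governor of every core that exposes one.
show_governors() {
    base="${1:-/sys/devices/system/cpu}"
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue   # no cpufreq support, or the glob did not match
        cpu=${f#"$base"/}
        printf '%s: %s\n' "${cpu%%/*}" "$(cat "$f")"
    done
}

show_governors
# To keep every core at full clock for the duration of a copy (as root):
#   for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
#       echo performance > "$f"
#   done
```

If the cores report ondemand or powersave, rerunning the dd test with the performance governor would show whether frequency scaling is part of the problem.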