Linux Storage LAB: Using OpenDedup to store VM virtual images
Deduplication means that when data is saved to disk and the same data is stored again, only a reference is kept, so space is saved. OpenDedup is an open-source inline (real-time) deduplication solution, and one of its goals is to provide storage for virtual machine images, where large parts of the images can be duplicates. In this article I test OpenDedup from both a speed and a space-saving point of view.
I have a strong Ryzen machine with a 500 GB NVMe SSD, 2 x 500 GB SATA SSDs and a RAID 5 array of 3 x 4 TB HDDs. The plan is to store the virtual machine images on the SATA SSDs, and the goal of these measurements was to decide whether to use OpenDedup or simply the plain disks.
The Ryzen machine runs Ubuntu 16.04, while the virtual machine guests are typically Windows 10 systems.
Disk benchmark tools for Windows and Linux
On Linux I use iozone, while on Windows I use CrystalDiskMark to measure disk IO speeds. To compare the two, I ran some benchmarks on my client machine, a rather old Intel Core 2 Quad Q6600 with an SSD on a SATA2 interface and a 220 GB HDD. I did the CrystalDiskMark measurements on Windows 10, then rebooted into Ubuntu 16.04 from a pen drive and did the iozone measurements there.
For iozone, the following command was used:
iozone -s 1G -r 16M -I -i 0 -i 1 -i 2 -f path/to/storage/to/measure/iozone.tmp
The -r 16M option sets the record size; I also ran the test with 512k and 4k to produce results comparable to CrystalDiskMark. For large record sizes, iozone's sequential write and read performance is what matters, while for small record sizes the random read/write performance is critical.
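For reference, these were the three variants I ran; only the record size changes, and the target path is just an example as above:
iozone -s 1G -r 16M -I -i 0 -i 1 -i 2 -f path/to/storage/to/measure/iozone.tmp
iozone -s 1G -r 512k -I -i 0 -i 1 -i 2 -f path/to/storage/to/measure/iozone.tmp
iozone -s 1G -r 4k -I -i 0 -i 1 -i 2 -f path/to/storage/to/measure/iozone.tmp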
The results are here:
As we can see, the two benchmark results are similar to each other.
The key point is that the sequential read and write speeds matter for large file operations (e.g. copying movies), while performance at small record sizes determines the overall responsiveness of the operating system.
Raw disk speeds of the Ryzen computer
After settling on how to measure speed with iozone, I measured all the disk subsystems directly under Ubuntu 16.04:
From these results we can see that the NVMe SSD is by far the fastest, and that the SATA SSD RAID doubles the throughput for large records but brings no improvement for small ones. It is also clear that HDD speeds degrade quickly as the record size decreases (this is what makes SSD-based systems boot so fast).
We can also see that the SATA2 interface halves the SSD speed compared to SATA3, so installing an SSD in an old computer gains you some speed, but you cannot reach the speed of a new system.
Installing OpenDedup
I used the installation instructions from the OpenDedup site to install it.
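For the record, on Ubuntu the installation essentially comes down to downloading the SDFS .deb package from the OpenDedup download page and installing it with dpkg; a rough sketch (the package file name below is just a placeholder for whatever the current release is):
sudo dpkg -i sdfs-<version>.deb    # the package downloaded from the OpenDedup site
sudo apt-get -f install            # pull in any missing dependencies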
After installation, when creating the volume I put the file data on the SATA SSD RAID and the hash tables on the NVMe SSD. I used ext2 file systems for the backing storage, because I had read somewhere that it is faster than ext4, which I normally use.
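For completeness, the backing file systems were prepared roughly like this (the device names and labels are placeholders for my SATA SSD RAID and NVMe partition; the mount points match the paths used below):
sudo mkfs.ext2 -L SSD_RAID_Ext2 /dev/md1          # SATA SSD RAID, will hold the chunk data
sudo mkfs.ext2 -L SSD_Fast_Ext2 /dev/nvme0n1p3    # NVMe partition, will hold the hash tables
sudo mkdir -p /media/SSD_RAID_Ext2 /media/SSD_Fast_Ext2
sudo mount LABEL=SSD_RAID_Ext2 /media/SSD_RAID_Ext2
sudo mount LABEL=SSD_Fast_Ext2 /media/SSD_Fast_Ext2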
To create this volume I used the following command:
sudo mkfs.sdfs --volume-name=VirtDisk --volume-capacity=400GB --base-path=/media/SSD_RAID_Ext2/Adatok/Opendedup/VirtDisk --chunk-store-hashdb-location=/media/SSD_Fast_Ext2/Adatok/Opendedup/VirtDisk/chunkstore/hdb --dedup-db-store=/media/SSD_Fast_Ext2/Adatok/Opendedup/VirtDisk/ddb
And to mount it, the following command:
sudo mount.sdfs VirtDisk /media/VirtDiskDedup
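Note that the mount point has to exist before the first mount, and df can be used to check that the SDFS volume is really mounted (paths as in my setup):
sudo mkdir -p /media/VirtDiskDedup    # create the mount point once
df -h /media/VirtDiskDedup            # verify that the SDFS volume is mounted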
After mounting it, I measured its raw speed, and as you can see, it was significantly slower than the SSD volume it was residing on.
Measuring speed in the VM guest
The next step was to measure the speed inside the virtual machine guest operating system (Windows 10).
I did three measurements in each case: the Windows reboot time, the startup time of Microsoft Word, and a CrystalDiskMark run.
See the results here:
The Windows reboot time was measured in two parts: the time to reach the login prompt, and the time the startup programs need after login.
What we can see here is that starting Word does not depend very much on disk speed; even on the HDD RAID the difference is marginal. We can also see that the time after user login does not improve with the NVMe SSD; probably the SATA SSD is already fast enough for this task.
The boot time does improve consistently as disk speed increases, and large-record throughput also counts, as the SSD RAID brings a significant improvement.
Back to the OpenDedup results: the boot time is very similar to that of the single SATA SSD without RAID, so as far as storage speed goes it is acceptable. I attribute the worse post-login time and Word startup time to the poorer small-record writes; in many real-life applications this may be a drawback.
Space saving with OpenDedup
We are using OpenDedup to save space. I did not do much experimenting: first I copied a virtual disk image containing Windows 10 plus a standard Microsoft Office installation to the dedup store and could see that it already saved 27% of the disk space. I expected that copying a second image would save much more, but to my surprise, when I copied an image containing Windows 10 and Visual Studio, the total saving for the two images did not change; it was again 27%. To me this meant that there were no common savings between the two images.
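A quick way to cross-check the reported saving is to compare the logical size of the images on the SDFS mount with the space the chunk store actually occupies on the backing file system (the chunks directory below is the default location under my --base-path):
ls -lh /media/VirtDiskDedup/                                              # logical size of the stored images
du -sh /media/SSD_RAID_Ext2/Adatok/Opendedup/VirtDisk/chunkstore/chunks   # space actually used by the chunk store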
Conclusion and further thoughts
This experiment did not convince me to use deduplication: the roughly 27% disk saving is not very significant, while the speed decrease is.
As I am new to OpenDedup, maybe some further tweaking could lead to better results. For example, compression was enabled, but the statistics showed that it did not produce any gain (the compression rate was in fact slightly negative).
The other point is that the first image was in VDI format and the second in VMDK; maybe duplicates cannot be identified across the two formats?
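One way to test this hypothesis would be to convert one of the images so that both are stored in the same format and then copy them to the dedup volume again, for example with qemu-img (the file names are just placeholders):
qemu-img convert -f vmdk -O vdi Win10_VisualStudio.vmdk Win10_VisualStudio.vdi
Whether the 4 kB chunks would then actually line up between the two images is a separate question, so this remains an untested idea.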
More details
You can get the OpenDedup statistics with
sdfscli --volume-info
Here are my results after copying the first disk image and after copying the second:
Files : 1
Volume Capacity : 400 GB
Volume Current Logical Size : 49.54 GB
Volume Max Percentage Full : 95.0%
Volume Duplicate Data Written : 15.4 GB
Unique Blocks Stored: 36.01 GB
Unique Blocks Stored after Compression : 36.12 GB
Cluster Block Copies : 2
Volume Virtual Dedup Rate (Unique Blocks Stored/Current Size) : 27.32%
Volume Actual Storage Savings (Compressed Unique Blocks Stored/Current Size) : 27.09%
Compression Rate: -0.32%
Files : 2
Volume Capacity : 400 GB
Volume Current Logical Size : 124.16 GB
Volume Max Percentage Full : 95.0%
Volume Duplicate Data Written : 35.61 GB
Unique Blocks Stored: 90.41 GB
Unique Blocks Stored after Compression : 90.7 GB
Cluster Block Copies : 2
Volume Virtual Dedup Rate (Unique Blocks Stored/Current Size) : 27.18%
Volume Actual Storage Savings (Compressed Unique Blocks Stored/Current Size) : 26.95%
Compression Rate: -0.32%
Parameters to use when creating a volume
On the OpenDedup site there is a good explanation of the XML parameters, but their syntax differs from what you have to use in the mkfs command. Here is the list of parameters for mkfs.sdfs:
usage: mkfs.sdfs --volume-name=sdfs --volume-capacity=100GB
--ali-enabled : Set to enable this volume to store to Alibaba Object Storage (OSS). cloud-url, cloud-secret-key, cloud-access-key, and cloud-bucket-name will also need to be set.
--atmos-enabled : Set to enable this volume to store to Atmos Object Storage. cloud-url, cloud-secret-key, cloud-access-key, and cloud-bucket-name will also need to be set.
--aws-aim : Use IAM authentication for access to AWS S3
--aws-basic-signer : Use the basic s3 signer for the cloud connection. This is set to true by default for all cloud url buckets
--aws-bucket-location : The aws location for this bucket
--aws-disable-dns-bucket : Disable the use of dns bucket names to prepend the cloud url. This is set to true by default when cloud-url is set
--aws-enabled : Set to true to enable this volume to store to Amazon S3 Cloud Storage. cloud-secret-key, cloud-access-key, and cloud-bucket-name will also need to be set.
--azure-enabled : Set to true to enable this volume to store to Microsoft Azure Cloud Storage. cloud-secret-key, cloud-access-key, and cloud-bucket-name will also need to be set.
--azurearchive-in-days : Set to move to azure archive from hot after x number of days
--backblaze-enabled : Set to enable this volume to store to Backblaze Object Storage. cloud-url, cloud-secret-key, cloud-access-key, and cloud-bucket-name will also need to be set.
--backup-volume : When set, changes the volume attributes for better deduplication but slower random IO.
--base-path : The folder path for all volume data and meta data. Defaults to: /opt/sdfs/
--chunk-store-compress : Compress chunks before they are stored. By default this is set to true. Set it to false for volumes that hold data that does not compress well, such as pictures and movies
--chunk-store-data-location : The directory where chunks will be stored. Defaults to: --base-path + /chunkstore/chunks
--chunk-store-encrypt : Whether or not to encrypt chunks within the Dedup Storage Engine. The encryption key is generated automatically. For AWS this is a good option to enable. The default for this is false
--chunk-store-encryption-key : The encryption key used for encrypting data. If not specified a strong key will be generated automatically. The key must be at least 8 characters long
--chunk-store-gc-schedule : The schedule, in cron format, to check for unclaimed chunks within the Dedup Storage Engine. This should happen less frequently than the io-claim-chunks-schedule. Defaults to: 0 0 0/2 * * ?
--chunk-store-hashdb-class : The class used to store hash values. Defaults to: org.opendedup.collections.RocksDBMap
--chunk-store-hashdb-location : The directory where the hash database for chunk locations will be stored. Defaults to: --base-path + /chunkstore/hdb
--chunk-store-io-threads : Sets the number of io threads to use for io operations to the dse storage provider. This is set to 8 by default but can be changed to more or less based on bandwidth and io.
--chunk-store-iv : The encryption initialization vector (IV) used for encrypting data. If not specified a strong key will be generated automatically
--chunk-store-size : The size in MB, GB or TB of the Dedup Storage Engine. Defaults to: the size of the volume
--chunkstore-class : The class for the specific chunk store to be used. Defaults to org.opendedup.sdfs.filestore.FileChunkStore
--cloud-access-key : Set to the value of the Cloud Storage access key.
--cloud-backlog-size : How much data can live in the spool for backlog. Setting to -1 makes the backlog unlimited. Setting to 0 (default) sets no backlog. Setting to GB TB MB caps the backlog.
--cloud-bucket-name : Set to the value of the Cloud Storage bucket name. This will need to be unique and could be set to the access key if all else fails. aws-enabled, aws-access-key, and aws-secret-key will also need to be set.
--cloud-disable-test : Disables testing authentication for s3
--cloud-secret-key : Set to the value of the Cloud Storage secret key.
--cloud-url : The url of the blob server. e.g. http://s3server.localdomain/s3/
--cluster-block-replicas : The number of copies to distribute to discrete nodes for each unique block. As an example, if this value is set to "3" the volume will attempt to write any unique block to "3" DSE nodes, if available. This defaults to "2".
--cluster-config : The jgroups configuration used to configure this cluster node. This defaults to "/etc/sdfs/jgroups.cfg.xml".
--cluster-dse-password : The jgroups configuration used to configure this cluster node. This defaults to "/etc/sdfs/jgroups.cfg.xml".
--cluster-id : The name used to identify the cluster group. This defaults to sdfs-cluster. This name should be the same on all members of this cluster
--cluster-rack-aware : If set to true, the clustered volume will be rack aware and make the best effort to distribute blocks to multiple racks based on cluster-block-replicas. As an example, if cluster-block-replicas is set to "2" and cluster-rack-aware is set to "true", any unique block will be sent to two different racks if present. The mkdse option --cluster-node-rack should be used to distinguish racks per dse node for this cluster.
--compress-metadata : Enable compression of metadata at the expense of speed to open and close files. This option should be enabled for backup
--data-appendix : Add an appendix for data files.
--dedup-db-store : The folder path to the location for the dedup file database. Defaults to: --base-path + /ddb
--enable-replication-master : Enable this volume as a replication master
--encrypt-config : Encrypt security sensitive encryption parameters with the admin password
--ext
--gc-class : The class used for intelligent block garbage collection. Defaults to: org.opendedup.sdfs.filestore.gc.PFullGC
--glacier-in-days : Set to move to glacier from s3 after x number of days
--glacier-restore-class : Set the class used to restore glacier data.
--google-enabled : Set to true to enable this volume to store to Google Cloud Storage. cloud-secret-key, cloud-access-key, and cloud-bucket-name will also need to be set.
--hash-type : This is the type of hash engine used to calculate a unique hash. The valid options for hash-type are tiger16, tiger24, murmur3_128 and VARIABLE_MURMUR3. This defaults to VARIABLE_MURMUR3
--help : Display these options.
--io-chunk-size : The unit size, in kB, of chunks stored. Set this to 4 if you would like to dedup VMDK files inline. Defaults to: 4
--io-claim-chunks-schedule : The schedule, in cron format, to claim deduped chunks with the Volume(s). Defaults to: 0 59 23 * * ?
--io-dedup-files : True means that all files will be deduped inline by default. This can be changed on a one-off basis by using the command "setfattr -n user.cmd.dedupAll -v 556:false <path to file>". Defaults to: true
--io-log : The file path to the location for the io log. Defaults to: --base-path + /sdfs.log
--io-max-file-write-buffers : The amount of memory to have available for reading and writing per file. Each buffer is the size of io-chunk-size. Defaults to: 24
--io-max-open-files : The maximum number of files that can be open at any one time. If the number of files is exceeded the least recently used will be closed. Defaults to: 1024
--io-meta-file-cache : The maximum number of metadata files to be cached at any one time. If the number of files is exceeded the least recently used will be closed. Defaults to: 1024
--io-safe-close : If true all files will be closed on filesystem close call. Otherwise, files will be closed based on inactivity. Set this to false if you plan on sharing the file system over an nfs share. True takes less RAM than False. Defaults to: true
--io-safe-sync : If true all files will sync locally on filesystem sync call. Otherwise, by default (false), files will sync on close and data will be written to disk based on --max-file-write-buffers. Setting this to true will ensure that no data loss will occur if the system is turned off abruptly, at the cost of slower speed. Defaults to: false
--io-write-threads : The number of threads that can be used to process data written to the file system. Defaults to: 16
--local-cache-size : The local read cache size for data uploaded to the cloud. Defaults to: 10 GB
--low-memory : Sets the volume to minimize the amount of ram used at the expense of speed
--minio-enabled : Set to enable this volume to store to Minio Object Storage. cloud-url, cloud-secret-key, cloud-access-key, and cloud-bucket-name will also need to be set.
--noext
--permissions-file : Default File Permissions. Defaults to: 0644
--permissions-folder : Default Folder Permissions. Defaults to: 0755
--permissions-group : Default Group. Defaults to: 0
--permissions-owner : Default Owner. Defaults to: 0
--refresh-blobs : Updates blobs in s3 to keep them from moving to glacier if claimed by newly written files
--report-dse-capacity : If set to "true" this volume will report the actual capacity statistics from the DSE. If this value is set to "false" it will report the virtual size of the volume and files. Defaults to "true"
--report-dse-size : If set to "true" this volume will report the actual used statistics from the DSE. If this value is set to "false" it will report the virtual size of the volume and files. Defaults to "true"
--sdfscli-disable-ssl : Disables ssl to the management interface
--sdfscli-listen-addr : IP listening address for the sdfscli management interface. This defaults to "localhost"
--sdfscli-listen-port : TCP/IP listening port for the sdfscli management interface
--sdfscli-password : The password used to authenticate to the sdfscli management interface. The default password is "admin".
--sdfscli-require-auth : Require authentication to connect to the sdfscli management interface
--simple-metadata : If set, will create a separate object for metadata used for objects sent to the cloud. Otherwise, metadata will be stored as attributes to the object.
--simple-s3 : Uses basic S3 api characteristics for the cloud storage backend.
--tcp-keepalive : Set the tcp-keepalive setting for the connection with S3 storage
--use-perf-mon : If set to "true" this volume will log io statistics to the /etc/sdfs/ directory. Defaults to "false"
--user-agent-prefix : Set the user agent prefix for the client when uploading to the cloud.
--volume-capacity : Capacity of the volume in [MB|GB|TB]. THIS IS A REQUIRED OPTION
--volume-maximum-full-percentage : The maximum percentage of the volume capacity, as set by volume-capacity, before the volume starts reporting that the disk is full. If the number is negative then it will be infinite. This defaults to 95, e.g. --volume-maximum-full-percentage=95
--volume-name : The name of the volume. THIS IS A REQUIRED OPTION
--vrts-appliance : Volume is running on a NetBackup Appliance.
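To illustrate how these options map onto an actual command, here is an (untested) variant of my earlier volume-creation call that also explicitly sets the 4 kB chunk size mentioned above for inline VMDK dedup and disables compression, which in my case brought no gain; the VirtDisk2 name and paths are just placeholders:
sudo mkfs.sdfs --volume-name=VirtDisk2 --volume-capacity=400GB --base-path=/media/SSD_RAID_Ext2/Adatok/Opendedup/VirtDisk2 --chunk-store-hashdb-location=/media/SSD_Fast_Ext2/Adatok/Opendedup/VirtDisk2/chunkstore/hdb --dedup-db-store=/media/SSD_Fast_Ext2/Adatok/Opendedup/VirtDisk2/ddb --io-chunk-size=4 --hash-type=VARIABLE_MURMUR3 --chunk-store-compress=false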