Lessfs-1.5.12 performance

Introduction

People frequently ask what performance they can expect from Lessfs. This article gives an indication.

About the hardware

All the tests were done using an Intel 5520HC system board with a single E5520 processor @ 2.27GHz. The metadata is written to an Intel 320 SSD while the data is written to 5 Hitachi HUA722010CLA330 SATA drives attached to an LSI MegaRAID controller in RAID 5. The maximum transfer speed to the volume on the LSI controller is approximately 400 MB/sec. When I tested the same drives with Linux software RAID 5 I found it hard to get more than 250 MB/sec out of them. Even worse is the number of IOPS that you can get from the drives with software RAID. So for now I will stick to using hardware RAID.

Installing lessfs

In this test we will set up lessfs with file_io and hamsterdb 2.0.1. After downloading and installing hamsterdb-2.0.1 we start by downloading lessfs.

wget http://sourceforge.net/projects/lessfs/files/lessfs/lessfs-1.5.12/lessfs-1.5.12.tar.gz
tar xvzf lessfs-1.5.12.tar.gz
cd lessfs-1.5.12
./configure --with-hamsterdb --with-snappy
make -j4

In this example the RAID5 volume on the LSI RAID controller is mounted on /data.
The SSD is mounted on /data/mta.

The configuration file used in this example can be downloaded here: lessfs.cfg

After downloading lessfs.cfg you will need to copy it to /etc.

Please make sure that the directories /data/dta and /data/mta exist.
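A minimal sketch of creating both directories in one step. The helper name ensure_lessfs_dirs is made up for illustration; the paths are the ones used in this article.

```shell
#!/bin/sh
# Hypothetical helper: create the data (dta) and metadata (mta) directories
# under a base path (default /data, as used in this article) and fail if
# either one is missing afterwards.
ensure_lessfs_dirs() {
    base="${1:-/data}"
    mkdir -p "$base/dta" "$base/mta" || return 1
    [ -d "$base/dta" ] && [ -d "$base/mta" ]
}
# Example: ensure_lessfs_dirs /data
```

Since /data/mta is also the mount point of the SSD here, it has to exist before the SSD is mounted.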
Now we can format lessfs and mount the filesystem:

./mklessfs -c /etc/lessfs.cfg
./lessfs /etc/lessfs.cfg /mnt

If everything went right, lessfs should now be mounted on /mnt.
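Lessfs does not necessarily print anything when mounting fails, so it can be worth checking the mount table. The following is only a sketch: is_mounted is a made-up helper, and it assumes a Linux system where /proc/mounts lists the active mounts.

```shell
#!/bin/sh
# Hypothetical helper: succeed if MOUNTPOINT appears as a mount target
# (second field) in the mount table. The table path can be overridden,
# it defaults to /proc/mounts.
is_mounted() {
    target="$1"
    table="${2:-/proc/mounts}"
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$table"
}
# Example: is_mounted /mnt && echo "lessfs is mounted on /mnt"
```

If the check fails, the exit code of the lessfs command and the syslog output are the next places to look.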

I then use a small tool to write 3000 files of 1GB each to lessfs. The files cannot be compressed and each has unique content. A second pass writes files that are 100% identical to those of the first pass, so they are written at a much higher speed. After this, the files from the first pass are read back from lessfs. This is the result:

[Graph: lessfs & hamsterdb]

As you can see, there appear to be 5 lines in this graph instead of the 3 that you would expect. The explanation is this: Lessfs flushes all data from the cache to disk every COMMIT_INTERVAL seconds. At this point all transactions are committed to disk and a steady state for the file system is created. While Lessfs flushes the data to disk no new data is processed, and as a result we see a dip in the transfer rate. This dip shows up at the bottom of the graph.

In this case the red line shows the worst-case performance that you can have: Lessfs is writing files that are unique and cannot be compressed. The purple line shows the best performance that you may expect, since these files are 100% duplicates of the files written for the red line. Because the files cannot be compressed, disabling compression makes sense for this workload and will improve performance.

In this test the transfer speed of the first write appeared to be limited by the IOPS capability of the SATA RAID5 volume. The reads are limited in speed by decompression latency and IOPS capability of the SATA drives.
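The tool used for the write passes is not shown in this article. As a rough stand-in, the first pass (unique, incompressible files) could be approximated at small scale as follows. The function name, file naming and sizes are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical stand-in for the test tool: write COUNT files of SIZE bytes
# each into DIR, filled from /dev/urandom so that every file is unique and
# effectively incompressible.
write_unique_files() {
    dir="$1"; count="$2"; size="$3"
    i=1
    while [ "$i" -le "$count" ]; do
        head -c "$size" /dev/urandom > "$dir/file_$i.bin" || return 1
        i=$((i + 1))
    done
}
# Example (small scale): write_unique_files /mnt 3 1048576
```

Reading from /dev/urandom can itself become the bottleneck at high transfer rates, so this sketch is suited to functional checks rather than to reproducing the numbers in the graph.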

Next time I will show the results of Lessfs with Berkeley DB and the effect of caching the Intel 320 SSD with EPRD.


13 Responses to Lessfs-1.5.12 performance

  1. cw says:

    how much ram is in the system?

  2. How much space did HamsterDB take on the SSD, when testing with those 3000 files?

    I wonder whether there’s a formula to size the DB based e.g. on chunk size and number of unique entries.

    • maru says:

      centos62dev:/data/mta # ls -alh /data/dta/
      total 3.0T
      drwxr-xr-x 2 root root 43 Apr 25 08:56 .
      drwxr-xr-x 4 root root 41 Apr 23 11:01 ..
      -rwx------ 1 root root 3.0T Apr 25 14:06 blockdata.dta
      -rwx------ 1 root root 0 Apr 25 08:56 replog.dta
      centos62dev:/data/mta # ls -alh /data/mta/
      total 8.5G
      drwxr-xr-x 2 root root 24K Apr 25 08:56 .
      drwxr-xr-x 4 root root 41 Apr 23 11:01 ..
      -rw-r--r-- 1 root root 628 Apr 25 08:56 DB_CONFIG
      -rw-r--r-- 1 root root 8.5G Apr 27 17:26 lessfs.db
      -rw-r--r-- 1 root root 3.3K Apr 27 17:26 lessfs.db.jrn0
      -rw-r--r-- 1 root root 4.3K Apr 27 17:24 lessfs.db.jrn1
      -rw-r--r-- 1 root root 16 Apr 27 17:26 lessfs.db.log0

      • Thanks!

        8.5GB/3000GB = 0.28% extra space for dedup DB at worst case (all unique data). Which is brilliant when compared to ZFS and OpenDedup.
        What was the chunk size?

        Best Regards
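The percentage quoted above can be rechecked with a one-line calculation. This sketch only repeats the arithmetic from the listing (8.5G for lessfs.db against 3000 files of 1GB); the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: metadata size as a percentage of the stored data size.
# Both arguments must use the same unit (here: GB).
overhead_pct() {
    awk -v m="$1" -v d="$2" 'BEGIN { printf "%.2f%%\n", m / d * 100 }'
}
overhead_pct 8.5 3000   # prints 0.28%
```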

  3. Chris says:

    I’m running lessfs 1.5.12 with hamsterdb 2.0.2 with chunk_io on btrfs on ubuntu 12.04.
    My lessfs suddenly crashed while writing files to it. When I try to restart it, I receive the following error:
    ================================================
    root@ubuntu1204:/mnt# lessfs /mnt/dedup/lessfs.cfg /mnt/storage
    ASSERT FAILED in file blob.cc, line 1007:
    "blob_get_self(&hdr)==blobid"
    invalid blobid 0 != 63066816
    ================================================

    what’s going wrong? what can i do?

    Regards, Chris

    • Mike says:

      I have had this error also. I had no recourse but to wipe and start over. It was reproducible by copying a large file (100GB+) from inside the mounted lessfs folder to a new file in the same folder. In the beginning it would just hang any operation on the mounted folder (ls, cd, fuser) until I killed the lessfs process. Then the mount would become stale and the only way to get rid of it was to reboot. I could mount lessfs to a new folder and it would work just fine. I had done that for a while and copied several 80GB files around the folder and thought it was a fluke. I then copied the 100GB+ file and it stopped just shy of completing with an endpoint connection error. When I tried to remount it to a new folder, I received the above error. I am going to set up a test server to play with and see what I can find by building with debug on and so on.

  4. Joan says:

    Hi, first of all, thank you for this nice piece of software.

    Secondly, I have been doing some testing before putting any data on it, and have found a show-stopping bug/problem. Currently running:
    lessfs-1.5.12, with hamsterdb-2.0.2 and mhash-0.9.9.9 compiled on a Ubuntu 11.10 (kernel 3.0.0-12-generic-pae).

    After writing about 180GB of highly deduplicable data (maybe 20-30 vmdk files from 3 different days), lessfs crashed in the middle of a copy. Any attempt to remount the lessfs filesystems causes the following error:

    n0der@box:/mnt/2/lessfs/dta$ sudo lessfs /etc/lessfs.cfg /mnt/lessfs
    *** glibc detected *** lessfs: realloc(): invalid next size: 0x08ec1190 ***
    ======= Backtrace: =========
    /lib/i386-linux-gnu/libc.so.6(+0x6ebc2)[0xb7606bc2]
    /lib/i386-linux-gnu/libc.so.6(+0x715cf)[0xb76095cf]
    /lib/i386-linux-gnu/libc.so.6(realloc+0xf7)[0xb760aab7]
    /usr/local/lib/libhamsterdb.so.3(_ZN16DefaultAllocator7reallocEPKvj+0x37)[0xb74f04c7]
    /usr/local/lib/libhamsterdb.so.3(_Z29btree_prepare_key_for_compareP8DatabaseiP11btree_key_tP9ham_key_t+0x6d)[0xb74c99ad]
    /usr/local/lib/libhamsterdb.so.3(_Z18btree_compare_keysP8DatabaseP4PageP9ham_key_tt+0x8b)[0xb74c9aeb]
    /usr/local/lib/libhamsterdb.so.3(_Z14btree_get_slotP8DatabaseP4PageP9ham_key_tPiS5_+0x6a)[0xb74c9e7a]
    /usr/local/lib/libhamsterdb.so.3(+0x46652)[0xb74ce652]
    /usr/local/lib/libhamsterdb.so.3(+0x46cd2)[0xb74cecd2]
    /usr/local/lib/libhamsterdb.so.3(+0x4705a)[0xb74cf05a]
    /usr/local/lib/libhamsterdb.so.3(+0x47018)[0xb74cf018]
    /usr/local/lib/libhamsterdb.so.3(+0x47018)[0xb74cf018]
    etc…

    Any insight of what might be the problem or how to work around it?
    In the current state it is impossible to mount the lessfs FS.

    No hurry/worries, it was test data anyway.

    Kind regards.

    N0der.

  5. Chris says:

    I am trying to install 1.5.12 on a freshly installed CentOS 6.2, but something goes wrong repeatedly (I am using a VM and tried it several times).
    After everything is done, calling lessfs or mklessfs always shows the following message:
    lessfs: error while loading shared libraries: libtokyocabinet.so.9: cannot open shared object file: No such file or directory
    Am I doing something wrong and you have got a suggestion what that might be, or is that a bug maybe? It has been a while since I installed the previous system on Debian, but output while making and installing fuse, tc and lessfs seemed similar.

    Thank you for your great work!

    • Chris says:

      I finally figured out the problem: I had to enter the path to the lib (/usr/local/lib) in /etc/ld.so.conf
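The fix described above could look roughly like this. It is a sketch under assumptions: add_lib_path is a made-up helper, and on a real system it has to run as root and be followed by ldconfig so that the linker cache is rebuilt.

```shell
#!/bin/sh
# Hypothetical helper: append LIB_DIR to a linker config file unless that
# exact line is already present, so repeated runs do not add duplicates.
add_lib_path() {
    conf="$1"; libdir="$2"
    grep -qxF "$libdir" "$conf" 2>/dev/null || printf '%s\n' "$libdir" >> "$conf"
}
# Example (as root): add_lib_path /etc/ld.so.conf /usr/local/lib && ldconfig
```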

      • Joe Gruher says:

        I wonder if I am having a similar problem. When I try to run "./configure --with-hamsterdb" I get the error:

        configure: error: "Hamsterdb is not found"

  6. Hi,

    I’m having an issue with lessfs, I can’t figure what I’m doing wrong, but it doesn’t seem to actually mount the lessfs, but doesn’t output any error nor messages.

    Config file /etc/lessfs/alex.cfg available at http://pastebin.com/P335Wj6B

    [root@arch64vm ~]# mkdir -p /data/{dta,mta}
    [root@arch64vm ~]# mklessfs -c /etc/lessfs/alex.cfg

    [root@arch64vm ~]# mkdir /lessfs
    [root@arch64vm ~]# lessfs /etc/lessfs/alex.cfg /lessfs

    No error, no output, but:

    [root@arch64vm lessfs]# echo $?
    4

    Nothing new in dmesg.
    /var/log/messages shows:

    Jan 11 21:46:10 localhost lessfs[6822]: Blocksize = 131072 bytes
    Jan 11 21:46:10 localhost lessfs[6822]: MIN_SPACE_CLEAN is not set, lessfs runs -ENOSPC when reaching MIN_SPACE_FREE
    Jan 11 21:46:10 localhost lessfs[6822]: The selected data store is chunk_io.
    Jan 11 21:46:10 localhost lessfs[6822]: Lessfs transaction support is enabled.
    Jan 11 21:46:10 localhost lessfs[6822]: config->blockdata = /data/dta/
    Jan 11 21:46:10 localhost lessfs[6822]: compression = none
    Jan 11 21:46:10 localhost lessfs[6822]: Threaded background delete is disabled
    Jan 11 21:46:10 localhost lessfs[6822]: Hash MHASH_TIGER192 has been selected
    Jan 11 21:46:10 localhost lessfs[6822]: Lessfs uses a 24 bytes long hash.
    Jan 11 21:46:10 localhost lessfs[6822]: cache 2048 data blocks

    [root@arch64vm ~]# mount | grep less
    [root@arch64vm ~]# (nothing)

    So, no lessfs mounted.

    [root@arch64vm ~]# cd /lessfs/
    [root@arch64vm lessfs]# df .
    Filesystem 1K-blocks Used Available Use% Mounted on
    /dev/sda3 7060308 5937076 764584 89% /

    Anything I could be doing wrong?

    Some more data:

    # uname -a
    Linux arch64vm 3.6.11-1-ARCH #1 SMP PREEMPT Tue Dec 18 08:57:15 CET 2012 x86_64 GNU/Linux

    Versions:

    [2013-01-10 23:59] installed snappy (1.0.5-2)
    [2013-01-11 00:02] installed snzip (20130111-1)
    [2013-01-11 00:19] installed fuse (2.9.2-1)
    [2013-01-11 00:19] installed mhash (0.9.9.9-2)
    [2013-01-11 00:24] installed tokyocabinet (1.4.47-1)
    [2013-01-11 00:30] installed lessfs (1.5.12-1)

  7. Sonam says:

    Hello,

    I have been trying to understand some of the configuration options for Lessfs and I want to make sure that we aren’t unnecessarily enabling/disabling options which may cause the performance to suffer.

    I haven’t properly understood how exactly transactions work in this system. Why does it say that fsck is used to recover the file system after a crash if transactions are not enabled? Isn’t fsck a file system specific tool, and how does it help recover the Lessfs database files? I apologize if this sounds like a naive question, I am very new to Lessfs.

    Another question I had was how is the data actually stored in the file_io? I see that it is all stored in a single file. Are the unique blocks stored sequentially in this file? Or is it more like a hash table, where the block where the data is stored is determined by the hash? I actually did have a look at the data file and it looks like it is sequentially stored, but I’ve had some trouble understanding the mapping of offset in the database to the actual offset in this file.

    Thank you very much. I apologize again for my naive questions, but I’ve spent a lot of time searching for answers, couldn’t really find too much about these details.

    Regards,
    Sonam
