Introducing TIER

Tier is a Linux kernel module that creates a block device with automatically tiered storage. Tier can aggregate up to 16 devices into one virtual device. It examines access patterns to decide on which device data should be written: it keeps track of how frequently each block is accessed as well as when it was last used, and uses this information to decide whether the data belongs on, for example, SSD/SAS or SATA.
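
The exact placement heuristics are internal to the module, but the general idea can be illustrated with a toy score that weighs access count against recency. This is purely a hypothetical illustration, not TIER's actual code:

# Toy illustration only, NOT TIER's real heuristic.
# Input lines: <block_id> <access_count> <seconds_since_last_access>
printf '%s\n' '17 120 30' '42 3 86400' '99 45 600' |
awk '{
    score = $2 / (1 + $3 / 3600)   # more hits and more recent use give a higher score
    printf "block %s -> %s\n", $1, (score > 10 ? "fast tier (SSD)" : "slow tier (SATA)")
}'

A real tiering module of course works on live block I/O statistics rather than a text listing, but the frequency/recency trade-off is the same.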

One advantage of tier compared to SSD caching alone is that the total capacity of the tiered device is the sum of all attached devices. Kernel modules like flashcache use the SSD only as a cache, so the capacity of the SSD is not available as part of the total size of the device. In the test setup below, for example, the 160GB SSD and the roughly 900GB of usable RAID10 capacity together present a little over 1TB of tiered storage.

Since TIER combines the RAM caching techniques of EPRD it is very fast, even faster than what can be achieved with an SSD alone.

To get an impression of TIER performance I tested it in the following configuration:
An Intel SSD of 160GB is used as the first tier, and the second tier is made up of 6 * 300GB SAS drives in software RAID10.
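
For reference, a RAID10 array of that shape could be assembled with mdadm roughly as follows (the member device names below are illustrative, not the ones used in the original setup):

# Hypothetical member devices; substitute your own SAS drives.
mdadm --create /dev/md1 --level=10 --raid-devices=6 \
      /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh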

The iometer test that was used comes from http://vmktree.org/iometer/
Tier was configured with these parameters:

./tier_setup -f /dev/sdb:/dev/md1 -p 1000M -m 5 -b -c
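
Once tier_setup has run, the aggregated device can be formatted and mounted like any other block device. A minimal sketch, assuming the module exposes the device under a node such as /dev/tier0 (the actual device name may differ; check dmesg after running tier_setup):

# /dev/tier0 is an assumed device node; verify the real name in dmesg.
mkfs.ext4 /dev/tier0
mkdir -p /mnt/tier
mount /dev/tier0 /mnt/tier

The iometer results for this configuration were as follows (higher is better):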
                              TIER - SSD  - MD1(R10)
Max-throughput-100%read    : 32540 - 3796 - 2746
Reallife-60%rand-65%read   : 1927  - 3185 - 226
Max-Throughput-50%read     : 6890  - 1753 - 470
Random-8k-70%read          : 937   - 2870 - 401

As shown in the results table above, TIER outperforms the MD RAID10 on all tests. The SSD is faster in most cases, but not all. TIER can outperform the SSD because it was configured to use 1GB of RAM for caching, and because it benefits from the speed advantage that RAID10 gives on sequential reads and writes.
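
The numbers above come from the iometer profiles linked earlier. For anyone without an iometer setup, a roughly comparable load can be generated with fio; the job below approximates the Random-8k-70%read mix (the parameters are an approximation, not the exact iometer profile, and /dev/tier0 is again an assumed device name):

# Approximates an 8k, 70% read random workload. Writing to the raw device
# destroys any data on it, so only run this against a scratch tier device.
fio --name=random-8k-70read --filename=/dev/tier0 \
    --ioengine=libaio --direct=1 --rw=randrw --rwmixread=70 \
    --bs=8k --iodepth=32 --runtime=60 --time_based --group_reporting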

[Figure: tier-iometer benchmark results]


12 Responses to Introducing TIER

  1. dimiz says:

    Hi Maru
    Thanks for your great new tool! This is a very useful feature for virtualization environments, but also for many other IT uses.
    I have tried to compile it on CentOS 5.8 but I get a lot of errors. Is that because it is not CentOS 5.x compatible? I will post my log when I get home if you need it.
    Is the SSD TRIM feature usable with Tier?
    How is the data protected in this solution? What I mean is: do you think data corruption is possible while data is being moved between the disks? And is it possible to create software RAID 1 on top of a “Tier device”?

    Many thanks in advance

  2. Johnathan says:

    maru, that’s amazing.

    Thanks a lot.

  3. silopolis says:

    hi,
    It is really great news to see a Linux project in this field!
    To date the only “serious” options I’ve found are SAM/QFS, but Solaris is not my OS of choice, and OpenArchive (http://www.openarchive.net), which runs on Linux but has a lot of dependencies (including a patched Samba :/ ) and is a beast to configure, and certainly to maintain.
    So a kernel module that lets you transparently and simply “stack” storage tiers would be more than welcome!
    By the way, support for tapes and tape libraries (LTO5+) would add a valuable high-volume, energy-efficient, long-term storage tier…
    Thank you for this project
    Bests

  4. dude says:

    “One advantage of tier when compared to SSD caching only is that the total capacity of the tiered device is the sum of all attached devices.”

    But then if the single SSD fails, you lose all your data? Also, once the 160GB of the SSD runs out, the remaining 900GB on the RAID10 would be the same speed as, or slower than, a simple RAID10, wouldn’t it?

    • maru says:

      The storage layers that you ‘stripe’ together with tier should be protected with raid. Any device that fails will corrupt the whole tier device. So indeed, for production purposes one should use raid. When the SSD runs out of space the automatic tiering process will make sure that the blocks that are most frequently used are stored on SSD.
      Blocks are moved by the optimization process, which uses statistics such as how often a block is used and when it was last used to decide whether a block must be reallocated. Therefore the speed will be considerably higher than that of a simple RAID10.

  5. Calvin says:

    Can a TIER device be expanded after creation (without data loss)? If one of the block devices below TIER is expanded (RAID capacity expansion), will TIER be able to use the new storage (tier_setup -d followed by tier_setup …) or is the TIER metadata specific to the initial block device sizes?

  6. Calvin says:

    One follow up question – is it possible to add additional tiers to an existing tiered device?

  7. rob says:

    Has anyone tried using tiered devices to back a drbd volume? Details of a known-good configuration would be great. “Do not go there” is also a good time saver if someone knows it’s not a go.

  8. Michael says:

    Hi,
    Very interesting project! What is its current status? Is there an active developer community, or a small group / single person? Is it being used in production environments already? Same questions for lessfs :-)

    thanks,
    Michael

  9. CS says:

    There is a person who was developing a tiered solution, but there haven’t been any commits to the git project for ~5 months. I’m interested to know whether anyone else is working on something like this. Perhaps this is my chance to jump back into C.
    https://bbs.archlinux.org/viewtopic.php?id=113529&p=2

  10. snowman says:

    Hi Maru,

    I am really amazed by this project, as my university assigned me a project to do exactly the same thing. What I have done so far is only a user-space program that migrates a file byte by byte or block by block to a block group specified by the user (this is to simulate data migration between tiers). Also, my implementation is not a real-time tiering solution, as the file system has to be unmounted while the program runs. In addition, because the block bitmap, inode bitmap and group descriptors need to be updated, a lot of calculations are executed, and this results in extremely slow speeds when migrating a large file of, say, tens of MBs. I didn’t expect such poor performance before I completed the implementation, and this is also why I was so “shocked” when I saw that you implemented the feature as a kernel module.

    Just out of curiosity, was this project done by you alone, or was there a group of people developing the solution? How long did you spend on it, from the idea to the first version?

    Your reply will be much appreciated.

    Thank you.

    Best regards,
    Snowman Zhang

