Lessfs-1.5.11 now allows users to specify the cache size that hamsterdb will use internally.
This version also fixes a bug in configure.ac that caused configure with --disable-debug to actually enable debugging. This bug led a number of users to report very low performance.
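For reference, the cache size is set in /etc/lessfs.cfg. A hedged sketch (the option name HAM_PARAM_CACHESIZE comes from the comment thread below; verify the exact name and unit against the lessfs.cfg template shipped with 1.5.11):

# assumed option name and unit -- check the shipped lessfs.cfg template
HAM_PARAM_CACHESIZE=4096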
Hi Mark,
Where do I specify the hamsterdb cache size?
And what value do you recommend for storing about 1TB of data?
Thanks,
Mark
I am having a performance problem with lessfs (1.5.12, hamsterdb 2.0.1, no compression, file_io on Fedora 16 btrfs, 16GB RAM, fast Dell SAS RAID) with lots of small files.
for i in {1..5000}
do
    echo $i
    echo $i > $i
done
The first few go quickly, but by the time it gets to 4000 it slows down to 2 per second.
If I try to ls the directory (with only 4000 files), it takes quite a while to get a listing.
If I try without lessfs and just use btrfs or xfs, it is very quick.
Is there a setting I am missing?
I have had a guess at the new hamsterdb setting in /etc/lessfs.cfg (HAM_PARAM_CACHESIZE=4096). Is that correct?
Thanks.
Lessfs was never designed to handle lots of small files. I designed it for the exact opposite: a relatively small number of large files.
To enable better performance with small files, switching to the low-level FUSE API is needed. Lessfs still uses the high-level API.
OW!!! That's a very serious problem :-(( 2 per second is so slow… I really hoped to use lessfs as a general-purpose backup filesystem, and I do have directories with thousands of files. I wouldn't want to wait for a complete reimplementation such as btrfs 2.x; it's so sad.
Wouldn't it be possible to cache some of the metadata received from FUSE, e.g. dedicate 50MB of RAM to caching the most recently received FUSE data, so as to avoid the round trip to the FUSE API each time? First look up the cache, then if not there ask FUSE, then periodically erase entries from the cache that are too old.
If you are backing up a lot of small files at once, use tar to create one non-compressed BIG file out of your small files.
Please post your results, but this should help.
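For example (a sketch; the archive path and source directory here are placeholders):

# pack the small files into one big uncompressed archive on the lessfs mount;
# no -z/-j flag, so tar leaves the data uncompressed for lessfs to deduplicate
tar -cf /mnt/lessfs/smallfiles.tar /path/to/small-files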
Two major problems: 1) I would lose deduplication capability, or I would need a tar variant that pads every file to the blocksize of lessfs (I don't think it exists). 2) That would be incompatible with rsync.
You should not lose dedup capability with tar. I have used tar and lessfs, but not lately. I also believe you can still use rsync with tar. You could also do full tars once a week and the rest of the week do incremental tars. I think this would eliminate the need for rsync unless you want to make a local tar on one server then rsync to the Lessfs server.
Yes, I do lose dedup capability: if the length of the first file changes, the following files are no longer aligned inside the tar the same way they were before. Since lessfs checks for duplicates by segmenting the tarfile into blocksize-sized segments, none of those segments would match the segments of the previous version of the tar.
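The shift is easy to demonstrate (a hedged sketch with GNU tar in a scratch directory; the file sizes are arbitrary): keep the second file identical and grow the first one past a 512-byte boundary.

head -c 600 /dev/zero > a          # member 1: 600 bytes
head -c 65536 /dev/urandom > b     # member 2: identical in both archives
tar -cf t1.tar a b
head -c 1200 /dev/zero > a         # member 1 grows across a 512-byte boundary
tar -cf t2.tar a b
cmp -l t1.tar t2.tar | wc -l       # large count: b's unchanged bytes now sit at shifted offsets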
That's a good point. I don't think that would lose dedup entirely, but it definitely would affect it.
There should be a way to specify padding so that the tar records are always a multiple of your lessfs blocksize. Since tar doesn't do compression unless you ask for it, I'm thinking this should be possible. I'd have to research it. I have a whole bunch of tar files I can copy to test. If you could try this and we both post our results, that would be better than just one of us trying it.
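One thing worth testing (hedged: GNU tar's -b flag sets the record size, and it only pads the end of the archive out to a multiple of N * 512 bytes; individual members stay aligned to 512 bytes, so this alone may not keep members aligned to the lessfs blocksize):

# record size = 256 * 512 bytes = 128 KiB, matching a 128 KiB lessfs blocksize
tar -b 256 -cf archive.tar /path/to/files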
Hi,
I see no LZ4 compression in the source trunk.
(It was presented here: http://www.lessfs.com/wordpress/?p=688)
Is it reserved for the 1.6+ series?
I quickly added LZ4 to the 1.6+ series and have not yet found the time to port it to 1.5.x.
For now I would advise using snappy with the 1.5 series. However, file_io + the 1.6 series + LZ4 + BDB should work without problems as well.
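A hedged sketch of what the relevant /etc/lessfs.cfg lines might look like for that combination; the option names below are assumptions and should be checked against the configuration template shipped with the 1.6 sources:

# assumed option names -- verify against the shipped lessfs.cfg template
BLOCKDATA_IO_TYPE=file_io
COMPRESSION=lz4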
Thanks for the clarification.
Since you are currently providing some detailed benchmarks, do you also intend to provide some figures with compression enabled?
My bad, it seems the benchmarks provided already use compression…