[devel] [astralstorm@: Re: [ck] Filesystem interactivity]

Konstantin A. Lepikhov lakostis at altlinux.ru
Fri Aug 12 02:53:58 MSD 2005


Hi!

I think this information will come in handy :) Especially the part about
reiser4 vs xfs.

----- Forwarded message from Radoslaw AstralStorm Szkodzinski  -----

Date: Fri, 12 Aug 2005 00:45:19 +0200
From: Radoslaw AstralStorm Szkodzinski <astralstorm@>
To: ck@
Cc: 
Subject: Re: [ck] Filesystem interactivity

First of all, I'd like this thread *not* to turn into "My FS is bettar than
yuor" flamewar.

On Thu, 11 Aug 2005 11:50:44 +0200
jos poortvliet <jos@> wrote:

> Op donderdag 11 augustus 2005 10:46, schreef Radoslaw AstralStorm Szkodzinski:
Everybody loves localised reply messages, this time in Dutch.

> > I'm looking for filesystems good in these cases:
> > a) streaming (one or two threads, single file, sequential)
> 
> I THINK reiser4 should be good here, XFS too afaik.
> 

I've already tried simple tests on R4 and XFS:
- kernel compile
- copying a large file (1G) inside the filesystem
- ripping a CD image (700M)
- copying a medium file (2.6.12 kernel archive) inside the filesystem and
from CD
- unpacking kernel archive
- copying kernel archive
- copying a large maildir (1000+ small files) both inside and from a CD
- unpacking this maildir from an archive on a CD
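
The original timings came from "time" and are lost, so what follows is only a
minimal sketch of how one of the copy tests above could be timed again. The
helper name timed_copy and the 4 MiB demo file are assumptions made for
illustration, not part of the original test setup. The fsync is there so the
measurement includes writing to disk rather than just filling the page cache.

```python
import os
import shutil
import tempfile
import time


def timed_copy(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst, force it to disk, and return the elapsed seconds."""
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
        # Flush Python's buffer, then ask the kernel to write the data out,
        # so the timing is not just a measurement of memory speed.
        fout.flush()
        os.fsync(fout.fileno())
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical demo: a small random file stands in for the 1G test file.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(4 * 1024 * 1024))
        elapsed = timed_copy(src, os.path.join(tmp, "dst.bin"))
        print("copy took %.3f s" % elapsed)
```

For a fair comparison between filesystems the source file should not already
be cached, and each run should start from a freshly created filesystem.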

All results below are relative to XFS.

Reiser4 is worse at:
- copying large files inside filesystem (better than ReiserFS)
- kernel compile (weird, but true)
Reiser4 is better at:
- copying the maildir

ReiserFS is worse at:
- copying large files inside filesystem
- ripping a CD image
ReiserFS is better at:
- copying the maildir
- unpacking the maildir

Other times are fairly similar.

I have yet to try it with Ext2, Ext3 and JFS.

The tests were done somewhere around 2.6.11, but I don't have the results of
"time" anymore. Reiser4 was the latest version of the time, from -mm. Its
performance hasn't noticeably changed since then.

> > b) massive multithreading/multiuser environment (20-30 threads, different
> > files, random)
> 
> no idea... maybe reiser4 and xfs, too

Reiser4 is especially bad for it. It somehow unfairly interleaves reads and
especially writes. I sometimes see pauses inside a r/w cycle.

Fairness is very much required in this kind of environment. Nobody likes a
user starving others with one big write.
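
One way to put a number on fairness, as a rough sketch only: start one thread
per file, let each do random reads, and compare the total and worst-case wait
per thread. The function names, the read count and the 4 KiB block size are
assumptions. With small or recently used files the reads will be served from
the page cache, so a real test needs files much larger than RAM.

```python
import os
import random
import threading
import time


def random_reader(path, n_reads, block_size, results, index):
    """Do n_reads random reads on path; store (total, worst) latency."""
    rng = random.Random(index)  # per-thread generator, reproducible offsets
    size = os.path.getsize(path)
    total = worst = 0.0
    with open(path, "rb", buffering=0) as f:
        for _ in range(n_reads):
            offset = rng.randrange(0, max(1, size - block_size))
            start = time.perf_counter()
            f.seek(offset)
            f.read(block_size)
            elapsed = time.perf_counter() - start
            total += elapsed
            worst = max(worst, elapsed)
    results[index] = (total, worst)


def run_fairness_test(paths, n_reads=200, block_size=4096):
    """Run one reader thread per file and return the per-thread results."""
    results = [None] * len(paths)
    threads = [
        threading.Thread(target=random_reader,
                         args=(path, n_reads, block_size, results, i))
        for i, path in enumerate(paths)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

If one thread's worst-case latency is far above the others, somebody is being
starved. A write variant would follow the same pattern.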

> > c) low latency access (one or two threads, single and multiple files,
> > random)
> 
> ext3, JFS (reiser4 sucks, here, afaik)

R4 sucks at multiple threads reading/writing different files and expecting low
latency. (also see fairness comment) Its writes are also very broken in this
regard.

I'd really like some benchmark and more information wrt this workload.
See below for Interbench suggestion.

Other comments:
Reiser4 gets much slower over time (repacking needed?), but it
doesn't have any other performance problems with full drives.
It is evil when the workload is fsync-intensive.
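
To make "fsync-intensive" concrete, here is a small sketch of such a
workload: many small appends, each followed by an fsync, with the time of
every fsync recorded. The function name and the default sizes are
assumptions chosen for illustration.

```python
import os
import time


def fsync_latencies(path, n_writes=100, payload=b"x" * 512):
    """Append payload n_writes times, fsync after each, return the latencies."""
    latencies = []
    with open(path, "wb") as f:
        for _ in range(n_writes):
            f.write(payload)
            f.flush()  # move the data from Python's buffer to the kernel
            start = time.perf_counter()
            os.fsync(f.fileno())  # wait until the kernel has written it out
            latencies.append(time.perf_counter() - start)
    return latencies
```

Comparing max(latencies) and sum(latencies) across filesystems gives a first
impression of how each one copes with this pattern.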

XFS stays fast until only about 10% of the disc space is left free. Then it
magically, instantly becomes dog slow.

ReiserFS is similar to XFS in this regard.

Ext3 always has nearly constant speed. Without htree it slows down in large
directories. Htree itself usually slowed the filesystem down until some
2.6.1x version.

The only gripe I have with it is the lack of a fractional percentage of
reserved blocks for root. 1% of a 200GB drive is quite a bit of space. I'd be
entirely happy with 200MB.
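
The arithmetic behind this gripe, as a short sketch. It assumes decimal units
(1 GB = 10^9 bytes) and a 4096-byte block size, neither of which is stated
above. As far as I know tune2fs also accepts an absolute reserved block count
through -r, which is where a number like the one computed here would be used.

```python
BLOCK_SIZE = 4096  # assumed filesystem block size in bytes
GB = 10 ** 9       # assumed decimal units, as drive vendors use
MB = 10 ** 6


def reserved_bytes(fs_bytes, percent):
    """Bytes reserved for root at the given percentage of the filesystem."""
    return int(fs_bytes * percent / 100)


def blocks_for(target_bytes, block_size=BLOCK_SIZE):
    """Smallest number of blocks that holds target_bytes (rounded up)."""
    return -(-target_bytes // block_size)


def percent_for(target_bytes, fs_bytes):
    """Percentage of the filesystem that target_bytes corresponds to."""
    return 100.0 * target_bytes / fs_bytes


if __name__ == "__main__":
    drive = 200 * GB
    print(reserved_bytes(drive, 1) // MB, "MB reserved at 1%")
    print(blocks_for(200 * MB), "blocks needed for 200MB")
    print(percent_for(200 * MB, drive), "percent would be enough")
```

Under these assumptions 1% is 2GB, while the 200MB wished for corresponds to
only a tenth of a percent.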

> > Of course it has to be stable enough - no random file loss, crashes, oopses
> > etc.

XFS + crash = garbage in files. Especially with power loss.
Somehow it damages even the files which are only *read* from if there are
intermittent writes. Just pull the plug during the system start and shock
horror you can't boot anymore. I've seen this behaviour at least three times.
(in various kernel versions, last one - 2.6.11.something, I've not yet seen
it in any newer, but I didn't try that hard at destroying the FS)
Lost files included grep, sed, ls, [ and other utilities.
Also there's always garbage in recently modified files.
It doesn't survive a simulated human error (wiping out the first 50 MB of the
partition); the filesystem is irreparable afterwards.

ReiserFS is better, but it writes garbage to modified files in the event of a
crash. When used with the "ordered" option it is probably quite stable, but
at the cost of performance. I haven't tried that yet.

Reiser4 is bullet-proof. It even survived a "human error" test. ;) 
Of course checking it took about an hour then, but it worked.
Only some 100 files were lost anyway. Nothing critical.

Ext3 is also rock solid, but it can't restore filenames after a similar
test. (of course I had to use a backup superblock) 
Probably design misfeature or a bug in e2fsck.
Everything lands in lost+found with some garbage instead of file names.
However, its "ordered" mode is quite secure, and for the more paranoid
there's also the slow journal mode. Even with writeback it's very resilient
and even quite fast. But nowhere near the next one...

Ext2 is probably as stable as Ext3, and e2fsck looks really good (except for
the above problem). However, I still have to really beat on it. I expect
garbage in any file that is being written to at the time of a crash.

JFS - not tested.

I don't know why journals are considered such a good thing. Unless your
machine crashes all the time or is a high-availability server that needs
minimal downtime, either the protection they give is low most of the time or
they are sluggish. Probably a case of bad implementation.

I'd like to see Ext3 with wandering logs (like Reiser4). They help at least
with file creation and also reduce seeking.

> i think the best you can do is test some filesystems yourself... most 
> benchmarks are to synthetic, i think.

The major problem with testing it "the usual way" is that it takes *a lot* of
time. I have to: set up a test image on the second disk (done), time copying
it to the filesystem (not done), run some scripted workloads (some done)...

Some interesting ones aren't even truly scriptable!

Interbench suggestion:
The only major thing I'd like to see in Interbench is simulating the
workloads fully - with disc reads. (in case of Audio, Video, Burn and
Gaming) Optional of course.

Audio - small, 5MB file (average MP3), sequential, low required minimal
throughput, low latency - no buffering (except filesystem cache)
Video - 700MB file, required minimal throughput (or else dropped
frames), required low latency - no buffering
Burn - similar to Video, can be high latency, different CPU usage.
Gaming - I don't know, maybe some random reads from small files? Also
requires minimal throughput, I'd say one third of Video.
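
A rough sketch of what such a disc-backed load could look like; this is not
Interbench code. It reads a file sequentially at a fixed required rate with
no read-ahead buffer of its own and counts every chunk that arrives after
its deadline, which is the equivalent of a dropped frame or an audio glitch.
The function name, the chunk size and any rates passed in are assumptions.

```python
import time


def stream_file(path, bytes_per_sec, chunk_size=64 * 1024):
    """Read path at a fixed rate; return (chunks, missed, worst_latency)."""
    interval = chunk_size / bytes_per_sec  # time budget for one chunk
    chunks = missed = 0
    worst = 0.0
    with open(path, "rb", buffering=0) as f:
        next_deadline = time.perf_counter() + interval
        while True:
            start = time.perf_counter()
            data = f.read(chunk_size)
            now = time.perf_counter()
            if not data:
                break  # end of file
            chunks += 1
            worst = max(worst, now - start)
            if now > next_deadline:
                # The chunk came too late: count it and resynchronise.
                missed += 1
                next_deadline = now + interval
            else:
                # Simulate the player consuming the chunk at the fixed rate.
                time.sleep(next_deadline - now)
                next_deadline += interval
    return chunks, missed, worst
```

Audio would use a low rate on a small file and Video a higher rate on a
700MB file. Burn could use the Video rate while ignoring worst_latency, as
long as missed stays at zero.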

-- 
AstralStorm

GPG Key ID = 0xD1F10BA2
GPG Key fingerprint = 96E2 304A B9C4 949A 10A0  9105 9543 0453 D1F1 0BA2
Please encrypt if you can.



_______________________________________________
ck@
ck mailing list. Please reply-to-all when posting.
If replying to an email please reply below the original message.
http://bhhdoa.org.au/mailman/listinfo/ck


----- End forwarded message -----

-- 
WBR, Konstantin	      chat with ==>ICQ: 109916175
     Lepikhov,	      speak  to ==>JID: lakostis at jabber.org
aka L.A. Kostis       write  to ==>mailto:lakostis at pisem.net.nospam

...The information is like the bank... 			  (c) EC8OR