On Thursday 08 December 2005 00:58, Jon Masters wrote:
> > The Linux cache is write-through so these calls were observed to be slow
> > under Linux too and enabling the short op cache fixed the problem. From
> > then on, the shortopcache has been enabled by default.
>
> I agree with that. In the testing I did it was reading and writing
> very large files sequentially (in line with the requirements) but I can
> see the possible problem.
>
> > 2) Perhaps only using the short op cache for write operations would be
> > the best way to do things under Linux?
>
> That would be the best thing to do. Otherwise we just waste time on
> reads when the page cache will get populated by YAFFS2 anyway after a
> readpage.
>
> > > * YAFFS2 memory allocation using kmalloc does not work on very large
> > > devices and needs to use vmalloc instead in those cases (>2GB
> > > devices). The lack of checking for success proves to be a problem.
> >
> > I think this only impacts on the creation of the huge chunk bitmap
> > structure. If so, this was dealt with in
> > http://www.aleph1.co.uk/cgi-bin/viewcvs.cgi/yaffs2/yaffs_guts.c?r1=1.20&r2=1.21
> > Andre tested this, IIRC, and this fixed the problem.
>
> His hack would seem to fix that problem.
>
> > Is more required?
>
> I think YAFFS2 wants to decide how it is allocating memory. We have a
> limit on vmalloc space too (though it's pretty big) so getting away
> from unbounded allocations and having smaller buffers may become
> necessary on very large devices.

The Bluewater (Andre's) hack, IIRC, substituted vmalloc for kmalloc on a
global basis and seemed to work for them (I live in the same town and am
personal friends with some of these guys so I tend to hear about probs
quickly :-)).

I did have a concern that vmallocing might be more limited or more
ponderous than kmalloc, so I changed the strategy. As it is now, I try
kmalloc first, but in the one or two places where it can fail because of
the 128kB size limit, I fall back to vmalloc if kmalloc fails (rough
sketch below). This, I think, is the best approach since it still uses
kmalloc for the bulk of allocations (tnodes etc).

For Linux, I guess looking at slab allocation rather than self-managed
buffers might be a good idea in the future.

> > Yes, definitely the handling of alloc failures is a bit sloppy.
>
> That is the main problem - you don't know things are failing until you
> guess that's what is happening (reading comments along the lines of
> "we should probably check if this fails" was helpful, I'll grant).
>
> > > * YAFFS2 has various internal usage of types which makes it difficult
> > > to scale to >2GB devices. We have to divide up into multiple
> > > partitions.
> >
> > Can you give some details? I would like to fix this. There are some
> > places where ints are being used where off_t would be correct.
>
> That sort of thing. I started doing wholesale replacements but YAFFS2
> is corrupting kernel memory and causing untold troubles when devices
> are over 2GB. There seem to be a few places that I missed and I didn't
> have a continued brief to look at it - certainly I'd go through and
> fix this use of ints (and typecasts).

The code is pretty much layered: at the file level it works in bytes,
below that it uses a chunk model, and below that again (when talking to
mtd) it uses bytes once more. It probably makes sense to use off_t for
byte addresses (ie. the vfs and mtd interfacing), and ints are probably
OK for chunks for a while yet (2^31 chunks == 4TB or so). When YAFFS was
first written 32MB was big and 128MB was huge :-).
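To make the allocation strategy concrete, the fallback looks something
like this (a sketch only - yaffs_alloc_big and yaffs_free_big are
made-up names, not what is in CVS):

#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>

/*
 * Sketch: try kmalloc first, fall back to vmalloc for the one or two
 * allocations that can exceed the ~128kB kmalloc limit. GFP_NOFS so
 * we don't recurse into the filesystem under memory pressure.
 */
static void *yaffs_alloc_big(size_t size)
{
	void *mem = kmalloc(size, GFP_NOFS);

	if (!mem)
		mem = vmalloc(size);
	return mem;
}

/*
 * Free with the matching routine. vmalloc memory lives in its own
 * address range (VMALLOC_START/VMALLOC_END, from the arch headers),
 * so the pointer itself tells us which allocator was used.
 */
static void yaffs_free_big(void *mem)
{
	unsigned long addr = (unsigned long) mem;

	if (!mem)
		return;
	if (addr >= VMALLOC_START && addr < VMALLOC_END)
		vfree(mem);
	else
		kfree(mem);
}

The address test means callers never need to remember which allocator
succeeded, and the bulk of the allocations (tnodes etc) never hit the
vmalloc path at all.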
> > The chunkGroupBits issue also has impact on this.
>
> > > * Andre Renaud latched onto a problem which I then rediscovered in
> > > performance testing. Having chunk groups of 16 reduces performance by
> > > at least 50% but in practice can be much higher. By applying a
> > > version of his patch, I was able to reduce read time for a 50MB file
> > > from 27 seconds to around 15 seconds and have achieved sustained reads
> > > at 22.2Mbit/s on multi-GB devices reading many hundreds of MBs.
> >
> > I have written some code (minor testing so far, more testing and checkin
> > within 24 hours I hope) which should fix this.
> >
> > This code uses variable size bitmaps to fit the required bit width, thus
> > eliminating chunkgroups, but does not use as much RAM as the Bluewater
> > hack.
>
> I saw your postings. I think that is a *much* better idea since it
> will increase performance by 50-100% for some people. I combined that
> hack with a couple of other fixes and a DMA-enabled MTD to push
> performance to over 200% of what it was when I started working on it.
>
> > > * YAFFS2 makes use of some additional reads and memcpy's which don't
> > > seem entirely necessary - by combining and changing some of the logic
> > > it looks like we could get another 10% performance gain.
> >
> > Very much look forward to more info on this.
>
> OK. I'll look into that. There are several times where we call the MTD
> read where once would do (with some extra logic) and a few memcpy's
> where I think Linux could deal with a direct pointer instead (the MTD
> layer should handle the caching issues and memory coherence problems
> by doing any additional copies).
>
> > The WinCE stuff has some extra copying (that is actually no longer
> > required and will be eliminated). I hoped the Linux stuff was not doing
> > too much extra work.
>
> Not too much, but I took out one extra read (I'll track it down) and
> got a speed bump of around 5-10% in one go. A few more of those (it's
> worth someone sitting down and poring over this code if there is
> justification) and we've got free extra speed. Certainly YAFFS2 is
> now approaching the raw NAND performance when reading and writing
> through /dev/mtd/blah and that is the goal.

If you can provide a patch or annotated C file I will be most grateful.
While I will go through this all at some stage (soon I hope), it is
always good to have someone "look over your shoulder".

-- Charles
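P.S. To give a flavour of the variable width bitmap change ahead of the
checkin: each tnode slot stores a chunk id in just enough bits to cover
the device, instead of a fixed 16 bits plus a chunk group. Roughly this
(a sketch with made-up names, not the actual code, and assuming fewer
than 2^31 chunks):

#include <linux/types.h>

/*
 * Bits per tnode slot: just enough to hold any chunk id on this
 * device. Computed once at mount time.
 */
static unsigned tnode_width(unsigned n_chunks)
{
	unsigned w = 0;

	while ((1u << w) < n_chunks)
		w++;
	return w;
}

/*
 * Read the width-bit chunk id stored at slot 'pos' in the packed
 * bitmap, least significant bit first.
 */
static u32 tnode_get(const u8 *map, unsigned pos, unsigned width)
{
	unsigned bit = pos * width;
	u32 val = 0;
	unsigned i;

	for (i = 0; i < width; i++, bit++)
		if (map[bit >> 3] & (1u << (bit & 7)))
			val |= 1u << i;
	return val;
}

/* Write the chunk id for slot 'pos'. */
static void tnode_set(u8 *map, unsigned pos, unsigned width, u32 val)
{
	unsigned bit = pos * width;
	unsigned i;

	for (i = 0; i < width; i++, bit++) {
		if (val & (1u << i))
			map[bit >> 3] |= (u8)(1u << (bit & 7));
		else
			map[bit >> 3] &= (u8)~(1u << (bit & 7));
	}
}

A real version wants to work a word at a time rather than bit by bit,
but this is where the chunkgroups go away: the stored id is exact, so
there is no group of 16 candidate chunks to scan on every lookup.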