EROFS File System Introduction

EROFS File System Introduction

EROFS file system Gao Xiang <[email protected]> Self-introduction & Background 1. Graduated from university and joined HUAWEI for 3 years, working for OS !ernel Lab %ept., mainly fo'using on #inux )"e system ●im&rovement for HUAWEI mo$ile phones+ ● ,-, su&&ort . bugfi(/ New solutions of requirements from our &rodu'ts+ 12HUAWEI sd'ardfs; -2ERO, / 32Other useful so"utions w*i'* are under deve"o&ment... -. User availa$"e s&ace becomes tig*t wit* t*e increase of Android RO4, especial"y for our low5end devices (some of t*em are stil" wit* 17GB tota" storage, saving 19-GB more storage spa'e is usefu" for users wit* t*e hel& of ERO, 2/ 3. ome solution to leave more spa'e and maintain *ig*5 &erforman'e to end users: 3 Questions 1. What kind of fi"e types 'an be 'ompressed pra'ti'a""y? and w*ere are t*ey stored in: ❌ a. <*otos, <i'tures,❓ 4ovies? 6no gain to 'om&ress "oss"ess"y again2 $. =e(t materia"?❌ ( 'ompressible $ut t*e total amount de&ends on end-user be*aviors2 c. %atabase: ( some &art of overwrite IO &atterns 'ou"d ki"" t*e✔️ &erforman'e 2 d. 8I0s, s*ared "ibrary, some A<!s, &re"oad con)guration )"es? (most"y read5on"y )"es2 -. Is t*ere some so"ution to reso"ve our 'om&ression re1uirements: s1uas*fs is not good enoug* for real5time de'om&ression (es&ecia""y for em$edded devices wit* "imited memory, su'* as mo$ile &*ones2 sin'e 1. e(tra memory over*ead/ 2. noti'eab"e read am&"ifi'ation 6> based on its default 12?!8 $"ock size) 3. some metadata &arsing *ave to be done syn'*ronous"y limited by its on5disk design. 3. =*e &erformance 'an be even better wit* 'om&ressing effectively in addition to reso"ving t*e above squashfs issues, as shown in the fo""owing pages. ; Fixed-output compression Advantage: 1. improved storage density (hig* C3 -D less IO for most patterns -D read performan'e improvement2/ -. decompression in5&"a'e (no mem'&y2 com&ared with )(ed-input 'om&a'ted compression/ 3. 'an be on5disk com&atib"e wit* b"o'k boundary a"igned compression (eg. 8trfs2 in order to ar'hive zero IO read am&"i)'ation (However, in my opinion, it depends on a'tua" IO patterns and "ess usefu" for fi(ed-output decompression sin'e a"" compressed data in t*e b"o'k can be compressed.2 #m LmE1 #m+- #mE3 #m #m+1 #mE- #m+3 #m #mE1 #mE- <n <nE1 <nE- <nE3 Pn PnE1 <nE- <nE3 Pn Pn+1 PnE- blo'k $oundary 'om&ression 6$trfs2 )(ed5in&ut 'om&a'ted com&ression (s1uashfs, 'ramfs2 it is only useful if C3 DD - 6more t*an BFG oA2, )(ed5out&ut 'om&ression 6erofs2 it is toug* for ;k 'om&ress $lo'k B Comparsion of erofs/squashfs/btrfs image sizes Support 4k compressed cluster only in current erofs kernel implementation due to our requirement: dataset testcase output size CR enwikH enwikH 1,FFF,FFF,FFF8 1 enwikHI;k.s1uashfs.img 7-1,-11,7;?8 1.71 enwik9_4k.erofs.img 558,133,248 1.79 enwikHI?k.s1uashfs.img BB7,1H1,N;;8 1.?F enwikHI1-?k.s1uashfs.img 3H?,-F;,H-?8 -.B1 657,608,704 enwikHI1-?k.$trfs.img> 8 1.B- si"esia.tar si"esia.tar -11,HBN,N7F8 1 si"esia.tarI;k.s1uas*fs.img 11;,B-;,17F8 1.?B si"esiaI?k.s1uashfs.img 1F7,FH;,BH-8 -.FF si"esia.tar_4k.erofs.img 1#5,2$!,2## 2.01 si"esia.tarI1-?k.s1uashfs.img ?1,N7?,;;?8 -.BH 2.03 si"esia.tarI1-?k.$trfs.img> 1F;,7-?,--;8 'om&ressed wit* "@; 1.8.3 with 'ommand "ines+ 12 mksquashfs enwikH enwikHI;k.s1uashfs.img 5'omp "@; 5J*' 5$ ;FH7 5noappend -2 mkfs.erofs 5@"@;*' enwikHI;k.erofs.img enwikH > for $trfs, mounting with Knodatasum,'om&ress5for'eL"@o”, observed with 'ompsi@e, thus no metadata at all. 7 Fixed-output compression (cont.) =ypica""y, t*ere are 2 ways for compress file system to decom&ress+ 1.read a"" 'om&ressed data into bdev page 'a'*e (like all metadata do2 and decompress O squas*fs/ 2.read com&ressed data to tem&orary $uAer and de'om&ress ⇒ e.g., $trfs. ●However, bot* 2 ways takes e(tra over*ead sin'e 3e'"aimed ,or 1), it 'ou"d 'ause &age 'a'*e t*ras*ing, imagine a$out B0% 'om&ressed data of the origina" )"e si@e are added into &age 'a'*e ina'tive #3U "ist at "east for em$eded devi'es, 'om&ressed $"ock needed to read many of t*em are used5once or at a very "ow fre1uency/ furt*ermore, it’s *ard to de'om&ress dire'tly if some of 'om&ressed &ages are re'"aimed. ● ,or -), it 'an *ard"y use t*e remaining 'om&ressed data since tem&orary $uAer 'ou"d $e freed immediate"y. ,ixed-output compression can make fu"" use of data in 'ompressed b"o'k in prin'ip"e (a"" com&ressed data can be de'ompressed if needed.2 N Our solution ERO, *as de'om&ression strategies in - dimensions+ %Dimension 1' Cached or In5&"a'e IO decom&ression ● Ca'*ed de'om&ression 'urrently for in'om&"ete 'om&ression read / ● In5&"a'e IO de'om&ression for 'om&"ete 'om&ression read. KComp"ete or in'omp"ete” means whether or not data in the 'ompressed $"ock are a"" re1uested. =*e &oli'y 'an $e re)ned "ater since for sma"" C3 it 'an $e fu""y de'om&ressed in advance since it takes t*e simi"ar amount of memory if t*e 'om&ressed data are 'a'*ed in memory. %Dimension 2' yn' or async de'om&ression ● yn' de'om&ression for sma"" num$er of &age re1uests, w*i'* de'om&ress in its 'a""er thread/ ● Asyn' de'om&ression for "arge read re1uests, w*i'* de'om&ress in work1ueue 'onte(t. =*e &oli'y 'an $e re)ned "ater as we"", since erofs 'annot make diAerence $etween asyn'Qsyn' reada*ead in .read&ages(). ? In-place IO decompression Why we need t*is? main"y for t*ree reasons+ 1.doing ca'hed decompression on"y cou"d cause ca'*e t*rashing/ -.'onsidering many read requests are pending for IO ending, temporary $uAers cannot be freed. A""o'ating immediate buAers wil" cause more memory rec"aim as wel" com&ared to un'ompressed generic file system su'* as e(t; O espe'ia""y for HUAWEI camera heavy memory work"oad/ 3.it wil" be later used for decompressing in5&"a'e. In this way, compressed data are read in its last decompressed pages as mu'h as possib"e, as il"ustrated below+ )"e &ages #m #m+1 #mE- R Ln68&P2 %ire'tly read 8& into #n 'om&ressed '"usters 8& H Decompression In-place As mentioned before, decom&ression in5&"a'e can be im&"emented wit* fixed- out&ut de'ompression, w*ich is hard for lega'y fixed-in&ut de'ompression. some margin )"e &ages #m #m+1 R Ln68&P2 IO submission read 8& dire'tly into #n 'om&ressed $"ocks 8& <er5C<U some margin buAer de'om&ress input decompression #m #m+1 R Ln68&P2 )"e &ages #m #m+1 R Ln68&P2 de'om&ress output 1F Limited bounced buffers in'e a"" "@5$ased compression a"gorit*ms use s"iding5window tec*no"ogy, E3OF can use limited (7;!8 for l@;2 boun'ed $uAers, whic* minimi@e memory 'onsum&tion as wel". Limited $oun'ed $uffers is a"so im&"emented in new decompression ba'kend. before #m #m+1 #mE- #m+3 R #m+17 #mE18 #mE19 R Ln68&P2 after #m #mE1 #mE- #mE3 R #m+1N #m+18 #m+19 R #n68&P2 11 On-disk brief introduction • • #ittle5endian • 4k $lock si@e 'urrently 6no$*2/ 4i(ed metadata with data 6In other words, • Se(ible enough for mkfs to &lay with it)/ • 32 6v12 or 7; 6v-25$yte inode $ase si@e/ 7; bit + ns timestam&, ea'* )"e 'an *ave • its own timestam& for v-/ • u&&ort tai"5end data in"ine; • u&&ort JA==R, PO IXAC# 65.-E2/ • u&&ort stat( 65.3E2/ • u&&ort 'om&acted inde(es 6B.3+)/ *ttps:QQgit.kernel.orgQ&u$Qs'mQlinu(Qkerne"QgitQtorvalds/linu(.gitQtreeQdrivers/ stagingQerofs/%ocumentationQ)lesystems/erofs.t(t:*Lv5.1 1- Microbenchmark C<U: Intel632 Core(=42 i55?-BFU C<U @ 1.6FGHz (; cores, 8 t*reads2 %%3: 8G D: I0TEL S %<E!!,37FGNHenwik9 FIO (psync, 1 thread) 800 700 600 500 erofs_4k squashfs_4k squashfs_8k 400 squashfs_16k squashfs_128k 300 200 100 0 seq rand rand1m 13 Real app launching C<U: M=7N7B (? Corte(5AB3 cores2 D%3: 2G8 eMMC: 3-G8 App. # 1 2 3 4 5 6 7 8 9 10 11 12 13 w/o FIO workload -16.4 -3.5 +4.2 -4.0 -7.5 -1.4 -6.8 +6.3 -2.2 -18.4 -3.3 -7.6 -4.5 w/ FIO workload -2.8 -12.9 -5.4 +3.9 -7.6 +3.7 +4.4 -2.6 +9.9 +4.0 -11.1 -10.3 -15.1 Relative boot time of thirteen applications.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    20 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us