<<

Oversubscribing on Embedded Platforms

By Donald Percivalle (CPE) and Scott Vanderlind (CSC) Senior Project California Polytechnic State University San Luis Obispo Dr. Zachary Peterson

June 11, 2015

Abstract For most computers running the popular , the inte- grated kernel component inotify provides adequate functionality for monitor- ing changes to files present on the filesystem. However, for certain embedded platforms where resources are very limited and filesystems are very populated (like network attached storage (NAS) devices), inotify may not have enough resources to provide watchers for every file. This results in applications missing change notifications for files they have watched. This paper explores methods for using inotify most effectively on embedded systems by leveraging more la- tent storage. Benefits of this include a reduction in dropped notifications in favor of an introduced delay on notifications for files that are less frequently changed. Contents

1 Introduction 3 1.1 Application ...... 3 1.2 Problem ...... 3 1.3 Problem Statement ...... 4

2 Possible Solutions 5 2.1 Modification of inotify ...... 5 2.2 Wholesale replacement of inotify ...... 5 2.3 Development of user-space watch aggregator ...... 5 2.4 Chosen Approach ...... 6 2.5 Related Work ...... 6

3 Design 6 3.1 Constraints ...... 6 3.2 Implementation ...... 7

4 Takeaways 8 4.1 Solution Viability ...... 8 4.2 Future Work ...... 8

5 Reference Implementation 9 5.1 Driver Application ...... 10 5.2 Watch Aggregation Module ...... 12 5.3 Directory Analysis Module ...... 25

2 1 Introduction as possible. This goal requires a system to monitor the entire filesystem for new Western Digital produces lines of Net- files. work Attached Storage (NAS) devices that are built around an embedded Linux 1.1 Application computer. This embedded system acts as the brains of the operation: It inter- Historically, applications on Linux-like faces with the hard disk drives, manages systems have tracked changes to files us- the filesystem, and provides all the ser- ing a kernel API called inotify. Inotify vices that you’d expect in a NAS, like provides applications with the means to AFP, SMB, NFS, etc. NAS appliances subscribe to filesystem events at a folder that strictly provide filesystem services level: Applications can be notified of files are pretty cut-and-dry, however, West- that have been created, modified, and ern Digital also supports a suite of mobile deleted. This can be useful to many types and desktop applications to interface with of applications, especially those providing these NAS devices that require additional media sharing capabilities. For instance, services to be provided, like DLNA and when a media service is notified of new iTunes library sharing and media ”cloud” file placed into a watched directory, that integration that makes media available service might examine the file for validity through an iOS or Android app. and make its contents available through Targeted towards both consumers and some other abstracted API. businesses, Western Digital sells My Western Digital’s devices provide Cloud devices with storage sizes ranging more advanced media sharing services from 2TBs to 24TBs. It is intended to that synchronize file state and maintain provide a seamless NAS experience where parity between multiple devices. In do- the end user can’t tell if they are inter- ing so, the services running depend heav- facing with local storage on their client ily on knowledge about the state of the machine, or on their My Cloud which filesystem as provided by inotify. could be across the Internet. Along with seamless reads and writes, Western Dig- 1.2 Problem ital aims to provide an always-on me- dia server environment for My Cloud cus- The processes running on the embedded tomers. If a customer uploads a movie or system in the My Cloud NAS depend on an album of photos to their My Cloud, knowledge of the state of a large tree of the goal is to have the iOS and Android directories. In the native implementation apps, as well as any DLNA or media of inotify, a watch is created for each file servers running locally on the My Cloud or directory that needs to be monitored. itself be notified of the new file as soon When a directory is being watched, the

3 events that trigger a notification include a plication processes desire to watch for file creation, update, or delete within that changes in the same directory (for in- directory. However, events that occur in stance, DLNA and iTunes both watch- subfolders of that watch do not trigger ing a directory called media), they both events. It is therefore necessary to create may allocate (and expend) a watch for a watch for every folder in a tree should that file. In practice, this limit is easily you want to subscribe to all file events. raised by tweaking the contents of the file The first hurdle is the hard limit /proc/sys/fs/inotify/max user watches. of watches allowed to be allocated by Let’s consider the anatomy of an each user on the system. By default inotify watch struct, one of which is this limit is set as 524,288, and there- created for each file and is registered to an fore limits the number of files eligible for inotify watcher (a glorified file descrip- concurrent watching at 524,288. The- tor), the parent struct that aggregates a oretically, this limit is further lowered set of watches in the context of an appli- by redundant watches: Should two ap- cation.

1 /∗ Excerpt from include/linux/inotify.h:L20 ∗/ 2 struct i n o t i f y w a t c h { 3 struct l i s t h e a d h l i s t ; /∗ entry in inotify handle’s list ∗/ 4 struct l i s t h e a d i l i s t ; /∗ entry in ’s list ∗/ 5 atomic t count ; /∗ reference count ∗/ 6 struct i n o t i f y h a n d l e ∗ ih ; /∗ associated inotify handle ∗/ 7 struct inode ∗ inode ; /∗ associated inode ∗/ 8 s 3 2 wd ; /∗ watch descriptor ∗/ 9 u32 mask ; /∗ event mask for this watch ∗/ 10 } ;

Considering the structures that a sin- ing is that this theoretical maximum does gle inotify watch contains, allocating a not give any allowances for other applica- single watch requires (on a 32-bit system) tions using memory. 540 bytes of kernel memory. Unfortu- nately, the My Cloud EX2 device only has 500MB of memory available to the entire 1.3 Problem Statement system. Assuming the watch limit men- It is not feasible for multiple applica- tioned above has been raised, this allows tions to simultaneously be running on for 940,740 simultaneous watches. This low-power integrated systems while also limit cannot be raised without adding monitoring sufficiently large file systems more memory to the system, as kernel for changes (with ’large’ meaning more memory cannot be swapped. Worth not- than 940,740 folders).

4 2 Possible Solutions 2.2 Wholesale replacement of inotify A number of solutions have presented themselves. Let’s examine them. While inotify is bundled with Linux (in- cluding the distribution used on My Cloud devices), it is by no means the end-all solution to file monitoring across 2.1 Modification of inotify all platforms. Certainly commercial op- erating systems like Mac OS and Win- The most obvious solution is to modify dows have developed respective analogs, how inotify works so that it is inherently as have other distributions of linux-like recursive. The immediate benefit is that and -like operating systems. The only a single watch would be required on barriers to porting another system from the root folder of the file tree to be mon- a fellow POSIX compliant OS are theoret- itored. Unfortunately, this avenue has ically low. been explored and decided unfit for any However, no system exists that is system, let alone embedded ones: sufficiently appealing. *BSD kernels (FreeBSD, OpenBSD, NetBSD, et al.) provide a module called , which In many filesystems, includ- has certain advantages over inotify but ing Unix-style ones, recurs- is similarly not recursive. A potential ad- ing down a directory hierar- vangage is that kqueue can provide sub- chy isn’t a first class opera- scribers with notifications for all filesys- tion. [...] You list the con- tem events, starting at the root directory tents of a directory and for instead of a specific folder. This is akin each content that is itself a di- to drinking from a firehose while stand- rectory, you transcend into it, ing next to a water fountain, and leads and repeat. [...] to much application-level overhead just to filter through irrelevant events. You could replicate this algo- Other alternatives have similar rithm in the kernel [...] but enough limitations as to obviate the need the performance of the system to discuss them. would suffer. This is much too long-running an operation to perform inside the kernel, 2.3 Development of user- particularly if you were to do space watch aggregator so while holding the appropri- ate locks. - , ino- The final option considered is a user- tify author[2] space watch aggregator and event sim-

5 ulator. This hybrid system depends on search and indexing tool) in 2004. The re- inotify for reporting file change events sult of his investigations is an algorithm for as many files as possible, and uses dubbed the Love-Trowbridge Recursive manual scanning of file meta-data at set Directory Scanning Algorithm[1]. The intervals to find the events that were purpose of this algorithm is to ensure that missed by inotify. The biggest advan- races do not occur when indexing the file tage of this approach is platform flexibil- system. ity: More memory will of course allow for Love-Trowbridge specifies the order in more watches, but less memory will not which inotify setup tasks must be per- necessarily limit the number of files that formed while visiting nodes in a file tree can be tracked. so that events occurring before a scan has The biggest drawback of this ap- completed are still handled. This gener- proach is time sensitivity: Because a res- ally amounts to creating event handlers can is required to find changes in files not for a directory after visiting it but before monitored by inotify, there is a consid- visiting any of its children. erable delay between the event occurring Adherence to Love-Trowbridge in the and the event being reported to the sub- developed approach may be beneficial, scriber application. Still, this solution but not necessary, as the periodic scan- lends itself well to embedded NAS sys- ning of directories will find and simulate tems where RAM is expensive, disk space any missed events. is cheap, and event response time is not critical. 3 Design

2.4 Chosen Approach Let’s investigate the design of a proof-of- concept system that implements a solu- The chosen approach is the third: Devel- tion to the problem. opment of a user-space watch aggregator. The application produced will only serve as a proof-of-concept, but will surely pro- 3.1 Constraints vide important insight into the properties that need to be considered for future de- When considering a design, there are sev- velopment. eral assumed constraints that are oper- ated under. The aggregator must have:

2.5 Related Work • Low memory footprint

Relevant work has been done by Robert • Modular prioritization algorithm Love (inotify author) when contribut- ing to the open-source project Beagle (a • Large file tracking capacity

6 3.2 Implementation watch descriptor number. This watch de- scriptor is unique only in the context of To accomplish a large file tracking ca- the inotify watcher instance to which pacity, the aggregator can additionally the watch belongs to. Because the ag- collect watchers, which in turn collect gregator will register the maximum num- watches. Ideally, the aggregator will ber of watchers and watches for each transparently abstract away from the watcher, inotify events will no longer calling process the fact that several in- have unique watch descriptors. Watch otify watchers exist and will provide a descriptors are necessary to determine unified namespace for watch descriptors. what directory an even occurred in. From the application’s perspective, all The aggregator must replace the events from all sub-directories of a given watch descriptor of all inotify events directory should be accessible from a sin- to a globally unique descriptor before gle notification descriptor. passing the event to the calling process. The aggregator will begin recursing Further, the aggregator must provide a into the directory tree and will use some mapping of the globally unique watch algorithm to assign each directory a nu- descriptors to directory paths in order merical priority. The system will exhaust for the calling process to resolve where all available inotify resources on the di- an event took place in the filesystem. rectories with the highest priority. When To accomplish this, the aggregator will all available watches are consumed, the record all watchers’ unique descriptors (a aggregator will collect all watches into a file descriptor) and all of each watcher’s fd set and will use that set in a call to watches’ watch descriptors as they are the system function (2). The ag- registered. It will assign each unique gregator will return to the calling process (watcher, watch descriptor) pair a glob- a file descriptor to which all events from ally unique number. This number will be all the watchers will be written. mapped to the directory path that that At the discretion of the calling pro- watch is registered to in a list of (global cess, the aggregator will periodically res- identifier, path) pairs. That list will be can and reallocate inotify watches uti- given to the calling process in order to do lizing the same scoring algorithm to ef- event location look ups. With these list fectively monitor the most important di- of mappings, the aggregator will be able rectories in the filesystem. As part of the to translate the watcher-unique watch de- rescan process, it will also detect and re- scriptor to the globally unique identifier create missed events across the directory of events read utilizing the select(2) tree. call described above before passing the Inotify events report, among other event to the calling process. Figure 1 dis- things, the type of filesystem event, the plays this process of identifier translation filename the event occurred on, and a and event passing. 7 Figure 1: High level system design

4 Takeaways between the event occurring and the sim- ulated event being created can be tuned 4.1 Solution Viability by changing the interval of rescan, how- ever it is important to consider that the rescan interval has a linearly inverse re- The described solution accomplished all lationship with the amount of hard disk of the requisite goals when implemented access operations generated by the appli- as a proof-of-concept. A reference appli- cation. cation was devised to monitor a file tree, and system resources were artificially lim- ited in order to characterize the behavior 4.2 Future Work of the system. The results were favorable: Files that were watched had their events Because the solution showed such promis- immediately forwarded to the monitor ing results, there is much opportunity for application. Files that did not make the future work. The user-land solution can cut had events simulated on their behalf only provide notifications to one client when the next rescan occurred. The delay application, which is less than desirable

8 in systems like the My Cloud where many Additionally, it may be beneficial to services require tracking the same files. explore incorporating functionality from Ideally, the final solution will expose the file monitoring API fanotify, which itself as an API to other applications. It is also part of the . fanotify will act as a middleware that allows appli- can, under certain circumstances, pro- cations to request the watches that they vide recursive monitoring of file systems. need without wasting any watches on files However, fanotify does not provide the that are already watched. same set of functionality as inotify and A possible approach is to build the so- was therefore not considered as a whole- lution into a pair of kernel modules. One sale replacement candidate[3]. That said, module would expose an API to applica- it could be incorporated in a way that tions wishing to subscribe to file events, would play to the strengths it does have. and another module would provide the scoring algorithms to determine watch priority. This configuration would allow 5 Reference Implemen- for changing scoring at any time by in- tation stalling a different module. Since this method of oversubscribing Following is a proof-of-concept reference can operate with any number of watch- driver application and aggregator library. ers, it’s possible to keep kernel memory This application is intended to show how usage within a reasonable amount while a real application would utilize the so- still enjoying the benefits of increased file lution to oversubscribe the inotify sys- monitoring capacity. tem.

9 5.1 Driver Application

1 /∗∗ 2 ∗ Test program for My Cloud inotify 3 ∗ Recursively watches a directory for events. 4 ∗/ 5 6 #include

7 #include 8 #include 9 #include ” m y c l o u d inotify .hpp” 10 11 #define NUMWATCHERS 128 12 #define NUMWATCHES 524288 13 14 using namespace std ; 15 16 void terminate ( int s i g ) { 17 cleanup watchers(); 18 e x i t ( s i g ) ; 19 } 20 21 int main ( int argc , char ∗∗ argv ) { 22 char ∗ buf = ( char ∗)calloc(BUF SIZE, sizeof ( char )); 23 int i, len, watcher; 24 struct pollfd fds[1]; 25 struct i n o t i f y e v e n t ∗ event ; 26 m y c l o u d i n o t i f y t i n o t i f y i n s t a n c e ; 27 28 (SIGINT, terminate); 29 i f ( argc < 2) { 30 p r i n t f ( ”Usage : t e s t \n” ) ; 31 return 1 ; 32 } 33 34 m y c l o u d i n o t i f y i n i t (NUM WATCHERS, NUM WATCHES, argv [ 1 ] , 35 i n o t i f y i n s t a n c e ) ; 36 watcher = i n o t i f y instance .monitor; 37 f d s [ 0 ] . fd = watcher ; 38 fds[0].events = POLLIN; 39 cout << ”Monitor pid: ” << i n o t i f y instance .monitor pid << ’ \n ’ ; 40 41 // Read from inotify fd 42 s l e e p ( 5 ) ; 43 r e c r a w l (NUM WATCHERS, NUM WATCHES, i n o t i f y i n s t a n c e ) ; 44 watcher = i n o t i f y instance .monitor;

10 45 while ( 1 ) { 46 i = 0 ; 47 memset ( buf , 0 , BUF SIZE); 48 49 // Block until ready 50 p o l l ( fds , 1 , −1) ; 51 (watcher, FIONREAD, &len); 52 len = read(watcher, buf, len); 53 while ( i < l e n ) { 54 event = ( struct i n o t i f y e v e n t ∗) ( buf+i ) ; 55 i f ( event−>mask == IN CREATE) 56 cout << ” [CREATE] in : ” 57 << i n o t i f y instance.watch descriptors [event−>wd ] 58 << ” on : ” << event−>name << ’ \n ’ ; 59 i f ( event−>mask == IN DELETE) 60 cout << ” [DELETE] in : ” 61 << i n o t i f y instance.watch descriptors [event−>wd ] 62 << ” on : ” << event−>name << ’ \n ’ ; 63 i f ( event−>mask == IN MODIFY) 64 cout << ” [MODIFY] in : ” 65 << i n o t i f y instance.watch descriptors [event−>wd ] 66 << ” on : ” << event−>name << ’ \n ’ ; 67 68 i += EVENT SIZE + event−>l e n ; 69 } 70 } 71 return 0 ; 72 }

11 5.2 Watch Aggregation Module

1 /∗∗ 2 ∗ m y c l o u d inotify.cpp 3 ∗ My Cloud Inotify main functions. 4 ∗ This file shall contain the functions necessary 5 ∗ to initiazlie the My Cloud inotify to set up 6 ∗ watchers . 7 ∗ 8 ∗ Donald Percivalle 9 ∗ Scott Vanderlind 10 ∗ Senior Project , 2015 11 ∗/ 12 13 #include 14 #include 15 #include 16 #include 17 #include 18 #include 19 #include 20 #include 21 #include 22 #include 23 #include 24 #include 25 #include 26 #include 27 #include 28 #include 29 #include 30 #include 31 #include 32 #include 33 #include 34 #include 35 #include ” mycloud analysis .hpp” 36 37 using namespace std ; 38 39 #define WATCH FLAGS (IN CREATE | IN DELETE) 40 #define EVENT CAP 100 41 #define EVENT SIZE ( sizeof ( struct i n o t i f y e v e n t ) ) 42 #define BUF SIZE (EVENT CAP ∗ (EVENT SIZE + 50) ) 43 #define VAR FILE ”/var/MyCloudInotify” 44

12 45 // MyCloud Inotify instance descriptor struct 46 typedef struct m y c l o u d i n o t i f y { 47 int t o t a l w a t c h e s ; 48 int num dirs ; 49 int monitor ; 50 s t r i n g root ; 51 p i d t monitor pid ; 52 s e t watched ; 53 map dir map ; 54 map w a t c h descriptors; 55 } m y c l o u d i n o t i f y t ; 56 57 // Top−Level Initialization function 58 void m y c l o u d i n o t i f y i n i t ( int max watchers , int max watches, string path , 59 m y c l o u d i n o t i f y t &inotify); 60 61 // Directory crawling and scoring functions 62 int w a l k r e c u r l i s t ( const char ∗dname , unsigned long crawl time , 63 map &l i s t ) ; 64 int g e t d i r s ( const char ∗dname , map &dir map ) ; 65 void o r d e r d i r s (map &list , vector > & ordered ) ; 66 unsigned long g e t l a s t c r a w l t i m e ( ) ; 67 unsigned long update crawl time ( ) ; 68 69 // Inotify registration and monitoring functions 70 int i n i t f d w a t c h e r s s e t ( f d s e t ∗ watchers , map > &watchers watches ) ; 72 int setup watchers ( int num watchers , int num watches , 73 vector > &ordered , 74 m y c l o u d i n o t i f y t &inotify); 75 void catch up ( int w r i t e t o , unsigned long r e f e r e n c e time , string path, 76 int w a t c h descriptor); 77 void r e c r a w l ( int max watchers , int max watches, mycloud i n o t i f y t & i n o t i f y ) ; 78 void cleanup watchers(); 79 int i n i t f d w a t c h e r s s e t ( f d s e t ∗ watchers , map > &watchers watches ) ; 81 void monitor watches ( int w r i t e t o , map > & watchers watches ) ; 82 p i d t w r i t e m o n i t o r p i d ( p i d t monitor ) ; 83 p i d t g e t m o n i t o r p i d ( ) ;

13 1 /∗∗ 2 ∗ m y c l o u d inotify.cpp 3 ∗ My Cloud Inotify main functions. 4 ∗ This file shall contain the functions necessary 5 ∗ to initiazlie the My Cloud inotify to set up 6 ∗ watchers . 7 ∗ 8 ∗ Donald Percivalle 9 ∗ Scott Vanderlind 10 ∗ Senior Project , 2015 11 ∗/ 12 13 #include ” m y c l o u d inotify .hpp” 14 15 // Top−Level Initialization Functions 16 // max watchers : the maximum allowed number of watchers. 17 // max watches : the maximum allowed number of watches. 18 // path : the path to the root directory to watch. 19 // inotify : the struct holding the system state. 20 void m y c l o u d i n o t i f y i n i t ( int max watchers , int max watches, string path , 21 m y c l o u d i n o t i f y t &i n o t i f y ) { 22 vector > ordered ; 23 24 // Initialize the state structure passed in. 25 i n o t i f y . root = path ; 26 // Crawl the filesystem. 27 i n o t i f y . num dirs = g e t dirs(inotify.root.c str(), inotify.dir map ) ; 28 // Sort the filesystem. 29 o r d e r dirs(inotify.dir map, ordered); 30 31 // Set up watchers for each directory, fork a child, and begin monitorning 32 // for changes. 33 i f ((inotify .monitor 34 = setup watchers(max watchers, max watches,ordered , inotify)) == 0) { 35 cout << ”child returned \n” ; 36 e x i t (−1) ; 37 } 38 // Update the state with the pid of the . 39 i n o t i f y . monitor pid = g e t m o n i t o r p i d ( ) ; 40 } 41 42 /∗∗

14 43 ∗ Recursively visit folders and have them scored. 44 ∗ dname − The folder path to scan. 45 ∗ crawl time − THE LAST CRAWL TIME! DECEPTIVE! 46 ∗ dir map − the map inside where scanned directories should be recorded . 47 ∗/ 48 int w a l k r e c u r l i s t ( const char ∗dname , unsigned long crawl time , map< s t r i n g , 49 long> &dir map ) { 50 static int count = 0 ; 51 DIR ∗ d i r ; 52 struct d i r e n t ∗ dent ; 53 struct s t a t s ; 54 char fn [FILENAME MAX] ; 55 int l e n ; 56 int s c o r e ; 57 58 i f (!(dir = opendir(dname))) 59 return 0 ; 60 i f (!(dent = readdir(dir))) 61 return 0 ; 62 63 64 // Now do the recurse loop 65 do { 66 // If we’re actually looking at a directory... 67 i f ( ( dent−>d type = DT DIR) ) { 68 l e n = s n p r i n t f ( fn , sizeof ( fn ) −1, ”%s/%s”, dname, dent−>d name ) ; 69 fn [ l e n ] = 0 ; 70 71 // Ignore dotfiles and parent directories. 72 i f (strcmp(dent−>d name, ”.”) == 0 | | strcmp(dent−>d name , ” . . ” ) == 0) 73 continue ; 74 // Stat the dir. 75 i f (stat(fn, &s) == 0) { 76 i f (S ISDIR ( s . st mode ) ) { 77 // If we have a directory, increment the count. 78 count++; 79 // Score it while we’re here. 80 s c o r e = a n a l y z e dir(fn, crawl time ) ; 81 dir map[fn] = score; 82 // Recurse into directory 83 count += w a l k r e c u r list(fn, crawl time , dir map ) ; 84 }

15 85 } 86 } 87 } while ((dent = readdir(dir))); 88 // Close the door behind us. 89 c l o s e d i r ( d i r ) ; 90 // Return the number of dirs we’ve seen. 91 return count ; 92 } 93 94 /∗∗ 95 ∗ Crawl the filesystem and keep track of when it happened. 96 ∗/ 97 int g e t d i r s ( const char ∗dname , map &dir map ) { 98 unsigned long crawl time ; 99 DIR ∗ d i r ; 100 struct d i r e n t ∗ dent ; 101 char fn [FILENAME MAX] ; 102 int l e n ; 103 104 crawl time = g e t l a s t c r a w l t i m e ( ) ; 105 // Manually add root itself to the map 106 i f (!(dir = opendir(dname))) 107 return 0 ; 108 i f (!(dent = readdir(dir))) 109 return 0 ; 110 cout << ”Adding manual\n” ; 111 l e n = s n p r i n t f ( fn , sizeof ( fn ) −1, ”%s”, dname); 112 fn [ l e n ] = 0 ; 113 dir map[fn] = analyze dir(fn, crawl time ) ; 114 115 return w a l k r e c u r list(dname, crawl time , dir map ) ; 116 } 117 118 /∗∗ 119 ∗ Sort the unordered list of directories into a vector. 120 ∗/ 121 void o r d e r d i r s (map &list , vector > & ordered ) { 122 // Copy [dir, score] pairs into vector 123 for ( auto itr = list.begin(); itr != list.end(); ++itr) 124 ordered . push back (∗ i t r ) ; 125 126 // Sort by score. 127 sort(ordered.begin(), ordered.end(), [=]( const pair &a , 128 const pair &b) {

16 129 return a . second > b . second | | (a.second == b.second && a. first > b . f i r s t ) ; 130 }); 131 } 132 133 /∗∗ 134 ∗ Fetch the time of the last crawl. 135 ∗/ 136 unsigned long g e t l a s t c r a w l t i m e ( ) { 137 int fp ; 138 unsigned long crawl time ; 139 140 crawl time = 0 ; 141 i f ((fp = open(VAR FILE , O RDONLY) ) > 0) { 142 // File exists, read time (should be first and only thing) 143 read ( fp , &crawl time , sizeof ( unsigned long )); 144 c l o s e ( fp ) ; 145 } 146 147 p r i n t f ( ”Time : %lu \n” , crawl time ) ; 148 return crawl time ; 149 } 150 151 /∗∗ 152 ∗ Update the var file to include the current time. 153 ∗/ 154 unsigned long update crawl time ( ) { 155 int fp ; 156 unsigned long crawl time ; 157 158 crawl time = 0 ; 159 // Open the file for writing. 160 i f (( 161 fp = open (VAR FILE , O WRONLY | O CREAT | O TRUNC, S IRUSR | S IRGRP )) > 0) { 162 // Get the time. 163 crawl time = time(NULL); 164 // Make sure at least something was written. 165 i f (write(fp, &crawl time , sizeof ( unsigned long )) <= 0) { 166 p e r r o r ( ” wr ite c a l l ” ) ; 167 e x i t (−1) ; 168 } 169 // Close the file. 170 c l o s e ( fp ) ; 171 } 172 else {

17 173 p e r r o r ( ”open c a l l ” ) ; 174 e x i t (−1) ; 175 } 176 // Return the time, it could come in handy to someone. 177 return crawl time ; 178 } 179 180 /∗∗ 181 ∗ Returns the FD for a pipe that file change events 182 ∗ from the provided map of watch descriptors. 183 ∗/ 184 int setup watchers ( int num watchers , int num watches , 185 vector > &ordered , 186 m y c l o u d i n o t i f y t &i n o t i f y ) { 187 // Our special mapping of pseudo−fds to inotify fds. 188 // There are three ints here. The first is the watcher fd.. 189 // The second is the original watch descriptor 190 // The third is ∗ our ∗ unique watch descriptor 191 map > watcher watches ; 192 // The number of watches we have created. 193 int count watches = 0 ; 194 // The number of watchers we have created within a watcher. 195 int count watchers = 0; 196 // The total number of watches we have created. 197 int t o t a l w a t c h e s = 0 ; 198 // Helper vars, the current watcher we’re filling and a temp fd h o l d e r . 199 int cur watcher , temp watch d ; 200 // Our pipe to the child process. 201 int pi pef d [ 2 ] ; 202 // The PID of the child process. 203 p i d t c h i l d p i d ; 204 // Time references 205 unsigned long t h i s c r a w l t i m e , l a s t c r a w l t i m e ; 206 207 // Get last crawl time and update it 208 l a s t c r a w l t i m e = g e t l a s t c r a w l t i m e ( ) ; 209 t h i s c r a w l time = update crawl time ( ) ; 210 211 // Init pipefd as a pipe 212 pipe ( pi pe fd ) ; 213 214 // For each pair in the ordered vector, create a watch within a watcher . 215 for ( auto itr = ordered.begin(); itr != ordered.end(); ++itr) { 216 // If the number of watches per watch has filled up

18 217 // or this is the first watcher, init a new one. 218 i f ( count watchers == 0 | | count watches >= num watches ) { 219 // Increment our watcher count. 220 count watchers++; 221 // Reset our watch count. 222 count watches = 0 ; 223 // If we would create more watchers than your body has room for , 224 // cut and run. 225 i f ( count watchers > num watchers ) { 226 cout << ”Reached watcher limit! \ n” ; 227 break ; 228 } 229 cout << ”Initiing new watcher−−−−−−−−−−−−−−−−−−−\n” ; 230 // Create a new watcher. 231 cur watcher = inotify i n i t ( ) ; 232 } 233 // Increment the watch counts. 234 count watches++; 235 t o t a l w a t c h e s ++; 236 237 // These spoofed events should use our pseudo watch descriptor , which i s 238 // t o t a l w a t c h e s . 239 i f (inotify .watched.count(itr −>f i r s t ) ) // Directory was already watched 240 catch up(pipefd[1], this c r a w l t i m e , i t r −>first , total w a t c h e s ); 241 else { // Directory hasn’t been watched before 242 catch up(pipefd[1], last c r a w l t i m e , i t r −>first , total w a t c h e s ); 243 i n o t i f y . watched . i n s e r t ( i t r −>f i r s t ) ; 244 } 245 246 // Create a watch for the file path using the current watcher 247 temp watch d 248 = i n o t i f y a d d w a t c h ( cur watcher, itr −>f i r s t . c s t r ( ) , WATCH FLAGS ); 249 250 // Map the inotify −provided to a pseudo− d e s c r i p t o r 251 // since the inotify −provided FD is not bound to be unique. 252 watcher watches[cur watcher ][temp watch d ] = t o t a l w a t c h e s ; 253 i n o t i f y . w a t c h descriptors[total watches] = itr −>f i r s t ; 254 255 // Debug print

19 256 cout << i t r −>f i r s t << ’’ << i t r −>second << ’ \n ’ ; 257 } 258 259 // Fork a child process. 260 i f ( ( c h i l d pid = fork()) == 0) { 261 // The child does not need to read from the pipe, only write. 262 // Close the read end of the pipe. 263 c l o s e ( pip efd [ 0 ] ) ; 264 // Begin monitoring the watchers we created above. 265 monitor watches(pipefd[1] , watcher watches ) ; 266 } 267 else i f ( c h i l d p i d < 0) 268 p e r r o r ( ” f o r k c a l l ” ) ; 269 else { 270 // Write the PID of the child process to file so we can kill it l 8 r . 271 w r i t e m o n i t o r p i d ( c h i l d p i d ) ; 272 // The parent does not need to write to the pipe, only read. 273 // Close the write end of the pipe. 274 c l o s e ( pip efd [ 1 ] ) ; 275 // Close inotify watcher fd(s) from parent’s perspective 276 for ( auto itr = watcher watches.begin(); itr != watcher watches . end(); ++itr) 277 c l o s e ( i t r −>f i r s t ) ; 278 return pi pef d [ 0 ] ; 279 } 280 281 return 0 ; 282 } 283 284 /∗∗ 285 ∗ Look in directory for files that would not have triggered events 286 ∗ based on given reference time, and spoof events for them. 287 ∗/ 288 void catch up ( int w r i t e t o , unsigned long r e f e r e n c e time , string path, 289 int w a t c h descriptor) { 290 static struct i n o t i f y e v e n t ∗ event 291 = ( i n o t i f y e v e n t ∗) c a l l o c ( sizeof ( i n o t i f y event) + PATH MAX, 1) ; 292 static char fn [PATH MAX] ; 293 struct stat attrib; 294 struct d i r e n t ∗ ent ; 295 DIR ∗ d i r ; 296 int l e n ; 297 298 // This is the first crawl, we don’t want to spoof events yet 299 i f ( r e f e r e n c e t i m e == 0)

20 300 return ; 301 302 // Open directory, and crawl through files in it 303 i f ((dir = opendir(path.c str())) != NULL) { 304 while ((ent = readdir (dir)) != NULL) { 305 i f (strcmp(ent−>d name, ”.”) == 0 | | strcmp ( ent−>d name , ” . . ” ) == 0) 306 continue ; 307 l e n = s n p r i n t f ( fn , sizeof ( fn ) −1, ”%s/%s”, path.c s t r ( ) , ent−> d name ) ; 308 fn [ l e n ] = 0 ; 309 s t a t ( fn , &a t t r i b ) ; 310 cout << fn << ’ \n ’ ; 311 i f (S ISREG(attrib .st mode ) 312 && ( unsigned long ) a t t r i b . st mtime > r e f e r e n c e t i m e ) { 313 event−>wd = w a t c h descriptor; 314 // The only event supported is MODIFY because unix only t r a c k s 315 // access, modification , and change times. 316 event−>mask = IN MODIFY ; 317 event−>cookie = 0x00; 318 event−>len = strlen(fn); 319 strcpy ( event−>name , fn ) ; 320 // Now write event out to pipe 321 i f (write(write to , event, EVENT SIZE + event−>l e n ) <= 0) 322 perror(”catchup: write call”); 323 } 324 } 325 c l o s e d i r ( d i r ) ; 326 } 327 } 328 329 /∗∗ 330 ∗ Re crawl the file structure. Kill off the child, then rescan for a l i s t 331 ∗ of directories. Then sort again. Then setup watchers. Basically i n i t 332 ∗ again, but kill the child first. 333 ∗/ 334 void r e c r a w l ( int max watchers , int max watches, mycloud i n o t i f y t & i n o t i f y ) { 335 // Kill child 336 cleanup watchers(); 337 m y c l o u d i n o t i f y i n i t ( max watchers, max watches, inotify.root , i n o t i f y ) ; 338 }

21 339 340 341 /∗∗ 342 ∗ Kill the child process. 343 ∗/ 344 void cleanup watchers ( ) { 345 k i l l ( g e t m o n i t o r pid() , SIGKILL); 346 cout << ”killed child \n” ; 347 } 348 349 /∗∗ 350 ∗ Insert all watches into a FD set. 351 ∗/ 352 int i n i t f d w a t c h e r s s e t ( f d s e t ∗ watchers , 353 map > &watcher watches ) { 354 int max fd = 0 ; 355 356 // Clear out fd s e t 357 FD ZERO(watchers) ; 358 359 // Add descriptors to set and keep track of the max file descriptor. 360 for ( auto itr = watcher watches.begin(); itr != watcher watches.end ( ) ; ++i t r ) 361 { 362 FD SET( i t r −>first , watchers); 363 max fd = ( max fd < i t r −>f i r s t ) ? i t r −>f i r s t : max fd ; 364 } 365 366 return max fd + 1 ; 367 } 368 369 /∗∗ 370 ∗ Monitor all watches in the map and redirect their events to the pipe 371 ∗ ‘ w r i t e t o ‘ . 372 ∗/ 373 void monitor watches ( int w r i t e t o , map > & watcher watches ) { 374 int num descriptors , ready descriptors , i, len; 375 f d s e t ∗ watchers = (fd s e t ∗) malloc ( sizeof ( f d s e t ) ) ; 376 char ∗ buf = ( char ∗)calloc(BUF SIZE, sizeof ( char )); 377 struct i n o t i f y e v e n t ∗ event ; 378 379 while ( 1 ) { 380 // C a l l f d set for every watch and get the max descriptor. 381 num descriptors = init f d w a t c h e r s set(watchers , watcher watches ) ;

22 382 // select() and wait around for fun times. 383 r e a d y descriptors = select(num descriptors , watchers , NULL, NULL, NULL) ; 384 i f ( r e a d y descriptors < 0) { 385 p e r r o r ( ” s e l e c t c a l l ” ) ; 386 e x i t (−1) ; 387 } 388 // Loop through watchers for event 389 for ( auto &watch : watcher watches ) { 390 memset ( buf , 0 , BUF SIZE); 391 // If it is set 392 i f (FD ISSET(watch. first , watchers)) { 393 // Read in the events. 394 ioctl(watch.first, FIONREAD, &len); 395 i f ((len = read(watch.first , buf, len)) < 0) { 396 p e r r o r ( ” monitor : read c a l l ” ) ; 397 continue ; 398 } 399 // Digest all the pending events. 400 i = 0 ; 401 while ( i < l e n ) { 402 // Cast to a bona fide event. 403 event = ( struct i n o t i f y e v e n t ∗) ( buf + i ) ; 404 event−>wd = watch.second[event−>wd ] ; 405 i f (write(write to, buf, event−>l e n + EVENT SIZE) <= 0) { 406 perror(”monitor: Write call”); 407 } 408 // Get the next event. 409 i += event−>l e n + EVENT SIZE ; 410 } 411 } 412 } 413 } 414 } 415 416 /∗∗ 417 ∗ Write the pid of the monitor process. 418 ∗/ 419 p i d t w r i t e m o n i t o r p i d ( p i d t monitor ) { 420 int fp ; 421 // Open the var file for writing. 422 i f ((fp = open(VAR FILE , O WRONLY, S IRUSR | S IRGRP) ) > 0) { 423 // jump the file pointer over the last crawl time. 424 i f ( l s e e k ( fp , sizeof ( unsigned long ) , SEEK SET) <= 0) { 425 p e r r o r ( ” l s e e k c a l l ” ) ; 426 k i l l ( monitor , SIGKILL) ;

23 427 e x i t (−1) ; 428 } 429 // Write the pid 430 i f (write(fp , &monitor , sizeof ( p i d t ) ) <= 0) { 431 p e r r o r ( ” wr ite c a l l ” ) ; 432 e x i t (−1) ; 433 } 434 // Close up shop. 435 c l o s e ( fp ) ; 436 } 437 else { 438 p e r r o r ( ”open c a l l ” ) ; 439 e x i t (−1) ; 440 } 441 // Return the pid written. 442 return monitor ; 443 } 444 445 /∗∗ 446 ∗ Fetch the PID of the monitor process. 447 ∗/ 448 p i d t g e t m o n i t o r p i d ( ) { 449 int fp ; 450 p i d t monitor ; 451 // Open the file for reading. 452 i f ((fp = open(VAR FILE , O RDONLY) ) > 0) { 453 // Jump the file pointer over the last crawl time. 454 i f ( l s e e k ( fp , sizeof ( unsigned long ) , SEEK SET) <= 0) { 455 perror(”monitor (get pid): lseek call”); 456 e x i t (−1) ; 457 } 458 // Read the time. 459 i f (read(fp , &monitor , sizeof ( p i d t ) ) <= 0) { 460 perror(”monitor (get pid): read call”); 461 } 462 c l o s e ( fp ) ; 463 } 464 // return what we just read. 465 return monitor ; 466 }

24 5.3 Directory Analysis Module

1 /∗∗ 2 ∗ m yc l ou d analysis.hpp 3 ∗ My Cloud Inotify Analysis header 4 ∗ This file shall contain the functions necessary 5 ∗ to analyze directories 6 ∗ 7 ∗ Donald Percivalle 8 ∗ Scott Vanderlind 9 ∗ Senior Project , 2015 10 ∗/ 11 12 #include 13 #include 14 #include

25 1 /∗∗ 2 ∗ m yc l ou d analysis.cpp 3 ∗ My Cloud Inotify Analysis header 4 ∗ This file shall contain the functions necessary 5 ∗ to analyze directories 6 ∗ 7 ∗ Donald Percivalle 8 ∗ Scott Vanderlind 9 ∗ Senior Project , 2015 10 ∗/ 11 12 #include ” mycloud analysis .hpp” 13 14 long a n a l y z e d i r ( char ∗ dir , unsigned long crawl time ) { 15 long s c o r e ; 16 struct stat attrib; 17 18 s t a t ( dir , &a t t r i b ) ; 19 s c o r e = a t t r i b . st mtime − crawl time ; 20 21 return s c o r e ; 22 }

26 References

[1] R. Love, “on issues of recursive monitoring and races therein,” October 2004. [Online]. Available: https://mail.gnome.org/archives/dashboard-hackers/ 2004-October/msg00022.html

Robert Love’s proposal of the Love-Trowbridge Recursive Directory Scanning Algorithm

[2] ——, “Inotify monitoring of directories is not recursive. is there any specific reason for this design in linux ker- nel?” August 2012. [Online]. Available: http://www.quora.com/ Inotify-monitoring-of-directories-is-not-recursive-Is-there-any-specific-reason-for-this-design-in-Linux-kernel

Robert Love justifies why inotify recursion should be implemented in rather than kernel space.

[3] A. Morgado, “Filesystem monitoring in the linux kernel,” November 2013. [On- line]. Available: http://www.lanedo.com/filesystem-monitoring-linux-kernel/

A discussion of the capabilities of fanotify vs inotify vs

27