FSMonitor: Scalable File System Monitoring for Arbitrary Storage Systems Arnab K. Paul∗, Ryan Chardy, Kyle Chardz, Steven Tueckez, Ali R. Butt∗, Ian Fostery;z ∗Virginia Tech, yArgonne National Laboratory, zUniversity of Chicago fakpaul,
[email protected], frchard,
[email protected], fchard,
[email protected] Abstract—Data automation, monitoring, and management enable programmatic management, and even autonomously tools are reliant on being able to detect, report, and respond manage the health of the system. Enabling scalable, reliable, to file system events. Various data event reporting tools exist for and standardized event detection and reporting will also be of specific operating systems and storage devices, such as inotify for Linux, kqueue for BSD, and FSEvents for macOS. How- value to a range of infrastructures and tools, such as Software ever, these tools are not designed to monitor distributed file Defined CyberInfrastructure (SDCI) [14], auditing [9], and systems. Indeed, many cannot scale to monitor many thousands automating analytical pipelines [11]. Such systems enable of directories, or simply cannot be applied to distributed file automation by allowing programs to respond to file events systems. Moreover, each tool implements a custom API and and initiate tasks. event representation, making the development of generalized and portable event-based applications challenging. As file systems Most storage systems provide mechanisms to detect and grow in size and become increasingly diverse, there is a need report data events, such as file creation, modification, and for scalable monitoring solutions that can be applied to a wide deletion. Tools such as inotify [20], kqueue [18], and FileSys- range of both distributed and local systems.