binpac: A yacc for Writing Application Protocol Parsers Ruoming Pang∗ Vern Paxson Google, Inc. International Computer Science Institute and New York, NY, USA Lawrence Berkeley National Laboratory
[email protected] Berkeley, CA, USA
[email protected] Robin Sommer Larry Peterson International Computer Science Institute Princeton University Berkeley, CA, USA Princeton, NJ, USA
[email protected] [email protected] ABSTRACT Similarly, studies of Email traffic [21, 46], peer-to-peer applica- A key step in the semantic analysis of network traffic is to parse the tions [37], online gaming, and Internet attacks [29] require under- traffic stream according to the high-level protocols it contains. This standing application-level traffic semantics. However, it is tedious, process transforms raw bytes into structured, typed, and semanti- error-prone, and sometimes prohibitively time-consuming to build cally meaningful data fields that provide a high-level representation application-level analysis tools from scratch, due to the complexity of the traffic. However, constructing protocol parsers by hand is a of dealing with low-level traffic data. tedious and error-prone affair due to the complexity and sheer num- We can significantly simplify the process if we can leverage a ber of application protocols. common platform for various kinds of application-level traffic anal- This paper presents binpac, a declarative language and com- ysis. A key element of such a platform is application-protocol piler designed to simplify the task of constructing robust and effi- parsers that translate packet streams into high-level representations cient semantic analyzers for complex network protocols. We dis- of the traffic, on top of which we can then use measurement scripts cuss the design of the binpac language and a range of issues that manipulate semantically meaningful data elements such as in generating efficient parsers from high-level specifications.