Università Degli Studi Di Pisa

Department of Computer Science
Degree Course in Computer Science

DEGREE THESIS

IDENTIFYING AND REMOVING ABNORMAL TRAFFIC FROM THE UCSD NETWORK TELESCOPE

Candidate: Elif Beraat Izgordu
Supervisor: Luca Deri
Student ID: 491044

Academic Year 2016-2017

Index

1 Introduction
2 Motivation and Related Work
  2.1 UCSD Network Telescope
  2.2 Telescope Usage Example
  2.3 IP Address Spoofing
  2.4 Overloading Capture Capacity
3 Architecture
  3.1 Collected Statistics
    3.1.1 Port Based Statistics
    3.1.2 Scanner Statistics
    3.1.3 Receivers Statistics
  3.2 Algorithms and Data Structures
4 Implementation
  4.1 ndpiReader
  4.2 Original Contribution
    4.2.1 Statistics
  4.3 Memory Concerns
  4.4 Filters
    4.4.1 Filter for Packet Burst
    4.4.2 Filter for Host Burst
  4.5 Source Code
5 Validation
  5.1 Packet Burst Examples
  5.2 Host Burst Examples
6 Conclusions
1 Introduction

In the last 30 years the Internet has had a revolutionary impact both on our society and on our daily lives. There are countless studies on every aspect of the Internet; its behaviour and evolution at macro scale have also been an important source of research data, not only for computer science but for many disciplines, including the social sciences. Understanding the evolution of the Internet infrastructure is therefore very important. Yet developing instruments and methods that can measure and analyse macroscopic phenomena on the Internet is not trivial.

One of the most important aspects of understanding the evolution of the Internet infrastructure is monitoring and studying Internet address space utilisation. It is a known issue that the IPv4 address space is almost exhausted, but in fact not all of the allocated addresses are effectively in use. As mentioned in the study by Dainotti, Benson, King, Kallitsis, Glatz and Dimitropoulos [1]:

"Macroscopic measurement of patterns in IPv4 address utilisation reveals insights into Internet growth, including to what extent NAT and IPv6 deployment are reducing the pressure on (and demand for) IPv4 address space."

This study introduces the existing scientific works that aim to map the actual utilisation of IPv4 addresses, their limitations, and how the mapping can be improved in the particular case of the CAIDA Network Telescope [2]. The two approaches to the mapping problem, active and passive probing, and their challenges are analysed in the Motivation and Related Work chapter. In particular, the Network Telescope [3] (or darknet, a portion of routed IP address space in which little or no legitimate traffic exists), its usage for scientific
inferences and the problems that threaten its data integrity are introduced in that chapter. After the terminology is introduced, the limitations of the current approach to data sanitization (which is meant to overcome the data integrity problems) and the difficulties of working with the telescope data are described.

In the Architecture chapter, these limitations and difficulties shape our approach and the decisions taken to deal with the original problem of this work: improving the current approach to data sanitization. Next, the Implementation chapter introduces the details of the original contribution and the technologies used to realize it. Finally, the Validation chapter demonstrates the efficiency and validity of the solution with test results.

2 Motivation and Related Work

Until now there have been two scientific works that monitor the extent to which allocated IP addresses are actually used [4]. Both have their own limitations. Two approaches separate these two scientific
works fundamentally: monitoring can be implemented by active or by passive probing.

The first work is ISI's Internet Census project [5], in which address utilisation is monitored by actively scanning the entire IPv4 address space. It periodically sends ICMP echo requests (i.e. pings) to every single IPv4 address (excluding private and multicast addresses) to track the active IP address population. The active scanning approach has four primary limitations [6]: i) there is a measurement overhead; ii) the measurement infrastructure can potentially be blacklisted; iii) networks that filter ICMP requests cause measurement bias; iv) it is not scalable for use in a future IPv6 census.

The second is CAIDA's UCSD Network Telescope project [7], which relies on passive measurements. The Center for Applied Internet Data Analysis (CAIDA) conducts network research and builds research infrastructure to support large-scale data collection, curation, and data distribution to the scientific research community [8]. The project is realized by analyzing two types of passive traffic data: (i) Internet Background Radiation (IBR) packet traffic captured by darknets (also known as telescopes); (ii) traffic (net)flow summaries in operational networks.

Passive traffic measurement overcomes the challenges posed by the active probing approach: it does not introduce network traffic
overhead, does not rely on unfiltered responses to probing, and could apply to IPv6 as well. It also detects additional active /24 blocks that are not detected as active by ISI's active probing approach. On the other hand, it introduces new challenges [9]: i) the limited visibility of a single observation point; ii) the presence of spoofed IP addresses in packets, which can affect results by implying that faked addresses are active.

If the volume of spoofed packets (packets with a fake source IP address) is significantly large (thousands of IP addresses per minute), it can invalidate the inferences, making the IPv4 address space appear much more densely utilised than it is. Packets with spoofed source addresses therefore threaten the integrity of the data obtained from the network telescope, because any research use of the data depends on the source address of the packet.

CAIDA develops and evaluates techniques to identify and remove likely spoofed packets from both darknet (unidirectional) and two-way traffic data. Their work focused on filtering large-scale spoofing by manually isolating and analyzing suspicious traffic and then defining filters to remove it. These filters are static (e.g. filter traffic that has TTL > 200 and is not ICMP; filter traffic whose source address has a least significant byte of 0 or 255). They cover most spoofed traffic cases, because such cases can be recognised by well-known patterns indicating that the traffic can be nothing but spoofed. They significantly reduce the amount of spoofed traffic, but there are still large-scale spoofing events that can invalidate the inferences.

This work contributes to the effort of improving darknet data usage, primarily by filtering spoofed-source traffic and packet burst traffic on the UCSD Network Telescope. The spoofed traffic that escapes the static filters has case-specific causes. Therefore, the current CAIDA techniques are extended with a dynamic approach to determine and filter those cases that could not be determined by static
filters.

Later in this section, to better understand the problem and its challenges, Network Telescope data usage is examined with an example. Then the issues that threaten data integrity are covered, specifically IP address spoofing and packet burst cases.

2.1 UCSD Network Telescope

CAIDA hosts the UCSD Network Telescope, one of the largest network telescopes (a /8 network segment -

