Clock Synchronization
Total Page:16
File Type:pdf, Size:1020Kb
Middleware Laboratory Sapienza Università di Roma MIDLAB Dipartimento di Informatica e Sistemistica Clock Synchronization Marco Platania www.dis.uniroma1.it/~platania [email protected] Sapienza University of Rome Time notion Each computer is equipped with a physical (hardware) clock It can be viewed as a counter incremented by ticks of an oscillator i At time t, the Operating System (OS) of a process y r o t reads the hardware clock H(t) of the processor a r o b a L e r a C = a H(t) + b w Then, it generates the software clock e l d d i M C approximately measures the time t of process i B A L D I M notion among processes among notion isas known evolve executions distributed (DS) Distributed in Systems Time Problems: how to analyze a in DS factor Time a key is The technique used to coordinate a common time time common a techniqueThe used coordinate to common notion time common of a process state during distributed a computation However, it©s important processes for a share to However, Lacking of a global reference time: it©s hard to know Lackingthe time:it©sknow to hard of a global reference ClockSynchronization MIDLAB Middleware Laboratory Clock Synchronization [10] The hardware clock of a set of computers (system nodes) may differ because they count time with different frequencies Clock Synchronization faces this problem by means of synchronization algorithms y r Standard communication infrastructure o t a r o No additional hardware b a L e r a w e l Clock synchronization algorithms d d i External M Internal B A L Heartbeat (pulse) D I M bound bound time source authoritative process ClockExternal Sync Minimizes the distance between clock between the Minimizes distance ∆ : i = 1, 2,..., n i = 1, | C(t) S(t)± and the time provided by an an by time the and provided S, within a synchronization synchronization a within | | ≤ Δ C i of MIDLAB Middleware Laboratory Internal Clock Sync No external reference time server Minimizes the distance of any two clock C and C of i j processes i,j = 1, 2, ..., n within a synchronization bound ∆: | C (t) ± C (t) | ≤ ∆ i j y r Notes: o t a r A set of processes P externally synchronized within a o b a L bound ∆ is also internally synchronized within a bound 2∆ e r a It follows directly from definition of external and internal w e l d d clock synchronization i M P A set of process internally synchronized are not B necessarily externally synchronized because they may A L drift collectively from the external time source D I M Heartbeat sync (pulse) length length at then smaller round least order one ofthe magnitude the in same instant, errorwith an that is approximately the in same instant approximately value starts, round synchronization a rather than clockthe They startThey round a synchronization finish and heartbeat local a generate periodically nodes System the node, time for each which in Aims synchronize, to MIDLAB Middleware Laboratory Clock algorithmssync two any clocks ( any two the difference on bound an upper between guarantee and delay on transmission bound upper Deterministics algorithms assume the presence of an algorithms the assume presence Deterministics groups three in divided Clock algorithms are sync Statistics Probabilistics Deterministics precision ) MIDLAB Middleware Laboratory Probabilistics algorithms Probabilistics communication errors communication occurs loss) (messages clock of if lot a of out goes synchronization others with synchronized clocks synchronized between a maximumconstant deviation guarantee But there is a probability greater than 0 thatthan 0 a process greater probability But is a there if its time,clock a knows is process any At but they No assumption on transmission delay, MIDLAB Middleware Laboratory Statistics algorithms failure failure of zero probability a with non only can be guaranteed certain probability with a maximumconstant deviation given a within are others known are of delay the transmission value Then, in probabilistics and statistics algorithms precision statistics precision and algorithms Then,probabilistics in clocks that at any time each two guarantee they Anyway, far their fromclocks are how Processes don©t know theand expected Assumes that the standard deviation MIDLAB Middleware Laboratory Why clock sync is important? An example UNIX make command is used to compile source code Typically, a large UNIX program is splitted in several files A change to one source file only requires that file to be recompiled make goes through all the source files to find out y which ones need to be recompiled r o t a r o It examines the time in which source and object files b a L were last modified e r a w e l If the source file was last modified after the object file, d d i make M calls the compiler to recompile it B A L D I M behavior an unpredictable to lead will that probably clock the slower on is that machine because time 2143 but is assigned output.c is modified time on agreement different no global two with machines, Lack of synchronization is Lack is dramatic! of synchronization of mixture a code programcontain will executable The make shortlyand after that output.o Suppose has 2144 time on compiler and running editor have to suppose Let©s will not call the compilercall not will MIDLAB Middleware Laboratory Some definitions Scale of the system LAN (Local Area Network) tipically composed by few hundreds of nodes It allows a deep data traffic analysis in order to derive an upper bound on the transmission delay In the following we©ll consider clique-based LANs (each node is connected to every other) Clique is not scalable: adding new nodes augments the load y of the system r o t a r o b a L WAN (Wide Area Network) e r a w Up to millions of nodes e l d d No data traffic analysis is feasible i M Transmission delay finished but unknown B Random graph topology: each node is connected to a A subset of nodes in the network L D I High scalability: a node interacts only with its neighbors M Failures None reliable system Crash-stop a crashed node leaves the system forever Byzantine a byzantine node is one that arbitrarily either follows correctly the algorithm, or executes different steps. In clock sync, byzantines aim to destroy the system convercence y r o t a r o b Dynamism a L e r a w e l d Static network d i Nodes leave the network only due to crashes M Dynamic network B Nodes can join and leave the network at any time, A L D voluntarily or due to crashes I M Clock sync in literature: a state of the art Authors Algorithm Synchro Fault Communi Assumption Paradigm nization Toleranc cation on message Paradig e Topology delay m Lamport Deterministic Internal None Clique Known bound Cristian Probabilistic Internal Byz. Star Known (3F+1) (Master- distribution Slave) Mills (NTP) Probabilistic External Byz. (1) Hierarchica Known l (Master- distribution Slave) y r Rodrigues Probabilistic External Crash Hierarchica Known o t a r et al. l (Master- distribution o b Slave) a L e r O.Babaoglu Probabilistic HeartBeat Crash Random None a w e et al. Graph l d d Van Steen Probabilistic External Crash Random None i M et al. Graph Baldoni, et Probabilistic Internal Byz. Random None B al. Prob. Graph A L D D. Dolev et Deterministic HeartBeat Byz. Clique Known bound I al. (3F+1) M NTP± Network [7] Protocol Time It is based on a hierarchical structure It on hierarchical a is based LAN in mostThe algorithm both WAN in used and synchronization external Provides for istance UTC istance for timesource, External Primary servers Secondary servers Clients MIDLAB Middleware Laboratory Primary servers are directly connected to external time source Secondary servers are connected to one or more primary servers Secondary servers read time from primary servers and distribute the information to clients Time estimation For each pair of messages exchanged by two servers, NTP estimates the total transmission time as: T ± T + T - T y i-2 i-3 i i-1 r o t a r o b a L e r a w e l d d i M B A L D I M Accuracy of Accuracy NTP tolerance Fault replication andcryptography itthen connects primary another to server server, it becomes secondary a server 1millisecond in LAN pathsmilliseconds on Internet 10 means servers of by failure byzantine NTP tolerates Ifsecondary a server looses connection a primaryfrom Ifprimary a serverdisconnects fromthetime source,then MIDLAB Middleware Laboratory al. [8] ± Rodrigueset CESIUM SPRAY Uses a hierarchical structure based on WANs of LANs on WANs based structure Uses hierarchical a synchronization Hybrid external/internal GPS satellites Node equipped Node Nodes over over Nodes with GPS with receiver LANs MIDLAB Middleware Laboratory Assumption: each LAN is equipped at least with a node with GPS receiver (GPS node) GPS satellites ªsprayº time information over GPS nodes GPS nodes further ªsprayº time information over LANs through broadcasts (an error-free broadcast is guaranteed to occur) Nodes within LANs apply a function (average, median, ¼) to the set of value obtained to update the clock value WANs of LANs approach: y r Accuracy of synchronization in LANs even if a o t a r disconnection from time source occurs o b a L Low error rate e r a w e Transmission delay bounded l d d Transmission delay negligible without errors i M Fault tolerance B A Replication of GPS nodes for external synchronization L D Enough node in LANs for internal synchronization I M Lamport [1] Internal clock syncronization Reliable LAN (no failures) Clique-based topology Synchronization