
Digital Signal Processing Applications and Implementation for Accelerators


EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH CERN − SL DIVISION

CERN-SL-2002-047 (HRF)

Digital Signal Processing Applications and Implementation for Accelerators

Digital Notch Filter with Programmable Delay and Betatron Phase Adjustment for the PS, SPS & LHC Transverse Dampers

V. Rossi

Abstract

In the framework of the LHC project and the modifications of the SPS as its injector, I present the concept of global digital signal processing applied to a particle accelerator, using Field Programmable Gate Array (FPGA) technology. The approach of global digital synthesis implements in numerical form the architecture of a system, from the start up of a project and the very beginning of the signal flow. It takes into account both the known parameters and the future evolution, whenever possible. Due to the increased performance requirements of today's projects, the CAE design methodology becomes more and more necessary to handle successfully the added complexity and speed of modern electronic circuits. Simulation is performed both for behavioural analysis, to ensure conformity to functional requirements, and for time signal analysis (speed requirements). The digital notch filter with programmable delay for the SPS Transverse Damper is now fully operational with fixed target and LHC-type beams circulating in the SPS and is a successful implementation of this concept. The transfer function is programmable and therefore the same equipment can be reprogrammed for use in future projects.

General concept and digital notch filter with programmable delay presented at

Workshop on DSP Applications in the SL Division. CERN-SL, 5th November 2001

Geneva, Switzerland
July 2002

Abstract

1. INTRODUCTION

2. FROM REQUIREMENTS TO DESIGN

2.1 Required bandwidth

2.2 Number of bits

3. DATA CONVERSION

3.1 ADC module

3.2 DAC module

3.3 Test set-up

4. TRANSVERSE FEEDBACK SYSTEM

5. DIGITAL NOTCH FILTER AND ONE TURN DELAY

5.1 Digital Notch Filter with programmable delay for the SPS

5.2 Gating

5.3 Modes of operation

6. METHOD FOR IMPLEMENTATION OF THE DIGITAL SIGNAL PROCESSING

7. FPGA IMPLEMENTATION

7.1 Synchronous design

7.2 Delay function

7.3

7.4 User Module Interface and Look-up table

7.5 I2C Controller & Reset

8. FINE DELAY

8.1 I2C interface

8.2 Conversion algorithm

8.3 Interlock

9. DESIGN ENTRY

10. SCHEMATIC DIAGRAM

11.

12. BOARD MANUFACTURING

13. LABORATORY TEST

14. RESULTS WITH BEAM

15. BETATRON PHASE ADJUSTMENT

15.1 Hilbert filter

15.2 Realisation

15.3 MD results

16. CONCLUSIONS

17. FUTURE DEVELOPMENTS

References

Acknowledgements

Digital Signal Processing Applications and Implementation for Accelerators

Digital Notch Filter with Programmable Delay and Betatron Phase Adjustment for the PS, SPS & LHC Transverse Dampers

Vittorio Rossi, CERN, SL-HRF


1. INTRODUCTION

As a general rule, signals derived from the beam or from equipment are processed either for observation or are injected into feedback loops (fig. 1).

Fig. 1 – Block diagram of closed loop response for a feedback system.

The response of the system is:

$$W(s) = \frac{R(s)}{P(s)} = \frac{G(s)}{1 - H_1(s) \cdot H_2(s) \cdot G(s)} \qquad (1)$$

Where:

• $W(s)$ system response

• $R(s)$ resulting beam position

• $P(s)$ perturbation

• $G(s)$ beam response

• $H_1(s)$ low level response

• $H_2(s)$ response

The system is stable if the real part of $H_1(s) \cdot H_2(s) \cdot G(s)$ is less than +1.

Treating the low level signals in the digital domain allows flexibility for the subsequent treatment.

Fig. 2 presents the overall concept layout: a single Digital Signal Processing Unit (DSPU) performs all the numerical treatment necessary to implement the $H_1(s)$ transfer function. The DSPU is preceded by an Analog to Digital Converter (ADC) and followed by a Digital to Analog Converter (DAC). To achieve behaviour independent of the sampling rate, the clock is fed to the ADC and routed, together with the data, to the following DSPU and DAC, with identical delay. The ADC-DSPU-DAC combination possesses unity gain; its intrinsic bandwidth is from DC up to the Nyquist limit. Following the requirements of the project, amplification and filtering can be added before and/or after the data conversion or implemented in the DSPU.

[Block diagram: Analog input → ADC → DSPU → DAC → Analog output. Data and clock are passed from the ADC to the DSPU and from the DSPU to the DAC.]

Fig. 2 – Digital Signal Processing Unit associated to data conversion.

In view of the LHC requirements, and to minimise the noise injected into the stored beams, which decreases their lifetime, the quantisation error introduced by the data conversion should be reduced to a minimum. This requirement defines the minimum number of bits in the conversion and in the processing.

Fast applications involving beam control on a bunch-by-bunch basis suggest the choice of a hardware-based digital signal processing solution rather than a software-program approach. Fast processing speed motivates the implementation in a Field Programmable Gate Array rather than in a floating point Digital Signal Processor. Transfer functions can thus be designed and synthesised in VHDL and implemented in programmable devices. Simulation of the FPGA is performed both for behavioural analysis (conformity to project requirements) and time analysis (speed requirements). Implementation in digital form in modern electronic circuits is realised through routing tools preserving the signal integrity (delay, crosstalk etc.).

A different transfer function can be calculated for a different application and loaded in the programmable device. Within the limits of the maximum system gates and the data conversion range, the same equipment can be employed for several applications. An example of such a procedure is presented in the following sections 2 and 3, using LHC requirements and parameters to provide concrete numbers. The detailed design of a system for the SPS transverse damper for multi-cycling applications is presented in sections 4 to 15.

2. FROM REQUIREMENTS TO DESIGN

2.1 Required bandwidth

The system under consideration is a bunch-by-bunch feedback with bunch spacing $T_b$ and bandwidth $W = 1/(2T_b)$. For LHC, where $T_b$ is 25 ns, the required bandwidth is 20 MHz for the whole system. The low-level part is over-designed, with a sampling rate up to 120 MSPS, a three times oversampling of the 40 MHz LHC bunch frequency that allows a Nyquist bandwidth up to 60 MHz. To minimise the emittance blow-up at injection, the damping time has to be shorter than the beam decoherence time and the system should be able to damp the coherent oscillations of each individual batch. In this case the required bandwidth is less and it is determined by the batch spacing (not the bunch spacing).
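The figures quoted in this section follow directly from the bunch spacing and the sampling rate. The short sketch below only repeats that arithmetic; the variable names are illustrative and are not part of the design.

```python
# Bandwidth arithmetic of section 2.1 (illustrative names, values from the text).
T_BUNCH_S = 25e-9        # LHC bunch spacing Tb
F_SAMPLE_HZ = 120e6      # maximum sampling rate of the low-level part

required_bandwidth_hz = 1.0 / (2.0 * T_BUNCH_S)    # W = 1/(2*Tb)
bunch_frequency_hz = 1.0 / T_BUNCH_S               # bunch repetition frequency
nyquist_hz = F_SAMPLE_HZ / 2.0                     # Nyquist limit at 120 MSPS
oversampling = F_SAMPLE_HZ / bunch_frequency_hz    # samples per bunch slot

print(f"required bandwidth : {required_bandwidth_hz / 1e6:.1f} MHz")
print(f"bunch frequency    : {bunch_frequency_hz / 1e6:.1f} MHz")
print(f"Nyquist bandwidth  : {nyquist_hz / 1e6:.1f} MHz")
print(f"oversampling factor: {oversampling:.1f}")
```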

2.2 Number of bits

When sampling is considered, an infinite number of bits are theoretically required to represent each value. Physical limitations preclude sampling with infinite precision and a system with a finite number b of bits is assumed. With digital signal processing of the signal, the main source of noise is assumed to be the quantisation noise. For the LHC stored beams, it is proposed to reduce the quantisation error below the minimum acceptable value, to reduce the injected noise and increase the beam lifetime. From [1], the transverse emittance growth rate for a beam of transverse rms size ±σ, depends on tune spread and quantisation error, as follows:

$$\frac{1}{\tau_{x^2}} = \frac{4}{3} \cdot f_0 \cdot \alpha^2 \cdot \delta Q^2 \qquad (2)$$

where $1/\tau_{x^2}$ defines the transverse emittance blow-up rate, $f_0$ the revolution frequency, $\alpha = 2^{-(b-1)}$ the quantisation width and $\delta Q$ the tune spread.

Taking the example for the LHC: $f_0 = 11$ kHz, $\delta Q \approx 10^{-2}$ and an 11 bit effective, 12 bit ADC with $\alpha = 2^{-(b-1)} = 1/1024$, one obtains a transverse emittance growth time $\tau_{x^2} = 200$ hours, which is satisfactory.

Sampling rate up to 120 MSPS and a resolution of 12 bit are the features chosen to design the DSPU, complying with the LHC-type beam requirements.
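The growth time quoted above can be reproduced numerically from Eq. (2). The following is a minimal sketch; the function and argument names are illustrative only. It also shows how strongly the result depends on the number of effective bits, since each bit lost multiplies the growth rate by four.

```python
# Worked example of Eq. (2): 1/tau = (4/3) * f0 * alpha**2 * dQ**2
# Function and argument names are illustrative, not taken from the report.

def emittance_growth_time_hours(f0_hz, tune_spread, effective_bits):
    """Return the emittance growth time in hours for a given quantisation."""
    alpha = 2.0 ** -(effective_bits - 1)                 # quantisation width
    rate_per_s = (4.0 / 3.0) * f0_hz * alpha**2 * tune_spread**2
    return 1.0 / rate_per_s / 3600.0

tau_11_bits = emittance_growth_time_hours(11e3, 1e-2, 11)   # LHC example of the text
tau_10_bits = emittance_growth_time_hours(11e3, 1e-2, 10)   # one effective bit less

print(f"11 effective bits: {tau_11_bits:.0f} h")
print(f"10 effective bits: {tau_10_bits:.0f} h")
```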

3. DATA CONVERSION

The data conversion is fundamental for the precision of the complete system. The purpose of the development is the realisation of data conversion building blocks with high precision in the analog section and high resolution and speed in the digital section. State of the art devices have been targeted, 12 bit, (with further evolution in mind, 14-bit) and sampling rate up to 120 MSPS. In this way future evolution and reuse in other projects is possible.

3.1 ADC module

Analog to digital conversion can be performed up to 120 MHz. The heart of the circuit is the AD9432, from Analog Devices, a monolithic 12 bit ADC with integral track and hold circuit and an integral non-linearity of ± 0.5 LSB (data sheet: see http://www.analog.com ). High speed ADCs have differential inputs and present less distortion when driven differentially. With differential signals, moreover, the noise becomes common mode. Therefore a DC-coupled low-impedance wide-band differential amplifier, the AD8138, is employed to drive the ADC input stage. The SPS10211-1 wide-band differential driver presents less than 0.05 dB and 0.1° differential errors as shown in fig. 3.

Fig. 3 - Differential error of the amplifier SPS 10211-1 measured at the inputs of ADC. The errors are < 0.05 dB and < 0.1° @ 100 MHz. Magnitude: (top trace, 1 dB/div.) Phase: (bottom trace, 1°/div.) Frequency: 20 MHz/div.

Together with the ADC board, this driver constitutes the NIM module SPS10211 - 12 BIT 120 MSPS ADC. The clock is regenerated by a line receiver, fed to the ADC and routed with identical delay, together with the data, to the following DSPU or DAC. The data bus has a width of 16 bits, of which 12 bits (data[11..0]) are used at present and the other 4 bits are reserved for future use. A test point of the buffered input signal is provided and a 10-bit DAC back-converts the digitised data, for test purposes. The block diagram is shown in fig. 4. The 10-bit DAC is not used to test the performance of the 12-bit ADC; it is only provided to verify the conversion process. The ADC-module analog input bandwidth is from DC to 120 MHz; the full scale input level is ± 0.5 V. The digital output is LVTTL compatible. The clock rate can be varied from 1 to 120 MSPS continuously.

Fig. 4 – Block diagram for the module SPS10211 - 12 BIT 120 MSPS ADC.

3.2 DAC module

Digital to analog conversion is accomplished via the companion AD9754, a monolithic current-output 14-bit DAC, with integral non-linearity of ± 1.5 LSB. The output is DC coupled to an AD8055, a differential amplifier with a bandwidth from DC to 200 MHz and a full scale output of ± 0.5 V (data sheet: see http://www.analog.com ). Together with the DAC board, the amplifier constitutes the module SPS10212-1 - 14 BIT 160 MSPS DAC, whose block diagram is depicted in fig. 5. For the DAC alone, the clock rate can be varied from 1 to 160 MSPS continuously. The clock rate, for the pair ADC/DAC, ranges from 1 to 120 MSPS.

Fig. 5 - Block diagram for the module SPS10212-1 - 14 BIT 160 MSPS DAC.

3.3 Test set-up

To avoid errors introduced by unnecessary data conversion (i.e. ADC/DAC combination when testing ADC or DAC or DSPU only etc.) and to perform measurements of any DUT in its respective domain, analog and/or digital, the test method shown in fig. 6 is used. The Arbitrary Waveform Generator provides both the analog output and the digitally encoded output of an arbitrary function and it is employed to test separately the ADC, the DAC or the DSPU (clock generator 10 Hz to 250 MHz, waveform length 256K x 12 bits). The measurements, both digital and analog, are performed with a 2 GSPS digital scope, HP54616C, with active probes. The data are acquired with a PC. The DSPU is tested by using the ADC module or the test generator alone. The DAC module is tested by using the generator, the ADC unit or the DSPU, as a source. In this way comparisons are made and possible errors or limitations, introduced by each unit, are evaluated.

Fig. 6 - Block diagram for the test set-up.

The data bus connectors are provided on the modules’ rear panels (they are shown in the following chapter 12, figs. 43 and 45). The connectors allow interconnectivity between the different modules (ADC to DAC, ADC to DSPU, DSPU to DAC) and the connection of the units to the digital generator and to the test set-up.

Because of their universality, the SPS10211 (ADC) and SPS10212-1 (DAC) modules can be considered the building blocks for the data conversion and they may be reused for future projects.

Fig. 7 – Test generator driving the SPS10212-1 DAC with a digitally encoded damped sinus-waveform. Clock frequency 80 MHz.

Fig. 8 – Test generator driving the SPS10211 ADC with an analog damped sinus-waveform. The ADC drives the SPS10212-1 DAC. Clock freq. 80 MHz.

The performances of the pair ADC/DAC are reported in figs. 7 to 10. The settings for both figs. 7 and 8 are: Ch. 1: Out from AWG2021 generator. Ch. 2: Out from DAC module. Amplitude, both traces: 400 mV/div. Time: 100 ns/div. Lower traces expanded to 10 ns/div.

By comparing fig. 7 to fig. 8, one can observe that the latency, from the input to the DAC output, has increased (actually to 175 ns) due to the double conversion. In fig. 7 and fig. 8, the channel 2 expanded traces are almost identical, indicating that the dynamic behaviour of the modules is correct in both cases.

Fig. 9 shows the ADC/DAC response to a low amplitude and low frequency signal. The linearity is good and the noise is comparable to the intrinsic noise of the scope in a 10 MHz bandwidth.

The transfer function is measured with a network analyser HP8753E (Frequency range 30 kHz to 6 GHz) and the data are acquired with a PC. Fig. 10 represents the sin(x)/x response for the ADC/DAC, measured at different clock frequency, showing a 76 dB and a smooth variation of the output (no missing bit). For a clock frequency of 80 MHz, the analog bandwidth is 28 MHz @ -3 dB and the differential phase, after delay compensation, is less than 2° up to the Nyquist limit.

Fig. 9 - ADC/DAC response to a low amplitude signal, showing good linearity and low noise. Vin 12 mVpp triangular waveform. Clock 80 MHz. Amplitude: 2 mV/div. Time: 20 µs/div. Bandwidth: limited to 10 MHz on scope HP54616C.

Fig. 10 - ADC/DAC sin(x)/x transfer function, measured with a network analyser HP 8753E, for 40, 60 and 80 MHz clock.

Analog bandwidth: 28 MHz @ -3 dB for an 80 MHz clock. Bottom trace: differential phase, 10 deg./div. Phase is less than 1° @ 28 MHz for an 80 MHz clock.

4. TRANSVERSE FEEDBACK SYSTEM

In the SPS, the transverse feedback [2] is essential to damp the injection oscillations, in order to reduce emittance blow-up, and to stabilise the coupled bunch instabilities. Figure 11 depicts the schematics of a typical transverse feedback system. The beam displacement is detected by pick-up electrodes, then amplified and applied as a correction to the beam with a downstream transverse deflector (RF kicker). The transit time of the signal from the pick-up to the deflector is matched to the beam time of flight, by introducing the appropriate delay, approximately 1 machine turn when pick-up and RF kicker are closely spaced. In this way the correction is applied to the same particles that have generated the pick-up signal. To obtain beam damping, the phase advance between the signal applied to the beam, at the RF kicker location, and the beam itself must be an odd multiple of $\pi/2$ at the betatron frequency. The sign of the amplifier G is selected accordingly, to obtain the correct sign of the feedback.

Fig. 11 - Transverse feedback.

The action of the feedback will be illustrated with fig.12. It shows the normalised transverse phase space diagram of a bunch. Let’s assume that the maximum displacement occurs at the pick-up location. A quarter betatron wavelength downstream, the correction is applied as a change of slope, along the x’ axis. At the pick-up location the result is a new beam position that will produce a new correction at the kicker location. Finally the trajectory will spiral inwards and ultimately reach the origin.

Fig. 12 - Phase space diagram of a transverse bunch oscillation.

Without any additional signal processing, the amplifier G will amplify the closed-orbit offsets. This produces the undesirable effect that the system will try to correct the closed-orbit and, depending on the gain G, the kicker amplifier might saturate. To reject the unwanted closed-orbit components in the pick-up signal one has to discriminate between the quasi-static closed-orbit component and the fast varying betatron oscillation signal. Fig. 13 shows an analog periodic filter (“comb filter”) that makes the difference between the position measurements at successive turns and thereby rejects the closed-orbit that is almost constant from turn to turn. The value of the delay line is made equal to the beam revolution period T0 .


Fig. 13 - Rejection of the closed orbit with an analog comb filter.

With reference to fig. 13, let $x(t)$ be the input signal and $u(t)$ the output signal of the comb filter.

Let the Fourier Transform of $x(t)$ and $u(t)$ be $X(f)$ and $U(f)$, respectively.

Recalling the shift theorem, the Fourier Transform of $x(t - T_0)$ is $X(f) \cdot e^{-j 2\pi f T_0}$.

The frequency response of the filter is thus:

$$H(f) = \frac{U(f)}{X(f)} = \frac{X(f) - X(f) \cdot e^{-j 2\pi f T_0}}{X(f)}$$

$$H(f) = 1 - e^{-j 2\pi f T_0} \qquad (3)$$

$$H(f) = e^{-j \pi f T_0} \cdot \left( e^{j \pi f T_0} - e^{-j \pi f T_0} \right)$$

Finally:

$$H(f) = e^{-j \pi f T_0} \cdot 2j \sin(\pi f T_0) \qquad (4)$$

The modulus of the transfer function is twice the amplitude of a sinusoidal function and the phase is a saw tooth with π radians excursion, as depicted in fig. 14.
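The equivalence of Eq. (3) and Eq. (4), and the position of the notches, can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the revolution period of 23.1 µs is the SPS value quoted later in the text.

```python
import cmath
import math

T0 = 23.1e-6   # SPS beam revolution period in seconds (value quoted in section 5.1)

def comb_response(f_hz, t0=T0):
    """Eq. (3): H(f) = 1 - exp(-j*2*pi*f*T0)."""
    return 1.0 - cmath.exp(-2j * math.pi * f_hz * t0)

def comb_response_factored(f_hz, t0=T0):
    """Eq. (4): H(f) = exp(-j*pi*f*T0) * 2j * sin(pi*f*T0)."""
    return cmath.exp(-1j * math.pi * f_hz * t0) * 2j * math.sin(math.pi * f_hz * t0)

f0 = 1.0 / T0   # revolution frequency

# Modulus at a few fractions of the revolution frequency:
# zero on the revolution lines, maximum (2) half-way between them.
for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"f = {frac:4.2f} * f0   |H| = {abs(comb_response(frac * f0)):.3f}")
```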

Now we refer to the accelerator parameters, tune $Q$ and revolution frequency $f_0$. The tune value $Q$, linked to the lattice of the machine, can be decomposed into an integer value, $Q_i$, and a fractional part $Q_f$:

$$Q = Q_i + Q_f \quad \text{where } Q_i = 0, 1, 2, 3, \ldots \text{ and } 0 \le Q_f < 1$$

By definition, the betatron frequency $f_\beta$ is: $f_\beta = Q \cdot f_0$

The bunch signal is seen by the pick-up at each revolution. One can think that the bunch position is “sampled” by the pick-up at the pick-up location. The sampling process creates sidebands that we will call $(f_\beta)^+$ and $(f_\beta)^-$, at the fractional part $Q_f$ around each revolution frequency line $m \cdot f_0$.

$$(f_\beta)^\pm = m \cdot f_0 \pm Q \cdot f_0 = m \cdot f_0 \pm (Q_i + Q_f) \cdot f_0$$

$$(f_\beta)^\pm = (m \pm Q_i) \cdot f_0 \pm Q_f \cdot f_0$$

For $-\infty < m < +\infty$ and with $-\infty < n < +\infty$, the equation reduces to:

$$(f_\beta)^\pm = n \cdot f_0 \pm Q_f \cdot f_0 \qquad (5)$$

As an example, the sidebands $(f_\beta)^+$ and $(f_\beta)^-$, for $Q_f < 0.5$, are added to fig. 14.


Fig. 14 - Transfer function of the notch filter.
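To make Eq. (5) concrete, the sideband frequencies around the first few revolution lines can be listed as follows. This is a sketch only: the function name is hypothetical and the tune value is an arbitrary illustration, not a value taken from this report. The revolution frequency is derived from the 23.1 µs SPS revolution period quoted in section 5.1.

```python
import math

def betatron_sidebands(tune, f0_hz, n_max):
    """Return (n, lower, upper) for the sidebands n*f0 -/+ Qf*f0 of Eq. (5)."""
    q_frac = tune - math.floor(tune)    # fractional part Qf, with 0 <= Qf < 1
    return [(n, (n - q_frac) * f0_hz, (n + q_frac) * f0_hz)
            for n in range(n_max + 1)]

F0_HZ = 1.0 / 23.1e-6   # revolution frequency from the 23.1 us revolution period
TUNE = 26.62            # illustrative tune value only (assumption)

# Only the fractional part of the tune appears in the result; for n = 0 the
# lower sideband is the negative-frequency image of the upper one.
for n, lower, upper in betatron_sidebands(TUNE, F0_HZ, 3):
    print(f"n = {n}: lower {lower / 1e3:9.2f} kHz   upper {upper / 1e3:9.2f} kHz")
```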

With reference to figures 11, 13 and 14, the following points must be considered for the transverse feedback operation:

• The dependency of the gain on the fractional part of the tune.

• The filter changes the phase of the betatron signal, depending on the tune of the machine. Except when the tune is fixed and a pick-up at the requested phase is readily available, it is usually necessary to introduce a way to adjust the phase shift at the betatron frequencies. At the SPS, this is done analogically by combining, with appropriate coefficients, the difference signals from two pick-ups separated by a quarter betatron wavelength, ref.[2]. In this way one creates a “virtual” pick-up placed at the correct betatron phase location. The principle of operation, shown in fig. 15, comes from the following trigonometric relation (ω t = beam phase; ϕ = added phase):

$$\cos(\omega t - \varphi) = \cos\omega t \cdot \cos\varphi + \sin\omega t \cdot \sin\varphi \qquad (6)$$

Fig. 15 – Betatron phase adjustment with a two pick-up scheme.
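The two pick-up combination of Eq. (6) can be illustrated with idealised signals. In the sketch below the two pick-up signals are assumed to be exactly in quadrature and of equal amplitude; the function name is hypothetical, and the sign of the quadrature term in a real installation depends on the ordering of the pick-ups.

```python
import math

def virtual_pickup(sig_a, sig_b, phi):
    """Weight two quadrature pick-up signals to shift the betatron phase by phi."""
    return sig_a * math.cos(phi) + sig_b * math.sin(phi)

phi = math.radians(30.0)    # phase to be added (illustrative value)

for k in range(8):
    wt = 2.0 * math.pi * k / 8.0
    pu_a = math.cos(wt)     # first pick-up: cos(wt)
    pu_b = math.sin(wt)     # second pick-up, a quarter betatron wavelength away: sin(wt)
    combined = virtual_pickup(pu_a, pu_b, phi)
    # The weighted sum equals the signal of a pick-up located at phase (wt - phi).
    print(f"wt = {math.degrees(wt):5.1f} deg  combined = {combined:+.4f}  "
          f"cos(wt - phi) = {math.cos(wt - phi):+.4f}")
```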

• The transit time of the signal from the pick-up to the deflector is matched to the beam time of flight. This is done by introducing an appropriate delay, added to the hardware delay, which allows the correction to be applied to the same particles that have generated the pick-up signal.

The analog filter depicted in fig. 13 is given only as an example of a possible implementation for the rejection of the closed orbit. In practice, the losses associated with the delay line limit the useful range of the filter to low frequency applications. The cable delay-line itself possesses a limited bandwidth. Also T0 depends on the beam energy and the adjustment of the cable length is not very practical. These factors motivate the choice of a digital approach to realise the notch and the delay functions.

5. DIGITAL NOTCH FILTER AND ONE TURN DELAY

The signal between the pick-up and the kicker undergoes digital signal processing. The sampling process represents a continuous signal A(t) with its values taken at fixed intervals. With the sampling, the signal is shifted from the continuous domain to the discrete domain.

Fig. 16 - Continuous signal. Fig. 17 - Sampled signal.

In the sequence $x(n)$, for $-\infty < n < +\infty$, the value $x(k)$ represents the $k$-th sample in the sequence.

The z-transform of the sequence $x(n)$ is defined by:

$$X(z) = Z[x(n)] = \sum_{n=-\infty}^{+\infty} x(n) \cdot z^{-n}$$

Let $y(n)$ be the sequence $x(n)$ delayed by $N$ samples: $y(n) = x(n - N)$

The z-transform of $y(n)$ is given by:

$$Y(z) = Z[x(n-N)] = \sum_{n=-\infty}^{+\infty} x(n-N) \cdot z^{-n} = \sum_{p=-\infty}^{+\infty} x(p) \cdot z^{-p-N} = z^{-N} \cdot \sum_{p=-\infty}^{+\infty} x(p) \cdot z^{-p}$$

Which simplifies to: $Y(z) = z^{-N} \cdot X(z)$

The difference between the z-transform of the signal and the z-transform of the signal delayed by $N$ samples is given by:

$$X(z) - Y(z) = X(z) \cdot (1 - z^{-N})$$

Fig. 18 - Digital notch filter composed of a one turn delay and a subtractor.

If $T_{CLK}$ is the sampling clock period, then $N$ samples are acquired during one beam revolution period such that: $N \cdot T_{CLK} = T_0$.

The $z^{-N}$ function is called the one turn delay and it constitutes the building block of the transverse feedback system. We can now create in digital form the periodic filter that makes the difference between the position measurements at successive turns. The digital notch filter is depicted in fig. 18. Its transfer function is:

$$H(z) = \frac{X(z) - Y(z)}{X(z)} = 1 - z^{-N} \qquad (7)$$

In a digital system, the frequency response is obtained by evaluating the z-transform on the unit circle.

$$z^{-1} \rightarrow e^{-j 2\pi f T_{CLK}} = e^{-j 2\pi f T_0 / N}$$

Therefore:

$$1 - z^{-N} \rightarrow 1 - e^{-j 2\pi f T_0} \qquad (8)$$

The frequency response (Eq. 8) of the digital notch filter is identical to the frequency response (Eq. 3) of its analog counterpart, see figure 13.
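A time-domain sketch of Eq. (7) shows the rejection of a constant closed-orbit offset while a turn-by-turn betatron oscillation passes with a gain that depends on the fractional tune. It is a behavioural illustration only, not the FPGA implementation: the function names are hypothetical, N is reduced to a small value for readability, and the tune, offset and amplitude are arbitrary.

```python
import math

def notch_filter(samples, n_delay):
    """Return y[n] = x[n] - x[n - N]; samples before the start are taken as zero."""
    return [x - (samples[n - n_delay] if n >= n_delay else 0.0)
            for n, x in enumerate(samples)]

def notch_gain(q_frac):
    """Modulus of 1 - z**-N at f = (n +/- Qf)*f0, i.e. 2*|sin(pi*Qf)|."""
    return 2.0 * abs(math.sin(math.pi * q_frac))

N = 16          # samples per turn; small illustrative value (the SPS design uses 1848)
TURNS = 8
Q_FRAC = 0.25   # illustrative fractional tune
ORBIT = 1.0     # constant closed-orbit offset
AMPL = 0.1      # betatron oscillation amplitude

# Every sample slot carries the same offset plus a turn-by-turn oscillation.
x = [ORBIT + AMPL * math.cos(2.0 * math.pi * Q_FRAC * (n // N))
     for n in range(N * TURNS)]
y = notch_filter(x, N)

steady = y[N:]  # skip the first turn, during which the delay line is still empty
print("largest output after the first turn:", max(abs(v) for v in steady))
print("upper bound AMPL * gain            :", AMPL * notch_gain(Q_FRAC))
```

After the first turn the offset of 1.0 no longer appears at the output; what remains is bounded by the oscillation amplitude multiplied by the notch gain at the chosen fractional tune.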

The transit time of the signal from the pick-up to the deflector is matched to the beam time of flight by introducing a programmable delay at the output of the subtractor. In the case of the SPS, the change of the speed of the particles vs. the change in energy leads to a phase error. The error in phase, due to the fixed cable and hardware delay, which does not change with energy, depends on the energy range and the pick-up locations. At the SPS this variation, measured at 20 MHz, is between ± 9° for an energy change from 26 GeV to 450 GeV (corresponding to the LHC-type beam) and ± 24° for an energy change from 14 GeV to 450 GeV (corresponding to the fixed target operation) and it is left uncorrected.

For the SPS transverse feedback, the one turn programmable delay is decomposed in three blocks, as shown in fig. 19.

Fig 19 - Programmable delay.

The blocks depicted in fig. 19 have the following functions:

• $z^{-M_{fix}}$ sets a fixed delay.

• $z^{-M_{prog}}$ sets a coarse delay.

• $z^{-M_{fix}} \cdot z^{-M_{prog}}$ merges in $z^{-M}$.

• $z^{-P/12.5}$ sets a fractional delay.

Steps of one clock period are too coarse to allow a fine adjustment of the phase shift at the betatron frequencies and thus the interval between two samples has been further divided in P sub-intervals, as shown in fig. 20. The data sequence will be shifted by the value of P divided by a constant factor of 12.5. This constant depends on criteria that will be discussed in the following paragraphs.

Fig. 20 – Sub-intervals between two samples.

The transfer function of the notch function followed by the programmable delay function is given by:

$$H(z) = (1 - z^{-N}) \cdot z^{-M} \cdot z^{-P/12.5} \qquad (9)$$

The complete transfer function, (Eq. 9), will be implemented in the DSP Unit. In summary the functionality of the DSPU will consist in two distinct parts:

• The Notch Filter rejects the revolution frequency and its harmonics due to the closed orbit error of the uneven beam distribution in the vacuum chamber (batch structure).

• The Programmable Delay introduces a delay that, added to the hardware and cable delay, allows the correct part of the beam to be kicked.

Coming back to fig. 1 we can now replace the generic DSPU function by the notch filter with variable delay function. The new block diagram is presented in fig. 21.

Fig. 21 – Notch filter with variable delay realising the function of (Eq. 9).
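Evaluating Eq. (9) on the unit circle shows that the programmable delay only changes the phase of the response: the modulus is set by the notch term alone. The sketch below takes Eq. (9) literally, with the delay expressed as M + P/12.5 clock periods; the function name is hypothetical, and the clock period and N are the SPS values quoted in section 5.1.

```python
import cmath
import math

T_CLK = 12.5e-9   # sampling clock period quoted for the SPS
N = 1848          # samples per turn quoted for the SPS

def dspu_response(f_hz, m, p):
    """Evaluate Eq. (9) for z = exp(j*2*pi*f*T_CLK)."""
    w = 2.0 * math.pi * f_hz * T_CLK                 # phase advance per clock period
    notch = 1.0 - cmath.exp(-1j * w * N)             # (1 - z**-N)
    delay = cmath.exp(-1j * w * (m + p / 12.5))      # z**-M * z**-(P/12.5)
    return notch * delay

f0 = 1.0 / (N * T_CLK)     # revolution frequency
f_test = 100.3 * f0        # arbitrary frequency between two revolution lines

h_short = dspu_response(f_test, 1360, 1)
h_long = dspu_response(f_test, 1847, 25)
print(f"|H| with short delay: {abs(h_short):.4f}")
print(f"|H| with long delay : {abs(h_long):.4f}")
print(f"|H| on a revolution line: {abs(dspu_response(5 * f0, 1500, 3)):.2e}")
```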

5.1 Digital Notch Filter with programmable delay for the SPS

The following detailed parameter list is used for the design of the digital filter with programmable delay for the SPS transverse feedback (damper):

Parameters used to design the DSPU at SPS (fRF = 200 MHz; harmonic number h = 4620):

• Types of beam : Protons and Ions for both fixed target and LHC-type beams.

• Analog Bandwidth : DC to 30 MHz.

• Analog Input : ± 0.5 V/50Ω.

• Gain : 0 dB (notch off).

• Signal processing : Resolution of 12 bit in view of the LHC.

• Closed orbit : Rejected to avoid saturation of the power amplifier.

• Number of samples N : 1848 samples per beam revolution period of h/fRF = 23.1 µs.

• RF Clock frequency : $f_{CLK} = (f_0 \cdot h) \cdot (N/h) = f_{RF} \cdot (N/h) = 200 \cdot \frac{1848}{4620} = 80$ MHz.

• Programmable Delay : 17µs to 23.2 µs in 0.5ns steps (1.8° @ 10 MHz analog input freq.)

• Number M of clk periods for the Prog. Delay : 1360 to 1847 @ TCLK = 1/ fCLK = 12.5 ns.

• Number P of clk intervals: 1 to 25 for 2•TCLK = 25 ns.

• Gating on single bunches: Possible gating on distinct bunches in view of Q measurement and chromaticity measurement with the damper (see § 5.2).

• Mode of operation : Normal mode and 20 MHz mode (see § 5.3).

• Module control : Local control for test purpose and remote control provided for the SPS multi-cycle operation. The remote control conforms to the SL/HRF serial bus protocol.

• Transfer function : Programmable, to implement new transfer functions.

$H(z) = (1 - z^{-N}) \cdot (z^{-M}) \cdot (z^{-P/12.5})$

N > M > P

N = 1848 > M = 1360÷1847 > P = 1÷25

Table 1 – Values used for the implementation of the DSPU at the SPS.
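The derived quantities in the parameter list can be recomputed from fRF, h and N. The sketch below repeats that arithmetic only; the names are illustrative, and the 0.5 ns step and the 10 MHz frequency are those quoted in the list above.

```python
# Arithmetic behind the SPS parameter list (illustrative names, values from the text).
F_RF_HZ = 200e6      # RF frequency
H_NUMBER = 4620      # harmonic number h
N_SAMPLES = 1848     # samples per turn N

t_rev = H_NUMBER / F_RF_HZ                  # beam revolution period h/fRF
f_clk = F_RF_HZ * N_SAMPLES / H_NUMBER      # sampling clock fRF*(N/h)
t_clk = 1.0 / f_clk                         # clock period TCLK

def coarse_delay_s(m_periods):
    """Delay of z**-M expressed in seconds."""
    return m_periods * t_clk

def phase_step_deg(delay_step_s, f_signal_hz):
    """Phase shift produced by a delay step at a given signal frequency."""
    return 360.0 * f_signal_hz * delay_step_s

print(f"revolution period : {t_rev * 1e6:.1f} us")
print(f"clock frequency   : {f_clk / 1e6:.1f} MHz")
print(f"clock period      : {t_clk * 1e9:.1f} ns")
print(f"coarse delay range: {coarse_delay_s(1360) * 1e6:.4f} us to "
      f"{coarse_delay_s(1847) * 1e6:.4f} us")
print(f"0.5 ns at 10 MHz  : {phase_step_deg(0.5e-9, 10e6):.1f} deg")
```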

Fig. 22 shows in detail the block diagram of the digital notch filter with programmable delay for the SPS damper, module SPS10213. It is composed of the following functions, covered in detail in the following chapters:

• Programmable delay (17 to 23.1 µs)

• 1 turn delay covering the SPS beam revolution period (23.1 µs)

• Arithmetic Logic Unit (ALU) performing as a subtractor

• User Module Interface with Local/Remote and Notch function controls

• Look-up table

• I2C Controller and Reset for the fine delay

• Fine delay

Fig. 22 – Block diagram of the SPS 10213 - Digital Notch Filter with Programmable Delay, in conjunction with the SPS 10211 - ADC module and SPS 10212-1 - DAC module. (Thin arrows indicate the clock path).

5.2 Gating

Gating on consecutive bunches, while providing at the same time a “gain notch” of the feedback for the other portion of the beam, is of particular interest. It is implemented in the following way: The DSPU is programmed with the number of consecutive clocks, defining the gating width. The trigger to start the sequence is sent to the DSPU through one of the spare data-input lines. The programming of the gating width is done through the User Module Bus. To make this feature operational for the SPS damper may require further development work.

5.3 Mode of operation

It is foreseen to use the feedback as a beam excitation source by injecting excitation signals into the feedback system for tune and chromaticity measurements with the gating of consecutive bunches. This requires the full 20 MHz bandwidth of the RF power amplifier. This mode of operation, named “20 MHz-mode”, needs a perfect phase synchronism between the 80 MHz clock and the beam signal. In the “normal” mode of operation, low pass filtering is placed before the ADC conversion. This filtering of the input signal relaxes the clock to beam signal phase-error requirement. The DSPU is intrinsically compatible with the two modes of operation. Some results for both modes of operation, with the LHC-type beam, are given in chapter 14. More work on the “20 MHz-mode” has to be done in the future.

6. METHOD FOR IMPLEMENTATION OF THE DIGITAL SIGNAL PROCESSING

The whole project, consisting of an 80 MHz digital notch filter with programmable delay, I2C bus and remote control, together with the associated data conversion, has been achieved through the use of the CAE suite tools, supported by CERN IT-PS Division (formerly IT-CE). Due to the performance requirements of the project, the CAE design suite methodology was used to handle successfully the complexity and speed of the modern electronic devices and boards. The functionality of the design is too complex to implement in a single design file. The CAE tools allow the creation of multiple design files and link the files into a hierarchy. The following design-flow has been employed:

• Design entry, through Hardware Description Language (VHDL) or graphical entry (schematic capture).

• Verification and Compilation.

• Simulation.

• Targeting devices.

• PCB placing and routing.

• Post-layout Signal Integrity Analysis.

• Manufacturing.

• Programming.

Programs used:

• Cadence/Concept and Logic Work Bench for the schematic.

• Cadence/ Allegro for the PCB layout.

• MAXPLUS-II from ALTERA has been chosen because it supports 200 MHz FPGA devices.

• SPECCTRAQuest SI Expert (IBIS models) and PSPICE (SPICE models) for the post layout signal integrity analysis.

The notch filter module has been realised using this flow. At hardware completion, the external control bus was not yet available. Consequently the FPGA was programmed with a first compilation not comprising the fine delay and remote control. The notch filter module worked immediately and measurements confirmed the simulation results. Because of the excellent results, two units were installed, together with the ADC and DAC modules, for the signal processing of the SPS transverse damper, in September 2000. Starting from the run 2001, a new compilation comprising the full functionality of remote control and fine delay, was programmed into the FPGA. This permitted from then on multi-cycle operation for different types of beams (Fixed target, Ions, LHC-type beams).

In the following chapter 7, I will present the method for implementation of the transfer function in the Field Programmable Gate Array. The DSPU remote control and the serial interface, necessary to control the fine delay section, described in chap. 8, have also been incorporated into the FPGA. The design entry method to compile and to simulate the FPGA is covered in chap. 9. The DSPU schematics, incorporating the FPGA and its design entry method, are covered in chap. 10. The virtual PCB post-layout Signal Integrity analysis and the board manufacturing are presented in chap. 11 and 12, respectively. Finally, the laboratory tests, prior to installation in the SPS machine, are presented in chap. 13 and the results with beam are described in chap. 14.

7. FPGA IMPLEMENTATION

The 12-bit wide transfer function $(1 - z^{-N}) \cdot z^{-M}$ has been implemented in a single ALTERA FLEX EPF10K100EQC208-1 device with data-path configurable from 8 to 16 bits (data sheet: see http://www.altera.com ). Each FLEX 10KE device contains an embedded array and a logic array. The embedded array is used to implement a variety of memory functions or a complex logic function, such as digital signal processing, wide data-path manipulation, micro controller applications and data-transformation functions. The logic array is used to implement general logic, such as counters, adders, state machines and multiplexers. With up to 98,000 bits of RAM, 500,000 gates and a clock frequency of up to 250 MHz, the devices enable the designer to implement an entire system on a single device.

The FLEX 10KE device architecture supports the multi-voltage Input/Output interface feature, which allows FLEX 10KE devices to interface with systems of different supply voltages, namely 2.5 V, 3.3 V and 5 V. To achieve fast speed, the internal core is powered at 2.5 V. The DSP Unit is fed by these three supply voltages, with individual regulators powering each section through power planes.

7.1 Synchronous design

The system uses three clocks: the RF clock @ 80 MHz, the Xtal clock @ 40 MHz and the SD clock @ 1.6 MHz for the User Module Bus. A synchronous design is used throughout the whole compilation. To avoid unpredictable behaviour, synchronisation between the three clock domains (RF, Xtal and SD clk) is mandatory. In circuits that have independent clocks, such as in figure 23a, it is impossible to ensure that the set-up and hold times between them are met. In such cases, one must synchronise the circuit. Figure 23b shows how the reg_a values are synchronised with clk_b before they are used. A new flip-flop, reg_c, is clocked by clk_b, which ensures that its outputs will meet the set-up time of reg_b.

Fig. 23a - Multi-clock system. Fig. 23b - Multi-clock system with synchronised register output.

7.2 Delay function

The programmable delay and one turn delay functions are implemented with dual port RAM structures using a single clock and separate read and write addresses. The embedded array blocks used for the RAM and its control implemented in the logic array blocks permit memory size and speed not easily obtainable using individual IC’s.
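The principle of a RAM-based delay with separate read and write addresses can be illustrated with a short behavioural model. The Python sketch below is an illustration only, not the FPGA design: the class name `DualPortDelay` is hypothetical, and the depth of 2048 words is an assumption taken from the 11-bit addresses (raddress[10..0]) shown in the simulation of chap. 9.

```python
class DualPortDelay:
    """Behavioural model of a delay line built on a dual-port RAM.

    One clock drives both ports; the write address leads the read
    address by `delay` locations, so each sample is read back
    `delay` clock periods after it was written.
    """

    def __init__(self, depth=2048, delay=1):
        # depth=2048 is an assumption (11-bit addresses)
        if not 0 < delay < depth:
            raise ValueError("delay must be between 1 and depth - 1")
        self.depth = depth
        self.mem = [0] * depth
        self.raddr = 0
        self.waddr = delay

    def clock(self, sample):
        """Advance one clock period: read one word, write one word."""
        out = self.mem[self.raddr]
        self.mem[self.waddr] = sample
        self.raddr = (self.raddr + 1) % self.depth
        self.waddr = (self.waddr + 1) % self.depth
        return out


line = DualPortDelay(depth=2048, delay=5)
outputs = [line.clock(n + 1) for n in range(10)]
```

In this model the delay is changed simply by changing the offset between the two address counters, which is the property that makes the delay programmable.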

7.3 Arithmetic Logic Unit

The subtractor is implemented with a pipelined ALU structure using signed (two’s complement) numerical values. Pipelining is a technique that uses registers to break up large combinatorial delays. It increases the speed of a design at the expense of increased latency (one additional clock cycle per pipeline stage). To reduce the memory size, the programmable delay function is implemented before the arithmetic subtraction that produces the overflow bit.
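The remark on memory size can be made concrete with a small arithmetic sketch: the difference of two 12-bit two's complement values needs 13 bits, so a delay placed after the subtractor would have to store the extra bit. The Python functions below are hypothetical helpers written for this illustration; they model the arithmetic only, not the pipelining.

```python
def to_signed(value, bits=12):
    """Interpret the low `bits` bits of `value` as two's complement."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def notch_subtract(a, b, bits=12):
    """Return (difference, overflow) for a - b on `bits`-wide inputs.

    `overflow` is True when the difference no longer fits in `bits`
    bits, i.e. when the extra (overflow) bit is needed.
    """
    diff = to_signed(a, bits) - to_signed(b, bits)
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return diff, not (lowest <= diff <= highest)


# Largest positive minus largest negative 12-bit value
worst_case = notch_subtract(0x7FF, 0x800)
```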

7.4 User Module Interface and Look-up table

The User Module Interface emulator for the remote control of the unit has been embedded in the FPGA. It conforms to the SL/HRF User Module Bus protocol for the SPS Damper [3]. The UMI contains two registers and allows the programming of the delay value and notch function in the SPS multi-cycle environment. The UMB timing is shown in fig. 24 (drawing, courtesy J. Molendijk). Its protocol is defined as follows:

• The 5 bit wide bus [A4..A0] defines the module address (1-31).

• The module maintains the module-acknowledge (Mack) during the recognised address. A register load of the serial data (SD) by the serial data clock (SDclk) is complete when the Mack signal disappears.

• At address 0, the register-select (RS0), not shown, swaps the offline register and the active register. The active register is the one that currently drives the equipment, while the offline register is programmed with the settings to be activated upon a timing event.
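The offline/active register pair behaves like a double buffer: new settings are shifted in serially while the active register keeps driving the equipment, and a swap makes them effective. The Python sketch below is a behavioural illustration only; the class name, the method names and the register width used in the example are assumptions and are not taken from the UMB specification [3].

```python
class DoubleBufferedRegister:
    """Offline/active register pair with serial loading (MSbit first)."""

    def __init__(self, width):
        self.width = width      # register width in bits (assumed parameter)
        self.active = 0         # currently drives the equipment
        self.offline = 0        # holds the settings for the next cycle

    def load_serial(self, bits):
        """Shift a complete word into the offline register, MSbit first."""
        if len(bits) != self.width:
            raise ValueError("wrong number of bits")
        word = 0
        for bit in bits:
            word = (word << 1) | (bit & 1)
        self.offline = word

    def swap(self):
        """Exchange the offline and active registers (register-select event)."""
        self.active, self.offline = self.offline, self.active


reg = DoubleBufferedRegister(width=4)
reg.load_serial([1, 0, 1, 0])   # active register is unchanged at this point
before_swap = (reg.active, reg.offline)
reg.swap()
after_swap = (reg.active, reg.offline)
```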

Further functionalities implemented are:

• Look-up table provided for the settings of all parameters (harmonic number, fixed delay, data-width).

• Local control for the delay value and the notch function with parallel data.

• Data format conversion from the serial UMB protocol to the serial I2C protocol.

Fig. 24 - UMB protocol. The timing diagram shows the address lines A4-A0, the serial data SD (MSbit first, LSbit last), the serial data clock SDclk and the Mack signal. Bit-time = 640 ns.

7.5 I2C Controller & Reset

This function sends the serial data (SDA) and the serial clock (SCL) to the fine-delay chips, which are controlled via a serial I2C bus [4]. Following the original specification, an I2C controller could be implemented with dedicated integrated circuits handling the data transfer and acknowledgement protocol, under micro controller supervision and with specific software. In this application, however, the I2C controller is implemented as a state machine that controls the fine delay circuits over bi-directional lines. It serialises the control bits for the fine delay and generates a handshake sequence. The I2C controller generates the SCL and SDA signals following a reset procedure. It is embedded into the FPGA and its timing is parameterised to speed up the transfer rate (1.6 MHz) compared to the original I2C specification (100 kHz).

8. FINE DELAY

The 13 bit wide (12 bit data, 1 bit clock) transfer function $z^{-P/12.5}$ has been implemented by chaining the $z^{-M}$ function to four PHOS4 [5] devices. The PHOS4 is a 4 channel delay generator ASIC developed by the CERN Microelectronics group. The device is sketched in figures 25 and 26. One of 25 delay taps, spaced 1 ns apart, can be chosen for each of the 4 signal channels and the clock channel. The chip requires, as timing reference, a 40 MHz clock that must be permanently active to keep the internal phase detector in lock. Therefore a 40 MHz SMD quartz oscillator has been added on the board. The chip is controlled via the serial I2C bus and the delay values are stored in an internal register of five words of five bits.

Fig. 25 - PHOS4 delay generator.

Fig. 26 - Detail of the PHOS4 delay generator. The Delay Locked Loop consists of a multi-stage delay line followed by a [...], a phase detector and a loop filter.

(Block labels recoverable from the two drawings: In0 to In3, Out0 to Out3, Clk, OutClk, delay/MUX, register, DLL, Vcontrol, phase detector, charge pump, loop filter, w_en, I2C interface with SCL, SDA and Addr.)

I found that the PHOS4 output drivers have only a limited drive capability. Loading each output with a low capacitance BFR93A microwave buffer allows a throughput rate per channel in excess of 100 MSPS, more than doubling the original 40 MSPS specification. The circuit shown in fig. 27 represents one of the PHOS4 devices delaying the data FO[11..0] from the FPGA and driving the BFR93A transistor (one transistor shown, for the sake of clarity). The optimisation of this part needs a careful PCB layout.

Fig. 27 - PHOS4 output driving small capacitive load.

8.1 I2C interface

The I2C interface uses the external signals SCL and SDA for data transfer. Since the chip does not have a reset signal, it has to be generated from the I2C interface. Therefore it is necessary to reset the chip immediately after power-up with an arbitrary transmission on the I2C bus. The first negative transition on the I2C clock line is used to trigger the reset procedure and the first data transmission will not have any effect on the delay registers.

One I2C write-instruction is divided into two words as shown in the following table 2.

S | A6 A5 A4 A3 A2 A1 A0 0 | A | S2 S1 S0 D4 D3 D2 D1 D0 | NA | P

Table 2 – Data words pattern for the PHOS4.

• Start condition (S).

• The first Byte (A6..A0, 0) corresponds to the hard-wired device address of the PHOS4 (A5..A2 used).

• Acknowledgement (A) from the addressed PHOS4.

• The second Byte (data word) is divided into two sections; the upper three bits (S2..S0) select one of the delay channels, the lower five bits (D4..D0) specify the delay value.

• Stop condition (P).

• Each PHOS4 data channel and clock channel should receive a complete instruction. A complete sequence needed to write a fine delay value is thus composed of 20 cycles of instruction.
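The two-word pattern of table 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the FPGA state machine: the function names are hypothetical, the device addresses in the example are invented, and the channel codes 0 to 4 (four data channels plus the clock channel) are an assumed numbering.

```python
def phos4_write_words(device_addr, channel, delay_ns):
    """Build the address byte and data byte of one PHOS4 write-instruction."""
    if not 0 <= device_addr <= 0x7F:
        raise ValueError("device address must fit in 7 bits (A6..A0)")
    if not 0 <= channel <= 4:
        raise ValueError("channel code assumed to be 0..4")
    if not 0 <= delay_ns <= 24:
        raise ValueError("delay tap must be 0..24")
    address_byte = (device_addr << 1) | 0    # A6..A0 followed by a 0
    data_byte = (channel << 5) | delay_ns    # S2..S0 then D4..D0
    return address_byte, data_byte


def phos4_sequence(device_addrs, delay_ns):
    """One instruction per channel and per device: 4 x 5 = 20 instructions."""
    return [phos4_write_words(addr, ch, delay_ns)
            for addr in device_addrs
            for ch in range(5)]


# Hypothetical device addresses, for illustration only
sequence = phos4_sequence([1, 2, 3, 4], delay_ns=9)
```

With four devices and five channels each, the list holds the 20 instructions that make up one complete fine-delay update.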

8.2 Conversion algorithm

The resolution of the delay value is 0.5 ns. To obtain such resolution, the fine delay value P, in steps of 1 ns, is concatenated to the number of 80 MHz clock periods M, in steps of 12.5 ns, with the following algorithm (Eq. 10), [6], generated in the SPS control software. The conversion algorithm will map the user input (i.e. the delay value imposed by the machine operator) onto the fixed, coarse and fine delays such that the required delay will be produced. The conversion algorithm will finally add the user value for the notch filter control (Notch ON/OFF).

user_delay_val (in 0.5 ns steps) = (Dfix•1 + Dco•0.0125 + Dfi•0.001) = 17.0000 to 23.2005 µs (10)

• Fine delay (Dfi): res.: 5 bit; value range: 0-24; 1 unit = 1.0 ns

• Coarse delay (Dco): res.: 8 bit; value range: 0-255; 1 unit = 12.5 ns

• Fixed delay (Dfix): res.: 2 bit; value range: 0-3; selects a fixed delay of 17, 18, 19 or 20 µs.

For a value of '0' for all delays (fixed, coarse, fine) the resulting delay will be 17 µs.

• The values of the user_delay_val (in 0.5 ns steps) are written, for each machine sub-cycle, from the User Module Bus to the DSP Unit via the User Module Interface.

• The values of the fixed and coarse delay-registers, Dfix and Dco, are written to the programmable delay section of the FPGA.

• The values of the fine delay-register, Dfi, are written to the I2C bus interface of the four PHOS4 devices.
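The mapping between a requested delay and the three registers can be sketched as follows. This is not the algorithm of ref. [6], which is not reproduced in this report; it is a minimal Python illustration built only on the register ranges listed above, with all function and constant names hypothetical. Integer units of 0.5 ns are used to avoid rounding errors.

```python
BASE = 34_000        # 17 us offset, in units of 0.5 ns
FIXED_STEP = 2_000   # 1 us
COARSE_STEP = 25     # 12.5 ns
FINE_STEP = 2        # 1 ns


def delay_half_ns(dfix, dco, dfi):
    """Total delay, in units of 0.5 ns, for the three register values."""
    return BASE + FIXED_STEP * dfix + COARSE_STEP * dco + FINE_STEP * dfi


def split_delay(total_half_ns):
    """Find (Dfix, Dco, Dfi) that reproduce the requested delay exactly."""
    offset = total_half_ns - BASE
    for dfix in range(3, -1, -1):
        rest = offset - FIXED_STEP * dfix
        if rest < 0:
            continue
        dco, rem = divmod(rest, COARSE_STEP)
        if rem % 2:
            # Odd remainder: give one coarse step back to the fine delay,
            # since the fine delay only moves in 1 ns (2 unit) steps.
            dco -= 1
            rem += COARSE_STEP
        dfi = rem // FINE_STEP
        if 0 <= dco <= 255 and dfi <= 24:
            return dfix, dco, dfi
    raise ValueError("delay not reachable with these register ranges")


example = split_delay(34_000 + 2_001)   # 17 us + 1.0005 us
```

In this sketch the 0.5 ns resolution comes from combining an odd number of 12.5 ns coarse steps with 1 ns fine steps, which is why a fine range of 0 to 24 is sufficient. Offsets that are an odd multiple of 0.5 ns and smaller than 12.5 ns above the 17 µs base cannot be produced this way.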

8.3 Interlock

This summary status output gathers any internally detected fault together with the front panel switch 'local' state, thus yielding: 'Not Ready'. Actions upon trip: 1 alarm per filter unit.

9. DESIGN ENTRY

The transfer function has been designed using the MAXPLUS II program from Altera. The design entry is done in Visual VHDL (schematic entry) and partly in VHDL. The top level design is partitioned into sub-sets or blocks, connected by internal nodes. The design is hierarchical; the program compiler assigns the block numbering that identifies a section in all the linked files. Following modification of one block, the compiler updates the compilation files. The input and output nodes and the internal nodes of each block may be associated in a simulation file, to verify the conformity of the compilation to functional requirements. Several blocks, or all the blocks, can be linked to perform an overall simulation of the whole compilation.

Thereafter a specific device is targeted (here, the EPF10K100EQC208-1 device from Altera, containing more than 100,000 gates and 49,000 bits of RAM). Timing constraints are imposed and the device’s performance in the time domain is analysed, to verify its conformity to the project speed requirements. In the device floor plan, critical paths can be detected and modifications to the pin assignments imposed, if necessary. The virtual [...] is optimised through the simulations, before a final choice.

Fig. 28 shows the top level drawing for the FPGA compilation. It is decomposed in the following blocks:

• Programmable delay, block 312 and Programmable delay control, block 334.

• One turn delay, block 311 and one turn delay control, block 313.

• Arithmetic Logic Unit, block 326.

• I2C Controller 2, block 321.

• Interlock and reset, block 337.

• User Module Interface 1a, block 230.

• Look-up table, block 316.

No internal node (i.e. RAM addressing etc.) is mapped to the device’s pins because this consumes internal resources (logic cells and logic blocks) and limits the maximum working frequency (the path to external pins is longer than the path from cell to cell). A cleaner PCB layout is also obtained. Test at the workbench is unnecessary; simulation is performed instead.

Fig 29 represents the simulation of several nodes for the compilation of fig. 28. It depicts the data flow showing the relations between data input, internal addressing and output function. The clock period is 200 ns, imposed by the simulator and the time resolution is 100 ps. The data input value is increased by 1 before each clock. The values for each node are read, after the clock transition, at an arbitrary time of 510 µs:

• 4th node input data from the ADC data[11..0] decimal value 498

• 5th node data after the prog. delay dataa[11..0] decimal value 3079

• 6th node data after the one turn delay datab[11..0] decimal value 0

• 8th node common read address raddress[10..0] decimal value 502

• 10th node write address, prog. delay wraddressa[10..0] decimal value 2015

• 11th node write address, one turn delay wraddressb[10..0] decimal value 299

• 20th node output function to the DAC fout[11..0] decimal value 1024

The last node is expanded into its terms fout11 down to fout0, to show the transition from the value 1023 to the value 1024. One can see the synchronous transition of all the outputs, 5 ns after the clock. The skew between the outputs is zero ns, indicating a perfect synchronism of the system, without any glitch. Fig. 30 presents the registered performance of the compilation and indicates a maximum clock frequency in excess of 149 MHz. The simulator also computes the critical path that is, in our case, the delay path of 6.7 ns from the raddress node to the programmable delay (block 312). These results confirm the validity of the method that has been employed to successfully handle the project.


Fig. 28 – Top drawing for the FPGA compilation.


Fig. 29 - Data flow showing relations between data input, internal addressing and output function.

Fig. 30 - Registered performance (149 MHz clock) with floor plan and critical path shown.

The simulation represented in fig. 30, shows the performance of the FPGA sections driven by the RF clock “clk”. The test is meant to indicate the ultimate frequency limit and the margin left, for the normal system operation. Tests performed for the UMB section, driven by the clock “SDclk”, show a frequency limit of 241 MHz. Similar tests performed for the I2C controller and PHOS4 circuits, driven by the “Xtal” clock, indicate a frequency limit of 121 MHz. The sections’ optimisation is implemented in their respective clock domains. The slowest clock dictates the speed of data synchronisation between sections. The throughput rate of the complete DSPU is in excess of 1 Gbit/sec.


Fig. 31 - I2C Controller, sub-set of the top-level drawing of fig. 28.

The project layout is hierarchical. By clicking inside a block, further sub-sets are accessed. Fig. 31 represents the I2C Controller section, block 321. The delay values, Delaya[13, 3..0] are sent to the block 137, i2ccntr, and serialised to the SDA node and clocked by the SCL clock. The sequencer1, block 144, generates the set of instructions needed to control the entire sequence. The 40 MHz Xtal clock clocks the counter, block 97, and the divider, block 141. This allows the parameterisation of the I2C sequence.

Fig. 32 - I2C timing showing the serial data and serial data clock during one cycle of instructions.

Fig. 32 represents the I2C timing during one cycle of instructions, showing the SDA (serial data) and the SCL (serial data clock). The Start_sequence signal triggers the start condition (SDA negative transition followed by the SCL negative transition, at time 60 µs). The End_sequence signal triggers the stop condition (SCL positive transition followed by the SDA positive transition, at time 1.1 ms). The Din[7..0] bus represents the binary input to the i2ccntr, block 137, serialised to the SDA. Referring to the data pattern for the PHOS4 (table 2), the sequence shows that the chip at address 1 is addressed and its channel 0 is loaded with a delay value of 9 ns. The blank line overrides the delay data during the chip addressing. The Dout[7..0] bus shows the values of the internal registers. A complete sequence needed to write a fine delay value is composed of 20 cycles of instruction.

10. SCHEMATIC DIAGRAM

The schematic diagram of the DSP Unit is drawn using the Logic Work Bench program from Cadence. The main peculiarities of the program are:

• The schematic diagram is partitioned into hierarchical sub-sets, linked by files that allow the global project update upon modification of a single block.

• The connecting “wires” and the components, both passive and active, are linked to files that contain models that are used for the post-layout signal analysis.

Fig. 34 represents the top drawing of the DSP Unit, SPS10213 Digital notch filter with programmable delay. The schematic diagram depicts the DSPU, built around the main block, labelled NOTCH1. Full buffering is provided for the input and the output signals. The connector J6 represents the input connection for the 16-bit wide input-bus to the input network block, I541NET. 74LVTH541 SOIC devices that are optimised for fast data transfer, in excess of 120 MSPS, compose the I541NET. Similar circuits, but with slower speed requirements, are used for the MODULE_BUS input block, driven by the connector J2 and the DELAY_VALUE block, driven by the front panel connector J1.

Fig. 33 depicts the NOTCH1 block, showing in detail the main FPGA section. The IC is programmed, through the JTAG connector, with the compilation that is represented in graphical form in fig. 28.

Referring again to fig. 34, the blocks PHOS4NET and BFR93NET represent the fine delay section and its associated low capacitance buffers, depicted in fig. 27. The fast line driver, O541NET, drives the output connector J5. The data and the 80 MHz clock are routed, with identical delays, from the input to the output connectors. Among the ancillary functions, one can see the 40 MHz quartz, QZ1 (clock name XTAL), and the optically isolated INTERLOCK output.

Fig. 33 - NOTCH1 - EPF10K100E FPGA section. The IC is programmed with the compilation of fig. 28.


Fig. 34 - SPS10213 Digital notch filter with programmable delay – Top drawing. The schematic diagram is used for the Printed Circuit Board placing & routing and the Signal Integrity Analysis, as covered in the following sections.

11. SIGNAL INTEGRITY

To obtain no-fail behaviour of the logic circuits, the electrical signals must conform to the logic thresholds defined by the IC’s manufacturers. These requirements impose a circuit analysis and evaluation of possible worst-case threshold violation. This approach is aimed at maintaining the Signal Integrity through the entire process. The "SPECCTRAQuest SI expert" is the CADENCE Signal Integrity set of tools for high-speed digital design, available at CERN on Unix (Solaris) systems. It has been used to perform post-layout analysis of the SPS10213 board, designed with Cadence software for issues such as technology choices, package type, crosstalk, overshoot and undershoot, reflections, topology scheme analysis and line termination analysis. It is also important to "guide the layout" work, defining printed circuit board stack-up, critical line topologies and impedance.

In order to run the simulation, each device is associated with an I/O Buffer Information Specification (IBIS) model. IBIS is a fast and accurate behavioural method for modelling I/O buffers. The models provide analogue characteristics for each pin of a given digital device, such as V/I and V/T curves, pin electrical equivalent circuit (package parasitics), logical thresholds, rise and fall times and clamping diodes.

Some of the simulation results are reported here, figs. 35-39, and have been presented at the LEB2000 [7] Workshop as an example of a successful project done by using from the beginning the complete methodology presented in chapter 5.

Fig. 35 depicts a portion of the original DSPU PCB. Conventionally, the PCB is manufactured at this stage of the project. Instead, following the proposed methodology, the virtual PCB shown in fig. 35 underwent extended tests for conformity to signal integrity, using the SPECCTRAQuest program. The board uses a fast (500 ps rise-time) FPGA and is therefore typical of a circuit that could exhibit signal-integrity non-conformity. Problems were found, with some signals not respecting the worst-case threshold level, as shown in fig. 36.

Circuits manufactured with PCBs that are affected by signal-integrity problems exhibit erratic behaviour that is difficult to analyse, and require further development, increasing both development time and cost. The topologies of the PCB tracks have been extracted (fig. 37 given as an example) and modifications to the original layout were implemented. The modifications consist mainly of path length matching, board stack-up optimisation to obtain the correct PCB track impedance, and the addition of series resistors for critical path termination.

Fig. 38 shows the modified PCB layout and fig. 39 presents the waveform, obtained for the modified layout, respecting now the high and low levels logic thresholds. Multi board simulation has also been performed, incorporating the connector and cable models.

The IBIS models were not available for the microwave transistors BFR93A, used as an interface between the PHOS4 devices and the output driver. Consequently the PSPICE analog modelling of this portion of the circuit was used.

Fig. 35 - Original PCB track with suspected signal-integrity problem.

Fig. 36 - Waveforms for original PCB track with suspected signal-integrity problem. Solid line: driver end; dashed line: receiver end.

Fig. 37 - SpecctraQuest output (topology extracted from PCB and with added series resistor).

Fig. 38 - Modified PCB layout with series resistor.

Fig. 39 - Waveforms for modified PCB layout with series resistor. Solid line: driver end; dashed line: receiver end.

12. BOARD MANUFACTURING

The DSP Unit is manufactured in a 1-unit wide NIM-bin plug-in module, conforming to the SL/HRF standard. Front side controls:

• Remote/Local switch, Notch filter ON/OFF switch, 3 digit thumbwheel hex switch for manual delay entry (front side controls enabled in 'Local' control only), flat cable connection for the UMB bus, LED indicator for internally detected faults & status and for active register & module acknowledge.

Rear side connectors:

• 40 pin flat cable connectors for digital input & digital output, optically isolated output for the summary status and power connector.

Fig. 40 shows the SPS10213 PCB components layout. The actual DSPU section occupies only a small portion of the board. Its layout is “neat”, as also shown in picture 41. The FPGA and the PHOS4 devices are placed in front of the data bus connectors. No test points are provided, except for the 80 MHz and 40 MHz clocks.

Fig. 40 – SPS10213 PCB component side.

The PCB is designed with the Cadence/Allegro programs. It is implemented with 6 layers, allowing input and output-bus separation by ground planes. The board stack up is optimised for an impedance of 70 ohms and the data-bus tracks are optimised for equal length path.

Fig. 42 is a close-up view of the DSPU module, showing the FPGA device and the PHOS4 devices. The 40 pin digital connectors are mounted side by side, to facilitate the DSPU interconnectivity to the ADC and DAC modules, as shown in picture 43 and 45. The output connector, J5 is routed, on the PCB layer 1, to the rear panel connector, through a mezzanine card that has also been optimised for the signal-integrity. The output connectors and the mezzanine card are visible in fig. 43. The input data bus, from the connector J6, is split into two ways and routed on the PCB layer 6, not shown here, to the input buffers, placed symmetrically on each side of the FPGA. Fig. 44 is a front view of two complete chains.

Power is distributed to each sub-section of the module (i.e. input bus, transistor buffers, output bus, internal FPGA core, input/output FPGA cells, etc.) through individual regulators, feeding low inductance power planes. The SMD regulators are soldered on PCB copper areas, used as heat sinks. The surfaces have been calculated on the basis of the estimated power consumption for each sub-section, according to the regulator IC data sheet. The 6 V rail powers the module, which runs with no noticeable rise in temperature and draws a total current of 0.8 A at full dynamic load.

Fig. 41 – SPS10213 Top layer 1 showing the clean PCB layout. The copper areas are used as DC regulators heat sink.


Fig. 42 – The close-up view of the DSPU SPS10213 module shows the EPF10K100EQC208-1 FPGA device from Altera and the PHOS4 devices, installed in low-profile sockets. The BFR93A transistors and the output line-drivers are visible at the right. The two chips, above and under the FPGA, are the input drivers. At the left of the picture, from top to bottom, one can see the input buffers for the local delay control, the 40 MHz quartz oscillator and the input buffer for the User Module Bus.

Fig. 43 - Detail of the DSPU SPS10213 module rear panel, showing the 40 pin connectors provided for the input and output bus. Fast locking of the flat cable data bus is provided. The mezzanine card, slightly visible behind the connector at the top, interconnects the main PCB to the output bus connector.

Fig. 44 - Front view of two complete chains. From the top to bottom, from the left to the right, the front panel functions are:

SPS10211 ADC - Test points input and ADC/DAC.

SPS10213 DSPU – Ready status indicator; thumbwheel switch for the manual entry of the delay; Notch ON/OFF and led indicators; Local/Remote control and led indicators; Online/Offline registers and Module Acknowledge indicators; User Module Bus, 10 pin connector.

SPS10212-1 DAC - test point output.

Fig. 45 - Rear view of two complete chains. From the top to bottom, from the left to the right, the rear panel functions are:

SPS10212-1 DAC - DAC output; external clock (provided for test purpose); 40 pin input connector.

SPS10213 DSPU – Floating interlock output; output connector; input connector.

SPS10211 ADC – Signal input; clock input.

The connection through flat cables represents the standard module layout in operation.

13. LABORATORY TEST

The following pictures, figs. 46 to 48, summarise the test results for the SPS10213 - Digital notch filter with programmable delay, associated with the data conversion units, SPS10211 - 12 BIT 120 MSPS ADC and SPS10212-1 - 14 BIT 160 MSPS DAC (clock 80 MHz). Fig. 46 shows the response to a damped sine wave signal that is delayed by the programmed 23 µs delay. Fig. 47 depicts the action of the programmable delay. It shows the original and the delayed signals (delay of one clock period of 12.5 ns) and the clean output.

Fig. 46 - Test with the arbitrary waveform generator AWG 2021 and scope HP 54616C. Notch OFF. Top trace: ADC input. Bottom trace: DAC output, after 23 µs delay. Amplitude: 200 mV/div. Time: 5 µs/div.

Fig. 47 – Test set-up as in fig. 46. Top trace: ADC input. Bottom trace: DAC output, after 19.75 µs delay. Expanded trace: DAC output, incr. delay 12.5 ns. Amplitude: 400 mV/div. and 20 mV/div. Time: 5 µs/div. Expanded time 5 ns/div.

Fig. 48 represents the transfer function of the ADC/DSPU/DAC modules, tested with the Notch ON and fCLK of 80.10577 MHz. The situation corresponds to the 26 GeV injection energy of the LHC beam into the SPS and an fRF of 200.264 MHz. In this case the revolution frequency is:

$$f_0 = \frac{f_{RF}}{h} = \frac{200.264\ \text{MHz}}{4620} = 43.347\ \text{kHz}$$

This corresponds to the marker 1 position, placed on the first notch of the measured transfer function, indicating a correct functioning of the DSPU. The phase variation between two consecutive notches exceeds 180° due to the programmable delay placed before the notch function that is set to an arbitrary value.
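The position of the first notch can be cross-checked numerically. The Python sketch below evaluates the magnitude of the ideal notch term $(1 - z^{-N})$ on the unit circle. The number of delay samples N is not quoted in this section; here it is assumed to be the number of clock periods per turn, fCLK/f0 rounded to the nearest integer. The function name `notch_gain` is hypothetical.

```python
import cmath
import math

F_RF_HZ = 200.264e6     # RF frequency at 26 GeV injection
HARMONIC = 4620         # harmonic number h
F_CLK_HZ = 80.10577e6   # sampling clock used for the test

f0_hz = F_RF_HZ / HARMONIC       # revolution frequency
n_turn = round(F_CLK_HZ / f0_hz) # assumption: one-turn delay in clock periods


def notch_gain(f_hz, n=n_turn, f_clk_hz=F_CLK_HZ):
    """Magnitude of (1 - z^-n) at frequency f_hz for sampling rate f_clk_hz."""
    omega = 2.0 * math.pi * f_hz / f_clk_hz
    return abs(1.0 - cmath.exp(-1j * omega * n))


first_notch_hz = F_CLK_HZ / n_turn
gain_at_notch = notch_gain(first_notch_hz)
gain_between = notch_gain(1.5 * first_notch_hz)
```

Under this assumption the notches repeat at every multiple of the revolution frequency, with maximum gain (a factor of 2) midway between two notches.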

Fig. 48 Transfer function of the ADC/DSPU/DAC modules. Ch. 1: Magnitude, log scale, 10 dB/div. Ch. 2: Phase, 90 °/div. Start frequency: 10 kHz; stop frequency: 100 kHz

14. RESULTS WITH BEAM

The following measurements were made during various MD sessions at the SPS, in the presence of the LHC-type beam. The pictures show the performance obtained by the new digital notch filter with programmable delay and associated data conversion, for the new transverse damper. The results presented in figs. 49 and 50 refer to the “20 MHz mode” of operation. They are aimed at evaluating the new [...]’ capability to process the beam signals, on a bunch-by-bunch basis.

The sum signal from the transverse damper horizontal plane pick-up is split in two. One branch excites an 80 MHz band pass filter and is used as clock signal. The other branch signal is sent to the ADC and sampled. Fig. 49 shows the beam signal and the DAC output. In this configuration the pair ADC/DAC (without digital filter) has a unity gain inversion. The “peak” to “valley” ratio, in the DAC output signal, is a function of the beam signal to clock phase.

Fig. 50 shows the beam signal for the complete LHC batch. The DAC output is delayed by 200 ns due to the double conversion latency. The first 300 ns of the beam signal are not sampled, due to the lag response of the 80 MHz BP filter, used for the beam-derived clock generation. The ADC/DAC operate at their maximum bandwidth.

Fig. 51 shows the beam signal applied to the ADC, via a 20 MHz low pass filter, and the corresponding DAC output. By comparing fig. 51 to fig. 50 we can see the lower signal bandwidth that will put less stringent requirements on the transverse feedback power amplifier.

Fig. 49 – Results of the 12th November 1999 MD. Ch. 1 (50 mV/div.): Beam signal. Ch. 2 (50 mV/div.): DAC output. Time (12.5 ns/div.). As explained in the text, the most negative value in the DAC output reflects the positive peak value in the beam signal.

Fig. 50 – Results of the 12th November 1999 MD. Ch. 1 (100 mV/div.): Beam signal. Ch. 2 (100 mV/div.): DAC output. Time (500 ns/div.).

Fig. 51 – Results of the 12th November 1999 MD. Ch. 1 (100 mV/div.): Filtered beam signal. Ch. 2 (100 mV/div.): DAC output. Time (500 ns/div.).

In fig. 52, the performance of the new 12-bit/80 MSPS ADC/DAC without digital filter is compared to the existing equipment, the 8-bit/33 MSPS notch filter and delay. The signal applied to the 12-bit system is attenuated by 10 dB, in comparison with the 8-bit system, and still reproduces faithfully the beam signal. The 8-bit system response is delayed by 23 µs, in comparison with the 12-bit system. Sampling at 33 MHz is not sufficient for a 20 MHz bandwidth operation, as required by the specifications. This alone already required a new DSPU.

Picture 53 has been taken at the output of two DSPU 12-bit 80 MSPS feeding the new transverse feedback systems. It shows the beam damping in the horizontal and vertical planes. The damping time is 1.5 ms.

Figs. 54 and 55 show the rejection of the closed orbit in the horizontal plane. In fig. 54 the signal measured at the damper input shows no observable betatron oscillation. In fig. 55 the signal measured at the damper input presents strong betatron oscillation, due to instabilities driven by the electron cloud effect.

Fig. 52 – Results of the 3rd November 1999 MD. Ch. 1 (10 mV/div.): ADC/DAC 12-bit 80 MSPS. Ch. 2 (50 mV/div.): Filter 8-bit 33 MSPS, delayed. Time (5 µs/div.).

Fig. 53 – Results of the 29th August 2000 MD. Ch. 1 (100 mV/div.): Horizontal plane. Ch. 2 (100 mV/div.): Vertical plane. Time (1 ms/div.).

Fig. 54 – Results of the 15th August 2001 MD. Ch. 1 (100 mV/div.): Beam signal before the DSPU. Ch. 2 (50 mV/div.): Signal at the damper input.

Fig. 55 – Results of the 15th August 2001 MD. Ch. 1 (100 mV/div.): Beam signal before the DSPU. Ch. 2 (50 mV/div.): Signal at the damper input.

15. BETATRON PHASE ADJUSTMENT

In a transverse feedback system, the phase advance between the signal applied to the beam, at the RF kickers location, and the beam itself must be an odd multiple of $\pi/2$ at betatron frequencies, to obtain optimum beam damping. The betatron phase angle between pick-up and RF kicker can be adjusted by combining the difference signals from two pick-ups separated by a quarter betatron wavelength, ref. [2].

As shown in chapter 4, the bunch position signal “sampled” by the pick-up at the pick-up location creates $(f_\beta)^+$ and $(f_\beta)^-$ sidebands that have opposite phase, at the fractional part $Q_f$ around each revolution frequency line $m \cdot f_0$. Therefore an ideal betatron phase adjustment provides a phase shift for the first half of the fractional tune and an opposite phase shift for the second half, with unity gain at all frequencies. This characteristic must repeat around each revolution frequency line.

A single pick-up scheme with digital signal processing is proposed (fig. 56, see also ref. [8]). It can be used in machines where only one beam position monitor per plane is available (PS).

Fig. 56 – Betatron phase adjustment with one pick-up scheme.

The principle comes from the trigonometric relation (Eq. 6), which can be rewritten:

$$\cos(\omega t \pm \varphi) = \cos\omega t \cdot \cos\varphi \mp \cos\left(\omega t - \frac{\pi}{2}\right) \cdot \sin\varphi \tag{11}$$

The difference signal from the pick-up undergoes digital signal processing and is represented by the sequence x(n) . The sampled pick-up difference signal x(n) is split into two branches.

• The upper block produces the in-phase yi(n) component of the input. It has a unity gain and zero added phase in the fractional tune bands.

• The lower block produces the quadrature yq(n) component of the input. It has a unity gain and a + 90° added phase in the 0 to 0.5 fractional tune bands and a – 90° added phase in the 0.5 to 1.0 fractional tune bands.

For a 90° phase shift, the ideal frequency response (unity gain and change of sign at each half of the fractional tune x) is depicted by the odd symmetric function shown in fig. 57.

Fig. 57 - Odd symmetric function (Im[H] plotted against x, from −2π to 4π, with unit amplitude).

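Im[H] and Re[H] are the imaginary part and the real part of the frequency response, defined as follows:

$$\operatorname{Im}[H] = \begin{cases} -1 & \text{for } (2n-1)\pi < x < 2n\pi \\ +1 & \text{for } 2n\pi < x < (2n+1)\pi \end{cases} \qquad n \in \mathbb{Z} \text{ (any integer value of } n\text{)}, \qquad \operatorname{Re}[H] = 0$$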

The Fourier expansion is represented by a series of odd terms:

$$\operatorname{Im}[H] = \frac{4}{\pi}\left(\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \dots\right)$$

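By substitution, $x = \omega T_0 = 2\pi f / f_0$, one obtains:

$$\operatorname{Im}[H] = \frac{4}{\pi}\sin\left(2\pi\frac{f}{f_0}\right) + \frac{4}{\pi}\cdot\frac{1}{3}\sin\left(3\cdot 2\pi\frac{f}{f_0}\right) + \frac{4}{\pi}\cdot\frac{1}{5}\sin\left(5\cdot 2\pi\frac{f}{f_0}\right) + \dots$$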

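Now, with $\operatorname{Re}[H] = 0$, we get:

$$H = j\cdot\left[\frac{4}{\pi}\sin\left(2\pi\frac{f}{f_0}\right) + \frac{4}{\pi}\cdot\frac{1}{3}\sin\left(3\cdot 2\pi\frac{f}{f_0}\right) + \frac{4}{\pi}\cdot\frac{1}{5}\sin\left(5\cdot 2\pi\frac{f}{f_0}\right) + \dots\right]$$

Recalling that $2j\sin\left(2\pi f/f_0\right) = e^{j2\pi f/f_0} - e^{-j2\pi f/f_0}$ and truncating the series after the 3rd harmonic (the $\sin 3x$ term), one obtains:

$$H = \frac{2}{\pi}\left(e^{j\omega T_0} - e^{-j\omega T_0} + \frac{1}{3}e^{j3\omega T_0} - \frac{1}{3}e^{-j3\omega T_0}\right) \tag{12}$$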

From (Eq. 8) we can rewrite (Eq. 12) as follows:

$$H(z) = \frac{2}{\pi}\left(Z^{+N} - Z^{-N} + \frac{1}{3}Z^{+3N} - \frac{1}{3}Z^{-3N}\right) \tag{13}$$

The terms $Z^{+N}$ and $Z^{+3N}$ indicate an output advanced by N and 3N samples respectively. This means that the current output depends on the system's future input. In order to design a realizable function, the system delays the input signal by 3N samples ($Z^{-3N}$) before producing an output. This is acceptable at the SPS because the growth time of the instabilities is significantly longer than 3 revolution periods. However, it increases the sensitivity of the system to tune changes.
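As a worked step (a sketch only; the symbol $H_c$ is introduced here for illustration and does not appear in the original), multiplying (Eq. 13) by the realisation delay $Z^{-3N}$ leaves only non-positive powers of $Z$:

$$H_c(z) = Z^{-3N}\,H(z) = \frac{2}{\pi}\left(Z^{-2N} - Z^{-4N} + \frac{1}{3} - \frac{1}{3}Z^{-6N}\right)$$

The longest delay is then 6N samples, that is six revolution periods, and the centre of the tap sequence sits at $Z^{-3N}$.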

15.1 Hilbert filter

A system that accepts a real-valued signal and produces a complex (I,Q) output signal is called a Hilbert filter. The order M of the filter is odd and it corresponds to the truncation in the Fourier series (Eq. 13). The choice of the order M defines the approximation to the ideal function (fig. 57), the resources required for its practical realisation and the propagation delay in its response.

The quadrature component (Q) of the output signal is produced by a FIR filter. The in-phase component (I) is simply the input signal delayed by an appropriate amount to compensate for the phase delay of the FIR process employed for generating the Q output. This is achieved by accessing the center tap of the sample history delay of the Q channel FIR filter, as shown in figure 58. In this figure x(n) is the real-valued input signal and yi(n) and yq(n) are the in-phase and quadrature outputs respectively.

Fig. 58 - FIR filter realization of a Hilbert transformer of order M = 3.

From (Eq. 13) we can derive the filter coefficients:

$$h(-1) = \frac{2}{\pi};\quad h(1) = -\frac{2}{\pi};\quad h(-3) = \frac{2}{3\pi};\quad h(3) = -\frac{2}{3\pi};\quad h(-n) = -h(n)$$
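The coefficients can be checked numerically. The following is a minimal sketch, not the FPGA implementation: the function names `hilbert_taps` and `response` are introduced here for illustration, and the response is evaluated in its non-causal form (before the 3-turn realisation delay), with tap index n counted in revolution periods.

```python
import cmath
import math

def hilbert_taps(order=3):
    """Return {n: h(n)} for odd n up to `order`, n counted in revolution periods.

    Negative n is an advance (Z^{+nN}), positive n a delay (Z^{-nN}).
    """
    taps = {}
    for n in range(1, order + 1, 2):
        taps[-n] = 2.0 / (n * math.pi)
        taps[n] = -2.0 / (n * math.pi)
    return taps

def response(taps, x):
    """Frequency response at x = omega*T0 = 2*pi*f/f0 (non-causal form)."""
    return sum(h * cmath.exp(-1j * n * x) for n, h in taps.items())

if __name__ == "__main__":
    taps = hilbert_taps(3)
    for tune in (0.1, 0.25, 0.4, 0.6, 0.75, 0.9):
        h = response(taps, 2.0 * math.pi * tune)
        print(f"tune {tune:4.2f}: Re[H] = {h.real:+.3f}, Im[H] = {h.imag:+.3f}")
```

With only two harmonics retained, the response stays purely imaginary and changes sign at tune 0.5, but its magnitude at tune 0.25 is 8/(3π), about 0.85, rather than the ideal value of 1.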

The negative symmetry can be utilized to produce an efficient realization. Figure 59 shows the final architecture for the betatron phase adjustment system that exploits the negative symmetry characteristics of the impulse response, with the in-phase multiplier added. It realizes the structure sketched in fig. 56. The final coefficients are calculated by multiplying the quadrature coefficients $h(n)$ by $\sin\Delta\varphi$. The in-phase signal coefficient $h(0)$ is multiplied by $\cos\Delta\varphi$. The following notation is employed: h(phase value, coefficient number):

$$\begin{aligned} h(\Delta\varphi, 0) &= \cos\Delta\varphi \\ h(\Delta\varphi, 1) &= -\frac{2}{\pi}\sin\Delta\varphi \\ h(\Delta\varphi, 3) &= -\frac{2}{3\pi}\sin\Delta\varphi \\ h(\Delta\varphi, M) &= -\frac{2}{M\pi}\sin\Delta\varphi \end{aligned}$$

Fig. 59 – Realisation of a Hilbert transformer with phase adjustment, by exploiting the negative symmetry.
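The effect of the phase-dependent coefficients on the transfer function can be sketched as follows. This is an illustrative model under stated assumptions, not the hardware description: the names `phase_adjust_taps` and `gain_and_phase` are hypothetical, the coefficients are kept in floating point, and the realisation delay of M turns is left out so that only the phase added by the filter is shown.

```python
import cmath
import math

def phase_adjust_taps(dphi_deg, order=3):
    """In-phase tap cos(dphi) at n = 0, quadrature taps h(n)*sin(dphi) at odd n."""
    dphi = math.radians(dphi_deg)
    taps = {0: math.cos(dphi)}
    for n in range(1, order + 1, 2):
        c = 2.0 / (n * math.pi) * math.sin(dphi)
        taps[-n] = c    # h(dphi, -n) = -h(dphi, n)
        taps[n] = -c    # h(dphi, n) = -(2 / (n*pi)) * sin(dphi)
    return taps

def gain_and_phase(dphi_deg, tune, order=3):
    """Gain and phase (degrees) at a fractional tune, realisation delay removed."""
    x = 2.0 * math.pi * tune
    taps = phase_adjust_taps(dphi_deg, order)
    g = sum(h * cmath.exp(-1j * n * x) for n, h in taps.items())
    return abs(g), math.degrees(cmath.phase(g))

if __name__ == "__main__":
    for dphi in (45, 90):
        for tune in (0.25, 0.75):
            gain, phase = gain_and_phase(dphi, tune)
            print(f"dphi {dphi:3d} deg, tune {tune}: gain {gain:.3f}, phase {phase:+.1f} deg")
```

In this model a programmed phase of 90° is reproduced exactly, with opposite sign in the two halves of the fractional tune band, whereas a programmed 45° gives roughly 40° at tune 0.25 because the truncated quadrature branch has a gain below unity.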

15.2 Realisation

To implement the notch, the programmable delay and the Hilbert filter of order M=3, shown in fig. 60, 8 “one turn delay” blocks are necessary. To accommodate the compilation in the same FPGA and to exploit the available memory, the harmonic number has been reduced by a factor of 4, from 1848 to 462, implying a clock frequency of 20 MHz and a Nyquist limit of 10 MHz. This is at present a limitation on performance (20 MHz required), which is being addressed (use of an FPGA with greater RAM size). The coefficient values are truncated to 8 bits and stored in a look-up table in two's complement format. The ALU Hilbert block is “added” directly at the output bus, as visible in fig. 61. With the new compilation no hardware modification has been necessary.
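The coefficient truncation can be illustrated with a short sketch. The scaling is an assumption: the text does not state the fixed-point format, so a signed fraction with 7 fractional bits (full scale 128) and truncation toward zero are assumed here, and the names `to_twos_complement_8bit` and `coefficient_table` are hypothetical.

```python
import math

def to_twos_complement_8bit(value, frac_bits=7):
    """Truncate `value` to a signed 8-bit code; return (signed code, stored byte).

    Assumed format: 7 fractional bits, truncation toward zero, saturation
    at the limits -128 and +127.
    """
    code = int(value * (1 << frac_bits))   # int() truncates toward zero
    code = max(-128, min(127, code))       # saturate to the 8-bit range
    return code, code & 0xFF               # two's complement bit pattern

def coefficient_table(dphi_deg, order=3):
    """Look-up table entries for h(dphi, 0), h(dphi, 1), h(dphi, 3), ..."""
    dphi = math.radians(dphi_deg)
    table = {0: to_twos_complement_8bit(math.cos(dphi))}
    for n in range(1, order + 1, 2):
        coeff = -2.0 / (n * math.pi) * math.sin(dphi)
        table[n] = to_twos_complement_8bit(coeff)
    return table

if __name__ == "__main__":
    for n, (code, byte) in coefficient_table(90).items():
        print(f"h(90, {n}) -> code {code:+d}, byte 0x{byte:02X}")
```

Under these assumptions, a programmed phase of 90° stores h(90°, 1) as the code −81 (byte 0xAF) and h(90°, 3) as −27 (byte 0xE5).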

The front panel switch functions have been reprogrammed as follows:

NOTCH ON/OFF → DELAY/PHASE
LOCAL/REMOTE → HOLD/WRITE
Hex switch “DELAY” → Hex switch “DELAY/PHASE”

Fig. 60 - Programmable delay, notch filter and programmable phase Hilbert filter.

The module notch filter with programmable delay and the betatron phase adjustment behaved as foreseen. Laboratory measurements prior to beam tests are summarised in the following figs. 62 to 66. These refer to measurements of the DSPU programmed for the notch filter with programmable delay and the betatron phase adjustment for the SPS. The clock frequency is 20 MHz.

Fig. 62 shows the DSPU transfer function, with the notch OFF, in the band 43 kHz to 86 kHz corresponding to the fractional tune 0 to 1. The phase function reverses at tune 0.5. The phase transfer function is rigorously correct for a programmed phase of 90° but it is approximate for the other values, due to the finite number of coefficients chosen (M=3). For the same reason the amplitude is not constant in the fractional tune band. The amplitude tends to a value of zero (stop bands) at tune 0, 0.5 and 1, depending on the phase settings, due to the intrinsic structure of the filter. The phase and the amplitude are measured at the inflexion points of the curves.

The following pictures are taken with the notch ON. In fig. 63, with the betatron phase set to 0°, the phase varies linearly from 90° to –90°. The action of the Hilbert filter is such that the betatron phase is added to the notch phase in the 0 to 0.5 band and subtracted from the notch phase in the 0.5 to 1 band. In the second trace of this picture, the betatron phase is programmed to an arbitrary value of 47°.

Fig. 64 shows the transfer function in the fractional tune band (0 to 1) for an arbitrary N·f0 of about 9 MHz. A programmed delay of 15 ns introduces a slope in the phase function that shows as an added phase of $360^\circ \cdot 9\cdot10^{6} \cdot 15\cdot10^{-9} \approx 48^\circ$ at 9 MHz. The separate effects of the programmable delay and the betatron phase adjustment are clearly visible in the pictures.
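The phase contributions quoted for figs. 63 and 64 can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the notch is modelled as an ideal one-turn subtraction, 1 − exp(−j·2π·tune), the Hilbert filter as ideal (exactly ±Δφ), and the function names are introduced here for illustration only.

```python
import cmath
import math

def notch_phase_deg(tune):
    """Phase of the assumed notch 1 - exp(-j*2*pi*tune), for 0 < tune < 1."""
    return math.degrees(cmath.phase(1 - cmath.exp(-2j * math.pi * tune)))

def betatron_phase_deg(tune, dphi_deg):
    """Ideal Hilbert action: +dphi in the 0 to 0.5 band, -dphi in the 0.5 to 1 band."""
    return dphi_deg if tune < 0.5 else -dphi_deg

def delay_phase_deg(freq_hz, delay_s):
    """Magnitude of the phase introduced by a pure delay at a given frequency."""
    return 360.0 * freq_hz * delay_s

if __name__ == "__main__":
    tune = 143.0 / 180.0   # tune at which the linear notch phase equals -53 deg
    notch = notch_phase_deg(tune)
    total = notch + betatron_phase_deg(tune, 47.0)
    print(f"notch {notch:.1f} deg, total {total:.1f} deg")
    print(f"delay phase {delay_phase_deg(9e6, 15e-9):.1f} deg")
```

The notch phase in this model is 90° − 180°·tune, which gives the linear run from 90° to −90° across the band and the total of −100° at marker 2. The delay product evaluates to 48.6°, which the text quotes as 48°.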

Fig. 61 – Top drawing for the FPGA compilation with betatron phase adjustment.

Fig. 62 - DSPU transfer function in the fractional tune band (0 to 1), corresponding to f0 (43 kHz) to 2·f0 . The values are measured at the inflexion points of the curves.

Notch OFF; phase programmed to 45° and 90°, respectively.

Top traces: Amplitude (log. scale 3dB/div). Bottom traces: Phase (lin. scale 45°/div).

Fig. 63 - DSPU transfer function in the fractional tune band (0 to 1) for an arbitrary N·f0 of 9.001 MHz.

Notch ON; delay programmed to 0 ns; betatron phase programmed to 0°.

Notch ON; delay set to 0 ns and phase to 47°. At the marker 2 position, corresponding to the 0.5 to 1 fractional tune band, the betatron phase is subtracted from the phase introduced by the notch (the linear phase from 90° to −90° has a value of −53° at the marker 2 position). The total phase at marker 2 is −53° − 47° = −100°.

Fig. 64 - DSPU transfer function in the fractional tune band (0 to 1) for an arbitrary N·f0 of 9.001 MHz.

Notch ON; delay 0 ns; phase 47° (memory trace from the previous picture).

Notch ON; delay 15 ns; betatron phase 47°. The delay introduces an added phase of $360^\circ \cdot 9\cdot10^{6} \cdot 15\cdot10^{-9} \approx 48^\circ$ at 9 MHz.

Fig. 65 is taken with the notch ON. Flying butterflies sweep across the screen… The polar display is taken for a band covering 5 revolution frequencies (span 220 kHz). As the betatron phase is increased from 0° to 360°, the loci of the transfer function move from perfect circles to flattened curves. This is due to the non-constant amplitude and phase functions vs. frequency of the quadrature filter. This behaviour should be taken into consideration for a complete stability analysis of the feedback system.

Fig. 65 - Polar display for N·f0 of 9.001 MHz, taken for a band covering 5 revolution frequencies (span 220 kHz). Notch ON; delay programmed to 0 ns; betatron phase programmed to 0°, 45°, 90°.

The markers 2 and 3 are placed at ± 3 kHz from the central frequency of each band, simulating the loci of arbitrary betatron frequencies $(f_\beta)^+$ and $(f_\beta)^-$. Measurements are relative to marker 1, placed at the central frequency. The marker 2, MEM trace, indicates a phase of 12.55° due to the notch phase and a betatron phase of 0°. The marker 2, S21 trace, indicates a phase of 102.55° due to the betatron phase change from 0° to 90°.

Fig. 66 shows the unit step response of the DSPU with the notch filter ON, the programmable delay set to 20 µs and the betatron phase set to 45°.

• The 1st impulse response at time 20.0 µs is due to the programmable delay. Its amplitude and negative sign are due to the multiplication of the input step by the coefficient $h(\Delta\varphi, 3) = -\frac{2}{3\pi}\sin\Delta\varphi$ in the Hilbert filter.

• The 2nd impulse response at time (20.0 µs + 23.1 µs) is due to the one turn delay placed before the notch filter subtractor. Its amplitude should be identical and its sign opposed to those of the 1st impulse.

• The 3rd impulse response at time (20.0 µs + 2·23.1 µs) is due to summing the multiplied signals in the Hilbert filter sample sequence.

• The main output is at time (20.0 µs + 4·23.1 µs = 112.4 µs). The impulse response has the correct polarity. The pulse pattern changes, depending on the phase settings.

Fig. 66 – Impulse response of the DSPU, with notch filter, programmable delay and betatron phase adjustment.

Ch. 1 Step input, 500 mV/div. Ch. 2 System response at DAC output, 200 mV/div. Time 20 µs/div.

15.3 MD results

An existing filter module SPS 10213 was reprogrammed with the new compilation and successfully tested with beam at the SPS, during the 10/07/2002 MD. The module was tested with the V1 damper and the standard pick-up BPV213.09. The clock frequency was set to 20 MHz and the bandwidth limited to 6 MHz. The Hilbert filter was successfully set up at 14 GeV/c and the damping was studied as a function of the tune. The Hilbert filter behaves as expected and permits betatron phase adjustment with a single pick-up. A single batch of $1.3\cdot10^{13}$ protons (fixed target beam) was well controlled during the whole cycle. During the low intensity MD phase only damper V1 was used. The sensitivity to tune variations of the Hilbert filter with notch and delay was confirmed as being three times higher than that of the notch and delay alone, due to the 6 supplementary one-turn delays: $\Delta\varphi = 2\pi\cdot\delta Q\cdot(1.5 + M)$, where $M = 3$ is the order of the Hilbert filter. An MD Note [9] will summarise the observations made.
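The tune sensitivity formula can be evaluated directly. The sketch below is illustrative: the function name `phase_error_deg` is hypothetical, the 1.5-turn term is taken as given from the formula, and M = 0 is used here to stand for the notch and delay alone.

```python
import math

def phase_error_deg(delta_q, hilbert_order=0):
    """Betatron phase change (degrees) for a tune change delta_q.

    Implements dphi = 2*pi*delta_q*(1.5 + M); M = 0 stands for notch and
    delay alone, M = 3 for the Hilbert filter of order 3.
    """
    return math.degrees(2.0 * math.pi * delta_q * (1.5 + hilbert_order))

if __name__ == "__main__":
    delta_q = 26.592 - 26.563   # tune change between figs. 67 and 69
    print(f"notch and delay alone: {phase_error_deg(delta_q):.1f} deg")
    print(f"with Hilbert filter:   {phase_error_deg(delta_q, 3):.1f} deg")
```

The ratio (1.5 + 3)/1.5 = 3 is the factor of three quoted above. For the tune change of 0.029 between figs. 67 and 69 the formula gives about 47° with M = 3, of the same order as the 40° by which the betatron phase setting was changed between figs. 69 and 70.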

Fig. 67 – Tune 26.592. Injection oscillations damped by the V1 damper. DSPU: delay 20.1 µs; betatron phase 22°.

Ch. 1 (200 mV/div.) Pick-up ∆ signal. Ch. 2 (200 mV/div.) Output of DSPU. Time (1 ms/div.).

Fig. 68 – Tune 26.588. Kick 4 mm at 200 ms (positive signal at time 100 µs, upper trace). Turn by turn picture showing the negative response of the DSPU (lower trace) one machine turn after the kick. The system response is correct at the 5th machine turn (time 115.5 µs in fig.). Compare the system behaviour to the test signals, fig. 66.

Fig. 69 – Tune 26.563. Injection oscillations not correctly damped due to the wrong betatron phase. DSPU: delay 20.1 µs; betatron phase 22°.

Ch. 1 (200 mV/div.) Pick-up ∆ signal. Ch. 2 (200 mV/div.) Output of DSPU (amplified).

Fig. 70 – Tune 26.563. Injection oscillations correctly damped following change of the betatron phase. DSPU: delay 20.1 µs; betatron phase 62°.

Ch. 1 (200 mV/div.) Pick-up ∆ signal. Ch. 2 (200 mV/div.) Output of DSPU (amplified).

16. CONCLUSIONS

This work presents a new method of global digital signal processing applied to particle accelerators. The method is applicable to many systems of an accelerator complex. The solutions presented here have the potential to implement, possibly in a single device, the transfer functions and associated controls foreseen for the LHC era. This can lead to increased speed and resolution and a dramatic reduction in the equipment’s size. The equipment is designed with a reprogramming capability and can thus be reused in future designs. Complete simulation at the system level replaces (or reduces) workbench tests. The solution adopted for the circuit topology allows the designer to channel resources into the development of new transfer functions, without the need to design new hardware for each project.

In this work, I have presented the digital notch filter with programmable delay as an example of the successful implementation of the new method of digital signal processing applied to particle accelerators. This unit is integrated in the transverse damper system for the SPS and allows the multi-cycle operation of the SPS. It is believed that the essential information concerning the development, tests, manufacturing and actual results with beam has been reported.

This work was selected, in a presentation at the 6th Workshop on Electronics for LHC Experiments, Cracow, Poland, September 2000, as an example of a design carried out with electronic design automation tools for high-speed electronic systems [7].

The digital notch filter with programmable delay has been presented, as a critical element of the new transverse damper for the SPS and LHC [10], to the Workshop Chamonix XI - January 2001.

I presented the concept of global digital signal processing applied to particle accelerators and the digital notch filter with programmable delay to the Workshop on DSP Applications in the SL Division - November 2001.

The successful implementation of the betatron phase adjustment with a Hilbert transformer reinforces the validity of the new method of global digital signal processing as applied to particle accelerators.

17. FUTURE DEVELOPMENTS

The universality of both the DSPU Unit and the data converters permits the implementation of new functions. The following projects are presently under evaluation at CERN:

• Modifications of the transfer function to obtain a flat response near integer tune values at the SPS. Studies on the electron cloud effect at the SPS indicate possible tune values: QH 26.17 and QV 26.24. The new compilation is compatible with the present FPGA.

• Modifications of the delay parameters and the transfer function for near integer tune values, with automatic delay compensation, for the PS complex. The consequences for the PS of the dependency of the betatron phase change on the tune change will be investigated. The new compilation is compatible with the present FPGA.

• Modifications of the delay parameters for the LHC. The new compilation requires an FPGA with greater RAM size. The new FPGA footprint is compatible with the same PCB.

• Digital down conversion for the damper pick-up. The new compilation is compatible with the present FPGA.

• Betatron phase adjustment (one pick-up with full bandwidth). The new compilation requires an FPGA with greater RAM size. The new FPGA footprint is compatible with the same PCB.

Acknowledgements

I thank many of my SL/HRF colleagues. In particular I thank the group leader, T. Linnecar, and the section leader, T. Bohl, for having supported the project of the new method of global digital signal processing applied to particle accelerators. Thanks to E. Bracke and J. Molendijk for their software and hardware development, necessary for the remote control of the notch filter with programmable delay. Thanks to the Transverse Damper project leader, W. Höfle, for all the fruitful information and for his collaboration in integrating the new digital notch filter with programmable delay into the new transverse feedback for the SPS and LHC. The pictures in chapter 14, results with beam, and subsequently in 15.3, MD results, represent the synthesis of our joint measurements.

The IT/PS group supports the CAE tools at CERN. Their expertise, application support and collaboration have been indispensable to the success of the project of the digital notch filter with programmable delay. Acknowledgements and thanks to: S. Brobecker - Cadence/Concept and Logic Work Bench programs (schematic) and MAXPLUS II (FPGA compilation and programming), M. Manent and A. Thys - system and application support, J.-M. Sainson - pre-layout and post-layout Signal Integrity Analysis, Ph. Fraboulet (Volontaire International Administratif) - PSpice simulations under the supervision of J.-M. Sainson, M. Couturier and K. Zumbrock - special components library.

Acknowledgements and thanks to F. Eudaric, A. Monfort, P. Vulliez (EST-DEM division) for their competent contribution to the development of the several PCBs and plug-in mechanics involved in the project. I thank my colleagues, Ph. Baudrenghien, T. Bohl, W. Höfle and T. Linnecar for the critical reading of the draft and for their competent comments. Last, but not least, my sincere thanks to my family, my wife Vera and my daughter Sabrina, for their continuous support during the critical phases of the project and their help in typing and revising this document.

References

[1] D. Boussard, Evaluation of transverse emittance growth from damper noise in the collider, SL/Note 92-79 (RFS), LHC Note 218, 1992.

[2] R. Bossart, V. Rossi et al., The Damper for the transverse instabilities of the SPS, IEEE Transactions on Nuclear Science, Vol. NS-26, N° 3, 6/1979.

[3] J. Molendijk, User Module Bus specification, Internal note - JM 2002-01-09, CERN-SL/HRF.

[4] The I2C bus specification (Philips Semiconductors 1992 – 1998).

[5] A. Marchioro, P. Moreira, T. Toifl and R. Vari, A 4-Channels Rad Hard Delay Generator ASIC with 1 ns minimum time step for LHC experiments, Proceeding of the 1998 Fourth Workshop on Electronics for the LHC Experiments, CERN-LHCC-98/36.

[6] E. Bracke, J. Molendijk and V. Rossi, Protocol for the definition of control and setting bit for the digital filter SPS10213, Internal note - VR 23/07/2001, CERN-SL/HRF.

[7] B.J. Evans, E. Calvo Giraldo and T. Motos Lopez, Electronic design automation tools for high-speed electronic systems, 6th Workshop on Electronics for LHC Experiments, Cracow - Poland, 11 - 15 September 2000, Yellow Report CERN-2000-010, CERN-LHCC-2000-041 - pp.396-397.

[8] M. Schweiger, Digitale Signalverarbeitung für die Feedbacksysteme der Elektronenspeicherringe in PETRA und HERA, DESY Internal Report F56H-93-01, 1993.

[9] W. Höfle, V. Rossi, V. Vendramini, SL-MD Note, to be published.

[10] W. Höfle, Proceedings of the Workshop Chamonix XI, January 2001 - Session III: SPS as LHC Injector II - MD Results for 2000 “progress with the damper”, CERN-SL-2001-003-DI.
