Capacitance-Voltage Measurements: an Expert System Approach

Thesis submitted by

James Austin Walls

for the degree of Doctor of Philosophy

Edinburgh Microfabrication Facility Department of Electrical Engineering University of Edinburgh United Kingdom Abstract

The problems for automating capacitance-voltage (C-V) based process monitoring tests are many and varied. The challenges are: interpreting numerical and graphical data, and secondly, performing tests in a correct sequence whilst applying the appropriate simplifying assumptions in the data analysis.

A progressive series of experiments using connectionism, pattern-recognition and knowledge-based techniques were researched, culminating with the development of a fully automatic hybrid system for the control of the Hewlett-Packard HP4061 semiconductor test system. This thesis also includes a substantial review of, and guide to, the theory and practice of the high-frequency, low-frequency and pulsed C-V measurements, conductance-voltage and capacitance-time measurements.

Several novel software packages have been written (CV-ASSIST, CV- EXPLORE), including a rule-based expert system (CV-EXPERT) for the control and opportunistic sequencing of measurements and analyses. The research concludes that the new approaches offer the full power and sensitivity of C-V measurements to the operator without the burden of careful procedure, interpretation of the data, and validation of the algorithms used. Acknowledgements

This thesis is the result of a Cooperative Award in Science and Engineering (CASE) with Hewlett Packard (UK) Ltd.. The support from the U.K. Science and Engineering Research Council (grant No. GR/F 30499) is also gratefully acknowledged.

I would like to express my thanks to my supervisor, Dr. A.J. Walton, for his enthusiastic help, encouragement, and guidance during the course of this research. My thanks also go to my industrial supervisor, Dr. T.M. Crawford of Hewlett Packard Ltd. at South Queensferry for providing the necessary internal liason and support for this project. I would also like to extend my gratitude to the many employees of Hewlett Packard Ltd., particularly David Newton, Stewart Wilson and Graeme Gourly, who provided helpful advice regarding hardware and software for this project.

My greatest thanks are due to the staff of the Edinburgh Microfabrication Facility, in particular Prof. J.M. Robertson, Mr. A.M. Gundlach and Dr. R. Holwill for their guidance and support, and to the many technicians, past and present, who have helped in the production of test wafers. Additionally, I have benefitted considerably from many discussions with other members of the Electrical Engineering Department, particularly Dr. C.F.N. Cowan and Dr. A.R. Mirzai. Furthermore, this project could not have progressed without the influence of the Artificial Intelligence Applications Institute. My particular, thanks, therefore, go to Bert Hutchings, Robert Rae, Paul Chung, and Peter Ross for their assistance with, and interest in my research.

It goes without saying that this research would not have been so interesting were it not for the inspiration, encouragement and advice I gained from my fellow research students. Finally, my thanks go to Jane Glachan for diligently typing the draft, and for her infinite patience and moral support. Declaration

I declare that the research documented in this thesis is original and entirely of my own making, except where indicated to the contrary.

James Austin Walls Contents

Chapter 1: Introduction . 1

Chapter 2: Silicon Processing & Process Control ...... 5 2.1. IC Manufacturing ...... 5 2.2. Review of the MOS Process ...... 7 2.2.1. Cleaning & Surface Preparation ...... 7 2.2.2. Oxidation ...... 9 2.2.3. Doping ...... 10 2.2.4. Interconnect Material Deposition ...... 10 2.2.5. Lithography ...... 11 2.2.6. Etching ...... 11 2.2.7. Annealing ...... 12 2.2.7.1. Implant Activation Anneal ...... 12 2.2.7.2. Post-Oxidation Anneal ...... 13 2.2.7.3. Post-Metallisation Anneal ...... 13 2.3. C-V Testing For Process Monitoring and Control ...... 13 2.3.1. Process Control Parameters ...... 14 2.4. VLSI Specifications ...... 15 2.4.1. Specifications For An Oxide ...... 16 2.4.2. Specifications For The Substrate ...... 17 2.4.3. General Considerations For VLSI Processing ...... 20 2.5. Process Automation ...... 21

Chapter 3: The MOS Capacitor ...... 24 3.1. Introduction ...... 24 3.2. Structure and Terminology ...... 25 3.2.1. Charges in The MOS System ...... 27 3.3. Bias Regimes of The MOS Capacitor ...... 29 3.4. Energy Band Diagrams ...... 31

-I - 3.5. Measuring Equipment . 36 3.5.1. Probe Station Hardware ...... 36 3.5.2. Capacitance Meter ...... 38 3.6. Sample Preparation ...... 39 3.6.1. Contacting the MOS Capacitor ...... 40 3.6.2. Edge Effects ...... 41 3.6.3. Sample Fabrication ...... 43 3.7. Conclusions ...... 43

Chapter 4: The Capacitance-Voltage Technique ...... 46 4.1. Introduction ...... 46 4.2.. Non-Ideal MOS Behaviour ...... 46 4.3. Equilibrium C-V Measurements ...... 47 4.3.1. Modelling the High-Frequency C-V Relation ...... 47 4.3.1.1. Non-Uniform Doping ...... 51 4.3.2. Parameter Extraction ...... 52 4.3.2.1. Substrate Type ...... 52 4.3.2.2. Oxide Thickness/Capacitance ...... 52 4.3.2.3. Substrate Doping ...... 54 4.3.2.4. Flatband Voltage ...... 56 4.3.3. Oxide Charges ...... 58 4.3.3.1. Fixed Charge ...... 59 4.3.3.2. Mobile Charge ...... 60

4.3.3.3. Oxide Trapped Charge Q0...... 63

4.3.3.4. Interface Trapped Charge Q., ,D ...... 64 4.4. Typical Non-Standard Curves ...... 69 Series Resistance ...... 69 Oxide Resistance ...... 70 Poor MOS Contacts ...... 72 Interface Trapped Charge ...... 73 Localised Charge Non-Uniformities ...... 73 Capacitive Substrate Top Contacts ...... 75 Effect of Incident Light and Elevated Temperatures ...... 76 Enhancement/Threshold-Adjust Implants ...... 77 Depletion/Compensating Implants . 78 Short Minority Carrier Generation Lifetime ...... 80 Lateral Inversion Layer Spreading ...... 82 4.5. Pulsed C-V Dopant Profiling ...... 83 4.5.1. Limited Depth ...... 85 4.5.2. Limited Resolution ...... 85 4.5.3. Interface Proximity Limitation ...... 86 4.5.4. Interface Trap Distortion ...... 88 4.5.5. Differential Noise ...... 90 4.6. Conductance-Voltage Measurements ...... 93 4.7. Quasi-Static C-V Measurements ...... 97 4.8. Capacitance-Time Measurements ...... 100 4.9. Conclusion ...... 105

Chapter 5: Connectionist Learning Systems ...... 113 5.1. Introduction ...... 113 5.2. Parametric Learning ...... 114 5.3. Mechanisms of Learning ...... 115 5.3.1. Parameter Adjustment ...... 115 5.3.2. Signature Tables ...... 118 5.4. Linear Adaptive Combiner ...... 119 5.4.1. Recursive Least-Squares Algorithm ...... 121 5.4.2. Performance Evaluation ...... 123 5.5. Application to C-V Measurements ...... 125 5.5.1. Extending the Parameter Set ...... 133 5.6. Limitations of Parametric Learning to C-V Measurements ...... 140 5.7. Conclusions ...... 141

Chapter 6: Techniques of Pattern Recognition ...... 1w 6.1. Introduction ...... 144 6.2. Concepts of Pattern Recognition ...... 145 6.2.1. Vector Spaces: Definitions ...... 147

- UI - 6.2.2. Statistical Decision Theory . 148 6.2.3. Design and Test Data Sets ...... 152 6.3. The Classification of C-V Curves ...... 152 6.3.1. Pre-processing ...... 153 Smoothing and Averaging ...... 153 Transformation ...... 155 6.3.2. Feature Extraction ...... 157 Representing C-V Curves ...... 160 PeakDetection ...... 164 6.3.3. Feature Selection ...... 165 Canonical Analysis ...... 167 6.3.4. Distance Classification ...... 170 Significance Testing ...... 173 6.3.5. Performance Evaluation ...... 174 Results of Testing the C-V Curve Classifier ...... 175 6.4. A Shape-Addressable C-V Curve Database ...... 183 6.5. Extending the Classifier ...... 184 6.5.1. New Cluster Generation ...... 184 6.6. Conclusion and Discussion ...... 185

Chapter 7: Knowledge-Based Control ...... 188 7.1. Introduction ...... 188 7.2. Artificial Intelligence ...... 189 7.3. Knowledge-Based Systems ...... 190 7.3.1. The Components of a Knowledge-Based System ...... 193 7.3.1.1. The Knowledge Base ...... 193 7.3.1.2. The Reasoning Mechanism ...... 194 7.3.2. The Blackboard Architecture ...... 195 7.4. A CV-EXPERT System ...... 198 7.4.1. The Edinburgh Prolog Blackboard Shell (EPBS) ...... 202 7.4.2. Interfacing to EDUCATES ...... 205 7.4.3. A Knowledge Base for C-V Measurements ...... 209 7.4.4. The CV-EXPERT in Use ...... 210 Coping With a Leaky Oxide ...... 210

- Iv - Reacting to a High Interface Trap Density ...... 212 Non-Uniform Substrates ...... 213 Depletion Implants ...... 213 7.5. Conclusions and Future Directions ...... 213 Future Developments ...... 214

Chapter 8: Conclusions ...... 219 C-V Measurement Automation ...... 220

Appendices

A. EDUCATES User Manual ...... 224 1.1. Introduction to EDUCATES ...... 224 1.1.1. Upgrade History of EDUCATES Software ...... 225 1.1.2. Implementation ...... 226 1.1.3. Suggested Uses ...... 227 1.2. Structure of the EDUCATES Program Suite ...... 227 1.2.1. General Notes ...... 227 1.2.2. Program Briefs ...... 228 1.2.3. Program Structure ...... 229 1.2.4. Programs in More Detail ...... 230 MENU4...... 230 CVASSIST...... 230 LIIIFE4...... 231 PROF4...... 231 GV4...... 232 FIFLF4...... 232 1.2.5. Other Programs Not in the Main Suite ...... 233 1.2.5.1. GFUNCSGEN ...... 233 DB-STRUCT'URE ...... 233 CV-EXPLORE ...... 233 MONITOR...... 234 AUTOSTART...... 234 INSTALL...... 235 1.3. EDUCATES Operating Manual . 235 1.3.1. Modes of Operation ...... 235 1.3.1.1. Manual Mode ...... 235 1.3.1.2. Automatic Mode ...... 236 1.3.2. Intelligent Mode ...... 236 1.3.3. General Test and Measurement Procedure ...... 237 1.3.4. Installation ...... 238 1.3.4.1. Getting Going ...... 239 1.3.4.2. Sample Installation Procedure ...... 240 1.3.5. Using the EDUCATES Software ...... 240 1.3.5.1. General notes ...... 241 1.3.5.2. Manual mode ...... 242 1.3.5.3. Automatic Mode ...... 243. 1.3.5.4. Intelligent Mode ...... 243 1.3.5.5. If Things Go Wrong ...... 245 1.3.6. Hardware and Software Requirements ...... 245 1.4. Software Documentation ...... 245 1.5. List of Deferred Defects ...... 249 1.6. Capacitor Maps ...... 250

B. MOS Capacitor Runsheet ...... 254

C. EPBS C-V Rulebase ...... 258 The Rulebase for CV-EXPERT ...... 259 The Prolog Predicates for CV-EXPERT ...... 268

D. Publications ...... 273

Bibliography ...... 292

'4. Chapter I Introduction

Capacitance-Voltage (C-V) testing plays a crucial part in the monitoring, diagnosis, and control of many fabrication processes on an integrated circuit production line. Through the analysis of a whole range of device characteristics, the process engineer may gain a deeper insight into the physical processes that govern the electrical properties of the finished product. Tests made on the Metal-Oxide- Semiconductor (MOS) capacitor are a good indicator of final device performance and provide important information about the oxide, the oxide-silicon interface and the surface and bulk properties of the silicon substrate. The processing required, however, is only a fraction of that needed to produce a complete MOS transistor.

C-V analysis of the MOS capacitor device is a fast, effective, and powerful process diagnostic tool, but it can be subject to mis-interpretation and has many pitfalls for an inexperienced user. The theory underlying C-V measurements is both complex in form and difficult to model, whilst practical experience is distributed and interpretation largely subjective. The analysis of the data produced by these measurements is thus open to criticism and, in general, requires a skilled operator to assess the significance of the results and any perceived deviations from the ideal characteristics. "C-V measurements" in the context of this thesis refer to the wide variety of tests and observations that can be made on the simple MOS capacitor structure.

This research sets out to address the fundamental problems of curve-shape analysis and the planning and sequencing of tests in a manner appropriate to the particular device. By recognising significant factors within a measurement, the correct assumptions can be made in the analysis and the final result fully justified. Additionally it is. highly desirable that the final system can be implemented as a complete system, thereby integrating the measurement and instrument control with an "intelligent" front-end capable of automatic deduction and classification of results.

-1- The following chapters serve to document the development of a working system through three complementary approaches taken to achieve this goal. These illustrate both the theoretical knowledge required to understand the C-V measurements themselves, and the necessity to bring more powerful and intelligent techniques to bear upon the problems of their recognition, interpretation, and sequencing.

As an essential background to the need for C-V testing, Chapter 2 reviews the current technology for the processing of silicon and the requirement for precise control of this exacting process. The typical uses of C-V to monitor particular stages during the MOS process are described, along with a description of the many process control parameters measurable in this way. Some specifications for current and future demands on Very Large Scale Integration (VLSI) processing help to place the accuracy and sensitivity of C-V measurements in their proper context.

Chapter 3 presents the MOS capacitor as a test structure for C-V testing, along with supporting relevant theory, terminology and notation. The practicalities of sample preparation and requirements for the measuring equipment are also detailed, together with some examples of automatic systems and techniques in use today.

A thorough and concise theoretical reference is provided by Chapter 4, with supporting practical details and example measurement data and curves. The major methods using measurements of capacitance are covered. These include the high- frequency C-V, quasi-static C-V, pulsed C-V, capacitance-time, and conductance- voltage methods. Some typical problems often encountered during these measurements are also described, together with their corrective actions. A comprehensive selection of non-ideal curves is also presented to both assist the reader in their interpretation, and illustrate the interpretation problem.

The next three chapters are dedicated to a chronological account of three approaches taken to automate C-V measurements. Far from being contradictory, these techniques overlap to a large extent, with the completed system being a hybrid of such diverse methods as machine learning, pattern recognition and knowledge-based control.

Chapter 5 begins with a discussion of connectionist systems using machine- learning to associate different curve shapes with a variety of processing conditions. This chapter also introduces the important concept of using salient features to represent

-2. the shape of a curve. These associative techniques gave a greater insight to the recognition process inherent to the human visual mechanism, and formed the basis of a major exploration in pattern recognition discussed in Chapter 6.

The principles of statistical pattern classification techniques are detailed in chapter 6, through a development of the data processing, feature extraction and multi- dimensional statistical methods for C-V curve recognition. A prototype system, CV- ASSIST was developed as an aid for inexperienced operators to make qualified statements about routine parametric test results. The pattern recognition concept was considered later to be of greater use when implemented as the front-end to a more general system capable of interpreting and acting upon the measurement data being received. Such generality was best provided by the application of an increasingly popular tool from the realms of artificial intelligence; the intelligent knowledge-based system.

In Chapter 7 the principles of artificial intelligence and knowledge-based systems are introduced. This is a primer for the description of the Edinburgh Prolog blackboard shell, an operational tool for structuring, manipulating, and using heuristic knowledge. Through the use of this automated reasoning system, a rule-based 'CV- EXPERT' is developed. This system encapsulates a great deal of practical experience with all types of C-V measurement, and greatly enhances the usefulness of these tests. By applying the suite of C-V tests intelligently, and sequencing the measurements in a way that is uniquely appropriate to the particular characteristics of the device under test, CV-EXPERT should be able to make the maximum use of the information available to the equipment.

By automating the C-V measurement, it may well be possible to reliably incorporate the information from the MOS capacitor test structure into the process control loop of a Computer-Integrated-Manufacturing (CIM) system. Chapter 8 develops this conclusion along with a discussion of the foreseeable future for C-V measurements under knowledge-based control.

The impetus for this project has come from two sources. Firstly, the realisation- that C-V was becoming an increasingly under-used and mis-trusted measurement in the semiconductor industry at a time when greater demands were being placed on understanding and controlling some of the more fundamental physical principles underlying semiconductor device operation. In an era when automatic measurement

.3- and control is commonplace in parametric testing, the more subjective type of skills required for C-V measurements are becoming rarer, more difficult to acquire and duplicate, and more expensive to use. Secondly, existing control software for C-V testing was considered to be ageing disproportionately when compared to I-V (current voltage) parametric testers, and yet is still being expected to produce results of comparable usefulness and accuracy.

The result of this research is considered to be of direct commercial importance by having made C-V testing both more 'accessible' and considerably easier to use by providing a uniquely advanced interpretation, control and diagnosis environment..

IKE Chapter II Silicon Processing & Process Control

2.1. IC Manufacturing

Since the first integrated circuit (IC) was demonstrated in 1958 [1] and the door opened to a whole new electronics technology, the microelectronics industry has made many notable developments:

Technology Device/process No. of devices

1948 Transistor effect 1 1951 Grown junctions 1956 Diffused junctions 1 1960 Planar process 1960's SSI Analogue bipolar 1-100 1965-75 MSIJLSI Digital MOS 100-100K 1980 VLSI 256K RAM 300K-1M

1986 VLSI Inmos Transputer (T414) - 200K 1989 VLSI Intel i860 RISC processor 1 million 1989 VLSI experimental 16 Mbit DRAMs 35 million 1990 VLSI production 4 Mbit DRAMs 4.6 million 2000? ULSI 64 Mbit DRAMs >100 million?

Table 2.1: History of development of the integrated circuit

Table 2.1 shows the speed at which development has taken place since the invention of the transistor through to the inception of very large scale integrated circuits (VLSI). Figure 2.1 depicts these advances as a trend towards increasing functionality and decreasing feature size in the drive for miniaturisation.

None of these impressive advances, nor the continued use of silicon could be sustained, however, without a parallel detailed understanding of the Si:Si0 2 system and

-5- 109

SSI MSI LSI VLSI 108 M0S MEMORY A BIPOLAR MEMORY 7 10 •MOS LOGIC 0 BIPOLAR LOGIC io U cr W a 10 (n I- A uj 104 10 3

0. Ui 10 3 10 2

10 2 10 1 U. 70

101 I

1O 1960 1970 1980 1990 2000

YEAR Figure 2.1 :Advances in miniaturisation [21(p. 3) the ability to control the highly tuned fabrication process. The superior physical and electrical properties of silicon have ensured the continued use of this versatile element in the semiconductor industry over other materials such as germanium and gallium arsenide [2].

The drive for miniaturisation through device scaling in order to reduce cost, give greater functionality and increase yield places even tighter constraints on the fabrication process. In order to ensure future development, advances in the automation of the monitoring, diagnostic and process control systems must continue to be made to ensure the viability of a fabrication line. Computer-aided and computer- integrated manufacturing (CAMICIM) systems [3] help to maintain the commercial cutting edge of the semiconductor manufacturing industry. Such systems may also provide the flexibility for a continuously improving process whilst simultaneously allowing for low-volume production of experimental or application-specific circuits.

The process by which an IC is produced is extremely long and complex. In order to provide an explanatory framework for C-V measurements a very brief review of the modern MOS process will be presented. It should be noted that whilst this thesis concentrates exclusively on the MOS family of technologies, the C-V technique is also important for the bipolar and gallium arsenide technologies [4].

-6- 2.2. Review of the MOS Process

The three distinct processing phases for a planar process are deposition of a material layer, patterning the defined circuit layer and etching of that layer to form the desired device elements. These phases are depicted in figure 2.2

A typical VLSI circuit may require between 10 and 20 layer patterns to build up the desired devices depending upon the complexity of the process. The following sections detail the important fabrication steps with a brief discussion on the application of C-V measurements at each stage.

2.2.1. Cleaning & Surface Preparation

The silicon surface is the most sensitive region of the MOS structure to contamination of any sort. Indeed, until the advent of ultra-clean gate oxides, the inability to passify this highly reactive crystal discontinuity resulted in many early practical difficulties in using this semiconductor [2].

Bare silicon wafers

Material formation by deposition or evaporation

11

10-20 photo- or itt1OflS I electron-beam

Etching to form device elements

Finished wafers

Figure 2.2 : The planar process for IC fabrication

-7- Prior to use, the bare silicon wafers are thoroughly cleaned to remove any contaminants and the 5-10 A native oxide which will have naturally formed on the surface after their manufacture. The cleaning process is repeated at frequent intervals during the manufacturing process and is, therefore, a common source of contamination. A widely used technique is the standard 'RCA' clean [5] as detailed in table 2.2

Typical contaminants are metallic atoms and ions. Biological contaminants are often a source of sodium ions. Failure to eradicate all sources of sodium contamination results in an unpredictable and time varying shift in all transistor threshold voltages. Heavy metal atoms can be particularly damaging, with elements such as gold causing dramatic shortening of the minority carrier generation lifetime. This is particularly troublesome for dynamic random access memories and charge coupled devices. Other metal ions such as iron and copper are reported to increase the fixed oxide charge [6] and leakage currents in bipolar junction transistors (BiT) [4]. Interstitial or substitutional impurities may interfere with subsequent layer growth by causing crystalline defects to appear [2].

Remove photoresist or other gross organic film in an oxygen plasma stripper or in Caro's acid. Caro's acid is a mixture of H2SO4 and H202 in the ratio 2:1 volumes. It should be used at 140°C. Remove residual resist in an NH 40HJH2O2IH2O mixture in the volume ratio 1:1:5. The wafers should be immersed in the solution at room temperature and then raised to 80°C and held for 10 minutes. Remove the native oxide film with dilute HF. Concentrations 2-10% are commonly used. Immerse for 15- 60 seconds. Remove metallic and ionic contaminants in an HCL/H2021H20 mixture in the volume ratio 1:1:5. Heat to 80°C, immerse the wafers, and leave for 10 minutes. Rinse and spin dry, then transfer immediately to furnace loader.

Table 2.2 : The standard RCA cleaning sequence.

-8- Using C-V measurements, the density of mobile ionic charge can be routinely assessed by bias-temperature (B-T) stressing. Capacitance-time measurements, however, can accurately measure the minority carrier lifetime to determine the effect of suspected gold contamination, whilst deep level transient spectroscopy (DLTS), a similar technique, can be used to detect bulk traps created by iron, copper and nickel.

2.2.2. Oxidation

Although thermal oxidation is still the most common means for producing a gate-quality oxide, there are a number of other emerging techniques. It is in assessing these insulators that C-V testing really comes into its own as a sensitive diagnostic tool.

The following equations give the dry and wet oxidation reactions respectively. These are typically performed in standard furnace tubes [2], often in the presence of hydrogen chloride [2] (pp. 119-121) to getter impurities such as sodium.

Si (so!id ) + 02 (gas) Si02 (solid) (2.1)

Si(501id)+ H20(gas) -. Si02 (solid) 2"2 (gas) (2.2)

The oxidising ambient [7], crystallographic orientation [8], temperature [7], pressure [7] and annealing [9] all have an effect on the oxide growth and the density and distribution of oxide charges.

Silicon nitride (Si 3N4) when incorporated with thermal Si02 in a multi-layer dielectric structure has been found to be extremely resistant to the diffusion of water molecules and sodium ions, and thus protects the silicon:oxide interface more effectively than oxide alone [2]. However, as will be detailed later, its resistance to the diffusion of water molecules creates problems in annealing out some oxide charges [10].

Rapid thermal processing (RTP) can be used to generate high quality, thin gate oxides around 40-130 A thick, suitable for VLSI applications. Quality and reproduceability are still of major concern although some preliminary results are looking promising [2] (p. 137)

-9- C-V testing provides a unique and crucial role in the control and monitoring of an oxidation process. It is capable of accurate analysis of all oxide charges and interface states to provide a tool for the assessment of new oxides and for studies of new types of dielectric.

2.2.3. Doping

The thermal diffusion of impurity elements is used when non-critical (5-10%) or heavy i doping (> 1014 cm _2) is required. Typically, diffusion is used in doping gate and interconnect polysilicon after its deposition to improve the conduction properties of the polysilicon. It is also commonly used for creating the MOSFET source and drain regions.

The theory of diffusion is well understood [2, 11], for it is an important concern when attempting to model the effects that accompany other high-temperature processing steps. For many VLSI applications, rapid thermal processing is providing an alternative to the sustained high temperatures accompanying traditional processes in order to minimise the unwanted diffusion of dopant species.

Ion implantation is used for a variety of purposes in VLSI processing where careful tailoring of the dopant profile is required, for example in threshold voltage (Vi ) adjustment, well formation, tailoring depletion transistors, punchthrough suppression and channel stops. Ion implantation is a far more accurate and predictable process than diffusion and is capable of giving much better control over impurity dose. This will typically be better than 1% for doses in the range 1011 to 1017 atoms cm The distribution of the implanted atoms is approximately Gaussian, and thus at higher energies it is possible to create 'buried' layers of doped material.

Some means of monitoring the distribution of dopant during device processing is necessary for the control of the implantation and diffusion processes. One such method is C-V profiling which provides a rapid, non-intrusive means of examining the doping profile(s) at any stage during processing.

2.2.4. Interconnect Material Deposition

The majority of the MOS capacitor samples for this thesis work were fabricated with aluminium gates which can be fabricated using evaporation or sputtering

_10- techniques. Other interconnect materials such as poly-silicon and refactory-metal suicides can be produced by thermal or low-pressure chemical vapour deposition (LPCVD) methods. MOS gate materials are continually developing, however [12], and it is becoming apparent that future needs are likely to be satisfied by combinations of poly-silicon and refractory-metal suicides as the need to maintain a low impedance interconnect at ever decreasing feature sizes is pursued [2] (section 9.3). Doping of poly-silicon gates is now more typically performed by ion-implantation which also alters the work function, 'FMS. A popular gate material is n-doped polysilicon, which has a similar work-function to aluminium without its disadvantages of electromigration, junction-spiking and corrosion. Furthermore, n-doped polysilicon matches the requirements for the self-aligned process where the gate and contacts may be automatically aligned to the active area, with its only disadvantage being its higher

resistivity. Self-aligning silicides are also now becoming available [2] (p. 397).

2.2.5. Lithography

Lithography is the process of transferring the desired layer pattern from a mask onto the wafer surface. The three mainstream exposure tools optically expose photoresist using ultra-violet wavelengths. These tools are contact/proximity, projection, and direct-step-on-wafer (DSW) reduction projection printing. Because of the defect problems associated with contact printing, projection is the favoured and most widely used method in industry today, with DSW being used for the smaller geometries. An excellent review of photo-lithography can be found in reference [13].

There has been a great deal of research to investigate alternative techniques for smaller geometries. These include electron beam [2] and X-ray [2] exposure methods, but these have yet to be successfully implemented in a high-volume production environment.

2.2.6. Etching

The removal of selected portions of deposited layers in order to leave behind the desired layer pattern can be achieved via wet chemical etching or dry etching.

A typical wet etch for the removal of aluminium uses a mixture of orthophosphoric, acetic and nitric acids at 160 °C. Wet etches are, however, in

- 11 - general, isotropic in nature and thus are not suited to VLSI applications where edge geometry must be more tightly controlled. Dry etching provides an anisotropic etch and can be performed either by using a plasma or a reactive ion etch [2]. In general these are low-temperature steps, and together with their ability to define very fine lines makes them ideal for VLSI. Some disadvantages, however, are that the resulting topography creates severe problems for subsequent step coverage, and radiation damage from electron and ion bombardment may also occur [2]. C-V measurements can be used to assess the extent and nature of any oxide charges or contamination introduced by the etch processes [14].

2.2.7. Annealing

Annealing is the process of heating wafers in order to activate ion-implanted species, reduce oxide charge densities after thermal oxidation and for reflow densification of pyrolytic oxides. This process may be carried out either in an inert environment or in 'forming gas' (90%N2 , 10%H2) where appropriate.

2.2.7.1. Implant Activation Anneal

During the process of ion implantation, the substrate semiconductor lattice becomes damaged by the bombarding species which displace many of the atoms in the target silicon. This particular form of damage manifests itself as an increase in the density of deep-level traps which can be donor or acceptor in nature. This in turn affects the minority carrier lifetime and increases the apparent resistivity. There is also the possibility that displaced atoms will coalesce and form crystalline defects. Secondly, the implanted ions will not have become electrically active unless they lie on the substitutional site of a target atom. The fraction of inactive dopant ions is dependent on the dose, with this effect becoming more noticeable at higher doses

(>8x10 12crn 2 ) [ 2].

In order then to repair the lattice damage and activate the implanted dopant, some form of high-temperature annealing is required. An activation energy of up to 5eV may be required for lattice repair. When compared to diffusion processes which have an activation energy ranging from 3 to 4eV the problems of dopant redistribution become apparent. Rapid thermal annealing (RTA) processes are considered to offer a solution to this problem by providing a very short, high-temperature step for the

-12- anneal, whilst minimising the slower diffusion process.

C-V analysis can provide both a qualitative assessment of lattice damage, and an accurate electrically-active dopant profile for comparison with a theoretical/simulated value, to assess the effectiveness of an anneal.

2.2.7.2. Post-Oxidation Anneal

After performing a thermal gate-oxidation step, there will be a high density of charge located at or near the newly-formed Si:Si0 2 interface. Both interface-traps and fixed oxide charge are the result of chemical effects during the growth of the oxide, and unsatisfied or dangling bonds due to incomplete oxidation. The deleterious effects of these two charges can be minimised by an anneal in an inert-ambient immediately following the oxidation [9].

2.2.7.3. Post-Metallisation Anneal

An alternative procedure to reduce interface trap density and fixed charge after oxidation is to perform a low-temperature anneal (-450°C) after the metallisation step. This is fortunate, since if the metal layer had been formed by electron-beam evaporation, this anneal would be required to repair damage created by soft x-rays.

C-V measurements are extremely useful for monitoring oxide anneals. They provide a very accurate means of measuring the density of interface trapped charge [15]. With care, the density of the fixed oxide charge may be simultaneously determined.

A full description of all oxide charges and their undesirable effects will be presented in the next chapter.

2.3. c-v Testing For Process Monitoring and Control

The MOS fabrication process is a long and involved one which, ideally, should be monitored after each processing step. This is not a practical proposition because of the sheer volume of data that would result, and for the extra time involved in making the measurements. C-V testing, however, does allow for observations to be made at any stage after gate oxidation. Virtually all the properties of interest to the process

- 13- engineer regarding the substrate, interface and oxide can be measured using this technique.

The typical processing operations that C-V can be used to monitor are listed below:

• effect of cleaning agents

• furnace qualification (check on contamination)

• gettering efficiency

• gate oxidation

• ion implanter dose and uniformity

• side-effects of deposition and etching processes

• annealing effectiveness

Using an MOS capacitor test structure, attributes of the processing which affect final MOSFET device performance can be analysed. The following sections indicate the capability of C-V measurements to determine parameters for the MOS family of technologies. This information can be classified as listed below:

• Basic silicon and oxide physical and electronic properties

• Process monitoring parameters

• Uniformity and trend assessment

• Prediction of transistor performance

To enable the extraction of these parameters, the process engineer has at their disposal a number of different tests, each a variation on an admittance measurement. They include the high-frequency equilibrium capacitance-voltage (C-V) sweep test, quasi-static C-V analysis (QSCV), capacitance-time (C-t) measurement, conductance vs. voltage and frequency (G-V) and deep-depletion pulsed-C-V. These individual tests and their practical implementation are discussed in Chapter 4 and Appendix A.

2.3.1. Process Control Parameters

Some of the MOSFET parameters and properties pertinent to the MOS process which can be obtained from C-V measurements are given below in table 2.3. This table is a useful reference listing the parameters, their usual symbols and the method(s)

- 14 - by which they may be measured.

2.4. VLSI Specifications

The requirement for process control has increased in line with the rising costs and added value of modern VLSI circuit fabrication. The increasing use of process

Parameter description Symbol Source(s)

Surface band bending and depletion layer M), w (V,) C/PlO width in the silicon as a function of gate bias Doping profile in the silicon N (w) CIP/Q 3• Interface trap density as a function of energy D,, (E) GIC/O in the bandgap

4 Interface trap level capture probability for C. (E) , C,, (E) 0 holes and electrons as a function of energy in the bandgap Generation and recombination minority T ,T, L carrier lifetime in the bulk silicon

MInority carrier lifetime as a function of T8 (w) L silicon depth

Surface recombination velocity 5, S 0 L

Oxide thickness D0 C Oxide breakdown field strength Ems,,

Oxide charges Q1 Q0, c Oxide charge non-uniformities Workfunction differences between silicon and C gate Ionic drift and polarisation effects in Si02 Q. C Diffusion of water inSiO 2 S

Properties of electron and hole traps inSiO 2 S

Dielectric constant of Si0 2 C Dielectric constant of silicon C Conductivity type of the silicon nJp C Substrate doping density NA N,, UP Silicon surface doping concentration N UP

Ion-implant dose and range D, 1?,, - P Interface trap capacitance C, 0/C

Hatband voltage and capacitance Vfb C/G/P Equivalent transistor threshold voltage VT P/C

Table 2.3: MOS parameters from C-V measurements(after Nicollian and Brews [4]) Key: C (C-V), C (G-V), Q (QSCV), L (C-t), P (pulsed CV), S (bias-temp. stress).

- Is - automation has also demanded parallel development of methods and tools for process control. The specifications on critical fabrication processes for VLSI are discussed in the following sections for use in referring to the sensitivity of C-V measurements. As further demands are made on IC fabrication processes to control dimensions, profiles and charges to a higher degree of accuracy, it is process-control measurements, including C-V, which will have an increasingly important role to play.

2.4.1. Specifications For An Oxide

Two factors governing the suitability of a gate oxide for a VLSI application are its quality and integrity. Integrity relates to obvious physical factors such as pinhole density, resistivity, stacking faults, and the consequent breakdown strength of the oxide. Unless voltages are scaled accordingly, the thinner gate oxides being used in today's VLSI deviôes may break down more readily unless steps are taken to minimise these effects. An acceptable range for breakdown field strength is 7-9 MVcm 1 with values of up to 11 MVcm 1 having been reported [2].

Gate quality can be defined more precisely in terms of permissable charge densities, current leakage, and maximum field strength as summarised in table 2.4

Q. 2x1010 cm-2 Q1 2x10 cm-2 D1, s 2x1010 cm-2 eV 1

:5 10 12 A at 1 O

E max > 5x10 V cm-1

Table 2.4 : VLSI processing specifications (after Huff [16])

Process factors such as anneal times and temperatures can have a marked and observable effect on these charge densities. Susceptibility of oxides to charge trapping under conditions of high field is also an indicator of oxide quality as the number of trap centres is related to the oxidation conditions after processes such as plasma etching or ion implantation.

Truly, the ultimate specification that can be imposed on a process is the maximum allowable shift in the final MOSFET threshold voltage (VT) caused by all the oxide charges compounded. Neglecting substrate non-uniformities for clarity, an

-16- expression for threshold voltage is given by [171:

VT = Vf0 + 29b + e31 NA Ob )% (2.3) where

Vib nu Qf + Q. - Qill (2.4) - ( Ox and

kT ('NA In (2.5) = Ni Equation 2.3 illustrates the dependence of the threshold voltage on oxide thickness, oxide charges and substrate doping. Indeed, mobile-ion density is more commonly expressed in terms of 'mV of shift' in the threshold voltage after bias-temperature stressing. The effect of crystal orientation can be quite dramatic; the difference between <111> and <100> plane wafers typically reduces fixed oxide (Q1 ) charge levels by a factor of ten [18].

2.4.2. Specifications For The Substrate

The silicon that forms the channel region of MOSFETs is known as the 'active area'. Factors that govern the successful operation of these transistors are variations in the doping and electronic quality of the active silicon. Channel doping variations, whether pre-meditated or just the result of other processing steps (e.g. annealing) will alter the threshold voltage of the transistor in a similar fashion to oxide charges. A low-dose ion implant, for example, can be approximated in equation 2.3 by a layer of charge near the interface, and can thus be added as an extra term to the expression:

q'D1 VT=VT+ (2.6) Cox where q' is the charge on the dopant ion, and D1 the implant dose.

Depletion transistors are formed by creating a layer of oppositely doped material at the silicon surface to the effect of lowering VT such that the device is normally on. This can create a subtle junction near the surface, the depth of which must be carefully monitored for proper transistor action.

_17- For CMOS processing it is often desirable for economic reasons to be able to use a single threshold adjustment implant to modify the threshold voltages of both the n- and p-channel devices simultaneously. Judicious choice of gate material and substrate doping can make this feasible [2] (p. 490),[19], as illustrated in figure 2.3.

In a CMOS process, the n- and p-channel threshold voltages are to kept as low as possible for performance considerations. This can be achieved through the use of high resistivity substrates (low doping) which minimises the Qb term in equation 2.3.

Exotic techniques such as the fabrication of high-capacity (Hi-C) RAM cells use a double shallow implant of boron and arsenic to induce an effective fixed charge in the substrate to increase charge storage capacity [20].

The importance of controlling these subtle variations in substrate doping stems not only from the parameter specifications, but also the dynamic MOSFET behaviour. For instance, punch-through suppression implants, if placed too close to the surface, can start to effect threshold voltage instead of performing their intended function of raising the source drain punch-through voltage [21].

As device dimensions are reduced, all these modifications to the semiconducting characteristics require to be placed nearer (-500 A) to the interface between Si and

2

L

> > a. C >I-

B IMPLANT DOSE (x1Oticm)

Figure 2.3 :Threshold voltages of n- and p-channel MOSFETs [19]

- 18- Si02. In the case of threshold adjustment implants, this may require the dopant to be

implanted through the gate oxide. Obviously, this places a tighter tolerance on Dox if the correct dose is to reach the silicon.

Another aspect of device scaling is that the actual number of dopant ions involved in 'adjusting' an individual transistor becomes very small. For instance, an n-channel, 1 m X 0.1 im MOSFET with a typical dose of 10 12cm 2 of boron atoms, giving a peak doping level of - 10 18cm 3, only implants around 1000 ions into the gate region which is verging on statistical limits. One way around this problem has been to use high resistivity substrates 1013 - 1014cm boron doped) or epitaxial layers which also increase the ease of nip well formation in twin tub CMOS circuits.

Other surface-effects such as dopant segregation into or out of the gate dielectric are proving particularly difficult to model precisely, and have serious effects on predicted transistor action since the properties of these inactive impurities at the interface is largely unknown [2] (Chapter 7).

Two other notable characteristics of a substrate material worthy of consideration are mobility and minority carrier lifetime. Mobility relates to the regularity of the lattice and is affected by the level of doping. Acceptable values depend on the speed at which the resulting circuit is expected to work, but typically suitable values are 700-1500cm 2V 1s 1 for electrons. Excessive doping can reduce mobility considerably.

Minority carrier lifetime reduction in the bulk silicon affects the dynamic behaviour of the minority carriers. Lifetime degradation which is often caused by bulk traps deep in the silicon and low in energy is an important factor in the processing of dynamic random access memories (DRAMS) as it directly influences capacitor charge retention times. Modern MOS processes can routinely process device grade silicon with lifetimes of the order of lOOms. Lattice damage caused by ion implantation and metal-ion contamination can, however, quickly reduce this figure to an unacceptable level. The move from <111> to <100> crystal silicon wafers has made a major contribution to lifetime preservation.

-19- 2.4.3. General Considerations For VLSI Processing

Processing factors such as the cleanliness of the ambient environment and all processing liquids can make the biggest difference to production yield. For example, de-ionised (DI) water is considered to have a low enough ionic content only when the resistivity rises above about 18 Mfl cm. Air in VLSI clean rooms is generally considered to have to be class 100 or better for acceptable yield. Particulates, whether airborne or present in processing liquids, can have a devastating effect on device yield not only by disrupting feature definition and causing leakage between layers, but also through the introduction of chemical contaminants. Contamination of any sort may be at the limits of chemical measurement techniques, yet still produce marked variations in the threshold voltages of the finished devices.

Trends towards higher packing density, smaller minimum feature size and shallower junctions are placing greater constraints on processing temperatures. Uncontrolled diffusion of precisely implanted doping profiles during implant activation or other subsequent high-temperature steps will cause considerable vertical and lateral movement of the dopant which may be quite significant for sub-micron devices. Such is the case with the lightly-doped drain (LDD) structure [22], [2](p. 482).

One means of preserving these delicate profiles is to use an all low-temperature process. For some steps, however, such as dopant activation, this is unfeasible. A technique which shows some promise is rapid thermal processing (RTP), where anneal times of the order of a few seconds are sufficient to repair damage with minimal diffusion.

RTP has another application in the production of thinner oxides. High-quality thin gate dielectrics of the order of 50-200 A, are necessary for sub-micron VLSI. In common with conventional dielectrics, they must have good electrical characteristics and long-term reliability. Thin oxides tend to suffer from 'wearout' as a result of increased leakage currents due to in part, Fowler-Nordheim tunneling. This is of particular concern to developers of electrically-erasable-reprogrammable read-only- memories (EEPROM) [2] (p. 117). In their favour, however, thin oxides offer the advantage of a generally higher yield as a result of a lowered susceptibility to defects. In addition, the sensitivity of VT to variations in implant dose and oxide charges is decreased, in accordance with equation 2.3. A fuller discussion of circuit sensitivities can be found in reference [2] (p. 615).

-20- 2.5. Process Automation

The evolution of the semiconductor industry's ability to process to VLSI standards and beyond, would not, and will not be possible without the development of techniques for controlling and managing the machines, processes and manpower. Increased automation is unavoidable and highly desirable for improving cleanliness and yield and giving increased fabrication flexibility in the progression towards the upper limits of integration. These will be determined by the physical limits on MOSFET channel length which are currently estimated to be —0.1 im [231.

Present computer-integrated-manufacturing (CIM) systems are beginning to assist in the complex scheduling procedures inherent to flexible ASIC manufacturing systems. The use of feed-forward compensation is an obvious way to correct for mild misprocessing further downstream and help meet the constraints for VLSI. Its implementation, however, is not a simple task.

Radically new process control tools, including an increased use of knowledge- based expert and vision systems will need to be developed. Total management of engineering yield, although a formidable task in the face of the volume of data produced by parametric testing, may become manageable through the incorporation of decision-making, learning and optimisation facilities within the individual machines or processing islands.

REFERENCES

J. S. Kilby, "Invention of the integrated circuits," IEEE Transactions on Electron Devices, Vol. ED-23, p. 648, (1976).

S. M. Sze, ed., VLSI Technology, McGraw-Hill, Singapore, (1988), second edition.

A. MacConaill and A. J. Walton, "CAM - A strategy to improve the quality & efficiency of processing," in Yield Modelling and Defect Tolerance in VLSI, ed. Moore, Maly & Strojwas, Adam Hilger, Bristol, (1988).

E. H. Nicollian and J. R. Brews, MOS (Metal Oxide Semiconductor) Physics and Technology, John Wiley & Sons, New York, (1982).

W. Kern and D. A. Poutinen, "Cleaning solution based on hydrogen-peroxide for use in silicon semiconductor technology," RCA Review, Vol. 187, pp. 187- 206, (June 1970).

I. G. McGillivray, The Measurement of Electrical Parameters and Trace Impurity Effects in MOS Capacitors, (1987). PhD. Thesis, University of Edinburgh

-21 B. E. Deal, M. Sklar, A. S. Grove, and E. H. Snow, "Characteristics of the surface-state charge (Q) of thermally oxidised silicon," Journal of the Electrochemical Society: Solid State Science, Vol. 114, (3), pp. 266-273, (March 1967).

J. Ligenza, "Effect of crystal orientation on oxidation rates in high pressure steam," Journal of Physical Chemistry, Vol. 65, p. 2011, (1961).

C. L. Claeys, "Oxidation processes and its influence on the electrical behaviour of MOS-structures," Proceedings of the First Brazilian Workshop on Microelectronics, pp. 112 - 131, Campinas, Sao Paulo, Brazil, (July 2-20, 1979). B. E. Deal, E. L. Makenna, and P. L. Castro, "Characteristics of fast surface states associated with Si0 2-Si and Si3N4 —Si02-Si Structures," Journal of the Electrochemical Society, Vol. 116, p. 917, (1969).

A. S. Grove, Physics and Technology of Semiconductor Devices, Wiley, New York, (1967). 2nd edition.

J. A. Walls, Interconnect technology, (May 1986). BSc Honours Dissertation #HD390 (Edinburgh University Electrical Engineering Department)

J.T.M. Stevenson and A.M. Gundlach, "The application of photolithography to the fabrication of microcircuits," Journal of Physics, Vol. 19, (Part E: Scientific Instrumentation), pp. 654-667, (1986). A. K. Sinha, "MOS (Si-gate) compatability of RF diode and triode sputtering processes," Journal of the Electrochemical Society, Vol. 123, (1), pp. 65 - 71, (January 1976).

J. A. Walls and A. J. Walton, "EDinburgh University CApacitor TEst Software (EDUCATES)- a user guide," EMF Internal Report, (July 1989).

H. R. Huff and F. Shimura, "Silicon material criteria for VLSI electronics," Solid State Technology, Vol. 28, pp. 103-118, (March 1985).

S. M. Sze, Physics of Semiconductor Devices, John Wiley & Sons, New York, (1981). 2nd edition.

E. Arnold, J. Ladell, and G. Abowitz, "Crystallographic symmetry of surface state density in thermally oxidised silicon," Applied Physics Letters, Vol. 13, pp. 413-416, (1968).

T. Ohzone, H. Shimura, K. Tsuji, and T. Hirao, "Silicon-gate n-well CMOS process by full ion-implantation technology," IEEE Transactions on Electron Devices, Vol. ED-27, pp. 1789-1795, (1980). A. F. Tasch, P. K. Chattergee, H. S. Fu, and T. C. Holloway, "The Hi-C RAM cell concept," IEEE Transactions on Electron Devices, Vol. ED-25, (33), (1978).

H. Nihira, M. Konaka, H. Iwai, and Y. Nishi, "Anomalous drain current in n-MOSFETs and its suppression by deep ion-implantation," IEDM Technical Digest, p. 487, (1978). S. Ogura, P. J. Tsang, W. W. Walker, P. L. Critchlow, and J. F. Shepard, "Design and characterisation of the lightly doped drain-source (LDD) insulated gate field-effect transistor," IEEE Transactions on Electron Devices, Vol. ED-27, p. 1359, (1980).

D. P. Kern, "Sub-0.1p.m silicon MOSFETs," Proceedings 19th European Solid State Device Research Conference (ESSDERC 89), pp. 633-642, Springer-Verlag, Berlin, (September 11-14, 1989).

-23- Chapter III The MOS Capacitor

3.1. Introduction

The metal-oxide-silicon (MOS) capacitor is arguably one of the most important and useful devices for analysing an MOS process (NMOS, PMOS, CMOS, B1CMOS) for it provides a great deal of fundamental information on the MOS structure inherent to these technologies. The electrical properties of the Si:SiO 2 interface have a profound influence on the performance and stability of the devices manufactured using these production processes [1, 2]

Tests on the MOS capacitor can be a good indicator of final device performance in addition to providing important information about the oxide, the oxide-silicon interface, and the surface and bulk properties of the semiconductor. The MOS capacitor is by nature extremely sensitive to processing quality and thus a powerful analytical structure for both diagnostic and research purposes. The vast body of literature [3,4] devoted to studies of the MOS capacitor is testament to its potential, but also to its drawbacks, for as a direct consequence of its sensitivity, the MOS capacitor is particularly difficult to characterise.

To use manufacturing parlance, the MOS capacitor is the 'first available sub- assembly' in a typical MOS process, and can thus be applied to the early detection of processing faults. Tests are possible on partly-processed wafers at many stages during fabrication, e.g., the control of processes such as oxidation, ion-implantation and annealing. Additionally, the MOS capacitor is the element of a number of differing MOS devices such as the MOSFET, charge-coupled-device (CCD) and non- volatile memory cell as shown in figure 3.1.

The advantages of being able to use this reduced structure to analyse the effects of processing on integrated circuit elements of greater complexity are obvious.

-24- (a) (c)

(b)

I Oxide Gate

Key: a: MOS capacitor, b: MOS transistor (MOSFET), C: Charge-Coupled Device (CCD), d: non-volatile memory cell.

Figure 3.1 : Uses of the MOS capacitor

3.2. Structure and Terminology

The most commonly used MOS capacitor test structure is that shown in figure 3.2. It can be considered to be electrically equivalent to the circuit in figure 3.3, namely a fixed,, oxide capacitance C0 in series with a gate-bias-dependent -silicon capacitance Cr1 .

In this highly simplified model, the total measured capacitance Cm is given by:

(3.1) Cm - Cox Cu where

€ €0 A Cox = (3.2) D01

E. is the permittivity of free space, E X the relative oxide permittivity (- 4.2)[11] , A the effective area of the gate and Dox the thickness of the oxide layer. C0 is considered to be a function only of the dielectric properties of the oxide.

-25- Figure 3.2 : MOS capacitor test structure

Figure 3.3: Simplified equivalent circuit of the MOS capacitor

As the gate bias is varied the thickness of the depletion layer in the substrate will vary in response to the altered charge on the gate. This in turn manifests itself as a change in the silicon capacitance:

€ €51 A csi =(3.3) where w is the depletion width which is both a function of substrate doping and surface potential. In this case, the silicon capacitance can be considered to be entirely due to depletion capacitance, since the contribution from diffusion capacitance can be neglected for a capacitor structure [5]. The presence of - the silicon substrate in the MOS structure acts so as to contribute an additional series resistance R 5 to our idealised model as shown in figure 3.4.

To develop the model further, it is first necessary to introduce some additional notation. In this thesis the classification scheme for oxide charges devised by Deal [6] is used exclusively to avoid the ambiguity that can arise through using other

-26- COX

cs i

Figure 3.4 : Alteration of the equivalent circuit by the addition of a series resistance descriptions such as 'Q' and 'Na'. The presence of charges in the oxide or at the interface affect not only the properties of the capacitor dielectric. These charges influence the surface potential of the substrate (the 'effective' gate voltage) and the response to changes in that potential. These changes may cause the MOS capacitor to deviate markedly from its expected behaviour. These non-ideal first and second order effects are discussed in detail in Chapter 4.

3.2.1. Charges in The MOS System

The four distinct, unambiguous charges specified in 1980 by the Joint Technical Committee of the IEEE and the Electrochemical Society [6] are listed below:

Q1 Fixed oxide charge

Q01 Oxide trapped charge

Qm Mobile ionic charge

Q1, Interface trapped charge

The spatial location of these charges is illustrated in the cross-sectional diagram of figure 3.5.

They can be classified into two distinct categories; oxide charge and interface trapped charge. The principal difference lies in the bias dependency of interface-trapped charge Qj, whereas the remainder are non-varying. The typical causes of the oxide charges arising during semiconductor processing are summarised below, and are discussed in much greater detail elsewhere [3, 7, 8, 91.

- 27 - METAL

JMO8ILE IONIC ( CHARGE COrn) OXIDE TRAPPED '' so2 \ CHARGE(Q) FIXED OXIDE - - - (CHARGE (Q) -- 80 0 8 so

INTERFACE TRAPPED CHARGE (Oil)

Si

Figure 3.5: Spatial location of charges in the MOS system [6]

Qj - Defined as the charge remaining after Q1, is completely annealed out and is related primarily to an excess of oxygen remaining at the interface after oxidation

[8]. It can be considered to act effectively as a charge sheet at the Si:Si0 2 interface, is dependent upon the silicon crystal orientation and is ultimately determined by the final high-temperature conditions of oxidation such as the temperature, ambient and cooling rate. It is, however, independent of the oxide thickness, the total oxidation time and the substrate doping concentration.

Q01 - Produced by hot hole/electron injection, ionising radiation, photoemission or ion implantation. It is located at the metal:Si0 2 or Si:Si02 interfaces except when produced by ion implantation, when it may be distributed throughout the oxide. Susceptibility to charge trapping is dependent upon the oxidation processing conditions.

QM - By its very nature, mobile charge can exist anywhere in the oxide, but is usually located either at the metal:Si or the Si0 2 :Si interfaces by surface contamination or drift respectively. Sodium, potassium or lithium ions will drift readily under an applied field even at room temperature. Modem processing and effective gettering techniques using chlorinated oxidising environments have, however, virtually eliminated these once troublesome sources of contamination.

Interface traps, according to Deal [10] can arise from structural imperfections due to the oxidation process in a manner similar to fixed charge, and can show a similar orientation dependence. Metallic impurities and decorated defects at the interface and

-28- the effects of ionising radiation are also responsible for the formation of interface traps. Interface-trapped charges warrant a more detailed examination for the effects they have on MOS capacitor and field-effect-transistor (FET) characteristics, particularly since they are significantly more difficult to measure. Details of these measurements will be described in more detail in Chapter 4.

It is expedient now to introduce the concepts of biasing regimes and energy band diagrams in order to help describe the relationship between capacitance and voltage, and the origin of C-V curves.

3.3. Bias Regimes of The MOS Capacitor

The usual effect of sweeping a voltage applied to the gate of a MOS capacitor is illustrated by a typical C-V curve in figure 3.6.

This curve can be segmented into three regions for the purpose of describing its origins in terms of changes in gate bias, namely accumulation, depletion, and inversion.

Accumulation: In the case of a p-type substrate, a large negative bias voltage applied to the gate causes the majority carriers to be attracted to the surface in such numbers that the electron density becomes negligible by

Lx. c %_.J I 1 1 k= i-. I t.J rm C= N.0 1

LA 72— 0 0 C 0 54 - 4) r 0 II 36 - a. 0 U lB -

OF -5.0 -3. if -1.0 1-11 RAM ff 8

Vc t. (\/)

Figure 3.6: A typical C-V curve

-29- comparison. The depletion layer is thus infinitesimily thin, so the

measured capacitance is just that of the oxide, Co. *

Depletion: Reducing the bias voltage towards a more positive value causes a depletion layer to form beneath the gate. This is due to majority carriers being repelled from the surface in numbers which are proportional to the voltage on the gate. The depletion region so

created forms a secondary capacitor, C,, which reduces the measured

capacitance Cm down from C0 as the applied voltage is increased.

Inversion: As the gate voltage becomes more positive still, minority carriers are thermally generated at the silicon surface. Continuing this process until the electron density equals or exceeds the hole density results in the formation of an 'inversion region' close to the surface. This thin inversion layer of effectively n-type material forms the conducting channel in a MOSFET and, therefore, measuring the onset of its formation will enable the threshold voltage to be predicted.

In biasing beyond the onset of inversion, one of two effects may be observed. Either the substrate will enter a steady state, 'strong inversion', whereby the depletion layer saturates at a constant width when the generation and recombination rates are equal, or a non-equilibrium 'deep depletion' effect can be induced.

Deep depletion is a condition that arises from the inability of the substrate to supply sufficient minority carriers (by generation processes) to respond to sudden changes in the gate voltage, thus retarding the formation of an inversion layer. For modern, long-lifetime devices, this effect can occur even for sweep rates as low as 0.1

VS-1 . The result of this slow response is for the charge on the gate to be satisfied instead by a widening of the depletion layer beyond its equilibrium value rather than by the creation of an inversion region. This effect, however can be used to advantage for determining minority-carrier generation lifetime, and for measuring doping density deep into the silicon substrate. The process of deep depletion may continue with further increases in bias voltage until either avalanche breakdown occurs, or the generation rate in the depleted zone increases to redress the balance. The latter mechanism is thus dependent on generation lifetime and sweep-rate.

-30- Strong inversion is an arbitrary definition, but it is just beyond this point that, if the bias is increased still further, the inversion layer effectively starts to shield the depletion region which then ceases to enlarge. With virtually all the gate field lines now terminated at the inversion layer boundary, the result of any further increases in the gate voltage will only increase the inversion layer depth. Since this process is not actually accompanied by any measurable capacitance change, the effect is not noticeable from the C-V curve which thus saturates at a minimum value, dependent on substrate doping.

3.4. Energy Band Diagrams

A better impression of the electronic effects of biasing may be gleaned from the energy band diagrams appropriate to the bias regimes. This form of representation also introduces the concept of surface potential which is convenient to explain the state of an MOS capacitor under bias. It should be noted that some of these descriptions assume a uniformly doped substrate; the effects of non-uniformity are both complex and more difficult to model. For convenience we also consider the flatband voltage in this analysis to be negligible as is depicted in figure 3.7.

Firstly to define some terms: as shown in figure 3.8 the MOS structure has a discontinuity at the Si:Si0 2 boundary at which the conductance and valence band occupancies may change. A change in Vg causes a variation in the Fermi level in the

Capacitance

Cox

Bias -5 -4 -3 -2 -1 0 1 2 3 4 5

Figure 3.7: Flatband voltage

31 - A $qX

qX

qI 3

q m E30 - Ec o 4 Egi2 Ei

E____ Ev METAL OXIDE SEMICONDUCFOR

c metal work function Ec conduction band cI semiconductor work function Ev valence band E30 potential barrier El intrinsic Fermi level b (Ei - E, )/q Eg band gap X 1 electron affinity (oxide) E F Fermi level X electron affinity (silicon)

Figure 3.8: Energy band diagram

silicon, and since this is 'pinned' deep in the substrate, an effective 'bending' of the energy bands near to the surface results. Figure 3.9 illustrates the electronic behaviour of the MOS capacitor under various conditions of bias. The surface conditions of a uniformly-doped substrate under an applied bias can thus be defined in terms of the surface potential, :

0 <0 - Accumulation Os = 0 - Flatbands

0< 95S

= Ob - Intrinsic - Inversion

Os >2, - Strong Inversion

Deep depletion is considered a special case of depletion for which 95 is momentarily greater than '15b

In accumulation, as shown in figure 3.9 (a), the negative gate bias causes the energy bands to bend up, as holes accumulate at the surface. For an ideal model, when the bias voltage falls to zero, so too will the surface potential.

-32- EFm a. (Accumulation)

&

(Flatbands)

By

&

1 C. (Depletion) Ev

&

(Weak inversion) 13

By

EC

(Deep depletion) EfS Ev Vg.O - __-

W>Wmax

& (Strong inversion) H

1/4-40-Depletion region Vg>>O

W=Wmax

Figure 3.9: Energy band diagrams under different biasing conditions

-33- In this state, the total band-bending 4i is zero and the bands are flat throughout the silicon. This condition is known as the 'flatband condition' and is illustrated by figure 3.9 (b). For a practical MOS capacitor, the existence of a work-function difference between the gate material and the semiconductor creates a small built-in potential. This potential difference, can be overcome by a corresponding opposite increase in gate voltage, which in turn gives rise to a non-zero flatband voltage [11]: (p. 397).

E VIb = ms = - (x+ b) (3.4)

Note that I (x + L9 + cbb)) is dependent upon substrate doping as shown by figure 3.10.

Making the gate voltage more positive results in the initial formation of the depletion region, whereby the bands are now bent down to compensate for the gate charge, as shown in figure 3.9 (c). At the point when 0, = Ob the silicon surface is effectively intrinsic and the extrinsic Fermi level is at mid-gap as shown in figure 3.9 (d). As / is increased beyond this point the extrinsic crosses the intrinsic Fermi level and an inversion layer begins to form as in figure 3.9 (f).

As previously mentioned, further increments in q cause little observable change in capacitance, but by definition, when 95 . = 20b (as in figure 3.9 (f)), inversion is deemed to have occurred. This is not so arbitrary as it may seem at first, for this bias point represents the instant at which the minority carrier concentration in the inversion

I!] ----;',U' - 0.8

0.6 0.4 - - 0.2 = U)

-0.2 E '- -0.4 -; -0.6 ,— AL(p - Sl)

q 4 POLY (p-SI) _____

1014 io' 5 lOIS I1 lOIS

N 8 (CM -3 ) Figure 3.10 Work function curves as a function of doping (After Sze [11], p. 397)

-34- region is equal to the majority carrier concentration in the silicon bulk; the threshold for conduction is thus defined. Finally, figure 3.9 (e) depicts the energy bands in the deep depletion condition.

Another useful representation for distinguishing between the bias regimes is provided by a plot of surface charge Q as a function of band-bending. This curve shown in figure 3.11 has three distinct sections, each described by approximate relations determined from Poisson's equation.

The 'ideal' case assumed in the preceeding analyses makes use of the following approximations:

Uniform doping up to the surface

One-dimensional field distribution

Zero flatband voltage

'High' frequency assumption

Slow d.c. voltage sweep.

These should always be bourne in mind, since for measurements made on practical MOS capacitors it is realistic to expect that one or more of these assumptions

p-TYPE Si (300K) ) NAXIO'5Cm3 / -exp (q+6 /2kT)/ 10- 5 - (STRONG INVERSION)

C~* -exp (qI45I/ 2 kT) E 10 -6 • (ACCUMULATION)

5! i-

FLAT -

BAND --

0 8 WEAK - I DEPLETI0t INVERSIONS

IO- -0.4 -0.2 0 0.2 0.4 0.6 0.8 1.0 fS (VOLT)

Figure 3.11: Surface charge as a function of band-bending (after Garret & Brattain [12])

- 35 - will not hold. The next chapter details a more complex theory on the origins of a general capacitance-voltage relation when some of these assumptions are violated.

3.5. Measuring Equipment

Fundamental to precise and repeatable C-V measurements is the condition of the measuring equipment and sample. Proper equipment set-up and sample preparation is as important as the final data analysis and interpretation. Indeed, faulty equipment may give rise to results which appear qualitatively similar in nature to those from actual production process faults. The following sections serve as a practical guide to achieving and maintaining the necessary standards which are paramount to avoid these often insidious side-effects.

3.5.1. Probe Station Hardware

This section refers to the provision of probes, chuck, wafer prober and measurement chamber. Electrical and physical isolation is necessary to eliminate the effects of noise, light and vibration respectively. A successful arrangement is illustrated by figure 3.12. An earthed, light-tight metal box surrounds the wafer prober, which itself is mounted on a shock-proof isolation table for mechanical stability.

MOS capacitor measurements, in particular the quasi-static (QSCV) technique, are very sensitive to electrical interference. Radio frequency interference (RFI) is also known [13] to cause errors in the measurement of capacitance by heterodyning with the measurement signal. In addition to a shielded enclosure, co-axial and triaxial cables and connectors should be used wherever appropriate. The precise configuration will depend upon the particular capacitance meter used. The Hewlett Packard family of LCR meters use a 4-terminal pair configuration between meter and probe station, with conversion to screened single source and sink connections at or near the probes themselves.

The configuration most often used is the 'floating' configuration. Here the 'high' connection is made through a single, co-axial shielded probe to the top capacitor plate. The other connection is made through a gold-plated chuck to the rear of the wafer. In this case the chuck itself must be electrically isolated from the platform, the rear of the wafer metallised, and a connection made to the chuck via shielded cable to the 'low' Ught-tight shielded metal box

eaN Stereo n microscope Ring illuminator co-ax to Probes Select top or tn-ax rear substrate converter contact i and switching matrix

Anti-vibration mounting 4-terminal pal wiring harnes

Figure 3.12 : A typical arrangement for a C-V measurement enclosure connector. Alternatively, if a substrate contact is provided adjacent to the capacitor, this can be contacted by use of a second, co-axial probe. Other probing configurations are possible including the use of a mercury probe. This rapid testing device, however, uses two capacitive contacts to the wafer and is known [14] to produce inaccurate readings under certain circumstances.

Inside the dark box, a light source should be provided, not just for obvious practical reasons, but also for creating electron-hole pairs at the gate periphery. This is desirable when a stable inversion layer is to be formed in the shortest possible time prior to measurement of C mm or capacitance vs. time.

A temperature controlled wafer holder or 'hot chuck' is required for bias- temperature stress measurements on MOS capacitors. Ideally hot-chucks should be computer-controllable with provision for rapid cooling as well as heating, to reduce overall measurement times. Such devices often require plumbing for water, and care should be taken to ensure that electrical or mechanical interference from the heater or coolant does not affect the capacitance measurements. Specially designed sprung probes should also be used to prevent them drifting off the contact during thermal cycling.

-37- 3.5.2. Capacitance Meter

The correct choice of capacitance-meter will depend on the nature of the measurements to be most commonly made. For example, the accurate plotting of dopant profiles will require a capacitance-meter of optimum resolution, whereas lifetime measurements on short-lifetime devices necessitate a high-speed capacitance- meter, normally at the expense of accuracy. Rapid determination of interface state density can be performed through the combined use of a capacitance-meter, and a picoammeter for the measurement of quasi-static C-V curves, although instruments which combine both these functions are now becoming available such as the 'Package 82' system from Keithley Instruments Ltd. [15]. The conductance method for interface state density evaluation requires a capacitance-meter with a variable-frequency small- signal source. Although successful tests can be made at spot frequencies [16], this function is best performed by instruments which provide a continuously variable frequency signal such as the Hewlett Packard HP 4192 [17]. Since advance requirements are often indeterminate, especially for process fault diagnosis, the most useful and versatile combination is probably a picoammeter and multi-frequency capacitance-meter with variable precision (e.g. the HP 4275A) [18]. Such a combination is provided in the Hewlett Packard HP 4061 semiconductor/parametric test system [19].

All modern computer-controlled capacitance-meters for C-V measurements are digital in operation and of the admittance bridge with lock-in amplifier configuration [18, 15,3]. The better instruments also include a correction capability to compensate for the effects of

Lead inductance and resistance

Test fixture stray capacitance and parallel conductance

Offset correction on an HP 4275A for example, is performed by shorting and opening the test probes to cancel out a) and b) respectively.

The amplitude of the high-frequency sinewave applied to determine the impedance of the device under test should be kept as small as practically possible. This avoids second order effects caused by sensitivity variations in the admittance bridge [3], (p. 585) dielectric losses in the Si0 2 insulator [13], and second harmonic generation from the the non-linearity of the C-V relation. A value of 20mV rms was standardised

-38- upon for all measurements made during the course of this research.

The 4-terminal pair arrangement [13] appears, at present, to be the most suitable configuration for the measurement of high impedence devices. The technique compensates for the effect of cabling resistances, whilst synchronous detection of phase angle minimises the effect of any stray inductances or capacitances. At some stage, however (usually the prober box), the 4-terminal-pair arrangement must revert to an inferior 2-terminal form in order to connect to the device. Ergo, great care should be taken to minimise stray capacitance and inductance in the lead dress after the conversion to the 2-terminal configuration.

3.6. Sample Preparation

The importance of a standard, reliable MOS capacitor sample cannot be overstressed. Even the best test equipment must make some assumptions about the possible range of input samples likely to be encountered. Many physical effects will distort, mask or even negate the characteristics under observation. Therefore, it is necessary to prepare test wafers with even more care than the final devices if C-V testing is to be meaningful. The use of a standard test structure and procedure is thus advised, since this will at once remove ambiguity arising from false assumptions about the device's size, shape and fringe-capacitance. The precise design of a procedure to prepare wafers for C-V testing for fault diagnosis will obviously depend upon that sub- component of the MOS structure (gate, interface, oxide or substrate) to be analysed.

The preceeding constraints do not apply so strongly in a research-type environment, when a more flexible approach is necessary. However, in a production environment when reliable and consistent results are expected, it is necessary to employ processing conditions which are appropriate to the particular parameters to be measured. The use of the word appropriate should be clarified since it is in the best interests of the process engineer to minimise second-order effects attributable to processing prior to measurement of a first-order deviation or characteristic. For example, to obtain an accurate doping profile, interface-trap densities ought to be made as low as possible. The determination of flatband voltage is far simpler if a low mobile ion density can be assumed. A minority carrier lifetime measurement is more trustworthy if made on a substrate not affected by the effect of lateral inversion layer spreading [20], either through careful preparation, or by using a substrate with a high

_39- threshold voltage.

There may appear to be a number of competing considerations here for the optimum test structure, but with careful design and fabrication of the sani °ple the task of analysing the results will be made far easier. Some of the practicalities of fabricating MOS capacitors for process control concerning the capacitor contacts, their edge effects, and their fabrication will be described in the following sections.

MOS capacitors should ideally be fabricated on separate test wafers. This allows for C-V measurements to be made at any stage during processing without affecting the remainder of the batch, and also permits destructive testing should this be required. There is, however, a certain reticence amongst process engineers to 'waste' precious wafers on such a 'simple' test. This is an argument which has steadily gathered momentum with the ever increasing size and cost of silicon wafers. If one considers though, for a moment, the real waste in having to reject a whole batch of completed wafers after a fault has been picked up by post-processing I-V parametric testers, the advantages in testing a part-finished sub-assembly become obvious. It is hoped that the conclusions which may be drawn from this thesis will throw a new light on the sophistication and utility of C-V testing.

3.6.1. Contacting the MOS Capacitor

To form the top 'plate' of the capacitor structure, a gate must be produced, preferably in a manner consistent with the process and devices to which C-V testing is to be applied. It is often more convenient, however, to create an aluminium layer for this function; the actual choice of gate material having little bearing on the ability to identify the capacitor characteristics. Some secondary effects have been attributed to the use of Al-Si over pure aluminium and with Si3N4—S'02-Si structures [21]. In such situations, it would be best to have a 'pure' aluminium test capacitor as a control to isolate any gate material-specific effects on the measurement.

The top capacitor (gate) contact should be relatively large in relation to MOSFET gate dimensions for maximum accuracy, and this generally precludes its use on anything but a separate test wafer. The most practical and versatile means of defining the top contact is via optical lithography, be it contact, projection or direct-step on wafer (DSW) printing. Additionally these methods allow for the definition of

-40- additional structures such as periphery guard rings.

Cruder techniques such as shadow masking or mercury probing are not suitable for an accurate and detailed study of the MOS system, neither are they representative of any MOS process. In a limited number of cases, for instance when the 'as-grown' qualities of a gate oxide are to be studied in minutae, these all-low-temperature steps may offer an alternative. The advantages and disadvantages are discussed in greater detail in reference [3]. (p. 632)

In addition to the gate electrode, a reliable ohmic substrate contact must be made, ideally to the back of the wafer in order to minimise second-order effects on the capacitance characteristics. This contact must also be of low resistance. A non-ohmic contact will exhibit a frequency-dependent CO,, whilst a high series resistance (>loocicm _2) will reduce the measured value of C01 and tend to mask subtle variations in the parallel conductance. Series capacitance is another factor that can influence the measurement; this arises from a thin oxide layer forming on the rear surface of the wafer prior to substrate contact formation. Although it may give only a small percentage error, this variation is bias-dependent, and more significant when the gate oxide is thin. Fortunately, the removal of this oxide and the action of the cleaning operation also helps to reduce series resistance.

Sputtered or evaporated aluminium has been found to be an extremely satisfactory means of forming a back contact, with the added advantage of being a non-contaminant. In comparison, gold has poor adhesion, unless Cr or Ni is deposited first, and wafers cannot be subsequently processed for fear of contamination. Aluminium is thus favoured as a top contact and furthermore, is readily and directly probeable using standard tungsten probe tips.

3.6.2. Edge Effects

It is well known that the edge of any capacitor plate holds more charge than its flat surface. This is due to an increased electric field strength in the region of the highest radius of curvature [22]. Fringing fields are an important consideration in the analysis and study of MOS capacitors. In addition to simply adding to the amount of stored charge and hence increasing the apparent capacitance, edge effects are responsible for a variety of second-order distortions to C-V measurements. The edge- field will typically be 10-100 times that at the centre [3], (p. 506) and this may contribute to premature dielectric breakdown of the oxide and early avalanche breakdown of the substrate in deep-depletion.

If any water vapour is upon, or adsorbed into the dielectric immediately surrounding the gate, it will acquire the same charge, and influence the substrate in much the same way as the gate bias. This causes both an apparent increase in device area when in accumulation, and the phenomenon of lateral inversion layer spreading may create an inversion layer beyond the gate (ILBG). ILBG is a dynamic effect, generally increasing with time and is particularly noticeable, therefore, when measuring long lifetimes.

Eliminating water from the sample would appear to be the obvious solution, but this poses severe practical problems, especially in areas of high humidity. Instead, modified gate definition is usually employed to overcome the ILBG phenomenon. There are three possible solutions illustrated by figures 3.13.

ALUMINIUM GATE

-- GUARD RING + + + + + +

GATE OXIDE SILICON

ALUMINIUM GATE

+ + + + + _+ + + +______+ ++

FIELD OXIDE ____ FIELD OX(

GATE OXIDE SIliCON

ALUMINIUM GATE

+ + + + + +-+ + + + 4+

IMPLANT IMPLANT

GATE OXIDE SILICON

Figure 3.13: Possible solutions to the ILBG problem

.42- Bias a guard ring placed very close to the gate edge into accumulation to prevent an inversion region forming beyond the gate.

Surround the gate with field oxide, thus increasing VT at the periphery,

C) Use a 'channel-stop' implant around the periphery to increase VT,

Option a) is easiest to achieve in practice, but can cause problems with current leakage from ring to gate. Whilst this renders conductance measurements useless, C-t measurements can certainly benefit as most capacitance-meters can tolerate a small leakage current.

3.6.3. Sample Fabrication

As a practical guide to the fabrication of suitable samples for C-V testing, an established 'recipe' is presented in Appendix B. This batch was fabricated to investigate the effects of post-oxidation anneal times on the oxide charge and interface trap densities of 'wet' and 'dry' oxides. The actual oxidation steps performed will obviously depend upon the process under examination. They are included, however, for completeness and to set the cleaning and annealing steps in context.

3.7. Conclusions

This chapter has introduced the basic concepts and terminology pertaining to the MOS capacitor as a test structure for VLSI parametric testing. Oxide charge nomenclature, biasing regimes, and energy band diagrams have been defined in the context of capacitance voltage measurements.

The fundamental requirements for the MOS capacitor sample and the measuring equipment have also been covered in detail. The design and preparation of suitable test structures has been discussed, and a suggested processing recipe has been provided in Appendix B. -

REFERENCES

1. E. H. Nicollian, "Electrical properties of the Si:Si02 interface and its influence on device performance and stability," Journal of Vacuum Science Technology, Vol. 14, (5), pp. 1112-1121, (Sept/Oct 1977).

-43- E. H. Nicollian and A. Goetzberger, "The Si-Si0 2 interface- electrical properties as determined by the metal-insulator-silicon conductance technique," Bell System Technical Journal, Vol. 46, (1), pp. 1055-1133, (1967).

E. H. Nicollian and J. R. Brews, MOS (Metal Oxide Semiconductor) Physics and Technology, John Wiley & Sons, New York, (1982).

E. S. Schlegel, "A bibliography of metal-insulator-semiconductor studies," IEEE Transactions on Electron Devices, Vol. ED-14, (11), pp. 728 - 749, (November 1967).

C. Kittel, Introduction to Solid-State Physics, John Wiley & Sons, New York, (1976). 5th edition.

B. E. Deal, "Standardised terminology for oxide charges associated with thermally oxidised silicon," Journal of the Electrochemical Society, Vol. 127, (4), pp. 979-981, (1980).

B. E. Deal, M. Sklar, A. S. Grove, and E. H. Snow, "Characteristics of the surface-state charge (Q) of thermally oxidised silicon," Journal of the Electrochemical Society: Solid State Science, Vol. 114, (3), pp. 266-273, (March 1967).

C. L. Claeys, "Oxidation processes and its influence on the electrical behaviour of MOS-structures," Proceedings of the First Brazilian Workshop on Microelectronics, pp. 112 - 131, Campinas, Sao Paulo, Brazil, (July 2-20, 1979). A. S. Grove, B. E. Deal, E. H. Snow, and C. T. Sah, "Investigation of thermally oxidised silicon surfaces using metal-oxide-semiconductor structures," Solid State Electronics, Vol. 8, pp. 145-163, (1965). B. E. Deal, "Properties and measurements of charges associated with the thermally oxidised silicon system," Proceedings of the Microelectronics Measurement Technology Seminar, pp. 39-70, San Jose, (February 6-7, 1979).

S. M. Sze, Physics of Semiconductor Devices, John Wiley & Sons, New York, (1981). 2nd edition.

C. G. B. Garrett and W. H. Brattain, "Physical theory of semiconductor surfaces," Physics Review, Vol. 99, p. 376, (1955).

B. Botos, "Designer's guide to RCL measurements," EDN, (June 5, 1975). (Reprinted by Hewlett Packard Co.)

I. G. McGillivray, J. M. Robertson, and A. J. Walton, "CV characteristics of MOS capacitors with two top capacitive contacts," Microelectronics Journal, Vol. 18, (5), pp. 25-27.

High Frequency, Quasisiatic and Simultaneous CV Semiconductor Measurements, Keithley Instruments ('Package 82' product note), Cleveland, Ohio, (1987).

I. G. McGillivray, "A Computer Controlled CV Measurement and Analysis System," EMFfHughes Report, (October 1984).

4192A LF Impedance Analyser, Hewlett Packard Co., (October 1981). (Technical data)

0

-44- Multi-frequency LCR Meters, Hewlett Packard Co., (June 1980). (Technical data)

HP 4061A Semiconductor/Component Test System, Hewlett Packard Co., (May 1983). (Technical data)

E. H. Nicollian and A. Goetzberger, "Lateral ac current flow model for metal- insulator-semiconductor capacitors," IEEE Transactions on Electron Devices, Vol. ED-12, pp. 108-117, (1965).

I. G. McGillivray, "A Comparison of the Quality of Oxide and Oxide-Nitride Gate Dielectrics," EMFIHughes Report, (October 1984).

A. F. Kip, Fundamentals of Electricity and Magnetism, McGraw-Hill, Tokyo, (1969). 2nd edition.

IMILM Chapter IV The Capacitance-Voltage Technique

4.1. Introduction

The previous chapter discussed the basic physical principles for the MOS system in an ideal case. This chapter serves to present the typical non-idealities which can be expected in any practical situation, and their means of analysis. At our disposal are the C-V (equilibrium and pulsed), G-V, quasi-static C-V and capacitance-time techniques for evaluation of MOS capacitor characteristics.

The underlying theory is presented in outline for each test with some typical results and responses. Additionally some experimental precautions are documented to both stress the importance of correct procedure and to provide a practical guide to performing these measurements. For a more detailed mathematical treatment the reader is referred to many standard texts [1,2,3].

4.2. Non-Ideal MOS Behaviour

The analysis in Chapter 3 did not take account of the presence of extraneous sources of additional charge and changes in the gate-voltage to surface-potential relationship. These can be caused by:

non-uniform doping of the substrate

non-zero flatband voltage

presence of trapping centres in bulk or at interface frequency, temperature and light-level capacitor size and shape

surface effects

failure of equivalent circuit to model the device errors in the measurement

-46- All these factors affect the form of the C-V characteristic in one way or another, and the deviations so caused can be used to assess these non-idealities in a qualitative and quantitative manner.

4.3. Equilibrium C-V Measurements

The classic, high-frequency C-V test [4, 5, 6, 7, 8] is perhaps the most important, for it provides a very rapid indication of many different aspects of the MOS capacitor [9, 10, 11]. This single measurement provides a foundation upon which all other tests can be based, from the determination of substrate type to qualitative measures of oxide integrity, minority carrier lifetime and substrate uniformity. In addition, first-order effects and characteristics can, with care, be determined with a reasonable degree of accuracy.

4.3.1. Modelling the High-Frequency C-V Relation

The model developed here is particularly accurate as it takes the spatial redistribution of charges due to ac cycling into account. Two factors considered here are:-

The total number of minority carriers is fixed by the d.c. gate bias and is independent of the a.c. signal voltage.

Minority carriers, in response to a high-frequency a.c. signal, can redistribute at

the surface causing 'polarisation' of the inversion layer [1] (p. 157).

In the following analysis, a calculation of the high-frequency silicon capacitance in weak inversion, depletion and accumulation is performed. A simple closed form approximation is shown, which is accurate to better than 1 .5% for a uniformly doped substrate. A simplified expression for the high-frequency C-V relation on a p-type substrate can be obtained from a small-signal solution of Poisson's equation:- [12]

d2v1= T 11— exp( — v)+exp(v+(JFfl +(JB)j (4.1) where

v(x) = ti(x )q/KT (dimensionless band bending) (4.2)

- 47 - (€5 KT ' = I (n-type extrinsic Debye length) (4.3) q 2ND EF —E1 UFfl = (ac electron quasi Fermi level) (4.4) KT UB = OB q /KT (dimensionless bulk potential) (4.5) If equation 4.1 is integrated between v = 0 and v5 , an expression can be derived for the electric field, F, at the silicon surface:

F(v$ ,UFfl ,UB)/Xfl = -i-- +exp(—v,)—l+exp(U B —UFfl )[exP(vs )—l]J (4.6) fV Now, it can be shown that [1] (pp. 159-162):

C5 = C5fl, [1 —exp ( - V50 )]/F (V50 ,UB ) (4.7) where C = silicon capacitance

Csfi, = flatband capacitance ( ES/X, ) v 0 = total band bending (small signal implies v5 ± 8v)

F(v50 ,UB ) = '/2 [v o —1 +exp(—v 50 )]

Note that this is valid only in depletion or weak inversion ( çb8 ) since the effect of carrier redistribution in this bias regime is negligible due to the minority carrier concentration being small relative to the acceptor density (assuming a p-type substrate).

Introducing the terms previously neglected in this assumption we may derive an expression for C5 in strong inversion:- [131

(.J2 C5 = C exp(—v50 )+ [(exp(v so )-1)+ 1] (v ,UB ) (4.8) where A is a fraction which takes into account both the constancy of the inversion layer charge and its spatial re-arrangement due to the small a.c. signal voltage.

6UFn - 1 (4.9) 6v5 - 1+

The derivation of i is detailed in reference [1] (Appendix V). The two expressions for silicon capacitance (equations 4.7 and 4.8) have to be integrated numerically to obtain an accurate solution. Once a C5 (v5 ) relation has been obtained, it is a simple matter to convert this to the more usual C-V relation thus:-

-48-

Cox .c C = (as capacitors in series) = V, + 1-(U8 —Ui ) (by Gauss' law) Cox q As an alternative, a closed form expression for the model was developed by Lindner [8] since the full analytical form is somewhat unweildy:-

CS =C, (i_ (4.10) (N '2 exp(—v 50 )+ [(exp(v50 )-1) +1] NA

where

[V:0 F (v50 ) U) { exp(v ) —exp( -v ) —2v 3, ___-- d______ (4.11) exp(v3 0 )-1 f F3(vS,UB)

(after Nicollian & Brews [1] (p. 161))

In accumulation, depletion and weak-inversion, this simpler technique uses the bias-dependent equation 4.7 coupled with a bias independent term in strong inversion. As shown in figure 4.1, however, this inevitably results in a derivative which is not continuous. These two relations should be smoothly connected at the 'match-point' of

band-bending V 0 = Vm, namely:

I,

Ii ..

U C 4)_ o U. I U

.6 1.0 1.2 1.4 1.6 1.6

(¼.1)

Figure 4.1: Magnified portion of a simulated C-V curve indicating match-point between bias-dependent and independent terms

- 49 -

C, (V,,,CL (V,. for VSO

C, (V,. = CL (V. for Vso >Vm (4.13)

In both cases above, C. is evaluated using equation 4.7. Notice CL (vm) is our familiar minimum equilibrium capacitance, Cm. To ensure maximum accuracy from this technique it is important to choose the match-point carefully. Lindner originally defined such a point VL as:

exp(vL —2(J) = VL —1 (4.14) for which an iterative solution gives the approximation:-

VL =2.1OUB+2.O8 (4.15) but this is dependent on substrate doping, and was subsequently modified by Brews [14] to give:-

V. = VLO. 75 (4.16) which is accurate to —1.5% in inversion and depletion. The agreement for perfectly uniformly doped substrates, is particularly good as shown in figure 4.2 and was shown to be a good approximation to the full form expression by Berman & Kerr [15]. The lateral shift between the measured and simulated curves in figure 4.2 is due to a finite Q1 present in the oxide.

L4 1 1 1 lo P I LJTl $

0 1911 C I \ 4' -S -a o I Simulated curve a. I I-. U LL

-4.0 -2.4 -.0 .0 2.4 4.0

v1-tg. (v)

Figure 4.2: Illustrating the good agreement between measured and simulated C-V curves

_50- Alternative means of calculating the high-frequency characteristic have been proposed by a number of workers [16,17,18,19,20,15,21,22,23,11] with methods ranging from rapid 'desktop' calculations [161 to large 'mainframe' algorithms [18] using transmission-line equivalent circuit models, to elaborate numerical simulation methods utilising Poisson's equation and Boltzmann statistics [221. The closed-form approximation of equations 4.10 and 4.11, however, was found to be more than adequate in providing a reference characteristic for the work in this thesis.

4.3.1.1. Non-Uniform Doping

The preceeding analysis conveniently omits the effects of dopant re-distribution at the Si:Si02 interface. This can be responsible for errors in NA or ND of up to 20% [1] (p. 167), particularly when the base doping concentration is high (>1016cm _3), where gate oxides are repeatedly etched and regrown or, of course, where an ion-implant has been deliberately introduced. The primary effect this has on the previous model is a breakdown of the depletion approximation [2] which assumes the abrupt termination of the depletion edge. The principal limitation that arises thus is that there exists a finite space charge in the bulk of the substrate as a result of the carrier density gradient. The conventional meaning of the 'flatband' condition will thus be lost.

In attempting to model this aspect of non-ideal behaviour, two approaches are possible; either by

determining an exact formulation of the C-V relation in the context of a non- uniform substrate [19, 22,20] or

comparing a theoretical model defined by independently extracted parameters

(en , EOX , Nsub, x0) (P,, A) with the measured characteristic, and iterating towards an approximate solution.

For the purposes of a stand-alone measurement system, however, the usefulness of the first model (1) would depend upon the availability of other sources for the doping profile, and thus almost obviate the need for the C-V measurement in the first place. The second method would appear to be more useful when used in reverse to determine the doping profile [24] but this approach must assume the absence of other effects that alter the apparent shape of the C-V curve, and these will typically not be known in advance (e.g. interface traps or charge non-uniformities).

-5'- A basic model for the ideal C-V curve is useful, however, even in the case of a non-uniformly doped device, in assisting the extraction of some useful process-related parameters from the MOS capacitor test structure.

4.3.2. Parameter Extraction

A list of some of the more useful parameters which may be obtained from high- frequency measurements is now given.

4.3.2.1. Substrate Type

Given that both the accumulation and inversion bias regimes can be achieved and fall within the gate-bias window chosen, then the determination of substrate type is a trivial matter as can be seen in figure 4.3.

4.3.2.2. Oxide Thickness/Capacitance

If the capacitor is biased hard into accummulation, Co. can be measured and hence Dox obtained:-

€, € 0 A D0 = —;. ( 4.17) Cox

where A is the effective area of the gate. The effect that variations in Dox have on the C-V curve are shown in figure 4.4. It is important that C0 is measured hard into accumulation to obtain the true (electrical) value of D0. Small errors in the measured C0 can have marked effects on calculations of silicon capacitance C for band-

I — —

S S .5 .5 N–TYPE S S 0 0 C C I I 9

0 0 I I a I I 0 U

Vo 1 t g - Figure 4.3: Determination of substrate type from well-defined C-V curves

-52-

1-4 1G21 - r- c L ,. >- c - 1 8. LL aoA

7.0

Dox 4gA 0 5.2 C Ii tJx soA ji 3.5

0 1.? U Nsub 7.E+14 a U U -5 -3 -1 1 3 5

1 -t. M C \/ )

Figure 4.4: Effect of oxide thickness on C-V curve shape

bending. McNutt and Sah [25] demonstrated that differences in C0 of as little as 0.6% can give rise to pronounced deviations in the Terman analysis of interface-trap density [26].

An empirical correction to overcome this limitation first suggested by Sah et al. [27]:

= C(V) 2kT 11+ (4.18) X A qjV—V j,I

provided a basis for further development of a more accurate determination [25]

) ( Cox _c) (4.19) I(d' )= (2ki0 which can be solved graphically or numerically.

Recent research has indicated that special treatment is needed for ultra-thin (

propose an alternative method involving the extraction of DOX at small values of surface band-bending [28]. A simpler method of an iterative C01 determination was later

-53- devised by Sarrabayrouse et al. [29]. Both methods demonstrate an accuracy of better than 1% error in DOX for arbitrarily thin oxides.

4.3.2.3. Substrate Doping

If the minimum equilibrium capacitance can be reliably measured [30] then the doping concentration near the surface of the semiconductor can be obtained [1] (p. 407) thus:

(S~~) 2 max N = (4.20) q EO E i A where Csmjn , the minimum silicon capacitance is given by:

c ITuncox = --- (4.21) C0 — C min and cbm , the maximum equilibrium band bending,

KT ( max = In (4.22) q n1

The effect of N 511 , variations on the shape of the curve are illustrated in figure 4.5. From equations 4.20 and 4.22, a transcendental relation results [1]:

4 .t C21I. - r- e3 C=j L.4 C5 .. C= > 8.8, 6.

V 7.04

U 5.28 Nub' 1.E+15 C 3.52 Nub 3.E+14

Nub 1.E+14 Ii 1.76 C Dox= 3øA 0. C U -5 -3 -t 1 3 5

Figure 4.5: Effect of substrate doping on the shape of the C-V curve

- 54- —2 NA _ _4kT i 1 1 => (4.23) ( (NA - € 0 €51 q 2 NA 1rnin'oxJ (in —I+'i41n(21n flI --1-1) ) ) _2), When C i, and C0 are per unit area (cm NA or ND will be cm . Such an iterative solution for p- or n-type substrates is provided by the algorithm:-

I N) I (NZb)J} N sub 1 = m un I + ½ln J21n I (4.24) I ni I' ni In the case of a substrate whose profile cannot be assumed to be uniform, N $ b calculated as per equation 4.24 will approximate to a spatial average (in fact, the first moment of the distribution) of the dopant concentration between the Si:Si0 2 interface and the edge of the depletion layer. In general, NSb evaluated in this way will not be particularly useful in subsequent analyses, although monitoring it will alert the operator to any major deviations from an expected value or one obtained via another technique (eg. 4-point probe).

An alternative means of evaluating the substrate doping is to employ two fast, pulsed measurements deep into the semiconductor, and then apply the formula of equation 4.45 as per a dopant profile. So long as the silicon has a long minority- carrier lifetime in comparison with the pulse duration, deep depletion will occur and the measured doping density will be a better indicator of substrate doping than the aforementioned CjjC method due to Deal et al. [31]. Indeed, it has been demonstrated by McGillivray et al. [30], that this technique is particularly useful when the long minority carrier lifetime inherent to modern, good quality processing, makes the accurate measurement of Cmin both slow and inaccurate due to the stagnancy of the inversion layer response to changes in gate bias.

Determining the correct value of N$b to use is perhaps the most difficult aspect of modelling the C-V characteristic and the largest source of error. In fact, it has been suggested by Berman and Kerr [15] that this variable be used as a fitting parameter to obtain the best agreement between measured and simulated curves in strong inversion. Along similar lines, a neural model was used by this author to determine an approximate relation (equation 4.25) between substrate doping and various aspects of the shape of the measured C-V curve [32].

-55-

(c m log(N 3 ) = 4.14x 10 -1 log +2.20X 10'log (4.25) Cox J

_9.76X1O_3_1+6.13X1O_2max dC Cox lwJ The derivation of this expression by a machine-learning technique is described in Chapter 5.

4.3.2.4. Flatband Voltage

As detailed in section 2.4.1, the determination of flatband voltage enables the calculation of VT, the threshold voltage of an equivalent MOSFET, and for the

determination of 'total oxide charge' (Q1 + Q,,, + Q0, + Q,). Vj,, can be established by interpolation from the measured C-V curve at the point at which the measured capacitance equals the 'flatband capacitance'.

= _i__+ ---- (4.26) C/b Cox C Typical values of V/b determined in this way will, however, not approach the theoretical ideal of OV. This is due in part to the gate:semiconductor work function, 1?ms Recalling the equation for flatband voltage:

Yji, =ms (Qj±Qm±QotQj) (4.27)

I Cox where cI is defined to be:

I E (4.28) ms - 1x+ • + 5bJ for which

kT 1 -1 (p-type) Ob = In x sgn (N 3b) (4.29) 1-- 1 = I (n-type) q (fl ) and for aluminium gates I,,. -0.6V. The symbols used above are defined in figure 3.8.

The limit on the accuracy of V/b extraction is therefore determined by the tabulated barrier heights [31,33,34] of the gate material. However, McGillivray [35] (p. 80), has pointed out that the reliability of these 'constants' is questionable, and as such, errors in V/b can give rise to misleading values for the total oxide charge. An additional caveat is the word 'total', since by using a single C-V measurement, one

-56-

cannot readily separate out the contributions from each charge type. In general, the flatband voltage shift between measured and ideal curves is given by:

1Vfb = (4.30) Eox

where i is the effective centroid of the total charge distribution.

Non-Uniform case

Real MOS devices can never, unfortunately, be assumed to be uniformly doped

[1] (p. 477), and consequently a voltage corresponding to zero band-bending (q, = 0) has a different meaning than Vjb in the uniform case. The difficulty lies in determining a flatband capacitance consistent with the characteristics of the sample. The term 'flatband' still tends to be used, but in reference to the condition of zero charge density in the silicon region of interest. In this case flatband must be redefined in terms of doping at the surface:

) . for accepwrs kT N (x = 0) (4.31) Jb = ± —ln I q i n, +for donors

McGillivray [35] concludes that perhaps the best estimate of Vjb in the case of implanted profiles might be through the use of a computer simulation along the lines of the algorithms due to Jaegar and Gaensslen [19] using profile data from a 1-ID simulator such as SUPREM for example.

This problem becomes particularly troublesome when there is a significant interlace state density present, when the stretchout in the curve produced by the traps will tend to mask any variations due to doping non-uniformities. A better estimate of then is from the shift measured towards the depletion end of the curve which is also less affected by variations in the surface potential relationship caused by interlace trapped charge. Alternatively, the C-V curve could be corrected for Q [36] if it could be determined from another source. A discussion on the sources of error in Vfl, can be

found in reference [1] (pp. 477-489).

It has been suggested [37,38] that the method of Ziegler and Klausmann [39] for profile determination upto the surface of the semiconductor can be used to extract an accurate value for Vfr for profiles that do not vary too rapidly near to the surface.

- 57 - 4.3.3. Oxide Charges

The determination of total oxide charge density, Q0 , defined by [1] to be the sum of Qm, Q0 and Q1, and distinct from D. which is voltage dependent, can be derived directly from the flatband voltage:

= iV fb C OX (C CM —2) (4.32) Aq where i Vib is the difference between the measured flatband voltage (see equation 4.30) and that predicted by equation 4.27.

The caveat here is that the terms involving Qit , Qi, Qm and Qot as in equation 4.27 are actually an approximation to the form:

Qi (4.33) EoX which involves a knowledge of the charge distribution. It was shown by Deal et al., however, that in the case of Qj and Q11 this charge is highly localised to within —200A of the interface [40]. This was demonstrated through the use of etch-off C-V measurements [41] for which the flatband shift was observed to be practically independent of the thickness of oxide remaining. Later experiments (using photoemission techniques) showed this value to be more of the order of 20-30A [42,43].

Similar etch-off experiments by Yon et al. showed that the distribution of mobile sodium ions, however, could not be approximated in this way [44]. Bias temperature stress testing, however, allows the sodium to be moved to and from the interface in such a way that it too can be considered as a charge sheet at the interface. This of course greatly simplifies its measurement and analysis.

Oxide trapped charge is usually assumed to be spatially located at or near the

Si :510 2 interface, and experiments using the etch-back technique have consistently reported the centroid (of hole or electron traps) to lie within 100-120A of the interface [45]. This allows for the usual approximation:

Q01 1 (4.34) Eox COX

_58-

Ion implantation, however (except that of boron), is known [46] to introduce extensive oxide traps, prior to annealing, and these are distributed throughout the whole of the oxide. In this case the above assumption will be in error. This poses a

problem when using C-V measurements since both the charge Q and the centroid must be determined separately. Since C-V measurements only provide a measurement of

effective charge, at the surface, the actual charge may be much greater when distributed and located some way into the gate oxide.

The measurements of the individual components of Q0 are difficult to isolate, particularly since there are many instances [47] when they are found to interact by influencing each others formation, annihilation and measurement.

4.3.3.1. Fixed Charge

Figures 4.6 illustrate the effect of substrate orientation on the total oxide charge.

In practice, the fixed oxide charge, Qj, is approximately two to three times greater on the <111> crystal plane than the <100> plane [48].

Q1 is normally positive after thermal oxidation, but may be made negative by injecting electrons into the oxide. The sign of the fixed charge can be determined from the direction of the shift in the measured C-V curve from the theoretical ideal

(allowing for any work function differences). As shown in figure 4.7 [1] (p. 427), a particular polarity of Q1 will cause the same shift on either n- or p- type samples.

7 [ii} [ii] [in] [001]

, S S S e

=0 '- n '

0•

o 0 2

I \T2I I -80 -60 -40 -20 0 An AO LATITUDINAL ANGLE (DEGREES)

Figures 4.6: Effect of silicon substrate crystal orientation on fixed charge density (after Arnold et al [481)

IMM P - TYPE C 00 Qo Cox -__... Cox ;;: IDEAL CURVE

---

'IDEAL[IDEAL C-V CURVE VG V0 - 0 + - -0 +

N - TYPE

C

C-V CURVE - +00 / 0o IDEAL C-V CURVE

_}_Cox IDEALV0 Ve - 0 + - 0 +

Figure 4.7: The polarity of oxide charge, and its effect on the C-V curve [1]

The direction of the shift can be explained by the concept of 'image charge'. Take the Q1 for example on a p-type substrate. This upsets the usual charge balance, creating a negative image charge on both the gate and in the silicon at the depletion layer edge, causing the depletion layer to widen. The total measured capacitance is now less than the ideal case with Qf = 0, and this applies to the capacitance measured at any value of gate bias. The C-V curve is thus shifted in a negative direction, without distortion, along the voltage axis.

The actual charge density Q1 is very dependent upon the final oxidation conditions of temperature, cooling rate and oxidising ambient [40]. It is important to realise, however, that any heat treatments such as in-Situ nitrogen post oxidation anneals, which alter the density of Q 1 , will also influence the interface trap density, either in the same or the opposite direction as shown in table 4.1 [43].

Montillo and Balk [49] observed a correlation as shown in figure 4.8 for which the physical origin is still undetermined.

4.3.3.2. Mobile Charge

Historically, the C-V technique has played a major role in the discovery, study and control of mobile ionic contamination of gate oxides. Considered by many [1]

-60-

PROCESS

•DECREASING 02 TEMP ft ft •SILICON ORIENTATION iD ft (100) _011)

• DRY N2, ARGON 4 ft *DRY N2 (LONGER TIME) 0'

• HYDROGEN

•WATER

• ARGON IMPLANT

• NEGATIVE FIELD 'iD ft

Table 4.1: Qualitative relationships between Q1, D, and processing variables [43]

• Nss P (100>

8r 211 CM ?Qox . 3 [ 1000 A SiO 2 GROWN I ATI000C z 94 E U .4;

600'C Pa

OIL. 0 20 40 60 80 100 120 140 240 NITROGEN ANNEALING TIME WIN) Figure 4.8: The observed correlation between fixed charge density and anneal time

(pp. 765-766, 754) to no longer be an issue in modern VLSI processing, the ability to measure this charge is still the biggest selling point and most common use of C-V measurement systems today [50,51,52,53]. The assumption that mobile ionic charge is no longer a problem is a fair one given the research and effort expended on its elimination from semiconductor processing since Snow et at. first used the technique to study the contamination of Si02 due to sodium in particular [54]. Typical procedures to minimise sodium contamination are described in reference [1] (pp. 765- 775). Bias-temperature stress C-V testing, however, does provide the quickest estimate of mobile ion density for the purposes of furnace checking, reliability studies, and trouble-shooting. For example, to qualify a furnace for contamination, an MOS

- 61 - capacitor is first formed in the usual manner (see Chapter 3). The capacitor is then subject to a sequence of heat treatments under mild bias stressing (i.e not enough to cause tunnelling) and C-V curves measured to determine any variation in flatband voltage so caused. The exact details of the measurement procedure are standard and given elsewhere [1] (pp. 429-435).

A typical procedure for a bias-temperature stress test is outlined in table 4.2. This simple technique can, however, be complicated by two additional factors, namely:

the formation of interface traps during stressing, and

the charging of oxide traps during negative bias.

Fabricate MOSC Measure HFCV characteristic at room temperature

Determine Vfr Heat sample to 200 °C Hold at high temperature with gate biased to -10V for 10 minutes. Cool to room temperature whilst maintaining the gate at -10V Inspect for gross stress failure. Determine Vj Repeat steps 4-7 with a + by bias..

Determine V Jb Calculate Q.

Table 4.2: Typical B-T stress test procedure

The effect of interface trap formation is often thought to be 'visible' by an observed stretchout of the curve, and this would indicate that corrective action should be taken, either in the numerical analysis [36] or in the experimental procedure. This, however, makes unjustified assumptions about the awareness of the operator.

Oxide traps are a separate issue, and described in the next section. In the majority of cases, the shift between the two curves will be a parallel one, and the flatband voltage shift thus observed, entirely due to mobile change. The typical curves

-62- from a heavily contaminated oxide are shown in figure 4.9.

4.3.3.3. Oxide Trapped Charge Q0,

Oxide traps are normally neutral, distributed within the thickness of the oxide - layer and associated with crystalline defects in the Si0 2. However, they may become charged by the introduction of free electrons or holes into the oxide by:

ionising radiation

photoemission

ion-implantation

injection from the substrate

and a number of other radiation-generating processes used in VLSI manufacture, including sputtering, electron beam/x-ray lithography, plasma etching and ion-milling.

It was reported by Gdula [55] that gate oxides produced by different processes exhibit different susceptibilities to charge-trapping caused by radiation. This is an area of active research, with many workers using avalanche injection and C-V measurements to determine the sensitivities of various gate oxides [56,57,58,59,60].

The presence of charged oxide traps gives rise to a voltage shift of the C-V curve in a similar manner to fixed charge, Q 1 . Because of this similarity, however, the two effects are generally inseparable by C-V analysis, and other techniques such as photo I-V [42] and etch-off methods [40] are more suitable for determining the trap energies

I

2 THEORETICAL 4 CURVE 0 9 3EXRIMTAL 0 CURVES U

U 0

0

0 £ I I -50 -40 -30 -20 -10 0 10 20 V6 (VOLTS) Figure 4.9: C-V curves from a heavily contaminated sample before and after B-T stressing. (after Nicollian and Brews [1])

-63- and distribution.

The effects of radiation, particularly during processing steps such as electron- beam evaporation, are well known [1] (PP. 549-577) for their ability to generate trapped charge, both in the oxide and at the interface. Given that the ionising radiation has sufficient energy to cause carrier generation (>8.8eV, the oxide bandgap) [611, then the electrons or holes thus liberated are free to get trapped at the defect sites. The trapped charge produced in this way is stable and fortunately easily removed by a post-metallisation anneal. A more detailed study of the thermal detrapping mechanism can be found in a paper by Manchanda et al [45].

An alternative means by which energetic carriers can be made to surmount the energy barrier is by direct injection into the oxide. Avalanche injection [1] (pp. 495- 508) is the preferred method for the study of the properties of oxide traps due primarily to the simplicity of the apparatus required. Additionally, avalanche injection is a predominant degradation mechanism in VLSI circuits [62] since the hot-carrier effects are now known to be related to the degradation of micron and sub-micron scale devices. The minority carriers in the inverted channel of a MOSFET can be accelerated by the source-drain field to such a degree that they are able to overcome the energy barrier at the Si:Si0 2 interface and become trapped in the oxide. This effect is responsible for the gradual and cumulative degradation of MOSFET parameters such as transconductance and threshold voltage [63]

Research [64] has indicated the superiority of IGFETs for the study of hot-electron transport in silicon. However, the MOS capacitor is still a very powerful technique for the analysis of the hot-electron trapping properties in Si0 2 films, especially for monitoring rapid changes in the fabrication process [65].

4.3.3.4. Interface Trapped Charge

The techniques for measuring the distribution, density and properties of interface traps, (often also referred to as surface states, interface states and fast states) are many and varied and include the high-frequency C-V technique [1] (pp. 325-328) first developed and used by Terman [26]. Because of their wide-ranging and degrading effects on proper MOSFET and CCD operation, interface traps are probably the most important non-ideality present in MOS structures that can be characterised by the C-V

-64. technique.

Rather than being at a single discreet energy level within the band gap, interface traps introduce a continuum of energy levels throughout the forbidden gap as illustrated in figure 4.10. Because of this distribution of Qj, it is usual to introduce the concept of interface trap density, D:

Dil - 1 dQ1, (4.35) - q dE The effect that a high density of interface traps has on the high frequency C-V curve is shown in figure 4.11. As can be seen, this curve exhibits a characteristic 'stretchout' due to the change in contributory Qj, as the bands are swept through the traps in response to the gate bias. Looking again at figure 4.10 illustrates the mechanism in terms of the change in the energy levels EC and Ev at the surface relative to the fixed trapping sites and the Fermi level.

An equivalent circuit model shown in figure 4.12 proposed by Nicollian and Goetzberger [66], shows how the traps may be modelled as a parallel RC branch to the usual silicon capacitance, C. This simplifies to a parallel combination of a frequency dependent C, and G. as shown in figure 4.12 b., where

ci' cp = csj+ (4.36) 1+W 22T and

G = Cu WT (4.37) W 1+c02T2 where

= CR (interface trap time constant)

A ll chorgedi N()I ------ - 7 Neutroll MIT neutralj

Figure 4.10: Band-bending diagrams to illustrate the occupancy of a continuum of interface trap energy levels

-65- 1 1 1 b r- 1 i...am s ro

I 0 C High trap density "-a

o LL Is 0. I ()

0

vait GL U M (v)

Figure 4.11: Effect of a high interface trap density on the shape of a C-V curve

R

I C si LI Cit CP'

Figure 4.12: Equivalent circuit model of a MOS capacitor with interface traps

The actual quantity measured by the capacitance meter (in parallel mode) is of course an admittance of the general form-

Y. (4.38) thus

2 w2CTC ox Gm = (4.39) C, )2+ W 2 (C0 + Csi + T2(C0 + Cr1) and

-66-

______ C,1)2+ =ox _(C 0 + C31 + W 2T2C (C0 + C51) Cm + 1 (4.40) 11• (Cox C,1)2+ (CO,C + C5, + C) + + T2 2 C51 w2 (Cox + cSi) J This implies that the trapped-charges can be analysed from the capacitance or the conductance. As will be shown later, the conductance method can actually yield more accurate information, and is more sensitive to lower trap densities. The high- frequency C-V technique does, however, provide a faster means of determining Vfr and total Q,,.

Trap density is given by [1] (P. 328):

J}Csi(&) Cox — (cm 2eV') (4.41) =

where d ç /dV is the slope of the /s VS. V8 curve. In fact it is the shape of this curve that contains all the relevant information for the extraction of interface trap parameters from high-frequency C-V measurements. This is based on the assumption that the interface traps themselves do not actually contribute to the capacitance at high frequencies, but only affect the response to changes in gate bias. A typical 0, -V8 curve shown in figure 4.13, from an electron-beam metallised sample before and after the post-metallisation anneal illustrates the differing shapes. The relation can be obtained from an interpolation of the simulated ideal C-V curve with the measured C-V plot. Unfortunately this relation is only accurate in depletion and weak inversion since the frequency response of the inversion layer introduces unnecessary complications into the analysis. Si.arf'a p ter,tl1-Gte b i ae plo-t

15

a

-+10 -.5 0

Figure 4.13: Surface potential - gate voltage plots from sample before (lower curve) and after (upper) post-metallisation anneal.

-67- The major sources of error in extracting the interface trap density by the high- frequency technique are:-

Errors in the theoretical C5 (01)

Errors due to differentiation of the cb.. vs. V2 curve

Failure to measure a true high-frequency C-V curve

Numerical round-off errors.

A typical D (ç5 3) plot derived from the previously mentioned sample is shown in figure 4.14. A full discussion of the errors inherent to the C-V method can be found in reference [1] (PP. 333-344).

Point 1 is worthy of further mention since any errors in the calculation of C will give rise to an erroneous extracted Cil and hence onto false Di,. This is a problem especially for non-uniform profiles and various correction factors have been proposed. Baccarani's technique is reminiscent of that used by Blacksin [24] to derive a doping profile, and involves a comparison between the measured C-V curve and that derived from solving Poisson's equations with a hypothetical doping level to calculate an apparent level of D.

Trp dri w I s.' - brd 63 rlr>'

'S .4

> 1 C

Il

E 1

U.

I, 1 I U

4, ,.

E: I - Et CV) -->E'.-

Figure 4.14: Interface trap density and distribution as determined from the high-frequency C-V data

- 68 - The main disadvantage to using the capacitance data is its lack of sensitivity to low (-1010cm 2eV 1 ) levels of interface trapped charge. The low frequency technique also suffers from this resolution problem. Since most of the current research on modern MOS processing is aimed at tracking these lower levels of D11 , the conductance method as described in section 4.6 may appear to offer a better solution to the measurement problem.

4.4. Typical Non-Standard Curves

To highlight some of the practical difficulties of interpreting C-V measurements, some lesser-known capacitance characteristics will be described. The purpose of this section is not to attempt to provide a model or detailed mathematical description for these observed non-idealities, for often there is no known model, but to raise the reader's awareness to these everyday problems. Correction factors and/or actions are, however, described as applicable.

As a consequence of the high sensitivity of C-V measurements, an understanding of some of the more unusual responses from the test device is important to be able to isolate the exact source of problems. Many of these effects leave a signature in the form of a very characteristic high-frequency C-V curve, whilst others may tend to mask the parameters or characteristics under investigation. A major drawback for automatic C-V systems is that these deviations from the expected form frequently occur, and if allowed to pass undetected they will cause erroneous data to be extracted from the curves and thus many of the usual numerical analyses fail.

Series Resistance

A common problem and one which is frequently ignored is the effect of the bulk resistivity on the equivalent circuit model, as shown in figure 4.15. For a lightly doped substrate and/or imperfect contact, the error introduced in the measured capacitance

Cm is significantly frequency dependent and of the order of a few percent. For a poor substrate and/or top probe contact, these errors can be quite severe. In the latter case, they are best prevented, for the values of capacitance measured are quite often low and subject to further round-off error by virtue of the resolution of the equipment. In the former case, it is possible to completely correct for the effect of R on Cm if it can be measured reliably. The simplest method to determine R3 is to bias the sample into

.69- G m

Figure 4.15: MOS capacitor equivalent circuit model including series resistance of the bulk silicon

accumulation, switch to series circuit representation mode and measure R directly. This is possible only in accumulation when the equivalent circuit model of figure 4.16 can be assumed and there is a negligible contribution from the silicon capacitance, C and interface-trap capacitance, C.,. The correction formulae to be applied to the

measured Gm and Cm are given below [1] (pp. 222-226):

Gm (1 —R, ) — w 2C, GC = (4.42) (1 RsGm )+ u 2C,,R 32 Cm Cc = -- (4.43) (1 —R m ) + w2C,R 32

where GC and Cc are the corrected values of the sample conductance and capacitance respectively. It should be noted that during the remainder of the C-V sweep, the capacitance meter should be set to assume a parallel circuit model as the best representation of the sample under all conditions of bias.

The effects that a range of values of R have on the measured C-V curve are shown, along with the corrected curves, in figure 4.17.

Oxide Resistance

The resistance of the gate oxide is normally considered to be so high as to be insignificant in any analysis. The exception is when the oxide is 'leaky' or has a tendency to break down under an applied bias. The effect of an oxide completely breaking down is illustrated in figure 4.18. Both of these factors can have a severe

-70- cox

Figure 4.16: Simplification of figure 4.15, valid only in strong accumulation

FR .LaJ —\./ j so , Rs0

' Rs82 x I.1 Rs150 —

- Re-330 0 C C 54—

- Rs-560 EL 0 C 36— 0. C C) Is —

I -3.0 -i.e -.6 .6 1.6 3 .0

vi.t9 ft (v)

Figure 4.17: Family of C-V curves relating to different values of series resistance, R effect on the shape of the C-V curve. Leaky oxides can originate from a variety of processing deficiencies which give rise to a significant reduction in dielectric strength, for example, an unusually high density of pinholes, particulate contamination or other defects. Leaky oxides cannot generally be characterised, and a detailed analysis is not normally pursued using electrical measurements. Physical inspection and oxide furnace control are more appropriate in this instance. The C-V curves illustrated in figure 4.18 are not, however, merely for interest. With careful biasing, keeping the substrate in a reverse-biased state to reduce current leakage, it may be possible to extract some information from the sample such as substrate doping or an estimate of Vib. The usefulness of this unreliable information, however, is questionable. Figure 4.19 shows a C-V curve measured from a wafer which had had an RTP oxide grown with an insufficient anneal time. The effects of under-annealing RTP oxides are detailed

- 71 - UU_

EIt [TTTTTTT'Tu1TT ______so ' .rJETtJ I bOlu ;. '4 ! ; hii 1 I J H ,fl zTi

OX .... : ,. :: fr;:j;u ; 'I

-tO -8 -6 -4 -2 3 2 6 6 8 tO AMF VOLTAGE (oft) -.-+

Figure 4.18: Effect of oxide breakdown on the C-V characteristic (after Huang [10])

L4.) \/ I 9, 5,- x

4

0

3

o 1± C 2 0 C U

VD 1 -t aL 0 eq (VJ Figure 4.19: C-V curve measured from a sample prepared by RTP with a moderately leaky o, elsewhere [58].

For particularly thin oxides (-100 A), Fowler-Nordheim tunnelling contributes significantly to oxide leakage for biases as low as 5 volts [671.

- 72 - Poor MOS Contacts

A non-ohmic contact to the rear of the wafer can give rise to spurious frequency- dependent behaviour which is more noticeable in accumulation. To test for this condition, C0 could be measured at several different frequencies and compared. Any discrepancies, other than those due to the frequency response of the measuring equipment may signal a non-ohmic contact. For properly prepared C-V capacitors, however, this is rarely found to be a problem.

An unreliable contact to the top plate has been reported by Huang [10] and shown to have a disastrous effect on the measured curve as shown in figure 4.20.

Interface Trapped Charge

The presence of a high density of Q1, on C-V curves is well known [1] for causing a characteristic stretchout of the curve from depletion to inversion. This effect, however, is often only appreciable when viewed in comparison with an ideal curve. Figure 4.11 has already illustrated this effect.

Huang [10] has observed another anomaly during a rapidly swept un-pulsed C-V measurement. As shown in the figure 4.21, the presence of interface trapped-charge causes the equilibrium high-frequency curve to exhibit a delayed entry into deep depletion due to the interface traps acting as a supply of minority carriers. This effect has also been observed by this author during pulsed C-V tests as illustrated by figure 4.22.

I

_TS

-10 -8 -á -4 -2 0 2 4 6 8 10 UMP VOLTACf (,*It) -4 Figure 4.20: Effect of a poor top contact to the MOS capacitor (after Huang [10])

73 - I—I I 620 IV-, P. ,-. '-4 am r -

X 0 U

U

Qte VD i M. Figure 4.21: High interface trap density causing delayed entry into deep depletion during a rapidly-swept C-V measurement

PI L4 1 I 3. E

U C

4,

CL U a. U

-a. - .j .7 1.8 2.9 (v) Figure 4.22: Glitch caused by delayed entry into deep depletion during a pulsed C-V sweep

Localised Charge Non-Uniformities

The effect of a laterally non-uniform distribution of fixed oxide-charge has been shown by Chang and Johnson [68] to produce anomalies in the C-V relation that appear qualitatively similar to those due by interface traps. To test for charge non- uniformities, the method they proposed to distinguish between the two involved an

- 74 - analysis of the flatband voltage distributiOn function. This test is based in part on an analysis by Brews and Lopez [69] which proved that the distortions in the high- frequency and the quasi-static curves could not be the same in each case.

Such a test was used by Jourdain et al. to determine the cause of the distortion in an experimental high-frequency C-V curve after bias-temperature stressing [70]. In this experiment, interface states were found to be the primary cause for the observed curve shape as shown in figure 4.23.

Capacitive Substrate Top Contacts

As an alternative to making an ohmic back contact to the sample wafer, a capacitive contact is sometimes employed. This arrangement is typically used by mercury probes which have a ring-dot contact configuration as shown in figure 4.24. The principle is in essence a simple one; as long as the 'substrate' contact capacitance is large in comparison with the gate capacitance, its effect can be neglected. Herein lies the caveat, for McGillivray et al. proved the inadequacy of this assumption even for capacitor ratios as great as 128:1 [35, 71]. Typical mercury probes use contacts which have an area ratio of 30:1 or 50:1 [72, 73]. The effect of different area-ratios on the shape of a C-V curve can be gleaned from figure 4.25. McGillivray et al. concluded that measurements using mercury probes were unlikely to be useful except in undemanding measurements. All capacitors fabricated for this thesis work employed

'Sc

E S 5 MVm

Mi*jipd Cv•

C.omgut,d ©

LU 75 2

U

Pt

FIELD PLATE YOTA&E (v) Figure 4.23: Distinguishing between interface trap charge and oxide charge non-uniformities (Reference [70])

- 75 - Substrate capacitive top contact

Figure 4.24: Ring-dot contact configuration commonly employed in mercury probes

Ohmic Back Contact Capacitor Contacts 128:1 . 64:1 ---

4:1 - -- -

x 0 U .5

U

fIll - -

-4 -3 -2 - 0 I 4

Voltage (V)

Figure 4.25: Effect of different area ratios on the MOS capacitor characteristics using the two top-capacitive contact configuration an ohmic backside substrate contact to eliminate this possible source of error.

Effect of Incident Light and Elevated Temperatures

These two environmental effects are grouped together here, for they produce almost identical effects via similar electronic processes. Incident photons of visible light are sufficient to cause electron-hole pair production in the bulk silicon [74]. Similarly, elevated temperatures will also encourage these generation processes as the silicon becomes degenerate.

The effect of additional minority carriers is very noticeable in the inversion region of the C-V curve as shown in figures 4.26, where they contribute to an increased silicon capacitance through a reduction in the inversion layer width. This causes the

- 76 -

I I I I I

2 7 a IO footcondli to IC 250 C

23 to' 09 09

ot 08 g08 33.102

07 '40• 07

N 145 i iO'4 c.- 3 27 N. 145i0VIS 06 t..02m 06 t•,•02m -66

05 I I I I -i 05 I I I I -u -to -5 0 5 to IS 20 -20 -IS -10 -5 0 5 to *5 20 vs .v V0, V

Figure 4.26: Minority carrier production at the gate periphery caused by light and elevated temperatures, and its effect on the shape of the C-V curve (after Deal [75]) capacitor to exhibit almost 'low frequency behaviour'. A further effect light has on the inversion capacitance is a reduction of the minority-carrier generation time constant in inversion [75]. However, this mechanism is overshadowed by the generation processes at high frequencies.

MOS capacitor measurements should, therefore, always be carried out in complete darkness and at room temperature to avoid complications in the analysis.

Enhancement/Threshold-Adjust Implants

It is common practice to raise the threshold voltage of MOSFETs by the implantation of additional dopant [76]. The effects that such a threshold-adjusting implant has on the C-V curve is both marked and characteristic as is shown in figure 4.27. This distortion has the effect of altering the effective threshold voltage (via a change in Vp,), and modifying the turn-on characteristic. The shape and extent of the distortion can be used to determine the substrate doping profile, as will be described later in section 4.5.

One problem with this empirical analysis of the curve shape is that for low implant doses and energies, the nature of the distortion is qualitatively similar to that produced by interface traps. An analysis of the equivalent parallel conductance, however, can resolve this uncertainty [77].

Since ion implants are known to cause a variety of other side effects such as oxide-charge trapping, care should be taken with the preparation of samples specifically

- 77 - C U C 0 4)

OIL 0 CL 0 U

3.8 4.9

1 (vj Figure 4.27: Distortion of the C-V curve caused by a threshold adjusting implant (8E11, 40KeV Bo into a Boron-doped substrate, NA =7E14 cm to determine substrate doping profiles.

Depletion/Compensating Implants

In the case of a threshold-adjusting implant that is of opposite dopant type to the substrate, type conversion can occur at the surface at sufficiently high implant doses. At the sort of doses typically used in CMOS processing to balance the threshold voltages of the n- and p- channel devices, a pn junction may form near near to the Si:Si02 interface.

This effect was studied by Fang [78] and later by Tong and Andrews [79] in more detail in relation to the observed effects on C-V curves. Both conclude that the resulting device is not readily modelled as a simple MOS capacitor. For the case of a p-channel device, fabricated in an n-well on a p-substrate, a boron Vt-adjusting implant can give rise to four different surface profiles, depending on the implant dose and energy, as listed below.

-78- For very low doses, the surface remains n-type, but more lightly doped.

For an increased dose, surface conversion takes place. For low doses, this means the pn junction is practically at the surface.

For slightly greater doses, the space-charge region created by the pn junction extends to merge with the depletion layer induced at the surface, under the gate. As a result, there is no conducting channel.

For higher doses, the surface depletion layer and the pn junction space-charge region do not meet even for gate voltages which invert the surface p-layer.

As pointed out by Tong and Andrews, case 3 is the usual situation arising in VLSI CMOS processes, with case 2 an extreme example. Case I does not affect the equivalent circuit model, nor the interpretation of the curve, and case 4 is unlikely to arise under normal circumstances.

An equivalent circuit model proposed by Fang accounts for the behaviour of the measured capacitance in accumulation, and it was noted that the shape of the C-V curves was dependent upon the depth of the junction created, and the dose of the implanted ions. A non-linear relation between the dose and the threshold voltage change was observed as reproduced in figure 4.28. V1. was measured from the I-V curve of an MOS transistor on the same substrate.

The circuit model was developed further by Tong and Andrews to cover all bias regimes of the MOS capacitor, and a similar dose - VT relation was observed as

.- >0 P+ DOSE 3E11 UJ C., 6E11 W 49 —J L r 0> iii \\t\ -03 0

w cc I I- 0 10 BORON DOSES (10/cm 2 ) Figure 4.28: Implant dose-threshold voltage relation as observed by Fang [78]

- 79 -

shown in figure 4.29. In this instance, VT was determined directly from the C-V curve.

Both of these results, however, contrast with a more recent study by Virdi et al.

who investigated the effect of Dox and implant dose and energy on the C-V curve and concluded that there is a linear relation [80]. It is this author's opinion that this difference arises solely from the inappropriate model and method of threshold voltage determination used in this study.

In summary, a compensating implant affects both the observed maximum capacitance, since this is no longer a function of the oxide alone, and the shape of the C-V curve. As can be seen from figure 4.30, a characteristic 'dip' is observable. This should signal to the operator or the system that the conventional analysis for the

determination of Dox and VT should not be applied. Instead, Dox is best determined by other means such as ellipsometry. VT, however, can be accurately determined from the curve:

VT = VG ( I C W . 4.44)

i.e. VT is the gate voltage corresponding to the capacitance minimum, C, at the 'dip' in the C-V curve in inversion.

-3

-2

a,

C p

21 I 0 2 4 6 8 10

B;1 DOSE, 1011crn2

Figure 4.29: Implant dose-threshold voltage relation as determined by Tong and Andrews [79]fro,n the C-V curve.

- 80 -

17- t-4

K '-I

C 0 C 95 4) o r. Q. 0 U 25

-2 1 4 7 10 13

V1•t9 (v) Figure 4.30: c-v curve characteristic of a depletion implant

(2E11, 180KeV As into Boron-doped substrate, NA = 1E15cm 3 )

Short Minority Carrier Generation Lifetime

A short lifetime, r8 can have a variety of undesirable effects on C-V measurements, especially the pulsed variety. It is important to be able to recognise these effects in the C-V curve, since they are virtually impossible to model, and may cause any analysis that makes use of the entire depletion-inversion section of the C-V relation, to fail.

The effect of a short lifetime prohibits complete deep depletion during a pulsed C-V sweep. The inversion layer is able to reform after a pulse into depletion and before the capacitance can be measured. Faster capacitance meters may partially overcome this limitation, although they are not currently available with an accuracy comparable to their slower counterparts. Additionally there is always the possibility that the measured curve is neither fully depleted nor fully equilibrated.

Since modern MOS processing can produce very high quality, long lifetime silicon, it is really only necessary to identify those samples which have fallen short of the mark. The undesirable, but visibly obvious effects of low 'r 8 on a pulsed C-V curve can be seen from figure 4.31. The visible step' was caused by a momentary pause due to an automatic range-change within the capacitance-meter.

- 81 - . L.4 I = tm =4 C= ...., 4r

X 'I 92 I, 0 C 0 24 -

-Li o C 16 - 0. GLITCH C 0 8-

-3.6 -2.2 -.e .6 2.0

Vc, • It (y)

Figure 4.31: The effect of applying a pulsed C-V sweep to a capacitor sample with a short minority carrier lifetime

Lateral Inversion Layer Spreading

First studied by Nicollian and Goetzberger [81], the effect of a high density of oxide charge may be to invert the silicon surface even in the absence of an applied bias. This unexpected inversion is most common on p-type substrates since the net charge is predominantly positive. The typical charge densities, however, for which Nicollian & Goetzberger observed this effect are by todays standards exceptionally high (1012crn _2) , and produced shifts of around —20V in their samples.

More important for todays measurement is the effect of minority carrier response time (in inversion) and thus on a C-V curve of an 'inversion layer beyond the gate' (ILBG) caused by the presence of moisture on the surface of the oxide. This phenomenon and possible curves were detailed earlier in section 3.6.2.

In common with the problem of high oxide charge density, the silicon surface beyond the gate may become inverted by the lateral current flow. This inverted region acts as a source of minority carriers, and thus when the region beneath the gate is biased into inversion, the effective carrier lifetime is reduced. As a result, the resulting C-V curve tends towards the low-frequency response, as can be seen in figure 4.32.

- 82 -

011 U 0

Voltage (V)

Figure 4.32: ILBG causing the high-frequency C-V curve to tend towards low-frequency behaviour (after McGillivray [35])

The effect of moisture is not limited to p-type substrates, but is likely to be more noticeable on lightly-doped substrates due to the low 'natural' threshold voltage. Unless precautions are taken this effect will be troublesome when attempting to determine an accurate C-V curve in inversion. Obviously, this moisture-induced phenomenon is greatly affected by ambient humidity, and critical measurements should, therefore, be performed in a very dry environment and the samples baked at —200°C to drive out the surface moisture.

4.5. Pulsed C-V Dopant Profiling

Specifically when the non-uniformity of the substrate is of interest, the C-V technique is readily adapted to yield the corresponding doping profile. As already mentioned, the distribution of donors and acceptors near the surface of the silicon have a marked effect on the electrical properties of any device. Irrespective of the origin of the non-uniformity, the ability to measure the final dopant profile is an important check on processing.

As observed in the previous section, the slope of the transition region of the C-V curve relates to the activity and the distribution of ionised dopants. Van Gelder and Nicollian [82] demonstrated how it was possible to obtain the profile from the slope of a 1/C 2 vs. V curve, assuming of course that interface trap stretchout could be neglected and the depletion approximation holds:-

-83-

1-1

d ( N(w)=-2 1qe0eS1 1 'j (4.45) dV-G I TJ M

where

1 (4.46)

The pulsed technique is commonly used, when the minority carrier lifetime permits, to profile 'deep' into the substrate and beyond the equilibrium maximum depletion width maximum given by:

E E çj W m = (4.47) mm A typical voltage step sequence where the device is biased alternately into accumulation and then into the selected degree of depletion is shown in figure 4.33.

A number of enhancements to the basic technique have been made over the years to overcome some of its limitations [37. 83, 38] and maximise the available parametric information. As a result, the C-V technique is probably the most convenient, accurate and non-destructive means of examining the near-surface doping profile.

Accumulation

0

Poo

Depletion

(surface) Depth into silicon (substrate)

Figure 4.33: Voltage step sequence for a pulsed C-V measurement (equal spacing)

-84- 4.5.1. Limited Depth

Even using the pulsed technique [84], at higher doping levels, the larger voltage required to deplete the substrate will eventually cause the silicon to breakdown. This avalanche breakdown voltage decreases with increasing doping and reducing oxide thickness. Additionally, there is a practical upper limit on doping of _1O 18 CM —3 imposed by inter-band tunnelling, beyond which the pulsed technique gives no more information than an equilibrium measurement [1] (p. 375). For samples with short minority carrier lifetimes, an etching technique using a device such as the Polaron cell [85] may be used to overcome the depth limitation.

4.5.2. Limited Resolution

Because the analysis of van Gelder and Nicollian actually determines the majority carrier dezsity at a diffuse edge at the boundary of the depletion region, the uncertainty of which is a function of Debye length;

Esi x D ( 4.48) = the measured profile becomes 'smeared out' and, in practice, limits the resolution to about three or four Debye lengths (at room temperature). For rapidly varying profiles, Kennedy and O'Brien have suggested a corrective routine [86]. Unfortunately, this requires 3rd-order differentiation olthe C-V curve, which can give rise to large errors in the resulting profile. Furthermore, this and other [87, 88] methods are only adequate for structures whose profiles do not vary greatly over a few Debye lengths. Wilson [89] approaches this problem through the application of an iterative simulation of the profile from a calculated C-V curve. This procedure was shown to improve the resolution of abrupt profiles to approaching that of the local extrinsic Debye length. The method of Blacksin [24] uses a similar technique. Bartelink [90] also proposed a successful correction technique involving a more critical study of the underlying assumption in the van Gelder and Nicollian model, namely the depletion approximation. 4.5.3. Interface Proximity Limitation

Another well-known shortcoming of the van Gelder and Nicollian analysis is that of the anomalous exponential rise in doping concentration near the surface of the semiconductor as shown in figure 4.34. This is again due in part to the failing of the depletion approximation within a few Debye lengths of the surface. Surface in this

context refers to either a Schottky diode junction or the MOS capacitor Si:Si0 2 interface. The MOS capacitor is the preferred device for surface profiling since it

allows for the closest approach to the surface [1] (p. 381). Additionally, fixed charge at the semiconductor surface is often due to distributed interface-traps, but at energies very close to the band edge. As a result, an artificial and rapidly rising profile is observed.

A technique devised by Ziegler et al. allowed for profiling right up to the semiconductor surface to take place [39]. For this, a uniformly doped substrate is assumed for which a closed form solution of equation 4.45 is obtained, by considering the majority carriers and the space-charge capacitance in the derivation of doping density and depletion width.

The relations proposed by Ziegler et al. refer specifically to the 'surface', w <2XD. For the pulsed C-V technique these become

I p -t 1 --P I •l — I' E 0 '-p

4' I C I a

C 0. a a B .0? .14 .21 .20 .35

ptI - CmIarr)

Figure 4.34: Anomalous rise in doping near the Si:Si02 interface caused by the breakdown of the depletion approximation.

- 86 -

kT 0 dill (w 4.49) - 2 1 2 J = 811

N(w )= ± q€2€ X 82 (4.50) (--)] Xd

where C is the measured capacitance and V the applied bias. 9 1 (w/X) contains information about the slope of the 1/C 2 relation, whereas 82 is the correction factor to the doping relation, due to majority carriers, and is in fact the ratio of the doping concentrations evaluated with and without the 'complete' depletion approximation. The g functions g 1 and 9 2 are calculated from the g-function [39] defined by the transcendental relation:

-= (g _ln(g )_1)S (4.51)

These are graphed in figure 4.35 and have the limiting values:

2 11mg 1 = - and limg 2 = 3 3 and for w/X a >2:

and 821

Thus, at distances greater than two Debye lengths from the surface, equations 4.49 and 4.50 just revert to those of 4.45.

The procedure for using these relations, which is not very clear in the literature, is as follows:

I .2

1.0

.0 (V 3 .0

.4

B .14 .20 .42 .50 .70 0 .14 .20 .42 .50 7.

Figure 4.35: g-functions for the Ziegler analysis of the near-surface doping profile

-87-

Calculate g(w/X) using equation 4.49 and the measured C-V data.

Read off 9 2 and w/Xd from figure 4.35.

Correct profile data N(W) as per equation 4.50 with value of 9 2 .

From w/Xd and equation 4.49, can calculate depth, W.

Examples of surface-corrected profiles of uniformly doped samples are shown in figure 4.36.

A frequent criticism [91] of the Ziegler technique is that since uniform doping

must be assumed for the extraction of g i and g2 from Poisson's equation, the analysis cannot really be applicable to a non-uniform profile. Bartelink [201 claims to have overcome this shortcoming. Gordon [38], however, notes that although the error will increase with an increasing gradient in the dopant near the surface, the model is reasonably accurate for ion implants of low dosage, e.g. VT -adjusting implants.

An added bonus from the Ziegler analysis is that it yields an accurate value of

Cp, for non-uniform profiles since the point at which g becomes equal to 2/3 is defined as the onset of depletion.

4.5.4. Interface Trap Distortion

As mentioned earlier, the presence of interface traps can mask or counteract variations in the C-V curve due to doping changes. Whilst it is far preferable to minimise Dit through careful preparation of the samples to be profiled, there may be

I .-, 1 1 — 4 •-' 1 1 — 'S

E E 0 0

9

I S C C S a a 0 C C I I 0 0 a a N

D p I, Cn,Io ,- ... ) p t P, C ,,. 1 0 or. - ) Figure 4.36: Doping profile corrected for interface proximity distortion (surface correction also shown, magnified)

SCRE occasions when this is impractical. In this instance, Brews [36] suggests a technique using the combined information of the high and low frequency C-V curves to null out the effect of D,, and produce a corrected profile, using equation

( i—c.ic0 '1 N(w)= _JNo (w) (4.52) L1CHF/C ox where N0 is the uncorrected profile, and CLF and CHF are the measured quasi-static and high-frequency capacitances respectively. This method, however, suffers from a depth limitation, since no equivalent of 'deep depletion' is attainable from quasi-static measurements. Additionally the method is considered only to be accurate when the interface-trap density is small, and of limited use in practice [35].

An alternative experimental procedure demonstrated by Brown et al. uses liquid nitrogen temperatures to lengthen the effective time constant of the interface traps, thereby reducing the response of the trap states to the alternating measurement signal [92].

An example of the effect of a high interface trap density on the calculated doping profile of a uniformly doped substrate is shown in figure 4.37. This is obviously misleading and to be avoided.

.f F= I.11 -. c= -F i 4E+2

ZE+1

4)

C S

1+1 m c

0 1E+ 8 .2101 .4281 .6382 .8482 1.0583

13t.h Cm1r-r)

Figure 4.37: Apparent non-uniform profile arising from the effect of a high interface trap density

-89- 4.5.5. Differential Noise

From the first application of the differential C-V technique to the profiling of p-n diodes [93], the method has suffered from noise in the profiles as a result of taking the derivative of the C-V curve. Miller [94] was the first to formally address this shortcoming with the 'feedback' method of profiling a reverse biased p-n junction. To increase the signal-noise ratio of the resulting C-V curve, the constant-field method of profiling was applied, whereby the depletion layer edge is made to fluctuate by a constant fraction of the Debye length.

Recently, McGillivray et al. proposed a method of accurately profiling MOS capacitors with noise-free results [95]. Their technique was to attack the fundamental problem of the noisy derivative by optimising the C-V data collection in such a way as to maximise detail and minimise noise. Like Miller, the method involves some optimisation of the voltage step, but uniquely this formulation optimises on the resolution of the (digital) capacitance meter to produce a noise-free derivative. Furthermore, this computer technique allows adaptive voltage-step spacing with changes in local doping density and capacitance meter range-changes.

The basic inequality for this method is given by:

= qE0 E5 N(w)ZC AV - (4.53) c 3 which specifies the voltage step in terms of the local doping level, the measured capacitance, and the optimum capacitance interval between successive points. McGillivray et al. specify this notionally to be 200R, where R is the capacitance meter resolution, to achieve a 1% noise criterion in the final profile. Whilst this simple data-optimiser produces very smooth profiles for large capacitors ( 0.0256c,n 2) as shown previously in figure 4.36, the results were somewhat disappointing for devices of small area (782x 128.i.rn) The resulting profile from the small area device, with the 1% rule is too restrictive as shown by figure 4.38. Greater information would be available at a reduced accuracy in the final profile if this restriction were relaxed.

This author has devised a modified form of the algorithm of McGillivray et al. for small area capacitors. The method involves looking ahead prior to making the C-V sweep to estimate how many measurement points could be made at the fixed 1% criterion. Then, based on heuristic knowledge about the likely degree of substrate

-90- I I-' r -P 1 1 (1) 4E+l

U '-I

>'

C n 1E+1

C

0 1+1ø. 0 n .1293 .1939 .2585 .3232

D-tP- (ml = rDr, m )

Figure 4.38: Poorly resolved profile from a small-area device non-uniformity and imposing a tower limit on the number of data points required for a profile, the noise criterion is varied so as to trade off resolution for noise in the final profile.

Between the upper and lower capacitance values of interest for profiling exists a range over which we initially assume the capacitance meter maintains a constant resolution error:

Crange = Cupper - Ciowe,. (4.54) where Cuppe,. and Ciower are the 95% and 5% points on the C-V curve:

Cupper = 0.95C0 (4.55)

Ciower = C min+ 0.05(C0 —C) (4.56)

Now, the maximum number of measurement points, N max that can be made within this range at a given noise criterion R is simply:

C Al - max where Error is an acceptable value for the error in the doping profile. Conversely, for a given number of data points N, the expected error in the profile is just:

_91- Error max= (4.58) range Now for small-area devices we wish to trade off some of the error in the profile for an increased number of points (N) to preserve profile detail. For this, we could just take the arithmetic mean, although the geometric mean provides a more centrally-weighted average. This gives a pseudo-optimal set of parameters;

- C range ) N0,,, (4.59) - 2R J 2R Error0,, = (4.60) I C range J These provide both a suggested number of data points as well as an estimate of the likely maximum error in the profile. One then uses this value for Error01,, to determine IC thus:

= 2R (4.61) Error01,,

This can then be used in equation 4.53 to determine the optimum voltage step at any point during the C-V sweep. The results of applying this modification to the algorithm of McGillivray et al. for a measurement on a small-area device can be seen from a comparison of figure 4.38 with a profile measured using the tradeoff algorithm in figure 4.39. With the enforced 1% noise criterion, only 11 points are measured, whereas a compromised 4.3% error in the doping allows for 23 data points to be measured. Figure 4.40 illustrates the effect of using a fixed voltage spacing with 23 points. The improvement is clear, with there being a noticeable increase in profile detail using the tradeoff technique. The algorithm for predicting the optimum number of measurement points also improves the resolution of a conventional, fixed spacing profile measurement as shown by figure 4.40.

When the measured capacitance range extends, however, over one or more automatic range-changes of the capacitance meter, it is necessary to consider each sub- range separately. By allowing the capacitance meter to auto-range instead of simply keeping it fixed, we obtain the maximum possible accuracy in the final profile. This algorithm was implemented (see Appendix A) to cope with one range-change during the C-V sweep since, in practice, two or more are unlikely to occur for small-area devices.

92.

ID C=. I p—i • I•— c=D -P 1 1 emp 4E+I

E I) i IE+l

fi

V C IE+1 n 0) £ 0 o AEfl 736 .1473 .220

r t I-i C m I n CD r- )

Figure 4.39: Better resolution in the profile obtained from same sample as in figure 4.38, but with a relaxed noise constraint

I r-, • -F I 3E+I

E 0 1E1

>

IE+l

! E1 0 .0687 .1373 .206 .2747 .3433

9 as —i ( m I = r c r Figure 4.40: Profile arising from the same sample as in figure 4.39, and with the same number of points, but with a fixed voltage spacing

4.6. Conductance-Voltage Measurements

The most accurate technique for the analysis of low interface trap density and distribution is that due to Nicollian and Goetzberger [66].

-93-

Most modern capacitance meters measure parallel conductance simultaneously with capacitance, it being the other quadrature component of the admittance. Referring back to equation 4.37 which is modelled on the equivalent circuit in figure 4.12 proposed by Nicollian and Goetzberger, it is possible to characterise the interface trap energies and distribution from an analysis of the parallel conductance. This has been shown to be far more accurate and sensitive than analyses involving high or low frequency capacitance measurements [1]. (Section 5.8)

Interface traps capture and release minority carriers when the band bending, is varied by a change in gate bias. Since these traps may exist above or below the Fermi level, they may exhibit acceptor or donor-like responses to such changes. The

charge held in these traps at any instant, Q1, will vary over the period of the applied measurement signal and give a capacitive effect, C1,, which is frequency dependent:

dQ1, Cit (4.62) = dt Since capture and emission processes for these traps take a finite time to occur at intermediate frequencies, the non-equilibrium distribution of charge results in a response that lags behind the a.c. signal. This non-equilibrium is restored by energy exchange with the silicon crystal lattice giving rise to a variation in the parallel conductance, G. Through the interface trap levels, these generation and recombination processes cause G to go through a peak in the depletion/weak inversion region due to the manner in which the trap time-constant changes with majority carrier

density at the surface. This relation has a Gaussian form [1] (pp. 212-218) as shown below:-

G /o - 1 1 d exp 1- 1 exp( —1 )ln [i + 2exp(2) j (4.63) qDA - V22 COf 2a5 S J G is thus a function only of two parameters E and a5 where a, the capture cross- section of the trap states, is a measure of the dispersion.

For the analysis of Nicollian and Goetzberger to fit the observed variation in they had to extend their model from a single trapping energy level to a continuum of randomly distributed states. These levels all obey Shockley-Reid-Hall statistics for trapping and emission, in order to account for the time-constant dispersion with frequency. The method of Nicollian and Goetzberger involves fitting their statistical

-94-

model to the measured Gp curve to extract these parameters from the measured curve and to evaluate D (E) and a. However, this approach assumes a constant figure for

ors which was later shown [96] to give different values for Di, (E) at different frequencies.

Having measured GAr, and corrected it for series resistance, the effect of the oxide capacitance must be corrected for [1]: (sections 5.77, 5.92)

wCGp2 - Cm ) 2 (4.64) CL) - G 2 A typical resultant interface trap conductance curve is shown in figure 4.41

Brews [97] has shown a simple means by which Di: (E) and C, may be accurately determined from G/w plots against çb or In (to). Previously, an analysis involved obtaining G data vs. frequency which is difficult without specialist equipment. The method due to Brews, however, is far simpler and more convenient. Although the G;/w peak analysis is greatly simplified, care must be taken to ensure the analysis is actually appropriate. It is possible in certain circumstances to measure curves which do not in any way conform to the expected shape and this will obviously cause the analysis to fail. A typical example is shown in figure 4.42 which came from a sample that had received a threshold-adjusting boron implant after the initial gate oxidation. The analysis of Brews involves peak detection and peak width/aspect ratio calculations and

i... 1 1 1 b r I Lim p W $ 121 ro _

K La 14 - a 0 High trap density aC 0 10 - 3 .5 C 0 0 B a I - Low trap density I- 2-

-7.0 -54 -3.0 -2.2 -.B 1.0

Voltftge Figure 4.41: A typical, well-defined conductance curve from a sample with a moderately high interface trap density

-95- E cj L4 1 1 1 im r- i _ m G JC3 ui 2 B B P

x

II C) C C

U

C 0 (3 a- C

I-

-1.5 -.9 -.2 .5 1.2 1.9

Voltage

Figure 4.42: Conductance curve arising from a device that had received a threshold-adjusting implant (4E11, 40 KeV Bo)

obviously this particular sample would be unsuitable for analysis. In this case, another method such as the combined high and low frequency technique may have been more applicable.

The problem remains for both capacitance and conductance methods to determine the surface-potential:gate-voltage relationship in order to provide the energy scale for a trap distribution analysis. For non-uniformly doped samples this gives rise

to errors in the placement of Di, within the gap, and for samples with a high D, (>10 cm 2eV 1 ), the band-bending calculations also breakdown. In this situation, a low frequency determination of 95, may be appropriate.

A further practical consideration for successful G-V measurements lies with the correct choice of voltage sweep range. A more accurate plot will result from concentrating the measurement data points in the region of interest, and problems in the Brews' data analysis which can result from spurious responses in strong inversion or accumulation will be avoided. Figure 4.43 illustrates the swamping effect of a high series resistance, the leakage through a poor oxide, and a sweep taken too far into inversion. The assumption of a Gaussian-shaped G/w curve is central to the analysis, and thus any deviation will result in errors in the extracted parameters. The equivalent circuit model [1] (pp. 281-282) in figure 4.44 is valid only in depletion and weak inversion, and furthermore, outside these regimes u 3 can no longer be assumed to be

-96- FR in '-i-, - .-., 6 ____

I I 0 [ii 0 L!] C Leakage

151- 4) I) 0 0 ) 3 C C 0 0 (1 II ..e

— 1 — — 1 -

I I I C 92 4 9 9 _9

6 Q C I a I C U I $1 I U a 0 a 3 £ a 0 U 0 U I I- I..

.8 1.7 2.9 3.9

Volt.96 Volt.96

Figure 4.43: Effect on conductance curves of:. a) High series resistance, b) oxide leakage (raw G), c) oxide leakage (G (w)), d) poorly optimised voltage sweep range independent of band bending. Correct choice of sweep range is thus essential. Neither this nor the effect of light implants appear to be mentioned in the literature. This is perhaps since the effect is visibly very obvious, and in conventional manually-operated systems, an analysis would not be considered. To automate the technique reliably, some check must be made on the validity of the measured G/cü curves.

4.7. Quasi-Static C-V Measurements

Since the measurement of capacitance at very low frequencies [98] (<1 Hz) is practically impossible, the quasi-static technique was developed, employing the 'slow- ramp' technique [99] to evaluate the displacement current. This method makes use of a picoammeter and voltage source, such as the 1-1P414013, and the resulting 'quasi- static' C-V curves can yield valuable additional information about the MOS capacitor characteristics.

-97 - (0)

o-cox I I

C S CS Cs 1 (b)

Figure 4.44: Assumed equivalent circuit models for the analysis of Brews [1] (p. 281-282) a) lumped equivalent, b) transmission-line model.

In particular, the QSCV method can be used to measure D1,(E), having the advantage of being able to produce a result more rapidly than the conductance technique. The method, however, can only be regarded as semi-qualitative, since there are a number of factors which limit its accuracy or applicability (e.g. non- uniformly doped substrates, where once again C must be determined independently.) The method has largely fallen from grace since the arrival of the QV technique [100, 101], since it becomes very difficult to measure an accurate curve in inversion, particularly for modern, long-lifetime devices for which even very low effective sweep rates ('

Under conditions when the low-frequency curve is measurable, the QSCV method is most useful when combined with a high-frequency measurement for interface-trap characterisation. In this analysis [98, 11 (pp. 331-333). both C1, and C5 can be determined directly from the measured capacitances:

(4.65) = (CHF :-]' and

1 __L_ 11 (4.66) CLF J - [CHF - Cox

-98- c L1 • _ Cr) 50

x 40 -

x 0 30 U

9- 20 :14

OIVZS 10 -

0 -'. —.8 .9 2.6 4.3

Vltg eq ( V )

Figure 4.45: Effect of sweeping the voltage too quickly during a quasi-static C-V measurement

This approach appears useful at first, since there is no requirement to determine C from an interpolation with a theoretical curve. However, there are a number of sources of error listed below which proponents of the conductance method are quick to point out in great detail [1] (pp. 333-350).

Difficulty in obtaining a true HF curve at flatbands (Vp) and in accumulation

(Vacc ) [ 11 (pp. 347-351). The use of the high-frequency capacitance in this analysis assumes a negligible interface trap response [1] (p. 325). However, at

1MHz, this assumption does not always hold, particularly for large D1,, when the shortening of the interface trap time constants at flatbands causes the measured

C-V curve to deviate from the true ideal. As a rule of thumb [1] (p. 341), for Cit = 0.1C11 , the error in Ci, will be —10%. This and the effect of an imprecise determination of C0 in accumulation [25] can result in limiting the accurate range of D,, (E) determination by the HF-LF method to less than half of the band-gap.

High-low frequency capacitance comparison not valid in inversion [1] (pp. 351- 353). At the point on the C-V curve at which the inversion layer capacitance C1 is comparable to or is a significant fraction of the interface trap capacitance C.,. C, contributes to an inflated C.,. This becomes most significant at the onset of inversion, 0, =

.99- 3. Round-off errors resulting from CHF —C [1] (pp. 345-347). Obviously, in subtracting these two similar quantities, round-off errors are bound to occur. This is always the case in accumulation, when CHFC& regardless of interface trap density. This effect extends further into depletion for thick oxides and/or high substrate doping. Overall, D(E) measurements are less accurate at the band edges for large Di, due to round-off errors.

Despite these apparent shortcomings, however, the QSCV technique is currently enjoying a revival in popularity [59, 56, 102, 581 and even the critics now appear to be

recognising its strengths [103]. The majority of present day samples are of low D11 , low substrate doping and have thin oxides, so most of the limitations described above are no longer applicable. Although the HF-LF technique can fundamentally never be as accurate as the conductance method, convenience and speed may be a dominant consideration.

Keithley Instruments [52] have recently produced some specialist equipment which allows for the simultaneous measurement of feedback-charge quasi-static and high frequency capacitance curves using the feedback-charge method [104]. This technique, it is claimed, eliminates errors of alignment between the two curves which arise from separate voltage sweeps.

4.8. Capacitance-Time Measurements

Using an MOS capacitor as the test structure, the minority carrier generation rate can be determined by the method of Zerbst [105]. The basic technique is simply to pulse the capacitor into (deep) depletion and then observe the recovery over a period of time from measurements of the capacitance. In this way, the generation rate which is typically in the range of 0.1s-. lOOms, can be measured on an extended timescale of Ss -. 2hrs. Using the size and shape of the resulting capacitance-time transient, the generation lifetime and the surface recombination velocity s, can be determined for the silicon substrate.

A Zerbst analysis of the C (t) relation as depicted by figure 4.46 involves parameter extraction from a Zerbst plot. This is a graph of the time derivative of the 2 d(C C-t curve, plotted against a normalised measure of 'effective silicon

-100- I., 1 27

24 - C I- 21- q. U 18 - 0

15 -

12 0 42 04 126 100 Ph

-r i rr-s es

Figure 4.46: A typical capacitance-time (C-t) plot c1 1 depth', -i--1J• From this the generation lifetime can be evaluated: (C(t

fli C0 CL 2.----A = T (4.67) Nsub cf . AX where C1 is the final equilibrium capacitance and AY/AX the slope of the Zerbst plot. A typical Zerbst plot is shown in figure 4.47. The intercept with the Y-axis of the extrapolated linear section yields an effective surface recombination velocity s.

This simplistic analysis, however, has been reviewed by a number of workers. Schroder and Nathanson [1061 were the first to observe discrepancies between 'r 8 and determined from Zerbst analyses of capacitance transients, and other experiments on gate-controlled diodes. On observing an unexpected and anomalous correllation between generation lifetime and surface recombination velocity, they proposed a simple model which accounted for apparent lifetime degradation by a peripheral generation current. The analysis of Zerbst had previously neglected lateral depletion at the edge of the gate and variation of s with time. As a result, when a significant surface component of the lifetime affects the measurement, the actual value of 'r g as determined from a Zerbst plot can be very different from the actual bulk lifetime:

101

V— -t. P= , C= ..

a .9 .6 .9 1.2 1.5

Figure 4.47: A typical Zerbst plot

)2

&ff g 4.68 DOX N S b d

where the s (t) relation can be derived from the Zerbst plot by extrapolation back to the Y-axis of every data point, C(t). The true depletion surface recombination

velocity is thus s0 = s I=o• To eliminate the effect of the peripheral component, the size of the MOS capacitor must be taken into account:

Tg = Tgff (l —do /d) (4.69)

where d = gate diameter (or side-length) and d0 = 4SO T8ff (characteristic length). In

fact, for large-area capacitors for which d >>d0 , equation 4.68 can approximate to

T T. eff S

Many of the non-idealities which affect the measurement of high-frequency C-V curves are also detrimental to C-t measurements. These include non-uniform substrates, interface traps, incident light and inversion layer spreading. Ion-implanted profiles introduce either an effective or an actual depth-dependent factor in the lifetime profile (previously assumed uniform). As a result the C-t transients will become distorted to some degree [4, 5] as shown in figure 4.48. The resulting Zerbst plots for these samples are shown in figure 4.49.

- 102- 1

X La

C

I- 39 '(1 U U 31-

-r i rn em Figure 4.48: C-t curve from a sample with a non-uniform (ion-implanted) substrate

ME, as p.—. -. 1 C= -t-

Figure 4.49: Zerbst plot from a sample with a non-unifrom (ion-implanted) substrate

The response of the interface traps can be minimised by an expedient choice of pulse bias voltages. It has been suggested [107] that by pulsing from inversion into deeper inversion or deep-depletion, that the surface-dominated response can be eliminated. Schroder and Guldberg [108], however, warn that this assumption must be made with caution; their measurements suggested that in some circumstances there was nothing to be gained from this method. C-t transients are normally dominated by the behaviour of bulk traps, but interface states may contribute an additional surface

- 103- component [108]. This effect, however, is not uniquely caused by interface traps, and McGillivray [109] has observed similar such transients which were presumed attributable to crystal defects. In these circumstances, a separate measurement of D1 is advised.

Non-uniform doping effects can be corrected by a simple modification of the Y- axis of the Zerbst plot, weighting it by N (w )/Nb. The effect of such a modification can be seen from figure 4.50.

A common criticism of the standard static capacitance-time recovery method, as used in the Zerbst analysis, is the length of time required to perform a measurement. A number of novel techniques have recently been developed which overcome a problem which has only now become intolerable because of the extremely high quality of modern-day silicon. They are the ramped capacitance transient (RCT) method [110], the double-sweep method [111], and the fast method [112]. The use of the standard Zerbst method is still preferred, however, over these other techniques for samples with shorter lifetimes.

The problem of optimising data collection for an indeterminately long (or short) C-t transient was tackled by Rabbani and Lamb [113] who developed an empirical method for estimating the final measurement time. This was improved and implemented by McGillivray et at., thereby allowing for an optimal number density of

P- -.

0 1.2 2.4 3.6 4.8 6.0

Figure 4.50: Zerbst plot corrected for the effect of a non-uniform doping profile

-104- points to be made over the measurement range [109].

Some practical considerations for the successful measurement of the minority carrier generation parameters are an appreciation of the precautions to be taken when the sample is either non-uniform or has a high interface trap density. Without this knowledge it may be difficult, if not impossible, to select a single tangent to the Zerbst plot that is representative of the true generation rate! In some instances, particularly when the actual precise value of T is not of interest, it is instructive to construct a lifetime profile from a differential Zerbst plot. This may be useful in assessing implant lattice damage and the effectiveness of annealing procedures thereon.

4,9. Conclusion

In recognition of the complexity of C-V testing, this thesis sets about to automate the measurements, to highlight their usefulness, and increase their reliability in a modern, computer-aided manufacturing environment. The remaining chapters document the progress made in the automation of C-V measurements during the course of this research. Three complementary 'artificial intelligence' approaches are examined: machine learning, pattern recognition and a rule-based expert system.

REFERENCES

E. H. Nicollian and J. R. Brews, MOS (Metal Oxide Semiconductor) Physics and Technology, John Wiley & Sons, New York. (1982).

S. M. Sze, Physics of Semiconductor Devices, John Wiley & Sons. New York. (1981). 2nd edition.

A. S. Grove, Physics and Technology of Semiconductor Devices, Wiley, New York, (1967). 2nd edition.

K. H. Zaininger and F. P. Heiman, "The C-V technique as an analytical tool," Solid-State Technology, (Part 1), pp. 49 - 56, (May 1970). K. H. Zaininger and F. P. Heiman, "The C-V technique as an analytical tool," Solid-State Technology, (Part 2), pp. 46 - 55, (June 1970).

J. L. Moll, Institute of Radio Engineers Wescon Convention Record, Vol. 3, p. 32, (1959).

W. G. Pfann and C. G. Garratt, Proceedings of the Institute of Radio Engineers, Vol. 41, p. 2011, (1959).

R. Lindner, "Semiconductor surface varactor," Bell System Technical Journal, Vol. 41, pp. 803-831, (1962).

- 105- K. Lehovec, "Rapid evaluation of C-V plots for MOS structures," Solid State Electronics, Vol. 11, pp. 135-137, (1968).

C. T Huang, "High frequency C-V plot as a process control tool," Proceedings of the Microelectronics Measurement Technology Seminar, San Jose California, pp. 71-84, Benmill Pub. Corp., (February 6-7, 1979).

L. McMillan, "MOS C-V techniques for IC process control," Solid State Technology, pp. 47-52, (September 1972).

J. R. Brews, "An improved high frequency MOS capacitance formula," Journal of Applied Physics, Vol. 45, pp. 1276-1279, (1974).

G. Baccarani and M. Seven, "On the accuracy of the theoretical high-frequency semiconductor capacitance for inverted MOS structures," IEEE Transactions on Electron Devices, Vol. ED-21, pp. 122-125, (Jan. 1974). J. R. Brews, "A simplified high frequency MOS capacitance formula," Solid State Electronics, Vol. 20, pp. 607-608, (1977). A. Berman and D. R. Kerr, "Inversion charge redistribution model of the high- frequency MOS capacitance," Solid State Electronics, Vol. 17, pp. 735-742, (1974).

L. Forbes and B. Rastegar, "A desktop-computer based calculation of high frequency MOS C-V curves," IEEE Transactions on Electron Devices, Vol. ED- 34, (2), pp. 427-431, (Feb. 1987).

C. T. Sah, R. F. Pierret, and A. B. Tole, "Exact analytical solution of high frequency lossless MOS capacitance-voltage characteristics and validity of charge analysis," Solid-State Electronics, Vol. 12, pp. 681 - 688, (1969). M. J. McNutt and C. T. Sah, "Exact capacitance of a lossless MOS capacitor," Solid State Electronics, Vol. 19, pp. 255-257, (1976). R. Jaegar, F. H. Gaensslen, and S. E. Diehl, "An efficient numerical algorithm for simulation of MOS capacitance," IEEE Transactions on Computer Aided Design, Vol. CAD-2, (2), pp. 111-116, (April 1983). D. J. Bartelink and R. E. Tremain, "Exact formulation of MOS C-V profiling," IEEE Transactions on Electron Devices, Vol. ED-27, (11), p. 1831, (1979).

A. Goetzberger, "Ideal MOS curves for silicon," Bell System Technical Journal, pp. 1097-1122, (September 1966).

A. G. Fortino and J. S. Nadan, "An efficient numerical method for the small- signal AC analysis of MOS capacitors," IEEE Transactions on Electron Devices, Vol. ED 24, (9), pp. 1137-1146, (September 1977). J. L. Gray and M. S. Lundstrom, "Numerical solution of Poisson's equation with application to C-V analysis of ffl-V heterojunction capacitors," IEEE Transactions on Computer Aided Design, Vol. CAD-4, (4), pp. 546-553, (Oct. 1985).

J. Blacksin, "A Poisson C-V profiler," IEEE Transactions on Electon Devices, Vol. ED-33, (9), pp. 1387-1389, (September 1986).

- 106- M. J. McNutt and C. T. Sah, "Determination of the MOS oxide capacitance," Journal of Applied Physics, Vol. 46, (9), pp. 3909-3913, (September 1975).

L. M. Terman, "An investigation of surface states at a silicon/silicon oxide interface employing metal-oxide-silicon diodes," Solid-State Electronics, Vol. 5, pp. 285 - 299, (1962).

C. T. Sah, A. B. Tole, and R. F. Pierret, "Error analysis of surface state density determination using the MOS capacitance method," Solid-State Electronics, Vol. 12, pp. 681 & 689 - 709, (1969).

B. Ricco, P. Olivo, T. N. Nguyen, T.-S. Kuan, and G. Ferriani, "Oxide thickness determination in thin-insulator MOS structures," IEEE Transactions on Electron Devices, Vol. ED-35, (4), pp. 432-438, (April 1988). G. Sarrabayrouse, F. Campabadal, and J. L. Prom, "Oxide thickness determination from C-V measurement in an MOS capacitor," lEE Proceedings - C, Vol. 136, (4), pp. 215-216. (August 1989).

I. G. McGillivray, J. M. Robertson, and A. J. Walton, Determination of C r in from the high frequency C-V characteristic of high lifetime MOS capacitors., (1989). (Presented, to be published)

B. E. Deal, A. S. Grove, E. H. Snow, and C. T. Sah, "Observation of impurity re-distribution during thermal oxidation of silicon using the MOS structure," Journal of the Electrochemical Society, Vol. 112, pp. 308-314, (1965). J. A. Walls, A. J. Walton, J. M. Robertson, and T. M. Crawford, "Interpretation of C-V curves for process fault diagnosis: a machine-learning expert systems approach," in Proceedings !CMTS 1988, pp. 174 - 178, Long Beach, CA., (February 1988).

S. Kar, "Determination of Si-Metal work function differences by MOS capacitance technique," Solid State Electronics, Vol. 18, pp. 169-181, (1975). A. H. Marshak and R. Shrivastava, "On the threshold and flatband voltages for MOS structures with polysilicon gate and non-uniformly doped substrate," Solid State Electronics, Vol. 26, (4), pp. 361-364, (1983). I. G. McGillivray, The Measurement of Electrical Parameters and Trace Impurity Effects in MOS Capacitors, (1987). PhD. Thesis, University of Edinburgh

J. R. Brews, "Correcting interface-state errors in MOS doping profile determinations," Journal of Applied Physics, Vol. 44, (7), pp. 3228-3231, (1973). Shi-Tron Lin and J. Reuter, "The complete doping profile using MOS C-V technique," Solid State Electronics, Vol. 26, (4), pp. 343-351, (1983). B. J. Gordon, "On-line capacitance-Voltage doping profile measurement of low- dose ion implants," IEEE Transactions on Electron Devices, Vol. ED-27, (12), pp. 2268-2272, (December 1980).

K. Ziegler, E. Klausmann, and S. Kar, "Determination of the semiconductor doping profile right up to the surface using the MIS capacitor," Solid State Electronics, Vol. 18, pp. 189 - 193, (1975).

- 107 - B. E. Deal, M. Sklar, A. S. Grove, and E. H. Snow, "Characteristics of the surface-state charge (Q) of thermally oxidised silicon," Journal of the Electrochemical Society: Solid State Science, Vol. 114, (3), pp. 266-273, (March 1967).

M. H. Woods and R. Williams, "Hole traps in silicon dioxide," Journal of Applied Physics, Vol. 47, (3), pp. 1082 - 1089, (March 1976).

C. N. Berglund and R. J. Powell, "Photoinjection in Si02 : electron scattering in the image force potential well," Journal of Applied Physics, Vol. 42, (2), pp. 573-579, (Feb. 1971).

B. E. Deal, "Properties and measurements of charges associated with the thermally oxidised silicon system," Proceedings of the Microelectronics Measurement Technology Seminar, pp. 39-70, San Jose, (February 6-7, 1979).

E. Yon, W. H. Ko, and A. B. Kuper, "Sodium distribution in thermal oxide on silicon by radiochemical and MOS analysis," IEEE Transactions on Electron Devices, Vol. ED-13, (12), pp. 276 - 280, (February 1966).

L. Manchanda, J. Vasi, and A. B. Baattacharyya, "The nature of intrinsic hole traps in thermal silicon dioxide," Journal of Applied Physics, Vol. 52, (7), pp. 4690-4696, (July 1981).

D. J. DiMaria, D. R. Young, R. F. DeKeersmaecker, W. R. Hunter, and C. M. Serrano, "Centroid location of implanted ions in the Si0 2 layer of MOS structures using the photo I-V technique," Journal of Applied Physics, Vol. 49, (11), pp. 5441-5444, (Nov. 1978).

K. Saminadayar, "Evolution of surface-state density of Si! wet thermal Si0 2 interface during bias-temperature treatment," Solid-State Electronics, Vol. 20, (11), pp. 891 - 897, (November 1977).

E. Arnold, J. Ladell, and G. Abowitz, "Crystallographic symmetry of surface state density in thermally oxidised silicon," Applied Physics Letters, Vol. 13, pp. 413-416, (1968).

F. Montillo and P. Balk, "High temperature annealing of oxidised silicon surfaces," Journal of the Electrochemical Society, Vol. 118, (9), pp. 1463-1468, (1971).

Advanced Computerised Semiconductor Measurement System (CSM), Materials Development Corporation (product note), USA, (1989).

Advanced C-V Plotting Systems, The Micromanipulator Company (product note), Carsen City, Nevada, (1987).

High Frequency, Quasistazic and Simultaneous CV Semiconductor Measurements, Keithley Instruments ('Package 82' product note), Cleveland, Ohio, (1987).

HP 4061A Semiconductor/Component Test System, Hewlett Packard Co., (May 1983). (Technical data)

E. H. Snow, A. S. Grove, B. E. Deal, and C. T. Sah, "Ion transport phenomena in insulating films," Journal of Applied Physics, Vol. 36, pp. 1664 - 1673, (1965).

108 R. A. Gdula, "The effects of processing on the hot-electron trapping in Si0 2 ," Journal of the Electrochemical Society, Vol. 123, (1), pp. 42-46, (Jan. 1976).

R. B. Calligaro, "Simple method to determine the Si/Si0 2 fast interface trap density change after high field stress on silicon-on-insulator substrates," lEE Proceedings :1, Vol. 134, (5), pp. 156-158, (October 1987).

P. J. Rosser and P. B. Moynagh, "Comparison of the interfacial stress resistance in rapid thermally processed thin dielectrics.," ESSDERC 87, pp. 833-836, (1987).

H. Wendt, A. Spitzer, W. Bensch, and K. V. Sichart, "The effect of rapid thermal annealing on the quality thin thermal oxides," ESSDERC 1989, pp. 345- 348, Berlin, (1989).

T. Hori, H. Iwasaki, and K. Tsuji, "Electrical and physical properties of ultrathin reoxidised nitrided oxides prepared by rapid thermal processing," IEEE Transactions on Electron Devices, Vol. ED-36, (2), pp. 340-350, (February 1989). P. E. Bagnoli and A. Nannini, "Effects of interfacial states on the capacitance- voltage characteristics of P-Si0 2-Si schottky diodes," Solid State Electronics, Vol. 30, (10), pp. 1005-1012, (1987).

R. J. Powell and G. F. Derbenwick, 'Vacuum ultraviolet radiation effects in Si0 2 ," IEEE Transactions on Nuclear Science, Vol. NS-18, (6), p. 99, (1971). J. A. Serack, Edge Effects in Silicon IGFETS, (1988). Ph.D. Thesis, Edinburgh Microfabrication Facility, Dept. Electrical Engineering, University of Edinburgh

T. H. Ning, C. M. Osburn, and H. N. Yu, "Threshold instability in IGFETs due to emission of leakage electrons from silicon substrate into silicon dioxide," Applied Physics Letters, Vol. 29, (3), pp. 198-200, (August 1976).

T. H. Ning, "Hot electron emission from silicon into silicon dioxide," Solid State Electronics, Vol. 21, pp. 273-282, (1978).

E. H. Nicollian and C. N. Berglund, "Avalanche injection of electrons into insulating SiO2 using MOS structures," Journal of Applied Physics, Vol. 41, p. 3052, (June 1970).

E. H. Nicollian and A. Goetzberger, "MOS conductance technique for measuring surface state parameters," Applied Physics Letters, Vol. 7, (8), pp. 216-219, (15 October 1965).

D. A. Baglee, "Characteristics and reliability of 100 A oxides," Proceedings 22nd IEEE Reliability Physics Symposium, p. 152, (1984).

C. C. Chang and W. C. Johnson, "Distribution of flatband voltages in laterally non-uniform MIS capacitors and application to a test for non-uniformities," IEEE Transactions on Electron Devices, Vol. ED-25, (12), pp. 1368-1378, (Dec. 1978). J. R. Brews and A. D. Lopez, "A test for lateral non uniformities in MOS devices using only capacitance curves," Solid State Electronics, Vol. 16, pp. 1267-1277, (Nov. 1973).

_109- M. Jourdain, G. Solace, C. Petit, M. Favre, and J. Despujols, "Effect of high field stress on interface states of NMOS capacitors," Solid State Electronics, Vol. 26, (4), pp. 251-257, (1983).

I. G. McGillivray, J. M. Robertson, and A. J. Walton, "CV characteristics of MOS capacitors with two top capacitive contacts," Microelectronics Journal, Vol. 18, (5), pp. 25-27.

A. Lederman, "Vacuum operated mercury probe for C-V plotting and profiling," Solid State Technology, pp. 123-126, (August 1981).

D. L. Rehrig and C. W. Pearce, "Production mercury probe capacitance-voltage testing," Semiconductor International, pp. 151-162, (May 1980).

J. Grosvalet and C. Jund, "Influence of illumination on MIS capacitance in the strong inversion region," IEEE Transactions on Electron Devices, Vol. ED-14, pp. 777-780, (1967).

A. S. Grove, B. E. Deal, E. H. Snow, and C. T. Sah, "Investigation of thermally oxidised silicon surfaces using metal-oxide-semiconductor structures," Solid State Electronics, Vol. 8, pp. 145-163, (1965). T. Ohzone, H. Shimura, K. Tsuji, and T. Hirao, "Silicon-gate n-well CMOS process by full ion-implantation technology," IEEE Transactions on Electron Devices, Vol. ED-27, pp. 1789-1795, (1980).

J. A. Walls, A. J. Walton, J. M. Robertson, and T. M. Crawford, "Automating and sequencing C-V measurements for process fault diagnosis using a pattern recognition technique," in Proceedings ICMTS 1989, Vol. 2, pp. 193 - 199, Edinburgh U.K., (March 1989).

R. C. Y. Fang, "Threshold voltage shift of the p-channel transistors by boron implantation and the C-V characteristics of the corresponding MOS structures," Solid State Electronics, Vol. 26, (1), pp. 25-32, (1983).

Qui-Yi Tong and M. K. Andrews, "PMOS design for VLSI CMOS by a C-V technique," 5th Australian & Pacific Region Microelectronics Conference, Adelaide, (13 - 15 May 1986).

G. S. Virdi, S. Singh, B. C. Pathak, and W. S. Khokle, "Adjustment of threshold voltage of MOS devices by ion-implantation," Microelectronics Journal, Vol. 19, (1), pp. 19 - 33.

81. E. H. Nicollian and A. Goetzberger, "Lateral ac current flow model for metal- insulator-semiconductor capacitors," IEEE Transactions on Electron Devices, Vol. ED-12, pp. 108-117, (1965).

W. van Gelder and E. H. Nicollian, "Silicon impurity distribution as revealed by pulsed MOS C-V measurements," Journal of the Electrochemical Society: Solid State Science, Vol. 118, (1), pp. 138- 141, (January 1971).

C. P. Wu, E. C. Douglas, and C. W. Mueller, "Limitations of the CV technique for ion-implanted profiles," IEEE Transactions on Electron Devices, Vol. ED-22, (6), pp. 319-328, (June 1975).

- 110- E. H. Nicollian and A. Goetzberger, "The S' -S'02 interface- electrical properties as determined by the metal-insulator-silicon conductance technique," Bell System Technical Journal, Vol. 46, (1), pp. 1055-1133, (1967).

Polaron Etch-Back CV System, Biorad Microscience Division, Semiconductor Business Unit, Greenhill Cres., Watford, UK.

D. P. Kennedy and R. R. O'Brien, "On the measurement of impurity atoms by the differential capacitance technique," IBM Journal of Research, Vol. 13, pp. 212-214, (1969).

W. C. Johnson and P. T. Panousis, IEEE Transactions on Electron Devices, Vol. ED-18, pp. 965-973, (1971).

R. A. Moline, "Ion-implant phosphorous in silicon: profiles using C-V analysis," Journal of Applied Physics, Vol. 42, (9), pp. 3553-3558, (Aug. 1971).

C. L. Wilson, "Correction of differential capacitance profiles for debye-length effects," IEEE Transactions on Electron Devices, Vol. ED-27, (12), pp. 2262 - 2267, (December 1980).

D. J. Bartelink, "Limits of applicability of the depletion approximation and its recent augmentation," Applied Physics Letters, Vol. 38, pp. 461-463, (1981).

D. J. Bartelink, "C-V Profiling," in Proceedings of the Microelectronics Measurement Technology Seminar, pp. 140-156, San Jose, (Feb. 6-7, 1979).

D. M. Brown, R. J. Connery, and P. V. Gray, "Doping profiles by MOSFET deep depletion C(V)," Journal of the Electrochemical Society, Vol. 122, pp. 121- 127, (1975).

J. Hilibrand and R. D. Gold, "Determination of impurity distribution in junction diodes from capacitance-voltage measurements," RCA Review, Vol. 21, (245), (1960).

G. L. Miller, "A feedback method of investigating carrier distributions in semiconductors," IEEE Transactions on Electron Devices, Vol. ED-19, (10), pp. 1103-1108, (Oct. 1972).

I. G. McGillivray, J. M. Robertson, and A. J. Walton, "Improved measurement of doping profiles in silicon using C-V techniques," IEEE Transactions on Electron Devices, Vol. ED-35, (2), pp. 174-179, (February 1988).

E. H. Nicollian, A. Goetzberger, and A. D. Lopez, "Expedient method of obtaining interface state properties from MIS conductance measurements," Solid State Electronics, Vol. 12, pp. 937-944, (1969).

J. R. Brews, "Rapid interface parameterization using a single MOS conductance curve," Solid State Electronics, Vol. 26, (8), pp. 711-716, (1983).

R. Catagne and A. Vapaille, "Description of the Si0 2-Si interface properties by means of very low frequency MOS capacitance measurements," Surface Science, Vol. 28, p. 557, (1971).

M. Kuhn, "A quasi-static technique for MOS CV and surface states measurements," Solid-State Electronics, Vol. 13, p. 873, (1970).

BUM K. Ziegler and E. Klausman, "Static technique for precise measurement of surface potential & interface state density in MOS structures," Applied Physics Letters, Vol. 26, (7), pp. 400 - 402, (April 1975).

D. Sands, K. M. Brunson, and C. B. Thomas, "A modified charge-voltage technique for interface-state measurements in metal-oxide-semiconductor capacitors," Semiconductor Science Technology, Vol. 3, pp. 477-482, (1988). P. K. Roy and A. K. Sinha, "Sythesis of high-quality ultra-thin gate oxides for ULSI applications," AT&T Technical Journal, pp. 155-174, (Nov/Dec 1988).

E. H. Nicollian, "Advances in instrumentation for C-V measurements," Solid State Technology, pp. 107-110, (August 1989).

T. J. Mego, "Improved feedback charge method for quasi-Static C-V measurements in semiconductors," Review of Scientific Instruments, Vol. 57, (11), pp. 2798-2805, (November 1986).

M. Zerbst, "Relaxationseffekte An Halbleiter-Isolator-Grenzflache," Z. Angew Phys., Vol. 22, (1), pp. 30 - 33, (1966). (in German)

D. K. Schroder and H. C. Nathanson, "On the separation of bulk and surface components of lifetime using the pulsed MOS capacitor," Solid-State Electronics, Vol. 13, pp. 577 - 582, (1970).

S. R. Hofstein, "Minority carrier lifetime determination from inversion layer transient response," IEEE Transactions dn Electron Devices, Vol. ED-14, pp. 785-786, (1968). (special issue on MIS structures)

D. K. Schroder and J. Guldberg, "Interpretation of surface and bulk effects using the pulsed MIS capacitor," Solid-State Electronics, pp. 1285 - 1297, (1971).

I. G. McGillivray, J. M. Robertson, and A. J. Walton, "Automatic measurement of minority carrier lifetimes in silicon for nonuniform doping samples using the Zerbst technique," IEEE Proceedings - I: Solid State and Electron Devices, Vol. 134, pp. 161-167, (December 1987).

P. Tiwari, B. Bhaumik, and J. Vasi, "Rapid measurement of lifetime using a ramped MOS capacitor transient," Solid-State Electronics, Vol. 26, (7), pp. 695 - 698, (1983).

Shi-Tron Lin, "Simultaneous determination of minority-carrier lifetime and deep-doping profile using a double-sweep MOS C(V) technique," IEEE Transactions Electron Devices, Vol. ED-30, (1), pp. 60-63. (January 1983). W. W. Keller, "The rapid measurement of generation lifetime in MOS capacitors with long relaxation times," IEEE Transactions on Electron Devices, Vol. ED-34, (5), pp. 1141-1146, (May 1987).

K. S. Rabbani and D. R. Lamb, "A quick method for the determination of bulk generation lifetime in semiconductors from pulsed MOS measurements," Solid State Electronics, Vol. 24, pp. 661-664, (1981).

112 . Chapter V Connectionist Learning Systems

5.1. Introduction

The shape of a high-frequency C-V curve reveals many process defects and or characteristics of the MOS capacitor sample. A great deal of qualitative information is thus conveyed in the observed shape to a trained operator. Unfortunately, this makes it difficult for the non-expert to take full advantage of the available information since visual interpretation is an acquired and subjective skill. Further confusion arises when combinations of the effects discussed in Chapter 4 are present simultaneously. It would be desirable, therefore, if the 'shape' of the C-V curve could be related to specific process effects to enable this subjective information to be used objectively by machine.

In this chapter, 'learning machines' are examined as a means of estimating process and device parameters directly from the curve by relating cause to effect. Systems that learn are useful as qualitative diagnostic tools, particularly when a large amount of skill built up from experience is required to reason about the problem. The attractions to using machine learning methods over rule-based expert systems are when the relations between patterns of symptoms and device characteristics are either uneconomic to obtain or are too complex to describe by simple heuristic rules.

The use of machine learning to analyse C-V curves appears to offer several potential advantages over manual inspection by:

Obviating the need for an operator with an in-depth knowledge of the underlying physics,

Relaxing the requirement for a detailed model of the cause-to-effect relations and derived parameters,

Easing re-programming as new faults become evident by allowing re-training on new examples without the need to extensively modify a rule base or decision tree.

-113- Some typical uses for machine learning include:

• equipment fault diagnosis

• fingerprint identification

• voice recognition

• machine vision

These tasks are characterised by a high degree of complexity requiring 'learnt' skills, and are difficult to model mathematically.

5.2. Parametric Learning

The parametric or 'black-box' approach as machine learning involves the use of an input:output device which is repeatedly trained to associate specific instances of the input pattern to processing or device parameters, by forming analogies within its internal structure. This black-box method, depicted by figure 5.1, was popularised in the late 1950's by a number of workers inspired by the discovery of the structure of the brain. A variety of techniques and algorithms to simulate the brain's functions were implemented, particularly in the fields of learning and visual recognition [1,2,31. This area of research is currently enjoying a resurgance with the advent of synthetic neural networks [4, 5].

The relationship or 'transfer function' established within a black-box system is formed through a programme of training on known sets of inputs and outputs. These internal relations are often established with little regard to their representation inside the black-box, and so long as the resulting system works, then it may be considered sufficient. This process is analogous to the behavioural scientists' understanding of conditioned animal learning. These acquired skills or reflexes can be easily learnt,

Input BLACK output BOX pattern signal(s)

Figure 5.1: The black-box method for learning

-114- whilst the means of formation of such associations is still largely unknown. A good review of 'self-organising' systems and comparison with their biological equivalents may be found in reference [6].

5.3. Mechanisms of Learning

There are two distinct means of parametric learning [7], which are distinguished by the form in which the knowledge about the input pattern is held, namely parameter adjustment and signature tables.

5.3.1. Parameter Adjustment

The parameter adjustment method is the easiest to implement, provided that the pattern representation has been chosen sensibly. Learning is achieved through the adjustment of coefficients of a discriminator or filter to a predefined criterion. That is to say, the training process is continued until there is no longer a proportionate increase in performance.

One of the earliest examples of a parameter adjustment learning machine was the 'Perceptron' pioneered in 1958 by Rosenblatt [2]. This model is represented in the schematic diagram of figure 5.2. The Perceptron operates on a vector of inputs to produce an output which is some linear combination of the inputs. The output may be optimally thresholded to yield a binary result. The input to the Perceptron, in

Weights Summer (X~ ) W Thresholder

True - or f false

Wn

Figure 5.2: The Perceptron learning mechanism

- 11S - common with all the parametric learning algorithms, is composed of continuous and/or non-linear values which represent the pattern, sequence, or image to be learnt and modelled. These numbers we refer to as 'salient features' or Just 'features', and must be derived in some way from the input pattern. Examples of salient features to describe the typical tasks mentioned earlier might be:

• Symptoms of fault: temperature, pressure, voltage and current.

• Number of lines, spacing between lines, radius of curvature.

• Duration of utterance, harmonics of frequency spectrum, sex of the speaker.

• Histogram of grey levels, density of pixels, position of brightness maximum.

A block diagram of a generic parametric learning machine (PLM) is given in figure 5.3. Before such a system may be used, it must first be trained on representative examples. By exposing the PLM to various selected salient features and their correct interpretation and then applying feedback, the PLM can be taught to recognise relationships between the input and output. This is depicted in figure 5.4 for which the feedback mechanism is a human expert in a supervisory role and a comparator to drive the adjustment of the weighting coefficients. This is generally known as a proportional-increment training strategy. The resultant trained network is capable of imitating an arbitrarily complex function, over a limited range, by a linear approximation.

It may appear that the Perceptron is suitable only for linearly separable problems. For a single layer device, this is indeed the case, but by cascading units into a multilayer Perceptron structure, arbitrary 'decision regions' can be defined [8]. The biggest obstacle to designing a PLM lies with choosing suitable salient features. As the experimental work here will show, the choice of these features may strongly influence the performance of the learning algorithm.

Data I I Pre- 1 1 Feature 1 1 Black acquisition processing extraction box Jp,t J7~~ Figure 5.3: A generic parametric learning machine (PLM) Input pattern

)ut

Figure 5.4: Schematic diagram of the proportional-increment training strategy

Many other mechanisms for parametric learning have been proposed and developed, but none have such universal appeal and utility as the Perceptron. A system named 'Pandemonium' developed by Selfridge [9] is similar in form to a four- layer perceptron. Each layer consists of a set of demons - computational elementary units which respond to various learned aspects of the input. The principal difference from the Perceptron is that Pandemonium could generate new demons/nodes as and when required. More recently, and in line with current thinking on neural networks, a PLM based on a vast, randomly interconnected mesh of processing units has been developed, known as the Boltzmann Machine [10] and illustrated in figure 5.5. Learning is achieved through 'simulated annealing' which adjusts the interconnection weights between nodes via the following procedure:

-117- Figure 5.5: Schematic diagram of the Boltzmann Machine

la. Apply the input and output patterns simultaneously to their respective ports.

lb. Leave the network to settle.

ic. Slightly increase the interconnection strengths between adjacent active nodes.

Disconnect the output, leaving input clamped.

Allow network to re-equilibrate.

Decrease slightly the weights between adjacent active nodes.

This process is continued until the desired input to output response is achieved.

In contrast with the Perceptron, the Boltzmann network is intentionally random, this being considered a closer approximation to the brain, and can be trained to learn practically any function. Despite its versatility, the Boltzmann Machine, in common with many 'neural' models, is prohibitively slow on serial hardware and thus very slow to learn.

5.3.2. Signature Tables

Another approach to parametric learning is the method of signature tables. This differs from the aforementioned parameter adjustment techniques in that a binary set of outputs from a network of processors is used to address a table, a database or

-118- computer memory. Similar input patterns thus index adjacent entries in a look-up table. Of historic interest are the character recognition system of Bledsoe and Browning [11] and Samuels draught-playing program [12]

Two noteworthy later achievements were the 'Boxes' system due to Michie and Chambers [13] and the gargantuan WISARD of Aleksander et. al [14]. The boxes system was successful at learning to balance a vertical pole mounted on a moveable cart after about 60 hours of training. An algorithmic solution involving the equations of motion would necessitate vastly more powerful computing machinery to perform similarly in real time. WISARD is a very much more extended version of Bledsoe and Brownings approach, utilising over one megabyte of content-addressable RAM with feedback, parallel addressing and randomly generated features. This unique configuration can achieve unequalled performance in complex vision tasks, notably the real-time recognition of human faces even when they are disguised.

5.4. Linear Adaptive Combiner

The particular learning algorithm used here for parameter extraction from C-V curves, was borrowed from the domain of adaptive signal processing. This was prompted by the local availability of software and expertise. The linear adaptive combiner had also already been shown to be applicable to a variety of other interpretive tasks [15, 16], and is similar in form to the single-layer Perceptron [17]. As shown in figure 5.6, the structure comprises an array of inputs, weighted by appropriate coefficients, combined (summed) to produce a continuous-valued output. The addition of an explicit feedback path allows for an error correction process, to be driven by some adaptive algorithm.

The combiner has no need for any a priori knowledge about the statistics of the input pattern, x(n) which is weighted (filtered) to produce an output, 9(n). The combiner compares this estimate with the training signal, y (n) and then by using estimates of the statistics of the input pattern {i(n ) ,C, } yields an error signal e (n). This error term, or 'cost function' is used to update the weights, usually in an iterative fashion to allow a progressive improvement in the output response, 9(n).

The multiple adaptive linear combiner for the simultaneous detection of multiple faults employs an array of such structures as shown in figure 5.7. The adaptive

-119-

Desired value (y(n))

w2

0000~ W Output 3 • Features (input pattern) {x(n)} ,_Thl_,wn Error tenn (e(n))

J Adaptive I I algorithm I

Figure 5.6: The structure of a linear adaptive combiner

... W1 •X0 [w01 w11 ]

xTW W. d2 [w2 W12

X [w w11 ... w1 ] d ij outputs ------Yadjustments J R- (inverse autocorrelation matrix)

Figure 5.7: The structure of a multiple linear adaptive combiner

algorithm is thus based on the statistics of the individual inputs, and the possibly complex interactions between each desired output.

• 120

Typical adaptive algorithms include the recursive least squares (RLS), stochastic gradient search (least mean squares, LMS) and self-orthogonalising algorithms. A description of these methods can be found in common texts [18,17] and an excellent review and comparison of their relative merits is given in reference [19]. The algorithm chosen here was the recursive least-squares (RLS) for its superior convergence properties. This was considered an important quality for the use of the combiners in the C-V system, since the number of different samples was limited, and, therefore, relatively small in number with respect to the number of data sets normally associated with the more conventional applications of adaptive systems in signal processing.

5.4.1. Recursive Least-Squares Algorithm

Starting with the basic premise of minimising the error term e (n) defined as:

e(n)=y(n)-9(n) (5.1) This can be achieved by optimising the set of weighting coefficients, 11(n). The function of the error conventionally used is the mean-square value:

2(n ) E{(y(n)-9(n)) 2 } ( 5.2)

where E{ - } is the expectation operator. Now, 9(n), the experimental output, is given by:

9(n) x T (n)W(n) (5.3) To obtain the optimum set of weights, (n), it is necessary to optimise the mean-square error with respect to the coefficients. Substituting 5.3 into 5.2 and expanding we get:

E(e 2(n )} = E{y 2(n)}+W T (n ).E{x(n)x T (n)}.W(n)-2E{y(n)x T (n)}W(n5.4) Now let

OXV = E{x(n)x T (n)} (auto-correlation matrix) (5.5)

= E {y (n )x (n) } ( cross-correlation matrix) (5.6) giving:

E{e2(n)} = E{y2(n)}+WT(n)çW(n)_2çjW(n) (5.7)

-121-

To minimise this expression we differentiate with respect to W (n) and set equal to zero:

0W(fl){(hz)} = 2coW(n)-2cb = 0 (5.8)

-.W 0 (n) = ck (5.9) = 1 -.W0p (n) XC cb (5.10) This is known as the Wiener solution [20]. ç and ç5 however, are not instantaneously available in an adaptive system, since they are evaluated from the expected input values. In a changing and learning environment, therefore, they must be estimated from the current input data and the history of data prior to the present set. These estimates can be defined as:

k r,(k)= x(n)xT(n) (5.11) ,t =0 and

k rr,(k)= x(n)y(n) (5.12) ,i=0 so the Wiener solution can be approximated to:

IT L.\ - op: ( J- r, Notice the change from the index (a) to (k) to indicate the approximation due to finite sets of data. W0 , above is now an optimum weight vector only at time k, thus a recursive expression becomes more practical for iterative training of the PLM. From 5.11 and 5.12 we get:

r. (k) =r(k-1)+x(k)x T (k) (5.14)

r(k) = r,,,(k-1)+x(k)y(k) (5.15) and thus equation 5.13 becomes:

W01 (k) = W(k-1)+r,(k)x(k)e(k) (5.16) since initially e (k) = y (k).

At this point it appears that the inverse of the r matrix requires to be calculated with each iteration. This is undesirable both from a speed and accuracy viewpoint, since RLS is sensitive to rounding errors even when using full 32-bit floating-point arithmetic [19]. Through the use of a common matrix identity, the 'Sherman- Morrison' relation, a recursive expression for r,;' may be specified:

-122- r 1 (k —1 )x (k)x T (k)r 1(k —1) (5.17) 1+xT(k)r,(k —1)x(k)

which is guaranteed to converge to a stable solution within ZN iterations, where N is the number of features in the input pattern. As discussed in reference [19], however, the algorithm of equation 5.17 is unable to track a wandering input, since the input is weighted equally at each iteration. This results in a poor 'learning' performance for non-stationary inputs, since the oldest inputs have equal precedence over the current set. Whilst this was not anticipated to be a problem with the matching of C-V curves, it is likely that such intransigence will affect the initial adaptation rate.

This problem is readily rectified through the use of an exponential time window designed to favour more recent input patterns as given by equation 5.18. The precise origin of this modification is peripheral to our discussion and may be found elsewhere [21].

r 1 (k—i )x (k)x' (k)r 1 (k-i) ii r1(k) = F{rx1k_1_ (5.18) I X+xT(k)r1(k-1)x(k) iJ where 0 < A < 1, and best set in the range 0.9 < A < 1. This algorithm is not the most computationally efficient, but its speed is more than adequate for a trainable C-V system.

5.4.2. Performance Evaluation

Despite their ease of implementation, adaptive combiners suffer from the problem inherent to all 'black-box' parametric-learning systems, that the knowledge acquired or skills learnt are impervious to interrogation. This 'opacity of knowledge' may become a limiting factor in their application, particularly for neural networks whose complex internal representations are quite meaningless outside the net. Some means of assessing performance is essential in evaluating different feature sets, since there is little available information upon which to base a decision other than the accuracy of the system in use. Additionally there are very few guidelines for feature extraction in the literature. Factors to be taken into consideration when choosing suitable features have been found to include:

-123- The degree of correlation between the individual features and the output. Features which do not correlate with the output to any great extent may be redundant and surplus to requirements. Conversely, poorly correlating features may affect the stability and convergence of the learning algorithm.

The correlation between features. Strong interaction between input features is undesirable since it at once indicates a degree of redundancy of information at the input, and will also result in spurious associations being established within the combiners. Features which are merely linear combinations of others may result in ill-conditioning of the autocorrelation matrix, thus this should be guarded against.

Non-linearity of the strongly correlated features with the output function. In this situation, additional features should be added to model the non-linearity.

The chosen features should vary a significant amount with changes in the input pattern. This is to avoid numerical rounding errors and, if necessary, features can be amplified. To aid stability, therefore, all features should be of approximately the same order of magnitude.

By evaluating the learning algorithm, it is possible to make comparisons with similar systems by suitable measures of performance. These include:

The accuracy during subsequent use after training

Monitoring the feedback error, e (k)

Observing the evolving weight coefficients

Selecting the largest entries in the auto-correlation matrix

Calculating some other performance index [19,22]

The proficiency of the adaptive algorithm is dependent upon the skill of an expert in deciding upon:

Which raw data to collect

How to process that data

The choice of appropriate salient features and

The division of the problem space

- 124 - This may require considerable experience of the measurement itself and a good qualitative understanding of the underlying physics.

5.5. Application to C-V Measurements

To assess the applicability of the adaptive learning algorithm to parameter extraction from C-V curves, we begin with the simplest case; variations in the curve shape due to changes in substrate doping and oxide thickness. Whilst accurate analytical expressions already exist for these parameters, the evaluation of these relations using the learning system provide it with a taxing exercise.

The relationship between measured capacitances and the two parameters D.X and Nb are very different in form:

EO €T A ox - (5.19) L#OX

N 14 , 4k T ___ (5.20) ln(N5 ,, /n1 )+ An [21n(N,, /n1)—1 ] = €5 q 2 (E 1 - 1

Equation 5.19 is a linear relationship whereas equation 5.20 for N ub is transcendental and provides a real challenge to the learning system. The set of input and output features for the learning system investigated are shown in table 5.1 below. For each set of inputs and outputs the evaluation was made by training upon a random set of simulated C-V curves for a range of oxide thicknesses (200-900 A) and substrate dopings (1013 - 1016 cm _3)• These are illustrated in figure 5.8

Four different feature sets were tried before an acceptable combination was found. The process of experimentation is given to trial and error, and in these tests the feedback error and the normalised weights are observed during the training. After training was completed on the 15 samples, the system accuracy was determined by testing upon each of the examples that the system was trained. The accuracy of a test was decided by the maximum error encountered, no averaging was employed. Although the procedure of testing upon the training set can result in a biased estimate, the ability of the system to recognise its own training set is a suitable benchmark for comparing feature sets.

For the first example shown, the estimation of DOX is very good because of the strong correlation between 11C0 and D0 as illustrated by the strong preference of

-125- Example 1 Examole 2 Input Output Input Output 1 log 1— log — D0 D0 Cox Cox min log Cox Cox Nrn,, log Nb C ny fly

Example 3 Example 4 Input Output Input Output 1 log 1— log — D0 D0 Cox Cox 'j, '- mi C min log log C Cox ox I log log N3 ,, Cox Of s!ope y 2 '-C mm

Table 5.1: Four different input-output configurations

ES I NI L) L F _r Li FR \/ E 20r I Ii

X .

I.' LL 'I

C) C

4.) -12 C) C 0 C U

-2 -1.2 -.4 i.

B1 CV) Figure 5.8: Simulated C-V curves for training the PLM feature 3 in figure 5.9 a. Although the weights for the N,.b combiner shown in figure 5.9 b appear to stabilise, they are only small values, thus the PLM is not establishing linear relationships. This is illustrated by the observed instability in the feedback error during training depicted in figure 5.9 c. This failure is also reflected in the very large

-126- >< rr ic= I ri emp r .338 Feature 3 I.'

3 .2624

1.

.1868

.1112 .1:

0356 U Features 1 & 2 x _ • 94L 0

-1rIr

Figure 5.9 a: Evolving weights for the D0 combiner

___ ___ rn I= I r-, aD r 1795.84

1253.872

U Feature 1

721.904 £ 0)

- 199.936 U Feature 3 1—. Feature 2 -342.032 \N

-974 0 3 6 9 12 15

-r r- am. I ni r-, G3 eD >< 1 eD Figure 5.9 b: Evolving weights for the N3,, combiner estimation errors in the subsequent testing mode (-400 %).

Figures 5.10 a, 5.10 b, and 5.10 c illustrate the second example where the PLM was trained on the same features, but taught to associate these with D0 and the log of the substrate doping. These features were chosen to make the relationships between the input and output parameters more linear [23]. These results are far more interesting. The features have stabilised nicely with the feedback error visibly decreasing as learning progresses. As a result the estimation errors have been reduced

- 127.

F cI I . I-c. CEP - - 23

18

L 13 0 I Nsub error 1 8

I 1 3 \ V H \ fox" et/ror _-

"vi -2 I i I I 0 3 S 9 12 15

1rir

Figure 5.9 C: Feedback errors for both combiners

>< rn • I r, em r 3379 Feature 3

.2663

I

'-I 1948

11 -c .1232 (I)

.0516 x Features 1 8c 2

0

irIr Figure 5.10 a: Evolving weights for the D0 combiner

considerably to 120 %. The following examples have a fourth feature in the input data vector and the effect this has upon the accuracy is now discussed.

For the third example, shown in figures 5.11 a, 5.11 b, and 5.11 c, 1/C j was chosen as the extra feature on the basis of an empirical result [24] relating it to substrate doping. There is only a small improvement in estimation accuracy to 55 %, and from looking at the weights for N,.b it can be observed that the fourth feature has been suppressed and contributes very little to the output. The fluctuation in the

-128 -

L.4 rr •1 r.- as I— 20.165

Feature 1

15.332

10.499 £ II) 5.666

- Feature 3

.833

- \ \ Feature 2

-

--- -4 I I 3 6 9 12 15

rr - I ri I ri Q IM >c M M P= 1 am

Figure 5.10 b: Evolving weights for the N3,, combiner

F I . . r I

L 1= 0 I I l\ /\Nsub error Ed b i\i \ 3 L ~Dox error

0 3 6 9 12 15

1r . I ri I rn a eq >< MmcD 1

Figure 5.10 C: Feedback errors for both combiners

feedback error has slightly increased too.

For the last example, illustrated by the graphs in figures 5.12 a, 5.12 b, and 5.12 C, 11C was replaced by the maximum gradient of a tangent to the curve in the transition region. This is known to relate to substrate doping [24]. The results show a marked improvement in the estimation error for substrate doping, down to within 5 %. The weight for the fourth feature has contributed significantly to the calculation of

N3 b, and has been correctly suppressed for D0 estimation. The feedback error has

- 129 -

ID C=1 >< i= c= ry I ri .338 Feature 3 1'.

.2504 C I V 1628

.0752 £ \ Features 1,2 & 4

-.0124 x

OW 0 3 6 9 12 15

i r-i i rn G3 eq >

Figure 5.11 a: Evolving weights for the D0 combiner

___ ___ rr I r r- 24.6281

Feature 1 18.3025

II 11.9769 \

I! 5.6512 U x Feature 4 -. S?44 \ / \ Fe at u r— e33 -

V Feature 2 -7'. 0

I r Figure 5.11 b: Evolving weights for the N,.b combiner

also been reduced as the system attempted to establish the following relations:

1lC D0 = w 1 log + w 2 log1 + w 3 + w 4 s1ope (5.21) I J c ox 1 1 (S22!2-) N5 , = w 5 log + w 6 log + W7 + w8sIope rn (5.22) Cox- J Ox Ox

-130- F 1c . I-c. eE51 . 3

Nsub error I - - - - 0 I I IJox error W I

-1 / /

-3 1 0 3 6 9 12 15

T-ir,1-

Figure 5.11 C: Feedback errors for both combiners

- I r r

.000

1

.2544 C I .1 • 1708

.0872 I

• 0036 I, x

0 3 6 9 12 15

Figure 5.12 a: Evolving weights for the DOX combiner

The expression of equation 5.22 is particularly interesting in that it represents a linear approximation to a transcendental function over a reasonably wide range of values.

Looking at the weighting coefficient and autocorrelation matrices after training can give a numerical measure of the feature:outcome correlation and the feature:feature interaction. Tables 5.2 and 5.3 show the matrices characterising the system after training on the parameters of example 3. As can be seen from the weight

-131-

1._A i r- aq r— 22. 256 Feature 1

16.4049

4) 10.5537 £

4.7024 x \ Feature 4

-1.1488 Feature 3 \ Feature 2

-7 I - I 0 3 6 9 12 15

1r.1r,ir,

Figure 5.12 b: Evolving weights for the N 3 , combiner

F 4 ic . 1<. CEO . . 23

18

L 13 0 1. 1 Id Dox error -

3 N-s-ub error / \

21 I I '.1 I I I I I I 0 3 S 9 12 15

r r M i .. I .. - m J= 1 em

Figure 5.12 C: Feedback errors for both combiners

matrix, for the D0 combiner, the largest multiplier by far is associated with feature

number 3, and is equal to the relative permittivity of SiO 2 , (3.38). For the Nb combiner, the figures are beginning to settle towards the equilibrium values of figure 5.12 b. The autocorrelation matrix gives us insight into the inner workings of the PLM. The strongest correlation exists between features 1 & 2, 1 & 3, and 3 & 2. The correlations between 1 & 4 and 3 & 4 are less strong, though still significant, whilst the interaction between 2 & 4 appears the least significant. The diagonal elements of this

-132- Weight matrix (W): 11

- Feature D., log(N)

log(—) 3.31E-5 2.67E-1 C.

log( "") 1.51E4 2.33E-1

1 3.38E-1 -1.36E-2

9.00E-6 -5 .1 OE-3 MM

Table 5.2: W matrix after training on fifteen samples. (emboldened figures: very significant, italicised figures: moderately significant)

Autocorrelation matrix (rzr ):

feature (1) (2) (3) (4) 4.08E4 -5.38E4 9.00E4 7.22E3 7.89E2 -1.12E4 -1.14E3 2.20E5 1.31E4 1.90E3

Table 5.3: r, matrix after training on fifteen samples. (emboldened figures: very significant, italicised figures: moderately significant) matrix are naturally large, since these are the correlations between each feature and itself. Tables 5.4 and 5.5 present similar information for the features of example 4. Note this time there is a higher degree of interaction between the fourth feature and the other three.

The information as presented in this format is difficult to assimilate and Mirzai [22] has recently suggested a more rigorous approach. The normalised weight and error plots, however, provide instantaneous visual feedback on the usefulness of a particular feature set and the system stability during training.

5.5.1. Extending the Parameter Set

Whilst the previous section illustrates a good example of the learning system in action, it is not particularly useful when applied to practical C-V measurements, since the range of variability between curves is never solely limited to differences in D0 and

-133- Weight matrix (W):

Feature D0 Iog(N b )

log(—) 1.31E-4 4.14E-1

log( 1. 10E-5 2.20E-1

3.38E-1 -9.76E-3

max(dC/dV) -1.23E-4 6.13E-2

Table 5.4: W matrix after training on fifteen samples. (emboldened figures: very significant, italicised figures: moderately signficant)

Autocorrelation matrix (r,):

feature (1) (2) (3) (4) 4.08E4 -5.38E3 9.00E4 1.88E3 7.89E2 -1.11E4 -3.22E2 2.20E5 4.06E3 2.02E2

Table 5.5: r matrix after training on fifteen samples. (emboldened figures: very significant, italicised figures: moderately significant) N 3241,. As already mentioned in Chapter 4, real C-V curves may be the result of many other variables. In this section an attempt has been made to extend the technique to include variations in implant dose and implant energy in order to give simultaneous qualitative information about the oxide quality and implant parameters. For this, the basic feature set had to be greatly extended. Of course, it was still necessary to include the features and outputs for D0 and N 5,, in order that the PLM could cope with normal processing variations in these basic parameters. The chosen characteristic features as depicted in the diagrams of figure 5.13 are listed below:

1 log( —) Cox C . log( Cox

Cox

-134- dC Height of first peak in - dV dC Height of second peak in - dV dC Spacing between the two peaks in - dV

Flatband voltage

a2 of voltage-shifts in range 10 - 90% of CO3

Gradient of best fit line to voltage-shifts in range 10 - 90% of CO3

Standard error of the best fit line above The complete multiple adaptive combiner system thus comprises ten inputs and four outputs. In these experiments, real samples were used to train the PLM, since a C-V curve simulator for non-uniformly doped substrates was unavailable.

The learning curves depicting feedback error and the evolving weights during training are shown in figures 5.14 to 5.16. Note that in contrast with the earlier experiments of section 5.5, the stability of the PLM is markedly reduced. As will be shown. however, this does not have too detrimental an effect on the results as at first expected.

Figure 5.14 illustrates the evolving weights for the D0 combiner. Features 1,2 & 4 contribute little additional information as expected and the weights associated with

FIRST DIFFERENTIA.

L)

Figure 5.13: Location of the features extracted from the C-V curve

-135-

rr, k I .. eq r- 84.

Feature 3

+' 64.8 Cn

11 X 45.6

-o 0) 26.4

E 7.2 Features 1, 2 & 4 \ -- ------z

_12t 1 1 1 I I 1 I I I 0 3 6 9 12 15

Training example

Figure 5.14: Evolving weights for the modified D0 combiner

L_.1 r-r, i= i ri CB r- 18 -

Feature 1 - 14 -c 0)

0) X 10

-o 0) 1 6

--I E I. o 2 z Features 2, 3 & 4

------zz -2 0 3 6 9 12 15

Training example Figure 5.15: Evolving weights for the modified N,.b combiner

features 5 to 10 are insignificant, thus correctly suppressing any influence from these

attributes. Feature number 3 (i-) is the only significant component of the D0

combiner output as before, but oscillates wildly during the training process. Further evidence of instability in this system is also evident from the graph of the feedback errors in figure 5.16.

-136-

_ 1..c, eD r r r- 4 0 • .--.- / -.....----

Energy II \ \ Ji / 0 L N L Dox Id

(5 0 J

p... \Dose/

iNsub \

I • I I I __g 3 6 9 12 15

Training example

Figure 5.16: Feedback errors for all four combiners

The most likely cause of this instability is the addition of the two new combiners, "IPD" and 'IPE" for the implant dose and energy respectively. As can be seen from the learning curves of figures 5.17 and 5.18, the weights for these two output functions do not appear to settle to steady-state values. In other words, the relationships between the features selected to represent the C-V curve and the actual 'dial' dose and energy on the ion-implantation machinery are insufficiently linear. Despite these shortcomings, however, the trained PLM is remarkably able to yield a reasonable first- order estimate of these two processing parameters.

The list of ion-implanted and non-implanted samples used to characterise this system is given below in table 5.6. The trained PLM was subsequently tested on this training set and the results are shown in table 5.6. Table 5.7 shows the parameters estimated by the PLM, and the percentage difference from the actual values, where applicable. The agreement, in the majority of cases, was observed to be reasonably good after considerable training on a variety of samples, in many different orders.

The results of testing the the PLM on new, 'unseen' examples were less successful as shown in table 5.8. The errors in the DOX and N ,.b estimates, for samples 735_14 and 735_19 are considerably greater, although the implant parameters have been predicted with an acceptable accuracy. The second two examples are much more accurate in their estimation of the the basic physical parameters, but less successful in

-137- I F= ID = CD M km I r r

\ Feature 1 (4 +' 36.8

@3 x 18.6 Feature 3

-I @3 .4 \ fu Feature 2 E \ / -17.8 '.—,L-- z \ j F.ature 4 I I •-•--r- I 3 6 9 12 15

Training example

Figure 5.17: Evolving weights for the implant-dose combiner

I F E: c c, rn I= I ri en r- 192,

(4 / Feature 3 -' 135.8 / a) I \ 03 I X 79.6 .1 ' 'Feature 1

-D @3 23.4

E ture -32.8 z I Feature 2 'I

3 6 9 12 15

Training example Figure 5.18: Evolving weights for the implant-energy combiner

determining the implant dose and energy. This would indicate that the associations established within the PLM, although sufficient to model the training set, were too well matched to these particular samples to be applicable to other devices. In short, the PLM had lost its ability to generalise.

A number of further practical drawbacks were encountered in extending the machine learning technique which ultimately led to its abandonment in favour of a pattern classifier. Furthermore, it was felt that with regard to the extraction of process parameters from the C-V curve, conventional techniques such as flatband analysis and

-138- batch wafer D . N $Ub Dose Energy no. no. (X) (atoms cm _3) (x 1011 ions cm 2) (keV) 682 1 232 1.45e15 0 0 682 2 385 1.60e15 0 0 682 3 787 1.51e15 0 0 682 4 229 7.30e14 0 0 682 5 373 6.78e14 0 0 682 6 793 7.01e14 0 0 682 7 390 6.90e14 4 40 682 8 385 7.40e14 8 40 735 15 693 1.40e14 4 60 735 17 837 7.90e14 6 40 735 18 823 8.40e14 2 40

Table 5.6: Ion implanted samples used to train the PLM

batch wafer D. N, Dose Energy no. no. (atoms cm (x 1011 ionms c,,. 2) (keV) actual est err actual est log(err) actual est err actual est err % % (abs (abs) 682 1 232 231 0.4 1.45e15 7.30e14 1.9 0 0.12 0.12 0 -0.8 -0.8 682 2 385 386 0.3 1.60e15 1.43e14 0.4 0 0.37 0.37 0 4.3 4.3 682 3 787 782 0.6 1.51e15 1.08e15 1.0 0 0.35 0.35 0 7.8 7.8 682 4 229 227 0.9 7.30e14 5.78e14 0.7 0 0.11 0.11 0 -0.2 -0.2 682 5 373 373 0 6.78e14 1.12e14 5.2 0 0.02 0.02 0 0.5 0.5 682 6 793 797 0.5 7.01e14 9.10e14 0.8 0 -0.21 -0.21 0 2.0 2.0 682 7 390 395 1.3 6.90e14 1.04e15 1.2 4 4.0 0 40 32.7 -7.3 682 8 385 390 1.3 7.40e14 1.12e15 1.2 8 6.3 -1.7 40 48.4 8.4 735 15 855 853 0.2 8.10e14 6.00e14 2.2 4 5.5 1.5 60 46.6 -13.4 735 17 837 837 0 7.90e14 4.80e14 1.4 6 6.1 0.1 40 59 -18.9 735 1 18 1 823 1 824 1 0.1 1 8.40e14 I 9.50e14 0.4 2 1.4 -0.6 40 24 -16

Table 5.7: Results of testing the PLM on its training set

-139- batch wafer D. N,,, Dose Energy no. no. (A) (atoms cm (x 1011 ions cm _2) actual est err actual est log(err) actual est err actual est err % (abs) (abs) 735 14 853 776 9.0 9.83e14 1.69e14 5.1 8 11.0 3 80 84 4 735 19 826 770 6.8 1.14e15 3.07e14 3.8 8 9.1 1.1 60 68 8 735 13 834 834 0 8.73e14 1.30e15 1.2 4 1.4 2.6 40 29 11 748 06 762 1 761 0.1 8.90e14 1.03e15 0.4 0 0.4 0.4 0 10 10

Table 5.8: Results of testing the PLM on samples not used in the training set dopant profiling, were more likely to yield reliable, predictable, and accurate parameters than the PLM. These and other limitations of the PLM approach to parameter extraction are discussed in the next section.

5.6. Limitations of Parametric Learning to C-V Measurements

Feature extraction is an informed means of data reduction and as such, appears to demand considerable knowledge about the dynamics of the process under investigation. This difficulty stems from the lack of a clear set of relationships between curve-shape and process parameters, and the cumbersome nature of the feature vector as a description of a pattern. Credit assignment, the process whereby the PLM accounts for its result by assigning credit or blame to its internal functions, is particularly awkward due to the inherent complexity and inscrutability of the 'black box', i.e. the r, and r matrices.

The stability of the PLM during training on a variety of C-V curves was often upset by numerical problems induced by:

The relative magnitude of the features,

The problems of unforeseen or unknown non-linearities and interelations between the salient features of the pattern,

The order of training.

Point 3 was found to be particularly insidious; if the adaptive combiners were trained in any methodical fashion (for instance, incrementing the oxide thickness and substrate doping whilst attempting to keep all other process variables constant), the system would detect this correlation, and would later (falsely) associate any variation in

-140- one with a change in the other! The only sure way to avoid this was to employ totally random training. A principle limitation lay with the difficultly in obtaining suitable features which were linearly related to the process variables under investigation. Furthermore, the trial and error process of discovering the most useful features was considered too time consuming to be of any practical use.

5.7. Conclusions

The feasibility of using a parametric learning algorithm to extract simple parameters from the shape of a C-V curve has been demonstrated. Even for the simple example given, it has been established that the choice of the feature set is of crucial importance to the performance of the system in use.

The fragility of the system and its sensitivity to feature choice is an unwelcome departure from the use of the PLM as a general purpose problem-solving device. This requires more specialised knowledge and a higher level of theoretical insight into the problem. This drawback has also been recognised by other workers [25]. It is interesting to note that some of the more successful PLMs (Pandemonium and WISARD) skirt around this problem by generating their own features entirely. This has two benefits, firstly, they seem to work very well, and secondly they may be trained to respond to a greater variety of input patterns than initially specified. PLMs thus appear better suited to the tuning of minor maladjustments than to the diagnosis of faults in non-linear systems.

The next chapter details the use of a pattern classifier to identify specific, well- known classes of C-V curve. In this instance the same or similar features used here in a PLM are used to generate discriminant functions rather than to attempt to model the causal links between a C-V curve and its origin.

REFERENCES

M. Minsky and 0. G. Selfridge, "Learning in neural nets," Proceedings of the Fourth London Symposium on information Theory, Academic Press, New York, (1961).

F. Rosenblatt, "The Perceptron: a probabalistic model for information storage and organisation in the brain," Psychological Review, Vol. 65, (6), pp. 386-408, (1958).

-141- F. Attneave, "Some informational aspects of visual perception," Psychological Review, Vol. 61, (3), pp. 183-193, (1954).

R. P. Lippmann, "An introduction to computing with neural nets," IEEE Acoustics, Speech And Signal Processing Magazine, pp. 4-21, (April 1987).

D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing: explorations in the microstructure of cognition, Volume 1: Foundations, MIT, USA, (1986).

6.. J. K. Hawkins, "Self-organising systems - a review and commentary," Proceedings IRE, pp. 31-48, (January 1961).

R. Forsyth and R. Rada, Machine Learning - Applications in Expert Systems and Information Retrieval, (1986). Ellis Horwood Series in Al

M. L. Minsky and S. A. Papert, Perceptrons (An Introduction to Computational Geometry), MIT Press, (1988). (Second, revised edition)

0. Selfridge, "Pandemonium: a paradigm for learning," NPL Symposium on Mechanisation of Thought Processes, HMSO, London, (1959). (reprinted in 'Pattern Recognition' by L. Uhr, Wiley, New York)

G. Hinton, "Learning in parallel networks," Byte Magazine, Vol. 10, (4), (1985).

W. W. Bledsoe and I. Browning, "Pattern recognition and reading by machine," Proceedings of the Eastern Joint Computer Conference, (1959). (reprinted in 'Pattern Recognition' by L. Uhr, Wiley, New York)

A. Samuel, "Some studies in machine learning using the game of checkers," IBM Journal of Research and Development, Vol. 3, (1959).

D. Michie and R. Chambers, "Boxes: an experiment in adaptive control," in Machine Intelligence (2), ed. Dale & Michie, Oliver & Boyd, (1968).

I. Aleksander and P. Burnett, Reinventing Man, Penguin Books, Middlesex, (1984).

T. M. Crawford, V. Marton, and C. F. N. Cowan, "Testing electronic systems using expert systems- a pattern recognition approach," lEE Symposium on Current Trends in the Application of Expert Systems, Strathclyde University, (2 April 1986).

A. R. Mirzai, C. F. N. Cowan, and T. M. Crawford, "Intelligent alignment of waveguide filters using a machine learning approach," IEEE Transactions on Microwave Theory and Techniques, Vol. MTT-37, (1), pp. 166-173, (Jan 1989). C. F. N. Cowan and P. M. Grant, Adaptive Filters, Prentice-Hall, Englewood Cliffs, N.J., (1985).

M. Honig and D. G. Messerschmitt, Adaptive Filters: Structures, Algorithms and Applications, Kluwer Academic Publishers, Dordrecht, Holland, (1984).

C. F. N. Cowan, "Performance comparisons of finite linear adaptive filters," lEE Proceedings: F, Vol. 134, (3), pp. 211-216, (June 1987).

-142- N. Wiener, Extrapolation, Interpolation and Smoothing of Stationary Time Series, Wiley, New York, (1949).

J. M. Cioffi and T. Kailath, "Windowed fast transversal filter adaptive algorithms with normalisation," IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-33, pp. 607-625, (1985).

A. R. Mirzai, ed., Artificial Intelligence: Concepts and Applications in Engineering, Kogan Page, (1989). (Chapter 5)

A. Goetzberger, "Ideal MOS curves for silicon," Bell System Technical Journal, pp. 1097-1122, (September 1966).

K. H. Zaininger and F. P. Heiman, "The C-V technique as an analytical tool," Solid-State Technology, (Part 2), pp. 46 - 55, (June 1970). K. Brown, The application of knowledge-based techniques to fault diagnosis of 16 QAM digital microwave radio equipment., (1988). Ph.D. Thesis, University of Edinburgh.

-143- Chapter VI Techniques of Pattern Recognition

6.1. Introduction

The science of statistical pattern recognition (SPR) is over 50 years old [1, 2], yet many of the techniques are now being revived under the new banner of 'artificial intelligence' [3]. In common with artificial intelligence, SPR shares many similar goals:

to relieve tedium in repetitive tasks

to achieve more consistent results than a human operator

C. to allow subjective, perceptual skills to be distributed and accessible

d. to save money and improve quality

Pattern recognition is the way in which we perceive the world, and is an ability which is highly developed even in very young children. Our ability to mimic these processes, however, is fraught with many conceptual difficulties. This is principally due to the instinctive, but not unformulated nature of interpretive tasks. The reader will have long since learnt to distinguish between an "i" and a "j" in running text, or perhaps to distinguish between a cat and a dog, both having four legs and a tail. There are clearly discrimination algorithms in action here, but who could explain exactly what they are? The advent of computer techniques has, however, allowed for a wealth of statistical processing techniques to be applied to some common tasks:

- 144- optical character recognition (OCR)

medical diagnosis

C. predicting failure

monitoring equipment through detecting anomalies.

visual inspection for automated assembly

1. classifying land areas from satellite photographs

g. speaker identification

Many of these problem areas are also amenable to solution by connectionist learning systems as discussed in the previous chapter. Again, all are activities requiring a degree of acquired skill to perform efficiently. The difference lies in the range or number of object classes. SPR is extremely effective when the number of different states to be differentiated between is small, regardless of the complexity of the task. For example, a machine tool may be monitored for cutter wear and failure diagnosed from a complex Fourier analysis of the audible signal generated by the cutter itself [4].

This chapter explores and applies a variety of SPR algorithms to the classification of C-V curves using an established number of multivariate statistical methods.

6.2. Concepts of Pattern Recognition

Almost any function of one or more variables can be considered as a signal or pattern. Whatever the source of the pattern, be it acoustic noise from a machine tool, pixel images from a tv camera or the output from a capacitance meter, signal processing techniques of enhancement, transformation and feature extraction can be applied.

The pattern recognition process can be sub-divided into distinct tasks as shown in figure 6.1 below. These can be summarised as two phases:

deriving the decision rule

applying it to the pattern

145. Data Pre- Post- L_uisjti0n }" [ corniitioning ] [ processing

Feature Feauirefl Pattern I j r sij_ø Diagnosis extraction selection ' Lcsifica L L ]

Figure 6.1: Phases of the pattern recognition task

In statistical pattern recognition [9], this decision rule is derived from the statistics of the input signal. Such systems may be incrementally or concurrently trained on representative patterns. This learning process may take one of two distinctly different forms:

supervised, or

un-supervised

The difference lies in whether or not the objects to be classified can be attributed to pre-defined groups. A supervisor is required to identify the letters of the alphabet in a phototypical OCR system, but the number and nature of the groups in a social study of population buying habits will often be unknown a priori. This latter form of SPR is more often known as cluster analysis, and typical algorithms will attempt to unite similar groups of data, without any knowledge of their individual origin.

A second method of pattern recognition is distinguished by an alternative form of representation for the pattern. Structural (also known as syntactic or semantic pattern recognition) [10, 11], embodies more of a linguistic description of the object to be classified, for which non-statistical techniques are used to generate the decision function and identify the pattern.

-146- 6.2. 1. Vector Spaces: Definitions

In the process of progressing from data acquisition to diagnosis, it is helpful to represent the levels within this strategy by a hierarchical set of mathematical spaces. These are illustrated in figure 6.2.

The measurement space is composed of all the possible forms of the pattern to be classified as described by the raw physical data, such as x-y coordinate data, sound recordings, bitmaps of images, or responses to psychological questionnaires. The precise form of the input data is unimportant - such is the generality of the technique. Each example of the pattern may thus be represented in this multi-dimensional space by a point described by a vector. There may exist a large or possibly unlimited number of different measurements (dimensions) in the measurement space, and thus it is usual to transform this initial set of measurements to one of more manageable proportions, by extracting a smaller set of salient features.

This process of feature extraction may involve a sub-hierarchy of transformations before a suitable set of features, forming the feature space, is reached. The minimum requirement of a feature space is that it should be of finite, and low dimensionality, and yet retain sufficient information about the original pattern to allow an unambiguous classification to be made.

Finally it is the task of the classifier to map similar groups of vectors in the feature space to the same point in category space.

Interpretation

I Category space I

Reduced feature

Feature space

Measurement

Physical system

Figure 6.2: Relationship between the vector spaces used in pattern recognition

-147-

6.2.2. Statistical Decision Theory

Given that we have represented our data in this manner, it becomes necessary to define a decision rule for the purpose of dividing the feature space into separate regions. An unknown pattern can thus be classified if it falls within a specific decision region as shown in figure 6.3.

The purpose of training the pattern classifier on a known set of data is to optimise the position of the decision boundary. The decision function describing the boundary in figure 6.3 is linear. In three dimensions, this line would have the form of a plane and in four or more dimensions, a hyperplane. The linear discriminant function is the simplest discriminator, but is obviously limited to linearly separable problems.

Instances of decision regions not meeting this requirement are shown in figure 6.4. In just two dimensions, these distributions are intuitively obvious, but in higher dimensions, they are at once difficult to visualise and furthermore, may take on much more complex forms. It is important therefore to remain aware of underlying assumptions of the linear classifier, and to keep the dimensionality of the feature space as low as practically possible.

A linear discriminant function can be specified simply by the general equation of a plane:

fi decision boundary

b p • Class! s ee 0

S . , S • I • • Class • S • ,. • t . III • • 5% II. • •, • SI • •• S • f2

Figure 6.3: Separating two pattern groups by a decision boundary

-148- Enclosed - Overlapping

CC(~ ) B C

Multi-modal Concentric

Figure 6.4: Pattern groups requiring non-linear decision boundaries

g(&)=O (6.1) (a hyperplane).

To determine the classification of an unknown pattern y is simply a matter of assigning the point y to one side of the boundary or the other. Mathematically, this can be expressed as a decision rule:

lass 1ifg(y)O C(Y) (6.2) = tlZass2ifg(y)

Practical classification problems, however, rarely involve conveniently localised clusters of data points, and will more typically appear as a number of overlapping

Gaussian distributions as illustrated by figure 6.5. A discriminant function in this situation can only define the degree of class membership, and the best decision boundary will approximately follow the intersection of the distributions. In higher dimensions the form of the probability density function (PDF) is generaly assumed to

149.

X

Figure 6.5: One dimensional example of overlapping Gaussian PDFs be the multivariate normal as illustrated in figure 6.6 and defined below:

iexp (6.3) I 22 ] V2we In this situation, as envisaged in figure 6.6, a linear discriminant is not the optimal decision boundary. If, in a two class case, the data clusters are from different normal distributions, then the decision boundary, (the intersection of the contours of equal probability), is better described by a curve, as shown in the cross-sectional diagram of figure 6.7. Such a curve can always be proven to be a quadratic function [12] (Appendix 1).

In formulating a decision rule for classifying sample points from multivariate distributions, it is particularly convenient in a vector space, to use some form of distance metric to determine the optimum decision boundary. Here, we use the notion of 'distance' as a measure of similarity or dissimilarity between groups of patterns.

Given a point a = (1 if 2..... f,) in n-dimensional feature space, for which there are N possible pattern classes described by N groups of M points= 1,2..... N

each of the form Ym = (P 1 ,P2.....P) then the most general distance function must satisfy:

d(a,a)=O,d(a,,,,) Vr,91 y, - ( 6.4) d and d(a,)+dQ d(a,) One commonly used metric is the Euclidean distance illustrated schematically in figure 6.8 which can be expressed in the two dimensional case as:

13' f(xj_y,)2I (6.5) I1

-150- Figure 6.6: The multivariate normal distribution (in two dimensions)

Cl

Figure 6.7: Cl and C2 are multi-variate normal distributions characterised by different means (position) and shape (covariance)

f2

fl

Figure 6.8: The Euclidean distance metric 6.2.3. Design and Test Data Sets

To enable the use of a distance classifier, the statistics of the PDFs must be estimated from prototype data provided for the purpose. The design set (also known as the training set) is a collection of sample data used to train the pattern classifier and define the decision boundary. The set of data presented to the classifier after training, for classification of unknown points, or just for performance estimation, is known as the test set. Their use is epitomised by figure 6.9.

In many practical instances, the precise form of the PDF will be unknown and thus in the absence of any other information, must take on an assumed profile which is often that of the multivariate-normal distribution. This assumption may be invalid, particularly when the sample sizes are small, and thus the performance of the resulting classification algorithm may fall short of expectation. This and other aspects of performance evaluation are discussed in section 6.3.5.

6.3. The Classification of C-V Curves

For intelligent, automated control of C-V measurements, the test system should be able to make deductions about the device under test from the shape of the C-V curve. By treating interpretation as a classification task, we may use a pattern recognition system to distinguish between different categories of MOS capacitor characteristics.

Figure 6.9: Functions of the design and test pattern sets

-152- In the following sections, the design, implementation and test of a pattern classifier for C-V curves is documented. In this analysis, the data acquisition and diagnosis phases, as depicted in figure 6.1 have been left out, as these have already been described in detail in Chapter 4.

6.3.1. Pre-processing

The function of the pre-processing algorithms are to:

Ease the task of the feature extractor, and

Clean-up, enhance and perhaps transform the raw data into a more meaningful or intelligible representation.

Examples of pre-processing in pattern recognition include spatial filtering and grey-level thresholding in image processing, and time to frequency domain transformation in speech recognition. In the processing of C-V data, a number of techniques have been employed including the following:

Smoothing by moving average

Transformation by differentiation

Peak detection by gradient searching

The significance of these operations can be better appreciated in the context of feature extraction, discussed in the next section.

Smoothing and Averaging

The raw C-V data from pulsed or equilibrium measurement has an indeterminate form, and thus cannot be presumed to take on any particular shape. The most important operation therefore, prior to any analysis of the curve is data smoothing. Since C-V data may be noisy on a small and/or large scale, it has been found necessary to:

1. Search for any glitches or spikes and remove them as necessary. Large in this instance is defined as any change in the measured capacitance between one point and the next greater than 20% of the previous value. The spike removal algorithm is defined by:

-153- IF c(i)>5Xc(i —1) AND c(i)>c(i +1) THEN c(i) = c(i —1) (6.6) JFc(i)<0.2Xc(i-1) AND c(i)<0.2Xc(i+1) THEN c(i)c(i-1)

Glitches are likely to occur during a capacitance meter range-change during a pulsed measurement on a medium to short lifetime sample and must be removed as shown in figure 6.10. Spikes, however, may be caused by sudden electromagnetic interference, recovery of the depletion layer in deep depletion (short lifetime silicon only), or may be the result of a faulty oxide or device connection. Such an unwanted effect is shown in figure 6.11.

2. Apply a smoothing algorithm to the raw data. This is necessary to 'clean up' the data either from when there is a small amount of 'noise' evident in the capacitance data, or if the spike/glitch removal was invoked. The technique of the moving average was applied to all measurements without any loss of important detail in the C-V curve. The moving average algorithm operates by

'_. 1 - - 1 '_/ . -41 -

0 S

12 I U

14 $1

U 16 a. U U 8

7.2 —5.7 —4.2 —2.1 1.2

1 C= '.1' a.4 C25 Sc=d a,

S S

F, 12 0 I U C I 4 41 Ij

Li 6 I 6 a. I U U 8

I I I I -7.2 —5.7 —4.2 —2.7 —1.2 —7.2 —5.7 4.2 —2.1 —1.2

'101 tags '101 taGs

Figure 6.10: Phases in the removal of a glitch

- 154- -. - 1." . . _ B -

S S

II I U U

0 SPIKE 4, 4'

U U 6 I 6

I U U B 8

-7.0 .3.4 -3.8 -2.2 -.6 1.0 -7.0 -5.4 -3.8 -2.2 -.8

V.1 0 V 1 -t

Figure 6.11: c-v curves before and after removal of a spike averaging each point over the previous and next two thus:

c(i)' = (c(i-2)+c(i—l)+c(i)+c(i+l)+c(i+2) ) 15 (6.7) Although a five-point smoothing formula was used initially, it was found later through experiment that in most instances three points would have been sufficient. An option exists in the software to re-apply the three-point algorithm to particularly noisy data. Figure 6.10 illustrates four phases in the smoothing process, applied to a C-V

curve. This philosophy is mentioned by Hildebrand [13] (p. 358) who prefers to employ a low-order smoothing algorithm iterated as many times as necessary, in the opinion of the operator, to eliminate noise but preserve the essential detail of the shape.

Transformation

Since a great deal of information about the MOS capacitor is evident from the subtle changes and inflexions in the gradient of the transition, it is instructive to emphasise these variations by applying some simple transformations.

By taking the first and second deviatives using simple 3-point formulae [13] (p. 181) of equation 6.8, the differences between uniformly-doped, ion-implanted, and high interface-trap density samples are made visibly more apparent as can be seen from figures 6.12. In this form, the differences between the different classes of C-V curve are computationally easier to extract.

-155-

- ci .-' .t - 9 3 5 - I 1

A (I 0 I 0 0 U U )

6

U

'9 U

0 1 -t. = a = '.' 0 1 -t a a

- . ci I ,, t -ç c '..' • 2 _3 - . . ci I _ - F C '.7 . 7.4 91 - 0 I.

11 'I 0 U 0 U I . ) Q U U

Ii

1.8 2.9 - -

Figure 6.12:. First derivatives of C-V curves from samples with: a) VT -adjusting implant, b) high trap density, c) depletion implant, and d) uniform doping

C'(i) = [C(i+1)—C(i —1)]12h(--) (6.8)

C"(i) =[C(i—l)-2C(i)+c(i+l)]/h 2

NOTE: These formulae are valid only for equally spaced data (voltage axis).

Another way of enhancing the C-V data by observing it from a different viewpoint is to compare the measured curves with an idealised model as defined in Chapter 4. This visual comparison is often carried out by an experienced C-V operator, since any non-linear shifts from the ideal are an instant indicator of either some substrate non-uniformity, high trap density or oxide charge non-uniformity. Figure 6.13 illustrates the automation of this procedure through the use of a cubic

spline, piece-wise polynomial interpolation routine [13] (p. 478). The algorithm

-156- / awil

Capacitance

Figure 6.13: Illustrating the extraction of voltage-shift displacement curves produces a histogram of shifts between the two curves within a 10% - 90% window of the theoretical data. This can be graphed against the capacitance values corresponding to the shift. The resulting curves are displayed in figures 6.14 and may also be used to distinguish between the different shapes of C-V curves.

6.3.2. Feature Extraction

In common with parametric learning algorithms, the selection of suitable salient attributes (features) to describe the pattern to be identified, is a crucial and important step in the design of a pattern-recogniser. The rules of feature choice for a pattern recognition system are thus similar to those for a PLM, with some notable exceptions:

Features are selected on the basis of their discriminatory power rather than their ability to model any pattern:symptom relations, be they theoretical or heuristic.

The chosen features should exhibit stable and predictable behaviour for any class of pattern presented to the system.

- 157 -

tTC -rINJ L_V 2

1.6 A

' .2

A .8

o .4 >

AAAAAAA A A AAAA A o 8 16 24 32 40

C-value (*19 pF)

FZZ RL.QT z1lsrQFmIr.J Lor 4.4 2.

3.4 ;: 2. C

4. ' 2.4 I. 'C 44 A tj 1.4 4, 0) 0) 4, 4, AA 4. A A A A o .4 0 > >

-.8 I 8 16 24 32 40 8 8 IS 24 32 40

C-a). (*IE9pF) C-value (*1E9 pF)

Figure 6.14: Displacement curves from samples exhibiting: a) uniform doping, b) non-uniform doping, and c) high trap density

The philosophy underlying this process is best described by Selfridge as

'The extraction of the significant features from a background of irrelevant detail." [14]

However, this non-trivial and largely intuitive process has even been classified as an 'art' [15]. It should be noted that:

No general formula exists for choosing features relevant to a particular problem, and

The design and assembly of a feature set is an empirical process using many ad hoc strategies.

_158- Batchelor [8] in his popular book, acknowledges the "heuristic, pragmatic and experimental" aspects of pattern recognition and warns the potential user against the 'illusion that pattern-recognition is a precise subject'. Unfortunately, this text does not provide any guidelines or suggestions for selecting features. Nilsson [16] similarly dismisses the problem of feature selection as a 'deliberate omission', in the hope that future progress will be made. This is unfortunate, since the real power of any successful pattern recognition system often stems more from a careful choice of features than the particular classification algorithm used. Indeed there are trade-offs that can be made here, and a sensible selection of features will lessen the demands on the classifier. It is common therefore for the tasks of feature extraction and distance classification to be grouped together under the one heading 'classification'. The majority of failures experienced by workers in the field of pattern recognition however, can be attributed to the problems of finding good shape descriptors. To add to this problem, there appears to be very little practical information in the pattern recognition literature which tends to concentrate on classification, save for a survey [17] and a special issue [18, 191.

Some well established parametric descriptors for pattern representation include measures of geometric qualities such as size, shape and moments of inertia [20] (Chapter 16). Other more subjective measures such as 'curvature' and 'straightness' may be described by 'perceptual partitioning' between segments delimited by local extrema, points of inflexion and curvature maxima [21,22].

Curvature maxima or breakpoints[23] have long since been recognised by psychologists as informationally rich shape descriptors and widely used in pattern recognition. This approach was used as the basis for evaluating the first and second derivatives of the c-V curve to aid in the extraction of these changes in curvature.

Another important consideration in choosing suitable features is that they should be invariant under the normal variations in the pattern data. For example, in the context of C-V curves, it is necessary to eliminate or normalise any features that vary with oxide thickness, substrate doping or fixed oxide charge, since we are interested here only in the informational content of the shape of the curve. Ratioed measures such as height, width, aspect ratio, circularity and rectangularity are useful as invariant descriptors and convey important perceptual information.

-159- Parametric features fall into distinct categories depending on their ability to re- create the image or pattern that they represent. These are known respectively as information-preserving or non-preserving. Examples of information-preserving descriptors are Fourier and polygonal approximations where the coefficients of the series expansion generated by the transform is inversely related to the original pattern.

Heuristic measures of shape such as those used here are often non-unique and do not capture the full structure of the image. They are useful, however, for a limited set of patterns, and in this situation outweigh the information preserving techniques on the basis of ease of implementation.

Representing C-V Curves

An initial set of features chosen to represent C-V curves from some well known classes, as shown in figure 6.15 are listed below. These feature values are graphical attributes chosen for their discriminatory power to aid classification.

Slope between 10% and 90% points

Series resistance, R.

Oxide conductance, G,

max Swamping effects of R,: '.-oxp dC Magnitude of first peak in dV dC Magnitude of second peak in dV dC Spacing between the first and second peaks in dV dC Depth of trough between peaks in dV

Ratio of the magnitudes of the first and second peaks in dV Ratio of the log of the peak in G (w) vs. V to the peak width at % height

Magnitude of the peak in the G(w)vs. V curve

Number of apparently negative C,, (w) values, as a percentage

Amount of noise in C,, (o) data (number of positive to negative gradient changes)

-160- Standard deviation of voltage shifts, °Vh,

Mean of the voltage shifts, Vhift dVhJft Gradient of linear fit to voltage shifts dC Standard error of linear fit to voltage shifts a vfu Experience and familiarity with the types of curves generated by the equipment instinctively leads to a large number of possible features with which we may distinguish between classes of curve shape. Unfortunately quantity is not a substitute for quality [9], and some pre-selection is necessary at this stage. By removing redundant or irrelevant features from this list, it is possible to use a classifier which is less complex and more reliable. This selection was based both on intuition and on the need to meet the conditions for invariance. However, statistical techniques exist to select features on the basis of their individual contribution and these will be discussed in the next section 6.3.3.

It was decided to initially restrict the classifier to three well known classes of curve [24]. For this, eight features were identified as playing a significant role in distinguishing between C-V curves which were indicative of:

Class 1: A high interface trap density

Class 2: A non-uniform substrate doped with a threshold-tailoring implant

Class 3: A uniformly, or near-uniformly doped substrate with a low interface- trap density

For this three measurements were made:

• Pulsed C-V

• Equilibrium C-V

• Equilibrium G-V and the following eight features selected:

-161- 0.8 0.6

0.6

0.4

0.2

A

Bias voltage Bias voltage IilftS Voltage C/Cox C/Cox C/Cox

0.0

(e)

0.4

Ch

Bias voltage Bias voltage Bias voltage C/Cox CitlCmax CitJCmax

0.6

0.6

0.4

0.2

- 0 Bias voltage Bias voltage Bias voitese (a) The Ideal C-V curve, (b) Linear shift along voltage axis due to fixed oxide charge, (c) Effect of additive VT adjusting Implant (B in p substrate), (d) Effect of compensating VT adjusting implant (B In n-well), (e) Effect of a high density of Interface trapped charge ( 10cm 2eV), (f) Effect of a 'leaky' oxide, (g) Effect of short minority carrier lifetime on a pulsed C-V sweep, (h) Effect of a high D 11 on the interface trap conductance curve, (i) Effect of high series resistance on the interface trap conductance curve

Figure 6.15: Some schematic examples of representative C-V and G-V curve shapes dC Magnitude of first peak in - dV dC Magnitude of second peak in - dV

Spacing between the first and second peaks

Ratio of the magnitudes of the first and second peaks

The log of the peak in G (o) vs. V divided by the peak width at % height

Magnitude of the peak in the G (w) vs. V curve

Number of apparently negative GA!, (w) values, as a percentage

Amount of noise in C,, (w) data (number of positive to negative gradient changes)

The features relating to the voltage shift were found to contribute little additional information for the simple 3-class case, and were more difficult to extract from the data. Furthermore, because of the difficulty in obtaining an accurate value of N ,.b for some of the samples, a reliable ideal curve was sometimes unavailable.

Information from G oX and R,, although useful for signalling faults such as a leaky oxide, was not considered best used as features in a pattern classifier. Instead, preliminary checks on these and other parameters were used to pre-filter the data fed to the classifier as shown in figure 6.16. This idea of a hierarchical system for interpreting C-V data will be expanded upon in the next chapter.

Certain features such as the average slope and the depth of the 'trough' between peaks were considered redundant and were not selected to avoid co-operative effects

Measurement. I -1 Smoothing Preprocessor Feature Pattern I extraction. matcher.

Reject paths -

Short circ1U1 Leaky oxide Open circuit Sweep range Breakdown

Figure 6.16: Pre-filtering gross failures prior to pattern classification

-163-

and lessen the problems of dimensionality. Of the eight remaining features some, such as numbers 7 and 8 in the list above, were found to contribute only marginally, but increased the success rate sufficiently to warrant their inclusion.

Peak Detection

A peak in a curve is visibly and algorithmically easier to detect than an inflexion, hence the requirement for differentiation of the raw data in the pre-processing phase.

The techniques used to identify and locate peaks in the G;() and - data involve a dV series of methods:

Detection] IThresholdingl -. llndexingi — IHeuristicsl wanted peaks

Starting with the smoothed - or G11, ( w) data, the direction changes are located dV from nulls in the 2nd derivative. The local maxima are thresholded to —5% of the global maximum to remove the peaks due to residual noise. The remaining peaks are then indexed 1—n using heuristic knowledge about their expected position and order. In the case of the interface-trap conductance data, the number of un-thresholded peaks

and zero crossings are recorded as a measure of the noise in the G. () curve. The algorithms for these processes are described in Appendix A.

The correct identification of the important peaks in the dC/dy and G(o) data is shown in figure 6.17. This process is reasonably rugged and performs well for the vast majority of samples. However, for 'difficult' devices, a decision support framework

I t -f CS/i 7:3S_1 4 CS F.• S_1. i_2

Pe,,k 1 (I x PERK I.15E+11 cm-2 eV-1 I 0 U C U I 4, 15 - k U 3 Peak/width-\.808 V > C U to U 61-

'I Peak 2 I S. I- as 0 11 H

-2.6 -.6 1.4 3.4 5.4 7. - -.:

1 - — _ Vo I

Figure 6.17: Location of relevant peaks in the dC/dV and G (o) data

-164-

was built around the peak detection mechanism to enable the operator to intervene if necessary. The typical dialogue is illustrated in figure 6.18 and a summary outline of the feature-extraction processes is given by figure 6.19.

6.3.3. Feature Selection

Once a preliminary empirical feature set has been decided upon, it is often expedient to apply a theory-based technique to the selection of an optimal set. In some situations, this step may be unnecessary, but if subsequent classifier performance is not as expected, it could be that:

There is insufficient information to classify the pattern.

There is a surplus of redundant information giving rise to problems of dimensionality.

The groups of data are poorly separated using the co-ordinate axes selected.

There is conflicting information hidden within the features.

32

24

16

0

0 1 0 1.7 3.4

40-

30- Does your curve (left) appear similar to the

ae - expected shape ABOVE?

10- (Y)es, (N)o, or ?

Figure 6.18: Typical decision-support dialogue during CV-ASSIST

-16S -

Optimised + corrected C-V, G-V curves

Analyse shape Analyse shape Analyse shape of dC/dV of G(lt)-V of displacement

Extract features I I Extract features Extract featu

Most likely ttern classiier fault category < af Figure 6.19: Schematic outline of the feature extraction and classification processes

The general problem of feature selection is that of selecting the best set of features to describe and discriminate between the various pattern classes. This could simply take the form of an exhaustive search [5], selecting every possible sub-set combination and testing the resulting classifier's performance. However, even for the modest number of features suggested, the number of different experiments that would require to be made would be very large:

8 8 8 8 8 8 8 8 c+ c+ c+ c+ c+ c+ c+ C 1 2 3 4 5 6 7 8 = 8+28+56+70+56+28+8+1 = 255 (28 _1)

A more direct technique to transform the feature space to one of lower dimensionality is thus preferred. In addition to simply improving the classifier performance, feature selection can be invaluable for finding the 'intrinsic dimensionality' of the pattern space [6]. Transformation techniques which allow a two or three dimensional representation of the original data permit a visual comparison to be made, and the effects of adding or improving the basic feature set, observed. An exploratory environment for evaluating different feature sets and their transformation is provided in the C-V EXPLORE program detailed in Appendix A. The algorithmic techniques for feature selection are many and varied, and are detailed in an excellent review by Meisel [6].

-166-

A common technique for feature selection by transformation is canonical variate analysis. This is similar to principal components analysis, Karhunen-Loeve expansion [12,25,26], and factor analysis, all of which are variations on the well-known method of analysis of variance (ANOVA) [27].

Canonical Analysis

The basis of this technique is to find an optimum linear combination of the basic feature set to minimise the effects of pattern variability within a class, and enhance the variation between different classes. To assess the reliability of the resulting classifier, some figure of merit which reflects the degree of classification must be sought. It is not enough to use the differences between the group means, as these can be affected simply by scaling the feature set. Instead some normalised measure which takes account of both the size, shape and separation of the clusters is required.

In canonical variate analysis, this measure is the Fisher criterion [1], namely the ratio of the between-class and within-class sums of squares:

= between-groups spread SS B (6.9) within-group spread - SS W As illustrated in figure 6.20 for a simple two group, two dimensional example, these values are derived from a projection of the pattern groups into a lower-dimensional space, in this case, a line. Each projected data point can now be represented by a new variable, e, the distance along the line from the origin. Obviously, both the separation

Figure 6.20: Illustrating dimensionality reduction by projection

- 167 - between the projected means and the degree of the projected spread are dependent upon the angle of the line. Optimisation thus takes place by finding the orientation of the line which maximises our figure of merit, as shown in figure 6.21.

In N-dimensions, for the two group case, with normal distributions and equal group covariances, the expression for e is given by equation 6.10 which is just an expression for the linear discriminant function:

e =vx1 (6.10)

where vI are parameters describing the orientation of the hyperplane. When there are not 2, but G groups to analyse, there will often be more than one solution for the hyperplane that best separates the pattern classes. The second and subsequent solutions, however, must be subject to the constraint that they are uncorrelazed with the first to avoid duplication of the initial solution. Each of these solutions, which maximises the Fisher criterion, is termed a canonical eigenvector. There are at most min(G —1, n) of them, where n is the number of dimensions in the pattern space prior to transformation. Each eigenvector, c, has associated with it an eigenvalue, X i which is proportional to its 'separating power'.

If W represents the within-class covariance matrix, and B the between-class covariance matrix, then the purpose of canonical analysis is to find a new vector which maximises the criterion:

Selp

Figure 6.21: Two-dimensional example of the effect of changing the line of projection on pattern separability

-168-

ssB - K (6.11) = SS - Differentiating K with respect to j and equating to zero gives:

dX =0 = (6.12) di ('Wj)2 which simplifies to:

2[B.i —My J =0 (6.13) WY_ which in turn yields:

By- = XWj (6.14)

and thence if W has an inverse, an ordinary eigenvalue problem results:

W 1Bj = Xi (6.15) For two classes there is only one eigenvector, but for G classes, matrix B is of rank r = min(G —1 ,n), where r is the number of dimensions of the transformed feature space defined by the r canonical eigenvectors. In the multi-group case, equation 6.15 can be shown to be [12]:

W 1BV = VA (6.16)

where V is the matrix of eigenvectors and A a diagonal matrix of eigenvalues, K 1 . The implementation of the algorithm to extract the canonical eigenvectors from the basic feature set is documented in Appendix A.

Having obtained the requisite eigenvector matrix, V, all that is necessary to transform the feature space to select an optimum set is a matrix multiplication:

11 V12 Vp. V21 (11,12..... 18) : . (fl ' , 12 ...... fr') (6.17)

V81 V81

where r is the dimensionality of the reduced pattern space, and f' a new feature. As an example and test of the algorithm, the results of applying the canonical technique to the analysis of a classic, four-dimensional data set (Fishers iris data [1]) is shown in figure 6.22.

-169- 13 * IRIS SETOSA - EJ IRIS VERSICOLOUR L 11.8 - IRIS VIRGINICR U -* () 10.8 013 EP A * 0 ) 9.4 - 0 C - 4 0 Il * 0 8.2 0 r • 0 I I I -11 -6.4 -1.8 2.8 7.4 12

I% -t_ I'- 1 Figure 6.22: Canonical eigenvector transformation applied to Fishers Iris data [1]

Viewing the data in this way along the two most significant eigenvectors (the axes in the directions of highest information content), enables the natural grouping of the pattern classes, in this case three species of iris, to be observed. The canonical transformation has also improved the classification accuracy to greater than that which could be achieved from using the four basic features [12]. The advantage of using a transformation of the entire feature set, rather than selecting a subset, is that information from all the features is included.

The application of the canonical method for variable selection by transformation to the identification of different classes of C-V curve is detailed in section 6.3.5.

6.3.4. Distance Classification

As described in section 6.2.2 the simplest classifier conceptually, is one which uses 'distance' as a measure of similarity. The Euclidean distance metric defined in equation 6.5 can be generalised to N-dimensions thus:

dJ = (xjk -Xft )2 (6.18)

where dJ is the squared Euclidean distance between two points and

However, this measure does not take account of features which are of differing magnitudes, nor of clusters which are non-symmetrically distributed. By normalising each feature, x to zero mean p, and unit variance, a, we may at once overcome these

170- deficiencies:

X f = x-p. (6.19) U where x' is the normalised feature. In more than one dimension, however, it is important to take into account any correlation there may be between separate features. A popular distance metric which evaluates these correlations via a matrix of covariances, V, is the Mahalanobis distance, D 2 [2,12,27,25,9,5].

D2 = ( _ .)TV_l(,& (6.20) 2 where Di is the distance between the mean of the j th cluster and the prototype point , as depicted in figure 6.23.

This is the inner product of the cluster mean vector and the prototype point, and can be thought of as the simple Euclidean distance modified by a transformation which rotates (decorrelates) and scales the basis of the feature space [5].

The importance of this transformation, termed 'statistical whitening', defined by the inverse covariance matrix C' is explained in figure 6.24. Note then, that prior to

the 'whitening process, two dissimilar prototype pointsx, and S2 will be calculated as having the same Euclidean distance, even though it is clear that z2 is much less likely to belong to the cluster. The whitening affects the D 2 calculation to take account of

Cluster B

Cluster A

Cluster C LkIL

• = prototype point X =cluster mean Figure 6.23: Nearest-neighbour classification using the Mahalanobis distance measure

-171-

the ovaloid shape and scale the distances to account for the asymmetry in the distribution of points in the training set.

In series form, the Mahalanobis distance can be evaluated thus:

NN 2 1 2 Di (x,. Prj ) V' 3 (x, — p31) (6.21) r1s 1 where Vr3 is the element in the rth row and sth column of the inverse covariance matrix V in equation 6.20.

Class membership can thus be defined by:

ES ifD 2(. ,) =minD2(. ,

) (6.22)

Equation 6.20 assumes that each cluster is multivariate normally distributed, and has equal covariance, V. However, in many cases, the means, variances and covariances must be estimated from the sample data. In this situation V can be calculated from a pooled estimate C, of the covariance matrices for each pattern class:

Equidistant 4 fr 4 so Beforerescaling: "gnment difficult)

After rescaling: 0 0 (I) (simple assignment)

Nearest neighbours

Figure 6.24: Effect of re-scaling co-ordinate axes on distance classification

-172- G [(nj —i).Cj = j = G (6.23) (n-1) j=1

where C1 is the covariance matrix for the j th cluster and n• is the number of samples within that group. G is the number of groups. In many practical situations, it is

preferable not to use a pooled estimate of V, but to evaluate each C. separately. D 2 can thus be evaluated by:

DI = (6.24) where

1 " Lj ni =o and

nj Cf = L t,lj.mj_j.?J j j=0 If desired, C1 can be evaluated recursively as given by equation 6.25 using an algorithm similar to that for r . in Chapter 5.

1 (k —1)(,& —,)(& _L )TC 1 (k —1) Cf 1(k) = CJ1 (k —1)— C 1 (6.25) 1+TC/1 (k —1)(s.

This is useful in the implementation of an adaptive trainable pattern classifier as implemented in CV-ASSIST, described in Appendix A.

The mean vectors of the pattern clusters can be updated similarly:

j(k)= f[(k_1),(k_1)+x(k)] (6.26) Significance Testing

It was found in practice that the Mahalanobis distance itself was not a particularly satisfying criterion for assigning or rejecting a new sample point from an established set of pattern clusters. It was decided, therefore, to seek a normalised scale which would also have more meaning to an operator. Such a scale is the percentage confidence

rating which can be derived directly from D 2 by evaluating it as a chi-squared (x2 ) variable. If each cluster can be considered multivariate normally distributed, then D 2 actually follows a x2 distribution with n degrees of freedom, where n is the number of

-173- features [27]. This process is not as straightforward as we should like it to be however, since the translation of distances between patterns into probability ratings is complex. Furthermore, to expect any meaningful relationship between the confidence rating and the actual visual similarity of the patterns is 'unrealistic' [9] because of the subjective nature of the 'likeness'. This aside, the model is an adequate one for the purposes of pattern recognition.

The application of the x2 test enables the classifier to be equipped with a so-called 'reject-option' [12,28], which prompts the expert or 'teacher', during initial training, with the following options should the confidence level fall below some preset threshold:

Not to include the sample in the training set

To create a new category of pattern

This would indicate that the current pattern was significantly different from any encountered so far and thus prevent statistical outliers from distorting the pattern clusters. Of course, option 2) should be invoked only when the expert is certain of the uniqueness of any particular C-V curve, else the number of pattern groups would be allowed to grow uncontrollably.

6.3.5. Performance Evaluation

Perhaps the most difficult aspect of building a pattern-recognition system is assessing the effectiveness of the classifier. An obvious method to estimate the misclassification rate is simply to test the fully trained system with the design set of patterns. Not surprisingly, this gives a rather optimistic result known as the apparent error rate. James [12] defines large to mean more cases than ten times the number of features used. Ideally, we would wish to use separate design and test sets as defined earlier. Once training is complete, the error rate is determined by testing the independent test data set. However, one could argue that an overall better classifier would have resulted if all the data sets had been used for training in the first place. This is a very common 'Catch-22' situation in pattern recognition applications. However, for small design sets, some other techniques must be used, such as statistical estimation methods [29] or the more empirical 'leaving one out' method described by Lachenbruch [30]. This latter method is attractive for its simplicity. One only has to classify each of the design set points in turn on a classifier trained set upon the

-174- remaining (n —1) points, and an error rate very close to the true value results. A disadvantage with this technique, however, is its computational complexity, particularly as the training set grows when more examples are added.

Because of the small sample size, it was decided to simply use the apparent error rate as a performance measure, but to carefully select the members of the training set. James comments that any classifier can be improved by 'refusing' to classify the poor or borderline cases [12]. For this the reject option was employed, since it is clear that the most significant changes in the error rate occur as a result of including the 'doubtful' classifications.

Results of Testing the C-V Curve Classifier

The covariance matrices and mean vectors for the Mahalanobis distance calculations were recursively computed using a wide variety of samples from three classes of curve shape:

• 9 different ion-implants (VT adjust),

• 10 different oxides affected by interface traps (no post-metallisation anneal), and

• 15 different unimplanted, 'clean' oxide samples.

During training, the error rate was observed as it dropped to an acceptable level indicating that a sufficient number of examples had been presented. The success of the classification scheme was then evaluated using a x2 confidence test to assess the significance of the Mahalanobis distance. The resulting percentage confidence level then becomes a convenient, normalised gauge for quantifying two figures of merit for the classifier [25]:

Intra-class similarities show how well the samples within a pattern category were classified. The majority within any class were identified with 90% confidence or greater as shown in figure 6.25. -

The inter-class differences were calculated similarly to illustrate how well samples from one class were rejected from the other two. The majority tall well below the 90% threshold as can be seen from figure 6.26.

-175- 0 - 1 2 3 4 5 6 7 -. sample Figure 6.25a: Bar-chart showing intra-class confidence levels for the interface traps class, (1) Confidences of >90% indicate successful recognition within this category

% confidence 100

80

60

40

20

0 1 2 3 4 5 6 7 8 sample

Figure 6.25b: Bar-charts showing intra-class confidence levels for the implanted class(2)

The x2 test enables the classifier to be equipped with a 'reject option' indicating that the pattern is significantly different from any in the database. A significant result was deemed to have 90% confidence or greater. This process is illustrated by an example in figure 6.27 which shows the appropriate curves, extracted features and the resulting decisions between the three classes for a metal gate MOS capacitor. Quite obviously this is a sample that is representative of an MOS capacitor with a high interface trap density since the x2 test gives a confidence level of 95% for this symptom. This would

-176- % confidence 100

80

60

40

20

0 1 2.3 4 5 6 7 8 9 1011 sample

Figure 6.25c: Bar-charts showing intra-class confidence levels for the uniform class(3)

% confidence 100 • 1:2

80 RR 1:3

60

40

20

0 LP'.A - -. LFA P%A 1 2 3 4 5.6 7 sample

Figure 6.26a: Inter-class rejection ratios between samples in class 1 from 2 and 3. Confidences <90% illustrate successful rejection from the other categories. then provide the stimulus to invoke further measurement routines to analyse trap energy and distribution, whilst warning that any doping profiles generated must be corrected for the effects of interface trap distortion in the C-V curve.

During subsequent trials on more samples, it was found that there was insufficient discrimination between some groups. In particular, the clusters representing uniform and non-uniformly-doped substrates appeared to be very close together, though distinct. The effect this had on the inter-class discrimination can be seen from figure

- 177 - % confidence 100

80

60

40

201

0 'Il ( N 11 d N _J 1 2 3 4 5 6 7 8 sample Figure 6.26b: Inter-class rejection ratios between samples in class 2 from 1 and 3.

% confidence 100

80

60

40

20

0 1 2 3 4 5 6 7 8 9 10 11 sample

Figure 6.26c: Inter-class rejection ratios between samples in class 3 from 1 and 2.

6.26 c. Although almost all the examples fall below the 90% confidence limit, this threshold is arbitrary and its value obviously critical to the function of the classifier. The method of canonical analysis was then used to emphasise the differences between the different curve-shapes.

Performing canonical analysis, on the 8-feature set of section 6.3.2 improved classification as can be seen from the intra- and inter-class results in figures 6.28 and 6.29. The upper confidence threshold although now smaller, (-60%) is considerably different from the lower, class-rejection threshold, (-30%) and so gives a wider

- 178 - 44

S Peak wi

0 E 0 4 4. 0 - I •0 a'- a. I U o a, I- I. I- ; . I • I • I • I a a Vol now Vol %sg

I

4. Features: C (2). Magnitude of 2nd peak in dC/dV (.064 xC) 0 I. (4). Ratio of 1st to 2nd peaks in dC/dV (.16) a q. (6). Magnitude of G, vs V peak (1.4 zlO")

0 Diagnosed as: INTERFACE TRAPS with confidence of: 95% 4. I Rejected from: IMPLANTED (<1%) Rejected from: UNIFORM (57%)

a

VC 1 It skao

Figure 6.27: Example output from CV-ASSIST showing curves produced, features extracted, and diagnostic results

% confidence 100

80

60

40

20

0 1 2 3 4 5 6 7 8 9 10 sample

Figure 6.28a: Intra-class results after canon icalisation for; Class 1: Interface traps, margin between the recognition and discrimination between classes of C-V curve. The final set used which gave a consistent classification performance was the subset {1,2,4,5,6,8} of the eight available features. One of the criteria used in this choice was

-179- % confidence 100

80

60

40

20

0 1 2 3 4 5 6 7 8 9 sample Figure 6.28b: Intra-class results after canonicalisation for; Class 2: VT -adjust implant

% confidence 100

80

60

40

20

0 ------I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 sample Figure 6.28c: Intra-class results after canonicalisation for; Class 3: Uniformly-doped substr

obviously the ease and reliability of feature extraction.

The ability to view the the multi-dimensional pattern clusters in just two dimensions was found to be an extremely useful tool permitting a visual comparison of the suitability of different feature sets to be made. Figure 6.30 shows how the cluster shape and separation can be improved by adding further 'tuning' features to the basic set. The CV-EXPLORE software described in Appendix A allows for experimentation with different feature sets, and for the performance of the classifier to be evaluated in an objective manner.

-180- % confidence too

80

60

40

20

0 XI AI Xj - ) J X I XJ X1 I 1 2 3 4 5 6 7 8 9 10 sample

Figure 6.29a: Results after canonicalisation showing the inter-class rejection ratios. These charts illustrate successful rejection from the other categories.

% confidence 100

80

60

40

20

0 1 1 2 3 4 5 6 7 8 9 sample

Figure 6.29b: Results after canOnicalisation showing the inter-class rejection ratios.

Figure 6.30 a shows a scatter plot of the three sample clusters along the two most discriminatory axes of the feature space for a 3-feature subset {2,4,6}. The 'clusters' appear more as radial lines and are quite close together at the origin. Figure 6.30 b illustrates that by using all 8 curve-shape descriptors, the clusters become more rounded and move further apart.

Two words of caution are appropriate here for the use of the canonical analysis method in this instance. The method relies upon a pooled estimate of C to evaluate

_181- % confidence 100 M 3:1 80 - RR 3:2

60 -

40 -

20 -

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 sample Figure 6.29c: Results after canonicalisation showing the inter-class rejection ratios.

.t p.- 1 - .t. •1

S * a .6 I .1 N S N a I. 1. 0 0 4. 4, 0 0 >S >S -1 13 C - 0 S Af 13 13 13 13 a 4 a IA 13 U U A -3 .6• 13 -a .5 a a

• • I • I • I I -1 -I- • • • • -5 -3.6 -1.2 1.2 1.6 5 -4 -.2 1.6 7.4 IIA 15

E1gsri.car 1 ia.rI..'.ator 1 Figure 6.30: Scatter-plots of the clusters in 2-D after canonical analysis of: a) the {2,4,6} feature subset and b) the complete set of features.

the canonical eigenvectors, and yet it is apparent from figure 6.30 that the cluster shapes are visibly different for each group. Furthermore, it was found that although the classifier accuracy was improved, the resulting system becomes more selective due to the dimensionality reduction, and thus the classifier is more judicious about which new samples to accept into the group. These are minor problems however, and the benefits of applying this technique were felt to more than outweigh the mild disadvantages.

_182- 6.4. A Shape-Addressable CV Curve Database

In extensive trials, the recognition system occasionally suffered from misclassifications primarily because of missing or unreliable data which resulted in faulty features being extracted. The decision support mechanism discussed briefly in section 6.3.2 was intended to provide assistance to the operator to intervene, when the feature extraction routines were in error, and confirm, refute or modify the extraction on the basis of visual comparison with a known shape.

To maintain the flexibility of the trainable classifier, CV-ASSIST was enhanced to enable a 'nearest match' function to select representative curves from a content- addressable database. This facility was extended to allow comparisons to be made from examples in the current training set, in the case of a new shape being encountered. An operator is thus better equipped to make correct decisions even for C-V curves whose characteristics had not been previously stored.

The procedure for retrieving the nearest match to the measured curve is illustrated by figure 6.31 and is as follows:

The Mahalanobis distance classifier is used to establish the most probable class of curve shape

Within that class, a simple Euclidean measure is used to find the 'closest' stored curve by searching through all examples in that class

A look-up table is then used to extract the corresponding data file name, and the stored curve retrieved for comparison.

Features from Filename of measured •Ø [MATCHER. . sample in C-V curve database

T IF Measured NFisual Stored curve comparison curve I ACTION Figure 6.31: Schematic outline of the nearest-match function

-183- An example of this procedure in use is for the location of the peak(s) in the differential C-V data. Despite the smoothing and averaging operations, it is still possible for spurious effects either to give rise to additional peaks or mask those of interest. The FNTwo-peaks function documented in Appendix A provides an override facility if the operator decides that the measured - curve appears grossly dissimilar dV to its nearest equivalent. This would then signal to the operator that either the data is of poor quality or that extra precautions are to be taken in the location of the peak(s).

6.5. Extending the Classifier

The database and query facility was extended in order to allow a provision for the creation of new pattern classes if and when new or previously unknown faults are encountered. The addition of a trainable database and cluster generator thus completes the pattern recognition abilities of CV-ASSIST to cope with, and adapt to new and changing circumstances.

6.5.1. New Cluster Generation

Unlike conventional cluster analysis algorithms, CV-ASSIST is supervised by an expert so that greater control can be exercised over the generation of new categories of C-V and 0-V curve shapes. The mechanism for this process is depicted by the flowchart in figure 6.32. The processes to generate a new class of curve, or 'cluster' are: Known classes of QV curve: 'Apply the usual distance measure Uniform, 'clean' signifi canoe test ) Depletion implant Vt -adjust implant 1dence<80% Interface traps Get nearest sample in database from most lRikel class Visual Same Add to dambase expert 10 and re-train opinion C classifier

NEW CLASS Create new cluster

Figure 6.32: Flowchart of operations during the creation of a new pattern class

-184- Add new features if necessary, to positively identify subtleties of the new characteristic

Create the necessary directory structure and database files

Calculate new mean vector and covariance matrix

Add code in the form of a rule or function to handle a new visual comparison and/or initiate additional measurements

Name the cluster (e.g. 'leaky-oxide')

Code exists for phases 1-3 and 5, however, phase 4 has not been automated since HP-BASIC does not readily support self-modifying code. Details of the software can be found in Appendix A.

6.6. Conclusion and Discussion

The techniques described in this chapter have demonstrated how a pattern- recognition scheme can be applied to the interpretation of C-V curves.

One drawback of statistical pattern recognition techniques is the requirement for all the information regarding a measurement to be simultaneously available, in the form of a feature vector, to enable an interpretation to be made. Whilst the decision support mechanism described for feature extraction is an invaluable aid for dealing with uncertain or incomplete data, there is still a requirement to 'fill' these gaps in our knowledge of the device characteristics. The need to apply the right tests, comparisons and assumptions at the appropriate moment also remains. It would be interesting to compare the effectiveness of more complex syntactic methods of pattern-recognition in future work.

The next chapter discusses the application of a rule-based 'expert' system to the control of C-V, G-V, C-t and QSCV measurements, using the pattern recogniser described here as a front end. The pattern recognition system • thus provides for the translation of the numeric to symbolic data, and the knowledge-based system for the scheduling of all test and analysis procedures. This hybrid approach is a natural extension to the idea of a trainable diagnosis and control system, with an established knowledge-based shell providing the functions of automatic rule generation and hypothesis testing.

-185- REFERENCES R. A. Fisher, "The use of multiple measurements in taxonomic problems," Ann. Eugenics, Vol. 7, pp. 179-188, (1936). P. C. Mahalanobis, "On the generalised distance in statistics," Proceedings Of The National Institute Of Science (India), Vol. 122, pp. 49-55, (1936). B. Chandrasekaran and A. God, "From numbers to symbols to knowledge structures; artificial intelligence perspectives on the classification task," IEEE Transactions on Systems, Man and Cybernetics, Vol. 18, (3), pp. 415-424, (May/June 1988). P. Bartal and L. Monostori, "A pattern-recognition based vibration monitoring module for machine tools," Robotics and Computer Integrated Manufacturing, Vol. 4, (3/4), pp. 465-469, (1988).

D. J. Hand, Discrimination and Classification, Wiley, Chichester (UK), (1981). W. S. Meisel, "Computer-Oriented Approaches to Pattern Recognition," Vol. 83 in series: Mathematics in Science and Engineering, Academic Press, London, (1974). (2nd printing)

S. Watnabe, Methodologies of Pattern Recognition, Academic Press, New York, (1969).

B. G. Batchelor, Practical Approaches to Pattern Recognition, Plenum Books, London, (1974).

P. A. Devijver and J. Kittler, Pattern Recognition- a Statistical Approach, Prentice Hall, London, (1982).

R. 0. Duda and P. E. Hart, Pattern Classification and Scene Analysis, Wiley Interscience, (1973).

K. S. Fu, Syntactic Methods in Pattern Recognition. Mathematics in Science and Engineering (Vol. 112), Academic Press, New York, (1974).

M. James, Classification Algorithms, Collins, London, (1985).

F. B. Hildebrand, Introduction to Numerical Analysis, McGraw - Hill, New York, (1973). 0. G. Selfridge, "Pattern recognition and modern computers," Proceedings Western Joint Computer Conference, pp. 91-93, (March 1955).

R. M. Glorioso and F. C. Colon-Osorio, Engineering Intelligent Systems (Concepts, Theory and Applications), Digital Press, (1980).

N. J. Nilsson, Learning Machines- foundations of trainable pattern-classifying systems, McGraw-Hill, New York, (1965).

M. D. Levine, "Feature extraction- a survey," Proceedings of the IEEE, Vol. 57, (8), pp. 1391-1407, (Aug. 1969).

-186- IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-8, (January 1986). Special issue on shape analysis

T. Pavlidis, "Editorial: papers on shape analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-8, (1), (January 1986).

K. R. Castleman, "Digital Image Processing," (Signal processing series), Prentice-Hall, (1979). Chapter 16: Measurement and Classification

M. A. Fischler and R. C. Bolles, "Perceptual organisation and curve partitioning," IEEE Transactions Pattern Analysis And Machine Intelligence, Vol. PAM! 8, (1), (January 1986).

A. Rosenfeld, "Figure extraction," in Automatic Interpretation and Classification of Images, ed. A. Grasselli, pp. 137-154, Academic Press, London, (1969). (Chapter 6)

F. Attneave, "Some informational aspects of visual perception," Psychological Review, Vol. 61, (3), pp. 183-193, (1954). J. A. Walls, A. J. Walton, J. M. Robertson, and T. M. Crawford, "Automating and sequencing C-V measurements for process fault diagnosis using a pattern recognition technique," in Proceedings ICMTS 1989, Vol. 2, pp. 193 - 199, Edinburgh U.K., (March 1989).

Sing-Tze Bow, Pattern Recognition- Applications to Large Data-Set Problems, Marcel Dekker, New York, (1984). (Electrical engineering and electronics series)

Y. T. Chien and K.-S. Fu, "On the generalised Karhunen-Loeve expansion," IEEE Transactions on Information Theory, Vol. IT-13, pp. 518-520, (July 1967). B. F. J. Manly, Multivariate Statistical Methods- a Primer, Chapman & Hall, USA, (1986).

M. E. Hellman, "The nearest-neighbour classification rule with the reject option," IEEE Transactions on Systems, Science And Cybernetics, Vol. SSC-6, pp. 179-185, (1970).

P. A. Lachenbruch and M. R. Mickey, "Estimation of error rates in discriminant analysis," Technometrics, Vol. 10, pp. 1-11, (1968). P. A. Lachenbruch, Discriminant Analysis, Hafner Press, New York, (1975).

-187- Chapter VII Knowledge-Based Control

7.1. Introduction

As discussed in the concluding remarks of Chapter 6, the statistical pattern recognition system, although extremely useful in interpreting shape information, is by no means a complete solution to the problem of automating C-V measurements. Experience with the measurement technique has indicated that there are three key aspects of the C-V method:

• Complete diagnosis is progressive, and often the result of several attempts and different measurements.

• The progressive nature of the task implies that there is a need to perform operations in an order dictated by the actual device under test such that all pre-requisites are satisfied.

• There is a requirement to be able to handle some degree of uncertainty, and to cope with bad data gracefully.

This implies, therefore, that a completely automated system could be accomplished by the following three processes:

• Interpretation (pattern recognition and matching)

• Scheduling (planning the sequence of events)

• Reasoning (ability to hypothesize and test)

The first phase, that of interpretation, has been partially tackled by the SPR techniques described in Chapter 6. There remain, however, the problems of dealing with data that does not conform to well-defined categories. A major consideration for C-V is the variability of data for which an experienced operator is often called for. In the author's opinion, it is often the exceptions to the known faults that are of most interest and greatest concern to the process engineer. Some other means, therefore, of identifying these exceptions and being able to continuously update some 'knowledge

-188- base' is required.

The scheduling ability is required to sequence the application of tests, analyses, interpretations and assumptions in a way that maximises the usefulness of all the available information as and when it becomes available. By pursuing multiple goals and maintaining a number of different viewpoints, a more global picture of the device can be built up. By combining all the information from the various measurements applicable to the MOS capacitor, it may be possible to overcome the limitations of any one measurement in isolation. Through an intelligent choice of methods and algorithms, it may become easier to optimise the individual measurements by ensuring all their pre-requisites are met. Furthermore, by allowing tests to be made only when they are needed, we may improve the overall efficiency of the measurement system.

A reasoning capability would allow the overall flow and control to become an intelligent sequence of events. The ability to hypothesize and test assumptions is a very powerful problem-solving method and allows for a final diagnosis to be quickly and accurately made by focussing on the particular characteristics of the device under test. Reasoning also allows for the reduction of operator input, which is desirable for automation, and for the proper handling of the uncertainty of bad or missing data. This is accomplished by the addition of rules-of-thumb or 'heuristics' to the reasoning process.

7.2. Artificial Intelligence

Knowledge-based systems technology is a branch of artificial intelligence (Al) concerning the solutions to problems that would normally require a significant amount of human expertise. Other fields of Al include computer vision, robotics, planning and natural language processing. In general terms AT can be epitomised as:

'...the study of ideas which enable computers to do the things that make people seem intelligent' [1].

Al is thus concerned with the simulation of human behaviour. This is. particularly difficult since there is no general agreement as to the mechanism of our own intelligence. Al has thus drawn upon the resources of many distinct areas of research, including Cognitive Science, Engineering, Computer Science, Psychology and Philosophy. A brief historical perspective on Al research can be found in reference

[2].

_189- Artificial intelligence has received a great deal of media attention in the last 20 years and this has resulted in two parallel phenomena.

An apparent distortion of its abilities and an inflation of the expectations of this research field to deliver results. This has made it very easy for the sceptics to criticise Al.

The attraction of an immense amount of Government funding. This has resulted in a lasting resentment from other research fields.

Whilst Al has not been as forthcoming as expert systems salesmen would have us believe, there have been a number of barriers to the projection of these new techniques into the areas where they would be of most use. This is partly a question of access - the technology has never been particularly amenable or welcoming to outsiders. It is possible to blame this on the obscure jargon, the high software costs and the need for specialist and/or expensive computing hardware. Additionally there is the unwillingness to change, which inevitably blocks the introduction of innovative methods, particularly those which require their problems to be better formulated and

more clearly defined. Knowledge-based systems in fact encourage a more structured approach to problem solving.

The general problem from which Al suffers is a lack of awareness of its abilities amongst potential users. This is of course characteristic of any cross-disciplinary science, particularly one with such wide application. Earlier exposure may help to overcome the doubts and misunderstandings surrounding techniques that are very likely to change the face of automation and increase the productivity of enginering knowledge [3].

7.3. Knowledge-Based Systems

Knowledge-based systems (KBS), often undeservedly termed "expert systems" are difficult to define, but must incorporate certain essential features to differentiate them from 'ordinary' computer programs:

• 190. The ability to represent, manipulate, and process knowledge and symbolic information.

Knowledge regarding the particular problem domain and the control or 'inferencing' process is separate and isolated.

Processing through the efficient, heuristic manipulation of a large body of knowledge.

A KBS should be able to explain its actions and decisions at any stage during processing.

The KBS should support and manage conflicting or incomplete knowledge (truth maintenance)

The problem-solving mechanism is an opportunistic process and not an algorithmic one.

The need for a KBS in the context of C-V measurement automation is given by a number of considerations:

• There is a shortage of C-V experts and the knowledge that they process is difficult to duplicate and distribute.

• The control structures offered by conventional systems and programming languages do not readily support the desirable facilities listed above.

• There is a need to automate C-V measurements to render them more reliable, usable and accurate. The need for non-standard procedures such as recognition, scheduling and reasoning is recognised.

The successful applications of KBS technology span many disciplines as described in the common textbooks by Winston [1], Forsyth [4], and Nilsson [5]. New applications are also in abundance [6, 7, 8, 9, 10, 11].

There are some key elements of the problems listed above that actually render them better suited to solution by a KBS. Problems that do not embody some or all of these attributes will probably not benefit from KBS techniques, and are invariably more efficiently tackled using conventional languages and methods. The tasks that KBS best perform are listed below:

_191- Diagnostic-type problems for which there are several possible explanations.

Intuitive or judgemental tasks such as forecasting or law, when there is no clearly established theory.

C. Applications that require specialist knowledge and subjective skills for which human expertise is either scarce or expensive.

d. Problem areas where the data is noisy or unpredictable, and uncertainties must be handled.

However, there are a number of attached conditions which must also be satisfied if KBS methods are to be applicable.

There must be at least one human expert on the subject area, and there should be some general agreement about the problem solving method.

The project boundaries should be clearly defined with a definite goal.

There should be some obvious economic benefits to the production of a working system, since a fully functional KBS can be expensive to produce.

The problem domain should be small.

The last point merits special considerations. It has long since been realised that the ultimate goal of a 'general problem solver' [12] is an unattainable dream. KBS are most effective, like human experts, when they are allowed to specialise.

By contrast, KBS are not applicable to the following limited set of problems:

Tasks that involve common sense knowledge, since this is difficult to define.

if the problem can be solved using conventional techniques, it is unlikely to benefit from KBS.

C. Problems that are not understood and not currently solvable are not generally advisable for a KBS application.

Despite the restrictions, KBS can offer the initiator a number of rewarding benefits, as industry is slowly beginning to realise:

192- Special knowledge and expertise becomes more accessible.

KBS can handle far larger stores of knowledge than an expert, and more reliably too.

Reasoning processes can be controlled, formalised, and unbiased.

Successful KBS are undoubtably cost effective.

The development of KBS can provide a deeper insight into the nature of a problem and even improve current understanding.

Having established when and where KBS are applicable, it is now appropriate to describe what a KBS actually is, how it functions and how it might be implemented.

7.3.1. The Components of a Knowledge-Based System

Any knowledge based system must have two vital components: a knowledge base and some sort of 'inferencing engine', i.e. a mechanism for selecting and applying the knowledge in the knowledge base. Many inferencing mechanisms claim to be domain independent, with the specific strategies for solving a particular problem embedded within the knowledge base. In practice, however, there are many variants which perform better in some environments than in others. The knowledge base is the functional equivalent of 'data' in a conventional computer system and is a record of an expert's knowledge appropriate to the particular problem to be solved. In addition to these two necessary components, a KBS may optionally include a user-interface, an explanation generator and a knowledge acquisition facility. A notional structure for the arrangement of these constituent parts is illustrated in figure 7.1

7.3.1.1. The Knowledge Base

The knowledge base contains sufficient facts and heuristics to solve the class of problems specified. Facts are records of 'short-term' information regarding the current state of the solution. Heuristics are guidelines/rules-of-thumb which enable new facts and hypotheses to be created from the current facts.

There are a variety of schemes for knowledge representation, the most common being 'production rules'. The resulting system in this case is more often known as a rule-based system or a production system [13], and knowledge takes the form of modified IF-TI-LEN functions. Other means of representing knowledge include

-193- Explanation generator

User Interface Inference engine Knowledge base

Knowledge USER acquistion module

Mandatory Optional EXPERT

Figure 7.1: Schematic outline of a knowledge-based system (KBS) predicate logic [2] and structured objects (graphs, decision trees, frames, and semantic networks) [2]. Predicate calculus is used to prove a problem represented in logical terms, whereas structured objects can be efficiently manipulated in an object-oriented programming environment. Each representation must be equivalent in its ultimate function, but some schemes are more convenient to use in certain circumstances, hence the variety. The most popular form is the production rule for its expressiveness and ease of use, although it can suffer from disorganisation in the structure of the rules and a lack of subtlety. The resulting rule-based system may thus appear to suffer from short-sightedness, and be difficult to control.

7.3.1.2. The Reasoning Mechanism

In action, the reasoning process of a KBS uses the input data or internally generated facts to 'trigger' appropriate heuristics in the knowledge base. As an example, a rule which has all of its preconditions (IFs) met will be selected as an 'appropriate' heuristic. Next, the 'action' part of the rule (THEN) will be performed, the result of which will typically be to modify some dynamic data structure (sometimes known as 'working memory').

-194- There are essentially three problems that clearly differentiate KBS from conventional, procedural programs:

The choice of an efficient direction in which to search the problem space.

Dealing with uncertainty in a rule's conclusion.

The selection of an appropriate rule.

- In the choice of an inferencing mechanism one must be prepared to trade off factors such as efficiency and comprehensibility. The two major inferencing strategies are forward and backward chaining. Forward chaining involves reasoning from the known facts to plausible hypotheses. This data-driven strategy is characterised by a seemingly unconnected series of questions about the problems in pursuit of the next hypothesis. Backward chaining reasons from the goal to be established towards the facts needed to substantiate it. This process tends to appear more focussed in its search and, at times, unrelenting in pursuit of supporting evidence. The better systems emerging offer a compromise between the two by offering both.

Uncertainty is best handled by proven quasi-statistical methods such as Bayesian logic [14], fuzzy logic [14], or certainty factors, and most appear to work satisfactorily. Essentially the problem is to yield a sensible measure of confidence in a conclusion of a heuristic when the supporting facts themselves are say, only 80% certain.

Selecting an appropriate heuristic is simple, but choosing the best from several candidates is a conflict resolution problem. It is this aspect of KBS that makes them suitable for problems which are non-deterministic. Three common strategies [2] refraction, recency and specificity are often used together to ensure the best performance and to cope with exceptions, non-determinism and to ensure a continuity in the system's line of reasoning. In rule-based systems, 'refraction' ensures a rule only fires once on the same set of conditions, 'recency' prefers the newest facts, and 'specificity' selects the rules which have the greatest number of satisfied pre-conditions.

7.3.2. The Blackboard Architecture

An alternative to the control structure depicted in Figure 7.1 is the blackboard architecture [15]. This provides multiple viewpoints for the problem in hand and employs a group decision making process to effect the reasoning. The usual analogy of

_195- a blackboard system in operation is that of a panel of experts having a discussion, but communicating via the medium of a 'blackboard', as depicted by Figure 7.2. This enables different processes, even separate 'expert systems' to contribute to the problem-solving process. In contrast with the mechanisms of forward and backward chaining, the blackboard model allows for opportunistic reasoning [16], that is, knowledge from the knowledge-base is applied at the most appropriate moment. The blackboard model for problem solving has been shown to be extremely effective in a diverse range of applications including signal processing, machine vision, structural design, protein structure determination, and speech recognition. This is as a result of the generality of the reasoning process [17, 16, 18, 19,20,21,22].

The underlying functional mechanism of a blackboard system is necessarily complex, but the structure can be simply illustrated as in Figure 7.3. The components of a blackboard system are:

• Global database: The blackboard data structure on which information regarding the current state of the solution is held. Entries on the blackboard can be hierarchically organised to reflect the 'progressive' nature of the problem. Each entry generally has associated with it a great deal of information regarding its

ideas more facts supportive facts uncertain data theory I justifications I conjecturee L facts hypotheses J

Figure 7.2: Blackboard model analogy

-196- Blackboard bal database

Levels

Of

understanding

Arrows indicate * co-operation between the levels

Figure 7.3: The blackboard model for problem solving origin, its certainty, and its dependency on other entries.

• Knowledge sources: The knowledge required to attack the problem, be it domain or control knowledge, is partitioned into knowledge sources. In some systems, these divisions are arbitrary, and totally in the hands of the designer, thereby allowing complete freedom in the structuring of the problem. Knowledge sources may take the form of production rules, object-oriented procedures, or even small expert systems.

• Scheduler: An algorithm to decide upon the order of application of the various knowledge sources. A typical sequence during one 'cycle' of the system proceeds as follows:

_197- Add the initial entries (to awaken and start the system).

Decide which knowledge sources could use the new entries.

Construct an agenda of knowledge-source activation records (KSAR) which notes all the knowledge sources that might be worth invoking.

Order the agenda according to some algorithm and select the item from the top of the list.

Execute the selected task or KSAR and make any resulting changes to the blackboard.

The separation of KS execution and initial activation allows for the 'intelligent' opportunistic scheduling which distinguishes blackboard systems from simpler production systems. A typical blackboard architecture is illustrated in Figure 7.4.

A major effort in the development of a blackboard system is the partitioning of the blackboard into 'levels of understanding', and the division of the available data and knowledge into discrete knowledge sources. This will require an in-depth knowledge of many aspects of the problem. As an example, figure 7.5 illustrates the configuration of the HEARSAY-II speech-understanding system [17]. Note the combination of forward and back-Ward reasoning in this procedure. The backward process thus works on expectation (in this example, to hypothesize missing WORDS).

7.4. A CV-EXPERT System

The blackboard model was considered particularly well suited to the sequencing of C-V measurements because of its ability to handle and generate intermediate solutions, and for the opportunistic nature of the solution process [23]. This was considered to match the procedure by which experts perform measurements for fault-

Blackboard I ____________Blackboard SCHEDULER monitor i

Focus of J Scheduling reasoning • I queues

Figure 7.4: A blackboard system architecture

-198- Levels Knowledge sources

Database interface ISEMANT PREDICT STOP Phrase PARSE CONCAT I WORD-SEQ-CTL Word-sequence WORD) SEQ WORD-CTL Word VERIFY 0 RPOL )MoW (

Syllable )M

Segment SEG

Parameter

Figure 7.5: Knowledge-source and solution hierarchy in HEARSAY-I1 diagnosis, namely a process of trial-and-error, eliminating hypotheses when appropriate, but always being directed by the sum of the available data at any instant.

The different measurements (G-V, QSCV, C-t, etc.) that can be performed may be thought of as different knowledge sources, providing information that is unique, yet complementary as shown in figure 7.6. The need to apply these measurements in the correct sequence is addressed by the blackboard scheduler under the auspices of a separate 'control' knowledge source. As is evident from the earlier chapters, the process of performing these measurements is a complex, hierarchical, sub-divided problem, and this can be reflected in the structure of the 'levels' within the blackboard. The proposed organisation of the blackboard for the C-V controller is illustrated in Figure 7.7.

At this stage it is important to emphasise the controlling objectives in the development of the complete system. These can later take the form of 'rules' in the control KS:

Objective 1: To characterise an MOS capacitor as thoroughly as the constraints of the user, system and device will allow.

-199- c-v I P C-t measuremefltSl -v measurements j measurements

Doping Blackboard r Ouasl-static profile system measurements]

I Pattern User input recogniser Heuristics

Figure 7.6: The knowledge sources used within CV-EXPERT

Objective 2: To validate and reason through any assumptions made during the analysis.

Objective 3: To combine information from different knowledge sources, and choose the most suitable/most accurate/most dependable parameters.

Objective 4: To explain the reasoning, if required, in a simple fashion.

Objective 5: To assist and guide the operator in the event of unknown faults being encountered by listing a set of likely hypotheses.

Objective 6: Allow flexibility by permitting the updating of the knowledge base when required.

The 4th and 5th objectives are highly desirable in any knowledge-based system, but unfortunately are the most time-consuming to implement. In this investigation we therefore take these to be optional extras in the final system.

An example of the first four objectives enforced might be:

.200- (v) Rules to FURTHER knowledge about the device through additional 3-V. C-t & QSCV tests

(iv) Rules to integrate PATTERN CLASSIFIER results with PREUMINARY TESTS

Rules for an Increasing INITIAL CV SWEEP level of understanding

(ii) Rules for PRELIMINARY TESTS

0) Rules for EQUIPMENT SETUP

Depth of analysis Figure 7.7: Organisation of knowledge sources in CV-EXPERT

The user only wants to know the interface trap density at mid-gap, and the substrate doping concentration.

The substrate doping profile is determined to be uniform by an analysis of high and low-frequency C-V data.

If D, is low in value, the use of the conductance technique is advocated to determine it.

The user asks how Nsb was determined.

-201. Objective number 6 is difficult to achieve in practice in a system of such complexity because of the sensitive nature of the dependencies between the knowledge sources. This is where the capabilities of a truth maintenance system (TMS) can be applied. These are often based on a knowledge of the justifications of any action taken, which are recorded in order to build up a map of the inferences made [24].

7.4.1. The Edinburgh Prolog Blackboard Shell (EPBS)

The blackboard system 'shell' selected for this project, developed by Jones et al. [25,26], is modelled upon the 1-IEARSAY-il architecture [17] and written in the Prolog [27]. This approach allows for easy access to the underlying Prolog, itself a powerful theorem-proving language, for building knowledge sources and interfacing to the outside world.

The principal features of the EPBS implementation are the blackboard entries:

bb(Tag, Status, Index, Fact, Cf) and the support relationships which are used to maintain the logical consistency of the blackboard and provide a record for explanation purposes:

supports (supporting-tag, supported-tag)

justification (Tag, Rule-no)

A tag is a unique (integer) identifier supplied by the system for any particular entry. Status is used to denote whether a blackboard entry is 'in' or 'amended'. Index corresponds to the position of an entry on the blackboard and is used to define the 'levels' within the structure shown in Figure 7.7. Fact is a Prolog 'term' which specifies the information to be placed on the blackboard at the position marked by the index. Finally Cf is a confidence factor representing the 'degree of belief in the entry. By default, this is an integer in the range 0 -. 1000.

The knowledge (in this case, the practical aspects of the C-V technique) is represented in the EPBS as production rules. Although Prolog exclusively employs backward chaining in its execution, the KS rules are all data-driven. Each rule takes the general form shown below:

-202- IF condition THEN goal TO effect EST estimate

The 'IF refers to existing entries on the blackboard which must be satisfied to generate a KSAR on the agenda of pending tasks. The body of the rule (THEN), upon satisfaction of the condition enables an arbitrary Prolog 'goal' to be executed. This might be to issue a message to the user or initiate a C-V measurement, for example. As a consequence, entries on the blackboard may be added, amended or deleted to effect the rule. The estimate is a salience factor representing the usefulness of the rule and can be used, to some extent, to control the order in which the rules are used. In the default scheme, estimate is a non-zero integer, and rules with the lowest salience are considered first by the scheduler.

The syntax for the rules can be conveniently defined in Prolog notation where a comma (,) indicates a conjunction (OR) and semi-colon (;) a disjunction (AND):

Cf_test := X < N; X > N; (N < X, X < M).

Atomic—test := [Index_pattern, Fact_pattern, Cf_test).

Test := Atomic—test; not Atomic—test; notnow Atomic—test; holds Precondition.

Condition := Test; Test and Condition; Test or Condition.

Full details of the facilities offered by the EPBS can be found in the operating manual [25].

Examples of some of these rules are given below: These rules were developed from a practical familiarity with all aspects of the measurements and their interpretation as well as the typical problems encountered during routine C-V testing.

.203- Rule No. 38 if [zerbst_test, TauO. 9] and @ [initial, parameter (Pdope, pdope) ,true] then true to delete est 5

Rule No. 7 if [prelim,param(gox,Gox) ,true] and Gox>1000 then write('warning: possible leaky oxide') to add [prelim,assume(gox, leaky) ,true] and [prelim,assume(cox,unreliable) ,true) est 2

Rule No. 24 if @[prelim,assume(cox,unreliable) ,true] and [prelim,param(type,Type) ,true] and [set_up, parain (vacc , Vacc) ,true] then (write('Cox unreliable, reducing Vacc...'), Vaccnew is Vacc-2, instruct basic (command, "Vacc="Vaccnew)) to amend [set_up, param (vacc , Vaccnew) , true] est 1. % initiates re-measurement of Dox

where each item in [brackets] is a blackboard entry.

Rule 38 can be translated to imply that should the device be determined, with at least 90% confidence, to have a low minority carrier lifetime from a Zerbst analysis of the C-t data, then pulsed doping density and all other parameters and assumptions derived from it should be withdrawn from the solution, since they are no longer valid. Rules 7 and 24 provide a mechanism for ensuring the oxide capacitance is correctly determined from a sample with a leaky oxide. Rule 24 fires off rule 7 and initiates the

start of a new DOX measurement by amending the accumulation voltage parameter upon which this measurement depends. i.e. the appearance of a new 'Vacc" fact on the blackboard causes the rule controlling D01 measurement to fire again. These two rules mutually recurse until GOX falls below 1000ii.S or until a third rule, designed to

-204- prevent an overvoltage, halts the measurement sequence.

7.4.2. Interfacing to EDUCATES

A hybrid system was envisaged to be the most convenient approach, whereby the instrumentation, measurement and analysis software (EDUCATES) is controlled directly by the blackboard system. This enables the software investment in EDUCATES to be preserved whilst allowing for opportunistic sequencing of the component sub-modules via external control. Another advantage of this integrated approach lies with the ability of the knowledge-based system to assimilate its own raw data, a facility which is frequently lacking in many other 'expert systems'.

A further requirement, determined by cost and ease of use, was for the entire system to run on a single computer. Similar 'expert control' systems have necessitated the use of two inter-linked personal computers, one running the KBS, the other the control and instrumentation programmes, together with a separate data acquisition unit [28,29]. These could hardly be considered a commercially viable proposition, since the two PCs are rarely able to operate in parallel, and thus remain idle for half the time.

To achieve the desired integration, a multi-tasking was employed. Furthermore, since the EPBS and EDUCATES are interpreted-language implementations, both require separate display screens for communication with the operator. For a single-screen system, a windowing environment was therefore necessary. Within the X-windows * environment, running under HP-UX (Hewlett Packard's implementation of UNIX),t HP-BASIC [30] and the EPBS can run interactively and concurrently.

The Prolog to BASIC instructions are effected by simulated keystrokes at the BASIC keyboard. Data and other information is exchanged via data files, whilst co- operation between the two programming environments is achieved through command files, with an elementary protocol for timing and message handling. A diagram of the overall scheme is illustrated in Figure 7.8. This dual-language environment has previously been termed 'expert control' [31].

* Copywcite Massachusets Institute of Technology t UNIX is a trademark of Bell Laboratories.

_205- Expert System Graphss

knowledge-based logical Inference + 111110- opportunistic control uncertainty handling X-win 4 i 4 BASIC (RMBUX) Interface command files algorithmic Instrument control colour graphics engineering software

___ Idafil

Figure 7.8: Communication scheme for BASIC-Prolog communication

An HP-BASIC to Edinburgh Prolog interface has been implemented using a set of high level macro programs in each language:

In Prolog: instruct-basic (Operation, Argument, Index, Entries-to-add): - basic-status (Status), basic-instruct (Command), basic-status (Status), basic-error (Message), basic-read (Index, Entries), or (atomic result): instruct-basic (Operation, Argument, Result): - basic-status (Status), basic-instruct (Command), basic-status (Status),

-206- basic-error (Message), basic-read (Result).

and in BASIC:

SUB Monitor SUB Basic-is (Status$) SUB Basic-err (Error$) SUB Tell-prolog(Fact$, Cf $)

The EPBS controls BASIC by issuing an "instruct—basic" command which hitially causes a trigger file to be created by the "basic-instruct" predicate, which is subsequently detected by the "Monitor" BASIC subprogram. The contents of this file are BASIC commands which are output to the keyboard as if they had been typed in. BASIC interprets these commands and executes the desired functions or programs, and the "Basic-is" subprogram returns a 'ready' indicator in the form of another trigger file. This in turn is detected by the Prolog "Basic-status" predicate signalling that BASIC execution has terminated. Should a BASIC error have occurred at this stage, this would be signalled to the Prolog "basic-error" predicate by the "Basic-err" subprogram. Finally, if there is any data to be read back, this is accomplished by Prolog issuing another command to initiate the "Tell-prolog" BASIC subprogram, which in turn redirects it to be received by the awaiting "basic-read" predicate in Prolog. Since the returned data may take one of two forms; a formatted blackboard entry or a single (atomic) parameter, there exist two forms of the "basic-read" predicate as shown.

In the design of the communication protocol, three fundamental aspects were considered.

Content of messages

Format of messages

Initiator and recipient handshaking

The messages from Prolog to BASIC contain BASIC commands instructing EDUCATES to perform functions such as:

-207- • MEASURE a parameter, estimate a value, or perform a complete C-V measurement

• CALCULATE other parametric information by appropriate analyses

• DISPLAY in graphical form any data to assist or advise the operator

• COMMAND the BASIC system to perform some other action such as a direct function call, the modification of a global variable, or initiate the return of information to Prolog.

The message format was left flexible to allow for a variety of different messages from the two environments. Messages to BASIC may take the form of any valid BASIC command, but for the purpose of remote execution, are generally restricted to a sub-program or function call of the form:

CALL Sub_program (Args) with the arbitrary parameter list constructed within Prolog, using the 'join_var' predicate, as a simple ASCII string. Messages from BASIC were constrained to either one of two well-defined formats, namely a blackboard entry; [Index, Fact, Cf] (a formatted entry) or just the Fact on its own (an atomic result).

The sequence of events required to initiate a BASIC instruction from Prolog, and read back the result or response is depicted in Figure 7.9. The handshaking is simplified in that EDUCATES always acts as a slave under the control of the blackboard shell.

To facilitate the master:slave configuration, the following scheme was devised:

-208- instruct—basic; (Command, Reply)

Check basic ready \ I "BASIC was ready" 4 I

I L4 Issue the command 0- Execute command Wait for completion Check basic ready again 4 I Write result "BA1C finished execution" to file I L, Read back result(s) 4 "Result was 42pF"

[Index, pammeter(cfb, 42), 1000] 4 added to blackboard.

Figure 7.9: Communications protocol between BASIC and Prolog

A BASIC 'monitor' subprogram to await requests from Prolog. For HP- BASIC this necessitates the use of a periodic, polling mechanism. Since the keyboard input cannot be re-directed, this BASIC program must be kept running continually.

The file system was used to:

• trigger BASIC to read in the next command

• convey status information to Prolog

S return the results of a measurement/analysis

Although inefficient in terms of speed and resource allocation, this method was found to perform adequately in this application.

Number conversion routines were required to format numbers in BASIC notation to a standard acceptable to Prolog.

7.4.3. A Knowledge Base for C-V Measurements

A rule-base is under development which encapsulates knowledge about the analytical strategy and interpretive phases of C-V measurements in a structured manner. Knowledge acquisition was achieved by self-inspection and frequent use of the EDUCATES software on a wide variety of different sample types. These rules implement the control of the analysis of the preliminary tests, the initiation of the first

-209- high-frequency C-V sweep and the interpretation of the resulting curve via pattern classification, and finally the commencement of any additional measurements needed to resolve ambiguity. In the early stages, analysis proceeds along a fairly well- determined route, but once an initial interpretation of the results has been made, the exact direction that the analytical strategy takes will depend directly upon these results. Progress is thus hierarchical, and assumptions made at a lower level may be refuted or amended at a higher level, or vice versa, causing the system to backtrack to a previous hypothesis and begin again. Only after a logically consistent state has been reached will the system provide a printed summary of the requested parametric data.

A major part of the rule base has been developed to satisfy the first four stages of analysis (i to iv) as outlined in Figure 7.7. This includes various 'housekeeping' procedures to allow the operator to set some objectives/preferences, to prompt the operator to input essential information such as the capacitor area and helpful information such as approximate substrate doping, oxide thickness, and substrate type (if known), and finally to ensure the capacitance meter has been correctly zeroed. The prototype system comprises 52 rules designed to handle the most frequently encountered discrepancies between the different measurements, and to provide a means of ensuring that the maximum amount of reliable data is extracted from the device under test. A complete listing of the rule-base can be found in Appendix C.

7.4.4. The CV-EXPERT in Use

Coping With a Leaky Oxide

A typical run follows the sequence of events shown in Table 7.1. When the sample exhibits, say, an early breakdown of the dielectric when biased into accumulation, the system attempts to re-measure C0 at a reduced voltage. This is the rationale behind the example rules in section 7.4.1. The system may then iteratively

reduce the accumulation voltage, V. . until an acceptable leakage level is attained or, ultimately, reject the sample completely. This would obviously depend upon the severity of the failure.

A routine test upon an oxidised n-type wafer produced by rapid thermal processing picked up the curious curve shown in Figure 7.10. Under these circumstances, a simpler deterministic C-V system may even have failed to evaluate the

-210-

User inputs expected D0 and/or Area and substrate type Determine if device connected, c-meter zeroed, leakage low Determine substrate type and optimum voltage sweep range Estimate lifetime, measure D0 , Q and N 3b Perform deep depletion or equilibrium C-V as appropriate Pattern-classify the resulting curve; check diagnosis corroborates with parameters in 4) Continue with further tests in sequence as necessary.

Table 7.1: Sequence of events for a normal capacitor sample

w

S 0 C I 4' 0 I 0. I U

-6 -4 0 4 B 12

%/W 1tftØ CD Figure 7.10: c-v curve from a capacitor sample with a leaky oxide substrate type correctly, since this is usually measured by comparing two readings taken at fixed biases to determine the direction of the central inflexion in the C-V curve. The rule-based system, however, managed to avert this as shown by the events in Table 7.2 and extract a limited set of parameters from the resulting curve as shown in Figure 7.11. This particular processing fault was later traced to an inadequate post-oxidation anneal (in fact, it had been omitted altogether).

- 211 - User input D0 700 A, n-type substrate Determined device connected, c-meter zeroed -find leakage at + by to be high (G0 = 700 mS) Default sweep range used and type-determination gives a p-type substrate which disagrees with expectation D01 measured at —1.2 A due to leakage, but C,,, OK S. Pattern analysis fails to recognise shape Fail and reset accumulation voltage to 2V; redo steps 2 & 3 Halt, V.(b =-2.00V, D0 =696 A, N,,. b=1.71X1015cm 2 Warn user, suggest perform dielectric integrity test

Table 7.2: Sequence of events for a sample with a leaky oxide

- FQ - w ii Sc

46- I 0 C 30-

0 I 20- a I U 16—

e -6.0 -6.3 -4.9 -2.9 -1.2 .5

-2 CO3 = 48576 pFcm 2 Cmin = 7775 pF cm

Cffi = 31777pFcm 2 Nb = 1.71x10 15 cm 3

D0 = 696A Rs 36.5 fl

Vth= —2.00V VT= —3.O1V 1011 Substrate = n-type Qss = 5.2x cm-2

Figure 7.11: Optimised C-V curve and reduced parameter set recovered by CV-ASSIST from a sample with a leaky-oxide

-212- Reacting to a High Interface Trap Density

In another example, suppose the pattern recogniser identifies the measured C-V curve as being characteristic of a sample with a high interlace trap density. In this instance, if the operator had requested a measure of the trap density and distribution, the combined high and low frequency method would have been selected as the most appropriate measurement. If the interface trap density had been much lower, however, the conductance technique, although considerably slower, would have been initiated as this is a more sensitive measure of low trap density.

Non-Uniform Substrates

In a third example, if the curve shape is indicative of a non-uniform substrate doping profile and, if the operator requests it, the system will automatically perform a C-V dopant profiling routine. This will involve the linking and loading of appropriate test and measurement sub-programs from a library of routines. The measurement method employed (pulsed or equilibrium) will depend upon the minority carrier lifetime of the silicon, and this will already have been estimated earlier in the measurement sequence. Furthermore, once the substrate has been determined to be non-uniformly doped, the threshold voltage calculations can be re-evaluated in the light of this new information derived from a Ziegler analysis of the profile, as described in Chapter 4. If a lifetime measurement had also been requested, the doping profile could be used to correct the distortion in the Zerbst plot caused by the non-uniformity.

Depletion Implants

Finally, in the case of a depletion implant being identified from the shape of the C-V curve, most of the usual approximations concerning the key equations break down. A warning would thus be issued to the operator that other techniques should be used to determine key parameters such as C0 and substrate doping.

7.5. Conclusions and Future Directions

The prototype system is able to correctly identify and deal with a number of common non-ideal C-V characteristics, including:

-213- e A 'leaky' oxide

• High D it

• Non-uniformly doped substrate

• A depletion implant

By controlling the measurement and analysis algorithms in an opportunistic and data-driven manner, it is possible to extract an appropriate amount of information from the sample under test. In the case of a capacitor with a 'leaky' oxide, some useful information could be obtained, and a warning issued to the operator about the possible inaccuracy of some of the other parameters. Moreover, further analyses (i.e. QSCV and (3-V measurements) are disabled on account of this sample being inappropriate for the assumed equivalent circuit models.

The quoted examples for devices with high interface trap density, non-uniformly doped substrate, and a depletion implant illustrate the influence of the operator's preferences on the measurement sequence, but more significantly, the usefulness of maintaining an overall view of the capacitor characteristics. With this 'global' viewpoint, measurements and analyses may be applied that are best suited to the known characteristics of the sample at any one moment. Of course, the degree of confidence in the known facts depends on the the number of additional tests that agree. This could well yet be taken a step further with the addition of more corroborative tests to ensure that any assumptions made, such as '(lifetime, low)', are validated once the associated detailed analyses (in this case, a Zerbst analysis) have been made. The current rule-base only touches on this aspect, serving more to demonstrate the feasibility of the blackboard approach. To develop the intricacy and complexity of the reasoning process in this respect to take full advantage of all the available experimental data appears to take a disproportionate amount of effort in comparison with that required to establish a first, working prototype. This is, perhaps, more a limitation of the shell than any inability to express the experimental technique in rule-form.

Future Developments

Although the interprocess-communication scheme detailed in section 7.4.2 is not particularly efficient, speed of operation is not necessarily an issue here, particularly as the response time of the blackboard shell is typically of the order of a few seconds. A

-214- more elegant solution, and one which would also allow communication with other machines across a network, would be to use UNIX/domain sockets. The ability to link compiled 'C' subroutines into the RMB/UX code facilitates this improvement.

It is in the nature of a rule-based system, particularly the EPBS with its built-in consistency-maintenance mechanism, to readily allow the addition of new rules as they become necessary. The structure of the problem solving hierarchy of CV-EXPERT is considered to be sufficiently 'open' and general enough to enable such extensions to be made in the future. A useful facility, which unfortunately the EPBS does not currently offer, is to be able to view the 'logical dependencies' between rules. This feature would enable the connectivity and natural ordering of the rules in the knowledge base to be readily viewed when developing additional rules to cope with new situations. At present, this structure is somewhat hidden in the knowledge-base. Consequently, the addition of new rules must be done with great care in order to avoid invalidating the existing consistency in the reasoning process.

Once a stable set of rules has been finalised (given that it is indeed possible to 'close the loop' on C-V measurements) it might be possible to re-program the CV- EXPERT 'algorithm' in the language of the measurement system itself. However, it is expected that the evolution period of such a system and the likely effort required to effect the Prolog to BASIC translation would imply that the hybrid system remains the most economic proposition at this stage.

Another possible area of development for the BASIC:EPBS system might include the opportunistic control of a current-voltage (I-V) system such as the HP4062 parametric tester. For this, the developer is fortunate to have at their disposal a large, reliable, and tested library of measurement and analysis subroutines in the form of the recently-released Test Management Shell (TMS) from Hewlett Packard. The creation and execution of customised test procedures may well be simplified and optimised by the expert control techniques described in this chapter.

-215- REFERENCES

P. H. Winston, Artificial Intelligence, Computer Science Series, Addison Wesley, Philippines, (1977).

P. Jackson, Introduction to Expert Systems, Addison-Wesley, Wokingham, England, (1986). J. R. Hogley and A. R. Korncoff, "Artificial intelligence in engineering: a revolutionary change," Proceedings of the 1st International Conference on Applications of Artificial Intelligence in Engineering Problems, Vol. II, pp. 1155- 1160, Springer-Verlag, Southampton University, (April 1986). Edited by D. Sriram, R. Adey

R. Forsyth, ed., Expert Systems - Principles and Case Studies, Chapman & Hall Computing, (1984).

N. J. Nilsson, Principles of Artificial Intelligence, Springer-Verlag, New York, (1982).

L. Boic and M. J. Coombs, eds., Expert System Applications, Springer-Verlag, (1988). Series on symbolic computation

A. R. Mirzai, ed., Artificial Intelligence: Concepts and Applications in Engineering, Kogan Page, (1989).

D. Sriram and R. A. Adey, eds., Applications of Artificial Intelligence in Engineering Problems, Volumes 1 & 2, Springer-Verlag, Southampton, U.K., (April 1986). 1st International Conference

D. Sriram and R. A. Adey, eds., Knowledge Based Expert Systems for Engineering: Classification, Education and Control, Computational Mechanics Publications, (1987). (Papers selected from the 2nd International Conference on Applications of Artificial Intelligence in Engineering Problems, Cambridge, Massachusets)

The Second International Conference on Industrial & Engineering Applications of Artificial Intelligence & Expert Systems (lEA/AlE - 89), Held at the University of Tennessee Space Institute (UTSI), Tullahoma Tennessee., (June 6 - 9, 1989). (VolumesI & II).

The First International Conference on Industrial & Engineering Applications of Artificial Intelligence & Expert Systems (lEA/AlE - 88), Held at the University of Tennessee Space Institute (UTSI), Tullahoma Tennessee., (1988). (Volumes I & II).

G. Ernst and A. Newell, GPS: a Case Study in Generality and Problem Solving, Academic Press, New York, (1969).

R. Davis and J. King, "An overview of production systems," in Machine Intelligence, Vol. 8, ed. E.W. Elcock & D. Michie, pp. 30-332, Wiley, (1977).

R. Turner, Logics for Artificial Intelligence, Ellis Horwood (series in artificial intelligence), (1984).

-216- H. P. Nii, "Blackboard systems: The blackboard model of problem solving and the evolution of blackboard architectures," The Al Magazine, pp. 38-53, (1986). (Summer 1986) H. P. Nii and E. A. Feigenbaum, "Rule-based understanding of signals," in Pattern Directed Inference Systems, ed. D.A. Waterman & F. Hayes-Roth, pp. 483-501, Academic Press, New York, (1978).

L. D. Erman, F. Hayes-Roth, V. R. Lesser, and D. R. Reddy, "The HEARSAY II speech understanding system; integrating knowledge to resolve uncertainty," Computing Surveys, Vol. 12, (2), pp. 213-253, (1980). R. Balzer, L. D. Erman, P. London, and C. Williams, "HEARSAY ifi: a domain independent framework for expert systems," Proceedings of the First National Conference on Al, pp. 108-110, (1980).

H. P. Nii and J. J. Anton, "Signal-to-symbol transformation: HASP/SlAP case study," Al Magazine, pp. 23-35, (Spring 1982). A. L. Thornburgh, "Organising multiple expert systems: a blackboard-based executive application," in Knowledge-Based Expert Systems for Engineering, Classification, Education and Control, ed. D. Sriram & R. A. Adey, pp. 145 - 155, Computational Mechanics Publications, (1987). B. L. F. Daku, P. M. Grant, C. F. N. Cowan, and J. Hallam, "Intelligent techniques for spectral estimation," Journal of JERE, Vol. 58, (6), pp. 275-283, (September-December 1988).

R. Englemore and T. Morgan, eds., Blackboard Systems, Addison Wesley, Wokingham, UK, (1988).

J. A. Walls, A. J. Walton, and J. M. Robertson, "Application of Al Techniques to the Control and Interpretation of C-V Measurements," To be presented at ICMTS 1990, San Diego, CA..

J. Doyle, "A truth maintenance system," Journal of Artificial Intelligence, Vol. 12, pp. 231-272, (1979). J. Jones, M. Millington, and P. Ross, "A blackboard shell in prolog," in Proceedings of ECAI'86, (1986).

J. Jones and M. Millington, "An Edinburgh prolog blackboard shell," D.A.I. Research Report #264, Department of Artificial Intelligence, University of Edinburgh.

W. F. Clocksin and C. S. Mellish, Programming in Prolog, Springer-Verlag, Heidelberg, W. Germany, (1987). (Third Edition) L. Lebow and G. L. Blankenship, "A -based expert controller for industrial applications," in Knowledge-Based Expert Systems for Engineering: Classification, Education and Control, ed. D. Sriram, R. A. Adey, (1987).

D. J. Cooper, "An expert systems approach to process identification and adaptive control," in Knowledge-Based-Expert-Systems For Engineering: Classification, Education And Control, ed. D. Sriram, R. A. Adey, Springer-Verlag, (1987).

-217- Hewlett Packard Company, Using the BASICIUX System on HP 9000 Series 300 Computers, Fort Collins, Colorado, USA., (1988). (BASIC/UX 5.5)

K. J. Astrom, J. J. Anton, and K. E. Arzen, "Expert control," Automatica, Vol. 22, (3), pp. 277-286, (1986).

-218- Chapter VIII Conclusions

In VLSI processing, C-V measurements to date have had an important role to

play in providing essential process control information relating to oxides, the Si0 2 interface and silicon substrate at many stages during fabrication. In the continuing drive towards further miniaturisation of MOS devices, the requirements for a more detailed study, understanding and control of the physics of the MOS system will place increasing demands on parametric test equipment, and C-V measurements in particular. The ability to make use of the unique information available from the MOS capacitor in a modern Computer-Aided Manufacturing (CAM) system has until now been severely limited by the absence of any reliable and sufficiently flexible automatic C-V system which could be used for process monitoring. The research work documented in this thesis has attempted to redress that balance by applying some previously unexplored techniques to the fundamental problems of measurement, interpretation, planning and control.

The MOS capacitor is a well-documented and extensively researched indicator of final device performance, and as such provides a good platform upon which to base a detailed study of an MOS process. Chapter 4 provided an extensive review of how this simple yet versatile test device may yield a wide variety of processing and device parameters. The difficulties in interpreting and using this data have been demonstrated by a variety of typical measurements that did not correspond to any current theory.

One major problem with the C-V technique lies not so much with the measurement itself which can be accurate and repeatable, but with its interpretation. The shape alone of a C-V curve reveals many processing defects or characteristics of the sample. The presence of some of these effects may signal that entirely different analyses are applicable in order to extract important processing parameters. Statistical pattern recognition techniques were developed in Chapter 6 to distinguish three of the most commonly occurring deviations from the 'ideal' curve.

-219- The usefulness of any information is limited by the ability to interpret it consistently and reliably, and also dependent upon its ease of extraction. In the case of C-V measurements, data reduction or screening is not as important as correct interpretation. It is very easy to mis-interpret a C-V curve, and there are some outstanding examples in the literature. As an example, Kaempf [1] who writes about 'meaningful presentation' of parametric test data, has in his paper incorrectly attributed a stretchout in a C-V curve, due to interface traps, to mobile charges. One of the dangers therefore, as re-iterated by Bailliet [2], is the reliance upon computer- generated data as an 'unquestioned source' of information.

CV Measurement Automation

This project has attempted to move the boundary between the judgemental and automated phases in C-V analysis in order to enhance the validity of the extracted data and to obviate the need for an experienced operator. The arguments in favour of automation include not only the usual advantages of 'economic leverage', dictated by improving throughput and efficiency, but also those of operator independence, reliability, and utility. C-V measurements have always been considered notoriously difficult to implement, interpret, and automate. These and other reasons were the driving force behind this research.

To achieve 'total automation' of C-V measurements, intermediate alternatives to zero operator-intervention had to be considered. This was not simply because of the complexity of the problem, which demanded the application of some novel and unconventional techniques and a knowledge of the underlying physics, but for the continual re-appraisal of what was realty required from an intelligent C-V system. Since, as far as this author is aware, these requirementss have never been defined, the formulation and identification of the requisite functions formed a major part of this research. Guidelines for such a system were poorly defined, due in part to a limited awareness amongst most industrial users of the full abilities offered by C-V. Furthermore there was a lack of any coherent analysis into the precise nature of the problems inherent to the method.

The machine learning approach of Chapter 5 was successful in extracting simple parametric information from the C-V curve shape, and demonstrated the possibility of extending the idea to include variations in implant processing. The realisation,

_220- however, that a major difficulty with the C-V technique lay in its correct interpretation and that better use of the curve-shape information could made through the use of a pattern classifier led to the development of a semi-automatic interpretation tool, CV- ASSIST, described in Chapter 6. In more of a supervisory role, CV-ASSIST was shown to be useful for guiding operators through measurements, modifying test functions based on observations already made, and assisting in the visual interpretation of C-V curves.

With the development of a system capable of automatic interpretation came the second realisation that once a particular fault or characteristic had been identified, the process of characterising that fault could be made much easier and more efficient by tailoring the analyses to the actual device in question; This important ability to identify a particular problem ensures that only an appropriate amount of measurements

are made on the device, thereby enabling the maximum amount of relevant parameters to be extracted in the minimum of time.

For process monitoring, it has been observed [3] that 95% of the fluctuations in

SPICE model parameters can be described by only four parameters, two of which (D01 and Vfr ) are directly measurable using the C-V method. For catastrophic failure analysis however, many more parameters will typically be required to track down the process fault. A properly designed C-V test in this situation, therefore, can yield a large percentage of these parameters, plus an even greater range of qualitative information.

The problem of performing measurements and analyses in the correct sequence has also been addressed. Beyond the initial first few measurements, the best sequence in which to apply further tests is indeterminate and governed entirely by the particular characteristics of the device and its response to the applied stimuli. For example, to measure certain key parameters often necessitates an iterative series of tests in order to determine the optimum voltage range or sweep speed. Other analyses to determine either the doping profile or the interface trap distribution require an estimate of the other in order that the correct analysis algorithms can be applied. Furthermore, certain characteristics or abnormalities such as a 'leaky' oxide or the presence of a depletion ion-implant require that much greater care be taken in the analysis.

The traditional algorithmic techniques advocated for control are clearly insufficient for the complete automation of C-V measurements and their analyses. All

-221- aspects of automation could be improved upon by the application of a more flexible approach to data interpretation and manipulation. It is important however, that these techniques are easily modifiable and can be improved as situations dictate. The key to flexibility therefore is adaptability, and a system which can be readily and cheaply re- programmed or trained for new processes and fault conditions may succeed in meeting future requirements.

The impetus for employing a knowledge-based system in the form of a blackboard model came from an awareness to view all the available measurements as components of one complete picture. This holistic viewpoint also enables a tradeoff to be made between accuracy and completeness, particularly when data from one measurement may conflict with that from another, or in circular situations where one analysis cannot proceed until information from another, itself dependent on the first, becomes available. In this way, the maximum amount of information can be derived from an integrated set of measurements that often exceeds that that which is available from a simple sum of the individual measurements made in isolation.

The implications for 'expert control' of C-V measurements are an increased throughput of routine parametric measurements, thereby freeing the operator from all but the high-level analysis of samples that do not conform to any description in the current knowledge base. Additionally it is hoped that C-V measurements will become more reliable as their sophistication and complexity become manageable through the application of pattern-recognition and knowledge-based techniques.

The enormity of the engineering database generated by parametric test equipment prompts for the adoption of a hierarchical approach to limit the amount of data required to be kept for later reference. A major requirement for todays automated test systems, therefore, is to reduce the plethora of electrical data to a manageable quantity at a local level and translate it in to a more meaningful form. The ability to do this quickly and act upon the results allows for on-line process control. Radically new approaches are currently under development within CAM systems to give autonomy and decision-making abilities to individual processing islands. An automatic C-V system capable of interpreting and assessing the validity of its own data is one step towards enabling this autonomy within the oxidation and implantation processing islands.

-222- REFERENCES

U. Kaempf, "Automated parametric testers to monitor the integrated circuit process," Solid State Technology, pp. 81-87, (September 1981).

J. V. Bailliet, "Automatic testing provides economic leverage," Solid State Technology, pp. 88-91, (March 1970).

P. H. Singer, "Analysing parametric test structures," Semiconductor International, pp. 74 - 79, (June 1987).

-223- Appendix A EDUCATES User Manual

Overview

This document describes the structure and operation of the EDinburgh University CApacitor TEst Software (EDUCATES) Capacitance-Voltage measurement system. The first section gives a brief overview of the current state of the software, its history, the implementation, and some suggested uses.

The second section then goes on to describe in brief detail, the structure, philosophy and organisation of the five programs that make up the suite. Some additions to allow external control and operation are detailed in preparation for the addition of the 'expert' control software. Also included are the descriptions of a number of other programs not included in the main suite, for data analysis and interpretation.

Section 3 is a simple operating manual describing the licensing and installation, and an end-user guide on 'getting going'. This section also details the hardware and software environment required to run EDUCATES successfully.

Finally, section 4 is the complete program documentation for the HP-BASIC sub-programs used in the measurement software suite.

1.1. Introduction to EDUCATES

EDUCATES has evolved from the need to be able to make a range of reliable C-V measurements from mundane, routine tests to highly detailed analyses. Two factors that have driven this evolution are

-224- The inadequacy of software presently supplied with the HP4061A semiconductor/parametric test system, and its inability to meet these requirements,

The inapplicability of many algorithms to situations encountered in general- purpose measurements

It was considered that the existing HP software had not progressed in line with similar test and control software supplied with the HP4062 current-voltage (IN) measurement system, and was unable therefore to take advantage of the new programming environments now available. It was also noted that most currently available custom C-V measurement systems were still beset by problems regarding the interpretation of (visibly) bad data, with little or no provision made to protect the user from the effects of simplifying assumptions in the handling of that data. Most important was the frequent observation that C-V measurements were not as simple to make in practice as the sales literature or textbooks suggested!

1.1.1. Upgrade History of EDUCATES Software

The EDUCATES software suite is loosely based upon that supplied by Hewlett Packard with the 4061A test system [1]. This was originally supplied on cartridge tape for use with an 9835 desktop calculator/computer. It was subsequently upgraded in 1980 and re-written in BASIC 2.0 to run on an HP9826 computer. This same software is still supplied with the 4061A today (1989).

Since its conception in 1983 by I. McGillivray [2], the EDUCATES C-V software has undergone several major revisions. The conception and conversion to BASIC 3.0 (versions 1 and 2) are largely undocumented. Subsequent upgrading by this author is described below:

Version 2 to 3 (1986 - 1987)

Extensive debugging and improvements to the user interface. Upgraded to BASIC 4.0

Many default parameters added for the custom-manual mode (found to in practice to be the most useful mode of operation).

Generalisation of code, removal of temporary bug fixes.

-225- Version 3 to 4 (1988 -1989)

Major re-working of the code to upgrade to BASIC 5.0/5.1 including the following:

Further sub-division of the code modules

Consistent and clear variable naming conventions

Sharing of common sub-programs from a unified library of routines

Further improvements to the front-end (decision support mechanism)

Replacement of all custom graphics routines by a single, general-purpose, interactive graphics sub-program (E_Z_GRAPH)

Interface to CV-ASSIST, a pattern recognition system for assisting novice users.

Version 4 to 4.1 (late 1989)

Additions to EDUCATES suite to allow external control of most measurement, calculation and display algorithms in preparation for the application of an external expert system:

Interface to the Edinburgh Prolog Blackboard Shell (CV-EXPERT)

The application of some object-oriented techniques to the production of BASIC macros for MEASUREment, CALCULATion, and DISPLAY of C-V data.

1.1.2. Implementation

The test and measurement software is exclusively written in Hewlett-Packard BASIC (rocky-mountain basic). This implementation of the BASIC language offers many facilities often associated with languages such as C, Pascal and Fortran. Such features as dynamic linking of code, shareable common memory, UNIX file compatibility and a are available, and used extensively in the capacitor test software. The BASIC language has traditionally only been available to run on dedicated HP controllers (the so-called 'workstation BASIC'). The most recent version of this language is version 5.1, which can be run on virtually all HP 9000 computers [3].

.226- A new implementation has been recently released which runs an and compiler for version 5.1, under UNIX.t This is termed RMB/UX, and runs under HP-UX 6.2 (Hewlett Packard's implementation of UNIX) or later. All the C-V software has been designed to run under version 5.5 of RMB/UX, but backwards compatibility to workstation BASIC 5.1 has been maintained for all major subprograms. As a result, the core system can be run on many HP 9000 series computers. The interface to the expert system software however is not compatible, and furthermore, requires the X- windowing environment.

All programs normally assume the use of the Hewlett Packard 4275A LCR meter unless otherwise specified. If other instruments are to be used, care should be taken to use equipment of comparable accuracy. This is particularly so for LIFE4 and PROF4 programs which require knowledge of, and are sensitive to, the accuracy, resolution, and speed of the capacitance meter used.

1.1.3. Suggested Uses

EDUCATES is ideal for exploratory research on new processes or techniques, fault-diagnosis, and routine process-monitoring. The software operates in either a totally manual mode (but not recommended except to the experienced user), an assisted mode (generally the most useful) and in future, a totally automatic mode under knowledge-based control (but not available for workstation BASIC). In essence, there is a mode of operation to suit most needs and levels of operator competence.

1.2. Structure of the EDUCATES Program Suite

1.2.1. General Notes

The current EDUCATES system is comprised wholly of sub-programs. These are all held in a common sub-program library, prior to installation, except for the graphics sub-programs which must be obtained from the originators. Any modifications made to the source code should be reflected in this library to maintain modularity of the overall system. Furthermore, any modifications to the common memory declarations (COM) must be made with care and checked against all current declarations for

t UNIX is a trademark of Bell Laboratories.

-227- conflicts or dependencies. A complete list of all the common blocks is held in sub 'Comm'. Although the individual measurement programs are run separately, the C-V program suite should be viewed as one integrated set of interelated programs, each one dependent on the others for data and parametric information. As an aid to managing these dependencies, a knowledge-based system is currently being developed to ensure integrity in the flow of information. In the meantime, and for users without HP-UX facilities, CV-ASSIST offers a less sophisticated alternative.

1.2.2. Program Briefs

MENU4 is a menu program to select the measurement programs and to set up flags, aliases, global variables and hardware-dependent bits. It is a good idea to ensure that this is run first from the 'AUTOST' program after the machine has been 'booted-up'. It is simple to use and needs little explanation.

CV-ASSIST is a high-frequency C-V measurement control program coupled with a trainable pattern-recognition system for matching C-V curves to a variety of curve shapes previously stored in a database. It will assist the user to correctly interpret the initial C-V measurement prior to an in-depth analysis. This program may also be operated in a 'bare' mode, without any pattern recognition facilities, but with all other optimisation routines available. It is strongly advised that this program (CVmeasure) be the first selection from the menu program.

LIFE4 is a capacitance-time (C-t) lifetime measurement package and is fully integrated with the doping profiler for analysis of samples with non-uniformly doped substrates [4]. It implements the analysis of Zerbst [5, 6, 7] of the C-t relation.

PROF4 is a (pulsed) depth profiler for characterising the silicon substrate. It has many

added enhancements to obtain smooth, accurate profiles right up to the Si:Si0 2 interface.

GV4 is a multi-frequency, conductance-voltage, interface trap analysis program. It implements the technique developed by Nicollian and Goetzberger [8] at 7 spot frequencies.

HFLF4 is an alternative means for evaluating interface trap density and distribution. This program uses the combined high and low frequency capacitance data as per

_228- Catagne and Vallipe [9, 10] (Pp. 331-333), to determine D.c. It is most useful when time prohibits the use of GV4, or the trap density is particularly high.

1.2.3. Program Structure

The software has been constructed such that it can be run in three different modes:

Manually

Automatically from CV-ASSIST

Under knowledge-based control

The code is supplied in the following form:-

Program 'linkers' for each measurement program

A library of sub-programs (listed at the end of this document)

Each measurement program 'linker' contains sufficient information to build the rest of the program automatically. This is primarily to encourage consistency of the library, but also to save disc space. Should the programs ever need to be changed it is not necessary to re-install all the programs using the INSTALL option in the menu. Instead, the procedure is as follows. Each program kernel has a name of the form:

NAMEk

where NAME is one of GV, PROF, LIFE, HFLF or CVASSIST. Individual programs may be regenerated by first LOADing the appropriate kernel and then executing the following command from the command line:

LOADSUB FROM "SUBLIB" followed by LOADSUB FROM "GRAPHSUBS"

Any errors after the first LOADSUB command that complain about missing sub- programs should be ignored. After the rebuild is complete, the program can be stored in the usual way, and accessed from the menu program.

-229- At any stage during any measurement, you may return to the menu by pressing softkey [ill labelled 'SYSTEM MENU". To change from one measurement to the next in manual or semi-automatic mode, this step is mandatory to ensure the common memory is cleared out before the next test is initiated.

1.2.4. Programs in More Detail

1,711 Dt1Ei

This program performs the following functions:-

Defines the location of the disc drive(s), directory and pathnames for the program and data storage areas,

Sets up softkeys for use by all programs,

Sets up aliases for interface addresses of all measurement hardware and peripherals,

Sets up the physical constants: k, T, q, €,, €, €01 in common memory,

Clears and resets all parametric variables and arrays in common memory,

Provides a softkey selection of measurements and operations.

MENU4 must be run first. As a result, the entire software suite can be quickly configured to run on many hardware and disc subsystem configurations.

CVASSIST

This program provides a general-purpose, robust measuring environment for the extraction of many MOS parameters from high-frequency C-V measurements [11] (Chapters 2,3,4). Secondly it performs pattern recognition on the measured C-V and 0-V curves to provide a measure of similarity between some well-known classes of C-V curve (e.g. uniform, implanted, interface-traps). This information is then used to sequence other tests (GV, PROF, etc.) by ensuring that the appropriate assumptions about the device have been made and are used accordingly during the numerical analyses. The program can operate without manual intervention, although for increased reliability during feature extraction, a decision support framework is provided.

-230- A number of helpful enhancements have been added, such as an automatic sweep-range optimiser, a simple minority carrier lifetime estimator and an ideal C-V curve simulator. In addition, both single-point and pulsed techniques of C determination are employed to allow for stagnant inversion layer response or short lifetimes. From these, both surface and substrate doping may be assessed, where appropriate, thereby enabling the charge deficit, Q/q, at the interface to be evaluated.

CV-ASSIST employs statistical pattern recognition techniques to identify C-V curve shapes, including canonical analysis of the feature set to emphasise discrimination between different curve shapes. A x2 significance test on the result gives a percentage confidence rating for the recognition algorithm.

CVASSIST is a trainable pattern classifier and, as such, can be adapted to recognise process-specific CV curves should the existing classification scheme be inappropriate. Details of the pattern recognition system can be found elsewhere [11] (Chapter 6).

CVASSIST assumes a directory structure as shown in figure 1.2. The example C-V curves are held in this simple database for the pattern-matching algorithm and decision-support routines.

LIFE4

This is a flexible capacitance-time measurement package for Zerbst analysis of capacitance transients. Minority carrier generation lifetime and surface recombination velocity can be determined reliably using this program.

The program includes an automatic estimator for the total measurement time, and provides for the treatment of non-uniform substrates by utilising the doping profile as measured by PROF4. There is also a provision for producing lifetime profile plots and overlaying these onto doping profiles to assess implant damage or activation anneal effectiveness. For the analysis of samples with shorter lifetimes, the use of a faster capacitance meter (eg. HP4277A) is advocated, and modified algorithms are provided.

-231- PROF4

A general purpose dopant profiler for Schottky diodes and MOS capacitors, offering a choice of pulsed or equilibrium measurement methods. Data collection is optimised for noise-free profiles and numerous numerical techniques included for near-surface analysis, sharp profile correction and implant dose assessment. The basic analysis is that due to van Gelder and Nicollian [121 with corrections upto the surface using the method of Ziegler and Klausmann [13]. A noise-abating voltage-step optimisation routine by McGillivray et al. [14], has been implemented and improved for small-area capacitors [11] (Chapter 4).

GV4

An automatic, multi-frequency implementation of the conductance technique for the accurate determination of interface-state density as a function of bandgap energy. Also determines capture probability, C,,, interface trap capacitance, C, and interface trap time constant, T. Also included is a conductance peak simulator using the extracted data as input. This can be used as a check on the validity of the analysis. The 0-V method can be used to evaluate Dit for non-uniform samples provided that

< lx 1011 eV cm -2 .

HFLF4

A control program to measure the high-frequency and quasi-static C-V curves at the same bias points, and to estimate the interface trap distribution from the difference between the two. Dit can then be calculated by one of three methods; either the HF only method (useful for high lifetime samples when the LF curve may be in error), the LF only method (which may be required for highly non-uniformly doped samples), or the combined HF and LF technique (which has the advantage of not requiring an accurate theoretical C-V relation).

HFLF4 has the potential for measuring Dj, over a wider energy range than the G-V technique, and although more qualitative in nature, is a quicker means of deriving the trap distribution. Other routines are included for the calculation of doping profiles from the HF data, or from the combined HF and LF data with correction for interface-trap stretchout due to Brews [15]. There is also a facility for

-232- comparing this profile with the pulsed C-V profiles from PROF4.

1.2.5. Other Programs Not in the Main Suite

This sub-section details a number of self-contained utility programs which were of use in the development of the EDUCATES software. They include GFUNCSGEN, GFUNCSPLOT, DB-STRUCTURE, CV-EXPLORE, MONITOR, AUTOSTART, and INSTALL.

1.2.5.1. GFUNCSGEN

This program generates the 'g 1' and 'g2' curves for impurity profile determination at the surface of the semiconductor as described by Ziegler and Klausmann [13], and outputs the relations to a file 'GFUNCS' for use by PROF4. GFUNCSPLOT performs the same calculations as above, but presents the relations in the form of a graphics plot.

DB-STRUCTURE

This program generates the necessary directory structure and file templates for the database used by CV-ASSIST in learn mode. The INSTALL program invokes this as part of the installation procedure.

CV-EXPLORE

CV-EXPLORE allows for various feature selection and distance classification combinations to be explored. This was useful in arriving at an optimal pattern- recognition system constructed to classify a known set of C-V curves.

The operations available for data conditioning and transformation are:-

-233- Smoothing - moving average

Differentiation - 1st and 2nd differentials

Peak detection - gradient searching and thresholding

Aspect-ratio calculation

Interpolation - linear & cubic spline

Deviation spectra - comparison with standard or ideal curves

Polynomial curve fitting - with root extraction

For subsequent analysis of the extracted features, the following post-processing and classification schemes are available:-

Manual feature choice (trial and error)

Canonical analysis (feature space reduction)

Cluster plotting (visual cluster analysis)

Multivariate statistics (mean, covariance, etc.)

Mahalanobis distance calculation (measure of pattern similarity)

Chi-squared testing (on group membership)

Decision thresholding

Minimum-distance classification

MONITOR

This program enables communication between Prolog and BASIC. It is LOADED and RUN by MENU4 if the expert control option is selected.

MONITOR contains the minimum amount of code to initiate this communication. Thereafter, Prolog will rebuild the program to activate the required measurement routines. As a result this program will appear to 'grow' in normal use. It is neither necessary nor advisable to RE-STORE this program at the end of a run.

-234- AUTOSTART

This short program is the first to run. It is started automatically upon booting the BASIC system, and initiates the main menu program.

INSTALL

This program manages the installation of the EDUCATES system, and includes a configuration check of the computer system.

1.3. EDUCATES Operating Manual

1.3.1. Modes of Operation

This section details the use of the measurement software in either

Manual mode,

Automatic mode,

Intelligent mode (under knowledge-based control).

1.3.1.1. Manual Mode

If the menu program (MENU4) is not already fired up after booting the BASIC system, then you must LOAD and RUN it first.

Select the desired measurement from the menu. c.. Respond to all the questions as appropriate, but if uncertain, just press and sensible defaults will be chosen. This applies to all of the initial sweep parameters thus providing greater ease of use for the infrequent operator. d. You may return to the menu at any time by pressing ffil and select further measurements as required. All global parameters (device area, sweep range, sample type, etc) will be 'remembered' from previous tests, and either you will not be prompted for them again, or they will not be re-measured.

-235- \ e. It is recommended that measurements be performed in the following order; HFLF4, PROF4, GV4, LIFE4.

1.3.1.2. Automatic Mode

Start up the menu program as described previously.

Unless you are familiar, select CV-ASSIST from the menu. The CV Advanced System for Intelligently Sequencing Tests is a front-end to the whole suite of measurement programs, and will suggest which further tests should be applied after interpreting an initial C-V curve.

C. Most instructions are fairly obvious, and simple yes/no questions may require either 1/0, or yin, or a softkey response as a reply.

Most of the possibilities for error have been avoided or will be trapped in the event of an incorrect response. If a fatal error occurs, check that the device is in fact a reasonable approximation to a capacitor, and that all extracted parameters are of a sensible magnitude. A list of deferred defects and most likely errors are given at the end this manual.

After CV-ASSIST has completed fault diagnosis, some estimated parameters will be printed, along with an interpretation of the CV curve, and a suggested list of further analyses for you to make. You will then be returned to the main menu.

1.3.2. Intelligent Mode

This requires the use of Rocky-Mountain Basic running in the X-windowing environment under HP-UX version 6.2 or greater. The RMB interpreter works in conjunction with a knowledge-based system in Edinburgh Prolog running concurrently with the BASIC measurement and instrumentation software.

RMB and Prolog must be started up in a common directory (at present) and in separate windows:

1. To begin RMB, log-in and type 'rmb' at the HP-UX prompt. An autostart program will begin which monitors for any instructions issued by Prolog.

-236- 2. To start Prolog, a saved state must be restored by typing 'prolog blackboard' at the HP-UX prompt. This performs the following:-

Retrieves the code for the blackboard shell

Assimilates the rule base for the (expert) system.

To start the blackboard, simply type 'start', and then 'run', remembering to issue a full stop ".° to terminate each command or reply to prolog [16].

To stop all execution, type ^D at the Prolog prompt ( J?- ) and QUIT in BASIC. Finally [SHIFT] [CONTROL] [RESET] will terminate the windowing session and return to a unix shell.

1.3.3. General Test and Measurement Procedure

The structure of each measurement program is very similar, and follows a standard procedure as illustrated in figure 1.1. In automatic mode, some of these procedures may be activated without the intervention of the user. In manual mode, however, greater freedom is allowed in the ordering of these individual events. Specifically, softkeys are provided to initiate interactive analyses, graphical and printed outputs, and to change set-up and sweep parameters. There is a price to pay for this flexibility, however, and care must be taken to ensure that all operations have their prerequisites satisfied. In general terms, this implies that:

Prior to a measurement, the sweep parameters be set up correctly.

Prior to parameter extraction, all necessary data has been collected and processed appropriately.

Prior to plotting any graphs or charts, the data, be it raw or processed, is available.

To complicate matters further, some of these prerequisites (eg. determination of an accurate C in, or an assessment of the substrate uniformity) may have to be determined by other programs. These, in turn, may depend on other measurements, and thus problems of 'circularity' may result! CV-EXPERT is an attempt to manage these inter-dependencies.

-237- (LOGIN) HP/UX only

______BASIC system L Quit Main menu Exit for general use

elect measurement program

Load old Measure new data data

Select file (E nter sample details _-" MANUAL Select analyses ct sweep ONLY plot curves meters

print data Make measurement4 Store data

Analyse data ---

display data -_-

Return to main menu

Figure 1.1: Flowchart of the EDUCATES measurement software

1.3.4. Installation

At the conclusion of this document a licensing agreement can be found which must first be completed by both parties prior to the installation and use of the EDUCATES software.

The software is relatively simple to install and need only be done once unless the sub-program library is substantially modified. In an effort to rationalise the use of common routines, and frequently used code, each program must first be built-up from the libraries of routines. This is akin to linking library routines in 'C' after initial compilation, except that here, the code modules are interpreted and therefore readable.

-238- After each program has been built, they can be copied onto a hard disc into an appropriate directory, for example. Note however, that it is not necessary to have this hardware to use the program suite. It is necessary however to be running the program suite under BASIC 5.0 or later since extensive use is made of the HFS (Hierarchical File Structure) directory system. For convenience, it is advised that the programs and data should be kept in separate sub-directories, and furthermore it is imperative to have separate data directories for each measurement program to avoid filename duplication. The program DB_STRUCTURE" will generate the necessary directories.

The program suite is menu-driven with each individual measurement being initiated from the menu program (MENU4). It is ALWAYS necessary, however, to run the menu program first, since it sets up a number of global variables, and clears out the common memory from previous invocations of the measurement routines.

1.3.4.1. Getting Going

Some knowledge of HP-BASIC is assumed here, else this section would to too long to be of any use as quick reference guide. Please refer to the BASIC 5.015.1 manuals for further information.

With the computer and disc drive on, and the first EDUCATES distribution disc in the drive, and the computer successfully communicating with that drive, run the installation program thus:

type: LOAD JNSTALL,1 which loads the program and runs it too.

Prior to installing the EDUCATES code, you now may wish to check if the current BASIC system is compatible:

press softkey f5 labelled ICONFIGUREI

If your hard disc requires re-partioning, the use of the Hierarchical File System is recommended. At least one sector on the hard disc requires to be in HFS format to use the EDUCATES software. Selecting the IFORMAT DISCS I option will provide advice on performing this function through the use of the standard DISC_UTIL program.

-239- 1.3.4.2. Sample Installation Procedure

Next, select the installation option:

press softkey fl, labelled IINSTALL1

This may take some time, and you will be required to follow some elementary instructions to complete the installation. Notice that a number of directories including a directory called DATA" have been created; all your experimental results will be stored here under the appropriate measurement subdirectory and with a filename corresponding to the wafer and batch number you enter when prompted.

An example of the installation procedure is given below:

INSTALLATION PROGRAM FOR THE EDUCATES SUITE

Select a destination for the EDUCATES software: (e.g. /users/fred: ,700) (see BASIC manuals for further details...) Enter floppy disc specifier for EDUCATES disc: (e.g. :,700,1) Place EDUCATES distribution disc in that drive and press continue... EDUCATES directory and sub—directories created. RE—BUILDING CVASSIST RE—BUILDING PROF RE—BUILDING HFLF RE—BUILDING LIFE RE—BUILDING GV INSTALLATION COMPLETE... REBOOTING...

The actual time taken for this process, and the exact responses by the INSTALL program will obviously depend on the hardware in use. The directory tree structure created in illustrated by figure 1.2.

1.3.5. Using the EDUCATES Software

-240- WORKING DIRECTORY/ (top level)

DATA/ HFLF_DATA/ LIFE-DATA/ PROF-DATA/ GV_ DATA! CVAS 1_DATA! C V_PAR AM S / PROGRAMS: HFLF4 LIFE4 PROF4 (+ PROFTABS) CV4 CVASSIST MENU4 AU TO ST INSTALL DB_STRUCTURE CV-EXPLORE TEST! r—TRAPS/ I- IMPLANTED/ —UNIFORM/

FILElNDEX (+ BACKUP) L- COVMATRIX (+ COVBACKUP) CV_PARAMS/ DATA FILES / Figure 1.2: Complete directory tree for the EDUCATES software

1.3.5.1. General notes

Upon selecting CV-ASSIST from the main menu, the user is presented with a few questions, to use which they either enter a response, a value, or press RETURN for default settings. Some questions do require an answer, and these will persist in obtaining the right one.

A request to load a wafer requires that the probe be raised well above the the chuck, the chuck wound out to its full extent, and the wafer placed upon it. The probe should lightly scratch the capacitor dot, as it is brought into contact, in order to penetrate the native aluminium oxide film. Care should be exercised not top damage the probe tip since co-axial C-V probes are expensive and hard to come by! Once loaded, the door to the chamber should be tightly closed to avoid light leaking in. The lamp inside the chamber will be extinguished automatically.

-241- The remaining sequence of events will be dictated by the mode of operation, selected in the menu program, described below. The diagrams of figure 1.3 best depict the key differences between the three modes.

1.3.5.2. Manual mode

It is strongly recommended that in this mode, CV-ASSIST be run first, since this gives the greatest range of qualitative information in the shortest possible time. The high frequency C-V curve thus measured can be used to determine key MOSFET and processing parameters. CV-ASSIST can be used then as a basis for deciding which further measurements ought to be made to fully characterise the device under test.

(MANUAL) (AUTOMATIC)

M EN U [—ASSIST I suggest selected 9! lautomatic MEASUREMENT

(INTELLIGENT) ri I CV—ASSIST I

CV—EXPERT user input

heuristics

Figure 1.3: Manual, automatic and intelligent modes of operation

-242- Manual mode is particularly useful when dealing with awkward cases, novel devices/processing, or for general trouble-shooting. Some optimisation procedures are included to enable sensible sweep-ranges and biasing techniques to be used. Typically, however, the user can expect to have to make several measurements at different settings, and on different sites before a satisfactory result may be obtained.

To use a different program, always return to the main menu first. If requested, the parameters entered for one measurement will be carried forward to the next, thus saving repetitious data entry.

In manual mode, previous sample test data can be retrieved and analysed. This is also possible using the BSDM software supplied with the HP4062 system [17].

1.3.5.3. Automatic Mode

In this mode, many of the questions normally asked in manual mode will be skipped, thus making the system free-running and deterministic. This will not be suitable in all situations, but for the majority of routine testing, is an invaluable facility. CV-ASSIST will perform a full high-frequency measurement and analysis, and select and execute one further measurement (PROF, GV, HFLF, LIFE) on the basis of these results. After this, however, you will be returned to the main menu to manually select another test if desired.

1.3.5.4. Intelligent Mode

The software for this feature is still very experimental, but the CV-EXPERT pilot study has been shown to be reasonably successful and appears to be a promising route to take in future work of this kind. CV-EXPERT uses the Edinburgh Prolog Blackboard shell (EPBS), a copywrited expert system shell developed locally [18]. The software environment to support this system is, at present, rather specialised. For all these reasons therefore, CV-EXPERT is not included as part of the EDUCATES package. For local users however, the following information is supplied to allow CV- EXPERT to be operated on the development workstation.

With the user logged in and the X-windowing system up and running (type 'xllstart'), the EPBS and EDUCATES software should be started up in separate windows as described in section 1.3.2. Once the BASIC MONITOR program is set running, all control will be passed over to the EPBS. The sequence shown below illustrates the input required, and the output generated during a typical session with the CV-EXPERT system:

edecv % prolog cvexpert Edinburgh Prolog version 1.5.04 (12 September 1988) Al Applications Institute, University of Edinburgh ?- rules.

ask.pl reconsulted: 4172 bytes 1.93 seconds listut.pl reconsulted: 9064 bytes 4.54 seconds files.pl reconsulted: 3692 bytes 1.68 seconds rules consulted: 29312 bytes 17.10 seconds yes ?- start. yes ?- run. C-V system menu start system start prototype system revert to manual (in BASIC) How would you like to proceed with the measurements? (1,2 or 3) (... etc.) BASIC was ready BASIC finished executing command BASIC was ready (... etc.) ?- ("D to finish) Prolog terminated edecv %

At this early stage of development, however, input may be required to be entered at either the BASIC or the Prolog screens. Selecting a screen is accomplished by moving the mouse to transfer the pointer from one to the other.

Little operator input should be required, since this scheme uses a superset of the automatic-mode, and program flow is largely dictated by the device under test. Exceptions include unrecognisable data or curve shapes, or an unknown set of circumstances. Should the BASIC system encounter an error, the MONITOR program may terminate and communication with the EPBS will cease. In some instances, this may cause the EPBS to 'hang' as it waits for a message from BASIC. Typing 'C' allows Prolog to be stopped. The shell may be restarted once the BASIC fault has been rectified by typing 'resume.'.

1.3.5.5. If Things Go Wrong

EDUCATES is an experimental piece of software and therefore not fully debugged. Neither the author nor the University accepts any responsibility for any damage or injury arising from its use. Some recognised defects still in existence are listed at the end of this document, together with suggested corrective courses of action where appropriate.

1.3.6. Hardware and Software Requirements

Tried and tested hardware configurations for the EDUCATES software have included:

HP 9000, model 216, 2 floppy discs,

HP 9000, model 217, hard disc,

HP 9000, series 300, most models,

HP 9000, model 319X, HP-UX workstation.

EDUCATES requires one of the following, currently-available BASIC language operating systems in order to be able to function:

HP-BASIC 5.0

HP-BASIC 5.1

RMBIUX 5.5 (HP-UX 6.2)

RMB/UX 5.5 (HP-UX 6.5 with X-windows)

1.4. Software Documentation

In the following pages, a detailed description is given of each sub-program used in the EDUCATES suite. This may provide a useful reference for anyone wishing to alter, customise or improve the measurement, analysis, and display routines.

IWYEME The sub-programs in this section can be found in the PROGram file called 'SUBLIB on the distribution diskette(s). This documentation was produced semi- automatically using Jim Serack's DOCUMENTOR program. For this, two other (ASCII) files 'SUBDOC' and "SUBREF' are used which contain an ASCII version of the "SUBLIB' code (produced by GET and SAVE) and the documentation text respectively. These, and the DOCIJMENTOR program may be obtained from the EMF upon request for those wishing to document their own changes and improvements in the same format.

J.A. Walls, A.J. Walton Edinburgh Microfabrication Facility Dept. Electrical Engineering The King's Buildings Mayfield Road EDINBURGH. EH9 331. Scotland. 9 November 1989

-246- REFERENCES

Yokogawa-Hewlett-Packard Ltd.,, Operating and programming manual, model 4061A semiconductor/component test system, Tokyo, Japan, (September 1980). 4061A system library, Volume I.

I. G. McGillivray, The Measurement of Electrical Parameters and Trace Impurity Effects in MOS Capacitors, (1987). PhD. Thesis, University of Edinburgh

Hewlett Packard Company, Using the BASIC 5.015.1 System on HP 9000 Series 2001300 Computers, Fort Collins, Colorado, USA., (1987). (workstation BASIC)

I. G. McGillivray, J. M. Robertson, and A. J. Walton, "Automatic measurement of minority carrier lifetimes in silicon for nonuniform doping samples using the Zerbst technique," IEEE Proceedings - I: Solid State and Electron Devices, Vol. 134, pp. 161-167, (December 1987).

M. Zerbst, "Relaxationseffekte An Halbleiter-Isolator-Grenzflache," Z. Angew Phys., Vol. 22, (1), pp. 30 - 33, (1966). (in German)

D. K. Schroder and H. C. Nathanson, "On the separation of bulk and surface components of lifetime using the pulsed MOS capacitor," Solid-State Electronics, Vol. 13, Pp. 577 - 582, (1970). D. K. Schroder and J. Guldberg, "Interpretation of surface and bulk effects using the pulsed MIS capacitor," Solid-State Electronics, pp. 1285 - 1297, (1971).

E. H. Nicollian and A. Goetzberger, "MOS conductance technique for measuring surface state parameters," Applied Physics Letters, Vol. 7, (8), pp. 216-219, (15 October 1965).

R. Catagne and A. Vapaille, "Description of the Si0 2-Si interface properties by means of very low frequency MOS capacitance measurements," Surface Science, Vol. 28, p. 557, (1971).

E. H. Nicollian and J. R. Brews, MOS (Metal Oxide Semiconductor) Physics and Technology, John Wiley & Sons, New York, (1982).

J. A. Walls, Capacitance-Voltage Measurements: an Expert System Approach, (1990). Ph.D. Thesis, Edinburgh Microfabrication Facility, Dept. Electrical Engineering, University of Edinburgh.

W. van Gelder and E. H. Nicollian, "Silicon impurity distribution as revealed by pulsed MOS C-V measurements," Journal of the Electrochemical Society: Solid State Science, Vol. 118, (1), pp. 138 - 141, (January 1971).

K. Ziegler, E. Klausmann, and S. Kar, "Determination of the semiconductor doping profile right up to the surface using the MIS capacitor," Solid State Electronics, Vol. 18, pp. 189 - 193, (1975). I. G. McGillivray, J. M. Robertson, and A. J. Walton, "Improved measurement of doping profiles in silicon using C-V techniques," IEEE Transactions on Electron Devices, Vol. ED-35, (2), pp. 174-179, (February 1988). J. R. Brews, "Correcting interface-state errors in MOS doping profile determinations," Journal of Applied Physics, Vol. 44, (7), pp. 3228-3231, (1973).

-247- W. F. Clocksin and C. S. Mellish, Programming in Prolog, Springer-Verlag, Heidelberg, W. Germany, (1987). (Third Edition)

HP 4062B Semiconductor Parametric Test System, II, Hewlett Packard Co.. (Appendix on Basic Statistics and Data Manipulation BSDM)

J. Jones, M. Millington, and P. Ross, "A blackboard shell in prolog," in Proceedings of ECAI'86, (1986). 1.5. List of Deferred Defects

1. Instruments seem to lock up upon first command to the instrument 1-IP-IB (RMB/UX only). This seems to be a system bug and it can be cleared by switching the 4275A on and off again.

Division by zero errors may occur for poor samples, eg. leaky oxide, giving rise to unexpectedly low/high value. These need to be reported so that error traps can be added.

The graphics routines do not work satisfactorily within X-windows, since the alpha and graphics displays are inseparable. - don't, therefore, expect perfect screen images under X-windows!

The HP2627A terminal is unreliable for graphics plotting, since the handshaking appears not to work with HPGL commands. - beware of using the scroll and zoom features of E-Z-Graph on this terminal.

V. On UNIX systems, AUTOLOCK should be enabled in your HOME/.rmbrc to prevent other processes and users from accessing the instruments when you are using RMBIUX.

The softkey labels sometimes do not correspond to their programmed functions when using RMB/UX. - This is a fault of RMBIUX itself (see /usr/lib/rmb/newconfig/DEFECTS).

DO NOT, under any circumstances, connect any instruments or LIF discs to the fast bus (HP-11B 14) or on any other interface which controls a UNIX system disc, as this can cause HP-UX to hang badly.

-249- 1.6. Capacitor Maps

Guides to the commonly used capacitor reticles produced in the EMF for C-V testing. 0 0

TEST CAPACITOR DETAILS ID SIDE (cm) AREA (cm2)

4.00 0.1257 2.00 0.03142 1.00 0.0.0785 0.50 0.00196

-250- VA!

Optional guard rings

C a

rm- 7-1

TEST CAPACITOR DETAILS D SIDE (cm) AREA (cm2)

A: 3.2 0.1024 B : 2.3 0.0529 C : 1.6 0.0256 D : 1.13 0.0128

- 251 - C

F'' DC

G 0 Co

TEST CAPACITOR DETAILS ID SIDE (cm) AREA (cm2) 4.53 0.20521 3.21 0.10304 2.27 0.05153 1.60 0.02560 0.80 0.00640 0.57 0.00325 C: 0.40 0.00160

-252- EDUCATES LICENSE AGREEMENT

LICENSEE: -

LICENSOR: University of Edinburgh Edinburgh Microfabrication Facility Department of Electrical Engineering The King's Buildings EDINBURGH. EH9 3JL

The licensor grants to the licensee a non-exclusive and non-transferable license to use the EDUCATES program suite upon the terms and conditions stated below:

The Licensee acknowledges that the Program suite is a research tool still in the development stage, and that it is being supplied 'as is', without any accompanying maintenance services from the Licensor.

The Program suite supplied by the Licensor shall be used only for educational or research purposes. In addition, the Licensee agrees to inform the Licensor of the results obtained by using or applying the Program suite to the Licensee's work. The Licensee may improve the Program suite or develop other programs using the whole or parts of the Program suite, provided the Licensee informs the Licensor of such improvements, applications and/or developments prior to their publication, and grants at the request of the Licensor the use of such improvements and/or developments under terms and conditions similar to those provided herein.

The Licensee acknowledges that the contents of the Program suite are to be kept confidential and are not to be disclosed or transferred to any person other than the Licensee without prior written permission from the Licensor.

The title to the Program suite and any material associated therewith shall at all times remain with the Licensor.

V. The Licensee acknowledges that the graphics sub-program 'E_Z_Graph' is copywrited and not included in the Program suite. It may be obtained from either Hewlett Packard Ltd., or the originators: Galileo Scientific, 2731 Blairstone Road, #175 Tallahassee, FL 32301. USA.

LICENSOR LICENSEE

(Date) (Signature) (Date) (Signature)

- 253 - Appendix B

There follows a complete runsheet for the fabrication of MOS capacitors:

-254- EDINBURGH MICROFABRICATION FACILITY CAPACITORS

BATCH NUMBER: 748 START DATE: 28/7/87 DEVICE IDENTIFICATION C-V Capacitors MASK SET: Guardring capacitor reticle MASKING SEQUENCE: MASK REV.LETrERS: STARTING MATERIAL: 14-20 obm. cm . (100) P-type, 3m. Dia. No. OF WAFER STARTS: 10 numbered wafers RCA CLEAN All Wafers 10 mm. 800C in 1:1:5, Ainmonia:Hydrogefl peroxide:water in Teflon jig 10 mm. 800C in 1:1:5, HC1:Hydrogen peroxide:water in Teflon jig D.I water wash 15 sec dip in 10% HF D.I water wash Wash and spin dry FURNACE# 1 Wafer# 1,2,3,4,5 (dry oxide) Tempeature: 9500C Idling ambient: Oxygen Preset gas flows as follows: Nitrogen 20% (1.5 1/mm.) Oxygen 20% (1.5 1/mm.) HC1 15% (0.15 1/mm.) Hydrogen 10% (1.7 1/mm.) Load wafers into furnace with Oxygen only flowing. mm . Oxygen + HC1 Aim for 850 Angstroms Measure oxide thickness: Angstroms FURNACE# 1 Wafer# 6,7,8,9,10 (wet oxide) Tempeature: 950CC Idling ambient: Oxygen

Preset gas flows as follows: Nitrogen 20% (1.5 1/mm.) Oxygen 20% (1.5 1/mm.) HC1 15% (0.15 1/mm.) Hydrogen 10% (1.7 1/mm.) Load wafers into furnace with Oxygen only flowing. 5 mm. Oxygen + HC1 17 mm. Oxygen + HC1 + Hydrogen 5 Thin. Oxygen

-255- Measure oxide thickness: Angstroms FURNACE# 1 (post oxidation anneal) Tempeature: 9500C Idling ambient: Nitrogen Preset gas flows as follows: Nitrogen 20% (1.5 1/mm.) Oxygen 20% (1.5 1/mm.) HC1 15% (0.15 1/mm.) Hydrogen 10% (1.7 1/mm.) Load wafers into furnace with Nitrogen only flowing. 5 mm. Nitrogen (Wafer# 3,8) 10 mm. Nitrogen (Wafer# 2,7) 20 mm. Nitrogen (Wafer# 1,6) 60 mm. Nitrogen (Wafer# 4,9) 25:1 ETCH Immerse in 25 vols Ammonium Fluoride soln.(40%w/v),1 vol HF Etch time: Wash and spin dry ALUMINIUM EVAPORATION (Si GATE) Load wafers on planets and load into Veeco 770 evaporator. Top up crucible with 99.995 grade Aluminium slugs (usually 2 or 3). Pump system to 5e-6 or better. Close shutter Run up e-gun to 10 kV and 0.7 A. Open shutter for 1.5 mm. Close shutter and shut down e-gun. When temperature drops below 500C, vent system and remove wafers.

1st PHOTO (POSITIVE RESIST) LAYER# Capacitor reticle WITH GUARDRING HMDS vapour box prime 30 mm. Spin HPR 204 at 6000 rpm for 30 secs Soft bake at 105oC for 30 min in static oven Align and expose for secs. (Optimetrix) Develop in 1 vol 351 Developer, 3.5 vols DI water at 25oC for 60 secs Inspect for proper development. Measure resist image: microns Hard bake at 1300C for 30 min in static oven Inspect for proper baking

ALUMINIUM ETCH Preheat crystallising dish of M.I.T. Aluminium Etch to 500C. Immerse wafer face upwards in etch Place on vibrator until patterns clear completely.

-256- Quench in DI water in sink for 2 mm. Transfer wafer to polypropylene (blue) jig. Repeat with rest of batch. Wier wash for 5 mm. or until resistivity returns to > 10 Mohm.cin. Spin dry in centrifuge for 200 secs.

RESIST COAT Spin HPR 204 at 6000 rpm for 20 secs. Hard bake for 30 mm. at 130oC in static oven. Inspect for proper baking 4:1 ETCH Immerse in 4:1, Ammonium Fluoride soln.(40%w/v) : HF Etch time: Wash and spin dry RESIST STRIP Immerse in Fuming Nitric Acid 10 mm Wash and spin dry Inspect for removal of resist. Measure etched image: microns ALUMINIUM EVAPORATION (Si GATE) BACKS OF WAFERS! Load wafers on planets and load into Veeco 770 evaporator. Top up crucible with 99.995 grade Aluminium slugs (usually 2 or 3). Pump system to 5e-6 or better. Close shutter Run up e-gun to 10 kV and 0.7 A. Open shutter for 1.5 mm. Close shutter and shut down e-gun. When temperature drops below 500C, vent system and remove wafers.

FURNACE# 8 (post-metallisatiOn anneal) Temperature: 435oC Idling ambient: Nitrogen Preset gas flows as follows: Nitrogen 50 (2.0 1/mm.) Nitrogen/40% Hydrogen 50 (2.0 1/mm.) 5 mm. Nitrogen 10 min. Nitrogen/Hydrogen 5 mm. Nitrogen

C-V TEST

-257- Appendix C EPBS C-V Rulebase

The rules in the following pages implement the knowledge-based system described in Chapter 7. The Prolog predicates to effect the BASIC:Prolog interfacing functions are also listed after the rule base.

-258- 3.1. The Rulebase for CV-EXPERT % ****************** set_up level ******************************************

% Rule 1 if notnow[set_up,fUflCtiOfl(X) ,_>O] then (print_main_menu, ask_user(function, X, Cf)) to add [set_up,fuflCtiOfl(X) ,CfJ est 4. % Rule 2 if [set_up,function(3),_>O] then (instruct basic(command, 'load "MENU4", 1') ,nl,write('Blackboard stopping.. .1) ,nl) to action (end_of_file) %suspend prolog est 5. % Rule 3 ------if [set_up,furiction(2) ,_>O) then (nl,write('TEST NODE...'),nl) to add[set_up,dofle(efld) ,1000] and [set_up,meter_zeroed(ok) ,l000) est 5. % Rule 5 ------if [set_up,function(l) ,_>O] and notnowfset_up,done(_) ,true] then (print_objectives_menu, ask_user (objective, X, Cf)) to add[set_up,objeCtiVeS(X) ,Cf] and [set_up,done(X) ,1000] est 5. % Rule 6 ------if @[set_up,done(X),_>O] and holds(X\end) then true to delete est. 6. % Rule 7 ------

if (set—up, done (end) , _>O) and not [set_up, meter_zeroed (Ok) , _>O] and holds(yesno('ready to measure?')) then (instruct _basic (command, 'CALL Offset (X$)'), instruct _basic(commafld, 'CALL teil_prolog(X$,"lOOO",l) ',R)) to add[set_up,meter_ZerOed(R) ,10001 and [set_up,offSet_Called, 1000) est 20. % Rule 8 ------

-259- if @ [set_up, meter_zeroed(short_fail) ,_>0] and (set_up, offset_called,_>0] then (nl,write('SHORT test failed- check lead/contact/fixture for R & L' ,nl, instruct_basic(cominand, 'CALL Offset(X$)'), instruct _basic(coifliflafld, 'CALL tell_prolog(X$, "1000", 1) ',R)) to amend[set_up, meter_zeroed (R) , 1000] est 10. % Rule 9 ------if. @[set_up,meter_zerOed(OPefl_fail) ,_>0) and [set_up, offset_called,_>0] then (nl,write('OPEN test failed- check lead/contact/fixture for C & G') nl, instruct basic(command, 'CALL Of fset(X$)'), instruct _basic(coinmand, 'CALL tell_prolog(X$, "lOOO",l) ',R)) to amend[set_up, meter_zeroed (R) , 1000] est 10.

% ****************** prelim level *****************************************

% Rule la ------if ( set _up,meter_zeroed(ok) ,_>0) and not (prelim,paraln(tYPe,TYPe) ,_>0] then instruct _basic (measure, type, Type) to add[prelim,param(type,TYPe) ,l000) est 10. % Rule 1 if @ [prelim, param(type,Type) ,_>0] and [userinput, param(typel,Typel), _>03 and holds_ Type\==Typel then (nl,write('Yours and my substrate types do not agree, using yours...'),nl) to amend[prelim,param(type,TyPel) ,1000) est 10. % Rule 2 ------

if (prelim,param(type,TyPe) ,_>0) and , param(area , A) , _>0) and (set_up, param(vacc,Vacc) , _>O] then (instruct _basic (measure, dox, prelim, DoxGoxCox)) to add DoxGoxCox est 10. % Rule 3 if [prelim,parain(doX,DoX) ,_>0] then instruct _basic (measure, rs, Rs) to add [prelim,parain(rs,Rs) , 1000) est 30.

% Rule 4

-260- if (prelim,param(rS,RS) ,_>0] and holds(Rs=<100) then true to add [prelim, assume (rS,ok),1000] est 30.

% Rule 5 ------if [prelim,paralfl(rS,RS),_>O] and holds(Rs>1000) then (nl,write('warniflg: series resistance too high'),nl) to add [prelim,assume(rs,high) ,1000] est 30.

% Rule 6 if (preliin,parain(gOX,GOX) ,_>0) and holds (Gox=<0. 001) then true to add (prelim,assulne(goX,Ok) ,1000) est 20.

% Rule 7 ------if [prelim,paraiu(goX,G0X) ,_>0] and holds (Gox>0.001) then (nl,write( 'warning: possible leaky oxide') , ni) to add [prelim,assume(goX,leakY) ,1000] and [prelim,assume(C0X,Uflreliable) ,l000) est 20.

% Rule 8 ------

if [prelim,assUlne(gOX,GOx) ,_>0] and holds(Gox\=leaky) and [prelim,assume(rs,RS) ,_>0] and holds (Rs\=high) then instruct _basic (measure, sweep_range, prelim, Vstartstop) to add Vstartstop est 10.

% Rule 9

if [prelixn,param(Vstart,Vstart) ,_>0] and holds (abs (Vstart) >20) and @[preliln,param(doX,DoX) ,_>0] then (ni ,write( 'warning: cant find accumulation in range 20,-20 V') ,nl) to delete est 10.

% Rule 10------

if (preliin,paraii(V5tart,VStart) ,_>0) and not [prelim,param(Cltuin,Cmifl) ,_>0] then instruct_basic (measure, cmin, preliin, Entries) to add Entries est 10.

-261- o_ - - 4 uie 1------if [prelim,paraifl(Settle_tilfle,Time) ,>0) and holds Tiiue<5 then true to add [preliln,assume(lifetilfle,1OW) ,1000) est 10.

% Rule lib ------if. [prelim,paraiu(settle_tiflie,Tiifle) ,_>0] and holds Time>=5 then true to add [prelim,assuine(lifetillle,Ok) ,1000] est 10.

% Rule 12------

if [prelim,assume(Clllifl,Stable) ,_>0) and not [prelim, assume (clTLifl,Uflreliable) ,_>0] then instruct _basic (xcalculate, nsith, Nsub) to add (prelim, param(nsub,Nsub) , 1000) est 10.

% Rule 13 if [preliln,assume(cTnifl,riSiflg) ,_>0] then true to add (prelim, assume (ilbg),1000) est 10.

% Rule 14a if notnow[user_inpUt, parain(type, Type) , _>0) then ask_user(type,Type,Cf) to add [user—input, param (type, Type) ,Cf) est 7.

% Rule 14

if notnow(user_input , param(nsub, Nsub) , _>0] then ask user (nsub,Nsub, Cf) to add , parain(nsub, Nsub) ,Cf) est 7.

iuie i.------

if notnow(user_iflput, param(dox,Any) ,_>0) then (ask _user(dox,Dx,Cf),D0X is Dx*le-10) to add [user_input, parain(dox, Dox) ,Cf] est 7.

i

if [user _input,paralfl(flSUb,NSUb) ,_>O] and [prelim,param(flsUb,NSUID2) ,_>0) then compare (Nsub, Nsub2)

-262- to add [prelim,bestparalfl(flSUb,NSUb2) , 1000] est 10.

% Rule 17 if (user _input,paraifl(dOX,DOX) ,>0] and (prelim,parain(dox,DoX2) ,_>0] then compare (Dox, Dox2) to add [prelim,best_paralfl(dox,D0X2) ,1000) est 10.

% Rule 18 if @[user_input,param(doX,DOx) ,>O] and [prelim,param(dox,DoX2) ,_>0] and holds not compare(Dox,Dox2) then (writef('\nDoxs do not compare-\n Your dox: %t\ndoesnt match: %t\n',[Dox,Dox2]) ,nl,yesno( 'Are you UNSURE about dox (y/n)')) to delete est 10.

% Rule 19------if [user input,param(dox,Dox) ,_>0) and @ [user_input, param(area ,Area) ,_>0] and not [prelim,best_param(dOX,AnY) ,_>0) then (writef('\nDoXS do not compare-\n Area known to be: %t\n', [Area]) ,nl ,yesno( 'Are you UNSURE about the area of the device? (y/n)')) to delete est 10.

% Rule 20 if not[user_input,paraiU(area,Area) ,_>0] then (area_menu, ask—user (area, Area, Cf) , join_var ("Area= ,, String), instruct asicb (command, String)) to add [use_r input,parain(area,Area) ,Cf] est 2.

% Rule 21 if (prelim,assuine(lifetime,X) ,_>0) and holds(X\loW) then true to add( initial, assume (Ok_to.JUl5e) ,1000] est 10.

% Rule 22 if [prelim,param(VStOP,V5tOP) ,_>0] and holds (abs(Vstop)>20) and @[prelim, assume (Clflifl,Afly) ,_>0] then (ni ,write( 'WARNING: cant find inversion region in (-20,20) Volts'), to amend[prelim,aSsUme(Clflifl,Uflreliable) ,1000] est 10.

-263- % Rule 23 if [prelim,parani(cox,COX) ,Cox<0) or [prelim,parain(Cmifl,Cmifl) ,Cox0] and [prelim,paraiu(type,TYPe) ,_>0] and [set_up,param(vacc,VaCC) , _>0] then (nl,write('Cox unreliable, reducing Vacc by 2V...') ,nl,Vaccnew is Vacc-2, join _var("Vacc" ,Vaccnew,String), instruct_basic (command, String)) to amend [set_up, parain(vacc, Vaccnew) , 1000] est 20. % Rule 25 if [prelim,param(clOX,DOX) ,>0] and [prelim,parain(type,TYPe) ,_>0] and [prelim,param(cmin,Cmifl) ,_>0] and [prelim,paraln(VStart,VStart) ,_>O] and [prelim,param(vstOp,VstOP) ,_>O] then true to add [initial,readylstsweep, 1000] est 10. % Rule 26 if not [prelim,paraiu(pdope,PdOPe) ,_>03 and [initial, assume (ok_toj)ulSe) ,_>0] then instruct _basic (measure, pdope, Pdope) to add[preliln,paraifl(pdOPe,PdOPe) ,l000] est 10.

initial level ****************************************

% Rule 1 ------if (initial, readylstsweep,_>0) and ([prelim,assume(lifetime,1OW) ,_>0] or ([preliln,parain(flSub,NSUb) ,_>0] and holds Nsub>1e18)) then instruct _basic (measure, eqm_cv, Junk) to add [initial, measured(eqm_cv) , 1000] est 10. % Rule 2 ------

if [ ,readylstsweep, _>0] and (initial, assume (Ok_to_PU1Se) ,_>0] then instruct _basic (measure, pulsed cv, Junk) to add[ initial, measured (PUlSed_CV) ,1000] est 10.

% Rule 3 if initia1,measured(pulsed_CV) ,_>0] or [initial,measured(eqlll_cV) ,_>0] then true to add[initial,sCafl(OptilfliSed) ,1000] est 10.

% Rule 4 if [initial,scan(optilflised) ,_>0] then instruct _basic (measure, patt_class , patt_class, Items_to_add) to add Items_to_add est 10.

% Rule 5 if (patt class,implanted,Cf>800] and (preltm,param(nSub,N5Ub) ,_>0] and [prelim,param(pdOPe,PdoPe) ,_>0] and holds not (compare (Nsub, Pdope)) then (nl,write('probablY implanted') ,nl) to add [initial,diagfloSiS(imPlaflted) ,Cf] est 10.

% Rule 6 if [patt_class, compensated, Cf >800] and [prelim,param(nSUb,NSUb) ,_>0] and (prelim,parain(pdope,PdoPe) ,_>0] and holds not (compare (Nsub, Pdope)) then (nl,write('probablY implanted (depletion?) ') ,nl) to add (initial, diagnosis (neg_implant) ,Cf] est 10.

% Rule 7

if [patt_class,traps, Cf>800] then true to add [initial, diagnosis (traps),Cf] est 10.

% Rule 8

if [patt_class,Uxuiforlfl, Cf >800] and [prelim,paralu(flSUb,NSUb) ,_>0] and [prelim,param(PdOPe,PdOPe),_>°] then (compare (Nsub, Pdope) ,nl,write('probablY un-implanted') ,nl) to add [initial,diagflOSiS(UflifOflhl) ,Cf] est 10.

% Rule 9

-265- if [prelim,assume(ilbg) ,_>O] and not[initial, diagnosis (neg_implant) ,_>0] then (instruct _basic (command, 'X$=VAL$ (MIN(Ceqm(*)))'), instruct basic(coiflmafld, 'CALL TellJ)rolog(XS, ,,1000",l) 1 ,Cmin)) to add[initial, best_parain(ciflin,Ciflin) ,1000] est 10.

% Rule 10 if (prelim,paraltl(flSUb,N) ,_>0] and (prelim,param(dope,D) ,_>0] and holds not compare(N,D) and not (initial ,diagnosis (neg_implant) ,_>0] then action get_lesser(N,D,S) to add[initial,best_paralfl(dOPe,S) ,1000] est 10.

% Rule 11 if [initial,diagnosis(X) ,_>O] then true to action (nl,write('Initial diagnosis: sample is '),write(X),nl) est 5.

% Rule 12 if [initial,diagnOSiS(imPlaflted) ,_>0] and notnow (further,measured (profile) ,_>0) and [set up, objectives(profile) ,_>0] and [initial,assume(Ok_tO_.pulSe) ,_>0) and [initial, assume (dit, low) , _>0) then (nl,write('Performiflg pulsed doping profile')) %,nl, instruct basic(measure,PProfile, Junk)) to add [need,tomeasure(prOfile) ,1000] est 10.

% Rule 13 if (initial,diagTIOSiS(imPlaflted) ,_>0] and notnow (further, measured(profile) ,_>0] and [set_up, objectives (profile) ,_>0] and (initial,assume(lifetilfle,10W) ,_>0) and (initial,assuine(dit,1OW) ,_>0] then (nl,write('Performiflg equilibrium doping profile') ,nl, instruct_basic (measure, eprofile,Junk)) to add [need,tonleasure(prOfile) ,1000] est 10.

% Rule 14------

if [need,tolneaSUre (prof ile) ,_>0] and not [further,meaSUred(PrOfi.1e) ,_>0J then (nl, write ('Re-bUildiflg PROF4, please wait...') ,nl, instruct _basic (command, 'LOADSUB Prof linker FROM "/users/j imw/DISTRIBUT instructbasic (command, 'LOADSUB FROM "/users/j imw/DISTRIBUTION/SUBLIB I ' I instruct_basic(coimflafld, 'LOADSUB FROM t/users/j imw/DISTRIBUTION/PZG"), instruct__basic (command, 'CALL Prof linker'),

-266- instruct _basic (command,' DELSUB Prof linker TO END')) to add[ further, measured (profile), 1000 est 10. % Rule 15 if [further,measured(profile) ,_>O] and @[initial, bestparam(vfb,Vfb) ,_>0] then (nl,write('Updatiflg Vfb from Ziegler analysis') ni, instruct _basic(command, 'X$=VAL$ (Flatv)'), instruct basic(conmiand, 'CALL tellprolog(X$, "lOOO",l) ',Vfbnew)) to amend [initial, bestparain(vfb,VfbneW),1000] est 10.

-267- 3.2. The Prolog predicates for CV-EXPERT print _main _menu: - II write('C-VA, system menu'),nl,nl, write('l. start system'),nl, write(' 2. start prototype system') ,nl, write( 1 3. revert to manual (in BASIC)'),nl,nl,nl,nl. How would you like to proceed with the measurements? (1,2 or 3) finds function in (1, 2,3]. print _objectives_menu: - ni, write(' Select desired tests from menu') ,nl, ni, write(simple) ,nl, write('profile') ,nl, write('dit') ,nl, write('all') ,nl, write(rapid) ,nl, write(end) ,nl,nl,nl,nl. Please enter the objectives one-by-one, and finish by typing end. finds objective in [profile,dit,all,rapid,end,SilflPle].

area—menu:- nl, write( 'Select device area from one of the following:-') ,nl, nl, write('0.0128 cm2'),nl, write('O.0256 cm2'),nl, write('O.0512 cm2'),nl, write('O.1024 cm2'),nl,nl,nl,nl. Please enter one the above values or anything else if non-standard finds ar

% text for various situations , for the ask—user predicate: Enter substrate type if known (n or p) finds type in [n,p,X]. Enter approximate substrate doping (/cm3) finds nsub in [X]. Enter approximate oxide thickness (angstroms) finds dox in [X]. % check two numbers are within 10 percent: compare(X,Y) :- Tenpen is X/lO, - Diff is abs(X-Y), Diff < Tenpen. % select the lesser of two values: get_lesser(X,Y, Z) :- XY,Z is Y.

-268- % build a command string for BASIC: join_var (String, Nr,Joined) : - name(Nuinber,N), append (String, N, S) naine(Joined,S).

% the instruct-basic predicates: instruct_basic (measure, What, Result) : - %atomic result basic_status (ready), %check ready build("CALL * ( IJt*8t) ",measure,What, Command), %construct a string basic_instr (Command), %send it basic _status (finished), Wait til ready 1 ,basic_read(Result). %from pipe/file instruct _basic (measure ,What, Index, Entries_to_add) : - %list result basic_status (ready), %check ready build("CALL * ( UU*t) ",measure, What, Command) , %construct a string basic_instr (Command), %send it basic_status(finished), Wait til ready basic_read(Index, Entries_to_add). %from pipe/file instruct_basic(command,What) : - %sends a command, no reply basic_status(ready), %check ready basic instr(What), %send it basic_status (finished). %wait til ready instruct basic(command,What,Result) : - basfc_status (ready), %check ready basic_instr(What), %send it basic status(finished), %wait til ready basic-read (Result) . %from pipe/file instruct basic(calculate,What,ReSU1t) : - %atomic result basic_status (ready), %check ready build("CALL * ( h 1 '* 1 t' 1 ) ", calculate,What, Command), %construct a string basic instr (Command), %send it basic_status(finished), %wait til ready ! ,basic_read (Result). %read back list from pipe/file instruct _basic (Operation,, Result) : - nl ,write ( 'un-recognised command to instruct-basic: write (Operation) ,nl, fail.

/* macros to interface to RMB/UX at a low level

/* this checks the status file */ basic-status(finished) repeat, exists('. ./ipc/statusfile'),

-269- see('../ipc/statusf lie'), % check for existence read(ready) ,seen, tab(5) ,write('BASIC finished executing command '),nl. basic-status(ready) repeat, exists('. ./ipc/statusfile'), see('. ./ipc/statusfiie'), % check for existence read(ready) ,seen, write('BASIC was ready '),nl, delete('. ./ipc/statusfile'). % remove after read basic-status(X) : - nl,write('unknown mode in basic-instruct: '),write(X) ,nl. /* and this sends a command to hp-basic */ basic instr (Command) write(Command) ni, %echo to screen teii('. ./ipc/trig') ,write(Command), told, %write out command file sh( 'mv . . /ipc/trig . . /ipc/trigger'). %make it appear to basic

1* this checks if there was a basic run-time error: */ basic-error :- see(errorf lie), read(error), %fails if "error" not read, 2nd clause activated. seen, write('ERROR in basic program'),ni.

basic-error : - seen,!,faii. %ensure seen is performed

1* these read back data from basic via fifo/file *1 % in atomic mode: basic_read (Atom) : - open('. ./ipc/f lie'), read (Atom), Atom\=end_of_file, seen, deiete('. ./ipc/file'). basic_read (Atom) : - seen, - nl,write('faiied to read results correctly (atomic mode)'),nl, !,fail. % and in list mode % format must be: fact. % cf. basic_read (Index, Entries) : -

-270- open('. ./ipc/file'), read (Fact) , Fact\=end_Of_file, read(Cf), Cf\=end_of_file, Entry=[Index,Fact,Cf], basic _rd (Index, Rest), seen, delete('. ./ipc/f lie'), accum(Entry,Rest, Entries). basic_read (md, Ent) : - %wrong if gets this far seen, ni ,write(' failed to read results correctly (list mode)') , ni, !,fail.

basic_rd(Index, Entries) : - read(Fact), Fact\=end_of_file, read(Cf), Index, Fact, Cf Entry= ] [ basic _rd (Index, Rest), accuiu(Entry,Rest, Entries). basic _rd(Index,Rest) :- %needs working on Rest=at_ the _end, read(End), End==end_of_fiie. accuxn(X, at_the_end,X). accum(X,Y,(X and Y)). 1* ------*1 /* here are some utilities */ 1* ------*1 % operator declarations:

:- op(900,xfy,cf). :- op(899,xfy,ifl). op(899,xfy,finds). op(878,xfy, of). :- op(920,xfy,and). %for accum

% the ask—user predicate: ask_user (Thing,Value, Cf) : - Question finds Thing in List, nl,write(Question) ,write(' ?') ,tab(3) ,write('> '), read (Reply) ,nl, (Reply = (Value cf Cf), member(Value,List) ; Reply = Value, member(Value, List), Cf = 1000 write('Eh !') ,ni,ask_user(Thing,ValUe,Cf)).

1* Filename substitution into shell/1 commands. */

- 271 - build(R, Fl, F2, Corn) :- dnanie(Fl, NI-Cl), sigr(R, C-N1, D), dname(F2, N2-C2), sign(D, Cl-N2, C2), naine(Com,C). % reproduce command atom s ign ([O'*T), E-E, T) :- !. sign([HT], [HIU)-E, D) :- sign(T, U-E, D). dname(A, B-C) : - name(A, D), ternii(D, B-C). termi([], C-C). termi([HIT], [HIU]-C) :- terini(T, U-C).

1* Example : files("basic < * > * -o float", raw -56, cooked, A).

gives A = " basic < raw-56 > cooked -o float" *1 % must RE-consult these library routines to avoid duplication:

:-(-'ask.pl']. %user interface utilities :-[-'listut.pl']. %list utilities :-[-'fiies.pl']. %f lie utilities

-272- Appendix D Publications

In this section, three publications by the author which are relevant to this work are reprinted. Permission to reproduce these papers was granted by the Institute of Electrical and Electronic Engineers [1,2,3].

J. A. Walls, A. J. Walton, J. M. Robertson, and T. M. Crawford, "Interpretation of C-V curves for process fault diagnosis: a machine-learning expert systems approach," in Proceedings ICMTS 1988, pp. 174 - 178, Long Beach, CA., (February 1988).

J. A. Walls, A. J. Walton, J. M. Robertson, and T. M. Crawford, "Automating and sequencing C-V measurements for process fault diagnosis using a pattern recognition technique," in Proceedings JCMTS 1989, Vol. 2, pp. 193 - 199, Edinburgh U.K., (March 1989).

J. A. Walls, A. J. Walton, and J. M. Robertson, "Application of Al Techniques to the Control and Interpretation of C-V Measurements," To be presented at ICMTS 1990, San Diego, CA..

-273- Interpretation of Capacitance-Voltage Curves for Process Fault Diagnosis: A Machine-Learning Expert Systems Approach

LA. Wail,. AJ. Walton. J.M. Robertson. T.M. Crawford "

Edinburgh Microfabrication Facility Departmcnt of Electrical Engineering. University of Edinburgh. srracr: C-V measurement of oxides and their interfaces is a characteristics. This approach can be applied to C-V measurements werful technique but it suffers from the problem of data interpre- made on capacitor dots which are invariably present on process con- ion. This makes it difficult for the non-expert to to take full trol chips. vantage of the available information. This paper discusses the plication of a pattern-recognition system which can be used to dress these problems for C-V measurements. 2. The new approach.

A machine learning system differs from a classical rule-based Introduction expert system in that the underlying rules are not known and their interactions are too complex for simple heuristics. Instead, the Sys- C.V measurements can be used to obtain MOS parameters such tem is taught to associate between input and output and establish gate oxide thickness, threshold voltage and surface doping density direct relationships between cause and effect, whilst allowing for d with further analysis, oxide charge, interface trapped charge, interdependencies and co-operation between the various parameters inority carrier lifetime and doping profile (1,2,3,4]. However, the and fault conditions. ±nique has many potential pitfalls for the unwary and it is of The use of a machine-learning system to analyse a C-V curve has a ramount importance that the algorithms used are able to account number of potential advantages over visual inspection for parameter all eventualities. The correct extraction of accurate parameters extraction: th complex interactions requires careful processing of the data, I. The operator need not have an in-depth understanding of the (2,3). A simplistic amples of which are given in references technique or the underlying physics, thus rendering the meas- roach can result in misleading or ambiguous parameters if the urement more suitable for routine process control on a fabrica- era are not experts in the interpretation of the measurement. tion line. The problem lies, not so much with the measurement which There is no requirement for detailed knowledge of the theoreti- be accurate and repeatable, but with its interpretation. The n cal relationship between the input data and the derived parame- ape of a C-V curve reveals many process defects or characteristics ters. The ML.S will automatically find any quantitative relation- a sample; some of the effects these have on C-V curves are shown ships which may include some that have been previously over- figure 1. It is when combinations of these effects are present that looked. terpretation becomes difficult. It is relatively easy to produce training examples with known lb. ne fault conditions for any particular process. Re-training can thus take place without re-programming being required (up-dating - - - - led uide rules), since the need for acquisition of expertise and subsequent sdpsti•pb.t\ \\ encoding into a rule bask is obviated. Liol V1

U. inood 3. System outline.

The new measurement system comprises five components: - —"- Data acquisition '\ Data smoothing \ \•N1 ha Vedin It Pre-processing Feature-extraction LuIt \ \ Pattern matching

Before the system can be used, it must first be trained on representa- Figure 1: Effects of implant, interface trap density, oxide tive samples: by exposing the MLS to the various features of the C-V thickness and substrate doping on the C-V curve curve and their interpretation it may be taught to recognise relation- ships and identify faults as illustrated in figure 2. The usefulness of an expert system for the routine use of C-V a diagnostic tool is the driving force behind this work and the sub- ct of this paper is the potential of a machine learning system The shape of the C-V curve must be presented to the MLS by a dLS) to extract parameters such as oxide thickness and threshold vector of salient features which are graphical attributes chosen for Ijusting implants. Interpretation of the measurement, in its rim- their uniqueness. The biggest obstacle to designing such a recogni- esi form, involves relating cause to effect; a MLS can map directly tion system is choosing suitable features. Familiarity with the C-V om patterns to parameters and can be used to derive processing measurement intuitively leads to a choice of certain features for their strong physical significance. kX is with Hcwt1 Packard Ltd., Sarth CJLctrj, Lthian. T.KCr­ West autocorrelation matrix for the input features. This compact form contains all the necessary information regarding inter-relationships 101. Ins I,- Declared tuvli between input patterns and outcomes. The versatility and ficaibility of this algorithm are apparent from its previous applications (9.101.

Outcome Input features Adaptive biiprae ••• X combiners feedback

Figure 2: MLS - Training and Use modes

It is obviously desirable to choose features that vary signifl- antly and in a linear manner for a particular parameter or fault con- ition. Break-points and curvature maxima are easily extracted and re widely used for pattern recognition [5]. These, together with Autocorrelation matrix umerical parameters were used as the set of features for the general- ed C-V curve shown in figure 3. Figure 4: Schematic illustration of the learning algorithm

The self-improvement or elementary learning of this system involves optimising a matrix of coefficients; a process for which assessment and credit assignment are quite difficult. However, by observing the emerging weights and the feedback error during the training mode, it is possible to see which features have contributed significantly to the final decision function and gauge whether a stable result has been achieved. The acid test is the accuracy with which the MLS can identify a set of new examples.

The simplest possible system was investigated by analysing the effect on the curve shape of the gate oxide thickness and the sub- strate doping. The mathematical relationship between the C-V curve and these parameters are well known 1111 but the evaluation of these relationships using an expert system provide it with a taxing exercise. The suitability of the feature set under trial was studied by examin- ing the error between the previously calculated results and those Figu given by the MIS. A number of feature sets were examined and their applicability is discussed below. he proficiency of the MLS is dependent upon the initial expertise The relationship between measured capacitances and the two rovided by an expert deciding upon: parameters D. and N.& are very different:

= €CA0r Which raw data to collect Co. (1) How to process that data i)_2 The choice of appropriate salient features and - (L.. - The division of the problem space (2) ln(N /n,)+%ln [21n(N /n1)-1 e,q C C mn J 'his requires considerable experience of the measurement itself and Equation (1) is a linear relationship whereas equation (2) for N b is good qualitative understanding of the underlying physics and the transcendental and provides a real challenge to the MIS. The fol- radical problems associated with the measurement. lowing set of input and output features for the MIS were investi- gated:

The Learning Mechanism Exam Ic 1 Example 2 Input Output Input Output approach to the task of pattern recognition is A 'black-box' [61 log!_ D. be technique which has been implemented. This is an input:output levice whose i/o relations can be specified without too much regard log C. D its internal operation. This is analogous to the behavioural C. cientist's interpretation of an animal brain in its learning phase. log N Jnfortunately, although these systems are the simplest to implement, hey are the hardest to interrogate due to the opacity of the nowledge gained during training. However, for some applications, is quite sufficient just to achieve a transfer function between input nd output without an understanding of their exact relationships 171. The actual learning algorithm which has been used, is one bor- owed from the domain of signal processing 181. It is an array of daptive linear combiners computed recursively on the same corn- titer which is used to control the instruments making the C-V meas- I Thc F° stIathy an crpen ryst= amm mts for in cdinkai by wwVimg auSt or Nam irement. The result, after training is a matrix of weights and an deons made in amving Si 2

Exam 1e 3 Exam pie 4 I F1 r Input Output Input Output

log- D, eature 3 C. C. C'W C__ WS be 3 C. a 4. log N, log N C. C. slope £ S 1 Features 1.2 ------/ Results. / I. • I S 3 5 For each set of inputs and outputs the evaluation was achieved training on a random set of simulated C-V curves for a range of Training example ride thicknesses (200-1000 Angstroms) and substrate dopings Figure 6b: Evolving weights for the D combiner 013 - 1016 cm _3)• These are illustrated in figure 5.

F- m ci to ft 95 r- .- CD r- 23 'I

18

13 I. 0 I Nsub/ error I.. U 8 l I ' I 3

Do x error

-, I I • a 6 9 12 IS CL 'a Training example U Gate Bias --> Figure 6c: Feedback errors for both combiners 5: Training set of simulated C-V curves Figure Figure 7 illustrates the second example where the MLS was trained on the same features, but taught to associate these with D For the first example shown below the estimation for D . is and the log of the substrate doping. These features were chosen to good because of the strong correlation between 1/C and D Ty make the relationships between the input and output parameters illustrated by the strong preference of feature 3 in figure 6a. more linear [111. The results are far more interesting. The features combiner shown in fig 6b appear ithough the weights for the N.,b have stabilised nicely with the feedback error visibly decreasing as stabilise, they are only small values - in effect the MIS is not learning progresses. As a result the estimation errors have been tablishing linear relationships. This failure is reflected in the very reduced by an order of magnitude to 120 %. The following exam- rge estimation errors in the subsequent testing mode (400 %) pies have a fourth feature in the input data vector and the effect this has upon the accuracy is now discussed.

­,io i ,,, In 28,

Feature res%Ura I 2 2

4 a ii C -c 0 _ ci

! I C I 4/ - Feature 3 \ Fexiw. 2 -2

Feature -5 6 9 12 15 0 3 3 6 9 12 15 Training example Training example , combiner Figure 6a: Evolving weights for the N_j Figure 7a: Evolving weights for the N_b combiner

_ mø i r m ..

F• sturS

a 0 L

S to. 10 CD

S Features 1.2 I / I / I t I lo I • I 0 3 6 9 12 0 3 6 9 12

Training example Training example

Figure 7b: Evolving weights for the D,. combiner Figure Sb: Evolving weights for the D combiner

: I. 45 _ r _

18 II I 18 J Naub error 6 - I- I t 0 I- - 8 I. fox error W -6

%l 1 I P4sub error — — - — - 3 •18. / / -- .1 I I • I / 0 3 6 9 12 •3e • 0 3 6 9 12 I: Training example Training example Figure Ic: Feedback errors for both combiners Figure Sc: Feedback errors for both combiners 2 For the third example, shown in figure 8, 11C COM.was chosen as e extra feature on the bas of an empirical result [13] relating it to ibstrate doping. There is only a small improvement in estimation N.J C ICe curacy to 55 %, and from looking at the weights for N, it can be iserved that the fourth feature has been suppressed and contributes zy little to the output. The fluctuation in the feedback error has / ighlly increased too. 17

NI w rr. I r r 11 2 £-

0 :s 2

S S 12 15 S 0 3 6 9 I Training example

Figure 9a: Evolving weights for the N-b combiner

0 3 6 9 12 25

Training example

Figure 8a: Evolving weights for the N combiner

rr I r 7. Acknowledgements 42

Feature 3 This work is funded by the Science and Engineering Research 32 Council of Great Britain. J. Wails would also like to acknowledge Hewlett Packard Ltd. for financial assistance. a C 22-

U 12- S. References

S I F..t,.s 1.2.4 (1] McGillivray, I.G. "The Measurement of Electrical Parameters and Trace Impurity Effects in MOS Capacitors" , PhD Thesis. University of Edinburgh. 1987.

: McGillivray 1.0., Robertson J.M., Walton A.J. "Improved 0 3 b Measurement of Doping Profiles in Silicon Using CV Tech- Trans Electron Devices. Feb 1988. Training example niques", IEEE McGilliway 1.0., Robertson J.M., Walton A.J. "Automatic Figure 9b: Evolving weights for the D,,, combiner Measurement of Minority Carrier Lifetimes in Silicon for Nonuniform Doping Samples Using the Zerbst Technique", to be published in lEE Proceedings - I: Solid State and Electron I I. fe r ,- C. r F Devices Nicollian, E.H., Brews, J.R. "MOS Physics and Technology", Wiley, 1982. Fischler, MA., Bolles, R.C. "Perceptual Organisation and Curve Partitioning", IEEE Trans. Pattern Analysis and Machine Intelligence-8 , 1986, pp 101. "Machine Learning . Applications in IL (6] Forsyth, R., Rada, R. 0 Expert Systems and Information Retrieval", Ellis Harwood Series I. I. in Artificial Intelligence, 1986. w [7] Rosenblatt, F. '"The Perceptron: a Probabalistic Model for Information Storage and Organisation in the Brain', Psycholog- .error ical Review, vol. 65. 1958. (8] Cowan, C.F.N., Grant, P.M. "Adaptive Filters", Prentice Hall. 1985. 191 Crawford, T.M., Marton, V. "A Machine Learning Approach to Expert Systems for Fault Diagnosis in Communications 0 3 b a Equipment", lEE Symposium on Current Trends in the Applica- Training example tion of Expert Systems - Strathclyde University - 2nd April 1986. Figure 9c: Feedback errors for both combiners (10] Mirzai, A. R. '"Tuning of Waveguide Filters Using a Machine Learning Approach", Edinburgh University Internal Report For the last example, shown in figure 9, 11C was replaced (Signal Processing Group), 30 July 1987. y the maximum gradient of a tangent to the curve in the transition [11] Goetzberger, A. "Ideal MOS Curves for Silicon", Bell System Technical Journal, Sept. 1966, pp 1097-1122. egion. This is known to relate to substrate doping 1131. The results how a marked improvement in the estimation error for substrate (12] Zainincr, K.H.. Heiman. F.P. 'The C-V Technique as an The weight for the fourth feature has Anayncal Tool.", Part 1, Solid State Technology, May 1970, pp loping, down to within 5 %. 49-5 N, and has been ontributed significantly to the calculation of [13] Zainincr, K.H., Hciman. F.P. 'The C-V Technique as an estimation. The feedback error has also onectly suppressed for D Analytical Tool.". Part 2. Solid State Technology, June 1970, pp een reduced as the system attempted to establish the following rela- 46-55. ions: 1 w 1 logi + w2 log a---- + W3 — + w 4 slope)d) D. — i I - C. I. J

Ne ws (i-) + w6log E + w7..—+ w 8 510pef) _ ( SE!1)az

Conclusion

The feasibility of using an expert system to extract important parameters from a C-V curve has been demonstrated. It has been established that the choice of the feature set is of crucial importance with the achieved recognition accuracy for the best feature set being 5% for N,, and 0.005% for D. over a wide range of process vari- ables. These figures do not necessarily represent the fundamental limit of this approach and future work will continue with developing improved feature sets. The technique will be extended further to include many of the other parameters which can be extracted from C-V measurements. UTOMATING AND SEQUENCING C-V MEASUREMENTS FOR PROCESS FAULT DIAGNOSIS USING A PATTERN-RECOGNITION APPROACH

J.A. Walls, A.J. Walton, J.M. Robertson, T.M. Crawfordt Edinburgh Microfabrication Facility The Kings Buildings Mayfield Road Edinburgh tT.M. Crawford is with Hewlett Packard Ltd., South Queensferry, West Lothian.

STRACT S paper presents a strategy for extracting the maximum ount of information from C-V and G-V measurements, by abining decision-support software and statistical pattern ognition techniques to help sequence the measurements on MOS test structure in an appropriate manner.

INTRODUCTION The initial interpretation of a capacitance-voltage (C-V) ye is the key factor in determining the further analyses uired to fully characterise the Metal-Oxide-Semiconductor OS) capacitor. Measurements applied in the wrong order with inappropriate models and assumptions will result in a leading interpretation of the device under test (DUT). For mple, systems that determine flatband voltage without wing for the effects of trapped interfacial charge and lower iority carrier lifetimes [1,2] will obviously yield false )rmation if the necessary corrections and measurement iniques have not been used. Furthermore, advanced Figure 1. Anomalous profile arising out of applying a pulsed iysis of doping profiles, interface trap distributions and C-V doping profile measurement to a sample with high inter- rnrity carrier lifetimes would be incomplete or even face trap density. ppropriate without some prior estimate of the sample racteristics. There is obviously a requirement for an intelligent troller to sequence operations. It should be able to make luctions about the shape of the C-V curve, to inform the r of any approximations made, and also recognise the ilficant factors within a measurement so that the final result y be fully justified.

Justification C-V analysis of the MOS capacitor device structure is a effective, and powerful process diagnostic tool, but it can subject to mis-interpretation. The technique, although entially reliable, has pitfalls for the unwary user, since ay procedural considerations are required in order to dmise the usefulness and accuracy of the measurements. example of the kind of problem often encountered by omatic C-V systems, is the effect of a high value of rface trapped charge on the doping profile. The layer of Figure 2. Distorted multi-frequency interface trap conduc- rfacial charge gives rise to a false peak in the concentration tance curves from a sample with a threshold-adjusting im- file as shown in figure 1. Likewise, interface trap plant. ductance (G,,) measurements performed on samples with a ily non-uniform doping profile will produce G-, curves ch deviate markedly from the usual Gaussian form, as 2. IMPLEMENTATION wn in Figure 2. This will cause difficulties in the Building upon an existing kernel of measurement and nerical analysis [3] of these curves for trap analysis routines [2,4,5], an assistant (CV-ASSIST) has been racterisation. devised to: Help inexperienced operators perform routine C-V These curve shapes may be represented by a vector of measurements for process-control to determine parameters salient features describing the curve. These feature values are such as Dox , V- Yr , Q,,, , QSS without requiring any graphical attributes chosen for their discriminatory power to knowledge of the underlying physical principles. aid classification. The complete set, {l,2,3,4,5,6,7,8}, is Assist in making correct assumptions about the DIJT. listed: dC Facilitate automatic and complete device characterisation Magnitude of first peak in by intelligently sequencing further analyses once the initial dV dC boundary conditions have been established. Magnitude of second peak in dV By treating interpretation as a classification exercise, a rn-recognition system has been developed to distinguish Spacing between the first and second peaks en different categories of MOS capacitor characteristics Ratio of the magnitudes of the first and second peaks inalysis of the C-V and 0-V curve shapes. Three The log of the peak in G, vs. V divided by the peak urements are made: width at Ye height • Pulsed C-V Magnitude of the peak in the G,1 vs. V curve • Equilibrium C-V Number of apparently negative G, values, as a • Equilibrium 0-V percentage e 3 illustrates the variety of curves which may be Amount of noise in 0, data (number of positive to Lintered. negative gradient changes)

C/Cox C/Cox

1,

0.8 0.8

0.6 0.6

0.4 0.4

0.2 0.2

0 0

Bias voltage Ii C/Cox C/Cox

0.8 0.8

0.6 0.6

0.4 0.4

0.2 0.2

0 0 Bias voltage Cit/Gm ax CiVGmax

(I) 0.8 0.8

0.6 0.6

0.4 0.4

0.2 0.2

0 0 Bias voltage Bias voltage Bias voltage re 3. Some schematic examples of representative C-V of a high density of interface trapped charge (- G-V curve shapes; (a) The ideal C-V curve, (b) Linear 1O"cm 2eV_'), (f) Effect of a 'leaky' oxide, (g) Effect of short along voltage axis due to fixed oxide charge, (c) Effect minority carrier lifetime on a pulsed C-V sweep, (Ii) Effect of iditive VT adjusting implant (B in p-substrate), (d) Effect a high 0h on the interface trap conductance curve. (i) Effect )mpeflSa(iflg VT adjusting implant (B in n-well), (e) Effect of high series resistance on interface trap conductance curve. e biggest obstacle to designing a successful pattern classifier where :hoosing suitable features [6]. This procedure is crucial to success of the pattern-recognition process and familiarity It the measurement accordingly leads to a choice based iL = = n-dimensional feature vector dominantly on intuition [7]. Figure 4 shows an outline of process used for feature extraction. Each sample can be xn tresented by a point in a multi-dimensional space, and and aples that are similar in nature will form groups or clusters hin that space as shown in figure 5. p = cluster mean C= covariance matrix Optimised + corrected C—V. C—V curves This can be thought of as a simple Euclidean distance (x 2+y2 +z2) modified by a transformation which rotates (decorrelates) and scales the coordinate axes [lOj. The Analyse shape Analyse shape Analyse shape covariance matrix describes the size and shape of a cluster, of dc/dy or G(it)—V I I of displacement with the mean giving its position. 2.1. Test structure. The MOS capacitors measured in this study were Extract features I I Extract features I I Extract features fabricated on <100> p-type silicon and follow the early process stages of a standard CMOS recipe: RCA wafer clean Dry HCl gate oxidation + in-situ anneal (950°C) Threshold-adjusting boron implant

Most likely Implant activation anneal (950°C) Pattern classifier; fault category Metal deposition & etch, front and back surfaces Post-metallisation anneal (435°C). ;ure 4. Schematic outline of feat ure-extractionJpattern. ognition processes The capacitor 'dots' were produced by projection lithography for optimum gate area definition, and incorporated a guard- ring around the gate periphery, as shown in figure 6, to Fn Cluster B prevent lateral charging of any surface moisture.

MOS capacitor willi guard ring ______

Cluster A ia

X. Cluster C

- - KIM nX F2 MIK__ A"MMMURVK ). Met i1 Ii sc d • prototype point C back X = cluster mean ___ - I t.ae ure 5. Clustering of similar classes of data in n- Figure 6. Schematic of the MOS test capacitors with guard- nensional feature-space. rings

Pattern classification is effected by nearest-neighbour stance calculations using a hyperspace distance measure. An 2.2. Measurement system. iclassified sample (i), will be allotted to the nearest cluster, The measurement system used was based on a Hewlett ), to which the current sample data point is most similar. Packard 4061 semiconductor test station, using a HP series 300 ie particular multivariate measure used was the Mahalanobis computer as a controller. The pattern recognition software stance, (D) [8, 9]. This is the inner product of the cluster- has initially been set up as a 3-way classification scheme can vector and the prototype point, normalised to unit between variance along each dimension in feature-space. The Class 1. Ion-implanted substrates, ototype point is the feature vector of the pattern to be Class 2. Interface traps in the MOS structure, and ssified and is indicated in figure 5 together with the cluster Class 3. Uniform or near-uniformly doped substrates. can. After sufficient training on a variety of samples under the DJ=(—)' C -' (-) (1) supervision of an expert, the system is able to identify the ority of C-V curves presented to it. Examples not falling any of the defined categories are positively rejected whilst % confidence e exhibiting combinations may be multiply classified. 100

RESULTS 80 The system was trained using a wide variety of samples. e consisted of: 60 • 9 different ion-implants (VT adjust), • 10 different oxides affected by interface traps 40 (no post- metallisation anneal), and • - 15 different unimplanted, 'clean' oxide samples. 20 ing training, the error rate was observed as it dropped to acceptable level indicating that a sufficient number of 0 1 2 3 4 5 7 mples had been presented. The success of the classification 6 ,me was then evaluated using a x2 confidence test to assess sample significance of the Mahalanobis distance. The halanobis distance is a sum of squared terms; % confidence 100 D 2 (xr p.y )c r (xs pis ) (2) r1 s1 re c'3 denotes the element in the rth row and sth column 80 . This simple statistical test can therefore be applied to as the x2 variable with (n —1) degrees of freedom, where 60 the dimensionality of the feature space. The resulting centage confidence level then becomes a convenient, 40 malised gauge for quantifying two figures of merit for the isifier [11]: 20 In ira-class similarities show how well the samples within a pattern category were classified. The majority within any 0 class were identified with 90% confidence or greater as 1 2 3 4 5 6 7 8 shown in figure 7. sample The inter-class differences were calculated similarly to illustrate how well samples from one class were rejected % confidence from the other two. The majority fall well below the 90% threshold as can be seen from Figure 8. 100 test enables the classifier to be equipped with a 'reject ion' which prompts the expert, during initial training, with following options should the confidence level fall below a set limit indicating that the pattern is significantly different in any in the database: Not to include the example in the training set, or To create a new category of pattern is also prevents statistical outliers from distorting the pattern sters. A significant result was deemed to have 90% Lfidence or greater. This process is illustrated in figure 9 ich shows the appropriate curves, extracted features and the U Z 3 4 5 b 7 0 9.1011 sIting decisions between the three classes for a metal gate I i sample )S capacitor. Quite obviously this is a sample that is resentative of an MOS capacitor with a high interface trap isity since the x2 test gives a confidence level of 95% for Figure 7. Bar-charts showing the intra-class confidence levels symptom. This then provides the stimulus to invoke for class membership. Confidences of 90% or greater were thee measurement routines to analyse trap energy and deemed to indicate successful recognition within a particular tribution, whilst warning that any doping profiles generated - category. (a) Class 1: Interface traps, (b) Class 2: V.-adjust St be corrected for the effects of interface trap distortion in implant, (c) Class 3: Uniformly-doped substrates. C-V curve. During subsequent trials on more samples, it was found 4. CANONICAL ANALYSIS t there was insufficient discrimination between some groups A solution to this problem was to use canonical analysis ich was mainly due to inadequate feature selection. In to increase the separation between classes. Canonical variate ticular, the clusters representing uniform and non- analysis [8, 101 is a statistical technique for variable (feature) formly-doped substrates were close together, though selection which involves transformation of the feature space to inct. The effect this has on the inter-class discrimination emphasize the differences between groups of patterns, it be seen from figure 8c. Although almost all the examples produces (G —1) new features, where G is the number of below the 90% confidence limit, this threshold is arbitrary classes, which are linear combinations of the original variables, I its value obviously critical to the function of the classifier. resulting in the lowest classification error rate. It thus finds confidence 100 - 1:2

80 RO 1:3 C U 60

40 CLa

a U 20

0 1 2 3 4 5 6 7 -C.0 -5.0 -4.0 -3.0 -20 sample Vol tae i confidence 'vu

a 60 F.....9 4' C V L 60. V 4- 40

+1 'I 20

0 1 2 3 4 5 6 7 8 0 sample VC 1 t age

confidence 42 100 C

80 3:2 N C a I' Ho U - 0 C 40 0 a 0 I. CL 0 - 20 I- -x n V 1234567891011 sample Volt age (c) gure S. The inter-class rejection ratios from the application the -test to the Mahalanobis distances between samples different classes. Confidences less than 90% illustrate suc- ssful rejection from the other categories. (a) Class I from and 3; (b) Class 2 from 1 and 3; (c) Class 3 from 1 and 2. e optimum linear combinations with regard to the Lfferences between the means and how much the groups read about their means in order to find axes in directions of igh information content. To maximise the differences tween the three chosen classes of C-V characteristics, the Figure 9. Example output from CV-ASSIST showing curves o most significant canonical eigenvectors of the pattern space produced, features extracted, and diagnostic results. ere selected. With up to eight different salient features available for particularly when the number of samples available is not large. -V curve classification, it is also advantageous to apply a data Performing canonical analysis, on the existing feature set Auction technique to reduce the number of dimensions, improved classification as can be sccn from the intra- and onfjdence % confidence

00 £ VU

80 80 h.1

60 60

40 40

20 20

0 0 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 sample sample (a) (a) onfidece % confidence 00 100

80 80 10

60 60

40 40

20 20

0 0 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 sample sample (b) onfidence S confidence 00 100 - 3:1

80 80 RR 3:2

60 60

40 40

20 20

0 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 sample sample (c) e 10. Intra-class results after canonicalisation for (a) Figure 11. Results after canonicatisation showing the inter- 1: Interface traps, (b) Class 2: V 1-adjust implant, (c) class rejection ratios. These charts illustrate successful rejec- 3: Uniformly-doped substrates. tion from the other categories. (a) Class I from 2 and 3; (b) Class 2 from 1 and 3; (c) Class 3 from 1 and 2. class results in figures 10 and 11. The upper confidence told although now smaller, (-60%) is considerably eat from the lower, class-rejection threshold, (-30%) along the two most discriminatory axes of the feature space for o gives a wider margin between the recognition and a basic 3-feature subset {2,4,6}. The 'clusters' appear more as rnination between classes of C-V curve. radial lines and are quite close together at the origin. Figure Ihis procedure has the added bonus of enabling the 12b illustrates that the addition of further curve-shape -dimensional data to be viewed in just two dimensions descriptors results in the clusters becoming more rounded and permitting a visual comparison to be made between moving further apart. ent feature sets. Figure 12 shows how the cluster shape The final set used which gave a consistent classification eparation can be improved by adding further 'tuning' performance was the subset {l,2,4,5,6,8} of the eight available es to the basic set. features. One of the criteria used in this choice was obviously e 12a shows a scatter plot of the three sample clusters the ease and reliability of feature extraction.

- 284 - measurement system. The pattern recognition then provides for the translation of numeric to symbolic data, and the expert system for opportunistic scheduling of tests. The software environment is now available which will make this approach feasible.

0 4' 0 CONCLUSIONS U ) This paper has demonstrated how a pattern-recognition C U system can be applied to the interpretation of C-V curves on a an MOS test structure. it is capable of automatically U sequencing the appropriate measurements required to accurately extract the maximum amount of information available from C-V and 0-V measurements. Unlike most other 'Expert Systems', CV-ASSIST is an integral part of the -ç -3 -i.Z 1.2 36 U measurement, instrumentation and control software and is thus Qr., esor I able to call up a sequence of individually tailored tests for the (a) MOS test structure under investigation.

ACKNOWLEDGEMENTS 4 3.6 This work has been funded by the Science and 0 Engineering Research Council of Great Britain. J. Walls L would also like to acknowledge Hewlett Packard Ltd. for 0 1.2• 0 4, financial assistance. 0 S > REFERENCES C - l.2 A D AAf o El 5 D 13 MOS Physics and AA 0 E.H. Nicollian and J.R. Brews, U A o I.6 D Technology. Wiley, 1982. 0 I.G. McGillivray, "The Measurement of Electrical Parameters' and Trace Impurity Effects in MOS -ç • I • • I • 4 -.6 3.5 7.4 11.Z 15 Capacitors," Ph.D. Thesis, University of Edinburgh, 1987. El J.R. Brews, "Rapid interface Paramcterisation using a (b) single MOS Conductance Curve," Solid-State Electronics, vol. 26, 711-716, 1983. key: S = samples with a high Interface trap density pp. samples with uniformly-doped substrates I.G. McGillivray, J.M. Robertson, and A.J. Walton, = samples with non-uniform doping profiles "Improved Measurement of Doping Profiles in Silicon Using CV Techniques," IEEE Trans Electron Devices, vol. 35, no. 2, pp. 174-179, Feb 1988. Figure 12. Scatter-plots of the clusters In 2-D after canonical I.G. McGillivray, J.M. Robertson, and A.J. Walton, analysis of a) the {2,4,6} feature subset and b) the complete "Automatic Measurement of Minority Carrier Lifetimes set of features. in Silicon for Nonuniform Doping Samples Using the Zerbst Technique," lEE Proceedings - I: Solid State and DISCUSSION Electron Devices, vol. 134, pp. 161-167, Dec 1987. In further trials, the system occasionally suffered from M.A. Fischlcr and R.C. Bolles, "Perceptual Organisation mis-classifications primarily because of missing or unreliable and Curve Partitioning," IEEE Trans. Pattern Analysis data resulting from faulty features being extracted from the and Machine Intelligence, vol. 8, p. 101, 1986. raw data. As a step toward enhancing the reliability of the J.A. Walls, A.J. Walton, J.M. Robertson, and T.M. existing data-processing procedure, a decision-support Crawford, "Interpretation of Capacitance-Voltage Curves mechanism has been built around the basic pattern recognition for Process Fault Diagnosis: a Machine-Learning system to enable the operator to intervene, and confirm or Approach," Proceedings ICMTS 1988, pp. 174-178. Long refute feature extraction. Beach, California, 1988 This technique has been found to be a useful facility, and M. James, Classification Algorithms, Collins, 1985. was extended to help draw conclusions from other aspects of the measurement, e.g. determining acceptable voltage sweep B.F.J. Manly, Multivariaie Statistical Methods, Chapman ranges and monitoring for discontinuities occurring during the & Hall, 1986. measurement. This gives CV-ASSIST more flexibility by D.J. Hand, Discrimination and Classification, Probability assisting the user and machine with subjective questions about and Mathematical Statistics, Wiley, 1986. the nature of the curves and status of the data, with Sing-tze Bow, Pattern Recognition and Picture Processing, supporting examples of representative C-V curves and data Electrical Eng. & Electronics series, Marcel Dekker, held in a database. The operator is thus better equipped to 1984. make correct decisions even for C-V curves for which characteristics have not been previously stored. Work is now underway to incorporate the pattern- recogniser with a rule-based system to create a hybrid, flexible

We The Application of Al Techniques to the Control and Interpretation of C-V Measurements. J.A. Walls, A.J. Walton, J.M. Robertson Edinburgh Microfabrication Facility Department of Electrical Engineering The King's Buildings Mayfield Road Edinburgh, Scotland.

Abstract: Pattern-recognition and knowledge-based techniques differing MOS devices such as the MOSFET, charge-coupled- have been applied to help advance the interpretation of C-V device (CCD) and non-volatile memory cell as illustrated in curves. This has been implemented by integrating instrument figure 1. The advantages of being able to use this reduced control software with an expert system shell to intelligently structure to analyse the effects of processing on integrated sequence tests to enhance conventional CV, GV, C-t, and circuit elements of greater complexity are obvious. QSCV measurements. (I) (c) HEMMMOR MEN EMMMMEN ME

Introduction & Background Capacitance-voltage (C-V) measurements are powerful tools for the routine analysis, monitoring and control of many fabrication processes on an integrated-circuit production line. The C-V technique, when used correctly, can help identify and qualify faults in the fabrication process via analysis of a simple test structure, the metal-oxide-semiconductor (MOS) capacitor. Figure 1: The MOS capacitor (a) as the basic element of a The MOS capacitor is one of the most important devices MOSFET (b), a charge-coupled device (CCD) (C), and a for analysing an MOS process (NMOS, PMOS, CMOS, or non-volatile memory cell (d). BiCMOS) for it provides a great deal of fundamental information about the MOS structure inherent to these A C-V measurement involves ramping the voltage applied technologies. The electrical properties of the Si:Si0 2 interface across an MOS structure, and observing the change in have a profound influence on the performance and stability of capacitance. The equivalent circuit model of the MOS the devices manufactured using these production processes [1]. capacitor in the simplest case is illustrated in figure 2 alongside a cross-sectional diagram of a common test-structure. Tests on the MOS capacitor can be a good indicator of final device performance in addition to providing important Gate information about the oxide, the oxide-silicon interface, and the surface and bulk properties of the semiconductor. The oe MOS capacitor is by nature extremely sensitive to processing quality and thus a powerful analytical structure for both — 'c0 diagnostic and research purposes. The vast body of literature Silicon I [2] devoted to studies of the MOS capacitor is testament to its csi potential, but also to the drawbacks, for as a direct —U- consequence of its sensitivity, the MOS capacitor is particularly difficult to characterise. To use manufacturing parlance, the MOS capacitor is the Figure 2: Cross-sectional diagram and simplified equivalent 'first available sub-assembly' in a typical MOS process, and circuit of the MOS capacitor test structure. can thus be applied to the early detection of processing faults. Tests are possible on partly-processed wafers at many stages An alteration in the applied bias causes the depletion during fabrication, for example, the control of processes such layer in the silicon substrate to change in width, and this is as oxidation, ion-implantation, and annealing. Additionally, observed as a change in the total measured capacitance, Cm . the MOS capacitor is the basic element of a number of The measured capacitance normally varies between two extrema, C 6, and C. as shown in figure 3. The shape of One major problem with the technique lies not so much the resulting curve is well defined but only in the absence of with the measurement itself, which can be accurate and all second order processing effects and physical or electronic repeatable, but with its interpretation. The shape alone of a abnormalities. C-V curve reveals many processing defects or characteristics of the sample as shown in figure 4. The presence of some of these effects may signal that entirely different analyses are applicable in order to extract important processing parameters.

1 1 1 l ,-. I Statistical pattern recognition techniques have already been 1' developed [31 and extended [4] to distinguish between three of the most commonly occurring deviations from the 'ideal' N C-V curve. These are illustrated by the curves in figure 4.

I These deviations have their origin in three very different 0 physical processes, and as such require separate models and C I assumptions in order to extract accurate parametric ,1 information from the MOS capacitor that they characterise.

0 I a I CICox C/Cox U

-.. (a) 0.8

0.0 -3.8 -2.8 -1.8 -.8 .2 1.2 0.4 (b) 0.2 0.2 Figure 3: A typical, well-defined high-frequency C-V curve. Bias voltage Bias voltage

C/Cox C/Cox Further measurements are possible on the MOS capacitor, using the same hardware, which can reveal different aspects of (C) the device's behaviour. For example, through conductance- voltage (G-V) interface-trap determination, capacitance time 0.6 (C.t) measurements and pulsed C-V dopant profiling, the 0.4 0.4 device under test may be fully characterised [2]. These other analyses however, must be carefully controlled and performed 0.2 in an order dependent upon the results of other tests so that the correct models and assumptions can be applied at each Bias voltage Bias voltage stage. Failure to do so can give rise to misleading or ambiguous parametric data if the operator is unfamiliar with the vagaries of the technique. Figure 4: Four well-known classes of C-V curve arising from: a) Uniformly doped substrate, b) Interface trapped-charge, c) The potential usefulness of an expert system for the Threshold-adjust implant, d) Depletion (compensating) routine use of C-V as a diagnostic tool is thus the driving force implant. behind this work. This paper discusses the application of pattern recognition and knowledge-based control techniques to enhance conventional C-V measurements. Secondly, the problem of performing measurements and analyses in the correct sequence needs to be addressed. 2. Requirements for an Expert System Beyond the initial first few measurements, the best sequence in which to apply further tests is indeterminate and governed The challenges to automate C-V measurements include entirely by the particular characteristics of the device and its interpretation and scheduling, which lend themselves well to response to the applied stimuli. For example, to measure the application of artificial intelligence techniques. The key certain key parameters often necessitates an iterative series of aspects of the C-V technique have been identified as follows: tests in order to determine the optimum voltage range or sweep speed. Other analyses to determine either the doping The raw data is difficult to interpret, profile or the interface trap distribution require an estimate of the other in order that the correct analysis algorithms can be Expertise is required to reason about the results, applied. Furthermore, certain characteristics or abnormalities, such as a 'leaky' oxide, or the presence of a depletion ion- There are several and conflicting sources of implant require that much greater care be taken in the knowledge, analysis.

Diagnosis is progressive, hierarchical and data- There is clearly therefore, a requirement for an intelligent driven, controller to orchestrate the application of the measurement The underlying theory is complex and incomplete. and analysis operations. The problems outlined above would therefore seem to indicate that a completely automated system could be accomplished through the use of the following three processes: Expert System ,,::ra:ph9 • Interpretation (pattern recognition and matching) kwWMd"-b"W • Scheduling (planning the sequence of events) Iogc& k1.rsnc. opportuTs& conbc4 unc.il&nty hnq • Reasoning (ability to hypothesise and test) In addition, the system should be able to make deductions X-wln lows from the shape of the C-V curve, inform the user of any J, 4 BASIC (RMB-UX) approximations or simplifying assumptions made, and terae recognise the significant factors within a measurement so that command flies gc.ithnc the final result may be fully justified. ___ ___ Esfrwnint conb [] - ' m4meft .oftwets 3. Expert Control [.vorj Id1 A hybrid system has been developed whereby the instrumentation software is controlled by a rule-based program Figure 5: Schematic diagram illustrating the integration to facilitate opportunistic sequencing of the algorithms, whilst between the two programming environments. preserving the software investment in instrument control. The advantage of this integrated approach lies in the ability of the offers all the facilities associated with an Al computing knowledge-based system to assimilate its own raw data and environment. tailor diagnostic tests to the specific device under investigation, thereby making the best use of all the information as and when it becomes available. The assimilation function is one 4. The Edinburgh Prolog Blackboard Shell which is frequently overlooked in many other 'expert systems'. The knowledge-based system shell employed uses the A further specification, determined by cost and ease of popular blackboard architecture favoured for opportunistic use, was for the entire system to run on a single computer. control applications [11]. The particular shell used is modelled Similar 'expert control' systems have previously necessitated upon the mechanism of the HEARSAY-il speech recognition the use of two inter-linked personal computers, one running system [12]. A convenient description of the blackboard the knowledge-based system, the other the control and model is that of a panel of experts from different specialisms instrumentation software, together with a separate data mapping out a solution to the problem on a blackboard. Each acquisition unit [5]. This could hardly be considered an individual expert makes a contribution whenever possible, for optimal solution, since the two PCs are rarely able to operate the rest of the group to consider, and in time an initial in parallel, and thus remain idle for half the time. hypothesis is built up towards a full solution. The flexibility of this approach lies with the ability for each test or complete To achieve the desired level of integration, an interface measurement to act as a 'knowledge source' to contribute to a was developed to permit communication between the rule- more global picture of the capacitor's characteristics. Figure 6 based system and the instrument control software. An existing shows the knowledge sources within CV-EXPERT that are advanced test and measurement system, EDUCATES [6], was provided by the separate programs in the C-V measurement modified [7] to allow external management and control of the suite. These include heuristic knowledge about those instrumentation, analysis and graphics algorithms. Pattern- measurements, the output from the pattern recognition recognition techniques are used for the interpretive phase to algorithm, and any input from the operator. assess any similarity between known classes of C-V curve and effect the 'signal to symbol transformation'. Since the rule- based system and the C-V measurement software are interactive programs, both require separate display areas for c-v communication with the operator. For a single-screen system, measurements G-V therefore, a windowing environment was necessary. r measurementsj I measurements Within the X-windowst environment, running under HP- UX (Hewlett Packard's implementation of UNIX),t HP- BASIC [8] and the Edinburgh Prolog Blackboard Shell 19, 101 can be made to run interactively and concurrently. Co- Doping I Blackboard Quasi-static operation between the two programming environments is profu. system ' measurements effected by the transfer of data and command files, with an ] elementary protocol for timing and message handling, A schematic diagram of the overall scheme is illustrated in figure 5. Pattern I recogniser User Input This configuration was considered to offer the best Heuristics I facilities of each programming language. HP-BASIC is an accepted tool for the control of HP-lB instruments, and has recently been augmented to run under HP-UX, whilst Prolog Figure 6: The knowledge sources used within CV-EXPERT.

* Ccpywiire Mazachuzu tmb,te ci Th±noiogy t UNIX is a eadanwt ci BdI Lboatori. A rule-base has been developed which encapsulates (v) knowledge about the analytical strategy for C-V measurements Rules to FURTHER knowledge in a structured manner. The five phases in the development about the device through of a solution are illustrated in figure 7. Initially, analysis additional 3-V. C-t & QSCV tests proceeds along a pre-determined route, but after the first full C-V sweep has been performed, and an initial interpretation made, the exact direction of further analysis will depend directly upon these results. Progress is hierarchical, and hence (iv) assumptions made at a lower level may be refuted or amended Rules to integrate at a higher level, or vice versa, and so cause the diagnosis PATTERN CLASSIFIER results strategy to backtrack to a previous hypothesis and begin again. wIth PREUMINARY TESTS Only after a logically consistent state has been reached will the analysis section provide a full printed summary of the results.

The knowledge about the practical aspects of the C-V technique are represented as production rules. Each rule takes the general form:

IF condition THEN goal TO effect EST estimate

The syntax for the rules can be conveniently defined in Prolog notation where a comma (,) indicates a conjunction (OR) and a semi-colon (;) a disjunction (AND): Rules for Cf_test := X < N; EQUIPMENT SETUP X > N; (N < X, X < M). Depth of Atomic test := analysis [Index_pattern, Fact_pattern, Cf test]. Figure 7: Hierarchical strategy for interpreting C-V Test := Atomic_test; measurements not Atomic test; notnow Atomic—test; holds Precondition. Rule No. 7 if (prelim,param(gox,Gox) ,true] Condition := Test; and Gox>1000 Test and Condition; then write('warning: possible leaky oxide') Test or Condition. to add (prelim,assume(gox,leaky) ,true) These rules are modified 'IF-THEN' statements which can be and (prelim,assume(cox,unreliable) ,true) 'fired up' upon satisfaction of the condition, a goal executed, est 2 and as a consequence, entries on the blackboard added, amended or deleted. Entries intended for amendment or Rule No. 24 deletion are marked by an @. The estimate is a salience if [prelim,assume(cox,unreliable) ,true] factor representing the usefulness of the particular rule and and @[preliin,param(type,Type) ,true] can be used to control, to some extent, the order in which the then write('Cox unreliable, reducing Vacc'), rules are used. In the default scheme, this is a non-zero instruct_basic (command, "Vacc=Vacc-2") integer, and rules with the lowest salience are considered first to delete % initiates restart by the scheduler. eat 1

Examples of some of these rules are given below. These Each item in (brackets] is a blackboard entry. A blackboard rules were developed from a practical familiarity with all entry has three principal components: an index, a fact and a aspects of the measurements and their interpretation, and a confidence rating in the form [index,entry,rating]. The working knowledge of the typical problems encountered during indexing scheme is user defined, specifying a place on the routine C-V testing. blackboard for the entry, and in this application, index is used to distinguish different knowledge sources and different levels Rule No. 38 within the solution hierarchy. The fact pattern is a general if [zerbst_test,Tau0. 9] prolog term holding the information to be recorded by this and @(initial ,parameter(Pdope,pdope) ,true] entry. Rating is a term which represents a confidence in the then true certainty of the entry, and in this scheme is an integer between to delete 0 and 1. eat 5 Rule 38 can be translated to imply that should the device be determined, with at least 90% confidence, to have a low User input D,,, -700 A, n-type substrate minority carrier lifetime from a Zerbst analysis of the C-t Determined device connected, c-mete; zeroed data, then pulsed doping density and all other parameters and -find leakage at + 1 O to be high (G, = 700 mS) assumptions derived from it should be withdrawn from the Default sweep range used and type-determination gives a solution, since they are no longer valid. Rules 7 and 24 p-type substrate which disagrees with expectation provide a mechanism for ensuring the oxide capacitance is D,, measured at -1.2 A due to leakage, but C OK Pattern analysis fails to recognise shape correctly determined from a sample with a leaky oxide. Rule Fail and reset accumulation voltage to 2V; redo steps 2 & 3 24 fires off rule 7 and initiates the start of a completely new Halt, V =-2.00V, D,, =696 A, 1.71 x 1015cm measurement by removing the substrate type' parameter upon Warn user, suggest perform dielectric integrity test which all tests in the preliminary analysis depend, and reducing the voltage at which C is measured. Table 2: Sequence of events for measurements on a sample The prototype system currently comprises 52 such rules with a leaky oxide, under knowledge-based control. designed to handle the most frequently encountered discrepancies between the different measurements, and to - '.'J C= - S.-, , 92.4 _9 9 provide a means of ensuring that the maximum possible amount of reliable data is extracted from the device under test. 44 I Leakage in 0 C 5. Examples of the System in Use I 33 --jAa\c cumuaccumulation fat i on 4, A major part of the rule-base has been developed to 0 satisfy the first four stages of analysis (i to iv) as outlined in I figure 7. These rules implement the control of the analysis of S I the preliminary tests and of the first full C-V measurement. U A typical run follows the sequence of events shown in table I, but in the case of a sample exhibiting, say, an early I _ I I _31.6 I.Ø breakdown of the dielectric when biased towards 2.0 S. accumulation, the system attempts to re-measure C at a Va 1 t reduced bias or reject the sample completely depending on the severity of the failure. A routine test upon an oxidised n-type Figure 8: The effect of a leaky oxide on the shape of the wafer produced by rapid thermal processing picked up the high-frequency C-V curve. curious curve shown in figure 8. Under these circumstances a simpler deterministic C-V system may even have failed to FZZ - w - \/ . 9 2 -4 _9 9 - 1 determine the substrate type correctly since this is usually measured by taking two readings at fixed voltages to determine the direction of the central inflexion in the curve. The rule- based system however managed to avert this as shown by the I 0 events in table 2, and extract a limited set of parameters from C the resulting curve as shown in figure 9. The process fault was I later traced to an inadequate post-oxidation anneal period. 4, 0 In another example, the pattern recogniser identifies the I S measured C-V curve as being characteristic of a sample with a I high interface trap density. If in this instance the operator U had requested a measure of the trap density and distribution, the combined high and low frequency method would have C L/'"I I i I been selected as the most appropriate measurement. If the -6.3 -4.8 -2.9 -1.2 trap density had been much lower, however, the conductance -6.0 technique, although considerably slower, would have been Va 1 initiated as this is a more sensitive measure of low trap density.

C = 48576 pF -2 C = 7775 pF -2

User inputs expected D. and/or Area and substrate type C=31777pFcm 2 N,,= 1.71xl0cm 3 Determine if device connected, c-meter zeroed, leakage low Determine substrate type and optimum voltage sweep range D. = 696A R5 = 36.5 ft Estimate lifetime, measure D,,, Q. and N Perform deep depletion or equilibrium C-v as appropriate Vfb= -2.00V VT= -3.OIV Pattern-classify the resulting curve; check diagnosis corroborates with parameters in 4) Substrate = n-type Q. = 5.2x 1011 cm-2 Continue with further tests in sequence as necessary.

Figure 9: C-V curve measured, and parameters extracted Table 1: Sequence of events for a 'normal' sample. under knowledge-based control In a third example, the curve shape is indicative of a Ltd., South Queensferry. The authors would also like to non-uniform substrate doping profile and, if the operator thank T.M. Crawford of Hewlett Packard Ltd. and P. Chung requests it, the system automatically performs a C-V dopant of the Al Applications Institute at Edinburgh University for profiling routine. The measurement method employed (pulsed technical advice and assistance. or equilibrium) will depend upon the minority carrier lifetime of the silicon, and this will already have been estimated earlier in the measurement sequence. Furthermore, once the References substrate has been determined to be non-uniformly doped, the E. H. Nicollian, "Electrical properties of the Si:Si02 threshold voltage calculations can be re-evaluated in the light interlace and its influence on device performance and of this new information. If a lifetime measurement had also Vacuum Science Technology, Vol. 14, been requested, the doping profile can be used to correct the stability," Journal of pp. 1112-1121, (Sept/Oct 1977). distortion in the Zerbst plot caused by the non-uniformity. (5), E. H. Nicollian and J. R. Brews, MOS (Metal Oxide Finally, in the case of a depletion implant being identified Semiconductor) Physics and Technology, John Wiley & from the shape of the C-V curve, most of the usual approximations break down. A warning is thus issued to the Sons, New York, (1982). operator that other techniques should be used to determine J. A. Walls, A. J. Walton, J. M. Robertson, and T. M. key parameters such as C and substrate doping. Crawford, "Automating and sequencing C-V measurements for process fault diagnosis using a pattern Conclusions and Further Work recognition technique," in Proceedings ICMTS 1989, Vol. 2, pp. 193 - 199, Edinburgh U.K., (March 1989). The prototype system is able to correctly identify a number of process faults including a leaky oxide as described J. A. Walls, C-V Measurement as an Expert System, in the examples given. In this instance some useful (1990). Ph.D. Thesis, Edinburgh Microfabrication information could be obtained, and a warning issued to the Facility, Dept. Electrical Engineering, University of operator about the possible inaccuracy of some of the other Edinburgh, Scotland. parameters. Moreover, further analyses were disabled on account of the sample being inappropriate for their assumed L. Lebow and G. L. Blankenship, "A microcomputer- equivalent circuit models. The other examples illustrate the based expert controller for industrial applications," in improvements to be gained from G-V, C-t and doping profile Knowledge-Based Expert Systems for Engineering: measurements simply by recognising the important factors in a Classification, Education and Control, ed. D. Sriram, R. single C-V measurement. A. Adey, (1987). Although the current interprocess-communication scheme 1. G. McGillivray, The Measurement of Electrical is not particularly efficient, speed is not necessarily an issue Parameters and Trace Impurity Effects in MOS here, particularly as the response time of the blackboard shell Capacitors, (1987). PhD. Thesis, University of is typically of the order of a few seconds. A more elegant Edinburgh solution is, however, being investigated using UNIX/domain sockets. The ability to incorporate compiled C subroutines J. A. Wails and A. J. Walton, "EDinburgh University into the BASIC code facilitates this improvement. CApacitor TEst Software (EDUCATES)- a user guide," EMF Internal Report, (July 1989). The implications for 'expert control' of C-V measurements are an increased throughput of routine Hewlett Packard Company, Using the BASIC/UX System parametric measurements, thereby freeing the operator from on HP 9000 Series 300 Computers, Fort Collins, all but the high-level analysis of samples that do not conform Colorado, USA., (1988). (BASICIUX 5.5) any description in the current knowledge base. It is in the nature of a rule-based system to readily allow extensions to be J. Jones, M. Millington, and P. Ross, "A blackboard made as new rules become necessary, and as such, this system shell in prolog," in Proceedings of ECAF86, (1986). may be easily updated as new knowledge becomes available, be it heuristic or theoretical. Additionally it is hoped that C- R. Englemore and T. Morgan, eds., Blackboard Systems, V measurements will become more reliable as their Addison Wesley, Wokingham, UK, (1988). sophistication and complexity become manageable through the for process control," application of knowledge-based techniques. Further rules are K.-E. Al-zen, "Expert systems now being added to refine the correct use of other tests such Proceedings of the 1st International Conference on as the G-V, quasi-static C-V (QSCV) and Zerbst C-t analyses. Applications of Artificial Intelligence in Engineering Problems, Vol. 11, pp. 1127-1138, Springer-Verlag, Southampton University, (April 1986). Edited by D. Acknowledgements Sriram & R. A. Adey This research has been sponsored by the Science and L. D. Erman, F. Hayes-Roth, V. R. Lesser, and D. R. Engineering Research Council of Great Britain (grant number Reddy, "The HEARSAY II speech understanding OR/F 30499). J.A. Walls would like to acknowledge the system; integrating knowledge to resolve uncertainty," financial assistance and software support of Hewlett Packard Computing Surveys, Vol. 12, (2), pp. 213-253, (1980). BIBLIOGRAPHY

There follows a complete list of papers and books cited in this thesis. Included here also are other references considered useful as background material for the interested reader.

Polaron Etch-Back CV System, Biorad Microscience Division, Semiconductor Business Unit, Greenhill Cres., Watford, UK.

HP 4062B Semiconductor Parametric Test System, II, Hewlett Packard Co.. (Appendix on Basic Statistics and Data Manipulation BSDM)

lEE NPL Conference on Pattern Recognition (Proceedings), NPL, Middlesex, U.K., (July 1968).

IEEE Transactions on Computers, Vol. C-20, (September 1971). Special issue on feature extraction and selection in pattern recognition

Multi-frequency LCR Meters, Hewlett Packard Co., (June 1980). (Technical data)

4192A LF impedance Analyser, Hewlett Packard Co., (October 1981). (Technical data)

HP 4061A Semiconductor/Component Test System, Hewlett Packard Co., (May 1983). (Technical data)

IEEE Transactions on Pattern Analysis and Machine intelligence, Vol. PAMI-8, (January 1986). Special issue on shape analysis

High Frequency, Quasistatic and Simultaneous CV Semiconductor Measurements, Keithley Instruments ('Package 82' product note), Cleveland, Ohio, (1987).

Advanced C-V Plotting Systems, The Micromanipulator Company (product note), Carsen City, Nevada, (1987).

The First international Conference on industrial & Engineering Applications of Artificial Intelligence & Expert Systems (lEA/AlE - 88), Held at the University of Tennessee Space Institute (UTSI), Tullahoma Tennessee., (1988). (Volumes I & H).

HP 4284A Precision LCR Meter, Hewlett Packard Co., (June 1988). (Technical data)

Advanced Computerised Semiconductor Measurement System (CSM), Materials Development Corporation (product note), USA, (1989).

The Second International Conference on Industrial & Engineering Applications of Artificial Intelligence & Expert Systems (lEA/AlE - 89), Held at the University of Tennessee Space Institute (UTSI), Tullahoma Tennessee., (June 6 - 9, 1989). (Volumes I & II).

Abramovici, M. and M. A. Breuer, "Fault diagnosis based on effect-cause analysis - an introduction," Proceedings 17th Design Automation Conference, pp. 69-76, (June 1980).

Ahrenkiel, R. K., "Measurement of carrier concentration in semiconductors by capacitance- voltage," Microelectronic Manufacturing and Testing, (Part 1), pp. 13-14, (December 1988). Ahrenkiel, R. K., "Measurement of carrier concentration in semiconductors by capacitance- voltage," Microelectronic Manufacturing and Testing, (Part 2), pp. 20-21, (January 1989).

Aleksander, I. ed., Artificial Vision for Robots, Kogan Page, (1983). (Chapter 12- WISARD)

Aleksander, I. and P. Burnett, Reinventing Man, Penguin Books, Middlesex, (1984).

-292- Amble, T., Logic Programming and Knowledge Engineering, Addison-Wesley, (1987). International Computer Science Series

Ambridge, T. and M. M. Faktor, Journal of Applied Electrochemistry, Vol. 5, pp. 319-328, (1975).

Arkadev, A. G. and E. M. Braverman, Teaching Computers to Recognise Patterns, Academic Press, (1967).

Arnold, E., J. Ladell, and G. Abowitz, "Crystallographic symmetry of surface state density in thermally oxidised silicon," Applied Physics Letters, Vol. 13, pp. 413-416, (1968).

Arzen, K.-E., "Expert systems for process control," Proceedings of the 1st international Conference on Applications of Artificial Intelligence in Engineering Problems, Vol. II, pp. 1127-1138, Springer-Verlag, Southampton University, (April 1986). Edited by D. Sriram & R.A., Adey

Astrom, K. J., J. J. Anton, and K. E. Arzen, "Expert control," Automatica, Vol. 22, (3), pp. 277-286, (1986).

Attneave, F., "Some informational aspects of visual perception," Psychological Review, Vol. 61, (3), pp. 183-193, (1954). Baccarani, G. and M. Seven, "On the accuracy of the theoretical high-frequency semiconductor capacitance for inverted MOS structures," IEEE Transactions on Electron Devices, Vol. ED- 21, pp. 122-125, (Jan. 1974). Baglee, D. A., "Characteristics and reliability of 100 A oxides," Proceedings 22nd IEEE Reliability Physics Symposium, P. 152, (1984).

Bagnoli, P. E. and A. Nannini, "Effects of interfacial states on the capacitance-voltage characteristics of P-Si0 2-Si schottky diodes," Solid State Electronics, Vol. 30, (10), pp. 1005- 1012, (1987).

Bailliet, J. V., "Automatic testing provides economic leverage," Solid State Technology, pp. 88-91, (March 1970).

Balzer, R., L. D. Erman, P. London, and C. Williams, "HEARSAY ifi: a domain independent framework for expert systems," Proceedings of the First National Conference on Al, pp. 108- 110, (1980).

Barna, G., G. Carter, and J. Spencer, "Rational process optimisation and characterisation," TI Engineering Journal, pp. 18-25, (March-April 1985). Bartal, P. and L. Monostori, "A pattern-recognition based vibration monitoring module for machine tools," Robotics and Computer Integrated Manufacturing, Vol. 4, (3/4), pp. 465-469, (1988).

Bartelink, D. J. and R. E. Tremain, "Exact formulation of MOS C-V profiling," IEEE Transactions on Electron Devices, Vol. ED-27, (11), p. 1831, (1979).

Bartelink, D. J., "C-V Profiling," in Proceedings of the Microelectronics Measurement Technology Seminar, pp. 140-156, San Jose, (Feb. 6-7, 1979). Bartelink, D. J. and R. E. Tremain, Presented at 37th Annual Device Research Conference, (June 1979).

-293- Bartelink, D. J., "Surface depletion and inversion in semiconductors with arbitrary dopant profiles," Applied Physics Letters, Vol. 37, pp. 220-223, (1980). Bartelink, D. J., "Limits of applicability of the depletion approximation and its recent augmentation," Applied Physics Letters, Vol. 38, pp. 461-463, (1981).

Batchelor, B. G., Practical Approaches to Pattern Recognition, Plenum Books, London, (1974).

Berglund, C. N., "Surface states at steam-grown silicon dioxide interfaces," IEEE Transactions on Electron Devices, Vol. ED-13, (10), pp. 701-705, (1966). Berglund, C. N. and R. J. Powell, "Photoinjection in Si0 2: electron scattering in the image force potential well," Journal of Applied Physics, Vol. 42, (2), pp. 573-579, (Feb. 1971). Berman, A. and D. R. Kerr, "Inversion charge redistribution model of the high-frequency MOS capacitance," Solid State Electronics, Vol. 17, pp. 735-742, (1974).

Blacksin, J., "A Poisson C-V profiler," IEEE Transactions on Elecion Devices, Vol. ED-33, (9), pp. 1387-1389, (September 1986).

Bledsoe, W. W. and I. Browning, "Pattern recognition and reading by machine," Proceedings of the Eastern Joint Computer Conference, (1959). (reprinted in 'Pattern Recognition' by L. Uhr, Wiley, New York)

Bloomfield, B. P., "Capturing expertise by rule induction," Knowledge Engineering Review, Vol. 2, (1), pp. 55-70, Cambridge University Press, (March 1987). BoIc, L. and M. J. Coombs, eds., Expert System Applications, Springer-Verlag, (1988). Series on symbolic computation

Botos, B., "Designer's guide to RCL measurements," EDN, (June 5, 1975). (Reprinted by Hewlett Packard Co.)

Bourne, S. R., The Unix System V Environment, Addison Wesley. International Computer Science Series

Bow, Sing-Tze, Pattern Recognition- Applications to Large Data-Set Problems, Marcel Dekker, New York, (1984). (Electrical engineering and electronics series)

Bowerman, R. G. and D. E. Glover, Putting Expert Systems into Practice, Van Nostrand- Reinhold, New York, (1988).

Brews, J. R., "Correcting interface-state errors in MOS doping profile determinations," Journal of Applied Physics, Vol. 44, (7), pp. 3228-3231, (1973). Brews, J. R. and A. D. Lopez, "A test for lateral non uniformities in MOS devices using only capacitance curves," Solid State Electronics, Vol. 16, pp. 1267-1277, (Nov. 1973). Brews, J. R., "An improved high frequency MOS capacitance formula," Journal of Applied Physics, Vol. 45, pp. 1276-1279, (1974). Brews, J. R., "A simplified high frequency MOS capacitance formula," Solid State Electronics, Vol. 20, pp. 607-608, (1977).

Brews, J. R., "Rapid interface parameterization using a single MOS conductance curve," Solid State Electronics, Vol. 26, (8), pp. 711-716, (1983).

-294- Brotherton, S. D., D. R. Lamb, and J. W. Clancy, "Surface charge and annealing in the Si/Si0 2 system," International Journal of Electronics, Vol. 31, pp. 629-635, (1971).

Brown, D. M., R. J. Connery, and P. V. Gray, "Doping profiles by MOSFET deep depletion C(V)," Journal of the Electrochemical Society, Vol. 122, pp. 121-127, (1975). Brown, K., The application of knowledge-based techniques to fault diagnosis of 16 QAM digital microwave radio equipment., (1988). Ph.D. Thesis, University of Edinburgh.

Buehler, M. G., "Dopant profiles determined from enhancement-mode MOSFET DC measurements," Applied Physics letters, Vol. 31, (12), pp. 848-850, (December 1977).

Buehler, M. 0., "The DC MOSFET dopant profile method," Journal of the Electrochemical Society, Vol. 127, (3), pp. 701-704, (1980). Buehler, M. G., "Effect of the drain-source voltage on dopant profiles obtained from the DC MOSFET profile method," IEEE Transactions on Electron Devices, Vol. ED-27, (12), pp. 2273-2276, (December 1980).

Burggraaf, P. S., "C-V plotting, C-t measuring and dopant profiling: applications and equipment," Semiconductor International, pp. 29-42, (Oct. 1980).

Calligaro, R. B., "Simple method to determine the SiISi0 2 fast interface trap density change after high field stress on silicon-on-insulator substrates," lEE Proceedings :1, Vol. 134, (5), pp. 156-158, (October 1987). Castleman, K. R., "Digital Image Processing," (Signal processing series), Prentice-Hall, (1979). Chapter 16: Measurement and Classification

Catagne, R. and A. Vapaille, "Description of the Si0 2-Si interface properties by means of very low frequency MOS capacitance measurements," Surface Science, Vol. 28, p. 557, (1971).

Chan, N. and K. Johnston, "Edinburgh prolog blackboard shell users manual," AIAI/PSGm2/87, AIAI, University of Edinburgh, (September 1987).

Chandrasekaran, B. and A. Goel, "From numbers to symbols to knowledge structures; artificial intelligence perspectives on the classification task," IEEE Transactions on Systems, Man and Cybernetics, Vol. 18, (3), pp. 415-424, (May/June 1988). Chang, C. C. and W. C. Johnson, "Distribution of flatband voltages in laterally non-uniform MIS capacitors and application to a test for non-uniformities," IEEE Transactions on Electron Devices, Vol. ED-25, (12), pp. 1368-1378, (Dec. 1978). Chien, Y. T. and K.-S. Fu, "On the generalised Karhunen-Loeve expansion," IEEE Transactions on Information Theory, Vol. IT-13, pp. 518-520, (July 1967). Cioffi, J. M. and T. Kailath, "Windowed fast transversal filter adaptive algorithms with normalisation," IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-33, pp. 607-625, (1985).

Claeys, C. L., "Oxidation processes and its influence on the electrical behaviour of MOS- structures," Proceedings of the First Brazilian Workshop on Microelectronics, pp. 112 - 131, Campinas, Sao Paulo, Brazil, (July 2-20, 1979).

Clocksin, W. F. and C. S. Mellish, Programming in Prolog, Springer-Verlag, Heidelberg, W. Germany, (1987). (Third Edition)

-295- Cooper, D. J., "An expert systems approach to process identification and adaptive control," in Knowledge-Based-Expert-Systems For Engineering: Classification, Education And Control, ed. D. Sriram, R. A. Adey, Springer-Verlag, (1987).

Cowan, C. F. N. and P. M. Grant, Adaptive Filters, Prentice-Hall, Englewood Cliffs, N.J., (1985).

Cowan, C. F. N., "Performance comparisons of finite linear adaptive filters," lEE Proceedings: F, Vol. 134, (3), pp. 211-216, (June 1987).

Crawford, T. M., V. Marton, and C. F. N. Cowan, "Testing electronic systems using expert systems- a pattern recognition approach," lEE Symposium on Current Trends in the Application of Expert Systems, Strathclyde University, (2 April 1986).

Crawford, T. M. and V. Marton, "A machine learning approch to expert systems for fault diagnosis, in communications equipment," IEEE Computer Aided Engineering Journal, Vol. 4, pp. 31-38, (February 1987).

Crawford, T. M., V. Marton, and C. F. N. Cowan, "Adaptive pattern recognition applied to an expert system for fault diagnosis in telecommunications equipment," Colloquium on Adaptive Fillers, I.E.E. Savoy Place, London, (date?). (Digest No. 1958/86)

Daku, B. L. F., P. M. Grant, C. F. N. Cowan, and J. Hallam, "Intelligent techniques for spectral estimation," Journal of JERE, Vol. 58, (6), pp. 275-283, (September-December 1988).

Daniel, C. and F. S. Wood, Fitting Equations to Data: Computer Analysis of Multifactor Data, Wiley, New York, (1980). 2nd edition.

Davies, 0. L. and P. L. Goldsmith, eds., Statistical Methods in Research and Production, Oliver & Boyd, Edinburgh, (1972).

Davis, R. and J. King, "An overview of production systems," in Machine intelligence, Vol. 8, ed. E.W. Elcock & D. Michie, pp. 30-332, Wiley, (1977).

Deal, B. E., A. S. Grove, E. H. Snow, and C. T. Sah, "Observation of impurity re-distribution during thermal oxidation of silicon using the MOS structure," Journal of the Electrochemical Society, Vol. 112, pp. 308-314, (1965). Deal, B. E., M. Sklar, A. S. Grove, and E. H. Snow, "Characteristics of the surface-state charge (Q) of thermally oxidised silicon," Journal of the Electrochemical Society: Solid State Science, Vol. 114, (3), pp. 266-273, (March 1967). Deal, B. E., E. L. Makenna, and P. L. Castro, "Characteristics of fast surface states associated with Si02-Si and Si3N4 —Si02-Si Structures," Journal of the Electrochemical Society, Vol. 116, p. 917, (1969).

Deal, B. E., "Properties and measurements of charges associated with the thermally oxidised silicon system," Proceedings of the Microelectronics Measurement Technology Seminar, pp. 39- 70, San Jose, (February 6-7, 1979).

Deal, B. E., "Standardised terminology for oxide charges associated with thermally oxidised silicon," Journal of the Electrochemical Society, Vol. 127, (4), pp. 979-981, (1980).

Deming, R. 0. and W. A. Keenan, "Low dose ion implant monitoring," Solid State Technology, Vol. 28, (9), pp. 163-167, (September 1985).

-2%- Devijver, P. A. and J. Kittler, Pattern Recognition- a Statistical Approach, Prentice Hall, London, (1982).

DiMaria, D. J., D. R. Young, R. F. DeKeersmaecker, W. R. Hunter, and C. M. Serrano, "Centroid location of implanted ions in the Si0 2 layer of MOS structures using the photo I-V technique," Journal of Applied Physics, Vol. 49, (11), pp. 5441-5444, (Nov. 1978).

Dolins, S. B., A. Srivastava, and B. E. Flinchbaugh, "Monitoring and diagnosis of plasma etch processes," IEEE Transactions on Semiconductor Manufacturing, Vol. 1, (1), pp. 23-27, (February 1988).

Doyle, J., "A truth maintenance system," Journal of Artificial Intelligence, Vol. 12, pp. 231-272, (1979).

Duda, R. 0. and P. E. Hart, Pattern Classification and Scene Analysis, Wiley Interscience, (1973).

Englemore, R. and T. Morgan, eds., Blackboard Systems, Addison Wesley, Wokingham, UK, (1988).

Erman, L. D., F. Hayes-Roth, V. R. Lesser, and D. R. Reddy, "The HEARSAY II speech understanding system; integrating knowledge to resolve uncertainty," Computing Surveys, Vol. 12, (2), pp. 213-253, (1980). Erman, L. D., P. E. London, and S. P. Fickas, "The design and example use of HEARSAY ifi," Proceedings of the 7th International Conference on Al, pp. 409-415, (1981).

Ernst, G. and A. Newell, GPS: a Case Study in Generality and Problem Solving, Academic Press, New York, (1969).

Fang, R. C. Y., "Threshold voltage shift of the p-channel transistors by boron implantation and the C-V characteristics of the corresponding MOS structures," Solid State Electronics, Vol. 26, (1), pp. 25-32, (1983).

Fischler, M. A. and R. C. Bolles, "Perceptual organisation and curve partitioning," IEEE Transactions Pattern Analysis And Machine Intelligence, Vol. PAMI 8, (1), (January 1986).

Fisher, R. A., "The use of multiple measurements in taxonomic problems," Ann. Eugenics, Vol. 7, pp. 179-188, (1936).

Forbes, L. and B. Rastegar, "A desktop-computer based calculation of high frequency MOS C-V curves," IEEE Transactions on Electron Devices, Vol. ED-34, (2), pp. 427-431, (Feb. 1987).

Forsyth, R. ed., Expert Systems - Principles and Case Studies, Chapman & Hall Computing, (1984).

Forsyth, R. and R. Rada, Machine Learning - Applications in Expert Systems and Information Retrieval, (1986). Ellis Horwood Series in AT

Forsyth, R., "The trouble with Al," Artificial Intelligence Review, (2), pp. 67-77, (1988).

Fortino, A. G. and J. S. Nadan, "An efficient numerical method for the small-signal AC analysis of MOS capacitors," IEEE Transactions on Electron Devices, Vol. ED 24, (9), pp. 1137-1146, (September 1977).

Fu, K. S., Syntactic Methods in Pattern Recognition, Mathematics in Science and Engineering (Vol. 112), Academic Press, New York, (1974).

-297- Garrett, C. G. B. and W. H. Brattain, "Physical theory of semiconductor surfaces," Physics Review, Vol. 99, p. 376, (1955).

Gdula, R. A., "The effects of processing on the hot-electron trapping in Si0 2 ," Journal of the Electrochemical Society, Vol. 123, (1), pp. 42-46, (Jan. 1976).

Gero, J. S. ed., Artificial Intelligence in Engineering: Diagnosis and Learning, Elsevier (Computational Mechanics Publications), Southampton, (1988).

Glorioso, R. M. and F. C. Colon-Osorio, Engineering Intelligent Systems (Concepts, Theory and Applications), Digital Press, (1980).

Goetzberger, A., "Ideal MOS curves for silicon," Bell System Technical Journal, pp. 1097-1122, (September 1966).

Gordon, B. J., "A microprocessor-based semiconductor measurement system," Solid State Technology, Vol. 21, p. 43, (July 1978). Gordon, B. J., "On-line capacitance-Voltage doping profile measurement of low-dose ion implants," IEEE Transactions on Electron Devices, Vol. ED-27, (12), pp. 2268-2272, (December 1980).

Grasselli, A., Automatic Interpretation and Classification of Images, Academic Press, London, (1969).

Gray, J. L. and M. S. Lundstrom, "Numerical solution of Poisson's equation with application to C-V analysis of Ell-V heterojunction capacitors," IEEE Transactions on Computer Aided Design, Vol. CAD4, (4), pp. 546-553, (Oct. 1985).

Grosvalet, J. and C. Jund, "Influence of illumination on MIS capacitance in the strong inversion region," IEEE Transactions on Electron Devices, Vol. ED-14, pp. 777-780, (1967). Grove, A. S., B. E. Deal, E. H. Snow, and C. T. Sah, "Investigation of thermally oxidised silicon surfaces using metal-oxide-semiconductor structures," Solid State Electronics, Vol. 8, pp. 145-163, (1965).

Grove, A. S., Physics and Technology of Semiconductor Devices, Wiley, New York, (1967). 2nd edition.

Haddra, H. S. and M. El-Sayed, "Conductance technique in MOSFETS: a study of interface trap properties in the depletion and weak inversion regimes," Solid State Electronics, Vol. 31, (8), pp. 1289-1298, (1988).

Hamasaki, M., "Radiation effects on thin oxide MOS capacitors caused by electron beam evaporation of aluminium," Solid State Electronics, Vol. 26, (4), pp. 299-303, (1983).

Hand, D. J., Discrimination and Classification, Wiley, Chichester (UK), (1981).

Hawkins, J. K., "Self-organising systems - a review and commentary," Proceedings IRE, pp. 31-48, (January 1961).

Hayes-Roth, B., "Blackboard architecture for control," Journal of Artificial Intelligence, Vol. 26, pp. 251-321.

Hegemann, V. and J. Fitch, "Statistical strategies for optimising processes," TI Technical Journal (Engineering Technology), (Part 1), pp. 92-97, (January-February 1987).

-298- Hegemann, V. and J. Fitch, "Statistical strategies for optimising processes," Ti Technical Journal (Engineering Technology), (Part 2), PP. 69-76, (March-April 1987).

Heilman, M. E., "The nearest-neighbour classification rule with the reject option," IEEE Transactions on Systems, Science And Cybernetics, Vol. SSC-6, pp. 179-185, (1970).

Hewlett Packard Company, Using the BASIC 5.015.1 System on HP 9000 Series 2001300 Computers, Fort Collins, Colorado, USA., (1987). (workstation BASIC)

Hewlett Packard Company, Using the BASIC/UX System on HP 9000 Series 300 Computers, Fort Collins, Colorado, USA., (1988). (BASICIUX 5.5)

Hildebrand, F. B., Introduction to Numerical Analysis, McGraw - Hill, New York, (1973).

Hilibrand, J. and R. D. Gold, "Determination of impurity distribution in junction diodes from capacitance-voltage measurements," RCA Review, Vol. 21, (245), (1960).

Hinton, G., "Learning in parallel networks," Byte Magazine, Vol. 10, (4), (1985).

Hofstein, S. R., "Minority carrier lifetime determination from inversion layer transient response," IEEE Transactions on Electron Devices, Vol. ED-14, pp. 785-786, (1968). (special issue on MIS structures)

Hogley, J. R. and A. R. Korncoff, "Artificial intelligence in engineering: a revolutionary change," Proceedings of the 1st International Conference on Applications of Artificial Intelligence in Engineering Problems, Vol. II, Pp. 1155-1160, Springer-Verlag, Southampton University, (April 1986). Edited by D. Sriram, R. Adey

Holden, T., Knowledge-Based CAD and Micro-Electronics, North-Holland, (1987).

Honig, M. and D. G. Messerschmitt, Adaptive Filters: Structures, Algorithms and Applications, Kluwer Academic Publishers, Dordrecht, Holland, (1984).

Hori, T., H. Iwasaki, and K. Tsuji, "Electrical and physical properties of ultrathin reoxidised nitrided oxides prepared by rapid thermal processing," IEEE Transactions on Electron Devices, Vol. ED-36, (2), pp. 340-350, (February 1989). Huang, C. T, "High frequency C-V plot as a process control tool," Proceedings of the Microelectronics Measurement Technology Seminar, San Jose California, pp. 71-84, Benmill Pub. Corp., (February 6-7, 1979).

Huff, H. R. and F. Shimura, "Silicon material criteria for VLSI electronics," Solid State Technology, Vol. 28, Pp. 103-118, (March 1985). Hwu, Jenn-Gwo, Way-Seen Wang, and Yun-Leei Chiou, "Relationship between mobile charges and interface states in silicon MOS capacitors," Journal of the Chinese Institute of Engineers, Vol. 9, (2), pp. 171-178, (1986). Iniewski, K. and A. Jakubowski, "A new method for the determination of channel depth and doping profile in buried-channel MOS transistors," Solid State Electronics, Vol. 31, (8), pp. 1259-1264, (1988).

Jackson, P., Introduction to Expert Systems, Addison-Wesley, Wokingh am, England, (1986).

Jaegar, R., F. H. Gaensslen, and S. E. Diehl, "An efficient numerical algorithm for simulation of MOS capacitance," IEEE Transactions on Computer Aided Design, Vol. CAD-2, (2), pp. 111- 116, (April 1983).

- 299 - Jakubowski, A. and K. Iniewski, "Simple formulas for analysis of C-V characteristics of MIS capacitor," Solid State Electronics, Vol. 26, (8), pp. 755-756, (1983).

James, M., Basic Artificial intelligence.

James, M., Classification Algorithms, Collins, London, (1985).

Jantsch, 0. and B. Borchert, "Determination of interface state density especially at the band edges by noise measurements on MOSFETS," Solid State Electronics, Vol. 30, (10), pp. 1013-1015, (1987).

Jindal, R. P., "A normalised analytical solution for the capacitance associated with uniformly doped semiconductors at equilibrium," Solid State Electronics, Vol. 26, (10), pp. 1005-1008, (1983).

Johnson, W. C. and P. T. Panousis, IEEE Transactions on Electron Devices, Vol. ED-IS, pp. 965-973, (1971). Jones, J. and M. Millington, "An Edinburgh prolog blackboard shell," D.A.I. Research Report #264, Department of Artificial Intelligence, University of Edinburgh.

Jones, J., M. Millington, and P. Ross, "A blackboard shell in prolog," in Proceedings of ECAI'86, (1986).

Jourdain, M., G. Solace, C. Petit, M. Favre, and J. Despujols, "Effect of high field stress on interface states of NMOS capacitors," Solid State Electronics, Vol. 26, (4), pp. 251-257, (1983).

Kaempf, U., "Automated parametric testers to monitor the integrated circuit process," Solid State Technology, pp. 81-87, (September 1981).

Kang, J. S. and D. K. Shroder, "The pulsed MIS capacitor," Physica Status Solidi, Vol. (a) 89, (13), pp. 13-43, (1985). (review article)

Kar, S., "Determination of Si-Metal work function differences by MOS capacitance technique," Solid State Electronics, Vol. 18, pp. 169-181, (1975).

Keller, W. W., "The rapid measurement of generation lifetime in MOS capacitors with long relaxation times," IEEE Transactions on Electron Devices, Vol. ED-34, (5), pp. 1141-1146, (May 1987).

Kennedy, D. P. and R. R. O'Brien, "On the measurement of impurity atoms by the differential capacitance technique," IBM Journal of Research, Vol. 13, pp. 212-214, (1969).

Kern, D. P., "Sub-0.1p.m silicon MOSFETs," Proceedings 19th European Solid State Device Research Conference (ESSDERC 89), pp. 633-642, Springer-Verlag, Berlin, (September 11-14, 1989).

Kern, W. and D. A. Poutinen, "Cleaning solution based on hydrogen-peroxide for use in silicon semiconductor technology," RCA Review, Vol. 187, pp. 187-206, (June 1970).

Kidd, A. L., Knowledge Acquisition for Expert Systems, a Practical Handbook, Plenum Press, New York, (1987).

Kilby, J. S., "Invention of the integrated circuits," IEEE Transactions on Electron Devices, Vol. ED-23, p. 648, (1976).

-300- Kip, A. F., Fundamentals of Electricity and Magnetism, McGraw-Hill, Tokyo, (1969). 2nd edition.

Kittel, C., Introduction to Solid-State Physics, John Wiley & Sons, New York, (1976). 5th edition.

Kuhn, M., "A quasi-static technique for MOS CV and surface states measurements," Solid-State Electronics, Vol. 13, p. 873, (1970).

Lachenbruch, P. A. and M. R. Mickey, "Estimation of error rates in discriminant analysis," Technometrics, Vol. 10, pp. 1-11, (1968).

Lachenbruch, P. A., Discriminant Analysis, I-iafner Press, New York, (1975).

Lebow, L. and G. L. Blankenship, "A microcomputer-based expert controller for industrial applications," in Knowledge-Based Expert Systems for Engineering: Classification, Education and Control, ed. D. Sriram, R. A. Adey, (1987).

Lederman, A., "Vacuum operated mercury probe for C-V plotting and profiling," Solid State Technology, pp. 123-126, (August 1981).

Lehovec, K., "Rapid evaluation of C-V plots for MOS structures," Solid State Electronics, Vol. 11, pp. 135-137, (1968).

Leistiko, 0., A. S. Grove, and C. T. Sah, "Electron and hole mobilities in inversion layers on thermally oxidised silicon surfaces," IEEE Transactions on Electron Devices, Vol. ED-12, pp. 248-254, (May 1965).

Lendaris, G. G. and G. L. Stanley, "Diffraction-pattern sampling for automatic pattern recognition," Proceedings of the IEEE, Vol. 58, (2), pp. 198-216, (Feb. 1970).

Levine, M. D., "Feature extraction- a survey," Proceedings of the IEEE, Vol. 57, (8), pp. 1391- 1407, (Aug. 1969).

Ligenza, J., "Effect of crystal orientation on oxidation rates in high pressure steam," Journal of Physical Chemistry, Vol. 65, p. 2011, (1961).

Lin, Shi-Tron and J. Reuter, "The complete doping profile using MOS C-V technique," Solid State Electronics, Vol. 26, (4), pp. 343-351, (1983).

Lin, Shi-Tron, "Simultaneous determination of minority-carrier lifetime and deep-doping profile using a double-sweep MOS C(V) technique," IEEE Transactions Electron Devices, Vol. ED- 30, (1), pp. 60-63, (January 1983).

Lindner, R., "Semiconductor surface varactor," Bell System Technical Journal, Vol. 41, pp. 803- 831, (1962).

Lippmann, R. P., "An introduction to computing with neural nets," IEEE Acoustics, Speech And Signal Processing Magazine, pp. 4-21, (April 1987).

MacConaill, A. and A. J. Walton, "CAM - A strategy to improve the quality & efficiency of processing," in Yield Modelling and Defect Tolerance in VLSI, ed. Moore, Maly & Strojwas, Adam Hilger, Bristol, (1988).

MacPherson, M. R., "The adjstment of MOS transistor threshold voltage by ion implantation," Applied Physics Letters, Vol. 18, pp. 502-504, (1971).

Mahalanobis, P. C., "On the generalised distance in statistics," Proceedings Of The National Institute Of Science (India), Vol. 122, pp. 49-55, (1936).

-301- Maher, M. L., Expert Systems for Civil Engineers: Technology and Applications, A.S.C.E., New York, (1987).

Manchanda, L., J. Vasi, and A. B. Baattacharyya, "The nature of intrinsic hole traps in thermal silicon dioxide," Journal of Applied Physics, Vol. 52, (7), pp. 4690-4696, (July 1981).

Manly, B. F. J., Multivariate Statistical Methods- a Primer, Chapman & Hall, USA, (1986).

Marshak, A. H. and R. Shrivastava, "On the threshold and flatband voltages for MOS structures with polysilicon gate and non-uniformly doped substrate," Solid State Electronics, Vol. 26, (4), pp. 361-364, (1983).

Martins, G. R., "Al: the technology that wasn't," Defence Electronics, pp. 56-59, (December 1986).

McGillivray, I. G., J. M. Robertson, and A. J. Walton, "CV characteristics of MOS capacitors with two top capacitive contacts," Microelectronics Journal, Vol. 18, (5), pp. 25-27. McGillivray, I. G., "A Computer Controlled CV Measurement and Analysis System," EMFfHughes Report, (October 1984).

McGillivray, I. G., "A Comparison of the Quality of Oxide and Oxide-Nitride Gate Dielectrics," EMF/Hughes Report, (October 1984).

McGillivray, I. G., A. J. Walton, and J. M. Robertson, "Post-metallisation annealing of aluminium-silicon," Electronics Letters, Vol. 21, pp. 973-974, (1985). McGillivray, I. G., The Measurement of Electrical Parameters and Trace Impurity Effects in MOS Capacitors, (1987). PhD. Thesis, University of Edinburgh

McGillivray, I. G., J. M. Robertson, and A. J. Walton, "Automatic measurement of minority carrier lifetimes in silicon for nonuniform doping samples using the Zerbst technique," IEEE Proceedings - I: Solid State and Electron Devices, Vol. 134, pp. 161-167, (December 1987). McGillivray, I. G., J. M. Robertson, and A. J. Walton, "Improved measurement of doping profiles in silicon using C-V techniques," IEEE Transactions on Electron Devices, Vol. ED-35, (2), pp. 174-179, (February 1988).

McGillivray, I. 0., J. M. Robertson, and A. J. Walton, Determination of C min from the high frequency C-V characteristic of high lifetime MOS capacitors., (1989). (Presented, to be published)

McMillan, L., "MOS C-V techniques for IC process control," Solid State Technology, pp. 47-52, (September 1972).

McNutt, M. J. and C. T. Sah, "Determination of the MOS oxide capacitance," Journal of Applied Physics, Vol. 46, (9), pp. 3909-3913, (September 1975).

McNutt, M. J. and C. T. Sah, "Exact capacitance of a lossless MOS capacitor," Solid State Electronics, Vol. 19, pp. 255-257, (1976). Mego, T., "Using simultaneous feedback-charge quasistatic and high frequency CV for better CV data in half the time," Microelectronic Manufacturing and Testing, pp. 19-20, (July 1988).

-302- Mego, T. J., "Improved feedback charge method for quasi-Static C-V measurements in semiconductors," Review of Scientific instruments, Vol. 57, (11), pp. 2798-2805, (November 1986).

Meindi, J. D., "Ultra-large scale integration," IEEE Transactions on Electron Devices, Vol. ED-31, (11), p. 1555, (Nov. 1984).

Meisel, W. S., "Computer-Oriented Approaches to Pattern Recognition," Vol. 83 in series: Mathematics in Science and Engineering, Academic Press, London, (1974). (2nd printing)

Michie, D. and R. Chambers, "Boxes: an experiment in adaptive control," in Machine Intelligence (2), ed. Dale & Michie, Oliver & Boyd, (1968).

Miller, G. L., "A feedback method of investigating carrier distributions in semiconductors," IEEE Transactions on Electron Devices, Vol. ED-19, (10), pp. 1103-1108, (Oct. 1972).

Minsky, M. and 0. G. Selfridge, "Learning in neural nets," Proceedings of the Fourth London Symposium on Information Theory, Academic Press, New York, (1961).

Minsky, M. L. and S. A. Papert, Perceptrons (An Introduction to Computational Geometry), MIT Press, (1988). (Second, revised edition)

Mirzai, A. R. ed., Artificial Intelligence: Concepts and Applications in Engineering, Kogan Page, (1989).

Mirzai, A. R., C. F. N. Cowan, and T. M. Crawford, "Intelligent alignment of waveguide filters using a machine learning approach," IEEE Transactions on Microwave Theory and Techniques, Vol. MTT-37, (1), pp. 166-173, (Jan 1989).

Moline, R. A., "Ion-implant phosphorous in silicon: profiles using C-V analysis," Journal of Applied Physics, Vol. 42, (9), pp. 3553-3558, (Aug. 1971).

Moll, J. L., Institute of Radio Engineers Wescon Convention Record, Vol. 3, p. 32, (1959).

Monostori, L., "New trends in machine tool monitoring and diagnosis," Robotics and Computer- Integrated Manufacturing, Vol. 4, (3/4), pp. 457-464, (1988).

Montillo, F. and P. Balk, "High temperature annealing of oxidised silicon surfaces," Journal of the Electrochemical Society, Vol. 118, (9), pp. 1463-1468, (1971).

Moore, R. L., "Expert control," in Proceedings of the American Control Conference, pp. 885-887, Boston, (1985).

Morris, J. W., Computational Methods in Elementary Numerical Analysis, Wiley, New York, (1983).

Mucciardi, A. N. and E. E. Gose, "A comparison of seven techniques for choosing subsets of pattern-recognition properties," IEEE Transactions on Computers, Vol. C-20, (9), pp. 1023- 1031, (September 1971).

Muls, P. A., G. J. Declerck, and R. J. van0verstraeten, "Influence of interface charge inhomogeneities on the measurement of surface state densities in Si-Si0 2 interface by means of the MOS a.c. conductance technique," Solid State Electronics, Vol. 20, pp. 911-922, (1977).

Nash, J. C., Compact Numerical Methods For Computers, Adam Hilger, Bristol, (1979).

-303- Naylor, C., Build Your Own Expert System, Sigma Press, Wilmslow, England, (1987). 2nd edition.

Nicollian, E. H. and A. Goetzberger, "Lateral ac current flow model for metal-insulator- semiconductor capacitors," IEEE Transactions on Electron Devices, Vol. ED-12, pp. 108-117, (1965).

Nicollian, E. H. and A. Goetzberger, "MOS conductance technique for measuring surface state parameters," Applied Physics Letters, Vol. 7, (8), pp. 216-219, (15 October 1965). Nicollian, E. H. and A. Goetzberger, "The Si-Si0 2 interface- electrical properties as determined by the metal-insulator-silicon conductance technique," Bell System Technical Journal, Vol. 46, (1), pp. 1055-1133, (1967).

Nicollian, E H., A. Goetzberger, and A. D. Lopez, "Expedient method of obtaining interface state properties from MIS conductance measurements," Solid State Electronics, Vol. 12, pp. 937-944, (1969).

Nicollian, E. H. and C. N. Berglund, "Avalanche injection of electrons into insulating Si0 2 using MOS structures," Journal of Applied Physics, Vol. 41, p. 3052, (June 1970). Nicollian, E. H., "Electrical properties of the Si:Si0 2 interface and its influence on device performance and stability," Journal of Vacuum Science Technology, Vol. 14, (5), pp. 1112- 1121, (Sept/Oct 1977).

Nicollian, E. H. and J. R. Brews, MOS (Metal Oxide Semiconductor) Physics and Technology, John Wiley & Sons, New York, (1982).

Nicollian, E. H., "Adyances in instrumentation for C-V measurements," Solid State Technology, pp. 107-110, (August 1989).

Nihira, H., M. Konaka, H. Iwai, and Y. Nishi, "Anomalous drain current in n-MOSFETs and its suppression by deep ion-implantation," IEDM Technical Digest, p. 487, (1978).

Nii, H. P. and E. A. Feigenbaum, "Rule-based understanding of signals," in Pattern Directed Inference Systems, ed. D.A. Waterman & F. Hayes-Roth, pp. 483-501, Academic Press, New York, (1978).

Nii, H. P. and J. J. Anton, "Signal-to-symbol transformation: HASP/SlAP case study," Al Magazine, pp. 23-35, (Spring 1982).

Nii, H. P., "Blackboard systems: The blackboard model of problem solving and the evolution of blackboard architectures," The A! Magazine, pp. 38-53, (1986). (Summer 1986)

Nilsson, N. J., Learning Machines- foundations of trainable pattern-classifying systems, McGraw- Hill, New York, (1965).

Nilsson, N. J., Principles of Artificial Intelligence, Springer-Verlag, New York, (1982).

Ning, T. H., C. M. Osburn, and H. N. Yu, "Threshold instability in IGFETs due to emission of leakage electrons from silicon substrate into silicon dioxide," Applied Physics Letters, Vol. 29, (3), pp. 198-200, (August 1976). Ning, T. H., "Hot electron emission from silicon into silicon dioxide," Solid State Electronics, Vol. 21, pp. 273-282, (1978). Ogura, S., P. J. Tsang, W. W. Walker, P. L. Critchlow, and J. F. Shepard, "Design and characterisation of the lightly doped drain-source (LDD) insulated gate field-effect transistor," IEEE Transactions on Electron Devices, Vol. ED-27, p. 1359, (1980).

Ohzone, T., H. Shimura, K. Tsuji, and T. Hirao, "Silicon-gate n-well CMOS process by full ion-implantation technology," IEEE Transactions on Electron Devices, Vol. ED-27, pp. 1789- 1795, (1980).

Ong, D. G., Modern MOS Technology, McGraw - Hill, (1984).

Oxborrow, E., Databases & Database Systems: Concepts and Issues, Chartwell Bratt, (1986).

Patrick, E. A. and J. M. Fattu, Artificial intelligence with Statistical Pattern Recognition, Prentice- Hall, New Jersey, (1986).

Pavlidis, T. and S. Horowitz, "Segmentation of plane curves," IEEE Transactions on Computers, Vol. C-23, pp. 860-870, (August 1974). Pavlidis, T., "A Review of Algorithms for Shape Analysis," Computer Graphics and Image Processing, Vol. 7, pp. 243-258, (1978). Pavlidis, T. and F. Au, "A hierarchical syntactic shape analyser," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAM!-!, pp. 2-9, (1979).

Pavlidis, T., "Editorial: papers on shape analysis," IEEE Transactions on Pattern Analysis and Machine intelligence, Vol. PAMI-8, (1), (January 1986).

Pearl, 3., Heuristics, Addison-Wesley, (1983).

Pfann, W. G. and C. G. Garratt, Proceedings of the Institute of Radio Engineers, Vol. 41, p. 2011, (1959).

Pham, D. T. ed., Expert Systems in Engineering, Artificial Intelligence in Industry series, IFS, Bedford, UK, (1988).

Pierret, R. F. and G. W. Neudeck, eds., Modular Series in Solid State Devices (Volume 4), Addison - Wesley, (1983).

Powell, R. J. and G. F. Derbenwick, "Vacuum ultraviolet radiation effects in Si02 ," IEEE Transactions on Nuclear Science, Vol. NS-18, (6), p. 99, (1971). Press, W. H., B. P. Flannery, W. T. Vetterling, and S. A. Teukoisky, Numerical Recipes, Cambridge University Press, Cambridge, (1985).

Rabbani, K. S. and D. R. Lamb, "A quick method for the determination of bulk generation lifetime in semiconductors from pulsed MOS measurements," Solid State Electronics, Vol. 24, pp. 661-664, (1981).

Reddy, D. R., L. D. Erman, R. D. Fennel, and R. B. Neely, "The Hearsay speech understanding system: an example of the recognition process," Proceedings of the 3rd international Joint Conference on Artificial Intelligence, pp. 185-193, Stanford, California, (1973).

Rehrig, D. L. and C. W. Pearce, "Production mercury probe capacitance-voltage testing," Semiconductor International, pp. 151-162, (May 1980).

-30S- Ricco, B., P. Olivo, T. N. Nguyen, T.-S. Kuan, and G. Ferriani, "Oxide thickness determination in thin-insulator MOS structures," IEEE Transactions on Electron Devices, Vol. ED-35, (4), pp. 432-438, (April 1988).

Rochkind, M. J., Advanced Unix Programming, Prentice Hall Software Series, (1985).

Rosenblatt, F., "The Perceptron: a probabalistic model for information storage and organisation in the brain," Psychological Review, Vol. 65, (6), pp. 386-408, (1958).

Rosenfeld, A., "Figure extraction," in Automatic interpretation and Classification of Images, ed. A. Grasselli, pp. 137-154, Academic Press, London, (1969). (Chapter 6)

Rosser, P. J. and P. B. Moynagh, "Comparison of the interfacial stress resistance in rapid thermally processed thin dielectrics.," ESSDERC 87, pp. 833-836, (1987).

Roy, P. K. and A. K. Sinha, "Sythesis of high-quality ultra-thin gate oxides for ULSI applications," AT&T Technical Journal, pp. 155-174, (Nov/Dec 1988).

Rumeihart, D. E. and J. L. McClelland, Parallel Distributed Processing: explorations in the microstructure of cognition, Volume 1: Foundations, MIT, USA, (1986).

Ruska, W. S., Microelectronic Processing - An introduction to the Manufacture of integrated Circuits, McGraw - Hill, (1987).

Sah, C. T., A. B. Tole, and R. F. Pierret, "Error analysis of surface state density determination using the MOS capacitance method," Solid-State Electronics, Vol. 12, pp. 681 & 689 - 709, (1969).

Sah, C. T., R. F. Pierret, and A. B. Tole, "Exact analytical solution of high frequency lossless MOS capacitance-voltage characteristics and validity of charge analysis," Solid-State Electronics, Vol. 12, pp. 681 - 688, (1969). Saminadayar, K., "Evolution of surface-state density of Si/ wet thermal Si0 2 interface during bias- temperature treatment," Solid-State Electronics, Vol. 20, (11), pp. 891 - 897, (November 1977).

Samuel, A., "Some studies in machine learning using the game of checkers," IBM Journal of Research and Development, Vol. 3, (1959).

Sands, D., K. M. Brunson, and C. B. Thomas, "A modified charge-voltage technique for interface-state measurements in metal-oxide-semiconductor capacitors," Semiconductor Science Technology, Vol. 3, pp. 477-482, (1988). Sarrabayrouse, 0., F. Campabadal, and J. L. Prom, "Oxide thickness determination from C-V measurement in an MOS capacitor," lEE Proceedings - G, Vol. 136, (4), pp. 215-216, (August 1989).

Sauers, R., "Controlling expert systems," in Expert System Applications, ed. Bolc & Coombs, Springer-Verlag Symbolic Computation Series, (1988).

Schlegel, E. S., "A bibliography of metal-insulator-semiconductor studies," IEEE Transactions on Electron Devices, Vol. ED-14, (11), pp. 728 - 749, (November 1967). Schroder, D. K. and H. C. Nathanson, "On the separation of bulk and surface components of lifetime using the pulsed MOS capacitor," Solid-State Electronics, Vol. 13, pp. 577 - 582, (1970).

-306- Schroder, D. K. and J. Guldberg, "Interpretation of surface and bulk effects using the pulsed MIS capacitor," Solid-State Electronics, pp. 1285 - 1297, (1971).

Selfridge, 0., "Pandemonium: a paradigm for learning," NPL Symposium on Mechanisation of Thought Processes, HMSO, London, (1959). (reprinted in 'Pattern Recognition' by L. Uhr, Wiley, New York)

Selfridge, 0. G., "Pattern recognition and modern computers," Proceedings Western Joint Computer Conference, pp. 91-93, (March 1955).

Serack, J. A., Edge Effects in Silicon IGFETS, (1988). Ph.D. Thesis, Edinburgh Microfabrication Facility, Dept. Electrical Engineering, University of Edinburgh

Severin, P. J. and G. J. Poodt, "Capacitance-voltage measurements with a mercury-silicon diode," Journal of the Electrochemical Society: Solid-State Science and Technology, Vol. 119, (10), (October 1972).

Shirley, G. C., "A computer-controlled CV characterisation system," Semiconductor international, pp. 87 - 97, (July 1982).

Singer, P. I-i., "Analysing parametric test structures," Semiconductor International, pp. 74 - 79, (June 1987).

Sinha, A. K., "MOS (Si-gate) compatability of RF diode and triode sputtering processes," Journal of the Electrochemical Society, Vol. 123, (1), pp. 65 - 71, (January 1976).

Smith, P., Expert System Development in Prolog and Turbo-Prolog, Sigma Press, Wilmslow (UK), (1988).

Snow, E. H., A. S. Grove, B. E. Deal, and C. T. Sah, "Ion transport phenomena in insulating films," Journal of Applied Physics, Vol. 36, pp. 1664 - 1673, (1965).

Sriram, D. and R. A. Adey, eds., Applications of Artificial Intelligence in Engineering Problems, Volumes 1 & 2, Springer-Verlag, Southampton, U.K., (April 1986). 1st International Conference

Sriram, D. and R. A. Adey, eds., Knowledge Based Expert Systems for Engineering: Classification, Education and Control, Computational Mechanics Publications, (1987). (Papers selected from the 2nd International Conference on Applications of Artificial Intelligence in Engineering Problems, Cambridge, Massachusets)

Stevenson, J.T.M. and A.M. Gundlach, "The application of photolithography to the fabrication of microcircuits," Journal of Physics, Vol. 19, (Part E: Scientific Instrumentation), pp. 654- 667, (1986).

Sze, S. M., Physics of Semiconductor Devices, John Wiley & Sons, New York, (1981). 2nd edition.

Sze, S. M. ed., VLSI Technology, McGraw-Hill, Singapore, (1983).

Sze, S. M., Semiconducting Devices- Physics and Technology, Wiley, New York, (1985).

Tasch, A. F., P. K. Chattergee, H. S. Fu, and T. C. Holloway, "The Hi-C RAM cell concept," IEEE Transactions on Electron Devices, Vol. ED-25, (33), (1978).

Terman, L. M., "An investigation of surface states at a silicon/silicon oxide interface employing metal-oxide-silicon diodes," Solid-State Electronics, Vol. 5, pp. 285 - 299, (1962).

-307- Therrien, C. W., Decision Estimation and Classification: An Introduction to Pattern Recognition and Related Topics, John Wiley & Sons, New York, (1989).

Thornburgh, A. L., "Organising multiple expert systems: a blackboard-based executive application," in Knowledge-Based Expert Systems for Engineering, Classification, Education and Control, ed. D. Sriram & R. A. Adey, pp. 145 - 155, Computational Mechanics Publications, (1987).

Tiwari, P., B. Bhaumik, and J. Vasi, "Rapid measurement of lifetime using a ramped MOS capacitor transient," Solid-State Electronics, Vol. 26, (7), pp. 695 - 698, (1983).

Tong, Qui-Yi and M. K. Andrews, "PMOS design for VLSI CMOS by a C-V technique," 5th Australian & Pacific Region Microelectronics Conference, Adelaide, (13 - 15 May 1986).

Troutman, R. R., "Ion-implanted threshold tailoring for insulated gate field-effect transistors," IEEE Transactions on Electron Devices, Vol. ED-24, (3), pp. 182 - 192, (1977).

Tsividis, Y. P., Operation and Modelling of the MOS Transistor, McGraw-Hill, New York, (1987).

Turing, A., "Computing machinery and intelligence," Mind, Vol. 59, pp. 236 - 250.

Turner, R., Logics for Artificial Intelligence, Ellis Horwood (series in artificial intelligence), (1984).

Ullmann, J. R., Pattern Recognition Techniques, Butterworths, (1973). van Gelder, W. and E. H. Nicollian, "Silicon impurity distribution as revealed by pulsed MOS C- V measurements," Journal of the Electrochemical Society: Solid State Science, Vol. 118, (1), pp. 138 - 141, (January 1971).

Virdi, G. S., S. Singh, B. C. Pathak, and W. S. Khokle, "Adjustment of threshold voltage of MOS devices by ion-implantation," Microelectronics Journal, Vol. 19, (1), pp. 19 - 33. Wallick, P., "Software 'doctor' prescribes remedies," IEEE Spectrum, pp. 43 - 46, (October 1986).

Walls, J. A., A. J. Walton, and J. M. Robertson, "Application of Al Techniques to the Control and Interpretation of C-V Measurements," To be presented at ICMTS 1990, San Diego, CA..

Walls, J. A., Interconnect technology, (May 1986). BSc Honours Dissertation #HD390 (Edinburgh University Electrical Engineering Department)

Walls, J. A., A. J. Walton, J. M. Robertson, and T. M. Crawford, "Interpretation of C-V curves for process fault diagnosis: a machine-learning expert systems approach," in Proceedings JCMTS 1988, pp. 174 - 178, Long Beach, CA., (February 1988). Walls, J. A. and A. J. Walton, "EDinburgh University CApacitor TEst Software (EDUCATES)- a user guide," EMF Internal Report, (July 1989).

Walls, J. A., A. J. Walton, J. M. Robertson, and T. M. Crawford, "Automating and sequencing C-V measurements for process fault diagnosis using a pattern recognition technique," in Proceedings ICMTS 1989, Vol. 2, pp. 193 - 199, Edinburgh U.K., (March 1989).

Waterman, D. A. and F. Hayes-Roth, Pattern-Directed Inference Systems, Academic Press, New York, (1978). (selected papers from the workshop on pattern-directed inference systems, Honolulu, Hawaii, 1977)

Watnabe, S., Methodologies of Pattern Recognition, Academic Press, New York, (1969).

-308- Wendt, H., A. Spitzer, W. Bensch, and K. V. Sichart, "The effect of rapid thermal annealing on the quality thin thermal oxides," ESSDERC 1989, pp. 345-348, Berlin, (1989).

Wiener, N., Extrapolation, Interpolation and Smoothing of Stationary Time Series, Wiley, New York, (1949).

Wilson, C. L., "Correction of differential capacitance profiles for debye-length effects," IEEE Transactions on Electron Devices, Vol. ED-27, (12), pp. 2262 - 2267, (December 1980). Winstanley, G., in Program Design for Knowledge-Based Systems, Sigma Press, Wilmslow (UK), (1987).

Winston, P. H., Artificial Intelligence, Computer Science Series, Addison Wesley, Philippines, (1977).

Wood, D. R. and T. Ambridge, "Electrical assessment of semiconductor material for electronic device applications," British Telecom Technology Journal, Vol. 4, (2), pp. 69 - 84, (April 1986).

Woods, M. H. and R. Williams, "Hole traps in silicon dioxide," Journal of Applied Physics, Vol. 47, (3), pp. 1082 - 1089, (March 1976). Wu, C. P., E. C. Douglas, and C. W. Mueller, "Limitations of the CV technique for ion- implanted profiles," IEEE Transactions on Electron Devices, Vol. ED-22, (6), pp. 319-328, (June 1975).

Yin, G. Z. and D. W. Jillie, "Orthogonal design for process optimisation and its application in plasma etching," Solid-State Technology, pp. 127 - 132, (May 1987).

Yon, E., W. H. Ko, and A. B. Kuper, "Sodium distribution in thermal oxide on silicon by radiochemical and MOS analysis," IEEE Transactions on Electron Devices, Vol. ED-13, (12), pp. 276 - 280, (February 1966).

Zaininger, K. H., "Automatic display of MIS capacitance vs. bias characteristics," RCA Review, pp. 341 - 359, (September 1966).

Zaininger, K. H. and F. P. Heiman, "The C-V technique as an analytical tool," Solid-State Technology, (Part 2), pp. 46 - 55, (June 1970). Zaininger, K. H. and F. P. Heiman, "The C-V technique as an analytical tool," Solid-State Technology, (Part 1), pp. 49 - 56, (May 1970). Zerbst, M., "Relaxationseffekte An Halbleiter-Isolator-Grenzflache," Z. Angew Phys., Vol. 22, (1), pp. 30 - 33, (1966). (in German) Ziegler, K., E. Klausmann, and S. Kar, "Determination of the semiconductor doping profile right up to the surface using the MIS capacitor," Solid State Electronics, Vol. 18, pp. 189 - 193, (1975).

Ziegler, K. and E. Klausman, "Static technique for precise measurement of surface potential & interface state density in MOS structures," Applied Physics Letters, Vol. 26, (7), pp. 400 - 402, (April 1975).

-309-