Information Security

David Weston
Birkbeck, University of London
Autumn Term 2012

Chapter 1 - Motivation Introduction

Protecting information against malicious or accidental access plays an important role in information-based economies/societies

Just to name a few application areas: Banking: online banking, PIN protocols, digital cash

Economy: mobile phones, DVD players, Pay-per-View TV, computer games

Military: IFF (Identification, friend or foe), secure communication channels, weapon system codes

It’s surprising how much still goes wrong in these areas Typical Cases of Security Lapses

Loss of confidential data: 2007: HMRC loses (unencrypted) disks containing personal details of 25 million people

2008: HSBC loses disks containing details of 180,000 policy holders (fined for a total of £3.2 million)

2007: hard disk containing records of 3 million candidates for driver’s licenses goes missing

not just happening in the UK: Sunrise (Swiss ISP) exposes account names and passwords of users in 2000 Typical Cases of Security Lapses (2)

Credit card fraud is a recurring theme, ranges from spying out PINs at ATMs to

organized stealing and trading of credit card numbers

Recent high profile case: In the U.S. Albert Gonzalez and other hackers infiltrated Heartland and Hannaford (two firms processing payments)

They stole millions of credit card numbers between 2006 and 2008

This has cost Heartland $12.6 million so far Typical Cases of Security Lapses (3)

Hacking into other systems: 2008: in the U.S. 18-year old student hacks into high school computer, changes grades

2005: UCSB (University of California Santa Barbara) student hacks into eGrades system and changes grades

Web site defacement also seems to happen quite regularly (with targets including the U.N., Microsoft, and Google) Typical Cases of Security Lapses (4)

Denial-of-Service attacks: 2009: Twitter is hit by a denial-of-service attack and brought to a standstill

Natural disasters (the cause need not be malicious): Data loss through fire, storm, flooding

2005: Hurricane Katrina takes out two data centers of an aerospace company in the U.S.; unfortunately, they backed each other up Objective

This list could go on and on . . .

However, even this small number of examples already shows the importance of this subject

Objective of this module is to teach basic principles of information security, enabling you to identify and assess security risks

take appropriate countermeasures Chapter 2 - Introduction Basic Definitions

How do we define information security?

Information security is about ensuring the following: Confidentiality

Integrity

Availability Confidentiality

Confidentiality is (the principle of) restricting access to information

Only people who are authorized should be able to access information

Example for loss of confidentiality: losing disks with sensitive data Integrity

Integrity is about preventing improper or unauthorized change of data

Only trustworthy data is of value

Example for loss of integrity: student hacking into university computer and changing grades Availability

Availability is about making sure that information is accessible when needed (by authorized persons)

Usually this implies keeping systems that store the information (and restrict access) operational

Example for loss of availability: system taken out by a disaster Additional Aspects

Sometimes the following aspects are added as separate issues: Authentication: confirming the identity of an entity

Non-repudiation: an entity is not able to refute an earlier action Threats, Attacks, and Vulnerabilities

A threat is a potential danger to an (information) asset, i.e. a violation of security might occur

An attack is an action that actually leads to a violation of security

A vulnerability is a weakness that makes an attack possible Threats, Attacks, and Vulnerabilities (2)

An example: a server storing source code for a new product of a software development company Threat: an unauthorized person (e.g. from a competitor) could access the source code

Attack: someone (unauthorized) actually logs into the system and downloads (parts of) the source code

Vulnerability: an unremoved test account with a default password Security Controls

Security controls are mechanisms to protect information (or a system) against unauthorized access (ensuring confidentiality)

unauthorized modification (ensuring integrity)

destruction/denial-of-service (ensuring availability)

Controls are also called countermeasures or safeguards

General types of controls: Physical

Technical

Administrative Physical Controls

Physical Controls include the following: locks

security guards

badges/swipe cards/scanners

alarms

fire extinguishers

backup power

... Technical Controls

Technical controls include the following: access control software/passwords

antivirus software

backup software/systems

... Administrative Controls

Administrative controls include the following: staff training

clear responsibilities

policies and procedures

contingency plans

... Controls Covered in This Course

In this course we are going to focus on technical controls

administrative controls

Although physical controls should be kept in mind, we don’t have enough time to cover everything

The other two controls are more interesting from the point of view of computer science and management Controls Covered in This Course (2)

Why not just look at the technical issues?

Security is only as strong as the weakest link

Very often, people are the weakest link Overview

In the first part of this module, we’re going to cover the administrative side

Second part looks at the technical side

More fine-grained overview of what’s to come: Administrative Issues

Risk Assessment

Security Policies

Social Engineering

Cryptography

Cryptographic Protocols

Selected Topics Chapter 3 Administrative Issues Organizational Environment

No matter how small the organization, there has to be someone responsible for information security

Usually, this person is called the security officer or SO (or some other acronym)

In small- or medium-sized organizations this could be an added responsibility (not a full portfolio)

In large organizations, SO might be supported by a team Role of Security Officer

Defines a corporate security policy

Defines system-specific security policies

Creates project plan for implementing controls

Responsible for user awareness and training programs

Appoints auditors

Optional: obtains formal accreditation Place in Hierarchy of SO

Due to the importance of information security, the SO should report to the highest level of control (board of directors, CEO)

SO may even be a member of the board

SO acts as an intermediary between management and the user base Place in Hierarchy of SO (2)

Example for a large organization: Management Support

In order for SO to be successful, they need support from high-level management

Convincing (top-level) management can be difficult Ignorance of real nature of risks

Fixated on bottom line (security costs money, no clear benefits)

Fear of having to address unknown risks and take responsibility Convincing People

Introducing security measures will always require investments in time, human resources, and money

Not doing so, however, can have dire consequences: HSBC having to pay a fine of £3.2 million

Heartland losing at least $12.6 million

Health services and software companies in the U.S. have been sued for millions or even billions of dollars

Bloghoster JournalSpace had to close shop after losing all content data

First important step: know your exposure to risk Chapter 4 Risk Assessment What is It About?

Risk assessment helps in understanding: What is at risk? (Identifying assets)

How much is at risk? (Identifying values)

Where does the risk come from? (Identifying threats and vulnerabilities)

How can the risk be reduced? (Identifying countermeasures)

Is it cost effective? (Risk can never be completely eliminated) Traditional Method

Single Loss Expectancy (SLE): measures the expected impact (in monetary terms) of a certain threat occurring: SLE = Asset Value (AV) × Exposure Factor (EF)

Asset Value is the total value of the asset under threat

Exposure Factor measures the proportion of the asset that is lost when threat becomes real Range is between 0.0 and 1.0

1.0 meaning the complete asset is lost Traditional Method (2)

Usually this is looked at on an annualized basis

Annualized Rate of Occurrence (ARO) represents the expected frequency of a threat occurring per year

This gives us the Annualized Loss Expectancy (ALE): ALE = SLE × ARO
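
As a minimal illustration in Python (the asset value, exposure factor, and ARO below are made-up figures, not taken from the module), the two formulas translate directly into code:

```python
# Sketch: Single Loss Expectancy and Annualized Loss Expectancy
def single_loss_expectancy(asset_value, exposure_factor):
    # SLE = AV x EF
    return asset_value * exposure_factor

def annualized_loss_expectancy(asset_value, exposure_factor, aro):
    # ALE = SLE x ARO
    return single_loss_expectancy(asset_value, exposure_factor) * aro

# Illustrative only: a £10,000 asset, fully lost, once every 5 years on average
print(annualized_loss_expectancy(10_000, 1.0, 1 / 5))   # 2000.0
```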

The tricky part is figuring out all these values (more on this later) Simple Example

We have an asset worth £150,000, so AV = £150,000

If threat occurs, we lose two thirds of it, so EF = 2/3

We expect this threat to occur once every 20 years (on average), so ARO = 0.05

This gives us: ALE = £150,000 × 2/3 × 0.05 = £5000 Example from Banking

An ALE analysis for a bank’s computer systems might have hundreds of different entries, e.g.:

Loss Type            Asset Value     ARO     ALE
SWIFT fraud          £30,000,000     0.005   £150,000
ATM fraud (large)    £200,000        0.2     £40,000
ATM fraud (small)    £15,000         0.5     £7,500
Teller takes cash    £2,160          200     £432,000

Here we assume that EF = 1.0, i.e. all the money is lost

Accurate figures are probably available for common losses, much harder to get these numbers for uncommon, high-risk losses Putting Controls in Place

Introducing controls/countermeasures has an effect on the rate of occurrence and/or the exposure factor

So we have ALE_before and ALE_after

We reduce risk by: ALE_before − ALE_after

However, controls also have an annual cost: AC_control

So, the overall benefit is: (ALE_before − ALE_after) − AC_control

Risk and Information Assets

What does this all mean for risk assessment in the context of information security?

We have to analyze different aspects: Value of the information/system assets

Possible threats to information/systems

Vulnerabilities of information/systems

Cost of countermeasures How to Measure Risk

Risk is a function of threats, vulnerabilities, and assets: How to Measure Risk (2)

With the help of controls, the risk can be reduced: How to Measure Risk (3)

Example of reducing threat: Do a background check on employees

Example of reducing vulnerability: Keep system updated, i.e. install patches

Example of reducing asset (value): Encrypt data: even if it is accessed, it’s worthless Overall Assessment Process Asset Valuation

Basically two different kinds of assets: Tangible assets (hardware, buildings, etc.)

Intangible assets (software, information)

Valuing tangible assets and part of the intangible assets (operating systems and application software) is not too hard They have a price tag

Cost to replace them is known

What about information assets? Information Asset Valuation

One way to put a value on information assets is to ask knowledgeable staff

Delphi method is used to do this in a systematic way Delphi Method

Experts (i.e. knowledgeable staff) give answers to questionnaires for several rounds

After each round a facilitator summarizes answers (and reasoning) given by experts

This summary is given to the experts (usually done anonymously)

New round is started in which experts may revise their answers Approach by Zareer Pavri

Use a cost-based and/or income-generating approach to arrive at concrete numbers

Cost approach: try to put a fair market value on the information assets

Income approach: try to determine income stream generated by products/services associated with information assets

However, even Zareer Pavri admits that valuation of information assets is sometimes more of an art than a science Example for Gillette

In 1998 Gillette had a total value of invested capital of $58.53 billion This is based on a market price of $49 per share

Working Capital was $2.850 billion

Fixed/other assets were at $5.131 billion

Intangible assets (with a price tag) $5.854 billion

We are missing $44.7 billion
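
Spelling out the arithmetic behind that figure: $58.53bn − ($2.850bn + $5.131bn + $5.854bn) = $44.695bn ≈ $44.7bn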

At least part of this has to be the value of the information assets Threat Analysis

Let’s move to the next point: threat analysis

Reminder: a threat is a potential danger to an asset

Generally there are two different types: Accidental

Intentional

Many different sources are possible: Natural: floods, fires, earthquakes, . . .

Human: network-based attacks, malicious software upload, incorrect data entry, . . .

Environmental: power failure, leakage, . . . Threat Analysis (2)

During threat analysis, the analyst must decide which threats to consider

This can be a formidable task: There’s always the danger of overlooking something

It’s impossible to protect yourself against every possible threat

Fortunately, many other people have done this before An analyst does not have to start from scratch Threat Analysis (3)

Lots of different sources are available: Government: CESG (Communications-Electronics Security Group), a branch within the GCHQ (Government Communications Headquarters) www.cesg.gov.uk

Industry: vendors of risk assessment tools, we’ll look at one in a moment: CRAMM www.cramm.com

Hire consultants: e.g. CMU’s Computer Emergency Response Team (CERT) www.cert.org

Other organizations: SANS (Sysadmin, Audit, Network, Security) Institute www.sans.org

There’s also a newsgroup called comp.risks Vulnerability Analysis

Next task is about identifying vulnerabilities that can be exploited

Vulnerabilities allow threats to occur (more often) or have a greater impact

Some argue that vulnerability analysis should be done before threat analysis If there is no vulnerability, there is no threat

However, it’s not possible to eliminate each and every vulnerability Vulnerability Analysis (2)

Starting a vulnerability analysis from scratch would also be very tedious

Fortunately, this has also been done before The sources mentioned before also have material on vulnerability analysis

One more source should be mentioned here: National Vulnerability Database (NVD) provided by the National Institute of Standards & Technology (NIST): nvd.nist.gov

At the time of writing this slide, there were 37424 identified vulnerabilities documented in the database NVD Example NVD Example (2) Putting It All Together

To assess risks, threats have to be connected to vulnerabilities, which have to be mapped to assets: Risk Assessment

We’ve covered asset valuation and threat & vulnerability analysis, next step is actual risk assessment

Risk modeling aims at giving well-informed answers to the following questions: What could happen? (threats/vulnerabilities)

How bad would it be? (impact)

How often might it occur? (frequency/probability)

How certain are answers to the question above? (uncertainty) Risk Modeling

We’ve covered the first question (threat and vulnerability analysis)

Second question (impact) was covered to a certain extent (by putting values on assets)

Uncertainty could be modeled by bounded distributions: “I’m 80% certain that it will cost between £170K and £200K to reconstruct the content of this file.”

Where do we get the other numbers from?

This is not a trivial task Risk Modeling (2)

Results of an FBI/CSI (Computer Security Institute) survey in 2002: 80% of investigated companies reported financial losses due to security breaches, only 44% could quantify losses

Only 30% of UK businesses had ever evaluated return-on-investment for information security spending Risk Modeling (3)

People assessing risks have to rely on some form of statistical data This data can come from internal sources (if an organization keeps records)

employees, customers, clients

other organizations such as NIST

ASIS International (an organization for security professionals): www.asisonline.org

government, e.g. law enforcement

tools for risk assessment

On top of that you need statistical methods to evaluate the data Quantitative vs. Qualitative Assessment

Trying to put a number on everything is called quantitative risk assessment

Computing the ALE is a classic form of quantitative risk assessment

Prerequisites of this approach: Reliable data has to be available

Appropriate tools are available

The person doing the assessment knows what they are doing (and is trustworthy)

If this is not the case, quantitative assessment can lead to a false sense of security Especially if the numbers are “massaged” Qualitative Assessment

An alternative to quantitative assessment is qualitative risk assessment

Rather than using concrete numbers, ranking is used, e.g. describing a threat level as high, medium, or low

Asset values may also be described in a similar way, e.g. high, medium, or low importance Qualitative Assessment (2)

Analysis is described with the help of a matrix: threat level (low, medium, high) along one axis, asset value (low, medium, high) along the other

The darker the field (i.e. the higher the combination of threat level and asset value), the higher the risk

Depending on security goals, certain risk levels have to be addressed (by introducing controls) Qualitative Assessment (3)

Qualitative assessment is easier to understand by lay persons

However, it has the following problems: Results are essentially subjective

Results of cost/benefit analysis might be hard to express Methodology

There are pros and cons for both the quantitative and qualitative approaches

Most important issue: A knowledgeable person is doing the assessment

This is done in a systematic way

The concrete methodology becomes important when you are looking for an accreditation

Most relevant in the UK is BS 7799 (the international version of this standard is ISO 27001/27002) Hybrid Approaches

Quantitative and qualitative risk assessment are rarely used in their pure form (in the context of information security)

Increasing the granularity of a qualitative approach (by adding more levels) will push it into the direction of quantitative assessment

Very often a combination of both, a hybrid approach, is used CRAMM

Let’s complete the discussion on risk assessment by looking at a concrete tool: CRAMM

CRAMM stands for CCTA Risk Analysis and Management Method CCTA in turn stands for Central Computer and Telecommunications Agency

CRAMM is ISO 27001 compliant and, e.g., used by NATO, BAE Systems Electronics, and the NHS

If you’re selling to the UK government, chances are high that you have to use this method/tool CRAMM (2)

CRAMM follows the following overall procedure (from top to bottom): Screenshot: Threat/Vulnerability Analysis How Does It Work?

CRAMM uses a risk matrix

However, this matrix is a bit more sophisticated than the one on a previous slide: 7 levels for measuring risk: from 1 (low) to 7 (high)

10 levels for asset values: from 1 (low) to 10 (high)

5 threat levels: very low, low, medium, high, very high

3 vulnerability levels: low, medium, high How Does It Work? (2)

At the core of CRAMM is the following matrix: How Does It Work? (3)

Basically, CRAMM is a qualitative form of assessment

However, the different levels can be roughly mapped to concrete values (making it somewhat hybrid)

Example for mapping levels of risk to ALE: Selecting Countermeasures

Depending on the type of threat

type of asset

desired security level a countermeasure can be selected

CRAMM contains a library of about 3500 generic controls Screenshot: Countermeasures Testing It Out

After countermeasures have been put in place, everything can and should be tested in practice

Organizations often do penetration studies They hire a team of professional security experts who (are authorized to) try to violate the security

The goal is to find gaps in the implemented security Flaw Hypothesis Methodology

A framework for conducting these studies is: 1 Information gathering: testers try to become as familiar with system as possible (in their role as external or internal attackers)

2 Flaw hypothesis: drawing on knowledge from step 1 and known vulnerabilities, testers hypothesize flaws

3 Flaw testing: testers try to exploit possible flaws identified in step 2 If a flaw does not exist, go back to step 2

If a flaw exists, go to the next step

4 Flaw generalization: testers try to find other similar flaws, iterate test again (starting with step 2)

5 Flaw elimination: testers suggest ways of eliminating flaw Example of a Study

The Internal Revenue Service (IRS) in the U.S. did a penetration study in 2001: Testers were able to identify some but not all IRS Internet gateways

Testers then tried to get through the firewalls of the identified gateways (but were not successful)

However, they gained enough information to circumvent firewalls

Testers, posing as Help Desk employees, called 100 IRS employees asking for assistance

71 of these employees agreed to temporarily change their password to help resolve a technical problem

Together with a phone number (also obtained by the testers) this would have enabled them to get access Debate on Studies

There is an ongoing debate on the validity of penetration studies Some vendors use it as an argument to prove the security of their product

Critics claim that these studies have no validity at all (can only show presence of flaws, not the absence)

The answer probably lies in the middle: Penetration studies are not a substitute for thoroughly designing and implementing a security solution

However, they are valuable in a final “testing” stage Conclusion

Risk assessment is an important first step in determining whether to protect an asset and to which level

the threats to protect against and their likelihood of occurring

Environments change, so (parts of) the assessment might have to be redone later

Most important thing is to create awareness and to start a risk assessment process Chapter 5 Security Policies Introduction

So you’ve succeeded as SO in convincing people that taking care of information security

and assessing the risks is a good idea

What’s next? Implementing everything and putting things in practice Introduction (2)

Implementing information security is not just about scrambling like mad to install the newest patches

get firewalls up and running

install virus scanners

It’s also about putting sensible practices and procedures in place We’re still talking about administrative/management side, we’ll look at technical issues later Security Policies

This is where the concept of a security policy comes into play

It is a (set of) document(s) in which an organization’s philosophy

strategy

practices with regard to confidentiality, integrity, and availability of information and information systems are laid out Security Policies (2)

What can policies do that technology alone can’t? Tie in upper management: technological controls are the responsibility of an IT administrator/manager, policies are those of upper management

If an organization has legal obligations to fulfill, it can provide a paper trail

Define a strategy with regard to security: what to protect and to which degree (rather than how)

Serve as a guide to information security for employees Useless Example (1)

MegaCorp Ltd security policy: 1 This policy is approved by Management.

2 All staff shall obey this security policy.

3 Data shall be available only to those with a “need-to-know”.

4 All breaches of this policy shall be reported at once to Security. Useless Example (2)

17,346,125 pages of bureaucratic and technical details

These two examples probably explain findings from a study undertaken by Cisco in 2008: 1 23% of IT professionals work for an organization not having a security policy

2 47% of employees and 77% of IT professionals think their organization’s security policy could be improved

3 On average 40% of employees and 20% of IT professionals were unaware that their organization had a security policy Defining a Security Policy

First step: find out about your risks (risk assessment)

Policies can be seen as a risk-control mechanism (a procedural one rather than a technical one)

Goal of a control mechanism is to get risk down to an acceptable level

Policy should cover risks identified during risk assessment

Before going into concrete examples, we’ll define some important concepts of security policies Formal Models

An important aspect of security policies is the classification and access control model: Distinguishes between various groups of people, systems, and information in terms of security

Determines who is authorized to access which information on which system

Access Control usually comes in two different flavors: Discretionary Access Control (DAC)

Mandatory Access Control (MAC) Discretionary vs. Mandatory

Discretionary Access Control (DAC) Each data object is owned by a user and user can decide freely which other users are allowed to access data object

Many operating systems follow this line

Mandatory Access Control (MAC) Access is controlled by a system-wide policy with no say by the users

Military systems often use this kind of control

We’ll now look at some concrete models Access Control Matrix

One of the earliest models to emerge was the Access Control Matrix

An Access Control Matrix is used to describe the protection state of information or a system

More formally, there are objects o that need to be protected objects can be files, devices, processes

subjects s who want to access the objects subjects can be users, other processes

rights r(s, o) that specify which type of access subject s is permitted on object o access rights can include read, write, execute, own Access Control Matrix (2)

This is organized in a matrix: subjects are in the rows, objects in the columns, and the rights in the cells

      o1           o2           o3          ...
s1    r(s1, o1)    r(s1, o2)    r(s1, o3)   ...
s2    r(s2, o1)    r(s2, o2)    r(s2, o3)   ...
s3    r(s3, o1)    r(s3, o2)    r(s3, o3)   ...

(each cell r(si, oj) is the set of rights subject si has on object oj)

Example

Here’s a bookkeeping example with different users and objects:

                      Operating    Accounting     Accounting    Audit
                      System       Application    Data          Trail
Alice (sysadmin)      rwx          rwx            r             r
Bob (manager)         rx           x              -             -
Charlie (auditor)     rx           r              r             r
Acc. Application      rx           r              rw            w

(r: read access, w: write access, x: execution permission, -: no rights)

Problems with Access Control Matrices

If applied in a straightforward fashion, this matrix doesn’t scale very well Example: bank with 50,000 staff and 300 applications ⇒ 15,000,000 entries

On top of that, most entries are empty: storing a sparsely populated matrix explicitly wastes a lot of space

Solution: only store entries that actually exist, can be done in two different ways: by row, store privileges with user (capability lists)

by column, store privileges with objects (access control lists) Capability Lists

Storing the example matrix by rows would yield the following capability lists:

Alice: Op. Sys.: rwx, Acc. App.: rwx, Acc. Data: r, Audit: r
Bob: Op. Sys.: rx, Acc. App.: x
...

Makes it difficult to check who has privileges for certain files: one would have to look at every list

Access Control Lists

Storing the example matrix column-wise would result in the following access control lists:

Op. Sys.: Alice: rwx, Bob: rx, Charlie: rx, Acc. App.: rx
Acc. App.: Alice: rwx, Bob: x, Charlie: r, Acc. App.: r
Acc. Data: Alice: r, Charlie: r, Acc. App.: rw
...

Makes it easier to check who can access which file (and, if necessary, revoke the rights)
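
A minimal Python sketch of the same idea, using the bookkeeping matrix from the earlier slide (object names abbreviated), shows how the one matrix can be stored by rows (capability lists) or by columns (access control lists):

```python
# Sketch: the bookkeeping access control matrix as a dictionary of dictionaries;
# empty cells ("-") are simply omitted
matrix = {
    "Alice":   {"OpSys": "rwx", "AccApp": "rwx", "AccData": "r",  "Audit": "r"},
    "Bob":     {"OpSys": "rx",  "AccApp": "x"},
    "Charlie": {"OpSys": "rx",  "AccApp": "r",   "AccData": "r",  "Audit": "r"},
    "AccApp":  {"OpSys": "rx",  "AccApp": "r",   "AccData": "rw", "Audit": "w"},
}

def capability_list(subject):
    # By row: the privileges stored with the user
    return matrix.get(subject, {})

def access_control_list(obj):
    # By column: the privileges stored with the object
    return {s: rights[obj] for s, rights in matrix.items() if obj in rights}

print(capability_list("Bob"))          # {'OpSys': 'rx', 'AccApp': 'x'}
print(access_control_list("AccData"))  # {'Alice': 'r', 'Charlie': 'r', 'AccApp': 'rw'}
```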

However, lists may still be too long and it may be too tedious to manage them

One solution is to organize users into groups Access Control Lists (2)

Grouping users is an approach employed by many operating systems

For example, in UNIX/Linux systems you have the following partitioning: self: owner of a file

group: a group of users sharing common access

other: everyone else

Example: -rwxr-xr-x alice users accounts
(self/owner: rwx, group: r-x, other: r-x)

alice is the name of the owner, users is the name of a group, accounts is the name of a file

Access Control Lists (3)

However, ownership of a file is still under control of a single user

This user can pass on access permissions as they see fit Assigning users to groups is usually done by a system administrator, though

This might not always be appropriate for (larger) organizations

There is a model called role-based access control, which we will look at after introducing some other models Summary Access Control Matrix

An Access Control Matrix specifies permissions on an abstract level

limits the damage that certain subjects can cause

in the form described above is a snapshot view (dynamic behavior is not described) Bell-La Padula Model

In the Bell-La Padula (BLP) model, every document (or information object) has a security classification

The more sensitive the information, the higher the classification level

Examples for typical levels are: top secret (ts), secret (s), confidential (c), and unclassified (uc)

Every user of the system has a clearance (level)

Classification and clearance levels are not decided by the users, some certified entity has to do this

To be able to access a document, a user must have at least the level of the document

The concept of this model originated in the military

Let’s define this more formally

Assume that you have a data object o

a subject s

Let c(o) be the classification of o and c(s) be the clearance of s

BLP enforces the following two properties: Simple security property: s may have read access to o only if c(o) ≤ c(s)

* (star) property: s, who has read access to o, may have write access to another object p only if c(o) ≤ c(p) Bell-La Padula Model (3)

First rule is quite straightforward: no-one may access information that is classified higher than their clearance

Second rule seems counterintuitive at first glance Intended to prevent a “write-down” effect

“write-down”: more highly classified information is passed to a lower classified object
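
A minimal sketch of the two BLP properties in Python (the level names follow the earlier slide; mapping them to numbers is only done so they can be compared):

```python
# Sketch: Bell-La Padula checks; a higher number means a more sensitive level
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}

def may_read(subject_clearance, object_classification):
    # Simple security property: read o only if c(o) <= c(s)
    return LEVELS[object_classification] <= LEVELS[subject_clearance]

def may_write(read_classification, target_classification):
    # *-property: having read o, write to p only if c(o) <= c(p)
    return LEVELS[read_classification] <= LEVELS[target_classification]

print(may_read("secret", "confidential"))    # True
print(may_write("secret", "confidential"))   # False: this would be a write-down
```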

BLP can be extended by categories Categories

Categories (or compartments) describes types of information

Each data object may be assigned to different categories (multiple categories are possible)

A subject should only have access to information that they need to do their job

In addition to a clearance level, a subject is also assigned a set of categories (which they are allowed to access) Categories (2)

Let’s assume we have the categories NUC, EUR, US

All possible subsets of these categories form a lattice with regard to the subset relation:

A person with access to { NUC, US } may look at data objects categorized as { NUC, US }, { NUC }, { US }, or ∅ (uncategorized), subject to clearance level Summary BLP
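
A minimal sketch of this combined check (a subject dominates an object if its clearance level is at least as high and its category set is a superset; the concrete levels and sets below are illustrative):

```python
# Sketch: BLP read check extended with categories (the dominance relation)
def dominates(subj_level, subj_cats, obj_level, obj_cats):
    # <= on sets is the subset test
    return obj_level <= subj_level and obj_cats <= subj_cats

# Subject: clearance level 2 with categories { NUC, US }
print(dominates(2, {"NUC", "US"}, 1, {"NUC"}))          # True
print(dominates(2, {"NUC", "US"}, 1, {"NUC", "EUR"}))   # False: EUR not held
```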

In essence BLP enforces confidentiality

Can be explained by its military origin

Does not handle data integrity (or only to a very limited extent)

We’ll have a look at some other models now (which also ensure integrity) Clark-Wilson Integrity Model

In a commercial environment, preventing unauthorized modification of data is at least as important as preventing disclosure

Examples: Preventing fraud (or mistakes) in accounting

Inventory control: Still works if information is disclosed

May break down if too much data is incorrect Clark-Wilson Integrity Model (2)

BLP protects data against unauthorized users

The Clark-Wilson (CW) integrity model also tries to protect data against authorized users

This has started long before computers, two mechanisms are especially important Well-formed transaction: user is constrained in the way they can manipulate data, e.g. record a log of changes

double entry bookkeeping

Separation of duty: executing different subparts of a task by different persons, e.g. authorizing purchase order, recording arrival, recording arrival of invoice, authorizing payment Clark-Wilson Integrity Model (3)

How are these concepts translated for data integrity in information systems?

Identify Constrained Data Items (CDIs) These are data items for which integrity has to be upheld

Not all data items need to be CDIs, other data items are called Unconstrained Data Items (UDIs)

The integrity policy is defined by two classes of procedures: Integrity Verification Procedures (IVPs)

Transformation Procedures (TPs) Integrity Verification Procedures

IVPs check that all CDIs in the system conform to integrity constraints

On successful completion this confirms that at the time of running an IVP, the integrity constraints were satisfied

Example for accounting: Audit functions are typical IVPs

For example, an auditor confirms that the books are balanced and reconciled Transformation Procedures

TPs correspond to the concept of a well-formed transaction

Applying a TP to a CDI in a valid state will result in a CDI that is still in a valid state

Example for accounting: A double entry transaction is a TP IVPs and TPs

CW works in the following way: We start in a valid state (confirmed by IVPs)

We only manipulate CDIs with TPs

We will always be in a valid state

What’s missing? A system can enforce that CDIs are only manipulated by TPs

A system cannot ensure (on its own) that the IVPs and TPs are well-behaved

IVPs and TPs need to be certified by someone Certification

The security officer (or an agent with similar responsibilities) certifies IVPs and TPs with respect to an integrity policy

This is usually not done automatically, but semi-automatically (i.e. some human intervention is necessary)

Accounting example: SO certifies that TPs implement properly segregated double entry bookkeeping CW Model

Assuring the integrity in CW consists of two important parts: Certification (which is done by a human)

Enforcement (which is done by a system)

CW consists of a set of rules referring to certification (C-rules) and enforcement (E-rules) Concrete Rules

Basic framework: C1: (IVP certification) Successful execution of IVPs confirms that all CDIs are in a valid state

C2: (TP certification)

Agent specifies relations of the form (TP_i, (CDI_i1, CDI_i2, ...)) that define for which CDIs TP_i returns valid states

E1: (TP enforcement) System maintains a list of these relations and only allows manipulation of CDIs via the specified TPs Concrete Rules (2)

Separation of duty: C3: (User certification)

Agent specifies relations of the form (UserID, TP_i, (CDI_i1, CDI_i2, ...)) that define which TPs may be executed on which CDIs on behalf of which user

E2: (User enforcement) System maintains a list of these relations and enforces them

E3: (Authentication) Each user must be authenticated before executing a TP for that user

E4: (Certification and execution) Only an agent permitted to certify entities may change relations; this agent may not have any execution rights with respect to that entity Concrete Rules (3)

Logging/Auditing: C4: (Logging) All TPs must be certified to write to an append-only log, making it possible to reconstruct each executed operation

There’s no special enforcement rule, because the log can be modeled as a CDI

and each TP has to be certified to append to this CDI
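
Pulling the enforcement rules E1-E3 together in a minimal Python sketch (the certified relations, user names, and CDI names are made up; in a real system they would be the output of the certification steps):

```python
# Sketch: Clark-Wilson enforcement; certification (the C-rules) is assumed to
# have happened already, and these sets stand in for its results
certified_tp_cdi = {("post_entry", "ledger"), ("post_entry", "journal")}   # E1
certified_user_tp_cdi = {("alice", "post_entry", "ledger")}                # E2
authenticated_users = {"alice"}                                            # E3

def may_execute(user, tp, cdi):
    return (user in authenticated_users
            and (tp, cdi) in certified_tp_cdi
            and (user, tp, cdi) in certified_user_tp_cdi)

print(may_execute("alice", "post_entry", "ledger"))   # True
print(may_execute("bob", "post_entry", "ledger"))     # False
```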

We’re almost done. . . Concrete Rules (4)

Relationship between UDIs and CDIs C5: (Transformation) Any TP taking a UDI as an input must be certified to only use valid transformations when taking the UDI input to a CDI

or reject the UDI

This is important as new information is fed into a system via UDIs For example, data typed in by a user may be a UDI Summary CW

CW enforces integrity: Uses two different integrity levels: CDIs and UDIs (CDI being the higher level)

“Upgrades” from UDIs to CDIs via certified methods

Could also enforce confidentiality (not a main concern in the original model): If users are only allowed to access data through Transformation Procedures as well

Although in this case, no data will be changed Role-Based Access Control

Role-based Access Control (RBAC) is a mandatory access control model

Mainly for commercial/civilian applications, so does not have the different levels of secrecy of Bell-La Padula

Can be seen as a more strict version of the Access Control List (ACL) Main difference to ACL: in RBAC a certified entity grants and revokes privileges, this is not done by the user Roles

Similar to ACLs, RBAC also puts users into different groups; these are called roles

Depending on their current role, users can access (and modify) certain data objects

For example, in a hospital setting you could have the following roles: doctors, nurses, pharmacists, administrators While doctors may modify a diagnosis file, administrators may not do so

Let’s give a more formal description now Formal Definition

In RBAC you have a subject s

role r

transaction t (transactions access/modify data)

object o

mode x (read, write, execute, etc.)

Furthermore, there are AR(s): the active role of s, i.e. the role s is currently using

RA(s): the roles s is authorized to perform

TA(r): the transactions a role is authorized to perform Formal Definition (2)

A subject s may execute a transaction t if exec(s, t) is true

exec(s, t) is true, if and only if s can execute t at the current time This is determined by AR(s) and TA(r)

This is the case if t ∈ TA(AR(s)) Rules in RBAC

RBAC enforces the following rules 1 Role assignment: the assignment of a role is a necessary prerequisite for executing transactions For all s, t : exec(s, t) can only be true if AR(s) ≠ ∅

2 Role authorization: a subject’s active role must be authorized For all s : AR(s) ∈ RA(s)

3 Transaction authorization: s can execute t only if t is authorized for the active role of s For all s, t : exec(s, t) is true only if t ∈ TA(AR(s)) Rules in RBAC (2)

Access modes can be defined for transactions (to allow a finer granularity) This defines what a user (in a certain role) can do with data (read-only, modify, etc.)

access(r, t, o, x) is true, if and only if role r may execute transaction t on object o in mode x

So sometimes a fourth rule is added to RBAC: 4 For all s, t, o : exec(s, t) is true, only if access(AR(s), t, o, x) is true Role Relationships
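
A minimal Python sketch of these rules, reusing the hospital roles from the earlier slide (the concrete role and transaction assignments are illustrative):

```python
# Sketch: RBAC check exec(s, t), i.e. t must be in TA(AR(s))
RA = {"dr_house": {"doctor"}, "admin_1": {"administrator"}}   # authorized roles
AR = {"dr_house": "doctor", "admin_1": "administrator"}       # active role
TA = {"doctor": {"prescribe_drug", "diagnose"},               # allowed transactions
      "administrator": {"schedule_appointment"}}

def exec_allowed(subject, transaction):
    role = AR.get(subject)                        # rule 1: a role must be assigned
    return (role is not None
            and role in RA.get(subject, set())    # rule 2: active role is authorized
            and transaction in TA.get(role, set()))  # rule 3: transaction authorized

print(exec_allowed("dr_house", "prescribe_drug"))   # True
print(exec_allowed("admin_1", "prescribe_drug"))    # False
```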

The relationship between users, roles, transactions, and objects can also be described graphically

Diagram: User 1 and User 2 are members of Role 1; Role 1 may execute transaction_a on Object 1 and transaction_b on Object 2

A more concrete example in a hospital setting:

Diagram: Dr House and Dr Wilson are members of the role doctor; the doctor role may execute prescribe_drug on Patient file 1 and diagnose on Patient file 2

Summary RBAC

Often MAC is equated with military applications, while DAC is seen as appropriate for commercial/civilian applications

However, DAC might not always provide the appropriate security in a commercial/civilian setting

RBAC fills this gap: it’s a MAC model without the specific military requirements Summary of Formal Models

Access Control Matrix, BLP, CW, and RBAC are just four (though important) models

Other well-known models include Biba (integrity) and Chinese Wall (hybrid)

Due to time constraints they are not discussed here

We will now turn our attention to the management side of security policies Organization-wide Policies

Ideally, a policy for an organization should be light, i.e. not covering hundreds or even thousands of pages

simple and practical

easy to manage and maintain

accessible, i.e. people can find what they’re looking for

Policy should be organized in a hierarchical fashion: Security framework paper

Position papers (on specific issues) Organization-wide Policies (2) Framework Paper

The security framework document should state an organization’s commitment to information security

define a classification system (e.g. based on one of the discussed formal models)

make clear that users and administrators will be held accountable for their behavior

give the SO appropriate authority to enforce security policy

state clearly the responsibilities that individuals have

define how security will be reviewed Position Papers

Position papers address specific aspects of the security policy, e.g. how to configure servers connected to the internet

the process to be followed in the event of a security breach

should be focused and kept short

can have different authors (should be written by someone with expertise in the area) Position Papers (2)

Exact topics depend on the organization, here are some important ones: Physical security

Network security

Access control

Authentication

Encryption

Key management

Compliance

Auditing and Review

Security Awareness

Incident response and contingency plan Content of Position Papers

Scope: should describe which issue, organizational unit, or system is covered

Ownership: name and contact details of the document owner

Validity: policies can/should have limited lifespans (after which they’re reviewed)

Responsibilities: who is responsible for which elements

Compliance: the consequences of non-compliance with the policy

Supporting documentation: references to documents higher and lower in the policy structure

Position statement: statements about the actual issue or system (the hard part)

Review: whether, when, and how issue or system will be reviewed Summary

Policies are an important part of information security, as they also cover the non-technical side

There’s no “one size fits all”, policy must be adapted to the risk profile of an organization There are (customizable) policy solutions out there, so not everything has to be done from scratch

Security policies should be easy to access, use and understand

For a more detailed example, see http://www.securityfocus.com/infocus/1193 Chapter 6 Social Engineering Introduction

As already mentioned, security is not just a technological problem, but also a people and management problem

Social engineers bypass the technological defenses and target people

Social engineering is about manipulating people to gain access to confidential information

perform fraudulent actions (they wouldn’t normally do)

We’ll have a look at some well-known techniques Common Methods

Posing as a fellow employee (easier in a large organization)

Posing as an employee of a vendor, partner company, or law enforcement

Posing as someone in authority

Posing as a new employee requesting help

Using insider lingo and terminology to gain trust Common Methods (2)

Posing as a vendor or systems manufacturer offering a system patch or update

Offering help if a problem occurs, then causing problem, and waiting for victim to ask for help

Sending free software or patch for installation

Sending virus/Trojan Horse as e-mail attachment

Leaving CD/DVD/USB device with malware (malicious software) around the workplace

Offering prizes for registering at a web site Common Methods (3)

Dropping document/file at mail room for intraoffice delivery

Modifying fax machine heading to appear to come from internal location

Asking receptionist to receive, then forward a fax

Pretending to be from remote office and asking for e-mail access locally “Social Engineering Life Cycle”

Research: attackers try to find out as much as possible beforehand from annual reports, brochures, web site, dumpster diving

Developing trust: use of insider information, misrepresenting identity, need for help or authority

Exploiting trust: ask victim for information or other form of help (or manipulate victim into asking for help)

Utilize information: if final goal not reached yet, go back to earlier steps Why Do Social Eng. Attacks Work?

There are certain basic tendencies of human nature that are exploited

Authority: people have a tendency to comply when approached by a person in authority “I’m with the IT department/I’m an executive/I work for an executive/. . . ”

Liking: people seem to comply when a person comes across as likable (similar interests, beliefs, and attitudes) “I have the same hobby/I come from the same town/. . . ” Why Do Social Eng. Attacks Work? (2)

Reciprocation: people feel obliged to help if they’ve received a favor or gift “Let me help you with sorting out your computer problems. . . ”

Consistency: people feel that they have to honor any pledges or commitments made in public “But your company has a reputation for. . . /You have done this for so-and-so/. . . ” Why Do Social Eng. Attacks Work? (3)

Social validation: people have a tendency to comply when doing so appears to match what others are doing “Your colleagues have also answered this survey. . . ”

Scarcity: people seem to behave differently when an asset is in short supply “The first hundred people to register will receive a free. . . ” Warning Signs of an Attack

Unusual request

Refusal to give callback number

Claim of authority

Stresses urgency

Threatens negative consequences in case of non-compliance

Shows discomfort when questioned or challenged

Name dropping

Compliments or flattery

Flirting How to Prevent Attacks

What’s going to stop an attack by a social engineer? A firewall?

Strong authentication devices?

Intrusion detection systems?

Encryption?

The answer is no to all of these How to Prevent Attacks (2)

So what does work?

Have procedures in place for handling suspicious requests: Make them part of the security policy

Let staff know about these

Training of staff plays an important role

Explain why certain procedures are put in place (blind obedience doesn’t work) Training Staff

One of the most important points is to make sure that employees verify the identity of a caller/visitor

This can involve: If caller is known, this can be as simple as recognizing their voice

Ask for a phone number to call back (and check number)

Contact caller’s/visitor’s superior

Ask a trusted employee to vouch for a person

Staff also needs to find out the “need-to-know”: why a certain person wants information and why they are authorized Training Staff (2)

Make sure that staff handling sensitive information are aware of the fact and know potential effects of handing out information

Staff has to be trained to challenge authority when security is at stake Training should include teaching friendly ways of doing this

This has to have support from high-level management, otherwise people will stop doing this

Don’t forget anyone in the training: Security guards may have physical access to sensitive information or systems and be asked to help out Training Staff (3)

Training should not just involve organizing seminars that go on for hours People just switch off after some time

Much better is hands-on training Stage simulated attacks on your organization, e.g. by sending them phishing e-mails (PhishGuru), distributing USB sticks, etc.

Then debrief employees on how to change/improve their behavior (sticks much better due to first-hand experience)

Don’t combine this with punitive actions, this is not about punishing people, but about helping them Summary

There are many (non-technical) ways to gain access to information

Very often these kinds of attacks are the most successful

A very important way of protecting an organization is to train the employees Chapter 7 Cryptography Introduction

We’ll now turn to technical issues of information security

One of the most important tools available is cryptography

Cryptography is the (art and) science of keeping information secure This is usually done by encoding it

Cryptanalysis is the (art and) science of breaking a code

Cryptology is the branch of mathematics encompassing both cryptography and cryptanalysis Introduction (2)

Cryptography can help in providing: Confidentiality: only authorized persons are allowed to decode a message

Authentication: receiver of a message (e.g. a password) should be able to ascertain its origin

Integrity: receiver of a message should be able to verify that it hasn’t been modified

Non-repudiation: a sender shouldn’t be able to falsely deny that they sent a message

We’re talking about messages here, but the principles can be applied to any information Basic Definitions

The original message is plaintext (sometimes also called cleartext)

Disguising the content of a message is called encryption

This results in ciphertext, the encrypted message

Turning ciphertext back into plaintext is called decryption

Diagram: Plaintext → Encryption → Ciphertext → Decryption → Original Plaintext

Basic Definitions (2)

Plaintext is denoted by P or M (for message)

It’s a stream of bits, intended for transmission or storage, e.g. a textfile

a bitmap

digitized audio data

digital video data

Ciphertext is denoted by C and is also binary data Can be the same size as M

Can be larger

Can be smaller (if combining encryption with compression) Basic Definitions (3)

The encryption function E operates on M to produce C: E(M) = C

The decryption function D operates on C to produce M: D(C) = M

As the whole point of encrypting and then decrypting is to recover the original message, the following has to be true: D(E(M)) = M Algorithms

A cryptographic algorithm (also called a cipher) is the mathematical function used for encryption and decryption

Usually there are two functions, one for encryption and one for decryption

If security is based on keeping the algorithm secret, then it’s a restricted algorithm

Restricted algorithms are not a good idea: Every time a user leaves a group, the algorithm has to be changed

A group must develop their own algorithm; if they don’t have the expertise, then it will be subpar Algorithms and Keys

Modern algorithms use a key, denoted by K

A key is taken from a large range of possible values, the keyspace

used as additional input for the en-/decryption function to do the en-/decrypting

Diagram: Plaintext → Encryption (using key) → Ciphertext → Decryption (using key) → Original Plaintext

Algorithms and Keys (2)

So, the following holds:

E_KE(M) = C

D_KD(C) = M

D_KD(E_KE(M)) = M

The key used for encryption, K_E, can be the same as the key used for decryption, K_D, or the keys can be calculated from each other If this is the case, we have a symmetric algorithm

Otherwise, it’s an asymmetric algorithm Algorithms and Keys (3)

The security is based in the key, not in the details of the algorithm

This has several advantages: The algorithms can be published and analyzed by experts for possible flaws

Software for the algorithm can be mass-produced

Even if an eavesdropper knows the algorithm, without the key messages cannot be read

An algorithm together with all possible plaintexts, ciphertexts, and keys is called a cryptosystem Symmetric Algorithms

Symmetric algorithms are sometimes also called conventional algorithms Secret-key or single-key algorithm means that encryption and decryption keys are identical

Anyone who has the key can decrypt a message

However, before two persons/systems can communicate, they need to agree on a key Asymmetric Algorithms

Asymmetric algorithms are also often called public-key algorithms The key used for encrypting messages is the public key

The key used for decrypting messages is the private key

The public key can be published, so that anyone can encrypt messages to a person/system

The private key is only known to the receiver

This only works if the private key cannot be calculated from the public key (in any reasonable time) Asymmetric Algorithms (2)

Asymmetric algorithms work similar to padlocks: You distribute open padlocks (for which only you have a key)

Anyone sending you a message puts it into a box and snaps the padlock shut (encryption using public key)

Only you can unlock the padlock using the key (decryption using private key) Encryption Techniques

There are different techniques for actually doing the encryption

We are going to have a look at the most important ones: Substitution ciphers

Transposition ciphers

Stream ciphers

Block ciphers Substitution Ciphers

In a substitution cipher each character in the plaintext is substituted for another character

One of the earliest techniques to be used in the famous Caesar cipher

In the Caesar cipher each character is shifted (by three characters in the original cipher) Monoalphabetic Substitution Ciphers

Caesar cipher is a special case of a simple substitution cipher (or monoalphabetic cipher)

Instead of just shifting characters, the alphabet can be mapped to a (random) permutation: Monoalphabetic Substitution Ciphers (2)

Monoalphabetic ciphers are not very secure and can be easily broken by statistical means: Different characters have typical frequencies in languages
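
A minimal Python sketch of a Caesar-style substitution together with the letter-frequency count that such a statistical attack relies on (message and shift are illustrative):

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

def caesar_encrypt(plaintext, shift=3):
    # Monoalphabetic substitution: every letter is replaced by the letter
    # 'shift' positions further along (3 in the original Caesar cipher)
    table = str.maketrans(ALPHABET, ALPHABET[shift:] + ALPHABET[:shift])
    return plaintext.upper().translate(table)

ciphertext = caesar_encrypt("ATTACK AT DAWN")
print(ciphertext)   # DWWDFN DW GDZQ
# The letter frequencies of the plaintext survive the substitution unchanged:
print(Counter(c for c in ciphertext if c.isalpha()))
```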

There have been various attempts at making substitutions more secure Homophonic Substitution Ciphers

Homophonic substitution ciphers try to obscure the frequencies by mapping a character to more than one code For example, “A” could correspond to 5, 13, 25, or 56; while for “B” this could be 7, 19, 32, or 42

While this makes analysis a bit harder, it doesn’t hide all statistical properties

With the help of a computer can usually be broken in a few seconds Polygram Substitution Cipher

Instead of encoding single characters, a polygram substitution cipher encrypts groups of letters For example, “ABA” could correspond to “RTQ”, while “ABB” could correspond to “SLL”

The example above uses 3-grams (groups of 3 letters); can be generalized to n-grams

Still not a very secure way of encrypting data: This hides the frequencies of individual letters

However, natural languages also show typical frequencies for n-grams (although the curve is flattened) Polyalphabetic Substitution Cipher

A polyalphabetic substitution cipher uses multiple simple substitution ciphers

The particular one used changes with the position of each character of the plaintext There are multiple one-letter keys

The first key encrypts the first letter of the plaintext, the second key encrypts the second letter of the plaintext, and so on

After all keys are used, you start over with the first key

The number of keys determines the period of the cipher Polyalphabetic Substitution Cipher (2)

An early example is the Vigenère cipher

Adds a key repeatedly into the plaintext using numerical codes: A=0, B=1, . . . , Z=25

This is done modulo 26, i.e. if the result is 26 or greater, we divide it by 26 and take the remainder: C = (M + K) mod 26

For example, P(15) + U(20) = 35 mod 26 = 9 Polyalphabetic Substitution Cipher (3)
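
A minimal Python sketch of the Vigenère rule C = (M + K) mod 26 (the key is illustrative; plaintext and key are assumed to be upper-case letters A-Z only):

```python
def vigenere_encrypt(plaintext, key):
    # Repeat the key over the plaintext and add the letter values modulo 26
    out = []
    for i, ch in enumerate(plaintext):
        m = ord(ch) - ord("A")
        k = ord(key[i % len(key)]) - ord("A")
        out.append(chr((m + k) % 26 + ord("A")))
    return "".join(out)

print(vigenere_encrypt("P", "U"))               # J: 15 + 20 = 35, 35 mod 26 = 9
print(vigenere_encrypt("ATTACKATDAWN", "KEY"))  # KXRKGIKXBKAL
```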

Unfortunately, the Vigenère cipher is also not very secure

Code can be broken by analyzing the period Looking at the example from previous slide, we notice that KIOV is repeated after 9 letters, NU after 6 letters

As 3 is a common divisor of 6 and 9, this is a hint that the period could be 3

Knowing which letters were encoded with the same key allows the application of frequency methods again Rotor Machines

In the 1920s, mechanical encryption devices were developed (to automate the process)

Basically, these machines implemented a complex Vigenère cipher

A rotor is a mechanical wheel wired to perform a general substitution

A rotor machine has a keyboard and a series of rotors, where the output pins of one rotor are connected to the input of another

For example, in a 4-rotor machine, the 1st rotor might substitute A → F, the 2nd F → Y, the 3rd Y → E, the 4th E → C; so the final output for A is C

After each output, some of the rotors shift Combination of several rotors and shifting of rotors leads to a period of 26^n (where n is the number of rotors)

A long period makes it harder to break the code

The best-known rotor machine is the Enigma

Rotor Machines (2) Transposition Ciphers

Substitution ciphers on their own are usually not very secure

That is why they are combined with transposition ciphers Transposition ciphers on their own are also not very secure

In a transposition cipher the symbols of the plaintext remain the same, but their order is changed Simple Columnar Transposition Cipher

In a simple columnar transposition cipher the plaintext is written horizontally onto a piece of graph paper of fixed width

while the ciphertext is read off vertically

W E A R E D
I S C O V E
R E D F L E
E A T O N C
E

Reading the columns vertically gives the ciphertext: WIREEESEAACDTROFOEVLNDEEC

Double Columnar Transposition
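
A minimal Python sketch of the simple columnar transposition shown above (width 6, spaces removed):

```python
def columnar_encrypt(plaintext, width=6):
    # Write the text in rows of the given width, then read it off column by column
    text = plaintext.replace(" ", "").upper()
    rows = [text[i:i + width] for i in range(0, len(text), width)]
    return "".join(row[col] for col in range(width)
                   for row in rows if col < len(row))

print(columnar_encrypt("WE ARE DISCOVERED FLEE AT ONCE"))
# WIREEESEAACDTROFOEVLNDEEC
```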

A simple columnar transposition can be broken by trying out different width/column lengths

Putting the ciphertext through a second transposition enhances security In contrast to a simple substitution cipher, where a second application does not increase security Stream Ciphers/Block Ciphers

In symmetric algorithms we also distinguish between stream ciphers and block ciphers

Stream ciphers operate on the plaintext a single bit (or single character) at a time Simple substitution cipher is an example

Block ciphers operate on groups of bits (or groups of characters) An example of an early block cipher is Playfair Playfair

Place the alphabet in a 5x5 grid, permuted by the keyword (omitting the letter J; J=I)

Then divide the plaintext into pairs of letters preventing double letters in a pair (by separating them with an ‘x’)

adding a ‘z’ to the last pair (if necessary) Playfair (2)

For example, “Lord Granville’s letter” becomes “lo rd gr an vi lx le sl et te rz”

Then encrypt text pair by pair Replace two letters in the same row or column by succeeding letters, e.g. “am” → “LE”

Otherwise the two letters are in opposite corners of a rectangle, replace them with letters in the other corners, e.g. “lo” → “MT” Modern Algorithms

So far we have covered algorithms that are mainly of historical interest; they are good for explaining basic cryptographic concepts with easy-to-understand algorithms

None of these algorithms should be used today (they can all be broken easily)

We will now turn to more modern algorithms (which are still relevant today)

First we introduce some more concepts

In the 1940s Claude Shannon suggested that strong ciphers can be built by combining substitution and transposition

This relies on the principles of confusion and diffusion: confusion adds an unknown key value to confuse an attacker about the true value of a plaintext symbol

Diffusion spreads the plaintext information through the ciphertext SP-Networks

Early (modern) block ciphers used simple networks combining substitution and permutation circuits

These networks are called substitution-permutation networks (or SP networks)

Let’s have a look at a simple SP network operating on 16-bit blocks SP-Networks (2)

A substitution box (S-box) is basically a look-up table containing some permutation of the numbers 0 to 15: 0000 → 1001, 0001 → 0100, 0010 → 1100, 0011 → 0000, . . . SP-Networks (3)

The mapping used for the S-boxes needs to be one-to-one (so decryption is possible)

The output of S-boxes is shuffled with a permutation network before being fed into the next layer of S-boxes

Each layer is also called a round

One question remains: how do you add a key? SP-Networks (4)

After each round you add (part of) the key via an XOR (exclusive-or) function Using an SP Network
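A toy single round of such a network on a 16-bit block, sketched in Python; the first four S-box entries follow the mapping from the slide, while the remaining entries and the bit permutation are made-up (but one-to-one):

    SBOX = [0x9, 0x4, 0xC, 0x0, 0x7, 0x2, 0xE, 0x1,
            0xA, 0x6, 0x3, 0xF, 0x5, 0x8, 0xD, 0xB]      # a one-to-one mapping of 0..15
    PERM = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]  # destination of each bit

    def sp_round(block, round_key):
        # substitution: run each 4-bit nibble through the S-box
        substituted = 0
        for shift in (12, 8, 4, 0):
            substituted = (substituted << 4) | SBOX[(block >> shift) & 0xF]
        # permutation: bit i (counted from the most significant bit) moves to position PERM[i]
        permuted = 0
        for i in range(16):
            permuted |= ((substituted >> (15 - i)) & 1) << (15 - PERM[i])
        # key addition: XOR in (part of) the round key
        return permuted ^ round_key

    print(hex(sp_round(0x1234, 0xABCD)))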

Why use an SP network? Why not do a direct mapping from 16-bit code to 16-bit code? This would need 2^20 bits of memory (2^16 × 16 bits)

Using the 4-bit S-boxes needs 64 bits (2^4 × 4 bits)

While a 16-bit mapping could be stored in memory, 16-bit block ciphers are far from secure

In order to get a decent level of security, you need ciphers with a block size of at least 64, 128, or even more bits: 2^64 × 64 bits ≈ 100 million Terabytes Using an SP Network (2)

Main reason to use an SP network: saves space

However, in order to make an SP network block cipher secure, the following conditions have to be met: The block size needs to be large enough

We need to have enough rounds

The S-boxes need to be chosen carefully Block Size

The smaller the block size, the more vulnerable a block cipher is to attacks If an attacker can obtain (or guess) some plaintext messages, they can create a dictionary

With a dictionary any message encrypted with a certain key can then be decrypted

The smaller the block size, the easier it is to complete such a dictionary Number of Rounds

Good ciphers exhibit the avalanche effect: a slight change in the input (e.g. toggling one bit) has a major impact on the output

Ideally you want to create a lot of diffusion, i.e. spread the information through the whole ciphertext

Having just one or two rounds is not adequate, a slight change in the input will only have a minor impact on the output

The exact number of rounds that are needed also depends on how quickly an SP network diffuses the data Choice of S-Boxes

S-boxes need to add enough randomness to be effective

Let’s look at an example of an S-box with 3 input and 3 output bits: 000 → 010, 001 → 000, 010 → 011, 011 → 001, 100 → 101, 101 → 111, 110 → 100, 111 → 110. The most significant bit passes unchanged through the S-box

This would be a bad choice for an S-box Concrete Algorithms

We’ll now have a look at algorithms based on SP networks

These algorithms are still in use (or have been used until recently): the Data Encryption Standard (DES)

Advanced Encryption Standard (AES) Data Encryption Standard

Data Encryption Standard (DES) is a well-known block cipher

The original algorithm encrypts the plaintext in 64-bit blocks using a 64-bit key The key is actually 56 bits, every 8th bit is used for parity

The fundamental building block of DES is a substitution followed by a permutation on a 32-bit block DES uses an SP network on half of the block at a given time

DES has 16 rounds, applying the same technique on a plaintext block 16 times Data Encryption Standard (2)

First step is an initial permutation of the plaintext

Then there are 16 rounds of identical operations: A block is broken up into a left and a right half

In function f , the data is combined with the (current round) key and then run through an SP network

After the 16th round, there is a final permutation Data Encryption Standard (3)

The function f is also called the Feistel function (named after its inventor)

At first glance this might look overly complicated What’s the advantage of all this criss-crossing?

Actually, it’s a very clever scheme that allows the same network to be used for encryption and decryption The only difference is that during decryption the round keys are applied in inverse order Data Encryption Standard (4)

DES was a certified standard for more than 20 years

However, there have been a couple of successful attacks on DES Deep Crack ($250,000 custom-built machine) breaks a DES key in 22 hours and 15 minutes (1999)

COPACOBANA ($10,000 custom-built machine) breaks DES in 9 days (2006)

DES has been replaced by the Advanced Encryption Standard (AES) AES uses a more sophisticated substitution/permutation algorithm on a 4x4 matrix of bytes Advanced Encryption Standard

Each input byte is sent through an S-box

Then the rows of the matrix are shifted

In a more complicated process the values in each column of the matrix are mixed

In a final step the current round key is added

This constitutes one round in AES Advanced Encryption Standard (2)

As we have 16 input bytes, AES has a block size of 16 × 8 = 128 bits.

AES can be used with different key sizes: 128, 192, or 256 bits

Depending on the key size, there are 10, 12, or 14 rounds, respectively

AES became a certified standard in 2002 It is the first published algorithm approved by the NSA for top secret documents (using 256-bit keys)

We will move away from SP networks now and look at other types of algorithms The “Perfect” Encryption Algorithm

There is an unbreakable encryption scheme, it’s called a one-time pad

It’s basically a Vigenère cipher with a truly random, non-repeating key Encryption is done by adding plaintext letter and a one-time pad key character modulo 26

Each key letter is used exactly once and for only one message

The sender encrypts the message and then destroys the used part of the key

The receiver (who needs an identical pad) follows the same procedure for decryption One-Time Pads
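A minimal Python sketch of the idea, restricted to the letters A-Z and using the secrets module as a stand-in for a truly random pad (purely illustrative):

    import secrets

    def otp_key(length):
        return [secrets.randbelow(26) for _ in range(length)]   # one random shift per letter

    def otp_encrypt(plaintext, key):
        return ''.join(chr((ord(c) - 65 + k) % 26 + 65) for c, k in zip(plaintext, key))

    def otp_decrypt(ciphertext, key):
        return ''.join(chr((ord(c) - 65 - k) % 26 + 65) for c, k in zip(ciphertext, key))

    pad = otp_key(5)                                    # used for exactly one message, then destroyed
    print(otp_decrypt(otp_encrypt("HELLO", pad), pad))  # HELLO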

Here is an example of a one-time pad (using numbers): One-Time Pads (2)

The crucial point is having a truly random sequence of key characters This makes statistical analysis impossible, every potential message has the same probability

Generating truly random numbers is harder than it looks: Most algorithms generate pseudo-random numbers

Better results can be achieved by integrating random events into the algorithm, e.g. mouse movements, disk operations, arrival times of network packets

Another problem is exchanging the key between sender and receiver RSA Public-Key Algorithm

RSA is a very popular and secure encryption algorithm RSA stands for the names of the inventors: Rivest, Shamir, Adleman

Has a public and a private key (asymmetric algorithm)

The public and private key are functions of a pair of large prime numbers Determining the Keys

Select two large prime numbers p and q at random

Compute their product n = p × q

Compute Euler totient f = (p − 1)(q − 1)

(Randomly) choose an encryption key e such that e and f are coprime, i.e. they have no factors in common Another way of formulating this: the greatest common divisor (gcd) of e and f is 1

Compute the decryption key d such that e × d = 1 mod f , i.e. e × d divided by f leaves remainder 1

The public key consists of e and n

The private key consists of d and n Determining the Keys (2)

Let’s do an example: Choose p = 47 and q = 71

n = pq = 3337

f = (p − 1)(q − 1) = 3220

A valid value for e is 79 (this can be checked using the Euclidean algorithm)

d = 1019, as (79 × 1019) mod 3220 = 1 Can be computed using the extended Euclidean algorithm

Now that we have the keys, what do we do with them? First step: destroy p and q, otherwise your private key is insecure Encrypting/Decrypting a Message

Encrypting a message is done by exponentiating the message M with e (modulo n): C = Me mod n

Decrypting a ciphertext is done by exponentiating it with d (modulo n): M = Cd mod n

Example: 53^79 mod 3337 = 65 and 65^1019 mod 3337 = 53
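The worked example can be checked directly in Python, whose built-in pow handles both the modular inverse (Python 3.8+) and the modular exponentiation:

    p, q, e = 47, 71, 79
    n, f = p * q, (p - 1) * (q - 1)   # n = 3337, f = 3220
    d = pow(e, -1, f)                 # modular inverse via the extended Euclidean algorithm: 1019
    c = pow(53, e, n)                 # encryption: 53^79 mod 3337 = 65
    m = pow(c, d, n)                  # decryption: 65^1019 mod 3337 = 53
    print(d, c, m)                    # 1019 65 53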

Caveat: M and n have to be coprime; if message is too large, split it up into blocks (you’re safe if M < p and M < q) Encrypting/Decrypting a Message (2)

Why does this work? Number theory! Euler showed in 1736 that if M and n are coprime, then M^f = 1 mod n

Furthermore, M^(f+s) = M^s mod n, i.e. the exponent can be taken modulo f

So, encrypting and then decrypting by computing (M^e)^d mod n yields M^1 mod n, which is M Remember that ed = 1 mod f Security

Now we know that RSA will reconstruct the original message again

Why is it secure? Can’t someone figure out d given n and e? This involves integer factoring, i.e. figuring out the factors p and q of n

The prime numbers p and q used in RSA are really big, so that n is sufficiently large A modulus n of around 300 bits can be factored in a few hours on a PC

A size of 1024 bits for n is probably safe in the near future

The best known algorithms for integer factoring still have a super-polynomial run-time in the size of the input What Now?

O.k., RSA is secure, but we are left with two problems: If integer factoring is so difficult, how do we figure out p and q, which have to be prime?

If the numbers involved are so huge, how do we encrypt and decrypt efficiently? Computing 39579930985039858787943 mod 780323454455344457457 doesn’t look easy Determining Prime Numbers

Factoring numbers is difficult, however, testing a number for primality is easier

There are quick probabilistic tests: Take a whole range of numbers x

Test whether a^(x−1) = 1 mod x for different values of a

If x is a prime, then this holds for all a not divisible by x

Repeating this for several values of a gives a high probability for having found a prime
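A minimal sketch of this probabilistic (Fermat) test in Python; note that a rare class of composites, the Carmichael numbers, can fool it, so real systems use refinements such as Miller-Rabin:

    import random

    def probably_prime(x, trials=20):
        if x < 4:
            return x in (2, 3)
        for _ in range(trials):
            a = random.randrange(2, x - 1)
            if pow(a, x - 1, x) != 1:   # a^(x-1) = 1 mod x must hold if x is prime
                return False            # witness found: x is definitely composite
        return True                     # probably prime

    print(probably_prime(47), probably_prime(3337))   # True, (almost certainly) False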

There are also deterministic tests, e.g. APR (Adleman, Pomerance, Rumely) These tests are usually very complicated to implement Efficient Exponentiation

This still leaves us with the problem of determining Me mod n and Cd mod n efficiently

As e and d are usually huge numbers, you don’t want to do a brute-force computation: that would mean multiplying M with itself e times and then taking the result modulo n

Don’t try to do this with your calculator

We can use a technique called repeated squaring to cut down on the number of multiplications Repeated Squaring

Assume we want to compute x59

It’s not too hard to compute the following series of numbers: x^2, x^4, x^8, x^16, x^32, . . . Multiplying x with itself gives you x^2, which multiplied with itself gives you x^4, which multiplied with itself gives you x^8, and so on

Incidentally, the exponents for the numbers we can compute quickly are all powers of 2

We can use the binary representation of the exponent and multiply all the numbers whose exponents appear in it: 59 = 111011 = 32 + 16 + 8 + 2 + 1, so x^59 = x^32 × x^16 × x^8 × x^2 × x^1

Repeated Squaring reduces the number of multiplications tremendously

However, the numbers can still become very big

Math comes to the rescue again: we can take modulo n after each multiplication

For example, if we want to compute 2^40 mod 100, then 40 = 32 + 8, so 2^40 = 2^32 × 2^8: 2^8 = 256 = 56 mod 100; 2^16 = 56 × 56 = 3136 = 36 mod 100; 2^32 = 36 × 36 = 1296 = 96 mod 100; 2^40 = 96 × 56 = 5376 = 76 mod 100 Some Additional Stuff on RSA
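Putting repeated squaring and the reduction modulo n together gives the usual square-and-multiply routine; a minimal sketch (Python's built-in pow(base, exp, mod) does the same job):

    def mod_exp(base, exponent, modulus):
        result = 1
        square = base % modulus
        while exponent > 0:
            if exponent & 1:                          # this power of two occurs in the exponent
                result = (result * square) % modulus
            square = (square * square) % modulus      # next repeated square, reduced mod n
            exponent >>= 1
        return result

    print(mod_exp(2, 40, 100))    # 76, as in the worked example
    print(mod_exp(53, 79, 3337))  # 65, the RSA example from before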

While the techniques mentioned above make RSA usable in practice, it’s still slower than DES or AES

The prime numbers p and q should not be ‘too close’: p − q should not be less than 2n^(1/4)

The private key needs to be ‘large enough’: d should be greater than n^(1/4)/3

The public key can be small, but then extra care has to be taken when encrypting messages: in order to compensate, the message has to be made larger by padding it Summary

This represents only a small glimpse into the world of cryptography

There are countless other algorithms out there: Grand Cru, Khufu, Twofish, just to name a few

Going deeper into this would involve some very sophisticated math RSA is one of the simpler algorithms in terms of understandability Chapter 8 Introduction to Cryptographic Protocols Introduction

Just having a high-tech cryptosystem in place does not guarantee security

Cryptography has to be used in a certain way to achieve security

That’s where cryptographic protocols come into play

A protocol is a series of steps, involving two or more parties, designed to accomplish a task Protocols

Series of steps: Each step must be well-defined

There must be specified action for every possible situation

Everyone involved must know the steps and follow them

Two or more parties: Parties can be friends and trust each other or adversaries and mistrust each other

Accomplish a task: This can involve sharing (parts of) a secret, confirming an identity, signing a contract, etc. Dramatis Personae

In order to simplify scenarios, over time the following cast has evolved: Alice: 1st participant

Bob: 2nd participant

Carol, Dave: more participants for multi-party protocols

Eve: Eavesdropper

Mallory: malicious active attacker

Trent: trusted arbitrator Arbitrated Protocols

An arbitrator is a disinterested third party trusted to complete the protocol Has no allegiance to any party involved

All people participating trust that he is acting honestly and correctly

Arbitrators can help complete protocols between parties that don’t trust each other

(Diagram: Trent as trusted arbitrator between Alice and Bob) Arbitrated Protocols (2)

In the real world, lawyers, public notaries, and banks act as arbitrators

For example, Bob can buy a car from Alice using an arbitrated protocol 1 Bob writes a check and gives it to the bank (Trent)

2 Bank puts enough money on hold to cover check and certifies the check

3 Alice gives the title to Bob and Bob gives the certified check to Alice

4 Alice deposits the check

This works, because Alice trusts the bank’s certification Arbitrated Protocols (3)

There are some problems with arbitrated protocols in the virtual world: It’s more difficult for people to trust a faceless entity somewhere in the network

An arbitrator can become a bottleneck, as he has to be involved in every transaction This may lead to even more delay (due to the arbitrator there’s always some delay)

Lots of damage can be caused if arbitrator is subverted

Someone has to pay for running an arbitration service Adjudicated Protocols

Arbitrators have high costs, so arbitrated protocols can be split into two sub-protocols: A non-arbitrated part

An arbitrated part that is executed only if there is a dispute

This special kind of arbitrator is called an adjudicator

(Diagram: Alice and Bob each submit evidence to Trent, the adjudicator) Adjudicated Protocols (2)

In the real world, judges are professional adjudicators

Alice and Bob can enter a contract without a judge; a judge only sees the contract if it is brought before a court 1 Alice and Bob negotiate the terms of the contract

2 Alice signs the contract

3 Bob signs the contract The following is only executed in case of a dispute: 4 Alice and Bob appear before a judge

5 Alice presents her evidence

6 Bob presents his evidence

7 The judge rules on the evidence Adjudicated Protocols (3)

Issues with adjudicated protocols in the virtual world Protocols rely in a first instance on the parties being honest

However, if someone suspects cheating, the protocol provides enough evidence to be able to detect this

In a good adjudicated protocol, this evidence also identifies the cheating party

Instead of preventing cheating, adjudicated protocols detect cheating The (inevitability of) detection acts as a deterrent Self-Enforcing Protocols

A self-enforcing protocol does not need an arbitrator or adjudicator

The protocol itself guarantees fairness

If one party tries to cheat, the other party is able to detect this immediately

The protocol then stops and/or punishes the cheating party

(Diagram: Alice and Bob interact directly, with no third party) Self-Enforcing Protocols (2)

A well-known self-enforcing protocol is for dividing up things, e.g. a cake 1 Alice divides up the piece of cake

2 Bob chooses which piece to take

Ideally, it would be nice to have self-enforcing protocols for every possible situation

Unfortunately, that’s not possible Basic Cryptographic Protocols

In the following we’re going to have a look at some basic protocols involving cryptography

These cover a range of different applications: Communicating securely

Signing documents

Authenticating users

Exchanging keys

We will also have a look at potential weaknesses of these protocols Symmetric Crypto Communication

An obvious application is secure communication by encrypting messages

Using symmetric cryptography, a protocol could look like this: 1 Alice and Bob agree on a cryptosystem

2 Alice and Bob agree on a key

3 Alice encrypts her plaintext message using the above cryptosystem and key

4 Alice sends ciphertext to Bob

5 Bob decrypts ciphertext using the above cryptosystem and key Symmetric Crypto Communication (2)

What are some of the problems of this protocol? Step 2 has to take place in secret (Step 1 and 4 can take place in public) Assuming that Alice and Bob use a cryptosystem in which all security relies on the key (not the algorithm)

For world-spanning communication, this is problematic Symmetric Crypto Communication (3)

Problems involving compromised secrecy If the key is compromised (stolen, guessed, extorted, bribed, etc.), then Eve can decrypt all messages encrypted with that key

Mallory could intercept messages and send his own, pretending to be Alice or Bob

This protocol assumes that Alice and Bob trust each other: Either one could claim that the key has been compromised and publish the communication anonymously Symmetric Crypto Communication (4)

Yet another problem is key management: If each pair of users in a network have their own key then the total number of keys increases rapidly with the number of users

For example, 10 users require 45 different keys; 100 users require 4950 keys

Sharing keys among users is not a solution: If someone leaves the network or is not trusted anymore, all users sharing keys with that person have to change their keys Asymmetric Crypto Communication

Let’s have a look at how this communication works with a public-key cryptosystem 1 Alice and Bob agree on a public-key cryptosystem

2 Bob sends Alice his public key

3 Alice encrypts her plaintext message using Bob’s public key

4 Alice sends ciphertext to Bob

5 Bob decrypts ciphertext using his private key Asymmetric Crypto Communication (2)

What has changed? The key exchange does not have to take place in secret anymore Even if Eve listens in on steps 2 and 4, she only has the public key and the ciphertext

This will not help her in recovering the private key or the plaintext

Everyone who wants to communicate with Bob uses Bob’s public key There’s no need to have a separate key for each pair of users Asymmetric Crypto Communication (3)

So, everything is fine now? Unfortunately, no. . . Enter Mallory (who is more powerful than Eve) In step 2, Mallory intercepts Bob’s public key and sends his own to Alice

Alice encrypts her message using Mallory’s public key (thinking it’s Bob’s) and sends it to Bob

When Alice sends her ciphertext, Mallory intercepts it, decrypts it with his own private key, re-encrypts it with Bob’s public key, then sends it to Bob

Mallory could even change the message

Alice and Bob think they have a secure connection

This is called a man-in-the-middle attack (we’ll come back to it later) Hybrid Cryptosystems

At first glance it seems a public-key algorithm is always better than a symmetric algorithm

So why not use public-key all the time? Public-key algorithms are much slower than most symmetric algorithms

Public-key algorithms are vulnerable to chosen-plaintext attacks: If there are only a few possible messages, then it’s possible to recover the plaintext (although not the private key)

Encrypt all possible messages with the public key

If one of the generated ciphertexts matches the message ciphertext, then you know what was sent Hybrid Cryptosystems (2)

It’s possible to combine a symmetric algorithm with a public-key algorithm in a protocol 1 Alice and Bob agree on a public-key cryptosystem and a symmetric cryptosystem

2 Bob sends Alice his public key

3 Alice generates a random session key for the symmetric algorithm

4 She encrypts the session key using Bob’s public key

5 Alice sends ciphertext to Bob

6 Bob decrypts ciphertext using his private key to recover the session key

7 Alice and Bob can now continue with the symmetric algorithm Hybrid Cryptosystems (3)
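A toy sketch of this hybrid idea, reusing the small RSA numbers from Chapter 7 to protect the session key; a hash-derived XOR keystream stands in for the symmetric algorithm (a real system would use AES and large RSA keys):

    import hashlib, secrets

    n, e, d = 3337, 79, 1019                     # Bob's toy public/private key from earlier

    def keystream(session_key, length):
        # derive a pseudo-random byte stream from the session key (stand-in for a real cipher)
        stream, counter = b"", 0
        while len(stream) < length:
            stream += hashlib.sha256(f"{session_key}:{counter}".encode()).digest()
            counter += 1
        return stream[:length]

    def xor_bytes(data, stream):
        return bytes(a ^ b for a, b in zip(data, stream))

    session_key = secrets.randbelow(n)           # step 3: Alice generates a random session key
    wrapped = pow(session_key, e, n)             # step 4: ... and encrypts it with Bob's public key
    recovered = pow(wrapped, d, n)               # step 6: Bob recovers it with his private key

    message = b"meet me at noon"
    ciphertext = xor_bytes(message, keystream(session_key, len(message)))
    print(xor_bytes(ciphertext, keystream(recovered, len(ciphertext))))   # b'meet me at noon'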

Using public-key cryptography for distributing a key improves key management The key exchange can take place in public, even though a symmetric algorithm is used for the bulk of communication

The session key is only used for a limited time and then destroyed The longer a key sits around, the higher the chances that it is vulnerable to compromise

The public-key cryptosystem is only used very sporadically, generating a very small number of ciphertexts

The less data, the harder it is to break a code One-Way Hash Functions

A one-way hash function is also known under different names: message digest, fingerprint, (cryptographic) checksum, message integrity check, manipulation detection code

A hash function is a function that takes a variable-length input string (called pre-image) and converts it to a fixed-length (generally smaller) output string (called hash value) A simple example is a function that takes the pre-image and returns a byte consisting of the XOR (exclusive-or) of all input bytes

Can be used as a quick check if two pre-images are the same: same pre-images have the same hash value One-Way Hash Functions (2)
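The XOR example just mentioned takes only a few lines of Python (it is of course far too weak for cryptographic use):

    def xor_hash(pre_image: bytes) -> int:
        # XOR all input bytes together into a single output byte
        h = 0
        for b in pre_image:
            h ^= b
        return h

    print(xor_hash(b"hello"), xor_hash(b"hello"))   # equal pre-images give equal hash values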

A one-way hash function is a hash function that works in one direction: it’s easy to compute a hash value from a pre-image, but it is hard to generate a pre-image that hashes to a particular value

The aforementioned XOR function is not one-way (given a hash-value, it’s easy to figure out a pre-image that will generate it)

A good one-way hash function is also collision-free: it’s hard to generate two pre-images with the same hash value

The hash function is public, the security lies in its one-wayness: it is computationally infeasible to find a pre-image hashing to a particular hash value One-Way Hash Functions (3)

If you want to verify that someone has a particular file (without sending the file itself), ask for its hash value

If they send you the correct hash value, then it is almost certain that they have that file

Can also be used to verify data transmission via an unreliable channel Send message and its hash value, if they don’t match, something has gone wrong

MD5 is a widely used one-way hash function (however, MD5 seems to have a weakness concerning collision-resistance)
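Such hash functions are available in Python's hashlib; a minimal sketch of fingerprinting a file in chunks (the filename is a made-up example; hashlib.md5() would work the same way, but SHA-256 is the safer choice):

    import hashlib

    def fingerprint(path):
        h = hashlib.sha256()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(8192), b''):
                h.update(chunk)               # feed the file to the hash in chunks
        return h.hexdigest()

    print(fingerprint("report.pdf"))          # compare values to check someone has the same file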

SHA-2 is a better bet in terms of security Authentication

Another important area of cryptographic protocols is authentication

When Alice logs into a computer (or ATM, or telephone banking system, or any other system) how does the host know who she is?

Traditionally passwords are used for this

Alice enters her password and the host confirms that it is correct

Both Alice and the host know a secret piece of knowledge, which the host requests every time Alice logs in Authentication (2)

The host does not actually have to know the password (i.e. store it in cleartext) That’s why sysadmins often cannot tell you your password, they don’t have it (they can only reset it)

It just has to be able to differentiate valid passwords from invalid ones

This can be done using one-way hash functions Authentication Using Hash Values

Instead of storing passwords, a host stores one-way hash values of the passwords 1 Alice sends the host her password

2 The host performs a one-way hash function on the password

3 Host compares the result to the stored value

If someone breaks into the host, they cannot steal the passwords

The one-way hash function cannot be reversed to obtain the passwords Dictionary Attacks

A file of passwords encrypted with a one-way function is still vulnerable

Mallory can take a list of the most popular passwords and a dictionary and run it through the one-way hash function Usually each term is used in different variations (e.g. concatenating numbers, using it backwards)

After stealing the password file, he can then compare the password list with the generated hash values to find matches

This is surprisingly successful, as many passwords are chosen poorly Password Eavesdropping
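A minimal sketch of the dictionary attack just described, against unsalted hashes (user, stored hash, and word list are all made up):

    import hashlib

    stolen = {"alice": hashlib.sha256(b"letmein1").hexdigest()}     # pretend this file was stolen
    wordlist = ["password", "123456", "letmein", "qwerty"]

    for word in wordlist:
        for variant in (word, word + "1", word[::-1]):              # try common variations
            if hashlib.sha256(variant.encode()).hexdigest() in stolen.values():
                print("found a password:", variant)                 # prints: found a password: letmein1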

So far, so good; however, we still have a serious problem: what happens if Eve listens in on Alice’s communication with the host?

Eve can intercept the password anywhere on the network or on the host before it is hashed For example, early electronic locks (e.g. remote control garage door openers, car locks) sent their serial number in the clear

This was very insecure, anyone able to listen in and record the signal was able to open a door Authentication with Public-Keys

Public-key cryptography can help solve this problem

The host keeps a file of every user’s public key, all users keep their private key

This is how the (straightforward) protocol works: 1 The host sends Alice a random string

2 Alice encrypts the string with her private key and sends it back (along with her name)

3 The host looks up Alice’s public key and decrypts the message with it

4 If the decrypted string matches what the host originally sent, then access is granted Authentication with Public-Keys (2)
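A toy version of this exchange, reusing the small RSA numbers from Chapter 7 (a real system would use large keys and padded challenges):

    import secrets

    n, e, d = 3337, 79, 1019                  # Alice's toy public key (e, n) and private key (d, n)

    challenge = secrets.randbelow(n)          # step 1: host picks a random challenge
    response = pow(challenge, d, n)           # step 2: Alice encrypts it with her private key
    print(pow(response, e, n) == challenge)   # steps 3+4: host checks with Alice's public key -> True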

This protocol is also called challenge-response

The host sends out a challenge (a random number) to which the Alice has to respond in the correct way

The response proves that Alice is in possession of her private key

Important: the private key is never transmitted, only encrypted messages

However, there are still weaknesses: The plaintext and its ciphertext are transmitted

If random numbers issued by the host are not truly random Authentication with Public-Keys (3)

Plaintext/ciphertext: Having access to (a large number of) plaintexts and corresponding ciphertexts makes the job of a cryptanalyst easier

Eve could listen in on Alice’s authentication attempts to obtain information to deduce private key

Random numbers: Mallory (knowing the sequence of future random numbers) could intercept the host’s challenge

Mallory forwards one of the future random numbers, gets it encrypted by Alice; Alice’s login will fail

However, Mallory now has a correct response to a future challenge Improved Protocol

Random number generator has to be implemented very carefully

To avoid the other weakness, use the following protocol: 1 Alice performs a computation on her private key and random number(s) and sends result to host

2 The host sends Alice a different random number

3 Alice performs a computation on her private key, her and the host’s random numbers and sends result to host

4 The host does some computation on the results received from Alice and her public key to verify she knows the private key

5 If the outcome is correct, access is granted Improved Protocol (2)

Step 1 might seem unnecessary, but is required to prevent certain attacks against the protocol

Basically, Alice proves to the host, that she knows her private key without giving away any information on her private key

and without actually encrypting plaintext with her private key

The weaknesses of the original protocol are also a good reason to not sign just any random document with your private key (it could have been sent by Mallory)

A related rule of thumb: first sign a document, and then encrypt it Chapter 9 Further Cryptographic Protocols Digital Signatures

Cryptography’s use is not limited to securing communication

It’s possible to sign electronic documents using a digital signature

Attaching/embedding a (graphical image of a) signature to a document does not help

It would be too easy to manipulate computer files and cut and paste signatures Symmetric Cryptosystem Signatures

Alice wants to sign a digital message and send it to Bob

Possible with the help of Trent, who shares a secret key KA with Alice and a secret key KB with Bob

1 Alice encrypts her message to Bob with KA

2 Alice sends the ciphertext to Trent

3 Trent decrypts the message with KA

4 Trent takes the message, adds a confirmation that it is from Alice, encrypts it with KB

5 Trent sends this bundle to Bob

6 Bob decrypts it with KB (to read the message and Trent’s confirmation) Symmetric Cryptosystem Signatures (2)

Proving to a third person, e.g. Carol, that a message is from Alice would involve Trent again: 1 Bob encrypts Alice’s message and Trent’s confirmation with KB

2 Bob sends the ciphertext to Trent

3 Trent decrypts the message/confirmation with KB

4 Trent checks that the message is indeed Alice’s

5 If this is the case, Trent takes the message, adds a confirmation that it is from Alice, encrypts it with KC

6 Trent sends this bundle to Carol

7 Carol decrypts it with KC (to read the message and Trent’s confirmation) Symmetric Cryptosystem Signatures (3)

In theory, this is a secure way of signing documents

If someone tried to “transfer” a signature to a different document, Trent would notice it

However, there are some problems in practice: Trent needs to store all messages that he confirmed (to be able to check validity of a signature)

Every transaction has to go through Trent, he would quickly become a bottleneck

Trent has to be a trusted entity and he has to be infallible If he makes a mistake, this would undermine other people’s trust in him Public-Key Cryptosystem Signatures

There are better ways for digitally signing a document

Many public-key algorithms (including RSA) can be used for this

This is based on the fact that the private key can also be used for encrypting messages

The public key is then used for decrypting it again 1 Alice encrypts the document with her private key

2 Alice sends ciphertext to Bob

3 Bob decrypts the ciphertext using Alice’s public key Public-Key Cryptosystem Signatures (2)

Why does this work? Bob can verify that the signature is authentic (by decrypting message correctly with public key)

The signature cannot be forged, only Alice knows her private key

The signature is not reusable (cannot be extracted from document and attached to another)

The signed document cannot be changed (any change of the ciphertext would lead to gibberish when applying public key)

There is a variant of RSA, called DSA (Digital Signature Algorithm), which can only be used for signatures (not encryption) Public-Key Cryptosystem Signatures (3)

Possible problems with this protocol: Works fine if nothing is gained from having another copy of the document

Things change if it is, e.g. a signed digital check Bob could take the check to a bank, which verifies Alice’s signature and transfers the money

However, Bob could keep a copy of the check and take it to another bank later

Consequently, digital signatures often include timestamps, i.e. current date and time is added to document before signing The bank can then compare the timestamp with a list of stored timestamps of used checks Public-Key Cryptosystem Signatures (4)

Non-repudiation is still a problem: Alice can sign a document and later claim that she did not She can claim that her key was compromised (she lost it, someone stole it) and someone else is pretending to be her

There has been a discussion about tamper-resistant modules Prevents Alice from accessing her own private key and accidentally “losing” it

If you need non-repudiation, then you have to fall back on an arbitrated protocol Arbitrated Non-Repudiation

1 Alice signs her message

2 Alice generates a header (containing some identifying information)

3 She concatenates header + signed message and signs this again, then sends it to Trent

4 Trent verifies Alice’s signature and confirms the header and adds a timestamp to the header + message

5 Trent signs this bundle of information with his own signature and sends it to both, Alice and Bob

6 Bob verifies Trent’s signature, the header, and Alice’s signature

7 Alice verifies Trent’s message and confirms, if it did not originate from her she raises an alarm Digital Signatures and Encryption

Digital signatures can be combined with encryption: 1 Alice signs a message with her private key

2 Alice encrypts the signed message with Bob’s public key

3 Alice sends ciphertext to Bob

4 Bob decrypts ciphertext with his private key

5 Bob verifies with Alice’s public key and recovers message Key Exchange

We still have an open problem: How can we safely transmit a (public) key, so that we can be sure it belongs to whom we think it belongs?

One solution is to have a secure database: Anyone may read the content and access a public key

Only Trent is allowed write-access to the database

To make sure that keys are not substituted during transmission, Trent can sign the keys, users can verify Trent’s signature Users of the database just have to make sure (once) that they have received the correct public key of Trent Key Exchange (2)

In this scenario Trent is called a Key Certification Authority (CA) or Key Distribution Center (KDC)

A large and well-known CA is VeriSign

Usually a CA signs a compound message called a certificate containing information about the holder of a public key

When you apply for a certificate with a CA, they check your identity before issuing the certificate Certificates

A certificate contains the following fields: Issuer: identity of the issuer (the CA)

Subject: identity of the owner of the public key

Public key: the public key of the subject

Dates valid: the time period for which the certificate is valid

Type: there are different classes of certificates Depends on the rigor with which identity check is done

Serial number

Fingerprint: a one-way hash function of previous data (more about these functions on next slide)

Signature: fingerprint signed by issuer Signing Fingerprints

Instead of signing documents directly, often the fingerprint is signed

This is done due to efficiency reasons, signing a fingerprint is quicker

If someone manipulates the original document this can be found out quickly: The result of applying the one-way hash function to the document and the signed fingerprint don’t match! Certificate Chains
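A toy sketch of signing a fingerprint rather than the whole document, again with the small RSA numbers from Chapter 7 (the document is made up, and the SHA-256 value is folded into the tiny modulus purely for illustration):

    import hashlib

    n, e, d = 3337, 79, 1019                     # signer's toy key pair
    document = b"Alice owes Bob 10 pounds"

    def toy_fingerprint(data):
        # shrink the real 256-bit hash so it fits the toy modulus
        return int.from_bytes(hashlib.sha256(data).digest(), 'big') % n

    signature = pow(toy_fingerprint(document), d, n)            # sign the fingerprint with the private key
    # verification: decrypt the signature and compare with a freshly computed fingerprint
    print(pow(signature, e, n) == toy_fingerprint(document))    # True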

If an organization wants its members to have a certificate, it usually doesn’t get them individually

It obtains an organization-wide certificate from a CA and then creates and signs a certificate for each member

So we have a chain of certificates: Member → Organization → CA Checking Certificates

If you want to check a certificate, you would 1 apply the issuer’s public key to the signed fingerprint, decrypting it

2 and compare the decrypted fingerprint with the provided fingerprint

3 optionally create the fingerprint of the certificate using the one-way hash function and compare it, too to verify that certificate hasn’t been tampered with

In case of a chain, you then need to verify the certificate of the issuer

Now we have a problem. . .

Who signs the CA’s certificate? Checking the CA’s Certificate

A CA’s certificate is self-signed, i.e. the issuer and the subject are identical

If Mallory approached you with his certificate signed by VeriSign and he also provided VeriSign’s certificate, would you trust him? Probably not, anyone can create a self-signed certificate

One solution for a CA is to widely publicize the fingerprint of their own certificate It’s not possible to cheat if everyone knows what a CA’s certificate must look like

One more thing: there are Certificate Revocation Lists (CRLs), as certificates can be invalidated Web Browsers and Certificates

Browsers are usually shipped with a number of self-signed certificates already installed

These are certificates that the vendor has verified and trusts

So if you trust the vendor, you can trust the installed certificates

Nevertheless, a browser lets you view installed certificates

install certificates yourself

specify whether you trust a certain certificate

Caveat: a browser does not necessarily check whether a certificate has been revoked Security

There’s never 100% security, but in order to subvert a key exchange involving a trusted third party Mallory would have to Replace Alice’s copy of the KDC public key

Corrupt the database: replace his own keys for the valid keys and sign with his private key

Unfortunately, using a KDC (or CA) is an arbitrated protocol

Can we cut out the man-in-the-middle attack with a self-enforcing protocol?

This can be done using an interlock protocol Interlock Protocol

The interlock protocol has a good chance of foiling a man-in-the-middle attack: 1 Alice sends Bob her public key

2 Bob sends Alice his public key

3 Alice encrypts message using Bob’s public key, she sends half of the encrypted message to Bob

4 Bob encrypts message using Alice’s public key, he sends half of the encrypted message to Alice

5 Alice sends the other half to Bob

6 Bob sends the other half to Alice

7 Alice puts the two halves together and decrypts them

8 Bob puts the two halves together and decrypts them Interlock Protocol (2)

The important point is that half of the message is useless without the other half (it cannot be decrypted on its own) If the encryption is a block algorithm, half of each block (e.g. every second bit) could be sent

Mallory can still substitute his key for Alice’s and Bob’s in steps 1 and 2

However, when he intercepts Alice’s message in step 3, he cannot decrypt it with his own private key and re-encrypt it with Bob’s public key

He must invent a completely new message and send half of it to Bob

Mallory has the same problem in step 4 Interlock Protocol (3)

By the time he intercepts the second halves in steps 5 and 6, it’s too late to change the invented messages

Alice and Bob won’t send the second half of their messages until they have received the first part from the other person

The conversation between Alice and Bob will turn out to be very different

Mallory has a chance of getting away with it, if he is able to mimic Alice and Bob closely enough

However, that is much harder to do than intercepting and reading messages Real-World Protocols

Let us now have a look at some protocols that are used in real applications Smartcard-based banking We are going to have a look at a range of different protocols used for banking

Kerberos This is a trusted third-party authentication protocol designed for TCP/IP networks

TLS (Transport Layer Security), formerly SSL (Secure Sockets Layer) TLS is a protocol for secure communication between a client and a server Smart Cards and Banking

Everyone uses them, but what happens when you insert your credit/debit/cash card into a terminal?

How does the terminal verify that your card is valid?

Newer cards employ the “Chip and PIN” method to do this, what’s inside the chip? Magnetic Strip Cards

Before Chip and PIN, magnetic strip cards and signatures (or PINs at ATMs) were used

Basically, the magnetic strip contains (some of) the information printed on the card in machine-readable form

It is relatively easy to copy the information on the magnetic strip, then all that’s left to do is to forge the signature

find out the PIN

However, not everything was bad for a customer If a signature turned out to be forged later on, the customer was not liable Weaknesses

Some (early) glitches included: Only the account number on the strip was used: Collecting the discarded ATM slip (which in the beginning had the complete account number on it)

and shoulder-surfing the PIN was enough to create a copy of a card

Storing a one-way hash code of the PIN on the card (insecurely): make a copy of your own card, replacing the account number with someone else’s

Take money from that account with your PIN

The relative ease with which card can be copied is still a problem Chip and PIN

Chip and PIN was introduced to make it harder to commit card-based fraud

Actually this is not just one protocol, but a whole suite of protocols

The standard is officially called EMV, after the participating institutions Europay, MasterCard, VISA In the meantime, Japan Credit Bureau (JCB) and American Express have joined as well

The complete standard consists of four books containing hundreds and hundreds of pages http://www.emvco.com/specifications.aspx EMV

EMV can be seen as a “construction kit” for building payment systems Word of warning: choosing building blocks at random is probably going to result in a very insecure system

We are going to look at the three main protocols Static Data Authentication (SDA)

Dynamic Data Authentication (DDA)

Combined Data Authentication (CDA)

Basic premise: a customer card contains a smartcard chip with the capability to verify a PIN

authenticate a transaction SDA

(Diagram: the Certification Authority holds key pair dCA / eCA; the Issuer holds key pair dI / eI, with eI signed using dCA and the static application data on the card signed using dI; the smart card communicates with the terminal, which connects to the Acquirer) SDA (2)

Usually the involved parties are the following: Issuer: who gives you the card (e.g. bank)

Certification authority: scheme operator (e.g. VISA)

Acquirer: participant in the scheme (e.g. shops, ATMs)

We’ll have a look at the protocol on the next slide

Basically, terminal verifies that the static application data (put on the card when it was created) has not been manipulated

PIN is correct before going ahead with a transaction SDA Protocol

What follows is a step-by-step description of what happens when a card gets inserted into a terminal

1 Terminal uses CA’s public key eCA (which it has to obtain from CA) to verify that issuer’s public key eI provided by card is valid

2 Terminal uses public key eI to verify that data on card is valid (and has not been modified)

3 Card user is authenticated via PIN: entered PIN is sent (in cleartext) to card, which verifies correctness

Purchase, cash withdrawal, etc. can now go ahead SDA Protocol (2)

4 Terminal sends transaction data (merchantID, amount, serial number, etc.) to card

5 Card encrypts this using a symmetric key shared with the issuer, creating an application cryptogram (or AC)

6 Next step is handled differently, depending on whether 1 terminal is online: terminal sends AC to issuer for verification (checks for sufficient funds, card reported as stolen, etc.)

2 terminal is offline: if amount is below a certain threshold (“floor limit”) 1 transaction is processed (AC is kept as evidence that card was used)

2 otherwise, transaction is aborted Weaknesses of SDA

Backwards compatibility with magnetic strip cards EMV has not been rolled out on a world-wide basis (e.g. United States hasn’t joined)

Even if EMV is used as a standard (such as in the UK), magnetic strip is sometimes used as fallback (e.g. if chip doesn’t work)

This makes it as vulnerable as magnetic strip cards

Eavesdropping on communication between card and terminal Criminals have used sniffing devices put between separate card readers and PIN pads

Tampered terminals have been found Weaknesses of SDA (2)

Picture below shows a terminal with an inserted wiretap Weaknesses of SDA (3)

“Yescards”: cards programmed to accept any PIN Only works in offline mode and relies on the fact that merchant cannot read the application cryptogram

Instead of generating a valid AC, card returns some random data that looks like an AC

However, you need a valid certificate (eI signed with dCA and card data signed with dI) Can be obtained by listening in on the communication of a valid card with a terminal DDA

Compared to SDA there is one additional public/private key pair in DDA This key pair, labeled with SC (for smart card), is used by the card

While the other private keys are only used for signing messages, dSC is actually stored on the card For this to work it has to be stored securely, under no circumstances may it leave the card

In addition to this, the smart card needs to be able to run public key crypto-algorithms This needs more computing power than the symmetric algorithm used in SDA DDA (2)

(Diagram: as for SDA, but the card additionally holds its own key pair dSC / eSC, with eSC signed using dI; the private key dSC never leaves the card) DDA Protocol

Let’s have a look at the protocol 1 Terminal sends challenge to card

2 Card signs this with its private key and sends it back

This assures the terminal that the card is present (somewhere)

3 Terminal verifies issuer’s public key eI on card with eCA

4 Terminal verifies card’s public key eSC with eI

Now terminal can actually verify that its challenge was correctly answered 5 Card user is authenticated via PIN: PIN is not sent in cleartext, but encrypted with eSC

Rest of steps as in SDA before CDA

Top-end protocol in terms of security

Similar to DDA Main difference: the application cryptogram is also signed by card using its private key (in addition to encrypting it)

This makes sure that the card that was authenticated earlier is also used for the transaction Otherwise, if you’re able to switch cards, transaction could be done with different card under DDA Further Weakness

Even CDA is not 100% secure

Vulnerable to a relay attack (tested experimentally by the Computer Lab in Cambridge) Set up bogus terminal in a cafe hooked up via WiFi and a laptop to a bogus card

When someone paid £5 for his cake, card was connected to fake card

With this fake card, somewhere else someone purchased a book for £50 (using the credentials of the valid card) Kerberos

The basic protocol isn’t too hard to understand: 1 A client requests a ticket for a Ticket Granting Service (TGS) from a Kerberos server

2 The ticket is sent to the client (after authentication)

3 To use a particular server, client requests a ticket for a particular server from TGS

4 If user is allowed to access server, TGS issues a ticket

5 Client presents this ticket to the server

(Diagram: the client talks to Kerberos in steps 1 and 2, to the TGS in steps 3 and 4, and presents the ticket to the server in step 5) Kerberos (2)

We’ll look at some details in just a moment

You might ask yourself what the Ticket Granting Service is for: Kerberos could check if a user is allowed to access a service and issue a ticket for the server directly

Functionality is split between Kerberos and TGS to share workload: Kerberos does authentication

TGS does access control for services

Kerberos is based on symmetric cryptography: Kerberos shares a different secret key with every entity on the network Steps 1 and 2

Client sends their name to Kerberos (+ TGS server to use)

Kerberos looks up client in database, if found Kerberos generates a session key K1 to be used between client and TGS

Kerberos sends back two messages:

The session key K1, encrypted with the client’s secret key; and a Ticket-Granting Ticket (TGT) including the client name, client network address, ticket validity period, and K1, encrypted using the secret key of TGS

The client can decrypt the first message (and get K1) Step 3

The client sends two messages to TGS The TGT received from Kerberos (+ requested service)

An authenticator including the client name and a timestamp, encrypted using K1

The authenticator serves two purposes:

It shows TGS that the client knows K1 The current timestamp prevents an eavesdropper replaying the TGT and authenticator later

TGS decrypts the TGT ticket and uses the extracted session key K1 to decrypt the authenticator Step 4

TGS can now check validity of client (name, network address) and validity of the timestamp (time period)

It also checks that client is allowed to access requested service

If everything turns out to be correct, TGS generates another session key K2 to be used between client and server

TGS sends two messages to the client:

The session key K2, encrypted with K1; and a Client-Server Ticket (CST) including the client name, client network address, ticket validity period, and K2, encrypted using the secret key of the server Step 4

Step 4 is very similar to Step 2, TGS is now playing the role that Kerberos played before

After receiving the CST, the client can now retrieve the key used in communication with the server

and is ready to contact the requested service Step 5

The client sends two messages to the service The CST received from TGS (+ requested service)

An authenticator including the client name and a timestamp, encrypted using K2

The server decrypts the CST ticket and uses the extracted session key K2 to decrypt the authenticator After checking that everything is correct, it can provide the service to the client

If client wants to have an authentication from the server:

Server encrypts (timestamp + 1) using K2 and sends it back to client

This shows that server knew its own secret key (used for decrypting CST) Weaknesses

There are a few weaknesses: Single point of failure: While the Kerberos server is down, no-one can access anything

Kerberos server knows all the secret keys, if it is compromised all of these keys are compromised

The clocks on the different machines need to be synchronized (and accurate): If the timestamp provided by the client does not lie in the ticket validity period, access will be denied

If a host can be fooled about the correct time, old authenticators could be replayed TLS

TLS (Transport Layer Security) is a protocol for secure client-server communication

Web browsers contain the client TLS (or SSL) software

If a server has been configured to run TLS, then a client can use it by specifying https: in a URL (instead of http:) A secure connection is shown by a closed padlock

Servers can request that clients use TLS (even if client has not used https:, e.g. for logins, payment details) Initiating Communication

TLS can perform different activities: encryption, server authentication, client authentication

When a client requests TLS, client and server go through a handshake protocol to determine which of the above activities will be performed

and which encryption algorithm will be used

A public/private key pair will be needed on the server side to go ahead

A server’s public key can be found in its certificate (also known as a server ID) Encryption Only

If client only wants a secure communication line using encryption, the following protocol is used 1 Client sends request for server ID to server

2 Server sends back server ID

3 Client generates (random) session key and encrypts it using public key found in certificate

4 Client sends session key to server

5 Server decrypts message to obtain session key

Client and server now have a secret session key for encrypted communication Server/Client Authentication

There’s one additional step that the client does (before sending a session key to the server): 3 (b) Client attempts to verify the server ID certificate with installed certificates

If this cannot be done, then the browser will warn its user

In the case of client authentication, client needs to send its certificate and the server will perform an additional step: 4 (b) Client sends its certificate

5 (b) Server attempts to verify the client’s certificate with installed certificates Summary

Again, this is only scratching the surface of cryptographic protocols

There are lots of other cryptographic protocols for doing a range of things: Threshold schemes for sharing secrets: reconstructing messages with any m out of n pieces

Subliminal channels: covert ways of communicating secretly

Mental Poker: distributing keys anonymously

The important issue is: cryptosystems have to be applied in a certain way to guarantee security Chapter 10 Network Attack Motivation

In today’s IT infrastructures, more and more computers and devices are connected to a network

This has major consequences for system security It becomes easier to infiltrate systems (the whole system is as secure as its weakest link)

Once infiltrated, the damage that can be inflicted increases manifold

We will look at some approaches used for attacking and defending networks Intruders

A major threat to security is the intruder, often referred to as a hacker. Intruders are often categorised into 3 classes Masquerader An unauthorised user who exploits a legitimate user’s account.

Misfeasor A legitimate user who accesses unauthorised data, programs or resources.

A legitimate user who is authorised but misuses their privileges.

Clandestine User A user who seizes supervisory control of the system and uses it to evade auditing and access control or to suppress audit collection. Intrusion Objectives

Goal: Gain access to a system

Extend range of privileges in a system

This requires acquiring protected information Typically this means finding out passwords Attacks on Local Networks

For the first scenario, let’s assume that an intruder has managed to gain control of one of your PCs

They can then install packet sniffer software in an attempt to get passwords

masquerade as a machine where a target user is already logged in

intercept filehandles in shared systems, e.g. NFS (Network File System) Attacks Using Internet Protocols

Attacks need not be about gaining access. There are different ways of taking down hosts by sending a large number of malformed packets: SYN flooding, Smurfing

Many of these loopholes have been fixed by now, but there’s still the brute-force distributed denial-of-service (DDoS) attack In this kind of attack, botnets (a large group of subverted machines) flood a site with a huge number of packets Further Attacks

Other forms of attack include DNS Security and Pharming

Spam

Malicious Code (e.g. viruses) Defense Against Network Attacks - Overview

There are several things you can do to make your network safer: Encryption: use protocols that protect specific parts of the network and/or the transmitted data

Management: keeping system up-to-date, configuring it in a way to minimize attack surface

Filtering: use of firewalls to stop bad things from entering your network

Intrusion detection: monitor network for signs of malicious behavior Management

Install newest patches as soon as possible

Make sure that default accounts and passwords are switched off

Shut down unnecessary services on machines

Train staff not to do foolish things Password Management

We have already seen in Chapter 8 a one way hash used to store passwords.

Better than storing passwords in clear text

Can be made more secure by Salting

password is combined with a random value - the salt

Prevents duplicate passwords being visible in password file
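A minimal sketch of salted password storage with Python's hashlib, using PBKDF2 (which also repeats the hashing many times, as described next); all values are illustrative:

    import hashlib, os

    def hash_password(password):
        salt = os.urandom(16)                              # a fresh random salt per password
        digest = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100_000)
        return salt, digest                                # store both; the password itself is not kept

    def verify_password(password, salt, digest):
        return hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100_000) == digest

    salt, digest = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, digest))   # True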

Process can be repeated -

PBKDF2 (Password-Based Key Derivation Function 2) Chapter 11 Intrusion Detection Introduction

Monitor

The intrusion detector classifies the input data into 2 classes: intrusion / not intrusion. Classification

There are 4 possible outcomes for the intrusion detector 1 An intrusion has occurred and it is detected (raises an alarm)

2 An intrusion has occurred but the detector remains silent

3 An intrusion has not occurred but the detector raises an alarm

4 An intrusion has not occurred and the detector does not raise an alarm

Terminology- Positive and Negative classes The positive class is the class that you have the most interest in.

We care most about the intrusions, so the intrusions are the positive class! Classification

There are 4 possible outcomes for the intrusion detector 1 An intrusion has occurred and it is detected (sounds an alarm) True Positive (a good outcome)

2 An intrusion has occurred but the detector remains silent False Negative (a bad outcome)

3 An intrusion has not occurred but the detector raises an alarm False Positive (a bad outcome)

4 An intrusion has not occurred and the detector does not raise an alarm True Negative (a good outcome)

Ideally we would like an intrusion detector that generates only True Positives and True Negatives. The Confusion Matrix
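A minimal sketch of tallying these four outcomes for a detector's output (the labels below are made up); the confusion matrix on the next slide arranges exactly these counts:

    # 1 = intrusion (positive class), 0 = normal traffic
    actual    = [1, 0, 0, 1, 0, 0, 0, 1]   # what really happened
    predicted = [1, 0, 1, 0, 0, 0, 0, 1]   # whether the detector raised an alarm

    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))   # detected intrusions
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))   # missed intrusions
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))   # false alarms
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))   # correctly ignored traffic
    print(tp, fn, fp, tn)   # 2 1 1 4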

The confusion matrix (rows: actual class, columns: predicted class):

                     Predicted Positive   Predicted Negative
    Actual Positive         TP                   FN
    Actual Negative         FP                   TN

Example - Credit Card Fraud The Problem of Uneven Class Size

The Base Rate Fallacy Intrusion Detection Chapter 12 Malicious Software and Firewalls Malicious Code

Malicious code, or malware, has been around for some time (started on time-sharing machines in the 1960s) There are several different kinds of malware: Trap Door

A secret entry point into a program A trap door is code that is triggered by a special sequence of input

being run from a certain user ID

an unlikely sequence of events

Sometimes trap doors are added to allow programmers to debug and test.

Become a threat when used to gain unauthorized access

They are difficult to counter with operating system controls; security measures must focus on the development and update of the code. Logic Bomb

Code embedded in a legitimate program that is triggered when certain conditions are met Triggers include Particular date

Presence or absence of a file

Particular user running a particular program

The code is said to ‘explode’; the damage it causes can include: Deleting data or files

Causing the machine to halt Trojan Horse

Program that has hidden code which, when invoked, performs an unwanted or harmful action

A common use of a Trojan horse is data destruction, or accomplishing a function indirectly: a user wishing to gain unauthorized access to a file could use a Trojan horse.

The program contains code that, when executed, changes file permissions

The program is designed to be a useful utility so users are more likely to run it. Zombie

Program that takes over another networked computer.

Used to launch an attack which is subsequently difficult to trace back to the creator of the zombie.

In a Denial-of-Service attack, many computers infected in this way are used to overwhelm a target website Virus

A program that inserts itself into one or more files and then performs some (possibly null) action Typical virus goes through 4 phases Dormant Phase - Virus is idle and waiting for an activation event, e.g. a date

Propagation Phase - Virus copies itself into other programs or system areas

Triggering Phase - Virus is activated to perform an action. The trigger can be any number of events, such as a count of the number of times it has replicated

Execution Phase - The action is performed. This can range from something harmless, such as displaying a message, to the destruction of data and programs How Viruses Work

A virus typically has two parts: a replication mechanism and a payload

The most common way for a virus to replicate is to append itself to an executable file and patch itself in: The execution path jumps to the virus code and then back to the original program


Viruses used to be restricted to executable files in operating systems, but have spread to interpreted environments (e.g. Word macros) How Viruses Work (2)

The second component of a virus is the payload

This will usually be activated by a trigger, such as a date, and then do a number of bad things: Make selective (or random) changes to the machine’s protection state

Make changes to user data

Lock the network

Install spyware or adware

Install a rootkit Types of Virus

Parasitic - Most common type. It attaches itself to executable files and when the program is executed, the virus replicates by finding other executables to infect.

Memory Resident - Remains in main memory and infects every program that executes

Boot Sector - Spread whenever a system is booted from a disk containing the virus Types of Virus (2)

Stealth - A type of virus that attempts to hide itself from detection.

The virus may intercept I/O routines, so when these routines are called the virus presents back the original uninfected program details

When a virus appends itself to a program, the file gets longer. The virus can intercept the calls that report file size and return the original length, so the change stays hidden. Types of Virus (3)

Polymorphic - Creates copies of itself that are functionally equivalent but are different programs.

This can be achieved by randomly inserting superfluous instructions and changing the order of instructions. Encryption can also be used to change the program: a random key is generated by a part of the virus called the mutation engine and is used to encrypt the remainder of the virus.

The key is stored with the virus and the mutation engine is itself altered.

When the virus replicates, a different key is randomly chosen Malicious Code

Malicious code, or malware, has been around for some time (started on time-sharing machines in the 1960s)

There are several different kinds of malware: Trojan: a program that has a malicious side effect when run by a user For example, a game that tries to capture passwords or other sensitive information

Worm: something that replicates One of the first worms was written at Xerox PARC: a program looking for idle processors and assigning them tasks Malicious Code (2)

Different kinds (cont’d): Virus: basically a worm which replicates by attaching itself to files An early example was the IBM BITNET Christma virus

Rootkits: a piece of software that (once installed) seizes control of a system, tries to obscure the fact that it is installed An example is Vanquish, a rootkit that hides files, folders, registry entries and logs passwords. Countermeasures

Two common approaches for antivirus software are scanners and checksummers

Scanners search executables for certain byte patterns known to belong to a virus A counter-countermeasure is polymorphism: the virus changes form every time it is copied

This can be done by encrypting the virus with a simple cipher and using a different key each time; the decryption algorithm and key are also attached Countermeasures (2)
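As an illustration of how a scanner works, here is a minimal sketch; the signature names and byte patterns are made up, and real products use large signature databases and much more sophisticated matching. Polymorphism defeats this naive approach because the byte pattern is different in every copy.

    from pathlib import Path

    # Hypothetical signatures: byte patterns assumed to be characteristic of particular malware.
    SIGNATURES = {
        "Example.TestVirus": bytes.fromhex("deadbeefcafebabe"),
        "Example.Dropper":   b"\x4d\x5a\x90\x00EVIL",
    }

    def scan_file(path: Path) -> list:
        """Return the names of all signatures whose byte pattern occurs in the file."""
        data = path.read_bytes()
        return [name for name, pattern in SIGNATURES.items() if pattern in data]

    for exe in Path(".").glob("*.exe"):      # scan executables in the current directory
        hits = scan_file(exe)
        if hits:
            print(f"{exe}: possible infection {hits}")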

Checksummers keep a list of all executables in the system together with checksums (or fingerprints) of the original versions A counter-countermeasure is a virus that watches for the operating system calls used by checksummers and then goes into hiding
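A checksummer can be sketched as follows (paths and file names are placeholders): a baseline of fingerprints is recorded once on a clean system, and later runs report any executable whose hash has changed.

    import hashlib, json
    from pathlib import Path

    def fingerprint(path: Path) -> str:
        """SHA-256 hash of the file's contents."""
        return hashlib.sha256(path.read_bytes()).hexdigest()

    def build_baseline(root: Path, baseline_file: Path) -> None:
        baseline = {str(p): fingerprint(p) for p in root.rglob("*.exe")}
        baseline_file.write_text(json.dumps(baseline, indent=2))

    def check_against_baseline(baseline_file: Path) -> None:
        baseline = json.loads(baseline_file.read_text())
        for name, old_hash in baseline.items():
            p = Path(name)
            if not p.exists() or fingerprint(p) != old_hash:
                print(f"WARNING: {name} is missing or has been modified")

    # build_baseline(Path("C:/Programs"), Path("baseline.json"))    # run once on a clean system
    # check_against_baseline(Path("baseline.json"))                 # run periodically afterwards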

Unfortunately, antivirus software seems to be becoming less effective Programmers of viruses and rootkits are becoming more sophisticated and test their code against antivirus software Filtering

Most widely sold solution is a firewall

This is a machine standing between the local network and the internet filtering out traffic that might be harmful

Filtering can be done on different levels: Packet Filtering: inspects packet addresses and port numbers (see the sketch after this list)

Circuit Gateways: reassembles and examines packets in each TCP session

Application Relays: looks at data on an application level and e.g. strips out executables Filtering (2)
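As a hedged sketch of what packet filtering means logically, the toy rule set below matches packets on source address and destination port with a default-deny policy; the addresses, ports, and rule syntax are invented, and real firewalls (e.g. iptables or pf) express the same idea in their own configuration languages.

    from dataclasses import dataclass
    from ipaddress import ip_address, ip_network

    @dataclass
    class Rule:
        action: str     # "allow" or "deny"
        src: str        # permitted source network in CIDR notation
        dst_port: int   # destination port the rule applies to

    # Hypothetical policy: web traffic from anywhere, SSH only from the internal network.
    RULES = [
        Rule("allow", "0.0.0.0/0", 80),
        Rule("allow", "0.0.0.0/0", 443),
        Rule("allow", "192.168.0.0/16", 22),
    ]

    def filter_packet(src_ip: str, dst_port: int) -> str:
        """Return the action of the first matching rule, or deny by default."""
        for rule in RULES:
            if ip_address(src_ip) in ip_network(rule.src) and rule.dst_port == dst_port:
                return rule.action
        return "deny"

    print(filter_packet("203.0.113.9", 443))    # allow: web traffic from outside
    print(filter_packet("203.0.113.9", 22))     # deny:  SSH from outside
    print(filter_packet("192.168.1.20", 22))    # allow: SSH from the internal network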

Firewalls mostly try to keep bad things out, but can also be used to monitor (and restrict) outgoing traffic

You can have more than one firewall in a system; the first firewall creates a demilitarized zone (DMZ)

The DMZ is then connected to internal networks via further filters Chapter 13 Usability, Assurance and Evaluation Motivation

Why do people make all these bad security-related decisions?

Usually your average user isn’t that stupid

However, technology doesn’t always help in making the right decision Often it is as easy (or even easier) to make a bad decision Human Behavior

Cognitive psychology deals with how humans think

There are some well-known results: It is easier to memorize things that are repeated frequently

It is easier to store things in context

However, this is not always considered in the right way by developers

We’ll have a look at the three most common kinds of mistakes made by humans: lapses of skill, following the wrong rules, and cognitive errors Practice Makes Perfect (Not)

People become highly skilled by practicing actions

Downside: inattention can lead to a practiced action instead of an intended one

Many users click on ’OK’ in pop-up boxes (as very often that is the only way to continue)

Users are trained in doing the wrong thing Example

Quickly, can you spot the security message? And Now? Following rules

People can make mistakes by following a wrong rule

Can happen when they are distracted by interruptions, (perceptual) confusion, information overload

People then follow the strongest or most general rule they know

Example (Phishing): Looking at a secure connection (https) and the bank’s name, instead of the domain: https://www.citibank.secure.com Cognitive Errors

In plain English: people just don’t understand the problem

When Microsoft introduced an anti-phishing toolbar (see below), this was defeated by a picture-in-picture attack

Users are so accustomed to overlapping windows that many don’t realize what is going on (see next slide) Picture-in-Picture Attack

Fake window inside the real browser window: What To Do?

Trying to teach all users the technical details of getting things right is a losing battle

In confusing situations, when the rational side runs out, the emotional side tends to kick in

It is much better to have sensible and secure defaults and easy-to-use interfaces, e.g. ’Our bank will never, ever send you e-mail. Any e-mail seeming to come from us is fraudulent’

Restricting access to a public share in Vista to a group of users should not require nine steps and six different user interfaces Authentication

Next we are going to have a closer look at why user authentication is still so difficult to do properly

There are still big problems in terms of usability in this area

We’ll have a closer look at passwords. . .

. . . and also look at some biometric alternatives Passwords

Words from a usability researcher: “It’s hard to think of a worse authentication mechanism than passwords.” (Angela Sasse)

The reasons lie chiefly with the way human memory works: People are bad at remembering infrequently-used, frequently-changed, or many similar items

People can’t forget “on demand”

Recall is harder than recognition

Non-meaningful words are harder to remember

There are costs attached to this: BT employs about a hundred staff in its password-reset center Alternatives

Passwords are not the only way of authenticating users

Basically, there are three options: “Something you have”: a physical device, e.g. a key

“Something you know”: present secret knowledge, e.g. a password

“Something you are”: physical appearance, e.g. an iris scan

The second option is usually the most inexpensive to implement

Often the other variants are used in combination with passwords

So passwords are probably going to be around for a while Problems with Passwords

There are three broad concerns with passwords: Will a user enter the password correctly with a high enough probability?

Will a user remember the password (or will they write it down or select easy-to-guess passwords)?

Will the disclosure of a password (whether accidentally or on purpose) break the system security? Reliable Entry of Passwords

Entry of passwords might seem trivial at first glance

However, if customers experience difficulties in entering codes, a support desk might get swamped with calls

Whether a password is too complex or too long may depend on the application: In a familiar setting, you might be able to get away with some complexity: Entering 20-digit numbers from a print-out into a pre-paid electricity meter, although some customers are illiterate (South Africa)

In very stressful situations, you have to make things simpler: 12 decimal digits used for firing codes of nuclear weapons (U.S.) Complexity of Passwords

Password ≠ password: depending on the application, stronger passwords have to be selected

If you can limit the number of guesses, then it’s o.k. to have a simpler password Example are ATMs that block the account after three wrong guesses

Sometimes you cannot limit the guessing, examples are: Encrypted files, attacker can guess as many times as they want

Possible denial-of-service attack: attackers can lock down a system by flooding it with login attempts Remembering Passwords

Twelve to twenty digits may be fine when copied from a telegram or a meter ticket

If people have to memorize passwords, things look different: They use easy-to-guess passwords

and/or write them down

The following sums it up quite nicely: “Choose a password you can’t remember and don’t write it down.” Remembering Passwords (2)

Problems with memorizing passwords comes down to four issues: Naive password choice

User ability and training

Design errors

Operational failures Naive Password Choice

In summary, studies looking at people’s choices of passwords are depressing

Some of the biggest glitches commonly are: Using spouses’ names

Single letters

Just hitting return (no password at all)

Systems forcing people to use more complex passwords haven’t worked that well: Dictionary attacks (with some modifications) are still successful

As are lists consisting of the 20 most common female names followed by a single digit Naive Password Choice (2)

Making users change passwords regularly also backfired Users rapidly cycled through exhausted passwords to return to their favorite one

Not allowing password changes until some time had expired is no solution People need an administrator to change compromised passwords User Ability and Training

Some of the problems can be solved by training users Not all of them, though, there’s always a number of people who don’t do what they’re told

You can operate a clean-desk policy (where no passwords are left on unoccupied desks)

Put very important passwords in an envelope in a safe that is guarded User Ability and Training (2)

From an interesting study by Ross Anderson involving three groups: Red group: usual advice (password at least six characters, one non-letter), i.e. naive passwords

Green group: think of a passphrase and select letters (e.g. “It’s 12 noon and I am hungry” becomes I’S12&IAH; see the sketch after this list)

Yellow group: select eight characters (alpha or numeric) randomly from a table, write it down, destroy after memorizing

Result: passphrases offer the best of both worlds They are almost as easy to remember as naive passwords and almost as hard to guess as random passwords Design Errors
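A minimal sketch of the green group's idea: deriving a password from the initial characters of a memorable phrase. The derivation rule below (first character of every word, upper-cased) is a simplification chosen for illustration; the study's participants constructed their passwords by hand and could keep punctuation, as in the I'S12&IAH example above.

    def passphrase_to_password(phrase: str) -> str:
        """Take the first character of each word of the passphrase."""
        return "".join(word[0] for word in phrase.split()).upper()

    print(passphrase_to_password("It's 12 noon and I am hungry"))   # I1NAIAH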

Sometimes easy-to-use password schemes are created by people who have no idea how to do this properly

Some examples: Asking for mother’s maiden name (this information is not very secret and cannot be changed)

A mnemonic system for bank PINs: the card carries a 4 × 10 grid of letters with columns headed by the digits 1-0, and the customer encodes the PIN by writing the letters of a memorable word in the columns given by the PIN digits:

1 2 3 4 5 6 7 8 9 0
  b
  l
        u
          e

The above example encodes 2256; the rest of the cells are filled up with random letters Design Errors (2)

Problem with the scheme for PINs: A 4 × 10 matrix with random letters only yields about two dozen words

The odds for a card thief have just gone from 1 in 3333 to 1 in 8 (three guesses out of 10,000 possible PINs versus three guesses out of roughly two dozen candidate words) Operational Issues

It’s not just the customer end where things go wrong

Sometimes organizations are careless with their master passwords For example, a note stuck on a terminal at an exhibition

Not resetting default accounts and passwords Password Summary

Unfortunately, there’s no silver bullet in sight

You should assume that some accounts will be compromised Work out how to detect this and limit the damage

We’ll now look at biometric alternatives Biometrics

Biometrics is about identifying persons by measuring some aspect of individual anatomy or physiology (such as hand geometry or fingerprints)

some deeply ingrained skill or behavior (such as a handwritten signature)

some combination of the two (such as your voice)

Actually, biometric identification has been around for a while (see “Shibboleth”) Determining the Quality

Like alarm systems, most biometric systems have a trade-off between the rates for false accepts, i.e. a fraudster is not detected

false rejects, i.e. an authorized person is not correctly identified

These rates are sometimes also called fraud and insult rates or just type 1 and type 2 errors

Usually systems can be tuned to favor one over the other

The equal error rate is when the system is tuned so that the probability for a false accept and a false reject are equal Handwritten Signatures
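As a hedged sketch of this trade-off, the code below generates synthetic matching scores for genuine users and impostors, sweeps the acceptance threshold, and reports the point where the false accept and false reject rates are (approximately) equal; all distributions and thresholds are invented for illustration.

    import random

    random.seed(0)
    # Synthetic matching scores: genuine users tend to score high, impostors low.
    genuine   = [random.gauss(0.7, 0.1) for _ in range(10_000)]
    impostors = [random.gauss(0.4, 0.1) for _ in range(10_000)]

    def rates(threshold: float):
        far = sum(s >= threshold for s in impostors) / len(impostors)   # false accept (fraud) rate
        frr = sum(s <  threshold for s in genuine)   / len(genuine)     # false reject (insult) rate
        return far, frr

    best_gap, best_t, eer = float("inf"), None, None
    for i in range(101):                       # sweep thresholds 0.00, 0.01, ..., 1.00
        t = i / 100
        far, frr = rates(t)
        if abs(far - frr) < best_gap:
            best_gap, best_t, eer = abs(far - frr), t, (far + frr) / 2

    print(f"Equal error rate is roughly {eer:.1%} at threshold {best_t:.2f}")

Raising the threshold makes the system stricter (fewer false accepts, more false rejects); lowering it does the opposite.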

Handwritten signatures have been used in classical China (although seals had more status)

In Europe it was the other way around: seals were used in medieval times and have since been replaced by signatures

Handwritten signatures are a rather weak authentication mechanism Works mostly due to the legal system around it

Some numbers on an experiment with professional and untrained examiners: Experts misattributed 6.5% of the documents

Untrained people got it wrong 38.8% of the time Handwritten Signatures (2)

Better results can be achieved with signature tablets

Measures the dynamics as well (velocity of the hand, pen lifted off the paper, etc.)

Still, the equal error rate is at best 1% (several percent for pure optical comparison) Face Recognition

Recognizing people by their facial features is the oldest identification mechanism Goes back to early primate ancestors

This is one area where humans are still much better than machines

The concept of photo IDs relies on this

However, being able to recognize friends does not mean that we’re good at identifying strangers via a photograph Face Recognition (2)

University of Westminster conducted an experiment with a supermarket chain and a bank

44 students were issued four credit cards each with different pictures “good, good one”: genuine and recent

“bad, good one”: genuine, but a bit old

“good, bad one”: picture of a different, similar-looking person

“bad, bad one”: apart from sex and race, completely random Face Recognition (3)

Experiment was done under optimum conditions (after normal business hours, experienced cashiers who were aware of the experiment)

Each student made several trips past the checkout using different cards

Almost none of the cashiers were able to distinguish between the “bad, good” and “good, bad” photographs A few could not even tell the difference between “good, good” and “bad, bad” CCTV

Nonetheless, face recognition is a very hyped technology in today’s world

Trying to pick out faces from an image of a crowd is not trivial in terms of computational complexity

Error rates usually go up to 20% (or even higher)

Most systems are about automated storage of facial images that are then compared by humans Alphonse Bertillon devised a system to identify people by their bodily measurements for the police in Paris (1882)

Eventually fell out of favor, as police switched to fingerprints Bertillonage Made a comeback in the form of hand-geometry readers Used for authentication in nuclear premises (1970s)

‘Fast track’ for frequent flyers (INSPASS)

Has a single-attempt equal error rate of about 1% Fingerprints

The idea of using fingerprints to identify people or as an alternative to signatures is quite old Centuries ago it was already employed in China, India, and Japan

Really took off after the introduction of a robust classification scheme (loops, whorls, arches, and tents) Fingerprints (2)

Main use of fingerprints is in identification systems and crime scene forensics

US immigration uses it in its US-VISIT program, scanning the two index fingers Uses a blacklist of about 6 million “bad guys” and had a false match rate of 0.31% in 2004

A study in 2001 found that common products had a false accept rate of 8% and a false reject rate of 1%

By now, better ones have an equal error rate at or slightly below 1% per finger

Then there are obvious problems with amputees, birth defects, and people born without fingerprints Fingerprints (3)

In forensics fingerprints have to match in 16 points to be counted as incontrovertible evidence in the UK

Optimistic estimations (by the police) say that this will lead to only one false match in ten billion This is fine as long as fingerprints are only compared to the prints of a few hundred known criminals

Once thousands of prints are compared with an online database of millions, things change: ten thousand queries against a database of ten million prints means 10^11 comparisons, so even a one-in-ten-billion false match rate is expected to produce around ten false matches Iris Codes

Compared to the previous approaches iris codes haven’t been around for a long time

John Daugman developed signal processing techniques that extract information from an iris image into a 256-byte iris code Iris Codes (2)
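Iris codes are typically compared by their fractional Hamming distance, i.e. the fraction of bits in which two codes differ; the sketch below uses made-up 256-byte codes and a made-up decision threshold, and omits details of Daugman's actual system such as masking bits and rotation tolerance.

    import os

    def hamming_distance(code_a: bytes, code_b: bytes) -> float:
        """Fraction of differing bits between two equal-length iris codes."""
        differing = sum(bin(a ^ b).count("1") for a, b in zip(code_a, code_b))
        return differing / (len(code_a) * 8)

    enrolled  = os.urandom(256)                               # stand-in for a stored 256-byte iris code
    same_eye  = bytes(b ^ (1 if i % 40 == 0 else 0)           # a slightly noisy re-reading of the same eye
                      for i, b in enumerate(enrolled))
    other_eye = os.urandom(256)                               # an unrelated iris

    THRESHOLD = 0.32                                          # illustrative decision threshold
    for label, probe in [("same eye", same_eye), ("other eye", other_eye)]:
        d = hamming_distance(enrolled, probe)
        print(f"{label}: distance {d:.3f} -> {'accept' if d < THRESHOLD else 'reject'}")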

So far (as is known), every human iris is measurably unique

There’s only limited genetic influence, i.e. the patterns are different even for identical twins

Under lab conditions the equal error rate has been shown to be better than one in a million (0.0001%), so it’s a strong contender

One potential problem is that it depends on computer processing (no human can do it reliably) Voice Recognition

Voice recognition, also known as speaker recognition, tries to identify a speaker from a short utterance

As with fingerprints, this technology is used for identification and forensics

Currently, there are systems that can achieve an equal error rate of about 1%

However, this will probably become worse in the near future with advances made in the field of voice morphing Potential Problems

Biometrics (like other physical protection mechanisms) are susceptible to environmental conditions such as noise, dirt, vibration, lighting conditions

What do you do with people who are disabled or cannot be identified reliably?

How do you ward off attacks using photographs, contact lenses, moulds of fingerprints?

Error rates have to be tuned carefully (annoying too many people will not help) Summary

There’s still much to be done in terms of making security mechanisms more user-friendly

One possible approach is biometric identification

However, biometrics are not a silver bullet They have been shown to work much better in attended operation (i.e. when the process is overseen by people)

Most of the results are achieved by deterring criminals (rather than actually identifying them) Assurance and Evaluation Introduction

Basically, one could define these terms in the following way: Assurance: will the system work?

Evaluation: can you convince other people of this?

This is still quite challenging and there are lots of open questions Assurance

Assurance is a way of justifying that the implemented mechanisms satisfy the requirements set out in a policy.

(Diagram: assurance links the security policy to the mechanisms that implement it)

A working definition of assurance could be: an estimate of the likelihood that a system will fail in a particular way Issues Concerning Assurance

Assurance comes down to the question: Have capable, motivated people tried to beat up on the system enough?

Obvious problems are: How do you define ‘enough’? When do you stop searching for more flaws?

How do you define ‘system’? To what extent do you consider the organizational issues and procedures?

It is still more of an art than a science Issues Concerning Assurance (2)

Very often people: protect the wrong thing

protect the right thing in the wrong way

There are also problems with usability A system might be fine when operated by alert, experienced professionals

However, ordinary users might find it too tricky Evaluation

A working definition of evaluation is: the process of assembling evidence that a system meets (or fails to meet) an assurance target

This ranges from convincing your boss that you’ve done your job to

reassuring principals who rely on the system

Tensions can arise when the party who implements the protection and the party who relies on it are different Different parties have different incentives (e.g. managers buying from big name suppliers to minimize likelihood of being fired when things go wrong) Doing the Evaluation

How do people actually evaluate systems in terms of security?

One standard is set down in the Trusted Computer System Evaluation Criteria (TCSEC) This is also called the Orange Book Orange Book

The Orange Book was issued by the Department of Defense (latest version in 1985)

It sets basic requirements for assessing the security of systems

Has been replaced by an international standard, the Common Criteria Common Criteria

The official name is Common Criteria for Information Technology Security Evaluation (or ISO/IEC 15408)

This standard has been adopted internationally (more and more countries signed up for it)

Evaluations at all (but the highest) levels are done by a commercial licensed evaluation facility (CLEF)

Evaluations are (usually) recognized in all participating countries Common Criteria (2)

The Common Criteria is divided up into different parts:

Target of Evaluation (TOE): the system that is under consideration

Protection Profile (PP): a document identifying security requirements for a class of security devices for a particular purpose Examples are profiles for smart cards or firewalls Common Criteria (3)

Security Target (ST): identifies the security properties of a system in regard to one or more PPs How well does the system meet the requirements set in a PP?

Security Functional Requirements (SFRs): specify individual security functions which may (or may not) be provided by a product Meeting STs is measured by looking at SFRs Common Criteria (4)

The evaluation process also tries to consider the development process:

Security Assurance Requirements (SARs): describes the measures taken during development to assure quality E.g. storing code in a revision control system, having a testing environment in place

Evaluation Assurance Level (EAL): a rating describing the depth and rigor of an evaluation There are seven levels ranging from EAL 1 (the most basic) to EAL 7 (the most thorough and expensive) Summary

Although there are some standards in place, there is still a lot to do

There has been some criticism involving the Common Criteria: They are pretty much limited to technical aspects

CLEFs have been involved in too many political games

One approach that seems to work (reasonably well) is hostile reviewing Give reviewers an incentive to break the system, e.g. by paying for every flaw that was found