Information Security
David Weston Birkbeck, University of London Autumn Term 2012 Chapter 1 - Motivation Introduction
Protecting information against malicious or accidental access plays an important role in information-based economies/societies
Just to name a few application areas: Banking: online banking, PIN protocols, digital cash
Economy: mobile phones, DVD players, Pay-per-View TV, computer games
Military: IFF (Identification, friend or foe), secure communication channels, weapon system codes
It’s surprising how much still goes wrong in these areas Typical Cases of Security Lapses
Loss of confidential data: 2007: HMRC loses (unencrypted) disks containing personal details of 25 million people
2008: HSBC loses disks containing details of 180,000 policy holders (fined for a total of £3.2 million)
2007: hard disk containing records of 3 million candidates for driver’s licenses goes missing
not just happening in the UK: Sunrise (Swiss ISP) exposes account names and passwords of users in 2000 Typical Cases of Security Lapses (2)
Credit card fraud is a recurring theme, ranges from spying out PINs at ATMs to
organized stealing and trading of credit card numbers
Recent high profile case: In the U.S. Albert Gonzalez and other hackers infiltrated Heartland and Hannaford (two firms processing payments)
They stole millions of credit card numbers between 2006 and 2008
This has cost Heartland $12.6 million so far Typical Cases of Security Lapses (3)
Hacking into other systems: 2008: in the U.S. 18-year old student hacks into high school computer, changes grades
2005: UCSB (University of California Santa Barbara) student hacks into eGrades system and changes grades
Web site defacement also seems to happen quite regularly (with targets including the U.N., Microsoft, and Google) Typical Cases of Security Lapses (4)
Denial-of-Service attacks: 2009: Twitter is hit by a denial-of-service attack and brought to a standstill
Natural disasters (the cause need not be malicious): Data loss through fire, storm, flooding
2005: Hurricane Katrina takes out two data centers of an aerospace company in the U.S.; unfortunately, they backed each other up Objective
This list could go on and on . . .
However, even this small number of examples already shows the importance of this subject
Objective of this module is to teach basic principles of information security, enabling you to identify and assess security risks
take appropriate countermeasures Chapter 2 - Introduction Basic Definitions
How do we define information security?
Information security is about ensuring the following: Confidentiality
Integrity
Availability Confidentiality
Confidentiality is (the principle of) restricting access to information
Only people who are authorized should be able to access information
Example for loss of confidentiality: losing disks with sensitive data Integrity
Integrity is about preventing improper or unauthorized change of data
Only trustworthy data is of value
Example for loss of integrity: student hacking into university computer and changing grades Availability
Availability is about making sure that information is accessible when needed (by authorized persons)
Usually this implies keeping systems that store the information (and restrict access) operational
Example for loss of availability: system taken out by a disaster Additional Aspects
Sometimes the following aspects are added as separate issues: Authentication: confirming the identity of an entity
Non-repudiation: an entity is not able to refute an earlier action Threats, Attacks, and Vulnerabilities
A threat is a potential danger to an (information) asset, i.e. a violation of security might occur
An attack is an action that actually leads to a violation of security
A vulnerability is a weakness that makes an attack possible Threats, Attacks, and Vulnerabilities (2)
An example: a server storing source code for a new product of a software development company Threat: an unauthorized person (e.g. from a competitor) could access the source code
Attack: someone (unauthorized) actually logs into the system and downloads (parts of) the source code
Vulnerability: an unremoved test account with a default password Security Controls
Security controls are mechanisms to protect information (or a system) against unauthorized access (ensuring confidentiality)
unauthorized modification (ensuring integrity)
destruction/denial-of-service (ensuring availability)
Controls are also called countermeasures or safeguards
General types of controls: Physical
Technical
Administrative Physical Controls
Physical Controls include the following: locks
security guards
badges/swipe cards/scanners
alarms
fire extinguishers
backup power
... Technical Controls
Technical controls include the following: access control software/passwords
antivirus software
backup software/systems
... Administrative Controls
Administrative controls include the following: staff training
clear responsibilities
policies and procedures
contingency plans
... Controls Covered in This Course
In this course we are going to focus on technical controls
administrative controls
Although physical controls should be kept in mind, we don’t have enough time to cover everything
The other two controls are more interesting from the point of view of computer science and management Controls Covered in This Course (2)
Why not just look at the technical issues?
Security is only as strong as the weakest link
Very often, people are the weakest link Overview
In the first part of this module, we’re going to cover the administrative side
Second part looks at the technical side
More fine-grained overview of what’s to come: Administrative Issues
Risk Assessment
Security Policies
Social Engineering
Cryptography
Cryptographic Protocols
Selected Topics Chapter 3 Administrative Issues Organizational Environment
No matter how small the organization, there has to be someone responsible for information security
Usually, this person is called the security officer or SO (or some other acronym)
In small or medium-sized organizations this could be an added responsibility (not a full portfolio)
In large organizations, SO might be supported by a team Role of Security Officer
Defines a corporate security policy
Defines system-specific security policies
Creates project plan for implementing controls
Responsible for user awareness and training programs
Appoints auditors
Optional: obtains formal accreditation Place in Hierarchy of SO
Due to the importance of information security, the SO should report to the highest level of control (board of directors, CEO)
SO may even be a member of the board
SO acts as an intermediary between management and the user base Place in Hierarchy of SO (2)
Example for a large organization: Management Support
In order for SO to be successful, they need support from high-level management
Convincing (top-level) management can be difficult Ignorance of real nature of risks
Fixated on bottom line (security costs money, no clear benefits)
Fear of having to address unknown risks and take responsibility Convincing People
Introducing security measures will always require investments in time, human resources, and money
Not doing so, however, can have dire consequences: HSBC having to pay a fine of £3.2 million
Heartland losing at least $12.6 million
Health services and software companies in the U.S. have been sued for millions or even billions of dollars
Bloghoster JournalSpace had to close shop after losing all content data
First important step: know your exposure to risk Chapter 4 Risk Assessment What is It About?
Risk assessment helps in understanding: What is at risk? (Identifying assets)
How much is at risk? (Identifying values)
Where does the risk come from? (Identifying threats and vulnerabilities)
How can the risk be reduced? (Identifying countermeasures)
Is it cost effective? (Risk can never be completely eliminated) Traditional Method
Single Loss Expectancy (SLE): measures the expected impact (in monetary terms) of a certain threat occurring: SLE = Asset Value (AV) × Exposure Factor (EF)
Asset Value is the total value of the asset under threat
Exposure Factor measures the proportion of the asset that is lost when threat becomes real Range is between 0.0 and 1.0
1.0 meaning the complete asset is lost Traditional Method (2)
Usually this is looked at on an annualized basis
Annualized Rate of Occurrence (ARO) represents the expected frequency of a threat occurring per year
This gives us the Annualized Loss Expectancy (ALE): ALE = SLE × ARO
The tricky part is figuring out all these values (more on this later) Simple Example
We have an asset worth £150,000, so AV = £150,000
If threat occurs, we lose two thirds of it, so EF = 2/3
We expect this threat to occur once every 20 years (on average), so ARO = 0.05
This gives us: ALE = £150,000 × 2/3 × 0.05 = £5000 Example from Banking
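The ALE computation above is a one-liner; a minimal sketch, using the figures from the example (the function name is my own, not from the slides):

```python
# Annualized Loss Expectancy: ALE = SLE * ARO, where SLE = AV * EF.
def ale(asset_value, exposure_factor, aro):
    single_loss_expectancy = asset_value * exposure_factor
    return single_loss_expectancy * aro

# Asset worth £150,000, two thirds lost per occurrence, once every 20 years.
loss = ale(150_000, 2 / 3, 0.05)
print(round(loss))  # → 5000
```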
An ALE analysis for a bank’s computer systems might have hundreds of different entries, e.g.:
Loss Type            Asset Value    ARO     ALE
SWIFT fraud          £30,000,000    0.005   £150,000
ATM fraud (large)    £200,000       0.2     £40,000
ATM fraud (small)    £15,000        0.5     £7,500
Teller takes cash    £2,160         200     £432,000
Here we assume that EF = 1.0, i.e. all the money is lost
Accurate figures are probably available for common losses, much harder to get these numbers for uncommon, high-risk losses Putting Controls in Place
Introducing controls/countermeasures has an effect on the rate of occurrence and/or the exposure factor
So we have ALE_before and ALE_after
We reduce risk by ALE_before − ALE_after
However, controls also have an annual cost AC_control
So, the overall benefit is (ALE_before − ALE_after) − AC_control Risk and Information Assets
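The cost/benefit formula for a control can be sketched the same way; the numbers below are purely illustrative, not from the slides:

```python
# Overall benefit of a control: risk reduction minus the control's annual cost.
def control_benefit(ale_before, ale_after, annual_cost):
    return (ale_before - ale_after) - annual_cost

# Hypothetical case: a control halves a £40,000 ALE and costs £12,000 per year.
print(control_benefit(40_000, 20_000, 12_000))  # → 8000
```

A negative result would mean the control costs more than the risk it removes.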
What does this all mean for risk assessment in the context of information security?
We have to analyze different aspects: Value of the information/system assets
Possible threats to information/systems
Vulnerabilities of information/systems
Cost of countermeasures How to Measure Risk
Risk is a function of threats, vulnerabilities, and assets: How to Measure Risk (2)
With the help of controls, the risk can be reduced: How to Measure Risk (3)
Example of reducing threat: Do a background check on employees
Example of reducing vulnerability: Keep system updated, i.e. install patches
Example of reducing asset (value): Encrypt data: even if it is accessed, it’s worthless Overall Assessment Process Asset Valuation
Basically two different kinds of assets: Tangible assets (hardware, buildings, etc.)
Intangible assets (software, information)
Valuing tangible assets and part of the intangible assets (operating systems and application software) is not too hard They have a price tag
Cost to replace them is known
What about information assets? Information Asset Valuation
One way to put a value on information assets is to ask knowledgeable staff
Delphi method is used to do this in a systematic way Delphi Method
Experts (i.e. knowledgeable staff) give answers to questionnaires for several rounds
After each round a facilitator summarizes answers (and reasoning) given by experts
This summary is given to the experts (usually done anonymously)
New round is started in which experts may revise their answers Approach by Zareer Pavri
Use a cost-based and/or income-generating approach to arrive at concrete numbers
Cost approach: try to put a fair market value on the information assets
Income approach: try to determine income stream generated by products/services associated with information assets
However, even Zareer Pavri admits that valuation of information assets is sometimes more of an art than a science Example for Gillette
In 1998 Gillette had a total value of invested capital of $58.53 billion This is based on a market price of $49 per share
Working Capital was $2.850 billion
Fixed/other assets were at $5.131 billion
Intangible assets (with a price tag) $5.854 billion
We are missing $44.7 billion
At least part of this has to be the value of the information assets Threat Analysis
Let’s move to the next point: threat analysis
Reminder: a threat is a potential danger to an asset
Generally there are two different types: Accidental
Intentional
Many different sources are possible: Natural: floods, fires, earthquakes, . . .
Human: network-based attacks, malicious software upload, incorrect data entry, . . .
Environmental: power failure, leakage, . . . Threat Analysis (2)
During threat analysis, analyst must decide which threats to consider
This can be a formidable task: There’s always the danger of overlooking something
It’s impossible to protect yourself against every possible threat
Fortunately, many other people have done this before An analyst does not have to start from scratch Threat Analysis (3)
Lots of different sources are available: Government: CESG (Communications-Electronics Security Group), a branch within the GCHQ (Government Communications Headquarters) www.cesg.gov.uk
Industry: vendors of risk assessment tools, we’ll look at one in a moment: CRAMM www.cramm.com
Hire consultants: e.g. CMU’s Computer Emergency Response Team (CERT) www.cert.org
Other organizations: SANS (Sysadmin, Audit, Network, Security) Institute www.sans.org
There’s also a newsgroup called comp.risks Vulnerability Analysis
Next task is about identifying vulnerabilities that can be exploited
Vulnerabilities allow threats to occur (more often) or have a greater impact
Some argue that vulnerability analysis should be done before threat analysis If there is no vulnerability, there is no threat
However, it’s not possible to eliminate each and every vulnerability Vulnerability Analysis (2)
Starting a vulnerability analysis from scratch would also be very tedious
Fortunately, this has also been done before The sources mentioned before also have material on vulnerability analysis
One more source should be mentioned here: National Vulnerability Database (NVD) provided by the National Institute of Standards & Technology (NIST): nvd.nist.gov
At the time of writing this slide, there were 37,424 identified vulnerabilities documented in the database NVD Example NVD Example (2) Putting It All Together
To assess risks, threats have to be connected to vulnerabilities, which have to be mapped to assets: Risk Assessment
We’ve covered asset valuation and threat & vulnerability analysis, next step is actual risk assessment
Risk modeling aims at giving well-informed answers to the following questions: What could happen? (threats/vulnerabilities)
How bad would it be? (impact)
How often might it occur? (frequency/probability)
How certain are answers to the question above? (uncertainty) Risk Modeling
We’ve covered the first question (threat and vulnerability analysis)
Second question (impact) was covered to a certain extent (by putting values on assets)
Uncertainty could be modeled by bounded distributions: “I’m 80% certain that it will cost between £170K and £200K to reconstruct the content of this file.”
Where do we get the other numbers from?
This is not a trivial task Risk Modeling (2)
Results of an FBI/CSI (Computer Security Institute) survey in 2002: 80% of investigated companies reported financial losses due to security breaches, only 44% could quantify losses
Only 30% of UK businesses had ever evaluated return-on-investment for information security spending Risk Modeling (3)
People assessing risks have to rely on some form of statistical data This data can come from internal sources (if an organization keeps records)
employees, customers, clients
other organizations such as NIST
ASIS International (an organization for security professionals): www.asisonline.org
government, e.g. law enforcement
tools for risk assessment
On top of that you need statistical methods to evaluate the data Quantitative vs. Qualitative Assessment
Trying to put a number on everything is called quantitative risk assessment
Computing the ALE is a classic form of quantitative risk assessment
Prerequisites of this approach: Reliable data has to be available
Appropriate tools are available
The person doing the assessment knows what they are doing (and is trustworthy)
If this is not the case, quantitative assessment can lead to a false sense of security Especially if the numbers are “massaged” Qualitative Assessment
An alternative to quantitative assessment is qualitative risk assessment
Rather than using concrete numbers, ranking is used, e.g. describing a threat level as high, medium, or low
Asset values may also be described in a similar way, e.g. high, medium, or low importance Qualitative Assessment (2)
Analysis is described with the help of a matrix, with the threat level (low, medium, high) on one axis
and the asset value (low, medium, high) on the other
The darker the field, the higher the risk
Depending on security goals, certain risk levels have to be addressed (by introducing controls) Qualitative Assessment (3)
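One possible encoding of such a qualitative matrix: combine the two ordinal ratings into an ordinal risk rating. The concrete cell values here are my own illustration, not the slide's shading:

```python
LEVELS = ["low", "medium", "high"]

def risk(threat, asset_value):
    """Combine two ordinal ratings; risk grows with both inputs."""
    score = LEVELS.index(threat) + LEVELS.index(asset_value)  # 0..4
    return ["low", "low", "medium", "high", "high"][score]

print(risk("high", "high"))   # → high
print(risk("low", "medium"))  # → low
```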
Qualitative assessment is easier to understand by lay persons
However, it has the following problems: Results are essentially subjective
Results of cost/benefit analysis might be hard to express Methodology
There are pros and cons for both the quantitative and qualitative approaches
Most important issue: A knowledgeable person is doing the assessment
This is done in a systematic way
The concrete methodology becomes important when you are looking for an accreditation
Most relevant in the UK is BS 7799 (the international version of this standard is ISO 27001/27002) Hybrid Approaches
Quantitative and qualitative risk assessment are rarely used in their pure form (in the context of information security)
Increasing the granularity of a qualitative approach (by adding more levels) will push it into the direction of quantitative assessment
Very often a combination of both, a hybrid approach, is used CRAMM
Let’s complete the discussion on risk assessment by looking at a concrete tool: CRAMM
CRAMM stands for CCTA Risk Analysis and Management Method CCTA in turn stands for Central Computing and Telecommunications Agency
CRAMM is ISO 27001 compliant and, e.g., used by NATO, BAE Systems Electronics, and the NHS
If you’re selling to the UK government, chances are high that you have to use this method/tool CRAMM (2)
CRAMM follows the following overall procedure (from top to bottom): Screenshot: Threat/Vulnerability Analysis How Does It Work?
CRAMM uses a risk matrix
However, this matrix is a bit more sophisticated than the one on a previous slide: 7 levels for measuring risk: from 1 (low) to 7 (high)
10 levels for asset values: from 1 (low) to 10 (high)
5 threat levels: very low, low, medium, high, very high
3 vulnerability levels: low, medium, high How Does It Work? (2)
At the core of CRAMM is the following matrix: How Does It Work? (3)
Basically, CRAMM is a qualitative form of assessment
However, the different levels can be roughly mapped to concrete values (making it somewhat hybrid)
Example for mapping levels of risk to ALE: Selecting Countermeasures
Depending on the type of threat
type of asset
and the desired security level, a countermeasure can be selected
CRAMM contains a library of about 3500 generic controls Screenshot: Countermeasures Testing It Out
After countermeasures have been put in place, everything can and should be tested in practice
Organizations often do penetration studies They hire a team of professional security experts who (are authorized to) try to violate the security
The goal is to find gaps in the implemented security Flaw Hypothesis Methodology
A framework for conducting these studies is: 1 Information gathering: testers try to become as familiar with system as possible (in their role as external or internal attackers)
2 Flaw hypothesis: drawing on knowledge from step 1 and known vulnerabilities, testers hypothesize flaws
3 Flaw testing: testers try to exploit possible flaws identified in step 2 If a flaw does not exist, go back to step 2
If a flaw exists, go to the next step
4 Flaw generalization: testers try to find other similar flaws, iterate test again (starting with step 2)
5 Flaw elimination: testers suggest ways of eliminating flaw Example of a Study
The Internal Revenue Service (IRS) in the U.S. did a penetration study in 2001: Testers were able to identify some but not all IRS Internet gateways
Testers then tried to get through the firewalls of the identified gateways (but were not successful)
However, they gained enough information to circumvent firewalls
Testers, posing as Help Desk employees, called 100 IRS employees asking for assistance
71 of these employees agreed to temporarily change their password to help resolve a technical problem
Together with a phone number (also obtained by the testers) this would have enabled them to get access Debate on Studies
There is an ongoing debate on the validity of penetration studies Some vendors use it as an argument to prove the security of their product
Critics claim that these studies have no validity at all (can only show presence of flaws, not the absence)
The answer probably lies in the middle: Penetration studies are not a substitute for thoroughly designing and implementing a security solution
However, they are valuable in a final “testing” stage Conclusion
Risk assessment is an important first step in determining whether to protect an asset and to which level
the threats to protect against and their likelihood of occurring
Environments change, so (parts of) the assessment might have to be redone later
Most important thing is to create awareness and start a risk assessment process Chapter 5 Security Policies Introduction
So you’ve succeeded as SO in convincing people that taking care of information security
and assessing the risks is a good idea
What’s next? Implementing everything and putting things in practice Introduction (2)
Implementing information security is not just about scrambling like mad to install the newest patches
get firewalls up and running
install virus scanners
It’s also about putting sensible practices and procedures in place We’re still talking about administrative/management side, we’ll look at technical issues later Security Policies
This is where the concept of a security policy comes into play
It is a (set of) document(s) in which an organization’s philosophy
strategy
practices with regard to confidentiality, integrity, and availability of information and information systems are laid out Security Policies (2)
What can policies do that technology alone can’t? Tie in upper management: technological controls are the responsibility of an IT administrator/manager, policies are those of upper management
If an organization has legal obligations to fulfill, it can provide a paper trail
Define a strategy with regard to security: what to protect and to which degree (rather than how)
Serve as a guide to information security for employees Useless Example (1)
MegaCorp Ltd security policy: 1 This policy is approved by Management.
2 All staff shall obey this security policy.
3 Data shall be available only to those with a “need-to-know”.
4 All breaches of this policy shall be reported at once to Security. Useless Example (2)
17,346,125 pages of bureaucratic and technical details
These two examples probably explain findings from a study undertaken by Cisco in 2008: 1 23% of IT professionals work for an organization not having a security policy
2 47% of employees and 77% of IT professionals think their organization’s security policy could be improved
3 On average 40% of employees and 20% of IT professionals were unaware that their organization had a security policy Defining a Security Policy
First step: find out about your risks (risk assessment)
Policies can be seen as a risk-control mechanism (a procedural one rather than a technical one)
Goal of a control mechanism is to get risk down to an acceptable level
Policy should cover risks identified during risk assessment
Before going into concrete examples, we’ll define some important concepts of security policies Formal Models
An important aspect of security policies is classification and access control models: they distinguish between various groups of people, systems, and information in terms of security
and determine who is authorized to access which information on which system
Access Control usually comes in two different flavors: Discretionary Access Control (DAC)
Mandatory Access Control (MAC) Discretionary vs. Mandatory
Discretionary Access Control (DAC) Each data object is owned by a user and user can decide freely which other users are allowed to access data object
Many operating systems follow this line
Mandatory Access Control (MAC) Access is controlled by a system-wide policy with no say by the users
Military systems often use this kind of control
We’ll now look at some concrete models Access Control Matrix
One of the earliest models to emerge was the Access Control Matrix
An Access Control Matrix is used to describe the protection state of information or a system
More formally, there are objects o that need to be protected objects can be files, devices, processes
subjects s who want to access the objects subjects can be users, other processes
rights r(s, o) that specify which type of access subject s is permitted on object o access rights can include read, write, execute, own Access Control Matrix (2)
This is organized in a matrix: subjects are in the rows, objects in the columns, and the rights in the cells
        o1          o2          o3        ...
s1   r(s1, o1)   r(s1, o2)   r(s1, o3)    ...
s2   r(s2, o1)   r(s2, o2)   r(s2, o3)    ...
s3   r(s3, o1)   r(s3, o2)   r(s3, o3)    ...
where each cell r(si, oj) contains the (possibly empty) set of rights of si on oj Example
Here’s a bookkeeping example with different users and objects:
                     Operating   Accounting    Accounting   Audit
                     System      Application   Data         Trail
Alice (sysadmin)     rwx         rwx           r            r
Bob (manager)        rx          x             -            -
Charlie (auditor)    rx          r             r            r
Acc. Application     rx          r             rw           w
(r: read access, w: write access, x: execution permission, -: no rights) Problems with Access Control Matrices
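The bookkeeping matrix can be modelled directly as a dict of dicts; a sketch with abbreviated object names (the names are mine, the rights are the slide's):

```python
# rights[subject][object] holds that cell's rights string; missing = no rights.
rights = {
    "Alice":   {"OS": "rwx", "AccApp": "rwx", "AccData": "r",  "Audit": "r"},
    "Bob":     {"OS": "rx",  "AccApp": "x"},
    "Charlie": {"OS": "rx",  "AccApp": "r",   "AccData": "r",  "Audit": "r"},
    "AccApp":  {"OS": "rx",  "AccApp": "r",   "AccData": "rw", "Audit": "w"},
}

def allowed(subject, obj, right):
    """Check whether subject s holds right r on object o."""
    return right in rights.get(subject, {}).get(obj, "")

print(allowed("Alice", "AccData", "r"))  # → True
print(allowed("Bob", "AccData", "r"))    # → False
```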
If applied in a straightforward fashion, this matrix doesn’t scale very well Example: bank with 50,000 staff and 300 applications ⇒ 15,000,000 entries
On top of that, most entries are empty: storing a sparsely populated matrix explicitly wastes a lot of space
Solution: only store entries that actually exist, can be done in two different ways: by row, store privileges with user (capability lists)
by column, store privileges with objects (access control lists) Capability Lists
Storing the example matrix by rows would yield the following capability lists:
Alice: Op. Sys.: rwx, Acc. App.: rwx, Acc. Data: r, Audit: r
Bob: Op. Sys.: rx, Acc. App.: x
...
Makes it difficult to check who has privileges for a certain file One would have to look at every list Access Control Lists
Storing the example matrix column-wise would result in the following access control lists:
Op. Sys.: Alice: rwx, Bob: rx, Charlie: rx, Acc. App.: rx
Acc. App.: Alice: rwx, Bob: x, Charlie: r, Acc. App.: r
Acc. Data: Alice: r, Charlie: r, Acc. App.: rw
...
Makes it easier to check who can access which file (and, if necessary, revoke the rights)
However, lists may still be too long and it may be too tedious to manage them
One solution is to organize users into groups Access Control Lists (2)
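Both sparse representations fall out of the same data by iterating row-wise or column-wise; a small sketch (subject/object names abbreviated from the example):

```python
# Sparse matrix: subject → object → rights (only non-empty cells stored).
matrix = {
    "Alice": {"OS": "rwx", "AccApp": "rwx"},
    "Bob":   {"OS": "rx"},
}

# Capability lists: stored by row, i.e. the privileges travel with each user.
capabilities = {subject: dict(objs) for subject, objs in matrix.items()}

# Access control lists: stored by column, i.e. with each object.
acls = {}
for subject, objs in matrix.items():
    for obj, r in objs.items():
        acls.setdefault(obj, {})[subject] = r

print(acls["OS"])  # → {'Alice': 'rwx', 'Bob': 'rx'}
```

The ACL form makes "who can access this object?" a single lookup, which is exactly the revocation check the slide mentions.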
Grouping users is an approach employed by many operating systems
For example, in UNIX/Linux systems you have the following partitioning: self: owner of a file
group: a group of users sharing common access
other: everyone else
Example: -rwxr-xr-x alice users accounts
The permission string splits into self (rwx), group (r-x), and other (r-x); alice is the name of the owner, users is the name of a group, accounts is the name of a file Access Control Lists (3)
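The UNIX-style mode string can be split mechanically into the three triples; a minimal sketch:

```python
def decode(perm):
    """Split a 10-character mode string (type bit + 3 triples) into
    self/group/other permission triples."""
    assert len(perm) == 10
    return {"self": perm[1:4], "group": perm[4:7], "other": perm[7:10]}

p = decode("-rwxr-xr-x")
print(p["self"], p["group"], p["other"])  # → rwx r-x r-x
```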
However, ownership of a file is still under control of a single user
This user can pass on access permissions as they see fit Assigning users to groups is usually done by a system administrator, though
This might not always be appropriate for (larger) organizations
There is a model called role-based access control, which we will look at after introducing some other models Summary Access Control Matrix
An Access Control Matrix specifies permissions on an abstract level
limits the damage that certain subjects can cause
in the form described above is a snapshot view (dynamic behavior is not described) Bell-La Padula Model
In the Bell-La Padula (BLP) model, every document (or information object) has a security classification
The more sensitive the information, the higher the classification level
Examples for typical levels are: top secret (ts), secret (s), confidential (c), and unclassified (uc)
Every user of the system has a clearance (level)
Classification and clearance levels are not decided by the users, some certified entity has to do this
To be able to access a document, a user must have at least the level of the document
The concept of this model originated in the military
Let’s define this more formally Bell-La Padula Model (2)
Assume that you have a data object o
a subject s
Let c(o) be the classification of o and c(s) be the clearance of s
BLP enforces the following two properties: Simple security property: s may have read access to o only if c(o) ≤ c(s)
* (star) property: s, who has read access to o, may have write access to another object p only if c(o) ≤ c(p) Bell-La Padula Model (3)
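The two BLP properties can be checked with a simple ordering on levels; a sketch using the slide's four levels:

```python
# Levels ordered: unclassified < confidential < secret < top secret.
LEVELS = {"uc": 0, "c": 1, "s": 2, "ts": 3}

def can_read(subject_clearance, object_class):
    """Simple security property: read o only if c(o) <= c(s)."""
    return LEVELS[object_class] <= LEVELS[subject_clearance]

def can_write_after_read(read_class, target_class):
    """Star property: after reading o, write p only if c(o) <= c(p)."""
    return LEVELS[read_class] <= LEVELS[target_class]

print(can_read("s", "c"))              # → True  (secret clearance may read confidential)
print(can_write_after_read("s", "c"))  # → False (blocks the write-down)
```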
First rule is quite straightforward: no-one may access information that is classified higher than their clearance
Second rule seems counterintuitive at first glance Intended to prevent a “write-down” effect
“write-down”: more highly classified information is passed to a lower classified object
BLP can be extended by categories Categories
Categories (or compartments) describe types of information
Each data object may be assigned to different categories (multiple categories are possible)
A subject should only have access to information that they need to do their job
In addition to a clearance level, a subject is also assigned a set of categories (which they are allowed to access) Categories (2)
Let’s assume we have the categories NUC, EUR, US
All possible subsets of these categories form a lattice with regard to the subset relation:
A person with access to { NUC, US } may look at data objects categorized as { NUC, US }, { NUC }, { US }, or ∅ (uncategorized), subject to clearance level Summary BLP
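The category check described above is exactly the subset relation of the lattice; a minimal sketch:

```python
def category_ok(subject_cats, object_cats):
    """Access (clearance level aside) requires the object's categories
    to be a subset of the subject's categories."""
    return set(object_cats) <= set(subject_cats)

print(category_ok({"NUC", "US"}, {"NUC"}))         # → True
print(category_ok({"NUC", "US"}, {"NUC", "EUR"}))  # → False
```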
In essence BLP enforces confidentiality
Can be explained by its military origin
Does not handle data integrity (or only to a very limited extent)
We’ll have a look at some other models now (which also ensure integrity) Clark-Wilson Integrity Model
In a commercial environment, preventing unauthorized modification of data is at least as important as preventing disclosure
Examples: Preventing fraud (or mistakes) in accounting
Inventory control: Still works if information is disclosed
May break down if too much data is incorrect Clark-Wilson Integrity Model (2)
BLP protects data against unauthorized users
The Clark-Wilson (CW) integrity model also tries to protect data against authorized users
These practices started long before computers; two mechanisms are especially important Well-formed transaction: the user is constrained in the way they can manipulate data, e.g. recording a log of changes
double entry bookkeeping
Separation of duty: executing different subparts of a task by different persons, e.g. authorizing purchase order, recording arrival, recording arrival of invoice, authorizing payment Clark-Wilson Integrity Model (3)
How are these concepts translated for data integrity in information systems?
Identify Constrained Data Items (CDIs) These are data items for which integrity has to be upheld
Not all data items need to be CDIs, other data items are called Unconstrained Data Items (UDIs)
The integrity policy is defined by two classes of procedures: Integrity Verification Procedures (IVPs)
Transformation Procedures (TPs) Integrity Verification Procedures
IVPs check that all CDIs in the system conform to integrity constraints
On successful completion this confirms that at the time of running an IVP, the integrity constraints were satisfied
Example for accounting: Audit functions are typical IVPs
For example, an auditor confirms that the books are balanced and reconciled Transformation Procedures
TPs correspond to the concept of a well-formed transaction
Applying a TP to a CDI in a valid state will result in a CDI that is still in a valid state
Example for accounting: A double entry transaction is a TP IVPs and TPs
CW works in the following way: We start in a valid state (confirmed by IVPs)
We only manipulate CDIs with TPs
We will always be in a valid state
What’s missing? A system can enforce that CDIs are only manipulated by TPs
A system cannot ensure (on its own) that the IVPs and TPs are well-behaved
IVPs and TPs need to be certified by someone Certification
The security officer (or an agent with similar responsibilities) certifies IVPs and TPs with respect to an integrity policy
This is usually not done automatically, but semi-automatically (i.e. some human intervention is necessary)
Accounting example: SO certifies that TPs implement properly segregated double entry bookkeeping CW Model
Assuring the integrity in CW consists of two important parts: Certification (which is done by a human)
Enforcement (which is done by a system)
CW consists of a set of rules referring to certification (C-rules) and enforcement (E-rules) Concrete Rules
Basic framework: C1: (IVP certification) Successful execution of IVPs confirms that all CDIs are in a valid state
C2: (TP certification)
Agent specifies relations of the form (TP_i, (CDI_i1, CDI_i2, ...)) that define for which CDIs TP_i returns valid states E1: (TP enforcement) System maintains a list of these relations and only allows manipulation of CDIs via the specified TPs Concrete Rules (2)
Separation of duty: C3: (User certification)
Agent specifies relations of the form (UserID, TP_i, (CDI_i1, CDI_i2, ...)) that define which TPs may be executed on which CDIs on behalf of which user
E2: (User enforcement) System maintains a list of these relations and enforces them
E3: (Authentication) Each user must be authenticated before executing a TP for that user
E4: (Certification and execution) Only an agent permitted to certify entities may change relations; this agent may not have any execution rights with respect to that entity Concrete Rules (3)
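A minimal sketch of the enforcement rules E1-E3 in Python (the class and all names are hypothetical illustrations, not part of the original model):

```python
# Hypothetical sketch of Clark-Wilson enforcement (rules E1-E3).
# Certification (C-rules) is assumed to have happened out-of-band by a human.

class CWSystem:
    def __init__(self):
        self.tp_cdis = {}          # C2/E1: TP name -> set of CDIs it is certified for
        self.user_tps = set()      # C3/E2: (user, TP) pairs the certifier authorized
        self.authenticated = set() # E3: users who have logged in

    def certify_tp(self, tp, cdis):
        self.tp_cdis[tp] = set(cdis)

    def authorize(self, user, tp):
        self.user_tps.add((user, tp))

    def login(self, user):
        self.authenticated.add(user)

    def execute(self, user, tp, cdi):
        if user not in self.authenticated:          # E3: authentication required
            raise PermissionError("user not authenticated")
        if (user, tp) not in self.user_tps:         # E2: user/TP relation enforced
            raise PermissionError("user not authorized for this TP")
        if cdi not in self.tp_cdis.get(tp, set()):  # E1: TP/CDI relation enforced
            raise PermissionError("TP not certified for this CDI")
        return f"{tp} applied to {cdi} by {user}"
```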
Logging/Auditing: C4: (Logging) All TPs must be certified to write to an append-only log, making it possible to reconstruct each executed operation
There’s no special enforcement rule, because the log can be modeled as a CDI
and each TP has to be certified to append to this CDI
We’re almost done. . . Concrete Rules (4)
Relationship between UDIs and CDIs C5: (Transformation) Any TP taking a UDI as an input must be certified to only use valid transformations when taking the UDI input to a CDI
or reject the UDI
This is important as new information is fed into a system via UDIs For example, data typed in by a user may be a UDI Summary CW
CW enforces integrity: Uses two different integrity levels: CDIs and UDIs (CDI being the higher level)
“Upgrades” from UDIs to CDIs via certified methods
Could also enforce confidentiality (not a main concern in the original model): If users are only allowed to access data through Transformation Procedures as well
Although in this case, no data will be changed Role-Based Access Control
Role-based Access Control (RBAC) is a mandatory access control model
Mainly for commercial/civilian applications, so does not have the different levels of secrecy of Bell-La Padula
Can be seen as a more strict version of the Access Control List (ACL) Main difference to ACL: in RBAC a certified entity grants and revokes privileges, this is not done by the user Roles
Similar to ACL RBAC also puts users into different groups, these are called roles
Depending on their current role, users can access (and modify) certain data objects
For example, in a hospital setting you could have the following roles: doctors, nurses, pharmacists, administrators While doctors may modify a diagnosis file, administrators may not do so
Let’s give a more formal description now Formal Definition
In RBAC you have a subject s
role r
transaction t (transactions access/modify data)
object o
mode x (read, write, execute, etc.)
Furthermore, there are AR(s): the active role of s, i.e. the role s is currently using
RA(s): the roles s is authorized to perform
TA(r): the transactions a role is authorized to perform Formal Definition (2)
A subject s may execute a transaction t if exec(s, t) is true
exec(s, t) is true, if and only if s can execute t at the current time This is determined by AR(s) and TA(r)
This is the case if t ∈ TA(AR(s)) Rules in RBAC
RBAC enforces the following rules 1 Role assignment: the assignment of a role is a necessary prerequisite for executing transactions For all s, t : exec(s, t) can only be true if AR(s) ≠ ∅
2 Role authorization: a subject’s active role must be authorized For all s : AR(s) ∈ RA(s)
3 Transaction authorization: s can execute t only if t is authorized for the active role of s For all s, t : exec(s, t) is true only if t ∈ TA(AR(s)) Rules in RBAC (2)
Access modes can be defined for transactions (to allow a finer granularity) This defines what a user (in a certain role) can do with data (read-only, modify, etc.)
access(r, t, o, x) is true, if and only if role r may execute transaction t on object o in mode x
So sometimes a fourth rule is added to RBAC: 4 For all s, t, o : exec(s, t) is true, only if access(AR(s), t, o, x) is true Role Relationships
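The three basic rules can be sketched directly in Python; the role data below is an illustrative assumption based on the hospital example:

```python
# Sketch of RBAC rules 1-3; the concrete subjects, roles, and
# transactions are made-up examples.
AR = {"dr_house": "doctor"}                  # active role of each subject
RA = {"dr_house": {"doctor", "researcher"}}  # roles a subject may assume
TA = {"doctor": {"diagnose", "prescribe_drug"},
      "admin": {"billing"}}                  # transactions per role

def exec_ok(s, t):
    role = AR.get(s)
    # Rule 1: a role must be assigned; Rule 2: the active role must be authorized
    if role is None or role not in RA.get(s, set()):
        return False
    # Rule 3: the transaction must be authorized for the active role
    return t in TA.get(role, set())
```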
The relationship between users, roles, transactions, and objects can also be described graphically
[Diagram: users are member_of roles, and roles execute transactions on objects, e.g. User 1 is a member of Role 1, which executes transaction_a on Object 1 and transaction_b on Object 2]
A more concrete example in a hospital setting:
[Diagram: Dr House and Dr Wilson are members of the role doctor, which may execute prescribe_drug on Patient file 1 and diagnose on Patient file 2]
Often MAC is equated with military applications, while DAC is seen as appropriate for commercial/civilian applications
However, DAC might not always provide the appropriate security in a commercial/civilian setting
RBAC fills this gap: it’s a MAC model without the specific military requirements Summary of Formal Models
Access Control Matrix, BLP, CW, and RBAC are just four (though important) models
Other well-known models include Biba (integrity) and Chinese Wall (hybrid)
Due to time constraints they are not discussed here
We will now turn our attention to the management side of security policies Organization-wide Policies
Ideally, a policy for an organization should be light, i.e. not covering hundreds or even thousands of pages
simple and practical
easy to manage and maintain
accessible, i.e. people can find what they’re looking for
Policy should be organized in a hierarchical fashion: Security framework paper
Position papers (on specific issues) Organization-wide Policies (2) Framework Paper
The security framework document should state an organization’s commitment to information security
define a classification system (e.g. based on one of the discussed formal models)
make clear that users and administrators will be held accountable for their behavior
give the SO appropriate authority to enforce security policy
state clearly the responsibilities that individuals have
define how security will be reviewed Position Papers
Position papers address specific aspects of the security policy, e.g. how to configure servers connected to the internet
the process to be followed in the event of a security breach
should be focused and kept short
can have different authors (should be written by someone with expertise in the area) Position Papers (2)
Exact topics depend on the organization, here are some important ones: Physical security
Network security
Access control
Authentication
Encryption
Key management
Compliance
Auditing and Review
Security Awareness
Incident response and contingency plan Content of Position Papers
Scope: should describe which issue, organizational unit, or system is covered
Ownership: name and contact details of the document owner
Validity: policies can/should have limited lifespans (after which they’re reviewed)
Responsibilities: who is responsible for which elements
Compliance: the consequences of non-compliance with the policy
Supporting documentation: references to documents higher and lower in the policy structure
Position statement: statements about the actual issue or system (the hard part)
Review: whether, when, and how issue or system will be reviewed Summary
Policies are an important part of information security, as they also cover the non-technical side
There’s no “one size fits all”, policy must be adapted to the risk profile of an organization There are (customizable) policy solutions out there, so not everything has to be done from scratch
Security policies should be easy to access, use and understand
For a more detailed example, see http://www.securityfocus.com/infocus/1193 Chapter 6 Social Engineering Introduction
As already mentioned, security is not just a technological problem, but also a people and management problem
Social engineers bypass the technological defenses and target people
Social engineering is about manipulating people to gain access to confidential information
perform fraudulent actions (they wouldn’t normally do)
We’ll have a look at some well-known techniques Common Methods
Posing as a fellow employee (easier in a large organization)
Posing as an employee of a vendor, partner company, or law enforcement
Posing as someone in authority
Posing as a new employee requesting help
Using insider lingo and terminology to gain trust Common Methods (2)
Posing as a vendor or systems manufacturer offering a system patch or update
Offering help if a problem occurs, then causing a problem, and waiting for the victim to ask for help
Sending free software or patch for installation
Sending virus/Trojan Horse as e-mail attachment
Leaving CD/DVD/USB device with malware (malicious software) around the workplace
Offering prizes for registering at a web site Common Methods (3)
Dropping document/file at mail room for intraoffice delivery
Modifying fax machine heading to appear to come from internal location
Asking receptionist to receive, then forward a fax
Pretending to be from remote office and asking for e-mail access locally “Social Engineering Life Cycle”
Research: attackers try to find out as much as possible beforehand from annual reports, brochures, web site, dumpster diving
Developing trust: use of insider information, misrepresenting identity, need for help or authority
Exploiting trust: ask victim for information or other form of help (or manipulate victim into asking for help)
Utilize information: if final goal not reached yet, go back to earlier steps Why Do Social Eng. Attacks Work?
There are certain basic tendencies of human nature that are exploited
Authority: people have a tendency to comply when approached by a person in authority “I’m with the IT department/I’m an executive/I work for an executive/. . . ”
Liking: people seem to comply when a person comes across as likable (similar interests, beliefs, and attitudes) “I have the same hobby/I come from the same town/. . . ” Why Do Social Eng. Attacks Work? (2)
Reciprocation: people feel obliged to help if they’ve received a favor or gift “Let me help you with sorting out your computer problems. . . ”
Consistency: people feel that they have to honor any pledges or commitments made in public “But your company has a reputation for. . . /You have done this for so-and-so/. . . ” Why Do Social Eng. Attacks Work? (3)
Social validation: people have a tendency to comply when doing so appears to match what others are doing “Your colleagues have also answered this survey. . . ”
Scarcity: people seem to behave differently when an asset is in short supply “The first hundred people to register will receive a free. . . ” Warning Signs of an Attack
Unusual request
Refusal to give callback number
Claim of authority
Stresses urgency
Threatens negative consequences in case of non-compliance
Shows discomfort when questioned or challenged
Name dropping
Compliments or flattery
Flirting How to Prevent Attacks
What’s going to stop an attack by a social engineer? A firewall?
Strong authentication devices?
Intrusion detection systems?
Encryption?
The answer is no to all of these How to Prevent Attacks (2)
So what does work?
Have procedures in place for handling suspicious requests: Make them part of the security policy
Let staff know about these
Training of staff plays an important role
Explain why certain procedures are put in place (blind obedience doesn’t work) Training Staff
One of the most important points is to make sure that employees verify the identity of a caller/visitor
This can involve: If caller is known, this can be as simple as recognizing their voice
Ask for a phone number to call back (and check number)
Contact caller’s/visitor’s superior
Ask a trusted employee to vouch for a person
Staff also needs to find out the “need-to-know”: why a certain person wants information and why they are authorized Training Staff (2)
Make sure that staff handling sensitive information are aware of the fact and know potential effects of handing out information
Staff has to be trained to challenge authority when security is at stake Training should include teaching friendly ways of doing this
This has to have support from high-level management, otherwise people will stop doing this
Don’t forget anyone in the training: Security guards may have physical access to sensitive information or systems and be asked to help out Training Staff (3)
Training should not just involve organizing seminars that go on for hours People just switch off after some time
Much better is hands-on training Stage simulated attacks on your organization, e.g. by sending them phishing e-mails (PhishGuru), distributing USB sticks, etc.
Then debrief employees on how to change/improve their behavior (sticks much better due to first-hand experience)
Don’t combine this with punitive actions, this is not about punishing people, but about helping them Summary
There are many (non-technical) ways to gain access to information
Very often these kinds of attacks are the most successful
A very important way of protecting an organization is to train the employees Chapter 7 Cryptography Introduction
We’ll now turn to technical issues of information security
One of the most important tools available is cryptography
Cryptography is the (art and) science of keeping information secure This is usually done by encoding it
Cryptanalysis is the (art and) science of breaking a code
Cryptology is the branch of math needed for cryptography and cryptanalysis Introduction (2)
Cryptography can help in providing: Confidentiality: only authorized persons are allowed to decode a message
Authentication: receiver of a message (e.g. a password) should be able to ascertain its origin
Integrity: receiver of a message should be able to verify that it hasn’t been modified
Non-repudiation: a sender shouldn’t be able to falsely deny that they sent a message
We’re talking about messages here, but the principles can be applied to any information Basic Definitions
The original message is plaintext (sometimes also called cleartext)
Disguising the content of a message is called encryption
This results in ciphertext, the encrypted message
Turning ciphertext back into plaintext is called decryption
Original Plaintext Ciphertext Plaintext Encryption Decryption Basic Definitions (2)
Plaintext is denoted by P or M (for message)
It’s a stream of bits, intended for transmission or storage, e.g. a textfile
a bitmap
digitized audio data
digital video data
Ciphertext is denoted by C and is also binary data Can be the same size as M
Can be larger
Can be smaller (if combining encryption with compression) Basic Definitions (3)
The encryption function E operates on M to produce C: E(M) = C
The decryption function D operates on C to produce M: D(C) = M
As the whole point of encrypting and then decrypting is to recover the original message, the following has to be true: D(E(M)) = M Algorithms
A cryptographic algorithm (also called a cipher) is the mathematical function used for encryption and decryption
Usually there are two functions, one for encryption and one for decryption
If security is based on keeping the algorithm secret, then it’s a restricted algorithm
Restricted algorithms are not a good idea: Every time a user leaves a group, the algorithm has to be changed
A group must develop their own algorithm; if they don’t have the expertise, then it will be subpar Algorithms and Keys
Modern algorithms use a key, denoted by K
A key is taken from a large range of possible values, the keyspace
used as additional input for the en-/decryption function to do the en-/decrypting
Encryption Decryption Key Key
Original Plaintext Ciphertext Plaintext Encryption Decryption Algorithms and Keys (2)
So, the following holds:
E_KE(M) = C
D_KD(C) = M
D_KD(E_KE(M)) = M
The key used for encryption, K_E, can be the same as the key used for decryption, K_D, or the keys can be calculated from each other If this is the case, we have a symmetric algorithm
Otherwise, it’s an asymmetric algorithm Algorithms and Keys (3)
The security is based in the key, not in the details of the algorithm
This has several advantages: The algorithms can be published and analyzed by experts for possible flaws
Software for the algorithm can be mass-produced
Even if an eavesdropper knows the algorithm, without the key messages cannot be read
An algorithm together with all possible plaintexts, ciphertexts, and keys is called a cryptosystem Symmetric Algorithms
Symmetric algorithms are sometimes also called conventional algorithms Secret-key or single-key algorithm means that encryption and decryption keys are identical
Anyone who has the key can decrypt a message
However, before two persons/systems can communicate, they need to agree on a key Asymmetric Algorithms
Asymmetric algorithms are also often called public-key algorithms The key used for encrypting messages is the public key
The key used for decrypting messages is the private key
The public key can be published, so that anyone can encrypt messages to a person/system
The private key is only known to the receiver
This only works if the private key cannot be calculated from the public key (in any reasonable time) Asymmetric Algorithms (2)
Asymmetric algorithms work similar to padlocks: You distribute open padlocks (for which only you have a key)
Anyone sending you a message puts it into a box and snaps the padlock shut (encryption using public key)
Only you can unlock the padlock using the key (decryption using private key) Encryption Techniques
There are different techniques for actually doing the encryption
We are going to have a look at the most important ones: Substitution ciphers
Transposition ciphers
Stream ciphers
Block ciphers Substitution Ciphers
In a substitution cipher each character in the plaintext is substituted for another character
One of the earliest such techniques was used in the famous Caesar cipher
In the Caesar cipher each character is shifted (by three characters in the original cipher) Monoalphabetic Substitution Ciphers
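The Caesar shift can be written in a few lines of Python (a minimal sketch; non-letters are passed through unchanged):

```python
def caesar(text, shift=3):
    # shift each letter by `shift` positions, wrapping around the alphabet
    return "".join(
        chr((ord(c) - ord("A") + shift) % 26 + ord("A")) if c.isalpha() else c
        for c in text.upper()
    )
```

Decryption is just shifting in the other direction, e.g. caesar(ciphertext, -3).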
Caesar cipher is a special case of a simple substitution cipher (or monoalphabetic cipher)
Instead of just shifting characters, the alphabet can be mapped to a (random) permutation: Monoalphabetic Substitution Ciphers (2)
Monoalphabetic ciphers are not very secure and can be easily broken by statistical means: Different characters have typical frequencies in languages
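Counting letter frequencies, the basis of such statistical attacks, takes only a few lines with Python's Counter:

```python
from collections import Counter

def letter_frequencies(text):
    # relative frequency of each letter, most common first
    letters = [c for c in text.upper() if c.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return {c: n / total for c, n in counts.most_common()}
```

In a long English text, E is typically the most frequent letter; matching ciphertext frequencies against such tables recovers a monoalphabetic substitution quickly.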
There have been various attempts at making substitutions more secure Homophonic Substitution Ciphers
Homophonic substitution ciphers try to obscure the frequencies by mapping a character to more than one code For example, “A” could correspond to 5, 13, 25, or 56; while for “B” this could be 7, 19, 32, or 42
While this makes analysis a bit harder, it doesn’t hide all statistical properties
With the help of a computer can usually be broken in a few seconds Polygram Substitution Cipher
Instead of encoding single characters, a polygram substitution cipher encrypts groups of letters For example, “ABA” could correspond to “RTQ”, while “ABB” could correspond to “SLL”
The example above uses 3-grams (groups of 3 letters); can be generalized to n-grams
Still not a very secure way of encrypting data: This hides the frequencies of individual letters
However, natural languages also show typical frequencies for n-grams (although the curve is flattened) Polyalphabetic Substitution Cipher
A polyalphabetic substitution cipher uses multiple simple substitution ciphers
The particular one used changes with the position of each character of the plaintext There are multiple one-letter keys
The first key encrypts the first letter of the plaintext, the second key encrypts the second letter of the plaintext, and so on
After all keys are used, you start over with the first key
The number of keys determines the period of the cipher Polyalphabetic Substitution Cipher (2)
An early example is the Vigenère cipher
Adds a key repeatedly into the plaintext using numerical codes: A=0, B=1, . . . , Z=25
This is done modulo 26, i.e. if the result is 26 or greater, we divide it by 26 and take the remainder: C = M + K mod 26
For example, P(15) + U(20) = 35, and 35 mod 26 = 9, which is J Polyalphabetic Substitution Cipher (3)
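The Vigenère scheme, C = M + K mod 26 with a repeating key, can be sketched as:

```python
def vigenere(text, key, decrypt=False):
    # add (or, for decryption, subtract) the repeating key letter by letter, mod 26
    sign = -1 if decrypt else 1
    out = []
    for i, c in enumerate(text.upper()):
        k = ord(key.upper()[i % len(key)]) - ord("A")
        out.append(chr((ord(c) - ord("A") + sign * k) % 26 + ord("A")))
    return "".join(out)
```

For example, vigenere("P", "U") gives "J", matching the calculation above.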
Unfortunately, the Vigenère cipher is also not very secure
Code can be broken by analyzing the period Looking at the example from previous slide, we notice that KIOV is repeated after 9 letters, NU after 6 letters
As 3 is a common divisor of 6 and 9, this is a hint that the period could be 3
Knowing which letters were encoded with the same key allows the application of frequency methods again Rotor Machines
In the 1920s, mechanical encryption devices were developed (to automate the process)
Basically, these machines implemented a complex Vigenère cipher
A rotor is a mechanical wheel wired to perform a general substitution
A rotor machine has a keyboard and a series of rotors, where the output pins of one rotor are connected to the input of another
For example, in a 4-rotor machine, the 1st rotor might substitute A → F, the 2nd F → Y, the 3rd Y → E, the 4th E → C; so the final output for A is C
After each output, some of the rotors shift The combination of several rotors and the shifting of rotors leads to a period of 26^n (where n is the number of rotors)
A long period makes it harder to break the code
The best-known rotor machine is the Enigma
Transposition Ciphers
Substitution ciphers on their own are usually not very secure
That is why they are combined with transposition ciphers Transposition ciphers on their own are also not very secure
In a transposition cipher the symbols of the plaintext remain the same, but their order is changed Simple Columnar Transposition Cipher
In a simple columnar transposition cipher the plaintext is written horizontally onto a piece of graph paper of fixed width
while the ciphertext is read off vertically
Plaintext “we are discovered flee at once” written in rows of width 6:
WEARED
ISCOVE
REDFLE
EATONC
E
Reading off vertically gives the ciphertext: WIREEESEAACDTROFOEVLNDEEC Double Columnar Transposition
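The example can be reproduced in code; this is a minimal sketch of simple columnar transposition (letters only, no padding):

```python
def columnar_encrypt(plaintext, width):
    # strip to letters, write rows of the given width, then read off by column
    text = "".join(c for c in plaintext.upper() if c.isalpha())
    rows = [text[i:i + width] for i in range(0, len(text), width)]
    return "".join(row[col] for col in range(width) for row in rows
                   if col < len(row))
```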
A simple columnar transposition can be broken by trying out different width/column lengths
Putting the ciphertext through a second transposition enhances security In contrast to a simple substitution cipher, where a second application does not increase security Stream Ciphers/Block Ciphers
In symmetric algorithms we also distinguish between stream ciphers and block ciphers
Stream ciphers operate on the plaintext a single bit (or single character) at a time Simple substitution cipher is an example
Block ciphers operate on groups of bits (or groups of characters) An example of an early block cipher is Playfair Playfair
Place the alphabet in a 5x5 grid, permuted by the keyword (omitting the letter J; J=I)
Then divide the plaintext into pairs of letters preventing double letters in a pair (by separating them with an ‘x’)
adding a ‘z’ to the last pair (if necessary) Playfair (2)
For example, “Lord Granville’s letter” becomes “lo rd gr an vi lx le sl et te rz”
Then encrypt text pair by pair Replace two letters in the same row or column by succeeding letters, e.g. “am” → “LE”
Otherwise the two letters are in opposite corners of a rectangle, replace them with letters in the other corners, e.g. “lo” → “MT” Modern Algorithms
So far we have covered algorithms that are more of historical interest Good for explaining basic cryptographic concepts with easy-to-understand algorithms
None of these algorithms should be used today (they can all be broken easily)
We will now turn to more modern algorithms (which are still relevant today)
First we introduce some more concepts Confusion and Diffusion
In the 1940s Claude Shannon suggested that strong ciphers can be built by combining substitution and transposition
This relies on the principles of confusion and diffusion Confusion adds an unknown key value to confuse an attacker about the true value of a plaintext symbol
Diffusion spreads the plaintext information through the ciphertext SP-Networks
Early (modern) block ciphers used simple networks combining substitution and permutation circuits
These networks are called substitution-permutation networks (or SP networks)
Let’s have a look at a simple SP network operating on 16-bit blocks SP-Networks (2)
A substitution box (S-box) is basically a look-up table containing some permutation of the numbers 0 to 15: 0000 → 1001, 0001 → 0100, 0010 → 1100, 0011 → 0000, ... SP-Networks (3)
The mapping used for the S-boxes needs to be one-to-one (so decryption is possible)
The output of S-boxes is shuffled with a permutation network before being fed into the next layer of S-boxes
Each layer is also called a round
One question remains: how do you add a key? SP-Networks (4)
After each round you add (part of) the key via an XOR (exclusive-or) function Using an SP Network
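A toy version of such an SP network can be sketched in Python. The first four S-box entries follow the example above; the rest of the S-box, the bit permutation, and the round keys are made-up illustrations:

```python
# Toy 16-bit SP network; constants are illustrative, not from any standard.
SBOX = [0x9, 0x4, 0xC, 0x0, 0xA, 0x8, 0x3, 0xD,
        0x1, 0xF, 0x7, 0x5, 0xE, 0xB, 0x2, 0x6]   # a permutation of 0..15
PERM = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]  # bit shuffle

def substitute(block):
    # run each 4-bit nibble of the 16-bit block through the S-box
    return sum(SBOX[(block >> (4 * i)) & 0xF] << (4 * i) for i in range(4))

def permute(block):
    # move bit i of the input to position PERM[i]
    out = 0
    for i in range(16):
        out |= ((block >> i) & 1) << PERM[i]
    return out

def sp_round(block, round_key):
    # substitution (confusion), permutation (diffusion), key mixing via XOR
    return permute(substitute(block)) ^ round_key

def encrypt(block, round_keys):
    for k in round_keys:
        block = sp_round(block, k)
    return block
```

Each round is invertible (the S-box is one-to-one, the permutation just moves bits, and XOR undoes itself), so the whole network can be decrypted by running the inverse steps in reverse order.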
Why use an SP network? Why not do a direct mapping from 16-bit code to 16-bit code? This would need 2^20 bits of memory (2^16 × 16 bits)
Using the 4-bit S-boxes needs 64 bits (2^4 × 4 bits)
While a 16-bit mapping could be stored in memory, 16-bit block ciphers are far from secure
In order to get a decent level of security, you need ciphers with a block size of at least 64, 128, or even more bits 2^64 × 64 bits ≈ 100 million Terabytes Using an SP Network (2)
Main reason to use an SP network: saves space
However, in order to make an SP network block cipher secure, the following conditions have to be met: The block size needs to be large enough
We need to have enough rounds
The S-boxes need to be chosen carefully Block Size
The smaller the block size, the more vulnerable a block cipher is to attacks If an attacker can obtain (or guess) some plaintext messages, they can create a dictionary
With a dictionary any message encrypted with a certain key can then be decrypted
The smaller the block size, the easier it is to complete such a dictionary Number of Rounds
Good ciphers exhibit the avalanche effect This means that a slight change in the input (e.g. toggling one bit) has a major impact on the output
Ideally you want to create a lot of diffusion, i.e. spread the information through the whole ciphertext
Having just one or two rounds is not adequate, a slight change in the input will only have a minor impact on the output
The exact number of rounds that are needed also depends on how quickly an SP network diffuses the data Choice of S-Boxes
S-boxes need to add enough randomness to be effective
Let’s look at an example of an S-box with 3 input and 3 output bits: 000 → 010, 001 → 000, 010 → 011, 011 → 001, 100 → 101, 101 → 111, 110 → 100, 111 → 110 The most significant bit passes unchanged through the S-box
This would be a bad choice for an S-box Concrete Algorithms
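The weakness can be verified mechanically: the table below is the S-box from the example, and the check confirms the top bit always leaks through.

```python
# The 3-bit S-box from the example above
sbox = {0b000: 0b010, 0b001: 0b000, 0b010: 0b011, 0b011: 0b001,
        0b100: 0b101, 0b101: 0b111, 0b110: 0b100, 0b111: 0b110}

# For every input, the top bit of the output equals the top bit of the input,
# so the S-box adds no confusion at all for that bit
leaks_msb = all((x >> 2) == (y >> 2) for x, y in sbox.items())
```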
We’ll now have a look at algorithms based on SP networks
These algorithms are still in use (or have been used until recently) Data Encryption Standard (DES)
Advanced Encryption Standard (AES) Data Encryption Standard
Data Encryption Standard (DES) is a well-known block cipher
The original algorithm encrypts the plaintext in 64-bit blocks using a 64-bit key The key is actually 56 bits, every 8th bit is used for parity
The fundamental building block of DES is a substitution followed by a permutation on a 32-bit block DES uses an SP network on half of the block at a given time
DES has 16 rounds, applying the same technique on a plaintext block 16 times Data Encryption Standard (2)
First step is an initial permutation of the plaintext
Then there are 16 rounds of identical operations: A block is broken up into a left and a right half
In function f , the data is combined with the (current round) key and then run through an SP network
After the 16th round, there is a final permutation Data Encryption Standard (3)
The function f is also called the Feistel function (named after its inventor)
At first glance this might look overly complicated What’s the advantage of all this criss-crossing?
Actually, it’s a very clever scheme that allows the same network to be used for encryption and decryption The only difference is that during decryption the round keys are applied in reverse order Data Encryption Standard (4)
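The Feistel criss-crossing can be sketched to show why decryption is just encryption with the round keys reversed; the round function below is a made-up toy, not the real DES f:

```python
def f(half, key):
    # stand-in round function: any function works here, it need not be invertible
    return (half * 31 + key) & 0xFFFFFFFF

def feistel_encrypt(left, right, round_keys):
    # each round: new left = old right, new right = old left XOR f(old right, key)
    for k in round_keys:
        left, right = right, left ^ f(right, k)
    return left, right

def feistel_decrypt(left, right, round_keys):
    # same structure run backwards, with the round keys in reverse order
    for k in reversed(round_keys):
        left, right = right ^ f(left, k), left
    return left, right
```

Because the XOR of f cancels itself out, the structure inverts even though f itself is not invertible; that is the trick that lets DES reuse one circuit for both directions.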
DES was a certified standard for more than 20 years
However, there have been a couple of successful attacks on DES Deep Crack ($250,000 custom-built machine) breaks a DES key in 22 hours and 15 minutes (1999)
COPACOBANA ($10,000 custom-built machine) breaks DES in 9 days (2006)
DES has been replaced by the Advanced Encryption Standard (AES) AES uses a more sophisticated substitution/permutation algorithm on a 4x4 matrix of bytes Advanced Encryption Standard
Each input byte is sent through an S-box
Then the rows of the matrix are shifted
In a more complicated process the values in each column of the matrix are mixed
In a final step the current round key is added
This constitutes one round in AES Advanced Encryption Standard (2)
As we have 16 input bytes, AES has a block size of 16 × 8 = 128 bits.
AES can be used with different key sizes: 128, 192, or 256 bits
Depending on the key size there are 10, 12, or 14 rounds, respectively
AES became a certified standard in 2002 It is the first published algorithm approved by the NSA for top secret documents (using 256-bit keys)
We will move away from SP networks now and look at other types of algorithms The “Perfect” Encryption Algorithm
There is an unbreakable encryption scheme, it’s called a one-time pad
It’s basically a Vigenère cipher with a truly random, non-repeating key Encryption is done by adding plaintext letter and a one-time pad key character modulo 26
Each key letter is used exactly once and for only one message
The sender encrypts the message and then destroys the used part of the key
The receiver (who needs an identical pad) follows the same procedure for decryption One-Time Pads
Here is an example of a one-time pad (using numbers): One-Time Pads (2)
The crucial point is having a truly random sequence of key characters This makes statistical analysis impossible, every potential message has the same probability
Generating truly random numbers is harder than it looks: Most algorithms generate pseudo-random numbers
Better results can be achieved by integrating random events into the algorithm, e.g. mouse movements, disk operations, arrival times of network packets
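A mod-26 one-time pad can be sketched in a few lines. Note the caveat: Python's secrets module is a cryptographically strong pseudo-random generator, not a source of true randomness, so this is an illustration of the mechanics only:

```python
import secrets

def otp_keygen(n):
    # one random value 0-25 per message letter; each key is used only once
    return [secrets.randbelow(26) for _ in range(n)]

def otp_encrypt(msg, key):
    # add key values to letters modulo 26 (Vigenère with a non-repeating key)
    return "".join(chr((ord(c) - 65 + k) % 26 + 65) for c, k in zip(msg, key))

def otp_decrypt(ct, key):
    return "".join(chr((ord(c) - 65 - k) % 26 + 65) for c, k in zip(ct, key))
```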
Another problem is exchanging the key between sender and receiver RSA Public-Key Algorithm
RSA is a very popular and secure encryption algorithm RSA stands for the names of the inventors: Rivest, Shamir, Adleman
Has a public and a private key (asymmetric algorithm)
The public and private key are functions of a pair of large prime numbers Determining the Keys
Select two large prime numbers p and q at random
Compute their product n = p × q
Compute Euler totient f = (p − 1)(q − 1)
(Randomly) choose an encryption key e such that e and f are coprime, i.e. they have no factors in common Another way of formulating this: the greatest common divisor (gcd) of e and f is 1
Compute the decryption key d such that e × d = 1 mod f , i.e. e × d divided by f leaves remainder 1
The public key consists of e and n
The private key consists of d and n Determining the Keys (2)
Let’s do an example: Choose p = 47 and q = 71
n = pq = 3337
f = (p − 1)(q − 1) = 3220
A valid value for e is 79; this can be checked using the Euclidean algorithm
d = 1019, as (79 × 1019) mod 3220 = 1; d can be computed using the extended Euclidean algorithm
Now that we have the keys, what do we do with them? First step: destroy p and q, otherwise your private key is insecure Encrypting/Decrypting a Message
Encrypting a message is done by exponentiating the message M with e (modulo n): C = Me mod n
Decrypting a ciphertext is done by exponentiating it with d (modulo n): M = Cd mod n
Example: 53^79 mod 3337 = 65, and 65^1019 mod 3337 = 53
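The worked example can be checked directly with Python's built-in three-argument pow, which performs modular exponentiation:

```python
# The slides' RSA example: p = 47, q = 71.
p, q = 47, 71
n = p * q                       # 3337
f = (p - 1) * (q - 1)           # 3220
e, d = 79, 1019
assert (e * d) % f == 1         # d is the inverse of e modulo f

C = pow(53, e, n)               # encrypt: M^e mod n
M = pow(C, d, n)                # decrypt: C^d mod n
assert (C, M) == (65, 53)
```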
Caveat: M and n have to be coprime; if the message is too large, split it up into blocks (you’re safe if M < p and M < q) Encrypting/Decrypting a Message (2)
Why does this work? Number theory! Euler showed in 1736 that if M and n are coprime, then M^f = 1 mod n
Furthermore, M^(f+s) = M^s mod n, i.e. the exponent can be taken modulo f
So, encrypting and then decrypting by computing (M^e)^d mod n yields M^1 mod n, which is M Remember that ed = 1 mod f Security
Now we know that RSA will reconstruct the original message again
Why is it secure? Can’t someone figure out d given n and e? This involves integer factoring, i.e. figuring out the factors p and q of n
The prime numbers p and q used in RSA are really big, so that n is sufficiently large: an n of around 300 bits can be factored in a few hours on a PC
A size of 1024 bits for n is probably safe in the near future
No known algorithm for integer factoring runs in polynomial time; the best known algorithms are super-polynomial (though sub-exponential) in the size of the input What Now?
O.k., RSA is secure, but we are left with two problems: If integer factoring is so difficult, how do we figure out p and q, which have to be prime?
If the numbers involved are so huge, how do we encrypt and decrypt efficiently? Exponentiating numbers like 39579930985039858787943 modulo 780323454455344457457 doesn’t look easy Determining Prime Numbers
Factoring numbers is difficult, however, testing a number for primality is easier
There are quick probabilistic tests: Take a whole range of numbers x
Test whether a^(x−1) = 1 mod x for different values of a
If x is a prime, then this holds for all a not divisible by x
Repeating this for several values of a gives a high probability for having found a prime
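A minimal sketch of such a probabilistic test (the classic Fermat test; note that rare composites, the Carmichael numbers, can fool it for all coprime bases):

```python
import random

def probably_prime(x, trials=20):
    """Fermat test: if x is prime, a^(x-1) = 1 (mod x) holds for every a
    not divisible by x. A composite x usually fails for a random a, so
    several passing trials give high confidence (Carmichael numbers are
    a rare exception that fools this test)."""
    if x < 4:
        return x in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, x - 1)
        if pow(a, x - 1, x) != 1:
            return False          # witness found: x is definitely composite
    return True                   # probably prime

assert probably_prime(47) and probably_prime(71)
assert not probably_prime(3337)   # 3337 = 47 * 71
```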
There are also deterministic tests, e.g. APR (Adleman, Pomerance, Rumely) These tests are usually very complicated to implement Efficient Exponentiation
This still leaves us with the problem of determining Me mod n and Cd mod n efficiently
As e and d are usually huge numbers, you don’t want to do a brute-force computation: that would mean multiplying M with itself e times and then taking the result modulo n
Don’t try to do this with your calculator
We can use a technique called repeated squaring to cut down on the number of multiplications Repeated Squaring
Assume we want to compute x59
It’s not too hard to compute the following series of numbers: x^2, x^4, x^8, x^16, x^32, ... Multiplying x with itself gives you x^2, which multiplied with itself gives you x^4, which multiplied with itself gives you x^8, and so on
Incidentally, the exponents for the numbers we can compute quickly are all powers of 2
We can use the binary representation of the exponent and multiply together the powers of x that are present in that representation: 59 = 111011 in binary = 32 + 16 + 8 + 2 + 1, so x^59 = x^32 × x^16 × x^8 × x^2 × x^1 Modular Exponentiation
Repeated Squaring reduces the number of multiplications tremendously
However, the numbers can still become very big
Math comes to the rescue again: we can take modulo n after each multiplication
For example, if we want to compute 2^40 mod 100: 40 = 32 + 8, so 2^40 = 2^32 × 2^8 ... 2^8 = 256 = 56 mod 100; 2^16 = 56 × 56 = 3136 = 36 mod 100; 2^32 = 36 × 36 = 1296 = 96 mod 100; 2^40 = 96 × 56 = 5376 = 76 mod 100 Some Additional Stuff on RSA
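Both ideas together, repeated squaring plus reducing modulo n after every multiplication, can be sketched as follows:

```python
def modpow(base, exp, mod):
    """Square-and-multiply: scan the bits of exp, squaring at each step
    and multiplying `base` into the result when the bit is set; reduce
    mod `mod` after every multiplication to keep the numbers small."""
    result = 1
    base %= mod
    while exp:
        if exp & 1:                  # bit set -> multiply this power in
            result = (result * base) % mod
        base = (base * base) % mod   # next square: x, x^2, x^4, ...
        exp >>= 1
    return result

assert modpow(2, 40, 100) == 76      # the slides' example
assert modpow(53, 79, 3337) == 65    # the RSA example from earlier
```

Python's built-in three-argument pow does the same thing, so in practice you would write pow(2, 40, 100).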
While the techniques mentioned above make RSA usable in practice, it’s still slower than DES or AES
The prime numbers p and q should not be ‘too close’: p − q should not be less than 2n^(1/4)
The private key needs to be ‘large enough’: d should be greater than n^(1/4)/3
The public key can be small, but then extra care has to be taken when encrypting messages: in order to compensate, the message has to be made larger by padding it Summary
This represents only a small glimpse into the world of cryptography
There are countless other cryptosystems out there: Anubis, Grand Cru, Khufu, Serpent, Twofish, just to name a few
Going deeper into this would involve some very sophisticated math RSA is one of the simpler algorithms in terms of understandability Chapter 8 Introduction to Cryptographic Protocols Introduction
Just having a high-tech cryptosystem in place does not guarantee security
Cryptography has to be used in a certain way to achieve security
That’s where cryptographic protocols come into play
A protocol is a series of steps, involving two or more parties, designed to accomplish a task Protocols
Series of steps: Each step must be well-defined
There must be specified action for every possible situation
Everyone involved must know the steps and follow them
Two or more parties: Parties can be friends and trust each other or adversaries and mistrust each other
Accomplish a task: This can involve sharing (parts of) a secret, confirming an identity, signing a contract, etc. Dramatis Personae
In order to simplify scenarios, over time the following cast has evolved: Alice: 1st participant
Bob: 2nd participant
Carol, Dave: more participants for multi-party protocols
Eve: Eavesdropper
Mallory: malicious active attacker
Trent: trusted arbitrator Arbitrated Protocols
An arbitrator is a disinterested third party trusted to complete the protocol; he has no allegiance to any party involved
All people participating trust that he is acting honestly and correctly
Arbitrators can help complete protocols between parties that don’t trust each other
[Diagram: Trent as trusted arbitrator between Alice and Bob] Arbitrated Protocols (2)
In the real world, lawyers, public notaries, and banks act as arbitrators
For example, Bob can buy a car from Alice using an arbitrated protocol 1 Bob writes a check and gives it to the bank (Trent)
2 Bank puts enough money on hold to cover check and certifies the check
3 Alice gives the title to Bob and Bob gives the certified check to Alice
4 Alice deposits the check
This works, because Alice trusts the bank’s certification Arbitrated Protocols (3)
There are some problems with arbitrated protocols in the virtual world: It’s more difficult for people to trust a faceless entity somewhere in the network
An arbitrator can become a bottleneck, as he has to deal with every transaction; this leads to even more delay (there’s always some delay due to the arbitrator)
Lots of damage can be caused if arbitrator is subverted
Someone has to pay for running an arbitration service Adjudicated Protocols
Arbitrators have high costs, so arbitrated protocols can be split into two sub-protocols: A non-arbitrated part
An arbitrated part that is executed only if there is a dispute
This special kind of arbitrator is called an adjudicator
[Diagram: Alice and Bob each submit evidence to Trent, the adjudicator] Adjudicated Protocols (2)
In the real world, judges are professional adjudicators
Alice and Bob can enter a contract without a judge; a judge only sees the contract if it is brought before a court 1 Alice and Bob negotiate the terms of the contract
2 Alice signs the contract
3 Bob signs the contract The following is only executed in case of a dispute: 4 Alice and Bob appear before a judge
5 Alice presents her evidence
6 Bob presents his evidence
7 The judge rules on the evidence Adjudicated Protocols (3)
Issues with adjudicated protocols in the virtual world: protocols rely in the first instance on the parties being honest
However, if someone suspects cheating, the protocol provides enough evidence to be able to detect this
In a good adjudicated protocol, this evidence also identifies the cheating party
Instead of preventing cheating, adjudicated protocols detect cheating The (inevitability of) detection acts as a deterrent Self-Enforcing Protocols
A self-enforcing protocol does not need an arbitrator or adjudicator
The protocol itself guarantees fairness
If one party tries to cheat, the other party is able to detect this immediately
The protocol then stops and/or punishes the cheating party
[Diagram: Alice and Bob interacting directly, without a third party] Self-Enforcing Protocols (2)
A well-known self-enforcing protocol is for dividing up things, e.g. a cake: 1 Alice cuts the cake into two pieces
2 Bob chooses which piece to take
Ideally, it would be nice to have self-enforcing protocols for every possible situation
Unfortunately, that’s not possible Basic Cryptographic Protocols
In the following we’re going to have a look at some basic protocols involving cryptography
These cover a range of different applications: Communicating securely
Signing documents
Authenticating users
Exchanging keys
We will also have a look at potential weaknesses of these protocols Symmetric Crypto Communication
An obvious application is secure communication by encrypting messages
Using symmetric cryptography, a protocol could look like this: 1 Alice and Bob agree on a cryptosystem
2 Alice and Bob agree on a key
3 Alice encrypts her plaintext message using the above cryptosystem and key
4 Alice sends ciphertext to Bob
5 Bob decrypts ciphertext using the above cryptosystem and key Symmetric Crypto Communication (2)
What are some of the problems of this protocol? Step 2 has to take place in secret (steps 1 and 4 can take place in public), assuming that Alice and Bob use a cryptosystem in which all the security relies on the key (not the algorithm)
For world-spanning communication, this is problematic Symmetric Crypto Communication (3)
Problems involving compromised secrecy If the key is compromised (stolen, guessed, extorted, bribed, etc.), then Eve can decrypt all messages encrypted with that key
Mallory could intercept messages and send his own, pretending to be Alice or Bob
This protocol assumes that Alice and Bob trust each other: Either one could claim that the key has been compromised and publish the communication anonymously Symmetric Crypto Communication (4)
Yet another problem is key management: If each pair of users in a network has its own key, then the total number of keys grows quadratically with the number of users
For example, 10 users require 45 different keys; 100 users require 4950 keys
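The count is simply the number of pairs, n(n − 1)/2, which a couple of lines confirm:

```python
# n users, one symmetric key per pair of users: n * (n - 1) / 2 keys.
def keys_needed(users: int) -> int:
    return users * (users - 1) // 2

assert keys_needed(10) == 45       # the slides' figures
assert keys_needed(100) == 4950
```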
Sharing keys among users is not a solution: If someone leaves the network or is not trusted anymore, all users sharing keys with that person have to change their keys Asymmetric Crypto Communication
Let’s have a look at how this communication works with a public-key cryptosystem 1 Alice and Bob agree on a public-key cryptosystem
2 Bob sends Alice his public key
3 Alice encrypts her plaintext message using Bob’s public key
4 Alice sends ciphertext to Bob
5 Bob decrypts ciphertext using his private key Asymmetric Crypto Communication (2)
What has changed? The key exchange does not have to take place in secret anymore Even if Eve listens in on step 2 and 4, she only has the public key and the ciphertext
This will not help her in recovering the private key or the plaintext
Everyone who wants to communicate with Bob uses Bob’s public key There’s no need to have a separate key for each pair of users Asymmetric Crypto Communication (3)
So, everything is fine now? Unfortunately, no. . . Enter Mallory (who is more powerful than Eve) In step 2, Mallory intercepts Bob’s public key and sends his own to Alice
Alice encrypts her message using Mallory’s public key (thinking it’s Bob’s) and sends it to Bob
In step 4, Mallory intercepts Alice’s ciphertext, decrypts it with his own private key, re-encrypts it with Bob’s public key, then sends it to Bob
Mallory could even change the message
Alice and Bob think they have a secure connection
This is called a man-in-the-middle attack (we’ll come back to it later) Hybrid Cryptosystems
At first glance it seems a public-key algorithm is always better than a symmetric algorithm
So why not use public-key all the time? Public-key algorithms are much slower than most symmetric algorithms
Public-key algorithms are vulnerable to chosen-plaintext attacks: If there are only a few possible messages, then it’s possible to recover the plaintext (although not the private key)
Encrypt all possible messages with the public key
If one of the generated ciphertexts matches the message ciphertext, then you know what was sent Hybrid Cryptosystems (2)
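A sketch of this attack, using the toy RSA parameters from earlier; the encodings for a two-message space (“YES”/“NO”) are made up for illustration:

```python
# Bob's public key (the toy RSA parameters used earlier).
n, e = 3337, 79

# Hypothetical encodings of a tiny message space -- illustrative only.
candidates = {"YES": 831, "NO": 208}

intercepted = pow(candidates["NO"], e, n)    # the ciphertext Eve observes

# Eve encrypts every candidate with the public key and matches.
recovered = [m for m, code in candidates.items()
             if pow(code, e, n) == intercepted]
assert recovered == ["NO"]       # plaintext recovered without the private key
```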
It’s possible to combine a symmetric algorithm with a public-key algorithm in a protocol 1 Alice and Bob agree on a public-key cryptosystem and a symmetric cryptosystem
2 Bob sends Alice his public key
3 Alice generates a random session key for the symmetric algorithm
4 She encrypts the session key using Bob’s public key
5 Alice sends ciphertext to Bob
6 Bob decrypts ciphertext using his private key to recover the session key
7 Alice and Bob can now continue with the symmetric algorithm Hybrid Cryptosystems (3)
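A hedged sketch of the protocol with the toy RSA key from earlier; the XOR keystream below is only a stand-in for a real symmetric cipher such as AES or DES, not a secure algorithm:

```python
import random

n, e, d = 3337, 79, 1019        # Bob's toy RSA key pair

def sym(data: bytes, key: int) -> bytes:
    """Stand-in symmetric cipher: XOR with a key-seeded pseudo-random
    stream. The same call encrypts and decrypts. NOT secure -- it only
    plays the role of AES/DES in this sketch."""
    stream = random.Random(key)
    return bytes(b ^ stream.randrange(256) for b in data)

session_key = random.randrange(2, n)      # step 3: Alice picks a session key
wrapped = pow(session_key, e, n)          # step 4: encrypted with Bob's e
unwrapped = pow(wrapped, d, n)            # step 6: Bob recovers it with d
assert unwrapped == session_key

ciphertext = sym(b"meet at noon", session_key)     # step 7: symmetric traffic
assert sym(ciphertext, unwrapped) == b"meet at noon"
```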
Using public-key cryptography for distributing a key improves key management The key exchange can take place in public, even though a symmetric algorithm is used for the bulk of communication
The session key is only used for a limited time and then destroyed The longer a key sits around, the higher the chances that it is vulnerable to compromise
The public-key cryptosystem is only used very sporadically, generating a very small number of ciphertexts
The less data, the harder it is to break a code One-Way Hash Functions
A one-way hash function is also known under different names: message digest, fingerprint, (cryptographic) checksum, message integrity check, manipulation detection code
A hash function is a function that takes a variable-length input string (called pre-image) and converts it to a fixed-length (generally smaller) output string (called hash value) A simple example is a function that takes the pre-image and returns a byte consisting of the XOR (exclusive-or) of all input bytes
Can be used as a quick check if two pre-images are the same: same pre-images have the same hash value One-Way Hash Functions (2)
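The XOR example can be written in a few lines; it also shows that equal hash values only suggest, not prove, equal pre-images:

```python
from functools import reduce

# Toy hash: XOR all input bytes into a single output byte.
def xor_hash(data: bytes) -> int:
    return reduce(lambda acc, b: acc ^ b, data, 0)

assert xor_hash(b"abc") == 96                  # 97 ^ 98 ^ 99
assert xor_hash(b"abc") == xor_hash(b"cba")    # different pre-images collide
```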
A one-way hash function is a hash function that works in one direction It’s easy to compute a hash value from a pre-image, but it is hard to generate a pre-image that hashes to a particular value
The aforementioned XOR function is not one-way (given a hash-value, it’s easy to figure out a pre-image that will generate it)
A good one-way hash function is also collision-free: it’s hard to generate two pre-images with the same hash value
The hash function is public, the security lies in its one-wayness: it is computationally infeasible to find a pre-image hashing to a particular hash value One-Way Hash Functions (3)
If you want to verify that someone has a particular file (without sending the file itself), ask for its hash value
If they send you the correct hash value, then it is almost certain that they have that file
Can also be used to verify data transmission via an unreliable channel Send message and its hash value, if they don’t match, something has gone wrong
MD5 is a widely used one-way hash function (however, MD5’s collision-resistance is known to be broken)
SHA-2 is a safer bet in terms of security Authentication
Another important area of cryptographic protocols is authentication
When Alice logs into a computer (or ATM, or telephone banking system, or any other system) how does the host know who she is?
Traditionally passwords are used for this
Alice enters her password and the host confirms that it is correct
Both Alice and the host know a secret piece of knowledge, which the host requests every time Alice logs in Authentication (2)
The host does not actually have to know the password (i.e. store it in cleartext) That’s why sysadmins often cannot tell you your password, they don’t have it (they can only reset it)
It just has to be able to differentiate valid passwords from invalid ones
This can be done using one-way hash functions Authentication Using Hash Values
Instead of storing passwords, a host stores one-way hash values of the passwords 1 Alice sends the host her password
2 The host performs a one-way hash function on the password
3 Host compares the result to the stored value
If someone breaks into the host, they cannot steal the passwords
The one-way hash function cannot be reversed to obtain the passwords Dictionary Attacks
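A minimal sketch of steps 1–3 using SHA-256 from Python's standard library (real systems additionally use salts and deliberately slow hash functions):

```python
import hashlib

# The host stores only the hash of each password, never the password itself.
stored = {"alice": hashlib.sha256(b"correct horse").hexdigest()}

def login(user: str, password: bytes) -> bool:
    # steps 2-3: hash the submitted password, compare with the stored value
    return stored.get(user) == hashlib.sha256(password).hexdigest()

assert login("alice", b"correct horse")
assert not login("alice", b"wrong guess")
```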
A file of passwords encrypted with a one-way function is still vulnerable
Mallory can take a list of the most popular passwords and a dictionary and run it through the one-way hash function Usually each term is used in different variations (e.g. concatenating numbers, using it backwards)
After stealing the password file, he can then compare the password list with the generated hash values to find matches
This is surprisingly successful, as many passwords are chosen poorly Password Eavesdropping
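A sketch of the attack; the stolen hash file, word list, and variations are all made up for illustration:

```python
import hashlib

# A stolen file of password hashes -- contents illustrative only.
stolen = {"bob": hashlib.sha256(b"dragon123").hexdigest()}

wordlist = ["password", "dragon", "letmein"]     # popular-password list

cracked = {}
for word in wordlist:
    # try each term in a few common variations
    for variant in (word, word + "123", word[::-1]):
        h = hashlib.sha256(variant.encode()).hexdigest()
        for user, target in stolen.items():
            if h == target:
                cracked[user] = variant

assert cracked == {"bob": "dragon123"}
```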
So far, so good; however, we still have a serious problem: what happens if Eve listens in on Alice’s communication with the host?
Eve can intercept the password anywhere on the network or on the host before it is hashed For example, early electronic locks (e.g. remote control garage door openers, car locks) sent their serial number in the clear
This was very insecure, anyone able to listen in and record the signal was able to open a door Authentication with Public-Keys
Public-key cryptography can help solve this problem
The host keeps a file of every user’s public key, all users keep their private key
This is how the (straightforward) protocol works: 1 The host sends Alice a random string
2 Alice encrypts the string with her private key and sends it back (along with her name)
3 The host looks up Alice’s public key and decrypts the message with it
4 If the decrypted string matches what the host originally sent, then access is granted Authentication with Public-Keys (2)
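A hedged sketch with the toy RSA pair from the RSA chapter; here “encrypting with the private key” is modular exponentiation with d, and the host verifies with e:

```python
import random

n, e, d = 3337, 79, 1019        # Alice's toy RSA pair; only she knows d

challenge = random.randrange(2, n)        # step 1: host's random challenge
response = pow(challenge, d, n)           # step 2: Alice uses her private key
# steps 3-4: host applies Alice's public key and compares
assert pow(response, e, n) == challenge
```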
This protocol is also called challenge-response
The host sends out a challenge (a random number) to which Alice has to respond in the correct way
The response proves that Alice is in possession of her private key
Important: the private key is never transmitted, only encrypted messages
However, there are still weaknesses: The plaintext and its ciphertext are transmitted
If random numbers issued by the host are not truly random Authentication with Public-Keys (3)
Plaintext/ciphertext: Having access to (a large number of) plaintexts and corresponding ciphertexts makes the job of a cryptanalyst easier
Eve could listen in on Alice’s authentication attempts to obtain information to deduce private key
Random numbers: Mallory (knowing the sequence of future random numbers) could intercept the host’s challenge
Mallory forwards one of the future random numbers, gets it encrypted by Alice; Alice’s login will fail
However, Mallory now has a correct response to a future challenge Improved Protocol
Random number generator has to be implemented very carefully
To avoid the other weakness, use the following protocol: 1 Alice performs a computation on her private key and random number(s) and sends result to host
2 The host sends Alice a different random number
3 Alice performs a computation on her private key, her and the host’s random numbers and sends result to host
4 The host does some computation on the results received from Alice and her public key to verify that she knows the private key
5 If the outcome is correct, access is granted Improved Protocol (2)
Step 1 might seem unnecessary, but is required to prevent certain attacks against the protocol
Basically, Alice proves to the host that she knows her private key, without giving away any information on it and without actually encrypting plaintext with her private key
The weaknesses of the original protocol are also a good reason to not sign just any random document with your private key (it could have been sent by Mallory); a good rule is to first sign a document, and then encrypt it Chapter 9 Further Cryptographic Protocols Digital Signatures
Cryptography’s use is not limited to securing communication
It’s possible to sign electronic documents using a digital signature
Attaching/embedding a (graphical image of a) signature to a document does not help
It would be too easy to manipulate computer files and cut and paste signatures Symmetric Cryptosystem Signatures
Alice wants to sign a digital message and send it to Bob
Possible with the help of Trent, who shares a secret key KA with Alice and a secret key KB with Bob
1 Alice encrypts her message to Bob with KA
2 Alice sends the ciphertext to Trent
3 Trent decrypts the message with KA
4 Trent takes the message, adds a confirmation that it is from Alice, encrypts it with KB
5 Trent sends this bundle to Bob
6 Bob decrypts it with KB (to read the message and Trent’s confirmation) Symmetric Cryptosystem Signatures (2)
Proving to a third person, e.g. Carol, that a message is from Alice would involve Trent again: 1 Bob encrypts Alice’s message and Trent’s confirmation with KB
2 Bob sends the ciphertext to Trent
3 Trent decrypts the message/confirmation with KB
4 Trent checks that the message is indeed Alice’s
5 If this is the case, Trent takes the message, adds a confirmation that it is from Alice, encrypts it with KC
6 Trent sends this bundle to Carol
7 Carol decrypts it with KC (to read the message and Trent’s confirmation) Symmetric Cryptosystem Signatures (3)
In theory, this is a secure way of signing documents
If someone tried to “transfer” a signature to a different document, Trent would notice it
However, there are some problems in practice: Trent needs to store all messages that he confirmed (to be able to check validity of a signature)
Every transaction has to go through Trent, he would quickly become a bottleneck
Trent has to be a trusted entity and he has to be infallible If he makes a mistake, this would undermine other people’s trust in him Public-Key Cryptosystem Signatures
There are better ways for digitally signing a document
Many public-key algorithms (including RSA) can be used for this
This is based on the fact that the private key can also be used for encrypting messages
The public key is then used for decrypting it again 1 Alice encrypts the document with her private key
2 Alice sends ciphertext to Bob
3 Bob decrypts the ciphertext using Alice’s public key Public-Key Cryptosystem Signatures (2)
Why does this work? Bob can verify that the signature is authentic (by decrypting message correctly with public key)
The signature cannot be forged, only Alice knows her private key
The signature is not reusable (cannot be extracted from document and attached to another)
The signed document cannot be changed (any change of the ciphertext would lead to gibberish when applying public key)
There is also a dedicated signature algorithm, DSA (Digital Signature Algorithm), which can only be used for signatures (not encryption) Public-Key Cryptosystem Signatures (3)
Possible problems with this protocol: Works fine if nothing is gained from having another copy of the document
Things change if it is, e.g. a signed digital check Bob could take the check to a bank, which verifies Alice’s signature and transfers the money
However, Bob could keep a copy of the check and take it to another bank later
Consequently, digital signatures often include timestamps, i.e. current date and time is added to document before signing The bank can then compare the timestamp with a list of stored timestamps of used checks Public-Key Cryptosystem Signatures (4)
Non-repudiation is still a problem: Alice can sign a document and later claim that she did not She can claim that her key was compromised (she lost it, someone stole it) and someone else is pretending to be her
There has been a discussion about tamper-resistant modules Prevents Alice from accessing her own private key and accidentally “losing” it
If you need non-repudiation, then you have to fall back on an arbitrated protocol Arbitrated Non-Repudiation
1 Alice signs her message
2 Alice generates a header (containing some identifying information)
3 She concatenates header + signed message and signs this again, then sends it to Trent
4 Trent verifies Alice’s signature, confirms the header, and adds a timestamp to the header + message
5 Trent signs this bundle of information with his own signature and sends it to both Alice and Bob
6 Bob verifies Trent’s signature, the header, and Alice’s signature
7 Alice verifies Trent’s message and confirms it; if it did not originate from her, she raises an alarm Digital Signatures and Encryption
Digital signatures can be combined with encryption: 1 Alice signs a message with her private key
2 Alice encrypts the signed message with Bob’s public key
3 Alice sends ciphertext to Bob
4 Bob decrypts ciphertext with his private key
5 Bob verifies with Alice’s public key and recovers message Key Exchange
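A sketch of the combined protocol with two toy RSA pairs (Alice's is the classic textbook pair p = 61, q = 53; note her modulus must be smaller than Bob's so the signed value fits in his message space):

```python
nA, eA, dA = 3233, 17, 2753    # Alice's keys (p = 61, q = 53)
nB, eB, dB = 3337, 79, 1019    # Bob's keys   (p = 47, q = 71)

M = 65                              # message, must be < nA
signed = pow(M, dA, nA)             # step 1: sign with Alice's private key
cipher = pow(signed, eB, nB)        # step 2: encrypt with Bob's public key
                                    #         (fits because nA < nB)
recovered = pow(cipher, dB, nB)     # step 4: Bob decrypts with his private key
assert pow(recovered, eA, nA) == M  # step 5: verify with Alice's public key
```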
We still have an open problem: How can we safely transmit a (public) key, so that we can be sure it belongs to whom we think it belongs?
One solution is to have a secure database: Anyone may read the content and access a public key
Only Trent is allowed write-access to the database
To make sure that keys are not substituted during transmission, Trent can sign the keys, users can verify Trent’s signature Users of the database just have to make sure (once) that they have received the correct public key of Trent Key Exchange (2)
In this scenario Trent is called a Key Certification Authority (CA) or Key Distribution Center (KDC)
A large and well-known CA is VeriSign
Usually a CA signs a compound message called a certificate containing information about the holder of a public key
When you apply for a certificate with a CA, they check your identity before issuing the certificate Certificates
A certificate contains the following fields: Issuer: identity of the issuer (the CA)
Subject: identity of the owner of the public key
Public key: the public key of the subject
Dates valid: the time period for which the certificate is valid
Type: there are different classes of certificates Depends on the rigor with which identity check is done
Serial number
Fingerprint: a one-way hash function of previous data (more about these functions on next slide)
Signature: fingerprint signed by issuer Signing Fingerprints
Instead of signing documents directly, often the fingerprint is signed
This is done due to efficiency reasons, signing a fingerprint is quicker
If someone manipulates the original document this can be found out quickly: The result of applying the one-way hash function to the document and the signed fingerprint don’t match! Certificate Chains
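A sketch of signing a fingerprint rather than the whole document, again with the toy RSA pair; reducing the SHA-256 digest modulo n is only done here so it fits the toy key size:

```python
import hashlib

n, e, d = 3337, 79, 1019    # the issuer's toy RSA pair

def fingerprint(doc: bytes) -> int:
    # one-way hash of the data, reduced mod n only to fit the toy key
    return int.from_bytes(hashlib.sha256(doc).digest(), "big") % n

doc = b"subject, public key, dates valid, serial number"
signature = pow(fingerprint(doc), d, n)      # issuer signs the fingerprint

# verification: apply the issuer's public key, compare fingerprints
assert pow(signature, e, n) == fingerprint(doc)

# a manipulated document yields a different fingerprint, so the check fails
tampered = fingerprint(b"subject, ATTACKER KEY, dates valid, serial number")
if tampered != fingerprint(doc):             # true barring a rare mod-n collision
    assert pow(signature, e, n) != tampered
```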
If an organization wants its members to have a certificate, it usually doesn’t get them individually
It obtains an organization-wide certificate from a CA and then creates and signs a certificate for each member
So we have a chain of certificates: Member → Organization → CA Checking Certificates
If you want to check a certificate, you would 1 apply the issuer’s public key to the signed fingerprint, decrypting it
2 and compare the decrypted fingerprint with the provided fingerprint
3 optionally create the fingerprint of the certificate using the one-way hash function and compare it, too to verify that certificate hasn’t been tampered with
In case of a chain, you then need to verify the certificate of the issuer
Now we have a problem. . .
Who signs the CA’s certificate? Checking the CA’s Certificate
A CA’s certificate is self-signed, i.e. the issuer and the subject are identical
If Mallory approached you with his certificate signed by VeriSign and he also provided VeriSign’s certificate, would you trust him? Probably not, anyone can create a self-signed certificate
One solution for a CA is to widely publicize the fingerprint of their own certificate It’s not possible to cheat if everyone knows what a CA’s certificate must look like
One more thing: there are Certificate Revocation Lists (CRLs), as certificates can be invalidated Web Browsers and Certificates
Browsers are usually shipped with a number of self-signed certificates already installed
These are certificates that the vendor has verified and trusts
So if you trust the vendor, you can trust the installed certificates
Nevertheless, a browser lets you view installed certificates
install certificates yourself
specify whether you trust a certain certificate
Caveat: a browser does not necessarily check whether a certificate has been revoked Security
There’s never 100% security, but in order to subvert a key exchange involving a trusted third party Mallory would have to Replace Alice’s copy of the KDC public key
Corrupt the database: replace his own keys for the valid keys and sign with his private key
Unfortunately, using a KDC (or CA) is an arbitrated protocol
Can we cut out the man-in-the-middle attack with a self-enforcing protocol?
This can be done using an interlock protocol Interlock Protocol
The interlock protocol has a good chance of foiling a man-in-the-middle attack: 1 Alice sends Bob her public key
2 Bob sends Alice his public key
3 Alice encrypts message using Bob’s public key, she sends half of the encrypted message to Bob
4 Bob encrypts message using Alice’s public key, he sends half of the encrypted message to Alice
5 Alice sends the other half to Bob
6 Bob sends the other half to Alice
7 Alice puts the two halves together and decrypts them
8 Bob puts the two halves together and decrypts them Interlock Protocol (2)
The important point is that half of the message is useless without the other half (that it cannot be decrypted) If the encryption is a block algorithm, half of each block (e.g. every second bit) could be sent
Mallory can still substitute his key for Alice’s and Bob’s in steps 1 and 2
However, when he intercepts Alice’s message in step 3, he cannot decrypt it with his own private key and re-encrypt it with Bob’s public key
He must invent a completely new message and send half of it to Bob
Mallory has the same problem in step 4 Interlock Protocol (3)
By the time he intercepts the second halves in steps 5 and 6, it’s too late to change the invented messages
Alice and Bob won’t send the second half of their messages until they have received the first part from the other person
The conversation between Alice and Bob will turn out to be very different
Mallory has a chance of getting away with it, if he is able to mimic Alice and Bob closely enough
However, that is much harder to do than intercepting and reading messages Real-World Protocols
Let us now have a look at some protocols that are used in real applications Smartcard-based banking We are going to have a look at a range of different protocols used for banking
Kerberos This is a trusted third-party authentication protocol designed for TCP/IP networks
TLS (Transport Layer Security), formerly SSL (Secure Sockets Layer) TLS is a protocol for secure communication between a client and a server Smart Cards and Banking
Everyone uses them, but what happens when you insert your credit/debit/cash card into a terminal?
How does the terminal verify that your card is valid?
Newer cards employ the “Chip and PIN” method to do this, what’s inside the chip? Magnetic Strip Cards
Before Chip and PIN, magnetic strip cards and signatures (or PINs at ATMs) were used
Basically, the magnetic strip contains (some of) the information printed on the card in machine-readable form
It is relatively easy to copy the information on the magnetic strip, then all that’s left to do is to forge the signature
find out the PIN
However, not everything was bad for a customer If a signature turned out to be forged later on, the customer was not liable Weaknesses
Some (early) glitches included: Only the account number on the strip was used: Collecting the discarded ATM slip (which in the beginning had the complete account number on it)
and shoulder-surfing the PIN was enough to create a copy of a card
Putting a one-way hash code of the PIN on the card insecurely: make a copy of your own card, replacing the account number with someone else’s
Then take money from that account with your own PIN (the hash on the copied card still matches your PIN)
The relative ease with which cards can be copied is still a problem Chip and PIN
Chip and PIN was introduced to make it harder to commit card-based fraud
Actually this is not just one protocol, but a whole suite of protocols
The standard is officially called EMV, after the participating institutions Europay, MasterCard, VISA In the meantime, Japan Credit Bureau (JCB) and American Express have joined as well
The complete standard consists of four books containing hundreds and hundreds of pages http://www.emvco.com/specifications.aspx EMV
EMV can be seen as a “construction kit” for building payment systems Word of warning: choosing building blocks at random is probably going to result in a very insecure system
We are going to look at the three main protocols Static Data Authentication (SDA)
Dynamic Data Authentication (DDA)
Combined Data Authentication (CDA)
Basic premise: a customer card contains a smartcard chip with the capability to verify a PIN
authenticate a transaction SDA
[Diagram: the certification authority holds key pair (dCA, eCA); the issuer holds private key dI and public key eI, with eI signed using dCA; the card carries static application data signed with dI; the smart card communicates with the acquirer's terminal.]
SDA (2)
Usually the involved parties are the following: Issuer: who gives you the card (e.g. bank)
Certification authority: scheme operator (e.g. VISA)
Acquirer: participant in the scheme (e.g. shops, ATMs)
We’ll have a look at the protocol on the next slide
Basically, terminal verifies that the static application data (put on the card when it was created) has not been manipulated
PIN is correct before going ahead with a transaction SDA Protocol
What follows is a step-by-step description of what happens when a card gets inserted into a terminal
1 Terminal uses CA’s public key eCA (which it has to obtain from CA) to verify that issuer’s public key eI provided by card is valid
2 Terminal uses public key eI to verify that data on card is valid (and has not been modified)
3 Card user is authenticated via PIN: entered PIN is sent (in cleartext) to card, which verifies correctness
Purchase, cash withdrawal, etc. can now go ahead SDA Protocol (2)
4 Terminal sends transaction data (merchantID, amount, serial number, etc.) to card
5 Card encrypts this using a symmetric key shared with the issuer, creating an application cryptogram (or AC)
6 Next step is handled differently, depending on whether 1 terminal is online: terminal sends AC to issuer for verification (checks for sufficient funds, card reported as stolen, etc.)
2 terminal is offline: if amount is below a certain threshold (“floor limit”) 1 transaction is processed (AC is kept as evidence that card was used)
2 otherwise, transaction is aborted Weaknesses of SDA
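The SDA transaction flow just described can be sketched in Python. This is a minimal simulation, not EMV code: signatures are stood in for by HMACs (real EMV uses RSA, so "verifying with the public key" is modelled here as recomputing the MAC with the signing key), and all key values, card data, and transaction fields are invented.

```python
import hmac, hashlib

def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

# Invented keys; HMACs simulate the RSA signatures of real EMV
d_ca = b"CA-signing-key"                 # CA key, used to certify issuers
d_issuer = b"issuer-signing-key"         # issuer key, used to sign card data
k_card_issuer = b"card-issuer-shared"    # symmetric key shared by card and issuer

# Card personalisation (done once, when the card is created)
e_issuer = b"issuer-public-key"
card = {
    "e_issuer": e_issuer,
    "e_issuer_cert": mac(d_ca, e_issuer),             # e_I signed with d_CA
    "static_data": b"account=1234,expiry=12/14",
    "pin": "4321",
}
card["static_data_sig"] = mac(d_issuer, card["static_data"])

# Steps 1-2: terminal verifies the certificate chain on the card
assert hmac.compare_digest(card["e_issuer_cert"], mac(d_ca, card["e_issuer"]))
assert hmac.compare_digest(card["static_data_sig"], mac(d_issuer, card["static_data"]))

# Step 3: the entered PIN is sent to the card in cleartext; card compares
entered_pin = "4321"
assert entered_pin == card["pin"]

# Steps 4-5: card MACs the transaction data with the key shared with the
# issuer, producing the application cryptogram (AC)
transaction = b"merchant=42,amount=9.99,serial=17"
ac = mac(k_card_issuer, transaction)

# Step 6 (online case): the issuer recomputes the AC to verify it
assert hmac.compare_digest(ac, mac(k_card_issuer, transaction))
print("SDA transaction approved")
```

Note how the terminal never authenticates the card itself, only data that was signed once at personalisation time; this is exactly the gap that "yescards" exploit.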
Backwards compatibility with magnetic strip cards EMV has not been rolled out on a world-wide basis (e.g. United States hasn’t joined)
Even if EMV is used as a standard (such as in the UK), magnetic strip is sometimes used as fallback (e.g. if chip doesn’t work)
This makes it as vulnerable as magnetic strip cards
Eavesdropping on communication between card and terminal Criminals have used sniffing devices put between separate card readers and PIN pads
Tampered terminals have been found Weaknesses of SDA (2)
Picture below shows a terminal with an inserted wiretap Weaknesses of SDA (3)
“Yescards”: cards programmed to accept any PIN Only works in offline mode and relies on the fact that merchant cannot read the application cryptogram
Instead of generating a valid AC, card returns some random data that looks like an AC
However, you need a valid certificate (eI signed with dCA and card data signed with dI) Can be obtained by listening in on the communication of a valid card with a terminal DDA
Compared to SDA there is one additional public/private key pair in DDA This key pair, labeled with SC (for smart card), is used by the card
While the other private keys are only used for signing messages, dSC is actually stored on the card For this to work it has to be stored securely, under no circumstances may it leave the card
In addition to this, the smart card needs to be able to run public key crypto-algorithms This needs more computing power than the symmetric algorithm used in SDA DDA (2)
[Diagram: as in SDA, the certification authority holds (dCA, eCA) and the issuer holds (dI, eI) with eI signed using dCA; the card additionally holds its own key pair: public key eSC signed with dI, and private key dSC stored securely on the card. The smart card communicates with the acquirer's terminal.]
DDA Protocol
Let’s have a look at the protocol 1 Terminal sends challenge to card
2 Card signs this with its private key and sends it back
This assures the terminal that the card is present (somewhere)
3 Terminal verifies issuer’s public key eI on card with eCA
4 Terminal verifies card’s public key eSC with eI
Now terminal can actually verify that its challenge was correctly answered 5 Card user is authenticated via PIN: PIN is not sent in cleartext, but encrypted with eSC
Rest of steps as in SDA before CDA
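The challenge-response at the heart of DDA can be sketched as follows. Again a hedged simulation: the card's RSA signature is modelled by an HMAC under dSC (so "verification" recomputes the same MAC), and the key and challenge values are invented.

```python
import hmac, hashlib, os

def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

# The card's private key d_SC never leaves the chip; in this sketch both
# "sign" and "verify" recompute the same MAC, simulating RSA
d_sc = b"card-private-key"

# Step 1: terminal sends a fresh random challenge to the card
challenge = os.urandom(16)

# Step 2: card signs the challenge with d_SC and sends it back
response = mac(d_sc, challenge)

# Steps 3-4 (certificate-chain checks as in SDA) give the terminal an
# authentic copy of the card's public key; it can then verify the response
assert hmac.compare_digest(response, mac(d_sc, challenge))

# A response recorded from an earlier session does not verify against a
# new challenge, so replay fails (up to a negligible collision chance)
assert not hmac.compare_digest(response, mac(d_sc, b"some other challenge"))
print("card is live")
```

Because the challenge is fresh each time, an eavesdropped transcript cannot be replayed the way SDA's static data can.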
Top-end protocol in terms of security
Similar to DDA Main difference: the application cryptogram is also signed by card using its private key (in addition to encrypting it)
This makes sure that the card that was authenticated earlier is also used for the transaction Otherwise, if you’re able to switch cards, transaction could be done with different card under DDA Further Weakness
Even CDA is not 100% secure
Vulnerable to a relay attack (tested experimentally by the Computer Lab in Cambridge) Set up bogus terminal in a cafe hooked up via WiFi and a laptop to a bogus card
When someone paid £5 for his cake, the card was connected to the fake card
With this fake card, somewhere else someone purchased a book for £50 (using the credentials of the valid card) Kerberos
The basic protocol isn’t too hard to understand: 1 A client requests a ticket for a Ticket Granting Service (TGS) from a Kerberos server
2 The ticket is sent to the client (after authentication)
3 To use a particular server, client requests a ticket for a particular server from TGS
4 If user is allowed to access server, TGS issues a ticket
5 Client presents this ticket to the server
Kerberos
[Diagram: client, Kerberos server, TGS, and server, with arrows numbered 1-5 corresponding to the steps above.]
Kerberos (2)
We’ll look at some details in just a moment
You might ask yourself what the Ticket Granting Service is for Kerberos could check if a user is allowed to access a service and issue ticket for Server directly
Functionality is split between Kerberos and TGS to share workload: Kerberos does authentication
TGS does access control for services
Kerberos is based on symmetric cryptography: Kerberos shares a different secret key with every entity on the network Steps 1 and 2
Client sends their name to Kerberos (+ TGS server to use)
Kerberos looks up client in database, if found Kerberos generates a session key K1 to be used between client and TGS
Kerberos sends back two messages:
The session key K1 encrypted with the client’s secret key A Ticket-Granting Ticket (TGT) including the client name, client network address, ticket validity period, and K1 This is encrypted using the secret key of TGS
The client can decrypt the first message (and get K1) Step 3
The client sends two messages to TGS The TGT received from Kerberos (+ requested service)
An authenticator including the client name and a timestamp, encrypted using K1
The authenticator serves two purposes:
It shows TGS that the client knows K1 The current timestamp prevents an eavesdropper replaying the TGT and authenticator later
TGS decrypts the TGT ticket and uses the extracted session key K1 to decrypt the authenticator Step 4
TGS can now check validity of client (name, network address) and validity of the timestamp (time period)
It also checks that client is allowed to access requested service
If everything turns out to be correct, TGS generates another session key K2 to be used between client and server
TGS sends two messages to the client:
The session key K2 encrypted with K1 A Client-Server Ticket (CST) including the client name, client network address, ticket validity period, and K2 This is encrypted using the secret key of the server Step 4
Step 4 is very similar to Step 2, TGS is now playing the role that Kerberos played before
After receiving the CST, the client can now retrieve the key used in communication with the server
and is ready to contact the requested service Step 5
The client sends two messages to the service The CST received from TGS (+ requested service)
An authenticator including the client name and a timestamp, encrypted using K2
The server decrypts the CST ticket and uses the extracted session key K2 to decrypt the authenticator After checking that everything is correct, it can provide the service to the client
If client wants to have an authentication from the server:
Server encrypts (timestamp + 1) using K2 and sends it back to client
This shows that server knew its own secret key (used for decrypting CST) Weaknesses
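The five Kerberos steps can be traced end to end in a compact sketch. The toy stream cipher built from SHA-256 stands in for the symmetric encryption real Kerberos uses (DES/AES), and all names, keys, and addresses are invented.

```python
import hashlib, json, time

def keystream(key: bytes, n: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def encrypt(key: bytes, obj) -> bytes:            # toy symmetric cipher
    data = json.dumps(obj).encode()
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def decrypt(key: bytes, blob: bytes):
    plain = bytes(a ^ b for a, b in zip(blob, keystream(key, len(blob))))
    return json.loads(plain.decode())

# Long-term secret keys: Kerberos shares one with every entity
k_client, k_tgs, k_server = b"client-key", b"tgs-key", b"server-key"

# Steps 1-2: Kerberos authenticates the client, generates K1, sends back
# K1 encrypted for the client plus a TGT encrypted for the TGS
k1 = b"session-key-1"
msg_for_client = encrypt(k_client, {"K1": k1.decode()})
tgt = encrypt(k_tgs, {"name": "alice", "addr": "10.0.0.5",
                      "valid_until": time.time() + 3600, "K1": k1.decode()})

# Step 3: client recovers K1 and sends the TGT plus an authenticator
k1 = decrypt(k_client, msg_for_client)["K1"].encode()
authenticator = encrypt(k1, {"name": "alice", "timestamp": time.time()})

# Step 4: TGS opens the TGT, checks the authenticator, issues a CST
ticket = decrypt(k_tgs, tgt)
auth = decrypt(ticket["K1"].encode(), authenticator)
assert auth["name"] == ticket["name"] and auth["timestamp"] <= ticket["valid_until"]
k2 = b"session-key-2"
cst = encrypt(k_server, {"name": "alice", "K2": k2.decode()})
msg2 = encrypt(k1, {"K2": k2.decode()})

# Step 5: client presents the CST and a fresh authenticator to the server
k2 = decrypt(k1, msg2)["K2"].encode()
auth2 = encrypt(k2, {"name": "alice", "timestamp": time.time()})
session = decrypt(k_server, cst)
assert decrypt(session["K2"].encode(), auth2)["name"] == session["name"]
print("service granted to", session["name"])
```

Note that the client never sees the TGS's or the server's secret keys: it only ever forwards tickets it cannot read, which is what lets the tickets act as proofs of authorisation.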
There are a few weaknesses: Single point of failure: While the Kerberos server is down, no-one can access anything
Kerberos server knows all the secret keys, if it is compromised all of these keys are compromised
The clocks on the different machines need to be synchronized (and accurate): If the timestamp provided by the client does not lie in the ticket validity period, access will be denied
If a host can be fooled about the correct time, old authenticators could be replayed TLS
TLS (Transport Layer Security) is a cryptographic protocol for secure client-server communication
Web browsers contain the client TLS (or SSL) software
If a server has been configured to run TLS, then a client can use it by specifying https: in a URL (instead of http:). A secure connection is shown by a closed padlock:
Servers can request that clients use TLS (even if client has not used https:, e.g. for logins, payment details) Initiating Communication
TLS can perform different activities: encryption, server authentication, client authentication
When a client requests TLS, client and server go through a handshake protocol to determine which of the above activities will be performed
and which encryption algorithm will be used
A public/private key pair will be needed on the server side to go ahead
A server’s public key can be found in its certificate (also known as a server ID) Encryption Only
If client only wants a secure communication line using encryption, the following protocol is used 1 Client sends request for server ID to server
2 Server sends back server ID
3 Client generates (random) session key and encrypts it using public key found in certificate
4 Client sends session key to server
5 Server decrypts message to obtain session key
Client and server now have a secret session key for encrypted communication Server/Client Authentication
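The encryption-only handshake just described amounts to RSA key transport. The sketch below uses textbook RSA with tiny primes purely for illustration (real TLS uses 2048-bit keys, padding, and nowadays favours Diffie-Hellman key exchange); the key values are the standard toy example, not from any real deployment.

```python
import random

# Server's RSA key pair (textbook RSA with tiny primes, illustration only)
p, q = 61, 53
n = p * q            # public modulus, part of the server ID (certificate)
e = 17               # public exponent, also in the certificate
d = 2753             # private exponent: 17 * 2753 = 46801 ≡ 1 (mod 3120)

# Steps 1-2: client requests and receives the server ID containing (e, n)

# Step 3: client generates a random session key (a small number here)
session_key = random.randrange(2, n)

# Step 4: client encrypts it with the server's public key and sends it
ciphertext = pow(session_key, e, n)

# Step 5: server decrypts with its private key
recovered = pow(ciphertext, d, n)
assert recovered == session_key
print("shared session key established")
```

Only the holder of d can recover the session key, so a passive eavesdropper on steps 4-5 learns nothing useful; both sides then switch to fast symmetric encryption under that key.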
There’s one additional step that the client does (before sending a session key to the server): 3 (b) Client attempts to verify the server ID certificate with installed certificates
If this cannot be done, then the browser will warn its user
In the case of client authentication, client needs to send its certificate and the server will perform an additional step: 4 (b) Client sends its certificate
5 (b) Server attempts to verify the client’s certificate with installed certificates Summary
Again, this is only scratching the surface of cryptographic protocols
There are lots of other cryptographic protocols for doing a range of things: Threshold schemes for sharing secrets: reconstructing messages with any m out of n pieces
Subliminal channels: covert ways of communicating secretly
Mental Poker: playing a fair game of poker remotely without a trusted dealer
The important issue is: cryptosystems have to be applied in a certain way to guarantee security Chapter 10 Network Attack Motivation
In today’s IT infrastructures, more and more computers and devices are connected to a network
This has major consequences for system security It becomes easier to infiltrate systems (the whole system is as secure as its weakest link)
Once infiltrated, the damage that can be inflicted increases manifold
We will look at some approaches used for attacking and defending networks Intruders
A major threat to security is the intruder. Often referred to as a hacker. They are often categorised into 3 classes Masquerader Unauthorised user who exploits a legitimate user’s account.
Misfeasor A legitimate user who accesses unauthorised data, programs or resources.
A legitimate user who is authorised but misuses their privileges.
Clandestine User A user who seizes supervisory control of the system and uses it to evade auditing and access control or to suppress audit collection. Intrusion Objectives
Goal Gain access to a system
Extend range of privileges in a system
This requires acquiring protected information Typically this means finding out passwords Attacks on Local Networks
For the first scenario, let’s assume that an intruder has managed to gain control of one of your PCs
They can then install packet sniffer software in an attempt to get passwords
masquerade as a machine where a target user is already logged in
intercept filehandles in shared systems, e.g. NFS (Network File System) Attacks Using Internet Protocols
Attacks need not be about gaining access. There are different ways of taking down hosts by sending a large number of malformed packets: SYN flooding, Smurfing
Many of these loopholes have been fixed by now, but there’s still the brute-force distributed denial-of-service (DDoS) attack In this kind of attack, botnets (a large group of subverted machines) flood a site with a huge number of packets Further Attacks
Other forms of attack include DNS Security and Pharming
Spam
Malicious Code (e.g. viruses) Defense Against Network Attacks - Overview
There are several things you can do to make your network safer: Encryption: use protocols that protect specific parts of the network and/or the transmitted data
Management: keeping system up-to-date, configuring it in a way to minimize attack surface
Filtering: use of firewalls to stop bad things from entering your network
Intrusion detection: monitor network for signs of malicious behavior Management
Install newest patches as soon as possible
Make sure that default accounts and passwords are switched off
Shut down unnecessary services on machines
Train staff not to do foolish things Password Management
We have already seen, in Chapter 8, a one-way hash used to store passwords.
Better than storing passwords in clear text
Can be made more secure by Salting
password is combined with a random value - the salt
Prevents duplicate passwords being visible in password file
Process can be repeated - Key stretching
PBKDF2 (Password-Based Key Derivation Function) Chapter 11 Intrusion Detection Introduction
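Salting and key stretching as described above are available directly in Python's standard library via `hashlib.pbkdf2_hmac`; the password and iteration count below are illustrative choices.

```python
import hashlib, os

password = b"correct horse battery staple"
salt = os.urandom(16)          # random per-user salt, stored beside the hash

# 100,000 iterations of HMAC-SHA256 (key stretching): each guess in a
# dictionary attack now costs 100,000 hash evaluations instead of one
stored = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

# Verification repeats the derivation with the stored salt
assert hashlib.pbkdf2_hmac("sha256", password, salt, 100_000) == stored

# The same password under a different salt hashes differently, so
# duplicate passwords are not visible in the password file
other_salt = os.urandom(16)
assert hashlib.pbkdf2_hmac("sha256", password, other_salt, 100_000) != stored
```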
An intrusion detector monitors input data and classifies it into two classes: intrusion / not intrusion.
Classification
There are 4 possible outcomes for the intrusion detector 1 An intrusion has occurred and it is detected (raises an alarm)
2 An intrusion has occurred but the detector remains silent
3 An intrusion has not occurred but the detector raises an alarm
4 An intrusion has not occurred and the detector does not raise an alarm
Terminology - positive and negative classes: the positive class is the class you are most interested in.
We care most about the intrusions, so the intrusions are the positive class! Classification
There are 4 possible outcomes for the intrusion detector 1 An intrusion has occurred and it is detected (sounds an alarm) True Positive (a good outcome)
2 An intrusion has occurred but the detector remains silent False Negative (a bad outcome)
3 An intrusion has not occurred but the detector raises an alarm False Positive (a bad outcome)
4 An intrusion has not occurred and the detector does not raise an alarm True Negative (a good outcome)
Ideally we would like an intrusion detector that generates only True Positives and True Negatives. The Confusion Matrix
The Confusion Matrix:

                     Predicted Positive   Predicted Negative
Actual   Positive    TP                   FN
Class    Negative    FP                   TN

Example - Credit Card Fraud
The Problem of Uneven Class Size
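Uneven class sizes make raw accuracy misleading. The following worked example uses hypothetical numbers (an event stream where 1 in 1,000 events is an intrusion, and a detector that is 99% accurate on both classes) to show how few alarms are genuine:

```python
# Hypothetical numbers: 1 in 1,000 events is an intrusion, and the
# detector is 99% accurate on both classes
events = 1_000_000
intrusions = events // 1000                 # 1,000 actual intrusions
benign = events - intrusions                # 999,000 benign events

tp = int(0.99 * intrusions)                 # intrusions detected: 990
fn = intrusions - tp                        # intrusions missed: 10
fp = int(0.01 * benign)                     # benign events flagged: 9,990
tn = benign - fp

precision = tp / (tp + fp)                  # fraction of alarms that are real
print(f"alarms raised: {tp + fp}, of which genuine: {tp}")
print(f"precision: {precision:.1%}")        # only about 9% of alarms are real
```

Even a seemingly excellent detector drowns its true positives in false alarms when the positive class is rare; this is the base rate fallacy.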
The Base Rate Fallacy Intrusion Detection Chapter 12 Malicious Software and Firewalls Malicious Code
Malicious code, or malware, has been around for some time (started on time-sharing machines in the 1960s) There are several different kinds of malware: Trap Door
A secret entry point into a program. Trap door code can be triggered by: a special sequence of input
being run from a certain user ID
an unlikely sequence of events
Sometimes trap doors are added to help programmers debug and test.
Become a threat when used to gain unauthorized access
Difficult to counter with operating system controls; security measures must focus on the development and update of the code. Logic Bomb
Code embedded in a legitimate program that is triggered when certain conditions are met Triggers include Particular date
Presence or absence of a file
Particular user running a particular program
The code is said to ‘explode’; the damage it can cause includes: Deleting data or files
Causing machine to halt Trojan Horse
Program that has hidden code which, when invoked, performs an unwanted or harmful action
Common uses of a Trojan horse are data destruction and accomplishing a function indirectly: a user wishing to gain unauthorized access to a file could use a Trojan horse.
The program contains code that when executed, changes file permissions
The program is designed to be a useful utility so users are more likely to run it. Zombie
Program that takes over another networked computer.
Used to launch an attack which is subsequently difficult to trace back to the creator of the zombie.
Denial of Service attack, many computers infected by the zombie are used to overwhelm a target website Virus
A program that inserts itself into one or more files and then performs some (possibly null) action Typical virus goes through 4 phases Dormant Phase - Virus is idle and waiting for an activation event, e.g. a date
Propagation Phase - Virus copies itself into other programs or system areas
Triggering Phase - Virus is activated to perform an action. The trigger can be any number of events, such as a count of the number of times it has replicated
Execution Phase - The action is performed. This can range from harmless such as a message to destruction of data and programs How Viruses Work
A virus typically has two parts: a replication mechanism and a payload
The most common way for a virus to replicate is to append itself to an executable file and patch itself in: The execution path jumps to the virus code and then back to the original program
Program Virus
Viruses used to be restricted to executable files in operating systems, but have spread to other interpreted languages (e.g. Word macros) How Viruses Work (2)
The second component of a virus is the payload
This will usually be activated by a trigger, such as a date, and then do a number of bad things: Make selective (or random) changes to the machine’s protection state
Make changes to user data
Lock the network
Install spyware or adware
Install a rootkit Types of Virus
Parasitic - Most common type. It attaches itself to executable files and when the program is executed, the virus replicates by finding other executables to infect.
Memory Resident - Remains in main memory and infects every program that executes
Boot Sector - Spread whenever a system is booted from a disk containing the virus Types of Virus (2)
Stealth - A type of virus that attempts to hide itself from detection.
The virus may intercept I/O routines, so when these routines are called the virus presents back the original uninfected program details
When a virus appends itself to a program, the file gets longer. The virus may compress the infected program so that the file length appears unchanged. Types of Virus (3)
Polymorphic - Creates copies of itself that are functionally equivalent but are different programs.
This can be achieved by randomly inserting superfluous instructions and changing the order of instructions. Encryption can be used to change the program A random key is generated by a part of the virus called the mutation engine. This is used to encrypt the remainder of the virus.
The key is stored with the virus and the mutation engine is itself altered.
When the virus replicates, a different key is randomly chosen Malicious Code
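The idea behind the mutation engine can be illustrated harmlessly. The sketch below "encrypts" a stand-in payload (a harmless print statement, not virus code) with a fresh one-byte XOR key per copy, so each copy has different bytes but decrypts to the same program:

```python
import os, hashlib

payload = b"print('hello')"   # benign stand-in for the virus body

def make_copy(payload: bytes) -> bytes:
    key = os.urandom(1)[0]                     # fresh random key per copy
    body = bytes(b ^ key for b in payload)     # "encrypted" body
    return bytes([key]) + body                 # key stored with the copy

def recover(copy: bytes) -> bytes:
    key, body = copy[0], copy[1:]
    return bytes(b ^ key for b in body)        # the decryptor part

a, b = make_copy(payload), make_copy(payload)
# The two copies are byte-for-byte different (with probability 255/256
# here), so a fixed byte-pattern scanner cannot match both of them...
print("copies differ:", a != b)
# ...yet both decrypt back to the same functioning body
assert recover(a) == recover(b) == payload
```

A real polymorphic virus additionally mutates the decryptor itself, since that small fixed routine would otherwise become the scanner's new target.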
Malicious code, or malware, has been around for some time (started on time-sharing machines in the 1960s)
There are several different kinds of malware: Trojan: a program that has a malicious side effect when run by a user For example, a game that tries to capture passwords or other sensitive information
Worm: something that replicates One of the first worms was written at Xerox PARC: a program looking for idle processors and assigning them tasks Malicious Code (2)
Different kinds (cont’d): Virus: basically a worm which replicates by attaching itself to files An early example was the IBM BITNET Christma virus
Rootkits: a piece of software that (once installed) seizes control of a system, tries to obscure the fact that it is installed An example is Vanquish, a rootkit that hides files, folders, registry entries and logs passwords. Countermeasures
Two common approaches for antivirus software are scanners and checksummers
Scanners search executables for certain byte patterns known to belong to a virus A counter-counter is polymorphism: the virus changes form every time it is copied
This can be done by encrypting the virus with a simple cipher and using a different key each time Decryption algorithm and key are also attached Countermeasures (2)
Checksummers keep a list of all executables in the system together with checksums (or fingerprints) of the original versions A counter-counter is a virus watching for operating system calls used by checksummers and then going into hiding
Unfortunately, antivirus software seems to become less effective Programmers of viruses and rootkits become more sophisticated and test against antivirus software Filtering
Most widely sold solution is a firewall
This is a machine standing between the local network and the internet filtering out traffic that might be harmful
Filtering can be done on different levels: Packet Filtering: inspects packet addresses and port numbers
Circuit Gateways: reassembles and examines packets in each TCP session
Application Relays: looks at data on an application level and e.g. strips out executables Filtering (2)
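Packet filtering, the simplest of the three levels, can be sketched as a first-match rule table. The rules and port numbers below are invented for illustration:

```python
# Each rule is (action, direction, destination port); first match wins,
# and a port of None matches any port
RULES = [
    ("allow", "in", 443),   # HTTPS to the web server
    ("allow", "in", 25),    # SMTP to the mail server
    ("block", "in", None),  # default deny for all other inbound traffic
]

def filter_packet(direction: str, dst_port: int) -> str:
    for action, rule_dir, rule_port in RULES:
        if rule_dir == direction and rule_port in (None, dst_port):
            return action
    return "allow"          # outbound traffic passes in this sketch

assert filter_packet("in", 443) == "allow"
assert filter_packet("in", 23) == "block"   # telnet is filtered out
assert filter_packet("out", 23) == "allow"
```

Real packet filters also match on source/destination addresses and protocol, but the "explicit allows, then a default deny" structure is the same.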
Firewalls mostly try to keep bad things out, but can also be used to monitor (and restrict) outgoing traffic
You can have more than one firewall in a system, first firewall creates a demilitarized zone (DMZ)
DMZ is then connected to internal networks via further filters Chapter 13 Usability, Assurance and Evaluation Motivation
Why do people make all these bad security-related decisions?
Usually your average user isn’t that stupid
However, technology doesn’t always help in making the right decision Often it is as easy (or even easier) to make a bad decision Human Behavior
Cognitive psychology deals with how humans think
There are some well-known results: It is easier to memorize things that are repeated frequently
It is easier to store things in context
However, this is not always considered in the right way by developers
We’ll have a look at the three most common mistakes made by humans: lapse of skill, following rules, cognitive errors Practice Makes Perfect (Not)
People become highly skilled by practicing actions
Downside: inattention can lead to a practiced action instead of an intended one
Many users click on ’OK’ in pop-up boxes (as very often that is the only way to continue)
Users are trained in doing the wrong thing Example
Quickly, can you spot the security message? And Now? Following rules
People can make mistakes by following a wrong rule
Can happen when they are distracted by interruptions, (perceptual) confusion, information overload
People then follow the strongest or most general rule they know
Example (Phishing): Looking at a secure connection (https) and the bank’s name, instead of the domain: https://www.citibank.secure.com Cognitive Errors
In plain English: people just don’t understand the problem
When Microsoft introduced an anti-phishing toolbar (see below), this was defeated by a picture-in-picture attack
Users are so accustomed to overlapping windows that many don’t realize what is going on (see next slide) Picture-in-Picture Attack
Fake window inside the real browser window: What To Do?
Trying to teach all users the technical details of getting things right is a losing battle
In confusing situations, when the rational side runs out, the emotional side tends to kick in
It is much better to have sensible and secure defaults and easy-to-use interfaces, e.g. ’Our bank will never, ever send you e-mail. Any e-mail seeming to come from us is fraudulent’
Restricting access to the public share in Vista to a group of users should not need nine steps and six different user interfaces Authentication
Next we are going to have closer look at why user authentication is still so difficult to do properly
There are still big problems in terms of usability in this area
We’ll have a closer look at passwords. . .
. . . and also look at some biometric alternatives Passwords
Words from a usability researcher: “It’s hard to think of a worse authentication mechanism than passwords.” (Angela Sasse)
The reasons lie chiefly with the way human memory works: People are bad at remembering infrequently-used, frequently-changed, or many similar items
People can’t forget “on demand”
Recall is harder than recognition
Non-meaningful words are harder to remember
There are costs attached to this: BT employs about a hundred staff in its password-reset center Alternatives
Passwords are not the only way of authenticating users
Basically, there are three options: “Something you have”: a physical device, e.g. a key
“Something you know”: present secret knowledge, e.g. a password
“Something you are”: physical appearance, e.g. an iris scan
The second option is usually the most inexpensive to implement
Often the other variants are used in combination with passwords
So passwords are probably going to be around for a while Problems with Passwords
There are three broad concerns with passwords Will a user enter the password correctly with a high enough probability
Will a user remember the password (or will they write it down or select easy-to-guess passwords)?
Will the disclosure of a password (whether accidentally or on purpose) break the system security? Reliable Entry of Passwords
Entry of passwords might seem trivial at first glance
However, if customers experience difficulties in entering codes, a support desk might get swamped with calls
Whether a password is too complex or too long may depend on the application: In a familiar setting, you might be able to get away with some complexity: Entering 20-digit numbers from a print-out into a pre-paid electricity meter, although some customers are illiterate (South Africa)
In very stressful situations, you have to make things simpler: 12 decimal digits used for firing codes of nuclear weapons (U.S.) Complexity of Passwords
Password ≠ password: depending on the application, stronger passwords have to be selected
If you can limit the number of guesses, then it’s o.k. to have a simpler password Example are ATMs that block the account after three wrong guesses
Sometimes you cannot limit the guessing, examples are: Encrypted files, attacker can guess as many times as they want
Possible denial-of-service attack: attackers can lock down a system by flooding it with login attempts Remembering Passwords
Twelve to twenty digits may be fine when copied from a telegram or a meter ticket
If people have to memorize passwords, things look different: They use easy-to-guess passwords
and/or write them down
The following sums it up quite nicely: “Choose a password you can’t remember and don’t write it down.” Remembering Passwords (2)
Problems with memorizing passwords comes down to four issues: Naive password choice
User ability and training
Design errors
Operational failures Naive Password Choice
In summary, studies looking at people’s choices of passwords are depressing
Some of the biggest glitches commonly are: Using spouses’ names
Single letters
Just hitting return (no password at all)
Systems forcing people to use more complex passwords haven’t worked that well: Dictionary attacks (with some modifications) are still successful
As are lists consisting of the 20 most common female names followed by a single digit Naive Password Choice (2)
Making users change passwords regularly also backfired Users rapidly cycled through exhausted passwords to return to their favorite one
Not allowing password changes until some time had expired is no solution People need an administrator to change compromised passwords User Ability and Training
Some of the problems can be solved by training users Not all of them, though, there’s always a number of people who don’t do what they’re told
You can operate a clean-desk policy (where no passwords are left on unoccupied desks)
Put very important passwords in an envelope in a safe that is guarded User Ability and Training (2)
From an interesting study by Ross Anderson involving three groups: Red group: usual advice (password at least six characters, one non-letter), i.e. naive passwords
Green group: think of a passphrase and select letters (e.g. “It’s 12 noon and I am hungry” becomes I’S12&IAH)
Yellow group: select eight characters (alpha or numeric) randomly from a table, write it down, destroy after memorizing
Result: passphrases offer the best of both worlds They are almost as easy to remember as naive passwords and almost as hard to guess as random passwords Design Errors
Sometimes easy-to-use password schemes are created by people who have no idea how to do this properly
Some examples: Asking for mother’s maiden name (this information is not very secret and cannot be changed)
A mnemonic system for bank PINs: the card carries a 4 × 10 grid of letters, one row per PIN digit, with the digits 1-0 as column headings:

1 2 3 4 5 6 7 8 9 0
  b
  l
        u
          e

The above example encodes 2256 (b, l, u, e mark columns 2, 2, 5, 6); the rest of the cells are filled up with random letters Design Errors (2)
Problem with the scheme for PINs: A 4 × 10 matrix with random letters only yields about two dozen words
The odds for a card thief have just gone from 1 in 3333 (three guesses at a four-digit PIN) to 1 in 8 (three guesses at roughly two dozen candidate words) Operational Issues
It’s not just the customer end where things go wrong
Sometimes organizations are careless with their master passwords For example, a note stuck on a terminal at an exhibition
Not resetting default accounts and passwords Password Summary
Unfortunately, there’s no silver bullet in sight
You should assume that some accounts will be compromised Work out how to detect this and limit the damage
We’ll now look at biometric alternatives Biometrics
Biometrics is about identifying persons by measuring some aspect of individual anatomy or physiology (such as hand geometry or fingerprints)
some deeply ingrained skill or behavior (such as a handwritten signature)
some combination of the two (such as your voice)
Actually, biometric identification has been around for a while (see “Shibboleth”) Determining the Quality
Like alarm systems, most biometric systems have a trade-off between the rates for false accepts, i.e. a fraudster is not detected
false rejects, i.e. an authorized person is not correctly identified
These rates are sometimes also called fraud and insult rates or just type 1 and type 2 errors
Usually systems can be tuned to favor one over the other
The equal error rate is when the system is tuned so that the probability for a false accept and a false reject are equal Handwritten Signatures
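The trade-off and equal error rate described above can be made concrete by sweeping a decision threshold over matching scores. The scores below are made up: genuine users tend to score high and impostors low, with some overlap.

```python
# Hypothetical matching scores (higher = better match to the enrolled user)
genuine  = [0.9, 0.8, 0.75, 0.7, 0.6, 0.55]
impostor = [0.5, 0.45, 0.4, 0.35, 0.3, 0.65]

def rates(threshold):
    far = sum(s >= threshold for s in impostor) / len(impostor)  # false accepts (fraud)
    frr = sum(s < threshold for s in genuine) / len(genuine)     # false rejects (insult)
    return far, frr

# Raising the threshold trades false accepts for false rejects
for t in [0.3, 0.5, 0.6, 0.7]:
    far, frr = rates(t)
    print(f"threshold {t:.2f}: fraud rate {far:.2f}, insult rate {frr:.2f}")

# The equal error rate is where the two curves cross
eer_threshold = min((t / 100 for t in range(100)),
                    key=lambda t: abs(rates(t)[0] - rates(t)[1]))
print("EER reached near threshold", eer_threshold)
```

With these scores the two rates meet at 1/6 around a threshold of 0.6; a real system is tuned the same way, just over far larger score samples.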
Handwritten signatures have been used in classical China (although seals had more status)
In Europe it was the other way around: seals were used in medieval times and were later replaced by signatures
Handwritten signatures are a rather weak authentication mechanism Works mostly due to the legal system around it
Some numbers on an experiment with professional and untrained examiners: Experts misattributed 6.5% of the documents
Untrained people got it wrong 38.8% of the time Handwritten Signatures (2)
Better results can be achieved with signature tablets
Measures the dynamics as well (velocity of the hand, pen lifted off the paper, etc.)
Still, the equal error rate is at best 1% (several percent for pure optical comparison) Face Recognition
Recognizing people by their facial features is the oldest identification mechanism; it goes back to early primate ancestors
This is one area where humans are much better than machines
The concept of photo IDs relies on this
However, being able to recognize friends does not mean that we’re good at identifying strangers via a photograph Face Recognition (2)
University of Westminster conducted an experiment with a supermarket chain and a bank
44 students were issued four credit cards each with different pictures: “good, good one”: genuine and recent
“bad, good one”: genuine, but a bit old
“good, bad one”: picture of a different, similar-looking person
“bad, bad one”: apart from sex and race, completely random Face Recognition (3)
Experiment was done under optimum conditions (after normal business hours, experienced cashiers who were aware of the experiment)
Each student made several trips past the checkout using different cards
Almost none of the cashiers were able to distinguish between the “bad, good” and “good, bad” photographs; a few could not even tell the difference between “good, good” and “bad, bad” CCTV
Nonetheless, face recognition is a very hyped technology in today’s world
Trying to pick out faces from an image of a crowd is not trivial in terms of computational complexity
Error rates usually go up to 20% (or even higher)
Most systems are about automated storage of facial images that are then compared by humans Bertillonage
Alphonse Bertillon devised a system to identify people by their bodily measurements for the police in Paris (1882)
Eventually fell out of favor, as the police switched to fingerprints
Made a comeback in the form of hand-geometry readers, used for authentication on nuclear premises (1970s)
‘Fast track’ for frequent flyers (INSPASS)
Has a single-attempt equal error rate of about 1% Fingerprints
The idea of using fingerprints to identify people or as an alternative to signatures is quite old; it was already employed centuries ago in China, India, and Japan
Really took off after the introduction of a robust classification scheme (loops, whorls, arches, and tents) Fingerprints (2)
Main use of fingerprints is in identification systems and crime scene forensics
US immigration uses it in its US-VISIT program, scanning the two index fingers; it uses a blacklist of about 6 million “bad guys” and had a false match rate of 0.31% in 2004
A study in 2001 found that common products had a false accept rate of 8% and a false reject rate of 1%
By now, better ones have an equal error rate at or slightly below 1% per finger
Then there are obvious problems with amputees, birth defects, and people born without fingerprints Fingerprints (3)
In forensics, fingerprints have to match in 16 points to be counted as incontrovertible evidence in the UK
Optimistic estimations (by the police) say that this will lead to only one false match in ten billion; this is fine as long as fingerprints are only compared to the prints of a few hundred known criminals
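The scale effect behind this point can be seen with a one-line expected-value calculation (the database sizes and search volumes below are illustrative assumptions, not figures from the slides):

```python
# Expected number of false fingerprint matches at different scales,
# using the optimistic 1-in-ten-billion per-comparison rate from the
# text. Database sizes and search volumes are illustrative assumptions.

FALSE_MATCH_RATE = 1 / 10_000_000_000

def expected_false_matches(db_size: int, searches: int) -> float:
    # Each search compares one print against every print in the database.
    return searches * db_size * FALSE_MATCH_RATE

# One search against a few hundred known criminals: negligible risk.
print(expected_false_matches(300, 1))
# Heavy use of a national database: false matches become routine.
print(expected_false_matches(50_000_000, 500_000))
```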
Once thousands of prints are compared with an online database of millions, things change Iris Codes
Compared to the previous approaches, iris codes haven’t been around for a long time
John Daugman developed signal processing techniques that extract the information in an iris image into a 256-byte iris code Iris Codes (2)
So far (as is known), every human iris is measurably unique
There’s only limited genetic influence, i.e. the patterns are different even for identical twins
Under lab conditions the equal error rate has been shown to be better than one in a million (0.0001%), so it’s a strong contender
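The comparison itself is essentially bitwise: count the fraction of differing bits between two codes and accept below a threshold. A minimal sketch (the 0.32 threshold and the random codes are illustrative assumptions, not Daugman's actual parameters):

```python
# Minimal sketch of iris-code matching: compute the fraction of
# differing bits (Hamming distance) between two 256-byte codes and
# accept if it is below a threshold. Threshold and codes are
# illustrative assumptions, not Daugman's actual parameters.
import random

CODE_BYTES = 256
THRESHOLD = 0.32  # assumed maximum fraction of differing bits for a match

def hamming_fraction(a: bytes, b: bytes) -> float:
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (len(a) * 8)

def same_iris(a: bytes, b: bytes) -> bool:
    return hamming_fraction(a, b) < THRESHOLD

rng = random.Random(42)
code = bytes(rng.randrange(256) for _ in range(CODE_BYTES))
other = bytes(rng.randrange(256) for _ in range(CODE_BYTES))

print(same_iris(code, code))   # same eye: True
print(same_iris(code, other))  # unrelated code differs in ~50% of bits: False
```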
One potential problem is that it depends on computer processing (no human can do it reliably) Voice Recognition
Voice recognition, also known as speaker recognition, tries to identify a speaker from a short utterance
As with fingerprints, this technology is used for identification and forensics
Currently, there are systems that can achieve an equal error rate of about 1%
However, this will probably become worse in the near future with advances made in the field of voice morphing Potential Problems
Biometrics (like other physical protection mechanisms) are susceptible to environmental conditions such as noise, dirt, vibration, lighting conditions
What do you do with people who are disabled or cannot be identified reliably?
How do you ward off attacks using photographs, contact lenses, moulds of fingerprints?
Error rates have to be tuned carefully (annoying too many people will not help) Summary
There’s still much to be done in terms of making security mechanisms more user-friendly
One possible approach is biometric identification
However, biometrics are not a silver bullet; they work much better in attended operation (i.e. when the process is overseen by people)
Most of the results are achieved by deterring criminals (rather than actually identifying them) Assurance and Evaluation Introduction
Basically, one could define these terms in the following way: Assurance: will the system work?
Evaluation: can you convince other people of this?
This is still quite challenging and there are lots of open questions Assurance
Assurance is a way of justifying that the implemented mechanisms satisfy the requirements set out in a policy.
(Diagram: the relationship between Policy, Mechanism, and Assurance)
A working definition of assurance could be: an estimate of the likelihood that a system will fail in a particular way Issues Concerning Assurance
Assurance comes down to the question: have capable, motivated people tried to beat up on the system enough?
Obvious problems are: How do you define ‘enough’? When do you stop searching for more flaws?
How do you define ‘system’? To what extent do you consider the organizational issues and procedures?
It is still more of an art than a science Issues Concerning Assurance (2)
Very often people: protect the wrong thing
protect the right thing in the wrong way
There are also problems with usability: a system might be fine when operated by alert, experienced professionals
However, ordinary users might find it too tricky Evaluation
A working definition of evaluation is: the process of assembling evidence that a system meets (or fails to meet) an assurance target
This ranges from convincing your boss that you’ve done your job to
reassuring principals who rely on the system
Tensions can arise when the party who implements the protection and the party who relies on it are different; different parties have different incentives (e.g. managers buying from big-name suppliers to minimize the likelihood of being fired when things go wrong) Doing the Evaluation
How do people actually evaluate systems in terms of security?
One standard is set down in the Trusted Computer System Evaluation Criteria (TCSEC) This is also called the Orange Book Orange Book
The Orange Book was issued by the US Department of Defense (latest version in 1985)
It sets basic requirements for assessing the security of systems
Has been replaced by an international standard, the Common Criteria Common Criteria
The official name is Common Criteria for Information Technology Security Evaluation (or ISO/IEC 15408)
This standard has been adopted internationally (more and more countries signed up for it)
Evaluations at all (but the highest) levels are done by a commercial licensed evaluation facility (CLEF)
Evaluations are (usually) recognized in all participating countries Common Criteria (2)
The Common Criteria is divided up into different parts:
Target of Evaluation (TOE): the system that is under consideration
Protection Profile (PP): a document identifying security requirements for a class of security devices for a particular purpose; examples are profiles for smart cards or firewalls Common Criteria (3)
Security Target (ST): identifies the security properties of a system in regard to one or more PPs: how well does the system meet the requirements set in a PP?
Security Functional Requirements (SFRs): specify individual security functions which may (or may not) be provided by a product; meeting STs is measured by looking at SFRs Common Criteria (4)
The evaluation process also tries to consider the development process:
Security Assurance Requirements (SARs): describe the measures taken during development to assure quality, e.g. storing code in a revision control system, having a testing environment in place
Evaluation Assurance Level (EAL): a rating describing the depth and rigor of an evaluation; there are seven levels ranging from EAL 1 (the most basic) to EAL 7 (the most thorough and expensive) Summary
Although there are some standards in place, there is still a lot to do
There has been some criticism of the Common Criteria: they are pretty much limited to technical aspects
CLEFs have been involved in too many political games
One approach that seems to work (reasonably well) is hostile reviewing: give reviewers an incentive to break the system, e.g. by paying for every flaw found