Network Working Group W. Kumari Internet-Draft Google Intended status: Informational I. Gashinsky Expires: July 11, 2012 Yahoo! J. Jaeggli Zynga January 8, 2012

Neighbor Discovery Enhancements for DOS mitigation draft-gashinsky-6man-v6nd-enhance-00

Abstract

In IPv4, subnets are generally small, made just large enough to cover the actual number of machines on the subnet. In contrast, the default IPv6 subnet size is a /64, a number so large it covers billions of billions of addresses, the overwhelming majority of which will be unassigned. Consequently, simplistic implementations of Neighbor Discovery can be vulnerable to denial of service attacks whereby they attempt to perform address resolution for large numbers of unassigned addresses. Such denial of service attacks can be launched intentionally (by an attacker), or result from legitimate operational tools that scan networks for inventory and other purposes. As a result of these vulnerabilities, new devices may not be able to "join" a network, it may be impossible to establish new IPv6 flows, and existing IPv6 transport flows may be interrupted.

This document describes possible modifications to the traditional [RFC4861] neighbor discovery protocol for improving the resilience of the neighbor discovery process as well as an alternative method for maintaining ND caches.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

Kumari, et al. Expires July 11, 2012 [Page 1] Internet-Draft ND Enhance January 2012

This Internet-Draft will expire on July 11, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction
   1.1. Applicability
2. The Problem
3. Terminology
4. Background
5. Neighbor Discovery Overview
6. Recommendations for Implementors
   6.1. Prioritize NDP Activities
   6.2. Queue Tuning
   6.3. ND cache priming and refresh
   6.4. NDP Protocol Gratuitous NA
7. IANA Considerations
8. Security Considerations
9. Acknowledgements
10. References
   10.1. Normative References
   10.2. Informative References
Appendix A. Text goes here.
Authors’ Addresses


1. Introduction

This document describes modifications to the IPv6 Neighbor Discovery protocol [RFC4861] in order to reduce exposure to vulnerabilities when a network is scanned, either by an intruder as part of a deliberate DOS attempt, or through the use of scanning tools that perform network inventory, security audits, etc. (e.g., "nmap").

1.1. Applicability

This document is primarily intended for implementors of [RFC4861].

2. The Problem

In IPv4, subnets are generally small, made just large enough to cover the actual number of machines on the subnet. For example, an IPv4 /20 contains only 4096 addresses. In contrast, the default IPv6 subnet size is a /64, a number so large it covers literally billions of billions of addresses, the overwhelming majority of which will be unassigned. Consequently, simplistic implementations of Neighbor Discovery can be vulnerable to denial of service attacks whereby they perform address resolution for large numbers of unassigned addresses. Such denial of service attacks can be launched intentionally (by an attacker), or result from legitimate operational tools that scan networks for inventory and other purposes. As a result of these vulnerabilities, new devices may not be able to "join" a network, it may be impossible to establish new IPv6 flows, and existing IPv6 transport flows may be interrupted.

Network scans attempt to find and probe devices on a network. Typically, scans are performed on a range of target addresses, or all the addresses on a particular subnet. When such probes are directed via a router, and the target addresses are on a directly attached network, the router will attempt to perform address resolution on a large number of destinations (i.e., some fraction of the 2^64 addresses on the subnet). The process of testing for the (non)existence of neighbors can induce a denial of service condition, where the number of Neighbor Discovery requests overwhelms the implementation’s capacity to process them, exhausts available memory, replaces existing in-use mappings with incomplete entries that will never be completed, etc. The result can be network disruption, where existing traffic may be impacted, and devices that join the network find that address resolution fails.
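The difference in scale can be made concrete with a short, non-normative sketch. It assumes Python's standard ipaddress module; the prefixes (10.0.0.0/20 and the documentation prefix 2001:db8::/64), the helper name scan_duration_years, and the probe rate of one million probes per second are illustrative assumptions, not values taken from this document.

```python
import ipaddress

# Example prefixes chosen only for illustration.
v4_net = ipaddress.ip_network("10.0.0.0/20")
v6_net = ipaddress.ip_network("2001:db8::/64")


def scan_duration_years(num_addresses, probes_per_second):
    """Time needed to probe every address once, in years (365-day years)."""
    seconds = num_addresses / probes_per_second
    return seconds / (365 * 24 * 3600)


print(v4_net.num_addresses)   # size of the IPv4 /20
print(v6_net.num_addresses)   # size of the IPv6 /64, i.e. 2**64
print(scan_duration_years(v6_net.num_addresses, 1_000_000))
```

Because a scanner can never finish walking a /64, the practical damage comes from the sustained stream of resolution attempts for addresses that do not exist, not from the scan completing.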

In order to alleviate the risk associated with this DOS threat, some router implementations have taken steps to rate-limit the processing of Neighbor Solicitations (NS). While these mitigations do help, they do not fully address the issue and may introduce their own set of potential liabilities to the neighbor discovery process.

3. Terminology

Address Resolution
   Address resolution is the process through which a node determines the link-layer address of a neighbor given only its IP address. In IPv6, address resolution is performed as part of Neighbor Discovery [RFC4861], p60.

Forwarding Plane
   That part of a router responsible for forwarding packets. In higher-end routers, the forwarding plane is typically implemented in specialized hardware optimized for performance. Forwarding steps include determining the correct outgoing interface for a packet, decrementing its Time To Live (TTL), verifying and updating the checksum, placing the correct link-layer header on the packet, and forwarding it.

Control Plane
   That part of the router implementation that maintains the data structures that determine where packets should be forwarded. The control plane is typically implemented as a "slower" software process running on a general purpose processor and is responsible for such functions as the routing protocols, performing management, and resolving the correct link-layer address for adjacent neighbors. The control plane "controls" the forwarding plane by programming it with the information needed for packet forwarding.

Neighbor Cache
   As described in [RFC4861], the data structure that holds the cache of (amongst other things) IP address to link-layer address mappings for connected nodes. The forwarding plane accesses the Neighbor Cache on every forwarded packet. Thus it is usually implemented in an ASIC.

Neighbor Discovery Process
   The Neighbor Discovery Process (NDP) is that part of the control plane that implements the Neighbor Discovery protocol. NDP is responsible for performing address resolution and maintaining the Neighbor Cache. When forwarding packets, the forwarding plane accesses entries within the Neighbor Cache. Whenever the forwarding plane processes a packet for which the corresponding Neighbor Cache Entry is missing or incomplete, it notifies NDP to take appropriate action (typically via a shared queue). NDP picks up requests from the shared queue and performs any necessary actions. In many implementations it is also responsible for responding to router solicitation messages, Neighbor Unreachability Detection (NUD), etc.


4. Background

Modern router architectures separate the forwarding of packets (forwarding plane) from the decisions about where the packets should go (control plane). In order to deal with the high number of packets per second, the forwarding plane is generally implemented in hardware and is highly optimized for the task of forwarding packets. In contrast, the NDP control plane is mostly implemented in software processes running on a general purpose processor.

When a router needs to forward an IP packet, the forwarding plane logic performs the longest match lookup to determine where to send the packet and what outgoing interface to use. To deliver the packet to an adjacent node, it encapsulates the packet in a link-layer frame (which contains a header with the link-layer destination address). The forwarding plane logic checks the Neighbor Cache to see if it already has a suitable link-layer destination, and if not, places the request for the required information into a queue, and signals the control plane (i.e., NDP) that it needs the link-layer address resolved.

In order to protect NDP specifically and the control plane generally from being overwhelmed with these requests, appropriate steps must be taken. For example, the size and rate of the queue might be limited. NDP running in the control plane of the router dequeues requests and performs the address resolution function (by performing a neighbor solicitation and listening for a neighbor advertisement). This process is usually also responsible for other activities needed to maintain link-layer information, such as Neighbor Unreachability Detection (NUD).
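The hand-off between the forwarding plane and NDP described above can be sketched as a queue that is bounded in both size and admission rate. This is a minimal illustration, not a description of any particular router; the class name ResolutionQueue, its parameters, and the one-second fixed-window rate limit are assumptions made for the example.

```python
from collections import deque


class ResolutionQueue:
    """Bounded queue of address resolution requests (hypothetical sketch)."""

    def __init__(self, max_size, max_rate):
        self.max_size = max_size      # maximum number of queued requests
        self.max_rate = max_rate      # requests admitted per one-second window
        self.queue = deque()
        self.window_start = 0.0
        self.admitted_in_window = 0
        self.dropped = 0

    def enqueue(self, address, now):
        """Called by the forwarding plane on a Neighbor Cache miss."""
        if now - self.window_start >= 1.0:
            # Start a new one-second accounting window.
            self.window_start = now
            self.admitted_in_window = 0
        over_rate = self.admitted_in_window >= self.max_rate
        over_size = len(self.queue) >= self.max_size
        if over_rate or over_size:
            self.dropped += 1
            return False
        self.queue.append(address)
        self.admitted_in_window += 1
        return True

    def dequeue(self):
        """Called by NDP to pick up the next request, or None if idle."""
        return self.queue.popleft() if self.queue else None
```

Note that such a queue protects the control plane but does not distinguish attack traffic from legitimate requests: once it is full, requests for real neighbors are dropped along with the rest.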

An attacker sending the appropriate packets to addresses on a given subnet can cause the router to queue attempts to resolve so many addresses that it crowds out attempts to resolve "legitimate" addresses (and in many cases the router becomes unable to perform maintenance of existing entries in the neighbor cache, and unable to answer Neighbor Solicitations). This condition can result in the inability to resolve new neighbors and loss of reachability to neighbors with existing ND cache entries. During testing it was concluded that 4 simultaneous nmap sessions from a low-end computer were sufficient to disrupt a router’s neighbor discovery process and thereby render forwarding unusable.

This behavior has been observed across multiple platforms and implementations.


5. Neighbor Discovery Overview

When a packet arrives at (or is generated by) a router for a destination on an attached link, the router needs to determine the correct link-layer address to send the packet to. The router checks the Neighbor Cache for an existing Neighbor Cache Entry for the neighbor, and if none exists, invokes the address resolution portions of the IPv6 Neighbor Discovery [RFC4861] protocol to determine the link-layer address.

[RFC4861] Section 5.2 (Conceptual Sending Algorithm) outlines how this process works. A very high level summary is that the device creates a new Neighbor Cache Entry for the neighbor, sets the state to INCOMPLETE, queues the packet and initiates the actual address resolution process. The device then sends out one or more Neighbor Solicitations, and when it receives a corresponding Neighbor Advertisement, completes the Neighbor Cache Entry and sends the queued packet.
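The summary above can be sketched as follows. This is a simplified, non-normative model: the class and method names are hypothetical, retransmission and timeouts are omitted, and the received advertisement is assumed to be a solicited one, so the entry moves directly to REACHABLE.

```python
class NeighborCache:
    """Toy model of the conceptual sending algorithm (hypothetical names)."""

    def __init__(self):
        self.entries = {}              # ip -> {"state", "lladdr", "pending"}
        self.sent_solicitations = []   # targets of Neighbor Solicitations sent
        self.transmitted = []          # (link-layer address, packet) pairs sent

    def send(self, dst_ip, packet):
        entry = self.entries.get(dst_ip)
        if entry is None:
            # No entry: create one in INCOMPLETE, queue the packet, solicit.
            self.entries[dst_ip] = {
                "state": "INCOMPLETE", "lladdr": None, "pending": [packet],
            }
            self.sent_solicitations.append(dst_ip)
        elif entry["state"] == "INCOMPLETE":
            # Resolution already in progress: just queue the packet.
            entry["pending"].append(packet)
        else:
            self.transmitted.append((entry["lladdr"], packet))

    def receive_advertisement(self, src_ip, lladdr):
        entry = self.entries.get(src_ip)
        if entry is None or entry["state"] != "INCOMPLETE":
            return
        # Complete the entry and flush the packets that were waiting on it.
        entry["state"] = "REACHABLE"
        entry["lladdr"] = lladdr
        for packet in entry["pending"]:
            self.transmitted.append((lladdr, packet))
        entry["pending"] = []
```

In this model every packet to a previously unseen address creates an INCOMPLETE entry and a solicitation, which is exactly the cost a scan of unassigned addresses multiplies.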

6. Recommendations for Implementors

This section provides some recommendations to implementors of IPv6 Neighbor Discovery.

At a high-level, implementors should program defensively. That is, they should assume that intruders will attempt to exploit implementation weaknesses, and should ensure that implementations are robust to various attacks. In the case of Neighbor Discovery, the following general considerations apply:

Manage Resources Explicitly - Resources such as processor cycles, memory, etc. are never infinite, yet with IPv6’s large subnets it is easy to cause NDP to generate large numbers of address resolution requests for non-existent destinations. Implementations need to limit resources devoted to processing Neighbor Discovery requests in a thoughtful manner.

Prioritize - Some NDP requests are more important than others. For example, when resources are limited, responding to Neighbor Solicitations for one’s own address is more important than initiating address resolution requests that create new entries. Likewise, performing Neighbor Unreachability Detection, which by definition is only invoked on destinations that are actively being used, is more important than creating new entries for possibly non-existent neighbors.


6.1. Prioritize NDP Activities

Not all Neighbor Discovery activities are equally important. Specifically, requests to perform large numbers of address resolutions for non-existent Neighbor Cache Entries should not come at the expense of servicing requests related to keeping existing, in-use entries properly up-to-date. Thus, implementations should divide work activities into categories having different priorities. The following gives examples of different activities and their importance in rough priority order.

1. It is critical to respond to Neighbor Solicitations for one’s own address, especially when the node is a router. Whether for address resolution or Neighbor Unreachability Detection, failure to respond to Neighbor Solicitations results in immediate problems. Failure to respond to NS requests that are part of NUD can cause neighbors to delete the NCE for that address, and will result in follow-up NS messages using multicast. Once an entry has been flushed, existing traffic for destinations using that entry can no longer be forwarded until address resolution completes successfully. In other words, not responding to NS messages further increases the NDP load, and causes on-going communication to fail.

2. It is critical to revalidate one’s own existing NCEs in need of refresh. As part of NUD, ND is required to frequently revalidate existing, in-use entries. Failure to do so can result in the entry being discarded. For in-use entries, discarding the entry will almost certainly result in a subsequent request to perform address resolution on the entry, but this time using multicast. As above, once the entry has been flushed, existing traffic for destinations using that entry can no longer be forwarded until address resolution completes successfully.

3. To maintain the stability of the control plane, Neighbor Discovery activity related to traffic sourced by the router (as opposed to traffic being forwarded by the router) should be given high priority. Whenever network problems occur, debugging and making other operational changes requires being able to query and access the router. In addition, routing protocols may begin to react (negatively) to perceived connectivity problems, causing additional undesirable ripple effects.

4. Activities related to the sending and receiving of Router Advertisements also impact address resolution. [XXX say more?]

5. Traffic to unknown addresses should be given lowest priority. Indeed, it may be useful to distinguish between "never seen" addresses and those that have been seen before, but that do not have a corresponding NCE. Specifically, the conceptual processing algorithm in IPv6 Neighbor Discovery [RFC4861] calls for deleting NCEs under certain conditions. Rather than delete them completely, however, it might be useful to at least keep track of the fact that an entry at one time existed, in order to prioritize address resolution requests for such neighbors compared with neighbors that have never been seen before.
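One way to picture the ordering above is a single work queue keyed by priority class. The following is a minimal sketch, assuming Python's standard heapq module; the class name NdpWorkQueue, the category labels, and the choice to split item 5 into "seen before" and "never seen" classes are illustrative assumptions.

```python
import heapq
import itertools

# Lower number = served first. Labels are hypothetical names for the
# activities listed above.
PRIORITY = {
    "ns_for_own_address": 1,
    "nud_refresh_existing": 2,
    "router_sourced_traffic": 3,
    "router_advertisement": 4,
    "unknown_seen_before": 5,
    "unknown_never_seen": 6,
}


class NdpWorkQueue:
    def __init__(self):
        self._heap = []
        # Monotonic counter keeps FIFO order within one priority class.
        self._counter = itertools.count()

    def submit(self, kind, target):
        item = (PRIORITY[kind], next(self._counter), kind, target)
        heapq.heappush(self._heap, item)

    def next_task(self):
        if not self._heap:
            return None
        _, _, kind, target = heapq.heappop(self._heap)
        return kind, target
```

With this arrangement a flood of "unknown_never_seen" requests cannot delay replies to Neighbor Solicitations or NUD refreshes, although a real implementation would also need to bound the size of the low-priority classes.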

6.2. Queue Tuning

On implementations in which requests to NDP are submitted via a single queue, router vendors SHOULD provide operators with means to control both the rate of link-layer address resolution requests placed into the queue and the size of the queue. This will allow operators to tune Neighbor Discovery for their specific environment. The ability to set per-interface or per-subnet queue limits below the global queue limit might confine the damage to the neighbor discovery process to the targeted network.

Setting those values is a careful balancing act. The lower the rate of entry into the queue, the less load there will be on the ND process; however, it also means that it will take the router longer to learn legitimate destinations. In a datacenter with 6,000 hosts attached to a single router, setting that rate below 1,000 requests per second would mean that resolving all of the addresses from an initial state (or after an event that invalidates the address cache, such as an STP TCN) may take over 6 seconds. Similarly, the smaller the queue, the higher the likelihood of an attack being able to knock out legitimate traffic (but the lower the memory utilization on the router).
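The trade-off in the datacenter example is simple arithmetic, shown here as a small sketch. The function name is hypothetical, and the model assumes every resolution succeeds on the first attempt and that the queue admits requests at exactly the configured rate.

```python
def seconds_to_resolve_all(host_count, requests_per_second):
    """Lower bound on the time to re-learn every host after a cache flush."""
    return host_count / requests_per_second


# The example from the text: 6,000 hosts at 1,000 requests per second.
print(seconds_to_resolve_all(6000, 1000))
# Halving the admission rate doubles the recovery time.
print(seconds_to_resolve_all(6000, 500))
```

Any attack traffic sharing the same queue only lengthens this recovery time, since each admitted slot taken by an unassigned address is one not available to a real host.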

6.3. ND cache priming and refresh

With all of the above recommendations implemented, it should be possible to survive a "scan attack" with very little impact to the network. However, adding new hosts to the network (and the sending of traffic to them) may still be negatively impacted. Traffic to those new hosts would have to go through the unknown Neighbor Resolution queue, which is where the attack traffic would end up as well. A solution to this would be for any new host that joins the network to "announce" itself and be added to the cache, so that packets destined to it do not need to go through the unknown NDP queue. This could be done by sending a ping packet to the all-routers multicast address, which would then trigger the router’s own neighbor resolution process, which should be in a different queue than other packets.

All attempts should be made to keep these addresses in cache, since any eviction of legitimate hosts from the cache could potentially place resolutions for them into the same queue as the attack traffic. At present, [RFC4861] states that there should be MAX_UNICAST_SOLICIT (3) attempts, RETRANS_TIMER (1 second) apart, so if there is an interruption in the network or control plane processing for longer than 3 seconds during the refresh, the entry would be evicted from the ND cache. Any network event which takes longer than 3 seconds to converge (UDLD, STP, etc. may take 30+ seconds) while under an attack would result in ND cache eviction. If an entry is evicted during a scan, connectivity could be lost for an extended period of time.

NDP refresh timers could be revised as suggested in draft-nordmark-6man-impatient-nud-00 [1]. Implementations SHOULD provide configurable values for MAX_UNICAST_SOLICIT and RETRANS_TIMER, and include capabilities for binary/exponential backoff.

A suggested algorithm, which retains backward compatibility with [RFC4861], is: operator configurable values for MAX_UNICAST_SOLICIT and RETRANS_TIMER, and a way to set an adaptive back-off multiple (similar to --, call it BACKOFF_MULTIPLE), so that we could implement:

next_retrans = ($BACKOFF_MULTIPLE^$solicit_attempt_num)*$RETRANS_TIMER + jittered value.

The recommended behavior is to have 5 attempts, spaced as follows: 0 (initial request), 1 second later, 3 seconds later, then 9, and then 27, which represents:

MAX_UNICAST_SOLICIT=5

RETRANS_TIMER=1 (default)

BACKOFF_MULTIPLE=3

If BACKOFF_MULTIPLE=1 (which should be the default value), and MAX_UNICAST_SOLICIT=3, you would get the backwards-compatible RFC behavior, but operators should be able to adjust the values as necessary to ensure that they are sufficiently aggressive about retaining ND entries in cache.

An implementation following this algorithm would, if the request was not answered at first (due, for example, to a transitory condition), retry immediately and then back off for progressively longer periods. This would allow for a reasonably fast resolution time when the transitory condition clears.
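The suggested schedule can be worked through with a short sketch. It assumes that solicit_attempt_num counts from 0 for the wait following the initial request, which is the reading that reproduces the 1, 3, 9, 27 spacing given above; the function names and the uniform jitter model are illustrative assumptions.

```python
import random


def retransmit_intervals(max_unicast_solicit, retrans_timer,
                         backoff_multiple, jitter=0.0, rng=random):
    """Waits (in seconds) between successive solicitations."""
    intervals = []
    for attempt in range(max_unicast_solicit - 1):
        base = (backoff_multiple ** attempt) * retrans_timer
        # Jitter is modeled as a uniform value in [0, jitter]; assumption.
        intervals.append(base + rng.uniform(0, jitter))
    return intervals


def send_offsets(intervals):
    """Time of each solicitation relative to the initial request."""
    offsets = [0.0]
    for wait in intervals:
        offsets.append(offsets[-1] + wait)
    return offsets


recommended = retransmit_intervals(5, 1, 3)   # waits of 1, 3, 9, 27 seconds
rfc_default = retransmit_intervals(3, 1, 1)   # waits of 1, 1 second
print(send_offsets(recommended))              # last attempt 40 s after the first
print(send_offsets(rfc_default))
```

Under this reading, the recommended values stretch the probing period from about 2 seconds between the first and last attempt to about 40 seconds, which is long enough to ride out the 30+ second convergence events mentioned earlier.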


6.4. NDP Protocol Gratuitous NA

[RFC4861], Sections 7.2.5 and 7.2.6, requires that unsolicited neighbor advertisements result in the receiver setting its neighbor cache entry to STALE, kicking off the resolution of the neighbor using neighbor solicitation. If the link-layer address in an unsolicited neighbor advertisement matches that of the existing ND cache entry, routers SHOULD retain the existing entry, updating its status with regard to LRU retention policy.

Hosts MAY be configured to send unsolicited Neighbor Advertisements at a rate set at the discretion of the operators. The rate SHOULD be appropriate to the sizing of ND cache parameters and the host count on the subnet. An unsolicited NA rate parameter MUST NOT be enabled by default. Hosts must apply jitter to the configured interval between unsolicited transmissions. Hosts that receive neighbor solicitation requests from a router following each of three subsequent gratuitous NA intervals MUST revert to [RFC4861] behavior.
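The router-side handling proposed above can be sketched as follows. This is a non-normative illustration assuming Python's collections.OrderedDict as the LRU structure; the class name, method names, and return strings are hypothetical, and an advertisement carrying a different link-layer address is assumed to have the Override flag set.

```python
from collections import OrderedDict


class LruNeighborCache:
    def __init__(self, capacity):
        self.capacity = capacity
        # ip -> {"lladdr", "state"}; the last item is the most recently used.
        self.entries = OrderedDict()

    def install(self, ip, lladdr, state="REACHABLE"):
        self.entries[ip] = {"lladdr": lladdr, "state": state}
        self.entries.move_to_end(ip)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict least recently used

    def on_unsolicited_na(self, ip, lladdr):
        entry = self.entries.get(ip)
        if entry is None:
            # No existing entry: the advertisement is discarded.
            return "ignored"
        if entry["lladdr"] == lladdr:
            # Proposed behavior: keep the entry as-is and refresh its
            # position in the LRU order so it is less likely to be evicted.
            self.entries.move_to_end(ip)
            return "refreshed"
        # Different address (Override assumed set): record it and mark STALE.
        entry["lladdr"] = lladdr
        entry["state"] = "STALE"
        self.entries.move_to_end(ip)
        return "stale"
```

The effect is that a host which periodically announces an unchanged address stays at the recently-used end of the cache without the router having to solicit it.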

Implementation of new behavior for unsolicited neighbor advertisement would make it possible under appropriate circumstances to greatly reduce the dependence on the neighbor solicitation process for retaining existing ND cache entries.

This may impact the detection of one-way reachability.

It is understood that this section may need to be moved into a separate document -- it is (currently) provided here for discussion purposes.

7. IANA Considerations

No IANA resources or considerations are requested in this draft.

8. Security Considerations

This document outlines mitigation options that operators can use to protect themselves from Denial of Service attacks. Implementation advice to router vendors aimed at ameliorating known problems carries the risk of previously unforeseen consequences. It is not believed that these techniques create additional security or DOS exposure.


9. Acknowledgements

The authors would like to thank Ron Bonica, Troy Bonin, John Jason Brzozowski, Randy Bush, Vint Cerf, Jason Fesler, Erik Kline, Jared Mauch, Chris Morrow and Suran De Silva. Special thanks to Thomas Narten for detailed review and (even more so) for providing text!

Apologies to anyone we may have missed; it was not intentional.

10. References

10.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4398] Josefsson, S., "Storing Certificates in the Domain Name System (DNS)", RFC 4398, March 2006.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

[RFC6164] Kohno, M., Nitzan, B., Bush, R., Matsuzaki, Y., Colitti, L., and T. Narten, "Using 127-Bit IPv6 Prefixes on Inter-Router Links", RFC 6164, April 2011.

10.2. Informative References

[RFC4255] Schlyter, J. and W. Griffin, "Using DNS to Securely Publish Secure Shell (SSH) Key Fingerprints", RFC 4255, January 2006.

URIs

[1]

Appendix A. Text goes here.

TBD


Authors’ Addresses

Warren Kumari Google

Email: [email protected]

Igor Yahoo! 45 W 18th St New York, NY USA

Email: [email protected]

Joel Zynga 111 Evelyn Sunnyvale, CA USA

Email: [email protected]


Network Working Group W. Kumari Internet-Draft Google Intended status: Informational I. Gashinsky Expires: April 25, 2013 Yahoo! J. Jaeggli Zynga K. Chittimaneni Google October 22, 2012

Neighbor Discovery Enhancement for DOS mitigation draft-gashinsky-6man-v6nd-enhance-02

Abstract

In IPv4, subnets are generally small, made just large enough to cover the actual number of machines on the subnet. In contrast, the default IPv6 subnet size is a /64, a number so large it covers billions of billions of addresses, the overwhelming majority of which will be unassigned. Consequently, simplistic implementations of Neighbor Discovery can be vulnerable to denial of service attacks whereby they attempt to perform address resolution for large numbers of unassigned addresses. Such denial of service attacks can be launched intentionally (by an attacker), or result from legitimate operational tools that scan networks for inventory and other purposes. As a result of these vulnerabilities, new devices may not be able to "join" a network, it may be impossible to establish new IPv6 flows, and existing IPv6 transport flows may be interrupted.

This document describes a modification to the [RFC4861] neighbor discovery protocol aimed at improving the resilience of the neighbor discovery process. We call this process Gratuitous neighbor discovery; it derives inspiration in part from the analogous IPv4 gratuitous ARP implementation.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 25, 2013.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents

1. Introduction
   1.1. Applicability
2. The Problem
   2.1. Scenario 1 - DoS condition induced by default router failure
3. Terminology
4. Background
5. Neighbor Discovery Overview
6. Proposed Solutions
   6.1. NDP Protocol Gratuitous NA
   6.2. User Configurable DELAY_FIRST_PROBE_TIME
7. IANA Considerations
8. Security Considerations
9. Acknowledgements
10. References
   10.1. Normative References
   10.2. Informative References
Authors’ Addresses


1. Introduction

This document describes modifications to the IPv6 Neighbor Discovery protocol [RFC4861] in order to reduce exposure to vulnerabilities when a network is scanned, either by an intruder, as part of a deliberate DOS attempt, or through the use of scanning tools that perform network inventory, security audits, etc. (e.g., "nmap"). In some cases, DOS-like conditions can also be induced by legitimate traffic in heavy traffic networks such as campuses or datacenters.

1.1. Applicability

This document is primarily intended for implementors of [RFC4861].

This document is a companion to two additional documents. The first, "Operational Neighbor Discovery Problems" [RFC6583], addressed the problem in detail and described operational and implementation mitigations within the framework of the existing protocol. The second, "Neighbor Unreachability Detection is too impatient" [I-D.ietf-6man-impatient-nud], proposes to alter Neighbor Unreachability Detection by relaxing its rules in an attempt to keep devices in the cache.

In this document we propose alterations that allow the update or installation of neighbor entries without the instigation of a full [RFC4861] neighbor solicitation.

2. The Problem

In IPv4, subnets are generally small, made just large enough to cover the actual number of machines on the subnet. For example, an IPv4 /20 contains only 4096 addresses. In contrast, the default IPv6 subnet size is a /64, a number so large it covers literally billions of billions of addresses, the overwhelming majority of which will be unassigned. Consequently, simplistic implementations of Neighbor Discovery can be vulnerable to denial of service attacks whereby they perform address resolution for large numbers of unassigned addresses. Such denial of service attacks can be launched intentionally (by an attacker), or result from legitimate operational tools that scan networks for inventory and other purposes. As a result of these vulnerabilities, new devices may not be able to "join" a network, it may be impossible to establish new IPv6 flows, and existing IPv6 transport flows may be interrupted.

Network scans attempt to find and probe devices on a network. Typically, scans are performed on a range of target addresses, or all the addresses on a particular subnet. When such probes are directed via a router, and the target addresses are on a directly attached network, the router will attempt to perform address resolution on a large number of destinations (i.e., some fraction of the 2^64 addresses on the subnet). The process of testing for the (non)existence of neighbors can induce a denial of service condition, where the number of Neighbor Discovery requests overwhelms the implementation’s capacity to process them, exhausts available memory, replaces existing in-use mappings with incomplete entries that will never be completed, etc. The result can be network disruption, where existing traffic may be impacted, and devices that join the network find that address resolution fails.

In order to alleviate the risk associated with this DOS threat, some router implementations have taken steps to rate-limit the processing of Neighbor Solicitations (NS). While these mitigations do help, they do not fully address the issue and may introduce their own set of potential liabilities to the neighbor discovery process.

In some network environments, legitimate Neighbor Discovery traffic from a large number of connected hosts could induce a DoS condition even without the use of any scanning tools.

2.1. Scenario 1 - DoS condition induced by default router failure

Consider the following scenario: a pair of routers, R1 and R2, acts as default routers for a campus Wi-Fi network that serves thousands of clients. These clients range from traditional laptops with common OSes such as Windows, Mac OS X, etc., to smart phones and tablets running a variety of mobile OSes. R1, R2 and all clients are configured with default ND parameters.

Under normal operating conditions, R1 acts as a default gateway for all client traffic and R2 is mostly acting as a standby. R1 and R2 routinely send out Router Advertisements and all nodes perform Neighbor Discovery as per the default timers configured. Clients that are actively transmitting and receiving data will likely have a Neighbor Cache entry for R1 as REACHABLE and R2 as STALE.

Now imagine that for some reason (power outage, hardware failure, etc.) R1 goes down. When this happens, R2 begins various housekeeping tasks such as reconverging its routing protocols (OSPF, BGP, etc.), recalculating layer 2 topologies such as in STP and so on. Typically, such reconvergence incidents are quite CPU intensive depending on the size of the topology and are generally aggravated in dual stack environments. Once clients determine that R1 is no longer reachable, they would start using R2 as their default router.

At this point, the Neighbor Cache Entry for R2 is still marked as STALE. As per RFC4861, a node will start sending packets to R2, mark the neighbor cache entry for R2 as DELAY and set a timer to expire in DELAY_FIRST_PROBE_TIME seconds. DELAY_FIRST_PROBE_TIME is a fixed node constant with a value of 5 seconds. If the entry is still in the DELAY state when the timer expires, the entry’s state changes to PROBE. Upon entering the PROBE state, a node sends a unicast Neighbor Solicitation message to R2 using the cached link-layer address.

Ordinarily, it is highly likely that the client will receive reachability confirmation within the 5 seconds of DELAY_FIRST_PROBE_TIME by virtue of hints from upper layer protocols. However, in this scenario, given that R2 is busy doing other things, it is possible that it will take a longer time for the client to receive said reachability confirmation, forcing it to enter the PROBE state and send out a unicast NS message.
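The client-side transitions described above can be sketched as a small state machine. This is a simplified illustration, not an implementation of RFC 4861: the NeighborEntry class and its method names are hypothetical, and retransmissions and the other cache states are omitted.

```python
from enum import Enum

DELAY_FIRST_PROBE_TIME = 5.0  # seconds, fixed node constant in RFC 4861


class State(Enum):
    STALE = "STALE"
    DELAY = "DELAY"
    PROBE = "PROBE"
    REACHABLE = "REACHABLE"


class NeighborEntry:
    """Hypothetical, simplified Neighbor Cache entry for one neighbor."""

    def __init__(self):
        self.state = State.STALE
        self.delay_deadline = None
        self.probes_sent = 0

    def send_packet(self, now):
        # The first packet sent to a STALE neighbor starts the DELAY timer.
        if self.state is State.STALE:
            self.state = State.DELAY
            self.delay_deadline = now + DELAY_FIRST_PROBE_TIME

    def reachability_confirmed(self):
        # An upper-layer hint arrived in time, so no probe is needed.
        self.state = State.REACHABLE
        self.delay_deadline = None

    def tick(self, now):
        # Timer expiry in DELAY moves the entry to PROBE; a unicast NS is sent.
        if self.state is State.DELAY and now >= self.delay_deadline:
            self.state = State.PROBE
            self.probes_sent += 1
```

A client whose confirmation arrives within the 5 seconds never reaches PROBE; a client talking to a busy R2 does, and contributes one more unicast NS to the load on R2.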

With thousands of clients now sending out unicast NS messages to R2 in a short period of time, while it is busy dealing with other reconvergence-related calculations, the result is effectively a DoS condition caused entirely by legitimate traffic.

3. Terminology

Address Resolution: Address resolution is the process through which a node determines the link-layer address of a neighbor given only its IP address. In IPv6, address resolution is performed as part of Neighbor Discovery [RFC4861], p60.

Forwarding Plane: That part of a router responsible for forwarding packets. In higher-end routers, the forwarding plane is typically implemented in specialized hardware optimized for performance. Forwarding steps include determining the correct outgoing interface for a packet, decrementing its Time To Live (TTL), verifying and updating the checksum, placing the correct link-layer header on the packet, and forwarding it.

Control Plane: That part of the router implementation that maintains the data structures that determine where packets should be forwarded. The control plane is typically implemented as a "slower" software process running on a general purpose processor and is responsible for such functions as the routing protocols, performing management and resolving the correct link-layer address for adjacent neighbors. The control plane "controls" the forwarding plane by programming it with the information needed for packet forwarding.

Neighbor Cache: As described in [RFC4861], the data structure that holds the cache of (amongst other things) IP address to link-layer address mappings for connected nodes. The forwarding plane accesses the Neighbor Cache on every forwarded packet. Thus it is usually implemented in an ASIC.

Neighbor Discovery Process: The Neighbor Discovery Process (NDP) is that part of the control plane that implements the Neighbor Discovery protocol. NDP is responsible for performing address resolution and maintaining the Neighbor Cache. When forwarding packets, the forwarding plane accesses entries within the Neighbor Cache. Whenever the forwarding plane processes a packet for which the corresponding Neighbor Cache Entry is missing or incomplete, it notifies NDP to take appropriate action (typically via a shared queue). NDP picks up requests from the shared queue and performs any necessary actions. In many implementations it is also responsible for responding to router solicitation messages, Neighbor Unreachability Detection (NUD), etc.

4. Background

Modern router architectures separate the forwarding of packets (forwarding plane) from the decisions needed to decide where the packets should go (control plane). In order to deal with the high number of packets per second the forwarding plane is generally implemented in hardware and is highly optimized for the task of forwarding packets. In contrast, the NDP control plane is mostly implemented in software processes running on a general purpose processor.

When a router needs to forward an IP packet, the forwarding plane logic performs the longest match lookup to determine where to send the packet and what outgoing interface to use. To deliver the packet to an adjacent node, it encapsulates the packet in a link-layer frame (which contains a header with the link-layer destination address). The forwarding plane logic checks the Neighbor Cache to see if it already has a suitable link-layer destination, and if not, places the request for the required information into a queue, and signals the control plane (i.e., NDP) that it needs the link-layer address resolved.

In order to protect NDP specifically and the control plane generally from being overwhelmed with these requests, appropriate steps must be taken. For example, the size and rate of the queue might be limited. NDP running in the control plane of the router dequeues requests and performs the address resolution function (by performing a neighbor solicitation and listening for a neighbor advertisement). This process is usually also responsible for other activities needed to maintain link-layer information, such as Neighbor Unreachability Detection (NUD).
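A size- and rate-limited queue between the forwarding plane and NDP might look like the following sketch. The ResolutionQueue class, its parameters, and the notion of a per-tick budget are assumptions made for illustration; real implementations differ.

```python
from collections import deque


class ResolutionQueue:
    """Hypothetical bounded queue between the forwarding plane and NDP."""

    def __init__(self, max_size, max_per_tick):
        self.max_size = max_size          # limit on queued requests
        self.max_per_tick = max_per_tick  # limit on resolutions per tick
        self.pending = deque()
        self.dropped = 0

    def enqueue(self, target):
        # Called by the forwarding plane on a Neighbor Cache miss.
        if len(self.pending) >= self.max_size:
            self.dropped += 1             # queue full: the request is lost
            return False
        self.pending.append(target)
        return True

    def drain(self):
        # Called by NDP once per tick; takes at most max_per_tick targets.
        batch = []
        while self.pending and len(batch) < self.max_per_tick:
            batch.append(self.pending.popleft())
        return batch
```

The limits protect the control plane, but they do not distinguish between requests: once the queue is full of unassigned scan targets, a request for a legitimate neighbor is dropped just like any other.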

An attacker sending the appropriate packets to addresses on a given subnet can cause the router to queue attempts to resolve so many addresses that it crowds out attempts to resolve "legitimate" addresses (and in many cases becomes unable to perform maintenance of existing entries in the neighbor cache, and unable to answer Neighbor Solicitations). This condition can result in the inability to resolve new neighbors and loss of reachability to neighbors with existing ND cache entries. During testing it was concluded that 4 simultaneous nmap sessions from a low-end computer were sufficient to overwhelm a router’s neighbor discovery process and therefore render forwarding unusable.

This behavior has been observed across multiple platforms and implementations.

5. Neighbor Discovery Overview

When a packet arrives at (or is generated by) a router for a destination on an attached link, the router needs to determine the correct link-layer address to send the packet to. The router checks the Neighbor Cache for an existing Neighbor Cache Entry for the neighbor, and if none exists, invokes the address resolution portions of the IPv6 Neighbor Discovery [RFC4861] protocol to determine the link-layer address.

RFC4861 Section 5.2 (Conceptual Sending Algorithm) outlines how this process works. A very high level summary is that the device creates a new Neighbor Cache Entry for the neighbor, sets the state to INCOMPLETE, queues the packet and initiates the actual address resolution process. The device then sends out one or more Neighbor Solicitations, and when it receives a corresponding Neighbor Advertisement, completes the Neighbor Cache Entry and sends the queued packet.
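The high-level summary above can be sketched as follows. The NeighborCache class is a hypothetical model: timeouts, NS retransmission, and limits on the per-entry packet queue are deliberately left out.

```python
class NeighborCache:
    """Hypothetical model of the address resolution steps summarized above."""

    def __init__(self):
        self.entries = {}        # IPv6 address -> entry dict
        self.solicitations = []  # targets for which an NS was sent
        self.transmitted = []    # (link-layer address, packet) pairs

    def send(self, dst, packet):
        entry = self.entries.get(dst)
        if entry is None:
            # No entry: create it as INCOMPLETE, queue the packet, send an NS.
            entry = {"state": "INCOMPLETE", "lladdr": None, "queue": [packet]}
            self.entries[dst] = entry
            self.solicitations.append(dst)
        elif entry["state"] == "INCOMPLETE":
            # Resolution already in progress: only queue the packet.
            entry["queue"].append(packet)
        else:
            self.transmitted.append((entry["lladdr"], packet))

    def receive_advertisement(self, src, lladdr):
        entry = self.entries.get(src)
        if entry is None or entry["state"] != "INCOMPLETE":
            return
        # Complete the entry and send the queued packets.
        entry["state"] = "REACHABLE"
        entry["lladdr"] = lladdr
        for packet in entry["queue"]:
            self.transmitted.append((lladdr, packet))
        entry["queue"] = []
```

Note that every packet to an unassigned address creates an INCOMPLETE entry that no Neighbor Advertisement will ever complete, which is the resource a scan exhausts.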

6. Proposed Solutions

Let us examine a few possible solutions that could alleviate the issues discussed in ’The Problem’ section.

6.1. NDP Protocol Gratuitous NA

RFC 4861, Sections 7.2.5 and 7.2.6 [RFC4861], require that unsolicited neighbor advertisements result in the receiver setting its neighbor cache entry to STALE, kicking off the resolution of the neighbor using neighbor solicitation. If the link-layer address in an unsolicited neighbor advertisement matches that of the existing ND cache entry, routers SHOULD retain the existing entry, updating its status with regard to the LRU retention policy.

Hosts MAY be configured to send unsolicited Neighbor Advertisements at a rate set at the discretion of the operators. The rate SHOULD be appropriate to the sizing of ND cache parameters and the host count on the subnet. An unsolicited NA rate parameter MUST NOT be enabled by default. Hosts must apply jitter to the configured interval between transmissions. Hosts that receive neighbor solicitation requests from a router following each of three successive gratuitous NA intervals MUST revert to RFC 4861 behavior.
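One possible reading of the host behavior is sketched below. The jitter range is not specified by this document, so the jitter_fraction parameter is an assumption, as are the function and class names; "three successive intervals" is interpreted here as three consecutive intervals.

```python
import random


def next_unsolicited_na_delay(base_interval, jitter_fraction=0.25, rng=random):
    """Return a jittered delay (seconds) before the next unsolicited NA.

    base_interval is the operator-configured interval; jitter_fraction is a
    hypothetical knob, since the jitter range is not specified.
    """
    low = base_interval * (1.0 - jitter_fraction)
    high = base_interval * (1.0 + jitter_fraction)
    return rng.uniform(low, high)


class GratuitousNaHost:
    """Tracks whether the host should keep sending unsolicited NAs."""

    REVERT_THRESHOLD = 3  # consecutive intervals followed by a router NS

    def __init__(self):
        self.solicited_streak = 0
        self.enabled = True

    def interval_finished(self, router_sent_ns):
        # Count consecutive intervals after which the router still solicited.
        self.solicited_streak = self.solicited_streak + 1 if router_sent_ns else 0
        if self.solicited_streak >= self.REVERT_THRESHOLD:
            self.enabled = False  # revert to plain RFC 4861 behavior
```

The jitter keeps a large population of hosts from sending their advertisements in lockstep, which would otherwise recreate the burst the mechanism is meant to avoid.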

Implementation of new behavior for unsolicited neighbor advertisement would make it possible under appropriate circumstances to greatly reduce the dependence on the neighbor solicitation process for retaining existing ND cache entries.

This may impact the detection of one-way reachability.

6.2. User Configurable DELAY_FIRST_PROBE_TIME

A very simple solution for Scenario 1 could be to have a user configurable DELAY_FIRST_PROBE_TIME that could be set to a higher value than the current constant of 5 seconds. This would allow clients to keep sending traffic in the DELAY state, while giving more time for R2 to stabilize before it has to process the barrage of ND messages. It will be up to Network administrators to determine what this value should be based upon unique characteristics of their setup. Having a longer DELAY_FIRST_PROBE_TIME does run the risk of clients sending traffic without ever knowing that they have forward reachability. However, in most cases, the router’s forwarding plane remains unaffected during high CPU events and therefore the likelihood of the traffic making it to the destination is high.
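The effect of a larger value can be illustrated with a small calculation. In this sketch, confirmation_delays is a hypothetical list of the times (in seconds) each client waits before an upper-layer hint confirms reachability of R2; the function name and the sample values are assumptions.

```python
DEFAULT_DELAY_FIRST_PROBE_TIME = 5.0  # seconds, RFC 4861 constant


def clients_that_probe(confirmation_delays,
                       delay_first_probe_time=DEFAULT_DELAY_FIRST_PROBE_TIME):
    """Count clients whose DELAY timer expires before reachability is confirmed."""
    return sum(1 for d in confirmation_delays if d > delay_first_probe_time)


# Hypothetical confirmation times while R2 is busy reconverging.
sample_delays = [2.0, 6.0, 8.0, 12.0]
probing_with_default = clients_that_probe(sample_delays)
probing_with_longer_timer = clients_that_probe(sample_delays, 10.0)
```

With the sample values, three of the four clients would send a unicast NS under the default timer, but only one would do so with a 10-second timer.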

7. IANA Considerations

No IANA resources or considerations are requested in this draft.

8. Security Considerations

This technique has potential impact on neighbor detection and in particular the discovery of unidirectional forwarding problems.

9. Acknowledgements

The authors would like to thank Ron Bonica, Troy Bonin, John Jason Brzozowski, Randy Bush, Vint Cerf, Jason Fesler, Erik Kline, Jared Mauch, Chris Morrow and Suran De Silva. Special thanks to Thomas Narten for detailed review and (even more so) for providing text!

Apologies to anyone we may have missed; it was not intentional.

10. References

10.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4398] Josefsson, S., "Storing Certificates in the Domain Name System (DNS)", RFC 4398, March 2006.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

[RFC6164] Kohno, M., Nitzan, B., Bush, R., Matsuzaki, Y., Colitti, L., and T. Narten, "Using 127-Bit IPv6 Prefixes on Inter- Router Links", RFC 6164, April 2011.

10.2. Informative References

[I-D.ietf-6man-impatient-nud] Nordmark, E. and I. Gashinsky, "Neighbor Unreachability Detection is too impatient", draft-ietf-6man-impatient-nud-02 (work in progress), July 2012.

[RFC4255] Schlyter, J. and W. Griffin, "Using DNS to Securely Publish Secure Shell (SSH) Key Fingerprints", RFC 4255, January 2006.

[RFC6583] Gashinsky, I., Jaeggli, J., and W. Kumari, "Operational Neighbor Discovery Problems", RFC 6583, March 2012.

Authors’ Addresses

Warren Kumari
Google

Email: [email protected]

Igor Gashinsky
Yahoo!
45 W 18th St
New York, NY
USA

Email: [email protected]

Joel Jaeggli
Zynga
111 Evelyn
Sunnyvale, CA
USA

Email: [email protected]

Kiran
Google
1600 Amphitheater Pkwy
Mountain View, CA
USA

Email: [email protected]

IPv6 maintenance Working Group (6man)                            F. Gont
Internet-Draft                                                   UK CPNI
Updates: 4861 (if approved)                            December 15, 2011
Intended status: Standards Track
Expires: June 17, 2012

Managing the Address Generation Policy for Stateless Address Autoconfiguration in IPv6
draft-gont-6man-managing-slaac-policy-00

Abstract

This document describes an operational problem that arises due to the impossibility of managing the address generation policy employed by hosts participating in IPv6 Stateless Address Autoconfiguration (SLAAC). Additionally, it specifies a new field in the Prefix Information option of Router Advertisement messages, such that routers can advertise, for each network prefix included in a Router Advertisement message, the desired address generation policy to be used for SLAAC.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on June 17, 2012.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Gont Expires June 17, 2012 [Page 1] Internet-Draft Managing the Address Generation Policy December 2011

Table of Contents

1. Introduction
2. Updating the Prefix Information option
3. Router specification
  3.1. Router configuration
  3.2. Router operation
4. Host specification
  4.1. Host configuration
  4.2. Host Operation
5. IANA Considerations
6. Privacy Considerations
7. Security Considerations
8. Acknowledgements
9. References
  9.1. Normative References
  9.2. Informative References
Appendix A. Changes from previous versions of the document (to be removed by the RFC Editor before publication of this document as an RFC)
  A.1. Changes from draft-gont-6man-managing-privacy-extensions-01
Author’s Address

1. Introduction

[RFC4862] specifies the Stateless Address Autoconfiguration (SLAAC) for IPv6, which typically results in hosts configuring one or more addresses composed of a network prefix advertised by a local router, and an Interface Identifier (IID) that typically embeds a hardware address (using the Modified EUI-64 Format [RFC4291]).
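As an illustration of how such an IID embeds the hardware address, the following sketch derives a Modified EUI-64 identifier from a 48-bit MAC address as described in [RFC4291]: 0xFFFE is inserted in the middle and the universal/local bit is inverted. The function name and the sample MAC address are assumptions made for the example.

```python
def modified_eui64_iid(mac):
    """Build a Modified EUI-64 interface identifier from a 48-bit MAC address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Insert 0xFFFE in the middle and invert the universal/local bit.
    eui = bytearray(octets[:3] + b"\xff\xfe" + octets[3:])
    eui[0] ^= 0x02
    # Format the 8 octets as four 16-bit hexadecimal groups.
    return ":".join(f"{(eui[i] << 8) | eui[i + 1]:04x}" for i in range(0, 8, 2))


iid = modified_eui64_iid("00:1a:2b:3c:4d:5e")  # "021a:2bff:fe3c:4d5e"
```

Because the same MAC address always yields the same IID, the low-order 64 bits of the resulting address stay constant regardless of the prefix in use.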

Since the identifiers (e.g. Ethernet MAC addresses) typically used for those addresses are usually globally unique, the IPv6 addresses generated as specified in [RFC4291] can be leveraged to track and correlate the activity of a node, thus negatively affecting the privacy of users.

The "Privacy Extensions for Stateless Address Autoconfiguration in IPv6" [RFC4941] were introduced to make it more difficult for eavesdroppers and other information collectors to correlate the activities of a node, and basically result in random Interface Identifiers that are typically more difficult to leverage than their Modified EUI-64 Format counterparts. Some variants of these "Privacy Extensions" have been implemented in a variety of systems, some of which (notably Microsoft Windows Vista and Microsoft Windows 7) enable them by default.

The impossibility of managing the address generation policy employed for SLAAC poses a problem when a site desires or requires a specific policy for the generation of IPv6 addresses. For example, some operating systems (notably FreeBSD) implement "Privacy Extensions", but do not enable them by default. And since there is currently no mechanism in IPv6 to convey the desired address-generation policy, administrators have no option other than manual configuration to enable such extensions. On the other hand, some implementations (notably Windows Vista and Windows 7) that enable "Privacy Extensions" by default might need to be deployed on sites that require the use of stable addresses (e.g. those resulting from Modified EUI-64 Format Identifiers [RFC4291]), for the ease of correlating network activities or enforcing simple access controls. However, since there is currently no mechanism to convey the desired address-generation policy for SLAAC, an administrator would need to manually configure each of the attached nodes such that they employ the desired address generation policy.

Depending on manual configuration for enabling a specific and homogeneous address-generation policy may not only result in a lot of work for the administrator, but may also be difficult to implement, particularly when considering mobile nodes such as laptops and mobile phones [Broersma]. Additionally, the lack of a mechanism for conveying address-generation policy information might preclude the use of some technologies, such as "Privacy Extensions" [RFC4941], which are desirable in most general environments (e.g., a typical home network, or an Internet cafe), but are currently only enabled as a result of manual configuration.

This document specifies a new field in the Prefix Information option of Router Advertisement messages, such that routers can advertise, for each network prefix to be used for SLAAC, the desired policy for the generation of IPv6 addresses. The policy information is simply "advisory" information, in the sense that hosts still have the final word on which address generation policy they use.

The aforementioned policy information basically indicates whether "stable" or "temporary" addresses are desired. We note that while the only address generation policies that have so far been standardized by the IETF are those based on e.g. IEEE identifiers and "Privacy Extensions for SLAAC", other address generation policies are possible. For example, [STABLE-PRIV] describes an address generation policy which results in interface identifiers that are stable for each prefix used for SLAAC, but that change from one autoconfiguration prefix to another.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Updating the Prefix Information option

The syntax of the Prefix Information option is updated as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     | Prefix Length |L|A|R|AGP|Rsvd1|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Valid Lifetime                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Preferred Lifetime                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Reserved2                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                            Prefix                             +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

An additional field, the two-bit "AGP" (Address Generation Policy) field, is specified for the Prefix Information option. The semantics of each of the possible values are:

00: No specific advice is provided for the generation of addresses for this prefix.

01: When generating addresses for this prefix, the resulting addresses SHOULD be stable (i.e., not temporary). The resulting stable addresses may be based on Modified EUI-64 Format Identifiers [RFC4291], the stable private identifiers proposed in [STABLE-PRIV], or any other address generation policy specified in the future which results in IPv6 addresses that are stable/ constant for that autoconfiguration prefix/subnet.

10: When generating addresses for this prefix, temporary addresses SHOULD be employed. The resulting addresses may be based on "Privacy Extensions for SLAAC" [RFC4941], or on any other policy which results in temporary Interface Identifiers. Address generation policies that result in stable addresses (such as those specified in [RFC4291] and [STABLE-PRIV]) SHOULD NOT be used for this prefix.

11: Unused (reserved for future use). The special value "11" is reserved for future extensions, and MUST NOT be set by routers implementing this specification. Hosts that implement this specification MUST interpret the special value "11" in the same way as "00" (i.e., no specific advice is provided for address generation).

Note: The "R" bit was specified by [RFC3775]. The Rsvd1 field corresponds to the remaining reserved bits, and thus MUST be set to zero by the sender of this option, and ignored by the receiver.

Since the "AGP" bits correspond to a previously "reserved" field, implementations that predate this specification should be setting the AGP field to "00" when sending the option, and ignoring the AGP bits upon receipt.
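The field layout can be illustrated with a short parsing sketch for the octet that carries the L, A, R, AGP and Rsvd1 fields (the fourth octet of the option, most significant bit first). The function name and the returned dictionary are assumptions; the reserved value "11" is mapped to "00" as required of hosts.

```python
def parse_pio_flags(octet):
    """Split the flags octet of a Prefix Information option into its fields."""
    agp = (octet >> 3) & 0b11
    if agp == 0b11:
        agp = 0b00  # reserved value is treated as "no specific advice"
    return {
        "L": (octet >> 7) & 1,
        "A": (octet >> 6) & 1,
        "R": (octet >> 5) & 1,
        "AGP": agp,
        # The three low-order Rsvd1 bits are ignored by the receiver.
    }


flags = parse_pio_flags(0b11010000)  # L=1, A=1, R=0, AGP=10
```

An option sent by a router that predates this specification carries zeros in these bits, so it parses as AGP "00" and leaves host behavior unchanged.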

3. Router specification

3.1. Router configuration

This section specifies a variable that routers implementing this specification MUST support:

DesiredAddressPolicy: This variable specifies the desired address generation policy for IPv6 addresses resulting from SLAAC. As of this specification, possible values are: "Default", "TemporaryAddresses", and "StableAddresses". This variable SHOULD default to "Default".

3.2. Router operation

A router sending a Prefix Information option MUST set the AGP bits according to the value of the variable DesiredAddressPolicy. The following table specifies which values must be used for the AGP field depending on the value of the DesiredAddressPolicy variable.

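+----------------------+-----------+
| DesiredAddressPolicy | AGP field |
+----------------------+-----------+
|       Default        |    00     |
+----------------------+-----------+
|   StableAddresses    |    01     |
+----------------------+-----------+
|  TemporaryAddresses  |    10     |
+----------------------+-----------+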

Table 1: Correspondence between DesiredAddressPolicy and AGP bits
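The mapping in Table 1 could be applied by a router as in the following sketch, which builds the octet carrying the L, A, R, AGP and Rsvd1 fields. The function name and its parameters are assumptions; the R bit and the Rsvd1 bits are left at zero for simplicity.

```python
# Table 1: DesiredAddressPolicy -> AGP field.
AGP_BITS = {
    "Default": 0b00,
    "StableAddresses": 0b01,
    "TemporaryAddresses": 0b10,
}


def build_pio_flags(on_link, autonomous, desired_address_policy="Default"):
    """Return the flags octet for a Prefix Information option (R and Rsvd1 zero)."""
    agp = AGP_BITS[desired_address_policy]
    return (int(on_link) << 7) | (int(autonomous) << 6) | (agp << 3)


default_flags = build_pio_flags(True, True)                          # 0xC0
temporary_flags = build_pio_flags(True, True, "TemporaryAddresses")  # 0xD0
```

The reserved value "11" has no entry in the mapping, so a router following this sketch cannot emit it.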

4. Host specification

4.1. Host configuration

This section specifies two new variables that hosts implementing this specification MUST support:

AddressPolicyConfiguration: This variable specifies whether the host should honor the advice conveyed in the AGP field of the received Prefix Information options. There are two possible values for this variable: "Enabled" and "Disabled". This variable SHOULD default to "Enabled".

DefaultAddressPolicy: This variable specifies the default IPv6 address generation policy that will be employed if AddressPolicyConfiguration is set to "Disabled", or if "AddressPolicyConfiguration" is set to "Enabled" but the AGP field of the received Prefix Information option is set to "00" (i.e., no specific advice is provided for the generation of addresses for this prefix). As of this specification, possible values are "TemporaryAddresses" (e.g. for [RFC4941]) and "StableAddresses" (for [RFC4291] or [STABLE-PRIV]). This variable SHOULD default to "TemporaryAddresses".

A host willing to ignore the advice of the router regarding which policy to use to generate IPv6 addresses MAY do so by setting AddressPolicyConfiguration to "Disabled".

It should be noted that the aforementioned variables might have different granularities. For example, a host could specify a set of AddressPolicyConfiguration and DefaultAddressPolicy variables on a global basis, on a "per network interface type" basis, on a "per wireless network" basis, etc.

4.2. Host Operation

When generating addresses for the prefix contained in a "Prefix Information Option", hosts implementing this specification MUST proceed as follows:

o If AddressPolicyConfiguration is set to "Enabled", the host SHOULD employ only the policy specified by the AGP field when generating addresses for this prefix. If no specific advice is provided (i.e., the AGP field is set to "00"), the host SHOULD employ only the policy specified by the DefaultAddressPolicy variable when generating addresses for this prefix.

o If AddressPolicyConfiguration is set to "Disabled", the host SHOULD employ only the policy specified by the DefaultAddressPolicy variable when generating addresses for this prefix.
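The two rules above can be summarized in a small decision function. The function name and its keyword arguments are assumptions made for illustration; the variable names and values follow Section 4.1, and the reserved AGP value "11" is handled like "00".

```python
def select_address_policy(agp,
                          address_policy_configuration="Enabled",
                          default_address_policy="TemporaryAddresses"):
    """Return the address generation policy a host would use for a prefix."""
    if address_policy_configuration == "Enabled":
        if agp == 0b01:
            return "StableAddresses"
        if agp == 0b10:
            return "TemporaryAddresses"
    # Disabled, no advice (00), or the reserved value (11): use the default.
    return default_address_policy
```

For example, a host with default settings that receives AGP "01" would generate stable addresses, while the same host with AddressPolicyConfiguration set to "Disabled" would keep using temporary addresses.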

5. IANA Considerations

There are no IANA registries within this document. The RFC-Editor can remove this section before publication of this document as an RFC.

6. Privacy Considerations

As discussed in [RFC4941], IPv6 addresses generated using the Modified EUI-64 Format Identifiers [RFC4291] allow tracking of nodes across networks, since the resulting Interface-ID is a globally- unique value that will remain constant across all networks that the node may connect to.

As specified in Section 3 and Section 4 of this document, the default value for the AGP bits of a Prefix Information option is "00" (no specific advice is provided for the generation of addresses for this prefix), and the default address generation policy (DefaultAddressPolicy) for hosts is set to "TemporaryAddresses". This means that, unless the router or host default settings are overridden, the default settings resulting from this specification will enable the use of "temporary addresses" (such as those specified in [RFC4941]).

Nevertheless, it should be noted that the mechanism specified in this document simply provides the means for a router to convey *advisory* information regarding the desired policy for generating IPv6 addresses when SLAAC is employed: this specification allows hosts to ignore the aforementioned advice when deemed appropriate (by setting AddressPolicyConfiguration to "Disabled"). For example, hosts very concerned with the privacy implications of using interface identifiers that remain constant across networks may set AddressPolicyConfiguration to "Disabled" and DefaultAddressPolicy to "TemporaryAddresses" when connecting to untrusted networks, such that Temporary Addresses (such as those specified in [RFC4941]) are always employed (despite the advice provided by the local router).

Finally, we note that the value and effectiveness of some variants of Temporary Addresses (such as that specified in [RFC4941]) have been questioned in a number of studies [I-D.dupont-ipv6-rfc3041harmful] [Escudero] [CPNI-IPv6]. However, this document does not take a stance about their value and effectiveness.

7. Security Considerations

An attacker could exploit the mechanism specified in this document to cause hosts in a given subnet to disable "Temporary Addresses", thus usually leading to the generation of Interface Identifiers that embed the underlying hardware address (e.g. using Modified EUI-64 Format Identifiers), instead. Thus, in such cases, the privacy of the victim hosts that would have enabled the Privacy Extensions could possibly be reduced.

However, some considerations should be made about such a possible attack. Firstly, such an attack would require the same effort from an attacker as any other Neighbor Discovery attack based on crafted Router Advertisement messages [RFC3756] [CPNI-IPv6], most of which would be far more interesting for an attacker than this possible attack vector. For example, a (possibly malicious) router could still cause a host to use Modified EUI-64 Format Identifiers if DHCPv6 [RFC3315] is required for address configuration, and the DHCPv6 server selects the IPv6 addresses to be leased to hosts based on e.g. the source link-layer address of the DHCP requests. Secondly, while the only policy for generating stable IPv6 addresses that has so far been standardized by the IETF is that based on e.g. IEEE identifiers, there are other possible policies, such as that proposed in [STABLE-PRIV], that lead to stable addresses. That is, use of "stable" identifiers does not necessarily imply that such identifiers remain stable/constant across networks.

In those cases in which the network itself is trusted, but users connected to the same network are not, the possible options for mitigating this and other attack vectors based on crafted Router Advertisement messages include the deployment of the so-called "Router Advertisement guard" mechanism [RFC6104], with the implementation guidelines described in [I-D.gont-v6ops-ra-guard-evasion]. Additionally, SEND (SEcure Neighbor Discovery) [RFC3971] could potentially be deployed to mitigate these and other Neighbor Discovery attacks. However, a number of issues (such as the requirement for a Public Key Infrastructure) hinder the deployment of SEND in most general network scenarios [CPNI-IPv6].

8. Acknowledgements

The author would like to thank (in alphabetical order) Mikael Abrahamsson, Ran Atkinson, Brian Carpenter, Christian Huitema, Mark Smith, Mark Townsley, and James Woodyatt, for providing valuable comments on [I-D.gont-6man-managing-privacy-extensions], on which this document is based.

Fernando Gont would like to thank CPNI (http://www.cpni.gov.uk) for their continued support.

9. References

9.1. Normative References

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3775] Johnson, D., Perkins, C., and J. Arkko, "Mobility Support in IPv6", RFC 3775, June 2004.

[RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure Neighbor Discovery (SEND)", RFC 3971, March 2005.

[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

[RFC4941] Narten, T., Draves, R., and S. Krishnan, "Privacy Extensions for Stateless Address Autoconfiguration in IPv6", RFC 4941, September 2007.

[RFC6104] Chown, T. and S. Venaas, "Rogue IPv6 Router Advertisement Problem Statement", RFC 6104, February 2011.

9.2. Informative References

[RFC3315] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M. Carney, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)", RFC 3315, July 2003.

[RFC3756] Nikander, P., Kempf, J., and E. Nordmark, "IPv6 Neighbor Discovery (ND) Trust Models and Threats", RFC 3756, May 2004.

[I-D.gont-v6ops-ra-guard-evasion] Gont, F. and U. CPNI, "IPv6 Router Advertisement Guard (RA-Guard) Evasion", draft-gont-v6ops-ra-guard-evasion-01 (work in progress), June 2011.

[I-D.gont-6man-managing-privacy-extensions] Gont, F. and R. Broersma, "Managing the Use of Privacy Extensions for Stateless Address Autoconfiguration in IPv6", draft-gont-6man-managing-privacy-extensions-01 (work in progress), March 2011.

[I-D.dupont-ipv6-rfc3041harmful] Dupont, F. and P. Savola, "RFC 3041 Considered Harmful", draft-dupont-ipv6-rfc3041harmful-05 (work in progress), June 2004.

[STABLE-PRIV] Gont, F., "A method for Generating Stable Privacy-Enhanced Addresses with IPv6 Stateless Address Autoconfiguration (SLAAC)", Work in Progress, December 2011.

[Broersma] Broersma, R., "IPv6 Everywhere: Living with a Fully IPv6-enabled environment", Australian IPv6 Summit 2010, Melbourne, VIC Australia, October 2010.

[Escudero] Escudero, A., "PRIVACY EXTENSIONS FOR STATELESS ADDRESS AUTOCONFIGURATION IN IPV6 - "REQUIREMENTS FOR UNOBSERVABILITY"", RVK02, Stockholm, 2002.

[CPNI-IPv6] Gont, F., "Security Assessment of the Internet Protocol version 6 (IPv6)", UK Centre for the Protection of National Infrastructure, (available on request).

Appendix A. Changes from previous versions of the document (to be removed by the RFC Editor before publication of this document as an RFC)

A.1. Changes from draft-gont-6man-managing-privacy-extensions-01

o The address-generation policy information has been changed from "’Modified EUI-64’ vs. ’Privacy Addresses’" to "’stable addresses’ vs. ’temporary addresses’", thus noting that more than one possible policy exists for each category.

o The appendix on possible alternative specifications for the AGP bits has been removed.

o The document now focuses on a mechanism that would enable increased use of "temporary addresses" (such as "Privacy Extensions").

Author’s Address

Fernando Gont UK CPNI

Email: [email protected] URI: http://www.cpni.gov.uk

IPv6 maintenance Working Group (6man)                            F. Gont
Internet-Draft                                                   UK CPNI
Updates: 3971, 4861 (if approved)                          June 13, 2012
Intended status: Standards Track
Expires: December 15, 2012

Security Implications of the Use of IPv6 Extension Headers with IPv6 Neighbor Discovery
draft-gont-6man-nd-extension-headers-03

Abstract

This document analyzes the security implications of using IPv6 Extension Headers with Neighbor Discovery (ND) messages. It updates RFC 4861 such that use of the IPv6 Fragmentation Header is forbidden in all Neighbor Discovery messages, thus allowing for simple and effective counter-measures for Neighbor Discovery attacks. Finally, it discusses the security implications of using IPv6 fragmentation with SEcure Neighbor Discovery (SEND), and provides advice such that the aforementioned security implications are mitigated.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 15, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Gont Expires December 15, 2012 [Page 1] Internet-Draft ND and IPv6 Extension Headers June 2012

Table of Contents

1.  Introduction
2.  Traditional Neighbor Discovery and IPv6 Fragmentation
3.  SEcure Neighbor Discovery (SEND) and IPv6 Fragmentation
4.  Specification
5.  Security Considerations
6.  Acknowledgements
7.  References
  7.1.  Normative References
  7.2.  Informative References
Appendix A.  Changes from previous versions of the draft (to be removed by the RFC Editor before publication of this document as an RFC)
  A.1.  Changes from draft-gont-6man-nd-extension-headers-02
  A.2.  Changes from draft-gont-6man-nd-extension-headers-01
  A.3.  Changes from draft-gont-6man-nd-extension-headers-00
Author’s Address

1. Introduction

The Neighbor Discovery Protocol (NDP) is specified in RFC 4861 [RFC4861] and RFC 4862 [RFC4862]. It is used by both hosts and routers. Its functions include Neighbor Discovery (ND), Router Discovery (RD), Address Autoconfiguration, Address Resolution, Neighbor Unreachability Detection (NUD), Duplicate Address Detection (DAD), and Redirection.

Many of the possible attacks against the Neighbor Discovery Protocol are discussed in detail in [RFC3756]. In order to mitigate these attacks, SEcure Neighbor Discovery (SEND) [RFC3971] was standardized. SEND employs a number of mechanisms to certify the origin of Neighbor Discovery packets and the authority of routers, and to protect Neighbor Discovery packets from being the subject of modification or replay attacks.

However, a number of factors, such as the use of trust anchors and the unavailability of SEND implementations for many widely-deployed operating systems, make SEND hard to deploy [Gont-DEEPSEC2011]. Thus, in many general scenarios it may be necessary and/or convenient to use other mitigation techniques for NDP-based attacks. The following "lightweight" mitigations are currently available for NDP attacks:

o Layer-2 filtering of Neighbor Discovery packets (such as RA-Guard [RFC6105])

o Neighbor Discovery monitoring tools (such as NDPMon [NDPMon])

IPv6 Router Advertisement Guard (RA-Guard) is a mitigation technique for attack vectors based on ICMPv6 Router Advertisement messages. It is meant to block attack packets at a layer-2 device before the attack packets actually reach the target nodes. [RFC6104] describes the problem statement of "Rogue IPv6 Router Advertisements", and [RFC6105] specifies the "IPv6 Router Advertisement Guard" functionality.

Tools such as NDPMon [NDPMon] and ramond [ramond] aim at monitoring Neighbor Discovery traffic in the hopes of detecting possible attacks when there are discrepancies between the information advertised in Neighbor Discovery packets and the information stored on a local database.

A key challenge that these mitigation or monitoring techniques face is that introduced by IPv6 fragmentation, since it is trivial for an attacker to conceal an attack by fragmenting the attack packets into multiple fragments. This may limit or even eliminate the effectiveness of the aforementioned mitigation or monitoring techniques. Recent work [CPNI-IPv6] indicates that current implementations of the aforementioned "lightweight" mitigations for NDP attacks can be trivially evaded. For example, as noted in [I-D.ietf-v6ops-ra-guard-implementation], current RA-Guard implementations can be trivially evaded by fragmenting the attack packets into multiple fragments, such that the layer-2 device cannot find all the necessary information to perform packet filtering in the same packet. While Neighbor Discovery monitoring tools could (in theory) implement IPv6 fragment reassembly, this usually becomes an arms race with the attacker (an attacker can generate lots of forged fragments to "confuse" the monitoring tools), and therefore the aforementioned tools are unreliable for the detection of such attacks.

Section 2 analyzes the use of IPv6 fragmentation in traditional Neighbor Discovery. Section 3 analyzes the use of IPv6 fragmentation in SEcure Neighbor Discovery (SEND). Section 4 formally updates RFC 4861 such that use of the IPv6 Fragment Header with traditional Neighbor Discovery is forbidden, and provides advice on the use of IPv6 fragmentation with SEND.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Traditional Neighbor Discovery and IPv6 Fragmentation

The only potential use case for IPv6 fragmentation with traditional (i.e., non-SEND) IPv6 Neighbor Discovery would be that in which a Router Advertisement must include a large number of options (Prefix Information Options, Route Information Options, etc.). However, this could still be achieved without employing fragmentation, by splitting the aforementioned information into multiple Router Advertisement messages.
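As an illustration of the splitting approach described above, the following minimal sketch greedily groups Router Advertisement options so that each RA fits in a single unfragmented packet. The function name, the default MTU of 1280 bytes (the IPv6 minimum), and the representation of options as bare lengths are assumptions made for this example; they are not part of this document or of [RFC4861].

```python
# Hypothetical sketch: pack RA options into several unfragmented RAs.
IPV6_HEADER_LEN = 40   # fixed IPv6 header
RA_HEADER_LEN = 16     # ICMPv6 RA fields that precede the options
MIN_IPV6_MTU = 1280    # assumed link MTU; use the real link MTU if known


def split_ra_options(option_lengths, mtu=MIN_IPV6_MTU):
    """Greedily group option lengths (in bytes) so each RA fits in one packet."""
    budget = mtu - IPV6_HEADER_LEN - RA_HEADER_LEN
    groups, current, used = [], [], 0
    for length in option_lengths:
        if length <= 0 or length % 8 != 0:
            raise ValueError("ND option lengths are positive multiples of 8")
        if length > budget:
            raise ValueError("a single option does not fit in an unfragmented RA")
        if used + length > budget:
            # Current RA is full: close it and start a new one.
            groups.append(current)
            current, used = [], 0
        current.append(length)
        used += length
    if current:
        groups.append(current)
    return groups
```

For instance, under these assumptions 100 Prefix Information Options of 32 bytes each would be spread over three Router Advertisement messages rather than one fragmented message.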

Some Neighbor Discovery implementations are known to silently ignore Router Advertisement messages that employ fragmentation. Therefore, splitting the necessary information into multiple RA messages (rather than sending a large RA message that is fragmented into multiple IPv6 fragments) is probably desirable even from an interoperability point of view.

As a result of the aforementioned considerations, and since avoiding the use of IPv6 fragmentation in traditional Neighbor Discovery would greatly simplify and improve the effectiveness of monitoring and filtering ND, Section 4 specifies that hosts silently ignore traditional Neighbor Discovery messages (i.e., those specified in [RFC4861]) that employ IPv6 fragmentation.

3. SEcure Neighbor Discovery (SEND) and IPv6 Fragmentation

SEND packets typically carry more information than traditional Neighbor Discovery packets: for example, they include additional options such as the CGA option and the RSA signature option.

In the case of Neighbor Discovery messages specified in [RFC4861], the situation is roughly the same: if more information than would fit in a non-fragmented Neighbor Discovery packet needs to be sent, it should be split into multiple Neighbor Discovery messages (such that IPv6 fragmentation is avoided).

However, Certification Path Advertisement (CPA) messages (specified in [RFC3971]) pose a different situation, since the Certificate Option they include contains much more information than any other Neighbor Discovery option. For example, Appendix C of [RFC3971] reports Certification Path Advertisement messages from 1050 to 1066 bytes on an Ethernet link layer. Since the size of CPA messages could potentially exceed the MTU of the local link, we recommend that fragmented CPA messages be normally processed (although *sending* fragmented CPA messages is discouraged).

It should be noted that relying on fragmentation opens the door to a variety of IPv6 fragmentation-based attacks. In particular, if an attacker is located on the same broadcast domain as the victim host, and Certification Path Advertisement messages employ IPv6 fragmentation, it would be trivial for the attacker to forge IPv6 fragments such that they result in "Fragment ID collisions", causing both the attack fragments and the legitimate fragments to be discarded by the victim node. This would eventually cause the Authorization Delegation Discovery to fail, thus leading the host (depending on local configuration) either to fall back to unsecured mode or to reject the corresponding Router Advertisement messages (possibly resulting in a Denial of Service).

4. Specification

Nodes MUST NOT employ IPv6 fragmentation for sending any of the following Neighbor Discovery and SEcure Neighbor Discovery messages: Neighbor Solicitation, Neighbor Advertisement, Router Solicitation, Router Advertisement, Redirect, and Certification Path Solicitation. SEND nodes SHOULD NOT employ IPv6 fragmentation for sending Certification Path Advertisement messages.

Nodes MUST silently ignore the following Neighbor Discovery and SEcure Neighbor Discovery messages if the packets carrying them include an IPv6 Fragmentation Header: Neighbor Solicitation, Neighbor Advertisement, Router Solicitation, Router Advertisement, Redirect, and Certification Path Solicitation.

Nodes SHOULD normally process Certification Path Advertisement messages that employ IPv6 fragmentation.
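The receive-side rules above can be sketched as a simple decision function. This is only an illustrative sketch: the function and type names are hypothetical, and it assumes the caller has already determined the ICMPv6 type of the message and whether the packet carrying it included a Fragment Header. The ICMPv6 type values are those assigned in [RFC4861] and [RFC3971].

```python
from enum import Enum

# ICMPv6 type values assigned in RFC 4861 and RFC 3971.
ROUTER_SOLICITATION = 133
ROUTER_ADVERTISEMENT = 134
NEIGHBOR_SOLICITATION = 135
NEIGHBOR_ADVERTISEMENT = 136
REDIRECT = 137
CERT_PATH_SOLICITATION = 148
CERT_PATH_ADVERTISEMENT = 149

# Messages that must be silently ignored when fragmented.
FRAGMENTATION_FORBIDDEN = frozenset({
    ROUTER_SOLICITATION, ROUTER_ADVERTISEMENT, NEIGHBOR_SOLICITATION,
    NEIGHBOR_ADVERTISEMENT, REDIRECT, CERT_PATH_SOLICITATION,
})


class Verdict(Enum):
    PROCESS = "process"
    DROP = "drop"


def nd_fragment_verdict(icmpv6_type, has_fragment_header):
    """Hypothetical helper: apply the rules of this section to one message."""
    if has_fragment_header and icmpv6_type in FRAGMENTATION_FORBIDDEN:
        return Verdict.DROP          # silently ignore
    # Unfragmented messages, and fragmented Certification Path
    # Advertisements, are processed normally.
    return Verdict.PROCESS
```

Note that Certification Path Advertisement is deliberately absent from the forbidden set, which mirrors the SHOULD-level handling given to it above.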

5. Security Considerations

The IPv6 Fragmentation Header can be leveraged to circumvent network monitoring tools and current implementations of mechanisms such as RA-Guard [I-D.ietf-v6ops-ra-guard-implementation]. By updating the relevant specifications such that the IPv6 Fragment Header is not allowed in any Neighbor Discovery messages except "Certification Path Advertisement", protection of local nodes against Neighbor Discovery attacks and monitoring of Neighbor Discovery traffic are greatly simplified.

[I-D.ietf-v6ops-ra-guard-implementation] discusses an improvement to the RA-Guard mechanism that can mitigate Neighbor Discovery attacks that employ IPv6 Fragmentation. However, it should be noted that unless [RFC4861] is updated (as proposed in this document), Neighbor Discovery monitoring tools (such as NDPMon [NDPMon]) would remain unreliable and trivial to circumvent by a skilled attacker.

As noted in Section 3, use of SEND could potentially result in fragmented "Certification Path Advertisement" messages, thus allowing an attacker to employ IPv6 fragmentation-based attacks against such messages. Therefore, to the extent that is possible, such use of fragmentation should be avoided.

6. Acknowledgements

The author would like to thank (in alphabetical order) Mikael Abrahamsson, Ran Atkinson, Ron Bonica, Jean-Michel Combes, David Farmer, Roque Gagliano, Bob Hinden, Philip Homburg, Ray Hunter, Arturo Servin, and Mark Smith, for providing valuable comments on earlier versions of this document.

This document resulted from the project "Security Assessment of the Internet Protocol version 6 (IPv6)" [CPNI-IPv6], carried out by Fernando Gont on behalf of the UK Centre for the Protection of National Infrastructure (CPNI). The author would like to thank the UK CPNI, for their continued support.

7. References

7.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure Neighbor Discovery (SEND)", RFC 3971, March 2005.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

7.2. Informative References

[RFC3756] Nikander, P., Kempf, J., and E. Nordmark, "IPv6 Neighbor Discovery (ND) Trust Models and Threats", RFC 3756, May 2004.

[RFC6104] Chown, T. and S. Venaas, "Rogue IPv6 Router Advertisement Problem Statement", RFC 6104, February 2011.

[RFC6105] Levy-Abegnoli, E., Van de Velde, G., Popoviciu, C., and J. Mohacsi, "IPv6 Router Advertisement Guard", RFC 6105, February 2011.

[NDPMon] "NDPMon - IPv6 Neighbor Discovery Protocol Monitor".

[ramond] "ramond".

[I-D.ietf-v6ops-ra-guard-implementation] Gont, F., "Implementation Advice for IPv6 Router Advertisement Guard (RA-Guard)", draft-ietf-v6ops-ra-guard-implementation-04 (work in progress), May 2012.

[CPNI-IPv6] Gont, F., "Security Assessment of the Internet Protocol version 6 (IPv6)", UK Centre for the Protection of National Infrastructure, (available on request).

[Gont-DEEPSEC2011] Gont, "Results of a Security Assessment of the Internet Protocol version 6 (IPv6)", DEEPSEC 2011 Conference, Vienna, Austria, November 2011.

Appendix A. Changes from previous versions of the draft (to be removed by the RFC Editor before publication of this document as an RFC)

A.1. Changes from draft-gont-6man-nd-extension-headers-02

o Makes the requirement a "MUST" (rather than a "SHOULD").

o Clarifies the implications for SEND.

o This rev addresses feedback received from Ran Atkinson, Ron Bonica, and Roque Gagliano.

A.2. Changes from draft-gont-6man-nd-extension-headers-01

o The I-D now forbids only the Fragment Header (rather than all Extension Headers) with most ND packets.

o A discussion of the use of IPv6 fragmentation with ND and SEND was added.

o Text was added noting that if SEND traffic is fragmented, this would open the door to fragmentation-based attacks, which would lead to trivial DoS attacks.

o Minor editorial changes

A.3. Changes from draft-gont-6man-nd-extension-headers-00

o The Security Considerations section now notes that unless IPv6 extension headers are disallowed with Neighbor Discovery messages, monitoring ND traffic and/or mitigating ND vulnerabilities might result in increased complexity and/or reduced performance.

o Minor editorial changes

Author’s Address

Fernando Gont Centre for the Protection of National Infrastructure

Email: [email protected] URI: http://www.cpni.gov.uk

IPv6 maintenance Working Group (6man)                            F. Gont
Internet-Draft                                    SI6 Networks / UTN-FRH
Updates: 2460 (if approved)                                    V. Manral
Intended status: Standards Track                   Hewlett-Packard Corp.
Expires: December 15, 2012                                 June 13, 2012

Security and Interoperability Implications of Oversized IPv6 Header Chains
draft-gont-6man-oversized-header-chain-02

Abstract

The IPv6 specification allows IPv6 header chains of an arbitrary size. The specification also allows options which can in turn extend each of the headers. In those scenarios in which the IPv6 header chain or options are unusually long and packets are fragmented, or scenarios in which the fragment size is very small, the first fragment of a packet may fail to include the entire IPv6 header chain. This document discusses the interoperability and security problems of such traffic, and updates RFC 2460 such that the first fragment of a packet is required to contain the entire IPv6 header chain.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 15, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

Gont & Manral Expires December 15, 2012 [Page 1] Internet-Draft Implications of Oversized Header Chains June 2012

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1.  Introduction
2.  Interoperability Implications of Oversized IPv6 Header Chains
3.  Forwarding Implications of Oversized IPv6 Header Chains
4.  Security Implications of Oversized IPv6 Header Chains
5.  Updating RFC 2460
6.  IANA Considerations
7.  Security Considerations
8.  Acknowledgements
9.  References
  9.1.  Normative References
  9.2.  Informative References
Authors’ Addresses

1. Introduction

[RFC2460] allows for an IPv6 header chain of an arbitrary size. It also allows the headers themselves to have options, which can change the size of the headers. In those scenarios in which the IPv6 header chain is unusually long and packets are fragmented, or scenarios in which the fragment size is very small, the first fragment of a packet may fail to include the entire IPv6 header chain. This document discusses the interoperability and security problems of such traffic, and updates RFC 2460 such that the first fragment of a fragmented datagram is required to contain the entire IPv6 header chain.

It should be noted that this requirement does not preclude the use of e.g. IPv6 jumbo payloads but instead merely requires that all *headers*, starting from the IPv6 base header and continuing up to the upper-layer header (e.g. TCP or the like), be present in the first fragment.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Interoperability Implications of Oversized IPv6 Header Chains

Some transition technologies, such as NAT64 [RFC6146], may need to have access to the entire IPv6 header chain in order to associate an incoming IPv6 packet with an ongoing "session".

For instance, Section 3.4 of [RFC6146] states that "The NAT64 MAY require that the UDP, TCP, or ICMP header be completely contained within the fragment that contains fragment offset equal to zero".

Failure to include the entire IPv6 header chain in the first-fragment may cause the translation function to fail, with the corresponding packets being dropped.

3. Forwarding Implications of Oversized IPv6 Header Chains

Many switches and routers in the Internet perform hardware-based forwarding. To sustain a given level of throughput, a fixed maximum number of clock cycles is dedicated to each packet. However, with the use of unlimited options and header interleaving, larger packets with a lot of interleaving have to be forwarded to software. For this reason, the maximum size of valid packets with interleaved headers needs to be limited.

4. Security Implications of Oversized IPv6 Header Chains

Most firewalls enforce their filtering policy based on the following parameters:

o Source IP address

o Destination IP address

o Protocol type

o Source port number

o Destination port number

Some firewalls reassemble fragmented packets before applying a filtering policy, and thus always have the aforementioned information available when deciding whether to allow or block a packet. However, other stateless firewalls (generally prevalent on small office/home office equipment) do not reassemble fragmented traffic, and hence have to enforce their filtering policy based on the information contained in the received fragment, as opposed to the information contained in the reassembled datagram.

When presented with fragmented traffic, many such firewalls typically enforce their policy only on the first fragment of a packet, based on the assumption that if the first fragment is dropped, reassembly of the corresponding datagram will fail, and thus the datagram will be effectively blocked. However, if the first fragment fails to include the entire IPv6 header chain, they may have no option other than "blindly" allowing or blocking the corresponding fragment. If they blindly allow the packet, then the firewall can be easily circumvented by intentionally sending fragmented packets that fail to include the entire IPv6 header chain in the first fragment. On the other hand, first-fragments that fail to include the entire IPv6 header chain have never been formally deprecated and thus, in theory, might be legitimately generated.

5. Updating RFC 2460

If a packet is fragmented, the first fragment of the packet (i.e., that with a Fragment Offset of 0) MUST contain the entire IPv6 header chain.
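To illustrate how a stateless device might check this requirement, the following minimal sketch walks the header chain of a first fragment and reports whether it is wholly contained in the received bytes. It is a simplified example rather than a complete parser: the function name is hypothetical, only a few extension header types are recognized, any other Next Header value is treated as the upper-layer protocol, and the upper-layer check uses assumed minimum fixed header sizes for TCP, UDP, and ICMPv6.

```python
IPV6_BASE_LEN = 40
HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS = 0, 43, 44, 51, 60
EXTENSION_HEADERS = (HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS)

# Minimum fixed header sizes for a few upper-layer protocols
# (a simplification chosen for this sketch).
UPPER_MIN_LEN = {6: 20, 17: 8, 58: 4}   # TCP, UDP, ICMPv6


def first_fragment_has_full_chain(packet: bytes) -> bool:
    """Return True if the header chain, up to and including the
    upper-layer header, is wholly contained in `packet`."""
    if len(packet) < IPV6_BASE_LEN:
        return False
    next_header = packet[6]          # Next Header field of the base header
    offset = IPV6_BASE_LEN
    while next_header in EXTENSION_HEADERS:
        if len(packet) < offset + 2:
            return False             # cannot read Next Header / length
        if next_header == FRAGMENT:
            ext_len = 8              # the Fragment Header has a fixed size
        elif next_header == AUTH:
            ext_len = (packet[offset + 1] + 2) * 4
        else:
            ext_len = (packet[offset + 1] + 1) * 8
        if len(packet) < offset + ext_len:
            return False             # extension header is truncated
        next_header = packet[offset]
        offset += ext_len
    # Unknown upper-layer protocols are accepted with no minimum length.
    return len(packet) - offset >= UPPER_MIN_LEN.get(next_header, 0)
```

A filtering device using a check of this kind could apply its policy whenever the function returns True, and treat a False result on a first fragment as a violation of the requirement above.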

6. IANA Considerations

There are no IANA registries within this document. The RFC-Editor can remove this section before publication of this document as an RFC.

7. Security Considerations

This document describes the interoperability and security implications of IPv6 packets or first-fragments that fail to include the entire IPv6 header chain. The security implications include the possibility of an attacker evading network security controls such as firewalls and Network Intrusion Detection Systems (NIDS) [CPNI-IPv6].

This document updates RFC 2460 such that those packets are forbidden, thus preventing the aforementioned issues.

8. Acknowledgements

The authors would like to thank (in alphabetical order) Ran Atkinson and Dave Thaler for providing valuable comments on earlier versions of this document.

9. References

9.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

9.2. Informative References

[RFC6146] Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful NAT64: Network Address and Protocol Translation from IPv6 Clients to IPv4 Servers", RFC 6146, April 2011.

[CPNI-IPv6] Gont, F., "Security Assessment of the Internet Protocol version 6 (IPv6)", UK Centre for the Protection of National Infrastructure, (available on request).

Authors’ Addresses

Fernando Gont SI6 Networks / UTN-FRH Evaristo Carriego 2644 Haedo, Provincia de Buenos Aires 1706 Argentina

Phone: +54 11 4650 8472 Email: [email protected] URI: http://www.si6networks.com

Vishwas Manral Hewlett-Packard Corp. 191111 Pruneridge Ave. Cupertino, CA 95014 US

Phone: 408-447-1497 Email: [email protected] URI:

IPv6 maintenance Working Group (6man)                            F. Gont
Internet-Draft                                    SI6 Networks / UTN-FRH
Updates: 2460 (if approved)                              January 9, 2013
Intended status: Standards Track
Expires: July 13, 2013

Security Implications of Predictable Fragment Identification Values
draft-gont-6man-predictable-fragment-id-03

Abstract

IPv6 specifies the Fragment Header, which is employed for the fragmentation and reassembly mechanisms. The Fragment Header contains an "Identification" field which, together with the IPv6 Source Address and the IPv6 Destination Address of the packet, identifies fragments that correspond to the same original datagram, such that they can be reassembled together at the receiving host. The only requirement for setting the "Identification" value is that it must be different than that of any other fragmented packet sent recently with the same Source Address and Destination Address. Some implementations simply use a global counter for setting the Fragment Identification field, thus leading to predictable values. This document analyzes the security implications of predictable Identification values, and updates RFC 2460 specifying additional requirements for setting the Fragment Identification, such that the aforementioned security implications are mitigated.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on July 13, 2013.

Copyright Notice

Gont Expires July 13, 2013 [Page 1] Internet-Draft Implications of Predictable Fragment IDs January 2013

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction
2. Security Implications of Predictable Fragment Identification values
3. Updating RFC 2460
4. Constraints for the selection of Fragment Identification Values
5. Algorithms for Selecting Fragment Identification Values
   5.1. Per-destination counter (initialized to a random value)
   5.2. Randomized Identification values
   5.3. Hash-based Fragment Identification selection algorithm
6. IANA Considerations
7. Security Considerations
8. Acknowledgements
9. References
   9.1. Normative References
   9.2. Informative References
Appendix A. Information leakage produced by vulnerable implementations
Appendix B. Survey of Fragment Identification selection algorithms employed by popular IPv6 implementations
Author’s Address

1. Introduction

IPv6 specifies the Fragment Header, which is employed for the fragmentation and reassembly mechanisms. The Fragment Header contains an "Identification" field which, together with the IPv6 Source Address and the IPv6 Destination Address of the packet, identifies fragments that correspond to the same original datagram, such that they can be reassembled together at the receiving host. The only requirement for setting the "Identification" value is that it must be different than that of any other fragmented packet sent recently with the same Source Address and Destination Address.

The most trivial algorithm to avoid reusing Fragment Identification values too quickly is to maintain a global counter that is incremented for each fragmented packet that is sent. However, this trivial algorithm leads to predictable Identification values, which can be leveraged for performing a variety of attacks.
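The predictability of the global-counter approach can be illustrated with a minimal sketch. The class and function names below (GlobalCounterIdGenerator, predict_next) are hypothetical and are not taken from any real implementation; the starting value of 5000 is arbitrary.

```python
class GlobalCounterIdGenerator:
    """Hypothetical model of a sender that uses one system-wide counter."""

    MASK = 0xFFFFFFFF  # the Identification field is 32 bits long

    def __init__(self, start=0):
        self._counter = start & self.MASK

    def next_id(self, dst):
        # The destination is ignored: every fragmented packet, whatever its
        # destination, consumes the next value of the same counter.
        value = self._counter
        self._counter = (self._counter + 1) & self.MASK
        return value


def predict_next(observed_id, increment=1):
    """Guess the next Identification value from a single observed sample."""
    return (observed_id + increment) & 0xFFFFFFFF


sender = GlobalCounterIdGenerator(start=5000)
sample = sender.next_id("2001:db8::a")   # value seen by an observer
guess = predict_next(sample)             # observer's guess for the next value
actual = sender.next_id("2001:db8::b")   # packet sent to a different node
```

In this sketch, a single sample obtained by one node is enough to guess the value used in a packet sent to an unrelated destination.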

Section 2 of this document analyzes the security implications of predictable Identification values. Section 3 updates RFC 2460 by adding the requirement that Identification values not be predictable by an off-path attacker. Section 4 discusses constraints in the possible algorithms for selecting Fragment Identification values. Section 5 specifies a number of algorithms that could be used for generating Identification values. Finally, Appendix B contains a survey of the Fragment Identification algorithms employed by popular IPv6 implementations.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Security Implications of Predictable Fragment Identification values

Predictable Identification values result in an information leakage that can be exploited in a number of ways. Among others, they may potentially be exploited to:

o determine the packet rate at which a given system is transmitting information,

o perform stealth port scans to a third-party,

o uncover the rules of a number of firewalls,

o count the number of systems behind a middle-box, or,

o perform a Denial of Service (DoS) attack

[CPNI-IPv6] contains a detailed analysis of possible vulnerabilities introduced by predictable Fragment Identification values. In summary, their security implications are very similar to those of predictable Identification values in IPv4.

[Sanfilippo1998a] originally pointed out how the IPv4 Identification field could be examined to determine the packet rate at which a given system is transmitting information. Later, [Sanfilippo1998b] described how a system with such an implementation could be used to perform a stealth port scan to a third (victim) host. [Sanfilippo1999] explained how to exploit this implementation strategy to uncover the rules of a number of firewalls. [Bellovin2002] explains how the IPv4 Identification field can be exploited to count the number of systems behind a NAT. [Fyodor2004] is an entire paper on most (if not all) the ways to exploit the information provided by the Identification field of the IPv4 header (and these results apply in a similar way to IPv6). [RFC6274] covers the security implications of IPv4 in detail.

One key difference between the IPv4 case and the IPv6 case is that in IPv4 the Identification field is part of the fixed IPv4 header (and thus usually set for all packets), while in IPv6 the Identification field is set only in those packets that employ a Fragment Header. As a result, successful exploitation of the Identification field against communication instances with arbitrary destinations depends on two different factors:

o IPv6 implementations using predictable Identification values, and,

o the ability of the attacker to cause the victim host to fragment packets destined to other nodes

As noted in the previous section, some implementations are known to use predictable identification values.

For example, Linux 2.6.38-8 sets the Identification field according to a global counter that is incremented by one for each datagram that is sent with a Fragment Header (either as a single fragment or as multiple fragments).

Finally, we note that an attacker could cause a victim host to fragment its outgoing packets by sending it a forged ICMPv6 ’Packet Too Big’ error message advertising a Next-Hop MTU smaller than 1280 bytes.

RFC 1981 [RFC1981] states that when an ICMPv6 Packet Too Big error message with an MTU smaller than 1280 bytes is received, the receiving host is not required to reduce the Path-MTU for the corresponding destination address, but must simply include a Fragment Header in all subsequent packets sent to that destination. In order to make sure that the forged ICMPv6 Packet Too Big error message triggers fragmentation at the victim host, the attacker could set the MTU field of the error message to a value smaller than 1280 bytes. Since the minimum IPv6 MTU is 1280 bytes, such value would always be smaller than the Path-MTU in use for that destination.
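The behaviour described above can be sketched as follows. This is an illustrative model only: the function name handle_packet_too_big, the dictionary-based destination cache, and the default Path-MTU of 1500 bytes are assumptions made for the example, and the sketch performs none of the validation checks that a real stack might apply to ICMPv6 error messages.

```python
IPV6_MIN_MTU = 1280  # minimum IPv6 MTU, in bytes


def handle_packet_too_big(dest_cache, dst, reported_mtu):
    """Update the per-destination state for a received Packet Too Big."""
    # Hypothetical per-destination entry; 1500 is an assumed initial Path-MTU.
    entry = dest_cache.setdefault(dst, {"pmtu": 1500, "always_fragment": False})
    if reported_mtu < IPV6_MIN_MTU:
        # The Path-MTU is left untouched (the host is not required to reduce
        # it), but a Fragment Header is included in all subsequent packets
        # sent to this destination.
        entry["always_fragment"] = True
    elif reported_mtu < entry["pmtu"]:
        # Ordinary Path-MTU reduction.
        entry["pmtu"] = reported_mtu
    return entry
```

Because the state is keyed by destination, a forged error message in this model affects only packets sent to the destination named in it.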

There are a few issues that should be considered, though:

o In all the implementations the author is aware of, an attacker can only cause the victim to enable fragmentation on a per-destination basis. That is, the victim will use fragmentation only for those packets sent to the Destination Address of the IPv6 packet embedded in the payload of the ICMPv6 Packet Too Big error message.

Section 5.2 of [RFC1981] notes that an implementation could maintain a single system-wide PMTU value to be used for all packets originating from that node. Clearly, such an implementation would exacerbate the problem of any attacks based on PMTUD [RFC5927] or IPv6 fragmentation.

o If the victim node implements some of the counter-measures for ICMP attacks described in RFC 5927 [RFC5927], it might be difficult for an attacker to cause the victim node to use fragmentation for its outgoing packets.

Some implementations do not incorporate countermeasures for attacks based on ICMPv6 error messages. For example, Linux 2.6.38-8 does not even require received ICMPv6 error messages to correspond to ongoing communication instances.

Implementations that employ predictable Identification values and also fail to include countermeasures against attacks based on ICMPv6 error messages will be vulnerable to attacks similar to those based on the IPv4 Identification field for IPv4 networks, such as the stealth port-scanning technique described in [Sanfilippo1998b].

One possible way in which predictable Identification values could be leveraged for performing a Denial of Service (DoS) attack is as follows: once the Identification value currently in use at the victim host has been learned, the attacker would send a forged ICMPv6 Packet Too Big error message to the victim host, with the IPv6 Destination Address of the embedded IPv6 packet set to the IPv6 address of a third-party host with which the victim is communicating. This ICMPv6 Packet Too Big error message would cause any packets sent from the victim to the third-party host to include a Fragment Header. The attacker would then send forged IPv6 fragments to the third-party host, with their IPv6 Source Address set to that of the victim host, and with the Identification field of the forged fragments set to values that would result in collisions at the third-party host. If the third-party host discards fragments that result in collisions of Identification values, the attacker could simply trash the Identification space by sending multiple forged fragments with different Identification values, such that any subsequent packets from the victim host are discarded at the third-party host as a result of the malicious fragments sent by the attacker.

For example, Linux 2.6.38-10 is vulnerable to the aforementioned issue.

[I-D.ietf-6man-ipv6-atomic-fragments] describes an improved processing of these packets that would eliminate this specific attack vector, at least in the case of TCP connections that employ the Path-MTU Discovery mechanism.

The aforementioned attack scenario is simply included to illustrate the problem of employing predictable fragment Identification values, rather than to indicate a specific attack vector that needs to be mitigated.

We note that regardless of the attacker’s ability to cause a victim host to employ fragmentation when communicating with third-parties, use of predictable Identification values makes communication flows that employ fragmentation vulnerable to any fragmentation-based attacks.

3. Updating RFC 2460

Hereby we update RFC 2460 [RFC2460] as follows:

The Identification value of the Fragment Header MUST NOT be predictable by an off-path attacker.

4. Constraints for the selection of Fragment Identification Values

The "Identification" field of the Fragment Header is 32 bits long. However, when translators [RFC6145] are employed, the "effective" length of the IPv6 Fragment Identification field is 16 bits.

[RFC6145] notes that, when translating in the IPv6-to-IPv4 direction, "if there is a Fragment Header in the IPv6 packet, the last 16 bits of its value MUST be used for the IPv4 identification value". This means that the high-order 16 bits are effectively ignored.

As a result, at least during the IPv6/IPv4 transition/co-existence phase, it is probably safer to assume that only the last 16 bits of the IPv6 Fragment Identification may be used in some cases.
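As a small illustration of this constraint, the sketch below models the IPv6-to-IPv4 mapping of the Identification value. The function name ipv4_id_from_ipv6_fragment_id is hypothetical, and only the truncation step is modelled (not the rest of the translation algorithm).

```python
def ipv4_id_from_ipv6_fragment_id(fragment_id):
    """Keep only the last (low-order) 16 bits of a 32-bit Fragment ID."""
    return fragment_id & 0xFFFF


# Two values that differ as 32-bit Identification values...
id_a = 0x00010042
id_b = 0xABCD0042

# ...but that map to the same 16-bit IPv4 Identification value.
translated_a = ipv4_id_from_ipv6_fragment_id(id_a)
translated_b = ipv4_id_from_ipv6_fragment_id(id_b)
```

Two datagrams whose Identification values differ only in the high-order 16 bits therefore become indistinguishable after translation.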

Regarding the selection of Fragment Identification values, the only requirement specified in [RFC2460] is that the Fragment Identification must be different than that of any other fragmented packet sent recently with the same Source Address and Destination Address.

Failure to comply with that requirement might lead to the interoperability problems discussed in [RFC4963].

From a security standpoint, unpredictable Identification values are desirable. However, this is somewhat at odds with the "re-use" requirements specified in [RFC2460].

Finally, since Fragment Identification values need to be selected for each outgoing datagram that requires fragmentation, the performance aspect should be considered when choosing an algorithm for the selection of Fragment Identification values.

5. Algorithms for Selecting Fragment Identification Values

This section specifies a number of algorithms that MAY be used for selecting Fragment Identification values.

5.1. Per-destination counter (initialized to a random value)

1. Whenever a packet must be sent with a Fragment Header, the sending host should perform a look-up in the Destinations Cache for an entry corresponding to the intended Destination Address.

2. If such an entry exists, it contains the last Fragment Identification value used for that Destination. Therefore, such value should be incremented by 1, and used for setting the Fragment Identification value of the outgoing packet. Additionally, the updated value should be recorded in the corresponding entry of the Destination Cache.

3. If such an entry does not exist, it should be created, and the "Identification" value for that destination should be initialized with a random value (e.g., with a pseudorandom number generator), and used for setting the Identification field of the Fragment Header of the outgoing packet.
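The three steps above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the class name PerDestinationIdGenerator, the use of a plain dictionary in place of the Destinations Cache, and the choice of Python's secrets module as the source of random values are all assumptions made for the example.

```python
import secrets


class PerDestinationIdGenerator:
    """Per-destination counter, each initialized to a random value."""

    MASK = 0xFFFFFFFF  # the Identification field is 32 bits long

    def __init__(self, rng=secrets.randbits):
        self._last_id = {}  # stands in for the Destinations Cache
        self._rng = rng     # callable returning an n-bit random integer

    def next_id(self, dst):
        if dst in self._last_id:
            # Step 2: increment the last value used for this destination.
            value = (self._last_id[dst] + 1) & self.MASK
        else:
            # Step 3: create the entry with a random initial value.
            value = self._rng(32)
        # Record the value that was just used.
        self._last_id[dst] = value
        return value

    def evict(self, dst):
        # Models removal of a cache entry; the last value used is lost.
        self._last_id.pop(dst, None)
```

A caller would invoke next_id() once for each datagram sent with a Fragment Header; after evict(), the next value for that destination is drawn at random again.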

The advantages of this algorithm are:

o It is simple to implement, with the only complexity residing in the Pseudo-Random Number Generator (PRNG) used to initialize the "Identification" value contained in each entry of the Destinations Cache.

o The "Identification" re-use frequency will typically be lower than that achieved by a global counter (when sending traffic to multiple destinations), since this algorithm uses per-destination counters (rather than a single system-wide counter).

o It has good performance properties (once the corresponding entry in the Destinations Cache has been created, each subsequent "Identification" value simply involves the increment of a counter).

The possible drawbacks of this algorithm are:

o If as a result of resource management an entry of the Destinations Cache must be removed, the last Fragment Identification value used for that Destination is obviously lost. Thus, if subsequent traffic to that destination causes the aforementioned entry to be re-created, the Fragment Identification value will be randomized, thus possibly leading to Fragment Identification "collisions".

o Since the Fragment Identification values are predictable by the destination host, a vulnerable host might possibly leak to third parties the Fragment Identification values used by other hosts to send traffic to it (i.e., Host B could leak to Host C the Fragment Identification values that Host A is using to send packets to Host B).

Appendix A describes a scenario in which that information leakage could take place.

5.2. Randomized Identification values

Clearly, use of a Pseudo-Random Number Generator for selecting the Fragment Identification could be desirable from a security standpoint. With such a scheme, the Fragment Identification of each fragmented datagram would be selected as:

Identification = random()

where "random()" is the PRNG.
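A minimal sketch of this scheme is shown below. The function name random_fragment_id is hypothetical, and the use of Python's secrets module (which draws from the operating system's random source) is just one possible choice for "random()", not something mandated by this document.

```python
import secrets


def random_fragment_id():
    """Select a 32-bit Fragment Identification value at random."""
    return secrets.randbits(32)
```

Note that nothing in this sketch prevents the same value from being selected twice in a short period of time; the reuse frequency depends entirely on the generator.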

The specific properties of such scheme would clearly depend on the specific PRNG algorithm used. For example, some PRNGs may result in higher Fragment Identification reuse frequencies than others, in the same way as some PRNGs may be more expensive (in terms of processing requirements and/or implementation complexity) than others.

Discussion of the properties of possible PRNGs is considered out of the scope of this document. However, we do note that some PRNGs employed in the past by some implementations have been found to be predictable [Klein2007]. Please see [RFC4086] for randomness requirements for security.

5.3. Hash-based Fragment Identification selection algorithm

Another alternative is to implement a hash-based algorithm similar to the hash-based algorithms that have been specified for the selection of transport port numbers. With such a scheme, the Fragment Identification value of each fragmented datagram would be selected with the expression:

Identification = F(Src IP, Dst IP, secret1) + counter[G(Src IP, Dst Pref, secret2)]

where:

Identification: Identification value to be used for the fragmented datagram

F(): Hash function

Src IP: IPv6 Source Address of the datagram to be fragmented

Dst IP: IPv6 Destination Address of the datagram to be fragmented

secret1: Secret data unknown to the attacker

counter[]: System-wide array of 32-bit counters (e.g. with 8K elements or more)

G(): Hash function. May or may not be the same hash function as that used for F()

Dst Pref: IPv6 "Destination Prefix" of the datagram to be fragmented (can be assumed to be the first eight bytes of the Destination Address of such packet). Note: the "Destination Prefix" (rather than Destination Address) is used, such that the ability of an attacker to search the "increments" space by using multiple addresses of the same subnet is reduced.

secret2: Secret data unknown to the attacker

Note: counter[G(Src IP, Dst Pref, secret2)] should be incremented by one each time an Identification value is selected.
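The expression above can be sketched as follows. Several details are assumptions made for the example and are not specified by this document: HMAC-SHA-256 (truncated to 32 bits) is used for both F() and G(), the counters are initialized to zero, the counter is incremented after its value has been used, and the class name HashBasedIdGenerator is hypothetical.

```python
import hashlib
import hmac
import ipaddress
import secrets

NUM_COUNTERS = 8192   # size of counter[] (8K elements)
MASK = 0xFFFFFFFF     # the Identification field is 32 bits long


class HashBasedIdGenerator:
    def __init__(self, secret1=None, secret2=None):
        # Secrets unknown to the attacker; random if not supplied.
        self._secret1 = secret1 if secret1 is not None else secrets.token_bytes(16)
        self._secret2 = secret2 if secret2 is not None else secrets.token_bytes(16)
        self._counters = [0] * NUM_COUNTERS

    @staticmethod
    def _keyed_hash(secret, *parts):
        # Assumed hash construction: HMAC-SHA-256, first 32 bits of output.
        digest = hmac.new(secret, b"".join(parts), hashlib.sha256).digest()
        return int.from_bytes(digest[:4], "big")

    def next_id(self, src, dst):
        src_b = ipaddress.IPv6Address(src).packed
        dst_b = ipaddress.IPv6Address(dst).packed
        dst_pref = dst_b[:8]  # first eight bytes of the Destination Address

        offset = self._keyed_hash(self._secret1, src_b, dst_b)        # F()
        index = self._keyed_hash(self._secret2, src_b, dst_pref)      # G()
        index %= NUM_COUNTERS

        ident = (offset + self._counters[index]) & MASK
        # Increment the selected counter each time a value is chosen.
        self._counters[index] = (self._counters[index] + 1) & MASK
        return ident
```

In this sketch, successive datagrams to the same destination receive consecutive values, while traffic to any other address in the same destination prefix advances the same counter.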

The advantages of this algorithm are:

o The "Identification" re-use frequency will typically be lower than that achieved by a global counter (when sending traffic to multiple destinations), since this algorithm uses multiple system-wide counters (rather than a single system-wide counter). The extent to which the re-use frequency will be lower will depend on the number of elements in counter[], and the number of other active flows that result in the same value of G() (and hence cause the same counter to be incremented for each fragmented datagram that is sent).

o It is possible to implement the algorithm such that good performance is achieved. For example, the result of F() could be stored in the Destinations Cache (such that it need not be recomputed for each packet that must be sent) along with the computed *index* for counter[].

It should be noted that if this implementation approach is followed, and an entry of the Destinations Cache must be removed as a result of resource management, the last Fragment Identification value used for that Destination will *not* be lost. This is an improvement over the algorithm specified in Section 5.1.

The possible drawbacks of this algorithm are:

o Since the Fragment Identification values are predictable by the destination host, a vulnerable host could possibly leak to third parties the Fragment Identification values used by other hosts to send traffic to it (i.e., Host B could leak to Host C the Fragment Identification values that Host A is using to send packets to Host B).

Appendix A describes a scenario in which that information leakage could take place. We note, however, that this algorithm makes the aforementioned attack less reliable for the attacker, since each counter could be possibly shared by multiple traffic flows (i.e., packets destined to other destinations might cause the counter to be incremented).

This algorithm might be preferable (over the one specified in Section 5.1) in those scenarios in which a node is expected to communicate with a large number of destinations, and thus it is desirable to limit the amount of information to be maintained in memory.

In such scenarios, if the algorithm specified in Section 5.1 were implemented, entries from the Destinations Cache might need to be pruned frequently, thus increasing the risk of fragment Identification collisions.

6. IANA Considerations

There are no IANA registries within this document. The RFC-Editor can remove this section before publication of this document as an RFC.

7. Security Considerations

This document discusses the security implications of predictable Fragment Identification values, and updates RFC 2460 such that Fragment Identification values are required to be unpredictable by off-path attackers, hence mitigating the aforementioned security implications.

A number of possible algorithms are specified, to provide some implementation alternatives to implementers. However, the selection of a specific algorithm that complies with Section 3 is left to implementers. We note that the selection of such an algorithm usually implies a number of trade-offs (security, performance, implementation complexity, interoperability properties, etc.).

8. Acknowledgements

The author would like to thank Ivan Arce for proposing the attack scenario described in Appendix A, and for providing valuable comments on earlier versions of this document.

The author would like to thank Dave Thaler for providing valuable comments on earlier versions of this document.

This document is based on the technical report "Security Assessment of the Internet Protocol version 6 (IPv6)" [CPNI-IPv6] authored by Fernando Gont on behalf of the UK Centre for the Protection of National Infrastructure (CPNI).

Fernando Gont would like to thank the UK CPNI (http://www.cpni.gov.uk) for their continued support.

9. References

9.1. Normative References

[RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery for IP version 6", RFC 1981, August 1996.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[RFC4086] Eastlake, D., Schiller, J., and S. Crocker, "Randomness Requirements for Security", BCP 106, RFC 4086, June 2005.

[RFC5722] Krishnan, S., "Handling of Overlapping IPv6 Fragments", RFC 5722, December 2009.

[RFC6145] Li, X., Bao, C., and F. Baker, "IP/ICMP Translation Algorithm", RFC 6145, April 2011.

9.2. Informative References

[RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly Errors at High Data Rates", RFC 4963, July 2007.

[RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010.

[RFC6274] Gont, F., "Security Assessment of the Internet Protocol Version 4", RFC 6274, July 2011.

[I-D.ietf-6man-ipv6-atomic-fragments] Gont, F., "Processing of IPv6 "atomic" fragments", draft-ietf-6man-ipv6-atomic-fragments-03 (work in progress), December 2012.

[Bellovin2002] Bellovin, S., "A Technique for Counting NATted Hosts", IMW’02 Nov. 6-8, 2002, Marseille, France, 2002.

[CPNI-IPv6] Gont, F., "Security Assessment of the Internet Protocol version 6 (IPv6)", UK Centre for the Protection of National Infrastructure, (available on request).

[Fyodor2004] Fyodor, "Idle scanning and related IP ID games", 2004.

[Klein2007] Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S Predictable IP ID Vulnerability", 2007.

[Sanfilippo1998a] Sanfilippo, S., "about the ip header id", Post to Bugtraq mailing-list, Mon Dec 14 1998.

[Sanfilippo1998b] Sanfilippo, S., "Idle scan", Post to Bugtraq mailing-list, 1998.

[Sanfilippo1999] Sanfilippo, S., "more ip id", Post to Bugtraq mailing-list, 1999.

[SI6-IPv6] "SI6 Networks’ IPv6 toolkit".

Appendix A. Information leakage produced by vulnerable implementations

Section 2 provides a number of references describing a number of ways in which the information leakage produced by a vulnerable implementation could be leveraged by an attacker. This section describes a specific network scenario in which a vulnerable implementation could possibly leak the current Fragment Identification value in use by a third-party host to send fragmented datagrams to the vulnerable implementation.

For the most part, this section is included to illustrate how a vulnerable implementation might be leveraged to leak-out the Fragment Identification value of an otherwise secure implementation. This section might be removed in future revisions of this document.

The following scenarios assume:

A: Is an IPv6 host that implements the recommended Fragment Identification algorithm (Section 5.1), implements [RFC5722], but does not implement [I-D.ietf-6man-ipv6-atomic-fragments].

B: Victim node. Selects the Fragment Identification values from a global counter.

C: Attacker. Can forge the IPv6 Source Address of his packets at will.

If the attacker sends forged SYN packets to a closed TCP port, and then fails when trying to produce a collision of Fragment Identifications (see line #4), the following packet exchange might take place:

         A                         B                              C

   #1                              <-------- Echo Req #1 ----------
   #2                              ---- Echo Resp #1, FID=5000 --->
   #3    <------------------- SYN #1, src= B ----------------------
   #4                              <---- SYN/ACK, FID=42 src = A---
   #5    ---- SYN/ACK, FID=9000 --->
   #6    <----- RST, FID= 5001 -----
   #7    <----- RST, FID= 5002 -----
   #8                              <-------- Echo Req #2 ----------
   #9                              ---- Echo Resp #2, FID= 5003 -->

On the other hand, if the attacker succeeds in producing a collision of Fragment Identification values, the following packet exchange could take place:

         A                         B                              C

   #1                              <-------- Echo Req #1 ----------
   #2                              ---- Echo Resp #1, FID=5000 --->
   #3    <------------------- SYN #1, src= B ----------------------
   #4                              <-- SYN/ACK, FID=9000 src=A ----
   #5    ---- SYN/ACK, FID=9000 --->
                 ... (RFC5722) ...
   #6                              <-------- Echo Req #2 ----------
   #7                              ---- Echo Resp #2, FID= 5001 -->

Clearly, the Fragment Identification value sampled from the second ICMPv6 Echo Response packet ("Echo Resp #2") implicitly indicates whether the Fragment Identification in the forged SYN/ACK (see line #4 in both figures) was the current Fragment Identification in use by Host A.

As a result, the attacker could employ this technique to learn the current Fragment Identification value used by host A to send packets to host B.
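The inference made by the attacker in the two exchanges above can be sketched as follows. The function name forged_fid_collided is hypothetical, and the sketch assumes that Host B uses a global counter with a known increment and sends no other fragmented traffic between the two probes.

```python
def forged_fid_collided(fid_before, fid_after, increment=1):
    """Decide whether the forged SYN/ACK collided at Host B.

    fid_before and fid_after are the Identification values sampled from
    "Echo Resp #1" and "Echo Resp #2", respectively.
    """
    delta = (fid_after - fid_before) & 0xFFFFFFFF
    # If B sent nothing between the two Echo Responses, its counter advanced
    # exactly once. Any larger gap means B sent extra packets (the RSTs),
    # i.e. the guessed Identification value did not collide.
    return delta == increment
```

With the values shown in the figures, the first exchange (5000, then 5003) indicates no collision, while the second exchange (5000, then 5001) indicates that the guessed value was in use.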

Appendix B. Survey of Fragment Identification selection algorithms employed by popular IPv6 implementations

This section includes a survey of the Fragment Identification selection algorithms employed in some popular operating systems.

The survey was produced with the SI6 Networks IPv6 toolkit [SI6-IPv6].

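+----------------------+---------------------------------------+
| Operating System     | Algorithm                             |
+----------------------+---------------------------------------+
| FreeBSD 9.0          | Unpredictable (Random)                |
+----------------------+---------------------------------------+
| Linux 3.0.0-15       | Predictable (Global Counter, Init=0,  |
|                      | Incr=1)                               |
+----------------------+---------------------------------------+
| Linux-current        | Unpredictable (Per-dest Counter,      |
|                      | Init=random, Incr=1)                  |
+----------------------+---------------------------------------+
| NetBSD 5.1           | Unpredictable (Random)                |
+----------------------+---------------------------------------+
| OpenBSD-current      | Random (SKIP32)                       |
+----------------------+---------------------------------------+
| Solaris 10           | Predictable (Per-dst Counter, Init=0, |
|                      | Incr=1)                               |
+----------------------+---------------------------------------+
| Windows XP SP2       | Predictable (Global Counter, Init=0,  |
|                      | Incr=2)                               |
+----------------------+---------------------------------------+
| Windows Vista (Build | Predictable (Global Counter, Init=0,  |
| 6000)                | Incr=2)                               |
+----------------------+---------------------------------------+
| Windows 7 Home       | Predictable (Global Counter, Init=0,  |
| Premium              | Incr=2)                               |
+----------------------+---------------------------------------+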

Table 1: Fragment Identification algorithms employed by different OSes

In the table above, "predictable" should be taken as "easily guessable by an off-path attacker, by sending a few probe packets".
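As a rough illustration of what "a few probe packets" can reveal, the sketch below checks whether a series of sampled Identification values advances by a constant step, as a counter with a fixed increment would. The function name constant_increment is hypothetical, and this is not a description of how the survey tool works.

```python
def constant_increment(samples):
    """Return the common step between consecutive samples, or None.

    samples is a list of Identification values taken from consecutive
    probe responses. At least three samples are required.
    """
    if len(samples) < 3:
        return None
    # Differences are taken modulo 2^32 so that counter wrap-around is handled.
    deltas = {(b - a) & 0xFFFFFFFF for a, b in zip(samples, samples[1:])}
    return deltas.pop() if len(deltas) == 1 else None
```

A constant step across samples is a strong hint of a counter-based algorithm, although background traffic from the probed host can perturb the step in practice.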

Author’s Address

Fernando Gont
SI6 Networks / UTN-FRH
Evaristo Carriego 2644
Haedo, Provincia de Buenos Aires 1706
Argentina

Phone: +54 11 4650 8472
Email: [email protected]
URI: http://www.si6networks.com

IPv6 maintenance Working Group (6man) F. Gont Internet-Draft UK CPNI Updates: 4291, 4862 (if approved) March 31, 2012 Intended status: Standards Track Expires: October 2, 2012

A method for Generating Stable Privacy-Enhanced Addresses with IPv6 Stateless Address Autoconfiguration (SLAAC) draft-gont-6man-stable-privacy-addresses-01

Abstract

This document specifies a method for generating IPv6 Interface Identifiers to be used with IPv6 Stateless Address Autoconfiguration (SLAAC), such that addresses configured using this method are stable within each subnet, but the Interface Identifier changes when hosts move from one network to another. The aforementioned method is meant to be an alternative to generating Interface Identifiers based on IEEE identifiers, such that the benefits of stable addresses can be achieved without sacrificing the privacy of users.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on October 2, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Gont Expires October 2, 2012 [Page 1] Internet-Draft Stable Privacy Addresses March 2012

Table of Contents

1. Introduction
2. Design goals
3. Algorithm specification
4. IANA Considerations
5. Security Considerations
6. Acknowledgements
7. Privacy issues still present with RFC 4941
   7.1. Host tracking
      7.1.1. Tracking hosts across networks #1
      7.1.2. Tracking hosts across networks #2
      7.1.3. Revealing the identity of a devices performing server-like functions
   7.2. Host scanning-attacks
8. References
   8.1. Normative References
   8.2. Informative References
Author’s Address

1. Introduction

[RFC4862] specifies the Stateless Address Autoconfiguration (SLAAC) for IPv6, which typically results in hosts configuring one or more "stable" addresses composed of a network prefix advertised by a local router, and an Interface Identifier (IID) that typically embeds a hardware address (e.g., using IEEE identifiers) [RFC4291].

Generally, static addresses are thought to simplify network management, since they simplify ACLs and logging. However, since IEEE identifiers are typically globally unique, the resulting IPv6 addresses can be leveraged to track and correlate the activity of a node over time and across multiple subnets and networks, thus negatively affecting the privacy of users.

The "Privacy Extensions for Stateless Address Autoconfiguration in IPv6" [RFC4941] were introduced to complicate the task of eavesdroppers and other information collectors to correlate the activities of a node, and basically result in temporary (and random) Interface Identifiers that are typically more difficult to leverage than those based on IEEE identifiers. When privacy extensions are enabled, "privacy addresses" are employed for "outgoing communications", while the traditional IPv6 addresses based on IEEE identifiers are still used for "server" functions (i.e., receiving incoming connections).

As noted in [RFC4941], "anytime a fixed identifier is used in multiple contexts, it becomes possible to correlate seemingly unrelated activity using this identifier". Since "privacy addresses" [RFC4941] do not eliminate the use of fixed identifiers for server-like functions, they only *partially* mitigate the correlation of host activities (see Section 7 for some example attacks that are still possible with privacy addresses). It is therefore vital that the privacy characteristics of "stable" addresses are improved such that the ability of an attacker to correlate host activities across networks is reduced.

Another important aspect not mitigated by "Privacy Addresses" [RFC4941] is that of host scanning. Since IPv6 addresses that embed IEEE identifiers have specific patterns, an attacker could leverage such patterns to greatly reduce the search space for "live" hosts. Since "privacy addresses" do not eliminate the use of IPv6 addresses that embed IEEE identifiers, host scanning attacks are still feasible even if "privacy extensions" are employed [Gont-DEEPSEC2011] [CPNI-IPv6]. This is yet another motivation to improve the privacy characteristics of "stable" addresses (currently embedding IEEE identifiers).

Privacy/temporary addresses can be challenging in a number of areas. For example, from a network-management point of view, they tend to increase the complexity of event logging, trouble-shooting, and enforcing access controls and quality of service, etc. As a result, some organizations disable the use of privacy addresses even at the expense of reduced privacy [Broersma]. Also, they result in increased complexity, which might not be possible or desirable in some implementations (e.g., some embedded devices).

In scenarios in which "Privacy Extensions" are deliberately not used (possibly for any of the aforementioned reasons), all a host is left with is the addresses that have been generated using e.g. IEEE identifiers, and this is yet another case in which it is also vital that the privacy characteristics of these stable addresses are improved.

We note that in most (if not all) of those scenarios in which "Privacy Extensions" are disabled, there is usually no actual desire to negatively affect user privacy, but rather a desire to simplify operation of the network (simplify the use of ACLs, logging, etc.).

This document specifies a method to generate interface identifiers that are stable/constant within each subnet, but that change as hosts move from one network to another, thus keeping the "stability" properties of the interface identifiers specified in [RFC4291], while still mitigating host-scanning attacks and preventing correlation of the activities of a node as it moves from one network to another.

For nodes that currently disable "Privacy extensions" [RFC4941] for some of the reasons stated above, this mechanism provides stable privacy-enhanced addresses which may already address most of the privacy concerns related to addresses that embed IEEE identifiers [RFC4291]. On the other hand, in scenarios in which "Privacy Extensions" are employed, implementation of the mechanism described in this document would mitigate host-scanning attacks and also mitigate correlation of host activities.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Design goals

This document specifies a method for selecting interface identifiers to be used with IPv6 SLAAC, with the following goals:

o The resulting interface identifier remains constant/stable for each prefix used with SLAAC within each subnet. That is, the algorithm generates the same interface identifier when configuring an address belonging to the same prefix within the same subnet.

o The resulting interface identifier does change when addresses are configured for different prefixes. That is, if different autoconfiguration prefixes are used to configure addresses for the same network interface card, the resulting interface identifiers must be (statistically) different.

o It must be difficult for an outsider to predict the interface identifiers that will be generated by the algorithm, even with knowledge of the interface identifiers generated for configuring other addresses.

o The aforementioned interface identifiers are meant to be an alternative to those based on IEEE identifiers, as specified in [RFC4291].

We note that use of the algorithm specified in this document is (to a large extent) orthogonal to the use of "Privacy Extensions" [RFC4941]. Hosts that do not implement/use "Privacy Extensions" would have the benefit that they would not be subject to the host-tracking and host-scanning issues discussed in the previous section. On the other hand, in the case of hosts employing "Privacy Extensions", the method specified in this document would prevent host-scanning attacks and correlation of node activities across networks (see Section 7).

3. Algorithm specification

IPv6 implementations conforming to this specification MUST generate interface identifiers using the algorithm specified in this section. The aforementioned algorithm MUST be employed for generating the interface identifiers for all the IPv6 addresses configured with SLAAC for a given interface, including IPv6 link-local addresses. Unless otherwise noted, all of the parameters included in the expression below MUST be included when generating an Interface ID.

1. Compute a random (but stable) identifier with the expression:

RID = F(Prefix, Modified_EUI64, Network_ID, secret_key)

Where:

RID: Random (but stable) identifier

F(): A pseudorandom function (PRF) that is not computable from the outside (without knowledge of the secret key). The PRF could be implemented as a cryptographic hash of the concatenation of each of the function parameters.

Prefix: The prefix to be used for SLAAC, as learned from an ICMPv6 Router Advertisement message.

Modified_EUI64: The Modified EUI-64 format identifier corresponding to this network interface.

Network_ID: Some network-specific data that identifies the subnet to which this interface is attached, for example the IEEE 802.11 SSID of the network with which this interface is associated. This parameter is OPTIONAL.

secret_key: A secret key that is not known by the attacker. The secret key MUST be initialized at system installation time to the concatenation of a pseudo-random number (see [RFC4086] for randomness requirements for security) and the machine’s serial number. An implementation MAY provide the means for the user to change the secret key.

2. The Interface Identifier is finally obtained by taking the leftmost 64 bits of the RID value computed in the previous step, and setting bit 6 (the leftmost bit is numbered 0) to zero. This creates an interface identifier with the universal/local bit indicating local significance only.
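The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: SHA-256 is assumed as the cryptographic hash behind F(), each parameter is length-prefixed before hashing so that field boundaries stay unambiguous (the text only asks for a hash of the concatenation), and the function name is hypothetical.

```python
import hashlib


def stable_interface_id(prefix: bytes, modified_eui64: bytes,
                        secret_key: bytes, network_id: bytes = b"") -> bytes:
    """Return a 64-bit Interface Identifier derived from the RID."""
    hasher = hashlib.sha256()  # assumption: SHA-256 plays the role of F()
    for field in (prefix, modified_eui64, network_id, secret_key):
        # Assumption: a 2-byte length prefix keeps the concatenation unambiguous.
        hasher.update(len(field).to_bytes(2, "big"))
        hasher.update(field)
    rid = hasher.digest()

    # Step 2: take the leftmost 64 bits of the RID.
    iid = bytearray(rid[:8])
    # Clear bit 6 (leftmost bit numbered 0), i.e. mask 0x02 of the first octet,
    # so the universal/local bit indicates local significance.
    iid[0] &= 0xFD
    return bytes(iid)
```

Calling the function twice with the same inputs yields the same identifier, while changing only the prefix (or the optional Network_ID) yields an unrelated one, which matches the design goals in Section 2.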

Note that the result of F() in the algorithm above is no more secure than the secret key. If an attacker is aware of the PRF that is being used by the victim (which we should expect), and the attacker can obtain enough material (i.e. addresses configured by the victim), the attacker may simply search the entire secret-key space to find matches. To protect against this, the secret key should be of a reasonable length. Key lengths of at least 128 bits should be adequate. The secret key is initialized at installation time to the concatenation of a pseudo-random number and the machine’s serial number. This allows this mechanism to be enabled/used automatically, without user intervention.

The machine’s serial number is concatenated to the pseudo-random number, such that the entropy of the key is increased (since at installation time the entropy of the Pseudo-Random Number Generator might be reduced).
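A minimal sketch of the key initialization described above, assuming Python's `secrets` module as the source of the pseudo-random number and a UTF-8 encoding of the serial number. Both choices, and the function name, are assumptions made for illustration.

```python
import secrets


def initial_secret_key(serial_number: str, random_bytes: int = 16) -> bytes:
    """Concatenate a pseudo-random number with the machine's serial number."""
    if random_bytes < 16:
        # The text suggests key lengths of at least 128 bits.
        raise ValueError("use at least 16 random bytes (128 bits)")
    return secrets.token_bytes(random_bytes) + serial_number.encode("utf-8")
```

The key would be generated once at installation time and stored, since regenerating it would change every stable address the host configures.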

Including the SLAAC prefix in the PRF computation causes the Interface ID to vary across networks that employ different prefixes, thus mitigating host-tracking attacks and any other attacks that benefit from predictable Interface IDs (such as host scanning).

Including the optional Network_ID parameter when computing the RID value above would cause the algorithm to produce a different Interface Identifier when connecting to different networks, even when configuring addresses belonging to the same prefix. This means that a host would employ a different Interface ID as it moves from one network to another even for IPv6 link-local addresses or Unique Local Addresses (ULAs).

Note that there are a number of ways in which these addresses might leak out. For example, an attacker could use ICMPv6 Node Information queries [RFC4620] to obtain such addresses.

4. IANA Considerations

There are no IANA registries within this document. The RFC-Editor can remove this section before publication of this document as an RFC.

5. Security Considerations

This document specifies an algorithm for generating interface identifiers to be used with IPv6 Stateless Address Autoconfiguration (SLAAC), replacing e.g. the Modified EUI-64 format identifiers. When compared to modified EUI-64 format identifiers, the identifiers specified in this document have a number of advantages:

o They prevent trivial host-tracking, since when a host moves from one network to another the network prefix used for autoconfiguration and/or the Network ID (e.g., IEEE 802.11 SSID) will typically change, and hence the resulting interface identifier will also change (see Section 7).

o They mitigate host-scanning techniques which leverage predictable interface identifiers (e.g., known Organizational Unique Identifiers).

We note that this algorithm is meant to replace Modified EUI-64 format identifiers, but not temporary addresses such as those specified in [RFC4941]. Clearly, temporary addresses may help to mitigate the correlation of activities of a node within the same network, and may also reduce the attack exposure window (since the lifetime of a privacy/temporary IPv6 address is shorter than that of addresses generated with the method specified in this document). We note that implementation of this algorithm would still benefit those hosts employing "Privacy Addresses", since it would mitigate host-tracking vectors still present when privacy addresses are used (see Section 7), and would also mitigate host-scanning techniques that leverage patterns in IPv6 addresses that embed IEEE identifiers.

Finally, we note that the method described in this document may mitigate most of the privacy concerns arising from the use of IPv6 addresses that embed IEEE identifiers, without the use of temporary addresses, thus possibly offering an interesting trade-off for those scenarios in which the use of temporary addresses is not feasible.

6. Acknowledgements

The author would like to thank (in alphabetical order) Steven Bellovin, Dominik Elsbroek, Ray Hunter, Jong-Hyouk Lee, and Michael Richardson, for providing valuable comments on earlier versions of this document.

This document is based on the technical report "Security Assessment of the Internet Protocol version 6 (IPv6)" [CPNI-IPv6] authored by Fernando Gont on behalf of the UK Centre for the Protection of National Infrastructure (CPNI).

Fernando Gont would like to thank CPNI (http://www.cpni.gov.uk) for their continued support.

7. Privacy issues still present with RFC 4941

This section aims to clarify the motivation for using the method specified in this document even when privacy/temporary addresses are employed. It has been incorporated in the document to address a number of questions that arose during the presentation of this document at IETF 83 (Paris). This entire section might be removed prior to publication of this document as an RFC.

7.1. Host tracking

Some 6man participants questioned the inclusion of the SLAAC prefix in the PRF, and noted that the Interface-ID of "stable" addresses need not change across networks, since privacy/temporary addresses already mitigate host tracking. This section describes one possible attack scenario that illustrates that host-tracking may still be possible when privacy/temporary addresses are employed.

7.1.1. Tracking hosts across networks #1

A host configures the stable addresses without including the Prefix in the F() (the PRF). The aforementioned host now runs any application that needs to perform a server-like function (e.g. a peer-to-peer application). As a result of that, an attacker/user participating in the same application (e.g., P2P) would learn the Interface-ID used for the stable address.

Some time later, the same host moves to a completely different network, and uses the same P2P application, probably even with a different user. The attacker now interacts with the same host again, and hence can learn the "new" stable address. Since the interface ID is the same as the one used before, the attacker can infer that it is communicating with the same device as before.

Had the host included the Prefix when computing the Interface-ID (with the hash function F()) as RECOMMENDED in this document, the Interface-ID would have been different, and this privacy attack would not have been possible.

This is just *one* possible attack scenario, which should remind us that one should not disclose more than is really needed for achieving a specific goal (and an Interface-ID that is constant across different networks does exactly that: it discloses more information than is needed for providing a stable address).

7.1.2. Tracking hosts across networks #2

Once an attacker learns the fixed Interface-ID employed by the victim host for its stable address, the attacker is able to "probe" any given network for the presence of that host.

See Section 7.1.1 for just one example of how an attacker could learn such an Interface-ID. Other examples include sharing the same network segment with the victim at some point in time (think about sharing a conference network with 1000+ peers).

For example, if an attacker learns that in one network the victim used the Interface-ID 1111:2222:3333:4444 for its stable addresses, then the attacker could subsequently probe for the presence of that device in the network 2001:db8::/64 by sending a probe packet (an ICMPv6 Echo Request, or any other probe packet) to the address 2001:db8::1111:2222:3333:4444.
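The address arithmetic in this example is simple enough to sketch with Python's standard `ipaddress` module: the learned 64-bit identifier is added to the /64 network being probed. The function name is hypothetical, and the sketch only computes the target address; it does not send any packet.

```python
import ipaddress


def probe_address(prefix: str, interface_id: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a known 64-bit identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("expected a /64 prefix")
    # Parse the four 16-bit groups of the identifier as the low 64 bits.
    low_bits = int(ipaddress.IPv6Address("::" + interface_id))
    return network.network_address + low_bits
```

With the values from the text, `probe_address("2001:db8::/64", "1111:2222:3333:4444")` yields the address 2001:db8::1111:2222:3333:4444. An identifier that includes the prefix in its computation would differ on each network, so this single probe would no longer be enough.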

7.1.3. Revealing the identity of devices performing server-like functions

Some devices may typically perform server-like functions and may often be moved from one network to another (e.g., from storage devices to printers). Such devices are likely to simply disable (or not even implement) privacy/temporary addresses [RFC4941]. If the aforementioned devices employ Interface-IDs that are constant across networks, it would be trivial for an attacker to tell whether the same device is being used across networks by simply looking at the Interface ID. Clearly, performing server-like functions should not imply that a device discloses its identity (i.e., that the attacker can tell whether it is the same device providing some function in two different networks, at two different points in time).

The scheme proposed in this document prevents such information leakage by causing nodes to generate different Interface-IDs when moving from one network to another, thus mitigating this kind of privacy attack.

7.2. Host-scanning attacks

While it is usually assumed that host-scanning attacks are unfeasible, an attacker can leverage patterns in IPv6 address generation to greatly reduce the search space.

As noted earlier in this document, privacy/temporary addresses do not eliminate the use of IPv6 addresses that embed IEEE identifiers, and hence such hosts would still be vulnerable to host-scanning attacks unless they eliminate the patterns introduced by embedding IEEE identifiers in the Interface-ID. The method specified in this document would mitigate the aforementioned host-scanning attacks.
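As a rough illustration of how much such patterns shrink the search space, the arithmetic below assumes the attacker knows (or guesses) the 24-bit OUI of the target's network interface card, so that only the remaining 24 bits of a Modified EUI-64 identifier are unknown. The constant and function names are illustrative.

```python
IID_BITS = 64           # size of an IPv6 Interface Identifier
OUI_BITS = 24           # vendor part of the MAC address, assumed known
FILLER_BITS = 16        # the fixed 0xFFFE inserted by Modified EUI-64


def candidate_count(unknown_bits: int) -> int:
    """Number of identifiers an attacker has to try."""
    return 1 << unknown_bits


unknown_with_known_oui = IID_BITS - OUI_BITS - FILLER_BITS   # 24 bits
full_space = candidate_count(IID_BITS)                       # 2**64
reduced_space = candidate_count(unknown_with_known_oui)      # 2**24
reduction_factor = full_space // reduced_space               # 2**40
```

Under that assumption the attacker has about 16.8 million candidates per /64 instead of 2**64, which is why an unpredictable identifier matters even for stable addresses.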

8. References

8.1. Normative References

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4086] Eastlake, D., Schiller, J., and S. Crocker, "Randomness Requirements for Security", BCP 106, RFC 4086, June 2005.

[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

[RFC4941] Narten, T., Draves, R., and S. Krishnan, "Privacy Extensions for Stateless Address Autoconfiguration in IPv6", RFC 4941, September 2007.

8.2. Informative References

[RFC4620] Crawford, M. and B. Haberman, "IPv6 Node Information Queries", RFC 4620, August 2006.

[Gont-DEEPSEC2011] Gont, F., "Results of a Security Assessment of the Internet Protocol version 6 (IPv6)", DEEPSEC 2011 Conference, Vienna, Austria, November 2011.

[Broersma] Broersma, R., "IPv6 Everywhere: Living with a Fully IPv6-enabled environment", Australian IPv6 Summit 2010, Melbourne, VIC Australia, October 2010.

[CPNI-IPv6] Gont, F., "Security Assessment of the Internet Protocol version 6 (IPv6)", UK Centre for the Protection of National Infrastructure, (available on request).

Author’s Address

Fernando Gont UK CPNI

Email: [email protected] URI: http://www.cpni.gov.uk

Network Working Group R. Asati Internet-Draft H. Singh Updates: 4862 (if approved) W. Beebee Intended status: Standards Track Cisco Systems, Inc. Expires: July 8, 2012 E. Dart Lawrence Berkeley National Laboratory W. George Time Warner Cable C. Pignataro Cisco Systems, Inc. January 5, 2012

Enhanced Duplicate Address Detection draft-hsingh-6man-enhanced-dad-04.txt

Abstract

Appendix A of the IPv6 Duplicate Address Detection (DAD) document in RFC 4862 discusses Loopback Suppression and DAD. However, RFC 4862 does not settle on one specific automated means to detect loopback of Neighbor Discovery (ND of RFC 4861) messages used by DAD. Several service provider communities have expressed a need for automated detection of looped back ND messages used by DAD. This document includes mitigation techniques and then outlines the Enhanced DAD algorithm to automate detection of looped back IPv6 ND messages used by DAD. For network loopback tests, the Enhanced DAD algorithm allows IPv6 to self-heal after a loopback is placed and removed. Further, for certain access networks the document automates resolving a specific duplicate address conflict.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on July 8, 2012.

Asati, et al. Expires July 8, 2012 [Page 1] Internet-Draft Enhanced DAD January 2012

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Terminology
2. Introduction
3. Operational Mitigation Options
   3.1. Disable DAD on Interface
   3.2. Dynamic Disable/Enable of DAD Using Layer 2 Protocol
   3.3. Operational Considerations
4. The Enhanced DAD Algorithm
   4.1. General Rules
   4.2. Processing Rules for Senders
   4.3. Processing Rules for Receivers
   4.4. Impact on SEND
   4.5. Changes to RFC 4862
5. Actions to Perform on Detecting a Genuine Duplicate
6. Security Considerations
7. IANA Considerations
8. Acknowledgements
9. Normative References
Appendix A. Changes between the -03 and the -04 versions
Appendix B. Changes between the -02 and the -03 versions
Authors’ Addresses

1. Terminology

o DAD-failed state - Duplicate Address Detection failure as specified in [RFC4862]. Failure also covers the case in which the Target Address is optimistic. Optimistic DAD is specified in [RFC4429].

o Looped back message - also referred to as a reflected message. The message sent by the sender is received by the sender due to the network or an Upper Layer Protocol on the sender looping the message back.

o Loopback - A function in which the router’s interface (or the circuit to which the router’s interface is connected) is looped back or connected to itself. Loopback causes packets sent by the interface to be received by the interface, and results in interface unavailability for regular data traffic forwarding. See more details in section 9.1 of [RFC1247]. The Loopback function is commonly used in an interface context to gain information on the quality of the interface, by employing mechanisms such as ICMPv6 pings, bit-error tests, etc. In a circuit context, it is used in wide area environments including optical dense wave division multiplexing (DWDM) and SONET/SDH for fault isolation (e.g. by placing a loopback at different geographic locations along the path of a wide area circuit to help locate a circuit fault). The Loopback function may be employed locally or remotely.

o NS(DAD) - shorthand notation to denote an NS with unspecified IPv6 source-address issued during DAD.

2. Introduction

Appendix A of [RFC4862] discusses Loopback Suppression and Duplicate Address Detection (DAD). However, [RFC4862] does not settle on one specific automated means to detect loopback of ND messages used by DAD. One specific DAD message is a Neighbor Solicitation (NS), specified in [RFC4861]. The NS is issued by the network interface of an IPv6 node for DAD. Another message involved in DAD is a Neighbor Advertisement (NA). The Enhanced DAD algorithm proposed in this document focuses on detecting an NS looped back to the transmitting interface during the DAD operation. Detecting a looped back NA is of no use because no problems with DAD will occur if a node receives a looped back NA. Detection of any other looped back ND messages outside of the DAD operation is not critical, and thus this document does not cover such detection. The document also includes a Mitigation section that discusses means already available to mitigate the loopback problem.

Recently, service providers have reported a DAD loopback problem. In the reported scenario, loopback testing is underway on a circuit connected to an interface on a router. The interface on the router is enabled for IPv6. The interface issues an NS for DAD of the IPv6 link-local address. The NS is reflected back to the router interface due to the loopback condition of the circuit, and the router interface enters a DAD-failed state. In contrast to IPv4, IPv6 on the interface will not return to operation without manual intervention once the loopback condition is cleared.

There are other conditions which will also trigger similar problems with DAD loopback. While the following example is not a common configuration, it has occurred in a large service provider network. It is necessary to address it in the proposed solution because the trigger scenario has the potential to cause significant IPv6 service outages when it does occur. Two broadband modems in the same location are served by the same service provider, and both modems are served by one access concentrator and one layer 3 interface on the access concentrator. The Ethernet ports of the two modems are connected to a network hub. The access concentrator serving the modems is the first-hop IPv6 router for the modems. The access concentrator also supports proxying of DAD messages. Each modem is enabled for at least data services. The network interface of the access concentrator serving the two broadband modems is enabled for IPv6, and the interface issues an NS(DAD) message for the IPv6 link-local address. The NS message reaches one modem first; this modem sends the message to the network hub, which sends the message to the second modem, which forwards the message back to the access concentrator. The looped back NS message causes the network interface on the access concentrator to enter a DAD-failed state. Such a network interface typically serves up to 100 thousand broadband modems, so all the modems (and hosts behind the modems) fail to get IPv6 online on the access network. Additionally, it may be tedious for the access concentrator to find out which of the many thousands of homes looped back the DAD message. Clearly there is a need for automated detection of looped back NS messages during DAD operations by a node.

3. Operational Mitigation Options

Two mitigation options are described below. The mechanisms do not require any change to existing implementations.

3.1. Disable DAD on Interface

One can disable DAD on an interface and then there is no NS(DAD) issued to be looped back. DAD is disabled by setting the interface’s DupAddrDetectTransmits variable to zero. While this mitigation may be the simplest, it has three drawbacks.

It would likely require careful analysis of the configuration of such point-to-point interfaces; it requires a one-time manual configuration on each such interface; and, more importantly, genuine duplicates on the link will not be detected.

A network operator MAY use this mitigation.

3.2. Dynamic Disable/Enable of DAD Using Layer 2 Protocol

It is possible that one or more layer 2 protocols include provisions to detect the existence of a loopback on an interface circuit, usually by comparing protocol data sent and received. For example, PPP uses a magic number (section 6.4 of [RFC1661]) to detect a loopback on an interface.

When a layer 2 protocol detects that a loopback is present on an interface circuit, the device MUST temporarily disable DAD on the interface, and when the protocol detects that a loopback is no longer present (or the interface state has changed), the device MUST (re-)enable DAD on that interface.
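The event-driven behavior described above could look roughly like the sketch below. It assumes the layer 2 protocol delivers "loopback detected" and "loopback cleared" events to the interface object; the class, its fields, and the method names are hypothetical. DAD is disabled by setting DupAddrDetectTransmits to zero, as described in Section 3.1, and the configured value is restored afterwards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interface:
    name: str
    dup_addr_detect_transmits: int = 1      # default value in RFC 4862
    saved_transmits: Optional[int] = None   # configured value while DAD is off

    def on_loopback_detected(self) -> None:
        """Layer 2 reports a loopback: temporarily disable DAD."""
        if self.saved_transmits is None:
            self.saved_transmits = self.dup_addr_detect_transmits
            self.dup_addr_detect_transmits = 0

    def on_loopback_cleared(self) -> None:
        """Layer 2 reports the loopback is gone: re-enable DAD."""
        if self.saved_transmits is not None:
            self.dup_addr_detect_transmits = self.saved_transmits
            self.saved_transmits = None
```

Remembering the configured value, rather than resetting to a default, keeps the handler idempotent when the same event is reported more than once.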

This solution requires no protocol changes. This solution SHOULD be enabled by default, and MUST be a configurable option.

This mitigation has several benefits:

1. It leverages the layer 2 protocol’s built-in loopback detection capability, if available.

2. It scales better, since it is event-driven and requires no additional state, timers, etc. This may be a significant scaling consideration on devices with hundreds or thousands of interfaces that may be in loopback for long periods of time (such as while awaiting turn-up or during long-duration intrusive bit error rate tests).

3.3. Operational Considerations

The mitigation options discussed in this document do not require the devices on both ends of the circuit to support the mitigation functionality simultaneously, and do not propose any capability negotiation. The mitigation options are effective for unidirectional loopback.

The mitigation options may not be effective for bidirectional loopback (i.e., the loopback is placed in both directions of the circuit interface, so as to identify the faulty segment) if only one device follows a mitigation option specified in this document, since the other device would follow current behavior and disable IPv6 on that interface due to DAD until manual intervention restores it.

This is nothing different from what happens today (without the solutions proposed by this document) in case of unidirectional loopback. Hence, it is expected that an operator would resort to manual intervention for the devices not compliant with this document, as usual.

4. The Enhanced DAD Algorithm

The Enhanced DAD algorithm covers detection of a looped back NS(DAD) message. The document proposes use of the Nonce Option specified in the SEND document of [RFC3971]. The nonce is a random number as specified in [RFC3971]. If SEND is enabled on the router and the router also supports the Enhanced DAD algorithm (specified in this document), there is integration with the Enhanced DAD algorithm and SEND. See more details in the Impact on SEND section.

When the IPv6 network interface issues an NS(DAD) message, the interface includes the Nonce Option in the NS(DAD) message and saves the nonce in local storage. Subsequently, if the interface receives an identical NS(DAD) message, the interface logs a system management message, updates any statistics counter, and drops the looped back NS(DAD). If the DupAddrDetectTransmits variable for the interface is greater than one, subsequent NS(DAD) messages for the same Target Address should be suppressed. If the interface receives an NS(DAD) message with a different nonce but the Target Address matches a tentative or optimistic address on the interface, the interface logs a DAD-failed system management message, updates any statistics, and follows the behavior specified in [RFC4862] for DAD failure.

A six-byte random nonce is large enough to make nonce collisions unlikely. However, if there is a collision because two nodes (that are using the same Target Address in their NS(DAD)) generated the same random nonce, then the algorithm will incorrectly detect a looped back NS(DAD) when the NS(DAD) was issued to signal a genuine duplicate. Since each looped back NS(DAD) event is logged to system management, the administrator of the network will have to intervene manually.

The algorithm is capable of detecting any ND solicitation (NS and Router Solicitation) or advertisement (NA and Router Advertisement) that is looped back. However, saving a nonce and nonce-related data for all ND messages has an impact on the memory of the node and also adds the algorithm state to a substantially larger number of ND messages. Therefore, this document does not recommend using the algorithm outside of the DAD operation by an interface on a node.

4.1. General Rules

A node MUST implement detection of looped back NS(DAD) messages during DAD for an interface address.

4.2. Processing Rules for Senders

If a node has been configured to use the Enhanced DAD algorithm, when sending a NS(DAD) for a tentative or optimistic interface address the sender MUST generate a random nonce associated with the interface address, MUST save the nonce, and MUST include the nonce in the Nonce Option included in the NS(DAD). If a looped back NS(DAD) is detected by the interface, and if the DupAddrDetectTransmits variable for the interface is greater than one, subsequent NS(DAD) messages for the same Target Address SHOULD be suppressed.
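The sender rules above can be sketched in C as follows. This is a minimal illustration, not a normative implementation: the structure `dad_state` and the function `dad_build_nonce_option` are hypothetical names, the caller is assumed to supply six octets from a suitable random source, and the option layout (Type 14, Length in units of 8 octets) follows the Nonce option of [RFC3971].

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ND_OPT_NONCE       14  /* Nonce option type from RFC 3971 */
#define DAD_NONCE_LEN       6  /* six-byte nonce, as discussed in Section 4 */
#define ND_OPT_NONCE_SIZE   8  /* type + length + six nonce octets */

/* Hypothetical per-address DAD state kept by the sender. */
struct dad_state {
    uint8_t nonce[DAD_NONCE_LEN];
    int     nonce_valid;          /* non-zero once a nonce has been saved */
};

/*
 * Save the nonce for a tentative or optimistic address and write the
 * corresponding Nonce option into out.  random_octets must point to
 * DAD_NONCE_LEN octets obtained from the caller's random source.
 * Returns the number of octets written, or 0 if out is too small.
 */
size_t dad_build_nonce_option(struct dad_state *st,
                              const uint8_t *random_octets,
                              uint8_t *out, size_t out_len)
{
    if (out_len < ND_OPT_NONCE_SIZE)
        return 0;

    /* "MUST save the nonce" so a looped back NS(DAD) can be recognized. */
    memcpy(st->nonce, random_octets, DAD_NONCE_LEN);
    st->nonce_valid = 1;

    out[0] = ND_OPT_NONCE;
    out[1] = 1;                   /* option length in units of 8 octets */
    memcpy(&out[2], st->nonce, DAD_NONCE_LEN);
    return ND_OPT_NONCE_SIZE;
}
```

The eight octets written to `out` would be appended to the NS(DAD) message as an ND option; suppression of later transmissions when DupAddrDetectTransmits is greater than one is left to the caller.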

4.3. Processing Rules for Receivers

If the node has been configured to use the Enhanced DAD algorithm and an interface on the node receives any NS(DAD) message that matches the interface address (in tentative or optimistic state), the receiver compares the nonce in the message with the saved nonce. If a match is found, the node SHOULD log a system management message, SHOULD update any statistics counter, and MUST drop the received message. If the received NS(DAD) message includes a nonce and no match is found with the saved nonce, the node SHOULD log a system management message for DAD-failed and SHOULD update any statistics counter.
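The receiver rules above amount to a three-way decision, which might be sketched in C as shown below. The names `dad_verdict` and `dad_classify_ns` are hypothetical, and the sketch assumes the caller has already confirmed that the Target Address of the received NS(DAD) matches a tentative or optimistic address on the interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DAD_NONCE_LEN 6   /* six-byte nonce, as discussed in Section 4 */

enum dad_verdict {
    DAD_LOOPED_BACK,   /* nonce matches the saved one: log, count, drop */
    DAD_DUPLICATE,     /* nonce differs: log DAD-failed, follow RFC 4862 */
    DAD_NO_NONCE       /* no Nonce option present: not covered by this sketch */
};

/*
 * Compare the nonce carried in a received NS(DAD) with the saved nonce.
 * rx_nonce may be NULL (with rx_len 0) when the message has no Nonce option.
 */
enum dad_verdict dad_classify_ns(const uint8_t *saved_nonce,
                                 const uint8_t *rx_nonce, size_t rx_len)
{
    if (rx_nonce == NULL || rx_len == 0)
        return DAD_NO_NONCE;

    if (rx_len == DAD_NONCE_LEN &&
        memcmp(saved_nonce, rx_nonce, DAD_NONCE_LEN) == 0)
        return DAD_LOOPED_BACK;

    return DAD_DUPLICATE;
}
```

A caller would drop the message on `DAD_LOOPED_BACK` and apply the DAD failure behavior of [RFC4862] on `DAD_DUPLICATE`; handling of messages without a Nonce option is left to the existing implementation.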

4.4. Impact on SEND

The SEND document uses the Nonce Option in the context of matching an NA with an NS. However, no text in SEND has an explicit mention of detecting looped back ND messages. If this document updates [RFC4862], SEND should be updated to integrate with the Enhanced DAD algorithm. A minor update to SEND would be to explicitly mention that the nonce in SEND is also used by SEND to detect looped back NS messages during DAD operations by the node. In a mixed SEND environment with SEND and unsecured nodes, the lengths of the nonce used by SEND and unsecured nodes MUST be identical.

4.5. Changes to RFC 4862

The following text is added to [RFC4862].

A network interface of an IPv6 node SHOULD implement the Enhanced DAD algorithm. For example, if the interface on an IPv6 node is connected to a circuit that supports loopback testing, then the node should implement the Enhanced DAD algorithm, which allows the IPv6 interface to self-heal after loopback testing ends on the circuit. Another example is an IPv6 interface that resides on an access concentrator running DAD Proxy and supports up to 100 thousand IPv6 clients (broadband modems) connected to it. If the interface performs DAD for its IPv6 link-local address and the DAD probe is reflected back to the interface, the interface is stuck in the DAD-failed state and IPv6 service to the 100 thousand clients is denied. Disabling DAD for such an IPv6 interface on an access concentrator is not an option because the network also needs to detect genuine duplicates in the interface's downstream network. The Enhanced DAD algorithm also facilitates detecting a genuine duplicate for the interface on the access concentrator. See the Actions to Perform on Detecting a Genuine Duplicate section of the Enhanced DAD document.

5. Actions to Perform on Detecting a Genuine Duplicate

As described in the paragraphs above, the nonce can also serve to detect genuine duplicates even when the network has the potential for looping back ND messages. When a genuine duplicate is detected, the node follows the manual intervention specified in section 5.4.5 of [RFC4862]. However, in certain networks, such as an access network, automated actions are proposed if the genuine duplicate matches the tentative or optimistic IPv6 address of a network interface of the access concentrator.

One such access network is a cable broadband deployment where the access concentrator is the first-hop IPv6 router to several thousand broadband modems. The router also supports proxying of DAD messages. The network interface on the access concentrator initiates DAD for an IPv6 address and detects a genuine duplicate due to receiving an NS(DAD) or an NA message. On detecting such a duplicate, the access concentrator logs a system management message, drops the received ND message, and blocks the modem on whose layer 2 service identifier the NS(DAD) or NA message was received.

The network described above follows a trust model where a trusted router serves untrusted IPv6 host nodes. Operators of such networks wish to take automated action if a network interface of the trusted router has a tentative or optimistic address that duplicates the address of a host served by the trusted router interface. Any other network that follows the same trust model MAY use the automated actions proposed in this section.

6. Security Considerations

The nonce can be exploited by a rogue node deliberately changing the nonce to defeat the looped back detection specified by the Enhanced DAD algorithm. SEND is recommended to mitigate this exploit. Any mitigation suggested in this document, such as disabling DAD, has an obvious security issue because a remote node on the link can issue reflected NS(DAD) messages. Again, SEND is recommended to mitigate this exploit.

7. IANA Considerations

None.

8. Acknowledgements

Thanks (in alphabetical order by first name) to Dmitry Anipko, Eric Levy-Abegnoli, Erik Nordmark, Fred Templin, Suresh Krishnan, and Tassos Chatzithomaoglou for their guidance and review of the document. Thanks to Thomas Narten for encouraging this work. Thanks to Steinar Haug and Scott Beuker for describing the use cases.

9. Normative References

[RFC1247] Moy, J., "OSPF Version 2", RFC 1247, July 1991.

[RFC1661] Simpson, W., "The Point-to-Point Protocol (PPP)", STD 51, RFC 1661, July 1994.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure Neighbor Discovery (SEND)", RFC 3971, March 2005.

[RFC4429] Moore, N., "Optimistic Duplicate Address Detection (DAD) for IPv6", RFC 4429, April 2006.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

Appendix A. Changes between the -03 and the -04 versions

1. Modified text in the Introduction section for the second DAD Loopback cable broadband problem with some editorial changes.

Appendix B. Changes between the -02 and the -03 versions

1. Added more precise definition of the Loopback function in a circuit network to the Terminology section.

2. Added more clarification to the cable broadband example in the Introduction section.

3. Changed the Changes to RFC 4862 section for the requirement to support the Enhanced DAD algorithm from MUST to SHOULD. Also changed the requirement to apply to a network interface for IPv6 rather than a router or host. Since the requirement changed from a MUST to a SHOULD, a guidance paragraph was added to explain the SHOULD.

4. Fixed some typos.

Authors’ Addresses

Rajiv Asati
Cisco Systems, Inc.
7025 Kit Creek road
Research Triangle Park, NC 27709-4987
USA

Email: [email protected]
URI: http://www.cisco.com/

Hemant Singh
Cisco Systems, Inc.
1414 Massachusetts Ave.
Boxborough, MA 01719
USA

Phone: +1 978 936 1622
Email: [email protected]
URI: http://www.cisco.com/

Wes Beebee
Cisco Systems, Inc.
1414 Massachusetts Ave.
Boxborough, MA 01719
USA

Phone: +1 978 936 2030
Email: [email protected]
URI: http://www.cisco.com/

Eli Dart
Lawrence Berkeley National Laboratory
1 Cyclotron Road, Berkeley, CA 94720
USA

Email: [email protected]
URI: http://www.es.net/

Wes George
Time Warner Cable
13820 Sunrise Valley Drive
Herndon, VA 20171
USA

Email: [email protected]

Carlos Pignataro
Cisco Systems, Inc.
7200-12 Kit Creek Road
Research Triangle Park, NC 27709
USA

Email: [email protected]
URI: http://www.cisco.com/

IPv6 Maintenance Working Group                              K. Lynn, Ed.
Internet-Draft                                                Consultant
Intended status: Standards Track                             J. Martocci
Expires: September 13, 2012                             Johnson Controls
                                                              C. Neilson
                                                          Delta Controls
                                                            S. Donaldson
                                                               Honeywell
                                                          March 12, 2012

Transmission of IPv6 over MS/TP Networks
draft-ietf-6man-6lobac-01

Abstract

MS/TP (Master-Slave/Token-Passing) is a contention-free access method for the RS-485 physical layer that is used extensively in building automation networks. This document describes the frame format for transmission of IPv6 packets and the method of forming link-local and statelessly autoconfigured IPv6 addresses on MS/TP networks.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on September 13, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Lynn, et al. Expires September 13, 2012 [Page 1] Internet-Draft IPv6 over MS/TP March 2012

Table of Contents

1. Introduction
2. MS/TP Mode for IPv6
3. Addressing Modes
4. Maximum Transmission Unit (MTU)
5. LoBAC Adaptation Layer
6. Stateless Address Autoconfiguration
7. IPv6 Link Local Address
8. Unicast Address Mapping
9. Multicast Address Mapping
10. Header Compression
11. IANA Considerations
12. Security Considerations
13. Acknowledgments
14. References
   14.1. Normative References
   14.2. Informative References
Appendix A. Extended Data CRC [CRC32K]
Appendix B. Consistent Overhead Byte Stuffing [COBS]
Authors’ Addresses

1. Introduction

MS/TP (Master-Slave/Token-Passing) is a contention-free access method for the RS-485 [TIA-485-A] physical layer that is used extensively in building automation networks. This document describes the frame format for transmission of IPv6 [RFC2460] packets and the method of forming link-local and statelessly autoconfigured IPv6 addresses on MS/TP networks. The general approach is to adapt elements of the 6LoWPAN [RFC4944] specification to constrained wired networks.

An MS/TP device is typically based on a low-cost microcontroller with limited processing power and memory. Together with low data rates and a small address space, these constraints are similar to those faced in 6LoWPAN networks and suggest some elements of that solution might be applied. MS/TP differs significantly from 6LoWPAN in at least three respects: a) MS/TP devices typically have a continuous source of power, b) all MS/TP devices on a segment can communicate directly so there are no hidden node or mesh routing issues, and c) proposed changes to MS/TP will support payloads of up to 1501 octets, eliminating the need for link-layer fragmentation and reassembly.

The following sections provide a brief overview of MS/TP, then describe how to form IPv6 addresses and encapsulate IPv6 packets in MS/TP frames. This document also specifies a header compression mechanism, based on [RFC6282], that is recommended in order to make IPv6 practical on low speed MS/TP networks.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

1.2. Abbreviations Used

ASHRAE: American Society of Heating, Refrigerating, and Air-Conditioning Engineers (http://www.ashrae.org)

BACnet: An ISO/ANSI/ASHRAE Standard Data Communication Protocol for Building Automation and Control Networks

CRC: Cyclic Redundancy Check

MAC: Medium Access Control

MSDU: MAC Service Data Unit (MAC client data)

UART: Universal Asynchronous Transmitter/Receiver

1.3. MS/TP Overview

This section provides a brief overview of MS/TP, which is specified in Clause 9 of ANSI/ASHRAE 135-2010 [BACnet] and included herein by reference. [BACnet] also covers physical layer deployment options.

MS/TP is designed to enable multidrop networks over shielded twisted pair wiring. It can support segments up to 1200 meters in length or data rates up to 115,200 baud (at this highest data rate the segment length is limited to 1000 meters). An MS/TP link requires only a UART, a 5ms resolution timer, and a [TIA-485-A] transceiver with a driver that can be disabled. These features combine to make MS/TP a cost-effective field bus for the most numerous and least expensive devices in a building automation network.

The differential signaling used by [TIA-485-A] requires a contention- free MAC. MS/TP uses a token to control access to a multidrop bus. A master node may initiate the transmission of a data frame when it holds the token. After sending at most a configured maximum number of data frames, a master node passes the token to the next master node (as determined by node address). Slave nodes transmit only when polled and are not considered part of this specification.

MS/TP frames have the following format*:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      0x55     |      0xFF     |  Frame Type*  |      DA       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      SA       |    Length (MS octet first)    |   Header CRC  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                                                               /
   /              Data and Extended Data CRC fields*               /
   /    encoded using Consistent Overhead Byte Stuffing [COBS]     /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | optional 0xFF |
   +-+-+-+-+-+-+-+-+

Figure 1: MS/TP Extended Frame Format

*Note: A BACnet proposal [Addendum_an], now in public review, assigns a new Frame Type for IPv6 Encapsulation, extends the maximum length of the Data field to 1501 octets, and specifies a 32-bit Extended Data CRC [CRC32K] for these frames. The Data and Extended Data CRC fields are [COBS] encoded and present only if Length is non-zero.

The MS/TP frame fields have the following descriptions**:

   Preamble              two octet preamble: 0x55, 0xFF
   Frame Type            one octet
   Destination Address   one octet address
   Source Address        one octet address
   Length                two octets, most significant octet first
   Header CRC            one octet
   Data                  0 - 1501 octets**
                         (present only if Length is non-zero)
   Extended Data CRC     four octets**, least significant octet first
                         (present only if Length is non-zero)
   (pad)                 (optional) at most one octet of trailer: 0xFF

The Frame Type is used to distinguish between different types of MAC frames. Currently defined types (in decimal) are:

   00   Token
   01   Poll For Master
   02   Reply To Poll For Master
   ...
   10   IPv6 over MS/TP Encapsulation**

Frame Types 11 through 127 are reserved for assignment by ASHRAE. All master nodes MUST understand Token, Poll For Master, and Reply to Poll For Master frames. See Section 2 for additional details.

The Destination and Source Addresses are each one octet in length. See Section 3 for additional details.

A non-zero Length field specifies the length of the [COBS] encoded Data and Extended Data CRC fields in octets, minus two. (Note: This trick is required for co-existence with legacy MS/TP devices.) See Section 4 and Appendices for additional details.

The Header CRC field covers the Frame Type, Destination Address, Source Address, and Length fields. The Header CRC generation and check procedures are specified in [BACnet].

**The Data and Extended Data CRC fields are conditional on the Frame Type and the Length and will always be present in frames specified by this document. These fields are concatenated and then encoded before transmission using Consistent Overhead Byte Stuffing [COBS] to remove preamble sequences from the fields. The Extended Data CRC and COBS encoding procedures are specified in the BACnet [Addendum_an] change proposal and briefly summarized in Appendices A and B below.
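To make the header layout described in this section concrete, the following C sketch serializes the eight fixed octets of an MS/TP frame. It is illustrative only: `mstp_header`, `mstp_length_field`, and `mstp_write_header` are hypothetical names, the Header CRC is assumed to be computed elsewhere according to [BACnet], and the encoded data length is assumed to be either zero or at least three octets.

```c
#include <stddef.h>
#include <stdint.h>

#define MSTP_PREAMBLE_1   0x55
#define MSTP_PREAMBLE_2   0xFF
#define MSTP_FRAME_IPV6   10   /* IPv6 over MS/TP Encapsulation */
#define MSTP_HEADER_SIZE  8    /* preamble(2) type da sa length(2) crc */

/* Hypothetical in-memory view of the header fields. */
struct mstp_header {
    uint8_t  frame_type;
    uint8_t  dest;
    uint8_t  src;
    uint16_t length;   /* value of the Length field as sent on the wire */
};

/*
 * Length field for a frame whose COBS-encoded Data and Extended Data CRC
 * fields occupy encoded_len octets: the encoded length minus two.
 * Returns 0 when no data is present.
 */
uint16_t mstp_length_field(size_t encoded_len)
{
    return encoded_len == 0 ? 0 : (uint16_t)(encoded_len - 2);
}

/* Write the fixed header; header_crc is supplied by the caller. */
void mstp_write_header(const struct mstp_header *h, uint8_t header_crc,
                       uint8_t out[MSTP_HEADER_SIZE])
{
    out[0] = MSTP_PREAMBLE_1;
    out[1] = MSTP_PREAMBLE_2;
    out[2] = h->frame_type;
    out[3] = h->dest;
    out[4] = h->src;
    out[5] = (uint8_t)(h->length >> 8);    /* most significant octet first */
    out[6] = (uint8_t)(h->length & 0xFF);
    out[7] = header_crc;
}
```

For example, 300 octets of encoded data yield a Length field of 298 (0x012A), which is written as the octets 0x01 followed by 0x2A.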

1.4. Goals and Non-goals

The primary goal of this specification is to enable IPv6 directly to wired end devices in building automation and control networks, while leveraging existing standards to the greatest extent possible. A secondary goal is to co-exist with legacy MS/TP implementations. Only the minimum changes necessary to support IPv6 over MS/TP are proposed in BACnet [Addendum_an] (see note in Section 1.3).

Non-goals include making changes to the MS/TP frame header format, control frames, Master Node state machine, or addressing modes. Also, while the techniques described here may be applicable to other data links, no attempt is made to define a general design pattern.

2. MS/TP Mode for IPv6

The BACnet [Addendum_an] change proposal allocates a new MS/TP Frame Type from the ASHRAE reserved range to indicate IPv6 Encapsulation. The new Frame Type for IPv6 Encapsulation is 10 (0x0A).

All MS/TP master nodes (including those that support IPv6) must understand Token, Poll For Master, and Reply to Poll For Master control frames and support the Master Node state machine as specified in [BACnet]. MS/TP master nodes that support IPv6 must also support the Receive Frame state machine as specified in [BACnet] as extended by [Addendum_an].

3. Addressing Modes

MS/TP node (link-layer) addresses are one octet in length. The method of assigning node addresses is outside the scope of this document. However, each MS/TP node on the link MUST have a unique address or a misconfiguration condition exists.

[BACnet] specifies that addresses 0 through 127 are valid for master nodes. The method specified in Section 6 for creating the Interface Identifier (IID) ensures that an IID of all zeros can never result.

A Destination Address of 255 (0xFF) denotes a link-level broadcast (all nodes). A Source Address of 255 MUST NOT be used. MS/TP does not support multicast, therefore all IPv6 multicast packets MUST be sent as link-level broadcasts and filtered at the IPv6 layer.
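The addressing rules in this section can be summarized by a few small checks. The C sketch below is one possible reading; the function names are hypothetical, and `resolved_addr` stands for the neighbor's MS/TP address as learned through address resolution, which is not shown here.

```c
#include <stdint.h>

#define MSTP_BROADCAST   255   /* link-level broadcast (all nodes) */
#define MSTP_MAX_MASTER  127   /* highest master node address per [BACnet] */

/* Non-zero if addr lies in the range valid for master nodes. */
int mstp_is_master_addr(uint8_t addr)
{
    return addr <= MSTP_MAX_MASTER;
}

/* Non-zero if addr may be used as a Source Address (255 MUST NOT be used). */
int mstp_is_valid_source(uint8_t addr)
{
    return addr != MSTP_BROADCAST;
}

/*
 * Choose the MS/TP Destination Address for an outgoing IPv6 packet.
 * ipv6_dst points to the 16-octet IPv6 destination address.
 */
uint8_t mstp_dest_for(const uint8_t ipv6_dst[16], uint8_t resolved_addr)
{
    /* IPv6 multicast addresses begin with the octet 0xFF (ff00::/8). */
    if (ipv6_dst[0] == 0xFF)
        return MSTP_BROADCAST;
    return resolved_addr;
}
```

Because multicast packets are sent as link-level broadcasts, every receiver still has to filter them at the IPv6 layer; that filtering is outside this sketch.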

This document assumes that each MS/TP link maps onto a unique IPv6 subnet prefix. Hosts learn IPv6 prefixes via router advertisements according to [RFC4861].

4. Maximum Transmission Unit (MTU)

The BACnet [Addendum_an] change proposal specifies that the MSDU be increased to 1501 octets and covered by a 32-bit CRC. This is sufficient to convey an MTU of at least 1280 octets as required by IPv6 without the need for link-layer fragmentation and reassembly.

However, the relatively low data rates of MS/TP still make a compelling case for header compression. An adaptation layer to indicate compressed or uncompressed IPv6 headers is specified below in Section 5 and the compression scheme is specified in Section 10.

5. LoBAC Adaptation Layer

The encapsulation formats defined in this section (subsequently referred to as the "LoBAC" encapsulation) comprise the payload (MSDU) of an MS/TP frame. The LoBAC payload (i.e., an IPv6 packet) follows an encapsulation header stack. LoBAC is a subset of the LoWPAN encapsulation defined in [RFC4944], therefore the use of "LOWPAN" in literals below is intentional. The primary differences between LoBAC and LoWPAN are: a) exclusion of the Fragmentation, Mesh, and Broadcast headers, and b) use of LOWPAN_IPHC [RFC6282] in place of LOWPAN_HC1 header compression (which is deprecated by [RFC6282]).

All LoBAC encapsulated datagrams transmitted over MS/TP are prefixed by an encapsulation header stack. Each header in the stack consists of a header type followed by zero or more header fields. Whereas in an IPv6 header the stack would contain, in the following order, addressing, hop-by-hop options, routing, fragmentation, destination options, and finally payload [RFC2460]; in a LoBAC encapsulation the analogous sequence is (optional) header compression and payload. The header stacks that are valid in a LoBAC network are shown below.

A LoBAC encapsulated IPv6 datagram:

   +---------------+-------------+---------+
   | IPv6 Dispatch | IPv6 Header | Payload |
   +---------------+-------------+---------+

A LoBAC encapsulated LOWPAN_IPHC compressed IPv6 datagram:

   +---------------+-------------+---------+
   | IPHC Dispatch | IPHC Header | Payload |
   +---------------+-------------+---------+

All protocol datagrams (i.e., IPv6 or compressed IPv6 headers) SHALL be preceded by one of the valid LoBAC encapsulation headers. This permits uniform software treatment of datagrams without regard to their mode of transmission.

The definition of LoBAC headers consists of the dispatch value, the definition of the header fields that follow, and their ordering constraints relative to all other headers. Although the header stack structure provides a mechanism to address future demands on the LoBAC (LoWPAN) adaptation layer, it is not intended to provide general purpose extensibility. This format document specifies a small set of header types using the header stack for clarity, compactness, and orthogonality.

5.1. Dispatch Value and Header

The LoBAC Dispatch value begins with a "0" bit followed by a "1" bit. The Dispatch value and header are shown here:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|1| Dispatch  |                 Type-specific header
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Dispatch               6-bit selector. Identifies the type of header
                          immediately following the Dispatch value.

   Type-specific header   A header determined by the Dispatch value.

Figure 2: Dispatch Value and Header

The Dispatch value may be treated as an unstructured namespace. Only a few symbols are required to represent current LoBAC functionality. Although some additional savings could be achieved by encoding additional functionality into the dispatch value, these measures would tend to constrain the ability to address future alternatives.

      Pattern    Header Type
   +------------+-----------------------------------------------------+
   | 00  xxxxxx | NALP        - Not a LoWPAN (LoBAC) frame            |
   | 01  000000 | ESC         - Additional Dispatch octet follows     |
   | 01  000001 | IPv6        - Uncompressed IPv6 Addresses           |
   | ...        | reserved    - Defined or reserved by [RFC4944]      |
   | 01  1xxxxx | LOWPAN_IPHC - LOWPAN_IPHC compressed IPv6 [RFC6282] |
   | 1x  xxxxxx | reserved    - Defined or reserved by [RFC4944]      |
   +------------+-----------------------------------------------------+

Figure 3: Dispatch Value Bit Patterns

NALP: Specifies that the following bits are not a part of the LoBAC encapsulation, and any LoBAC node that encounters a Dispatch value of 00xxxxxx shall discard the packet. Non-LoBAC protocols that wish to coexist with LoBAC nodes should include an octet matching this pattern immediately following the MS/TP header.

ESC: Specifies that the following header is a single 8-bit field for the Dispatch value. It allows support for Dispatch values larger than 127 (see [RFC6282] section 5).

IPv6: Specifies that the following header is an uncompressed IPv6 header [RFC2460].

LOWPAN_IPHC: A value of 011xxxxx specifies a LOWPAN_IPHC compression header (see Section 10).

Reserved: A LoBAC node that encounters a Dispatch value in the range 01000010 through 01011111 or 1xxxxxxx SHALL discard the packet.
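The dispatch rules above reduce to a handful of bit-mask tests on the first octet of the MSDU. The following C sketch shows one way a receiver might classify that octet; the enumeration and function names are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

enum lobac_dispatch {
    LOBAC_NALP,      /* 00 xxxxxx: not a LoBAC frame, discard */
    LOBAC_ESC,       /* 01 000000: additional Dispatch octet follows */
    LOBAC_IPV6,      /* 01 000001: uncompressed IPv6 header follows */
    LOBAC_IPHC,      /* 01 1xxxxx: LOWPAN_IPHC compressed header follows */
    LOBAC_RESERVED   /* any other value: discard */
};

/* Classify the first octet of a received LoBAC payload. */
enum lobac_dispatch lobac_classify(uint8_t d)
{
    if ((d & 0xC0) == 0x00)
        return LOBAC_NALP;
    if (d == 0x40)
        return LOBAC_ESC;
    if (d == 0x41)
        return LOBAC_IPV6;
    if ((d & 0xE0) == 0x60)
        return LOBAC_IPHC;
    /* Remaining values: 01000010 through 01011111, and 1xxxxxxx. */
    return LOBAC_RESERVED;
}
```

A receiver would discard the packet for `LOBAC_NALP` and `LOBAC_RESERVED` and hand the remaining octets to the IPv6 or LOWPAN_IPHC input path otherwise.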

6. Stateless Address Autoconfiguration

This section defines how to obtain an IPv6 Interface Identifier. The general procedure is described in Appendix A of [RFC4291], "Creating Modified EUI-64 Format Interface Identifiers".

The Interface Identifier may be based on an [EUI-64] identifier assigned to the device (but this is not typical for MS/TP). In this case, the Interface Identifier is formed from the EUI-64 by inverting the "u" (universal/local) bit according to [RFC4291]. This will result in a globally unique Interface Identifier.

If the device does not have an EUI-64, then the Interface Identifier MUST be formed by concatenating its 8-bit MS/TP node address to the seven octets 0x00, 0x00, 0x00, 0xFF, 0xFE, 0x00, 0x00. For example, an MS/TP node address of hexadecimal value 0x4F results in the following Interface Identifier:

   |0              1|1              3|3              4|4              6|
   |0              5|6              1|2              7|8              3|
   +----------------+----------------+----------------+----------------+
   |0000000000000000|0000000011111111|1111111000000000|0000000001001111|
   +----------------+----------------+----------------+----------------+

Note that this results in the universal/local bit set to "0" to indicate local scope.
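For a device without an EUI-64, the construction described above is a fixed seven-octet prefix followed by the node address. A minimal C sketch, using the hypothetical function name `mstp_make_iid`, is:

```c
#include <stdint.h>
#include <string.h>

/*
 * Form the 64-bit Interface Identifier for a node without an EUI-64:
 * the seven octets 00 00 00 FF FE 00 00 followed by the 8-bit MS/TP
 * node address.
 */
void mstp_make_iid(uint8_t node_addr, uint8_t iid[8])
{
    static const uint8_t fixed[7] = {0x00, 0x00, 0x00, 0xFF, 0xFE, 0x00, 0x00};

    memcpy(iid, fixed, sizeof fixed);
    iid[7] = node_addr;
}
```

For node address 0x4F this produces the octets 00 00 00 FF FE 00 00 4F, which matches the bit pattern shown in the example above. The first octet is zero, so the universal/local bit is "0".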

An IPv6 address prefix used for stateless autoconfiguration [RFC4862] of an MS/TP interface MUST have a length of 64 bits.

7. IPv6 Link Local Address

The IPv6 link-local address [RFC4291] for an MS/TP interface is formed by appending the Interface Identifier, as defined above, to the prefix FE80::/64.

     10 bits           54 bits                   64 bits
   +----------+-----------------------+----------------------------+
   |1111111010|        (zeros)        |    Interface Identifier    |
   +----------+-----------------------+----------------------------+
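The layout above can be expressed in C as shown below. This is a sketch with a hypothetical function name; it assumes the Interface Identifier has already been formed as described in Section 6.

```c
#include <stdint.h>
#include <string.h>

/*
 * Build the 128-bit link-local address: the prefix FE80::/64 followed
 * by the 64-bit Interface Identifier.
 */
void mstp_link_local(const uint8_t iid[8], uint8_t addr[16])
{
    memset(addr, 0, 16);       /* the 54 zero bits, among others */
    addr[0] = 0xFE;            /* 1111 1110 */
    addr[1] = 0x80;            /* 10 followed by zeros */
    memcpy(&addr[8], iid, 8);  /* Interface Identifier in the low 64 bits */
}
```

With the Interface Identifier from the 0x4F example in Section 6, the resulting address is fe80::ff:fe00:4f.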

8. Unicast Address Mapping

The address resolution procedure for mapping IPv6 non-multicast addresses into MS/TP link-layer addresses follows the general description in Section 7.2 of [RFC4861], unless otherwise specified.

The Source/Target Link-layer Address option has the following form when the addresses are 8-bit MS/TP node (link-layer) addresses.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length=1   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     0x00      | MS/TP Address |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |
   +-          Padding            -+
   |          (all zeros)          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Option fields:

Type:

1: for Source Link-layer address.

2: for Target Link-layer address.

Length: This is the length of this option (including the type and length fields) in units of 8 octets. The value of this field is 1 for 8-bit MS/TP node addresses.

MS/TP Address: The 8-bit address in canonical bit order [RFC2469]. This is the unicast address the interface currently responds to.
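The option layout described in this section occupies exactly eight octets. The following C sketch fills in such an option; the function and constant names are hypothetical, and the Type values 1 and 2 are those listed above.

```c
#include <stdint.h>
#include <string.h>

#define ND_OPT_SOURCE_LLADDR   1
#define ND_OPT_TARGET_LLADDR   2
#define MSTP_LLADDR_OPT_SIZE   8   /* Length=1, in units of 8 octets */

/*
 * Write a Source or Target Link-layer Address option carrying an 8-bit
 * MS/TP node address: Type, Length=1, 0x00, address, four zero octets.
 */
void mstp_write_lladdr_option(uint8_t type, uint8_t node_addr,
                              uint8_t out[MSTP_LLADDR_OPT_SIZE])
{
    memset(out, 0, MSTP_LLADDR_OPT_SIZE);  /* zeroes the padding as well */
    out[0] = type;
    out[1] = 1;
    out[2] = 0x00;
    out[3] = node_addr;
}
```

A receiver would read the node address from the fourth octet of the option and ignore the padding.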

9. Multicast Address Mapping

All IPv6 multicast packets MUST be sent to MS/TP Destination Address 255 (broadcast) and filtered at the IPv6 layer. When represented as a 16-bit address in a compressed header (see Section 10), it MUST be formed by padding on the left with a zero:

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     0x00      |     0xFF      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

10. Header Compression

LoBAC uses LOWPAN_IPHC IPv6 compression, which is specified in [RFC6282] and included herein by reference. This section will simply identify substitutions that should be made when interpreting the text of [RFC6282].

In general the following substitutions should be made:

* Replace "6LoWPAN" with "MS/TP network"

* Replace "IEEE 802.15.4 address" with "MS/TP address"

When a 16-bit address is called for (i.e., an IEEE 802.15.4 "short address") it MUST be formed by padding the MS/TP address to the left with a zero:

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     0x00      | MS/TP address |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
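The padding rule above is simply a zero octet placed ahead of the MS/TP address. A short C sketch, with hypothetical function names, is:

```c
#include <stdint.h>

/* 16-bit "short address" value: the MS/TP address padded on the left. */
uint16_t mstp_short_addr(uint8_t node_addr)
{
    return (uint16_t)node_addr;
}

/* The same address as two octets in transmission order: 0x00, address. */
void mstp_short_addr_octets(uint8_t node_addr, uint8_t out[2])
{
    out[0] = 0x00;
    out[1] = node_addr;
}
```

Applied to the broadcast address 255, this gives 0x00FF, which agrees with the multicast mapping in Section 9.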

11. IANA Considerations

This document uses values previously reserved by [RFC4944] and [RFC6282] and makes no further requests of IANA.

Note to RFC Editor: this section may be removed upon publication.

12. Security Considerations

The method of deriving Interface Identifiers from MAC addresses is intended to preserve global uniqueness when possible. However, there is no protection from duplication through accident or forgery.

13. Acknowledgments

We are grateful to the authors of [RFC4944] and members of the IETF 6LoWPAN working group; this document borrows extensively from their work.

14. References

14.1. Normative References

[BACnet] American Society of Heating, Refrigerating, and Air-Conditioning Engineers, "BACnet, A Data Communication Protocol for Building Automation and Control Networks (ANSI Approved)", ANSI/ASHRAE 135-2010, April 2011.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

[RFC4944] Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler, "Transmission of IPv6 Packets over IEEE 802.15.4 Networks", RFC 4944, September 2007.

[RFC6282] Hui, J. and P. Thubert, "Compression Format for IPv6 Datagrams over IEEE 802.15.4-Based Networks", RFC 6282, September 2011.

14.2. Informative References

[Addendum_an] ASHRAE, "BSR/ASHRAE Addendum an to ANSI/ASHRAE Standard 135-2010, BACnet - A Data Communication Protocol for Building Automation and Control Networks (Advisory Public Review Draft)", September 2011.

[COBS] Cheshire, S. and M. Baker, "Consistent Overhead Byte Stuffing", IEEE/ACM Transactions on Networking, Vol. 7, No. 2, April 1999.

[CRC32K] Koopman, P., "32-Bit Cyclic Redundancy Codes for Internet Applications", IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2002), June 2002.

[EUI-64] IEEE, "Guidelines for 64-bit Global Identifier (EUI-64) Registration Authority", March 1997.

[RFC2469] Narten, T. and C. Burton, "A Caution On The Canonical Ordering Of Link-Layer Addresses", RFC 2469, December 1998.

[TIA-485-A] Telecommunications Industry Association, "TIA-485-A, Electrical Characteristics of Generators and Receivers for Use in Balanced Digital Multipoint Systems (ANSI/TIA/EIA-485-A-98) (R2003)", March 2003.


Appendix A. Extended Data CRC [CRC32K]

This Appendix is informative and not part of the standard.

Extending the payload of MS/TP to 1501 octets requires upgrading the Data CRC from 16 bits to 32 bits. Koopman has published several papers on choosing the right CRC polynomial for the application. In [CRC32K], he surveyed the entire 32-bit polynomial space and identified some that exceed the 802.3 polynomial in performance.

The BACnet MS/TP change proposal [Addendum_an] specifies the CRC32K (Koopman) polynomial. An example C implementation is shown below. The function is used as follows: ’crcValue’ is initialized to all ones before the function is first called and, after the function has been run over all octets in the payload, the ones complement of ’crcValue’ is sent in LSB-first order. Upon reception, the data field and the received (complemented) CRC octets are passed through the function again. If these fields were received correctly, the result of the function will be ’CRC32K_RESIDUE’.


#include <stdint.h>

/* See ASHRAE 135-2010 Addendum an, section G.3.2 */
#define CRC32K_INITIAL_VALUE (0xFFFFFFFF)
#define CRC32K_RESIDUE (0x0843323B)

/* CRC-32K polynomial, 1 + x**1 + ... + x**30 (+ x**32) */
#define CRC32K_POLY (0xEB31D82E)

/*
 * Accumulate 'dataValue' into the CRC in 'crcValue'.
 * Return updated CRC.
 *
 * Note: crcValue must be set to CRC32K_INITIAL_VALUE
 * before initial call.
 */
uint32_t
CalcExtendedDataCRC(uint8_t dataValue, uint32_t crcValue)
{
    uint8_t data, b;
    uint32_t crc;

    data = dataValue;
    crc = crcValue;

    /* Process the octet one bit at a time, least significant bit first. */
    for (b = 0; b < 8; b++) {
        if ((data & 1) ^ (crc & 1)) {
            crc >>= 1;
            crc ^= CRC32K_POLY;
        } else {
            crc >>= 1;
        }
        data >>= 1;
    }
    return crc;
}
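
The send and receive usage described above can be sketched as follows. This is a minimal illustration, not text from [Addendum_an]: it repeats the bitwise update routine so that the block is self-contained, and the helper names crc32k_over() and crc32k_residue_after() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32K_INITIAL_VALUE (0xFFFFFFFF)
#define CRC32K_POLY (0xEB31D82E)

/* Same bitwise update as the example above, repeated for self-containment. */
static uint32_t CalcExtendedDataCRC(uint8_t dataValue, uint32_t crcValue)
{
    uint8_t b;

    for (b = 0; b < 8; b++) {
        if ((dataValue & 1) ^ (crcValue & 1)) {
            crcValue >>= 1;
            crcValue ^= CRC32K_POLY;
        } else {
            crcValue >>= 1;
        }
        dataValue >>= 1;
    }
    return crcValue;
}

/* Hypothetical helper: run the CRC over 'len' octets, starting from 'crc'. */
static uint32_t crc32k_over(const uint8_t *buf, size_t len, uint32_t crc)
{
    size_t i;

    for (i = 0; i < len; i++)
        crc = CalcExtendedDataCRC(buf[i], crc);
    return crc;
}

/*
 * Hypothetical helper that plays both roles. As sender, it computes the
 * CRC over the payload and forms the four trailer octets (ones complement,
 * LSB first). As receiver, it continues the CRC over those trailer octets
 * and returns the final register value.
 */
static uint32_t crc32k_residue_after(const uint8_t *payload, size_t len)
{
    uint8_t trailer[4];
    uint32_t crc = crc32k_over(payload, len, CRC32K_INITIAL_VALUE);
    uint32_t sent = ~crc;
    int i;

    for (i = 0; i < 4; i++)
        trailer[i] = (uint8_t)(sent >> (8 * i));
    return crc32k_over(trailer, 4, crc);
}
```

Because the CRC update is linear, the value returned by crc32k_residue_after() is the same constant for every error-free payload. A receiver compares that value against the constant the specification names CRC32K_RESIDUE.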


Appendix B. Consistent Overhead Byte Stuffing [COBS]

This Appendix is informative and not part of the standard.

The BACnet change proposal [Addendum_an] corrects a long-standing issue with the MS/TP specification, namely that preamble sequences were not escaped whenever they appeared in the Data or Data CRC fields. In some cases, this could result in dropped frames due to misalignment. The solution is to encode the Data and Extended Data CRC fields before transmission using Consistent Overhead Byte Stuffing [COBS] and to decode these fields upon reception.

COBS is a run-length encoding method that effectively removes ’0x00’ octets from its input. The worst-case overhead is bounded at approximately one octet in 254, or less than 0.5%, as described in [COBS]. An arbitrary octet value may be removed instead by XOR’ing the COBS output with the specified value. In the case of MS/TP, the ’0x55’ preamble octet is specified for removal.
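
A small worked example may help to show the effect of the two steps. The sketch below is a plain COBS encoder without the CRC handling, followed by the XOR step. The function name cobs_xor_encode() and its parameters are hypothetical and are not taken from [Addendum_an].

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Plain COBS encoding of 'len' octets from 'in', followed by an XOR of
 * every output octet with 'xor_octet'. Returns the number of octets
 * written to 'out', which must have room for len + len / 254 + 1 octets.
 */
static size_t cobs_xor_encode(uint8_t *out, const uint8_t *in, size_t len,
                              uint8_t xor_octet)
{
    size_t code_index = 0;   /* where the current block's code octet goes */
    size_t write_index = 1;  /* next free position in 'out' */
    size_t i;
    uint8_t code = 1;        /* 1 + number of non-zero octets in this block */

    for (i = 0; i < len; i++) {
        if (in[i] != 0) {
            out[write_index++] = in[i];
            code++;
            if (code != 0xFF)
                continue;
        }
        /* A zero was seen, or 254 non-zero octets were copied: close block. */
        out[code_index] = code;
        code_index = write_index++;
        code = 1;
    }
    out[code_index] = code;  /* close the final block */

    /* The COBS output contains no 0x00, so after the XOR it cannot
       contain 'xor_octet'. */
    for (i = 0; i < write_index; i++)
        out[i] ^= xor_octet;
    return write_index;
}
```

For the input 0x11 0x00 0x55, the COBS step yields 0x02 0x11 0x02 0x55, and the XOR with 0x55 yields 0x57 0x44 0x57 0x00. The preamble octet no longer appears in the output, at the cost of one extra octet.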

Encoding proceeds logically in three passes. First, the Extended Data CRC is calculated over the data, modified for transmission as described in Appendix A, and appended to the data. The combined fields are then passed through the COBS encoder. The resulting output is then XOR’d with the MS/TP preamble octet ’0x55’. The length of the encoded fields, minus two octets for compatibility with existing MS/TP devices, is placed in the MS/TP header Length field before transmission.
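
The length arithmetic in the preceding paragraph can be sketched as follows. The helper names are hypothetical. The worst-case bound is an assumption derived from the structure of the example encoder in this appendix (one leading code octet plus one extra code octet per 254 input octets), not a figure quoted from [COBS] or [Addendum_an].

```c
#include <stddef.h>

#define MSTP_CRC32K_SIZE 4u  /* octets in the Extended Data CRC */

/* Worst-case COBS output size for 'input_len' octets (all non-zero). */
static size_t cobs_max_encoded_len(size_t input_len)
{
    return input_len + input_len / 254 + 1;
}

/* Worst-case size of the encoded Data and Extended Data CRC fields. */
static size_t mstp_max_encoded_fields(size_t data_len)
{
    return cobs_max_encoded_len(data_len + MSTP_CRC32K_SIZE);
}

/*
 * Value for the MS/TP header Length field: the encoded length minus two.
 * The encoded length is always at least 5 (one code octet plus four CRC
 * octets), so the subtraction cannot underflow for encoder output.
 */
static size_t mstp_length_field(size_t encoded_len)
{
    return encoded_len - 2;
}
```

A caller could use mstp_max_encoded_fields() to size the output buffer before encoding, and mstp_length_field() on the value actually returned by the encoder.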

An example C implementation that combines these passes is shown below. The frame_decode() function is the inverse of the frame_encode() function.


#include <stddef.h>
#include <stdint.h>

#define MSTP_PREAMBLE_55 (0x55)

/*
 * Encodes 'length' octets of data at the location pointed to by 'input'
 * and writes the output to the location pointed to by 'output'.
 * Returns the number of octets written to 'output'.
 */
size_t
frame_encode (uint8_t *output, const uint8_t *input, size_t length)
{
    size_t code_index = 0;
    size_t read_index = 0;
    size_t write_index = 1;
    uint8_t code = 1;
    uint8_t data;
    int i;
    uint32_t crc32K = CRC32K_INITIAL_VALUE;

    while (read_index < length) {
        data = input[read_index++];
        crc32K = CalcExtendedDataCRC(data, crc32K);
        /*
         * In the common case, simply copy input to output and
         * increment the number of octets copied.
         */
        if (data != 0) {
            output[write_index++] = (data ^ MSTP_PREAMBLE_55);
            code++;
            if (code != 0xFF)
                continue;
        }
        /*
         * In the special case of encountering a zero in the input or
         * having copied the maximum number (254) of non-zero octets,
         * store the count and re-initialize encoder variables.
         */
        output[code_index] = (code ^ MSTP_PREAMBLE_55);
        code_index = write_index++;
        code = 1;
    }
    /*
     * Run the one's complement of the CRC value through the encoder,
     * LSB first.
     */
    crc32K = ~crc32K;
    for (i = 0; i < 4; i++) {
        /* Shift rather than alias the bytes, so that the LSB-first
           order does not depend on host byte order. */
        data = (uint8_t)(crc32K >> (8 * i));
        if (data != 0) {
            output[write_index++] = (data ^ MSTP_PREAMBLE_55);
            code++;
            if (code != 0xFF)
                continue;
        }
        /*
         * In the special case of encountering a zero in the input or
         * having copied the maximum number (254) of non-zero octets,
         * store the count and re-initialize encoder variables.
         */
        output[code_index] = (code ^ MSTP_PREAMBLE_55);
        code_index = write_index++;
        code = 1;
    }
    /* Append a "phantom zero" to the output stream. */
    output[code_index] = (code ^ MSTP_PREAMBLE_55);
    /*
     * Return the combined length of the Data and Extended CRC fields.
     * Subtract two before use as the MS/TP frame Length field.
     */
    return write_index;
}


/*
 * Takes COBS encoded Data and Extended Data CRC fields as 'input'.
 * The 'length' contains the actual length of these fields in
 * octets (that is, the header Length field plus two).
 * Decodes the Data and Extended Data CRC fields into 'output'.
 * Returns length of decoded Data in octets, or 0 on error.
 * Note: Safe to call with 'output' <= 'input'.
 */
size_t
frame_decode (uint8_t *output, const uint8_t *input, size_t length)
{
    size_t read_index = 0;
    size_t write_index = 0;
    uint8_t code, data;
    int i;
    uint32_t crc32K = CRC32K_INITIAL_VALUE;

    while (read_index < length) {
        code = (input[read_index] ^ MSTP_PREAMBLE_55);
        /*
         * Sanity check the encoding to prevent the for() loop below
         * from overrunning the output buffer.
         */
        if ((read_index + code) > length)
            return 0;

        read_index++;

        for (i = 1; i < code; i++) {
            data = (input[read_index++] ^ MSTP_PREAMBLE_55);
            crc32K = CalcExtendedDataCRC(data, crc32K);
            output[write_index++] = data;
        }
        /* Restore the zero that ended this block, except at the very end. */
        if ((code < 0xFF) && (read_index < length)) {
            data = '\0';
            crc32K = CalcExtendedDataCRC(data, crc32K);
            output[write_index++] = data;
        }
    }
    if (crc32K == CRC32K_RESIDUE)
        return write_index - sizeof(uint32_t);
    else
        return 0;
}


Authors’ Addresses

Kerry Lynn (editor) Consultant

Phone: +1 978 460 4253 Email: [email protected]

Jerry Martocci Johnson Controls, Inc. 507 E. Michigan St Milwaukee, WI 53202 USA

Phone: +1 414 524 4010 Email: [email protected]

Carl Neilson Delta Controls, Inc. 17850 56th Ave Surrey, BC V3S 1C7 Canada

Phone: +1 604 575 5913 Email: [email protected]

Stuart Donaldson Honeywell Automation & Control Solutions 6670 185th Ave NE Redmond, WA 98052 USA

Email: [email protected]


IPv6 Maintenance T.J. Chown, Ed. Internet-Draft University of Southampton Intended status: Informational A.M. Matsumoto, Ed. Expires: October 03, 2013 NTT April 01, 2013

Considerations for IPv6 Address Selection Policy Changes draft-ietf-6man-addr-select-considerations-05

Abstract

This document is intended to capture the address selection design team’s considerations about the address selection issues mainly raised in [RFC5220]. These considerations led to the revision of RFC 3484 [RFC6724], and to the Address Selection DHCP option. Although it does not perfectly match the current state, this document captures the past discussion and considerations for the historical record.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on October 03, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of

Chown & Matsumoto Expires October 03, 2013 [Page 1] Internet-Draft Address Selection Considerations April 2013

the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

1. Introduction ...... 2
2. Issues to Consider ...... 3
3. Other Related Work ...... 4
4. Drivers for Policy Changes ...... 4
  4.1. Internal vs External Triggers ...... 6
  4.2. Administratively Triggered Changes ...... 6
  4.3. Start-up vs Running Changes ...... 7
  4.4. Nomadic Nodes ...... 7
  4.5. Multiple Interface Nodes ...... 8
5. How Dynamic? ...... 9
6. Considerations when Obtaining Policy ...... 10
  6.1. Changes in Available Address(es) ...... 10
  6.2. Timeliness ...... 10
7. Solution Space ...... 10
  7.1. Is default policy used? ...... 11
  7.2. Pull model ...... 11
  7.3. Push model ...... 11
  7.4. Routing Hints ...... 12
  7.5. Policy Conflicts ...... 12
  7.6. Policy Merging ...... 13
8. On RFC3484 Default Policies ...... 14
9. Conclusions ...... 14
10. Security Considerations ...... 16
11. IANA Considerations ...... 16
12. Acknowledgements ...... 16
13. Informative References ...... 16
Authors’ Addresses ...... 17

1. Introduction


This document is intended to capture the past discussions and considerations about the address selection issues mainly raised in [RFC5220]. These considerations led to the revision of RFC 3484 [RFC6724], and to the Address Selection DHCP option [I-D.ietf-6man-addr-select-opt]. Although it does not necessarily match the current state, this document captures the past discussion and considerations for the historical record.

Where the source and/or destination node of an IPv6 communication is multi-addressed, a mechanism is required for the initiating node to select the most appropriate address pair for the communication. RFC 3484 (IPv6 Default Address Selection) [RFC3484] defines such a mechanism for nodes to perform source and destination address selection. While RFC 3484 recognised the need for implementations to be able to change the policy table, it did not define how this could be achieved. Requirements have now emerged for administrators to be able to configure and potentially dynamically change RFC 3484 policy from a central control point, and for (nomadic) hosts to be able to obtain the policy for the network that they are currently attached to without manual user intervention. This text discusses considerations for such policy changes, including examples of cases where a change of policy is required, and the likely frequency of such policy changes. This text also includes some discussion on the need to also update RFC 3484, where default policies are currently defined.

There have been various operational issues observed with Default Address Selection for IPv6 (RFC 3484) [RFC3484], as described in RFC 5220 [RFC5220]. As a result, there has been some demand for hosts to be able to have their policy tables, and potentially the rules described in RFC 3484, modified dynamically. Such changes may apply to ’static’ hosts in a network where policies or topologies change, or where a different default policy from that described in RFC 3484 is required, or to nomadic hosts within a network for which policies may vary depending on their location within the network.

2. Issues to Consider

There are a number of aspects to consider in the context of such address selection policy updates.

First is the frequency with which such updates are likely to be required; this can be determined largely from identifying the scenarios in which policy changes will be required. This may include overriding default operating system policies on startup, as well as changes while a system is running. We discuss this topic in Section 4.


Second, by understanding how dynamic the policy update mechanism needs to be we should be better placed to determine what types of update approaches best meet those needs. There may be other considerations of course, e.g. whether the systems are in managed or unmanaged environments, and whether the solution should be proactive or automated. Section 5 covers these issues.

Third, if we assume some policy update mechanism is defined we should consider how hosts and systems may become aware that a policy change has happened, and how policy can be disseminated in a timely fashion. Thus we need to understand what kind of triggers can be identified that can be used for invoking the policy table update mechanism, e.g. address re-obtainment, address lifetime expiration, or perhaps policy lifetime expiration. We also need to consider what other factors may come into play, e.g. potential policy conflicts. This is discussed in Section 6.
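
As an illustration of how such triggers might be evaluated on a host, the following is a minimal sketch. All names here (policy_state, policy_event, policy_update_needed) are hypothetical, and the notion of a policy lifetime is only one of the possibilities mentioned above, not a defined mechanism.

```c
#include <stdint.h>

/* Hypothetical per-host record of a policy obtained from the network. */
struct policy_state {
    uint64_t obtained_at;      /* time the policy was obtained, in seconds */
    uint32_t policy_lifetime;  /* validity in seconds; 0 means none given */
};

/* Hypothetical address-related events reported by the host's IP stack. */
enum policy_event {
    EVENT_NONE = 0,
    EVENT_ADDRESS_REOBTAINED,
    EVENT_ADDRESS_LIFETIME_EXPIRED
};

/*
 * Returns 1 when the host should invoke the policy table update
 * mechanism, 0 otherwise. Assumes 'now' is not earlier than
 * st->obtained_at.
 */
static int policy_update_needed(const struct policy_state *st,
                                enum policy_event ev, uint64_t now)
{
    /* Address re-obtainment or address lifetime expiration. */
    if (ev == EVENT_ADDRESS_REOBTAINED ||
        ev == EVENT_ADDRESS_LIFETIME_EXPIRED)
        return 1;
    /* Policy lifetime expiration, if a lifetime was supplied. */
    if (st->policy_lifetime != 0 &&
        now - st->obtained_at >= st->policy_lifetime)
        return 1;
    return 0;
}
```

A check of this kind only decides when to ask for policy. It does not address conflicts between policies, which are a separate consideration.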

After analysing these issues, we can make some initial comments regarding the potential solution spaces, and what models may be well suited, e.g. push vs pull models, and what other methods might assist us, e.g. hints from local routing tables. This is covered in Section 7.

Finally, we should assess whether these update solutions require RFC 3484 to be updated. In some instances, we might envision solutions that simply use RFC 3484 as guidelines and provide sufficient controls to address the current limitations in the RFC. However, as noted in RFC 5220 [RFC5220], not all the operational issues observed to date can be remedied by updating RFC 3484 alone.

3. Other Related Work

We note that there is some existing work in defining Requirements for Address Selection Mechanisms [RFC5221], and some initial work has been done in the solution space (for a DHCP-based method) [I-D.ietf-6man-addr-select-opt], but these are not discussed here. While RFC 5221 assumes that a dynamic policy update mechanism of some form is available, this draft is primarily aimed at understanding the scenarios and triggers for policy changes, to better inform future detailed solution discussions.

A draft discussing methods for multihoming without IPv6 NAT [I-D.ietf-v6ops-multihoming-without-nat66] has been published recently. This draft includes a requirement for a method to distribute address selection policy to support IPv6 multihoming.

4. Drivers for Policy Changes


If we wish to determine how frequent address selection policy changes are likely to be, we need to understand why such policies might need to be changed, for particular sites or networks.

One reference text for potential drivers for policy change is RFC 5220, in which operational issues with the existing policies described in RFC 3484 are listed. Each subsection of that document gives a reason why the existing rules or policy tables in RFC 3484 may not be sufficient in certain cases. There have been some significant changes to IPv6 since RFC 3484 was drafted which have impacted the RFC, e.g. the introduction of Unique Local Addresses (ULAs), and concerns about the impact of using longest prefix matching on (DNS) round-robin load balancing.
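
The round-robin concern can be illustrated with a small sketch. It models only the longest matching prefix comparison and ignores every other rule in RFC 3484. The function names and the example addresses (taken from the 2001:db8::/32 documentation prefix) are assumptions made for illustration.

```c
#include <stdint.h>

/* Number of leading bits that two 128-bit addresses have in common. */
static unsigned common_prefix_len(const uint8_t a[16], const uint8_t b[16])
{
    unsigned i, bits = 0;

    for (i = 0; i < 16; i++) {
        uint8_t diff = a[i] ^ b[i];

        if (diff == 0) {
            bits += 8;
            continue;
        }
        /* Count the equal leading bits of the first differing octet. */
        while (!(diff & 0x80)) {
            bits++;
            diff <<= 1;
        }
        break;
    }
    return bits;
}

/*
 * Returns the index of the candidate destination that shares the longest
 * prefix with 'src'. Ties keep the earliest entry, i.e. the DNS order.
 */
static unsigned pick_destination(const uint8_t src[16],
                                 const uint8_t dst[][16], unsigned n)
{
    unsigned i, best = 0;

    for (i = 1; i < n; i++)
        if (common_prefix_len(src, dst[i]) > common_prefix_len(src, dst[best]))
            best = i;
    return best;
}
```

With a source of 2001:db8:1::1 and candidate destinations 2001:db8:1::10 and 2001:db8:2::10, the first candidate is chosen whichever order the DNS returns them in, so the rotation performed by the DNS server has no effect on this host.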

In summary, the issues raised in RFC 5220 were:

o Multiple Routers on a Single Interface

o Ingress Filtering

o Half-Closed Network Problem (*)

o Combined Use of Global and ULA addresses (*)

o Site Renumbering (*)

o Multicast Source Address Selection (*)

o Temporary Address Selection

o IPv4 or IPv6 Prioritization (*)

o ULA and IPv4 Dual-Stack Environment (*)

o ULA or Global Prioritization (*)

The authors of RFC 5220 noted which of these issues can be solved just by changes to the RFC 3484 policy table, marked (*) above, and which cannot. It is interesting to note that issues largely related to internal networking and (administrative) policy decisions can be handled this way. However, some issues need changes beyond just policy table updates.


4.1. Internal vs External Triggers

When considering drivers or triggers that may lead to a requirement for the policy to change, we can divide the problem space into those drivers that are external to a site or network and those internal to it. In the case of the first two examples above, a dynamic policy table update may be required by externally driven routing changes, assuming the site uses a dynamic routing protocol intra-site and the routing protocol is configured to reflect changes of extra-site routing topology.

If a site is multihomed using BGP and advertising a single prefix upstream, then no policy table manipulation is required for global address preferences. However, where a site is multihomed by receiving a prefix from each upstream provider, each host will have multiple addresses and may need policy table manipulation. In such a case, the policy table of hosts may need to be updated according to the routing policy.

It should be noted that we have other mechanisms for dynamic routing topology change, for example deprecating one of the advertised prefixes, e.g. when one of the upstream links has a problem. But such mechanisms may only help in some cases, and do not remove the need for agility in the RFC 3484 policy.

Other examples of external factors include a new transition mechanism being defined (e.g. as with the emergence of Teredo using 2001::/32 as assigned by IANA) and its inclusion being required in the policy table (at the time of writing Teredo is not included in RFC 3484, though some operating systems have added it), a new address block being defined, or a site renumbering event that could be triggered by an upstream provider’s actions.
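
To make the Teredo case concrete, the sketch below shows a policy table holding the RFC 3484 default entries with one added row for 2001::/32, together with a longest-prefix lookup. The added row uses the precedence and label that RFC 6724 later assigned to Teredo. The structure and function names are hypothetical, and a real implementation would hold this table in the operating system rather than in a static array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct policy_entry {
    uint8_t prefix[16];
    uint8_t prefix_len;  /* in bits */
    int     precedence;
    int     label;
};

/* RFC 3484 default policy table, plus one added row for Teredo. */
static const struct policy_entry policy_table[] = {
    { {0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,1}, 128, 50, 0 }, /* ::1/128 */
    { {0},                                     0, 40, 1 }, /* ::/0 */
    { {0x20,0x02},                            16, 30, 2 }, /* 2002::/16 */
    { {0},                                    96, 20, 3 }, /* ::/96 */
    { {0,0,0,0, 0,0,0,0, 0,0,0xff,0xff},      96, 10, 4 }, /* ::ffff:0:0/96 */
    { {0x20,0x01,0x00,0x00},                  32,  5, 5 }, /* 2001::/32, added */
};

/* Returns 1 if the first 'len' bits of 'addr' equal those of 'prefix'. */
static int prefix_matches(const uint8_t addr[16], const uint8_t prefix[16],
                          unsigned len)
{
    unsigned full = len / 8, rem = len % 8;
    uint8_t mask;

    if (memcmp(addr, prefix, full) != 0)
        return 0;
    if (rem == 0)
        return 1;
    mask = (uint8_t)(0xFF << (8 - rem));
    return (addr[full] & mask) == (prefix[full] & mask);
}

/* Longest-prefix lookup; ::/0 matches everything, so never returns NULL. */
static const struct policy_entry *policy_lookup(const uint8_t addr[16])
{
    const struct policy_entry *best = NULL;
    size_t i;

    for (i = 0; i < sizeof policy_table / sizeof policy_table[0]; i++) {
        const struct policy_entry *e = &policy_table[i];

        if (prefix_matches(addr, e->prefix, e->prefix_len) &&
            (best == NULL || e->prefix_len > best->prefix_len))
            best = e;
    }
    return best;
}
```

Without the added row, a Teredo address would fall through to the ::/0 entry and be treated like any other global address. With it, Teredo addresses receive their own label and a low precedence.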

4.2. Administratively Triggered Changes

The other examples above are, in the general case, where the site administrator chooses to change a local policy and in doing so triggers the need for policy table updates. Some of these changes one might assume to be set once, and to change rarely, for example:

o Setting priority use of IPv6 over IPv4 (or vice versa).

o Setting priority use of ULAs over globals (or vice versa).

o Setting priority of Teredo over native IPv4 (or vice versa).

o Setting priority use of privacy addresses over DNS-published globals (or vice versa).


o An internal network renumbering occurs, perhaps due to a site expanding.

o The nature of the external connectivity through multiple ISPs requires specific additional information (policy) to be delivered to certain hosts (as discussed in Section 2.1.3 of RFC 5220).

o Disabling longest-prefix match functions to facilitate round-robin load balancing.

However it may be the case that different parts of a site have different policies, or policies are changed in a rolling fashion across a site over time as IPv6 and/or ULAs are introduced (for example). This may happen where the administrator prefers a gradual introduction of new policy in a phased operation across a site, rather than changing policy across the whole site in one operation.

Other administrative changes may occur more frequently, e.g.:

o Routing tables and forwarding tables change dynamically.

o A different provider (link) is preferred for a given destination.

It is possible that the preferred provider links may vary on a daily basis, or by time of day. The frequency of such policy changes will depend on how often the administrator wishes to change the implied traffic engineering policies.

4.3. Start-up vs Running Changes

When a host starts up it may be configured with the default RFC 3484 policies. At this stage a number of addresses may be configured on a number of interfaces on the host. At this time it may be desirable for the host to be able to receive the site-specific policy updates as a start-up override from the RFC 3484 defaults.

Other policy changes may later be required while the host is running. Ideally the same protocol should be used for the start-up and running state update mechanism.

4.4. Nomadic Nodes

A host may be nomadic within a site and as a result it may see the preferred policy change depending on the host’s topological location within that site. Such a host should be capable of receiving policy updates in a timely fashion as it migrates within the network.


While this may be one case of ’running changes’ described above, the policy changes are required due to the host’s new point of attachment, not changes of policy to the current point of attachment. The frequency of updates is thus dependent on the frequency of host mobility to parts of the network that have differing policies.

It is worth noting that the point at which a nomadic host configures its network settings would be an appropriate time for it to also receive any specific address selection policy for its point of attachment.

4.5. Multiple Interface Nodes

In considering scenarios where hosts may be multi-addressed and require policy to assist in address selection, the issue of hosts with multiple interfaces arises.

A host may have a variety of reasons to have multiple interfaces. It may for example have WiFi and 3G interfaces, and be capable of sending or receiving data over either interface. In some cases these interfaces may fall within the same administrative domain (ISP) and in some cases they may not. Another example would be the case of a host with a VPN connection established, where address selection may be affected by the choice of whether the VPN connection is used or not. In this case it is interesting to note the choice to use the VPN tunnel for all, or just VPN home site traffic, is often left as a choice for the user via a tickbox selection. In addition, initiating the VPN typically changes several related settings, which is reasonable behaviour given the user chose to initiate the VPN connection.

Handling multiple interface nodes, and the possibility of conflicting policy being retrieved via each, is clearly an important problem today, but we note that RFC 3484 is currently defined as a per-node, not per-interface, mechanism (at least in the context of destination address selection). However, for RFC 3484, and its potential update mechanisms, to be applicable to typical ’real world’ usage patterns, we should consider the multiple interface scenarios.

In the case where a host has multiple interfaces there are two likely scenarios:

o Wired and wireless interfaces - in this case the operating system just needs to pick one interface and use it.

o Normal and VPN interfaces - here the default should be the normal interface; the VPN interface should only be used for destinations associated with the VPN.


It has been suggested that an RFC 3484 policy table is required on a per-interface basis, though the choice of interface may itself be determined by the (destination) address selection process. As stated above, RFC 3484’s policy table is currently defined to be node-wide. The node-wide problem is destination address selection when the source address is implied from a selected interface.

We note that there are some new, initial drafts published recently on the multiple interface problem [RFC6418], and on a number of possible DHCPv6 extensions, e.g. to inform hosts about routing information to assist the selection process, and to inform hosts about DNS server selection policy [RFC6731]. These drafts fall within the remit of the new IETF mif WG. We note that the mif WG may produce relevant work with respect to the analysis of RFC 3484 policy changes, but at this stage no such output exists for inclusion.

5. How Dynamic?

The discussion above suggests that many of the potential triggers for policy table changes are ’one-off’ in nature, i.e. a site makes a one-time policy change. It is thus unlikely that such administrative changes will be frequent.

There are some cases where updates may be required to be more frequent. In the example of a site which is implementing the gradual introduction of new policy across its network, while the frequency of changes may be relatively high, there is still probably only one or a small number of changes per host.

There may be a higher rate of policy changes within a site if there are nomadic hosts within the site, and these are roaming frequently to parts of the network where differing policies are in effect. In such cases it may be useful for a host to know whether the default RFC 3484 (or soon to be 3484bis) policies are in effect, and for there to be a ’cheap’ way for the host to discover this.

Perhaps the biggest cause of policy change lies where the preferred links or paths for certain destinations change frequently over time as (typically) traffic engineering requirements change. In some networks this may be a daily change, or change between states at different times of day. It is not clear how common these cases are, and thus further input is welcomed here. Our belief is that cases where dynamic changes are used heavily are rare.

So, unless a site or network has rapidly changing traffic engineering requirements, or includes a high number of mobile nodes where the nodes are roaming to areas of the network with differing address selection related policies, the frequency of updates is likely to be relatively low. Most update requests will simply occur when a host starts up, and such requests for policy will be little different in frequency to other configuration requests. Other types of network change that may require a host to change its RFC 3484 policy behaviour are also likely to have associated changes with other host configuration data.

6. Considerations when Obtaining Policy

When a policy change is made, or a host migrates to a part of the network with different policies, that change of policy needs to be conveyed to the host. It needs to be made available and applied without restarting every affected host.

6.1. Changes in Available Address(es)

One might assume at first that when a host observes a change in its addresses, it should re-obtain the selection policy, but this may not always be the case. Not all policy changes are tied to a host changing one or more addresses, though it may be acceptable to query regardless for new policy (if a pull model is used) when address information changes.

As described above, it may be sufficient for a host to know when a policy is changed, or that perhaps the default policy is - or is not - in effect in its current locale.

6.2. Timeliness

In many, but not all, cases a policy change will need to be synchronised across a network. Thus there is a general issue of timely and synchronised dissemination of new policy. If the policy is distributed via the same mechanism that informs a host of a change of address(es), the application of the policy should be synchronised sufficiently with the address change. However, not all hosts may receive the update information at the same time, e.g. where new address assignments may be dependent on DHCP lease timers.

Where hosts use DHCPv6 for address information, in the absence of some form of Reconfigure message, a host may see a delay in policy changes being notified. One possible tool to help here is the DHCPv6 Information Refresh Time Option [RFC4242], which was originally introduced to assist with network renumbering events.

7. Solution Space

In this section we make some initial observations on the possible solution space.


7.1. Is default policy used?

There could be some mechanism to indicate to a host that the local network has a modified RFC 3484 policy in use, and thus that a revised policy table is available (and should be used). Alternatively a host could simply always attempt to obtain local RFC 3484 policy on startup. Regardless, it should also be possible for a host to detect that policy has changed (whether ’around’ the host, or due to the host being nomadic). The method to convey this change to a host would depend on whether a push or pull configuration method is used.

By ’default’ policy we refer here to the revised/updated RFC 3484 specification, when that is produced.

7.2. Pull model

One potential solution is that a host uses a similar mechanism for RFC 3484 policy updates as is used for obtaining other configuration data, for example DHCPv6 [RFC3315]. For hosts using stateless autoconfiguration, policy could be made available via stateless DHCPv6 [RFC3736].

There are also already some initial proposals from the IETF mif WG on using DHCPv6 to deliver (mainly routing oriented) information to hosts, e.g. DHCPv6 route option and [RFC6731]. These methods assume entities that have timely knowledge of routing information can provide equally timely hints to hosts on address selection, via DHCPv6. At this stage we believe that distributing RFC 3484 policy, as configured by an administrator, is a more practical use of DHCPv6.

The DHCP model allows individual nodes to potentially have differing policy, even when on the same subnet.

7.3. Push model

For hosts only using stateless autoconfiguration, in environments without stateless DHCPv6, it may be argued that since the network is not managed, there is not likely to be any managed policy to push to the hosts. In such environments hosts may perhaps more usefully use techniques such as router hints to make informed selections, as discussed later in this text.

It may of course be possible to piggyback policy information to a host in a Router Advertisement message, though initial consensus seems to be that this is a less attractive approach.

7.4. Routing Hints

As mentioned above, if a host has routing hints available, it may be able to make more informed selections. For example, a protocol could be specified for a node to query an on-link or remote (e.g. edge) router for ’hints’. For example, a new ICMPv6 message could be defined that queried a site edge router or route server for address pairs to use for a given destination address.

However, having hosts themselves participate in routing is generally not desirable. At this stage we can simply note that address selection might be simplified when some hint based on routing state is provided to the end system, but such mechanisms are out of scope for this text.

It is noted in [RFC5887] that:

"In an environment where a site has more than one upstream link to the outside world, the site might have more than one valid routing prefix. In such cases, typically all valid routing prefixes within a site will have the same prefix length. Also in such cases, it might be desirable for hosts that obtain their addresses using DHCPv6 to learn about the availability of upstream links dynamically, by deducing from periodic IPv6 RA messages which routing prefixes are currently valid. This application seems possible within the IPv6 Neighbour Discovery architecture, but does not appear to be clearly specified anywhere."

The same thought seems relevant to address selection. There’s no point selecting a source address whose prefix is not being advertised in RAs.
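The observation above can be sketched as a simple filter. This is only an illustration, not a mechanism defined by this text: the function name usable_sources, the candidate addresses, and the way advertised prefixes are collected are all assumptions made for the example.

```python
import ipaddress

def usable_sources(candidates, advertised_prefixes):
    """Keep only candidate source addresses covered by a currently advertised prefix."""
    nets = [ipaddress.ip_network(p) for p in advertised_prefixes]
    return [a for a in candidates
            if any(ipaddress.ip_address(a) in n for n in nets)]

# Hypothetical multihomed host: two configured addresses, but only one
# upstream prefix is still being advertised in RAs.
candidates = ["2001:db8:1000:1::10", "2001:db8:8000:1::10"]
advertised = ["2001:db8:1000:1::/64"]
print(usable_sources(candidates, advertised))
```

A host applying such a filter before source address selection would never pick the address from the withdrawn prefix, whatever the policy table says.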

While routing and prefix hints may help a host make selection decisions, we should consider to what extent we wish to ’burden’ a host with holding such information. If a host is to determine and cache routing hints, this may require an update of RFC 3484 policy table syntax to support preference for address pairs.

7.5. Policy Conflicts

In the case of a host operating in a single administrative domain, consistent policy should be available from whichever policy distribution mechanism provides the information. In such cases the network should not distribute policy sets from multiple entities (or by multiple mechanisms). However, in scenarios where a host is multi-addressed from multiple providers (e.g. a SOHO network with differing DSL and cable providers, or a user in a coffee shop initiating a VPN connection to their home network), multiple RFC 3484 policies may be received and there are likely to be some conflicts in the received policy information.

There are scenarios where a host may wish to ignore a conveyed policy. For example, the manager of a mobile node may not want to have its preferences changed by a visited network. In such a case one might argue that the mobile node should use MIPv6 with whatever its home network policies are.

The question then is whether the policy update mechanism itself needs to handle such potential conflicts, choosing one or the other or merging by some set of heuristics, or whether the policy update mechanism should be viewed independently of the conflict handling. The view of the design team was that distributing policy is a network problem, while handling conflicts is a host problem.

7.6. Policy Merging

For whatever mechanism is used to distribute RFC 3484 policy, it is not yet clear whether entire policy tables will be made available, or simply differences to the ’default’, and thus whether policies may need to be merged, or overridden. Some policy conflicts will be unresolvable, e.g. one prefers IPv4 over IPv6, the other vice-versa. It may be simpler, though less efficient, for whole policy tables to be distributed, to avoid the merger problem.

One option may be to split the policy table into destination address selection and source address selection tables, with the policy distribution only updating the source address selection. Whether this might make merging policies simpler or in fact more complex would require further study.

It may also be possible to indicate some priority value for a policy, e.g. the priority of the interface it is received on, or perhaps to convey a unique identifier for the policy provider. Alternatively, if there are multiple policies in conflict, a host could simply choose to fall back to use the default RFC 3484 policy.
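The fallback option can be sketched as follows. This is a minimal illustration under stated assumptions: the table representation (prefix mapped to a precedence and label pair) and the names conflicting_prefixes and select_policy are hypothetical, and treating any difference between received tables as a conflict is a simplification. The default table is the one given in RFC 3484.

```python
# Default policy table from RFC 3484: prefix -> (precedence, label).
RFC3484_DEFAULT = {
    "::1/128": (50, 0),
    "::/0": (40, 1),
    "2002::/16": (30, 2),
    "::/96": (20, 3),
    "::ffff:0:0/96": (10, 4),
}

def conflicting_prefixes(a, b):
    """Prefixes that both tables define but with a different (precedence, label)."""
    return sorted(p for p in a.keys() & b.keys() if a[p] != b[p])

def select_policy(received, default=RFC3484_DEFAULT):
    """Hypothetical host rule: use a received table only if every received table agrees."""
    if not received:
        return default
    first = received[0]
    if any(other != first for other in received[1:]):
        return default  # tables disagree: fall back rather than merge
    return first

# One provider prefers IPv4, the other keeps the default preference for IPv6.
prefer_v4 = dict(RFC3484_DEFAULT)
prefer_v4["::ffff:0:0/96"] = (100, 4)
prefer_v6 = dict(RFC3484_DEFAULT)

print(conflicting_prefixes(prefer_v4, prefer_v6))
print(select_policy([prefer_v4, prefer_v6]) is RFC3484_DEFAULT)
```

This avoids the merger problem entirely, at the cost of discarding both administrators' intent whenever they differ.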

A host also needs to know how to decide when to accept a policy. We could simplify the discussion by assuming a host is located in and only nomadic within a single site with one administrative controlling entity.

8. On RFC3484 Default Policies

RFC 3484 includes text about mechanisms for changing policy, having ’policy hooks’ and having a configurable policy table. The implication is that defaults can be changed, and the text gives examples of this in Section 10. However, issues with RFC 3484 are broader than just policy table updates: some operational issues with RFC 3484 relate not to the table but to the rules themselves, e.g. longest prefix match (affecting DNS round robin as described in [RFC5220]).

While discussing default policy, we noted that the word ’default’ has to be carefully defined, and also what the scope of this ’default’ is. The default policy should be whatever RFC 3484, or its -bis version, states. At present some operating systems have already modified their default, based on operational feedback (e.g. on ULAs, on Teredo prefixes, or on the DNS round-robin problem). Currently we assume RFC 3484 and changes to it will remain node-specific.

It certainly seems the case that the issues raised in RFC 5220, and problems about RFC 3484 revision mean that an update of RFC 3484 is required, if only because some of the issues (as highlighted earlier) cannot be addressed by updating the policy table alone. An update would also give us some hope that all operating systems might have a common ’default’.

We do not note any specific comments here on how RFC 3484 should be updated. Other drafts have made suggestions. There are some discussions on ideas however, e.g. on the semantics of labels, and in adding ULAs explicitly to the default policy table.

There have also been new issues identified, e.g. on how one differentiates between IPv4+NAT access or IPv6 transitional access (e.g. via Teredo) to a dual-stack destination (the IPv4 private address inside the NAT is implicitly global, although its explicit scope is local) [I-D.denis-v6ops-nat-addrsel]. This illustrates that new issues may continue to be identified through growing IPv6 operational experience.

It is hard to predict exactly what features people will want to add to address selection algorithms in the future. Ideally we should not preclude future flexibility. It seems clear that any RFC 3484 update has two aspects: one that uses the existing policy table capability, and one that might change associated algorithms.

9. Conclusions

We believe a key outcome of this text should be progression of a solution to allow an enterprise network manager to configure their hosts with address selection policies that may differ from the RFC 3484 default, across all or part of their network, and possibly changing policy with time. The general scope of this text applies to site and enterprise networks, where an administrator may need to change policies over time. It also includes nomadic nodes within the site, which may migrate to different parts of the site where different policies are required.

It is clear there may be environments which might introduce conflicting policies from different administrative domains, e.g. a SOHO network with two ISP links, or an enterprise node running a VPN to a remote network. We conclude that the policy distribution mechanism is a network task, while policy conflict handling is a host task. Within this text, we do not present a solution for policy conflict handling, because at this time there is no perfect or practical solution. We thus recommend that we should progress the policy distribution solution while analysing conflict handling (which is not unique to this domain) in a separate text.

The scope of this text includes issues affecting the design of a protocol to allow a host’s RFC 3484 policy table to be updated. From discussion of update triggers/scenarios, we believe rapid updates are unlikely to be required unless a node is in a network which has (very) dynamic external traffic engineering, or many nodes are mobile between parts of the network with differing policy. It’s thus generally appropriate to use a similar method to obtain RFC 3484 policy as to obtain other configuration data.

In terms of obtaining policy, a pull-based solution, such as DHCPv6, may be more appropriate in managed environments (where managed non-default policies are most likely to be in effect), which would assure that hosts only gain policy information from a single entity (the DHCPv6 service). Use of DHCPv6 is also preferable if individual hosts on a subnet require different policies. In unmanaged networks, without stateless DHCPv6, use of routing hints may be an approach worth exploring.

Finally, there is a clear need to revise RFC 3484, to create a new default policy table for address selection, and to improve non policy table algorithms. This should be expedited.

10. Security Considerations

There are no additional security considerations for this document.

11. IANA Considerations

There are no additional IANA considerations for this document.

12. Acknowledgements

The design team working on this draft is: Marcelo Bagnulo Braun, Marc Blanchet, Tim Chown, Francis Dupont, Tim Enos, TJ Evans, Brian Haberman, Tony Hain, Ruri Hiromi, Suresh Krishnan, Arifumi Matsumoto, Janos Mohacsi, Sebastien Roy, Teemu Savolainen, Fujisaki Tomohiro, and John Zhao.

We also acknowledge comments received from IETF WG mail lists, including those by Brian Carpenter and Dave Thaler.

13. Informative References

[RFC3484] Draves, R., "Default Address Selection for Internet Protocol version 6 (IPv6)", RFC 3484, February 2003.

[RFC6724] Thaler, D., Draves, R., Matsumoto, A., and T. Chown, "Default Address Selection for Internet Protocol Version 6 (IPv6)", RFC 6724, September 2012.

[RFC3315] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M. Carney, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)", RFC 3315, July 2003.

[RFC3736] Droms, R., "Stateless Dynamic Host Configuration Protocol (DHCP) Service for IPv6", RFC 3736, April 2004.

[RFC4242] Venaas, S., Chown, T., and B. Volz, "Information Refresh Time Option for Dynamic Host Configuration Protocol for IPv6 (DHCPv6)", RFC 4242, November 2005.

[RFC5220] Matsumoto, A., Fujisaki, T., Hiromi, R., and K. Kanayama, "Problem Statement for Default Address Selection in Multi-Prefix Environments: Operational Issues of RFC 3484 Default Rules", RFC 5220, July 2008.

[RFC5221] Matsumoto, A., Fujisaki, T., Hiromi, R., and K. Kanayama, "Requirements for Address Selection Mechanisms", RFC 5221, July 2008.

[RFC5887] Carpenter, B., Atkinson, R., and H. Flinck, "Renumbering Still Needs Work", RFC 5887, May 2010.

[RFC6418] Blanchet, M. and P. Seite, "Multiple Interfaces and Provisioning Domains Problem Statement", RFC 6418, November 2011.

[RFC6731] Savolainen, T., Kato, J., and T. Lemon, "Improved Recursive DNS Server Selection for Multi-Interfaced Nodes", RFC 6731, December 2012.

[I-D.ietf-6man-addr-select-opt] Matsumoto, A., Fujisaki, T., and T. Chown, "Distributing Address Selection Policy using DHCPv6", draft-ietf-6man-addr-select-opt-08 (work in progress), January 2013.

[I-D.ietf-mif-dhcpv6-route-option] Dec, W., Mrugalski, T., Sun, T., Sarikaya, B., and A. Matsumoto, "DHCPv6 Route Options", draft-ietf-mif-dhcpv6-route-option-05 (work in progress), August 2012.

[I-D.denis-v6ops-nat-addrsel] Denis-Courmont, R., "Problems with IPv6 source address selection and IPv4 NATs", draft-denis-v6ops-nat-addrsel-00 (work in progress), February 2009.

[I-D.ietf-v6ops-multihoming-without-nat66] Troan, O., Miles, D., Matsushima, S., Okimoto, T., and D. Wing, "IPv6 Multihoming without Network Address Translation", draft-ietf-v6ops-multihoming-without-nat66-00 (work in progress), December 2010.

Authors’ Addresses

Tim Chown (editor) University of Southampton Southampton, Hampshire SO17 1BJ United Kingdom

Email: [email protected]

Arifumi Matsumoto (editor) NTT NT Lab Midori-Cho 3-9-11 Musashino-shi, Tokyo 180-8585 Japan

Phone: +81 422 59 3334 Email: [email protected]

6man Working Group A. Matsumoto Internet-Draft T. Fujisaki Intended status: Standards Track NTT Expires: April 12, 2014 T. Chown University of Southampton October 09, 2013

Distributing Address Selection Policy using DHCPv6 draft-ietf-6man-addr-select-opt-13.txt

Abstract

RFC 6724 defines default address selection mechanisms for IPv6 that allow nodes to select an appropriate address when faced with multiple source and/or destination addresses to choose between. RFC 6724 allows for the future definition of methods to administratively configure the address selection policy information. This document defines a new DHCPv6 option for such configuration, allowing a site administrator to distribute address selection policy overriding the default address selection parameters and policy table, and thus to control the address selection behavior of nodes in their site.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 12, 2014.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Matsumoto, et al. Expires April 12, 2014 [Page 1] Internet-Draft DHCPv6 Address Selection Policy Opt October 2013

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

1. Introduction

[RFC6724] describes default algorithms for selecting an address when a node has multiple destination and/or source addresses to choose from by using an address selection policy. This specification defines a new DHCPv6 option for configuring the default policy table.

Some problems were identified with the default address selection policy as originally defined in [RFC3484]. As a result, RFC 3484 was updated and obsoleted by [RFC6724]. While this update corrected a number of issues identified from operational experience, it is unlikely that any default policy will suit all scenarios, and thus mechanisms to control the source address selection policy will be necessary. Requirements for those mechanisms are described in [RFC5221], while solutions are discussed in [I-D.ietf-6man-addr-select-considerations]. Those documents have helped shape the improvements in the default address selection algorithm in [RFC6724] as well as the requirements for the DHCPv6 option defined in this specification.

This option’s concept is to serve as a hint for a node about how to behave in the network. Ultimately, while the node’s administrator can control how to deal with the received policy information, the implementation SHOULD follow the method described below uniformly, to ease troubleshooting and to reduce operational costs.

1.1. Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

1.2. Terminology

This document uses the terminology defined in [RFC2460] and the DHCPv6 specification defined in [RFC3315].

2. Address Selection options

The Address Selection option provides the address selection policy table, and some other configuration parameters.

An Address Selection option contains zero or more policy table options. Multiple policy table options in an Address Selection option constitute a single policy table. When an Address Selection option does not contain a policy table option, it may be used to just convey the A and P flags.

The format of the Address Selection option is given below.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        OPTION_ADDRSEL         |          option-len           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Reserved |A|P|                                               |
+-+-+-+-+-+-+-+-+     POLICY TABLE OPTIONS                      |
|                      (variable length)                        |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1: Address Selection option format

option-code: OPTION_ADDRSEL (TBD).

option-len: The total length of the Reserved field, A, P flags, and POLICY TABLE OPTIONS in octets.

Reserved: Reserved field. The server MUST set this value to zero and the client MUST ignore its content.

A: Automatic Row Addition flag. This flag toggles the Automatic Row Addition flag at client hosts, which is described in section 2.1 of [RFC6724]. If this flag is set to 1, it does not change client host behavior, that is, a client MAY automatically add additional site-specific rows to the policy table. If set to 0, the Automatic Row Addition flag is disabled, and a client SHOULD NOT automatically add rows to the policy table. If the option contains a POLICY TABLE option, this flag is meaningless, and automatic row addition SHOULD NOT be performed against the distributed policy table. This flag SHOULD be set to 0 only when the Automatic Row Addition at client hosts is harmful for site-specific reasons.

P: Privacy Preference flag. This flag toggles the Privacy Preference flag on client hosts, which is described in section 5 of [RFC6724]. If this flag is set to 1, it does not change client host behavior, that is, a client will prefer temporary addresses [RFC4941]. If set to 0, the Privacy Preference flag is disabled, and a client will prefer public addresses. This flag SHOULD be set to 0 only when the temporary addresses should not be preferred for site-specific reasons.

POLICY TABLE OPTIONS: Zero or more Address Selection Policy Table options, as described below. Each Policy Table option corresponds to a row in the policy table defined in section 2.1 of [RFC6724].

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     OPTION_ADDRSEL_TABLE      |          option-len           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    label      |  precedence   |   prefix-len  |               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
|                                                               |
|                   prefix   (variable length)                  |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 2: Address Selection Policy Table option format

option-code: OPTION_ADDRSEL_TABLE (TBD).

option-len: The total length of the label field, precedence field, prefix-len field, and prefix field.

label: An 8-bit unsigned integer; this value is for correlation of source address prefixes and destination address prefixes. This field is used to deliver a label value in the [RFC6724] policy table.

precedence: An 8-bit unsigned integer; this value is used for sorting destination addresses. This field is used to deliver a precedence value in the [RFC6724] policy table.

prefix-len: An 8-bit unsigned integer; the number of leading bits in the prefix that are valid. The value ranges from 0 to 128. If an option with a prefix length greater than 128 is included, the whole Address Selection option MUST be ignored.

prefix: A variable-length field containing an IP address or the prefix of an IP address. An IPv4-mapped address [RFC4291] must be used to represent an IPv4 address as a prefix value. This field is padded with zeros up to the nearest octet boundary when prefix-len is not divisible by 8, so its length in octets is (prefix-len + 7)/8 using integer division, which gives a value between 0 and 16. For example, the prefix 2001:db8::/60 would be encoded with a prefix-len of 60, and the prefix field would be 8 octets long, containing the octets 20 01 0d b8 00 00 00 00.
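The encoding of a single Policy Table option can be sketched as below. This is an illustration only: the function name encode_table_option is hypothetical, and because the option code is still TBD in this document, the value 0xFFFF is a placeholder chosen for the example.

```python
import ipaddress
import struct

OPTION_ADDRSEL_TABLE = 0xFFFF  # placeholder: the real option code is TBD

def encode_table_option(prefix, precedence, label):
    """Encode one policy table row as an OPTION_ADDRSEL_TABLE option."""
    net = ipaddress.IPv6Network(prefix)
    nbytes = (net.prefixlen + 7) // 8            # octets needed, rounded up
    prefix_bytes = net.network_address.packed[:nbytes]
    body = struct.pack("!BBB", label, precedence, net.prefixlen) + prefix_bytes
    return struct.pack("!HH", OPTION_ADDRSEL_TABLE, len(body)) + body

opt = encode_table_option("2001:db8::/60", precedence=45, label=14)
print(len(opt), opt[7:].hex(" "))
```

The final 8 octets of the result are the padded prefix from the example in the text; option-len is 11 (three fixed octets plus the prefix).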

3. Processing the Address Selection option

This section describes how to process a received Address Selection option at the DHCPv6 client.

This option’s concept is to serve as a hint for a node about how to behave in the network. Ultimately, while the node’s administrator can control how to deal with the received policy information, the implementation SHOULD follow the method described below uniformly, to ease troubleshooting and to reduce operational costs.
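A client-side parser for the option body might look like the following sketch. Only the rule that a prefix-len above 128 invalidates the whole option comes from this document; the handling of truncated data, of a prefix field whose length does not match prefix-len, and of unknown sub-options are assumptions made for the example, as are the name parse_addrsel and the placeholder option code.

```python
import ipaddress
import struct

TABLE_CODE = 0xFFFF  # placeholder: OPTION_ADDRSEL_TABLE is TBD

def parse_addrsel(payload, table_code=TABLE_CODE):
    """Parse an Address Selection option body (the octets after option-len).

    Returns (a_flag, p_flag, rows), or None if the option is to be ignored.
    """
    if len(payload) < 1:
        return None
    a_flag = bool(payload[0] & 0x02)   # A is the second-lowest bit
    p_flag = bool(payload[0] & 0x01)   # P is the lowest bit
    rows, pos = [], 1
    while pos < len(payload):
        if pos + 4 > len(payload):
            return None                # assumption: truncated header
        code, length = struct.unpack_from("!HH", payload, pos)
        body = payload[pos + 4:pos + 4 + length]
        pos += 4 + length
        if len(body) != length:
            return None                # assumption: truncated body
        if code != table_code:
            continue                   # assumption: skip unknown sub-options
        if length < 3:
            return None
        label, precedence, plen = body[0], body[1], body[2]
        if plen > 128 or length - 3 != (plen + 7) // 8:
            return None                # prefix-len > 128: ignore whole option
        packed = body[3:].ljust(16, b"\x00")
        net = ipaddress.IPv6Network((packed, plen), strict=False)
        rows.append((net, precedence, label))
    return a_flag, p_flag, rows

good = (bytes([0x03]) + struct.pack("!HHBBB", TABLE_CODE, 11, 14, 45, 60)
        + bytes.fromhex("20010db800000000"))
bad = bytes([0x03]) + struct.pack("!HHBBB", TABLE_CODE, 3, 14, 45, 129)
print(parse_addrsel(good))
print(parse_addrsel(bad))
```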

3.1. Handling local configurations

[RFC6724] defines two flags (A, P) and the default policy table. Also, users are usually able to configure the flags and the policy table to satisfy their own requirements.

The client implementation SHOULD provide the following choices to the user.

(a) replace the existing flags and active policy table with the DHCPv6 distributed flags and policy table.

(b) preserve the existing flags and active policy table, whether this be the default policy table, or user configured policy.

Choice (a) SHOULD be the default, i.e. when the policy table is not explicitly configured by the user.

3.2. Handling stale distributed flags and policy table

When the information from the DHCP server goes stale, the flags and the policy table received from the DHCP server SHOULD be deprecated. The local configuration SHOULD be restored when the DHCP-supplied configuration has been deprecated. In order to implement this, a host can retain the local configuration even after the flags and the policy table are updated by the distributed flags and policy table.

The received information can be considered stale in several cases, e.g., when the interface goes down, the DHCP server does not respond for a certain amount of time, or the Information Refresh Time has expired.
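One way to retain the local configuration while a distributed policy is active is sketched below. The class name PolicyState and its methods are hypothetical; how staleness is detected (interface down, server timeout, Information Refresh Time expiry) is left to the caller.

```python
class PolicyState:
    """Holds the local configuration and, optionally, a DHCPv6-supplied one."""

    def __init__(self, local_table, local_flags):
        self._local = (local_table, local_flags)  # retained for later restore
        self._dhcp = None

    def apply_dhcp(self, table, flags):
        """Install the distributed flags and policy table over the local ones."""
        self._dhcp = (table, flags)

    def mark_stale(self):
        """Deprecate the DHCP-supplied configuration; the local one takes over."""
        self._dhcp = None

    @property
    def active(self):
        return self._dhcp if self._dhcp is not None else self._local

state = PolicyState(local_table=[("::/0", 40, 1)], local_flags={"A": 1, "P": 1})
state.apply_dhcp([("::/0", 40, 1), ("::ffff:0:0/96", 100, 4)], {"A": 0, "P": 1})
print(len(state.active[0]))   # distributed table in effect
state.mark_stale()
print(len(state.active[0]))   # local table restored
```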

3.3. Handling multiple interfaces

The policy table, and other parameters specified in this document, are node-global information by their nature. One reason is that the outbound interface is usually chosen after destination address selection, so a host cannot make use of multiple address selection policies even if they are stored per interface.

The policy table is defined as a whole, so the slightest addition to or deletion from the policy table changes the semantics of the policy.

It should also be noted that the absence of a DHCP-distributed policy on a certain network interface should not be taken to mean that the network administrator does not care about address selection policy at all, because it may mean there is a preference to use the default address selection policy. So, it should be safe to assume that the default address selection policy should be used where no overriding policy is provided.

Under the above assumptions, we can specify how to handle received policy as follows.

In the absence of distributed policy for a certain network interface, the default address selection policy SHOULD be used. A node should use Address Selection options by default in either of the following two cases:

1: A single-homed host SHOULD use default address selection options, where the host belongs exclusively to one administrative network domain, usually through one active network interface.

2: Hosts that use advanced heuristics to deal with multiple received policies that are defined outside the scope of this document SHOULD use Address Selection options.

Implementations MAY provide configuration options to enable this protocol on a per interface basis.

Implementations MAY store distributed address selection policies per interface. They can be used effectively on implementations that adopt per-application interface selection.

4. Implementation Considerations

o The value ’label’ is passed as an unsigned integer, but there is no special meaning for the value, that is whether it is a large or small number. It is used to select a preferred source address prefix corresponding to a destination address prefix by matching the same label value within the DHCP message. DHCPv6 clients SHOULD convert this label to a representation appropriate for the local implementation (e.g., string).

o The maximum number of address selection rules that may be conveyed in one DHCPv6 message depends on the prefix length of each rule and the maximum DHCPv6 message size defined in [RFC3315]. It is possible to carry over 3,000 rules in one DHCPv6 message (maximum UDP message size). However, it should not be expected that DHCP clients, servers and relay agents can handle UDP fragmentation.

Network administrators SHOULD consider local limitations to the maximum DHCPv6 message size that can be reliably transported via their specific local infrastructure to end nodes; and therefore they SHOULD consider the number of options, the total size of the options, and the resulting DHCPv6 message size, when defining their policy table.
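The rule count can be estimated with a short calculation. The figures below rest on assumptions not stated in this document: the message is carried over IPv6 without jumbograms or extension headers (so at most 65535 - 8 octets of UDP payload), it contains nothing but the 4-octet DHCPv6 message header and a single Address Selection option, and every rule uses the same prefix length.

```python
MAX_UDP_PAYLOAD_IPV6 = 65535 - 8  # 16-bit IPv6 payload length minus UDP header
DHCPV6_MSG_HEADER = 4             # msg-type (1 octet) + transaction-id (3 octets)
ADDRSEL_FIXED = 4 + 1             # option-code, option-len, and the flags octet
TABLE_FIXED = 4 + 3               # option-code, option-len, label/precedence/prefix-len

def max_rules(prefix_len):
    """Upper bound on policy table rows of one prefix length in one message."""
    per_rule = TABLE_FIXED + (prefix_len + 7) // 8
    budget = MAX_UDP_PAYLOAD_IPV6 - DHCPV6_MSG_HEADER - ADDRSEL_FIXED
    return budget // per_rule

for plen in (48, 64, 128):
    print(plen, max_rules(plen))
```

Under these assumptions a table of /64 rules fits well over 3,000 entries, while a table made entirely of /128 rules falls slightly short of that figure. Either bound is far above what fits in an unfragmented message on a typical link, which is the practical limit the text warns about.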

5. Security Considerations

A rogue DHCPv6 server could issue bogus address selection policies to a client. This might lead to incorrect address selection by the client, and the affected packets might be blocked at an outgoing ISP because of ingress filtering, incur additional network charges, or be misdirected to an attacker’s machine. Alternatively, an IPv6 transition mechanism might be preferred over native IPv6, even if native IPv6 is available. To guard against such attacks, a legitimate DHCPv6 server should communicate through a secure, trusted channel, such as a channel protected by IPsec, SEND and DHCP authentication, as described in section 21 of [RFC3315]. A commonly used alternative mitigation is to employ DHCP snooping at Layer 2.

Another threat surrounds the potential privacy concern as described in the security considerations section of [RFC6724], whereby an attacker can send packets with different source addresses to a destination to solicit different source addresses in the responses from that destination. This issue will not be modified by the introduction of this option, regardless of whether the host is multihomed or not.

6. IANA Considerations

IANA is requested to assign option codes to OPTION_ADDRSEL and OPTION_ADDRSEL_TABLE from the "DHCP Option Codes" registry (http://www.iana.org/assignments/dhcpv6-parameters/dhcpv6-parameters.xml).

7. References

7.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3315] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M. Carney, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)", RFC 3315, July 2003.

[RFC6724] Thaler, D., Draves, R., Matsumoto, A., and T. Chown, "Default Address Selection for Internet Protocol Version 6 (IPv6)", RFC 6724, September 2012.

7.2. Informative References

[I-D.ietf-6man-addr-select-considerations] Chown, T. and A. Matsumoto, "Considerations for IPv6 Address Selection Policy Changes", draft-ietf-6man-addr-select-considerations-05 (work in progress), April 2013.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[RFC3484] Draves, R., "Default Address Selection for Internet Protocol version 6 (IPv6)", RFC 3484, February 2003.

[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.

[RFC4941] Narten, T., Draves, R., and S. Krishnan, "Privacy Extensions for Stateless Address Autoconfiguration in IPv6", RFC 4941, September 2007.

[RFC5220] Matsumoto, A., Fujisaki, T., Hiromi, R., and K. Kanayama, "Problem Statement for Default Address Selection in Multi-Prefix Environments: Operational Issues of RFC 3484 Default Rules", RFC 5220, July 2008.

[RFC5221] Matsumoto, A., Fujisaki, T., Hiromi, R., and K. Kanayama, "Requirements for Address Selection Mechanisms", RFC 5221, July 2008.

Appendix A. Acknowledgements

The authors would like to thank Dave Thaler, Pekka Savola, Remi Denis-Courmont, Francois-Xavier Le Bail, Ole Troan, Bob Hinden, Dmitry Anipko, Ray Hunter, Rui Paulo, Brian E Carpenter, Tom Petch, and the members of 6man’s address selection design team for their invaluable contributions to this document.

Appendix B. Examples

[RFC5220] gives several cases where address selection problems happen. This section contains some examples for solving those cases by using the DHCP option defined in this text to update the hosts’ policy table in a network accordingly. There is also some discussion of example policy tables in sections 10.3 to 10.7 of RFC 6724.

B.1. Ingress Filtering Problem

In the case described in section 2.1.2 of [RFC5220], the following policy table should be distributed when the router performs static routing and directs the default route to ISP1, as per Figure 2. By assigning the same label value to all IPv6 addresses (::/0) and to the local subnet (2001:db8:1000:1::/64), a host picks a source address in this subnet to send a packet via the default route.

Prefix                  Precedence  Label
::1/128                         50      0
::/0                            40      1
2001:db8:1000:1::/64            45      1
2001:db8:8000:1::/64            45     14
::ffff:0:0/96                   35      4
2002::/16                       30      2
2001::/32                        5      5
fc00::/7                         3     13
::/96                            1      3
fec0::/10                        1     11
3ffe::/16                        1     12
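The effect of this table can be sketched in Python. The sketch below is illustrative only: it models just the longest-prefix policy lookup and the "prefer matching label" rule of RFC 6724 source address selection, ignoring every other rule. The names POLICY_B1, lookup, and pick_source, and the host addresses used, are assumptions made for this example.

```python
import ipaddress

# Policy table from B.1 as (prefix, precedence, label).
POLICY_B1 = [
    ("::1/128", 50, 0),
    ("::/0", 40, 1),
    ("2001:db8:1000:1::/64", 45, 1),
    ("2001:db8:8000:1::/64", 45, 14),
    ("::ffff:0:0/96", 35, 4),
    ("2002::/16", 30, 2),
    ("2001::/32", 5, 5),
    ("fc00::/7", 3, 13),
    ("::/96", 1, 3),
    ("fec0::/10", 1, 11),
    ("3ffe::/16", 1, 12),
]


def lookup(table, addr):
    """Return (precedence, label) of the longest prefix matching addr."""
    ip = ipaddress.IPv6Address(addr)
    best = None
    for prefix, precedence, label in table:
        net = ipaddress.IPv6Network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, precedence, label)
    # ::/0 is in the table, so a match always exists.
    return best[1], best[2]


def pick_source(table, candidates, destination):
    """Pick the first candidate whose label matches the destination's label.

    Only the matching-label rule is modeled; if no candidate matches,
    the first candidate is returned unchanged.
    """
    dst_label = lookup(table, destination)[1]
    matching = [c for c in candidates if lookup(table, c)[1] == dst_label]
    return (matching or candidates)[0]


# A host holding one address in each local prefix, sending to a
# destination that is only covered by ::/0 (label 1).
source = pick_source(
    POLICY_B1,
    ["2001:db8:8000:1::10", "2001:db8:1000:1::10"],
    "2001:db8:ffff::1",
)
```

With the table above, the destination falls under ::/0 (label 1), so the address in 2001:db8:1000:1::/64 (also label 1) is chosen over the one in 2001:db8:8000:1::/64 (label 14).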

B.2. Half-Closed Network Problem

In the case described in section 2.1.3 of [RFC5220], the following policy table should be distributed. By splitting the closed network prefix (2001:db8:8000::/36) from all IPv6 addresses (::/0) and giving different labels, the closed network prefix will only be used when packets are destined for the closed network.

Prefix                  Precedence  Label
::1/128                         50      0
::/0                            40      1
2001:db8:8000::/36              45     14
::ffff:0:0/96                   35      4
2002::/16                       30      2
2001::/32                        5      5
fc00::/7                         3     13
::/96                            1      3
fec0::/10                        1     11
3ffe::/16                        1     12

B.3. IPv4 or IPv6 Prioritization

In the case described in section 2.2.1 of [RFC5220], the following policy table should be distributed to prioritize IPv4: the IPv4-mapped prefix (::ffff:0:0/96) is given the highest precedence. This case is also described in [RFC6724].

Prefix                  Precedence  Label
::1/128                         50      0
::/0                            40      1
::ffff:0:0/96                  100      4
2002::/16                       30      2
2001::/32                        5      5
fc00::/7                         3     13
::/96                            1      3
fec0::/10                        1     11
3ffe::/16                        1     12
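To see what the precedence column does, the following Python sketch orders candidate destinations by the precedence of their longest matching prefix, which is the "prefer higher precedence" rule of RFC 6724 destination address selection taken in isolation. IPv4 destinations are represented as IPv4-mapped IPv6 addresses, as RFC 6724 does. The names POLICY_B3, precedence, and order_destinations are assumptions for this illustration, and all other selection rules are deliberately left out.

```python
import ipaddress

# Policy table from B.3 as (prefix, precedence, label).
POLICY_B3 = [
    ("::1/128", 50, 0),
    ("::/0", 40, 1),
    ("::ffff:0:0/96", 100, 4),
    ("2002::/16", 30, 2),
    ("2001::/32", 5, 5),
    ("fc00::/7", 3, 13),
    ("::/96", 1, 3),
    ("fec0::/10", 1, 11),
    ("3ffe::/16", 1, 12),
]


def precedence(table, addr):
    """Return the precedence of the longest prefix matching addr."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        # Represent IPv4 as an IPv4-mapped IPv6 address (::ffff:a.b.c.d).
        ip = ipaddress.IPv6Address("::ffff:" + addr)
    best_len, best_prec = -1, None
    for prefix, prec, _label in table:
        net = ipaddress.IPv6Network(prefix)
        if ip in net and net.prefixlen > best_len:
            best_len, best_prec = net.prefixlen, prec
    return best_prec


def order_destinations(table, destinations):
    """Sort destinations so that higher precedence comes first."""
    return sorted(destinations, key=lambda d: -precedence(table, d))


ordered = order_destinations(POLICY_B3, ["2001:db8::1", "192.0.2.1"])
```

With precedence 100 on ::ffff:0:0/96, the IPv4 destination sorts ahead of the IPv6 destination, which only matches ::/0 at precedence 40.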

B.4. ULA or Global Prioritization

In the case described in section 2.2.3 of [RFC5220], the following policy table should be distributed, or the Automatic Row Addition flag should be set to 1. By splitting the ULA in this site (fc12:3456:789a::/48) from all IPv6 addresses (::/0) and giving it higher precedence, the ULA will be used to connect to servers in the same site.

Prefix                  Precedence  Label
::1/128                         50      0
fc12:3456:789a::/48             45     14
::/0                            40      1
::ffff:0:0/96                   35      4
2002::/16                       30      2
2001::/32                        5      5
fc00::/7                         3     13
::/96                            1      3
fec0::/10                        1     11
3ffe::/16                        1     12

Authors’ Addresses

Arifumi Matsumoto NTT NT Lab 3-9-11 Midori-Cho Musashino-shi, Tokyo 180-8585 Japan

Phone: +81 422 59 3334 Email: [email protected]

Tomohiro Fujisaki NTT NT Lab 3-9-11 Midori-Cho Musashino-shi, Tokyo 180-8585 Japan

Phone: +81 422 59 7351 Email: [email protected]

Tim Chown University of Southampton Southampton, Hampshire SO17 1BJ United Kingdom

Email: [email protected]

6man Working Group F. Costa Internet-Draft J-M. Combes, Ed. Intended status: Standards Track X. Pougnard Expires: October 11, 2013 France Telecom Orange H. Li Huawei Technologies April 9, 2013

Duplicate Address Detection Proxy draft-ietf-6man-dad-proxy-07

Abstract

This document describes a proxy-based mechanism allowing the use of Duplicate Address Detection (DAD) by IPv6 nodes in a point-to-multipoint architecture with a "split-horizon" forwarding scheme, primarily deployed for Digital Subscriber Line (DSL) and Fiber access architectures. Based on the DAD signalling, the first hop router stores in a Binding Table all known IPv6 addresses used on a point-to-multipoint domain (e.g. VLAN). When a node performs DAD for an address already used by another node, the first hop router replies on behalf of that other node.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on October 11, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Costa, et al. Expires October 11, 2013 [Page 1] Internet-Draft DAD-Proxy April 2013

Table of Contents

1. Introduction
   1.1. Requirements Language
2. Background
3. Why existing IETF solutions are not sufficient?
   3.1. Duplicate Address Detection
   3.2. Neighbor Discovery Proxy
   3.3. 6LoWPAN Neighbor Discovery
   3.4. IPv6 Mobility Manager
4. Duplicate Address Detection Proxy (DAD-Proxy) specifications
   4.1. DAD-Proxy Data structure
   4.2. DAD-Proxy mechanism
      4.2.1. No entry exists for the tentative address
      4.2.2. An entry already exists for the tentative address
      4.2.3. Confirmation of reachability to check the validity of the conflict
5. Manageability Considerations
6. IANA Considerations
7. Security Considerations
   7.1. Interoperability with SEND
   7.2. IP source address spoofing protection
8. Acknowledgments
9. References
   9.1. Normative References
   9.2. Informative References
Appendix A. DAD Proxy state machine
Authors’ Addresses

1. Introduction

This document specifies a function called Duplicate Address Detection (DAD) proxy, which allows the use of DAD by the nodes on the same point-to-multipoint domain with a "split-horizon" forwarding scheme, primarily deployed for Digital Subscriber Line (DSL) and Fiber access architectures. It only impacts the first hop router and requires no modifications to the other IPv6 nodes. This mechanism is fully effective if all the nodes of a point-to-multipoint domain (except the DAD proxy itself) perform DAD.

This document also explains why the DAD mechanism [RFC4862], without a proxy, cannot be used in a point-to-multipoint architecture with a "split-horizon" forwarding scheme (IPv6 over PPP [RFC5072] is not affected). One of the main reasons is that, because of this forwarding scheme, IPv6 nodes on the same point-to-multipoint domain cannot communicate directly: any communication between them must go through the first hop router of the same domain.

It is assumed in this document that Link-layer addresses on a point-to-multipoint domain are unique from the first hop router’s point of view (e.g. in an untrusted Ethernet architecture this assumption can be guaranteed thanks to mechanisms such as "MAC Address Translation" performed by an aggregation device between IPv6 nodes and the first hop router).

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Background

Terminology in this document follows that in Neighbor Discovery for IP version 6 (IPv6) [RFC4861] and IPv6 Stateless Address Autoconfiguration [RFC4862]. In addition, this section defines additional terms related to Digital Subscriber Line (DSL) and Fiber access architectures, which are an important case where the solution described in this document can be used:

Customer Premises Equipment (CPE)
   The first IPv6 node in a customer’s network.

Access Node (AN)
   The first aggregation point in the public access network. It is considered an L2 bridge in this document.

Broadband Network Gateway (BNG)
   The first hop router from the CPE’s point of view.

VLAN N:1 architecture
   A point-to-multipoint architecture where many CPEs are connected to the same VLAN. The CPEs may be connected to the same or to different Access Nodes.

split-horizon model
   A forwarding scheme where CPEs cannot have direct layer 2 communications between them (i.e. IP flows must be forwarded through the BNG via routing).

The following figure shows where the entities defined above are located.

+------+         +----+
| CPE3 |---------| AN |
+------+         +----+
                    |
                    |
+------+         +----+
| CPE2 |---------| AN |---+
+------+         +----+   |
+------+            |     |
| CPE1 |------------+     |
+------+               +-----+
                       | BNG |--- Internet
                       +-----+

Figure 1: DSL and Fiber access Architecture

3. Why existing IETF solutions are not sufficient?

In the DSL or Fiber access architecture depicted in Figure 1, CPE1, CPE2, CPE3, and the BNG are IPv6 nodes, while the AN is an L2 bridge providing connectivity between the BNG and each CPE. The AN enforces a split-horizon model so that CPEs can only send and receive frames (e.g. Ethernet frames) to and from the BNG but not to each other. In other words, the BNG is on the same link as every CPE, but no CPE is on the same link as any other CPE.

3.1. Duplicate Address Detection

Duplicate Address Detection (DAD) [RFC4862] is performed when an IPv6 node verifies the uniqueness of a tentative IPv6 address. The node sends a Neighbor Solicitation (NS) message with the IP destination set to the solicited-node multicast address of the tentative address. This NS message is multicast to the other nodes on the same link. When the tentative address is already used on the link by another node, that node replies with a Neighbor Advertisement (NA) message to inform the first node. So when performing DAD, a node expects the NS messages to be received by any node currently using the tentative address.
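As an illustration of the destination used by a DAD NS message, the following Python sketch derives the solicited-node multicast address from a tentative address, using the mapping defined in RFC 4291 (the prefix ff02::1:ff00:0/104 combined with the low-order 24 bits of the address). The function name and the sample address are chosen for this example only.

```python
import ipaddress

# Solicited-node multicast prefix ff02::1:ff00:0/104 (RFC 4291).
_SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(tentative: str) -> str:
    """Return the solicited-node multicast address for a tentative address."""
    low24 = int(ipaddress.IPv6Address(tentative)) & 0xFFFFFF
    return str(ipaddress.IPv6Address(_SOLICITED_NODE_BASE | low24))


# The NS sent for this tentative address is destined to ff02::1:ffcd:1234.
ns_destination = solicited_node_multicast("2001:db8::abcd:1234")
```

Two nodes choosing the same tentative address always compute the same multicast destination, which is why DAD relies on the NS actually reaching the other node.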

However, in a point-to-multipoint network with a split-horizon forwarding scheme implemented in the AN, the CPEs are prevented from talking to each other directly. All packets sent out from a CPE are forwarded by the AN only to the BNG and not to any other CPE. NS messages sent by a given CPE will be received only by the BNG and will not reach other CPEs. So, other CPEs have no idea that a given IPv6 address is used by another CPE. That means that, in a network with split-horizon, DAD as defined in [RFC4862] cannot work properly without additional help.

3.2. Neighbor Discovery Proxy

Neighbor Discovery (ND) Proxy [RFC4389] is designed for forwarding ND messages between different IP links where the subnet prefix is the same. An ND Proxy function on a bridge ensures that packets between nodes on different segments can be received by this function and have the correct link-layer address type on each segment. When the ND proxy receives a multicast ND message, it forwards it to all other interfaces on the same link.

In DSL or Fiber networks, when the AN, acting as an ND Proxy, receives an ND message from a CPE, it will forward it to the BNG but to none of the other CPEs, as only the BNG is on the same link as the CPE. Hence, implementing ND Proxy on the AN would not help a CPE learn about link-local addresses used by other CPEs.

As the BNG must not forward link-local scoped messages sent from a CPE to other CPEs, ND Proxy cannot be implemented in the BNG.

3.3. 6LoWPAN Neighbor Discovery

[RFC6775] defines an optional modification of DAD for a 6LoWPAN. When a 6LoWPAN node wants to configure an IPv6 address, it registers that address with one or more of its default routers using the Address Registration option (ARO). If this address is already owned by another node, the router informs the 6LoWPAN node that this address cannot be configured.

This mechanism requires modifications in all hosts in order to support the Address Registration option.

3.4. IPv6 Mobility Manager

According to [RFC6275], a home agent acts as a proxy for mobile nodes when they are away from the home network: the home agent defends a mobile node’s home address by replying to NS messages with NA messages.

This mechanism is problematic if it is applied in a DSL or Fiber public access network. Operators of such networks require that an NA message be received only by the sender of the corresponding NS message, for security and scalability reasons. However, the home agent per [RFC6275] multicasts NA messages on the home link, and all nodes on this link will receive these NA messages. This shortcoming prevents this mechanism from being deployed directly in DSL or Fiber access networks.

4. Duplicate Address Detection Proxy (DAD−Proxy) specifications

First, it is important to note that, because this mechanism is closely based on DAD [RFC4862], it is not completely reliable. Fixing DAD is not a goal of this document.

4.1. DAD−Proxy Data structure

A BNG needs to store in a Binding Table information related to the IPv6 addresses generated by any CPE. This Binding Table can be distinct from the Neighbor Cache. This must be done per point-to-multipoint domain (e.g. per Ethernet VLAN). Each entry in this Binding Table MUST contain the following fields:

o IPv6 Address

o Link-layer Address

For security or performance reasons, it must be possible to limit the number of IPv6 Addresses per Link-layer Address (possibly, but not necessarily, to 1).

On reception of an unsolicited NA (e.g., when a CPE wishes to inform its neighbors of a new link-layer address) for an IPv6 address already recorded in the Binding Table, each entry associated with this IPv6 address MUST be updated accordingly: the current Link-layer Address is replaced by the one included in the unsolicited NA message.

For security or performance reasons, the Binding Table MUST be large enough for the deployment in which it is used: if the Binding Table is distinct from the Neighbor Cache, it MUST be at least as large as the Neighbor Cache. Implementations MUST either state the fixed size of Binding Table that they support or make the size configurable. In the latter case, implementations MUST state the largest Binding Table size that they support. Additionally, implementations SHOULD allow an operator to query the current occupancy level of the Binding Table to determine if it is about to become full. Implementations encountering a full Binding Table will likely handle it in a way similar to NS message loss.

It is recommended to apply technical solutions to minimize the risk that the Binding Table becomes full. These solutions are out of the scope of this document.
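The data structure described in this section could be sketched as follows in Python. This is a minimal illustration and not a normative implementation: the class name, method names, the string form of addresses, and the choice to refuse new entries when a limit is hit are all assumptions made for the example.

```python
class BindingTable:
    """Binding Table for one point-to-multipoint domain (e.g. one VLAN).

    Maps an IPv6 address to the link-layer address that claimed it.
    Addresses are plain strings assumed to be in canonical form.
    """

    def __init__(self, max_entries, max_addrs_per_lladdr=None):
        self.max_entries = max_entries                    # configured size
        self.max_addrs_per_lladdr = max_addrs_per_lladdr  # optional limit
        self._by_addr = {}

    def occupancy(self):
        """Return (current entries, maximum entries) for the operator."""
        return len(self._by_addr), self.max_entries

    def lookup(self, ipv6):
        return self._by_addr.get(ipv6)

    def add(self, ipv6, lladdr):
        """Record a binding; return False if a limit prevents it."""
        if ipv6 in self._by_addr:
            return self._by_addr[ipv6] == lladdr
        if len(self._by_addr) >= self.max_entries:
            return False  # table full
        owned = sum(1 for ll in self._by_addr.values() if ll == lladdr)
        if (self.max_addrs_per_lladdr is not None
                and owned >= self.max_addrs_per_lladdr):
            return False  # per link-layer address limit reached
        self._by_addr[ipv6] = lladdr
        return True

    def update_from_unsolicited_na(self, ipv6, new_lladdr):
        """Replace the link-layer address of an existing entry."""
        if ipv6 not in self._by_addr:
            return False
        self._by_addr[ipv6] = new_lladdr
        return True


# One table per domain, keyed here by a hypothetical VLAN ID.
tables = {100: BindingTable(max_entries=2, max_addrs_per_lladdr=1)}
```

Keeping one table per domain mirrors the requirement that bindings are tracked per point-to-multipoint domain, and occupancy() corresponds to the operator query recommended above.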

4.2. DAD−Proxy mechanism

When a CPE performs DAD, as specified in [RFC4862], it sends a Neighbor Solicitation (NS) message, with the unspecified address as the source address, in order to check if a tentative address is already in use on the link. The BNG receives this message and MUST perform actions specified in the following sections based on the information in the Binding Table.

4.2.1. No entry exists for the tentative address

When there is no entry for the tentative address, the BNG MUST create one with the following information:

o IPv6 Address Field set to the tentative address in the NS message.

o Link-layer Address Field set to the Link-layer source address in the Link-layer Header of the NS message.

The BNG MUST NOT reply to the CPE or forward the NS message.

4.2.2. An entry already exists for the tentative address

When there is an entry for the tentative address, the BNG MUST check the following conditions:

o The address in the Target Address Field in the NS message is equal to the address in the IPv6 Address Field in the entry.

o The source address of the IPv6 Header in the NS message is equal to the unspecified address.

When these conditions are met and the source address of the Link-layer Header in the NS message is equal to the address in the Link-layer Address Field in the entry, the CPE is still performing DAD for this address. The BNG MUST NOT reply to the CPE or forward the NS message.

When these conditions are met and the source address of the Link-layer Header in the NS message is not equal to the address in the Link-layer Address Field in the entry, another CPE is possibly performing DAD for an address that is already owned. The BNG then has to verify whether there is a real conflict by checking if the CPE whose IPv6 address is in the entry is still connected. In the following, IPv6-CPE1 denotes the IPv6 address of the existing entry in the Binding Table, Link-layer-CPE1 the Link-layer address of that entry, and Link-layer-CPE2 the Link-layer address of the CPE which is performing DAD, which is different from Link-layer-CPE1.

The BNG MUST check if the potential address conflict is real. In particular:

o If IPv6-CPE1 is in the Neighbor Cache and it is associated with Link-layer-CPE1, the reachability of IPv6-CPE1 MUST be confirmed as explained in Section 4.2.3.

o If IPv6-CPE1 is in the Neighbor Cache but is associated there with a Link-layer address other than Link-layer-CPE1, there is possibly a conflict with another CPE that did not perform DAD. This situation is out of the scope of this document, since one assumption made above is that all the nodes of a point-to-multipoint domain (except the DAD proxy itself) perform DAD.

o If IPv6-CPE1 is not in the Neighbor Cache, then the BNG MUST create a new entry based on the information of the entry in the Binding Table. This step is necessary in order to trigger the reachability check as explained in Section 4.2.3. The entry in the Neighbor Cache MUST be created based on the algorithm defined in section 7.3.3 of [RFC4861], in particular by considering the case as if a packet other than a solicited Neighbor Advertisement was received from IPv6-CPE1. That means that the new entry of the Neighbor Cache MUST contain the following information:

* IPv6 address: IPv6-CPE1

* Link-layer address: Link-layer-CPE1

* State: STALE

Then the reachability of IPv6-CPE1 MUST be confirmed as soon as possible following the procedure explained in Section 4.2.3.
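The decision steps of Sections 4.2.1 and 4.2.2 can be summarized with the following Python sketch. It is a simplified illustration under stated assumptions: the Binding Table and the Neighbor Cache are modeled as plain dictionaries, the function name and the returned action strings are invented for this example, and the reachability check itself is left to the caller.

```python
UNSPECIFIED = "::"


def handle_dad_ns(binding_table, neighbor_cache, target, ip_src, ll_src):
    """Decide what the BNG does with a received NS message.

    binding_table: dict mapping IPv6 address -> link-layer address
    neighbor_cache: dict mapping IPv6 address -> {"lladdr", "state"}
    Returns a hypothetical action name; the BNG never forwards the NS.
    """
    if ip_src != UNSPECIFIED:
        # Not a DAD NS: outside what this sketch covers.
        return "not-a-dad-ns"

    if target not in binding_table:
        # Section 4.2.1: create the entry, do not reply.
        binding_table[target] = ll_src
        return "entry-created"

    ll_cpe1 = binding_table[target]
    if ll_cpe1 == ll_src:
        # Same CPE is still performing DAD: do not reply.
        return "dad-in-progress"

    # Potential conflict: another link-layer address claims the address.
    cached = neighbor_cache.get(target)
    if cached is None:
        # Create a STALE entry so that the reachability check can run.
        neighbor_cache[target] = {"lladdr": ll_cpe1, "state": "STALE"}
        return "confirm-reachability"
    if cached["lladdr"] == ll_cpe1:
        return "confirm-reachability"
    # The cache disagrees with the Binding Table: out of scope.
    return "out-of-scope"
```

A caller receiving "confirm-reachability" would then start the procedure of Section 4.2.3 for the target address.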

4.2.3. Confirmation of reachability to check the validity of the conflict

Given that IPv6-CPE1 is in an entry of the Neighbor Cache, the reachability of IPv6-CPE1 is checked by using the NUD (Neighbor Unreachability Detection) mechanism described in section 7.3.1 of [RFC4861]. This mechanism MUST be triggered as if a packet has to be sent to IPv6-CPE1. Note that in some cases this mechanism does not do anything, for instance if the state of the entry is REACHABLE and a positive confirmation was received recently that the forward path to IPv6-CPE1 was functioning properly (see RFC 4861 for more details).

Next, the behavior of the BNG depends on the result of the NUD process, as explained in the following sections.

4.2.3.1. The result of the NUD process is negative

If the result of the NUD process is negative (i.e. if this process removes IPv6-CPE1 from the Neighbor Cache), the potential conflict is not real.

The conflicting entry in the Binding Table (bound to Link-layer-CPE1) is deleted and replaced by a new entry with the same IPv6 address but with the Link-layer address of the CPE which is performing DAD (Link-layer-CPE2), as explained in Section 4.2.1.

4.2.3.2. The result of the NUD process is positive

If the result of the NUD process is positive (i.e. if after this process the state of IPv6-CPE1 is REACHABLE), the potential conflict is real.

As shown in Figure 2, the BNG MUST reply to the CPE that is performing DAD (CPE2 in Figure 1) with an NA message which has the following format:

Layer 2 Header Fields:

   Source Address
      The Link-layer address of the interface on which the BNG received the NS message.

   Destination Address
      The source address in the Layer 2 Header of the NS message received by the BNG (i.e. Link-layer-CPE2)

IPv6 Header Fields:

   Source Address
      An address assigned to the interface from which the advertisement is sent.

   Destination Address
      The all-nodes multicast address.

ICMPv6 Fields:

   Target Address
      The tentative address already used (i.e. IPv6-CPE1).

   Target Link-layer address
      The Link-layer address of the interface on which the BNG received the NS message.
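The outcome of the NUD process and the resulting NA message can be sketched in Python as follows. The sketch only assembles the field values listed above into a dictionary; it does not build wire-format packets. The function names, the dictionary layout, and the sample addresses are assumptions for this illustration, while ff02::1 (all-nodes multicast) and ICMPv6 type 136 (Neighbor Advertisement) are the standard values.

```python
ALL_NODES_MULTICAST = "ff02::1"
ICMPV6_NEIGHBOR_ADVERTISEMENT = 136


def build_dad_proxy_na(bng_lladdr, bng_ipv6, cpe2_lladdr, ipv6_cpe1):
    """Assemble the fields of the NA sent to the CPE performing DAD."""
    return {
        # Unicast at layer 2 to the CPE that sent the NS.
        "l2": {"src": bng_lladdr, "dst": cpe2_lladdr},
        # Multicast at layer 3, as the format above requires.
        "ipv6": {"src": bng_ipv6, "dst": ALL_NODES_MULTICAST},
        "icmpv6": {
            "type": ICMPV6_NEIGHBOR_ADVERTISEMENT,
            "target": ipv6_cpe1,
            "target_lladdr": bng_lladdr,
        },
    }


def on_nud_result(binding_table, reachable, bng_lladdr, bng_ipv6,
                  cpe2_lladdr, ipv6_cpe1):
    """Apply Section 4.2.3.1 (negative) or 4.2.3.2 (positive)."""
    if not reachable:
        # No real conflict: rebind the address to the new CPE, no reply.
        binding_table[ipv6_cpe1] = cpe2_lladdr
        return None
    # Real conflict: tell the new CPE that DAD failed.
    return build_dad_proxy_na(bng_lladdr, bng_ipv6, cpe2_lladdr, ipv6_cpe1)


table = {"2001:db8::1": "aa:aa:aa:aa:aa:aa"}
na = on_nud_result(table, True, "00:00:5e:00:53:01", "fe80::1",
                   "bb:bb:bb:bb:bb:bb", "2001:db8::1")
```

Note that the layer 2 destination is a unicast address while the IPv6 destination is multicast, which is the combination that makes the NA visible only to the CPE performing DAD.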

    CPE1      CPE2      BNG
     |         |         |
  (a)|         |         |
     |         |         |
  (b)|==================>|
     |         |         |(c)
     |         |         |
     |      (d)|         |
     |         |         |
     |      (e)|========>|
     |         |         |
     |         |<========|(f)
     |         |         |

(a) CPE1 generates a tentative address
(b) CPE1 performs DAD for this address
(c) BNG updates its Binding Table
(d) CPE2 generates the same tentative address
(e) CPE2 performs DAD for this address
(f) BNG informs CPE2 that DAD fails

Figure 2

The BNG and the CPE MUST support the Unicast Transmission on Link-layer of IPv6 Multicast Messages [RFC6085], to be able, respectively, to generate and to process such a packet format.

5. Manageability Considerations

The BNG SHOULD support a mechanism to log and emit alarms whenever a duplication of IPv6 addresses is detected by the DAD-Proxy function. Moreover, the BNG SHOULD implement a function to allow an operator to access logs and to see the current entries in the Binding Table. The management of access rights to get this information is out of the scope of this document.

6. IANA Considerations

No new options or messages are defined in this document.

7. Security Considerations

7.1. Interoperability with SEND

The mechanism described in this document will not interoperate with SEcure Neighbor Discovery (SEND) [RFC3971]. This is due to the BNG not owning the private key associated with the Cryptographically Generated Address (CGA) [RFC3972] needed to correctly sign the proxied ND messages [RFC5909].

Secure Proxy ND Support for SEND [RFC6496] has been specified to address this limitation, and SHOULD be implemented and used on the BNG and the CPEs.

7.2. IP source address spoofing protection

To ensure protection against IP source address spoofing in data packets, this proposal can be used in combination with Source Address Validation Improvement (SAVI) mechanisms [RFC6620] [I-D.ietf-savi-send] [I-D.ietf-savi-mix].

If so, the SAVI device is the BNG and the Binding Anchor for a CPE is its MAC address, which is assumed to be unique in this document (cf. Section 1).

8. Acknowledgments

The authors would like to thank Alan Kavanagh, Wojciech Dec, Suresh Krishnan and Tassos Chatzithomaoglou for their comments. The authors would also like to thank the IETF 6man WG members and the BBF community for their support.

9. References

9.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

[RFC6085] Gundavelli, S., Townsley, M., Troan, O., and W. Dec, "Address Mapping of IPv6 Multicast Packets on Ethernet", RFC 6085, January 2011.

9.2. Informative References

[I-D.ietf-savi-mix] Bi, J., Yao, G., Halpern, J., and E. Levy-Abegnoli, "SAVI for Mixed Address Assignment Methods Scenario", draft-ietf-savi-mix-03 (work in progress), November 2012.

[I-D.ietf-savi-send] Bagnulo, M. and A. Garcia-Martinez, "SEND-based Source-Address Validation Implementation", draft-ietf-savi-send-09 (work in progress), April 2013.

[RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure Neighbor Discovery (SEND)", RFC 3971, March 2005.

[RFC3972] Aura, T., "Cryptographically Generated Addresses (CGA)", RFC 3972, March 2005.

[RFC4389] Thaler, D., Talwar, M., and C. Patel, "Neighbor Discovery Proxies (ND Proxy)", RFC 4389, April 2006.

[RFC5072] Varada, S., Haskins, D., and E. Allen, "IP Version 6 over PPP", RFC 5072, September 2007.

[RFC5909] Combes, J-M., Krishnan, S., and G. Daley, "Securing Neighbor Discovery Proxy: Problem Statement", RFC 5909, July 2010.

[RFC6275] Perkins, C., Johnson, D., and J. Arkko, "Mobility Support in IPv6", RFC 6275, July 2011.

[RFC6496] Krishnan, S., Laganier, J., Bonola, M., and A. Garcia-Martinez, "Secure Proxy ND Support for SEcure Neighbor Discovery (SEND)", RFC 6496, February 2012.

[RFC6620] Nordmark, E., Bagnulo, M., and E. Levy-Abegnoli, "FCFS SAVI: First-Come, First-Served Source Address Validation Improvement for Locally Assigned IPv6 Addresses", RFC 6620, May 2012.

[RFC6775] Shelby, Z., Chakrabarti, S., Nordmark, E., and C. Bormann, "Neighbor Discovery Optimization for IPv6 over Low-Power Wireless Personal Area Networks (6LoWPANs)", RFC 6775, November 2012.

Appendix A. DAD Proxy state machine

This informative appendix contains a summary (cf. Table 1) of the actions taken by the BNG when it receives a DAD-based NS (DAD-NS) message. The tentative address in this message is IPv6-CPE1 and the associated Link-layer address is Link-layer-CPE2. The actions are precisely specified in Section 4.2.

+------------+---------------------+-------------------+------------+
| Event      | Check               | Action            | New event  |
+------------+---------------------+-------------------+------------+
| DAD-NS     | o No entry for      | Create an entry   | -          |
| message    | IPv6-CPE1 in the    | for IPv6-CPE1     |            |
| reception. | Binding Table.      | bound to          |            |
|            |                     | Link-layer-CPE2   |            |
|            |                     | in the Binding    |            |
|            |                     | Table.            |            |
|            | o Entry for         | -                 | Existing   |
|            | IPv6-CPE1 in the    |                   | entry      |
|            | Binding Table.      |                   |            |
| Existing   | o Link-layer-CPE2   | -                 | -          |
| entry      | bound to IPv6-CPE1  |                   |            |
|            | in the Binding      |                   |            |
|            | Table.              |                   |            |
|            | o Another           | -                 | Conflict?  |
|            | Link-layer address, |                   |            |
|            | Link-layer-CPE1,    |                   |            |
|            | bound to IPv6-CPE1  |                   |            |
|            | in the Binding      |                   |            |
|            | Table.              |                   |            |
| Conflict?  | o IPv6-CPE1         | -                 | Reachable? |
|            | associated to       |                   |            |
|            | Link-layer-CPE1 in  |                   |            |
|            | the Neighbor Cache. |                   |            |
|            | o IPv6-CPE1         | Out of scope.     | -          |
|            | associated to       |                   |            |
|            | another Link-layer  |                   |            |
|            | address than        |                   |            |
|            | Link-layer-CPE1 in  |                   |            |
|            | the Neighbor Cache. |                   |            |
|            | o IPv6-CPE1 is not  | Create an entry   | Reachable? |
|            | in the Neighbor     | for IPv6-CPE1     |            |
|            | Cache.              | associated to     |            |
|            |                     | Link-layer-CPE1   |            |
|            |                     | in the Neighbor   |            |
|            |                     | Cache.            |            |
| Reachable? | o NUD process is    | IPv6-CPE1 is      | -          |
|            | negative.           | bound to          |            |
|            |                     | Link-layer-CPE2,  |            |
|            |                     | instead of        |            |
|            |                     | Link-layer-CPE1,  |            |
|            |                     | in the Binding    |            |
|            |                     | Table.            |            |
|            | o NUD process is    | An NA message is  | -          |
|            | positive.           | sent.             |            |
+------------+---------------------+-------------------+------------+

Table 1

Authors’ Addresses

Fabio Costa France Telecom Orange 61 rue des Archives 75141 Paris Cedex 03 France

Email: [email protected]

Jean-Michel Combes (editor) France Telecom Orange 38 rue du General Leclerc 92794 Issy-les-Moulineaux Cedex 9 France

Email: [email protected]

Xavier Pougnard France Telecom Orange 2 avenue Pierre Marzin 22300 Lannion France

Email: [email protected]

Hongyu Li Huawei Technologies Huawei Industrial Base Shenzhen China

Email: [email protected]

6man Working Group S. Krishnan Internet-Draft Ericsson Updates: 2460 (if approved) j h. woodyatt Intended status: Standards Track Apple Expires: July 15, 2012 E. Kline Google J. Hoagland Symantec M. Bhatia Alcatel-Lucent January 12, 2012

An uniform format for IPv6 extension headers draft-ietf-6man-exthdr-06

Abstract

In IPv6, optional internet-layer information is encoded in separate headers that may be placed between the IPv6 header and the transport layer header. There are a small number of such extension headers currently defined. This document describes the issues that can arise when defining new extension headers and discusses the alternate extension mechanisms in IPv6. It also provides a common format for defining any new IPv6 extension headers if they are needed.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on July 15, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

Krishnan, et al. Expires July 15, 2012 [Page 1] Internet-Draft Format for IPv6 extension headers January 2012

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction
2. Conventions used in this document
3. Applicability
4. Proposed IPv6 Extension Header format
5. Backward Compatibility
6. Future work
7. IANA Considerations
8. Security Considerations
9. Acknowledgements
10. Normative References
Authors’ Addresses

1. Introduction

The base IPv6 standard [RFC2460] defines extension headers as an expansion mechanism to carry optional internet layer information. Extension headers, with the exception of the hop-by-hop options header, are not usually processed on intermediate nodes. However, several deployed IPv6 routers and firewalls, in contradiction to [RFC2460], are capable of parsing past or ignoring all currently defined IPv6 Extension Headers (e.g. to examine transport-layer header fields) at wire-speed (e.g. by using custom ASICs for packet processing). Hence, one must also consider that any new IPv6 Extension Header will break IPv6 deployments that use these existing capabilities.
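To make "parsing past" extension headers concrete, the following Python sketch walks a chain of known extension headers to find the upper-layer protocol. It is a simplified illustration, not a description of any particular router: it only knows the Hop-by-Hop Options (0), Routing (43), Destination Options (60), and Fragment (44) headers of [RFC2460], and the function name and sample bytes are assumptions. It shows why a middlebox of this kind stops at the first Next Header value it does not recognize.

```python
# Extension headers whose first two bytes are Next Header and
# Hdr Ext Len, with total length (Hdr Ext Len + 1) * 8 octets.
LENGTH_PREFIXED = {0, 43, 60}  # Hop-by-Hop, Routing, Destination Options
FRAGMENT = 44                  # fixed 8-octet header


def skip_extension_headers(first_next_header, payload):
    """Return (protocol, offset) of the first header this parser cannot skip.

    first_next_header: Next Header value from the fixed IPv6 header.
    payload: bytes following the 40-octet fixed IPv6 header.
    Handling of non-first fragments is omitted in this sketch.
    """
    next_header, offset = first_next_header, 0
    while next_header in LENGTH_PREFIXED or next_header == FRAGMENT:
        if offset + 2 > len(payload):
            raise ValueError("truncated extension header")
        following = payload[offset]
        if next_header == FRAGMENT:
            length = 8
        else:
            length = (payload[offset + 1] + 1) * 8
        if offset + length > len(payload):
            raise ValueError("extension header exceeds packet")
        next_header, offset = following, offset + length
    return next_header, offset


# Hop-by-Hop (next = 60), Destination Options (next = 6, TCP), then data.
sample = bytes([60, 0] + [0] * 6) + bytes([6, 0] + [0] * 6) + b"tcp-bytes"
protocol, start = skip_extension_headers(0, sample)
```

An extension header with an unknown Next Header value would be returned as if it were the upper-layer protocol, because the parser has no way to know its length; this is the deployment problem the paragraph above describes.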

Any IPv6 header or option that has hop-by-hop behaviour, and is intended for general use in the public IPv6 Internet, could be subverted to create an attack on IPv6 routers processing packets containing such a header or option. Reports from the field indicate that some IP routers deployed within the global Internet are configured either to ignore the presence of headers with hop-by-hop behaviour or to drop packets containing headers with hop-by-hop behaviour.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. Applicability

The base IPv6 standard [RFC2460] allows the use of both extension headers and destination options in order to encode optional destination information in an IPv6 packet. The use of destination options to encode this information provides more flexible handling characteristics and better backward compatibility than using extension headers. Because of this, implementations SHOULD use destination options as the preferred mechanism for encoding optional destination information, and use a new extension header only if destination options do not satisfy their needs. The request for creation of a new IPv6 extension header MUST be accompanied by a specific explanation of why destination options could not be used to convey this information.

The base IPv6 standard [RFC2460] defines 3 extension headers (i.e. Routing Header, Destination Options Header, Hop-by-Hop Options Header) to be used for any new IPv6 options. The same standard only allows the creation of new Extension Headers in limited circumstances [RFC2460] Section 4.6.

As noted above, the use of any option with Hop-by-Hop behaviour can be problematic in the global public Internet. New IPv6 Extension Header(s) having hop-by-hop behaviour MUST NOT be created or specified. New options for the existing Hop-by-Hop Header SHOULD NOT be created or specified unless no alternative solution is feasible. Any proposal to create a new option for the existing Hop-by-Hop Header MUST include a detailed explanation of why the hop-by-hop behaviour is absolutely essential in the Internet-Draft proposing the new option with hop-by-hop behaviour.

The use of IPv6 Destination Options to encode information provides more flexible handling characteristics and better backward compatibility than using a new Extension Header. Because of this, new optional information to be sent SHOULD be encoded in a new option for the existing IPv6 Destination Options Header.

Mindful of the need for compatibility with existing IPv6 deployments, new IPv6 extension headers MUST NOT be created or specified, unless no existing IPv6 Extension Header can be used by specifying a new option for that existing IPv6 Extension Header. Any proposal to create or specify a new IPv6 Extension Header MUST include a detailed technical explanation of why no existing IPv6 Extension Header can be used in the Internet-Draft proposing the new IPv6 Extension Header.


4. Proposed IPv6 Extension Header format

Any IPv6 Extension Headers defined in the future, keeping in mind the restrictions specified in Section 3 and also the restrictions specified in [RFC2460], MUST use the consistent format defined in Figure 1. This minimizes breakage in intermediate nodes that examine these extension headers.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |  Hdr Ext Len  |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   |                                                               |
   .                                                               .
   .                     Header Specific Data                      .
   .                                                               .
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Next Header            8-bit selector. Identifies the type of header immediately following the Extension header. Uses the same values as the IPv4 Protocol field [IANA_IP_PARAM].

   Hdr Ext Len            8-bit unsigned integer. Length of the Extension header in 8-octet units, not including the first 8 octets.

   Header Specific Data   Variable length. Fields specific to the extension header.

Figure 1: Extension header layout
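
The practical benefit of the layout in Figure 1 is that a node can step over a header it does not understand using only the first two octets. The following Python sketch illustrates that arithmetic; the function name skip_extension_header and its error handling are assumptions made for illustration and are not part of this document.

```python
import struct
from typing import Tuple


def skip_extension_header(packet: bytes, offset: int) -> Tuple[int, int]:
    """Return (next_header, offset_of_following_header).

    Assumes the header at `offset` follows the Figure 1 layout.
    """
    if len(packet) < offset + 2:
        raise ValueError("truncated extension header")
    next_header, hdr_ext_len = struct.unpack_from("!BB", packet, offset)
    # Hdr Ext Len counts 8-octet units, not including the first 8 octets.
    total_len = (hdr_ext_len + 1) * 8
    if len(packet) < offset + total_len:
        raise ValueError("extension header runs past end of packet")
    return next_header, offset + total_len
```

For a header with Hdr Ext Len set to 1, the sketch advances 16 octets and reports the Next Header value found in the first octet.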

5. Backward Compatibility

The scheme proposed in this document is not intended to be backward compatible with all the currently defined IPv6 extension headers. It applies only to newly defined extension headers. Specifically, the fragment header predates this document and does not follow the format proposed in this document.

6. Future work

This document proposes one step in easing the inspection of extension headers by middleboxes. There is further work required in this area. Some issues that are left unresolved beyond this document include:

   o  There can be an arbitrary number of extension headers.

   o  Extension headers must be processed in the order they appear.

   o  Extension headers may alter the processing of the payload itself, and hence the packet may not be processed properly without knowledge of said header.

7. IANA Considerations

This document does not require any IANA actions.

8. Security Considerations

This document proposes a standard format for the IPv6 extension headers that minimizes breakage at intermediate nodes that inspect but do not understand the contents of these headers. Intermediate nodes, such as firewalls, that skip over unknown headers might end up allowing the setup of a covert channel from the outside of the firewall to the inside using the data field(s) of the unknown extension headers.

9. Acknowledgements

The authors would like to thank Albert Manfredi, Bob Hinden, Brian Carpenter, Erik Nordmark, Hemant Singh, Lars Westberg, Markku Savela, Tatuya Jinmei, Thomas Narten, Vishwas Manral, Alfred Hoenes, Joel Halpern, Ran Atkinson, Steven Blake, Jari Arkko, Kathleen Moriarty, Stephen Farrell, Ralph Droms, Sean Turner and Adrian Farrell for their reviews and suggestions that made this document better.

10. Normative References

[IANA_IP_PARAM] IANA, "IP Parameters", .

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.


Authors’ Addresses

Suresh Krishnan Ericsson 8400 Decarie Blvd. Town of Mount Royal, QC Canada

Phone: +1 514 345 7900 x42871 Email: [email protected]

james woodyatt Apple Inc. 1 Infinite Loop Cupertino, CA 95014 US

Email: [email protected]

Erik Kline Google Mori Tower 26F Roppongi 6-10-1 Minato ku Tokyo 106-6126 Japan

Phone: +81 3-6384-9635 Email: [email protected]

James Hoagland Symantec Corporation 350 Ellis St. Mountain View, CA 94043 US

Email: [email protected] URI: http://symantec.com/


Manav Bhatia Alcatel-Lucent Bangalore India

Email: [email protected]


6MAN WG E. Nordmark Internet-Draft Arista Networks Updates: 4861 (if approved) I. Gashinsky Intended status: Standards Track Yahoo! Expires: April 24, 2014 October 21, 2013

Neighbor Unreachability Detection is too impatient draft-ietf-6man-impatient-nud-07.txt

Abstract

IPv6 Neighbor Discovery includes Neighbor Unreachability Detection. That function is very useful when a host has an alternative neighbor, for instance when there are multiple default routers, since it allows the host to switch to the alternative neighbor in a short time. By default, this time is 3 seconds after the node starts probing. However, if there are no alternative neighbors, this is far too impatient. This document specifies relaxed rules for Neighbor Discovery retransmissions that allow an implementation to choose different timeout behavior based on whether or not there are alternative neighbors. This document updates RFC 4861.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 24, 2014.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Nordmark & Gashinsky Expires April 24, 2014 [Page 1] Internet-Draft NUD is too impatient October 2013

Table of Contents

   1.  Introduction
   2.  Definition Of Terms
   3.  Protocol Updates
   4.  Example Algorithm
   5.  Acknowledgements
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors’ Addresses


1. Introduction

IPv6 Neighbor Discovery [RFC4861] includes Neighbor Unreachability Detection (NUD), which detects when a neighbor is no longer reachable. The timeouts specified for NUD are very short (by default three transmissions spaced one second apart). These short timeouts can be appropriate when there are alternative neighbors to which the packets can be sent, for example if a host has multiple default routers in its Default Router List or if the host has a Neighbor Cache Entry (NCE) created by a Redirect message. In those cases, when NUD fails, the host will try the alternative neighbor by redoing next-hop selection. That implies picking the next router in the Default Router List or discarding the redirect, respectively.

The timeouts specified in [RFC4861] were chosen to be short in order to optimize for the scenarios where alternative neighbors are available.

However, when there is no alternative neighbor there are several benefits in making NUD try probing for a longer time. One of those benefits is to make NUD more robust against transient failures, such as spanning tree reconvergence and other layer 2 issues that can take many seconds to resolve. Marking the NCE as unreachable in that case causes additional multicast on the network. Assuming there are IP packets to send, the lack of an NCE will result in multicast Neighbor Solicitations being sent (to the solicited-node multicast address) every second instead of the unicast Neighbor Solicitations that NUD sends.

As a result IPv6 Neighbor Discovery is operationally more brittle than IPv4 ARP. For IPv4 there is no mandatory time limit on the retransmission behavior for ARP [RFC0826] which allows implementors to pick more robust schemes.

The following constant values in [RFC4861] seem to have been made part of IPv6 conformance testing: MAX_MULTICAST_SOLICIT, MAX_UNICAST_SOLICIT, and RETRANS_TIMER. While such strict conformance testing seems consistent with [RFC4861], it means that the standard needs to be updated to allow IPv6 Neighbor Discovery to be as robust as ARP.

This document updates RFC 4861 to relax the retransmission rules.

Additional motivations for making IPv6 Neighbor Discovery more robust in the face of degenerate conditions are covered in [RFC6583].


2. Definition Of Terms

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. Protocol Updates

Discarding the NCE after three packets spaced one second apart is only needed when an alternative neighbor is available, such as an additional default router or a redirect.

If an implementation transmits more than MAX_UNICAST_SOLICIT/MAX_MULTICAST_SOLICIT packets then it SHOULD use exponential backoff of the retransmit timer. This is to avoid any significant load due to a steady background level of retransmissions from implementations that retransmit a large number of NSes before discarding the NCE.

Even if there is no alternative neighbor, the protocol needs to be able to handle the case when the link-layer address of the neighbor/target has changed by switching to multicast Neighbor Solicitations at some point in time.

In order to capture all the cases above this document introduces a new UNREACHABLE state in the conceptual model described in [RFC4861]. An NCE in the UNREACHABLE state retains the link-layer address, and IPv6 packets continue to be sent to that link-layer address. But in the UNREACHABLE state the NUD Neighbor Solicitations are multicast (to the solicited-node multicast address), using a timeout that follows an exponential backoff.

In the places where [RFC4861] says to discard/delete the NCE after N probes (Sections 7.3 and 7.3.3 and Appendix C) this document instead specifies a transition to the UNREACHABLE state.

If the Neighbor Cache Entry was created by a redirect, a node MAY delete the NCE instead of changing its state to UNREACHABLE. In any case, the node SHOULD NOT use an NCE created by a Redirect to send packets if that NCE is in UNREACHABLE state. Packets should be sent following the next-hop selection algorithm in [RFC4861], Section 5.2, which disregards NCEs that are not reachable.

The default router selection in [RFC4861], Section 6.3.6 says to prefer default routers that are "known to be reachable". For the purposes of that section, if the NCE for the router is in UNREACHABLE state, it is not known to be reachable. Thus the particular text in section 6.3.6 which says "in any state other than INCOMPLETE" needs to be extended to say "in any state other than INCOMPLETE or UNREACHABLE".

Apart from the use of multicast NS instead of unicast NS, and the exponential backoff of the timer, the UNREACHABLE state works the same as the current PROBE state.

A node MAY garbage collect a Neighbor Cache Entry at any time as specified in RFC 4861. This freedom to garbage collect does not change with the introduction of the UNREACHABLE state in the conceptual model. An implementation MAY prefer garbage collecting UNREACHABLE NCEs over other NCEs.

There is a non-obvious extension to the state machine description in Appendix C in RFC 4861 in the case for "NA, Solicited=1, Override=0. Different link-layer address than cached". There we need to add "UNREACHABLE" to the current list of "STALE, PROBE, Or DELAY". That is, the NCE would be unchanged. Note that there is no corresponding change necessary to the text in [RFC4861], Section 7.2.5, since it is phrased using "Otherwise" instead of explicitly listing the three states.

The other state transitions described in Appendix C handle the introduction of the UNREACHABLE state without any change, since they are described using "not INCOMPLETE".

There is also the more obvious change already described above. RFC 4861 has this:

   State           Event                   Action              New state

   PROBE           Retransmit timeout,     Discard entry       -
                   N or more
                   retransmissions.

That needs to be replaced by:

   State           Event                   Action              New state

   PROBE           Retransmit timeout,     Increase timeout    UNREACHABLE
                   N retransmissions.      Send multicast NS

   UNREACHABLE     Retransmit timeout      Increase timeout    UNREACHABLE
                                           Send multicast NS
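
The replacement rows can be read as a small transition function. The following Python sketch is illustrative only: the class NeighborCacheEntry, the function on_retransmit_timeout, the way probes are counted, and the default parameter values are all assumptions chosen for the example, not definitions from this document or from [RFC4861].

```python
from dataclasses import dataclass


@dataclass
class NeighborCacheEntry:
    state: str = "PROBE"
    probes_sent: int = 1   # assumed: the NS sent on entering PROBE counts as 1
    timeout: float = 1.0   # current retransmit timeout, in seconds


def on_retransmit_timeout(nce, n=3, backoff=2.0, max_timeout=60.0):
    """Handle a retransmit timeout; return the kind of NS to send next."""
    if nce.state == "PROBE" and nce.probes_sent < n:
        # Fewer than N probes so far: keep sending unicast NS in PROBE.
        nce.probes_sent += 1
        return "unicast NS"
    if nce.state in ("PROBE", "UNREACHABLE"):
        # Updated rows: keep the entry, increase the timeout (clamped),
        # and send a multicast NS while in the UNREACHABLE state.
        nce.state = "UNREACHABLE"
        nce.timeout = min(nce.timeout * backoff, max_timeout)
        return "multicast NS"
    raise ValueError("state %s is not covered by this sketch" % nce.state)
```

Under these assumptions an entry sends two further unicast probes, then moves to UNREACHABLE and keeps probing by multicast with a growing timeout instead of being discarded.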

The exponential backoff SHOULD be clamped at some reasonable maximum retransmit timeout, such as 60 seconds (see MAX_RETRANS_TIMER below). If there is no IPv6 packet sent using the UNREACHABLE NCE, then it is RECOMMENDED to stop the retransmits of the multicast NS until either the NCE is garbage collected or there are IPv6 packets sent using the NCE. The multicast NS and associated exponential backoff can be applied on the condition of the continued use of the NCE to send IPv6 packets to the recorded link-layer address.

A node can unicast the first few Neighbor Solicitation messages even while in UNREACHABLE state, but it MUST switch to multicast Neighbor Solicitations within 60 seconds of the initial retransmission to be able to handle a link-layer address change for the target. The example below shows such behavior.

4. Example Algorithm

This section is NOT normative, but specifies a simple implementation which conforms with this document. The implementation is described using operator configurable values that allow it to be configured in a way that is compatible with the retransmission behavior in [RFC4861]. The operator can configure the values for MAX_UNICAST_SOLICIT, MAX_MULTICAST_SOLICIT, RETRANS_TIMER, and the new BACKOFF_MULTIPLE, MAX_RETRANS_TIMER and MARK_UNREACHABLE. This allows the implementation to be as simple as:

   next_retrans = ($BACKOFF_MULTIPLE ^ $solicit_retrans_num) *
                  $RetransTimer * $JitterFactor

where solicit_retrans_num is zero for the first transmission, and JitterFactor is a random value between MIN_RANDOM_FACTOR and MAX_RANDOM_FACTOR [RFC4861] to avoid any synchronization of transmissions from different hosts.

After MARK_UNREACHABLE transmissions the implementation would mark the NCE UNREACHABLE and as a result explore alternate next hops. After MAX_UNICAST_SOLICIT transmissions the implementation would switch to multicast NUD probes.

The behavior of this example algorithm is to have 5 attempts, with time spacing of 0 (initial request), 1 second later, 3 seconds after the first retransmission, then 9, then 27, and to switch to UNREACHABLE after the first three transmissions. Thus, relative to the time of the first transmission, the retransmissions would occur at 1 second, 4 seconds, 13 seconds, and finally 40 seconds. At 4 seconds from the first transmission the NCE would be marked UNREACHABLE. That behavior corresponds to:

   MAX_UNICAST_SOLICIT=5

   RETRANS_TIMER=1 (default)

   MAX_RETRANS_TIMER=60

   BACKOFF_MULTIPLE=3

   MARK_UNREACHABLE=3

After 3 transmissions the implementation would mark the NCE UNREACHABLE. That results in trying an alternative neighbor, such as another default router or ignoring a redirect as specified in [RFC4861]. With the above values that would occur 4 seconds after the first transmission, compared to the 2 seconds using the fixed scheme in [RFC4861]. That additional delay is small compared to the default 30,000 milliseconds ReachableTime.
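
The timings quoted in this section can be checked with a short calculation. The Python sketch below is a hypothetical helper (the name retransmit_schedule and its parameters are assumptions); it fixes JitterFactor at 1.0 so that the result is deterministic, whereas the algorithm described here draws a random jitter value, and it applies the MAX_RETRANS_TIMER clamp before the jitter.

```python
def retransmit_schedule(max_unicast_solicit=5, retrans_timer=1.0,
                        backoff_multiple=3, max_retrans_timer=60.0,
                        jitter=1.0):
    """Return (intervals, send_times) in seconds.

    send_times are relative to the first transmission. `jitter` stands in
    for JitterFactor and is held constant here for reproducibility.
    """
    intervals = []
    send_times = [0.0]
    for retrans_num in range(max_unicast_solicit - 1):
        # next_retrans = (BACKOFF_MULTIPLE ^ n) * RetransTimer * JitterFactor,
        # with the un-jittered part clamped at MAX_RETRANS_TIMER.
        base = (backoff_multiple ** retrans_num) * retrans_timer
        wait = min(base, max_retrans_timer) * jitter
        intervals.append(wait)
        send_times.append(send_times[-1] + wait)
    return intervals, send_times


intervals, send_times = retransmit_schedule()
# The third transmission (index MARK_UNREACHABLE - 1) is at 4 seconds.
third_transmission = send_times[3 - 1]
```

With the example values the intervals are 1, 3, 9 and 27 seconds and the transmissions fall at 0, 1, 4, 13 and 40 seconds. Setting backoff_multiple=1 and max_unicast_solicit=3 gives transmissions at 0, 1 and 2 seconds, which is the fixed [RFC4861] spacing.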

After 5 transmissions, i.e., 40 seconds after the initial transmission, the example behavior is to switch to multicast NUD probes. In the language of the state machine in [RFC4861] that corresponds to the action "Discard entry". Thus any attempts to send future packets would result in sending multicast NS packets. An implementation MAY retain the backoff value as it switches to multicast NUD probes. The potential downside of deferring switching to multicast is that it would take longer for NUD to handle a change in a link-layer address, i.e., the case when a host or a router changes its link-layer address while keeping the same IPv6 address. However, [RFC4861] says that a node MAY send unsolicited Neighbor Advertisements to handle that case, which is rather infrequent in operational networks. In any case, the implementation needs to follow the "MUST" in Section 3 to switch to multicast solicitations within 60 seconds after the initial transmission.

If BACKOFF_MULTIPLE=1, MARK_UNREACHABLE=3 and MAX_UNICAST_SOLICIT=3, you would get the same behavior as in [RFC4861].

An implementation following this algorithm would, if the request was not answered at first due for example to a transitory condition, retry immediately, and then back off for progressively longer periods. This would allow for a reasonably fast resolution time when the transitory condition clears.

Note that RetransTimer and ReachableTime are by default set from the protocol constants RETRANS_TIMER and REACHABLE_TIME, but are overridden by values advertised in Router Advertisements as specified in [RFC4861]. That remains the case even with the protocol updates specified in this document. The key values that the operator would configure are BACKOFF_MULTIPLE, MAX_RETRANS_TIMER, MAX_UNICAST_SOLICIT and MAX_MULTICAST_SOLICIT.

It is useful to have a maximum value for ($BACKOFF_MULTIPLE^$solicit_retrans_num)*$RetransTimer so that the retransmissions are not too far apart. The above value of 60 seconds for this MAX_RETRANS_TIMER is consistent with DHCPv6.

5. Acknowledgements

The comments from Thomas Narten, Philip Homburg, Joel Jaeggli, Hemant Singh, Tina Tsou, Suresh Krishnan, and Murray Kucherawy have helped improve this draft.

6. Security Considerations

Relaxing the retransmission behavior for NUD is believed to have no impact on security. In particular, it doesn’t impact the application of Secure Neighbor Discovery [RFC3971].

7. IANA Considerations

There are no IANA considerations for this document.

8. References

8.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure Neighbor Discovery (SEND)", RFC 3971, March 2005.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

8.2. Informative References

[RFC0826] Plummer, D., "Ethernet Address Resolution Protocol: Or converting network protocol addresses to 48.bit Ethernet address for transmission on Ethernet hardware", STD 37, RFC 826, November 1982.

[RFC6583] Gashinsky, I., Jaeggli, J., and W. Kumari, "Operational Neighbor Discovery Problems", RFC 6583, March 2012.


Authors’ Addresses

Erik Nordmark Arista Networks Santa Clara, CA USA

Email: [email protected]

Igor Gashinsky Yahoo! 45 W 18th St New York, NY USA

Email: [email protected]


IPv6 maintenance Working Group (6man) F. Gont Internet-Draft Huawei Technologies Updates: 2460, 5722 (if approved) March 21, 2013 Intended status: Standards Track Expires: September 22, 2013

Processing of IPv6 "atomic" fragments draft-ietf-6man-ipv6-atomic-fragments-04

Abstract

The IPv6 specification allows packets to contain a Fragment Header without the packet being actually fragmented into multiple pieces (we refer to these packets as "atomic fragments"). Such packets are typically sent by hosts that have received an ICMPv6 "Packet Too Big" error message that advertises a "Next-Hop MTU" smaller than 1280 bytes, and are currently processed by some implementations as normal "fragmented traffic" (i.e., they are "reassembled" with any other queued fragments that supposedly correspond to the same original packet). Thus, an attacker can cause hosts to employ "atomic fragments" by forging ICMPv6 "Packet Too Big" error messages, and then launch any fragmentation-based attacks against such traffic. This document discusses the generation of the aforementioned "atomic fragments" and the corresponding security implications. Additionally, this document formally updates RFC 2460 and RFC 5722 such that IPv6 atomic fragments are processed independently of any other fragments, thus completely eliminating the aforementioned attack vector.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on September 22, 2013.

Copyright Notice

Gont Expires September 22, 2013 [Page 1] Internet-Draft IPv6 atomic fragments March 2013

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Generation of IPv6 ’atomic fragments’
   4.  Updating RFC 2460 and RFC 5722
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Survey of processing of IPv6 atomic fragments by different operating systems
   Author’s Address


1. Introduction

[RFC2460] specifies the IPv6 fragmentation mechanism, which allows IPv6 packets to be fragmented into smaller pieces such that they fit in the Path-MTU to the intended destination(s). [RFC2460] allowed fragments to overlap, thus leading to ambiguity in the result of the reassembly process, which could be leveraged by attackers to bypass firewall rules and/or evade Network Intrusion Detection Systems (NIDS) [RFC5722].

[RFC5722] forbids overlapping fragments, specifying that when overlapping fragments are detected, all the fragments corresponding to that packet must be silently discarded.

As specified in Section 5 of [RFC2460], when a host receives an ICMPv6 "Packet Too Big" message advertising a "Next-Hop MTU" smaller than 1280 (the minimum IPv6 MTU), it is not required to reduce the assumed Path-MTU, but must simply include a Fragment Header in all subsequent packets sent to that destination. The resulting packets will thus *not* be actually fragmented into several pieces, but just include a Fragment Header with both the "Fragment Offset" and the "M" flag set to 0 (we refer to these packets as "atomic fragments"). IPv6/IPv4 translators employ the Fragment Identification information found in the Fragment Header to select an appropriate Fragment Identification value for the resulting IPv4 fragments.

While these packets are really "atomic fragments" (they can be processed by the IPv6 module and handed to the upper-layer protocol without waiting for any other fragments), many IPv6 implementations process them as regular fragments. Namely, they try to perform IPv6 fragment reassembly with the "atomic fragment" and any other fragments already queued with the same set {IPv6 Source Address, IPv6 Destination Address, Fragment Identification}. For example, in the case of IPv6 implementations that have been updated to support [RFC5722], if a fragment with the same {IPv6 Source Address, IPv6 Destination Address, Fragment Identification} is already queued for reassembly at a host when an "atomic fragment" is received with the same set {IPv6 Source Address, IPv6 Destination Address, Fragment Identification}, and both fragments overlap, all the fragments will be silently discarded.

Processing of IPv6 "atomic fragments" as regular fragmented packets clearly provides an unnecessary vector to perform fragmentation-based attacks against non-fragmented traffic (i.e., IPv6 datagrams that are not really split into multiple pieces, but that just include a Fragment Header).

IPv6 fragmentation attacks have been discussed in great detail in [I-D.gont-6man-predictable-fragment-id] and [CPNI-IPv6], and [RFC5722] describes a specific firewall-circumvention attack that could be performed by leveraging overlapping fragments. The possible IPv6 fragmentation-based attacks are, in most cases, "ports" of the IPv4 fragmentation attacks discussed in [RFC6274].

Section 3 describes the generation of IPv6 "atomic fragments", and how they can be "triggered" by a remote attacker. Section 4 formally updates [RFC2460] and [RFC5722] such that the aforementioned attack vector is eliminated. Appendix A contains a survey of the generation and processing of IPv6 atomic fragments in different versions of a number of popular IPv6 implementations.


2. Terminology

IPv6 atomic fragments:
   IPv6 packets that contain a Fragment Header with the Fragment Offset set to 0 and the M flag set to 0.
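
As an illustration of this definition, the check reduces to two fields of the 8-octet Fragment Header. The Python sketch below assumes the Fragment Header field layout of [RFC2460] (Next Header, Reserved, 13-bit Fragment Offset, 2 reserved bits, M flag, Identification); the function name is_atomic_fragment is hypothetical.

```python
import struct


def is_atomic_fragment(fragment_header: bytes) -> bool:
    """Return True if the Fragment Header has Fragment Offset 0 and M flag 0."""
    if len(fragment_header) < 8:
        raise ValueError("Fragment Header is 8 octets long")
    _next_header, _reserved, offset_and_flags, _ident = struct.unpack(
        "!BBHI", fragment_header[:8])
    fragment_offset = offset_and_flags >> 3   # upper 13 bits
    more_fragments = offset_and_flags & 0x1   # lowest bit is the M flag
    return fragment_offset == 0 and more_fragments == 0
```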

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].


3. Generation of IPv6 ’atomic fragments’

Section 5 of [RFC2460] states:

In response to an IPv6 packet that is sent to an IPv4 destination (i.e., a packet that undergoes translation from IPv6 to IPv4), the originating IPv6 node may receive an ICMP Packet Too Big message reporting a Next-Hop MTU less than 1280. In that case, the IPv6 node is not required to reduce the size of subsequent packets to less than 1280, but must include a Fragment header in those packets so that the IPv6-to-IPv4 translating router can obtain a suitable Identification value to use in resulting IPv4 fragments. Note that this means the payload may have to be reduced to 1232 octets (1280 minus 40 for the IPv6 header and 8 for the Fragment header), and smaller still if additional extension headers are used.

This means that any ICMPv6 "Packet Too Big" message advertising a "Next-Hop MTU" smaller than 1280 could trigger the generation of the so-called "atomic fragments" (i.e., IPv6 datagrams that include a Fragment Header, but that are composed of a single fragment, with both the "Fragment Offset" and the "M" fields of the Fragment Header set to 0). This can be leveraged to perform a variety of fragmentation-based attacks [I-D.gont-6man-predictable-fragment-id] [CPNI-IPv6].

For example, an attacker could forge IPv6 fragments with an appropriate {IPv6 Source Address, IPv6 Destination Address, Fragment Identification} tuple, such that these malicious fragments are incorrectly "reassembled" by the destination host together with some of the legitimate fragments of the original packet, thus leading to packet drops (and a potential Denial of Service).

From a security standpoint, this situation is exacerbated by the following factors:

o Many implementations fail to perform validation checks on the received ICMPv6 error messages, as recommended in Section 5.2 of [RFC4443] and documented in [RFC5927]. It should be noted that in some cases, such as when an ICMPv6 error message has (supposedly) been elicited by a connection-less transport protocol (or some other connection-less protocol being encapsulated in IPv6), it may be virtually impossible to perform validation checks on the received ICMPv6 error messages.

o Upon receipt of one of the aforementioned ICMPv6 "Packet Too Big" error messages, the Destination Cache [RFC4861] is usually updated to reflect that any subsequent packets to such destination should include a Fragment Header. This means that a single ICMPv6 "Packet Too Big" error message might affect multiple communication instances (e.g., TCP connections) with such destination.

o Some implementations employ predictable Fragment Identification values, thus greatly improving the chances of an attacker of successfully performing fragmentation-based attacks [I-D.gont-6man-predictable-fragment-id].


4. Updating RFC 2460 and RFC 5722

Section 4.5 of [RFC2460] and Section 4 of [RFC5722] are updated as follows:

A host that receives an IPv6 packet which includes a Fragment Header with the "Fragment Offset" equal to 0 and the "M" flag equal to 0 MUST process such packet in isolation from any other packets/fragments, even if such packets/fragments contain the same set {IPv6 Source Address, IPv6 Destination Address, Fragment Identification}. A received "atomic fragment" should be "reassembled" from the contents of that sole fragment.

The Unfragmentable Part of the reassembled packet consists of all headers up to, but not including, the Fragment header of the received atomic fragment.

The Next Header field of the last header of the Unfragmentable Part of the reassembled packet is obtained from the Next Header field of the Fragment header of the received atomic fragment.

The Payload Length of the reassembled packet is obtained by subtracting the length of the Fragment Header (that is, 8) from the Payload Length of the received atomic fragment.

Additionally, if any fragments with the same set {IPv6 Source Address, IPv6 Destination Address, Fragment Identification} are present in the fragment reassembly queue when the atomic fragment is received, such fragments MUST NOT be discarded upon receipt of the "colliding" IPv6 atomic fragment, since IPv6 atomic fragments MUST NOT interfere with "normal" fragmented traffic.
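The processing rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and only the standard 8-octet Fragment header layout (Next Header, Reserved, 13-bit Fragment Offset, 2 reserved bits, M flag, Identification) is assumed.

```python
import struct

FRAGMENT_HEADER_LEN = 8  # octets


def parse_fragment_header(header):
    """Return (next_header, fragment_offset, m_flag, identification)."""
    next_header, _reserved, offset_and_flags, identification = struct.unpack(
        "!BBHI", header[:FRAGMENT_HEADER_LEN])
    fragment_offset = offset_and_flags >> 3   # upper 13 bits
    m_flag = offset_and_flags & 0x1           # lowest bit
    return next_header, fragment_offset, m_flag, identification


def is_atomic_fragment(header):
    """True when both the Fragment Offset and the M flag are 0."""
    _next_header, fragment_offset, m_flag, _ident = parse_fragment_header(header)
    return fragment_offset == 0 and m_flag == 0


def reassemble_atomic(payload_length, header):
    """Return (next_header, payload_length) of the 'reassembled' packet.

    The Next Header value comes from the Fragment header, and the Payload
    Length shrinks by the 8 octets of the removed Fragment header. The
    reassembly queue is deliberately not consulted or modified.
    """
    if not is_atomic_fragment(header):
        raise ValueError("not an atomic fragment")
    next_header, _offset, _m, _ident = parse_fragment_header(header)
    return next_header, payload_length - FRAGMENT_HEADER_LEN
```

Because `reassemble_atomic` never touches the reassembly queue, fragments already queued under the same {Source Address, Destination Address, Identification} tuple are left untouched.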


5. IANA Considerations

There are no IANA registries within this document. The RFC-Editor can remove this section before publication of this document as an RFC.


6. Security Considerations

This document describes how the traditional processing of IPv6 atomic fragments enables the exploitation of fragmentation-based attacks (such as those described in [I-D.gont-6man-predictable-fragment-id] and [CPNI-IPv6]). This document formally updates [RFC2460] and [RFC5722], such that IPv6 atomic fragments are processed independently of any other fragments, thus completely eliminating the aforementioned attack vector.


7. Acknowledgements

The author would like to thank (in alphabetical order) Tore Anderson, Ran Atkinson, Remi Despres, Stephen Farrell, Brian Haberman, Timothy Hartrick, Steinar Haug, Philip Homburg, Simon Josefsson, Simon Perreault, Sean Turner, Florian Weimer, and Bjoern A. Zeeb, for providing valuable comments on earlier versions of this document. Additionally, the author would like to thank Alexander Bluhm, who implemented this specification for OpenBSD.

This document is based on the technical report "Security Assessment of the Internet Protocol version 6 (IPv6)" [CPNI-IPv6] authored by Fernando Gont on behalf of the UK Centre for the Protection of National Infrastructure (CPNI).

Fernando Gont would like to thank CPNI (http://www.cpni.gov.uk) for their continued support.


8. References

8.1. Normative References

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", RFC 4443, March 2006.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC5722] Krishnan, S., "Handling of Overlapping IPv6 Fragments", RFC 5722, December 2009.

8.2. Informative References

[RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, July 2010.

[RFC6274] Gont, F., "Security Assessment of the Internet Protocol Version 4", RFC 6274, July 2011.

[CPNI-IPv6] Gont, F., "Security Assessment of the Internet Protocol version 6 (IPv6)", UK Centre for the Protection of National Infrastructure, (available on request).

[I-D.gont-6man-predictable-fragment-id] Gont, F., "Security Implications of Predictable Fragment Identification Values", draft-gont-6man-predictable-fragment-id-03 (work in progress), January 2013.


Appendix A. Survey of processing of IPv6 atomic fragments by different operating systems

This section includes a survey of the support of IPv6 atomic fragments in popular operating systems, as tested on October 30, 2012.

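+----------------------------+------------------+-----------------+
| Operating System           | Generates atomic | Implements this |
|                            | fragments        | specification   |
+----------------------------+------------------+-----------------+
| FreeBSD 8.0                | No               | No              |
+----------------------------+------------------+-----------------+
| FreeBSD 8.2                | Yes              | No              |
+----------------------------+------------------+-----------------+
| FreeBSD 9.0                | Yes              | No              |
+----------------------------+------------------+-----------------+
| Linux 3.0.0-15             | Yes              | Yes             |
+----------------------------+------------------+-----------------+
| NetBSD 5.1                 | No               | No              |
+----------------------------+------------------+-----------------+
| NetBSD-current             | No               | Yes             |
+----------------------------+------------------+-----------------+
| OpenBSD-current            | Yes              | Yes             |
+----------------------------+------------------+-----------------+
| Solaris 11                 | Yes              | Yes             |
+----------------------------+------------------+-----------------+
| Windows XP SP2             | Yes              | No              |
+----------------------------+------------------+-----------------+
| Windows Vista (Build 6000) | Yes              | No              |
+----------------------------+------------------+-----------------+
| Windows 7 Home Premium     | Yes              | No              |
+----------------------------+------------------+-----------------+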

Table 1: Processing of IPv6 atomic fragments by different OSes

In the table above, "generates atomic fragments" notes whether an implementation generates atomic fragments in response to received ICMPv6 Packet Too Big error messages that advertise an MTU smaller than 1280 bytes.


Author’s Address

Fernando Gont
Huawei Technologies
Evaristo Carriego 2644
Haedo, Provincia de Buenos Aires 1706
Argentina

Phone: +54 11 4650 8472
Email: [email protected]


6man Working Group S. Krishnan Internet-Draft A. Kavanagh Intended status: Standards Track B. Varga Expires: February 18, 2013 Ericsson S. Ooghe Alcatel-Lucent E. Nordmark Cisco August 17, 2012

The Line Identification Destination Option draft-ietf-6man-lineid-08

Abstract

In Ethernet based aggregation networks, several subscriber premises may be logically connected to the same interface of an edge router. This document proposes a method for the edge router to identify the subscriber premises using the contents of the received Router Solicitation messages. The applicability is limited to broadband network deployment scenarios where multiple user ports are mapped to the same virtual interface on the edge router.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on February 18, 2013.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction
   1.1. Terminology
   1.2. Conventions used in this document
2. Applicability Statement
3. Issues with identifying the subscriber premises in an N:1 VLAN model
4. Basic operation
5. AN Behavior
   5.1. On initialization
   5.2. On receiving a Router Solicitation from the end-device
   5.3. On receiving a Router Advertisement from the Edge Router
      5.3.1. Identifying tunneled Router Advertisements
   5.4. On detecting a subscriber circuit coming up
   5.5. On detecting Edge Router failure
   5.6. RS Retransmission algorithm
6. Edge Router Behavior
   6.1. On receiving a Tunneled Router Solicitation from the AN
   6.2. On sending a Router Advertisement towards the end-device
   6.3. Sending periodic unsolicited Router Advertisements towards the end-device
7. Line Identification Destination Option (LIO)
   7.1. Encoding of Line ID
8. Garbage collection of unused prefixes
9. Interactions with Secure Neighbor Discovery
10. Acknowledgements
11. Security Considerations
12. IANA Considerations
13. References
   13.1. Normative References
   13.2. Informative References
Authors’ Addresses


1. Introduction

Digital Subscriber Line (DSL) is a widely deployed access technology for Broadband Access for Next Generation Networks. While traditional DSL access networks were Point-to-Point Protocol (PPP) [RFC1661] based, some networks are migrating from the traditional PPP access model into a pure IP-based Ethernet aggregated access environment. Architectural and topological models of an Ethernet aggregation network in the context of DSL aggregation are described in [TR101].

+----+   +----+    +----------+
|Host|---| RG |----|          |
+----+   +----+    |          |
                   |    AN    |\
+----+   +----+    |          | \
|Host|---| RG |----|          |  \
+----+   +----+    +----------+   \                    +----------+
                                   \                   |          |
                           +-------------+             |          |
                           | Aggregation |             |   Edge   |
                           |   Network   |-------------|  Router  |
                           +-------------+             |          |
                                   /                   |          |
                   +----------+   /                    +----------+
                   |          |  /
+----+   +----+    |          | /
|Host|---| RG |----|    AN    |/
+----+   +----+    |          |
                   |          |
                   +----------+

Figure 1: Broadband Forum Network Architecture

One of the Ethernet and Gigabit Passive Optical Network (GPON) aggregation models specified in this document bridges sessions from multiple user ports behind a DSL Access Node (AN), also referred to as a Digital Subscriber Line Access Multiplexer (DSLAM), into a single VLAN in the aggregation network. This is called the N:1 VLAN allocation model.


+----------+
|          |
|          |
|    AN    |\
|          | \
|          |  \  VLANx
+----------+   \                    +----------+
                \                   |          |
        +-------------+             |          |
        | Aggregation |    VLANx    |   Edge   |
        |   Network   |-------------|  Router  |
        +-------------+             |          |
                /                   |          |
+----------+   /                    +----------+
|          |  /  VLANx
|          | /
|    AN    |/
|          |
|          |
+----------+

Figure 2: N:1 VLAN model

1.1. Terminology

1:1 VLAN

It is a broadband network deployment scenario where each user port is mapped to a different VLAN on the Edge Router. The uniqueness of the mapping is maintained in the Access Node and across the Aggregation Network.

N:1 VLAN

It is a broadband network deployment scenario where multiple user ports are mapped to the same VLAN on the Edge Router. The user ports may be located in the same or different Access Nodes.

GPON

Gigabit-capable Passive Optical Network is an optical access network that has been introduced into the Broadband Forum architecture in [TR156].

AN

A DSL or a GPON Access Node. The Access Node terminates the physical layer (e.g. DSL termination function or GPON termination function), may physically aggregate other nodes implementing such functionality, or may perform both functions at the same time. This node contains at least one standard Ethernet interface that serves as its "northbound" interface into which it aggregates traffic from several user ports or Ethernet-based "southbound" interfaces. It does not implement an IPv6 stack but performs some limited inspection/modification of IPv6 packets. The IPv6 functions required on the Access Node are described in Section 5 of [TR177].

Aggregation Network

The part of the network stretching from the Access Nodes to the Edge Router. In the context of this document the aggregation network is considered to be Ethernet based, providing standard Ethernet interfaces at the edges, for connecting the Access Nodes and Broadband Network. It is comprised of Ethernet switches that provide very limited IP functionality (e.g. IGMP snooping, MLD snooping etc.).

RG

A residential gateway device. It can be a Layer 3 (routed) device or a Layer 2 (bridged) device. The residential gateway for Broadband Forum networks is defined in [TR124].

Edge Router

The Edge Router, also known as the Broadband Network Gateway (BNG), is the first IPv6 hop for the user. In the cases where the Residential Gateway (RG) is bridged, the BNG acts as the default router for the hosts behind the RG. In cases where the RG is routed, the BNG acts as the default router for the RG itself. This node implements IPv6 router functionality.

Host

A node that implements IPv6 host functionality.

End-device

A node that sends Router Solicitations and processes received Router Advertisements. When a Layer 3 RG is used it is considered an end-device in the context of this document. When a Layer 2 RG is used, the host behind the RG is considered to be an end-device in the context of this document.

1.2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].


2. Applicability Statement

The line identification destination option is intended to be used only for the N:1 VLAN deployment model. For the other VLAN deployment models, line identification can be achieved differently. The mechanism described in this document allows hosts that only support IPv6 stateless address autoconfiguration to attach to networks that use the N:1 VLAN deployment model.

When the Dynamic Host Configuration Protocol (DHCP) [RFC3315] is used for IPv6 address assignment it has the side-effect of including reliability initiated by the end-device (the end-device retransmits DHCP messages until it receives a response), as well as a way to detect when the end-device is not active for an extended period of time (the end-device would not renew its DHCP lease). The IPv6 Stateless address autoconfiguration protocol [RFC4862] was not designed to satisfy such requirements [RSDA]. While this option improves the reliability of operation in deployments that use Router Solicitations rather than DHCP, there are some limitations as specified below.

The mechanism described in this document deals with the loss of subscriber-originated Router Solicitations (RSes) by initiating RSes at the AN, which improves the robustness over solely relying on the end-device’s few initial retransmissions of RSes.

However, the AN retransmissions imply that some information (e.g. the subscriber’s MAC address) that was obtained by the edge router from subscriber-originated RSes may no longer be available. For example, since no L2 frame is received from the subscriber when an RS is sent by an AN, the L2 address information of the end-device cannot be determined. One piece of L2 address information currently used in some Broadband networks is the MAC address. For this reason, the solution described in this document is NOT RECOMMENDED for networks that require the MAC address of the endpoint for identification.

There is no indication when a subscriber is no longer active. Thus this protocol cannot be used to automatically reclaim resources, such as prefixes, that are associated with an active subscriber (see Section 8). For the same reason, this protocol is NOT RECOMMENDED for networks that require automatic notification when a subscriber is no longer active.

This mechanism by itself provides no protection against the loss of RS induced state in access routers that would lead to loss of IPv6 connectivity for end-devices. Given that regular IPv6 hosts do not have RS retransmission behavior that would allow automatic recovery from such a failure, this mechanism SHOULD only be used in deployments employing N:1 VLANs.

3. Issues with identifying the subscriber premises in an N:1 VLAN model

In a DSL or GPON based fixed Broadband Network, IPv6 end-devices are connected to an AN. These end-devices today will typically send a Router Solicitation Message to the Edge Router, to which the Edge Router responds with a Router Advertisement message. The Router Advertisement typically contains a prefix that the end-devices will use to automatically configure an IPv6 Address. When the end-device sends the Router Solicitation message, the node connecting the end-device on the access circuit, typically an AN, forwards the RS to the Edge Router upstream over a switched network. However, in such Ethernet-based aggregation networks, several subscriber premises may be connected to the same interface of an edge router (e.g. on the same VLAN), and the edge router requires some information to identify the end-device on the circuit. To accomplish this, the AN needs to add line identification information to the Router Solicitation message and forward this to the Edge Router. This is analogous to the case where DHCP is being used, and the line identification information is inserted by a DHCP relay agent [RFC3315]. This document proposes a method for the edge router to identify the subscriber premises using the contents of the received Router Solicitation messages. Note that there might be several end-devices that are located on the same subscriber premises.

4. Basic operation

This document uses a mechanism that tunnels Neighbor Discovery packets inside another IPv6 packet that uses a destination option (Line ID option) to convey line identification information as depicted in Figure 3. The use of the Line ID option in any other IPv6 datagrams, including untunneled RS and RA messages, is not defined by this document. The Neighbor Discovery packets are left unmodified inside the encapsulating IPv6 packet. In particular, the Hop Limit field of the Neighbor Discovery (ND) message is not decremented when the packet is being tunneled. This is because ND messages whose Hop Limit is not 255 will be discarded by the receiver of such messages, as described in Sections 6.1.1 and 6.1.2 of [RFC4861].


+----+                   +----+                  +-----------+
|Host|                   | AN |                  |Edge Router|
+----+                   +----+                  +-----------+
   |           RS           |                          |
   |----------------------->|                          |
   |                        |                          |
   |                        |Tunneled RS(LIO)          |
   |                        |------------------------->|
   |                        |                          |
   |                        |Tunneled RA(LIO)          |
   |                        |<-------------------------|
   |           RA           |                          |
   |<-----------------------|                          |
   |                        |                          |

Figure 3: Basic message flow

5. AN Behavior

5.1. On initialization

On initialization, the AN MUST join the All-BBF-Access-Nodes multicast group on all its upstream interfaces towards the Edge Router.

5.2. On receiving a Router Solicitation from the end-device

When an end-device sends out a Router Solicitation, it is received by the AN. The AN identifies these messages by looking for ICMPv6 messages (IPv6 Next Header value of 58) with ICMPv6 type 133. The AN intercepts and then tunnels the received Router Solicitation in a newly created IPv6 datagram with the Line Identification Option (LIO). The AN forms a new IPv6 datagram whose payload is the received Router Solicitation message as described in [RFC2473] except that the Hop Limit field of the Router Solicitation message MUST NOT be decremented. If the AN has an IPv6 address, it MUST use this address in the Source Address field of the outer IPv6 datagram. Otherwise, the AN MUST copy the source address from the received Router Solicitation into the Source Address field of the outer IPv6 datagram. The destination address of the outer IPv6 datagram MUST be copied from the destination address of the tunneled RS. The AN MUST include a destination options header between the outer IPv6 header and the payload. It MUST insert a LIO destination option and set the line identification field of the option to contain the circuit identifier corresponding to the logical access loop port of the AN from which the RS was initiated.
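The encapsulation steps above might be sketched as follows. This is a rough illustration under several assumptions: the function names are hypothetical, the LIO option type is passed in by the caller because this document leaves its value to IANA, and the outer Hop Limit of 255 is an arbitrary choice that the text above does not specify. The inner Router Solicitation is copied byte-for-byte, so its Hop Limit is not decremented.

```python
import ipaddress
import struct

IPPROTO_DSTOPTS = 60  # Next Header value for the Destination Options header
IPPROTO_IPV6 = 41     # Next Header value for an encapsulated IPv6 packet


def pad_options(length):
    """Return Pad1/PadN option bytes that bring `length` up to a multiple of 8."""
    gap = -length % 8
    if gap == 0:
        return b""
    if gap == 1:
        return b"\x00"                             # Pad1
    return bytes([1, gap - 2]) + bytes(gap - 2)    # PadN


def tunnel_rs(inner, line_id, lio_type, an_address=None, outer_hop_limit=255):
    """Wrap a complete inner IPv6 packet (the RS) in an outer datagram with an LIO.

    lio_type is a caller-supplied placeholder; outer_hop_limit is an assumption.
    """
    # LIO: Option Type, Option Length, LineIDLen, Line Identification.
    lio = bytes([lio_type, 1 + len(line_id), len(line_id)]) + line_id
    options = lio + pad_options(2 + len(lio))
    # Destination Options header: Next Header, Hdr Ext Len (8-octet units minus 1).
    dst_opts = bytes([IPPROTO_IPV6, (2 + len(options)) // 8 - 1]) + options
    # Source: the AN's own address if it has one, else the inner RS source.
    if an_address is not None:
        source = ipaddress.IPv6Address(an_address).packed
    else:
        source = inner[8:24]
    destination = inner[24:40]   # copied from the tunneled RS
    payload = dst_opts + inner   # inner packet is left unmodified
    outer_header = struct.pack("!IHBB", 6 << 28, len(payload),
                               IPPROTO_DSTOPTS, outer_hop_limit)
    return outer_header + source + destination + payload
```

Passing `an_address=None` models an AN without an IPv6 address of its own, in which case the source address of the received RS is reused.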


5.3. On receiving a Router Advertisement from the Edge Router

When the edge router sends out a tunneled Router Advertisement in response to the RS, it is received by the AN. If there is an LIO option present, the AN MUST use the line identification data of the LIO option to identify the subscriber agent circuit of the AN on which the RA should be sent. The AN MUST then remove the outer IPv6 header of this tunneled RA and multicast the inner packet (the original RA) on this specific subscriber circuit.

5.3.1. Identifying tunneled Router Advertisements

The AN can identify tunneled RAs by installing filters based on the destination address of the outer packets (All-BBF-Access-Nodes, which is a reserved link-local scoped multicast address) and the presence of a destination option header with an LIO destination option.

5.4. On detecting a subscriber circuit coming up

RSes initiated by end-devices as described in Section 5.2 may be lost due to lack of connectivity between the AN and the end-device. To ensure that the end-device will receive an RA, the AN needs to trigger the sending of periodic RAs on the edge router. For this purpose, the AN needs to inform the edge router that a subscriber circuit has come up. Each time the AN detects that a subscriber circuit has come up, it MUST create a Router Solicitation message as described in Section 6.3.7 of [RFC4861]. It MUST use the unspecified address as the source address of this RS. It MUST then tunnel this RS towards the edge router as described in Section 5.2.

In case there are connectivity issues between the AN and the edge router, the RSes initiated by the AN can be lost. The AN SHOULD continue retransmitting the Router Solicitations following the algorithm described in Section 5.6 for a given LIO until it receives an RA for that specific LIO.

5.5. On detecting Edge Router failure

When the edge router reboots and loses state or is replaced by a new edge router, the AN will detect it using connectivity check mechanisms that are already in place in Broadband networks (e.g. BFD). When such edge router failure is detected, the AN needs to start transmitting RSes for each of its subscriber circuits that are up as described in Section 5.4.


5.6. RS Retransmission algorithm

The AN SHOULD use the exponential backoff algorithm for retransmits that is described in Section 14 of [RFC3315] in order to continuously retransmit the Router Solicitations for a given LIO until a response is received for that specific LIO. The AN SHOULD use the following variables as input to the retransmission algorithm:

IRT     1 Second
MRT     30 Seconds
MRC     0
MRD     0
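As a rough illustration, the retransmission timeouts produced by the algorithm in Section 14 of [RFC3315] with these parameters could be computed as below. The function and parameter names are hypothetical; RAND is drawn uniformly from -0.1 to +0.1 as in [RFC3315], and because MRC and MRD are both 0 the sequence has no built-in end.

```python
import itertools
import random


def rs_retransmission_timeouts(irt=1.0, mrt=30.0,
                               rand=lambda: random.uniform(-0.1, 0.1)):
    """Yield successive retransmission timeouts RT in seconds.

    MRC = 0 and MRD = 0 mean there is no retransmission count or duration
    limit, so this generator never terminates on its own; the caller stops
    iterating once an RA for the LIO arrives.
    """
    rt = irt + rand() * irt              # first transmission
    while True:
        yield rt
        rt = 2 * rt + rand() * rt        # doubled on each retransmission
        if mrt and rt > mrt:
            rt = mrt + rand() * mrt      # capped around MRT


def first_timeouts(count, **kwargs):
    """Convenience helper: the first `count` timeouts as a list."""
    return list(itertools.islice(rs_retransmission_timeouts(**kwargs), count))
```

With the randomization factor forced to zero, the timeouts double from 1 second until they are capped at 30 seconds.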

6. Edge Router Behavior

6.1. On receiving a Tunneled Router Solicitation from the AN

When the edge router receives a tunneled Router Solicitation forwarded by the AN, it needs to check if there is an LIO destination option present in the outer datagram. The edge router can use the contents of the line identification field to look up the addressing information and policy that need to be applied to the line from which the Router Solicitation was received. The edge router MUST then process the inner RS message as specified in [RFC4861].

6.2. On sending a Router Advertisement towards the end-device

When the edge router sends out a Router Advertisement in response to a tunneled RS that included an LIO option, it MUST tunnel the Router Advertisement in a newly created IPv6 datagram with the Line Identification Option (LIO) as described below. First, the edge router creates the Router Advertisement message as described in Section 6.2.3 of [RFC4861]. The edge router MUST include a Prefix Information Option in this RA that contains the prefix that corresponds to the received LIO. (The LIO from the received tunneled RS is usually passed on from the edge router to some form of provisioning system that returns the prefix to be included in the RA. It could, e.g., be based on RADIUS.) Then, the edge router forms the new IPv6 datagram, whose payload is the Router Advertisement message, as described in [RFC2473] except that the Hop Limit field of the Router Advertisement message MUST NOT be decremented. The edge router MUST use a link-local IPv6 address on the outgoing interface in the Source Address field of the outer IPv6 datagram. The edge router MUST include a destination options header between the outer IPv6 header and the payload. It MUST insert a LIO destination option and set the line identification field of the option to contain the same value as that of the Line ID option in the received RS. The IPv6 destination address of the inner RA MUST be set to the all-nodes multicast address.

If the Source Address field of the received IPv6 datagram was not the unspecified address, the edge router MUST copy this address into the Destination Address field of the outer IPv6 datagram sent back towards the AN. The link-layer destination address of the outer IPv6 datagram MUST be resolved using regular Neighbor Discovery procedures.

If the Source Address field of the received IPv6 datagram was the unspecified address, the destination address of the outer IPv6 datagram MUST be set to the well-known link-local scope All-BBF-Access-Nodes multicast address [to be allocated]. The link-layer destination address of the tunneled RA MUST be set to the unicast link-layer address of the AN that sent the tunneled Router Solicitation which is being responded to.
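The choice of outer destination address described in the two paragraphs above can be summarized in a short sketch. The function name is hypothetical, and the All-BBF-Access-Nodes address has to be supplied by the caller because this document has not yet been allocated one.

```python
import ipaddress


def outer_destination(received_source, all_bbf_access_nodes):
    """Pick the Destination Address for the outer datagram of a tunneled RA.

    received_source: Source Address of the received tunneled RS.
    all_bbf_access_nodes: the multicast address, supplied by the caller
    because it is still to be allocated.
    """
    source = ipaddress.IPv6Address(received_source)
    if source.is_unspecified:
        # The AN has no address of its own: use the multicast group.
        return ipaddress.IPv6Address(all_bbf_access_nodes)
    # Otherwise reply to the address the tunneled RS came from.
    return source
```

The link-layer handling (Neighbor Discovery resolution in the unicast case, the AN's unicast link-layer address in the multicast case) is outside the scope of this sketch.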

The edge router MUST ensure that it does not transmit tunneled RAs whose size is larger than the MTU of the link between the edge router and the AN, which would require that the outer IPv6 datagram undergo fragmentation. This limitation is imposed because the AN may not be capable of handling the reassembly of such fragmented datagrams.

6.3. Sending periodic unsolicited Router Advertisements towards the end-device

After sending a tunneled Router Advertisement as specified in Section 6.2 in response to a received RS, the edge router MUST store the mapping between the LIO and the prefixes contained in the Router Advertisement. It should then initiate periodic sending of unsolicited Router Advertisements as described in Section 6.2.3 of [RFC4861]. The Router Advertisements MUST be created and tunneled as described in Section 6.2. The edge router MAY stop sending Router Advertisements if it receives a notification from the AN that the subscriber circuit has gone down. This notification can be received out-of-band using a mechanism such as ANCP. Please consult Section 8 for more details.

7. Line Identification Destination Option (LIO)

The Line Identification Destination Option (LIO) is a destination option that can be included in IPv6 datagrams that tunnel Router Solicitation and Router Advertisement messages. The use of the Line ID option in any other IPv6 datagrams is not defined by this document. Multiple Line Identification destination options MUST NOT be present in the same IPv6 datagram. The LIO has no alignment requirement.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                |  Option Type  | Option Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| LineIDLen     |     Line Identification...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4: Line Identification Destination Option Layout

Option Type

8-bit identifier of the type of option. The option identifier for the line identification option will be allocated by the IANA.

Option Length

8-bit unsigned integer. The length of the option (excluding the Option Type and Option Length fields). The value MUST be greater than 0.

LineIDLen

8-bit unsigned integer. The length of the Line Identification field in number of octets.

Line Identification

Variable length data inserted by the AN describing the subscriber agent circuit identifier corresponding to the logical access loop port of the AN from which the RS was initiated. The line identification MUST be unique across all the ANs that share a link to the edge router. One such line identification scheme is described in Section 3.9 of [TR101]. The line identification should be encoded as specified in Section 7.1.
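The field layout above might be encoded and decoded as in the following sketch. The function names are hypothetical, and the option type is a caller-supplied placeholder since its value is to be allocated by IANA.

```python
def encode_lio(option_type, line_id):
    """Serialize an LIO: Option Type, Option Length, LineIDLen, Line Identification.

    option_type is a placeholder chosen by the caller.
    """
    if len(line_id) > 254:
        # Option Length is an 8-bit field and also counts the LineIDLen octet.
        raise ValueError("Line Identification too long")
    option_length = 1 + len(line_id)   # excludes Option Type and Option Length
    return bytes([option_type, option_length, len(line_id)]) + line_id


def decode_lio(data):
    """Return (option_type, line_id) parsed from a serialized LIO."""
    if len(data) < 3:
        raise ValueError("truncated option")
    option_type, option_length, line_id_len = data[0], data[1], data[2]
    if option_length == 0 or option_length > len(data) - 2:
        raise ValueError("bad Option Length")
    if line_id_len > option_length - 1:
        raise ValueError("LineIDLen exceeds option data")
    return option_type, data[3:3 + line_id_len]
```

Since the LineIDLen octet is always present, the Option Length produced by `encode_lio` is at least 1, which is consistent with the requirement that it be greater than 0.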

7.1. Encoding of Line ID

This IPv6 Destination Option is derived from an existing widely deployed DHCPv6 Option [RFC4649], which is in turn derived from a widely deployed DHCPv4 Option [RFC3046]. Both of those derive from and cite the basic DHCP options specification [RFC2132]. Those widely deployed DHCP options use the NVT character set [RFC2132][RFC0020]. Since the data carried in the Line ID option is used in the same manner by the provisioning systems as the DHCP options, it is beneficial for it to maintain the same encoding as the DHCP options.

The IPv6 Line ID option contains a description which identifies the line, using only character positions (decimal 32 to decimal 126, inclusive) of the US-ASCII character set [X3.4], [RFC0020]. Consistent with [RFC2132], [RFC3046] and [RFC4649], the Line ID field SHOULD NOT contain the US-ASCII NUL character (decimal 0). However, implementations receiving this option MUST NOT fail merely because an ASCII NUL character is (erroneously) present in the Line ID option’s data field.

Some existing widely deployed implementations of edge routers and ANs that support the previously mentioned DHCP option only support US-ASCII, and strip the high-order bit from any 8-bit characters entered by the device operator. The previously mentioned DHCP options do not support 8-bit character sets either. Therefore, for compatibility with the installed base and to maximise interoperability, the high-order bit of each octet in this field MUST be set to zero by any device inserting this option in an IPv6 packet.

Consistent with [RFC3046] and [RFC4649], this option always uses binary comparison. Therefore, two Line IDs MUST be treated as equal when they match byte-by-byte. Line-ID A and Line-ID B match byte-by-byte when (1) A and B have the same number of bytes and (2) for all byte indexes P in A: the value of A at index P has the same binary value as the value of B at index P.

Two Line IDs MUST NOT be equal if they do not match byte-by-byte. For example, an IPv6 Line ID option containing "f123" is not equal to a Line ID option "F123".

Intermediate systems MUST NOT alter the contents of the Line ID.
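The encoding and comparison rules of this section can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of this specification; the sketch only restates the character range (decimal 32 to 126), the cleared high-order bit, and the byte-by-byte comparison described above.

```python
def is_valid_line_id(data: bytes) -> bool:
    """True if every octet is a US-ASCII position from 32 to 126 inclusive."""
    return all(32 <= octet <= 126 for octet in data)


def clear_high_bits(data: bytes) -> bytes:
    """Set the high-order bit of each octet to zero before insertion."""
    return bytes(octet & 0x7F for octet in data)


def line_ids_equal(a: bytes, b: bytes) -> bool:
    """Binary comparison: same number of bytes and same value at every index."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))
```

Under these assumptions, "f123" and "F123" compare as different Line IDs because the comparison is case-sensitive. A receiver would use is_valid_line_id() only for diagnostics, since a stray NUL octet must not cause processing to fail.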

8. Garbage collection of unused prefixes

Following the mechanism described in this document, the Broadband network associates a prefix to a subscriber line based on the LIO. Even when the subscriber line goes down temporarily, this prefix stays allocated to that specific subscriber line. That is, the prefix is not returned to the unused pool. When a subscriber line no longer needs a prefix, the prefix can be reclaimed by manual action dissociating the prefix from the LIO in the backend systems.

9. Interactions with Secure Neighbor Discovery

Since SEND-protected [RFC3971] RS/RA packets are not modified in any way by the mechanism described in this document, there are no issues with SEND verification.

10. Acknowledgements

The authors would like to thank Margaret Wasserman, Mark Townsley, David Miles, John Kaippallimalil, Eric Levy-Abegnoli, Thomas Narten, Olaf Bonness, Thomas Haag, Wojciech Dec, Brian Haberman, Ole Troan, Hemant Singh, Jari Arkko, Joel Halpern, Bob Hinden, Ran Atkinson, Glen Turner, Kathleen Moriarty, David Sinicrope, Dan Harkins, Stephen Farrell, Barry Leiba, Sean Turner, Ralph Droms, and Mohammed Boucadair for reviewing this document and suggesting changes.

11. Security Considerations

The line identification information inserted by the AN or the edge router is not protected. This means that this option may be modified, inserted, or deleted without being detected. In order to ensure validity of the contents of the line identification field, the network between the AN and the edge router needs to be trusted.

12. IANA Considerations

This document defines a new IPv6 destination option for carrying line identification. IANA is requested to assign a new destination option type in the Destination Options registry maintained at

http://www.iana.org/assignments/ipv6-parameters

Line Identification Option [RFCXXXX]

The act bits for this option need to be 10 and the chg bit needs to be 0.
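As an illustration of this request, the following Python sketch splits an option type octet into its fields, relying on the general IPv6 option encoding in which the two highest-order bits are the act bits and the third-highest-order bit is the chg bit. The function names and the example value of the low-order five bits are hypothetical; the actual option type is whatever IANA assigns.

```python
from typing import Tuple


def split_option_type(option_type: int) -> Tuple[int, int, int]:
    """Return (act, chg, rest) for an 8-bit IPv6 option type."""
    if not 0 <= option_type <= 0xFF:
        raise ValueError("option type must fit in one octet")
    act = option_type >> 6           # two highest-order bits
    chg = (option_type >> 5) & 0x1   # third-highest-order bit
    rest = option_type & 0x1F        # remaining five bits
    return act, chg, rest


def make_option_type(act: int, chg: int, rest: int) -> int:
    """Compose an option type octet from its three fields."""
    return ((act & 0x3) << 6) | ((chg & 0x1) << 5) | (rest & 0x1F)


def matches_request(option_type: int) -> bool:
    """True if act == 10 (binary) and chg == 0, as requested above."""
    act, chg, _ = split_option_type(option_type)
    return act == 0b10 and chg == 0


# Hypothetical placeholder for the low-order bits, not an assigned value.
EXAMPLE_TYPE = make_option_type(0b10, 0, 0b00001)
```

With act set to 10, a node that does not recognize the option discards the packet and sends an ICMP Parameter Problem message; with chg set to 0, the option data does not change en route.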

This document also requires the allocation of a well-known link-local scope multicast address from the IPv6 Multicast Address Space Registry located at

http://www.iana.org/assignments/ipv6-multicast-addresses/ipv6-multicast-addresses.xml

All-BBF-Access-Nodes [RFCXXXX]

13. References

13.1. Normative References

[RFC1661] Simpson, W., "The Point-to-Point Protocol (PPP)", STD 51, RFC 1661, July 1994.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in IPv6 Specification", RFC 2473, December 1998.

[RFC3315] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M. Carney, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)", RFC 3315, July 2003.

[RFC3971] Arkko, J., Kempf, J., Zill, B., and P. Nikander, "SEcure Neighbor Discovery (SEND)", RFC 3971, March 2005.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

[TR101] Broadband Forum, "Migration to Ethernet-based DSL aggregation", .

[TR124] Broadband Forum, "Functional Requirements for Broadband Residential Gateway Devices", .

[TR156] Broadband Forum, "Using GPON Access in the context of TR- 101", .

[TR177] Broadband Forum, "IPv6 in the context of TR-101", .

[X3.4] American National Standards Institute, "American Standard Code for Information Interchange (ASCII)", Standard X3.4 , 1968.

13.2. Informative References

[RFC0020] Cerf, V., "ASCII format for network interchange", RFC 20, October 1969.

[RFC2132] Alexander, S. and R. Droms, "DHCP Options and BOOTP Vendor Extensions", RFC 2132, March 1997.

[RFC3046] Patrick, M., "DHCP Relay Agent Information Option", RFC 3046, January 2001.

[RFC4649] Volz, B., "Dynamic Host Configuration Protocol for IPv6 (DHCPv6) Relay Agent Remote-ID Option", RFC 4649, August 2006.

[RSDA] Dec, W., "IPv6 Router Solicitation Driven Access Considered Harmful", draft-dec-6man-rs-access-harmful-00 (work in progress), June 2011.

Authors’ Addresses

Suresh Krishnan Ericsson 8400 Blvd Decarie Town of Mount Royal, Quebec Canada

Email: [email protected]

Alan Kavanagh Ericsson 8400 Blvd Decarie Town of Mount Royal, Quebec Canada

Email: [email protected]

Balazs Varga Ericsson Konyves Kalman krt. 11. B. 1097 Budapest Hungary

Email: [email protected]

Sven Ooghe Alcatel-Lucent Copernicuslaan 50 2018 Antwerp, Belgium

Phone: Email: [email protected]

Erik Nordmark Cisco 510 McCarthy Blvd. Milpitas, CA, 95035 USA

Phone: +1 408 527 6625 Email: [email protected]

Network Working Group A. Matsumoto Internet-Draft J. Kato Intended status: Standards Track T. Fujisaki Expires: April 3, 2012 NTT T. Chown University of Southampton October 2011

Update to RFC 3484 Default Address Selection for IPv6 draft-ietf-6man-rfc3484-revise-05.txt

Abstract

RFC 3484 describes algorithms for source address selection and for destination address selection. The algorithms specify default behavior for all Internet Protocol version 6 (IPv6) implementations. This document specifies a set of updates that modify the algorithms and fix the known defects.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 3, 2012.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Matsumoto, et al. Expires April 3, 2012 [Page 1] Internet-Draft RFC3484 Bis October 2011

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

   1.  Introduction
   2.  Specification
     2.1.  Changes related to the default policy table
       2.1.1.  ULA in the policy table
       2.1.2.  Teredo in the policy table
       2.1.3.  6to4, Teredo, and IPv4 prioritization
       2.1.4.  Deprecated addresses in the policy table
       2.1.5.  Renewed default policy table
     2.2.  The longest matching rule
     2.3.  Utilize next-hop for source address selection
     2.4.  Private IPv4 address scope
     2.5.  Deprecation of site-local unicast address
     2.6.  Anycast addresses for candidate source addresses
   3.  Security Considerations
   4.  IANA Considerations
   5.  References
     5.1.  Normative References
     5.2.  Informative References
   Appendix A.  Acknowledgements
   Appendix B.  Past Discussion
     B.1.  The longest match rule
     B.2.  NAT64 prefix issue
     B.3.  ISATAP issue
   Appendix C.  Revision History
   Authors’ Addresses

1. Introduction

The IPv6 addressing architecture [RFC4291] allows multiple unicast addresses to be assigned to interfaces. Because of this, IPv6 implementations need to handle multiple possible source and destination addresses when initiating communication. RFC 3484 [RFC3484] specifies the default algorithms, common across all implementations, for selecting source and destination addresses so that it is easier to predict the address selection behavior.

Since RFC 3484 was specified, some issues have been identified with the algorithms specified there. The issues include the longest match algorithm used in Rule 9 of destination address selection breaking DNS round-robin techniques, and prioritization of poor IPv6 connectivity using transition mechanisms over native IPv4 connectivity.

There have also been some significant changes to the IPv6 addressing architecture that require changes in the RFC 3484 policy table. Such changes include the deprecation of site-local unicast addresses [RFC3879] and of IPv4-compatible IPv6 addresses, and the introduction of Unique Local Addresses [RFC4193].

This document specifies a set of updates that modify the algorithms and fix the known defects.

2. Specification

2.1. Changes related to the default policy table

The default policy table is defined in RFC 3484 Section 2.1 as follows:

      Prefix        Precedence Label
      ::1/128               50     0
      ::/0                  40     1
      2002::/16             30     2
      ::/96                 20     3
      ::ffff:0:0/96         10     4

Changes to the default policy table should be limited to rules that are universally useful and do no harm in any reasonable network environment. The changes we should consider for the default policy table are listed in this subsection.

The policy table is defined to be configurable. The changes that are useful locally but not universally can be put into the policy table manually or by using the policy distribution mechanism proposed as a DHCP option [I-D.ietf-6man-addr-select-opt].

2.1.1. ULA in the policy table

RFC 5220 [RFC5220] sections 2.1.4, 2.2.2, and 2.2.3 describe address selection problems related to ULAs [RFC4193]. These problems can be solved by either changing the scope of ULAs to site-local, or by adding an entry to the default policy table that has its own label for ULAs.

Centrally assigned ULAs [I-D.ietf-ipv6-ula-central] have been proposed, and are assigned fc00::/8. Using different labels for fc00::/8 and fd00::/8 makes sense only if we assume that the same kind of address block is assigned within the same or adjacent networks. However, we cannot expect any general relationship between the type of ULA address block and network adjacency.

Regarding the scope of ULAs, ULAs have been specified with a global scope because the reachability of ULAs was intended to be restricted by the routing system. Since the ULAs will not be exposed outside of their reachability domain, if a ULA is available as a candidate destination address, it can be expected to be reachable.

If we change the scope of ULAs to be smaller than global, we can prioritize ULA to ULA communication over GUA to GUA communication. At the same time, however, finer-grained configuration of ULA address selection will be impossible. For example, even if you want to prioritize communication related to the only /48 ULA prefix used in your site, and do not want to prioritize communication to any other ULA prefix, such a policy cannot be implemented in the policy table. So, this draft proposes the use of the policy table to differentiate ULAs from GUAs.

2.1.2. Teredo in the policy table

Teredo [RFC4380] is defined and has been assigned 2001::/32. This address block should be assigned its own label in the policy table. Teredo’s priority should be less than or equal to 6to4, considering its characteristic of being a transitional tunnel mechanism.

2.1.3. 6to4, Teredo, and IPv4 prioritization

Regarding the prioritization between IPv4 and these transitional mechanisms, their connectivity is known to usually be worse than IPv4. These mechanisms are said to be the last resort access method to IPv6 resources. 6to4 should have higher precedence than Teredo, given that 6to4 host to 6to4 host communication can be over IPv4 (which can result in a more optimal path) and that 6to4 should not be used behind a NAT device.

2.1.4. Deprecated addresses in the policy table

IPv4-compatible IPv6 addresses (::/96) are deprecated [RFC4291]. IPv6 site-local unicast addresses (fec0::/10) are deprecated [RFC3879]. 6bone testing addresses [RFC3701] have also been phased out.

These addresses were removed from the current specification. Considering the inappropriate use of these address blocks, especially in outdated implementations, and the bad effects brought by them, they should be labeled differently from the legitimate address blocks as long as the address block is reserved by IANA.

2.1.5. Renewed default policy table

After applying these updates, the default policy table will be:

      Prefix        Precedence Label
      ::1/128               60     0
      fc00::/7              50     1
      ::/0                  40     2
      ::ffff:0:0/96         30     3
      2002::/16             20     4
      2001::/32             10     5
      ::/96                  1    10
      fec0::/10              1    11
      3ffe::/16              1    12
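To make the effect of the renewed table concrete, the following Python sketch performs the longest-matching-prefix lookup that RFC 3484 defines for the policy table. It uses the standard ipaddress module; the names POLICY_TABLE and lookup are hypothetical and chosen only for this illustration.

```python
import ipaddress
from typing import Tuple

# (prefix, precedence, label) rows of the renewed default policy table.
POLICY_TABLE = [
    ("::1/128", 60, 0),
    ("fc00::/7", 50, 1),
    ("::/0", 40, 2),
    ("::ffff:0:0/96", 30, 3),
    ("2002::/16", 20, 4),
    ("2001::/32", 10, 5),
    ("::/96", 1, 10),
    ("fec0::/10", 1, 11),
    ("3ffe::/16", 1, 12),
]

_PARSED = [(ipaddress.IPv6Network(p), prec, label)
           for p, prec, label in POLICY_TABLE]


def lookup(address: str) -> Tuple[int, int]:
    """Return (precedence, label) of the longest prefix matching address."""
    addr = ipaddress.IPv6Address(address)
    # ::/0 matches everything, so the list of candidates is never empty.
    candidates = [row for row in _PARSED if addr in row[0]]
    _, precedence, label = max(candidates, key=lambda row: row[0].prefixlen)
    return precedence, label
```

For example, a ULA such as fd12:3456::1 maps to precedence 50 and label 1, while a 6to4 address under 2002::/16 maps to precedence 20 and label 4, below the IPv4-mapped entry that represents native IPv4.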

2.2. The longest matching rule

This issue, found by Dave Thaler, is related to the longest matching rule. It causes a malfunction of the DNS round robin technique, as described below, and is common to both IPv4 and IPv6.

When destination addresses DA and DB and the source address Source(DA) are on the same subnet, and Source(DA) == Source(DB), DNS round robin load-balancing cannot function. By considering prefix lengths that are longer than the subnet prefix, this rule establishes preference between addresses that have no substantive differences between them. The rule functions as an arbitrary tie-breaker between the hosts in a round robin, causing a given host to always prefer a given member of the round robin.

By limiting the calculation of common prefixes to a maximum length equal to the length of the subnet prefix of the source address, rule 9 can continue to favor hosts that are nearby in the network hierarchy without arbitrarily sorting addresses within a given network. This modification could be written as follows:

Rule 9: Use longest matching prefix.

   When DA and DB belong to the same address family (both are IPv6 or both are IPv4):

      If CommonPrefixLen(DA & Netmask(Source(DA)), Source(DA)) >
      CommonPrefixLen(DB & Netmask(Source(DB)), Source(DB)), then prefer DA.

      Similarly, if CommonPrefixLen(DA & Netmask(Source(DA)), Source(DA)) <
      CommonPrefixLen(DB & Netmask(Source(DB)), Source(DB)), then prefer DB.
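A literal transcription of this modified rule into Python, for the IPv6 case only, might look as follows. This is a sketch under stated assumptions: the function names are hypothetical, source addresses are passed together with their subnet prefix length (for example "2001:db8:1:2::10/64"), and Netmask() is taken to be the netmask of the source address's subnet.

```python
import ipaddress
from typing import Optional


def common_prefix_len(a: int, b: int, bits: int = 128) -> int:
    """Number of leading bits shared by two integers that are `bits` wide."""
    return bits - (a ^ b).bit_length()


def masked_cpl(dest: str, source: str) -> int:
    """CommonPrefixLen(D & Netmask(Source(D)), Source(D)) for IPv6."""
    src = ipaddress.IPv6Interface(source)
    masked_dest = int(ipaddress.IPv6Address(dest)) & int(src.netmask)
    return common_prefix_len(masked_dest, int(src.ip))


def rule_9(da: str, source_da: str, db: str, source_db: str) -> Optional[str]:
    """Return the preferred destination, or None when rule 9 does not decide."""
    cpl_a = masked_cpl(da, source_da)
    cpl_b = masked_cpl(db, source_db)
    if cpl_a > cpl_b:
        return da
    if cpl_a < cpl_b:
        return db
    return None
```

With a source of 2001:db8:1:2::10/64, the destinations 2001:db8:1:2::a and 2001:db8:1:2::ffff tie under this rule, so the order returned by the DNS is preserved. Without the mask, the first would share 123 leading bits with the source and the second 112, so the first would always be preferred.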

2.3. Utilize next-hop for source address selection

RFC 3484 source address selection rule 5 says that the address that is attached to the outgoing interface should be preferred as the source address. This rule is reasonable considering the prevalence of ingress filtering described in BCP 38 [RFC2827]. This is because an upstream network provider usually assumes that packets received from its customer carry only the delegated addresses as source addresses.

This rule, however, is not effective in an environment such as that described in RFC 5220 Section 2.1.1, where a host has multiple upstream routers on the same link and has addresses delegated from each upstream router on a single interface.

Also, unlike SLAAC-assigned addresses, DHCPv6-assigned addresses are not associated with a next-hop gateway, so implementations usually cannot apply this heuristic in a DHCPv6 network.

So, a new rule 5.1 that utilizes next-hop information for source address selection is inserted just after rule 5.

Rule 5.1: Use an address assigned by the selected next-hop.

If SA is assigned by the selected next-hop that will be used to send to D and SB is assigned by a different next-hop, then prefer SA. Similarly, if SB is assigned by the next-hop that will be used to send to D and SA is assigned by a different next-hop, then prefer SB.
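Rule 5.1 can be sketched in Python as a comparison of two candidate source addresses. The mapping from each address to the next-hop that assigned it is a hypothetical data structure introduced for this illustration; how an implementation records that association is not specified here. The sketch also assumes that an address with no recorded next-hop leaves the rule undecided.

```python
from typing import Dict, Optional


def rule_5_1(sa: str, sb: str, selected_next_hop: str,
             assigned_by: Dict[str, str]) -> Optional[str]:
    """Return the preferred source address, or None if rule 5.1 is a tie.

    assigned_by maps a source address to the next-hop that assigned it
    (hypothetical bookkeeping, e.g. learned from Router Advertisements).
    """
    def from_selected(addr: str) -> bool:
        return assigned_by.get(addr) == selected_next_hop

    def from_other(addr: str) -> bool:
        next_hop = assigned_by.get(addr)
        return next_hop is not None and next_hop != selected_next_hop

    if from_selected(sa) and from_other(sb):
        return sa
    if from_selected(sb) and from_other(sa):
        return sb
    return None
```

When the rule returns no preference, selection would continue with the rules that follow rule 5.1.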

2.4. Private IPv4 address scope

When a packet goes through a NAT, its source or destination address can get replaced with another address with a different scope. It follows that the result of the source address selection algorithm may be different when the original address is replaced with the NATed address.

The algorithm currently specified in RFC 3484 is based on the assumption that a source address with a small scope cannot reach a destination address with a larger scope. This assumption does not hold if private IPv4 addresses and a NAT are used to reach public IPv4 addresses.

Due to this assumption, in the presence of both a NATed private IPv4 address and a transitional address (like 6to4 or Teredo), the host will choose the transitional IPv6 address to access dual-stack peers [I-D.denis-v6ops-nat-addrsel]. Choosing transitional IPv6 connectivity over native IPv4 connectivity, particularly where the transitional connectivity is unmanaged, is not considered to be generally desirable.

This issue can be fixed by changing the address scope of private IPv4 addresses to global.
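The proposed change can be sketched as a scope function for IPv4 addresses. The sketch assumes the RFC 3484 treatment of 169.254/16 and 127/8 as link-local scope and of the private prefixes 10/8, 172.16/12, and 192.168/16 as site-local scope; the function name, the flag, and the use of multicast scope values as numeric constants are choices made for this illustration.

```python
import ipaddress

# Numeric scope values, borrowed from the IPv6 multicast scope field.
LINK_LOCAL, SITE_LOCAL, GLOBAL = 0x2, 0x5, 0xE

_LINK_LOCAL_NETS = [ipaddress.IPv4Network(n)
                    for n in ("169.254.0.0/16", "127.0.0.0/8")]
_PRIVATE_NETS = [ipaddress.IPv4Network(n)
                 for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def ipv4_scope(address: str, private_is_global: bool = True) -> int:
    """Scope of an IPv4 address for address selection purposes.

    private_is_global=True models the change proposed in this section;
    False models the original RFC 3484 behavior.
    """
    addr = ipaddress.IPv4Address(address)
    if any(addr in net for net in _LINK_LOCAL_NETS):
        return LINK_LOCAL
    if any(addr in net for net in _PRIVATE_NETS):
        return GLOBAL if private_is_global else SITE_LOCAL
    return GLOBAL
```

With the flag set, a host holding 192.168.1.10 behind a NAT sees its IPv4 source address as matching the scope of a public IPv4 destination, so the scope comparison no longer pushes it toward a 6to4 or Teredo address.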

2.5. Deprecation of site-local unicast address

RFC 3484 contains a few "site-local unicast" and "fec0::" descriptions. It’s better to remove examples related to site-local unicast addresses, or change the examples to use ULAs. Possible points to be re-written are listed below.

   - 2nd paragraph in RFC 3484 Section 3.1 describes the scope comparison mechanism.

   - RFC 3484 Section 10 contains examples for site-local addresses.

2.6. Anycast addresses for candidate source addresses

RFC 3484 Section 4 states that anycast addresses, as well as multicast addresses and the unspecified address, MUST NOT be included in the candidate set of source addresses. Now that Section 2.6 of RFC 4291 [RFC4291] has removed the restrictions on using IPv6 anycast addresses as the source address of an IPv6 packet, this restriction of RFC 3484 should also be removed.

3. Security Considerations

No security risk is found that degrades RFC 3484.

4. IANA Considerations

An address type number for the policy table may have to be assigned by IANA.

5. References

5.1. Normative References

[RFC1794] Brisco, T., "DNS Support for Load Balancing", RFC 1794, April 1995.

[RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, February 1996.

[RFC3484] Draves, R., "Default Address Selection for Internet Protocol version 6 (IPv6)", RFC 3484, February 2003.

[RFC3701] Fink, R. and R. Hinden, "6bone (IPv6 Testing Address Allocation) Phaseout", RFC 3701, March 2004.

[RFC3879] Huitema, C. and B. Carpenter, "Deprecating Site Local Addresses", RFC 3879, September 2004.

[RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast Addresses", RFC 4193, October 2005.

[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.

[RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through Network Address Translations (NATs)", RFC 4380, February 2006.

[RFC5214] Templin, F., Gleeson, T., and D. Thaler, "Intra-Site Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214, March 2008.

[RFC5220] Matsumoto, A., Fujisaki, T., Hiromi, R., and K. Kanayama, "Problem Statement for Default Address Selection in Multi- Prefix Environments: Operational Issues of RFC 3484 Default Rules", RFC 5220, July 2008.

5.2. Informative References

[I-D.chown-addr-select-considerations] Chown, T., "Considerations for IPv6 Address Selection Policy Changes", draft-chown-addr-select-considerations-03 (work in progress), July 2009.

[I-D.denis-v6ops-nat-addrsel] Denis-Courmont, R., "Problems with IPv6 source address selection and IPv4 NATs", draft-denis-v6ops-nat-addrsel-00 (work in progress), February 2009.

[I-D.ietf-6man-addr-select-opt] Matsumoto, A., Fujisaki, T., Kato, J., and T. Chown, "Distributing Address Selection Policy using DHCPv6", draft-ietf-6man-addr-select-opt-01 (work in progress), June 2011.

[I-D.ietf-ipv6-ula-central] Hinden, R., "Centrally Assigned Unique Local IPv6 Unicast Addresses", draft-ietf-ipv6-ula-central-02 (work in progress), June 2007.

[RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, RFC 2827, May 2000.

[RFC6052] Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X. Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052, October 2010.

Appendix A. Acknowledgements

The authors would like to thank Dave Thaler, Pekka Savola, Remi Denis-Courmont, Francois-Xavier Le Bail, and the members of 6man’s address selection design team for their invaluable contributions to this document.

Appendix B. Past Discussion

This section summarizes past discussions related to address selection mechanisms.

B.1. The longest match rule

RFC 3484 defines that destination address selection rule 9 should be applied to both IPv4 and IPv6, which spoils the DNS-based load balancing technique that is widely used in the IPv4 Internet today.

When two or more destination addresses are acquired from one FQDN, rule 9 states that the longest matching destination and source address pair should be chosen. As described in RFC 1794, the DNS-based load balancing technique is achieved by not re-ordering the destination addresses returned from the DNS server. Rule 9 defines a deterministic rule for re-ordering hosts, hence the technique described in RFC 1794 is no longer available.

Regarding this problem, there was discussion in the IETF and elsewhere, summarized below.

Discussion: The possible changes to RFC 3484 are as follows:

   1. To delete Rule 9 completely.

   2. To apply Rule 9 only for IPv6 and not for IPv4. In IPv6, hierarchical address assignment is generally used at present, hence the longest matching rule is beneficial in many cases. In IPv4, as stated above, the DNS-based load balancing technique is widely used.

   3. To apply Rule 9 for IPv6 conditionally and not for IPv4. When the length of matching bits of the destination address and the source address is longer than N, rule 9 is applied. Otherwise, the order of the destination addresses does not change. The value of N should be configurable and it should be 32 by default. This is simply because two sites whose matching bit length is longer than 32 are probably adjacent.

Now that IPv6 PI addresses are being introduced by RIRs, hierarchical address assignment is not always maintained anymore. It seems that the longest matching algorithm may not be worth the adverse effect of disabling the DNS-based load balancing technique.

B.2. NAT64 prefix issue

The NAT64 WKP has recently been defined [RFC6052]. Whether NAT64 should be preferred over IPv4 (in other words, NAT44), or NAT44 over NAT64, depends on the site. So, this issue of local site policy should be solved by manual policy table changes locally, or by use of the proposed DHCP-based policy distribution mechanism.

B.3. ISATAP issue

Where a site is using ISATAP [RFC5214], there is generally no way to differentiate an ISATAP address from a native address without interface information. However, a site will assign a prefix for its ISATAP overlay, and can choose to add an entry for that prefix to the policy table if it wishes to change the default preference for that prefix.

Appendix C. Revision History

05: 6bone testing addresses were put back in the default policy table. Section 2.6, which allows anycast source addresses, was added.

04: Added comment about ISATAP.

03: ULA address selection issue was expanded. 6to4, Teredo and IPv4 prioritization issue was elaborated. Deprecated address blocks in policy table section was elaborated. In appendix, NAT64 prefix issue was added.

02: Suresh Krishnan’s suggestions for better English sentences were incorporated. A new source address selection rule that utilizes the next-hop information is included in Section 2.3. Site local address prefix was corrected.

01: Re-structured to contain only the actual changes to RFC 3484.

00: Published as a 6man working group item.

03: Added acknowledgements. Added longest matching algorithm malfunction regarding local DNS round robin. The proposed changes section was re-structured. The issue of 6to4/Teredo and IPv4 prioritization was included. The issue of deprecated addresses was added. The renewed default policy table was changed accordingly.

02: Added the reference to address selection design team’s proposal.

01: The issue of private IPv4 address scope was added. The issue of ULA address scope was added. Discussion of longest matching rule was expanded.

Authors’ Addresses

Arifumi Matsumoto NTT SI Lab Midori-Cho 3-9-11 Musashino-shi, Tokyo 180-8585 Japan

Phone: +81 422 59 3334 Email: [email protected]

Jun-ya Kato NTT SI Lab Midori-Cho 3-9-11 Musashino-shi, Tokyo 180-8585 Japan

Phone: +81 422 59 2939 Email: [email protected]

Tomohiro Fujisaki NTT PF Lab Midori-Cho 3-9-11 Musashino-shi, Tokyo 180-8585 Japan

Phone: +81 422 59 7351 Email: [email protected]

Tim Chown University of Southampton Southampton, Hampshire SO17 1BJ United Kingdom

Email: [email protected]

Network Working Group D. Thaler, Ed. Internet-Draft Microsoft Obsoletes: 3484 (if approved) R. Draves Intended status: Standards Track Microsoft Research Expires: December 29, 2012 A. Matsumoto NTT T. Chown University of Southampton June 27, 2012

Default Address Selection for Internet Protocol version 6 (IPv6) draft-ietf-6man-rfc3484bis-06.txt

Abstract

This document describes two algorithms, one for source address selection and one for destination address selection. The algorithms specify default behavior for all Internet Protocol version 6 (IPv6) implementations. They do not override choices made by applications or upper-layer protocols, nor do they preclude the development of more advanced mechanisms for address selection. The two algorithms share a common context, including an optional mechanism for allowing administrators to provide policy that can override the default behavior. In dual stack implementations, the destination address selection algorithm can consider both IPv4 and IPv6 addresses - depending on the available source addresses, the algorithm might prefer IPv6 addresses over IPv4 addresses, or vice-versa.

Default address selection as defined in this specification applies to all IPv6 nodes, including both hosts and routers. This document obsoletes RFC 3484.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

Thaler, et al. Expires December 29, 2012 [Page 1] Internet-Draft Default Address Selection for IPv6 June 2012

This Internet-Draft will expire on December 29, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions Used in This Document
   2.  Context in Which the Algorithms Operate
     2.1.  Policy Table
     2.2.  Common Prefix Length
   3.  Address Properties
     3.1.  Scope Comparisons
     3.2.  IPv4 Addresses and IPv4-Mapped Addresses
     3.3.  Other IPv6 Addresses with Embedded IPv4 Addresses
     3.4.  IPv6 Loopback Address and Other Format Prefixes
     3.5.  Mobility Addresses
   4.  Candidate Source Addresses
   5.  Source Address Selection
   6.  Destination Address Selection
   7.  Interactions with Routing
   8.  Implementation Considerations
   9.  Security Considerations
   10. Examples
     10.1.  Default Source Address Selection
     10.2.  Default Destination Address Selection
     10.3.  Configuring Preference for IPv6 or IPv4
       10.3.1.  Handling Broken IPv6
     10.4.  Configuring Preference for Link-Local Addresses
     10.5.  Configuring a Multi-Homed Site
     10.6.  Configuring ULA Preference
     10.7.  Configuring 6to4 Preference
   11. IANA Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Acknowledgements
   Appendix B.  Changes Since RFC 3484
   Authors’ Addresses

1. Introduction

The IPv6 addressing architecture [RFC4291] allows multiple unicast addresses to be assigned to interfaces. These addresses might have different reachability scopes (link-local, site-local, or global). These addresses might also be "preferred" or "deprecated" [RFC4862]. Privacy considerations have introduced the concepts of "public addresses" and "temporary addresses" [RFC4941]. The mobility architecture introduces "home addresses" and "care-of addresses" [RFC6275]. In addition, multi-homing situations will result in more addresses per node. For example, a node might have multiple interfaces, some of them tunnels or virtual interfaces, or a site might have multiple ISP attachments with a global prefix per ISP.

The end result is that IPv6 implementations will very often be faced with multiple possible source and destination addresses when initiating communication. It is desirable to have default algorithms, common across all implementations, for selecting source and destination addresses so that developers and administrators can reason about and predict the behavior of their systems.

Furthermore, dual or hybrid stack implementations, which support both IPv6 and IPv4, will very often need to choose between IPv6 and IPv4 when initiating communication, for example, when DNS name resolution yields both IPv6 and IPv4 addresses and the network protocol stack has both IPv6 and IPv4 source addresses available. In such cases, a simple policy to always prefer IPv6 or always prefer IPv4 can produce poor behavior. As one example, suppose a DNS name resolves to a global IPv6 address and a global IPv4 address. If the node has assigned a global IPv6 address and a 169.254/16 auto-configured IPv4 address [RFC3927], then IPv6 is the best choice for communication. But if the node has assigned only a link-local IPv6 address and a global IPv4 address, then IPv4 is the best choice for communication. The destination address selection algorithm solves this with a unified procedure for choosing among both IPv6 and IPv4 addresses.

The algorithms in this document are specified as a set of rules that define a partial ordering on the set of addresses that are available for use. In the case of source address selection, a node typically has multiple addresses assigned to its interfaces, and the source address ordering rules in section 5 define which address is the "best" one to use. In the case of destination address selection, the DNS might return a set of addresses for a given name, and an application needs to decide which one to use first, and in what order to try others if the first one is not reachable. The destination address ordering rules in section 6, when applied to the set of addresses returned by the DNS, provide such a recommended ordering.

This document specifies source address selection and destination address selection separately, but using a common context so that together the two algorithms yield useful results. The algorithms attempt to choose source and destination addresses of appropriate scope and configuration status (preferred or deprecated in the RFC 4862 sense). Furthermore, this document suggests a preferred method, longest matching prefix, for choosing among otherwise equivalent addresses in the absence of better information.

This document also specifies policy hooks to allow administrative override of the default behavior. For example, using these hooks an administrator can specify a preferred source prefix for use with a destination prefix, or prefer destination addresses with one prefix over addresses with another prefix. These hooks give an administrator flexibility in dealing with some multi-homing and transition scenarios, but they are certainly not a panacea.

The selection rules specified in this document MUST NOT be construed to override an application or upper-layer’s explicit choice of a legal destination or source address.

1.1. Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, RFC 2119 [RFC2119].

2. Context in Which the Algorithms Operate

Our context for address selection derives from the most common implementation architecture, which separates the choice of destination address from the choice of source address. Consequently, we have two separate algorithms for these tasks. The algorithms are designed to work well together and they share a mechanism for administrative policy override.

In this implementation architecture, applications use APIs such as getaddrinfo() [RFC3493] that return a list of addresses to the application. This list might contain both IPv6 and IPv4 addresses (sometimes represented as IPv4-mapped addresses). The application then passes a destination address to the network stack with connect() or sendto(). The application would then typically try the first address in the list, looping over the list of addresses until it finds a working address. In any case, the network layer is never in a situation where it needs to choose a destination address from several alternatives. The application might also specify a source address with bind(), but often the source address is left unspecified. Therefore the network layer does often choose a source address from several alternatives.

As a consequence, we intend that implementations of APIs such as getaddrinfo() will use the destination address selection algorithm specified here to sort the list of IPv6 and IPv4 addresses that they return. Separately, the IPv6 network layer will use the source address selection algorithm when an application or upper-layer has not specified a source address. Application of this specification to source address selection in an IPv4 network layer might be possible but this is not explored further here.

Well-behaved applications SHOULD NOT simply use the first address returned from an API such as getaddrinfo() and then give up if it fails. For many applications, it is appropriate to iterate through the list of addresses returned from getaddrinfo() until a working address is found. For other applications, it might be appropriate to try multiple addresses in parallel (e.g., with some small delay in between) and use the first one to succeed.
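The sequential strategy can be sketched as follows. This is a minimal illustration, not part of the specification: the function names connect_first_working(), tcp_connect(), and candidates_for() and the 3-second timeout are assumptions made for the example, and the connector is passed in as a parameter so that the loop can be exercised without network access.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Sockaddr = Tuple  # sockaddr tuple as returned by socket.getaddrinfo()


def connect_first_working(
    candidates: Iterable[Tuple[int, Sockaddr]],
    connect: Callable[[int, Sockaddr], object],
) -> Optional[object]:
    """Try each (family, sockaddr) in order; return the first success."""
    for family, sockaddr in candidates:
        try:
            return connect(family, sockaddr)
        except OSError:
            continue  # this address failed; move on to the next one
    return None  # every address in the list failed


def tcp_connect(family: int, sockaddr: Sockaddr,
                timeout: float = 3.0) -> socket.socket:
    """Hypothetical connector: open a TCP connection to one address."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    sock.settimeout(timeout)  # assumed value, not taken from this document
    try:
        sock.connect(sockaddr)
    except OSError:
        sock.close()
        raise
    return sock


def candidates_for(host: str, port: int):
    """Yield (family, sockaddr) pairs in the order getaddrinfo() returns."""
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        yield family, sockaddr
```

An application would call connect_first_working(candidates_for(host, port), tcp_connect). The parallel strategy mentioned above is not shown.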

Although source and destination address selection is most typically done when initiating communication, a responder also must deal with address selection. In many cases this is trivially dealt with by an application using the source address of a received packet as the response destination, and the destination address of the received packet as the response source. Other cases, however, are handled like an initiator, such as when the request was multicast and hence source address selection must still occur when generating a response, or when the request includes a list of the initiator’s addresses from which to choose a destination. Finally, a third application scenario is that of a listening application choosing on what local addresses to listen. This third scenario is out of scope for this document.

The algorithms use several criteria in making their decisions. The combined effect is to prefer destination/source address pairs for which the two addresses are of equal scope or type, prefer smaller scopes over larger scopes for the destination address, prefer non-deprecated source addresses, avoid the use of transitional addresses when native addresses are available, and all else being equal prefer address pairs having the longest possible common prefix. For source address selection, temporary addresses [RFC4941] are preferred over public addresses. In mobile situations [RFC6275], home addresses are preferred over care-of addresses. If an address is simultaneously a home address and a care-of address (indicating the mobile node is "at home" for that address), then the home/care-of address is preferred over addresses that are solely a home address or solely a care-of address.

This specification optionally allows for the possibility of administrative configuration of policy (e.g., via manual configuration or a DHCP option such as that proposed in [I-D.ietf-6man-addr-select-opt]) that can override the default behavior of the algorithms. The policy override consists of the following set of state, which SHOULD be configurable:

o Policy Table (Section 2.1): a table that specifies precedence values and preferred source prefixes for destination prefixes.

o Automatic Row Additions flag (Section 2.1): a flag that specifies whether the implementation is permitted to automatically add site-specific rows for certain types of addresses.

o Privacy Preference flag (Section 5): a flag that specifies whether temporary source addresses or stable source addresses are preferred by default, when both types exist.

2.1. Policy Table

The policy table is a longest-matching-prefix lookup table, much like a routing table. Given an address A, a lookup in the policy table produces two values: a precedence value Precedence(A) and a classification or label Label(A).

The precedence value Precedence(A) is used for sorting destination addresses. If Precedence(A) > Precedence(B), we say that address A has higher precedence than address B, meaning that our algorithm will prefer to sort destination address A before destination address B.

The label value Label(A) allows for policies that prefer a particular source address prefix for use with a destination address prefix. The algorithms prefer to use a source address S with a destination address D if Label(S) = Label(D).

IPv6 implementations SHOULD support configurable address selection via a mechanism at least as powerful as the policy tables defined here. It is important that implementations provide a way to change the default policies as more experience is gained. Sections 10.3 through 10.7 provide examples of the kind of changes that might be needed.

If an implementation is not configurable or has not been configured, then it SHOULD operate according to the algorithms specified here in conjunction with the following default policy table:

   Prefix        Precedence Label
   ::1/128               50     0
   ::/0                  40     1
   ::ffff:0:0/96         35     4
   2002::/16             30     2
   2001::/32              5     5
   fc00::/7               3    13
   ::/96                  1     3
   fec0::/10              1    11
   3ffe::/16              1    12
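As an illustration of the longest-matching-prefix lookup, the following sketch returns Precedence(A) and Label(A) from the default table. The name policy_lookup() and the list-of-tuples representation are assumptions made for this example. IPv4 input is converted to its IPv4-mapped form before the lookup, as Section 3.2 requires.

```python
import ipaddress
from typing import List, Tuple

# (prefix, precedence, label) rows of the default policy table.
DEFAULT_POLICY_TABLE: List[Tuple[ipaddress.IPv6Network, int, int]] = [
    (ipaddress.IPv6Network(prefix), precedence, label)
    for prefix, precedence, label in [
        ("::1/128", 50, 0),
        ("::/0", 40, 1),
        ("::ffff:0:0/96", 35, 4),
        ("2002::/16", 30, 2),
        ("2001::/32", 5, 5),
        ("fc00::/7", 3, 13),
        ("::/96", 1, 3),
        ("fec0::/10", 1, 11),
        ("3ffe::/16", 1, 12),
    ]
]


def policy_lookup(address: str, table=DEFAULT_POLICY_TABLE) -> Tuple[int, int]:
    """Return (Precedence(A), Label(A)) using longest-prefix match."""
    addr = ipaddress.ip_address(address)
    if addr.version == 4:
        # IPv4 addresses are looked up as IPv4-mapped IPv6 addresses.
        addr = ipaddress.IPv6Address("::ffff:" + str(addr))
    matches = [row for row in table if addr in row[0]]
    # The default table contains ::/0, so at least one row always matches.
    # A custom table without a catch-all row would need its own fallback.
    _net, precedence, label = max(matches, key=lambda row: row[0].prefixlen)
    return precedence, label
```

For example, policy_lookup("2001:db8::1") falls through to the ::/0 row, because 2001:db8::/32 is not covered by 2001::/32.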

An implementation MAY automatically add additional site-specific rows to the default table based on its configured addresses, for instance for Unique Local Addresses (ULAs) [RFC4193] and 6to4 [RFC3056] addresses (see Section 10.6 and Section 10.7 for examples). Any such rows automatically added by the implementation as a result of address acquisition MUST NOT override a row for the same prefix configured via other means. That is, rows can be added but never updated automatically. An implementation SHOULD provide a means (the Automatic Row Additions flag) for an administrator to disable automatic row additions.
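The "add but never update" constraint can be sketched as follows. The dictionary representation, the function name add_automatic_row(), and the precedence and label values used in the usage note are hypothetical and chosen only for illustration.

```python
import ipaddress
from typing import Dict, Tuple

PolicyTable = Dict[ipaddress.IPv6Network, Tuple[int, int]]


def add_automatic_row(
    table: PolicyTable,
    prefix: str,
    precedence: int,
    label: int,
    automatic_additions_enabled: bool = True,
) -> bool:
    """Add a row on address acquisition; never overwrite an existing row.

    Returns True if the row was added and False otherwise.
    """
    if not automatic_additions_enabled:
        return False  # the administrator disabled automatic row additions
    network = ipaddress.IPv6Network(prefix)
    if network in table:
        return False  # a row for this prefix exists and must not be updated
    table[network] = (precedence, label)
    return True
```

With a table that already holds fc00::/7, a call such as add_automatic_row(table, "fd00:1234:5678::/48", 45, 14) adds a new, more specific row, while a call for fc00::/7 itself leaves the table unchanged.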

As will become apparent later, one effect of the default policy table is to prefer using native source addresses with native destination addresses, 6to4 source addresses with 6to4 destination addresses, etc. Another effect of the default policy table is to prefer communication using IPv6 addresses to communication using IPv4 addresses, if matching source addresses are available.

Policy table entries for address prefixes that are not of global scope MAY be qualified with an optional zone index. If so, a prefix table entry only matches against an address during a lookup if the zone index also matches the address’s zone index.

2.2. Common Prefix Length

We define the common prefix length CommonPrefixLen(S, D) of a source address S and a destination address D as the length of the longest prefix (looking at the most significant, or leftmost, bits) that the two addresses have in common, up to the length of S’s prefix (i.e., the portion of the address not including the interface ID). For example, CommonPrefixLen(fe80::1, fe80::2) is 64.
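A minimal sketch of CommonPrefixLen follows. The function name common_prefix_len() and the default of 64 for the length of S's prefix are assumptions for this example. An implementation would use the prefix length actually configured for the source address.

```python
import ipaddress


def common_prefix_len(source: str, dest: str, source_prefix_len: int = 64) -> int:
    """Number of leading bits S and D share, capped at S's prefix length."""
    s = int(ipaddress.IPv6Address(source))
    d = int(ipaddress.IPv6Address(dest))
    differing = s ^ d  # bits set where the two addresses differ
    # bit_length() locates the highest differing bit; everything above it
    # in the 128-bit address is common to both.
    leading_common_bits = 128 - differing.bit_length()
    return min(leading_common_bits, source_prefix_len)
```

The addresses fe80::1 and fe80::2 share their first 126 bits, but the result is capped at the 64-bit prefix length, which gives the value 64 stated above.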

3. Address Properties

In the rules given in later sections, addresses of different types (e.g., IPv4, IPv6, multicast and unicast) are compared against each other. Some of these address types have properties that aren’t directly comparable to each other. For example, IPv6 unicast addresses can be "preferred" or "deprecated" [RFC4862], while IPv4 addresses have no such notion. To compare such addresses using the ordering rules (e.g., to use "preferred" addresses in preference to "deprecated" addresses), the following mappings are defined.

3.1. Scope Comparisons

Multicast destination addresses have a 4-bit scope field that controls the propagation of the multicast packet. The IPv6 addressing architecture defines scope field values for interface-local (0x1), link-local (0x2), admin-local (0x4), site-local (0x5), organization-local (0x8), and global (0xE) scopes ([RFC4291] Section 2.7).

Use of the source address selection algorithm in the presence of multicast destination addresses requires the comparison of a unicast address scope with a multicast address scope. We map unicast link-local to multicast link-local, unicast site-local to multicast site-local, and unicast global scope to multicast global scope. For example, unicast site-local is equal to multicast site-local, which is smaller than multicast organization-local, which is smaller than unicast global, which is equal to multicast global. (Note that IPv6 site-local unicast addresses are deprecated [RFC4291]. However, some existing implementations and deployments may still use these addresses and, therefore, they are included in the procedures in this specification. Also note that ULAs are considered as global, not site-local, scope but are handled via the prefix policy table as discussed in Section 10.6.)

We write Scope(A) to mean the scope of address A. For example, if A is a link-local unicast address and B is a site-local multicast address, then Scope(A) < Scope(B).

This mapping implicitly conflates unicast site boundaries and multicast site boundaries [RFC4007].

3.2. IPv4 Addresses and IPv4-Mapped Addresses

The destination address selection algorithm operates on both IPv6 and IPv4 addresses. For this purpose, IPv4 addresses MUST be represented as IPv4-mapped addresses [RFC4291]. For example, to look up the precedence or other attributes of an IPv4 address in the policy table, look up the corresponding IPv4-mapped IPv6 address.

IPv4 addresses are assigned scopes as follows. IPv4 auto-configuration addresses [RFC3927], which have the prefix 169.254/16, are assigned link-local scope. IPv4 loopback addresses ([RFC1812], section 4.2.2.11), which have the prefix 127/8, are assigned link-local scope (analogously to the treatment of the IPv6 loopback address ([RFC4007], section 4)). Other IPv4 addresses (including IPv4 private addresses [RFC1918] and Shared Address Space addresses [RFC6598]) are assigned global scope.

IPv4 addresses MUST be treated as having "preferred" (in the RFC 4862 sense) configuration status.
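The scope mappings of Sections 3.1 and 3.2 can be sketched as a single function. The name scope() and the constant names are assumptions for this example. The IPv6 loopback address is given link-local scope, consistent with the treatment referenced above. IPv4 multicast addresses are not discussed in this section and are simply treated as global here, which is a simplification of the sketch.

```python
import ipaddress

# Scope values reuse the multicast scope field encoding from Section 3.1.
LINK_LOCAL, SITE_LOCAL, GLOBAL = 0x2, 0x5, 0xE


def scope(address: str) -> int:
    """Return Scope(A) for an IPv6, IPv4, or IPv4-mapped address."""
    addr = ipaddress.ip_address(address)
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped  # apply the IPv4 rules to ::ffff:a.b.c.d
    if addr.version == 4:
        # 169.254/16 and 127/8 are link-local; all other IPv4 is global.
        if addr.is_link_local or addr.is_loopback:
            return LINK_LOCAL
        return GLOBAL
    if addr.is_multicast:
        # The scope field is the low-order 4 bits of the second octet.
        return addr.packed[1] & 0x0F
    if addr.is_link_local or addr.is_loopback:
        return LINK_LOCAL
    if addr.is_site_local:
        return SITE_LOCAL
    return GLOBAL
```

With this mapping, scope("fe80::1") < scope("ff05::1") holds, matching the Scope(A) < Scope(B) example for a link-local unicast address and a site-local multicast address.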

3.3. Other IPv6 Addresses with Embedded IPv4 Addresses

IPv4-compatible addresses [RFC4291], IPv4-mapped [RFC4291], IPv4-converted [RFC6145], IPv4-translatable [RFC6145], and 6to4 addresses [RFC3056] contain an embedded IPv4 address. For the purposes of this document, these addresses MUST be treated as having global scope.

IPv4-compatible, IPv4-mapped, and IPv4-converted addresses MUST be treated as having "preferred" (in the RFC 4862 sense) configuration status.

3.4. IPv6 Loopback Address and Other Format Prefixes

The loopback address MUST be treated as having link-local scope ([RFC4007], section 4) and "preferred" (in the RFC 4862 sense) configuration status.

NSAP addresses and other addresses with as-yet-undefined format prefixes MUST be treated as having global scope and "preferred" (in the RFC 4862 sense) configuration status. Later standards might supersede this treatment.

3.5. Mobility Addresses

Some nodes might support mobility using the concepts of home address and care-of address (for example see [RFC6275]). Conceptually, a home address is an IP address assigned to a mobile node and used as the permanent address of the mobile node. A care-of address is an IP address associated with a mobile node while visiting a foreign link. When a mobile node is on its home link, it might have an address that is simultaneously a home address and a care-of address.

For the purposes of this document, it is sufficient to know whether one’s own addresses are designated as home addresses or care-of addresses. Whether an address ought to be designated a home address or care-of address is outside the scope of this document.

4. Candidate Source Addresses

The source address selection algorithm uses the concept of a "candidate set" of potential source addresses for a given destination address. The candidate set is the set of all addresses that could be used as a source address; the source address selection algorithm will pick an address out of that set. We write CandidateSource(A) to denote the candidate set for the address A.

It is RECOMMENDED that the candidate source addresses be the set of unicast addresses assigned to the interface that will be used to send to the destination (the "outgoing" interface). On routers, the candidate set MAY include unicast addresses assigned to any interface that forwards packets, subject to the restrictions described below. Implementations that wish to support the use of global source addresses assigned to a loopback interface MUST behave as if the loopback interface originates and forwards the packet.

Discussion: The Neighbor Discovery Redirect mechanism [RFC4861] requires that routers verify that the source address of a packet identifies a neighbor before generating a Redirect, so it is advantageous for hosts to choose source addresses assigned to the outgoing interface.

In some cases the destination address might be qualified with a zone index or other information that will constrain the candidate set.

For all multicast and link-local destination addresses, the set of candidate source addresses MUST only include addresses assigned to interfaces belonging to the same link as the outgoing interface.

Discussion: The restriction for multicast destination addresses is necessary because currently-deployed multicast forwarding algorithms use Reverse Path Forwarding (RPF) checks.

For site-local unicast destination addresses, the set of candidate source addresses MUST only include addresses assigned to interfaces belonging to the same site as the outgoing interface.

In any case, multicast addresses and the unspecified address MUST NOT be included in a candidate set.

On IPv6-only nodes that support Stateless IP/ICMP Translation (SIIT) [RFC6145], if the destination address is an IPv4-converted address then the candidate set MUST contain only IPv4-translatable addresses.

If an application or upper layer specifies a source address, it may affect the choice of outgoing interface. Regardless, if the application or upper layer specifies a source address that is not in the candidate set for the destination, then the network layer MUST treat this as an error. If the application or upper layer specifies a source address that is in the candidate set for the destination, then the network layer MUST respect that choice. If the application or upper layer does not specify a source address, then the network layer uses the source address selection algorithm specified in the next section.
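The handling of an explicitly specified source address can be sketched as follows. The names candidate_sources() and resolve_source(), the use of ValueError to signal the error, and the select callback standing in for the algorithm of the next section are all assumptions for this example. The sketch applies only the exclusion of multicast and unspecified addresses. The link, site, and SIIT restrictions described above are omitted.

```python
import ipaddress
from typing import Callable, Iterable, List, Optional


def candidate_sources(interface_addresses: Iterable[str]) -> List[ipaddress.IPv6Address]:
    """Build a candidate set from the addresses of the outgoing interface."""
    result = []
    for text in interface_addresses:
        addr = ipaddress.IPv6Address(text)
        # Multicast addresses and the unspecified address are never candidates.
        if addr.is_multicast or addr.is_unspecified:
            continue
        result.append(addr)
    return result


def resolve_source(
    requested: Optional[str],
    candidates: List[ipaddress.IPv6Address],
    select: Callable[[List[ipaddress.IPv6Address]], ipaddress.IPv6Address],
) -> ipaddress.IPv6Address:
    """Honor an explicit source address, or fall back to the algorithm."""
    if requested is None:
        return select(candidates)  # no explicit choice: run source selection
    addr = ipaddress.IPv6Address(requested)
    if addr not in candidates:
        # An explicit source outside the candidate set is an error.
        raise ValueError(f"{addr} is not a candidate source address")
    return addr  # an explicit source inside the candidate set is respected
```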

5. Source Address Selection

The source address selection algorithm produces as output a single source address for use with a given destination address. This algorithm only applies to IPv6 destination addresses, not IPv4 addresses.

The algorithm is specified here in terms of a list of pair-wise comparison rules that (for a given destination address D) imposes a "greater than" ordering on the addresses in the candidate set CandidateSource(D). The address at the front of the list after the algorithm completes is the one the algorithm selects.

Note that conceptually, a sort of the candidate set is being performed, where a set of rules define the ordering among addresses. But because the output of the algorithm is a single source address, an implementation need not actually sort the set; it need only identify the "maximum" value that ends up at the front of the sorted list.

The ordering of the addresses in the candidate set is defined by a list of eight pair-wise comparison rules, with each rule placing a "greater than," "less than" or "equal to" ordering on two source addresses with respect to each other (and that rule). In the case that a given rule produces a tie, i.e., provides an "equal to" result for the two addresses, the remaining rules MUST be applied (in order) to just those addresses that are tied to break the tie. Note that if a rule produces a single clear "winner" (or set of "winners" in the case of ties), those addresses not in the winning set can be discarded from further consideration, with subsequent rules applied only to the remaining addresses. If the eight rules fail to choose a single address, the tie-breaker is implementation-specific.

When comparing two addresses SA and SB from the candidate set, we say "prefer SA" to mean that SA is "greater than" SB, and similarly we say "prefer SB" to mean that SA is "less than" SB. If neither is stated to be preferred, this means that SA is "equal to" SB and the remaining rules apply as noted above.

Rule 1: Prefer same address. If SA = D, then prefer SA. Similarly, if SB = D, then prefer SB.

Rule 2: Prefer appropriate scope. If Scope(SA) < Scope(SB): If Scope(SA) < Scope(D), then prefer SB and otherwise prefer SA. Similarly, if Scope(SB) < Scope(SA): If Scope(SB) < Scope(D), then prefer SA and otherwise prefer SB.

Discussion: This rule must be given high priority because it can affect interoperability.

Rule 3: Avoid deprecated addresses. If one of the two source addresses is "preferred" and one of them is "deprecated" (in the RFC 4862 sense), then prefer the one that is "preferred."

Rule 4: Prefer home addresses. If SA is simultaneously a home address and care-of address and SB is not, then prefer SA. Similarly, if SB is simultaneously a home address and care-of address and SA is not, then prefer SB. If SA is just a home address and SB is just a care-of address, then prefer SA. Similarly, if SB is just a home address and SA is just a care-of address, then prefer SB.

Implementations supporting home addresses MUST provide a mechanism allowing an application to reverse the sense of this preference and prefer care-of addresses over home addresses (e.g., via appropriate API extensions such as [RFC5014]). Use of the mechanism MUST only affect the selection rules for the invoking application.

Rule 5: Prefer outgoing interface. If SA is assigned to the interface that will be used to send to D and SB is assigned to a different interface, then prefer SA. Similarly, if SB is assigned to the interface that will be used to send to D and SA is assigned to a different interface, then prefer SB.

Rule 5.5: Prefer addresses in a prefix advertised by the next-hop. If SA or SA’s prefix is assigned by the selected next-hop that will be used to send to D and SB or SB’s prefix is assigned by a different next-hop, then prefer SA. Similarly, if SB or SB’s prefix is assigned by the next-hop that will be used to send to D and SA or SA’s prefix is assigned by a different next-hop, then prefer SB.

Discussion: An IPv6 implementation is not required to remember which next-hops advertised which prefixes. The conceptual models of IPv6 hosts in Section 5 of [RFC4861] and Section 3 of [RFC4191] have no such requirement. Hence rule 5.5 is only applicable to implementations that track this information.

Rule 6: Prefer matching label. If Label(SA) = Label(D) and Label(SB) <> Label(D), then prefer SA. Similarly, if Label(SB) = Label(D) and Label(SA) <> Label(D), then prefer SB.

Rule 7: Prefer temporary addresses. If SA is a temporary address and SB is a public address, then prefer SA. Similarly, if SB is a temporary address and SA is a public address, then prefer SB.

Implementations MUST provide a mechanism allowing an application to reverse the sense of this preference and prefer public addresses over temporary addresses (e.g., via appropriate API extensions such as [RFC5014]). Use of the mechanism MUST only affect the selection rules for the invoking application. This default is intended to address privacy concerns as discussed in [RFC4941], but introduces a risk of applications potentially failing due to the relatively short lifetime of temporary addresses or due to the possibility of the reverse lookup of a temporary address either failing or returning a randomized name. Implementations for which application compatibility considerations outweigh these privacy concerns MAY reverse the sense of this rule and by default prefer public addresses over temporary addresses. There SHOULD be an administrative option (the Privacy Preference flag) to change this preference, if the implementation supports temporary addresses. If there is no such option, there MUST be an administrative option to disable temporary addresses.

Rule 8: Use longest matching prefix. If CommonPrefixLen(SA, D) > CommonPrefixLen(SB, D), then prefer SA. Similarly, if CommonPrefixLen(SB, D) > CommonPrefixLen(SA, D), then prefer SB.

Rule 8 MAY be superseded if the implementation has other means of choosing among source addresses, for example, if the implementation somehow knows which source address will result in the "best" communications performance.
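The pair-wise structure of the algorithm can be sketched as a scan for the "maximum" candidate. The sketch below implements only Rules 1, 2, 3, and 8. Rules 4 through 7 need mobility, interface, next-hop, label, and temporary-address state and are omitted. The Source class, the function names, and the fixed 64-bit prefix length are assumptions for this example. When all implemented rules tie, the earlier candidate is kept, which is one possible implementation-specific tie-breaker.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    address: ipaddress.IPv6Address
    deprecated: bool = False  # configuration status in the RFC 4862 sense


def scope(addr: ipaddress.IPv6Address) -> int:
    if addr.is_multicast:
        return addr.packed[1] & 0x0F
    if addr.is_link_local or addr.is_loopback:
        return 0x2
    if addr.is_site_local:
        return 0x5
    return 0xE


def common_prefix_len(s, d, prefix_len: int = 64) -> int:
    return min(128 - (int(s) ^ int(d)).bit_length(), prefix_len)


# Each rule returns > 0 to prefer sa, < 0 to prefer sb, and 0 for a tie.
def rule1_same_address(sa, sb, d):
    return (sa.address == d) - (sb.address == d)


def rule2_appropriate_scope(sa, sb, d):
    a, b, dest = scope(sa.address), scope(sb.address), scope(d)
    if a < b:
        return -1 if a < dest else 1
    if b < a:
        return 1 if b < dest else -1
    return 0


def rule3_avoid_deprecated(sa, sb, d):
    return sb.deprecated - sa.deprecated


def rule8_longest_prefix(sa, sb, d):
    return common_prefix_len(sa.address, d) - common_prefix_len(sb.address, d)


RULES = [rule1_same_address, rule2_appropriate_scope,
         rule3_avoid_deprecated, rule8_longest_prefix]


def select_source(candidates: List[Source], dest: str) -> Source:
    """Return the preferred candidate without sorting the whole set."""
    if not candidates:
        raise ValueError("candidate set is empty")
    d = ipaddress.IPv6Address(dest)
    best = candidates[0]
    for cand in candidates[1:]:
        for rule in RULES:
            result = rule(cand, best, d)
            if result != 0:  # the first rule that decides ends the comparison
                if result > 0:
                    best = cand
                break
    return best
```

For destination 2001:db8:1::1 and candidates fe80::1 and 2001:db8:3::1, Rule 2 decides in favor of 2001:db8:3::1, because the link-local candidate has a scope smaller than that of the destination.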

6. Destination Address Selection

The destination address selection algorithm takes a list of destination addresses and sorts the addresses to produce a new list. It is specified here in terms of the pair-wise comparison of addresses DA and DB, where DA appears before DB in the original list.

The algorithm sorts together both IPv6 and IPv4 addresses. To find the attributes of an IPv4 address in the policy table, the IPv4 address MUST be represented as an IPv4-mapped address.

We write Source(D) to indicate the selected source address for a destination D. For IPv6 addresses, the previous section specifies the source address selection algorithm. Source address selection for IPv4 addresses is not specified in this document.

We say that Source(D) is undefined if there is no source address available for destination D. For IPv6 addresses, this is only the case if CandidateSource(D) is the empty set.

The pair-wise comparison of destination addresses consists of ten rules, which MUST be applied in order. If a rule determines a result, then the remaining rules are not relevant and MUST be ignored. Subsequent rules act as tie-breakers for earlier rules. See the previous section for a lengthier description of how pair-wise comparison tie-breaker rules can be used to sort a list.

Rule 1: Avoid unusable destinations. If DB is known to be unreachable or if Source(DB) is undefined, then prefer DA. Similarly, if DA is known to be unreachable or if Source(DA) is undefined, then prefer DB.

Discussion: An implementation might know that a particular destination is unreachable in several ways. For example, the destination might be reached through a network interface that is currently unplugged. As another example, the implementation might retain for some period of time information from Neighbor Unreachability Detection [RFC4861]. In any case, the determination of unreachability for the purposes of this rule is implementation-dependent.

Rule 2: Prefer matching scope. If Scope(DA) = Scope(Source(DA)) and Scope(DB) <> Scope(Source(DB)), then prefer DA. Similarly, if Scope(DA) <> Scope(Source(DA)) and Scope(DB) = Scope(Source(DB)), then prefer DB.

Rule 3: Avoid deprecated addresses. If Source(DA) is deprecated and Source(DB) is not, then prefer DB. Similarly, if Source(DA) is not deprecated and Source(DB) is deprecated, then prefer DA.

Rule 4: Prefer home addresses. If Source(DA) is simultaneously a home address and care-of address and Source(DB) is not, then prefer DA. Similarly, if Source(DB) is simultaneously a home address and care-of address and Source(DA) is not, then prefer DB.

If Source(DA) is just a home address and Source(DB) is just a care-of address, then prefer DA. Similarly, if Source(DA) is just a care-of address and Source(DB) is just a home address, then prefer DB.

Rule 5: Prefer matching label. If Label(Source(DA)) = Label(DA) and Label(Source(DB)) <> Label(DB), then prefer DA. Similarly, if Label(Source(DA)) <> Label(DA) and Label(Source(DB)) = Label(DB), then prefer DB.

Rule 6: Prefer higher precedence. If Precedence(DA) > Precedence(DB), then prefer DA. Similarly, if Precedence(DA) < Precedence(DB), then prefer DB.

Rule 7: Prefer native transport. If DA is reached via an encapsulating transition mechanism (e.g., IPv6 in IPv4) and DB is not, then prefer DB. Similarly, if DB is reached via encapsulation and DA is not, then prefer DA.

Discussion: "IPv6 Rapid Deployment on IPv4 Infrastructures" (6rd) [RFC5969], the Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) [RFC5214], and configured tunnels [RFC4213] are examples of encapsulating transition mechanisms for which the destination address does not have a specific prefix and hence cannot be assigned a lower precedence in the policy table. An implementation MAY generalize this rule by using a concept of interface preference, and giving virtual interfaces (like the IPv6-in-IPv4 encapsulating interfaces) a lower preference than native interfaces (like Ethernet interfaces).

Rule 8: Prefer smaller scope. If Scope(DA) < Scope(DB), then prefer DA. Similarly, if Scope(DA) > Scope(DB), then prefer DB.

Rule 9: Use longest matching prefix. When DA and DB belong to the same address family (both are IPv6 or both are IPv4): If CommonPrefixLen(Source(DA), DA) > CommonPrefixLen(Source(DB), DB), then prefer DA. Similarly, if CommonPrefixLen(Source(DA), DA) < CommonPrefixLen(Source(DB), DB), then prefer DB.

Rule 10: Otherwise, leave the order unchanged. If DA preceded DB in the original list, prefer DA. Otherwise prefer DB.

Rules 9 and 10 MAY be superseded if the implementation has other means of sorting destination addresses, for example, if the implementation somehow knows which destination addresses will result in the "best" communications performance.
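A partial sketch of the destination sort follows. It implements only Rules 1 (restricted to the "Source(D) is undefined" case), 2, 5, 6, and 8, and obtains Rule 10 from the stability of the sort. Rules 3, 4, 7, and 9 are omitted. The function names and the source_for mapping, which stands in for Source(D) and would normally come from source address selection, are assumptions for this example.

```python
import functools
import ipaddress
from typing import Dict, List, Optional

POLICY = [(ipaddress.IPv6Network(p), prec, label) for p, prec, label in [
    ("::1/128", 50, 0), ("::/0", 40, 1), ("::ffff:0:0/96", 35, 4),
    ("2002::/16", 30, 2), ("2001::/32", 5, 5), ("fc00::/7", 3, 13),
    ("::/96", 1, 3), ("fec0::/10", 1, 11), ("3ffe::/16", 1, 12)]]


def to_v6(text: str) -> ipaddress.IPv6Address:
    """Represent IPv4 addresses as IPv4-mapped IPv6 addresses."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        return ipaddress.IPv6Address("::ffff:" + str(addr))
    return addr


def lookup(addr):
    """Return (precedence, label) by longest-prefix match."""
    row = max((r for r in POLICY if addr in r[0]), key=lambda r: r[0].prefixlen)
    return row[1], row[2]


def scope(addr) -> int:
    v4 = addr.ipv4_mapped
    if v4 is not None:
        return 0x2 if (v4.is_link_local or v4.is_loopback) else 0xE
    if addr.is_multicast:
        return addr.packed[1] & 0x0F
    if addr.is_link_local or addr.is_loopback:
        return 0x2
    if addr.is_site_local:
        return 0x5
    return 0xE


def sort_destinations(dests: List[str],
                      source_for: Dict[str, Optional[str]]) -> List[str]:
    def src(d):
        s = source_for.get(d)
        return None if s is None else to_v6(s)

    def compare(da, db):  # negative result: DA sorts before DB
        a, b, sa, sb = to_v6(da), to_v6(db), src(da), src(db)
        if (sa is None) != (sb is None):          # Rule 1: unusable destination
            return 1 if sa is None else -1
        if sa is not None and sb is not None:
            ma, mb = scope(a) == scope(sa), scope(b) == scope(sb)
            if ma != mb:                          # Rule 2: matching scope
                return -1 if ma else 1
            la = lookup(sa)[1] == lookup(a)[1]
            lb = lookup(sb)[1] == lookup(b)[1]
            if la != lb:                          # Rule 5: matching label
                return -1 if la else 1
        pa, pb = lookup(a)[0], lookup(b)[0]
        if pa != pb:                              # Rule 6: higher precedence
            return -1 if pa > pb else 1
        if scope(a) != scope(b):                  # Rule 8: smaller scope
            return -1 if scope(a) < scope(b) else 1
        return 0                                  # Rule 10: keep input order

    return sorted(dests, key=functools.cmp_to_key(compare))
```

This reproduces the behavior motivated in the Introduction: with a global IPv6 source and only a 169.254/16 IPv4 source, the IPv6 destination sorts first, and with only a link-local IPv6 source and a global IPv4 source, the IPv4 destination sorts first.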

7. Interactions with Routing

This specification of source address selection assumes that routing (more precisely, selecting an outgoing interface on a node with multiple interfaces) is done before source address selection. However, implementations MAY use source address considerations as a tiebreaker when choosing among otherwise equivalent routes.

For example, suppose a node has interfaces on two different links, with both links having a working default router. Both of the interfaces have preferred (in the RFC 4862 sense) global addresses. When sending to a global destination address, if there’s no routing reason to prefer one interface over the other, then an implementation MAY preferentially choose the outgoing interface that will allow it to use the source address that shares a longer common prefix with the destination.

Implementations that support source address selection (Section 5) Rule 5.5 also use the choice of router to influence the choice of source address. For example, suppose a host is on a link with two routers. One router is advertising a global prefix A and the other router is advertising global prefix B. Then when sending via the first router, the host might prefer source addresses with prefix A and when sending via the second router, prefer source addresses with prefix B.

8. Implementation Considerations

The destination address selection algorithm needs information about potential source addresses. One possible implementation strategy is for getaddrinfo() to call down to the network layer with a list of destination addresses, sort the list in the network layer with full current knowledge of available source addresses, and return the sorted list to getaddrinfo(). This is simple and gives the best results but it introduces the overhead of another system call. One way to reduce this overhead is to cache the sorted address list in the resolver, so that subsequent calls for the same name do not need to re-sort the list.

Another implementation strategy is to call down to the network layer to retrieve source address information and then sort the list of addresses directly in the context of getaddrinfo(). To reduce overhead in this approach, the source address information can be cached, amortizing the overhead of retrieving it across multiple calls to getaddrinfo(). In this approach, the implementation might not have knowledge of the outgoing interface for each destination, so it MAY use a looser definition of the candidate set during destination address ordering.

In any case, if the implementation uses cached and possibly stale information in its implementation of destination address selection, or if the ordering of a cached list of destination addresses is possibly stale, then it MUST ensure that the destination address ordering returned to the application is no more than one second out of date. For example, an implementation might make a system call to check if any routing table entries or source address assignments or prefix policy table entries that might affect these algorithms have changed. Another strategy is to use an invalidation counter that is incremented whenever any underlying state is changed. By caching the current invalidation counter value with derived state and then later comparing against the current value, the implementation could detect if the derived state is potentially stale.
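The invalidation-counter strategy described above might look roughly like the following sketch. The class names, the `changed()` hook, and the pluggable `sorter` callable are assumptions made for illustration; a real implementation would increment the counter from wherever routing, address, or policy table state is modified.

```python
class NetworkState:
    """Hypothetical stand-in for routing, source address, and policy table state."""

    def __init__(self):
        self.generation = 0  # the invalidation counter

    def changed(self):
        # Called whenever any underlying state is modified.
        self.generation += 1


class SortedListCache:
    """Caches sorted destination lists together with the counter value seen at sort time."""

    def __init__(self, state, sorter):
        self._state = state
        self._sorter = sorter
        self._entries = {}  # tuple(addresses) -> (generation, sorted list)

    def lookup(self, addresses):
        key = tuple(addresses)
        cached = self._entries.get(key)
        if cached is not None and cached[0] == self._state.generation:
            return cached[1]  # derived state is still current
        result = self._sorter(list(addresses))
        self._entries[key] = (self._state.generation, result)
        return result


state = NetworkState()
cache = SortedListCache(state, sorted)  # sorted() stands in for the real algorithm
cache.lookup(["2001:db8::2", "2001:db8::1"])
state.changed()                          # e.g. a source address was added
cache.lookup(["2001:db8::2", "2001:db8::1"])  # recomputed, not served stale
```

Comparing counters is cheap, so the check can be made on every call, which keeps the returned ordering well within the one-second staleness bound.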

9. Security Considerations

This document has no direct impact on Internet infrastructure security.

Note that most source address selection algorithms, including the one specified in this document, expose a potential privacy concern. An unfriendly node can infer correlations among a target node’s addresses by probing the target node with request packets that force the target host to choose its source address for the reply packets. (Perhaps because the request packets are sent to an anycast or multicast address, or perhaps the upper-layer protocol chosen for the attack does not specify a particular source address for its reply packets.) By using different addresses for itself, the unfriendly node can cause the target node to expose the target’s own addresses. The source address selection default preference for temporary addresses helps mitigate this concern.

Similarly, most source and destination address selection algorithms, including the one specified in this document, influence the choice of network path taken (as do routing algorithms that are orthogonal to, but used together with such algorithms) and hence whether data might be sent over a path or network that might be more or less trusted than other paths or networks. Administrators should consider the security impact of the rows they configure in the prefix policy table, just as they should consider the security impact of the interface metrics used in the routing algorithms.

In addition, some address selection rules might be administratively configurable. Care must be taken to make sure that all administrative options are secured against illicit modification, or else an attacker could redirect and/or block traffic.

10. Examples

This section contains a number of examples, first of default behavior and then demonstrating the utility of policy table configuration. These examples are provided for illustrative purposes; they are not to be construed as normative.

10.1. Default Source Address Selection

The source address selection rules, in conjunction with the default policy table, produce the following behavior:

Destination: 2001:db8:1::1 Candidate Source Addresses: 2001:db8:3::1 or fe80::1 Result: 2001:db8:3::1 (prefer appropriate scope)

Destination: ff05::1 Candidate Source Addresses: 2001:db8:3::1 or fe80::1 Result: 2001:db8:3::1 (prefer appropriate scope)

Destination: 2001:db8:1::1 Candidate Source Addresses: 2001:db8:1::1 (deprecated) or 2001:db8:2::1 Result: 2001:db8:1::1 (prefer same address)

Destination: fe80::1 Candidate Source Addresses: fe80::2 (deprecated) or 2001:db8:1::1 Result: fe80::2 (prefer appropriate scope)

Destination: 2001:db8:1::1 Candidate Source Addresses: 2001:db8:1::2 or 2001:db8:3::2 Result: 2001:db8:1::2 (longest-matching-prefix)

Destination: 2001:db8:1::1 Candidate Source Addresses: 2001:db8:1::2 (care-of address) or 2001:db8:3::2 (home address) Result: 2001:db8:3::2 (prefer home address)

Destination: 2002:c633:6401::1 Candidate Source Addresses: 2002:c633:6401::d5e3:7953:13eb:22e8 (temporary) or 2001:db8:1::2 Result: 2002:c633:6401::d5e3:7953:13eb:22e8 (prefer matching label)

Destination: 2001:db8:1::d5e3:0:0:1 Candidate Source Addresses: 2001:db8:1::2 or 2001:db8:1::d5e3:7953:13eb:22e8 (temporary) Result: 2001:db8:1::d5e3:7953:13eb:22e8 (prefer temporary address)

10.2. Default Destination Address Selection

The destination address selection rules, in conjunction with the default policy table and the source address selection rules, produce the following behavior:

Candidate Source Addresses: 2001:db8:1::2 or fe80::1 or 169.254.13.78 Destination Address List: 2001:db8:1::1 or 198.51.100.121 Result: 2001:db8:1::1 (src 2001:db8:1::2) then 198.51.100.121 (src 169.254.13.78) (prefer matching scope)

Candidate Source Addresses: fe80::1 or 198.51.100.117 Destination Address List: 2001:db8:1::1 or 198.51.100.121 Result: 198.51.100.121 (src 198.51.100.117) then 2001:db8:1::1 (src fe80::1) (prefer matching scope)

Candidate Source Addresses: 2001:db8:1::2 or fe80::1 or 10.1.2.4 Destination Address List: 2001:db8:1::1 or 10.1.2.3 Result: 2001:db8:1::1 (src 2001:db8:1::2) then 10.1.2.3 (src 10.1.2.4) (prefer higher precedence)

Candidate Source Addresses: 2001:db8:1::2 or fe80::2 Destination Address List: 2001:db8:1::1 or fe80::1 Result: fe80::1 (src fe80::2) then 2001:db8:1::1 (src 2001:db8:1::2) (prefer smaller scope)

Candidate Source Addresses: 2001:db8:1::2 (care-of address) or 2001:db8:3::1 (home address) or fe80::2 (care-of address) Destination Address List: 2001:db8:1::1 or fe80::1 Result: 2001:db8:1::1 (src 2001:db8:3::1) then fe80::1 (src fe80::2) (prefer home address)

Candidate Source Addresses: 2001:db8:1::2 or fe80::2 (deprecated) Destination Address List: 2001:db8:1::1 or fe80::1 Result: 2001:db8:1::1 (src 2001:db8:1::2) then fe80::1 (src fe80::2) (avoid deprecated addresses)

Candidate Source Addresses: 2001:db8:1::2 or 2001:db8:3f44::2 or fe80::2 Destination Address List: 2001:db8:1::1 or 2001:db8:3ffe::1 Result: 2001:db8:1::1 (src 2001:db8:1::2) then 2001:db8:3ffe::1 (src 2001:db8:3f44::2) (longest matching prefix)

Candidate Source Addresses: 2002:c633:6401::2 or fe80::2 Destination Address List: 2002:c633:6401::1 or 2001:db8:1::1 Result: 2002:c633:6401::1 (src 2002:c633:6401::2) then 2001:db8:1::1 (src 2002:c633:6401::2) (prefer matching label)

Candidate Source Addresses: 2002:c633:6401::2 or 2001:db8:1::2 or fe80::2 Destination Address List: 2002:c633:6401::1 or 2001:db8:1::1 Result: 2001:db8:1::1 (src 2001:db8:1::2) then 2002:c633:6401::1 (src 2002:c633:6401::2) (prefer higher precedence)

10.3. Configuring Preference for IPv6 or IPv4

The default policy table gives IPv6 addresses higher precedence than IPv4 addresses. This means that applications will use IPv6 in preference to IPv4 when the two are equally suitable. An administrator can change the policy table to prefer IPv4 addresses by giving the ::ffff:0.0.0.0/96 prefix a higher precedence:

   Prefix                 Precedence  Label
   ::1/128                        50      0
   ::/0                           40      1
   ::ffff:0:0/96                 100      4
   2002::/16                      30      2
   2001::/32                       5      5
   fc00::/7                        3     13
   ::/96                           1      3
   fec0::/10                       1     11
   3ffe::/16                       1     12

This change to the default policy table produces the following behavior:

Candidate Source Addresses: 2001:db8::2 or fe80::1 or 169.254.13.78 Destination Address List: 2001:db8::1 or 198.51.100.121 Unchanged Result: 2001:db8::1 (src 2001:db8::2) then 198.51.100.121 (src 169.254.13.78) (prefer matching scope)

Candidate Source Addresses: fe80::1 or 198.51.100.117 Destination Address List: 2001:db8::1 or 198.51.100.121 Unchanged Result: 198.51.100.121 (src 198.51.100.117) then 2001:db8::1 (src fe80::1) (prefer matching scope)

Candidate Source Addresses: 2001:db8::2 or fe80::1 or 10.1.2.4 Destination Address List: 2001:db8::1 or 10.1.2.3 New Result: 10.1.2.3 (src 10.1.2.4) then 2001:db8::1 (src 2001:db8::2) (prefer higher precedence)
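The effect of raising the ::ffff:0:0/96 precedence can be checked with a small longest-prefix-match lookup over the policy table. The following is a sketch, not a normative algorithm: the helper names are hypothetical, the default rows are copied from the tables in this section, and IPv4 addresses are represented as IPv4-mapped IPv6 addresses for the purpose of the lookup.

```python
import ipaddress

# (prefix, precedence, label) rows of the default policy table.
DEFAULT_TABLE = [
    ("::1/128", 50, 0),
    ("::/0", 40, 1),
    ("::ffff:0:0/96", 35, 4),
    ("2002::/16", 30, 2),
    ("2001::/32", 5, 5),
    ("fc00::/7", 3, 13),
    ("::/96", 1, 3),
    ("fec0::/10", 1, 11),
    ("3ffe::/16", 1, 12),
]

# Same table with the IPv4-mapped row raised to precedence 100.
PREFER_IPV4_TABLE = [
    (p, 100 if p == "::ffff:0:0/96" else prec, label)
    for p, prec, label in DEFAULT_TABLE
]

def to_ipv6(addr):
    """Represent an IPv4 address as an IPv4-mapped IPv6 address."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return ipaddress.IPv6Address("::ffff:" + str(ip))
    return ip

def lookup(table, addr):
    """Return (precedence, label) of the longest matching prefix."""
    ip = to_ipv6(addr)
    best = None
    for prefix, precedence, label in table:
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, precedence, label)
    return best[1], best[2]  # ::/0 always matches, so best is never None

for name, table in (("default", DEFAULT_TABLE), ("prefer IPv4", PREFER_IPV4_TABLE)):
    print(name, lookup(table, "2001:db8::1"), lookup(table, "10.1.2.3"))
```

With the default rows, 10.1.2.3 has precedence 35 against 40 for 2001:db8::1, so the IPv6 destination sorts first; with the modified row, 10.1.2.3 has precedence 100 and sorts first, matching the "New Result" above.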

10.3.1. Handling Broken IPv6

One problem recently observed in practice occurs when a host has IPv4 connectivity to the Internet but "broken" IPv6 connectivity: it has a global IPv6 address but is disconnected from the IPv6 Internet. Since the default policy table prefers IPv6, this can result in unwanted timeouts.

This can be solved by configuring the table to prefer IPv4 as shown above. An implementation that has some means to detect that it is not connected to the IPv6 Internet MAY do this automatically. An implementation could instead treat it as part of its implementation of Rule 1 (avoid unusable destinations).

10.4. Configuring Preference for Link-Local Addresses

The destination address selection rules give preference to destinations of smaller scope. For example, a link-local destination will be sorted before a global scope destination when the two are otherwise equally suitable. An administrator can change the policy table to reverse this preference and sort global destinations before link-local destinations:

   Prefix                 Precedence  Label
   ::1/128                        50      0
   ::/0                           40      1
   ::ffff:0:0/96                  35      4
   fe80::/10                      33      1
   2002::/16                      30      2
   2001::/32                       5      5
   fc00::/7                        3     13
   ::/96                           1      3
   fec0::/10                       1     11
   3ffe::/16                       1     12

This change to the default policy table produces the following behavior:

Candidate Source Addresses: 2001:db8::2 or fe80::2 Destination Address List: 2001:db8::1 or fe80::1 New Result: 2001:db8::1 (src 2001:db8::2) then fe80::1 (src fe80::2) (prefer higher precedence)

Candidate Source Addresses: 2001:db8::2 (deprecated) or fe80::2 Destination Address List: 2001:db8::1 or fe80::1 Unchanged Result: fe80::1 (src fe80::2) then 2001:db8::1 (src 2001:db8::2) (avoid deprecated addresses)

10.5. Configuring a Multi-Homed Site

Consider a site A that has a business-critical relationship with another site B. To support their business needs, the two sites have contracted for service with a special high-performance ISP. This is in addition to the normal Internet connection that both sites have with different ISPs. The high-performance ISP is expensive and the two sites wish to use it only for their business-critical traffic with each other.

Each site has two global prefixes, one from the high-performance ISP and one from their normal ISP. Site A has prefix 2001:db8:1aaa::/48 from the high-performance ISP and prefix 2001:db8:70aa::/48 from its normal ISP. Site B has prefix 2001:db8:1bbb::/48 from the high-performance ISP and prefix 2001:db8:70bb::/48 from its normal ISP. All hosts in both sites register two addresses in the DNS.

The routing within both sites directs most traffic to the egress to the normal ISP, but the routing directs traffic sent to the other site’s high-performance ISP prefix to the egress to the high-performance ISP. To prevent unintended use of their high-performance ISP connection, the two sites implement ingress filtering to discard traffic entering from the high-performance ISP that is not from the other site.

The default policy table and address selection rules produce the following behavior:

Candidate Source Addresses: 2001:db8:1aaa::a or 2001:db8:70aa::a or fe80::a Destination Address List: 2001:db8:1bbb::b or 2001:db8:70bb::b Result: 2001:db8:70bb::b (src 2001:db8:70aa::a) then 2001:db8:1bbb::b (src 2001:db8:1aaa::a) (longest matching prefix)

In other words, when a host in site A initiates a connection to a host in site B, the traffic does not take advantage of their connections to the high-performance ISP. This is not their desired behavior.

Candidate Source Addresses: 2001:db8:1aaa::a or 2001:db8:70aa::a or fe80::a Destination Address List: 2001:db8:1ccc::c or 2001:db8:6ccc::c Result: 2001:db8:1ccc::c (src 2001:db8:1aaa::a) then 2001:db8:6ccc::c (src 2001:db8:70aa::a) (longest matching prefix)

In other words, when a host in site A initiates a connection to a host in some other site C, the reverse traffic might come back through the high-performance ISP. Again, this is not their desired behavior.

This predicament demonstrates the limitations of the longest-matching-prefix heuristic in multi-homed situations.
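To see why the default rules order the destinations this way, the common prefix lengths for the two examples above can be computed directly. This is a minimal sketch; the function name mirrors CommonPrefixLen() but the implementation, and the assumed 64-bit source prefix length, are illustrative only.

```python
import ipaddress

def common_prefix_len(src, dst, src_prefix_len=64):
    """Leading bits shared by src and dst, capped at the source prefix length."""
    diff = int(ipaddress.IPv6Address(src)) ^ int(ipaddress.IPv6Address(dst))
    return min(128 - diff.bit_length(), src_prefix_len)

# Site A to site B: the normal-ISP pair shares more bits, so it is preferred.
print(common_prefix_len("2001:db8:70aa::a", "2001:db8:70bb::b"))  # 43
print(common_prefix_len("2001:db8:1aaa::a", "2001:db8:1bbb::b"))  # 39

# Site A to site C: the high-performance prefix happens to match better.
print(common_prefix_len("2001:db8:1aaa::a", "2001:db8:1ccc::c"))  # 37
print(common_prefix_len("2001:db8:70aa::a", "2001:db8:6ccc::c"))  # 35
```

In both cases the outcome depends only on accidental bit patterns in the assigned prefixes, not on which ISP the sites intend to use.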

However, the administrators of sites A and B can achieve their desired behavior via policy table configuration. For example, they can use the following policy table:

   Prefix                 Precedence  Label
   ::1/128                        50      0
   2001:db8:1aaa::/48             43      6
   2001:db8:1bbb::/48             43      6
   ::/0                           40      1
   ::ffff:0:0/96                  35      4
   2002::/16                      30      2
   2001::/32                       5      5
   fc00::/7                        3     13
   ::/96                           1      3
   fec0::/10                       1     11
   3ffe::/16                       1     12

This policy table produces the following behavior:

Candidate Source Addresses: 2001:db8:1aaa::a or 2001:db8:70aa::a or fe80::a Destination Address List: 2001:db8:1bbb::b or 2001:db8:70bb::b New Result: 2001:db8:1bbb::b (src 2001:db8:1aaa::a) then 2001:db8:70bb::b (src 2001:db8:70aa::a) (prefer higher precedence)

In other words, when a host in site A initiates a connection to a host in site B, the traffic uses the high-performance ISP as desired.

Candidate Source Addresses: 2001:db8:1aaa::a or 2001:db8:70aa::a or fe80::a Destination Address List: 2001:db8:1ccc::c or 2001:db8:6ccc::c New Result: 2001:db8:6ccc::c (src 2001:db8:70aa::a) then 2001:db8:1ccc::c (src 2001:db8:70aa::a) (longest matching prefix)

In other words, when a host in site A initiates a connection to a host in some other site C, the traffic uses the normal ISP as desired.

10.6. Configuring ULA Preference

RFC 5220 [RFC5220] sections 2.1.4, 2.2.2, and 2.2.3 describe address selection problems related to Unique Local Addresses (ULAs) [RFC4193]. By default, global IPv6 destinations are preferred over ULA destinations, since an arbitrary ULA is not necessarily reachable:

Candidate Source Addresses: 2001:db8:1::1 or fd11:1111:1111:1::1 Destination Address List: 2001:db8:2::2 or fd22:2222:2222:2::2 Result: 2001:db8:2::2 (src 2001:db8:1::1) then fd22:2222:2222:2::2 (src fd11:1111:1111:1::1) (prefer higher precedence)

However, a site-specific policy entry can be used to cause ULAs within a site to be preferred over global addresses as follows.

   Prefix                 Precedence  Label
   ::1/128                        50      0
   fd11:1111:1111::/48            45     14
   ::/0                           40      1
   ::ffff:0:0/96                  35      4
   2002::/16                      30      2
   2001::/32                       5      5
   fc00::/7                        3     13
   ::/96                           1      3
   fec0::/10                       1     11
   3ffe::/16                       1     12

Such a configuration would have the following effect:

Candidate Source Addresses: 2001:db8:1::1 or fd11:1111:1111:1::1 Destination Address List: 2001:db8:2::2 or fd22:2222:2222:2::2 Unchanged Result: 2001:db8:2::2 (src 2001:db8:1::1) then fd22:2222:2222:2::2 (src fd11:1111:1111:1::1) (prefer higher precedence)

Candidate Source Addresses: 2001:db8:1::1 or fd11:1111:1111:1::1 Destination Address List: 2001:db8:2::2 or fd11:1111:1111:2::2 New Result: fd11:1111:1111:2::2 (src fd11:1111:1111:1::1) then 2001:db8:2::2 (src 2001:db8:1::1) (prefer higher precedence)

Since ULAs are defined to have a /48 site prefix, an implementation might choose to add such a row automatically on a machine with a ULA.
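One way an implementation might derive such a row is sketched below. The function name is hypothetical, and the precedence (45) and label (14) are simply the values used in the example table above; this document does not mandate them for automatically added rows.

```python
import ipaddress

ULA_RANGE = ipaddress.ip_network("fc00::/7")

def ula_site_row(addr, precedence=45, label=14):
    """Return a (prefix, precedence, label) row for the /48 site prefix of a ULA,
    or None if addr is not a ULA."""
    ip = ipaddress.IPv6Address(addr)
    if ip not in ULA_RANGE:
        return None
    # Mask the address down to its /48 site prefix.
    site = ipaddress.ip_network((ip, 48), strict=False)
    return (str(site), precedence, label)

print(ula_site_row("fd11:1111:1111:1::1"))  # ('fd11:1111:1111::/48', 45, 14)
print(ula_site_row("2001:db8:1::1"))        # None
```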

It is also worth noting that ULAs are assigned global scope. As such, the existence of one or more rows in the prefix policy table is important so that source address selection does not choose a ULA purely based on longest match:

Candidate Source Addresses: 2001:db8:1::1 or fd11:1111:1111:1::1 Destination Address List: ff00::1 Result: 2001:db8:1::1 (prefer matching label)

10.7. Configuring 6to4 Preference

By default, NAT’ed IPv4 is preferred over 6to4-relayed connectivity:

Candidate Source Addresses: 2002:c633:6401::2 or 10.1.2.3 Destination Address List: 2001:db8:1::1 or 203.0.113.1 Result: 203.0.113.1 (src 10.1.2.3) then 2001:db8:1::1 (src 2002:c633:6401::2) (prefer matching label)

However, NAT’ed IPv4 is now also preferred over 6to4-to-6to4 connectivity by default. Since a 6to4 prefix might be used natively within an organization, a site-specific policy entry can be used to cause native IPv6 communication (using a 6to4 prefix) to be preferred over NAT’ed IPv4 as follows.

   Prefix                 Precedence  Label
   ::1/128                        50      0
   2002:c633:6401::/48            45     14
   ::/0                           40      1
   ::ffff:0:0/96                  35      4
   2002::/16                      30      2
   2001::/32                       5      5
   fc00::/7                        3     13
   ::/96                           1      3
   fec0::/10                       1     11
   3ffe::/16                       1     12

Such a configuration would have the following effect:

Candidate Source Addresses: 2002:c633:6401:1::1 or 10.1.2.3 Destination Address List: 2002:c633:6401:2::2 or 203.0.113.1 New Result: 2002:c633:6401:2::2 (src 2002:c633:6401:1::1) then 203.0.113.1 (src 10.1.2.3) (prefer higher precedence)

Since 6to4 addresses are defined to have a /48 site prefix, an implementation might choose to add such a row automatically on a machine with a native IPv6 address with a 6to4 prefix.

11. IANA Considerations

This document has no IANA actions.

12. References

12.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3056] Carpenter, B. and K. Moore, "Connection of IPv6 Domains via IPv4 Clouds", RFC 3056, February 2001.

[RFC3879] Huitema, C. and B. Carpenter, "Deprecating Site Local Addresses", RFC 3879, September 2004.

[RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast Addresses", RFC 4193, October 2005.

[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 4291, February 2006.

[RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through Network Address Translations (NATs)", RFC 4380, February 2006.

[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless Address Autoconfiguration", RFC 4862, September 2007.

[RFC4941] Narten, T., Draves, R., and S. Krishnan, "Privacy Extensions for Stateless Address Autoconfiguration in IPv6", RFC 4941, September 2007.

[RFC6145] Li, X., Bao, C., and F. Baker, "IP/ICMP Translation Algorithm", RFC 6145, April 2011.

12.2. Informative References

[I-D.ietf-6man-addr-select-opt] Matsumoto, A., Fujisaki, T., Kato, J., and T. Chown, "Distributing Address Selection Policy using DHCPv6", draft-ietf-6man-addr-select-opt-03 (work in progress), February 2012.

[RFC1794] Brisco, T., "DNS Support for Load Balancing", RFC 1794, April 1995.

[RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, February 1996.

[RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, RFC 2827, May 2000.

[RFC3493] Gilligan, R., Thomson, S., Bound, J., McCann, J., and W. Stevens, "Basic Socket Interface Extensions for IPv6", RFC 3493, February 2003.

[RFC3701] Fink, R. and R. Hinden, "6bone (IPv6 Testing Address Allocation) Phaseout", RFC 3701, March 2004.

[RFC3927] Cheshire, S., Aboba, B., and E. Guttman, "Dynamic Configuration of IPv4 Link-Local Addresses", RFC 3927, May 2005.

[RFC4007] Deering, S., Haberman, B., Jinmei, T., Nordmark, E., and B. Zill, "IPv6 Scoped Address Architecture", RFC 4007, March 2005.

[RFC4191] Draves, R. and D. Thaler, "Default Router Preferences and More-Specific Routes", RFC 4191, November 2005.

[RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms for IPv6 Hosts and Routers", RFC 4213, October 2005.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC5014] Nordmark, E., Chakrabarti, S., and J. Laganier, "IPv6 Socket API for Source Address Selection", RFC 5014, September 2007.

[RFC5214] Templin, F., Gleeson, T., and D. Thaler, "Intra-Site Automatic Tunnel Addressing Protocol (ISATAP)", RFC 5214, March 2008.

[RFC5220] Matsumoto, A., Fujisaki, T., Hiromi, R., and K. Kanayama, "Problem Statement for Default Address Selection in Multi-Prefix Environments: Operational Issues of RFC 3484 Default Rules", RFC 5220, July 2008.

[RFC5969] Townsley, W. and O. Troan, "IPv6 Rapid Deployment on IPv4 Infrastructures (6rd) -- Protocol Specification", RFC 5969, August 2010.

[RFC6275] Perkins, C., Johnson, D., and J. Arkko, "Mobility Support in IPv6", RFC 6275, July 2011.

[RFC6598] Weil, J., Kuarsingh, V., Donley, C., Liljenstolpe, C., and M. Azinger, "IANA-Reserved IPv4 Prefix for Shared Address Space", BCP 153, RFC 6598, April 2012.

Appendix A. Acknowledgements

RFC 3484 acknowledged the contributions of the IPng Working Group, particularly Marc Blanchet, Brian Carpenter, Matt Crawford, Alain Durand, Steve Deering, Robert Elz, Jun-ichiro itojun Hagino, Tony Hain, M.T. Hollinger, JINMEI Tatuya, Thomas Narten, Erik Nordmark, Ken Powell, Markku Savela, Pekka Savola, Hesham Soliman, Dave Thaler, Mauro Tortonesi, Ole Troan, and Stig Venaas. In addition, the anonymous IESG reviewers had many great comments and suggestions for clarification.

This revision was heavily influenced by the work by Arifumi Matsumoto, Jun-ya Kato, and Tomohiro Fujisaki in a working draft that made proposals for this revision to adopt, with input from Pekka Savola, Remi Denis-Courmont, Francois-Xavier Le Bail, and the 6man Working Group. Dmitry Anipko, Mark Andrews, Ray Hunter, and Wes George also provided valuable feedback on this revision.

Appendix B. Changes Since RFC 3484

Some changes were made to the default policy table that were deemed to be universally useful and cause no harm in every reasonable network environment. In doing so, care was taken to use the same preference and label values as in RFC 3484 whenever possible, and for new rows to use label values less likely to collide with values that might already be in use in additional rows on some hosts. These changes are:

1. Added the Teredo [RFC4380] prefix (2001::/32), with the preference and label values already widely used in popular implementations.

2. Added a row for ULAs (fc00::/7) below native IPv6 since they are not globally reachable, as discussed in Section 10.6.

3. Added a row for site-local addresses (fec0::/10) in order to depreference them, for consistency with the example in Section 10.3, since they are deprecated [RFC3879].

4. Depreferenced 6to4 (2002::/16) below native IPv4 since 6to4 connectivity is less reliable today (and is expected to be phased out over time, rather than becoming more reliable). It remains above Teredo since 6to4 is more efficient in terms of connection establishment time, bandwidth, and server load.

5. Depreferenced IPv4-Compatible addresses (::/96) since they are now deprecated [RFC4291] and not in common use.

6. Added a row for 6bone testing addresses (3ffe::/16) in order to depreference them as they have also been phased out [RFC3701].

7. Added optional ability for an implementation to add automatic rows to the table for site-specific ULA prefixes and site-specific native 6to4 prefixes.

Similarly, some changes were made to the rules, as follows:

1. Changed the definition of CommonPrefixLen() to only compare bits up to the source address’s prefix length. The previous definition used the entire source address, rather than only its prefix. As a result, when a source and a destination address had the same prefix, common bits in the interface ID would previously result in overriding DNS load balancing [RFC1794] by forcing the destination address with the most bits in common to be always chosen. The updated definition allows DNS load balancing to continue to be used as a tie breaker.

2. Added Rule 5.5 to allow choosing a source address from a prefix advertised by the chosen next-hop for a given destination. This allows better connectivity in the presence of BCP 38 [RFC2827] ingress filtering and egress filtering. Previously, RFC 3484 had issues with multiple egress networks reached via the same interface, as discussed in [RFC5220].

3. Removed restriction against anycast addresses in the candidate set of source addresses, since the restriction against using IPv6 anycast addresses as source addresses was removed in Section 2.6 of RFC 4291 [RFC4291].

4. Changed mapping of RFC 1918 [RFC1918] addresses to global scope in Section 3.2. Previously they were mapped to site-local scope. However, experience has resulted in current implementations already using global scope instead. When they were mapped to site-local, Destination Address Selection Rule 2 (Prefer matching scope) would cause IPv6 to be preferred in scenarios such as that described in Section 10.7. The change to global scope allows configurability via the prefix policy table.

5. Changed the default recommendation for Source Address Selection Rule 7 to prefer temporary addresses rather than public addresses, while providing an administrative override (in addition to the application-specific override that was already specified). This change was made because of the increasing importance of privacy considerations, as well as the fact that widely deployed implementations have preferred temporary addresses for many years without major application issues.

Finally, some editorial changes were made, including:

1. Changed global IP addresses in examples to use ranges reserved for documentation.

2. Added additional examples in Section 10.6 and Section 10.7.

3. Added Section 10.3.1 on "broken" IPv6.

4. Updated references.

Authors’ Addresses

Dave Thaler (editor) Microsoft One Microsoft Way Redmond, WA 98052

Phone: +1 425 703 8835 Email: [email protected]

Richard Draves Microsoft Research One Microsoft Way Redmond, WA 98052

Phone: +1 425 706 2268 Email: [email protected]

Arifumi Matsumoto NTT SI Lab Midori-Cho 3-9-11 Musashino-shi, Tokyo 180-8585 Japan

Phone: +81 422 59 3334 Email: [email protected]

Tim Chown University of Southampton Southampton, Hampshire SO17 1BJ United Kingdom

Email: [email protected]

6MAN J. Hui Internet-Draft JP. Vasseur Intended status: Standards Track Cisco Systems, Inc Expires: June 16, 2012 December 14, 2011

RPL Option for Carrying RPL Information in Data-Plane Datagrams draft-ietf-6man-rpl-option-06

Abstract

The RPL protocol includes routing information in data-plane datagrams to quickly identify inconsistencies in the routing topology. This document describes the RPL Option for use among RPL routers to include such routing information.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on June 16, 2012.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Hui & Vasseur Expires June 16, 2012 [Page 1] Internet-Draft RPL Option December 2011

Table of Contents

1. Introduction
   1.1. Requirements Language
2. Overview
3. Format of the RPL Option
4. RPL Router Behavior
5. Security Considerations
   5.1. DAG Inconsistency Attacks
   5.2. DAO Inconsistency Attacks
6. IANA Considerations
7. Acknowledgements
8. Changes
9. References
   9.1. Normative References
   9.2. Informative References
Authors’ Addresses

1. Introduction

RPL is a distance vector IPv6 routing protocol designed for Low power and Lossy Networks (LLNs) [I-D.ietf-roll-rpl]. Such networks are typically constrained in energy and/or channel capacity. To conserve precious resources, a routing protocol must generate control traffic sparingly. However, this is at odds with the need to propagate new routing information quickly in order to resolve routing inconsistencies.

To help minimize resource consumption, RPL uses a slow proactive process to construct and maintain a routing topology but a reactive and dynamic process for resolving routing inconsistencies. In the steady state, RPL maintains the routing topology using a low-rate beaconing process. However, when RPL detects inconsistencies that may prevent proper datagram delivery, RPL temporarily increases the beacon rate to quickly resolve those inconsistencies. This dynamic rate control operation is governed by the use of dynamic timers, also referred to as "Trickle" timers and defined in [RFC6206]. In contrast to other routing protocols (e.g., OSPF [RFC2328]), RPL detects routing inconsistencies using data-path verification, by including routing information within the datagram itself. In doing so, repair mechanisms operate only as needed, allowing the control and data planes to operate on similar time scales. The main motivation for data-path verification in LLNs is that control plane traffic should be carefully bounded with respect to the data traffic. Intuitively, there is no need to solve routing issues (which may be temporary) in the absence of data traffic.

The RPL protocol constructs a Directed Acyclic Graph (DAG) that attempts to minimize path costs to the DAG root according to a set of metric and objective functions. There are circumstances where loops may occur, and RPL is designed to use a data-path loop detection method. This is one of the known requirements of RPL and other data-path usage might be defined in the future.

To that end, this document defines a new IPv6 option, called the RPL Option, to be carried within the IPv6 Hop-by-Hop header. The RPL Option is only for use between RPL routers participating in the same RPL Instance.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Overview

The RPL Option provides a mechanism to include routing information with each datagram that a router forwards. When receiving datagrams that include routing information, RPL routers process the routing information to help maintain the routing topology.

Every RPL router along a packet’s delivery path processes and updates the RPL Option. If the received packet does not already contain a RPL Option, the RPL router must insert a RPL Option before forwarding it to another RPL router. This draft also specifies the use of IPv6-in-IPv6 tunneling [RFC2473] when attaching a RPL option to a packet. Use of tunneling ensures that the original packet remains unmodified and that ICMP errors return to the RPL Option source rather than the source of the original packet.

3. Format of the RPL Option

The RPL Option is carried in an IPv6 Hop-by-Hop Options header, immediately following the IPv6 header. This option has an alignment requirement of 2n. The option has the following format:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                    |  Option Type  |  Opt Data Len |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |O|R|F|0|0|0|0|0| RPLInstanceID |          SenderRank           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          (sub-TLVs)                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1: RPL Option

Option Type: TBD by IANA.

Opt Data Len: 8-bit field indicating the length of the option, in octets, excluding the Option Type and Opt Data Len fields.

Down ’O’: 1-bit flag as defined in Section 11.2 of [I-D.ietf-roll-rpl]. The processing SHALL follow the rules described in Section 11.2 of [I-D.ietf-roll-rpl].

Rank-Error ’R’: 1-bit flag as defined in Section 11.2 of [I-D.ietf-roll-rpl]. The processing SHALL follow the rules described in Section 11.2 of [I-D.ietf-roll-rpl].

Forwarding-Error ’F’: 1-bit flag as defined in Section 11.2 of [I-D.ietf-roll-rpl]. The processing SHALL follow the rules described in Section 11.2 of [I-D.ietf-roll-rpl].

RPLInstanceID: 8-bit field as defined in Section 11.2 of [I-D.ietf-roll-rpl]. The processing SHALL follow the rules described in Section 11.2 of [I-D.ietf-roll-rpl].

SenderRank: 16-bit field as defined in Section 11.2 of [I-D.ietf-roll-rpl]. The processing SHALL follow the rules described in Section 11.2 of [I-D.ietf-roll-rpl].

The two high order bits of the Option Type MUST be set to ’01’ and the third bit is equal to ’1’. With these bits, according to [RFC2460], nodes that do not understand this option on a received packet MUST discard the packet. Also, according to [RFC2460], the values within the RPL Option are expected to change en-route. The RPL Option Data Length is variable.

The action taken by using the RPL Option and the potential set of sub-TLVs carried within the RPL Option MUST be specified by the RFC of the protocol that uses that option. No sub-TLVs are defined in this document. A RPL device MUST skip over any unrecognized sub-TLVs and attempt to process any additional sub-TLVs that may appear after.
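The layout in Figure 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the Option Type is still TBD by IANA, so it is passed in as a parameter and the value 0x60 used in the example is only a hypothetical placeholder that satisfies the ’011’ high-order bit pattern. The names RplOption, encode, and decode are likewise hypothetical, and sub-TLVs are carried as opaque bytes because this document defines none.

```python
import struct
from dataclasses import dataclass


@dataclass
class RplOption:
    down: bool                 # Down 'O' flag
    rank_error: bool           # Rank-Error 'R' flag
    forwarding_error: bool     # Forwarding-Error 'F' flag
    instance_id: int           # RPLInstanceID, 8 bits
    sender_rank: int           # SenderRank, 16 bits
    sub_tlvs: bytes = b""      # opaque; no sub-TLVs are defined in this document


def check_option_type(option_type: int) -> None:
    # High-order bits '01' (discard if unrecognized) followed by '1'
    # (option data may change en-route).
    if not 0 <= option_type <= 0xFF or option_type >> 5 != 0b011:
        raise ValueError("Option Type must be an octet with high-order bits 011")


def encode(opt: RplOption, option_type: int) -> bytes:
    check_option_type(option_type)
    flags = (opt.down << 7) | (opt.rank_error << 6) | (opt.forwarding_error << 5)
    data = struct.pack("!BBH", flags, opt.instance_id, opt.sender_rank) + opt.sub_tlvs
    # Opt Data Len excludes the Option Type and Opt Data Len fields themselves.
    return struct.pack("!BB", option_type, len(data)) + data


def decode(buf: bytes, option_type: int) -> RplOption:
    check_option_type(option_type)
    opt_type, data_len = struct.unpack_from("!BB", buf, 0)
    if opt_type != option_type or data_len < 4 or len(buf) < 2 + data_len:
        raise ValueError("not a well-formed RPL Option")
    flags, instance_id, sender_rank = struct.unpack_from("!BBH", buf, 2)
    return RplOption(bool(flags & 0x80), bool(flags & 0x40), bool(flags & 0x20),
                     instance_id, sender_rank, buf[6:2 + data_len])


PLACEHOLDER_TYPE = 0x60   # hypothetical value; the real one is TBD by IANA
wire = encode(RplOption(True, False, True, 0x1E, 0x0200), PLACEHOLDER_TYPE)
print(wire.hex())
```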

4. RPL Router Behavior

Datagrams sent between RPL routers MUST include a RPL Option or RPL Source Route Header ([I-D.ietf-6man-rpl-routing-header]) and MAY include both. A datagram including a SRH does not need to include a RPL Option since both the source and intermediate routers ensure that the SRH does not contain loops.

When the router is the source of the original packet and the destination is known to be within the same RPL Instance, the router SHOULD include the RPL Option directly within the original packet. Otherwise, routers MUST use IPv6-in-IPv6 tunneling [RFC2473] and place the RPL Option in the tunnel header. Using IPv6-in-IPv6 tunneling ensures that the delivered datagram remains unmodified and that ICMPv6 errors generated by a RPL Option are sent back to the router that generated the RPL Option.

A RPL router chooses the next RPL router that should process the original packet as the tunnel exit-point. In some cases, the tunnel exit-point will be the final RPL router along a path towards the original packet’s destination and the original packet will only traverse a single tunnel. One example is when the final destination or the destination’s attachment router is known to be within the same RPL Instance.

In other cases, the tunnel exit-point will not be the final RPL router along a path and the original packet may traverse multiple tunnels to reach the destination. One example is when a RPL router is simply forwarding a packet to one of its DODAG Parents. In this case, the RPL router sets the tunnel exit-point to a DODAG Parent. When forwarding the original packet hop-by-hop, the RPL router only makes a determination on the next hop towards the destination.

A RPL router receiving an IPv6-in-IPv6 packet destined to it processes the tunnel packet as described in Section 3 of [RFC2473]. Before IPv6 decapsulation, the RPL router MUST process the RPL Option if one exists. After IPv6 decapsulation, if the router determines that it should forward the original packet to another RPL router it MUST encapsulate the packet again using IPv6-in-IPv6 tunneling to include the RPL Option. Fields within the RPL Option that do not change hop-by-hop MUST remain the same as those received from the prior tunnel.

RPL routers are responsible for ensuring that a RPL Option is only used between RPL routers:

1. For datagrams destined to a RPL router, the router processes the packet in the usual way. For instance, if the RPL Option was included using tunneled mode and the RPL router serves as the tunnel endpoint, the router removes the outer IPv6 header, at the same time removing the RPL Option as well.

2. Datagrams destined elsewhere within the same RPL Instance are forwarded to the correct interface.

3. Datagrams destined to nodes outside the RPL Instance are dropped if the outer-most IPv6 header contains a RPL Option not generated by the RPL router forwarding the datagram.

To avoid fragmentation, it is desirable to employ MTU sizes that allow for the header expansion (i.e. at least 1280 + 40 (outer IP header) + RPL_OPTION_MAX_SIZE), where RPL_OPTION_MAX_SIZE is the maximum RPL Option header size for a given RPL network. To take advantage of this, however, the communicating endpoints need to be aware of the MTU along the path (i.e. through Path MTU Discovery). Unfortunately, the larger MTU size may not be available on all links (e.g. 1280 octets on 6LoWPAN links). However, it is expected that much of the traffic on these types of networks consists of much smaller messages than the MTU, so performance degradation through fragmentation would be limited.
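The header expansion above can be worked through numerically. The sketch below assumes that RPL_OPTION_MAX_SIZE is taken to be the size of the Hop-by-Hop Options header that carries the option (2 octets of Next Header and Hdr Ext Len plus the option, rounded up to a multiple of 8 octets as [RFC2460] requires for extension headers). The function names are hypothetical.

```python
IPV6_MIN_MTU = 1280        # minimum link MTU required by IPv6
OUTER_IPV6_HEADER = 40     # fixed IPv6 header added by IPv6-in-IPv6 tunneling


def hop_by_hop_size(option_len: int) -> int:
    """Octets in a Hop-by-Hop Options header carrying one option of option_len octets."""
    raw = 2 + option_len             # Next Header + Hdr Ext Len + the option itself
    return (raw + 7) // 8 * 8        # extension headers are multiples of 8 octets


def mtu_to_avoid_fragmentation(rpl_option_max_size: int) -> int:
    """MTU that lets a 1280-octet original packet be tunneled without fragmentation."""
    return IPV6_MIN_MTU + OUTER_IPV6_HEADER + rpl_option_max_size


# A RPL Option with no sub-TLVs is 6 octets (2 octets of type/length + 4 of data).
header = hop_by_hop_size(6)
print(header, mtu_to_avoid_fragmentation(header))
```

With no sub-TLVs the tunnel overhead is 48 octets in this model; any sub-TLVs defined later would raise RPL_OPTION_MAX_SIZE and the required MTU accordingly.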

5. Security Considerations

The RPL Option assists RPL routers in detecting routing inconsistencies. The RPL message security mechanisms defined in [I-D.ietf-roll-rpl] do not apply to the RPL Option.

5.1. DAG Inconsistency Attacks

Using the Down ’O’ flag and SenderRank field, an attacker can cause RPL routers to believe that a DAG inconsistency exists within the RPL instance identified by the RPLInstanceID field. This attack would cause a RPL router to reset its DIO Trickle timer and begin transmitting DIO messages more frequently.

In order to avoid any unacceptable impact on network operations, an implementation MAY limit the number of triggers caused by receiving a RPL Option to no greater than MAX_RPL_OPTION_RANK_ERRORS per hour. A RECOMMENDED value for MAX_RPL_OPTION_RANK_ERRORS is 20.
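One possible way to enforce such a limit is a sliding one-hour window, sketched below in Python. The text only says "per hour", so the sliding-window interpretation is an assumption (a fixed hourly bucket would also satisfy it), and the TriggerLimiter class and its parameters are hypothetical names introduced for illustration.

```python
import time
from collections import deque

MAX_RPL_OPTION_RANK_ERRORS = 20    # RECOMMENDED value from the text
WINDOW_SECONDS = 3600              # "per hour", modeled as a sliding window (assumption)


class TriggerLimiter:
    """Allows at most `limit` triggers within any `window` seconds."""

    def __init__(self, limit=MAX_RPL_OPTION_RANK_ERRORS,
                 window=WINDOW_SECONDS, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.events = deque()      # timestamps of accepted triggers, oldest first

    def allow(self) -> bool:
        now = self.clock()
        # Forget triggers that have aged out of the window.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()
        if len(self.events) >= self.limit:
            return False           # ignore this trigger; do not reset the Trickle timer
        self.events.append(now)
        return True


limiter = TriggerLimiter()
accepted = sum(limiter.allow() for _ in range(25))
print(accepted)                    # triggers accepted out of 25 back-to-back attempts
```

A router would consult allow() before resetting its DIO Trickle timer in response to a received RPL Option, so that an attacker can force only a bounded number of resets.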

5.2. DAO Inconsistency Attacks

In storing mode, RPL routers maintain downward routing state. Under normal operation, the RPL Option assists RPL routers in cleaning up stale downward routing state by using the Forwarding-Error ’F’ flag to indicate that a datagram could not be delivered by a child and is being sent back to try a different child. Using this flag, an attacker can cause a RPL router to discard downward routing state.

In order to avoid any unacceptable impact on network operations, an implementation MAY limit the number of triggers caused by receiving a RPL Option to no greater than MAX_RPL_OPTION_FORWARD_ERRORS per hour. A RECOMMENDED value for MAX_RPL_OPTION_FORWARD_ERRORS is 20.

In non-storing mode, only the LBR maintains downward routing state. Because RPL routers do not maintain downward routing state, the RPL Option cannot be used to mount such attacks.

6. IANA Considerations

IANA is requested to reserve a new value in the Destination Options and Hop-by-Hop Options registry. The proposed value to be confirmed by IANA is:

    Hex Value     Binary Value
                  act  chg  rest     Description     Reference
    ---------     ---  ---  ----     -----------     ---------
    TBD            01    1   TBD     RPL Option      [RFCthis]

As specified in [RFC2460], the first two bits indicate that the IPv6 node MUST discard the packet if it doesn’t recognize the option type, and the third bit indicates that the Option Data may change en-route. The remaining bits serve as the option type and are TBD by IANA.

IANA is requested to create a registry called RPL-option-TLV, for the sub-TLVs carried in the RPL Option header. New codes may be allocated only by IETF Review [RFC5226]. The type field is an 8-bit field whose value is between 0 and 255, inclusive.

7. Acknowledgements

The authors thank Jari Arkko, Ralph Droms, Adrian Farrel, Stephen Farrell, Richard Kelsey, Suresh Krishnan, Vishwas Manral, Erik Nordmark, Pascal Thubert, Sean Turner, and Tim Winter, for their comments and suggestions that helped shape this document.

8. Changes

(This section to be removed by the RFC editor.)

Draft 06:

- Address IESG comments.

Draft 05:

- Address LC comments.

Draft 04:

- Clarify when the RPL Option is used.

- Updated text on recommendations for avoiding fragmentation.

- Specify skip-over-and-continue policy for unrecognized sub-TLVs.

- Change use of IPv6-in-IPv6 tunneling from SHOULD to MUST.

- Specify RPL Border Router operations in terms of forwarding decision outcomes.

- Expand security section.

Draft 03:

- Removed any presumed values that are TBD by IANA.

9. References

9.1. Normative References

[I-D.ietf-roll-rpl] Winter, T., Thubert, P., Brandt, A., Clausen, T., Hui, J., Kelsey, R., Levis, P., Pister, K., Struik, R., and J. Vasseur, "RPL: IPv6 Routing Protocol for Low power and Lossy Networks", draft-ietf-roll-rpl-19 (work in progress), March 2011.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in IPv6 Specification", RFC 2473, December 1998.

[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.

[RFC6206] Levis, P., Clausen, T., Hui, J., Gnawali, O., and J. Ko, "The Trickle Algorithm", RFC 6206, March 2011.

9.2. Informative References

[I-D.ietf-6man-rpl-routing-header] Hui, J., Vasseur, J., Culler, D., and V. Manral, "An IPv6 Routing Header for Source Routes with RPL", draft-ietf-6man-rpl-routing-header-05 (work in progress), November 2011.

Authors’ Addresses

Jonathan W. Hui Cisco Systems, Inc 170 West Tasman Drive San Jose, California 95134 USA

Phone: +408 424 1547 Email: [email protected]

JP Vasseur Cisco Systems, Inc 11, Rue Camille Desmoulins Issy Les Moulineaux, 92782 France

Email: [email protected]

6MAN J. Hui Internet-Draft JP. Vasseur Intended status: Standards Track Cisco Systems, Inc Expires: June 18, 2012 D. Culler UC Berkeley V. Manral Hewlett Packard Co. December 16, 2011

An IPv6 Routing Header for Source Routes with RPL draft-ietf-6man-rpl-routing-header-07

Abstract

In Low power and Lossy Networks (LLNs), memory constraints on routers may limit them to maintaining at most a few routes. In some configurations, it is necessary to use these memory constrained routers to deliver datagrams to nodes within the LLN. The Routing for Low Power and Lossy Networks (RPL) protocol can be used in some deployments to store most, if not all, routes on one (e.g. the Directed Acyclic Graph (DAG) root) or few routers and forward the IPv6 datagram using a source routing technique to avoid large routing tables on memory constrained routers. This document specifies a new IPv6 Routing header type for delivering datagrams within a RPL Instance.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on June 18, 2012.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

Hui, et al. Expires June 18, 2012 [Page 1] Internet-Draft RPL Source Route Header December 2011

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction
   1.1. Requirements Language
2. Overview
3. Format of the RPL Routing Header
4. RPL Router Behavior
   4.1. Generating Source Routing Headers
   4.2. Processing Source Routing Headers
5. Security Considerations
   5.1. Source Routing Attacks
   5.2. ICMPv6 Attacks
6. IANA Considerations
7. Acknowledgements
8. Changes
9. References
   9.1. Normative References
   9.2. Informative References
Authors’ Addresses

1. Introduction

Routing for Low Power and Lossy Networks (RPL) is a distance vector IPv6 routing protocol designed for Low Power and Lossy Networks (LLNs) [I-D.ietf-roll-rpl]. Such networks are typically constrained in resources (limited communication data rate, processing power, energy capacity, memory). In particular, some LLN configurations may utilize LLN routers where memory constraints limit nodes to maintaining only a small number of default routes and no other destinations. However, it may be necessary to utilize such memory-constrained routers to forward datagrams and maintain reachability to destinations within the LLN.

To utilize paths that include memory-constrained routers, RPL relies on source routing. In one deployment model of RPL, more capable routers collect routing information and form paths to arbitrary destinations within a RPL Instance. However, a source routing mechanism supported by IPv6 is needed to deliver datagrams.

This document specifies the Source Routing Header (SRH) for use strictly between RPL routers in the same RPL Instance.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Overview

The format of SRH draws from that of the Type 0 Routing header (RH0) [RFC2460]. However, SRH introduces mechanisms to compact the source route entries when all entries share the same prefix with the IPv6 Destination Address of a packet carrying a SRH, a typical scenario in LLNs using source routing. The compaction mechanism reduces consumption of scarce resources such as channel capacity.

SRH also differs from RH0 in the processing rules to alleviate security concerns that led to the deprecation of RH0 [RFC5095]. First, RPL routers implement a strict source route policy where each and every IPv6 hop between the source and destination of the source route is specified within the SRH. Note that the source route may be a subset of the path between the actual source and destination and is discussed further below. Second, a SRH is only used between RPL routers within a RPL Instance. RPL Border Routers, responsible for connecting other RPL Instances and IP domains that use other routing protocols, do not allow datagrams already carrying a SRH to enter or exit a RPL Instance. Third, to avoid forwarding loops, a RPL router drops datagrams whose SRH includes multiple addresses assigned to any interface on that router.

There are two cases that determine how to include a SRH when a RPL router requires the use of a SRH to deliver a datagram to its destination.

1. If the SRH specifies the complete path from source to destination, the router places the SRH directly in the datagram itself.

2. If the SRH only specifies a subset of the path from source to destination, the router uses IPv6-in-IPv6 tunneling [RFC2473] and places the SRH in the outer IPv6 header. Use of tunneling ensures that the datagram is delivered unmodified and that ICMP errors return to the source of the SRH rather than the source of the original datagram.

In a RPL network, Case 1 occurs when both source and destination are within a RPL Instance and a single SRH is used to specify the entire path from source to destination, as shown in the following figure:

    +--------------------+
    |                    |
    |  (S) -------> (D)  |
    |                    |
    +--------------------+
         RPL Instance

In the above scenario, datagrams traveling from source, S, to destination, D, have the following packet structure:

    +--------+---------+-------------//-+
    |  IPv6  | Source  |  IPv6          |
    | Header | Routing |  Payload       |
    |        | Header  |                |
    +--------+---------+-------------//-+

S’s address is carried in the IPv6 Header’s Source Address field. D’s address is carried in the last entry of SRH for all but the last hop, when D’s address is carried in the IPv6 Header’s Destination Address field of the packet carrying the SRH.

In a RPL network, Case 2 occurs for all datagrams that have source and/or destination outside the RPL Instance, as shown in the following diagram:

    +-----------------+
    |                 |
    |  (S) --------> (R) --------> (D)
    |                 |
    +-----------------+
       RPL Instance

                  +-----------------+
                  |                 |
    (S) --------> (R) --------> (D) |
                  |                 |
                  +-----------------+
                     RPL Instance

                  +-----------------+
                  |                 |
    (S) --------> (R) --------> (R) --------> (D)
                  |                 |
                  +-----------------+
                     RPL Instance

In the scenarios above, R may indicate a RPL Border Router (when connecting to other routing domains) or a RPL Router (when connecting to hosts). The datagrams have the following structure when traveling within the RPL Instance:

    +--------+---------+--------+-------------//-+
    | Outer  | Source  | Inner  |  IPv6          |
    |  IPv6  | Routing |  IPv6  |  Payload       |
    | Header | Header  | Header |                |
    +--------+---------+--------+-------------//-+
                        <--- Original Packet --->
     <---          Tunneled Packet           --->

Note that the outer header (including the SRH) is added and removed by the RPL router.

Case 2 also occurs whenever a RPL router needs to insert a source route when forwarding a datagram. One such use case with RPL is to have all RPL traffic flow through a Border Router and have the Border Router use source routes to deliver datagrams to their final destination. When including the SRH using tunneled mode, the Border Router would encapsulate the received datagram unmodified using IPv6-in-IPv6 and include a SRH in the outer IPv6 header.

    +-----------------+
    |                 |
    |  (S) -------\   |
    |              \  |
    |               (LBR)
    |              /  |
    |  (D) <------/   |
    |                 |
    +-----------------+
       RPL Instance

In the above scenario, datagrams travel from S to D through LBR. Between S and LBR, the datagrams are routed using the DAG built by RPL and do not contain a SRH. LBR encapsulates received datagrams unmodified using IPv6-in-IPv6 and the SRH is included in the outer IPv6 header.

3. Format of the RPL Routing Header

The Source Routing Header has the following format:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Next Header  |  Hdr Ext Len  | Routing Type  | Segments Left |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | CmprI | CmprE |  Pad  |       Reserved        | RPLInstanceID |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    .                                                               .
    .                        Addresses[1..n]                        .
    .                                                               .
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Next Header 8-bit selector. Identifies the type of header immediately following the Routing header. Uses the same values as the IPv6 Next Header field [RFC2460].

Hdr Ext Len 8-bit unsigned integer. Length of the Routing header in 8-octet units, not including the first 8 octets. Note that when Addresses[1..n] are compressed (i.e. value of CmprI or CmprE is not 0), Hdr Ext Len does not equal twice the number of Addresses.

Routing Type 8-bit selector. Identifies the particular Routing header variant. A SRH should set the Routing Type to TBD by IANA.

Segments Left 8-bit unsigned integer. Number of route segments remaining, i.e., number of explicitly listed intermediate nodes still to be visited before reaching the final destination. The originator of a SRH sets this field to n, the number of addresses contained in Addresses[1..n].

CmprI 4-bit unsigned integer. Number of prefix octets that are elided from each segment except the last (i.e. segments 1 through n-1). For example, a SRH carrying full IPv6 addresses in Addresses[1..n-1] sets CmprI to 0.

CmprE 4-bit unsigned integer. Number of prefix octets from the last segment (i.e. segment n) that are elided. For example, a SRH carrying a full IPv6 address in Addresses[n] sets CmprE to 0.

Pad 4-bit unsigned integer. Number of octets that are used for padding after Address[n] at the end of the SRH.

Reserved This field is unused. It MUST be initialized to zero by the sender and MUST be ignored by the receiver.

RPLInstanceID 8-bit unsigned integer. Indicates the RPL Instance along which the packet is sent.

Address[1..n] Vector of addresses, numbered 1 to n. Each vector element in [1..n-1] has size (16 - CmprI) and element [n] has size (16-CmprE). The originator of a SRH places the next-hop’s IPv6 address as the first address in Address[1..n] (i.e. Address[1]).

The SRH shares the same basic format as the Type 0 Routing header [RFC2460]. When carrying full IPv6 addresses, the CmprI, CmprE, and Pad fields are set to 0 and the only difference between the SRH and Type 0 encodings is the value of the Routing Type field.

A common network configuration for a RPL Instance is that all routers within a RPL Instance share a common prefix. The SRH introduces the CmprI, CmprE, and Pad fields to allow compaction of the Address[1..n] vector when all entries share the same prefix as the IPv6 Destination Address field of the packet carrying the SRH. The CmprI and CmprE fields indicate the number of prefix octets that are shared with the IPv6 Destination Address of the packet carrying the SRH. The shared prefix octets are not carried within the Routing header and each entry in Address[1..n-1] has size (16 - CmprI) octets and Address[n] has size (16 - CmprE) octets. When CmprI or CmprE is non-zero, there may exist unused octets between the last entry, Address[n], and the end of the Routing header. The Pad field indicates the number of unused octets that are used for padding. Note that when CmprI and CmprE are both 0, Pad MUST carry a value of 0.
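The compaction described above can be illustrated with a short Python sketch that elides the shared prefix octets, derives Pad and Hdr Ext Len, and then reverses the process. It is a sketch under stated assumptions: CmprI and CmprE are supplied by the caller rather than chosen automatically, the padding octets are written as zeros, and the function names and the 2001:db8::/32 documentation addresses are hypothetical.

```python
import ipaddress


def pack_addresses(dest, addrs, cmpr_i, cmpr_e):
    """Return (hdr_ext_len, pad, packed) for the Address[1..n] vector."""
    if not (0 <= cmpr_i <= 15 and 0 <= cmpr_e <= 15):
        raise ValueError("CmprI and CmprE are 4-bit fields")
    ref = ipaddress.IPv6Address(dest).packed
    out = bytearray()
    for k, addr in enumerate(addrs, start=1):
        cmpr = cmpr_e if k == len(addrs) else cmpr_i
        raw = ipaddress.IPv6Address(addr).packed
        if raw[:cmpr] != ref[:cmpr]:
            raise ValueError(f"Address[{k}] does not share {cmpr} prefix octets")
        out += raw[cmpr:]                  # carry only the non-elided octets
    pad = -len(out) % 8                    # unused octets up to an 8-octet boundary
    return (len(out) + pad) // 8, pad, bytes(out) + bytes(pad)


def unpack_addresses(dest, hdr_ext_len, pad, cmpr_i, cmpr_e, packed):
    """Rebuild full addresses using the prefix of the IPv6 Destination Address."""
    ref = ipaddress.IPv6Address(dest).packed
    n = ((hdr_ext_len * 8 - pad - (16 - cmpr_e)) // (16 - cmpr_i)) + 1
    addrs, pos = [], 0
    for k in range(1, n + 1):
        cmpr = cmpr_e if k == n else cmpr_i
        size = 16 - cmpr
        addrs.append(str(ipaddress.IPv6Address(ref[:cmpr] + packed[pos:pos + size])))
        pos += size
    return addrs


route = ["2001:db8::2", "2001:db8::3", "2001:db8::4"]
hdr_ext_len, pad, packed = pack_addresses("2001:db8::1", route, 14, 14)
print(hdr_ext_len, pad, packed.hex())
print(unpack_addresses("2001:db8::1", hdr_ext_len, pad, 14, 14, packed))
```

In this example three 16-octet addresses shrink to 2 octets each, so the address vector occupies 8 octets including 2 octets of padding, instead of 48.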

The SRH MUST NOT specify a path that visits a node more than once. When generating a SRH, the source may not know the mapping between IPv6 addresses and nodes. Minimally, the source MUST ensure that IPv6 Addresses do not appear more than once and the IPv6 Source and Destination addresses of the encapsulating datagram do not appear in the SRH.

Multicast addresses MUST NOT appear in a SRH, or in the IPv6 Destination Address field of a datagram carrying a SRH.

4. RPL Router Behavior

4.1. Generating Source Routing Headers

To deliver an IPv6 datagram to its destination, a router may need to generate a new SRH and specify a strict source route. When the router is the source of the original packet and the destination is known to be within the same RPL Instance, the router SHOULD include the SRH directly within the original packet. Otherwise, the router MUST use IPv6-in-IPv6 tunneling [RFC2473] and place the SRH in the tunnel header. Using IPv6-in-IPv6 tunneling ensures that the delivered datagram remains unmodified and that ICMPv6 errors generated by a SRH are sent back to the router that generated the SRH.

In order to respect the IPv6 Hop Limit value of the original datagram, a RPL router generating an SRH MUST set the Segments Left to no greater than the original datagram’s IPv6 Hop Limit value upon forwarding. In the case that the source route is longer than the original datagram’s IPv6 Hop Limit, only the initial hops (determined by the original datagram’s IPv6 Hop Limit) should be included in the SRH. If the RPL router is not the source of the original datagram, the original datagram’s IPv6 Hop Limit field is decremented before generating the SRH. After generating the SRH, the RPL router decrements the original datagram’s IPv6 Hop Limit value by the SRH Segments Left value. Processing the SRH Segments Left and original datagram’s IPv6 Hop Limit fields in this way ensures that ICMPv6 Time Exceeded errors occur as would be expected on more traditional IPv6 networks that forward datagrams without tunneling.
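The Hop Limit bookkeeping above can be sketched as a small Python function. This is one reading of the paragraph, not a definitive implementation: the function name and return shape are hypothetical, and what the router does when the Hop Limit is already exhausted is left outside the sketch.

```python
def plan_source_route(route, hop_limit, is_source):
    """Return (srh_hops, segments_left, inner_hop_limit) for a new SRH.

    route      -- full list of hops the router would like to specify
    hop_limit  -- IPv6 Hop Limit of the original datagram as received
    is_source  -- True if this router originated the datagram
    """
    if not is_source:
        hop_limit -= 1                     # ordinary forwarding decrement comes first
    # Segments Left must not exceed the original datagram's Hop Limit, so a
    # route longer than that keeps only its initial hops.
    srh_hops = route[:max(hop_limit, 0)]
    segments_left = len(srh_hops)
    # The original datagram is then charged for every hop in the source route.
    return srh_hops, segments_left, hop_limit - segments_left


print(plan_source_route(["a", "b"], 64, True))
print(plan_source_route(["a", "b", "c", "d", "e"], 4, False))
```

In the second call the received Hop Limit of 4 becomes 3 after the forwarding decrement, so only the first three hops are placed in the SRH and the inner Hop Limit reaches 0 at the point where a non-tunneled datagram would also have expired.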

To avoid fragmentation, it is desirable to employ MTU sizes that allow for the header expansion (i.e. at least 1280 + 40 (outer IP header) + SRH_MAX_SIZE), where SRH_MAX_SIZE is the maximum path length for a given RPL network. To take advantage of this, however, the communicating endpoints need to be aware of the MTU along the path (i.e. through Path MTU Discovery). Unfortunately, the larger MTU size may not be available on all links (e.g. 1280 octets on 6LoWPAN links). However, it is expected that much of the traffic on these types of networks consists of much smaller messages than the MTU, so performance degradation through fragmentation would be limited.

4.2. Processing Source Routing Headers

As specified in [RFC2460], a routing header is not examined or processed until it reaches the node identified in the Destination Address field of the IPv6 header. In that node, dispatching on the Next Header field of the immediately preceding header causes the Routing header module to be invoked.

The function of SRH is intended to be very similar to the Type 0 Routing Header defined in [RFC2460]. After the routing header has been processed and the IPv6 datagram resubmitted to the IPv6 module for processing, the IPv6 Destination Address contains the next hop’s address. When forwarding an IPv6 datagram that contains a SRH with a non-zero Segments Left value, if the IPv6 Destination Address is not on-link, a router MUST drop the datagram and SHOULD send an ICMP Destination Unreachable (ICMPv6 Type 1) message with ICMPv6 Code set to (TBD by IANA) to the packet’s Source Address. This ICMPv6 Code indicates that the IPv6 Destination Address is not on-link and the router cannot satisfy the strict source route requirement. When generating ICMPv6 error messages, the rules in Section 2.4 of [RFC4443] must be observed.

To detect loops in the SRH, a router MUST determine if the SRH includes multiple addresses assigned to any interface on that router. If such addresses appear more than once and are separated by at least one address not assigned to that router, the router MUST drop the packet and SHOULD send an ICMP Parameter Problem, Code 0, to the Source Address. While this loop check does add significant per-packet processing overhead, it is required to mitigate bandwidth exhaustion attacks that led to the deprecation of RH0 [RFC5095].

The following describes the algorithm performed when processing a SRH:

   if Segments Left = 0 {
      proceed to process the next header in the packet, whose type is
      identified by the Next Header field in the Routing header
   }
   else {
      compute n, the number of addresses in the Routing header, by
      n = (((Hdr Ext Len * 8) - Pad - (16 - CmprE)) / (16 - CmprI)) + 1

      if Segments Left is greater than n {
         send an ICMP Parameter Problem, Code 0, message to the Source
         Address, pointing to the Segments Left field, and discard the
         packet
      }
      else {
         decrement Segments Left by 1

         compute i, the index of the next address to be visited in
         the address vector, by subtracting Segments Left from n

         if Address[i] or the IPv6 Destination Address is multicast {
            discard the packet
         }
         else if 2 or more entries in Address[1..n] are assigned to
                 local interface and are separated by at least one
                 address not assigned to local interface {
            send an ICMP Parameter Problem (Code 0) and discard the
            packet
         }
         else {
            swap the IPv6 Destination Address and Address[i]

            if the IPv6 Hop Limit is less than or equal to 1 {
               send an ICMP Time Exceeded -- Hop Limit Exceeded in
               Transit message to the Source Address and discard the
               packet
            }
            else {
               decrement the Hop Limit by 1

               resubmit the packet to the IPv6 module for transmission
               to the new destination
            }
         }
      }
   }
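The algorithm above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the packet is modeled as a plain dict with hypothetical keys, the address vector is assumed to be already decompressed into full IPv6 address strings, addresses are compared textually, and the returned action strings are labels invented for this sketch. The helper srh_address_count() shows the computation of n from the header fields, while process_srh() simply takes n from the length of the decompressed vector.

```python
import ipaddress


def srh_address_count(hdr_ext_len, pad, cmpr_i, cmpr_e):
    """n = (((Hdr Ext Len * 8) - Pad - (16 - CmprE)) / (16 - CmprI)) + 1"""
    return ((hdr_ext_len * 8 - pad - (16 - cmpr_e)) // (16 - cmpr_i)) + 1


def has_loop(addresses, local_addrs):
    """True if two local entries are separated by a non-local entry."""
    hits = [k for k, a in enumerate(addresses) if a in local_addrs]
    # If the span from the first to the last local entry is longer than
    # the number of local entries, a non-local address lies between them.
    return len(hits) >= 2 and (hits[-1] - hits[0] + 1) > len(hits)


def process_srh(pkt, local_addrs):
    """Apply the SRH steps to pkt (a dict) and return the resulting action.

    Hypothetical keys: "dst", "hop_limit", "segments_left", "addresses".
    """
    if pkt["segments_left"] == 0:
        return "process-next-header"
    n = len(pkt["addresses"])
    if pkt["segments_left"] > n:
        return "icmp-parameter-problem"
    pkt["segments_left"] -= 1
    i = n - pkt["segments_left"]          # 1-based index into Address[1..n]
    nxt = pkt["addresses"][i - 1]
    if (ipaddress.IPv6Address(nxt).is_multicast
            or ipaddress.IPv6Address(pkt["dst"]).is_multicast):
        return "discard"
    if has_loop(pkt["addresses"], local_addrs):
        return "icmp-parameter-problem"
    # Swap the IPv6 Destination Address and Address[i].
    pkt["addresses"][i - 1], pkt["dst"] = pkt["dst"], nxt
    if pkt["hop_limit"] <= 1:
        return "icmp-time-exceeded"
    pkt["hop_limit"] -= 1
    return "forward"
```

For example, with three uncompressed addresses (CmprI = CmprE = 0, Pad = 0) the address vector occupies 48 octets, so Hdr Ext Len is 6 and srh_address_count(6, 0, 0, 0) yields 3.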

RPL routers are responsible for ensuring that a SRH is only used between RPL routers:

1. For datagrams destined to a RPL router, the router processes the packet in the usual way. For instance, if the SRH was included using tunneled mode and the RPL router serves as the tunnel endpoint, the router removes the outer IPv6 header, at the same time removing the SRH as well.

2. Datagrams destined elsewhere within the same RPL Instance are forwarded to the correct interface.

3. Datagrams destined to nodes outside the RPL Instance are dropped if the outer-most IPv6 header contains a SRH not generated by the RPL router forwarding the datagram.

5. Security Considerations

5.1. Source Routing Attacks

The RPL message security mechanisms defined in [I-D.ietf-roll-rpl] do not apply to the RPL Source Route Header. This specification does not provide any confidentiality, integrity, or authenticity mechanisms to protect the SRH.

[RFC5095] deprecates the Type 0 Routing header due to a number of significant attacks that are referenced in that document. Such attacks include bypassing filtering devices, reaching otherwise unreachable Internet systems, network topology discovery, bandwidth exhaustion, and defeating anycast.

Because this document specifies that SRH is only for use within a RPL Instance, such attacks cannot be mounted from outside a RPL Instance. As specified in this document, RPL routers MUST drop datagrams entering or exiting a RPL Instance that contain a SRH in the IPv6 Extension headers.

Such attacks, however, can be mounted from within a RPL Instance. To mitigate bandwidth exhaustion attacks, this specification requires RPL routers to check for loops in the SRH and drop datagrams that contain such loops. Attacks that include bypassing filtering devices and reaching otherwise unreachable Internet systems are not as relevant in mesh networks since the topologies are, by their very nature, highly dynamic. The RPL routing protocol is designed to provide reachability to all devices within a RPL Instance and may utilize routes that traverse any number of devices in any order. Even so, these attacks and others (e.g. defeating anycast and routing topology discovery) can occur within a RPL Instance when using this specification.

5.2. ICMPv6 Attacks

The generation of ICMPv6 error messages may be used to attempt denial-of-service attacks by sending error-causing SRH in back-to-back datagrams. An implementation that correctly follows Section 2.4 of [RFC4443] would be protected by the ICMPv6 rate limiting mechanism.
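Section 2.4 of [RFC4443] describes token-bucket-based limiting as one way to bound the rate of originated ICMPv6 errors. The Python sketch below shows the general shape of such a limiter. The class name, the parameter values, and the use of an explicit timestamp argument are assumptions chosen for illustration.

```python
class TokenBucket:
    """Minimal token bucket for limiting originated ICMPv6 error messages."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec      # average errors allowed per second
        self.burst = burst            # maximum errors sent in one burst
        self.tokens = float(burst)    # start with a full bucket
        self.last = 0.0               # time of the previous call, in seconds

    def allow(self, now):
        """Return True if an ICMPv6 error may be sent at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A router would consult allow() before generating each Parameter Problem, Time Exceeded, or Destination Unreachable message triggered by a SRH, and silently drop the offending datagram when it returns False.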

6. IANA Considerations

This document defines a new IPv6 Routing Type, the "RPL Source Route Header", which has been assigned number TBD by IANA.

This document defines a new ICMPv6 Destination Unreachable Code, the "strict source route failed" error, which has been assigned number TBD by IANA.

7. Acknowledgements

The authors thank Jari Arkko, Ralph Droms, Adrian Farrel, Stephen Farrell, Richard Kelsey, Suresh Krishnan, Erik Nordmark, Pascal Thubert, Sean Turner, and Tim Winter for their comments and suggestions that helped shape this document.

8. Changes

(This section to be removed by the RFC editor.)

Draft 06:

- Address IESG comments.

Draft 05:

- Address LC comments.

Draft 04:

- Updated text on recommendations for avoiding fragmentation.

- Clarify definition of CmprE where it is first mentioned.

- Change use of IPv6-in-IPv6 tunneling from SHOULD to MUST.

- Update packet processing pseudocode to match the text on sending back a parameter problem error.

- Recommend that non-RPL devices drop packets with SRH by default.

- Clarify packet structure figures.

- State that checking for cycles represents significant per-packet processing.

Draft 03:

- Removed any presumed values that are TBD by IANA.

Draft 02:

- Updated to send ICMP Destination Unreachable error only after the SRH has been processed.

- Updated pseudocode to reflect encoding changes in draft-01.

- Allow multiple addresses assigned to same node as long as they are not separated by other addresses.

Draft 01:

- Allow Addresses[1..n-1] and Addresses[n] to have a different number of bytes elided.

9. References

9.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in IPv6 Specification", RFC 2473, December 1998.

[RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", RFC 4443, March 2006.

[RFC5095] Abley, J., Savola, P., and G. Neville-Neil, "Deprecation of Type 0 Routing Headers in IPv6", RFC 5095, December 2007.

9.2. Informative References

[I-D.ietf-roll-rpl] Winter, T., Thubert, P., Brandt, A., Clausen, T., Hui, J., Kelsey, R., Levis, P., Pister, K., Struik, R., and J. Vasseur, "RPL: IPv6 Routing Protocol for Low power and Lossy Networks", draft-ietf-roll-rpl-19 (work in progress), March 2011.

Authors’ Addresses

Jonathan W. Hui Cisco Systems, Inc 170 West Tasman Drive San Jose, California 95134 USA

Phone: +408 424 1547 Email: [email protected]

JP Vasseur Cisco Systems, Inc 11, Rue Camille Desmoulins Issy Les Moulineaux, 92782 France

Email: [email protected]

David E. Culler UC Berkeley 465 Soda Hall Berkeley, California 94720 USA

Phone: +510 643 7572 Email: [email protected]

Vishwas Manral Hewlett Packard Co. 19111 Pruneridge Ave. Cupertino, California 95014 USA

Email: [email protected]

Network Working Group                                         M. Eubanks
Internet-Draft                                        AmericaFree.TV LLC
Updates: 2460 (if approved)                                  P. Chimento
Intended status: Standards Track        Johns Hopkins University Applied
Expires: August 25, 2013                              Physics Laboratory
                                                           M. Westerlund
                                                                Ericsson
                                                       February 21, 2013

              IPv6 and UDP Checksums for Tunneled Packets
                    draft-ietf-6man-udpchecksums-08

Abstract

This document provides an update of the Internet Protocol version 6 (IPv6) specification (RFC2460) to improve the performance in the use case where a tunnel protocol uses UDP with IPv6 to tunnel packets. The performance improvement is obtained by relaxing the IPv6 UDP checksum requirement for any suitable tunnel protocol where header information is protected on the "inner" packet being carried. This relaxation removes the overhead associated with the computation of UDP checksums on IPv6 packets used to carry tunnel protocols. The specification describes how the IPv6 UDP checksum requirement can be relaxed for the situation where the encapsulated packet itself contains a checksum. The limitations and risks of this approach are described, and restrictions specified on the use of the method.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 25, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

Eubanks, et al. Expires August 25, 2013 [Page 1] Internet-Draft udp-checksum February 2013

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Some Terminology
      2.1. Requirements Language
   3. Problem Statement
   4. Discussion
      4.1. Analysis of Corruption in Tunnel Context
      4.2. Limitation to Tunnel Protocols
      4.3. Middleboxes
   5. The Zero-Checksum Update
   6. Additional Observations
   7. IANA Considerations
   8. Security Considerations
   9. Acknowledgements
   10. References
      10.1. Normative References
      10.2. Informative References
   Authors' Addresses

1. Introduction

This work constitutes an update of the Internet Protocol Version 6 (IPv6) Specification [RFC2460], in the use case where a tunnel protocol uses UDP with IPv6 to tunnel packets. With the rapid growth of the Internet, tunnel protocols have become increasingly important to enable the deployment of new protocols. Tunnel protocols can be deployed rapidly, while the time to upgrade and deploy a critical mass of routers, middleboxes and hosts on the global Internet for a new protocol is now measured in decades. At the same time, the increasing use of firewalls and other security-related middleboxes means that truly new tunnel protocols, with new protocol numbers, are also unlikely to be deployable in a reasonable time frame, which has resulted in an increasing interest in and use of UDP-based tunnel protocols. In such protocols, there is an encapsulated "inner" packet, and the "outer" packet carrying the tunneled inner packet is a UDP packet, which can pass through firewalls and other middleboxes that perform filtering that is a fact of life on the current Internet.

Tunnel endpoints may be routers or middleboxes aggregating traffic from a number of tunnel users; therefore, the computation of an additional checksum on the outer UDP packet may be seen as an unwarranted burden on nodes that implement a tunnel protocol, especially if the inner packet(s) are already protected by a checksum. In IPv4, there is a checksum over the IP packet header, and the checksum on the outer UDP packet may be set to zero. However, in IPv6 there is no checksum in the IP header, and RFC 2460 [RFC2460] explicitly states that IPv6 receivers MUST discard UDP packets with a zero checksum. So, while sending a UDP datagram with a zero checksum is permitted in IPv4 packets, it is explicitly forbidden in IPv6 packets. To improve support for IPv6 UDP tunnels, this document updates RFC 2460 to allow endpoints to use a zero UDP checksum under constrained situations (primarily IPv6 tunnel transports that carry checksum-protected packets), following the applicability statements and constraints in [I-D.ietf-6man-udpzero].

"Unicast UDP Usage Guidelines for Application Designers" [RFC5405] should be consulted when reading this specification. It discusses both UDP tunnels (Section 3.1.3) and the usage of checksums (Section 3.4).

While the origin of this specification is the problem raised by the draft titled "Automatic Multicast Tunnels", also known as "AMT" [I-D.ietf-mboned-auto-multicast], we expect it to have wide applicability. Since the first version of this document, the need for an efficient UDP tunneling mechanism has increased. Other IETF Working Groups, notably LISP [RFC6830] and Softwires [RFC5619], have expressed a need to update the UDP checksum processing in RFC 2460. We therefore expect this update to be applicable in the future to other tunnel protocols specified by these and other IETF Working Groups.

2. Some Terminology

This document discusses only IPv6, since this problem does not exist for IPv4. Therefore all references to 'IP' should be understood as references to IPv6.

The document uses the terms "tunneling" and "tunneled" as adjectives when describing packets. When we refer to ’tunneling packets’ we refer to the outer packet header that provides the tunneling function. When we refer to ’tunneled packets’ we refer to the inner packet, i.e., the packet being carried in the tunnel.

2.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

3. Problem Statement

When using tunnel protocols based on UDP, there can be both a benefit and a cost to computing and checking the UDP checksum of the outer (encapsulating) UDP transport header. In certain cases, reducing the forwarding cost is important, e.g., for nodes that perform the checksum in software the cost may outweigh the benefit. This document provides an update for usage of the UDP checksum with IPv6. The update is specified for use by a tunnel protocol that transports packets that are themselves protected by a checksum.

4. Discussion

"Applicability Statement for the use of IPv6 UDP Datagrams with Zero Checksums" [I-D.ietf-6man-udpzero] describes issues related to allowing UDP over IPv6 to have a valid zero UDP checksum and is the starting point for this discussion. Sections 4 and 5 of [I-D.ietf-6man-udpzero] respectively identify node implementation and usage requirements for datagrams sent and received with a zero UDP checksum. These introduce constraints on the usage of a zero checksum for UDP over IPv6. The remainder of this section analyses the use of general tunnels and motivates why tunnel protocols are being permitted to use the method described in this update. Issues with middleboxes are also discussed.

4.1. Analysis of Corruption in Tunnel Context

This section analyzes the impact of the different corruption modes in the context of a tunnel protocol. It indicates what needs to be considered by the designer and user of a tunnel protocol to be robust. It also summarizes why use of a zero UDP checksum is thought to be safe for deployment.

1. Context (i.e., tunneling state) should be established by exchanging application Protocol Data Units (PDUs) carried in checksummed UDP datagrams or by other protocols with integrity protection against corruption. These control packets should also carry any negotiation required to enable the tunnel endpoint to accept UDP datagrams with a zero checksum and identify the set of ports that are used. It is important that the control traffic is robust against corruption because undetected errors can lead to long-lived and significant failures that may affect much more than the single packet that was corrupted.

2. Keep-alive datagrams with a zero UDP checksum should be sent to validate the network path, because the path between tunnel endpoints can change and therefore the set of middleboxes along the path may change during the life of an association. Paths with middleboxes that drop datagrams with a zero UDP checksum will drop these keep-alives. To enable the tunnel endpoints to discover and react to this behavior in a timely way, the keep-alive traffic should include datagrams with a non-zero checksum and datagrams with a zero checksum.

3. Receivers should attempt to detect corruption of the address information in an encapsulating packet. A robust tunnel protocol should track tunnel context based on the 5-tuple (tunneled protocol number, IPv6 source address, IPv6 destination address, UDP source port, UDP destination port). A corrupted datagram that arrives at a destination may be filtered based on this check.

* If the datagram header matches the 5-tuple and the node has the zero checksum enabled for this port, the payload is matched to the wrong context. The tunneled packet will then be decapsulated and forwarded by the tunnel egress.

* If a corrupted datagram matches a different 5-tuple and the zero checksum was enabled for the port, the datagram payload is matched to the wrong context, and may be processed by the wrong tunnel protocol, if it also passes the verification of that protocol.

* If a corrupted datagram matches a 5-tuple and the zero checksum has not been enabled for this port, the datagram will be discarded.

When only the source information is corrupted, the datagram could arrive at the intended application or protocol, which will process the datagram and try to match it against an existing tunnel context. The likelihood that a corrupted packet enters a valid context is reduced when the protocol restricts processing to only the source addresses with established contexts. When both source and destination fields are corrupted, this increases the likelihood of failing to match a context, with the exception of errors replacing one packet header with another one. In this case, it is possible that both packets are tunneled and therefore the corrupted packet could match a previously defined context.

4. Receivers should attempt to detect corruption of source-fragmented encapsulating packets. A tunnel protocol may reassemble fragments associated with the wrong context at the right tunnel endpoint, or it may reassemble fragments associated with a context at the wrong tunnel endpoint, or corrupted fragments may be reassembled at the right context at the right tunnel endpoint. In each of these cases, the IPv6 length of the encapsulating header may be checked (though [I-D.ietf-6man-udpzero] points out the weakness in this check). In addition, if the encapsulated packet is protected by a transport (or other) checksum, these errors can be detected (with some probability).

5. Tunnel protocols using UDP have some advantages that reduce the risk for a corrupted tunnel packet reaching a destination that will receive it, compared to other applications. This results from processing by the network of the inner (tunneled) packet after being forwarded from the tunnel egress using a wrong context:

* A tunneled packet may be forwarded to the wrong address domain, for example, a private address domain where the inner packet’s address is not routable, or may fail a source address check, such as Unicast Reverse Path Forwarding [RFC2827], resulting in the packet being dropped.

* The destination address of a tunneled packet may not be reachable at all from the delivered domain. For example, an Ethernet frame may carry a destination MAC address that is not present on the LAN segment that was reached.

* The type of the tunneled packet may prevent delivery. For example, an attempt to interpret an IP packet payload as an Ethernet frame would likely result in the packet being dropped as invalid.

* The tunneled packet checksum or integrity mechanism may detect corruption of the inner packet caused at the same time as corruption to the outer packet header. The resulting packet would likely be dropped as invalid.

These checks each significantly reduce the likelihood that a corrupted inner tunneled packet is finally delivered to a protocol listener that can be affected by the packet. While the methods do not guarantee correctness, they can reduce the risk of relaxing the UDP checksum requirement for a tunnel application using IPv6.
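Two of the receiver-side measures listed above, the 5-tuple context check (item 3) and the mixed keep-alive probing (item 2), can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names, the context dictionary layout, the miss threshold, and the returned strings are all invented for this sketch and are not defined by this document or by [I-D.ietf-6man-udpzero].

```python
from collections import namedtuple

# The 5-tuple named in the text; the field names are illustrative.
FiveTuple = namedtuple(
    "FiveTuple", "tunneled_proto src_addr dst_addr src_port dst_port")


def classify(contexts, key, udp_checksum):
    """Decide what to do with a received tunneling datagram.

    contexts maps FiveTuple -> {"zero_checksum_ok": bool} (hypothetical).
    """
    ctx = contexts.get(key)
    if ctx is None:
        return "drop: no matching context"
    if udp_checksum == 0 and not ctx["zero_checksum_ok"]:
        return "drop: zero checksum not enabled"
    # A non-zero checksum would be verified here before decapsulation.
    return "decapsulate"


class PathProbe:
    """Track keep-alives sent with and without a UDP checksum."""

    def __init__(self, max_missed=3):
        self.max_missed = max_missed            # assumed loss threshold
        self.missed = {"zero": 0, "nonzero": 0}

    def record(self, kind, acked):
        """Record the outcome of one keep-alive of the given kind."""
        self.missed[kind] = 0 if acked else self.missed[kind] + 1

    def mode(self):
        """Pick a sending mode from the recent keep-alive history."""
        if self.missed["zero"] < self.max_missed:
            return "zero-checksum"
        if self.missed["nonzero"] < self.max_missed:
            return "full-checksum"   # path appears to drop zero checksums
        return "path-down"
```

Probing with both kinds of keep-alive lets an endpoint distinguish a path that drops only zero-checksum datagrams from a path that has failed entirely.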

4.2. Limitation to Tunnel Protocols

This document describes the applicability of using a zero UDP checksum to support tunnel protocols. There are good motivations behind this and the arguments are provided here.

o Tunnels carry inner packets that have their own semantics, which may make any corruption less likely to reach the indicated destination and be accepted as a valid packet. This is true for IP packets with the addition of verification that can be made by the tunnel protocol, the network processing of the inner packet headers as discussed above, and verification of the inner packet checksums. Non-IP inner packets are likely to be subject to similar effects that may reduce the likelihood of a misdelivered packet being delivered to a protocol listener that can be affected by the packet.

o Protocols that directly consume the payload must have sufficient robustness against misdelivered packets from any context, including the ones that are corrupted in tunnels and any other usage of the zero checksum. This will require an integrity mechanism. Using a standard UDP checksum reduces the computational load in the receiver to verify this mechanism.

o The design for stateful protocols or protocols where corruption causes cascade effects requires extra care. In tunnel usage, each encapsulating packet provides only a transport mechanism from tunnel ingress to tunnel egress. A corruption will commonly only affect the single tunneled packet, not the established protocol state. One common effect is that the inner packet flow will only see a corruption and misdelivery of the outer packet as a lost packet.

o Some non-tunnel protocols operate with general servers that do not know the source from which they will receive a packet. In such applications, a zero UDP checksum is unsuitable because there is a need to provide the first level of verification that the packet was intended for the receiving server. A verification prevents the server from processing the datagram payload and without this it may spend significant resources processing the packet, including sending replies or error messages.

Tunnel protocols that encapsulate IP will generally be safe for deployment, since all IPv4 and IPv6 packets include at least one checksum at either the network or transport layer. The network delivery of the inner packet will then further reduce the effects of corruption. Tunnel protocols carrying non-IP packets may offer equivalent protection when the non-IP networks reduce the risk of misdelivery to applications. However, there is a need for further analysis to understand the implications of misdelivery of corrupted packets for each such non-IP protocol. The analysis above suggests that non-tunnel protocols can be expected to have significantly more cases where a zero checksum would result in misdelivery or negative side-effects.

One unfortunate side-effect of increased use of a zero-checksum is that it also increases the likelihood of acceptance when a datagram with a zero UDP checksum is misdelivered. This requires all tunnel protocols using this method to be designed to be robust to misdelivery.

4.3. Middleboxes

"Applicability Statement for the use of IPv6 UDP Datagrams with Zero Checksums" [I-D.ietf-6man-udpzero] notes that middleboxes that conform to RFC 2460 will discard datagrams with a zero UDP checksum and should log this as an error. Tunnel protocols intending to use a zero UDP checksum need to ensure that they have defined a method for handling cases when a middlebox prevents the path between the tunnel ingress and egress from supporting transmission of datagrams with a zero UDP checksum.

5. The Zero-Checksum Update

This specification updates IPv6 to allow a zero UDP checksum in the outer encapsulating datagram of a tunnel protocol. UDP endpoints that implement this update MUST follow the node requirements in "Applicability Statement for the use of IPv6 UDP Datagrams with Zero Checksums" [I-D.ietf-6man-udpzero].

The following text in [RFC2460] Section 8.1, 4th bullet should be deleted:

"Unlike IPv4, when UDP packets are originated by an IPv6 node, the UDP checksum is not optional. That is, whenever originating a UDP packet, an IPv6 node must compute a UDP checksum over the packet and the pseudo-header, and, if that computation yields a result of zero, it must be changed to hex FFFF for placement in the UDP header. IPv6 receivers must discard UDP packets containing a zero checksum, and should log the error."

This text should be replaced by:

An IPv6 node associates a mode with each used UDP port (for sending and/or receiving packets).

Whenever originating a UDP packet for a port in the default mode, an IPv6 node MUST compute a UDP checksum over the packet and the pseudo-header, and, if that computation yields a result of zero, it MUST be changed to hex FFFF for placement in the UDP header as specified in [RFC2460]. IPv6 receivers MUST by default discard UDP packets containing a zero checksum, and SHOULD log the error.

As an alternative, certain protocols that use UDP as a tunnel encapsulation MAY enable the zero-checksum mode for a specific port (or set of ports) for sending and/or receiving. Any node implementing the zero-checksum mode MUST follow the node requirements specified in Section 4 of "Applicability Statement for the use of IPv6 UDP Datagrams with Zero Checksums" [I-D.ietf-6man-udpzero].

Any protocol that enables the zero-checksum mode for a specific port or ports MUST follow the usage requirements specified in Section 5 of "Applicability Statement for the use of IPv6 UDP Datagrams with Zero Checksums" [I-D.ietf-6man-udpzero].

Middleboxes supporting IPv6 MUST follow requirements 9, 10 and 11 of the usage requirements specified in Section 5 of "Applicability Statement for the use of IPv6 UDP Datagrams with Zero Checksums" [I-D.ietf-6man-udpzero].
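To make the default mode and the zero-checksum mode concrete, the following Python sketch computes a UDP checksum over the IPv6 pseudo-header (replacing a computed zero with hex FFFF) and applies a per-port check on receipt. The function names and the set of zero-checksum ports are assumptions introduced for illustration, and the additional node requirements in [I-D.ietf-6man-udpzero] are not modeled.

```python
import ipaddress
import struct


def udp6_checksum(src, dst, udp_segment):
    """Checksum over the IPv6 pseudo-header and the UDP header + payload.

    udp_segment must carry a zero checksum field (octets 6-7) on input.
    """
    # Pseudo-header: source, destination, 32-bit upper-layer length,
    # three zero octets, and Next Header = 17 (UDP).
    pseudo = (ipaddress.IPv6Address(src).packed
              + ipaddress.IPv6Address(dst).packed
              + struct.pack("!I3xB", len(udp_segment), 17))
    data = pseudo + udp_segment
    if len(data) % 2:
        data += b"\x00"                      # pad to a 16-bit boundary
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries (one's complement)
        total = (total & 0xFFFF) + (total >> 16)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF                # a computed zero is sent as FFFF


def receiver_accepts(udp_checksum, dst_port, zero_checksum_ports):
    """Default mode discards zero checksums; enabled ports accept them."""
    if udp_checksum == 0:
        return dst_port in zero_checksum_ports
    return True   # a non-zero checksum is still verified separately
```

A sender in the default mode would place the result of udp6_checksum() in the UDP header, while a sender using the zero-checksum mode for a tunnel port would leave the field as zero.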

6. Additional Observations

This update was motivated by the existence of a number of protocols being developed in the IETF that are expected to benefit from the change. The following observations are made:

o An empirically-based analysis of the probabilities of packet corruption (with or without checksums) has not (to our knowledge) been conducted since about 2000. At the time of publication, it is now 2012. We strongly suggest a new empirical study, along with an extensive analysis of the corruption probabilities of the IPv6 header. This can potentially allow revising the recommendations in this document.

o A key motivation for the increase in use of UDP in tunneling is a lack of protocol support in middleboxes. Specifically, new protocols, such as LISP [RFC6830], may prefer to use UDP tunnels to traverse an end-to-end path successfully and avoid having their packets dropped by middleboxes. If middleboxes were updated to support UDP-Lite [RFC3828], UDP-Lite would provide better protection than offered by this update. This may be suited to a variety of applications and would be expected to be preferred over this method for many tunnel protocols.

o Another issue is that the UDP checksum is overloaded with the task of protecting the IPv6 header for UDP flows (as is the TCP checksum for TCP flows). Protocols that do not use a pseudo- header approach to computing a checksum or CRC have essentially no protection from misdelivered packets.

7. IANA Considerations

This document makes no request of IANA.

Note to RFC Editor: this section may be removed on publication as an RFC.

8. Security Considerations

Less work is required to generate an attack using a zero UDP checksum than one using a standard full UDP checksum. However, this does not lead to significant new vulnerabilities because checksums are not a security measure and can be easily generated by any attacker.

In general, any user of zero UDP checksums should apply the checks and context verification that are possible to minimize the risk of unintended traffic reaching a particular context. This will, however, not protect against an intentional attack that creates packets with the correct information. Source address validation can help prevent injection of traffic into contexts by an attacker.

Depending on the hardware design, the processing requirements may differ for tunnels that have a zero UDP checksum and those that calculate a checksum. This processing overhead may need to be considered when deciding whether to enable a tunnel and to determine an acceptable rate for transmission. This can become a security risk for designs that can handle a significantly larger number of packets with zero UDP checksums compared to datagrams with a non-zero checksum, such as tunnel egress. An attacker could attempt to inject non-zero checksummed UDP packets into a tunnel forwarding zero checksum UDP packets and cause overload in the processing of the non-zero checksums, e.g., if this happens in a router's slow path. Protection mechanisms should therefore be employed when this threat exists. Protection may include source address filtering to prevent an attacker injecting traffic, as well as throttling the amount of non-zero checksum traffic. The latter may impact the function of the tunnel protocol.

9. Acknowledgements

We would like to thank Brian Haberman, Dan Wing, Joel Halpern, David Waltermire, J.W. Atwood, Peter Yee, Joe Touch and the IESG of 2012 for discussions and reviews. Gorry Fairhurst has been very diligent in reviewing and helping to ensure alignment between this document and [I-D.ietf-6man-udpzero].

10. References

10.1. Normative References

[I-D.ietf-6man-udpzero] Fairhurst, G. and M. Westerlund, "Applicability Statement for the use of IPv6 UDP Datagrams with Zero Checksums", draft-ietf-6man-udpzero-10 (work in progress), January 2013.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

10.2. Informative References

[I-D.ietf-mboned-auto-multicast] Bumgardner, G., "Automatic Multicast Tunneling", draft-ietf-mboned-auto-multicast-14 (work in progress), June 2012.

[RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, RFC 2827, May 2000.

[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., and G. Fairhurst, "The Lightweight User Datagram Protocol (UDP-Lite)", RFC 3828, July 2004.

[RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines for Application Designers", BCP 145, RFC 5405, November 2008.

[RFC5619] Yamamoto, S., Williams, C., Yokota, H., and F. Parent, "Softwire Security Analysis and Requirements", RFC 5619, August 2009.

[RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The Locator/ID Separation Protocol (LISP)", RFC 6830, January 2013.

Authors’ Addresses

Marshall Eubanks AmericaFree.TV LLC P.O. Box 141 Clifton, Virginia 20124 USA

Phone: +1-703-501-4376 Fax: Email: [email protected]

P.F. Chimento Johns Hopkins University Applied Physics Laboratory 11100 Johns Hopkins Road Laurel, MD 20723 USA

Phone: +1-443-778-1743 Email: [email protected]

Magnus Westerlund Ericsson Farogatan 6 SE-164 80 Kista Sweden

Phone: +46 10 714 82 87 Email: [email protected]

Internet Engineering Task Force                             G. Fairhurst
Internet-Draft                                    University of Aberdeen
Intended status: Standards Track                           M. Westerlund
Expires: August 29, 2013                                        Ericsson
                                                       February 25, 2013

Applicability Statement for the use of IPv6 UDP Datagrams with Zero Checksums
draft-ietf-6man-udpzero-12

Abstract

This document provides an applicability statement for the use of UDP transport checksums with IPv6. It defines recommendations and requirements for the use of IPv6 UDP datagrams with a zero UDP checksum. It describes the issues and design principles that need to be considered when UDP is used with IPv6 to support tunnel encapsulations and examines the role of the IPv6 UDP transport checksum. The document also identifies issues and constraints for deployment on network paths that include middleboxes. An appendix presents a summary of the trade-offs that were considered in evaluating the safety of the update to RFC 2460 that updates use of the UDP checksum with IPv6.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 29, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction
   1.1. Document Structure
   1.2. Terminology
   1.3. Use of UDP Tunnels
      1.3.1. Motivation for new approaches
      1.3.2. Reducing forwarding cost
      1.3.3. Need to inspect the entire packet
      1.3.4. Interactions with middleboxes
      1.3.5. Support for load balancing
2. Standards-Track Transports
   2.1. UDP with Standard Checksum
   2.2. UDP-Lite
      2.2.1. Using UDP-Lite as a Tunnel Encapsulation
   2.3. General Tunnel Encapsulations
   2.4. Relation to UDP-Lite and UDP with checksum
3. Issues Requiring Consideration
   3.1. Effect of packet modification in the network
      3.1.1. Corruption of the destination IP address
      3.1.2. Corruption of the source IP address
      3.1.3. Corruption of Port Information
      3.1.4. Delivery to an unexpected port
      3.1.5. Corruption of Fragmentation Information
   3.2. Where Packet Corruption Occurs
   3.3. Validating the network path
   3.4. Applicability of method
   3.5. Impact on non-supporting devices or applications
4. Constraints on implementation of IPv6 nodes supporting zero checksum
5. Requirements on usage of the zero UDP checksum
6. Summary
7. Acknowledgements
8. IANA Considerations
9. Security Considerations
10. References
   10.1. Normative References
   10.2. Informative References
Appendix A. Evaluation of proposal to update RFC 2460 to support zero checksum
   A.1. Alternatives to the Standard Checksum
   A.2. Comparison
      A.2.1. Middlebox Traversal
      A.2.2. Load Balancing
      A.2.3. Ingress and Egress Performance Implications
      A.2.4. Deployability
      A.2.5. Corruption Detection Strength
      A.2.6. Comparison Summary
Appendix B. Document Change History
Authors’ Addresses

1. Introduction

The User Datagram Protocol (UDP) [RFC0768] transport is defined for the Internet Protocol (IPv4) [RFC0791] and is defined in "Internet Protocol, Version 6 (IPv6)" [RFC2460] for IPv6 hosts and routers. The UDP transport protocol has a minimal set of features. This limited set has enabled a wide range of applications to use UDP, but these applications do need to provide many important transport functions on top of UDP. The UDP Usage Guidelines [RFC5405] provide overall guidance for application designers, including the use of UDP to support tunneling. The key difference between UDP usage with IPv4 and IPv6 is that RFC 2460 mandates use of a calculated UDP checksum, i.e. a non-zero value, due to the lack of an IPv6 header checksum. The inclusion of the pseudo header in the checksum computation provides a statistical check that datagrams have been delivered to the intended IPv6 destination node. Algorithms for checksum computation are described in [RFC1071].
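
To make the role of the pseudo header concrete, the following sketch computes a UDP checksum over an IPv6 pseudo header, the UDP header, and the payload, using the one's complement sum described in [RFC1071]. It is an illustration only: the function names are hypothetical, and the code assumes no IPv6 extension headers that would alter the destination address used in the pseudo header.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero octet
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp6_checksum(src: str, dst: str, sport: int, dport: int,
                  payload: bytes) -> int:
    """UDP checksum for a datagram carried directly over IPv6."""
    length = 8 + len(payload)  # UDP header plus payload
    # Pseudo header: source, destination, 32-bit upper-layer length,
    # three zero octets, and Next Header value 17 (UDP).
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", length, 17)
    )
    header = struct.pack("!HHHH", sport, dport, length, 0)
    checksum = ~ones_complement_sum(pseudo + header + payload) & 0xFFFF
    # A computed value of zero is transmitted as all ones, so that zero
    # remains distinguishable as "no checksum".
    return checksum or 0xFFFF
```

Because both addresses enter the sum, a datagram whose addresses are altered in transit will, with high probability, fail verification at the receiver.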

The inability to use an IPv6 datagram with a zero UDP checksum has been observed to be a real problem for certain classes of application, primarily tunnel applications. This class of application has been deployed with a zero UDP checksum using IPv4. The design of IPv6 raises different issues when considering the safety of using a zero UDP checksum with IPv6. These issues can significantly affect applications, both when an endpoint is the intended user and when it is an innocent bystander (i.e., when a packet is received by a different endpoint to that intended).

This document examines the issues and an appendix compares the strengths and weaknesses of a number of proposed solutions. This identifies a set of issues that must be considered and mitigated to be able to safely deploy IPv6 applications that use a zero UDP checksum. The provided comparison of methods is expected to also be useful when considering applications that have different goals from the ones that initiated the writing of this document, especially the use of already standardized methods. The analysis concludes that using a zero UDP checksum is the best method of the proposed alternatives to meet the goals for certain tunnel applications.

This document defines recommendations and requirements for use of IPv6 datagrams with a zero UDP checksum. This usage is expected to have initial deployment issues related to middleboxes, limiting the usability more than desired in the currently deployed Internet. However, this limitation will be largest initially and will reduce as updates are provided in middleboxes that support the zero UDP checksum for IPv6. The document therefore derives a set of constraints required to ensure safe deployment of a zero UDP checksum.

Finally, the document also identifies some issues that require future consideration and possibly additional research.

1.1. Document Structure

Section 1 provides a background to key issues, and introduces the use of UDP as a tunnel transport protocol.

Section 2 describes a set of standards-track datagram transport protocols that may be used to support tunnels.

Section 3 discusses issues with a zero UDP checksum for IPv6. It considers the impact of corruption, the need for validation of the path and when it is suitable to use a zero UDP checksum.

Section 4 is an applicability statement that defines requirements and recommendations on the implementation of IPv6 nodes that support the use of a zero UDP checksum.

Section 5 provides an applicability statement that defines requirements and recommendations for protocols and tunnel encapsulations that are transported over an IPv6 transport that does not perform a UDP checksum calculation to verify the integrity at the transport endpoints.

Section 6 provides the recommendations for standardization of zero UDP checksum with a summary of the findings and notes remaining issues needing future work.

Appendix A evaluates the set of proposals to update the UDP transport behaviour and other alternatives intended to improve support for tunnel protocols. It concludes by assessing the trade-offs of the various methods, identifying advantages and disadvantages for each method.

1.2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

1.3. Use of UDP Tunnels

One increasingly popular use of UDP is as a tunneling protocol, where a tunnel endpoint encapsulates the packets of another protocol inside UDP datagrams and transmits them to another tunnel endpoint. Using UDP as a tunneling protocol is attractive when the payload protocol is not supported by the middleboxes that may exist along the path, because many middleboxes support transmission using UDP. In this use, the receiving endpoint decapsulates the UDP datagrams and forwards the original packets contained in the payload [RFC5405]. Tunnels establish virtual links that appear to directly connect locations that are distant in the physical Internet topology and can be used to create virtual (private) networks.

1.3.1. Motivation for new approaches

A number of tunnel encapsulations deployed over IPv4 have used the UDP transport with a zero checksum. Users of these protocols expect a similar solution for IPv6.

A number of tunnel protocols are also currently being defined (e.g. Automatic Multicast Tunneling, AMT [I-D.ietf-mboned-auto-multicast], and the Locator/Identifier Separation Protocol, LISP [LISP]). These protocols motivated an update to IPv6 UDP checksum processing to benefit from simpler checksum processing for various reasons:

o Reducing forwarding costs, motivated by redundancy present in the encapsulated packet header, since in tunnel encapsulations, payload integrity and length verification may be provided by higher layer encapsulations (often using the IPv4, UDP, UDP-Lite, or TCP checksums).

o Eliminating a need to access the entire packet when forwarding the packet by a tunnel endpoint.

o Enhancing ability to traverse and function with middleboxes.

o A desire to use the port number space to enable load-sharing.

1.3.2. Reducing forwarding cost

It is a common requirement to terminate a large number of tunnels on a single router/host. The processing cost per tunnel includes both state (memory requirements) and per-packet processing at the tunnel ingress and egress.

Automatic IP Multicast Tunneling, known as AMT [I-D.ietf-mboned-auto-multicast], currently specifies UDP as the transport protocol for packets carrying tunneled IP multicast packets. The current specification for AMT states that the UDP checksum in the outer packet header should be zero (see Section 6.6 of [I-D.ietf-mboned-auto-multicast]). That specification argues that the computation of an additional checksum is an unwarranted burden on nodes implementing lightweight tunneling protocols when an inner packet is already adequately protected. The AMT protocol needs to replicate a multicast packet to each gateway tunnel. In this case, the outer IP addresses are different for each tunnel and therefore require a different pseudo header to be built for each replicated UDP encapsulation.

The argument concerning redundant processing costs is valid regarding the integrity of a tunneled packet. In some architectures (e.g. PC-based routers), other mechanisms may also significantly reduce checksum processing costs: There are implementations that have optimised checksum processing algorithms, including the use of checksum-offloading. This processing is readily available for IPv4 packets at high line rates. Such processing may be anticipated for IPv6 endpoints, allowing receivers to reject corrupted packets without further processing. However, there are certain classes of tunnel end-points where this off-loading is not available and unlikely to become available in the near future.

1.3.3. Need to inspect the entire packet

The currently-deployed hardware in many routers uses a fast-path processing that only provides the first n bytes of a packet to the forwarding engine, where typically n <= 128.

When this design is used to support a tunnel ingress and egress, it prevents fast processing of a transport checksum over an entire (large) packet. Hence the currently defined IPv6 UDP checksum is poorly suited to use within a router that is unable to access the entire packet and does not provide checksum-offloading. Thus enabling checksum calculation over the complete packet can impact router design, performance improvement, energy consumption and/or cost.

1.3.4. Interactions with middleboxes

Many paths in the Internet include one or more middleboxes of various types. There exist large classes of middleboxes that will handle zero UDP checksum packets, but would not support UDP-Lite or the other investigated proposals. These middleboxes include load balancers (see Section 1.3.5), including Equal Cost Multipath Routing, traffic classifiers and other functions that read some fields in the UDP headers but do not validate the UDP checksum.

There are also middleboxes that either validate or modify the UDP checksum. The two most common classes are Firewalls and NATs. In IPv4, UDP-encapsulation may be desirable for NAT traversal, since UDP support is commonly provided. It is also necessary due to the almost ubiquitous deployment of IPv4 NATs. There has also been discussion of NAT for IPv6, although not for the same reason as in IPv4. If IPv6 NAT becomes a reality, it will hopefully not present the same protocol issues as for IPv4. If NAT is defined for IPv6, it should take into consideration the use of a zero UDP checksum.

The requirements for IPv6 firewall traversal are likely to be similar to those for IPv4. In addition, it can be reasonably expected that a firewall conforming to RFC 2460 will not regard datagrams with a zero UDP checksum as valid. Use of a zero UDP checksum with IPv6 requires firewalls to be updated before the full utility of the change is available.

It can be expected that datagrams with zero UDP checksum will initially not have the same middlebox traversal characteristics as regular UDP (RFC 2460). However, when implementations follow the requirements specified in this document, we expect the traversal capabilities to improve over time. We also note that deployment of IPv6-capable middleboxes is still in its initial phases. Thus, it might be that the number of non-updated boxes quickly becomes a very small percentage of the deployed middleboxes.

1.3.5. Support for load balancing

The UDP port number fields have been used as a basis to design load-balancing solutions for IPv4. This approach has also been leveraged for IPv6. An alternate method would be to utilise the IPv6 Flow Label [RFC6437] as a basis for entropy for load balancing. This would have the desirable effect of releasing IPv6 load-balancing devices from the need to assume semantics for the use of the transport port field and also works for all types of transport protocols.

This use of the flow label for load balancing is consistent with the intended use, although further clarity was needed to ensure the field can be consistently used for this purpose. An updated IPv6 Flow Label specification [RFC6437] and guidance on Equal-Cost Multi-Path (ECMP) routing usage [RFC6438] were therefore produced. Router vendors could be encouraged to start using the IPv6 Flow Label as a part of the flow hash, providing support for ECMP without requiring use of UDP.

However, the method for populating the outer IPv6 header with a value for the flow label is not trivial: If the inner packet uses IPv6, then the flow label value could be copied to the outer packet header. However, many current end-points set the flow label to a zero value (thus no entropy). The ingress of a tunnel seeking to provide good entropy in the flow label field would therefore need to create a random flow label value and keep corresponding state, so that all packets that were associated with a flow would be consistently given the same flow label. Although possible, this complexity may not be desirable in a tunnel ingress.
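
The state that such a tunnel ingress would need to keep can be sketched as follows. This is an illustration under stated assumptions, not a recommended design: the class name, the form of the flow key (for example, the inner addresses and ports), and the absence of any state expiry are all hypothetical simplifications.

```python
import secrets


class FlowLabelTable:
    """Per-flow outer flow label assignment at a tunnel ingress (sketch)."""

    def __init__(self):
        self._labels = {}  # flow key -> assigned 20-bit flow label

    def label_for(self, inner_flow_label: int, flow_key: tuple) -> int:
        # If the inner IPv6 packet already carries a non-zero flow label,
        # copy it to the outer header.
        if inner_flow_label != 0:
            return inner_flow_label & 0xFFFFF
        # Otherwise create a random non-zero label once and remember it,
        # so every packet of the flow receives the same value.
        label = self._labels.get(flow_key)
        if label is None:
            label = secrets.randbelow(0xFFFFF) + 1  # range 1..0xFFFFF
            self._labels[flow_key] = label
        return label
```

The table grows with the number of active flows and would need an eviction policy in practice, which illustrates the complexity referred to above.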

The end-to-end use of flow labels for load balancing is a long-term solution. Even if the usage of the flow label is clarified, there would be a transition time before a significant proportion of end- points start to assign a good quality flow label to the flows that they originate, with continued use of load balancing using the transport header fields until any widespread deployment is finally achieved.

2. Standards-Track Transports

The IETF has defined a set of transport protocols that may be applicable for tunnels with IPv6. There is also a set of network layer encapsulation tunnels such as IP-in-IP and GRE. These already standardized solutions are discussed here before the issues are presented, both as background for the issue description and to allow some comparison of where an issue may already occur.

2.1. UDP with Standard Checksum

UDP [RFC0768] with standard checksum behaviour, as defined in RFC 2460, has already been discussed. UDP usage guidelines are provided in [RFC5405].

2.2. UDP-Lite

UDP-Lite [RFC3828] offers an alternate transport to UDP, specified as a proposed standard, RFC 3828. A MIB is defined in [RFC5097] and unicast usage guidelines in [RFC5405]. There is at least one open source implementation as a part of the Linux kernel since version 2.6.20.

UDP-Lite provides a checksum with optional partial coverage. When using this option, a datagram is divided into a sensitive part (covered by the checksum) and an insensitive part (not covered by the checksum). When the checksum covers the entire packet, UDP-Lite is fully equivalent to UDP, with the exception that it uses a different value in the Next Header field in the IPv6 header. Errors/corruption in the insensitive part will not cause the datagram to be discarded by the transport layer at the receiving endpoint. A minor side-effect of using UDP-Lite is that this was specified for damage-tolerant payloads and some link-layers may employ different link encapsulations when forwarding UDP-Lite segments (e.g. radio access bearers). Most link-layers will cover the insensitive part with the same strong layer 2 frame CRC that covers the sensitive part.
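
The partial coverage described above can be illustrated with a short sketch of a UDP-Lite checksum computation over IPv6, following the field layout of RFC 3828 (the Length field of UDP is replaced by a Checksum Coverage field, and the Next Header value is 136). The function names are hypothetical and the code is an illustration rather than a complete implementation.

```python
import ipaddress
import struct


def _sum16(data: bytes) -> int:
    """Folded 16-bit one's complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udplite6_checksum(src: str, dst: str, sport: int, dport: int,
                      payload: bytes, coverage: int) -> int:
    """UDP-Lite checksum; coverage counts octets from the header start.

    A coverage of 0 means the whole datagram is covered; values from
    1 to 7 are not permitted because the header is always covered.
    """
    total_len = 8 + len(payload)
    if coverage != 0 and not 8 <= coverage <= total_len:
        raise ValueError("coverage must be 0 or between 8 and the datagram length")
    header = struct.pack("!HHHH", sport, dport, coverage, 0)
    covered = (header + payload)[: coverage or total_len]
    # The pseudo header carries the full UDP-Lite length, taken from the
    # IP layer rather than from the coverage field.
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", total_len, 136)
    )
    return (~_sum16(pseudo + covered) & 0xFFFF) or 0xFFFF
```

With a coverage of 8, only the pseudo header and the UDP-Lite header contribute, so two datagrams of equal length that differ only in their payload yield the same checksum.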

2.2.1. Using UDP-Lite as a Tunnel Encapsulation

Tunnel encapsulations can use UDP-Lite (e.g. Control And Provisioning of Wireless Access Points, CAPWAP [RFC5415]), since UDP-Lite provides a transport-layer checksum, including an IP pseudo header checksum, in IPv6, without the need for a router/middlebox to traverse the entire packet payload. This provides most of the verification required for delivery and still keeps a low complexity for the checksumming operation. UDP-Lite may set the length of checksum coverage on a per packet basis. This feature could be used if a tunnel protocol is designed to only verify delivery of the tunneled payload and uses a calculated checksum for control information.

There is currently poor support for middlebox traversal using UDP-Lite, because UDP-Lite uses a different IPv6 network-layer Next Header value to that of UDP, and few middleboxes are able to interpret UDP-Lite and take appropriate actions when forwarding the packet. This makes UDP-Lite less suited to protocols needing general Internet support, until such time that UDP-Lite has achieved better support in middleboxes and end-points.

2.3. General Tunnel Encapsulations

The IETF has defined a set of tunneling protocols or network layer encapsulations, e.g., IP-in-IP and GRE. These either do not include a checksum or use a checksum that is optional, since tunnel encapsulations are typically layered directly over the Internet layer (identified by the upper layer type in the IPv6 Next Header field) and are also not used as endpoint transport protocols. There is little chance of confusing a tunnel-encapsulated packet with other application data that could result in corruption of application state or data.

From the end-to-end perspective, the principal difference is that the network-layer Next Header field identifies a separate transport, which reduces the probability that corruption could result in the packet being delivered to the wrong endpoint or application. Specifically, packets are only delivered to protocol modules that process a specific Next Header value. The Next Header field therefore provides a first-level check of correct demultiplexing. In contrast, the UDP port space is shared by many diverse applications and therefore UDP demultiplexing relies solely on the port numbers.

2.4. Relation to UDP-Lite and UDP with checksum

The operation of IPv6 with UDP with a zero-checksum is not the same as IPv4 with UDP with a zero-checksum. Protocol designers should not be fooled into thinking the two are the same. The requirements below list a set of additional considerations.

Where possible, existing general tunnel encapsulations, such as GRE and IP-in-IP, should be used. This section assumes that such existing tunnel encapsulations do not offer the functionality required to satisfy the protocol designer’s goals. The section considers the standardized alternative solutions, rather than the full set of ideas evaluated in Appendix A. The alternatives to UDP with a zero checksum are UDP with a (calculated) checksum, and UDP-Lite.

UDP with a checksum has the advantage of close to universal support in both endpoints and middleboxes. It also provides statistical verification of delivery to the intended destination (address and port). However, some classes of device have limited support for calculation of a checksum that covers a full datagram. For these devices, this can incur significant processing cost (e.g. requiring processing in the router slow-path) and can hence reduce capacity or fail to function.

UDP-Lite has the advantage of using a checksum that is calculated only over the pseudo header and the UDP header. This provides a statistical verification of delivery to the intended destination (address and port). The checksum can be calculated without access to the datagram payload, only requiring access to the part to be protected. A drawback is that UDP-Lite has currently limited support in both end-points (i.e. is not supported on all operating system platforms) and middleboxes (that require support for the UDP-Lite header type). A path verification method is therefore recommended.

IPv6 and UDP with a zero-checksum can also be used by nodes that do not permit calculation of a payload checksum. Many existing classes of middleboxes do not verify or change the transport checksum. For these middleboxes, IPv6 with a zero UDP checksum is expected to function where UDP-Lite would not. However, support for the zero UDP checksum in middleboxes that do change or verify the checksum is currently limited, and this may result in datagrams with a zero UDP checksum being discarded. A path verification method is therefore recommended.

There are sets of constraints for which no solution exists: A protocol designer that needs to originate or receive datagrams on a device that can not efficiently calculate a checksum over a full datagram and also needs these packets to pass through a middlebox that verifies or changes a UDP checksum, but does not support a zero UDP checksum, can not use the zero UDP checksum method. Similarly, one that originates datagrams on a device with UDP-Lite support, but needs the packets to pass through a middlebox that does not support UDP-Lite, can not use UDP-Lite. For such cases, there is no optimal solution and the current recommendation is to use or fall-back to using UDP with full checksum coverage.
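
The fall-back behaviour described in this section can be summarised as a small selection routine. The sketch below is an interpretation, not part of the requirements: the names are hypothetical, the inputs are assumed to come from a path verification method, and the order of preference between the zero UDP checksum and UDP-Lite is an assumption made for the example.

```python
from enum import Enum


class Encapsulation(Enum):
    UDP_ZERO_CHECKSUM = "udp-zero-checksum"
    UDP_LITE = "udp-lite"
    UDP_FULL_CHECKSUM = "udp-full-checksum"


def choose_encapsulation(zero_checksum_path_verified: bool,
                         udp_lite_supported_locally: bool,
                         udp_lite_path_verified: bool) -> Encapsulation:
    """Pick a transport for a tunnel, falling back to UDP with checksum."""
    # Use the zero checksum only when the path is known to pass it.
    if zero_checksum_path_verified:
        return Encapsulation.UDP_ZERO_CHECKSUM
    # UDP-Lite needs support in the endpoint and along the path.
    if udp_lite_supported_locally and udp_lite_path_verified:
        return Encapsulation.UDP_LITE
    # Otherwise fall back to UDP with full checksum coverage.
    return Encapsulation.UDP_FULL_CHECKSUM
```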

3. Issues Requiring Consideration

This informative section evaluates issues around the proposal to update IPv6 [RFC2460], to enable the UDP transport checksum to be set to zero. Some of the identified issues are shared with other protocols already in use. The section also provides background to the requirements and recommendations that follow.

The decision in RFC 2460 to omit an integrity check at the network level meant that the IPv6 transport checksum was overloaded with many functions, including validating:

o the endpoint address was not corrupted within a router, i.e., a packet was intended to be received by this destination and validate that the packet does not consist of a wrong header spliced to a different payload;

o that extension header processing is correctly delimited - i.e., the start of data has not been corrupted. In this case, reception of a valid Next Header value provides some protection;

o reassembly processing, when used;

o the length of the payload;

o the port values - i.e., the correct application receives the payload (applications should also check the expected use of source ports/addresses);

o the payload integrity.

In IPv4, the first four checks are performed using the IPv4 header checksum.

In IPv6, these checks occur within the endpoint stack using the UDP checksum information. An IPv6 node also relies on the header information to determine whether to send an ICMPv6 error message [RFC4443] and to determine the node to which this is sent. Corrupted information may lead to misdelivery to an unintended application socket on an unexpected host.

3.1. Effect of packet modification in the network

IP packets may be corrupted as they traverse an Internet path. Older evidence in "When the CRC and TCP Checksum Disagree" [Sigcomm2000] shows that this was an issue with IPv4 routers in the year 2000, and that occasional corruption could result from bad internal router processing in routers or hosts. These errors are not detected by the strong frame checksums employed at the link-layer [RFC3819]. During the development of this document in 2009, individuals provided reports of observed rates for received UDP datagrams using IPv4 where the UDP checksum had been detected as corrupt. These rates were as high as 1.39E-4 for some paths, but also close to zero for some other paths.
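
As a rough illustration of what the highest reported rate implies, consider the expected number of datagrams detected as corrupt among N received datagrams on such a path. The value of N below is an arbitrary assumption chosen for the example and does not come from the reports.

```latex
E[\text{corrupt}] = N \cdot p = 10^{6} \times 1.39 \times 10^{-4} = 139
```

That is, on the worst reported paths, on the order of one hundred datagrams per million received would be detected as corrupt by the UDP checksum.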

There is extensive experience of deployment using tunnel protocols in well-managed networks (e.g. corporate networks or service provider core networks). This has shown the robustness of methods such as PWE and MPLS that do not employ a transport protocol checksum and have not specified mechanisms to protect from corruption of the unprotected headers (such as the VPN Identifier in MPLS). Reasons for the robustness may include:

o A reduced probability of corruption on paths through well-managed networks.

o IP forms the majority of the inner traffic carried by these tunnels. Hence from a transport perspective, endpoint verification is already being performed when processing a received IPv4 packet or by the transport pseudo-header for an IPv6 packet. This update to UDP does not change this behaviour.

o In certain cases, a combination of additional filtering (e.g. filter of a MAC destination address in a L2 tunnel) significantly reduces the probability of final mis-delivery to the IP stack.

o The tunnel protocols did not use a UDP transport header; any corruption is therefore unlikely to result in misdelivery to another UDP-based application. This concern is specific to the use of UDP with IPv6.

While this experience can guide the present recommendations, any update to UDP must preserve operation in the general Internet. This is heterogeneous and can include links and systems of very varying characteristics. Transport protocols used by hosts need to be designed with this in mind, especially when there is need to traverse edge networks, where middlebox deployments are common.

For the general Internet, there is currently no evidence that corruption is rare, nor that the issue does not apply to IPv6. It therefore seems prudent not to relax checks on misdelivery. The emergence of low-end IPv6 routers and the proposed use of NAT with IPv6 further motivate the need to protect from misdelivery.

Corruption in the network may result in:

o A datagram being misdelivered to the wrong host/router or the wrong transport entity within an endpoint. Such a datagram needs to be discarded;

o A datagram payload being corrupted, but still delivered to the intended host/router transport entity. Such a datagram needs to be either discarded or correctly processed by an application that provides its own integrity checks;

o A datagram payload being truncated by corruption of the length field. Such a datagram needs to be discarded.

When a checksum is used, this significantly reduces the impact of errors, reducing the probability of undetected corruption of state (and data) on both the host stack and the applications using the transport service.

The following sections examine the impact of modifying each of these header fields.

3.1.1. Corruption of the destination IP address

An IPv6 endpoint destination address could be modified in the network (e.g. corrupted by an error). This is not a concern for IPv4, because the IP header checksum will result in this packet being discarded by the receiving IP stack. Such modification in the network can not be detected at the network layer when using IPv6. Detection of this corruption by a UDP receiver relies on the IPv6 pseudo header incorporated in the transport checksum.

There are two possible outcomes:

o Delivery to a destination address that is not in use (the packet will not be delivered, but could result in an error report);

o Delivery to a different destination address. This modification will normally be detected by the transport checksum, resulting in silent discard. Without a computed checksum, the packet would be passed to the endpoint port demultiplexing function. If an application is bound to the associated ports, the packet payload will be passed to the application (see the subsequent section on port processing).
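
As an illustration of the detection described above, the following Python sketch computes a UDP checksum over the IPv6 pseudo header and shows that verification fails when the destination address differs from the one used by the sender. The function names and the example addresses (taken from the 2001:db8::/32 documentation prefix) are assumptions made for this sketch and are not defined by this document.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """Complemented one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def pseudo_header(src: str, dst: str, upper_layer_length: int) -> bytes:
    """IPv6 pseudo header: addresses, upper-layer length, Next Header = 17."""
    return (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", upper_layer_length, 17)
    )


def build_udp(src: str, dst: str, sport: int, dport: int, payload: bytes) -> bytes:
    """Build a UDP header and payload with a calculated checksum."""
    header = struct.pack("!HHHH", sport, dport, 8 + len(payload), 0)
    segment = header + payload
    csum = internet_checksum(pseudo_header(src, dst, len(segment)) + segment)
    csum = csum or 0xFFFF  # a computed value of zero is sent as all ones
    return header[:6] + struct.pack("!H", csum) + payload


def verify_udp(src: str, dst: str, segment: bytes) -> bool:
    """True if the checksum is consistent with the addresses seen on receipt."""
    return internet_checksum(pseudo_header(src, dst, len(segment)) + segment) == 0


# Hypothetical example: the same datagram seen with an intact and with a
# corrupted destination address (one bit changed).
segment = build_udp("2001:db8::1", "2001:db8::2", 5000, 6000, b"hello")
intact = verify_udp("2001:db8::1", "2001:db8::2", segment)
misdelivered = verify_udp("2001:db8::1", "2001:db8::3", segment)
```

In this sketch, "intact" is true and "misdelivered" is false. With a zero checksum field there is nothing to verify, so the second datagram would instead be passed to port demultiplexing.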

3.1.2. Corruption of the source IP address

This section examines what happens when the source address is corrupted in transit. This is not a concern in IPv4, because the IP header checksum will normally result in this packet being discarded by the receiving IP stack. Detection of this corruption by a UDP receiver relies on the IPv6 pseudo header incorporated in the transport checksum.

Corruption of an IPv6 source address does not result in the IP packet being delivered to a different endpoint protocol or destination address. If only the source address is corrupted, the datagram will likely be processed in the intended context, although with erroneous origin information. When using Unicast Reverse Path Forwarding [RFC2827], a change in address may result in the router discarding the packet when the route to the modified source address is different to that of the source address of the original packet.

The result will depend on the application or protocol that processes the packet. Some examples are:

o An application that requires a pre-established context may disregard the datagram as invalid, or could map this to another context (if a context for the modified source address was already activated).

o A stateless application will process the datagram outside of any context. A simple example is the ECHO server, which will respond with a datagram directed to the modified source address. This would create unwanted additional processing load, and generate traffic to the modified endpoint address.

o Some datagram applications build state using the information from packet headers. A previously unused source address would result in receiver processing and the creation of unnecessary transport-layer state at the receiver. For example, Real-time Transport Protocol (RTP) [RFC3550] sessions commonly employ a source-independent receiver port. State is created for each received flow. Reception of a datagram with a corrupted source address will therefore result in accumulation of unnecessary state in the RTP state machine, including collision detection and response (since the same synchronization source, SSRC, value will appear to arrive from multiple source IP addresses).

o ICMP messages relating to a corrupted packet can be misdirected to the wrong source node.

In general, the effect of corrupting the source address will depend upon the protocol that processes the packet and its robustness to this error. For the case where the packet is received by a tunnel endpoint, the tunnel application is expected to correctly handle a corrupted source address.

The impact of source address modification is more difficult to quantify when the receiving application is not that originally intended and several fields have been modified in transit.

3.1.3. Corruption of Port Information

This section describes what happens if one or both of the UDP port values are corrupted in transit. This can also happen when IPv4 is used with a zero UDP checksum, but not when UDP checksums are calculated or when UDP-Lite is used. If the ports carried in the transport header of an IPv6 packet were corrupted in transit, packets may be delivered to the wrong application process (on the intended machine) and/or responses or errors sent to the wrong application process (on the intended machine).

3.1.4. Delivery to an unexpected port

If one combines the corruption effects, such as destination address and ports, there are a number of potential outcomes when traffic arrives at an unexpected port. This section discusses these possibilities and their outcomes for a packet that does not use the UDP checksum validation:

o Delivery to a port that is not in use. The packet is discarded, but could generate an ICMPv6 message (e.g. port unreachable).

o It could be delivered to a different node that implements the same application, where the packet may be accepted, generating side-effects or accumulated state.

o It could be delivered to an application that does not implement the tunnel protocol, where the packet may be incorrectly parsed, and may be misinterpreted, generating side-effects or accumulated state.

The probability of each outcome depends on the statistical probability that the address or the port information for the source or destination becomes corrupt in the datagram such that they match those of an existing flow or server port. Unfortunately, such a match may be more likely for UDP than for connection-oriented transports, because:

1. There is no handshake prior to communication and no sequence numbers (as in TCP, DCCP, or SCTP). Together, this makes it hard to verify that an application process is given only the application data associated with a specific transport session.

2. Application writers often bind to wild-card values in endpoint identifiers and do not always validate correctness of datagrams they receive (guidance on this topic is provided in [RFC5405]).

While these rules could, in principle, be revised to declare naive applications as "Historic", this remedy is not realistic: the transport owes it to the stack to do its best to reject bogus datagrams.

If checksum coverage is suppressed, the application therefore needs to provide a method to detect and discard the unwanted data. A tunnel protocol would need to perform its own integrity checks on any control information if transported in datagrams with a zero UDP checksum. If the tunnel payload is another IP packet, the packets requiring checksums can be assumed to have their own checksums provided that the rate of corrupted packets is not significantly larger due to the tunnel encapsulation. If a tunnel transports other inner payloads that do not use IP, the assumptions of corruption detection for that particular protocol must be fulfilled; this may require an additional checksum/CRC and/or integrity protection of the payload and tunnel headers.

A protocol that uses a zero UDP checksum cannot assume that it is the only protocol using a zero UDP checksum. Therefore, it needs to gracefully handle misdelivery. It must be robust to reception of malformed packets received on a listening port and expect that these packets may contain corrupted data or data associated with a completely different protocol.
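
One way a tunnel protocol might meet these expectations is sketched below in Python. The header layout (a protocol identifier, version, flags, and tunnel identifier protected by a CRC-32), the constant MAGIC, and the function names are all hypothetical and are not taken from any specified tunnel protocol. The sketch only shows the general shape of a receiver that checks its own control information and silently discards anything that does not parse.

```python
import struct
import zlib
from typing import Optional, Set, Tuple

MAGIC = 0x7A43                      # hypothetical protocol identifier
HEADER = struct.Struct("!HBBI")     # magic, version, flags, tunnel id (8 bytes)
CRC = struct.Struct("!I")           # CRC-32 computed over the 8 header bytes


def encapsulate(tunnel_id: int, inner: bytes, flags: int = 0) -> bytes:
    """Prefix the inner packet with a header protected by its own CRC."""
    fields = HEADER.pack(MAGIC, 1, flags, tunnel_id)
    return fields + CRC.pack(zlib.crc32(fields)) + inner


def decapsulate(datagram: bytes, known_tunnels: Set[int]) -> Optional[Tuple[int, bytes]]:
    """Return (tunnel id, inner packet), or None if the datagram is rejected."""
    if len(datagram) < HEADER.size + CRC.size:
        return None                 # too short to belong to this protocol
    fields = datagram[:HEADER.size]
    (crc,) = CRC.unpack_from(datagram, HEADER.size)
    if zlib.crc32(fields) != crc:
        return None                 # header corrupted, or another protocol's data
    magic, version, _flags, tunnel_id = HEADER.unpack(fields)
    if magic != MAGIC or version != 1 or tunnel_id not in known_tunnels:
        return None                 # consistent header, but no matching context
    return tunnel_id, datagram[HEADER.size + CRC.size:]
```

In this sketch the CRC covers only the tunnel header. An inner IP packet would rely on its own checksums, whereas a non-IP inner payload would need the integrity check extended to cover the payload as well.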

3.1.5. Corruption of Fragmentation Information

The fragmentation information in IPv6 employs a 32-bit identity field (compared to only a 16-bit field in IPv4), a 13-bit fragment offset, and a 1-bit flag indicating whether there are more fragments. Corruption of any of these fields may result in one of two outcomes:

Reassembly failure: An error in the "More Fragments" field for the last fragment will, for example, result in the packet never being considered complete, so that it will eventually be timed out and discarded. A corruption in the ID field will result in the fragment not being delivered to the intended context, thus leaving the rest of the packet incomplete, unless that packet has been duplicated prior to corruption. The incomplete packet will eventually be timed out and discarded.

Erroneous reassembly: The reassembled packet does not match the original packet. This can occur when the ID field of a fragment is corrupted, resulting in a fragment becoming associated with another packet and taking the place of another fragment. Corruption in the offset information can cause the fragment to be misaligned in the reassembly buffer, resulting in incorrect reassembly. Corruption can cause the packet to become shorter or longer; however, completion of reassembly is much less probable, since this would require consistent corruption of the IPv6 header's payload length field and the offset field. The possibility of mis-assembly requires the reassembling stack to provide strong checks that detect overlap or missing data. Note, however, that this is not guaranteed and has been clarified in "Handling of Overlapping IPv6 Fragments" [RFC5722].

The erroneous reassembly of packets is a general concern and such packets should be discarded instead of being passed to higher layer processes. The primary detector of packet length changes is the IP payload length field, with a secondary check by the transport checksum. The Upper-Layer Packet length field included in the pseudo header assists in verifying correct reassembly, since the Internet checksum has a low probability of detecting insertion of data or overlap errors (due to misplacement of data). The checksum is also incapable of detecting insertion or removal of all zero-data that occurs in a multiple of a 16-bit chunk.
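
The limitations of the Internet checksum noted above can be seen in a small Python sketch. The byte strings are arbitrary example data, and the helper with_length is a simplified stand-in (an assumption of this sketch) for the length field carried in the pseudo header.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Complemented one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def with_length(data: bytes) -> int:
    """Checksum that also covers a 32-bit length, as the pseudo header does."""
    return internet_checksum(struct.pack("!I", len(data)) + data)


original = b"\x12\x34\x56\x78\x9a\xbc"
zero_inserted = original[:2] + b"\x00\x00" + original[2:]   # extra all-zero word
misplaced = original[2:4] + original[:2] + original[4:]     # two words swapped

# The bare checksum does not change in either case.
insert_undetected = internet_checksum(original) == internet_checksum(zero_inserted)
swap_undetected = internet_checksum(original) == internet_checksum(misplaced)

# Covering the length exposes the insertion, but not the misplacement.
insert_caught_by_length = with_length(original) != with_length(zero_inserted)
swap_caught_by_length = with_length(original) != with_length(misplaced)
```

Here the first three results are true and the last is false: the length field helps with inserted or removed data, while misplaced 16-bit words of equal total length remain invisible to the checksum.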

The most significant risk of corruption results following mis-association of a fragment with a different packet. This risk can be significant, since the size of fragments is often the same (e.g. fragments resulting when the path MTU results in fragmentation of a larger packet, common when addition of a tunnel encapsulation header expands the size of a packet). Detection of this type of error requires a checksum or other integrity check of the headers and the payload. Such protection is in any case desirable for tunnel encapsulations using IPv4, since the small fragmentation ID can easily result in wrap-around [RFC4963]; this is especially the case for tunnels that perform flow aggregation [I-D.ietf-intarea-tunnels].

Tunnel fragmentation behavior matters. There can be outer or inner fragmentation (see "Tunnels in the Internet Architecture" [I-D.ietf-intarea-tunnels]). If there is inner fragmentation by the tunnel, the outer headers will never be fragmented and thus a zero UDP checksum in the outer header will not affect the reassembly process. When a tunnel performs outer header fragmentation, the tunnel egress needs to perform reassembly of the outer fragments into an inner packet. The inner packet is either a complete packet or a fragment. If it is a fragment, the destination endpoint of the fragment will perform reassembly of the received fragments. The complete packet or the reassembled fragments will then be processed according to the packet Next Header field. The receiver may only detect reassembly anomalies when it uses a protocol with a checksum. The larger the number of reassembly processes to which a packet has been subjected, the greater the probability of an error.

o An IP-in-IP tunnel that performs inner fragmentation has similar properties to a UDP tunnel with a zero UDP checksum that also performs inner fragmentation.

o An IP-in-IP tunnel that performs outer fragmentation has similar properties to a UDP tunnel with a zero UDP checksum that performs outer fragmentation.

o A tunnel that performs outer fragmentation can result in a higher level of corruption due to both inner and outer fragmentation, enabling more chances for reassembly errors to occur.

o Recursive tunneling can result in fragmentation at more than one header level, even for inner fragmentation unless it goes to the inner-most IP header.

o Unless there is verification at each reassembly, the probability for undetected error will increase with the number of times fragmentation is recursively applied, making IP-in-IP and UDP with zero UDP checksum both vulnerable to undetected errors.

In conclusion, fragmentation of datagrams with a zero UDP checksum does not worsen the performance compared to some other commonly used tunnel encapsulations. However, caution is needed for recursive tunneling without any additional verification at the different tunnel layers.
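
The growth in risk with recursive tunneling can be made concrete under a simplifying assumption that is not made by this document: that each reassembly step introduces an undetected error independently and with the same probability. The per-stage probability used in the Python sketch below is purely hypothetical.

```python
def undetected_error_probability(per_stage: float, stages: int) -> float:
    """Probability that at least one of `stages` reassembly steps errs.

    Assumes (for illustration only) independent stages that share the
    same per-stage probability and have no verification between them.
    """
    if not 0.0 <= per_stage <= 1.0 or stages < 0:
        raise ValueError("per_stage must be in [0, 1] and stages non-negative")
    return 1.0 - (1.0 - per_stage) ** stages


# Hypothetical per-stage probability, compared for one and three
# unverified reassembly stages.
single = undetected_error_probability(1e-6, 1)
triple = undetected_error_probability(1e-6, 3)
```

For small per-stage probabilities the result grows almost linearly with the number of unverified stages, which is why verification at each tunnel layer is worth considering.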

3.2. Where Packet Corruption Occurs

Corruption of IP packets can occur at any point along a network path, during packet generation, during transmission over the link, in the process of routing and switching, etc. Some transmission steps include a checksum or Cyclic Redundancy Check (CRC) that reduces the probability for corrupted packets being forwarded, but there still exists a probability that errors may propagate undetected.

Unfortunately the community lacks reliable information to identify the most common functions or equipment that result in packet corruption. However, there are indications that the place where corruption occurs can vary significantly from one path to another.

There is therefore a risk in applying evidence from one domain of usage to infer characteristics for another. Methods intended for general Internet usage must therefore assume that corruption can occur and deploy mechanisms to mitigate the effect of corruption and/or resulting misdelivery.

3.3. Validating the network path

IP transports designed for use in the general Internet should not assume specific path characteristics. Network protocols may reroute packets, which changes the set of routers and middleboxes along a path. Therefore transports such as TCP, SCTP and DCCP have been designed to negotiate protocol parameters, adapt to different network path characteristics, and receive feedback to verify that the current path is suited to the intended application. Applications using UDP and UDP-Lite need to provide their own mechanisms to confirm the validity of the current network path.

A zero value in the UDP checksum field is explicitly disallowed in RFC 2460. Thus it may be expected that any device on the path that has a reason to look beyond the IP header, for example to validate the UDP checksum, will consider such a packet as erroneous or illegal and may discard it, unless the device is updated to support the new behavior. Any middlebox that modifies the UDP checksum, for example a NAT that changes the values of the IP and UDP header in such a way that the checksum over the pseudo header changes value, will need to be updated to support this behavior. Until then, a zero UDP checksum packet is likely to be discarded either directly in the middlebox or at the destination, when a zero UDP checksum has been modified to a non-zero value by an incremental update.

A pair of end-points intending to use a new behavior will therefore not only need to ensure support at each end-point, but also that the path between them will deliver packets with the new behavior. This may require using negotiation or an explicit mandate to use the new behavior by all nodes that support the new protocol.

Enabling the use of a zero checksum places new requirements on equipment deployed within the network, such as middleboxes. A middlebox (e.g. Firewalls, Network Address Translators) may enable zero checksum usage for a particular range of ports. Note that checksum off-loading and operating system design may result in all IPv6 UDP traffic being sent with a calculated checksum. This requires middleboxes that are configured to enable a zero UDP checksum to continue to work with bidirectional UDP flows that use a zero UDP checksum in only one direction, and therefore they must not maintain separate state for a UDP flow based on its checksum usage.
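
The middlebox behavior described above can be sketched in Python as follows. The class name, the form of the port configuration, and the choice to accept a zero checksum when either port falls in a configured range are assumptions of this sketch. The point illustrated is that the flow key is built from addresses and ports only, so both directions of a flow share one state entry regardless of checksum usage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]                 # (address, port)
FlowKey = Tuple[Endpoint, Endpoint]        # direction-independent pair


@dataclass
class Middlebox:
    zero_checksum_ports: List[range]       # hypothetical operator configuration
    flows: Dict[FlowKey, int] = field(default_factory=dict)  # packets per flow

    def forward(self, src: str, sport: int, dst: str, dport: int, checksum: int) -> bool:
        """Return True if the datagram is forwarded, False if it is dropped."""
        if checksum == 0 and not any(
            sport in ports or dport in ports for ports in self.zero_checksum_ports
        ):
            return False                   # zero checksum outside the enabled ranges
        # The checksum value is deliberately not part of the flow key.
        first, second = sorted(((src, sport), (dst, dport)))
        key = (first, second)
        self.flows[key] = self.flows.get(key, 0) + 1
        return True
```

A tunnel whose ingress sends a zero checksum while its egress replies with a calculated checksum would then create a single entry in "flows", rather than one entry per checksum style.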

Support along the path between end points can be guaranteed in limited deployments by appropriate configuration. In general, it can be expected to take time for deployment of any updated behaviour to become ubiquitous.

A sender will need to probe the path to verify the expected behavior. Path characteristics may change, and usage therefore should be robust and able to detect a failure of the path under normal usage and re-negotiate. Note that a bidirectional path does not necessarily support the same checksum usage in both the forward and return directions: receipt of a datagram with a zero UDP checksum does not imply that the remote endpoint can also receive a datagram with a zero UDP checksum. This will require periodic validation of the path, adding complexity to any solution using the new behavior.

3.4. Applicability of method

The update to the IPv6 specification defined in [I-D.ietf-6man-udpchecksums] only modifies IPv6 nodes that implement specific protocols designed to permit omission of a UDP checksum. This document therefore provides an applicability statement for the updated method indicating when the mechanism can (and can not) be used. Enabling this, and ensuring correct interactions with the stack, implies much more than simply disabling the checksum algorithm for specific packets at the transport interface.

When the method is widely available, it may be expected to be used by applications that are perceived to gain benefit. Any solution that uses an end-to-end transport protocol, rather than an IP-in-IP encapsulation, needs to minimise the possibility that application processes could confuse a corrupted or wrongly delivered UDP datagram with that of data addressed to the application running on their endpoint.

The protocol or application that uses the zero checksum method must ensure that the lack of checksum does not affect the protocol operation. This includes being robust to receiving an unintended packet from another protocol or context following corruption of a destination or source address and/or port value. It also includes considering the need for additional implicit protection mechanisms required when using the payload of a UDP packet received with a zero checksum.

3.5. Impact on non-supporting devices or applications

It is important to consider the potential impact of using a zero UDP checksum on end-point devices or applications that are not modified to support the new behavior or that, by default or preference, use the regular behavior. These applications must not be significantly impacted by the update.

To illustrate why this is necessary, consider the implications of a node that enables use of a zero UDP checksum at the interface level: this would result in all applications that listen to a UDP socket receiving datagrams where the checksum was not verified. This could have a significant impact on an application that was not designed with the additional robustness needed to handle received packets with corruption, creating state or destroying existing state in the application.

A zero UDP checksum therefore needs to be enabled only for individual ports using an explicit request by the application. In this case, applications using other ports would maintain the current IPv6 behavior, discarding incoming datagrams with a zero UDP checksum. These other applications would not be affected by this changed behavior. An application that allows the changed behavior should be aware of the risk of corruption and the increased level of misdirected traffic, and can be designed robustly to handle this risk.

4. Constraints on implementation of IPv6 nodes supporting zero checksum

This section is an applicability statement that defines requirements and recommendations on the implementation of IPv6 nodes that support use of a zero value in the checksum field of a UDP datagram.

All implementations that support this zero UDP checksum method MUST conform to the requirements defined below.

1. An IPv6 sending node MAY use a calculated RFC 2460 checksum for all datagrams that it sends. This explicitly permits an interface that supports checksum offloading to insert an updated UDP checksum value in all UDP datagrams that it forwards. Note, however, that sending a calculated checksum requires the receiver to also perform the checksum calculation. Checksum offloading can normally be switched off for a particular interface to ensure that datagrams are sent with a zero UDP checksum.

2. IPv6 nodes SHOULD by default NOT allow the zero UDP checksum method for transmission.

3. IPv6 nodes MUST provide a way for the application/protocol to indicate the set of ports that will be enabled to send datagrams with a zero UDP checksum. This may be implemented by enabling a transport mode using a socket API call when the socket is established, or a similar mechanism. It may also be implemented by enabling the method for a pre-assigned static port used by a specific tunnel protocol.

4. IPv6 nodes MUST provide a method to allow an application/protocol to indicate that a particular UDP datagram is required to be sent with a UDP checksum. This needs to be allowed by the operating system at any time (e.g. to send keep-alive datagrams), not just when a socket is established in the zero checksum mode.

5. The default IPv6 node receiver behaviour MUST discard all IPv6 packets carrying datagrams with a zero UDP checksum.

6. IPv6 nodes MUST provide a way for the application/protocol to indicate the set of ports that will be enabled to receive datagrams with a zero UDP checksum. This may be implemented via a socket API call, or similar mechanism. It may also be implemented by enabling the method for a pre-assigned static port used by a specific tunnel protocol.

7. IPv6 nodes supporting usage of zero UDP checksums MUST also allow reception using a calculated UDP checksum on all ports configured to allow zero UDP checksum usage. (The sending endpoint, e.g. encapsulating ingress, may choose to compute the UDP checksum, or may calculate this by default.) The receiving endpoint MUST use the reception method specified in RFC 2460 when the checksum field is not zero.

8. RFC 2460 specifies that IPv6 nodes SHOULD log received datagrams with a zero UDP checksum. This remains the case for any datagram received on a port that does not explicitly enable processing of a zero UDP checksum. A port for which the zero UDP checksum has been enabled MUST NOT log the datagram solely because the checksum value is zero.

9. IPv6 nodes MAY separately identify received UDP datagrams that are discarded with a zero UDP checksum. They SHOULD NOT add these to the standard log, since the endpoint has not been verified. This may be used to support other functions (such as a security policy).

10. IPv6 nodes that receive ICMPv6 messages that refer to packets with a zero UDP checksum MUST provide appropriate checks concerning the consistency of the reported packet to verify that the reported packet actually originated from the node, before acting upon the information (e.g. validating the address and port numbers in the ICMPv6 message body).
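
The receive-side requirements in items 5 to 8 of the list above can be read as a small decision procedure. The Python sketch below is one possible reading, not a normative implementation: the names Action and receive are hypothetical, and the caller is assumed to have already verified any non-zero checksum using the RFC 2460 method.

```python
from enum import Enum
from typing import Set


class Action(Enum):
    DELIVER = "deliver"
    DISCARD = "discard"
    DISCARD_AND_LOG = "discard and log"


def receive(dport: int, checksum_field: int, checksum_valid: bool,
            zero_enabled_ports: Set[int]) -> Action:
    """Decide what to do with a received IPv6 UDP datagram.

    checksum_valid is only meaningful when checksum_field is non-zero.
    zero_enabled_ports holds the ports an application explicitly enabled.
    """
    if checksum_field != 0:
        # Item 7: a calculated checksum is always accepted and verified,
        # including on ports that allow a zero checksum.
        return Action.DELIVER if checksum_valid else Action.DISCARD
    if dport in zero_enabled_ports:
        # Items 6 and 8: enabled port, so the datagram is delivered and is
        # not logged solely because the checksum is zero.
        return Action.DELIVER
    # Items 5 and 8: default behavior, discard and log as in RFC 2460.
    return Action.DISCARD_AND_LOG
```

For example, with port 4789 chosen arbitrarily as the only enabled port, a zero-checksum datagram to port 4789 is delivered, while the same datagram addressed to any other port is discarded and logged.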

5. Requirements on usage of the zero UDP checksum

This section is an applicability statement that identifies requirements and recommendations for protocols and tunnel encapsulations that are transported over an IPv6 transport flow (e.g. tunnel) that does not perform a UDP checksum calculation to verify the integrity at the transport endpoints. Before deciding to use the zero UDP checksum and lose the integrity verification provided, a protocol developer should seriously consider if they can use checksummed UDP packets or UDP-Lite [RFC3828], because IPv6 with a zero UDP checksum is not equivalent in behavior to IPv4 with zero UDP checksum.

The requirements and recommendations for protocols and tunnel encapsulations using an IPv6 transport flow that does not perform a UDP checksum calculation to verify the integrity at the transport endpoints are:

1. Transported protocols that enable the use of zero UDP checksum MUST only enable this for a specific port or port-range. This needs to be enabled at the sending and receiving endpoints for a UDP flow.

2. An integrity mechanism is always RECOMMENDED at the transported protocol layer to ensure that corruption rates of the delivered payload are not increased (e.g. the inner-most packet of a UDP tunnel). A mechanism that isolates the causes of corruption (e.g. identifying misdelivery, IPv6 header corruption, tunnel header corruption) is expected to also provide additional information about the status of the tunnel (e.g. to suggest a security attack).

3. A transported protocol that encapsulates Internet Protocol (IPv4 or IPv6) packets MAY rely on the inner packet integrity checks, provided that the tunnel protocol will not significantly increase the rate of corruption of the inner IP packet. If a significantly increased corruption rate can occur, then the tunnel protocol MUST provide an additional integrity verification mechanism. Early detection is desirable to avoid wasting unnecessary computation, transmission capacity or storage for packets that will subsequently be discarded.

4. A transported protocol that supports use of a zero UDP checksum MUST be designed so that corruption of this information does not result in accumulated state for the protocol.

5. A transported protocol with a non-tunnel payload or one that encapsulates non-IP packets MUST have a CRC or other mechanism for checking packet integrity, unless the non-IP packet is specifically designed for transmission over a lower layer that does not provide a packet integrity guarantee.

6. A transported protocol with control feedback SHOULD be robust to changes in the network path, since the set of middleboxes on a path may vary during the life of an association. The UDP endpoints need to discover paths with middleboxes that drop packets with a zero UDP checksum. Therefore, transported protocols SHOULD send keep-alive messages with a zero UDP checksum. An endpoint that discovers an appreciable loss rate for keep-alive packets MAY terminate the UDP flow (e.g. tunnel). Section 3.1.3 of RFC 5405 describes requirements for congestion control when using a UDP-based transport.

7. A protocol with control feedback that can fall back to using UDP with a calculated RFC 2460 checksum is expected to be more robust to changes in the network path. Therefore, keep-alive messages SHOULD include both UDP datagrams with a checksum and datagrams with a zero UDP checksum. This will enable the remote endpoint to distinguish between a path failure and dropping of datagrams with a zero UDP checksum.

8. A middlebox implementation MUST allow forwarding of an IPv6 UDP datagram with both a zero and standard UDP checksum using the same UDP port.

9. A middlebox MAY configure a restricted set of specific port ranges that forward UDP datagrams with a zero UDP checksum. The middlebox MAY drop IPv6 datagrams with a zero UDP checksum that are outside a configured range.

10. When a middlebox forwards an IPv6 UDP flow containing datagrams with both a zero and standard UDP checksum, the middlebox MUST NOT maintain separate state for flows depending on the value of their UDP checksum field. (This requirement is necessary to enable a sender that always calculates a checksum to communicate via a middlebox with a remote endpoint that uses a zero UDP checksum.)
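
Items 6 and 7 of the list above suggest sending keep-alives both with and without a calculated checksum. A minimal Python sketch of how an endpoint might interpret the results follows. The names, the counting of acknowledged keep-alives, and the loss threshold of 0.5 are assumptions of this sketch; this document does not define what constitutes an appreciable loss rate.

```python
from enum import Enum


class PathState(Enum):
    USABLE = "zero UDP checksum usable"
    ZERO_CHECKSUM_DROPPED = "path drops zero UDP checksum datagrams"
    PATH_FAILED = "path failure"


def classify_path(zero_sent: int, zero_acked: int,
                  csum_sent: int, csum_acked: int,
                  max_loss: float = 0.5) -> PathState:
    """Classify a path from keep-alive counts (max_loss is hypothetical)."""
    if zero_sent <= 0 or csum_sent <= 0:
        raise ValueError("both kinds of keep-alive must have been sent")
    zero_loss = 1.0 - zero_acked / zero_sent
    csum_loss = 1.0 - csum_acked / csum_sent
    if csum_loss > max_loss:
        # Even checksummed keep-alives are lost: treat as a path failure.
        return PathState.PATH_FAILED
    if zero_loss > max_loss:
        # Only the zero-checksum keep-alives are lost: a middlebox or the
        # remote endpoint is probably discarding them.
        return PathState.ZERO_CHECKSUM_DROPPED
    return PathState.USABLE
```

An endpoint that observes ZERO_CHECKSUM_DROPPED could fall back to a calculated checksum, while PATH_FAILED might lead it to terminate the flow.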

Special considerations are required when designing a UDP tunnel protocol, where the tunnel ingress or egress may be a router that may not have access to the packet payload. When the node is acting as a host (i.e., sending or receiving a packet addressed to itself), the checksum processing is similar to other hosts. However, when the node (e.g. a router) is acting as a tunnel ingress or egress that forwards a packet to or from a UDP tunnel, there may be restricted access to the packet payload. This prevents calculating (or verifying) a UDP checksum. In this case, the tunnel protocol may use a zero UDP checksum and must:

o Ensure that the tunnel ingress and tunnel egress routers are both configured to use a zero UDP checksum. For example, this may include ensuring that hardware checksum offloading is disabled.

o The tunnel operator must ensure that middleboxes on the network path are updated to support use of a zero UDP checksum.

o A tunnel egress should implement appropriate security techniques to protect from overload, including source address filtering to prevent traffic injection by an attacker, and rate-limiting of any packets that incur additional processing, such as UDP datagrams used for control functions that require verification of a calculated checksum to verify the network path. Usage of common control traffic for multiple tunnels between a pair of nodes can assist in reducing the number of packets to be processed.

6. Summary

This document provides an applicability statement for the use of UDP transport checksums with IPv6.

It examines the role of the UDP transport checksum when used with IPv6 and presents a summary of the trade-offs in evaluating the safety of updating RFC 2460 to permit an IPv6 endpoint to use a zero UDP checksum field to indicate that no checksum is present.

Application designers should first examine whether their transport goals may be met using standard UDP (with a calculated checksum) or by using UDP-Lite. The use of UDP with a zero UDP checksum has merits for some applications, such as tunnel encapsulation, and is widely used in IPv4. However, there are different dangers for IPv6: there is an increased risk of corruption and misdelivery when using zero UDP checksum in IPv6 compared to using IPv4 due to the lack of an IPv6 header checksum. Thus, applications need to evaluate the risks of enabling use of a zero UDP checksum and consider a solution that at least provides the same delivery protection as for IPv4, for example by utilizing UDP-Lite, or by enabling the UDP checksum. The use of checksum off-loading may help alleviate the cost of checksum processing and permit use of a checksum using the method defined in RFC 2460.

Tunnel applications using UDP for encapsulation can in many cases use a zero UDP checksum without significant impact on the corruption rate. A well-designed tunnel application should include consistency checks to validate the header information encapsulated with a received packet. In most cases, tunnels encapsulating IP packets can rely on the integrity protection provided by the transported protocol (or tunneled inner packet). When correctly implemented, such an endpoint will not be negatively impacted by omission of the transport-layer checksum. Recursive tunneling and fragmentation is a potential issue that can raise corruption rates significantly, and requires careful consideration.

Other UDP applications at the intended destination node or another node can be impacted if they are allowed to receive datagrams that have a zero UDP checksum. It is important that already deployed applications are not impacted by a change at the transport layer. If these applications execute on nodes that implement RFC 2460, they will discard (and log) all datagrams with a zero UDP checksum. This is not an issue.

In general, UDP-based applications need to employ a mechanism that allows a large percentage of the corrupted packets to be removed before they reach an application, both to protect the data stream of the application and the control plane of higher layer protocols. These checks are currently performed by the UDP checksum for IPv6, or the reduced checksum for UDP-Lite when used with IPv6.
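As an illustration of the check that a zero UDP checksum removes, the following is a minimal sketch of the RFC 2460 UDP checksum, computed with the RFC 1071 algorithm over the IPv6 pseudo-header followed by the UDP header and payload. The function names internet_checksum and udp6_checksum are chosen here for illustration only and are not taken from any specification or library.

```python
import ipaddress
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length input with a zero octet
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def udp6_checksum(src: str, dst: str, udp_segment: bytes) -> int:
    """Checksum over the IPv6 pseudo-header plus UDP header and payload.

    udp_segment must carry zero in its checksum field (octets 6-7).
    """
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        # upper-layer length, three zero octets, next header = 17 (UDP)
        + struct.pack("!I3xB", len(udp_segment), 17)
    )
    csum = internet_checksum(pseudo + udp_segment)
    return csum or 0xFFFF            # a computed zero is sent as all ones
```

A receiver can verify a datagram by running internet_checksum over the same pseudo-header and the segment with the checksum field filled in; an uncorrupted datagram yields zero.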

The transport of recursive tunneling and the use of fragmentation pose difficult issues that need to be considered in the design of tunnel protocols. There is an increased risk of an error in the inner-most packet when fragmentation occurs across several layers of tunneling and several different reassembly processes are run without verification of correctness. This requires extra thought and careful consideration in the design of transported tunnels.

Any use of the updated method must consider the implications on firewalls, NATs and other middleboxes. It is not expected that IPv6 NATs handle IPv6 UDP datagrams in the same way that they handle IPv4 UDP datagrams. In many deployed cases this will require an update to support an IPv6 zero UDP checksum. Firewalls are intended to be configured, and therefore may need to be explicitly updated to allow new services or protocols. IPv6 middlebox deployment is not yet as prolific as it is in IPv4, and therefore new devices are expected to follow the methods specified in this document.

Each application should consider the implications of choosing an IPv6 transport that uses a zero UDP checksum, and consider whether other standard methods may be more appropriate, and may simplify application design.


7. Acknowledgements

Thanks to Brian Haberman, Brian Carpenter, Margaret Wasserman, Lars Eggert, and others in the TSV directorate. Barry Leiba, Ronald Bonica, Pete Resnick, and Stewart Bryant are thanked for input that resulted in a document with much greater applicability. Thanks to P.F. Chimento for careful review and editorial corrections.

Thanks also to: Remi Denis-Courmont, Pekka Savola, Glen Turner, and many others who contributed comments and ideas via the 6man, behave, lisp and mboned lists.

8. IANA Considerations

This document does not require any actions by IANA.

9. Security Considerations

Transport checksums provide the first stage of protection for the stack, although they cannot be considered authentication mechanisms. These checks are also desirable to ensure packet counters correctly log actual activity, and can be used to detect unusual behaviours.

Depending on the hardware design, the processing requirements may differ for tunnels that have a zero UDP checksum and those that calculate a checksum. This processing overhead may need to be considered when deciding whether to enable a tunnel and to determine an acceptable rate for transmission. This can become a security risk for designs that can handle a significantly larger number of packets with zero UDP checksums compared to datagrams with a non-zero checksum, such as tunnel egress. An attacker could attempt to inject non-zero checksummed UDP packets into a tunnel forwarding zero checksum UDP packets and cause overload in the processing of the non-zero checksums, e.g. if this happens in a router's slow path. Protection mechanisms should therefore be employed when this threat exists. Protection may include source address filtering to prevent an attacker injecting traffic, as well as throttling the amount of non-zero checksum traffic. The latter may impact the function of the tunnel protocol.

Transmission of IPv6 packets with a zero UDP checksum could reveal additional information to an on-path attacker to identify the operating system or configuration of a sending node. There is a need to probe the network path to determine whether the current path supports using IPv6 packets with a zero UDP checksum. The details of the probing mechanism may differ for different tunnel encapsulations and if visible in the network (e.g. if not using IPsec in encryption mode) could reveal additional information to an on-path attacker to identify the type of tunnel being used.

IP-in-IP or GRE tunnels offer good traversal of middleboxes that have not been designed for security, e.g. firewalls. However, firewalls may be expected to be configured to block general tunnels as they present a large attack surface. This applicability statement therefore permits this method to be enabled only for specific ranges of ports.

When the zero UDP checksum mode is enabled for a range of ports, nodes and middleboxes must forward received UDP datagrams that have either a calculated checksum or a zero checksum.
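The per-port rule above can be sketched as a simple acceptance check. This is an illustrative sketch only: the function accept_udp6_datagram, its parameters, and the port range used in the usage note are hypothetical and are not defined by this document.

```python
def accept_udp6_datagram(dst_port, checksum_field, checksum_valid,
                         zero_checksum_ports):
    """Return True if the datagram may be delivered or forwarded.

    zero_checksum_ports is a hypothetical per-node configuration: the
    destination ports for which the zero-checksum mode has been enabled.
    checksum_valid is the result of verifying a non-zero checksum.
    """
    if checksum_field == 0:
        # Zero checksum: acceptable only on explicitly enabled ports;
        # on any other port the RFC 2460 behaviour (discard) applies.
        return dst_port in zero_checksum_ports
    # Non-zero checksum: still verified, on enabled and other ports alike.
    return checksum_valid
```

For example, with zero_checksum_ports set to range(50000, 50010), a datagram to port 50001 is accepted with either a zero checksum or a correct calculated checksum, while a zero-checksum datagram to port 53 is discarded.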

10. References

10.1. Normative References

[I-D.ietf-6man-udpchecksums] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and UDP Checksums for Tunneled Packets", draft-ietf-6man-udpchecksums-08 (work in progress), February 2013.

[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, August 1980.

[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

10.2. Informative References

[I-D.ietf-intarea-tunnels] Touch, J. and M. Townsley, "Tunnels in the Internet Architecture", draft-ietf-intarea-tunnels-00 (work in progress), March 2010.

[I-D.ietf-mboned-auto-multicast] Bumgardner, G., "Automatic Multicast Tunneling", draft-ietf-mboned-auto-multicast-14 (work in progress), June 2012.

[LISP] D. Farinacci et al, "Locator/ID Separation Protocol (LISP)", November 2012.

[RFC1071] Braden, R., Borman, D., Partridge, C., and W. Plummer, "Computing the Internet checksum", RFC 1071, September 1988.

[RFC1141] Mallory, T. and A. Kullberg, "Incremental updating of the Internet checksum", RFC 1141, January 1990.

[RFC1624] Rijsinghani, A., "Computation of the Internet Checksum via Incremental Update", RFC 1624, May 1994.

[RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, RFC 2827, May 2000.

[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D., Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. Wood, "Advice for Internet Subnetwork Designers", BCP 89, RFC 3819, July 2004.

[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., and G. Fairhurst, "The Lightweight User Datagram Protocol (UDP-Lite)", RFC 3828, July 2004.

[RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", RFC 4443, March 2006.

[RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly Errors at High Data Rates", RFC 4963, July 2007.

[RFC5097] Renker, G. and G. Fairhurst, "MIB for the UDP-Lite protocol", RFC 5097, January 2008.

[RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines for Application Designers", BCP 145, RFC 5405, November 2008.

[RFC5415] Calhoun, P., Montemurro, M., and D. Stanley, "Control And Provisioning of Wireless Access Points (CAPWAP) Protocol Specification", RFC 5415, March 2009.

[RFC5722] Krishnan, S., "Handling of Overlapping IPv6 Fragments", RFC 5722, December 2009.

[RFC6437] Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme, "IPv6 Flow Label Specification", RFC 6437, November 2011.

[RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label for Equal Cost Multipath Routing and Link Aggregation in Tunnels", RFC 6438, November 2011.

[Sigcomm2000] Jonathan Stone and Craig Partridge, "When the CRC and TCP Checksum Disagree", 2000.

[UDPTT] G Fairhurst, "The UDP Tunnel Transport mode", Feb 2010.

Appendix A. Evaluation of proposal to update RFC 2460 to support zero checksum

This informative appendix documents the evaluation of the proposal to update IPv6 [RFC2460], to provide the option that some nodes may suppress generation and checking of the UDP transport checksum. It also compares the proposal with other alternatives, and notes that for a particular application some standard methods may be more appropriate than using IPv6 with a zero UDP checksum.

A.1. Alternatives to the Standard Checksum

There are several alternatives to the normal method for calculating the UDP Checksum [RFC1071] that do not require a tunnel endpoint to inspect the entire packet when computing a checksum. These include (in decreasing order of complexity):

o Delta computation of the checksum from an encapsulated checksum field. Since the checksum is a cumulative sum [RFC1624], an encapsulating header checksum can be derived from the new pseudo header, the inner checksum and the sum of the other network-layer fields not included in the pseudo header of the encapsulated packet, in a manner resembling incremental checksum update [RFC1141]. This would not require access to the whole packet, but does require fields to be collected across the header, and arithmetic operations on each packet. The method would only work for packets that contain a one’s complement transport checksum (i.e., it would not be appropriate for SCTP or when IP fragmentation is used).


o UDP-Lite with the checksum coverage set to only the header portion of a packet. This requires a pseudo header checksum calculation only on the encapsulating packet header. The computed checksum value may be cached (before adding the Length field) for each flow/destination and subsequently combined with the Length of each packet to minimise per-packet processing. This value is combined with the UDP payload length for the pseudo header; however, this length is expected to be known when performing packet forwarding.

o The proposed UDP Tunnel Transport [UDPTT] suggested a method where UDP would be modified to derive the checksum only from the encapsulating packet protocol header. This value does not change between packets in a single flow. The value may be cached per flow/destination to minimise per-packet processing.

o There has been a proposal to simply ignore the UDP checksum value on reception at the tunnel egress, allowing a tunnel ingress to insert any value, correct or false. For tunnel usage, a non-standard checksum value may be used, forcing an RFC 2460 receiver to drop the packet. The main downside is that it would be impossible to identify a UDP datagram (in the network or an endpoint) that is treated in this way compared to a packet that has actually been corrupted.

o A method has been proposed that uses a new (to be defined) IPv6 Destination Options Header to provide an end-to-end validation check at the network layer. This would allow an endpoint to verify delivery to an appropriate end point, but would also require IPv6 nodes to correctly handle the additional header, and would require changes to middlebox behavior (e.g. when used with a NAT that always adjusts the checksum value).

o UDP modified to disable checksum processing [I-D.ietf-6man-udpchecksums]. This eliminates the need for a checksum calculation, but would require constraints on appropriate usage and updates to end-points and middleboxes.

o IP-in-IP tunneling. As this method completely dispenses with a transport protocol in the outer-layer it has reduced overhead and complexity, but also reduced functionality. There is no outer checksum over the packet and also no ports to perform demultiplexing between different tunnel types. This reduces the information available upon which a load balancer may act.
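The arithmetic that underlies the delta computation in the first item of this list is the incremental checksum update of [RFC1624]. The following minimal sketch shows only that building block: how a checksum is adjusted when one 16-bit word changes, without re-reading the rest of the data. It is not a complete derivation of an outer checksum from an inner one. The function names are chosen here for illustration.

```python
def fold16(total):
    """Fold carries above bit 15 back into the low 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def checksum(words):
    """Full computation: one's complement of the one's complement sum."""
    return ~fold16(sum(words)) & 0xFFFF

def incremental_update(old_checksum, old_word, new_word):
    """RFC 1624, equation 3: HC' = ~(~HC + ~m + m')."""
    total = (~old_checksum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~fold16(total) & 0xFFFF
```

With the values from the worked example in RFC 1624 (HC = 0xDD2F, m = 0x5555, m' = 0x3285), incremental_update returns 0x0000, the same result as recomputing the checksum over all of the data.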

These options are compared and discussed further in the following sections.


A.2. Comparison

This section compares the above listed methods to support datagram tunneling. It includes proposals for updating the behaviour of UDP.

While this comparison focuses on applications that are expected to execute on routers, the distinction between a router and a host is not always clear, especially at the transport level. Systems (such as unix-based operating systems) routinely provide both functions. There is no way to identify the role of the receiving node from a received packet.

A.2.1. Middlebox Traversal

Regular UDP with a standard checksum or the delta encoded optimization for creating correct checksums have the best possibilities for successful traversal of a middlebox. No new support is required.

A method that ignores the UDP checksum on reception is expected to have a good probability of traversal, because most middleboxes perform an incremental checksum update. UDPTT would also have been able to traverse a middlebox with this behaviour. However, a middlebox on the path that attempts to verify a standard checksum will not forward packets using either of these methods, preventing traversal. A method that ignores the checksum has an additional downside in that it prevents improvement of middlebox traversal, because there is no way to identify UDP datagrams that use the modified checksum behaviour.

IP-in-IP or GRE tunnels offer good traversal of middleboxes that have not been designed for security, e.g. firewalls. However, firewalls may be expected to be configured to block general tunnels as they present a large attack surface.

A new IPv6 Destination Options header will suffer traversal issues with middleboxes, especially Firewalls and NATs, and will likely require them to be updated before the extension header is passed.

Datagrams with a zero UDP checksum will not be passed by any middlebox that validates the checksum using RFC 2460 or updates the checksum field, such as NAT or firewalls. This would require an update to correctly handle a datagram with a zero UDP checksum.

UDP-Lite will require an update of almost all types of middleboxes, because it requires support for a separate network-layer protocol number. Once enabled, the method to support incremental checksum update would be identical to that for UDP, but different for checksum validation.

A.2.2. Load Balancing

The usefulness of solutions for load balancers depends on the difference in entropy in the headers for different flows that can be included in a hash function. All the proposals that use the UDP protocol number have equal behavior. UDP-Lite has the potential for equally good behavior as for UDP. However, UDP-Lite is currently unlikely to be supported by deployed hashing mechanisms, which could cause a load balancer to not use the transport header in the computed hash. A load balancer that only uses the IP header will have low entropy, but could be improved by including the IPv6 flow label, providing that the tunnel ingress ensures that different flow labels are assigned to different flows. However, a transition to the common use of good quality flow labels is likely to take time to deploy.
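The effect of header entropy on load balancing can be sketched as follows. This is an illustration under stated assumptions: the function flow_hash is hypothetical, and SHA-256 is used only as a convenient stand-in for whatever hash function a real load balancer applies.

```python
import hashlib
import ipaddress

def flow_hash(src, dst, flow_label=0, ports=None, buckets=4):
    """Pick an output bucket from the header fields available for hashing."""
    h = hashlib.sha256()
    h.update(ipaddress.IPv6Address(src).packed)
    h.update(ipaddress.IPv6Address(dst).packed)
    h.update(flow_label.to_bytes(3, "big"))   # 20-bit IPv6 flow label
    if ports is not None:
        # Transport ports, used only when the balancer parses the header
        h.update(ports[0].to_bytes(2, "big") + ports[1].to_bytes(2, "big"))
    return int.from_bytes(h.digest()[:4], "big") % buckets
```

When only the two tunnel endpoint addresses are hashed, every flow in the tunnel maps to the same bucket. Supplying a distinct flow label per flow (or the transport ports, where the balancer can parse them) gives the hash different input for each flow and allows the traffic to be spread.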

A.2.3. Ingress and Egress Performance Implications

IP-in-IP tunnels are often considered efficient, because they introduce very little processing and low data overhead. The other proposals introduce a UDP-like header incurring associated data overhead. Processing is minimised for the method that uses a zero UDP checksum and for the method that ignores the UDP checksum on reception, and is only slightly higher for UDPTT, the extension header and UDP-Lite. The delta-calculation scheme operates on a few more fields, but also introduces serious failure modes that can result in a need to calculate a checksum over the complete datagram. Regular UDP is clearly the most costly to process, always requiring checksum calculation over the entire datagram.

It is important to note that the zero UDP checksum method, ignoring checksum on reception, the Option Header, UDPTT and UDP-Lite will likely incur additional complexities in the application to incorporate a negotiation and validation mechanism.

A.2.4. Deployability

The major factors influencing deployability of these solutions are a need to update both end-points, a need for negotiation and the need to update middleboxes. These are summarised below:

o The solution with the best deployability is regular UDP. This requires no changes and has good middlebox traversal characteristics.

o The next easiest to deploy is the delta checksum solution. This does not modify the protocol on the wire and only needs changes in tunnel ingress.

o IP-in-IP tunnels should not require changes to the end-points, but raise issues when traversing firewalls and other security devices, which are expected to require updates.

o Ignoring the checksum on reception will require changes at both end-points. The never ceasing risk of path failure requires additional checks to ensure this solution is robust and will require changes or additions to the tunnel control protocol to negotiate support and validate the path.

o The remaining solutions (including the zero checksum method) offer similar deployability. UDP-Lite requires support at both end-points and in middleboxes. UDPTT and the zero UDP checksum method with or without an extension header require support at both end-points and in middleboxes. UDP-Lite, UDPTT, and the zero UDP checksum method and use of extension headers may additionally require changes or additions to the tunnel control protocol to negotiate support and path validation.

A.2.5. Corruption Detection Strength

The standard UDP checksum and the delta checksum can both provide some verification at the tunnel egress. This can significantly reduce the probability that a corrupted inner packet is forwarded. UDP-Lite, UDPTT and the extension header all provide some verification against corruption, but do not verify the inner packet. They only provide a strong indication that the delivered packet was intended for the tunnel egress and was correctly delimited.

The methods using a zero UDP checksum, ignoring the UDP checksum on reception and IP-in-IP encapsulation all provide no verification that a received datagram was intended to be processed by a specific tunnel egress or that the inner encapsulated packet was correct. Section 3.1 discusses experience using specific protocols in well-managed networks.

A.2.6. Comparison Summary

The comparisons above may be summarised as "there is no silver bullet that will slay all the issues". One has to select which downside(s) can best be lived with. Focusing on the existing solutions, this can be summarised as:


Regular UDP: The method defined in RFC 2460 has good middlebox traversal and load balancing and multiplexing, requiring a checksum in the outer headers covering the whole packet.

IP in IP: A low complexity encapsulation, with limited middlebox traversal, no multiplexing support, and currently poor load balancing support that could improve over time.

UDP-Lite: A medium complexity encapsulation, with good multiplexing support, limited middlebox traversal, but possible to improve over time, currently poor load balancing support that could improve over time, in most cases requiring application level negotiation to select the protocol and validation to confirm the path forwards UDP-Lite.

The delta-checksum is an optimization in the processing of UDP, as such it exhibits some of the drawbacks of using regular UDP.

The remaining proposals may be described in similar terms:

Zero-Checksum: A low complexity encapsulation, with good multiplexing support, limited middlebox traversal that could improve over time, good load balancing support, in most cases requiring application level negotiation and validation to confirm the path forwards a zero UDP checksum.

UDPTT: A medium complexity encapsulation, with good multiplexing support, limited middlebox traversal, but possible to improve over time, good load balancing support, in most cases requiring application level negotiation to select the transport and validation to confirm the path forwards UDPTT datagrams.

IPv6 Destination Option IP in IP tunneling: A medium complexity encapsulation, with no multiplexing support, limited middlebox traversal, currently poor load balancing support that could improve over time, in most cases requiring negotiation to confirm the option is supported and validation to confirm the path forwards the option.

IPv6 Destination Option combined with UDP Zero-checksumming: A medium complexity encapsulation, with good multiplexing support, limited load balancing support that could improve over time, in most cases requiring negotiation to confirm the option is supported and validation to confirm the path forwards the option.

Ignore the checksum on reception: A low complexity encapsulation, with good multiplexing support, medium middlebox traversal that never can improve, good load balancing support, in most cases requiring negotiation to confirm the option is supported by the remote endpoint and validation to confirm the path forwards a zero UDP checksum.

There is no clear single optimum solution. If the most important need is to traverse middleboxes, then the best choice is to stay with regular UDP and consider the optimizations that may be required to perform the checksumming. If one can live with limited middlebox traversal, low complexity is necessary and one does not require load balancing, then IP-in-IP tunneling is the simplest. If one wants strengthened error detection, but with currently limited middlebox traversal and load balancing, UDP-Lite is appropriate. Zero UDP checksum addresses another set of constraints, low complexity and a need for load balancing from the current Internet, providing it can live with currently limited middlebox traversal.

Techniques for load balancing and middlebox traversal do continue to evolve. Over a long time, developments in load balancing have good potential to improve. This time horizon is long since it requires both load balancer and end-point updates to get full benefit. The challenges of middlebox traversal are also expected to change with time, as device capabilities evolve. Middleboxes are very prolific with a larger proportion of end-user ownership, and therefore may be expected to take long time cycles to evolve.

One potential advantage is that the deployment of IPv6-capable middleboxes is still in its initial phase and the quicker a new method becomes standardized, the fewer boxes will be non-compliant.

Thus, the question of whether to permit use of datagrams with a zero UDP checksum for IPv6 under reasonable constraints is best viewed as a trade-off between a number of more subjective questions:

o Is there sufficient interest in using a zero UDP checksum with the given constraints (summarised below)?

o Are there other avenues of change that will resolve the issue in a better way and sufficiently quickly?

o Do we accept the complexity cost of having one more solution in the future?

The analysis concludes that the IETF should carefully consider constraints on sanctioning the use of any new transport mode. The 6man working group of the IETF has determined that the answers to the above questions are sufficient to update IPv6 to standardise use of a zero UDP checksum for use by tunnel encapsulations for specific applications.


Each application should consider the implications of choosing an IPv6 transport that uses a zero UDP checksum. In many cases, standard methods may be more appropriate, and may simplify application design. The use of checksum off-loading may help alleviate the checksum processing cost and permit use of a checksum using the method defined in RFC 2460.

Appendix B. Document Change History

{RFC EDITOR NOTE: This section must be deleted prior to publication}

Individual Draft 00 This is the first DRAFT of this document - It contains a compilation of various discussions and contributions from a variety of IETF WGs, including: mboned, tsv, 6man, lisp, and behave. This includes contributions from Magnus with text on RTP, and various updates.

Individual Draft 01

* This version corrects some typos and editorial NiTs and adds discussion of the need to negotiate and verify operation of a new mechanism (3.3.4).

Individual Draft 02

* Version -02 corrects some typos and editorial NiTs.

* Added reference to ECMP for tunnels.

* Clarifies the recommendations at the end of the document.

Working Group Draft 00

* Working Group Version -00 corrects some typos and removes much of rationale for UDPTT. It also adds some discussion of IPv6 extension header.

Working Group Draft 01

* Working Group Version -01 updates the rules and incorporates off-list feedback. This version is intended for wider review within the 6man working group.


Working Group Draft 02

* This version is the result of a major rewrite and re-ordering of the document.

* A new section comparing the results has been added.

* The constraints list has been significantly altered by removing some and rewording other constraints.

* This contains other significant language updates to clarify the intent of this draft.

Working Group Draft 03

* Editorial updates

Working Group Draft 04

* Resubmission only updating the AMT and RFC2765 references.

Working Group Draft 05

* Resubmission to correct editorial NiTs - thanks to Bill Atwood for noting these.

Working Group Draft 06

* Resubmission to keep draft alive (spelling updated from 05).

Working Group Draft 07

* Interim Version

* Submission after IESG Feedback Added

* Updates to enable the document to become a PS Applicability Statement

Working Group Draft 08

* First Version written as a PS Applicability Statement

* Changes to reflect decision to update RFC 2460, rather than recommend decision

* Updates to requirements for middleboxes


* Inclusion of requirements for security, API, and tunnel

* Move of the rationale for the update to an Annex (former section 4)

Working Group Draft 09

* Submission after second WGLC (note mistake corrected in -09).

* Clarified role of API for supporting full checksum.

* Clarified that full checksum is required in security considerations, and therefore noting that full checksum should not be treated as an attack - consistent with remainder of document.

* Added mention that API can set a mode in transport stack - to link to similar statement in RFC 2460 update.

* Fixed typos.

Working Group Draft 10

* Submission to correct unwanted removal of text from section 5 bullets 5-7 by GF.

* Replaced section 5 text with the text from 08, and reapplied the editorial correction.

* Note to reviewers: Please compare this revision with -08 (used in the IETF LC).

Working Group Draft 11

* Added REF for 5097 (Noted by S.Turner)

* Added text in response to P. Resnick on place where checksum is calculated.

* Added text to note experience with MPLS/PWE; Appendix updated to refer to this (S. Bryant)

* Added text in response to P.Resnick’s 2nd comments.

* Request to make UDP-Lite more clearly recommended (J Touch, P.Resnick)


* Added considerations around usage of zero checksum in routers.

* Added text in response to Stewart Bryant’s comments on router requirements.

Authors’ Addresses

Godred Fairhurst University of Aberdeen School of Engineering Aberdeen, AB24 3UE Scotland, UK

Email: [email protected] URI: http://www.erg.abdn.ac.uk/users/gorry

Magnus Westerlund Ericsson Farogatan 6 Stockholm, SE-164 80 Sweden

Phone: +46 8 719 0000 Email: [email protected]


6MAN B. Carpenter Internet-Draft Univ. of Auckland Updates: 3986 (if approved) S. Cheshire Intended status: Standards Track Apple Inc. Expires: June 10, 2013 R. Hinden Check Point December 7, 2012

Representing IPv6 Zone Identifiers in Address Literals and Uniform Resource Identifiers draft-ietf-6man-uri-zoneid-06

Abstract

This document describes how the Zone Identifier of an IPv6 scoped address, as defined in RFC 4007, can be represented in a literal IPv6 address and in a Uniform Resource Identifier that includes such a literal address. It updates RFC 3986 accordingly.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on June 10, 2013.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must

Carpenter, et al. Expires June 10, 2013 [Page 1] Internet-Draft IPv6 Zone ID in URI December 2012

include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction
2. Specification
3. Web Browsers
4. Security Considerations
5. IANA Considerations
6. Acknowledgements
7. Change log [RFC Editor: Please remove]
8. References
   8.1. Normative References
   8.2. Informative References
Appendix A. Options Considered
Authors’ Addresses


1. Introduction

The Uniform Resource Identifier (URI) syntax [RFC3986] defined how a literal IPv6 address can be represented in the "host" part of a URI. A subsequent specification [RFC4007] extended the text representation of limited-scope IPv6 addresses such that a zone identifier may be concatenated to a literal address, for purposes described in that RFC. Zone identifiers are especially useful in contexts where literal addresses are typically used, for example during fault diagnosis, when it may be essential to specify which interface is used for sending to a link local address. It should be noted that zone identifiers have purely local meaning within the node where they are defined, often being the same as IPv6 interface names. They are completely meaningless for any other node. Today, they are only meaningful when attached to addresses with less than global scope, but it is possible that other uses might be defined in the future.

RFC 4007 does not specify how zone identifiers are to be represented in URIs. Practical experience has shown that this feature is useful, in particular when using a web browser for debugging with link local addresses, but as it is undefined, it is not implemented consistently in URI parsers or in browsers.

Some versions of some browsers accept the RFC 4007 syntax for scoped IPv6 addresses embedded in URIs, i.e., they have been coded to interpret the "%" sign according to RFC 4007 instead of RFC 3986. Clearly this approach is very convenient for users, although it formally breaches the syntax rules of RFC 3986. The present document defines an alternative approach that respects and extends the rules of URI syntax, and IPv6 literals in general, to be consistent.

Thus, this document updates [RFC3986] by adding syntax to allow a zone identifier to be included in a literal IPv6 address within a URI.

It should be noted that in contexts other than a user interface, a zone identifier is mapped into a numeric zone index or interface number. The MIB textual convention InetZoneIndex [RFC4001] and the socket interface [RFC3493] define this as a 32-bit unsigned integer. The mapping between the human-readable zone identifier string and the numeric value is a host-specific function that varies between operating systems. The present document is concerned only with the human-readable string.
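As an illustration of the mapping described above, the following Python sketch resolves a zone identifier string to a numeric zone index. It assumes a host on which zone identifiers are either decimal strings or interface names; the helper name zone_to_index is hypothetical, and socket.if_nametoindex is the standard-library counterpart of the RFC 3493 function of the same name.

```python
import socket


def zone_to_index(zone_id: str) -> int:
    """Map a human-readable zone identifier to a numeric zone index.

    Assumption: a purely decimal identifier already is the index;
    anything else is treated as an interface name on this host.
    """
    if zone_id.isascii() and zone_id.isdigit():
        index = int(zone_id)
    else:
        # Raises OSError if this host has no interface with that name.
        index = socket.if_nametoindex(zone_id)
    # RFC 3493 and RFC 4001 define the index as a 32-bit unsigned integer.
    if not 0 <= index <= 0xFFFFFFFF:
        raise ValueError("zone index out of range")
    return index
```

Because the mapping is host-specific, the same string can yield different indexes, or none at all, on different machines.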

Several alternative solutions were considered while this document was developed. The Appendix briefly describes the various options and their advantages and disadvantages.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Specification

According to RFC 4007, a zone identifier is attached to the textual representation of an IPv6 address by concatenating "%" followed by <zone_id>, where <zone_id> is a string identifying the zone of the address. However, RFC 4007 gives no precise definition of the character set allowed in <zone_id>. There are no rules or de facto standards for this. For example, the first Ethernet interface in a host might be called %0, %1, %en1, %eth0, or whatever the implementer happened to choose.

In a URI, a literal IPv6 address is always embedded between "[" and "]". This document specifies how a <zone_id> can be appended to the address. Unfortunately "%" is always treated as an escape character in a URI, and according to RFC 3986 it MUST therefore itself be percent-encoded in a URI, in the form "%25". Thus, the scoped address fe80::a%en1 would appear in a URI as http://[fe80::a%25en1].

A <zone_id> SHOULD contain only ASCII characters classified in RFC 3986 as "unreserved". This excludes characters such as "]" or even "%" which would complicate parsing. However, the syntax below does allow such characters to be percent-encoded, for compatibility with existing devices that use them.

If an operating system uses any other characters in zone or interface identifiers that are not in the "unreserved" character set, they MUST be escaped with a "%" sign according to RFC 3986.
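The encoding rules above can be sketched in Python as follows. This is a minimal illustration, not a complete URI generator: the function name scoped_literal_to_uri_host is hypothetical, and the sketch does not validate the address itself. It relies on urllib.parse.quote, which leaves the RFC 3986 unreserved characters alone and percent-encodes the rest.

```python
from urllib.parse import quote


def scoped_literal_to_uri_host(address: str) -> str:
    """Turn a scoped literal such as 'fe80::a%en1' into a URI host."""
    addr, sep, zone = address.partition("%")
    if not sep:
        # No zone identifier: only the brackets are added.
        return f"[{addr}]"
    if not zone:
        raise ValueError("empty zone identifier")
    # The separator "%" becomes "%25"; safe="" makes quote() also
    # percent-encode "/" in addition to the other reserved characters.
    return f"[{addr}%25{quote(zone, safe='')}]"


print("http://" + scoped_literal_to_uri_host("fe80::a%en1"))
```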

We now present the necessary formal syntax.

In RFC 3986, the IPv6 literal format is formally defined in ABNF [RFC5234] by the following rule:

IP-literal = "[" ( IPv6address / IPvFuture ) "]"

To provide support for a zone identifier, the existing syntax of IPv6address is retained, and a zone identifier may be added optionally to any literal address. This allows flexibility for unknown future uses. The rule quoted above from RFC 3986 is replaced by three rules:

IP-literal = "[" ( IPv6address / IPv6addrz / IPvFuture ) "]"

ZoneID = 1*( unreserved / pct-encoded )

IPv6addrz = IPv6address "%25" ZoneID

This syntax fills the gap that is described at the end of Section 11.7 of RFC 4007.
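A parser for the three rules might look like the Python sketch below. It is an approximation under stated assumptions: the ZoneID pattern follows the ABNF above, but the IPv6address rule is not reproduced and is instead delegated to the standard ipaddress module, and IPvFuture is not handled. The names ZONE_ID, IP_LITERAL and parse_ip_literal are hypothetical.

```python
import ipaddress
import re
from urllib.parse import unquote

# ZoneID = 1*( unreserved / pct-encoded )
ZONE_ID = r"(?:[A-Za-z0-9._~-]|%[0-9A-Fa-f]{2})+"

# "[" IPv6address [ "%25" ZoneID ] "]"; the address is checked separately.
IP_LITERAL = re.compile(
    r"\[(?P<addr>[0-9A-Fa-f:.]+)(?:%25(?P<zone>" + ZONE_ID + r"))?\]"
)


def parse_ip_literal(host: str):
    """Return (IPv6Address, decoded ZoneID or None) for a URI host."""
    match = IP_LITERAL.fullmatch(host)
    if match is None:
        raise ValueError(f"not a valid IP-literal: {host!r}")
    # IPv6Address raises a ValueError subclass for a malformed address.
    addr = ipaddress.IPv6Address(match.group("addr"))
    zone = match.group("zone")
    return addr, (unquote(zone) if zone is not None else None)
```

A bare "%" such as in "[fe80::a%en1]" is rejected here, since only the "%25" form is part of the syntax.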

The rules in [RFC5952] SHOULD be applied in producing URIs.

RFC 3986 states that URIs have a global scope, but that in some cases their interpretation depends on the end-user’s context. URIs including a ZoneID are to be interpreted only in the context of the host where they originate, since the ZoneID is of local significance only.

RFC 4007 offers guidance on how the ZoneID affects interface/address selection inside the IPv6 stack. Note that the behaviour of an IPv6 stack if passed a non-null zone index for an address other than link-local is undefined.

3. Web Browsers

This section discusses how web browsers might handle this syntax extension. Unfortunately there is no formal distinction between the syntax allowed in a browser’s input dialogue box and the syntax allowed in URIs. For this reason, no normative statements are made in this section.

Due to the lack of defined syntax, web browsers have been inconsistent in providing for ZoneIDs. Many have no support, but there are examples of ad hoc support. For example, some versions of Firefox allowed the use of a ZoneID preceded by an unescaped "%" character, but this was removed for consistency with RFC 3986. As another example, some versions of Internet Explorer allow use of a ZoneID preceded by a "%" character escaped as "%25", still beyond the syntax allowed by RFC 3986. This syntax extension is in fact used internally in the Windows operating system and some of its APIs.

It is desirable for all browsers to recognise a ZoneID preceded by an escaped "%". In the spirit of "be liberal with what you accept", we also suggest that URI parsers accept bare "%" signs when possible (i.e., a "%" not followed by two valid and meaningful hexadecimal characters). This would make it possible for a user to copy and paste a string such as "fe80::a%en1" from the output of a "ping" command and have it work. On the other hand, "%ee1" would need to be manually escaped as "fe80::a%25ee1" to avoid any risk of misinterpretation.

Such bare "%" signs are for user interface convenience, and need to be turned into properly escaped characters (where "%25" encodes "%") before the URI is used in any protocol or HTML document. However, URIs including a ZoneID have no meaning outside the originating node. It would therefore be highly desirable for a browser to remove the ZoneID from a URI before including that URI in an HTTP request.
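The handling of bare "%" signs suggested in this section could be sketched as follows in Python. The function name escape_bare_percent is hypothetical, and the sketch simplifies "valid and meaningful" to "two hexadecimal digits": it does not try to judge whether the resulting octet is meaningful.

```python
import re

_TWO_HEX = re.compile(r"[0-9A-Fa-f]{2}")


def escape_bare_percent(typed: str) -> str:
    """Rewrite a typed 'addr%zone' as 'addr%25zone' when unambiguous."""
    addr, sep, zone = typed.partition("%")
    if not sep:
        return typed  # no zone identifier present
    if _TWO_HEX.match(zone):
        # "%ee1" or "%25en1" already reads as a percent-encoded octet,
        # so it is left alone; such zones must be escaped by the user.
        return typed
    return f"{addr}%25{zone}"


print(escape_bare_percent("fe80::a%en1"))
```

With this approach "fe80::a%en1" is escaped automatically, while "fe80::a%ee1" is passed through unchanged and would be interpreted as a percent-encoded octet followed by "1".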

The normal diagnostic usage for the ZoneID syntax will cause it to be entered in the browser’s input dialogue box. Thus, URIs including a ZoneID are unlikely to be encountered in HTML documents. However, if they are (for example, in a diagnostic script coded in HTML), it would be appropriate to treat them exactly as above.

4. Security Considerations

The security considerations of [RFC3986] and [RFC4007] apply. In particular, this URI format creates a specific pathway by which a deceitful zone index might be communicated, as mentioned in the final security consideration of RFC 4007. It is emphasised that the format is intended only for debugging purposes, but of course this intention does not prevent misuse.

To limit this risk, implementations MUST NOT allow use of this format except for well-defined usages such as sending to link local addresses under prefix fe80::/10. At the time of writing, this is the only well-defined usage known.

An HTTP client, proxy or other intermediary MUST remove any ZoneID attached to an outgoing URI, as it only has local significance at the sending host.
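The two requirements above can be illustrated with a short Python sketch. It assumes the host has already been isolated from the URI as a bracketed IP-literal; the function name host_for_request is hypothetical, and the link-local test uses the is_link_local property of the standard ipaddress module, which for IPv6 covers fe80::/10.

```python
import ipaddress


def host_for_request(host: str) -> str:
    """Return the host to place in an outgoing URI, with any ZoneID removed."""
    if not (host.startswith("[") and host.endswith("]")):
        raise ValueError("not an IP-literal")
    addr_text, sep, zone = host[1:-1].partition("%25")
    if "%" in addr_text or (sep and not zone):
        raise ValueError("malformed ZoneID")
    addr = ipaddress.IPv6Address(addr_text)
    # A ZoneID is accepted only together with a link-local address.
    if sep and not addr.is_link_local:
        raise ValueError("ZoneID accepted only with fe80::/10 addresses")
    # The ZoneID has local significance only, so it is never sent.
    return f"[{addr}]"
```

The ZoneID itself would still be needed locally, for example to select the outgoing interface, before it is dropped from the URI.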

5. IANA Considerations

This document requests no action by IANA.

6. Acknowledgements

The lack of this format was first pointed out by Margaret Wasserman some years ago, and more recently by Kerry Lynn. A previous draft document by Martin Duerst and Bill Fenner [I-D.fenner-literal-zone] discussed this topic but was not finalised.

Valuable comments and contributions were made by Karl Auer, Carsten Bormann, Benoit Claise, Stephen Farrell, Brian Haberman, Ted Hardie, Tatuya Jinmei, Yves Lafon, Barry Leiba, Radia Perlman, Tom Petch, Tomoyuki Sahara, Juergen Schoenwaelder, Dave Thaler, Martin Thomson, and Ole Troan.

Brian Carpenter was a visitor at the Computer Laboratory, Cambridge University during part of this work.

This document was produced using the xml2rfc tool [RFC2629].

7. Change log [RFC Editor: Please remove]

draft-ietf-6man-uri-zoneid-06: responding to IETF Last Call and IESG comments, 2012-12-07.

draft-ietf-6man-uri-zoneid-05: tuned ABNF, clarified RFC 4007 text, 2012-11-06.

draft-ietf-6man-uri-zoneid-04: additional author, 2012-09-21.

draft-ietf-6man-uri-zoneid-03: reverted to percent-encoded model following WGLC, 2012-09-10.

draft-ietf-6man-uri-zoneid-02: additional WG comments, 2012-07-11.

draft-ietf-6man-uri-zoneid-01: use "-" instead of %25, listed alternatives in Appendix, according to WG debate, added suggestion for browser developers, 2012-05-29.

draft-ietf-6man-uri-zoneid-00: adopted by WG, fixed syntax to allow for % encoded characters, 2012-02-17.

draft-carpenter-6man-uri-zoneid-01: chose Option 2, removed 15 character limit, added explanation of ID/number mapping and other clarifications, 2012-02-08.

draft-carpenter-6man-uri-zoneid-00: original version, 2011-12-07.

8. References

8.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, January 2005.

[RFC4007] Deering, S., Haberman, B., Jinmei, T., Nordmark, E., and B. Zill, "IPv6 Scoped Address Architecture", RFC 4007, March 2005.

[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.

[RFC5952] Kawamura, S. and M. Kawashima, "A Recommendation for IPv6 Address Text Representation", RFC 5952, August 2010.

8.2. Informative References

[I-D.fenner-literal-zone] Fenner, B. and M. Duerst, "Formats for IPv6 Scope Zone Identifiers in Literal Address Formats", draft-fenner-literal-zone-02 (work in progress), October 2005.

[RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, June 1999.

[RFC3493] Gilligan, R., Thomson, S., Bound, J., McCann, J., and W. Stevens, "Basic Socket Interface Extensions for IPv6", RFC 3493, February 2003.

[RFC4001] Daniele, M., Haberman, B., Routhier, S., and J. Schoenwaelder, "Textual Conventions for Internet Network Addresses", RFC 4001, February 2005.

[chrome] Google, "Use the address bar (omnibox)", 2012, .

Appendix A. Options Considered

The syntax defined above allows a ZoneID to be added to any IPv6 address. The 6man WG discussed and rejected an alternative in which the existing syntax of IPv6address would be extended by an option to add the ZoneID only for the case of link-local addresses. It was felt that the present solution offers more flexibility for future uses and is more straightforward to implement.

The various syntax options considered are now briefly described.

1. Leave the problem unsolved.

This would mean that per-interface diagnostics would still have to be performed using ping or ping6:

ping fe80::a%en1

Advantage: works today.

Disadvantage: less convenient than using a browser.

2. Simply using the percent character.

http://[fe80::a%en1]

Advantage: allows use of browser, allows cut and paste.

Disadvantage: invalid syntax under RFC 3986; not acceptable to URI community.

3. Escaping the escape character as allowed by RFC 3986:

http://[fe80::a%25en1]

Advantage: allows use of browser, consistent with general URI syntax.

Disadvantage: somewhat ugly and confusing, doesn’t allow simple cut and paste.

This is the option chosen for standardization.

4. Alternative separator

http://[fe80::a-en1]

Advantage: allows use of browser, simple syntax

Disadvantage: Requires all IPv6 address literal parsers and generators to be updated in order to allow simple cut and paste; inconsistent with existing tools and practice.

Note: the initial proposal for this choice was to use an underscore as the separator, but it was noted that this becomes effectively invisible when a user interface automatically underlines URLs.

5. With the "IPvFuture" syntax left open in RFC 3986:

http://[v6.fe80::a_en1]

Advantage: allows use of browser.

Disadvantage: ugly and redundant, doesn’t allow simple cut and paste.

Authors’ Addresses

Brian Carpenter Department of Computer Science University of Auckland PB 92019 Auckland, 1142 New Zealand

Email: [email protected]

Stuart Cheshire Apple Inc. 1 Infinite Loop Cupertino, CA 95014 US

Email: [email protected]

Robert M. Hinden Check Point Software Technologies, Inc. 800 Bridge Parkway Redwood City, CA 94065 US

Email: [email protected]

6man Working Group T. Macaulay Internet-Draft McAfee Inc. Intended status: Standards Track Aug 2012 Expires: February 2, 2013

IPv6 packet staining draft-macaulay-6man-packet-stain-01

Abstract

This document specifies the application of security staining to IPv6 datagrams and the minimum requirements for IPv6 nodes staining flows, IPv6 nodes forwarding stained packets within a given domain of control, and nodes interpreting stains on flows.

The packet staining destination option enables proactive delivery of security intelligence to IPv6 nodes such as firewalls and intrusion prevention systems, and to end-points such as servers, workstations, mobile and smart devices, and an infinite array of as-yet-to-be-invented sensors and controllers.

Packet staining is not intended for use across the open internet, where fragmentation issues associated with increased header size may induce service degradation; it is intended as a security adjunct within a given domain of control such as a carrier or enterprise network.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on February 2, 2013.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

Macaulay Expires February 2, 2013 [Page 1] Internet-Draft IPv6 Packet Staining Aug 2012

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction
2. Conventions used in this document
3. Background
   3.1. Packet Staining Benefits
   3.2. Implementation and support models
   3.3. Use cases
4. Requirements for staining IPv6 packets
5. Packet Stain Destination Option (PSDO)
6. Acknowledgements
7. Security Considerations
8. IANA Considerations
9. Normative References
Author’s Address

1. Introduction

From the viewpoint of the network layer, a flow is a sequence of packets sent from a particular source to a particular unicast, anycast, or multicast destination. From an upper layer viewpoint, a flow could consist of all packets in one direction of a specific transport connection or media stream. However, a flow is not necessarily 1:1 mapped to a transport connection.

Traditionally, flow classifiers have been based on the 5-tuple of the source and destination addresses, ports, and the transport protocol type. However, as the growth of internetworked devices continues under IPv6, security issues associated with the reputation of the source of flows are becoming a critical criterion associated with the trust of the data payloads and the security of the destination end-points and the networks on which they reside.

The usage of security reputational intelligence associated with the source address field and possibly the port and protocol [REF1] enables packet-by-packet IPv6 security classification, where the IPv6 header extensions in the form of Destination Options may be used to stain each packet with security reputation information such that the network routing is unaffected, but intermediate security nodes and endpoint devices can apply policy decisions about incoming information flows without the requirement to assemble and treat payloads at higher levels of the stack.

IPv6 packet staining support consists of labeling datagrams with security reputation information through the addition of an IPv6 destination option in the packet header by packet manipulation devices (PMDs) in the carrier or enterprise network. This destination option may be read by in-line security nodes upstream from the packet destination, as well as by the destination nodes themselves.

Packet staining is not intended for use across the open internet, where fragmentation issues associated with increased header size may induce service degradation; it is intended as a security adjunct within a given domain of control such as a carrier or enterprise network.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. Background

Internet-based threats, in the form of both malicious software and the agents that control this software (organized crime, spies, hacktivists), have surpassed the abilities of signature-based security systems, whether on the enterprise perimeter, within the corporate network, on the endpoint, or in the cloud (internet-based services). Additionally, the sensitivity of IP networks continues to grow as a new generation of smart devices appears on the networks in the form of broadband mobile devices, legacy industrial control devices, and very low-power sensors. This diverse collection of IP-based assets is coming to be known as the Internet of Things (IoT).

In response to the accelerating threats, the security vendor community has integrated its products with proprietary forms of security reputation intelligence. This intelligence concerns IP addresses and domains that have been observed engaging in attack behaviours such as inappropriate messaging and traffic volumes, domain management, botnet command-and-control channel exchanges, and other indicators of either compromise or malicious intent [REF1]. IP addresses may also end up on a security reputation list if they are identified as compromised through vendor-specific signature-based processes. Security reputation intelligence from vendors is typically made available to perimeter and end-point products through proprietary, internet-based queries to vendor information bases.

This system of using proactive security reputation intelligence has many benefits, but also several weaknesses and scaling challenges. Specifically, existing intelligence systems:

1. are subject to direct attack from the internet, for instance on distribution points
2. are proprietary to vendor devices
3. require fat clients that consume both bandwidth and CPU
4. introduce flow latency while queries are sent, received and processed
5. introduce intelligence latency, as reputation lists will inevitably be cached and only periodically refreshed given the number and range of vendor-specific processing elements

3.1. Packet Staining Benefits

In contrast to the challenges of current security reputation intelligence systems, packet staining has the following strengths:

1. packet staining can occur transparently in the network, presenting no attack surface
2. packet staining uses standardized, public domain IPv6 capabilities
3. security rules can be easily applied in hardware or firmware
4. reading packet stains introduces little to no latency
5. near-real-time threat intelligence distribution systems can be implemented out of band in PMDs using a standardized packet staining method, allowing multiple intelligence sources (vendor sources) to be aggregated and applied in an agnostic (cross-vendor) manner

3.2. Implementation and support models

Packet staining may be accomplished by different entities including carriers, enterprises and third-party value-added service providers.

Carriers or service providers may elect to implement staining centres at strategic locations in the network to provide value-added services on a subscription basis. Under this model, subscribers to a security staining service would see their traffic directed through a staining centre where Destination Options are added to the IPv6 headers and IPv4 traffic is encapsulated within IPv6 tunnels, with stained headers.

Carriers or service providers may elect to stain all IPv6 traffic entering their network, and allow subscribers to process the stains at their own discretion.

If such upstream, network-based staining services are inappropriate or unavailable, enterprise data centre managers or cloud computing service providers may elect to deploy IPv6 staining at the perimeter of the internal network, tunnelling all IPv4 traffic, and allow data centre or cloud service users to process stains at their discretion.

Enterprises may wish to deploy IPv6 on internal networks and stain all internal traffic, so that security nodes and end-points may apply corporate security policy related to reputation.

3.3. Use cases

The following are example use-cases for a security technique based upon a packet staining system.

Organization perimeter use-case: Traffic to a subscriber is routed through a PMD in the carrier network configured to stain (apply Destination Options extensions to) all packets destined for the subscriber’s IP range whose sources have entries in the threat intelligence information base. The PMD accesses the information base from a locally cached file or by another method not defined in this draft. Packets from sources not in the information base pass through the PMD unchanged. Packets from sources in the information base have a Destination Option added to the datagram header. The Destination Option contains reputation information from the information base. The format of the destination option is discussed later in this draft. IPv6 perimeter devices such as firewalls, web proxies or security routers on the perimeter of the subscriber network look for Destination Options with reputation stains on incoming packets. If a stain is found, the perimeter device applies the organization policy associated with the reputation indicated by the stain: for instance, it may drop the packet, quarantine the packet, issue alarms, pass the packets and associated flow to specially hardened extra-net authentication systems, or do nothing.

IPv4 support use-case: IPv4 header fields and options are not suitable for packet staining; however, there is a clear security benefit to supporting IPv4 flows. IPv4 traffic to a subscriber is routed through a PMD in the carrier network configured to encapsulate the IPv4 traffic in an IPv6 tunnel. The PMD applies a stain (Destination Options extension) to the IPv6 tunnel as per the perimeter use-case above. Subscriber perimeter devices such as firewalls, web proxies or security routers are configured to support both native IPv6 flows and IPv6 tunnels containing legacy IPv4 flows. Perimeter devices look for Destination Options with reputation stains on incoming IPv6 packets. If a stain is found, the perimeter device applies the organization policy associated with the reputation indicated by the stain to the IPv4 packet within the IPv6 tunnel. In this manner IPv4 support may be transparent to end-users and applications.

IPv6 end-point use-case: IPv6 end-points may make use of reputation stains by processing Destination Options before engaging in any application-level processing. For certain classes of smart device, such as remote and mobile sensors, reputation stains may be a critical form of security when other mitigations such as signature bases and firewalls are too power- and processor-intensive to support.

URL-specific stains: It is common to see large public content portals with millions of users sharing dozens of addresses. Frequently, malicious content will be loaded to such sites. This content represents a very small fraction of the otherwise legitimate content on the site, which may be under the direct control of entirely separate entities. Degrading the reputation of IP addresses used by these large portals based on a very small amount of content is problematic. For such sites, reputation stains should have the ability to include the URL of malicious content, such that the reputation of only the specific portions of these large portals is degraded according to threat evidence, rather than that of the entire IP address, CIDR block, ASN or domain name.

4. Requirements for staining IPv6 packets

1. The default behaviour of a security node MUST be to leave a packet unchanged (apply no stain).

2. Reputation stains may be inserted or overwritten by security nodes in the path.

3. Reputation stains may not be applied by the sender/source of the packet.

4. The reputation staining mechanism needs to be visible to all stain-aware nodes on the path.

5. The mechanism needs to be able to traverse nodes that do not understand the reputation stains. This is required to ensure that packet staining can be incrementally deployed over the Internet.

6. The presence of the reputation staining mechanism should not significantly alter the processing of the packet by nodes, unless policy is explicitly configured. This is required to ensure that stained packets do not face any undue delays or drops due to a badly chosen mechanism.

7. A PMD should be able to distinguish a trusted stain from an untrusted stain, through mechanisms such as digital signatures or intrinsic trust among network elements.

8. A staining node MAY apply more specific and selective staining services according to subscriptions. Staining nodes SHOULD support different reputation taxonomies to support different subscribers and/or interoperability with other staining entities, and have the ability to stain flows to different subscriber sources according to different semantics.

9. Staining MUST NOT increase header size such that headers are fragmented due to nodes supporting an MTU smaller than the complete header, once stained. Therefore staining should only be applied within a domain of control where the MTU is known and can be managed.
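One way a PMD might check the last requirement before staining is sketched below in Python. This is an illustration under several assumptions not stated in this draft: the packet carries no Destination Options header yet, a single stain option is added, and the requirement is read as "the stained packet must still fit the known MTU of the domain". The function names are hypothetical. The padding follows RFC 2460, under which an extension header is an integer multiple of 8 octets long.

```python
def stained_length(packet_len: int, stain_data_len: int) -> int:
    """Packet length after adding a Destination Options header with one stain."""
    if not 0 <= stain_data_len <= 255:
        raise ValueError("Option Length is an 8-bit field")
    # Next Header (1) + Hdr Ext Len (1) + Option Type (1) + Option Length (1)
    header_len = 2 + 2 + stain_data_len
    # Round up to the next multiple of 8 octets (padding options fill the gap).
    header_len = -(-header_len // 8) * 8
    return packet_len + header_len


def may_stain(packet_len: int, stain_data_len: int, domain_mtu: int) -> bool:
    """True if the stained packet still fits within the domain's MTU."""
    return stained_length(packet_len, stain_data_len) <= domain_mtu
```

A packet for which may_stain returns False would be forwarded unchanged, which is consistent with the default behaviour in requirement 1.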

5. Packet Stain Destination Option (PSDO)

The Packet Stain Destination Option (PSDO) is a destination option that PMDs can insert into IPv6 datagrams in order to convey security reputation information to packet-staining-aware nodes on the path, or to endpoints. The PSDO has no alignment requirement.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                   |  Option Type  | Option Length |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|U|                        Stain Data                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1: Packet Stain Destination Option Layout

Option Type

8-bit identifier of the type of option. The option identifier for the reputation stain option will be allocated by the IANA.

Option Length

8-bit unsigned integer. The length of the option (excluding the Option Type and Option Length fields).

S Bit

When this bit is set, the reputation stain option has been signed.

U Bit

When this bit is set, the reputation stain option contains a malicious URL.

Stain Data

Contains the staining data.
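The layout in Figure 1 can be sketched in Python as follows. Since this draft does not define the contents of the Stain Data field, the sketch assumes a hypothetical encoding in which the S and U bits and a 30-bit stain value share one 32-bit word, exactly as drawn in the figure. The option type 0x9E is a placeholder for the value to be assigned by IANA, and the function names are illustrative only.

```python
import struct


def encode_psdo(option_type: int, stain: int,
                signed: bool = False, has_url: bool = False) -> bytes:
    """Build Option Type, Option Length and one 32-bit word of S|U|stain."""
    if not 0 <= option_type <= 0xFF:
        raise ValueError("Option Type is an 8-bit field")
    if not 0 <= stain < (1 << 30):
        raise ValueError("stain does not fit in 30 bits (assumed width)")
    word = (int(signed) << 31) | (int(has_url) << 30) | stain
    # Option Length excludes the Option Type and Option Length octets.
    return struct.pack("!BBI", option_type, 4, word)


def decode_psdo(option: bytes) -> dict:
    """Inverse of encode_psdo for the assumed 6-octet layout."""
    if len(option) != 6:
        raise ValueError("unexpected option size")
    option_type, length, word = struct.unpack("!BBI", option)
    if length != 4:
        raise ValueError("unexpected Option Length")
    return {
        "option_type": option_type,
        "signed": bool(word >> 31),          # S bit
        "has_url": bool((word >> 30) & 1),   # U bit
        "stain": word & 0x3FFFFFFF,
    }
```

A real encoding would need a variable-length Stain Data field to carry a signature or a URL; the fixed 32-bit word is used here only to keep the bit positions of S and U visible.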

6. Acknowledgements

The author wishes to acknowledge the guidance and support of Suresh Krishnan from Ericsson’s Montreal lab. The author also wishes to credit Chris Mac-Stoker from NIKSUN for his substantial contributions to the early stages of the packet staining concept.

7. Security Considerations

Some implementations may elect not to apply digital signatures to reputation stains in the Destination Option, in which case the stain is not protected in any way, even if IPsec authentication [RFC4302] is in use. Therefore an unsigned reputation stain can be forged by an on-path attacker. Implementers are advised that any en-route change to an unsigned security reputation stain value is undetectable. Therefore, packet staining using the Destination Options extension without digital signatures requires intrinsic trust among the network elements and the PMD, and the destination node or intervening security nodes such as firewalls or IDS services. For this reason, receiving nodes MAY need to take account of the network from which the stained packet was received. For instance, a multi-homed organization may have some service providers with staining services and others without. A receiving node SHOULD be able to distinguish the sources from which stains are expected. A receiving node SHOULD by default ignore any reputation stains from sources (networks or devices) that have not been specifically configured as trusted.

The reputation intelligence of IP source addresses, ASNs, CIDR blocks and domains is fundamental to the application of reputation stains within packet headers. Such reputation information can be seeded from a variety of open and closed sources. Poorly managed or compromised intelligence information bases can result in denial of service against legitimate IP addresses, and allow malicious entities to appear trustworthy. Intelligence information bases themselves may be compromised in a variety of ways; for instance, the raw information feeds may be corrupted with erroneous information, or the intelligence reputation algorithms could be flawed in design or corrupted such that they generate false reputation scores. Therefore seed intelligence SHOULD be sourced and monitored with demonstrable diligence. Similarly, reputation algorithms should be protected from unauthorized change with multi-layered access controls.

The value of reputation stains will be directly proportional to the trustworthiness, reliability and reputation of the intelligence source itself. Operators of security nodes SHOULD have defined and auditable methods by which they select and manage the source of reputation intelligence and the packet staining infrastructure itself.

8. IANA Considerations

This document defines a new IPv6 destination option for carrying security reputation packet stains. IANA is requested to assign a new destination option type (TBA1) in the Destination Options registry maintained at http://www.iana.org/assignments/ipv6-parameters:

1) Signed Security Reputation Option
2) Unsigned Security Reputation Option
3) Signed Security Reputation Option with malicious URL
4) Unsigned Security Reputation Option with malicious URL

The act bits for this option need to be 10 and the chg bit needs to be 0.

9. Normative References

[REF1] Macaulay, T., "Upstream Intelligence: anatomy, architecture, case studies and use-cases.", Information Assurance Newsletter, DOD, Aug to February 2010 to 2011.

[REF2] Gont, F., "Security and Interoperability of Oversized IPv6 Header Chains", June 2012.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

Author’s Address

Tyson Macaulay McAfee Inc. 2821 Mission College Boulevard Santa Clara, California U.S.A.

Email: [email protected]


6man S. Zhou, Ed. Internet-Draft R. Zhang Intended status: Standards Track Z. Xie Expires: March 17, 2013 ZTE Corporation September 13, 2012

Another Support for Multiple Hash Algorithms in Cryptographically Generated Addresses (CGAs) draft-zhou-6man-mhash-cga-02

Abstract

This document provides support for multiple hash algorithms in Cryptographically Generated Addresses (CGAs) defined in RFC 3972.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on March 17, 2013.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Zhou, et al. Expires March 17, 2013 [Page 1] Internet-Draft draft-zhou-6man-mhash-cga-00 September 2012

Table of Contents

1. Introduction
   1.1. Terminology
2. Mhash-method Extension
3. Hash Algorithm Identity Parameter
4. CGA Generation Procedure
5. CGA Verification Procedure
6. IANA Considerations
7. Security Considerations
8. Normative References
Authors’ Addresses


1. Introduction

Cryptographically Generated Addresses (CGAs), defined in [RFC3972], are a method of binding a public key to an IPv6 address, with the aim of providing proof of address ownership in many Internet protocols. In the generation and verification of a CGA, a cryptographically secure hash function, SHA-1 in the case of [RFC3972], is used to hash a public key into part of the IPv6 address.

As pointed out in Section 4 of [RFC4982], it is wise to enable support for multiple hash functions in CGAs, so that once the current hash function no longer satisfies future requirements (e.g., potential future applications of CGAs may need a more cryptographically secure hash algorithm than SHA-1), the transition to an alternative hash function is as easy as possible.

To provide hash algorithm agility, [RFC4982] specifies a method of reusing the security parameter bits in the address. The security parameter, sec, defined in [RFC3972], is a 3-bit value from 0 to 7 used in the hash extension technique (Section 7.2 of [RFC3972]) to compensate for the truncation of the SHA-1 output caused by the limited number of bits available in a CGA. According to the method specified in RFC 4982, the security parameter is also used to represent the hash algorithm identity:

000 means sec=0 and SHA-1

001 means sec=1 and SHA-1

010 means sec=2 and SHA-1

The security parameter is then limited to 0, 1 and 2. These values may be sufficient for now, but higher security parameter values may be required as computers become faster, as pointed out in Section 7.2 of RFC 3972.

Even with the security parameter limited in this way, the method in RFC 4982 can support at most three hash algorithms. Of the eight code points, three are taken by SHA-1 with sec=0, 1 and 2; a first alternative hash algorithm can take three more, again with sec=0, 1 and 2; and a second alternative hash algorithm is left with only two code points, so it can be combined with only two of the sec values {0, 1, 2}.

Taking the above two factors into consideration, at some time in the future we will face a painful choice between a high security parameter and a more secure hash algorithm. We may also face a high upgrade cost because of the massive number of IPv6 nodes that may be using CGAs.


This document provides support for multiple hash algorithms without limiting the security parameter or downgrading the security level of CGAs. The proposed solution follows the idea of encoding the hash algorithm identity in the CGA itself to prevent downgrading attacks; a detailed description of the downgrading attack can be found in Section 4.1 of [RFC4982].

1.1. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Mhash-method Extension

To accommodate RFC 4982, an extension field "Mhash-method" is defined. The format is illustrated in Figure 1.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Extension Type        |       Extension Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Mhash-method |
   +-+-+-+-+-+-+-+-+

                               Figure 1

Extension Type: TBA. (16-bit unsigned integer).

Extension Data Length: 1. (16-bit unsigned integer. Length of the multiple-hash-method field of this option, in octets.)

Mhash-method: 1-octet field. If Mhash-method equals 0, the method of denoting the hash algorithm specified in RFC 4982 is used; if Mhash-method equals 1, the method specified in this document is used.
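As an illustration only, the extension layout above could be packed and parsed as in the following Python sketch. The Extension Type is still TBA, so MHASH_EXT_TYPE is a hypothetical placeholder value, and the function names are assumptions made for this example rather than part of the specification.

```python
import struct

# Hypothetical placeholder: the Extension Type is "TBA" in this document.
MHASH_EXT_TYPE = 0xFFFD

MHASH_RFC4982 = 0  # hash algorithm denoted as specified in RFC 4982
MHASH_HID = 1      # hash algorithm denoted by the hid bits of this document


def encode_mhash_extension(mhash_method, ext_type=MHASH_EXT_TYPE):
    """Pack Extension Type (16 bits), Extension Data Length (16 bits,
    always 1) and the 1-octet Mhash-method field, in network byte order."""
    if mhash_method not in (MHASH_RFC4982, MHASH_HID):
        raise ValueError("Mhash-method must be 0 or 1")
    return struct.pack("!HHB", ext_type, 1, mhash_method)


def decode_mhash_extension(data):
    """Return (extension type, Mhash-method) from a 5-octet extension."""
    ext_type, length = struct.unpack("!HH", data[:4])
    if length != 1 or len(data) != 5:
        raise ValueError("extension must carry exactly one data octet")
    return ext_type, data[4]
```

With the placeholder type, encode_mhash_extension(MHASH_HID) yields five octets: two for the type, two for the length (1), and one for the Mhash-method value.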

3. Hash Algorithm Identity Parameter

A hash algorithm identity parameter (hid) is defined in the CGA to denote the hash algorithm used when calculating HASH1 and HASH2. The hash algorithm identity parameter is a three-bit unsigned integer, encoded in bits 3 through 5 of the interface identifier (counting from bit 0 at the left, i.e., the three bits immediately following sec). This can be written as follows:

hid = (interface identifier & 0x1c00000000000000) >> 58


    0 1 2 3 4 5 6 7 8 9 0
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | sec | hid |0 0|       Leftmost 56 bits of HASH1 output        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
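The bit layout above can be sketched in Python as follows. This is a minimal illustration that treats the interface identifier as a 64-bit integer; the function and constant names are assumptions introduced for this example.

```python
SEC_MASK = 0xE000000000000000    # bits 0..2 of the interface identifier
HID_MASK = 0x1C00000000000000    # bits 3..5, as in the formula above
HASH1_MASK = (1 << 56) - 1       # rightmost 56 bits


def parse_iid(iid):
    """Split a 64-bit interface identifier into (sec, hid, hash1)."""
    sec = (iid & SEC_MASK) >> 61
    hid = (iid & HID_MASK) >> 58
    hash1 = iid & HASH1_MASK
    return sec, hid, hash1


def build_iid(sec, hid, hash1):
    """Assemble an interface identifier; the "u" and "g" bits
    (bits 6 and 7) are left at zero."""
    if not (0 <= sec <= 7 and 0 <= hid <= 7 and 0 <= hash1 <= HASH1_MASK):
        raise ValueError("field out of range")
    return (sec << 61) | (hid << 58) | hash1
```

For example, sec=1 and hid=2 give a leading octet of 0x28 (binary 001 010 00).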

4. CGA Generation Procedure

Generate a CGA as defined in RFC 3972, with modifications to steps 2, 3, 5, 6 and 9 as shown in the following:

1. Set the modifier to a random or pseudo-random 128-bit value.

2. Concatenate from left to right the modifier, 9 zero octets, the encoded public key, and any optional extension fields. Execute the adopted hash algorithm (denoted by the value of hid) on the concatenation. Take the 112 (or 115 in the case sec=7) leftmost bits of the hash value. The result is Hash2.

3. Compare the 16*Sec+3 leftmost bits of Hash2 with zero. If they are all zero, continue with step 4. Otherwise, increment the modifier by one and go back to step 2.

4. Set the 8-bit collision count to zero.

5. Concatenate from left to right the final modifier value, the subnet prefix, the collision count, the encoded public key, and any optional extension fields. Execute the adopted hash algorithm on the concatenation. Take the 56 leftmost bits of the hash value. The result is Hash1.

6. Form an interface identifier from Hash1 by writing the value of Sec into the three leftmost bits, writing the value of hid into the following three bits, setting bits 6 and 7 (i.e., the "u" and "g" bits) to zero, and placing the 56 bits of Hash1 in the remaining 56 bits.

7. Concatenate the 64-bit subnet prefix and the 64-bit interface identifier to form a 128-bit IPv6 address with the subnet prefix to the left and interface identifier to the right, as in a standard IPv6 address.

8. Perform duplicate address detection if required. If an address collision is detected, increment the collision count by one and go back to step 5. However, after three collisions, stop and report the error.

9. Form the CGA Parameters data structure by concatenating from left to right the final modifier value, the subnet prefix, the final collision count value, the encoded public key, the Mhash-method extension (with Mhash-method equal to 1 in this case) and any other optional extension fields.
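The generation steps above could be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the hid-to-algorithm mapping follows the values suggested in the IANA Considerations section, the extension type 0xFFFD is a hypothetical placeholder for the TBA value, the public key is treated as an opaque byte string, the Mhash-method extension is assumed to be part of the hashed extension fields (so that verification over the CGA Parameters matches), and duplicate address detection (step 8) is left to the caller.

```python
import hashlib
import os
import struct

# Assumed mapping, taken from the hid values suggested for IANA assignment.
HASH_BY_HID = {0: "sha1", 1: "sha224", 2: "sha256", 3: "sha384", 4: "sha512"}

# Mhash-method extension with value 1; 0xFFFD is a hypothetical type (TBA).
MHASH_EXT = struct.pack("!HHB", 0xFFFD, 1, 1)


def leftmost_bits(digest, nbits):
    """Return the nbits leftmost bits of digest as an integer."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - nbits)


def generate_cga(prefix, public_key, sec, hid, other_ext=b"", collision_count=0):
    """Return (16-octet address, CGA Parameters). Step 8 (DAD) is omitted."""
    if len(prefix) != 8 or not 0 <= sec <= 7 or hid not in HASH_BY_HID:
        raise ValueError("bad prefix, sec or hid")
    algo = HASH_BY_HID[hid]
    tail = public_key + MHASH_EXT + other_ext
    modifier = int.from_bytes(os.urandom(16), "big")              # step 1
    while True:
        mod = modifier.to_bytes(16, "big")
        hash2 = hashlib.new(algo, mod + bytes(9) + tail).digest() # step 2
        # Step 3: testing the 16*sec+3 leftmost bits of the digest is the
        # same as testing them on the truncated Hash2.
        if leftmost_bits(hash2, 16 * sec + 3) == 0:
            break
        modifier = (modifier + 1) % (1 << 128)
    # Steps 4, 5 and 9: the Hash1 input is also the CGA Parameters structure.
    params = mod + prefix + bytes([collision_count]) + tail
    hash1 = leftmost_bits(hashlib.new(algo, params).digest(), 56)
    iid = (sec << 61) | (hid << 58) | hash1                       # step 6
    return prefix + iid.to_bytes(8, "big"), params                # step 7
```

Note that the expected number of hash computations in the loop grows as 2^(16*sec+3), so only small sec values are practical for experimentation.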

5. CGA Verification Procedure

Verify a CGA as defined in RFC 3972, with modifications to steps 3, 4, 6 and 7 as shown in the following:

1. Check that the collision count in the CGA Parameters data structure is 0, 1, or 2. The CGA verification fails if the collision count is out of the valid range.

2. Check that the subnet prefix in the CGA Parameters data structure is equal to the subnet prefix (i.e., the leftmost 64 bits) of the address. The CGA verification fails if the prefix values differ.

3. If the Mhash-method value in the Mhash-method extension field is 1, read the hash algorithm identity parameter hid from bits 3 through 5 of the 64-bit interface identifier of the address (the three bits following Sec), and execute the hash algorithm denoted by hid on the CGA Parameters data structure. Take the 56 leftmost bits of the hash value. The result is Hash1. If the Mhash-method value in the Mhash-method extension field is 0, proceed exactly as specified in RFC 3972 and RFC 4982.

4. Compare Hash1 with the rightmost 56 bits of the interface identifier of the address. If the 56-bit values differ, the CGA verification fails.

5. Read the security parameter Sec from the three leftmost bits of the 64-bit interface identifier of the address. (Sec is an unsigned 3-bit integer.)

6. Concatenate from left to right the modifier, 9 zero octets, the public key, and any extension fields that follow the public key in the CGA Parameters data structure. Execute the hash algorithm denoted by hid on the concatenation. Take the 112 (or 115 in the case sec=7) leftmost bits of the hash value. The result is Hash2.

7. Compare the 16*Sec+3 leftmost bits of Hash2 with zero. If any one of them is not zero, the CGA verification fails. Otherwise, the verification succeeds.
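The verification steps above could be sketched in Python as follows for the Mhash-method = 1 case. This is a hedged illustration: the hid-to-algorithm mapping follows the values suggested in the IANA Considerations section, the function names are assumptions, and parsing of the extension fields (to confirm that Mhash-method is 1) is assumed to have been done by the caller.

```python
import hashlib

# Assumed mapping, taken from the hid values suggested for IANA assignment.
HASH_BY_HID = {0: "sha1", 1: "sha224", 2: "sha256", 3: "sha384", 4: "sha512"}


def leftmost_bits(digest, nbits):
    """Return the nbits leftmost bits of digest as an integer."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - nbits)


def verify_cga(address, params):
    """Verify a 16-octet address against a CGA Parameters byte string laid
    out as modifier (16), subnet prefix (8), collision count (1), then the
    public key and extension fields."""
    if len(address) != 16 or len(params) < 25:
        return False
    modifier, prefix, collision_count = params[:16], params[16:24], params[24]
    tail = params[25:]                       # public key + extension fields
    if collision_count not in (0, 1, 2):     # step 1
        return False
    if prefix != address[:8]:                # step 2
        return False
    iid = int.from_bytes(address[8:], "big")
    algo = HASH_BY_HID.get((iid >> 58) & 0x7)    # step 3: hid selects the hash
    if algo is None:
        return False
    hash1 = leftmost_bits(hashlib.new(algo, params).digest(), 56)
    if hash1 != iid & ((1 << 56) - 1):       # step 4
        return False
    sec = iid >> 61                          # step 5
    hash2 = hashlib.new(algo, modifier + bytes(9) + tail).digest()  # step 6
    return leftmost_bits(hash2, 16 * sec + 3) == 0                  # step 7
```

An address whose hid value has no assigned algorithm is rejected in this sketch; the document itself does not state how unassigned values are to be handled.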


6. IANA Considerations

This document defines one new CGA Extension Type [RFC4581] option, which must be assigned by IANA:

Name: Mhash-method extension type;

Value: TBA.

Description: see Section 2.

The values of Mhash-method are also defined:

Name: Mhash-method extension value;

Value: 0 meaning RFC 4982, 1 meaning this document;

Description: see Section 2.

This document also defines a new parameter (hid) in CGA, the value of which must be assigned by IANA. It may be assigned as follows:

   Name     | Value
   ---------+------
   SHA-1    | 000
   SHA-224  | 001
   SHA-256  | 010
   SHA-384  | 011
   SHA-512  | 100
   TBD      | 101
   TBD      | 110
   TBD      | 111

7. Security Considerations

The security of applications using CGAs relies on the adopted public key schemes, which is out of the scope of this document, as well as the adopted hash algorithms.

A highly cryptographically secure hash algorithm is obviously required. But no hash algorithm is guaranteed to be secure forever, so it is wise to add algorithm agility to CGAs in case the current hash algorithm is successfully attacked.

This document suggests adding more flexible hash algorithm agility to CGAs. The method in this document follows the idea of encoding the hash algorithm identifier in the interface identifier to avoid downgrading attacks, as analyzed in Section 4.1 of RFC 4982.

CGAs adopt only truncated forms of a hash algorithm that is considered cryptographically secure in its regular form. As specified in RFC 3972, the effective bits relating to the security of CGAs are only the leftmost 59 bits (in the case of HASH1) and the leftmost 16*sec bits (in the case of HASH2) of the whole 160-bit SHA-1 output. It is roughly estimated that the overall security of the hash algorithm is O(2^(16*sec+59)) (Section 7.2 of RFC 3972).

In this document, 3 bits originally used for the output of HASH1 are taken from the interface identifier to denote the hash algorithm identity, while 3 more bits of the output of HASH2 are checked. On the whole, the security level is kept roughly the same, i.e., O(2^(16*sec+59)).
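The bit accounting behind this estimate can be checked with a short Python sketch. The function names are assumptions for illustration; the figures are the ones stated above (59 HASH1 bits and 16*sec HASH2 bits in RFC 3972, versus 56 HASH1 bits and 16*sec+3 HASH2 bits here).

```python
def rfc3972_security_bits(sec):
    """Effective bits in RFC 3972: 59 from HASH1 plus 16*sec from HASH2."""
    return 59 + 16 * sec


def mhash_security_bits(sec):
    """Effective bits in this document: HASH1 gives up 3 bits to hid,
    while 3 additional leading bits of HASH2 must be zero."""
    hash1_bits = 59 - 3
    hash2_bits = 16 * sec + 3
    return hash1_bits + hash2_bits
```

For every sec from 0 to 7 the two totals coincide, which is the sense in which the security level is kept roughly the same.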

8. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3972] Aura, T., "Cryptographically Generated Addresses (CGA)", RFC 3972, March 2005.

[RFC4982] Bagnulo, M. and J. Arkko, "Support for Multiple Hash Algorithms in Cryptographically Generated Addresses (CGAs)", RFC 4982, July 2007.

Authors’ Addresses

Sujing Zhou (editor) ZTE Corporation No.68 Zijinghua Rd. Yuhuatai District Nanjing, Jiang Su 210012 P.R.China

Email: [email protected]


Ruishan Zhang ZTE Corporation 889 Bibo Rd, Zhangjiang Hi-Tech Park Shanghai 201203 P.R.China

Email: [email protected]

Zhenhua Xie ZTE Corporation No.68 Zijinghua Rd. Yuhuatai District Nanjing, Jiang Su 210012 P.R.China

Phone: +86-25-52871287 Fax: +86-25-52871000 Email: [email protected]
