资源预览内容
第1页 / 共128页
第2页 / 共128页
第3页 / 共128页
第4页 / 共128页
第5页 / 共128页
第6页 / 共128页
第7页 / 共128页
第8页 / 共128页
第9页 / 共128页
第10页 / 共128页
亲,该文档总共128页,到这儿已超出免费预览范围,如果喜欢就下载吧!
资源描述
Chapter 4Network LayerComputer Networking: A Top Down Approach Featuring the Internet, 3rd edition. Jim Kurose, Keith RossAddison-Wesley, July 2004. A note on the use of these ppt slides:Were making these slides freely available to all (faculty, students, readers). Theyre in PowerPoint form so you can add, modify, and delete slides (including this one) and slide content to suit your needs. They obviously represent a lot of work on our part. In return for use, we only ask the following:q If you use these slides (e.g., in a class) in substantially unaltered form, that you mention their source (after all, wed like people to use our book!)q If you post any slides in substantially unaltered form on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material.Thanks and enjoy! JFK/KWRAll material copyright 1996-2006J.F Kurose and K.W. Ross, All Rights Reserved1Network LayerChapter 4: Network LayerChapter goals: runderstand principles behind network layer services:mnetwork layer service modelsmforwarding versus routingmhow a router worksmrouting (path selection)mdealing with scalemadvanced topics: IPv6, mobilityrinstantiation, implementation in the Internet2Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing3Network LayerNetwork layerrtransport segment from sending host to receiving host ron sending side encapsulates segments into datagrams (packet)ron rcving side, delivers segments to transport layerrnetwork layer protocols in every host, routerrRouter examines header fields in all IP datagrams passing through itnetworkdata linkphysicalnetworkdata linkphysicalnetworkdata linkphysicalnetworkdata linkphysicalnetworkdata linkphysicalnetworkdata linkphysicalnetworkdata linkphysicalnetworkdata linkphysicalapplicationtransportnetworkdata linkphysicalapplicationtransportnetworkdata linkphysical4Network LayerTwo Key Network-Layer Functionsrforwarding: move packets from routers input to appropriate router outputrrouting: determine route taken by packets from source to dest. mrouting algorithmsanalogy:rrouting: process of planning trip from source to destrforwarding: process of getting through single interchange5Network Layer1230111value in arrivingpackets headerrouting algorithmlocal forwarding tableheader value output link01000101011110013221Interplay between routing and forwarding6Network LayerConnection setuprConnection setup(except for forwarding and routing) is the 3rd important function of network layer in some network architectures:mATM, frame relay, X.25rbefore datagrams flow, two end hosts and intervening routers establish virtual connectionmrouters get involvedrnetwork vs transport layer connection service:mnetwork: between two hosts (may also involve intervening routers in case of VCs)mtransport: between two processes7Network LayerNetwork service modelQ: What service model for “channel” transporting datagrams from sender to receiver?Example services for individual datagrams:rguaranteed deliveryrguaranteed delivery with less than 40 msec delayBut Internet only provides “best-effort service” in which seams almost “nothing”.Example services for a flow of datagrams:rin-order datagram deliveryrguaranteed minimum bandwidth to flowrrestrictions on changes in inter-packet spacing8Network LayerNetwork layer service models:NetworkArchitectureInternetATMATMATMATMServiceModelbest effortCBRVBRABR(Available)UBRBandwidthnoneConstantBit Rateguaranteedrateguaranteed minimumnoneLossnoyesyesnonoOrdernoyesyesyesyesTimingnoyesyesnonoCongestionfeedbackno (inferredvia loss)nocongestionnocongestionyesnoGuarantees ?9Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing10Network LayerNetwork layer connection and connection-less servicerdatagram network provides network-layer connectionless servicerVC network provides network-layer connection serviceranalogous to the transport-layer services, but:mservice: host-to-hostmno choice: network provides one or the other, not both of them.mimplementation: beside in end system also in network core11Network LayerVirtual circuitsrcall setup, teardown for each call before/after data can flowreach packet carries VC identifier (not destination host address)revery router on source-dest path maintains “state” for each passing connectionrlink, router resources (bandwidth, buffers) may be allocated to VC (dedicated resources = predictable service)“source-to-dest path behaves much like telephone circuit”mperformance-wisemnetwork actions along source-to-dest path12Network LayerVC implementationa VC consists of:1.path from source to destination2.VC numbers, one number for each link along path3.Entries(表项) in forwarding tables in routers along pathrpacket belonging to VC carries VC number (rather than dest address)rVC number can be changed on each link.mNew VC number comes from forwarding tablem一条虚电路上的逐段链路不使用同一VC号,能够”减少分组首部中VC字段的长度”,且”大大简化了虚电路的建立,免得路由器交换和处理相当大量的报文”13Network LayerForwarding table122232123VC numberinterfacenumberIncoming interface Incoming VC # Outgoing interface Outgoing VC #1 12 3 222 63 1 18 3 7 2 171 97 3 87 Forwarding table innorthwest router:Routers maintain connection state information!创建一新连接时,新连接项加入表中;释放一旧连接时,从表中删除该项.14Network LayerVirtual circuits: signaling protocols(交换用于建立与释放连接的信令报文的协议)rused to setup, maintain teardown VCrused in ATM, frame-relay, X.25rnot used in todays Internetapplicationtransportnetworkdata linkphysicalapplicationtransportnetworkdata linkphysical1. Initiate call2. incoming call3. Accept call4. Call connected5. Data flow begins6. Receive data15Network LayerDatagram networks (Internet)rno call setup at network layerrrouters: no “state” about end-to-end connectionsmno network-level concept of “connection”rpackets forwarded using destination host addressmpackets between same source-dest pair may take different pathsm路由器中有将目的地址映射到链路接口的转发表applicationtransportnetworkdata linkphysicalapplicationtransportnetworkdata linkphysical1. Send data2. Receive data16Network LayerForwarding table Destination Address Range Link Interface 11001000 00010111 00010000 00000000 through 0 11001000 00010111 00010111 11111111 11001000 00010111 00011000 00000000 through 1 11001000 00010111 00011000 11111111 11001000 00010111 00011001 00000000 through 2 11001000 00010111 00011111 11111111 otherwise 332 bits addr. means near 4 billion possible entries(表项) need to establish难以实施.采用前缀匹配来简化实现.17Network LayerLongest prefix matching Prefix Match Link Interface 11001000 00010111 00010 0 11001000 00010111 00011000 1 11001000 00010111 00011 2 otherwise 3转发表中只需这四个表项既可转发表中只需这四个表项既可.DA: 11001000 00010111 00011000 10101010 ExamplesDA: 11001000 00010111 00010110 10100001 Which interface?Which interface?18Network LayerDatagram or VC network: why?Internet (datagram)rdata exchange among computersm“elastic” service, no strict timing req. r“smart” end systems (computers)mcan adapt, perform control, error recoverymsimple inside network, complexity at “edge”rmany link types mdifferent characteristicsmuniform service difficultATM (VC)revolved from telephonyrhuman conversation: mstrict timing, reliability requirementsmneed for guaranteed servicer“dumb (哑/不说话)” end systemsmtelephonesmcomplexity inside network19Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing20Network LayerRouter Architecture OverviewTwo key router functions: rrun routing algorithms/protocol (RIP, OSPF, BGP)rforwarding datagrams from incoming to outgoing link21Network LayerInput Port FunctionsDecentralized (分散式分散式) switching: rgiven datagram dest., lookup output port using forwarding table in input port memoryrgoal: complete input port processing at line speed(执行一次查找的时间应小于从输入端口接收一个分组所需的时间)rqueuing: if datagrams arrive faster than forwarding rate (into switch fabric) 线路端接 Physical layer:bit-levelreceptionData link layer:e.g., Ethernetsee chapter 522Network LayerThree types of switching fabrics23Network LayerSwitching Via MemoryFirst generation routers:r traditional computers with switching under direct control of CPUrpacket copied to systems memory(借助于选路处理器)r speed limited by memory bandwidth (2 bus crossings per datagram同一分组用了两倍时间完成写/读)InputPortOutputPortMemorySystem Bus24Network LayerSwitching Via a Bus (无需选路处理器干预)rdatagram from input port memory to output port memory via a shared bus (一次只能有一个分组通过总线传送)rbus contention: switching speed limited by bus bandwidthr1 Gbps bus, Cisco 1900: sufficient speed for access and enterprise routers (接入网和企业网,not regional or backbone)25Network LayerSwitching Via An Interconnection Network (纵横制)rovercome bus bandwidth limitationsrBanyan networks, other interconnection nets initially developed to connect processors in multiprocessor architecturerAdvanced design: fragmenting datagram into fixed length cells且加上标签,然后switch cells through the fabric. 从而大大简化和加快了分组的交换.rCisco 12000: switches 60 Gbps through the interconnection network(借助于选路处理器)26Network LayerOutput PortsrBuffering required when datagrams arrive from fabric faster than the transmission raterScheduling discipline chooses among queued datagrams for transmission27Network LayerOutput port queueingrbuffering when arrival rate via switch exceeds output line speedrqueueing (delay) and loss due to output port buffer overflow! (丢失发生取决于流量、负载交换结构的相对速率和线路速率等因素,其概率值一般是与平均队列长度、最小阈值和最大阈值有关的函数值)28Network LayerInput Port QueuingrFabric speed is slower than input ports combined - queueing may occur at input queues rHead-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward (线路前部阻塞)rqueueing delay and loss due to input buffer overflow!29Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing30Network LayerThe Internet Network layerforwardingtableHost, router network layer functions(三个重要组件完成):Routing protocolspath selectionRIP, OSPF, BGPIP protocoladdressing conventions(规则)datagram formatpacket handling conventionsICMP protocolerror reportingrouter “signaling”Transport layer: TCP, UDPLink layerphysical layerNetworklayer31Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing32Network LayerIP datagram formatverlength32 bitsdata (variable length,typically a TCP or UDP segment;也可是ICMP报文段)16-bit identifierheader checksumtime tolive32 bit source IP addressIP protocol versionNumber(4位)header length(4位) (首部bytes:至少20B)max numberremaining hops(decremented at each router:减为零时丢弃)for fragmenta-tion/Reassembly(标识;16位;标志:3位,有1位表示分片是否为最后一个分片-置0;片偏移:13位,以8字节块为单位)total datagramlength (bytes): 1500Bupper layer protocolto deliver payload to(6:TCP;17:UDP)head.lentype ofservice8个优先级:3位;D(延迟)、T(吞吐量)、R(可靠性)、C (费用)各1位;1位未用。flgsfragment offsetupper layer32 bit destination IP addressOptions (if any)E.g. timestamp,record routetaken, specifylist of routers to visit.how much overhead with TCP?r20 bytes of TCPr20 bytes of IPr= 40 bytes + app layer overhead33Network LayerIP Fragmentation & Reassembly(路由器的分片与目的主机的组装)rnetwork links have MTU (max.transfer size) - largest possible link-level frame.mdifferent link types, different MTUs (以太网frame1500B,而广域网链路的frame576B)rlarge IP datagram divided (“fragmented”,) within net (因IP数据报被封装在链路层的frame中,故受其MTU 所约束)mone datagram becomes several datagramsm“reassembled” only at final destinationmIP header bits used to identify, order related fragmentsfragmentation: in: one large datagramout: 3 smaller datagramsReassembly组装34Network LayerIP Fragmentation and ReassemblyID=xoffset=0fragflag=0length=4000ID=xoffset=0fragflag=1length=1500ID=xoffset=185fragflag=1length=1500ID=xoffset=370fragflag=0length=1040One large datagram becomesseveral smaller datagramsExampler4000 byte datagram (含有20B的头,数据为3980B)rMTU = 1500 bytes第一片:1480 bytes in data field(0到1479B)第二片: offset =1480/8 (data field:1480到2959B)第三片:数据为1020B(3980-1480-1480, 2960到3979B)35Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing36Network LayerIP Addressing: introductionrIP address: 32-bit identifier for host, router interface rinterface: connection between host/router and physical linkmrouters typically have multiple interfacesmhost typically has one interfacemIP addresses associated with each interface (技术上IP地址与接口相关联,而非主机或路由器)223.1.1.1223.1.1.2223.1.1.3223.1.1.4223.1.2.9223.1.2.2223.1.2.1223.1.3.2223.1.3.1223.1.3.27223.1.1.1 = 11011111 00000001 00000001 0000000122311137Network LayerSubnetsrIP address: msubnet part (high order bits)mhost part (low order bits) rWhats a subnet ?mdevice interfaces with same subnet part of IP addressmcan physically reach each other without intervening router223.1.1.1223.1.1.2223.1.1.3223.1.1.4223.1.2.9223.1.2.2223.1.2.1223.1.3.2223.1.3.1223.1.3.27network consisting of 3 subnetssubnet38Network LayerSubnets223.1.1.0/24223.1.2.0/24223.1.3.0/24 To determine the subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet. 为了确定子网,分开主机和路由器的每个接口,从而产生了几个分离的网络岛。这些独立的网络中的每个叫做一个子网。Subnet mask(子网掩码): /2439Network LayerSubnetsHow many? 6个:223.1.1.0/24,223.1.2.0/24,223.1.3.0/24,和223.1.7.0/24,223.1.8.0/24,223.1.9.0/24223.1.1.1223.1.1.3223.1.1.4223.1.2.2223.1.2.1223.1.2.6223.1.3.2223.1.3.1223.1.3.27223.1.1.2223.1.7.0223.1.7.1223.1.8.0223.1.8.1223.1.9.1223.1.9.240Network LayerIP addressing: CIDR (无类别域间选路) CIDR: Classless InterDomain Routing 这是英特网的地址分配策略,将32位的IP地址划分为两部分。msubnet portion of address of arbitrary lengthmaddress format: a.b.c.d/x, where x is # bits in subnet portion of address采用CIDR之前,有A、B、C(网络地址分别为8 、16和24位)、D和E类网络的“分类编址”方案,易造成IP地址的浪费或地址空间的低利用率。11001000 00010111 00010000 00000000subnetparthostpart200.23.16.0/2341Network LayerIP addresses: how to get one? Q: How does host get IP address? 是先获得主机所在组织的块地址,然后由该组织为其内的主机或路由器接口分配独立的IP地址。rhard-coded (手动配置)by system admin in a filemWintel: control-panel-network-configuration-tcp/ip-propertiesmUNIX: /etc/rc.configrDHCP(Dynamic Host Configuration Protocol): dynamically get address from the DHCP serverm“plug-and-play协议” (能将主机自动连接进一个网络)mC/S结构协议, four steps: DHCP server发现(主机广播发送DHCP发现报文的IP数据报,找到要与自己交互的DHCP server ); DHCP server提供(用一个携带了推荐的IP地址及租用期、网络掩码等信息的DHCP提供报文以响应); DHCP请求(用携带了配置参数的请求报文对选中的服务器进行响应、回显配置参数);DHCPACK(被选中的服务器用DHCPACK 报文进行相应,证实所要求的参数)42Network LayerIP addresses: how to get one?Q: How does network get subnet part of IP addr?A: gets allocated portion of its provider ISPs address spaceISPs block 11001000 00010111 00010000 00000000 200.23.16.0/20 Organization 0 11001000 00010111 00010000 00000000 200.23.16.0/23 Organization 1 11001000 00010111 00010010 00000000 200.23.18.0/23 Organization 2 11001000 00010111 00010100 00000000 200.23.20.0/23 . . . .Organization 7 11001000 00010111 00011110 00000000 200.23.30.0/23 43Network LayerIP addressing: the last word.Q: How does an ISP get block of addresses?A: ICANN: Internet Corporation for Assigned Names and Numbersmallocates addresses(给地区性英特网注册机构)mmanages DNSmassigns domain names, resolves disputes(分配域名,解决域名纷争)44Network LayerNAT: Network Address Translation(英特网的一个重要组件)10.0.0.110.0.0.210.0.0.310.0.0.4138.76.29.7local network(e.g., home network)10.0.0/24rest ofInternetDatagrams with source or destination in this networkhave 10.0.0/24 address for source, destination (as usual)All datagrams leaving localnetwork have same single source NAT IP address: 138.76.29.7,different source port numbers45Network LayerNAT: Network Address Translation Implementation: NAT router (NAT使能路由器) must:moutgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #) 所有离开NAT使能路由器通向Internet的报文都具有同一个源的IP地址(NAT IP address) 和new port #。. . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.mremember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pairmincoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table 所有来自Internet流向NAT使能路由器的报文都具有同一个目的IP地址(NAT IP address) 和new port #,且进入后将其置换成已存储在NAT translation table中与之配对的(source IP address, port #) 。46Network LayerNAT: Network Address Translation10.0.0.110.0.0.210.0.0.3S: 10.0.0.1, 3345D: 128.119.40.186, 80110.0.0.4138.76.29.71: host 10.0.0.1 sends datagram to 128.119.40.186, 80NAT translation tableWAN side addr LAN side addr138.76.29.7, 5001 10.0.0.1, 3345 S: 128.119.40.186, 80 D: 10.0.0.1, 33454S: 138.76.29.7, 5001D: 128.119.40.186, 8022: NAT routerchanges datagramsource addr from10.0.0.1, 3345 to138.76.29.7, 5001,updates tableS: 128.119.40.186, 80 D: 138.76.29.7, 500133: Reply arrives dest. address: 138.76.29.7, 50014: NAT routerchanges datagramdest addr from138.76.29.7, 5001 to 10.0.0.1, 3345 47Network LayerMore words about NATr16-bit port-number field: m60,000 simultaneous connections with a single LAN-side address! (NAT可支持LAN内单一IP与WAN的60000并行连接!)rNAT is controversial(争议的):mrouters should only process up to layer 3mPort number can be used only for addressing process but hostmviolates end-to-end argumentNAT possibility must be taken into account by app designers, eg, P2P applicationsmaddress shortage should instead be solved by IPv6, not by NATNAT的使用被认为是权宜之计。48Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing49Network LayerICMP: Internet Control Message Protocolrused by hosts & routers to communicate network-level informationmerror reporting(由路由器创建并发向主机): unreachable host, network, port, protocolmecho request/reply (used by pingPacket InterNet Groper 网内探测分组)rnetwork-layer “above” IP:mICMP msgs carried in IP datagrams (IP的有效载荷)rICMP message: type, code plus both header and first 8 bytes of IP datagram causing (ICMP报文包括类型,编码和引起该ICMP报文被首次生成的那个IP数据报的首部和前8个字节内容)Type Code description0 0 echo reply (回声回答,既对ping程序中 报文echo request的回 答)3 0 dest. network unreachable3 1 dest host unreachable3 2 dest protocol unreachable3 3 dest port unreachable3 6 dest network unknown3 7 dest host unknown4 0 source quench (源抑制,即拥塞控制 - not used,由TCP在传输层做)8 0 echo request (回声请求)9 0 route advertisement10 0 router discovery11 0 TTL expired (TTL过期)12 0 bad IP header (IP首部错误)50Network LayerTrace route and ICMP(Traceroute程序使用ICMP报文来实现从一台主机到世界上任意一台其他主机之间的路由跟踪)rSource sends series of UDP segments to destmFirst has TTL =1mSecond has TTL=2, etc.mUnlikely port number (每个携带了具有一个不可达的UDP端口号)rWhen nth datagram arrives to nth router:mRouter discards datagrammAnd sends to source an ICMP message (type 11, code 0)mMessage includes name of router& IP addressrWhen ICMP message arrives, source calculates RTTrTraceroute does this 3 timesStopping criterion (停止准则)rUDP segment eventually arrives at destination hostrDestination returns ICMP “dest port unreachable” packet (type 3, code 3)rWhen source host gets this ICMP, stops(即:不再发送另外的探测分组). 这样,原主机获得了位于它与目 的主机之间的路由器数量、标识 以及两主机间的RTT(每个TTL提供了3个结果)。51Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing52Network LayerIPv6rInitial motivation (动机): 32-bit address space soon to be completely allocated. rAdditional motivation:mheader format helps speed processing/forwardingmheader changes to facilitate (便于/推动) QoS rIPv6 (128-bit) datagram format: mfixed-length 40 byte header(但地址为32B,其它控制信息只有8B;IPv4 中最小时分别为:8B、12B)53Network LayerIPv6 Header (Cont)Priority: identify priority among datagrams in flow(8位)Flow Label: identify datagrams in same “flow.” (20位) (concept of“flow” not well defined,但音视频流可为其中).Next header: identify data type for upper layer protocol (6:TCP,17:UDP)or 扩展头部的类型(8位)54Network LayerOther Changes from IPv4rChecksum: removed entirely to reduce processing time at each hop (传输层和数据链路层均进行检验,而现在强调要快速处理;IPv4有是因其包含TTL字段,每台路由器上都变而需重算首部检验和,是耗时的操作)。rno fragmentation (不允许在中间路由器上进行分片,而只能在源与目的地主机上执行。如收到的数据报过大,则丢弃,并发回“分组太大”的ICMP差错报文)rOptions: allowed, but outside of header, indicated by “Next Header” fieldrICMPv6: new version of ICMPmadditional message types, e.g. “Packet Too Big”mmulticast group management functions55Network LayerTransition From IPv4 To IPv6rNot all routers can be upgraded simultaneousmno “flag days”(“设置标志日”不可行!)mHow will the network operate with mixed IPv4 and IPv6 routers? rDual-stack:节点具备发/收IPv4与IPv6数据报的能力,同时拥有两种功能。但某些字段(如流标志)信息可能丢失。rTunneling(建隧道): IPv6 carried as payload in IPv4 datagram among IPv4 routers56Network LayerTunnelingABEFIPv6IPv6IPv6IPv6tunnelLogical view:Physical view:ABEFIPv6IPv6IPv6IPv6IPv4IPv457Network LayerTunnelingABEFIPv6IPv6IPv6IPv6tunnelLogical view:Physical view:ABEFIPv6IPv6IPv6IPv6CDIPv4IPv4Flow: XSrc: ADest: FdataFlow: XSrc: ADest: FdataFlow: XSrc: ADest: FdataSrc:BDest: EFlow: XSrc: ADest: FdataSrc:BDest: EA-to-B:IPv6E-to-F:IPv6B-to-C:IPv6 insideIPv4B-to-C:IPv6 insideIPv458Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing59Network Layer1230111value in arrivingpackets headerrouting algorithmlocal forwarding tableheader value output link01000101011110013221Interplay between routing, forwarding(转发表中信息的计算、交换与配置是由运行在路由器中的选路算法完成的)60Network Layeruyxwvz2213112535Graph: G = (N,E)N = set of routers = u, v, w, x, y, z E = set of links = (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) Graph abstractionRemark: Graph abstraction is useful in other network contextsExample: P2P, where N is set of peers and E is set of TCP connections61Network LayerGraph abstraction: costsuyxwvz2213112535 c(x,x) = cost of link (x,x),且为无向图-e.g., c(w,z) = c(z, w) = 5 , (费用可为:链路长度、速率等) cost could always be 1, or inversely related to bandwidth,or inversely related to Congestion(每条链路的费用可为1,如在RIP中;或:设置与链路容量/拥塞成反比,如在OSPF中。后面将会逐步看到)Cost of path (x1, x2, x3, xp) = c(x1,x2) + c(x2,x3) + + c(xp-1,xp) Question: Whats the least-cost path between u and z ? (17条可能路径)Routing algorithm: algorithm that finds least-cost path(即“好”路径)62Network LayerRouting Algorithm classification1,Global or decentralized information?Global(全局):rall routers have complete topology(连通性), link cost info (链路费用信息)r“link state” algorithmsDecentralized(分布式): rrouter knows physically-connected neighbors, link costs to neighborsrIterative(迭代) process of computation, exchange of info with neighborsr“distance vector” algorithms2,Static or dynamic?Static: rroutes change slowly over timeDynamic: rroutes change more quicklymperiodic updatemin response to link cost changes3,Load-sensitive or insensitive? (负载敏感或迟钝?)63Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing64Network LayerA Link-State Routing AlgorithmDijkstras algorithmrnet topology, link costs known to all nodesmaccomplished via “link state broadcast” mall nodes have same inforcomputes least cost paths from one node (source”) to all other nodesmgives forwarding table for that noderiterative: after k iterations, know least cost path to k dest.sNotation:rc(x,y): link cost from node x to y; = if not direct neighborsrD(v): current value of least cost of path from source to dest. vrp(v): predecessor node along path from source to v. (前任节点,既v的邻居)rN: set of nodes whose least cost path definitively known (节点子集且v在N中)65Network LayerDijsktras Algorithm1 Initialization: 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(u,v) 6 else D(v) = 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N : 12 D(v) = min( D(v), D(w) + c(w,v) ) 13 /* new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v */ 15 until all nodes in N 66Network LayerDijkstras algorithm: example(每次在找从原节点沿最低费用路径到达当前节点的前一个节点)Step012345NuuxuxyuxyvuxyvwuxyvwzD(v),p(v)2,u2,u2,uD(w),p(w)5,u4,x3,y3,yD(x),p(x)1,uD(y),p(y)2,xD(z),p(z) 4,y4,y4,yuyxwvz221311253567Network LayerDijkstras algorithm: example (2) uyxwvzResulting shortest-path tree from u:vxywz(u,v)(u,x)(u,x)(u,x)(u,x)destinationlinkResulting forwarding table in u:68Network LayerDijkstras algorithm, discussionAlgorithm complexity: n nodes(不算源节点)reach iteration: need to check all nodes(n、n-1、n-2、1), 找出具有最低费用且not in N 中的wrn(n+1)/2 (需要搜寻的节点总数),最坏情况下的Algorithm complexity : O(n2)rmore efficient implementations possible: O(nlogn)Oscillations(振荡)possible:re.g., link cost = amount of carried trafficr链路费用是非对称的,即:两个节点之间两个方向上的流量不同ADCB11+ee0e1100ADCB2+e0001+e 1ADCB02+e1+e10 0ADCB2+e0e01+e 1initially recomputerouting recompute recompute69Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing70Network LayerDistance Vector Algorithm是一种分布式、迭代和异步的算法。分布式:每个节点从所有直接邻居接收信息和执行计算,结果回 送给邻居;迭代:此过程一直要持续到邻居之间没有信息交换为止;异步:不要求所有节点相互之间步伐一致地操作。Bellman-Ford Equation (dynamic programming)Define:dx(y) is cost of least-cost path from x to yThen:dx(y) = min c(x,v) + dv(y) where min is taken over all neighbors v of xv71Network LayerBellman-Ford example uyxwvz2213112535Clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3du(z) = min c(u,v) + dv(z), c(u,x) + dx(z), c(u,w) + dw(z) = min 2 + 5, 1 + 3, 5 + 3 = 4Node that achieves minimum is nexthop in shortest path forwarding tableB-F equation says:72Network LayerDistance Vector Algorithm (每个节点x以Dx(y) 开始,对N中的所有节点估计从它自己到节点y的最低费用路径的费用) rDx(y) = estimate of least cost from x to yrNode x knows cost to each neighbor v: c(x,v)rNode x maintains distance vector Dx = Dx(y): y N rNode x also maintains its neighbors distance vectorsmFor each neighbor v, x maintains Dv = Dv(y): y N 73Network LayerDistance vector algorithm (4):节点具有的唯一信息是它到直接相连邻居的链路费用和它从这些邻居接收到的信息.Basic idea: rEach node periodically sends its own distance vector estimate to neighborsrWhen a node x receives new DV estimate from neighbor, it updates its own DV using B-F equation:Dx(y) minvc(x,v) + Dv(y) for each node y NrUnder natural conditions, the estimate Dx(y) converge (收敛)to the actual least cost dx(y) 74Network LayerDistance Vector Algorithm (5)(当节点x 知道它的直接相连的链路费用变化了或从某个邻居接收到一个距离向量的更新时,就更新其距离向量.但x真正需要知道的不是到目的地y的最短路径,而是邻居节点到y 的最短路径,它是沿着最短路径到y的下一眺路由器.)Iterative, asynchronous:each local iteration caused by: rlocal link cost change rDV update message from neighborDistributed:reach node notifies neighbors only when its DV changesmneighbors then notify their neighbors if necessarywait for (change in local link cost or msg from neighbor)recompute estimatesif DV to any dest has changed, notify neighbors Each node:75Network Layerx y zxyz0 2 7fromcost tofromfromx y zxyz0fromcost tox y zxyz cost tox y zxyz7 10cost to2 0 1 2 0 17 1 0timexz127ynode x tablenode y tablenode z tableDx(y) = minc(x,y) + Dy(y), c(x,z) + Dz(y) = min2+0 , 7+1 = 2Dx(z) = minc(x,y) + Dy(z), c(x,z) + Dz(z) = min2+1 , 7+0 = 332 76Network Layerx y zxyz0 2 7fromcost tofromfromx y zxyz0 2 3fromcost tox y zxyz0 2 3fromcost tox y zxyz cost tox y zxyz0 2 7fromcost tox y zxyz0 2 3fromcost tox y zxyz0 2 3fromcost tox y zxyz0 2 7fromcost tox y zxyz7 10cost to2 0 1 2 0 17 1 02 0 17 1 02 0 13 1 02 0 13 1 02 0 13 1 02 0 13 1 0timexz127ynode x tablenode y tablenode z tableDx(y) = minc(x,y) + Dy(y), c(x,z) + Dz(y) = min2+0 , 7+1 = 2Dx(z) = minc(x,y) + Dy(z), c(x,z) + Dz(z) = min2+1 , 7+0 = 377Network LayerDistance Vector: link cost changesLink cost changes(大到小:good):rnode detects local link cost change rupdates routing info, recalculates distance vectorrif DV changes, notify neighbors “goodnews travelsfast”xz1450y1At time t0, y detects the link-cost change, updates its DV, and informs its neighbors.At time t1, z receives the update from y and updates its table. It computes a new least cost (费用5减为2: good news) to x and sends its neighbors its DV.At time t2, y receives zs update and updates its distance table. ys least costs do not change and hence y does not send any message to z. 该算法进入静止状态! (只需2次迭代)78Network LayerDistance Vector: link cost changesLink cost changes (小变大4 60:bad ):rbad news travels slow -44 iterations before algorithm stabilizes: (计算起于y到x的最小费用。由于节点y 仅有的信息是:它到x 的费用是60,且z上次已告诉y, z能以费用5到达x.故y试图经z回y到x的费用从7开始进入选路环路,即Y与z之间不停地反复进行报文交换,直到44次迭代后最终算出经由y的费用大于50为止,y将选到z且由z与x的直连的路经。see text) -若.:410000, 50 9999,要迭代多少次? - “count to infinity” problem!Poisoned reverse(毒性逆转): rIf Z routes through Y to get to X :mZ tells Y its (Zs) distance to X is infinite (so Y wont route to X via Z), y就永远不会试图经由z选路到x!rwill this completely solve count to infinity problem? No!(当涉及到3个以上的更多节点时)xz1450y6079Network LayerComparison of LS and DV algorithmsMessage complexityrLS: with n nodes, E links, O(nE) msgs sent rDV: exchange between neighbors onlymConvergence(收敛) time varies(如若遇到”选路环路”)Speed of ConvergencerLS: be O(n2) algorithm and requires O(nE) msgsmmay have oscillations(振荡)rDV: convergence time variesmmay be routing loopsmcount-to-infinity problemRobustness: what happens if router malfunctions?LS: mnode can advertise incorrect link costmeach node computes only its own table,既路由计算是分离的.DV:mDV node can advertise incorrect path costmeach nodes table used by others, so,error propagate thru network80Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing81Network LayerHierarchical Routingscale: with 200 million destinations:rcant store all dests in routing tables!rrouting table exchange would swamp(淹没) links! 导致没有剩余的带宽供发送数据分组使用.administrative autonomy(管理自治:将路由器组织进自治系统AS)rinternet = network of networksreach network admin may want to control routing(LS or DV) in its own networkOur routing study thus far - idealization rall routers identical(同一的、同质的)rnetwork “flat” not true in practice82Network LayerHierarchical Routingraggregate routers into regions, “autonomous systems” (AS:是一组处于相同的管理与技术控制下的路由器集合,每个AS又包含多个子网)rrouters in same AS run same routing protocolm“intra-AS” routing protocolmrouters in different AS can run different intra-AS routing protocolGateway routerrDirect link to router in another ASrAS之间都运行相同的选路协议结论:结论: 规模与管理职权的问题可通过定义自治系统AS来解决。因为一个AS内部的路由器仅需要知道本AS内的路由器,所以规模问题可得到解决;又因为一个组织可运行它选择的任何AS内部选路协议,所以管理职权问题亦可得到解决。83Network Layer3b1d3a1c2aAS3AS1AS21a2c2b1bIntra-ASRouting algorithmInter-ASRouting algorithmForwardingtable3cInterconnected ASesrForwarding table is configured by both intra- and inter-AS routing algorithmmIntra-AS sets entries for internal destsmInter-AS & Intra-As sets entries for external dests 84Network Layer3b1d3a1c2aAS3AS1AS21a2c2b1b3cInter-AS tasksrSuppose router in AS1 receives datagram for which dest is outside of AS1mRouter should forward packet towards one of the gateway routers, but which one?AS1 needs:rto learn which dests are reachable through AS2 and which through AS3rto propagate this reachability info to all routers in AS1Job of inter-AS routing!85Network LayerExample: Setting forwarding table in router 1drSuppose AS1 learns (via inter-AS protocol) that subnet x is reachable via AS3 (gateway 1c) but not via AS2.rInter-AS protocol propagates reachability info to all internal routers.rRouter 1d determines from intra-AS routing info that its interface I is on the least cost path to 1c.rPuts in forwarding table entry (x,I).3b1d3a1c2aAS3AS1AS21a2c2b1b3c86Network LayerLearn from inter-AS protocol that subnet x is reachable via multiple gatewaysUse routing infofrom intra-AS protocol to determinecosts of least-cost paths to eachof the gatewaysHot potato(尽可能快地摆脱分组)routing:Choose the gateway that has the smallest least costDetermine fromforwarding table the interface I that leads to least-cost gateway. Enter (x,I) in forwarding tableExample: Choosing among multiple ASesrNow suppose AS1 learns from the inter-AS protocol that subnet x is reachable from AS3 and from AS2.rTo configure forwarding table, router 1d must determine towards which gateway it should forward packets for dest x. rThis is also the job on inter-AS routing protocol!rHot potato routing: send packet towards closest of two routers.87Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing88Network LayerIntra-AS Routing(英特网中英特网中AS内部选路协议内部选路协议)rAlso known as Interior Gateway Protocols (IGP:内 部网关协议)rMost common Intra-AS routing protocols:mRIP: Routing Information ProtocolmOSPF: Open Shortest Path First(开放最短路径优先)mIGRP: Interior Gateway Routing Protocol (Cisco proprietary所有权) 前面学过的前面学过的LS、DV算法以及算法以及AS概念,概念,都在当今的因特网选路中发挥了重要作用都在当今的因特网选路中发挥了重要作用!89Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing90Network LayerRIP ( Routing Information Protocol)rUses Distance vector protocol (DV algorithm)rSet in the lower level ISP or intranet (enterprise net)rDistance metric(测度): # of hops (跳数,每条链路即1跳的费用为1,max = 15 hops),即从原路由器到目的子网(含目的子网)的最短路径所经过的子网数量。DCBAuvwxyzdestination hops u 1 v 2 w 2 x 3 y 3 z 2 From router A to subsets:91Network LayerRIP advertisementsrDistance vectors: exchanged among neighbors every 30 sec via Response Message (also called advertisement:通告)rEach advertisement: list of up to 25 destination nets within AS92Network LayerRIP: Example Destination Network Next Router Num. of hops to dest. wA2yB2 zB7x-1.wxyzACDBRouting table in D93Network LayerRIP: Example Destination Network Next Router Num. of hops to dest. wA2yB2 zB A7 5x-1.Routing table in D(30s后收到了来自A的信息而进行更新)wxyzACDB Dest Next hops w - 1 x - 1 z C 4 . .Advertisementfrom A to D94Network LayerRIP: Link Failure and Recovery If no advertisement heard after 180 sec,then neighbor/link declared deadmroutes via neighbor invalidatedmnew advertisements sent to neighborsmneighbors in turn send out new advertisements (if tables changed)mlink failure info quickly (?) propagates to entire netmpoison reverse used to prevent ping-pong loops (infinite distance = 16 hops)95Network LayerRIP Table processingrRIP routing tables managed by application-layer process called route-d (daemon:邮件收发的后台程序)radvertisements sent in UDP packets, periodically repeatedphysicallinknetwork forwarding (IP) tableTransprt (UDP)routedphysicallinknetwork (IP)Transprt (UDP)routedforwardingtable96Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing97Network LayerOSPF (Open Shortest Path First)r“open”: publicly available, Set in the close to top level ISP rUses Link State algorithm mLS packet disseminationmTopology map of entire AS created at each nodemRoute computation using Dijkstras algorithm(链路费用为1,或按与链路容量成反比来设置)rOSPF advertises one time per 30 Min and the advertisement carries one entry per neighbor routerrAdvertisements disseminated to entire AS (via flooding)mCarried in OSPF messages directly over IP (其报文直接承载在IP分组中,其上层协议值是89;rather than TCP or UDP) OSPF是一个使用洪泛链路状态信息的链路状态协议和一个使用Dijkstra最低费用路径算法98Network LayerOSPF “advanced” features (not in RIP)rSecurity: all OSPF messages are authenticated, so can prevent malicious intrusion (鉴别过的,意味着仅有受信任的路由器能参与一个AS内的OSPF协议:阻止恶意入侵者) rMultiple same-cost paths allowed (only one path in RIP) 这样一个大messege的不同packet可在不同path上传递rFor each link, multiple cost metrics for different TOS (e.g., satellite link cost set “low” for best effort; high for real time)rIntegrated support for uni- and multicast(单播和多播) : mMulticast OSPF (MOSPF) uses same topology data base as OSPF link (多、单播具有相同的拓扑数据库)rHierarchical OSPF in large domains(具有按层次结构构造一个AS的能力).99Network LayerHierarchical OSPF(规模很大时)100Network LayerHierarchical OSPFrTwo-level hierarchy: local area, backbone.rArea border routers: “summarize” distances to nets in own area, advertise to other Area Border routers.rBackbone routers: run OSPF routing limited to backbone.rBoundary routers: connect to other ASs.101Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing102Network LayerInternet inter-AS routing: BGPrBGP (Border Gateway Protocol): the de facto(事实上) standardrBGP provides each AS a means to:1.Obtain subnet reachability information from neighboring ASs.2.Propagate reachability information to all routers within AS.3.Determine “good” routes to subnets based on reachability information and policy.rallows each subnet to advertise its existence to rest of Internet: “I am here”(避免了子网的隔离与孤独存在)BGP是Internet中绝对重要的协议,是它将所有的东西黏合在一起了。实现起来极其复杂!103Network LayerBGP basicsrPairs of routers (BGP peers) exchange routing info over semi-permanent TCP connections(179号端口,如:3a与1c间,1b与2a间,各有一条这样的连接): BGP sessions( eBGP session和iBGP session:外、内部)mBGP sessions need not correspond (对应)to physical links.rWhen AS2 advertises a prefix(一个子网或一个子网的集合) to AS1, AS2 is promising it will forward any datagrams destined to that prefix(详见P253的例子).mAS2 can aggregate(聚合) prefixes in its advertisement3b1d3a1c2aAS3AS1AS21a2c2b1b3ceBGP sessioniBGP session104Network LayerDistributing reachability inforWith eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1.r1c can then use iBGP do distribute this new prefix reach info to all routers in AS1r1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP sessionrWhen router learns of new prefix, creates a entry for prefix in its forwarding table.3b1d3a1c2aAS3AS1AS21a2c2b1b3ceBGP sessioniBGP session105Network LayerPath attributes & BGP routesrWhen advertising a prefix, advert includes BGP attributes. mprefix + attributes = “route”(带有属性的前缀被称为一条路由)rTwo important attributes:mAS-PATH: contains ASs (No.), eg. AS 67 AS 17 :当一个前缀传送到一个AS时,该AS将它的号码增加到AS-PATH属性中. (路由器使用该属性来检测和防止循环通告;特别是当它的AS已包含在该路径的列表中时,它将拒绝该通告。)mNEXT-HOP: Indicates specific internal-AS router to next-hop AS. (是一个开始某AS-PATH的路由器接口的IP地址。There may be multiple links from current AS to next-hop-AS.)rWhen gateway router receives route advertisement, uses import policy(输入策略) to accept/decline (可能过滤掉一条路由,是因为该AS可能不想通过该路由的AS-PATH中的某个AS发送流量,或因为它已经知道了到相同前缀的偏好路由).106Network LayerBGP route selectionrRouter may learn about more than 1 route to some prefix. Router must select route.rElimination(消除) rules:1.Local preference(偏好)value attribute: policy decision(策略决定,具有最高本地偏好值的路由将被选择)2.本地偏好值相同时,将选Shortest AS-PATH (DV算法,测度用AS的跳数而非路由器跳数)3.1和2都相同时,将选Closest NEXT-HOP router(最靠近是指费用最低的路由器): hot potato routing4.Additional criteria (如使用BGP标识符)107Network LayerBGP routing policy(note:A,B,C,X,Y and W are AS,not routers)rA,B,C are provider networks 且彼此对等rX,W,Y are customer (of provider networks) and stub networks(是桩网络:仅承载源地址或目的地址为本AS的流量)rX is dual-homed(attached to two networks)且是桩网络,故:mX does not want to route from B via X to Cm. so X will not advertise to B a route to C108Network LayerBGP routing policy (2)rA advertises to B the path AW rB advertises to X the path BAW (因为X是B的客户)rShould B advertise to C the path BAW?mNo way! B gets no “revenue” (免费)for routing CBAW since neither W nor C are Bs customers mB wants to route only to/from its customers!遵从的经验法则:任何穿越某ISP主干网的流量必须是“或其源或其目的(或两者)位于该ISP的某个客户网络中,否则这些流量将会免费搭乘该ISP的网络。”109Network LayerWhy different Intra- and Inter-AS routing ? Policy: rInter-AS: admin wants control over how its traffic routed, who routes through its net. rIntra-AS: single admin, so no policy decisions neededScale:rhierarchical routing saves table size, reduced update trafficPerformance: rIntra-AS: can focus on performancerInter-AS: policy may dominate(支配) over performance110Network LayerChapter 4: Network Layerr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing111Network LayerR1R2R3R4sourceduplicationR1R2R3R4in-networkduplicationduplicatecreation/transmissionduplicateduplicateBroadcast Routing(地址为全1)rDeliver packets from source to all other nodesrSource duplication is inefficient(R1到R2链路带宽资源浪费):rSource duplication: how does source determine recipient addresses? (可以做到,但将增加更多的开销,并使简单的协议复杂化!)112Network LayerIn-network duplicationrFlooding: when node receives brdcst pckt, sends copy to all neighborsmProblems: cycles & broadcast storm(剧增大量的冗余分组和产生光斑风暴)rControlled flooding: node only brdcsts pkt if it hasnt brdcst same packet beforem Node keeps track of pckt ids already brdcsted(每个节点维护它已经收到、复制和转发的源地址和序号列表,当节点收到分组已在列表中,则丢弃;否则复制转发:在应用层进行)m Or reverse path forwarding (RPF): only forward pckt if it arrived on shortest path between node and source(仅当该分组到达的链路正好是位于它自己到其源的最短单播路径上才传输,否则丢弃)rSpanning tree:如果图上每段链路具有相应费用且一棵树的费用就是其链路费用之和,则该图中所有生成树中费用最小的生成树称为最小生成树。m对网络节点构造出一棵生成树m源节点仅向属于该生成树的特定链路发送分组m接收节点也只向生成树中除接收该分组的邻居以外的其他邻居转发分组No redundant packets received by any node113Network LayerABGDEcFABGDEcF(a) Broadcast initiated at A(b) Broadcast initiated at DSpanning TreerFirst construct a spanning treerNodes forward copies only along spanning tree(树中不包括即不能出现Cycles)114Network LayerABGDEcF12345(a)Stepwise construction of spanning treeABGDEcF(b) Constructed spanning treeSpanning Tree: Creationr定义 Center node(如E为中心节点)rEach node sends unicast join message to center node(假定加入树并向E转发加入树报文的顺序为:F、B(通过D)、A、C、G (通过D) )mMessage forwarded until it arrives at a node already belonging to spanning tree115Network LayerMulticast Routing: Problem StatementQ:怎样标识多播分组的接收方?怎样为发送到这些接收方的分组编址?A:多播数据报使用间接地址来编址,即用一个标识来表示一组接收方。 具体实现:由IGMP(互联网组管理协议)和多路选路协议(算法)完成。rGoal: find a tree (or trees) connecting routers having local mcast group(与一个D类地址相关联的接收方组,因Internet中表示一组接收方的单一标识就是一个D类多播地址:1110) members mtree: not all paths between routers usedmsource-based: different tree from each sender to rcvrsmshared-tree: same tree used by all group membersShared treeSource-based treesApproaches for building mcast treesApproaches:rsource-based tree: one tree per sourcemshortest path treesmreverse path forwarding会出现无相连主机要加入多播的路由器收到了多播分组的现象,要进行剪枝(pruning):向上游路由器发送一个剪枝报文。rgroup-shared tree: group uses one treemminimal spanning (Steiner斯泰纳) mcenter-based trees加入报文使用单播选路朝着中心转发,直到到达已经属于多播树的一台路由器或中心;沿加入报文路经上的所有路由器将向发起该多播加入的边缘路由器转发接收到的多播分组。we first look at basic approaches, then specific protocols adopting these approachesShortest Path Treermcast forwarding tree: tree of shortest path routes from source to all receiversmDijkstras algorithmR1R2R3R4R5R6R7216345irouter with attachedgroup memberrouter with no attachedgroup memberlink used for forwarding,i indicates order linkadded by algorithmLEGENDS: sourceReverse Path Forwardingif (mcast datagram received on incoming link on shortest path back to center) then flood datagram onto all outgoing links else ignore datagramqrely on routers knowledge of unicast shortest path from it to senderqeach router has simple forwarding behavior:Reverse Path Forwarding: exampleresult is a source-specific reverse SPTmay be a bad choice with asymmetri(非对称的) linksR1R2R3R4R5R6R7router with attachedgroup memberrouter with no attachedgroup memberdatagram will be forwardedLEGENDS: sourcedatagram will not be forwardedReverse Path Forwarding: pruningrforwarding tree contains subtrees with no mcast group membersmno need to forward datagrams down subtreem“prune” msgs sent upstream by router with no downstream group membersR1R2R3R4R5R6R7router with attachedgroup memberrouter with no attachedgroup memberprune messageLEGENDS: sourcelinks with multicastforwardingPPPShared-Tree: Steiner TreerSteiner(斯泰纳)Tree: minimum cost tree connecting all routers with attached group membersrproblem is NP-completernot used in practice:mcomputational complexityminformation about entire network neededmmonolithic: rerun whenever a router needs to join/leaveCenter-based treesrsingle delivery tree shared by allrone router identified as “center” of treerto join:medge router sends unicast join-msg addressed to center routermjoin-msg “processed” by intermediate routers and forwarded towards centermjoin-msg either hits existing tree branch for this center, or arrives at centermpath taken by join-msg becomes new branch of tree for this routerCenter-based trees: an exampleSuppose R6 chosen as center:R1R2R3R4R5R6R7router with attachedgroup memberrouter with no attachedgroup memberpath order in which join messages generatedLEGEND21311,Internet Multicasting Routing: DVMRP实现了一个具有反向路经转发与剪枝算法(RPF)的基于源的树。rDVMRP: distance vector multicast routing protocol, RFC1075rflood and prune: reverse path forwarding, source-based treemRPF tree based on DVMRPs own routing tables constructed by communicating DVMRP routers mno assumptions about underlying unicastminitial datagram to mcast group flooded everywhere via RPFmrouters not wanting group: send upstream prune msgsTunnelingQ: How to connect “islands” of multicast routers in a “sea” of unicast routers? qmcast datagram encapsulated inside “normal” (non-multicast-addressed) datagramqnormal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast routerqreceiving mcast router unencapsulates to get mcast datagramphysical topologylogical topology2,Protocol Independent Multicast : PIMrnot dependent on any specific underlying unicast routing algorithm (works with all)rtwo different multicast distribution scenarios :Dense(稠密模式):qgroup members densely packed, in “close” proximity.qbandwidth more plentifulSparse(稀疏模式):q# networks with group members small wrt # interconnected networksqgroup members “widely dispersed”qbandwidth not plentifulChapter 4: summaryr4. 1 Introductionr4.2 Virtual circuit and datagram networksr4.3 Whats inside a routerr4.4 IP: Internet ProtocolmDatagram formatmIPv4 addressingmICMPmIPv6r4.5 Routing algorithmsmLink statemDistance VectormHierarchical routingr4.6 Routing in the InternetmRIPmOSPFmBGPr4.7 Broadcast and multicast routing128Network Layer
收藏 下载该资源
网站客服QQ:2055934822
金锄头文库版权所有
经营许可证:蜀ICP备13022795号 | 川公网安备 51140202000112号