Related protocols:
1) IEEE 802.1ag Connectivity Fault Management (CFM)
2) IEEE 802.3ah Ethernet in the First Mile (EFM)
3) ITU-T Y.1731 Ethernet OAM (Operation, Administration and Maintenance)
802.3ah
EFM OAM operates at the data link layer; its protocol frames are called OAMPDUs.
The frame transmission rate is limited to a maximum of 10 frames per second; therefore, the impact of OAM on normal operations is negligible.
a) Destination Address (DA). The DA in OAMPDUs is the Slow_Protocols_Multicast address, 01-80-C2-00-00-02. Slow protocol frames are never forwarded by bridges, so EFM OAM frames cannot travel more than one hop, regardless of whether the receiving device supports OAM or has OAM enabled.
b) Source Address (SA). The SA in OAMPDUs carries the individual MAC address associated with the port through which the OAMPDU is transmitted.
c) Length/Type. OAMPDUs carry the Slow Protocols Type value 0x8809.
d) Subtype. The Subtype field identifies the specific Slow Protocol being encapsulated. OAMPDUs carry the Subtype value 0x03.
e) Flags. The Flags field contains status bits.
Additional diagnostic information may be sent using the Event Notification OAMPDU.
Main flags (bit position, name):
--2 Critical Event
----1 = A critical event has occurred.
----0 = A critical event has not occurred.
--1 Dying Gasp
----1 = An unrecoverable local failure condition has occurred.
----0 = An unrecoverable local failure condition has not occurred.
--0 Link Fault
The PHY has detected that a fault has occurred in the receive direction of the local DTE (e.g., link, Physical layer).
----1 = Local device's receive path has detected a fault.
----0 = Local device's receive path has not detected a fault.
f) Code.
The Code field identifies the specific OAMPDU.
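--0x00 Information OAMPDU
Also known as the heartbeat frame.
Used to exchange state information between the local and remote OAM entities (including Local Information TLVs, Remote Information TLVs, and Organization Specific Information TLVs).
--0x04 Loopback Control OAMPDU
Used to check link quality and localize link faults. The frame carries an enable/disable command that turns remote loopback on or off.
==============================================
Command / Description
0x01 / Enable OAM Remote Loopback
0x02 / Disable OAM Remote Loopback
0x00, 0x03-0xFF / Reserved. Shall not be transmitted; should be ignored on reception by the OAM client.
==============================================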
g) Data/Pad.
This field contains the OAMPDU data and any necessary pad. Implementations shall support OAMPDUs at least minFrameSize in length.
h) FCS.
This field is the Frame Check Sequence
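The field layout a) to h) can be illustrated with a small parser. This is a minimal sketch, not a reference implementation: the function name `parse_oampdu_header` is hypothetical, the input is assumed to be an untagged frame without its FCS, and the field widths used (2-octet Length/Type, 1-octet Subtype, 2-octet Flags, 1-octet Code) are assumptions about the 802.3ah layout that the notes above do not spell out.

```python
import struct

SLOW_PROTOCOLS_DA = bytes.fromhex("0180c2000002")  # Slow_Protocols_Multicast
SLOW_PROTOCOLS_TYPE = 0x8809
OAM_SUBTYPE = 0x03
# Only the two codes described in these notes; other values are shown as hex.
OAM_CODES = {0x00: "Information", 0x04: "Loopback Control"}


def parse_oampdu_header(frame: bytes) -> dict:
    """Decode DA, SA, Length/Type, Subtype, Flags and Code (hypothetical helper)."""
    if len(frame) < 18:
        raise ValueError("frame too short for an OAMPDU header")
    da, sa = frame[0:6], frame[6:12]
    # Assumed widths: Length/Type 2 octets, Subtype 1, Flags 2, Code 1.
    eth_type, subtype, flags, code = struct.unpack("!HBHB", frame[12:18])
    if da != SLOW_PROTOCOLS_DA or eth_type != SLOW_PROTOCOLS_TYPE:
        raise ValueError("not a Slow Protocols frame")
    if subtype != OAM_SUBTYPE:
        raise ValueError("Slow Protocols frame, but not an OAMPDU")
    return {
        "sa": "-".join(f"{b:02x}" for b in sa),
        "link_fault": bool(flags & 0x0001),      # bit 0
        "dying_gasp": bool(flags & 0x0002),      # bit 1
        "critical_event": bool(flags & 0x0004),  # bit 2
        "code": OAM_CODES.get(code, hex(code)),
    }
```

Checking the DA and Length/Type before the Subtype mirrors the order in which a receiver can rule out frames that belong to other Slow Protocols.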
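Ethernet OAM functions are built on top of an Ethernet OAM connection.
Establishing that connection is called the Discovery phase: the local OAM entity discovers the remote OAM entity and sets up a stable session with it.
During Discovery, the connected OAM entities exchange Information OAMPDUs to advertise their Ethernet OAM configuration and the Ethernet OAM capabilities they support. After receiving the peer's configuration parameters, each OAM entity decides whether to establish the OAM connection.
The following configuration information is exchanged: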
• OAM mode
The mode can be either active or passive and can be used to determine device functionality.
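Ethernet OAM has two connection modes: active and passive.
Only an OAM entity in active mode can initiate an Ethernet OAM connection; an entity in passive mode can only wait for a connection request from its peer.
Two OAM entities that are both in passive mode cannot establish an Ethernet OAM connection.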
• OAM configuration (capabilities)
Advertises the capabilities of the local OAM entity. With this information a peer can determine which functions are supported and accessible; for example, loopback capability.
• OAMPDU configuration
Includes the maximum OAMPDU size for receipt and delivery. This information along with the rate limiting of 10 frames per second can be used to limit the bandwidth allocated to OAM traffic.
• Platform identity
A combination of an organizationally unique identifier (OUI) and 32 bits of vendor-specific information. OUIs are allocated by the IEEE; an OUI is typically the first three bytes of a MAC address.
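Two of the Discovery rules lend themselves to a short illustration: the active/passive rule and the bandwidth ceiling implied by the 10 frames per second limit. The sketch below uses hypothetical function names, and the bandwidth figure ignores preamble and inter-frame gap, so treat it as a rough upper bound rather than an exact value.

```python
VALID_MODES = {"active", "passive"}


def can_establish(local_mode: str, remote_mode: str) -> bool:
    """An OAM connection needs at least one active end to initiate Discovery."""
    if local_mode not in VALID_MODES or remote_mode not in VALID_MODES:
        raise ValueError("mode must be 'active' or 'passive'")
    return "active" in (local_mode, remote_mode)


def max_oam_bandwidth_bps(max_oampdu_size: int, frames_per_second: int = 10) -> int:
    """Rough upper bound on OAM traffic, in bits per second.

    Assumption: only the frame octets are counted (no preamble or gap).
    """
    return max_oampdu_size * 8 * frames_per_second
```

For example, with a maximum OAMPDU size of 1518 octets the bound is 1518 * 8 * 10 = 121440 bit/s, which is why the text can call the impact of OAM on normal traffic negligible.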
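2) Fault detection and alarms: probe frames are sent to check link connectivity, and the network administrator is notified promptly when a link fails.
Link monitoring detects and discovers link-layer faults under a variety of conditions; it uses Event Notification OAMPDUs to monitor the link.
When an OAM entity detects an ordinary link event, it sends an Event Notification OAMPDU to its peer to report it.
• Link Event TLVs
There are four main Event Types.
--0x01 Errored Symbol Period Event: the number of symbol errors per unit time exceeds the defined threshold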
Counts the number of symbol errors that occurred during the specified period. The period is specified by the number of symbols that can be received in a time interval on the underlying physical layer.
==============================================
Default / lower bound / upper bound
Window Size: number of symbols that can be received in 1 second / in 1 second / in 1 minute
Threshold: 1 symbol error / 0 symbol errors / unspecified
==============================================
--0x02 Errored Frame Event: the number of errored frames per unit time exceeds the defined threshold
Counts the number of errored frames detected during the specified period. The period is specified by a time interval.
==============================================
Default / lower bound / upper bound
Window Size: 1 second / 1 second / 1 minute
Threshold: 1 frame error / 0 frame error / unspecified
==============================================
--0x03 Errored Frame Period Event: the period is a number of frames N; the number of errored frames among N received frames exceeds the defined threshold
Counts the number of errored frames detected during the specified period. The period is specified by a number of received frames.
==============================================
Default / lower bound / upper bound
Window Size: number of minFrameSize frames that can be received in 1 second / in 100 ms / in 1 minute
Threshold: 1 frame error / 0 frame error / unspecified
==============================================
--0x04 Errored Frame Seconds Event: within a period of M seconds, the number of seconds that contained errored frames exceeds the defined threshold
Counts the number of errored frame seconds that occurred during the specified period. The period is specified by a time interval.
==============================================
Default / lower bound / upper bound
Window Size: 60 seconds / 10 seconds / 900 seconds
Threshold: 1 errored second / 0 errored seconds / unspecified
==============================================
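As an illustration of how a window and a threshold combine, here is a minimal sketch of the Errored Frame Seconds check using the default window (60 seconds) and threshold (1 errored second) from the table above. The function names are hypothetical, and the comparison treats reaching the threshold as sufficient (`>=`); that is an assumption, so switch to a strict comparison if your implementation requires the count to exceed the threshold.

```python
from typing import Sequence


def errored_frame_seconds(errors_per_second: Sequence[int]) -> int:
    """Count the one-second intervals that saw at least one errored frame."""
    return sum(1 for count in errors_per_second if count > 0)


def should_notify(errors_per_second: Sequence[int],
                  window: int = 60, threshold: int = 1) -> bool:
    """Decide whether an Event Notification OAMPDU would be sent.

    errors_per_second holds one errored-frame count per elapsed second,
    oldest first; only the last `window` entries are evaluated.
    """
    recent = errors_per_second[-window:]
    # Assumption: meeting the threshold is enough to raise the event.
    return errored_frame_seconds(recent) >= threshold
```

Note that a burst of many errored frames inside one second still counts as a single errored second, which is what distinguishes this event from the Errored Frame Event.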
802.1ag
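CFM introduces the following concepts:
• Maintenance Domain (MD)
An MD identifies the part of the network covered by connectivity fault detection. Its boundary is defined by a set of maintenance end points configured on ports. An MD is identified by its MD name.
MDs have eight levels, numbered 0 to 7. A larger number means a higher level and a wider MD scope.
MDs may be adjacent or nested but must not overlap, and when nested, only a higher-level MD may contain a lower-level MD.
CFM PDUs from a lower-level MD are dropped when they enter a higher-level MD; CFM PDUs from a higher-level MD can pass through a lower-level MD; CFM PDUs cannot pass between MDs of the same level.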
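• Maintenance Association (MA)
Multiple MAs can be configured within an MD as needed. Each MA is a set of maintenance points within the MD.
An MA is identified by MD name + MA name. A maintenance point in an MA can receive frames sent by the other maintenance points in the same MA.
One MA can serve multiple VLANs, but different MAs in the same MD cannot share a VLAN.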
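• Maintenance Point (MP)
MPs are configured on ports and belong to an MA. There are two kinds: maintenance end points (MEPs) and maintenance intermediate points (MIPs).
1) MEP: Maintenance association End Point
A MEP is identified by an integer called the MEP ID (range 1 - 8191), which is unique within an MA.
MEPs define the scope and boundary of the MD. The MA and MD that a MEP belongs to determine the VLAN attribute and level of the frames it sends.
A MEP's level determines the levels of the frames it can process; frames sent by a MEP carry the MEP's own level.
When a MEP receives a frame with a level higher than its own, it forwards the frame along its original path. When it receives a frame with a level lower than or equal to its own, it does not forward the frame, which keeps frames from a lower-level MD from leaking into a higher-level MD.
MEPs are directional: outward-facing (DOWN) MEPs and inward-facing (UP) MEPs.
A MEP's direction indicates where the MD lies relative to the port. An outward-facing MEP sends frames out through the port it is configured on; an inward-facing MEP does not send frames out through its own port but through the other ports of the device.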
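2) MIP: Maintenance domain Intermediate Point
MIPs sit inside the MD. They cannot originate CFM protocol frames, but they can respond to LBM and LTM frames.
The MA and MD that a MIP belongs to determine the VLAN attribute and level of the frames it receives.
Together with MEPs, MIPs provide functions similar to ping and tracert. As with MEPs, when a MIP receives a frame with a level higher than its own, it does not process the frame but forwards it along its original path; it processes a frame only when the frame's level is lower than or equal to its own.
MIPs are computed by the system on each port according to a creation rule. Choose the rule that fits your network plan.
By default, no MIPs are configured on the device.
If MIPs are planned on all ports in the MD, choose the default rule.
If MIPs are planned only where a lower-level MD has MEPs, choose the explicit rule.
The MEP list is the set of local MEPs allowed to be configured and remote MEPs to be monitored within one MA. It limits which MEPs can be chosen in the MA: all MEPs of the same MA on different devices should be included in the list, and their MEP IDs must not repeat. If a MEP receives a CCM from a remote device and the MEP ID carried in it is not in the MEP list of the same MA, it drops the frame.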
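The level-handling rules for MEPs and MIPs and the MEP list check can be condensed into a few lines. This is a simplified sketch with hypothetical function names and return labels; it models only the decisions described in the notes and ignores VLAN matching and MEP direction.

```python
from typing import Iterable


def mep_action(frame_level: int, mep_level: int) -> str:
    """A MEP forwards higher-level frames and stops everything else."""
    return "forward" if frame_level > mep_level else "terminate"


def mip_action(frame_level: int, mip_level: int) -> str:
    """A MIP forwards higher-level frames untouched and processes the rest."""
    return "forward" if frame_level > mip_level else "process"


def accept_ccm(sender_mep_id: int, mep_list: Iterable[int]) -> bool:
    """Accept a CCM only if its MEP ID is valid and in the MA's MEP list."""
    if not 1 <= sender_mep_id <= 8191:
        return False
    return sender_mep_id in set(mep_list)
```

For instance, a level-5 frame arriving at a level-3 MEP is forwarded, while a level-2 frame arriving at the same MEP is stopped there and never reaches the higher-level domain.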
MEPs may monitor either all frames or a set of VLANs. You may configure a virtual switch as either VLAN-aware or VLAN-unaware to recognize or not recognize VLAN tagged frames or packets when they are delivered from another virtual switch in the network.
• VLAN-Aware
Maintenance Entities, such as MAs and MEPs, monitor a VLAN or set of VLANs that are associated with a Primary VLAN ID. These Maintenance Entities protect only the VLAN or set of VLANs with which they are associated.
• VLAN-Unaware
Maintenance Entities, such as MAs and MEPs, monitor all data frames passing through a port, regardless of using VLANs. If the Maintenance Entity is VLAN-Unaware, do not specify the VLAN parameter in the configuration commands.
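The CFM PDU format is as follows:
a) MD level
The MD level, in the range 0 to 7; a larger value means a higher level.
b) Version
Protocol version number; the value is 0.
c) OpCode
Message code. Different values indicate different CFM PDU types; the common CFM PDUs are listed below.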
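=======================================================
OpCode / PDU type / Destination MAC address / Purpose
--------------------------------------------------------
0x01 / CCM PDU / 01-80-C2-00-00-3x (multicast) / Continuity check; can be sent by any MEP
0x02 / LBR PDU / MAC of the loopback initiator (unicast) / Loopback; sent in reply by the loopback responder
0x03 / LBM PDU / MAC of the loopback target (unicast) / Loopback; sent by the loopback initiator
0x04 / LTR PDU / MAC of the linktrace initiator (unicast) / Linktrace; sent in reply by the linktrace responder
0x05 / LTM PDU / 01-80-C2-00-00-3y (multicast) / Linktrace; sent by the linktrace initiator
0, 6-31 / Reserved for IEEE 802.1
32-63 / Defined by ITU-T Y.1731
33 / AIS
35 / LCK
37 / TST
39 / APS
41 / MCC
43 / LMM
42 / LMR
45 / 1DM
47 / DMM
46 / DMR
64-255 / Reserved for IEEE 802.1
========================================================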
Values of x and y in the destination MAC address:
MD level / x / y
7 / 7 / F
6 / 6 / E
5 / 5 / D
4 / 4 / C
3 / 3 / B
2 / 2 / A
1 / 1 / 9
0 / 0 / 8
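The x/y table follows a simple pattern: x equals the MD level and y equals the MD level plus 8. A minimal sketch of that mapping, with hypothetical function names, is:

```python
def _check_level(md_level: int) -> None:
    if not 0 <= md_level <= 7:
        raise ValueError("MD level must be in the range 0 to 7")


def ccm_group_mac(md_level: int) -> str:
    """Multicast destination for CCMs: last nibble x = MD level."""
    _check_level(md_level)
    return "01-80-C2-00-00-3%X" % md_level


def ltm_group_mac(md_level: int) -> str:
    """Multicast destination for LTMs: last nibble y = MD level + 8."""
    _check_level(md_level)
    return "01-80-C2-00-00-3%X" % (md_level + 8)
```

So a CCM at MD level 7 is sent to 01-80-C2-00-00-37, and an LTM at the same level is sent to 01-80-C2-00-00-3F, matching the first row of the table.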
CCM: Continuity Check Message
LBM: Loopback Message
LBR: Loopback Reply
LT: Linktrace
LTM: Linktrace Message
LTR: Linktrace Reply
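d) Flags
The Flags field; its meaning differs by CFM PDU type.
e) Varies with value of OpCode
--Sequence number: the initial value is random; it is incremented by 1 each time the MEP sends a CCM PDU.
--Loopback transaction ID / LTM/LTR transaction ID: the initial value is 0; it is incremented by 1 each time the MEP sends an LBM or LTM PDU. The LBR or LTR sent in reply carries the transaction ID of the message it answers.
f) TLV (Type, Length, Value)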
TLV stands for Type, Length, Value and denotes a method of encoding variable-length and/or optional information in a PDU. TLVs are not aligned to any particular word or octet boundary. TLVs follow each other with no padding between TLVs.
==============================================
TLV or organization / Type field
End TLV / 0 (in the End TLV the type is 0 and the Length and Value fields are not used)
Sender ID TLV / 1
Port Status TLV / 2
Data TLV / 3
Interface Status TLV / 4
Reply Ingress TLV / 5
Reply Egress TLV / 6
LTM Egress Identifier TLV / 7
LTR Egress Identifier TLV / 8
Reserved for IEEE 802.1 / 9-30, 64-255
Defined by ITU-T Y.1731 / 32-63 (32 = Test TLV; 33-63 reserved)
Organization-Specific TLV / 31
==============================================
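To make the TLV encoding concrete, the sketch below walks a TLV area and stops at the End TLV. It is a minimal sketch under one stated assumption: a 1-octet Type followed by a 2-octet big-endian Length, which is how CFM TLVs are commonly described but is not stated in the notes above. The function name `parse_tlvs` is hypothetical.

```python
TLV_NAMES = {
    1: "Sender ID", 2: "Port Status", 3: "Data", 4: "Interface Status",
    5: "Reply Ingress", 6: "Reply Egress", 7: "LTM Egress Identifier",
    8: "LTR Egress Identifier", 31: "Organization-Specific",
}


def parse_tlvs(data: bytes) -> list:
    """Return (type, name, value) tuples until the End TLV (type 0)."""
    tlvs = []
    offset = 0
    while offset < len(data):
        tlv_type = data[offset]
        if tlv_type == 0:
            break  # End TLV: Length and Value fields are not used
        if offset + 3 > len(data):
            raise ValueError("truncated TLV header")
        # Assumption: Length is 2 octets, big-endian, counting only the Value.
        length = int.from_bytes(data[offset + 1:offset + 3], "big")
        value = data[offset + 3:offset + 3 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, TLV_NAMES.get(tlv_type, "unknown"), value))
        offset += 3 + length  # no padding between TLVs
    return tlvs
```

Because TLVs are packed back to back with no padding, the only way to find the next TLV is to trust the current Length field, which is why the sketch validates it before advancing.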
3. Linktrace
Linktrace determines the path from the source to a target MEP. The source sends an LTM (Linktrace Message) toward the target MEP; the target MEP and every MIP the LTM passes through reply to the source with an LTR (Linktrace Reply); the source then works out the path to the target MEP from the replies it receives. LTMs are multicast frames and LTRs are unicast frames.
• LTM PDU
The Flags field uses only bit 8 (UseFDBonly); all other bits are set to 0.
If UseFDBonly is set, only MAC addresses learned in a Bridge's Filtering Database, and not information saved in the MIP CCM Database, are to be used to determine the Egress Port.
The Additional LTM TLVs of an LTM PDU must include:
--LTM Egress Identifier TLV (type=7)
and may include:
--Sender ID TLV (type=1)
--Organization-Specific TLV (type=31)
• LTR PDU
Reply TTL (1 octet): one less than the value of the LTM TTL field in the LTM that triggered the transmission of this LTR. If the LTM TTL field contained 0, no LTR is transmitted.
The Flags field is defined as follows:
==================================
Mnemonic / Meaning / Bit
UseFDBonly / Copied from LTM. / 8 (MSB)
FwdYes / The LTM was (1) or was not (0) forwarded. / 7
TerminalMEP / The MP reported in the Reply Egress TLV (or the Reply Ingress TLV, if the Reply Egress TLV is not present) is a MEP. / 6
Reserved / Copied from LTM. / 5 - 1
==================================
The Additional LTR TLVs of an LTR PDU must include:
--LTR Egress Identifier TLV (type=8)
--Reply Ingress TLV (type=5), Reply Egress TLV (type=6), or both
and may include:
--Sender ID TLV (type=1)
--Organization-Specific TLV (type=31)
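The LTR Flags bits and the Reply TTL rule can be sketched as follows. The sketch assumes the usual convention that bit 8 (MSB) of the 1-octet Flags field has mask 0x80, bit 7 has 0x40 and bit 6 has 0x20; the function names are hypothetical.

```python
from typing import Optional

USE_FDB_ONLY = 0x80   # bit 8 (MSB)
FWD_YES = 0x40        # bit 7
TERMINAL_MEP = 0x20   # bit 6


def decode_ltr_flags(flags: int) -> dict:
    """Split the LTR Flags octet into its three named bits."""
    return {
        "UseFDBonly": bool(flags & USE_FDB_ONLY),
        "FwdYes": bool(flags & FWD_YES),
        "TerminalMEP": bool(flags & TERMINAL_MEP),
    }


def reply_ttl(ltm_ttl: int) -> Optional[int]:
    """Reply TTL is the LTM TTL minus 1; an LTM TTL of 0 yields no LTR."""
    if ltm_ttl == 0:
        return None
    return ltm_ttl - 1
```

Sorting the received LTRs by Reply TTL in descending order gives the hops in path order, since each hop along the way sees a smaller LTM TTL than the one before it.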
________________________________________
Y.1731
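The CFM part of Y.1731 is essentially the same as 802.1ag; only some of the terminology differs.
=====================
Y.1731 term / 802.1ag term
MEG / MA
MEG ID / MAID
MEG level / MD level
=====================
In addition, Y.1731 extends the CFM part with functions such as ETH-AIS and ETH-LCK.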
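4. ETH-AIS (Ethernet Alarm Indication Signal)
Alarm suppression reduces the number of fault alarms reported. If a MEP does not receive a CCM from a remote MEP within 3.5 CCM transmission intervals, it starts sending AIS (Alarm Indication Signal) frames periodically, in the direction opposite to the CCMs. A MEP that receives an AIS frame suppresses its local fault alarms and continues to send AIS frames. Once a MEP receives CCMs again within 3.5 CCM transmission intervals, it stops sending AIS frames. AIS frames are multicast frames.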
The alarm indication signal suppresses alarms at the client-layer MEPs after detecting a fault or an AIS condition at the server-layer MEP. The operator can enable or disable the AIS functionality for a MEP. If AIS is enabled and a fault or AIS condition is detected at a MEP, AIS frames are transmitted to the client-layer MEPs in the direction opposite to the peer MEP. On receiving the AIS frame, the client-layer MEP suppresses its fault alarms.
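The 3.5-interval rule amounts to a simple timeout check. The following is a minimal sketch with hypothetical names; times are plain numbers in seconds, and a real implementation would drive this from timers rather than polling.

```python
def loss_of_continuity(last_ccm_rx: float, now: float, ccm_interval: float) -> bool:
    """True when no CCM has arrived for more than 3.5 CCM intervals."""
    return (now - last_ccm_rx) > 3.5 * ccm_interval


def ais_action(last_ccm_rx: float, now: float, ccm_interval: float,
               ais_enabled: bool = True) -> str:
    """Return 'send_ais' while the fault persists and AIS is enabled, else 'idle'."""
    if ais_enabled and loss_of_continuity(last_ccm_rx, now, ccm_interval):
        return "send_ais"
    return "idle"
```

With a 1-second CCM interval, a MEP that last heard from its peer at t = 0 stays idle at t = 3.0 and starts sending AIS by t = 4.0; as soon as a new CCM updates `last_ccm_rx`, the same check returns to idle.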
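5. ETH-LCK (Ethernet Locked Signal)
The Ethernet Locked Signal function (ETH-LCK) communicates the administrative locking of a server (sub-)layer MEP and the resulting interruption of the data traffic destined for the MEP that expects to receive it. It lets a MEP that receives frames with ETH-LCK information distinguish between a fault condition and an administrative locking action at the server (sub-)layer MEP.
Y.1731 also adds PM (Performance Monitoring) functions.
The OAM functions for performance monitoring measure different performance parameters. The performance parameters are defined for point-to-point ETH connections.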
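1. ETH-LM (frame loss measurement)
ETH-LM collects counter values for ingress and egress service frames; the counters maintain a count of the data frames transmitted and received between a pair of MEPs.
ETH-LM can be performed in two ways:
• Single-ended ETH-LM
The source sends an LMM (Loss Measurement Message) to the target MEP, and the target MEP replies to the source with an LMR (Loss Measurement Reply). The source computes the frame loss between itself and the target MEP from two consecutive LMRs: starting with the second LMR it receives, it uses the counters in the current LMR and in the previous LMR to compute the number of lost frames. LMMs and LMRs are both unicast frames.
• Dual-ended ETH-LM
Each MEP periodically sends frames with ETH-LM information to its peer MEP so that the peer can measure frame loss.
The PDU used for dual-ended ETH-LM information is the CCM PDU.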
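The "two consecutive samples" calculation can be sketched as a difference of counter deltas: frames sent during the interval minus frames received during the same interval. The function and parameter names below are hypothetical, and the 32-bit counter width (used to survive counter wrap-around) is an assumption rather than something stated in these notes.

```python
COUNTER_MODULUS = 2 ** 32  # assumption: 32-bit frame counters


def counter_delta(previous: int, current: int) -> int:
    """Frames counted between two samples, tolerating one wrap-around."""
    return (current - previous) % COUNTER_MODULUS


def frame_loss(tx_prev: int, tx_curr: int, rx_prev: int, rx_curr: int) -> int:
    """Frames sent in the interval minus frames received in the interval."""
    return counter_delta(tx_prev, tx_curr) - counter_delta(rx_prev, rx_curr)


def frame_loss_ratio(tx_prev: int, tx_curr: int, rx_prev: int, rx_curr: int) -> float:
    """Lost frames divided by sent frames; 0.0 when nothing was sent."""
    sent = counter_delta(tx_prev, tx_curr)
    if sent == 0:
        return 0.0
    return frame_loss(tx_prev, tx_curr, rx_prev, rx_curr) / sent
```

For example, if the transmit counter moves from 1000 to 2000 while the peer's receive counter moves from 990 to 1985, 1000 frames were sent, 995 were received, and the loss for that interval is 5 frames. Working on deltas is also why the first sample alone is not enough: absolute counter values on the two MEPs need not start from the same point.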
2. ETH-DM (frame delay measurement)
ETH-DM can be used for on-demand OAM to measure frame delay and frame delay variation.
Frame delay and frame delay variation measurements are performed by sending periodic frames with ETH-DM information to the peer MEP and receiving frames with ETH-DM information from the peer MEP during the diagnostic interval.
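ETH-DM can be performed in two ways:
• One-way ETH-DM
The MEP sends frames with ETH-DM information, 1DM PDUs (One-way Delay Measurement), that carry the TxTimeStampf information element (the timestamp at which the ETH-DM frame is transmitted).
The receiving MEP compares this value with RxTimef, the time at which the ETH-DM frame is received, and computes the one-way frame delay as:
Frame delay = RxTimef - TxTimeStampf
One-way frame delay measurement, however, requires the clocks of the transmitting and receiving MEPs to be synchronized. For frame delay variation, which is based on the difference between successive frame delay measurements, the synchronization requirement can be relaxed, because the clock offset cancels out in the difference between successive measurements.
• Two-way ETH-DM
In most cases clock synchronization is not practical, and frame delay can then be measured only in the two-way mode.
The MEP sends a frame with ETH-DM request information, a DMM (Delay Measurement Message) carrying TxTimeStampf. The receiving MEP responds with a frame with ETH-DM reply information, a DMR (Delay Measurement Reply) PDU, which carries the TxTimeStampf copied from the ETH-DM request. The originating MEP receives the reply, compares TxTimeStampf with RxTimeb, the time at which the reply frame is received, and computes the two-way frame delay (and from it the two-way frame delay variation) as:
Frame delay = RxTimeb - TxTimeStampf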
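The two delay formulas, and the reason delay variation tolerates unsynchronized clocks, can be shown with a short sketch. Function names are hypothetical and timestamps are plain integers (for example, microseconds) rather than the on-wire timestamp format.

```python
from typing import List, Sequence


def one_way_delay(rx_time_f: int, tx_timestamp_f: int) -> int:
    """Frame delay = RxTimef - TxTimeStampf (needs synchronized clocks)."""
    return rx_time_f - tx_timestamp_f


def two_way_delay(rx_time_b: int, tx_timestamp_f: int) -> int:
    """Frame delay = RxTimeb - TxTimeStampf (both read from the same clock)."""
    return rx_time_b - tx_timestamp_f


def delay_variation(delays: Sequence[int]) -> List[int]:
    """Differences between successive delay measurements."""
    return [later - earlier for earlier, later in zip(delays, delays[1:])]
```

If the receiver's clock runs a constant 1,000,000 units ahead of the sender's, every one-way delay is inflated by that amount, but `delay_variation` returns the same result with or without the offset because the constant cancels in each difference.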