S1 Link-Down Alarm Handling Case
S1 Link-Down Alarm Handling Guide
1. Fault Description
In Alarm Management, the base station reports the alarm code "S1 link-down alarm (198094830)", as shown in the figure below:
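2. Fault Analysis and Troubleshooting Approach
According to the TD-LTE network interface protocols, the S1 link sits on top of the physical transport layer, the data link layer, the IP layer, and the SCTP association, as shown in the figure below:
An S1 link fault therefore has to be traced from the bottom layer upward:
(1) First, check whether the site has any transport alarms and rule out a transport fault.
(2) Next, check whether the base station IP addresses are configured correctly.
(3) Then check for an SCTP association-down alarm and rule out an SCTP association fault.
(4) Finally, check whether S1AP setup has failed (negotiation failure, or no cell on the base station), and verify with the core network that the cell TAC value is configured consistently on both sides.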
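The bottom-up order above can be sketched in a few lines of Python. This is only an illustration of the ordering logic: the layer names and the check callables are hypothetical placeholders, not part of any ZTE tool, and in practice each check would be performed by hand with the commands shown in the steps that follow.

```python
from typing import Callable, Dict, Optional

# Layers in the order the guide checks them, lowest first (names are illustrative).
LAYERS = ["transport", "ip", "sctp", "s1ap"]


def first_failing_layer(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run the checks bottom-up and return the first layer that fails, or None."""
    for layer in LAYERS:
        if not checks[layer]():
            return layer
    return None


# Hypothetical results: transport and IP are fine, the SCTP association is down.
# S1AP also fails, but only as a consequence, so SCTP is reported first.
example = {
    "transport": lambda: True,
    "ip": lambda: True,
    "sctp": lambda: False,
    "s1ap": lambda: False,
}
print(first_failing_layer(example))  # sctp
```

Stopping at the first failing layer reflects the reasoning of the guide: a fault in a lower layer makes every layer above it fail as well, so the higher-layer symptoms are not worth investigating until the lower layer is repaired.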
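3. Troubleshooting Steps
(1) Check the base station alarms for transport-related alarms such as "NE link-down alarm" or "SCTP association down". If any of these exist, resolve them first by following the handling guide for that alarm.
(2) Check whether the routing IP addresses from the eNodeB to the MME and SGW are configured correctly. Log in to the CC board with telnet and use the BRS command brsping to ping the MME and SGW addresses. The login procedure is shown below; all text shown in red must be typed by the operator: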
Remote login through the server:
bash-3.2$ telnet 10.30.143.201
On-site login address when connecting directly with a network cable:
192.254.1.16
Trying...
Connected to 10.30.143.201 (192.254.1.16)
(none) login: zte (username)
Password: zte (password)
Processing /etc/profile... Done
# /ushell
->Please input password!
->*** (password: zte)
->Login success!!
ushell tool menu:
------------------------------------------------------------------------------
'ps' or 'PS'             list process run on the board
'pr xxx' or 'PR xxx'     take over xxx process printf info
'npr xxx' or 'NPR xxx'   not take over xxx process printf info
'db xxx' or 'DB xxx'     debug xxx process printf info
'ndb xxx' or 'NDB xxx'   not debug xxx process printf info
'pad xxx' or 'PAD xxx'   debug and take over xxx process printf info
'npad xxx' or 'NPAD xxx' not debug and take over xxx process printf info
'pall' or 'PALL'         display current debug and take over info
'ncheck' or 'NCHECK'     Do not check another ushell exist
'check' or 'CHECK'       Do check another ushell exist
'Q' or 'q'               cancel all process debug and printf info
'exit' or 'EXIT'         cancel ushell
xxx is process id you want to debug or take over printf info
------------------------------------------------------------------------------
$$ps (list the processes running on the board)
  PID USER    VSZ   STAT COMMAND
    1 root    1304  S    init
    2 root    0     SW   [softirq-high/0]
    3 root    0     SW   [softirq-timer/0]
    4 root    0     SW   [softirq-net-tx/]
    5 root    0     SW   [softirq-net-rx/]
    6 root    0     SW   [softirq-block/0]
    7 root    0     SW   [softirq-tasklet]
    8 root    0     SW   [softirq-sched/0]
    9 root    0     SW   [softirq-hrtimer]
   10 root    0     SW   [softirq-rcu/0]
   11 root    0     SW   [watchdog/0]
   12 root    0     DW   [chkeventd/0]
   13 root    0     SW<  [events/0]
   14 root    0     SW<  [rt_events/0]
   15 root    0     SW<  [khelper]
   16 root    0     SW<  [kthread]
   17 root    0     SW<  [rt_kthread]
   37 root    0     SW<  [kblockd/0]
   42 root    0     SW<  [khubd]
   83 root    0     SW   [pdflush]
   84 root    0     SW   [pdflush]
   85 root    0     SW<  [kswapd0]
   86 root    0     SW<  [aio/0]
  621 root    0     SW   [mtdblockd]
  678 root    1253m S    /MGR.EXE
  680 root    9156  S    /tftp
  683 root    1308  S    telnetd
  685 root    1312  S    inetd
  686 root    1312  S    -/bin/./ash
  697 root    0     SWN  [jffs2_gcd_mtd0]
 1201 root    457m  S    /Product_lte_tdd.so8891V3.10.10P30R1/AGT_LTE_TDD.EXE
 1750 root    1316  S    -sh
 1751 root    9216  R    /ushell
 1753 root    1304  R    sh -c ps
 1754 root    1308  R    ps
$$pad 678 (attach to the platform process; PID 678 is /MGR.EXE in the ps output)
[678]
ushell enter print mod
ushell enter debug mod
$$brsping "200.1.10.200" (ping the core network MME address)
[678]
[begin to excel fun:brsping]
value = 0(0x0)
[end to excel fun:brsping]
Ping:find no route for dest, send by default gateway[0xac1e8fc1].
send ping seq:1...
$$
[678]
PING===>reply from 200.1.10.200 packet size=36 time=14ms. (round-trip time reported when the ping succeeds)
[678]
send ping seq:2...
[678]
PING===>reply from 200.1.10.200 packet size=36 time=4ms.
[678]
send ping seq:3...
[678]
PING===>reply from 200.1.10.200 packet size=36 time=3ms.
[678]
send ping seq:4...
[678]
PING===>reply from 200.1.10.200 packet size=36 time=24ms.
[678]
Ping statistics for 200.1.10.200:
Packets: Sent=4, Received=4, Lost=0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum=3ms, Maximum=24ms, Average=11ms
(Ping result for the core network MME control-plane address 200.1.10.200: 0% packet loss, which shows that the link from the base station to the MME is normal.)
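The ping output above reports the default gateway as the hexadecimal value 0xac1e8fc1. Assuming the board prints the IPv4 address as a 32-bit number in network byte order (an assumption, since the output format is not documented here), the value can be converted to dotted-decimal form for comparison with the configured gateway. A minimal Python sketch:

```python
import ipaddress


def hex_to_ipv4(value: str) -> str:
    """Convert a hex string such as '0xac1e8fc1' to dotted-decimal IPv4.

    Assumes the value is a 32-bit address in network (big-endian) byte order.
    """
    return str(ipaddress.IPv4Address(int(value, 16)))


print(hex_to_ipv4("0xac1e8fc1"))  # 172.30.143.193
```

If the decoded address does not match the gateway planned for this site, the IP or route configuration is worth checking before going further.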
$$brsping "200.1.30.20" (ping the core network SGW address)
[678]
[begin to excel fun:brsping]
value = 0(0x0)
[end to excel fun:brsping]
send ping seq:1...
$$
[678]
PING===>reply from 200.1.30.20 packet size=36 time<1ms.
[678]
send ping seq:2...
[678]
PING===>reply from 200.1.30.20 packet size=36 time<1ms.
[678]
send ping seq:3...
[678]
PING===>reply from 200.1.30.20 packet size=36 time=1ms.
[678]
send ping seq:4...
[678]
PING===>reply from 200.1.30.20 packet size=36 time<1ms.
[678]
Ping statistics for 200.1.30.20:
Packets: Sent=4, Received=4, Lost=0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum=0ms, Maximum=1ms, Average=0ms
(Ping result for the core network SGW user-plane address 200.1.30.20: 0% packet loss, which shows that the link from the base station to the SGW is normal.)
These steps confirm that both the control-plane link to the MME and the user-plane link to the SGW, from the base station to the EPC, are normal.
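When many sites have to be checked, the statistics line of the brsping output can be evaluated automatically from a saved session log. The following Python sketch assumes the output format shown above; the function names are illustrative, and the pattern tolerates optional spaces around the separators because captured logs may differ in spacing.

```python
import re
from typing import NamedTuple, Optional


class PingStats(NamedTuple):
    sent: int
    received: int
    lost: int


# Matches e.g. "Sent=4, Received=4, Lost=0" with or without spaces.
STATS_RE = re.compile(
    r"Sent\s*=\s*(\d+)\s*,\s*Received\s*=\s*(\d+)\s*,\s*Lost\s*=\s*(\d+)"
)


def parse_ping_stats(output: str) -> Optional[PingStats]:
    """Extract the packet counters from brsping output, or None if absent."""
    match = STATS_RE.search(output)
    if match is None:
        return None
    return PingStats(*(int(group) for group in match.groups()))


def link_ok(stats: Optional[PingStats]) -> bool:
    """Treat the link as normal only if packets were sent and none were lost."""
    return stats is not None and stats.sent > 0 and stats.lost == 0


sample = "Ping statistics for 200.1.10.200:\nPackets: Sent=4, Received=4, Lost=0 (0% loss),"
print(link_ok(parse_ping_stats(sample)))  # True
```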
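(3) Attach (pad) to the platform process and run showtcb to check the association state.
Still inside the platform process, enter the "showtcb" command and check whether each association is in a normal state. If an association is abnormal, follow the handling guide for the SCTP association-down alarm.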
$$showtcb
[678]
[begin to excel fun:showtcb]
===== Begin: Show Assoc TCB Info =====
TCB info 0:
(association number 0)
ULPID=0, AssoID=0, Checksum=1, InstanceID=0
LocalPort=6051, SourIP=100.64.20.108, VpnId=31
PeerPort=6051, DestIP=200.1.10.200, VpnId=31
AssociationState=established (the association state is shown here; established means the association is normal)
CulTsnAcked=2597222824, NextTsnAssign=2597222825, LastRecvTSN=1479945961
OutStandingSize=0, PendingChkNum=261888, MtuSize=1500
TxReChkNum=0
TxStrmNum=2, RxStrmNum=2
PeerVerifTag=1479945957, MyVerifTag=2597222817
TCB info 11:
(association number 11)
ULPID=11, AssoID=11, Checksum=0, InstanceID=11
LocalPort=36422, SourIP=100.64.20.108, VpnId=31
PeerPort=36422, DestIP=100.64.20.109, VpnId=31
AssociationState=established (the association state is shown here; established means the association is normal)
CulTsnAcked=1926134201, NextTsnAssign=1926134202, LastRecvTSN=866462450
OutStandingSize=0, PendingChkNum=261888, MtuSize=1500
TxReChkNum=0
TxStrmNum=2, RxStrmNum=2
PeerVerifTag=866462422, MyVerifTag=1926134177
TCB info 12:
(association number 12)
ULPID=12, AssoID=12, Checksum=1, InstanceID=12
LocalPort=36422, SourIP=100.64.20.108, VpnId=31
PeerPort=36422, DestIP=100.64.43.43, VpnId=31
AssociationState=established (the association state is shown here; established means the association is normal)
CulTsnAcked=1926134201, NextTsnAssign=1926134202, LastRecvTSN=2040807940
OutStandingSize=0, PendingChkNum=261888, MtuSize=1500
TxReChkNum=0
TxStrmNum=2, RxStrmNum=2
PeerVerifTag=2040807912, MyVerifTag=1926134177
TCB info 13:
(association number 13)
ULPID=13, AssoID=13, Checksum=1, InstanceID=13
LocalPort=36422, SourIP=100.64.20.108, VpnId=31
PeerPort=36422, DestIP=100.64.25.85, VpnId=31
AssociationState=cookie_wait (the association state is shown here; cookie_wait means the association is not normal)
CulTsnAcked=0, NextTsnAssign=2957087305, LastRecvTSN=0
OutStandingSize=0, PendingChkNum=4294967168, MtuSize=0
TxReChkNum=0
TxStrmNum=2, RxStrmNum=2
PeerVerifTag=0, MyVerifTag=2957087305
TCB info 14:
(association number 14)
ULPID=14, AssoID=14, Checksum=1, InstanceID=14
LocalPort=36422, SourIP=100.64.20.108, VpnId=31
PeerPort=36422, DestIP=100.64.43.39, VpnId=31
AssociationState=cookie_wait (the association state is shown here; cookie_wait means the association is not normal)
CulTsnAcked=0, NextTsnAssign=2529838369, LastRecvTSN=0
OutStandingSize=0, PendingChkNum=4294967168, MtuSize=0
TxReChkNum=0
TxStrmNum=2, RxStrmNum=2
PeerVerifTag=0, MyVerifTag=2529838369
===== End: Show Assoc TCB Info =====
value = 34(0x22)
[end to excel fun:showtcb]
$$exit (leave the platform process)
ushell recv signo:0.
quit debug and exit ushell!
# exit (close the connection to the base station CC board)
Connection closed.
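The showtcb output in this session lists several associations, and only one of them carries S1. The association whose DestIP is the MME address pinged earlier (200.1.10.200) is the one relevant to the S1 alarm. The associations on port 36422, the registered SCTP port for X2AP, are most likely X2 links to neighbouring eNodeBs. In SCTP, the cookie_wait state means the endpoint has sent INIT and is still waiting for an INIT ACK from the peer. The Python sketch below shows one way to pull the state of each association out of a saved log; the function name and the trimmed sample are illustrative, not part of any vendor tool.

```python
import re
from typing import Dict, List

TCB_RE = re.compile(r"TCB\s*info\s*(\d+)\s*:")
FIELD_RE = re.compile(r"(DestIP|PeerPort|AssociationState)\s*=\s*([A-Za-z0-9_.]+)")


def parse_showtcb(output: str) -> List[Dict[str, str]]:
    """Return one dict per association with its id, DestIP, PeerPort and state."""
    assocs: List[Dict[str, str]] = []
    current = None
    for line in output.splitlines():
        header = TCB_RE.search(line)
        if header:
            current = {"id": header.group(1)}
            assocs.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first TCB header
        for key, value in FIELD_RE.findall(line):
            current[key] = value
    return assocs


# Trimmed sample taken from the session in this guide.
SAMPLE = """\
TCBinfo0:
PeerPort=6051,DestIP=200.1.10.200,VpnId=31
AssociationState=established
TCBinfo13:
PeerPort=36422,DestIP=100.64.25.85,VpnId=31
AssociationState=cookie_wait
"""

assocs = parse_showtcb(SAMPLE)
mme_ip = "200.1.10.200"
s1 = [a for a in assocs if a.get("DestIP") == mme_ip]
abnormal = [a["id"] for a in assocs if a.get("AssociationState") != "established"]
print(s1[0]["AssociationState"])  # established
print(abnormal)                   # ['13']
```

In this sample the association towards the MME is established, so the abnormal associations do not explain an S1 link-down alarm by themselves, which is why the procedure continues with the S1 parameter check.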
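(4) If the S1 link-down fault persists after transport and SCTP association problems have been ruled out, verify the S1 interworking parameters against the EPC. Check that the Tracking Area Code (TAC) of the TD-LTE E-UTRAN TDD cell is set to the value agreed with the EPC, as shown in the figure below:
If this parameter does not match the EPC-side configuration, S1 link setup fails, and the parameter must be corrected according to the network plan.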
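A TAC mismatch is easy to overlook when one side shows the value in decimal and the other in hexadecimal. The Python sketch below normalises both notations before comparing. It is only an illustration: the cell names and TAC values are invented, and how the values are exported from the base station and the EPC is left open.

```python
from typing import Dict, Set


def parse_tac(text: str) -> int:
    """Parse a TAC written in decimal ('4660') or hexadecimal ('0x1234').

    The LTE TAC is a 16-bit value, so anything outside 0..65535 is rejected.
    """
    tac = int(text.strip(), 0)  # base 0 accepts both decimal and 0x-prefixed hex
    if not 0 <= tac <= 0xFFFF:
        raise ValueError(f"TAC out of range: {text!r}")
    return tac


def tac_mismatches(cell_tacs: Dict[str, str], epc_tacs: Set[str]) -> Dict[str, int]:
    """Return the cells whose TAC is not among the TACs configured on the EPC."""
    allowed = {parse_tac(t) for t in epc_tacs}
    return {
        cell: parse_tac(tac)
        for cell, tac in cell_tacs.items()
        if parse_tac(tac) not in allowed
    }


# Hypothetical data: cell-2 was configured with a TAC unknown to the EPC.
cells = {"cell-1": "4660", "cell-2": "4661"}
epc = {"0x1234"}
print(tac_mismatches(cells, epc))  # {'cell-2': 4661}
```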
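4. Troubleshooting Summary
The S1 link is a higher-layer protocol link that depends on several lower-layer links, so troubleshooting has to start from the bottom:
(1) First, confirm that the physical transport link is normal.
(2) Next, check that the IP-layer addresses are configured correctly.
(3) Then confirm that the SCTP association is normal.
(4) Finally, check that the S1 interworking parameter TAC is configured consistently with the EPC side.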