如何解决CPU过度消耗问题

运维 数据库运维
CPU是电脑中的核心配件,只有火柴盒那么大,几十张纸那么厚,但它却是一台计算机的运算核心和控制核心。电脑中所有操作都由CPU负责读取指令,本文主要为大家介绍解决CPU过度消耗问题。

问题描述:开发人员报告系统运行缓慢,影响用户访问.

  1.登陆数据库主机

  使用vmstat检查,发现CPU资源已经耗尽,大量任务位于运行队列:

bash-2.03$ vmstat 3
procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s6 s9 s1 sd in sy cs us sy id
0 0 055042321464112 0 0 0 0 0 0 0 0 1 1 0 4294967196 0 0 -84 -5 -145
131 0 0 5368072 1518360 56 691 0 2 2 0 0 0 1 0 0 3011 7918 2795 97 3 0
131 0 0 5377328 1522464 81 719 0 2 2 0 0 0 1 0 0 2766 8019 2577 96 4 0
130 0 0 5382400 1524776 67 682 0 0 0 0 0 0 0 0 0 3570 8534 3316 97 3 0
134 0 0 5373616 1520512 127 1078 0 2 2 0 0 0 1 0 0 3838 9584 3623 96 4 0
136 0 0 5369392 1518496 107 924 0 5 5 0 0 0 0 0 0 2920 8573 2639 97 3 0
132 0 0 5364912 1516224 63 578 0 0 0 0 0 0 0 0 0 3358 7944 3119 97 3 0
129 0 0 5358648 1511712 189 1236 0 0 0 0 0 0 0 0 0 3366 10365 3135 95 5 0
129 0 0 5354528 1511304 120 1194 0 0 0 0 0 0 0 4 0 3235 8864 2911 96 4 0
128 0 0 5346848 1507704 99 823 0 0 0 0 0 0 0 3 0 3189 9048 3074 96 4 0
125 0 0 5341248 1504704 80 843 0 2 2 0 0 0 6 1 0 3563 9514 3314 95 5 0
133 0 0 5332744 1501112 79 798 0 0 0 0 0 0 0 1 0 3218 8805 2902 97 3 0
129 0 0 5325384 1497368 107 643 0 2 2 0 0 0 1 4 0 3184 8297 2879 96 4 0
126 0 0 5363144 1514320 81 753 0 0 0 0 0 0 0 0 0 2533 7409 2164 97 3 0
136 0 0 5355624 1510512 169 566 786 0 0 0 0 0 0 1 0 3002 8600 2810 96 4 0
130 1 0 5351448 1502936 267 580 1821 0 0 0 0 0 0 0 0 3126 7812 2900 96 4 0
129 0 0 5347256 1499568 155 913 2 2 2 0 0 0 0 1 0 2225 8076 1941 98 2 0
116 0 0 5338192 1495400 177 1162 0 0 0 0 0 0 0 1 0 1947 7781 1639 97 3 0

  2.使用Top命令

  观察进程CPU耗用,发现没有明显过高CPU使用的进程

$ top

last pid: 28313; load averages: 99.90, 117.54, 125.71 23:28:38
296 processes: 186 sleeping, 99 running, 2 zombie, 9 on cpu
CPU states: 0.0% idle,96.5% user, 3.5% kernel,0.0% iowait, 0.0%swap
Memory: 4096M real, 1404M free, 2185M swap in use, 5114M swap free

PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
27082 oracle8i 1 33 0 1328M 1309M run 0:17 1.29% oracle
26719 oracle8i 1 55 0 1327M 1306M sleep 0:29 1.11% oracle
28103 oracle8i 1 35 0 1327M 1304M run 0:06 1.10% oracle
28161 oracle8i 1 25 0 1327M 1305M run 0:04 1.10% oracle
26199 oracle8i 1 45 0 1328M 1309M run 0:42 1.10% oracle
26892 oracle8i 1 33 0 1328M 1310M run 0:24 1.09% oracle
27805 oracle8i 1 45 0 1327M 1306M cpu/1 0:10 1.04% oracle
23800 oracle8i 1 23 0 1327M 1306M run 1:28 1.03% oracle
25197 oracle8i 1 34 0 1328M 1309M run 0:57 1.03% oracle
21593 oracle8i 1 33 0 1327M 1306M run 2:12 1.01% oracle
27616 oracle8i 1 45 0 1329M 1311M run 0:14 1.01% oracle
27821 oracle8i 1 43 0 1327M 1306M run 0:10 1.00% oracle
26517 oracle8i 1 33 0 1328M 1309M run 0:33 0.97% oracle
25785 oracle8i 1 44 0 1328M 1309M run 0:46 0.96% oracle
26241 oracle8i 1 45 0 1327M 1306M run 0:42 0.96% oracle

  3.检查进程数量

bash-2.03$ ps -ef|grep ora|wc -l
258
bash-2.03$ ps -ef|grep ora|wc -l
275
bash-2.03$ ps -ef|grep ora|wc -l
274
bash-2.03$ ps -ef|grep ora|wc -l
278
bash-2.03$ ps -ef|grep ora|wc -l
277
bash-2.03$ ps -ef|grep ora|wc -l
366

  发现系统存在大量Oracle进程,大约在300左右,而正常情况下Oracle连接数应该在100左右.

  4.检查数据库

  查询v$session_wait获取各进程等待事件

SQL> select sid,event,p1,p1text from v$session_wait;

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- -------- 124 latch free 1.6144E+10 address
1 pmon timer 300 duration
2 rdbms ipc message 300 timeout
3 rdbms ipc message 300 timeout
11 rdbms ipc message 30000 timeout
6 rdbms ipc message 180000 timeout
4 rdbms ipc message 300 timeout
134 rdbms ipc message 6000 timeout
147 rdbms ipc message 6000 timeout
275 rdbms ipc message 17995 timeout
274 rdbms ipc message 6000 timeout

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- --------
118 rdbms ipc message 6000 timeout
7 buffer busy waits 17 file#
56 buffer busy waits 17 file#
161 buffer busy waits 17 file#
195 buffer busy waits 17 file#
311 buffer busy waits 17 file#
314 buffer busy waits 17 file#
205 buffer busy waits 17 file#
269 buffer busy waits 17 file#
200 buffer busy waits 17 file#
164 buffer busy waits 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- -------
140 buffer busy waits 17 file#
66 buffer busy waits 17 file#
10 db file sequential read 17 file#
18 db file sequential read 17 file#
54 db file sequential read 17 file#
49 db file sequential read 17 file#
48 db file sequential read 17 file#
46 db file sequential read 17 file#
45 db file sequential read 17 file#
35 db file sequential read 17 file#
30 db file sequential read 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- -------
29 db file sequential read 17 file#
22 db file sequential read 17 file#
178 db file sequential read 17 file#
175 db file sequential read 17 file#
171 db file sequential read 17 file#
123 db file sequential read 17 file#
121 db file sequential read 17 file#
120 db file sequential read 17 file#
117 db file sequential read 17 file#
114 db file sequential read 17 file#
113 db file sequential read 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------
111 db file sequential read 17 file#
107 db file sequential read 17 file#
80 db file sequential read 17 file#
222 db file sequential read 17 file#
218 db file sequential read 17 file#
216 db file sequential read 17 file#
213 db file sequential read 17 file#
199 db file sequential read 17 file#
198 db file sequential read 17 file#
194 db file sequential read 17 file#
192 db file sequential read 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------
188 db file sequential read 17 file#
249 db file sequential read 17 file#
242 db file sequential read 17 file#
239 db file sequential read 17 file#
236 db file sequential read 17 file#
235 db file sequential read 17 file#
234 db file sequential read 17 file#
233 db file sequential read 17 file#
230 db file sequential read 17 file#
227 db file sequential read 17 file#
336 db file sequential read 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------
333 db file sequential read 17 file#
331 db file sequential read 17 file#
329 db file sequential read 17 file#
327 db file sequential read 17 file#
325 db file sequential read 17 file#
324 db file sequential read 17 file#
320 db file sequential read 17 file#
318 db file sequential read 17 file#
317 db file sequential read 17 file#
316 db file sequential read 17 file#
313 db file sequential read 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------
305 db file sequential read 17 file#
303 db file sequential read 17 file#
301 db file sequential read 17 file#
293 db file sequential read 17 file#
290 db file sequential read 17 file#
288 db file sequential read 17 file#
287 db file sequential read 17 file#
273 db file sequential read 17 file#
271 db file sequential read 17 file#
257 db file sequential read 17 file#
256 db file sequential read 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------
254 db file sequential read 17 file#
252 db file sequential read 17 file#
159 db file sequential read 17 file#
153 db file sequential read 17 file#
146 db file sequential read 17 file#
142 db file sequential read 17 file#
135 db file sequential read 17 file#
133 db file sequential read 17 file#
132 db file sequential read 17 file#
126 db file sequential read 17 file#
79 db file sequential read 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------
77 db file sequential read 17 file#
72 db file sequential read 17 file#
70 db file sequential read 17 file#
69 db file sequential read 17 file#
67 db file sequential read 17 file#
63 db file sequential read 17 file#
55 db file sequential read 17 file#
102 db file sequential read 17 file#
96 db file sequential read 17 file#
95 db file sequential read 17 file#
91 db file sequential read 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------
81 db file sequential read 17 file#
15 db file sequential read 17 file#
19 db file scattered read 17 file#
50 db file scattered read 17 file#
285 db file scattered read 17 file#
279 db file scattered read 17 file#
255 db file scattered read 17 file#
243 db file scattered read 17 file#
196 db file scattered read 17 file#
187 db file scattered read 17 file#
170 db file scattered read 17 file#

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------
162 db file scattered read 17 file#
138 db file scattered read 17 file#
110 db file scattered read 17 file#
108 db file scattered read 17 file#
92 db file scattered read 17 file#
330 db file scattered read 17 file#
310 db file scattered read 17 file#
302 db file scattered read 17 file#
299 db file scattered read 17 file#
89 db file scattered read 17 file#
5 smon timer 300 sleep time

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ---------
20 SQL*Net message to client 1952673792 driver id
103 SQL*Net message to client 1650815232 driver id
....
148 SQL*Net more data from client 1952673792 driver id
291 SQL*Net more data from client 1952673792 driver id

244 rows selected.

  发现存在大量db file scattered read及db file sequential read等待.

  5.捕获相关SQL

  这里用到了我的以下脚本getsqlbysid.sql:

SELECT sql_text
FROM v$sqltext a
WHERE a.hash_value = (SELECT sql_hash_value
FROM v$session b
WHERE b.SID = '&sid')
ORDER BY piece ASC
/

SQL> @getsql
Enter value for sid: 18
old 5: where b.sid='&sid'
new 5: where b.sid='18'

SQL_TEXT
----------------------------------------
select i.vc2title,i.numinfoguid from hs_info i where i.intenab
ledflag = 1 and i.intpublishstate = 1 and i.datpublishdate <=
sysdate and i.numcatalogguid = 2047 order by i.datpublishdate d
esc, i.numorder desc

SQL> /
Enter value for sid: 54
old 5: where b.sid='&sid'
new 5: where b.sid='54'

SQL_TEXT
---------------------------------------
select i.vc2title,i.numinfoguid from hs_info i where i.intenab
ledflag = 1 and i.intpublishstate = 1and i.datpublishdate <=
sysdate and i.numcatalogguid = 33 order by i.datpublishdate des
c, i.numorder desc

SQL> /
Enter value for sid: 49
old 5: where b.sid='&sid'
new 5: where b.sid='49'

SQL_TEXT
---------------------------------------
select i.vc2title,i.numinfoguidfrom hs_info i where i.intenab
ledflag = 1 and i.intpublishstate = 1 and i.datpublishdate <=
sysdate and i.numcatalogguid = 26 order by i.datpublishdate des
c, i.numorder desc

  对几个全表扫描进程跟踪以后,得到以上SQL语句.

  以上语句如果良好编码应该使用绑定变量.但是现在这个不是我们关心的.

  使用该应用用户连接,检查其执行计划:

SQL> set autotrace trace explain
SQL> select i.vc2title,i.numinfoguid
2 from hs_info i where i.intenabledflag = 1
3 and i.intpublishstate = 1 and i.datpublishdate <=sysdate
4 and i.numcatalogguid = 3475
5 order by i.datpublishdate desc, i.numorder desc ;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=228 Card=1 Bytes=106)
1 0 SORT (ORDER BY) (Cost=228 Card=1 Bytes=106)
2 1 TABLE ACCESS (FULL) OF 'HS_INFO' (Cost=218 Card=1 Bytes=106)

SQL> select count(*) from hs_info;

COUNT(*)
----------
227404

  该表这里有22万记录,全表扫描已经不再适合.

  检查该表,存在以下索引:

SQL> select index_name,index_type from user_indexes where table_name='HS_INFO';

INDEX_NAME INDEX_TYPE
------------------------------ ---------------------------
HSIDX_INFO1 FUNCTION-BASED NORMAL
HSIDX_INFO_SEARCHKEY DOMAIN
PK_HS_INFO NORMAL

  检查索引键值:

SQL> select index_name,column_name from user_ind_columns where table_name ='HS_INFO';

INDEX_NAME COLUMN_NAME
------------------------------ --------------------
HSIDX_INFO1 NUMORDER
HSIDX_INFO1 SYS_NC00024$
HSIDX_INFO_SEARCHKEY VC2INDEXWORDS
PK_HS_INFO NUMINFOGUID

SQL> desc hs_info
Name Null? Type
----------------------------------- -------- -------------
NUMINFOGUID NOT NULL NUMBER(15)
NUMCATALOGGUID NOT NULL NUMBER(15)
INTTEXTTYPE NOT NULL NUMBER(38)
VC2TITLE NOT NULL VARCHAR2(60)
VC2AUTHOR VARCHAR2(100)
NUMPREVINFOGUID NUMBER(15)
NUMNEXTINFOGUID NUMBER(15)
NUMORDER NOT NULL NUMBER(15)
DATPUBLISHDATE NOT NULL DATE
INTPUBLISHSTATE NOT NULL NUMBER(38)
VC2PUBLISHERID VARCHAR2(30)
VC2INDEXWORDS VARCHAR2(200)
VC2WAPPREVPATH VARCHAR2(200)
VC2WEBPREVPATH VARCHAR2(200)
VC2WAP2PREVPATH VARCHAR2(200)
NUMVISITED NOT NULL NUMBER(15)
INTENABLEDFLAG NOT NULL NUMBER(38)
DATCREATETIME NOT NULL DATE
DATMODIFYTIME NOT NULL DATE
VC2NOTES VARCHAR2(1000)
INTINFOTYPE NOT NULL NUMBER(38)
VC2PRIZEFLAG VARCHAR2(1)
VC2DESC VARCHAR2(1000)

  6.决定创建新的索引以消除全表扫描

SQL> create index hs_info_NUMCATALOGGUID on hs_info(NUMCATALOGGUID);

Index created.

SQL> set autotrace trace explain
SQL> select i.vc2title,i.numinfoguid
2 from hs_info i where i.intenabledflag = 1
3 and i.intpublishstate = 1 and i.datpublishdate <=sysdate
4 and i.numcatalogguid = 3475
5 order by i.datpublishdate desc, i.numorder desc ;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=1 Bytes=106)
1 0 SORT (ORDER BY) (Cost=12 Card=1 Bytes=106)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'HS_INFO' (Cost=2 Card=1 Bytes=106)
3 2 INDEX (RANGE SCAN)OF'HS_INFO_NUMCATALOGGUID'(NON-UNIQUE)(Cost=1 Card=1)

  7.观察系统状况

  原大量等待消失

SQL> select sid,event,p1,p1text from v$session_wait where event not like 'SQL%';

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------------
1 pmon timer 300 duration
2 rdbms ipc message 300 timeout
3 rdbms ipc message 300 timeout
6 rdbms ipc message 180000 timeout
59 rdbms ipc message 6000 timeout
118 rdbms ipc message 6000 timeout
275 rdbms ipc message 30000 timeout
147 rdbms ipc message 6000 timeout
62 rdbms ipc message 6000 timeout
11 rdbms ipc message 30000 timeout
4 rdbms ipc message 300 timeout

SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ------------
305 db file sequential read 17 file#
356 db file sequential read 17 file#
19 db file scattered read 17 file#
5 smon timer 300 sleep time

15 rows selected.

  持续观察的CPU使用情况

bash-2.03$ vmstat 3
procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s6 s9 s1 sd in sy cs us sy id
20 0 0 5421792 1503488 38 434 136 0 0 0 0 0 0 2 0 2931 7795 2622 91 9 0
23 1 0 5416080 1500632 95 734 56 0 0 0 0 0 0 0 0 2949 8057 2598 89 11 0
26 0 0 5412016 1498480 210 1170 21 5 5 0 0 0 2 1 0 3301 9647 3116 90 10 0
25 0 0 5394912 1490160 242 1606 56 0 0 0 0 0 0 1 0 3133 9318 2850 89 11 0
40 0 0 5390200 1488112 162 1393 66 0 0 0 0 0 0 0 0 2848 9080 2502 90 10 0
40 0 0 5377120 1481792 136 1180 120 2 2 0 0 0 1 1 0 2846 9099 2593 92 8 0
36 0 0 5363216 1475168 134 1169 53 0 0 0 0 0 3 2 0 2871 7989 2621 88 12 0
39 0 0 5348936 1469160 157 1448 210 0 0 0 0 0 0 0 0 3660 10062 3480 88 12 0
35 0 0 5344552 1466472 7 15 56 0 0 0 0 0 0 0 0 2885 7663 2635 92 8 0
34 0 0 5343016 1465416 44 386 77 0 0 0 0 0 0 0 0 3197 8486 2902 92 8 0
31 0 0 5331568 1459696 178 1491 122 0 0 0 0 0 0 3 0 3237 9461 3005 89 11 0
31 0 0 5317792 1453008 76 719 80 0 0 0 0 0 0 0 0 3292 8736 3025 93 7 0
31 2 0 5311144 1449552 235 1263 69 2 2 0 0 0 1 0 0 3473 9535 3357 88 12 0
25 0 0 5300240 1443920 108 757 18 2 2 0 0 0 1 1 0 2377 7876 2274 95 5 0
19 0 0 5295904 1441840 50 377 0 0 0 0 0 0 0 1 0 1915 6598 1599 98 1 0
----以上为创建索引之前部分
----以下为创建索引之后部分,CPU使用率恢复正常
procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s6 s9 s1 sd in sy cs us sy id
40 1 0 5290040 1439208 315 3894 8 2 2 0 0 0 1 6 0 3631 13414 5206 61 9 30
0 1 0 5237192 1414744 731 6749 45 0 0 0 0 0 2 7 0 3264 13558 4941 52 14 34
0 0 0 5163632 1380608 747 6585 10 0 0 0 0 0 0 1 0 2617 12291 3901 46 12 41
1 0 0 5090224 1348152 712 6079 29 0 0 0 0 0 0 6 0 2825 12416 4178 50 12 39
1 0 0 5023672 1317296 714 6183 24 0 0 0 0 0 0 5 0 3166 12424 4745 47 13 40
0 0 0 4955872 1287136 737 6258 16 0 0 0 0 0 0 3 0 2890 11777 4432 44 12 44
1 0 0 4887888 1256464 809 6234 8 2 2 0 0 0 0 2 0 2809 12066 4247 45 12 43
0 0 0 4828912 1228200 312 2364 13 5 5 0 0 0 2 1 0 2410 6816 3492 38 6 57
0 0 0 4856816 1240168 8 138 0 0 0 0 0 0 1 0 0 2314 4026 3232 34 4 62
0 0 0 4874176 1247712 0 86 0 0 0 0 0 0 0 0 0 2298 3930 3324 35 2 63
2 0 0 4926088 1270824 34 560 0 0 0 0 0 0 0 0 0 2192 4694 2612 29 16 55
0 0 0 5427320 1512952 53 694 0 0 0 0 0 0 3 2 0 2443 5085 3340 33 12 55
0 0 0 5509120 1553136 0 37 0 0 0 0 0 0 0 0 0 2309 3908 3321 35 1 64
0 0 0 5562048 1577000 16 234 0 0 0 0 0 0 0 0 0 2507 5187 3433 35 8 57
0 0 0 5665672 1623848 252 1896 8 2 2 0 0 0 1 0 0 2091 6548 2939 34 5 61
0 0 0 5654752 1618208 5 173 16 0 0 0 0 0 0 0 0 2226 4218 3051 35 4 60
0 0 0 5727024 1651120 28 254 0 0 0 0 0 0 0 0 0 2126 4224 2982 38 2 60
0 0 0 5723184 1648880 93 562 8 2 2 0 0 0 1 1 0 2371 5140 3432 38 3 59
0 0 0 5730744 1652512 7 177 26 2 2 0 0 0 1 0 0 2465 4442 3575 36 3 61

通过上面的操作,问题就得到解决啦,上文有较多的数字,看起来比较复杂,希望大家认真看,相信一定会给大家带来惊喜。

【编辑推荐】

  1. 关于Memcached客户端CPU过高问题的排查
  2. 正确的查看Oracle CPU的情况
  3. Oracle优化CPU使用的实际操作方案详解
责任编辑:迎迎 来源: 希赛网
相关推荐

2010-04-16 10:38:49

CPU消耗

2012-09-05 11:09:15

SELinux操作系统

2010-04-29 17:46:31

Oracle死锁

2020-07-08 07:00:00

LinuxCPU应用程序

2017-07-20 07:30:16

大数据数据互联网

2024-10-29 16:41:24

SpringBoot跨域Java

2013-05-21 10:49:59

Windows硬件冲突

2011-08-29 10:34:00

网络安全云安全云计算

2023-10-30 18:35:47

MySQL主从延时

2017-10-17 09:21:06

2021-06-06 13:05:15

前端跨域CORS

2010-07-16 13:52:26

telnet漏洞

2024-11-21 16:47:55

2017-01-12 15:17:27

线程CPU定位

2019-06-03 10:30:42

MySQLCPU数据库

2011-09-05 13:32:56

2009-09-21 17:10:14

struts Hibe

2023-05-25 08:00:36

阿⾥云DNS重试机制

2021-10-20 20:27:55

MySQL死锁并发

2019-07-30 08:28:44

VirtualBox桥接网络
点赞
收藏

51CTO技术栈公众号