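Log file related waits caused by undersized online redo log files
Oracle online redo log files record every change made to the database (DML, DDL, and structural changes made by administrators). After an accidental deletion or a crash, the logs are used for recovery so that data integrity is preserved. Poorly planned online redo logs, however, lead to log-related wait events. Below is one such example from a production environment.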
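1. Fault description
--The customer reports that this database is used at night for data synchronization and aggregation. It used to run fairly well, but as the volume of data to synchronize has grown, it has recently become slower and slower.
--We first took the customer's AWR report covering 8 p.m. to 7 a.m. the next day.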
WORKLOAD REPOSITORY report for

DB Name         DB Id    Instance     Inst Num Release     RAC Host
------------ ----------- ------------ -------- ----------- --- ------------
ST990         2152526631 ST990               1 10.2.0.3.0  NO  v2011db02p

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:     21787 21-Feb-13 20:00:22        50      19.5
  End Snap:     21798 22-Feb-13 07:00:47        44      20.0
   Elapsed:              660.42 (mins)
   DB Time:              928.06 (mins)
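--From the AWR report above: single instance, version 10.2.0.3, and not many sessions during the window
--Elapsed < DB Time
--Elapsed Time = (2013-02-22 07:00:00 - 2013-02-21 20:00:00) ≈ 660 minutes
--DB Time = 928.06 minutes. The host has 16 CPU cores, so roughly 660*16 = 10560 CPU minutes were available in the window, while sessions spent only 928.06 minutes on CPU and in Oracle non-idle waits combined
--So overall the system is still fairly idle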
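The "fairly idle" conclusion rests on simple arithmetic, sketched below in Python. The elapsed time, DB Time, and the 16 cores are the figures quoted above; the function and key names are only illustrative. Because DB Time includes non-idle wait time as well as CPU time, the ratio it yields is an upper bound on CPU use by the database, not a measurement of it.

```python
def db_time_summary(elapsed_min: float, db_time_min: float, cpu_cores: int) -> dict:
    """Relate DB Time to the wall-clock window and to the available CPU capacity."""
    capacity_min = elapsed_min * cpu_cores  # CPU minutes available in the window
    return {
        "capacity_min": capacity_min,
        # DB Time / Elapsed = average number of active sessions
        "avg_active_sessions": db_time_min / elapsed_min,
        # DB Time also contains non-idle wait time, so this ratio is an
        # upper bound on the share of CPU capacity the database used.
        "db_time_share_of_capacity": db_time_min / capacity_min,
    }


summary = db_time_summary(elapsed_min=660.42, db_time_min=928.06, cpu_cores=16)
print(f"capacity: {summary['capacity_min']:.2f} CPU minutes")
print(f"average active sessions: {summary['avg_active_sessions']:.2f}")
print(f"DB Time / capacity: {summary['db_time_share_of_capacity']:.1%}")
```

With the figures from this report, DB Time comes to under 9% of the CPU capacity of the window, with about 1.4 sessions active on average.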
--Now let's look at the top events
Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
CPU time                                         20,673          37.1
log file parallel write              27,399       4,797    175    8.6 System I/O
control file parallel write          13,428       4,688    349    8.4 System I/O
log file sync                        19,564       3,795    194    6.8 Commit
db file scattered read           26,651,537       3,439      0    6.2 User I/O
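--In the top events above, the log file related wait events stand out
--log file parallel write: 27,399 waits with a total wait time of 4,797/60 = 79.95 minutes, more than an hour, at an average of 175 ms per write, which is considerable
--Next come the control file parallel write and log file sync waits
--Below is the detailed wait event information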
Wait Events                        DB/Inst: ST1200/ST1200  Snaps: 21787-21798
-> s  - second
-> cs - centisecond -     100th of a second
-> ms - millisecond -    1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)

                                             %Time  Total Wait    wait     Waits
Event                                 Waits  -outs    Time (s)    (ms)      /txn
---------------------------- -------------- ------ ----------- ------- ---------
log file parallel write              27,399     .0       4,797     175       1.1
control file parallel write          13,428     .0       4,688     349       0.5
log file sync                        19,564   10.6       3,795     194       0.8
db file scattered read           26,651,537     .0       3,439       0   1,049.4
db file sequential read           6,682,373     .0       1,567       0     263.1
log file switch (checkpoint           1,091   92.9       1,019     934       0.0
Datapump dump file I/O              633,458     .0         286       0      24.9
log file switch completion              332   31.6         183     552       0.0
log buffer space                        255   47.8         155     608       0.0
free buffer waits                     2,409   99.5         120      50       0.1
buffer busy waits                       145   62.8          96     664       0.0
2. Fault analysis
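As a cross-check on the Wait Events table, the sketch below recomputes the average wait of each log-related event from its wait count and total wait time, and expresses each total as a share of DB Time. The numbers are copied from the report; the choice of which events to list and all names in the code are our own, for illustration only. Since log file parallel write is a background LGWR wait that overlaps with the foreground log file sync waits, the shares are shown per event and not summed.

```python
DB_TIME_S = 928.06 * 60  # DB Time from the report header, in seconds

# (event name, waits, total wait time in seconds), copied from the Wait Events table
LOG_EVENTS = [
    ("log file parallel write", 27_399, 4_797),
    ("log file sync", 19_564, 3_795),
    ("log file switch (checkpoint incomplete)", 1_091, 1_019),
    ("log file switch completion", 332, 183),
    ("log buffer space", 255, 155),
]


def avg_wait_ms(waits: int, total_wait_s: float) -> float:
    """Average wait per occurrence, in milliseconds."""
    return total_wait_s * 1000 / waits


def share_of_db_time(total_wait_s: float, db_time_s: float = DB_TIME_S) -> float:
    """Total wait time of one event as a fraction of DB Time."""
    return total_wait_s / db_time_s


for name, waits, total_s in LOG_EVENTS:
    print(f"{name:<40} {avg_wait_ms(waits, total_s):7.0f} ms "
          f"{share_of_db_time(total_s):6.1%} of DB Time")
```

Every one of these averages is in the hundreds of milliseconds. Small differences from the report's average wait column are rounding effects, because the report shows total wait time in whole seconds.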
3. Several log file wait events
log file parallel write
The log file parallel write wait event has three parameters: files, blocks, and requests. In Oracle Database 10g, this wait event falls under the System I/O wait class. Keep the following key thoughts in mind when dealing with the log file parallel write wait event.
The log file parallel write event belongs only to the LGWR process.
A slow LGWR can impact foreground processes commit time.
Significant log file parallel write wait time is most likely an I/O issue.
log file sync
The log file sync wait event has one parameter: buffer#. In Oracle Database 10g, this wait event falls under the Commit wait class. Keep the following key thoughts in mind when dealing with the log file sync wait event.
The log file sync wait event is related to transaction terminations (commits or rollbacks).
When a process spends a lot of time on the log file sync event, it is usually indicative of too many commits or short transactions.
log file switch (checkpoint incomplete)
The log file switch (checkpoint incomplete) wait event has no wait parameters. In Oracle Database 10g, this wait event falls under the Configuration wait class. Keep the following key thought in mind when dealing with the log file switch (checkpoint incomplete) wait event.
Excessive log switches caused by small log files and a high transaction rate.
For more detail, see Oracle Wait Interface: A Practical Guide to Performance Diagnostics & Tuning
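The link between small log files, a high redo rate, and frequent log switches can be made concrete with a rough calculation. The sketch below is a simplification that assumes a log group is switched only when it fills up. The redo rate and the "small" log size are made-up illustration values, since the AWR excerpt in this article shows neither; the function names are hypothetical as well.

```python
def switch_interval_min(log_size_mb: float, redo_mb_per_min: float) -> float:
    """Minutes one redo log group lasts before it fills and forces a switch."""
    return log_size_mb / redo_mb_per_min


def log_size_for_interval(redo_mb_per_min: float, target_interval_min: float) -> float:
    """Redo log size (MB) needed to keep switches target_interval_min apart."""
    return redo_mb_per_min * target_interval_min


# HYPOTHETICAL inputs: neither value comes from the AWR report in this article.
redo_rate = 10.0  # MB of redo generated per minute (assumed)
small_log = 50.0  # an undersized log file, in MB (assumed)

for size_mb in (small_log, 200.0, 250.0):
    print(f"{size_mb:6.0f} MB log -> switch every "
          f"{switch_interval_min(size_mb, redo_rate):.0f} min")
```

At a fixed redo rate the switch interval grows linearly with the log size, so a larger log file gives checkpoints more time to complete before the group has to be reused.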
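4. Recommendations and solutions
a. Based on the analysis above and the explanations of the log-related wait events, the first priority is to increase the size of the redo log files (to 200-250 MB). See: Adjusting the online redo log size (change redo log size)
b. There are too many log file groups; reducing them to 4-5 groups is recommended
c. Where possible, move the redo logs to faster disks (they are currently on RAID 5), for example onto RAID 0
d. Commit transactions in batches
e. Consider increasing the number of DBWn processes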
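Recommendation d can be sketched as follows. This is a minimal illustration, not the customer's synchronization job: it uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere, and the table name sync_target, the row layout, and the batch size of 1000 are all hypothetical. The point is only the pattern of one commit per batch in place of one commit per row; on Oracle, each commit makes the session wait for LGWR (log file sync).

```python
import sqlite3
from itertools import islice


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert rows, committing once per batch. Returns the number of commits."""
    commits = 0
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))  # take up to batch_size rows
        if not batch:
            break
        conn.executemany(
            "INSERT INTO sync_target (id, payload) VALUES (?, ?)", batch
        )
        conn.commit()  # one commit per batch, not one per row
        commits += 1
    return commits


# In-memory SQLite database used purely as a runnable stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sync_target (id INTEGER PRIMARY KEY, payload TEXT)")

rows = ((i, f"row-{i}") for i in range(10_000))
commits = insert_in_batches(conn, rows, batch_size=1000)
count = conn.execute("SELECT COUNT(*) FROM sync_target").fetchone()[0]
print(f"{count} rows inserted with {commits} commits")
```

Here 10,000 rows are loaded with 10 commits, where a commit-per-row loop would issue 10,000. The right batch size depends on the workload and on how much work can acceptably be redone after a failure.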