We recently had an issue with one of our physical servers.
It would become unresponsive and hang in many situations.
Oracle Linux Server release 7.3
NAME="Oracle Linux Server"
VERSION="7.3"
Linux pwbdd-fe-1 4.1.12-61.1.18.el7uek.x86_64 #
256 GB of RAM and 16 CPUs (2 sockets * 8 cores)
We looked into the sar logs and found that memory usage was around 99.7%.
We also found page allocation failures in /var/log/messages:
kworker/u66:0: page allocation failure: order:1, mode:0x204020
CPU: 8 PID: 1579 Comm: kworker/u66:0 Tainted: G OE 4.1.12-61.1.18.el7uek.x86_64 #2
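To make the failure line easier to read, here is a minimal Python sketch that pulls the order and mode out of such a message. It assumes 4 KiB base pages (the usual x86_64 default), and the names parse_failure, FAILURE_RE and PAGE_SIZE_BYTES are made up for this illustration. An order-N request asks for 2**N physically contiguous pages, so order:1 means the kernel could not find two contiguous free pages.

```python
import re

PAGE_SIZE_BYTES = 4096  # assumption: 4 KiB base pages, the usual x86_64 default

# Matches lines such as:
#   kworker/u66:0: page allocation failure: order:1, mode:0x204020
FAILURE_RE = re.compile(
    r"(?P<comm>\S+): page allocation failure: "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


def parse_failure(line):
    """Return a dict describing the failed allocation, or None if no match."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    order = int(match.group("order"))
    pages = 1 << order  # an order-N request needs 2**N contiguous pages
    return {
        "comm": match.group("comm"),
        "order": order,
        "mode": int(match.group("mode"), 16),
        "pages": pages,
        "bytes": pages * PAGE_SIZE_BYTES,
    }


if __name__ == "__main__":
    sample = "kworker/u66:0: page allocation failure: order:1, mode:0x204020"
    print(parse_failure(sample))
```

For the line above this reports 2 pages, that is 8192 bytes under the 4 KiB assumption. The mode value is only converted to an integer here; decoding its individual flags depends on the kernel version and is not attempted.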
There was no entry in the Oracle Database log or any trace file, so the database had little to do with it.
We finally fixed it with the kernel settings below.
- vm.min_free_kbytes = 1048576
- vm.swappiness = 60
- vm.dirty_ratio = 20
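For scale, 1048576 kB is 1 GiB, roughly 0.4% of the 256 GB in this server. The settings above could be checked and written out with something like the following Python sketch. It is only an illustration: the helper names (sysctl_path, read_current, render_sysctl_conf) are invented here, and it relies on the standard mapping of a sysctl name such as vm.swappiness to the file /proc/sys/vm/swappiness.

```python
from pathlib import Path

# Target values taken from the list above.
TARGETS = {
    "vm.min_free_kbytes": 1048576,
    "vm.swappiness": 60,
    "vm.dirty_ratio": 20,
}


def sysctl_path(name, root="/proc/sys"):
    """Map a sysctl name like 'vm.swappiness' to its file under root."""
    return Path(root).joinpath(*name.split("."))


def read_current(names, root="/proc/sys"):
    """Read the live integer value of each sysctl; None if it cannot be read."""
    values = {}
    for name in names:
        try:
            values[name] = int(sysctl_path(name, root).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def render_sysctl_conf(targets):
    """Render the targets as 'name = value' lines in sysctl.conf syntax."""
    return "".join(f"{name} = {value}\n" for name, value in targets.items())


if __name__ == "__main__":
    current = read_current(TARGETS)
    for name, wanted in TARGETS.items():
        status = "ok" if current[name] == wanted else "differs"
        print(f"{name}: current={current[name]} target={wanted} ({status})")
    print(render_sysctl_conf(TARGETS), end="")
```

The rendered lines can be saved to a file under /etc/sysctl.d/ (for example a hypothetical 99-page-alloc.conf) and loaded with sysctl --system so they survive a reboot. Test any such change on a non-production machine first, since a min_free_kbytes value that is too high for the available RAM can itself cause memory pressure.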
I have to admit that I also used an ODA (Oracle Database Appliance) running Linux 7 with a similar amount of RAM as a reference.
Referenced links:
https://utcc.utoronto.ca/~cks/space/blog/linux/DecodingPageAllocFailures
http://tech.donghao.org/2016/03/25/avoid-page-allocation-failure-for-linux-kernel/
https://unix.stackexchange.com/questions/128539/what-would-happen-if-the-amount-of-free-memory-vm-min-free-kbytes-was-too-low
https://stackoverflow.com/questions/21374491/vm-min-free-kbytes-why-keep-minimum-reserved-memory
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/s-memory-tunables
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-configuration_tools-configuring_system_memory_capacity
Oracle Support Docs
[Linux OS] System Hung with Large Numbers of Page Allocation Failures with "order:5" on Exadata Environments (Doc ID 1546861.1)
Linux: Out-of-Memory (OOM) Killer (Doc ID 452000.1)
Top 5 Issues That Cause Node Reboots or Evictions or Unexpected Recycle of CRS (Doc ID 1367153.1)
RAC and Oracle Clusterware Best Practices and Starter Kit (Linux) (Doc ID 811306.1)
Linux Kernel: The SLAB Allocator (Doc ID 434351.1)
[Linux OS] System Hung with 'page allocation failure' (Doc ID 2059297.1)