Troubleshooting ideas and solutions for high CPU usage in Linux systems

Preface

As Linux operations engineers, we sometimes see the CPU usage of a Linux server reach 100% and stay there. Sustained high CPU usage disrupts the business systems running on the server and can cause losses for the company.


Many operations engineers are at a loss when this happens. When a Java process is overloading the CPU, either of the following two methods can usually locate the problem quickly:

Method 1

Step 1: Use

top

Then press Shift+P to sort by CPU usage and find the PID of the process that is using too much CPU.

Step 2: Use

top -H -p [process id]

Find the id of the thread that consumes the most resources in the process
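
If you prefer not to read the interactive screen, top can also run once in batch mode (-b -n 1), and the busiest thread can be picked out of its output. The following is only a sketch: the function name hottest_thread_from_top is made up for this example, and it assumes the default column layout, where the header row starts with PID and contains a %CPU column.

```shell
#!/bin/sh
# Read `top -H -b -n 1 -p PID` output on stdin and print the ID of the
# thread with the highest %CPU value.
# Assumption: the table header row starts with "PID" and has a "%CPU" column;
# the column position is looked up from the header instead of being hard-coded.
hottest_thread_from_top() {
    awk '
        # Header row: remember which column holds %CPU.
        !cpu_col && $1 == "PID" {
            for (i = 1; i <= NF; i++) if ($i == "%CPU") cpu_col = i
            next
        }
        # Data rows: with -H, the PID column holds the thread ID.
        cpu_col && NF >= cpu_col {
            if (tid == "" || $cpu_col + 0 > max) { max = $cpu_col + 0; tid = $1 }
        }
        END { if (tid != "") print tid }
    '
}

# Usage (replace [process id] with the PID found in Step 1):
#   LC_ALL=C top -H -b -n 1 -p [process id] | hottest_thread_from_top
```

LC_ALL=C is set in the usage line so that the %CPU values are printed with a decimal point, which is what the numeric comparison in awk expects.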

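Step 3: Use

printf "%x\n" [thread id]

or

echo 'obase=16;[thread id]' | bc

Convert the thread ID to hexadecimal. The letters must be lowercase to match the jstack output: printf "%x" prints lowercase directly, while bc (the command-line calculator in Linux) prints uppercase letters that have to be lowercased first.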
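
The conversion is easy to wrap in small helpers. This is a minimal sketch; the function names tid_to_hex and lower_hex are invented for this example and are not part of any tool.

```shell
#!/bin/sh
# Convert a decimal thread ID to lowercase hexadecimal.
tid_to_hex() {
    printf '%x\n' "$1"
}

# bc prints hexadecimal digits in uppercase (A-F); this helper lowercases
# such a value so that it can be used as a search string.
lower_hex() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f'
}

# Usage:
#   tid_to_hex 255      # prints ff
#   lower_hex FF        # prints ff
```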

Step 4: Use

jstack [process id] | grep -A 10 [thread id in hexadecimal]

View the state and stack information of that thread.
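
A fixed number of lines after the match can cut a long stack short or run into the next thread. As an alternative sketch, awk can print exactly one thread entry. The function name thread_stack_by_nid is made up for this example, and the code assumes the jstack output format this article relies on: entries separated by blank lines, each with a header line that contains nid=0x followed by the hexadecimal thread ID and a space.

```shell
#!/bin/sh
# Print the complete jstack entry of one thread, reading the dump from stdin.
# $1 is the thread ID in lowercase hexadecimal, without the 0x prefix.
# Assumption: entries are separated by blank lines and the header line of each
# entry contains "nid=0x<hex>" followed by a space.
thread_stack_by_nid() {
    # RS = "" puts awk in paragraph mode: each blank-line-separated block is
    # one record. The trailing space in the needle keeps e2 from matching e2a.
    awk -v needle="nid=0x$1 " 'BEGIN { RS = "" } index($0, needle) { print }'
}

# Usage:
#   jstack [process id] | thread_stack_by_nid [thread id in hexadecimal]
```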

Method 2

Step 1: Use

top

Then press Shift+P to sort by CPU usage and find the process that is using too much CPU.

Step 2: Use

ps -mp [process id] -o THREAD,tid,time | sort -k2,2 -rn

Get the per-thread information, sorted by the %CPU column (the second column), and find the threads that use a lot of CPU.

Step 3: Use

printf "%x\n" [thread id]

or

echo 'obase=16;[thread id]' | bc

Convert the ID of the thread you want to inspect to hexadecimal (lowercase letters; bc prints uppercase).

Step 4: Use

jstack [process id] | grep -A 30 [thread id in hexadecimal]

Print the stack information of that thread.
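
The four steps of Method 2 can be chained into one function. This is a rough sketch rather than a finished tool: the name dump_hottest_thread is invented here, and the code assumes the procps column layout of ps -o THREAD,tid,time (with %CPU in column 2 and TID in column 8) and jstack thread headers that contain nid=0x followed by the hexadecimal thread ID. It uses awk instead of sort to pick the thread with the highest %CPU.

```shell
#!/bin/sh
# Chain the steps of Method 2 for one process ID ($1).
# Assumptions: procps `ps`, where -o THREAD,tid,time prints %CPU in column 2
# and TID in column 8, and a jstack whose headers contain "nid=0x<hex> ".
dump_hottest_thread() {
    pid=$1
    # Step 2: thread rows have a numeric TID in column 8 (the header and the
    # per-process summary row do not); keep the row with the highest %CPU.
    tid=$(ps -mp "$pid" -o THREAD,tid,time |
          awk '$8 ~ /^[0-9]+$/ && (best == "" || $2 + 0 > max) { max = $2 + 0; best = $8 }
               END { print best }')
    [ -n "$tid" ] || { echo "no threads found for PID $pid" >&2; return 1; }
    # Step 3: convert the thread ID to lowercase hexadecimal.
    hex=$(printf '%x' "$tid")
    echo "hottest thread: TID $tid (nid=0x$hex)"
    # Step 4: print the matching thread header and the 30 lines after it.
    jstack "$pid" | grep -A 30 "nid=0x$hex "
}

# Usage:
#   dump_hottest_thread [process id]
```

Run it as a user that is allowed to attach to the Java process, normally the user that owns the process.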

Case Study

Scenario Description

Troubleshooting high CPU usage of a Java process in a production environment

Troubleshooting process

1. The top command shows that the Java process with PID 2633 is using up to 300% CPU, and a fault has occurred.

2. Now that the process is known, how do we locate the specific thread or code? First, list its threads, sorted by CPU usage:

[root@localhost ~]# ps -mp 2633 -o THREAD,tid,time | sort -k2,2 -rn

The output shows that the thread with TID 3626 consumes the most CPU and has already accumulated 12 minutes of CPU time.

3. Convert the TID of that thread to hexadecimal:

[root@localhost ~]# printf "%x\n" 3626
e2a

4. Finally, use the jstack command to print the stack information of this thread within the process:

[root@localhost ~]# jstack 2633 | grep "e2a" -A 30

Discovering a fault is just as important as troubleshooting it. Most monitoring software on the market, such as Zabbix, Nagios, and Alibaba Cloud Monitoring (for cloud servers), can observe server load in real time. However, most of these tools require operations engineers to set up rules or run checks themselves before a problem is discovered. How can we simply receive alerts instead?

One practical operations tool for this is Professor Wang. Users whose businesses are deployed on Alibaba Cloud only need to bind a read-only AccessKey for the account to be monitored, and the alarm information for their cloud resources is then sent promptly to the corresponding team members.

Changing from actively checking to passively receiving alerts reduces the workload of operations engineers and also makes it less likely that an alarm is missed or ignored.

Summary

That is all for this article. I hope it serves as a useful reference for your study or work. Thank you for your support of 123WORDPRESS.COM.
