I mistakenly used a limited server as an iperf server for 5000 parallel connections. (limit is 1024 processes)
Now every time I log in, I see this:
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable
Then, I try to kill them, but when I do ps, I get this:
-bash-4.1$ ps
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable
Same happens when I do a killall or similar things.
I have even tried to reboot the system but again this is what I get after reboot:
-bash-4.1$ sudo reboot
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable
-bash-4.1$
So basically I cannot do anything. All the commands get this error :/
I can, however, do «exit».
This is an off-site server that I do not have physical access to, so I cannot turn it off/on physically.
Any ideas how I can fix this problem? I highly appreciate any help.
fork: Resource temporarily unavailable
The error means that a resource limit of the current shell has been reached (check the limits with ulimit -a). So you can either try in another shell, or increase the limits with the ulimit command, which controls the resources available to the shell and to the processes it creates.
To increase the limits, try running:
ulimit -Sn unlimited && ulimit -Sl unlimited
to raise the soft limits (this only succeeds if the hard limits are unlimited as well), or:
ulimit -l unlimited
ulimit -n 10240
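to set the maximum amount of memory a process may lock to unlimited and the maximum number of open files to 10240. Note that fork itself fails when the max user processes limit (ulimit -u) is reached, so that is the value to check first.
See: help ulimit for more information.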
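When the shell cannot fork, external tools are out of reach, but the limits of the running shell can still be read from /proc/self/limits with builtins alone. The following is a minimal sketch under that assumption (Linux only); show_limit is a hypothetical helper name, not a standard command.

```shell
#!/usr/bin/env bash
# Print the soft and hard value of one row of a /proc/<pid>/limits style file.
# Only shell builtins are used, so no fork() is needed to run it.
# show_limit is a hypothetical helper name chosen for this sketch.
show_limit() {
    local name=$1 file=${2:-/proc/self/limits} line rest
    while IFS= read -r line; do
        case "$line" in
            "$name"*)
                rest=${line#"$name"}   # drop the row label, keep the columns
                set -- $rest           # word-split into: soft hard units
                echo "$name: soft=$1 hard=$2"
                return 0
                ;;
        esac
    done < "$file"
    return 1                           # row not found
}
```

Calling show_limit "Max processes" directly at the prompt (not inside a command substitution, which would itself fork) prints the soft and hard values that ulimit -u reports.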
To make it persistent, add the above settings to your startup rc files (e.g. ~/.bashrc).
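You can also use /etc/sysctl.conf (see: man sysctl.conf) to increase the kernel limits, e.g. (these two names are the BSD/macOS ones; on Linux the related settings are kernel.pid_max and kernel.threads-max):
kern.maxprocperuid=1000
kern.maxproc=2000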
For the case in the comments, where you were not using much memory per thread, you were hitting the cgroup limits. You will find the default to be around 12288, but the value is writable:
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
12288
$ echo 15000 | sudo tee /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
15000
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
15000
And if I use my «what is the thread limit» program (listed below) to check, before:
$ ./thread-limit
Creating threads ...
100 threads so far ...
200 threads so far ...
...
12100 threads so far ...
12200 threads so far ...
Failed with return code 11 creating thread 12281 (Resource temporarily unavailable).
Malloc worked, hmmm
and after:
$ ./thread-limit
Creating threads ...
100 threads so far ...
200 threads so far ...
300 threads so far ...
...
14700 threads so far ...
14800 threads so far ...
14900 threads so far ...
Failed with return code 11 creating thread 14993 (Resource temporarily unavailable).
Malloc worked, hmmm
Of course, the numbers above are not exact because the «doug» user has a few other threads running, such as my SSH sessions to my server. Check with:
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.current
8
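To compare pids.current against pids.max in one step, something like the following sketch could be used. The helper name pids_headroom is hypothetical, and the default directory is the cgroup v1 path shown above; on systems using the unified cgroup v2 hierarchy the path will differ, so pass the directory explicitly.

```shell
#!/usr/bin/env bash
# Report how many more tasks a pids cgroup can hold.
# pids_headroom is a hypothetical helper; the default directory is the
# cgroup v1 path used in this answer and is an assumption for other systems.
pids_headroom() {
    local dir=${1:-/sys/fs/cgroup/pids/user.slice/user-1000.slice}
    local current max
    read -r current < "$dir/pids.current" || return 1
    read -r max < "$dir/pids.max" || return 1
    if [ "$max" = "max" ]; then
        # the kernel writes the literal word "max" when there is no limit
        echo "current=$current max=unlimited"
    else
        echo "current=$current max=$max headroom=$((max - current))"
    fi
}
```

For example, pids_headroom /sys/fs/cgroup/pids/user.slice/user-1000.slice prints the current count, the ceiling, and the difference between the two.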
Program used:
/* compile with: gcc -pthread -o thread-limit thread-limit.c */
/* originally from: http://www.volano.com/linuxnotes.html */
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>
#define MAX_THREADS 100000
/* stack size requested per thread (1 GiB of address space); renamed from
   PTHREAD_STACK_MIN so the system macro of that name is not redefined */
#define THREAD_STACK_SIZE (1UL*1024*1024*1024)
int i;
/* thread body: sleep so that the thread stays alive and keeps its slot */
void *run(void *arg) {
    (void) arg;
    sleep(60 * 60);
    return NULL;
}
int main(int argc, char *argv[]) {
    int rc = 0;
    pthread_t thread[MAX_THREADS];
    pthread_attr_t thread_attr;
    pthread_attr_init(&thread_attr);
    pthread_attr_setstacksize(&thread_attr, THREAD_STACK_SIZE);
    printf("Creating threads ...\n");
    /* keep creating threads until pthread_create reports an error */
    for (i = 0; i < MAX_THREADS && rc == 0; i++) {
        rc = pthread_create(&(thread[i]), &thread_attr, run, NULL);
        if (rc == 0) {
            pthread_detach(thread[i]);
            if ((i + 1) % 100 == 0)
                printf("%i threads so far ...\n", i + 1);
        }
        else
        {
            printf("Failed with return code %i creating thread %i (%s).\n",
                   rc, i + 1, strerror(rc));
            // can we allocate memory?
            char *block = NULL;
            block = malloc(65545);
            if (block == NULL)
                printf("Malloc failed too :( \n");
            else
                printf("Malloc worked, hmmm\n");
        }
    }
    sleep(60*60); // ctrl+c to exit; makes it easier to see mem use
    exit(0);
}
EDIT May, 2020: For newer versions of Ubuntu, the default maximum PID number is now 4194304, and therefore adjusting it is not needed.
Now, if you have enough memory, the next limit will be defined by the default maximum PID number, which is 32768, but is also writable. Obviously, in order to have more than 32768 simultaneous processes, tasks, or threads, their PIDs must be allowed to go higher:
$ cat /proc/sys/kernel/pid_max
32768
$ echo 80000 | sudo tee /proc/sys/kernel/pid_max
80000
$ cat /proc/sys/kernel/pid_max
80000
Note that it is quite on purpose that a number bigger than 2**16 was chosen, to see if it was actually allowed. And so now, set the cgroup max to, say, 70000:
$ echo 70000 | sudo tee /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
70000
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
70000
And at this point, realize that the above listed program seems to have a limit of about 32768 threads, even if resources are still available, and so use another method. My test server with 16 gigabytes of memory seems to exhaust some other resource at about 62344 tasks, even though there does seem to still be memory available.
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.current
62344
top:
top - 13:48:26 up 21:08, 4 users, load average: 281.52, 134.90, 70.93
Tasks: 62535 total, 201 running, 62334 sleeping, 0 stopped, 0 zombie
%Cpu0 : 96.6 us, 2.4 sy, 0.0 ni, 1.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 95.7 us, 2.4 sy, 0.0 ni, 1.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 95.1 us, 3.1 sy, 0.0 ni, 1.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 93.5 us, 4.6 sy, 0.0 ni, 1.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 94.8 us, 3.4 sy, 0.0 ni, 1.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 95.5 us, 2.6 sy, 0.0 ni, 1.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 94.7 us, 3.5 sy, 0.0 ni, 1.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 93.8 us, 4.5 sy, 0.0 ni, 1.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 15999116 total, 758684 free, 10344908 used, 4895524 buff/cache
KiB Swap: 16472060 total, 16470396 free, 1664 used. 4031160 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
37884 doug 20 0 108052 68920 3104 R 5.7 0.4 1:23.08 top
24075 doug 20 0 4360 652 576 S 0.4 0.0 0:00.31 consume
26006 doug 20 0 4360 796 720 S 0.4 0.0 0:00.09 consume
30062 doug 20 0 4360 732 656 S 0.4 0.0 0:00.17 consume
21009 doug 20 0 4360 748 672 S 0.3 0.0 0:00.26 consume
Seems I finally hit my default ulimit settings for both user processes and number of timers (signals):
$ ulimit -i
62340
doug@s15:~$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 62340
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 32768
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 62340
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
If I raise those limits (in my case, I did it via /etc/security/limits.conf):
# /etc/security/limits.conf
#
# S18 specific edits. 2019.12.24
# also for a rediculous number of threads test.
#
#Each line describes a limit for a user in the form:
#
#<domain> <type> <item> <value>
#
#Where:
#<domain> can be:
# - a user name
# - a group name, with @group syntax
# - the wildcard *, for default entry
# - the wildcard %, can be also used with %group syntax,
# for maxlogin limit
# - NOTE: group and wildcard limits are not applied to root.
# To apply a limit to the root user, <domain> must be
# the literal username root.
#
#<type> can have the two values:
# - "soft" for enforcing the soft limits
# - "hard" for enforcing hard limits
#
#<item> can be one of the following:
# - core - limits the core file size (KB)
# - data - max data size (KB)
# - fsize - maximum filesize (KB)
# - memlock - max locked-in-memory address space (KB)
# - nofile - max number of open file descriptors
* - nofile 32768
# - rss - max resident set size (KB)
# - stack - max stack size (KB)
# - cpu - max CPU time (MIN)
# - nproc - max number of processes
* - nproc 200000
# - as - address space limit (KB)
# - maxlogins - max number of logins for this user
# - maxsyslogins - max number of logins on the system
# - priority - the priority to run user process with
# - locks - max number of file locks the user can hold
# - sigpending - max number of pending signals
* - sigpending 200000
# - msgqueue - max memory used by POSIX message queues (bytes)
# - nice - max nice priority allowed to raise to values: [-20, 19]
# - rtprio - max realtime priority
# - chroot - change root to directory (Debian-specific)
#
#<domain> <type> <item> <value>
#
#* soft core 0
#root hard core 100000
#* hard rss 10000
#@student hard nproc 20
#@faculty soft nproc 20
#@faculty hard nproc 50
#ftp hard nproc 0
#ftp - chroot /ftp
#@student - maxlogins 4
# End of file
I am able to go to 126020 threads before the inability to fork returns. This time the limit was (keep in mind that there are about 150 root-owned threads on this server before the test starts):
cat /proc/sys/kernel/threads-max
126189
O.K. so now adjusting that parameter:
echo 99999999 | sudo tee /proc/sys/kernel/threads-max
99999999
I can get to about 132,000 threads before my 16 gigabyte server starts to swap memory, and trouble erupts.
$ cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.current
132016
Note: running top places a significant additional load on the system under these conditions, so I didn’t run it. However memory:
doug@s18:~/config/etc/security$ free -m
total used free shared buff/cache available
Mem: 15859 15509 270 1 79 137
Swap: 2047 4 2043
At some point you will get into trouble, but it is absolutely amazing how gracefully the system bogs down. Once my system starts to swap, it totally bogs down and I had many of these errors:
Feb 17 16:13:02 s15 kernel: [ 967.907305] INFO: task waiter:119371 blocked for more than 120 seconds.
Feb 17 16:13:02 s15 kernel: [ 967.907335] Not tainted 4.10.0-rc8-stock #194
Feb 17 16:13:02 s15 kernel: [ 967.907357] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
And my load average ballooned to ~29000. But I just left the computer for an hour and it sorted itself out. I staggered the spin out of the threads by 200 microseconds per spin out, and that also seemed to help.
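The walk-through above runs into one ceiling after another (the cgroup pids.max, kernel.pid_max, the ulimit -u and -i values, and kernel.threads-max). To see which of them is currently the lowest, a small helper along these lines could be used. It is only a sketch: min_limit is a hypothetical name, and it only compares the numbers it is given.

```shell
#!/usr/bin/env bash
# Print the smallest numeric limit among the arguments.
# "max" and "unlimited" mean "no limit" and are skipped, as are empty values.
# min_limit is a hypothetical helper name chosen for this sketch.
min_limit() {
    local lowest="" v
    for v in "$@"; do
        case "$v" in
            max|unlimited|"") continue ;;
        esac
        if [ -z "$lowest" ] || [ "$v" -lt "$lowest" ]; then
            lowest=$v
        fi
    done
    echo "${lowest:-unlimited}"
}
```

On a healthy system it could be fed with the live values, for example min_limit "$(ulimit -u)" "$(cat /proc/sys/kernel/pid_max)" "$(cat /proc/sys/kernel/threads-max)" "$(cat /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max)". Memory is not covered, so the real ceiling may still be lower, as happened above at about 132,000 threads.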
Published January 6, 2018
In Linux, there are several ways to impose limits on the complete system or on a particular user.
In this article, we will discuss how to resolve the -bash: fork: retry: Resource temporarily unavailable error in Linux.
The best way is to replicate the issue and understand its root cause.
We will first limit the number of processes that a particular user/group can start, then fork more processes than the limit and observe the behavior.
To start with, we will look at a file called /etc/security/limits.conf.
[root@nglinux ~]# file /etc/security/limits.conf
/etc/security/limits.conf: ASCII English text
[root@nglinux ~]# more /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#
Each line of the file imposes a limit using four fields:
a. Domain: either a user or a group.
b. Type: either hard (the ceiling, which an unprivileged user cannot raise) or soft (the value actually enforced, which the user can raise up to the hard limit).
c. Item: one of the following:
# - core - limits the core file size (KB)
# - data - max data size (KB)
# - fsize - maximum filesize (KB)
# - memlock - max locked-in-memory address space (KB)
# - nofile - max number of open file descriptors
# - rss - max resident set size (KB)
# - stack - max stack size (KB)
# - cpu - max CPU time (MIN)
# - nproc - max number of processes
# - as - address space limit (KB)
# - maxlogins - max number of logins for this user
# - maxsyslogins - max number of logins on the system
# - priority - the priority to run user process with
# - locks - max number of file locks the user can hold
# - sigpending - max number of pending signals
# - msgqueue - max memory used by POSIX message queues (bytes)
# - nice - max nice priority allowed to raise to values: [-20, 19]
# - rtprio - max realtime priority
d. Value contains the value in decimal number format.
Limit number of processes i.e. nproc
Now let's check how to limit the number of processes for a particular user on Linux.
1. First let's create a new user.
[root@nglinux ~]# useradd saket
[root@nglinux ~]# echo nglinux | passwd --stdin saket
Changing password for user saket.
passwd: all authentication tokens updated successfully.
[root@nglinux ~]#
2. Now let's make an entry in /etc/security/limits.conf
[root@nglinux ~]# tail -3 /etc/security/limits.conf
saket hard nproc 10
# End of file
[root@nglinux ~]#
3. Now switch to the saket user and test the scenario by creating more processes than the limit of 10 allows for this user.
[saket@nglinux ~]$ ps
  PID TTY          TIME CMD
 7397 pts/0    00:00:00 bash
 7426 pts/0    00:00:00 top
 7432 pts/0    00:00:00 vim
 7435 pts/0    00:00:00 tail
 7436 pts/0    00:00:00 ps
[saket@nglinux ~]$
[saket@nglinux ~]$ for i in {1..25} ; do tail -f testfile & done
[4] 7454
[5] 7455
[6] 7456
[7] 7457
[8] 7458
[9] 7459
-bash: fork: retry: Resource temporarily unavailable
hello
hello
hello
hello
hello
hello
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
Now if you try to list the running processes you will receive the error "Resource temporarily unavailable". It means that this user's quota is exhausted, i.e. the permitted number of processes has been reached.
You cannot fork any other command either, and hence ps fails.
[saket@nglinux ~]$ ps
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable
[saket@nglinux ~]$
And if you try to run the above command with any other user, you can run it successfully.
[root@nglinux2 ~]# echo "hello" > testfile
[root@nglinux2 ~]# for i in {1..25}; do tail -f testfile & done
[1] 3218
[2] 3219
[3] 3220
[4] 3221
[5] 3222
[6] 3223
hello
[7] 3224
[8] 3225
hello
[9] 3226
[10] 3227
hello
hello
hello
hello
hello
hello
[11] 3228
hello
[12] 3229
hello
[13] 3230
hello
hello
[14] 3231
[15] 3232
[16] 3233
[17] 3234
hello
hello
hello
hello
[18] 3235
hello
[19] 3236
[20] 3237
hello
[21] 3238
hello
hello
[22] 3239
hello
hello
[23] 3240
[24] 3241
[25] 3242
[root@nglinux2 ~]# ps
  PID TTY          TIME CMD
 2328 pts/0    00:00:00 bash
 3218 pts/0    00:00:00 tail
 3219 pts/0    00:00:00 tail
 3220 pts/0    00:00:00 tail
 3221 pts/0    00:00:00 tail
 3222 pts/0    00:00:00 tail
 3223 pts/0    00:00:00 tail
 3224 pts/0    00:00:00 tail
 3225 pts/0    00:00:00 tail
 3226 pts/0    00:00:00 tail
 3227 pts/0    00:00:00 tail
 3228 pts/0    00:00:00 tail
 3229 pts/0    00:00:00 tail
 3230 pts/0    00:00:00 tail
 3231 pts/0    00:00:00 tail
 3232 pts/0    00:00:00 tail
 3233 pts/0    00:00:00 tail
 3234 pts/0    00:00:00 tail
 3235 pts/0    00:00:00 tail
 3236 pts/0    00:00:00 tail
 3237 pts/0    00:00:00 tail
 3238 pts/0    00:00:00 tail
 3239 pts/0    00:00:00 tail
 3240 pts/0    00:00:00 tail
 3241 pts/0    00:00:00 tail
 3242 pts/0    00:00:00 tail
 3259 pts/0    00:00:00 ps
[root@nglinux2 ~]#
Seems interesting. Try it yourself and do post your valuable comments here.
If you receive a similar message, you now know what to check first to resolve the issue.
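When ps cannot be forked, as in the saket transcript above, the processes of a user can still be counted from inside the stuck shell, because reading /proc needs only builtins. The following is a rough sketch under that assumption (Linux /proc layout); count_procs_for_uid is a hypothetical helper name. Note that on Linux the nproc limit counts threads as well, so the real usage can be higher than the number of processes printed here.

```shell
#!/usr/bin/env bash
# Count processes whose real UID matches, using only shell builtins
# (glob, read, test), so it works even when fork() fails.
# count_procs_for_uid is a hypothetical helper name chosen for this sketch.
count_procs_for_uid() {
    local uid=$1 procdir=${2:-/proc} n=0 d key real rest
    for d in "$procdir"/[0-9]*; do
        [ -r "$d/status" ] || continue
        # the "Uid:" row of status holds: real effective saved filesystem
        while read -r key real rest; do
            if [ "$key" = "Uid:" ]; then
                if [ "$real" = "$uid" ]; then
                    n=$((n + 1))
                fi
                break
            fi
        done < "$d/status" 2>/dev/null
    done
    echo "$n"
}
```

Run it directly at the prompt, for example count_procs_for_uid "$UID", rather than inside $(...), since a command substitution would itself need a fork.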
Can’t log in to the server
The server refused to start a shell.
After logging in to the server, running the ls command reports an error:
$ ls
-bash: fork: retry: Resource temporarily unavailable
In essence, the above error message means that the Linux operating system cannot create any more processes.
To solve this problem you need to configure Linux to allow more processes to be created.
Modify the maximum number of Linux processes
We can use ulimit -a to view some of the system parameters of the current Linux system.
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 62357
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Among the above parameters, we usually pay most attention to the maximum number of files a process can open, namely open files.
The maximum number of processes a user is allowed to create is the max user processes parameter.
We can use ulimit -u 4096 to modify the value of max user processes, but only for the session of the current terminal; the system default value is used again after re-login.
The correct way to make the change is to modify the value in the /etc/security/limits.d/90-nproc.conf file.
$ cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
* soft nproc 4096
We only need to modify the value of 4096 in the above file.
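Since nproc entries may live both in /etc/security/limits.conf and in files under /etc/security/limits.d/, it can help to list all of them before editing. The following is a minimal sketch; list_nproc_rules is a hypothetical helper name, and it only prints what the files contain without deciding which entry wins.

```shell
#!/usr/bin/env bash
# Print every active (non-comment) nproc rule from limits.conf style files,
# prefixed with the file it came from.
# list_nproc_rules is a hypothetical helper name chosen for this sketch.
list_nproc_rules() {
    local f domain type item value rest
    for f in "$@"; do
        [ -r "$f" ] || continue
        while read -r domain type item value rest; do
            case "$domain" in
                ''|'#'*) continue ;;   # skip blank lines and comments
            esac
            if [ "$item" = "nproc" ]; then
                echo "$f: $domain $type nproc $value"
            fi
        done < "$f"
    done
    return 0
}
```

A typical call would be list_nproc_rules /etc/security/limits.conf /etc/security/limits.d/*.conf, which for the file above would show the "* soft nproc 4096" line together with its source file.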