Linux error code 137

When I run the following command, I expect the exit code to be 0 since my combined container runs a test that successfully exits with an exit code of 0.

docker-compose up --build --exit-code-from combined

Unfortunately, I consistently receive an exit code of 137 even when the tests in my combined container run successfully and I exit that container with an exit code of 0 (more details on how that happens are specified below).

Below is my docker-compose version:

docker-compose version 1.25.0, build 0a186604

According to this post, the exit code of 137 can be due to two main issues.

  1. The container received a docker stop and the app is not gracefully handling SIGTERM
  2. The container has run out of memory (OOM).

I know the 137 exit code is not because my container has run out of memory. When I run docker inspect <container-id>, I can see that "OOMKilled" is false as shown in the snippet below. I also have 6GB of memory allocated to the Docker Engine which is plenty for my application.

[
    {
        "Id": "db4a48c8e4bab69edff479b59d7697362762a8083db2b2088c58945fcb005625",
        "Created": "2019-12-12T01:43:16.9813461Z",
        "Path": "/scripts/init.sh",
        "Args": [],
        "State": {
            "Status": "exited",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false, <---- shows container did not run out of memory
            "Dead": false,
            "Pid": 0,
            "ExitCode": 137,
            "Error": "",
            "StartedAt": "2019-12-12T01:44:01.346592Z",
            "FinishedAt": "2019-12-12T01:44:11.5407553Z"
        },
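
(For reference, those two fields can be read directly with a format string:)

docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' <container-id>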

My container doesn’t exit from a docker stop so I don’t think the first reason is relevant to my situation either.

How my Docker containers are set up

I have two Docker containers:

  1. b-db — contains my database
  2. b-combined — contains my web application and a series of tests, which run once the container is up and running.

I’m using a docker-compose.yml file to start both containers.

version: '3'
services:
    db:
        build:
            context: .
            dockerfile: ./docker/db/Dockerfile
        container_name: b-db
        restart: unless-stopped
        volumes:     
            - dbdata:/data/db
        ports:
            - "27017:27017"
        networks:
            - app-network

    combined:
        build:
            context: .
            dockerfile: ./docker/combined/Dockerfile
        container_name: b-combined
        restart: unless-stopped
        env_file: .env
        ports:
            - "5000:5000"
            - "8080:8080"
        networks:
            - app-network
        depends_on:
            - db

networks:
    app-network:
        driver: bridge

volumes:
    dbdata:
    node_modules:

Below is the Dockerfile for the combined service in docker-compose.yml.

FROM cypress/included:3.4.1

WORKDIR /usr/src/app

COPY package*.json ./

RUN npm install

COPY . .

EXPOSE 5000

RUN npm install -g history-server nodemon

RUN npm run build-test

EXPOSE 8080

COPY ./docker/combined/init.sh /scripts/init.sh

RUN ["chmod", "+x", "/scripts/init.sh"]

ENTRYPOINT [ "/scripts/init.sh" ]

Below is what is in my init.sh file.

#!/bin/bash
# Start front end server
history-server dist -p 8080 &
front_pid=$!

# Start back end server that interacts with DB
nodemon -L server &
back_pid=$!

# Run tests
NODE_ENV=test $(npm bin)/cypress run --config video=false --browser chrome

# Error code of the test
test_exit_code=$?

echo "TEST ENDED WITH EXIT CODE OF: $test_exit_code"

# End front and backend server
kill -9 $front_pid
kill -9 $back_pid

# Exit with the error code of the test
echo "EXITING SCRIPT WITH EXIT CODE OF: $test_exit_code"
exit "$test_exit_code"

Below is the Dockerfile for my db service. All it's doing is copying some local data into the Docker container and then initialising the database with this data.

FROM mongo:3.6.14-xenial

COPY ./dump/ /tmp/dump/

COPY mongo_restore.sh /docker-entrypoint-initdb.d/

RUN chmod 777 /docker-entrypoint-initdb.d/mongo_restore.sh

Below is what is in mongo_restore.sh.

#!/bin/bash
# Creates db using copied data
mongorestore /tmp/dump

Below are the last few lines of output when I run docker-compose up --build --exit-code-from combined; echo $?.

...
b-combined | user disconnected
b-combined | Mongoose disconnected
b-combined | Mongoose disconnected through Heroku app shutdown
b-combined | TEST ENDED WITH EXIT CODE OF: 0 ===========================
b-combined | EXITING SCRIPT WITH EXIT CODE OF: 0 =====================================
Aborting on container exit...
Stopping b-combined   ... done
137

What is confusing, as you can see above, is that the test and the script ended with an exit code of 0 since all my tests passed successfully, but the container still exited with an exit code of 137.

What is even more confusing is that when I comment out the following line (which runs my Cypress integration tests) from my init.sh file, the container exits with a 0 exit code as shown below.

NODE_ENV=test $(npm bin)/cypress run --config video=false --browser chrome

Below is the output I receive when I comment out or remove the above line from init.sh.

...
b-combined | TEST ENDED WITH EXIT CODE OF: 0 ===========================
b-combined | EXITING SCRIPT WITH EXIT CODE OF: 0 =====================================
Aborting on container exit...
Stopping b-combined   ... done
0

How do I get docker-compose to return me a zero exit code when my tests run successfully and a non-zero exit code when they fail?

EDIT:

After running the following docker-compose command in debug mode, I noticed that b-db seems to have some trouble shutting down and is potentially receiving a SIGKILL signal from Docker because of that.

docker-compose --log-level DEBUG up --build --exit-code-from combined; echo $?

Is this indeed the case according to the following output?

...
b-combined exited with code 0
Aborting on container exit...
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
http://localhost:None "GET /v1.25/containers/json?limit=-1&all=1&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Db-property%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 3819
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
http://localhost:None "GET /v1.25/containers/0626d6bf49e5236440c82de4e969f31f4f86280d6f8f555f05b157fa53bae9b8/json HTTP/1.1" 200 None
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
http://localhost:None "GET /v1.25/containers/json?limit=-1&all=0&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22com.docker.compose.project%3Db-property%22%2C+%22com.docker.compose.oneoff%3DFalse%22%5D%7D HTTP/1.1" 200 4039
http://localhost:None "POST /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/attach?logs=0&stdout=1&stderr=1&stream=1 HTTP/1.1" 101 0
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
http://localhost:None "GET /v1.25/containers/0626d6bf49e5236440c82de4e969f31f4f86280d6f8f555f05b157fa53bae9b8/json HTTP/1.1" 200 None
Stopping b-combined   ...
Stopping b-db         ...
Pending: {<Container: b-db (0626d6)>, <Container: b-combined (196f3e)>}
Starting producer thread for <Container: b-combined (196f3e)>
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
Pending: {<Container: b-db (0626d6)>}
[... the line "Pending: {<Container: b-db (0626d6)>}" repeats many more times ...]
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
[... more "Pending: {<Container: b-db (0626d6)>}" repeats ...]
http://localhost:None "POST /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/wait HTTP/1.1" 200 32
http://localhost:None "POST /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/stop?t=10 HTTP/1.1" 204 0
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
http://localhost:None "POST /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561bStopping b-combined   ... done
Finished processing: <Container: b-combined (196f3e)>
Pending: {<Container: b-db (0626d6)>}
Starting producer thread for <Container: b-db (0626d6)>
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
http://localhost:None "GET /v1.25/containers/0626d6bf49e5236440c82de4e969f31f4f86280d6f8f555f05b157fa53bae9b8/json HTTP/1.1" 200 None
Pending: set()
Pending: set()
Pending: set()
Pending: set()
Pending: set()
Pending: set()
http://localhost:None "GET /v1.25/containers/0626d6bf49e5236440c82de4e969f31f4f86280d6f8f555f05b157fa53bae9b8/json HTTP/1.1" 200 None
http://localhost:None "POST /v1.25/containers/0626d6bf49e5236440c82de4e969f31f4f86280d6f8f555f05b157fa53bae9b8/stop?t=10 HTTP/1.1" 204 0
http://localhost:None "POST /v1.25/containers/0626d6bf49e5236440c82de4e969f31f4f86280d6f8f555f05b157fa53bae9b8/wait HTTP/1.1" 200 30
Stopping b-db         ... done
Pending: set()
http://localhost:None "GET /v1.25/containers/0626d6bf49e5236440c82de4e969f31f4f86280d6f8f555f05b157fa53bae9b8/json HTTP/1.1" 200 None
http://localhost:None "GET /v1.25/containers/196f3e622847b4c4c82d8d761f9f19155561be961eecfe874bbb04def5b7c9e5/json HTTP/1.1" 200 None
137

On a large project you may sometimes encounter a mysterious build failure with no clear error message. The logs say
something about java exiting with exit code 1, and on further investigation you find that the reason was that another
java exited with error code 137. Usually this happens on a CI machine (Jenkins, TeamCity, GitLab). What does it mean
and how do you fix it?

137 = killed by SIGKILL

Exit code 137 is Linux-specific and means that your process was killed by a signal, namely SIGKILL. Shells report death-by-signal as 128 plus the signal number, and SIGKILL is signal 9, so 128 + 9 = 137. The main reason for a process getting killed by SIGKILL on Linux (unless you do it yourself) is running out of memory.
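
You can reproduce the code in any shell (a minimal sketch):

sleep 60 &     # start a throwaway background process
kill -9 $!     # kill it with SIGKILL (signal 9)
wait $!        # wait returns the process's exit status
echo $?        # prints 137 (128 + 9)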

Physical memory vs JVM heap

It is important here to understand that the process was killed by the operating system, not the JVM. In fact, when the
JVM runs out of heap, it throws an OutOfMemoryError and you get a nice stack trace, whereas exit code 137 means that
the process was killed abruptly without any chance to produce a stack trace.

The JVM has a well-known -Xmx option to limit its heap usage. If your process gets killed with exit code 137, you want
to lower the heap limit, not raise it, as you want your process to be constrained by the JVM (to get the nice stack
trace and diagnostics) and not the kernel.
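
For example (a sketch; the heap size and jar name are illustrative):

java -Xmx2g -jar build-task.jar    # heap capped at 2 GiB, below the machine's physical memory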

What to do

In summary, to fix error 137, you need to take one of these three measures:

  1. Talk to your CI administrators to find out how much memory the agent has available.
  2. Make sure that any heap limits you pass to the JVM are lower than the amount of memory on the machine.
  3. If your build starts taking too long or fails due to an OutOfMemoryError coming from the JVM, you have to either
    ask the CI team to give your machine more memory (letting you increase the JVM heap limit), or optimize the
    memory-hungry part of the build.

Docker systems can be used for a wide range of applications, from setting up development environments to hosting web instances.

Live Docker containers that crash often end up defeating their purpose. As a result, a major concern faced by Docker providers is ensuring container uptime.

Containers crash for many reasons, the main one being a lack of memory. During a crash, the container will show an exit code that explains the reason for the crash.

Today we'll see what causes the Docker 'Exited (137)' crash message and how to fix it.

What causes error 137 in Docker

Docker crashes are often denoted by the message 'Exited' in the container 'STATUS' column when listing the containers using the 'docker ps -a' command.

Error 137 in Docker denotes that the container was ‘KILL’ed by ‘oom-killer’ (Out of Memory). This happens when there isn’t enough memory in the container for running the process.

‘OOM killer’ is a proactive process that jumps in to save the system when its memory level goes too low, by killing the resource-abusive processes to free up memory for the system.

Here is a snippet that shows the MySQL container exited with error 137:

[Screenshot: Docker container exited with error 137]

When the MySQL process running in the container exceeded its memory limit, the OOM-killer killed the container and it exited with code 137.

How to debug error 137 in Docker

Each Docker container has a log file associated with it. These log files store all the relevant information and updates related to that container.

Examining the container log file is vital in troubleshooting its crash. Details of the crash can be identified by checking the container logs using the 'docker logs' command.
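
A couple of useful invocations:

docker logs --tail 100 <container-id>    # last 100 lines of the container's output
docker logs --since 1h <container-id>    # everything logged in the last hour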

For the MySQL container that crashed, the log files showed the following information:

[Screenshot: Docker error 137 – Out of memory]

The logs in this case clearly show that the MySQL container was killed due to the mysqld process taking up too much memory.

Reasons for ‘Out of memory’ error 137 in Docker

In the Docker architecture, containers are hosted on a single physical machine. Error 137 in Docker usually happens due to two main out-of-memory reasons:

1. Docker container has run out of memory

By default, Docker containers use the available memory in the host machine. To prevent a single container from abusing the host resources, we set memory limits per container.

But if the memory usage by the processes in the container exceeds this memory limit set for the container, the OOM-killer would kill the application and the container crashes.

An application can use up too much memory due to improper configuration, service not optimized, high traffic or usage or resource abuse by users.
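
For example (a sketch; the image name and values are illustrative), a hard limit can be set when starting a container:

# limit the container to 512 MB of RAM; setting --memory-swap to the same
# value also prevents the container from falling back to swap
docker run -d --name app --memory 512m --memory-swap 512m demo-org/app:latest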

[Diagram: Docker system architecture]

2. Docker host has no free memory

The total memory that can be allotted to Docker containers is limited by the memory available in the host machine that hosts them.

Often, when usage and traffic increase, the available free memory becomes insufficient for all the containers, and containers may crash.

How to resolve error 137 in Docker

When a Docker container exits with an OOM error, it shows that there is a lack of memory. But the first resort should not be to increase the RAM in the host machine.

Improperly configured services, abusive processes or peak traffic can lead to a memory shortage. So the first step is to identify the cause of this memory usage.

After identifying the cause, the following corrective actions can be done in the Docker system to avoid further such OOM crashes.

1. Optimize the services

Unoptimized applications can take up more memory than necessary. For instance, an improperly configured MySQL service can quickly consume the entire host memory.

So, the first step is to monitor the application running in the container and to optimize the service. This can be done by editing the configuration file or recompiling the service.

2. Mount config files from outside

It is always advisable to mount the config files of services from outside the Docker container. This allows editing them easily without rebuilding the Docker image.

For instance, in the MySQL Docker image, "/etc/mysql/conf.d" can be mounted as a volume. Any configuration changes for the MySQL service can then be made without touching the image.
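
A sketch of such a mount (the host path and image tag are illustrative):

docker run -d --name mysql \
  -v /srv/mysql/conf.d:/etc/mysql/conf.d \
  mysql:5.7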

3. Monitor the container usage

Monitoring the containers' memory usage to detect abusive users, resource-depleted processes, traffic spikes, etc. is vital in Docker system management.

Depending on the traffic and resource usage of processes, memory limits for the containers can be changed to suit their business purpose better.
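
The built-in stats command gives a quick snapshot:

docker stats --no-stream    # one-off snapshot of per-container CPU and memory usage against limits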

4. Add more RAM to the host machine

After optimizing the services and setting memory limits, if the containers are running at their maximum memory limits, then we should add more RAM.

Adding more RAM and ensuring enough swap memory in the host machine would help the containers to utilize that memory whenever there is a memory crunch.

In short...

Today we saw how to fix error 137 in Docker using a systematic debugging method. But there may be scenarios where the error code may not be displayed, especially when the container is launched from some shell script.


Exit code 137 occurs when a process is terminated because it’s using too much memory. Your container or Kubernetes pod will be stopped to prevent the excessive resource consumption from affecting your host’s reliability.

Processes that end with exit code 137 need to be investigated. The problem could be that your system simply needs more physical memory to meet user demands. However, there might also be a memory leak or sub-optimal programming inside your application that’s causing resources to be consumed excessively.

In this article, you’ll learn how to identify and debug exit code 137 so your containers run reliably. This will reduce your maintenance overhead and help stop inconsistencies caused by services stopping unexpectedly. Although some causes of exit code 137 can be highly specific to your environment, most problems can be solved with a simple troubleshooting sequence.

What is exit code 137?

All processes emit an exit code when they terminate. Exit codes provide a mechanism for informing the user, operating system, and other applications why the process stopped. Each code is a number between 0 and 255. The meaning of codes below 125 is application-dependent, while higher values have special meanings.

A 137 code is issued when a process is terminated externally because of its memory consumption. The operating system's out-of-memory (OOM) killer intervenes to stop the program before it destabilizes the host.

When you start a foreground program in your shell, you can read the $? variable to inspect the process exit code:
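
$ ./demo-binary
Killed
$ echo $?
137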

As this example returned 137, you know that demo-binary was stopped because it used too much memory. The same thing happens for container processes, too—when a memory limit is being approached, the process will be terminated, and a 137 code issued.

Pods running in Kubernetes will show a status of OOMKilled when they encounter a 137 exit code. Although this looks like any other Kubernetes status, it’s caused by the operating system’s OOM killer terminating the pod’s process. You can check for pods that have used too much memory by running Kubectl’s get pods command:

$ kubectl get pods

NAME       READY   STATUS      RESTARTS   AGE
demo-pod   0/1     OOMKilled   0          2m05s

Memory consumption problems can affect anyone, not just organizations using Kubernetes. You could run into similar issues with Amazon ECS, RedHat OpenShift, Nomad, CloudFoundry, and plain Docker deployments. Regardless of the platform, if a container fails with a 137 exit code, the root cause will be the same: there’s not enough memory to keep it running.

For example, you can view a stopped Docker container’s exit code by running docker ps -a:

$ docker ps -a

CONTAINER ID   IMAGE                        COMMAND         CREATED      STATUS
cdefb9ca658c   demo-org/demo-image:latest   "demo-binary"   2 days ago   Exited (137) 1 day ago

The exit code is shown in brackets under the STATUS column. The 137 value confirms this container stopped because of a memory problem.

Causes of container memory issues

Understanding the situations that lead to memory-related container terminations is the first step towards debugging exit code 137. Here are some of the most common issues that you might experience.

Container memory limit exceeded

Kubernetes pods will be terminated when they try to use more memory than their configured limit allows. You might be able to resolve this situation by increasing the limit if your cluster has spare capacity available.

Application memory leak

Poorly optimized code can create memory leaks. A memory leak occurs when an application uses memory, but doesn’t release it when the operation’s complete. This causes the memory to gradually fill up, and will eventually consume all the available capacity.

Natural increases in load

Sometimes adding physical memory is the only way to solve a problem. Growing services that experience an increase in active users can reach a point where more memory is required to serve the increase in traffic.

Requesting more memory than your compute nodes can provide

Kubernetes pods configured with memory resource requests can use more memory than the cluster's nodes have if limits aren't also used. A request allows consumption overages because it's only an indication of how much memory a pod will consume, and doesn't prevent the pod from consuming more memory if it's available.

Running too many containers without memory limits

Running several containers without memory limits can create unpredictable Kubernetes behavior when the node’s memory capacity is reached. Containers without limits have a greater chance of being killed, even if a neighboring container caused the capacity breach.

Preventing pods and containers from causing memory issues

Debugging container memory issues in Kubernetes—or any other orchestrator—can seem complex, but using the right tools and techniques helps make it less stressful. Kubernetes assigns memory to pods based on the requests and limits they declare. Unless it resides in a namespace with a default memory limit, a pod that doesn’t use these mechanisms can normally access limitless memory.

Setting memory limits

Pods without memory limits increase the chance of OOM kills and exit code 137 errors. These pods are able to use more memory than the node can provide, which poses a stability risk. When memory consumption gets close to the physical limit, the Linux kernel OOM killer intervenes to stop processes that are using too much memory.

Making sure each of your pods includes a memory limit is a good first step towards preventing OOM kill issues. Here’s a sample pod manifest:
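
# a minimal sketch matching the description below; pod, container and image
# names are illustrative
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
    - name: demo-container
      image: demo-org/demo-image:latest
      resources:
        requests:
          memory: 256Mi
        limits:
          memory: 512Mi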

The requests field indicates the pod wants 256 Mi of memory. Kubernetes will use this information to influence scheduling decisions, and will ensure that the pod is hosted by a node with at least 256 Mi of memory available. Requests help to reduce resource contention, ensuring your applications have the resources they need. It’s important to note, though, that they don’t prevent the pod from using more memory if it’s available on the node.

This sample pod also includes a memory limit of 512 Mi. If memory consumption goes above 512 Mi, the pod becomes a candidate for termination. If there’s too much memory pressure and Kubernetes needs to free up resources, the pod could be stopped. Setting limits on all of your pods helps prevent excessive memory consumption in one from affecting the others.

Investigating application problems

Once your pods have appropriate memory limits, you can start investigating why those limits are being reached. Start by analyzing traffic levels to identify anomalies as well as natural growth in your service. If memory use has grown in correlation with user activity, it could be time to scale your cluster with new nodes, or to add more memory to existing ones.
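
If your cluster runs the metrics-server add-on, current consumption is easy to check:

kubectl top pods     # per-pod CPU and memory usage
kubectl top nodes    # per-node totals, showing how close each node is to capacity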

If your nodes have sufficient memory, you’ve set limits on all your pods, and service use has remained relatively steady, the problem is likely to be within your application. To figure out where, you need to look at the nature of your memory consumption issues: is usage suddenly spiking, or does it gradually increase over the course of the pod’s lifetime?

A memory usage graph that shows large peaks can point to poorly optimized functions in your application. Specific parts of your codebase could be allocating a lot of memory to handle demanding user requests. You can usually work out the culprit by reviewing pod logs to determine which actions were taken around the time of the spike. It might be possible to refactor your code to use less memory, such as by explicitly freeing up variables and destroying objects after you’ve finished using them.

Memory graphs that show continual increases over time usually mean you’ve got a memory leak. These problems can be tricky to find, but reviewing application logs and running language-specific analysis tools can help you discover suspect code. Unchecked memory leaks will eventually fill all the available physical memory, forcing the OOM killer to stop processes so the capacity can be reclaimed.

Final thoughts

Exit code 137 means a container or pod is trying to use more memory than it’s allowed. The process gets terminated to prevent memory usage ballooning indefinitely, which could cause your host system to become unstable.

Excessive memory usage can occur due to natural growth in your application’s use, or as the result of a memory leak in your code. It’s important to set correct memory limits on your pods to guard against these issues; while reaching the limit will prompt termination with a 137 exit code, this mechanism is meant to protect you against worse problems that will occur if system memory is depleted entirely.

When you’re using Kubernetes, you should proactively monitor your cluster so you’re aware of normal memory consumption and can identify any spikes.
