Segmentation fault in C: what it is - Only those who do nothing make no mistakes!

The segmentation fault, also known as segfault, is a type of computer error that occurs whenever an unexpected condition causes the processor to attempt to access a memory location that’s outside its own program storage area. The term “segmentation” refers to a memory protection mechanism used in virtual memory operating systems.

This error arises because the operating system divides memory into separately protected regions and gives each running program its own address space. An access that falls outside the regions mapped for the program, or that violates their permissions, is rejected.

Segmentation faults are usually triggered by an access violation, which occurs when the program attempts to execute instructions outside its memory area or tries to read or write an address that is reserved or does not exist. The operating system then halts the application and reports a message such as “Segmentation fault”.

#1. What are the Symptoms of Segmentation Fault?

The symptoms of segmentation faults may vary depending on how and where they’re generated. Typically, this error is generated due to one of the following conditions:

#a. Dereferencing a null pointer

Many programming languages offer references or pointers, which identify where in memory an item is located. A null pointer is a special pointer that doesn’t point to any valid memory location. Dereferencing (accessing) a null pointer results in a segmentation fault or a null pointer exception.

/**
 * @file main.c
 * @author freecoder
 * @brief this program triggers a segmentation fault by dereferencing a null pointer
 *
 * @version 1.0
 * @date 8 Jan. 2022
 *
 * @copyright Copyright (c) 2022
 *
 */
#include <stdio.h>

/* main program entry */
int main(int argc, char **argv)
{
	/* local variables */
	unsigned int *puiPointer = NULL;
	/* body program */
	*puiPointer = 20;
	return 0;
}

After compiling the program and running it, first directly and then under gdb, the segmentation fault appears:

➜  Article-XX gcc -g main.c -o main
➜  Article-XX ./main               
[1]    7825 segmentation fault  ./main
➜  Article-XX gdb ./main           
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./main...
(gdb) start
Temporary breakpoint 1 at 0x401111: file main.c, line 20.
Starting program: /home/others/Article-XX/main 
warning: Error disabling address space randomization: Operation not permitted
Temporary breakpoint 1, main (argc=1, argv=0x7ffc9c096258) at main.c:20
20              unsigned int *puiPointer = NULL;
(gdb) list
15
16      /* main program entry */
17      int main(int argc, char **argv)
18      {
19              /* local variables */
20              unsigned int *puiPointer = NULL;
21
22              /* body program */
23
24              *puiPointer = 20;
(gdb) s
24              *puiPointer = 20;
(gdb) s
Program received signal SIGSEGV, Segmentation fault.
0x000000000040111d in main (argc=1, argv=0x7ffc9c096258) at main.c:24
24              *puiPointer = 20;
(gdb)

#b. Trying to access memory not initialized

Programs that use uninitialized variables may crash when they access uninitialized memory, or may expose whatever data happens to be stored there. The same applies when a program reads or writes an area of memory that was never allocated with malloc(), calloc() or realloc().

An example of a simple segmentation fault is writing through a pointer before it has been set:

/**
 * @file main.c
 * @author freecoder
 * @brief this program triggers a segmentation fault by writing through an uninitialized pointer
 *
 * @version 1.0
 * @date 8 Jan. 2022
 *
 * @copyright Copyright (c) 2022
 *
 */
#include <stdio.h>

/* main program entry */
int main(int argc, char **argv)
{
	/* local variables */
	unsigned int *puiPointer;
	/* body program */
	*puiPointer = 20;
	return 0;
}

In this case, the pointer puiPointer holds an indeterminate value, so it may point to any location in memory. When the program attempts to write to that location (by dereferencing the pointer with *puiPointer), a segmentation fault is likely to be triggered:

➜  Article-XX gcc -g main.c -o main
➜  Article-XX gdb ./main           
Reading symbols from ./main...
(gdb) start
Temporary breakpoint 1 at 0x401111: file main.c, line 24.
Starting program: /home/others/Article-XX/main 
warning: Error disabling address space randomization: Operation not permitted
Temporary breakpoint 1, main (argc=1, argv=0x7fff6df4f038) at main.c:24
24              *puiPointer = 20;
(gdb) list
19              /* local variables */
20              unsigned int *puiPointer;
21
22              /* body program */
23
24              *puiPointer = 20;
25
26              return 0;
27      }
(gdb) s
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401115 in main (argc=1, argv=0x7fff6df4f038) at main.c:24
24              *puiPointer = 20;
(gdb)

#c. Trying to access memory out of bounds for the program

In most situations, if a program attempts to access (read or write) memory outside of its boundaries, a segmentation fault error will occur. A code example of a simple segmentation fault error is below:

/**
 * @file main.c
 * @author freecoder
 * @brief this program triggers a segmentation fault by writing outside the bounds of an array
 *
 * @version 1.0
 * @date 8 Jan. 2022
 *
 * @copyright Copyright (c) 2022
 *
 */
#include <stdio.h>

/* main program entry */
int main(int argc, char **argv)
{
	/* local variables */
	unsigned int uiArray[20];
	/* body program */
	uiArray[5000] = 1;
	return 0;
}

As shown below, the segmentation fault occurs when the out-of-bounds statement executes:

➜  Article-XX gcc -g main.c -o main
➜  Article-XX gdb ./main           
Reading symbols from ./main...
(gdb) start
Temporary breakpoint 1 at 0x401111: file main.c, line 23.
Starting program: /home/others/Article-XX/main 
warning: Error disabling address space randomization: Operation not permitted
Temporary breakpoint 1, main (argc=1, argv=0x7ffdb68620f8) at main.c:23
23              uiArray[5000] = 1;
(gdb) list
18      {
19              /* local variables */
20              unsigned int uiArray[20];
21
22              /* body program */
23              uiArray[5000] = 1;
24
25              return 0;
26      }
(gdb) s
Program received signal SIGSEGV, Segmentation fault.
main (argc=1, argv=0x7ffdb68620f8) at main.c:23
23              uiArray[5000] = 1;
(gdb)
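
One way to avoid the out-of-bounds write shown above is to check the index against the array length before storing. The following is a minimal sketch under that assumption; array_set and ARRAY_LEN are hypothetical names introduced for illustration and are not part of the original program.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_LEN 20   /* hypothetical constant: same size as uiArray above */

/* Hypothetical helper: stores value at arr[idx] only when idx < len.
 * Returns 0 on success and -1 when the index is out of range. */
int array_set(unsigned int *arr, size_t len, size_t idx, unsigned int value)
{
    assert(arr != NULL);   /* a NULL array is a programming error */
    if (idx >= len) {
        return -1;         /* refuse the write instead of corrupting memory */
    }
    arr[idx] = value;
    return 0;
}
```

With an array declared as unsigned int uiArray[ARRAY_LEN], a call such as array_set(uiArray, ARRAY_LEN, 5000, 1) returns -1 and leaves memory untouched, so the caller can report the error instead of crashing.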

#d. Trying to modify string literals

/**
 * @file main.c
 * @author freecoder
 * @brief this program triggers a segmentation fault by modifying a string literal
 *
 * @version 1.0
 * @date 8 Jan. 2022
 *
 * @copyright Copyright (c) 2022
 *
 */
#include <stdio.h>

/* main program entry */
int main(int argc, char **argv)
{
	/* local variables */
	char* pucString = "Sample String 1";
	/* body program */
	pucString[14] = '2';
	return 0;
}

As shown below, we get a segmentation fault because the compiler places the string constant “Sample String 1” in read-only memory, so the attempt to modify its contents fails:

➜  Article-XX gcc -g main.c -o main
➜  Article-XX gdb ./main           
Reading symbols from ./main...
(gdb) start
Temporary breakpoint 1 at 0x401111: file main.c, line 20.
Starting program: /home/others/Article-XX/main 
warning: Error disabling address space randomization: Operation not permitted
Temporary breakpoint 1, main (argc=1, argv=0x7ffc2ea212e8) at main.c:20
20              char* pucString = "Sample String 1";
(gdb) list
15
16      /* main program entry */
17      int main(int argc, char **argv)
18      {
19              /* local variables */
20              char* pucString = "Sample String 1";
21
22              /* body program */
23              pucString[14] = '2';
24
(gdb) n
23              pucString[14] = '2';
(gdb) 
Program received signal SIGSEGV, Segmentation fault.
main (argc=1, argv=0x7ffc2ea212e8) at main.c:23
23              pucString[14] = '2';
(gdb)
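
A common way around this problem is to initialize a char array from the literal, which gives the program its own writable copy, and to modify that copy instead. The sketch below assumes this approach; set_char_at is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: replaces the character at position idx of a
 * writable, NUL-terminated buffer. Returns 0 on success and -1 if idx
 * is past the end of the string. */
int set_char_at(char *text, size_t idx, char c)
{
    assert(text != NULL);          /* a NULL buffer is a programming error */
    if (idx >= strlen(text)) {
        return -1;                 /* index outside the string */
    }
    text[idx] = c;
    return 0;
}
```

Declaring the string as char acString[] = "Sample String 1"; (an array, not a pointer to the literal) and calling set_char_at(acString, 14, '2') changes the buffer to "Sample String 2" without touching read-only memory.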

#e. Using variable’s value as an address

A segmentation fault can occur when you accidentally use a variable’s value as an address, as in the code example below:

/**
 * @file main.c
 * @author freecoder
 * @brief this program triggers a segmentation fault by passing a value to scanf instead of an address
 *
 * @version 1.0
 * @date 8 Jan. 2022
 *
 * @copyright Copyright (c) 2022
 *
 */
#include <stdio.h>

/* main program entry */
int main(int argc, char **argv)
{
	/* local variables */
	int iVariable;
	/* body program */
	scanf("%d", iVariable);
	return 0;
}

As shown in the terminal console below, the segmentation fault occurs at the scanf statement:

➜  Article-XX gcc -g main.c -o main
➜  Article-XX gdb ./main           
Reading symbols from ./main...
(gdb) start
Temporary breakpoint 1 at 0x401135: file main.c, line 23.
Starting program: /home/others/Article-XX/main 
warning: Error disabling address space randomization: Operation not permitted
Temporary breakpoint 1, main (argc=1, argv=0x7fff418f9658) at main.c:23
23              scanf("%d", iVariable);
(gdb) list
18      {
19              /* local variables */
20              int iVariable;
21
22              /* body program */
23              scanf("%d", iVariable);
24
25              return 0;
26      }
(gdb) n
1
Program received signal SIGSEGV, Segmentation fault.
0x00007ff3e1d2201a in __vfscanf_internal (s=<optimized out>, format=<optimized out>, argptr=argptr@entry=0x7fff418f9460, mode_flags=mode_flags@entry=2)
    at vfscanf-internal.c:1895
1895    vfscanf-internal.c: No such file or directory.
(gdb)
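
The scanf family needs the address of the destination variable, and its return value tells you how many items were actually converted. The sketch below illustrates both points; it uses sscanf on a string rather than scanf on standard input so that it can run without interactive input, and parse_int is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: parses a decimal integer from text into *out.
 * Returns 1 on success and 0 if nothing could be converted or out is NULL. */
int parse_int(const char *text, int *out)
{
    assert(text != NULL);   /* a NULL input string is a programming error */
    if (out == NULL) {
        return 0;           /* nowhere to store the result */
    }
    /* out is already an address, so no & is needed here; with a plain
     * int variable the call would be sscanf(text, "%d", &variable). */
    return sscanf(text, "%d", out) == 1;
}
```

Applied to the example above, the fix is simply scanf("%d", &iVariable); checking that the call returns 1 before using iVariable is a reasonable extra precaution.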

#f. Stack overflow

The segmentation fault error may occur if the call stack pointer exceeds the stack bound in case of an infinite recursive function call:

/**
 * @file main.c
 * @author freecoder
 * @brief this program triggers a segmentation fault through unbounded recursion
 *
 * @version 1.0
 * @date 8 Jan. 2022
 *
 * @copyright Copyright (c) 2022
 *
 */
#include <stdio.h>

/* main program entry */
int main(void)
{
	/* local variables */
	/* body program */
	main();
	return 0;
}

As shown below, the segmentation fault happens because of a stack overflow caused by main calling itself without end:

➜  Article-XX gcc -g main.c -o main
➜  Article-XX gdb ./main                                
Reading symbols from ./main...
(gdb) start
Temporary breakpoint 1 at 0x40110a: file main.c, line 22.
Starting program: /home/others/Article-XX/main 
warning: Error disabling address space randomization: Operation not permitted
Temporary breakpoint 1, main () at main.c:22
22              main();
(gdb) list
17      int main(void)
18      {
19              /* local variables */
20
21              /* body program */
22              main();
23
24              return 0;
25      }
(gdb) n
Program received signal SIGSEGV, Segmentation fault.
main () at main.c:22
22              main();
(gdb)
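
The recursion above never stops because it has no base case. As a contrast, here is a minimal sketch of a recursive function that does terminate and that also carries an explicit depth limit as a second guard. The names sum_recursive and max_depth are hypothetical and chosen only for this illustration.

```c
#include <assert.h>

/* Hypothetical helper: returns n + (n - 1) + ... + 1 for n > 0 and 0
 * otherwise, or -1 if more than max_depth nested calls would be needed. */
long sum_recursive(long n, int max_depth)
{
    assert(max_depth >= 0);   /* a negative limit is a programming error */
    if (n <= 0) {
        return 0;             /* base case: stops the recursion */
    }
    if (max_depth == 0) {
        return -1;            /* depth budget exhausted: give up early */
    }
    long rest = sum_recursive(n - 1, max_depth - 1);
    return rest < 0 ? -1 : n + rest;
}
```

For deep inputs, an iterative loop is usually the safer design, since it uses a constant amount of stack regardless of n.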

#2. How do you Fix Segmentation Faults?

Because segmentation faults are often associated with memory management issues or problematic pointer assignments, they can be fixed by making sure that the target application correctly handles these errors and does not attempt to read or write memory locations outside its own address space.

There are also certain procedures which you can follow in order to prevent and fix segmentation faults:

#a. How to Prevent Segmentation Faults?

Most segmentation faults occur due to memory access errors, so it’s important to make sure that pointers used by an application always reference valid data areas.

Check pointers for NULL before dereferencing them.
Test the code with Valgrind or Electric Fence.
Use assert() before dereferencing a suspect pointer, especially a pointer embedded in a struct that is kept in a container such as a list or an array.
Always remember to initialize pointers properly.
Protect shared resources against concurrent access in multithreaded code by using a mutex or a semaphore.
Use free() carefully: release each allocation exactly once and do not use the pointer afterwards.
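
Several of the points above can be sketched in a few lines of C. The helpers below are hypothetical (store_checked and release are names introduced here, not part of any library) and show a NULL check before a write and a free() that resets the pointer so later checks can catch reuse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: writes value through dst only if dst is non-NULL.
 * Returns 0 on success and -1 if the pointer was NULL. */
int store_checked(unsigned int *dst, unsigned int value)
{
    if (dst == NULL) {
        return -1;
    }
    *dst = value;
    return 0;
}

/* Hypothetical helper: frees *pp and resets it to NULL, so that a later
 * NULL check catches accidental reuse. Calling it twice is harmless. */
void release(unsigned int **pp)
{
    assert(pp != NULL);   /* passing no pointer at all is a programming error */
    free(*pp);            /* free(NULL) is a no-op */
    *pp = NULL;
}
```

A typical use would initialize the pointer at its declaration (for example with the result of malloc), call store_checked on it, and finish with release(&pointer).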

#b. How to Fix Segmentation Faults?

There are some tools that you can use to track down segmentation faults:

GDB with a core dump file.
GDB with a backtrace.
debugfs and dmesg for kernel debugging.

Conclusion

A segmentation fault is generally caused by a programming bug that tries to access either non-existent or protected memory, for example by accessing memory out of bounds at an address that does not contain valid data or code. Note that dividing an integer by zero is a different error: on x86 Linux it typically raises SIGFPE rather than SIGSEGV.

Finally, on operating systems and embedded environments that enforce execute protection, the processor may raise an exception when the program tries to execute machine code from an address that is not mapped or not marked as executable.

I hope this post has clarified what segmentation faults on the x86 architecture imply and how to avoid them.

Source

What is a segmentation fault? Is it different in C and C++? How are segmentation faults and dangling pointers related?

asked Feb 27, 2010 at 9:23

Rajendra Uppal

Segmentation fault is a specific kind of error caused by accessing memory that “does not belong to you.” It’s a helper mechanism that keeps you from corrupting the memory and introducing hard-to-debug memory bugs. Whenever you get a segfault you know you are doing something wrong with memory – accessing a variable that has already been freed, writing to a read-only portion of the memory, etc. Segmentation fault is essentially the same in most languages that let you mess with memory management, there is no principal difference between segfaults in C and C++.

There are many ways to get a segfault, at least in the lower-level languages such as C(++). A common way to get a segfault is to dereference a null pointer:

int *p = NULL;
*p = 1;

Another segfault happens when you try to write to a portion of memory that was marked as read-only:

char *str = "Foo"; // Compiler marks the constant string as read-only
*str = 'b'; // Which means this is illegal and results in a segfault

Dangling pointer points to a thing that does not exist anymore, like here:

char *p = NULL;
{
    char c;
    p = &c;
}
// Now p is dangling

The pointer p dangles because it points to the character variable c that ceased to exist after the block ended. And when you try to dereference a dangling pointer (like *p='A'), you would probably get a segfault.

answered Feb 27, 2010 at 9:36

It would be worth noting that a segmentation fault isn’t caused by directly accessing another process’s memory (this is what I hear sometimes), as that is simply not possible. With virtual memory, every process has its own virtual address space and there is no way to access another one using any pointer value. Exceptions to this are shared libraries, which are the same physical memory mapped to (possibly) different virtual addresses, and kernel memory, which is even mapped in the same way in every process (to avoid TLB flushing on syscalls, I think). And things like shmat; these are what I count as ‘indirect’ access. One can, however, check that they are usually located a long way from the process code, and we are usually able to access them (this is why they are there; nevertheless, accessing them in an improper way will produce a segmentation fault).

Still, a segmentation fault can occur when accessing our own (process) memory in an improper way (for instance, trying to write to non-writable space). But the most common reason for it is access to a part of the virtual address space that is not mapped to a physical one at all.

And all of this with respect to virtual memory systems.

answered Jul 3, 2011 at 23:22

A segmentation fault is caused by a request for a page that the process does not have listed in its descriptor table, or an invalid request for a page that it does have listed (e.g. a write request on a read-only page).

A dangling pointer is a pointer that may or may not point to a valid page, but does point to an “unexpected” segment of memory.

answered Feb 27, 2010 at 9:27

To be honest, as other posters have mentioned, Wikipedia has a very good article on this so have a look there. This type of error is very common and often called other things such as Access Violation or General Protection Fault.

They are no different in C, C++ or any other language that allows pointers. These kinds of errors are usually caused by pointers that are

Used before being properly initialised
Used after the memory they point to has been realloced or deleted.
Used in an indexed array where the index is outside of the array bounds. This is generally only when you’re doing pointer math on traditional arrays or c-strings, not STL / Boost based collections (in C++.)

answered Feb 27, 2010 at 20:35

Component 10

According to Wikipedia:

A segmentation fault occurs when a
program attempts to access a memory
location that it is not allowed to
access, or attempts to access a memory
location in a way that is not allowed
(for example, attempting to write to a
read-only location, or to overwrite
part of the operating system).

answered Feb 27, 2010 at 9:30

Orhan Cinar

A segmentation fault can also be caused by hardware failures, in this case faulty RAM. This is the less common cause, but if you don’t find an error in your code, maybe a memtest could help you.

The solution in this case is to replace the RAM.

edit:

Here there is a reference: Segmentation fault by hardware

answered Jun 24, 2014 at 16:59

Wikipedia’s Segmentation_fault page has a very nice description about it, just pointing out the causes and reasons. Have a look into the wiki for a detailed description.

In computing, a segmentation fault (often shortened to segfault) or access violation is a fault raised by hardware with memory protection, notifying an operating system (OS) about a memory access violation.

The following are some typical causes of a segmentation fault:

Dereferencing NULL pointers – this is special-cased by memory management hardware
Attempting to access a nonexistent memory address (outside process’s address space)
Attempting to access memory the program does not have rights to (such as kernel structures in process context)
Attempting to write read-only memory (such as code segment)

These in turn are often caused by programming errors that result in invalid memory access:

Dereferencing or assigning to an uninitialized pointer (wild pointer, which points to a random memory address)
Dereferencing or assigning to a freed pointer (dangling pointer, which points to memory that has been freed/deallocated/deleted)
A buffer overflow.
A stack overflow.
Attempting to execute a program that does not compile correctly. (Some compilers will output an executable file despite the presence of compile-time errors.)

answered Oct 14, 2014 at 10:05

iampranabroy

A segmentation fault occurs when a process (a running instance of a program) tries to access a read-only memory address, a memory range that is being used by another process, or a non-existent (invalid) memory address.
The dangling reference (pointer) problem means trying to access an object or variable whose contents have already been deleted from memory, e.g.:

int *arr = new int[20];
delete[] arr;  // memory allocated with new[] must be released with delete[]
cout<<arr[1];  //dangling problem occurs here

answered Dec 10, 2013 at 22:34

Sohail xIN3N

In simple words: segmentation fault is the operating system sending a signal to the program
saying that it has detected an illegal memory access and is prematurely terminating the program to prevent
memory from being corrupted.

answered Jul 19, 2017 at 13:43

There are several good explanations of “Segmentation fault” in the answers, but since with a segmentation fault there’s often a dump of the memory content, I wanted to share where the relationship between the “core dumped” part in Segmentation fault (core dumped) and memory comes from:

From about 1955 to 1975, before semiconductor memory, the dominant technology in computer memory used tiny magnetic doughnuts strung on copper wires. The doughnuts were known as “ferrite cores” and main memory thus known as “core memory” or “core”.

Taken from here.

answered Oct 13, 2018 at 18:39

Viktor Nonov

In computing, a segmentation fault or access violation is a fault, or failure condition, raised by hardware with memory protection,
notifying an operating system the software has attempted to access a
restricted area of memory. -WIKIPEDIA

You might be accessing the computer memory with the wrong data type. Your case might be like the code below:

#include <stdio.h>

int main(int argc, char *argv[]) {
    
    char A = 'asd';
    puts(A);
    
    return 0;
    
}

‘asd’ is a multi-character constant, not a string, and it does not fit in a single char, so only part of its value ends up in A. The crash itself comes from puts(A): puts expects a pointer to a string, and here it receives a character value that it then treats as an address.

Storing this character chain as a single char is like trying to fit a square peg in a round hole.

Terminated due to signal: SEGMENTATION FAULT (11)

A segmentation fault is like trying to breathe underwater: your lungs were not made for that. Reserving memory for an integer and then trying to use it as another data type won’t work at all.

answered Apr 23, 2021 at 3:29

victorkolis

“Segmentation fault” means that you tried to access memory that you do not have access to.

The first problem is with your arguments of main. The main function should be int main(int argc, char *argv[]), and you should check that argc is at least 2 before accessing argv[1].

Also, since you’re passing in a float to printf (which, by the way, gets converted to a double when passing to printf), you should use the %f format specifier. The %s format specifier is for strings ('\0'-terminated character arrays).

answered Mar 1, 2019 at 11:44

PHP Worm…

Simply put, a segmentation fault means that you are trying to access some memory which doesn’t belong to you. It occurs when we attempt to write to a read-only memory location or try to access memory that has already been freed. In other words, we can explain this as some sort of memory corruption.

Below I mention common mistakes done by programmers that lead to Segmentation fault.

Using scanf() the wrong way (forgetting to put &).

int num;
scanf("%d", num);// must use &num instead of num

Using pointers the wrong way.

int *num; 
printf("%d",*num); // num is uninitialized, so dereferencing it is invalid
// You can use *num only after pointing num to a valid memory address.

Modifying a string literal (the pointer tries to write to read-only memory).

char *str;  

//Stored in read only part of data segment
str = "GfG";      

//Problem:  trying to modify read only memory
*(str+1) = 'n';

Trying to access memory through an address that has already been freed.

// allocating memory to num 
int* num = malloc(8); 
*num = 100; 

// de-allocated the space allocated to num 
free(num); 

// num is already freed, therefore this may cause a segmentation fault
*num = 110;

Stack overflow: running out of memory on the stack.
Accessing an array out of bounds.
Using the wrong format specifiers with printf() and scanf().

answered Jan 3, 2020 at 17:50

Kalana

Consider the following snippets of Code,

SNIPPET 1

int *number = NULL;
*number = 1;

SNIPPET 2

int *number = malloc(sizeof(int));
*number = 1;

I’d assume you know the meaning of the functions: malloc() and sizeof() if you are asking this question.

Now that that is settled,
SNIPPET 1 would throw a Segmentation Fault Error.
while SNIPPET 2 would not.

Here’s why.

The first line of snippet one creates a pointer variable (number) to store the address of some other variable, but in this case it is initialized to NULL.
On the other hand,
the first line of snippet two creates the same variable (number) and in this case gives it a valid memory address (because malloc() is a function in C/C++ that returns the address of a newly allocated block of memory).

The point is you cannot put water inside a bowl that has not been bought OR a bowl that has been bought but has not been authorized for use by you.
When you try to do that, the computer is alerted and it throws a SegFault error.

You should only face these errors with languages that are close to low-level, like C/C++. There is an abstraction in other high-level languages that ensures you do not make this error.

It is also paramount to understand that Segmentation Fault is not language-specific.

answered Oct 15, 2020 at 10:17

There are enough definitions of segmentation fault, I would like to quote few examples which I came across while programming, which might seem like silly mistakes, but will waste a lot of time.

You can get a segmentation fault in the case below, where the argument type does not match the format specifier in printf:

#include <stdio.h>
int main(){   
  int a = 5;
  printf("%s",a);
  return 0;
}

output : Segmentation Fault (SIGSEGV)
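
The mismatch above goes away once the format specifier agrees with the argument type. The following sketch formats an int with %d; it writes into a buffer with snprintf rather than printing, purely so that the result can be inspected, and format_int is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the decimal form of value into buf.
 * Returns 1 if the whole number fit into the buffer and 0 otherwise. */
int format_int(char *buf, size_t size, int value)
{
    assert(buf != NULL && size > 0);
    /* %d matches an int argument; %s would make the function treat the
     * integer as a pointer to a string and read from that address. */
    int written = snprintf(buf, size, "%d", value);
    return written >= 0 && (size_t)written < size;
}
```

Compiling with warnings enabled (for example gcc -Wall) usually reports this kind of format mismatch before the program is ever run.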

When you forget to allocate memory for a pointer, but try to use it:

#include <stdio.h> 
typedef struct{
  int a;
} myStruct;   
int main(){
  myStruct *s;
  /* few lines of code */
  s->a = 5;
  return 0;
}

output : Segmentation Fault (SIGSEGV)
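
A possible fix is to give the pointer real storage before using it and to check that the allocation succeeded. This is a minimal sketch under that assumption; my_struct_new and my_struct_get are hypothetical helper names, while myStruct is the type from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int a;
} myStruct;

/* Hypothetical helper: allocates a myStruct and sets its field.
 * Returns NULL if the allocation fails. */
myStruct *my_struct_new(int a)
{
    myStruct *s = malloc(sizeof *s);
    if (s == NULL) {
        return NULL;
    }
    s->a = a;
    return s;
}

/* Hypothetical helper: reads the field, asserting that the pointer is valid. */
int my_struct_get(const myStruct *s)
{
    assert(s != NULL);
    return s->a;
}
```

The caller owns the returned object and is expected to release it with free() once it is no longer needed.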

answered Nov 10, 2019 at 15:44

NPE

Segmentation fault occurs when a process (running instance of a program) is trying to access a read-only memory address or memory range which is being used by another process or access the non-existent memory address.

seg fault,when type gets mismatched

answered Jan 9 at 19:05

A segmentation fault or access violation occurs when a program attempts to access a memory location that does not exist, or attempts to access a memory location in a way that is not allowed.

 /* "Array out of bounds" error 
   valid indices for array foo
   are 0, 1, ... 999 */
   int foo[1000];
   for (int i = 0; i <= 1000 ; i++) 
   foo[i] = i;

Here foo[1000] does not exist, so a segfault may occur.
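
The off-by-one in the loop above comes from using <= with the array length. A corrected version, sketched here with the hypothetical names fill_indices and FOO_LEN, stops at the last valid index.

```c
#include <assert.h>
#include <stddef.h>

#define FOO_LEN 1000   /* hypothetical constant: valid indices are 0 .. FOO_LEN - 1 */

/* Hypothetical helper: stores each index in its own slot, foo[i] = i,
 * for the first len elements of foo. */
void fill_indices(int *foo, size_t len)
{
    assert(foo != NULL);
    for (size_t i = 0; i < len; i++) {   /* < rather than <= keeps i in bounds */
        foo[i] = (int)i;
    }
}
```

Passing the length alongside the pointer, instead of repeating the literal 1000 in the loop condition, makes this kind of mistake harder to reintroduce.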

Causes of segmentation fault:

They arise primarily due to errors in the use of pointers for virtual memory addressing, particularly illegal access.

De-referencing NULL pointers – this is special-cased by memory management hardware.

Attempting to access a nonexistent memory address (outside process’s address space).

Attempting to access memory the program does not have rights to (such as kernel structures in process context).

Attempting to write read-only memory (such as code segment).

answered Dec 8, 2015 at 16:14

Source

A segmentation fault in C or C++ is an error that occurs when a program attempts to access a memory location it does not have permission to access. Generally, this error occurs when memory access is violated and is a type of general protection fault. “Segfault” is short for segmentation fault.

The core dump refers to the recording of the state of the program, i.e. its resources in memory and processor. Trying to access non-existent memory or memory which is being used by other processes also causes the Segmentation Fault which leads to a core dump.

A program has access to specific regions of memory while it is running. First, the stack holds the local variables of each function. In addition, the program may have memory allocated at runtime on the heap (via new in C++; you may also hear it called the "free store"). The only memory the program is permitted to access is its own (the memory just mentioned). Any access outside those regions results in a segmentation fault.

A segmentation fault is a specific kind of error caused by accessing memory that "does not belong to you":

When a piece of code tries to read or write a freed block of memory, or to write to a read-only location in memory, the result is known as a segmentation fault.
It is an error that often indicates memory corruption.

Common Segmentation Fault Scenarios

In a segmentation fault, a program tries to access memory that it is not authorized to access, or that does not exist. Some common scenarios that can cause segmentation faults are:

Modifying a string literal
Accessing an address that is freed
Accessing out-of-bounds array indices
Improper use of scanf()
Stack overflow
Buffer overflow
Dereferencing an uninitialized or NULL pointer

1. Modifying a String Literal

String literals are stored in a read-only section of memory. That is why the program below may crash with a segmentation fault: the line *(str + 1) = 'n' tries to write to read-only memory.

Example:

C

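#include <stdio.h>

int main()
{
    char* str;

    /* The literal is stored in a read-only part of memory */
    str = "GfG";

    /* Problem: trying to modify read-only memory */
    *(str + 1) = 'n';
    return 0;
}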

C++

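#include <iostream>
using namespace std;

int main()
{
    char* str;

    /* The literal is stored in a read-only part of memory.
       ISO C++11 forbids this conversion; many compilers only warn. */
    str = "GfG";

    /* Problem: trying to modify read-only memory */
    *(str + 1) = 'n';
    return 0;
}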

Output

timeout: the monitored command dumped core

/bin/bash: line 1: 32 Segmentation fault timeout 15s ./83b16132-8565-4cb1-aedb-4eb593442235 < 83b16132-8565-4cb1-aedb-4eb593442235.in

Refer to Storage for Strings in C for more details.

2. Accessing an Address That is Freed

In the code below, the pointer p is dereferenced after its memory block has been freed. The compiler does not catch this; the behavior is undefined. Such pointers are called dangling pointers, and they may produce segmentation faults or abnormal program termination at runtime.

Example:

C

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Allocate memory for p */
    int* p = (int*)malloc(8);
    *p = 100;

    /* Deallocate the memory */
    free(p);

    /* Problem: p is now a dangling pointer */
    *p = 110;
    printf("%d", *p);
    return 0;
}

C++

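#include <cstdlib>
#include <iostream>
using namespace std;

int main(void)
{
    /* Allocate memory for p */
    int* p = (int*)malloc(sizeof(int));
    *p = 100;

    /* Deallocate the memory */
    free(p);

    /* Problem: p is now a dangling pointer */
    *p = 110;
    return 0;
}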

Output

Segmentation Fault

3. Accessing out-of-bounds Array Index

In C and C++, accessing an out-of-bounds array index may cause a segmentation fault or other undefined behavior. Neither language performs bounds checking on built-in arrays. In C++, however, a container method such as std::vector::at(), or an explicit if() check on the index, can prevent out-of-bounds accesses.

Example:

C

#include <stdio.h>

int main(void)
{
    int arr[2];

    /* Problem: index 3 is outside the array (valid indices are 0 and 1) */
    arr[3] = 10;
    return (0);
}

C++

#include <iostream>
using namespace std;

int main()
{
    int arr[2];

    /* Problem: index 3 is outside the array (valid indices are 0 and 1) */
    arr[3] = 10;
    return 0;
}

Output

Segmentation Fault

4. Improper use of scanf()

The scanf() function expects the address of a variable as its argument. In this program n has the value 2; suppose its address is 1000. If we pass n instead of &n to scanf(), the input read from STDIN is written to the invalid address 2 rather than to address 1000. This corrupts memory and leads to a segmentation fault.

Example:

C

#include <stdio.h>

int main()
{
    int n = 2;

    /* Problem: passes the value of n (2) where scanf expects the address &n */
    scanf("%d", n);
    return 0;
}

C++

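#include <cstdio>
#include <iostream>
using namespace std;

int main()
{
    int n = 2;

    /* Problem: passes the value of n (2) where scanf expects the address &n.
       Note that cin >> n would be correct and would not crash. */
    scanf("%d", n);
    return 0;
}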

Output

Segmentation Fault

5. Stack Overflow

A stack overflow is not a pointer-related problem; the code may not contain a single pointer. It happens when the program runs out of stack memory, which can be caused by a very large local array, deep recursion, a large number of local variables, and so on.

Example:

C

#include <stdio.h>

int main()
{
    /* Problem: roughly 8 GB of local storage, far beyond a typical stack limit */
    int arr[2000000000];
    return 0;
}

C++

#include <iostream>
using namespace std;

int main()
{
    /* Problem: roughly 8 GB of local storage, far beyond a typical stack limit */
    int array[2000000000];
    return 0;
}

Output

Segmentation Fault

6. Buffer Overflow

If the data stored in a buffer is larger than the buffer's allocated size, a buffer overflow occurs, which can lead to a segmentation fault. Most C standard library functions do not perform bounds checking, so buffer overflows happen frequently when we forget to allocate a large enough buffer.

Example:

C

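#include <stdio.h>

int main()
{
    /* Sized by the compiler so that the whole literal and its NUL fit */
    char ref[] = "Thisisalongstring";
    char buf[10];

    /* Problem: %s has no width limit, so more than 10 bytes are written to buf */
    sscanf(ref, "%s", buf);
    return 0;
}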

C++

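#include <cstdio>
#include <iostream>
using namespace std;

int main()
{
    /* Sized by the compiler so that the whole literal and its NUL fit */
    char ref[] = "Thisisalongstring";
    char buf[10];

    /* Problem: %s has no width limit, so more than 10 bytes are written to buf */
    sscanf(ref, "%s", buf);
    return 0;
}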

Output

Segmentation Fault

7. Dereferencing an Uninitialized or NULL Pointer

It is a common programming error to dereference an uninitialized pointer (wild pointer), which can result in undefined behavior. When a pointer is used in a context that treats it as a valid pointer and accesses its underlying value, even though it has not been initialized to point to a valid memory location, this error occurs. Data corruption, program errors, or segmentation faults can result from this. Depending on their environment and state when dereferencing, uninitialized pointers may yield different results.

The NULL pointer does not point to any valid memory location, so dereferencing it typically results in a segmentation fault.

Example:

C

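#include <stdio.h>

int main()
{
    int* ptr;           /* uninitialized (wild) pointer */
    int* nptr = NULL;   /* NULL pointer */

    /* Problem: both pointers are dereferenced */
    printf("%d %d", *ptr, *nptr);
    return 0;
}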

C++

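#include <iostream>
using namespace std;

int main()
{
    int* ptr;           /* uninitialized (wild) pointer */
    int* nptr = NULL;   /* NULL pointer */

    /* Problem: both pointers are dereferenced */
    cout << *ptr << " " << *nptr;
    return 0;
}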

Output

Segmentation Fault

How to Fix Segmentation Faults?

We can fix segmentation faults by being careful about the causes mentioned above:

Avoid modifying string literals.
Be careful when using pointers, as they are one of the most common causes.
Consider the buffer and stack sizes before storing data, to avoid buffer or stack overflow.
Check bounds before accessing array elements.
Use scanf() and printf() carefully to avoid incorrect format specifiers or buffer overflow.

Overall, the cause of a segmentation fault is accessing memory that does not belong to you. As long as we avoid doing that, we can avoid segmentation faults. If you still cannot find the source of the error, use a debugger, as it leads directly to the point of failure in the program.

Last Updated: 07 May, 2023


Source

When I make a mistake in my code, it usually ends in a "segmentation fault" message, often shortened to "segfault". Right away my colleagues and managers come to me: "Ha! We've got a segfault for you to fix!" "Yes, my fault," I usually answer. But how many of you know what a "segmentation fault" error really means?

To answer that question we need to go back to the distant 1960s. I want to explain how a computer works, or more precisely how memory is accessed in modern computers. That will help explain where this strange error message comes from.

Everything presented below is computer architecture basics. I will not go deeper into the field than necessary, and I will use widely known terminology, so this post should be understandable to anyone who is reasonably comfortable with computing. If you want to study memory management in more detail, plenty of literature is available. While you are at it, dig into the source code of some OS kernel, for example Linux. I will not recount the history of computing here; some things are left out and some are heavily simplified.

A bit of history

Computers used to be very large and weigh tons, yet they had a single processor and roughly 16 KB of memory. Such a monster cost around $150,000 and could perform only one task at a time: at any given moment only one process was running. The memory architecture of those days can be pictured schematically like this:

That is, the OS took, say, a quarter of all available memory, and the rest was given to user tasks. At that time the role of the OS was simply to manage the hardware using CPU interrupts. So the operating system needed memory for itself, for copying data from devices and for working with them (PIO mode). Displaying data on the screen also used part of main memory, because the video subsystem either had no RAM of its own or had only a few kilobytes. The program itself ran in the memory region immediately after the OS and did its work.

Sharing resources

The main problem was that a device costing $150,000 was single-tasking and spent entire days processing a few kilobytes of data.

Because of the prohibitive cost, few could afford to buy several computers in order to process several tasks at once. So people started looking for ways to share the computing resources of a single machine. Thus began the era of multitasking. Note that in those days nobody was yet thinking about multiprocessor computers. So how can a computer with one CPU be made to run several different tasks?

The solution was scheduling: while one process was suspended waiting for I/O operations to complete, the CPU could run another process. I will not say more about scheduling here; it is too broad a topic and is not related to memory.

If a computer can run several tasks in turn, the memory layout looks roughly like this:

Tasks A and B are kept in memory, because copying them to disk and back is too expensive. As the processor executes one task or the other, it accesses memory for the corresponding data. But here a problem arises.

When a programmer writes the code for task B, they must know the boundaries of the allocated memory segments. Suppose task B occupies the range from 10 to 12 KB; then every memory address must be hard-coded within those boundaries. But if the computer runs three tasks at once, memory is divided into more segments, so the segment for task B may be shifted. The program would then have to be rewritten to work with less memory, and all its pointers changed.

Another problem surfaces too: what if task B accesses the memory segment allocated to task A? This can easily happen, because with memory pointers one small mistake is enough to make the program access a completely different address and corrupt another process's data. Task A may be working with data that is very sensitive from a security standpoint. There is no way to stop B from intruding into A's memory region. Finally, a programmer's mistake may cause task B to overwrite the OS's memory region (in this case from 0 to 4 KB).

Address space

To run several tasks held in memory safely, we need help from the OS and the hardware: specifically, an address space. This is an abstraction of memory that the OS provides to a process. Today it is a fundamental concept that is used everywhere. At least, ALL civilian computers use this approach; the military may have its own secrets. PCs, smartphones, TVs, game consoles, smartwatches, ATMs: pick any device and you will find that its memory is laid out according to the code-stack-heap principle.

An address space contains everything needed to run a process:

The machine instructions that the CPU must execute.
The data that those machine instructions will work with.

Schematically, the address space is divided as follows:

Stack: the memory region where the program stores information about called functions, their arguments, and every local variable in those functions. The size of the region can change as the program runs. It grows when functions are called and shrinks when they return.
Heap: the memory region where the program can do whatever it pleases. The size of the region can change. The programmer can obtain part of the heap with malloc(), and the region then grows. Memory is returned with free(), after which the heap shrinks.
Code segment: the memory region that holds the machine instructions of the compiled program. They are generated by the compiler but can also be written by hand. Note that this region can itself be divided into three parts (text, data, and BSS). It has a fixed size determined by the compiler. In our example, let it be 1 KB.

Because the stack and the heap can change in size, they are placed at opposite ends of the address space. The directions in which they grow are shown by arrows. It is the OS's job to make sure these regions do not overlap.

Memory virtualization

Suppose task A has been given all available user memory. Then task B comes along. What now? The solution was found in virtualization.

Recall one of the earlier illustrations, where A and B are in memory at the same time:

Suppose A tries to access memory in its own address space, for example at index 11 KB. It may even be its own stack. In this case the OS must find a way not to load physical index 11 KB, because in reality it may point into task B's region.

In fact, the address space that each program regards as its memory is virtual memory. A fake. In task A's memory, index 11 KB is a fake address, that is, a virtual memory address.

Every program running on the computer works with fake (virtual) memory. With the help of some chips, the OS deceives the process whenever it accesses a memory region. Thanks to virtualization, no process can access memory that does not belong to it: task A cannot get into the memory of task B or of the OS itself. At the user level all of this is completely transparent, thanks to the extensive and complex code of the OS kernel.

Thus every memory access is governed by the operating system. This must be done very efficiently so as not to slow down the running programs too much. The efficiency comes from hardware, mainly the CPU and components such as the MMU. The MMU appeared as a separate chip in the early 1970s; today MMUs are built directly into the processor, and operating systems are required to use them.

Here is a small C program that demonstrates working with memory addresses:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int v = 3;
    /* Print one address from each region of the address space */
    printf("Code is at %p \n", (void *)main);
    printf("Stack is at %p \n", (void *)&v);
    printf("Heap is at %p \n", malloc(8));

    return 0;
}

On my LP64 x86_64 machine it prints:

Code is at 0x40054c
Stack is at 0x7ffe60a1465c
Heap is at 0x1ecf010

As I described, the code segment comes first, then the heap, then the stack. But all three addresses are fake. Physical memory at address 0x7ffe60a1465c does not hold an integer variable with the value 3. Never forget that all user programs manipulate virtual addresses; only at the level of the kernel or hardware drivers may physical memory addresses be used.

Address translation

Address translation is the process of mapping a virtual address to a physical one. The MMU does this. For every running process the operating system must remember the mapping of all its virtual addresses to physical ones, which is no easy task. In effect, the OS has to manage each user process's memory on every access. In doing so it turns the nightmarish reality of physical memory into a useful, powerful, and easy-to-use abstraction.

Let's look more closely.

When a process starts, the OS reserves a fixed amount of physical memory for it, say 16 KB. The starting address of this region is saved in a special variable, base. The variable bounds records the limit of the region, derived from its size (16 KB in our example). Both values are stored in the process's entry in the process table, the PCB (Process Control Block).

So this is the virtual address space:

And this is its physical image:

The OS decides to allocate the physical address range from 4 to 20 KB, so base is 4 KB and bounds is 4 + 16 = 20 KB. When the process is scheduled (given CPU time), the OS reads both values from the PCB and copies them into special CPU registers. The process then runs and tries to access, say, virtual address 2 KB (in its heap). The CPU adds the base value obtained from the OS to that address. The physical address is therefore 2 + 4 = 6 KB.

Physical address = virtual address + base

If the resulting physical address (6 KB in our example, which is within bounds) were to fall outside the allocated region (4 to 20 KB), it would mean the process is trying to access memory that does not belong to it. The CPU then raises an exception and reports it to the OS, which handles it. In that case the system usually signals the violation to the process: SIGSEGV, Segmentation Fault. By default this signal terminates the process (this can be configured).

Memory relocation

If task A is taken off the run queue, so much the better. It means the scheduler was asked to run another task (say, B). While B runs, the operating system can relocate task A's entire physical space. While a user process is running, the OS often has no control of the processor. But when the process makes a system call, the processor returns to the OS's control. While handling that system call, the OS can do whatever it wants with memory, including relocating a process's entire address space to another physical location.

In our example this is fairly simple: the OS moves the 16 KB region to another free spot of suitable size and simply updates the base and bounds values for task A. When the processor returns to executing it, address translation still works, but the physical address space has changed.

From task A's point of view nothing has changed; its own address space still spans 0 to 16 KB. Meanwhile the OS and the MMU fully control every memory access the task makes. That is, the programmer manipulates the virtual range 0 to 16 KB, and the MMU takes care of mapping it to physical addresses.

After relocation the memory image looks like this:

The programmer no longer needs to care which memory addresses the program will work with, or to worry about conflicts. The OS, together with the MMU, takes all of that off their hands.

Memory segmentation

In the previous sections we covered address translation and memory relocation. However, our memory model has a number of drawbacks:

We assume that every virtual address space is 16 KB in size. That has nothing to do with reality.
The OS has to maintain a list of free 16 KB ranges of physical memory, to allocate them to newly started processes or to relocate existing regions. How can all this be done efficiently without hurting the performance of the whole system?
We allocate 16 KB to each process, but there is no guarantee that each of them will use the whole region. So we simply waste a lot of memory for nothing. This is called internal fragmentation: memory is reserved but not used.

To solve some of these problems, let's consider a more sophisticated memory organization: segmentation. The idea is simple: the base-and-bounds principle is applied to each of the three segments (heap, code, and stack) of every process, instead of treating the memory image as a single entity.

As a result we no longer waste the memory between the stack and the heap:

As you may have noticed, the free space in task A's virtual memory is no longer placed in physical memory, and memory is now used much more efficiently. The OS must now remember three pairs of base and bounds for each task, one per segment. The MMU performs translation as before, but now works with three base values and three bounds values.

Suppose task A's heap has base equal to 126 KB and bounds equal to 2 KB. Let task A access virtual address 3 KB (in the heap). The physical address is computed in two steps: the offset within the heap is 3 KB - 2 KB (the virtual start of the heap) = 1 KB, and 1 KB + 126 KB (base) = 127 KB. That is less than 128 KB (base + bounds), so there is no access error.

Sharing segments

Segmenting physical memory not only keeps unused virtual memory from eating up physical memory, it also lets physical segments be shared through the virtual address spaces of different processes.

If task A is started twice, both instances have the same code segment: the same machine instructions are executed in both. At the same time, each instance has its own stack and heap, because they operate on different data.

Neither process suspects that it is sharing its memory with anyone. This approach became possible with the introduction of segment protection bits.

For each physical segment it creates, the OS records the bounds value, which the MMU uses for later translation. At the same time it also records a so-called permission flag.

Since the code itself must not be modified, all code segments are created with RX flags. This means a process can load this memory region for execution, but nobody can write to it. The other two segments, heap and stack, have RW flags: the process can read and write them, but code cannot be executed from them. This is done for security, so that an attacker cannot corrupt the heap or stack by injecting code into them to gain root privileges. It was not always so, and making this efficient requires hardware support. On Intel processors it is called the "NX bit".

The flags can be changed while the program runs, using mprotect().

On Linux all these memory segments can be inspected through the /proc/{pid}/maps file or with the /usr/bin/pmap utility.

Here is an example for a PHP process:

$ pmap -x 31329
0000000000400000   10300    2004       0 r-x--  php
000000000100e000     832     460      76 rw---  php
00000000010de000     148      72      72 rw---    [ anon ]
000000000197a000    2784    2696    2696 rw---    [ anon ]
00007ff772bc4000      12      12       0 r-x--  libuuid.so.0.0.0
00007ff772bc7000    1020       0       0 -----  libuuid.so.0.0.0
00007ff772cc6000       4       4       4 rw---  libuuid.so.0.0.0
... ...

All the necessary details about memory layout are here. The addresses are virtual, and the permissions of each memory region are shown. Each shared object (.so) is mapped into the address space in several parts (usually code and data). Code segments are executable and are shared in physical memory by all processes that have mapped the same shared object into their address space.

Shared objects are one of the biggest advantages of Unix and Linux systems, and they save memory.

The mmap() system call can also create a shared region, which is backed by a shared physical segment. Each such region then carries the s flag, meaning shared.

Limits of segmentation

So segmentation solved the problem of unused virtual memory. If memory is not used, it is not placed in physical memory, because segments match exactly the amount of memory in use.

But that is not entirely true.

Suppose a process requests 16 KB from the heap. Most likely the OS creates a physical segment of that size. If the user then frees 2 KB of it, the OS has to shrink the segment to 14 KB. But what if the programmer later requests another 30 KB from the heap? The existing segment would have to more than double in size; is that even possible? It may already be surrounded by other segments that prevent it from growing. Then the OS has to find 30 KB of free space and relocate the segment.

The main drawback of segments is that they fragment physical memory heavily, because segments grow and shrink as user processes request and release memory. The OS has to maintain a list of free areas and manage them.

Fragmentation can lead to a situation where a process requests an amount of memory larger than any single free area. In that case the OS has to refuse the allocation, even if the total amount of free memory is considerably larger.

The OS can try to pack the data more tightly, merging all free areas into one large chunk that can then be used for new processes and for relocation.

But such compaction algorithms put a heavy load on the processor, whose power is needed to run user processes. If the OS starts reorganizing physical memory, the system becomes unresponsive.

So memory segmentation brings quite a few problems related to memory management and multitasking. Its capabilities need to be improved and its shortcomings fixed. This is achieved with yet another approach: virtual memory pages.

Paging

As mentioned above, the main drawback of segmentation is that segments change size very often, which fragments memory and can leave the OS unable to allocate the regions that processes need. Pages solve this problem: every allocation the kernel makes in physical memory has a fixed size. In other words, pages are fixed-size areas of physical memory, nothing more. This greatly simplifies free-space management and eliminates fragmentation.

Consider an example: a 16 KB virtual address space divided into pages.

We are not talking about heap, stack, or code segment here. We simply divide memory into 4 KB chunks. Then we do the same with physical memory:

The OS keeps a process page table, which records the mapping between each page of the process's virtual memory and a page of physical memory (a page frame).

We have now eliminated the problem of finding free space: a page frame is either in use or unused. It is far easier for the kernel to find enough pages to satisfy a process's memory request.

A page is the smallest, indivisible unit of memory that the OS can manage.

Each process has its own page table, which holds its address translations. Instead of region bounds, translation now uses the virtual page number (VPN) and the offset.

Example: the virtual space is 16 KB, so we need 14 bits to describe addresses (2¹⁴ = 16 KB). The page size is 4 KB, so there are 4 pages (16 / 4) and we need 2 bits to select a page; the remaining 12 bits are the offset within the page:

When a process wants to use, for example, address 9438 (which is within the limit of 16,384), it requests 10.0100.1101.1110 in binary:

This is byte 1246 of virtual page number 2 (byte "0100.1101.1110" of page "10"). Now the OS only needs to consult the process's page table to find where page 2 lives. In our example it maps to the frame that starts at the 8 KB mark (byte 8192) of physical memory. Virtual address 9438 therefore corresponds to physical address 8192 + 1246 = 9438; the two coincide here only because virtual page 2 happens to map to the frame at the same position.

As already mentioned, each process has exactly one page table, because each process has its own address translation, as with segments. But where exactly are all these tables stored? Presumably in physical memory; where else could they be?

If the page tables themselves are stored in memory, then memory must be accessed just to look up the translation for a VPN. The number of memory accesses then doubles: first we fetch the number of the required page from memory, and then we access the data stored in that page. If memory access is slow, the situation looks rather sad.

Translation Lookaside Buffer (TLB)

Using pages as the main tool for supporting virtual memory can reduce performance severely. Splitting the address space into small chunks (pages) requires storing a large amount of mapping data. And since that data is stored in memory, every memory access by a process incurs one more, additional access.

Hardware help is again used to maintain performance. As with segmentation, hardware helps the kernel perform translation efficiently. This is done by the TLB, which is part of the MMU and is simply a cache for some VPN translations. The TLB spares the OS an extra memory access to obtain a physical address from a virtual one.

The hardware MMU is triggered on every memory access. It extracts the VPN from the virtual address and asks the TLB whether it holds a translation for that VPN. If so, its job is done. If not, the MMU finds the relevant entry in the process's page table, and if it refers to a valid address, updates the TLB so that the translation is available on the next access.

Как вы понимаете, если в кэше отсутствует нужная переадресация, то это замедляет обращение к памяти. Можно предположить, что чем больше размер страниц, тем больше вероятность, что в TLB окажутся нужные данные. Но тогда мы будем тратить больше памяти на каждую страницу. Так что здесь нужен какой-то компромисс. Современные ядра умеют использовать страницы разных размеров. Например, Linux способен оперировать «огромными» страницами по 2 Мб вместо традиционных 4 Кб.

Также рекомендуется хранить данные компактно, в смежных адресах памяти. Если вы раскидаете их по всей памяти, то куда чаще в TLB не будет обнаруживаться нужной переадресации, либо он будет постоянно переполняться. Это называется эффективностью пространственной локальности (spacial locality efficiency): данные, которые расположены в памяти сразу за вашими, могут размещаться в той же физической странице, и тогда благодаря TLB вы получите выигрыш в производительности.

Кроме того, TLB в каждой записи хранит так называемые ASID (Address Space Identifier, идентификатор адресного пространства). Это нечто вроде PID, идентификатора процесса. Каждый процесс, поставленный в очередь на выполнение, имеет собственный ASID, и TLB может управлять обращением любого процесса к памяти, без риска ошибочных обращений со стороны других процессов.

Повторимся снова: если пользовательский процесс пытается обратиться к неправильному адресу, тот наверняка будет отсутствовать в TLB. Следовательно, будет запущена процедура поиска в таблице страниц процесса. В ней хранится переадресация, но с неправильным набором битов. В х86-системах переадресации имеют размер 4 Кб, то есть битов в них немало. А значит есть вероятность найти правильный бит, равно как и другие вещи, наподобие бита изменения («грязного бита», dirty bit), битов защиты (protection bit), бита обращения (reference bit) и т.д. И если запись помечена как неправильная, то ОС по умолчанию выдаст SIGSEGV, что приведёт к ошибке “segmentation fault”, даже если о сегментах уже и речи не идёт.

In reality, paging in modern operating systems is far more complex than I have described. In particular, they use multi-level page tables, multiple page sizes, and page eviction, also known as swapping (the kernel moves pages from memory to disk and back, which makes better use of main memory and gives processes the illusion that it is unlimited).

Conclusion

Now you know what is behind the "segmentation fault" message. Operating systems used to rely on segments to map virtual memory space onto physical memory. When a user process wants to access memory, it asks the MMU to translate the address. But if the resulting address is invalid, that is, it lies outside the physical segment or the segment lacks the required permissions (an attempt to write to a read-only segment), the OS by default sends the SIGSEGV signal, which terminates the process and prints the "segmentation fault" message. On some operating systems this may be a "General protection fault". You can study the Linux source code for x86/64 platforms that handles memory access errors, SIGSEGV in particular. You can also look at how segmentation is implemented on that platform. You will discover interesting details about paging, which offers far more capabilities than classic segments.

Source

From Wikipedia, the free encyclopedia

In computing, a segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted area of memory (a memory access violation). On standard x86 computers, this is a form of general protection fault. The operating system kernel will, in response, usually perform some corrective action, generally passing the fault on to the offending process by sending the process a signal. Processes can in some cases install a custom signal handler, allowing them to recover on their own,^[1] but otherwise the OS default signal handler is used, generally causing abnormal termination of the process (a program crash), and sometimes a core dump.

Segmentation faults are a common class of error in programs written in languages like C that provide low-level memory access and few to no safety checks. They arise primarily due to errors in use of pointers for virtual memory addressing, particularly illegal access. Another type of memory access error is a bus error, which also has various causes, but is today much rarer; these occur primarily due to incorrect physical memory addressing, or due to misaligned memory access – these are memory references that the hardware cannot address, rather than references that a process is not allowed to address.

Many programming languages may employ mechanisms designed to avoid segmentation faults and improve memory safety. For example, Rust employs an ownership-based^[2] model to ensure memory safety.^[3] Other languages, such as Lisp and Java, employ garbage collection,^[4] which avoids certain classes of memory errors that could lead to segmentation faults.^[5]

Overview

[Figure: Example of human generated signal]

[Figure: Segmentation fault affecting Krita in KDE desktop environment]

A segmentation fault occurs when a program attempts to access a memory location that it is not allowed to access, or attempts to access a memory location in a way that is not allowed (for example, attempting to write to a read-only location, or to overwrite part of the operating system).

The term "segmentation" has various uses in computing; in the context of "segmentation fault", a term used since the 1950s,^[citation needed] it refers to the address space of a program.^[6] With memory protection, only the program's own address space is readable, and of this, only the stack and the read/write portion of the data segment of a program are writable, while read-only data and the code segment are not writable. Thus attempting to read outside of the program's address space, or writing to a read-only segment of the address space, results in a segmentation fault, hence the name.

On systems using hardware memory segmentation to provide virtual memory, a segmentation fault occurs when the hardware detects an attempt to refer to a non-existent segment, or to refer to a location outside the bounds of a segment, or to refer to a location in a fashion not allowed by the permissions granted for that segment. On systems using only paging, an invalid page fault generally leads to a segmentation fault, and segmentation faults and page faults are both faults raised by the virtual memory management system. Segmentation faults can also occur independently of page faults: illegal access to a valid page is a segmentation fault, but not an invalid page fault, and segmentation faults can occur in the middle of a page (hence no page fault), for example in a buffer overflow that stays within a page but illegally overwrites memory.

At the hardware level, the fault is initially raised by the memory management unit (MMU) on illegal access (if the referenced memory exists), as part of its memory protection feature, or an invalid page fault (if the referenced memory does not exist). If the problem is not an invalid logical address but instead an invalid physical address, a bus error is raised instead, though these are not always distinguished.

At the operating system level, this fault is caught and a signal is passed on to the offending process, activating the process’s handler for that signal. Different operating systems have different signal names to indicate that a segmentation fault has occurred. On Unix-like operating systems, a signal called SIGSEGV (abbreviated from segmentation violation) is sent to the offending process. On Microsoft Windows, the offending process receives a STATUS_ACCESS_VIOLATION exception.

Causes

The conditions under which segmentation violations occur and how they manifest themselves are specific to hardware and the operating system: different hardware raises different faults for given conditions, and different operating systems convert these to different signals that are passed on to processes. The proximate cause is a memory access violation, while the underlying cause is generally a software bug of some sort. Determining the root cause – debugging the bug – can be simple in some cases, where the program will consistently cause a segmentation fault (e.g., dereferencing a null pointer), while in other cases the bug can be difficult to reproduce and depend on memory allocation on each run (e.g., dereferencing a dangling pointer).

The following are some typical causes of a segmentation fault:

Attempting to access a nonexistent memory address (outside process’s address space)
Attempting to access memory the program does not have rights to (such as kernel structures in process context)
Attempting to write read-only memory (such as code segment)

These in turn are often caused by programming errors that result in invalid memory access:

Dereferencing a null pointer, which usually points to an address that’s not part of the process’s address space
Dereferencing or assigning to an uninitialized pointer (wild pointer, which points to a random memory address)
Dereferencing or assigning to a freed pointer (dangling pointer, which points to memory that has been freed/deallocated/deleted)
A buffer overflow
A stack overflow
Attempting to execute a program that does not compile correctly. (Some compilers^[which?] will output an executable file despite the presence of compile-time errors.)

In C code, segmentation faults most often occur because of errors in pointer use, particularly in C dynamic memory allocation. Dereferencing a null pointer, which results in undefined behavior, will usually cause a segmentation fault. This is because a null pointer cannot be a valid memory address. On the other hand, wild pointers and dangling pointers point to memory that may or may not exist, and may or may not be readable or writable, and thus can result in transient bugs. For example:

char *p1 = NULL;           // Null pointer
char *p2;                  // Wild pointer: not initialized at all.
char *p3  = malloc(10 * sizeof(char));  // Initialized pointer to allocated memory
                                        // (assuming malloc did not fail)
free(p3);                  // p3 is now a dangling pointer, as memory has been freed

Dereferencing any of these variables could cause a segmentation fault: dereferencing the null pointer generally will cause a segfault, while reading from the wild pointer may instead result in random data but no segfault, and reading from the dangling pointer may result in valid data for a while, and then random data as it is overwritten.

Handling

The default action for a segmentation fault or bus error is abnormal termination of the process that triggered it. A core file may be generated to aid debugging, and other platform-dependent actions may also be performed. For example, Linux systems using the grsecurity patch may log SIGSEGV signals in order to monitor for possible intrusion attempts using buffer overflows.

On some systems, like Linux and Windows, it is possible for the program itself to handle a segmentation fault.^[7] Depending on the architecture and operating system, the running program can not only handle the event but may extract some information about its state like getting a stack trace, processor register values, the line of the source code when it was triggered, memory address that was invalidly accessed^[8] and whether the action was a read or a write.^[9]

Although a segmentation fault generally means that the program has a bug that needs fixing, it is also possible to intentionally cause such a failure for the purposes of testing, debugging and also to emulate platforms where direct access to memory is needed. In the latter case, the system must be able to allow the program to run even after the fault occurs. When the system allows it, the program can handle the event and increment the processor program counter to "jump" over the failing instruction and continue the execution.^[10]

Examples

[Figure: Segmentation fault on an EMV keypad]

Writing to read-only memory

Writing to read-only memory raises a segmentation fault. At the level of code errors, this occurs when the program writes to part of its own code segment or the read-only portion of the data segment, as these are loaded by the OS into read-only memory.

Here is an example of ANSI C code that will generally cause a segmentation fault on platforms with memory protection. It attempts to modify a string literal, which is undefined behavior according to the ANSI C standard. Most compilers will not catch this at compile time, and instead compile this to executable code that will crash:

int main(void)
{
    char *s = "hello world";
    *s = 'H';
}

When the program containing this code is compiled, the string "hello world" is placed in the rodata section of the program executable file: the read-only section of the data segment. When loaded, the operating system places it with other strings and constant data in a read-only segment of memory. When executed, a variable, s, is set to point to the string's location, and an attempt is made to write an H character through the variable into the memory, causing a segmentation fault. Compiling such a program with a compiler that does not check for the assignment of read-only locations at compile time, and running it on a Unix-like operating system produces the following runtime error:

$ gcc segfault.c -g -o segfault
$ ./segfault
Segmentation fault

Backtrace of the core file from GDB:

Program received signal SIGSEGV, Segmentation fault.
0x1c0005c2 in main () at segfault.c:6
6               *s = 'H';

This code can be corrected by using an array instead of a character pointer, as this allocates memory on stack and initializes it to the value of the string literal:

char s[] = "hello world";
s[0] = 'H';  // equivalently, *s = 'H';

Even though string literals should not be modified (this has undefined behavior in the C standard), in C they are of static char [] type,^[11]^[12]^[13] so there is no implicit conversion in the original code (which points a char * at that array), while in C++ they are of static const char [] type, and thus there is an implicit conversion, so compilers will generally catch this particular error.

Null pointer dereference

In C and C-like languages, null pointers are used to mean "pointer to no object" and as an error indicator, and dereferencing a null pointer (a read or write through a null pointer) is a very common program error. The C standard does not say that the null pointer is the same as the pointer to memory address 0, though that may be the case in practice. Most operating systems map the null pointer's address such that accessing it causes a segmentation fault. This behavior is not guaranteed by the C standard. Dereferencing a null pointer is undefined behavior in C, and a conforming implementation is allowed to assume that any pointer that is dereferenced is not null.

int *ptr = NULL;
printf("%d", *ptr);

This sample code creates a null pointer, and then tries to access its value (read the value). Doing so causes a segmentation fault at runtime on many operating systems.

Dereferencing a null pointer and then assigning to it (writing a value to a non-existent target) also usually causes a segmentation fault:

int *ptr = NULL;
*ptr = 1;

A null pointer dereference whose value is never used (for example, a bare *ptr; expression statement) will often not result in a segmentation fault when compiled, because the unused dereference is typically optimized away by dead code elimination.

Buffer overflow

The following code accesses the character array s beyond its upper boundary. Depending on the compiler and the processor, this may result in a segmentation fault.

char s[] = "hello world";
char c = s[20];

Stack overflow

Another example is recursion without a base case:

int main(void)
{
    return main();
}

which causes the stack to overflow which results in a segmentation fault.^[14] Infinite recursion may not necessarily result in a stack overflow depending on the language, optimizations performed by the compiler and the exact structure of a code. In this case, the behavior of unreachable code (the return statement) is undefined, so the compiler can eliminate it and use a tail call optimization that might result in no stack usage. Other optimizations could include translating the recursion into iteration, which given the structure of the example function would result in the program running forever, while probably not overflowing its stack.

References

[1] Expert C programming: deep C secrets By Peter Van der Linden, page 188
[2] "The Rust Programming Language - Ownership".
[3] "Fearless Concurrency with Rust - The Rust Programming Language Blog".
[4] McCarthy, John (April 1960). "Recursive functions of symbolic expressions and their computation by machine, Part I". Communications of the ACM. 4 (3): 184–195. doi:10.1145/367177.367199. S2CID 1489409. Retrieved 2018-09-22.
[5] Dhurjati, Dinakar; Kowshik, Sumant; Adve, Vikram; Lattner, Chris (1 January 2003). "Memory Safety Without Runtime Checks or Garbage Collection" (PDF). Proceedings of the 2003 ACM SIGPLAN Conference on Language, Compiler, and Tool for Embedded Systems. ACM. 38 (7): 69–80. doi:10.1145/780732.780743. ISBN 1581136471. S2CID 1459540. Retrieved 2018-09-22.
[6] "Debugging Segmentation Faults and Pointer Problems - Cprogramming.com". www.cprogramming.com. Retrieved 2021-02-03.
[7] "Cleanly recovering from Segfaults under Windows and Linux (32-bit, x86)". Retrieved 2020-08-23.
[8] "Implementation of the SIGSEGV/SIGABRT handler which prints the debug stack trace". GitHub. Retrieved 2020-08-23.
[9] "How to identify read or write operations of page fault when using sigaction handler on SIGSEGV?(LINUX)". Retrieved 2020-08-23.
[10] "LINUX – WRITING FAULT HANDLERS". Retrieved 2020-08-23.
[11] "6.1.4 String literals". ISO/IEC 9899:1990 - Programming languages - C.
[12] "6.4.5 String literals". ISO/IEC 9899:1999 - Programming languages - C.
[13] "6.4.5 String literals". ISO/IEC 9899:2011 - Programming languages - C.
[14] What is the difference between a segmentation fault and a stack overflow? at Stack Overflow

External links

Process: focus boundary and segmentation fault [dead link]
A FAQ: User contributed answers regarding the definition of a segmentation fault
A "null pointer" explained
Answer to: NULL is guaranteed to be 0, but the null pointer is not?
The Open Group Base Specifications Issue 6 signal.h

Source
