Integer overflow can be demonstrated through an odometer overflowing, a mechanical version of the phenomenon. All digits are set to the maximum 9 and the next increment of the white digit causes a cascade of carry-over additions setting all digits to 0, but there is no higher digit (1,000,000s digit) to change to a 1, so the counter resets to zero. This is wrapping in contrast to saturating.
In computer programming, an integer overflow occurs when an arithmetic operation attempts to create a numeric value that is outside of the range that can be represented with a given number of digits – either higher than the maximum or lower than the minimum representable value.
The most common result of an overflow is that the least significant representable digits of the result are stored; the result is said to wrap around the maximum (i.e. modulo a power of the radix, usually two in modern computers, but sometimes ten or another radix).
An overflow condition may give results leading to unintended behavior. In particular, if the possibility has not been anticipated, overflow can compromise a program’s reliability and security.
For some applications, such as timers and clocks, wrapping on overflow can be desirable. The C11 standard states that for unsigned integers, modulo wrapping is the defined behavior and the term overflow never applies: «a computation involving unsigned operands can never overflow.»[1]
On some processors like graphics processing units (GPUs) and digital signal processors (DSPs) which support saturation arithmetic, overflowed results would be «clamped», i.e. set to the minimum or the maximum value in the representable range, rather than wrapped around.
Origin
The register width of a processor determines the range of values that can be represented in its registers. Though the vast majority of computers can perform multiple-precision arithmetic on operands in memory, allowing numbers to be arbitrarily long and overflow to be avoided, the register width limits the sizes of numbers that can be operated on (e.g., added or subtracted) using a single instruction per operation. Typical binary register widths for unsigned integers include:
- 4-bit: maximum representable value 2^4 − 1 = 15
- 8-bit: maximum representable value 2^8 − 1 = 255
- 16-bit: maximum representable value 2^16 − 1 = 65,535
- 32-bit: maximum representable value 2^32 − 1 = 4,294,967,295 (the most common width for personal computers as of 2005)
- 64-bit: maximum representable value 2^64 − 1 = 18,446,744,073,709,551,615 (the most common width for personal computer central processing units (CPUs), as of 2021)
- 128-bit: maximum representable value 2^128 − 1 = 340,282,366,920,938,463,463,374,607,431,768,211,455
When an unsigned arithmetic operation produces a result larger than the maximum above for an N-bit integer, the overflow reduces the result modulo 2^N, retaining only the N least significant bits of the result and effectively causing a wraparound.
In particular, multiplying or adding two integers may result in a value that is unexpectedly small, and subtracting from a small integer may cause a wrap to a large positive value (for example, 8-bit integer addition 255 + 2 results in 1, which is 257 mod 2^8, and similarly subtraction 0 − 1 results in 255, a two’s complement representation of −1).
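The two 8-bit results above can be reproduced with a small sketch. The function names `add_u8` and `sub_u8` are made up for this illustration; the casts rely on C's rule that conversion to an unsigned type is performed modulo 2^N.

```c
#include <stdint.h>

/* Add two 8-bit unsigned values. The operands are promoted to int, so the
   exact sum is formed first; the cast then keeps only the low 8 bits,
   giving (a + b) mod 2^8. */
uint8_t add_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* Subtract with the same modulo-256 reduction applied to the exact result. */
uint8_t sub_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a - b);
}
```

With these definitions, `add_u8(255, 2)` yields 1 and `sub_u8(0, 1)` yields 255.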
Such wraparound can create security vulnerabilities: if an overflowed value is used as the number of bytes to allocate for a buffer, the buffer will be allocated unexpectedly small, potentially leading to a buffer overflow which, depending on the use of the buffer, might in turn cause arbitrary code execution.
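One common defense against the undersized-buffer scenario is to test the size computation before allocating. The following is a minimal sketch, assuming the buffer holds `count` items of `size` bytes each; `alloc_array` is a hypothetical helper, not a standard function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for `count` items of `size` bytes each,
   refusing the request when count * size would wrap around SIZE_MAX. */
void *alloc_array(size_t count, size_t size) {
    if (size != 0 && count > SIZE_MAX / size) {
        return NULL;  /* the product would not fit in size_t */
    }
    return malloc(count * size);
}
```

The division is done first because, unlike the multiplication, it cannot wrap.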
If the variable has a signed integer type, a program may make the assumption that a variable always contains a positive value. An integer overflow can cause the value to wrap and become negative, which violates the program’s assumption and may lead to unexpected behavior (for example, 8-bit integer addition of 127 + 1 results in −128, a two’s complement of 128). (A solution for this particular problem is to use unsigned integer types for values that a program expects and assumes will never be negative.)
Flags
Most computers have two dedicated processor flags to check for overflow conditions.
The carry flag is set when the result of an addition or subtraction, considering the operands and result as unsigned numbers, does not fit in the given number of bits. This indicates an overflow with a carry or borrow from the most significant bit. An immediately following add with carry or subtract with borrow operation would use the contents of this flag to modify a register or a memory location that contains the higher part of a multi-word value.
The overflow flag is set when the result of an operation on signed numbers does not have the sign that one would predict from the signs of the operands, e.g., a negative result when adding two positive numbers. This indicates that an overflow has occurred and the signed result represented in two’s complement form would not fit in the given number of bits.
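The difference between the two flags can be illustrated with a software model of an 8-bit addition. This is only a sketch: `add8_result` and `add8_flags` are names invented here, and real processors compute these flags in hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of an 8-bit addition together with software models of both flags. */
struct add8_result {
    uint8_t sum;    /* low 8 bits of the result */
    bool carry;     /* the unsigned result did not fit in 8 bits */
    bool overflow;  /* the signed (two's complement) result did not fit */
};

struct add8_result add8_flags(uint8_t a, uint8_t b) {
    struct add8_result r;
    unsigned wide = (unsigned)a + (unsigned)b;  /* exact sum, at most 510 */
    r.sum = (uint8_t)wide;
    r.carry = wide > 0xFF;
    /* Signed overflow: the operands have the same sign bit (bit 7)
       and the sign bit of the result differs from it. */
    r.overflow = ((~(a ^ b)) & (a ^ r.sum) & 0x80) != 0;
    return r;
}
```

For 127 + 1 the model reports overflow without carry (the signed result would be 128), while for 255 + 1 it reports carry without overflow (as signed values this is −1 + 1 = 0).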
Definition variations and ambiguity
For an unsigned type, when the ideal result of an operation is outside the type’s representable range and the returned result is obtained by wrapping, then this event is commonly defined as an overflow. In contrast, the C11 standard defines that this event is not an overflow and states «a computation involving unsigned operands can never overflow.»[1]
When the ideal result of an integer operation is outside the type’s representable range and the returned result is obtained by clamping, then this event is commonly defined as a saturation. Use varies as to whether a saturation is or is not an overflow. To eliminate ambiguity, the terms wrapping overflow[2] and saturating overflow[3] can be used.
The term underflow is most commonly used for floating-point math and not for integer math.[4] However, many references can be found to integer underflow.[5][6][7][8][9] When the term integer underflow is used, it means the ideal result was closer to minus infinity than the output type’s representable value closest to minus infinity. When the term integer underflow is used, the definition of overflow may include all types of overflows, or it may only include cases where the ideal result was closer to positive infinity than the output type’s representable value closest to positive infinity.
When the ideal result of an operation is not an exact integer, the meaning of overflow can be ambiguous in edge cases. Consider the case where the ideal result has a value of 127.25 and the output type’s maximum representable value is 127. If overflow is defined as the ideal value being outside the representable range of the output type, then this case would be classified as an overflow. For operations that have well defined rounding behavior, overflow classification may need to be postponed until after rounding is applied. The C11 standard[1] defines that conversions from floating point to integer must round toward zero. If C is used to convert the floating point value 127.25 to integer, then rounding should be applied first to give an ideal integer output of 127. Since the rounded integer is in the output’s range, the C standard would not classify this conversion as an overflow.
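The 127.25 case can be sketched as a range check that accounts for rounding toward zero. The helper `to_int8` is hypothetical; it accepts any value whose truncated form fits in an 8-bit signed integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: convert d to int8_t when the value, truncated toward
   zero, lies in [-128, 127]. Returns false (and leaves *out untouched)
   otherwise. The negated comparison also rejects NaN. */
bool to_int8(double d, int8_t *out) {
    if (!(d > -129.0 && d < 128.0)) {
        return false;
    }
    *out = (int8_t)d;  /* truncates toward zero; the result is in range here */
    return true;
}
```

Under this check 127.25 converts to 127 without being treated as an overflow, whereas 128.0 is rejected.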
Inconsistent behavior
The behavior on occurrence of overflow may not be consistent in all circumstances. For example, in the language Rust, while functionality is provided to give users choice and control, the behavior for basic use of mathematic operators is naturally fixed; however, this fixed behavior differs between a program built in ‘debug’ mode and one built in ‘release’ mode.[10] In C, unsigned integer overflow is defined to wrap around, while signed integer overflow causes undefined behavior.
Methods to address integer overflow problems
Language | Unsigned integer | Signed integer |
---|---|---|
Ada | modulo the type’s modulus | raise Constraint_Error |
C, C++ | modulo power of two | undefined behavior |
C# | modulo power of 2 in unchecked context; System.OverflowException is raised in checked context[11] | |
Java | modulo power of two (char is the only unsigned primitive type in Java) | modulo power of two |
JavaScript | all numbers are double-precision floating-point except the new BigInt | |
MATLAB | Builtin integers saturate. Fixed-point integers configurable to wrap or saturate | |
Python 2 | — | convert to long type (bigint) |
Seed7 | — | raise OVERFLOW_ERROR[12] |
Scheme | — | convert to bigNum |
Simulink | configurable to wrap or saturate | |
Smalltalk | — | convert to LargeInteger |
Swift | Causes error unless using special overflow operators.[13] | |
Detection
Run-time overflow detection implementation UBSan (undefined behavior sanitizer) is available for C compilers.
In Java 8, there are overloaded methods, for example Math.addExact(int, int), which will throw an ArithmeticException in case of overflow.
Computer emergency response team (CERT) developed the As-if Infinitely Ranged (AIR) integer model, a largely automated mechanism to eliminate integer overflow and truncation in C/C++ using run-time error handling.[14]
Avoidance
By allocating variables with data types that are large enough to contain all values that may possibly be computed and stored in them, it is always possible to avoid overflow. Even when the available space or the fixed data types provided by a programming language or environment are too limited to allow for variables to be defensively allocated with generous sizes, by carefully ordering operations and checking operands in advance, it is often possible to ensure a priori that the result will never be larger than can be stored. Static analysis tools, formal verification and design by contract techniques can be used to more confidently and robustly ensure that an overflow cannot accidentally result.
Handling
If it is anticipated that overflow may occur, then tests can be inserted into the program to detect when it happens, or is about to happen, and do other processing to mitigate it. For example, if an important result computed from user input overflows, the program can stop, reject the input, and perhaps prompt the user for different input, rather than the program proceeding with the invalid overflowed input and probably malfunctioning as a consequence.
CPUs generally have a way to detect this to support addition of numbers larger than their register size, typically using a status bit. The technique is called multiple-precision arithmetic. Thus, it is possible to add two numbers each two bytes wide using just a byte addition in steps: first add the low bytes then add the high bytes, but if it is necessary to carry out of the low bytes this is arithmetic overflow of the byte addition and it becomes necessary to detect and increment the sum of the high bytes.
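The two-byte addition described above can be sketched in C, with the carry out of the low bytes detected in software rather than read from a status bit. The function name `add16_bytewise` and its parameter layout are assumptions made for this illustration.

```c
#include <stdint.h>

/* Add two 16-bit values using only 8-bit additions.
   Each operand is passed as a high byte and a low byte. */
uint16_t add16_bytewise(uint8_t a_hi, uint8_t a_lo,
                        uint8_t b_hi, uint8_t b_lo) {
    uint8_t lo = (uint8_t)(a_lo + b_lo);
    uint8_t carry = lo < a_lo;  /* 1 if the low-byte addition wrapped */
    uint8_t hi = (uint8_t)(a_hi + b_hi + carry);
    return (uint16_t)((hi << 8) | lo);
}
```

Adding 0x01FF and 0x0001 this way wraps the low byte to 0x00, produces a carry, and gives 0x0200.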
Handling possible overflow of a calculation may sometimes present a choice between performing a check before a calculation (to determine whether or not overflow is going to occur), or after it (to consider whether or not it likely occurred based on the resulting value). Caution should be shown towards the latter choice. Firstly, since it may not be a reliable detection method (for example, an addition may not necessarily wrap to a lower value). Secondly, because the occurrence of overflow itself may in some cases be undefined behavior. In the C language, overflow of unsigned integers results in wrapping, but overflow of signed integers is undefined behavior. Consequently, a C compiler is free to assume that the programmer has ensured that signed overflow cannot possibly occur and thus it may silently optimise out any check subsequent to the calculation that involves checking the result to detect it without giving the programmer any warning that this has been done. It is thus advisable to always implement checks before calculations, not after them.
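As an illustration of checking before the calculation, the sketch below tests whether a signed multiplication would overflow without ever performing it. The name `mul_would_overflow` is invented here; the case analysis by operand sign follows a widely used pattern and is one possible approach, not the only one.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a * b cannot be represented in an int.
   Only divisions are used, and none of them can overflow. */
bool mul_would_overflow(int a, int b) {
    if (a > 0) {
        if (b > 0) {
            return a > INT_MAX / b;   /* positive * positive */
        }
        return b < INT_MIN / a;       /* positive * non-positive */
    }
    if (b > 0) {
        return a < INT_MIN / b;       /* non-positive * positive */
    }
    return a != 0 && b < INT_MAX / a; /* non-positive * non-positive */
}
```

A caller would multiply only when the function returns false, so the undefined signed overflow is never executed.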
Explicit propagation
If a value is too large to be stored it can be assigned a special value indicating that overflow has occurred and then have all successive operations return this flag value. Such values are sometimes referred to as NaN, for «not a number». This is useful so that the problem can be checked once at the end of a long calculation rather than after each step. This is often supported in floating-point hardware called FPUs.
Programming language support
Programming languages implement various mitigation methods against an accidental overflow: Ada, Seed7, and certain variants of functional languages trigger an exception condition on overflow, while Python (since 2.4) seamlessly converts the internal representation of the number to match its growth, eventually representing it as long, whose capacity is limited only by the available memory.[15]
In languages with native support for arbitrary-precision arithmetic and type safety (such as Python, Smalltalk, or Common Lisp), numbers are promoted to a larger size automatically when overflows occur, or exceptions thrown (conditions signaled) when a range constraint exists. Using such languages may thus be helpful to mitigate this issue. However, in some such languages, situations are still possible where an integer overflow can occur. An example is explicit optimization of a code path which is considered a bottleneck by the profiler. In the case of Common Lisp, this is possible by using an explicit declaration to type-annotate a variable to a machine-size word (fixnum)[16] and lower the type safety level to zero[17] for a particular code block.[18][19][20][21]
In stark contrast to older languages such as C, some newer languages such as Rust provide built-in functions that allow easy detection and user choice over how overflow should be handled case-by-case. In Rust, while use of basic mathematic operators naturally lacks such flexibility, users can alternatively perform calculations via a set of methods provided by each of the integer primitive types. These methods give users several choices between performing a checked (or overflowing) operation (which indicates whether or not overflow occurred via the return type); an ‘unchecked’ operation; an operation that performs wrapping, or an operation which performs saturation at the numeric bounds.
Saturated arithmetic
In computer graphics or signal processing, it is typical to work on data that ranges from 0 to 1 or from −1 to 1. For example, take a grayscale image where 0 represents black, 1 represents white, and the values in between represent shades of gray. One operation that one may want to support is brightening the image by multiplying every pixel by a constant. Saturated arithmetic allows one to just blindly multiply every pixel by that constant without worrying about overflow by just sticking to a reasonable outcome that all these pixels larger than 1 (i.e., «brighter than white») just become white and all values «darker than black» just become black.
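The brightening operation can be sketched as a multiply followed by a clamp. Both function names are hypothetical; `brighten` works on the 0-to-1 range used in the example above, and `sat_add_u8` shows the same idea for 8-bit samples.

```c
#include <stdint.h>

/* Scale one grayscale sample in [0, 1] by `gain`, clamping to [0, 1]. */
float brighten(float pixel, float gain) {
    float v = pixel * gain;
    if (v > 1.0f) return 1.0f;  /* "brighter than white" becomes white */
    if (v < 0.0f) return 0.0f;  /* "darker than black" becomes black */
    return v;
}

/* Saturating 8-bit addition: results above 255 are clamped to 255. */
uint8_t sat_add_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + b;
    return sum > 255u ? (uint8_t)255 : (uint8_t)sum;
}
```

With wrapping arithmetic 200 + 100 would give 44, a dark pixel; the saturating version gives 255.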
Examples
Unanticipated arithmetic overflow is a fairly common cause of program errors. Such overflow bugs may be hard to discover and diagnose because they may manifest themselves only for very large input data sets, which are less likely to be used in validation tests.
Taking the arithmetic mean of two numbers by adding them and dividing by two, as done in many search algorithms, causes error if the sum (although not the resulting mean) is too large to be represented and hence overflows.[22]
An unhandled arithmetic overflow in the engine steering software was the primary cause of the crash of the 1996 maiden flight of the Ariane 5 rocket.[23] The software had been considered bug-free since it had been used in many previous flights, but those used smaller rockets which generated lower acceleration than Ariane 5. Frustratingly, the part of the software in which the overflow error occurred was not even required to be running for the Ariane 5 at the time that it caused the rocket to fail: it was a launch-regime process for a smaller predecessor of the Ariane 5 that had remained in the software when it was adapted for the new rocket. Further, the true cause of the failure was a flaw in the engineering specification of how the software dealt with the overflow when it was detected: it did a diagnostic dump to its bus, which would have been connected to test equipment during software testing during development but was connected to the rocket steering motors during flight; the data dump drove the engine nozzle hard to one side which put the rocket out of aerodynamic control and precipitated its rapid breakup in the air.[24]
On 30 April 2015, the U.S. Federal Aviation Administration announced that it would order Boeing 787 operators to reset the aircraft’s electrical system periodically to avoid an integer overflow that could lead to loss of electrical power and ram air turbine deployment; Boeing deployed a software update in the fourth quarter.[25] The European Aviation Safety Agency followed on 4 May 2015.[26] The error happens after 2^31 hundredths of a second (about 249 days), indicating a 32-bit signed integer.
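The 249-day figure can be checked with a short calculation. The helper below is a sketch with an invented name; it assumes a counter that starts at zero and advances a fixed number of times per second.

```c
/* Days until a counter that advances `ticks_per_second` times a second
   reaches 2^bits, starting from zero. */
double days_until_wrap(unsigned bits, double ticks_per_second) {
    double ticks = 1.0;
    for (unsigned i = 0; i < bits; i++) {
        ticks *= 2.0;  /* builds 2^bits exactly for the sizes used here */
    }
    return ticks / ticks_per_second / 86400.0;  /* 86,400 seconds per day */
}
```

For 31 bits and 100 ticks per second this gives roughly 248.55 days, consistent with the figure quoted above.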
Overflow bugs are evident in some computer games. In Super Mario Bros. for the NES, the stored number of lives is a signed byte (ranging from −128 to 127), meaning the player can safely have 127 lives, but when the player reaches their 128th life, the counter rolls over to zero lives (although the number counter is glitched before this happens) and stops keeping count. As such, if the player then dies, it is an immediate game over. This is a programming error: the developers may not have expected that a player could earn that many lives.
In the arcade game Donkey Kong, it is impossible to advance past level 22 due to an integer overflow in its time/bonus. The game calculates the time/bonus by taking the level number the player is on, multiplying it by 10, and adding 40. At level 22, the time/bonus number is 260, which is too large for an 8-bit register that can hold only 256 distinct values (0 to 255), so it overflows to a value of 4, too short to finish the level. In Donkey Kong Jr. Math, when trying to calculate a number over 10,000, it shows only the first 4 digits. Overflow is the cause of the famous «split-screen» level in Pac-Man.[27] Such a bug also caused the Far Lands in Minecraft Java Edition, which existed from the Infdev development period to Beta 1.7.3; it was fixed in Beta 1.8. The same bug also existed in Minecraft Bedrock Edition but has since been fixed.[28]
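The Donkey Kong calculation can be reproduced as follows. This is an illustrative sketch of the arithmetic described above, not the game's actual code, and `bonus_8bit` is an invented name.

```c
#include <stdint.h>

/* Time/bonus value as described in the text: level * 10 + 40,
   stored in 8 bits, so the result is reduced modulo 256. */
uint8_t bonus_8bit(unsigned level) {
    return (uint8_t)(level * 10u + 40u);
}
```

Level 21 still fits (250), while level 22 gives 260 mod 256 = 4.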
In the Super Nintendo Entertainment System (SNES) game Lamborghini American Challenge, the player can cause their amount of money to drop below $0 during a race by being fined over the limit of remaining money after paying the fee for a race, which glitches the integer and grants the player $65,535,000 more than they would have had after going negative.[29]
A similar glitch occurs in S.T.A.L.K.E.R.: Clear Sky where the player can drop into a negative amount by fast travelling without sufficient funds, then proceeding to the event where the player gets robbed and has all of their currency taken away. After the game attempts to take the player’s money away to an amount of $0, the player is granted 2147482963 in game currency.[30]
An integer overflow and signedness bug in the stack setup code emitted by the Pascal compiler prevented IBM–Microsoft Macro Assembler (MASM) version 1.00, a DOS program from 1981, and likely all other programs built with the same compiler, from running on newer DOS machines or emulators under some common configurations with more than 512 KB of memory. The program either hangs or displays an error message and exits to DOS.[31]
In August 2016, a casino machine at Resorts World casino printed a prize ticket of $42,949,672.76 as a result of an overflow bug. The casino refused to pay this amount, calling it a malfunction, using in their defense that the machine clearly stated that the maximum payout was $10,000, so any prize exceeding that had to be the result of a programming bug. The New York State Gaming Commission ruled in favor of the casino.[32]
See also
- Buffer overflow
- Heap overflow
- Modular arithmetic
- Pointer swizzling
- Software testing
- Stack buffer overflow
- Static program analysis
- Unix signal
References
- [1] ISO staff. «ISO/IEC 9899:2011 Information technology - Programming languages - C». ANSI.org.
- [2] «Wrap on overflow - MATLAB & Simulink». www.mathworks.com.
- [3] «Saturate on overflow - MATLAB & Simulink». www.mathworks.com.
- [4] Arithmetic underflow
- [5] «CWE - CWE-191: Integer Underflow (Wrap or Wraparound) (3.1)». cwe.mitre.org.
- [6] «Overflow And Underflow of Data Types in Java - DZone Java». dzone.com.
- [7] Mir, Tabish (4 April 2017). «Integer Overflow/Underflow and Floating Point Imprecision». medium.com.
- [8] «Integer underflow and buffer overflow processing MP4 metadata in libstagefright». Mozilla.
- [9] «Avoiding Buffer Overflows and Underflows». developer.apple.com.
- [10] «Operator expressions - The Rust Reference». Rust-lang.org. Retrieved 2021-02-12.
- [11] BillWagner. «Checked and Unchecked (C# Reference)». msdn.microsoft.com.
- [12] Seed7 manual, section 16.3.3 OVERFLOW_ERROR.
- [13] The Swift Programming Language. Swift 2.1 Edition. October 21, 2015.
- [14] As-if Infinitely Ranged Integer Model
- [15] Python documentation, section 5.1 Arithmetic conversions.
- [16] «Declaration TYPE». Common Lisp HyperSpec.
- [17] «Declaration OPTIMIZE». Common Lisp HyperSpec.
- [18] Reddy, Abhishek (2008-08-22). «Features of Common Lisp».
- [19] Pierce, Benjamin C. (2002). Types and Programming Languages. MIT Press. ISBN 0-262-16209-1.
- [20] Wright, Andrew K.; Felleisen, Matthias (1994). «A Syntactic Approach to Type Soundness». Information and Computation. 115 (1): 38–94. doi:10.1006/inco.1994.1093.
- [21] Macrakis, Stavros (April 1982). «Safety and power». ACM SIGSOFT Software Engineering Notes. 7 (2): 25–26. doi:10.1145/1005937.1005941. S2CID 10426644.
- [22] «Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken». googleresearch.blogspot.co.uk.
- [23] Gleick, James (1 December 1996). «A Bug and A Crash». The New York Times. Retrieved 17 January 2019.
- [24] Official report of Ariane 5 launch failure incident.
- [25] Mouawad, Jad (30 April 2015). «F.A.A. Orders Fix for Possible Power Loss in Boeing 787». New York Times.
- [26] «US-2015-09-07: Electrical Power – Deactivation». Airworthiness Directives. European Aviation Safety Agency. 4 May 2015.
- [27] Pittman, Jamey. «The Pac-Man Dossier».
- [28] «Minecraft Gamepedia Page». Minecraft Gamepedia.
- [29] Archived at Ghostarchive and the Wayback Machine: «Lamborghini American Challenge SPEEDRUN (13:24)». YouTube.
- [30] «Money glitch :: S.T.A.L.K.E.R.: Clear Sky General Discussions».
- [31] Lenclud, Christophe. «Debugging IBM MACRO Assembler Version 1.00».
- [32] Kravets, David (June 15, 2017). «Sorry ma’am you didn’t win $43M – there was a slot machine ‘malfunction’». Ars Technica.
External links
- Phrack #60, Basic Integer Overflows
- Phrack #60, Big Loop Integer Protection
- Efficient and Accurate Detection of Integer-based Attacks
- WASC Threat Classification – Integer Overflows
- Understanding Integer Overflow in C/C++
- Binary Overflow – Binary Arithmetic
- ISO C11 Standard
What is an integer overflow error?
Why do I care about such an error?
What are some methods of avoiding or preventing it?
asked Apr 14, 2010 at 21:46 by Earlz
Integer overflow occurs when you try to express a number that is larger than the largest number the integer type can handle.
If you try to express the number 300 in one byte, you have an integer overflow (maximum is 255). 100,000 in two bytes is also an integer overflow (65,535 is the maximum).
You need to care about it because mathematical operations won’t behave as you expect. A + B doesn’t actually equal the sum of A and B if you have an integer overflow.
You avoid it by not creating the condition in the first place (usually either by choosing your integer type to be large enough that you won’t overflow, or by limiting user input so that an overflow doesn’t occur).
answered Apr 14, 2010 at 21:51 by John
The easiest way to explain it is with a trivial example. Imagine we have a 4 bit unsigned integer. 0 would be 0000 and 1111 would be 15. So if you increment 15 instead of getting 16 you’ll circle back around to 0000 as 16 is actually 10000 and we can not represent that with less than 5 bits. Ergo overflow…
In practice the numbers are much bigger and it circles to a large negative number on overflow if the int is signed but the above is basically what happens.
Another way of looking at it is to consider it as largely the same thing that happens when the odometer in your car rolls over to zero again after hitting 999999 km/mi.
answered Apr 14, 2010 at 21:51 by Kris
When you store an integer in memory, the computer stores it as a series of bytes. These can be represented as a series of ones and zeros.
For example, zero will be represented as 00000000 (8 bit integers), and often, 127 will be represented as 01111111. If you add one to 127, this would «flip» the bits, and swap it to 10000000, but in a standard two’s complement representation, this is actually used to represent -128. This «overflows» the value.
With unsigned numbers, the same thing happens: 255 (11111111) plus 1 would become 100000000, but since there are only 8 «bits», this ends up as 00000000, which is 0.
You can avoid this by doing proper range checking for your correct integer size, or using a language that does proper exception handling for you.
answered Apr 14, 2010 at 21:52 by Reed Copsey
An integer overflow error occurs when an operation makes an integer value greater than its maximum.
For example, if the maximum value you can have is 100000, and your current value is 99999, then adding 2 will make it ‘overflow’.
You should care about integer overflows because data can be changed or lost inadvertently, and you can avoid them with either a larger integer type (see long int in most languages) or with a scheme that converts long strings of digits to very large integers.
answered Apr 14, 2010 at 21:50 by Riddari
Overflow is when the result of an arithmetic operation doesn’t fit in the data type of the operation. You can have overflow with a byte-sized unsigned integer if you add 255 + 1, because the result (256) does not fit in the 8 bits of a byte.
You can have overflow with a floating point number if the result of a floating point operation is too large to represent in the floating point data type’s exponent or mantissa.
You can also have underflow with floating point types when the result of a floating point operation is too small to represent in the given floating point data type. For example, if the floating point data type can handle exponents in the range of -100 to +100, and you square a value with an exponent of -80, the result will have an exponent around -160, which won’t fit in the given floating point data type.
You need to be concerned about overflows and underflows in your code because it can be a silent killer: your code produces incorrect results but might not signal an error.
Whether you can safely ignore overflows depends a great deal on the nature of your program — rendering screen pixels from 3D data has a much greater tolerance for numerical errors than say, financial calculations.
Overflow checking is often turned off in default compiler settings. Why? Because the additional code to check for overflow after every operation takes time and space, which can degrade the runtime performance of your code.
Do yourself a favor and at least develop and test your code with overflow checking turned on.
answered Apr 14, 2010 at 23:52 by dthorpe
I find showing the Two’s Complement representation on a disc very helpful.
Here is a representation for 4-bit integers. The maximum value is 2^3-1 = 7.
For 32 bit integers, we will see the maximum value is 2^31-1.
When we add 1 to 2^31-1 : Clockwise we move by one and it is clearly -2^31 which is called integer overflow
Ref : https://courses.cs.washington.edu/courses/cse351/17wi/sections/03/CSE351-S03-2cfp_17wi.pdf
answered Dec 29, 2020 at 17:41 by mcvkr
From Wikipedia:
In computer programming, an integer overflow occurs when an arithmetic operation attempts to create a numeric value that is larger than can be represented within the available storage space. For instance, adding 1 to the largest value that can be represented constitutes an integer overflow. The most common result in these cases is for the least significant representable bits of the result to be stored (the result is said to wrap).
You should care about it especially when choosing the appropriate data types for your program or you might get very subtle bugs.
answered Apr 14, 2010 at 21:49 by Darin Dimitrov
From http://www.first.org/conference/2006/papers/seacord-robert-slides.pdf :
An integer overflow occurs when an integer is increased beyond its maximum value or decreased beyond its minimum value. Overflows can be signed or unsigned.
P.S.: The PDF has detailed explanation on overflows and other integer error conditions, and also how to tackle/avoid them.
answered Apr 14, 2010 at 21:50 by N 1.1
I’d like to be a bit contrarian to all the other answers so far, which somehow accept crappy broken math as a given. The question is tagged language-agnostic and in a vast number of languages, integers simply never overflow, so here’s my kind-of sarcastic answer:
What is an integer overflow error?
An obsolete artifact from the dark ages of computing.
why do i care about it?
You don’t.
how can it be avoided?
Use a modern programming language in which integers don’t overflow. (Lisp, Scheme, Smalltalk, Self, Ruby, Newspeak, Ioke, Haskell, take your pick …)
answered Apr 14, 2010 at 23:17 by Jörg W Mittag
This happens when you attempt to use an integer for a value that is higher than the internal structure of the integer can support due to the number of bytes used. For example, if the maximum integer size is 2,147,483,647 and you attempt to store 3,000,000,000 you will get an integer overflow error.
answered Apr 14, 2010 at 21:50
Integer overflow
You already know that adding integers can overflow. The catch is that the computer gives no warning when overflow occurs: the program keeps running with incorrect data. Moreover, the behavior on overflow is defined only for unsigned integers.
Overflow can lead to serious problems: zeroed or lost data, possible exploits, and hard-to-find errors that accumulate over time.
Let us look at several techniques for detecting overflow of signed integers and of unsigned integers.
1. Checking the data beforehand. The header limits.h gives the maximum and minimum values of type int. If both numbers are positive, their sum exceeds INT_MAX exactly when INT_MAX minus one of the numbers is less than the other. If both numbers are negative, their sum falls below INT_MIN exactly when INT_MIN minus one of the numbers is greater than the other. If the two numbers have different signs, their sum can never go beyond INT_MAX or INT_MIN.

    #include <limits.h>

    /* Adds a and b only after checking that the sum fits in an int. */
    int sum1(int a, int b, int *overflow) {
        int c = 0;
        if ((a > 0 && b > 0 && INT_MAX - b < a) ||
            (a < 0 && b < 0 && INT_MIN - b > a)) {
            *overflow = 1;   /* the sum would not fit */
        } else {
            *overflow = 0;
            c = a + b;
        }
        return c;
    }

In this function the variable overflow is set to 1 if the addition would overflow. The function returns the sum when there is no overflow and 0 otherwise.
2. The second approach is to compute the sum in a type whose maximum (and minimum) value is guaranteed to be wider than the sum of any two int values. After the addition, check that the sum is no greater than INT_MAX and no less than INT_MIN.
int sum2(int a, int b, int *overflow) {
    /* Widen both operands first so the addition itself is done in long long
       (assumes long long is wider than int, as on mainstream platforms) */
    long long c = (long long) a + (long long) b;
    if (c <= INT_MAX && c >= INT_MIN) {
        *overflow = 0;
    } else {
        *overflow = 1;
        c = 0;  /* avoid converting an out-of-range value back to int */
    }
    return (int) c;
}
Note the explicit casts. Without them, the overflow would happen first, in int arithmetic, and the wrong number would be stored in c.
3. The third approach is platform-dependent, and its implementation differs between compilers. When a signed integer addition overflows, the CPU (usually) sets the overflow flag in its flags register. Assembly code can test that flag immediately after the addition.
int sum3(int a, int b, int *overflow) {
    int noOverflow = 1;
    int c = a + b;
    /* MSVC x86 inline assembly; relies on the flags still reflecting the addition above */
    __asm {
        jno NO_OVERFLOW
        mov noOverflow, 0
    NO_OVERFLOW:
    }
    if (noOverflow) {
        *overflow = 0;
    } else {
        *overflow = 1;
    }
    return c;
}
Here the variable noOverflow stays 1 if there was no overflow. jno (jump if no overflow) jumps to the NO_OVERFLOW label when the overflow flag is clear.
If there was an overflow, the instruction
mov noOverflow, 0
is executed instead.
It stores zero at the address of the variable noOverflow.
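As a more portable alternative to inline assembly, GCC and Clang provide overflow-checking builtins. The sketch below is not part of the original tutorial: sum_builtin is a hypothetical helper that follows the same contract as sum1 (returns 0 when the addition overflows) and assumes a compiler that supports __builtin_add_overflow.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper with the same contract as sum1: returns the sum,
   or 0 with *overflow set to 1 when the sum does not fit in int.
   Assumes GCC or Clang, which provide __builtin_add_overflow. */
int sum_builtin(int a, int b, int *overflow)
{
    int c;
    /* The builtin returns true when the exact mathematical sum
       cannot be represented in c. */
    bool wrapped = __builtin_add_overflow(a, b, &c);
    *overflow = wrapped ? 1 : 0;
    return wrapped ? 0 : c;
}
```

Compilers typically lower the builtin to the same flag test that sum3 performs by hand, without depending on a particular inline assembly syntax.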
Unsigned numbers are much simpler to handle: on overflow the value wraps around past zero, and the result is guaranteed to be smaller than either operand.
unsigned usumm(unsigned a, unsigned b, int *overflow) {
    unsigned c = a + b;
    /* A wrapped sum is smaller than both operands */
    if (c < a || c < b) {
        *overflow = 1;
    } else {
        *overflow = 0;
    }
    return c;
}
Here is the complete test program with the checks. The definitions of sum1, sum2, sum3 and usumm are the ones given above and are not repeated.
#include <conio.h>   /* getch(); MSVC-specific */
#include <stdio.h>
#include <limits.h>

/* sum1, sum2, sum3 and usumm go here, exactly as defined above */

int main(void) {
    int overflow;
    int sum;
    unsigned usum;

    /* sum1 */
    sum = sum1(3, 5, &overflow);
    printf("%d + %d = %d (%d)\n", 3, 5, sum, overflow);
    sum = sum1(INT_MAX, 5, &overflow);
    printf("%d + %d = %d (%d)\n", INT_MAX, 5, sum, overflow);
    sum = sum1(INT_MIN, -5, &overflow);
    printf("%d + %d = %d (%d)\n", INT_MIN, -5, sum, overflow);
    sum = sum1(INT_MAX, INT_MIN, &overflow);
    printf("%d + %d = %d (%d)\n", INT_MAX, INT_MIN, sum, overflow);
    sum = sum1(-10, -20, &overflow);
    printf("%d + %d = %d (%d)\n", -10, -20, sum, overflow);

    /* sum2 */
    sum = sum2(3, 5, &overflow);
    printf("%d + %d = %d (%d)\n", 3, 5, sum, overflow);
    sum = sum2(INT_MAX, 5, &overflow);
    printf("%d + %d = %d (%d)\n", INT_MAX, 5, sum, overflow);
    sum = sum2(INT_MIN, -5, &overflow);
    printf("%d + %d = %d (%d)\n", INT_MIN, -5, sum, overflow);
    sum = sum2(INT_MAX, INT_MIN, &overflow);
    printf("%d + %d = %d (%d)\n", INT_MAX, INT_MIN, sum, overflow);
    sum = sum2(-10, -20, &overflow);
    printf("%d + %d = %d (%d)\n", -10, -20, sum, overflow);

    /* sum3 */
    sum = sum3(3, 5, &overflow);
    printf("%d + %d = %d (%d)\n", 3, 5, sum, overflow);
    sum = sum3(INT_MAX, 5, &overflow);
    printf("%d + %d = %d (%d)\n", INT_MAX, 5, sum, overflow);
    sum = sum3(INT_MIN, -5, &overflow);
    printf("%d + %d = %d (%d)\n", INT_MIN, -5, sum, overflow);
    sum = sum3(INT_MAX, INT_MIN, &overflow);
    printf("%d + %d = %d (%d)\n", INT_MAX, INT_MIN, sum, overflow);
    sum = sum3(-10, -20, &overflow);
    printf("%d + %d = %d (%d)\n", -10, -20, sum, overflow);

    /* usumm */
    usum = usumm(10u, 20u, &overflow);
    printf("%u + %u = %u (%d)\n", 10u, 20u, usum, overflow);
    usum = usumm(UINT_MAX, 20u, &overflow);
    printf("%u + %u = %u (%d)\n", UINT_MAX, 20u, usum, overflow);
    usum = usumm(20u, UINT_MAX, &overflow);
    printf("%u + %u = %u (%d)\n", 20u, UINT_MAX, usum, overflow);

    getch();  /* wait for a key press before the console closes */
    return 0;
}
Overflow occurs when an operation on two numbers produces a value above the maximum (or below the minimum) that the data type can hold. Integral types are usually assumed to be large enough, and people overlook the fact that the sum of two numbers can fall outside the range. In scientific and mathematical computation, though, this does happen. For example, an unhandled arithmetic overflow in the engine steering software was the primary cause of the crash of the maiden flight of the Ariane 5 rocket. The software had been considered bug-free since it had been used in many previous flights, but those used smaller rockets which generated smaller accelerations than Ariane 5's. This article explains how the problem can be tackled.
In this article, we deal only with integral types (not with types like float and double).
To understand how to tackle this problem, we first need to know how numbers are stored.
About integers:
If the size of a data type is n bytes, it can store 2^(8n) different values. This is called the data type's range.
If the size of an unsigned data type is n bytes, it ranges from 0 to 2^(8n) - 1.
If the size of a signed data type is n bytes, it ranges from -2^(8n-1) to 2^(8n-1) - 1.
So a short (usually 2 bytes) ranges from -32768 to 32767, and an unsigned short ranges from 0 to 65535.
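The range formulas can be checked against the constants in limits.h. The following is a minimal sketch; the helper names are made up for illustration, and it assumes two's complement types without padding bits.

```c
#include <limits.h>
#include <stddef.h>

/* Largest value of an unsigned type occupying n bytes: 2^(8n) - 1.
   Hypothetical helper; valid for 1 <= n <= sizeof(unsigned long long). */
unsigned long long unsigned_max_for_bytes(size_t n)
{
    if (n >= sizeof(unsigned long long))
        return ULLONG_MAX;          /* shifting by the full width is not allowed */
    return (1ULL << (CHAR_BIT * n)) - 1ULL;
}

/* Largest value of a signed type occupying n bytes: 2^(8n-1) - 1 */
long long signed_max_for_bytes(size_t n)
{
    return (long long)(unsigned_max_for_bytes(n) >> 1);
}

/* Smallest value of a signed type occupying n bytes: -2^(8n-1) */
long long signed_min_for_bytes(size_t n)
{
    return -signed_max_for_bytes(n) - 1;
}
```

For n = 2 these give 65535, 32767 and -32768, matching the short and unsigned short ranges quoted above.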
Consider a short variable having a value of 250.
It is stored in the computer like this (in binary format):
00000000 11111010
The complement of a number is the number with its bits toggled. It is denoted by ~.
For example, ~250 is 11111111 00000101.
Negative numbers are stored using the two's complement system. In this system, -n = ~n + 1.
-250 is stored as 11111111 00000110.
http://stackoverflow.com/questions/1049722/what-is-2s-complement
10000000 00000000 (-32768) has no positive counterpart. Its negative is the number itself (try -n = ~n + 1).
11100010 01110101 is read as 57973 if the data type is unsigned and as -7563 if it is signed. If you add 65536 (the range) to -7563, you get 57973.
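The bit patterns above can be reproduced with the fixed-width types from stdint.h. This is a small sketch with made-up helper names; it copies bytes with memcpy so that it does not depend on how the compiler converts out-of-range values.

```c
#include <stdint.h>
#include <string.h>

/* Read the same 16 bits as a signed two's complement value. */
int16_t as_signed16(uint16_t bits)
{
    int16_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Two's complement negation done by hand on the bit pattern: ~n + 1 */
uint16_t negate_bits16(uint16_t n)
{
    return (uint16_t)(~(unsigned int)n + 1u);
}
```

With these, negate_bits16(250) yields the pattern 11111111 00000110 (0xFF06), as_signed16(0xFF06) is -250, and as_signed16(0xE275) is -7563 for the pattern that reads as 57973 unsigned.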
Overflow:
Consider a data type var_t of 1 byte (range is 256):
signed var_t a,b;
unsigned var_t c,d;
If c is 200 (11001000) and d is 100 (01100100), c+d is 300 (00000001 00101100), which is more than the max value 255 (11111111). 00000001 00101100 does not fit in one byte, so the higher byte is discarded and c+d is read as 44. So 200+100=44! This is absurd! (Note that 44=300-256.) This is an example of an unsigned overflow, where the value could not be stored in the available number of bytes. In such overflows, the result is reduced modulo the range (here, 256).
If a is 100 (01100100) and b is 50 (00110010), a+b is 150 (10010110), which is more than the max value 127. Instead, a+b will be read as -106 (note that -106=150-256). This is an example of a signed overflow, where on two's complement hardware the result is again reduced modulo the range (here, 256). Note that C and C++ leave signed overflow undefined, so this wrapping is typical rather than guaranteed.
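Both one-byte examples can be reproduced with uint8_t and int8_t standing in for the hypothetical var_t. This is a sketch only: the signed version does its arithmetic on the unsigned bit pattern and copies the bits back, so it shows the wraparound without relying on undefined or implementation-defined behavior.

```c
#include <stdint.h>
#include <string.h>

/* Unsigned one-byte addition: c + d is computed as int (300 for 200 + 100),
   and the cast keeps only the low 8 bits, i.e. the sum modulo 256. */
uint8_t add_u8(uint8_t c, uint8_t d)
{
    return (uint8_t)(c + d);
}

/* Signed one-byte addition with explicit two's complement wraparound. */
int8_t add_s8_wrapping(int8_t a, int8_t b)
{
    uint8_t bits = (uint8_t)((uint8_t)a + (uint8_t)b);
    int8_t result;
    memcpy(&result, &bits, sizeof result);  /* reinterpret the 8 bits as signed */
    return result;
}
```

add_u8(200, 100) gives 44 and add_s8_wrapping(100, 50) gives -106, the two values derived above.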
Detecting overflow:
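Division and modulo cannot overflow, with one exception for signed types: dividing the minimum value by -1 (for example, -32768 / -1 for a 2-byte short), whose result has no positive counterpart.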
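One edge case is worth guarding in signed division: in two's complement, the minimum value divided by -1 is not representable, and in C both INT_MIN / -1 and division by zero are undefined. A minimal sketch of a guarded division follows; checked_div is a hypothetical name.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a / b in *quotient and returns true,
   or returns false when the division is not safe to perform. */
bool checked_div(int a, int b, int *quotient)
{
    if (b == 0)
        return false;               /* division by zero */
    if (a == INT_MIN && b == -1)
        return false;               /* -INT_MIN does not fit in int */
    *quotient = a / b;
    return true;
}
```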
Addition overflow:
Overflow can only occur when both numbers being added have the same sign (which is always the case for unsigned numbers).
Signed overflow can be easily detected by checking whether the sign of the result is opposite to that of the operands.
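The sign rule can be sketched in C as follows. Because signed overflow itself is undefined in C, this version performs the addition on unsigned copies (where wraparound is defined) and then compares sign bits. The function name is made up, and the code assumes int is two's complement with the same width as unsigned int.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true when a + b would overflow int. */
bool signed_add_overflows(int a, int b)
{
    unsigned int ua = (unsigned int)a;
    unsigned int ub = (unsigned int)b;
    unsigned int sum = ua + ub;                 /* wraps modulo 2^N, well defined */
    unsigned int sign_bit = ~(UINT_MAX >> 1);   /* only the top bit set */

    /* (ua ^ sum) has the top bit set when the result's sign differs from a's,
       and likewise for b. Overflow means it differs from both operands. */
    return ((ua ^ sum) & (ub ^ sum) & sign_bit) != 0;
}
```

Operands with different signs can never make both XOR terms carry the sign bit, which matches the observation that such additions cannot overflow.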
Let us analyze overflow in unsigned integer addition.
Consider two variables a and b of a data type with size n and range R.
Let + be actual mathematical addition and a$b be the addition that the computer performs.
If a+b <= R-1, then a$b = a+b.
As a and b are unsigned, a$b is greater than or equal to both a and b.
If a+b >= R, then a$b = a+b-R.
As R is more than both a and b, a-R and b-R are negative.
So a+b-R < a and a+b-R < b.
Therefore, a$b is less than both a and b.
This difference can be used to detect unsigned addition overflow. a-b can be treated as a+(-b), so subtraction can be handled in the same way.
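The result of this analysis translates directly into code. A minimal sketch with hypothetical function names: the addition has wrapped exactly when the computed result is smaller than an operand, and the subtraction has wrapped exactly when the subtrahend is larger than the minuend.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a$b (the machine sum) in *result; returns true if it wrapped. */
bool unsigned_add_wraps(unsigned int a, unsigned int b, unsigned int *result)
{
    *result = a + b;
    return *result < a;     /* a wrapped sum is smaller than both operands */
}

/* Stores the machine difference in *result; returns true if it wrapped. */
bool unsigned_sub_wraps(unsigned int a, unsigned int b, unsigned int *result)
{
    *result = a - b;
    return a < b;           /* the true difference would be negative */
}
```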
Multiplication overflow:
There are two ways to detect an overflow:
1. If a*b > max, then a > max/b (max is R-1 if unsigned and R/2-1 if signed). So, for positive a and b, testing a > max/b before multiplying detects the overflow.
2. Let there be a data type of size n and range R called var_t, and a data type of size 2n called var2_t.
Let there be two variables of var_t called a and b. The range of var2_t is R*R, which is always more than the product of a and b. Hence, if var2_t(a)*var2_t(b) > max, overflow has happened.
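Both methods can be sketched with fixed-width types, using uint32_t and int32_t in the role of var_t and the 64-bit types in the role of var2_t. The function names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Method 1: division test. a * b > max exactly when b != 0 and a > max / b. */
bool umul_overflows_div(uint32_t a, uint32_t b)
{
    return b != 0 && a > UINT32_MAX / b;
}

/* Method 2: multiply in a type twice as wide, then compare with max. */
bool umul_overflows_wide(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a * (uint64_t)b;
    return wide > UINT32_MAX;
}

/* Method 2 for signed operands: the wide product must fit in [min, max]. */
bool smul_overflows_wide(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return wide > INT32_MAX || wide < INT32_MIN;
}
```

The widening method is the simpler one to get right for signed operands, since the division test needs separate handling for each combination of signs.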
Truncation:
This happens when a shorter variable is assigned from a longer one. For example, short a; long b=70000; a=b;
Only the lower bits are copied, and the value is reinterpreted.
short a; int b=57973; a=b;
shows the same behaviour: a becomes -7563.
Similar behaviour is seen if int is replaced by unsigned short.
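A sketch of the truncation, written so that every step is well defined. Assigning an out-of-range value directly to a short is implementation-defined in standard C (mainstream compilers keep the low bits), so the helpers below, whose names are made up, truncate through an unsigned type and then reinterpret the bits.

```c
#include <stdint.h>
#include <string.h>

/* Keep only the low 16 bits of a wider value (value modulo 65536). */
uint16_t low16(uint32_t wide)
{
    return (uint16_t)wide;
}

/* The same low 16 bits, read as a signed two's complement value. */
int16_t low16_signed(uint32_t wide)
{
    uint16_t bits = low16(wide);
    int16_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

low16_signed(70000) is 4464 (70000 - 65536), and low16_signed(57973) is -7563, the value quoted above.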
Type conversion:
Consider unsigned int a=4294967290; int b=-6; return (a==b);
This returns 1.
Whenever an operation is performed between an unsigned and a signed variable of the same size, the operands are converted to unsigned.
Whenever an operation is performed between a longer type and a shorter type, the operands are converted to the longer type.
The above code returned 1 because a and b were both converted to unsigned int and then compared.
If we used unsigned __int64 (an unsigned 64-bit type) instead of unsigned int and 18446744073709551610 instead of 4294967290, the result would have been the same.
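The conversion can be made visible by writing it out. In this sketch (function names are hypothetical), mixed_equal spells out with a cast what the compiler does implicitly for a == b, while value_equal compares the mathematical values instead.

```c
#include <limits.h>

/* What a == b means for unsigned int a and int b: b is converted to
   unsigned int first. The explicit cast only makes that step visible. */
int mixed_equal(unsigned int a, int b)
{
    return a == (unsigned int)b;
}

/* A comparison by mathematical value: a negative b never equals an unsigned a. */
int value_equal(unsigned int a, int b)
{
    return b >= 0 && a == (unsigned int)b;
}
```

With a 32-bit unsigned int, UINT_MAX - 5 is 4294967290, so mixed_equal(UINT_MAX - 5, -6) is 1 while value_equal reports 0 for the same arguments.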
Type promotion:
Whenever an operation is performed on two variables of a type shorter than int, both are converted to int. For example, short a=32000,b=32000; cout<<a+b<<endl;
displays 64000, which is more than the max value of short. The reason is that a and b are converted to int, so a+b is an int, which can hold 64000.
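The same promotion can be observed in C. This is a small sketch with illustrative function names, assuming int is wider than 16 bits (as on all mainstream platforms).

```c
#include <stddef.h>

/* a + b is evaluated in int, so 32000 + 32000 does not overflow here. */
int add_shorts_as_int(short a, short b)
{
    return a + b;
}

/* The type of the expression a + b is int, not short. */
size_t size_of_short_sum(void)
{
    short a = 1, b = 1;
    return sizeof(a + b);
}
```

add_shorts_as_int(32000, 32000) is 64000, and size_of_short_sum() equals sizeof(int). Overflow only comes back into play if the result is stored in a short again.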
Libraries:
Microsoft Visual C++ 2010 has a header file safeint.h, which provides functions such as SafeAdd, SafeSubtract, etc. It is a templated header (and hence header-only).
The word "overflow" itself describes the vulnerability we are going to discuss in this post quite well. Consider a glass into which water is being poured. As long as the total volume of water poured is less than or equal to the volume of the glass, everything is rosy. Once the volume of water exceeds the volume of the glass, it "overflows" and you are left with a wet floor to mop.
Variable type specifiers and modifiers
In C (and many other languages), values are stored in variables of a particular type. (Of course, there are languages such as Python where the programmer does not need to specify variable types.) There are four basic variable types:
- char
- int
- float
- double
In addition to the types above (also called type specifiers), there are four type modifiers:
- signed
- unsigned
- short
- long
The variable type together with an (optional) modifier determines the maximum and minimum values the variable can store. For example, on 32-bit machines a signed int can store values from -2,147,483,648 to 2,147,483,647; in other words, a signed int is 4 bytes in size. For more on data type sizes, see C Data Types.
Note
The system I use for all examples:
$ uname -a
Linux ubuntu 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 09:00:45 UTC 2018 i686 i686 i686 GNU/Linux
The Ariane 5 rocket crash
If we have a signed int variable named, say, x, and we try to store a value beyond the maximum or minimum allowed for a signed int, it overflows. In weakly typed languages such as C, overflow situations are usually not flagged as errors at compile time. gcc will most likely emit a warning in such cases, but it can be ignored (which is a problem). Consider the following program:
#include <stdio.h>

int main(void)
{
    int x = 0x180000000;
    printf("%d\n", x);
    return 0;
}
In C, the default type modifier for int is signed. So we have a signed int variable into which we put (for whatever reason) the value 0x180000000, which is hexadecimal for 6,442,450,944 and obviously exceeds the maximum allowed value of 2,147,483,647. Let's see what happens when we compile with gcc and run:
$ gcc -o overflow overflow.c
overflow.c: In function ‘main’:
overflow.c:5:13: warning: overflow in implicit constant conversion [-Woverflow]
int x = 0x180000000;
^
$ ./overflow
-2147483648
C simply looked at the low 32 bits and discarded the 33rd bit and above. This is a case of truncation caused by overflow. Imagine a situation where the value of x is needed at a later stage of the program and x is expected to hold 0x180000000. In the worst case, this leads to disaster. Consider the case of the Ariane 5 rocket, which crashed because of a small computer program trying to stuff a 64-bit number into a 16-bit space.
Integer overflow: examples
Generally, not all vulnerabilities can be exploited by attackers. Overflow vulnerabilities, however, are particularly dangerous because they can make a program unstable even in normal operation and in many cases can also be exploited. Before we look at further examples, here is something to keep in mind:
The circle of integers
When I was learning about integer overflows, I found that the Circle of Integers (a completely made-up name) helps build an intuitive understanding of the subject.
For a signed int, the most significant bit (MSB) also marks the sign of the value: it is the sign bit. For example, 0x7fffffff is 2147483647, but 0x80000000 is -2147483648, which has the same magnitude as 2³¹ but a negative sign, because the MSB is 1. For an unsigned int there is no such sign bit, so 0x80000000 is 2147483648.
Overflow occurs when two numbers with:
- the same sign are added and the result has the opposite sign,
- different signs are subtracted and the sign of the result differs from that of the first operand.
Addition can be viewed as moving clockwise around the circle of integers, and subtraction as moving counterclockwise. Now that we know something about overflow conditions, let's look at two examples.
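The subtraction condition can be checked before the operation is performed, which avoids ever executing an overflowing signed subtraction. A minimal sketch follows; signed_sub_overflows is a hypothetical name, not something from the original post.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true when a - b would leave the range of int. */
bool signed_sub_overflows(int a, int b)
{
    if (b > 0)
        return a < INT_MIN + b;   /* moving counterclockwise past INT_MIN */
    return a > INT_MAX + b;       /* b <= 0: moving clockwise past INT_MAX */
}
```

Neither comparison can itself overflow: INT_MIN + b is only evaluated for positive b, and INT_MAX + b only for b that is zero or negative.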
Integer overflow: addition
Consider the following program:
#include <stdio.h>

int main(void) {
    int num1 = 0x7fffffff;
    int num2 = 0x1;
    int addition_result = num1 + num2;
    printf("%d\n", addition_result);
    unsigned int num3 = 0xffffffff;
    unsigned int num4 = 0x1;
    unsigned int uaddition_result = num3 + num4;
    printf("%u\n", uaddition_result);
    return 0;
}
To the untrained eye, addition_result would be expected to hold 0x80000000 (or 2,147,483,648) and uaddition_result 0x100000000 (or 4,294,967,296). Let's see the actual values:
$ gcc -o overflow overflow.c
$ ./overflow
-2147483648
0
What the …? Here is what happened:
- When 0x1 was added to 0x7fffffff, the result became 0x80000000. So why did that cause an overflow? Surely the answer can still be represented in 32 bits? Actually, no. For a signed int only 31 bits are available for the value, because the 32nd bit is the sign bit. Here the sum needed the 32nd bit as well, which caused the overflow. (Strictly speaking, signed overflow is undefined behavior in C; the wrapped value is simply what this build produced.)
- In the unsigned int addition, representing the sum required a 33rd bit, which caused the overflow.
Again, imagine that the values of these variables were needed at later stages of the program. What a disaster it would be if an operation expecting 0x100000000 received 0x0 instead.
Integer overflow: multiplication
I have a perfect example for this. The following code is a snippet from one of the picoCTF 2018 challenges:
…
if(number_flags > 0){
    int total_cost = 0;
    total_cost = 1000*number_flags;
    printf("\nYour total cost is: %d\n", total_cost);
    if(total_cost <= account_balance){
        account_balance = account_balance - total_cost;
        printf("\nYour new balance: %d\n\n", account_balance);
    }
    else{
        printf("Not enough funds\n");
    }
…
Here is how this code can be exploited:
$ nc 2018shell1.picoctf.com 5795
Welcome to the Store App V1.0
World's Most Secure Purchasing App
[1] Check Account Balance
[2] Buy Stuff
[3] Exit
Enter a menu selection
1
Balance: 1100
Welcome to the Store App V1.0
World's Most Secure Purchasing App
[1] Check Account Balance
[2] Buy Stuff
[3] Exit
Enter a menu selection
2
Current Auctions
[1] I Can't Believe its not a Flag!
[2] Real Flag
1
Imitation Flags cost 1000 each, how many would you like?
1000000000
Your total cost is: -727379968
Your new balance: 727381068
And that is an integer overflow caused by multiplication! Note this statement in the source code:
total_cost = 1000*number_flags
What if the value of the product exceeds the range of positive values that a signed int variable such as total_cost can hold? It becomes negative, and that is exactly what happened! Imagine if real production software contained code like this!
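One way such a purchase step could be hardened is to validate the quantity against INT_MAX before multiplying. The sketch below is a hypothetical rewrite, not the challenge's actual fix: buy_flags and FLAG_PRICE are names introduced here for illustration.

```c
#include <limits.h>
#include <stdbool.h>

#define FLAG_PRICE 1000   /* price per imitation flag, as in the snippet */

/* Hypothetical hardened purchase: returns true and debits the balance only
   when the total cost is representable and affordable. */
bool buy_flags(int number_flags, int *account_balance)
{
    if (number_flags <= 0)
        return false;
    if (number_flags > INT_MAX / FLAG_PRICE)
        return false;               /* 1000 * number_flags would overflow int */

    int total_cost = FLAG_PRICE * number_flags;   /* cannot overflow now */
    if (total_cost > *account_balance)
        return false;               /* not enough funds */

    *account_balance -= total_cost;
    return true;
}
```

With a balance of 1100, a request for 1000000000 flags is rejected by the range check, so the balance stays at 1100 instead of growing.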
Done!
That's all! In this article we looked specifically at integer overflows, but overflow can occur in a variable of any data type.
Overflow vulnerabilities creep into code easily and are quite hard to detect. The simplest way to mitigate them is not to introduce them in the first place. Fuzzing, however, is a good technique for detecting overflows. Code review can help too, but it takes an experienced eye. Overflow vulnerabilities are dangerous, as you have already seen from the Ariane 5 and picoCTF examples.
Thanks for reading, and if you have any questions, please let me know in the comments section below and I will get back to you as soon as I can.