Working with memory


There is a widespread view that the ordinary PHP developer does not need to manage memory, but "managing" and "knowing" are slightly different things. I will try to shed light on some aspects of memory management when working with variables and arrays, and on some interesting pitfalls of PHP's internal optimization. As you will see, the optimization is good, but if you do not know exactly how it works, you can run into pitfalls that will make you pretty nervous.

Overview


Learning the basics

In PHP a variable consists of two parts: a "name", which is stored in a hash table (the symbol_table), and a "value", which is stored in a zval container.
This scheme allows several variables to refer to one value, which in some cases reduces memory usage. How this looks in practice is described further on.
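
As a small illustration of the name / value split, the sketch below binds two names to one value by reference and then makes an ordinary copy. The variable names are made up for this example and do not come from the article's scripts.

```php
<?php
// Two names bound to one value by reference: a write through either name
// is visible through the other.
$first = 'original';
$second = &$first;      // $second is now another name for the same value
$second = 'changed';

echo $first, PHP_EOL;   // prints "changed"

// A plain assignment behaves like an independent variable from the script's
// point of view, even if the value is shared internally until a write happens.
$copy = $first;
$copy = 'copy only';

echo $first, PHP_EOL;   // still prints "changed"
```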

The most common code elements, without which it is hard to imagine a working script, are the following:
- Creating, assigning and removing variables (numbers, strings, etc.).
- Creating arrays and iterating over them (foreach is used as the example).
- Passing values to and returning values from functions / methods.

These are the aspects of working with memory discussed in this post. It turned out rather long, but there is nothing overly complex here: everything is quite simple, clear and illustrated with examples.

The first example of working with memory

Here is a basic example: an analysis of memory usage.
For this we need a couple of simple functions (file func.php):

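<?php
// Prints how many bytes are in use on top of the baseline.
function memoryUsage($usage, $base_memory_usage) {
    printf("Bytes diff: %d\n", $usage - $base_memory_usage);
}

// Returns a 15360-byte string (15 characters repeated 1024 times).
function someBigValue() {
    return str_repeat('SOME BIG STRING', 1024);
}
?>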

And here is the first simple memory usage test, for a string:

<?php
include('func.php');
echo "String memory usage test.\n\n";
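// The baseline is assigned twice, apparently so that the memory for the
// variable itself is already allocated when the second reading is taken.
$base_memory_usage = memory_get_usage();
$base_memory_usage = memory_get_usage();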

echo "Start\n";
memoryUsage(memory_get_usage(), $base_memory_usage);

$a = someBigValue();

echo "String value setted\n";
memoryUsage(memory_get_usage(), $base_memory_usage);

unset($a);

echo "String value unsetted\n";
memoryUsage(memory_get_usage(), $base_memory_usage);
?>

Annotation:
Of course, this code is not optimal, but what matters here is that it shows memory usage as clearly as possible.

The result is predictable:
String memory usage test.

Start
Bytes diff: 0
String value setted
Bytes diff: 15448
String value unsetted
Bytes diff: 0

The same example, but with $a = null; instead of unset($a):
Start
Bytes diff: 0
String value setted
Bytes diff: 15448
String value set to null
Bytes diff: 76

As you can see, the variable has not been removed completely: 76 bytes are still taken by it.
That is quite a lot, considering that the same amount is allocated for boolean, integer and float variables. This is not the memory allocated for the value itself, but the total memory used to store the variable (the zval container with the value plus the variable name).
So if you want to free memory by assignment, it does not have to be null that you assign. The expression $a=10000; gives the same memory usage.
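
The difference between the two ways of releasing a value can be checked with a sketch like the one below. It is self-contained, and reportDelta() is a hypothetical helper, not part of func.php. Exact byte counts depend on the PHP version and build, so no expected numbers are shown; what does not depend on the version is that the name $a survives an assignment of null and disappears after unset().

```php
<?php
// Hypothetical helper (not from the article): prints the change in
// memory_get_usage() relative to a baseline.
function reportDelta($label, $baseline) {
    printf("%-20s %d bytes\n", $label, memory_get_usage() - $baseline);
}

$baseline = 0;                       // create the variable first so that its
$baseline = memory_get_usage();      // own storage is not counted in the deltas

$a = str_repeat('SOME BIG STRING', 1024);
reportDelta('string assigned', $baseline);

$a = null;                           // the value is released, the name $a stays
reportDelta('after $a = null', $baseline);
$stillDefined = array_key_exists('a', get_defined_vars());

unset($a);                           // the name is removed as well
reportDelta('after unset($a)', $baseline);
$definedAfterUnset = array_key_exists('a', get_defined_vars());

var_dump($stillDefined, $definedAfterUnset);
```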

The PHP documentation says that casting to null removes the variable and its value; however, this script shows that this is not the case, which is actually a bug (in the documentation).

Why assign null if you can use unset()?
An assignment changes the value of the variable, so if the new value requires less memory, the difference is freed right away, although this takes some computing resources.
unset() frees the memory allocated for both the variable name and its value.
unset() and assigning null behave differently when there are references to the variable. unset() removes only the one reference (name), while assigning null changes the value that all the referencing names point to, so all of those variables will then have the value null.
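
A minimal sketch of that difference, with variable names chosen only for illustration:

```php
<?php
// Case 1: unset() on one of two names bound to the same value.
$a = 'shared value';
$b = &$a;
unset($b);              // removes only the name $b
$aAfterUnset = $a;      // $a still holds 'shared value'

// Case 2: assigning null through one of the two names.
$c = 'shared value';
$d = &$c;
$d = null;              // overwrites the value both names point to
$cAfterNull = $c;       // $c is now null as well

var_dump($aAfterUnset); // string(12) "shared value"
var_dump($cAfterNull);  // NULL
```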

Annotation:
There is a misconception that unset() is a function. It is not: unset() is a language construct (like if), as the documentation confirms, so it cannot be called through a variable holding its name:

$unset_func_name = 'unset';
$unset_func_name($some_var); // fatal error: call to undefined function unset()
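
This can be verified without triggering the fatal error. If a callable is really needed, one option is to wrap the construct in an ordinary function; removeKey() below is a hypothetical helper written for this sketch.

```php
<?php
var_dump(function_exists('unset'));   // bool(false)
var_dump(is_callable('unset'));       // bool(false)

// Hypothetical wrapper: an ordinary function that uses the unset construct.
function removeKey(array &$data, $key) {
    unset($data[$key]);
}

$config = array('debug' => true, 'cache' => false);
$remover = 'removeKey';
$remover($config, 'debug');           // a variable function call works here

var_dump(array_keys($config));        // only "cache" is left
```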

Here is some more information (obtained by changing the example above):

$a = array();
allocates 164 bytes; unset($a) returns all of it.

class A { }
$a = new A();

allocates 184 bytes; unset($a) returns all of it.

$a = new stdClass();
allocates 272 bytes, but after unset($a) 88 bytes are not returned.

So far these examples are not critical in terms of memory usage, since only single string and numeric values are stored and processed. Things get much worse when arrays are used (objects also have a number of peculiarities).

Arrays

Arrays in PHP take up a fair amount of memory, and it is arrays that usually hold large volumes of data for processing, so you have to be very careful when working with them. At the same time, PHP applies its own "optimization magic" to arrays, and those points are worth mentioning.

Insidious example 1

<?php
include('func.php');
echo "Array memory usage example.";
$base_memory_usage = memory_get_usage();
$base_memory_usage = memory_get_usage();

echo 'Base usage.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

$a = array(someBigValue(), someBigValue(), someBigValue(), someBigValue());

echo 'Array is set.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

foreach ($a as $k=>$v) {
$a[$k] = someBigValue();
unset($k, $v);
echo 'In FOREACH cycle.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);
}

echo 'Usage right after FOREACH.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

unset($a);
echo 'Array unset.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);
?>

At first glance it might seem that the memory usage of the array $a will not change (apart from the variables $k and $v being set), but PHP takes a special approach to arrays in this case.

Look at the output:
Array memory usage example.Base usage.
Bytes diff: 0
Array is set.
Bytes diff: 61940
In FOREACH cycle.
Bytes diff: 77632
In FOREACH cycle.
Bytes diff: 93032
In FOREACH cycle.
Bytes diff: 108432
In FOREACH cycle.
Bytes diff: 123832
Usage right after FOREACH.
Bytes diff: 61940
Array unset.
Bytes diff: 0

It turns out that by the last iteration of the foreach loop the memory usage of the array has doubled, although this is not obvious from the code. Immediately after the loop, the memory usage returns to its previous value. The reason is the way PHP optimizes array use in a loop. When you try to change the source array while the loop is running, a copy of the array structure (but not of the values) is created implicitly; it becomes the array you see when the loop completes, and the source structure is then removed. Thus, in the example above, new values assigned to the source array do not replace the old ones immediately: separate memory is allocated for them, and the memory of the old values is returned only on exit from the loop.
This is easy to miss, and it can lead to significant memory usage while looping over large arrays, such as a result set fetched from a database.

Annotation:
Inside the loop, once $a[$k] has been changed, the value still stored in the source array can no longer be reached unless it was kept in $v. Reading $a[$k] again returns the new value.

It is important to note that the memory for the temporary copy of the array structure is allocated all at once, on the first change, while memory for the values is allocated separately for each changed element. Thus, if the array has many elements (not necessarily with large values), the one-time memory usage will be substantial.
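
The fact that a by-value foreach keeps working with the array as it was when the loop started can also be seen without measuring memory. The sketch below uses a small made-up array and shows the behaviour of current PHP versions (7 and later): a write to an element that has not been visited yet does not change what the loop hands out in $v.

```php
<?php
$a = array(1, 2, 3);
$seen = array();

foreach ($a as $k => $v) {
    if ($k === 0) {
        $a[2] = 300;    // overwrite an element the loop has not reached yet
    }
    $seen[] = $v;       // $v comes from the array as it was at loop start
}

print_r($seen);         // 1, 2, 3: the loop never saw 300
print_r($a);            // 1, 2, 300: the write itself was not lost
```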

Insidious example 2
The code is slightly changed.

echo 'Array is set.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);
$b = &$a; // add this line
foreach ($a as $k=>$v) {
$a[$k] = someBigValue();
unset($k, $v);
echo 'In FOREACH cycle.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);
}
unset($b);
echo 'Usage right after FOREACH.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

The code of the loop was not changed; only the reference counter of the source array was changed, but this fundamentally changed how the loop works:
Bytes diff: 0
Array is set.
Bytes diff: 61940
In FOREACH cycle.
Bytes diff: 61988
In FOREACH cycle.
Bytes diff: 61988
In FOREACH cycle.
Bytes diff: 61988
In FOREACH cycle.
Bytes diff: 61988
Usage right after FOREACH.
Bytes diff: 61940
Array unset.
Bytes diff: 0

The difference is small: 61988 - 61940 = 48 bytes for storing the reference variable $b.
If the array used in the loop has more than one reference to it, the optimization from example 1 is not applied.
Exactly the same result is obtained if $b is used for the loop instead of $a. Taking $v by reference in the loop has a similar effect:

echo 'Array is set.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

foreach ($a as $k=>&$v) {
$a[$k] = someBigValue(); // or $v = someBigValue();
unset($k, $v);
echo 'In FOREACH cycle.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);
}

echo 'Usage right after FOREACH.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

The result:
Bytes diff: 0
Array is set.
Bytes diff: 61940
In FOREACH cycle.
Bytes diff: 61940
In FOREACH cycle.
Bytes diff: 61940
In FOREACH cycle.
Bytes diff: 61940
In FOREACH cycle.
Bytes diff: 61940
Usage right after FOREACH.
Bytes diff: 61940
Array unset.
Bytes diff: 0

It is also worth noting that taking $v by reference does not increase the reference counter of the source array, and yet it also disables the optimization.
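
For completeness, here is a minimal sketch of a by-reference loop that changes the array in place. The names are illustrative only. The unset() after the loop matters: $price stays bound to the last element until the reference is broken.

```php
<?php
$prices = array(10, 20, 30);

foreach ($prices as &$price) {
    $price *= 2;        // writes straight into $prices
}
unset($price);          // break the reference left over from the last iteration

$price = 0;             // without the unset() above this would overwrite $prices[2]

print_r($prices);       // 20, 40, 60
```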

Passing by reference or passing by copying

Consider the case where you need to pass a large value to a method or a function (or return one from it). The first obvious solution is passing / returning by reference.
However, the PHP documentation says: do not use return by reference to improve performance. The PHP core takes care of the optimization itself.

Let us try to understand what kind of optimization this is.

Here is a simple example (so far without passing arguments):

$a = someBigValue();
$b = $a;

echo "String value setted";
memoryUsage(memory_get_usage(), $base_memory_usage);

unset($a, $b);
...

Following straightforward logic, two blocks of memory should be allocated, one for each variable's value. However, PHP optimizes this:
Start
Bytes diff: 0
String value setted
Bytes diff: 15496
String value unsetted
Bytes diff: 0

In this case, 15448 bytes are taken by the variable $a, and the remaining 48 bytes are allocated for the variable $b, although the two are not connected by reference. This memory usage stays the same until one of the variables is changed in some way:

$a = someBigValue();
$b = $a;
$b = strval($b);

echo "String value setted";
memoryUsage(memory_get_usage(), $base_memory_usage);

unset($a, $b);

As a result we will get:
Bytes diff: 0
String value setted
Bytes diff: 30896
String value unsetted
Bytes diff: 0

As we can see, an attempt to touch the value of $b makes the script allocate separate memory for it. The same thing happens if we touch the value of $a.
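
The two steps can be measured separately with a sketch like this one. It does not use func.php, and the variable names are chosen for this example. The printed numbers depend on the PHP version and build, so none are quoted here; the point is only to compare the cost of the assignment with the cost of the first write.

```php
<?php
// Declare every variable up front so that creating them later does not
// distort the measurements.
$a = $b = null;
$before = $afterAssign = $afterWrite = 0;

$a = str_repeat('SOME BIG STRING', 1024);

$before = memory_get_usage();
$b = $a;                            // plain assignment: the value is shared
$afterAssign = memory_get_usage();
$b .= '!';                          // first write to $b: it gets its own copy
$afterWrite = memory_get_usage();

printf("assignment cost:  %d bytes\n", $afterAssign - $before);
printf("first write cost: %d bytes\n", $afterWrite - $afterAssign);
```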

This optimization works both for standalone values and for individual elements of an array.
To understand this better, look at the example below:

$a = array(someBigValue(), someBigValue()); // 31052 bytes
$b = $a; // + 48 bytes = 31100 bytes
$b[0] = someBigValue();

echo "String value setted";
memoryUsage(memory_get_usage(), $base_memory_usage);

unset($a, $b);

This example will output:
Bytes diff: 0
String value setted
Bytes diff: 46704
String value unsetted
Bytes diff: 0

New memory (46704 - 31100 = 15604 bytes) has been allocated only for a copy of the value of element zero, not for the whole array $b. The value of $b[1] is still shared with $a[1].
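
From the script's point of view the sharing is invisible: after the write, $a and $b simply behave as two independent arrays. A minimal sketch with short made-up values:

```php
<?php
$a = array('first', 'second');
$b = $a;                    // no data is copied yet
$b[0] = 'replaced';         // only now does $b diverge from $a

var_dump($a[0]);            // string(5) "first"
var_dump($b[0]);            // string(8) "replaced"
var_dump($a[1] === $b[1]);  // bool(true)
```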

Everything described above works the same way for passing values into, and returning them from, functions and methods: the copy is "optimized". If you do not "touch" the passed value inside the method, no separate memory is allocated for it (memory is allocated only for the variable name, to bind it to the value). If you pass by value and change the value inside the method, a complete copy of that value is created just before the change is made.

Thus, PHP really does remove the need to pass by reference in order to save memory. Passing by reference is of practical use only when the source value has to be changed and the changes must be visible outside the method.

Here is some code as an example:

<?php
include('func.php');

function testUsageInside($big_value, $base_memory_usage) {
echo 'Usage inside function then $big_value NOT changed.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

$big_value[0] = someBigValue();
echo 'Usage inside function then $big_value[0] changed.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

$big_value[1] = someBigValue();
echo 'Usage inside function then also $big_value[1] changed.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

}

echo "Array memory usage example.";
$base_memory_usage = memory_get_usage();
$base_memory_usage = memory_get_usage();

echo 'Base usage.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

$a = array(someBigValue(), someBigValue(), someBigValue(), someBigValue());

echo 'Array is set.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

testUsageInside($a, $base_memory_usage);

echo 'Usage right after function call.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);

unset($a);
echo 'Array unset.'.PHP_EOL;
memoryUsage(memory_get_usage(), $base_memory_usage);
?>

Output:
Array memory usage example.
Base usage.
Bytes diff: 0
Array is set.
Bytes diff: 61940
Usage inside function then $big_value NOT changed.
Bytes diff: 61940
Usage inside function then $big_value[0] changed.
Bytes diff: 77632
Usage inside function then also $big_value[1] changed.
Bytes diff: 93032
Usage right after function call.
Bytes diff: 61940
Array unset.
Bytes diff: 0

As you can see, no copy of the array was created inside the function, even though the value is passed by value. Even a partial modification of the passed array did not create a full copy; memory was allocated only for the new values.

These two values deserve attention, purely for educational purposes:
Array is set.
Bytes diff: 61940
Usage inside function then $big_value NOT changed.
Bytes diff: 61940

The memory usage did not increase when control was transferred to the function, although a new variable $big_value appeared. This is because, while parsing the script, the interpreter determined that this function would be used in the code and pre-allocated memory for the names of its parameters (if a function is not used, the interpreter ignores it and allocates no memory for it). And since passing by value is "optimized", the already existing name $big_value was simply linked implicitly to the large array $a. As a result, the value was passed into the function by value without spending a single extra byte.
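
The practical rule (pass by value unless the caller must see the change) can be sketched as follows. Both functions are hypothetical and exist only for this illustration.

```php
<?php
// Receives the array by value: the change stays inside the function.
function replaceFirstByValue(array $items) {
    $items[0] = 'changed';
    return $items;
}

// Receives the array by reference: the caller's array is modified.
function replaceFirstByReference(array &$items) {
    $items[0] = 'changed';
}

$original = array('a', 'b');

$returned = replaceFirstByValue($original);
$originalAfterValueCall = $original;    // still 'a', 'b'

replaceFirstByReference($original);     // $original is now 'changed', 'b'

print_r($originalAfterValueCall);
print_r($returned);
print_r($original);
```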

Annotation:
In PHP5 (unlike PHP4) all objects are passed by reference by default, although it is not a full-fledged reference. See this article.
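
Why this is not a full-fledged reference can be shown with a short sketch. The Counter class and both functions are invented for the example. A change to the object's property is visible to the caller, but rebinding the parameter to a new object is not.

```php
<?php
class Counter {
    public $value = 0;
}

// Modifies the object that the caller also holds.
function increment(Counter $c) {
    $c->value++;
}

// Rebinds only the local parameter; the caller's variable is untouched.
function replaceCounter(Counter $c) {
    $c = new Counter();
    $c->value = 100;
}

$counter = new Counter();
increment($counter);        // the caller sees value 1
replaceCounter($counter);   // the caller still sees value 1

var_dump($counter->value);  // int(1)
```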

A short summary


No doubt these examples are only "a drop in the ocean" when it comes to optimizing memory usage in PHP, but they cover the most common cases, where it makes sense to think about which code to choose in order to reduce memory usage and save yourself a lot of headaches.

Here are some more useful links:

nikic.github.com/2011/12/12/How-big-are-PHP-arrays-really-Hint-BIG.html
nikic.github.com/2011/11/11/PHP-Internals-When-does-foreach-copy.html
blog.golemon.com/2007/01/youre-being-lied-to.html
hengrui-li.blogspot.com/2011/08/php-copy-on-write-how-php-manages.html
sldn.softlayer.com/blog/dmcaloon/PHP-Memory-Management-Foreach
blog.preinheimer.com/index.php?/archives/354-Memory-usage-in-PHP.html
derickrethans.nl/talks/phparch-php-variables-article.pdf

UPD
One important point was not covered in the main part of the article.
If a reference has been created to a variable, then when that variable is passed to a function as an argument it is copied immediately; that is, the copy-on-write optimization is not used.

Here is an example:

<?php
include('func.php');
function testFunc($a, $base_memory_usage) {
memoryUsage(memory_get_usage(), $base_memory_usage);
}
$base_memory_usage = 0;
$base_memory_usage = memory_get_usage();
memoryUsage(memory_get_usage(), $base_memory_usage); // 0 bytes
$a = someBigValue();
$b = &$a;
memoryUsage(memory_get_usage(), $base_memory_usage); // 15496 bytes
testFunc($a, $base_memory_usage); // 30896 bytes
memoryUsage(memory_get_usage(), $base_memory_usage); // 15496 bytes
unset($a, $b);
memoryUsage(memory_get_usage(), $base_memory_usage); // 0 bytes
?>
BumBum 9 february 2012, 14:16

Comments

0 ZachSmith June 26, 2013, 18:32
thanks for the interesting article!
i will be back for more articles like this :)
