PHP
Raiting:
17

Finding and fixing bugs in PHP sources


image
Honestly I warn: take this text with a certain amount of skepticism. I just recently started to get acquainted with the internals of PHP, but I would like to tell you about what is happening behind the scenes bug # 75237 .
Bug detection
It all started with finding lines of code to cover at the Chicago PHP UG , which was part of the PHP TestFest 2017 .
I found an uncovered line using the gcov for ext / json / json_encoder. c. I tried writing a PHP code that would handle this line .
image
Tried and so and so, but nothing worked, and suddenly I through this code inadvertently initiated segfault :
<?php
class Foo implements JsonSerializable {
public function jsonSerialize() {
return new self;
}
}

# Segfault!
var_dump(json_encode(new Foo));
But in PHP there should never be a segfault. The main PHP code should throw an exception and provide a nice error message for the user space before the segfault happens.
To find out which versions of PHP are affected by this bug, I created snippet 3v4l and saw that all actively supported versions generate segfault.
This happens only when jsonSerialize () returns a new instance of itself. If I return the same instance using $ this, the code works as it should.
<?php
class Foo implements JsonSerializable {
public function jsonSerialize() {
return $this;
}
}

# Works fine
var_dump(json_encode(new Foo));
Sending a bug report
All bugs in PHP should be reported via bugs.php.net . Although I tried to patch the bug myself, I still had to create a report, which can be referenced in the patch.
So I filled out the form - and a bug appeared # 75237 .
Writing a patch
Now the most difficult. How to start patching this bug?
Updating php-src and recompiling
First I got all the previous changes from the main Git repository to make sure that I'm working with the latest version of php-src. I remotely calledupstream, pointing to the main php-src repository on GitHub .
$ git fetch upstream
$ git checkout master
$ git rebase upstream/master master
Then he created a new branch master for the patch and named it in honor of the ID bug.
$ git checkout -b bug75237 Now you need to recompile from the source code. I will not go into details, I have another article, which describes in detail, how to compile PHP from the source code .
When I recompile PHP, I run a bash script called config, which I store outside the main folder of php-src. It deletes all existing compiled binaries and other files not under version control, collects the configuration script and starts the configuration with the necessary flags included by default.
Here's my script, if you want to do something like this for yourself:
#!/bin/sh

make distclean
./vcsclean
./buildconf
./configure --enable-maintainer-zts \
--enable-debug \
--enable-cli \
--with-zlib \
--enable-mbstring \
--with-openssl=/usr/local/opt/openssl \
--with-libxml-dir=/usr/local/opt/libxml2 \
--with-sodium \
"[email protected]"
Why are there such strange ways? I'm working on a Mac with some dependencies installed with Homebrew , so here and there for some extensions the paths lead to / usr / local / opt.
So I just ran my configuration script, and then compiled everything using make.
$ ../config && make -j9 At sapi / cli / php, a binary was created, after which I checked the PHP version and ran the code leading to segfault to make sure that the master branch of php-src still still contains a bug.
$ sapi/cli/php --version
$ sapi/cli/php json_encode.php
Yes! Again segfault. Now we use the debugger to go through the C-code, which is executed line by line by our PHP-script.
Running GDB
To debug the code, I used GDB . It is designed for compiled languages ​​like C and does not work with regular PHP files. But PHP is written in C, so with gdb --args you can get GDB to execute a compiled binary with PHP script.
$ gdb --args sapi/cli/php json_encode.php Running GDB, I set a breakpoint to pause the program, and then went through each line of C-code. I told you about the details in the above video, but in short: it turned out that the whole thing is in the function <php_json_encode_serializable_object () </code>, so I installed a control point here and ran the program with the help of the run.
> break php_json_encode_serializable_object
> run
When GDB reaches the control point and sets the code execution to pause, you can write:

n and jump to the next line,
s and go to the scope / frame of the next line.
c and continue execution until the next checkpoint.

After entering the commands, I realized that the code enters an infinite recursion betweenphp_json_encode_zval () and php_json_encode_serializable_object () . This leads to stack overflow and segfault, because the program runs out of memory.
I was interested in expressionif inside php_json_encode_serializable_object () .
if ((Z_TYPE(retval) == IS_OBJECT) &&
(Z_OBJ(retval) == Z_OBJ_P(val))) {
//
} else {
//
}
There, the case was handled when the returnedSsonialize () value is the same instance of it, but no one caught the situation when a new instance was returned.
I will mention again that the details are described in the video, but here I will give the final part of the code, which I added for throwing an exception, when jsonSerialize () returns a new instance of itself.
if ((Z_TYPE(retval) == IS_OBJECT) &&
// Make sure the objects are not the same instance
(Z_OBJ(retval) != Z_OBJ_P(val)) &&
// Check that the class names of the objects are the same
zend_string_equals(ce->name, Z_OBJCE(retval)->name)) {
// Throw an exception
zend_throw_exception_ex(NULL, 0, "%s::jsonSerialize() cannot return a new instance of itself", ZSTR_VAL(ce->name));

// Garbage collection
zval_ptr_dtor(&retval);
zval_ptr_dtor(&fname);
PHP_JSON_HASH_APPLY_PROTECTION_DEC(myht);

// Return fail
return FAILURE;
} else if ((Z_TYPE(retval) == IS_OBJECT) &&
(Z_OBJ(retval) == Z_OBJ_P(val))) {
//
} else {
//
}
I did a lot of trial and error to come to this option, and I was very helped by the phpinternalsbook.com site, especially when working with zval .
Update: Sarah Golmont , a long time being a key PHP developer and one of the PHP 7.2 release managers, gave a wonderful comment to my patch :" If you just try to match a class, then you can only compare ce. So it will be more correct than comparing names, and (nominally) more effectively. "
Thanks to Sarah's suggestion, we can change the string with zend_string_equals () in the expression:
if ((Z_TYPE(retval) == IS_OBJECT) &&
(Z_OBJ(retval) != Z_OBJ_P(val)) &&
// This line changes
ce == Z_OBJCE(retval)) {
// Same as above
} else if ((Z_TYPE(retval) == IS_OBJECT) &&
(Z_OBJ(retval) == Z_OBJ_P(val))) {
//
} else {
//
}
Testing
Before applying the patch, you need to write a test that confirms that the error is actually fixed.
Again, without details, but if you need more information, then I created a six-part guide to writing tests for the PHP source code .
$ vi ext/json/tests/bug75237.phpt
--TEST--
Bug #75237 (jsonSerialize() - Returning new instance of self causes segfault)
--SKIPIF--
<?php if (!extension_loaded("json")) die("skip ext/json required"); ?>
--FILE--
<?php
class Foo implements JsonSerializable {
public function jsonSerialize() {
return new self;
}
}

try {
var_dump(json_encode(new Foo));
} catch (Exception $e) {
echo $e->getMessage();
}
?>
--EXPECT--
Foo::jsonSerialize() cannot return a new instance of itself
I ran the test and checked it, and then ran all the tests inext / json to make sure I did not break anything.
$ make test TESTS=ext/json/tests/bug75237.phpt
$ make test TESTS=ext/json/tests/
All tests were completed, so I was ready to apply the patch.
Sending PR and updating the bug
Then I filled the patch in my fork on GitHub ...
$ git add ext/json/json_encoder.c ext/json/tests/bug75237.phpt
$ git commit -m "Fix bug 75237 - (jsonSerialize() - Returning new instance of self causes segfault)"
$ git push origin
... and created PR in the main php-src repository.
Finally, I returned to my error report and updated it with a link to the newly created PR.
Waiting for response and making changes
After sending PR a little time passed and I received the response from several employees working on PHP. One of them helped to understand that I incorrectly compare class names (I fixed it in the video and pieces of code in the article, but left unchanged the original PR, so you can analyze the first option).
Eventually Niki P closed PR , because we identified a deeper, more general problem, related to PHP. This language has no general protection against stack overflow, which means that the segfault can be created in many other contexts for .
Here's how Sarah expressed this:
I present: My average day. pic.twitter.com/RdvutMbH58- General Pollita (@SaraMG) September 15, 2017
Have fun!
Although my patch was not merged with the code base, I still consider it successful, because in the process I not only learned a lot, but also drew the developers' attention to the main problem with stack overflow, which can contribute to its solution.
I hope you liked the story of my adventure with a bug. I wish good luck in finding and fixing your own bugs in PHP source code!
Papay 20 november 2017, 13:25
Vote for this post
Bring it to the Main Page
 

Comments

Leave a Reply

B
I
U
S
Help
Avaible tags
  • <b>...</b>highlighting important text on the page in bold
  • <i>..</i>highlighting important text on the page in italic
  • <u>...</u>allocated with tag <u> text shownas underlined
  • <s>...</s>allocated with tag <s> text shown as strikethrough
  • <sup>...</sup>, <sub>...</sub>text in the tag <sup> appears as a superscript, <sub> - subscript
  • <blockquote>...</blockquote>For  highlight citation, use the tag <blockquote>
  • <code lang="lang">...</code>highlighting the program code (supported by bash, cpp, cs, css, xml, html, java, javascript, lisp, lua, php, perl, python, ruby, sql, scala, text)
  • <a href="http://...">...</a>link, specify the desired Internet address in the href attribute
  • <img src="http://..." alt="text" />specify the full path of image in the src attribute