Why does Windows arrange invisible desktop icons for about 20 seconds?
“What's wrong with the computer - it has an nvme drive installed, but opening the explorer, if you haven't done it for a long time, takes about 10 seconds, opening the .zip file on the desktop takes about a minute, and when you press the“ Start ”key, you need to wait about 20 seconds? "
Around the end of January, I was shown a Twitter post in which a Windows user with a powerful machine talks about random freezes in Explorer. Many unscientific theories have been proposed. I don’t usually do the analysis of performance problems with strangers, but this case seemed interesting, so I decided to study it.
Freya sent me ETW tracing what is happening with her car and I investigated it with the help of Windows Performance Analyzer (WPA) ... The first thing I noticed is that the UI Delays graph shows, as you might have guessed, that the explorer.exe thread 7888 was unable to check messages for 20.531 seconds. It is frozen.
Explorer.exe has a lot of UI threads, so this does not mean that the whole process freezes, however, one of its windows definitely freezes and causes freezes all over the place, which is bad.
If a thread is unable to pump messages, it means that it is either busy with something else (consuming CPU resources), or waiting for something else (the CPU is idle). After examining the 20.531 second MsgCheck Delay time period in more detail, I checked the CPU Usage (Precise) data (taken from the context switch toolkit) and saw that thread 9228 was running 99.2% of the time - it was consuming a lot of CPU resources.
The next step was to find out what was going on. The CPU Usage (Sampled) data (from the profiler with 1 kHz sampling) told me that stream 9228 spends roughly 99.7% of its time (26994 out of 27074 samples) in the BatchPositionChangesHelper destructor (line 21) and its child tasks (lines 23- 25). This is a very expensive destructor.
I don’t have access to this source code, but I did a little research on the stacks and they hint that explorer.exe was spending more than 20 seconds doing a lot of tasks related to ... aligning the position of icons.
Yeah, that's rightArranging icons on the desktop is a fairly simple task. You just need to align them into columns, move to the next column and exit when the screens are full. Therefore, 20 seconds to line up the icons did not seem like a realistic time to me, and I assumed that the root cause could be some strange shell extension or other third-party software, but then I tried to reproduce the bug in the simplest way. I figured I would just create a thousand copies of a tiny .jpg on my Desktop and see if explorer.exe was slow. This experiment was too dumb to be sufficient, but nonetheless:
src = os.path.join(script_path, ‘SunsetWhales.jpg’)I ran this simple script with a file_count of 1000 and suddenly explorer.exe started to slow down insanely for over twenty seconds. It really turned out to be that simple.
dst = os.path.join(desktop_path, ‘TestFiles%04d.jpg’)
for i in range(file_count):
shutil.copy(src, dst % i)
But why?Computers are very fast today. Freya's processor is clocked at 4.6 GHz and has approximately 950 GIF files on her desktop. In 20 seconds, its CPU must execute 92 billion clock cycles, or 97 million clock cycles per image. And that's a lot.
I assumed that the problem was related to an observation I called Dawson's first law of computation for myself: O (n ^ 2) - optimal time for poorly scalable algorithms - fast enough to get into production, but slow enough to break everything after it gets there.
That is, the most likely explanation for why the icons take so long to arrange is that the icon reordering code uses O (n ^ 2) (aka quadratic ) algorithm, that is, when the number of icons is doubled, the time for their alignment quadruples. With this performance scaling, an algorithm that works well with ten elements can slow down with just a thousand elements.
Good theory, but how do you prove it?
Scientifically!I started by writing a script that will fill my Desktop with a given number of images. I ran it over and over with more and more images and recorded an ETW trace so I could measure the performance. I also tracked explorer.exe using the Task Manager so that I can figure out when it has finished one task and is ready for the next.
My first test gave chaotic results, they looked like non-linear growth, but any attempts to match with the line would seem more like faith than fit with the data. To test my theory better, I needed to understand what was going on.
Studying the traces, I realized that the BatchPositionChangesHelper destructor was executing most of the time (blue area), but not all the time the explorer was running (green area):
I realized that, among other things, the alignment work was interrupted by the mapping work, and then I understood the reason for this variability.
When my Python script started creating images, the explorer.exe process noticed this and immediately started trying to place icons. In the process of creating images, he could do this several times, which gave unpredictable results. It was a race situation that caused the consistency of total costs to be disrupted. Since I didn’t have access to the source code for explorer.exe, I needed to figure out a way to make it wait for all the images to finish generating, and then build it. I implemented this using the psutil library to pause the explorer.exe process while the images were being generated. Then, when I resumed the process, it did all the work. The code looks something like this:
explorer = p = psutil.Process(explorer_pid)After doing this, I ran your test batch file while recording the ETW trace. To minimize noise and trace size, I disabled context switch call stacks (optional) and disabled indexing of the Desktop folder. I monitored the CPU utilization of the explorer.exe process using the Task Manager and hit Enter to go to the next test when the load dropped to zero. This is how I got a very nice graph of the CPU usage of the explorer.exe process:
The individual blocks represent the CPU load for 100, 200, 300, and so on images up to 1000. If you are careful, you will notice that the CPU load increases faster than linearly, but slower than quadratic. That is, the initial data suggests that the alignment algorithm is not-quite-O (n ^ 2).
However, explorer isn't just about building. If some of the problems are O (n), that is, linear, then they will weaken the impact of the O (n ^ 2) problems. As n increases, O (n ^ 2) tasks will gradually dominate, but I didn't want my test program to run even longer than the 160 seconds it requires.
IsolationSo the next task will be to isolate the time spent on the BatchPositionChangesHelper destructor. In my test trace it accounted for 78.4% of the time spent in explorer.exe and 92.3% of the time spent on a busy thread, and if I can prove it was quadratic then I can prove that as n increases, it will be dominate forever.
To do this, I looked at the CPU Usage (Sampled) data and filtered it so that samples are displayed only in the BatchPositionChangesHelper destructor and its child tasks. Then I looked at ten different areas of the graph and plotted the number of samples. The curve is so smooth it looks fake, but this is the real data.
If you look at the key points of the graph, for example, where the number of images is 500 and then 1000, you can see that performance scaling is slightly worse than O (n ^ 2). That is, it takes more than four times the time to line up 1000 icons than it takes to line up 500 icons.
Decisive blowI usually have few icons on my desktop, so I am practically immune to this bug. However, I have seen people whose desktop is completely filled with icons. They probably show it, but in a less serious form.
Freya used her Desktop to store GIF files. She treated him like a folder (which he actually is), in which you can conveniently store images. She rarely used desktop icons. So when the number of icons became overwhelming over time, she decided to uncheck the "Show desktop icons" checkbox so that there was less confusion. The icons disappeared and she could continue to save images to this folder.
Freezes experienced by her, in which explorer spent more than 20 seconds lining up icons on the Desktop, spending 92 billion CPU cycles on correct placement of icons occurred ... when the icons were hidden.
This is a new level of awesomeness.
Gridding the icons was supposed to be a linear operation initially, but somehow it was written quadratic and it was done regardless of whether the icons were shown or not.
That's all. If you are writing code that others will run, make sure it scales well enough to handle any possible amount of data, no matter how sane it is. Quadratic algorithms usually fail this check.
TrapsIt seems that this bug is related to a change in the use of multiple monitors (as I was told, this is often the case with streamers), so for a while I conducted tests by connecting and disconnecting my external monitor. This is not good for effective testing, and it looks like I messed up the external monitor connector on my personal laptop. He no longer sees the external monitor. Oops.
Yes, with symbolsWhen I analyzed Freya's trace, I just loaded it into Windows Performance Analyzer (WPA) and waited. I didn't need to check the version of Windows it was running on and know what patches it had installed. WPA just looked at the debug information of all EXE and PE files and downloaded the symbol files from Microsoft's symbol servers (and from Chrome servers, because I allowed it). Symbol servers are good. If you are on Windows, enable the use of symbol servers. If you are not on Windows, then I strongly sympathize with you (it is a pity that debugging and profiling, especially problems on other users' machines, is much more difficult for you).
Reporting a problemI don’t know how many people this bug affected (those with 200-300 icons, it doesn’t affect it quite well, and with a larger number the situation gradually becomes worse), and I have no way to fix it. therefore I reported it ... I do not have much hope that it will be corrected. Under my last the quadratic time bug in Windows zero comments even though I posted it months ago.
Raw measurements from my tests saved here , and the tests themselves are located on github ... This bug is extremely easy to reproduce. If anyone wants to create a Feedback Hub entry, they need to do so. I recommend using the UIforETW Browse Folder option when the Desktop hangs - the operation will be blocked for the entire hang time.
Lessons for software developersI have interviewed many times in my career. Often I was asked to come up with an algorithm that would perform some kind of artificial task. The obvious "brute force" algorithm usually turns out to be quadratic (O (n ^ 2)) or, less often, exponential (O (2 ^ n)). Most often, this leads to discussion of the following topics:
Why Quadratic and Exponential Algorithms Are Unacceptably Slow for Most Real World Problems
How to improve the algorithm to be O (n log n) or better.
Despite realizing this problem, we, the developers, continue to write quadratic code. Code that's fast enough to get into production, but slow enough to break everything when it gets there ... See, for example, this , this , this , this and this articles. We need to stop doing this.
|Vote for this post
Bring it to the Main Page