Algorithms
Raiting:
7

# How to simulate a large number of interacting particles

Let us consider a situation when you need to handle collisions between some objects. What do you do in this case? Probably, the easiest solution would be to check interaction between these objects. This is the right decision, and everything will be fine until you get a lot of objects. As soon as their number gets up to a few thousand, you will notice that everything works much slower. What if you get thousands of the particles? Then everything stops. Here things get much more interesting. What tricks and optimizations would you use to solve this problem?

Let us consider 2D case to make it simple, the particles are round, and the radius of the particles is the same for all of them.

1. Overview of algorithms
1.1. Exhaustive search
1.2. Sweep & Prune
1.3. Regular grid
2. Some optimizations
2.1. Sweep & Prune
2.2. Regular grid
3. Speed comparison
4. Application (software and source code)
5. Conclusion

## 1. Overview of algorithms

1.1. Exhaustive search

This is the simplest and the slowest of all possible ways. There is a check between all possible pairs of particles.

void Update()
{
for (int i = 0; i < NUM_P; ++i) {
for (int j = i + 1; j < NUM_P; ++j) {
Collision(&_particles[i], &_particles[j]);
}
}
}

Complexity: O(n ^ 2)

* It is very easy to understand and implement
* Not sensitive to different particle sizes
* It is the slowest one

1.2. Sweep & Prune

The essence of this algorithm is that we first sort all the particles on their left edge along the axis OX, and then we check each particle only with those that have left edge less than the right edge of the current particle.

I will try to describe the algorithm’s work in the images.
Initial state:

As you can see, the particles are not ordered.
After sorting, on the left edge of the particle (the position of X minus the particle radius), you get the following picture:

Now you can start searching. Check each particle with every other particle until the right edge of the first particle gets greater than the left edge of the second particle.

Let us consider the following example.
0 particle checks collision with 1 and 2:

1 with 2 and 3.
2 only with 3.
3 checks collision with 4, 5 and 6:

I think the point is made.

My implementation of this algorithm:

void Update()
{
qsort(_sap_particles, NUM_P, sizeof(Particle *), CompareParticles);

for (int i = 0; i < NUM_P; ++i) {
for (int j = i + 1; j < NUM_P && _sap_particles[i]->pos.x + RADIUS_P > _sap_particles[j]->pos.x - RADIUS_P; ++j) {
Collision(_sap_particles[i], _sap_particles[j]);
}
}
}
It should be noted that this algorithm is most effective in cases where the movement of particles is a little for each time, because most of the load takes the sorting, and it will be done faster if the particles are already partially sorted.

Also this algorithm can be used with more complex objects, and in this case, you should use them AABB (Axis-Aligned Bounding Box, the minimum bounding rectangle).

Complexity: O(n ^ 2) - in the worst case O(n * log(n)) – in the average case

* It is easy to implement
* Not sensitive to different particle sizes
* Good performance
* It requires a little time to understand the algorithm

1.3. Regular grid

As you probably guessed from the name, the essence of the algorithm comes to the fact that all space is divided into a regular grid of small squares which size is equal to the diameter of the particle. Each square (cell) of this grid is an array.
It may look like this:

const int _GRID_WIDTH = (int)(WIDTH / (RADIUS_P * 2.0f));
const int _GRID_HEIGHT = (int)(HEIGHT / (RADIUS_P * 2.0f));

std::vector<Particle *> _grid[_GRID_WIDTH][_GRID_HEIGHT];
The grid is cleaned and re-filled during iteration of the main loop. Filling algorithm is very simple: the cell index of the particle is calculated by dividing the two coordinates on the particle’s diameter and casting-out the fractional part . Example:

int x = (int)(_particles[i].pos.x / (RADIUS_P * 2.0f));
int y = (int)(_particles[i].pos.y / (RADIUS_P * 2.0f));

Thus, for each particle needs to be calculated the index and put it into a cell with this index (add an element in the array).
Now you just have to go through each cell and check out all of its particles with all particles of adjacent 8 cells.

You can do this like that:

void Update()
{
for (int i = 0; i < _GRID_WIDTH; ++i) {
for (int j = 0; j < _GRID_HEIGHT; ++j) {
_grid[i][j].clear();
}
}

for (int i = 0; i < MAX_P; ++i) {
int x = (int)(_particles[i].pos.x / (RADIUS_P * 2.0f));
int y = (int)(_particles[i].pos.y / (RADIUS_P * 2.0f));

_grid[x][y].push_back(&_particles[i]);
}

/ / Then many loops of checking with the adjacent cells
}
Complexity: O(n)

* It is the fastest of all
* It is easy to implement
* It is sensitive to the different particle sizes (requires modification)

## 2. Some optimizations

2.1. Sweep & Prune

This algorithm can be slightly improved by making the choice of which axis to sort particles. The optimal axis will be the one along which the largest number of particles.

The optimal axis OY:

The optimal axis OX:

2.2. Regular grid

Optimization 1:
Let us use more efficient representation of the grid:

const int _MAX_CELL_SIZE = 4;
const int _GRID_WIDTH = (int)(WIDTH / (RADIUS_P * 2.0f));
const int _GRID_HEIGHT = (int)(HEIGHT / (RADIUS_P * 2.0f));

int _cell_size[_GRID_WIDTH * _GRID_HEIGHT];
Particle *_grid[_GRID_WIDTH * _GRID_HEIGHT * _MAX_CELL_SIZE];

I think that 1D array is faster than 2D one, as well as the static array of the constant length is faster than dynamic std::vector. Also, I had to make another array, which will indicate how many particles are in each cell.
Another thing is that I put a limit on the number of particles per cell, and this is not good, because it threatens that under strong squeezing there will be need to write a much larger number of particles in the cell, and it is not possible, because some collisions are not processed. However, this problem could be avoided by proper handling of collisions.

Optimization 2:
We will only put changes into it that have occurred, instead of complete cleaning the array of grid during iteration of the main loop.
We go over all particles of each grid cell, we calculate a new index of the particle, and if it is not equal to current we remove the particle from the current cell and add it in the new cell that has the free space.
Removal is done by writing in its place last particle of the same cell.
Here is my code that implements it:

for (int i = 0; i < _GRID_WIDTH * _GRID_HEIGHT; ++i) {
for (int j = 0; j < _cell_size[i]; ++j) {
int x = (int)(_grid[i * _MAX_CELL_SIZE + j]->pos.x / (RADIUS_P * 2.0));
int y = (int)(_grid[i * _MAX_CELL_SIZE + j]->pos.y / (RADIUS_P * 2.0));

if (x < 0) {
x = 0;
}
if (y < 0) {
y = 0;
}
if (x >= _GRID_WIDTH) {
x = _GRID_WIDTH - 1;
}
if (y >= _GRID_HEIGHT) {
y = _GRID_HEIGHT - 1;
}

/ / If the index of the particles changed and the new cell has available space
if (x * _GRID_HEIGHT + y != i && _cell_size[x * _GRID_HEIGHT + y] < _MAX_CELL_SIZE) {
_grid[(x * _GRID_HEIGHT + y) * _MAX_CELL_SIZE + _cell_size[x * _GRID_HEIGHT + y]++] = _grid[i * _MAX_CELL_SIZE + j];
/ / Replace the old particle with the last one from the same cell
_ _grid[i * _MAX_CELL_SIZE + j] = _grid[i * _MAX_CELL_SIZE + --_cell_size[i]];
}
}
}

Optimization 3:
We check 8 adjacent cells, but it would be sufficient to check only 4 adjacent cells.
For example:

The rest of the cells will be checked either or later:

Optimization 4:
We can slightly speed up the algorithm’s work by increasing the locality of data in memory. This allows more frequent reading of data from the cache of the processor and accessing less RAM. But it should not be done at each iteration of the main loop, because it will kill all the performance, however, this operation could be done once per second, or once every few seconds.
In order to increase the locality we need to change the order of the particles in the main array, according to their location in the grid.
Namely, the first 4 particles will be from 0 cell, following 4 from 1 and so on.

## 3. Speed comparison

The number of particles Bruteforce (ms) Sweep & Prune (ms) Regular grid (ms)
1000 4 1 1
2000 14 1 1
5000 89 4 2
10000 355 10 4
20000 1438 28 7
30000 3200 55 11
100000 12737 140 23

The table shows that working time of the algorithms corresponds to their complexity, regular grid grows almost linearly:).

## 4. Application (software and source code)

particles_system.tar.gz

In the archive is a project written in CodeBlocks, as well as binaries for Linux, however, I have used a library ClanLib to create the window and graphics output, therefore, the project should compile without problems in Windows.

Program controlling:
1 – Bruteforce
2 – Sweep & Prune
3 – Regular grid
The left mouse button – move.

## 5. Conclusion

This article gives an overview of the most common algorithms that are used for handling of the collisions. However, this does not limit the scope of these algorithms, and they could be used for any other interaction. The main task that they handle, it is more convenient representation of the data to be able quickly to filter out uninteresting pairs in advance.
KlauS 14 february 2012, 16:38
 Vote for this post Bring it to the Main Page

B
I
U
S
Help
Avaible tags
• <b>...</b>highlighting important text on the page in bold
• <i>..</i>highlighting important text on the page in italic
• <u>...</u>allocated with tag <u> text shownas underlined
• <s>...</s>allocated with tag <s> text shown as strikethrough
• <sup>...</sup>, <sub>...</sub>text in the tag <sup> appears as a superscript, <sub> - subscript
• <blockquote>...</blockquote>For  highlight citation, use the tag <blockquote>
• <code lang="lang">...</code>highlighting the program code (supported by bash, cpp, cs, css, xml, html, java, javascript, lisp, lua, php, perl, python, ruby, sql, scala, text)
• <a href="http://...">...</a>link, specify the desired Internet address in the href attribute
• <img src="http://..." alt="text" />specify the full path of image in the src attribute