Finding the best place in the world for a windmill

The story of how NASA, ESA, the Technical University of Denmark, neural networks, decision trees and other good people helped me find the best free hectare in the Far East, as well as in Africa, South America and other "so-so" places.

Two years ago, or maybe even three, a program to hand out free hectares of land in the Russian Far East was announced. A quick look at the map made it clear that choosing the right hectare is not so simple, and that the best and most obvious plots near the cities would surely go to the locals, if they had not gone already. It was probably at that moment that I got the idea that the search for the best place could somehow be automated.

Pondering this further in a romantic mood, I thought there was no need to limit myself to the Far East. Right now the world is full of land that nobody needs, but that may change in about 50 years, when fossil fuels run out and people go looking for new sources of energy. So I started looking at renewable energy sources, and very quickly realized that the map of resources, and of the territories where this new energy can be extracted, will change a great deal. If you find such places now, you can buy them up in advance and get rich later. I figured this could easily be done over a couple of weekends... Looking back now, I understand that it took me about a year. I should note right away that at the time I knew little about energy, renewable sources, or machine learning. Below is a brief retelling of my year-long project.
Selecting the type of renewable energy source
Having settled on the idea, I quickly went to look up what renewable energy sources exist at all and which of them yields the most energy. Here is an incomplete list of the most common ones:

solar radiation (solar energy);
wind (wind power);
energy of rivers and watercourses (hydropower);
energy of tides;
wave energy;
geothermal energy;
scattered heat energy: heat of air, water, oceans, seas and reservoirs;
biomass energy.

But how do you determine which one is the best and will beat all the rest in the future? After reading a few more interesting articles from the magazines "Science and Life" and "Young Technician", I arrived at the LCOE (levelized cost of electricity) methodology. Its principle is simple: smart people try to estimate the total cost of a kilowatt-hour of energy, taking into account production, materials, maintenance, and so on. Below is a picture based on 2016 data with a projection to 2022. I took the more recent picture from here; below it is a boring table from this document.
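As a rough illustration of the LCOE principle, here is a minimal sketch, assuming the usual textbook definition: discounted lifetime costs divided by discounted lifetime energy. The function name, its parameters and the flat yearly figures are simplifications chosen for this example; the reports behind the pictures use far more detailed cost models.

```python
def lcoe(capex, annual_opex, annual_energy_kwh, years, discount_rate):
    """Levelized cost of electricity, in currency units per kWh.

    Assumes (for illustration only) that all capital is spent in year 0
    and that operating costs and energy output are the same every year.
    """
    discounted_costs = float(capex)  # year 0, not discounted
    discounted_energy = 0.0
    for year in range(1, years + 1):
        factor = (1.0 + discount_rate) ** year
        discounted_costs += annual_opex / factor
        discounted_energy += annual_energy_kwh / factor
    return discounted_costs / discounted_energy
```

For instance, lcoe(1_500_000, 40_000, 3_000_000, 20, 0.07) would price a hypothetical turbine with made-up costs. A higher discount rate penalizes sources whose costs are mostly upfront, which is one reason such rankings differ from country to country.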

I have a heap of such pictures for different countries, made by different organizations, and they all look about the same:

Geothermal energy comes first.
Hydroelectric power comes next, but that depends heavily on the country.
Wind is in third place.

I did not like geothermal and hydro, because in my opinion the places where that energy can be extracted can be counted on one's fingers. Wind and sun are another matter, since you can put them on almost every roof and balcony. Solar was more expensive, and three years ago the difference was more than 30 percent, so I chose wind.

By the way, in the middle of the project I started running into documents with similar reasoning from US government bodies, namely NREL, the US Department of Energy and others, which made forecasts and bets on various energy sources in order to understand how to modernize the country's energy system. For example, in one such document it all boiled down to a few scenarios: the share of wind energy will be either large or very large.
How I planned to pull it off
The idea of how to pull it off was pretty simple and looked like this:

Find the places where there are windmills around the world.
Collect information at these points:

a. Wind speed.
b. Direction.
c. Temperature.
d. Relief.
e. What local fishermen love for lunch.
f. Etc.

Feed this information to a machine learning model, which would train on it and find the patterns: which parameters most influence a person's choice of a construction site.

Give the trained model all the remaining points around the world, with the same information for each.

Get as output a list of points that are great for placing a windmill.

In graphic form, this plan, as it turned out later, resembled this well-known picture:

How it really went
The first stage was quite easy. I simply exported all the records for windmill points from OpenStreetMap.

By the way, I want to note that OSM is simply a treasure trove of information about objects all over the world with their coordinates; it has almost everything. So, a note for data lovers: OSM is the coolest Big Data source.

It was not very hard to do. First I tried using online tools (a very cool thing, by the way), but that did not work out because of limits on the number of points and slow performance on large volumes of data. So I had to figure out utilities that extract data locally from an OSM data snapshot. You can always download the current snapshot here. In compressed form it takes about 40 GB. Data can be extracted from it with queries using the Osmosis utility. As a result, I had a dataset of 140 thousand points around the world with coordinates, which I rendered as a heatmap. It looked like this:
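The filtering step can be sketched in a few lines of Python. This is not the Osmosis query used for the project, just a minimal illustration that assumes the data has already been cut down to an OSM XML extract and that wind turbines carry the common OSM tag generator:source=wind. The function name and the sample document are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Tiny hand-written OSM XML sample, for illustration only.
SAMPLE_OSM = b"""<osm>
<node id="1" lat="55.5" lon="12.5">
  <tag k="power" v="generator"/><tag k="generator:source" v="wind"/>
</node>
<node id="2" lat="1.0" lon="2.0">
  <tag k="power" v="generator"/><tag k="generator:source" v="solar"/>
</node>
<node id="3" lat="3.0" lon="4.0"/>
</osm>"""


def wind_turbine_nodes(osm_source):
    """Yield (lat, lon) for every node tagged generator:source=wind.

    osm_source is a path or a binary file object with OSM XML.
    """
    for _, elem in ET.iterparse(osm_source, events=("end",)):
        if elem.tag != "node":
            continue
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        if tags.get("generator:source") == "wind":
            yield float(elem.get("lat")), float(elem.get("lon"))
        elem.clear()  # drop the node's children once it has been handled


if __name__ == "__main__":
    for lat, lon in wind_turbine_nodes(io.BytesIO(SAMPLE_OSM)):
        print(lat, lon)
```

A streaming parser is used because OSM files are large, but for the full planet snapshot a dedicated tool such as Osmosis is still the practical choice.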


All the problems started at the second stage, since I did not really understand what information needed to be collected. So for a couple of days I went off to read about how windmills work and about recommendations for their placement, restrictions, and so on. I even have amusing diagrams left in my notes about placement, wind gradients, wind roses and other useful terms.

As a result, I ended up with this list of parameters that, in my opinion, matter when choosing a location:

Average annual wind speed (ideally 10-11 m/s).
Wind direction (the dominant direction of the wind rose).
Minimum wind speed.
Maximum wind speed.
Power density.
Average temperature.
Average humidity.
Average pressure.
Height above sea level.
Distance to water.
Elevation difference.
Smoothness of elevation changes.
Maximum elevation drop within a 5-10 km area.
Percentage of trees or plantings in the area (surface roughness).
Distance to the nearest settlement.
Distance to the nearest industrial site.
Average number of inhabitants per square.
Distance to transport routes (road, sea, air).
Distance to the power grid.
Visual and noise nuisance.
Protected areas: nature reserves and so on.
Big data
WIND. Just as 90% of all big data projects break down at the stage of "so, now let's look at this data of yours that you talked so much about", mine cracked too. Rushing off to search for wind speed data for Russia, I came across this:

And a dozen similar, equally useless pictures. I began to suspect that perhaps Russia really has no wind power because we simply do not have wind of sufficient strength, and somewhere at that moment Sechin's laughter was heard. But I clearly remember that the Samara region is mostly steppe, and that in my childhood, when I went out for bread, the wind would very often blow me back into the entrance hall.
Having searched for data on Russia here and there, I realized there was nothing you could do anything useful with. So I turned to foreign sources and immediately found wonderful wind maps from 3TIER (Vaisala). The resolution seemed sufficient and the worldwide coverage was simply excellent. Then I found out that such data costs quite good money: roughly $1000 per 10 square km (prices as of three years ago). A failure, I thought.

After losing a week, I decided to write to Vaisala, 3TIER and other foreign consulting agencies dealing with wind and wind generators, and ask for data. I thought that once I told them what a cool idea I was working on, they would all dump their data on me at once. Only one answered, from the company Sander-Partner. Mr. Sander himself gave some advice and also gave links to what I needed: data from the MERRA program, which is run by NASA. It is worth noting that it took me about a week of evenings to figure out what reanalysis and WRF are, and generally what is going on in gathering, aggregating, simulating and predicting weather, winds and other things.

Briefly: mankind has collected a heap of weather data and drawn a pile of maps with average temperatures and wind speeds, but it was impossible to collect this data at every point of the globe, so the blank spots were filled in with the results of weather simulations for past years, and this was called reanalysis. For example, here is a website that visualizes such wind simulations, and here is how it looks:

The data was essentially a .csv file: a coarse coordinate grid with an average wind speed at each node. I made a map of this kind from it using the wonderful free QGIS package and its grid interpolation method.

Then, with its help, I extracted the wind speed from the map for each pair of coordinates. In effect, I got a map plus a data layer for every pixel on it.
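To show what "interpolate a coarse grid, then sample it at arbitrary coordinates" means, here is a minimal sketch using plain bilinear interpolation. This is a stand-in technique: which interpolation method was used in QGIS is not stated, and the function name and grid layout below are assumptions made for the example.

```python
def bilinear_sample(grid, lat0, lon0, step, lat, lon):
    """Interpolate a value at (lat, lon) from a regular grid.

    grid[i][j] holds the value at (lat0 + i * step, lon0 + j * step).
    Raises ValueError for points outside the grid.
    """
    rows, cols = len(grid), len(grid[0])
    fi = (lat - lat0) / step  # fractional row index
    fj = (lon - lon0) / step  # fractional column index
    if not (0 <= fi <= rows - 1 and 0 <= fj <= cols - 1):
        raise ValueError("point lies outside the grid")
    # Clamp so that points on the last row/column use the last full cell.
    i = min(int(fi), rows - 2)
    j = min(int(fj), cols - 2)
    di, dj = fi - i, fj - j
    near = grid[i][j] * (1 - dj) + grid[i][j + 1] * dj
    far = grid[i + 1][j] * (1 - dj) + grid[i + 1][j + 1] * dj
    return near * (1 - di) + far * di
```

With a 2 x 2 grid of wind speeds [[0, 10], [20, 30]] and a step of 1 degree, the value in the middle of the cell is the average of the four corners. The same lookup works for any gridded layer, such as temperature or pressure.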

Having figured out how to work with QGIS in a couple of weeks, I began building the same kind of maps for the other data sources and pulling out the values by coordinates: temperature, humidity, pressure and so on. The datasets themselves mostly came from NASA, NOAA, ESA, WorldClim, etc. All of them are freely available. With QGIS I also calculated the distance from each point to the nearest cities, airports and other infrastructure. Each map for a single parameter took about 6-8 hours to compute. And if something was wrong, it had to be done again and again. My home computer hummed in my ears for a couple of weeks, but after that even my neighbors got tired of listening to its rattling cooler, and I crawled off into the cloud, where I spun up a small virtual machine.
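The "distance to the nearest city, airport, and so on" features were computed in QGIS, but the idea is easy to sketch in code. The example below is a minimal brute-force version using the haversine great-circle formula; the function names are made up, and a spherical Earth with a radius of 6371 km is an assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical-Earth assumption


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding pushing the value just above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def nearest_distance_km(point, features):
    """Distance from point (lat, lon) to the closest of the features."""
    lat, lon = point
    return min(haversine_km(lat, lon, f_lat, f_lon) for f_lat, f_lon in features)
```

Checking every point against every feature is quadratic, which is one plausible reason a single layer could take hours; a spatial index would be the usual way to speed this up.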

A few months in, I came across this site, made by the Department of Wind Energy of the Technical University of Denmark (DTU Wind Energy). It quickly became clear that their resolution was several times better than my map's. I got in touch with them and they gladly exported data for the whole world for me, because through the site you can only get a small extract for a limited territory. By the way, they also made this map by simulating the movement of wind layers with the WRF and WAsP models, and achieved a data resolution of 50-100 meters, while mine was about 1-10 km.

RELIEF. Remember, I wrote that the relief is very important, so I decided to use this parameter as well, but that turned out not to be simple either. First, I wrote a utility that pulled data from the Google Elevation API. It did an excellent job and downloaded data for all my points around the world with a step of 10 km; it took only about 12 hours. But I also had the terrain smoothness parameters, that is, the average elevation drop in the territory around a potential windmill location. So I needed data for the whole world with a step of about 100-200 meters, from which I could calculate the average elevation difference.

Downloading enough data from Google Elevation to calculate the differences would have taken a couple of months. So I went looking for other options.
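For clarity, here is one possible way to turn a fine elevation grid around a site into "smoothness" numbers. The exact definitions used in the project are not given, so both metrics below (mean absolute difference between neighboring cells and the overall drop in the window) are assumptions, as is the function name.

```python
def terrain_metrics(elev):
    """Summarize terrain roughness for a 2D grid of elevations in meters.

    elev is a list of rows covering the area around a candidate site.
    Returns the mean absolute height difference between adjacent cells
    and the largest drop (max minus min) inside the window.
    """
    rows, cols = len(elev), len(elev[0])
    if rows * cols < 2:
        raise ValueError("need at least two cells to compare")
    diffs = []
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:  # neighbor to the east
                diffs.append(abs(elev[i][j + 1] - elev[i][j]))
            if i + 1 < rows:  # neighbor to the south
                diffs.append(abs(elev[i + 1][j] - elev[i][j]))
    heights = [h for row in elev for h in row]
    return {
        "mean_step_diff": sum(diffs) / len(diffs),
        "max_drop": max(heights) - min(heights),
    }
```

With a 100-200 m step, a 10 km cell holds thousands of samples, which is why a per-request elevation API was too slow for this and a bulk dataset was needed.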

The first thing I found was Wolfram Cloud, which already had the necessary data. I just wrote a formula and it started computing, using data from the Wolfram cloud. But failure awaited me there too: I ran into limits that were not documented anywhere, and after an amusing exchange with the service's support I went looking for another option.

Here NASA's data sources helped me again, this time data from the SRTM space program (NASA Shuttle Radar Topography Mission Global). I honestly tried to download it from the site, but the data there was only available for small areas. Plucking up courage, I wrote a letter to NASA, and after about a week of correspondence they exported the data I needed, for which I thank them very much. Admittedly, it came in some clever satellite binary format, which I spent probably a week digging through.
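The format NASA actually sent is not named here, so the following is only an illustration under an assumption: that the data resembles the widely distributed SRTM .hgt tiles, which are square grids of big-endian signed 16-bit elevations with -32768 marking voids. The function name is made up for the example.

```python
import math
import struct

SRTM_VOID = -32768  # marker for "no data" in .hgt tiles


def parse_hgt(raw):
    """Decode raw .hgt bytes into rows of elevations in meters.

    Rows run from north to south; void samples are returned as None.
    """
    n_samples = len(raw) // 2
    side = math.isqrt(n_samples)
    if side * side * 2 != len(raw):
        raise ValueError("not a square grid of 16-bit samples")
    values = struct.unpack(f">{n_samples}h", raw)  # big-endian int16
    return [
        [None if v == SRTM_VOID else v for v in values[r * side:(r + 1) * side]]
        for r in range(side)
    ]
```

A real tile has 1201 or 3601 samples per side, so parse_hgt(open(path, "rb").read()) yields a grid that can be fed straight into a roughness calculation.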

Everything ended well, and I got the elevation difference metrics I needed for the whole world in 10-kilometer steps. By the way, as a side project I made my own API service that returns the height above sea level for given coordinates, and published it here. It runs on Azure Tables, where I cleverly packed the data, and storing it there costs me literally cents. By the way, someone even bought access to the API a couple of times, because it turns out cheaper than Google's.

TOTAL. After about 4 months of searching, cleaning and calculating in QGIS, I had a dataset that could be used in machine learning models. It contained about 20 different parameters in the following categories: climate, relief, infrastructure, and demand (consumers).
Machine learning and predictions
By that time I already had some knowledge and understanding of how machine learning algorithms work, but I did not really feel like setting up all those Pythons and Anacondas. So I used Microsoft's online service for dummies, "no SMS required": Azure ML Studio. I was won over by the fact that it is free and everything can be done with a mouse in the browser. Here, in theory, there should be a description of how I spent another month building the model, clustering the data and so on. The clustering was especially painful, because QGIS took a very long time to do it on my old home PC. As a result, the experiment looks like this:


The total number of points that needed to be evaluated came to about 1.5 million. Each such point is a 10 by 10 km territory, and so on across the whole world. I removed the cells that already have windmills within a radius of 100 km, as well as some other areas, and got a dataset of ~1,500,000 records. The model scored the suitability of each such square on planet Earth. I mostly used neural networks and boosted decision trees. The metrics at the points where windmills already stand, compared with what my model predicted, came out as: Accuracy ~0.9; Precision ~0.9. That seems pretty accurate to me, or else there is overfitting somewhere. From this exercise I got:

First, the points where the model said this is a great new place for a windmill.
Second, the points where the model said the place is not very good.

In total, I found about 30,000 most suitable places (these are new places with no windmills within 100 km).
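For readers unfamiliar with the two metrics quoted above, here is how accuracy and precision are computed for a binary "windmill / no windmill" label. This is a generic sketch with a made-up function name, not the evaluation module from Azure ML Studio.

```python
def accuracy_and_precision(y_true, y_pred):
    """Return (accuracy, precision) for binary labels (truthy = windmill).

    accuracy  = share of all cells that were classified correctly
    precision = share of predicted windmill cells that really have one
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision
```

Neither metric says how many existing windmills the model missed (that would be recall), and when one class dominates the dataset a high accuracy alone can be misleading.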
Result and validation
Having obtained 30,000 points with new locations, I visualized them, and the result looks like a heatmap.


I made a small website using cartodb to render the map and published the entire world map there. I also calculated for each point the approximate energy output from one industrial-sized wind turbine (50 m). The points here are colored by the amount of energy, not by the probability score from the model. You can click on any point and it will show the model's "confidence" at that point, which I called Goodness.
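How the energy figure was calculated is not described, so here is only a back-of-the-envelope sketch based on the standard wind power formula P = 0.5 * rho * A * v^3 * Cp. The assumptions are loud ones: "50 m" is read as the rotor diameter (it may well be the hub height), the air density is the sea-level standard, and the power coefficient of 0.4 is a hypothetical round number below the Betz limit.

```python
import math

AIR_DENSITY = 1.225    # kg/m^3, standard sea-level air (assumption)
HOURS_PER_YEAR = 8760


def wind_power_density(speed_ms):
    """Power carried by the wind per square meter of swept area, in W/m^2."""
    return 0.5 * AIR_DENSITY * speed_ms ** 3


def annual_energy_kwh(mean_speed_ms, rotor_diameter_m=50.0, power_coefficient=0.4):
    """Very rough yearly output of one turbine at a constant wind speed."""
    swept_area = math.pi * (rotor_diameter_m / 2) ** 2
    power_w = wind_power_density(mean_speed_ms) * swept_area * power_coefficient
    return power_w * HOURS_PER_YEAR / 1000.0
```

Because power grows with the cube of the speed, doubling the wind speed gives eight times the energy, which is why the map of energy output looks sharper than the map of wind speed. Plugging in the mean speed ignores the real speed distribution and turbine cut-off limits, so the result is an order-of-magnitude estimate at best.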

I also tried to sanity-check all this by expert judgment.

Visual inspection: the model predicts points that lie along the coast, which looks plausible, since over a water surface the wind is good and steady.
Visual inspection: the clusters of points mostly coincide with places of good and excellent wind speed and density when compared with wind maps. For example, here are Egypt and China:

What's next
People sometimes write to me and ask for more detailed maps of particular places or for an explanation of something on the map, but nothing more has come of it yet. In theory, the data could be computed in steps of 100 meters rather than 10 km; the picture might then change a lot, and the model could predict not just an area but a specific site. But this requires somewhat more computing power than I currently have. If you have ideas for applications, I will be glad to hear them.
kleop 22 september 2017, 8:56