Web development

It's time to kill the web

Something is happening. People are unhappy. The specter of civil unrest stalks our programming communities.

For the first time, a significant number of web developers are openly questioning the web platform. Here is a typical article and discussion. I could list more, but if you are interested enough in programming to read this article, you have probably already read at least one pompous rant about the current state of web development this year. This article is not one of those. I cannot compete in bashing the status quo with people who have to deal with web development every day. This is a different kind of article.

It's you, the front end hacker

I want to think about whether it is possible to create a competitor to the web so good that it eventually replaces and absorbs it, at least for writing applications. The web also has problems as a document distribution system, but they are not serious enough to worry about.

This is the first of two articles. In this first part, we look at the deep, unsolvable problems of the web platform: I want to convince you that abandoning it is the only option. After all, you cannot solve a problem without analyzing it first. We will also briefly consider why it has now become politically acceptable to discuss such issues, even though they are nothing new.

In the second part, I will propose a new application platform that a small group of developers could build in a reasonable time, and which (IMHO) should be much better than what we have now. Of course, not everyone will agree with the latter. Agreeing on problems is always easier than agreeing on solutions.

Part 1. Let's go.
Why the web must die
Web applications. What are they like, eh? I could list a bunch of their problems, but let's focus on two.

Web development is slowly repeating the 1990s.
Web applications cannot be secured.

Here's a nice post on Flux, the latest fashionable web framework from Facebook. The author points out that Flux recreates the programming model used by Windows 1.0, which was released in 1985. Microsoft used this model because it suited slow computers, but it was so inconvenient to program against that within a decade an entire ecosystem of products (like OWL) had grown up to abstract over the underlying WndProc message system.
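To make the comparison concrete, here is a minimal Python sketch of the message-dispatch style described above. It is an illustration only, not Flux's or Windows' actual API: the `Message`, `CounterStore` and `Dispatcher` names are hypothetical. The point is the shape of the code: every state change arrives as a tagged message, and the receiver is one big switch on the message type.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical message record: a type tag plus an untyped payload."""
    kind: str
    payload: dict = field(default_factory=dict)


class CounterStore:
    """Hypothetical store; all state changes go through handle()."""

    def __init__(self) -> None:
        self.count = 0

    def handle(self, msg: Message) -> None:
        # One switch on the message type, in the style of a WndProc
        # or a Flux store callback.
        if msg.kind == "INCREMENT":
            self.count += msg.payload.get("by", 1)
        elif msg.kind == "RESET":
            self.count = 0
        # Unknown message types are silently ignored.


class Dispatcher:
    """Delivers every message to every registered handler."""

    def __init__(self) -> None:
        self._handlers = []

    def register(self, handler) -> None:
        self._handlers.append(handler)

    def dispatch(self, msg: Message) -> None:
        for handler in self._handlers:
            handler(msg)


store = CounterStore()
dispatcher = Dispatcher()
dispatcher.register(store.handle)
dispatcher.dispatch(Message("INCREMENT", {"by": 2}))
dispatcher.dispatch(Message("INCREMENT"))
dispatcher.dispatch(Message("REPAINT"))  # ignored by this store
print(store.count)
```

Nothing here is type-checked across the dispatch boundary: a misspelled message type or a missing payload key is only discovered at run time, which is the kind of inconvenience the later abstraction layers were built to hide.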

One of the reasons React / Flux works this way is that web rendering engines are very slow. This is true even though the final result visible to the user is only slightly fancier than what a Windows user could see 20 years ago:

Windows 98

Of course, the screen resolution has grown. The shades of gray we like have changed. But the UI you see at the top is quite similar in complexity to the UI you see below:


Even the fashion in icons has not changed! Windows 98 introduced a new trend of flat grayscale icons, whereas the previous ones were colorful, densely packed pixel images.

But Office 2000 was quite happy with a 75 MHz CPU and 32 MB of memory, while Google Docs in the screenshot uses a 2.5 GHz CPU and almost ten times as much memory.

If we had gained a tenfold increase in productivity and functionality, that would be forgivable, but we did not. Development platforms in 1995 had all of the following by default, just for starters:

Visual UI editor with layout constraints and data binding.
Advanced support for software components in many languages. You could mix statically typed native code and scripting languages.
Released executables were so efficient that they ran in a few megabytes of RAM.
Support for plotting, theming, 3D graphics, socket programming, interactive debugging ...

Many of these features have reached the web platform only in the last few years, and often very superficially. Web applications cannot use real sockets, so instead servers must be changed to support "web sockets". Such basic things as UI components are a quiet horror. There is no serious web IDE, and as for mixing different programming languages ... well, you can try to compile everything to JavaScript. Sometimes.

Developers like writing web applications for one reason: users' expectations of such applications are extremely low. From a Windows 95 application you expected icons, drag and drop, undo, file extension associations, sensible keyboard shortcuts, useful activity in the background ... and even working offline! And those were just the simple applications. The really cool software could be embedded in Office documents, extend the Explorer, or be extended itself by arbitrary plugins unknown to the program's original author. Web applications usually do nothing of the sort.

All this adds up. I feel much more productive when I write a desktop program (even taking into account the various "taxes" you have to pay, like making icons for your file types). I also prefer using them. And from conversations with others I know that I am not the only one.

I think the web ended up this way because HTML arrived with a reasonably coherent design philosophy and toolset as a platform for documents, but as a platform for applications it had to be bolted together with duct tape, and nothing good has come of it so far. That is why there are no basic things like file associations. Yet HTML5 has peer-to-peer video streaming, because Google wanted to build Hangouts, and Google's priorities are what decide which features get added to the standard. To avoid this problem, we need a platform that is designed for applications from the start, with documents perhaps added on top later, and not the other way around.
Web applications cannot be secured
In the late 1990s, a terrible realization dawned on the software industry: security vulnerabilities in C / C++ programs were no longer rare errors that could be fixed one at a time. They were everywhere. People began to understand that if C / C++ code is exposed to the Internet, exploits will inevitably follow.

You can gauge the innocence of that world by reading the SANS report on the Code Red worm from 2001:

<blockquote> Representatives of Microsoft and US security agencies held a press conference where they instructed users to download the patch from the Microsoft website and called downloading this patch a "civic duty". After Code Red spread, CNN and other news outlets warned users of the need to patch their systems. </blockquote>

Windows did have automatic updates, but if I remember correctly, they were disabled by default. The idea that a program could change without the user's knowledge was that much of a taboo.

First signs of Blaster infection

Gradually the industry began to change, but not without kicking and screaming. At that time, Linux and Mac users often claimed that this was a purely Microsoft problem ... and that their systems were built by some kind of super-programmers. So while Microsoft accepted that it faced an existential crisis and introduced a "secure development lifecycle" (a huge retraining program and a new process), its competitors did practically nothing. Redmond added a firewall to Windows XP and introduced code signing certificates. Mobile code was banned. When it became clear that the flow of security vulnerabilities would never end, the regular "Patch Tuesday" release of patches was introduced. Smart hackers kept discovering how to exploit known bugs that had seemed harmless, and how to bypass exploit mitigations that had seemed reliable. The Mac and Linux communities slowly came out of hibernation and realized that they were not magically immune to viruses and exploits.

The final turning point came in 2008, when Google released Chrome, a project notable for the effort spent on building a complex but completely invisible sandbox for the rendering engine. In other words, the industry's best developers admitted that they could not write secure C++ code no matter how hard they tried. This thesis and the sandboxed architecture have become de facto standards.
The web platform's turn
Unfortunately, the web has not led us to the promised land of secure applications. Web applications are isolated from the host OS to some degree, which is good, but the applications themselves are hardly more robust than Windows code from 2001. Instead of getting rid of inherited problems for good, the web simply replaced one type of buffer overflow with another. In desktop applications, attackers exploited vulnerabilities such as double free, stack smashing, use after free and others. Web applications fixed these but introduced equivalents of their own: SQL injection, XSS, XSRF, header injection, MIME type confusion, and so on.

All this leads to a simple thesis:

It is impossible to write a secure web application.

Let's not be pedantic. I am not talking about literally every web application. Yes, you can write a secure HTML Hello World, good for you.

I am talking about real web applications of a decent size, written under realistic conditions, and I do not make this claim lightly. The realization came to me after eight years of working at Google. There, I watched the best and most talented web developers ship code with exploitable bugs again and again.

Google's security team is one of the best in the world, maybe the best, and for its internal training program it produced a useful guide listing the most common web development mistakes. Here is their advice on how to safely send data to be displayed in a browser:

<blockquote> To fix this, there are several changes you can make. Any one of these changes will prevent currently possible attacks, but if you add several layers of protection ("defense in depth"), you are also protected if one of the layers fails because of browser vulnerabilities discovered in the future. First, use an XSRF token, as discussed earlier, to ensure that JSON results with sensitive data are returned only to your own pages. Second, your JSON response pages should support only POST requests, which prevents the script from being loaded via a `script` tag. Third, you should make sure that the script is not executable. The standard way to do this is to add some non-executable prefix to it, like `])}while(1);</x>`. A script running in the same domain can read the contents of the response and strip the prefix, but scripts from other domains cannot.

NOTE: Making scripts non-executable is not that easy. Script execution may change in the future as new scripting features and languages appear. Some people think you can protect a script by turning it into a comment, that is, by wrapping it in /* and */, but it is not that simple. (Hint: what if someone uses */ in one of their snippets?) </blockquote>
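As a rough illustration of the third tip, the following Python sketch shows a server wrapping a JSON response in a non-executable prefix and a same-origin client stripping it before parsing. The prefix string is the one quoted in the guide; the function names `wrap_json` and `unwrap_json` are hypothetical, and a real deployment would still need the XSRF token and POST-only handling mentioned in the quote.

```python
import json

# Prefix quoted in the guide; it is meant to make the response fail
# to parse if another site loads it as a script.
PREFIX = "])}while(1);</x>"


def wrap_json(data) -> str:
    """Server side: serialize data and prepend the non-executable prefix."""
    return PREFIX + json.dumps(data)


def unwrap_json(body: str):
    """Same-origin client side: strip the prefix, then parse the JSON."""
    if not body.startswith(PREFIX):
        raise ValueError("response is missing the expected prefix")
    return json.loads(body[len(PREFIX):])


body = wrap_json({"balance": 42})
print(body)
print(unwrap_json(body))
```

Note that the client has to know the exact prefix in advance, which is one more piece of out-of-band convention that every consumer of the endpoint must get right.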

Reading this amusing pile of witchcraft and folklore always makes me smile. It ought to be a joke, but in reality these are the basics that every web developer at Google must know, just to put some information on the screen.

In fact, you can do all of the above and it still will not be enough. The HEIST attack makes it possible to steal data from a web application that has implemented every security measure flawlessly. It exploits unavoidable flaws in the architecture of the web platform itself. Game over.

But wait! It gets worse! Protecting REST / JSON endpoints is just one of the many security issues that a modern web developer has to understand. There are dozens of others (here is one interesting example, and here is another).

From experience, I can say that it is impossible to hire a web developer who has even heard of all these pitfalls, let alone one who knows how to defend against them. Here is my conclusion: if you cannot find a web developer who understands how to write secure web applications, then you cannot write a secure web application.
The core problems
Virtually all security problems on the web come down to a few core architectural problems:

Buffers that do not specify their length
Protocols designed for documents, not for applications
The same-origin policy

Losing track of buffer lengths is a classic source of vulnerabilities in C programs, and the same problem has arisen on the web: all XSS and SQL injection attacks rely on confusion about where a code buffer ends and a data buffer begins. The web depends heavily on text protocols and formats, so buffers constantly have to be parsed to find out where they end. This opens up a whole universe of problems with escaping, substitution and other things that are completely unnecessary.
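The code/data confusion is easy to reproduce. The sketch below uses Python's standard `sqlite3` module with a made-up `users` table (the table, its rows and the attacker input are all assumptions for illustration). In the first query the input is spliced into the SQL text, so a quote character moves the boundary between code and data; in the second the value is passed separately and is never scanned for SQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

user_input = "nobody' OR '1'='1"  # attacker-controlled text

# Unsafe: data is spliced into the code buffer, so the quote character
# in the input changes where the string literal ends.
unsafe_sql = "SELECT secret FROM users WHERE name = '" + user_input + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: the statement and the value travel separately, so the value
# is treated purely as data.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(leaked))  # 2: every row matches the injected OR clause
print(len(safe))    # 0: no user is literally named "nobody' OR '1'='1"
```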

Bugfix: Every buffer must be prefixed with its length, all the way from the database through the frontend server to the user interface. There should never be a need to scan something for magic characters to find out where it ends. Note that this requires binary protocols, formats, and UI logic throughout the stack.
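A minimal sketch of what length-labeled buffers look like in practice, assuming a simple framing scheme of my own choosing (a 4-byte big-endian length followed by the payload; the function names are hypothetical). The decoder never inspects payload bytes, so no character needs escaping.

```python
import struct


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def decode_frames(buffer: bytes) -> list:
    """Split a buffer into payloads using only the length prefixes."""
    frames = []
    offset = 0
    while offset < len(buffer):
        if offset + 4 > len(buffer):
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">I", buffer, offset)
        offset += 4
        if offset + length > len(buffer):
            raise ValueError("truncated payload")
        frames.append(buffer[offset:offset + length])
        offset += length
    return frames


# Payloads may contain quotes, angle brackets, or newlines: nothing is escaped.
messages = [b"hello", b"'; DROP TABLE users; --", b"<script>\n</script>"]
stream = b"".join(encode_frame(m) for m in messages)
assert decode_frames(stream) == messages
```

The trade-off is that such a stream is no longer readable in a text editor, which is exactly the requirement for binary protocols and formats noted above.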

HTTP and HTML are designed for documents. Egor Homakov managed to break Authy's two-factor authentication by simply typing "../sms" into the SMS code entry field. It worked because, like all web services, Authy is built on a stack designed for hypertext, not software. Directory traversal makes sense if you really are accessing directories of HTML files, as Sir Tim intended. If you expose your API as "documents", directory traversal can be fatal.
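The mechanism can be sketched in a few lines of Python. This is not Authy's actual code: the `/protected/verify/` endpoint layout and both function names are hypothetical, and `posixpath.normpath` merely stands in for whatever path resolution a real server stack performs. It shows how a value meant to be data ends up selecting a different "document".

```python
import posixpath

API_BASE = "/protected/verify/"  # hypothetical endpoint layout


def naive_url(code: str) -> str:
    """Build the request path by plain concatenation, then normalize it."""
    return posixpath.normpath(API_BASE + code)


def checked_url(code: str) -> str:
    """Accept only values that look like a numeric code."""
    if not (code.isascii() and code.isdigit()):
        raise ValueError("code must contain ASCII digits only")
    return API_BASE + code


print(naive_url("123456"))  # /protected/verify/123456
print(naive_url("../sms"))  # /protected/sms
```

The checked variant rejects "../sms", but it only works because the developer remembered that a six-digit code is also a path segment, which is the kind of knowledge a platform built for applications would not demand.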

REST was terrible enough when it returned XML, but XML has since gone out of fashion and the web now uses JSON, a format so poorly designed that it has a whole section of its wiki page dedicated to security issues.

Bugfix: Let's stop pretending that REST is a good idea. REST is a bad idea that twists HTTP into something it is not, just to work around browser restrictions. It is yet another tool bent into something it was never meant to be, and that always ends badly. Given the previous point, client-server communication should use a binary protocol designed specifically for RPC.

For a web developer, the same-origin policy is another experience straight out of a Stephen King novel. Quoting from the wiki:

<blockquote> The behavior of same-origin checks and related mechanisms is not well defined in a number of corner cases ... historically, this has caused a considerable number of security vulnerabilities.

In addition, many legacy cross-domain operations that predate JavaScript are not subject to same-origin checks.

Finally, certain types of attacks, such as DNS rebinding or server-side proxies, allow the host name check to be partially subverted. </blockquote>

The SOP (same-origin policy) is the result of Netscape bolting program code onto a document format. In fact, it makes no sense at all. You would never design an application platform this way if you had more than 10 days to do it. But we cannot really blame them: Netscape was a start-up working under acute time pressure, and as noted above, nobody took security seriously at the time. The result of a 10-day coding marathon could have been even worse.

Like it or not, the SOP lies at the heart of the HEIST attack, and HEIST breaks almost all web applications in ways that can probably never be fixed, at least not without giving up backward compatibility. This is another reason why you cannot write a secure web application.

Bugfix: Applications need a clear identity, and they should not share security tokens with each other by default. If you do not have permission to access a server, you should not be able to send messages to it. Every platform understands this, except the web.

There are a lot of other architecture problems that make it difficult to create secure web applications, but the above examples, I hope, are enough to convince you.

HTML5 is a plague on our industry. Although it does some things well, those advantages are easily matched by other application platforms, while virtually none of the core flaws in the web's architecture can be fixed. That is why the web lost on mobile devices: given competing platforms that were actually designed rather than grown organically, developers almost always choose native. But outside the mobile world, we have nothing good. We desperately need a way to conveniently distribute sandboxed, secure, auto-updating applications for desktops and laptops.

Ten years ago, I would have been crucified for writing an article like this. I still expect some grumbling, but lately it has become socially acceptable to criticize the web. In the past, the web was locked in competition with proprietary platforms like Flash, Shockwave and Java. The web was open, but its survival as a competitive platform was in doubt. Its eventual rebirth and victory is a story that pushes every emotional button: openness beats closedness, collective ownership beats proprietary code, David beats Goliath, and so on. Many programmers developed a truly tribal loyalty to it. Adding "web" to the name of anything instantly made it fashionable. Express the opinion that Macromedia Flash was good, and your geek credentials could be revoked.

But times have changed. The web has become so bloated that it is almost pointless to call it open: you have no chance of implementing HTML5 unless you have a few billion dollars in your pocket that you are willing to burn. The W3C did not listen to its users' requests and has now become irrelevant, so unless you work at Google or Microsoft, you cannot influence the technical development of the web in any way. Some of the previously closed competing platforms have since been opened. And the JavaScript ecosystem is collapsing under the weight of its own pointless churn.

It's time to go back to the drawing board. Now grab a drink and read the next article in the series: what will follow the web?
Papay 16 november 2017, 12:17