Although the title “The Tangled Web” suggested something else, the subtitle “A Guide to Securing Modern Web Applications” pointed to its security focus.
It turned out to be a dive into the history and workarounds of the web!
I wrote down some of the things I learned from reading the book as a reminder to my future self. Maybe they encourage you to read the book as well.
Creating parsers for the web is hard
Legacy rules interact with newer features, and on top of that come workarounds for real-life web sites and quirks in different browsers. Trying to build a parser on your own is difficult to get right. What a mess! This starts with parsing and displaying URLs and HTML, and continues with CSS and JavaScript.
This allows for a myriad of injection attacks once you allow your users to provide content in your application or web site. The author suggests parsing and then re-creating the content, but even that seems hard to me to get “right” (as in preventing injection).
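To make the "parse, then re-create" idea concrete, here is a minimal sketch in Python. It is my own illustration and not code from the book: the allowlist `ALLOWED_TAGS` and the names `Rebuilder` and `rebuild` are made up for this example. It parses untrusted markup with the standard library's `html.parser`, drops every tag that is not on the allowlist, drops all attributes, and writes the text back out escaped. A real application should rely on a well-tested sanitizer rather than something this small.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for this sketch; anything else is dropped.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}


class Rebuilder(HTMLParser):
    """Parses untrusted markup and re-creates it from scratch."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Attributes (onclick, style, href, ...) are never copied over.
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always re-escaped, so it can only ever be text again.
        self.out.append(escape(data))


def rebuild(untrusted: str) -> str:
    parser = Rebuilder()
    parser.feed(untrusted)
    parser.close()
    return "".join(parser.out)


print(rebuild('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>'))
```

The output is built only from strings the code itself emits, never from the raw input. Note that the contents of the dropped `script` element still show up as plain text, which is harmless but may not be what you want.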
Don’t be liberal
There’s Jon Postel’s law: “Be conservative in what you do, be liberal in what you accept from others.”
This “Robustness Principle” tricks browsers into seeing JavaScript where there is actually none, and into rendering HTML where there should be text only. This is a severe security risk. Try to avoid being liberal wherever possible. Even harmless features might be combined into an attack. Being liberal comes at the expense of security.
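One way to be less liberal on the server side is to state exactly what a response is and to ask the browser not to guess. The following is a small sketch under my own assumptions: the function name `strict_headers` and the set `ALLOWED_MEDIA_TYPES` are invented for this example. The `X-Content-Type-Options: nosniff` header itself is real and tells browsers not to sniff a different content type.

```python
# Hypothetical allowlist for this sketch; extend it deliberately, not by default.
ALLOWED_MEDIA_TYPES = {"text/plain", "text/html", "application/json"}


def strict_headers(media_type: str, charset: str = "utf-8") -> dict:
    """Builds response headers that leave the browser nothing to guess."""
    if media_type not in ALLOWED_MEDIA_TYPES:
        # Conservative in what we accept: unknown types are rejected outright.
        raise ValueError(f"media type not allowed: {media_type!r}")
    return {
        # Always name the charset so the browser does not pick one itself.
        "Content-Type": f"{media_type}; charset={charset}",
        # Ask the browser not to second-guess the declared type.
        "X-Content-Type-Options": "nosniff",
    }


print(strict_headers("text/plain"))
```

Rejecting unknown media types instead of passing them through is the opposite of Postel's advice, and that is the point.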
There is an encoding called UTF-7
You have probably heard about UTF-8 and base64 encoding. UTF-7 is not very well known: it looks almost like text, but allows you to hide HTML-relevant characters like angle and curly brackets – <, >, {, } – or other JavaScript- and HTML-relevant characters for attacks. Trick your browser into decoding it in an unexpected place, et voilà: XSS at its best!
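Python ships a UTF-7 codec, so the effect is easy to try out. This is a small illustration of my own, not taken from the book: the payload contains no angle brackets, so a naive escaping step leaves it untouched, yet decoding it as UTF-7 produces a script tag.

```python
import html

# "+ADw-" is UTF-7 for "<" and "+AD4-" is UTF-7 for ">".
payload = "+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# An HTML escaper sees nothing suspicious: no <, >, &, or quotes.
escaped = html.escape(payload)
print(escaped == payload)

# If something later interprets the same bytes as UTF-7, the tags appear.
decoded = payload.encode("ascii").decode("utf-7")
print(decoded)
```

The escaping was applied correctly for the encoding the server had in mind. The attack only works if the browser can be talked into a different one, which is why declaring the charset explicitly matters.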
Every feature leaks personal information
This might be visited link formatting, APIs to derive the orientation of the browser or other old and new browser features. I liked the idea of rendering a number derived from the sites previously visited and tricking the user into entering it as a CAPTCHA in an input field (see: “I Still Know What You Visited Last Summer” by Weinberg et al.).
Even if a feature doesn’t provide personal information at first sight, it will allow JavaScript to fingerprint your system, see Panopticlick by EFF and AmIUnique w/ source code. It will identify you again when you visit the same site, are served an ad from the same advertiser, or execute some JavaScript snippet from one of the big internet companies.
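The principle behind fingerprinting can be sketched in a few lines. This is a simplified model of my own, not how Panopticlick or AmIUnique actually work: the attribute names and values below are invented, and `fingerprint` is a hypothetical helper. Each attribute on its own says little, but hashing them together gives an identifier that stays the same across visits.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Combines individually harmless attributes into one stable identifier."""
    # Sort the keys so the same attributes always serialize the same way.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Invented example values; a script would read these from browser APIs.
visit_one = {"screen": "1920x1080", "timezone": "Europe/Berlin", "fonts": 214}
visit_two = {"fonts": 214, "timezone": "Europe/Berlin", "screen": "1920x1080"}

print(fingerprint(visit_one) == fingerprint(visit_two))
```

No cookie is involved. The more attributes a browser exposes, the fewer other users share the same combination.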
Numerous ways to trick the user
Clicking buttons, arranging windows in a special kind of way, hiding and showing windows at places where you anticipate the user to click next: JavaScript, APIs and browsers allow all this. The browsers have improved over time, but quirks and new attacks are still inevitable.
A piece of advice from Michal Zalewski, the author: It might be a good idea to check that the mouse has been hovering over your web site’s window for at least 500 ms before allowing the user to click a button to execute something security-related or irreversible.
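The logic of that check is simple enough to model. The sketch below is my own reading of the advice, written in Python to show the timing rule only: the class `HoverGuard` and its method names are invented, and in a browser the same logic would be driven by mouse enter, mouse leave and click events in JavaScript.

```python
class HoverGuard:
    """Allows a sensitive click only after the pointer has hovered long enough."""

    def __init__(self, min_hover_ms: int = 500):
        self.min_hover_ms = min_hover_ms
        self._entered_at = None  # timestamp in ms, or None while outside

    def pointer_entered(self, now_ms: int) -> None:
        self._entered_at = now_ms

    def pointer_left(self) -> None:
        # Leaving the window resets the timer completely.
        self._entered_at = None

    def click_allowed(self, now_ms: int) -> bool:
        if self._entered_at is None:
            return False
        return now_ms - self._entered_at >= self.min_hover_ms


guard = HoverGuard()
guard.pointer_entered(now_ms=1000)
print(guard.click_allowed(now_ms=1200))  # window appeared under the cursor just now
print(guard.click_allowed(now_ms=1600))  # user had time to see what they click
```

The delay targets the case where a window is moved or shown under the cursor right before an anticipated click.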
Strange Times!
Thank you, Dennis, for lending me this book. I’m sorry it took me so long to read – now I can finally return it.
The book has a page on the author’s web site where you can find sample chapters. You can buy this book online, for example from the publisher No Starch Press (DRM-free) or at Amazon. There might be used ones available at a cheaper price.