Things to keep in mind concerning CSRF attacks
The first milestone releases of Vert.x 4.0.0 suffered from CVE-2020-35217. Thanks to Xhelal, we were able to address the security bugs and fix it on the first beta. This blog post is a explanation why CSRF should be used in your web applications.
Be aware of the danger: Cross-Site Request Forgery (CSRF)
A popular web vulnerability that has attracted a lot of attention in the research community is Cross-Site Scripting (XSS). This vulnerability allows the attacker to inject arbitrary JavaScript (JS) code on a website that will run in the same origin as the website. According to the Open Web Application Security Project (OWASP), XSS is the second most prevalent issue in the 2017’s OWASP Top [1], and it is found in around two-thirds of all web applications. However, while XSS gets all the attention, few developers pay attention to another attack that can be equally destructive and far easier to exploit. This attack is called Cross-Site Request Forgery (CSRF). It was ranked by OWASP as the fifth most dangerous web vulnerability twice (2007, 2010) and eighth in 2013. Fortunately, the developers’ awareness has increased in recent years and most web frameworks provide at least one defensive mechanism against CSRF. As a result, CSRF was not listed in the 2017’s OWASP Top 10 list. Unfortunately, this attack is far from extinct.
CSRF Origins
In 1988 Norm Hardy published a paper where he explained a theoretical security issue that he called “confused deputy” [2]. This security issue was first reported in 2000 in BugTraq (nowadays SecurityFocus [3]). The post[4] in BugTraq showed how ZOPE, a Python web framework, was vulnerable to this web “confused deputy” that we today know as CSRF. This term (“CSRF”) was first used in June 2001 by Peter Watkins, followed by a detailed description in 2004 by Thomas Schreiber [5]. The author described a variety of attack scenarios where CSRF could be exploited, providing the most detailed description of the problem at the time. Since then CSRF has had many names such as XSRF, one-click attack, Cross-Site Reference Forgery, session riding, sea surf, hostile linking, “sleeping giant” etc. For the rest of this blog, we will refer to this vulnerability as CSRF.
How does CSRF exactly work?
There are many definitions for it in the literature, but the core idea remains the same; in a CSRF attack the victim’s browser is tricked by an attacker into sending a state-changing HTTP request with authentication cookies, which the victim did not intend. This exploits the fact that cookies are widely used on the internet and browsers automatically attach them to requests [6]. For instance, online shops, social networks, webmail applications, etc. use cookies to maintain state and track/re-identify users without needing to reauthenticate. The reason cookies are used is because HTTP is a stateless protocol. The server responds to a received request and then “forgets” about the connection. To prevent this statelessness of HTTP, the authentication information is stored somewhere in the server-side (e.g. session store, database, etc.) and the browser receives only an identifier (ID) from the server for that session, often via a cookie. The browser stores this cookie and when a user sends a request to the server, the browser will also automatically attach the cookie(s) for this web application. The server retrieves the Session ID from the cookie and looks up in its session storage or database to retrieve the user’s data, identified by the Session ID.
Many alternatives exist when it comes to forging a CSRF request. If a state-changing request can be executed through
HTTP GET, then an attacker can exploit this in (mainly) two ways. One option would be for the attacker to send an email
that contains a HTML tag with the CSRF payload (e.g. <img src=“http://bank.com/transfer?amount=x&dest=y”/>
). If the
webmail application loads HTML images automatically, then the browser will send the HTTP GET request and the CSRF attack
succeeds. The second option is for the attacker to trick the user into visiting his malicious website, which contains
the above HTML image tag. Note that the attacker is not limited to only the <img>
tag. The attack can be triggered
by using different HTML tags, which usually provide a src attribute. However, in most cases, applications perform
state-changing request through HTTP POST requests. In this case, the attacker has to create a hidden JS form in his
malicious website with the exact form fields that the server is expecting. Then, the attacker can use JS’s events (e.g.
onload()) to automatically post the hidden form when the victim loads/visits the page.
CSRF can be considered a type of the confused deputy attack where the web browser (confused deputy) is tricked into sending a forged request (for a less privileged attacker) to a web application. A CSRF attack works because, by design, a web browser automatically attaches all cookies that it has for the target web application when a request is sent. A server that does not protect against CSRF would accept and execute the request as coming from the victim since the session cookie was part of the request. What is worse, the victim is not aware of the attack until when it is too late. The figure below shows the steps of a common CSRF attack. However, some conditions have to be fulfilled for the attack to work:
- The attacker must find an unprotected state-changing operation in the target web application.
- The victim must be already authenticated to the target web application (i.e. a session cookie is already stored in his/her browser).
- The attacker must forge the state-changing request correctly. This means that the attacker must include all HTML form fields or request parameters that the server-side expects.
- The attacker has to trick the authenticated victim into visiting the attacker’s website (where the CSRF attack will take place) or trick the victim into clicking a link.
- The means for authentication must be automatically attached to the request by the browser (e.g. Basic Authentication request header [7] or cookies). Note that, different from XSS, CSRF aims to reuse the session cookie, not steal it.
Impact of CSRF attacks
The impact of the attack depends on the specific operation that is vulnerable to CSRF, but also on the privileges that the victim has. This can result in a money transfer, change of password, a purchase in a shopping website, account compromise, created admin user, etc. Sometimes CSRF can be even more dangerous than session hijacking. For instance, in a court case, the victim cannot argue that he did not perform a transaction because the IP of the request (although unintended) was that of the victim. To makes things even worse, the victim doesn’t even know which malicious website he visited that triggered the CSRF attack. In the case of session hijacking, the malicious attacker logs in with stolen credentials (usually) from an IP address different from that of the victim’s. Therefore, in this case, the victim can argue that he was the victim of an attack. There have also been other examples of CSRF attacks that lead to remote code execution with root privileges [8] or compromise of a root certificate [9].
Sometimes CSRF is mistakenly confused with XSS, but they are two different attacks. XSS aims to execute arbitrary JS code on a vulnerable website. It abuses the trust that a client has in a certain web application, thus clicking the link. On the other hand, CSRF tricks the user’s browser to send unintended HTTP requests to a vulnerable web application. It exploits the trust that a web application has in the user. The web application assumes that if a request was received, then it originated from the user (because of the session cookie) and executes it. Additionally, in a CSRF exploit the attacker can trick the victim’s browser to send an HTTP request, but he cannot read the response of that request while XSS can issue requests and also read the response. XSS attacks are based on JS, while CSRF attacks can also be carried out just by using a crafted HTML form. Finally, if a web application is vulnerable to XSS, then it is also vulnerable to CSRF. However, if a web application is safe from XSS, it might still be vulnerable to CSRF.
Since its discovery in 2001, there have been many reported CSRF attacks. Major websites such as Netflix, Google, Yahoo, financial institutes, Facebook, etc. have been vulnerable to CSRF and in some cases even more than once. Some of the most famous cases are:
- The Netflix website (2006): an attacker could add a DVD to the victim’s shopping cart, change the shipping address of the victim, or even compromise his/her account.
- New York Time’s website [10]: CSRF that leaks the email address of the user. It was used for spamming the victims. The website kept users logged in for over a year.
- ING Direct web application [10]: vulnerable to a CSRF attack that allowed unauthorized money transfers from victim’s account to the attacker’s account.
- YouTube [10]: The vulnerability allowed the attacker to perform almost all actions that a user can normally do.
- Google, Yahoo, PayPal (2008) [11] were vulnerable to Login CSRF.
- Facebook (2009) was vulnerable to CSRF. The attacker could use an HTML
<img>
tag to steal the user’s account information. - MetaFilter [10]: the vulnerability allowed an attacker to take control of a user’s account.
- Twitter [12] was also vulnerable to CSRF in 2010. When authenticated users visited the malicious website, they unintentionally posted two tweets – one with a link leading to this malicious website and another with a tweet about goats. Every user who clicked on the link provided in the first tweet also posted those two tweets, hence the worm was spread.
Most popular anti-CSRF defenses
Synchronizer Token Pattern (STP) [13] is one of the most popular countermeasures. A secret, unguessable, random value (known as CSRF token) is generated on the server-side using a cryptographically-secure pseudorandom generator (CSPRNG) with a random input/seed. The generated token is stored in server-side storage and must be tied to a specific user (usually linked to the Session ID). This storage can be a session datastore (e.g. Redis), a database, a filesystem (e.g. in PHP), server’s memory, etc. The CSRF token is sent as part of the server’s response and is usually placed in a hidden HTML form field. Once a request arrives in server-side, the server will use the Session ID from the session cookie (found in the incoming request) to extract the CSRF token from the storage. It will then compare it against the CSRF token that came as part of the request’s body (or in a custom header).
Double Submit Cookie [14] is another popular countermeasure that makes use of cookies instead of session storage to store a CSRF token. The security of this countermeasure relies on the SOP. Only JS running within the same origin is allowed to read or modify the cookie’s value. The server-side generates a CSRF token same as in the STP countermeasure. The server creates a cookie with the CSRF token in it and sends both this cookie and the CSRF token (usually in an HTML form) to the client-side. When a request is sent to the server-side, this cookie that holds the CSRF token will be automatically sent by the browser in addition to the CSRF token in the request body/custom header, hence the name “Double Submit”. The server-side will retrieve the CSRF token from the cookie and compare against the CSRF token in the request body/custom HTTP request header.
Great does not mean perfect!
Although anti-CSRF defenses mitigate CSRF attacks to a great extent, they are not perfect and may be susceptible to different attacks vectors:
Cryptography concerns
- Use of unsafe functions for randomness: the function that is used to generate CSRF tokens/secrets is crucial for security. Pseudo-Random Number Generators (PRNG) are fast functions that output low-quality randomness and should not be used to generate strings for critical security operations. It is advised to use cryptographically-secure PRNG (CSPRNG) instead. They provide enough randomness/entropy in exchange for longer generation time. Most languages provide a CSPRNG [15] so make sure to check before you end up using functions like Math.Random().
- Insufficient randomness of the (CSRF) token: tokens need to be randomly generated (i.e. high entropy) so that it cannot be guessed or brute-forced in a reasonable amount of time. In order to withstand the computation power of today’s computers, tokens needs to have an entropy of at least 128-bit to be considered secure.
- Insufficient randomness of the cryptographic key: in those cases when the CSRF token/cookie is signed and/or encrypted (for additional security or to prevent tampering), the secret key that is used for encrypting or signing during token generation might not be secure enough. The secret key should be random enough so that the attacker cannot easily brute-force it. Developers often copy and paste it from the documentation code snippet or Stack Overflow posts without realizing the risks.
- Lack of key rotation: secret keys should often be rotated. The lifetime of the cryptographic key is important and depends on many factors as already covered in detail by OWASP [16].
- Use of insecure cryptographic algorithms: a web framework might still be using an insecure cryptographic algorithm such as DES, MD-5, SHA-1, or use unsafe block cipher such as Electronic Code Book (ECB). OWASP provides detailed documentation concerning cryptographic operations (as mentioned above).
- Use of deprecated/unpatched cryptographic libraries: the cryptographic algorithms are often provided by libraries. An unpatched or deprecated library might be problematic and developers should always be using the latest (patched) version.
- Insecure storage of the application’s cryptographic key: aside from randomness and key rotation, its storage is also important. Storing a secret key hardcoded in the source code (or in some other insecure location) would compromise the key if the code is leaked or a (malicious) employee has access to it.
- Server-side token storage: A possible implementation mistake (for STP defense for example) would be the incorrect mapping user-token between the user and the CSRF secret that is stored on the server-side. An incorrect mapping might lead to many users having the same token. If the attacker and victim share the same token, the attacker can easily forge a successful CSRF attack. Although this might sound as improbable, it has even happened recently [17].
Token transmission from server-side to client-side
- MITM attacks: Transmission of the secret values over HTTP is insecure since a network attacker can perform a traditional MITM attack by intercepting the request and leak the CSRF token. Additionally, SSL stripping attack or a more advanced MITM [18] might be exploited.
- BREACH [19]: is another attack vector that can leak the CSRF token using a compression-based side-channel if the HTTP response is compressed. This attack is possible if the CSRF token is in the HTTP response body (which normally is), along with some user-specified input. The authors that discovered BREACH showed how they leaked a CSRF token in 30 seconds in Microsoft’s Outlook Web Access website.
- Placing the token in the URL: is a common mistake that might lead to CSRF token leakage through log files, browser history and Refer(r)er header. Another trick is to retrieve the CSRF token by using the so-called “CSS History Hack” [20].
HTTP(S) request with the CSRF token from client-side
- Code injection (XSS, Dangling Markup, CSS tricks): this category of attacks aims to leak the secret token by using JS (XSS), HTML (Dangling markup), or CSS. Any XSS vector can be used to leak the CSRF token that is placed in the hidden HTML form. One might think that CSRF is pointless when the attacker can already perform XSS, a larger threat than CSRF. However, there are cases when the attacker can be detected, e.g. in server-side XSS cases. Alternatively, an XSS vector in a subdomain might be exploited to attack an XSS-secure target parent domain. For instance, the attacker can use the XSS in the subdomain to set cookies for the parent domain and perform a cookie tossing attack (to be discussed soon). Dangling Markup is another kind of attack that uses HTML to extract the CSRF token when attacker-controlled input is reflected in the HTML. A detailed example of this attack can be found in this blog by Gareth Heyes.
- Clickjacking [21]: is another attack that can leak CSRF token or render the CSRF defense useless. The defense can be bypassed by framing the target web application on the attacker’s website. The victim is then tricked into submitting an HTML form on the target web application (same origin) which is loaded inside the attacker’s website with the CSRF token in it. Since the CSRF token is part of the request, the defense becomes pointless. However, traditional Clickjacking is limited to clicking buttons while in reality, an HTML form has to be filled in order to perform a sensitive operation. Stone [22] showed how Clickjacking could be used to achieve this. He suggests the use of drag-and-drop API to leak the CSRF token and/or fill HTML forms.
Cookie injection (for cookie-based mechanisms like Double Submit Cookie)**
- Cookie tossing [23]: this attack vector is the natural enemy of Double Submit countermeasure and exploits the fact that an attacker-controlled subdomain can set a cookie for the target parent domain. It also exploits the complex nature of cookies. A cookie is stored as a unique combination of name/domain/path in the browser. Because name/domain/path determine the uniqueness of a cookie, an attacker can create a cookie from a subdomain that he controls with the same name and domain but with a different path. This will create a whole new cookie and the browser will store it even if it has the same name. When the request is sent, the browser will attach this header: “Cookie: Xname=good; Xname=bad” (and no other cookie attributes). As a result, the server that hosts the target parent domain now sees two cookies with the same name but cannot distinguish which one belongs to the parent domain. The success of the attack relies on the fact that the attacker’s cookie (i.e. “Xname=bad” cookie) is the one that is considered by the server. To increase the success chance, the attacker can also include a specific path in the forged cookie, e.g. path=transfer. This exploits the fact that (most) browsers will consider a cookie with a path to be the “more specifically-scoped” and will send it first. Some programming languages (e.g. PHP) create an array from the Cookie header and only consider the first cookie of the array. Other languages like Python consider the last cookie in the array. Egor Homakov [24] showed a detailed example of a real-life application of cookie tossing on GitHub.
- Cookie jar overflow [25]: is an attack vector that targets the web browser’s cookie jar. When a website sets a cookie, the browser adds it to the cookie jar which can be thought of as a database in the browser that stores cookies. Just like a real jar, there is a limit to how many cookies it can store. Firefox allows up to 150 cookies while Chrome allows 180. If the limit is reached, the browser will start replacing old cookies with new ones. Therefore, an attacker can do a cookie jar overflow to “kick out” every single cookie of the parent domain and replace them with attacker-specified cookies (and CSRF token) just by running a JS snippet code on a subdomain. Even if the attacker cannot modify a cookie (e.g. HttpOnly or SameSite), he can use this overflow to control what cookies the browser stores.
Server-side CSRF verification**
- Insecure token comparison (timing attacks) [26]: timing attacks on token comparison can happen if a token comparison is done using the ”==” operator or a function that is based on this operator. The duration of the comparison is longer for strings with many characters in common (e.g. “university” and “universe”) and shorter otherwise (e.g., “university” and “test”). This time difference in the comparison can be used as a side-channel to guess the CSRF token.
- Missing checks for “safe” HTTP methods: a common mistake is to perform CSRF verification only for unsafe HTTP methods such as POST, PUT, DELETE, PATCH. Indeed, according to RFC 2616 [27], “safe” methods should be idempotent. However, developers use GET-based requests for state-changing operations quite often in practice (e.g. for log out). This allows for GET-based CSRF attacks to happen.
- Missing check for non-POST unsafe HTTP methods: is an even more dangerous practice. There are few framework developers that only perform CSRF verification only for POST requests and ignore other unsafe methods such as DELETE or PUT. In some cases, the developer doesn’t understand the risk of such behavior [28].
- HTTP Method Override is also a “feature” that can bypass the CSRF verification that is performed only for specific HTTP headers. The attacker can forge his request as a GET request and also add a custom request header (X-HTTP-Method-Override) to override the request method to PUT, POST, or DELETE. Usually, this is provided as a middleware. If this middleware is specified after the CSRF middleware, then the CSRF verification will be bypassed because it considers the request as GET. Next, when method override middleware is executed, it will change the request method to POST and will execute the CSRF request [29]. Another attack vector using HTTP Method Override happens when the attacker specifies an arbitrary name as a modified request method and the countermeasure will trigger the CSRF verification only for a list of specified methods (e.g. only for POST, DELETE, etc.). In this case, the verification will not be called at all, leading to CSRF.
- Logical errors: are all errors that don’t throw an exception but have a flaw in the code’s logic or programming mistakes (e.g. incorrect “if-else” clauses). As we will see during the field study, there are few frameworks that have actually fallen victim to this problem.
- Replay attack [30]: is one of the oldest tricks that is usually exploited in cryptographic protocols. The attacker intercepts a message that is sent from A to B, say a request for money transfer, and then replay this attack by sending the intercepted message to B multiple times. B has no way of checking the freshness of the message and will do the transaction every time. This can happen with CSRF tokens as well. An attacker that leaks a single CSRF token can use it multiple times until token expiration. This attack is also aided by the way sessions are usually handled in practice. Some sessions can last for days or months and if time is not specified in the cookie, the end of the session is considered when the browser is closed. However, some browsers like Chrome have “Clear cookies and site data when you quit Chrome” feature, which is by default disabled. As a result, the session does not really end even when the browser is closed.
• Cross-Site WebSocket Hijacking [31]: (WS) connections are another possible attack vector. An attacker can write some code in his malicious website that initiates a WS handshake with the target server. Once the victim visits the page, the browser will send an UPGRADE request header together with the session cookie. In a normal scenario, the server responds with CORS response headers that would prevent the cross-origin connection. However, the interesting fact is that WS does not respect SOP or CORS policy and the connection will actually be established. As a result, the attacker can now leak the CSRF token and forge successful CSRF attacks.
• Incorrect SOP relaxation (e.g. faulty CORS): is also a possible attack vector that can lead to CSRF token leakage. An over-permissive CORS that sets the response headers Access-Control-Allow-Origin: true and Access-Control-Allow-Credentials:true would leak the CSRF token of the victim. The attacker can simply perform a GET request to retrieve the CSRF-protected HTML form, read the response and steal the CSRF token. Then, he can continue performing a CSRF attack by providing the valid CSRF token.
Note: SameSite cookies appear to be the next attempt to prevent CSRF attacks. Although SameSite raises the bar for attackers, it is not perfect as well. For example, SameSite: Lax does not completely prevent CSRF. Three possible attacks can be exploited: 1) attacker can use top-level navigation (<a>
) to trigger GET-based CSRF. 2) “Client-side” CSRF can circumvent this mechanism and even send POST-based CSRF requests with cookies attached. 3) <portal>
, a new HTML tag that was introduced by Google in the end of 2019 for performant website framing. Until now, it is still a draft and only available on Google Canary. However, developers should be aware of the security risks. If the target web application is embedded into attacker’s site using <portal>
, then the browser will send the SameSite=Lax cookie even if it is a cross-origin request [32]. Recent attacks show that SameSite can also be circumvented with other means [33].
I hope that developers become aware of these attack vectors and read the documentation of the framework they use carefully. There is currently no framework that prevents all these attack vectors and if security is your priority, make sure to check that you are safe from the above-mentioned attack vectors.
References
- https://owasp.org/www-project-top-ten/2017/A7_2017-Cross-Site_Scripting_(XSS)
- https://doi.org/10.1145/54289.871709
- https://www.securityfocus.com
- https://web.archive.org/web/20000622042229/http://www.zope.org/Members/jim/ZopeSecurity/ClientSideTrojan
- https://crypto.stanford.edu/cs155old/cs155-spring08/papers/Session_Riding.pdf
- https://homes.cs.washington.edu/~yoshi/papers/czeskis-arls.pdf
- https://tools.ietf.org/html/rfc7617
- https://www.kb.cert.org/vuls/id/584089/
- https://www.kb.cert.org/vuls/id/264385/
- https://people.eecs.berkeley.edu/~daw/teaching/cs261-f11/reading/csrf.pdf
- https://dl.acm.org/doi/10.1145/1455770.1455782
- https://twitter.com/TwitterSupport/status/25614603915
- https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html#synchronizer-token-pattern
- https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html#double-submit-cookie
- https://cheatsheetseries.owasp.org/cheatsheets/Cryptographic_Storage_Cheat_Sheet.html
- https://cheatsheetseries.owasp.org/cheatsheets/Key\_Management\_Cheat\_Sheet.html\#key-management-lifecycle-best-practices
- https://nvd.nist.gov/vuln/detail/CVE-2020-11825
- https://owasp.org/www-chapter-london/assets/slides/David_Johansson-Double_Defeat_of_Double-Submit_Cookie.pdf
- http://www.breachattack.com/resources/BREACH%20-%20SSL,%20gone%20in%2030%20seconds.pdf
- https://blog.jeremiahgrossman.com/2006/08/i-know-where-youve-been.html
- https://en.wikipedia.org/wiki/Clickjacking
- https://www.contextis.com/media/downloads/Next_Generation_Clickjacking.pdf
- https://media.blackhat.com/bh-ad-11/Lundeen/bh-ad-11-Lundeen-New_Ways_Hack_WebApp-WP.pdf
- http://homakov.blogspot.com/2013/03/hacking-github-with-webkit.html
- https://www.sjoerdlangkemper.nl/2020/05/27/overwriting-httponly-cookies-from-javascript-using-cookie-jar-overflow/
- https://cwe.mitre.org/data/definitions/208.html
- https://www.ietf.org/rfc/rfc2616.txt
- https://github.com/hapijs/crumb/issues/4
- http://blog.nibblesec.org/2014/05/nodejs-connect-csrf-bypass-abusing.html
- https://en.wikipedia.org/wiki/Replay_attack
- https://christian-schneider.net/CrossSiteWebSocketHijacking.html
- https://research.securitum.com/security-analysis-of-portal-element/
- https://www.facebook.com/notes/facebook-bug-bounty/client-side-csrf/2056804174333798/?fref=mentions