Google is working on a new feature called “Signed HTTP Exchanges” (SXG). According to their website:
Signed HTTP Exchanges (SXG), part of an emerging technology called Web Packages, is now available in Chrome 73. A Signed HTTP Exchange makes it possible to create “portable” content that can be delivered by other parties, and this is the key aspect, it retains the integrity and attribution of the original site.
The IPFS project discusses using that technology in this GitHub issue. They quote Mozilla with the following statement:
Mozilla has concerns about the shift in the web security model required for handling web-packaged information. Specifically, the ability for an origin to act on behalf of another without a client ever contacting the authoritative server is worrisome, as is the removal of a guarantee of confidentiality from the web security model (the host serving the web package has access to plain text). We recognise that the use cases satisfied by web packaging are useful, and would be likely to support an approach that enabled such use cases so long as the foregoing concerns could be addressed.
Mozilla hits the nail on the head with this concern. Once you introduce web packages that can be supplied by (potentially malicious) third parties, all kinds of bad things can happen:
To mount such an attack, an attacker would first collect copies of a target’s web packages, at least daily (assuming, of course, that the target supports web packages).
Then, once a vulnerable version is identified (even if years have passed between collecting the web package and discovering its vulnerability), the attacker can weaponize it. The attacker can even try to artificially stitch different pieces of a website together to achieve a certain outcome.
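To illustrate the collection step, here is a minimal Python sketch using the requests library (the target URL, output file naming, and Accept header value are assumptions for illustration, not details from any real attack):

```python
import datetime
import pathlib

import requests  # pip install requests

TARGET_SXG = "https://publisher.example.org/foo.sxg"  # hypothetical target

def archive_daily_copy(dest_dir: str = "sxg-archive") -> pathlib.Path:
    """Fetch and store today's copy of the target's signed exchange."""
    resp = requests.get(
        TARGET_SXG,
        headers={"Accept": "application/signed-exchange;v=b3"},
        timeout=30,
    )
    resp.raise_for_status()
    out = pathlib.Path(dest_dir)
    out.mkdir(exist_ok=True)
    # One timestamped file per day builds the archive described above.
    path = out / f"foo-{datetime.date.today().isoformat()}.sxg"
    path.write_bytes(resp.content)
    return path
```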
The specification discusses caching here (emphasis added):
When fetching a resource (https://distributor.example.org/foo.sxg) with the application/signed-exchange MIME type, the UA parses it, checks its signatures, and then if all is well, redirects to its request URL (https://publisher.example.org/foo) with a “stashed” exchange attached to the request. The redirect applies all the usual processing, and then when it would normally check for an HTTP cache hit, it also checks whether the stashed request matches the redirected request and which of the stashed exchange or HTTP cache contents is newer. If the stashed exchange matches and is newer, the UA returns the stashed response.
As written above, the browser checks the HTTP cache to see whether it holds a newer result than the provided SXG resource. This works, of course, only if the user has already visited the target site and received a version newer than the SXG. Users who have never visited the site, or who open the web package in a private browsing window, remain vulnerable.
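To make the freshness comparison concrete, here is a minimal Python sketch of the decision described in the quote, under simplifying assumptions (responses are reduced to a URL and a Date timestamp; a real UA applies the full HTTP caching rules and request matching):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StoredResponse:
    url: str     # request URL the response answers
    date: float  # Unix timestamp from the response's Date header
    body: bytes

def pick_response(stashed: StoredResponse,
                  cached: Optional[StoredResponse],
                  redirected_url: str) -> Optional[StoredResponse]:
    # The stashed exchange is only eligible if it matches the
    # redirected request.
    if stashed.url != redirected_url:
        return cached
    # If there is no cache entry, or the stashed exchange is newer,
    # the UA returns the stashed response.
    if cached is None or stashed.date > cached.date:
        return stashed
    # Otherwise the existing (at least as fresh) cache entry wins.
    return cached
```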
The actual protection against this type of attack is to use the expiration time (“expires”) field of the signature header and set it to a relatively near future date.
In addition, web servers should set the server-side “Cache-Control” and “Expires” headers, so that browsers can use them when comparing whether an HTTP cache hit is newer than the provided SXG.
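As an illustration, a minimal Flask handler that sets such headers could look like this (the route and the one-hour lifetime are hypothetical values; choose a lifetime that matches how quickly you want stale SXG copies to stop winning the comparison):

```python
from datetime import datetime, timedelta, timezone

from flask import Flask, make_response

app = Flask(__name__)

@app.route("/foo")
def foo():
    resp = make_response("...page content...")
    # A short cache lifetime limits how long an attacker-served SXG
    # can outrank the origin's copy in the browser's comparison.
    resp.headers["Cache-Control"] = "public, max-age=3600"
    expires = datetime.now(timezone.utc) + timedelta(hours=1)
    resp.headers["Expires"] = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
    return resp
```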
What if, for example, https://code.jquery.com/jquery-latest.min.js would (hypothetically) reference “jqueryupdate.com”, and the developers later decide to streamline their operations, switch to “update.jquery.com”, and abandon the old domain? An attacker could then register jqueryupdate.com and supply the old JS file via SXG. This would allow the attacker to locally inject arbitrary JS code into any website that references this jQuery file.
More attacks similar to the third-party library poisoning described above are possible.
The attacker could package up these SXGs into a single website and send it via email or share the link on social media.
As stated before, the only real protection against this is setting strict cache expiration and signature expiration times.
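For reference, the Signed Exchanges draft caps a signature’s lifetime at seven days after its “date” parameter. A client-side sketch of the validity-window check might look like this (the function itself is hypothetical; sig_date and sig_expires correspond to the Signature header’s date and expires parameters, as Unix timestamps):

```python
import time
from typing import Optional

# The Signed Exchanges draft rejects signatures that are valid for
# more than 7 days after their "date" parameter.
MAX_SIGNATURE_LIFETIME = 7 * 24 * 3600

def signature_window_ok(sig_date: int, sig_expires: int,
                        now: Optional[int] = None) -> bool:
    if now is None:
        now = int(time.time())
    if sig_expires - sig_date > MAX_SIGNATURE_LIFETIME:
        return False  # lifetime exceeds what the spec allows
    return sig_date <= now < sig_expires  # currently within the window
```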