Open government is the governing doctrine which holds that citizens have the right to access the documents and proceedings of the government to allow for effective public oversight.
Wikipedia
After the recent firing of Christopher Krebs, the first director of the Cybersecurity and Infrastructure Security Agency (CISA), we decided to leap into action and preserve public data from US governmental websites.
We created a new category, “Government US”, which includes all Public Web data from these TLDs:
As a result, you can now access historical versions of the CISA website via https://intelx.io/?did=3dab2dd6-724f-4c66-916f-62c586ab7037. New copies are made every few days, so the archive catches any changes and preserves any content that might be deleted or altered by the current or a future administration. This data is available for free (you don’t even need an account).
Why historical websites look plain: For security reasons, we remove any JavaScript, images, and external references, including CSS files, which contain the style sheet information. As a result, you only see the bare HTML content, without backgrounds, colors, or images.
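To illustrate the kind of stripping described above, here is a minimal sketch in Python using only the standard library. It is not our production code: the class name `PlainHtmlSanitizer`, the function `sanitize`, and the exact lists of dropped tags and attributes are assumptions made for this example.

```python
from html import escape
from html.parser import HTMLParser

# Assumed policy for this sketch only (not the real crawler's rule set).
DROP_WITH_CONTENT = {"script"}   # remove the tag and everything inside it
DROP_TAGS = {"img", "link"}      # remove images and external references such as CSS


class PlainHtmlSanitizer(HTMLParser):
    """Re-emits HTML while leaving out scripts, images and external links."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag in DROP_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name.startswith("on"):  # inline event handlers, e.g. onload
                continue
            if value is not None and value.strip().lower().startswith("javascript:"):
                continue
            kept.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join([tag] + kept) + ">")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags: emit the start tag only, never a stray end tag.
        if tag not in DROP_WITH_CONTENT:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in DROP_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text: str) -> str:
    """Return html_text without scripts, images, or external stylesheet links."""
    parser = PlainHtmlSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Calling `sanitize(page_source)` on a fetched page returns markup that keeps the text and basic structure. Comments and the doctype are also dropped in this sketch, because the parser's default handlers ignore them.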
To search for historical versions of a particular US government domain, select the “Government US” category in the Advanced menu:
You will then see the website with all crawled URLs visualized as a tree:
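For readers curious how a flat list of crawled URLs can be turned into such a tree, the following is a rough sketch in Python. The function names `build_url_tree` and `render`, and the `example.gov` URLs in the usage note, are hypothetical and do not reflect how the intelx.io interface is actually built.

```python
from urllib.parse import urlsplit


def build_url_tree(urls):
    """Group URLs into nested dicts keyed by host, then by path segment."""
    tree = {}
    for url in urls:
        parts = urlsplit(url)
        node = tree.setdefault(parts.netloc, {})
        # Skip empty segments produced by leading, trailing, or doubled slashes.
        for segment in filter(None, parts.path.split("/")):
            node = node.setdefault(segment, {})
    return tree


def render(tree, indent=0):
    """Return the tree as a list of indented lines, sorted by name."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * indent + name)
        lines.extend(render(tree[name], indent + 1))
    return lines
```

For example, `print("\n".join(render(build_url_tree(["https://example.gov/about/team", "https://example.gov/news"]))))` lists the host on the first line with `about`, `team`, and `news` indented beneath it according to their depth.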
At Intelligence X, transparency is paramount. Our users have full access to our data set, and we are transparent about where the data comes from. If you click on a search result, a “Metadata” tab shows you all the details.
At the time of writing, the crawlers had been running for less than 24 hours, yet the dataset is already growing quickly: