🥳 It has been 2 years since the launch of Intelligence X!
At this point we would like to thank our users & customers for their trust and we look forward to the future.
-Intelligence X Team
Make sure to follow us on Twitter for the latest updates.
The New York Post published an alleged Hunter Biden email. We looked into it in our blog post “An OSINT investigation into the alleged Hunter Biden email”. tl;dr: The email addresses in the alleged email appear to be accurate. This neither validates nor refutes the content of said email.
We have indexed 29 published Hunter Biden emails here: https://intelx.io/?did=558538b2-c441-48f9-9bd5-1e4d19a14e03. Use the “Tree View” tab to navigate between the documents.
We added support for .bit and .bazar domains. These are decentralized top-level domains by Namecoin and Emercoin, respectively. They store the DNS records on a blockchain, so only the owner of the private key can modify them.
Our Phonebook also supports these TLDs and provides discovery of such domains.
We have updated our Maltego Transform to v4. We added an “Intelligence X Selectors Transform” to pivot from a search result to entities, i.e. listing all URLs, domains, etc. from the given search result.
You can watch this GIF showing how it lists all domains and URLs from the original Silkroad site.
Another new transform is the “History Transform”, which lists all historical copies of a website indexed by Intelligence X.
We developed a native SQL parser for our indexer. While some terms like email addresses were recognized in SQL dumps by our generic text parser, others such as IPs were not. The new algorithm is now running over the existing dataset.
Writing a generic SQL parser is tricky since PostgreSQL, MySQL, and MSSQL have slightly different syntaxes, in addition to different encodings and escaping techniques.
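To illustrate the escaping problem, here is a minimal sketch in Python. It is not our production parser; the function names `string_literals` and `extract_ipv4` are made up for this example. It assumes single-quoted string literals, accepts both the doubled-quote escape ('') and the backslash escape that MySQL allows by default, and then looks for valid IPv4 addresses inside each literal.

```python
import ipaddress
import re

# Candidate IPv4 pattern; real validation is done by the ipaddress module.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def string_literals(sql):
    """Yield the contents of single-quoted SQL string literals.

    Assumption: a backslash escapes the next character (MySQL default)
    and a doubled quote ('') stands for one literal quote.
    """
    i, n = 0, len(sql)
    while i < n:
        if sql[i] != "'":
            i += 1
            continue
        i += 1  # skip the opening quote
        chars = []
        while i < n:
            c = sql[i]
            if c == "\\" and i + 1 < n:
                chars.append(sql[i + 1])  # keep the escaped character as-is
                i += 2
            elif c == "'":
                if i + 1 < n and sql[i + 1] == "'":
                    chars.append("'")  # doubled quote is a literal quote
                    i += 2
                else:
                    i += 1  # closing quote
                    break
            else:
                chars.append(c)
                i += 1
        yield "".join(chars)


def extract_ipv4(sql):
    """Return the valid IPv4 addresses found inside string literals."""
    found = []
    for literal in string_literals(sql):
        for match in IPV4_CANDIDATE.finditer(literal):
            try:
                ipaddress.IPv4Address(match.group())
            except ValueError:
                continue  # not a valid address, e.g. 999.1.1.1
            found.append(match.group())
    return found
```

Calling `extract_ipv4` on an INSERT statement returns only the valid addresses and skips values such as 999.1.1.1. A dialect that treats the backslash as an ordinary character (for example PostgreSQL with standard-conforming strings) would need a different branch, which is exactly why a single generic parser is hard to get right.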
As part of our ongoing quality efforts, we have improved our spam detection algorithm. It deleted 2 billion records in the Public Web bucket.
In our German Web bucket alone, it removed 6 million subdomains, totaling 1 TB of data, that were detected as spam.
Moving forward our web crawlers are more efficient. 🕷
In October 2019 we started crawling Craigslist, indexing it and making historical copies. The project ran for about 6 months, but the data was never unlocked on intelx.io for public access. Stats on the data:
🔹 644 GB of data
🔹 16,857,052 website copies
🔹 464,776,016 total selectors
Since this is low-value, low-quality data and we have a data minimization policy, we destroyed it. A fun fact: in all the indexed Craigslist data, there was only 1 Bitcoin address present, along with the message “The Bitcoin wishing well is here. Make a wish and send some btc. The more btc you send the more wishes you get.”
Kleissner Investments s.r.o., Na Strzi 1702/65, 14000 Prague, Czech Republic