We just added native support to Intelligence X for the following data formats:
Native support means end-to-end support: from crawling and indexing files from various data sources, to processing them internally, to presenting them to the end user on the frontend at intelx.io. Indexing is the process of taking a file, reading it, and extracting any text, which makes it searchable.
intelx.io shows a text preview in the results and supports inline view: when you click on a result, the detailed view immediately shows the text of the document, without forcing you to leave the website or download the file locally.
Inline view of office documents is convenient, but it also has an important security benefit. If end users download and open unknown office documents (especially from the darknet), they risk running malicious embedded VBA macros and other exploits. Reading the extracted text inline avoids opening the original file at all.
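To give a rough idea of what "extracting any text" can look like for an office format, here is a minimal sketch in Python for PowerPoint files. It is not our production indexer, only an illustration. It relies on the fact that a .pptx file is a ZIP archive whose slides are stored as XML under ppt/slides/, with the visible text in DrawingML `<a:t>` elements. The function name `extract_pptx_text` is made up for this example, and sorting by slide file number is a simplifying assumption (the true slide order is defined elsewhere in the archive).

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; text runs inside slides are stored as <a:t> elements.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
# Slide parts inside the archive, e.g. ppt/slides/slide1.xml
SLIDE_RE = re.compile(r"^ppt/slides/slide(\d+)\.xml$")


def extract_pptx_text(source):
    """Return one string per slide, ordered by slide file number.

    `source` is a path or a binary file-like object containing a .pptx file.
    Hypothetical helper for illustration only.
    """
    slides = []
    with zipfile.ZipFile(source) as archive:
        # Collect (number, name) pairs so slide10 sorts after slide2.
        numbered = []
        for name in archive.namelist():
            match = SLIDE_RE.match(name)
            if match:
                numbered.append((int(match.group(1)), name))

        for _, name in sorted(numbered):
            root = ET.fromstring(archive.read(name))
            # Gather every non-empty text run on the slide.
            runs = [el.text for el in root.iter(A_NS + "t") if el.text]
            slides.append(" ".join(runs))
    return slides
```

Calling `extract_pptx_text("slides.pptx")` returns a list of strings that could then be fed to a search index. A real indexer would also need to handle untrusted input carefully (oversized archives, malformed or hostile XML), which this sketch leaves out.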
Intelligence X now natively supports all major office formats: Word, Excel, PowerPoint, and PDF.
Before, a PowerPoint file was displayed in the detailed view as “data salad” 🥗:
Now, you can see the text of the presentation (both in the preview and detailed view):
We launched a new product: “Identity Portal”! It allows users to find all lines in a text where a search term appears, and to download a list of leaked accounts under a specific domain or email address. This product is exclusively available on request to companies and governments. If you are interested, please contact us!