creepyCrawler

OSINT tool to crawl a site and extract useful recon info.

- Provide a starting URL and automatically gather URLs to crawl from hrefs, robots.txt, and the sitemap
- Extract useful recon info:
  - Emails
  - Social media links
  - Subdomains
  - Azure, AWS, and GCP cloud storage links
  - Files
  - Login pages
  - A list of crawled site links
  - HTML comments
  - IP addresses
  - Marketing tags (UA, GTM, etc.)
  - 'Interesting' findings such as frame ancestors content and resources that return JSON content
- Built-in FireProx to automatically create endpoints for each subdomain, rotate source IP, and clean up at the end
  - Forked and modified (chm0dx/fireprox) from the awesome ustayready/fireprox
- Headless (Playwright) option for sites that serve content via JavaScript
- HTTP/SOCKS proxy support
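
To give a feel for the kind of extraction listed above, here is a minimal sketch using only the Python standard library. It is not creepyCrawler's actual code: the names `LinkCollector` and `extract_recon`, the email regex, and the returned dictionary layout are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Deliberately simple email pattern (an assumption, not the tool's regex).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects absolute hrefs and HTML comments."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()
        self.comments = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.add(urljoin(self.base_url, value))

    def handle_comment(self, data):
        self.comments.append(data.strip())


def extract_recon(html, base_url, root_domain):
    """Return links, emails, subdomains, and comments found in one page."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()

    # Keep only emails at the root domain, lowercased and de-duplicated.
    emails = {
        e.lower() for e in EMAIL_RE.findall(html)
        if e.lower().endswith("@" + root_domain)
    }

    # A host counts as a subdomain if it ends with ".<root_domain>".
    subdomains = set()
    for link in parser.links:
        host = urlparse(link).hostname or ""
        if host.endswith("." + root_domain):
            subdomains.add(host)

    return {
        "links": sorted(parser.links),
        "emails": sorted(emails),
        "subdomains": sorted(subdomains),
        "comments": parser.comments,
    }
```

A real crawler would feed each newly discovered link back into a queue, respect the URL limit, and run the same extraction on every fetched page.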

Install

git clone https://github.com/chm0dx/creepyCrawler.git
cd creepyCrawler
pip install -r requirements.txt

Docker

docker build -t creepycrawler .
docker run -it creepycrawler args

Use

creepyCrawler.py --url URL [--email EMAIL] [--threads THREADS] [--limit LIMIT] [--proxy PROXY] [--headers HEADERS] [--fireprox]
                 [--region REGION] [--access_key ACCESS_KEY] [--secret_access_key SECRET_ACCESS_KEY] [--json] [--robots]
                 [--sitemap] [--suppress_progress] [--comments] [--tags] [--ips] [--headless]

Crawl a site and extract useful recon info.

optional arguments:
  -h, --help            show this help message and exit
  --url URL             An initial URL to target.
  --email EMAIL         A comma-separated list of email domains to look for. Defaults to the root domain of the passed-in URL.
  --threads THREADS     The max number of threads to use. Defaults to 10.
  --limit LIMIT         The number of URLs to process before exiting. Defaults to 500. Set to 0 for no limit (careful).
  --proxy PROXY         Specify a proxy to use.
  --headers HEADERS     Override defaults with the indicated headers. ex: "{'user-agent':'value','accept':'value'}"
  --fireprox            Automatically configure FireProx endpoints. Pass in credentials or use the ~/.aws/credentials file.
  --region REGION       The AWS region to create FireProx resources in.
  --access_key ACCESS_KEY
                        The access key used to create FireProx resources in AWS.
  --secret_access_key SECRET_ACCESS_KEY
                        The secret access key used to create FireProx resources in AWS.
  --json                Output in JSON format
  --robots              Search pages found in the robots.txt file
  --sitemap             Search pages found in the site's sitemap
  --suppress_progress   Only show final output
  --comments            Return HTML comments extracted from crawled pages
  --tags                Return tags (UA,GTM,etc.) extracted from crawled pages
  --ips                 Return IP addresses extracted from crawled page content
  --headless            Run in headless mode. Requires Playwright and deps (or use docker).
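
If you want to drive the crawler from another script, the options above can be assembled into an argument list. The sketch below only builds the command and does not run it. The helper name `build_command` is hypothetical, and it assumes that `--headers` accepts the Python-style dictionary string shown in the help text and that the script is invoked as `creepyCrawler.py` from the repository directory.

```python
import sys


def build_command(url, email_domains=None, headers=None, limit=None,
                  threads=None, json_output=True):
    """Hypothetical helper: build an argv list from the documented flags."""
    cmd = [sys.executable, "creepyCrawler.py", "--url", url]
    if email_domains:
        # --email takes a comma-separated list of email domains.
        cmd += ["--email", ",".join(email_domains)]
    if headers:
        # repr() of a dict yields the single-quoted form from the help text
        # (assumption: the tool parses this format).
        cmd += ["--headers", repr(headers)]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if json_output:
        cmd.append("--json")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`. Whether the `--json` output on stdout can be parsed directly with `json.loads` depends on how the tool prints progress, so adding `--suppress_progress` may be needed; treat that as something to verify rather than a guarantee.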
