Weborama/storm-crawler

storm-crawler

A low-latency, large-scale crawler based on Storm (and possibly HBase). For now it is merely a field for experimentation, but hopefully it will provide some Spouts and Bolts that are useful for building crawlers.
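
For readers new to Storm's vocabulary, the split between Spouts (sources that emit items) and Bolts (steps that consume an item and emit results) can be sketched in plain Java. This is only an illustration of the idea under stated assumptions: it does not use Storm's API, and the names CrawlSketch, UrlSource, Step, seedSource, hostExtractor and run are all hypothetical and do not come from this repository.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Queue;

/** Plain-Java illustration of the spout/bolt split; NOT Storm's API. */
public class CrawlSketch {

    /** Spout role (hypothetical interface): emits one URL at a time. */
    interface UrlSource {
        String next(); // returns null when nothing is left to emit
    }

    /** Bolt role (hypothetical interface): consumes one input, emits zero or more outputs. */
    interface Step {
        List<String> process(String input);
    }

    /** In-memory source backed by a fixed seed list. */
    static UrlSource seedSource(List<String> seeds) {
        Queue<String> queue = new ArrayDeque<>(seeds);
        return queue::poll; // poll() returns null once the queue is empty
    }

    /** Example step: keep only http(s) URLs and emit their lower-cased host. */
    static Step hostExtractor() {
        return url -> {
            try {
                URI uri = URI.create(url);
                String scheme = uri.getScheme();
                String host = uri.getHost();
                boolean isHttp = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
                if (host == null || !isHttp) {
                    return List.of(); // drop anything that is not a web URL
                }
                return List.of(host.toLowerCase(Locale.ROOT));
            } catch (IllegalArgumentException e) {
                return List.of(); // drop strings that are not valid URIs
            }
        };
    }

    /** Drains the source through the step, collecting everything the step emits. */
    static List<String> run(UrlSource source, Step step) {
        List<String> out = new ArrayList<>();
        for (String url = source.next(); url != null; url = source.next()) {
            out.addAll(step.process(url));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> seeds = List.of(
                "http://Example.com/a",
                "ftp://files.example.org/x",
                "https://example.net/");
        System.out.println(run(seedSource(seeds), hostExtractor()));
    }
}
```

In a real topology the source and the step would implement Storm's own spout and bolt interfaces and run distributed across workers; the single-threaded loop in run only stands in for that wiring.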

Install Maven and run the following command to generate the full jar:

mvn clean assembly:assembly

Then launch the topology with Storm:

storm jar storm-crawler-0.1-SNAPSHOT-jar-with-dependencies.jar com.digitalpebble.storm.crawler.CrawlTopology crawl

Mailing list: http://groups.google.com/group/digitalpebble
