Ex-Crawler Server 0.1.6 Alpha released! See the Changelog for more details.
The Ex-Crawler Project is divided into three parts, which together provide a flexible and powerful (web) crawler and search engine supporting distributed (volunteer and grid) computing.
The main part, the Ex-Crawler Server / Daemon, is a highly configurable web crawler (HTTP and some other protocols) written in Java. It comes with its own server, where you can configure the crawler, add new URLs and more, including a user management system. Ex-Crawler supports distributed grid / volunteer crawling with a graphical (Swing) GUI client, so that even non-advanced users can easily contribute to your search engine. You can configure the Ex-Crawler daemon to crawl only sites from selected countries (based on top-level domains such as .us or .de and on language detection), and you can restrict it to sites matching a given topic (experimental). The crawled information is stored in SQL databases (currently MySQL, PostgreSQL and MSSQL are supported).
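To illustrate the country restriction by top-level domain, here is a minimal sketch in Java. It is not taken from the Ex-Crawler source: the class name TldFilter and its method accepts are hypothetical, and the sketch covers only the top-level-domain check, not language detection.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of the Ex-Crawler code base. */
final class TldFilter {
    private final Set<String> allowedTlds = new HashSet<>();

    /** Accepts TLDs written with or without a leading dot, in any case (".de", "US"). */
    TldFilter(Set<String> tlds) {
        for (String tld : tlds) {
            String t = tld.toLowerCase(Locale.ROOT);
            allowedTlds.add(t.startsWith(".") ? t.substring(1) : t);
        }
    }

    /** Returns true if the URL's host ends in one of the allowed top-level domains. */
    boolean accepts(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // not a syntactically valid URI
        }
        if (host == null) {
            return false; // no host component, e.g. a mailto: address
        }
        int lastDot = host.lastIndexOf('.');
        if (lastDot < 0 || lastDot == host.length() - 1) {
            return false; // no top-level domain to compare
        }
        String tld = host.substring(lastDot + 1).toLowerCase(Locale.ROOT);
        return allowedTlds.contains(tld);
    }
}
```

With a filter created for ".de" and ".us", a URL such as http://example.de/page would be accepted, while http://example.com/ would be skipped.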
You can also easily add custom plugins (or develop your own) for almost any purpose without modifying the server source code; you don't even need to know the server code. Plugins work through interfaces that are triggered, for example, on every crawled image or website, on every new host, and so on. Ex-Crawler detects plugins automatically. Developing plugins for Ex-Crawler is really easy: nobody should need more than five (!) minutes to understand the system and develop their own. Many (static) helper classes are available for fast and easy plugin development. For more details about the plugin system and how to develop your own, take a look at the Developer resources.
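The general idea of an interface-based plugin can be sketched as follows. This is only an illustration under assumptions: the names CrawlPlugin, PageCounterPlugin and PluginDispatcher, as well as the callback signatures, are hypothetical and do not come from the Ex-Crawler API, and plugins are registered by hand here instead of being detected automatically. The Developer resources describe the real interfaces.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical callback interface; the real Ex-Crawler interfaces may look different. */
interface CrawlPlugin {
    default void onPageCrawled(String url, String html) {}
    default void onImageCrawled(String url) {}
    default void onNewHost(String host) {}
}

/** Example plugin: counts crawled pages per host and ignores all other events. */
final class PageCounterPlugin implements CrawlPlugin {
    private final Map<String, Integer> pagesPerHost = new HashMap<>();

    @Override
    public void onPageCrawled(String url, String html) {
        try {
            String host = URI.create(url).getHost();
            if (host != null) {
                pagesPerHost.merge(host, 1, Integer::sum);
            }
        } catch (IllegalArgumentException e) {
            // Skip URLs that cannot be parsed.
        }
    }

    int count(String host) {
        return pagesPerHost.getOrDefault(host, 0);
    }
}

/** Stand-in for the server side, which triggers the callbacks on every plugin. */
final class PluginDispatcher {
    private final List<CrawlPlugin> plugins = new ArrayList<>();

    void register(CrawlPlugin plugin) {
        plugins.add(plugin);
    }

    void pageCrawled(String url, String html) {
        for (CrawlPlugin plugin : plugins) {
            plugin.onPageCrawled(url, html);
        }
    }
}
```

Because the callbacks have empty default implementations, a plugin only overrides the events it cares about, which keeps small plugins short.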
The graphical distributed computing (crawling) client, written for your volunteers and aimed at non-advanced users, comes with a nice graphical user interface, an embedded database, server management, user stats, simple assistants, multi-language support and client user state detection. The client is currently available only in Subversion and is still under heavy development.
The third part, the web search engine written in PHP, brings a Content Management System to easily maintain the search engine. It supports templates through Smarty, so you can easily style it to your wishes. The application framework is forked from Joomla 1.5, so that Joomla extensions can easily be ported to the engine. Other features are automatic user language and country detection and multi-language support. User accounts are shared with the server and the graphical client (including stats), so if you create a user account on the website, you can also use it with the graphical client, for example. At the moment the search engine is also available only through Subversion and is still in very early alpha.
Search results are adjustable through ExRank (in development), so that you can easily tune the whole search behavior to your users.
A desktop search client is also planned. We're always looking for people who want to help.
This site is currently under construction and many areas are still missing. If you have any questions or problems, feel free to contact us using the SourceForge mailing list or the Bug Tracker.
You can also join our IRC channel:
#ex-crawler @ freenode.net