Apache Solr – Blazing-fast open source enterprise search platform
Apache Solr and Elasticsearch are both search server platforms built on Apache Lucene, a Java library and API that provides search capabilities for applications. Both extend the Lucene API by adding features on top of it. Although the two offer similar features, Apache Solr is a trusted and proven search platform backed by the Apache brand, and it powers some of the most heavily trafficked websites, such as eBay, Magento, Netflix, Disney, Instagram, and Ticketmaster.
Apache Solr is a blazing-fast open source enterprise search platform that offers powerful full-text search, hit highlighting, faceted search, NoSQL features, dynamic clustering, database integration, rich document (Word, PDF, etc.) handling, and geospatial search. Solr is highly reliable, scalable, and fault-tolerant, providing distributed indexing, replication, centralized configuration, load-balanced querying, automated failover, and more.
Solr runs as a standalone enterprise search server inside the Jetty servlet container and exposes REST-like HTTP/XML/JSON APIs, which ease data exchange with external data sources such as databases, XML, JSON, and CSV. It can output data as XML, JSON, CSV, PHP, Python, Ruby, Velocity, XSLT, or native Java. Elasticsearch, on the other hand, offers a more limited set of exchange formats (JSON/HTML/XML).
Shards are the partitioning unit of the Lucene index, and both Solr and Elasticsearch have them. You can distribute your index by placing shards on different machines in a cluster. If you split your index into 10 shards on day one and two years later want to add another 5, Solr supports shard splitting, which creates more shards by splitting existing ones.
Solr supports distributed group by (including grouped sorting, filtering, and faceting); Elasticsearch does not. Solr also provides analyzers, tokenizers, and filters to break down textual data and manipulate it as needed.
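As a rough illustration of the REST-like API and of shard splitting, the sketch below only builds request URLs and never contacts a server. It assumes a Solr instance at http://localhost:8983/solr and a collection named products; both are hypothetical, as are the helper function names. The wt parameter selects the output format, and SPLITSHARD is the Collections API action available when Solr runs in SolrCloud mode.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance on the default port.
SOLR_BASE = "http://localhost:8983/solr"


def select_url(collection, query, wt="json", rows=10):
    """Build a URL for the /select handler; wt picks the response format."""
    params = urlencode({"q": query, "wt": wt, "rows": rows})
    return f"{SOLR_BASE}/{collection}/select?{params}"


def split_shard_url(collection, shard):
    """Build a Collections API URL asking SolrCloud to split one shard."""
    params = urlencode(
        {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    )
    return f"{SOLR_BASE}/admin/collections?{params}"


# "products" and "shard1" are hypothetical names used for illustration.
print(select_url("products", "name:laptop", wt="xml"))
print(split_shard_url("products", "shard1"))
```

Sending either URL with any HTTP client (a browser, curl, or urllib.request) is enough to talk to Solr, which is what makes the API easy to integrate with other systems.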
Solr provides a mechanism to import data from relational databases: the Data Import Handler (DIH), which imports content from a data store and indexes it. In addition to relational and NoSQL databases, DIH can index content from HTTP-based data sources such as RSS and Atom feeds, e-mail repositories, and structured XML where an XPath processor is used to generate fields. DIH supports a complete import of data through full-import and a partial or incremental import through delta-import. It also supports custom transformers for the various transformations to be applied to the data being indexed. Separately, Index Handlers in Solr are Request Handlers designed to add, delete, and update documents in the index.
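The two import paths described here can be sketched in a few lines. The example assumes a Solr version that still ships DIH, a DIH request handler registered at /dataimport, and a core called products; the core name, the document fields, and the helper names are all hypothetical. It only prepares the requests so that it runs without a server.

```python
import json
from urllib.parse import urlencode

# Assumption: a local core named "products" (hypothetical).
SOLR_CORE = "http://localhost:8983/solr/products"


def dih_url(command):
    """URL that triggers a DIH command such as full-import or delta-import."""
    return f"{SOLR_CORE}/dataimport?{urlencode({'command': command})}"


def update_url(commit=True):
    """URL of the JSON update handler, optionally committing right away."""
    return f"{SOLR_CORE}/update?{urlencode({'commit': str(commit).lower()})}"


def add_body(docs):
    """JSON body that adds (or replaces) a list of documents."""
    return json.dumps(docs)


def delete_body(doc_id):
    """JSON body that deletes one document by its unique key."""
    return json.dumps({"delete": {"id": doc_id}})


print(dih_url("full-import"))   # rebuild the index from the data source
print(dih_url("delta-import"))  # pick up only rows changed since last run
print(update_url(), add_body([{"id": "1", "name": "laptop"}]))
print(update_url(), delete_body("1"))
```

The DIH URLs are plain GET requests, while the update bodies would be sent as a POST with a Content-Type of application/json.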
Solr has a bigger, more mature user, developer, and contributor community than Elasticsearch. Solr is developed by contributors to the Apache Software Foundation (ASF) through an open, meritocratic process that goes well beyond simply sharing the source code. Visit the ASF website to learn more.
Here are some informative pages for reading and learning more about Solr:
http://wiki.apache.org/solr/
https://cwiki.apache.org/confluence/display/solr/
http://lucene.apache.org/solr
http://solr-vs-elasticsearch.com/
https://thinkbiganalytics.com/solr-vs-elastic-search/