Elasticsearch is a real-time search and analytics engine. It is open source and built on Apache Lucene.
It is designed to scale: it is distributed and has built-in node discovery, so a node can automatically recognize other Elasticsearch nodes and connect to them when required. Sharding is automatic and, at its core, very simple: every document has an identifier, and that identifier (hashed to a number) modulo the number of shards determines which shard the document goes to. Because of this, it can make smart decisions, such as where to route a query and where to send an update so that the work stays local. It also distributes queries, so a query sent to one node is passed on to the other nodes as well. All of the things one needs in a cloud environment come for free with Elasticsearch.
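The modulo rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Elasticsearch's internal code: the function names `pick_shard` and `group_by_shard` are made up for this example, and `zlib.crc32` is used as a stand-in hash simply because it gives the same result on every run.

```python
import zlib


def pick_shard(doc_id: str, num_shards: int) -> int:
    """Map a document identifier to a shard number (hypothetical helper)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Assumption: crc32 stands in for whatever hash the engine really uses.
    hashed = zlib.crc32(doc_id.encode("utf-8"))
    # The core rule: hashed identifier modulo the number of shards.
    return hashed % num_shards


def group_by_shard(doc_ids, num_shards):
    """Show which shard each identifier would land on."""
    shards = {n: [] for n in range(num_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_shards)].append(doc_id)
    return shards


layout = group_by_shard(["book-1", "book-2", "journal-7"], num_shards=5)
```

Because the same identifier always maps to the same shard, an update for `book-1` can be sent straight to the shard that already holds it. The rule also shows why the number of shards matters: changing `num_shards` changes where existing identifiers would land.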
It has a RESTful HTTP API, with client wrappers for nearly any language one can think of, and new wrappers appear frequently. One of the things about Elasticsearch is that it is all JSON. This fits the document model well, because documents are themselves JSON structures. If a book has several authors and each author has a last name, Elasticsearch puts that structure into its index so it can be searched.
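As a rough sketch of what "RESTful HTTP plus JSON" looks like in practice, the snippet below builds (but does not send) a request that would store a book with several authors. The node address `http://localhost:9200`, the index name `books`, the helper `build_index_request`, and the sample book are all assumptions made for this example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a node running locally


def build_index_request(index, doc_id, document):
    """Build a PUT request that would store `document` as JSON (hypothetical helper)."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Illustrative document: a book with several authors, each with a last name.
book = {
    "title": "An Example Book",
    "authors": [
        {"first_name": "Jane", "last_name": "Smith"},
        {"first_name": "John", "last_name": "Jones"},
    ],
}

request = build_index_request("books", 1, book)
# urllib.request.urlopen(request) would send it; it is left out here so the
# sketch runs without a server.
```

The document is sent exactly as it is structured, nested authors included, which is why the JSON format and the document model fit together so naturally.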
Elasticsearch is schema-less. It does field type recognition: the values in a JSON document are not just strings, so it can recognize a date, an integer, or a floating-point number. If no schema is provided, it assigns each field a type and tries to be smart about it. Every JSON document that is put in is stored as the source document. It also maintains a version number automatically, so each update increments the document's internal version.
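To make the idea of field type recognition concrete, here is a small Python sketch that guesses a type for each field of a JSON-like document. It is a simplification written for this article, not the rules Elasticsearch actually applies: the helpers `infer_field_type` and `infer_mapping` and the type labels they return are assumptions chosen for the example, and only ISO-formatted dates are detected.

```python
from datetime import datetime


def infer_field_type(value):
    """Guess a type label for one JSON value (hypothetical helper)."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            datetime.fromisoformat(value)  # assumption: only ISO dates count
            return "date"
        except ValueError:
            return "string"
    if isinstance(value, list):
        # An array takes the type of its first element in this sketch.
        return infer_field_type(value[0]) if value else "unknown"
    if isinstance(value, dict):
        return "object"
    return "unknown"


def infer_mapping(document):
    """Apply the guess to every top-level field of a document."""
    return {field: infer_field_type(value) for field, value in document.items()}


mapping = infer_mapping({
    "title": "An Example Book",
    "published": "2014-03-01",
    "pages": 320,
    "price": 24.5,
    "tags": ["search", "analytics"],
})
```

The point of the sketch is that a JSON value already carries enough information to make a reasonable first guess, which is what lets a document be indexed without declaring a schema up front.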
Elasticsearch has integrated faceting, which works very fast because it uses all the caches that are available. It also offers statistical aggregates, such as sum and average, on numeric fields, which are very powerful. Many of the queries one wants to run can be fulfilled this way. It supports many field types: strings, all the numeric types, geospatial, attachments, and arrays (arrays of numbers, arrays of strings). Documents can contain both sub-documents and nested documents.
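In current Elasticsearch versions, faceting and statistical aggregates are both expressed through the aggregations part of a search request. The sketch below only builds such a request body as a Python dictionary; the helper `build_aggregation_query`, the aggregation names, and the field names `genre` and `price` are assumptions made for illustration.

```python
import json


def build_aggregation_query(group_field, numeric_field):
    """Build a search body that groups by one field and sums/averages another
    (hypothetical helper)."""
    return {
        "size": 0,  # return only the aggregates, not the matching documents
        "aggs": {
            "by_group": {
                # Facet-style bucketing: one bucket per distinct value.
                "terms": {"field": group_field},
                "aggs": {
                    "total": {"sum": {"field": numeric_field}},
                    "average": {"avg": {"field": numeric_field}},
                },
            }
        },
    }


query = build_aggregation_query("genre", "price")
body = json.dumps(query, indent=2)  # the JSON that would be sent to a search endpoint
```

A request like this would return one bucket per genre, each with the total and average price of the books in it, so a summary that might otherwise need a separate reporting step comes back from a single search call.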
Elasticsearch makes sensible default assumptions about data, sharding, and configuration, so it can be used through its RESTful HTTP interface with little setup. It supports cross-index searching and multiple document types, for example searching across all the books, journals, and other documents that have authors. It is a really flexible tool that supports almost everything one would expect, and it is set to become the next evolution of search.
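Cross-index searching can be sketched as a single search request whose path lists several indices separated by commas. As before, the request is only built, not sent; the helper `build_cross_index_search`, the node address, the index names `books` and `journals`, and the field `authors.last_name` are assumptions for this example.

```python
import json
import urllib.request


def build_cross_index_search(base_url, indices, field, text):
    """Build one search request that covers several indices (hypothetical helper)."""
    index_list = ",".join(indices)  # e.g. "books,journals"
    query = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index_list}/_search",
        data=json.dumps(query).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


search = build_cross_index_search(
    "http://localhost:9200",  # assumption: a node running locally
    ["books", "journals"],
    "authors.last_name",
    "Smith",
)
```

One request of this shape would look for the author in both indices at once, which is the kind of search across books and journals described above.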