Upgrade WordPress Search with Elasticsearch

  • WordPress search is intentionally basic and not suitable for most business applications.
  • Elasticsearch is an enterprise-ready search application that tremendously improves the WordPress search experience.
  • Elasticsearch enables advanced search functionality like PDF document search, autocomplete, synonym search, intelligent suggestions, and many others.
  • WEBDOGS develops custom Elasticsearch solutions for enterprise clients running WordPress.

There’s a lot of buzz in the WordPress community about Elasticsearch and for good reason: Elasticsearch significantly upgrades basic WordPress search. Integration options range from simply “turning it on” in Jetpack to complex enterprise WordPress implementations, similar to what WEBDOGS has built for partners like ForeScout. But before we dive into recommendations, it’s important to understand just how different Elasticsearch is from out-of-the-box WordPress search, as well as what’s possible with Elasticsearch that simply isn’t with basic WordPress search.

WordPress Search – Intentionally Basic

The first thing to understand about basic WordPress search is that it’s intentionally basic! WordPress is a content management system (“CMS”), not an enterprise search application. The knowledge and expertise required to build a CMS versus a search application are completely different, and WordPress doesn’t pretend to do search well. That’s just not where the core team is focused.

So how does search in WordPress work? Well, it pretty much comes down to a few lines of code:

This snippet of code is from class-wp-query.php and is essentially the guts of what powers search in WordPress. The first few lines up to the foreach statement do some very basic cleanup of the search terms, making sure bad characters don’t get through.

Once inside the foreach statement, the party begins. WordPress builds a chain of MySQL conditions, one for each term. If a term is preceded by a -, it is treated as an exclusion: content containing that term is left out of the results.

We could go into a lot more detail, but the gist is this: WordPress has a completely unintelligent search algorithm that looks for exact matches of search terms in just three places: the content, the title, and the excerpt. No exact match for a term in those three places within the db? No matches at all. Pack it up. Do not pass go. No SERPs for you.
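
To make the behavior described above concrete, here is a minimal Python sketch of that style of query building. It is a simplified model for illustration, not the code from class-wp-query.php: the function name build_search_clause, the SEARCH_COLUMNS tuple, and the use of LIKE with wildcards are all assumptions of this sketch.

```python
import re
from typing import List, Tuple

# Hypothetical column names for the three places the article says are
# searched: the title, the excerpt, and the content.
SEARCH_COLUMNS = ("post_title", "post_excerpt", "post_content")


def build_search_clause(search: str) -> Tuple[str, List[str]]:
    """Return a SQL WHERE fragment and its parameters for a search string."""
    # Very basic cleanup: split on whitespace and drop empty terms.
    terms = [t for t in re.split(r"\s+", search.strip()) if t]

    clauses: List[str] = []
    params: List[str] = []
    for term in terms:
        # A leading "-" marks a term that must NOT appear in the post.
        exclude = term.startswith("-") and len(term) > 1
        if exclude:
            term = term[1:]

        # Included terms may match ANY column; excluded terms must be
        # absent from ALL columns.
        operator = "NOT LIKE" if exclude else "LIKE"
        joiner = " AND " if exclude else " OR "
        per_column = [f"{col} {operator} %s" for col in SEARCH_COLUMNS]
        clauses.append("(" + joiner.join(per_column) + ")")
        params.extend([f"%{term}%"] * len(SEARCH_COLUMNS))

    # Every term's condition must hold for a post to be returned.
    return " AND ".join(clauses), params
```

Calling build_search_clause("elastic -jetpack") yields one condition per term, joined with AND. Notice what is missing: no stemming, no synonyms, no typo tolerance, and no notion of which match is more relevant than another.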

Elasticsearch – an actual Search Application

Listen, WordPress is to search what rolling down a hill in a barrel is to travel. Elasticsearch is to search what flying in a Gulfstream is to travel. They’re just not at all the same.

It’s difficult to get into the inner workings of Elasticsearch because it is exceptionally complex, but the Elastic team provides a great initial (and fairly technical) overview in “Elasticsearch from the Bottom Up, Part 1.”

At the most basic level, Elasticsearch gets fed content (for instance from WordPress’ database via a plugin like ElasticPress) and “indexes” the content. Indexing in Elasticsearch works like this: Elasticsearch looks at each individual word within a piece of content, records that the word occurs in a given document, as well as the number of times the word occurs, and records all that within an internal database (of sorts). From there, it gets wildly more complex but suffice to say it is incredibly powerful because it has a central database that knows which documents contain which words, as well as the frequency of a word within a document (that gives us a sense of relevance).
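
The indexing idea described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Elasticsearch is actually implemented: the names tokenize, build_index, and search are hypothetical, and ranking by raw word counts is a deliberate simplification of real relevance scoring.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the documents containing it and how often.

    `documents` is a dict of {doc_id: text}. The result has the shape
    {word: {doc_id: occurrences}}.
    """
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for word, count in Counter(tokenize(text)).items():
            index[word][doc_id] = count
    return index


def search(index, query):
    """Return matching doc ids, highest total word frequency first."""
    scores = Counter()
    for word in tokenize(query):
        for doc_id, count in index.get(word, {}).items():
            scores[doc_id] += count
    return [doc_id for doc_id, _ in scores.most_common()]
```

Because the index is keyed by word, a lookup goes straight to the documents that contain a term instead of scanning every post, and the stored counts give a first, crude signal of relevance.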

But wait, there’s more! Elasticsearch is also able to take into consideration metadata for given content, as well as assign values that represent importance (or “relevancy”) to content. For instance, a developer may implement a custom taxonomy in WordPress called “Industry” that a business uses to categorize

There’s a bunch of other math and statistics that takes place to generate results, but that story isn’t as compelling as all of the other things Elasticsearch is capable of. Let’s explore some key features that will extend Google-like functionality to any site.

Level Up Search – Key Elasticsearch Features

Like any application, Elasticsearch comes configured a particular way out of the box and it’s up to professionals like WEBDOGS to optimize the configuration and connect Elasticsearch to an incoming stream of data. WEBDOGS leverages the ElasticPress WordPress plugin to establish a connection between WordPress and Elasticsearch. Since Elasticsearch is its own separate application, it also requires separate hosting. For Elasticsearch hosting we exclusively use Bonsai.io. Once WordPress is connected to Elasticsearch, the world of search opens up to some powerful functionality:

  • PDF and other document type indexing: A number of our partner organizations have multiple forms of content, from PDFs to Microsoft Word Documents, and many others. Elasticsearch can ingest these documents and index their content, making PDF and other documents searchable on your website.
  • Autocomplete: Many of us recall when Google rolled out autocomplete and it felt like magic to type in the first few letters of a word and watch their algorithms start to guess the search term. Elasticsearch enables similar, if slightly less advanced, autocomplete functionality based on search terms within a given index. In other words, if the word appears within the Elasticsearch term database, it will also appear in the autocomplete list.
  • Synonym search: By compiling a custom-configured dictionary of synonyms, Elasticsearch enables search queries like “white paper” to also show matches for a term like “case study.” Since this is completely customizable, each business is able to fine-tune their search synonyms to ensure their customers are getting the content they need most.
  • Intelligent suggestions: Typically used in conjunction with autocomplete, Elasticsearch has a number of “suggestion” APIs that enable the suggestion of terms and phrases, and even enable suggestions based on a filter (e.g. suggest terms more relevant to a certain content type). Suggestion is a powerful way to surface new or important content to users looking at potentially related topics on your site.
  • Query boosting: Many businesses have a number of customizations built into WordPress that house additional custom post type metadata. Sometimes, this metadata contains some of the most important content for a given post type. For example, consider a recipe site with custom meta fields for ingredients. The site may want to boost the importance of ingredient list content in search. That means if there’s a match for a search query in the ingredient list, that recipe would display very high up in the search results. Also, think back to synonym search. Say the recipe site has a large US and Australian audience and one of the ingredients is listed as “red bell pepper.” A synonym entry may exist for “capsicum” (the word Aussies tend to use for red bell pepper), so when an Australian user searches for “capsicum” the recipe with “red bell pepper” shows.
  • Related Posts: Elasticsearch has powerful document comparison technology that enables it to receive a document (e.g. the entire contents of a post) and find content within the Elasticsearch database that is likely related. This empowers content creators to surface related posts, pages, or other content types in an automated way, reducing total curation time and getting relevant content to site visitors.
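
To illustrate how synonym search and query boosting combine in the recipe example above, here is a small Python sketch. It only simulates the idea and makes no claim about how Elasticsearch computes scores internally: the SYNONYMS and FIELD_BOOSTS tables, the boost value of 3.0, and the functions expand_query, score, and search are all hypothetical names chosen for illustration.

```python
# Hypothetical synonym dictionary: query phrase -> equivalent phrases.
SYNONYMS = {"capsicum": ["red bell pepper"]}

# Hypothetical boosts: a match in the ingredient list counts three
# times as much as a match in the title.
FIELD_BOOSTS = {"title": 1.0, "ingredients": 3.0}


def expand_query(query):
    """Return the query phrase plus any configured synonyms."""
    phrase = query.lower()
    return [phrase] + SYNONYMS.get(phrase, [])


def score(doc, query):
    """Sum the boost of every field that contains the query or a synonym."""
    total = 0.0
    for phrase in expand_query(query):
        for field, boost in FIELD_BOOSTS.items():
            if phrase in doc.get(field, "").lower():
                total += boost
    return total


def search(docs, query):
    """Return titles of matching docs, best score first."""
    scored = [(score(doc, query), doc["title"]) for doc in docs]
    hits = [(s, title) for s, title in scored if s > 0]
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [title for _, title in hits]
```

With this setup, a search for “capsicum” still finds a recipe whose ingredient list says “red bell pepper,” and the ingredient match outranks a post that merely mentions capsicum in its title. In a real deployment the synonym list and the field boosts would live in the Elasticsearch index configuration and query rather than in application code.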

Next Steps

If you’d like to understand how Elasticsearch might benefit your organization, reach out to our team. There are a lot of ways to integrate Elasticsearch, starting with “out-of-the-box” basics and extending to custom enterprise solutions. The possibilities are limited only by imagination.