A simple but powerful search algorithm in PHP, MySQL

Ezhil Kannan B R
Published in Criar Solutions
5 min read · Mar 6, 2019

Hello all. I’ve been working with PHP for a while, and in one of my projects I needed to implement a search algorithm. After some learning, I found an approach that works well.

Requirement:

Let’s assume we are building a tech aggregator website and we store each article’s URL, title and content in a MySQL database. Our task is to display the best-matching article URLs for the search terms provided by the user.

Creating the Database:

First of all, we have to create a table to store the given values, plus a separate column called “indexing” to store the indexing values. When a user enters a search string, we will search in this column, which keeps the search fast and effective. The database looks like this.
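A table along these lines would fit the description above. This is only a sketch: the table name `articles`, the `id` key and the column name `article_url` are assumptions, while `article_title`, `article_content` and `indexing` are the names used in the rest of this post.

```php
<?php
// Hypothetical schema for the aggregator. Only article_title, article_content
// and indexing are names taken from this post; the rest are assumptions.
$createTableSql = <<<SQL
CREATE TABLE IF NOT EXISTS articles (
    id              INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    article_url     VARCHAR(2048) NOT NULL,
    article_title   VARCHAR(255)  NOT NULL,
    article_content MEDIUMTEXT    NOT NULL,
    indexing        MEDIUMTEXT    NULL
)
SQL;

// Run it once against your own connection, for example:
// $db = new mysqli('localhost', 'user', 'password', 'tech_aggregator');
// $db->query($createTableSql);
echo $createTableSql, PHP_EOL;
```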

There are plenty of different methods for indexing, so choose one that is robust and suits your data. When a search query is received, we only look at the “indexing” column and return the results instead of scanning the full title and content of every article. Now let’s create a very basic layout for the search form. Let’s name it “sample_search.php”.
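A bare-bones version of the form could look like the sketch below. The field name `q` and the helper `render_search_form()` are assumptions made for illustration, not names from the original source code.

```php
<?php
// sample_search.php (form part only).
// Assumption: the search box is submitted via GET in a field named "q".
function render_search_form(string $query = ''): string
{
    // Escape the previous query before echoing it back into the page
    $value = htmlspecialchars($query, ENT_QUOTES, 'UTF-8');

    return '<form method="get" action="sample_search.php">'
         . '<input type="text" name="q" value="' . $value . '" placeholder="Search articles">'
         . '<button type="submit">Search</button>'
         . '</form>';
}

$q = (isset($_GET['q']) && is_string($_GET['q'])) ? $_GET['q'] : '';
echo render_search_form($q), PHP_EOL;
```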

Let’s ignore the UI and focus on the backend work. The form output looks like this

Now let’s start the indexing. What’s the big deal with indexing anyway? We can just compare and find the matching strings of a query, right?

Not quite. Let’s assume the user searches for the word “science”. We just search for the string “science” and return the content that matches it. Now, what if the user makes a typo and enters the search query as “synce”, “Science”, “SCience”, “SCIEnce”, “SCIENS” or “Sience”? If we follow the same procedure as above, we will most likely return an empty result.

TO ERR IS HYUMANN

Of course, most people know the correct spelling of science, but spelling mistakes still happen. There’s a clue that helps us fix this problem: even without any word-processing techniques, we can see that the words “synce”, “Sience”, “Science”, etc. all sound similar.

Here we will use a powerful function in PHP called metaphone(). It returns a key that approximates how the given word sounds in English, so similar-sounding words get the same key.

Now try running the following PHP code
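A small script like this one prints the metaphone key for each of the spellings mentioned earlier. The variable names are just for illustration.

```php
<?php
// The misspellings of "science" discussed above
$spellings = ['science', 'Science', 'SCience', 'SCIEnce', 'SCIENS', 'Sience', 'synce'];

$keys = [];
foreach ($spellings as $word) {
    $keys[$word] = metaphone($word);   // phonetic key for this spelling
    echo str_pad($word, 8), ' => ', $keys[$word], PHP_EOL;
}
```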

The output of the above code looks like this

Abraca Tabra! We get the same sound for the different spellings of science. Now, if we do the indexing in such a way that we store the sounds of each of the words in the content and title, we can easily return reliable search results for the given search string. Here are the steps followed for indexing:

1. First, establish the connection

2. Fetch the article_title, article_content from the database

3. For each row, split article_title and article_content into single words using the function explode(), and append metaphone($word) followed by a space to the variable $sound. Then update the row’s ‘indexing’ column with $sound.
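The three steps above can be sketched roughly as follows. The function names, the `articles` table and its `id` column are assumptions for illustration; only `article_title`, `article_content`, `indexing` and `$sound` come from the steps themselves.

```php
<?php
// Step 3, pure part: build the space-separated metaphone string for one article.
function build_sound_index(string $title, string $content): string
{
    $sound = '';
    foreach (explode(' ', $title . ' ' . $content) as $word) {
        $code = metaphone($word);      // non-letters are ignored by metaphone()
        if ($code !== '') {            // skip empty tokens such as double spaces
            $sound .= $code . ' ';     // append the key followed by a space
        }
    }
    return $sound;
}

// Steps 1-3 against the database.
// Assumption: $db is an open mysqli connection (step 1) and the table is
// called `articles` with an integer primary key `id`.
function index_all_articles(mysqli $db): void
{
    $rows   = $db->query('SELECT id, article_title, article_content FROM articles');
    $update = $db->prepare('UPDATE articles SET indexing = ? WHERE id = ?');

    foreach ($rows as $row) {
        $sound = build_sound_index($row['article_title'], $row['article_content']);
        $id    = (int) $row['id'];
        $update->bind_param('si', $sound, $id);
        $update->execute();
    }
}

echo build_sound_index('Hello', 'World'), PHP_EOL;
```

Note that explode(' ', ...) only splits on spaces, so words separated by a line break end up as one token. If your content contains line breaks, splitting with preg_split('/\s+/', ...) is one alternative.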

Now we have successfully indexed our title and content and stored them in the ‘indexing’ column. Now we will implement the search logic in the file ‘sample_search.php’

1. Once the query is submitted, we convert the search query into a metaphone string. Then we search for that string in the ‘indexing’ column using the LIKE operator in SQL.

But remember, while indexing, we stored the metaphone of each word in the content individually. So if the query has two or more words, like “HELLO WORLD”, the metaphone of the whole query won’t be found in the ‘indexing’ column. Hence we split the query into words and append the metaphone of each individual word to the search string.

Explanation:

metaphone(“Hello World”) = “HLWRLT”. But in our ‘indexing’ column it is saved as “HL WRLT” (with a space, as they are separate words). So we have to insert the space accordingly and then search for it.
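Here is a minimal sketch of that search step. The function names, the `articles` table and the `article_url` column are assumptions; the prepared statement is one reasonable way to run the LIKE query, not necessarily how the original source code does it.

```php
<?php
// Convert a search query into the same word-by-word format used in `indexing`.
function query_to_sound(string $query): string
{
    $codes = [];
    foreach (explode(' ', $query) as $word) {
        $code = metaphone($word);
        if ($code !== '') {
            $codes[] = $code;
        }
    }
    return implode(' ', $codes);   // "Hello World" becomes "HL WRLT"
}

// Assumption: $db is an open mysqli connection (get_result() needs mysqlnd).
function search_exact(mysqli $db, string $query): array
{
    $sound = query_to_sound($query);
    if ($sound === '') {
        return [];                 // nothing searchable in the query
    }

    // Metaphone keys contain no % or _ characters, so no LIKE escaping is needed
    $pattern = '%' . $sound . '%';
    $stmt = $db->prepare(
        'SELECT article_url, article_title FROM articles WHERE indexing LIKE ?'
    );
    $stmt->bind_param('s', $pattern);
    $stmt->execute();

    return $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
}

echo metaphone('Hello World'), ' vs ', query_to_sound('Hello World'), PHP_EOL;
```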

2. Now we display the results if the number of results is greater than zero.
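Rendering the matches could be as simple as the sketch below. It assumes each result row is an associative array with the keys `article_url` and `article_title`; the key `article_url` and the helper name are hypothetical.

```php
<?php
// Turn result rows into a simple HTML list of links.
function render_results(array $rows): string
{
    $html = '<ul>';
    foreach ($rows as $row) {
        // Escape everything that came from the database before printing it
        $url   = htmlspecialchars($row['article_url'], ENT_QUOTES, 'UTF-8');
        $title = htmlspecialchars($row['article_title'], ENT_QUOTES, 'UTF-8');
        $html .= '<li><a href="' . $url . '">' . $title . '</a></li>';
    }
    return $html . '</ul>';
}

// Example with a made-up row
echo render_results([
    ['article_url' => 'https://example.com/big-data', 'article_title' => 'Big Data 101'],
]), PHP_EOL;
```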

Sample Output for search query “data”

3. If the number of results is zero

In some cases, we don’t get any exact matches, but a part of the search query does match. For example, let the search string be “Magnificent Asia”. Let’s assume this exact string is not found, but there are separate articles containing “Asia” and “Magnificent”. So we split the search query into individual words and find all the articles that match the metaphone of any of them.

If we still don’t find any matches, then we display the message “No results found”.
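The fallback could be sketched like this: one LIKE condition per word, joined with OR. As before, the function names, the `articles` table and the `article_url` column are assumptions for illustration.

```php
<?php
// Distinct metaphone keys of the individual words in the query.
function query_word_sounds(string $query): array
{
    $codes = [];
    foreach (explode(' ', $query) as $word) {
        $code = metaphone($word);
        if ($code !== '') {
            $codes[] = $code;
        }
    }
    return array_values(array_unique($codes));
}

// Assumption: $db is an open mysqli connection (get_result() needs mysqlnd).
function search_any_word(mysqli $db, string $query): array
{
    $codes = query_word_sounds($query);
    if ($codes === []) {
        return [];
    }

    // One "indexing LIKE ?" per word, combined with OR
    $where    = implode(' OR ', array_fill(0, count($codes), 'indexing LIKE ?'));
    $patterns = array_map(function ($code) {
        return '%' . $code . '%';
    }, $codes);

    $stmt = $db->prepare("SELECT article_url, article_title FROM articles WHERE $where");
    $stmt->bind_param(str_repeat('s', count($patterns)), ...$patterns);
    $stmt->execute();

    return $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
}

// Usage, once a connection exists:
// $rows = search_any_word($db, 'Magnificent Asia');
// if (count($rows) === 0) { echo 'No results found'; }
print_r(query_word_sounds('Magnificent Asia'));
```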

You can also rank the articles using data like the number of visits, the number of shares, etc., but that’s for another post. Hope you all found this helpful.

Resources:

  1. http://php.net/manual/en/function.metaphone.php
  2. source code: https://github.com/Ezhilyo/search_algorithm

Seeking answers for the universe, the life and everything. (No, it’s not 42)