How to Improve Product Search Algorithm to Meet Growing Customer Expectations

Modern eCommerce Solutions Based on Deep Learning Algorithms (Part 1)

Altimetrik Poland Tech Blog
13 min read · Jan 19, 2023

Series index:

  1. How to Improve Product Search Algorithm to Meet Growing Customer Expectations
  2. Visual Search Engine — the Future of Search (publication soon)
  3. Retrieval-Ranking Recommendation System for H&M Dataset (publication soon)

Introduction

This article is part of the series “Modern eCommerce solutions based on deep learning,” in which we dig into the mystery of using state-of-the-art algorithms to grow an eCommerce business, meet growing customer expectations and easily gain market advantage.

The product ranges of online stores grow every year, and it is becoming increasingly difficult for website visitors to navigate among thousands of products to find the right one. A customer will only make a purchase if he or she can find the right product quickly and easily, so retailers are racing to provide ever more convenient solutions.
Nowadays everything happens very fast, and people rush through daily tasks at breakneck speed. A good shopping system must keep up. One of the main reasons why a customer gives up searching in a store (when the recommendation/search system is not adequate) is the limited amount of time available for shopping. Improving the search engine therefore builds a good shopping experience for customers and increases their brand loyalty. If the search engine does not work well, the customer will become discouraged and move on to another store’s website.

Effective product search is the key to success in the race for customer affection. It might seem that this problem was solved long ago: search engines are now a common feature of every e-commerce site. But are these tools really effective? Have you ever seen a “no search results found” page after typing a phrase into an online store search engine? Or, on the contrary, received thousands of results that did not bring you any closer to buying your dream product? This means that popular solutions are not up to the task, and it’s time to look at the problem from a different perspective.

In this article, we will briefly discuss:

  • the business benefits of implementing an effective product search engine
  • the basic concept of how search engines work
  • an overview of the effectiveness of available search tools in popular online clothing stores
  • our alternative approach to the product search problem
  • conclusions from all of the above points

We will analyze everything on the basis of the fashion industry, but all conclusions and observations can be easily transferred to any line of e-commerce.

Figure 1 — Sample search bar from https://www.marksandspencer.com

Is it worth investing in search engines?

Search engines have a direct impact on every stage of the customer-store relationship.

Providing customers with a tool that allows them to search for products on a website efficiently, conveniently and quickly:

  • attracts more consumers
  • gains more conversions (increases the number of products purchased)
  • improves the user’s search experience and affects the customer’s overall online shopping experience
  • strengthens customer loyalty to the brand

Visitors who use the search function have a clear goal — they are customers who want to make a purchase, not just browse the store’s offerings. These customers have a higher purchase intention and generate a huge portion of the store’s profits.
If that’s not enough for you, look at the problem from the other side. If the search process is unfriendly, difficult, boring or doesn’t return the right results, the customer is quickly discouraged from exploring the site further, and the store may lose his or her interest. Most likely, he or she will never return. Losing customers is much more expensive than providing an effective product search tool for the site. So investing in search engines always makes sense!

How do search engines work?

From the customer’s perspective, the product search process is trivial (or at least it should be trivial) — the customer describes in a few words the product he or she is looking for, clicks “search”, immediately gets results that are exactly what is needed, and makes a purchase! Nothing could be simpler than that!

But how do search engines really work from a technical point of view?
Let’s break down the path from typing a phrase to making a purchase into a few basic steps performed sequentially:

  1. selecting from the store’s database those items that meet the description given by the customer (a long list of products);
  2. matching the results to the customer’s needs (intelligent ranking and selection of the most relevant items for a given customer based on the customer’s characteristics, browsing history and market trends);
  3. filtering of results (the user can narrow down the list of results using predefined filters based on product attributes — in order to detail the expectations of the product that were not given in the entered phrase).

Figure 2 — Steps in the search process between typing in a phrase and making a purchase
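
The three steps above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `Product` class, the `popularity` ranking signal and the tiny catalog are hypothetical, and a real store would use far richer retrieval and ranking logic.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    name: str
    tags: set[str]
    popularity: float = 0.0  # hypothetical ranking signal
    attributes: dict[str, str] = field(default_factory=dict)


def retrieve(query: str, catalog: list[Product]) -> list[Product]:
    """Step 1: keep products that share at least one word with the query."""
    words = set(query.lower().split())
    return [p for p in catalog if words & p.tags]


def rank(candidates: list[Product]) -> list[Product]:
    """Step 2: order candidates, here by a single popularity score."""
    return sorted(candidates, key=lambda p: p.popularity, reverse=True)


def apply_filters(ranked: list[Product], filters: dict[str, str]) -> list[Product]:
    """Step 3: keep products whose attributes match every chosen filter."""
    return [
        p for p in ranked
        if all(p.attributes.get(k) == v for k, v in filters.items())
    ]


# Hypothetical catalog used only for this illustration.
catalog = [
    Product("Red midi dress", {"red", "dress"}, 0.7, {"size": "M"}),
    Product("Red maxi dress", {"red", "dress"}, 0.9, {"size": "S"}),
    Product("Grey hoodie", {"grey", "hoodie"}, 0.8, {"size": "M"}),
]

results = apply_filters(rank(retrieve("red dress", catalog)), {"size": "M"})
print([p.name for p in results])
```

Note that steps 2 and 3 can only reorder or narrow what step 1 returned, which is why a failure in retrieval cannot be repaired later in the pipeline.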

Improving the last two steps (ranking and filtering) is extremely important in effective product search, and we will discuss them in detail in one of the next articles in this series (a part on modern recommendation systems).

However, if the algorithm does not work at the very beginning (the first step), even the most efficient processes in the subsequent steps will not result in a purchase. Therefore, here we focus on developing an effective algorithm for selecting products from the store’s offer that match the phrase typed in by the customer.
How does a search engine understand a site visitor’s input? Currently, most engines focus on searching by tags and/or natural language algorithms. In the next part, we will test the effectiveness of such algorithms in practice.
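
To make the idea of tag-based search more concrete, here is a minimal sketch of how such matching might look. The catalog, its tags and the `tag_search` function are hypothetical and written for illustration only; we do not know how any particular store implements its search.

```python
import string

# Hypothetical catalog: product name -> tags assigned by the store.
CATALOG = {
    "Pink cotton shirt": {"women", "pink", "shirt"},
    "Yellow handbag": {"yellow", "handbag", "strap"},
    "Grey hoodie": {"grey", "hoodie"},
}


def tokenize(query: str) -> list[str]:
    """Lowercase the query, strip ASCII punctuation and split on whitespace."""
    cleaned = query.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


def tag_search(query: str, require_all: bool = True) -> list[str]:
    """Return products whose tags contain all (or any) of the query words."""
    words = tokenize(query)
    match = all if require_all else any
    return [
        name for name, tags in CATALOG.items()
        if words and match(w in tags for w in words)
    ]


print(tag_search("pink shirt"))                     # every word is a known tag
print(tag_search("pink sirt"))                      # a typo breaks strict matching
print(tag_search("pink sirt", require_all=False))   # loose matching uses only some words
```

Strict matching fails as soon as one word is misspelled or missing from the tags, while loose matching returns results driven by only some of the words. Both behaviours show up in the review that follows.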

Review of search engines in popular clothing stores

We decided to personally check how effectively the major clothing stores perform search tasks and whether these search tools can actually facilitate the process of finding the desired product. For this purpose, we selected 5 popular clothing stores and checked the validity of search results for 20 phrases (divided into four categories).

The query results for each phrase and for each store were scored as follows:

  • 1.0 — the top results were relevant and matched the phrase entered;
  • 0.5 — some of the top returned products were not pertinent, additional filters had to be applied (e.g., selecting products for women, even though the phrase clearly indicated this), or the search engine suggested correcting spelling errors in order to search again;
  • 0.0 — the search engine returned no products or the top products were completely incorrect.

The result does not depend directly on the number of products indicated (correctly or not). The evaluation was conducted from the customer’s perspective and assessed whether the returned results brought the customer any closer to purchasing the product.
In addition, we verified that every type of product searched for was available in the current offerings of these stores, so that no search engine received a negative evaluation simply because the store did not carry the product.
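
The averaging behind this evaluation can be sketched as follows. The scores in the snippet are made-up placeholders (they are not the values from our tables), and the store and category names are hypothetical; the point is only to show how 1.0 / 0.5 / 0.0 scores turn into a percentage per category and overall.

```python
from statistics import mean

# Hypothetical scores (1.0, 0.5 or 0.0) for illustration only;
# these are NOT the values measured in the review.
scores = {
    "basic": {"store_a": [1.0, 1.0], "store_b": [1.0, 1.0]},
    "long": {"store_a": [1.0, 0.5], "store_b": [0.0, 0.5]},
}


def category_average(per_store: dict[str, list[float]]) -> float:
    """Average the scores of all phrases over all stores in one category."""
    return mean(s for phrase_scores in per_store.values() for s in phrase_scores)


averages = {cat: category_average(stores) for cat, stores in scores.items()}
overall = mean(
    s
    for stores in scores.values()
    for phrase_scores in stores.values()
    for s in phrase_scores
)

for cat, avg in averages.items():
    print(f"{cat}: {avg:.0%}")
print(f"overall: {overall:.0%}")
```

Because partial matches count as 0.5, an average of, say, 50% does not necessarily mean that half of the queries failed completely.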

The results are presented in the screenshot below.

Table 1 — Correctness of results for text search engines — English language

The average correctness of the results for all stores is shown in the last column of Table 1 (separately for each phrase complexity category). The overall average score for all analyzed phrases is 62%. In other words, the search engines earned less than two thirds of the available points; the remaining queries returned nothing, returned incorrect products or required extra work from the customer. From a retailer’s perspective, these are huge untapped product sales opportunities! Detailed results for each phrase complexity category are discussed in the subsections below.

Basic phrases

Analyzing the results more closely, we can see that the current tools are effective (100% correct) for simple, general queries (e.g. “red dress” or “grey hoodie”).

Figure 3 — Search results for the phrase ‘grey hoodie’. Source: https://www2.hm.com/en_gb

These types of queries are often only the first step on the path to a purchase: because they are so general, they can return a huge number of results (1,469 products in the screenshot below), and the customer has to apply further filters to find the right product.

Figure 4 — Search results for ‘red dress’ phrase. Source: https://www.next.co.uk/

Long phrases

As the query becomes more complex, the customer’s expectations are more precisely defined. Naturally, this narrows the list of returned products (e.g., exactly 128 returned products in the screenshot below), and the customer’s path to find and purchase the desired product is shorter.

Figure 5 — Search results for ‘knee-length dress with a floral pattern’ phrase. Source: https://www.next.co.uk/

Unfortunately, the longer the phrase, the lower the effectiveness of search engines (the average effectiveness for this category is only 56%). From a business point of view, this is a very big problem, because customers who look for a specific product (indicated by typing in a long phrase) probably want to make a purchase. The results show that almost half the time they don’t find it. A customer who is loyal to the brand will probably try to look for the product in another way; others will move to a competitor’s site.

What results does the customer receive in cases where a search for the full phrase has failed?

Two types of responses can be distinguished:

  • a simple “no search results found” page — a result that completely discourages further searches on that page;
  • results based only on certain words of the phrase — this approach could be quite effective after applying additional filters to the page. Unfortunately, we have repeatedly observed that the words selected by the search engine from the entire phrase entered (the basis for displaying results) do not even reflect the type of product searched for (e.g., “sports lady” instead of “sports lady swimsuit” in the screenshot below, despite the fact that the store offered swimwear), making the returned product suggestions useless.

Figure 6 — Search results for ‘sports lady swimsuit’ phrase. Source: https://www.zara.com/

Long phrases with punctuation marks

The results in Table 1 show that the weakest point of the analyzed search engines is phrases containing punctuation marks. The average effectiveness for this category is only 40%!

As with the previous category of queries, most search engines return no results or results based only on certain words from the phrase, e.g.:

  • ‘yellow’ instead of ‘yellow, fashion handbag with strap’ in Figure 7 — which does not even reflect the type of product;
  • ‘handbag, with strap’ in Figure 8 — correctly identifying the product type, but definitely requiring further filtering of the results to find the desired product.

Figure 7 — Search results for ‘yellow, fashion handbag with strap’ phrase. Source: https://www.next.co.uk/
Figure 8 — Search results for ‘yellow fashion handbag with strap’ phrase. Source: https://www.marksandspencer.com/

Phrases with spelling errors

The last category analyzed was spelling errors (which happen to every user). The average effectiveness for this category is only 50%.

Here we observed exactly the same problems as in the section above:

  • a “no search results found” page
  • results based only on certain words from the phrase

Examples of erroneous results are shown below in Figure 9 and Figure 10.

Figure 9 — Search results for ‘women’s pink sirt [shirt]’ phrase. Source: https://www2.hm.com/en_gb
Figure 10 — Search results for ‘floarl [floral] pijama shirt and bottoms’ phrase. Source: https://www.marksandspencer.com/

English is only one example of a language that customers can use when searching for a product, so we repeated the same exercise in a more complicated language: Polish. The same phrases were translated from English to Polish and tested on the websites of popular Polish stores. The results are shown in Table 2.

Table 2 — Correctness of results for text search engines — Polish language

The effectiveness of the available tools drops sharply in every category. The average correctness of all results is 48%. Even for the simplest queries, not every search engine was able to return a correct result. Methods based only on tag searches or Natural Language Processing (NLP) algorithms do not perform well enough at this level of language complexity.

An alternative approach to the product search problem

We have seen above that effective product search is not easy, so it is worth looking at this challenge in an alternative way. That is what we would like to propose in this article!

How to search? The phrase typed in by the visitor could be compared directly with photos (not text or tags!) of the products on offer. A photo contains many times more information than any text description. Therefore, it enables the AI system to intelligently understand the search intent regardless of spelling errors or the level of complexity of the words used, and in fact gives the most accurate search results.

Figure 11 — Images of products from H&M Kaggle competition

Such an algorithm:

  • pays attention to all product details (pattern, shape, texture, color, …) that can be reflected in the search phrase;
  • is independent of the quality of the database of product descriptions and tags (currently, an e-commerce product search algorithm is only as good as the data it has to work with);
  • analyzes the meaning of a phrase instead of individual words;
  • understands synonyms, long and complex phrases, punctuation marks, plural and singular search words and spelling errors.

Seems unattainable? It isn’t. Modern deep learning algorithms allow you to compare text with a photo and do it quickly and efficiently!

We have developed the following solution architecture:

Figure 12 — Architecture graph for the proposed solution

We tested various multimodal neural network models to find the best numerical representations for text and image inputs. These models can combine knowledge of linguistic concepts with semantic knowledge of images and represent them in the same “language” (embedding space). This allows easy comparison of objects of the two classes (images and texts) by defining a similarity measure for the embedding vectors, such as cosine similarity. Finally, products with vectors most similar to the vector for the embedded text are returned (Figure 13).

Figure 13 — Similarity for text and image embeddings
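
The comparison step described above can be illustrated with a small, self-contained sketch. The three-dimensional vectors below are toy numbers standing in for real embeddings (a multimodal model would produce vectors with hundreds of dimensions), and the `search` function is our own hypothetical helper, not part of any library.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(
    text_vec: list[float],
    image_vecs: dict[str, list[float]],
    k: int = 2,
) -> list[tuple[str, float]]:
    """Return the k products whose image vectors are closest to the text vector."""
    scored = [
        (name, cosine_similarity(text_vec, vec))
        for name, vec in image_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors standing in for image embeddings of three products.
image_embeddings = {
    "red dress": [0.9, 0.1, 0.0],
    "grey hoodie": [0.0, 0.2, 0.9],
    "red skirt": [0.7, 0.6, 0.1],
}
# Toy vector standing in for the embedding of the typed phrase.
text_embedding = [1.0, 0.2, 0.0]

for name, score in search(text_embedding, image_embeddings):
    print(f"{name}: {score:.3f}")
```

In this setup the image embeddings can be computed once for the whole catalog, so at query time only the typed phrase needs to be embedded and compared.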

This scheme is easily adaptable and effective for any language by training models on the right set of phrases.

To test the effectiveness of such a solution, we took data from the Kaggle competition “H&M Personalized Fashion Recommendations”. This dataset contains information on more than 100,000 products and their high-quality images.
The results of the text-image search for the phrases analyzed in the previous examples are shown in the images below (from Figure 14 to Figure 21). The search was performed only on the basis of information obtained from the images (product descriptions and tags were not used).

Figure 14 — Results of searching by text-to-image algorithm for phrase ‘red dress’
Figure 15 — Results of searching by text-to-image algorithm for phrase ‘grey hoodie’
Figure 16 — Results of searching by text-to-image algorithm for phrase ‘knee-length dress with a floral pattern’
Figure 17 — Results of searching by text-to-image algorithm for phrase ‘sports lady swimsuit’
Figure 18 — Results of searching by text-to-image algorithm for phrase ‘yellow, fashion handbag with strap’
Figure 19 — Results of searching by text-to-image algorithm for phrase ‘classic, leather, black shoes’
Figure 20 — Results of searching by text-to-image algorithm for phrase ‘women’s pink sirt [shirt]’
Figure 21 — Results of searching by text-to-image algorithm for phrase ‘floarl [floral] pijama shirt and bottoms’

We can see that the algorithm successfully interpreted the main meaning of each of the phrases entered. We calculated a measure of search efficiency for each category to objectively compare the results with previous analyses. The average efficiency for the proposed approach is 94%, which is significantly higher than previous results for each category. Detailed results are shown in Table 3.

Table 3 — Comparison of search performance of all methods

In addition, for our new approach, we did not observe any of the problems appearing for current search engines from popular clothing stores:

  • no “no search results found” page
  • no focus on a single word in complex phrases
  • no results completely unrelated to the phrase entered

This means that the analyzed text-image model can be a great base for search engines, which will allow retailers to forget about losing customers due to problems with finding the right product in the store’s offer.
This is a breakthrough solution in the field, as it is free from the basic limitations of traditional algorithms. The algorithm is designed to find the connection between text and image, so it returns results even when the phrase typed in is not the simplest and most obvious definition of the product the customer is looking for.
If this approach is complemented additionally with traditional product descriptions and basic tags, it guarantees virtually 100% effectiveness of search results for any type of phrase.
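
One possible way to combine the two sources of information is a weighted blend of the image-based similarity and a simple tag score. The sketch below is an assumption on our side, not the exact method we used: the weight `alpha`, the `tag_overlap` function and the candidate data are hypothetical.

```python
def tag_overlap(query: str, tags: set[str]) -> float:
    """Fraction of query words that appear among the product's tags."""
    words = query.lower().split()
    return sum(w in tags for w in words) / len(words) if words else 0.0


def hybrid_score(image_similarity: float, tag_score: float, alpha: float = 0.7) -> float:
    """Weighted blend; alpha is an assumed weight for the image signal."""
    return alpha * image_similarity + (1 - alpha) * tag_score


# Hypothetical candidates: (name, text-image similarity, tags).
candidates = [
    ("Black leather shoes", 0.80, {"black", "leather", "shoes"}),
    ("Black leather boots", 0.82, {"black", "leather", "boots"}),
]

query = "black leather shoes"
ranked = sorted(
    candidates,
    key=lambda c: hybrid_score(c[1], tag_overlap(query, c[2])),
    reverse=True,
)
print([name for name, _, _ in ranked])
```

Here the tag score breaks a near tie between two visually similar products; the right value of `alpha` would have to be tuned on real search data.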

Conclusions

Let’s be honest — searching for products is not easy. But it can be!
If you want to create an enjoyable online shopping experience that generates business profit, these days a simple keyword-based search engine on your website is no longer enough. However, you have a much better source of information about the product you’re selling: a photo. This source is indescribable, universal and complete. We have observed a much higher search performance of algorithms that use the information about the product contained in its photo. Just take advantage of it!

Since we’ve already discovered the power of the information contained in a product photo, in the next article in this series we’ll build on it. We will introduce the ability to search for a product without entering any text! This search engine is based only on the product photo provided by the customer! Doesn’t that sound great?

Words by Monika Sikorska, Data Scientist at Altimetrik Poland

https://www.linkedin.com/in/monika-sikorska-215738a7/

Editing by Kinga Kuśnierz, Content Writer at Altimetrik Poland

https://www.linkedin.com/in/kingakusnierz/
