If you’ve ever shopped on Amazon, you’ve used Your Orders. This feature maintains your full order history dating back to 1995, so you can track and manage every purchase you’ve made. The order history search feature lets you find your past purchases by entering keywords in the search bar. Beyond just finding items, it provides a straightforward way to repurchase the same or similar items, saving you time and effort.
Various features across Amazon’s shopping experience, such as Rufus and Alexa, use order history search to help you find your past purchases. Therefore, it’s important that order history search can locate your previously purchased items as accurately and quickly as possible.
In this post, we show you how the Your Orders team improved order history search by introducing semantic search capabilities on top of our existing lexical search system, using Amazon OpenSearch Service and Amazon SageMaker.
Limitations of lexical search
Order history search uses lexical matching to find items from a customer’s entire order history that match at least one term of the search keywords. For example, if a customer searches for “orange juice,” the system retrieves all orange juice items as well as fresh oranges and other fruit juices the customer had previously ordered. Although lexical matching can provide high recall of items whose terms match the search keywords exactly, it doesn’t work well for related or generic search keywords, like “health drinks” in this example.
Since the launch of Rufus, Amazon’s AI-powered shopping assistant, a growing number of customers are experiencing a streamlined and richer shopping journey, including searching for their previous purchases with Rufus. Customers can now ask “Show me healthy drinks” without worrying about using lengthy, more precise terms like “kombucha”, “green tea”, and “protein shakes”. This makes the search experience more conversational and intent-based, presenting an opportunity to make item discovery more intuitive. For Rufus to answer order history searches such as “Show me the healthy drinks I bought last year” with the same intuitive experience, the underlying order history data store (Your Orders) needs semantic search capability to understand the semantics of search keywords beyond traditional lexical matching.
Challenges implementing semantic search
Implementing semantic search at our scale presented several technical challenges:
- Scale – We needed to enable semantic search across billions of records corresponding to customers’ order history globally.
- Zero downtime – We needed to keep the system 100% available while making changes on the backend to introduce semantic search.
- Preventing search quality degradation – Semantic search is meant to improve the quality of search results. However, in some cases, it can reduce search quality. For example, if a customer remembers their item title exactly and wants to find only items matching that title, surfacing similar items alongside the exactly matching items increases crowding in the results and makes it harder to find the relevant item. Similarly, semantic search doesn’t help when the customer intends to search by identifier values, like order ID, which lack inherent semantic meaning. For these scenarios, we use lexical search only.
Solution overview
Semantic search is powered by large language models (LLMs), which are mostly trained on human languages. These models can be adapted to take a piece of text in any language they were trained on and emit an embedding vector of a fixed length, regardless of the input text length. By design, embedding vectors capture the semantic meaning of the input text, such that two semantically similar text strings have high cosine similarity computed on their respective embedding vectors. For semantic search on order history, the input text subject to embedding generation and similarity computation is the customer’s search terms and the product text of purchased items.
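To make the idea concrete, here is a minimal sketch of how cosine similarity compares a query embedding with item embeddings. The tiny 4-dimensional vectors are made up for illustration only; a real embedding model emits much longer fixed-length vectors.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" for illustration; real models emit
# fixed-length vectors with hundreds of dimensions.
query = [0.9, 0.1, 0.0, 0.2]     # e.g. the search term "healthy drinks"
kombucha = [0.8, 0.2, 0.1, 0.3]  # semantically close to the query
order_id = [0.0, 0.1, 0.9, 0.0]  # an identifier-like string, unrelated

print(cosine_similarity(query, kombucha) > cosine_similarity(query, order_id))  # True
```

This is exactly the property semantic search relies on: “kombucha” scores close to “healthy drinks” even though the strings share no keywords, while an order ID does not.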
We divide our solution into two parts:
- Improving system scalability and resiliency for handling requests at scale – Before implementing semantic search, we needed to make sure our infrastructure could handle the increased computational load, leading us to adopt a cell-based architecture. This step isn’t needed for every use case, but systems with very high scale in terms of request or data volume can benefit greatly from it before implementing a resource-intensive use case like semantic search.
- Implementing semantic search – We began by evaluating the available embedding models, using the offline evaluation capabilities of Amazon Bedrock to compare different models. After we selected our model, we could establish the infrastructure for generating embedding vectors.
Improving system scalability and resiliency
We used the cell-based architecture design pattern to improve our scalability and resiliency. A cell-based design involves partitioning the system into identical, smaller, self-contained units, or cells, each of which handles only a portion of the overall traffic received by the system. The following diagram shows a high-level illustration of a cell-based design for order history search.

Each cell serves a defined subset of our customers. Cells don’t need to communicate with one another to serve a customer request. Each customer is assigned to a cell, and every request from that customer is routed to that cell. The OpenSearch Service domain in each cell holds data only for the subset of customers it is supposed to serve. The number of cells (N) and the distribution of data among those cells depend on the business use case, but the goal is to achieve as even a distribution of data and traffic as possible.
The routing logic can be kept as simple or as sophisticated as the use case requires. The cell assignment values can either be computed at runtime for each request, or they can be computed once and written to a cache or persistent data store like Amazon DynamoDB, from which cell assignment values can be fetched for subsequent requests. For order history search, the logic was simple and fast enough to be executed at runtime for each request. Looking up cell assignments from a persistent data store is especially helpful when there is a risk of some cells becoming “heavier” than others over time. In such cases, it becomes easier to redistribute the heavy cell’s data by simply overriding the cell assignment values for specific keys in the data store, instead of having to change the partitioning logic directly, which could affect data distribution across all the cells.
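A minimal sketch of such routing logic might look like the following. The hash-based assignment, the cell count, and the in-memory override table are illustrative assumptions; in practice the override lookup could be backed by a cache or a DynamoDB table as described above.

```python
import hashlib

NUM_CELLS = 8  # N; an illustrative value, not the production cell count

# Hypothetical override table standing in for a persistent store such as
# DynamoDB: keys remapped here take precedence over the hash-based rule,
# which makes it easy to move a "heavy" customer to a lighter cell.
cell_overrides: dict[str, int] = {}

def assign_cell(customer_id: str) -> int:
    """Route a customer to a cell: check overrides first, else hash."""
    if customer_id in cell_overrides:
        return cell_overrides[customer_id]
    digest = hashlib.sha256(customer_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_CELLS

cell = assign_cell("customer-123")
assert 0 <= cell < NUM_CELLS

# Rebalancing: pin this customer to a different cell without touching
# the partitioning logic that governs every other key.
cell_overrides["customer-123"] = (cell + 1) % NUM_CELLS
assert assign_cell("customer-123") != cell
```

A deterministic hash keeps runtime routing stateless, while the override table gives the escape hatch for redistributing hot keys later.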
As the system’s load grows, the number of cells can be increased to handle the additional traffic. Even without increasing the number of cells, we can redistribute existing data among the current N cells by reassigning some keys from heavily populated cells to lightly populated cells, spreading the load more evenly and making more efficient use of the infrastructure.
A cell-based architecture also helps make the system more resilient. For example, if we lose one cell, our capacity is reduced by only 1/N, instead of 100%. This arrangement can be improved further to reduce the capacity loss by assigning partitioning keys to two or more cells so that their data gets written to two or more cells. In such cases, the loss of a single cell doesn’t result in data loss.
Implementing semantic search
Implementing semantic search for order history search required several key decisions and technical steps. We began by evaluating the available embedding models, using the offline evaluation capabilities of Amazon Bedrock to compare different models against our specific business domain requirements. This evaluation process helped us identify which model would deliver the best performance for our use case. After we selected our model, we needed to establish the infrastructure for generating embedding vectors. We containerized our embedding model and registered it in Amazon Elastic Container Registry (Amazon ECR), then deployed it using SageMaker inference endpoints to handle the actual vector computation at scale.
For the search infrastructure itself, we chose OpenSearch Service to implement our semantic search capabilities. OpenSearch Service provided both the vector storage we needed and the search algorithms required to deliver relevant results to our users.
One of our biggest challenges was updating our historical records to support semantic search on existing orders. We built a data processing pipeline using AWS Step Functions to orchestrate the workflow and AWS Lambda functions to handle the actual vector generation for our legacy records, so we could provide semantic search across all the records we needed to.
The following diagram illustrates the high-level architecture.

Model evaluation and selection
Order history search uses an embedding model trained on Amazon-specific data. Domain-specific training is essential because the generated embedding vectors must work well for the business context to return quality results.
We used an LLM-as-a-judge methodology with Anthropic’s Claude on Amazon Bedrock to evaluate candidate models. Anthropic’s Claude received prompts containing anonymized item text and search terms from customer order history, then filtered and ranked items by relevance. These results served as ground truth for comparison.
We evaluated models using standard ranking metrics:
- Normalized Discounted Cumulative Gain (NDCG) – Measures ranking quality against the ideal ordering
- Mean Reciprocal Rank (MRR) – Considers the position of the first relevant item
- Precision – Measures the accuracy of retrieved results
- Recall – Measures the ability to retrieve all relevant items
This process helped us determine the best-performing model.
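As a rough illustration of these metrics (not our evaluation code), the following computes NDCG, MRR, precision, and recall for a toy judged ranking. The graded `gains` dictionary stands in for the LLM-judge ground truth.

```python
import math

def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against a relevant set."""
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved), hits / len(relevant)

def mrr(ranked, relevant):
    """Reciprocal rank of the first relevant item (0 if none appears)."""
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / i
    return 0.0

def ndcg(ranked, gains):
    """NDCG: DCG of the ranking divided by DCG of the ideal ordering."""
    def dcg(order):
        return sum(gains.get(item, 0) / math.log2(i + 1)
                   for i, item in enumerate(order, start=1))
    ideal = sorted(gains, key=gains.get, reverse=True)
    return dcg(ranked) / dcg(ideal)

# Hypothetical graded relevance judgments (3 = most relevant).
gains = {"kombucha": 3, "green tea": 2, "protein shake": 1}
ranked = ["green tea", "kombucha", "orange soda"]

p, r = precision_recall(ranked, gains)
print(round(mrr(ranked, gains), 3), round(ndcg(ranked, gains), 3), round(p, 3), round(r, 3))
```

Together the four numbers capture both whether the right items come back at all (precision, recall) and whether they come back in the right order (MRR, NDCG).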
Retrieval strategy: Customer-scoped comprehensive search
Order history search has two key requirements:
- Search only the requesting customer’s order history – We don’t want items from one customer’s order history showing up in search results for another customer
- Search all of that customer’s history – We don’t want to miss showing an item that might have been relevant to the customer’s search term just because the search algorithm skipped evaluating it for some reason
Our approach uses OpenSearch Service to retrieve all items for the customer who issued the search query, calculate relevance scores for each of them against the search term, sort by score, and return the top K results. This provides comprehensive result coverage for each customer.
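The following toy sketch illustrates the idea of customer-scoped exhaustive scoring: every item in one customer’s history is scored against the query vector, so nothing can be skipped by an approximate algorithm. The item titles and vectors are made up; the real scoring runs server-side in OpenSearch Service.

```python
import heapq
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def search_customer_items(customer_items, query_vector, k=2):
    """Exact (brute-force) kNN over ONE customer's items: score every
    item against the query, then keep the top k by similarity."""
    scored = [(cosine(vec, query_vector), title) for title, vec in customer_items]
    return heapq.nlargest(k, scored)

# Hypothetical per-customer item set; isolating the list per customer is
# what keeps one customer's orders out of another customer's results.
items = [
    ("kombucha 6-pack", [0.8, 0.2, 0.1]),
    ("orange juice", [0.6, 0.5, 0.1]),
    ("phone charger", [0.0, 0.1, 0.9]),
]
top = search_customer_items(items, query_vector=[0.9, 0.3, 0.0], k=2)
print([title for _, title in top])  # ['kombucha 6-pack', 'orange juice']
```

Because a single customer’s history is small, scoring every item is affordable, which is why exact rather than approximate kNN works here.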
Vector storage with OpenSearch Service
We used two OpenSearch Service features for efficient vector storage and search:
- knn_vector datatype – Built-in support for storing embedding vectors. Existing domains can add this field type without reindexing, enabling exact kNN search across all records. We didn’t need approximate kNN because the number of records for most customers was small enough for exact kNN to scale.
- Scripted scoring – Painless scripts compute vector similarity server-side, reducing client complexity and maintaining low latency.
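Putting these two features together, an exact k-NN request can be expressed as a script_score query that filters to one customer’s records and scores each with the k-NN scoring script. This is a sketch of such a request body; the field names customer_id and item_embedding are illustrative assumptions, not our actual schema.

```python
import json

def exact_knn_query(customer_id: str, query_vector: list[float], k: int = 10) -> dict:
    """Build an OpenSearch exact k-NN request body: filter to one
    customer's records, then score each with the k-NN scoring script."""
    return {
        "size": k,
        "query": {
            "script_score": {
                # Restrict scoring to the requesting customer's items.
                "query": {"bool": {"filter": {"term": {"customer_id": customer_id}}}},
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": "item_embedding",      # knn_vector field
                        "query_value": query_vector,    # the query embedding
                        "space_type": "cosinesimil",    # cosine similarity
                    },
                },
            }
        },
    }

body = exact_knn_query("customer-123", [0.12, -0.03, 0.57], k=5)
print(json.dumps(body)[:60])
```

The filter clause enforces the customer-scoped requirement, and the script scores only the filtered records, so the work per query stays proportional to one customer’s history.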
Hybrid search
Hybrid search refers to combining the results of lexical and semantic search to benefit from the strengths of each. The hybrid query capabilities of OpenSearch Service simplify implementing hybrid search by letting clients specify both types of queries in a single request. OpenSearch Service runs both queries in parallel, merges their results, normalizes the relevance scores of the sub-queries, and sorts the results by the specified sort order (relevance score by default) before returning them to clients.
This gives clients the best of both search types. For example, there are certain scenarios where the search term doesn’t carry much semantic meaning, like when customers search by their orderId values. Semantic search isn’t designed for such cases; these are best served by keyword matching.
The hybrid search functionality saved implementation effort and avoided a potential latency increase for order history search.
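For intuition, the following sketch mimics what score normalization and merging accomplish in a hybrid query: BM25 scores and cosine similarities live on different scales, so each sub-query’s scores are min-max normalized before being combined. This illustrates the concept only; it is not the OpenSearch implementation, and the weights and scores are made up.

```python
def min_max(scores: dict) -> dict:
    """Min-max normalize raw scores into [0, 1]; a constant list maps to 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 1.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def hybrid_merge(lexical: dict, semantic: dict, lexical_weight: float = 0.5) -> list:
    """Normalize each sub-query's scores, combine with a weighted sum,
    and return item names sorted by the combined score."""
    lex, sem = min_max(lexical), min_max(semantic)
    combined = {item: lexical_weight * lex.get(item, 0.0)
                      + (1 - lexical_weight) * sem.get(item, 0.0)
                for item in set(lex) | set(sem)}
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical raw scores: BM25 values (unbounded) vs. cosine similarities
# (roughly [-1, 1]) — incomparable until normalized.
lexical = {"orange juice": 7.2, "orange soda": 6.1}
semantic = {"orange juice": 0.91, "kombucha": 0.88}
print(hybrid_merge(lexical, semantic))
```

An item that both sub-queries rank highly, like “orange juice” here, rises to the top, which is the behavior hybrid search is designed to produce.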
Updating historical records
After the infrastructure has been set up, newly ingested records are persisted with the relevant embedding vectors and support semantic search on those records. However, when customers search, they typically search for items they purchased earlier. Therefore, the system won’t improve the customer experience much unless the older records are updated to include the relevant embeddings. The approach to populating this data depends on the scale of the problem at hand.
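A simplified, in-memory sketch of such a backfill loop follows. The embed() stub and the record layout are hypothetical stand-ins for the SageMaker endpoint and the OpenSearch bulk update; the skip-if-present check is what makes a retried page (for example, from a Step Functions retry) safe to reprocess.

```python
from typing import Iterator

def embed(text: str) -> list[float]:
    # Stand-in for an inference endpoint call; returns a tiny fake vector.
    return [float(len(word)) for word in text.split()[:3]]

def paginate(records: list[dict], page_size: int) -> Iterator[list[dict]]:
    """Walk the legacy records in fixed-size pages."""
    for start in range(0, len(records), page_size):
        yield records[start:start + page_size]

def backfill(records: list[dict], page_size: int = 2) -> int:
    """Attach embeddings to records that predate the new pipeline."""
    updated = 0
    for page in paginate(records, page_size):   # one worker invocation's worth
        for record in page:
            if "embedding" not in record:       # skip already-processed records,
                record["embedding"] = embed(record["title"])  # so reruns are idempotent
                updated += 1
    return updated

legacy = [{"title": "orange juice"}, {"title": "kombucha"}, {"title": "usb charger"}]
print(backfill(legacy))  # 3 records updated; a rerun would update 0
```

At our scale the same shape of loop ran across billions of documents, with the orchestration and embedding calls handled by Step Functions and Lambda rather than in-process.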
Releasing the change to minimize potential customer impact
Our final step was to release the change to clients in a manner such that the impact of any potential problems would be as small as possible. There are several ways to do that, including:
- Implementing semantic search such that any transient issues in the semantic search flow cause the logic to fall back to lexical-only search, instead of failing the request entirely. Even if semantic search doesn’t execute, the system should still be able to return lexical search results to the client, instead of empty results.
- Gating the change such that the default behavior remains lexical-only search, and clients who want the semantic search feature must pass an additional flag in the request, for example, which executes the semantic or hybrid flow only for those requests.
- Keeping the new flow behind a feature flag during the initial period so that it could be turned off entirely if a critical problem was detected.
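These release safeguards can be sketched as follows, with stub search functions standing in for the real lexical and semantic flows. The simulated endpoint failure exercises the fallback path; the opt-in flag models the request-level gating.

```python
def lexical_search(query: str) -> list[str]:
    return [f"lexical:{query}"]  # stand-in for the existing keyword flow

def semantic_search(query: str) -> list[str]:
    raise TimeoutError("embedding endpoint timed out")  # simulated transient failure

def search(query: str, semantic_enabled: bool = False) -> list[str]:
    """Opt-in gating plus graceful degradation: the semantic/hybrid flow
    runs only when the caller requests it, and any error in that flow
    falls back to lexical-only results instead of failing the request."""
    if semantic_enabled:
        try:
            return semantic_search(query) + lexical_search(query)
        except Exception:
            pass  # degrade to lexical rather than return an error or no results
    return lexical_search(query)

print(search("healthy drinks", semantic_enabled=True))  # ['lexical:healthy drinks']
```

The same structure also supports the kill switch: flipping `semantic_enabled` off for all callers restores the original lexical-only behavior instantly.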
Examples of improved customer experience
The following are some examples of customer interactions with Rufus that required Rufus to query the respective customer’s order history to answer their question and give them the information they needed.
The following screenshots show how semantic search picks up wooden spoons for a “sustainable utensils” query and different kinds of chargers even though the keyword “charger” doesn’t appear in the title description, in the case of the wall connector.

The following screenshots show how semantic search surfaces relevant results even though the title description doesn’t include the queried keywords.

The semantic search feature of order history search helped Rufus fetch these items and show them to customers. Before semantic search, Rufus wasn’t able to show any results to customers for such queries.
Business impact
Our solution resulted in the following key business impacts:
- Customer experience improvements – The solution achieved a 10% improvement in query recall, increasing the proportion of searches that return relevant results. It also decreased customer service contacts for issues related to locating past orders.
- Partner integration success – The solution strengthened natural language processing capabilities for Alexa and Rufus, enhancing their ability to interpret order history queries. It also decreased the need for reranking and postprocessing by partner teams. We improved the query success rate by 20%, meaning more customer searches now return at least one relevant item. We also improved result coverage by 48%, with semantic search consistently surfacing more relevant matches that lexical search would have missed.
Conclusion
In this post, we showed you how we evolved Amazon order history search to support semantic search capabilities. This transition involved using state-of-the-art AI technology while working within existing infrastructure constraints to develop solutions that avoided disruption and maintained SLAs throughout the feature upgrade. The implementation also involved backfilling, where billions of documents were processed at rates several times higher than normal ingestion to compute embedding vectors for previously purchased items. This operation required careful engineering and took advantage of the resilience OpenSearch Service offers even under high load.
Beyond the immediate implementation, this foundation enables continued innovation in search technology. The embedding vector framework can incorporate improved models as they become available, and the architecture supports expansion into new capabilities such as personalization and multi-modal search.
You can get started with exact k-NN search today by following the instructions in Exact k-NN search. If you’re looking for a managed solution for your OpenSearch cluster, check out Amazon OpenSearch Service.
