In “How Personalization Technologies Are Used Across the Web” we saw eight examples of personalization in action, but what is powering these experiences?
Let’s get a little more hands-on and consider a sample problem, where the goal is to make __Personalized Client Engagement Recommendations__ for salespeople by sketching out a recommendation engine algorithm.
Imagine that there are six salespeople and four clients, where each client is engaged by multiple salespeople. Based on the __historical bookings data__, we know how much revenue each salesperson earned from each client. However, this information is missing for some combinations of clients and salespeople. The goal is to __predict how much each salesperson can make in bookings with each client__ and assign clients to salespeople accordingly. This information can be summarized visually as a sales matrix, shown below.
We can observe that, for example, Alyssa worked with Google, Capgemini, and ExxonMobil and generated $500,000, $400,000, and $400,000, respectively. However, there is no data for Alyssa and Staples. Let’s predict the potential revenue by using basic ideas commonly implemented when building recommendation, personalization, and matching engines.
First, let’s consider __similarity__. Based on the historical bookings, one can note that Mike and Brandon are more similar to Alyssa than the other salespeople are. We reached this conclusion by computing the “__distance__” between the known bookings Alyssa generated and the known bookings generated by each other salesperson.
For example, the “distance” between Alyssa and Brandon is $200,000 because:
– Alyssa and Brandon both worked on Google and ExxonMobil (we skip the clients where there is no info for at least one of the salespeople).
– Both Alyssa and Brandon generated $500,000 from Google and, therefore, from that point of view they are the same. In other words, Google doesn’t contribute to the “distance”.
– Alyssa made $400,000 from ExxonMobil, while Brandon made only $200,000. In this case, the difference is $200,000.
– Then, we sum over all the clients shared between Alyssa and Brandon and get $0 + $200,000 = $200,000 total “distance”.
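The steps above can be sketched in a few lines of Python. This is only an illustration: the `distance` function and the dictionary layout are our own naming, and the figures are the ones quoted in the text for Alyssa and Brandon (Brandon's Capgemini entry is treated as unknown, since that client is skipped in the comparison).

```python
from typing import Dict, Optional

# client -> revenue in USD; None marks a pair with no historical bookings
Bookings = Dict[str, Optional[int]]


def distance(a: Bookings, b: Bookings) -> int:
    """Sum of absolute revenue differences over clients both people worked with."""
    shared = [c for c in a if a.get(c) is not None and b.get(c) is not None]
    if not shared:
        raise ValueError("no shared clients, distance is undefined")
    return sum(abs(a[c] - b[c]) for c in shared)


alyssa = {"Google": 500_000, "Capgemini": 400_000, "ExxonMobil": 400_000, "Staples": None}
brandon = {"Google": 500_000, "Capgemini": None, "ExxonMobil": 200_000, "Staples": 300_000}

print(distance(alyssa, brandon))  # 200000
```

Only Google and ExxonMobil enter the sum, so the result matches the $0 + $200,000 computed by hand.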
The same exercise for Mike gives us $0 and based on the available data Mike is the most similar to Alyssa. The numbers for other salespeople are left as an exercise for arithmetic-hungry readers.
For completeness, it is worth mentioning that the “distance” could be defined in many different ways. Here we used the simplest approach: the sum of absolute pairwise differences over the shared clients, which is in effect a Manhattan distance. In practice, recommendation systems may use other norms, such as the Euclidean or supremum norm, and choose among them depending on the application and objective function.
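To make the norms named above concrete, here is a small sketch that applies each of them to a list of per-client differences. The function names are ours, and the second difference vector is a made-up example (not taken from the sales matrix) chosen only so that the three norms give different answers.

```python
import math
from typing import Sequence


def manhattan(diffs: Sequence[float]) -> float:
    """Sum of absolute differences (the approach used in the walkthrough)."""
    return sum(abs(d) for d in diffs)


def euclidean(diffs: Sequence[float]) -> float:
    """Square root of the sum of squared differences."""
    return math.sqrt(sum(d * d for d in diffs))


def supremum(diffs: Sequence[float]) -> float:
    """Largest absolute difference."""
    return max(abs(d) for d in diffs)


# Alyssa vs. Brandon: Google differs by $0, ExxonMobil by $200,000
alyssa_vs_brandon = [0, 200_000]
# Hypothetical differences, for illustration only
hypothetical = [300_000, 400_000]

for name, diffs in [("Alyssa/Brandon", alyssa_vs_brandon), ("hypothetical", hypothetical)]:
    print(name, manhattan(diffs), euclidean(diffs), supremum(diffs))
```

For Alyssa and Brandon all three norms agree, because only one client contributes a non-zero difference. They diverge as soon as several clients differ, which is why the choice of norm matters in practice.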
We could simply take Mike, as he is the most similar to Alyssa. However, we would then be estimating Alyssa’s bookings for Staples based only on Mike’s bookings. A likely better strategy is to use several salespeople. Let us stick with N=2 for illustration. In practice, N (__neighbor count__) is an __adjustable parameter of the algorithm, tuned with machine learning__.
Using both Mike’s and Brandon’s bookings and considering them as equally relevant, the prediction for (Alyssa, Staples) will be $400,000 = ½ ($500,000 + $300,000).
However, as we noted above, Mike is more similar to Alyssa than Brandon is. Let us leverage that insight and treat Mike’s past bookings as more important. We can set the weight for Mike to 1 and the weight for Brandon to ½. In this case, the prediction for (Alyssa, Staples) will be approximately $433,000 ≈ (1 × $500,000 + ½ × $300,000) / 1.5.
Here, we illustrated yet another powerful idea commonly used in personalization/relevance solutions: __weighted voting__. In the first case, the weight for each candidate salesperson was equal. In the second case, the weight for Mike was higher than for Brandon, decaying in inverse proportion to the position of a salesperson when sorted by “distance”. In practice, the weighting function is yet __another parameter that needs to be optimized by machine learning__.
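Both predictions can be reproduced with one weighted-average helper. The sketch below assumes the neighbors are already sorted by distance (Mike first, then Brandon). The helper names `weighted_prediction` and `rank_weights` are ours, introduced only for illustration.

```python
from typing import List, Sequence


def weighted_prediction(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of the neighbors' known values."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * v for v, w in zip(values, weights)) / total


def rank_weights(n: int) -> List[float]:
    """Weights 1, 1/2, 1/3, ... for neighbors sorted from nearest to farthest."""
    return [1 / rank for rank in range(1, n + 1)]


# Staples bookings of Alyssa's two nearest neighbors: Mike, then Brandon
staples_bookings = [500_000, 300_000]

equal = weighted_prediction(staples_bookings, [1, 1])
ranked = weighted_prediction(staples_bookings, rank_weights(2))

print(round(equal))   # 400000
print(round(ranked))  # 433333
```

Swapping `rank_weights` for another function (for example, one based on the distance itself rather than the rank) changes the weighting scheme without touching the rest of the calculation.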
Above, we have covered only the most basic ideas underlying modern recommendation systems. Yet, we hope that you now understand the main philosophy, which is *__“to predict the value for a pair (object1, object2), we find several objects similar to object1, look at the available values for object2 among all of the candidate objects, and compute a similarity-weighted average of those values”__*. The parameters of these and more advanced algorithms are tuned using machine learning.
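Putting the pieces together, the whole procedure fits in one short function. Treat the following as a minimal sketch rather than a production implementation. The `predict` function and the data layout are our own. Only three of the six salespeople are included, because the text does not list the others' figures. Mike's Google booking is a hypothetical value, chosen so that his distance to Alyssa is $0 as stated in the text.

```python
from typing import Dict, Optional

# salesperson -> {client: revenue in USD}; a missing client means "no data"
Matrix = Dict[str, Dict[str, int]]


def distance(a: Dict[str, int], b: Dict[str, int]) -> Optional[int]:
    """Sum of absolute differences over shared clients, or None if none are shared."""
    shared = a.keys() & b.keys()
    if not shared:
        return None
    return sum(abs(a[c] - b[c]) for c in shared)


def predict(matrix: Matrix, person: str, client: str, n: int = 2) -> float:
    """Predict revenue for (person, client) from the n most similar salespeople."""
    target = matrix[person]
    candidates = []
    for other, row in matrix.items():
        if other == person or client not in row:
            continue  # a neighbor must have a known value for this client
        d = distance(target, row)
        if d is not None:
            candidates.append((d, other))
    if not candidates:
        raise ValueError("no comparable salesperson has data for this client")
    candidates.sort()  # nearest first
    neighbors = candidates[:n]
    # Weight decays with position: 1, 1/2, 1/3, ...
    weights = [1 / rank for rank in range(1, len(neighbors) + 1)]
    values = [matrix[name][client] for _, name in neighbors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


sales: Matrix = {
    "Alyssa": {"Google": 500_000, "Capgemini": 400_000, "ExxonMobil": 400_000},
    "Brandon": {"Google": 500_000, "ExxonMobil": 200_000, "Staples": 300_000},
    # Mike's Google figure is hypothetical; his Staples figure is from the text
    "Mike": {"Google": 500_000, "Staples": 500_000},
}

print(round(predict(sales, "Alyssa", "Staples", n=2)))  # 433333
```

Here `n` and the weighting rule are hard-coded. These are exactly the parameters that would be tuned against held-out bookings data in a real system.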
## Rapidly Delivering Personalization Solutions
At Gigster, we abstract away and package core mathematical modeling concepts (like those described above) into AI components that can be glued together to rapidly deliver AI solutions to clients, while leaving room to incorporate custom logic. A Personalization solution might include basic (nearest neighbour) and advanced (matrix and tensor factorization with sparse regularization) algorithms, the machine learning training and deployment infrastructure, and data ingestion workflows. See an example below.
AI components arranged in our frameworks are robust, reusable, enterprise-grade solution starters that come with the good parts of a SaaS enterprise product (e.g. support, training, updates) yet don’t force Gigster clients to stick with a specific vendor since we share the source code. Most importantly, AI component frameworks enable faster time-to-market and time-to-value.
If you think you might need a personalization and matching solution in your organization, __contact our AI team__ and we will happily schedule a call with you to discuss it.