Trifork Blog

Mahout – Taste :: Part Three – Estimators

July 8th, 2010 by

In Taste, estimators are the bridge between the generic item-based or user-based recommendation logic and the specific similarity algorithm. They are mainly used as part of the recommendation process, but they are also used for evaluating recommenders, and the ‘recommended because’ feature is powered by an estimator as well. This post covers some Taste internals and uses a few code samples to show how estimators are used within Taste.

Estimators for recommendations

Let’s start with the main usage of estimators: providing recommendations. Suppose we create a GenericItemBasedRecommender and provide it with a DataModel and one of Taste’s ItemSimilarity implementations.

To fetch the items most similar to a given item we call GenericItemBasedRecommender.mostSimilarItems(long itemID, int howMany), as shown in the snippet below:

  @Override
  public List<RecommendedItem> mostSimilarItems(long itemID, int howMany) throws TasteException {
    return mostSimilarItems(itemID, howMany, null);
  }

  @Override
  public List<RecommendedItem> mostSimilarItems(long itemID, int howMany,
                                                Rescorer<LongPair> rescorer) throws TasteException {
    TopItems.Estimator<Long> estimator = new MostSimilarEstimator(itemID, similarity, rescorer);
    return doMostSimilarItems(new long[] {itemID}, howMany, estimator);
  }

After delegating the call to the more general mostSimilarItems overload, which also accepts a Rescorer, a MostSimilarEstimator is constructed and passed to the private method doMostSimilarItems. The work is thus split in two: the estimator scores a single candidate item, while the recommender contains the algorithm-specific logic that decides which candidates to score.

Now let’s zoom in on the doMostSimilarItems method. See the snippet below:

  private List<RecommendedItem> doMostSimilarItems(long[] itemIDs,
                                                   int howMany,
                                                   TopItems.Estimator<Long> estimator) throws TasteException {
    DataModel model = getDataModel();
    // Collect candidates: every item rated by a user who also rated one of the given items
    FastIDSet possibleItemsIDs = new FastIDSet();
    for (long itemID : itemIDs) {
      PreferenceArray prefs = model.getPreferencesForItem(itemID);
      int size = prefs.length();
      for (int i = 0; i < size; i++) {
        long userID = prefs.get(i).getUserID();
        possibleItemsIDs.addAll(model.getItemIDsFromUser(userID));
      }
    }
    // The given items themselves are not candidates
    possibleItemsIDs.removeAll(itemIDs);
    // Score each candidate with the estimator and keep the howMany best
    return TopItems.getTopItems(howMany, possibleItemsIDs.iterator(), null, estimator);
  }

The snippet above contains the core logic for finding similar items. This process consists of three steps:

  1. Fetch all preferences for the given item(s)
  2. For each preference, get the corresponding user and fetch all items that user has a preference for
  3. Remove the given item(s) from this set of items and determine the top items based on the given estimator
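The candidate-gathering part of these steps can be sketched without Taste at all. The code below is only an illustration: the class CandidateItemsSketch and the plain Map that stands in for the DataModel are made up for this post and are not part of Mahout.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not Mahout code: a Map from user ID to the item IDs
// that user has a preference for plays the role of the DataModel.
class CandidateItemsSketch {

  static Set<Long> candidateItems(Map<Long, Set<Long>> itemsByUser, long seedItemID) {
    Set<Long> candidates = new TreeSet<>();
    for (Set<Long> items : itemsByUser.values()) {
      // Steps 1 and 2: a user with a preference for the seed item contributes all of their items
      if (items.contains(seedItemID)) {
        candidates.addAll(items);
      }
    }
    // Step 3, first half: the seed item itself is not a candidate
    candidates.remove(seedItemID);
    return candidates;
  }

  public static void main(String[] args) {
    Map<Long, Set<Long>> itemsByUser = Map.of(
        1L, Set.of(10L, 11L),
        2L, Set.of(10L, 12L),
        3L, Set.of(13L));
    // Users 1 and 2 have a preference for item 10, user 3 does not
    System.out.println(candidateItems(itemsByUser, 10L)); // prints [11, 12]
  }
}
```

Item 13 never becomes a candidate because no user connects it to item 10, which is why the estimator only has to score a small part of the catalogue.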

TopItems is a helper class that fetches the top-ranked items from a set of items for a given estimator.

Now on to the estimator. All estimators implement the TopItems.Estimator<T> interface, which is really simple: it returns an estimate for a ‘thing’ as a double.

  public interface Estimator<T> {
    double estimate(T thing) throws TasteException;
  }
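To get a feeling for how a helper like TopItems can work with such an interface, here is a small stand-alone sketch. It is not the Mahout implementation: TopItemsSketch, its nested Estimator (without the checked TasteException) and the Scored holder are names invented for this example, and it assumes that an estimate of NaN means "no estimate", which is how the estimators in this post signal a filtered or missing value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of a top-N helper driven by an estimator; not Mahout's TopItems.
class TopItemsSketch {

  // Same shape as TopItems.Estimator<T>, minus the checked exception
  interface Estimator<T> {
    double estimate(T thing);
  }

  // Small holder pairing an item ID with its estimate
  static final class Scored {
    final long itemID;
    final double value;

    Scored(long itemID, double value) {
      this.itemID = itemID;
      this.value = value;
    }
  }

  // Returns up to howMany item IDs, ordered by descending estimate
  static List<Long> topItems(int howMany, Iterable<Long> candidates, Estimator<Long> estimator) {
    // Min-heap on the estimate, so the weakest kept candidate is always at the head
    PriorityQueue<Scored> heap =
        new PriorityQueue<>(Comparator.comparingDouble((Scored s) -> s.value));
    for (Long itemID : candidates) {
      double value = estimator.estimate(itemID);
      if (Double.isNaN(value)) {
        continue; // assumption: NaN means there is no usable estimate for this item
      }
      heap.add(new Scored(itemID, value));
      if (heap.size() > howMany) {
        heap.poll(); // drop the current weakest candidate
      }
    }
    List<Long> result = new ArrayList<>();
    while (!heap.isEmpty()) {
      result.add(0, heap.poll().itemID); // weakest comes out first, so prepend
    }
    return result;
  }

  public static void main(String[] args) {
    // Toy estimator: item 4 has no estimate, the others score higher for higher IDs
    Estimator<Long> estimator = id -> id == 4L ? Double.NaN : id * 0.1;
    System.out.println(topItems(2, List.of(1L, 2L, 3L, 4L), estimator)); // prints [3, 2]
  }
}
```

The helper knows nothing about similarities or preferences. Everything domain-specific sits behind the single estimate method.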

Now on to the MostSimilarEstimator:

  public static class MostSimilarEstimator implements TopItems.Estimator<Long> {

    private final long toItemID;
    private final ItemSimilarity similarity;
    private final Rescorer<LongPair> rescorer;

    public MostSimilarEstimator(long toItemID, ItemSimilarity similarity, Rescorer<LongPair> rescorer) {
      this.toItemID = toItemID;
      this.similarity = similarity;
      this.rescorer = rescorer;
    }

    @Override
    public double estimate(Long itemID) throws TasteException {
      LongPair pair = new LongPair(toItemID, itemID);
      if ((rescorer != null) && rescorer.isFiltered(pair)) {
        return Double.NaN;
      }
      double originalEstimate = similarity.itemSimilarity(toItemID, itemID);
      return rescorer == null ? originalEstimate : rescorer.rescore(pair, originalEstimate);
    }
  }

This estimator does three things:

  1. Use the Rescorer, if one is given, to filter items. Rescorers can be used for domain-specific filtering of items; a filtered item gets the estimate NaN
  2. Use the ItemSimilarity to calculate the similarity between the given item and the candidate item
  3. Optionally adjust the similarity value with the Rescorer

This setup allows you to plug arbitrary ItemSimilarity algorithms into the recommender.
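The following stand-alone sketch shows the same filter, similarity, rescore sequence with two interchangeable similarity functions. The interfaces are simplified stand-ins invented for this example (the real Rescorer here works on a LongPair, and the real methods throw TasteException), and evenBoostingRescorer is a hypothetical helper.

```java
// Hypothetical sketch of a pluggable estimator; the nested interfaces only mimic Taste's.
class PluggableEstimatorSketch {

  // Simplified stand-in for Taste's ItemSimilarity
  interface ItemSimilarity {
    double itemSimilarity(long itemID1, long itemID2);
  }

  // Simplified stand-in for Taste's Rescorer, keyed on the candidate item ID only
  interface Rescorer {
    boolean isFiltered(long itemID);

    double rescore(long itemID, double originalScore);
  }

  // Same three steps as MostSimilarEstimator.estimate: filter, similarity, rescore
  static double estimate(long toItemID, long itemID, ItemSimilarity similarity, Rescorer rescorer) {
    if (rescorer != null && rescorer.isFiltered(itemID)) {
      return Double.NaN;
    }
    double originalEstimate = similarity.itemSimilarity(toItemID, itemID);
    return rescorer == null ? originalEstimate : rescorer.rescore(itemID, originalEstimate);
  }

  // Made-up domain rule: item 7 is never shown, even item IDs get their score doubled
  static Rescorer evenBoostingRescorer() {
    return new Rescorer() {
      @Override
      public boolean isFiltered(long itemID) {
        return itemID == 7L;
      }

      @Override
      public double rescore(long itemID, double originalScore) {
        return itemID % 2 == 0 ? originalScore * 2 : originalScore;
      }
    };
  }

  public static void main(String[] args) {
    ItemSimilarity constant = (a, b) -> 0.5;
    ItemSimilarity distanceBased = (a, b) -> 1.0 / (1.0 + Math.abs(a - b));
    Rescorer rescorer = evenBoostingRescorer();

    System.out.println(estimate(1L, 4L, distanceBased, null)); // prints 0.25
    System.out.println(estimate(1L, 4L, constant, null));      // prints 0.5
    System.out.println(estimate(1L, 4L, constant, rescorer));  // prints 1.0
    System.out.println(estimate(1L, 7L, constant, rescorer));  // prints NaN
  }
}
```

Swapping the similarity changes the scores but not the estimator, and the rescorer can change or suppress a result without either similarity knowing about it.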

Recommended because…

Another interesting feature of the GenericItemBasedRecommender is the ‘Recommended because’ feature. With this feature you can determine why a certain item was recommended to you, i.e. which of your preferences were largely responsible for giving you this recommendation.

To use this feature, call recommendedBecause(long userID, long itemID, int howMany). See the snippet below:

  @Override
  public List<RecommendedItem> recommendedBecause(long userID, long itemID, int howMany) throws TasteException {
    if (howMany < 1) {
      throw new IllegalArgumentException("howMany must be at least 1");
    }

    DataModel model = getDataModel();
    TopItems.Estimator<Long> estimator = new RecommendedBecauseEstimator(userID, itemID, similarity);

    PreferenceArray prefs = model.getPreferencesFromUser(userID);
    int size = prefs.length();
    FastIDSet allUserItems = new FastIDSet(size);
    for (int i = 0; i < size; i++) {
      allUserItems.add(prefs.getItemID(i));
    }
    allUserItems.remove(itemID);

    return TopItems.getTopItems(howMany, allUserItems.iterator(), null, estimator);
  }

It takes all items the given user has a preference for, minus the given item, and passes them to TopItems along with a RecommendedBecauseEstimator. See the code below:

  private class RecommendedBecauseEstimator implements TopItems.Estimator<Long> {

    private final long userID;
    private final long recommendedItemID;
    private final ItemSimilarity similarity;

    private RecommendedBecauseEstimator(long userID, long recommendedItemID, ItemSimilarity similarity) {
      this.userID = userID;
      this.recommendedItemID = recommendedItemID;
      this.similarity = similarity;
    }

    @Override
    public double estimate(Long itemID) throws TasteException {
      Float pref = getDataModel().getPreferenceValue(userID, itemID);
      if (pref == null) {
        return Float.NaN;
      }
      double similarityValue = similarity.itemSimilarity(recommendedItemID, itemID);
      return (1.0 + similarityValue) * pref;
    }
  }


This RecommendedBecauseEstimator determines the ranking by multiplying the user’s preference value for an item by one plus that item’s similarity to the recommended item. Adding 1.0 shifts a similarity in the range [-1, 1] to a non-negative weight in the range [0, 2]. After this process the top ranked items are those that contributed most to the recommendation of the given item.
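A small worked example makes the effect of this formula visible. The preference values and similarities below are invented, and RecommendedBecauseSketch is a hypothetical stand-alone class that only reuses the (1.0 + similarity) * pref formula from the estimator above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical worked example of the 'recommended because' scoring formula.
class RecommendedBecauseSketch {

  // Same formula as RecommendedBecauseEstimator.estimate
  static double score(double similarityToRecommended, float pref) {
    return (1.0 + similarityToRecommended) * pref;
  }

  // Orders the user's items by descending score
  static List<Long> rank(Map<Long, Float> prefs, Map<Long, Double> similarityToRecommended) {
    List<Long> itemIDs = new ArrayList<>(prefs.keySet());
    itemIDs.sort((a, b) -> Double.compare(
        score(similarityToRecommended.get(b), prefs.get(b)),
        score(similarityToRecommended.get(a), prefs.get(a))));
    return itemIDs;
  }

  public static void main(String[] args) {
    // The user's preference values for three items (made-up data)
    Map<Long, Float> prefs = Map.of(1L, 5.0f, 2L, 3.0f, 3L, 4.0f);
    // Similarity of each of those items to the recommended item (made-up data)
    Map<Long, Double> similarity = Map.of(1L, -0.5, 2L, 0.9, 3L, 0.5);

    // Item 1: (1.0 - 0.5) * 5.0 = 2.5
    // Item 2: (1.0 + 0.9) * 3.0, roughly 5.7
    // Item 3: (1.0 + 0.5) * 4.0 = 6.0
    System.out.println(rank(prefs, similarity)); // prints [3, 2, 1]
  }
}
```

Item 1 has the highest preference value but ends up last, because it is dissimilar to the recommended item. A high preference alone is not enough to explain a recommendation.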

Conclusions

This concludes the overview of some Taste internals and has hopefully given you a clearer picture of how recommendations and estimators work inside Taste. In future posts I will probably expand on this topic, especially in the context of evaluating recommenders. If you have any questions regarding Taste in general or estimators in particular, feel free to leave a comment.

5 Responses

  1. July 21, 2010 at 00:44 by KenL

    Great series, thanks for posting them. Is there any chance you could provide a link to the code for this post similar to the link in the previous post? I’m trying to work with the GenericItemBasedRecommender with the taste-web WAR building functionality but to no avail.

  2. July 21, 2010 at 10:49 by Frank Scholten

    @KenL: Thank you!

    This post contains no custom code, all snippets from this post are part of Taste. They are from GenericItemBasedRecommender and only shown to illustrate how estimators work internally.

    At http://mahout.apache.org/taste.html, below the demo heading, there is documentation on using taste-web.

    What kind of exceptions/errors do you get when trying to run your own GenericItemBasedRecommender implementation via taste-web?

  3. July 21, 2010 at 17:04 by KenL

    The class constructor in the examples in Mahout source all use the Recommender interface, but this does not give you access to the unique methods in GenericItemBasedRecommender. I tried extending this class but ran into constructor problems. I’m sure there is a simple solution, but so far it has eluded me. The problem is probably my lack of experience with Java – 95% of the programming I do is in Python.

  4. March 22, 2011 at 09:53 by Rahul Sharma

    Hi Frank,

    Can you just help me out in using solR Indexes as input to taste.

    Thanks,
    Rahul Sharm

  5. October 22, 2014 at 14:11 by jacky

    I think they have a bug with their GenericItemBasedRecommender
    I’ve tried using some kinds of similarities and all of them gave me exactly the same result. did you see that?