This paper uses the feature hierarchy to improve recommendation accuracy. In particular, it aims to learn the horizontal relationships within the feature hierarchy, which have not yet been exploited.
The key assumption is that items under the same category are complementary, while items under the same “parent category” are alternatives. For this assumption to hold, the feature hierarchy must be organized according to these complementary and alternative properties.
The authors assume that two products are more similar if they are rated by the same user. They define item co-occurrence (IC) as:
This formula is similar to the point-wise mutual information (PMI).
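The paper's exact formula is not reproduced here, but since it is described as PMI-like, a minimal sketch might look like the following, assuming co-occurrence is counted over users who rated both items (the function name and counting scheme are illustrative, not the paper's):

```python
import math

def item_cooccurrence(rated_by, i, j, n_users):
    """PMI-like score: log of observed co-rating rate over the rate
    expected if items i and j were rated independently."""
    users_i, users_j = rated_by[i], rated_by[j]
    both = len(users_i & users_j)
    if both == 0:
        return float("-inf")  # never co-rated by any user
    p_ij = both / n_users
    p_i = len(users_i) / n_users
    p_j = len(users_j) / n_users
    return math.log(p_ij / (p_i * p_j))

# Toy example: 4 users; items A and B are co-rated by users 0 and 1.
rated_by = {"A": {0, 1, 2}, "B": {0, 1}, "C": {3}}
ic_ab = item_cooccurrence(rated_by, "A", "B", n_users=4)
```

A positive score means the pair is co-rated more often than chance would predict.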
The influence of a feature on item i is the average IC between item i and the items j in that feature's category.
If item i has co-occurred with many items in the category, the feature's influence on i will be high, and vice versa.
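Under the same PMI-style reading of IC, the feature-influence step might be sketched as an average over the category's items (the zero-signal fallback for never-co-rated pairs is an assumption for this sketch):

```python
import math

def ic(rated_by, i, j, n_users):
    """PMI-like item co-occurrence; 0.0 for never-co-rated pairs (assumption)."""
    both = len(rated_by[i] & rated_by[j])
    if both == 0:
        return 0.0
    p_i = len(rated_by[i]) / n_users
    p_j = len(rated_by[j]) / n_users
    return math.log((both / n_users) / (p_i * p_j))

def feature_influence(rated_by, i, category_items, n_users):
    """Average IC between item i and every other item in the category."""
    others = [j for j in category_items if j != i]
    if not others:
        return 0.0
    return sum(ic(rated_by, i, j, n_users) for j in others) / len(others)
```

An item that co-occurs strongly with many category members gets a high influence score, matching the intuition stated above.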
The following definition is the key assumption of this work:
Items i and j are alternatives if their IC score indicates that a user who consumes item j will likely not consume item i; the reverse must hold as well.
Conversely, if a user who consumes item j will likely also consume item i, then the two items are complementary.
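One plausible reading of this definition, assuming a PMI-like IC where positive scores mean "co-consumed more than chance" and negative scores mean less, is a simple sign test (the threshold `eps` is a hypothetical knob, not from the paper):

```python
def relationship(ic_score, eps=0.0):
    """Classify an item pair by the sign of its IC score (sketch)."""
    if ic_score > eps:
        return "complementary"  # consuming j makes consuming i more likely
    if ic_score < -eps:
        return "alternative"    # consuming j makes consuming i less likely
    return "unrelated"
```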
This work represents a feature as a bag of items. To measure the relationship between two features, the authors calculate the average IC over all item pairs drawn from the two features.
Modeling Vertical Direction
The model learns a latent vector for each feature; then, for each item i, an additional item latent vector is formed as a linear combination of the feature latent vectors weighted by the feature influences:
When items i and j belong to the same feature categories, their item latent vectors become more similar, with the feature influence acting as a weight.
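This vertical combination can be sketched as follows, assuming each item has a base latent vector plus one influence weight per feature (names and shapes are illustrative):

```python
def item_latent(base, feature_vecs, influences):
    """Combined item vector: base latent vector plus the
    feature-influence-weighted sum of feature latent vectors.

    base: list[float]
    feature_vecs: dict mapping feature -> list[float]
    influences: dict mapping feature -> float (weight for this item)
    """
    out = list(base)
    for f, vec in feature_vecs.items():
        w = influences.get(f, 0.0)
        out = [o + w * v for o, v in zip(out, vec)]
    return out
```

Two items sharing the same features with similar influence weights receive nearly the same feature contribution, which is what pulls their latent vectors together.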
Modeling Horizontal Direction
The horizontal relationship is enforced via regularization: if two features are related, their latent vectors must be close in the latent space.
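A minimal sketch of such a regularizer, assuming a precomputed feature-feature similarity (e.g., the average-IC score described earlier) and a squared-distance penalty (the penalty form and `lam` are assumptions, not the paper's exact objective):

```python
def horizontal_reg(feature_vecs, sim, lam=0.1):
    """Penalize distance between latent vectors of related features.

    feature_vecs: dict mapping feature -> list[float]
    sim: dict mapping (feature_a, feature_b) -> similarity weight
    """
    penalty = 0.0
    feats = list(feature_vecs)
    for a in range(len(feats)):
        for b in range(a + 1, len(feats)):
            fa, fb = feats[a], feats[b]
            dist2 = sum((x - y) ** 2
                        for x, y in zip(feature_vecs[fa], feature_vecs[fb]))
            penalty += sim.get((fa, fb), 0.0) * dist2
    return lam * penalty
```

Minimizing this term pulls the latent vectors of highly similar features toward each other while leaving unrelated features unconstrained.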
The co-occurrence signal is used to determine whether two items are complementary or alternative. However, this work does not explicitly model complementarity or substitutability; it simply uses the co-occurrence information to place similar products closer together in the latent space.
They represent a feature as a bag of items, learn a latent vector for each feature, and take a linear combination of feature latent vectors as an alternative item representation.
They force two feature latent vectors to be more similar when the features themselves are more similar.
This work assumes that items rated by the same user are more similar. For instance, if a user purchases a running shoe and a bed sheet, these two products are treated as more similar. Since such products are unlikely to be co-purchased, co-occurrence derived from co-rating information proves especially useful in this work.
The model can be summarized as:
Item latent vector = traditional latent vector + a weighted sum of feature vectors.
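Putting the summary formula to work, a prediction might be sketched as the dot product of a user's latent vector with the combined item representation (base item vector plus feature-influence-weighted feature vectors); the function and its inputs are illustrative:

```python
def predict(user_vec, item_base, feature_vecs, influences):
    """Predicted preference: user vector dotted with the combined item vector."""
    combined = list(item_base)
    for f, vec in feature_vecs.items():
        w = influences.get(f, 0.0)
        combined = [c + w * v for c, v in zip(combined, vec)]
    return sum(u * c for u, c in zip(user_vec, combined))
```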
Sun, Zhu, et al. “Exploiting both Vertical and Horizontal Dimensions of Feature Hierarchy for Effective Recommendation.” AAAI. 2017.