Friday, November 2, 2012

Matching Recommendation Technologies and Domains - Part 2

 
 
http://danceswithfat.files.wordpress.com/2011/05/choices-sign.jpg
 
 
Matching Domain Characteristics with Knowledge Sources
 
The choice of domain and the characteristics of the application place certain constraints on the kinds of knowledge sources that a recommender system may deploy. In turn, the availability and quality of knowledge sources influences what recommendation technologies a recommender can profitably use.

 
 
Effect @Social Knowledge
 
In heterogeneous domains, social knowledge should be considered as a knowledge source, since it is gathered from users’ input and does not need extensive knowledge engineering.
 
However, social knowledge is not sufficiently accurate and reliable for high-risk domains or for domains that need explanation.
 
Social knowledge will tend to be sparse for high churn domains.
 
Using social knowledge is appropriate for domains with an implicit type of interaction, since users’ behavior can be mined with the machine learning and statistical techniques that are typical of collaborative filtering.
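As a rough illustration of mining implicit behavior, the sketch below counts how often items appear together in the same session and recommends the items that co-occur most with what a user has already seen. This is only one simple collaborative technique, and the session data and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_counts(sessions):
    """Count how often each pair of items appears in the same user session."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in sessions:
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def recommend(counts, seen, k=3):
    """Rank unseen items by total co-occurrence with the items already seen."""
    scores = defaultdict(int)
    for item in seen:
        for other, n in counts[item].items():
            if other not in seen:
                scores[other] += n
    # Sort by score (descending), then by name for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

# Hypothetical click logs: each inner list is one user's viewed items.
sessions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c", "d"],
    ["a", "d"],
]
counts = cooccurrence_counts(sessions)
top = recommend(counts, seen={"a"})
```

No ratings or item features are needed here, only logged behavior, which is why this kind of knowledge is cheap to collect.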
 
In domains with unstable user preferences, social knowledge can be misleading, since the historical data is unreliable.
 
Effect @Individual Knowledge

In heterogeneous domains, it might be difficult to transfer a user’s input on certain items to the recommendation of other items. For example, it is not certain that two users who have similar taste in movies would also like similar music.
 
A domain that requires knowledge of the user’s short-term requirements is most likely suited to some kind of knowledge-based recommendation. Constraints and preferences allow the user to limit and to rank options. For example, a dog owner might have a strict constraint that any apartment he rents must accept his pet. A parent with young children might have a preference for being close to parks and playgrounds.
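The difference between a constraint (which limits the options) and a preference (which ranks them) can be sketched as follows. The example merges the pet constraint and the park preference into a single hypothetical user, and the Apartment fields and listings are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Apartment:
    name: str
    allows_pets: bool
    km_to_park: float

def recommend_apartments(apartments, constraints, preference):
    """Keep items that pass every hard constraint, then sort by preference (lower is better)."""
    feasible = [a for a in apartments if all(c(a) for c in constraints)]
    return sorted(feasible, key=preference)

# Hypothetical listings.
listings = [
    Apartment("Elm St", allows_pets=True, km_to_park=1.2),
    Apartment("Oak Ave", allows_pets=False, km_to_park=0.1),
    Apartment("Pine Rd", allows_pets=True, km_to_park=0.3),
]

ranked = recommend_apartments(
    listings,
    constraints=[lambda a: a.allows_pets],   # strict: must accept pets
    preference=lambda a: a.km_to_park,       # soft: closer to a park is better
)
names = [a.name for a in ranked]
```

Oak Ave is closest to a park but is dropped outright because it fails the constraint; the preference only orders what remains.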
 
In high risk domains and domains which need explanation, it is usually necessary to have explicit requirements and constraints from the user. Similarly, user requirements are more likely to be needed in domains with unstable preferences since the historical data are unreliable.
 
Effect @Content Knowledge
 
The most basic kind of content knowledge is item features. These features can typically be used as is in a recommender system, although implementers will often want to restrict the feature space. If items are represented by unstructured documents such as news stories, the implementer will need to draw on information extraction (IE) techniques to extract and select features for use in recommendation. Features can be reduced further by applying more sophisticated feature selection techniques. Content knowledge in multimedia format presents an additional challenge.
 
The quality of recommendations produced by a content-based or knowledge-based recommender will be entirely dependent on the quality of the content data on which its decisions are based. Indeed, the lack of reliable item features is often cited as a motivating factor for avoiding content-based recommendation. The cost involved in creating and maintaining a database of useful item features should not be underestimated, particularly for heterogeneous domains.
 
New technical innovations arrive regularly, requiring that the schema and the individual entries for each item be updated. If there are a large number of not-entirely-independent features extracted in a variety of ways, the system may be tolerant of noisy feature data. On the other hand, applications with high risk will need to pay special attention to having clean item features.
 
Effect @Domain Knowledge (sub-content)
 
A knowledge-based recommender will typically need to know more than just what features are associated with what items. The most basic form of domain knowledge that a recommender can employ is an ontology over the item features. Such an ontology allows the system to reason about the relationship between features at a level deeper than just raw equality or difference.
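To make the idea of reasoning beyond raw equality concrete, here is a minimal sketch that assumes the ontology is a simple is-a hierarchy over one feature. The hierarchy, the genre names, and the similarity measure are all illustrative assumptions; real ontologies are usually richer.

```python
# Hypothetical is-a hierarchy over a "genre" feature: child -> parent.
PARENT = {
    "bebop": "jazz",
    "swing": "jazz",
    "jazz": "music",
    "punk": "rock",
    "rock": "music",
}

def ancestors(feature):
    """Return the feature followed by its ancestors up to the root."""
    path = [feature]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def similarity(a, b):
    """Shared path length from the root divided by the longer path (1.0 if identical)."""
    pa, pb = ancestors(a)[::-1], ancestors(b)[::-1]  # root first
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    return shared / max(len(pa), len(pb))
```

With plain equality, "bebop" and "swing" are just as different as "bebop" and "punk". With the hierarchy, the first pair scores higher because both are kinds of jazz.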

Many high risk choices have constraints imposed by the domain that a recommender needs to obey.

The recommendation problem can be in some cases formulated entirely as constraint satisfaction with constraints being contributed both by the user and by the system.

A final category of domain knowledge is means-ends knowledge, which is the knowledge that enables a system to map between the user’s goals (ends) and the products that might satisfy them (means).
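A means-ends mapping can be pictured as a set of rules that translate a goal into requirements on item features. The sketch below uses a made-up camera catalog and made-up thresholds; none of these rules come from the chapter.

```python
# Hypothetical means-ends rules: user goal -> feature requirements.
GOAL_RULES = {
    "travel": {"weight_g": lambda w: w <= 400},
    "sports": {"burst_fps": lambda f: f >= 10},
}

# Hypothetical product catalog.
CAMERAS = {
    "Compact A": {"weight_g": 300, "burst_fps": 5},
    "Pro B": {"weight_g": 900, "burst_fps": 12},
    "Hybrid C": {"weight_g": 380, "burst_fps": 11},
}

def products_for_goals(goals, catalog=CAMERAS, rules=GOAL_RULES):
    """Return products whose features satisfy the requirements implied by every goal."""
    requirements = [
        (feature, test)
        for goal in goals
        for feature, test in rules[goal].items()
    ]
    return sorted(
        name
        for name, features in catalog.items()
        if all(test(features[feature]) for feature, test in requirements)
    )
```

The user only states goals such as "travel" or "sports"; the system carries the knowledge of which technical features serve them.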

Part of the reason that users benefit from recommender systems is that they can make good choices without necessarily being conversant with all of the complexities of the product space.

Mapping Domains to Technologies



@Collaborative

There are some domain types for which social knowledge seems not very useful: in particular, high-risk domains and ones with high churn.

In high churn domains, there may not be enough time for an item to build up a reputation among a large number of peer users before it is replaced with other items.

When there is large risk associated with a domain, most users are going to need a more convincing explanation of the appropriateness of a recommendation beyond simply that others liked it. This is particularly important if we consider the problem of robustness in collaborative systems.

@Knowledge-based

If we look at the interaction style, we can see that it is not always possible to gather every type of knowledge from every type of interaction. In systems with implicit inputs, we do not gather any kind of direct requirements from the user.

Preference instability favors knowledge-based techniques. Learning over a user’s prior interactions may turn out to be a hindrance rather than a help. However, in certain cases, such as web personalization, users may provide enough implicit data in a single session to form a useful profile that can be compared to others.

@General Tips

In cases where the criteria do not help to reach a definitive conclusion, it is worth noting that the different technologies do have different implementation and maintenance costs.

Collaborative recommendation is likely to be the least expensive to implement. It requires a database of user ratings, but it does not require clean, well-engineered item features, which are the minimum requirement for the other recommendation technologies.

Knowledge-based technologies are going to be the most expensive approach, requiring knowledge engineering and continuing maintenance. So, a developer might wish to start by implementing the least expensive solution compatible with the domain.

Another factor to consider is that with hybrid recommendation it is possible to combine techniques. For example, to deal with a heterogeneous environment with unstable preferences, a hybrid between content-based and collaborative recommendation may be desirable.
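One simple way to combine two techniques is a weighted blend of their scores. The text does not say which hybrid design it has in mind, so the sketch below is just one option, with hypothetical scores and an arbitrary weight.

```python
def weighted_hybrid(content_scores, collab_scores, weight_collab=0.5):
    """Blend two score dictionaries; items missing from one source score 0 there."""
    items = set(content_scores) | set(collab_scores)
    return {
        item: (1 - weight_collab) * content_scores.get(item, 0.0)
        + weight_collab * collab_scores.get(item, 0.0)
        for item in items
    }

# Hypothetical scores in [0, 1] from each recommender.
content = {"x": 0.9, "y": 0.2, "z": 0.6}
collab = {"x": 0.1, "y": 0.8}  # "z" has no ratings yet

blended = weighted_hybrid(content, collab, weight_collab=0.25)
ranking = sorted(blended, key=lambda i: (-blended[i], i))
```

Lowering weight_collab when preferences are unstable lets the content signal dominate, while still using whatever social knowledge exists.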

Sample Recommendation Domains

Table 3 illustrates the application of these criteria in 10 different domains where recommendation applications exist. Not all combinations of the six criteria are represented, but we can see that the considerations given above are fairly predictive.



High-risk domains generally lead to knowledge-based recommendation; scrutability is also a good predictor of this. Heterogeneous domains are handled largely with collaborative recommendation.
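The rules of thumb from this post can be collected into a small voting function. This is only a sketch: the characteristic names and the equal one-point weights are my own assumptions, not something the chapter specifies.

```python
def score_technologies(domain):
    """Tally rough votes for each technology from the domain characteristics."""
    votes = {"collaborative": 0, "knowledge-based": 0}
    if domain.get("heterogeneous"):
        votes["collaborative"] += 1      # social knowledge needs no feature engineering
    if domain.get("high_risk"):
        votes["knowledge-based"] += 1    # explicit requirements and constraints
        votes["collaborative"] -= 1      # "others liked it" is not convincing enough
    if domain.get("high_churn"):
        votes["collaborative"] -= 1      # social knowledge tends to stay sparse
    if domain.get("unstable_preferences"):
        votes["knowledge-based"] += 1    # historical data is unreliable
    if domain.get("needs_explanation"):
        votes["knowledge-based"] += 1
    if domain.get("implicit_interaction"):
        votes["collaborative"] += 1      # behavior can be mined
        votes["knowledge-based"] -= 1    # no direct requirements are gathered
    return votes

# Hypothetical domain descriptions.
apartments = score_technologies({"high_risk": True, "needs_explanation": True})
web_pages = score_technologies({"heterogeneous": True, "implicit_interaction": True})
best_for_apartments = max(apartments, key=apartments.get)
best_for_web_pages = max(web_pages, key=web_pages.get)
```

A tie or a near tie in the votes is a hint that a hybrid, or simply the cheaper option, is worth considering.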

Web page recommendation looks a bit contradictory when we consider high churn and preference instability, which would seem to militate against collaborative methods. However, as discussed above, database size can compensate for preference instability and these recommenders do collect large amounts of implicit preference data in each session. Also, heterogeneity is high, which argues in favor of using social knowledge.

Conclusion

This chapter considers recommender systems as intelligent systems, and as such, dependent on knowledge. The differences between recommendation approaches can be best understood through reference to the different knowledge sources that they employ. By considering how domain characteristics impact the availability and quality of knowledge sources, we can connect recommendation technologies and domain characteristics.

We have examined six different factors (heterogeneity, risk, churn, preference stability, interaction style, and scrutability) and considered their impact on the knowledge sources available for recommendation. From this analysis, we derive constraints on what recommendation technologies will be most appropriate for domains according to their characteristics. Application of these criteria to some existing systems shows that they do a reasonably good job of predicting what technologies have been successfully employed both in research and applications.