The Grouplens Research Project: About Me … Recommenders History of Recommender Systems the Early Years …

The GroupLens Research Project Summer 2001

About me …

Associate Professor of Computer Science & The GroupLens Research Project: Engineering, Univ. of Minnesota Collaborative Filtering Recommender Systems Ph.D. (1993) from U.C. Berkeley GUI toolkit architecture Joseph A. Konstan Teaching Interests: HCI, GUI Tools University of Minnesota Research Interests: General HCI, and ... [email protected] Collaborative Information Filtering Multimedia Authoring and Systems http://www.grouplens.org Web Automation Visualization and Information Management

Summer 2001 1 Summer 2001 2

The Problem: Recommenders Information Overload Too many Tools to help identify worthwhile stuff research papers Filtering interfaces books ÎE-mail filters, clipping services movies Recommendation interfaces web pages ÎSuggestion lists, “top-n,” offers and promotions … even Usenet News articles! Prediction interfaces ÎEvaluate candidates, predicted ratings

Summer 2001 3 Summer 2001 4

The Early Years …

History of Recommender Why cave dwellers survived

Systems How editors are like cave dwellers

Critics, critics, everywhere

Summer 2001 5 Summer 2001 6

Collaborative Information Filtering Filtering Information retrieval Premise Dynamic information need Information needs more complex than Static content base keywords or topics: quality and taste Small Community: Manual Information filtering Tapestry – database of content & comments Static information need Active CF – easy mechanisms for Dynamic content base forwarding content to relevant readers

Summer 2001 7 Summer 2001 8

Automated CF Usenet Interface The GroupLens Project (Resnick et al. CSCW ’94) ACF for Usenet News Îusers rate items Îusers are correlated with other users Îpersonal predictions for unrated items Nearest-Neighbor Approach Îfind people with history of agreement Îassume stable tastes

Summer 2001 9 Summer 2001 10

Usenet Trial (Miller et al. Usenet ’97; Does it Work? Konstan et al. CACM Mar. ’97) Medium-scale Usenet trial Yes: The numbers don’t lie! seven weeks Usenet trial: rating/prediction correlation Îrec.humor: 0.62 (personalized) vs. 0.49 (avg.) 250 users; 47,569 ratings; over 600,000 Îcomp.os.linux.system: 0.55 (pers.) vs. 0.41 (avg.) predictions Îrec.food.recipes: 0.33 (pers.) vs. 0.05 (avg.) variety of newsgroups Significantly more accurate than predicting Îmoderated and unmoderated average or modal rating. Îtechnical and recreational Higher accuracy when partitioned by gathered reading activity as well as ratings newsgroup

Summer 2001 11 Summer 2001 12

It Works ACF Blossomed Meaningfully Well! Relationship with User Behavior 1995 Ringo (later Firefly) Twice as likely to read 4/5 than 1/2/3 Bellcore Video Recommender

Users Like GroupLens 1996 Recommender Systems Workshop Some users stayed 12 months after the trial! Early commercialization Agents Inc. (later Firefly) Net Perceptions new issues of scale and performance!

Summer 2001 13 Summer 2001 14

Today Amazon.com

Broad research community live research systems substantial integration with machine learning, information filtering

Increasing commercial application available commercial tools

Summer 2001 15 Summer 2001 16

Cdnow album Wine.com Seeking advisor

Summer 2001 17 Summer 2001 18

CDNow Album advisor recommendations ML-home

Summer 2001 19 Summer 2001 20

ML-comedy ML-clist

Summer 2001 21 Summer 2001 22

ML-rate ML-search

Summer 2001 23 Summer 2001 24

ML-slist ML-review

Summer 2001 25 Summer 2001 26

How It Works

C.F. Engine

Ratings Correlations

Summer 2001 27 Summer 2001 28

ratings ratings

request C.F. Engine C.F. Engine

Ratings Correlations Ratings Correlations

Summer 2001 29 Summer 2001 30

ratings ratings

request request C.F. Engine C.F. Engine predictions recommendations

Ratings Correlations Ratings Correlations Neighborhood Neighborhood

Summer 2001 31 Summer 2001 32

GroupLens Model of Understanding the Information Filtering Computation Hoop Star Pretty Titanic Blimp Rocky Users rate Items. Dreams Wars Woman XV Users are correlated with other users. Joe D A B D ? ? Predictions made for an item’s value to a John A F D F particular user by combining ratings of Susan A A A A A A highly correlated users who rated it. Pat D A C Recommendations for items for a particular Jean user by identifying popular items among A C A C A correlated users. Ben F A F Nathan D A A

Summer 2001 33 Summer 2001 34

Understanding the Understanding the Computation Computation Hoop Star Pretty Titanic Blimp Rocky Hoop Star Pretty Titanic Blimp Rocky Dreams Wars Woman XV Dreams Wars Woman XV Joe D A B D ? ? Joe D A B D ? ? John A F D F John A F D F Susan A A A A A A Susan A A A A A A Pat D A C Pat D A C Jean A C A C A Jean A C A C A Ben F A F Ben F A F Nathan D A A Nathan D A A

Summer 2001 35 Summer 2001 36

Summer 2001 37 Summer 2001 38

Summer 2001 39 Summer 2001 40

Accuracy, Scale, and Sparsity Recent and Current Research Algorithm Performance and Metrics

Filterbots

Dimensionality Reduction Algorithms

Summer 2001 41 Summer 2001 42

Interfaces and User Experience Other Research (not covered in this talk)

Explaining Recommendations Distributed Recommenders (Sarwar)

Ephemeral Recommendations E-Commerce Recommender Applications PolyLens: Multi-User Recommendations (Schafer)

MetaLens: Multi-Source Recommendations User and Usage Studies

Summer 2001 43 Summer 2001 44

Algorithm Performance and Metrics (Herlocker et al., SIGIR ’99; …) Breese studied recommender algorithms Accuracy, Scale, and Sparsity k-nearest neighbors as good as any We looked at relevant tuning parameters limiting neighborhood size important normalization of ratings very important most other parameters unimportant Îcorrelation measure, weightings

Summer 2001 45 Summer 2001 46

Which Metrics? Filterbots

Many metrics used in published work The Inspiration: Error metrics (MAE, MSE, RMSE) Need selfless, consistent raters Decision-support metrics (ROC, errors) IR metrics (version of precision, recall) We found that there are only two types Humans? Rank-sensitive, value-sensitive No: ratings robots. All seem to work equally well and nearly identically, within type

Summer 2001 47 Summer 2001 48

Filterbot Integration Filterbot Studies

Usenet Filterbots (Sarwar et al. CSCW 98) Simple, non-personalized filterbots ÎSpelling, Length, % new text One filterbot at a time MovieLens Filterbots (Good et al. AAAI ’99) Personalized filterbots ÎLearned from genre, cast, descriptions GroupLens Many filterbots per person

Summer 2001 49 Summer 2001 50

Advantages of the Lessons Learned FilterBot model Even simple filterbots added value Combines best of agents and humans C.F. best way to create a personal agents rate frequently, quickly, consistently humans add subjective taste and quality combination of filterbots Filterbots better than a small community Framework pulls out the best of each of users use only the bots that work; ignore the others Filterbots + users in CF better than either use only the people who agree; ignore the others alone balance people and bots based on available ratings and agreement

Summer 2001 51 Summer 2001 52

Risks of Filterbots New Algorithms (Sarwar et al., EC 00 & WebKDD 01)

• What if no humans read certain articles? Significant challenges “voluntary” censorship or quality control? Scale Number of users • What about rogue filterbots? Number of items Sparsity • What if people “prefer” filterbots to Small percentage of items experienced humans? Hard to find overlap with other users

Summer 2001 53 Summer 2001 54

Example Idea: Dimensionality Challenge Reduction Synonymy Latent Semantic Indexing Similar products treated differently Used by the IR community for document similarity Increases sparsity, loss of transitivity Works well with similar vector space model Results in poor quality Uses Singular Value Decomposition (SVD) Example Main Idea Find (latent) “taste space” C1 rates recycled letter pads High Represent users and items as points (vectors) in taste C2 rates recycled memo pads High space ÎBoth of them like Reduced space is dense and less-noisy Recycled office products

Summer 2001 55 Summer 2001 56

SVD: Mathematical SVD for Collaborative Background Filtering

1. Low dimensional representation = SS V’ O(m+n) storage requirement R Rk UUk k Vk’ k x n m X rk rk X rk kr X n m X n 2. Direct m x k . m x n Prediction The reconstructed matrix Rk = Uk.Sk.Vk’ is the closest rank-k matrix to the original matrix R.

Summer 2001 57 Summer 2001 58

Experimental Results: Prediction SVD as Prediction Generator Pur e- CF Setup (k is fixedExperiment at 14 for SVD) SVD Data Sets MovieLens data (www.movielens.umn.edu) 0.86Movie data Î943 users, 1,682 items 0.84Used SVD for prediction generation Î100,000 ratings on 1-5 Likert scale 0.82based on the train data ÎUsed for prediction and neighborhood experiments 0.8

MAE Computed MAE E-commerce data 0.78 Î6,502 users, 23,554 items 0.76Obtained similar numbers from CF- Î97,045 purchases 0.74predict ÎUsed for neighborhood experiment 0.72

.2 5 .3 5 .4 5 .5 5 .6 5 .7 5 .8 5 .9 5 0 .2 0 .3 0 .4 0 .5 0 .6 0 .7 0 .8 0 .9 0 0 0 0 0 0 0 0 x (train/test ratio) Summer 2001 59 Summer 2001 60

SVD Conclusions

Successful and promising approach Interfaces and User Experience Several obstacles to overcome Incremental update Efficient “top-n” recommendations

Exploring SVD-based and other new algorithmic approaches

Summer 2001 61 Summer 2001 62

Explaining Recommendations Two Studies (Herlocker et al. CSCW 2000) Challenge: Belief Pilot study of explanation feature Why should users believe the Users liked explain recommendations? Unclear whether they become more When should users believe the effective decision-makers recommendations? Comprehensive study of different Approach explanation approaches Explain recommendations Wide variation of effectiveness ÎReveal data, process Some explanations hurt decision-making ÎCorroborating data, track record

Summer 2001 63 Summer 2001 64

Most Compelling Less Compelling Interfaces Interfaces

Your Neighbors' Ratings for this • Simple visual • Anything with even Movie representations of minimal complexity 25 23 neighbors ratings - More than two dimensions 20 15 • Statement of strong • Any use of statistical 10 7 previous performance terminology 5 3 - Correlation, variance, etc.

Number of Neighbors “MovieLens has predicted 0 correctly 80% of the time 1's and 2's 3's 4's and 5's Rating for you”

SummerExplanations 2001 65 SummerExplanations 2001 66

Addressing One Simple Ephemeral Needs Approach (Herlocker) User submits “theme” query What is an ephemeral interest need? Theme contains examples of items similar Immediate, temporary, dynamic to those desired by the user Current systems don’t support this Set of potentially similar items identified Assume interests will remain relatively Using item-to-item correlation in ratings constant space Recommendations are relative to all your interests as a whole Potentially similar items ranked based on traditional ACF predictions

Summer 2001 67 Summer 2001 68

Theme Creation Theme Selection

Summer 2001 69 Summer 2001 70

Query Results Results of Theme Query Study Users were very positive about the theme query interface Relevance of results were dependent on the “support threshold” Low support threshold => fewer relevant results When results were relevant, users were positive overall Even the users in the low support threshold groups indicated they would like to have the interface added to MovieLens

Summer 2001 71 Summer 2001 72

PolyLens: A Group Recommender Goals (O’Connor et al. Interact 2001) Challenge: People watch movies together Explore group recommender design space Solution: A recommender for groups See if users would want and use a group Issues recommender, at least for movies Group formation, rules, composition Study behavior changes in group members Recommender algorithm for groups group vs. other users User interface new users via groups vs. other new users Learn lessons about group recommenders

Summer 2001 73 Summer 2001 74

Design Issues Design Issues

Characteristics of groups What is a group recommendation? public or private group user vs. combined individuals many or few social good functions permanent or ephemeral Privacy and interface issues Formation and evolution of groups control over joining groups withholding and recommendations joining policy balancing between info overload and administration and rights support

Summer 2001 75 Summer 2001 76

Field Testing PolyLens PolyLens

PolyLens Field Trial Timeline 819 Design choices 800 719 private, small, administered, invited groups 625 combine individual recs with minimum misery 518 600 467 Users high-information interface with opt-out 428 Groups 334 400 249 162 295 338 200 261 177 195 217 Number of groups/users of Number 102 136 0 0 66 External invitations added by popular demand MJM JASONDJFJ S D J F Experimental Survey Inviting outsiders User behavior Current prototype added; PolyLens observations userbase released made public Summer 2001 77 Summer 2001 78

Survey and Usage Field Test Results Results and Lessons Satisfaction (95% like, 77% more useful) Users like and use group recommenders Privacy not an issue (94% see, 93% share) groups have value for all members groups can help with outreach to new individual recommendations “essential” members Groups reflect “real life” groups Users trade privacy for utility New users via groups stayed 1.5x as often Groups are both permanent and group vs. other users a wash ephemeral Many stillborn groups Users must be able to find each other

Summer 2001 79 Summer 2001 80

MetaLens: A Meta- What is the problem? Recommender (Schafer)

Integrating multiple sources of information into a single recommendation list

Summer 2001 81 Summer 2001 82

Sources of Data One solution

•Genre •Critical Reviews •MPAA ratings •Average User Rating •Film length •User’s personalized + + + •Objectionable Content MovieLens prediction •Distributor •Distance to the Theater •Release Date •Special Accomodations •Start/End Time •Discounted Shows Go Meta-recommendation system see... MetaLens

Summer 2001 83 Summer 2001 84

Summer 2001 85 Summer 2001 86

What Have We Learned?

Conclusions and Future Work • Meta-recommenders can be built.

• Anecdotally, users like them.

• Some users make heavy use of them, and heavy users are most likely to make some use of them.

Summer 2001 87 Summer 2001 88

Conclusions Future Work

Collaborative filtering works! Better integration of collaborative and content filtering Lots of important issues: Algorithms Better support for community Interfaces and User Experience Privacy Better understanding of user rewards, Applications social role of recommenders

Summer 2001 89 Summer 2001 90

CF Under Portfolios of Content Diminishing Returns Original goal of CF was to help people sift What if my recommender knows which articles through the junk to find the good stuff. I’ve read, and can identify articles by topic?

Today, there may be so much good stuff that What if it sees that I experience marginal returns from reading similar articles on a you need to sift even more. topic?

Certain types of content yield diminishing Could we downgrade some articles based on returns, even with high quality “lack of new content?” Could we discover which articles using collaborative filtering?

Summer 2001 91 Summer 2001 92

Temporal Collaborative Wine for the Time Filtering Today’s CF systems may expire or Seasonal taste – can we detect that a particular degrade ratings, but do little to detect customer shifts wine tastes during hot and cold weather? Can we learn either the or predict changes in preference. content, or separate profiles, reflecting these different tastes? Ripe area with lots of commercial Evolving taste – can we help a wine newcomer applications … build her palate? Could we identify wines that take her a step or two beyond her current ones? Can we do so by augmenting regular collaborative filtering with temporal models?

Summer 2001 93 Summer 2001 94

Acknowledgements

• This work is being supported by grants from the National Science Foundation, The GroupLens Research Project: and by grants from Net Perceptions, Collaborative Filtering Recommender Systems Inc. Joseph A. Konstan University of Minnesota • Many people have contributed ideas, [email protected] time, and energy to this project. http://www.grouplens.org

Summer 2001 95 Summer 2001 96