Content rating behind the firewall
April 2011
Presented for SIKM by David Thomas, Deloitte
Copyright © 2009 Deloitte Development LLC. All rights reserved.
Dave Thomas is a Product Manager for the U.S. Intranet at Deloitte. He previously worked in client service at Deloitte Consulting in their SharePoint Practice and as a Project Manager for Global Consulting Knowledge Management (GCKM) on their Portal: the Knowledge Exchange (KX).
Background
• KX is available globally and attracts over 20,000 unique visitors per month
• KX is built on a heavily customized version of SharePoint 2003
• Users typically come a few times a month to retrieve documents, utilize communities and complete other knowledge-related tasks
• Content rating has been common on the internet for some time, but there seem to be few examples of successful rating systems behind the firewall. Internal usage of ratings at our organization has historically been <5% on the two platforms where it was deployed
• Stan Garfield posed the question “Has anyone had positive experiences with 1-5 star content rating mechanisms inside a firewall?” in January 2010. Here is a selection of responses (thank you to SIKM members):
– “I think that 5-star rating systems are ideal for apples to apples comparisons. Most knowledge objects (and of course people) cannot be compared in this manner”
– “I think there is an added complication in that inside the firewall it might also be important to know who is doing the rating. The CEO's rating might carry a little more weight than the janitor's”
– “With process documents, I like the idea of ratings because the purpose of the document is clear. For other documents, the case is muddier.”
– “rating a book or toaster is very different from rating a specific piece of content. Products purchased on Amazon tend to have more common use cases”
– On a deployed CR system: ”At first, pretty much no one rated. We suspect this was for several of the reasons that you pose in your document but also because it was unclear what they were being asked to rate - the quality of the writing, whether or not they agreed with the author, whether or not they thought highly of the author, or whether they liked the quality of the document. In an effort to encourage participation, the sponsors clarified the intent of the ratings”
• In March 2010, a project was initiated to provide a content rating system integrated into the Portal. The perceived value was that stronger-rated content could be easily identified and promoted accordingly; later, weaker content could be removed/archived earlier (a separate project)
• Our Portal runs on SharePoint 2003, so this project entailed custom work (no 3rd-party webparts were available). The rating system integrated with:
– Published content pages, giving the ability to rate
– The search user interface, allowing retrieval based on rating
• The first part of the project was research on various options for rating. We determined that most rating systems fell into one of three buckets:
• ‘favorites and flags’
• ‘this or that’
• ‘rating schemes’
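As an illustration only (the class and method names are hypothetical; the actual Portal implementation was custom SharePoint 2003 work, not this code), the three buckets could be modelled roughly as:

```python
from dataclasses import dataclass, field

# Minimal sketches of the three rating-design buckets (hypothetical names).

@dataclass
class FavoritesAndFlags:                      # Type 1: single, usually positive value
    favorited_by: set = field(default_factory=set)
    def rate(self, user_id: str) -> None:
        self.favorited_by.add(user_id)        # one action per user, no scale
    def score(self) -> int:
        return len(self.favorited_by)

@dataclass
class ThisOrThat:                             # Type 2: binary up/down, like/dislike
    ups: int = 0
    downs: int = 0
    def rate(self, thumbs_up: bool) -> None:
        if thumbs_up:
            self.ups += 1
        else:
            self.downs += 1
    def score(self) -> int:
        return self.ups - self.downs

@dataclass
class RatingScheme:                           # Type 3: traditional 1-X scale (1-5 here)
    counts: dict = field(default_factory=lambda: {s: 0 for s in range(1, 6)})
    def rate(self, stars: int) -> None:
        self.counts[stars] += 1
    def score(self):                          # average rating, the headline number
        n = sum(self.counts.values())
        return sum(s * c for s, c in self.counts.items()) / n if n else None
```

Note that only Type 3 yields an average score with any granularity, which is one of the drivers behind the eventual design choice described later.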
Type 1: ‘Favorites and Flags’
Design: Single-value rating scheme, usually positive
Type 2: ‘This or That’
Design: Positive/negative value: Yes/No, Like/Dislike, Up/Down
Type 3: Rating Schemes
Design: Traditional 1-X rating scheme (1-5 and 1-10 are common)
Some of the concerns identified pre-deployment
• ‘Inside the firewall’ is not ‘outside the firewall’; user behavior might be different
• Scale could be an issue (will there be enough people rating enough content to be meaningful?)
• There is a lag between the time a knowledge asset is accessed and the time a rating can fairly be made. The user may also no longer be logged into the repository at the time the rating could be applied
• When someone watches a short video, they can watch and rate quickly in most cases, as the rating mechanism is often easy and convenient. They have consumed the media asset and are positioned to make a judgment on it. The value of a document is not known until after it has been downloaded and read, and that can take time.
• We could experience cultural resistance when trying to implement content rating:
– Anonymity concerns
– A lack of desire to rate content as poor will likely be evident
– People are not used to rating content inside the firewall
Typical ratings distributions
Outside the firewall, ‘J-curves’ generally exist. The authors of Building Web Reputation Systems did research on ratings across various Yahoo! sites:
• “Eight of these graphs have what is known to reputation system aficionados as J-curves, where the far right point (5 stars) has the very highest count, 4 stars the next, and 1 star a little more than the rest.”
• “A J-curve is considered less-than-ideal for several reasons: the average aggregate scores all clump together between 4.5 to 4.7 and therefore they all display as 4 or 5 stars and are not so useful for visually sorting between options. Also, this sort of curve begs the question: why use a 5-point scale at all? Wouldn't you get the same effect with a simpler thumbs-up/down scale, or maybe even just a super-simple favorite pattern?”
• “If a user sees an object that isn't rated, but they like, they may also rate and/or review, usually giving 5 stars - otherwise why bother - so that others may share in their discovery. People don't think that mediocre objects are worth the bother of seeking out and creating internet ratings”
• “There is one ratings curve not shown here, the U-curve, where 1 and 5 stars are disproportionately selected”
• Product- or service-based sites with either a) tightly knit communities, b) incentivization, or c) huge user groups can also generate U-curves (Amazon.com is often cited as an example)
Typical ratings distributions (cont’d)
• One of the groups evaluated (custom autos) generated a ‘W-curve’. This actually represented a preferred distribution for our deployment, and we later speculated on whether we would achieve it.
• “The biggest difference is most likely that Autos Custom users were rating each other's content. The other sites had users evaluating static, unchanging or feed-based content in which they don't have a vested interest”
• “Looking more closely at how Autos Custom ratings worked and the content was being evaluated showed why 1-stars were given out so often: users were providing feedback to other users in order to get them to change their behavior. Specifically, you would get one star if you 1) didn't upload a picture of your ride, or 2) uploaded a dealer stock photo of your ride”
• “The 5-star ratings were reserved for the best-of-the-best. Two through four stars were actually used to evaluate quality and completeness of the car's profile. Unlike all the sites graphed here, the 5-star scale truly represented a broad sentiment and people worked to improve their scores.”
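As a rough illustration of the J-, U-, and W-shapes discussed above, a 1-5 star histogram could be labelled with a simple heuristic. The thresholds below are our own illustrative guesses, not definitions from the book:

```python
def curve_shape(counts):
    """Label a 1-5 star histogram as J-, U-, or W-shaped.

    counts[0]..counts[4] are the tallies for 1..5 stars; the thresholds
    here are illustrative guesses, not published definitions.
    """
    c1, c2, c3, c4, c5 = counts
    low_end_high = c1 >= 0.5 * c5          # 1-star pile comparable to 5-star pile
    mid_bump = c3 > c2 and c3 > c4         # local peak at 3 stars
    if low_end_high and mid_bump:
        return "W"                         # high at both ends plus a middle bump
    if low_end_high and c1 > c2:
        return "U"                         # both extremes disproportionately chosen
    if c5 >= c4 and c4 > (c2 + c3) / 2 and c1 > c2:
        return "J"                         # 5 highest, 4 next, slight lift at 1
    return "other"
```

A real analysis would fit the histogram more carefully; this only sketches how the three shapes differ.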
Deciding on a content rating design
Based on the W-distribution example, we asked some questions to determine whether a 1-5 rating scheme would work and whether we could get the desired W-curve.

Q: Do users update and try to improve their content over time?
A: No; once they contribute, that’s it.

Q: Would users rate other people’s content negatively? Would they expect a change to be made to the underlying deliverable?
A: We don’t expect many users to rate content negatively. If users did, it could be because the content didn’t meet their needs for a given situation, not necessarily because it is poor content.

Q: Do users have a vested interest in rating something (what’s in it for them)?
A: It is not clear what the value proposition of rating is. Rating benefits the broader population, but there is no immediate incentive for the rater.

Q: Is there a tight-knit group who will populate content ratings, or a sense of ‘I should rate this content’?
A: No. We do have a reasonably large number of users, but they have limited time.

Q: Is our organization culturally disposed toward positive ratings only?
A: In my experience, yes.

Ultimately, we decided to custom-develop a 1-5 rating scheme (Type 3). Other identified drivers also drove this decision.
Deploying a rating system
• The business drivers for implementing a 1-5 rating scheme:
– A simple, familiar model to rate published content
– Granularity: the ability to get an average score and promote/remove content as needed
– Alignment with SharePoint 2010 (our future platform) reduced disruption for users when we moved
• Resource constraints meant deferral of some functionality to future releases. For the first release:
– A single classification of knowledge asset: published content (other types would follow later)
– Rating occurred on the content record only
– No mechanism for comments (even though they often go hand in hand with ratings): concern about moderation team requirements, some risk aversion, and worry that comments would either not be used (people not comfortable) or perhaps be inappropriate in some cases
– We didn’t give explicit guidance on what each of the ratings meant; we just used the ‘1. Not recommended – 5. Highly recommended’ nomenclature. A suggestion to provide explicit descriptions for each rating level was not pursued.
• Marketing and promotion at the launch of rating meant that rating activity was essentially incentivized for the user. This had an impact on usage, as you will see.
Content rating data
Rating events / unique pieces of content rated

[Charts: monthly bars, Jun-10 through Feb-11]
Unique pieces of content rated: 1896, 1746, 423, 501, 550, 1491, 1114, 367, 281
Rating events: 2662, 2227, 465, 550, 601, 1687, 1210, 706, 299

Commentary
• An average of 836 unique pieces of content (about 2.5% of content available) was rated each month.
• Additional rating capabilities were deployed for qualifications in October/November, which potentially raised awareness around rating in general.
• KX has seasonality effects in user visits.
Content conversion

[Chart: monthly bars, Jun-10 through Feb-11]
Ratings as a % of page views: 3.20%, 2.67%, 0.56%, 0.59%, 0.65%, 1.95%, 1.72%, 0.77%, 0.32%

Commentary
• This graph normalizes the absolute number of ratings to the page views for each item. The long-term average is around 1-1.5%.
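The conversion metric is simply ratings expressed as a percentage of page views for the period; a trivial sketch (the function name is our own):

```python
def conversion_rate(ratings: int, page_views: int) -> float:
    """Ratings applied as a percentage of page views in a period."""
    return 100.0 * ratings / page_views if page_views else 0.0

# The long-term average of ~1-1.5% corresponds to roughly
# 10-15 ratings per 1,000 page views.
```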
Ratings per user / monthly new users

[Charts: monthly bars, Jun-10 through Feb-11]
Ratings per user: 2.82, 2.99, 1.95, 2.28, 2.55, 3.01, 3.28, 2.87, 1.42
New raters per month: 944, 591, 143, 116, 134, 386, 198, 109, 87

Commentary
• An average of around 2.75 ratings was applied by each user.
• An average of around 270 new raters joined each month, although heavily skewed by the first two months.
• There are repeat raters using the system. The current new-rater run rate is around 100 users a month.
Average score and rating distribution

[Charts: monthly bars, FY11 Jun through FY11 Feb]
% of ratings that were 4 or 5 stars: 72.3%, 74.7%, 73.4%, 61.6%, 60.7%, 72.9%, 72.0%, 88.8%, 76.6%
Average rating: 3.99, 3.99, 3.98, 3.97, 3.95, 3.94, 3.94, 3.99, 3.99

Commentary
• There is some fluctuation in the 4 and 5 ratings, but the long-term average is 73%.
• The average rating is extremely steady and has been from month 1.
Cumulative table of results

Category                    | 1 month | 6 months | All data (10 months)
Total ratings               | 3102    | 9361     | 10796
Unique content items rated  | 2268    | 5960     | 6498
Unique raters               | 1020    | 2504     | 2843
Average rating              | 3.99    | 3.99     | 3.99
% ratings 4 or 5            | 70%     | 72.1%    | 73.5%
2 or more ratings applied   | 1.5%    | 3.7%     | 4.2%
5 or more ratings applied   | 0.3%    | 1%       | 1.3%

[Chart: content rating distribution (all data), 1-5 stars, y-axis 0-40%; per-star values not recoverable]
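As a sketch (the record shape and function name are our own, not the Portal's), the headline numbers in the cumulative table can be derived from raw rating events like so:

```python
from collections import Counter

def summarize(events):
    """events: list of (user_id, content_id, stars) tuples (hypothetical shape).

    Returns aggregates like those in the cumulative results table.
    """
    stars = [s for _, _, s in events]
    per_item = Counter(c for _, c, _ in events)   # ratings per content item
    n = len(stars)
    return {
        "total_ratings": n,
        "unique_content_items": len(per_item),
        "unique_raters": len({u for u, _, _ in events}),
        "average_rating": round(sum(stars) / n, 2) if n else None,
        "pct_4_or_5": round(100.0 * sum(1 for s in stars if s >= 4) / n, 1) if n else None,
        # share of rated items that attracted 2+ ratings
        "pct_items_2_plus": round(
            100.0 * sum(1 for v in per_item.values() if v >= 2) / len(per_item), 1
        ) if per_item else None,
    }
```

Note how few items attract multiple ratings in the real data (4.2% with 2+, 1.3% with 5+), which limits how much an average score can be trusted for any single item.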
What did we learn from the experience?
What did we learn from the experience? Can content rating happen behind the firewall effectively?
• You can build a simple custom content rating system and it will get some meaningful use: over the last ten months, 10,000+ rating events following a fairly typical J-curve distribution, with ~70% of ratings a 4 or a 5.
• There are no real benchmarks for ‘success’. The project set a target that 1-3% of viewed content would be rated. Note that if 1-2% of our monthly unique visitors rate content, that equates to 2,500 total ratings a year. (On YouTube, rating by 0.1-0.5% of viewers is common; does sign-in-to-rate have an impact?)
• The value of knowledge assets can be situational: “one man's trash is another man's treasure”. Without a comments system, it is difficult to understand why something is rated a certain score.
• We feel that our users are predisposed to rate a lot of content 3-4. They get the concept of best in class. We have firm-wide methodologies that are broadly used; users would equate those with best in class / 5 stars.
• We experienced excessive rating and, of course, self-rating.
• There is still some level of fear that if users rate something a 1, the document author will find out, to the extent that we addressed this in the FAQs for the system.
• Incentivization had an impact. We require more data to see exactly how much.
What else could be done in the future?
• Identification and promotion of high-quality content in a meaningful way. There are some challenges given the sheer volume of content, the distribution of our business, and the interests of individual users.
• A model for removal/archival of ‘low-quality content’. Business rule definition is still outstanding here. The basic idea: if there is compelling evidence (multiple ratings), look to possibly retire the content earlier.
• Additional marketing, promotions, and sponsorship. Incentivization will drive a temporary increase in rating activity, but is not sustainable.
• Demo rating as part of onboarding materials for new hires; set an expectation to rate assets that are used. This should be part of a broader initiative about the value of a knowledge-sharing culture.
• Authored or contributed content appears on your profile; add rating of that content as well.
• Deployment of a DVD-by-mail-style rating model (‘Blockbuster’): essentially, communications asking the user to take the proactive step of rating a consumed asset. Note: this was discussed but de-scoped, as the only options at the time were heavily manual.
Closing thoughts and Q&A
• We still feel there is value in content rating. We experienced both technical and organizational challenges implementing it behind the firewall, but we learned from the experience, and at least the capability is there for users to use.
• Rating is built into common platforms that we use for document management, collaboration, and knowledge sharing. In theory, if rating is pervasive and users see it all the time, they *may* use it more.
• User behavior was generally consistent:
– Generally, more ratings were applied by junior- to mid-level users (time, familiarity with the Portal)
– It was common for users to rate items a ‘4’ or a ‘5’
– We saw long-term usage of around 1% of page views, with around 2% of the total content in the content store rated in any one month
– There was a strong impact when the process was incentivized (not that surprising)
– There is concern about the time it would take to get meaningful ratings on a lot of the content on the Portal
– Outliers are likely ‘power raters’, although they might not be self-directed!
YouTube’s position on rating drove a change for them:
“Seems like when it comes to ratings it's pretty much all or nothing. Great videos prompt action; anything less prompts indifference. Thus, the ratings system is primarily being used as a seal of approval, not as an editorial indicator of what the community thinks about a video. Rating a video joins favoriting and sharing as a way to tell the world that this is something you love.” (YouTube blog, 22 September 2009)
Questions? Feel free to contact [email protected]