Scoring
After candidate generation, another model scores and ranks the generated candidates to select the set of items to display. The recommendation system may have multiple candidate generators that use different sources, such as the following:
- Related items from a matrix factorization model.
- User features that account for personalization.
- "Local" versus "distant" items; that is, taking geographic information into account.
- Popular or trending items.
- A social graph; that is, items liked or recommended by friends.
The system combines these different sources into a common pool of candidates that are then scored by a single model and ranked according to that score. For example, the system can train a model to predict the probability of a user watching a video on YouTube given the following:
- query features (for example, user watch history, language, country, time)
- video features (for example, title, tags, video embedding)
The system can then rank the videos in the candidate pool according to the model's predictions.
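As a concrete illustration, here is a minimal sketch of such a scoring model in TensorFlow/Keras. The feature dimensions, layer sizes, and training-data names are assumptions made for the example, not the actual YouTube model:

```python
# A minimal sketch of a watch-probability scoring model, assuming the
# query and video features have already been preprocessed into fixed-size
# numeric vectors. QUERY_DIM, VIDEO_DIM, and the training arrays
# (query_feats, video_feats, watched) are hypothetical.
import tensorflow as tf

QUERY_DIM = 32   # e.g., embedded watch history, language, country, time
VIDEO_DIM = 64   # e.g., embedded title, tags, video embedding

query_in = tf.keras.Input(shape=(QUERY_DIM,), name="query_features")
video_in = tf.keras.Input(shape=(VIDEO_DIM,), name="video_features")

x = tf.keras.layers.Concatenate()([query_in, video_in])
x = tf.keras.layers.Dense(128, activation="relu")(x)
x = tf.keras.layers.Dense(64, activation="relu")(x)
# Sigmoid output: predicted probability that the user watches the video.
p_watch = tf.keras.layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model([query_in, video_in], p_watch)
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit([query_feats, video_feats], watched, ...)  # label: watched or not
```

At serving time, the model is queried once per candidate, and the resulting probabilities serve as the ranking scores.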
Why not let the candidate generator score?
Since candidate generators compute a score (such as the similarity measure in the embedding space), you might be tempted to use them to do ranking as well. However, you should avoid this practice for the following reasons:
- Some systems rely on multiple candidate generators. The scores of these different generators might not be comparable (illustrated in the sketch below).
- With a smaller pool of candidates, the system can afford to use more features and a more complex model that may better capture context.
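The pooling step might look like the following sketch. The generator and model interfaces (`gen.candidates`, `predict_watch_prob`) are hypothetical; the point is that each generator's raw score is discarded, because a cosine similarity in [-1, 1] from one generator and, say, a raw popularity count from another live on different scales:

```python
# A sketch of pooling candidates from several generators and ranking them
# with one scoring model instead of trusting each generator's own score.
def build_candidate_pool(user, generators):
    pool = {}
    for gen in generators:
        for video, _raw_score in gen.candidates(user):  # raw score discarded
            pool[video.id] = video                      # dedupe across sources
    return list(pool.values())

def rank(user, candidates, scoring_model):
    scored = [(scoring_model.predict_watch_prob(user, v), v) for v in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [video for _score, video in scored]
```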
Choosing an objective function for scoring
As you may remember from Introduction to ML Problem Framing, ML can act like a mischievous genie: it is very happy to learn the objective you provide, but you have to be careful what you wish for. This mischievous quality also applies to recommendation systems. The choice of scoring function can dramatically affect the ranking of items, and ultimately the quality of the recommendations.
Example:
Consider what happens as a result of using each of the following objectives.
Maximize Click Rate
If the scoring function optimizes for clicks, the system may recommend click-bait videos. This scoring function generates clicks but does not make for a good user experience; users' interest may quickly fade.
Maximize Watch Time
If the scoring function optimizes for watch time, the system might recommend very long videos, which might lead to a poor user experience. Note that multiple short watches can be just as good as one long watch.
Increase Diversity and Maximize Session Watch Time
Recommend shorter videos, but ones that are more likely to keep the user engaged.
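The toy comparison below shows how the first two objectives reorder the same candidates. The candidate list and its predicted fields (`p_click`, `expected_watch_minutes`) are made-up numbers, assumed to come from model predictions:

```python
# A toy illustration of how the choice of objective reorders candidates.
candidates = [
    {"id": "clickbait",      "p_click": 0.9, "expected_watch_minutes": 0.5},
    {"id": "feature_film",   "p_click": 0.2, "expected_watch_minutes": 90.0},
    {"id": "short_engaging", "p_click": 0.6, "expected_watch_minutes": 6.0},
]

# Objective 1: maximize clicks -> the click-bait video wins.
by_clicks = max(candidates, key=lambda c: c["p_click"])

# Objective 2: maximize expected watch time -> the very long video wins.
by_watch_time = max(
    candidates, key=lambda c: c["p_click"] * c["expected_watch_minutes"])

print(by_clicks["id"])      # "clickbait"
print(by_watch_time["id"])  # "feature_film"
```

A session-level objective would instead credit the whole chain of watches that a recommendation leads to, which tends to favor the shorter but engaging video.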

Positional bias in scoring
Items that appear lower on the screen are less likely to be clicked than items appearing higher on the screen. However, when scoring videos, the system usually doesn't know where on the screen a link to the video will ultimately appear. Querying the model with all possible positions is too expensive. Even if querying multiple positions were feasible, the system still might not find a consistent ranking across multiple ranking scores.
Solutions
- Create position-independent rankings.
- Rank all the candidates as if they are in the top position on the screen (see the sketch below).
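One common way to implement this, assuming the screen position is logged with each training example, is to train with position as an input feature and then pin that feature to the top position at serving time. The model interface below is hypothetical:

```python
# A sketch of position-independent ranking: the model learns the effect of
# screen position during training, and at serving time every candidate is
# scored with the position feature fixed to the top slot, so position
# cannot distort the comparison between candidates.
TOP_POSITION = 0

def score_for_ranking(model, user_features, candidates):
    scores = []
    for video_features in candidates:
        scores.append(model.predict(user_features, video_features,
                                    position=TOP_POSITION))
    return scores
```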