Content-based filtering
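Content-based filtering uses item features to recommend other items
similar to what the user likes, based on their previous actions or
explicit feedback.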
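To demonstrate content-based filtering, let's hand-engineer some features
for the Google Play store. The following figure shows a feature matrix where
each row represents an app and each column represents a feature. Features
could include categories (such as Education, Casual, Health), the publisher
of the app, and many others. To simplify, assume this feature matrix is binary:
a non-zero value means the app has that feature.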
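The binary feature matrix described above can be sketched in a few lines of Python. The feature names, app labels, and values here are hypothetical and chosen only for illustration; they are not the data from the figure.

```python
# Hypothetical feature columns: three categories followed by three publishers.
FEATURES = ["Education", "Casual", "Health", "Science R Us", "Healthcare", "TimeWastr"]

# Hypothetical apps, each described by the set of features it has.
APP_FEATURES = {
    "educational app": {"Education", "Science R Us"},
    "health app": {"Health", "Healthcare"},
    "casual app": {"Casual", "TimeWastr"},
}


def to_binary_row(active, features=FEATURES):
    """Return a 0/1 list with a 1 in each column whose feature is active."""
    return [1 if name in active else 0 for name in features]


# One row per app, one column per feature.
feature_matrix = {app: to_binary_row(active) for app, active in APP_FEATURES.items()}

for app, row in feature_matrix.items():
    print(f"{app:16} {row}")
```

Storing each app as a set of feature names and deriving the 0/1 row from it keeps the column order in one place (`FEATURES`), which is one reasonable design among several.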
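You can also represent the user in the same feature space. Some of the
user-related features could be explicitly provided by the user. For example,
a user selects "Entertainment apps" in their profile. Other features can be
implicit, based on the apps they have previously installed. For example, the
user installed another app published by Science R Us.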
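One way to combine explicit and implicit signals into a single user vector is sketched below. The helper `build_user_vector`, the feature list, and the rule that an installed app contributes its category and publisher are all assumptions made for this example.

```python
# Hypothetical feature columns shared by apps and users.
FEATURES = ["Entertainment", "Education", "Health", "Science R Us"]


def build_user_vector(explicit_choices, installed_apps, features=FEATURES):
    """Build a binary user vector from explicit choices and install history.

    Assumption: every installed app marks its category and its publisher
    as active features for the user.
    """
    active = set(explicit_choices)
    for app in installed_apps:
        active.add(app["category"])
        active.add(app["publisher"])
    return [1 if name in active else 0 for name in features]


user_vector = build_user_vector(
    explicit_choices=["Entertainment"],  # explicit: chosen in the profile
    installed_apps=[{"category": "Education", "publisher": "Science R Us"}],  # implicit
)
print(user_vector)  # [1, 1, 0, 1]
```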
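The model should recommend items relevant to this user. To do so, you must
first pick a similarity metric (for example, dot product). Then, you must
set up the system to score each candidate item according to this similarity
metric. Note that the recommendations are specific to this user, as the model
did not use any information about other users.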
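The two steps, picking a similarity metric and scoring every candidate with it, might look like the following minimal sketch. The function names and the vectors are hypothetical; note that only this one user's vector enters the computation.

```python
def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def rank_candidates(user, candidates, similarity=dot):
    """Score each candidate against the user and sort by descending score."""
    scored = [(similarity(user, vec), name) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical columns: Education, Casual, Health, Science R Us
user = [1, 0, 1, 1]
candidates = {
    "educational app by Science R Us": [1, 0, 0, 1],
    "health app by Healthcare": [0, 0, 1, 0],
    "casual app by TimeWastr": [0, 1, 0, 0],
}

ranking = rank_candidates(user, candidates)
for score, name in ranking:
    print(score, name)
```

Passing the metric as the `similarity` argument makes it easy to swap the dot product for another measure without touching the ranking logic.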
Using dot product as a similarity measure
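Consider the case where the user embedding \(x\) and the app
embedding \(y\) are both binary vectors. Since
\(\langle x, y \rangle = \sum_{i = 1}^d x_i y_i\), a
feature appearing in both \(x\) and \(y\) contributes a 1 to
the sum. In other words, \(\langle x, y \rangle\) is the number
of features that are active in both vectors simultaneously. A high
dot product then indicates more common features, and thus a higher similarity.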
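As a small worked illustration, take two made-up binary vectors with \(d = 4\) (these values are assumptions for the example, not the figure's data):

\[
x = (1, 0, 1, 1), \qquad y = (1, 0, 0, 1)
\]
\[
\langle x, y \rangle = 1 \cdot 1 + 0 \cdot 0 + 1 \cdot 0 + 1 \cdot 1 = 2
\]

Only the first and fourth features are active in both vectors, so exactly two terms of the sum are nonzero.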
Try it yourself!
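Calculate the dot product for each app in the preceding example.
Then use that information to answer the question below:
Which app should we recommend?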
The educational app created by Science R Us.
You are correct! This item has the highest dot product at 2.
Our user really likes science and educational apps.
The health app created by Healthcare.
This app scores a 1. It isn't the worst recommendation our system
could make, but it certainly isn't the best.
The casual app created by TimeWastr.
This app actually has the lowest dot product at 0. Our
user isn't interested in casual apps like games.
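Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-07-26 UTC.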