How we fought Search spam on Google - Webspam Report 2019
Tuesday, June 9, 2020
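Every search matters. That is why, whenever you come to Google Search to find relevant and useful information, we want to make sure you receive the highest quality results possible. This is our ongoing commitment.
Unfortunately, the web contains disruptive behaviors and content that we call "webspam", which degrade the experience for people trying to find helpful information. We have a number of teams working to keep webspam out of search results, and staying ahead of spammers is a constant challenge. At the same time, we continue to engage with webmasters to help them follow best practices and find success on Search, so that great content stays available on the open web.
Looking back at last year, here is a snapshot of how we fought spam on Search in 2019 and how we supported the webmaster community.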
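Fighting spam at scale
With hundreds of billions of webpages in our index and billions of queries served every day, it is perhaps not surprising that bad actors keep trying to manipulate search ranking. In fact, we observed that more than 25 billion of the pages we discover each day are spammy. That number shows the scale and persistence of spammers and the lengths they are willing to go to. We are very serious about keeping your chance of encountering spammy pages in Search as small as possible. Our efforts have helped ensure that more than 99% of visits from our results lead to experiences without spam.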
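Updates from last year
In 2018, we reported that we had reduced user-generated spam by 80%, and we are happy to confirm that this type of abuse did not grow in 2019. Link spam continued to be a popular form of spam, but our team succeeded in containing its impact in 2019. Our systems caught more than 90% of link spam, and techniques such as paid links and link exchanges have become less effective.
Hacked spam, while still common and hard to resolve, was more stable than in previous years. We continued to work on solutions to better detect it, notify affected webmasters and platforms quickly, and help them recover hacked websites.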
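Spam trends
One of our top priorities in 2019 was improving our spam-fighting capabilities through machine learning systems. Our machine learning solutions, combined with our proven and time-tested manual enforcement capability, have been instrumental in identifying spammy results and preventing them from being served to users.
In the last few years, we have observed an increase in spammy sites with auto-generated and scraped content whose behaviors annoy or harm searchers, such as fake buttons, overwhelming ads, suspicious redirects, and malware. These websites are often deceptive and offer no real value to people. In 2019, we reduced the impact of this type of spam on Search users by more than 60% compared to 2018.
As we improve our capability and efficiency in catching spam, we also keep investing in reducing broader types of harm, such as scams and fraud. These sites trick people into thinking they are visiting an official or authoritative site, and in many cases people end up disclosing sensitive personal information, losing money, or infecting their devices with malware. We have been paying close attention to queries that are prone to scams and fraud, and we have worked to stay ahead of spam tactics in those spaces to protect users.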
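Working with webmasters and developers for a better web
Much of our work against spam relies on automated systems that detect spammy behavior, but those systems are not perfect and cannot catch everything. As someone who uses Search, you can also help us fight spam and other issues by reporting search spam, phishing, or malware. We received nearly 230,000 reports of search spam in 2019 and were able to take action on 82% of the reports we processed. We appreciate every report you sent us and your help in keeping search results clean!
So what do we do when we get those reports or identify that something is not quite right? An important part of our work is notifying webmasters when we detect something wrong with their website. In 2019, we generated more than 90 million messages to website owners to let them know about issues that may affect their site's appearance in Search results and about improvements they could implement. Of those messages, about 4.3 million were related to manual actions resulting from violations of our Webmaster Guidelines.
We are also always looking for ways to better help site owners. Many initiatives in 2019 aimed at improving communication, such as the new Search Console messages, Site Kit for WordPress sites, and Auto-DNS verification in the new Search Console. We hope these initiatives have given webmasters more convenient ways to get their sites verified and that they will continue to be helpful. We also hope they provide quicker access to news and help webmasters fix webspam or hacking issues more effectively and efficiently.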
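While we focused deeply on cleaning up spam, we also did not forget to keep up with the evolution of the web, and we rethought how we wanted to treat "nofollow" links. Originally introduced as a means to help fight comment spam and annotate sponsored links, the "nofollow" attribute has come a long way. But we are not stopping there. We believe it is time for it to evolve further, just as our spam-fighting capability has evolved. We introduced two new link attributes, rel="sponsored" and rel="ugc", which give webmasters additional ways to tell Google Search about the nature of particular links. Along with rel="nofollow", we began treating these attributes as hints to incorporate for ranking purposes. We are very excited to see that the new rel attributes were well received and adopted by webmasters around the world!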
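For site owners who want to check how their own pages annotate links, the rel values described above can be read straight from the markup. The following is a minimal sketch using only Python's standard library; the class name `RelAuditor`, the helper `audit_links`, and the sample URLs are hypothetical and chosen for illustration. It only lists the rel tokens found on each link and says nothing about how Google Search weighs these hints internally.

```python
from html.parser import HTMLParser


class RelAuditor(HTMLParser):
    """Collects (href, rel tokens) for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel holds a space-separated list of tokens; normalize to lowercase.
        rel_tokens = set((attr_map.get("rel") or "").lower().split())
        self.links.append((href, rel_tokens))


def audit_links(html):
    """Return a list of (href, set of rel tokens) found in the given HTML."""
    parser = RelAuditor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup: a paid placement, a user-submitted link,
# and an ordinary editorial link with no rel hint.
SAMPLE = """
<a href="https://example.com/ad" rel="sponsored">Partner offer</a>
<a href="https://example.com/comment-link" rel="ugc nofollow">Commenter's site</a>
<a href="https://example.com/docs">Documentation</a>
"""

for href, rel in audit_links(SAMPLE):
    print(href, sorted(rel) or "(no rel hints)")
```

Storing the tokens as a set reflects that a single link can carry more than one value, for example rel="ugc nofollow" on a link posted in a comment section.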
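As always, we are grateful for all the opportunities we had last year to connect with webmasters around the world, helping them improve their presence in Search and hearing their feedback. We delivered more than 150 online office hours, online events, and offline events in many cities across the globe to a wide range of audiences, including SEOs, developers, online marketers, and business owners. Among those events, we were delighted by the momentum behind our Webmaster Conferences, held in 35 locations across 15 countries and in 12 languages, including the first Product Summit version in Mountain View. While we are not currently able to host in-person events, we look forward to more of these events and virtual touchpoints in the future.
Webmasters continued to find solutions and tips in our Webmasters Help Community, which saw more than 30,000 threads in 2019 in more than a dozen languages. On YouTube, we launched #AskGoogleWebmasters as well as series such as SEO Mythbusting to make sure your questions get answered.
We know that our journey to a better web with you is ongoing, and we would love to continue it with you in the year to come. So keep in touch on Twitter, YouTube, the blog, or the Help Community, or join us in person at one of our conferences near you!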
Posted by Cherry Prommawin, Search Relations, and Duy Nguyen, Search Quality Analyst