{"id":139914,"date":"2025-12-26T10:02:44","date_gmt":"2025-12-26T18:02:44","guid":{"rendered":"https:\/\/xira.com\/p\/2025\/12\/26\/google-built-its-empire-scraping-the-web-now-its-suing-to-stop-others-from-scraping-google\/"},"modified":"2025-12-26T10:02:44","modified_gmt":"2025-12-26T18:02:44","slug":"google-built-its-empire-scraping-the-web-now-its-suing-to-stop-others-from-scraping-google","status":"publish","type":"post","link":"https:\/\/xira.com\/p\/2025\/12\/26\/google-built-its-empire-scraping-the-web-now-its-suing-to-stop-others-from-scraping-google\/","title":{"rendered":"Google Built Its Empire Scraping The Web. Now It\u2019s Suing To Stop Others From Scraping Google."},"content":{"rendered":"<p>Last week,\u00a0<a href=\"https:\/\/storage.courtlistener.com\/recap\/gov.uscourts.cand.461513\/gov.uscourts.cand.461513.1.0.pdf\" rel=\"nofollow noopener\" target=\"_blank\">Google filed suit<\/a>\u00a0against SerpApi, a scraping company that helps businesses pull data from Google search results. The lawsuit claims SerpApi violated DMCA Section 1201 by circumventing Google\u2019s \u201ctechnological protection measures\u201d to access search results\u2014and the copyrighted content within them\u2014without permission.<\/p>\n<p>There\u2019s just one problem with this theory: Google built its entire business on scraping the web without asking permission first. And now it wants to use one of the most abused provisions in copyright law to stop others from doing something functionally similar to what made Google a tech giant in the first place.<\/p>\n<p>The lawsuit comes on the heels of Reddit\u2019s\u00a0<a href=\"https:\/\/www.techdirt.com\/2025\/10\/24\/reddits-ai-scraping-lawsuit-is-an-attack-on-the-open-internet\/\" rel=\"nofollow noopener\" target=\"_blank\">equally problematic anti-scraping suit from October<\/a>\u2014which we called an attack on the open internet. 
Reddit sued Perplexity and various scraping firms (including SerpApi), claiming they violated 1201 by circumventing\u2026 Google\u2019s technological protections. Reddit was mad it had cut a multi-million dollar licensing deal with Google for access to Reddit content, and these firms were routing around both that deal and Google itself to provide similar results to users. The legal theory was bizarre: Reddit didn\u2019t own the copyright on user posts, and the scrapers weren\u2019t even touching Reddit directly\u2014yet Reddit claimed standing to sue based on circumventing someone else\u2019s TPMs.<\/p>\n<p>So now, Google has filed its own, similar lawsuit, going after SerpApi directly, focused on how SerpApi gets around its attempts to block such scraping. Google released\u00a0<a href=\"https:\/\/blog.google\/technology\/safety-security\/serpapi-lawsuit\/\" rel=\"nofollow noopener\" target=\"_blank\">a blog post defending this lawsuit<\/a>:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><em>We<\/em>\u00a0<a href=\"https:\/\/storage.googleapis.com\/gweb-uniblog-publish-prod\/documents\/Google_v._SerpApi__Complaint.pdf\" rel=\"nofollow noopener\" target=\"_blank\"><em>filed a suit<\/em><\/a>\u00a0<em>today against the scraping company SerpApi for circumventing security measures protecting others\u2019 copyrighted content that appears in Google search results. We did this to ask a court to stop SerpApi\u2019s bots and their malicious scraping, which violates the choices of websites and rightsholders about who should have access to their content. 
This lawsuit follows<\/em>\u00a0<a href=\"https:\/\/www.nytimes.com\/2025\/10\/22\/technology\/reddit-data-scrapers-perplexity-theft.html\" rel=\"nofollow noopener\" target=\"_blank\"><em>legal action<\/em><\/a>\u00a0<em>that other websites have taken against SerpApi and similar scraping companies, and is part of our long track record of affirmative litigation to<\/em>\u00a0<a href=\"https:\/\/blog.google\/outreach-initiatives\/public-policy\/legal-action-and-legislation-fight-scammers\/\" rel=\"nofollow noopener\" target=\"_blank\"><em>fight scammers<\/em><\/a>\u00a0<em>and<\/em>\u00a0<a href=\"https:\/\/blog.google\/outreach-initiatives\/public-policy\/taking-legal-action-to-protect-users-of-ai-and-small-businesses\/\" rel=\"nofollow noopener\" target=\"_blank\"><em>bad actors<\/em><\/a>\u00a0<em>on the web.<\/em><\/p>\n<p><em>Google follows industry-standard crawling protocols, and honors websites\u2019 directives over crawling of their content. Stealthy scrapers like SerpApi override those directives and give sites no choice at all. SerpApi uses shady back doors \u2014 like cloaking themselves, bombarding websites with massive networks of bots and giving their crawlers fake and constantly changing names \u2014 circumventing our security measures to take websites\u2019 content wholesale. This unlawful activity has increased dramatically over the past year.<\/em><\/p>\n<p><em>SerpApi deceptively takes content that Google licenses from others (like images that appear in Knowledge Panels, real-time data in Search features and much more), and then resells it for a fee. In doing so, it willfully disregards the rights and directives of websites and providers whose content appears in Search.<\/em><\/p>\n<\/blockquote>\n<p>Look, SerpApi\u2019s behavior is sketchy. Spoofing user agents, rotating IPs to look like legitimate users, solving CAPTCHAs programmatically\u2014Google\u2019s complaint paints a picture of a company actively working to evade detection. 
But the legal theory Google is deploying to stop them threatens something far bigger than one shady scraper.<\/p>\n<p>Google\u2019s\u00a0<em>entire business<\/em>\u00a0is built on scraping as much of the web as possible without first asking permission. The fact that they now want to invoke DMCA 1201\u2014one of the most consistently abused provisions in copyright law\u2014to stop others from scraping them exposes the underlying problem with these licensing-era arguments: they\u2019re attempts to pull up the ladder after you\u2019ve climbed it.<\/p>\n<p>Just from a straight-up perception standpoint, it\u00a0<em>looks<\/em>\u00a0bad.<\/p>\n<p>To be clear: this isn\u2019t about defending SerpApi. They appear to be bad actors who built a business on evading detection systems. The problem is that Google chose to go after them using a legal weapon with a long history of collateral damage. When you invoke Section 1201 against web scraping, you\u2019re not just targeting one sketchy company\u2014you\u2019re potentially rewriting the rules for how the entire open web functions. The choice of weapon matters, especially when that weapon has been repeatedly abused to stifle legitimate competition and could now be turned against the very openness that made the modern internet possible.<\/p>\n<p>For many years, we\u2019ve discussed the many, many problems of\u00a0<a href=\"https:\/\/www.techdirt.com\/tag\/dmca-1201\/\" rel=\"nofollow noopener\" target=\"_blank\">DMCA Section 1201<\/a>. 
It\u2019s the \u201canti-circumvention\u201d part of the law, which says that any attempt to get around a \u201ctechnological protection measure\u201d (or even just telling someone else how to get around one) could be deemed to violate the law, even if the TPMs in question were wholly ineffective, and even if the intent in getting around the TPM had nothing to do with copyright infringement.<\/p>\n<p>That has led to years of abusive practices by companies that put silly, pointless \u201cTPMs\u201d in place just so they could use the law to limit competition. There were lawsuits over\u00a0<a href=\"https:\/\/www.techdirt.com\/2005\/02\/21\/lexmark-slapped-down-again-in-dmca-suit\/\" rel=\"nofollow noopener\" target=\"_blank\">printer ink cartridges<\/a>\u00a0and\u00a0<a href=\"https:\/\/www.techdirt.com\/2004\/09\/01\/court-opens-garage-doors-but-sets-murky-precedent\/\" rel=\"nofollow noopener\" target=\"_blank\">garage door openers<\/a>, among other things.<\/p>\n<p>Here, Google is saying that it put in place a TPM in January of 2025 called \u201cSearchGuard\u201d (which sounds like an advanced CAPTCHA of some sort) to prevent SerpApi from scraping its search results, but SerpApi figured out a way around it:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><em>When SearchGuard launched in January 2025, it effectively blocked SerpApi from accessing Google\u2019s Search results and the copyrighted content of Google\u2019s partners. But SerpApi immediately began working on a means to circumvent Google\u2019s technological protection measure. SerpApi quickly discovered means to do so and deployed them.<\/em><\/p>\n<p><em>SerpApi\u2019s answer to SearchGuard is to mask the hundreds of millions of automated queries it is sending to Google each day to make them appear as if they are coming from human users. 
SerpApi\u2019s founder recently described the process as \u201ccreating fake browsers using a multitude of IP addresses that Google sees as normal users.\u201d<\/em><\/p>\n<p><em>SerpApi\u2019s fakery takes many forms. For example, when SerpApi submits an automated query to Google and SearchGuard responds with a challenge, SerpApi may misrepresent the device, software, or location from which the query is sent in order to solve the challenge and obtain authorization to submit queries. Additionally or alternatively, SerpApi may solve SearchGuard\u2019s challenge with a \u201clegitimate\u201d request and then syndicate the resulting authorization, that is, share it with unauthorized machines around the world, to enable their \u201cfake browsers\u201d to generate automated queries that appear to Google as authorized. It also uses automated means to bypass CAPTCHAs, another aspect of SearchGuard that tests users to ensure they are humans rather than machines.<\/em><\/p>\n<\/blockquote>\n<p>Getting around these protections eats up Google\u2019s resources, and sure, that must be annoying for Google. But the real motivation shows up when Google gets to the economics of the situation. Google has started cutting licensing deals with content partners\u2014most notably the multi-million dollar Reddit deal\u2014and now those partners are pissed that SerpApi lets others access similar data without paying anyone:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><em>For Google, SerpApi\u2019s automated scraping not only consumes substantial computing resources without payment, but also disrupts Google\u2019s content partnerships. Google licenses content so that it can enhance the Search results it provides to users and thereby boost its competitive standing. 
SerpApi undermines Google\u2019s substantial investment in those licenses, making the content available to other services that need not incur similar costs.<\/em><\/p>\n<p><em>SerpApi\u2019s scraping of Google Search results also impacts the rights holders who license content to Google. Without permission or compensation, SerpApi takes their content from Google and widely distributes it for use by third parties. That, in turn, threatens to disrupt Google\u2019s relationship with the rights holders who look to Google to prevent the misappropriation of the content Google displays. At least one Google content partner, Reddit, has already sued SerpApi for its misconduct.<\/em><\/p>\n<\/blockquote>\n<p>This is where the 1201 theory becomes genuinely dangerous. Google\u2019s argument, if accepted, provides a roadmap for any website operator who wants to lock down their content: slap on a trivial TPM\u2014a CAPTCHA, an IP check, anything\u2014and suddenly you can invoke federal law against anyone who figures out how to get around it, even if their purpose has nothing to do with copyright infringement.<\/p>\n<p>The implications spiral outward quickly. If Google succeeds here, what stops every major website from deciding they want licensing revenue from the largest scrapers? Cloudflare could put bot detection on the huge swath of the internet it serves and demand Google pay up. WordPress could do the same across its massive network. The open web\u2014built on the assumption that published content is publicly accessible for indexing and analysis\u2014becomes a patchwork of licensing requirements, each enforced through 1201 threats.<\/p>\n<p>That doesn\u2019t seem good for the prospects of a continued open web.<\/p>\n<p>Google\u2019s legal theory has another significant problem: the requirement that a TPM must \u201ceffectively control\u201d access. 
Just last week, a court\u00a0<a href=\"https:\/\/blog.ericgoldman.org\/archives\/2025\/12\/are-robots-txt-instructions-legally-binding-ziff-davis-v-openai.htm\" rel=\"nofollow noopener\" target=\"_blank\">rejected<\/a>\u00a0Ziff Davis\u2019s attempt to turn robots.txt into a 1201 violation when OpenAI allegedly ignored its crawling restrictions. The court\u2019s reasoning is directly applicable here:<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><em>Robots.txt files instructing web crawlers to refrain from scraping certain content do not \u201ceffectively control\u201d access to that content any more than a sign requesting that visitors \u201ckeep off the grass\u201d effectively controls access to a lawn. On Ziff Davis\u2019s own telling, robots.txt directives are merely requests and do not effectively control access to copyrighted works. A web crawler need not \u201cappl[y] . . . information, or a process or a treatment,\u201d in order to gain access to web content on pages that include robots.txt directives; it may access the content without taking any affirmative step other than impertinently disregarding the request embodied in the robots.txt files. The FAC therefore fails to allege that robots.txt files are a \u201ctechnological measure that effectively controls access\u201d to Ziff Davis\u2019s copyrighted works, and the DMCA section 1201(a) claim fails for this reason.<\/em><\/p>\n<\/blockquote>\n<p>Google will argue SearchGuard is different\u2014it\u2019s more than a polite request, it actively challenges and blocks scrapers. But if SerpApi can routinely bypass it by spoofing browsers and rotating IPs, does it really \u201ceffectively control\u201d access? 
Or is it just a slightly more sophisticated \u201ckeep off the grass\u201d sign that determined actors can ignore?<\/p>\n<p>This question matters enormously because it determines whether the statute that was supposed to prevent piracy of CDs and DVDs now also governs every attempt to access publicly-available web pages through automated means.<\/p>\n<p>For decades, we\u2019ve operated under a system where robots.txt represented a voluntary, good-faith approach to web crawling. The major players respected these directives not because they had to, but because maintaining that norm benefited everyone. That system is breaking down, not because of SerpApi, but because of the rise of scrapers focused on LLM training, mixed with other companies wanting to find licensing deals to get a cut of the money flows. Reddit and Google negotiating licensing deals over open web content was a warning sign of all of this, and now it\u2019s spilling out into the courts with questionable 1201 claims.<\/p>\n<p>Both Reddit and Google frame this as protecting the open internet from bad actors. But pulling up the ladder after you\u2019ve climbed it isn\u2019t protection\u2014it\u2019s rent-seeking. Google built an empire on the assumption that publicly accessible web content could be freely scraped and indexed. Now it wants to rewrite the rules\u2026 using Hollywood\u2019s favorite tool to block access to information.<\/p>\n<p>The real problem isn\u2019t that Google is fighting back against SerpApi\u2019s evasive tactics. It\u2019s that they chose to fight using a legal weapon that, if successful, fundamentally changes how we understand access to the open web. Section 1201 has already been wildly abused to stifle competition in everything from printer cartridges to garage door openers. 
Extending it to cover basic web scraping because SerpApi seems sketchy threatens the foundational assumption that published web content is accessible for indexing, research, and analysis.<\/p>\n<p>Google has the resources to solve this problem through better engineering or by raising the actual cost of evasion high enough that SerpApi\u2019s business model fails. Instead, they\u2019ve opted for a legal shortcut that, if it works, will reshape the internet in ways that go far beyond one sketchy scraping company.<\/p>\n<p>The internet is changing, and legitimate questions exist about how web scraping should function in an era of large language models and AI training. 
But those questions won\u2019t be answered well by stretching copyright law to cover something it was never designed for, and empowering every website operator to demand licensing fees simply by putting up a CAPTCHA.<\/p>\n<p>That\u2019s not protecting the open web. That\u2019s closing it.<\/p>\n<p><a href=\"https:\/\/embed.documentcloud.org\/documents\/26438560-google-v-serpapi-complaint\/?embed=1\" rel=\"nofollow noopener\" target=\"_blank\">Click here to see the docs<\/a>.<\/p>\n<p><a href=\"https:\/\/www.techdirt.com\/2025\/12\/24\/google-built-its-empire-scraping-the-web-now-its-suing-to-stop-others-from-scraping-google\/\" rel=\"nofollow noopener\" target=\"_blank\">Google Built Its Empire Scraping The Web. Now It\u2019s Suing To Stop Others From Scraping Google<\/a><\/p>\n<p>The post <a 
href=\"https:\/\/abovethelaw.com\/2025\/12\/google-built-its-empire-scraping-the-web-now-its-suing-to-stop-others-from-scraping-google\/\" rel=\"nofollow noopener\" target=\"_blank\">Google Built Its Empire Scraping The Web. Now It\u2019s Suing To Stop Others From Scraping Google.<\/a> appeared first on <a href=\"https:\/\/abovethelaw.com\/\" rel=\"nofollow noopener\" target=\"_blank\">Above the Law<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last week,\u00a0Google filed suit\u00a0against SerpApi, a scraping company that helps businesses pull data from Google search results. The lawsuit claims SerpApi violated DMCA Section 1201 by circumventing Google\u2019s \u201ctechnological protection measures\u201d to access search results\u2014and the copyrighted content within them\u2014without permission. There\u2019s just one problem with this theory: Google built its entire business on scraping [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[16],"tags":[],"class_list":["post-139914","post","type-post","status-publish","format-standard","hentry","category-above_the_law"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/xira.com\/p\/wp-json\/wp\/v2\/posts\/139914","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/xira.com\/p\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/xira.com\/p\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/xira.com\/p\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/xira.com\/p\/wp-json\/wp\/v2\/comments?post=139914"}],"version-history":[{"count":0,"href":"https:\/\/xira.com\/p\/wp-json\/wp\/v2\/posts\/139914\/revisions"}],"wp:attachment"
:[{"href":"https:\/\/xira.com\/p\/wp-json\/wp\/v2\/media?parent=139914"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/xira.com\/p\/wp-json\/wp\/v2\/categories?post=139914"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/xira.com\/p\/wp-json\/wp\/v2\/tags?post=139914"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}