Case Name: Getty Images (US), Inc. and others v Stability AI Ltd
Citation: [2025] EWHC 2863 (Ch)
Court: High Court of England and Wales
Coram: Mrs Justice Joanna Smith DBE
Introduction
The emergence of Artificial Intelligence (AI) has posed a new and complex question for the world, including the courts: in the era of machine learning, as distinct from replication, what does it mean to “copy”? This tricky question has been answered to a certain extent in the recent case of Getty Images v. Stability AI (2025), the United Kingdom’s first major judicial engagement with generative AI technology. Getty alleged that Stability AI’s Stable Diffusion, a generative image model, is functionally a repository of millions of copyright-protected photographs internalised through the vast LAION dataset. The case posed a question for the Court: are the parameters of an AI system, developed through exposure to copyrighted content, themselves articles capable of infringing the rights of the underlying works’ owners? By refusing to treat machine learning as equivalent to reproduction, the Court declined to make that doctrinal leap.

Copyright law, after all, is built on the notion of expression: fixed, material, human-authored expression. Machine learning, in contrast, is built on abstraction. It converts images into mathematics: relationships and patterns, angles and probabilities. The issue in Getty was not copying through training, since the training claim had been dismissed for jurisdictional reasons. Instead, the question was whether the product of the training process, namely the learned model weights, could themselves be regarded as an infringing “article.” This framing places the judgment at the intersection of doctrinal reading, technological comprehension and policy choice.
Factual Background
The defendant, Stability AI, built and developed Stable Diffusion, a generative model trained on subsets of the LAION-5B dataset, a vast collection of images scraped from publicly accessible websites. Among these were thousands of Getty Images-owned photographs, many still bearing Getty’s watermark, incorporated into the dataset without a licence. Getty contended that its images were used at scale during training, and that the resulting model carried an imprint of its creative works.
Properly evaluating this dispute requires clarity about how a system such as Stable Diffusion processes data. The model does not store images in any traditional sense. Instead, during training, each image is broken down into mathematical representations, and the system repeatedly adjusts its internal model weights (the numerical parameters that determine how the model interprets visual features). These weights, numbering in the millions or billions, encode statistical patterns “learned” from the training data: how edges tend to meet, how textures correlate with shapes, how colours transition. A weight might capture the likelihood of a curved line appearing with a certain shading, but no weight contains or reproduces any portion of an image. The weights reflect learned relationships, not stored expressions.
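The weights-versus-images distinction can be made concrete with a toy sketch. The following is an illustrative assumption, not Stability AI’s actual architecture: a “model” whose only parameters are aggregate statistics fitted over a batch of images. The learned parameters are orders of magnitude smaller than the training data, and no individual image can be read back out of them.

```python
import numpy as np

# Hypothetical illustration (not Stable Diffusion): a toy "model" whose
# weights are nothing more than summary statistics of the training images.
rng = np.random.default_rng(0)
train_images = rng.random((1000, 8, 8))   # 1,000 tiny 8x8 "photographs"

# "Training": fit aggregate statistics across the whole dataset.
weights = {
    "mean": train_images.mean(axis=0),      # average pixel intensities
    "var": train_images.var(axis=0),        # per-pixel variance
}

# The weights are a tiny fraction of the size of the data they were fit on,
# and no single training image is stored in (or recoverable from) them.
n_weight_values = sum(v.size for v in weights.values())
print(n_weight_values, train_images.size)   # 128 parameters vs 64,000 pixels
```

Real diffusion-model weights encode vastly richer statistical structure than a mean and a variance, but the legal point the Court relied on is the same in kind: the parameters summarise relationships across the corpus rather than fixing any particular work.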
Getty’s position treated the trained model as a transformed artefact whose internal parameters had absorbed the expressive qualities of its photographs, i.e., something beyond a mere code. Stability AI, conversely, characterised the model as a functional system shaped by statistical learning, not a repository containing copyrighted works. This factual tension between the model as a mathematical construct and the model as an alleged transformed embodiment of copyrighted images formed the foundation of the copyright dispute.
Issues
The Court distilled the copyright questions to three primary legal issues:
- Whether Stable Diffusion’s model weights constitute an “article” for the purposes of section 27 of the Copyright, Designs and Patents Act 1988 (CDPA);
- Whether those weights embody an “infringing copy” of Getty’s copyrighted works within sections 22–23 of the CDPA; and
- Whether distributing or making the model available in the UK constitutes secondary infringement.
Court’s Reasoning
The Court approached these questions by returning to first principles. Copyright law distinguishes between an idea and its expression, and between conduct that reproduces the expression and conduct that merely learns from or analyses it. The model weights, in the Court’s view, occupied the latter category.
The analysis begins by describing how Stable Diffusion processes information. During training, images are decomposed into lower-dimensional representations, and the model iteratively adjusts numerical parameters (weights) in response to patterns in the data. These weights do not contain or store pixels. Nor can any original image be reconstructed from them. Instead, they reflect statistical guidance: associations between prompts and visual features. In other words, the model learns “how to draw”, not “what to draw.”
This factual grounding allowed the Court to conclude that the model weights do not constitute an “article” within section 27 because they do not embody any of Getty’s works. The provision, the Court emphasised, presupposes an object in which a copy of the work is fixed. Drawing on Synthon v. SKB (2005), the Court reaffirmed that embodiment must be specific, identifiable and reproducible, not merely inferential or probabilistic. A computer program may be capable of producing images, but that does not make it an article that contains them.
From this, the Court reasoned that the downstream distribution of the model could not amount to secondary infringement because the precondition of having possession or dealing with an infringing copy was never established. The distinction between learning and copying thus became decisive: even if the training process involved unauthorised reproductions (an issue no longer live), the results of that learning are not themselves reproductions.
Underlying the Court’s reasoning is an interpretive caution. The judgment signals an unwillingness to stretch copyright law into spaces not contemplated by Parliament, especially where doing so could unsettle longstanding boundaries between expression and function. A similar instinct appears in the US Second Circuit’s decision in Authors Guild v. Google (2015), which recognised that technological processes can generate intermediate copies without producing expressive outputs that fall within copyright’s purview. But while Google Books framed this within transformative fair use, the UK Court reached a similar outcome through the definitional structure of the CDPA.
Critique
The judgment is doctrinally tidy, but it leaves several conceptual tensions unresolved.
One significant concern is whether the emphasis on reconstructability sets the threshold for infringement too high. Copyright has always policed non-literal copying, from plot skeletons to program structure, and courts have long acknowledged that expressive value may be communicated through selective elements, patterns or abstractions. The House of Lords in Designers Guild v. Russell Williams (2000) recognised that infringement can occur where the expressive core is captured, even if precise reproduction is absent. On this logic, if the model internalises distinctive aspects of Getty’s images in a way that informs generated outputs, a different analytical frame may be warranted.
Yet, the Court resisted extending such reasoning to AI training, partly because model weights are functional artefacts. Here lies the tension: copyright doctrine has historically adapted to prevent technological evasion, but it also resists collapsing into a general control over influence or style. The Getty decision walks this narrow line by holding that statistical impact does not equate to expressive embodiment.
Another weakness lies in the Court’s limited engagement with contemporary scientific debates surrounding “memorisation” in neural networks. Recent studies show that large models can inadvertently memorise and regurgitate training data. Notably, Stability AI itself disclosed that rare instances of near-identical outputs may occur under specific user prompts. If these models sometimes produce outputs strikingly similar to their training images, is it still convincing to say that their inner architecture contains no trace of the originals? The Court does not fully address this possibility, choosing instead to treat such outputs as edge cases irrelevant to the question of weight-level embodiment.
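The memorisation audits these studies describe can be sketched in miniature: compare a model output against the training set and flag near-duplicates above a similarity threshold. The data, the 0.99 threshold and the cosine measure below are all illustrative assumptions, drawn neither from the judgment nor from Stable Diffusion itself.

```python
import numpy as np

# Hypothetical memorisation check: flag training images that a model output
# nearly duplicates. Data and threshold are assumptions for the sketch.
rng = np.random.default_rng(1)
train = rng.standard_normal((500, 64))               # 500 flattened "images"
output = train[42] + 0.01 * rng.standard_normal(64)  # output echoing image 42

def cosine(a, b):
    """Cosine similarity between two flattened images."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sims = np.array([cosine(output, img) for img in train])
best = int(sims.argmax())
memorised = bool(sims[best] > 0.99)
print(best, memorised)   # identifies image 42 as a near-duplicate
```

If a distributed model failed such a check for some training inputs, the claim that its internals carry “no trace of the originals” would look weaker for those inputs, which is precisely the gap the judgment leaves open.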
The Court’s silence on this scientific nuance may become problematic as models grow larger and memorisation becomes empirically more demonstrable. Future litigation may test whether a model that occasionally reproduces training data, without storing it in a conventionally reconstructable form, still falls outside the scope of “copy.”
Impact
The implications of the judgment extend beyond Getty and Stability AI, touching the very architecture of the UK’s emerging AI regulatory landscape. For developers, the ruling provides significant breathing room. If model weights are not copies, then distributing or deploying trained models in the UK does not give rise to secondary infringement, which removes a major legal barrier to open-source AI research. It also places the UK closer to jurisdictions like the US, where courts have shown reluctance to extend copyright control over analytical or transformative computational processes. For creators, however, the ruling reads more like a withdrawal of protection. Image libraries, artists and photographers increasingly rely on licensing to control downstream uses of their material. If machine learning systems can ingest their output wholesale without triggering reproduction rights, creators may find themselves competing against models shaped by their own work yet beyond the reach of their legal claims.
A useful illustration emerges when comparing the judgment with the treatment of sampling in music copyright. In the US case of Bridgeport Music v. Dimension Films (2005), even a small, altered sample was held to infringe because it appropriated expressive value. AI training, by contrast, involves using entire works to shape internal statistical distributions, yet the result is characterised as expression-free. The asymmetry is striking: using a two-second audio clip attracts copyright liability, but exposing a model to 100,000 photographs may not.
The policy consequences are profound. Lawmakers around the world are debating whether AI training should be governed by opt-out frameworks, licensing arrangements or even mandatory compensation, and the differences in jurisdictional philosophy are apparent from the EU AI Act and Japan’s more liberal text-and-data-mining exception. Post-Getty, the UK’s position is uncertain: doctrinally conservative as it remains, the growing gulf between what copyright law purports to protect and how the industry operates may eventually become indefensible. Economically, the judgment may embolden technology firms to base operations in the UK, positioning the country as an innovation-friendly jurisdiction. But culturally, it risks alienating creative industries already strained by the impact of AI on labour and attribution. Whether the government can balance these interests without legislative reform remains to be seen.
Conclusion
The High Court’s refusal to treat model weights as infringing copies is more than a victory for Stability AI. It is a decisive statement about the limits of copyright in an AI-driven world. By grounding its analysis in embodiment, reproducibility and expression, the Court avoided reducing statistical learning to a species of infringing copying. But the ruling also exposes doctrinal gaps and policy tensions that are almost certain to resurface as generative AI capabilities advance. If AI continues to disrupt creative markets at this pace, it will eventually force a larger question: should copyright law bend to accommodate machine learning, or should machine learning be made to operate within copyright’s traditional borders? Getty Images v. Stability AI does not answer that question, but it sets the stage for the next round of debate, in courtrooms, legislatures and creative communities alike.
References
- Getty Images v. Stability AI, [2025] EWHC 2863 (Ch).
- Copyright, Designs and Patents Act 1988 (UK).
- Synthon B.V. v. SmithKline Beecham plc, [2005] UKHL 59.
- Designers Guild Ltd. v. Russell Williams (Textiles) Ltd., [2000] 1 WLR 2416.
- Authors Guild v. Google Inc., 804 F.3d 202 (2d Cir. 2015).
- Bridgeport Music, Inc. v. Dimension Films, 410 F.3d 792 (6th Cir. 2005).
- UK Government, Artificial Intelligence and Intellectual Property: Call for Views (Intellectual Property Office, 7 September 2020), available at: https://www.gov.uk/government/consultations/artificial-intelligence-and-intellectual-property-call-for-views
- N. Carlini, D. Ippolito, M. Jagielski, K. Lee, F. Tramer & C. Zhang, “Quantifying Memorisation Across Neural Language Models” (2022), arXiv, available at: https://arxiv.org/abs/2202.07646
WHAT HAPPENS WHEN AI LEARNS FROM YOUR WORK
April 1, 2026
Debanjan Ranu
Rajiv Gandhi School of Intellectual Property Law, IIT Kharagpur