Why Media Agencies Should Care About Generative AI’s Data Thirst
by Ann-Kathrin Pfleging
Over the last two years, generative AI’s potential to transform work and society has become central to discussions about technological innovation. For media agencies and the wider marketing industry, the question remains: is the enthusiasm justified, or is generative AI overhyped?
By using AI-driven algorithms, media companies can supply hyper-targeted content recommendations and align campaigns with individual preferences and behaviours. Generative AI already plays a significant role in today’s advertising: just this month, Meta announced new image and text generators that streamline creative production whilst following brand guidelines.
However, generative AI’s dependence on vast amounts of high-quality internet data also raises urgent concerns. In her keynote at SXSW, star futurist Amy Webb explained the problem. As she put it, generative AI could very soon “run out” of the internet’s data, which raises a crucial question: what happens when there is no more data left to feed generative AI?
In media, where generative AI is used for content creation, media planning, activation and search, this has far-reaching consequences.
Because the internet’s data sources are limited, AI-generated content risks becoming repetitive and lacking in diversity. This exacerbates algorithmic bias, which is already a major criticism of AI. In addition, AI algorithms that rely on outdated or biased data may struggle to differentiate between accurate facts and misinformation.
In media reporting, this could also lead to distorted representations and inaccurate predictions of audience behaviour. That in turn affects campaign effectiveness, audience engagement and performance reporting, weakening the overall reliability of AI-generated insights.
Recent agreements underline this growing thirst for high-quality data. OpenAI, the developer of ChatGPT, has signed a five-year deal with News Corp for access to the content of publications such as The Wall Street Journal and The Times. Similar deals have also been agreed with Axel Springer and the Financial Times. However, while such partnerships provide a temporary solution by feeding AI with more data, they also highlight media owners’ increasing reliance on AI companies for revenue whilst print sales continue to decline. The long-term impact of this on the media industry remains to be seen.
So, what’s next?
In search of longer-term solutions, AI developers, including industry giants Meta, Google, and Microsoft, have started to train their AI models on synthetic data, meaning artificial data generated by AI. However, not only does this risk worsening existing biases and misinformation, but researchers have also observed irreversible errors and nonsensical results in these models (Firstpost 2024), leaving generative AI at risk of becoming outdated and less capable of producing diverse and accurate content.
In short, generative AI’s data thirst matters for media agencies. It not only puts the technology’s own future at stake but also poses challenges to a media industry that already relies on generative AI’s capabilities. It is therefore crucial to address these challenges proactively by promoting continuous learning about how to navigate this rapidly changing landscape. At the same time, we should balance these innovations with ethical considerations to ensure that everyone benefits equitably.