Key Highlights
- Google used its Gemini LLM to scan 5 million news articles globally, extracting reports of 2.6 million flood events to create a geo-tagged training dataset called “Groundsource”
- The resulting flash flood forecasting model now flags risk across urban areas in 150 countries on Google’s Flood Hub platform, with data shared directly with emergency response agencies
- The model works at 20-square-kilometre resolution and currently lacks local radar integration, making it best suited to under-resourced regions rather than a rival to infrastructure-backed national systems
Flash floods kill more than 5,000 people every year, and unlike most weather events, they are notoriously resistant to prediction. They are too short-lived and too localised to be tracked the way river flows or temperatures are — which means the deep learning models now capable of forecasting most weather patterns have had very little useful data to work with when it comes to flash floods specifically.
Google’s approach to fixing that is, to put it simply, unconventional. Rather than waiting for sensor infrastructure that many developing countries cannot afford to build, the company turned to the one source of localised, historically rich data that already exists at scale: journalism.
What Groundsource Actually Is & How It Was Built
Google researchers ran 5 million news articles from around the world through Gemini, the company’s large language model. The model was tasked with identifying flood reports, extracting location and timing data, and organising the results into a structured, geo-tagged time series. What came out is “Groundsource”: a dataset of 2.6 million recorded flood events, built entirely from written media rather than physical sensors.
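To make the pipeline concrete, here is a minimal sketch of what a Groundsource-style record and its grouping into a geo-tagged time series might look like. The field names, grid size, and grouping logic are illustrative assumptions, not Google’s actual schema:

```python
from dataclasses import dataclass
from datetime import date
from collections import defaultdict

# Hypothetical schema for one LLM-extracted flood report; the field names
# are illustrative, not Groundsource's actual format.
@dataclass(frozen=True)
class FloodEvent:
    lat: float        # geo-tag extracted from the article text
    lon: float
    reported: date    # date the flood was reported to have occurred
    source_url: str   # provenance: the article the event came from

def to_time_series(events):
    """Group extracted events into a per-cell, date-ordered time series."""
    series = defaultdict(list)
    for e in events:
        # Bucket by a coarse 0.5-degree grid cell (purely illustrative).
        cell = (round(e.lat * 2) / 2, round(e.lon * 2) / 2)
        series[cell].append(e.reported)
    return {cell: sorted(dates) for cell, dates in series.items()}

events = [
    FloodEvent(6.52, 3.37, date(2023, 7, 1), "https://example.com/a"),
    FloodEvent(6.51, 3.38, date(2023, 7, 3), "https://example.com/b"),
]
series = to_time_series(events)  # two nearby reports land in one cell
```

The point of the structure is the last step: individual news reports only become training data once they are deduplicated and ordered per location, which is what turns qualitative text into a quantitative time series.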
This is the first time Google has used an LLM specifically to generate a quantitative dataset from qualitative, written sources, a methodological shift worth paying attention to beyond just the flood application. Gila Loike, a Google Research product manager, confirmed the dataset and the research were made publicly available Thursday.
From there, researchers trained a separate model built on a Long Short-Term Memory (LSTM) neural network, which takes global weather forecast data as input and outputs a flash flood probability for a given region. That model now runs on Google’s Flood Hub, covering urban risk areas across 150 countries, with its outputs shared directly with emergency response organisations.
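For readers unfamiliar with the architecture, a single LSTM step can be sketched in a few lines of plain Python. The toy weights and the scalar rainfall input below are assumptions for illustration; the trained Flood Hub model is far larger and its internals are not public:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step over a scalar weather feature x (e.g. forecast rainfall).
    w maps each gate name to (input weight, hidden weight, bias)."""
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])    # input gate
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])    # forget gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])    # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2])  # candidate
    c = f * c_prev + i * g        # new cell state: keep some memory, add some new
    h = o * math.tanh(c)          # new hidden state
    return h, c

# Toy weights; a real model learns these from Groundsource labels.
w = {k: (0.8, 0.5, 0.1) for k in ("i", "f", "o", "g")}

# Run the cell over a short rainfall forecast sequence, then squash the
# final hidden state into a flood probability.
h, c = 0.0, 0.0
for rainfall in (0.2, 1.5, 3.0):   # illustrative forecast values
    h, c = lstm_step(rainfall, h, c, w)
p_flood = sigmoid(h)               # probability in (0, 1)
```

The design reason an LSTM fits here is the cell state `c`: flash flood risk depends on accumulated conditions over preceding hours, and the forget/input gates let the model carry that running memory across the forecast sequence.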
António José Beleza, an emergency response official at the Southern African Development Community who trialled the model with Google, noted it helped his organisation respond to floods more quickly, which is a credible early indicator that the system is functional in field conditions, not just in research environments.
The Limitations Google Isn’t Hiding
Google has been upfront about where the model falls short, and it is worth being equally direct here. The forecasting resolution sits at 20 square kilometres, which is relatively coarse compared to national weather systems. More significantly, the model does not currently incorporate local radar data, which is what allows agencies like the US National Weather Service to track precipitation in real time and issue precise alerts.
That is not a minor gap. Radar integration is precisely what separates a useful early-warning system from a genuinely life-saving one in densely populated urban areas, where even a few hours of advance notice determines whether evacuation is possible. For now, the model is most appropriately positioned as a tool for regions where the alternative is no forecast at all, rather than as a competitor to infrastructure-backed national systems.
What This Means Beyond Floods
The more durable implication of Groundsource may be the method itself rather than the output. Juliet Rothenberg, a programme manager on Google’s Resilience team, told reporters the team intends to explore applying the same LLM-to-dataset approach to other ephemeral but high-stakes phenomena; heat waves and mudslides were cited specifically.
For perspective, data scarcity is one of the central bottlenecks in AI-driven climate modelling. Marshall Moutenot, CEO of Upstream Tech and co-founder of dynamical.org, a group curating machine learning-ready weather datasets, called it a “creative approach” to a problem the field has struggled with for years. If Groundsource’s methodology proves replicable across other event types, it could meaningfully expand the range of climate risks that AI models can actually be trained on.
The broader question is whether Google keeps this infrastructure public and collaborative, or whether Flood Hub evolves into a proprietary data advantage. The dataset has been shared openly for now, but as these systems become more operationally embedded with governments and emergency agencies, the governance of who controls that data and under what terms becomes a legitimate question the company will eventually need to answer.