Pretty much the only thing I think AI could be useful for - forecasting the weather based off tracking massive amounts of data. I look forward to seeing how this particular field of study is improved.
Bonus points, AI weather modeling, for once, saves energy relative to physics models. Pair it with some sort of lightweight physical model to keep the hallucinations at bay, and you’ve got a good combo.
what’s perhaps most striking about GenCast is that it requires significantly less computing power than traditional physics-based ensemble forecasts like ENS. According to Google, a single one of its TPU v5 tensor processing units can produce a 15-day GenCast forecast in eight minutes. By contrast, it can take a supercomputer with tens of thousands of processors hours to produce a physics-based forecast.
If true this is extremely impressive, but this is their own evaluation, so it may be biased.
What they leave off is how much goes into training the model, but I imagine once they settle on a trained model it can carry on pretty efficiently for a long time, especially if they’re baking in things like atmospheric CO2 levels to help keep forecasts in line with global warming.
Absolutely, but training only happens once. With the actual forecast being so cheap to produce, you could have one made specifically for your own garden, which may be very different from a generic one covering hundreds of km². Then the roughly 90% accuracy will feel WAY more accurate.
I feel this personally, I live in the hills outside of a valley metro. All weather data is forecasted off of valley sensors, but shit gets weird when you suddenly climb 2000+ ft.
The best weather services in my area are those that can factor people’s household meters into their forecasting, but those services still aren’t perfect.
I live in a hilly county in a country at the intersection of two weather cells, with a warm ocean current bathing our coast. Prediction in those conditions is a real challenge. For example, my neighbors 50 metres from me get consistently more snow and ice than I do. More stations would really help, but moving from there to crowd-sourced forecasting has issues due to lack of calibration and other biases. It can help, but not as much as you might think.
The non-AI models in use now all get feedback on each run from actual observations, which is used to correct model parameters for later runs.
I’m sure the model would need to be continuously updated to take in more recent weather data.
Inputting newer weather condition data is different than changing the model. The model is the machine that does the computing, the weather data is just inputting variables. As an analogy it’s like a computer - the hardware itself doesn’t change, but if you do different clicks and typing input then the computer will output different things on screen. The ai model itself only changes when you train it differently.
There’s a difference between the real-ish-time weather data continuously fed in to output predictions, and the decades of weather data used to build the model. The continuous feed of data is more than likely part of what Google alleges is saving significant energy.
It’s the training on decades of information, and the occasional updates to those trained models, that take a significant amount of resources, but hopefully for relatively short bursts.
It actually makes sense if you think about it from the perspective that ML is about generalizing trends/functions. Simulating the world is hard, generalizing the world based on past observations - easy (with some lossiness).
generalizing the world based on past observations - easy (with some lossiness)
I know people who spend their lives working on climate models, and none of them would say it’s easy. And climate models are “generalizing the world based on past observations” plugged into some very complex physical models.
Yeah, I’ve long thought that weather forecasts are a perfect use case for AI. AI is great with complicated systems that are hard to model accurately but have lots of available data.
Current weather forecasts kinda suck. I try to schedule jobs around when it’s going to rain, and have to frequently reschedule because rain forecasts aren’t very accurate. I really hope we can see improvements.
It would be amazing if it could have a significant impact on spatial and temporal accuracy of things like rain. I feel like for me the existing weather report is good enough for “it will probably rain tomorrow” but it’s really hit-or-miss when you get to hourly resolution. A good model may be able to go so far as to say “it will probably rain between 3-4pm on the east side of town tomorrow, and 2-3pm on the west side”
That’s the dream at least. With enough data and a sophisticated enough model it feels like it could be possible.
I’m not convinced you can ever get that resolution. There’s a big difference between modeling the broad trends and trying to remove the uncertainty from a process that’s inherently probabilistic.
Theoretically, with enough data it could predict exactly what is going to happen. Do we have enough data currently to do that? Probably not. But weather isn’t just completely random; we just don’t understand it enough yet.
It’s an insanely complex, coupled system full of turbulence, so that “theoretically” is doing some heavy lifting. The best models now need to be run on supercomputers, despite scores of scientists and software people constantly trying to find further optimizations for the algorithms. AI isn’t going to better discriminate signal from noise when the biggest constraint on the existing S/N ratio is the lack of sufficient compute resources.
Furthermore, unless the AI does explainability, which it almost certainly doesn’t, nobody’s going to use its output in life-and-limb-critical applications like first responders, defense, even road gritting.
My argument is that that is not the case.
There are many systems in nature that have randomness fundamentally built in. You can model the broad strokes, but the low level details are inherently unpredictable because random processes are involved at the low level. You can predict the general pattern of airflow over a jet wing, but it’s not a lack of input resolution that makes it impossible to project the path of a specific molecule.
It would be amazing if it could have a significant impact on spatial and temporal accuracy of things like rain.
That has to do with the forecast’s grid size, along with some irreducibly complex and ill-conditioned physics that AI won’t help with. The best general-forecast global models now have a 10km grid. Some specialist models go down to about 2km. So, depending on the size of your town, probably not fine-grained enough, though there are also point forecasts that take into account terrain, albedo and other fine-grained features and can be pretty accurate, especially when there are some good observation stations nearby so the forecast models can be continually trained based on actual data.
You’re better off looking at QPFs than regular forecasts.
But if you’re wanting something like “will there be rain at this GPS coordinate at this time”, then under some conditions that is just impossible to predict. It’s not a problem with how clever the models are or a lack of data; the physics makes it legitimately random.
Ensemble forecasts are better yet, but you need to learn what they mean before using them. Not many people understand how to interpret probabilistic forecast data.
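For anyone wondering what "interpreting" an ensemble looks like in practice, here is a minimal sketch, not how any particular weather service does it. The idea is that each ensemble member is one plausible run, and a "chance of rain" can be read as the fraction of members that produce rain at a point. The function name, the 0.2 mm threshold, and the member values are all made up for illustration.

```python
def probability_of_precip(member_totals_mm, threshold_mm=0.2):
    """Fraction of ensemble members with precipitation at or above a threshold.

    member_totals_mm: one precipitation total (mm) per ensemble member for the
    same location and time window. The threshold is an assumed cut-off for
    "measurable rain", not an official value.
    """
    hits = sum(1 for mm in member_totals_mm if mm >= threshold_mm)
    return hits / len(member_totals_mm)


# Hypothetical 10-member ensemble for one grid point and one afternoon.
members = [0.0, 0.0, 0.5, 1.2, 0.0, 3.4, 0.1, 0.0, 2.0, 0.0]
pop = probability_of_precip(members)
print(f"Chance of rain: {pop:.0%}")
```

So a "40% chance of rain" in this toy case means 4 of the 10 runs got wet, not that it will rain for 40% of the afternoon.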
I thought QPFs are generated from ensemble forecasts?
Rain forecasts are mostly spot on for me. Keep in mind, %chance of rain is covering a wide area. If we want better rain forecasts we have to dial in the resolution.
You’ll also need more accurate remote sensor data (precipitation happens in a very narrow range of temperature, pressure and humidity), better observation data, better terrain models (microclimate is influenced by the interaction of terrain and the atmosphere). The forecast grid sizes used now are based on choosing the smallest grid size we can afford to compute that yields meaningful forecast data. Computational cost more than quadruples each time grid size halves. “More than” because altitude levels matter too, though for terrestrial forecasts, the ones near the ground matter a whole lot more than what’s happening at 6km.
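A rough back-of-the-envelope sketch of that scaling, assuming cost is simply proportional to the number of grid cells (horizontal cells times altitude levels). Real models have other cost drivers too, and the level counts below are hypothetical; only the roughly 510 million km² surface area of the Earth is a real figure.

```python
EARTH_SURFACE_KM2 = 510e6  # approximate surface area of the Earth


def grid_cells(area_km2, dx_km, levels):
    """Number of cells for a grid with spacing dx_km and a given number of altitude levels."""
    horizontal_cells = area_km2 / dx_km**2  # halving dx quadruples this
    return horizontal_cells * levels


coarse = grid_cells(EARTH_SURFACE_KM2, dx_km=10, levels=50)
fine_same_levels = grid_cells(EARTH_SURFACE_KM2, dx_km=5, levels=50)
fine_more_levels = grid_cells(EARTH_SURFACE_KM2, dx_km=5, levels=70)

print(fine_same_levels / coarse)  # horizontal refinement alone
print(fine_more_levels / coarse)  # horizontal refinement plus extra levels
```

Halving the spacing with the same levels gives exactly four times the cells; adding levels on top is where the "more than" comes from in this simplified picture.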
I had one time a couple weeks ago where I was scheduling jobs on Monday, we were supposed to be rained out Tuesday, light/scattered showers Wednesday, and heavy rain Thursday.
The actual result was no rain Tuesday, an absolute downpour on Wednesday, and sunny Thursday and Friday.
that’s… actually a great use for AI. good to see something intelligent is being done after all.
Weather and other very complex and hard to predict stuff like fluid dynamics, things that don’t have a human or completely random behaviour, is probably the best thing an AI could do for humanity.
Also could be good for understanding animal communication like dolphins etc.
I don’t think it will really be able to do much actual human work, though. Maybe management…
Implying management is human work?
Show me the skill scores, from an independent tester, or I’ll regard this as nothing but hype.
Many fields of science are only as far along as they are because AI can analyse big data fast. Weather surely is not the only one. To name some examples: astrophysics, geophysics, psychology (crowd behaviour), biology, farming (optimisation), and many more.
So far what we’ve seen have been some promising point solutions, but very little that’s useful beyond that.
Indeed, what’s perhaps most striking about GenCast is that it requires significantly less computing power than traditional physics-based ensemble forecasts like ENS. According to Google, a single one of its TPU v5 tensor processing units can produce a 15-day GenCast forecast in eight minutes. By contrast, it can take a supercomputer with tens of thousands of processors hours to produce a physics-based forecast.
So it’s more accurate and uses significantly less computing power than current systems. Nice!
About 4 years ago, this video showed that a ML model can be used to cut costs on physics simulations. It’s about time we did that with weather too.
It’s not just about cutting costs, but also improving accuracy. Physical simulations factor in a dozen or so weather conditions to predict outcomes. Machine learning can track thousands of conditions, drawing connections not realized in physical models, leading to much more accurate statistical models.
Physical simulations factor in a dozen or so weather conditions to predict outcomes.
Many more parameters than that.
Machine learning can track thousands of conditions
Scientists already know which ones are relevant. You’re not going to find any big surprises there with an AI. Shotgun-style factor analysis has already been done to death. The price of baked beans doesn’t impact the wind direction in the Persian Gulf. It’s OK to not consider it.
drawing connections not realized in physical models
Again, it’s possible but unlikely. And you’d need an AI that could be queried to tell you what factors it considered, and most of them don’t work that way right now.
Statistical models don’t become more accurate because you throw irrelevant parameters at them. But that’s how ML systems work.
Yeah, that’s pretty impressive. I wonder if you could apply the same philosophy in other areas too. Instead of training the model with data produced in a simulation, you could just feed it real world data instead. Like, if you gave a bunch of stress-strain data to a model, could you make better predictions about the behavior of physical structures, such as bridges and towers.
There are already non-AI physical modeling programs that do that.
Yes there are, but would it be possible to replace them with ML and get more accurate predictions?
As much data as they have, it seems like they could use more data. Just as an example, I have two weather apps on my phone. And for the same city, they will give me two different temperatures. Checked back to back. And those temperatures will both be different than the temperature of my thermometer at my house. What if each city had say like 50 sensors all over the city that would report in and then they would take the average of all 50 of those sensors in order to get a more accurate number? And that’s just for temperature.
Your two apps are probably reporting forecast data from two different models.
What if each city had say like 50 sensors all over the city that would report in and then they would take the average of all 50 of those sensors in order to get a more accurate number?
You’re asking the obvious question, but it’s not quite that simple.
The temperature at different points in your city can be widely different. Averaging 50 sensors would eliminate that difference and give a single number for the whole city. That would be a loss of accuracy. What you really need are forecasts at multiple points in your city, based on the weighted average of those 50 sensors, also taking into account other factors such as altitude, insolation, type of buildings or vegetation, albedo and I forget what else. For example, if your forecast point is on a road, tarmac retains heat differently than grass does, so snow will melt faster on a road than on a field. For another example, buildings often emit heat in the winter, and that will impact how much snow accumulates at a given point. Look up the urban heat island effect for more on this.
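To make the "weighted average" point concrete, here is a small sketch using inverse-distance weighting, which is just one simple weighting scheme I picked for illustration. It ignores altitude, albedo, and everything else mentioned above, and the sensor positions and readings are invented.

```python
import math


def idw_temperature(point, sensors, power=2.0):
    """Inverse-distance-weighted temperature estimate at `point`.

    point: (x_km, y_km); sensors: list of (x_km, y_km, temp_c).
    Nearby sensors count for more than distant ones. `power` controls how
    quickly a sensor's influence falls off with distance (assumed value).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, temp in sensors:
        distance = math.hypot(point[0] - x, point[1] - y)
        if distance == 0.0:
            return temp  # the point sits exactly on a sensor
        weight = 1.0 / distance**power
        weighted_sum += weight * temp
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical sensors 10 km apart: a cool park and a warm city centre.
sensors = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
plain_mean = sum(temp for _, _, temp in sensors) / len(sensors)
near_park = idw_temperature((1.0, 0.0), sensors)

print(f"City-wide average: {plain_mean:.1f} C")
print(f"Estimate 1 km from the park sensor: {near_park:.1f} C")
```

The plain average reports the same number everywhere, while the weighted estimate stays close to whichever sensor you are actually standing near.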