Vision Quest

By Cindy Spence

Imagine trying to recreate the eruption of Mount St. Helens or the collapse of the World Trade Center.

“Sometimes, experiments are not an option,” says UF mechanical and aerospace engineering Professor Bala Balachandar.

And even if he could recreate such complex events in the lab, Balachandar says, there is currently no computer capable of fully analyzing the flood of data.

But Balachandar and his team are out to change that, with the support of a five-year, $10 million grant from the National Nuclear Security Administration.

Balachandar’s field of predictive simulation science is like looking into a crystal ball. It takes the known behavior of atoms and molecules — which can be expressed through mathematical equations — and applies those equations to unknown events, like the chaos of massive explosions — natural or man-made.

It’s easy to understand why a nuclear security agency would be interested in such capabilities, but Balachandar says the practical applications of being able to simulate such complex events go far beyond nuclear devices.

Predictive simulation science grew out of the nuclear test bans of the 1990s. The bans put national nuclear labs in a bind: how do you certify a nuclear warhead is serviceable if you can’t test it? Scientists developed computational surrogates for nuclear testing, and in the process found those models useful in studying other complex phenomena that can’t be tested in a lab, like supernovae and volcanic eruptions.

Taking a complex process and converting it to a computational model is tricky. A model must be detailed enough to account for many variables so that an accurate prediction can be made. Too much detail, however, and the model collapses under the weight of its own data.

“My back-of-the-envelope calculation is not very accurate,” Balachandar says. “On the other hand, if I have an equation for every molecule it would take 100 billion years to predict what’s going to happen, and that is not useful. So we have to strike a sweet spot.”
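To get a feel for the scale behind that quote, here is a rough back-of-the-envelope sketch in Python. The molecule count, operations per molecule and number of timesteps below are illustrative assumptions, not Balachandar’s actual figures; the point is only that molecule-by-molecule simulation blows up to absurd timescales even on a petascale machine.

    # Illustrative arithmetic only; all inputs are assumed round numbers.
    molecules = 1e24          # roughly the molecules in a few grams of material (assumed)
    ops_per_molecule = 100    # assumed calculations per molecule per timestep
    timesteps = 1e8           # assumed timesteps needed to resolve the event
    petascale = 1e15          # calculations per second on a petascale machine

    total_ops = molecules * ops_per_molecule * timesteps
    seconds = total_ops / petascale
    years = seconds / (3600 * 24 * 365)
    print(f"{years:.1e} years")   # on the order of hundreds of billions of years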

In the case of nuclear weapons, Balachandar says, the model should be detailed enough, for example, to tell the President of the United States, “You can make a decision based on my simulation, because I trust my answers.” Or, if the issue is an impending volcanic eruption, the simulation should be detailed enough that an emergency management official can decide when and how far to evacuate.

Hurricane forecasting is a predictive science success story, Balachandar says, noting that 50 years ago hurricanes in India caused huge losses of life, while a hurricane just last year in India caused fewer than 10 deaths. Predicting a hurricane’s strength at somewhere between category 1 and category 5 is not useful, he points out, but a prediction with 95 percent certainty that a hurricane will be category 5 is useful information.

Scientists won’t make the decisions, but they can give decision-makers an answer they can use in their deliberations.

“Simulation allows people to make better decisions,” Balachandar says. “We reduce the guesswork by bringing in as much physics and computational power as possible.”

Simulating complex phenomena is a big data challenge, Balachandar says, so data intensive that it requires the world’s largest, fastest computers, and then some.

The best computers today operate at petascale, performing a quadrillion — a thousand trillion — calculations every second. Imagine a desktop computer with a quad-core processor; put 250,000 of them together and you have a million cores capable of petascale calculations. In the quest for greater precision and speed in simulations, Balachandar says scientists are eyeing the next generation — exascale computers.
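The arithmetic behind that comparison, sketched in Python (the per-core speed is an assumed round number, not a figure from the article):

    # Rough arithmetic only; per-core speed is an assumption.
    cores_per_desktop = 4
    desktops = 250_000
    total_cores = cores_per_desktop * desktops      # 1,000,000 cores
    flops_per_core = 1e9                            # assume ~1 billion calculations/second per core
    total_flops = total_cores * flops_per_core      # 1e15: one quadrillion per second
    print(f"{total_flops:.0e} calculations per second")   # petascale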

“Which don’t exist and won’t for many years,” says Alan George, the lead computer scientist on the team and a professor of electrical and computer engineering.

George and colleagues Herman Lam and Greg Stitt and a team of students at CHREC, the National Science Foundation Center for High-Performance Reconfigurable Computing at UF, are tackling the problem of computing power for simulation science. Down the hall from George’s office sits the Novo-G, the most powerful reconfigurable computer in the world. The reconfigurability of the Novo-G represents a dramatic departure from the trend of present computer architectures, but even the Novo-G is not exascale.

In the past, science could anticipate the growth of computing power using Moore’s Law, which forecast that computing power would double roughly every two years. But with petascale computing, a milestone reached five years ago, Moore’s Law is reaching its limits, making exascale the holy grail of computing, at least for now.

In theory, all you’d have to do is string 1,000 petascale machines together and you’d have exascale. But you’d need one nuclear power plant to run that many machines, Balachandar says, and another nuclear power plant to cool them.
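The scaling works out on paper, as a quick sketch shows; the per-machine power draw below is an assumed ballpark figure, not a number from the article, but it illustrates why the electricity alone lands in nuclear-power-plant territory.

    # Rough scaling only; the per-machine power draw is an assumed ballpark figure.
    petaflops_per_machine = 1e15      # one petascale machine: ~10^15 calculations/second
    machines = 1000
    total_flops = machines * petaflops_per_machine       # 1e18: exascale
    megawatts_per_machine = 5                            # assumed draw, before cooling
    total_gigawatts = machines * megawatts_per_machine / 1000
    # Several gigawatts just to run the machines, i.e., the scale of a nuclear plant's output
    print(f"{total_flops:.0e} calc/s, roughly {total_gigawatts:.0f} GW to run")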

Balachandar and the physics team have all kinds of “wonderful ideas about how to make their simulations more accurate, more detailed, more sophisticated, more robust,” George says, so the computer scientists learn how their applications function and try to develop a new computing architecture to support them.

“These are very complex simulations, and to perform them in a response time shorter than your lifespan you need more and more powerful computers,” George says. “If you were to look at all the supercomputers in the world, that’s what most of them are being used to do; some are simulating Earth climate, others are simulating multiphase turbulence, and so on. Supercomputers are widely used to study and solve problems so computationally intensive that the most powerful computers in the world are necessary to do it.”

George’s team is charged with the task of studying and evaluating ways to build and use the exascale computer of the future — what Balachandar calls a “grand challenge” — and George says such challenges are natural in computing.

“Before exascale was Mount Everest, petascale was Mount Everest, and before that terascale,” George says. “So now exascale is the next big challenge.”

The road to exascale computing is fraught with unprecedented difficulties. Pursuing exascale computing in the traditional manner — by connecting a thousand petascale machines — doesn’t make sense, George says. Each petascale machine costs hundreds of millions of dollars, is as big as a building, and generates so much heat that its cooling bills alone run $1 million a month.

“That just doesn’t make sense, economically, practically or technically,” George says. “Because of that, exascale is really opening up new research challenges. How do we design systems in new and better ways than we ever have before?”

In the early days of computing, the machine was the main cost. Then software became the main cost. One day, maybe not too far away, George says, energy may become the dominant cost in computing.

“If we’re already spending a million dollars a month on utility bills for some of these machines, just imagine what that might be with exascale if we don’t find a better way to do things,” George says.

George is just the man for the job. As recently as 2004, there wasn’t a centralized campus computing infrastructure, but there was a big appetite for it. George chaired a committee that convinced units from all across campus – health, agriculture, liberal arts and engineering – to pool their resources, and high-performance computing at UF took hold.

George moved on to CHREC and reconfigurable computing, and for some applications, the Novo-G is the fastest computer in the world. The one-size-fits-all architecture of a conventional computer doesn’t change, whether it is used for word processing or sequencing a genome. In reconfigurable computing, the architecture changes to suit the need, and that delivers faster processing at a lower energy cost.

The potential of the Novo-G has drawn interest from more than 30 industry partners.

George points out that the traditional style of expanding computing power has pitfalls. As processors and chips get smaller and more and more are added, the system grows ever more complex, meaning more can go wrong, and then reliability becomes an issue.

“All of these things are daunting,” George says. “Nobody really knows the best way to study something that doesn’t exist. Balachandar studies things you can’t actually build and test, and we are as well.”

Balachandar says the point is to co-design: do what can be done at petascale, while developing methods that will work at exascale.

Already, the team is using UF’s new HiPerGator computer to develop codes, test them, run cases and fine-tune formulas before hopping on to even more powerful computers at the national labs, and, eventually, the exascale machines of the future.

“Traditionally, we scientists work on existing machines and then a new machine comes out and we scramble for three years to change our code to work on the new machine, then guess what. The next machine comes out, and by the time we learn to use that one … so this co-design strategy is a paradigm shift,” Balachandar says.

Revolutionizing the processing of the gigantic datasets for simulations will lead to discoveries in many other fields as well, Balachandar says.

“We’re doing something based on a future, non-existent technology, to move science forward,” Balachandar says. “This is great, this is awesome.”

This article was originally featured in the Summer 2014 issue of Explore Magazine.