Nvidia’s GTC 2012
Wednesday
There were plenty of programming sessions Wednesday morning. At 10:30 AM, Professor Iain Couzin of Princeton University’s Department of Ecology and Evolutionary Biology kicked off the Day 2 keynote address at GTC 2012. He was one of the first educators with the foresight to realize the potential of GPU computing for his line of work, and he spent two years porting all of his research tools to CUDA long before it became popular to do so.
He believes it was the best investment of his time because he is now able to do in minutes on the GPU what used to take weeks on the CPU. He detailed his early years as a researcher, beginning with a regular GeForce gaming video card and then migrating to Tesla when he had the money to do so.
What he is looking for are the patterns of collective behavior in nature. He was able to demonstrate the similarity of animal groups – think of huge flocks of birds in flight or fish in huge schools that function as a sort of “collective mind”. There is no telepathy involved, although that was believed to be the explanation not that long ago. Even invading cancer cells in a tumor, colliding galaxies, and humans seem to exhibit similar collective behavior in patterns that can be charted using the GPU.
What GPU computing has allowed Dr. Couzin to do is to simulate thousands of individuals in an experimental framework and to track their collective behavior. Dr. Couzin was able to demonstrate how collective behavior and collective action emerge in a wide range of groups – from fish and birds to plague locusts and even human crowds.
He explained how important it is for the members of a group to align and yet not collide, as a collision can be fatal for birds. They also need to be able to avoid predators. These are the patterns in nature that the models are built around.
One of the most interesting findings is that certain individuals that have information (about food, perhaps) influence the rest of the group, which does not have this information. There is an interesting democratization going on that has implications for human behavior.
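To make the "align but do not collide" idea concrete, here is a minimal CUDA sketch of one simulation step with one GPU thread per individual. It is not Professor Couzin's code: the kernel name `step`, the two radii, the speed, and the rule that repulsion overrides alignment are all assumptions chosen for illustration, and predators are left out entirely.

```cuda
#include <cstdio>
#include <cmath>
#include <utility>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical parameters, not taken from the keynote.
constexpr float kRepelRadius = 1.0f;  // closer than this: steer away
constexpr float kAlignRadius = 5.0f;  // within this: match headings
constexpr float kSpeed       = 0.1f;  // distance moved per step

// One thread per individual; each thread scans every other individual.
__global__ void step(const float2* pos, const float2* dir,
                     float2* newPos, float2* newDir, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float2 repel = make_float2(0.f, 0.f);
    float2 align = make_float2(0.f, 0.f);
    for (int j = 0; j < n; ++j) {
        if (j == i) continue;
        float dx = pos[j].x - pos[i].x;
        float dy = pos[j].y - pos[i].y;
        float d  = sqrtf(dx * dx + dy * dy);
        if (d > 0.f && d < kRepelRadius) {   // too close: move apart
            repel.x -= dx / d;
            repel.y -= dy / d;
        } else if (d < kAlignRadius) {       // nearby: align headings
            align.x += dir[j].x;
            align.y += dir[j].y;
        }
    }
    // Assumed rule: avoiding a collision takes priority over aligning.
    bool mustRepel = (repel.x != 0.f || repel.y != 0.f);
    float2 want = mustRepel
        ? repel
        : make_float2(dir[i].x + align.x, dir[i].y + align.y);
    float len = sqrtf(want.x * want.x + want.y * want.y);
    float2 heading = (len > 0.f)
        ? make_float2(want.x / len, want.y / len)
        : dir[i];
    newDir[i] = heading;
    newPos[i] = make_float2(pos[i].x + kSpeed * heading.x,
                            pos[i].y + kSpeed * heading.y);
}

int main()
{
    const int n = 1024;
    std::vector<float2> pos(n), dir(n);
    unsigned s = 12345u;  // small LCG so the run is repeatable
    auto rnd = [&s]() -> float {
        s = s * 1664525u + 1013904223u;
        return (s >> 8) / 16777216.0f;
    };
    for (int i = 0; i < n; ++i) {
        pos[i] = make_float2(20.f * rnd(), 20.f * rnd());
        float a = 6.2831853f * rnd();
        dir[i] = make_float2(cosf(a), sinf(a));
    }

    size_t bytes = n * sizeof(float2);
    float2 *dPos, *dDir, *dPos2, *dDir2;
    cudaMalloc(&dPos, bytes);  cudaMalloc(&dDir, bytes);
    cudaMalloc(&dPos2, bytes); cudaMalloc(&dDir2, bytes);
    cudaMemcpy(dPos, pos.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dDir, dir.data(), bytes, cudaMemcpyHostToDevice);

    for (int t = 0; t < 200; ++t) {
        step<<<(n + 255) / 256, 256>>>(dPos, dDir, dPos2, dDir2, n);
        std::swap(dPos, dPos2);   // new state becomes the input
        std::swap(dDir, dDir2);
    }
    cudaMemcpy(dir.data(), dDir, bytes, cudaMemcpyDeviceToHost);

    // Polarization: length of the mean heading (1.0 = all aligned).
    float mx = 0.f, my = 0.f;
    for (int i = 0; i < n; ++i) { mx += dir[i].x; my += dir[i].y; }
    printf("polarization after 200 steps: %.3f\n",
           sqrtf(mx * mx + my * my) / n);

    cudaFree(dPos); cudaFree(dDir); cudaFree(dPos2); cudaFree(dDir2);
    return 0;
}
```

Every thread reads every other individual, so the work grows with the square of the group size. That is the kind of load that is slow on a CPU and maps naturally onto thousands of GPU threads.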
The experiment goes to show how the leaders (informed individuals) influence a group and how their influence is mitigated by uninformed individuals. He again stressed how important CUDA is because they want to study thousands of individuals, not just a few. And of course, the actions of predators on the group (or, in the case of humans, actors trying to disrupt a group) are important to track, and only GPU computing can do it in real time.
This has implications for tech forums. Professor Couzin made a surprising discovery that counters the conventional wisdom that uninformed humans are more easily influenced by extremists. Instead, his findings suggest that the presence of those without strong views increases the odds that a group will go with the majority opinion. Uninformed individuals in a group are very important because they dilute the influence of a minority with strongly held preferences, and they tend to support the majority.
He gave an interesting example of a person stating on a forum that he will buy a Radeon. A person with a strongly held opposite view might interact with this person, suggesting that it is a stupid decision. One person may give this individual pause but is unlikely to change his mind. However, when two or more individuals simultaneously suggest it is a bad decision, it may have a much stronger effect on the original purchasing decision.
Of course there is a lot of mathematics involved, but it may be expressed as a chart:
Below a critical density of uninformed individuals, the minority with strongly held opinions can easily win. However, when there is a higher density of uninformed individuals, not much happens until suddenly the situation flips and the majority wins. When there is a sufficient number of people with no bias, the minority cannot win no matter how intransigent they are, as demonstrated in the following model.
According to this research, uninformed individuals tend to promote democracy in animals, and of course this suggests further research with humans. Professor Couzin went on to suggest that human interaction is not as complex as we like to think.
There is much more that the Professor presented, and he also went on to explain why locusts migrate to become a plague.
You can catch his entire keynote here:
http://smooth-las-akam.istreamplanet.com/live/demo/nvid4/player.html
Then it was time for lunch and back to the networking/exhibit hall.
And of course, there were two-hour sessions on each day for the exhibitors, Nvidia and their partners, to show off their products and GPU-related technology. Many big names and also very small new startups were represented. At 2 PM this editor headed for “Inside Kepler” to listen to co-presenters Stephen Jones and Lars Nyland of Nvidia. Unfortunately, ABT was unable to attend the Emerging Companies Summit Fireside Chat with Jensen because it was held at the same time.
Inside Kepler
This is the deep dive into the GK110 architecture that we touched on earlier. The differences between the Fermi and Kepler architectures were highlighted and the advantages of Kepler were stressed. We are not going to spend a lot of time on it as it is incredibly detailed.
Here is the link to the video of Inside Kepler which includes the Q&A session for a total of 90 minutes:
http://smooth-las-akam.istreamplanet.com/live/demo/nvid5/player.html
Nvidia said they were stuck with performance on Fermi as they could not increase power any further. Their engineering goal was to make a more efficient chip that was even more fully featured for programming. Kepler is the result, and the performance of this 7.1 billion transistor chip is impressive.
It was stressed that Kepler was redesigned to be much more programmable than Fermi, and that the clock speed was dropped to make it more efficient. The SMX architecture was redesigned to be far more complex, so the clock speed needed to be lower.
To take advantage of Kepler’s expanded programmability, there are new instruction sets.
Shuffle instructions make it simpler and more efficient for the threads in a warp to exchange data on Kepler than it was on Fermi. The more advanced sessions actually showed how to do this by example. Atomic operations have also been significantly improved, with a performance factor of 2x-10x over Fermi.
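As an illustration of what a shuffle does, here is a minimal sketch (not code from the session) that sums the values held by the 32 threads of one warp without touching shared memory. The kernel name `warpSum` is hypothetical. It uses `__shfl_down_sync`, the form required by current CUDA toolkits; the toolkit that shipped alongside Kepler spelled it `__shfl_down` with no mask argument.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sum the values held by the 32 threads of a single warp.
__global__ void warpSum(const int* in, int* out)
{
    int v = in[threadIdx.x];
    // Each step adds the value held by the lane `offset` positions higher,
    // halving the number of lanes that still carry a partial sum.
    for (int offset = 16; offset > 0; offset /= 2)
        v += __shfl_down_sync(0xffffffffu, v, offset);
    if (threadIdx.x == 0) *out = v;  // lane 0 ends up with the total
}

int main()
{
    int h[32], result = 0;
    for (int i = 0; i < 32; ++i) h[i] = i + 1;  // 1..32, which sums to 528

    int *dIn, *dOut;
    cudaMalloc(&dIn, sizeof(h));
    cudaMalloc(&dOut, sizeof(int));
    cudaMemcpy(dIn, h, sizeof(h), cudaMemcpyHostToDevice);

    warpSum<<<1, 32>>>(dIn, dOut);  // exactly one warp
    cudaMemcpy(&result, dOut, sizeof(int), cudaMemcpyDeviceToHost);

    printf("warp sum = %d (expected 528)\n", result);
    cudaFree(dIn);
    cudaFree(dOut);
    return result == 528 ? 0 : 1;
}
```

The same reduction without shuffle would need a shared-memory array and a synchronization between steps, which is the extra work the instruction removes.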
On Fermi, where atomics are slower, the code often needs to be relatively complex to avoid them; now see the simplification of code and the time saved with Kepler’s high-speed atomics:
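The slide itself is not reproduced here, but a minimal sketch of the kind of code that fast atomics make practical is a histogram in which every thread simply increments its bin. This is an illustrative example, not the code shown in the session; the kernel name `histogram` and the test data are assumptions.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// One thread per input byte; atomicAdd makes concurrent increments
// of the same bin safe without any manual locking or sorting.
__global__ void histogram(const unsigned char* data, int n,
                          unsigned int* bins)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(&bins[data[i]], 1u);
}

int main()
{
    const int n = 1 << 20;  // 1,048,576 bytes
    std::vector<unsigned char> h(n);
    for (int i = 0; i < n; ++i)
        h[i] = static_cast<unsigned char>(i % 256);  // uniform test data

    unsigned char* dData;
    unsigned int*  dBins;
    cudaMalloc(&dData, n);
    cudaMalloc(&dBins, 256 * sizeof(unsigned int));
    cudaMemcpy(dData, h.data(), n, cudaMemcpyHostToDevice);
    cudaMemset(dBins, 0, 256 * sizeof(unsigned int));

    histogram<<<(n + 255) / 256, 256>>>(dData, n, dBins);

    unsigned int bins[256];
    cudaMemcpy(bins, dBins, sizeof(bins), cudaMemcpyDeviceToHost);

    // With uniform data every bin should hold n / 256 = 4096 counts.
    bool ok = true;
    for (int b = 0; b < 256; ++b) ok = ok && (bins[b] == 4096u);
    printf("histogram %s\n", ok ? "correct" : "WRONG");

    cudaFree(dData);
    cudaFree(dBins);
    return ok ? 0 : 1;
}
```

The alternative without atomics is typically per-thread or per-block private histograms followed by a merge pass, which is the sort of extra complexity the paragraph above refers to.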
The examples kept coming of how Kepler was going to speed up and improve programming over Fermi. Next, they demonstrated the incredible crunching power of Kepler’s GK110 using all of the extra registers, improved shuffle and floating-point code.
This galaxy demo simulation was run in Jensen’s keynote and again in this session using real astrophysics code to show what will happen beginning 3.5 billion years in the future.
Next up was the topic of Kepler’s dynamic parallelism. In other words, it is the GPU’s ability to generate work for itself without waiting for the CPU.
This will change how programmers program.
Your program can now adjust itself dynamically and automatically, saving the programmer a lot of time, as more work is done on the GPU with less dependence on the CPU. Nested parallelism thus becomes possible on the GPU, freeing up the CPU for other tasks. And of course, there were a lot of architectural improvements in Kepler over Fermi that make this possible.
Instead of the straight feed-forward design of Fermi, we now have a feedback unit and management to keep track of what previously had to be taken care of by software. Also, with Fermi, processes could not share the GPU, so they had to be run one at a time. With Hyper-Q, all of the streams can launch at once instead of queueing up in the pipeline.
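A minimal sketch of what this looks like in code, assuming a GPU of compute capability 3.5 or later (GK110 is the first) and a build with relocatable device code, for example `nvcc -rdc=true` linked against `cudadevrt`. The kernel names `parent` and `child` and the chunk sizes are hypothetical, not from the session.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Child kernel: fills one chunk of the array with each element's index.
__global__ void child(int* data, int offset, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) data[offset + i] = offset + i;
}

// Parent kernel: each thread launches a child grid for its own chunk.
// The launch happens on the GPU, with no round trip to the CPU.
__global__ void parent(int* data, int chunk, int chunks)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c < chunks)
        child<<<(chunk + 127) / 128, 128>>>(data, c * chunk, chunk);
}

int main()
{
    const int chunks = 8, chunk = 1000, n = chunks * chunk;
    int* d;
    cudaMalloc(&d, n * sizeof(int));
    cudaMemset(d, 0xFF, n * sizeof(int));  // every int starts as -1

    parent<<<1, chunks>>>(d, chunk, chunks);
    // The parent grid is not complete until all of its children finish,
    // so one host-side synchronize covers the whole nest.
    cudaDeviceSynchronize();

    std::vector<int> h(n);
    cudaMemcpy(h.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);
    bool ok = true;
    for (int i = 0; i < n; ++i) ok = ok && (h[i] == i);
    printf("dynamic parallelism %s\n", ok ? "correct" : "WRONG");

    cudaFree(d);
    return ok ? 0 : 1;
}
```

On Fermi the CPU would have had to launch each `child` grid itself. Here the sizes are fixed for brevity, but in a real program the parent could choose how much child work to launch based on data it has just computed.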
This is Fermi.
And this is Kepler. True multi-processing can occur in a single CUDA program.
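From the programmer's side, the streams that Hyper-Q schedules are ordinary CUDA streams. The sketch below launches eight independent kernels, each in its own stream; it is an illustration rather than code from the session, and the kernel name `scale` and the sizes are assumptions. Whether the kernels actually overlap depends on the hardware and on how much of the GPU each one occupies.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void scale(float* x, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= factor;
}

int main()
{
    const int kStreams = 8, n = 1 << 16;
    std::vector<float> ones(n, 1.0f);
    cudaStream_t streams[kStreams];
    float* dev[kStreams];

    // One stream and one buffer per independent piece of work.
    for (int s = 0; s < kStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&dev[s], n * sizeof(float));
        cudaMemcpy(dev[s], ones.data(), n * sizeof(float),
                   cudaMemcpyHostToDevice);
    }

    // Kernels in different streams have no ordering between them,
    // so the hardware is free to run them side by side.
    for (int s = 0; s < kStreams; ++s)
        scale<<<(n + 255) / 256, 256, 0, streams[s]>>>(dev[s], n, s + 1.0f);
    cudaDeviceSynchronize();

    bool ok = true;
    for (int s = 0; s < kStreams; ++s) {
        float first = 0.f;
        cudaMemcpy(&first, dev[s], sizeof(float), cudaMemcpyDeviceToHost);
        ok = ok && (first == s + 1.0f);  // buffer s was scaled by s + 1
        cudaFree(dev[s]);
        cudaStreamDestroy(streams[s]);
    }
    printf("streams %s\n", ok ? "correct" : "WRONG");
    return ok ? 0 : 1;
}
```

The same source runs on Fermi, where, as described above, the streams end up queueing in the pipeline instead of all launching at once.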
Programming has been greatly enhanced with Kepler. The session then went to a half hour of questions and answers. Anyone interested in the Kepler architecture should consult the whitepaper, which can be downloaded from Nvidia’s site as a .pdf:
http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110-Architecture-Whitepaper.pdf
After that, it was time to head back to the hotel room to finish the news bit on the GTC then off to the networking exhibit halls again and finally across the street to the GTC party where jugglers provided the main entertainment – even tossing running chainsaws to each other! We didn’t stay too late as Thursday would be our busiest day with a full session schedule.