Nvidia’s GTC 2012
Tuesday featuring Jensen’s Keynote
The keynote speech delivered by Nvidia’s superstar CEO Jen-Hsun Huang (aka “Jensen”) highlighted the rapid growth of GPU computing from humble beginnings. What was really surprising is Nvidia’s foray into cloud computing – including gaming – where “convenience” is the new buzzword. Just as the 2009 GTC covered the then-upcoming 40nm GF100 Fermi architecture, the 2012 GTC was all about the new 28nm Kepler GPU. Kepler is a huge advance over Fermi in performance per watt, and it will be the foundation for Nvidia’s new cloud initiative.
Kepler Hardware
The K10 is the Tesla version of the GTX 690 – a dual-GPU board aimed at single-precision computing, with uses in oil and gas research, national defense (including signal and image processing), and industry. Here is one example out of many that shows the uses of GPU calculations for NASA. The K10 will be available next month, and a single Tesla K10 accelerator board features two GK104 Kepler GPUs that deliver a combined 4.58 teraflops of peak single-precision floating point performance and 320 GB per second of memory bandwidth.
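That headline figure checks out if the K10 keeps GK104’s full complement of 1536 CUDA cores per GPU at roughly 745 MHz (our assumption), with each core retiring one fused multiply-add – two floating-point operations – per clock: 2 GPUs × 1536 cores × 0.745 GHz × 2 FLOPs/clock ≈ 4.58 TFLOPS.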
Available later in Q4 this year will be Nvidia’s flagship GPU based on GK110 Kepler. This GPU delivers three times the double-precision performance of Fermi architecture-based Tesla products, and it supports the new Hyper-Q and dynamic parallelism capabilities. The GK110 GPU is expected to be incorporated into the new Titan supercomputer at the Oak Ridge National Laboratory in Tennessee, as well as into other supercomputers. Here is the chip itself – 7.1 billion transistors, the most complex piece of silicon anywhere!
The Tesla K20 should have three times the double-precision capability of the K10. Here is the K20 as it should look when it is released in the November–December timeframe:
Of course, Nvidia won’t speak publicly about its gaming GPU based on GK110, but it is logical that it will be released after the professional market is satisfied, probably early next year. TSMC still has less 28nm capacity and production than Nvidia would like, as Nvidia appears to be easily selling out of every GTX 690, GTX 680 and GTX 670 it can make.
Nvidia’s CEO delivers the Keynote that defines the GTC
Nvidia’s CEO Jen-Hsun Huang (aka “Jensen”) delivered the keynote to a packed hall and he set the stage for the entire conference. He is a superstar in his own right and the $100 replicas of his leather jacket that he wore on stage sold out at the Nvidia store within an hour of the keynote’s ending.
Here is one of Nvidia’s photos, showing the press at the tables in front and the rest of the audience paying rapt attention. This editor is in the audience and in this picture.
Jensen began by showing the incredible growth of GPU computing and how it is changing our lives for the better. From humble beginnings, it has grown significantly since the first Nvision08 just four years ago. CUDA is Nvidia’s own proprietary GPU language, which can be considered roughly analogous to x86 for the CPU. From 150,000 CUDA downloads and 1 supercomputer in 2008 to 1,500,000 CUDA downloads and 35 supercomputers today; and from 60 universities and 4,000 academic papers to 560 universities teaching CUDA and 22,500 academic papers – in just 4 years!
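For readers who have never seen CUDA code, here is a minimal vector-add sketch of the kind of data-parallel program those download numbers represent (a generic textbook example of our own, not anything shown in the keynote):

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each GPU thread handles one element -- the essence of CUDA's
// data-parallel model.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    float *ha = (float*)malloc(bytes), *hb = (float*)malloc(bytes),
          *hc = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    float *da, *db, *dc;                              // device buffers
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    vecAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);  // launch ~1M threads
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %f\n", hc[0]);                     // expect 3.000000
    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}
```

The whole point of the model is that the same tiny kernel scales from a handful of threads to millions without changing a line.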
While Nvidia offers support for OpenCL, the company says it is not seeing any shift to OpenCL, even though OpenCL gives developers a much more cross-platform approach. With the exception of AMD, no one appears to be waiting for the OpenCL tools to mature, or for Intel to get tools out for its many-core MIC processor. Nvidia has created the tools, and programmers are excited to use them.
Convenience
One of the trends noted is that companies no longer supply notebooks or devices to employees, any more than they supply a “company car”. The trend is for employees to bring their own devices (BYOD) to work. Of course, this means that all of the devices an employee uses must be configured by the company’s IT department, and they must be configured to work together securely.
Nvidia is going to be at the forefront of this with their new initiative with Kepler – the first GPU that can be “virtualized” to drive cloud computing. Cloud computing is simply convenient.
It is a great advantage to be able to use any device seamlessly, and the important thing is that excellent graphics can be delivered to any device – now with about the same latency as a gaming console. This has applications for business as well as for gaming.
Kepler is Nvidia’s first “virtualized GPU”, which lets end users work from any device, anywhere, with the same excellent graphics on every device. Data centers can now be driven by Kepler GPUs, with applications residing in the cloud rather than on individual devices. The PC itself then becomes an application.
Joining Jensen on stage at GTC’s Day One Keynote were executives representing Nvidia’s partners supporting these cloud technologies. They included: David Yen, general manager and senior vice president of the Data Center Group at Cisco; Brad Peterson, technology strategist, and Sumit Dhawan, group vice president and general manager, at Citrix; David Perry, CEO and co-founder of Gaikai; and Grady Cofer, visual effects supervisor at Industrial Light & Magic.
A practical business use of the virtualized GPU was illustrated when Jensen invited Grady Cofer of Industrial Light & Magic on stage. Grady explained the problem that exists today when he tries to demo movie clips for a director: no matter how many shots he loads onto his PC, there is never enough flexibility or storage on his local machine. By using Nvidia’s GRID, however, he is able to instantly access his server and do anything he could do from his own office – both remotely and securely.
He demonstrated how he did this with the upcoming “Battleship” and also with “The Avengers”:
Of course, gaming can also benefit by having the application reside in the cloud. No longer do gamers have to wait to download anything. They just get connected and start playing – on any device – and with the same excellent graphics across all of the devices. Kepler has taken care of the latency issues.
The idea behind using the cloud is that movies are convenient – they simply work on any device – and games should too. Jensen looked forward to the day when anyone with broadband can have a subscription to a gaming service similar to what Netflix provides for movies, perhaps even at a similar monthly price. It is called the GeForce GRID, and it will be implemented by Nvidia’s partners in various forms.
Jensen invited Gaikai’s CEO onstage, and they explained that latency should not be an issue, considering that it takes light only about 100ms to circle the globe at the equator. Much more was revealed at the question and answer session with the press afterward, which we cover later in this article. Even Nvidia’s Project Denver was mentioned.
Nvidia’s CEO Jensen, Chief Scientist Bill Dally, Jeff Brown, and Robert Sherbin took live questions from the press.
There were some good questions, including “who owns the GeForce GRID?” However, before we check out the Q&A session with the press, let’s look at the new initiatives that Kepler will support, as outlined in Jensen’s keynote presentation. We will check them out one at a time, beginning with the Kepler virtualized GPU, especially as it relates to industry, cloud gaming, and specifically the new GeForce GRID.
KEPLER as a VIRTUALIZED GPU
Jensen’s keynote revealed Nvidia’s VGX platform, which enables IT departments to deliver a virtualized desktop with the graphics and GPU computing performance of a PC or workstation to employees using any connected device. Using the Nvidia VGX platform in the data center, employees can now access a true cloud PC from any device regardless of its operating system, and enjoy a responsive experience for the full spectrum of applications previously reserved for the office PC. It even allows seamless outsourcing across continents, as a Citrix slide shows.
Nvidia’s VGX enables knowledge workers, for the first time, to remotely access a GPU-accelerated desktop much like a traditional local PC. The platform’s manageability options and ultra-low-latency remote display capabilities extend this convenience to those using 3D design and simulation tools, which had previously been too expensive and bandwidth-hungry for a virtualized desktop, to say nothing of the latency issues.
Citrix had an interesting presentation. They offer many options for this kind of desktop virtualization for high-end graphics designers and users. They even have lossless imaging available for medical use.
Kepler can only improve on Fermi, and Citrix is welcoming Nvidia’s new cloud initiative. Early tests showed that with Fermi, 2 MB/s of bandwidth were needed; now, with more efficient codecs and Kepler, only 1.5 MB/s is required for the same results – a 25 percent reduction.
Integrating the VGX platform into the corporate network also enables enterprise IT departments to deliver a remote desktop to employees’ own devices, providing users the same access they have on their own desktop terminal. At the same time, it helps reduce overall IT costs, improve data security and minimize data center complexity.
Nvidia’s VGX is based on three key technology breakthroughs: (1) the VGX platform and boards, (2) the GPU Hypervisor, and (3) User Selectable Machines (USMs).
Nvidia’s VGX platform and boards
Nvidia’s VGX boards are the world’s first GPU boards designed for data centers, and they enable up to 100 users to be served from a single server powered by a single VGX board. This is pretty impressive compared with traditional virtual desktop infrastructure (VDI) solutions, and it extends this kind of service to far more employees in a company in a much more cost-effective manner. It also reduces the latency, sluggish interaction and limited application support associated with traditional VDI solutions.
Working together across continents securely and seamlessly is possible today, and it can only get better as Kepler architecture reduces latency further, as this Citrix presentation slide shows.
The initial VGX board features four GPUs, each with 192 CUDA architecture cores and 4 GB of frame buffer. This initial board is designed to be passively cooled and easily fits within existing server-based platforms.
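As a point of reference, a host application can already enumerate boards like this through the standard CUDA runtime. Here is a minimal sketch (the VGX-specific management interfaces are not public, so this shows only generic device enumeration):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Enumerate every GPU visible to the host and print the properties the
// VGX board description mentions: multiprocessor count and frame buffer.
int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp p;
        cudaGetDeviceProperties(&p, d);
        printf("GPU %d: %s, %d multiprocessors, %zu MB frame buffer\n",
               d, p.name, p.multiProcessorCount,
               p.totalGlobalMem / (1024 * 1024));
    }
    return 0;
}
```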
NVIDIA VGX GPU Hypervisor
The Nvidia VGX GPU Hypervisor is a software layer that integrates into a commercial hypervisor, enabling access to virtualized GPU resources. This allows multiple users to share common hardware and ensures that the virtual machines running on a single server have protected access to critical resources. As a result, a single server can now economically support a higher density of users while providing native graphics and GPU computing performance.
This new technology is being integrated by leading virtualization companies, such as Citrix, to add full hardware graphics acceleration to their full range of VDI products. From the Citrix presentation:
NVIDIA User Selectable Machines (USMs)
USMs allow the VGX platform to deliver the advanced experience of professional GPUs to those requiring them across an enterprise. This enables IT departments to easily support multiple types of users from a single server.
USMs allow better utilization of hardware resources, with the flexibility to configure and deploy new users’ desktops based on changing enterprise needs. This is particularly valuable for companies providing infrastructure as a service, as they can repurpose GPU-accelerated servers to meet changing demand throughout the day, week or season.
Citrix, HP, Cisco and many other virtualization companies are on board with Nvidia for this project which is being deployed later this year and additional information is available at www.nvidia.com/object/vdi-desktop-virtualization.html.
Cloud gaming
Nvidia’s virtualization capabilities allow GPUs to be shared simultaneously by multiple users. Its ultra-fast streaming display capability eliminates lag, making a remote data center feel like it’s just next door. And its extreme energy efficiency and processing density lower data center costs. Just as Citrix has branch repeaters for reducing latency in mission-critical industrial applications, similar things can be done for less-than-ideal latency connections.
The gaming implementation of Kepler cloud technologies, Nvidia’s GeForce GRID, powers cloud gaming services. Gaming-as-a-service providers will use it to remotely deliver excellent gaming experiences, with the potential to surpass those enjoyed on a console.
With the GeForce GRID platform, service providers can deliver the most advanced visuals with lower latency, while incurring lower operating costs, particularly related to energy usage. Gamers benefit from the ability to play the latest, most sophisticated games on any connected device, including TVs, smartphones and tablets running iOS and Android.
The key technologies powering the new platform are Nvidia’s new Kepler GRID GPUs, with dedicated ultra-low-latency streaming technology and cloud graphics software. Together, they fundamentally change the economics and experience of cloud gaming, enabling gaming-as-a-service providers to operate scalable data centers at costs in line with those of movie-streaming services. Previously, one GPU was required for each player.
Kepler architecture-based GPUs enable providers to render highly complex games in the cloud and encode them on the GPU, rather than the CPU, allowing their servers to run more simultaneous game streams. Server power consumption per game stream is reduced to about one-half that of previous implementations – an important consideration for data centers. And two users can now play off a single Kepler GPU in the cloud, compared with the one-to-one ratio required previously.
Fast streaming technology reduces server latency to as little as 10 milliseconds by capturing and encoding a game frame in a single pass. The GeForce GRID platform uses fast-frame capture, concurrent rendering and single-pass encoding to achieve ultra-fast game streaming.
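Nvidia has not published the GRID capture/encode API, so the following is purely a hypothetical sketch of the pipeline described above – render, capture, and hand the frame straight to the GPU’s encoder in one pass, with a timer around the whole thing. Every grid_* function is an invented name, stubbed so the sketch compiles:

```cpp
// Purely hypothetical sketch of the single-pass capture-and-encode loop the
// keynote described. The grid_* functions are invented stand-ins, stubbed
// here so the sketch compiles; the real GRID interfaces are not public.
#include <chrono>
#include <cstdio>
#include <thread>

struct Frame {};  // in reality, a rendered image resident in GPU memory

static Frame* grid_render_game_frame() { static Frame f; return &f; }  // stub
static void grid_encode_on_gpu(Frame*) {       // stand-in for the GPU encoder
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
}
static void grid_send_to_client(Frame*) {}     // stand-in for the network hop

int main() {
    for (int frame = 0; frame < 60; ++frame) { // one second at 60 fps
        auto t0 = std::chrono::steady_clock::now();

        Frame* f = grid_render_game_frame();   // render on the Kepler GPU
        grid_encode_on_gpu(f);                 // encode in the same pass --
                                               // no CPU read-back round trip
        grid_send_to_client(f);                // stream to the player's device

        double ms = std::chrono::duration<double, std::milli>(
                        std::chrono::steady_clock::now() - t0).count();
        printf("frame %d: server latency %.1f ms (keynote target ~10 ms)\n",
               frame, ms);
    }
    return 0;
}
```

The design point is that the frame never leaves the GPU between rendering and encoding, which is what makes the claimed 10 ms server-side budget plausible.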
The latency-reducing technology in GeForce GRID GPUs compensates for the distance in the network, so gamers may feel like they are playing on a high-end gaming PC located in the same room. Lightning-fast play is now possible, even when the gaming supercomputer is miles away.
Nvidia and Gaikai demonstrated a virtual game console, consisting of an LG Cinema 3D Smart TV running a Gaikai application connected to a GeForce GRID GPU in a server 10 miles away. Instant, lag-free play was enabled on HAWKEN, an upcoming Mech PC game, with only an Ethernet cable and wireless USB game pad connected to the TV. The GRID has captured the endorsement of several of these cloud providers.
For more information about GeForce GRID, please visit: http://www.nvidia.com/geforcegrid.
Q & A Session with the press
There was a long line of people waiting to ask questions about Jensen’s keynote and about Kepler.
Bill Dally was hired as Nvidia’s chief scientist and vice president at the beginning of 2009, before Fermi was released. Formerly the chairman of Stanford University’s computer science department, he has a very strong academic background in parallel computing, and his impact on Nvidia’s architectures is going to be felt even more in the future.
The question and answer session was clearly unscripted, and it was made clear that Nvidia’s partners – the ones providing the cloud gaming service – actually “own” the GeForce GRID. Jensen indicated that Nvidia was not currently considering running its own gaming cloud, but he ruled nothing out. Latency is going to become much less of an issue with Kepler, and even OnLive is evaluating Kepler for its own service. The possibility was also mentioned of using a home server with GTX 680s to wirelessly stream great gaming graphics to any room of the house and to any device!
Pricing was not given for either the K10 or the upcoming K20. Tegra 3 and 4 were mentioned in passing, and even Project Denver was given a quick nod, but nothing of real interest could be gleaned about them at the Q&A session other than that everything is progressing well.
There was also some humor involved. One programmer asked, “why is it taking so long?” And another, after seeing the simulation in the keynote, asked, “what is Nvidia doing in view of our Milky Way galaxy colliding with Andromeda?” (in 3.5 billion years). Jensen answered, “I for one am busy making plans.”
A question was asked whether it is cheaper to buy Kepler hardware or to use the cloud. Apparently the cloud is not a cheap alternative – what it offers is convenience, support for all sorts of devices, and seamless access for more employees. They also talked about the K20’s GK110 GPU as the most complex piece of silicon ever developed, at 7.1 billion transistors, compared with the dual-GPU K10’s 6.8 billion transistors in total. Power usage was stressed: Kepler is much more efficient than Fermi, to say nothing of the roughly 20-to-1 power disadvantage of getting comparable calculations out of an all-CPU server.
What about the GTX “780”?
There were a lot of questions left unanswered, but this editor was able to get his question answered by an Nvidia official afterward. The question was naturally about the future GeForce GPU that would be based on the GK110 – the video card using a 7.1 billion transistor GPU. Of course, Nvidia doesn’t discuss unreleased products, but it was made clear that there would be such a card after the demand for Tesla and Quadro is met.
This conference is not about gaming GPUs, and gaming won’t be mentioned in the whitepaper, but look very carefully at the die. It is obvious that there are 5 Graphics Processing Clusters (GPCs) with 3 SMX modules per GPC. A GPC is constructed like a “mini GPU” containing SMXs – two per GPC in the case of GK104 and three in the case of GK110. Here is the Kepler GK110 architecture whitepaper, which may be downloaded as a PDF:
http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110-Architecture-Whitepaper.pdf
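If GK110’s SMX keeps the same 192 single-precision CUDA cores as GK104’s (an assumption on our part until the whitepaper is fully digested), the die-shot arithmetic works out to 5 GPCs × 3 SMX × 192 cores = 2,880 CUDA cores, versus GK104’s 4 GPCs × 2 SMX × 192 = 1,536.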
Since we know that the K20 will be released in the November/December timeframe, it is logical that the new Nvidia flagship gaming GPU will come after that. So you can rest assured that the GTX 690/680 will remain the fastest gaming GPUs for the next six to nine months. At any rate, we will take a deep dive into the Kepler architecture in Wednesday’s session.
More Tuesday Sessions
There were many more sessions at the GTC that were quite technical and dealt with CUDA programming. Our earliest session, before the keynote, involved fire rendering and was co-presented by Pixar and Nvidia. A “how to” was described, and the evolution of fire rendering in movies and in gaming was discussed.
Pixar gave their best tips on how to properly render fire convincingly for games using CUDA calculations, and they presented it step by step.
They also showed how to use motion blur, heat and embers to make fires look truly realistic in games. Embers, along with noise, add to the realism.
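Pixar’s actual pipeline was not handed out with the slides, but the flavor of the ember trick is easy to sketch in CUDA: advect each particle upward with buoyancy and perturb it with cheap hash noise so it flickers. This is a toy of our own, not Pixar’s code:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Toy ember-particle update -- NOT Pixar's method, just an illustration of
// the ingredients discussed: buoyancy, turbulence (noise) and cooling.
struct Ember { float x, y, heat; };

// Cheap integer-hash "noise" in [-1, 1); real renderers use curl/Perlin noise.
__device__ float hashNoise(unsigned s) {
    s = s * 747796405u + 2891336453u;
    s = ((s >> ((s >> 28) + 4u)) ^ s) * 277803737u;
    return ((s >> 22) ^ s) / 2147483648.0f - 1.0f;
}

__global__ void stepEmbers(Ember* e, int n, float dt, unsigned frame) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    unsigned seed = i * 9781u + frame * 6271u;
    e[i].x += 0.3f * hashNoise(seed) * dt;        // sideways turbulence
    e[i].y += (0.5f + e[i].heat) * dt;            // hotter embers rise faster
    e[i].heat *= (1.0f - 0.8f * dt);              // cool down and fade out
}

int main() {
    const int n = 1024;
    Ember* e;
    cudaMalloc(&e, n * sizeof(Ember));
    cudaMemset(e, 0, n * sizeof(Ember));
    for (unsigned f = 0; f < 120; ++f)            // two seconds at 60 fps
        stepEmbers<<<(n + 255) / 256, 256>>>(e, n, 1.0f / 60.0f, f);
    cudaDeviceSynchronize();
    printf("simulated %d embers for 120 frames\n", n);
    cudaFree(e);
    return 0;
}
```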
And of course, their results were displayed.
There was a lot of note-taking, and of course this session – like all GTC sessions – can be accessed on Nvidia’s web site. Let’s hope we see a lot more realistic-looking fire in upcoming video games as a result.
After the keynote and press Q&A, we had a quick lunch, then managed to catch “Graphics in the Cloud” by Nvidia’s Will Wade, where he discussed cloud visualization and the technology behind it. Afterward, we ran to a CUDA compute session, where we learned about shared memory and how to use it efficiently, and were reminded that planning is the most critical part of any programming. CUDA programmers were encouraged to know their limitations and to plan on a whiteboard if necessary, so as to write CUDA code efficiently.
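To give a flavor of what that shared-memory material covers, here is the classic block-level sum reduction (a standard textbook example, not the presenters’ exact code): each block stages its slice of the input in fast on-chip shared memory so every value is read from slow global memory only once:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Classic block-level sum reduction: each block stages its slice of the
// input in on-chip shared memory, then folds it in log2(256) steps.
__global__ void blockSum(const float* in, float* out, int n) {
    __shared__ float s[256];                      // one tile per block
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    s[tid] = (i < n) ? in[i] : 0.0f;              // one global read per value
    __syncthreads();                              // tile fully loaded

    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) s[tid] += s[tid + stride];
        __syncthreads();                          // every fold step must finish
    }
    if (tid == 0) out[blockIdx.x] = s[0];         // one partial sum per block
}

int main() {
    const int n = 1 << 20, threads = 256, blocks = (n + threads - 1) / threads;
    float *in, *out;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, blocks * sizeof(float));

    float* h = (float*)malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i) h[i] = 1.0f;
    cudaMemcpy(in, h, n * sizeof(float), cudaMemcpyHostToDevice);

    blockSum<<<blocks, threads>>>(in, out, n);

    float* partial = (float*)malloc(blocks * sizeof(float));
    cudaMemcpy(partial, out, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    double total = 0;
    for (int b = 0; b < blocks; ++b) total += partial[b];
    printf("sum = %.0f (expect %d)\n", total, n);

    cudaFree(in); cudaFree(out); free(h); free(partial);
    return 0;
}
```

This is exactly the kind of kernel where planning on a whiteboard first pays off: the tile size, block size and shared array size all have to agree.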
Nsight for Visual Studio was also highlighted, though the details were unfortunately lost on this editor.
After the Citrix presentation described earlier, we headed to a large session that discussed the five years of CUDA and the progress that has been made. It got technical, as usual, but it was easy to see the incredible interest, and how disruptive GPU programming has been to the industry – changing the way we do things by speeding up calculations an order of magnitude beyond what was previously workable on the CPU.
After these sessions ended at 6 PM, there was a networking happy hour, and it was time to visit some of the exhibitors until 8 PM. On the way to the exhibitors’ hall, one passed the posters on display. Many uses of the GPU were illustrated at the GTC by the many posters displayed for all to see. There are some impressive uses of the GPU that affect our lives.
Almost all of them are quite technical – some of them deal with national security and this one involves using the massively parallel processing of the GPU for lunar research:
Of course, there is much more to the GTC, and Nvidia’s partners had many exhibits that this editor only got a glimpse of (note the oxygen bar to the left). We got to interview Fusion-io’s CEO about some amazing products that put the SSD to shame for speed and storage, and that will be making it to gamers’ PCs this year. More on that later.
Some of the exhibits are quite whimsical, like Zoobe (http://www.zoobe.com/).
However, all of them use the GPU to accelerate applications – like Scalable, which uses projectors to span huge multiple displays seamlessly.
Even the upcoming $60,000 all-electric Tesla Model S sedan was highlighted. It features a 17-inch in-dash Nvidia-powered display:
We have barely scratched the surface of GTC 2012. Tuesday was a full day, and there were still two days to go. It was time to head back to the hotel room, where we began to write our first-day analysis and answered our forum members’ questions about the GTC on the ABT forum.