GM200/Titan X announced with slow DP
#41
It will be interesting to see what nVidia does without AMD. If Captain Jack bombs (which it very much could), nVidia is going to find itself in a situation where they really don't have any competition in terms of midrange and high end cards. We will see what they do with their pricing on their future cards.

Their prices aren't bad right now but they are more expensive for similar performance compared to AMD. Most people are willing to pay the premium because the nVidia cards run cooler with far lower PSU requirements and less fan noise.
Reply
#42
(03-17-2015, 11:30 PM)Picao84 Wrote: Like I speculated in the old forum, Big Maxwell, unlike Big Kepler/Fermi, is not strong in DP: only 0,2 TFlops.
One more for the record, following my speculation that Maxwell would be mostly if not entirely done in 28nm, several months before the release of GM204. As usual, I was faced at the time with some fierce opposition to this idea, never mind how many arguments and facts I threw at it.

On a side note, the keynote is very very interesting. Lots of information about Pascal and performance estimates: 10x faster than Maxwell!!

http://www.ustream.tv/channel/gpu-techno...rence-2015

Picao84, I was one of the people who thought the GM200 would have a higher DP ratio than the GM204. I even suggested that maybe the GK210 was the result of a special contract that needed the specific changes that chip offered over the GK110.

I really did expect a higher DP ratio. I wasn't so sure the DP ratio would be as high as Kepler's, but I really felt that the full GM200 would still best the GK110 in DP, even if by a small tad. I fully expected it to better the GK110 in DP performance, at the very least.

So my expectations were wrong.

YOU WIN!!!!

If that makes you feel better, I have no problem admitting that my personal DP performance expectations for the GM200 were wrong. But..........

You know you have been steadily changing your claims from the beginning. You first came off saying the GM200 would be poor in compute. That was your claim and I challenged that claim. The GM200 is actually the most powerful compute graphics card there is. You used this term "compute" in just the same fashion as so many of the pro-AMD defenders in the time of Tahiti after the GK104 launched. Just so you know, compute is not DP performance. The truth is, when people typically talk about a GPU's "compute" performance, they refer to benchmarks and tasks that do absolutely no double precision calculations at all. These OpenCL benchmarks, tasks, bitcoin mining, etc. have nothing to do with DP performance, and they were always pointed to when someone wanted to show how strong Tahiti was in compute. Well, it turns out that the GM200 is an amazing compute card.

Maybe you just didn't know what you were talking about, or maybe you are moving the goalposts. Either way, there is an issue here and you were not entirely correct in your original claims.

Anyway, I have another question for you. I don't care to get into your debate with gstanford, but there is another problem among your claims. The GK110 Titan did not have a static DP performance that was directly tied to the HW layout. It doesn't work like you claim at all. Actually, out of the box you only get something like 1/24th the DP rate. But you say this is hardwired to the HW layout, and this is where your claims get busted. The Titan will not run at a 1/3rd DP rate unless you toggle it on through the control panel. It is a software switch implemented through the driver. With it off, the GK110 has the same DP rate as the GK104. Toggle it on and it goes to a 1/3 DP rate.

http://hexus.net/tech/reviews/graphics/5...-overview/
Quote: This fact is further substantiated by the knowledge that, just like Tesla K20X, TITAN can run double-precision compute at 1/3rd of single-precision speeds, leading to over 1TFLOPS DP throughput. However, being a gamer's card at heart, TITAN's DP rate is set to 1/24th of SP, just like GTX 680, as no games use double-precision calculations. The full 1/3rd ratio can be set via the control panel, yet doing so forces the GPU's clocks down. And no gamer wants that, right?
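To put rough numbers on the two ratios in that quote, here is a minimal sketch of the usual peak-throughput arithmetic. The helper name `peak_tflops` is made up for illustration, and the 2688 cores and 837 MHz base clock for the original TITAN are assumptions taken from the public spec sheet; real DP clocks are lower once the 1/3 mode is enabled, as the review says.

```python
def peak_tflops(cores: int, clock_mhz: float, dp_ratio: float = 1.0) -> float:
    """Theoretical peak in TFLOPS: 2 FLOPs (one FMA) per core per clock,
    scaled by the DP:SP ratio. Hypothetical helper, not a vendor API."""
    return cores * 2 * clock_mhz * 1e6 * dp_ratio / 1e12


# Assumed GTX TITAN figures: 2688 CUDA cores at an 837 MHz base clock.
titan_sp = peak_tflops(2688, 837)
titan_dp_full = peak_tflops(2688, 837, dp_ratio=1 / 3)      # control panel switch on
titan_dp_default = peak_tflops(2688, 837, dp_ratio=1 / 24)  # out of the box

print(f"SP {titan_sp:.2f}  DP 1/3 {titan_dp_full:.2f}  DP 1/24 {titan_dp_default:.2f}")
```

Under the same assumptions, feeding in 3072 cores at 1000 MHz with a 1/32 ratio lands at roughly 0.19 TFLOPS, which lines up with the "0,2 TFlops" figure quoted for GM200 at the top of this post.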


So I guess the GK110 has shape-shifting SMXs, or gstan knows something about what he is talking about.

I am not saying that the GM200 is capable of a higher DP rate; I actually accepted the fact that I was wrong. I am just saying that you may not know as much as you think about how Nvidia disables DP. If we look at the GK110, it is obvious that Gstan is correct in suggesting the clock rate goes up. I am actually glad he was willing to add some valuable information to the discussion.

Now it is up to you to figure out why Nvidia has limited the GM200 to the same DP rate as the GM204. Because you can't both be right.
Reply
#43
(03-19-2015, 09:06 AM)ocre Wrote:
You know you have been steadily changing your claims from the beginning. You first came off saying the GM200 would be poor in compute. That was your claim and I challenged that claim. The GM200 is actually the most powerful compute graphics card there is. You used this term "compute" in just the same fashion as so many of the pro-AMD defenders in the time of Tahiti after the GK104 launched. Just so you know, compute is not DP performance. The truth is, when people typically talk about a GPU's "compute" performance, they refer to benchmarks and tasks that do absolutely no double precision calculations at all. These OpenCL benchmarks, tasks, bitcoin mining, etc. have nothing to do with DP performance, and they were always pointed to when someone wanted to show how strong Tahiti was in compute. Well, it turns out that the GM200 is an amazing compute card.

That is totally not true. I never moved any goalposts. My point was always about DP.
See the following post on the old forums, still available (I tried to copy the link but could not find it):

Quote:I was looking for an old thread and then I remembered that many were lost weeks ago.

Anyway, looks like my expectations are not so different from what may happen. No Maxwell HPC chip at all, so GM200 should indeed be a graphics oriented chip...

http://wccftech.com/nvidia-planning-dit ... ving-2017/

Quote:The reason behind not deploying Maxwell in Tesla Accelerators is said to be the lack of FP64 Floating Point Units and additional double precision hardware. And the reason behind not including DP FPUs in Maxwell might have to do with the extremely efficient design that NVIDIA was aiming for. This however means that NVIDIA’s upcoming Maxwell core, the GM200 which is the flagship core of the series might just remain the GeForce only offering unlike Kepler which was aimed for the HPC market first with the Titan supercomputer and launched a year later after the arrival of the initial Kepler cores as the GeForce GTX Titan.

Quote: Since the DP FP64 FPU hardware blocks will be removed from the top-tier cards that are rumored to arrive next year, they will include several more FP32 FPUs to make use of the remaining core space and that means better performance for the GeForce parts since games have little to do with Double precision performance.

Quote:GM200 did not have that much space to grow in 28nm, given that GK110 was already huge. Could they make it faster at DP? Probably yes. But how much? Would it be worth it for say 20% more performance? HPC market is very different from graphics market. HPC is sold in massive quantities, not individual cards. No company is going to massively upgrade their system for 20% extra performance. Jumps in performance must be much higher than that. They kinda did that sort of jump with GK210 double chip cards. They choose to optimise an existing chip, GK110, to reduce power consumption heavily in order to offer a bigger jump in performance than a single GM200 could offer. It was a very intelligent move on their part, IMO.

I even said it was a very intelligent move on their part, so take back your claims about AMD bla bla bla.

Quote:It is not a GEFORCE only part, but also a QUADRO part Wink
Just not a high end TESLA part.

More information:
http://wccftech.com/nvidia-gm200-based- ... s-compute/

Hmm, 6,07 TFLops single precision versus GK110 and GM204 5,2 TFlops. Not much of a big jump, is it? Then again, for graphics, nVIDIA architectures were always very efficient per TFlop. However, this makes me question things a little bit.. With weak DP performance, I would have expected a larger jump in single precision. Then again, the possible extra TMUs and ROPS (2x the ROPS of GK110!), plus double the cache of GK110 (I've seen a speculation from somewhere else for 3MB, up from GM204 2MB) might have taken quite a bit of the space left vacant by absent FP64 units.

I even said that the rumored performance, 6,07 TFlops, was lower than what I was expecting. So no, you cannot say that I expected less Single Precision performance, or that I don't know the difference between them!!


Quote:Maybe you just didnt know what you were talking about or maybe you are changing the goal post.  Either way, there is an issue here and you were not entirely correct in your original claims.

I was correct in my original claims. Looks like you just have poor memory!

Quote: Anyway, i have another question for you.  I dont care to get into your debate with gstanford, but there is another problem among your claims.  The gk110 titan did not have this static DP performance that was directly tied to the HW layout.  It doesnt work like you claim at all.  Actually, out of the box you only get something like 1/24th the DP rate.  But you say this is hardwired to the HW layout and this is where your claims get busted.  The titan will not run at 1/3rd DP rate unless you toggle it on through the control panel.  It is a software switch in implemented through driver.  With it off, the Gk110 has the same DP rate as the Gk104.  Toggle it on and it goes to a 1/3 DP rate.

http://hexus.net/tech/reviews/graphics/5...-overview/

Quote: This fact is further substantiated by the knowledge that, just like Tesla K20X, TITAN can run double-precision compute at 1/3rd of single-precision speeds, leading to over 1TFLOPS DP throughput. However, being a gamer's card at heart, TITAN's DP rate is set to 1/24th of SP, just like GTX 680, as no games use double-precision calculations. The full 1/3rd ratio can be set via the control panel, yet doing so forces the GPU's clocks down. And no gamer wants that, right?
   

So i guess the GK110 has a shape shifting SMMs or gstan knows something about what he is talking about.  

I am not saying that the gm200 is capable of a higher DP rate, i actually accepted the fact that i was wrong.  I am just saying that you may not know as much as you think about the what nvidia disables DP.  If we look at the gk110, it is obvious that Gstan is correct in suggesting the clock rate goes up.   I am actually glad he was willing to add some valuable information to the discussion.  

No it is up to you to figure out why Nvidia has limited the Gm200 to the same DP rate as the gm204. Cause you both cant be right

My debate with Gstanford IS about the physical limitation on GM200, something that he continues to say is not true! I never said that GK110 on GeForce GTX780Ti was not artificially limited or hardwired (when I said they were fused, I was leading him into a trap.. from the moment he says it is software based he had to understand that if GM200 does not have more than 4 FP64 per SM... he had to understand GM200 is not artificially limited - guess he didn't connect the dots). Of course it is! I even showed the difference between them in active cores (960 vs 90). What you don't seem to understand is that the DP rate is not some internal clock. It's related to the number of enabled FP64 units (software or otherwise, it's irrelevant: if the units are not there, there is nothing you can do). Gstanford seems to believe that nVIDIA can, if they want, enable DP rates higher than 1:32 on GM200. They can't! This conversation is all around that. It's silly!
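The unit-count argument above can be sketched in a few lines. This assumes the per-SM figures drawn in nVIDIA's public block diagrams (192 FP32 cores with 8 FP64 units for a GK104 SMX, 192 with 64 for a GK110 SMX, 128 with 4 for a GM200 SMM); the table and the function name are illustrative only and do not come from any driver or toolkit.

```python
from fractions import Fraction

# (FP32 cores, FP64 units) per SM. Assumed values, read off the public
# whitepaper block diagrams rather than measured on hardware.
UNITS_PER_SM = {
    "GK104": (192, 8),
    "GK110": (192, 64),
    "GM200": (128, 4),
}


def native_dp_ratio(chip: str) -> Fraction:
    """DP:SP ratio implied purely by how many FP64 units sit next to the
    FP32 cores in one SM. Hypothetical helper for illustration."""
    fp32, fp64 = UNITS_PER_SM[chip]
    return Fraction(fp64, fp32)


for chip in UNITS_PER_SM:
    print(chip, native_dp_ratio(chip))
```

On these assumptions the ratios come out as 1/24, 1/3 and 1/32. A software switch can expose fewer units than are physically present (GK110 running at 1/24), but it cannot raise the ratio above what the unit count allows.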

Honestly, I'm out of this forum. There is a pack mentality here that if someone does not share the same vision, they are hunted and accused, directly or indirectly, of being pro AMD. This forum lacks rationality and free thinking. It is even worse than other places you like to bad name. At least on Beyond 3D its possible to have educated discussions and really learn with other people. I have no idea what BTR is, and I certainly started to dislike and distrust Appopin, but you guys... are not any better. Just look at yourselves in the mirror. Your self image might not be who you think you are. Over and out.
Reply
#44
Doesn't matter if GM200 has 1, 4 or 4000 FP64 units on it!

They can still be run at different ratios of full speed, and if you look at the anandtech chart I posted before you will see they are running at a 1/32 rate.

If you are too thick to understand that, then I'm afraid there is nothing more that I can tell you....
Adam knew he should have bought a PC but Eve fell for the marketing hype.

Homeopathy is what happened when snake oil salesmen discovered that water is cheaper than snake oil.

The reason they call it the American Dream is because you have to be asleep to believe it. -- George Carlin
Reply
#45
(03-19-2015, 03:48 PM)gstanford Wrote: doesn't matter if GM220 has 1, 4 or 4000 FP64 units on it!

They can still be ran at different ratios of full speed and if you look at the anandtech chart I posted before you will see they are running at a 1/32 rate.

If you are too thick to understand that, then I'm afraid there is nothing more that I can tell you....

Do you know what native means??? You are the thick one here that cannot see the obvious!
[Image: Native_GM200.png]

http://www.anandtech.com/show/9059/the-n...x-review/2
Reply
#46
Perhaps "native FP64" means what "4 gb", "64 rops", "256 bit" mean on GTX 970.....
Reply
#47
(03-19-2015, 04:34 PM)gstanford Wrote: Perhaps "native FP64" means what "4 gb", "64 rops", "256 bit" mean on GTX 970.....

So your escape pod is "perhaps" and "nvidia is lying"? Stop the bullshit. You are a perfect picture of what is wrong in this forum. Everyone wants to be right at all costs and twists reality when faced with the truth. In your case it's ridiculous really, since you brought this upon yourself by lashing out at me before having all the facts straight. Like I said, you didn't read any reviews properly before opening your mouth, otherwise it would have been shut.
Reply
#48
(03-19-2015, 08:25 AM)SickBeast Wrote: It will be interesting to see what nVidia does without AMD.  If Captain Jack bombs (which it very much could), nVidia is going to find itself in a situation where they really don't have any competition in terms of midrange and high end cards.  We will see what they do with their pricing on their future cards.

Their prices aren't bad right now but they are more expensive for similar performance compared to AMD.  Most people are willing to pay the premium because the nVidia cards run cooler with far lower PSU requirements and less fan noise.

My guess is it will be a lot like it is now.

Nvidia "could" say the new price point for mid range cards is $500, but:
A. They risk pushing people out of computer gaming altogether and onto comparable goods. (consoles, tablets, phones, handhelds, intel based laptops)
B. They would sell far less of all products in the stack if they adjust prices upward by any significant amount.

It's true that some people would just pay the higher prices, but they are offset by those who would give up high end computer gaming altogether.

Nvidia can only survive selling people graphics cards every year or two at a price they want to pay. They've already lost a huge chunk of their business to consoles, and intel laptops. I guarantee you they don't want to lose more just to cash in temporarily. Consoles already have a slew of huge advantages: $400-$500 acquisition cost for whole unit and it's the only hardware you buy for 8 years, ease of use, huge installed user base that likely includes most of your gaming friends, more comfortable to sit in a recliner in your living room than in front of a pc, console game exclusives, etc..

If Nvidia attempts to pillage us, they will see very quickly how many think playing the same games at 720p or 1080p on a 50"+ screen is probably "good enough" in the face of rising PC gaming costs. They would basically be ending their own business.
Reply
#49
(03-19-2015, 05:04 PM)Picao84 Wrote:
(03-19-2015, 04:34 PM)gstanford Wrote: Perhaps "native FP64" means what "4 gb", "64 rops", "256 bit" mean on GTX 970.....

So your escape pod is "perhaps" and "nvidia is lying"? Stop the bullshit. You are a perfect picture of what is wrong in this forum. Everyone wants to be right at all costs and twists reality when faced with the truth. In your case its ridiculous really, since you brought this upon yourself by lashing out at me before having all the facts straight. Like I said, you didn't read any reviews properly before opening your mouth, otherwise it would have been shut.
You are the one taking the "native FP64" bit of that chart as gospel, not me.

Picao84 Wrote:Honestly, I'm out of this forum.

Don't let the door hit you where the good lord split you.....
Reply
#50
(03-19-2015, 03:24 PM)Picao84 Wrote: Honestly, I'm out of this forum. There is a pack mentality here that if someone does not share the same vision, they are hunted and accused, directly or indirectly, of being pro AMD. This forum lacks rationality and free thinking. It is even worse than other places you like to bad name. At least on Beyond 3D its possible to have educated discussions and really learn with other people. I have no idea what BTR is, and I certainly started to dislike and distrust Appopin, but you guys... are not any better. Just look at yourselves in the mirror. Your self image might not be who you think you are. Over and out.

There is no pack. The Duopoly is equally discussed here therefore there is certainly plenty of "rationality and free thinking".

You know the saying "If you can't take the heat, get out of the kitchen".
Reply
#51
What's funny is, free thinking people is exactly what he cannot seem to tolerate. I think he wants to be surrounded by people who agree with him, or as he puts it "able to have educated discussions". Because if you don't agree with him, you must be uneducated.

Sheesh man.
Reply
#52
(03-20-2015, 07:42 AM)BjorgenFjords Wrote: What's funny is, free thinking people is exactly what he cannot seem to tolerate. I think he wants to be surrounded by people who agree with him, or as he puts it "able to have educated discussions". Because if you don't agree with him, you must be uneducated.

Sheesh man.

I pretty much put that guy on "ignore" when he threw Nvidia Focus Group conspiracy theory out at us. You're the only member now aren't you?

Oh noes, don't unleash the PR disinformation juggernaut of all one of you upon us Keys! How can we defend ourselves against such a corporate influence?!?!
Reply
#53
In light of you being the Nvidia Focus Person and a person can't be a Group by definition, I say Nvidia has no transparency and is competing unfairly with AMD!

Without the treachery of the Nvidia Focus Person and intel's buyer loyalty rebates, AMD would probably have us all living at Star Trek-like levels of technology.
Reply
#54
Picao seemed to have too much of a temper. Rage issues. It's too bad because he was pretty knowledgeable.
Reply
#55
(03-20-2015, 04:57 PM)RolloTheGreat Wrote: In light of you being the Nvidia Focus Person and a person can't be a Group by definition, I say Nvidia has no transparency and is competing unfairly with AMD!

Without the treachery of the Nvidia Focus Person and intel's buyer loyalty rebates, AMD would probably have us all living at Star Trek-like levels of technology.

They already do - their shills at least. Star Trek is fantasy and the shills live in fantasy land..........
Reply
#56
(03-20-2015, 06:49 PM)gstanford Wrote: They already do - their shills at least.  Star Trek is fantasy and the shills live in fantasy land..........

I think I know why Captain jack is taking so long.

They're only allowed to use AMD CPUs designing it.

ATi Engineer:"Boss I would have had Captain Jack done 6 months ago if I had a Haswell! What's with this "Vishera" crap?!"

Manager: "Shut up and be glad you have a job! AMD was nice enough to take us in when NVIDIA had pushed us to the edge of bankruptcy, you'll use their CPUs and like them!"

ATi Engineer: "But boss! My freaking PHONE is faster than this "top of the line" desktop! I remember when they launched this chipset when I was in High School!"

Manager: "I know....just pray they don't sell us to Kia to make dash displays for subcompact economy cars for the rest of our lives......"
Reply
#57
(03-19-2015, 03:24 PM)Picao84 Wrote: That is totally not true. I never changed any goal posts. My point was always about DP.
See the following post on the old forums, still available ( I tried to copy the link but did not find it):



I even said it was a very intelligent move on their part, so take back your claims about AMD bla bla bla.




I even said that the rumored performance, 6,07 TFlops was lower than what I was expecting. So no, you cannot say that I expected less Single Precison performance, or that I don't know the difference between them!!




I was correct in my original claims. Looks like you just have poor memory!


My debate with Gstandford IS about the physical limitation on GM200, something that he continues to say its not true! I never said that GK110 on GeForce GTX780Ti was not artificially limited or hardwired (when I said they were fused, I was leading him into a trap.. from the moment he says it is software based he had to understand that if GM200 does not have more than 4 FP64 per SM... he had to understand GM200 is not artificially limited - guess he didn't connect the dots). Of course it is! I even showed the difference between them in active cores (960 vs 90). What you don't seem to understand is that the DP rate is not some internal clock. Its related to the amount of enabled FP64 units (software or otherwise its irrelevant, if the units are not there, there is nothing you can do). Gstanford seems to believe that nVIDIA can, if they want, enable DP rates higher than 1:32 on GM200. They cant! This conversation is all around that. Its silly!

Honestly, I'm out of this forum. There is a pack mentality here that if someone does not share the same vision, they are hunted and accused, directly or indirectly, of being pro AMD. This forum lacks rationality and free thinking. It is even worse than other places you like to bad name. At least on Beyond 3D its possible to have educated discussions and really learn with other people. I have no idea what BTR is, and I certainly started to dislike and distrust Appopin, but you guys... are not any better. Just look at yourselves in the mirror. Your self image might not be who you think you are. Over and out.

There is no reason for you to get so overdramatic. I openly stated I was wrong in my expectations for the DP performance of the GM200. I clearly have no problem admitting that much. But when I challenged you to explain how the GK110 can have a software switch from a GK104-like DP ratio to the 1/3rd ratio of the original Titan, you got all crazy.

I have no idea what you're talking about when you say you're being hunted for being pro AMD. Do W-H-A-T-?
That is when I realized maybe you really are completely irrational.

I could be wrong though. And I won't have an issue admitting it.........I actually hope that I am. I thought you were an alright guy who has strong beliefs and will fight for them. But really, you're getting crazy and worked up over a lot of stuff that is 100% all in your head.

For starters, I am not on gstanford's side on this at all. The DP performance of the GM200 is limited. I conceded, I see that.

But that doesn't mean you were correct about the clock rate. Because obviously, the GK110 had some way of switching between the 1/24 rate and the 1/3 rate. This doesn't mean that the GM200 can increase its DP clock/ratio. But you originally claimed that DP was a result of the physical FP64 hardware units and there was no such clock or timing. There obviously is a way with the GK110.

My personal feeling is that the ability to increase the clock/timing on Kepler was special and only the GK110 had it; that it couldn't be done with the GK104. But that is my guess. With that, I imagine the GM200 is missing that ability as well. Perhaps it takes up much-needed space on the die.

So it is not that I ever challenged the DP rate of the GM200. I believe it is and always will be 1/32. It is just that you must also see that there is some special way the GK110 can go from 1/24 to 1/3. Perhaps there is something more you can learn about this. Perhaps your assertion that gstanford "doesn't know what he is talking about" is way too harsh. Because obviously there is more to this. And I think that is why you got so angry.

I have no idea why you would go off like that. There was only one person claiming GM200 can do more than 1/32 DP and that was Gstan. No one else here is challenging that. Everybody else believes you; they believe that the GM200 is as DP limited as the GM204.
I just wanted you to think further into it. To realize that there is some way of changing the rate with a GK110 and it is not a hardwired/fixed result.

I also don't think you made up that the 780 Ti had fused-off DP in each SM just to lead him into a trap. I think you mistakenly thought that is how it worked, and then you tried to twist your statement into something else. It is alright, I had no idea how Nvidia cut back DP ratios with Kepler. There are probably very few people who do know how it works. It is okay if you're not 100% right all of the time. Don't get mad.

But I do believe you on the GM200. I believe it is limited to 1/32 and cannot be changed like the GK110 can. But it is obvious there is some sort of clocking, timing, or multiplier at work in big Kepler's DP capabilities.
Reply
#58
Just to be clear, I don't agree or disagree with Picao or Gstanford in this matter. All I can say without any doubt is, I don't know if the 1/32 DP ratio is a hardware-limited factor of GM200 or a software-limited factor. It could very well be that Nvidia decided, for their own in-house reasons, to restrict DP to 1/32 on "GeForce" Titan this time around. Perhaps the Quadro and/or Tesla versions of GM200 (if there are any) will, through software, allow a significantly stronger DP ratio.
Picao maintains (without any real proof) that GM200 is hardware limited: it does not physically have the hardware for any DP rate faster than 1/32. Gstanford maintains (without any proof) that GM200, like GK110, could, through software, attain a faster DP rate.
I don't agree with either of them, or disagree with either of them, because I can't.
Reply
#59
You can't design a modern FP64 unit to be as slow as 1/32 natively.

nvidia doesn't fundamentally alter the ALUs in an SMX between big and small die either. The scheduler may get more capable between big and small dies (as was the case with GK110 vs GK104).

Basically you design the SMX once, and that is the basis for everything from the smallest Tegra die to the monster Tesla die.

The only thing that differs is the number of SMXs on the chip and how complex the scheduler is.
Reply
#60
(03-21-2015, 01:46 PM)ocre Wrote:

I have no idea what your talking about when you say your being hunted for being pro AMD.  Do W-H-A-T-?

I'm not posting more here, but this point needs clarifying. I am NOT pro AMD or an AMD fan, nor have I ever been. I'm neutral with a slight bias towards nVIDIA (did you play Bioware/D&D games? Same logic as "neutral good"). Here is the point that some here cannot understand: if you post something that is not along the lines of "nVIDIA is good/AMD is bad", you are seen as pro AMD. Rubbish. Yes, I did imply Rollo and Keysplayr are overly pro nVIDIA, which is not exactly unknown to anyone. And I only said that after Keysplayr got in the way of my debate with Rollo, not to add something to the conversation, but to say, out of the blue, that my post was overly emotional. That was his first post on this thread. It was very significant and set the new tone, and he continued doing it as the thread progressed. That is my rationale for the "pack" comment, since most of his posts targeted me instead of adding to the conversation. Had he not done that, things would probably have been different. Clarification over.

(03-22-2015, 06:03 AM)gstanford Wrote: You can't design a modern FP64 unit to be as slow as 1/32 natively.

It's not the unit itself that is slow. But I've already shown you the math behind the "1/whatever" rate. You ignored it. I actually learned it from real CUDA developers who know how the chip works and try to extract every bit of performance from it. But hey, what do they know, right?

Quote: nvidia doesn't fundamentally alter the ALU's in a SMX between big and small die either.  The Scheduler may get more capable between big and small dies (as was the case with GK110 vs GK104).

Basically you design the SMX once, and that is the basis for everything from the smallest Tegra die to the monster Tesla die.

The only thing that differs is the number of SMX's on the chip and how complex the scheduler is.

*sigh* Wrong, wrong, wrong.

GK104 SMX:
[Image: GeForce_GTX_680_SM_Diagram_FINAL.png]

GK110 SMX:
[Image: GK110SMX.png]

Impressive how many here go "la,la,la,la" and say there is no factual evidence when nVIDIA is so open about it. Instead they rely on "ifs" and "buts" and hypotheses that nVIDIA is not telling the full truth, bla, bla, bla. They rely on "wishful thinking" and their own flawed interpretation of how things work, and accuse the person who presents facts of "spewing bullshit". Hey, sorry if I expected such big nVIDIA fans to actually read technical material (White Papers) provided by nVIDIA themselves.

Go on with your crusade, Gstanford, for a holy grail that does not exist (GM200 cannot go over 1:32). I won't delay you more.
Reply
#61
(03-22-2015, 05:49 PM)Picao84 Wrote: Hey, sorry if I expected such big nVIDIA fans to actually read technical material (White Papers) provided by nVIDIA themselves.

Go on on your crusade Gstanford for an holy grail that does not exist (GM200 cannot go over 1:32). I won't delay you more.


I can't speak for other "Nvidia Fans", but debating and/or searching for clues as to software vs hardware disabled FP functionality is about as enticing as say, "Watching paint dry".

I'm a gamer. If Nvidia figures out how to weld a couple transistors to empty beer cans and make them run the games at the resolution/detail level I desire faster than their competition, that's a sale.

This bizarre tradition of "I have read Company Xs white papers, and the presence of Part Y in amount Z should translate to a 13.84% improvement in shader based AA methods!" leaves me cold.

A couple things are true:

1. I make no money memorizing or analyzing white papers
2. Whether disabled by hardware or software, it's out of my hands so I consider it a moot point. I'm not going to start bridging cut traces or running bathtub drivers in a quest for 5% more fps.

Rage on, Picao, but don't expect me to waste my time with it.
Reply
#62
Picao. I'm not saying you are right or you're wrong. I am saying you could be either, or there is some kind of middle of the road truth. What you do not understand (maybe you do but continue on) is that you do not have all the data necessary to reach a conclusion about this. Neither does Gstanford. You "think" you do, but you don't. The two charts you have shown of the block diagrams of GK104 and GK110 are interesting. Tell me, what was the DP ratio of GK104?

I know you said you aren't posting anymore, but I hope you do.

And lastly, the "emotional" comment of mine was based ENTIRELY on the way you come off in your posts. You show anger. Aggression. Frustration. <---- emo things.

Anyway, if you can change that and just stick to data without throwing a hissy fit, that would be cool.
Reply
#63
(03-22-2015, 07:10 PM)RolloTheGreat Wrote:
(03-22-2015, 05:49 PM)Picao84 Wrote: Hey, sorry if I expected such big nVIDIA fans to actually read technical material (White Papers) provided by nVIDIA themselves.

Go on with your crusade, Gstanford, for a holy grail that does not exist (GM200 cannot go over 1:32). I won't delay you any longer.

I can't speak for other "Nvidia Fans", but debating and/or searching for clues as to software vs hardware disabled FP functionality is about as enticing as say, "Watching paint dry".

I'm a gamer. If Nvidia figures out how to weld a couple transistors to empty beer cans and make them run the games at the resolution/detail level I desire faster than their competition, that's a sale.

This bizarre tradition of "I have read Company Xs white papers, and the presence of Part Y in amount Z should translate to a 13.84% improvement in shader based AA methods!" leaves me cold.

A couple things are true:

1. I make no money memorizing or analyzing white papers
2. Whether disabled by hardware or software, it's out of my hands so I consider it a moot point. I'm not going to start bridging cut traces or running bathtub drivers in a quest for 5% more fps.

Rage on, Picao, but don't expect me to waste my time with it.

As far as I remember, this conversation was with Gstanford, not with you. Our conversation, which is over for me, was on another thread altogether. If you do not find this conversation enticing, you have a very good solution: do not read or comment on it.

(03-22-2015, 08:11 PM)BjorgenFjords Wrote: Picao. I'm not saying you are right or you're wrong. I am saying you could be either, or there is some kind of middle of the road truth. What you do not understand (maybe you do but continue on) is that you do not have all the data necessary to reach a conclusion about this. Neither does Gstanford. You "think" you do, but you don't. The two charts you have shown of the block diagrams of GK104 and GK110 are interesting. Tell me, what was the DP ratio of GK104?

The DP ratio of GK104 was/is 1:24.

From Anandtech:
"The other change coming from GF114 is the mysterious block #15, the CUDA FP64 block. In order to conserve die space while still offering FP64 capabilities on GF114, NVIDIA only made one of the three CUDA core blocks FP64 capable. In turn that block of CUDA cores could execute FP64 instructions at a rate of ¼ FP32 performance, which gave the SM a total FP64 throughput rate of 1/12th FP32. In GK104 none of the regular CUDA core blocks are FP64 capable; in its place we have what we’re calling the CUDA FP64 block. The CUDA FP64 block contains 8 special CUDA cores that are not part of the general CUDA core count and are not in any of NVIDIA’s diagrams. These CUDA cores can only do and are only used for FP64 math."

I hope that satisfies your [possible] curiosity about the absence of DP units on the GK104 SMX diagram.

Further, as said above, each SMX has a block of 8 FP64 units. Let's do the math, shall we?
8*8 [SMX] = 64.
How many FP32 CUDA cores exist in total on GK104? 1536.
How much is 1536:64?
1536/64 = 24
So this gives it a ratio of 1:24. Coincidence?

On GM200:

Quote:GM200 is 601mm2 of graphics, and this is what makes it remarkable. There are no special compute features here that only Tesla and Quadro users will tap into (save perhaps ECC), rather it really is GM204 with 50% more GPU. This means we’re looking at the same SMMs as on GM204, featuring 128 FP32 CUDA cores per SMM, a 512Kbit register file, and just 4 FP64 ALUs per SMM, leading to a puny native FP64 rate of just 1/32.

How did Anandtech conclude that? Let's do the math again.
4*24 [SMM] = 96
How many FP32 CUDA cores exist in total on GM200? 3072.
How much is 3072:96?
3072/96 = 32
So this gives it a ratio of 1:32. Coincidence?
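
For anyone who wants to check the arithmetic above, here is a minimal sketch of it in Python. It only uses the unit counts quoted in this thread; the dictionary layout and the `fp64_ratio` helper are made up for illustration, and the result is the native hardware ratio only, not a statement about what the driver exposes.

```python
from fractions import Fraction

# Unit counts as quoted in this thread (AnandTech figures).
# "sm_count" is the number of SMX/SMM blocks on the full chip.
CHIPS = {
    "GK104": {"sm_count": 8, "fp64_per_sm": 8, "fp32_total": 1536},
    "GM200": {"sm_count": 24, "fp64_per_sm": 4, "fp32_total": 3072},
}


def fp64_ratio(sm_count, fp64_per_sm, fp32_total):
    """Return total FP64 units divided by total FP32 cores, reduced."""
    return Fraction(sm_count * fp64_per_sm, fp32_total)


for name, spec in CHIPS.items():
    ratio = fp64_ratio(**spec)
    print(f"{name}: {ratio.numerator}:{ratio.denominator}")
```

With the numbers above this prints 1:24 for GK104 and 1:32 for GM200, matching the rates quoted from Anandtech.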

I would have no problem with someone presenting data to back up their stance on the matter, but so far all I've heard is along the lines of "announced FP64 might be the same as the announced ROP count of GTX970" etc. The two situations are not even comparable. GTX970 is, by definition, a cut-down version of a chip that we have seen fully enabled (GTX980). GM200 was presented by nVIDIA as fully enabled on GTX Titan X. Saying they may be lying through their teeth without info to back up that claim is not what I call an educated argument. Much less when that same person tells me that the existence of "1, 4 or 4000 FP64 units on it" is irrelevant. I do not believe in magic or spontaneous generation, sorry.

Concerning the "emotional issues", I will only say that you should look at some other people's posts as well, but so far I have not seen you say that to anyone else. In fact this thread started getting "emotional" right after Gstanford said this: "You are spouting bullshit here, Picao84!" I did not see you quote him and tell him he was being emotional. I'm not the only one writing this way here, nor is it my usual personal style, as you can see in other forums (feel free, my nick is always the same - Picao84). But somehow, I'm the only one targeted as being "emo"? Of course it ends up feeling like double standards and pack behavior. But perhaps it's a consequence of my being relatively new here. You are so used to the way Rollo and Gstanford express themselves that you do not notice anymore. Or maybe some moderation is indeed needed, beyond just trying to defend the forum and push people out. You might say that moderation is not needed, but then again, why quote me and say I was being emotional? If it's a free-for-all, let it be.
Reply
#64
(03-22-2015, 09:09 PM)Picao84 Wrote:


Rage!
Reply
#65
(03-22-2015, 11:17 PM)RolloTheGreat Wrote:
(03-22-2015, 09:09 PM)Picao84 Wrote:


Rage!

No Darth Vader, you cannot pull me again to the Dark Side!!! Cool
Reply
#66
But GK104 doesn't have any DP units. At least according to the block diagram like GK110 does. How is it that it's able to complete DP tasks if there isn't any hardware to do it?
Reply
#67
(03-23-2015, 04:56 PM)BjorgenFjords Wrote: But GK104 doesn't have any DP units. At least according to the block diagram like GK110 does. How is it that it's able to complete DP tasks if there isn't any hardware to do it?

I already answered that with a quote on what nVIDIA told Anandtech. So unless you have a better source of info than nVIDIA themselves, or are actually saying that they are lying, I don't know what your point is.

And now my question to you. If GK104 could be more powerful at DP than it already is under GeForce, why didn't nVIDIA promote it as such in Tesla? After all, it's a smaller chip, cheaper to manufacture. But instead they promote it as a single-precision workhorse, the Tesla K10.
Reply
#68
(03-23-2015, 07:16 PM)Picao84 Wrote:
(03-23-2015, 04:56 PM)BjorgenFjords Wrote: But GK104 doesn't have any DP units. At least according to the block diagram like GK110 does. How is it that it's able to complete DP tasks if there isn't any hardware to do it?

I already answered that with a quote on what nVIDIA told Anandtech. So unless you have a better source of info than nVIDIA themselves, or are actually saying that they are lying, I don't know what your point is.

And now my question to you. If GK104 could be more powerful at DP than it already is under GeForce, why didn't nVIDIA promote it as such in Tesla? After all, it's a smaller chip, cheaper to manufacture. But instead they promote it as a single-precision workhorse, the Tesla K10.

Humor me. How is it that it's able to complete DP tasks if there isn't any hardware to do it?

And to answer your question (not with another question), I don't presume to know why Nvidia does what it does. Sometimes a company may do things that make no sense to you and me but fit their plans perfectly. So don't ask me why Nvidia or AMD or Intel do what they do. They have their reasons, even if you think they might be stupid reasons.
Reply
#69
(03-24-2015, 01:58 AM)BjorgenFjords Wrote:
(03-23-2015, 07:16 PM)Picao84 Wrote:
(03-23-2015, 04:56 PM)BjorgenFjords Wrote: But GK104 doesn't have any DP units. At least according to the block diagram like GK110 does. How is it that it's able to complete DP tasks if there isn't any hardware to do it?

I already answered that with a quote on what nVIDIA told Anandtech. So unless you have a better source of info than nVIDIA themselves, or are actually saying that they are lying, I don't know what your point is.

And now my question to you. If GK104 could be more powerful at DP than it already is under GeForce, why didn't nVIDIA promote it as such in Tesla? After all, it's a smaller chip, cheaper to manufacture. But instead they promote it as a single-precision workhorse, the Tesla K10.

Humor me. How is it that it's able to complete DP tasks if there isn't any hardware to do it?

And to answer your question (not with another question), I don't presume to know why Nvidia does what it does. Sometimes a company may do things that make no sense to you and me but fit their plans perfectly. So don't ask me why Nvidia or AMD or Intel do what they do. They have their reasons, even if you think they might be stupid reasons.

Did you even read the Anandtech quote? It has DP units, but they are not shown on the diagram and are far fewer in number than in GK110. GK104 has 8 per SMX, while GK110 has 64. Did I ever imply that GK104 does not have DP units? All my calculations in this thread show its DP units. Along with the diagrams, nVIDIA provides an explanation. Or are you going to hang on to the diagrams alone to make a point, ignoring the remaining information?

Or are you confusing me with Gstanford? Because he is the one saying that it doesn't matter how many DP units the chip has. My whole point is that fewer DP units = less DP performance. GM200 has only 4 DP units per SMM, much like GM204, therefore its DP performance is low by design. GK210 succeeded GK110 for DP compute (the chip is not even available as GeForce/Quadro).
Reply
#70
(03-24-2015, 02:25 AM)Picao84 Wrote:
(03-24-2015, 01:58 AM)BjorgenFjords Wrote:
(03-23-2015, 07:16 PM)Picao84 Wrote:
(03-23-2015, 04:56 PM)BjorgenFjords Wrote: But GK104 doesn't have any DP units. At least according to the block diagram like GK110 does. How is it that it's able to complete DP tasks if there isn't any hardware to do it?

I already answered that with a quote on what nVIDIA told Anandtech. So unless you have a better source of info than nVIDIA themselves, or are actually saying that they are lying, I don't know what your point is.

And now my question to you. If GK104 could be more powerful at DP than it already is under GeForce, why didn't nVIDIA promote it as such in Tesla? After all, it's a smaller chip, cheaper to manufacture. But instead they promote it as a single-precision workhorse, the Tesla K10.

Humor me. How is it that it's able to complete DP tasks if there isn't any hardware to do it?

And to answer your question (not with another question), I don't presume to know why Nvidia does what it does. Sometimes a company may do things that make no sense to you and me but fit their plans perfectly. So don't ask me why Nvidia or AMD or Intel do what they do. They have their reasons, even if you think they might be stupid reasons.

Did you even read the Anandtech quote? It has DP units, but they are not shown on the diagram and are far fewer in number than in GK110. GK104 has 8 per SMX, while GK110 has 64. Did I ever imply that GK104 does not have DP units? All my calculations in this thread show its DP units. Along with the diagrams, nVIDIA provides an explanation. Or are you going to hang on to the diagrams alone to make a point, ignoring the remaining information?

Or are you confusing me with Gstanford? Because he is the one saying that it doesn't matter how many DP units the chip has. My whole point is that fewer DP units = less DP performance. GM200 has only 4 DP units per SMM, much like GM204, therefore its DP performance is low by design. GK210 succeeded GK110 for DP compute (the chip is not even available as GeForce/Quadro).

There are no DP units on GK104 according to the block graphic you linked.
Reply
#71
Feel free to read and compare:

GK104:
http://www.anandtech.com/show/5699/nvidi...0-review/2

GK110:
http://www.anandtech.com/show/6760/nvidi...tan-part-1

You are deliberately ignoring information. With that attitude we will not get anywhere.
Reply
#72
Picao, I've asked you a question twice already and you refuse to answer it directly. You're afraid of the point I have to make, otherwise you wouldn't be so "careful" answering questions with questions. So, you'll never admit to possibly being wrong about GM200. I can see that. It takes intelligence to be open about things, rather than the closed, my-way-or-the-highway mentality you've exhibited thus far. Not another peep from me until you answer my question DIRECTLY. No beating around the bush BS, no questions to answer my question.
Cheers.
Reply
#73
(03-24-2015, 08:13 AM)BjorgenFjords Wrote: Picao, I've asked you a question twice already and you refuse to answer it directly. You're afraid of the point I have to make, otherwise you wouldn't be so "careful" answering questions with questions. So, you'll never admit to possibly being wrong about GM200. I can see that. It takes intelligence to be open about things, rather than the closed, my-way-or-the-highway mentality you've exhibited thus far. Not another peep from me until you answer my question DIRECTLY. No beating around the bush BS, no questions to answer my question.
Cheers.

Picao is probably a bargain basement ambulance chaser, looking for some cheesy class action suit.

"Your honor in this interview with Ryan Smith, the defendants represented to the buying public that the Titan X is a FULLY ENABLED Titan level GPU, and yet their employee Keysplayr admits the FP64 performance has been disabled on AlienBabelTech! My clients take FP64 VERY seriously and are entitled to damages!"

Rolleyes
Reply
#74
(03-22-2015, 09:09 PM)Picao84 Wrote: The DP ratio of GK104 was/is 1:24.

From Anandtech:
"The other change coming from GF114 is the mysterious block #15, the CUDA FP64 block. In order to conserve die space while still offering FP64 capabilities on GF114, NVIDIA only made one of the three CUDA core blocks FP64 capable. In turn that block of CUDA cores could execute FP64 instructions at a rate of ¼ FP32 performance, which gave the SM a total FP64 throughput rate of 1/12th FP32. In GK104 none of the regular CUDA core blocks are FP64 capable; in its place we have what we’re calling the CUDA FP64 block. The CUDA FP64 block contains 8 special CUDA cores that are not part of the general CUDA core count and are not in any of NVIDIA’s diagrams. These CUDA cores can only do and are only used for FP64 math."

I hope that satisfies your [possible] curiosity about the absence of DP units on the GK104 SMX diagram.

Further, as said above, each SMX has a block of 8 FP64 units. Let's do the math, shall we?
8*8 [SMX] = 64.
How many FP32 CUDA cores exist in total on GK104? 1536.
How much is 1536:64?
1536/64 = 24
So this gives it a ratio of 1:24. Coincidence?

On GM200:


How did Anandtech conclude that? Let's do the math again.
4*24 [SMM] = 96
How many FP32 CUDA cores exist in total on GM200? 3072.
How much is 3072:96?
3072/96 = 32
So this gives it a ratio of 1:32. Coincidence?

I would have no problem with someone presenting data to back up their stance on the matter, but so far all I've heard is along the lines of "announced FP64 might be the same as the announced ROP count of GTX970" etc. The two situations are not even comparable. GTX970 is, by definition, a cut-down version of a chip that we have seen fully enabled (GTX980). GM200 was presented by nVIDIA as fully enabled on GTX Titan X. Saying they may be lying through their teeth without info to back up that claim is not what I call an educated argument. Much less when that same person tells me that the existence of "1, 4 or 4000 FP64 units on it" is irrelevant. I do not believe in magic or spontaneous generation, sorry.

Concerning the "emotional issues", I will only say that you should look at some other people's posts as well, but so far I have not seen you say that to anyone else. In fact this thread started getting "emotional" right after Gstanford said this: "You are spouting bullshit here, Picao84!" I did not see you quote him and tell him he was being emotional. I'm not the only one writing this way here, nor is it my usual personal style, as you can see in other forums (feel free, my nick is always the same - Picao84). But somehow, I'm the only one targeted as being "emo"? Of course it ends up feeling like double standards and pack behavior. But perhaps it's a consequence of my being relatively new here. You are so used to the way Rollo and Gstanford express themselves that you do not notice anymore. Or maybe some moderation is indeed needed, beyond just trying to defend the forum and push people out. You might say that moderation is not needed, but then again, why quote me and say I was being emotional? If it's a free-for-all, let it be.

I completely believe that the GM200 will be restricted to 1/32 DP. I don't think we will see cards with a higher rate. But I know that is my guess, and I have been known to be wrong. For the record, I am with you on the GM200 being limited to 1/32.

But I don't understand how the GK110 can switch between a 1/24 DP rate and a 1/3 DP rate just by ticking a checkbox in the control panel. How the heck is this happening? I really don't know.

If there is special HW that changes the clock, timing, or some multiplier, that would be really great to know. Or maybe it is the schedulers? I have no idea. But if we can find whatever magic allows the GK110 to switch between states, that would be awesome.
Reply
#75
Oh well, DP scandal or not Titan has been successful enough in the short time it has been on the market to put some serious twists in FanATIc knickers! [Image: devil.gif]

http://forums.anandtech.com/showpost.php...tcount=495
[Image: Fan_ATic_Sulk.jpg]

Poor baby! He needs some cheese to accompany his whine! Big Grin

Oh, and nvidia didn't need Titan X to be fastest (compared to AMD). 970 does that just fine. Titan X is merely insult on top of injury. Big Grin
Adam knew he should have bought a PC but Eve fell for the marketing hype.

Homeopathy is what happened when snake oil salesmen discovered that water is cheaper than snake oil.

The reason they call it the American Dream is because you have to be asleep to believe it. -- George Carlin
Reply
#76
When you can't compete, obstruct.

AMD tool A,
"People who buy nvidia are so dumb. They are just giving away their money for something they want, a powerful GPU today. They are so brainwashed, spending their money now when we all know that one day cheaper cards will come out."

AMD tool B
"Yeah, they are mindless sheeple. Buying nvidia is just dumb and irresponsible. The value will plummet overnight just like it did on the original Titan and Titan black."

AMD tool A,
"Yep. That doesn't happen when you buy AMD gpus. They are the most awesome of awesome. The holy of holy. AMD gpus never lose value, ever."

76% of the market
"Huh?"

http://www.fool.com/investing/general/20...an-bl.aspx

Seriously,
Reply

