#1
Switch from 32 bit to 64 bit - Is it worth the hassle?
I purchased the Windows 8 upgrade online on October 26th for machines running 32 bit Vista and 32 bit XP. I purchased 4 upgrade licences and burned a 32 bit DVD from the ISO for installation. I also ordered ONE DVD from Microsoft, which came in the mail 2 days ago, almost one month later; it contains both the 64 bit and 32 bit versions. Only one of the 4 computers I have has more than 4 GB of memory. It has 8 GB, as it was custom built just in September with an Ivy Bridge 3570 processor and an appropriate motherboard. I know that 4 gigs of my memory are not being used. It only cost me about 20 dollars extra to go from 4 gigs to 8 gigs, so it can remain parked on my motherboard for years and it will not bother me. I do not do any heavy computer work; 32 bit programs are fine for me. I did do a test installation of 64 bit Win 8 and its footprint is about 40% higher. Is there any reason to do a fresh installation of 64 bit? FD
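The unused 4 GB that FD mentions follows directly from pointer width: a 32 bit OS can map at most 2^32 bytes, which is exactly 4 GiB, so half of an 8 GB machine sits idle. A quick sketch of the arithmetic (this ignores PAE, which 32 bit client Windows does not use to go past 4 GB):

```python
# A 32-bit pointer can name at most 2**32 distinct byte addresses,
# so a 32-bit OS tops out at 4 GiB no matter how much RAM is fitted.
addressable_gib = 2 ** 32 // 2 ** 30
installed_gib = 8                       # FD's custom-built machine
print(addressable_gib)                  # 4
print(installed_gib - addressable_gib)  # 4 GiB parked on the motherboard
```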
#2
FD wrote:
[ ..... ]

There are a very limited number of software products that come in "64 bit only" now. Adobe is the company pushing this, and you'd need barrels of cash to afford some of their stuff; if you did own it, you'd want the 64 bit OS. Otherwise, I can't think of a user-centric reason for caring.

The biggest difference I've ever seen this make was when running some "special math". I used the GMP library, which provides extended precision arithmetic, and the program was asked to calculate numbers with 40,000,000 digits (this is for Mersenne Primes). I compiled the program against 32 bit GMP and against 64 bit GMP. The 64 bit version could run one loop of the code about 70% faster than the 32 bit version could. But considering how slow a loop was anyway, with a completion time around 100 years, it's not like I bothered running the code for real. The code was simply too slow to be practical. (They'll find all the Primes before my code finishes.)

http://en.wikipedia.org/wiki/GNU_Mul...hmetic_Library

For most other purposes you might see a 5% difference, as continuous math operations are not typical of things like Microsoft Word, or perhaps your web browser. Note that on Intel processors there is a slight loss of performance when switching from 32 bit to 64 bit operations. There is an internal packing operation in the execution pipeline that combines two 32 bit operations and carries them through the pipe. When you switch to "pure" 64 bit code, the packing operation can no longer be done, which reduces the rate stuff moves through that part of the pipe. The same thing doesn't happen on AMD64, in a way, because both sizes are slow :-) I suppose this topic is fun if you do nothing but benchmark stuff :-) If your processor is "sufficiently fast", you probably don't care. Paul
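Paul's GMP speedup can be illustrated with a toy bignum. This is a sketch only (GMP's real inner loops are hand-tuned assembler, and the function names below are inventions for illustration): a large number is stored as a list of fixed-width "limbs", and doubling the limb width from 32 to 64 bits halves the number of word-level operations an addition needs.

```python
# Toy multi-precision addition: a big number is stored as a list of
# fixed-width "limbs". With 64-bit limbs, half as many word-level
# operations are needed as with 32-bit limbs -- the effect Paul
# describes in GMP (illustration only, not GMP's actual code).
def to_limbs(n, bits):
    base, limbs = 1 << bits, []
    while n:
        limbs.append(n % base)
        n //= base
    return limbs or [0]

def add(a, b, bits):
    """Schoolbook add; returns (limbs, number of word-level operations)."""
    base, carry, out, ops = 1 << bits, 0, [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        out.append(s % base)
        carry = s // base
        ops += 1
    if carry:
        out.append(carry)
    return out, ops

n = 10 ** 1000  # a 1001-digit number (~3322 bits)
_, ops32 = add(to_limbs(n, 32), to_limbs(n, 32), 32)
_, ops64 = add(to_limbs(n, 64), to_limbs(n, 64), 64)
print(ops32, ops64)  # 104 52 -- the expected 2:1 ratio
```

That the measured win was 70% rather than a clean 2x is consistent with fixed per-operation overhead that doesn't shrink with limb count.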
#3
Paul wrote:
[ ..... ]

I once wrote a program to do a knight's tour of a chessboard, on an NCR 8250 with full monitor display. It used a simple trial and error technique, back-tracking when a dead end was reached, with all attempts stored in an internal table. It looked pretty good running. You could input any starting point. It went from an empty board to about 50% filled in seconds, then wiped the latest branches off and started new ones. I finally ran it for about an hour with no solution found. I wonder how long it would take on a modern 3GHz processor with 8GB RAM and 64-bit architecture. Ed
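Ed's trial-and-error search can be sketched in a few lines. The Warnsdorff ordering (try the square with the fewest onward moves first) is an addition here; without it, plain backtracking can still churn for a very long time from some starting squares, much as it did on the NCR 8250. With it, a modern machine finishes essentially instantly.

```python
# Knight's tour by backtracking, in the spirit of Ed's NCR 8250 program.
# Warnsdorff's heuristic (fewest onward moves first) is an assumption
# added here so the search terminates quickly.
N = 8
MOVES = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def tour(start):
    board = [[-1] * N for _ in range(N)]
    path = []

    def onward(x, y):
        # Unvisited, in-bounds squares a knight can reach from (x, y).
        return [(x + dx, y + dy) for dx, dy in MOVES
                if 0 <= x + dx < N and 0 <= y + dy < N
                and board[x + dx][y + dy] == -1]

    def solve(x, y, step):
        board[x][y] = step
        path.append((x, y))
        if step == N * N - 1:
            return True
        # Try squares with the fewest onward moves first, back-tracking
        # when a dead end is reached -- Ed's technique plus the heuristic.
        for nx, ny in sorted(onward(x, y), key=lambda p: len(onward(*p))):
            if solve(nx, ny, step + 1):
                return True
        board[x][y] = -1
        path.pop()
        return False

    return path if solve(*start, 0) else None

print(len(tour((0, 0))))  # 64: every square visited exactly once
```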
#4
On 26/11/2012 7:12 AM, Ed Cryer wrote:
[ ..... ]

My ZX-81 used to play chess: single ply depth search, about 30 minutes per move. A game takes all day, as though your opponent were a real deep thinker. My current computer is at least 3000 times faster, as moves there seem to take about 1 second, and probably with a different number of plies. The ZX-81 was playing chess with the 16KB RAM pack added, as 2KB wasn't enough to play chess :-)

The 64 bits only help if you have profitable ways to use the entire register. The GMP library gets a speedup because the register width gets fully used, and instead of doubling the speed, you get 70% more. Lots of other things don't make use of the register width. I'd say the power of the 3GHz processor doesn't always get applied; plenty of stuff we do on computers doesn't seem to scale that well, and leaves you with the feeling it should have run faster. Paul
#5
"Paul" wrote:
[ ..... ] as continuous math operations are not typical of things like Microsoft Word, or perhaps your web browser.

By "continuous" do you mean "floating point"? *TimDaniels*
#6
Timothy Daniels wrote:
[ ..... ]

Microsoft Word would likely have no long sequences of floating point instructions (FMUL, FDIV), nor long sequences of integer operations like MUL and DIV. The occasional INC or DEC, shift_left, shift_right, comparison, AND, OR, XOR: that's not "math" to me. A lot of those can be done with the regular ALU, whereas a MUL, DIV, FMUL or FDIV requires something with a lot more gates and complexity (a different functional unit). Lots of programs do a ton of branches and comparisons, using variables to store logic states, so the code doesn't really challenge the processor that much.

There have been processors in the past where, if you write assembler code and put a hundred FMUL, FDIV... type instructions together, the processor will actually tip over :-) It's because compilers don't produce such sequences that the affected processors work just fine and "nobody notices". The processor's internal noise problem was only noticed by synthetic testing (and only months after the processor was released), using sequences a compiler would not normally produce. But a practitioner using carefully crafted assembler code might trigger it. An example of hand-optimized code is Prime95, where a lot of the code used to search for Mersenne Primes is written in assembler (custom FFTs).

A developer at Microsoft working on a copy of Word uses a high level language, and the "power sucking code density" isn't all that great in regular compiler output. But there are still twits around who write programs entirely in assembler (twits who do it for no demonstrably good reason). And they insist on showing you stacks of paper printout to demonstrate all the work and agony they went through (I've worked with a couple of those people :-) ). People who program in high level languages don't usually try to impress you with stacks of paper output; the assembler people seem to like to print out their work and then wave it around (or use it as a seat to sit on).

In the GMP library, sequences of 32 bit math instructions can be replaced by sequences of half as many 64 bit instructions, and that ends up being around 70% faster. Normal code doesn't have the density of such improvements to see that kind of speedup. For compares and branches, the speed doesn't change. Paul
#7
On Tue, 27 Nov 2012 16:36:15 -0500, Paul wrote:
[ ..... ]

Interesting post. When I was an IBM SE many decades ago, I had a seismic customer that was very proud of its software being more advanced than that of many of its competitors, because of the brilliance of their algorithms AND the fact that they wrote in assembler for Univac 1108s, taking advantage of detailed knowledge of the hardware. They had a dozen or more mathematicians working full time on this stuff, which they felt gave them the edge. -- Robin Bignall Herts, England
#8
Robin Bignall wrote:
[ ..... ]

If you have an instruction level simulator, then yes, you might be able to hand tune certain loops on the critical path of some code. But behavioral simulators aren't always available. Modern architectures are complicated enough that you can't hope to win the optimization battle using only the instruction set manual. Even with a behavioral simulator it's still tough to do, and it takes hours to make the smallest improvement. Some processors now are approaching ~1000 possible instructions, which means there can be multiple ways to write a short code segment. The compiler contents itself with only a tiny percentage of those instructions. Which raises the question: why do they keep adding more instructions to the processors? At least one person wrote an article asking them to stop :-) Paul
#9
On Tue, 27 Nov 2012 19:44:14 -0500, Paul wrote:
[ ..... ]

Heh!

I wonder how much continued effort is put into application development systems such as Delphi in order to get the best translation from HLL to running code. -- Robin Bignall Herts, England
#10
On 11/28/2012 11:18 AM, Robin Bignall wrote:
[ ..... ]

The problems I saw some years back were related to unused and unneeded code generated by the then-popular compilers. It was significant enough that third party code analyzers were developed and used to locate the extra unused code in the compiler output. Sometimes the code had to be hand patched to eliminate the problems. In one case I'm aware of, C++ originated code was so slow and inefficient that the software was rewritten in assembly and machine code, using whatever was salvageable from the C++ output. The system and software is still in use today, and is deployed around the world. When someone is trying to shoot a missile up your rear, there isn't a whole lot of time to do something about it!
#11
charlie wrote:
On 11/28/2012 11:18 AM, Robin Bignall wrote: On Tue, 27 Nov 2012 19:44:14 -0500, Paul wrote: Robin Bignall wrote: On Tue, 27 Nov 2012 16:36:15 -0500, Paul wrote: Timothy Daniels wrote: "Paul" wrote: [ ..... ] as continuous math operations are not typical of things like Microsoft Word, or perhaps your web browser. By "continuous" do you mean "floating point"? *TimDaniels* Microsoft Word would likely have no long sequences of floating point instructions FMUL FDIV nor would it have long sequences of integer operations, like MUL DIV That sort of thing. The occasional INC or DEC, shift_left, shift_right, comparison, AND, OR, XOR, that's not "math" for me. A lot of those can be done with the regular ALU. Whereas a MUL or DIV or FMUL or FDIV, requires something with a lot more gates and complexity (like a different functional unit). Lots of programs do a ton of branches and comparisons, using variables to store logic states. So the code doesn't really challenge the processor that much. There have been processors in the past, if you write assembler code and put a hundred FMUL, FDIV... type instructions together, the processor will actually tip over :-) It's because compilers don't produce such sequences, that the affected processors work just fine, and "nobody notices". The processor internal noise problem was only noticed by synthetic testing (and only months after the processor was released), using sequences a compiler would not normally produce. But a practitioner, using carefully crafted assembler code, might succeed. An example of hand-optimized code, is Prime95, where a lot of the code used to search for Mersenne Primes is written in assembler. (Custom FFTs.) A developer at Microsoft, making a copy of Word, uses a high level language, and the "power sucking code density" isn't all that great in regular compiler output. But there are still twits around, who write programs entirely in assembler (twits who do it for no demonstrably good reason). 
And they insist on showing you stacks of paper printout, to demonstrate all the work and agony they went through (I've worked with a couple of those people :-) ) People who program in high level languages don't usually try to impress you with stacks of paper output. The assembler people seem to like to print out their work and then wave it around (or use it as a seat to sit on).

In the GMP library, sequences of 32 bit math instructions can be replaced by sequences of half as many 64 bit instructions. And that ends up being around 70% faster. Normal code doesn't have the density of such improvements to see that kind of speedup. Compares and branches, the speed doesn't change.

Interesting post. When I was an IBM SE many decades ago I had a seismic customer that was very proud of its software being more advanced than many of its competitors, because of the brilliance of their algorithms AND the fact that they wrote in assembler for Univac 1108s, taking advantage of detailed knowledge of the hardware. They had a dozen or more mathematicians working full time on this stuff, which they felt gave them the edge.

If you have an instruction level simulator, then yes, you might be able to hand tune certain loops that are the critical path in some code. But behavioral simulators aren't always available. Modern architectures are complicated enough that you can't hope to win the optimization battle using only the instruction set manual. Even with a behavioral simulator it's still tough to do, and takes hours to make the smallest improvement. Some processors now are approaching ~1000 possible instructions, and that means there could be multiple ways to write short code segments. The compiler contents itself with only a tiny percentage of those instructions. Which raises the question: why do they keep adding more instructions to the processors? At least one person wrote an article asking them to stop :-)

Heh!
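The "half as many instructions" point is easy to see with a toy sketch: a big number split into 32 bit limbs needs roughly twice as many add-with-carry steps as the same number split into 64 bit limbs. This is only an illustration in Python (schoolbook limb addition, not GMP's actual mpn code):

```python
def to_limbs(n: int, width: int) -> list[int]:
    """Split a non-negative integer into little-endian limbs of `width` bits."""
    mask = (1 << width) - 1
    limbs = []
    while n:
        limbs.append(n & mask)
        n >>= width
    return limbs or [0]

def limb_add(a: int, b: int, width: int) -> tuple[int, int]:
    """Schoolbook multi-limb addition; returns (sum, limb operations used)."""
    mask = (1 << width) - 1
    xs, ys = to_limbs(a, width), to_limbs(b, width)
    if len(xs) < len(ys):
        xs, ys = ys, xs
    ys += [0] * (len(xs) - len(ys))
    out, carry, ops = [], 0, 0
    for x, y in zip(xs, ys):
        t = x + y + carry
        out.append(t & mask)          # low `width` bits of this position
        carry = t >> width            # carry into the next position
        ops += 1                      # one add-with-carry per limb
    if carry:
        out.append(carry)
    total = 0                         # reassemble the integer from limbs
    for i, limb in enumerate(out):
        total |= limb << (i * width)
    return total, ops
```

Adding two numbers around 10**200 (about 665 bits) takes 21 limb operations with 32 bit limbs but only 11 with 64 bit limbs. Real GMP gets its ~70% from doing this at the machine-instruction level, but the op-count halving is the same effect.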
I wonder how much continued effort is put into application development systems such as Delphi in order to get the best translation from HLL to running code. The problems I saw some years back were related to unused and unneeded code generated by then-popular compilers. It was significant enough that third party code analyzers were developed/used to locate extra unused code in the compiler outputs. Sometimes the code had to be hand patched to eliminate the problems. In one case I'm aware of, C++-originated code was so slow and inefficient that the software was rewritten in assembly and machine code, using whatever was salvageable from the C++ output. The system and software is still in use today, and is deployed around the world. When someone is trying to shoot a missile up your rear, there isn't a whole lot of time to do something about it!

I can't say I've looked at a lot of object oriented code. The few times I have (and that was years ago), I see "name mangling" adding to the size of the code.

http://en.wikipedia.org/wiki/Microso..._Name_Mangling

"It provides a way of encoding name and additional information about a function, structure, class or another datatype in order to pass more semantic information"

You'd see code mixed with text strings. And I don't know if selecting an optimization level (like not using -g) would remove those or not. Maybe stripped code would be missing those. The code itself didn't seem that out of the ordinary. (Looking with the "microscope", the code didn't seem wasteful. The waste might be apparent when looking at more of the code.)

When I was playing with my GMP/Mersenne Prime example, I changed from C++ code (the original code snippet) to C code (as the GMP library supports both), and the biggest saving was the ability to remove a few intermediate variables. And when a single variable holds a 40,000,000 digit number, that's a significant saving. (A single number in that case busts the L3 cache on the processor.)
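To make the name mangling idea above concrete, here is a toy sketch in Python of one small corner of the Itanium C++ ABI scheme used by GCC (the Microsoft scheme in the linked article encodes the same kind of information with different syntax). It only handles free functions with a few built-in parameter types; real manglers also cover namespaces, classes, templates, and much more:

```python
# Itanium-ABI type codes for a handful of built-in types (toy subset).
TYPE_CODES = {"int": "i", "double": "d", "char": "c", "bool": "b"}

def mangle(name: str, params: list[str]) -> str:
    """Toy Itanium-ABI-style mangler for free functions (illustration only).

    Format: _Z, then the name's length and the name, then one code per
    parameter type ("v" for an empty parameter list).
    """
    codes = "".join(TYPE_CODES[p] for p in params) or "v"
    return f"_Z{len(name)}{name}{codes}"
```

For comparison, g++ really does emit _Z3fooid for void foo(int, double). The point of the sketch is that every symbol carries its parameter types along as extra string data, which is the "code mixed with text strings" effect in the object files.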
But in that case, I'm changing the style of the code. The resulting C code is less readable for another reviewer. But it gets the job done faster.

It depends on the size of the project just how practical it is to manage the project with the older languages. You could start the project with the objective of having code fast enough to avoid "trying to shoot a missile up your rear", and end up with the project failing entirely and producing no finished code at all. That's the danger. If you look at the history of large software projects, it's not very encouraging at all.

One project I worked on, we had a software architect. No coders hired yet (not the time for them). He estimated the size of code needed for our product. And his estimate showed it would take twenty minutes for all the object oriented code to just *load* into the processor (or processor complex). Nothing executed yet. Just loading. Well, we laughed our asses off about that - at least I did. It's a good thing I wasn't the manager, having to deal with that snippet of info. I don't think the software architect and the manager got along that well.

Paul
#12
Switch from 32 bit to 64 bit - Is it worth the hassle?
On 11/29/2012 6:36 PM, Paul wrote:
[snip]
One of the common trouble areas we found had to do with compiler generated "subroutines" supposedly used to decrease the memory footprint. It turned out that the more or less "generic" subs carried a lot of extra code that would never be executed. The real reason behind all the problems turned out to be "smart" hardware sub-assemblies that were more or less independent machines with embedded CPUs. They all had access to common memory, and a controlling CPU passed data and instructions to them via RAM and ROM. Then there were multiple classes of interrupts, ranging from "I'm ready, doing what I was told", to "can't do that", to an almost last resort, "I'm dead, shut me down and leave me alone".
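Those interrupt classes amount to a small status protocol between the controlling CPU and the sub-assemblies. A hypothetical sketch in Python (all names invented for illustration; the real system obviously did this in hardware registers, not enums):

```python
from enum import Enum

class IntClass(Enum):
    """Interrupt classes a sub-assembly can raise, as described above."""
    READY = "ready, doing what I was told"
    CANT_DO = "can't do that"
    DEAD = "I'm dead, shut me down and leave me alone"

def controller_action(status: IntClass) -> str:
    """The controlling CPU's response to an interrupt class."""
    if status is IntClass.READY:
        return "continue"
    if status is IntClass.CANT_DO:
        return "retry or reroute the command"
    return "power down the sub-assembly and stop polling it"
```

The difficulty charlie describes is that compilers assuming one serial flow of control have no model for several of these state machines interrupting each other through shared memory.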
Almost none of the C++ programmers had any experience with such an environment, and the compilers were originally written around the expectation of generating code suitable for "serial" execution. Glad I got out of that general area of endeavor, and moved on to less frustrating and better paid parts of my field. When I retired a few years ago, I was amused to find out that the systems were still in use, along with the major part of the software, and no one had figured out how to do things any differently when it came to "modernization".