Asked by rwniceing
at 2024-08-31 06:18:43
Point:500 Replies:19 POST_ID:829211USER_ID:12079
Topic:
Assembly Programming Language;;Miscellaneous Programming
Recently I have been studying the execution time of different scripting and programming languages, and writing a tutorial about it. For example, the following simple C++ "for loop" completes within 0.01 second (10 ms).
My question: could I have simple assembly code that produces the same result as the following C++ code, so that I can study its running time? I am now using masm32's ml.exe and link.exe. The installation took me a lot of time, since I last used assembly language a long, long time ago. I don't want to convert the C++ code directly into asm code with a g++ option, since the execution time of both would be the same when based on the same C++ code. I just want the assembly code to be as simple and short as possible.
Please advise
Rwniceing
int main()
{
    int max = 2e8;
    int a, b, c;
    for (int i = 0; i < max; i++) {
        a = 1234 + 5678 + i;
        b = 1234 * 5678 + i;
        c = 1234 / 2 + i;
    }
    return 0;
}
Author: rwniceing replied at 2024-09-02 22:20:00
phoffric, now I understand clearly what you said in your previous posts. Thanks for your reply.
For my memo:
This link describes a tool (the Benchmarks Game) for benchmarking the performance of programs written in different languages on x86 (one or four cores) and x64 (one and four cores) processors:
http://benchmarksgame.alioth.debian.org/u32/which-programs-are-fastest.php
Expert: phoffric replied at 2024-09-01 13:45:15
>> Could you describe more about terminology of Debug or Release mode in C++
For Release mode, use -O2; Debug mode, use -g.
Here are results showing that no loops are executed with -O2 when the function returns 0 and does not use the calculations made in the loops. I am using nested loops.
Notice that I am now using the += operation. You can revert to the = operation and may see surprising timing results (depending on your platform and compiler version).
#ifndef STDIO_H
#include <stdio.h>
#define STDIO_H
#endif
#include <iostream>
#include <sys/timeb.h>
using namespace std;

// Current time in milliseconds (wraps around; see getMilliSpan).
int getMilliCount()
{
    timeb tb;
    ftime(&tb);
    int nCount = tb.millitm + (tb.time & 0xfffff) * 1000;
    return nCount;
}

// Milliseconds elapsed since nTimeStart, corrected for wrap-around.
int getMilliSpan(int nTimeStart)
{
    int nSpan = getMilliCount() - nTimeStart;
    if (nSpan < 0)
        nSpan += 0x100000 * 1000;
    return nSpan;
}

int foo(unsigned long long loopCnt)
{
    unsigned int a = 0, b = 0, c = 0;

    // CODE YOU WANT TO TIME
    int start = getMilliCount();
    for (unsigned long long i = 0; i < loopCnt; i++) {
        for (unsigned long long j = 0; j < loopCnt / 4; j++) {
            a += 1234 + 5678 + i + j;
            b += 1234 * 5678 + i + j;
            c += 1234 / 2 + i + j;
        }
    }
    int msElapsed = getMilliSpan(start);
    printf("Elapsed time = %4d ms Loop Count = %.1e\n", msElapsed, (float)loopCnt);
    return 0;
    // return a + b + c;
}

int main()
{
    unsigned int x, y, z = 5;
    x = foo(2e3L);
    y = foo(2e4L);
    z = foo(2e5L);
    printf("%u %u %u\n", x, y, z);
}
Following output is for return a+b+c case:
paulh@ubuntu:~/EE/Test$ rm a.out; g++ -O2 speed.cpp && ./a.out
Elapsed time = 1 ms Loop Count = 2.0e+03
Elapsed time = 145 ms Loop Count = 2.0e+04
Elapsed time = 14404 ms Loop Count = 2.0e+05
4246405632 2889473536 2998446080
paulh@ubuntu:~/EE/Test$ rm a.out; g++ -O speed.cpp && ./a.out
Elapsed time = 2 ms Loop Count = 2.0e+03
Elapsed time = 145 ms Loop Count = 2.0e+04
Elapsed time = 14292 ms Loop Count = 2.0e+05
4246405632 2889473536 2998446080

Following output is for "return 0" case:
paulh@ubuntu:~/EE/Test$ rm a.out; g++ -O2 speed.cpp && ./a.out
Elapsed time = 0 ms Loop Count = 2.0e+03
Elapsed time = 0 ms Loop Count = 2.0e+04
Elapsed time = 0 ms Loop Count = 2.0e+05
0 0 0
paulh@ubuntu:~/EE/Test$ rm a.out; g++ -O speed.cpp && ./a.out
Elapsed time = 1 ms Loop Count = 2.0e+03
Elapsed time = 98 ms Loop Count = 2.0e+04
Elapsed time = 9532 ms Loop Count = 2.0e+05
0 0 0
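As a side note, the same measurement can be sketched with std::chrono instead of ftime(), which is marked obsolete on many current systems. This is only a minimal sketch under stated assumptions: the names TimedResult and timed_foo are hypothetical and do not appear in the thread, and the checksum is returned so that the loop results stay observable. An optimizer may still simplify the loops, so the generated assembly is worth checking before trusting any timing.

```cpp
#include <chrono>

// Hypothetical helper type (not from the thread): the checksum keeps the
// computed values observable, elapsed_ms is the measured wall-clock time.
struct TimedResult {
    unsigned int checksum;  // a + b + c after the loops
    long long elapsed_ms;   // elapsed milliseconds from a monotonic clock
};

// Same nested loops as foo() in the post above, timed with steady_clock.
TimedResult timed_foo(unsigned long long loopCnt)
{
    unsigned int a = 0, b = 0, c = 0;

    auto start = std::chrono::steady_clock::now();
    for (unsigned long long i = 0; i < loopCnt; i++) {
        for (unsigned long long j = 0; j < loopCnt / 4; j++) {
            a += 1234 + 5678 + i + j;
            b += 1234 * 5678 + i + j;
            c += 1234 / 2 + i + j;
        }
    }
    auto stop = std::chrono::steady_clock::now();

    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
    return {a + b + c, static_cast<long long>(ms.count())};
}
```

A caller would print both fields, for example printf("%lld ms, checksum %u\n", r.elapsed_ms, r.checksum), so that the checksum is actually used.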
Expert: Dave Baldwin replied at 2024-09-01 12:47:05
Your test code above is far too simple for useful measurements. This page http://en.wikipedia.org/wiki/Benchmark_%28computing%29#Open_source_benchmarks lists common benchmark programs that are actually used to measure CPU and computer performance.
Author: rwniceing replied at 2024-09-01 07:42:23
Classic C running faster than C++ on the code in my topic post may not be that interesting, since I think the root cause is similar to what you described in your first post in this thread. A small task that uses small registers or little memory produces its output faster than a small task that uses big tools, big registers, or big memory.
My study example might be too small for a timing benchmark of different languages,
which include scripts, C, C++, C#, JavaScript, Node.js, PHP, ASP, asm, etc.
So I might need to choose a better example.
Expert: trinitrotoluene replied at 2024-09-01 07:26:54
Note that the compiler-generated assembly code isn't the only way to code the logic. I can write out several variations of a set of assembly instructions that have the same final objective.
So just setting g++ to use the non-optimization option isn't always going to work.
What about using a different compiler with a different setting?
So in effect it comes down to the implementation of the compiler or, in other words, how the coder decides to write the assembly.
Author: rwniceing replied at 2024-09-01 06:58:14
For timing analysis across different programming languages, it is preferable to compare code without smart optimization by the compiler. Writing efficient code will improve speed, but not by very much. As the answer to this topic, I suggest using g++ with the non-optimization option and the -S option to generate non-optimized ASM code, rewriting that ASM code to see whether it can be improved, and then comparing the timing of the improved ASM code against the C++ code. That will be a fair timing test, since neither language version includes any smart optimization, skipped code, or skipped functions.
Thanks for all of your replies. This will continue in a new thread about improving asm code through efficient code writing.
Rwniceing
Author: rwniceing replied at 2024-09-01 06:42:27
This link describes the gcc/g++ options and levels for disabling and enabling optimization:
http://www.network-theory.co.uk/docs/gccintro/gccintro_49.html
Since I ran g++ without any -Ox flag, the compiled code in the topic post is non-optimized; in other words, the for loop is executed. With optimization enabled, the "for loop" is NOT included.
The timings I get for the optimized and non-optimized C++ program on a Linux Apache VPS server are recorded below. They show that the optimized code runs faster than the normal one.
Variable (max)   Non-optimized   Optimized
==========================================
2e6              10 ms           4 ms
2e8              1105 ms         560 ms
Since the timing of the non-optimized C++ program changes greatly from max=2e6 to max=2e8, it proves that the for loop is running. I also checked its asm code, and the looping code is included. How to get asm code from g++ is described at https://www3.ntu.edu.sg/home/ehchua/programming/cpp/gcc_make.html
Rwniceing
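For anyone repeating this check, here is a minimal sketch of a file that is convenient to inspect. The file name loop.cpp and the function name loop_last_a are hypothetical and not from the thread. The two g++ command lines in the comments use the standard -S (stop after generating assembly) and -O0/-O2 options; comparing the two .s files should show whether the loop survives.

```cpp
// loop.cpp (hypothetical file name)
//
// Generate assembly without and with optimization, then compare the files:
//   g++ -O0 -S loop.cpp -o loop_O0.s
//   g++ -O2 -S loop.cpp -o loop_O2.s

// extern "C" turns off C++ name mangling, so the label in the .s file is
// simply loop_last_a and is easy to find.
extern "C" int loop_last_a(int max)
{
    int a = 0;
    for (int i = 0; i < max; i++) {
        a = 1234 + 5678 + i;  // same first statement as the loop in the topic post
    }
    return a;  // returning a keeps the result observable
}
```

Only one of the three assignments is kept here so that the generated assembly stays short enough to read line by line.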
Author: rwniceing replied at 2024-09-01 01:41:32
ozo, from your post I now have a better understanding: the C++ compiler optimizes automatically so that its code runs faster (faster than classic C, which I still need to prove). During optimization, the compiler may decide that the "for loop" does not need to be executed, since it does nothing besides assigning variables and main() returns nothing besides 0. The output code is then just a single simple step that sets the final a, b, c values for the last value of i, because the loop contains no other meaningful code such as printf("a=%d",a). The following code shows the idea with the "for loop" taken out. It may NOT be the same as what the compiler generates from the OP's C++ code (OP as in original post, not opcode); it is just the idea, assuming a single printf() before return 0;
#include <stdio.h>

int main()
{
    int max = 2e8;
    int a, b, c, i;
    i = max;
    a = 1234 + 5678 + 2e8;
    b = 1234 * 5678 + 2e8;
    c = 1234 / 2 + 2e8;
    printf("%d=%d=%d", a, b, c);
    return 0;
}
It will run faster, but it is not fair to compare its timing with ASM code.
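To make the idea concrete, here is a small sketch comparing the loop with its closed form. The names Abc, run_loop, and closed_form are hypothetical, not from the thread. One detail worth noting: the loop stops when i reaches max, so the last value actually assigned uses i = max - 1, which means the closed form differs by one from a version that plugs in 2e8 directly.

```cpp
// Hypothetical helper type holding the three loop variables.
struct Abc {
    int a, b, c;
};

// The loop from the topic post, returning its final values.
Abc run_loop(int max)
{
    Abc r{0, 0, 0};
    for (int i = 0; i < max; i++) {
        r.a = 1234 + 5678 + i;
        r.b = 1234 * 5678 + i;
        r.c = 1234 / 2 + i;
    }
    return r;
}

// What an optimizer is free to substitute: each iteration overwrites the
// previous one, so only the last iteration (i = max - 1) matters.
Abc closed_form(int max)
{
    if (max <= 0)
        return {0, 0, 0};  // loop body never runs
    int last = max - 1;
    return {6912 + last, 7006652 + last, 617 + last};
}
```

For max = 2e8 the closed form gives a = 200006911, b = 207006651, c = 200000616, all of which fit in a 32-bit int.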
Author: rwniceing replied at 2024-09-01 01:14:40
phoffric, thanks for your good reply. I reviewed your post again, but I am still not clear.
Before talking about ASM code: my timing test for the C++ code might be wrong and its result misleading. From your post, my timing test for the C++ code has no meaning, since the C++ compiler thinks there is nothing to do within the for loop, so it runs faster and is misleading, right?
When I compile it with g++
g++ speed.cpp -o speed
it is compiled in Release mode, right?
Could you describe the terminology of Debug and Release mode in a C++ compiler in more detail? I find it a little confusing. Are there any good articles about that topic?
I tried changing return 0 to return a/2 + b/3 + c*4 as you said, but the result is the same: the timing reported by the getMilliCount() function in C++ is still 11 ms.
Please review my cpp code as follows. I ran it in a Linux shell with "./speed".
Please advise
#ifndef STDIO_H
#include <stdio.h>
#define STDIO_H
#endif
#include <iostream>
#include <sys/timeb.h>
using namespace std;

int getMilliCount()
{
    timeb tb;
    ftime(&tb);
    int nCount = tb.millitm + (tb.time & 0xfffff) * 1000;
    return nCount;
}

int getMilliSpan(int nTimeStart)
{
    int nSpan = getMilliCount() - nTimeStart;
    if (nSpan < 0)
        nSpan += 0x100000 * 1000;
    return nSpan;
}

int main()
{
    int max = 2e6;
    int a, b, c;

    // CODE YOU WANT TO TIME
    int start = getMilliCount();
    for (int i = 0; i < max; i++) {
        a = 1234 + 5678 + i;
        b = 1234 * 5678 + i;
        c = 1234 / 2 + i;
    }
    int milliSecondsElapsed = getMilliSpan(start);
    printf("Elapsed time = %u milliseconds %d", milliSecondsElapsed, max);
    return a/2 + b/3 + c*4;
}
Accepted Solution
Expert: phoffric replied at 2024-08-31 20:39:07
100 points EXCELLENT
>> the following simple C++ "for loop" is completed within 0.01 second(10ms).
The point of my previous comment was to point out that your timing test has little meaning since I assume you used the optimization flag rather than the debug flag. If you timed the exact code in your OP with optimization on, then the time is meaningless because the actual machine code generated for the body is essentially a no-op. There will not even be a loop condition since the optimizer sees that all computations are thrown away, so it rightly assumes that there is no need to compute anything.
Your response to my previous comment made me think that I may not have made myself clear. If you use my modified return statement (return a/2 + b/3 + c*4;), then machine code for the loop and the loop's body will be generated and a timing test will then have meaning.
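A related technique, which is not the one proposed in this answer, is to declare the loop variables volatile. Writes to volatile objects count as observable behavior in C++, so the compiler has to perform every store even with optimization on. The sketch below is an assumption-laden illustration (the function name run_with_volatile is hypothetical); it also changes what is being measured, since every iteration now writes to memory.

```cpp
// Hypothetical variant of the loop from the topic post. The volatile
// qualifier forces each assignment to be carried out, so the loop cannot
// be removed as dead code.
int run_with_volatile(int max)
{
    volatile int a = 0, b = 0, c = 0;
    for (int i = 0; i < max; i++) {
        a = 1234 + 5678 + i;
        b = 1234 * 5678 + i;
        c = 1234 / 2 + i;
    }
    // Same combining expression as the modified return statement above.
    return a / 2 + b / 3 + c * 4;
}
```

For max = 2e8 the returned value stays within the range of a 32-bit int, so the expression does not overflow.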
Assisted Solution
Expert: ozo replied at 2024-08-31 19:09:22
100 points EXCELLENT
Although it may often be possible to write assembly code that is faster than the fastest equivalent code you could write in c++,
it may not be maintainable code because for any little change, the optimal code might involve a complete change in how registers are allocated, which an optimizing c++ compiler could do automatically, but which could be ridiculous to do if you were trying to maintain the same thing in assembly.
Where assembly code can be worthwhile might be in the middle of tight loops that are executed a lot, and which you don't expect to change any time soon.
The above is a tight loop, but in this case the optimization would be to not loop at all.
Assisted Solution
Expert: Dave Baldwin replied at 2024-08-31 11:14:09
100 points EXCELLENT
The speedup from ASM is rarely more than 1.5 times over C++. C++ compilers these days are very efficient. Even if you had 50 geniuses rewriting Windows 7 in ASM, you would not get that kind of speedup. That also ignores things like disk access time, which will not change just because you used ASM.
What would change is your ability to maintain the code base and fix problems. C and C++ automate the interfaces between modules. You would have to do that manually in ASM. Typing errors would become much more common.
Author: rwniceing replied at 2024-08-31 10:34:53
So, what is the assembly code on Windows 7 64-bit with an Intel(R) Pentium(R) CPU P6300 @ 2.27GHz 2.27GHz (dual core) for simulating the following C++ code?
Actually I can get ASM code by converting the C++ code with a g++ option, but that is not my target, since the running time would be the same.
int main()
{
    int max = 2e8;
    int a, b, c;
    for (int i = 0; i < max; i++) {
        a = 1234 + 5678 + i;
        b = 1234 * 5678 + i;
        c = 1234 / 2 + i;
    }
    return 0;
}
In other words, is any efficient hand-written ASM code better than the ASM code converted from C++?
Another question or thought:
If Windows 7 were written in ASM instead of C++, it would run 10x faster, like Apple's OS, but the manpower required would be 10x more expensive than for Windows 7 or 8.1 in C++, since ASM code is counted in units of 10,000 lines.
Assisted Solution
Expert: phoffric replied at 2024-08-31 09:14:40
100 points EXCELLENT
Timing the code in the OP is likely misleading. Timing tests should be done with optimization on (in VS 2010 you can use Release Mode). The extra integer operations are combined in both Release and Debug modes. You should look at the generated output in order to devise good timing tests. In Release mode, your code generates hardly any output since the compiler realizes that there is nothing to do. To significantly change the generated assembly code, replace your return 0; with my return statement:
int main()
{
    int max = 2e8;
    int a, b, c;
    for (int i = 0; i < max; i++) {
        a = 1234 + 5678 + i;   // even in debug mode, compiler combines 1234 + 5678
        b = 1234 * 5678 + i;   // even in debug mode, compiler combines 1234 * 5678
        c = 1234 / 2 + i;      // even in debug mode, compiler combines 1234/2
    }
    return 0;
    // return a/2 + b/3 + c*4;
}
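The combined constants can be checked at compile time. The sketch below uses static_assert, which only confirms the values of the constant expressions; whether a particular compiler emits the folded constant in debug mode is compiler behavior and is best confirmed in the generated assembly. The function name folded_body is hypothetical and not from the thread.

```cpp
// Compile-time check of the three constant sub-expressions in the loop body.
static_assert(1234 + 5678 == 6912, "sum folds to 6912");
static_assert(1234 * 5678 == 7006652, "product folds to 7006652");
static_assert(1234 / 2 == 617, "integer quotient folds to 617");

// Hypothetical helper: the loop body written with the constants already
// combined, which is roughly what the comments above say the compiler does.
inline void folded_body(int i, int& a, int& b, int& c)
{
    a = 6912 + i;
    b = 7006652 + i;
    c = 617 + i;
}
```

So each statement in the loop costs one addition per iteration, not two operations, which is one more reason the original loop is too light to say much about language speed.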
Author: rwniceing replied at 2024-08-31 06:44:47
Both programs run as 32-bit code on the same processor and the same computer, so why are you concerned about the processor type?
My systeminfo:
Windows 7 64-bit
Intel(R) Pentium(R) CPU P6300 @ 2.27GHz 2.27GHz (dual core)
Author: rwniceing replied at 2024-08-31 06:41:52
And g++ on Windows also targets 32-bit, right?
What about cl.exe from Visual Studio 2013: when I compile a C++ program, will it target 32-bit or 64-bit?
So now, if I use g++ and masm32 for the C and assembly code, will that be a fair benchmark test?
Please advise
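One way to see which target a given compiler invocation actually produced is to compile a tiny program and look at the pointer size. This is a minimal sketch; the function name pointer_bits is hypothetical, and it assumes the usual flat-memory platforms where a 32-bit build has 4-byte pointers and a 64-bit build has 8-byte pointers.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: width of a data pointer in bits for the target the
// compiler generated code for (typically 32 or 64 on a PC).
constexpr std::size_t pointer_bits()
{
    return sizeof(void*) * CHAR_BIT;
}
```

Printing pointer_bits() from main() with each compiler setup being benchmarked shows whether both builds target the same word size.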
Expert: trinitrotoluene replied at 2024-08-31 06:40:54
Not necessarily. Win 7 can run on both 32-bit and 64-bit processors. You need to check your processor specs, then look at its instruction set and write the code accordingly.
Author: rwniceing replied at 2024-08-31 06:38:56
Now I am doing benchmark testing on Windows 7. I think it is a 32-bit target, right?
I am using masm32
Assisted Solution
Expert: trinitrotoluene replied at 2024-08-31 06:34:02
100 points EXCELLENT
Assembly instructions aren't generic. Assembly code will depend on the processor's instruction set. So which processor are you targeting?
Anyway it will usually involve a combination of the following :
- load values into registers
- store values in memory
- perform basic arithmetic ops
- load into the accumulator
If you have been able to look at the assembly code then it shouldn't be too difficult to reverse engineer and pull out the assembly instructions you need