Buffalo NAS-Central Forums

Welcome to the Linkstation Wiki community

All times are UTC+01:00




Posted: Sun Jan 21, 2007 1:26 pm
WARNING: this guide involves replacing system files. Don't follow it blindly step by step; make sure you understand what each step does, and proceed at your own risk.

I use OpenLink; if you are using Debian, things might be different.

This trick may be useful if you are building FP-intensive projects on the LS2.
As some people may know, the LS2's processor has no real FPU. Whenever an FP instruction is encountered, an exception is raised and the kernel decodes the instruction and emulates it. This is very slow, because every FP instruction has to enter the kernel, be emulated, and then return to user mode.
GCC has a compile-time switch, -msoft-float, which generates a call to a library routine instead of an FP instruction that would trap. However, that does not work well on the LS2 with the build tools provided on the wiki. What you need to do is rebuild libgcc.a with the -msoft-float option:
1. Download the gcc 3.3.5 source code.
2. Edit gcc-3.3.5/gcc/config/t-linux.
3. Add the -msoft-float switch to TARGET_LIBGCC2_CFLAGS:
TARGET_LIBGCC2_CFLAGS = -fPIC -msoft-float
4. Configure and build your gcc, but don't install it.
5. You will find libgcc.a under the target gcc directory. Use it to replace the file of the same name under /usr/lib/gcc-lib/mipsel-linux/3.3.5. You may want to make a backup of the original file first.
6. DON'T try to replace /lib/libgcc_s.so.1 unless you not only know what you are doing but also know how to do it safely. This file contains the dynamically linked version of the FP routines, and failing to replace it properly can make most commands, such as mv, cp and ln, fail to execute. I ran into a bus error when I tried this and then had to use my own compiled vim to restore the file. I was lucky and you may not be: a failed replacement may brick your system. Unlike the run-time library, libgcc.a is only used at compile time, so it is safe to replace and fail-proof.
7. After that, you can build your FP-intensive application with the -msoft-float gcc option. Remember to link statically, because the dynamic libraries have not been replaced (unless you are familiar with system building and have already figured out how to do that safely). Then test whether your application runs faster.

Below is my test of the effect on an FP-intensive application. It performs many single- and double-precision add/sub/mul/div operations, and I compiled it once with the defaults (hard float) and once with -msoft-float. Performance improved about 6x (real time dropped from 1m7.6s to 11.7s), which is still slow, but a lot better than the hard-float build. The point here is: if your application does not need IEEE 754 conformance and its results always stay in a moderate range, you could change the gcc source code to build a fixed-point version of the calculations and add a new switch, such as -mfast-float, to accelerate your FP-intensive application, such as most decoders.

root@LOLLIPOP:/usr/local/src/fptest# gcc a.c
root@LOLLIPOP:/usr/local/src/fptest# time ./a.out
a=12345.000000, b=54321.000000, c=670592768.000000, d=12345.000000, e=54321.000000, f=670592745.000000
a=12345.000000, b=54321.000000, c=66666.000000, d=12345.000000, e=54321.000000, f=66666.000000
a=12345.000000, b=54321.000000, c=-41976.000000, d=12345.000000, e=54321.000000, f=-41976.000000
a=12345.000000, b=54321.000000, c=0.227260, d=12345.000000, e=54321.000000, f=0.227260

real 1m7.633s
user 0m2.740s
sys 1m4.250s
root@LOLLIPOP:/usr/local/src/fptest# gcc -msoft-float a.c
root@LOLLIPOP:/usr/local/src/fptest# time ./a.out
a=12345.000000, b=54321.000000, c=670592768.000000, d=12345.000000, e=54321.000000, f=670592745.000000
a=12345.000000, b=54321.000000, c=66666.000000, d=12345.000000, e=54321.000000, f=66666.000000
a=12345.000000, b=54321.000000, c=-41976.000000, d=12345.000000, e=54321.000000, f=-41976.000000
a=12345.000000, b=54321.000000, c=0.227260, d=12345.000000, e=54321.000000, f=0.227260

real 0m11.662s
user 0m11.640s
sys 0m0.020s


Source code of my test case
Code:
#include <stdio.h>      /* for printf */

int main(void)
{
        float a=12345, b=54321, c;
        double d=12345, e=54321, f;
        int i;
        /* multiply: one million single- and double-precision operations */
        for (i=0;i<1000000L;i++) {
                c=a*b;
                f=d*e;
        }
        printf ("a=%f, b=%f, c=%f, d=%lf, e=%lf, f=%lf\n", a, b, c, d, e, f);
        /* add */
        for (i=0;i<1000000L;i++) {
                c=a+b;
                f=d+e;
        }
        printf ("a=%f, b=%f, c=%f, d=%lf, e=%lf, f=%lf\n", a, b, c, d, e, f);
        /* subtract */
        for (i=0;i<1000000L;i++) {
                c=a-b;
                f=d-e;
        }
        printf ("a=%f, b=%f, c=%f, d=%lf, e=%lf, f=%lf\n", a, b, c, d, e, f);
        /* divide */
        for (i=0;i<1000000L;i++) {
                c=a/b;
                f=d/e;
        }
        printf ("a=%f, b=%f, c=%f, d=%lf, e=%lf, f=%lf\n", a, b, c, d, e, f);
        return 0;
}

