Scale2X 4 Sdl


2 replies to this topic

#1 sebt3

sebt3

    homebrew player (P. & C.)

  • GP32 Hardcore
  • 1897 posts
  • Gender:Male
  • Location:QC

Posted 10 November 2010 - 01:11 AM

Hi there,

Some of us discussed Scale2x on IRC today. I found the SDL implementation (in the contrib directory) not compelling enough, so I wrote my own:

Spoiler


The result is as fast as the initial (stupid) 2x scaler I wrote for the Zelda games, but I'm no performance genius. Is there anything here that could push the performance a little further?


EDIT: this is completely obsolete: http://www.gp32x.com...6-improved-sdl/

Edited by sebt3, 15 November 2010 - 01:01 PM.


#2 sebt3

Posted 10 November 2010 - 08:54 PM

As codeaholic pointed out to me on IRC, if I want feedback I have to explain why I did this.
First, the complete, original implementation is generic, and I don't have a good track record of integrating video code into SDL.
The tarball contains a contrib directory, which contains sdl/scale2x.c. At first this file looked like exactly what I was looking for.
Then I saw that this file doesn't even use the optimisation explained here.

So I wrote the above by merging the simple2x I used in the Zelda games with this one. (BTW, as you can see, my version doesn't support 24bpp.)
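
For readers without the linked page: a minimal sketch of the per-pixel Scale2x rule, assuming the optimisation mentioned above is the commonly documented early-out form (skip all four comparisons when B == H or D == F). The function name scale2x_pixel and the 16bpp pixel type are illustrative assumptions, not taken from the code in the spoiler.

```c
#include <stdint.h>

/* Scale2x rule for one source pixel E and its four neighbours:
 *        B
 *      D E F
 *        H
 * out[0..3] receive the 2x2 output block in the order
 * top-left, top-right, bottom-left, bottom-right.
 * Hypothetical helper; 16bpp pixels are an assumption. */
static void scale2x_pixel(uint16_t B, uint16_t D, uint16_t E,
                          uint16_t F, uint16_t H, uint16_t out[4])
{
    if (B != H && D != F) {
        /* An edge may pass through E: pick the matching neighbour. */
        out[0] = (D == B) ? D : E;
        out[1] = (B == F) ? F : E;
        out[2] = (D == H) ? D : E;
        out[3] = (H == F) ? F : E;
    } else {
        /* Early out: no edge to smooth, plain pixel doubling. */
        out[0] = out[1] = out[2] = out[3] = E;
    }
}
```

With the outer test in place, flat areas of the image cost two comparisons per pixel instead of the full set.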

My performance concerns are:
- I'm using #define for readability, while the one in contrib/sdl uses variables. So I'm trading some stores for a shift and an increment. I'm not sure which is faster.
- The original implementation handles the first and last columns separately, while I'm using (i>0?1:0) in my #define, so this test is done for every pixel. Is it worth lowering the readability to remove that per-pixel test?
- I'm using "register" for the two loop counters, but I don't even know how many registers are available on this SoC. Is it worth doing the research to find which variables should be declared register, or is gcc doing a good enough job?
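
To make the second concern concrete, here is a rough sketch of both ways of handling the border columns on one row. All names (rule, row_clamped, row_peeled) are hypothetical and the 16bpp pixel type is an assumption; this is not the code from the spoiler or from contrib/sdl. Which variant is faster depends on the compiler and the target, so it would have to be measured.

```c
#include <stdint.h>

/* Scale2x rule for one pixel; top/bot point at the two output pixels
 * on the upper and lower output rows. Hypothetical helper. */
static void rule(uint16_t B, uint16_t D, uint16_t E, uint16_t F, uint16_t H,
                 uint16_t *top, uint16_t *bot)
{
    if (B != H && D != F) {
        top[0] = (D == B) ? D : E;
        top[1] = (B == F) ? F : E;
        bot[0] = (D == H) ? D : E;
        bot[1] = (H == F) ? F : E;
    } else {
        top[0] = top[1] = bot[0] = bot[1] = E;
    }
}

/* Variant A: clamp the left/right neighbour index with a test on every pixel. */
void row_clamped(const uint16_t *up, const uint16_t *mid, const uint16_t *down,
                 int w, uint16_t *top, uint16_t *bot)
{
    for (int i = 0; i < w; i++) {
        int l = i - (i > 0 ? 1 : 0);      /* left neighbour, clamped at column 0 */
        int r = i + (i < w - 1 ? 1 : 0);  /* right neighbour, clamped at column w-1 */
        rule(up[i], mid[l], mid[i], mid[r], down[i], top + 2 * i, bot + 2 * i);
    }
}

/* Variant B: handle the first and last columns outside the loop,
 * so the inner loop has no border test. */
void row_peeled(const uint16_t *up, const uint16_t *mid, const uint16_t *down,
                int w, uint16_t *top, uint16_t *bot)
{
    if (w <= 0)
        return;
    if (w == 1) {
        rule(up[0], mid[0], mid[0], mid[0], down[0], top, bot);
        return;
    }
    /* First column: the pixel is its own left neighbour. */
    rule(up[0], mid[0], mid[0], mid[1], down[0], top, bot);
    for (int i = 1; i < w - 1; i++)
        rule(up[i], mid[i - 1], mid[i], mid[i + 1], down[i],
             top + 2 * i, bot + 2 * i);
    /* Last column: the pixel is its own right neighbour. */
    int e = w - 1;
    rule(up[e], mid[e - 1], mid[e], mid[e], down[e], top + 2 * e, bot + 2 * e);
}
```

Both variants write 2*w pixels to each of the two output rows and should give identical results. The peeled one just moves the two border cases out of the loop at the cost of some duplicated code.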

I'm not going to unroll the loops: that's what "-funroll-loops" is for.
I don't want to optimize this to death (anyway, if you really need performance, use the complete implementation. I know Pickle has done this). I see this as a good base for talking about performance in code and sharing our experience. (Mine is limited, so I intend to learn from your answers :P)


#3 PokeParadox

PokeParadox

    Founder of Pirate Games - Penjin Coder

  • GP32 Hardcore
  • 3908 posts
  • Gender:Male
  • Location:UK
  • Interests:Homebrew and Emulation!

Posted 10 November 2010 - 10:23 PM

I'd say for something like this, don't worry about readability...
Even unoptimised, Scale2X has a fairly small performance hit, but if you can lessen the hit, all the better.