NPherno
03-03-2004, 06:10 PM
Alright, a thread very much aimed at Paul & Paul alone :)
So, some semi-random questions regarding skin authoring & render/draw/paint (pick your poison) performance...
I've noticed that the performance of my skin (the B71 beta'd Post-Graduate) isn't so hot when it's resized a significant amount. This machine isn't anything I'd consider 'slow' (1.7GHz, 512MB RAM, etc), and I know that it's more than capable of rendering some pretty advanced things without GPU acceleration at 1152x864 at 60fps (some software tools I wrote to do Hi-Poly UV Coordinate mucking... run quite fast in full-screen).
If QCD is drawing slowly over only 1/10th the screen real-estate, I'd assume that its cycles-per-pixel count is a significant amount higher than my own... which is scary given what my code is doing.
So, some questions that may help skinners better define their own skins, and/or may help you adapt code that's potentially faster...
First, is there any performance gain/penalty for using 8-bit BMPs? Or RLE-compressed BMPs? One might assume that your in-memory shadow of all artwork is simply a DWORD-aligned & expanded ARGB (or BGRA) bitmap, both for speed & leaner code... and that the input format isn't relevant (and hell, if this were the case, that other input formats are easy additions... because a PSD file reader, with the proper Layer support, would make for some _mean_ skin packages).
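To make that assumption concrete, here's a minimal sketch in C of what a load-time expansion might look like. This is not QCD's actual loader; the names `expand_8bpp_to_bgra` and `bmp_stride_8bpp` are hypothetical, and it assumes an uncompressed 8-bit source with a 256-entry palette already packed as 32-bit values. If the engine does something like this once at load, the per-frame drawing cost wouldn't depend on the input format at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: BMP rows are padded to a multiple of 4 bytes,
   so an 8-bit row of `width` pixels occupies this many bytes. */
static size_t bmp_stride_8bpp(size_t width)
{
    return (width + 3) & ~(size_t)3;
}

/* Hypothetical helper, not QCD's code: expand an 8-bit palettized image
   into a packed 32-bit bitmap, one DWORD per pixel, no row padding. */
static void expand_8bpp_to_bgra(const uint8_t *src, size_t src_stride,
                                const uint32_t palette[256],
                                uint32_t *dst, size_t width, size_t height)
{
    for (size_t y = 0; y < height; ++y) {
        const uint8_t *row = src + y * src_stride; /* padded source row */
        uint32_t *out = dst + y * width;           /* tightly packed dest row */
        for (size_t x = 0; x < width; ++x)
            out[x] = palette[row[x]];              /* one palette lookup per pixel */
    }
}
```

The lookup happens once per pixel at load time, so an 8-bit source would only save disk space and load time under this assumption, not drawing time.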
Second, if you've unrolled your BLITting loops inside the skin engine, is there an ideal width/height multiplier for a skin's main size, or individual elements? Does the code use a different method to display elements that have an entirely 'rectangular' footprint as opposed to some odd shape due to masking (if not, depending entirely on your code design/layout/etc it's probably very easy, and very fast to add, and the performance gains are likely significant)? Old 8-bits per pixel code always ran faster with a 4-pixel width alignment, and no height bias... and always ran faster if you could DWORD align copies... if you unrolled BLITs further you could create faster cases for whatever unroll you'd set up. Myself, I always chose a 4 DWORD copy as ideal... so a 16 pixel alignment was king...
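For anyone following along, a 4-DWORD unrolled copy for the purely rectangular case might be sketched like this in C. The function names and the pitch-in-pixels convention are my own assumptions, not anything from the QCD engine, and the pixels here are 32-bit, so one unrolled step moves 4 pixels (with 8-bit pixels the same 16 bytes would cover 16 pixels).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: copy n 32-bit pixels, four DWORDs per iteration. */
static void copy_row_unrolled4(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i = 0;
    /* Fast case: whole groups of 4 DWORDs. */
    for (; i + 4 <= n; i += 4) {
        dst[i]     = src[i];
        dst[i + 1] = src[i + 1];
        dst[i + 2] = src[i + 2];
        dst[i + 3] = src[i + 3];
    }
    /* Tail: 0 to 3 leftover pixels when the width isn't a multiple of 4. */
    for (; i < n; ++i)
        dst[i] = src[i];
}

/* Hypothetical sketch: unmasked rectangular blit; pitches are in pixels. */
static void blit_rect(uint32_t *dst, size_t dst_pitch,
                      const uint32_t *src, size_t src_pitch,
                      size_t width, size_t height)
{
    for (size_t y = 0; y < height; ++y)
        copy_row_unrolled4(dst + y * dst_pitch, src + y * src_pitch, width);
}
```

With this layout, a width that is a multiple of 4 never enters the tail loop, which is the kind of 'ideal multiplier' the question is asking about. Whether it matters in practice depends on the compiler and on how the real engine is written.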
When dealing with a resize area, is a 1 pixel resize zone slower than a 4 pixel or other 'aligned' area? Does your tiling copy code have an implicit preference?
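To show why the zone width could matter, here's a rough sketch in C of one way a tiling copy might work. It's purely an assumption about the approach, not QCD's code, and `tile_row` is a made-up name. It fills a destination row by repeating a source tile and returns how many separate copies it needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: fill dst_w pixels by repeating a tile of tile_w
   pixels. Returns the number of copy operations issued. */
static size_t tile_row(uint32_t *dst, size_t dst_w,
                       const uint32_t *tile, size_t tile_w)
{
    size_t copies = 0;
    if (tile_w == 0)
        return 0; /* nothing to repeat */
    for (size_t x = 0; x < dst_w; x += tile_w) {
        size_t remaining = dst_w - x;
        size_t n = remaining < tile_w ? remaining : tile_w; /* clip last tile */
        memcpy(dst + x, tile, n * sizeof *tile);
        ++copies;
    }
    return copies;
}
```

Under this scheme a 1 pixel zone stretched across 400 pixels costs 400 tiny copies per row, while a 4 pixel zone costs 100, so the fixed per-copy overhead is paid four times as often. An engine that special-cases 1 pixel zones as a solid fill wouldn't have that penalty.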
I had other questions... but now I can't recall any of them! :)
Would love to hear what you think...
N