Thread with 7 posts
touchHLE finally has a compositor!!!!! 🥳🥳🥳
the good: with a little bit of extra work, UIKit will be meaningfully usable (for very, very simple apps)
the bad: i had to use glReadPixels() for an uncommon case 😱
the ugly? i already told you, i had to use glReadPixels()
https://github.com/hikari-no-yume/touchHLE/commit/54cee560de1f98cb864e8904a92a360a14c29c88
@hikari is glReadPixels() less slow on machines that have shared CPU&GPU memory? Or is the problem almost entirely that it serialises execution between CPU & GPU?
@0x2ba22e11 I'm pretty sure it's the latter, the framebuffer isn't that big
@hikari I've heard that the latency for a PCI(e) transaction is kinda high but I don't know what order of magnitude.
Can't be *that* high or network cards and SSDs would suck I guess
@0x2ba22e11 @hikari A full 1080p RGBA framebuffer is 1920×1080×4 = ~8MB, so the transfer itself is negligible. It's mostly the pipeline flush and resulting stall that hits performance; normally the GPU is able to stay busy by processing the start of a second frame as it's finishing the first, but by the nature of GL's single-threadedness, the CPU can't send the start of the next frame until after it's finished and returned some pixels for glReadPixels().
The short version is that each frame may take 10-20ms to process through the pipeline end-to-end, but the actual steps are normally pipelined to the point that it only takes 2-5ms frame-to-frame, as each stage can start work on the next frame when it finishes its part of the first.
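A toy model of the latency vs. frame-to-frame distinction above, as a minimal Python sketch. The stage timings are hypothetical numbers picked to land inside the 10-20ms and 2-5ms ranges quoted, not measurements, and the model assumes a synchronous readback drains the whole pipeline every frame.

```python
# Hypothetical per-stage costs in milliseconds (assumed, not measured).
STAGE_MS = [3.0, 4.0, 5.0, 3.0]


def latency_ms(stages):
    """End-to-end time for one frame to pass through every stage."""
    return sum(stages)


def pipelined_interval_ms(stages):
    """Frame-to-frame time when stages overlap: the slowest stage sets the pace."""
    return max(stages)


def stalled_interval_ms(stages):
    """Frame-to-frame time if the CPU waits for each frame to finish
    completely (e.g. a blocking readback) before submitting the next one."""
    return sum(stages)


for label, interval in (
    ("pipelined", pipelined_interval_ms(STAGE_MS)),
    ("stalled", stalled_interval_ms(STAGE_MS)),
):
    print(f"{label}: {interval:.0f} ms per frame, about {1000 / interval:.0f} fps")
```

With these made-up numbers the pipelined case gets a new frame out every 5 ms, while the stalled case pays the full 15 ms each time, which is already close to the 16.7 ms budget for 60fps.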
@becomethewaifu @0x2ba22e11 I noticed that when rendering at 480×320, glReadPixels() takes only 2ms or so, but once I bump up the resolution by 2× or 3× I'm suddenly in “can't run at 60fps at all” territory. it's actually worse on mobile devices! there's no way this is bandwidth, it's just a throughput thing I'm sure
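For a rough sense of the sizes involved, here's the readback arithmetic as a minimal Python sketch. It assumes 4 bytes per pixel (RGBA8) and one full-frame readback per frame at 60fps; the `framebuffer_bytes` helper is made up for illustration and isn't from touchHLE.

```python
BYTES_PER_PIXEL = 4  # assumption: RGBA, 8 bits per channel
TARGET_FPS = 60      # assumption: one full readback every frame


def framebuffer_bytes(width, height, scale=1):
    """Size in bytes of a framebuffer rendered at `scale` times the base resolution."""
    return (width * scale) * (height * scale) * BYTES_PER_PIXEL


for scale in (1, 2, 3):
    size = framebuffer_bytes(480, 320, scale)
    per_second = size * TARGET_FPS
    print(
        f"{480 * scale}x{320 * scale}: "
        f"{size / 1e6:.2f} MB per readback, "
        f"{per_second / 1e6:.0f} MB/s at {TARGET_FPS}fps"
    )
```

Pixel count grows with the square of the scale factor, so 2× and 3× mean 4× and 9× the data. Even so, 3× is only about 5.5 MB per frame, or roughly 330 MB/s at 60fps.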
@hikari @becomethewaifu oh! It's interesting that it's fast at any resolution.