touchHLE finally has a compositor!!!!! 🥳🥳🥳
the good: with a little bit of extra work, UIKit will be meaningfully usable (for very, very simple apps)
the bad: i had to use glReadPixels() for an uncommon case 😱
the ugly? i already told you, i had to use glReadPixels()
https://github.com/hikari-no-yume/touchHLE/commit/54cee560de1f98cb864e8904a92a360a14c29c88
@hikari is glReadPixels() less slow on machines that have shared CPU&GPU memory? Or is the problem almost entirely that it serialises execution between CPU & GPU?
@0x2ba22e11 I'm pretty sure it's the latter, the framebuffer isn't that big
@0x2ba22e11 unfortunately GPUs are optimised for throughput rather than latency, so that pipeline flush is slooooooow
@hikari I've heard that the latency for a PCI(e) transaction is kinda high but I don't know what order of magnitude.
Can't be *that* high or network cards and SSDs would suck I guess
@0x2ba22e11 @hikari A full 1080p RGBA framebuffer is 1920×1080×4 = ~8MB, so the transfer itself is negligible. It's mostly the pipeline flush and resulting stall that hits performance; normally the GPU is able to stay busy by processing the start of a second frame as it's finishing the first, but by the nature of GL's single-threadedness, the CPU can't send the start of the next frame until after it's finished and returned some pixels for glReadPixels().
The short version is that each frame may take 10-20ms to process through the pipeline end-to-end, but the actual steps are normally pipelined to the point that it only takes 2-5ms frame-to-frame, as each stage can start work on the next frame when it finishes its part of the first.
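To make the latency-versus-throughput point concrete, here is a toy model of the numbers in the post above. It is only a sketch: the function names are made up for illustration, and the 15 ms end-to-end latency and 4 ms frame-to-frame interval are assumed values picked from inside the quoted 10-20ms and 2-5ms ranges, not measurements.

```python
def pipelined_finish_ms(latency_ms, interval_ms, frames):
    """Time until the last frame completes when stages overlap.

    The first frame pays the full end-to-end latency; every later frame
    finishes one frame-to-frame interval after the previous one.
    """
    return latency_ms + (frames - 1) * interval_ms


def stalled_finish_ms(latency_ms, frames):
    """Time until the last frame completes when every frame forces a flush.

    A synchronous readback means the next frame cannot start until the
    current one has drained through the whole pipeline, so each frame
    pays the full latency.
    """
    return latency_ms * frames


# Assumed values, chosen from the ranges mentioned in the thread.
LATENCY_MS = 15.0
INTERVAL_MS = 4.0
FRAMES = 60

print("pipelined:", pipelined_finish_ms(LATENCY_MS, INTERVAL_MS, FRAMES), "ms")
print("stalled:  ", stalled_finish_ms(LATENCY_MS, FRAMES), "ms")
```

Under these assumptions the stalled case spends the full latency on every frame, which is why a per-frame flush can hurt far more than the size of the copy would suggest.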
@becomethewaifu @0x2ba22e11 I noticed that when rendering at 480×320, glReadPixels() takes only 2ms or so, but once I bump up the resolution by 2× or 3× I'm suddenly in “can't run at 60fps at all” territory. it's actually worse on mobile devices! there's no way this is bandwidth, it's just a throughput thing I'm sure
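For a rough sense of scale, the readback sizes at those resolutions can be worked out directly. This is a sketch that assumes 4 bytes per pixel (RGBA8), as in the 1080p estimate earlier in the thread; `readback_bytes` is a hypothetical helper, not anything from touchHLE.

```python
BYTES_PER_PIXEL = 4  # assumption: RGBA8, one byte per channel


def readback_bytes(width, height, scale=1):
    """Bytes returned by reading back a full frame at the given scale factor."""
    return (width * scale) * (height * scale) * BYTES_PER_PIXEL


for scale in (1, 2, 3):
    size = readback_bytes(480, 320, scale)
    # Data rate if the whole frame were read back on every frame at 60fps.
    per_second = size * 60
    print(f"{480 * scale}x{320 * scale}: {size} bytes per frame, "
          f"{per_second / 1e6:.1f} MB/s at 60fps")
```

Even at 3× the frame is smaller than the 1080p case discussed above, which fits the view in the thread that the amount of data being copied is not the main cost.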
@hikari @becomethewaifu oh! It's interesting that it's fast at any resolution.