The fastest approach would be to ray cast directly on the GPU. You can use compute shaders for this (I do not have any experience with those), however it is also possible to do this in the standard shader rendering pipeline, see my GLSL version:
Of course, you need to add a BVH or octree to get reasonable speed for complex scenes...
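Both a BVH and an octree come down to the same basic query: does the ray hit a node's bounding box? If it does not, everything inside that node can be skipped. Here is a minimal C++ sketch of that test using the common "slab" method. The names `Vec3`, `AABB` and `rayHitsAABB` are made up for illustration and are not taken from the GLSL version mentioned above:

```cpp
#include <algorithm>
#include <limits>
#include <utility>

struct Vec3 { float x, y, z; };
struct AABB { Vec3 lo, hi; };   // axis-aligned box given by its min and max corners

// Slab test: true if the ray o + t*d hits the box for some t in [tMin, tMax].
bool rayHitsAABB(const Vec3& o, const Vec3& d, const AABB& box,
                 float tMin = 0.0f,
                 float tMax = std::numeric_limits<float>::infinity())
{
    const float org[3] = { o.x, o.y, o.z };
    const float dir[3] = { d.x, d.y, d.z };
    const float lo[3]  = { box.lo.x, box.lo.y, box.lo.z };
    const float hi[3]  = { box.hi.x, box.hi.y, box.hi.z };
    for (int i = 0; i < 3; ++i) {
        if (dir[i] == 0.0f) {
            // Ray is parallel to this pair of planes: it must start between them.
            if (org[i] < lo[i] || org[i] > hi[i]) return false;
            continue;
        }
        // Ray parameters where the ray crosses the two planes of this axis.
        float t0 = (lo[i] - org[i]) / dir[i];
        float t1 = (hi[i] - org[i]) / dir[i];
        if (t0 > t1) std::swap(t0, t1);
        // Shrink the valid interval; an empty interval means a miss.
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMin > tMax) return false;
    }
    return true;
}
```

A traversal would call this on each node and only descend into (or test the triangles of) nodes that report a hit.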
If you insist on ray casting on the CPU side, then it would be much better to store your output in 2 textures, one holding depth and the other RGB color. If you have access to 4-component textures with enough precision, you can use an RGBD format and a single texture. Then, to render, you just draw a single QUAD and the fragment shader does the rest ...
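To make the single-texture RGBD idea more concrete, here is a rough C++ sketch of the CPU side. It is only an illustration under my own assumptions: the scene is a single hard-coded sphere, the camera sits at the origin looking down -Z with a 90 degree field of view, and depth is stored as OpenGL-style window depth in [0,1] for assumed near/far planes of 0.1 and 100. The names `rayCastRGBD` and `windowDepth` are hypothetical:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Convert positive eye-space distance d along the view axis to window depth
// in [0,1], assuming a standard perspective projection and default depth range.
float windowDepth(float d, float zNear, float zFar)
{
    float zNdc = (zFar + zNear) / (zFar - zNear)
               - (2.0f * zFar * zNear) / ((zFar - zNear) * d);
    return 0.5f * zNdc + 0.5f;
}

// Ray cast one sphere into a w*h buffer of 4 floats per pixel: R, G, B, depth.
std::vector<float> rayCastRGBD(int w, int h)
{
    const float cz = -3.0f, r = 1.0f;          // assumed sphere: center (0,0,cz), radius r
    const float zNear = 0.1f, zFar = 100.0f;   // assumed clip planes
    std::vector<float> img(std::size_t(w) * h * 4);
    for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x) {
        // Ray from the origin through the pixel center on the plane z = -1.
        float dx = 2.0f * (x + 0.5f) / w - 1.0f;
        float dy = 2.0f * (y + 0.5f) / h - 1.0f;
        float dz = -1.0f;
        float len = std::sqrt(dx * dx + dy * dy + dz * dz);
        dx /= len; dy /= len; dz /= len;
        // Solve |t*d - c|^2 = r^2 for the nearest t (d is unit length).
        float b = dz * cz;                         // dot(d, c) with c = (0,0,cz)
        float disc = b * b - (cz * cz - r * r);
        float* p = &img[(std::size_t(y) * w + x) * 4];
        float t = (disc >= 0.0f) ? b - std::sqrt(disc) : -1.0f;
        if (t > 0.0f) {
            // Color = surface normal remapped to [0,1]; depth goes in the 4th channel.
            p[0] = 0.5f * (t * dx / r) + 0.5f;
            p[1] = 0.5f * (t * dy / r) + 0.5f;
            p[2] = 0.5f * ((t * dz - cz) / r) + 0.5f;
            p[3] = windowDepth(-t * dz, zNear, zFar);
        } else {
            p[0] = p[1] = p[2] = 0.0f;             // background color
            p[3] = 1.0f;                           // far plane
        }
    }
    return img;
}
```

The returned buffer could then be uploaded as a float texture, for example with `glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, w, h, 0, GL_RGBA, GL_FLOAT, img.data())`, and the fragment shader drawing the QUAD would output the RGB part as color and write the 4th component to `gl_FragDepth`. The stored depth only lines up with normally rasterized geometry if that geometry is rendered with the same near/far planes.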