First, you need to consider the hardware you're using: GPU devices performance widely differs from a constructor to another.
Second, it also depends on the operations considered: for example adds might be faster than multiplies.
In my case, I'm only using NVIDIA devices. For this kind of hardware: the official documentation announces equivalent performance for both 32-bit integers and 32-bit single precision floats with the new architecture (Fermi). Previous architecture (Tesla) used to offer equivalent performance for 32-bit integers and floats but only when considering adds and logical operations.
But once again, this may not be true depending on the device and instructions you use.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…