In any binary bitwise operation, each output bit depends only on the two corresponding bits in the inputs. In an add operation, each output bit depends on the corresponding bits in the inputs and on all the bits to the right (in the lower-order positions).
For example, the leftmost bit of 01111111 + 00000001 is 1, but the leftmost bit of 01111110 + 00000001 is 0.
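As a quick check, here is a minimal C sketch of those two sums (the bit patterns are the ones from the example above):

```c
#include <stdio.h>

int main(void)
{
    unsigned a   = 0x7F; /* 01111111 */
    unsigned b   = 0x7E; /* 01111110 */
    unsigned one = 0x01; /* 00000001 */

    /* The carry out of bit 6 differs between the two sums,
       so the leftmost bit (bit 7) differs. */
    printf("01111111 + 00000001 -> bit 7 = %u\n", (a + one) >> 7 & 1); /* 1 */
    printf("01111110 + 00000001 -> bit 7 = %u\n", (b + one) >> 7 & 1); /* 0 */
    return 0;
}
```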
In its simplest form, an adder adds the two low bits and produces one output bit and a carry. Then the next two lowest bits are added, and the carry is added in, producing another output bit and another carry. This repeats. So the highest output bit is at the end of a chain of adds. If you do the operation bit by bit, as the ripple-carry adders in older processors did, then it takes time to get to the end.
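A minimal sketch of that ripple-carry process in C, modeling each stage as a full adder built from Boolean operations on single bits (the 8-bit width is just for illustration):

```c
#include <stdint.h>
#include <stdio.h>

/* Ripple-carry addition: each bit's sum and carry come from a full
   adder, and the carry into bit i is the carry out of bit i-1, so
   the high bit is computed last. */
static uint8_t ripple_add(uint8_t a, uint8_t b)
{
    uint8_t sum = 0, carry = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t ai = (a >> i) & 1, bi = (b >> i) & 1;
        uint8_t s = ai ^ bi ^ carry;              /* sum bit   */
        carry = (ai & bi) | (carry & (ai ^ bi));  /* carry out */
        sum |= (uint8_t)(s << i);
    }
    return sum;
}

int main(void)
{
    printf("%d\n", ripple_add(0x7F, 0x01)); /* 128: the carry ripples to the top */
    return 0;
}
```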
There are ways to speed this up, such as carry-lookahead logic, which feeds several input bits into more complicated logic arrangements that compute the carries directly. But that of course requires more area in the chip and more power.
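For illustration, here is one such arrangement, a 4-bit carry-lookahead sketch in C (my example, not any particular chip's design). With "generate" and "propagate" signals, every carry becomes a flat expression in the inputs, so no carry has to wait for the one before it, at the cost of wider logic:

```c
#include <stdio.h>

/* 4-bit carry-lookahead sketch (carry-in assumed 0): with
   g[i] = a[i] & b[i] ("generate") and p[i] = a[i] ^ b[i] ("propagate"),
   each carry is a flat expression in the g's and p's rather than a
   chain through the previous carry. */
static unsigned bit(unsigned x, int i) { return (x >> i) & 1; }

static unsigned cla4(unsigned a, unsigned b)
{
    unsigned g = a & b, p = a ^ b;
    unsigned c1 = bit(g, 0);
    unsigned c2 = bit(g, 1) | (bit(p, 1) & bit(g, 0));
    unsigned c3 = bit(g, 2) | (bit(p, 2) & bit(g, 1))
                            | (bit(p, 2) & bit(p, 1) & bit(g, 0));
    unsigned carries = (c1 << 1) | (c2 << 2) | (c3 << 3);
    return (p ^ carries) & 0xF; /* sum bit i = p[i] XOR carry into bit i */
}

int main(void)
{
    printf("%u\n", cla4(7, 1)); /* prints 8 */
    return 0;
}
```

Notice how the expression for each carry gets wider (c3 needs three terms); that growth is the extra area and power mentioned above.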
Today's processors have many different units for performing various sorts of work—loads, stores, addition, multiplication, floating-point operations, and more. Given today's capabilities, the work of doing an add is small compared to other tasks, so it fits within a single processor cycle.
Perhaps in theory you could make a processor that did a bitwise operation more quickly than an add. (And there are, at least on paper, exotic processors that operate asynchronously, with different units doing work at their own paces.) However, with the designs in use, you need some regular fixed cycle to coordinate many things in the processor—loading instructions, dispatching them to execution units, sending results from execution units to registers, and much, much more. Some execution units do require multiple cycles to complete their jobs (e.g., some floating-point units take about four cycles to do a floating-point add). So a mix of latencies is possible. However, with current scales, making the cycle time smaller so that it fits a bitwise operation but not an add is likely not economical.