Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Why is a switch not optimized the same way as chained if else in C/C++?

The following implementation of square produces a series of cmp/je instructions, as I would expect from a chained if statement:

int square(int num) {
    if (num == 0){
        return 0;
    } else if (num == 1){
        return 1;
    } else if (num == 2){
        return 4;
    } else if (num == 3){
        return 9;
    } else if (num == 4){
        return 16;
    } else if (num == 5){
        return 25;
    } else if (num == 6){
        return 36;
    } else if (num == 7){
        return 49;
    } else {
        return num * num;
    }
}

And the following produces a lookup table of return values:

int square_2(int num) {
    switch (num){
        case 0: return 0;
        case 1: return 1;
        case 2: return 4;
        case 3: return 9;
        case 4: return 16;
        case 5: return 25;
        case 6: return 36;
        case 7: return 49;
        default: return num * num;
    }
}

Why is GCC unable to optimize the top one into the bottom one?

Disassembly for reference: https://godbolt.org/z/UP_igi

EDIT: interestingly, MSVC generates a jump table instead of a data table for the switch case. And surprisingly, clang optimizes them to the same result.

Question from: https://stackoverflow.com/questions/60109992/why-is-a-switch-not-optimized-the-same-way-as-chained-if-else-in-c-c


1 Answer


Compilers conventionally lower a switch-case to a jump table. In this case, returning directly through a lookup table appears to be a further optimization that exploits the fact that every case simply returns a value. Although the standard guarantees nothing of the sort, I would be surprised if a compiler generated a series of compares instead of a jump table for a conventional switch-case with dense case values like these.
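To make the lookup-table form concrete, here is a rough hand-written C equivalent of what the table-based code does. It is only a sketch: the names kSquares and square_table are made up for illustration, and the actual compiler output may differ in its details.

```c
/* Hypothetical hand-written equivalent of the lookup-table form.
   kSquares and square_table are illustrative names, not compiler output. */
static const int kSquares[8] = {0, 1, 4, 9, 16, 25, 36, 49};

int square_table(int num) {
    /* A single unsigned comparison rejects both num < 0 and num > 7,
       because a negative int converts to a very large unsigned value. */
    if ((unsigned)num < 8u) {
        return kSquares[num];   /* one indexed load, no chain of compares */
    }
    return num * num;           /* the default case */
}
```

The cost here is one range check plus one load, regardless of which of the eight values is passed, whereas a compare chain takes longer the further down the matching value sits.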

Now coming to if-else, it is the exact opposite. While a switch-case compiled to a jump table executes in roughly constant time regardless of the number of branches, if-else is better suited to a small number of branches. Here, you would expect the compiler to generate a series of comparisons in the order in which you wrote them.

So if I had used if-else because I expect most calls to square() to be for 0 or 1 and rarely for other values, then 'optimizing' this into a table lookup could actually make my code run slower than I expect, defeating my purpose in using an if instead of a switch. So, although it is debatable, I feel GCC is doing the right thing and clang is being overly aggressive in its optimization.
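One way to spell out that intent in the source is to test the values assumed to be hot with plain ifs and leave the rest to a switch. This is a sketch under the assumption that 0 and 1 dominate the inputs; square_hot is a hypothetical name, and nothing in the language guarantees that a compiler preserves this ordering in the generated code.

```c
/* Sketch: check the assumed-common inputs first, then fall back to a switch.
   square_hot is an illustrative name; the "hot" values are an assumption. */
int square_hot(int num) {
    if (num == 0) return 0;     /* assumed most frequent input */
    if (num == 1) return 1;     /* assumed second most frequent input */

    switch (num) {              /* rarer values: let the compiler choose */
        case 2: return 4;
        case 3: return 9;
        case 4: return 16;
        case 5: return 25;
        case 6: return 36;
        case 7: return 49;
        default: return num * num;
    }
}
```

Whether this actually beats a single table lookup depends on the input distribution and the target, so it is worth measuring rather than assuming.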

Someone shared a link in the comments where clang performs this optimization and generates lookup-table-based code for if-else as well. Something notable happens when we reduce the number of cases to just two (plus a default) with clang: it once again generates identical code for both if and switch, but this time it uses compares and moves instead of the lookup-table approach for both. This means that even the switch-favoring clang knows that the 'if' pattern is the better choice when the number of cases is small!
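For reference, the reduced two-case versions would look roughly like the following. The function names are made up for this sketch, and the exact instructions clang emits for them will depend on the compiler version and optimization flags.

```c
/* Two cases plus a default, written as a chained if.
   square_small_if is an illustrative name. */
int square_small_if(int num) {
    if (num == 0) {
        return 0;
    } else if (num == 1) {
        return 1;
    } else {
        return num * num;
    }
}

/* The same logic written as a switch.
   square_small_switch is an illustrative name. */
int square_small_switch(int num) {
    switch (num) {
        case 0: return 0;
        case 1: return 1;
        default: return num * num;
    }
}
```

Pasting both into a compiler explorer with optimizations enabled is an easy way to check whether a given compiler treats them identically.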

In summary, a sequence of compares for if-else and a jump table for switch-case is the standard pattern that compilers tend to follow and that developers tend to expect when they write code. However, in certain special cases, some compilers may choose to break this pattern where they judge that doing so yields better code. Other compilers may choose to stick to the pattern anyway, even if it is apparently sub-optimal, trusting the developer to know what they want. Both are valid approaches with their own advantages and disadvantages.

