Project Stage 3

 

Project Stage 3: Tidy & Wrap – Clone-Prune Analysis at Scale

Welcome to the final part of my SPO600 project! In the previous stage, I built a GCC pass that could compare a cloned function to its original version by checking the number of basic blocks and the GIMPLE statements inside them. The pass would then decide whether the cloned function was identical and could be safely pruned.

That version worked well when there was only one cloned function to analyze. But in this stage I wanted to scale things up.

What’s New in Stage 3?

The goal for Stage 3 was to make the Clone-Prune Analysis Pass more complete and scalable. Here's what I aimed to do:

  • Handle multiple cloned functions, not just one.

  • Automatically skip resolver functions, which are internal compiler helpers we don’t want to compare.

  • Make the output more clear and architecture-independent (i.e., it works the same on both x86_64 and aarch64).

  • Improve diagnostic logging to make testing and verification easier.

This meant refactoring some parts of the code, updating the logic for comparing functions, and testing the pass thoroughly across both architectures to ensure consistent results.

Why This Matters

Modern compilers often create cloned versions of functions for optimization purposes (e.g., based on CPU-specific features). But sometimes, those clones end up being identical to the original function. In those cases, we can "prune" them i.e., remove or avoid keeping them, to save space and reduce redundancy.

Here is the final version of my compiler pass for Stage 3:

#include "config.h"

#include "system.h"

#include "coretypes.h"

#include "tree.h"

#include "tree-pass.h"

#include "cgraph.h"

#include "function.h"

#include "basic-block.h"

#include "gimple.h"

#include "gimple-iterator.h"

#include "gimple-pretty-print.h"

#include "cfg.h"


#include <string>

#include <vector>

#include <map>


namespace {


const pass_data pass_data_project = {

    GIMPLE_PASS,    // type

    "pass_project", // name

    OPTGROUP_NONE,  // optinfo_flags

    TV_NONE,        // tv_id

    PROP_cfg,       // properties_required

    0,              // properties_provided

    0,              // properties_destroyed

    0,              // todo_flags_start

    0               // todo_flags_finish

};


// Pass class

class pass_project : public gimple_opt_pass {

    std::vector<std::string> functions;

    std::map<std::string, std::vector<int>> gimpleCodes;

    std::map<std::string, int> bbCnts;


public:

    // Constructor

    pass_project(gcc::context* ctxt) : gimple_opt_pass(pass_data_project, ctxt) {}


    unsigned int execute(function* fun) override {

        cgraph_node* node = cgraph_node::get(fun->decl);

        std::string currFunName = node->name();

        basic_block bb;

        int bb_cnt = 0;

        int funIndex = findClonedFunction(currFunName);


        if (dump_file) {

            if (funIndex != EOF) {

                fprintf(dump_file, "===== This %s is a cloned function =====\n", currFunName.c_str());

            } else {

                fprintf(dump_file, "===== This is not a cloned function =====\n");

            }

        }


        // Store function names

        functions.push_back(currFunName);


        // Store GIMPLE code

        FOR_EACH_BB_FN(bb, fun) {

            bb_cnt++;

            for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {

                gimple* stmt = gsi_stmt(gsi);

                gimpleCodes[node->name()].push_back(gimple_code(stmt));

            }

        }


        bbCnts[currFunName] = bb_cnt;


        if (dump_file) {

            if (funIndex == EOF) {

                fprintf(dump_file, "NOPRUNE: %s\n\n", removeSuffix(currFunName).c_str());

            } else {

                fprintf(

                    dump_file, "%s: %s\n\n",

                    isIdenticalFunction(currFunName, funIndex) ? "PRUNE" : "NOPRUNE",

                    removeSuffix(currFunName).c_str()

                );

            }

        }


        return 0;

    }


    // Finds index of cloned function if one exists, else returns -1

    int findClonedFunction(std::string checkingFunName) {

        if (functions.empty()) {

            return -1;

        }


        size_t dotPos = checkingFunName.find('.');

        std::string funName = checkingFunName.substr(0, dotPos);

        std::string suffix = "";

        if (dotPos != std::string::npos) {

            suffix = checkingFunName.substr(dotPos + 1);

        }


        for (size_t i = 0; i < functions.size(); ++i) {

            if (funName == functions[i] && suffix != "resolver") {

                return i;

            }

        }


        return -1;

    }


    // Checks if two functions are identical based on basic block and GIMPLE comparison

    bool isIdenticalFunction(std::string currFunName, int orgFunIndex) {

        fprintf(dump_file, "========== Comparing Original Function and the Cloned Function ==========\n");

        fprintf(dump_file, "*** Compare Basic Block ***\n");


        fprintf(dump_file,

                "Current Function Basic Block: %d\n"

                "Original Function Basic Block: %d\n",

                bbCnts[currFunName], bbCnts[functions[orgFunIndex]]);


        if (bbCnts[currFunName] != bbCnts[functions[orgFunIndex]]) {

            fprintf(dump_file, "*** Different amount of Basic Blocks ***\n");

            return false;

        }


        fprintf(dump_file, "*** They have same amount of Basic Blocks ***\n");

        fprintf(dump_file, "Cloned Function GIMPLE code (Current) | Original Function GIMPLE code\n");


        for (size_t i = 0; i < gimpleCodes[currFunName].size(); ++i) {

            if (gimpleCodes[currFunName][i] == gimpleCodes[functions[orgFunIndex]][i]) {

                fprintf(

                    dump_file, "%d | %d\n",

                    gimpleCodes[currFunName][i], gimpleCodes[functions[orgFunIndex]][i]

                );

            } else {

                return false;

            }

        }


        return true;

    }


    // Removes ".clone" or ".resolver" suffix from function name

    std::string removeSuffix(std::string fullName) {

        size_t dotPos = fullName.find('.');

        return dotPos != std::string::npos ? fullName.substr(0, dotPos) : fullName;

    }

};


} // end namespace


// Custom pass creation function

gimple_opt_pass* make_pass_project(gcc::context* ctxt) {

    return new pass_project(ctxt);

}


Summary

This custom GCC pass identifies and analyzes cloned functions during the compilation process. Here's what it does:

  • Stores function names as they are encountered.

  • Collects GIMPLE codes for each basic block in a function.

  • Tracks basic block counts to compare function structures.

  • Detects cloned functions by comparing names and suffixes like .clone or .resolver.

  • Performs structural comparisons between original and cloned functions based on:

    • Number of basic blocks

    • Sequence of GIMPLE statements

  • Prints out pruning decisions using PRUNE or NOPRUNE based on similarity.

This pass helps determine whether a cloned function can be safely pruned, which is useful for optimizing code size and performance, especially when comparing across architectures like x86_64 and aarch64.

Testing with Multiple Cloned Functions

To test whether my pass can handle multiple cloned functions, I created a custom test case with three different functions, all marked to be cloned using the target_clones attribute. Each function does some simple computation using different data types (int16_t, float, and int32_t), so they produce meaningful GIMPLE code for analysis.

Here’s the test file I used:

#include <stdio.h>

#include <stdint.h>

#include <stddef.h>


CLONE_ATTRIBUTE

int process_int16(int16_t *data, size_t len) {

    int sum = 0;

    for (size_t i = 0; i < len; ++i) {

        sum += data[i] * 3 - data[i] / 2 + data[i] % 7;

    }

    return sum;

}


CLONE_ATTRIBUTE

float process_floats(float *data, size_t len) {

    float sum = 0.0f;

    for (size_t i = 0; i < len; ++i) {

        sum += data[i] * 1.5f - data[i] / 3.0f + data[i] * data[i];

    }

    return sum;

}


CLONE_ATTRIBUTE

int process_int32(int32_t *data, size_t len) {

    int result = 0;

    for (size_t i = 0; i < len; ++i) {

        result += (data[i] & 0xFF) + (data[i] >> 3) - (data[i] << 1);

    }

    return result;

}


int main() {

    int16_t a[128];

    float b[128];

    int32_t c[128];


    for (int i = 0; i < 128; ++i) {

        a[i] = i;

        b[i] = (float)i;

        c[i] = i;

    }


    printf("process_int16: %d\n", process_int16(a, 128));

    printf("process_floats: %.2f\n", process_floats(b, 128));

    printf("process_int32: %d\n", process_int32(c, 128));


    return 0;

}

Compiling and Testing the Pass

To actually generate the clones and see how my pass behaves with different target_clones, I compiled the test file with two different target settings:

  • One setup where the functions are similar enough to prune.

  • One where the clones are different and shouldn’t be pruned.

Here’s how I did it:

PRUNE Case

gcc -D 'CLONE_ATTRIBUTE=__attribute__((target_clones("default", "popcnt")))' -march=x86-64 -O3 -fdump-tree-pass_project testcase.c -o testcase_prune


In this case, the functions are compiled with clone targets "default" and "popcnt", which are close enough that the cloned versions often have the same GIMPLE structure. This lets my pass mark them as PRUNE.

NOPRUNE Case

gcc -D 'CLONE_ATTRIBUTE=__attribute__((target_clones("default", "arch=x86-64-v3")))' -march=x86-64 -O3 -fdump-tree-pass_project testcase.c -o testcase_noprune


This time, the cloned versions can be quite different, especially when the architecture target is more specific like x86-64-v3. As a result, my pass usually reports NOPRUNE for these functions because the GIMPLE structure changes enough to not be identical.

Results for testcase_prune

After running my pass on the testcase_prune binary, the results looked promising. The pass correctly identified cloned functions and was able to detect identical GIMPLE

structures between them, making the right pruning decisions.

Here’s a summary of what was printed in the .pass_project dump file:

===== This is not a cloned function =====

NOPRUNE: process_int16


__attribute__((target ("default"), target_clones ("default", "popcnt")))

int process_int16 (int16_t * data, size_t len)


The function process_int16 was not considered a clone (it was treated as the base/original), so no comparison or pruning applied here.

;; Function process_floats (process_floats.default, ...)

===== This is not a cloned function =====

NOPRUNE: process_floats


__attribute__((target ("default"), target_clones ("default", "popcnt")))

float process_floats (float * data, size_t len)


Same with process_floats — this function was the base version and didn't trigger a clone comparison either.

;; Function process_int32 (process_int32.default, ...)

===== This is not a cloned function =====

NOPRUNE: process_int32


__attribute__((target ("default"), target_clones ("default", "popcnt")))

int process_int32 (int32_t * data, size_t len)


Again, the default process_int32 was treated as the original.

But now it gets interesting...

Clone Detected: process_int32.popcnt

;; Function process_int32.popcnt (process_int32.popcnt, ...)


===== This process_int32.popcnt is a cloned function =====

===== Comparing Original Function and the Cloned Function =====

*** Compare Basic Block ***

Current Function Basic Block: 11

Original Function Basic Block: 11

*** They have same amount of Basic Block ***

Cloned Function Gimple code(Current) | Original Function Gimple code

1 | 1

6 | 6

1 | 1

6 | 6

6 | 6

6 | 6

6 | 6

6 | 6

6 | 6

6 | 6

6 | 6

10 | 10

PRUNE: process_int32


The pass successfully detected that process_int32.popcnt is a cloned version of process_int32.

It checked the number of basic blocks and found that they matched (11 in both), then compared the GIMPLE

statement codes one by one. Since all matched perfectly, the function was marked as PRUNE.

The Result for testcase_noprune

Let's now look at the output when running the GCC pass without enabling pruning.

As expected, this version still performs target cloning, but it does not attempt to prune any of the functions even if they are identical or very similar. Here's what the output looked like:

Not a Cloned Function

Function: process_int16

NOPRUNE: process_int16


__attribute__((target("default"), target_clones("default", "arch=x86-64-v3")))

int process_int16(int16_t * data, size_t len)

Not a Cloned Function

Function: process_floats

NOPRUNE: process_floats


__attribute__((target("default"), target_clones("default", "arch=x86-64-v3")))

float process_floats(float * data, size_t len)

Not a Cloned Function

Function: process_int32

NOPRUNE: process_int32


__attribute__((target("default"), target_clones("default", "arch=x86-64-v3")))

int process_int32(int32_t * data, size_t len)

Cloned Function Detected

Function: process_int32.arch_x86_64_v3

===== This process_int32.arch_x86_64_v3 is a cloned function =====

===== Comparing Original Function and the Cloned Function =====

*** Compare Basic Block ***

Current Function Basic Block: 15  

Original Function Basic Block: 15  

*** Same amount of Basic Blocks ***

NOPRUNE: process_int32


__attribute__((target("arch=x86-64-v3"), target_clones("default", "arch=x86-64-v3")))

int process_int32.arch_x86_64_v3(int32_t * data, size_t len)

Cloned Function Detected

Function: process_floats.arch_x86_64_v3

===== This process_floats.arch_x86_64_v3 is a cloned function =====

===== Comparing Original Function and the Cloned Function =====

*** Compare Basic Block ***

Current Function Basic Block: 15  

Original Function Basic Block: 13  

*** Different amount of Basic Blocks ***

NOPRUNE: process_floats


__attribute__((target("arch=x86-64-v3"), target_clones("default", "arch=x86-64-v3")))

float process_floats.arch_x86_64_v3(float * data, size_t len)

Cloned Function Detected

Function: process_int16.arch_x86_64_v3

===== This process_int16.arch_x86_64_v3 is a cloned function =====

===== Comparing Original Function and the Cloned Function =====

*** Compare Basic Block ***

Current Function Basic Block: 19  

Original Function Basic Block: 15  

*** Different amount of Basic Blocks ***

NOPRUNE: process_int16

__attribute__((target("arch=x86-64-v3"), target_clones("default", "arch=x86-64-v3")))

int process_int16.arch_x86_64_v3(int16_t * data, size_t len)

Summary

As you can see, none of the cloned functions were removed, even if some had the same number of basic blocks (like process_int32). This makes sense since in the nopruned version, we explicitly chose not to prune anything. It confirms that our analysis and pruning logic is gated correctly.

What I Learned

This project was a deep dive into compiler internals, and it was honestly the most technical thing I’ve done so far. A few key takeaways:

  • GCC passes are hard but incredibly rewarding once they work.

  • I got comfortable working with GIMPLE, basic blocks, and function declarations at the compiler level.

  • Architecture differences can be subtle but important, testing on both is crucial.

  • Writing blog posts alongside development helped me organize my thoughts and reflect as I went.

Final Thoughts

I came into SPO600 not knowing much about compilers, and now I’ve written a working GCC pass that can help identify

cloned functions and give smart pruning suggestions. That’s wild to me!

Huge thanks to my instructor and classmates and if you’re just starting out with compilers, trust me, it does get easier. Stick with it.

Thanks for following along! 🙌

~ Haroon Ahmed Bajwa



Comments

Popular posts from this blog

Project Stage 2.A

Project Stage 2.C

Welcome to My SPO600 Journey

Lab1 6502 Assembly Language

Stage 1 Completed

Project Stage 2.B

Lab 4 Building GCC on My Linux VM

Lab 2 Adventures in 6502 Assembly

Project Stage 1 Diving In!