Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
379 views
in Technique[技术] by (71.8m points)

c - 查找具有最多突变的最长回文DNA子序列(Find the longest palindromic DNA sub-sequence that has the most mutations on it)

I've been trying to do a Dynamic Programming assignment for university but I had no success so far.

(我一直在尝试大学的动态编程任务,但到目前为止我还没有成功。)

The problem:

(问题:)

Given a DNA string and a list of mutation locations (for exemple, pieces 0 and 2 are mutations), find the longest palindromic sub-sequence that contains the most mutations on it.

(给定一个DNA字符串和一个突变位置列表(例如,片段0和2是突变),找到包含其上最多突变的最长回文子序列。)

Input: a string S with 0 to 2000 chars;

(输入:0到2000个字符的字符串S;)

an integer N such that 0<=N<=|S|

(整数N,使得0 <= N <= | S |)

and N positions (numbers from 0 to |S|) of mutations.

(和N个位置(数字从0到| S |)。)

Output: an integer representing the size of the longest palindromic sub-sequence containing the maximum number of mutations.

(输出:一个整数,代表最长回文子序列的大小,其中包含最大突变数。)

Examples:

(例子:)

Input: CAGACAT 0

(输入:CAGACAT 0)

Output: 5

(输出:5)

Input: GATTACA 1 0

(输入:GATTACA 1 0)

Output: 1

(输出1)

Input: GATTACA 3 0 4 5

(输入:GATTACA 3 0 4 5)

Output: 3

(输出3)

Input: TATACTATA 2 4 8

(输入:TATACTATA 2 4 8)

Output: 7

(输出:7)

We have to code it in C, but what I really need are ideas, any language or pseudo-code is good to me.

(我们必须用C编写代码,但是我真正需要的是想法,任何语言或伪代码对我都很好。)

My code to find the LPS (in C)

(我找到LPS的代码(在C中))

int find_lps(char *input)
{
    int len = strlen(input), i, cur_len;
    int c[len][len];

    for (i = 0; i < len; i++)
        c[i][i] = 1;

    for (cur_len = 1; cur_len < len; cur_len++) {
        for (i = 0; i < len - cur_len; i++) {
            int j = i + cur_len;

            if (input[i] == input[j]) {
                c[i][j] = c[i + 1][j - 1] + 2;
            } else {
                c[i][j] = max(c[i + 1][j], c[i][j - 1]);
            }
        }
    }

    return c[0][len - 1];
}

What I tried to do for the mutations:

(我试图为突变做些什么:)

1- Creating an array of places where the LPS is changed.

(1-创建更改LPS的位置的数组。)

That doesn't work, and really, I have no idea of what to do.

(那是行不通的,实际上,我不知道该怎么办。)

More details about the problem: In a situation where you have n palindromic subsequences, both of them with the same size of mutations inside, I need the longest of them.

(有关该问题的更多详细信息:在您有n个回文序列的情况下,这两个子序列内部都有相同大小的突变,我需要它们中最长的一个。)

Given that you have n palindromic subsequences with X mutations, (we have M mutations), I need the longest palindromic subsequence of X mutations, considering you don't have a palindromic subsequence with M mutations.

(假设您有n个具有X突变的回文子序列(我们有M个突变),考虑到您没有具有M突变的回文子序列,我需要最长的X突变的回文子序列。)

If you do, then you should choose the other subsequence, even if it's shorter.

(如果这样做,则应该选择另一个子序列,即使它更短。)

So, first criteria: most mutations in a palindromic subsequence.

(因此,第一个标准:回文子序列中的大多数突变。)

If we have the same amount, then the longest of the subsequences.

(如果我们有相同的数量,那么最长的子序列。)

Any help is appreciated, thank you.

(任何帮助表示赞赏,谢谢。)

  ask by c-orlando translate from so

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Lets define C[i][j] to store 2 values:

(让我们定义C [i] [j]来存储2个值:)

1- The length of the longest palindromic sub-sequence in the sub-string S(i,j) that contains the most mutations in it, and lets denote it by C[i][j].len

(1-子串S(i,j)中最长回文子序列的长度,该子串中包含最多突变,并用C [i] [j] .len表示)

2- The number of mutations in the longest palindromic sub-sequence in the sub-string S(i,j) that contains the most mutations in it, and lets denote it by C[i][j].ms

(2-子串S(i,j)中最长回文子序列中包含最多突变的突变数,并用C [i] [j] .ms表示)

Then the result of the problem would be C[0][|S|-1].len

(那么问题的结果将是C [0] [| S | -1] .len)

Note : m[i] = 1 means the character s[i] is a mutation, otherwise m[i] = 0

(注意 :m [i] = 1表示字符s [i]是一个突变,否则m [i] = 0)

Here is the full code written in c++:

(这是用c ++编写的完整代码:)

#include <iostream>
#include <string>
using namespace std;


string s;
int m[2001];

struct Node {
    int ms;//number of mutations
    int len;

    Node() {
        ms = len = 0;
    }

    Node(int v1,int v2) {
        ms = v1;
        len = v2;
    }
};

Node C[2001][2001];


Node getBestNode(Node n1, Node n2) {
    if (n1.ms > n2.ms)
        return n1;

    if (n1.ms <  n2.ms)
        return n2;

    if (n1.len > n2.len)
        return n1;

    if (n1.len < n2.len)
        return n2;  

    return n1;
}

void init() {
    for (int i = 0; i < 2001; i++) {
        m[i] = 0;
        for (int j = 0; j < 2001; j++)  C[i][j] = Node(0,0);
    }
}

void solve() {
    int len = s.length();

    // initializing the ranges of length = 1
    for (int i = 0; i < len; i++)
        C[i][i] = Node( m[i],1 );

    // initializing the ranges of length = 2
    for (int i = 0; i < len - 1; i++) 
        if (s[i] == s[i + 1])
            C[i][i + 1] = Node(m[i] + m[i + 1],2);
        else if (m[i] || m[i + 1])
                C[i][i + 1] = Node(1,1) ;

    // for ranges of length >= 3
    for (int cur_len = 3; cur_len <= len; cur_len++)
        for (int i = 0; i <= len - cur_len; i++) {

            int j = i + cur_len - 1;

            C[i][j] = getBestNode(C[i + 1][j], C[i][j-1]);

            if (s[i] == s[j]) {
                Node nn = Node(
                    C[i + 1][j - 1].ms + m[i] + m[j] , 
                    C[i + 1][j - 1].len + 2
                );

                C[i][j] = getBestNode(C[i][j], nn);
            }
        }
}

int main() {

    int n;  
    cin >> s >> n;

    init();//initializing the arrays with zeros

    for (int i = 0; i < n; i++) {
        int x;  cin >> x;
        m[x] = 1;
    }

    solve();

    cout << C[0][s.length()-1].len << endl; 
    return 0;
}

The function getBestNode() is returning the best of 2 solutions by considering the number of mutations then the length of the sub-sequence.

(函数getBestNode()通过考虑突变数和子序列的长度,返回2个解中的最佳解。)

Note : The code can be shorter, but I made it this way for clarity.

(注意 :代码可以更短,但是为了清楚起见,我这样做是这样的。)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

56.8k users

...