Bioinformatics Algorithms in Python | Rosalind Problem Notes: 004 Finding the Reverse Complement of a DNA Sequence

简佐义的博客 · Published 2020-12-07 09:19:02

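The base-pairing rule is that A pairs with T, and G pairs with C.

Finding the reverse complement of a DNA sequence takes two steps: first reverse, then complement. For example, reversing the sequence "ATGC" gives "CGTA", and complementing that gives "GCAT".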
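The two steps can be traced on the "ATGC" example with a minimal sketch. The names `BASEPAIR`, `reverse_then_complement`, and `complement_then_reverse` are hypothetical, chosen only for illustration. The sketch also checks that the order of the two steps does not change the result, since complementing acts on each base independently of its position.

```python
# Complement table taken from the pairing rule: A <-> T, G <-> C.
BASEPAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_then_complement(dna):
    """Step 1: reverse the sequence. Step 2: complement every base."""
    reversed_dna = "".join(reversed(dna))
    return "".join(BASEPAIR[base] for base in reversed_dna)

def complement_then_reverse(dna):
    """Same two steps in the opposite order."""
    complemented = "".join(BASEPAIR[base] for base in dna)
    return "".join(reversed(complemented))

print("".join(reversed("ATGC")))        # CGTA
print(reverse_then_complement("ATGC"))  # GCAT
print(complement_then_reverse("ATGC"))  # GCAT
```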

Given: a DNA sequence of at most 1000 bp.

Return: its reverse complement.

Sample data

AAAACCCGGT

Sample result

ACCGGGTTTT

Python implementation

Complementing_a_Strand_of_DNA.py

import sys

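def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    revc = ""
    basepair = {'A':'T', 'T':'A', 'G':'C', 'C':'G'}
    for c in dna:
        # Prepending each complemented base reverses the order as we go.
        revc = basepair[c] + revc
    return revc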

def test():
    dna = 'AAAACCCGGT'
    return reverse_complement(dna) == 'ACCGGGTTTT'

if __name__ == '__main__':
    if not test():
        print("reverse_complement: Failed")
        sys.exit(1)

    with open('rosalind_revc.txt') as fh:
        dna = fh.read()
        revc = reverse_complement(dna.strip().upper())
        print(revc)

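Python has no switch statement (a match statement was only added later, in Python 3.10). An if...elif...elif...else chain can emulate one, but the more Pythonic approach is a dictionary lookup. As an exercise, you can try implementing the reverse complement for this problem with if...elif...else.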
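A minimal sketch of the if...elif...else variant suggested above might look like the following. The helper names `complement_base` and `reverse_complement_if` are hypothetical, and raising `ValueError` on an unexpected character is an assumption of this sketch, not something the problem requires.

```python
def complement_base(base):
    """Emulate a switch statement with an if...elif...else chain."""
    if base == "A":
        return "T"
    elif base == "T":
        return "A"
    elif base == "G":
        return "C"
    elif base == "C":
        return "G"
    else:
        # Assumption: anything outside A/T/G/C is treated as an error.
        raise ValueError(f"unexpected base: {base!r}")

def reverse_complement_if(dna):
    """Walk the sequence backwards and complement each base."""
    return "".join(complement_base(base) for base in reversed(dna))

print(reverse_complement_if("AAAACCCGGT"))  # ACCGGGTTTT
```

Compared with the dictionary version, the chain is longer and each branch has to be kept in sync by hand, which is why the dictionary lookup is usually preferred.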
Sequences in Python can be reversed with slicing: dna_forward[::-1] yields the reversed sequence, and complementing that result also gives the reverse complement.
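The slicing approach can be sketched as follows. Pairing it with the built-in `str.maketrans` and `str.translate` is one possible choice for the complement step, swapped in here in place of the dictionary loop. `COMPLEMENT_TABLE` and `reverse_complement_slice` are hypothetical names.

```python
# Translation table mapping each base to its complement.
COMPLEMENT_TABLE = str.maketrans("ATGC", "TACG")

def reverse_complement_slice(dna_forward):
    """Reverse with a [::-1] slice, then complement with str.translate."""
    dna_reversed = dna_forward[::-1]
    return dna_reversed.translate(COMPLEMENT_TABLE)

print(reverse_complement_slice("AAAACCCGGT"))  # ACCGGGTTTT
```

Note that `str.translate` leaves characters missing from the table unchanged, so unlike the dictionary lookup this version does not raise an error on an unexpected symbol.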

Problem

In DNA strings, symbols 'A' and 'T' are complements of each other, as are 'C' and 'G'.

The reverse complement of a DNA string s is the string s^c formed by reversing the symbols of s, then taking the complement of each symbol (e.g., the reverse complement of "GTCA" is "TGAC").

Given: A DNA string s of length at most 1000 bp.

Return: The reverse complement s^c of s.