Complementing a Strand of DNA

The Secondary and Tertiary Structures of DNA

In “Counting DNA Nucleotides”, we introduced nucleic acids, and we saw that the primary structure of a nucleic acid is determined by the ordering of its nucleobases along the sugar-phosphate backbone that constitutes the bonds of the nucleic acid polymer. Yet primary structure tells us nothing about the larger, 3-dimensional shape of the molecule, which is vital for a complete understanding of nucleic acids.
The search for a complete chemical structure of nucleic acids was central to molecular biology research in the mid-20th Century, culminating in 1953 with a publication in Nature of fewer than 800 words by James Watson and Francis Crick. Consolidating a high resolution X-ray image created by Rosalind Franklin and Raymond Gosling with a number of established chemical results, Watson and Crick proposed the following structure for DNA:
The DNA molecule is made up of two strands, running in opposite directions. Each base bonds to a base in the opposite strand. Adenine always bonds with thymine, and cytosine always bonds with guanine; the complement of a base is the base to which it always bonds.
In light of Watson and Crick’s model, the bonding of two complementary bases is called a base pair (bp). Therefore, the length of a DNA molecule will commonly be given in bp instead of nt. By complementarity, once we know the order of bases on one strand, we can immediately deduce the sequence of bases in the complementary strand. These bases will run in the opposite order to match the fact that the two strands of DNA run in opposite directions.


In DNA strings, symbols ‘A’ and ‘T’ are complements of each other, as are ‘C’ and ‘G’.

The reverse complement of a DNA string s is the string s^c formed by reversing the symbols of s, then taking the complement of each symbol (e.g., the reverse complement of “GTCA” is “TGAC”).
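As a minimal sketch of this definition, the function below reverses a string and then complements each symbol. The names `COMPLEMENT` and `reverse_complement` are illustrative and not part of the problem statement, and the sketch assumes the input contains only the uppercase symbols A, C, G and T.

```python
# Complement of each DNA symbol: A <-> T and C <-> G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(s: str) -> str:
    """Return the reverse complement of the DNA string s.

    Assumes s contains only the uppercase symbols A, C, G and T;
    any other symbol raises a KeyError.
    """
    # Walk the string from its last symbol to its first,
    # replacing each symbol by its complement.
    return "".join(COMPLEMENT[base] for base in reversed(s))


print(reverse_complement("GTCA"))  # prints TGAC, the example from the text
```

Because complementing acts on each symbol independently, reversing first and complementing second gives the same string as complementing first and reversing second.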


Given: A DNA string s of length at most 1000 bp.


Return: The reverse complement s^c of s.
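Before computing anything, it can help to check that the input actually meets the stated constraints. The following is a small sketch only: the helper name `validate_dna` and the choice to raise `ValueError` are assumptions made for illustration, not part of the problem statement.

```python
def validate_dna(s: str, max_length: int = 1000) -> str:
    """Return s unchanged if it is a DNA string of at most max_length symbols."""
    if len(s) > max_length:
        raise ValueError(f"expected at most {max_length} symbols, got {len(s)}")
    # A DNA string may only contain the four nucleotide symbols.
    invalid = set(s) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected symbols: {sorted(invalid)}")
    return s


print(validate_dna("AAAACCCGGT"))  # prints AAAACCCGGT
```

A check like this catches a stray newline or lowercase letter early, before it silently changes the output.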


This is a fairly simple task, so we can read the file, compute the reverse complement and print the result in one short script:

# Read the DNA string, dropping the trailing newline.
with open("/Users/cenkcelik/Downloads/rosalind_orf.txt") as f:
    s = f.read().strip()
# Reverse the string, then swap A<->T and C<->G.
print(s[::-1].translate(str.maketrans("ACGT", "TGCA")))
Important note:

This problem is taken from ROSALIND. Please visit ROSALIND to find out more about bioinformatics problems. You may also click on the links for the definition of each term.