Organising Strings

When cataloguing a collection of genetic strings, we should have an established system by which to organize them. The standard method is to organize strings as they would appear in a dictionary, so that “APPLE” precedes “APRON”, which in turn comes before “ARMOR”.

#### Problem

Assume that an alphabet* 𝒜* has a predetermined order; that is, we write the alphabet as a permutation *𝒜* = (*a*1, *a*2, …, *ak*), where *a*1 <* a*2 < ⋯ < *ak*. For instance, the English alphabet is organized as (A, B, …, Z).

Given two strings *s* and *t* having the same length *n*, we say that *s* precedes *t* in the lexicographic order (and write *s* < Lex_*t*) if the first symbol *s*[*j*] that doesn’t match *t*[*j*] satisfies *sj* < *tj* in *𝒜*.

##### Given

A collection of at most 10 symbols defining an ordered alphabet, and a positive integer *n* (*n* ≤ 10).

##### Return

All strings of length *n* that can be formed from the alphabet, ordered lexicographically (use the standard order of symbols in the English alphabet).

#### Solution

First, let’s import the tools and read the FASTA file:

```
import itertools
with open("rosalind_lexf.txt") as f:
data = f.read().split()
letters = data[:-1]
n = int(data[-1])
```

Then, we find the all possible strings of given length n:

```
perm = itertools.product(letters, repeat = n)
output = []
for i, j in enumerate(list(perm)):
permutation = ''
for item in j:
permutation += str(item)
output.append(permutation)
```

We need to sort the strings in the list alphabetically:

```
output.sort()
```

Finally, we print the result as in rosalind’s desired format:

```
for item in output:
print(item, end="\n")
```

###### Important note:

This problem is taken from rosalind.info. Please visit ROSALIND to find out more about Bioinformatics problems. You may also clink onto links for the definitions of each terminology.