Tools for Life Scientists

In this post, you will find some great tools that I have been using since the beginning of my PhD. You may be familiar with most of them, but I am sure you will find some that you have not heard of before. Also, even though the title mentions “Life Scientists”, you will find some of these tools useful for any kind of work.

Antibody search

As a life scientist, you may have had the same antibody from different suppliers fail in your blot or staining. If you have no clue how to pick the right antibody, benchsci is the place to go. You can filter antibodies used in published papers by cell type, host, specificity, clone ID and so on. If you have struggled to get a fluorescent signal or a good blot, give benchsci a try!

Collaborative manuscript editing

I have long been waiting for an effective collaboration tool for scientific writing. Here we go! Manuscript.io is a free, web-based, real-time collaboration platform for your manuscripts. You can include mathematical equations, figures and code from your Jupyter notebooks. It also allows importing Markdown and LaTeX, as well as references from CrossRef, PubMed and DataCite. However, my long wait turned into disappointment because of my no-physical-supervisor PhD experience. If your supervisor is around, please encourage them to collaborate on your manuscript writing!

Where to submit your manuscript

You have now written your manuscript, but you’re not sure where to submit it. You’re also afraid of falling prey to predatory journals. Elsevier JournalFinder helps you find the best match for your article; it uses a specific vocabulary engine to match your article with journals. But before jumping in, please read its user guide thoroughly! Also, don’t forget to get your persistent digital identifier at ORCID.org before submission.

More than a text editor

If you are looking for a more advanced text editor, then Atom is your best friend! It is a free and open-source text and source code editor for almost all platforms. Atom comes with syntax support for many programming languages, including C++, Bash and Java, and you can do more if you integrate Atom with Kite, an AI coding assistant that aims to increase your productivity.

Colour palette for presentations

I give great importance to my presentations regardless of the audience. Even for our internal group meetings, I put a lot of effort into delivering a concise and clear presentation. During one of my recent internal group meetings, my co-supervisor could not find a gap in my talk and was eager to ask me at least one question, but in the end he could only criticise the colour palette I had used for the charts for not being colour-blind friendly. Until that day, I had never thought this could be an issue, but his criticism made me find a colour-blind-friendly palette and re-create all of the figures for future presentations. That is how I found this cool website, coolors.co, which can generate colour palettes of your choice.

Digital labbook

Benchling is a digital platform for industries to improve productivity and data integrity by directly populating Notebook entries with results from instrument runs. However, I use Benchling as a digital laboratory notebook since they have a free option for academics. So, spend less time taking notes by hand, and more time asking scientific questions and carrying out experiments.

Stunning illustrations

I had long been waiting for a platform for drawing scientific illustrations. Even though I had worked as a graphic designer at a web design company, the items in my scientific figures became inconsistent over time. Then I came across biorender, a platform formed by three professional illustrators for the scientific community to create stunning (and consistent!) graphics for their work. I have been using biorender for almost two years and I can’t believe how much the platform has improved since then. So, if you need a beautiful graphical abstract for your next paper, give biorender a shot!

Safer file transfers

When your data contain sensitive information such as patient identifiers, and you want to reduce the risk of a data breach during transfer, you can try the cross-platform file encryption app Encrypto. Just drag and drop the file(s) you want to protect and let Encrypto do the rest for you.

Markdown editor

Nowadays, we all know what Markdown is. To be honest, I have not opened a blank Word document for note taking since I discovered Typora. It is a cross-platform, minimalistic Markdown editor with seamless live preview and export options. You can export (or import) files to almost any format, including PDF, docx, OpenOffice, LaTeX, MediaWiki and Epub.

Access paid journals for free

It can be very frustrating when your institution does not have access to a journal or article you want to download. Kopernio is a browser plugin that works on numerous academic websites to help you get full-text PDFs. Luckily, I have never had trouble accessing an article, but I have used Kopernio to save full-text articles to my cloud space automatically with a single click.

Looking for new opportunities?

Everyone needs a crisp, sharp, modern-looking and concise CV or resumé for their next job application. I discovered this powerful website where you can build a professional-looking CV or resumé with a few clicks, and you don’t even need to sign up for it. Ladies, gentlemen and LGBTQI+ folks, please salute flowcv!

Nice start! Now, you don’t know where to send your new, professional-looking CV. Well, Johns Hopkins collates postdoctoral positions for you. All you need to do is visit and look for the position that fits you best, anywhere around the world!

Don’t read articles only!

I am aware that in academia, we read as many articles as possible to keep up with the research of interest around the globe. But I also try to read as many non-academic books as I can during my me-time. If, like me, you are environmentally conscious, do not buy hard copies of books and own an e-reader, you need Calibre to convert e-books into the file format that your e-reader supports. It is free and open-source software that not only converts books into various formats but also lets you manage your e-library.

Listen to podcasts

Well, not every researcher has time to read books, especially those working in the wet lab. So, podcasts on various topics can keep you company while you are on your feet in the lab. I have tried many podcast apps and found that Pocket Casts satisfies my needs best. It also has a web player that syncs episodes with your phone.

Final words

I have listed all of my favourite apps and platforms that boost productivity for researchers. Some apps may not be directly related to science, but at some point they helped my productivity during my PhD. I now work as a researcher and I still use plenty of them. I hope you will find that some of them match your needs as well! Have a productive research life!

Disclaimer

I believe in openness and accessibility of information, so the aforementioned apps, websites and platforms are not listed for advertorial purposes. Some may be paid and some may not, but I have (and will have) no income from listing them here on my website.

Right tool for productivity

Everyone I know, in academia or industry, has one or two (or sometimes a mouthful of) things to say about productivity. Some even tell you how productive they are. Yes, they are the ones who sacrifice their sleep to meet deadlines. I am not one of those, nor will I ever be.

If you are looking for tips on meeting your deadlines, managing your time or even reducing your anxiety, this is not the place to be. Refer to the suggested readings below if you intend to learn about those.

Here, I will only take you through a short read on how to find the right tools to boost your productivity and some tools that I used during my no-physical-supervisor-PhD journey.

Finding the right working arrangement

What is most essential for improving your productivity is being aware of your unique character. What I mean is finding out under what circumstances you are productive. For instance, by working in a noisy place I can finish in only a couple of hours a task that would typically take others days. Most people, however, can’t be productive in such crowded and noisy places. So rule number one is to know yourself!

When I need to work remotely from home, or on days when I don’t necessarily have to be present at the office, I use a great app called Coffitivity to imitate a packed coffee shop, a restaurant with clanging cutlery or even a crowded campus to keep myself motivated.

Finding the right tools

Regardless of what industry you work in or what research you are involved in, you will need the help of computers at some point. Even if you only use computers to send emails, I have some tips for your productivity. The best free email client is Spark from Readdle. With its elegant design (and dark mode support), you can set up as many email addresses as you want. It also comes with a handy Smart Inbox feature, with which you can set what type of notifications you get during work hours. Spark also supports natural language search and email snoozing, so you will never forget to read or reply to an email. You can also create email templates if your work includes tedious email writing.

What’s more, it has a built-in calendar which will make you forget about other calendar apps. Last but not least, you can schedule emails. Unfortunately, it is only available for macOS, iOS/iPadOS and Android at the moment.

Now we are all set with our emails. The next tool I am going to introduce plans your day on your behalf. It is an app called Sorted3; unlike other “to-do” apps, it automatically schedules your daily tasks over the period you set. It is only available for iOS at the moment, but they are working hard to develop apps for other platforms.

But how about long-term tasks or projects? To organise your projects effectively and keep track of your progress, the Notion app is built for you. It is available for most platforms and free for one user, so give it a shot to plan and track your projects.

If your work has to do with large images, figures, photographs, graphic design and so on, ImageOptim is the only tool you need. It strips metadata from image files and compresses images without losing quality, which saves disk space and bandwidth. It is available on most platforms, or you may use the in-browser service.

Let’s keep working on files. As a scientist, my only struggle on the computer was spotting duplicate files. We write manuscripts, grants and reports, and all of them need citations. By default, reference managers copy articles into their own libraries rather than moving them, which creates a duplicate of each file. To check whether you have duplicates on your computer, dupeGuru is the tool to go for.
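If you are curious how duplicate detection can work in principle, here is a minimal sketch that groups files by the hash of their contents. It is only an illustration under my own assumptions (the function name `find_duplicates` is made up, and every file is read fully into memory); it is not how dupeGuru itself is implemented.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under `root` that have byte-identical contents.

    Hypothetical helper for illustration; reads each file fully,
    so it suits small libraries rather than huge datasets.
    """
    by_digest = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Identical contents always produce the same SHA-256 digest
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates` on the folder where your reference manager and your downloads live would return one list of paths per group of identical files.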

Even while you are reading this article, your eyes are being exposed to a lot of light. f.lux makes the colours of your computer’s display adapt to the time of day: warm at night and like sunlight during the day. It’s even possible that you’re staying up too late because of your computer. You could use f.lux because it helps you sleep better, or just because it makes your computer look better. Different colours in the light spectrum stimulate the circadian system in humans. Blue light, which includes light that appears green, blue, cyan and even orange, may exacerbate sleep disorders, especially in children and adolescents. This effect can be minimised by using dim red lighting in the bedroom at night; or, if you are working late on your computer, f.lux does it for you based on the sunrise and sunset times at your location.

My other favourite app for keeping my biological clock set is Pillow Sleep Tracker. When you place your phone face down on your mattress, it monitors your movements and heart rate during sleep to wake you up at the lightest possible sleep stage. Since I discovered this app, my daily energy levels have increased drastically, and I can complete more work than on days when I did not sleep well.

Let’s presume that you have all the right tools and the workplace you need, but you are still dithering and postponing your work. I will wrap up this passage with some words on procrastination.

Procrastination

Procrastination is a serious behavioural issue*, not just a word some use to mask their reluctance to do a task. However, you may turn procrastination to your benefit by replacing it with conscious procrastination, in which you plan how you will perform the job before you start. It should not be confused with pre-crastination, which is the exact opposite of procrastination. Pre-crastinators tend to finish a task quickly just for the sake of getting it done, moving away from taking responsibilities. Conscious procrastination, on the other hand, means having a planning period rather than jumping directly into the task. This way, you will be able to stay calm even as the deadline approaches. The video below is an excellent talk on procrastination:

My best advice for avoiding procrastination is to slice your tasks into multiple sub-tasks and note them down in a notepad (digital or physical). When you finish a sub-task, strike it through. Seeing completed tasks gives you visible progress and helps keep you motivated. As I mentioned above, you may use the Sorted3 app to plan your sliced tasks automatically.

Final words

Let’s be honest with ourselves and accept the fact that you will not implement my suggestions above right away. You will have found some ideas worth trying and others not. Nonetheless, adopt one at each step (or at least one), or discover what tools could be better to improve your productivity. We are all unique characters, and therefore, we don’t necessarily adopt bullet points in a self-improvement book.

Disclaimer

All of the above is drawn from my personal experience. Some of it may be backed up by science, some may not. Please consult experts regarding any specific topic I mentioned here.

Suggested Readings
  • Barrett, M. (2017). Five tips to get the most out of your workday. The Conversation, April 20, 1-3.
  • Lai, V. T. T. (2019). Struggling with Mental Illnesses Before and During the PhD Journey: When Multiple Treatments Join the Healing Process. In L. Pretorius, L. Macaulay, & B. Cahusac de Caux (Eds.), Wellbeing in Doctoral Education. Springer, Singapore.

References
  • Rosenbaum, D. A. (2014). It’s A Jungle In There: How Competition And Cooperation In The Brain Shape The Mind. New York: Oxford University Press. Print version released March 2014.
  • Chang, A.-M., Aeschbach, D., Duffy, J. F., & Czeisler, C. A. (2015). Proceedings of the National Academy of Sciences, 112(4), 1232-1237.
  • Holzman, D. C. (2010). What’s in a Color? The Unique Human Health Effects of Blue Light. Environmental Health Perspectives, 118, A22-A27.

Enumerating k-mers Lexicographically

Organising Strings

When cataloguing a collection of genetic strings, we should have an established system by which to organize them. The standard method is to organize strings as they would appear in a dictionary, so that “APPLE” precedes “APRON”, which in turn comes before “ARMOR”.

Problem

Assume that an alphabet 𝒜 has a predetermined order; that is, we write the alphabet as a permutation 𝒜 = (a1, a2, …, ak), where a1 < a2 < ⋯ < ak. For instance, the English alphabet is organized as (A, B, …, Z).

Given two strings s and t having the same length n, we say that s precedes t in the lexicographic order (and write s <_Lex t) if the first symbol s[j] that doesn’t match t[j] satisfies s[j] < t[j] in 𝒜.
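To make the definition concrete, here is a small sketch that compares two equal-length strings under an arbitrary ordered alphabet. The function name `precedes` and the `rank` lookup are my own choices for illustration, not part of the Rosalind problem.

```python
def precedes(s, t, alphabet):
    """Return True if s comes strictly before t in the lexicographic
    order induced by `alphabet` (an ordered sequence of symbols).

    Illustrative helper; assumes len(s) == len(t).
    """
    # Position of each symbol in the ordered alphabet
    rank = {symbol: i for i, symbol in enumerate(alphabet)}
    for a, b in zip(s, t):
        if a != b:
            # The first mismatch alone decides the order
            return rank[a] < rank[b]
    return False  # identical strings: neither precedes the other


english = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(precedes("APPLE", "APRON", english))  # True
print(precedes("TA", "AT", "TAGC"))         # True, because T < A in this alphabet
```

Note that only the first mismatching position matters; everything after it is ignored.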

Given

A collection of at most 10 symbols defining an ordered alphabet, and a positive integer n (n ≤ 10).

Return

All strings of length n that can be formed from the alphabet, ordered lexicographically (use the standard order of symbols in the English alphabet).


Solution

First, let’s import the tools and read the input file (a plain text file holding the alphabet symbols followed by n):

import itertools

with open("rosalind_lexf.txt") as f:
    data = f.read().split()
    letters = data[:-1]
    n = int(data[-1])

Then, we build all possible strings of the given length n:

# Each tuple from itertools.product is one choice of n symbols; join it into a string
output = [''.join(symbols) for symbols in itertools.product(letters, repeat=n)]

We need to sort the strings in the list alphabetically:

output.sort()
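A side note on ordering: itertools.product emits its tuples in the order of the input symbols, so when the symbols are already listed in the desired alphabet order, the strings come out lexicographically ordered with respect to that alphabet. The sketch below illustrates this; the function name `enumerate_kmers` is hypothetical and the alphabets are made-up examples.

```python
import itertools


def enumerate_kmers(alphabet, n):
    """Return all length-n strings over `alphabet`, in the order induced
    by the position of each symbol in `alphabet` (illustrative helper)."""
    return ["".join(symbols) for symbols in itertools.product(alphabet, repeat=n)]


# With the symbols in standard order, the result is already sorted
kmers = enumerate_kmers(["A", "C", "G", "T"], 2)
print(kmers == sorted(kmers))  # True

# With a custom order (T before A), a plain sort() would undo the ordering
print(enumerate_kmers(["T", "A"], 2))  # ['TT', 'TA', 'AT', 'AA']
```

So the explicit sort matters only when the symbols in the input file are not already in standard alphabetical order.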

Finally, we print the result in the format Rosalind expects, one string per line:

for item in output:
    print(item)

Important note:

This problem is taken from rosalind.info. Please visit ROSALIND to find out more about Bioinformatics problems. You may also click on the links for the definition of each term.

RNA Splicing

Genes are Discontiguous

In “Transcribing DNA into RNA”, we mentioned that a strand of DNA is copied into a strand of RNA during transcription, but we neglected to mention how transcription is achieved.
In the nucleus, an enzyme (i.e., a molecule that accelerates a chemical reaction) called RNA polymerase (RNAP) initiates transcription by breaking the bonds joining complementary bases of DNA. It then creates a molecule called precursor mRNA, or pre-mRNA, by using one of the two strands of DNA as a template strand: moving down the template strand, when RNAP encounters the next nucleotide, it adds the complementary base to the growing RNA strand, with the provision that uracil must be used in place of thymine.
Because RNA is constructed based on complementarity, the second strand of DNA, called the coding strand, is identical to the new strand of RNA except for the replacement of thymine with uracil.
After RNAP has created several nucleotides of RNA, the first separated complementary DNA bases then bond back together. The overall effect is very similar to a pair of zippers traversing the DNA double helix, unzipping the two strands and then quickly zipping them back together while the strand of pre-mRNA is produced.
For that matter, it is not the case that an entire substring of DNA is transcribed into RNA and then translated into a peptide one codon at a time. In reality, a pre-mRNA is first chopped into smaller segments called introns and exons; for the purposes of protein translation, the introns are thrown out, and the exons are glued together sequentially to produce a final strand of mRNA. This cutting and pasting process is called splicing, and it is facilitated by a collection of RNA and proteins called a spliceosome. The fact that the spliceosome is made of RNA and proteins despite regulating the splicing of RNA to create proteins is just one manifestation of a molecular chicken-and-egg scenario that has yet to be fully resolved.
In terms of DNA, the exons deriving from a gene are collectively known as the gene’s coding region.
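The relationship between the template strand, the coding strand and the new RNA can be checked with a few lines of Python. This is only a sketch: the example sequence is made up, and for simplicity the template strand is written aligned base by base with the coding strand, ignoring strand direction.

```python
# Base-pairing rules as translation tables
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")   # DNA base -> complementary DNA base
DNA_TO_RNA_PAIR = str.maketrans("ACGT", "UGCA")  # DNA base -> complementary RNA base (U replaces T)

coding_strand = "ATGGTCTAA"                       # made-up example
template_strand = coding_strand.translate(DNA_COMPLEMENT)

# RNAP reads the template and adds the complementary RNA base each time
pre_mrna = template_strand.translate(DNA_TO_RNA_PAIR)

print(template_strand)  # TACCAGATT
print(pre_mrna)         # AUGGUCUAA
print(pre_mrna == coding_strand.replace("T", "U"))  # True
```

This is why, in practice, we can transcribe by simply replacing T with U in the coding strand.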

Problem

After identifying the exons and introns of an RNA string, we only need to delete the introns and concatenate the exons to form a new string ready for translation.
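As a rough sketch of the whole idea on a toy input: we delete each intron, transcribe, and translate until a stop codon. The names `splice` and `translate` and the example strings are my own; the codon table is built compactly from the standard genetic code in UCAG order, with `*` marking stop codons. Note that `str.replace` removes every occurrence of an intron, which is an assumption that holds when each intron appears exactly once.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop codon
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def splice(dna, introns):
    """Delete each intron from the DNA string, leaving the joined exons."""
    for intron in introns:
        dna = dna.replace(intron, "")
    return dna


def translate(dna):
    """Transcribe the DNA and translate codon by codon until a stop codon."""
    rna = dna.replace("T", "U")
    protein = []
    # Stop at len - 2 so a trailing partial codon is ignored
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


exons_only = splice("ATGAAACCCGGGTTTTAG", ["CCCGGG"])
print(exons_only)             # ATGAAATTTTAG
print(translate(exons_only))  # MKF
```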

Given

A DNA string s (of length at most 1 kbp) and a collection of substrings of s acting as introns. All strings are given in FASTA format.

Return

A protein string resulting from transcribing and translating the exons of s (Note: Only one solution will exist for the dataset provided).


Solution

First, let’s read the FASTA file:

with open("rosalind_splc.txt") as fasta:
    sequences = {}
    header = None
    for line in fasta:
        line = line.strip()
        if line.startswith(">"):
            # A header line starts a new record
            header = line[1:]
            sequences[header] = ""
        elif header is not None:
            # Sequence lines may wrap, so append them to the current record
            sequences[header] += line

Now we have the sequences and their names in a dictionary, but we don’t need the names. Let’s convert the dictionary values to a list; the first item is the DNA string and the rest are the introns. Then, we remove the introns:

records = list(sequences.values())
dna = records[0]
for intron in records[1:]:
    dna = dna.replace(intron, "")

Now we can reuse the function from “Translating RNA into Protein”, but first we need to transcribe the intron-free DNA into mRNA:

rna_sequence = dna.replace("T", "U")

codon_table = {"UUU":"F", "UUC":"F", "UUA":"L", "UUG":"L",
    "UCU":"S", "UCC":"S", "UCA":"S", "UCG":"S",
    "UAU":"Y", "UAC":"Y", "UAA":"STOP", "UAG":"STOP",
    "UGU":"C", "UGC":"C", "UGA":"STOP", "UGG":"W",
    "CUU":"L", "CUC":"L", "CUA":"L", "CUG":"L",
    "CCU":"P", "CCC":"P", "CCA":"P", "CCG":"P",
    "CAU":"H", "CAC":"H", "CAA":"Q", "CAG":"Q",
    "CGU":"R", "CGC":"R", "CGA":"R", "CGG":"R",
    "AUU":"I", "AUC":"I", "AUA":"I", "AUG":"M",
    "ACU":"T", "ACC":"T", "ACA":"T", "ACG":"T",
    "AAU":"N", "AAC":"N", "AAA":"K", "AAG":"K",
    "AGU":"S", "AGC":"S", "AGA":"R", "AGG":"R",
    "GUU":"V", "GUC":"V", "GUA":"V", "GUG":"V",
    "GCU":"A", "GCC":"A", "GCA":"A", "GCG":"A",
    "GAU":"D", "GAC":"D", "GAA":"E", "GAG":"E",
    "GGU":"G", "GGC":"G", "GGA":"G", "GGG":"G"}

def protein_translation(mRNA):
    # Translation starts at the first start codon
    start = mRNA.find("AUG")
    if start == -1:
        return ""
    protein = []
    # Stop at len - 2 so a trailing partial codon is ignored
    for i in range(start, len(mRNA) - 2, 3):
        amino_acid = codon_table[mRNA[i:i+3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return "".join(protein)

Finally, we call the function and print the protein string:

print(protein_translation(rna_sequence))

Important note:

This problem is taken from rosalind.info. Please visit ROSALIND to find out more about Bioinformatics problems. You may also click on the links for the definition of each term.

Locating Restriction Sites

The Billion-Year War

The war between viruses and bacteria has been waged for over a billion years. Viruses called bacteriophages (or simply phages) require a bacterial host to propagate, and so they must somehow infiltrate the bacterium; such deception can only be achieved if the phage understands the genetic framework underlying the bacterium’s cellular functions. The phage’s goal is to insert DNA that will be replicated within the bacterium and lead to the reproduction of as many copies of the phage as possible, which sometimes also involves the bacterium’s demise.
To defend itself, the bacterium must either obfuscate its cellular functions so that the phage cannot infiltrate it, or better yet, go on the counterattack by calling in the air force. Specifically, the bacterium employs aerial scouts called restriction enzymes, which operate by cutting through viral DNA to cripple the phage. But what kind of DNA are restriction enzymes looking for?
The restriction enzyme is a homodimer, which means that it is composed of two identical substructures. Each of these structures separates from the restriction enzyme in order to bind to and cut one strand of the phage DNA molecule; both substructures are pre-programmed with the same target string containing 4 to 12 nucleotides to search for within the phage DNA. The chance that both strands of phage DNA will be cut (thus crippling the phage) is greater if the target is located on both strands of phage DNA, as close to each other as possible. By extension, the best chance of disarming the phage occurs when the two target copies appear directly across from each other along the phage DNA, a phenomenon that occurs precisely when the target is equal to its own reverse complement. Eons of evolution have made sure that most restriction enzyme targets now have this form.

Problem

A DNA string is a reverse palindrome if it is equal to its reverse complement. For instance, GCATGC is a reverse palindrome because its reverse complement is GCATGC.
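The definition translates almost directly into code. Here is a minimal sketch (the function names are my own) that builds the reverse complement with a translation table and compares it with the original string:

```python
# Maps each DNA base to its complement
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement every base, then reverse the string."""
    return dna.translate(COMPLEMENT_TABLE)[::-1]


def is_reverse_palindrome(dna):
    """A DNA string is a reverse palindrome if it equals its reverse complement."""
    return dna == reverse_complement(dna)


print(reverse_complement("AAAC"))       # GTTT
print(is_reverse_palindrome("GCATGC"))  # True
print(is_reverse_palindrome("GCATGA"))  # False
```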

Given

A DNA string of length at most 1 kbp in FASTA format.

Return

The position and length of every reverse palindrome in the string having length between 4 and 12. You may return these pairs in any order.


Solution

First, let’s read the FASTA file. Remember that the first line contains the sequence name, so we skip it:

with open("rosalind_revp.txt") as f:
    for line in f:
        if line.startswith(">"):
            dna_str = str()
        else:
            dna_str += (line.strip("\n"))

To find its reverse complement, we use the same function that we covered in “Complementing a String of DNA”:

def complementary(dna_sequence):
    # Walk the sequence backwards and swap each base for its complement,
    # which yields the reverse complement
    replace_bases = {"A":"T","T":"A","G":"C","C":"G"}
    return ''.join([replace_bases[base] for base in reversed(dna_sequence)])

Then, we create another function to find out the location and the length of palindromes:

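def LocatingRestrictionSites(dna):
    position_length = []
    # i is the window length (4 to 12 inclusive), j is the 0-based start index
    for i in range(4, 13):
        for j in range(0, len(dna) - i + 1):
            # A window is a hit when it equals its own reverse complement
            if complementary(dna[j:j+i]) == dna[j:j+i]:
                # Rosalind expects 1-based positions
                position_length.append(str(j+1) + ' ' + str(i))
    return position_length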
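The function above tests every window of every length. As an alternative sketch (not the approach used in this post), we can grow outwards from each gap between two neighbouring bases: a reverse palindrome of length L + 2 always contains one of length L around the same centre, so we can stop expanding at the first mismatch. The function name and the example string below are my own choices for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_palindromes(dna, min_len=4, max_len=12):
    """Return sorted (position, length) pairs, with 1-based positions,
    found by expanding around each centre between two bases."""
    hits = []
    for centre in range(1, len(dna)):
        left, right = centre - 1, centre
        # Expand while the outermost pair is complementary
        while (left >= 0 and right < len(dna)
               and right - left + 1 <= max_len
               and COMPLEMENT.get(dna[left]) == dna[right]):
            length = right - left + 1
            if length >= min_len:
                hits.append((left + 1, length))
            left -= 1
            right += 1
    return sorted(hits)


for position, length in reverse_palindromes("TCAATGCATGCGGGTCTATATGCAT"):
    print(position, length)
```

Odd lengths never appear because the middle base of an odd-length string would have to be its own complement.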

Finally, we print the position and length of each palindrome:

for pos_len in LocatingRestrictionSites(dna_str):
    print(pos_len)

Important note:

This problem is taken from rosalind.info. Please visit ROSALIND to find out more about Bioinformatics problems. You may also click on the links for the definition of each term.