Biowebtronics: Biotech, startups, web development and the internet of things.

Making Autoprotocols more flexible

On my first attempt at translating an experimental protocol into the Autoprotocol format, I got as far as creating a run on Transcriptic with the protocol, which was awesome. The downside of that Autoprotocol, though, was that the samples used in the experiment were hardcoded into the Python script, so any variation in experimental samples or parameters would have to be made in the Python.

Thankfully, the awesome people behind Autoprotocol made it possible to create a protocol that is parameterised. The Autoprotocol can be packaged up and uploaded to Transcriptic, where Transcriptic's web app generates a user interface that lets the user simply type in the experimental parameters and hit run, as you can see in the screenshot.

[Screenshot: the user interface generated for the protocol]

How to package up an Autoprotocol Python script

I created my assay package by leaning heavily on the protocols in Autoprotocol-core and the Transcriptic Runner documentation.

Change how the protocol is wrapped

The first thing to do was replace the start and the end of the Python script. At the start of the old protocol there was a Protocol object to which all the refs and actions were attached.

import json
from autoprotocol.protocol import Protocol

p = Protocol()

# ...your protocol...

And at the end, the whole thing was dumped as a JSON object in Autoprotocol format:

# Builds the Autoprotocol JSON
print(json.dumps(p.as_dict(), indent=2))

In the new protocol we don't want the Python script to dump JSON, as it now has to operate slightly differently. Because of this we also don't need to instantiate a Protocol object.

So the start now looks like this:

from autoprotocol.util import make_dottable_dict

def assay_name(protocol, params):
    params = make_dottable_dict(params)

    # ...your protocol...

In the new protocol every action is wrapped up inside a function, assay_name in the example, that accepts the arguments protocol and params.

The end of the protocol now looks like this:

if __name__ == '__main__':
    from autoprotocol.harness import run
    run(assay_name, 'AssayName')
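
Putting the two halves together, a minimal parameterised protocol matching the SampleProtocol manifest shown later in this post might look like the following sketch (the transfer is just an example action; the harness converts aliquot and volume inputs into objects the protocol methods accept):

from autoprotocol.util import make_dottable_dict

def sample_protocol(protocol, params):
    # params is populated from the manifest inputs (see manifest.json below)
    params = make_dottable_dict(params)

    # Example action: move a small volume between the two chosen aliquots
    protocol.transfer(params.source_sample,
                      params.dest_sample,
                      params.transfer_vol)

if __name__ == '__main__':
    from autoprotocol.harness import run
    run(sample_protocol, 'SampleProtocol')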

Let's move on to the next task of parameterisation.

Remove hardcoded parameters

In the old protocol we defined a lot of fixed parameters: the type of container, the volume of sample, which sample, and which wavelengths to use for spectral measurements. All of this can be parameterised, ultimately making the protocol more flexible and easier to use for colleagues who aren't confident editing a Python script.

In the old protocol the user didn't have a choice of media; I would always force the use of LB broth doped with ampicillin, because it was hardcoded into every dispense command. But what if a user wanted to use un-doped LB? Adding this flexibility is straightforward: check the experimental parameters passed to the protocol function and assign the choice to a variable to use whenever media is needed. In the following example the choice between two types of media is made by checking the params object with simple conditional statements.

# Check the params object for the choice of media and make sure only one choice is made.
if params["media"]["lb-broth-100ug-ml-amp"] and not params["media"]["lb-broth-noAB"]:
    # Set the growth_media variable to the user's choice.
    growth_media = "lb-broth-100ug-ml-amp"
elif params["media"]["lb-broth-noAB"] and not params["media"]["lb-broth-100ug-ml-amp"]:
    growth_media = "lb-broth-noAB"
else:
    # Notify the user that they need to make exactly one choice.
    raise RuntimeError("You must select exactly one growth medium.")

# growth_plate is the 96-well container ref defined earlier in the protocol
protocol.dispense(growth_plate, growth_media, [{"column": 0, "volume": "1000:microliter"}])

So where does the params object get populated?

Introducing Manifest.json

Manifest.json serves a couple of purposes: it is where the assignable parameters for the protocol are defined, and it also holds a set of default parameters used when we preview the protocol during testing with Transcriptic Runner.

This is the example manifest.json from the documentation:

{
  "version": "1.0.0",
  "format": "python",
  "license": "MIT",
  "protocols": [
    {
      "name": "SampleProtocol",
      "command_string": "python -m my_protocols.sample_protocol",
      "description": "this is a sample protocol",
      "preview": {
        "refs": {
          "sample_plate": {
            "type": "96-pcr",
            "discard": true
          }
        },
        "parameters": {
          "source_sample": "sample_plate/A1",
          "dest_sample": "sample_plate/A2",
          "transfer_vol": "5:microliter"
        }
      },
      "inputs": {
        "source_sample": "aliquot",
        "dest_sample": "aliquot",
        "transfer_vol": "volume"
      },
      "dependencies": []
    }
  ]
}

You can see it kicks off with some definitions of the version and other housekeeping details about the protocol. In the example there is just one protocol, but a whole array of protocols can be defined. The important bits here are:

  1. command_string, the command used to run the protocol's Python script
  2. preview, the parameters and refs used in the preview (good for testing locally)
  3. inputs, which dictates the fields offered to the user to populate the params object

Let's look at the inputs block for the growth media choice I mentioned earlier:

{
  "inputs": {
    "media": {
      "type": "group",
      "description": "Type of media to grow bacteria in. (check off only one)",
      "inputs": {
        "lb-broth-100ug-ml-amp": {
          "type": "bool"
        },
        "lb-broth-noAB": {
          "type": "bool"
        }
      }
    }
  }
}

Under the "media" property there are two inputs one for each medium, the "type" of input is bool indicating that the selection input is either true or false.

For this kind of input Transcriptic generates checkbox UI elements as seen below:

[Screenshot: the checkbox UI generated for the media inputs]

Inputs need to be added for each of the parameters you reference in the Python protocol.
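
For reference, when the user submits the form these inputs arrive in the protocol function's params argument. Ticking only the ampicillin box would produce something like the following (the exact shape is my reading of how the harness populates params):

# Approximate shape of params for the media inputs above, assuming the
# user ticked only the ampicillin medium (an assumption for illustration).
params = {
    "media": {
        "lb-broth-100ug-ml-amp": True,
        "lb-broth-noAB": False,
    }
}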

Directory Structure

Be sure to arrange the protocols in a file structure similar to that recommended by Transcriptic:

protocols/
  manifest.json
  requirements.txt
  my_protocols/
    __init__.py
    sample_protocol.py

The manifest.json needs to be at the top level of the directory tree, so at least a level up from the Python protocol.

Testing

To test the protocol using Transcriptic Runner, run the following in the same directory as the manifest.json:

$ transcriptic preview AssayName

If all is working properly it dumps the Autoprotocol JSON to STDOUT. Otherwise you will get an error caused by mistakes in either the manifest.json or the Python protocol. Keep fixing errors until you get the JSON dump; I found pylint useful for fixing basic syntax errors in my Python.

After that you can bounce it off the Transcriptic servers by piping the JSON to analyze:

$ transcriptic preview AssayName | transcriptic analyze

I found it useful to create a run from the preview, as I like to use the run UI on the Transcriptic web app to quickly scan through the run and make sure the protocol is doing what I want it to do.

$ transcriptic preview AssayName | transcriptic submit --project ":project_code" --title "Test Run" --test

If the run looks good on the Transcriptic web application it's time to package it up.

Uploading releases to Transcriptic

Transcriptic has a really easy way of uploading packages of protocols to the server. In the directory, create a .zip archive from the manifest.json and the directory containing the Python protocol. Name the .zip release_someVersionNumber.zip, in line with the version number specified in the manifest.json.
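
If you'd rather script this step, here is a minimal sketch using Python's zipfile module (the file names follow the example directory structure above and a hypothetical 1.0.0 version; adjust to your own package):

# Build the release archive for upload; names follow the example
# directory structure above and a hypothetical 1.0.0 version.
import zipfile

with zipfile.ZipFile("release_1.0.0.zip", "w") as release:
    release.write("manifest.json")
    release.write("requirements.txt")
    release.write("my_protocols/__init__.py")
    release.write("my_protocols/sample_protocol.py")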

Next, log in to the Transcriptic web app and click 'Manage' for your organization. Then click the 'PACKAGES' tab and click 'Create New Package'. Name the protocol, give it a short description, then upload the .zip file for the package and click 'Save & Analyze'. Under 'RELEASES' you will want to click publish to ensure other members of your organization can use the protocol.

Whenever you improve the protocol, bump the version number in the manifest.json, zip up all the files again, upload the new version, then hit publish! I think this release management is a really nice feature, as version tracking of protocols is essential for experimental repeatability.

Summary

All in all, the process of packaging up a protocol was not too difficult, and I think it goes a long way to making the Transcriptic platform more widely accessible to all researchers, not just the ones who can write Python. The ability to make protocols public is awesome as well. You could write a paper, publish it on PLOS, and reference your protocol on Transcriptic. Then when people want to try your technique, all they need to do is find your protocol on Transcriptic and use their own samples! Think of what this will do for repeatability...

I had the pleasure of meeting Tali, Max and Dorothy-Lou from Transcriptic at SynBioBetaUK this week. They are all extremely smart and super friendly people, and I highly recommend you get in touch with them if you are interested in giving Transcriptic a try with your research.

Twemoji is coming

At Dentally we were initially hesitant about using Twitter regularly, because we wanted to use the platform to provide value rather than just self-promote, which is all over Twitter.

But I'm getting distracted; this post is actually about Twemoji!

Slack does Twitter well; I like how relaxed they come across as a company, and I think their use of emojis is in concert with that. But what I like more than emojis are Twitter's own Twemojis, as I just prefer the colours and design. So I now have Twemoji support here, all thanks to Twemoji Awesome, which is a really easy way to integrate them into your site.

Go forth and Twemoji...

My first attempt at working with Autoprotocol

Recently I wrote about my first experience of running a simple experiment on Transcriptic's cloud biology platform: a simple bacterial growth curve. The protocol for running this experiment was produced by interacting with a GUI to enter parameters; the actual nitty-gritty of liquid handling and spectroscopic measurements was already predefined.

The growth curve protocol had been written as a package, a bundle of code that, when connected to the Transcriptic web application, generates a user interface that can parametrically generate the commands executed by the platform. This is a great way of handling protocols, as it allows easy execution of experiments by users who have no experience with code.

Transcriptic accepts protocols defined by the Autoprotocol standard (also designed by Transcriptic). So another method of executing experiments on Transcriptic is to write a protocol in the Autoprotocol standard and submit it to Transcriptic via the API for execution. This is what I had a quick go at.

[Image: Autoprotocol logo]

Example protocol: the burden assay

First I needed a protocol to turn into the Autoprotocol standard JSON format. I picked a protocol from 'Quantifying cellular capacity identifies gene expression designs with reduced burden', a paper from the Ellis group and co-workers. The specific protocol is the spectroscopic analysis of a fluorescent reporter recombinant DNA system transformed into cells. The experiment is designed to assess the burden of the gene cassette on the host.

Writing the protocol

Autoprotocol protocols can be quite lengthy due to the granularity of specifying each liquid handling step, and when working with 96-well or 384-well plates one could end up with a lot of repetition. To make the construction of protocols simpler, Autoprotocol provides a Python package that can programmatically output Autoprotocol JSON.

Containers

I started by defining the references in the protocol. These are essentially the containers used in the experiment, be they existing containers with reagents or containers to be created where, for instance, the assay will take place.

When instantiating a container you need to supply a few arguments, mainly the ID, the container type and the container's destiny (where the container ends up at the end of the run).

import json
from autoprotocol.protocol import Protocol

#instantiate new Protocol object
p = Protocol()

# Add the containers to the protocol; unfortunately I had to pick slightly different containers to the ones used in the paper.

# The protocol assumes I already have a stock of transformed bacteria and a stock of arabinose

bacteria_stock = p.ref("bacteria_stock", cont_type="micro-2.0", storage="cold_4")
bacteria_overgrow = p.ref("bacteria_overgrow", cont_type="96-deep", discard=True)

inducer_arabinose = p.ref("inducer_arabinose", cont_type="micro-1.5", storage="cold_4")
reaction_plate = p.ref("reaction_plate", cont_type="96-flat", storage="cold_4")
bacteria_prep = p.ref("bacteria_prep", cont_type="96-deep", discard=True)

After adding all the containers, the protocol actions kick off with growing a fresh liquid culture from a bacterial stock, the aim being a culture of bacteria in an exponential phase of growth.

Liquid handling and culturing bacteria

# Should be dispensing M9, but M9 media isn't a standard reagent at Transcriptic
# Dispense fills the container with standard reagents from Transcriptic
p.dispense(bacteria_prep,
            "lb-broth-100ug-ml-amp",
            [{"column": 0, "volume": "1500:microliter"}])

# Add bacteria from stock container to fresh media
p.transfer(bacteria_stock.well(0).set_volume("1000:microliter"),
           bacteria_prep.well(0),
           "5:microliter")

# Cover the plate prior to shaking incubation
p.cover(bacteria_prep, lid="universal")

# 16hr incubation
p.incubate(bacteria_prep,
           "warm_37",
           "16:hour",
           shaking=True)

# Prep media for overgrowing bacteria sample
p.dispense(bacteria_overgrow,
            "lb-broth-100ug-ml-amp",
            [
              {"column": 0, "volume": "1000:microliter"}
            ]
          )

p.uncover(bacteria_prep)

# Inoculate overgrowth sample
p.transfer(bacteria_prep.well("A1"),
           bacteria_overgrow.well("A1"),
           "20:microliter",
           mix_after=True)

p.cover(bacteria_overgrow, lid="universal")

# Incubate bacteria to guarantee exponential phase
p.incubate(bacteria_overgrow,
           "warm_37",
           "1:hour",
           shaking=True)

p.uncover(bacteria_overgrow)

# Transfer exponential phase bacteria to microplate for the assay
p.distribute(bacteria_overgrow.well("A1").set_volume("1000:microliter"),
             reaction_plate.wells_from(0,4),
             "200:microliter"
             )

OD600 and fluorescence measurements with an arabinose induction step

After the bacteria have been cultured into an exponential phase of growth, the culture is transferred to another plate where the spectroscopic assay will take place.

During the assay, measurements are made of the OD600, the fluorescent emission at 528nm and the fluorescent emission at 645nm. These measurements are taken three times prior to the expression system being induced by arabinose (as in the code below); then, following induction, the 3 spectroscopic measurements are taken every 30 minutes, 8 times.

As an aside, I'm pretty sure this code can be cleaned up a lot to remove the repetition; a sketch of one possible refactor follows the code.

## Assay time!

# Incubate bacteria at 37 degrees for 3 hours
p.cover(reaction_plate, lid="universal")
p.incubate(reaction_plate, "warm_37", "3:hour", shaking=True)

# Read the first four wells on the reaction plate.
p.absorbance(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
             "600:nanometer", "OD600_reading_post3hr")
p.fluorescence(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
               excitation="485:nanometer", emission="528:nanometer",
               dataref="528_reading_post3hr")
p.fluorescence(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
               excitation="590:nanometer", emission="645:nanometer",
               dataref="645_reading_post3hr")

# Incubate bacteria at 37 degrees for 30 mins
p.incubate(reaction_plate, "warm_37", "30:minute", shaking=True)

# Another measurement
p.absorbance(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
             "600:nanometer", "OD600_reading_post3hr2")
p.fluorescence(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
               excitation="485:nanometer", emission="528:nanometer",
               dataref="528_reading_post3hr2")
p.fluorescence(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
               excitation="590:nanometer", emission="645:nanometer",
               dataref="645_reading_post3hr2")

# Incubate
p.incubate(reaction_plate, "warm_37", "30:minute", shaking=True)

# Measurement
p.absorbance(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
             "600:nanometer", "OD600_reading_preinduce")
p.fluorescence(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
               excitation="485:nanometer", emission="528:nanometer",
               dataref="528_reading_preinduce")
p.fluorescence(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
               excitation="590:nanometer", emission="645:nanometer",
               dataref="645_reading_preinduce")

p.uncover(reaction_plate)

# Induce the expression system with arabinose
p.distribute(inducer_arabinose.well(0).set_volume("1000:microliter"),
             reaction_plate.wells_from(0,4),
             "100:microliter")

p.cover(reaction_plate, lid="universal")

# The 8 time-series measurements from here can be generated with a single
# loop; the loop counter is used in the dataref assignment.
for count in range(8):
    # Incubate
    p.incubate(reaction_plate, "warm_37", "30:minute", shaking=True)

    # Measure
    p.absorbance(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
                 "600:nanometer", "OD600_reading_" + str(count))
    p.fluorescence(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
                   excitation="485:nanometer", emission="528:nanometer",
                   dataref="528_reading_" + str(count))
    p.fluorescence(reaction_plate, reaction_plate.wells_from(0, 4).indices(),
                   excitation="590:nanometer", emission="645:nanometer",
                   dataref="645_reading_" + str(count))
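
Here is a minimal sketch of the refactor mentioned above, assuming a helper named measure_plate (my own name for it): the three readings are wrapped in one function, and the pre-induction incubate-and-read steps collapse into a loop over (duration, dataref suffix) pairs.

def measure_plate(protocol, plate, suffix):
    # Take the three spectroscopic readings on the first four wells,
    # tagging each dataref with the given suffix.
    wells = plate.wells_from(0, 4).indices()
    protocol.absorbance(plate, wells, "600:nanometer",
                        "OD600_reading_" + suffix)
    protocol.fluorescence(plate, wells, excitation="485:nanometer",
                          emission="528:nanometer",
                          dataref="528_reading_" + suffix)
    protocol.fluorescence(plate, wells, excitation="590:nanometer",
                          emission="645:nanometer",
                          dataref="645_reading_" + suffix)

# The three pre-induction incubate-and-read blocks then collapse to:
for duration, suffix in [("3:hour", "post3hr"),
                         ("30:minute", "post3hr2"),
                         ("30:minute", "preinduce")]:
    p.incubate(reaction_plate, "warm_37", duration, shaking=True)
    measure_plate(p, reaction_plate, suffix)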

Once the run finishes, all the containers meet their 'destiny', which is either being discarded or returned to storage.

Getting the protocol onto Transcriptic

Once the protocol has been written, it can all be 'built' to JSON in the Autoprotocol format with this line in the Python protocol:

# Builds the Autoprotocol JSON
print(json.dumps(p.as_dict(), indent=2))

Executing the file with python burden.py will dump the JSON to STDOUT, and this output can be piped into other commands.

I used the Transcriptic Runner package to first validate the protocol then create a test run via the API with the following command from the docs:

$ python burden.py | transcriptic submit --project ":project_id" --title "Burden Assay" --test

If the PUT request on the API works, the run appears with all of the actions and containers interpreted into the UI:

[Screenshot: the burden assay run in the Transcriptic UI]

Once the run is logged against the project, it is easier to go through the UI to check the run than to go through the JSON from the Python file.

I haven't tested the run on any materials, as I don't have any of the strains or the plasmids in my inventory, but it would be cool to try to replicate some of the results from the paper.

I'll try to work on wrapping the burden assay protocol in a harness, which makes the protocol flexible by accepting parameters via a UI.

The documentation from both Transcriptic and Autoprotocol is really decent and helpful. In particular, analysis and validation with the Transcriptic Runner was a big help in eliminating small errors in the protocol. One recurring fix was setting 'virtual volumes' on containers that will only hold a volume of liquid at some point later in the run.

Looking forward to trying to grab a quick word with the people from the Transcriptic team at SynBioBeta next week at Imperial College.

Ben

Living the dream, robots running biological experiments in Silicon Valley, AKA my first go on Transcriptic

So I thought I would write up my first experience using Transcriptic.

DH5alpha Growth Curve

To test the platform I just wanted to perform a simple growth curve of E. coli DH5alpha. This is pretty simple to do on Transcriptic because there is already a core protocol, defined in the Autoprotocol standard that they created.

Set up

The first step was to acquire some bacteria. This was super easy, as Transcriptic provide some core molecular biology reagents, one of which is competent cells, so I grabbed an aliquot of DH5alpha because I had used it before during my PhD. The aliquot cost $5.05 for 50µL containing 1 unit, and the DH5alpha vendor was Zymo Research.

[Screenshot: inventory]

Next I had to create the run. I used the 'core' protocol for performing growth curves, which is very easy to get started with, as you just enter parameters into the fields, as can be seen in the screenshot.

[Screenshot: setting up a new run]

The main parameters were: 5µL of the DH5alpha aliquot in each of the 3 replicates, 1 negative control with no bacteria, and OD600 measurements taken every 30 minutes for a total time of 12 hours. From these parameters the protocol generates all of the run commands, including dispensing LB into the 96-well plate and all of the incubation and plate reader acquisition steps. After all the steps have been generated, Transcriptic also gives you the cost of the run, which was $7.00, totally reasonable in my opinion.

This was so easy to set up, and I will take a look at writing my own protocols at some point in the future using Autoprotocol.

Run progress

I found tracking the run very exciting; picturing these robots over in California executing the experiment fills you with excitement about the potential of this technology. I told my flatmate "this was the dream", but he said "most people probably don't dream about executing biology experiments on robots in Silicon Valley"...

Transcriptic have done a really nice job of showing the steps involved in a run and indicating the progress through them. You can also preview the observations made, to give you an idea of how well the experiment is going.

[Screenshot: run progress]

At Dentally we make heavy use of dashboards built on Dashing to track statistics about our web application and business-related stuff like our sales pipeline and our support stats. With this in mind, I was picturing a future company with multiple simultaneous runs on Transcriptic and what their dashboard would look like. To drive these dashboards you need an API to access the real-time data, and thankfully Transcriptic do have an API. I just wanted to ping the API to see if I could get the status of my run, an early step in building a real-time dashboard.

The endpoint takes this form: https://secure.transcriptic.com/:organization/:project/runs/:run_id, and it takes headers containing the user email and access token. The API should return a 200 and some JSON containing details of the run status, costs, who created it and other metadata. Unfortunately I was just getting back HTML from my requests and not the expected JSON. This may be due to me only having a test API key, though I am unsure of the limitations of being on the test level. I left a support ticket, so I should find out shortly if I was being dumb!
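
For reference, this is roughly the request I was making, as a Python sketch. The header names X-User-Email and X-User-Token are my reading of the API docs of the time, and the organization, project, run IDs and the "status" field are hypothetical:

# A sketch of the run status request; header names, IDs and the
# "status" field are assumptions for illustration.
import requests

url = "https://secure.transcriptic.com/my-org/my-project/runs/r1abc234"
headers = {
    "X-User-Email": "me@example.com",   # account email
    "X-User-Token": "my-access-token",  # API access token
    "Accept": "application/json",       # ask explicitly for JSON, not HTML
}

response = requests.get(url, headers=headers)
print(response.status_code)             # expect 200
print(response.json().get("status"))    # hypothetical field name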

Data processing and visualisation

Once the run had completed I retrieved the data from Transcriptic; you can download it as a .zip. Frustratingly, though I can understand why, the dataset was a collection of 24 .csv files, one for each acquisition from the plate reader.

So to aggregate the whole data set I did a few things.

I first added column names to all 24 .csv files in bash using sed:

find . -maxdepth 1 -type f -exec sed -i.bk '1i \
well,abs
' {} \;

This command prepended the column names to every file and backed up the originals.

Then using csvkit I stacked all the files together:

csvstack -g 0.5000000000,1.0000000000,1.5000000000,2.0000000000,2.5000000000,3.0000000000,3.5000000000,4.0000000000,4.5000000000,5.0000000000,5.5000000000,6.0000000000,6.5000000000,7.0000000000,7.5000000000,8.0000000000,8.5000000000,9.0000000000,9.5000000000,10.0000000000,10.5000000000,11.0000000000,11.5000000000,12.0000000000 -n hours OD600_01.csv OD600_02.csv OD600_03.csv OD600_04.csv OD600_05.csv OD600_06.csv OD600_07.csv OD600_08.csv OD600_09.csv OD600_10.csv OD600_11.csv OD600_12.csv OD600_13.csv OD600_14.csv OD600_15.csv OD600_16.csv OD600_17.csv OD600_18.csv OD600_19.csv OD600_20.csv OD600_21.csv OD600_22.csv OD600_23.csv OD600_24.csv >> data.csv

The csvstack command takes a list of csv files and stacks them, and it also allows you to group each stack. Here I wanted to group by time point, so using -g to define the group names I supplied a sequence from 0.5 to 12 in increments of 0.5.

You can generate the sequence in the shell, which saves typing it out manually. Note that bash's C-style for loop only does integer arithmetic, so seq is the tool for 0.5 increments:

# Prints the comma-separated sequence 0.5,...,12 ready for -g
seq -s, 0.5 0.5 12

Using the -n option, csvstack will also name the new group column, so I called it 'hours'.

Finally, csvstack needed the list of csv files. I didn't want to type this out manually, so I used Python to grab the list.

from os import listdir
listdir('./')

This simply returns a list of all files in the current directory. One small annoyance is that Transcriptic named the files OD600_1.csv rather than OD600_01.csv, which means the files don't sort properly, so I had to go through and rename the first 9 files.

Once I had the list from Python, I copied and pasted it into the csvstack command and removed the quotation marks and commas. A small script that does the sorting and formatting in one go is sketched below.
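
Here is that sketch: it filters for the plate reader files, sorts them numerically (so OD600_2.csv comes before OD600_10.csv without any renaming) and prints them space-separated, ready to paste into csvstack. The file name pattern is from my run:

# List the OD600 CSVs, sort them by their number, and print them
# space-separated for pasting into the csvstack command.
import re
from os import listdir

csvs = [f for f in listdir('.') if re.match(r'OD600_\d+\.csv$', f)]
csvs.sort(key=lambda f: int(re.search(r'\d+', f).group()))
print(' '.join(csvs))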

Finally I had my single dataset file, which looked like this:

hours,well,abs
0.5000000000,a1,0.1673733069494896
0.5000000000,b1,0.16264489187043785
0.5000000000,c1,0.1599586580539759
0.5000000000,d1,0.09181026208443736
1.0000000000,a1,0.1922003154880472
1.0000000000,b1,0.1840397293890507
1.0000000000,c1,0.17612867568903176
1.0000000000,d1,0.09225487472406724
1.5000000000,a1,0.21441167080055457
1.5000000000,b1,0.20193852213131347
1.5000000000,c1,0.18963634938073973
1.5000000000,d1,0.09274211583155267
2.0000000000,a1,0.30488315358057455
2.0000000000,b1,0.2747621314874783
2.0000000000,c1,0.23442914734536185
2.0000000000,d1,0.09334329521399287
2.5000000000,a1,0.36189560828780176

The last step was to plot the growth curves with R and ggplot2 using:

library(ggplot2)

df <- read.csv('data.csv')  # read.csv already returns a data.frame
qplot(data=df, x=hours, y=abs, color=well)

[Plot: the growth curves]

In the plot you can see the 3 replicates from wells a1, b1 and c1 of the plate, and the negative bacteria control in well d1. This isn't the usual growth curve behaviour I've observed previously, although the negative control performed as expected and showed no growth. The 3 sample replicates reached peak bacterial density very quickly, at 3 to 3.5 hours; typically I have observed peak density at around 6 hours. Furthermore, the peak absorbance for each replicate fell between 0.25 and 0.40, whereas observations are usually up towards 0.8 to 1.2. Interestingly, there is a correlation between the measured absorbance and the well position in the plate; this may be coincidental, or perhaps due to the lag times between inoculation of the LB in each well.

A possible explanation for the very short time to peak density is that I might have started from too high an initial concentration of bacteria. After initiating the run I noticed that Transcriptic recommend a growth curve as a sort of 'hello world' and performed it almost exactly the same as me, except they add 2µL of DH5alpha instead of the 5µL I did.

I'm not sure about the absorbance at peak density; it might be to do with the quantity of LB, as population stagnation and decline is usually due to nutrient scarcity or toxicity build-up.

Summary

I am unbelievably excited about Transcriptic and I think it is the future of research. The idea of a student building a biotech company from their laptop has crystallised into a very tangible vision; however, there are some questions left. I think to do any original research one needs to be creating bespoke reagents and sending them to Transcriptic to execute these experiments. From what I understand, Transcriptic already have this shipping and storage process nailed. But if I do not have access to a lab, how do I get a custom buffer made and stored at Transcriptic?

Furthermore, where AWS and Heroku (used to) have free tiers enabling very cheap and easy prototyping and iteration, with Transcriptic you are going to be paying for nearly everything, and every iteration of a run is going to erode your disposable income. This is understandable, as there are far fewer automated work cells than there are data centers, and we're working in a world of atoms rather than bits.

Perhaps in a world of independent research there will be a resurgence in patronage, or some life-science-savvy, risk-taking angels fronting a lot of the cash.

It would be awesome to see Transcriptic start a Discourse forum so everyone using the platform can discuss ideas and help each other. I'm sure they must have enough engaged users to make this worthwhile.

All in all I'm amazed and excited, and I hope to do more and more with the platform over time.

My R Profile

Towards the end of my PhD I started to get really into R. During my undergraduate degree at the University of Leeds my whole class were really heavy users of Origin Pro, which is a really great plotting tool with an easy-to-use GUI; the only real taste of programming during that period was the use of Maple for one practical.

Most of my data during my PhD came from micrographs, so numeric data was difficult to extract; eventually I started doing more spectroscopy and needed to do more analysis and visualisation. Now whenever I have data processing work to do I turn to R, even now at Dentally. I thought it would be cool to share my R profile file, which is copied and altered from a handful of different sources.

All of it is pretty self-explanatory, but to highlight a few things: I load the colorout package, which colours the text in the terminal window; dplyr and ggplot2 are loaded at startup, as I use these religiously; and there is a set-terminal-width function which makes data output in the terminal much more readable.

###
# My .Rprofile
###

# Create a new invisible environment for all the functions to go in so it doesn't clutter your workspace
.env <- new.env()

# aliases
.env$s <- base::summary
.env$h <- utils::head
.env$n <- base::names

# Set terminal width
.env$wideScreen <- function(howWide=Sys.getenv("COLUMNS")) {
  options(width=as.integer(howWide))
}

# options
options(stringsAsFactors=FALSE)
options(scipen=999)
options(digits=3)
options(repos=structure(c(CRAN='http://cran.ma.imperial.ac.uk/')))

## Read data on clipboard.
.env$read.cb <- function(...) {
  ismac <- Sys.info()[1]=="Darwin"
  if (!ismac) read.table(file="clipboard", ...)
  else read.table(pipe("pbpaste"), ...)
}

## Attach all the variables above
attach(.env)

## .First() run at the start of every R session.
## Use to load commonly used packages?
.First <- function() {
  # library
  library('colorout')
  setOutputColors256(normal = 70, number = 56, negnum = 56, date = 56, string = 179, const = 202, verbose = FALSE)
  library('dplyr')
  library('ggplot2')
    cat("\nSuccessfully loaded .Rprofile at", date(), "\n")
}

## .Last() run at the end of the session
.Last <- function() {
    # save command history here?
    cat("\nGoodbye at ", date(), "\n")
}

I hope some of you find it useful; I know I will when I accidentally delete my .Rprofile!

Cheers,

Ben