Due to the holiday of Eid al-Fitr our week was cut short, so we had an action-packed short week to make up for it. Because of the holiday, there will not be a new blog entry until it ends. Without wasting any more time, let's begin with the 13th of July.
We were instructed to find the CG ratio of the sequence by taking a 20-base window at the start and shifting it 1 base at a time until the end.
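A minimal sketch of that sliding window in Python could look like the following. The function name `cg_ratios`, the `window` parameter, and the toy sequence are made up for illustration; this is not the code we actually wrote.

```python
def cg_ratios(sequence, window=20):
    """Return the CG ratio of every `window`-base stretch, shifting 1 base at a time."""
    sequence = sequence.upper()
    ratios = []
    # The last window starts at len(sequence) - window, hence the + 1.
    for start in range(len(sequence) - window + 1):
        chunk = sequence[start:start + window]
        cg_count = chunk.count("C") + chunk.count("G")
        ratios.append(cg_count / window)
    return ratios


# Toy sequence (an assumption, not our real data): 20 C/G bases, then 20 A/T bases.
toy = "CG" * 10 + "AT" * 10
ratios = cg_ratios(toy)
print(len(ratios), ratios[0], ratios[-1])
```

Each step only moves the window by one base, so neighbouring ratios can differ by at most 1/20.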
Then Mr. Ahmet gave a talk about bioinformatics and the scenarios in which we can use it. Bioinformatics divides into data analysis and biological analysis. To explain this fully, he showed us examples of AND and OR gates, the differences between GPUs and CPUs, how histones are believed to unbind from and rebind to DNA, how programming began with Assembly and, even before that, how programmers used data cards. Then he compared Python to C: Python has a native dictionary type, which saves us a lot of time because we do not have to code our own dictionary functions, and sorting the dictionary allows us to find entries faster. He contrasted this by saying that people use Oracle or MySQL for large datasets; otherwise we have to play a hot and cold game to find previous entries of interest by sorting by the first letter, then the second letter, then the third letter, and so on.
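To make the dictionary and the "hot and cold" idea a bit more concrete, here is a small sketch. It assumes the hot and cold game refers to a binary search over sorted entries, which is my reading of the talk rather than something Mr. Ahmet spelled out; the codon table and the helper `find_sorted` are invented for the example.

```python
import bisect

# Python's built-in dictionary: look an entry up directly by its key.
codon_to_amino = {"ATG": "Met", "TGG": "Trp", "TTT": "Phe"}
print(codon_to_amino["TGG"])


def find_sorted(entries, target):
    """Binary search in a sorted list: halve the search range at every step.

    Returns the index of `target`, or -1 if it is not present.
    """
    i = bisect.bisect_left(entries, target)
    if i < len(entries) and entries[i] == target:
        return i
    return -1


codons = sorted(["TGG", "ATG", "TTT", "GGC"])
print(codons, find_sorted(codons, "TGG"), find_sorted(codons, "AAA"))
```

The sorted list only stays fast to search as long as it stays sorted, which is part of why a database is the more comfortable choice for large datasets.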
Next up, we were given a new project that involved taking the CFTR sequence and finding motifs in it. The CFTR sequence, although considered small, was very taxing on our computers, and it took several minutes to display the motifs we had specified. The next challenge was to reduce the time it took for the result to be displayed, and the challenge after that was to display the top 10 motifs in our sequence.
Between starting the CFTR project and the lunch break, we briefly built in our Minecraft server.
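One possible way to get the top 10 motifs quickly is to count every fixed-length stretch in a single pass with `collections.Counter`. This is only a sketch under my own assumptions: the function name `top_motifs`, the motif length, and the toy sequence are hypothetical, and our actual project code was different.

```python
from collections import Counter


def top_motifs(sequence, length=4, top=10):
    """Count every overlapping motif of `length` bases and return the most common ones."""
    sequence = sequence.upper()
    counts = Counter(
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    )
    # most_common returns (motif, count) pairs, highest count first.
    return counts.most_common(top)


# Toy input, not the real CFTR sequence.
print(top_motifs("ATATATGC", length=2, top=2))
```

Counting in one pass avoids searching the whole sequence again for every motif, which is the kind of change that could cut the waiting time.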
In the afternoon we finished up our code and had another lecture, this time on regular expressions (regex). The example given was that any phone number or email address, no matter the formatting or spacing, can be understood by a person, but the same thing is tricky to convey to a machine. To describe such patterns we use \w to match a word character (a letter, digit, or underscore), \d to match a digit, and [x] where x is the set of characters we specifically want, such as [5-9]. A “.” matches any single character (by default anything except a newline), while a “\.” is for an actual dot. A “+” means the preceding item is present at least once, while a “*” means the preceding item is present 0 or more times. A “?” means the preceding item is present 0 or 1 times (it can be present but is not mandatory). The “^” marks the start of a string while “$” marks the end of a string. Mr. Ahmet then showed us how to write a pattern for his email with this method, like so: “\w+(\.\w+)*@\w+(\.\w+)+”.
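Here is a short sketch of how that email pattern behaves with Python's `re` module. The pattern is the one from the lecture; the helper `looks_like_email` and the sample addresses are made up for the example.

```python
import re

# The pattern from the lecture, written as a raw string so the backslashes survive.
EMAIL = re.compile(r"\w+(\.\w+)*@\w+(\.\w+)+")


def looks_like_email(text):
    """True if the whole text matches the lecture's email pattern."""
    return EMAIL.fullmatch(text) is not None


print(looks_like_email("first.last@mail.example.com"))
print(looks_like_email("missing-at-sign.example.com"))

# "." matches any single character, while "\." matches only a literal dot.
print(re.fullmatch(r"a.c", "abc") is not None)
print(re.fullmatch(r"a\.c", "abc") is not None)
```

Using `fullmatch` means the whole string has to fit the pattern, which plays the same role as wrapping it in “^” and “$”.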
Using knowledge from our lecture, we used import re in Python to bring the regular expressions module into our CFTR code. Using the re.search function we were able to remove the repeating motifs from our results. Meanwhile, our mentors were taking a look at our work in the Minecraft server and helping out as well. A lot of underground and behind-the-scenes work occurred today to make the world more pleasing to the eye. A lot of trees, monuments, and an underground railroad to get to places quickly began to take shape.
Today we began the day with a surprise presentation on select cities around the world, with only an hour to prepare! The pressure was on as everyone raced to finish their slides without making them look too bad. Once all the presentations were done, we presented in the order in which they were sent in. We saw the Opera House in Sydney, the fiestas of Granada, the Majorelle Garden in Marrakesh, Carnival in Rio de Janeiro, Seville Cathedral in Seville, St. Vitus Cathedral in Prague, the Fountain of Neptune in Florence, and the El Ateneo Grand Splendid bookshop in Buenos Aires.
To follow up on our presentations we had a Skype talk with another former bioinformatics intern, Çağla Çetin. She is a graduate of Molecular Biology and Genetics from Istanbul University and talked to us about her current life. She is currently developing mobile applications for the Android platform, and she explained how the internship helped her in life. She remarked that the surprise presentations, which they had just like us, were very important for learning how to handle stressful situations. She uses Android Studio to do her coding, and she learned how to code by watching videos online and then moving on to more jargon-oriented learning websites.
We moved on to further the work on our Minecraft server. Genkok now has a bouncy elevator to get out from the minecart system, and all the other minecart exit points are being fleshed out. The hospital is also taking shape while we work on a new project - the football field - complete with stands for the crowds and the spare team players to sit as well.
We continued our presentations with Peking, aka Beijing, and its many wonders, such as Tiananmen Square, the Forbidden City, the Terracotta Army, the Great Wall, and Peking roast duck. Lastly, we learned about Tokyo through the Tokyo Imperial Palace, Ueno Park, Rainbow Bridge, Akihabara, sushi, miso, and kimonos.
Nearing the end of our day we saw a 27-minute video titled “1963 Timesharing: A Solution to Computer Bottlenecks”. It talked about the bottlenecks early computers faced, such as being unreliable, not being able to take an input that was not a number, and poor man-machine interaction. It also mentioned the solutions to said problems, which you can watch yourself to find out.
Close to the end we had the presentations from Chapter 6 of Data Science from Scratch. Chapter 6 was on probability and talked about dependent/independent events, Bayes’s Theorem, the normal distribution and the central limit theorem.
Lastly, we were tasked with creating a program in IPython to simulate the central limit theorem, which states that the mean (or sum) of a large number of independent, identically distributed random variables is approximately normally distributed, giving the classic bell-like curve. The results of our dice program change depending on the input given to our function zaratma(x, y, z), where x is the number of dice rolled, y is the number of sides on each die, and z is the number of times the dice are rolled. As you can see, the results resemble a normal distribution.
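For readers who want to try this themselves, here is a rough sketch of what such a function could look like. It keeps the zaratma(x, y, z) signature described above, but the body, the optional `seed` argument, and the `text_histogram` helper are my own assumptions and not our original program.

```python
import random
from collections import Counter


def zaratma(x, y, z, seed=None):
    """Roll x dice with y sides each, z times, and return the z totals."""
    rng = random.Random(seed)  # `seed` is a hypothetical extra for repeatable runs
    return [sum(rng.randint(1, y) for _ in range(x)) for _ in range(z)]


def text_histogram(totals, width=50):
    """Build a simple text bar chart: one line per total, bars scaled to `width`."""
    counts = Counter(totals)
    tallest = max(counts.values())
    lines = []
    for total in sorted(counts):
        bar = "#" * round(counts[total] * width / tallest)
        lines.append(f"{total:3d} {bar}")
    return "\n".join(lines)


# 5 six-sided dice rolled 10000 times; the bars should bulge around 17 and 18.
print(text_histogram(zaratma(5, 6, 10000, seed=1)))
```

With a single die (x = 1) the bars come out roughly flat, and the bell shape only appears as more dice are added together, which is the point of the theorem.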
We started the day with a short attempt to improve on our previous CFTR code by scanning a wide range of nucleotide combination inputs to see if our repeating bases still popped up. Our efforts were mostly fruitless, as the code, which had not been changed in any way, was no longer working today, so we had to get it working again before we could modify it.
After struggling with our code we moved on to further refine our IPython program by modifying the range function in the code. We also had a prettier graph thanks to additions made the previous night.
Chapter 7 from Data Science from Scratch (Hypothesis and Inference) made its way onto our agenda right after our IPython session. The chapter explained null and alternative hypotheses, type 1 and type 2 errors, one- and two-tailed tests, p-values, confidence intervals, p-hacking, A/B testing, Bayesian inference, and the Beta distribution.
For our lunch break we again jumped back into our Minecraft server to work further on our stadium. First we redid the stands completely so that they surround the field, and made plans to modify the stands to have entrances for the fans, like a real stadium. A lot of work went into building the infrastructure, mostly due to the non-cubic shapes we were building in a cube-based game.
We moved on to short presentations about our new group members and their lives. The new members are Mr. Servet's daughters, Elif and Nazik Ozcan. Their lives consist of a healthy blend of culture from Turkey and the USA.
To end our short day, and to celebrate the beginning of our holiday we watched a movie - The Grey.