Teaching Page is Up

I have been working for a couple of semesters at Manhattan College teaching Computer Applications for Life Sciences. I have spent a good amount of time revamping the slides given to me to make the course more accessible and more approachable for self-study. Take a look at my Teaching Page to see lectures, homework solutions, and other material I designed, and to see how it’s going!

First Math Teaching Interview

Yesterday, I had the opportunity to interview for an Adjunct position in the Manhattan College Math department. They had me prepare a teaching demo for Calculus I, which, as a former Calculus I grader, felt like stepping into the spotlight. Remembering my Calculus I and II professor, an Adjunct named Dr. P., I felt it was fitting to write a post, since he was one of my earliest math inspirations and I was channeling the lessons and analogies he used back in the day for my interview.

Check out my presentation on The Chain Rule

Communicate Effectively with Grafana

If you’re like me, it can take days to figure out what a group of researchers discovered in a math and statistics research paper. Things can get even more complex if you’re not abreast of the niche developments in the sub-field. To make matters worse, when putting together projects, I find that I am trying to convey something relatively simple, but my point is often mired in the formalism and precise definitions endemic to statistics.

To solve this issue, I am incorporating Grafana dashboards into my old and current projects. Coupled with a one-page summary, I think my projects will be accessible to those not well versed in change-point detection, or even market microstructure. Experts will find it refreshing to see projects that can quickly point out anomalies in the order book.

Check out my Thesis Dashboard

Check out my LOBSTER Dashboard

Programming is only half the battle

Documentation always seems to get pushed to the side while innovation and achievements are promoted and venerated. However, the usefulness of your innovation hinges on the communication of your idea. Developing for crobat, I find that nobody can contribute anything because there is no documentation, and conversely I find I can use other people’s APIs with some success because of their documentation. I created some documentation for crobat using LaTeX, but found https://swagger.io to be a very useful tool.

While innocuous, this has started me down the path of learning how to keep track of changes and further organize projects so that others can readily contribute.

If you would like to check out the current state, visit the github page for the manual and see the current manual.pdf. Here is a quick screenshot of the function documentation that I made!

Originally published February 5, 2021

Half-baked: LSE Project Proposals

These past two months I’ve found out exactly how hard it is to write a project proposal in Math and Stats. My ideas focused on detection schemes similar to my thesis, but had elements of market microstructure. I hope you take a look!

LSE Statistics Proposal:

It’s about detecting adverse selection events, and learning about how the order book changes when a private signal makes it to informed traders.

LSE Mathematics Proposal:

It’s also about detecting adverse selection events, but it emphasizes the use of queuing theory and learning the mathematics behind the thresholds and constraints faced by investors when they cause these changes.

Originally published January 19, 2021

Pricing Model Calibration Through Stochastic Optimization – Stochastic Optimization Project

Hi, this is very overdue, but I thought I should upload my old stochastic optimization project. I did it as the final project for the course Stochastic Optimization by Computer Simulation (MATH 795-61) at the Graduate Center. The course (website) was taught by Prof. Felisa Vasquez-Abad, and it gave a statistical footing to the underpinnings of machine learning.

The project was about calibrating the parameters of the Heston stochastic volatility model to a time series of closing option prices. The work uses stochastic optimization to minimize the MSE between the observed option prices and the simulated option prices.
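To give a feel for the idea without the full Heston machinery, here is a minimal sketch. It is not the project code: I swap in a one-parameter geometric Brownian motion model in place of Heston, and use a Kiefer-Wolfowitz style finite-difference stochastic approximation with common random numbers. The function names, the gain sequence, the bump size, and the market inputs (spot 100, zero rate, one-year maturity) are all assumptions chosen for illustration.

```python
import numpy as np


def simulated_call_prices(sigma, strikes, z, s0=100.0, r=0.0, t=1.0):
    """Monte Carlo call prices under geometric Brownian motion.

    z holds the standard normal draws, so reusing the same z for two
    values of sigma gives common random numbers.
    """
    terminal = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)
    payoffs = np.maximum(terminal[:, None] - strikes[None, :], 0.0)
    return np.exp(-r * t) * payoffs.mean(axis=0)


def mse(sigma, strikes, observed, z):
    """Mean squared error between simulated and observed option prices."""
    diff = simulated_call_prices(sigma, strikes, z) - observed
    return float(np.mean(diff**2))


def calibrate_sigma(observed, strikes, sigma0=0.5, n_iter=200, n_paths=2000, seed=0):
    """Finite-difference stochastic approximation on the noisy MSE."""
    rng = np.random.default_rng(seed)
    sigma = sigma0
    bump = 0.01  # hypothetical finite-difference half-width
    for k in range(1, n_iter + 1):
        step = 2e-4 / k**0.602  # hypothetical decreasing gain sequence
        z = rng.standard_normal(n_paths)  # fresh noise at every iteration
        grad = (mse(sigma + bump, strikes, observed, z)
                - mse(sigma - bump, strikes, observed, z)) / (2.0 * bump)
        # Keep sigma in a sane range so sigma - bump stays positive.
        sigma = float(np.clip(sigma - step * grad, 0.02, 2.0))
    return sigma
```

Calling calibrate_sigma(observed, strikes) with prices generated from a known volatility should land near that volatility. The real project has several Heston parameters instead of one, so the scalar finite difference here would become a gradient estimate over all of them.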

I should make a demo on github but the project paper is great on its own.

github link

Originally published: September 25, 2020

launched crobat!

Hi, tonight I launched my surprise summer project, the Cryptocurrency Order Book Analysis Tool, aka crobat.

It’s pretty cool. It can give you a rough snapshot of the order book, and better yet it can write down the events that occur in the Level 2 order book for Coinbase exchanges. This will be convenient for longitudinal studies of market microstructure, since it’s just taking off.

Follow the github link to learn more about it!
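For readers who have not worked with Level 2 data, here is a rough sketch of what keeping a book and an event log can look like. This is not crobat's code. I am assuming each update arrives as a list of [side, price, size] changes, modeled on the shape of Coinbase's level2 feed, where a size of zero removes the price level. The names apply_l2_update and best_bid_ask are hypothetical.

```python
from decimal import Decimal


def apply_l2_update(book, changes, event_log):
    """Apply [side, price, size] changes to the book and record each event.

    Assumed convention: size is the new total size at that price level,
    and a size of zero deletes the level.
    """
    for side, price, size in changes:
        levels = book["bids"] if side == "buy" else book["asks"]
        price, size = Decimal(price), Decimal(size)
        if size == 0:
            levels.pop(price, None)
        else:
            levels[price] = size
        event_log.append({"side": side, "price": price, "size": size})
    return book


def best_bid_ask(book):
    """Return the highest bid and lowest ask, or None for an empty side."""
    best_bid = max(book["bids"]) if book["bids"] else None
    best_ask = min(book["asks"]) if book["asks"] else None
    return best_bid, best_ask
```

Starting from book = {"bids": {}, "asks": {}} and an empty list for event_log, each incoming message updates the snapshot while the log keeps the full history of events for later study.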

Originally published: September 24, 2020

A Study of CUSUM Statistics on Bitcoin Transactions

Hello all,

Putting out the current draft of my thesis beamer just in case anyone wants to check it out. I am still working on a useful application and on incorporating CUSUM statistics for spreads, but this work has some cool insights.

Ivan

Abstract:

In this thesis, our objective is to study the relationship between transaction price and volume in the BTC/USD Coinbase exchange. In the second chapter, we develop a consecutive CUSUM algorithm to detect instantaneous changes in the arrival rate of market orders. We begin by estimating a baseline rate using the assumption of a local time-homogeneous compound Poisson process. Our observations lead us to reject the plausibility of a time-homogeneous compound Poisson model on a more global scale by using a chi-squared test. We thus proceed to use CUSUM-based alarms to detect consecutive upward and downward changes in the arrival rate of market orders. In the third chapter we identify active periods from the number of consecutive upward CUSUM alarms, leading to the classification of active versus inactive periods. Finally we use One-Way ANOVA to assess the level effect on price swings for periods classified as containing at least two or three consecutive CUSUM up alarms. We show that in these active periods, price swings are significantly larger.
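For anyone who wants to see the mechanics, here is a minimal sketch of a CUSUM alarm on Poisson counts. It is not the thesis code. I am assuming the market orders are binned into equal time intervals, that the baseline rate lam0 and the alternative rate lam1 are given, and that the statistic restarts at zero after each alarm so alarms can repeat back to back. The threshold and the function name are hypothetical.

```python
import math


def poisson_cusum_alarms(counts, lam0, lam1, threshold):
    """Return the indices at which the CUSUM statistic crosses the threshold.

    Each count x adds the Poisson log-likelihood ratio of rate lam1 against
    rate lam0, which is x * log(lam1 / lam0) - (lam1 - lam0). Use lam1 > lam0
    to detect upward changes and lam1 < lam0 to detect downward changes.
    """
    log_ratio = math.log(lam1 / lam0)
    stat = 0.0
    alarms = []
    for i, x in enumerate(counts):
        # Reflect at zero so evidence against a change does not accumulate.
        stat = max(0.0, stat + x * log_ratio - (lam1 - lam0))
        if stat >= threshold:
            alarms.append(i)
            stat = 0.0  # assumed restart, allowing consecutive alarms
    return alarms
```

Running it once with lam1 above lam0 and once with lam1 below lam0 gives separate lists of up and down alarms, and runs of neighboring indices in the up list are the kind of consecutive alarms the abstract refers to.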

Link: A Study of CUSUM Statistics on Bitcoin Transactions

Originally published: July 23, 2020

Process Route of Simeprevir (Olysio®)

Today I will be covering the process route of Simeprevir, a leading small molecule antiviral for Hepatitis C. Developed with Janssen Pharmaceuticals Inc., Rosenquist’s paper highlights many things about SAR and computer-aided drug design, and makes a great case study for medicinal chemistry students. One thing not overly emphasized in the paper was how they moved from a lab scale lead optimization route to a process development route. They make short references to how their choices of substitute reagents made things safer in the pilot plant. Given my love for scale up, I think we should talk a little more about the clever process techniques they employed. The route is broken up into 3 Schemes, and the lead optimization route will run concurrently with the process development route to highlight differences.

Before we start, let’s contextualize the lead optimization route. The methodology behind medicinal chemistry synthesis is analogous to combinatorial synthesis or large panel screening synthesis techniques. The reactions have to be ubiquitous, work across a wide range of substrates, and lend themselves to facile or minimal purification. What we can take away is that in order to do this synthesis we need peptide coupling, a robust method to asymmetrically get to the trans-cyclopentanone dicarboxylic acid, and a simple way to selectively peptide couple each carbonyl.

Scheme 1: Preparation of The Bicyclic Lactone Acid

Simeprevir Scheme 1

The Lead Optimization Route:

It starts with a 5 step process to get to the bicyclic lactone acid. They employ unscalable reactions such as a Diels-Alder. Reagents include sodium borohydride (pyrophoric when finely divided). They employ a clever asymmetric ester hydrolysis using Pig Liver Esterase (PLE). We must remember that PLE isn’t a cost-effective method for process synthesis, and would require an investment in a pilot scale bioreactor and the somewhat rare pilot scale biosynthesis production chemist.

The Process Development Route:

The synthesis begins from an unresolved mixture of cyclopentanone 3,4-dicarboxylic acid, cutting out ~3 steps from the Lead Optimization Route. They improve on the synthesis by performing a Raney Nickel hydrogenation, suspending the product as the triethylamine salt in water, and directly lactonizing, all in one pot. They perform a kinetic resolution using cinchonidine, and report the solid has a shelf life of at least 3 years.

Scheme 2: Preparation of The Olefin Metathesis Partners

Scheme 2 Simeprevir

The Lead Optimization Route:

From the bicyclic lactone, the synthesis continues with a HATU/DIPEA peptide coupling off the free acid. Keeping with the straightforward and simple synthesis, they perform a LiOH/water hydrolysis to open up the lactone. With the second acid free, they peptide couple to install the cyclopropyl. After setting up the metathesis partners, they finish the synthesis by using a Mitsunobu to install the aryl substituent.

The Process Development Route:

The process route takes a similar approach. They opt for 2-Ethoxy-1-ethoxycarbonyl-1,2-dihydroquinoline (EEDQ) and N-Methyl Morpholine (NMM) to peptide couple the n-methyl hexene, citing its relative safety compared to HATU. Next, they get clever with the preparation for ring opening. The lead optimization route ties up the free carboxylic acid with a peptide coupling. Remembering triphenylphosphine’s oxophilicity, you can glean that the process route needed to tie up the free hydroxyl of the acid, thus prompting the careful methanolysis of the bicyclic lactone acid. The aryl moiety is installed using the Mitsunobu again, and with a one pot hydrolysis of the ester and a peptide coupling we get to the identical olefin metathesis setup.

Since it’s not obvious, let’s talk about the pot economy and purification methods for the two routes.

In the lead optimization route, they peptide couple and then hydrolyze in one pot, perform an acid/base extraction, peptide couple in a second flask, perform another acid/base extraction, and run the Mitsunobu in a third flask.

The process route peptide couples in refluxing THF and acid/base extracts out the cinchonidine. Using the organic THF layer, they directly subject it to the methanolysis (note these extractions can be done in the same reaction flask if they use a jacketed flask). They drain the aqueous layer and extract in toluene. The close-to-dry toluene layer goes into a fresh reactor (because DIAD decomposes in water) to perform the Mitsunobu, and the product is collected in crystalline form. They hydrolyze the methyl ester using LiOH in the same reactor, drain out the aqueous layer, and peptide couple using EEDQ. They likely use Boc anhydride in excess after acid/base extracting out the used EEDQ. At this point they must purify the Boc olefin and dilute to 0.05 M to perform the infinite dilution approach in Scheme 3.

Scheme 3: Ring Closing Metathesis (RCM) and getting to Simeprevir

Scheme 3 Simeprevir

Lead Optimization Route:

The RCM completes under classic conditions; everyone knows that substantial Schlenk technique is necessary when carrying out a Grubbs-Hoveyda gen. 1 RCM. They hydrolyze the ethyl ester off the cyclopropyl group to afford an acid ready for peptide coupling. To activate the carbonyl, they cyclize into the oxazolinone, then add the cyclopropyl sulfonamide to reopen the ring, affording Simeprevir.

Process Development Route:

Given the difficulty of getting an RCM to work on the process scale, I direct you to a mini-review called Olefin metathesis on the process scale, which mentions the process route to Simeprevir. The Simeprevir paper alludes to “SHD techniques”, an “M1 catalyst”, and unexplained Boc protection/deprotections of the RCM partners/products.

To get the RCM to work they employed the Ziegler infinite dilution. The method hinges on the competition between two conversion rates: ring closure to the RCM product, and oligomerization. Ziegler et al. show that if the RCM partners are added to a solution of catalyst at a rate identical to their conversion to the RCM product, the starting material stays at an effectively infinitely dilute concentration, and thus oligomerization is disfavored.

The paragraph on Simeprevir in Higman’s review clears up the SHD techniques: the otherwise unexplained Boc protection served to increase the concentration at which the RCM partners could be fed into the M1 catalyst (a specialized ligand for GH RCM) solution. After the RCM, they hydrolyze using NaOH and EtOH, isolate and then activate the free carboxylic acid with EDCI in a new flask, likely affording the spiro-oxazolidinone intermediate, and then use DBU to couple the cyclopropylsulfonamide. Simeprevir was collected using a controlled crystallization.
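The dilution argument can be sketched with a toy rate model. This is my own simplification, not something taken from the paper or the review: I assume ring closure is first order in the diene and oligomerization is second order, with made-up rate constants k_rcm and k_olig and a made-up feed rate. Under slow addition the diene concentration settles where feed equals consumption, and the lower that concentration, the larger the share of material that closes the ring.

```python
import math


def steady_state_conc(feed_rate, k_rcm, k_olig):
    """Diene concentration at which feed balances consumption.

    Solves feed_rate = k_rcm * S + k_olig * S**2 for the positive root S.
    """
    if k_olig == 0:
        return feed_rate / k_rcm
    disc = k_rcm**2 + 4.0 * k_olig * feed_rate
    return (-k_rcm + math.sqrt(disc)) / (2.0 * k_olig)


def rcm_fraction(conc, k_rcm, k_olig):
    """Instantaneous share of consumed diene that goes to ring closure."""
    return k_rcm / (k_rcm + k_olig * conc)


# Hypothetical numbers, for illustration only.
K_RCM, K_OLIG = 1.0, 10.0
batch_share = rcm_fraction(0.05, K_RCM, K_OLIG)  # everything charged at once
slow_conc = steady_state_conc(0.001, K_RCM, K_OLIG)
slow_share = rcm_fraction(slow_conc, K_RCM, K_OLIG)  # slow addition
```

With these made-up constants, slow_share comes out well above batch_share, which is the whole point: the reactor can hold a lot of product while the unreacted diene stays dilute.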

Reflections: What can you take away from this synthesis?

What can we learn from this synthesis? Process chemists face different issues than traditional synthetic chemists. I think Janssen’s approach embodied the principles of using safer reagents, inventing cheaper methods of enantiomer resolution, and cleverly addressing the longstanding problem of unscalable olefin metathesis. After Simeprevir, many other drugs, including Paritaprevir (Abbvie) and Telaprevir (Vertex), have come to market using similar process syntheses. I hope to cover them in the next few weeks.

Originally published December 12, 2018