About me

My name is Martyn, and I’m an Alan Turing Institute-funded PhD student in Mathematics at the University of Manchester. My research is in infectious disease epidemiology, where I focus on the SARS-CoV-2 and Hepatitis C pathogens.

Infectious disease epidemiology is a broad field in which cutting-edge data science techniques can be applied in many different areas to advance our understanding and improve public health outcomes. As such, I maintain research interests in many areas of statistics, machine learning and data science.

At the beginning of my PhD, I worked on developing dynamic network models of Hepatitis C transmission in incarceration and community settings, to inform strategies for the elimination of Hepatitis C as a public health hazard in Manchester by 2025. When the SARS-CoV-2 pandemic started, we joined the emergency modelling response and worked with SPI-M and SAGE to inform the pandemic response. In collaboration with an interdisciplinary group of researchers (behavioural scientists, mathematicians, epidemiologists, clinicians and infectious disease modellers), I led the development of a contact tracing model to study various contact tracing interventions. Our contact tracing research topics included: predictions of low-efficacy contact tracing early in the pandemic; evaluation of household-based quarantine strategies; optimal allocation of limited testing resources; rapid evaluation of daily contact testing strategies using lateral flow devices (LFDs); and optimal strategies for genetic sequencing of infections to control imported variants.

More recently, I led a paper examining the evidence for Covid-19 symptom phenotypes across multiple datasets and studies (Test & Trace, ONS Community Infection Survey, and Zoe Covid Symptom Study), in which we employed several cutting-edge unsupervised machine learning techniques. In this work, we demonstrated clear differences in the symptom phenotypes of Covid-19 cases at the extremes of age across all datasets (paper currently submitted to Science).

Currently, my focus is largely on publishing the evidence base we developed during the pandemic and consolidating the code we developed into a software package. I have ongoing work attempting to relate models of within-host viral kinetics to infectiousness as a function of viral load; the goal is to provide a clear scientific evaluation of the utility of LFD tests as a test for infection. Finally, I am collaborating with others on developing a more advanced, network-based model of lateral flow testing strategies, for use in settings such as outbreak investigations.

Research areas

The following is a non-exhaustive list of areas where I maintain active research interests:

  • SARS-CoV-2, Hepatitis C
  • Contact tracing
  • Epidemics on networks, networked data analysis (e.g. ERGMs, cluster detection)
  • Dimensionality reduction (particularly UMAP, and related methods such as PaCMAP)
  • Bayesian statistics (model stacking/averaging/comparison, sequential MCMC, optimisation, Gaussian processes)

Software

Our SARS-CoV-2 contact tracing model has had an initial release and is available at: https://github.com/TTI-modelling/TestingContactModel. As much of the code was developed at pace so that we could provide rapid response modelling, we are now taking the time to polish and fully document some advanced aspects of the model.

Programming languages

  • Python (proficient)
  • R + Tidyverse (proficient)
  • SQL (proficient)
  • Stan (proficient)
  • Julia (basic)
  • MATLAB (basic)

For coding, I primarily use a mixture of R and Python, depending on which language I think is stronger for the task at hand. I find R, particularly the Tidyverse, fantastic for rapidly exploring, wrangling, tidying and visualising datasets, performing statistical analysis on them, and then communicating the results. However, I prefer to work in Python if I need to program for speed (typically using Numba), develop a large codebase, run intensive simulations, offload computation to the GPU, or if the task can really benefit from object-oriented programming. I have extensive experience using SQL and relational database models. For Bayesian inference, I typically write my models in Stan, but I am also familiar with the Python implementations.