Dr Stephen J Newhouse

BSc Hons, MSc, PhD
Lead Data Scientist & Senior Bioinformatician at the Bioinformatics Core, NIHR Biomedical Research Centre for Mental Health, King’s College London


Objective

Real world


Research Publications, Metrics and Impact

Metric/Id Information
Google Scholar: Publications
h-index: 34
i10-index: 48
h-index Scopus: 32
Scopus Author ID: 8931613700
ResearcherID: C-9330-2011
ORCID iD: 0000-0002-1843-9842
KCL Research Portal kclpure
impactstory https://impactstory.org/u/0000-0002-1843-9842

Current Positions

Date Position Institute
2017 Honorary Senior Lecturer (Secondment) Institute of Health Informatics, University College London
2016 Lead Data Scientist & Senior Bioinformatician Bioinformatics Core at the NIHR Biomedical Research Centre for Mental Health, King’s College London
2015 Bioinformatics Module Lead (Genomic Medicine MSc) St George’s, University of London & King’s College London

Bioinformatics Module Lead

As module lead, I organise and co-ordinate an intensive week of lectures and hands-on practicals covering bioinformatics applied to Genomic Medicine and the 100,000 Genomes Project (http://www.genomicsengland.co.uk/).

Lead Data Scientist and Senior Bioinformatician

  • NIHR BRC-MH Bioinformatics & Statistics Theme
  • NIHR BRC-MH Bioinformatics Core
  • Lead & Manage sub-team focused on Genomic and Transcriptomic work applied to neurodegenerative and psychiatric disorders
  • Lead & Manage a small Core Bioinformatics service for Illumina SNP and Expression Array processing: raw data to analysis ready data
  • Pipeline Development and Implementation for Genomic and Transcriptomic Data analysis
  • Pipeline Development and Implementation for Drug Repurposing using public and private Genomic and Transcriptomic Data
  • Consult on NGS pipelines for local research groups
  • Consult on Data Science (Biostatistics, Statistical Genetics and Applied Predictive Modelling) for local research groups
  • Co-supervise 3 PhD students: academic and pastoral support
  • Present work at internal and external meetings
  • Publication list at: Google Scholar

Education

Date Degree Subject Institute
2001 – 2005 PhD Genetics Queen Mary, University of London, UK
1999 – 2000 MSc with Merit Forensic Science King’s College London, UK
1996 – 1999 BSc. Hons. (2.1) Molecular Biology University of Liverpool, UK
1995 – 1996 GCE (A Levels) Biology (B), Chemistry (B), Maths (C) Wirral Grammar School, Bebington, UK
1990 – 1995 GCSE 10 inc. English (A), Maths (B) Wellington School, Bebington, UK

Work History

Date Description
2017 Honorary Senior Lecturer
2016 Lead Data Scientist & Senior Bioinformatician
2015 Bioinformatics Module Lead (Genomic Medicine MSc)
2011 Senior Bioinformatician, Bioinformatics Core at the NIHR Biomedical Research Centre for Mental Health, King’s College London
2010 Postdoctoral Research Associate, MRC Centre for Neurodegeneration, King’s College London
2009 Postdoctoral Research Associate in Cardiovascular Genetics. Department of Medicine, Clinical Pharmacology Unit, Cambridge University.
2006 Postdoctoral Research Scientist. Dept of Clinical Pharmacology, William Harvey Research Institute, Barts and The London School of Medicine, Queen Mary, University of London
2005 Wellcome Trust Value in People Award Postdoctoral Research Fellow. Dept of Clinical Pharmacology, William Harvey Research Institute, Barts and The London School of Medicine, Queen Mary, University of London
2000 Research Assistant. “MRC British Genetics of Hypertension Study.” Dept of Clinical Pharmacology, William Harvey Research Institute, Barts and The London School of Medicine, Queen Mary, University of London

Data Science Skills and Experience

This is not an exhaustive list, but a snap-shot of the kind of methods, techniques, analyses and programming languages I have had experience with over the years.

  • R & Bioconductor
    • 15+ years experience
    • Example packages: dplyr, mice, caret, ggplot2, limma, lumi, sva, WGCNA
    • Rstudio: https://www.rstudio.com/
  • Biostatistics & Exploratory Data Analysis
    • Data Visualisation: boxplots, scatter plots, histograms…
    • Example methods: linear and logistic regression, principal component analysis, missing data imputation
  • Machine Learning & Applied Predictive Modelling
  • Network Data Integration, Analysis, and Visualization using Cytoscape and Ingenuity Pathway Analysis (IPA)
  • Functional Enrichment Analyses: slide.
    • Example methods: Ingenuity Pathway Analysis, Enrichr, Genemania, GSEA
  • SNP & Gene Expression Array Analysis : Illumina_expression_workflow
  • Applied Statistical Genetics
    • Candidate Gene Analysis
    • Genome Wide Association Analyses
    • SNP Imputation
    • Polygenic risk score calculations
    • Example Software: haplo.stats, PLINK, beagle, impute, snptest
  • Next Generation Sequence Pipelines (DNA & RNA-seq)
  • NGS Variant calling & Prioritisation Pipelines
    • Example Software: freebayes, platypus, vardict, cnvkit, annovar, vep, gemini, exomiser
  • Version Control: GitHub
  • Bash shell programming
  • Python : basic, anaconda
  • Package and environment management systems: conda (http://conda.pydata.org/docs/) & bioconda (https://bioconda.github.io/)
  • High Performance Computing: Running analyses on multi-tenant HPC with Sun Grid Engine
  • Cloud Computing: Amazon Web Services (AWS) and Google Cloud, including Google Compute Engine (GCE)
  • Basecamp: https://basecamp.com/. Project co-ordination and management
  • Operating Systems: Unix, Mac OSX and Windows
  • Databases : exposure to tranSMART, SQL and neo4j

DataCamp

DataCamp Certifications License
Introduction to Machine Learning 6e2e5c25ccc8f5cba1eacd4e229104c04e7e9063
Intermediate Python for Data Science 409e19dbf6ee1a03daac8aba00a13b02e2436ae6
Intro to Python for Data Science 1fd5cf54fc08358440b9cbd81acb54b5cafca0c6
Kaggle Python Tutorial on Machine Learning e43f424808f019a19a1feb290afa73f343c13fd8
Data Exploration With Kaggle Scripts 393b59a85f9e9c42a2b6bf1209d03a99da7a8365
Having Fun with googleVis 25e6cf9cae8613558fe24f5fad38fb6e80f75800
Introduction to R 249990d217171669ef72d64edd3dba3d840557a2

Personal and Contact Details

Personal Details Information
Date of Birth: available upon request
Nationality: British
Gender: Male
Marital Status: Married
Ethnicity: Mixed (Caucasian, South-Asian)
Willing to Relocate: Yes
Contact Details Information
Address: available upon request
Mobile no: available upon request
Email: stephen.j.newhouse@gmail.com

Salary and Notice Period

Salary/Notice Information
Academic Salary Grade: Grade 8 pt 48
Current Salary: available upon request
Notice period: 12 weeks

Personal Qualities

My CV and experience as a senior scientist in academia demonstrate the following personal qualities:-

  • Good Communication skills
  • Team player skills
  • Leadership skills
  • Attention to detail
  • Enthusiasm and personal drive
  • Initiative
  • Management and organisational skills
  • Ability to handle pressure and meet deadlines
  • Willing to learn
  • Flexibility

Some of my “Bad” qualities:

  • I can be too honest
  • I can be fairly intolerant of
    • laziness
    • jargon junkies
    • behaviour that demonstrates an absence of trust
    • behaviour that demonstrates a focus on individual egos
    • behaviour that demonstrates a focus on personal politics and hierarchies

Mostly, I am easy-going and softly spoken, and I get along with everyone.


Full Disclosure

I have Multiple Sclerosis.
Multiple Sclerosis is covered under the Equality Act 2010.
Diagnosed: 2011.
Diagnosed at: KCH NHS Multiple Sclerosis
Get Informed at : https://www.mssociety.org.uk/ and https://www.mstrust.org.uk

Supervision of Research Students and staff

Teaching

Student/Staff Course Date
Students/NHS staff Bioinformatics, Genomic Medicine MSc 2015 - present
Students MSc Genes Environment & Development 2015- present
Students MSc Neuroscience 2015- present
Staff/Students Master Class in Translational Research using Bioinformatics and Epidemiology 2014 – present
Staff/Students SGDP Summer School: Bioinformatics 2011 – present
Staff/Students BRC-MH Bioinformatics Workshops 2011 - present
Medical Students Problem Based Learning tutor (QMUL) 2005 - 2009
Medical Students BMedSci Lecture Molecular Biology (QMUL) 2008 - 2009

Student Supervision

Name Studentship Date
Daniel Leirer PhD Student 2014- present
Hamel Patel PhD Student 2014- present
Elizabeth Baker PhD Student 2014- present
Bugra Ozer MSc Bioinformatics student 2012-2012
S. Sivakumar BMedSci student 2007-2007

Co-Supervision and Team Management

Name Student/Staff Date
A.Iacoangeli Post Doctoral Research Fellow BRC-MH 2016- present
G.C.Antona Post Doctoral Research Fellow BRC-MH 2015- present
H. Patel Bioinformatician BRC-MH 2013- present
A. Gulati Bioinformatician KCH 2013-2014
E. Azizan PhD Student, Cambridge 2009-2010
J.Coleman Medical Student, Cambridge 2009-2009
K.Sayal Medical Student, Cambridge 2009-2009
M.Hoti PhD Student, QMUL 2006-2009
A. Doyle BMedSci student, QMUL 2005-2005

Funding Awards

Gene expression profiling in the MRC Brain bank : a systems based biology approach to Dementia
Newhouse, S.
Biomedical Research Centre: £28,686.58
30/11/12 → 31/03/13

Development of a high throughput gene, environment and epigenetics database and analysis system for international ALS research
Motor Neurone Disease Association
Al-Chalabi, A., Dobson, R., Newhouse, S.
£171,479.00
1/10/14 → 30/09/17

An integrated systems view of Alzheimer’s disease in patients harbouring rare risk variants in TREM2
Dobson, R., Hodges, A., Kiddle, S. & Newhouse, S.
Eli Lilly and Company (USA): £149,713.00
1/01/15 → 31/12/16

UK Infrastructure for Large-Scale Clinical Genomics Research
MRC
Dobson, R., Hubbard, T., Newhouse, S.
£251,454.00
1/04/15 → 30/09/18

International Collaborations

Active participant and invited member of the following:-

Project MinE

Plan to map the full DNA profiles of at least 15,000 people with ALS and 7,500 control subjects, and to perform comparative analyses on the resulting data.

European Medical Information Framework: WP3 integrative analysis task force and WP3/WP4 Genomics task force. URL: http://www.imi.europa.eu/content/emif

The EMIF project aims to develop a common information framework of patient-level data that will link up and facilitate access to diverse medical and research data sources, opening up new avenues of research for scientists. To provide a focus and guidance for the development of the framework, the project will focus initially on questions relating to obesity and Alzheimer’s disease.

The Genetic Architecture of Rate of Alzheimer’s Decline (GENAROAD) Consortium. URL: http://www.genomes2people.org/genetic-architecture-of-rate-of-decline-in-alzheimers-disease/

There is tremendous unexplained variability in the rate of Alzheimer’s disease progression that is not explained by clinical features or co-morbidities and is therefore likely genetic. Understanding more about the genetic basis of this variability could help illuminate biological pathways involved in disease progression and could uncover clues to new therapies to slow disease progression. In addition, identifying genetic markers associated with more rapid or less rapid decline might also help refine the selection of subjects or inform the interpretation of future clinical trials. Investigators from several large studies are pooling longitudinal psychometric data and genotype data in order to discover new genes associated with rate of decline. These data have been collected or are being collected from the Alzheimer’s Disease Genetics Consortium, the Alzheimer’s Disease Neuroimaging Initiative, the Rush Religious Orders Study, Rush Memory and Ageing Study, the Cache County Study of Memory and Ageing, AddNeuroMed and several industry-sponsored pharmaceutical trials.


Personal Interests


Some selected unpublished works in progress:-

  • Brain expression analysis in Alzheimer’s disease: Early Results: [slide]
  • Genome-wide association analysis identifies common variants associated with measures of disease progression in patients with Alzheimer’s disease:[pdf]
  • NGSeasy:[git]
  • Selected Posters:[F1000 Research Posters]

Some selected highlights:-

Bio in Docker 2015 Symposium

Took the lead in co-ordinating and organising this 2-day symposium, largely funded by Genomics England. A lot of credit and thanks go to Ms Tanya Hardy and Ms Lucy O’Neill for their support, hard work and logistics prowess in helping to bring this all together.

Some online coverage of this event can be found below:-

London Containing Bioinformatics and Data Analytics

As a spin off from our Bio in Docker event, I have started a Meetup.

London Containing Bioinformatics & Data Analytics

A group for all those coders and open source champions. Git lovers, Docker enthusiasts, applied Bio-/Health-/Medical-Informaticians and Machine Learners. ELK stack fans, software devs and engineers and UX/UI designers. For all those general computer and data science folks that are interested in meeting like-minded practitioners to chat/rant and play; set some standards, and hopefully, start and do some interesting things with real world applications.

References

Available upon request.